Hypothesis Testing
What is hypothesis testing?
Ex: You claim that the average height of the citizens of a city is more than 5.8 ft.
Ex: You claim that a politician will receive more than 60% of the votes.
Claims like these are assumptions. We need a statistical method to test them, that is,
a mathematical conclusion about whether what we are assuming is supported by the data.
Example
✔ It is claimed that a process has been improved in yield by bringing a
change in an important factor X.
Yield data are collected from old and ...
Process A   Process B
89.7        84.7
81.4        86.1
84.5        83.2
84.8        91.9
⮚ Descriptive Statistics
Variable   Process   N    Mean    Std. Dev.
Yield      A         10   84.24   2.90
Yield      B         10   85.54   3.65
⮚ Statistical Question:
Is there a statistically significant difference between the mean of Process B (85.54)
and the mean of Process A (84.24)? Or is this difference in means just due to
chance?
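The statistical question above can be sketched in Python directly from the summary table. This is a minimal sketch, not the slides' own method: it uses `scipy.stats.ttest_ind_from_stats`, and the choice of a pooled-variance test (`equal_var=True`) is an assumption that the slides do not state.

```python
from scipy import stats

# Summary statistics copied from the Descriptive Statistics table.
mean_a, sd_a, n_a = 84.24, 2.90, 10
mean_b, sd_b, n_b = 85.54, 3.65, 10

# Two-sample t-test computed from summary statistics only.
# equal_var=True (pooled variance) is an assumption, not stated in the slides.
result = stats.ttest_ind_from_stats(
    mean_a, sd_a, n_a,
    mean_b, sd_b, n_b,
    equal_var=True,
)
t_stat, p_value = result.statistic, result.pvalue
print(f"t = {t_stat:.3f}, two-sided p = {p_value:.3f}")
```

A p-value above the chosen significance level (for example 0.05) would mean the observed difference in means could plausibly be due to chance.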
Hypothesis Testing
Develop the hypothesis for the population and make a statistical decision by
determining the acceptance of the hypothesis using sample data.
• Null Hypothesis (H0): Argument made so far, or hypothesis saying that
there is no change or difference
• Alternative Hypothesis (H1): New argument, that is, a hypothesis that
you want to prove with solid ground obtained from the sample
Example: Medicine B for treating headache, newly developed by a
pharmaceutical company, has a 30 minutes longer effect than existing
Medicine A.
• H0: Medicine A and B have the same effect
• H1: Medicine B has a 30 minutes longer effect than Medicine A
Procedure of Hypothesis Testing
1-Tail, 2-Tail
1-Sample, 2-Sample
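from scipy import stats
z = 1.96  # example z-score (illustrative value)
t = 2.0   # example t statistic (illustrative value)
stats.norm.cdf(z)      # P(Z <= z) under the standard normal distribution
stats.t.cdf(t, df=10)  # P(T <= t) for Student's t with 10 degrees of freedom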
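The cumulative distribution functions above are what turn a test statistic into a one-tail or two-tail p-value. The sketch below shows that step; the statistic values (1.96 and 2.228) are illustrative choices, not numbers from the slides.

```python
from scipy import stats

# Illustrative z statistic (assumed value, not from the slides).
z = 1.96
upper_tail_z = stats.norm.sf(z)         # one-tail: P(Z >= z), same as 1 - cdf(z)
lower_tail_z = stats.norm.cdf(-z)       # one-tail in the other direction: P(Z <= -z)
two_tail_z = 2 * stats.norm.sf(abs(z))  # two-tail: both extremes together

# Illustrative t statistic with 10 degrees of freedom (assumed values).
t, df = 2.228, 10
two_tail_t = 2 * stats.t.sf(abs(t), df=df)

print(f"z: one-tail = {upper_tail_z:.4f}, two-tail = {two_tail_z:.4f}")
print(f"t: two-tail = {two_tail_t:.4f}")
```

Using `sf` (the survival function) instead of `1 - cdf` gives the same probability with better numerical accuracy in the far tail.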
Normal Distribution codes
Student's t-Distribution codes
Simple Exercise
Conducting a hypothesis test is a bit like putting an accused person on trial in front of
a jury.
The jury assumes that the accused person is innocent unless there is strong
evidence against him, but even after considering the evidence,
Simple Exercise
An accused person is on trial for a crime, and you are on the jury. The jury's
task is to assume the accused is innocent, but if there is enough
evidence against him, they need to convict him.
The errors we can make when conducting a hypothesis test are the same sort
of errors we could make when putting a prisoner on trial.
Hypothesis tests are basically tests where you take a claim and put it on trial
by assessing the evidence against it. If there is sufficient evidence against it,
you reject it; if there is insufficient evidence against it, you accept it
(more precisely, you fail to reject it).
You may correctly accept or reject the null hypothesis, but even after considering
the evidence, it is still possible to make an error. You may reject a valid null
hypothesis, or you might accept it when it is actually false.
Statisticians have special names for these types of errors.
A Type I error is:
when you wrongly reject a true null hypothesis (punishing an innocent person), and
a Type II error is:
when you wrongly accept a false null hypothesis (letting a guilty person go free).
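The two error types can be illustrated with a small simulation. This is only a sketch under assumed settings (normal data, sample size 30, significance level 0.05, a true mean of 0.5 for the false-null case); none of these values come from the slides.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
alpha = 0.05                    # assumed significance level
n_trials, n = 2000, 30          # assumed number of experiments and sample size

# Case 1: H0 (mean = 0) is TRUE. Every rejection is a Type I error.
true_null = rng.normal(loc=0.0, scale=1.0, size=(n_trials, n))
p_true_null = stats.ttest_1samp(true_null, popmean=0.0, axis=1).pvalue
type_i_rate = np.mean(p_true_null < alpha)

# Case 2: H0 (mean = 0) is FALSE, the real mean is 0.5.
# Every failure to reject is a Type II error.
false_null = rng.normal(loc=0.5, scale=1.0, size=(n_trials, n))
p_false_null = stats.ttest_1samp(false_null, popmean=0.0, axis=1).pvalue
type_ii_rate = np.mean(p_false_null >= alpha)

print(f"Type I error rate:  {type_i_rate:.3f}")
print(f"Type II error rate: {type_ii_rate:.3f}")
```

The Type I error rate should land near the chosen significance level, while the Type II error rate depends on the sample size and on how far the true mean is from the hypothesized one.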
ERRORS
[Table: errors by actual situation]
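Let us solve the problem
Mean = 4.0
Standard deviation = 3
Sample size = 50
Sample mean = 4.6
import numpy as np
from scipy import stats
# t statistic: (mean - sample mean) / (standard deviation / sqrt(sample size)), about -1.41
t_statistic = (4 - 4.6) / (3 / np.sqrt(50))
# two-tailed p-value with 50 - 1 = 49 degrees of freedom
p_value = 2 * stats.t.cdf(-abs(t_statistic), df=49)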
Let us do it in Python
scipy.stats.ttest_1samp(array, mu)
One-sample and one tail t-tests
Is there evidence that the mean level of Salmonella in the ice cream is
greater than 0.3 MPN/g?
Let Mu be the mean level of Salmonella in all batches of ice cream. Here
the hypothesis of interest can be expressed as:
H0: Mu = 0.3
Ha: Mu > 0.3
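import pandas as pd
import scipy.stats
data = pd.Series([0.593, 0.142, 0.329, 0.691, 0.231, 0.793, 0.519, 0.392, 0.418])
scipy.stats.ttest_1samp(data, 0.3, alternative='greater')  # one-tailed test; `alternative` needs SciPy 1.6+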
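To see what the one-sample, one-tail test computes, the same statistic can be worked out by hand. This is a minimal sketch using the Salmonella measurements above; the variable names are illustrative.

```python
import numpy as np
from scipy import stats

# Salmonella levels (MPN/g) from the example above.
data = np.array([0.593, 0.142, 0.329, 0.691, 0.231, 0.793, 0.519, 0.392, 0.418])
mu0 = 0.3  # hypothesized mean under H0

n = data.size
sample_mean = data.mean()
sample_sd = data.std(ddof=1)  # ddof=1 gives the sample standard deviation

# t statistic: distance of the sample mean from mu0, in standard errors.
t_stat = (sample_mean - mu0) / (sample_sd / np.sqrt(n))

# One-tail (upper) p-value: probability of a t at least this large under H0.
p_one_tailed = stats.t.sf(t_stat, df=n - 1)

print(f"t = {t_stat:.4f}, one-tailed p = {p_one_tailed:.4f}")
```

If the one-tailed p-value is below the chosen significance level (for example 0.05), the data give evidence that the mean level is greater than 0.3 MPN/g.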
Two-sample t-tests
Ex. 6 subjects were given a drug (treatment group) and an additional 6 subjects a
placebo (control group). Their reaction time to a stimulus was measured (in ms).
We want to perform a two-sample t-test for comparing the means of the
treatment and control groups.
Let Mu1 be the mean of the population taking medicine and Mu2 the mean of
the untreated population. Here the hypothesis of interest can be expressed as:
H0: Mu1-Mu2=0
Ha: Mu1-Mu2 !=0
Ttest_indResult(statistic=-3.445612673536487, pvalue=0.006272124350809803)
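A result of this form comes from `scipy.stats.ttest_ind`. The sketch below shows the call. The reaction times are assumed illustrative values, because the slide's data table is not reproduced in the text, and the default pooled-variance setting (`equal_var=True`) is also an assumption.

```python
from scipy import stats

# Assumed illustrative reaction times in ms (not taken from the slides' text).
control = [91, 87, 99, 77, 88, 91]        # placebo group
treatment = [101, 110, 103, 93, 99, 104]  # drug group

# Two-sided, two-sample t-test with pooled variance (SciPy's default).
result = stats.ttest_ind(control, treatment, equal_var=True)
t_stat, p_value = result.statistic, result.pvalue
print(f"t = {t_stat:.4f}, two-sided p = {p_value:.4f}")
```

With these assumed values the statistic comes out close to the one reported above. The sign of the statistic depends only on which group is passed first.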
Two-proportion t-test
Use case: Is there a significant difference between the population proportions of state 1
and state 2 who report that they have been placed immediately after education?
Populations: All students who have completed graduation and post-graduation in
both states
Parameter of interest: p1 - p2, where p1 = proportion for state 1 and p2 = proportion for state 2
Data: 247 students from state 1. 36.8% of students report that they have got the job.
308 students from state 2. 38.9% of students report that they have got the job.
Hypothesis Definition:
Null Hypothesis: p1 - p2 = 0
Alternative Hypothesis: p1 - p2 ≠ 0
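Testing the difference in population proportions here uses a t-test. Each student's
outcome is binary (placed or not), so the number placed in each state follows a
binomial distribution. We can pass the two groups, described by the appropriate
binomial distribution parameters, to the t-test function. With samples this large,
the result is close to that of a two-proportion z-test.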
Data Given:
n1 = 247
p1 = .37
n2 = 308
p2 = .39
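The comparison can be sketched from the given numbers alone. This is one possible approach, not necessarily the one the slides intend: it treats each student as a 0/1 outcome, derives the standard deviation of such data from the proportion, and passes the summary values to `scipy.stats.ttest_ind_from_stats`. A pooled two-proportion z-test is computed alongside for comparison. The helper `binary_sample_std` is a hypothetical name introduced for this example.

```python
import numpy as np
from scipy import stats

# Data given above.
n1, p1 = 247, 0.37
n2, p2 = 308, 0.39

def binary_sample_std(p, n):
    """Sample standard deviation of n 0/1 outcomes with proportion p (hypothetical helper)."""
    return np.sqrt(p * (1 - p) * n / (n - 1))

# t-test on the 0/1 outcomes, computed from summary statistics.
# equal_var=True (pooled variance) is an assumption.
t_result = stats.ttest_ind_from_stats(
    p1, binary_sample_std(p1, n1), n1,
    p2, binary_sample_std(p2, n2), n2,
    equal_var=True,
)
t_stat, t_p_value = t_result.statistic, t_result.pvalue

# Pooled two-proportion z-test, shown for comparison.
pooled = (n1 * p1 + n2 * p2) / (n1 + n2)
std_err = np.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z_stat = (p1 - p2) / std_err
z_p_value = 2 * stats.norm.sf(abs(z_stat))

print(f"t = {t_stat:.3f}, p = {t_p_value:.3f}")
print(f"z = {z_stat:.3f}, p = {z_p_value:.3f}")
```

Both tests give nearly the same statistic here. A p-value above the chosen significance level means the null hypothesis p1 - p2 = 0 is not rejected.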
Thank you