Basic Inference: Hypothesis Testing
Using sample statistics to test hypotheses about population parameters
Definition of a hypothesis
Definition of Statistical hypotheses
• They are hypotheses that are stated in such a way
that they may be evaluated by appropriate statistical
techniques.
• There are two hypotheses involved in hypothesis
testing:
• Null hypothesis H0: the hypothesis to be tested.
• Alternative hypothesis HA: a statement of what
we believe is true if our sample data cause us to
reject the null hypothesis.
Testing a hypothesis about the
mean of a population:
• We have the following steps:
1. Data: determine the variable, sample size (n), sample
mean ($\bar{x}$), and population standard deviation (σ), or
sample standard deviation (s) if σ is unknown.
2. Assumptions : We have two cases:
• Case1: Population is normally or approximately
normally distributed with known or unknown
variance (sample size n may be small or large),
• Case 2: Population is not normal with known or
unknown variance (n is large i.e. n≥30).
3. Hypotheses:
• We have three cases:
• Case I: H0: μ = μ0
  HA: μ ≠ μ0
• e.g. we want to test that the population mean is not
different from 50.
• Case II: H0: μ ≤ μ0
  HA: μ > μ0
• e.g. we want to test that the population mean is at most
50.
• Case III: H0: μ ≥ μ0
  HA: μ < μ0
• e.g. we want to test that the population mean is at
least 50.
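4. Test Statistic:
• Case 1: The population is normal or approximately normal.
  i) If σ² is known (n large or small): $Z = \dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$
  ii) If σ² is unknown and n is large: $Z = \dfrac{\bar{X} - \mu_0}{s/\sqrt{n}}$
  iii) If σ² is unknown and n is small: $T = \dfrac{\bar{X} - \mu_0}{s/\sqrt{n}}$
• Case 2: The population is not normally distributed and n is
large.
  i) If σ² is known: $Z = \dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$
  ii) If σ² is unknown: $Z = \dfrac{\bar{X} - \mu_0}{s/\sqrt{n}}$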
5. Decision Rule:
i) If HA: μ ≠ μ0
• Reject H0 if Z > Zα/2 or Z < −Zα/2
(when using the Z-test)
• Or reject H0 if T > tα/2,n−1 or T < −tα/2,n−1
(when using the T-test)
ii) If HA: μ > μ0
• Reject H0 if Z > Zα (when using the Z-test)
• Or reject H0 if T > tα,n−1 (when using the T-test)
iii) If HA: μ < μ0
• Reject H0 if Z < −Zα (when using the Z-test)
• Or reject H0 if T < −tα,n−1 (when using the T-test)
Note:
Zα/2 and Zα are tabulated values obtained from the Z-table.
tα/2 and tα are tabulated values obtained from the t-table
with (n−1) degrees of freedom (df).
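The Z-test decision rules above can be sketched in Python. This is a minimal illustration, not part of the original slides: the function name `reject_h0_z` and the labels "two-sided", "greater" and "less" are assumptions made for the example, and the critical values come from the standard library's `statistics.NormalDist` instead of a printed Z-table. Only the Z-test is covered, because the standard library has no t distribution.

```python
from statistics import NormalDist


def reject_h0_z(z, alpha, alternative):
    """Return True if H0 is rejected by the Z-test decision rule.

    alternative is one of (hypothetical labels chosen for this sketch):
      "two-sided" -> HA: mu != mu0
      "greater"   -> HA: mu >  mu0
      "less"      -> HA: mu <  mu0
    """
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    if alternative == "two-sided":
        z_crit = std_normal.inv_cdf(1 - alpha / 2)  # Z_{alpha/2}
        return z > z_crit or z < -z_crit
    if alternative == "greater":
        return z > std_normal.inv_cdf(1 - alpha)  # Z_alpha
    if alternative == "less":
        return z < -std_normal.inv_cdf(1 - alpha)
    raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")


print(reject_h0_z(2.0, 0.05, "two-sided"))
print(reject_h0_z(1.7, 0.05, "greater"))
```

The same observed statistic can lead to different decisions under a two-sided and a one-sided alternative, because the one-sided critical value Zα is smaller than Zα/2.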
6. Decision:
• If we reject H0, we can conclude that HA is
true.
• If, however, we do not reject H0, we conclude that H0
may be true: the data do not provide sufficient evidence against it.
An Alternative Decision Rule Using the p-value
Definition
• The p-value is defined as the smallest value of
α for which the null hypothesis can be
rejected.
• If the p-value is less than α, we reject the null
hypothesis (p < α).
• If the p-value is greater than or equal to α
(p ≥ α), we do not reject the null hypothesis and
say we fail to reject the null hypothesis.
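As a hedged illustration of the p-value rule, the sketch below computes the p-value of an observed Z statistic from the standard normal distribution. The function name `z_p_value` and the labels for the alternative hypothesis are assumptions for this example, not notation from the slides.

```python
from statistics import NormalDist


def z_p_value(z, alternative):
    """p-value of an observed Z statistic under the standard normal.

    alternative: "two-sided", "greater" or "less" (hypothetical labels).
    """
    phi = NormalDist().cdf  # standard normal cumulative distribution
    if alternative == "two-sided":
        return 2 * (1 - phi(abs(z)))  # area in both tails
    if alternative == "greater":
        return 1 - phi(z)  # upper-tail area
    if alternative == "less":
        return phi(z)  # lower-tail area
    raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")


alpha = 0.05
p = z_p_value(2.5, "two-sided")
print(p < alpha)  # reject H0 when p < alpha
```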
Example
• Researchers are interested in the mean age of a
certain population.
• A random sample of 10 individuals drawn from the
population of interest has a mean of 27.
• Assuming that the population is approximately
normally distributed with variance 20, can we
conclude that the mean is different from 30 years?
(α = 0.05)
Solution
1. Data: variable is age, n = 10, $\bar{x} = 27$, σ² = 20, α = 0.05
2. Assumptions: the population is approximately
normally distributed with variance 20
3. Hypotheses:
• H0: μ = 30
• HA: μ ≠ 30
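4. Test Statistic:
• $Z = \dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \dfrac{27 - 30}{\sqrt{20/10}} = -2.12$
5. Decision Rule:
• The alternative hypothesis is HA: μ ≠ 30.
• Hence we reject H0 if Z > 1.96 or Z < −1.96
(from the Z-table, Z0.025 = 1.96).
6. Decision:
• Since Z = −2.12 < −1.96, we reject H0 and conclude that the mean age is different from 30 years.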
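The arithmetic of this example can be checked with a few lines of Python. This is only a sketch using the numbers given in the example (n = 10, sample mean 27, variance 20, μ0 = 30); the variable names are illustrative.

```python
from math import sqrt

n = 10          # sample size
xbar = 27       # sample mean age
sigma2 = 20     # known population variance
mu0 = 30        # hypothesized mean under H0

# Z = (xbar - mu0) / (sigma / sqrt(n)), with sigma / sqrt(n) = sqrt(sigma2 / n)
z = (xbar - mu0) / sqrt(sigma2 / n)

z_crit = 1.96   # Z_{0.025}, the tabulated value quoted in the decision rule
reject = z > z_crit or z < -z_crit

print(round(z, 2), reject)  # -2.12 True
```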
Example
• Referring to the previous example, suppose the researchers
had asked: can we conclude that μ < 30?
1. Data: see the previous example.
2. Assumptions: see the previous example.
3. Hypotheses:
• H0: μ ≥ 30
• HA: μ < 30
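4. Test Statistic:
• $Z = \dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \dfrac{27 - 30}{\sqrt{20/10}} = -2.12$
5. Decision Rule: Reject H0 if Z < −Zα, where Zα = Z0.05 = 1.645 (from the Z-table).
6. Decision: Since Z = −2.12 < −1.645, we reject H0 and conclude that μ < 30.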
Example
• Among 157 African-American men, the mean
systolic blood pressure was 146 mm Hg with a
standard deviation of 27. We wish to know if,
on the basis of these data, we may conclude
that the mean systolic blood pressure for a
population of African-American men is greater than
140. Use α = 0.01.
Solution
1. Data: Variable is systolic blood pressure,
n = 157, $\bar{x} = 146$, s = 27, α = 0.01.
2. Assumption: the population is not normal, σ² is
unknown, and n is large.
3. Hypotheses: H0: μ ≤ 140
   HA: μ > 140
4. Test Statistic:
• $Z = \dfrac{\bar{X} - \mu_0}{s/\sqrt{n}} = \dfrac{146 - 140}{27/\sqrt{157}} = \dfrac{6}{2.1548} = 2.78$
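5. Decision Rule:
We reject H0 if Z > Zα, where Zα = Z0.01 = 2.33
(from the Z-table).
6. Decision: Since Z = 2.78 > 2.33, we reject H0 and conclude that the mean systolic blood pressure is greater than 140.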
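A short sketch of the blood pressure example, assuming only the figures stated in it (n = 157, sample mean 146, s = 27, μ0 = 140, α = 0.01). The critical value is computed with `statistics.NormalDist` rather than read from a table, so it differs slightly from the rounded 2.33.

```python
from math import sqrt
from statistics import NormalDist

n, xbar, s = 157, 146, 27
mu0, alpha = 140, 0.01

standard_error = s / sqrt(n)          # s / sqrt(n), about 2.1548
z = (xbar - mu0) / standard_error     # large-sample Z with s in place of sigma

z_crit = NormalDist().inv_cdf(1 - alpha)  # upper-tail critical value Z_alpha
reject = z > z_crit                       # HA: mu > 140, so reject for large Z

print(round(standard_error, 4), round(z, 2), reject)
```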
Hypothesis Testing: The Difference between
Two Population Means
• We have the following steps:
1. Data: determine the variable, sample sizes (n1, n2), sample means,
and population standard deviations (σ1, σ2), or sample standard
deviations (s1, s2) if σ1 and σ2 are unknown, for the two populations.
2. Assumptions: We have two cases:
• Case 1: The populations are normally or approximately normally
distributed with known or unknown variances (sample sizes
n1, n2 may be small or large).
• Case 2: The populations are not normal, with known variances (n1, n2
large, i.e. n1 ≥ 30 and n2 ≥ 30).
3. Hypotheses:
• We have three cases:
• Case I: H0: μ1 = μ2 → μ1 − μ2 = 0
  HA: μ1 ≠ μ2 → μ1 − μ2 ≠ 0
• e.g. we want to test that the mean of the first population is
not different from the mean of the second population.
• Case II: H0: μ1 ≤ μ2 → μ1 − μ2 ≤ 0
  HA: μ1 > μ2 → μ1 − μ2 > 0
• e.g. we want to test that the mean of the first population is
at most the mean of the second population.
• Case III: H0: μ1 ≥ μ2 → μ1 − μ2 ≥ 0
  HA: μ1 < μ2 → μ1 − μ2 < 0
• e.g. we want to test that the mean of the first population
is at least the mean of the second population.
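4. Test Statistic:
• Case 1: The two populations are normal or approximately
normal.
  i) If σ1², σ2² are known (n1, n2 large or small):
  $Z = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
  ii) If σ1², σ2² are unknown (n1, n2 small) and the population variances are equal:
  $T = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{S_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$, where $S_p^2 = \dfrac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$
  iii) If σ1², σ2² are unknown (n1, n2 small) and the population variances are not equal:
  $T = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$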
• Case 2: If the populations are not normally distributed,
n1 and n2 are large (n1 ≥ 30, n2 ≥ 30),
and the population variances are known:
  $Z = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
5. Decision Rule:
i) If HA: μ1 ≠ μ2 → μ1 − μ2 ≠ 0
• Reject H0 if Z > Zα/2 or Z < −Zα/2
(when using the Z-test)
• Or reject H0 if T > tα/2,(n1+n2−2) or T < −tα/2,(n1+n2−2)
(when using the T-test)
ii) If HA: μ1 > μ2 → μ1 − μ2 > 0
• Reject H0 if Z > Zα (when using the Z-test)
• Or reject H0 if T > tα,(n1+n2−2) (when using the T-test)
iii) If HA: μ1 < μ2 → μ1 − μ2 < 0
• Reject H0 if Z < −Zα (when using the Z-test)
• Or reject H0 if T < −tα,(n1+n2−2) (when using the T-test)
Note:
Zα/2 and Zα are tabulated values obtained from the Z-table.
tα/2 and tα are tabulated values obtained from the t-table
with (n1+n2−2) degrees of freedom (df).
6. Conclusion: reject or fail to reject H0.
Example
• Researchers wish to know if the data they have collected
provide sufficient evidence to indicate a difference in mean
serum uric acid levels between normal individuals and
individuals with Down's syndrome. The data consist of serum
uric acid readings on 12 individuals with Down's syndrome, drawn from a
normal distribution with variance 1, and 15 normal
individuals, drawn from a normal distribution with variance 1.5. The
sample means are $\bar{X}_1 = 4.5$ mg/100 ml and $\bar{X}_2 = 3.4$ mg/100 ml. Use α = 0.05.
Solution:
1. Data: Variable is serum uric acid level, n1 = 12, n2 = 15,
σ1² = 1, σ2² = 1.5, α = 0.05.
2. Assumption: The two populations are normal; σ1² and σ2²
are known.
3. Hypotheses: H0: μ1 = μ2 → μ1 − μ2 = 0
   HA: μ1 ≠ μ2 → μ1 − μ2 ≠ 0
4. Test Statistic:
• $Z = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \dfrac{(4.5 - 3.4) - 0}{\sqrt{\dfrac{1}{12} + \dfrac{1.5}{15}}} = 2.57$
5. Decision Rule:
Reject H0 if Z > Zα/2 or Z < −Zα/2
Zα/2 = Z0.05/2 = Z0.025 = 1.96 (from the Z-table)
6. Conclusion: Reject H0 since 2.57 > 1.96.
Or, using the p-value: p-value = 0.0102 < α = 0.05, so reject H0.
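The serum uric acid example can be reproduced as follows. This is a sketch under the example's stated assumptions (normal populations with known variances 1 and 1.5); the two-sided p-value is computed from the standard normal distribution, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n1, n2 = 12, 15            # Down's syndrome group, normal group
xbar1, xbar2 = 4.5, 3.4    # sample means
var1, var2 = 1.0, 1.5      # known population variances
alpha = 0.05

# Z = ((xbar1 - xbar2) - 0) / sqrt(var1/n1 + var2/n2); under H0, mu1 - mu2 = 0
z = (xbar1 - xbar2) / sqrt(var1 / n1 + var2 / n2)

# Two-sided p-value: area in both tails beyond |z|
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
reject = p_value < alpha

print(round(z, 2), reject)  # 2.57 True
```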
Example
The purpose of a study by Tam was to investigate wheelchair
maneuvering in individuals with lower-level spinal cord injury (SCI)
and healthy controls (C). Subjects used a wheelchair modified to
incorporate a rigid seat surface to facilitate the specified
experimental measurements. The data for measurements of the
left ischial tuberosity (the thigh bones and how they are affected by the wheelchair) for the SCI and control (C) groups
are shown below.
We wish to know if we can conclude, on the
basis of the above data, that the mean
left ischial tuberosity measurement for the control group (C) is lower
than the mean for the SCI group.
Assume normal populations with equal
variances. α = 0.05, p-value = 0.10
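Solution:
1. Data: nC = 10, nSCI = 10, SC = 21.8, SSCI = 32.2, α = 0.05.
• $\bar{X}_C = 126.1$, $\bar{X}_{SCI} = 133.1$ (calculated from the data)
• $S_p^2 = \dfrac{(n_C - 1)S_C^2 + (n_{SCI} - 1)S_{SCI}^2}{n_C + n_{SCI} - 2} = \dfrac{9(21.8)^2 + 9(32.2)^2}{18} = 756.04$
2. Assumption: The two populations are normal; σC² and σSCI² are
unknown but equal.
3. Hypotheses: H0: μC ≥ μSCI → μC − μSCI ≥ 0
   HA: μC < μSCI → μC − μSCI < 0
4. Test Statistic:
• $T = \dfrac{(\bar{X}_C - \bar{X}_{SCI}) - (\mu_C - \mu_{SCI})}{S_p \sqrt{\dfrac{1}{n_C} + \dfrac{1}{n_{SCI}}}} = \dfrac{(126.1 - 133.1) - 0}{\sqrt{756.04}\,\sqrt{\dfrac{1}{10} + \dfrac{1}{10}}} = -0.569$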
5. Decision Rule:
Reject H0 if T < −T1−α,(n1+n2−2)
T1−α,(n1+n2−2) = T0.95,18 = 1.7341 (from the t-table)
6. Conclusion: Since T = −0.569 > −1.7341, we fail to reject H0.
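The pooled-variance T statistic for the wheelchair example can be sketched as below. One assumption is made loudly: the SCI sample standard deviation is taken as 32.2, the value implied by the pooled variance 756.04 that appears in the test statistic. The critical value 1.7341 is the tabulated value quoted in the decision rule, since the Python standard library has no t distribution.

```python
from math import sqrt

n_c, n_sci = 10, 10
xbar_c, xbar_sci = 126.1, 133.1
s_c = 21.8
s_sci = 32.2   # assumption: value implied by the pooled variance 756.04

# Pooled variance: weighted average of the two sample variances
sp2 = ((n_c - 1) * s_c**2 + (n_sci - 1) * s_sci**2) / (n_c + n_sci - 2)

# T = ((xbar_c - xbar_sci) - 0) / (Sp * sqrt(1/n_c + 1/n_sci))
t = (xbar_c - xbar_sci) / (sqrt(sp2) * sqrt(1 / n_c + 1 / n_sci))

t_crit = 1.7341            # t_{0.95, 18} taken from the t-table in the text
reject = t < -t_crit       # HA: mu_C < mu_SCI, so reject for very negative T

print(round(sp2, 2), round(t, 3), reject)  # 756.04 -0.569 False
```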
Group            Mean IgG level   Sample size   Sample standard deviation
Thrombosis       59.01            53            44.89
No thrombosis    46.61            54            34.85
Solution:
1. Data: n1 = 53, n2 = 54, $\bar{X}_1 = 59.01$, $\bar{X}_2 = 46.61$, S1 = 44.89, S2 = 34.85, α = 0.01.
2. Assumption: The two populations are not normal, σ1² and σ2²
are unknown, and the sample sizes are large.
3. Hypotheses: H0: μ1 = μ2 → μ1 − μ2 = 0
   HA: μ1 > μ2 → μ1 − μ2 > 0
4. Test Statistic:
• $Z = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}} = \dfrac{(59.01 - 46.61) - 0}{\sqrt{\dfrac{44.89^2}{53} + \dfrac{34.85^2}{54}}} = 1.59$
5. Decision Rule:
Reject H0 if Z > Z1−α
Z1−α = Z0.99 = 2.33 (from the Z-table)
6. Conclusion: Since Z = 1.59 < 2.33, we fail to reject H0.
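For the thrombosis example, a large-sample Z statistic with sample standard deviations in place of the unknown population values can be computed as in this sketch. The numbers are those in the table and solution above; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n1, n2 = 53, 54                 # thrombosis, no thrombosis
xbar1, xbar2 = 59.01, 46.61     # sample means
s1, s2 = 44.89, 34.85           # sample standard deviations
alpha = 0.01

# Standard error of the difference, using s1 and s2 for the unknown sigmas
se_diff = sqrt(s1**2 / n1 + s2**2 / n2)
z = (xbar1 - xbar2) / se_diff   # hypothesized difference under H0 is 0

z_crit = NormalDist().inv_cdf(1 - alpha)  # Z_{0.99}, about 2.33
reject = z > z_crit                       # HA: mu1 > mu2

print(round(z, 2), reject)  # 1.59 False
```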
Hypothesis Testing: A Single
Population Proportion
• Testing hypotheses about a population proportion (P) is carried out
in much the same way as for a mean, provided the conditions necessary for
using the normal curve are met.
• We have the following steps:
1. Data: sample size (n), sample proportion ($\hat{p}$), and hypothesized proportion P0, where
$\hat{p} = \dfrac{\text{number of elements in the sample with some characteristic}}{\text{total number of elements in the sample}} = \dfrac{a}{n}$
3. Hypotheses:
• We have three cases:
• Case I: H0: P = P0
  HA: P ≠ P0
• Case II: H0: P ≤ P0
  HA: P > P0
• Case III: H0: P ≥ P0
  HA: P < P0
4. Test Statistic:
$Z = \dfrac{\hat{p} - p_0}{\sqrt{\dfrac{p_0 q_0}{n}}}$, where $q_0 = 1 - p_0$
5. Decision Rule:
i) If HA: P ≠ P0
• Reject H0 if Z > Z1−α/2 or Z < −Z1−α/2
ii) If HA: P > P0
• Reject H0 if Z > Z1−α
iii) If HA: P < P0
• Reject H0 if Z < −Z1−α
Note: Z1−α/2, Z1−α and Zα are tabulated values obtained from the Z-table.
6. Conclusion: reject or fail to reject H0.
Example
Wagen collected data on a sample of 301 Hispanic women
living in Texas. One variable of interest was the percentage
of subjects with impaired fasting glucose (IFG). In the
study, 24 women were classified in the IFG stage. The article
cites population estimates for IFG among Hispanic women
in Texas as 6.3 percent. Is there sufficient evidence to
indicate that the population of Hispanic women in Texas has a
prevalence of IFG higher than 6.3 percent? Let α = 0.05.
Solution:
1. Data: n = 301, a = 24, p0 = 6.3/100 = 0.063,
q0 = 1 − p0 = 1 − 0.063 = 0.937, α = 0.05,
$\hat{p} = \dfrac{a}{n} = \dfrac{24}{301} = 0.08$
2. Assumptions: $\hat{p}$ is approximately normally distributed.
3. Hypotheses:
• H0: P = 0.063
  HA: P > 0.063
4. Test Statistic:
• $Z = \dfrac{\hat{p} - p_0}{\sqrt{\dfrac{p_0 q_0}{n}}} = \dfrac{0.08 - 0.063}{\sqrt{\dfrac{0.063(0.937)}{301}}} = 1.21$
5. Decision Rule: Reject H0 if Z > Z1−α, where Z1−α = Z0.95 = 1.645.
6. Conclusion: Fail to reject H0,
since Z = 1.21 < Z1−α = 1.645.
Or, if p-value = 0.1131,
fail to reject H0 since p-value > α.
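A sketch of the IFG calculation, assuming the figures stated in the example (24 of 301 women, p0 = 0.063, α = 0.05). The helper name `proportion_z` is hypothetical. The slides round the sample proportion to 0.08 before computing Z, so the sketch does the same and also shows the statistic for the unrounded proportion, which is slightly smaller; the decision is the same either way.

```python
from math import sqrt
from statistics import NormalDist

n, a = 301, 24            # sample size, women classified as IFG
p0, alpha = 0.063, 0.05   # hypothesized proportion, significance level
q0 = 1 - p0


def proportion_z(p_hat):
    """Z statistic for a single proportion: (p_hat - p0) / sqrt(p0*q0/n)."""
    return (p_hat - p0) / sqrt(p0 * q0 / n)


z_rounded = proportion_z(round(a / n, 2))  # p_hat rounded to 0.08, as in the slides
z_exact = proportion_z(a / n)              # p_hat = 24/301 without rounding

z_crit = NormalDist().inv_cdf(1 - alpha)   # Z_{0.95}, about 1.645
reject = z_rounded > z_crit                # HA: P > 0.063

print(round(z_rounded, 2), round(z_crit, 3), reject)  # 1.21 1.645 False
```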
Hypothesis Testing: The
Difference between Two
Population Proportions
• Testing hypotheses about two population proportions (P1, P2) is
carried out in much the same way as for the difference between two
means, provided the conditions necessary for using the normal curve are met.
• We have the following steps:
1. Data: sample sizes (n1 and n2), sample proportions ($\hat{P}_1$, $\hat{P}_2$),
and the number of elements with the characteristic in the two samples (x1, x2).
3. Hypotheses:
• We have three cases:
• Case I: H0: P1 = P2 → P1 − P2 = 0
  HA: P1 ≠ P2 → P1 − P2 ≠ 0
• Case II: H0: P1 ≤ P2 → P1 − P2 ≤ 0
  HA: P1 > P2 → P1 − P2 > 0
• Case III: H0: P1 ≥ P2 → P1 − P2 ≥ 0
  HA: P1 < P2 → P1 − P2 < 0
4. Test Statistic:
$Z = \dfrac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{\bar{p}(1 - \bar{p})}{n_1} + \dfrac{\bar{p}(1 - \bar{p})}{n_2}}}$, where $\bar{p} = \dfrac{x_1 + x_2}{n_1 + n_2}$ is the pooled sample proportion
5. Decision Rule:
i) If HA: P1 ≠ P2
• Reject H0 if Z > Z1−α/2 or Z < −Z1−α/2
ii) If HA: P1 > P2
• Reject H0 if Z > Z1−α
iii) If HA: P1 < P2
• Reject H0 if Z < −Z1−α
Note: Z1−α/2, Z1−α and Zα are tabulated values obtained from the Z-table.
6. Conclusion: reject or fail to reject H0.
Example
Noonan syndrome is a genetic condition that can affect the heart, growth,
blood clotting, and mental and physical development. Noonan examined
the stature of men and women with Noonan syndrome. The study contained 29
male and 44 female adults. One of the cut-off values used to assess
stature was the third percentile of adult height. Eleven of the males fell
below the third percentile of adult male height, while 24 of the females
fell below the third percentile of female adult height. Does this study
provide sufficient evidence for us to conclude that among subjects with
Noonan syndrome, females are more likely than males to fall below the respective
third percentile of adult height? Let α = 0.05.
Solution:
1. Data: nM = 29, nF = 44, xM = 11, xF = 24, α = 0.05
$\bar{p} = \dfrac{x_M + x_F}{n_M + n_F} = \dfrac{11 + 24}{29 + 44} = 0.479$, $\hat{p}_M = \dfrac{x_M}{n_M} = \dfrac{11}{29} = 0.379$, $\hat{p}_F = \dfrac{x_F}{n_F} = \dfrac{24}{44} = 0.545$
2. Assumption: The two populations are independent.
3. Hypotheses:
• Case II: H0: PF ≤ PM → PF − PM ≤ 0
  HA: PF > PM → PF − PM > 0
4. Test Statistic:
$Z = \dfrac{(\hat{p}_F - \hat{p}_M) - (p_F - p_M)}{\sqrt{\dfrac{\bar{p}(1 - \bar{p})}{n_F} + \dfrac{\bar{p}(1 - \bar{p})}{n_M}}} = \dfrac{(0.545 - 0.379) - 0}{\sqrt{\dfrac{(0.479)(0.521)}{44} + \dfrac{(0.479)(0.521)}{29}}} = 1.39$
5. Decision Rule:
Reject H0 if Z > Z1−α, where Z1−α = Z1−0.05 = Z0.95 = 1.645
6. Conclusion: Fail to reject H0,
since Z = 1.39 < Z1−α = 1.645.
Or, if p-value = 0.0823, fail to reject H0 since p-value > α.
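The Noonan example can be checked with the following sketch, which assumes only the counts given in the example (11 of 29 males and 24 of 44 females below the third percentile). It pools the two samples to estimate the common proportion under H0 and computes the upper-tail p-value from the standard normal distribution; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n_m, x_m = 29, 11    # males: sample size, number below the third percentile
n_f, x_f = 44, 24    # females: sample size, number below the third percentile
alpha = 0.05

p_hat_m = x_m / n_m
p_hat_f = x_f / n_f
p_bar = (x_m + x_f) / (n_m + n_f)   # pooled proportion under H0

# Z = ((p_hat_f - p_hat_m) - 0) / sqrt(p_bar*(1-p_bar)/n_f + p_bar*(1-p_bar)/n_m)
se = sqrt(p_bar * (1 - p_bar) / n_f + p_bar * (1 - p_bar) / n_m)
z = (p_hat_f - p_hat_m) / se

p_value = 1 - NormalDist().cdf(z)   # upper tail, since HA: PF > PM
reject = p_value < alpha

print(round(p_bar, 3), round(z, 2), reject)  # 0.479 1.39 False
```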