Lecture 4-Statistical Inferences

The document is a lecture on Estimation and Hypothesis Testing for MPH students, covering key concepts such as types of estimation, confidence intervals, and types of errors. It discusses the importance of statistical inference, properties of good estimators, and provides examples of confidence intervals for single populations and differences between populations. The lecture emphasizes the use of sample statistics to estimate population parameters and the significance of understanding confidence levels in statistical analysis.

Uploaded by

Dagi Magna

Yekatit 12 Hospital Medical College

School of Public Health


Department of Epidemiology and Biostatistics
Lecture 4: Estimation and Hypothesis Testing
For MPH Student
BY
Dube Jara
Assistant Professor of Epidemiology
Email: [email protected]

February, 2025
Addis Ababa, Ethiopia 1
Outline
• Introduction
• Types of Estimation
• Confidence intervals
• Types of errors
• Power of a test and significance level
• Application of different test statistics

2
Introduction
• Estimation and hypothesis testing are the two forms of
statistical inference

• Statistical inference is the procedure by which we reach a
conclusion about a population on the basis of the information
contained in a sample drawn from that population

3
Population
Sample

Parameter Statistic

4
Introduction…
Definitions
• Population: the largest collection of individuals of interest

• Sample: part of the population

• Parameter: numerical value of some characteristic of a
population (a fixed number; in practice we do not know its
value). Example: µ, σ, σ², P or π

• Statistic: numerical value of some characteristic of a sample
(value is known, but can change from sample to sample).
– We often use a statistic to estimate an unknown parameter.
Example: s, s², p, x̄

5
Estimation
• The objective of estimation is to determine the value of
a population parameter on the basis of a sample
statistic.

• In short, it is the use of sample statistic to estimate


population parameter.

6
Estimate and Estimator
• A single computed value is an estimate.

• Estimator is the formula or the method used to obtain an


estimate.

• x̄ = Σxᵢ/n is an estimator of the population mean, µ.

– Each sample statistic can be used only to estimate the


corresponding population parameter.

7
Properties of good estimator
• Unbiasedness
– An estimator is unbiased if the mean of its sampling distribution
(its expected value) is equal to the population parameter.
Example:
– The mean of the sampling distribution of sample means is equal to the
population mean, hence the sample mean is an unbiased estimator of the
population mean.

• Minimum variance
– An estimator which has the minimum standard error among the estimators
considered
Example:
– The standard error is the standard deviation of the sampling distribution
of a statistic, e.g., the standard deviation of the means of several samples.
– In some skewed distributions the median can have a smaller standard error
than the mean

8
Cont’d…
• Efficiency: an estimator is efficient if it has the smallest standard
error compared to other estimators. E.g., for normally distributed data the
mean has a smaller standard error than the median.

• Consistency: an estimator is consistent if the sample statistic tends to
the value of the parameter as the sample size increases. E.g., the mean is a
consistent estimator.

• Sufficiency: an estimator is sufficient if it uses all the sample values
in its computation. E.g., the mean
– B/c the mean (x̄) has all these properties, it is a good estimator of
the population parameter (μ)

9
Types of estimation
• There are two types of estimations , Point Estimation and
Interval Estimation

Point Estimation
– Is a single numerical value used to estimate the
corresponding population parameter

– A point estimate of some population parameter “A” is a
single value “a” of a sample statistic

10
Point estimation , Example
• The mean stay of 2000 inpatients , who are randomly selected ,
in B hospital is found to be 5 days with a standard deviation of
2 days.

• Then, the point estimates for the population parameters µ and


σ , with regard to hospital stay , are 5 days and 2 days
respectively.
Sample statistic     Population parameter
x̄                    µ
s²                   σ²
s                    σ
p                    π or P

11
Interval Estimation
• It is a statement that a population parameter has a value
lying between two specified limits, made with a certain level of
confidence.
• A point estimate does not give any indication of how far away
from it the parameter may lie.
• A more useful method of estimation is to compute an interval
which has a high probability of containing the parameter.
This leads to the concept of the confidence interval

12
Confidence interval
• Is an interval estimate of a population parameter

• Used to indicate the reliability of an estimate

• How likely the interval is to contain the parameter is determined by
the confidence level (confidence coefficient), 1-α; the corresponding
reliability coefficient is the value of Z at 1-α.

• Increasing the desired confidence level will widen the
confidence interval

13
Cont’d…
A larger confidence level produces a wider confidence interval

14
Cont’d…
• A confidence interval has the form of,

Point estimate ± margin of error (Precision)

Margin of error =Reliability coefficient x Standard error

• The estimate is our guess for the value of the unknown


parameter

• The margin of error shows how accurate we believe our guess


is, based on the sampling distribution of the estimate.
15
Cont’d…
• The confidence interval has a lower and an upper limit that can be
expressed in the form of
Lower limit = Estimate – (Zα/2 × standard error)
Upper limit = Estimate + (Zα/2 × standard error)

N.B. The standard error of the sample mean is computed as σ/√n

16
Cont’d…
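• Example: estimating a population mean

$$\bar{x} \pm Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$

$$\left(\bar{x} - Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}},\;\; \bar{x} + Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right)$$

The first term is the lower limit and the second is the upper limit.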

17
Cont’d…
• A confidence interval provides a range of values around the estimate
that is likely to include the “true” (population) value with a given
probability.

• It is usually accepted to allow a 5% chance that the range will not
include the true population value; the resulting range is
called the 95% confidence interval.

• When we say a 95% confidence interval, we mean that, in
repeated sampling, intervals constructed this way would encompass the
true parameter value 95% of the time.

18
The following table shows the standard errors computed for
the estimates of different population parameters that will be used

Parameter                           Estimate     Standard error
Mean, µ                             x̄            √(σ²/n)
Difference in means, µ1-µ2          x̄1 - x̄2      √(σ1²/n1 + σ2²/n2)
Proportion, π                       p            √(π(1-π)/n)
Difference in proportions, π1-π2    p1 - p2      √(π1(1-π1)/n1 + π2(1-π2)/n2)
19
Estimation for Single Population

20
1. CI for a Single Population Mean (normally
distributed)
A. Known variance (large sample size)
• Consider the task of computing a CI estimate of μ for a
population distribution that is normal with σ known.
• Available are data from a random sample of size = n.

21
Assumptions
• Population standard deviation (σ) is known
• Population is normally distributed
• If population is not normal, use large sample
• A 100(1-α)% C.I. for µ is:
$$\bar{x} \pm Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$

22
3. Commonly used CLs are 90%, 95%, and 99%

23
Finding the Critical Value

24
Example :
• The mean weight of 100 children who are 5 years old in
a certain locality is found to be 14 kg. A clinician wants
to know the mean weight of all the children in that
locality with 95 % confidence interval, if it is known
that the SD for all children is 4kg.

25
Cont’d…
Given points:
CI = 95% ⇒ α = 0.05 and α/2 = 0.025, and the value
of Z at α/2 is 1.96
n = 100, σ = 4 and x̄ = 14
• When you insert the given values in the formula
$$\bar{x} \pm Z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$$
• The result will be 14 ± 0.784 (13.22 and
14.78)
• Interpretation ?
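The arithmetic of this example can be checked with a short sketch. It assumes only the values given on the slide (x̄ = 14, σ = 4, n = 100); the function name `z_confidence_interval` is illustrative and not part of the lecture.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(mean, sigma, n, confidence=0.95):
    """CI for a population mean when the population SD (sigma) is known."""
    alpha = 1 - confidence
    # Reliability coefficient Z at alpha/2 (about 1.96 for 95%)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sigma / sqrt(n)  # margin of error = Z x standard error
    return mean - margin, mean + margin


lower, upper = z_confidence_interval(mean=14, sigma=4, n=100)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

One common reading of the result: we are 95% confident that the mean weight of all 5-year-old children in the locality lies between about 13.22 kg and 14.78 kg.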
26
Cont’d…
• Example: suppose a survey conducted on a
representative sample of 900 newborn babies in A/A
found that their average weight at birth is 3.5 kg
with an SD of 0.5 kg. Estimate the mean weight of newborn babies in
A/A at the 95% level of confidence.
• Solution:
Given n = 900, x̄ = 3.5 kg, S = 0.5 kg, level of confidence = 95%, μ = ?
Case-II: α/2 = 0.025 b/c α = 0.05, so Z = 1.96
= 3.5 ± 1.96 × (0.5/√900) = 3.5 ± 0.033
= (3.467, 3.533)
27
B. Unknown variance (small sample size, n ≤ 30)
• What if the σ for the underlying population is unknown and the
sample size is small?
• As an alternative we use Student’s t distribution.

28
29
Student’s t Distribution
• The t is a family of distributions
• Bell Shaped
• Symmetric about zero (the mean)
• Flatter than the Normal (0,1). This means
– The variability of a t is greater than that of a Z that is
normal(0,1)
– Thus, there is more area under the tails and less at center
– Because variability is greater, resulting confidence
intervals will be wider.

30
• Note: t approaches z as n increases

31
What happens to CI as sample gets larger?

$$\bar{x} \pm Z\left(\frac{s}{\sqrt{n}}\right) \qquad \bar{x} \pm t\left(\frac{s}{\sqrt{n}}\right)$$

For large samples, Z and t values become almost identical, so CIs
are almost identical.

32
Degrees of Freedom (df)

df = Number of observations that are free to vary after


sample mean has been calculated

df = n-1

33
Student’s t Table

34
t distribution values
• With comparison to the Z value

35
Example2
• A sample of 20 houses was studied to estimate the mean
sprayable area of a house for controlling a malaria
epidemic. The result was x̄ = 22.9 m², with an SD of
6.0 m². Construct a CI for the mean sprayable area of the
population with 95% confidence.
• Solution: given x̄ = 22.9 m², SD = 6.0 m², α = 0.05, α/2 = 0.025, degrees of
freedom (n-1) = 19, t = 2.09
= 22.9 ± 2.09(6/√20) = 22.9 ± 2.09(1.34)
= 22.9 ± 2.8 = (22.9 - 2.8, 22.9 + 2.8)
= (20.1, 25.7)
• We are 95% confident that the mean sprayable area of a house
is b/n 20.1 and 25.7 m².
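As a check on this calculation, here is a minimal sketch of the t-based interval. The critical value t = 2.09 is taken from the slide (19 df, α/2 = 0.025) rather than computed; the function name `t_confidence_interval` is a hypothetical helper.

```python
from math import sqrt


def t_confidence_interval(mean, sd, n, t_crit):
    """CI for a mean using the sample SD and a t critical value read from a table."""
    margin = t_crit * sd / sqrt(n)  # t x (s / sqrt(n))
    return mean - margin, mean + margin


# Values from the sprayable-area example; t_crit looked up for n - 1 = 19 df
lower, upper = t_confidence_interval(mean=22.9, sd=6.0, n=20, t_crit=2.09)
print(f"95% CI: ({lower:.1f}, {upper:.1f})")
```

The lower limit works out to about 20.1 m², which is 22.9 - 2.8.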
36
2.CI of single Population proportion

• The CI for population proportion is calculated as:

$$p \pm Z_{\alpha/2}\sqrt{\frac{pq}{n}}, \qquad q = 1 - p$$
37
Example 1
• A random sample of 100 people shows that 25 are
left-handed. Form a 95% CI for the true proportion of
left-handers.
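The slide's worked solution is not reproduced in this text, so the following sketch applies the formula p ± Zα/2 √(pq/n) to the stated data (25 left-handers out of 100). The function name `proportion_confidence_interval` is illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence=0.95):
    """Normal-approximation CI for a single population proportion."""
    p = successes / n
    q = 1 - p
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p * q / n)
    return p - margin, p + margin


lower, upper = proportion_confidence_interval(successes=25, n=100)
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

Under these assumptions the interval runs from roughly 0.165 to 0.335, i.e. we would be 95% confident that the true proportion of left-handers lies between about 16.5% and 33.5%.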

38
Interpretation

39
Estimation for Two Populations

40
3. CI for the difference between population
means (normally distributed)
A. Known variances (2 independent samples)

• When σ1 and σ2 are known and both populations
are normal or both sample sizes are at least 30,
the test statistic is a z-value…

41
42
Illustration
• A researcher performs a drug trial involving two
independent groups.
– A control group is treated with a placebo while, separately;
– The intervention group is treated with an active agent.

– Interest is in a comparison of the mean control response


with the mean intervention response under the
assumption that the responses are independent.

43
Example
• Researchers are interested in the difference between serum
uric acid levels in patients with and without Down’s syndrome.
• Patients without Down’s syndrome
– n = 12, sample mean = 4.5 mg/100 ml, σ² = 1.0
• Patients with Down’s syndrome
– n = 15, sample mean = 3.4 mg/100 ml, σ² = 1.5
• Calculate the 95% CI.
• SE = 0.43, 95% CI = 1.1 ± 1.96 (0.43) = (0.26, 1.94)
• We are 95% confident that the true difference between the
two population means is between 0.26 and 1.94.
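The interval above can be reproduced with a small sketch, assuming the known-variance standard error √(σ1²/n1 + σ2²/n2) from the standard error table. The function name `diff_means_ci_known_var` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def diff_means_ci_known_var(mean1, var1, n1, mean2, var2, n2, confidence=0.95):
    """CI for mu1 - mu2 from two independent samples with known variances."""
    se = sqrt(var1 / n1 + var2 / n2)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = mean1 - mean2
    return diff - z * se, diff + z * se, se


# Serum uric acid example: without vs. with Down's syndrome
lower, upper, se = diff_means_ci_known_var(4.5, 1.0, 12, 3.4, 1.5, 15)
print(f"SE = {se:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```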

44
B. Unknown variances (Independent samples)
I. Population variances equal (large sample)
• Assumptions:
– Samples are randomly and independently drawn
– Both sample sizes are ≥30
– Population standard deviations are unknown

45
Forming confidence estimates:
• Use sample standard deviation s to estimate σ, and
• the test statistic is a z-value

46
II. Population variances equal (small sample)
• Assumptions:
– Populations are normally distributed
– The populations have equal variances
– Samples are independent
– Both sample sizes are <30
– Population standard deviations are unknown

* If 0.5 ≤ s1²/s2² ≤ 2, then we assume that the population variances are
equal.

47
Forming confidence estimates:
• The population variances are assumed equal, so use
the two sample standard deviations and pool them to
estimate σ
• The test statistic is a t value with (n1 + n2 – 2) degrees
of freedom
• The pooled estimate (s²p) is the weighted average of
the two sample variances.

48
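• The pooled standard deviation is:
$$s_p = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}}$$

• The standard error of the estimate is given by:
$$SE(\bar{x}_1-\bar{x}_2) = s_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}$$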
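A minimal sketch of pooling, assuming the usual definition of the pooled variance as the average of the two sample variances weighted by their degrees of freedom. The input numbers are hypothetical and chosen only so the result is easy to verify by hand; `pooled_variance` and `pooled_standard_error` are illustrative names.

```python
from math import sqrt


def pooled_variance(var1, n1, var2, n2):
    """Weighted average of two sample variances (weights = degrees of freedom)."""
    return ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)


def pooled_standard_error(var1, n1, var2, n2):
    """Standard error of (x1_bar - x2_bar) under the equal-variance assumption."""
    sp = sqrt(pooled_variance(var1, n1, var2, n2))
    return sp * sqrt(1 / n1 + 1 / n2)


# Hypothetical inputs: s1^2 = 4 and s2^2 = 6, with 11 observations in each group
sp2 = pooled_variance(4.0, 11, 6.0, 11)
se = pooled_standard_error(4.0, 11, 6.0, 11)
print(f"pooled variance = {sp2:.2f}, SE = {se:.3f}")
```

With equal sample sizes the pooled variance is simply the midpoint of the two sample variances (here 5).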

49
50
III. Population variances unequal (small sample)

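• The confidence interval for µ1-µ2 is:
$$(\bar{x}_1-\bar{x}_2) \pm t_{d',\,\alpha/2}\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}$$

• Where the degree of freedom (d’) is commonly
given by the Welch–Satterthwaite approximation:
$$d' = \frac{\left(\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1-1}+\frac{(s_2^2/n_2)^2}{n_2-1}}$$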

51
C. Paired Samples
• Tests means of 2 related populations
– Paired or matched samples
– Repeated measures (before/after)
– Use difference between paired values:
d = x1-x2
• Eliminates variation among subjects
• Assumptions:
– Both populations are normally distributed,
– Or, if not normal, use large samples.

52
53
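• The confidence interval for µd is d̄ ± tα/2 (sd/√n), where d̄ and sd are the mean and standard deviation of the n paired differences and tα/2 has n-1 df.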
Example
• Ten hypertensive patients are given methyl dopa for
their condition.
• They are asked to come back 1 week later and have
their blood pressures measured again. Suppose the
initial and follow-up SBPs (mm Hg) of the patients are
given below.
54
Example…

1. What are the mean and sd of the differences?
2. What is the standard error of the mean difference?
3. Assuming that the differences are normally distributed,
construct a 95% CI for µd.
55
Answer
• We have the following data and summary statistics

56
4. Two Population Proportions
• We are often interested in comparing proportions
from 2 populations:
• Is the incidence of disease A the same in two
populations?
• Patients are treated with either drug D, or with
placebo. Is the proportion “improved” the same in
both groups?

57
58
Confidence Interval for Two Population Proportions

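• SE of the difference =
$$\sqrt{\frac{p_1(1-p_1)}{n_1}+\frac{p_2(1-p_2)}{n_2}}$$

• The confidence interval for p1 – p2 is:
$$(p_1 - p_2) \pm Z_{\alpha/2}\sqrt{\frac{p_1(1-p_1)}{n_1}+\frac{p_2(1-p_2)}{n_2}}$$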

59
Example
• In a clinical trial for a new drug to treat hypertension,
n1 = 50 patients were randomly assigned to receive
the new drug, and n2 = 50 patients to receive a
placebo. 34 of the patients receiving the drug
showed improvement, while 15 of those receiving
placebo showed improvement.
– Compute a 95% CI estimate for the difference between
proportions improved.

60
Example…
• p1 = 34/50 = 0.68, p2 = 15/50 = 0.30
• The point estimate for the difference is:
= [0.68−0.30]=0.38

• SE of the difference = √(0.68(0.32)/50 + 0.30(0.70)/50) = 0.0925

• 95% CI
– Lower = ( point estimate ) - (Zα/2) (SE)
= 0.38 – (1.96)(0.0925) = 0.20
– Upper = ( point estimate ) + (Zα/2) (SE)
= 0.38 + (1.96)(0.0925) = 0.56
• 95% CI = (0.20, 0.56)
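The numbers in this example can be reproduced with the sketch below, which assumes the unpooled standard error √(p1(1-p1)/n1 + p2(1-p2)/n2). The function name `diff_proportions_ci` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def diff_proportions_ci(x1, n1, x2, n2, confidence=0.95):
    """Normal-approximation CI for p1 - p2 from two independent samples."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1 - p2
    return diff - z * se, diff + z * se, se


# 34 of 50 improved on the drug, 15 of 50 improved on placebo
lower, upper, se = diff_proportions_ci(34, 50, 15, 50)
print(f"SE = {se:.4f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

Because the interval excludes 0, the data suggest a real difference in the proportion improved between the two groups.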

61
Hypothesis Testing
Hypothesis
• A hypothesis is a statement made about one or more population
parameters

• It helps in reaching a conclusion concerning a population by
examining a sample from that population

• The purpose of hypothesis testing (HT) is to aid the clinician,
researcher or administrator in reaching a decision (conclusion)
concerning a population by examining a sample from that population

63
Examples of Research Hypotheses
Population Mean
• The average length of stay of patients admitted to the hospital
is five days
• The mean birth weight of babies delivered by mothers with low
SES is lower than those from higher SES.
Population Proportion
• The proportion of adult smokers in Addis Ababa City is p = 0.40
• The prevalence of HIV among non-married adults is higher than
that in married adults, etc

64
Cont’d…
There are five ingredients to any statistical test
• Null Hypothesis
• Alternative Hypothesis
• Test Statistic
• Rejection/Critical Region
• Conclusion

65
Types of Hypothesis
1. The Null Hypothesis, H0
· Is a statement claiming that there is no difference between
the hypothesized value and the population value.
· (The effect of interest is zero = no difference)
· States the assumption (hypothesis) to be tested
· H0 is a statement of agreement (or no difference)
· H0 is always about a population parameter, not about a
sample statistic

66
Cont’d…
• Begin with the assumption that the Ho is true
– Similar to the notion of innocent until proven guilty

• Always contains “=” , “ ≤” or “≥ ” sign


• May or may not be rejected

67
2. The Alternative Hypothesis, HA
• Is a statement of what we will believe is true if our sample data
causes us to reject Ho.
• Is generally the hypothesis that is believed (or needs to be
supported) by the researcher
• Is a statement that disagrees (opposes) with Ho
(The effect of interest is not zero)
· Never contains “=” , “ ≤” or “≥ ” sign
• May or may not be accepted

68
Steps in Hypothesis Testing
1. Formulate the appropriate statistical hypotheses clearly
Specify Ho and HA
H0: µ = µ0      H0: µ ≤ µ0      H0: µ ≥ µ0
H1: µ ≠ µ0      H1: µ > µ0      H1: µ < µ0
two-tailed      one-tailed      one-tailed
2. State the assumptions necessary for computing probabilities
• A distribution is approximately normal (Gaussian)
• Variance is known or unknown

69
Cont’d…
3. Select a sample and collect data
• Categorical, continuous
4. Decide on the appropriate test statistic for the
hypothesis. E.g., One population

OR

70
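Cont’d…
5. Specify the desired level of significance for the
statistical test and determine the critical value
(α = 0.05, 0.01, etc.)
– A value the test statistic must attain to be declared
significant.
– At α = 0.05 the critical values of Z are -1.96 and 1.96 for a
two-tailed test, and 1.645 (upper tail) or -1.645 (lower tail) for a
one-tailed test.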
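The critical values quoted for step 5 come from the standard normal distribution. A short sketch, using a hypothetical helper `z_critical`, shows how they follow from α:

```python
from statistics import NormalDist


def z_critical(alpha, two_tailed=True):
    """Positive critical value of Z for a given significance level."""
    # A two-tailed test splits alpha equally between the two tails
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)


z_two = z_critical(0.05)                    # two-tailed, alpha = 0.05
z_one = z_critical(0.05, two_tailed=False)  # one-tailed, alpha = 0.05
print(f"two-tailed: ±{z_two:.2f}, one-tailed: {z_one:.3f}")
```

For a lower-tailed test the same value is used with a negative sign.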

71
Cont’d…
6. Obtain sample evidence and compute the test statistic
7. Reach a decision and draw the conclusion
• If Ho is rejected, we conclude that HA is true (or
accepted).
• If Ho is not rejected, we conclude that Ho may
be true.

72
Significance level
• The significance level of a statistical hypothesis test is a fixed
probability of wrongly rejecting the null hypothesis H0, if it is in
fact true.

• It is the maximum probability of committing a Type I Error

• The significance level is usually denoted by: α


P (type I error) = α

• The commonly used significance level is 0.05

73
Another way to state conclusion
• Reject Ho if P-value < α
• Do not reject Ho if P-value ≥ α

The P-value is the probability of obtaining a test statistic as extreme
as or more extreme than the actual test statistic obtained, if
Ho is true
The larger the absolute value of the test statistic, the smaller the
P-value; and the smaller the P-value, the stronger the evidence against Ho.

74
Types of Errors in Hypothesis Tests
• Whenever we reject or fail to reject Ho, we may commit an error.
• Two types of errors can be committed.

– Type I Error
– Type II Error

75
Type I Error
• The error committed when a true Ho is rejected
• Considered a serious type of error
• The probability of a type I error is the probability of rejecting
the Ho when it is true
• The probability of type I error is α which is Called level of
significance of the test
• Set by researcher in advance

76
Type II Error
• The error committed when a false Ho is not rejected
• The probability of Type II Error is β
• Usually unknown, but in practice often larger than α
Power
• The probability of rejecting the Ho when it is false.
Power = 1 – β = 1- probability of type II error

• We would like to maintain low probability of a Type I error (α)


and low probability of a Type II error (β) [high power = 1 - β].

77
Action (conclusion)    Reality: Ho true                     Reality: Ho false
Do not reject Ho       Correct action (Prob. = 1-α)         Type II error (Prob. = β = 1-Power)
Reject Ho              Type I error (Prob. = α = sign. level)   Correct action (Prob. = Power = 1-β)

78
Factors Affecting Type II Error

79
Factors Affecting the Power of the Test

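The power depends on the sample size (n), the size of the true
difference |µ1-µo|, the standard deviation (σ) and the significance level (α):

1. As n↑, power ↑
2. As |µ1-µo|↑, power ↑
3. As σ↑, power ↓
4. As α↓, power ↓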
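These four relationships can be illustrated numerically. The sketch below assumes the standard power formula for a two-sided one-sample Z test with known σ, which is not given on the slide; the function name `power_two_sided_z` and all input values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sided_z(mu0, mu1, sigma, n, alpha=0.05):
    """Power of a two-sided one-sample Z test when the true mean is mu1."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = abs(mu1 - mu0) * sqrt(n) / sigma  # standardized true difference
    # Probability that Z falls in either rejection region under mu1
    return z.cdf(-z_crit + shift) + z.cdf(-z_crit - shift)


base = power_two_sided_z(mu0=0, mu1=1, sigma=2, n=16)
more_n = power_two_sided_z(mu0=0, mu1=1, sigma=2, n=64)
bigger_diff = power_two_sided_z(mu0=0, mu1=2, sigma=2, n=16)
more_sigma = power_two_sided_z(mu0=0, mu1=1, sigma=4, n=16)
smaller_alpha = power_two_sided_z(mu0=0, mu1=1, sigma=2, n=16, alpha=0.01)
print(base, more_n, bigger_diff, more_sigma, smaller_alpha)
```

Changing one input at a time shows the direction of each effect: a larger n or a larger difference raises the power, while a larger σ or a smaller α lowers it.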

80
Hypothesis Test for One Sample
• Test for single mean
• Test for single proportion

Hypothesis Test for Two Samples


• Test for the difference between two population means
• Test for the difference between two population proportions

81
1. Hypothesis Testing of a Single Mean
(Normally Distributed)

82
1.1 Known Variance

83
Example: Two-Tailed Test
1. A simple random sample of 10 people from a certain
population has a mean age of 27. Can we conclude that the
mean age of the population is not 30? The variance is known
to be 20. Let CL = .95.
A. Data
n = 10, sample mean = 27, σ² = 20, α = 0.05
B. Assumptions
• Simple random sample
• Normally distributed population

84
C. Hypotheses
Ho: µ = 30
HA: µ ≠ 30
D. Test statistic
As the population variance is known, we use Z as the
test statistic.

85
E. Decision Rule
• Reject Ho if the Z value falls in the rejection region.
• Don’t reject Ho if the Z value falls in the non-rejection region.
• Because of the structure of Ho it is a two tail test. Therefore,
reject Ho if Z ≤ -1.96 or Z ≥ 1.96.

86
F. Calculation of test statistic
$$Z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{27 - 30}{\sqrt{20}/\sqrt{10}} = \frac{-3}{1.414} = -2.12$$

G. Statistical decision
We reject Ho because Z = -2.12 is in the rejection region. The
value is significant at α = 5%.
H. Conclusion
We conclude that µ is not 30. P-value =
2(1 - 0.9830) = 2(0.0170) = 0.0340

A Z value of -2.12 corresponds to a tail area of 0.0170. Since there are two
parts to the rejection region in a two-tailed test, the P-value is twice this,
which is 0.0340.
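The test statistic and two-tailed P-value for this example can be verified with a brief sketch. It uses only the data stated in the example (n = 10, x̄ = 27, σ² = 20, µ0 = 30); the function name `one_sample_z_test` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu0, variance, n):
    """Two-tailed Z test for a single mean with known population variance."""
    z = (sample_mean - mu0) / sqrt(variance / n)
    # Two-tailed P-value: twice the area beyond |z|
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


z, p_value = one_sample_z_test(sample_mean=27, mu0=30, variance=20, n=10)
print(f"Z = {z:.2f}, P-value = {p_value:.4f}")
```

The computed P-value differs from the table-based 0.0340 only in the fourth decimal place, because the table rounds Z to two decimals.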
87
1.2 Unknown Variance
• In most practical applications the standard deviation of the
underlying population is not known
• In this case, σ can be estimated by the sample standard deviation s.
• If the underlying population is normally distributed, then the test
statistic is:
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$

88
Example: Two-Tailed Test
• A simple random sample of 14 people from a certain
population gives a sample mean body mass index (BMI) of 30.5
and sd of 10.64. Can we conclude that the BMI is not 35 at α
5%?
• Ho: µ = 35, HA: µ ≠35
• Test statistic
$$t = \frac{30.5 - 35}{10.64/\sqrt{14}} = \frac{-4.5}{2.84} = -1.58$$

• If the assumptions are correct and Ho is true, the test statistic


follows Student's t distribution with 13 degrees of freedom.

89
• Decision rule
– We have a two-tailed test. With α = 0.05, each tail has an area of
0.025. The critical t values with 13 df are -2.1604 and 2.1604.
– We reject Ho if t ≤ -2.160 or t ≥ 2.160.

• Do not reject Ho because -1.58 is not in the rejection
region. Based on the data of the sample, it is possible
that µ = 35. The two-tailed p-value lies b/n 0.10 and 0.20
(the one-tailed area lies b/n 0.05 and 0.10).
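A minimal sketch of this decision, assuming the one-sample statistic t = (x̄ - µ0)/(s/√n). The critical value 2.160 is the table value quoted in the example for 13 df, supplied as an input rather than computed; `one_sample_t` is a hypothetical name.

```python
from math import sqrt


def one_sample_t(sample_mean, mu0, sd, n):
    """t statistic for a single mean when the population SD is unknown."""
    return (sample_mean - mu0) / (sd / sqrt(n))


t = one_sample_t(sample_mean=30.5, mu0=35, sd=10.64, n=14)
t_crit = 2.160  # from the t table, 13 df, alpha/2 = 0.025
reject = abs(t) >= t_crit  # two-tailed decision rule
print(f"t = {t:.2f}, reject Ho: {reject}")
```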

90
2. Hypothesis Testing about the Difference Between
Two Population Means
Independent Samples (Normally Distributed)
Two Sample Means,

91
2.1 Known Variances (Independent Samples)
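• When two independent samples are drawn from
normally distributed populations with known variances,
the test statistic for testing the Ho of equal population
means is:
$$Z = \frac{(\bar{x}_1-\bar{x}_2) - (\mu_1-\mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}}}$$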

92
Example:
• The mean SUA levels of 12 individuals with Down’s
syndrome and 15 normal individuals are 4.5 and 3.4 mg/100
ml, respectively, with variances σ1² = 1 and σ2² = 1.5, respectively. Is
there a difference between the means of the two groups at α =
5%?
• Hypotheses:
Ho: µ1- µ2 = 0 or Ho: µ1 = µ2
HA: µ1 - µ2 ≠ 0 or HA: µ1 ≠ µ2

93
• With α = 0.05, the critical values of Z are -1.96 and +1.96.
We reject Ho if Z < -1.96 or Z > +1.96.

• Test statistic: Z = (4.5 - 3.4)/√(1/12 + 1.5/15) = 1.1/0.428 = 2.57

• Reject Ho because 2.57 > 1.96.

• From these data, it can be concluded that the population
means are not equal. A 95% CI would give the same
conclusion. P-value = 2(1 - 0.9949) = 2(0.0051) = 0.01.
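The Z value of 2.57 and the P-value can be reproduced as follows, assuming the known-variance test statistic for two independent samples with a hypothesized difference of zero. The function name `two_sample_z_test` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(mean1, var1, n1, mean2, var2, n2):
    """Two-tailed Z test of Ho: mu1 = mu2 with known population variances."""
    se = sqrt(var1 / n1 + var2 / n2)
    z = (mean1 - mean2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# SUA example: means 4.5 and 3.4, variances 1 and 1.5, n = 12 and 15
z, p_value = two_sample_z_test(4.5, 1.0, 12, 3.4, 1.5, 15)
print(f"Z = {z:.2f}, P-value = {p_value:.4f}")
```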

94
2.2 Unknown Variances
i. Equal variances (Independent samples)
• With equal population variances, we can obtain a pooled
value from the sample variances.
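• The test statistic for µ1 - µ2 is:
$$t = \frac{(\bar{x}_1-\bar{x}_2) - (\mu_1-\mu_2)}{s_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}}$$
• Where t has (n1 + n2 – 2) df, and sp is the pooled standard deviation
obtained from the two sample variances.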

95
Example:
• We wish to know if we may conclude, at the 95%
confidence level, that smokers, in general, have greater
lung damage than do non-smokers.

• Calculation of Pooled
Variance

96
Example:
• Hypotheses:
Ho: µ1 ≤ µ2, HA: µ1 > µ2
• With α = 0.05 and df = 23, the critical value of t is 1.714. We reject Ho if
t > 1.714.
• Test statistic

• Reject Ho because 2.6563 > 1.714. On the basis of the data,
we conclude that µ1 > µ2. The p-value is b/n 0.005 and 0.01,
which is < 0.05.

97
ii. Unequal variances (Independent samples)

• We are still interested in testing
H0: μ1 = μ2 vs HA: μ1 ≠ μ2
• The test statistic used is:
$$t = \frac{(\bar{x}_1-\bar{x}_2) - (\mu_1-\mu_2)}{\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}}$$

• To compute the test statistic, we simply substitute s1² for
σ1² and s2² for σ2².

98
Hypothesis Testing for Paired Samples
• Two samples are paired when each data point of the first sample
is matched and is related to a unique data point of the second
sample.
• Tests means of 2 related populations
– Paired or matched samples
– Repeated measures (before/after)
• Longitudinal or follow-up study
• Assumptions:
– Both populations are normally distributed
– Or, if not normal, use large samples

99
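The Paired t Test
$$t = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}}$$

d̄ is the mean of the paired differences
n is the number of pairs in the paired sample
Sd = sample standard deviation of the differences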

100
101
102
Example:
• The following data show the SBP levels (mm Hg) in 10 women
while not using (baseline) and while using (follow-up) oral
contraceptives. Can we conclude that there is a difference
between mean baseline and follow-up SBP at α = 5%? di =
follow-up – baseline

i SBP (baseline) SBP (follow-up) di


1 115 128 13
2 112 115 3
3 107 106 -1
4 119 128 9
5 115 122 7
6 138 145 7
7 126 132 6
8 105 109 4
9 104 102 -2
10 115 117 2

103
Example…
d̄ = (13 + 3 + … + 2)/10 = 4.80
S²d = [(13-4.8)² + … + (2-4.8)²]/9 = 20.844
Sd = √20.844 = 4.566
t = 4.80/(4.566/√10) = 4.80/1.44 = 3.32
• From the table, t9,α/2 = 2.262
• Since t (= 3.32) > t9,α/2 (= 2.262), Ho is rejected
• P-value < 0.01 (two-tailed).
• Since 3.32 falls in the rejection region, there is a
significant difference between the population mean
SBP while not using and while using OC.
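The paired calculation can be checked directly from the ten pairs in the table. This sketch takes each difference as follow-up minus baseline, which matches the tabulated di values; the function name `paired_t` is illustrative.

```python
from math import sqrt
from statistics import mean, stdev

baseline = [115, 112, 107, 119, 115, 138, 126, 105, 104, 115]
follow_up = [128, 115, 106, 128, 122, 145, 132, 109, 102, 117]


def paired_t(before, after):
    """Mean difference, SD of differences and paired t statistic (Ho: mu_d = 0)."""
    diffs = [a - b for b, a in zip(before, after)]
    d_bar = mean(diffs)
    s_d = stdev(diffs)  # sample SD, divisor n - 1
    t = d_bar / (s_d / sqrt(len(diffs)))
    return d_bar, s_d, t


d_bar, s_d, t = paired_t(baseline, follow_up)
print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.3f}, t = {t:.2f}")
```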
104
Hypothesis Tests for Proportions
• Involves categorical values
• Two possible outcomes
– “Success” (possesses a certain
characteristic)
– “Failure” (does not possess that
characteristic)
• Fraction or proportion of population in the
“success” category is denoted by p

105
Proportions

106
3. Hypothesis Testing about a Single Population Proportion
(Normal Approximation to Binomial Distribution)

107
Example
• In the general population of 0 to 4-year-olds, the
annual incidence of asthma is 1.4%. If 10 cases of
asthma are observed over a single year in a sample
of 500 children whose mothers smoke, can we
conclude that this is different from the underlying
probability of p0 = 0.014? cl = 95%

H0 : p = 0.014
HA: p ≠ 0.014

108
• The test statistic is given by:
$$Z = \frac{p - p_0}{\sqrt{p_0(1-p_0)/n}} = \frac{0.02 - 0.014}{\sqrt{0.014(0.986)/500}} = 1.14$$

109
Example…
• The critical value of Zα/2 at α = 5% is ±1.96.
• Don’t reject Ho since Z (= 1.14) is in the non-rejection region
between ±1.96.
• P-value = 2(1 - 0.8729) = 0.2542
• We do not have sufficient evidence to conclude that the
probability of developing asthma for children whose mothers
smoke in the home is different from the probability in the
general population
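A short sketch of the asthma example, assuming the normal-approximation statistic Z = (p - p0)/√(p0(1-p0)/n) with the standard error computed under Ho. The function name `one_proportion_z_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0):
    """Two-tailed Z test of Ho: p = p0 (normal approximation to the binomial)."""
    p_hat = successes / n
    se = sqrt(p0 * (1 - p0) / n)  # standard error under the null hypothesis
    z = (p_hat - p0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# 10 asthma cases among 500 children; population incidence 1.4%
z, p_value = one_proportion_z_test(successes=10, n=500, p0=0.014)
print(f"Z = {z:.2f}, P-value = {p_value:.3f}")
```

The P-value from this calculation is about 0.25, in line with the table-based value above; the small discrepancy comes from rounding Z to two decimals before using the table.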

110
4. Hypothesis Tests about the Difference Between
Two Population Proportions

111
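$$Z = \frac{p_1 - p_2}{\sqrt{p(1-p)\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}}, \qquad p = \frac{X_1 + X_2}{n_1 + n_2}$$
Where p1 = X1/n1, p2 = X2/n2, X1 = the observed number of events in the first sample
and X2 = the observed number of events in the second sample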

112
113
Example
• Among the 225 students who ate the sandwiches, 109
became ill, while among the 38 students who did not eat the
sandwiches, 4 became ill. Is there a significant difference
between the two groups at α = 5%?
• We wish to test
Ho: p1 = p2 against the alternative
HA: p1 ≠ p2

114
115
• Assume that the sample sizes are large enough, and
the normal approximation to the binomial
distribution is valid.
• If the Ho is true, then p1 = p2 = p

116
Since the calculated Z is greater than the tabulated Z (1.96), we reject H0 at the 0.05 level.

The proportion of students who became ill differs in the


two groups; those who ate the prepared sandwiches were
more likely to develop gastroenteritis.
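The slides with the calculation are not reproduced in this text, so the sketch below recomputes the test from the stated counts. It assumes the usual pooled-proportion Z statistic, with p = (X1 + X2)/(n1 + n2) estimated under Ho; the function name `two_proportion_z_test` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-tailed Z test of Ho: p1 = p2 using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)  # common proportion if Ho is true
    se = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# 109 of 225 who ate the sandwiches became ill vs. 4 of 38 who did not
z, p_value = two_proportion_z_test(109, 225, 4, 38)
print(f"Z = {z:.2f}, P-value = {p_value:.5f}")
```

Under these assumptions Z is well above 1.96, which agrees with the decision to reject H0.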

117
Thank you ! !!

118
