
Inferential Hypothesis Testing

1. Hypotheses testing involves formulating and testing a null hypothesis (H0) and an alternative hypothesis (HA). The goal is to determine whether to reject or fail to reject H0 based on sample data. 2. The key steps are: 1) state H0 and HA, 2) collect sample data, 3) choose a test statistic, 4) determine a critical value based on the significance level, 5) calculate the test statistic, and 6) either reject or fail to reject H0 depending on where the test statistic falls relative to the critical value. 3. There are two types of errors - a Type I error of rejecting a true H0, and a Type II error of

Uploaded by

Abrham Belay

Hypotheses Testing
Ephrem Mannekulih (BSc, MSc)
Biostatistics and Health Informatics
Introduction
• Without objective verification, science could hardly exist
• Inferential statistics is the usual means of objectively verifying claims about a population
• There are two main methods of inferential statistics
1. Estimation
2. Hypotheses Testing
Introduction
• Estimation
o It uses a point estimate together with a confidence interval to reach a conclusion about a population parameter
• Hypotheses Testing
o It uses a test statistic to reach a conclusion about the same population parameter
Hypotheses
• A hypothesis is a statement, claim or assumption about one or more populations that may or may not be true
• It is frequently concerned with the parameters of the population about which the statement is made.
Cont.…
• Example:
o The average length of stay of patients admitted to the Adama hospital could be 5 days
o A particular educational program will result in improved communication between health care provider and patient
o A certain drug will be effective in 90% of the cases for which it is used
Aim of Hypotheses Testing
• To aid the clinician, researcher or administrator in reaching a conclusion concerning a population by examining a sample from that population.
• It provides an objective framework for making decisions using probabilistic methods
• Hypotheses are formulated, experiments are performed, and results are evaluated for their consistency or inconsistency with the hypotheses
Types of Hypotheses
1. The Null Hypotheses, 𝑯𝟎
• It is a statement claiming that there is no difference between the hypothesized value and the population value
o The effect of interest is zero
• It states the hypothesis to be tested
• It is what we assume to be true until the sample data provide sufficient evidence against it
• Similar to the notion of innocent until proven guilty


…..Types of Hypotheses
 It is always about a population parameter, not about a sample
statistic

 Always contains “=” , “ ≤” or “≥ ” sign

 May or may not be rejected


………..Types of Hypotheses
2. The Alternative Hypotheses, 𝑯𝑨
• It is a statement claiming that there is a difference between the hypothesized value and the population value
• It is a statement that disagrees with (opposes) 𝐻0
o (The effect of interest is not zero)
• It is a statement of what we will believe is true if our sample data cause us to reject 𝐻0.
……Types of hypotheses
• It is generally the hypothesis that is believed, or needs to be supported, by the researcher.

 What you hope to conclude should be placed in the 𝐻𝐴

 Never contains “=” , “ ≤” or “≥ ” sign

 May or may not be accepted

 The 𝐻0 and 𝐻𝐴 are complementary


Steps in Hypotheses Testing
Step-1. Formulate hypotheses
• State the null (𝐻0) and alternative (𝐻𝐴) hypotheses
• In general, a hypothesis test about the value of a population mean µ takes one of the following three forms
o 𝐻0: µ = 𝜇0 vs. 𝐻𝐴: µ ≠ 𝜇0 (two-tailed)
o 𝐻0: µ ≤ 𝜇0 vs. 𝐻𝐴: µ > 𝜇0 (one-tailed)
o 𝐻0: µ ≥ 𝜇0 vs. 𝐻𝐴: µ < 𝜇0 (one-tailed)
• Where 𝜇0 is the hypothesized value of the population mean


……Steps in Hypotheses Testing
Step-2. Data
 Select a sample and collect data
 Check and understand the nature of the data
o Categorical Vs. continuous
Step-3. Assumptions
• State the assumptions necessary for selecting and computing the test statistic
o Normality of the population distribution
o Whether the population variances are known or unknown, and whether they are equal
o Independence of samples
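…..Steps in Hypotheses Testing
Step-4. Choose the appropriate test statistic
• Select the appropriate test statistic based on the hypothesized value, the nature of the data and the assumptions
• Example;
o For testing hypotheses about one population mean: $Z = \dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$ (σ known) OR $t = \dfrac{\bar{X} - \mu_0}{S/\sqrt{n}}$ (σ unknown)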
…..Steps in Hypotheses Testing
Step-5. Decision Rule
• The decision rule tells us whether or not to reject the null hypothesis based on;
o The value of the test statistic computed from sample data
o The corresponding P-value of the test statistic
o The confidence interval for the population parameter


…..Steps in Hypotheses Testing
• The possible values of the test statistic are the points on the horizontal axis of the graph of its distribution
• The critical value(s) divide this graph into two regions:
o Rejection region and
o Non-rejection region
• The values of the test statistic forming the rejection region are less likely to occur if the 𝐻0 is true.
• The values of the test statistic forming the non-rejection (acceptance) region are more likely to occur if the 𝐻0 is true.
…..Steps in hypotheses Testing
• The decision as to which values go into the rejection region and which into the non-rejection region is made on the basis of the desired level of significance, designated by α
• Level of Significance, α
o It is the probability of rejecting a true 𝐻0, i.e. of committing a type I error
o The level of significance (α) specifies the area under the curve above the values on the horizontal axis constituting the rejection region
…..Steps in hypotheses Testing
 The choice of the level of significance depends on the
seriousness of the type I error

 More frequently used values of α are 0.01, 0.05 and 0.10

 α is selected by the researcher at the beginning


Example: Two-sided test at α = 5%
[Figure: standard normal curve. Rejection regions of area α/2 = 0.025 lie in each tail (Z ≤ -1.96 and Z ≥ 1.96); the non-rejection region of area 0.95 lies between -1.96 and 1.96.]
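The cut-off points in this example can be reproduced without a printed table. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `z_critical` is an illustrative assumption, not part of the original slides.

```python
from statistics import NormalDist

def z_critical(alpha: float, two_sided: bool = True) -> float:
    """Positive critical value of the standard normal distribution.

    For a two-sided test alpha is split equally between the two tails.
    """
    tail_area = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail_area)

# Critical values for the commonly used significance levels
for alpha in (0.01, 0.05, 0.10):
    print(alpha,
          round(z_critical(alpha), 3),                    # two-sided
          round(z_critical(alpha, two_sided=False), 3))   # one-sided
```

For α = 0.05 the two-sided value rounds to 1.96, matching the figure above.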
…..Steps in hypotheses Testing
Step-6. Determine the critical value.
• After a significance level is specified, a critical value is determined from a table for the appropriate test
• The critical value (C.V.) is the value of the test statistic that separates the rejection region from the non-rejection region
…..Steps in hypotheses Testing
• The critical value can be on the right side of the mean, on the left side of the mean, or on both sides
…..Steps in hypotheses Testing
Step-7. Compute the test statistic
• The test statistic is some statistic that may be computed from sample data
• The test statistic serves as a decision maker
• The decision to reject or not to reject the null hypothesis depends on the magnitude of the test statistic
• General formula to compute the test statistic; $\text{test statistic} = \dfrac{\text{relevant statistic} - \text{hypothesized parameter}}{\text{standard error of the relevant statistic}}$


…..Steps in hypotheses Testing
Step-8. Reach a decision
• Decision based on the value of the test statistic computed from sample data
o Reject 𝑯𝟎 if the value of the test statistic that we compute from our sample is one of the values in the rejection region
o Don’t reject 𝑯𝟎 if the computed value of the test statistic is one of the values in the non-rejection region.
…..Steps in hypotheses Testing
• Decision based on the corresponding P-value of the test statistic
• The P-value is the probability of obtaining a test statistic as extreme as, or more extreme than, the value actually computed, if the 𝐻0 is true
o Reject 𝐻0 (accept 𝐻𝐴) if P-value < α
o Do not reject 𝐻0 if P-value ≥ α
• The P-value is a number that tells us how unusual our sample results are, given that 𝐻0 is true
• The larger the absolute value of the test statistic, the smaller the P-value
• The smaller the P-value, the stronger the evidence against the 𝐻0
…..Steps in hypotheses Testing
Step-9. Draw the conclusion
o If 𝐻0 is rejected, we conclude that 𝐻𝐴 is true (or accepted).
o If 𝐻0 is not rejected, we conclude that 𝐻0 may be true.


One tail and two tail tests
 In a one tail test, the rejection region is at one end of the
distribution or the other.

 In a two tail test, the rejection region is split between the two
tails.

 Which one is used depends on the way the 𝐻𝐴 is written.


Types of Errors in hypotheses Tests
• Whenever we reject or fail to reject the 𝐻0, we may commit an error.
• Two types of errors can be committed.

o Type I Error

o Type II Error
Type I Error
• The error committed when a true 𝐻0 is rejected
o The probability of a type I error is the probability of rejecting the 𝐻0 when it is true
• Considered a serious type of error
• The probability of a type I error is α, called the level of significance of the test
• Set by the researcher in advance


Type II Error
• The type II error is the error committed when a false 𝐻0 is not rejected
• The probability of a Type II Error is β
• Usually unknown, but generally larger than α


Power
• The probability of rejecting the 𝐻0 when it is false.
• Power = 1 – β = 1 – probability of type II error
• We would like to maintain a low probability of a Type I error (α) and a low probability of a Type II error (β).
Cont.…
Action (Conclusion) | Reality: Ho True | Reality: Ho False
Do not reject Ho | Correct action (Prob. = 1-α) | Type II error (Prob. = β = 1-Power)
Reject Ho | Type I error (Prob. = α = Sign. level) | Correct action (Prob. = Power = 1-β)
Type I Vs. Type II Error
 Type I error (α) and Type II error (β) can not happen at the
same time

 Type I error can only occur when 𝐻0 is true

 Type II error can only occur when 𝐻0 is false


Factors affecting the power
• If α decreases, the power decreases
• When the difference between the value under 𝐻0 and the true value under 𝐻𝐴 increases, the power increases
• When σ increases, the power decreases
• If the sample size (n) increases, the power increases


Factors Affecting Type II Error
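Since β = 1 – power, the factors that raise the power are the same ones that lower the Type II error. The sketch below checks them numerically under stated assumptions: an upper-tailed one-sample Z test with known σ, for which power = 1 – Φ(z₁₋α – δ√n/σ), where δ is the true difference µ – µ0. The function name and the numbers are illustrative and do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist

def power_upper_tail_z(delta: float, sigma: float, n: int, alpha: float) -> float:
    """Power of an upper-tailed one-sample Z test (sigma known).

    delta is the true difference mu - mu0 that we want to detect.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)      # critical value
    shift = delta * sqrt(n) / sigma             # true difference in standard-error units
    return 1 - std_normal.cdf(z_crit - shift)   # P(reject H0 | HA true)

base = power_upper_tail_z(delta=2, sigma=5, n=25, alpha=0.05)
print("baseline power :", round(base, 3))
print("smaller alpha  :", round(power_upper_tail_z(2, 5, 25, 0.01), 3))
print("larger delta   :", round(power_upper_tail_z(3, 5, 25, 0.05), 3))
print("larger sigma   :", round(power_upper_tail_z(2, 8, 25, 0.05), 3))
print("larger n       :", round(power_upper_tail_z(2, 5, 50, 0.05), 3))
print("beta (baseline):", round(1 - base, 3))
```

Each printed line changes one factor at a time, so the direction of the effect on the power (and hence on β) can be read off directly.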
Hypotheses Testing for Mean & Proportion
• Hypotheses Test for One Population
o Test for single population mean
o Test for single population proportion
• Hypotheses Test for Two Populations
o Test for the difference between two population means
o Test for the difference between two population proportions
1. Hypotheses Testing of a Single Mean
A. Assumptions:
• The population 𝜎 is known
• The sample is randomly drawn from a normally distributed population
• The test statistic is: $Z = \dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$
B. Assumptions:
• The population 𝜎 is known
• The sample is randomly drawn from a population that is not normally distributed
• The sample is large (n ≥ 30)
• The test statistic is: $Z = \dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$
Example: Two-Tailed Test
• A simple random sample of 10 people drawn from a normally distributed population has a mean age of 27. Can we conclude that the mean age of the population is not 30? The population variance is known to be 20. Let α = 0.05.
Answer
Step-1. Hypotheses formulation

o 𝐻0 : µ = 30

o 𝐻𝐴 : µ ≠ 30

Step-2. Data: n = 10, sample mean $\bar{X}$ = 27, 𝜇0 = 30, 𝜎² = 20, α = 0.05

Step-3. Assumptions

o Simple random sample

o Normally distributed population


Cont.…
Step-4. Test statistic
o As the population variance is known, we use Z as the test statistic.
Cont.…
Step-5. Decision Rule
o Reject 𝐻0 if the Z value falls in the rejection region.
o Don’t reject 𝐻0 if the Z value falls in the non-rejection region.
o Because 𝐻𝐴 is of the form µ ≠ 30, it is a two tail test. Therefore, reject 𝐻0 if Z ≤ -1.96 or Z ≥ 1.96.
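Cont.…
Step-6. Compute test statistic
• $Z = \dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \dfrac{27 - 30}{\sqrt{20}/\sqrt{10}} = \dfrac{-3}{1.4142} = -2.12$
Step-7. Statistical decision
• We reject the 𝐻0 because Z = -2.12 is in the rejection region. The value is significant at α = 5%.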
Cont.…
Step-8. Conclusion
o We conclude that µ is not 30. P-value = 0.0340
o A Z value of -2.12 corresponds to a tail area of 0.0170.
o Since there are two parts to the rejection region in a two tail test, the P-value is twice this, which is 0.0340.
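The arithmetic of this worked example can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name `one_sample_z` is an illustrative assumption, and the inputs are the values given in the example.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z(xbar: float, mu0: float, sigma2: float, n: int):
    """Z statistic and two-sided P-value for a single mean (variance known)."""
    z = (xbar - mu0) / sqrt(sigma2 / n)           # (x-bar - mu0) / (sigma / sqrt(n))
    p_two_sided = 2 * NormalDist().cdf(-abs(z))   # twice the single tail area
    return z, p_two_sided

z, p = one_sample_z(xbar=27, mu0=30, sigma2=20, n=10)
print(f"Z = {z:.2f}, two-sided P-value = {p:.4f}")
print("Reject H0" if abs(z) >= 1.96 else "Do not reject H0")
```

The unrounded P-value is about 0.0339; the slide's 0.0340 comes from doubling the table area 0.0170 for Z = -2.12.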
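Hypotheses test using confidence interval
• A problem like the above example can also be solved using a confidence interval.
• The confidence interval shows whether the hypothesized value 𝜇0 falls within the boundaries of the interval; if it does not, 𝐻0 is rejected. However, it will not give a probability (P-value).
• Confidence interval: $\bar{X} \pm z_{1-\alpha/2}\,\sigma/\sqrt{n} = 27 \pm 1.96(1.4142) = (24.23,\ 29.77)$; since 30 lies outside this interval, 𝐻0 is rejected at α = 0.05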
Example: One-Tailed Test
• A simple random sample of 10 people drawn from a normally distributed population has a mean age of 27. Can we conclude that the mean age of the population is less than 30? The variance is known to be 20. Let α = 0.05.
Answer
Step-1. Hypotheses Formulation

o 𝐻0 : µ ≥ 30,

o 𝐻𝐴 : µ < 30

Step-2.Data

o n = 10, sample mean = 27, 𝜎 2 = 20, α = 0.05

Step-3. Assumptions

o Simple random sample

o Normally distributed population


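Cont.…
Step-4. Test statistic
• $Z = \dfrac{27 - 30}{\sqrt{20}/\sqrt{10}} = \dfrac{-3}{1.4142} = -2.12$
• Rejection region (lower-tail test): with α = 0.05, reject 𝐻0 if Z ≤ -1.645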
Cont.….
Step-5. Statistical decision

o We reject the Ho because -2.12 < -1.645.

Step-6. Conclusion

o We conclude that µ < 30.

o p = .0170 this time because it is only a one tail test and not
a two tail test.
Cont.….
• Suppose that the 𝐻0 and 𝐻𝐴 take the form
o 𝐻0 : µ = 𝜇0 , 𝐻𝐴 : µ > 𝜇0
• In this case, 𝐻0 would be rejected for large values of the test statistic (critical value > 0)
• The P-value would correspond to the area in the upper tail of the standard normal distribution (SND), to the right of the value of the test statistic.
[Figure: upper-tail test]


……..Hypotheses Testing of a Single Mean
C. Assumptions:
• The population 𝜎 is unknown
• The sample is randomly drawn from a normally distributed population
• The test statistic is: $t_{n-1} = \dfrac{\bar{X} - \mu_0}{S/\sqrt{n}}$
D. Assumptions:
• The population 𝜎 is unknown
• The sample is randomly drawn from a population that is not normally distributed
• The sample is large (n ≥ 30)
• The test statistic is: $t_{n-1} = \dfrac{\bar{X} - \mu_0}{S/\sqrt{n}}$
Example: Two-Tailed Test
 A simple random sample of 14 people from a certain
population gives a sample mean body mass index (BMI) of
30.5 and sd of 10.64. Can we conclude that the BMI is not 35
at α = 5%?
Answer
Step-1. Formulating hypotheses

o 𝐻0 : µ = 35, 𝐻𝐴 : µ ≠ 35

 Step-2.Understanding Data

o Continuous

 Step-3. Checking the assumptions

o Population is normally distributed

o Variance unknown

o Sample is small
Cont.…
Step-4. Test statistic
• If the assumptions are correct and 𝐻0 is true, the test statistic follows Student's t distribution with 13 degrees of freedom.
• $t_{n-1} = \dfrac{\bar{X} - \mu_0}{S/\sqrt{n}}$
Cont.…
• Step-5. Decision rule
o We have a two tailed test. With α = 0.05 it means that each tail is 0.025. The critical t values with 13 df are -2.1604 and 2.1604.
• Step-6. Determine critical value
o We reject Ho if t ≤ -2.1604 or t ≥ 2.1604.


Cont.…
• Step-7. Compute test statistics
o $t = \dfrac{30.5 - 35}{10.64/\sqrt{14}} = \dfrac{-4.5}{2.8437} = -1.58$
• Step-8. Decision
o Do not reject Ho because -1.58 is not in the rejection region.
Cont.…
• Step-9. Conclusion
o Based on the data from the sample, it is possible that µ = 35. P-value = 0.1375
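The t statistic and its P-value for this example can be reproduced as follows. The sketch stays within the standard library, so the two-sided P-value is obtained by numerically integrating the Student's t density with Simpson's rule; the helper names `t_pdf` and `t_two_sided_p` are illustrative assumptions, and a statistics package would normally be used instead.

```python
from math import gamma, pi, sqrt

def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_two_sided_p(t: float, df: int, steps: int = 2000) -> float:
    """Two-sided P-value: 2 * (0.5 - area between 0 and |t|), via Simpson's rule."""
    b = abs(t)
    if b == 0:
        return 1.0
    h = b / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 2 * (0.5 - area)

n, xbar, s, mu0 = 14, 30.5, 10.64, 35   # values from the BMI example
t = (xbar - mu0) / (s / sqrt(n))
p = t_two_sided_p(t, df=n - 1)
print(f"t = {t:.2f} with {n - 1} df, two-sided P-value = {p:.4f}")
print("Reject H0" if abs(t) >= 2.1604 else "Do not reject H0")
```

The critical value 2.1604 is the tabulated value quoted in the example for 13 df.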
2. Hypotheses Testing for the Difference Between
Two Population Means
• One of the most commonly employed hypothesis tests
• Aimed at determining whether or not it is reasonable to conclude that the two population means are unequal
• The hypotheses could be formulated in one of the following forms;
o 𝐻0: 𝜇1 − 𝜇2 = 0 vs. 𝐻𝐴: 𝜇1 − 𝜇2 ≠ 0
o 𝐻0: 𝜇1 − 𝜇2 ≥ 0 vs. 𝐻𝐴: 𝜇1 − 𝜇2 < 0
o 𝐻0: 𝜇1 − 𝜇2 ≤ 0 vs. 𝐻𝐴: 𝜇1 − 𝜇2 > 0
Cont...
• When studying one-sample tests for a continuous random variable, the unknown mean μ of a single population was compared to some known value 𝜇0.
• We are usually interested in comparing the means of two different populations when the values of both means are unknown
Cont.…
• Hypothesis testing for the difference between two population means is carried out under one of three basic situations
1. When sampling is from normally distributed populations with known population variances
2. When sampling is from normally distributed populations with unknown population variances, and
3. When sampling is from populations that are not normally distributed.
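2.1 Known Variances (Independent Samples)
A. Assumptions
• Two independent samples are drawn randomly from normally distributed populations with known variances
• The test statistic is: $Z = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$
B. Assumptions
• Two independent samples are drawn randomly from populations that are not normally distributed, with known variances, and both samples are large (𝑛1 & 𝑛2 ≥ 30)
• The test statistic is: $Z = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$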
Example:
• Researchers wish to know whether there is a difference in mean serum uric acid (SUA) levels between normal individuals and individuals with Down’s syndrome. The mean SUA levels of 12 individuals with Down’s syndrome and 15 normal individuals, randomly drawn from normally distributed populations, are 4.5 and 3.4 mg/100 ml, respectively, with variances 𝜎1² = 1 and 𝜎2² = 1.5, respectively. Is there a difference between the means of the two groups at α = 5%?
Answer:
Step-1. Formulating Hypotheses:

o 𝐻0 : 𝜇1 - 𝜇2 = 0 or 𝐻0 : 𝜇1 = 𝜇2

o 𝐻𝐴 : 𝜇1 - 𝜇2 ≠ 0 or 𝐻𝐴 : 𝜇1 ≠ 𝜇2

 Step.2. Data

o Quantitative

 Step.3. Assumptions

o Two independent samples

o Normally distributed population

o Known variance
Cont.…
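Step-4. Select the test statistics:
• $Z = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$
Step-5. Decision rule
• 𝛼 = 0.05, reject 𝐻0 if |z calculated| ≥ z critical or P-value < 𝛼
Step-6. Determine Critical Value
• At 𝛼 = 0.05 the critical values of z are ±1.96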


Cont.…
Step-7. Compute test statistics
• $Z = \dfrac{(4.5 - 3.4) - 0}{\sqrt{1/12 + 1.5/15}} = \dfrac{1.1}{0.4282} = 2.57$
Cont.…
Step-8. Decision
• Reject 𝐻0 because 2.57 > 1.96.
Step-9. Conclusion
• From these data, it can be concluded that the population means are not equal. A 95% CI would give the same conclusion. P-value = 0.01.
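As a check on the serum uric acid example, the sketch below recomputes the two-sample Z statistic and its two-sided P-value with the standard library. The function name `two_sample_z` is an illustrative assumption; the inputs are the values given in the example.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(x1, var1, n1, x2, var2, n2, diff0=0.0):
    """Z statistic and two-sided P-value for two independent means (variances known)."""
    se = sqrt(var1 / n1 + var2 / n2)        # standard error of x1 - x2
    z = ((x1 - x2) - diff0) / se
    p = 2 * NormalDist().cdf(-abs(z))
    return z, p

# Down's syndrome group: mean 4.5, variance 1, n = 12
# Normal group:          mean 3.4, variance 1.5, n = 15
z, p = two_sample_z(4.5, 1.0, 12, 3.4, 1.5, 15)
print(f"Z = {z:.2f}, two-sided P-value = {p:.4f}")
print("Reject H0" if abs(z) >= 1.96 else "Do not reject H0")
```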
2.2 Unknown Variances (Independent Samples)

 When the population variances are unknown, two


possibilities exist.

1. The two population variances may be equal; or

2. They may be unequal.


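…..Unknown Variances (Independent Samples)
1. Equal variances
• With equal population variances, we can obtain a pooled variance value from the sample variances using the formula; $S_p^2 = \dfrac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$
…..Unknown Variances (Independent Samples)
• The test statistic for 𝜇1 − 𝜇2 is: $t = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_0}{\sqrt{S_p^2/n_1 + S_p^2/n_2}}$
• Where t has (𝑛1 + 𝑛2 – 2) df.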


Example:
• We wish to know if we may conclude, at the 95% confidence level, that smokers, in general, have greater lung damage than non-smokers. Assume that the two population variances are equal
o Givens;
• Smokers: $\bar{X}_1$ = 17.5, 𝑛1 = 16, 𝑆1 = 4.4711
• Non-Smokers: $\bar{X}_2$ = 12.4, 𝑛2 = 9, 𝑆2 = 4.8492
• 𝛼 = 0.05
Answer
Step-1. Formulate Hypotheses:
• 𝐻0 : 𝜇1 ≤ 𝜇2 (𝜇1 − 𝜇2 ≤ 0), 𝐻𝐴 : 𝜇1 > 𝜇2

Step-2. Data

 Quantitative

Step-3. Assumptions

 Both populations are approximately normally distributed

 The population variances are unknown but are assumed to be


equal
Cont.…
Step-4. Select the test statistics
• $t = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_0}{\sqrt{S_p^2/n_1 + S_p^2/n_2}}$
Step-5. Decision Rule
• With α = 0.05 and df = 23, we reject 𝐻0 if t > 1.7139 or P-value < 0.05
Step-6. Determine critical value
• At α = 0.05 the critical value of t is 1.7139.
• Reject Ho because 2.6573 > 1.7139. On the basis of the data, we conclude that µ1 > µ2.
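Cont.…
Step-7. Compute the test statistics
Calculate the pooled variance
• $S_p^2 = \dfrac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2} = \dfrac{15(4.4711)^2 + 8(4.8492)^2}{16 + 9 - 2}$
• $S_p^2 = \dfrac{299.86 + 188.12}{23} = 21.2165$
• $t = \dfrac{17.5 - 12.4}{\sqrt{21.2165/16 + 21.2165/9}} = \dfrac{5.1}{1.9192} = 2.6573$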
Cont.…
Step-8. Decision
• Since t = 2.6573 > 1.7139 we reject 𝐻0
Step-9. Conclusion
• From these data, it can be concluded that the mean for smokers is greater than the mean for non-smokers (µ1 > µ2), i.e. smokers have greater lung damage. A 95% CI would give the same conclusion. P-value < 0.01
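The pooled-variance calculation for the lung damage example can be sketched in Python as below. The function name `pooled_t` is an illustrative assumption; the standard deviation 4.4711 for smokers is the value used in the slide's own pooled-variance computation, and 1.7139 is the tabulated one-sided critical value quoted for 23 df.

```python
from math import sqrt

def pooled_t(x1, s1, n1, x2, s2, n2):
    """Pooled variance, t statistic and degrees of freedom (equal variances assumed)."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df   # pooled variance
    se = sqrt(sp2 / n1 + sp2 / n2)                          # standard error of x1 - x2
    return sp2, (x1 - x2) / se, df

sp2, t, df = pooled_t(17.5, 4.4711, 16, 12.4, 4.8492, 9)
print(f"pooled variance = {sp2:.4f}, t = {t:.4f}, df = {df}")
print("Reject H0" if t > 1.7139 else "Do not reject H0")   # upper-tailed test
```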
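Cont.…
2. Unequal variances
• The test statistic used is: $t' = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_0}{\sqrt{S_1^2/n_1 + S_2^2/n_2}}$
• The critical value of 𝑡′ for an 𝛼 level of significance (two-sided test) is approximately $t'_{1-\alpha/2} = \dfrac{w_1 t_1 + w_2 t_2}{w_1 + w_2}$
Cont.…
• Where $w_1 = S_1^2/n_1$, $w_2 = S_2^2/n_2$, $t_1 = t_{1-\alpha/2}$ with $n_1 - 1$ df and $t_2 = t_{1-\alpha/2}$ with $n_2 - 1$ df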
Example
 In the 15 patients with hypertension (group 1), the mean
aortic stiffness index was 19.16 with a standard deviation of
5.29. In the 30 control subjects (group 2), the mean aortic
stiffness index was 9.53 with a standard deviation of 2.69.
We wish to determine if the two populations represented by
these samples differ with respect to mean aortic stiffness
index.
Answer
Step-1. Formulate Hypotheses:
• 𝐻0 : 𝜇1 − 𝜇2 = 0, 𝐻𝐴 : 𝜇1 − 𝜇2 ≠ 0 (the question asks whether the two populations differ, so the test is two-sided)

Step-2. Data

 Quantitative

Step-3. Assumptions

 Both populations are approximately normally distributed

 The population variances are unknown and are not equal


Cont.…
Step-4. Select the test statistics
• $t' = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_0}{\sqrt{S_1^2/n_1 + S_2^2/n_2}}$
Step-5. Decision Rule
• With α = 0.05, we reject 𝐻0 if |t′ calculated| ≥ t′ critical or P-value < 0.05
Cont.…
Step-6. Determine critical value
• At α = 0.05 the critical value of t′ is computed as;
o $w_1 = (5.29)^2/15 = 1.8656$ and $w_2 = (2.69)^2/30 = 0.2412$
o From the table, $t_1 = 2.1448$ (14 df) and $t_2 = 2.0452$ (29 df)
• The critical value (𝑡′) is calculated as; $t' = \dfrac{1.8656(2.1448) + 0.2412(2.0452)}{1.8656 + 0.2412} = 2.133$

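Cont.…
Step-7. Compute the test statistics
• $t' = \dfrac{(19.16 - 9.53) - 0}{\sqrt{(5.29)^2/15 + (2.69)^2/30}} = \dfrac{9.63}{1.4515} = 6.63$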
Cont.…
Step-8. Decision
• Since 𝑡′ = 6.63 > 2.133 we reject 𝐻0
Step-9. Conclusion
• From these data, it can be concluded that the population means are not equal. A 95% CI would give the same conclusion. P-value < 0.01
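The unequal-variance calculation for the aortic stiffness example can be sketched as follows. The function names are illustrative assumptions; the tabulated values 2.1448 (14 df) and 2.0452 (29 df) are taken from the example rather than computed, and the weighted critical value is the approximation used in the slides.

```python
from math import sqrt

def unequal_var_t(x1, s1, n1, x2, s2, n2):
    """t' statistic for two independent means with unequal variances, plus the weights."""
    w1, w2 = s1 ** 2 / n1, s2 ** 2 / n2      # w_i = S_i^2 / n_i
    return (x1 - x2) / sqrt(w1 + w2), w1, w2

def weighted_critical_t(w1, w2, t1, t2):
    """Approximate critical value: weighted average of the two tabulated t values."""
    return (w1 * t1 + w2 * t2) / (w1 + w2)

t_prime, w1, w2 = unequal_var_t(19.16, 5.29, 15, 9.53, 2.69, 30)
t_crit = weighted_critical_t(w1, w2, t1=2.1448, t2=2.0452)
print(f"w1 = {w1:.4f}, w2 = {w2:.4f}")
print(f"t' = {t_prime:.2f}, critical t' = {t_crit:.3f}")
print("Reject H0" if abs(t_prime) >= t_crit else "Do not reject H0")
```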
2.3. Hypotheses Testing for Paired Samples
 Two samples are paired when each data point of the first
sample is matched and is related to a unique data point of the
second sample.

 Tests means of 2 related populations

o Paired or matched samples

o Repeated measures (before/after)

• Longitudinal or follow-up study


……Hypotheses Testing for Paired Samples
• Assumptions:
o Both populations are normally distributed, if not use large samples (n ≥ 30)
• The test statistic for $\bar{d}$ is; $t = \dfrac{\bar{d} - \mu_{d_0}}{S_d/\sqrt{n}}$, with n − 1 df
……Hypotheses Testing for Paired Samples
• n is the number of pairs in the paired sample and $S_d$ is the sample standard deviation of the paired differences
• $d_i$ is the $i$th paired difference, computed as; $d_i = x_{1i} - x_{2i}$
• The point estimate for the population mean paired difference is $\bar{d}$, computed as; $\bar{d} = \sum d_i / n$
……Hypotheses Testing for Paired Samples
Example:
• The following data show the SBP levels (mm Hg) in 10 women while not using (baseline) and while using (follow-up) oral contraceptives. Can we conclude that there is a difference between mean baseline and follow-up SBP at α = 5%? di = follow-up – baseline
i | SBP (baseline) | SBP (follow-up) | di
1 | 115 | 128 | 13
2 | 112 | 115 | 3
3 | 107 | 106 | -1
4 | 119 | 128 | 9
5 | 115 | 122 | 7
6 | 138 | 145 | 7
7 | 126 | 132 | 6
8 | 105 | 109 | 4
9 | 104 | 102 | -2
10 | 115 | 117 | 2
Answer
Step-1. Formulate Hypotheses:
• 𝐻0 : 𝜇𝑑 = 0, 𝐻𝐴 : 𝜇𝑑 ≠ 0

Step-2. Data

 Quantitative

Step-3. Assumptions

 Paired sample and the paired differences are approximately


normally distributed
Cont.….
Step-4. Select the test statistics
• $t = \dfrac{\bar{d} - \mu_{d_0}}{S_d/\sqrt{n}}$
Step-5. Decision Rule
• With α = 0.05, we reject 𝐻0 if |t calculated| ≥ t critical or P-value < 0.05
Step-6. Determine critical value
• At α = 0.05 the critical value from the table is $t_{n-1} = t_9 = 2.262$


Cont.….
Step-7. Compute the test statistics
o $\bar{d} = (13 + 3 + \dots + 2)/10 = 4.80$
o $S_d^2 = [(13 - 4.8)^2 + \dots + (2 - 4.8)^2]/9 = 20.844$
o $S_d = \sqrt{20.844} = 4.566$
o $t = \dfrac{4.80}{4.566/\sqrt{10}} = 4.80/1.44 = 3.32$
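The paired computation above can be reproduced directly from the data table. This is a minimal sketch using the standard-library `statistics` module; the variable names are illustrative, and the differences are taken as follow-up minus baseline, which is what the tabulated di values correspond to.

```python
from math import sqrt
from statistics import mean, stdev

baseline = [115, 112, 107, 119, 115, 138, 126, 105, 104, 115]
followup = [128, 115, 106, 128, 122, 145, 132, 109, 102, 117]

# Paired differences d_i = follow-up - baseline
d = [f - b for b, f in zip(baseline, followup)]

n = len(d)
d_bar = mean(d)                  # point estimate of the mean difference
s_d = stdev(d)                   # sample standard deviation (divisor n - 1)
t = d_bar / (s_d / sqrt(n))      # test statistic under H0: mu_d = 0

print("differences:", d)
print(f"d-bar = {d_bar:.2f}, S_d = {s_d:.3f}, t = {t:.2f} with {n - 1} df")
print("Reject H0" if abs(t) >= 2.262 else "Do not reject H0")
```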
Cont.….
Step-8. Decision
• Since t = 3.32 > 2.262 and P-value < 0.01, 𝐻0 is rejected
Step-9. Conclusion
• There is a significant difference between the population mean SBP while not using and while using OC
Hypotheses Testing for Proportion
• Involves categorical values
• Two possible outcomes
o Success (possesses a certain characteristic)
o Failure (does not possess that characteristic)
• Population proportion is denoted by p and sample proportion is denoted by $\hat{p}$
o $\hat{p} = \dfrac{X}{n} = \dfrac{\text{number of successes in the sample}}{\text{total sample size}}$
Hypotheses Testing for Proportion
Assumptions
• The sample is randomly drawn from the population
• The conditions for the binomial distribution are satisfied.
• When np ≥ 5 and nq ≥ 5 (where q = 1 − p), the sampling distribution of $\hat{p}$ can be approximated by a normal distribution
o Hypotheses testing for a single population proportion
o Hypotheses testing for the difference between two population proportions
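3. Hypotheses Testing for Single Proportions
• When np ≥ 5 and nq ≥ 5, the sampling distribution of $\hat{p}$ can be approximated by a normal distribution with;
• Mean: $\mu_{\hat{p}} = p$
• Standard deviation: $\sigma_{\hat{p}} = \sqrt{pq/n}$
Cont.….
• The test statistic; $Z = \dfrac{\hat{p} - p_0}{\sqrt{p_0 q_0/n}}$, where $p_0$ is the hypothesized proportion and $q_0 = 1 - p_0$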
Example
 Data were collected on a sample of 301 Hispanic women living in San
Antonio, Texas. One variable of interest was the percentage of subjects
with impaired fasting glucose (IFG). IFG refers to a metabolic stage
intermediate between normal glucose homeostasis and diabetes. In the
study, 24 women were classified in the IFG stage. The article cites
population estimates for IFG among Hispanic women in Texas as 6.3
percent. Is there sufficient evidence to indicate that the population of
Hispanic women in San Antonio has a prevalence of IFG higher than 6.3
percent?
Answer
Step-1. Formulating hypotheses

 𝐻0 : p ≤ 0.063, 𝐻𝐴 : p > 0.063

Step-2.Understanding Data

 Categorical

Step-3. Checking the assumptions

 The sampling distribution of 𝑃 is approximately normally


distributed in accordance with the central limit theorem.
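Cont.…
Step-4. Test statistic
• $Z = \dfrac{\hat{p} - p_0}{\sqrt{p_0 q_0/n}}$
Step-5. Decision Rule
• With α = 0.05, we reject 𝐻0 if Z calculated ≥ Z critical or P-value < 0.05
Cont.…
Step-6. Determine critical value
• At α = 0.05 the critical value of Z (upper tail) is 1.645
Step-7. Compute test statistics
• $\hat{p} = 24/301 = 0.080$, $Z = \dfrac{0.080 - 0.063}{\sqrt{0.063(0.937)/301}} = \dfrac{0.017}{0.0140} = 1.21$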
Cont.…
Step-8. Decision
• Do not reject 𝐻0 because Z = 1.21 < 1.645, p = 0.1131
Step-9. Conclusion
• We cannot conclude that in the sampled population the proportion who are IFG is higher than 6.3 percent.
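The IFG example can be checked with the sketch below, which uses only the standard library; the function name `one_proportion_z` is an illustrative assumption. Note that the code uses the unrounded sample proportion 24/301, so it gives Z ≈ 1.19 and an upper-tail P-value ≈ 0.116, whereas the P-value of 0.1131 quoted above corresponds to Z = 1.21, obtained when the sample proportion is rounded to 0.080. The decision is the same either way.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z(x: int, n: int, p0: float):
    """Z statistic and upper-tail P-value for H0: p <= p0 vs HA: p > p0."""
    p_hat = x / n
    se = sqrt(p0 * (1 - p0) / n)          # standard error under H0
    z = (p_hat - p0) / se
    p_value = 1 - NormalDist().cdf(z)     # area to the right of z
    return p_hat, z, p_value

p_hat, z, p_value = one_proportion_z(x=24, n=301, p0=0.063)

# Normal approximation check: n*p0 and n*q0 should both be at least 5
assert 301 * 0.063 >= 5 and 301 * (1 - 0.063) >= 5

print(f"p-hat = {p_hat:.4f}, Z = {z:.2f}, P-value = {p_value:.4f}")
print("Reject H0" if z >= 1.645 else "Do not reject H0")
```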
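4. Hypotheses Tests for the Difference Between
Two Population Proportions
• The hypotheses can be formulated in one of the following forms
o 𝐻0: 𝑃1 − 𝑃2 = 0 vs. 𝐻𝐴: 𝑃1 − 𝑃2 ≠ 0
o 𝐻0: 𝑃1 − 𝑃2 ≥ 0 vs. 𝐻𝐴: 𝑃1 − 𝑃2 < 0
o 𝐻0: 𝑃1 − 𝑃2 ≤ 0 vs. 𝐻𝐴: 𝑃1 − 𝑃2 > 0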
Cont.…
• The most frequently tested hypothesis about the difference between two population proportions is that their difference is zero
• When the 𝐻0 to be tested is 𝑃1 − 𝑃2 = 0, we are hypothesizing that the two population proportions are equal
• We then use a pooled estimate of the hypothesized common proportion
Cont.…
• The pooled estimate for the overall proportion is; $\bar{p} = \dfrac{X_1 + X_2}{n_1 + n_2}$
• Where 𝑋1 = the observed number of events in the first sample and 𝑋2 = the observed number of events in the second sample
Cont.…
• The test statistic is $Z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\bar{p}(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$
Example
 A study was conducted to investigate the possible cause of
gastroenteritis outbreak following a lunch served in a high
school cafeteria. Among the 225 students who ate the
sandwiches, 109 became ill. While, among the 38 students
who did not eat the sandwiches, 4 became ill. Is there a
significant difference between the two groups at α =5%.
Answer
Step-1. Formulating hypotheses

 𝐻0 : 𝑃1 = 𝑃2 , 𝐻𝐴 : 𝑃1 ≠ 𝑃2

Step-2.Understanding Data

 Categorical

Step-3. Checking the assumptions

 Independent random sample and The sampling distribution of


𝑃1 − 𝑃2 is approximately normally distributed in accordance
with the central limit theorem.
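Cont.…
Step-4. Test statistic
• $Z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\bar{p}(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$
Step-5. Decision Rule
• With α = 0.05, we reject 𝐻0 if |Z calculated| ≥ Z critical or P-value < 0.05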
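Cont.…
Step-6. Determine critical value
• At α = 0.05 the critical values of Z are ±1.96
Step-7. Compute test statistics
o $\hat{p}_1 = \dfrac{X_1}{n_1} = \dfrac{109}{225} = 48.4\%$
o $\hat{p}_2 = \dfrac{X_2}{n_2} = \dfrac{4}{38} = 10.5\%$
o $\bar{p} = \dfrac{X_1 + X_2}{n_1 + n_2} = \dfrac{109 + 4}{225 + 38} = 0.43$
Cont.…
Step-7. Compute test statistics
o $Z = \dfrac{0.484 - 0.105}{\sqrt{0.43(0.57)\left(\dfrac{1}{225} + \dfrac{1}{38}\right)}} = \dfrac{0.379}{0.08683} \approx 4.36$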
Cont.…
Step-8. Decision
• Z calculated (4.36) > 1.96, or equivalently the P-value is less than 0.05. Therefore, we reject 𝐻0
Step-9. Conclusion
• The proportion of students who became ill differs in the two groups; those who ate the prepared sandwiches were more likely to develop gastroenteritis.
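The gastroenteritis example can be verified with the sketch below, again using only the standard library; the function name `two_proportion_z` is an illustrative assumption. Because the code does not round the intermediate proportions, it gives Z ≈ 4.37 rather than the 4.36 obtained with rounded values; the conclusion is unchanged.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(x1: int, n1: int, x2: int, n2: int):
    """Pooled two-sample Z test for H0: P1 = P2 (two-sided)."""
    p1, p2 = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)                       # pooled estimate under H0
    se = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * NormalDist().cdf(-abs(z))
    return p1, p2, p_pooled, z, p_value

# Ate sandwiches: 109 ill out of 225; did not eat: 4 ill out of 38
p1, p2, p_pooled, z, p_value = two_proportion_z(109, 225, 4, 38)
print(f"p1 = {p1:.3f}, p2 = {p2:.3f}, pooled = {p_pooled:.3f}")
print(f"Z = {z:.2f}, two-sided P-value = {p_value:.6f}")
print("Reject H0" if abs(z) >= 1.96 else "Do not reject H0")
```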
Google for more!!!
