Hypothesis Testing Examples


1. Test for a single mean (Large sample)

Ref: Statistics for Managers, 8th Ed., Levine et al.

An insurance agent has claimed that the average age of policyholders who insure through him is less
than the average for all agents, which is 30.5 years. A random sample of 100 policyholders who had
insured through him gave the following age distribution.

Age (years)     No. of persons
15.5 – 20.5     12
20.5 – 25.5     22
25.5 – 30.5     20
30.5 – 35.5     30
35.5 – 40.5     16
Total           100

Test the agent’s claim at the 5% level of significance.

Solution

Lower   Upper   Midpoint (x)   Freq (f)   x·f    x²·f
15.5    20.5    18             12         216    3888
20.5    25.5    23             22         506    11638
25.5    30.5    28             20         560    15680
30.5    35.5    33             30         990    32670
35.5    40.5    38             16         608    23104
Total                          100        2880   86980

Mean       28.8
Variance   40.76767677
SD         6.384957069
SE         0.638495707
Z          -2.66250811
Decision   Ho rejected

Sample size n = 100; sample mean $\bar{x}$ = 28.8 years; sample SD s = 6.385 years.

Null hypothesis Ho: μ = 30.5 against the alternative hypothesis H1: μ < 30.5.

Test statistic: $Z = \dfrac{\bar{x} - \mu}{SE(\bar{x})}$, where $SE(\bar{x}) = \dfrac{s}{\sqrt{n}}$. Reject Ho at the 5% level if (and only if) Z < -1.645.

Here $SE(\bar{x})$ = 6.385/√100 = 0.6385; Z = (28.8 - 30.5)/0.6385 = -2.663.

Since Z is less than -1.645, we reject Ho at the 5% level and conclude that the data support the insurance agent’s claim.
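
The grouped-data calculation above can be reproduced with a short script. This is a minimal sketch using only the Python standard library; the function name `grouped_z_test` and the variable names are illustrative choices, and the critical value -1.645 is simply copied from the text rather than computed.

```python
import math

# Class boundaries and frequencies copied from the age distribution above.
classes = [(15.5, 20.5, 12), (20.5, 25.5, 22), (25.5, 30.5, 20),
           (30.5, 35.5, 30), (35.5, 40.5, 16)]
MU_0 = 30.5      # hypothesised mean under Ho
Z_CRIT = -1.645  # lower-tail 5% critical value quoted in the text


def grouped_z_test(classes, mu_0):
    """Return (mean, sd, se, z) for grouped data, using class midpoints."""
    n = sum(f for _, _, f in classes)
    mids = [((lo + hi) / 2, f) for lo, hi, f in classes]
    sum_xf = sum(x * f for x, f in mids)
    sum_x2f = sum(x * x * f for x, f in mids)
    mean = sum_xf / n
    variance = (sum_x2f - n * mean ** 2) / (n - 1)  # sample variance
    sd = math.sqrt(variance)
    se = sd / math.sqrt(n)
    return mean, sd, se, (mean - mu_0) / se


mean, sd, se, z = grouped_z_test(classes, MU_0)
reject_h0 = z < Z_CRIT  # one-tailed (lower) test
print(f"mean={mean:.2f} sd={sd:.4f} se={se:.4f} z={z:.4f} reject Ho={reject_h0}")
```

The sketch divides by n - 1 when estimating the variance, which matches the variance shown in the worksheet; with n = 100 the choice between n and n - 1 barely affects Z.
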
2. Test for difference of two means

One measure of the performance of bank branches is the waiting time for the arriving customers. In
a study of waiting time for serving customers in two branches of a bank, 15 customers from each
branch have been randomly selected, and waiting times are recorded as follows:

Branch 1: 4.21 5.55 3.02 5.13 4.77 2.34 3.54 3.20 4.50 6.10
          0.38 5.12 6.46 6.19 3.79

Branch 2: 9.66 5.90 8.02 5.79 8.73 3.82 8.01 8.35 10.49 6.68
          5.64 4.08 6.17 9.91 5.47

Assuming that the population variances from both branches are equal, is there evidence of a
difference in the mean waiting time between the two branches? Use 5% level of significance.

Solution

                     Branch 1    Branch 2
Mean                 4.286667    7.114667
Standard Deviation   1.637985    2.082189
Sample Variance      2.682995    4.335512
Count                15          15

Assuming a common unknown population variance, the pooled estimate of the common standard deviation is

$s = \sqrt{\dfrac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}} = 1.8733$, and $SE(\text{mean difference}) = s\sqrt{\dfrac{1}{n_1}+\dfrac{1}{n_2}} = 0.6840$.

Null hypothesis Ho: µ1 = µ2; alternative hypothesis H1: µ1 ≠ µ2.

Test statistic: $t = \dfrac{\bar{x}_1 - \bar{x}_2}{SE} = -4.134$; degrees of freedom = n1 + n2 - 2 = 28.

The tabled two-tailed 5% critical value of t with 28 DF is 2.048.

We reject Ho if (and only if) the observed absolute t exceeds the tabled t.

Here absolute t = 4.134 > 2.048, so we reject Ho.

The mean waiting times at the two branches differ significantly at the 5% level of significance.
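
The pooled two-sample t calculation can be checked directly from the raw waiting times. The following is a minimal sketch using the standard library `statistics` module; `pooled_t` is a hypothetical helper name, and the critical value 2.048 is taken from the text rather than computed.

```python
import math
import statistics

# Waiting times copied from the problem statement.
branch1 = [4.21, 5.55, 3.02, 5.13, 4.77, 2.34, 3.54, 3.20, 4.50, 6.10,
           0.38, 5.12, 6.46, 6.19, 3.79]
branch2 = [9.66, 5.90, 8.02, 5.79, 8.73, 3.82, 8.01, 8.35, 10.49, 6.68,
           5.64, 4.08, 6.17, 9.91, 5.47]
T_CRIT = 2.048  # two-tailed 5% value for 28 DF, as quoted in the text


def pooled_t(a, b):
    """Return (t, df, pooled_sd) for the equal-variance two-sample t-test."""
    n1, n2 = len(a), len(b)
    v1, v2 = statistics.variance(a), statistics.variance(b)  # sample variances
    df = n1 + n2 - 2
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / df)
    se = pooled_sd * math.sqrt(1 / n1 + 1 / n2)
    t = (statistics.mean(a) - statistics.mean(b)) / se
    return t, df, pooled_sd


t_stat, df, pooled_sd = pooled_t(branch1, branch2)
reject_h0 = abs(t_stat) > T_CRIT  # two-tailed test
print(f"pooled sd={pooled_sd:.4f} t={t_stat:.3f} df={df} reject Ho={reject_h0}")
```
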
3. Test for independence of two attributes (Chi-square test)

Is gender independent of education level? A random sample of 395 people
was surveyed, and each person was asked to report the highest education
level they had obtained. The survey data are summarized in the following
table:

         High School   Bachelors   Masters   Ph.D.   Total
Female   60            54          46        41      201
Male     40            44          53        57      194
Total    100           98          99        98      395

Question: Are gender and education level dependent at 5% level of
significance? In other words, given the data collected above, is there a
relationship between the gender of an individual and the level of education
that they have obtained?

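Under the null hypothesis of independence, each expected count is (row total × column total) / grand total; for example, for females with a high school education it is 201 × 100 / 395 = 50.886. Here's the table of expected counts: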

         High School   Bachelors   Masters   Ph.D.    Total
Female   50.886        49.868      50.377    49.868   201
Male     49.114        48.132      48.623    48.132   194
Total    100           98          99        98       395

So, working this out, $\chi^2 = \dfrac{(60-50.886)^2}{50.886} + \cdots + \dfrac{(57-48.132)^2}{48.132} = 8.006$

The 5% critical value of χ2 with (2 - 1)(4 - 1) = 3 degrees of freedom is
7.815. Since 8.006 > 7.815, we reject the null hypothesis that gender and
education level are independent, and conclude that education level depends
on gender at the 5% level of significance.
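
The expected counts and the chi-square statistic can be recomputed from the observed table. This is a minimal standard-library sketch; `chi_square_independence` is an illustrative name, and the critical value 7.815 is copied from the text rather than derived.

```python
# Observed counts copied from the survey table
# (columns: High School, Bachelors, Masters, Ph.D.).
observed = [
    [60, 54, 46, 41],  # Female
    [40, 44, 53, 57],  # Male
]
CHI2_CRIT = 7.815  # 5% critical value for 3 DF, as quoted in the text


def chi_square_independence(rows):
    """Return (chi2, df, expected) for a two-way contingency table."""
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    grand = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[rt * ct / grand for ct in col_totals] for rt in row_totals]
    chi2 = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(rows, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return chi2, df, expected


chi2, df, expected = chi_square_independence(observed)
reject_h0 = chi2 > CHI2_CRIT
print(f"chi2={chi2:.3f} df={df} reject Ho={reject_h0}")
```
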
4. Test for equality of more than two means (One way ANOVA)

In Problem 2, we saw how to test whether the means of two populations are
equal. A natural question is what to do when we need to compare the means
of more than two populations.
Continuing from Problem 2, suppose data on the waiting times of 12 customers at one more
branch, say Branch 3, are available.

Branch 1: 4.21 5.55 3.02 5.13 4.77 2.34 3.54 3.20 4.50 6.10
          0.38 5.12 6.46 6.19 3.79

Branch 2: 9.66 5.90 8.02 5.79 8.73 3.82 8.01 8.35 10.49 6.68
          5.64 4.08 6.17 9.91 5.47

Branch 3: 7.25 6.92 6.16 8.25 5.59 9.93 11.21 10.29 11.68 9.95
          8.88 7.97

Assuming that the population variances from all branches are equal, is there evidence of a
difference in the mean waiting time among the three branches? Use 5% level of significance.

Suppose the population means are µ1, µ2, and µ3. We wish to test the null hypothesis

Ho: µ1 = µ2 = µ3 against the alternative Ha: not all of the means are equal.

The technique by which this can be carried out is called Analysis of Variance (ANOVA). We
assume that the underlying population variances are all equal, but unknown.
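
For reference, the entries of a one-way ANOVA table follow from the standard decomposition sketched here. The symbols are notation introduced for this sketch and do not appear in the original: k is the number of groups, n_i, $\bar{x}_i$ and $s_i^2$ are the size, mean and sample variance of group i, N is the total number of observations, and $\bar{x}$ is the grand mean.

```latex
\begin{aligned}
SS_{\text{between}} &= \sum_{i=1}^{k} n_i\,(\bar{x}_i - \bar{x})^2, & df_{\text{between}} &= k - 1,\\
SS_{\text{within}}  &= \sum_{i=1}^{k} (n_i - 1)\,s_i^2,             & df_{\text{within}}  &= N - k,\\
F &= \frac{SS_{\text{between}}/(k-1)}{SS_{\text{within}}/(N-k)}.
\end{aligned}
```

With k = 3 branches and N = 15 + 15 + 12 = 42 observations, the degrees of freedom are 2 and 39.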

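Anova: Single Factor

SUMMARY
Groups     Count   Sum      Average    Variance
Branch 1   15      64.3     4.286667   2.682995238
Branch 2   15      106.72   7.114667   4.335512381
Branch 3   12      104.08   8.673333   3.903442424

ANOVA
Source of Variation   SS         df   MS         F             P-value       F crit
Between Groups        135.7254   2    67.86271   18.74435236   1.97521E-06   3.238096135
Within Groups         141.197    39   3.620435
Total                 276.9224   41

Since F = 18.744 exceeds the critical value 3.238 (equivalently, the P-value of 1.97521E-06 is below 0.05), we reject Ho at the 5% level and conclude that the mean waiting times are not the same across the three branches.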
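
The sums of squares and the F ratio in the ANOVA table can be recomputed from the raw data. This is a minimal standard-library sketch; `one_way_anova` is a hypothetical helper name, and the critical value is copied from the "F crit" column rather than computed.

```python
import statistics

# Waiting times copied from the problem statement.
samples = [
    [4.21, 5.55, 3.02, 5.13, 4.77, 2.34, 3.54, 3.20, 4.50, 6.10,
     0.38, 5.12, 6.46, 6.19, 3.79],                      # Branch 1
    [9.66, 5.90, 8.02, 5.79, 8.73, 3.82, 8.01, 8.35, 10.49, 6.68,
     5.64, 4.08, 6.17, 9.91, 5.47],                      # Branch 2
    [7.25, 6.92, 6.16, 8.25, 5.59, 9.93, 11.21, 10.29, 11.68, 9.95,
     8.88, 7.97],                                        # Branch 3
]
F_CRIT = 3.238096135  # 5% critical value for (2, 39) DF, from the table


def one_way_anova(groups):
    """Return (ss_between, ss_within, df_between, df_within, f_stat)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2
                     for g in groups)
    ss_within = sum((len(g) - 1) * statistics.variance(g) for g in groups)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_stat


ss_b, ss_w, df_b, df_w, f_stat = one_way_anova(samples)
reject_h0 = f_stat > F_CRIT
print(f"SSB={ss_b:.4f} SSW={ss_w:.4f} df=({df_b}, {df_w}) "
      f"F={f_stat:.4f} reject Ho={reject_h0}")
```
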
