Chapter 4.3 ZICA
Finding a good sample is often not easy. For example, if we sample the fruit at the
top of a basket, we may have no idea whether there is bad fruit in the middle or at the
bottom of the basket. If we are studying college students’ attitudes and we
interview students on the steps of the School of Technology, we may never
encounter a business student. A sample only provides an estimate of a population
measure, and the accuracy of the estimate will depend on:
a) Choosing the right sample size: the larger the sample size, the greater the
probability that the sample is representative of the population.
b) Selecting the right sampling method so that the sample represents the
population.
Why do we sample?
1) Time and cost are probably the two most important reasons for sampling.
If one wants to determine if customers in a given market will buy a
product, one doesn’t usually have the time or funds to interview all
potential customers. For example, try interviewing everyone who uses
Protex in Zambia.
2) Testing may prove destructive. If you want to test the durability of your
product, stress tests on the entire output may leave you with no product to
sell.
3) Accuracy is another reason for sampling. One would think that a study of
the entire population would be more accurate than a study of a sample, but
this is not always so. If one has a very large population and wishes to take
a complete count, then one must hire a large number of inventory takers who
must work at a rapid rate. As more personnel are hired, it is likely that
they will be less efficient than the original employees. Thus, a limited
number of skilled workers, studying a well-defined sample, may provide more
accurate results than a survey of the entire population.
Sampling Designs
Probability Samples
Taking a sample to use for the study of a population means that one must make
sure that the sample really represents the population. The first problem is to
decide which population should be sampled. We need a sampling
frame. A sampling frame is a list of every item or member of the population to
be sampled. Individual items should be selected at random, i.e. a random
sampling procedure that yields a representative sample requires that everyone in the
population has an equal chance of being chosen.
Choosing people from a list may be done using a table of random numbers. If we
were interested in sampling a group of 10 people, for example, we could number
10 pieces of paper from 0 through 9 and put them in a container. Then, we could
mix the pieces of paper and draw out a number. This is essentially what has been
done when a random number table is constructed. Such a table uses numbers of
one, two, three, four or more digits, and a computer picks a number at random.
These numbers are then recorded in a table such as Table 1 in the Appendix.
Table 1 shows a list of randomly selected numbers. Let us say that we wanted to
select a sample of 20 people from a list of 500. First we number everyone on the
list from 1 to 500. Next we decide which number to choose as the start of
our sample.
Assume we close our eyes and put our pencil down upon a number, say 62463,
the sixth number in the third column. Now, assume that we continue reading down
this column from this point on and then start at the top of the next column. We
discard numbers that are too large for our list (that is, only three-digit numbers
less than or equal to 500 are retained). We could just as easily read across the
rows. Any procedure is acceptable as long as we follow it consistently.
Now let us pick 20 numbers. We have 022, 077, 065, 484, 140, 135, 239, 074,
037, 275, 474, 075, 145, 401, 264, 076, 449, 374, 230 and 041. Note that if a
number has been selected previously, we discard it. We now look up these numbers
on our numbered list and interview the appropriate persons.
Nowadays, many hand calculators have random number programs that can be
used to generate a set of random numbers. The method just described is
called simple random sampling.
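The same selection can be sketched with a computer's random number generator instead of a printed table. The short Python example below is only an illustration: the function name `simple_random_sample` and the `seed` argument are assumptions made here, not part of the text.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Pick distinct labels from 1..population_size, each equally likely."""
    rng = random.Random(seed)  # seed is optional; it only makes the draw repeatable
    # rng.sample draws without replacement, so a label is never chosen twice
    labels = rng.sample(range(1, population_size + 1), sample_size)
    return sorted(labels)

# 20 people chosen from a numbered list of 500, as in the example above
sample = simple_random_sample(500, 20, seed=1)
print(sample)
```

Drawing without replacement plays the same role as discarding a number that has already been selected from the table.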
Systematic Random Sampling
This method consists of choosing a random number from the random number table and
counting down to this number on our list. We then select additional units at evenly
spaced intervals (every kth population unit, k > 1) until the sample has been formed. If
the list follows some regular pattern, this method should not be used, because the
pattern might cause bias.
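A minimal Python sketch of this procedure follows. It assumes that the interval k is the population size divided by the sample size (rounded down) and that the random start is taken within the first k units. Both choices, and the name `systematic_sample`, are illustrative assumptions rather than rules stated in the text.

```python
import random

def systematic_sample(population_size, sample_size, seed=None):
    """Random start, then every k-th unit on the numbered list."""
    rng = random.Random(seed)
    k = population_size // sample_size   # sampling interval (assumed rule)
    start = rng.randint(1, k)            # random start within the first interval
    return [start + i * k for i in range(sample_size)]

# 20 units from a list of 500: the interval is 25
picked = systematic_sample(500, 20, seed=3)
print(picked)
```

Because every selected unit is exactly k places after the previous one, any cycle of length k in the list would be repeated throughout the sample, which is the source of the bias mentioned above.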
Cluster Sampling
Clusters are identified in the population, a set of clusters is randomly sampled, and a
complete census is taken of each sampled cluster to form the sample. Clustering tends to
decrease costs but increase sampling error for the same sample size, because people who
live close together are more likely to be similar than others. For example, a few
geographical areas (perhaps a township or a road in a town) are selected at random and
every single household in each selected area is included in the sample.
Non Probability Samples
These samples do not rely on the laws of probability for selection, but depend on the
judgment of interviewers or their supervisors.
Convenience samples consist of studies of people who happen to be available or who call
in their results. Suppose we are interested in conducting a study of the attitudes of
shoppers at a new shopping center on the kinds of stores in the center, the attractiveness
of the center, parking difficulties, and so on. To collect sample information, we ask
persons to participate in the survey who happen to walk past the central area of the
center. The sample in this instance is a convenience sample: the people are not being
selected according to a probability plan and, presumably, judgment is not being used in
selecting those to participate in the survey. Convenience samples are prone to bias by
their very nature: population elements that are convenient to choose are almost
always special or different from the rest of the elements in the population in
some way.
Judgment Sampling
Judgment sampling is done by an expert who is familiar with the population measures.
The expert selects the units from the population. The quality of a judgment sample depends
on the competence of the expert who selects the population units to be sampled.
Quota Sampling
Quota sampling attempts to ensure that the sample represents the characteristics of the
population. The interviewer is free to select anyone who meets the given specifications.
He or she may choose in a non-random fashion. We cannot make a good estimate of
sampling error because we have not used a random sampling procedure. The method is
cheap and reasonably effective and in consequence is widely used.
4.4 Statistical Inferences
Introduction
Statistical inference can be divided into two parts, namely estimation and
hypothesis testing. First we deal with estimation, which is the procedure,
rule or formula used to estimate a population characteristic (parameter).
Sample measures (or statistics) are used to estimate population measures (such as
the population mean $\mu$, the Greek letter ‘mu’, and the population variance $\sigma^2$,
where $\sigma$ is the Greek letter ‘sigma’). The corresponding sample measures are the
sample mean $\bar{x}$, pronounced x-bar, and the sample variance $S^2$ respectively.
Estimation
In practical situations, it is not possible to have all four qualities in one
estimator. The researcher chooses which qualities he/she wants the estimator to
have.
Distribution of Sample Means
If the population is normally distributed, or if the sample size is ‘large’ (i.e. n > 30), then
the distribution of sample means is approximately normal.
Example 1
A normal population has a mean of 500 and a standard deviation of 125. Find the
probability that a sample of 65 values has a mean greater than 538.
We have $\mu = 500$ and $\sigma = 125$. The distribution of sample means has mean
$\mu_{\bar{x}} = 500$ and standard error
\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{125}{\sqrt{65}} = 15.50 \]
\[ Z = \frac{538 - 500}{15.50} = 2.45 \]
Therefore the probability that a sample mean is greater than 538 is equal to the area in
the upper tail beyond Z = 2.45, which is 0.5000 – 0.4929 = 0.0071 from the tables.
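The calculation in Example 1 can be checked numerically. The sketch below uses the normal distribution from Python's standard library in place of printed tables; the function name `prob_sample_mean_above` is an assumption introduced for illustration.

```python
from math import sqrt
from statistics import NormalDist

def prob_sample_mean_above(mu, sigma, n, threshold):
    """Return (standard error, Z, P(sample mean > threshold))."""
    se = sigma / sqrt(n)                # standard error of the mean
    z = (threshold - mu) / se           # standardised threshold
    upper_tail = 1 - NormalDist().cdf(z)
    return se, z, upper_tail

# Example 1: mu = 500, sigma = 125, n = 65, threshold 538
se, z, p = prob_sample_mean_above(500, 125, 65, 538)
print(round(se, 2), round(z, 2), round(p, 4))
```

The values agree with the worked figures of about 15.50, 2.45 and 0.0071; tiny differences can arise because the tables round Z to two decimal places.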
Example 2
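The daily output from a production line has a mean of 7500 units with a standard deviation
of 500 units. What is the probability that during the next 125 days the average output
will be under 7400 units per day?
\[ \sigma_{\bar{x}} = \frac{500}{\sqrt{125}} = 44.72 \]
\[ Z = \frac{7400 - 7500}{44.72} = -2.24 \]
From the tables, the area below Z = −2.24 is 0.5000 – 0.4875 = 0.0125, so the required
probability is about 0.0125.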
Confidence Intervals
If we have chosen a good sample and have calculated the mean from the sample for the
effect we wish to study, we may offer this estimate to the public or a company as an
estimate for the population mean. This is called a point estimate. The only problem is
that we offer evidence from one sample about the nature of the population, and we have
no idea how reliable this estimate is.
Reliability depends on the sample size n and the amount of variability in the original
population, $\sigma$. Even more helpful would be a combination of the estimate, the standard
error of the estimate and some notion of the probability of being correct. This
information is contained in what is called a confidence interval.
Since sample means are normally distributed, we can use normal curve probabilities to
describe our estimates.
Example 3
Find the 95% confidence limits for the average daily output over 125 days given in
Example 2.
We are 95% sure that the average output over 125 days will be between 7412.35 and
7587.65 units per day.
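The limits quoted in Example 3 can be reproduced as the sample mean plus or minus $Z_{\alpha/2}$ standard errors. The Python sketch below is illustrative only; the name `mean_confidence_limits` is an assumption, and the Z value comes from the standard library rather than from printed tables.

```python
from math import sqrt
from statistics import NormalDist

def mean_confidence_limits(mean, sigma, n, confidence):
    """Limits mean ± z * sigma / sqrt(n) for a known standard deviation."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96 for 95%
    margin = z * sigma / sqrt(n)
    return mean - margin, mean + margin

# Example 3: mean 7500, sigma 500, n = 125 days, 95% confidence
lower, upper = mean_confidence_limits(7500, 500, 125, 0.95)
print(round(lower, 2), round(upper, 2))
```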
The principles involved in setting confidence limits can be used to determine what
sample size should be taken, if we wish to achieve a given level of precision.
Example 4
Suppose the daily output from a production line has a mean of 8000 units with a
standard deviation of 534 units. If the probability is 0.99 that the error will be at most
116 units, how large should the sample size be?
Let d be the error term, which is half the width of the confidence interval. Hence d = 116.
Therefore
\[ d = Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}, \quad \alpha = 0.01, \quad \frac{\alpha}{2} = 0.005, \quad Z_{0.005} = 2.58 \text{ from the tables.} \]
Hence
\[ 116 = \frac{2.58(534)}{\sqrt{n}} \]
\[ n = \left( \frac{(2.58)(534)}{116} \right)^2 = 141.06 \]
\[ n \cong 142 \]
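The sample-size rule can be sketched in Python by solving $d = Z_{\alpha/2}\,\sigma/\sqrt{n}$ for n and rounding up. The example assumes the two-decimal table value z = 2.58 for 99% confidence (the value this chapter uses for the 1% two-tailed level); the function name `sample_size_for_mean` is hypothetical.

```python
from math import ceil

def sample_size_for_mean(sigma, d, z):
    """Smallest n with z * sigma / sqrt(n) <= d, i.e. n >= (z * sigma / d) ** 2."""
    return ceil((z * sigma / d) ** 2)

# sigma = 534 units, maximum error d = 116 units, z = 2.58 (assumed table value)
n = sample_size_for_mean(sigma=534, d=116, z=2.58)
print(n)
```

The result is always rounded up, because rounding down would give an error slightly larger than d.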
For random samples of size n taken from a finite population having the mean $\mu$ and
standard deviation $\sigma$, the sampling distribution of $\bar{x}$ has the mean
$\mu_{\bar{x}} = \mu$ and the standard deviation
\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} \]
where N is the population size. The correction is applied when the sample size n is
greater than 5% of the population size.
Example 5
A sample of 120 is drawn from a population of 1200 with a sample standard deviation of
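9 centimeters. What is the finite population correction factor? What is the standard error
of the mean?
\[ \text{fpc} = \sqrt{\frac{N - n}{N - 1}} \quad \text{if } n > 5\%\ \text{of } N \]
\[ \text{fpc} = \sqrt{\frac{1200 - 120}{1200 - 1}} = 0.949 \]
\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} = \frac{9}{\sqrt{120}} (0.949) = 0.7797 \text{ cm} \]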
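As a check on Example 5, the correction factor and the corrected standard error can be computed directly. This is a small illustrative sketch; the function name `finite_population_se` is an assumption.

```python
from math import sqrt

def finite_population_se(sigma, n, N):
    """Return (fpc, standard error of the mean with the fpc applied)."""
    fpc = sqrt((N - n) / (N - 1))       # finite population correction factor
    se = sigma / sqrt(n) * fpc
    return fpc, se

# Example 5: sigma = 9 cm, n = 120, N = 1200 (n is 10% of N, so the fpc applies)
fpc, se = finite_population_se(9, 120, 1200)
print(round(fpc, 3), round(se, 4))
```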
So far the process of statistical inference has been applied to the arithmetic mean. The
standard error of a sample proportion is given by
\[ \sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} \quad \text{and} \quad \mu_{\hat{p}} = P \]
where q = 1 − p. Using this value, we are able to set a confidence interval for the estimate
of the population proportion based on the sample proportion in exactly the same way as
outlined previously for the mean:
\[ \mu_{\hat{p}} \pm Z_{\alpha/2} \sqrt{\frac{pq}{n}} \]
Example 6
a) Find the mean and standard deviation of the proportion of defectives obtained in
a sample of 500 items.
b) Find the 95% confidence limits for the proportion of defectives in a sample of 500
items.
\[ \therefore \text{mean} = \mu_{\hat{p}} = P = 0.05 \]
\[ \text{Standard deviation} = \sigma_{\hat{p}} = \sqrt{\frac{P(1 - P)}{n}} = \sqrt{\frac{0.05(0.95)}{500}} = 0.00975 \]
Example 7
In a random sample of 375 employees, 68% were found to be in favour of strike action.
Find the 99% confidence limits for the proportion of all employees in the company who
are in favour of such action.
We are given $\hat{p} = 0.68$ and n = 375.
\[ \therefore \mu_{\hat{p}} = \hat{p} = 0.68 \]
\[ \sigma_{\hat{p}} = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = \sqrt{\frac{(0.68)(0.32)}{375}} = 0.0241 \]
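The remaining step, turning the standard error into confidence limits, can be sketched in Python as $\hat{p} \pm Z_{\alpha/2}\,\sigma_{\hat{p}}$. The function name `proportion_confidence_limits` is an assumption, and the Z value is taken from the standard library rather than from printed tables, so the limits may differ slightly from a table-based calculation.

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_limits(p_hat, n, confidence):
    """Return (standard error, lower limit, upper limit) for a proportion."""
    se = sqrt(p_hat * (1 - p_hat) / n)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 2.58 for 99%
    return se, p_hat - z * se, p_hat + z * se

# Example 7: 68% of 375 employees in favour, 99% confidence
se, lower, upper = proportion_confidence_limits(0.68, 375, 0.99)
print(round(se, 4), round(lower, 3), round(upper, 3))
```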
When the sample size is less than 30 and the population variance is
unknown (i.e. S, the sample standard deviation, is used as an estimate of $\sigma$, the
population standard deviation), the confidence limits for the true population mean are
given by
\[ \bar{x} \pm t_{\alpha/2,\, n-1} \frac{S}{\sqrt{n}} \]
where
\[ S = \sqrt{\frac{\sum x^2 - n\bar{x}^2}{n - 1}} \]
The t statistic is
\[ t = \frac{\bar{x} - \mu}{S / \sqrt{n}} \]
where $\bar{x}$ is the mean of a random sample of n measurements, $\mu$ is the population mean
of the x distribution, and S is the sample standard deviation.
ii) It is flatter than the Normal distribution, i.e. the areas near the tails are greater than
those of the Normal distribution.
iii) As the sample size becomes larger the t distribution approaches the Normal
distribution. To use the t distribution, refer to the tables in Appendix 1.
Example 8
A random sample of 12 men is taken and is found to have a mean height of 1.67cm and a
standard deviation of 0.48cm. Find the following confidence limits for the population mean:
i) 99%
ii) 95%
i) $\bar{x} = 1.67$, n = 12 and S = 0.48. $1 - \alpha = 0.99$, $\alpha = 0.01$ and
$\alpha/2 = 0.005$. Hence $t_{\alpha/2,\, n-1} = t_{0.005,\, 11} = 3.106$ from Table 2.
The 99% confidence limits are given by:
\[ \bar{x} \pm t_{\alpha/2,\, n-1} \frac{S}{\sqrt{n}} = 1.67 \pm 3.106 \frac{0.48}{\sqrt{12}} = 1.67 \pm 0.430 \]
= 1.24cm to 2.10cm
Thus the 99% confidence limits for the population mean are 1.24cm and 2.10cm.
ii) $1 - \alpha = 0.95$, $\alpha = 0.05$, $\alpha/2 = 0.025$. Therefore
$t_{\alpha/2,\, n-1} = t_{0.025,\, 11} = 2.201$ from Table 2. The 95% confidence limits
are given by
\[ \bar{x} \pm t_{\alpha/2,\, n-1} \frac{S}{\sqrt{n}} = 1.67 \pm 2.201 \frac{0.48}{\sqrt{12}} = 1.67 \pm 0.305 \]
= 1.365cm to 1.975cm
Thus the 95% confidence limits for the population mean are 1.365cm and
1.975cm.
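The arithmetic of Example 8 can be checked with a few lines of Python. Python's standard library has no t table, so this sketch takes the critical value as an input read from Table 2; the function name `t_confidence_limits` is an assumption.

```python
from math import sqrt

def t_confidence_limits(mean, s, n, t_critical):
    """Limits mean ± t * s / sqrt(n); t_critical is read from a t table."""
    margin = t_critical * s / sqrt(n)
    return mean - margin, mean + margin

# Example 8: mean 1.67, S = 0.48, n = 12 (11 degrees of freedom)
low99, high99 = t_confidence_limits(1.67, 0.48, 12, 3.106)   # t(0.005, 11)
low95, high95 = t_confidence_limits(1.67, 0.48, 12, 2.201)   # t(0.025, 11)
print(round(low99, 2), round(high99, 2))
print(round(low95, 3), round(high95, 3))
```

The 99% interval is wider than the 95% interval because a larger t value is needed to capture the mean with higher confidence.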
Exercise 4
1) A study of a sample of 500 bank accounts is made to estimate the average size of
a bank account. The sample mean is calculated to be K500 000. From previous
studies of bank accounts, it is known that the standard deviation is K67 000.
Construct a 99% confidence interval for the mean size of bank accounts.
2) An ice cream factory wishes to know the average number of women per block of
houses in a given compound. A sample of 120 blocks of houses within the
compound indicates that the average number of women is 94. When the standard
deviation is estimated for those 120 blocks of houses, it is found to be S = 15.04.
Calculate a 95% confidence interval for the number of women.
3) Assume that some college students want to find out what percent of the
population will vote for the MMD candidate. A sample of 125 voters reveals that
65 will vote MMD. Should we predict that the MMD candidate will win
(assuming that there are just two parties)? Construct a 98% confidence interval
that the MMD will win.
4) A study of a sample of 125 customers at the college bookstore indicated
20% preferred new books while 80% wanted used ones.
8) The Local Authority wishes to test the durability of a new paint for white
center lines. A highway department has painted test strips across heavily
travelled roads in nine different locations, and automatic counters showed
that they disappeared after having been crossed by 200, 245, 235, 225, 220,
230, 235, 248 and 250 cars. Construct a 99% confidence interval for the
average number of crossings this paint can withstand before it disappears.
This belief about the population parameter is called the null hypothesis, denoted $H_0$. The
value specified in the null hypothesis is often a historical value, a claim or a production
specification. The opposite of the null hypothesis is the alternative hypothesis, denoted
by $H_a$ or $H_1$.
For example, if the average score of mathematics students was 85 ten years ago, we might
use a null hypothesis $H_0: \mu = 85$ for a study involving the average score of this year’s
mathematics class. If television networks claim that the average length of time devoted
to commercials in a 45-minute program is 10 minutes, we would use $H_0: \mu = 10$
minutes as our null hypothesis in a study regarding the length of time devoted to
commercials. The alternative hypothesis is accepted if the null hypothesis is rejected. In
the two examples above, if we believe the average score of mathematics students is
greater than it was 10 years ago, we could use an alternative hypothesis $H_1: \mu > 85$,
while in the commercial case, if we believe the length of time devoted to commercials is
not 10 minutes, we could use an alternative hypothesis $H_1: \mu \neq 10$.
If we reject the null hypothesis when it is in fact true, we have made an error that is
called a Type I error. On the other hand, if we accept the null hypothesis when it is in fact
false, we have made an error that is called a Type II error. Table 4.1 summarizes these
results.
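Table 4.1
Our Decision      H0 is true          H0 is false
Accept H0         Correct decision    Type II error
Reject H0         Type I error        Correct decision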
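The test statistic for a population mean is
\[ Z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}}, \quad \text{i.e.} \quad Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \]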
A selection of ‘significant’ values of Z together with the significance level α are given
below.
Example 9
$H_0: \mu = 150$
$H_a: \mu \neq 150$
This type of test is called a two-tailed test. If $\mu \neq 150$, it could either be less than 150
or greater than 150, hence the term two-tailed test.
[Figure: standard normal curve with an unshaded region of area 1 − α = 0.95 between −1.96 and 1.96, and a shaded region of area α/2 = 0.025 in each tail.]
We are given α = 0.05. Since it is a two-tailed test, we share this area equally between the
two tails, i.e. α/2 = 0.05/2 = 0.025. The shaded area is called the rejection region and the
unshaded area the acceptance region. The point separating the rejection region from the
acceptance region is called the critical point.
We are given that σ = 30, n = 24 and $\bar{x} = 168$. Now
\[ Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{168 - 150}{30 / \sqrt{24}} = 2.94 \]
A ‘significant’ value of Z at the 5% level is 1.96, i.e. the 95% confidence limits for Z are
–1.96 and +1.96 (see diagram above). Therefore, our value of Z is significant (i.e. it is
outside the confidence limits). We reject $H_0$. We thus accept that the population mean
is not equal to 150.
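The decision rule of Example 9 can be sketched in Python as follows. The sketch takes the population standard deviation as 30, with n = 24, a sample mean of 168 and a hypothesised mean of 150; that reading of the given data, and the function name `z_test_mean`, are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_test_mean(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed Z test; returns (Z, critical value, reject H0?)."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    critical = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 when alpha = 0.05
    return z, critical, abs(z) > critical

z, critical, reject = z_test_mean(168, 150, 30, 24)
print(round(z, 2), round(critical, 2), reject)
```

Here Z falls outside the interval from −1.96 to +1.96, so the null hypothesis is rejected, in line with the conclusion above.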
Example 10
A random sample of 12 family toy cars is found to have an average retail price of K300
000. Assuming that toy car prices are Normally distributed with a standard deviation of
K50 000.00, test the assumption (at the 5% level) that the average price of a family toy
car is:
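a) $H_0: \mu = 350\,000$
$H_1: \mu \neq 350\,000$
\[ Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{300\,000 - 350\,000}{50\,000 / \sqrt{12}} = \frac{-50\,000}{50\,000 / 3.4641} = -3.46 \]
[Figure: standard normal curve with rejection regions of area 0.025 in each tail beyond −1.96 and 1.96, and an acceptance region of area 1 − α = 0.95.]
b) $H_0: \mu = 350\,000$
$H_1: \mu > 350\,000$
Z = −3.46
[Figure: standard normal curve with a rejection region of area 5% in the upper tail beyond Z = 1.65.]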
Example 11
A random sample of 12 items is obtained from a Normal population and is found to have
a mean = 50 with a standard deviation = 7. Test the assumption, at the 5% significance
level, that the population mean is 40.
H o : µ = 40
H1 : µ ≠ 40 (two tailed test).
Now the sample size is small and hence the test statistic is no longer Z but t, given by
\[ t = \frac{\bar{x} - \mu}{S / \sqrt{n}} = \frac{50 - 40}{7 / \sqrt{12}} = \frac{10}{2.021} = 4.95 \]
Example 12
An assessment test is given to all prospective employees in a company. Test scores are
known to be Normally distributed. A random sample of 7 participants obtained the
following results: 49, 58, 68, 66, 75, 85, 82.
Test the assumption that the mean test score is 65 using the 5% significance level.
$H_0: \mu = 65$
$H_1: \mu \neq 65$
This is a two-tailed test. We have x: 49, 58, 68, 66, 75, 85, 82. Now
\[ \bar{x} = \frac{\sum x}{n} = \frac{483}{7} = 69, \text{ and} \]
\[ S = \sqrt{\frac{\sum x^2 - n\bar{x}^2}{n - 1}} = \sqrt{\frac{34319 - 7(69)^2}{6}} = 12.858 \]
\[ \therefore t = \frac{\bar{x} - \mu}{S / \sqrt{n}} = \frac{69 - 65}{12.858 / \sqrt{7}} = \frac{4}{4.86} = 0.823 \]
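The t statistic of Example 12 can also be computed directly from raw scores. The sketch below assumes the seven scores are 49, 58, 68, 66, 75, 85 and 82, the set whose sum (483) and sum of squares (34319) match the worked solution; the function name `t_statistic` is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def t_statistic(data, mu0):
    """One-sample t statistic; stdev() uses the n - 1 divisor, like S in the text."""
    n = len(data)
    return (mean(data) - mu0) / (stdev(data) / sqrt(n))

# Assumed data: the scores consistent with sum 483 and sum of squares 34319
scores = [49, 58, 68, 66, 75, 85, 82]
t = t_statistic(scores, 65)
print(round(mean(scores), 2), round(stdev(scores), 3), round(t, 3))
```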
Example 13
We write: $H_0: \mu = 250\,000$
$H_a: \mu > 250\,000$
t = 3.125
\[ Z = \frac{\hat{P} - \mu_{\hat{p}}}{\sigma_{\hat{p}}} \]
Example 14
It is assumed that over half of the employees in a large company are in favour of a
proposed new salary structure. A sample of 250 employees found that 42% were in
favour. Does this sample verify the assumption? (use the 1% significance level)
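$H_0: P = 0.50$
$H_1: P < 0.50$
Now,
\[ \sigma_{\hat{p}} = \sqrt{\frac{P(1 - P)}{n}} = \sqrt{\frac{0.5(0.5)}{250}} = 0.0316 \]
\[ \therefore Z = \frac{\hat{P} - \mu_{\hat{p}}}{\sigma_{\hat{p}}} = \frac{0.42 - 0.50}{0.0316} = -2.53 \]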
Example 15
It is required to test the hypothesis that 56% of households have a television set. A
random sample of 500 households found that 75% of the sample had television sets. The
significance level is 1%.
This is a two tailed test because we wish to test the hypothesis as it is and not against a
specific alternative hypothesis that the real proportion is either larger or smaller.
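i.e. $H_0: P = 0.56$
$H_1: P \neq 0.56$
Now,
\[ \sigma_{\hat{p}} = \sqrt{\frac{0.56(0.44)}{500}} = 0.0222 \]
\[ \therefore Z = \frac{0.75 - 0.56}{0.0222} = 8.56 \]
At the 1% level of significance for a two-tailed test the appropriate Z value is 2.58.
Since 8.56 is greater than 2.58, we reject $H_0$.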
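The test statistic for a single proportion, as used in Examples 14 and 15, can be sketched in Python. Note that the standard error is computed from the hypothesised proportion, not the sample proportion; the function name `z_test_proportion` is an assumption.

```python
from math import sqrt

def z_test_proportion(p_hat, p0, n):
    """Return (standard error under H0, Z) for a sample proportion p_hat."""
    se = sqrt(p0 * (1 - p0) / n)        # uses the hypothesised proportion p0
    return se, (p_hat - p0) / se

se14, z14 = z_test_proportion(0.42, 0.50, 250)   # Example 14
se15, z15 = z_test_proportion(0.75, 0.56, 500)   # Example 15
print(round(se14, 4), round(z14, 2))
print(round(se15, 4), round(z15, 2))
```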
Exercise 5
2) A machine makes twist-off caps for bottles. The machine is adjusted to make
caps of diameter 1.87cm. Production records show that when the machine is so
adjusted, it will make caps with mean diameter 1.87cm and with standard
deviation σ = 0.045cm. An inspector checks the diameters of caps to see
if the machine is not functioning properly, in which case the mean diameter is no longer
1.87cm. A sample of 65 caps is taken and the mean diameter for this sample, $\bar{x}$, is
found to be 1.98cm. Is the machine working properly, or is $\mu \neq 1.87$? (Use a 5%
level of significance.)
4) A random sample of seven bank accounts show balances equal to: K270 000,
K120 000, K1600 000, K620 000, K1980 000, K3200 000, K2600 000. Test the
assumption that the mean bank balance is K1250 000. (use the 5% significance
level).
7) A team of eye surgeons has developed a new technique for a risky eye operation
to restore the sight of people blinded from a certain disease. Under the old
method, it is known that only 45% of the patients who undergo this operation
recover their eyesight. Suppose that surgeons in various hospitals have performed
a total of 230 operations using the new method and that 98 have been successful
(the patients fully recovered their sight). Can we justify the claim that the new
method is better than the old one? (use a 5% level of significance).
Hypothesis Testing of the Difference Between Two means
For large samples, the distribution of differences between sample means is approximately
normal, whatever the distribution of the populations from which the samples are drawn.
When n > 30, i.e. for large samples, the Normal area tables are used. When n < 30 the t
distribution is used.
The standard error of the difference of means is
\[ \hat{\sigma}_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}} \]
and the test statistic is
\[ Z = \frac{\bar{x}_1 - \bar{x}_2}{\hat{\sigma}_{(\bar{x}_1 - \bar{x}_2)}} \]
Example 1
A psychological study was conducted to compare the reaction times of men and women
to a certain stimulus. Independent random samples of 50 men and 50 women were
employed in the experiment. The results are shown in the table below. Do the data
present sufficient evidence to suggest a difference between the mean reaction times
for men and women? Use α = 0.05.
Men                    Women
$n_1 = 50$             $n_2 = 50$
$\bar{x}_1 = 43$       $\bar{x}_2 = 37$
$S_1^2 = 20$           $S_2^2 = 12$
We have
$H_0: \mu_1 = \mu_2$
$H_a: \mu_1 \neq \mu_2$ (two-tailed)
\[ \hat{\sigma}_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}} = \sqrt{\frac{20}{50} + \frac{12}{50}} = 0.8 \]
Now
\[ Z = \frac{\bar{x}_1 - \bar{x}_2}{\hat{\sigma}_{(\bar{x}_1 - \bar{x}_2)}} = \frac{43 - 37}{0.8} = 7.5 \]
A significant value of Z at the 5% level is Z = 1.96. Therefore our value of Z = 7.5 is
significant. We reject $H_0$. We thus conclude that there is a significant difference
between the mean reaction times of men and women.
Example 2
A consumer group is testing camp stoves. To test the heating capacity of a stove, the
group measures the time required to bring 2 litres of water from 10 °C to boiling (at sea
level).
Two competing models are under consideration. Thirty-seven of each model are tested
and the following results are obtained.
Is there any difference between the performances of these two models (use a 1% level of
significance)?
We have
$H_0: \mu_1 = \mu_2$
$H_a: \mu_1 \neq \mu_2$
\[ Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}} = \frac{12.5 - 10.1}{\sqrt{\frac{(2.6)^2}{37} + \frac{(3.2)^2}{37}}} = \frac{2.4}{0.678} = 3.54 \]
Z = 3.54 is greater than the critical value of Z = 2.58. Therefore, we reject $H_0$ and
accept that there is a significant difference between the performances of
these two models.
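The two-sample calculation can be sketched in Python. The example uses the figures that appear in the worked solution (means 12.5 and 10.1, standard deviations 2.6 and 3.2, 37 stoves of each model); the function name `z_two_means` is an assumption.

```python
from math import sqrt

def z_two_means(x1, s1, n1, x2, s2, n2):
    """Return (standard error of the difference, Z) for two large samples."""
    se = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return se, (x1 - x2) / se

se, z = z_two_means(12.5, 2.6, 37, 10.1, 3.2, 37)
print(round(se, 3), round(z, 2))
```

The function takes standard deviations, so variances such as $S_1^2 = 20$ must be converted with a square root before being passed in.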
In a similar manner it may be required to test the difference between the proportions of a
given attribute found in two random samples.
The assumption is that the two samples are from the same population. Hence we use the
pooled sample proportion
\[ P = \frac{\hat{P}_1 n_1 + \hat{P}_2 n_2}{n_1 + n_2} \]
and the standard error is
\[ \hat{\sigma}_{(P_1 - P_2)} = \sqrt{\frac{pq}{n_1} + \frac{pq}{n_2}} \]
where p is the pooled proportion P and q = 1 − p, and
\[ Z = \frac{(\hat{P}_1 - \hat{P}_2) - (P_1 - P_2)}{\hat{\sigma}_{(P_1 - P_2)}} \qquad (1) \]
But under the null hypothesis $H_0: P_1 = P_2$, hence (1) is reduced to:
\[ Z = \frac{\hat{P}_1 - \hat{P}_2}{\hat{\sigma}_{(P_1 - P_2)}} \]
Example 3
The following results have been recorded from random samples of candidates taking two
Institute examinations.
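$n_1 = 50$                               $n_2 = 85$
$\hat{P}_1 = \frac{35}{50} = 0.7$        $\hat{P}_2 = \frac{42}{85} = 0.49$
$H_0: P_1 = P_2$
$H_a: P_1 \neq P_2$
The pooled sample proportion is
\[ P = \frac{0.7(50) + 0.49(85)}{50 + 85} = 0.568 \]
so that q = 1 − 0.568 = 0.432, and
\[ \hat{\sigma}_{(P_1 - P_2)} = \sqrt{\frac{(0.568)(0.432)}{50} + \frac{(0.568)(0.432)}{85}} = 0.0883 \]
\[ \therefore Z = \frac{0.7 - 0.49}{0.0883} = 2.38 \]
A significant value of Z at the 1% level (two-tailed) is 2.58. Therefore, our value of Z = 2.38
is not significant. We accept $H_0$. There is no significant difference between the
proportions of candidates passing the two examinations.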
Example 4
A college committee wishes to know if the proportion of students who received A grades
was decreasing as a result of the committee’s recent report to the Principal that showed that
grades had risen since 2002 and that Mampi College grades had risen faster than grades
in other colleges in the country. Samples of grades in 2002 and 2003, after the
committee’s report was given, were studied to see if the proportion of A grades had gone
down significantly. Use α = 0.05.
We have:
$H_0: P_2 \geq P_1$
$H_a: P_2 < P_1$
Now
\[ P = \frac{0.70(120) + 0.50(110)}{120 + 110} = 0.604 \]
\[ \hat{\sigma}_{(P_1 - P_2)} = \sqrt{(0.604)(0.396)\left(\frac{1}{120} + \frac{1}{110}\right)} = 0.065 \]
\[ Z = \frac{0.50 - 0.70}{0.065} = -3.08 \]
Since –3.08 is less than –1.65, we reject $H_0$. We therefore conclude that the proportion
of A grades has gone down since the grade committee’s report was issued.
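The pooled-proportion test can be sketched in Python. The figures are those used in the calculation for Example 4 (a proportion of 0.70 from 120 grades and 0.50 from 110 grades); the function name `z_two_proportions` is an assumption, and Z is formed as the first proportion minus the second.

```python
from math import sqrt

def z_two_proportions(p1, n1, p2, n2):
    """Return (pooled proportion, standard error, Z) with Z based on p1 - p2."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return pooled, se, (p1 - p2) / se

# Later proportion first, so that Z = (0.50 - 0.70) / SE as in the worked solution
pooled, se, z = z_two_proportions(0.50, 110, 0.70, 120)
print(round(pooled, 3), round(se, 3), round(z, 2))
```

Without intermediate rounding Z comes out at about −3.10 rather than −3.08, because the worked solution rounds the standard error to 0.065 before dividing. The conclusion is unchanged.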
Exercise 6
Use a 5% significance level to test the claim that line A is more reliable than line B.
3) The graduating classes of two prestigious business schools were surveyed about
their average starting salary with the following results.
5) Consider the following null and alternative hypotheses.
$H_0: \mu_1 - \mu_2 \leq 0$
$H_a: \mu_1 - \mu_2 > 0$
EXAMINATION QUESTION WITH ANSWERS
1.1 A simple random sample of size 25 is drawn from a finite population consisting of
145 units. If the population standard deviation is 10.5, what is the standard error
of the sample mean when the sample is drawn with replacement?
1.2 A Corkhill machine is set to fill a small bottle with 9.00 grams of medicine. It is
claimed that the mean weight is less than 9.0 grams. The hypothesis is to be
tested at the 0.01 level. A sample revealed these weights (in grams): 9.2, 8.7, 8.9,
8.6, 8.8, 8.5, 8.7 and 9.0. What are the null and alternative hypotheses?
1.3 A study of Excelsior Furniture Limited regarding the payment of invoices revealed
that, on average, an invoice was paid 20 days after it was received. The standard
deviation equaled 5 days. What percentage of the invoices were paid within 15
days of receipt?
1.4 A procedure based on sample evidence and probability theory used to determine
whether a statement about the value of a population parameter is reasonable and
should not be rejected, or unreasonable and should be rejected is called:
D. Hypothesis testing.
(Natech, 1.2 Mathematics and Statistics, December 2003)
1.5 If, in a sample of 150, 60 respondents say they prefer product P to product Q ,
then the standard error of the sample proportion is:
1.7 The finite correction factor is used in computing the standard error of the mean
when:
D. σ is finite.
1.8 In a “5 percent two-tail test” concerned with the value of the population mean,
the area in each tail (region of rejection) of the standard normal distribution
model is:
1.10 In the general procedure of hypothesis testing the “benefit of doubt” is given to the:
SECTION B
QUESTION ONE
a) State the null and alternative hypotheses given the following information.
During the last year, the average quarterly charge on a current account held at a
bank was K25,000. The bank wishes to investigate whether the amount paid in
charges has increased or not. So they sample 50 accounts and obtain a mean of
K25,500.
QUESTION TWO
Required:
ii) Calculate the number of students to be interviewed if the sample
proportion is to lie within 1% of the true population proportion at 95%
confidence level.
(Natech, 1.2 Mathematics and Statistics, December 2004)
QUESTION THREE
a) i) The mean and the standard deviation of the height of a random sample of
100 students are 168.75cm and 7.5cm respectively.
Required:
Calculate the 99% confidence interval for the mean height of all students
at the college.
Required:
d) A random sample of six (6) bank accounts showed balances equal to
QUESTION FOUR
a) The Manager of a coach company wishes to estimate the average daily number of
kilometers covered by his coaches. He requires a confidence level of 80%, and the
error must be within plus or minus 20 kilometers of the true mean.
Assume that previous investigations have indicated that a very good estimate
for the population standard deviation is 130 kilometers.
ii) Suppose the sample size is greater than his fleet of coaches, what steps
should be taken?
b) XYZ Ltd has developed a cleaning detergent for use by housewives in their
homes. A decision must now be made whether or not to market the detergent.
Marketing the product will be profitable if the mean number of units ordered per
household is greater than 3.0 and unprofitable if less than 3.0. The decision will
be based on the sales potential shown in the home demonstrations of the detergent
with a random sample of the housewives in the target market.
i) Specify the null and alternative hypotheses if the more costly error is to
market the detergent when it is not profitable.
ii) Specify the null and alternative hypotheses if the more costly error is to
fail to market the detergent when the mean number of units ordered per
household exceeds 3.0.
c) An instructor wishes to determine whether or not student performance has
changed over the duration of the course. Scores in two equivalent tests in
“mathematics proficiency”, one before and the other after the course, are
summarized in the table below.
Student 1 2 3 4 5 6 7 8 9 10 11
Before 29 61 73 51 49 71 33 48 44 75 55
After 55 67 73 35 48 93 59 47 42 60 46
Is there any statistical evidence that the course has produced some learning? Use
α = 0.05 .
QUESTION FIVE
a) Formulate the appropriate null and alternative hypotheses in each of the following
situations.
ii) Suppose you own a book shop that sells a variable number of books per
day, and that if the mean number of books sold is less than 20 per day, you
will eventually be bankrupt. If the mean number of books sold exceeds 20
per day, you are financially safe. You wish to determine whether your
sales (from the bookshop) are leading to a financial disaster.
b) A relief organization knows from the previous studies that the average distance
that each family has to walk to fetch water is 5.6km. A small capital investment
programme is initiated to sink boreholes to address this problem.
Two months after the completion of the programme, a sample of thirty families
revealed that the mean distance is now 5km with a standard deviation of 1.4km.
Is this a significant improvement? Conduct your test at 5% level of significance.
(Natech, 1.2 Mathematics and Statistics, December 1996)
QUESTION SIX
With this information and using the 0.01 level of significance, test the hypothesis
that the population mean is 200.
(Natech, 1.2 Mathematics and Statistics, December 1997)