Stat LL CHP 2
CHAPTER TWO
HYPOTHESIS TESTING
INTRODUCTION
In statistics, as in life, nothing is as certain as the presence of uncertainty. However, just
because we are not 100% sure of something, that is no reason why we cannot reach some
conclusions that are highly likely to be true. This chapter therefore examines the very
important process of reaching conclusions based on sample information.
2.1 The basic concepts of hypothesis testing
2.1.1 Null and alternative hypothesis
The first step in examining claims is to form a null hypothesis, expressed as H0 (“H sub
naught”). The null hypothesis is a statement about the value of a population parameter and is
put up for testing in the face of numerical evidence. The null hypothesis is either rejected or
fails to be rejected. In the philosophy of hypothesis testing, the null hypothesis is assumed to be
true unless we have statistically overwhelming evidence to the contrary.
The alternative hypothesis, H1 (“H sub one”), is an assertion that holds if the null hypothesis is
false. For a given test, the null and alternative hypotheses include all possible values of the
population parameter, so exactly one of them must be true. Described in terms of an
(unknown) population mean (μ), they might be listed as follows:
Null Hypothesis    Alternative Hypothesis
H0: μ = $10        H1: μ ≠ $10        (μ is $10, or it is not)
H0: μ ≥ $10        H1: μ < $10        (μ is at least $10, or it is less)
H0: μ ≤ $10        H1: μ > $10        (μ is no more than $10, or it is more)
Notice that each null hypothesis has an equality term in its statement (i.e., “=”, “≥”, or “≤”).
Thus, an actual population mean of $10 would cause all three of them to be true.
However, this does not make the three sets interchangeable. The null and alternative
hypotheses are the foundation for a hypothesis test, and the selection of one of these three
sets will depend on (1) the directionality or non-directionality of the original claim or
assertion that led to the test and (2) the purpose for which the test is being conducted.
A directional claim or assertion holds that a population parameter is greater than (>), at least
(≥), no more than (≤), or less than (<) some quantity.
A non-directional claim or assertion states that a parameter is equal to some quantity. For
example, Ato Kebede claims that 35% of his transit riders are senior citizens.
Directional assertions lead to what are called one-tail tests, where a null hypothesis can be
rejected by an extreme result in one direction only. A non-directional assertion involves a
two-tail test, in which a null hypothesis can be rejected by an extreme result occurring in either
direction.
2.1.3 Errors in Hypothesis Testing
Whenever we reject a null hypothesis, there is a chance that we have made a mistake, i.e., that
we have rejected a true statement. Rejecting a true null hypothesis is referred to as a Type I
error, and our probability of making such an error is represented by the Greek letter alpha
(α). This probability, which is referred to as the significance level of the test, is of primary
concern in hypothesis testing.
On the other hand, we can also make the mistake of failing to reject a false null hypothesis;
this is a Type II error. Our probability of making it is represented by the Greek letter beta (β).
Table 2.1 below gives the summary of the possibilities for mistakes and correct decisions in
hypothesis testing. The probability of incorrectly rejecting a true null hypothesis is α, the
significance level. The probability that the test will correctly reject a false null hypothesis is
(1 − β), the power of the test.
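Table 2.1
Hypothesis test says    H0 is actually true                   H0 is actually false
“Do not reject H0”      Correct decision.                     Incorrect decision (Type II error).
                        Probability of making this error is β.
“Reject H0”             Incorrect decision (Type I error).    Correct decision. Probability
                        Probability of making this error      (1 − β) is the power of the test.
                        is α, the significance level.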
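To make the relationship between α, β and power concrete, the following is a minimal sketch for a right-tailed Z test with a known population standard deviation. The function name `right_tail_power` and all numerical inputs are hypothetical, chosen only for illustration; they do not come from the text.

```python
from math import sqrt
from statistics import NormalDist


def right_tail_power(mu0, mu_true, sigma, n, alpha):
    """Power of a right-tailed Z test of H0: mu <= mu0 when the true mean is mu_true."""
    std_normal = NormalDist()                      # standard normal distribution
    z_alpha = std_normal.inv_cdf(1 - alpha)        # critical value for a Type I error rate alpha
    shift = (mu_true - mu0) / (sigma / sqrt(n))    # distance of the true mean in standard errors
    return 1 - std_normal.cdf(z_alpha - shift)     # P(reject H0 | true mean is mu_true)


# Hypothetical inputs (assumptions for illustration only)
power = right_tail_power(mu0=100, mu_true=105, sigma=15, n=36, alpha=0.05)
beta = 1 - power                                   # probability of a Type II error
print(f"power = {power:.3f}, beta = {beta:.3f}")
```

When the true mean equals the hypothesized mean, the same function returns α, which matches the definition of the significance level as the probability of rejecting a true null hypothesis.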
Hawassa University Department of Accounting and Finance
As mentioned before, to test the validity of a claim or assumption about a population
parameter, a sample is drawn from the population and analyzed. The results of the analysis
are used to decide whether the claim is true or not. The general procedure for any
hypothesis test is summarized below:
Step 1: State the null hypothesis (H0) and alternative hypothesis (H1).
The null hypothesis H0 refers to a hypothesized numerical value or range of values of the
population parameter. Theoretically, hypothesis testing requires that the null hypothesis be
considered true (status quo or no difference) until it is shown to be false on the basis of results
observed from the sample data. The null hypothesis is always expressed in the form of an
equation making a claim regarding the specific value of the population parameter. That is:
H0: μ = μ0, where μ is the population mean and μ0 represents the hypothesized parameter value.
An alternative hypothesis, H1, is the logical opposite of the null hypothesis; that is, the
alternative hypothesis must be true when the null hypothesis is found to be false. In other
words, the alternative hypothesis states that the population parameter value is not equal
to the value stated in the null hypothesis and is written as
H1: μ ≠ μ0
Consequently, H1: μ < μ0 or H1: μ > μ0
Step 2: State the level of significance, α (alpha), for the test.
The level of significance, usually denoted by α (alpha), is specified before the samples are
drawn, so that the results obtained do not influence the choice of the decision maker. It is
the probability of rejecting the null hypothesis H0 when it is in fact true.
Step 3: Establish the critical or rejection region.
The sample space of the experiment, which corresponds to the area under the sampling
distribution curve of the test statistic, is divided into two mutually exclusive regions, which are
called the acceptance region and the rejection or critical region.
Figure 2.1 Areas of Acceptance and Rejection of H0 (Two-Tailed Test). The acceptance region, of area 1 − α, is centred on μ = μ0 (H0 is accepted); a rejection region of area α/2 lies in each tail, beyond the critical values −Zα/2 and +Zα/2 (H0 is rejected).
If the value of the test statistic falls into the acceptance region, the null hypothesis is
accepted; otherwise it is rejected. The rejection region consists of all values of the test
statistic that are unlikely to occur if the null hypothesis is true.
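Step 4: Calculate the suitable test statistic.
The value of the test statistic is calculated from the distribution of the sample statistic by using the
following formula:
\[ \text{Test statistic} = \frac{\text{Value of sample statistic} - \text{Hypothesized value of population parameter}}{\text{Standard error of the sample statistic}} \]
For example,
\[ Z = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \]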
The choice of a probability distribution of a sample statistic is guided by the sample size n
and the value of the population standard deviation σ.
Table 2.2: Choice of Probability distribution
Sample size n                              Population standard deviation σ
                                           Known                  Unknown
n > 30                                     Normal distribution    Normal distribution
n < 30, population being assumed normal    Normal distribution    t-distribution
A null hypothesis H0: μ = μ0 tested against the alternative H1: μ ≠ μ0 implies that any
deviation (either on the lower side or the higher side) of the calculated value of the
test statistic from the hypothesized value μ0 leads to rejection of the null hypothesis. Hence, it is
necessary to keep the rejection region on ‘both tails’ of the sampling distribution of the test
statistic. This type of test is called a two-tailed test.
A summary of certain critical values at various significance levels for the test statistic Z is given
in Table 2.3.
Table 2.3: Summary of certain critical values for sample statistic Z
Rejection region      Level of significance, per cent
                      10%      5%       1%      0.5%    0.2%
One-tailed region     1.28     1.645    2.33    2.58    2.88
Two-tailed region     1.645    1.96     2.58    2.81    3.09
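Critical values of Z such as those tabulated above can also be computed from the inverse of the standard normal distribution function. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `z_critical` is an illustrative assumption.

```python
from statistics import NormalDist


def z_critical(alpha, tails):
    """Critical value of Z for a one-tailed (tails=1) or two-tailed (tails=2) test."""
    # A two-tailed test places alpha/2 in each tail; a one-tailed test places alpha in one tail.
    return NormalDist().inv_cdf(1 - alpha / tails)


for alpha in (0.10, 0.05, 0.01, 0.005):
    print(f"{alpha:>6.1%}  one-tailed {z_critical(alpha, 1):.3f}  "
          f"two-tailed {z_critical(alpha, 2):.3f}")
```

The printed values can be compared against the rounded table entries for the 10%, 5%, 1% and 0.5% levels.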
distribution. Consequently, the sampling distribution of the mean is also normal. Even if the
population does not have a normal distribution, the sampling distribution of the mean can be
taken as approximately normal when the sample size is large, by the central limit theorem.
2.2.1 Hypothesis Testing for Single Population Mean
a) Let μ0 be the hypothesized value of the population mean to be tested. For this, the null
and alternative hypotheses for a two-tailed test are defined as:
H0: μ = μ0 or μ − μ0 = 0 (two-tailed test)
H1: μ ≠ μ0
If the standard deviation σ of the population is known, then based on the central limit theorem,
the sampling distribution of the mean is approximately normal for a large sample size, so the
standardized sample mean follows the standard normal distribution. The Z test statistic is given by:
\[ Z = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \]
In this formula, the numerator x̄ − μ0 measures how far (in an absolute sense) the observed
sample mean x̄ is from the hypothesized mean (μ0). The denominator σ/√n is the standard
error of the mean, so the Z-test statistic represents how many standard errors x̄ is from μ0.
If the population standard deviation is not known, then the sample standard deviation (s) is
used to estimate σ. The value of the test statistic is
\[ Z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \]
Since the rejection region is divided into two equal parts of α/2 each at the two tails of the
normal curve, the decision rule for the two-tailed test would be:
Reject H0 if Zcal < −Zα/2 or Zcal > Zα/2
Otherwise accept H0
where Zα/2 is the table value (also called critical value) of Z at the chosen level of
significance α.
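b) Large-sample (n > 30) hypothesis testing about a population mean for a left-tailed test is of
the form
H0: μ ≥ μ0 and H1: μ < μ0 (left-tailed test)
Test statistic:
\[ Z = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \]
Decision rule: Reject H0 if Zcal < −Zα (table value of Z at α)
Otherwise accept H0
c) Large-sample (n > 30) hypothesis testing about a population mean for
a right-tailed test is of the form:
H0: μ ≤ μ0 and H1: μ > μ0 (right-tailed test)
Test statistic:
\[ Z = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \]
Decision rule: Reject H0 if Zcal > Zα (table value of Z at α)
Otherwise accept H0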
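The two-tailed, left-tailed and right-tailed decision rules for a large-sample test of a mean can be collected into one small helper. This is a minimal sketch under stated assumptions: the function name `z_test_mean`, its arguments and the numbers in the usage line are hypothetical and are not taken from the text.

```python
from math import sqrt
from statistics import NormalDist


def z_test_mean(x_bar, mu0, sd, n, alpha, tail):
    """Large-sample Z test for a mean; returns (z_cal, z_crit, reject).

    sd is the population sigma if known, otherwise the sample s.
    tail is "two", "left" or "right".
    """
    z_cal = (x_bar - mu0) / (sd / sqrt(n))           # distance from mu0 in standard errors
    if tail == "two":
        z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # alpha/2 in each tail
        reject = abs(z_cal) > z_crit
    elif tail == "left":
        z_crit = NormalDist().inv_cdf(1 - alpha)
        reject = z_cal < -z_crit
    elif tail == "right":
        z_crit = NormalDist().inv_cdf(1 - alpha)
        reject = z_cal > z_crit
    else:
        raise ValueError("tail must be 'two', 'left' or 'right'")
    return z_cal, z_crit, reject


# Hypothetical data: sample mean 52 from n = 64, hypothesized mean 50, sd 8
z_cal, z_crit, reject = z_test_mean(52.0, 50.0, 8.0, 64, alpha=0.05, tail="two")
print(f"Zcal = {z_cal:.3f}, critical = {z_crit:.3f}, reject H0: {reject}")
```

Keeping the tail as an explicit argument makes it harder to apply a two-tailed critical value to a directional claim by accident.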
Example 1: Individuals filing income tax returns prior to 30 June had an average refund of
$1200. Consider the population of “last-minute” filers who file their returns during the last
week of June. For a sample of 400 individuals who filed a return between 25 and 30 June, the
sample mean refund was $1054 and the sample standard deviation was $1600. Using 5 per
cent level of significance, test the belief that the individuals wait until the last week of June to
file their returns to get a higher refund than early filers.
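One possible way to work Example 1 is sketched below. It assumes the hypotheses H0: μ ≤ 1200 and H1: μ > 1200 (a right-tailed test, following the stated belief that last-minute filers get a higher refund) and uses the sample standard deviation in place of σ because n is large. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Data from Example 1
mu0 = 1200.0      # average refund of early filers (hypothesized mean)
x_bar = 1054.0    # sample mean refund of last-minute filers
s = 1600.0        # sample standard deviation
n = 400           # sample size
alpha = 0.05      # level of significance

standard_error = s / sqrt(n)               # 1600 / 20 = 80
z_cal = (x_bar - mu0) / standard_error     # (1054 - 1200) / 80 = -1.825
z_crit = NormalDist().inv_cdf(1 - alpha)   # right-tailed critical value, about 1.645

reject_h0 = z_cal > z_crit                 # assumed right-tailed decision rule
print(f"Zcal = {z_cal:.3f}, Z critical = {z_crit:.3f}, reject H0: {reject_h0}")
```

Under these assumptions Zcal is negative, so it cannot exceed the positive critical value and H0 is not rejected: the sample gives no support to the belief that last-minute filers receive a higher refund.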
Example 2: A packaging device is set to fill detergent powder packets with a mean weight of
5 kg. The standard deviation is known to be 0.01 kg. The weights are known to drift upwards over a
period of time due to machine fault, which is not tolerable. A random sample of 100 packets
is taken and weighed. This sample has a mean weight of 5.03 kg and a standard deviation of
0.21 kg. Can we conclude that the mean weight produced by the machine has increased? Use
a 5 per cent level of significance.
2.2.1.1 One-Tail Testing of a Mean, σ Known
Example: The light bulbs in an industrial warehouse have been found to have a mean lifetime
of 1030 hours, with a standard deviation of 90 hours. The warehouse manager has been
approached by a representative of Extendabulb, a company that makes a device intended to
increase bulb life. The manager is concerned that the average lifetime of Extendabulb-equipped
bulbs might not be any greater than the 1030 hours historically experienced. In a
subsequent test, the manager tests 40 bulbs equipped with the device and finds their mean
life to be 1061.6 hours. Does Extendabulb really work?
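A sketch of how the Extendabulb example might be worked follows. The example does not state a level of significance, so α = 0.05 is an assumption made here; the hypotheses are taken as H0: μ ≤ 1030 and H1: μ > 1030, with σ = 90 hours treated as known.

```python
from math import sqrt
from statistics import NormalDist

# Data from the example
mu0 = 1030.0     # historical mean lifetime (hours)
sigma = 90.0     # known population standard deviation (hours)
n = 40           # bulbs tested with the device
x_bar = 1061.6   # sample mean lifetime (hours)
alpha = 0.05     # ASSUMPTION: the example does not give a significance level

standard_error = sigma / sqrt(n)                 # standard error of the mean
z_cal = (x_bar - mu0) / standard_error           # calculated test statistic
z_crit = NormalDist().inv_cdf(1 - alpha)         # right-tailed critical value
p_value = 1 - NormalDist().cdf(z_cal)            # area to the right of Zcal

reject_h0 = z_cal > z_crit
print(f"Zcal = {z_cal:.2f}, Z critical = {z_crit:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

With the assumed 5% level, Zcal is about 2.22 and exceeds the critical value, so the data would support the claim that the device increases mean bulb life. A stricter level such as 1% would lead to the opposite decision, which is why α should be fixed before the data are examined.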
2.2.3 Testing A Mean, Population Standard Deviation Unknown
The true standard deviation of a population will usually be unknown. The t-test is appropriate
for hypothesis tests in which the sample standard deviation (s) is used in estimating the value
of the population standard deviation, σ. The t-test is based on the t distribution (with number
of degrees of freedom, df = n − 1) and the assumption that the population is approximately
normally distributed. As the sample size becomes larger, the assumption of population
normality becomes less important.
When the number of degrees of freedom is small, the t distribution is flatter and more spread out than the
normal distribution, but for larger degrees of freedom, successive members of the family
more closely approach the normal distribution. As the number of degrees of freedom
approaches infinity, the two distributions become identical.
Like the Z-test, the t-test depends on the sampling distribution for the sample mean. The
appropriate test statistic is similar in appearance, but includes s instead of σ, because s is
being used to estimate the (unknown) value of σ. The test statistic can be calculated as
follows:
Test statistic, t-test for a sample mean:
\[ t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}, \qquad df = n - 1 \]
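The t statistic has the same form as the Z statistic with s in place of σ, that is, t = (x̄ − μ0)/(s/√n) with n − 1 degrees of freedom. The sketch below computes it from raw data; the function name `t_statistic` and the sample values are hypothetical. The Python standard library has no t distribution, so the critical value would still be read from a t table at the returned degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(sample, mu0):
    """Return (t, df) for a one-sample t-test of H0: mu = mu0."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)                      # sample standard deviation (divisor n - 1)
    t = (x_bar - mu0) / (s / sqrt(n))      # same form as Z, with s estimating sigma
    return t, n - 1


# Hypothetical small sample (assumption for illustration only)
t, df = t_statistic([1, 2, 3, 4, 5], mu0=2)
print(f"t = {t:.4f} with df = {df}")
```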
Then we can extend the hypothesis testing concepts developed in the previous section to test
whether there is any significant difference between the means of these populations.
Let two independent random samples of large size n1 and n2 be drawn from the first and
second population, respectively. Let the sample means so calculated be x̄1 and x̄2. The Z-test
statistic used to determine the difference between the population means (μ1 − μ2) is based on
the difference between the sample means (x̄1 − x̄2). This test statistic will follow the standard
normal distribution for a large sample due to the central limit theorem. The Z-test statistic is
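\[ Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sigma_{\bar{x}_1 - \bar{x}_2}} = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}} \]
The standard error of the difference between the sample means, that is, the standard deviation
of the sampling distribution of x̄1 − x̄2, is given by
\[ \sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \]
If the standard deviations σ1 and σ2 of the populations are not known, then we may use the
sample standard deviations s1 and s2 in their place:
\[ s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \]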
The null hypothesis that there is no difference between two population means is stated as:
H0: μ1 = μ2 or μ1 − μ2 = 0
If the null hypothesis is true, then μ1 − μ2 = 0. The critical (or table) value of the Z-test statistic at
a particular level of significance α for a two-tailed test is given by Zα/2. For a two-tailed
test, the rejection region of the null hypothesis H0 is divided into two equal parts of α/2
each at the two tails of the normal curve; therefore the decision rule would be:
Reject H0 if Zcal < −Zα/2 or Zcal > Zα/2
Otherwise accept H0
Example: A firm believes that the tires produced by process A on average last longer than
tires produced by process B. To test this belief, random samples of tires produced by the two
processes were tested and the results are:
________________________________________________________
Process    Sample size    Average lifetime (in km)    Standard deviation (in km)
A          50             22,400                      1000
B          50             21,800                      1000
Is there evidence at a 5 per cent level of significance that the firm is correct in its belief?
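The tire example can be worked along the following lines. The sketch assumes a right-tailed test, H0: μA ≤ μB against H1: μA > μB, since the firm's belief is directional, and uses the sample standard deviations in the standard error because both samples are large. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Data from the example
n_a, mean_a, s_a = 50, 22_400.0, 1000.0   # process A
n_b, mean_b, s_b = 50, 21_800.0, 1000.0   # process B
alpha = 0.05

# Standard error of the difference between the two sample means
standard_error = sqrt(s_a**2 / n_a + s_b**2 / n_b)   # sqrt(20000 + 20000) = 200
z_cal = (mean_a - mean_b) / standard_error           # 600 / 200 = 3.0
z_crit = NormalDist().inv_cdf(1 - alpha)             # right-tailed critical value

reject_h0 = z_cal > z_crit
print(f"Zcal = {z_cal:.2f}, Z critical = {z_crit:.3f}, reject H0: {reject_h0}")
```

Under these assumptions Zcal = 3.0 exceeds the critical value of about 1.645, so H0 is rejected and the sample supports the firm's belief that process A tires last longer.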
2.3 HYPOTHESIS TESTING FOR POPULATION PROPORTION
Sometimes, instead of testing a hypothesis pertaining to a population mean, the population
proportion (P) of values in a particular category is considered. For this, a random sample of size
n is selected to compute the proportion of successes in the sample as follows:
\[ p = \frac{\text{Number of successes in the sample}}{\text{Sample size}} = \frac{x}{n} \]
Occasions may arise when we wish to compare a sample proportion, p, with a value that has
been hypothesized for the population proportion, P0.
The theoretically correct distribution for dealing with proportions is the binomial distribution.
However, the normal distribution is a good approximation when nP0 ≥ 5 and n(1 − P0) ≥ 5. The
larger the sample size, the better this approximation becomes, and for most practical settings,
this condition is satisfied. When using the normal distribution for hypothesis tests of a sample
proportion, the test statistic is as follows:
Test statistic, Z-test for a sample proportion:
\[ Z = \frac{p - P_0}{\sigma_p}, \qquad \sigma_p = \sqrt{\frac{P_0 (1 - P_0)}{n}} \]
where p is the sample proportion, P0 is the hypothesized population proportion and n is the sample size.
Directional tests for a proportion are similar to the preceding example, but have only one-tail
area in which the null hypothesis can be rejected.
Example: Let us assume that in an administrative decision, the ministry of health closed the
cardiac surgery units of several hospitals that either performed fewer than 150
operations per year or had mortality rates higher than 5%. In one of the closed
surgery units, 100 operations had been performed during the preceding year, with
a mortality rate of 7%. At the 0.01 level of significance, was the mortality rate of
this hospital significantly greater than the 5% cut off point? Consider the
hospital’s performance as representing a sample from the population of possible
operations it might have performed if the patients had been available.
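The cardiac surgery example might be worked as in the sketch below. It assumes the hypotheses H0: P ≤ 0.05 and H1: P > 0.05 (right-tailed) and computes the standard error from the hypothesized proportion, σp = √(P0(1 − P0)/n). The helper name `z_for_proportion` is an illustrative assumption.

```python
from math import sqrt
from statistics import NormalDist


def z_for_proportion(p_sample, p0, n):
    """Z statistic for a sample proportion against a hypothesized proportion p0."""
    standard_error = sqrt(p0 * (1 - p0) / n)   # standard error under H0
    return (p_sample - p0) / standard_error


# Data from the example
p0 = 0.05        # cutoff mortality rate (hypothesized proportion)
p_sample = 0.07  # observed mortality rate
n = 100          # operations performed
alpha = 0.01

z_cal = z_for_proportion(p_sample, p0, n)
z_crit = NormalDist().inv_cdf(1 - alpha)       # right-tailed critical value, about 2.33
reject_h0 = z_cal > z_crit
print(f"Zcal = {z_cal:.3f}, Z critical = {z_crit:.3f}, reject H0: {reject_h0}")
```

Under these assumptions Zcal is about 0.92, well below the critical value, so the hospital's mortality rate is not significantly greater than the 5% cutoff at the 0.01 level.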
2.3.2 Two – Tail Testing Of a Proportion
Whenever the null hypothesis involves a proportion and is non-directional, this technique is
appropriate. To demonstrate how it works, consider the following situation.
Example: The career services director of Hobart University has said that 70% of the school’s
seniors enter the job market in a position directly related to their undergraduate field of
study. In a sample consisting of 200 of the graduates from last year’s class, 66% have entered
jobs related to their field of study.
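A sketch of a two-tail test for the Hobart University example follows. The example states neither a question nor a level of significance, so testing H0: P = 0.70 against H1: P ≠ 0.70 at α = 0.05 is an assumption made here for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Data from the example
p0 = 0.70        # proportion claimed by the director
p_sample = 0.66  # proportion observed in the sample
n = 200          # sample size
alpha = 0.05     # ASSUMPTION: the example does not give a significance level

standard_error = sqrt(p0 * (1 - p0) / n)        # standard error under H0
z_cal = (p_sample - p0) / standard_error        # calculated test statistic
z_crit = NormalDist().inv_cdf(1 - alpha / 2)    # two-tailed critical value, about 1.96

reject_h0 = abs(z_cal) > z_crit                 # reject for extreme values in either tail
print(f"Zcal = {z_cal:.3f}, critical values = ±{z_crit:.3f}, reject H0: {reject_h0}")
```

With the assumed 5% level, Zcal is about −1.23 and lies inside the acceptance region, so the sample result would be consistent with the director's 70% figure.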
2.3.3 HYPOTHESIS TESTING FOR DIFFERENCE BETWEEN PROPORTIONS OF
TWO INDEPENDENT POPULATIONS
Let two independent populations each having a proportion and standard deviation of an
attribute be as follows:
Population    Proportion    Standard deviation
1             P1            σp1
2             P2            σp2
The hypothesis testing concepts developed in the previous section can be extended to test
whether there is any difference between the proportions of these populations. The null
hypothesis that there is no difference between two population proportions is stated as:
H0: P1 = P2 or P1 − P2 = 0 and H1: P1 ≠ P2
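The standard error of the difference between the two sample proportions p1 and p2 is
\[ \sigma_{p_1 - p_2} = \sqrt{\frac{P_1 (1 - P_1)}{n_1} + \frac{P_2 (1 - P_2)}{n_2}} \]
Thus the Z test statistic for the difference between two population proportions is stated as:
\[ Z = \frac{(p_1 - p_2) - (P_1 - P_2)}{\sigma_{p_1 - p_2}} \]
Invariably, the standard error of the difference between sample proportions is not known.
Thus when a null hypothesis states that there is no difference between the population
proportions, we combine the two sample proportions, p1 and p2, to get one unbiased estimate of
the population proportion as follows:
\[ \text{Pooled estimate: } \bar{p} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} \]
The Z-test statistic is then restated as:
\[ Z = \frac{p_1 - p_2}{s_{p_1 - p_2}}, \qquad s_{p_1 - p_2} = \sqrt{\bar{p}\,(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} \]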
Example 1: An auditor claims that 10 per cent of customers’ ledger accounts are carrying
mistakes of posting and balancing. A random sample of 600 accounts was taken to test the accuracy of
posting and balancing, and 45 mistakes were found. Are these sample results consistent with
the claim of the auditor? Use a 5 per cent level of significance.
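The auditor example involves a single proportion and can be sketched as below. It assumes a two-tailed test, H0: P = 0.10 against H1: P ≠ 0.10, because the question asks only whether the sample is consistent with the claim. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Data from Example 1
p0 = 0.10        # proportion of accounts with mistakes claimed by the auditor
n = 600          # accounts sampled
mistakes = 45    # accounts found with mistakes
alpha = 0.05

p_sample = mistakes / n                          # 45 / 600 = 0.075
standard_error = sqrt(p0 * (1 - p0) / n)         # standard error under H0
z_cal = (p_sample - p0) / standard_error         # calculated test statistic
z_crit = NormalDist().inv_cdf(1 - alpha / 2)     # two-tailed critical value, about 1.96

reject_h0 = abs(z_cal) > z_crit
print(f"p = {p_sample:.3f}, Zcal = {z_cal:.3f}, reject H0: {reject_h0}")
```

Under these assumptions Zcal is about −2.04, which falls just inside the lower rejection region, so the sample results would not be consistent with the auditor's 10 per cent claim at the 5 per cent level.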
Example 2: Random samples of 1600 workers in region 1 and 1400 workers in region 2 have
been obtained to determine whether the population proportions unemployed in the two
regions are different. Perform a hypothesis test at the 5 percent level if the numbers
unemployed in the samples were 120 in region 1 and 84 in region 2.
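Example 2 can be worked with the pooled-proportion form of the Z statistic. The sketch assumes a two-tailed test, H0: P1 = P2 against H1: P1 ≠ P2, since the question asks whether the proportions are different. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Data from Example 2
n1, x1 = 1600, 120   # region 1: sample size, number unemployed
n2, x2 = 1400, 84    # region 2: sample size, number unemployed
alpha = 0.05

p1 = x1 / n1                                   # 0.075
p2 = x2 / n2                                   # 0.060
p_pooled = (x1 + x2) / (n1 + n2)               # same as (n1*p1 + n2*p2) / (n1 + n2)

# Standard error of p1 - p2 under H0, using the pooled estimate
standard_error = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z_cal = (p1 - p2) / standard_error
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value, about 1.96

reject_h0 = abs(z_cal) > z_crit
print(f"pooled p = {p_pooled:.3f}, Zcal = {z_cal:.3f}, reject H0: {reject_h0}")
```

Under these assumptions Zcal is about 1.63, which is inside the acceptance region, so the difference between the two sample proportions is not significant at the 5 per cent level.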