Statistical Hypothesis
HYPOTHESIS A statement about a population developed for the purpose of testing. In most cases the
population is so large that it is not feasible to study all the items, objects, or persons in the population. For
example, it would not be possible to contact every systems analyst in the United States to find his or her
monthly income. Likewise, the quality assurance department at Cooper Tire cannot check each tire
produced to determine whether it will last more than 60,000 miles.
The terms hypothesis testing and testing a hypothesis are used interchangeably. Hypothesis testing starts
with a statement, or assumption, about a population parameter, such as the population mean. As noted,
this statement is referred to as a hypothesis. A hypothesis might be that the mean monthly commission of
sales associates in retail electronics stores, such as Circuit City, is $2,000.
There is a five-step procedure that systematizes hypothesis testing; when we get to step 5, we are ready to
reject or not reject the hypothesis. However, hypothesis testing as used by statisticians does not provide
proof that something is true, in the manner in which a mathematician "proves" a statement. It does provide
a kind of "proof beyond a reasonable doubt," in the manner of the court system. Hence, there are specific
rules of evidence, or procedures, that are followed. The steps are shown in the diagram at the bottom of
this page. We will discuss in detail each of the steps.
Step 1: State the Null Hypothesis (𝐻0) and the Alternate Hypothesis (𝐻1)
The first step is to state the hypothesis being tested. It is called the null hypothesis, designated 𝐻0, and read
"H sub zero." The capital letter H stands for hypothesis, and the subscript zero implies "no difference."
There is usually a "not" or a "no" term in the null hypothesis, meaning that there is "no change." For
example, the null hypothesis is that the mean number of miles driven on the steel-belted tire is not
different from 60,000. The null hypothesis would be written 𝐻0 : 𝜇 = 60,000. Generally speaking, the null
hypothesis is developed for the purpose of testing. We either reject or fail to reject the null hypothesis. The
null hypothesis is a statement that is not rejected unless our sample data provide convincing evidence that
it is false.
The alternate hypothesis describes what you will conclude if you reject the null hypothesis. It is written 𝐻1
and is read "H sub one." It is often called the research hypothesis. The alternate hypothesis is accepted if
the sample data provide us with enough statistical evidence that the null hypothesis is false. The following
example will help clarify what is meant by the null hypothesis and the alternate hypothesis. A recent article
indicated the mean age of U.S. commercial aircraft is 15 years. To conduct a statistical test regarding this
statement, the first step is to determine the null and the alternate hypotheses. The null hypothesis
represents the current or reported condition. It is written 𝐻0 : 𝜇 = 15. The alternate hypothesis is that the
statement is not true, that is, 𝐻1 : 𝜇 ≠ 15. It is important to remember that no matter how the problem is
stated, the null hypothesis will always contain the equal sign. The equal sign (=) will never appear in the
alternate hypothesis. Why? Because the null hypothesis is the statement being tested, and we need a
specific value to include in our calculations. We turn to the alternate hypothesis only if the data suggests
the null hypothesis is untrue.
Step 2: Select a Level of Significance
After establishing the null hypothesis and alternate hypothesis, the next step is to select the level of
significance. The level of significance is designated 𝛼, the Greek letter alpha. It is also sometimes called the
level of risk. This may be a more appropriate term because it is the risk you take of rejecting the null
hypothesis when it is really true. There is no one level of significance that is applied to all tests. A decision is
made to use the .05 level (often stated as the 5 percent level), the .01 level, the .10 level, or any other level
between 0 and 1. Traditionally, the .05 level is selected for consumer research projects, .01 for quality
assurance, and .10 for political polling. You, the researcher, must decide on the level of significance before
formulating a decision rule and collecting sample data.
To illustrate how it is possible to reject a true hypothesis, suppose a firm manufacturing personal
computers uses a large number of printed circuit boards. Suppliers bid on the boards, and the one with the
lowest bid is awarded a sizable contract. Suppose the contract specifies that the computer manufacturer's
quality-assurance department will sample all incoming shipments of circuit boards. If more than 5 percent
of the boards sampled are substandard, the shipment will be rejected. The null hypothesis is that the
incoming shipment of boards contains 5 percent or less substandard boards. The alternate hypothesis is
that more than 5 percent of the boards are defective.
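The risk of rejecting a true null hypothesis in the circuit-board example can be computed directly. The sketch below assumes a sample of 50 boards and a true defect rate of 4 percent; both figures are illustrative assumptions, not terms of the contract described above. It uses the binomial distribution to find the chance that more than 5 percent of the sampled boards are substandard even though the shipment meets the standard.

```python
from math import comb, floor

def prob_reject(n, p_true, limit=0.05):
    """P(sample proportion of defectives > limit) when defects ~ Binomial(n, p_true)."""
    max_accept = floor(limit * n)  # largest defect count that is still accepted
    p_accept = sum(
        comb(n, k) * p_true**k * (1 - p_true) ** (n - k)
        for k in range(max_accept + 1)
    )
    return 1 - p_accept

# Assumed figures: 50 boards sampled, true defect rate 4% (so the null hypothesis is true).
type_i_risk = prob_reject(n=50, p_true=0.04)
print(f"Chance of rejecting an acceptable shipment: {type_i_risk:.3f}")
```

Under these assumptions roughly one acceptable shipment in three would be rejected, which is why the level of significance has to be chosen deliberately rather than left to the sampling plan.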
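Step 3: Select the Test Statistic
There are many test statistics. In this chapter we use both z and t as the test statistic. Other tests use
statistics such as F and 𝜒², called chi-square.
TEST STATISTIC A value, determined from sample information, used to determine whether to reject the null
hypothesis.
z DISTRIBUTION AS A TEST STATISTIC
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
where 𝑥̅ is the sample mean, 𝜇 is the hypothesized population mean, 𝜎 is the population standard deviation,
and n is the sample size.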
Step 4: Formulate the Decision Rule
A decision rule is a statement of the specific conditions under which the null hypothesis is rejected and the
conditions under which it is not rejected. The region or area of rejection defines the location of all those
values that are so large or so small that the probability of their occurrence under a true null hypothesis is
rather remote.
CRITICAL VALUE The dividing point between the region where the null hypothesis is
rejected and the region where it is not rejected.
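For a z test, the critical value is the point of the standard normal distribution that cuts off an area of 𝛼 (one-tailed) or 𝛼/2 (two-tailed) in the tail. A minimal sketch using Python's standard library follows; the helper name critical_value is illustrative.

```python
from statistics import NormalDist

def critical_value(alpha, two_tailed=False):
    """Upper critical value of the standard normal distribution."""
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

print(round(critical_value(0.10), 2))                   # one-tailed, alpha = 0.10 -> 1.28
print(round(critical_value(0.05, two_tailed=True), 2))  # two-tailed, alpha = 0.05 -> 1.96
```

For a left-tailed test the critical value is the negative of the one-tailed value.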
Step 5: Make a Decision: The fifth and final step in hypothesis testing is computing the test statistic,
comparing it to the critical value, and making a decision to reject or not to reject the null hypothesis. As
noted, only one of two decisions is possible in hypothesis testing: either accept or reject the null hypothesis.
Instead of "accepting" the null hypothesis, 𝐻0, some researchers prefer to phrase the decision as: "Do not
reject 𝐻0," "We fail to reject 𝐻0," or "The sample results do not allow us to reject 𝐻0."
1. Establish the null hypothesis (𝐻0 ) and the alternate hypothesis (𝐻1 ).
4. Calculations
6. Make a decision regarding the null hypothesis based on the sample information. Interpret the results of
the test.
It depicts a one-tailed test. The region of rejection is only in the right (upper) tail of the curve. To illustrate,
suppose that the packaging department at General Foods Corporation is concerned that some boxes of
Grape Nuts are significantly overweight. The cereal is packaged in 453-gram boxes, so the null hypothesis is
𝐻0 : 𝜇 ≤ 453. This is read, "the population mean (𝜇) is equal to or less than 453." The alternate hypothesis
is, therefore, 𝐻1 : 𝜇 > 453. This is read, "𝜇 is greater than 453." Note that the inequality sign in the
alternate hypothesis (>) points to the region of rejection in the upper tail. Also note that the null
hypothesis includes the equal sign. That is, 𝐻0 : 𝜇 ≤ 453. The equality condition always appears in 𝐻0,
never in 𝐻1.
One way to determine the location of the rejection region is to look at the direction in which the inequality
sign in the alternate hypothesis is pointing (either < or >). In this problem it is pointing to the right, and the
rejection region is therefore in the right tail.
In summary, a test is one-tailed when the alternate hypothesis, 𝐻1 , states a direction, such as:
𝐻0 : The mean income of women financial planners is less than or equal to $65,000 per year.
𝐻1 : The mean income of women financial planners is greater than $65,000 per year.
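If no direction is specified in the alternate hypothesis, we use a two-tailed test. Changing the previous
problem to illustrate, we can say:
𝐻0 : The mean income of women financial planners is $65,000 per year.
𝐻1 : The mean income of women financial planners is not equal to $65,000 per year.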
Example 1: Test the hypothesis that the mean of a normal population with known variance 70 is 31, if a
sample of size 13 gave the mean 𝑥̅ = 34. Let the alternative hypothesis be 𝐻1 : 𝜇 > 31 and let 𝛼 = 0.10.
Solution:
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
which, under 𝐻0, has a standard normal distribution.
4 The critical region for 𝛼 = 0.10 is Z > 1.28.
$$Z = \frac{34 - 31}{\sqrt{70} / \sqrt{13}} = 1.29$$
6 Conclusion
The calculated value (Z = 1.29) falls in the critical region, so we reject our null hypothesis and accept the
alternative hypothesis. We conclude that the population mean is greater than 31.
Example 2: A random sample of 25 values gives 𝑥̅ = 83. Can this sample be regarded as drawn from a
population with mean 𝜇 = 80 and 𝜎 = 7?
Solution:
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
which, under 𝐻0, has a standard normal distribution.
4 The critical region for 𝛼 = 0.05 is |Z| ≥ 1.96.
$$z = \frac{83 - 80}{7 / \sqrt{25}} = 2.14$$
6 Conclusion
The calculated value (Z=2.14) falls in the critical region, so we reject our null hypothesis and accept the
alternative hypothesis. We conclude that population mean is not equal to 80.
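The two worked examples follow the same recipe: compute z, find the critical value, and compare. The sketch below wraps that recipe in one hypothetical helper, z_test, and reruns both examples. It takes Example 2 to be a two-tailed test at the 0.05 level, which is the level that corresponds to the critical value 1.96.

```python
import math
from statistics import NormalDist

def z_test(x_bar, mu0, sigma, n, alpha, tail):
    """Return (z, critical value, reject?) for a one-sample z test.

    tail is "right", "left", or "two".
    """
    z = (x_bar - mu0) / (sigma / math.sqrt(n))
    if tail == "two":
        crit = NormalDist().inv_cdf(1 - alpha / 2)
        reject = abs(z) >= crit
    elif tail == "right":
        crit = NormalDist().inv_cdf(1 - alpha)
        reject = z > crit
    else:  # "left"
        crit = NormalDist().inv_cdf(alpha)
        reject = z < crit
    return z, crit, reject

# Example 1: the variance is 70, so sigma = sqrt(70); right-tailed at alpha = 0.10.
z1, crit1, reject1 = z_test(34, 31, math.sqrt(70), 13, 0.10, "right")
# Example 2: sigma = 7; two-tailed at alpha = 0.05.
z2, crit2, reject2 = z_test(83, 80, 7, 25, 0.05, "two")

print(round(z1, 2), round(crit1, 2), reject1)
print(round(z2, 2), round(crit2, 2), reject2)
```

Note that Example 1 is a close call: the statistic exceeds the critical value only in the second decimal place, so a slightly smaller 𝛼 would reverse the decision.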
Questions: A sample of 900 members has a mean of 2.4 inches. Could it reasonably be regarded as a
simple random sample from a large population whose mean is 2.9 inches and standard deviation 3.2 inches?
Questions: A sample of size 400 has a mean of 6 inches. Can it be regarded as a simple random sample from a
large population with mean 6.2 inches and standard deviation 2.25 inches?
Questions: Ten individuals are chosen at random from a normal population and their heights are found to be,
in inches, 67, 68, 69, 70, 70, 71, 71, 72, 72 and 73. In the light of these data, discuss the suggestion that the
mean height in the population is 66 inches.
Solution:
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
which, under 𝐻0, has the Student t distribution with n − 1 degrees of freedom.
4 The critical region for 𝛼 = 0.05 is |t| ≥ t_{0.025}(9) = 2.262.
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = 3.567$$
$$s = 1.889 \text{ inches}$$
$$t = \frac{70.3 - 66}{1.889 / \sqrt{10}} = 7.20$$
6 Conclusion
The calculated value t = 7.20 falls in the critical region. We therefore reject 𝐻0 and conclude that the
population mean is not 66 inches.
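The arithmetic for the heights problem can be checked with a few lines of Python. This is a sketch using only the standard library; the critical value 2.262 is read from a t table (0.025 in each tail, 9 degrees of freedom) rather than computed.

```python
import math
from statistics import mean, stdev

heights = [67, 68, 69, 70, 70, 71, 71, 72, 72, 73]
mu0 = 66          # hypothesized population mean
t_crit = 2.262    # t table value for 0.025 in each tail, 9 degrees of freedom

n = len(heights)
x_bar = mean(heights)
s = stdev(heights)                      # sample standard deviation (n - 1 divisor)
t = (x_bar - mu0) / (s / math.sqrt(n))
reject = abs(t) >= t_crit

print(f"x_bar = {x_bar:.1f}, s = {s:.3f}, t = {t:.2f}, reject H0: {reject}")
```

Because t is far beyond 2.262, the data do not support a population mean of 66 inches.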
----------------------------------------------------------------------------------------------------------------
In testing hypotheses about the difference between two population means, we deal with the following
three cases.
1. Both populations are normal with known standard deviations. Then the test statistic
$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$
has the standard normal distribution.
2. Both populations are normal with unknown standard deviations and both sample sizes are large
(greater than 30). Then the test statistic
$$z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$$
has approximately the standard normal distribution.
3. Both populations are non-normal, in which case both sample sizes must be large. If 𝑛1 ≥ 30 and
𝑛2 ≥ 30, then the test statistic
$$z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$$
has approximately the standard normal distribution.
Case 1.
$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$
which, under 𝐻0, has the standard normal distribution.
Case 2.
$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$$
which, under 𝐻0, has approximately the standard normal distribution if 𝑛1 and 𝑛2 are greater than 30 and
𝜎1 and 𝜎2 are unknown, for normal or non-normal populations.
5 Calculation
6 Conclusion
Example 16. The two samples A and B, detailed below, were taken from normal populations with standard
deviation 0.8. Test whether the difference of means is significant.
$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$
which, under 𝐻0, has the standard normal distribution.
5 Calculation
$$\bar{X}_1 = 12.8$$
$$\bar{X}_2 = 13.675$$
$$z = \frac{(12.8 - 13.675) - 0}{\sqrt{\frac{(0.8)^2}{7} + \frac{(0.8)^2}{8}}} = -2.11$$
6 Conclusion
The calculated value of Z falls in the critical region. Therefore, we reject the null hypothesis and accept the
alternative hypothesis. We conclude that there is a significant difference between the population
means.
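The calculation in Example 16 can be reproduced from the summary figures alone. The sketch below uses a hypothetical helper, two_sample_z, and assumes a two-tailed test at the 0.05 level (critical value 1.96); the example itself does not state the level of significance.

```python
import math

def two_sample_z(x1_bar, x2_bar, sigma1, sigma2, n1, n2, delta0=0.0):
    """z statistic for the difference of two means with known standard deviations."""
    se = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return ((x1_bar - x2_bar) - delta0) / se

z = two_sample_z(12.8, 13.675, sigma1=0.8, sigma2=0.8, n1=7, n2=8)
z_crit = 1.96  # assumption: two-tailed test at alpha = 0.05

print(round(z, 2), abs(z) >= z_crit)
```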
--------------------------------------------------------------------------------------------------------------------------
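Let X11, X12, ..., X1n1 and X21, X22, ..., X2n2 be two small independent random samples from two normal
populations with means 𝜇1 and 𝜇2 and standard deviations 𝜎1 and 𝜎2. The test statistic is given below.
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - \Delta}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
which, under 𝐻0, has the Student t distribution with 𝑛1 + 𝑛2 − 2 degrees of freedom. Here ∆ is the
hypothesized difference 𝜇1 − 𝜇2 and s_p is the pooled sample standard deviation.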
Example: Given the following samples from two normally distributed populations with identical but unknown
standard deviations, test 𝐻0 : 𝜇1 − 𝜇2 ≤ 3 against 𝐻1 : 𝜇1 − 𝜇2 > 3.
Sample 1: 51, 42, 49, 55, 46, 63, 56, 58, 47, 39, 47
Solution
𝐻0 : 𝜇1 − 𝜇2 ≤ 3 and 𝐻1 : 𝜇1 − 𝜇2 > 3
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - \Delta}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
which, under 𝐻0, has the Student t distribution with 𝑛1 + 𝑛2 − 2 degrees of freedom.
5 Calculations
$$\bar{x}_1 = 50.3$$
$$\bar{x}_2 = 37.8$$
$$s_p = 7.41$$
$$t = \frac{50.3 - 37.8 - 3}{7.41 \sqrt{\frac{1}{11} + \frac{1}{6}}} = 2.53$$
6 Conclusion. The calculated value of t falls in the critical region, so we reject the null hypothesis and accept
the alternative hypothesis.
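The pooled t statistic can be recomputed from the summary figures quoted in the example. The sketch below uses hypothetical helper names; pooled_sd shows the standard way the pooled standard deviation is formed from the two sample standard deviations, a formula the example uses implicitly when it quotes s_p = 7.41.

```python
import math

def pooled_sd(s1, n1, s2, n2):
    """Pooled standard deviation from two sample standard deviations."""
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

def pooled_t(x1_bar, x2_bar, sp, n1, n2, delta0=0.0):
    """t statistic for the difference of two means with a common variance."""
    return ((x1_bar - x2_bar) - delta0) / (sp * math.sqrt(1 / n1 + 1 / n2))

# Summary figures taken from the example: n1 = 11, n2 = 6, hypothesized difference 3.
t = pooled_t(50.3, 37.8, sp=7.41, n1=11, n2=6, delta0=3)
df = 11 + 6 - 2

print(round(t, 2), df)
```

The statistic is then compared with the upper-tail t table value for 15 degrees of freedom at the chosen level of significance.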
Testing hypotheses about the difference of means of two normal populations when 𝜎1 ≠ 𝜎2: use the same
procedure as given above, except for the test statistic and the degrees of freedom. The test statistic is given by
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - \Delta}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
which has approximately a Student t distribution with 𝑣 degrees of freedom.
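The notes do not state how 𝑣 is obtained. A common choice, assumed here, is the Welch-Satterthwaite approximation, sketched below with a hypothetical helper and made-up inputs.

```python
def welch_df(s1_sq, n1, s2_sq, n2):
    """Approximate degrees of freedom (Welch-Satterthwaite) for unequal variances."""
    a = s1_sq / n1
    b = s2_sq / n2
    return (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))

# Hypothetical inputs: sample variances 4 and 25, sample sizes 10 and 12.
v = welch_df(4.0, 10, 25.0, 12)
print(round(v, 1))
```

The result is usually not an integer; rounding it down before consulting a t table gives a slightly conservative test. With equal variances and equal sample sizes the formula reduces to 𝑛1 + 𝑛2 − 2.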
------------------------------------------------------------------------------------------------------------
Introduction
A paired t-test is used to compare two population means where you have two samples in which
observations in one sample can be paired with observations in the other sample.
• Before-and-after observations on the same subjects (e.g. students’ diagnostic test results before and after
a particular module or course).
• A comparison of two different methods of measurement or two different treatments where the
measurements/treatments are applied to the same subjects.
Suppose a sample of n students were given a diagnostic test before studying a particular module and then
again after completing the module. We want to find out if, in general, our teaching leads to improvements
in students’ knowledge/skills (i.e. test scores). We can use the results from our sample of students to draw
conclusions about the impact of this module in general.
Let x = test score before the module, y = test score after the module
To test the null hypothesis that the true mean difference is zero, the procedure is as follows:
1. Calculate the difference (𝑑𝑖 = 𝑦𝑖 − 𝑥𝑖 ) between the two observations on each pair,
6. Conclusion.
Example: A political candidate wishes to determine if endorsing increased social spending is likely to affect
her standing in the polls. She has access to data on the popularity of several other candidates who have
endorsed increased spending. The data were available both before and after the candidates announced their
positions on the issue (see the table below).
Popularity Ratings
Candidate Before After
1 42 43
2 41 45
3 50 56
4 52 54
5 58 65
6 32 29
7 39 46
8 42 48
9 48 47
10 47 53
$$t = \frac{\bar{d}}{s_d / \sqrt{n}}$$
which, under 𝐻0, has the Student t distribution with n − 1 degrees of freedom.
5 Calculations
Popularity Ratings
Candidate Before After Difference
1 42 43 1
2 41 45 4
3 50 56 6
4 52 54 2
5 58 65 7
6 32 29 -3
7 39 46 7
8 42 48 6
9 48 47 -1
10 47 53 6
$$t = \frac{3.50}{3.567 / \sqrt{10}} = 3.103$$
6 Conclusion. The calculated value of t falls in the critical region, so we reject the null hypothesis and accept
the alternative hypothesis. We conclude that endorsing increased spending has a significant effect on
popularity ratings.
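The paired t calculation can be reproduced from the table. The sketch below uses only the standard library and assumes a two-tailed test at the 0.05 level, for which the t table value with 9 degrees of freedom is 2.262; the example does not state the level of significance.

```python
import math
from statistics import mean, stdev

before = [42, 41, 50, 52, 58, 32, 39, 42, 48, 47]
after = [43, 45, 56, 54, 65, 29, 46, 48, 47, 53]

# Difference for each candidate: after minus before.
d = [a - b for a, b in zip(after, before)]
n = len(d)
d_bar = mean(d)
s_d = stdev(d)                       # standard deviation of the differences
t = d_bar / (s_d / math.sqrt(n))
t_crit = 2.262                       # assumption: alpha = 0.05, two-tailed, 9 df

print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.3f}, t = {t:.3f}, reject H0: {abs(t) >= t_crit}")
```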
POINT ESTIMATE:
A point estimate is a single statistic used to estimate a population parameter. Suppose Best Buy, Inc. wants
to estimate the mean age of buyers of high-definition televisions. They select a random sample of 50 recent
purchasers, determine the age of each purchaser, and compute the mean age of the buyers in the sample.
The mean of this sample is a point estimate of the mean of the population.
The sample mean, 𝑋̅, is a point estimate of the population mean, 𝜇; and S, the sample standard deviation,
is a point estimate of 𝜎, the population standard deviation. A point estimate, however, tells only part of the
story. While we expect the point estimate to be close to the population parameter, we would like to
measure how close it really is. A confidence interval serves this purpose.
CONFIDENCE INTERVAL A range of values constructed from sample data so that the population parameter
is likely to occur within that range at a specified probability. The specified probability is called the level of
confidence.
For example, we estimate the mean yearly income for construction workers in the New York-New Jersey
area is $65,000. The range of this estimate might be from $61,000 to $69,000. We can describe how
confident we are that the population parameter is in the interval by making a probability statement. We
might say, for instance, that we are 90 percent sure that the mean yearly income of construction workers in
the New York-New Jersey area is between $61,000 and $69,000.
Confidence Interval For The Population Mean when the sample is selected from a normal population and
𝜎 is known: use the standard normal distribution.
$$\bar{X} \pm z_{\alpha/2} \, \frac{\sigma}{\sqrt{n}}$$
Confidence Interval For The Population Mean when the sample is selected from a normal population and
𝜎 is unknown: check the sample size. If n ≥ 30, use the approximately standard normal distribution.
$$\bar{X} \pm z_{\alpha/2} \, \frac{S}{\sqrt{n}}$$
Confidence Interval For The Population Mean when the sample is selected from a non-normal population and
𝜎 is unknown: check the sample size. If n ≥ 30, use the approximately standard normal distribution.
$$\bar{X} \pm z_{\alpha/2} \, \frac{S}{\sqrt{n}}$$
Confidence Interval For The Population Mean when the sample is selected from a normal population and
𝜎 is unknown: check the sample size. If n < 30, use the Student t distribution with 𝑣 = n − 1 degrees of
freedom.
$$\bar{X} \pm t_{\alpha/2,\,v} \, \frac{s}{\sqrt{n}}$$
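The interval formulas above differ only in the multiplier (z or t) and in which standard deviation is used. The sketch below shows both forms with hypothetical helper names. The z multiplier is computed from the standard normal distribution, while the t multiplier is passed in from a t table. The inputs are borrowed from the worked examples earlier in these notes purely as illustrations.

```python
import math
from statistics import NormalDist

def z_interval(x_bar, sd, n, confidence=0.95):
    """Interval x_bar +/- z * sd / sqrt(n); sd is sigma if known, else S with n >= 30."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sd / math.sqrt(n)
    return x_bar - margin, x_bar + margin

def t_interval(x_bar, s, n, t_crit):
    """Interval x_bar +/- t * s / sqrt(n); t_crit comes from a t table with n - 1 df."""
    margin = t_crit * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Known sigma: x_bar = 83, sigma = 7, n = 25, 95 percent confidence.
low_z, high_z = z_interval(83, 7, 25)
# Small sample, unknown sigma: x_bar = 70.3, s = 1.889, n = 10, t(0.025, 9) = 2.262.
low_t, high_t = t_interval(70.3, 1.889, 10, t_crit=2.262)

print(f"z interval: ({low_z:.3f}, {high_z:.3f})")
print(f"t interval: ({low_t:.2f}, {high_t:.2f})")
```

Neither interval contains the hypothesized mean from the corresponding test (80 and 66), which agrees with the two-tailed decisions to reject at the 0.05 level.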
-----------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------
To construct a confidence interval for β, the population regression coefficient, we use b, the sample
estimate of β. The sampling distribution of b is normal with mean β and standard deviation
$$\sigma_b = \frac{\sigma_{Y.X}}{\sqrt{\sum (X_i - \bar{X})^2}}$$
That is, the variable
$$Z = \frac{b - \beta}{\sigma_{Y.X} / \sqrt{\sum (X_i - \bar{X})^2}}$$
is a standard normal variable.
But 𝜎𝑌.𝑋 is generally not known; we therefore estimate it from the sample data by
$$s_{Y.X} = \sqrt{\frac{\sum (Y_i - \hat{Y}_i)^2}{n - 2}}$$
We then use the Student's t distribution rather than the normal distribution:
$$b \pm t_{\alpha/2,\,(n-2)} \, s_b, \qquad s_b = \frac{s_{Y.X}}{\sqrt{\sum (X_i - \bar{X})^2}}$$
---------------------------------------------------------------------------------------------------------------------
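To construct a confidence interval for 𝛼, we use a, the sample estimate of 𝛼. We have already observed
that a is distributed normally with mean 𝜇_a = 𝛼 and standard deviation
$$\sigma_a = \sigma_{Y.X} \sqrt{\frac{1}{n} + \frac{\bar{X}^2}{\sum (X_i - \bar{X})^2}}$$
Since 𝜎𝑌.𝑋 is usually unknown, we use its unbiased sample estimate 𝑠𝑌.𝑋.
$$a \pm t_{\alpha/2,\,(n-2)} \, s_a, \qquad s_a = s_{Y.X} \sqrt{\frac{1}{n} + \frac{\bar{X}^2}{\sum (X_i - \bar{X})^2}}$$
In the subscript of t, 𝛼 denotes the level of significance, not the intercept.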
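The two regression intervals can be computed together, since both rest on the same estimate 𝑠𝑌.𝑋. The sketch below is illustrative only. The data are hypothetical, the helper name regression_intervals is made up, the least-squares formulas for a and b are the usual ones (they are not restated in this section), and the multiplier 3.182 is the t table value for 0.025 in each tail with n − 2 = 3 degrees of freedom.

```python
import math

def regression_intervals(x, y, t_crit):
    """Confidence intervals for the slope (b) and intercept (a) of a fitted line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx                     # least-squares slope
    a = y_bar - b * x_bar             # least-squares intercept
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s_yx = math.sqrt(sse / (n - 2))   # standard error of estimate
    s_b = s_yx / math.sqrt(sxx)
    s_a = s_yx * math.sqrt(1 / n + x_bar**2 / sxx)
    return {
        "b": b, "a": a, "s_b": s_b, "s_a": s_a,
        "b_interval": (b - t_crit * s_b, b + t_crit * s_b),
        "a_interval": (a - t_crit * s_a, a + t_crit * s_a),
    }

# Hypothetical data, n = 5, so the t multiplier has 3 degrees of freedom.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
result = regression_intervals(x, y, t_crit=3.182)

print("slope interval:", tuple(round(v, 3) for v in result["b_interval"]))
print("intercept interval:", tuple(round(v, 3) for v in result["a_interval"]))
```

With so few points the slope interval includes zero, so these made-up data would not establish a linear relationship at the 95 percent level.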