SBC 3305
Why sample?
Considering samples from a distribution enables us to obtain information about a population where we
cannot, for reasons of practicality, economy, or both, inspect the whole of the population. For example,
it is impossible to check the complete output of some manufacturing processes. Items such as electric
light bulbs, nuts, bolts, springs, and light emitting diodes (LEDs) are produced in their millions, and
the sheer cost of checking every item as well as the time implications of such a checking process render
it impossible. In addition, testing is sometimes destructive—one would not wish to destroy the whole
production of a given component!
Basic terminology
• Population - the entire group of objects about which information is wanted.
• Unit - any individual member of the population.
• Sample - a part or subset of the population used to gain information about the whole.
Much of the theory (and hence the practice) of sampling is based on the Central Limit Theorem.
Essentially, the Central Limit Theorem says that if we take large samples of size n with mean X̄ from a
population which has a mean µ and standard deviation σ, then the sample means X̄ are approximately
normally distributed with mean µ and standard deviation
$$\frac{\sigma}{\sqrt{n}}.$$
That is, the sampling distribution of the mean X̄ is approximately
$$\bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right),$$
where the second parameter is the standard deviation (equivalently, the variance is $\sigma^2/n$).
Muindi J.
Strictly speaking, we require $\sigma^2 < \infty$, and it is important to note that no claim is made about the
way in which the original distribution behaves, and it need not be normal. This is why the Central Limit
Theorem is so fundamental to statistical practice. One implication is that a random variable which takes
the form of a sum of many components which are random but not necessarily normal will itself be
approximately normal provided that the sum is not dominated by a small number of components. This helps
explain why many biological variables, such as human heights, are approximately normally distributed.
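The claim can be checked numerically. The following is a minimal simulation sketch, not part of the original notes: it assumes an Exponential(1) population (clearly non-normal, with µ = 1 and σ = 1), and the sample size, number of replications, and seed are arbitrary choices made for illustration.

```python
import math
import random
import statistics

def simulate_sample_means(n, replications, rng):
    """Draw `replications` samples of size n from Exponential(rate=1); return their means."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(replications)
    ]

rng = random.Random(2024)  # fixed seed so the run is repeatable (arbitrary choice)
n = 50                     # sample size (illustrative assumption)
means = simulate_sample_means(n, replications=5000, rng=rng)

mu, sigma = 1.0, 1.0       # mean and standard deviation of Exponential(1)
observed_mean = statistics.fmean(means)
observed_sd = statistics.stdev(means)
predicted_sd = sigma / math.sqrt(n)  # the CLT value sigma / sqrt(n)

print(f"mean of sample means: {observed_mean:.4f} (CLT predicts {mu:.4f})")
print(f"sd of sample means:   {observed_sd:.4f} (CLT predicts {predicted_sd:.4f})")
```

The observed mean and standard deviation of the simulated sample means should land close to µ and σ/√n, even though the underlying population is strongly skewed.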
1. State the null and alternative hypotheses.
2. Determine the appropriate test statistic and calculate it using the sample data.
3. Compare the test statistic to the critical region to draw initial conclusions.
4. Calculate the p-value.
Is there strong evidence for the alternative? The burden of proof is placed on those who believe
in the alternative claim. This initially favored claim (H0) will not be rejected in favor of the alternative
claim (Ha or H1) unless the sample evidence provides significant support for the alternative assertion.
If the sample does not strongly contradict H0, we will continue to believe in the plausibility of the null
hypothesis.
The two possible conclusions:
• Reject H0.
• Fail to reject H0.
Example: Suppose a company is considering putting a new type of coating on bearings that it produces.
The true average wear life with the current coating is known to be 1000 hours. With µ denoting the
true average life for the new coating, the company would not want to make any (costly) changes unless
evidence strongly suggested that µ exceeds 1000.
An appropriate problem formulation would involve testing
H0: µ = 1000 against Ha: µ > 1000.
The conclusion that a change is justified is identified with Ha, and it would take conclusive evidence
to justify rejecting H0 and switching to the new coating.
The alternative to the null hypothesis H0: θ = θ0 will look like one of the following three assertions:
1. Ha: θ ≠ θ0
2. Ha: θ > θ0 (in which case the null hypothesis is θ ≤ θ0)
3. Ha: θ < θ0 (in which case the null hypothesis is θ ≥ θ0)
B. Determine the Appropriate Test Statistic and Calculate it Using the Sample Data.
The test statistic is a function of the sample data that will be used to make a decision about whether
the null hypothesis should be rejected or not.
Example: Company A produces circuit boards, but 10% of them are defective. Company B claims
that they produce fewer defective circuit boards.
Our data is a random sample of n = 200 boards from company B. What test procedure (or rule)
could we devise to decide if the null hypothesis should be rejected?
There are an infinite number of possible tests that could be devised, so we have to limit this in some
way or total statistical madness will ensue! Choice of a particular test procedure must be based on the
probability the test will produce incorrect results.
• A Type I error occurs when the null hypothesis (H0 ) is rejected, but it is actually true.
• A Type II error occurs when H0 is not rejected, but H0 is actually false.
How do we apply this to the circuit board problem?
Type I Errors
Usually, we specify the largest value of α (significance level) that can be tolerated, and then find a
rejection region with that α.
The resulting value of α is often referred to as the significance level of the test.
Traditional levels of significance are 0.10, 0.05, and 0.01, though the level in any particular problem
will depend on the seriousness of a Type I error.
Rule: The more serious the Type I error, the smaller the significance level should be.
We can also obtain a smaller value of α, the probability that the null will be incorrectly rejected, by
decreasing the size of the rejection region. However, this results in a larger value of β, the probability
of a Type II error, for all parameter values consistent with Ha.
No rejection region will simultaneously make both α and all β's small. A region must be
chosen to strike a compromise between α and β.
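The trade-off can be illustrated with the circuit board example. The sketch below is an illustration under stated assumptions, not the procedure from the notes: it assumes the rule "reject H0 if the number of defective boards X in the sample is at most c", takes p = 0.10 under H0 and n = 200 from the example, and uses a hypothetical alternative defect rate of 0.05 for company B. The helper names `binom_cdf` and `error_rates` are introduced here for illustration.

```python
from math import comb

def binom_cdf(c, n, p):
    """P(X <= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c + 1))

n = 200
p_null = 0.10  # defect rate under H0 (company A's rate, from the example)
p_alt = 0.05   # hypothetical defect rate for company B (assumption)

def error_rates(c):
    """Type I and Type II error probabilities for the rule 'reject H0 if X <= c'."""
    alpha = binom_cdf(c, n, p_null)      # reject H0 although p = 0.10
    beta = 1.0 - binom_cdf(c, n, p_alt)  # fail to reject H0 although p = 0.05
    return alpha, beta

# Shrinking the rejection region (smaller c) lowers alpha but raises beta.
for c in (10, 12, 14, 16):
    alpha, beta = error_rates(c)
    print(f"c = {c:2d}: alpha = {alpha:.4f}, beta = {beta:.4f}")
```

Reading down the printed table, α grows and β shrinks as the cutoff c increases, which is the compromise described above.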
Example
An inventor has developed a new, energy-efficient lawn mower engine. He claims that the engine will
run continuously for more than 5 hours (300 minutes) on a single gallon of regular gasoline. The leading
brand lawnmower engine runs for 300 minutes on 1 gallon of gasoline.
From his stock of engines, the inventor selects a simple random sample of 50 engines for testing. The
engines run for an average of 305 minutes. The true standard deviation σ is known to be 30 minutes,
and the run times of the engines are normally distributed.
Test the hypothesis that the mean run time is more than 300 minutes. Use a 0.05 level of significance.
With σ known, the test statistic is
$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$
CI vs. Hypotheses
Rejection regions have much in common with confidence intervals.
Example:
The Brinell scale measures how hard a material is. An engineer hypothesizes that the mean Brinell score
of all subcritically annealed ductile iron pieces is not equal to 170.
The engineer measured the Brinell score of 25 pieces of this type of iron and calculated the sample
mean to be 174.52 with a sample standard deviation of 10.31.
Perform a hypothesis test that the true average Brinell score is not equal to 170, as well as the
corresponding confidence interval. Set α = 0.01.
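The link between the two approaches can be sketched for this example. The code below is a minimal illustration, assuming a two-sided one-sample t-test of H0: µ = 170 and a t-based interval. The critical value 2.797 is the table value of t at 0.005 with 24 degrees of freedom, entered by hand because the Python standard library has no t distribution.

```python
import math

n = 25
x_bar = 174.52   # sample mean
s = 10.31        # sample standard deviation
mu_0 = 170.0     # hypothesized mean
t_crit = 2.797   # table value t_{0.005, 24}; entered by hand, not computed

standard_error = s / math.sqrt(n)
t_stat = (x_bar - mu_0) / standard_error

# 99% confidence interval: x_bar +/- t_crit * s / sqrt(n)
margin = t_crit * standard_error
ci_low, ci_high = x_bar - margin, x_bar + margin

# The two decisions agree: reject H0 exactly when mu_0 lies outside the interval.
reject_by_test = abs(t_stat) > t_crit
reject_by_ci = not (ci_low <= mu_0 <= ci_high)

print(f"t = {t_stat:.3f}, critical value = {t_crit}")
print(f"99% CI: ({ci_low:.2f}, {ci_high:.2f})")
print(f"reject H0 by test: {reject_by_test}, by CI: {reject_by_ci}")
```

Both checks return the same decision, because the two-sided test at level α and the (1 − α) confidence interval are built from the same standard error and critical value.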
D. Calculation of p-value
The p-value measures the "extremeness" of the sample.
Definition: The p-value is the probability that we would get the sample we have or something more
extreme if the null hypothesis were true.
The smaller the p-value, the more evidence there is in the sample data against the null hypothesis.
Select a significance level α (as before, the desired Type I error probability); α then defines the
rejection region.
Then the decision rule is:
• reject H0 if p-value ≤ α
• do not reject H0 if p-value > α
Thus if the p-value exceeds the chosen significance level, the null hypothesis cannot be rejected at that
level. Note that the p-value can be thought of as the smallest significance level at which H0 can be rejected.
For upper-tailed, lower-tailed, and two-tailed tests, the p-value is in each case the probability of getting
a value at least as extreme as what was obtained (assuming H0 true).
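As an illustration of how "at least as extreme" depends on the alternative, here is a minimal sketch for a test statistic that is standard normal under H0. The function name `z_p_value`, the labels for the alternatives, and the observed value 1.5 are all hypothetical choices made for this example.

```python
from statistics import NormalDist

def z_p_value(z, alternative):
    """p-value for an observed z statistic that is standard normal under H0."""
    std_normal = NormalDist()
    if alternative == "greater":      # Ha: theta > theta0, upper tail
        return 1.0 - std_normal.cdf(z)
    if alternative == "less":         # Ha: theta < theta0, lower tail
        return std_normal.cdf(z)
    if alternative == "two-sided":    # Ha: theta != theta0, both tails
        return 2.0 * (1.0 - std_normal.cdf(abs(z)))
    raise ValueError(f"unknown alternative: {alternative!r}")

z_observed = 1.5  # hypothetical observed statistic
for alt in ("greater", "less", "two-sided"):
    print(f"{alt:>9}: p-value = {z_p_value(z_observed, alt):.4f}")
```

The same observed statistic gives three different p-values, so the alternative hypothesis has to be fixed before the data are examined.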
Examples
Example 1: An inventor has developed a new, energy-efficient lawn mower engine. He claims that the
engine will run continuously for more than 5 hours (300 minutes) on a single gallon of regular gasoline.
The leading brand lawnmower engine runs for 300 minutes on 1 gallon of gasoline. From his stock of
engines, the inventor selects a simple random sample of 50 engines for testing. The engines run for an
average of 305 minutes. The true standard deviation σ is known to be 30 minutes, and the run times
of the engines are normally distributed. Test the hypothesis that the mean run time is more than 300
minutes. Use a 0.05 level of significance.
a) Set up the test hypotheses.
b) Write the test statistic.
c) Apply the classical or p-value method.
d) Draw a conclusion and interpret it.
Solution
a) Hypotheses
H0: µ = 300 against Ha: µ > 300, where µ is the true mean run time in minutes.
b) Test Statistic
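Since the population standard deviation is known, we will use the Z-test. The test statistic is given by:
$$Z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$
Where:
• x̄ = 305 (sample mean),
• µ0 = 300 (hypothesized mean),
• σ = 30 (population standard deviation),
• n = 50 (sample size).
Substituting these values:
$$Z = \frac{305 - 300}{30/\sqrt{50}} = \frac{5}{4.243} \approx 1.18$$
For an upper-tailed test at α = 0.05, the critical value is:
$$Z_{\alpha} = Z_{0.05} = 1.645$$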
Since the calculated Z-statistic (Z = 1.18) is less than the critical value (1.645), we fail to reject the
null hypothesis.
P-value Method
We can also calculate the p-value for the Z-statistic. Using the standard normal distribution table,
P(Z ≤ 1.18) ≈ 0.8810, so for this upper-tailed test the p-value is P(Z ≥ 1.18) ≈ 1 − 0.8810 = 0.1190.
Since the p-value (0.1190) is greater than the significance level (α = 0.05), we fail to reject the null
hypothesis.
Interpretation
There is insufficient evidence at the 0.05 significance level to conclude that the mean run time of the
lawn mower engines is more than 300 minutes on a single gallon of gasoline.
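The calculation for Example 1 can be reproduced with a short script. This is a sketch using only the Python standard library. It carries Z at full precision, so the p-value comes out near 0.1193 rather than the table-based 0.1190 obtained from Z rounded to 1.18.

```python
import math
from statistics import NormalDist

x_bar = 305.0   # sample mean (minutes)
mu_0 = 300.0    # hypothesized mean (minutes)
sigma = 30.0    # known population standard deviation (minutes)
n = 50          # sample size
alpha = 0.05    # significance level

# Z statistic for a one-sample test with known sigma
z = (x_bar - mu_0) / (sigma / math.sqrt(n))

# Classical method: compare with the upper-tail critical value
z_crit = NormalDist().inv_cdf(1.0 - alpha)

# p-value method: upper-tail area beyond the observed z
p_value = 1.0 - NormalDist().cdf(z)

reject = z > z_crit
print(f"Z = {z:.2f}, critical value = {z_crit:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if reject else "Fail to reject H0")
```

Both methods lead to the same decision, since Z exceeds the critical value exactly when the p-value falls below α.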
Example 2: An enzyme researcher hypothesizes that the mean enzymatic activity of a specific enzyme
under suboptimal conditions is not equal to 170 µmol/min. The researcher measures the enzymatic
activity of 25 enzyme samples under these conditions and calculates the sample mean to be 174.52
µmol/min with a sample standard deviation of 10.31 µmol/min.
Perform a hypothesis test to determine if the true average enzymatic activity is different from 170
µmol/min, and calculate the corresponding confidence interval. Use a significance level of α = 0.01.
Solution
Given:
• Sample size (n) = 25
• Sample mean (x̄) = 174.52 µmol/min
• Sample standard deviation (s) = 10.31 µmol/min
Degrees of freedom: df = n − 1 = 25 − 1 = 24
For a two-tailed test with α = 0.01, the critical t-value at 24 degrees of freedom is:
$$t_{\alpha/2,\,24} = t_{0.005,\,24} = 2.797$$
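The comparison against this critical value can be made explicit. As a worked sketch, assuming the hypotheses are H0: µ = 170 against Ha: µ ≠ 170 (they are not written out in this solution), the t statistic computed from the given values is:

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{174.52 - 170}{10.31/\sqrt{25}} = \frac{4.52}{2.062} \approx 2.19$$

Since |t| ≈ 2.19 is less than the critical value 2.797, the statistic does not fall in the rejection region.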
Step 5: Conclusion
At the 0.01 significance level, we do not have sufficient evidence to conclude that the true mean enzymatic
activity differs from 170 µmol/min.
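The corresponding 99% confidence interval for the true mean is x̄ ± tα/2,24 · s/√n:
$$174.52 \pm 2.797 \times \frac{10.31}{\sqrt{25}} = 174.52 \pm 5.77$$
Thus, the 99% confidence interval is:
$$(168.75,\ 180.29)$$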
Final Conclusion
The 99% confidence interval for the true mean enzymatic activity is between 168.75 µmol/min and 180.29
µmol/min. Since the hypothesized value of 170 is within this range, this supports our conclusion that
we fail to reject the null hypothesis.