QUARTER 3: STATISTICS AND PROBABILITY
Lesson Illustrating the t-distribution
Objective
14 1. illustrates the t-distribution.
According to the Central Limit Theorem, the sampling distribution of a statistic (like a sample
mean, x̄) will approximately follow a normal distribution, as long as the sample size (n) is
sufficiently large. Therefore, when we know the standard deviation of the population, we can compute
a z-score and use the normal distribution to evaluate probabilities involving the sample mean.
But sample sizes are sometimes small, and often we do not know the standard deviation of the
population. When either of these problems occurs, the solution is to use a different distribution.
Student’s t-distribution
The Student’s t-distribution is a probability distribution that is used to estimate population
parameters when the sample size is small (i.e., n < 30) and/or when the population variance is
unknown. It was developed by William Sealy Gosset in 1908. He used the pseudonym “Student”
when he published the paper describing the distribution, which is why it is called
“Student’s t-distribution”. He worked at a brewery and was interested in the problems of small
samples, for example, the chemical properties of barley. In the problems he analyzed, the sample
size might be as low as three.
Suppose you are about to draw a random sample of n observations from a normally
distributed population. You previously learned that the quantity

$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$

where z is the z-score, x̄ is the sample mean, μ is the population mean, σ is the population
standard deviation, and n is the sample size, has the standard normal distribution. (Note that if
we are standardizing a single observation x, the value of n is 1. Hence, the formula becomes
$z = \frac{x - \mu}{\sigma}$.) You can use this concept to construct a confidence interval for the
population mean, μ. But in practice, you encounter a problem: you don’t know the value of the
population standard deviation, σ. The standard deviation of the entire population is a parameter,
and you typically don’t know its value, so you can’t use it in your formula. If that happens, you
could do the next best thing: instead of using the “population” standard deviation, σ, you use
your “sample” standard deviation, s, to estimate it. So instead of

$$\frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$

you are going to have

$$\frac{\bar{x} - \mu}{s / \sqrt{n}}$$

where s is your sample standard deviation.
You must take note of the change in the formula. The quantity σ is a constant, but you
don’t know its value, so you used s, which is a statistic. This statistic s has a sampling
distribution, and its value varies from sample to sample. And so, the quantity
$\frac{\bar{x} - \mu}{s / \sqrt{n}}$ no longer has the standard normal distribution. This quantity
is labeled t because it has a t-distribution. When you are sampling from a normally distributed
population, the quantity

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$

has the t-distribution with n − 1 degrees of freedom. Note that the number of degrees of freedom is
one less than the sample size. So, if the sample size n is 25, the number of degrees of freedom is
24. Similarly, for a t-distribution having 16 degrees of freedom, the sample size is 17.
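The computation of the t statistic and its degrees of freedom can be sketched in a few lines of Python. This is only an illustration: the function name `t_statistic`, the five data values, and the hypothesized mean of 10 are made up for the example and do not come from the lesson.

```python
import math
import statistics

def t_statistic(sample, mu):
    """Return (t, degrees of freedom) for a sample and a hypothesized mean mu."""
    n = len(sample)
    x_bar = statistics.mean(sample)      # sample mean
    s = statistics.stdev(sample)         # sample standard deviation (divides by n - 1)
    t = (x_bar - mu) / (s / math.sqrt(n))
    return t, n - 1                      # degrees of freedom are one less than n

# Hypothetical data: five measurements, hypothesized population mean of 10
sample = [9, 10, 11, 12, 13]
t, df = t_statistic(sample, mu=10)
print(round(t, 4), df)
```

Because the sample has five observations, the statistic is compared against a t-distribution with 4 degrees of freedom.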
What does the t-distribution look like? If you look at the statistic
$\frac{\bar{x} - \mu}{s / \sqrt{n}}$, it looks like a z-statistic, which has the standard normal
distribution, except that you replaced the population standard deviation, σ, with the sample
standard deviation, s. You are estimating a parameter with a statistic, so there is greater
variability. Hence, your t-distribution is going to look like the normal distribution except with
greater variance.
You have here a plot of standard normal distribution in black and t-distributions with 3, 5, 20, and
30 degrees of freedom in red, green, violet, and blue respectively. You can see that both the z-
distribution and t-distributions are symmetric about 0 and bell-shaped. But the t-distributions
have heavier tails (more area in the tails) and lower peaks.
The exact shape of the t-distribution depends on the degrees of freedom. The figure above
tells you that as the degrees of freedom increase, the t-distribution tends toward the standard
normal distribution. At 30 degrees of freedom, the blue curve looks very close to the normal
curve. But if you look very closely, you would see that the t-distribution still has slightly heavier
tails and a slightly lower peak. If you let the degrees of freedom continue to increase, the
t-distribution gets closer and closer to the standard normal distribution.
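The lower peak and heavier tails can also be checked numerically. The sketch below evaluates the standard density formula of the t-distribution, which is not given in this lesson, at 0 (the peak) and at 3 (out in the tail), and compares it with the standard normal density. The function names are illustrative.

```python
import math

def t_pdf(t, v):
    """Height of the t-distribution curve with v degrees of freedom at the value t."""
    coef = math.gamma((v + 1) / 2) / (math.sqrt(v * math.pi) * math.gamma(v / 2))
    return coef * (1 + t * t / v) ** (-(v + 1) / 2)

def normal_pdf(z):
    """Height of the standard normal curve at the value z."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

# Compare the height at the peak (0) and in the tail (3) for each curve
for v in (3, 5, 20, 30):
    print("df =", v, round(t_pdf(0, v), 4), round(t_pdf(3, v), 4))
print("normal", round(normal_pdf(0), 4), round(normal_pdf(3), 4))
```

For every number of degrees of freedom listed, the t curve should be lower than the normal curve at 0 and higher at 3, with the gap shrinking as the degrees of freedom increase.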
Properties of t-distribution
The t-distribution has the following properties:
1. The t-distribution is symmetrical about 0. That means if you
draw a segment from the peak of the curve down to the 0
mark on the horizontal axis, the curve is divided into two
equal parts or areas. The t-scores on the horizontal axis are
also divided, with half of the t-scores being positive and
half negative.
2. The t-distribution is bell-shaped like the normal distribution
but has heavier tails. That means it is more prone
to producing values that fall far from the
mean. The tails are asymptotic to the horizontal axis. (Each
tail approaches the horizontal axis but never touches it.)
3. The mean, median, and mode of the t-distribution are all equal
to zero.
4. The variance is always greater than 1. It is equal to $\frac{v}{v - 2}$, where v is the
number of degrees of freedom (the formula applies when v > 2). As the number of degrees of
freedom increases and approaches infinity, the variance approaches 1. Using the formula, if the
number of degrees of freedom is 10, the variance is
$\frac{10}{10 - 2} = \frac{10}{8} = 1.25$.
5. As the degrees of freedom increase, the t-distribution curve looks more and more like the
normal distribution. With infinite degrees of freedom, the t-distribution is the same as the
standard normal distribution.
6. The standard deviation of the t-distribution varies with the sample size. It is always greater
than 1, unlike the standard normal distribution, which has a standard deviation of 1.
7. The total area under a t-distribution curve is 1 or 100%.
One can say that the area under the t-distribution
curve represents the probability or the percentage
associated with specific sets of t-values.
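Properties 4 and 6 can be illustrated with a short calculation. This is a minimal sketch assuming the variance formula v / (v − 2), which applies only when v is greater than 2. The function name `t_variance` is illustrative.

```python
import math

def t_variance(v):
    """Variance of the t-distribution with v degrees of freedom (requires v > 2)."""
    if v <= 2:
        raise ValueError("the formula v / (v - 2) requires v > 2")
    return v / (v - 2)

# Variance and standard deviation for a few degrees of freedom
for v in (3, 5, 10, 30, 1000):
    variance = t_variance(v)
    print(v, round(variance, 4), round(math.sqrt(variance), 4))
```

Both the variance and the standard deviation stay above 1 and move toward 1 as the degrees of freedom grow.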
Lesson Identifying Percentiles Using the t-Table
Objective
14 1. identifies percentiles using the t-table.
The t-Table
In finding the areas and percentiles for a t-distribution you need to familiarize yourself with
the t-table. You are going to use a table that is different from the z-table you used in finding the
area under the normal curve.
Below is an example of a t-table. It is a right-tailed t-table because the given areas in this
table are areas on the right tail of the t-distribution. Some t-tables are slightly different in format.
Look at the t-table below. In the first column in the left-most part, you have the degrees of freedom,
which range from 1 down to ∞. The first row in the upper part of the t-table represents the area
under the right tail of the t-distribution. The given areas range from 0.25 down to 0.0005.
The rest of the entries in the body of the table are the values of the variable t (t-values).
By looking at the table, you can see
that the t-value for an area of 0.10 in the
right tail of the t-distribution with 10 degrees
of freedom is 1.372. This is the intersection of
the row containing the 10 degrees of freedom
and the column containing the area of 0.10.
Similarly, the area to the right tail of a t-
distribution with 15 degrees of freedom
corresponding to the t-value of 2.249 is 0.02. Focus
on the row containing 15 degrees of freedom, then
look for the t-value of 2.249. The column that you
need is the column containing the area of 0.02.
Identifying Percentiles Using the t-Table
A percentile is a t-value below which a given percentage of
the distribution lies. For example, the 90th percentile of the
t-distribution is the t-value whose left-tail probability is 90%
and whose right-tail probability is 10%. Since the area under
the t-distribution curve also represents probability, the 90th
percentile of the t-distribution is the t-value whose area to its
left is 0.90 and whose area to its right is 0.10.
Illustrative Example 1
Find the 95th percentile of a t-distribution with 6 degrees
of freedom.
Since the area of the entire curve is 1, the area to the right of the 95th percentile is 0.05.
Hence, the 95th percentile is the value of the variable t that has an area of 0.05 to the right.
That means finding the 95th percentile is the same as looking for the t-value with an area to
the right of 0.05 under a t-distribution with 6 degrees of freedom.
Using the t-table, the intersection of the row for 6 degrees of freedom and the column for an
area of 0.05 gives 1.943. Hence, the 95th percentile is 1.943. That means the t-value of 1.943
has 95% of the area (0.95) to its left and an area of 0.05 to its right.
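The table lookup can be cross-checked without a t-table. The sketch below is one possible approach, not the method of this lesson: it adds up the area under the curve with Simpson's rule and then searches by bisection for the t-value whose left-tail area matches the requested percentile. It relies on the standard density formula of the t-distribution, and all function names are illustrative.

```python
import math

def t_pdf(t, v):
    """Height of the t-distribution curve with v degrees of freedom at the value t."""
    coef = math.gamma((v + 1) / 2) / (math.sqrt(v * math.pi) * math.gamma(v / 2))
    return coef * (1 + t * t / v) ** (-(v + 1) / 2)

def t_cdf(t, v, steps=2000):
    """Area to the left of t: Simpson's rule on [0, |t|], then symmetry about 0."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, v) + t_pdf(a, v)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, v)
    area = total * h / 3                  # area between 0 and |t|
    return 0.5 + area if t >= 0 else 0.5 - area

def t_percentile(p, v):
    """t-value whose left-tail area is p, found by bisection on [-50, 50]."""
    lo, hi = -50.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, v) < p:
            lo = mid                      # too little area to the left: move right
        else:
            hi = mid
    return (lo + hi) / 2

print(round(t_percentile(0.95, 6), 3))
```

The printed value should agree with the table entry for 6 degrees of freedom and a right-tail area of 0.05.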
Illustrative Example 2
Find the 5th percentile of a t-distribution with 6 degrees of freedom.
The 5th percentile is the value of the variable t that has an area of 5% or 0.05 to the left.
And since the area of the entire curve is 1, the area to the right of the 5th percentile is 0.95.
Hence, the 5th percentile is the value of the variable t that has an area of 0.95 to the right.
Therefore, finding the 5th percentile is the same as finding the t-value with an area to the
right of 0.95 under a t-distribution with 6 degrees of freedom.
But if you look at the given areas in the first row of the t-table, there is no entry for an
area of 0.95. You cannot find an area of 0.95 because your table is a right-tailed t-table.
That means it is set to display only the areas under the right tail of the t-distribution.
At this point, you need to recall one of the properties of
the t-distribution that it is symmetric about zero. That means
the right tail of the distribution is exactly the mirror image of its
left tail. So, you can easily find the values in the left tail by
relying on this “symmetry–about–zero” property. Hence, if you
are going to find the value of t such that the area to the left of it
is 0.05, recall that the area to the right of 1.943 is also 0.05
(See Illustrative Example 1).
Therefore, you can say that since the t-distribution is symmetric about 0, the t-value with
an area to the left of 0.05 must be -1.943. So, you will find that the 5th percentile is –1.943.
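The symmetry argument can be written compactly as follows. Here T is notation introduced only for this note, standing for a variable that follows the t-distribution with 6 degrees of freedom.

```latex
P(T \le -t) = P(T \ge t)
\quad\Longrightarrow\quad
P(T \le -1.943) = P(T \ge 1.943) = 0.05
\quad\Longrightarrow\quad
t_{5\text{th}} = -\,t_{95\text{th}} = -1.943
```

The same reasoning gives any percentile below the 50th as the negative of its mirror percentile above the 50th.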
Illustrative Example 3
What is the area to the right of 2.4 under a t-distribution with 7 degrees of freedom?
Remember that in the previous examples, you found t-values using the given areas under the
t-distribution curve. In this example, you will do the opposite: you are given a t-value and
you need to find the area to its right under the t-distribution with 7 degrees of freedom.
So, looking back at the table, you need to focus on the row for 7 degrees of freedom. You
will observe that the t-value of 2.4 cannot be found in this row, but you do find two values,
2.365 and 2.517, that surround 2.4 (the t-value 2.4 is between 2.365 and 2.517). These two
values correspond to right-tail areas of 0.025 and 0.02, respectively.
So, using the table you found that the area to the right of 2.4 under the t-distribution with 7
degrees of freedom lies somewhere between 0.02 and 0.025. If you need to get the exact value,
you need to use software that easily calculates the area under the t-distribution curve with the
given t-value and number of degrees of freedom. Using such software, you could find that the area
to five decimal places is 0.02373.
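The lesson does not name the software. As one possible stand-in, the area can be approximated with plain Python by adding up the area under the curve between 0 and 2.4 with Simpson's rule and subtracting it from 0.5, the area of the right half. The density formula used is the standard one for the t-distribution, and the function names are illustrative.

```python
import math

def t_pdf(t, v):
    """Height of the t-distribution curve with v degrees of freedom at the value t."""
    coef = math.gamma((v + 1) / 2) / (math.sqrt(v * math.pi) * math.gamma(v / 2))
    return coef * (1 + t * t / v) ** (-(v + 1) / 2)

def right_tail_area(t, v, steps=2000):
    """Area to the right of t (t >= 0): 0.5 minus the Simpson's-rule area on [0, t]."""
    h = t / steps
    total = t_pdf(0, v) + t_pdf(t, v)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, v)
    return 0.5 - total * h / 3

area = right_tail_area(2.4, 7)
print(round(area, 5))
```

The result should fall between 0.02 and 0.025, in line with the table, and land close to the five-decimal value quoted above.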
Lesson Identifying the Length of a Confidence Interval
Objective
15 1. identifies the length of a confidence interval.
What is the difference between the Confidence Level and Confidence interval?
The confidence level of an interval estimate of a parameter is the probability that the interval
estimate contains the parameter. It describes what percentage of intervals from many different
samples contain the unknown population parameter.
Each confidence level has a corresponding coefficient, called the confidence coefficient.
These coefficients are used to find the margin of error. The table below shows the confidence
coefficient for each confidence level.
Confidence Level   99%    98%    96%    95%    92%    90%     85%    80%    70%
                   0.99   0.98   0.96   0.95   0.92   0.90    0.85   0.80   0.70
Zc                 2.58   2.33   2.05   1.96   1.75   1.645   1.44   1.28   1.04
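The coefficients in the table can be reproduced from the standard normal distribution. The sketch below assumes the usual two-tailed definition, in which the coefficient for a confidence level c is the z-value with an area of (1 − c)/2 in the right tail. It uses Python's built-in `statistics.NormalDist`, and the function name `confidence_coefficient` is illustrative.

```python
from statistics import NormalDist

def confidence_coefficient(level):
    """z-value that leaves an area of (1 - level) / 2 in each tail."""
    alpha = 1 - level
    return NormalDist().inv_cdf(1 - alpha / 2)

# Reproduce a few entries of the table
for level in (0.99, 0.95, 0.90, 0.80):
    print(level, round(confidence_coefficient(level), 3))
```

Small differences from the table, such as 2.576 against 2.58, come only from rounding.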
A confidence interval, or interval estimate, is a range of values that is used to estimate a parameter.
This estimate may or may not contain the true parameter value.
We write it in the form Lower limit < μ < Upper limit, or (Lower limit, Upper limit).
The lower limit is obtained using the formula LL = x̄ − E, while the upper limit is obtained
using the formula UL = x̄ + E, where E is the margin of error and x̄ is the sample mean.
As mentioned earlier, the confidence coefficient is used in finding the margin of error, which is
the range of values above and below the sample statistic. The margin of error is obtained
using the formula

$$E = z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$$

where n = sample size, $z_{\alpha/2}$ = confidence coefficient, σ = population standard
deviation, and E = margin of error.
But with this lesson, the margin of error will be given as well as the sample mean.
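For reference, the margin of error formula can be sketched as a small function. The numbers in the usage lines (a 95% coefficient of 1.96, a population standard deviation of 10, and a sample size of 100) are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import math

def margin_of_error(z, sigma, n):
    """E = z * sigma / sqrt(n), with z the confidence coefficient."""
    return z * sigma / math.sqrt(n)

# Hypothetical values: 95% confidence, sigma = 10, n = 100
E = margin_of_error(1.96, 10, 100)
print(round(E, 2))
```

Holding z and σ fixed, a larger sample size n gives a smaller margin of error.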
Example:
A random sample of 46 scores from the examination of ABM learners is taken, and it gives a
sample mean of 78, with an interval estimate between 77.18 and 78.82 at a 90% level of
confidence.
Let’s answer the questions!
What is the sample mean x̄ in the given statement?
Since it is given in the statement above, the sample mean is 78.
What is the upper limit? What is the lower limit?
The upper limit is 78.82 while the lower limit is 77.18.
What is the margin of error in the given statement?
As we can see, the margin of error is not directly mentioned, but the lower limit and
upper limit are given. As mentioned earlier, the formulas for the upper limit and the lower
limit include the margin of error.
LL = x̄ − E              UL = x̄ + E
77.18 = 78 − E          78.82 = 78 + E
E = 78 − 77.18          E = 78.82 − 78
E = 0.82                E = 0.82
Therefore, the margin of error is 0.82.
What is the confidence interval in the given statement?
To find the confidence interval, we use Lower limit < μ < Upper limit and substitute the
given data. We have
77.18 < μ < 78.82 or (77.18, 78.82)
So, the confidence interval is between 77.18 and 78.82.
What is the confidence level? How will you conclude?
The confidence level is 90%. So, we are 90% confident that the mean score lies between
77.18 and 78.82.
Note: Sometimes, you just need to convert the formula to find what is missing.
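The rearrangements used in the example can be collected into one short sketch. It assumes the usual definition of the length of a confidence interval as the upper limit minus the lower limit, which equals twice the margin of error. The function name `summarize_interval` is illustrative.

```python
def summarize_interval(lower, upper):
    """Recover the sample mean, margin of error, and length from the two limits."""
    mean = (lower + upper) / 2      # the sample mean sits at the midpoint
    margin = (upper - lower) / 2    # E is half the distance between the limits
    length = upper - lower          # length of the interval, equal to 2E
    return mean, margin, length

# Limits taken from the example above
mean, margin, length = summarize_interval(77.18, 78.82)
print(round(mean, 2), round(margin, 2), round(length, 2))
```

With the limits 77.18 and 78.82, this gives back the sample mean of 78 and the margin of error of 0.82, so the interval has a length of 1.64.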
References: Statistics and Probability – Grade 11 Alternative Delivery Mode; Department of
Education – Region IV-A CALABARZON; Senior High Conceptual Math & Beyond Statistics and
Probability by Jose M. Ocampo, Ph.D. and Wilmer G. Marquez, M.A.
Quarter 3 WEEK STATISTICS AND PROBABILITY
Activity sheet
6
Name: __________________________________ Date:__________________
Strand:__________________________________
Directions: Give what is being asked. Write your final answer on this activity sheet only.
ACTIVITY 1 “Oh, Is That for Real?”
Most of us hate fake news, fake information, and even fake friends. We need to develop our
ability to distinguish what is real from what is not. Write “REAL” if the statement is true about the
t-distribution and “FAKE” if it’s not.
_________1. The t-distribution is used to estimate population parameters when the sample
size is small and/or the population variance is unknown.
_________2. The mean, median and mode are all equal to zero.
_________3. The variance is equal to 1.
_________4. The t-distribution curve is bell-shaped.
_________5. The standard deviation is always greater than 1.
_________6. Half of the total area under the t-distribution curve is equal to 1.
_________7. The curve is symmetrical about its zero.
_________8. The shape of the t-distribution curve depends on the sample mean.
_________9. The tails of the t-distribution curve approach the horizontal axis but never touch it.
_________10. As the degrees of freedom increase, the t-distribution curve looks more and more like
the normal distribution.
ACTIVITY 2
Give what is being asked. Use the t-table to find the answer. Write your final answer on the
space provided.
1. Find the 98th percentile of a t-distribution with 16 degrees of freedom. _______________________
2. Find the 80th percentile of a t-distribution with 30 degrees of freedom. _______________________
3. Find the 20th percentile of a t-distribution with 30 degrees of freedom. _______________________
4-5. What is the area to the right of 1.5 under a t-distribution with 25 degrees of freedom?
Between __________ and ___________
ACTIVITY 3
Given the statements below, answer the following questions.
Online selling
An online seller of yema cake, which is very popular in Tayabas, Quezon, surveyed
several customers. Fifty-two percent (52%) of the customers were satisfied with the
services that were offered, with a 3.98% margin of error. Determine the confidence
interval using this information.
1. The average percentage of people who are satisfied with the product is
_______________.
2. The lower limit of confidence interval is _________________.
3. The upper limit of confidence interval is _________________.
4. The confidence interval is _______________________.