3rd Quarter Stats
Statistics and Probability
WEEK 1 - 2
ACTIVITY
GRAB A COIN
TOSS IT
WHAT DID YOU GET?
HEAD OR TAIL?
2/8/2023
RANDOM VARIABLE
A numerical amount that is derived from the results of an arbitrary trial or experiment
A Random Variable is a function that associates a real number with each element in the
sample space. It is a variable whose values are determined by chance. Thus, in simple
words, a Random Variable is a numerical quantity that is derived from the outcomes of
a random experiment.
Random Variables
Discrete Random Variable
Discrete
Continuous
Discrete
Continuous
Discrete
Example 1. Suppose two coins are tossed. Let Z be the random variable representing the number of heads that occur. Find the values of the random variable Z.
Let T = tails and H = heads. Then the sample space for this experiment is S = {TT, TH, HT, HH}. Counting the heads in each outcome gives 0, 1, 1, and 2, so the possible values of Z are 0, 1, and 2.

Example 2. Suppose that four coins are tossed. Let Y be the random variable representing the number of heads that occur. Find the values of the random variable Y.
Let H be heads and T be tails. The sample space is
S = {HHHH, HHHT, HHTH, HHTT, HTHH, HTHT, HTTH, HTTT, THHH, THHT, THTH, THTT, TTHH, TTHT, TTTH, TTTT}
and there are 16 possible outcomes.
TREE DIAGRAM
Outcome   Number of heads (Y)
HHHH      4
HHHT      3
HHTH      3
HHTT      2
HTHH      3
HTHT      2
HTTH      2
HTTT      1
THHH      3
THHT      2
THTH      2
THTT      1
TTHH      2
TTHT      1
TTTH      1
TTTT      0
There are 5 distinct possible values of Y, that is 0, 1, 2, 3, and 4.
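The coin-toss examples can be checked with a short Python sketch. It enumerates the sample space and counts heads in each outcome; the function name `heads_distribution` is only an illustrative choice, not part of the lesson.

```python
from collections import Counter
from itertools import product

def heads_distribution(n_coins):
    """Return the sample space and, for each number of heads, how many outcomes produce it."""
    sample_space = ["".join(toss) for toss in product("HT", repeat=n_coins)]
    counts = Counter(outcome.count("H") for outcome in sample_space)
    return sample_space, dict(sorted(counts.items()))

sample_space, counts = heads_distribution(4)
print(len(sample_space))  # 16 outcomes
print(counts)             # {0: 1, 1: 4, 2: 6, 3: 4, 4: 1}
```

Calling `heads_distribution(2)` gives the two-coin case, where the values of the random variable are 0, 1, and 2.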
a. List the sample space in the given experiment. How many outcomes are
possible?
Let D represent the defective computer and N for the non-defective computer.
The sample space is: S= {NND, NDN, DNN, DND, DDN, NDD, DDD, NNN} and there are 8
possible outcomes
b. Construct a table showing the number of defective computers in each outcome and assign this number to this outcome. What is the value of the random variable X?
Count the number of defective computers in each outcome in the sample space and assign this number to this outcome. For instance, if you list NND, the number of defective computers is 1.
There are four possible values of the random variable X representing the number of defective computers. The possible values that X can take are 0, 1, 2, and 3.

c. Illustrate a probability distribution. What is the probability value P(X) of each value of the random variable?
Each of these numbers corresponds to an event in the sample space S of equally likely outcomes for this experiment. Since the value of the random variable X represents the number of defective computers,
X = 0 to (NNN),
X = 1 to (NND, NDN, DNN),
X = 2 to (DND, DDN, NDD),
X = 3 to (DDD).
d. What is the sum of the probabilities of all values of the random variable?
e. What do you notice about the probability of each value of the random
variable?
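Questions c to e can be explored with a small sketch. It assumes, as stated above, that the 8 outcomes are equally likely; the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space for three computers: N = non-defective, D = defective
outcomes = ["".join(o) for o in product("ND", repeat=3)]

# X = number of defective computers in each outcome
counts = Counter(o.count("D") for o in outcomes)

# Assumption: all outcomes are equally likely, so P(X = x) = count / 8
distribution = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

for x, p in distribution.items():
    print(x, p)                      # 0 1/8, 1 3/8, 2 3/8, 3 1/8
print(sum(distribution.values()))    # 1
```

Each probability lies between 0 and 1, and the probabilities add up to 1.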
X = 35 to {1}
X = 37 to {2,5}
X = 40 to {7}
X = 42 to {8, 10}
X = 45 to {4, 6, 10}
X = 50 to {3}
$P(X \ge 40)$
$P(X \ge 40) = P(40) + P(42) + P(45) + P(50)$
$P(X \ge 40) = \frac{1}{10} + \frac{1}{5} + \frac{3}{10} + \frac{1}{10}$
$P(X \ge 40) = \frac{7}{10}$
$P(X \ge 40) = 0.7 = 70\%$
b. What is the probability that the number of boxes delivered will be at least 37 but fewer than 50?
$P(37 \le X < 50)$
$P(37 \le X < 50) = P(37) + P(40) + P(42) + P(45)$
$P(37 \le X < 50) = \frac{1}{5} + \frac{1}{10} + \frac{1}{5} + \frac{3}{10}$
$P(37 \le X < 50) = \frac{8}{10} = \frac{4}{5}$
$P(37 \le X < 50) = 0.8 = 80\%$

c. What is the probability that at most 40 boxes will be delivered on a particular day?
$P(X \le 40)$
$P(X \le 40) = P(40) + P(37) + P(35)$
$P(X \le 40) = \frac{1}{10} + \frac{1}{5} + \frac{1}{10}$
$P(X \le 40) = \frac{4}{10}$
$P(X \le 40) = 0.4 = 40\%$
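$P(X \le 45)$
$P(X \le 45) = P(45) + P(42) + P(40) + P(37) + P(35)$
$P(X \le 45) = \frac{3}{10} + \frac{1}{5} + \frac{1}{10} + \frac{1}{5} + \frac{1}{10}$
$P(X \le 45) = \frac{9}{10}$
$P(X \le 45) = 0.9 = 90\%$

$P(40) + P(50)$
$P(40) + P(50) = \frac{1}{10} + \frac{1}{10}$
$P(40) + P(50) = \frac{2}{10} = \frac{1}{5}$
$P(40) + P(50) = 0.20 = 20\%$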
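The probabilities above can be verified with exact fractions. This is a minimal sketch; the dictionary `P` simply restates the probabilities used in the solutions, and the helper `prob` is an illustrative name.

```python
from fractions import Fraction

# Probability distribution of X = number of boxes delivered in a day
P = {
    35: Fraction(1, 10),
    37: Fraction(1, 5),
    40: Fraction(1, 10),
    42: Fraction(1, 5),
    45: Fraction(3, 10),
    50: Fraction(1, 10),
}

def prob(event):
    """Add P(x) over every value x that satisfies the condition `event`."""
    return sum(p for x, p in P.items() if event(x))

print(prob(lambda x: x >= 40))        # 7/10
print(prob(lambda x: 37 <= x < 50))   # 4/5
print(prob(lambda x: x <= 40))        # 2/5
print(prob(lambda x: x <= 45))        # 9/10
print(P[40] + P[50])                  # 1/5
```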
“Expected Value”
$= \frac{1}{n} \cdot \frac{n(n+1)}{2}$
$= \frac{n+1}{2}$
$E(X) = \frac{3 + 9 + 11 + 13 + 19}{5}$
$E(X) = \frac{55}{5} = 11$
Closely related measure of variability
Expected Value or Mean Value is the sum of the products of each possible value of a random variable and that value's probability.
In symbols,
$E(X) = \mu = \sum X \cdot P(X)$
STEPS IN FINDING THE MEAN
1. Multiply each value of the random variable by its probability.
2. Use the equation $\mu = \sum X \cdot P(X)$ to find the mean by adding all the products of each value of the random variable and its probability.
$E(X) = 0(0.03) + 1(0.05) + 2(0.12) + 3(0.30) + 4(0.28) + 5(0.22)$
$E(X) = \mu_x = 3.41$
Definition: If X is a random variable with mean $E(X) = \mu$, then the variance of X is defined by
$Var(X) = \sigma_x^2 = \sum_{\text{all possible values of } x} (x - \mu_x)^2 P(x)$
STEPS IN FINDING THE VARIANCE
1. Subtract the computed mean from each value of the random variable: $X - \mu$
2. Square the value obtained in Step 1: $(X - \mu)^2$
3. Multiply the value obtained in Step 2 by the given probability: $(X - \mu)^2 \cdot P(X)$
4. Use the equation $\sigma_x^2 = \sum (X - \mu)^2 P(X)$ to find the variance by adding all the values obtained in Step 3.

Table of values
X    P(X)    X·P(X)    X − μ    (X − μ)²    (X − μ)²·P(X)
0    0.03    0         -3.41    11.6281     0.3488
1    0.05    0.05      -2.41    5.8081      0.2904
2    0.12    0.24      -1.41    1.9881      0.2386
3    0.30    0.90      -0.41    0.1681      0.0504
4    0.28    1.12      0.59     0.3481      0.0975
5    0.22    1.10      1.59     2.5281      0.5562
E(X) = μ = 3.41
Σ(X − μ)²·P(X) = 1.5819
$\sigma_x = \sqrt{\sigma_x^2}$
$\sigma_x \approx 1.26$
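The steps for the mean, variance, and standard deviation translate directly into code. The sketch below uses the probability distribution from the table; the function name `mean_var_sd` is an illustrative assumption.

```python
from math import sqrt

def mean_var_sd(dist):
    """dist maps each value x to P(x). Returns (mean, variance, standard deviation)."""
    mu = sum(x * p for x, p in dist.items())               # sum of X * P(X)
    var = sum((x - mu) ** 2 * p for x, p in dist.items())  # sum of (X - mu)^2 * P(X)
    return mu, var, sqrt(var)

dist = {0: 0.03, 1: 0.05, 2: 0.12, 3: 0.30, 4: 0.28, 5: 0.22}
mu, var, sd = mean_var_sd(dist)
print(round(mu, 2), round(var, 4), round(sd, 2))  # 3.41 1.5819 1.26
```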
Statistics and
Probability
WEEK 3 - 4.
STANDARD DEVIATION
$\sigma_x = \sqrt{Var(X)} = \sqrt{\sigma_x^2}$
2. How does the assumed value of the outcome vary from the average number of dots that would appear?
To determine the variability of the assumed values from the mean, use the formulas for finding variance and
standard deviation. But first, construct a table like the one below:
Therefore, the variance of the random variable X (the number of dots that appear) is equal to 1.81, while the standard deviation is equal to 1.35.
Take note that a small variance or standard deviation means that the assumed values or data points tend to be very close to the mean, while a higher variance or standard deviation means that the assumed values or data points are spread out from the mean. Specifically, the variance and standard deviation measure how far a set of data (assumed values of a random variable) is spread out. Since the value of the standard deviation is 1.35, we can say that the assumed values of the outcomes typically lie within about 1.35 units of the mean.
To answer question 1 in the previous activity, you first have to understand the consequences of buying a ticket: will it give you advantages or disadvantages? Can you afford to spend extra money to buy a ticket?
Php 15,000.00
What if 1000 tickets were purchased by different individuals? What is the expected value of buying one ticket?
The expected value is also defined as the average value of a random variable over numerous trials of an experiment.
The table below is the probability distribution of the given situation.
Prize: Php 15,000.00
NET GAIN (x)    14,900                          -100
P(x)            1/1000 = 0.001 = 0.1%           999/1000 = 0.999 = 99.9%
Using the formula of expected value,
$E(x) = (14{,}900)(0.001) + (-100)(0.999)$
$E(x) = -85$
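The ticket computation can be written as a small function. This is a sketch under the assumption that the ticket costs Php 100, which is inferred from the net gains of 14,900 and -100 in the formula; the function and parameter names are illustrative.

```python
def expected_net_gain(prize, ticket_price, n_tickets):
    """Expected net gain of one ticket when exactly one of n_tickets wins the prize."""
    p_win = 1 / n_tickets
    p_lose = 1 - p_win
    gain_if_win = prize - ticket_price   # winner still paid for the ticket
    gain_if_lose = -ticket_price
    return gain_if_win * p_win + gain_if_lose * p_lose

# Assumed ticket price: Php 100 (inferred, not stated explicitly)
print(round(expected_net_gain(15_000, 100, 1000), 2))  # -85.0
```

A negative expected value means that, on average over many draws, a buyer loses Php 85 per ticket.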
3. THE MEAN, MEDIAN, AND MODE COINCIDE AT THE CENTER. This also means that in a normal distribution, or a distribution described by a normal curve, the mean, median, and mode are equal.
4. THE WIDTH OF THE CURVE IS DETERMINED BY THE STANDARD DEVIATION OF THE DISTRIBUTION.
5. THE TAILS OF THE CURVE ARE PLOTTED IN BOTH DIRECTIONS AND FLATTEN OUT INDEFINITELY ALONG THE HORIZONTAL AXIS.
6. THE TOTAL AREA UNDER A NORMAL CURVE IS 1. This means that the normal curve represents the probability, or the proportion, or the percentage associated with specific sets of measurement values.
The shape of a normal curve is based on the two given parameters, the mean and the standard deviation of the distribution. When comparing two distributions each described by the normal curve, the following are the three situations based on the said parameters:
a. When the means are not equal but the standard deviations are equal ($\mu_1 \ne \mu_2$; $\sigma_1 = \sigma_2$), the curves have a similar shape but are centered at different points.
b. When the means are equal but the standard deviations are not equal ($\mu_1 = \mu_2$; $\sigma_1 \ne \sigma_2$), the curves are centered at the same point but they have different heights and spreads.
c. When the means are different and the standard deviations are also different ($\mu_1 \ne \mu_2$; $\sigma_1 \ne \sigma_2$), the curves are centered at different points and vary in shape.

EMPIRICAL RULE
The EMPIRICAL RULE is better known as the 68% - 95% - 99.70% rule. This rule states that approximately 68%, 95%, and 99.70% of the data in the distribution lie within one (1), two (2), and three (3) standard deviations from the mean, respectively. Since the area under a normal curve is equal to 1 or 100%, as stated in its characteristics, only about 0.30% of the data fall outside 3 standard deviations from the mean. For instance, the distribution of the grades of the Senior High School students in Statistics and Probability for the Third Quarter is shown below in Figure 7.
• About 68% of the students, those within 1 standard deviation of the mean, have a grade of 83 to 91
• About 95% of the students, those within 2 standard deviations of the mean, have a grade of 79 to 95
• About 99.70% of the students, those within 3 standard deviations of the mean, have a grade of 75 to 99
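The three grade ranges follow from the mean plus or minus 1, 2, and 3 standard deviations. The sketch below assumes a mean of 87 and a standard deviation of 4, values inferred from the ranges above rather than stated in the text; `empirical_intervals` is an illustrative name.

```python
def empirical_intervals(mean, sd):
    """Return (percent, low, high) for 1, 2, and 3 standard deviations from the mean."""
    return [(pct, mean - k * sd, mean + k * sd)
            for k, pct in ((1, 68.0), (2, 95.0), (3, 99.7))]

# Assumed parameters (inferred from the ranges 83-91, 79-95, 75-99)
for pct, low, high in empirical_intervals(87, 4):
    print(f"about {pct}% of grades lie between {low} and {high}")
```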
Example 1.
The scores of Senior High School students in their Statistics and Probability quarterly examination are normally distributed with a mean of 35 and a standard deviation of 5.
Answer the following questions:
a. What percent of the scores are between 30 and 40?
b. What scores fall within 95% of the distribution?
[Figure: normal curve with the horizontal axis marked 20, 25, 30, 35, 40, 45, 50; areas between successive marks are 2.35%, 13.50%, 34%, 34%, 13.50%, 2.35%; bands labeled 68%, 95%, and 99.7%.]
a. The scores between 30 and 40 make up approximately 68% of the distribution.
b. Based on the distribution, the scores that fall within 95% of the distribution are from 25 to 45.
Example 2.
The district nurse of Candelaria East needs to measure the BMI (Body Mass Index) of the Alternative Learning System students. She found out that the heights of male students are normally distributed with a mean of 160 cm and a standard deviation of 7 cm. Find the percentage of male students whose height is within 153 cm to 174 cm.
As stated in Figure 8, 153 cm falls 1 standard deviation to the left of the mean, and the height of 174 cm falls 2 standard deviations to the right of the mean. Therefore, the interval covers the whole 68% plus 13.5% of the distribution, and the sum is 81.5%.
Example 1. Find the area to the left of -1.69
In the table, the row gives the whole number and tenths (-1.6) and the column gives the hundredths (0.09).
Therefore, the area to the left of -1.69 is 0.04551
STEPS ON HOW TO FIND THE AREA THAT CORRESPONDS TO A Z-VALUE
1. Draw/sketch a normal curve and locate the given z-value on the normal curve.
2. Shade the region of the curve according to the condition of the z-value, whether it is below, above, or between.
3. Use the table of the area under the normal curve to find the corresponding area.
4. Choose the appropriate operation based on steps 2 and 3.
• 4.1 When the region is to the left of the z-value or any related term (e.g. below, less than), just write the value obtained in step 3.
• 4.2 When the region is to the right of the z-value or any related term (e.g. above, greater than), subtract the value obtained in step 3 from 1.
• 4.3 When the shaded region is between two z-values, subtract the smaller value obtained in step 3 from the bigger one.
5. Label the shaded region and draw a conclusion.

Example 1. Find the area that corresponds to the region below z = -1.35
Solution:
Step 1. Draw/sketch a normal curve and locate the given z-value.
Step 2. Shade the region of the curve according to the condition of the z-value, whether it is below, above, or between.
Step 3. Use the table of the area under the normal curve to find the corresponding area for z = -1.35.
Step 4. Choose the appropriate operation based on steps 2 and 3.
4.1 When the region is to the left of the z-value or any related term (e.g. below, less than), just write the value obtained in step 3.
Step 5. Label the shaded region and draw a conclusion.
Example 2: Find the area to the right of z = -1.35.
Solution:
Step 1. Draw/sketch a normal curve and locate the given z-value.
Step 2. Shade the region of the curve according to the condition of the z-value, whether it is below, above, or between.
Step 3. Use the table of the area under the normal curve to find the corresponding area for z = -1.35.
Step 4. Choose the appropriate operation based on steps 2 and 3.
4.2 When the region is to the right of the z-value or any related term (e.g. above, greater than), subtract the value obtained in step 3 from 1.
Since the shaded region of the curve is to the right of z = -1.35 and the value at the intersection of row -1.3 and column 0.05 is 0.0885, we subtract it from 1. Therefore, the area of the shaded region is 1 - 0.0885 = 0.9115.
Step 5. Label the shaded region and draw a conclusion.

Example 3: Find the area between z = -1.30 and z = 2
Solution:
Step 1. Draw/sketch a normal curve and locate the given z-values.
Step 2. Shade the region of the curve according to the condition of the z-value, whether it is below, above, or between.
Step 3. Use the table of the area under the normal curve to find the corresponding area.
The value that corresponds to z = -1.3 is 0.0968 and the value that corresponds to z = 2 is 0.9772.
Step 4. Choose the appropriate operation based on steps 2 and 3.
4.3 When the shaded region is between two z-values, subtract the smaller value obtained in step 3 from the bigger one: 0.9772 - 0.0968 = 0.8804.
Step 5. Label the shaded region and draw a conclusion.
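The three cases in step 4 (left, right, between) can be sketched in Python. Instead of a printed table, this sketch uses `statistics.NormalDist` from the standard library to get the area to the left of a z-value; the function names are illustrative.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def area_left(z):
    """Case 4.1: area below z."""
    return STANDARD_NORMAL.cdf(z)

def area_right(z):
    """Case 4.2: subtract the area below z from 1."""
    return 1 - STANDARD_NORMAL.cdf(z)

def area_between(z1, z2):
    """Case 4.3: subtract the smaller area from the bigger one."""
    low, high = sorted((z1, z2))
    return STANDARD_NORMAL.cdf(high) - STANDARD_NORMAL.cdf(low)

print(round(area_left(-1.35), 4))        # 0.0885
print(round(area_right(-1.35), 4))       # 0.9115
print(round(area_between(-1.30, 2), 4))  # 0.8804
```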
To determine the z-score, consider the following:
Given any value x from a normal distribution with mean μ and standard deviation σ, to convert x to a z-score (standard normal score), you need to:
• Subtract the mean μ from x.
• Divide this quantity, x − μ, by the standard deviation σ.
The formula used in converting a random variable x to a standard normal variable z is:
$z = \frac{x - \mu}{\sigma}$
where
z – standard normal score or z-score
x – any data value in a normal distribution
μ – mean
σ – standard deviation

Example 1. Suppose IQ scores are normally distributed with a mean of 100 and a standard deviation of 10. If your IQ is 85, what is your z-score? (Round off your answer to the nearest hundredth.)
Solution:
The z-score can be computed using the formula $z = \frac{x - \mu}{\sigma}$. With μ = 100, σ = 10, and x = 85, the z-score is
$z = \frac{85 - 100}{10}$
$z = \frac{-15}{10}$
$z = -1.50$
Example 2. On a nationwide placement test that is normally distributed, the mean was 125 and the standard deviation was 15. If you scored 149, what was your z-score? (Round off your answer to the nearest hundredth.)
Solution:
The z-score can be computed using the formula $z = \frac{x - \mu}{\sigma}$. With μ = 125, σ = 15, and x = 149, the z-score is
$z = \frac{149 - 125}{15}$
$z = \frac{24}{15}$
$z = 1.60$

Solving the z-score formula for x:
$z = \frac{x - \mu}{\sigma}$
$z\sigma = x - \mu$
$x = z\sigma + \mu$
Example 3. The heights of teachers in Sta. Catalina National High School are normally distributed with a mean of 150 cm and a standard deviation of 15 cm. The height of Sir Victor has a z-score of 3.25. What is the actual height of Sir Victor? (Round off your answer to the nearest hundredth.)
Solution:
From the example, the given values are z = 3.25, σ = 15 cm, and μ = 150 cm. We need to determine x. Using the formula that we have derived, $x = z\sigma + \mu$, x will be
$x = z\sigma + \mu$
$x = (3.25)(15\ cm) + 150\ cm$
$x = 48.75\ cm + 150\ cm$
$x = 198.75\ cm$

Example 4. The time it takes for a cell to divide is normally distributed with an average of 60 minutes and a standard deviation of 5 minutes. How long will it take for a given cell to divide if its "mitosis" has a z-score of -1.35?
Solution:
From the example, the given values are z = -1.35, σ = 5 minutes, and μ = 60 minutes. We need to determine x. Using the formula that we have derived, $x = z\sigma + \mu$, x will be
$x = z\sigma + \mu$
$x = (-1.35)(5\ min) + 60\ min$
$x = -6.75\ min + 60\ min$
$x = 53.25\ minutes$
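Both directions of the conversion, from a raw value to a z-score and back, fit in two one-line functions. This is a minimal sketch; the names `z_score` and `raw_score` are illustrative.

```python
def z_score(x, mu, sigma):
    """z = (x - mu) / sigma"""
    return (x - mu) / sigma

def raw_score(z, mu, sigma):
    """x = z * sigma + mu"""
    return z * sigma + mu

print(round(z_score(85, 100, 10), 2))     # -1.5   (IQ example)
print(round(z_score(149, 125, 15), 2))    # 1.6    (placement test)
print(round(raw_score(3.25, 150, 15), 2)) # 198.75 (height in cm)
print(round(raw_score(-1.35, 60, 5), 2))  # 53.25  (minutes)
```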
A normal distribution curve can be used as a probability distribution curve for normally distributed variables. The area under the standard normal distribution curve can also be thought of as a probability. That is, if it is possible to select any z value at random, the probability of choosing one, say, below 1.45 would be the same as the area under the curve to the left of 1.45. In this case, the area is 0.9265. Therefore, the probability of randomly selecting a z value below 1.45 is 0.9265 or 92.65%. The problems involving probabilities and percentiles are solved in the same manner as finding the areas under a normal curve.
SOLUTION:
a. $P(Z < -1.05) = 0.1469$

SOLUTION:
b. $P(-0.75 < Z < 1.56)$
$P(Z < -0.75) = 0.2266$
$P(Z < 1.56) = 0.9406$
$P(-0.75 < Z < 1.56) = 0.94062 - 0.22663 = 0.7140 = 71.40\%$

SOLUTION:
c. $P(Z > -0.88)$
$P(Z < -0.88) = 0.1894$, so $P(Z > -0.88) = 1 - 0.1894 = 0.8106$

Example 2. Let X be a normal random variable with mean μ = 15 and standard deviation σ = 3. Find the probabilities of the following:
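SOLUTION:
$P(X < 19)$
Convert the normal random variable (NRV) value x = 19 to a z-score, with μ = 15 and σ = 3:
$z = \frac{x - \mu}{\sigma} = \frac{19 - 15}{3} = \frac{4}{3} = 1.33$
The area that corresponds to z = 1.33 is 0.9082.
$P(X < 19) = 0.9082 = 90.82\%$

SOLUTION:
$P(10 < X < 19)$
Convert x = 10 to a z-score, with μ = 15 and σ = 3:
$z = \frac{x - \mu}{\sigma} = \frac{10 - 15}{3} = \frac{-5}{3} = -1.67$
The area that corresponds to z = -1.67 is 0.0475, and the area that corresponds to z = 1.33 (x = 19) is 0.9082.
$P(10 < X < 19) = 0.9082 - 0.0475$
$P(10 < X < 19) = 0.8607 = 86.07\%$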
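The solution can be reproduced by standardizing each value and then looking up the area. To match a printed z-table, this sketch rounds z to two decimal places and the area to four; that rounding is an assumption about the table, and `table_area` is an illustrative name.

```python
from statistics import NormalDist

def table_area(x, mu, sigma):
    """Area to the left of x, with z rounded to 2 places and the area to 4 places."""
    z = round((x - mu) / sigma, 2)
    return round(NormalDist().cdf(z), 4)

left_of_19 = table_area(19, 15, 3)   # z = 1.33
left_of_10 = table_area(10, 15, 3)   # z = -1.67
print(left_of_19)                          # 0.9082
print(left_of_10)                          # 0.0475
print(round(left_of_19 - left_of_10, 4))   # 0.8607
```

Without the rounding of z, the exact areas differ slightly in the fourth decimal place.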
4/12/2023
Statistics and
Probability
WEEK 5 - 6
Random Sampling is a sampling method of choosing representatives from the population wherein every sample has
an equal chance of being selected.
RANDOM SAMPLING
SYSTEMATIC SAMPLING
A random sampling that uses a list of all the elements in the population and then selects elements at consistent intervals, every kth element. To get the interval k, divide the population size by the sample size.
Example
All employees of the company are listed in alphabetical order. From the first 10 numbers, you randomly select a starting point: number 6. From number 6 onwards, every 10th person on the list is selected (6, 16, 26, 36, and so on), and you end up with a sample of 100 people.

STRATIFIED SAMPLING
A random sampling wherein the population is divided into different strata or divisions. Samples are picked from each stratum in proportion to its size, which is why all strata are represented in the sample.
Example
The company has 800 female employees and 200 male employees. You want to ensure that the sample reflects the gender balance of the company, so you sort the population into two strata based on gender. Then you use random sampling on each group, selecting 80 women and 20 men, which gives you a representative sample of 100 people.
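The two examples above can be sketched with Python's `random` module. The employee lists here are made-up stand-ins (numbers instead of names), and the function names are illustrative assumptions.

```python
import random

def systematic_sample(population, sample_size, rng):
    """Pick a random start within the first k elements, then take every kth element."""
    k = len(population) // sample_size   # sampling interval
    start = rng.randrange(k)             # random starting index among the first k
    return population[start::k][:sample_size]

def stratified_sample(strata, sample_size, rng):
    """Sample each stratum in proportion to its share of the population.
    Rounding may need adjusting when the shares are not whole numbers."""
    total = sum(len(members) for members in strata.values())
    return {name: rng.sample(members, round(sample_size * len(members) / total))
            for name, members in strata.items()}

rng = random.Random(2023)                # fixed seed so the run is repeatable
employees = list(range(1, 1001))         # hypothetical list of 1000 employees
systematic = systematic_sample(employees, 100, rng)

strata = {"female": list(range(1, 801)), "male": list(range(801, 1001))}
stratified = stratified_sample(strata, 100, rng)
print(len(systematic), len(stratified["female"]), len(stratified["male"]))  # 100 80 20
```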
CLUSTER SAMPLING
a random sampling wherein population is divided into clusters or
groups and then the clusters are randomly selected. All elements of
the clusters randomly selected are considered the samples of the
study.
PARAMETER
A measure that is used to describe the population.
EXAMPLE:
• In 2010, the population of the town numbered about 5000
• The number of students enrolled at the college is 1500

STATISTIC
A measure that is used to describe the sample.
EXAMPLE:
• According to a survey of 606 city residents, garbage collection was the city service people liked most.
• A survey of 2000 federation members had shown that 48% believed police should have the right to take industrial action.
POPULATION MEAN 𝝁
The mean is the sum of the data divided by the number of data. The mean is used to describe where the set
of data tends to concentrate at a certain point.
Population mean is the mean computed based on the elements of the population or data.
The symbol µ (read as “mu”) is used to represent population mean.
To compute the population mean, we simply add all the data (X) and then divide the sum by the number of elements in the population (N).
$\mu = \frac{\sum X}{N}$
where:
µ = the population mean
∑X = the summation of X (sum of the measures)
N = number of elements in the population
Grades in Statistics of Grade 11 Students during the Third Quarter
Student Number    Grade (X)
1                 94
2                 85
3                 88
4                 79
5                 78
6                 75
7                 89
8                 91
9                 84
10                77
N = 10
$\mu = \frac{\sum X}{N}$
$\mu = \frac{94 + 85 + 88 + 79 + 78 + 75 + 89 + 91 + 84 + 77}{10}$
$\mu = \frac{840}{10}$
$\mu = 84$

POPULATION VARIANCE AND POPULATION STANDARD DEVIATION
Variance and standard deviation describe how spread out or scattered the data in a set are from the mean.
Standard deviation is simply the square root of the variance.
Population variance is the computed variance of the elements of the population. The symbol $\sigma^2$ (read as "sigma squared") is used to represent population variance.
$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$
where:
σ² = population variance
X = given data
µ = the population mean
N = number of elements in the population
Population standard deviation is the computed standard deviation of the elements of the population. The symbol σ (read as "sigma") is used to represent population standard deviation.
$\sigma = \sqrt{\sigma^2}$
$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$
where:
σ = population standard deviation
σ² = population variance
X = given data
µ = the population mean
N = number of elements in the population

Grades in Statistics of Grade 11 Students during the Third Quarter
Student Number    Grade (X)    X − μ    (X − μ)²
1                 94           10       100
2                 85           1        1
3                 88           4        16
4                 79           -5       25
5                 78           -6       36
6                 75           -9       81
7                 89           5        25
8                 91           7        49
9                 84           0        0
10                77           -7       49
N = 10    ΣX = 840    Σ(X − μ)² = 382
μ = 84
$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{382}{10}$
$\sigma^2 = 38.2$
$\sigma = \sqrt{38.2}$
$\sigma \approx 6.18$
SAMPLE MEAN $\bar{x}$
The sample mean is the average of all the data of the samples.
The symbol $\bar{x}$ (read as "x bar") is used to represent the sample mean.
To compute the sample mean, we simply add all the data and divide the sum by the number of elements in the sample (n).
$\bar{x} = \frac{\sum x}{n}$
where:
$\bar{x}$ = the sample mean
∑x = the summation of x (sum of the measures)
n = number of elements in the sample

Grades in Statistics of Grade 11 Students during the Third Quarter
Student Number    Grade (Population)    Grade (Sample)
1                 94                    94
2                 85
3                 88                    88
4                 79                    79
5                 78
6                 75
7                 89                    89
8                 91                    91
9                 84                    84
10                77                    77
N = 10    n = 7    Σx = 602
$\bar{x} = \frac{\sum x}{n}$
$\bar{x} = \frac{94 + 88 + 79 + 89 + 91 + 84 + 77}{7}$
$\bar{x} = \frac{602}{7}$
$\bar{x} = 86$
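The grade computations above can be checked with the standard library's `statistics` module. `pvariance` and `pstdev` are the population versions (dividing by N); the variable names are illustrative.

```python
from statistics import mean, pstdev, pvariance

population = [94, 85, 88, 79, 78, 75, 89, 91, 84, 77]   # all 10 grades
sample = [94, 88, 79, 89, 91, 84, 77]                   # the 7 sampled grades

mu = mean(population)               # population mean
sigma_sq = pvariance(population)    # population variance (divides by N)
sigma = pstdev(population)          # population standard deviation
x_bar = mean(sample)                # sample mean

print(mu, round(sigma_sq, 1), round(sigma, 2), x_bar)   # 84 38.2 6.18 86
```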
where:
$s^2$ = sample variance
x = given data
$\bar{x}$ = the sample mean
n = number of elements in the sample

where:
s = sample standard deviation
x = given data
$\bar{x}$ = the sample mean
n = number of elements in the sample
1. Determine the number of sets of all possible random samples that can be drawn from the given
population by using the formula, NCn, where 𝑁 is the population size and 𝑛 is the sample size.
2. List all the possible random samples and solve for the sample mean of each set of samples.
3. Construct a frequency and probability distribution table of the sample means indicating its number
of occurrence or the frequency and probability.
Example:
A population of Senior High School consists of the numbers 1, 2, 3, 4, and 5. Create a sampling distribution of size 3.
STEP 1
$_{N}C_{n} = {}_{5}C_{3} = \frac{N!}{n!(N-n)!} = \frac{5!}{3!(5-3)!} = \frac{5!}{3!\,(2!)} = 10$
STEP 2 : List all the possible random samples and solve for the sample mean of each set of samples.
The 10 possible samples are (1,2,3), (1,2,4), (1,2,5), (1,3,4), (1,3,5), (1,4,5), (2,3,4), (2,3,5), (2,4,5), and (3,4,5).
SAMPLE    SAMPLE MEAN
1,2,3     2
1,2,4     2.33
1,2,5     2.67
1,3,4     2.67
1,3,5     3
1,4,5     3.33
2,3,4     3
2,3,5     3.33
2,4,5     3.67
3,4,5     4
STEP 3 : Construct a frequency and probability distribution table of the sample means indicating its number of occurrence or the frequency and probability
Sample Mean    Frequency    Probability P(x)
2              1            1/10
2.33           1            1/10
2.67           2            1/5
3              2            1/5
3.33           2            1/5
3.67           1            1/10
4              1            1/10
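The three steps can be sketched with `itertools.combinations`, which lists every sample of size n drawn without replacement. Exact fractions are used so that 2.33 appears as 7/3; the function name `sampling_distribution` is illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

def sampling_distribution(population, n):
    """Map each possible sample mean to its probability (samples drawn without replacement)."""
    samples = list(combinations(population, n))          # Steps 1 and 2: all NCn samples
    means = [Fraction(sum(s), n) for s in samples]       # sample mean of each sample
    frequency = Counter(means)                           # Step 3: frequency of each mean
    return {m: Fraction(f, len(samples)) for m, f in sorted(frequency.items())}

dist = sampling_distribution([1, 2, 3, 4, 5], 3)
for sample_mean, probability in dist.items():
    print(round(float(sample_mean), 2), probability)

mean_of_means = sum(m * p for m, p in dist.items())
print(mean_of_means)  # 3, the same as the population mean
```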
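$E(\bar{X}) = \mu_{\bar{X}} = \bar{X}_1 P(\bar{X}_1) + \bar{X}_2 P(\bar{X}_2) + \cdots + \bar{X}_n P(\bar{X}_n) = \sum \bar{X} \, P(\bar{X})$
The mean of the sampling distribution of the Sample Mean is given by
$\mu_{\bar{X}} = \sum \bar{X} \, P(\bar{X})$
$Var(\bar{X}) = \sigma_{\bar{X}}^2 = \sum (\bar{X} - \mu)^2 P(\bar{X})$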
Or
$\sigma_{\bar{X}}^2 = \sum \bar{X}^2 P(\bar{X}) - \mu^2$
$\sigma_{\bar{X}}^2 = \frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1}$ for finite population
where:
$\mu$ = Population mean
$\sigma^2$ = Population variance
$n$ = Sample size
$N$ = Population size
$\bar{X}$ = Sample mean
$P(\bar{X})$ = Probability of the sample mean
$(\bar{X} - \mu)^2$ = square of the difference between the sample mean and the population mean
$\sum P(\bar{X})(\bar{X} - \mu)^2$ = summation of the products of the probability of the sample mean and the square of the difference between the sample mean and the population mean
$\sum \bar{X}^2 P(\bar{X})$ = sum of the products of the square of the sample mean and the probability of the sample mean
1. What is the mean and variance of the sampling distribution of the sample
means?
2. Compare these values to the mean and variance of the population
Determine whether the following statements have a known or unknown population variance.
1. Because of Inclusive Education, learners with disabilities are also part of the normal students. Consider a population of the PWD learners consisting of 1, 2, 3, 4, and 5. Samples of size 2 are drawn from this population.
Answer: KNOWN
Statistics and
Probability
JOHN MARK DE CHAVEZ
QUEZON NATIONAL HIGH SCHOOL – SENIOR HIGH SCHOOL
CENTRAL LIMIT THEOREM
If random samples of size n are drawn from a population, then as n becomes large, the sampling distribution of the mean approaches a normal distribution, regardless of the shape of the population distribution.

Consider the population of Senior High School consisting of the values 1, 2, 3, 4, 5, and 6. Compute the following:
1. Population Mean
2. Population Variance
3. Population Standard Deviation
4. Illustrate the Probability Histogram of the sampling distribution of the means
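1. Population Mean
$\mu = \frac{\sum X}{N}$ (add each data value, then divide by the population size)
$\mu = \frac{1 + 2 + 3 + 4 + 5 + 6}{6} = \frac{21}{6} = 3.5$

2. Population Variance
X    Data – Population Mean (X − μ)    Square of the Data – Population Mean (X − μ)²
1    1 − 3.5 = −2.5                    (−2.5)² = 6.25
2    2 − 3.5 = −1.5                    (−1.5)² = 2.25
3    3 − 3.5 = −0.5                    (−0.5)² = 0.25
4    4 − 3.5 = 0.5                     (0.5)² = 0.25
5    5 − 3.5 = 1.5                     (1.5)² = 2.25
6    6 − 3.5 = 2.5                     (2.5)² = 6.25
$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$
Substitute:
$\sigma^2 = \frac{17.5}{6}$
$\sigma^2 \approx 2.92$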
3. Population Standard Deviation
$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (X - \mu)^2}{N}}$
$\sigma \approx \sqrt{2.92}$
$\sigma \approx 1.71$

4. Illustrate the Probability Histogram of the sampling distribution of the means
X    Probability P(X)
1    1/6
2    1/6
3    1/6
4    1/6
5    1/6
6    1/6
N = 6
ΣP(X) = 1
$\mu_{\bar{X}} = \frac{\sum \bar{X}}{N}$
Source: Merle, R. (2020), “Statistics and Probability, ADM Module: Illustrating the Central Limit Theorem (First Edition)”, Department of Education [DepEd], Gate 2 Karangalan Village, Barangay San Isidro Cainta, Rizal 1800
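1. Mean of the sampling distribution of the sample mean
$\mu_{\bar{X}} = \frac{\sum \bar{X}}{N}$
$\mu_{\bar{X}} = \frac{126}{36}$
$\mu_{\bar{X}} = 3.5$

2. Variance of the sampling distribution of the sample mean
$\sigma_{\bar{X}}^2 = \frac{\sum (\bar{X} - \mu_{\bar{X}})^2}{N}$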
$\sigma_{\bar{X}}^2 = \frac{\sum (\bar{X} - \mu_{\bar{X}})^2}{N}$
$\sigma_{\bar{X}}^2 = \frac{52.5}{36}$
$\sigma_{\bar{X}}^2 \approx 1.46$
3. Standard deviation of the sampling distribution of the sample mean
$\sigma_{\bar{X}} = \sqrt{\sigma_{\bar{X}}^2} = \sqrt{\frac{\sum (\bar{X} - \mu_{\bar{X}})^2}{N}}$
$\sigma_{\bar{X}} = \sqrt{\frac{52.5}{36}}$
$\sigma_{\bar{X}} \approx 1.21$

4. Illustrate the probability histogram of the sampling distribution of the mean.
Sample mean    f    Probability
1              1    1/36
1.5            2    2/36
2              3    3/36
2.5            4    4/36
3              5    5/36
3.5            6    6/36
4              5    5/36
4.5            4    4/36
5              3    3/36
5.5            2    2/36
6              1    1/36
4. Illustrate the probability histogram of the sampling distribution of the mean.

COMPARISON
Mean
Population Mean: $\mu = 3.5$
Mean of the Sampling Distribution: $\mu_{\bar{X}} = 3.5$
$\mu = \mu_{\bar{X}}$
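COMPARISON
Variance: $1.46 = \frac{2.92}{2}$, that is, $\sigma_{\bar{X}}^2 = \frac{\sigma^2}{n}$
Standard deviation: $1.21 = \frac{1.71}{\sqrt{2}}$, that is, $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$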
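The comparison can be verified by listing every sample. The sketch below assumes samples of size 2 drawn with replacement from the population 1 to 6, which is what the 36 samples and the divisor n = 2 suggest; the helper name `mean_and_variance` is illustrative.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4, 5, 6]
n = 2  # assumed sample size, drawn with replacement (6 * 6 = 36 samples)

def mean_and_variance(values):
    """Mean and variance of a list, dividing by the number of values."""
    mu = Fraction(sum(values), len(values))
    variance = sum((v - mu) ** 2 for v in values) / len(values)
    return mu, variance

pop_mean, pop_var = mean_and_variance(population)
sample_means = [Fraction(sum(s), n) for s in product(population, repeat=n)]
sm_mean, sm_var = mean_and_variance(sample_means)

print(float(pop_mean), round(float(pop_var), 2))  # 3.5 2.92
print(float(sm_mean), round(float(sm_var), 2))    # 3.5 1.46
print(sm_var == pop_var / n)                      # True
```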
COMPARISON
Histogram
[Figure: histograms of the distribution for n = 1 and n = 2.]

CENTRAL LIMIT THEOREM
The distribution of the sample mean tends toward the normal distribution as the sample size increases, regardless of the distribution from which we are sampling. As a simple guideline, the sample mean can be considered approximately normally distributed if the sample size is at least 30 (n ≥ 30).
If the population is not normally distributed, or if we don't know its distribution, the Central Limit Theorem allows us to conclude that the distribution of the sample mean will be approximately normal if the sample size is sufficiently large.
It is generally accepted that a sample size of at least 30 is large enough to conclude that the Central Limit Theorem will ensure a normal distribution in the sampling process regardless of the distribution of the original population. Further, we can continue to use the z conversion formula in our calculations. This time we will use the formula
$z = \frac{\bar{X} - \mu}{\frac{\sigma}{\sqrt{n}}}$

Example: Suppose that the average age of the people living in a Barangay is 34 with a standard deviation of 4. If 100 residents of a certain Barangay decided to take a summer outing for bonding and relaxation after the COVID-19 pandemic and the Enhanced Community Quarantine has been lifted, what is the probability that the average age of these residents is less than 35?
Step 1: Write the given data
$\mu = 34$; $\sigma = 4$; $\bar{X} = 35$; $n = 100$
Step 2: Convert the raw score to the standard score using the formula.
$z = \frac{\bar{X} - \mu}{\frac{\sigma}{\sqrt{n}}} = \frac{35 - 34}{\frac{4}{\sqrt{100}}} = \frac{10}{4} = 2.5$
Step 3: Use the Z table to find P(Z < 2.5).
$P(Z < 2.5) = 0.9938 = 99.38\%$
Therefore, the probability that the random sample of 100 persons has an average age of less than 35 years is 0.9938 or 99.38%.
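The three steps of the example can be sketched as follows. `NormalDist` from the standard library stands in for the z table, and the function name `prob_sample_mean_below` is an illustrative assumption.

```python
from math import sqrt
from statistics import NormalDist

def prob_sample_mean_below(x_bar, mu, sigma, n):
    """P(sample mean < x_bar), using z = (x_bar - mu) / (sigma / sqrt(n))."""
    standard_error = sigma / sqrt(n)
    z = (x_bar - mu) / standard_error
    return z, NormalDist().cdf(z)

z, p = prob_sample_mean_below(35, mu=34, sigma=4, n=100)
print(round(z, 2), round(p, 4))  # 2.5 0.9938
```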
A sampling distribution of the sample mean is a frequency distribution of the sample mean computed from all possible random samples of a specific size n taken from a population.
The probability distribution of the sample mean is also called the sampling distribution of the sample mean.
The standard deviation of the sampling distribution of the sample mean is also known as the standard error of the mean.
Here are the steps to solve problems involving the sampling distribution of the sample mean.
Step 1: Identify the given information
If the problem is dealing with an individual data value obtained from the population, we will be using the formula $z = \frac{X - \mu}{\sigma}$.
If the problem is dealing with data about the sample mean or n observations, we will be using the formula $z = \frac{\bar{X} - \mu}{\frac{\sigma}{\sqrt{n}}}$.
1. What is the probability that a randomly selected senior high school student will complete the
EXAMPLE:
examination in less than 48 minutes?
Probable Time Step 1: Identify the given information Step 4: Compute for the Probability
The mean time it takes a group of senior high students to complete a certain examination is 50.6 𝜇 = 50.6 𝑋−𝜇
minutes. The standard deviation is 6 minutes. Assume that the variable is normally distributed. 𝑧 =
𝜎=6 𝜎
𝑋 = 48 48 − 50.6
𝑧 = = −𝟎. 𝟒𝟑
1. What is the probability that a randomly selected senior high school student will complete the 6
Step 2: Identify what is asked for:
examination in less than 48 minutes?
𝑃(𝑋 < 48)
Find 𝑃(𝑋 < 48) by getting the area under the normal
2. If 49 randomly selected senior high school students take the examination, what is the Step 3: Identify the formula to be used curve. 𝑃(𝑋˂ 48) = 𝑃(𝑧 ˂ − 0.43) = 𝟎. 𝟑𝟑𝟑𝟔
probability that the mean time it takes the group to complete the test will be less than 48 minutes? The problem is dealing with an individual data
obtained from the population so the formula to be
3. If 49 randomly selected senior high school students take the examination, what is the used is 𝑧 = to convert 48 to standard score
probability that the mean time it takes the group to complete the test will be more than 51 minutes?
4. If 49 randomly selected senior high students take the examination, what is the probability that
the mean time it takes the group to complete the test is between 47.8 and 53 minutes? Therefore, the probability that a randomly selected college student will complete
the examination in less than 48 minutes is 0.3336 or 33.36%
2. If 49 randomly selected senior high school students take the examination, what is the probability that the mean time it takes the group to complete the test will be less than 48 minutes?
Step 1: Identify the given information
Step 4: Compute for the Probability
3. If 49 randomly selected senior high school students take the examination, what is the probability that the mean time it takes the group to complete the test will be more than 51 minutes?
Step 1: Identify the given information
Step 4: Compute for the Probability
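The worked solutions for questions 2 and 3 did not survive in this copy of the slides, so the following is a sketch of how they could be computed with the sample-mean formula z = (X̄ − μ) / (σ / √n). It assumes the same given values (μ = 50.6, σ = 6, n = 49) and uses Python's `statistics.NormalDist` in place of the z-table. The helper name `z_for_mean` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 50.6, 6.0, 49        # given information from the example
standard_error = sigma / sqrt(n)    # sigma / sqrt(n), the spread of the sample mean

def z_for_mean(x_bar):
    """Standard score of a sample mean: (x_bar - mu) / (sigma / sqrt(n))."""
    return (x_bar - mu) / standard_error

# Question 2: P(sample mean < 48) is the area to the LEFT of z
z2 = round(z_for_mean(48), 2)
p2 = NormalDist().cdf(z2)

# Question 3: P(sample mean > 51) is the area to the RIGHT of z
z3 = round(z_for_mean(51), 2)
p3 = 1 - NormalDist().cdf(z3)

print(f"Question 2: z = {z2}, P = {p2:.4f}")
print(f"Question 3: z = {z3}, P = {p3:.4f}")
```

For a "more than" question, the area to the right is 1 minus the area to the left that a cumulative table gives.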
4. If 49 randomly selected senior high students take the examination, what is the probability that the mean time it takes the group to complete the test is between 47.8 and 53 minutes?
Step 1: Identify the given information
μ = 50.6
σ = 6
X̄ = 47.8 and 53
n = 49
Step 2: Identify what is asked for:
P(47.8 < X̄ < 53)
Step 3: Identify the formula to be used
The problem is dealing with data about the sample mean or n observations, so the formula to be used to standardize 47.8 and 53 is z = (X̄ − μ) / (σ / √n).
Step 4: Compute for the Probability
(4.a) P(X̄ < 47.8)
z = (X̄ − μ) / (σ / √n) = (47.8 − 50.6) / (6 / √49) = −3.27
Find P(X̄ < 47.8) by getting the area under the normal curve.
P(X̄ < 47.8) = P(z < −3.27) = 0.0005
(4.b) P(X̄ < 53)
z = (X̄ − μ) / (σ / √n) = (53 − 50.6) / (6 / √49) = 2.8
Find P(X̄ < 53) by getting the area under the normal curve.
P(X̄ < 53) = P(z < 2.8) = 0.9974
To find the probability that the mean time of 49 randomly selected senior high school students is between 47.8 and 53 minutes, subtract the smaller area from the bigger area under the normal curve.
That is, 0.9974 − 0.0005 = 0.9969 or 99.69%.
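The "subtract the smaller area from the bigger area" step can be sketched in code as the difference of two left-tail areas. The script assumes Python's `statistics.NormalDist` as a substitute for the z-table; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 50.6, 6.0, 49
# Sampling distribution of the sample mean: mean mu, spread sigma / sqrt(n)
sampling_dist = NormalDist(mu, sigma / sqrt(n))

lower, upper = 47.8, 53.0

# Standardize both bounds, rounded to two decimals as in a z-table
z_lower = round((lower - mu) / sampling_dist.stdev, 2)
z_upper = round((upper - mu) / sampling_dist.stdev, 2)

# Left-tail areas for each bound
area_lower = NormalDist().cdf(z_lower)
area_upper = NormalDist().cdf(z_upper)

# Area between the bounds = bigger area - smaller area
p_between = area_upper - area_lower

print(f"z values: {z_lower} and {z_upper}")
print(f"areas: {area_lower:.4f} and {area_upper:.4f}")
print(f"P(47.8 < sample mean < 53) = {p_between:.4f}")
```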
Illustrating the t-Distribution
T - DISTRIBUTION
When n < 30 and/or the population variance is unknown, we use the t-table.
Z-score: z = (x̄ − μ) / (σ / √n), where σ is the population standard deviation and n is the sample size.
Instead of using the "population" standard deviation, σ, you are going to use your "sample" standard deviation, s, to estimate it.
(x̄ − μ) / (σ / √n) becomes (x̄ − μ) / (s / √n)
t = (x̄ − μ) / (s / √n)
This statistic follows the t-distribution with n − 1 degrees of freedom.
Replacing σ with s changes (x̄ − μ) / (σ / √n) into (x̄ − μ) / (s / √n).
Note that the number of degrees of freedom is one less than the sample size.
So, if the sample size n is 25, the number of degrees of freedom is 24. Similarly, for a t-distribution having 16 degrees of freedom, the sample size is 17.
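The t statistic and its degrees of freedom can be sketched as follows. The sample values are hypothetical and made up only for illustration, and the function name `t_statistic` is an assumption, not something from the lesson.

```python
from math import sqrt
from statistics import mean, stdev

def t_statistic(sample, mu):
    """Return (t, degrees of freedom) for t = (x_bar - mu) / (s / sqrt(n))."""
    n = len(sample)
    x_bar = mean(sample)   # sample mean
    s = stdev(sample)      # SAMPLE standard deviation (divides by n - 1)
    t = (x_bar - mu) / (s / sqrt(n))
    return t, n - 1        # degrees of freedom is one less than the sample size

# Hypothetical completion times (minutes) for a small sample of 8 students
times = [48, 52, 51, 47, 55, 50, 49, 53]
t, df = t_statistic(times, mu=50.6)

print(f"n = {len(times)}, degrees of freedom = {df}")
print(f"t = {t:.4f}")
```

Note that `statistics.stdev` is the sample standard deviation s, while `statistics.pstdev` would be the population version σ.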
PROPERTIES OF T – DISTRIBUTION
2. The t-distribution is bell-shaped like the normal distribution but has heavier tails.
3. The mean, median, and mode of the t-distribution are all equal to zero.
4. The variance is always greater than 1. It is equal to v / (v − 2), where v is the number of degrees of freedom (v > 2).
5. As the degrees of freedom increase, the t-distribution curve looks more and more like the normal distribution.
6. The standard deviation of the t-distribution varies with the sample size.
7. The total area under a t-distribution curve is 1 or 100%.
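Properties 4 to 6 can be illustrated numerically with the variance formula v / (v − 2). The sketch below assumes v > 2, since the formula is undefined otherwise; the function name `t_variance` is hypothetical.

```python
from math import sqrt

def t_variance(v):
    """Variance of a t-distribution with v degrees of freedom: v / (v - 2)."""
    if v <= 2:
        raise ValueError("the formula v / (v - 2) requires v > 2")
    return v / (v - 2)

# The variance stays above 1 and shrinks toward 1 (the standard normal
# variance) as the degrees of freedom increase.
for v in (3, 5, 10, 30, 100, 1000):
    var = t_variance(v)
    print(f"df = {v:>4}: variance = {var:.4f}, standard deviation = {sqrt(var):.4f}")
```

Because the degrees of freedom are n − 1, the standard deviation changes whenever the sample size changes.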
[Figure: t-table of T VALUES by degrees of freedom, with the value 1.372 highlighted]
IDENTIFYING PERCENTILES USING THE T – TABLE
A percentile is a value on a t-distribution below which the given percentage of the distribution lies. For example, the probability of getting a value less than the 95th percentile is 0.95.
Illustrative Example 1.
Find the 95th percentile of a t-distribution with 6 degrees of freedom.
References: Department of Education [DepEd], Region IV A – CALABARZON (2020), Statistics and Probability, ADM Module (First Edition), Gate 2, Karangalan Village, San Isidro, Cainta Rizal, 1800
Illustrative Example 2.
Find the 5th percentile of a t-distribution with 6 degrees of freedom.
[Figure: t-table lookup for 6 degrees of freedom, with the value 1.943 highlighted]
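For 6 degrees of freedom, the 95th percentile is 1.943 and, because the t-distribution is symmetric about zero, the 5th percentile is −1.943.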
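The percentile lookups in the two illustrative examples can be cross-checked without a printed table. The sketch below is one possible approach, not the method used in the module: it integrates the t-distribution density with Simpson's rule and finds the percentile by bisection, using only the standard library. All function names are hypothetical.

```python
from math import exp, lgamma, log, pi

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + x * x / df))

def t_cdf(x, df, steps=2000):
    """Area to the left of x, using symmetry plus Simpson's rule on [0, |x|]."""
    a = abs(x)
    h = a / steps                        # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                 # area between 0 and |x|
    return 0.5 + area if x >= 0 else 0.5 - area

def t_percentile(p, df, lo=-100.0, hi=100.0):
    """Value with area p to its left, found by bisection on the CDF."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

p95 = t_percentile(0.95, 6)   # Illustrative Example 1
p05 = t_percentile(0.05, 6)   # Illustrative Example 2

print(f"95th percentile, 6 df: {p95:.3f}")
print(f" 5th percentile, 6 df: {p05:.3f}")
```

The two results are mirror images of each other, which reflects the symmetry of the t-distribution about zero.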