Elementary Statistics (Stat 1)
CHAPTER 1: Introduction
DEFINITION OF STATISTICS
Statistics is a set of numerical data.
It is a branch of science which deals with the collection, presentation, analysis and interpretation of
data.
NATURE OF STATISTICS
General Uses of Statistics
a. Statistics aids in decision making
provides comparison
explains action that has taken place
justifies a claim or assertion
predicts future outcome
estimates unknown quantities
b. Statistics summarizes data for public use
Examples on the Role of Statistics
In the biological and medical sciences, it can help researchers to discover relationships
worthy of further attention.
Example: A doctor can use statistics to determine the extent to which an increase in blood
pressure depends on age.
In business, a company can use statistics to forecast sales, design products and produce
goods more efficiently.
Example: A pharmaceutical company can apply statistical procedures to find out if a new
formula is indeed more effective than the one being used. The results can
help the company decide whether or not to market the new formula.
In engineering, it can be used to test properties of various materials.
Example: A quality controller can use statistics to estimate the average lifetime of the
products produced by their current equipment.
FIELDS OF STATISTICS
a. Statistical Methods or Applied Statistics – refers to procedures and techniques used in the
collection, presentation, analysis and interpretation of data.
Descriptive statistics
- methods concerned with the collection, description and analysis of a set of data without
drawing conclusions or inferences about a larger set.
- the main concern is simply to describe the set of data.
Inferential Statistics
- methods concerned with making predictions or inferences about a larger set of data
using only the information gathered from a subset of this larger set.
- the main concern is not merely to describe but to predict and make inferences based on
the information gathered.
b. Statistical Theory or Mathematical Statistics – deals with the development and exposition of
theories that serve as bases of statistical methods.
Descriptive Statistics vs. Inferential Statistics
1. Descriptive: A bowler wants to find his bowling average for the past 12 games.
   Inferential: A bowler wants to estimate his chance of winning a game based on his current season averages and the averages of his opponents.
2. Descriptive: A housewife wants to determine the average weekly amount she spent on groceries in the past 3 months.
   Inferential: A housewife would like to predict, based on last year's grocery bills, the average weekly amount she will spend on groceries this year.
3. Descriptive: A politician wants to know the exact number of votes he received in the last election.
   Inferential: A politician would like to estimate, based on an opinion poll, his chance of winning the upcoming election.
1. Yesterday’s attendance shows that five (5) employees were absent due to dengue fever.
2. If the present trend continues, architects will construct more contemporary homes than
colonials in the next 5 years.
3. In certain, arsonists deliberately set 3% of all fires reported last year.
4. At least 30% of all new homes being built today are of a contemporary design.
5. Based on the present sales trend, sales are expected to double after two years.
Example: A manufacturer of a kerosene heater wants to determine if customers are satisfied with the
performance of their heaters. Toward this goal, 5,000 of his 200,000 customers are
contacted and each is asked, “Are you satisfied with the performance of the kerosene
heater you purchased?” Identify the population and the sample for this situation.
Example: In order to estimate the true proportion of students at a certain college who smoke cigarettes,
the administration polled a sample of 200 students and determined that the proportion
of the students from the sample who smoke cigarettes is 0.12. Identify the parameter and
statistic.
Seat Work #02 - __________
A. Identify the population and sample.
1. A survey of 1,353 American households found that 18% of the households own a computer.
Population: _______________________________________________________________________
Sample: __________________________________________________________________________
2. A recent survey of 2,625 elementary school children found that 28% of the children could be
classified as obese.
Population: _______________________________________________________________________
Sample: __________________________________________________________________________
3. The average weight of every sixth person entering the mall within a 3-hour period was 146 lb.
Population: _______________________________________________________________________
Sample: __________________________________________________________________________
Statistic: _________________________________________________________________________
2. The average salary of all assembly-line employees at a certain car manufacturer is $33,000.
Parameter : ______________________________________________________________________
Statistic: _________________________________________________________________________
3. The average late fee for 360 credit card holders was found to be $56.75.
Parameter : ______________________________________________________________________
Statistic: _________________________________________________________________________
CHAPTER 2: Collection and Presentation of Data
PRELIMINARIES
Steps in Statistical Inquiry
1. Define the problem.
2. Formulate the research design.
3. Collect data.
4. Code and analyze the collected data.
5. Interpret the results.
CLASSIFICATION OF VARIABLE
1. Discrete vs. Continuous
Discrete – a variable which can assume a finite or countable number of values; usually measured by counting or
enumeration.
Continuous – a variable which can assume infinitely many values corresponding to the points on a line interval.
LEVEL OF MEASUREMENT
1. Nominal Level – the nominal level or classificatory scale is the weakest level of measurement where
numbers or symbols are used simply for categorizing subjects into different
groups.
Examples: Sex: M-Male F-Female
Marital Status:1-Single 2-Married 3-Widowed 4-Separated
2. Ordinal Level – the ordinal level of measurement contains the properties of the nominal level, and in
addition, the numbers assigned to categories of any variables may be ranked or
ordered in some low-to-high manner.
Examples: Teaching Ratings 1-poor 2-fair 3-good 4-excellent
Year Level 1-1st year 2-2nd year 3-3rd year 4-4th year
3. Interval Level – the interval level is that which the distances between any two numbers on the scale
are of known sizes.
Example: IQ level, Temperature
4. Ratio Level – the ratio level of measurement contains all the properties of the interval level, and in
addition, it has a “true zero” point.
Example: Number of correct answers in exam.
C. Identify the data set’s level of measurement (nominal, ordinal, interval, ratio).
_______________1. Hair color of women on a high school tennis team.
_______________2. Number of milligrams of tar in 28 cigarettes.
_______________3. Temperatures of 22 selected refrigerators.
_______________4. The ratings of a movie ranging from “poor” to “good” to “excellent”.
_______________5. List of zip codes for Chicago.
CLASSIFICATION OF DATA
1. Primary vs. Secondary
a. Primary Source – data measured by the researcher/agency that published it.
b. Secondary Source – any republication of data by another agency.
Example: The publications of the National Statistics Office (NSO) are primary sources and all
subsequent publications of other agencies are secondary sources.
DATA COLLECTION METHODS
2. Observation method – makes possible the recording of behavior but only at the time of occurrence.
3. Experimental method – a method designed for collecting data under controlled conditions. An
experiment is an operation where there is actual human interference with the conditions that
can affect the variable under study.
4. Use of existing studies – e.g., census, health statistics, and weather bureau reports.
Two types:
Documentary sources – published or written reports, periodicals, unpublished documents, etc.
Field sources – researchers who have done studies on the area of interest are asked personally
or directly for information needed.
5. Registration method – e.g., car registration, student registration and hospital admission.
Cross out the column that matches each statement, depending on whether it describes a self-administered
questionnaire or a personal interview.
Self-Administered Questionnaire     Personal Interview
1. It is more appropriate in obtaining objective information
2. lower response rate
3. higher response rate
4. Respondents may feel cautious, particularly in answering
sensitive questions
5. It is administered to a person or group one at a time
6. The respondents may feel free to express views and opinions
7. Obtained information is limited
8. Vague responses are minimized
9. It is appropriate in obtaining emotional responses or opinion
10. It can be administered to a large number of people
simultaneously
Lesson #05 - __________
Seat Work #05 - __________
Identify the sampling technique used (random, cluster, stratified, convenience, systematic).
_______________1. Every fifth person boarding a plane is searched thoroughly.
_______________2. At a local community college, five math classes are randomly selected out of 20 and all
of the students from each class are interviewed.
_______________3. A researcher randomly selects and interviews fifty male and fifty female teachers.
_______________4. Based on 12,500 responses from 42,000 surveys sent to its alumni, a major university
estimated that the annual salary of its alumni was 92,500.
_______________5. A community college student interviews everyone in a biology class to determine the
percentage of students that own a car.
Advantages:
It gives emphasis to significant figures and comparisons.
It is the simplest and most appropriate approach when there are only a few numbers to be presented.
Disadvantages:
When a large mass of quantitative data is included in a text or paragraph, the presentation becomes almost incomprehensible.
Paragraphs can be tiresome to read, especially if the same words are repeated many times.
Table 4.4 – CRIME VOLUME AND RATE BY TYPE: 1991 – 1993 (Rate per 100,000 population)
Graphical Presentation – a graph or chart is a device for showing numerical values or relationships in pictorial
form.
Advantages:
Main features and implications of a body of data can be grasped at a glance.
Can attract attention and hold the reader’s interest.
Simplifies concepts that would otherwise have been expressed in so many words.
Can readily clarify data; frequently brings out hidden facts and relationships.
Seat Work #06 - __________
2. Construct a bar graph presentation of the monthly sales of a medical representative for a
period of six months.
Month Sales (in thousand pesos)
January 120
February 89
March 94
April 125
May 75
June 100
CHAPTER 3: Frequency Distribution
82 82 83 72 79 71 84 59 77 50 87
83 82 63 75 50 85 76 79 68 69 62
79 69 74 53 73 71 50 76 57 81 62
72 88 84 80 68 50 74 84 71 73 68
71 80 72 60 81 89 94 80 84 81 50
84 76 75 82 76 53 91 69 60 89 79
59 62 79 82 72 81 60 84 68 66 94
77 78 87 75 86 82 74 73 72 84 51
50 69 75 70 77 87 86 77 75 96 66
87 73 84 68 85 62 87 92 69 52 65
50 50 50 50 50 50 51 52 53 53 57
59 59 60 60 60 62 62 62 62 63 65
66 66 68 68 68 68 68 69 69 69 69
69 70 71 71 71 71 72 72 72 72 72
73 73 73 73 74 74 74 75 75 75 75
75 76 76 76 76 77 77 77 77 78 79
79 79 79 79 80 80 80 81 81 81 81
82 82 82 82 82 82 83 83 84 84 84
84 84 84 84 85 85 86 86 87 87 87
87 87 87 88 89 89 91 92 94 94 96
Advantages:
1. easier to detect the smallest and largest value
2. easier to find the measure of position
In the construction of a frequency distribution, the various items of a series are classified into groups.
The frequency distribution table shows the number of items falling into each group.
Example:
Class Frequency LCB UCB CM
50 – 55 10 49.5 55.5 52.5
56 – 61 6 55.5 61.5 58.5
62 – 67 8 61.5 67.5 64.5
68 – 73 24 67.5 73.5 70.5
74 – 79 22 73.5 79.5 76.5
3. Determine the lowest class limit. The first class must include the smallest value in the data set.
4. Determine all the class limits by adding the class size to the limit of the previous class.
5. Tally the frequencies for each class. Sum the frequencies and check against the total number of
observations.
6. Determine the lower class boundaries by subtracting 0.5 from the lower limits.
7. Determine the upper class boundaries by adding 0.5 to the upper limits.
8. Determine the class mark by getting the average of the lower and upper limits.
Class Frequency LCB UCB CM <CFD >CFD RF RFP
50 – 55 10 49.5 55.5 52.5 10 110 0.09 9%
56 – 61 6 55.5 61.5 58.5 16 100 0.05 5%
62 – 67 8 61.5 67.5 64.5 24 94 0.07 7%
68 – 73 24 67.5 73.5 70.5 48 86 0.22 22%
74 – 79 22 73.5 79.5 76.5 70 62 0.20 20%
80 – 85 24 79.5 85.5 82.5 94 40 0.22 22%
86 – 91 12 85.5 91.5 88.5 106 16 0.11 11%
92 – 97 4 91.5 97.5 94.5 110 4 0.04 4%
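The columns of the table above can be derived mechanically from the class limits and frequencies. The following is a minimal sketch, assuming the class limits and frequencies shown in the table; the function name `build_table` and the dictionary keys are illustrative choices, not part of the lesson.

```python
# Class limits and frequencies copied from the table above.
classes = [(50, 55), (56, 61), (62, 67), (68, 73),
           (74, 79), (80, 85), (86, 91), (92, 97)]
freqs = [10, 6, 8, 24, 22, 24, 12, 4]


def build_table(classes, freqs):
    """Return one dict per class: boundaries, class mark,
    cumulative frequencies and relative frequency."""
    total = sum(freqs)
    rows = []
    running = 0  # less-than cumulative frequency so far
    for (lower, upper), f in zip(classes, freqs):
        running += f
        rows.append({
            "class": f"{lower} - {upper}",
            "f": f,
            "LCB": lower - 0.5,            # lower class boundary
            "UCB": upper + 0.5,            # upper class boundary
            "CM": (lower + upper) / 2,     # class mark
            "<CF": running,                # less-than cumulative frequency
            ">CF": total - running + f,    # greater-than cumulative frequency
            "RF": round(f / total, 2),     # relative frequency
        })
    return rows


table = build_table(classes, freqs)
for row in table:
    print(row)
```

The relative frequency percentage (RFP) column is simply RF multiplied by 100.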
Lesson #08 - __________
1. Frequency Histogram – a bar graph that displays the classes on the horizontal axis and the frequencies
of the classes on the vertical axis; the vertical lines of the bars are erected at the class boundaries and
the heights of the bars correspond to the class frequencies.
2. Frequency Polygon – a line chart that is constructed by plotting the frequencies at the class marks and
connecting the plotted points by means of straight lines; the polygon is closed by considering an
additional class at each end, and the ends of the lines are brought down to the horizontal axis at the
midpoints of the additional classes.
THE STEM-AND-LEAF DISPLAY
The stem-and-leaf display is an alternative method for describing a set of data. It presents a histogram-
like picture of the data, while allowing the experimenter to retain the actual observed values of each
data point. Hence, the stem-and-leaf display is partly tabular and partly graphical in nature.
In creating a stem-and-leaf display, we divide each observation into two parts, the stem and the leaf.
For example, we could divide the observation 244 as follows:
Stem Leaf
2 | 44
Alternatively, we could choose the point of division between the units and tens, whereby
Stem Leaf
24 | 4
The choice of the stem and leaf coding depends on the nature of the data set.
Example:
Typing speeds (net words per minute) for 20 secretarial applicants
68 72 91 47
52 75 63 55
65 35 84 45
58 61 69 22
46 55 66 71
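One way to build the display for these typing speeds is sketched below. It assumes the tens digit is used as the stem and the units digit as the leaf; the function name `stem_and_leaf` is an illustrative choice.

```python
from collections import defaultdict

# Typing speeds (net words per minute) from the example above.
speeds = [68, 72, 91, 47, 52, 75, 63, 55, 65, 35,
          84, 45, 58, 61, 69, 22, 46, 55, 66, 71]


def stem_and_leaf(values):
    """Group values by tens digit (stem); the units digits are the leaves."""
    groups = defaultdict(list)
    for v in sorted(values):
        groups[v // 10].append(v % 10)
    return dict(groups)


display = stem_and_leaf(speeds)
for stem in range(min(display), max(display) + 1):
    leaves = " ".join(str(leaf) for leaf in display.get(stem, []))
    print(f"{stem} | {leaves}")
```

Because the leaves are kept in sorted order, each row can be read back as the original observations, which is what distinguishes the display from a histogram.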
Seat Work #08 - __________
1. Frequency histogram
2. Frequency Polygon
3. Ogives
4. Stem-and-Leaf
CHAPTER 4: Measures of Central Tendency
a. Mean – this is obtained by summing up all the observations and dividing the sum by the number of
observations. We call this the simple mean.
Formula: $\bar{x} = \dfrac{\sum x}{n}$
Where: $\bar{x}$ = mean
x = value of the particular item
n = number of items in the sample
Example:
A sample of 10 students was taken and was asked how much time they travel from their respective
places of residence to the school. The results are listed below. Compute the mean.
$\bar{x} = \dfrac{\sum x}{n} = \dfrac{30+15+35+20+25+45+10+25+30+15}{10} = \dfrac{250}{10} = 25$ minutes
Weighted Mean – this is used when several observations have the same value.
Formula: $\bar{x}_w = \dfrac{\sum wx}{\sum w}$
Where: $\bar{x}_w$ = weighted mean
x = value of the particular item
w = weight or number of observations of the same value
$\sum w$ = sum of the weights
Example:
XYZ Construction firm has 10 workers who are paid P350 per day, 5 workers who are paid P455 per day
and 2 workers who are paid P600 per day. What is the weighted daily wage of the 17 workers?
Solution: $\bar{x}_w = \dfrac{\sum wx}{\sum w} = \dfrac{10(350) + 5(455) + 2(600)}{17} = \dfrac{6{,}975}{17} = $ P410.29
The weighted mean is also used to compute the weighted average rating of a student across subjects
with different numbers of units.
Rating of students in Four Subjects
Subjects Number of Units Rating
Bar Management 6 90%
Statistics 3 85%
Physical Education 2 87%
Personality Development 1 95%
Solution: $\bar{x}_w = \dfrac{\sum wx}{\sum w} = \dfrac{6(90) + 3(85) + 2(87) + 1(95)}{12} = \dfrac{1{,}064}{12} = 88.67\%$
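Both weighted mean examples above follow the same pattern, which can be sketched as a short function. The name `weighted_mean` is an illustrative choice; the values and weights are taken from the two examples.

```python
def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Daily wages (x) and the number of workers paid each wage (w).
wage = weighted_mean([350, 455, 600], [10, 5, 2])

# Ratings (x) weighted by the number of units of each subject (w).
rating = weighted_mean([90, 85, 87, 95], [6, 3, 2, 1])

print(round(wage, 2), round(rating, 2))
```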
b. Median – the middle value after arranging the set of observations in ascending or descending order.
If the number of observations is odd, the median is the middle value; if the number of
observations is even, the median is the average of the two middle observations.
Formula:
ODD: Median = the $\left(\dfrac{n+1}{2}\right)$th observation
EVEN: Median = $\dfrac{\left(\dfrac{n}{2}\right)\text{th observation} + \left(\dfrac{n}{2}+1\right)\text{th observation}}{2}$
Example:
1. A sample of 10 students was taken and was asked how much time they travel from their respective
places of residence to the school. The results are listed below. Compute the median.
A 30
B 15
C 35
D 20
E 25
F 45
G 10
H 25
I 30
J 15
Solution: Arrange the set of the observations according to its magnitude.
10 15 15 20 25 25 30 30 35 45
$\dfrac{n}{2} = \dfrac{10}{2} = 5$ → the 5th observation is 25
$\dfrac{n}{2} + 1 = 6$ → the 6th observation is 25
Median = $\dfrac{25 + 25}{2} = 25$
2. Find the median for the following set of scores
3 8 6 7 9 9 3 3 10
Solution: Arrange the set of the observations according to its magnitude.
3 3 3 6 7 8 9 9 10
Median position = $\dfrac{n+1}{2} = \dfrac{9+1}{2} = 5$ → the 5th observation is 7.
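The odd and even cases of the median rule can be sketched in a few lines. The function name `median` is illustrative; the two data sets are the ones used in the examples above.

```python
def median(values):
    """Middle value of the sorted data; the average of the two
    middle values when the number of observations is even."""
    data = sorted(values)
    n = len(data)
    mid = n // 2
    if n % 2 == 1:
        return data[mid]                     # the ((n + 1) / 2)-th observation
    return (data[mid - 1] + data[mid]) / 2   # (n/2)-th and (n/2 + 1)-th observations


travel_times = [30, 15, 35, 20, 25, 45, 10, 25, 30, 15]   # n = 10 (even)
scores = [3, 8, 6, 7, 9, 9, 3, 3, 10]                     # n = 9 (odd)
print(median(travel_times), median(scores))
```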
c. Mode – it is the observation that appears most often. Mode is the least preferred measure of central
location.
Example: Find the mode
Observations Mode
3 8 6 7 9 9 3 3 10 3 - unimodal
10 15 15 20 25 25 30 35 45 15 & 25 - bimodal
10 15 15 20 25 25 30 30 35 45 15, 25 & 30 - trimodal
3 8 6 6 7 7 9 9 3 6 3 10 7 9 3, 6, 7, & 9 - multimodal
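The table above can be reproduced by counting how often each value occurs and keeping every value that ties for the highest count. This is a small sketch; the function name `modes` is illustrative.

```python
from collections import Counter


def modes(values):
    """Return every value that ties for the highest count, in ascending order."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)


print(modes([3, 8, 6, 7, 9, 9, 3, 3, 10]))                    # unimodal case
print(modes([10, 15, 15, 20, 25, 25, 30, 35, 45]))            # bimodal case
print(modes([10, 15, 15, 20, 25, 25, 30, 30, 35, 45]))        # trimodal case
print(modes([3, 8, 6, 6, 7, 7, 9, 9, 3, 6, 3, 10, 7, 9]))     # multimodal case
```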
2. A computer shop was able to sell the following units of laptops for the month of July: 2 Dell
laptops @ P89,900 each; 3 Samsung laptops @ P45,000 each; 2 Toshiba laptops @ P26,000 each;
3 Acer laptops @ P65,000 each. Find the average sale for that month.
Lesson #10 - __________
Example:
Final grades of Stat 101 students arranged in an array. Solve for the mean.
50 50 50 50 50 50 51 52 53 53 57
59 59 60 60 60 62 62 62 62 63 65
66 66 68 68 68 68 68 69 69 69 69
69 70 71 71 71 71 72 72 72 72 72
73 73 73 73 74 74 74 75 75 75 75
75 76 76 76 76 77 77 77 77 78 79
79 79 79 79 80 80 80 81 81 81 81
82 82 82 82 82 82 83 83 84 84 84
84 84 84 84 85 85 86 86 87 87 87
87 87 87 88 89 89 91 92 94 94 96
Solution:
K = 1 + 3.322 log 110 = 7.78 or 8     R = 96 − 50 = 46     C = 46 ÷ 8 = 5.75 or 6
Class Frequency CM (x) fx
50 – 55 10 52.5 525
56 – 61 6 58.5 351
62 – 67 8 64.5 516
68 – 73 25 70.5 1,762.5
74 – 79 22 76.5 1,683
80 – 85 23 82.5 1,897.5
86 – 91 12 88.5 1,062
92 – 97 4 94.5 378
N = 110 $\sum fx$ = 8,175
$\bar{x} = \dfrac{\sum fx}{n} = \dfrac{8{,}175}{110} = 74.32$
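The grouped mean treats every observation in a class as if it sat at the class mark. A minimal sketch, assuming the class marks and frequencies listed in the table above (the function name `grouped_mean` is illustrative):

```python
# Class marks and frequencies as listed in the table above.
class_marks = [52.5, 58.5, 64.5, 70.5, 76.5, 82.5, 88.5, 94.5]
freqs = [10, 6, 8, 25, 22, 23, 12, 4]


def grouped_mean(marks, freqs):
    """Sum of f * x over all classes, divided by the total frequency."""
    return sum(f * x for x, f in zip(marks, freqs)) / sum(freqs)


mean = grouped_mean(class_marks, freqs)
print(round(mean, 2))
```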
b. Median
Formula: $\tilde{x} = LCB_{md} + \left[\dfrac{\frac{n}{2} - {<}cf_p}{f_{md}}\right] i$
Where: $LCB_{md}$ = lower class boundary of the median class
n = number of observations
$<cf_p$ = sum of the frequencies before the median class
$f_{md}$ = frequency of the median class
i = class interval/size
Example:
Final grades of Stat 101 students arranged in an array. Solve for the median.
Class Frequency LCB <cf
50 – 55 10 49.5 10
56 – 61 6 55.5 16
62 – 67 8 61.5 24
68 – 73 25 67.5 49
74 – 79 22 73.5 71
80 – 85 23 79.5 94
86 – 91 12 85.5 106
92 – 97 4 91.5 110
N= 110
Solution:
1. Determine the median class by dividing the total number of observations by 2.
$\dfrac{n}{2} = \dfrac{110}{2} = 55$
2. Go over the entries in the less than cumulative frequency column. The first class whose cumulative
frequency is greater than or equal to the result of step 1 is the median class.
Class Frequency LCB <cf
50 – 55 10 49.5 10
56 – 61 6 55.5 16
62 – 67 8 61.5 24
68 – 73 25 67.5 49
74 – 79 22 73.5 71 Median class
80 – 85 23 79.5 94
86 – 91 12 85.5 106
92 – 97 4 91.5 110
N= 110
$\tilde{x} = LCB_{md} + \left[\dfrac{\frac{n}{2} - {<}cf_p}{f_{md}}\right] i = 73.5 + \left[\dfrac{\frac{110}{2} - 49}{22}\right] 6$
$\tilde{x} = 75.14$
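The two steps of the solution (locate the median class, then interpolate inside it) can be sketched as follows. The function name `grouped_median` and its arguments are illustrative; the lower class boundaries and frequencies are those of the table above.

```python
lcbs = [49.5, 55.5, 61.5, 67.5, 73.5, 79.5, 85.5, 91.5]   # lower class boundaries
freqs = [10, 6, 8, 25, 22, 23, 12, 4]


def grouped_median(lcbs, freqs, width):
    """LCB of the median class plus the interpolated share of the class width."""
    target = sum(freqs) / 2          # step 1: n / 2
    cum_before = 0                   # sum of frequencies before the current class
    for lcb, f in zip(lcbs, freqs):
        if cum_before + f >= target:             # step 2: this is the median class
            return lcb + (target - cum_before) / f * width
        cum_before += f
    raise ValueError("frequencies must not be empty")


med = grouped_median(lcbs, freqs, width=6)
print(round(med, 2))
```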
c. Mode
Formula: $\hat{x} = LCB_m + \left[\dfrac{d_1}{d_1 + d_2}\right] i$
Where: $\hat{x}$ = Mode
$LCB_m$ = LCB of the modal class
$f_m$ = frequency of the modal class
$d_1$ = difference between the frequency of the modal
class and the frequency of the class before the modal class
$d_2$ = difference between the frequency of the modal
class and the frequency of the class after the modal class
i = class interval/size
Example:
Final grades of Stat 101 students arranged in an array. Solve for the mode.
Class Frequency LCB <cf
50 – 55 10 49.5 10
56 – 61 6 55.5 16
62 – 67 8 61.5 24
68 – 73 25 67.5 49
74 – 79 22 73.5 71
80 – 85 23 79.5 94
86 – 91 12 85.5 106
92 – 97 4 91.5 110
N= 110
Solution:
1. Determine the modal class by identifying the class with the highest frequency.
Class Frequency LCB <cf
50 – 55 10 49.5 10
56 – 61 6 55.5 16
62 – 67 8 61.5 24
68 – 73 25 67.5 49 Modal class
74 – 79 22 73.5 71
80 – 85 23 79.5 94
86 – 91 12 85.5 106
92 – 97 4 91.5 110
N= 110
$d_1 = 25 - 8 = 17$     $d_2 = 25 - 22 = 3$
$\hat{x} = LCB_m + \left[\dfrac{d_1}{d_1 + d_2}\right] i = 67.5 + \left[\dfrac{17}{17 + 3}\right] 6$
$\hat{x} = 72.60$
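The grouped mode can be sketched in the same style. This sketch assumes the common interpolation rule in which d1 is the modal frequency minus the frequency of the class before it and d2 is the modal frequency minus the frequency of the class after it; the function name `grouped_mode` is illustrative, and a missing neighbouring class is treated as having frequency 0 (an assumption made here, not something stated in the lesson).

```python
lcbs = [49.5, 55.5, 61.5, 67.5, 73.5, 79.5, 85.5, 91.5]   # lower class boundaries
freqs = [10, 6, 8, 25, 22, 23, 12, 4]


def grouped_mode(lcbs, freqs, width):
    """LCB of the modal class plus d1 / (d1 + d2) of the class width."""
    m = max(range(len(freqs)), key=lambda k: freqs[k])   # modal class (first one if tied)
    before = freqs[m - 1] if m > 0 else 0                 # assumed 0 when there is no class before
    after = freqs[m + 1] if m < len(freqs) - 1 else 0     # assumed 0 when there is no class after
    d1 = freqs[m] - before
    d2 = freqs[m] - after
    return lcbs[m] + d1 / (d1 + d2) * width


mode = grouped_mode(lcbs, freqs, width=6)
print(round(mode, 2))
```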
Complete the Frequency Distribution Table to find the mean, median and mode of the data set given:
Class F CM (x) fx LCB <CF
10-19 3
20-29 1
30-39 3
40-49 2
50-59 9
60-69 8
70-79 35
80-89 30
90-99 9
Percentile (P): 10 20 25 30 40 50 60 70 75 80 90 100
Decile (D): 1 2 3 4 5 6 7 8 9 10
Quartile (Q): 1 2 3 4
a. Percentile – to compute the i-th percentile: $P_i$ is the value of the $\left[\dfrac{i(n+1)}{100}\right]$th observation in the
array.
Where: $P_i$ = Percentile location
i = Percentile of interest
n = number of observations
Example:
Below is the list of the daily wages of 20 workers of XYZ Construction Company. Compute P87.
200 200 265 285 290 300 300 315 330 350
375 450 450 500 550 550 600 615 630 650
Solution:
Location of $P_{87}$ = $\dfrac{i(n+1)}{100} = \dfrac{87(20+1)}{100}$ = 18.27th location
$P_{87} = 615 + 0.27(630 - 615)$
$P_{87} = 619.05$ or 619
b. Decile – to compute the i-th decile: $D_i$ is the value of the $\left[\dfrac{i(n+1)}{10}\right]$th observation in the array.
Location of $D_7$ = $\dfrac{7(20+1)}{10}$ = 14.70th location
$D_7 = 500 + 0.70(550 - 500)$
$D_7 = 535$
c. Quartile – to compute the i-th quartile: $Q_i$ is the value of the $\left[\dfrac{i(n+1)}{4}\right]$th observation in the array.
Location of $Q_3$ = $\dfrac{3(20+1)}{4}$ = 15.75th location
$Q_3 = 550 + 0.75(550 - 550)$
$Q_3 = 550$
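The percentile, decile and quartile rules above differ only in the divisor (100, 10 or 4), so one function can cover all three. The sketch below assumes linear interpolation between neighbouring observations when the location is fractional, as in the worked examples; the function name `quantile_by_position` is illustrative, and clamping to the smallest or largest value when the location falls outside the array is an assumption made here.

```python
def quantile_by_position(sorted_values, i, parts):
    """Value at location i(n + 1)/parts in a sorted array.
    parts = 100 for percentiles, 10 for deciles, 4 for quartiles."""
    n = len(sorted_values)
    location = i * (n + 1) / parts
    whole = int(location)            # position of the lower neighbour (1-based)
    frac = location - whole          # fractional part used for interpolation
    if whole < 1:                    # assumption: clamp below the first observation
        return sorted_values[0]
    if whole >= n:                   # assumption: clamp above the last observation
        return sorted_values[-1]
    lower = sorted_values[whole - 1]
    upper = sorted_values[whole]
    return lower + frac * (upper - lower)


wages = [200, 200, 265, 285, 290, 300, 300, 315, 330, 350,
         375, 450, 450, 500, 550, 550, 600, 615, 630, 650]

p87 = quantile_by_position(wages, 87, 100)
d7 = quantile_by_position(wages, 7, 10)
q3 = quantile_by_position(wages, 3, 4)
print(round(p87, 2), round(d7, 2), round(q3, 2))
```

Statistical packages often use slightly different location rules, so their results may not match the i(n + 1) convention used in this lesson.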
4th 12
5th 12
6th 10
7th 15
8th 15
9th 15
10th 14
Calculate the following:
Q1 D8
Q3 P45
D3 P89
Formula: $Q_k = LCB_{Q_k} + \left[\dfrac{\frac{kn}{4} - {<}cf_p}{f_{Q_k}}\right] i$
Where: $LCB_{Q_k}$ = lower class boundary of the quartile class
n = number of observations
$<cf_p$ = sum of the frequencies before the quartile class
$f_{Q_k}$ = frequency of the quartile class
i = class interval/size
Example:
Final grades of Stat 101 students arranged in an array. Solve for Q1.
Class Frequency LCB <cf
50 – 55 10 49.5 10
56 – 61 6 55.5 16
62 – 67 8 61.5 24
68 – 73 25 67.5 49
74 – 79 22 73.5 71
80 – 85 23 79.5 94
86 – 91 12 85.5 106
92 – 97 4 91.5 110
N= 110
Solution:
1. Determine the quartile class by dividing the number of observations by 4.
$\dfrac{n}{4} = \dfrac{110}{4} = 27.5$
2. Go over the entries in the less than cumulative frequency column. The first class whose cumulative
frequency is greater than or equal to $\dfrac{n}{4}$ is the quartile 1 class.
Class Frequency LCB <cf
50 – 55 10 49.5 10
56 – 61 6 55.5 16
62 – 67 8 61.5 24
68 – 73 25 67.5 49 Quartile 1 class
74 – 79 22 73.5 71
80 – 85 23 79.5 94
86 – 91 12 85.5 106
92 – 97 4 91.5 110
N= 110
$Q_1 = LCB_{Q_1} + \left[\dfrac{\frac{n}{4} - {<}cf_p}{f_{Q_1}}\right] i = 67.5 + \left[\dfrac{\frac{110}{4} - 24}{25}\right] 6$
$Q_1 = 68.34$
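The grouped quartile calculation generalizes directly: only the target position kn/4 changes to kn/10 for deciles or kn/100 for percentiles. A minimal sketch, assuming the table above; the function name `grouped_quantile` and the `parts` argument are illustrative.

```python
lcbs = [49.5, 55.5, 61.5, 67.5, 73.5, 79.5, 85.5, 91.5]   # lower class boundaries
freqs = [10, 6, 8, 25, 22, 23, 12, 4]


def grouped_quantile(lcbs, freqs, width, k, parts):
    """k-th quantile of grouped data.
    parts = 4 for quartiles, 10 for deciles, 100 for percentiles."""
    target = k * sum(freqs) / parts   # step 1: kn / parts
    cum_before = 0                    # sum of frequencies before the current class
    for lcb, f in zip(lcbs, freqs):
        if cum_before + f >= target:  # step 2: first class reaching the target
            return lcb + (target - cum_before) / f * width
        cum_before += f
    raise ValueError("k must not exceed parts and frequencies must not be empty")


q1 = grouped_quantile(lcbs, freqs, width=6, k=1, parts=4)
print(round(q1, 2))
```

Calling the same function with `parts=10` or `parts=100` gives the decile and percentile versions of the formula.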
b. Deciles
Formula: $D_k = LCB_{D_k} + \left[\dfrac{\frac{kn}{10} - {<}cf_p}{f_{D_k}}\right] i$
Where: $LCB_{D_k}$ = lower class boundary of the decile class
n = number of observations
$<cf_p$ = sum of the frequencies before the decile class
$f_{D_k}$ = frequency of the decile class
i = class interval/size
Example:
Final grades of Stat 101 students arranged in an array. Solve for D8.
Class Frequency LCB <cf
50 – 55 10 49.5 10
56 – 61 6 55.5 16
62 – 67 8 61.5 24
68 – 73 25 67.5 49
74 – 79 22 73.5 71
80 – 85 23 79.5 94
86 – 91 12 85.5 106
92 – 97 4 91.5 110
N= 110
Solution:
1. Determine the decile class by computing $\dfrac{kn}{10}$.
$\dfrac{kn}{10} = \dfrac{8(110)}{10} =$ __________
2. Go over the entries in the less than cumulative frequency column. The first class whose cumulative
frequency is greater than or equal to $\dfrac{kn}{10}$ is the decile 8 class.
Class Frequency LCB <cf
50 – 55 10 49.5 10
56 – 61 6 55.5 16
62 – 67 8 61.5 24
68 – 73 25 67.5 49
74 – 79 22 73.5 71
80 – 85 23 79.5 94
86 – 91 12 85.5 106
92 – 97 4 91.5 110
N= 110
$D_8 = LCB_{D_8} + \left[\dfrac{\frac{kn}{10} - {<}cf_p}{f_{D_8}}\right] i$
c. Percentile
Formula: $P_k = LCB_{P_k} + \left[\dfrac{\frac{kn}{100} - {<}cf_p}{f_{P_k}}\right] i$
Where: $LCB_{P_k}$ = lower class boundary of the percentile class
n = number of observations
$<cf_p$ = sum of the frequencies before the percentile class
$f_{P_k}$ = frequency of the percentile class
i = class interval/size
Example:
Final grades of Stat 101 students arranged in an array. Solve for P57.
Class Frequency LCB <cf
50 – 55 10 49.5 10
56 – 61 6 55.5 16
62 – 67 8 61.5 24
68 – 73 25 67.5 49
74 – 79 22 73.5 71
80 – 85 23 79.5 94
86 – 91 12 85.5 106
92 – 97 4 91.5 110
N= 110
Solution:
1. Determine the percentile class by computing $\dfrac{kn}{100}$.
$\dfrac{kn}{100} = \dfrac{57(110)}{100} =$ __________
2. Go over the entries in the less than cumulative frequency column. The first class whose cumulative
frequency is greater than or equal to $\dfrac{kn}{100}$ is the percentile 57 class.
Class Frequency LCB <cf
50 – 55 10 49.5 10
56 – 61 6 55.5 16
62 – 67 8 61.5 24
68 – 73 25 67.5 49
74 – 79 22 73.5 71
80 – 85 23 79.5 94
86 – 91 12 85.5 106
92 – 97 4 91.5 110
N= 110
$P_{57} = LCB_{P_{57}} + \left[\dfrac{\frac{kn}{100} - {<}cf_p}{f_{P_{57}}}\right] i$
Complete the Frequency Distribution Table to find the Q3, D6 and P94 of the data set given:
Class F LCB <CF
10-19 3
20-29 1
30-39 3
40-49 2
50-59 9
60-69 8
70-79 35
80-89 30
90-99 9
CHAPTER 5: Measures of Variability
MEASURES OF DISPERSION
It indicates the extent to which individual items in a series are scattered about an average.
Formula: $s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$
Where: s = sample standard deviation
x = observation
$\bar{x}$ = sample mean
n = number of observations
Steps in Calculating the Standard Deviation
1. Compute the mean
2. Compute the deviations by subtracting the mean from each of the observations
3. Square the deviations
4. Take the sum of the squared deviations
5. Divide the sum by n – 1 to get the sample variance
6. Take the square root of the sample variance
Example:
Below is the list of the scores of two groups of students in a grammar quiz.
Group A Group B
13 10
14 10
15 15
16 18
19 18
20 19
25 26
30 36
Solution:
1. Compute the mean
$\bar{x}_A = \dfrac{\sum x}{n} = \dfrac{152}{8} = 19$     $\bar{x}_B = \dfrac{\sum x}{n} = \dfrac{152}{8} = 19$
2. Compute the deviations by subtracting the mean from each of the observations, and then square
the deviations.
Group A $x - \bar{x}$ $(x - \bar{x})^2$ Group B $x - \bar{x}$ $(x - \bar{x})^2$
13 -6 36 10 -9 81
14 -5 25 10 -9 81
15 -4 16 15 -4 16
16 -3 9 18 -1 1
19 0 0 18 -1 1
20 1 1 19 0 0
25 6 36 26 7 49
30 11 121 36 17 289
3. Take the sum of the squared deviations, then divide the sum by n – 1, then take the square root of
the sample variance
$s_A = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\dfrac{244}{8 - 1}} = 5.90$
$s_B = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\dfrac{518}{8 - 1}} = 8.60$
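The steps for the sample standard deviation can be checked with a short script. This is a sketch that follows the listed steps one by one; the function name `sample_sd` is illustrative, and Python's standard `statistics.stdev` is used only as a cross-check.

```python
import statistics

group_a = [13, 14, 15, 16, 19, 20, 25, 30]
group_b = [10, 10, 15, 18, 18, 19, 26, 36]


def sample_sd(values):
    """Sample standard deviation computed step by step."""
    n = len(values)
    mean = sum(values) / n                           # step 1: mean
    squared = [(x - mean) ** 2 for x in values]      # steps 2-3: squared deviations
    variance = sum(squared) / (n - 1)                # steps 4-5: sample variance
    return variance ** 0.5                           # step 6: square root


s_a = sample_sd(group_a)
s_b = sample_sd(group_b)
print(round(s_a, 2), round(s_b, 2))

# Cross-check against the standard library implementation.
print(round(statistics.stdev(group_a), 2), round(statistics.stdev(group_b), 2))
```

Both groups have the same mean of 19, so the larger standard deviation of Group B reflects only its wider spread of scores.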
A pediatrician has clinic hours in two leading hospitals. His clinic schedule in Alabang is 10:00 to 12:00
pm, MWF. His clinic schedule in Makati is 2:00 to 4:00 pm, TTh. The logbook of his secretaries shows the
number of patients who visited him for the last two weeks.
Hospital in Alabang Hospital in Makati
4,800 4,200
4,200 3,600
4,200 3,600
3,000 3,000
2,400 4,800
Lesson #14 - _________
Formula: $s = \sqrt{\dfrac{\sum f(x - \bar{x})^2}{n - 1}}$
Where: s = sample standard deviation
f = frequency
x = class mark
$\bar{x}$ = sample mean
n = number of observations
Steps in Calculating the Standard Deviation
1. Compute the mean
2. Compute the deviations by subtracting the mean from each of the class marks
3. Square the deviations
4. Multiply each squared deviation by its corresponding frequency
5. Take the sum of the products of the squared deviations and the frequencies
6. Divide the sum by n – 1 to get the sample variance
7. Take the square root of the sample variance
Example:
Final grades of students in Stat 101 arranged in an FDT. Solve for the standard deviation.
Class Frequency CM (x) fx $x - \bar{x}$ $(x - \bar{x})^2$ $f(x - \bar{x})^2$
$s = \sqrt{\dfrac{\sum f(x - \bar{x})^2}{n - 1}} = \sqrt{\dfrac{13{,}084.1}{110 - 1}} = 10.96$
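The grouped calculation can be sketched as follows, assuming the class marks and frequencies used in the grouped mean example of Chapter 4 (the rows of the table are not reproduced here). The function name `grouped_sd` is illustrative.

```python
# Class marks and frequencies assumed from the Stat 101 grouped examples.
class_marks = [52.5, 58.5, 64.5, 70.5, 76.5, 82.5, 88.5, 94.5]
freqs = [10, 6, 8, 25, 22, 23, 12, 4]


def grouped_sd(marks, freqs):
    """Sample standard deviation of grouped data using class marks."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(marks, freqs)) / n               # step 1
    weighted_squares = sum(f * (x - mean) ** 2                        # steps 2-5
                           for x, f in zip(marks, freqs))
    return (weighted_squares / (n - 1)) ** 0.5                        # steps 6-7


sd = grouped_sd(class_marks, freqs)
print(round(sd, 2))
```

The sum of squares computed here uses the unrounded mean, so it can differ in the first decimal place from a hand calculation that rounds the mean to 74.32.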
Complete the Frequency Distribution Table to find the standard deviation of the data set given:
Class F CM (x) fx $x - \bar{x}$ $(x - \bar{x})^2$ $f(x - \bar{x})^2$
10-19 3
20-29 1
30-39 3
40-49 2
50-59 9
60-69 8
70-79 35
80-89 30
90-99 9
CHAPTER 6: Normal Distribution
NORMAL DISTRIBUTION
a. The mean, median, and mode are all equal and are located at the center of the distribution.
b. The distribution is symmetric. The distribution depicts a bell-shaped curve where the left area is a
mirror image of the right area.
c. The total area under the normal curve is 1 or 100%.
d. The distribution is asymptotic: the tails of the curve approach the horizontal axis but never touch it.
e. The location of the distribution is determined by the mean and the standard deviation determines
dispersion of the distribution.
The mean and the standard deviation determine the shape of the distribution. Below are illustrations
of normal distributions with different means and standard deviations.
Normal Probability Distribution with equal standard deviations and different means.
As previously stated, there is an infinite family of normal curves, one for each combination of mean and
standard deviation. This may suggest that we need a different table of areas for every mean and
standard deviation. We do not: it is enough to standardize a given observation. The
standardized score may also be termed the Z-value, Z statistic, standard deviate, standard normal value or just
normal value. The formula is shown below.
$Z = \dfrac{x - \mu}{\sigma}$
Example:
The scores of 120 students in a Stat preliminary examination show a bell-shaped distribution.
The mean score is 29 and the standard deviation is 3.02. If a student is selected at random, find the probability
of selecting a student whose score is:
a. Between 24 and 35?
Step 1: x = 24 and x = 35
$z = \dfrac{x - \mu}{\sigma} = \dfrac{24 - 29}{3.02} = -1.66$
$z = \dfrac{x - \mu}{\sigma} = \dfrac{35 - 29}{3.02} = 1.99$
Step 2: Find the area of the standardized score using the areas under the normal curve.
z – value = -1.66 Area = 0.4515
z – value = 1.99 Area = 0.4767
Step 3: Draw the curve and mark each z-value at its position along the horizontal axis.
Step 4: Calculate the area. The shaded region serves as our guide on what we are going to do with the
areas corresponding to their respective z – values.
The shaded region tells us that we are going to add the areas:
from 0 to -1.66 = 0.4515
from 0 to 1.99 = 0.4767
Between -1.66 & 1.99 = 0.9282
The probability of selecting a student whose score is between 24 and 35 is 0.9282 or 92.82%.
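The table lookup in Steps 1 to 4 can be cross-checked numerically. The sketch below computes the standard normal cumulative area with `math.erf` from the Python standard library instead of a printed table; the helper names `phi` and `z_score` are illustrative, and z is rounded to two decimals to mirror the table lookup.

```python
import math


def phi(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def z_score(x, mu, sigma):
    """Standardized score, rounded to two decimals as in a z-table lookup."""
    return round((x - mu) / sigma, 2)


mu, sigma = 29, 3.02

z_low = z_score(24, mu, sigma)
z_high = z_score(35, mu, sigma)

# Area between the two z-values: area left of z_high minus area left of z_low.
p_between = phi(z_high) - phi(z_low)
print(z_low, z_high, round(p_between, 4))
```

Subtracting the two cumulative areas is equivalent to adding the table areas from 0 to each z-value when the two values lie on opposite sides of the mean.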
$z = \dfrac{x - \mu}{\sigma} = \dfrac{33 - 29}{3.02} = 1.32$
$z = \dfrac{x - \mu}{\sigma} = \dfrac{37 - 29}{3.02} = 2.65$
z – value = 2.65 Area = 0.4960
Step 3:
$z = \dfrac{x - \mu}{\sigma} = \dfrac{22 - 29}{3.02} = -2.32$
$z = \dfrac{x - \mu}{\sigma} = \dfrac{26 - 29}{3.02} = -0.99$
Step 3:
between -2.32 & -0.99 = 0.1509
The probability of selecting a student whose score is between 22 and 26 is 0.1509 or 15.09%.
d. Greater than 34?
Step 1: x = 34
$z = \dfrac{x - \mu}{\sigma} = \dfrac{34 - 29}{3.02} = 1.66$
Step 3:
$z = \dfrac{x - \mu}{\sigma} = \dfrac{37 - 29}{3.02} = 2.65$
Step 3:
$z = \dfrac{x - \mu}{\sigma} = \dfrac{23 - 29}{3.02} = -1.99$
Step 3:
Step 4: area of the right half = 0.5000
from 0 to -1.99 = 0.4767
greater than -1.99 = 0.9767
The probability of selecting a student whose score is greater than 23 is 0.9767 or 97.67%.
$z = \dfrac{x - \mu}{\sigma} = \dfrac{26 - 29}{3.02} = -0.99$
Step 3:
Seat Work #15 - __________
Sketch the normal distribution of the given problem. Show your solutions.
A data set follows a normal distribution with a mean of 40 and a standard deviation of 4.75.
What is the area under the normal curve?
a. Between 34.06 and 46.08?
b. Between 28.6 and 35.11?
c. Greater than 49.5?
d. Less than 44.04?