Module 5 Statistics
Mathematics in the Modern World
OUTCOME BASED MODULE
GE MMW
Republic of the Philippines
PALAWAN STATE UNIVERSITY
Coron Campus
Module 5: Statistics: Data Management
Learning Objectives:
At the end of this lesson, learners will be able to:
a. Solve for the mean, median, and mode
b. Solve word problems involving measures of central tendency of a set of
numerical data
Introduction
Statistics involves the collection, organization, summarization, presentation, and
interpretation of data. The branch of statistics that involves the collection,
summarization, and presentation of data is called descriptive statistics. The branch
that interprets and draws conclusions from the data is called inferential statistics.
Statistics is used in almost every field. Knowing the fundamental operations needed
to interpret your data will help you in the future.
Read:
The Arithmetic Mean
The arithmetic mean is the most commonly used measure of central tendency.
The arithmetic mean of a set of numbers is often referred to simply as the mean. To
find the mean of a set of data, find the sum of the data values and divide it by the
number of data values. For example, find the mean of the following salaries: $43,750,
$39,500, $38,000, $41,250 and $44,000.
Solution:
\[ \text{Mean} = \frac{\$43{,}750 + \$39{,}500 + \$38{,}000 + \$41{,}250 + \$44{,}000}{5} = \frac{\$206{,}500}{5} = \$41{,}300 \]
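The salary example can also be checked with a few lines of Python. This is only a minimal sketch; the function name `mean` and the variable `salaries` are illustrative choices, not part of the module.

```python
def mean(values):
    """Return the arithmetic mean: the sum of the values divided by their count."""
    if not values:
        raise ValueError("mean requires at least one data value")
    return sum(values) / len(values)


# Salaries from the example above, in dollars.
salaries = [43750, 39500, 38000, 41250, 44000]
print(mean(salaries))  # 41300.0
```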
In statistics, it is often necessary to find the sum of a set of numbers. The
traditional symbol used to indicate a summation is the Greek letter sigma, Σ. Thus the
notation Σx, called summation notation, denotes the sum of all the numbers in a given
set. We can define the mean using summation notation.
The mean of n numbers is the sum of the numbers divided by n.
\[ \text{Mean} = \frac{\sum x}{n} \]
Statisticians often collect data from a small portion of a large group in order to
determine information about the group. In such situations, the entire group under
consideration is called a population, and any subset of the population is called a
sample. It is traditional to denote the mean of a sample by x̄ (which is read as “x bar”)
and to denote the mean of the population by the Greek letter μ (lowercase mu).
The Median
Another type of average is the median. Essentially, the median is the middle
number, or the mean of the two middle numbers, in a list of numbers that has been
arranged in numerical order from smallest to largest or largest to smallest. Any list
of numbers arranged in this way is a ranked list.
The median of a ranked list of n numbers is:
The middle number if n is odd.
The mean of the two middle numbers if n is even.
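The odd and even cases of the median can be sketched in Python as follows. The function name `median` and the two data lists are illustrative assumptions chosen only to exercise both cases; they do not come from the module.

```python
def median(values):
    """Return the median of a list of numbers."""
    if not values:
        raise ValueError("median requires at least one data value")
    ranked = sorted(values)          # build the ranked list
    n = len(ranked)
    mid = n // 2
    if n % 2 == 1:                   # n is odd: take the middle number
        return ranked[mid]
    # n is even: take the mean of the two middle numbers
    return (ranked[mid - 1] + ranked[mid]) / 2


print(median([4, 8, 1, 14, 9, 21, 12]))    # ranked: 1 4 8 9 12 14 21 -> 9
print(median([46, 23, 92, 89, 77, 108]))   # ranked: 23 46 77 89 92 108 -> 83.0
```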
The Mode
The third type of average is the mode. The mode of a list of numbers is the
number that occurs most frequently.
Some lists of numbers do not have a mode. For instance, in the list 1, 6, 8, 10, 32,
15 and 49, each number occurs exactly once. Because no number occurs more often
than the other numbers, there is no mode.
A list of numerical data can have more than one mode. For instance, in the list 4,
2, 6, 2, 7, 9, 2, 4, 9, 8, 9, 7, the number 2 occurs three times and the number 9 occurs
three times. Each of the other numbers occurs fewer than three times. Thus 2 and 9 are
both modes for the data.
a. In the list 18, 15, 21, 16, 15, 14, 15, 21, the number 15 occurs more often than
the other numbers. Thus, 15 is the mode.
b. Each number in the list 2, 5, 8, 9, 11, 4, 7, 23, occurs only once. Because no
number occurs more often than the others, there is no mode.
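The three situations described for the mode (one mode, no mode, more than one mode) can be sketched in Python. The function name `modes` is an illustrative choice, and returning an empty list to mean "no mode" is an assumption made for this sketch.

```python
from collections import Counter


def modes(values):
    """Return a sorted list of the modes; an empty list means there is no mode."""
    if not values:
        return []
    counts = Counter(values)              # how many times each number occurs
    highest = max(counts.values())
    if highest == 1:                      # every number occurs exactly once
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([18, 15, 21, 16, 15, 14, 15, 21]))        # [15]
print(modes([2, 5, 8, 9, 11, 4, 7, 23]))              # []
print(modes([4, 2, 6, 2, 7, 9, 2, 4, 9, 8, 9, 7]))    # [2, 9]
```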
The Range
In the preceding section, we introduced three types of average values for a data
set: the mean, the median, and the mode. Some characteristics of a data set may not be
evident from an examination of averages.
The range of a set of data values is the difference between the greatest data
value and the least data value.
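As a small illustration of this definition, the range can be computed in Python. The function name `data_range` (chosen to avoid clashing with Python's built-in `range`) and the sample values are assumptions for this sketch.

```python
def data_range(values):
    """Return the greatest data value minus the least data value."""
    if not values:
        raise ValueError("range requires at least one data value")
    return max(values) - min(values)


# Illustrative data: greatest value 15, least value 2.
print(data_range([2, 4, 7, 12, 15]))  # 13
```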
Solution
The greatest number of ounces dispensed is 10.07 and the least is 5.85. The
range of the numbers of ounces dispensed is 10.07 – 5.85 = 4.22 oz.
If x1, x2, x3, … , xn is a sample of n numbers with a mean of x̄, then the standard
deviation of the sample is
\[ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \qquad (2) \]
Procedure for Computing a Standard Deviation
1. Determine the mean of the n numbers.
2. For each number, calculate the deviation (difference) between the number and
the mean of the numbers.
3. Calculate the square of each deviation and find the sum of these squared
deviations.
4. If the data is a population, divide the sum by n. If the data is a sample, divide
the sum by n-1.
5. Find the square root of the quotient in Step 4.
x       x − x̄            (x − x̄)²
2       2 − 8 = −6       (−6)² = 36
4       4 − 8 = −4       (−4)² = 16
7       7 − 8 = −1       (−1)² = 1
12      12 − 8 = 4       (4)² = 16
15      15 − 8 = 7       (7)² = 49
Sum of the squared deviations: 118
Step 3: Calculate the square of each deviation in Step 2, and find the sum of these
squared deviations.
Step 4: Because we have a sample of n = 5 values, divide the sum 118 by n − 1, which
is 4.
\[ \frac{118}{4} = 29.5 \]
Step 5: The standard deviation of the sample is s = √29.5. To the nearest hundredth, the
standard deviation is s = 5.43.
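The five-step procedure can be traced in Python using the same data as the worked example (2, 4, 7, 12, 15). This is a minimal sketch; the function name `standard_deviation` and the `is_sample` flag are illustrative assumptions, not notation from the module.

```python
import math


def standard_deviation(values, is_sample=True):
    """Follow the five-step procedure for a sample (n - 1) or a population (n)."""
    n = len(values)
    if n < 2 and is_sample:
        raise ValueError("a sample needs at least two data values")
    if n == 0:
        raise ValueError("at least one data value is required")
    mean = sum(values) / n                                 # Step 1
    deviations = [x - mean for x in values]                # Step 2
    sum_of_squares = sum(d ** 2 for d in deviations)       # Step 3
    divisor = n - 1 if is_sample else n                    # Step 4
    return math.sqrt(sum_of_squares / divisor)             # Step 5


data = [2, 4, 7, 12, 15]
print(round(standard_deviation(data), 2))  # 5.43
```

Passing `is_sample=False` divides by n instead of n − 1, which corresponds to treating the data as a population in Step 4.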
The Variance
A statistic known as the variance is also used as a measure of dispersion. The
variance for a given set of data is the square of the standard deviation of the data. The
following chart shows the mathematical notations that are used to denote standard
deviations and variances.
Notations for Standard Deviation and Variance
σ is the standard deviation of a population.
σ² is the variance of the population.
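The relationship between variance and standard deviation can be checked with Python's standard `statistics` module. This sketch assumes the five-value data set 2, 4, 7, 12, 15 purely for illustration; the variable names are not from the module.

```python
import math
import statistics

data = [2, 4, 7, 12, 15]

# Treating the data as a sample (divide by n - 1).
sample_sd = statistics.stdev(data)
sample_var = statistics.variance(data)

# Treating the data as a population (divide by n).
population_sd = statistics.pstdev(data)
population_var = statistics.pvariance(data)

# In both cases the variance is the square of the standard deviation.
print(math.isclose(sample_sd ** 2, sample_var))          # True
print(math.isclose(population_sd ** 2, population_var))  # True
```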
weights of newborn babies at a hospital, the SAT scores of a large group of students,
and the lifespans of light bulbs.
A normal distribution forms a bell-shaped curve that is symmetric about a vertical
line through the mean of the data.
Properties of a Normal Distribution
Every normal distribution has the following properties:
The graph is symmetric about a vertical line through the mean of the
distribution.
The mean, median and mode are equal.
The y-value of each point on the curve is the percent (expressed as a
decimal) of the data at the corresponding x-value.
Areas under the curve that are symmetric about the mean are equal.
The total area under the curve is 1.
Empirical Rule for a Normal Distribution
[Figure: A normal distribution]
$3.10 and a standard deviation of $0.18. How many of the stations charge between
$2.74 and $3.46 for a gallon of regular gas?
Solution: The $2.74 per gallon price is 2 standard deviations below the mean. The
$3.46 price is 2 standard deviations above the mean. In a normal distribution, 95% of all
the data lie within 2 standard deviations of the mean. Therefore, approximately
(95%)(1000) = (0.95)(1000) = 950
of the stations charge between $2.74 and $3.46 for a gallon of regular gas.
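The reasoning in this solution can be sketched in Python. The 95% figure for 2 standard deviations comes from the solution above; the 68% and 99.7% figures for 1 and 3 standard deviations are the usual empirical-rule values and are included here as an assumption, since the module's statement of the rule is not reproduced in this section. The function name `expected_count` is illustrative.

```python
# Approximate fractions of normally distributed data within k standard
# deviations of the mean (empirical rule; 68% and 99.7% are assumed values).
EMPIRICAL_RULE = {1: 0.68, 2: 0.95, 3: 0.997}


def expected_count(mean, std_dev, low, high, total):
    """Estimate how many of `total` values lie between `low` and `high`.

    Handles only intervals that are symmetric about the mean and extend
    exactly 1, 2, or 3 standard deviations on each side.
    """
    k_below = round((mean - low) / std_dev, 6)
    k_above = round((high - mean) / std_dev, 6)
    if k_below != k_above or k_below not in EMPIRICAL_RULE:
        raise ValueError("interval must be mean ± 1, 2, or 3 standard deviations")
    return round(EMPIRICAL_RULE[k_below] * total)


# Gas price example: mean $3.10, standard deviation $0.18, 1000 stations.
print(expected_count(3.10, 0.18, 2.74, 3.46, 1000))  # 950
```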
z-Scores
Tables for the normal distribution are based on the standard normal distribution, which
has mean μ = 0 and standard deviation σ = 1. Any normal random variable x can be
transformed to a standard normal random variable z using the formula:
\[ z = \frac{x - \mu}{\sigma} \quad \text{or} \quad z = \frac{x - \bar{x}}{s} \]
second test, for which the mean of all scores was 45 and the standard deviation was 12. In
comparison to the other students, did Raul do better on the first test or the second test?
Solution:
\[ \text{First test: } z = \frac{72 - 65}{8} = 0.875 \qquad \text{Second test: } z = \frac{60 - 45}{12} = 1.25 \]
Raul scored 0.875 standard deviations above the mean on the first test and 1.25
standard deviations above the mean on the second test. These z-scores indicate that, in
comparison to his classmates, Raul scored better on the second test than he did on the
first one.
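The comparison of Raul's two tests can be reproduced with a short Python sketch. The function name `z_score` and the variable names are illustrative; the scores, means, and standard deviations are the ones used in the solution above.

```python
def z_score(x, mean, std_dev):
    """Return how many standard deviations x lies above (+) or below (-) the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / std_dev


first_test = z_score(72, 65, 8)     # 0.875
second_test = z_score(60, 45, 12)   # 1.25

# The larger z-score marks the better performance relative to the other students.
better = "second" if second_test > first_test else "first"
print(first_test, second_test, better)  # 0.875 1.25 second
```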
Where:
\[ SS_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} \]
\[ SS_{yy} = \sum y^2 - \frac{(\sum y)^2}{n} \]
\[ SS_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n} \]
and n is the sample size and “SS” stands for sum of squares.
The square of r, written r², is called the coefficient of determination. It describes
the proportion of the variability in the dependent variable y that is explained by the
independent variable x.
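The three sums of squares can be computed in Python as sketched below. The formula for r is not reproduced in this section, so the sketch assumes the standard definition r = SSxy / √(SSxx · SSyy); the function names and the five data pairs are likewise illustrative assumptions.

```python
import math


def sums_of_squares(xs, ys):
    """Return (SSxx, SSyy, SSxy) for paired data."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    ss_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    ss_yy = sum(y * y for y in ys) - sum_y ** 2 / n
    ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    return ss_xx, ss_yy, ss_xy


def correlation(xs, ys):
    """Assumed formula: r = SSxy / sqrt(SSxx * SSyy)."""
    ss_xx, ss_yy, ss_xy = sums_of_squares(xs, ys)
    return ss_xy / math.sqrt(ss_xx * ss_yy)


# Illustrative paired data (not from the module).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(sums_of_squares(xs, ys))   # (10.0, 6.0, 6.0)
r = correlation(xs, ys)
print(round(r, 4))               # 0.7746
print(round(r ** 2, 4))          # 0.6, the coefficient of determination
```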
References:
Published:
Aufmann, Richard N., et al. Mathematical Excursions. 3rd ed., Brooks/Cole, Cengage
Learning.
Aufmann, R., Lockwood, J., Nation, R., Clegg, D., Epp, S. Mathematics in the Modern
World. Cengage Learning.