Module 5 Statistics

Uploaded by

Shemiah Gonzales
Copyright
© All Rights Reserved

Republic of the Philippines

PALAWAN STATE UNIVERSITY


Coron Campus

Mathematics in
the Modern World
OUTCOME BASED MODULE

GE MMW

Module 5:
Statistics: Data
Management

Student Signature: Date


Returned:


Lesson 1: Measures of Central Tendency

Learning Objectives:
At the end of this lesson, learners will be able to:
a. Solve for the mean, median, and mode
b. Solve word problems involving measures of central tendency for a set of
numerical data

Introduction
Statistics involves the collection, organization, summarization, presentation, and
interpretation of data. The branch of statistics that involves the collection, summarization,
and presentation of data is called descriptive statistics. The branch that interprets and
draws conclusions from the data is called inferential statistics.
Statistics is used in almost every field. Knowing the fundamental operations
needed to interpret your data will help you in the future.

Read:
The Arithmetic Mean
The arithmetic mean is the most commonly used measure of central tendency.
The arithmetic mean of a set of numbers is often referred to simply as the mean. To find the
mean of a set of data, find the sum of the data values and divide it by the number of
data values. For example, find the mean of the following salaries: $43,750, $39,500,
$38,000, $41,250, and $44,000.

Solution:
\[
\text{Mean} = \frac{\$43{,}750 + \$39{,}500 + \$38{,}000 + \$41{,}250 + \$44{,}000}{5} = \frac{\$206{,}500}{5} = \$41{,}300
\]

In statistics, it is often necessary to find the sum of a set of numbers. The
traditional symbol used to indicate a summation is the Greek letter sigma, Σ. Thus the
notation Σx, called summation notation, denotes the sum of all the numbers in a given
set. We can define the mean using summation notation.
The mean of n numbers is the sum of the numbers divided by n.
\[
\text{Mean} = \frac{\Sigma x}{n}
\]
Statisticians often collect data from a small portion of a large group in order to
determine information about the group. In such situations, the entire group under
consideration is called a population, and any subset of the population is called a sample. It
is traditional to denote the mean of a sample by x̄ (which is read as “x bar”) and to
denote the mean of a population by the Greek letter μ (lowercase mu).

Example 1: Find a mean


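Six friends in a biology class of 20 students received test grades of 92, 84, 65, 76,
88, and 90. Find the mean of these test scores.
Solution:
The six friends are a sample of the class, so the mean is denoted by x̄.
\[
\bar{x} = \frac{\Sigma x}{n} = \frac{92 + 84 + 65 + 76 + 88 + 90}{6} = \frac{495}{6} = 82.5
\]
The mean of the test scores is 82.5.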
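
The same calculation can be sketched in Python. This is only an illustration of the formula Mean = Σx / n; the function name `mean` and the variable names are chosen here for the example and are not part of the module.

```python
def mean(values):
    """Return the arithmetic mean: the sum of the values divided by their count."""
    if not values:
        raise ValueError("the mean of an empty list is undefined")
    return sum(values) / len(values)


# Data taken from the salary example and from Example 1.
salaries = [43750, 39500, 38000, 41250, 44000]
test_scores = [92, 84, 65, 76, 88, 90]

print(mean(salaries))     # 41300.0
print(mean(test_scores))  # 82.5
```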

The Median
Another type of average is the median. Essentially, the median is the middle
number, or the mean of the two middle numbers, in a list of numbers that has been
arranged in numerical order from smallest to largest or largest to smallest. Any list of
numbers arranged in this way is a ranked list.
The median of a ranked list of n numbers is:
• The middle number if n is odd.
• The mean of the two middle numbers if n is even.


Example 2: Find a median


Find the median of the data in the following lists:
a. 4, 8, 1, 14, 9, 21, 12
b. 46, 23, 92, 89, 77, 108
Solution:
a. The list 4, 8, 1, 14, 9, 21, 12 contains 7 numbers. The median of a list with an
odd number of entries is found by ranking the numbers and finding the middle number.
Ranking the numbers from the smallest to largest gives
1, 4, 8, 9, 12, 14, 21
The middle number is 9. Thus 9 is the median.
b. The list 46, 23, 92, 89, 77, 108 contains 6 numbers. The median of a list of
data with an even number of entries is found by ranking the numbers and computing the
mean of the two middle numbers. Ranking the numbers from smallest to largest gives
23, 46, 77, 89, 92, 108
The two middle numbers are 77 and 89. The mean of 77 and 89 is 83. Thus 83 is
the median of the data.
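
The ranking procedure can also be written as a short Python sketch. The function name `median` is an assumption made for this illustration; it ranks the list first and then applies the odd and even rules stated above.

```python
def median(values):
    """Return the median of a list by ranking it and taking the middle value(s)."""
    if not values:
        raise ValueError("the median of an empty list is undefined")
    ranked = sorted(values)          # rank from smallest to largest
    n = len(ranked)
    middle = n // 2
    if n % 2 == 1:                   # odd count: the single middle number
        return ranked[middle]
    # even count: the mean of the two middle numbers
    return (ranked[middle - 1] + ranked[middle]) / 2


# Lists from Example 2.
print(median([4, 8, 1, 14, 9, 21, 12]))    # 9
print(median([46, 23, 92, 89, 77, 108]))   # 83.0
```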

The Mode
The third type of average is the mode. The mode of a list of numbers is the
number that occurs most frequently.
Some lists of numbers do not have a mode. For instance, in the list 1, 6, 8, 10, 32,
15, and 49, each number occurs exactly once. Because no number occurs more often
than the other numbers, there is no mode.
A list of numerical data can have more than one mode. For instance, in the list 4,
2, 6, 2, 7, 9, 2, 4, 9, 8, 9, 7, the number 2 occurs three times and the number 9 occurs
three times. Each of the other numbers occurs fewer than three times. Thus 2 and 9 are
both modes for the data.

Example 3: Find a mode


Find the mode of the data in the following lists:
a. 18, 15, 21, 16, 15, 14, 15, 21
b. 2, 5, 8, 9, 11, 4, 7, 23
Solution:

a. In the list 18, 15, 21, 16, 15, 14, 15, 21, the number 15 occurs more often than
the other numbers. Thus, 15 is the mode.
b. Each number in the list 2, 5, 8, 9, 11, 4, 7, 23, occurs only once. Because no
number occurs more often than the others, there is no mode.
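
A possible Python sketch of this counting rule is shown below. The function name `modes` is hypothetical. It follows the convention used in this lesson: when no number occurs more often than the others, the list has no mode, which the sketch reports as an empty list.

```python
from collections import Counter


def modes(values):
    """Return a sorted list of the modes, or an empty list if there is no mode."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    # If several distinct values all occur equally often, no number occurs
    # more often than the others, so there is no mode.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([18, 15, 21, 16, 15, 14, 15, 21]))        # [15]
print(modes([2, 5, 8, 9, 11, 4, 7, 23]))              # []
print(modes([4, 2, 6, 2, 7, 9, 2, 4, 9, 8, 9, 7]))    # [2, 9]
```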

The Weighted Mean


A value called the weighted mean is often used when some data values are
more important than others. For instance, many professors determine a student’s
course grade from the student’s tests and the final examination. Consider the situation
in which a professor counts the final examination score as 2 test scores. To find the
weighted mean of the student’s scores, the professor first assigns a weight to each
score. In this case, the professor could assign each of the test scores a weight of 1 and
the final exam score a weight of 2. A student with test scores of 65, 70, and 75 and a final
examination score of 90 has a weighted mean of:
\[
\text{Weighted Mean} = \frac{(65 \times 1) + (70 \times 1) + (75 \times 1) + (90 \times 2)}{5} = \frac{390}{5} = 78
\]
Note that the numerator of the weighted mean above is the sum of the products
of each test score and its corresponding weight. The number 5 in the denominator is the
sum of all the weights (1 + 1 + 1 + 2 = 5). The procedure for finding the weighted mean can
be generalized as follows:
The weighted mean of the n numbers x1, x2, x3, … , xn with the respective assigned
weights w1, w2, w3, … , wn is
\[
\text{Weighted Mean} = \frac{\Sigma (x \cdot w)}{\Sigma w}
\]
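
The general formula can be illustrated with a short Python sketch. The function name `weighted_mean` and its parameter names are assumptions made for this example; the data are the test scores and weights from the paragraph above.

```python
def weighted_mean(values, weights):
    """Return sum(x * w) / sum(w) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("each value needs exactly one weight")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(x * w for x, w in zip(values, weights)) / total_weight


# Three tests with weight 1 each and a final examination with weight 2.
scores = [65, 70, 75, 90]
weights = [1, 1, 1, 2]
print(weighted_mean(scores, weights))  # 78.0
```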

Lesson 2: Measures of Dispersion

The Range
In the preceding section, we introduced three types of average values for a data
set: the mean, the median, and the mode. Some characteristics of a data set, such as how
widely its values are spread, may not be evident from an examination of averages.
The range of a set of data values is the difference between the greatest data
value and the least data value.


Example 4: Find a Range


Find the range of the number of ounces dispensed by vending Machine 1, using the data
in the table below.
Machine 1     Machine 2
9.52          8.01
6.41          7.99
10.07         7.95
5.85          8.03
8.15          8.02
x̄ = 8.0       x̄ = 8.0

Solution
The greatest number of ounces dispensed is 10.07 and the least is 5.85. The
range of the numbers of ounces dispensed is 10.07 – 5.85 = 4.22 oz.
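
As an illustration, the range of both machines can be computed in Python. The function name `data_range` is chosen for this sketch. The results are rounded to two decimal places because decimal values are stored only approximately in floating point.

```python
def data_range(values):
    """Return the greatest data value minus the least data value."""
    if not values:
        raise ValueError("the range of an empty list is undefined")
    return max(values) - min(values)


# Ounces dispensed, from the table in Example 4.
machine_1 = [9.52, 6.41, 10.07, 5.85, 8.15]
machine_2 = [8.01, 7.99, 7.95, 8.03, 8.02]

print(round(data_range(machine_1), 2))  # 4.22
print(round(data_range(machine_2), 2))  # 0.08
```

Both machines have the same mean of 8.0 oz, but the ranges show that Machine 1 is far less consistent than Machine 2.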

The Standard Deviation


The range of a set of data is easy to compute, but it can be deceiving. The
range depends only on the two most extreme values, and as such it is
very sensitive to them. A measure of dispersion that is less sensitive to extreme values is the
standard deviation. The standard deviation of a set of numerical data makes use of the
amount by which each individual data value deviates from the mean. These deviations,
represented by (x − x̄), are positive when the data value x is greater than the mean x̄ and
negative when x is less than the mean x̄. The sum of all the deviations (x − x̄) is 0 for
all sets of data. This is shown in the table below for Machine 2 in the example above.
x        x − x̄
8.01     8.01 − 8 = 0.01
7.99     7.99 − 8 = −0.01
7.95     7.95 − 8 = −0.05
8.03     8.03 − 8 = 0.03
8.02     8.02 − 8 = 0.02
Sum of deviations = 0

If x1, x2, x3, … , xn is a population of n numbers with a mean of μ, then the
standard deviation of the population is
\[
\sigma = \sqrt{\frac{\Sigma (x - \mu)^2}{n}} \qquad (1)
\]
If x1, x2, x3, … , xn is a sample of n numbers with a mean of x̄, then the standard
deviation of the sample is
\[
s = \sqrt{\frac{\Sigma (x - \bar{x})^2}{n - 1}} \qquad (2)
\]
Procedure for Computing a Standard Deviation
1. Determine the mean of the n numbers.
2. For each number, calculate the deviation (difference) between the number and
the mean of the numbers.
3. Calculate the square of each deviation and find the sum of these squared
deviations.
4. If the data is a population, divide the sum by n. If the data is a sample, divide
the sum by n-1.
5. Find the square root of the quotient in Step 4.

Example 5: Find the Standard Deviation


The following numbers were obtained by sampling a population.
2, 4, 7, 12, 15
Find the standard deviation of the sample.
Solution:
Step 1: The mean of the numbers is
\[
\bar{x} = \frac{2 + 4 + 7 + 12 + 15}{5} = \frac{40}{5} = 8
\]
Step 2: For each number, calculate the deviation between the number and the mean.
x      x − x̄
2      2 − 8 = −6
4      4 − 8 = −4
7      7 − 8 = −1
12     12 − 8 = 4
15     15 − 8 = 7

Step 3: Calculate the square of each deviation in Step 2, and find the sum of these
squared deviations.
x      x − x̄           (x − x̄)²
2      2 − 8 = −6      (−6)² = 36
4      4 − 8 = −4      (−4)² = 16
7      7 − 8 = −1      (−1)² = 1
12     12 − 8 = 4      (4)² = 16
15     15 − 8 = 7      (7)² = 49
Sum of the squared deviations = 118

Step 4: Because we have a sample of n = 5 values, divide the sum 118 by n − 1, which
is 4.
\[
\frac{118}{4} = 29.5
\]
Step 5: The standard deviation of the sample is s = √29.5. To the nearest hundredth, the
standard deviation is s = 5.43.
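
The five-step procedure can be sketched in Python as follows. The function name `standard_deviation` and the `sample` flag are assumptions made for this illustration: `sample=True` divides by n − 1 as in formula (2), and `sample=False` divides by n as in formula (1).

```python
import math


def standard_deviation(values, sample=True):
    """Follow the five-step procedure for a sample (n - 1) or a population (n)."""
    n = len(values)
    if n == 0 or (sample and n < 2):
        raise ValueError("not enough data values")
    mean = sum(values) / n                                 # Step 1
    deviations = [x - mean for x in values]                # Step 2
    sum_of_squares = sum(d ** 2 for d in deviations)       # Step 3
    divisor = n - 1 if sample else n                       # Step 4
    return math.sqrt(sum_of_squares / divisor)             # Step 5


data = [2, 4, 7, 12, 15]  # the sample from Example 5
print(round(standard_deviation(data), 2))                # 5.43
print(round(standard_deviation(data, sample=False), 2))  # 4.86
```

The second call treats the same five numbers as a whole population, which gives a slightly smaller value because the sum of squares is divided by 5 instead of 4.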

The Variance
A statistic known as the variance is also used as a measure of dispersion. The
variance for a given set of data is the square of the standard deviation of the data. The
following chart shows the mathematical notations that are used to denote standard
deviations and variances.
Notations for Standard Deviation and Variance
σ is the standard deviation of a population.
σ² is the variance of a population.
s is the standard deviation of a sample.
s² is the variance of a sample.
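
To see the relationship between the two measures, the sketch below uses Python's standard `statistics` module on the sample from Example 5. Using this module is a choice made for the illustration, not something the lesson prescribes. Its `stdev` and `variance` functions use the sample formulas (dividing by n − 1), and `pstdev` and `pvariance` use the population formulas (dividing by n).

```python
import statistics

data = [2, 4, 7, 12, 15]  # the sample from Example 5

sample_sd = statistics.stdev(data)            # s
sample_variance = statistics.variance(data)   # s squared

print(round(sample_sd, 2))   # 5.43
print(sample_variance)       # 29.5

# The variance is the square of the standard deviation
# (up to floating-point rounding).
print(abs(sample_sd ** 2 - sample_variance) < 1e-9)  # True
```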

Lesson 3: Normal Distribution and Linear Regression

The Normal Distribution and the Empirical Rule


One of the most important statistical distributions of data is known as the normal
distribution. This distribution occurs in a variety of applications. Types of data that may
demonstrate a normal distribution include the lengths of the leaves on a tree, the
weights of newborn babies at a hospital, the SAT scores of a large group of students,
and the life spans of light bulbs.
A normal distribution forms a bell-shaped curve that is symmetric about a vertical
line through the mean of the data.
Properties of a Normal Distribution
Every normal distribution has the following properties:
• The graph is symmetric about a vertical line through the mean of the
distribution.
• The mean, median, and mode are equal.
• The y-value of each point on the curve is the percent (expressed as a
decimal) of the data at the corresponding x-value.
• Areas under the curve that are symmetric about the mean are equal.
• The total area under the curve is 1.
Empirical Rule for a Normal Distribution
[Figure: A normal distribution. A bell curve with the x-axis marked at μ − 3σ, μ − 2σ, μ − σ, μ, μ + σ, μ + 2σ, and μ + 3σ. The six regions between consecutive marks contain 2.35%, 13.5%, 34%, 34%, 13.5%, and 2.35% of the data.]

In a normal distribution, approximately
• 68% of the data lie within 1 standard deviation of the mean.
• 95% of the data lie within 2 standard deviations of the mean.
• 99.7% of the data lie within 3 standard deviations of the mean.

Example 6: Use the Empirical Rule to Solve an Application


A survey of 1000 U.S. gas stations found that the price charged for a gallon of
regular gas could be closely approximated by a normal distribution with a mean of
$3.10 and a standard deviation of $0.18. How many of the stations charge between
$2.74 and $3.46 for a gallon of regular gas?
Solution: The $2.74 per gallon price is 2 standard deviations below the mean, because
$3.10 − 2($0.18) = $2.74. The $3.46 price is 2 standard deviations above the mean, because
$3.10 + 2($0.18) = $3.46. In a normal distribution, 95% of all
the data lie within 2 standard deviations of the mean. Therefore, approximately
(95%)(1000) = (0.95)(1000) = 950
of the stations charge between $2.74 and $3.46 for a gallon of regular gas.
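
The reasoning in this solution can be sketched in Python. The names `EMPIRICAL_SHARE` and `expected_count_within` are hypothetical and introduced only for this illustration. The sketch handles only intervals that extend exactly 1, 2, or 3 standard deviations on both sides of the mean, which are the cases the Empirical Rule covers directly.

```python
# Approximate share of the data within k standard deviations of the mean.
EMPIRICAL_SHARE = {1: 0.68, 2: 0.95, 3: 0.997}


def expected_count_within(mean, sd, low, high, total):
    """Estimate how many of `total` values lie between low and high."""
    # Number of standard deviations from the mean to each bound,
    # rounded to absorb floating-point error.
    k_below = round((mean - low) / sd, 6)
    k_above = round((high - mean) / sd, 6)
    if k_below != k_above or k_below not in EMPIRICAL_SHARE:
        raise ValueError("interval is not 1, 2, or 3 standard deviations on both sides")
    return round(EMPIRICAL_SHARE[k_below] * total)


# Values from Example 6.
print(expected_count_within(3.10, 0.18, 2.74, 3.46, 1000))  # 950
```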

z scores
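Tables for the normal distribution are based on the standard normal distribution, whose
mean is μ = 0 and whose standard deviation is σ = 1. Any normal random variable x can be transformed
to a standard normal random variable z using the formula:
\[
z = \frac{x - \mu}{\sigma} \quad \text{or} \quad z = \frac{x - \bar{x}}{s}
\]
The first form is used for a population and the second for a sample.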

Example 7: Use z scores to solve for a missing value.


A consumer group tested a sample of 100 light bulbs. It found that the mean
life expectancy of the bulbs was 842 h, with a standard deviation of 90 h. One particular
light bulb from the DuraBright Company had a z-score of 1.2. What was the life span of
the light bulb?
Solution:
Substitute the given values into the z-score equation and solve for x.
\[
\begin{aligned}
z &= \frac{x - \bar{x}}{s} \\
1.2 &= \frac{x - 842}{90} \\
108 &= x - 842 \\
x &= 950
\end{aligned}
\]
The light bulb had a life span of 950 h.

Example 8: Compare z scores


Raul has taken two tests in his chemistry class. He scored 72 on the first test, for which
the mean of all scores was 65 and the standard deviation was 8. He received a 60 on the
second test, for which the mean of all scores was 45 and the standard deviation was 12. In
comparison to the other students, did Raul do better on the first test or the second test?
Solution:

Find the z-score for each test.

\[
\text{First test: } z = \frac{72 - 65}{8} = 0.875 \qquad \text{Second test: } z = \frac{60 - 45}{12} = 1.25
\]

Raul scored 0.875 standard deviations above the mean on the first test and 1.25
standard deviations above the mean on the second test. These z-scores indicate that, in
comparison to his classmates, Raul scored better on the second test than he did on the
first one.
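
Examples 7 and 8 can both be checked with a small Python sketch. The function names `z_score` and `value_from_z` are assumptions made for this illustration. The first applies the z-score formula, and the second rearranges it to solve for x.

```python
def z_score(x, mean, sd):
    """Return how many standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


def value_from_z(z, mean, sd):
    """Solve z = (x - mean) / sd for x."""
    return mean + z * sd


# Example 7: life span of the bulb with a z-score of 1.2.
print(round(value_from_z(1.2, 842, 90), 2))  # 950.0

# Example 8: compare Raul's two test results.
first_test = z_score(72, 65, 8)
second_test = z_score(60, 45, 12)
print(first_test, second_test)  # 0.875 1.25
print("second" if second_test > first_test else "first")  # second
```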

Simple Linear Regression


Linear regression is the most basic and commonly used predictive analysis in
the field of statistics and time analysis. Regression estimates are used to describe data
and to explain the nature of the relationship among the variables involved.
Correlation analysis is a statistical technique used to study how closely
variables are related. Regression analysis is used to determine the nature of the
relationship. In a two-variable linear regression, or simple linear regression, a positive
relationship occurs when the two variables increase together, while a
negative relationship occurs when one variable increases as the other variable
decreases.
Correlation coefficient: To determine whether a linear relationship exists between
two variables, use the correlation coefficient r, whose values range from −1 to 1.
Value of r      Interpretation
Close to +1     Strong positive linear relationship
Close to 0      Weak or no linear relationship
Close to −1     Strong negative linear relationship

The value of r may be obtained by the least squares method.
\[
r = \frac{SS_{xy}}{\sqrt{SS_{xx}\, SS_{yy}}}
\]
where
\[
SS_{xx} = \Sigma x^2 - \frac{(\Sigma x)^2}{n}
\]
\[
SS_{yy} = \Sigma y^2 - \frac{(\Sigma y)^2}{n}
\]
\[
SS_{xy} = \Sigma xy - \frac{(\Sigma x)(\Sigma y)}{n}
\]
and n is the sample size and “SS” stands for sum of squares.
The square of r, written r², is called the coefficient of determination. It
describes the proportion of the variability in the dependent variable y that is
explained by the independent variable x.
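
The formulas for r can be sketched in Python as shown below. The function name `correlation_coefficient` is hypothetical, and the paired data (`hours` and `scores`) are made-up numbers used only to exercise the formulas. They do not come from this module.

```python
import math


def correlation_coefficient(xs, ys):
    """Compute r = SSxy / sqrt(SSxx * SSyy) for paired data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    sum_x, sum_y = sum(xs), sum(ys)
    ss_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    ss_yy = sum(y * y for y in ys) - sum_y ** 2 / n
    ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    if ss_xx == 0 or ss_yy == 0:
        raise ValueError("r is undefined when one variable does not vary")
    return ss_xy / math.sqrt(ss_xx * ss_yy)


# Hypothetical illustration data (not from the module).
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

r = correlation_coefficient(hours, scores)
print(round(r, 4))       # 0.7746
print(round(r ** 2, 4))  # 0.6
```

For these made-up numbers, SSxx = 10, SSyy = 6, and SSxy = 6, so r = 6 / √60 ≈ 0.7746, a fairly strong positive linear relationship, and r² = 0.6.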

References:
Published:
Aufmann, Richard N., et al. Mathematical Excursions. 3rd ed., Brooks/Cole, Cengage
Learning.
Aufmann, R., Lockwood, J., Nation, R., Clegg, D., Epp, S. Mathematics in the Modern
World. Cengage Learning.

Congratulations on Finishing Module 5.


Keep up the Good Work!
