
STATISTICS MODULE 2-Updated

This document provides an overview of descriptive statistics and introduces key concepts related to measures of central tendency and variability. It discusses summary statistics, measures of central tendency including the mean, median and mode, and measures of variability. Examples are provided to demonstrate how to calculate and interpret the mean, median, mode, and measures of variability using sample data. The purpose is to help students understand and apply descriptive statistics to describe basic features of data through simple numerical and graphical analyses.
Copyright
© All Rights Reserved

STATISTICS

MODULE

FE C. MONTECALVO
Professor VI

GRADUATE SCHOOL
MAED, MSGC, MAST, MBA, & MPM
2020

[Course Code]: [Course Title] Page 1 of 21


MODULE 2

Module Title: Descriptive Statistics

Module Description:

This module deals with descriptive statistics, the basic concepts used to describe the main features of the data in a study. Descriptive statistics provide simple summaries about the sample and the measures. Together with simple graphical analysis, they form the basis of virtually every quantitative analysis of data.

This module contains a detailed discussion of the measures of central tendency and their characteristics, measures of location, and measures of variability. Enjoy learning this module, and go over the discussion and examples again if you have not yet mastered a concept.

Purpose of the Module:

This module provides students with an understanding of descriptive statistics, which serve two purposes: 1) to provide basic information about variables in a dataset and 2) to highlight potential relationships between variables.

Module Guide:

This module is designed to engage the learner in the topic of descriptive statistics. It will take approximately two weeks to complete. It is divided into two (2) lessons designed for independent online or offline learning.

Lesson 1: Measures of Central Tendency / Other Measures of Location (Ungrouped Data)
Lesson 2: Measures of Variability (Ungrouped Data)

Module Outcomes:

At the end of the module, the students should be able to:

1. Define the following terms:
   1.1 summary statistics
   1.2 measures of central tendency
   1.3 measures of location
   1.4 measures of variability
   1.5 mean, median, mode, variance, and standard deviation
2. Determine the characteristics of the mean, median, and mode;
3. Describe and calculate the measures of central tendency (mean, median, and mode) using ungrouped and grouped data;
4. Explain the importance of measuring variability;
5. Understand the concept of the variability of a data set;
6. Compute the measures of variability of a data set (range, inter-quartile range, mean absolute deviation, variance, and standard deviation) using ungrouped and grouped data; and
7. Use Excel to find the measures of central tendency and measures of variability.

Module Requirements:

At the end of this module, the students shall submit the following:

1. Assignments
2. Quizzes

Assessments:

Reflected in Lessons 1 and 2

Key Terms:
• Mean
• Median
• Mode
• Variability
• Range
• Inter-quartile range
• Quartile deviation
• Mean absolute deviation
• Variance
• Standard deviation


Learning Plan
Lesson No: 1

Lesson Title: Measures of Central Tendency / Other Measures of Location (Ungrouped Data)

Let’s Hit These:

At the end of the lesson, the students should be able to:

1. Define the following terms:
   1.1 summary statistics
   1.2 measures of central tendency
   1.3 measures of location
   1.4 mean, median, and mode
2. Determine the characteristics of the mean, median, and mode.
3. Describe and calculate the measures of central tendency: mean, median, and mode.

Let’s Get Started:

SUMMARY STATISTICS

Numerical measures that are used to describe certain characteristics of the data.

Common Types of Summary Measures

• Measures of Central Tendency
• Measures of Location
• Measures of Dispersion

MEASURES OF CENTRAL TENDENCY

Any single value that is used to identify the “center” of the data or the typical value; it is often referred to as the average.

The Mean

• The sum of all values of the observations divided by the number of observations in the data set


Population Mean (for a finite population):

$$\mu = \frac{\text{sum of the observations}}{\text{size of the population}} = \frac{\sum_{i=1}^{N} x_i}{N}$$

Sample Mean:

$$\bar{x} = \frac{\text{sum of the observations}}{\text{size of the sample}} = \frac{\sum_{i=1}^{n} x_i}{n}$$

Let’s Find Out:

Example:

The achievement test scores in Math of all 50 freshman students from a certain college are as follows:

43 51 53 55 57 58 58 59 61 61
61 62 63 64 65 65 66 66 67 68
68 69 69 69 69 70 70 70 71 71
72 73 73 74 74 75 76 76 77 78
79 79 81 82 82 85 87 89 91 96

The mean of this population is:

$$\mu = \frac{43 + 51 + \cdots + 91 + 96}{50} = \frac{3498}{50} = 69.96$$

Suppose that a sample of seven students from this college yielded the following observations:

70, 82, 77, 96, 55, 85, 64

The corresponding sample mean is

$$\bar{x} = \frac{70 + 82 + 77 + 96 + 55 + 85 + 64}{7} = \frac{529}{7} \approx 75.57$$

Suppose another sample of students of the same size was taken and resulted in the following scores:

58, 72, 77, 89, 63, 85, 51

The sample mean is given by

$$\bar{x} = \frac{58 + 72 + 77 + 89 + 63 + 85 + 51}{7} = \frac{495}{7} \approx 70.714$$
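The population and sample means above can be checked with a few lines of Python. This is only a minimal sketch: the variable names (`population`, `sample_1`, `sample_2`) are illustrative and not part of the module, and the second sample uses the scores exactly as listed (58, 72, 77, 89, 63, 85, 51).

```python
from statistics import mean

# All 50 achievement test scores (treated as the population in the example).
population = [
    43, 51, 53, 55, 57, 58, 58, 59, 61, 61,
    61, 62, 63, 64, 65, 65, 66, 66, 67, 68,
    68, 69, 69, 69, 69, 70, 70, 70, 71, 71,
    72, 73, 73, 74, 74, 75, 76, 76, 77, 78,
    79, 79, 81, 82, 82, 85, 87, 89, 91, 96,
]
# The two samples of seven students each.
sample_1 = [70, 82, 77, 96, 55, 85, 64]
sample_2 = [58, 72, 77, 89, 63, 85, 51]

mu = sum(population) / len(population)  # population mean: sum / N
xbar_1 = mean(sample_1)                 # sample mean: sum / n
xbar_2 = mean(sample_2)

print(round(mu, 2), round(xbar_1, 2), round(xbar_2, 3))  # 69.96 75.57 70.714
```

Note that the two sample means differ from each other and from the population mean, which is expected: each sample uses only seven of the 50 scores.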

Characteristics of the Mean

• It is the most familiar measure of central tendency, and it employs all available information.
• It is strongly influenced by extreme values.
• Since the mean is a calculated number, it may not be an actual number in the data set.
• It can be applied to data that are measured on at least the interval level.

The Median

• A value that divides an ordered set of data (array) into two equal parts; it is commonly denoted by Md
• A value below which one-half of the data must fall

To get the median:

• When the number of observations (n) is odd:
  Md = middle value in the array
     = ((n + 1)/2)th observation in the array
• When the number of observations is even:
  Md = mean of the two middle values in the array
     = mean of the (n/2)th and (n/2 + 1)th observations in the array
Examples:

a. The following are the total receipts of 7 companies (in million pesos):

1.2, 7.2, 12.5, 6.5, 50.6, 4.5, 10.4

The array corresponding to the above data is given by

1.2, 4.5, 6.5, 7.2, 10.4, 12.5, 50.6


Thus, the median is 7.2

b. The following are the numbers of years of operation of 8 manufacturing companies:

8, 10, 17, 18, 11, 16, 17, 10

The array is given by

8, 10, 10, 11, 16, 17, 17, 18

The median is

$$Md = \frac{11 + 16}{2} = 13.5$$
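The odd/even rule for the median can be sketched in Python as follows. The function name `median_of` is hypothetical and chosen only for illustration; Python's built-in `statistics.median` applies the same rule.

```python
def median_of(values):
    """Median by position: the middle value if n is odd,
    the mean of the two middle values if n is even."""
    array = sorted(values)  # the ordered array
    n = len(array)
    mid = n // 2
    if n % 2 == 1:
        return array[mid]  # the ((n + 1) / 2)th observation
    # mean of the (n/2)th and (n/2 + 1)th observations
    return (array[mid - 1] + array[mid]) / 2

receipts = [1.2, 7.2, 12.5, 6.5, 50.6, 4.5, 10.4]  # example a (n = 7, odd)
years = [8, 10, 17, 18, 11, 16, 17, 10]            # example b (n = 8, even)

print(median_of(receipts))  # 7.2
print(median_of(years))     # 13.5
```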

Characteristics of the Median

• It is a positional measure.
• It is not influenced by extreme values.
• It can be applied to data that are measured on at least the ordinal level.
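To see why the median is described as not influenced by extreme values while the mean is, one can recompute both for the company receipts data with and without the largest value (50.6). This is an illustrative sketch only; dropping the value is an assumption made for the comparison, not a step in the module.

```python
from statistics import mean, median

# Total receipts of the 7 companies (in million pesos), from example a.
receipts = [1.2, 7.2, 12.5, 6.5, 50.6, 4.5, 10.4]
# The same data with the extreme value 50.6 left out, for comparison.
without_extreme = [x for x in receipts if x != 50.6]

mean_all, median_all = mean(receipts), median(receipts)
mean_trim, median_trim = mean(without_extreme), median(without_extreme)

print(round(mean_all, 2), median_all)
print(round(mean_trim, 2), round(median_trim, 2))
```

Leaving out the single extreme value roughly halves the mean (from about 13.27 to about 7.05), while the median moves only from 7.2 to 6.85.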

The Mode

• The value in the data set that occurs with the highest frequency

Example:

A psychologist has developed a new technique intended to improve rote memory. To test the method against other standard methods, 30 high school students representing three sections are selected at random, and each is taught the new technique. The students are then asked to memorize a list of 100 word phrases using the technique. The following are the numbers of word phrases memorized correctly by the students from each section:

Section 1: 83 64 98 66 83 87 83 93 86 80 93 83 75
Section 2: 87 76 96 77 94 92 88 85 66 89
Section 3: 68 84 79 79 84 75 80

Determine the mode for each set in the context of this problem.

Section   Mode
1         83
2         does not exist
3         79 and 84
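The modes in the table can be reproduced with a short Python sketch. The helper name `modes` is hypothetical, and it assumes the convention used in the table: when every value occurs only once, the mode does not exist (returned here as an empty list).

```python
from collections import Counter

def modes(values):
    """Return all values that occur with the highest frequency.
    Assumed convention: if every value occurs once, there is no mode."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # no value repeats, so the mode does not exist
    return sorted(v for v, c in counts.items() if c == highest)

section_1 = [83, 64, 98, 66, 83, 87, 83, 93, 86, 80, 93, 83, 75]
section_2 = [87, 76, 96, 77, 94, 92, 88, 85, 66, 89]
section_3 = [68, 84, 79, 79, 84, 75, 80]

print(modes(section_1))  # [83]
print(modes(section_2))  # []
print(modes(section_3))  # [79, 84]
```

Section 3 returns two values, so it is bimodal.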

Characteristics of the Mode

• It is the easiest to interpret among the measures of central tendency.
• It is not affected by extreme values.
• It does not always exist; if it does, it may not be unique. A data set with two modes is called bimodal, one with three modes is called trimodal, and so on.
• One advantage of the mode is that it can be applied to observations that are measured on the nominal level.

MEASURES OF LOCATION

Numbers below which a specified amount or percentage of the data must lie; they are often used to find the position of a specific piece of data in relation to the entire set of data.

Percentiles

• Values that divide an ordered set of data into 100 equal parts
• The ith percentile (i = 1, 2, …, 99), denoted by Pi, is a value below which i% of the data must lie

To determine Pi for a data set of n observations:

i. Arrange the data from lowest to highest.
ii. If ni/100 is a whole number, Pi is the mean of the (ni/100)th and (ni/100 + 1)th ordered values.
iii. If ni/100 is not a whole number, Pi is the kth ordered value, where k is the closest whole number greater than ni/100.
Deciles

• Values that divide an ordered set of data into 10 equal parts
• The ith decile (i = 1, 2, …, 9), denoted by Di, is a value below which 10i% of the data must lie

Quartiles

• Values that divide an ordered set of data into 4 equal parts
• The ith quartile (i = 1, 2, 3), denoted by Qi, is a value below which 25i% of the data must lie

Example:

The data from 50 measurements of the traffic noise level at an intersection are already ordered from smallest to largest in the table given below (read down each column, from left to right). Locate the quartiles.

Measurements of Traffic Noise Level (in decibels)

52.0 55.9 56.7 59.4 60.2 61.0 62.1 63.8 65.7 67.9
54.4 55.9 56.8 59.4 60.3 61.4 62.6 64.0 66.2 68.2
54.5 56.2 57.2 59.5 60.5 61.7 62.7 64.6 66.8 68.9
55.7 56.4 57.6 59.8 60.6 61.8 63.1 64.8 67.0 69.4
55.8 56.4 58.9 60.0 60.8 62.0 63.6 64.9 67.1 77.1
Source: Johnson, Richard A., et al. Statistics: Principles and Methods, 3rd ed., p. 45

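The quartiles are as follows (N = 50):

Q1 = P25: 25N/100 = 12.5 is not a whole number, so Q1 is the 13th observation in the array
   = 57.2

Q2 = D5 = P50: 50N/100 = 25 is a whole number, so Q2 is the mean of the 25th and 26th observations
   = (60.8 + 61.0)/2 = 60.9

Q3 = P75: 75N/100 = 37.5 is not a whole number, so Q3 is the 38th observation in the array
   = 64.6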
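The rule used in this example can be sketched in Python: when ni/100 is a whole number, take the mean of that ordered value and the next one; otherwise take the next whole position. The function names `percentile`, `decile`, and `quartile` are illustrative only. Statistical software often uses other interpolation rules, so results from other tools may differ slightly.

```python
def percentile(values, i):
    """i-th percentile (i = 1..99) using the whole-number rule of this lesson."""
    array = sorted(values)  # arrange from lowest to highest
    n = len(array)
    if (n * i) % 100 == 0:  # n*i/100 is a whole number
        k = n * i // 100
        return (array[k - 1] + array[k]) / 2  # mean of kth and (k+1)th values
    k = n * i // 100 + 1  # closest whole number greater than n*i/100
    return array[k - 1]

def decile(values, i):
    return percentile(values, 10 * i)  # D_i = P_(10i)

def quartile(values, i):
    return percentile(values, 25 * i)  # Q_i = P_(25i)

noise = [
    52.0, 54.4, 54.5, 55.7, 55.8, 55.9, 55.9, 56.2, 56.4, 56.4,
    56.7, 56.8, 57.2, 57.6, 58.9, 59.4, 59.4, 59.5, 59.8, 60.0,
    60.2, 60.3, 60.5, 60.6, 60.8, 61.0, 61.4, 61.7, 61.8, 62.0,
    62.1, 62.6, 62.7, 63.1, 63.6, 63.8, 64.0, 64.6, 64.8, 64.9,
    65.7, 66.2, 66.8, 67.0, 67.1, 67.9, 68.2, 68.9, 69.4, 77.1,
]

q1, q2, q3 = quartile(noise, 1), quartile(noise, 2), quartile(noise, 3)
print(q1, round(q2, 1), q3)  # 57.2 60.9 64.6
```

Because Q2, D5, and P50 all reduce to `percentile(noise, 50)`, they return the same value.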

Suggested Readings:

• https://uomustansiriyah.edu.iq/media/lectures/5/5_2018_12_10!09_06_45_PM.pdf
• Measures of Position for Ungrouped Data (makes use of linear interpolation): https://www.slideshare.net/chuckrymaunes5/measures-of-position-for-ungrouped-data-quartiles-deciles-percentiles-130064276


Assignment:

Given below are the scores of 16 students in a mathematics examination. (Use ungrouped data.)

58 30 75 77 94 97 80 35
74 58 70 99 60 63 71 16

Find:
1. Mean
2. Median
3. Mode
4. Q1
5. D5
6. P75

References/Sources:

Altares, Priscilla, et al. (2014). Elementary Statistics with Computer Applications (2nd ed.). Manila: Rex Book Store.

Agbayani, Victor A. E. (2001). Applied Statistics for Business and Research. Quezon City: AFA Publications.

Batacan, M. C. A., et al. (2007). Statistics for Filipino Students (2nd ed.). Manila, Philippines: National Book Store.

Calmorin, Laurentina Paler, and Melchor A. Calmorin. (2007). Statistics in Education and the Sciences. Manila: Rex Book Store.

Learning Plan
Lesson No: 2

Lesson Title: Measures of Variability (Ungrouped Data)

Let’s Hit These:

At the end of the lesson, the students should be able to:

1. Explain the importance of measuring variability;
2. Understand the concept of the variability of a data set;
3. Compute the measures of variability of a data set (range, quartile deviation, mean absolute deviation, variance, and standard deviation) using ungrouped and grouped data; and
4. Calculate and interpret the coefficient of variability for two or more data sets.


Let’s Get Started:

Measures of Dispersion / Measures of Variability

MEASURES OF DISPERSION

Numerical descriptive measures that indicate the extent to which individual observations in a set of data are scattered about an average.

Meaning of Variability:

Variability means “scatter” or “spread”. Thus, measures of variability refer to the scatter or spread of scores around their central tendency. The measures of variability indicate how the scores in a distribution scatter above and below the central tendency.

Why are measures of variability important?

Variability serves both as a descriptive measure and as an important component of most inferential statistics. ... In the context of inferential statistics, variability provides a measure of how accurately any individual score or sample represents the entire population.

Need for Measures of Variability:

1. It helps to ascertain the measures of deviation:
   The measures of variability help us to measure the degree of deviation that exists in the data. From that, we can determine the limits within which the data will vary in some measurable variety or quality.

2. It helps to compare different groups:
   With the help of measures of variability, we can compare original data expressed in different units.

3. It is useful to supplement the information provided by the measures of central tendency.

4. It is useful for calculating further advanced statistics based on the measures of dispersion.

Let’s Find Out:



The terms variability, spread, and dispersion are synonyms and refer to how spread out a distribution is. Just as in the lesson on central tendency we discussed measures of the center of a distribution of scores, in this lesson we will discuss measures of the variability of a distribution. There are four frequently used measures of variability: the range, the quartile deviation, the average deviation (or mean absolute deviation), and the standard deviation. In the next few paragraphs, we will look at each of these four measures of variability in more detail.

The four measures of variability are:

A. The Range
B. The Quartile Deviation
C. The Average Deviation
D. The Standard Deviation

A. Range

The range is the simplest measure of variability to calculate, and one you have probably encountered many times in your life. The range is simply the highest score minus the lowest score.

Let’s take a few examples.

1.) What is the range of the following group of numbers: 10, 2, 5, 6, 7, 3, 4?

Well, the highest number is 10, and the lowest number is 2.

So, Range = 10 – 2 = 8

2.) What is the range of the dataset with 10 numbers: 99, 45, 23, 67, 45, 91,
82, 78, 62, 51?

Range = 99 – 23 = 76
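Both range examples can be checked in Python. The helper name `value_range` is hypothetical and used only for this sketch.

```python
def value_range(values):
    """Range: the highest score minus the lowest score."""
    return max(values) - min(values)

first = [10, 2, 5, 6, 7, 3, 4]
second = [99, 45, 23, 67, 45, 91, 82, 78, 62, 51]

print(value_range(first))   # 8
print(value_range(second))  # 76
```

Since the range uses only the two most extreme scores, a single unusual value can change it a great deal.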

B. The Quartile Deviation (Q):

It is based upon the interval containing the middle fifty percent of cases in a given distribution. One quarter means 1/4th of something, when a scale is divided into four equal parts. “The quartile deviation or Q is one-half the scale distance between the 75th and 25th percentiles in a frequency distribution.”

Symbolically:

$$Q = \frac{Q_3 - Q_1}{2}$$


Example #1
Consider a data set of the following numbers: 22, 12, 14, 7, 18, 16, 11, 15, 12. You are required to calculate the quartile deviation.

Solution:

First, we arrange the data in ascending order to find Q1 and Q3, keeping any duplicate values:

7, 11, 12, 12, 14, 15, 16, 18, 22

The position of Q1 is found as follows:

Position of Q1 = ¼ (n + 1)
               = ¼ (9 + 1)
               = ¼ (10)
               = 2.5th term

The position of Q3 is found as follows:

Position of Q3 = ¾ (n + 1)
               = ¾ (9 + 1)
               = ¾ (10)
               = 7.5th term

The quartile deviation is then calculated as follows:

• Q1 is the 2nd term, which is 11, plus the product of 0.5 and the difference between the 3rd and 2nd terms: 11 + (12 – 11) × 0.5 = 11.50.
• Q3 is the 7th term, which is 16, plus the product of 0.5 and the difference between the 8th and 7th terms: 16 + (18 – 16) × 0.5 = 16 + 1 = 17.

Q.D. = (Q3 – Q1) / 2 = (17 – 11.50) / 2 = 5.5 / 2 = 2.75

Example #2

Solution: We first need to sort the data given to us before proceeding with the quartile calculation.

Sorted data: 5, 10, 15, 17, 18, 19, 20, 21, 25, 28
n (number of data points) = 10

Now, to find the quartiles, we use the logic that the first quartile lies halfway between the lowest value and the median, and the third quartile lies halfway between the median and the largest value.

First Quartile Q1 = ((n + 1)/4)th term
= ((10 + 1)/4)th term = 2.75th term
= 2nd term + 0.75 × (3rd term – 2nd term)
= 10 + 0.75 × (15 – 10)
= 10 + 3.75
= 13.75

Third Quartile Q3 = (3(n + 1)/4)th term
= (3(10 + 1)/4)th term = 8.25th term
= 8th term + 0.25 × (9th term – 8th term)
= 21 + 0.25 × (25 – 21)
= 21 + 1
= 22

Using the values for Q1 and Q3, we can now calculate the quartile deviation as follows:

Quartile Deviation = Semi-Interquartile Range
= (Q3 – Q1) / 2
= (22 – 13.75) / 2
= 8.25 / 2
= 4.125
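The two quartile deviation examples follow the same recipe: find the positions (n + 1)/4 and 3(n + 1)/4, interpolate between neighboring ordered values when the position is fractional, and halve the difference. The sketch below assumes at least three data points, and the names `term_at` and `quartile_deviation` are hypothetical.

```python
def term_at(array, position):
    """Value at a 1-based, possibly fractional position in a sorted list,
    using linear interpolation between neighboring terms."""
    lower = int(position)        # whole part of the position
    fraction = position - lower  # fractional part
    if fraction == 0:
        return array[lower - 1]
    return array[lower - 1] + fraction * (array[lower] - array[lower - 1])

def quartile_deviation(values):
    """Return (Q1, Q3, Q.D.) using positions (n + 1)/4 and 3(n + 1)/4.
    Assumes len(values) >= 3 so both positions fall inside the data."""
    array = sorted(values)  # duplicates are kept
    n = len(array)
    q1 = term_at(array, (n + 1) / 4)
    q3 = term_at(array, 3 * (n + 1) / 4)
    return q1, q3, (q3 - q1) / 2

example_1 = [22, 12, 14, 7, 18, 16, 11, 15, 12]
example_2 = [5, 10, 15, 17, 18, 19, 20, 21, 25, 28]

print(quartile_deviation(example_1))  # (11.5, 17.0, 2.75)
print(quartile_deviation(example_2))  # (13.75, 22.0, 4.125)
```

This interpolation rule differs from the whole-number rule for percentiles in Lesson 1, so the two methods can give slightly different quartiles for the same data.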

C. Absolute Deviation and Mean Absolute Deviation


Formula:

$$\text{MAD} = \frac{\sum |x_i - \bar{x}|}{n}$$

• Σ – just a fancy symbol that means “sum”
• xi – the ith data value or individual score
• x̄ – the mean value or mean score
• n – sample size
• | | – take the absolute value (i.e., ignore the minus sign)

Example: Find the mean absolute deviation of 3, 6, 6, 7, 8, 11, 15, 16.

Step 1: Find the mean:

Mean = (3 + 6 + 6 + 7 + 8 + 11 + 15 + 16) / 8 = 72 / 8 = 9

Step 2: Find the distance of each value from that mean:

xi    xi – x̄         |xi – x̄| (distance from 9)
3     3 – 9 = -6      6
6     6 – 9 = -3      3
6     6 – 9 = -3      3
7     7 – 9 = -2      2
8     8 – 9 = -1      1
11    11 – 9 = 2      2
15    15 – 9 = 6      6
16    16 – 9 = 7      7

Σ |xi – x̄| = 30

Step 3: Find the mean of those distances:

Mean Absolute Deviation = (6 + 3 + 3 + 2 + 1 + 2 + 6 + 7) / 8 = 30 / 8 = 3.75

Or, MAD = (Σ |xi – x̄|) / n = 30 / 8 = 3.75

So, the mean = 9, and the mean absolute deviation = 3.75.
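The three steps of the mean absolute deviation example translate directly into Python. This is a minimal sketch with illustrative variable names.

```python
scores = [3, 6, 6, 7, 8, 11, 15, 16]

# Step 1: find the mean.
mean_score = sum(scores) / len(scores)

# Step 2: find the distance of each value from the mean (absolute deviation).
distances = [abs(x - mean_score) for x in scores]

# Step 3: find the mean of those distances.
mad = sum(distances) / len(scores)

print(mean_score, sum(distances), mad)  # 9.0 30.0 3.75
```

Taking absolute values matters: without them, the deviations from the mean would always sum to zero.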


D. Variance

Variability can also be defined in terms of how close the scores in the distribution are to the middle of the distribution. Using the mean as the measure of the middle of the distribution, the variance is defined as the average squared difference of the scores from the mean.

The formula for the population variance is:

$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$

where σ² is the variance, μ is the mean, and N is the number of scores.

If the variance in a sample is used to estimate the variance in a population, then the previous formula underestimates the variance, and the following formula should be used:

$$s^2 = \frac{\sum (X - M)^2}{N - 1}$$

where s² is the estimate of the variance and M (or x̄) is the sample mean. Note that M is the mean of a sample taken from a population with a mean of μ. Since, in practice, the variance is usually computed in a sample, this formula is most often used. The simulation "estimating variance" illustrates the bias in the formula with N in the denominator.

Let's take a concrete example: 3, 6, 6, 7, 8, 11, 15, 16

For this example, the mean is 9:

xi    xi – x̄         |xi – x̄| (distance from 9)    (xi – x̄)²
3     3 – 9 = -6      6                              36
6     6 – 9 = -3      3                              9
6     6 – 9 = -3      3                              9
7     7 – 9 = -2      2                              4
8     8 – 9 = -1      1                              1
11    11 – 9 = 2      2                              4
15    15 – 9 = 6      6                              36
16    16 – 9 = 7      7                              49

Σ (xi – x̄)² = 148

Population Variance (assuming the given data are a population)

σ² = 148/8 = 18.5

Sample Variance (assuming the given data are a sample)

s² = 148/7 = 21.14
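The same sum of squared deviations gives both variances; only the divisor changes. The sketch below computes them by hand; the standard library functions `statistics.pvariance` and `statistics.variance` give the same results.

```python
scores = [3, 6, 6, 7, 8, 11, 15, 16]
n = len(scores)

mean_score = sum(scores) / n
# Sum of squared differences of the scores from the mean.
sum_of_squares = sum((x - mean_score) ** 2 for x in scores)

population_variance = sum_of_squares / n    # divide by N
sample_variance = sum_of_squares / (n - 1)  # divide by n - 1

print(sum_of_squares, population_variance, round(sample_variance, 2))
# 148.0 18.5 21.14
```

Dividing by n – 1 always gives the larger value, which compensates for the tendency of a sample to underestimate the spread of the population.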

Standard Deviation

The standard deviation is simply the square root of the variance. The symbol
for the population standard deviation is σ; the symbol for an estimate
computed in a sample is s.

What are the formulas for the standard deviation?

The sample standard deviation formula is:

$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$

where,

s = sample standard deviation
Σ = sum of...
x̄ = sample mean
n = number of scores in the sample.

The population standard deviation formula is:

$$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$$

where,

σ = population standard deviation
Σ = sum of...
μ = population mean
N = number of scores in the population.

Using the same example above:

The sample standard deviation is:

$$s = \sqrt{\frac{148}{7}} = \sqrt{21.14} \approx 4.60$$

The population standard deviation is:

$$\sigma = \sqrt{\frac{148}{8}} = \sqrt{18.5} \approx 4.30$$
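Python's standard `statistics` module provides both standard deviations: `pstdev` divides the sum of squares by N (population), and `stdev` divides by n – 1 (sample). A short check on the same eight scores, with illustrative variable names:

```python
from math import sqrt
from statistics import pstdev, stdev

scores = [3, 6, 6, 7, 8, 11, 15, 16]

population_sd = pstdev(scores)  # sqrt(148 / 8)
sample_sd = stdev(scores)       # sqrt(148 / 7)

print(round(population_sd, 2), round(sample_sd, 2))  # 4.3 4.6

# Each standard deviation is the square root of the matching variance.
assert abs(population_sd - sqrt(148 / 8)) < 1e-9
assert abs(sample_sd - sqrt(148 / 7)) < 1e-9
```

The sample value (about 4.60) is the larger of the two because its divisor, n – 1 = 7, is smaller.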



Figure 1. Normal distributions with standard deviations of 5 and 10.

Let’s Read:

Assignment

Given below are the scores of 16 students in a mathematics examination. (Use ungrouped data.)

58 30 75 77 94 97 80 35
74 58 70 99 60 63 71 16

Find:
1. Range
2. Quartile Deviation
3. Mean Absolute Deviation
4. Variance (sample and population variance)
5. Standard Deviation (sample and population SD)

