
ENDATA130 Data Summarization-Computation of Measures of Variation

This document discusses measures of central tendency and variation. It provides examples to illustrate key concepts like mean, range, mean absolute deviation, variance, standard deviation, and coefficient of variation. Formulas and step-by-step solutions are given for computing these measures from raw data and grouped frequency distributions. Examples use data on calls received by companies and cans of paint sold. The document also discusses Chebychev's rule and measures of rank/position like quartiles and quintiles.

Uploaded by

fabyunaaa

ENDATA130 March 12, 2021

Handouts for Data Description or Data Summarization or Data Analysis

Relating Measures of Central Tendency and Measures of Variation:

Illustration: Two data sets with the same center (mean) but different variation away from the center.

Sample 1: No. of calls received per day for 5 days by two companies

                 Data Set A           Data Set B
Values           1, 3, 6, 15, 45      10, 10, 12, 18, 20
Mean (center)    x̅ = 14.00            x̅ = 14.00

Plotting the data points on a number line:

Sample 2: Cans of paint sold per month for Brand A and Brand B
From Bluman

Measures of Variation or Measures of Dispersion


- Describe how the data points are scattered, spread, or dispersed away from the center (mean).
- Indicate the degree of scattering of the data points.
- There are several ways to represent the measures of variation or dispersion of the data, namely:
1.) Range
2.) Mean absolute deviation (MAD) or the mean deviation
3.) Variance
4.) Standard deviation – commonly used
5.) Coefficient of variation, Cvar

CONDITION:
The value of each measure of variation increases with the extent or degree of scattering.
For example, when the range is greater, the data points are scattered more widely; when the range is smaller, the data points are clustered more closely.

Determination of the Range (R):


Range = the difference between the highest and lowest values in the data file.
- the simplest measure of dispersion.
Case 1: For Raw Data
R = H – L, where H is the highest value and L is the lowest value

Illustration: Computation of Range for Raw Data


Sample 1:
Given:
                 Data Set A           Data Set B
Values           1, 3, 6, 15, 45      10, 10, 12, 18, 20
Mean (center)    x̅ = 14.00            x̅ = 14.00

Reqd: Range
Soln:
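The range for Sample 1 can be checked with a short Python sketch. The function names `mean` and `value_range` are illustrative choices, not part of the handout; the data values are those given above.

```python
# Data from Sample 1 (calls received per day for 5 days).
data_a = [1, 3, 6, 15, 45]
data_b = [10, 10, 12, 18, 20]

def mean(values):
    """Arithmetic mean of a list of raw values."""
    return sum(values) / len(values)

def value_range(values):
    """Range for raw data: R = H - L (highest minus lowest)."""
    return max(values) - min(values)

print(mean(data_a), value_range(data_a))  # 14.0 44
print(mean(data_b), value_range(data_b))  # 14.0 10
```

Both data sets share the same mean, but the range of Data Set A (44) is much larger than that of Data Set B (10), which reflects its wider scattering.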

Case 2: For Grouped Data (Frequency Distribution)

R = midpoint of the highest (last) class – midpoint of the lowest (first) class

Illustration: Range for Grouped Data (Frequency Distribution)


Example:
From Bluman
Given:
Reqd: Range of Shipment days
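A minimal Python sketch of the grouped-data range follows. It assumes the "Given" frequency distribution is the shipment-days table used in the later grouped-data examples of this handout (classes 1–3 through 16–18); the names `class_limits`, `midpoint`, and `grouped_range` are illustrative.

```python
# Assumed class limits (days of shipment), taken from the table
# that appears in the later grouped-data examples of this handout.
class_limits = [(1, 3), (4, 6), (7, 9), (10, 12), (13, 15), (16, 18)]

def midpoint(lower, upper):
    """Class midpoint M = (lower limit + upper limit) / 2."""
    return (lower + upper) / 2

def grouped_range(limits):
    """R = midpoint of the last class - midpoint of the first class."""
    first_mid = midpoint(*limits[0])
    last_mid = midpoint(*limits[-1])
    return last_mid - first_mid

print(grouped_range(class_limits))  # 15.0
```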

Determination of the Mean Absolute Deviation or Mean Deviation (MAD):

Mean absolute deviation or mean deviation = the sum of the absolute values of the deviations of the values in the data file from the mean, divided by the number of values in the data file.
- the average of the absolute deviations of the scores around the mean.

Let: x = a specific value or score in the data file/set (a specific value of the variable)
M = class midpoint (for grouped data)
x̅ = mean
d = deviation

and: Solving for deviation:


d = x - x̅ (for Raw Data)

d = M - x̅ (for grouped Data = Frequency Distribution)

Computation for MAD


Case 1: Using Ungrouped Data (Raw Data)
Steps: a.) Get the deviation (d) of each value from the mean and the corresponding absolute deviation |d|.

|d| = |x − x̅|
b.) Compute the mean absolute deviation as

MAD = Σ|d| / n = Σ|x − x̅| / n

where n is the number of values (n = Σf when the values are tallied by frequency).
Illustration: Computation of MAD for Raw Data
Sample 1:
                 Data Set A           Data Set B
Values           1, 3, 6, 15, 45      10, 10, 12, 18, 20
Mean             x̅ = 14.00            x̅ = 14.00

Reqd: MAD
Soln: For Data Set A

NOTE: The sum of the deviations (d) of all the data points/scores is ZERO.
For Data Set B
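The MAD for both data sets can be verified with a short Python sketch that follows steps (a) and (b) for raw data. The function name `mean_absolute_deviation` is an illustrative choice; the sketch also checks the note that the deviations sum to zero.

```python
data_a = [1, 3, 6, 15, 45]
data_b = [10, 10, 12, 18, 20]

def mean_absolute_deviation(values):
    """MAD = sum(|x - mean|) / n for raw data."""
    n = len(values)
    x_bar = sum(values) / n
    deviations = [x - x_bar for x in values]      # d = x - mean
    # The signed deviations always sum to zero (up to rounding).
    assert abs(sum(deviations)) < 1e-9
    return sum(abs(d) for d in deviations) / n

print(mean_absolute_deviation(data_a))  # 12.8
print(mean_absolute_deviation(data_b))  # 4.0
```

For Data Set A the absolute deviations are 13, 11, 8, 1, and 31 (sum 64); for Data Set B they are 4, 4, 2, 4, and 6 (sum 20).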

Case 2: Using Grouped Data (Frequency Distribution)


Steps: a.) Find the mean (µ or x̅) using the Midpoint Method.
b.) Get the deviation of each midpoint from the mean (d = M − µ or d = M − x̅).
c.) Multiply each deviation by the corresponding frequency (fd).
d.) Get the absolute value |fd|.
e.) Compute the MAD as follows:

MAD = Σ|fd| / N or n, or: MAD = Σ|fd| / Σf

For Tabular Computations: (Columns needed)

Classes    f    M    fM (used to solve for the mean)    d = M − x̅    fd    |fd|

Illustration: Computation of MAD for Grouped Data (Frequency Distribution)


Example:
From Bluman
Given:
Reqd: MAD of Shipment days

Soln: Using Tabular Computations

Classes               f     M     fM     d = M − x̅     fd     |fd|
(Days of Shipment)
1 – 3                 6
4 – 6                 8
7 – 9                10
10 – 12               7
13 – 15               0
16 – 18               5
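The remaining table columns can be filled in with a short Python sketch of steps (a) to (e). This is a hedged illustration rather than the handout's worked solution; the names `classes` and `grouped_mad` are assumptions introduced here.

```python
# (lower limit, upper limit, frequency) for each shipment-days class.
classes = [(1, 3, 6), (4, 6, 8), (7, 9, 10), (10, 12, 7), (13, 15, 0), (16, 18, 5)]

def grouped_mad(classes):
    """Return (mean, MAD) for a frequency distribution using class midpoints."""
    midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]   # M
    freqs = [f for _, _, f in classes]                     # f
    n = sum(freqs)                                         # n = sum of f
    mean = sum(f * m for f, m in zip(freqs, midpoints)) / n
    # |fd| = |f * (M - mean)| summed over all classes.
    total_abs_fd = sum(abs(f * (m - mean)) for f, m in zip(freqs, midpoints))
    return mean, total_abs_fd / n

mean, mad = grouped_mad(classes)
print(round(mean, 4), round(mad, 4))  # 8.1667 3.5556
```

Here Σf = 36 and ΣfM = 294, so x̅ = 294/36; Σ|fd| works out to 128, giving MAD = 128/36.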

Determination of the Variance σ² or s²:

Variability can also be defined in terms of how close the scores in the distribution are to the middle
or center (mean) of the distribution. Using the mean as the measure of the middle or center of the
distribution, the variance is defined as the average squared difference of the scores from the mean (refer
to the “squared deviation”, d²). The variance is the measure of dispersion that eliminates negative signs by
squaring all deviations.

Standard Symbols for the Variance:


σ² = variance for the population
s² = variance for the sample

Determination of the Variance:


Case 1: Using Ungrouped (Raw Data)
Steps: a.) Find the mean (µ or x̅).
b.) Determine the deviation of each score from the mean, d = x − µ (population) or d = x − x̅ (sample).
c.) Calculate the variance for the population or the sample as follows:

σ² = Σd² / N = Σ(x − µ)² / N

s² = Σd² / (n − 1) = Σ(x − x̅)² / (n − 1)
Illustration: Computation of Variance for Raw Data
Sample 1:
                 Data Set A           Data Set B
Values           1, 3, 6, 15, 45      10, 10, 12, 18, 20
Mean             x̅ = 14.00            x̅ = 14.00
Reqd: Variance = s²
Soln:
For Data Set A

For Data Set B
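The sample variance for both data sets can be checked with a short Python sketch of steps (a) to (c). The function name `sample_variance` is illustrative; it uses the n − 1 divisor because the required quantity is s².

```python
data_a = [1, 3, 6, 15, 45]
data_b = [10, 10, 12, 18, 20]

def sample_variance(values):
    """s^2 = sum((x - mean)^2) / (n - 1) for raw data."""
    n = len(values)
    x_bar = sum(values) / n
    squared_deviations = [(x - x_bar) ** 2 for x in values]  # d^2
    return sum(squared_deviations) / (n - 1)

print(sample_variance(data_a))  # 329.0
print(sample_variance(data_b))  # 22.0
```

For Data Set A, Σd² = 169 + 121 + 64 + 1 + 961 = 1316, so s² = 1316/4. For Data Set B, Σd² = 16 + 16 + 4 + 16 + 36 = 88, so s² = 88/4.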

Case 2: For Grouped Data (Frequency Distribution)


Steps: 1.) Solve for the mean of the data file (x̅).
2.) Get the midpoint of each class, M.
3.) Compute the deviation, d = M − x̅.
4.) Solve for the square of the deviation, d² = (M − x̅)².
5.) Multiply the frequency of each class by the squared deviation, fd².
6.) Solve for the variance as follows:
σ² = Σfd² / N

s² = Σfd² / (n − 1)
Illustration: Variance for Grouped Data (Frequency Distribution)
Example:
From Bluman
Given:

Reqd: Variance of Shipment days


Soln:

Classes               f     M     d = M − x̅     d²     fd²
(Days of Shipment)
1 – 3                 6     2
4 – 6                 8     5
7 – 9                10     8
10 – 12               7    11
13 – 15               0    14
16 – 18               5    17
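The blank columns of the table can be computed with a short Python sketch of steps 1 to 6. This is a hedged illustration, not the handout's own solution; the names `midpoints`, `freqs`, and `grouped_variance` are assumptions introduced here.

```python
# Midpoints (M) and frequencies (f) from the shipment-days table.
midpoints = [2, 5, 8, 11, 14, 17]
freqs = [6, 8, 10, 7, 0, 5]

def grouped_variance(midpoints, freqs, sample=True):
    """Variance from a frequency distribution: sum(f * d^2) / (n - 1) or / N."""
    n = sum(freqs)
    mean = sum(f * m for f, m in zip(freqs, midpoints)) / n
    sum_fd2 = sum(f * (m - mean) ** 2 for f, m in zip(freqs, midpoints))
    divisor = n - 1 if sample else n
    return sum_fd2 / divisor

s2 = grouped_variance(midpoints, freqs, sample=True)
sigma2 = grouped_variance(midpoints, freqs, sample=False)
print(round(s2, 4), round(sigma2, 4))  # 21.5714 20.9722
```

With n = 36 and x̅ = 294/36, the sum Σfd² equals 755, so s² = 755/35 and σ² = 755/36.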
Determination of the Standard Deviation:

Standard deviation (σ or s) = square root of the variance

σ = population standard deviation = √(σ²)

s = sample standard deviation = √(s²)
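As a quick illustration, the sample standard deviation of the two Sample 1 data sets can be computed in Python by taking the square root of the sample variance. The function name `sample_std` is an illustrative choice.

```python
import math

data_a = [1, 3, 6, 15, 45]
data_b = [10, 10, 12, 18, 20]

def sample_std(values):
    """s = sqrt(s^2), where s^2 uses the n - 1 divisor."""
    n = len(values)
    x_bar = sum(values) / n
    variance = sum((x - x_bar) ** 2 for x in values) / (n - 1)
    return math.sqrt(variance)

print(round(sample_std(data_a), 2))  # 18.14
print(round(sample_std(data_b), 2))  # 4.69
```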

Determination of the Coefficient of Variation, Cvar:


The coefficient of variation, denoted by Cvar, is the standard deviation divided by the mean, with the result
expressed as a percentage. It is a statistic that allows you to compare standard deviations when the units are
different.
Cvar = (standard deviation / mean) × 100 = (σ / μ) × 100 = (s / x̅) × 100
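A minimal Python sketch of the coefficient of variation for the Sample 1 data sets, using the sample form (s / x̅) × 100. It relies on the standard library's `statistics.stdev` (sample standard deviation) and `statistics.mean`; the function name `coefficient_of_variation` is illustrative.

```python
import statistics

data_a = [1, 3, 6, 15, 45]
data_b = [10, 10, 12, 18, 20]

def coefficient_of_variation(values):
    """Cvar = (s / mean) * 100, expressed as a percentage."""
    return statistics.stdev(values) / statistics.mean(values) * 100

print(f"{coefficient_of_variation(data_a):.2f}%")  # 129.56%
print(f"{coefficient_of_variation(data_b):.2f}%")  # 33.50%
```

Although both data sets have the same mean, Data Set A varies far more relative to that mean than Data Set B does.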

Chebyshev’s Theorem and the Empirical Rule:


The Empirical (Normal) Rule (for Normal or Bell-Shaped Distributions)
Chebyshev’s theorem applies to any distribution regardless of its shape. However, when a distribution is bell-shaped
(or what is called normal), the following statements, which make up the empirical rule, are true.
1.) Approximately 68% of the data values will fall within 1 standard deviation of the mean.
2.) Approximately 95% of the data values will fall within 2 standard deviations of the mean.
3.) Approximately 99.7% of the data values will fall within 3 standard deviations of the mean.
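The two rules can be compared numerically with a short Python sketch. The handout does not state Chebyshev's theorem itself; the sketch assumes its standard form, that at least 1 − 1/k² of the values lie within k standard deviations of the mean for k > 1. The function names are illustrative, and the normal-distribution fractions are computed with `math.erf`.

```python
import math

def chebyshev_lower_bound(k):
    """Minimum fraction within k standard deviations, for any distribution (k > 1)."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2

def normal_fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, f"{normal_fraction_within(k):.4f}")   # 0.6827, 0.9545, 0.9973

for k in (2, 3):
    print(k, f"{chebyshev_lower_bound(k):.4f}")    # 0.7500, 0.8889
```

The Chebyshev bounds (at least 75% within 2 standard deviations, at least about 88.9% within 3) are weaker than the empirical-rule percentages because they must hold for every distribution, not only bell-shaped ones.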

Measures of Rank /Position or Quantile Points


- Indicate the relative position of a single or specific value/score/data point relative to the rest of the
values in the data set/file.
- Include
1.) Quartile Points(Q) – single values or scores that divide the data file or set into
four equal portions.
2.) Quintile Points (Qn)– single values or scores that divide the data file or set into
five equal portions.
3.) Decile Points (D) – single values or scores that divide the data file or set into ten
equal portions.
4.) Percentile Points (P) – single values or scores that divide the data file or set into
one hundred equal portions.
General Formula for Determination of Quantile Points for Grouped Data:

Quantile Point = LB + [((x/y)n − F) / f’] C

Where: LB = lower boundary of the chosen quantile class
x/y = fractional part represented by the quantile point (for example, 3/4 for Q3)
n = total number of values (Σf)
F = “less than” cumulative frequency of the class just before the quantile class, i.e., up to but not exceeding (x/y)n
f’ = frequency of the quantile class
C = class size (width)

NOTE: The quantile class is the class whose “less than” cumulative frequency
contains the value (x/y)n.

◼ Example 7:
Age of students enrolled in an adult basic mathematics subject:

Age Interval (yrs)    f
15–19                13
20–24                15
25–29                20
30–34                10
35–39                 8
40–44                 4

Determine:
a.) Q3
b.) Qn2
c.) the age of the youngest among the oldest 40% of the group.
d.) the age of the eldest among the youngest 25% of the group.
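A hedged Python sketch of the general quantile formula applied to Example 7 follows. It assumes class boundaries lie 0.5 below each lower limit (so 15–19 has LB = 14.5) and a class size of C = 5; the function name `grouped_quantile` and its parameters are illustrative. Items (c) and (d) are interpreted as the 60th percentile and the first quartile, respectively.

```python
lower_limits = [15, 20, 25, 30, 35, 40]   # lower class limits (years)
freqs = [13, 15, 20, 10, 8, 4]            # class frequencies, n = 70

def grouped_quantile(lower_limits, freqs, x, y, width=5, gap=1):
    """Quantile point = LB + ((x/y)n - F) / f' * C for a frequency distribution."""
    n = sum(freqs)
    position = x * n / y          # (x/y)n
    cumulative = 0                # F: cumulative frequency before the current class
    for lower, f in zip(lower_limits, freqs):
        if f > 0 and cumulative + f >= position:
            lb = lower - gap / 2  # assumed lower class boundary
            return lb + (position - cumulative) * width / f
        cumulative += f
    raise ValueError("quantile position exceeds total frequency")

print(grouped_quantile(lower_limits, freqs, 3, 4))    # a.) Q3  -> 31.75
print(grouped_quantile(lower_limits, freqs, 2, 5))    # b.) Qn2 -> 24.5
print(grouped_quantile(lower_limits, freqs, 60, 100)) # c.) P60 -> 28.0
print(grouped_quantile(lower_limits, freqs, 1, 4))    # d.) Q1  -> 21.0
```

For Q3, (3/4)(70) = 52.5 falls in the 30–34 class (cumulative frequency 58), so LB = 29.5, F = 48, f’ = 10, and Q3 = 29.5 + (4.5/10)(5) = 31.75.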

◼ Example 8:
Data for the record high temperatures for each of the 50 states

Determine:
a.) D3
b.) P55
c.) Lowest value of the 25% hottest temperatures
d.) Temperature range/interval of the middle 50% of the recorded temperatures

Exercises
1.) The nicotine contents, in milligrams, for 40 cigarettes of a certain brand were recorded as follows:
1.09 1.92 2.31 1.79 2.28 1.74 1.47 1.97 0.85 1.24
1.58 2.03 1.70 2.17 2.55 2.11 1.86 1.90 1.68 1.51
1.64 0.72 1.69 1.85 1.82 1.79 2.46 1.88 2.08 1.67
1.37 1.93 1.40 1.64 2.09 1.75 1.63 2.37 1.75 1.69
Find the
(a) mean
(b) median
(c) MAD
(d) standard deviation

2.) Thirty automobiles were tested for fuel efficiency (in miles per gallon), and this frequency distribution was
obtained. Calculate the mean and standard deviation of the fuel efficiency in miles per gallon.
Class boundaries Frequency
7.5–12.5 3
12.5–17.5 5
17.5–22.5 15
22.5–27.5 5
27.5–32.5 2
