Lesson 3 Methods of Summarizing Data

Uploaded by

aicelleg.redondo

Camarines Norte State College

College of Business and Public Administration

Lesson 3: Method of
Summarizing Data

CHRIS ADAM B. YBA


OBJECTIVES:

1. Summarize data using different measures of position and measures of variability.
2. Manually compute the different measures of position and measures of variability.
3. Determine the different measures of position and variability using available statistical software.
MEASURES OF CENTRAL TENDENCY

❑ In this unit, we will be learning the different statistical methods that can further help us describe and
summarize data sets.
❑ The most common of these methods is by finding the average. Let us consider the following:
▪ The average height of a Filipino man is 5 feet and 3 inches, while the average height of a Filipino
woman is just 5 feet.
▪ The average salary for a teacher is ₱20,695 per month in the Philippines.
▪ On the average, 24 million people receive animal bites.
▪ The average American is sick in bed seven days a year, missing five days of work.
❑ In the above examples, the word average is “ambiguous”.
❑ Loosely stated, the average means the “center of the distribution” or the “most typical case”.

❑ One can think of an average as one value that best represents an entire group of
scores. You can also think of the average as the “middle” space or a fulcrum on a
seesaw: it is the point where all the values in a set of values are balanced.
❑ Measures of average are also called “measures of central tendency” that include
mean, median, mode, and midrange.
❑ The succeeding sections will guide you through the procedures and processes on
how to compute these averages.

Mean
❑ The most common type of average; it is also referred to as the “arithmetic average.”
❑ It is the score located at the mathematical center of the distribution. Also, it is used to
summarize the interval and ratio variables when the distribution is symmetrical.
❑ Generally, it is the sum of all the values divided by the number of values in the given
data set.
For the sample mean:

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum x}{n}$$

where $n$ represents the total number of values in the sample.

For the population mean:

$$\mu = \frac{X_1 + X_2 + \cdots + X_N}{N} = \frac{\sum X}{N}$$

where $N$ represents the total number of values in the population.

Note: A statistic is a characteristic or measure obtained by using data values from a sample. A parameter is a characteristic or measure obtained by using all the data values from a specific population.
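The mean formula can be sketched in a few lines of Python. This is only an illustration: the function name `arithmetic_mean` and the `scores` list are hypothetical and are not the GWA data used in the lesson.

```python
from statistics import mean

def arithmetic_mean(values):
    """Return the sum of all values divided by the number of values."""
    return sum(values) / len(values)

# Hypothetical scores, for illustration only (not the lesson's GWA data).
scores = [2.0, 1.5, 2.5, 3.0, 1.0]

x_bar = arithmetic_mean(scores)   # manual computation
x_bar_library = mean(scores)      # same result from the standard library
print(x_bar, x_bar_library)
```

The same function serves for a sample mean or a population mean; only the interpretation of the data (sample or whole population) differs.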
Example 1. General Weighted Average (GWA)

Below are the general weighted averages (GWA) of 20 students in BSEd major in Mathematics.

Solution
Let us say that these 20 students represent a sample.

$$\bar{x} = \frac{\sum x}{n} = \frac{2.80 + 2.30 + 2.30 + \cdots + 1.75 + 1.40}{20} = \frac{43.45}{20} = 2.1725 \approx 2.173$$

Thus, the average GWA of the 20 students is 2.173.

General Rounding Rule:
In Statistics, the general rounding rule is that when computations are done in the calculation, rounding should not be done until the final answer is calculated. When rounding is done in intermediate steps, it tends to increase the difference between the answer and the exact one.

Median
❑ The halfway point in a data set; the “middle value” in the data set; the score at the 50th percentile.
❑ Other references call this the “midpoint” of the data array (when the data set is ordered, it is
called an array).
❑ Used to summarize ordinal or highly skewed interval or ratio variables.
❑ To find the median of ungrouped data, follow these steps:
1. Arrange the values/quantities (ascending or descending).
2. Number the values/quantities consecutively from 1 to n.
3. Case 1. If n is odd, the median is the $\left(\frac{n+1}{2}\right)^{\text{th}}$ quantity.
Case 2. If n is even, the median is the average of the $\left(\frac{n}{2}\right)^{\text{th}}$ and $\left(\frac{n}{2}+1\right)^{\text{th}}$ quantities.
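The three steps above can be sketched as a small Python function. The name `median_ungrouped` is hypothetical; the positions are 1-based in the steps, so the code subtracts 1 to get Python's 0-based indexes.

```python
def median_ungrouped(values):
    """Median of ungrouped data, following the odd/even cases."""
    ordered = sorted(values)   # Step 1: arrange the values in ascending order
    n = len(ordered)           # Step 2: positions run from 1 to n
    if n % 2 == 1:
        # Case 1 (n odd): the ((n + 1) / 2)-th value
        return ordered[(n + 1) // 2 - 1]
    # Case 2 (n even): average of the (n / 2)-th and (n / 2 + 1)-th values
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2

print(median_ungrouped([5, 1, 3]))      # odd number of values
print(median_ungrouped([4, 1, 3, 2]))   # even number of values
```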
Example 2. General Weighted Average (GWA)

Below are the general weighted averages (GWA) of 20 students in BSEd major in Mathematics.

Solution
Arrange these GWA values and number them from 1 to 20 (n = 20).

The median is the average of the 10th and 11th values, that is,

$$MD = \frac{2.10 + 2.30}{2} = \frac{4.40}{2} = 2.20$$

Thus, the middle score of the students’ GWA is 2.20.

Mode
❑ The score in the data set that “occurs most frequently”; the score value(s) with
the highest frequency.
❑ The mode can be used when the data are nominal or categorical, such as religious
preference, gender, political affiliation.
❑ The mode is not always unique.
❑ The set of data may be considered unimodal if it contains only one mode; bimodal if
it has two modes; trimodal if it contains three; or sometimes a data set has no mode.
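A minimal sketch of finding the mode(s) by counting frequencies is shown here. The function name `modes` is hypothetical, the input is assumed to be non-empty, and the code assumes the convention that a data set in which every value occurs exactly once has no mode.

```python
from collections import Counter

def modes(values):
    """Return a sorted list of the most frequent value(s); [] if there is no mode."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # every value occurs once, so no value is "most frequent"
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([1, 2, 2, 3]))        # unimodal
print(modes([1, 1, 2, 2, 3]))     # bimodal
print(modes([1, 2, 3]))           # no mode
print(modes(["A", "B", "A"]))     # works for nominal (categorical) data too
```

Because the function only counts occurrences, it also works for nominal data, where the mean and median cannot be computed.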
Example 2. General Weighted Average (GWA)

Below are the general weighted averages (GWA) of 20 students in BSEd major in
Mathematics.

Solution
By mere inspection, one can easily identify the mode of ungrouped data.

Since 2.30 appears 3 times, the modal score of the students’ GWA is 2.30.

Midrange
❑ It is a (very) “rough estimate” of the middle; it can be easily affected by an extremely
high or extremely low value.
❑ It is defined as the sum of the lowest and highest values in the data set, divided by
2. That is,

$$MR = \frac{\text{lowest value} + \text{highest value}}{2}$$
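The midrange formula translates directly into code. In this sketch the function name `midrange` and both lists are hypothetical; the second list adds one extreme value to show how strongly the midrange reacts to it.

```python
def midrange(values):
    """Return (lowest value + highest value) / 2."""
    return (min(values) + max(values)) / 2

# Hypothetical data, for illustration only.
typical = [10, 12, 13, 15, 20]
with_outlier = [10, 12, 13, 15, 100]   # one extremely high value

print(midrange(typical))        # (10 + 20) / 2
print(midrange(with_outlier))   # (10 + 100) / 2, pulled far from most values
```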
Example 2. General Weighted Average (GWA)

Below are the general weighted averages (GWA) of 20 students in BSEd major in Mathematics.

Solution
In the given data set, the lowest value is 1.10 and the highest value is 3.00.

Therefore,

$$MR = \frac{\text{lowest value} + \text{highest value}}{2} = \frac{1.10 + 3.00}{2} = \frac{4.10}{2} = 2.05$$

Thus, the midrange of the given GWA scores is 2.05.
Let us summarize the obtained values for the given data set:

Measure     Value
Mean        2.173
Median      2.20
Mode        2.30
Midrange    2.05

Properties: Mean
1. The mean is found by using all the values of the data.
2. The mean varies less than the median or mode when samples are taken from the
same population and all three measures are computed for these samples.
3. The mean is used in computing other statistics, such as the variance.
4. The mean for the data set is unique and not necessarily one of the data values.
5. The mean cannot be computed for the data in a frequency distribution that has an
open-ended class.
6. The mean is affected by extremely high or low values, called outliers, and may not
be the appropriate average to use in these situations.

Properties: Median
1. The median is used to find the center or middle value of a data set.
2. The median is used when it is necessary to find out whether the data values fall into the upper half
or lower half of the distribution.
3. The median is used for an open-ended distribution.
4. The median is affected less than the mean by extremely high or extremely low values.
Properties: Mode
1. The mode is used when the most typical case is desired.
2. The mode is the easiest average to compute/determine.
3. The mode can be used when the data are nominal or categorical, such as religious preference,
gender, political affiliation.
4. The mode is NOT always unique. A data set can have more than one mode, or no mode at all.

Properties: Midrange
1. The midrange is easy to compute.
2. The midrange gives the midpoint.
3. The midrange is affected by extremely high or low values in the data set.

❑ So far, we have learned how to compute and/or identify the mean, median, mode,
and midrange of ungrouped data. However, there are some cases where data are
expressed as a grouped frequency distribution (grouped data).
❑ Let us consider again this data set:
Temperatures (°F) in Provinces
These data represent the record high temperatures in degrees Fahrenheit for 50
provinces in the Philippines.
❑ Grouped Frequency Distribution Table:

❑ To compute for the mean of grouped data,

Class Mark Formula:

$$\bar{x} = \frac{\sum f \cdot x_m}{n}$$

where: $x_m$ = class mark
$f$ = frequency of each class
$n$ = total number of values (sum of the frequencies)

Coded Formula:

$$\bar{x} = x_0 + \left(\frac{\sum f \cdot x'}{n}\right) i$$

where: $x_0$ = assumed mean (class mark with code 0)
$x'$ = coded value
$f$ = frequency of each class
$i$ = class interval/class width
Mean of Grouped Data

Steps:
1. Add “Class Mark” column and find the midpoints of every class.
2. Add “𝒇 ∙ 𝒙𝒎” column and type in the values (multiply the frequency and the class mark for each
class).
3. Find the sum of “𝒇 ∙ 𝒙𝒎”.
4. Plug in the values to the formula.

$$\bar{x} = \frac{\sum f \cdot x_m}{n} = \frac{5710}{50} = 114.2$$

Thus, the mean temperature in 50 provinces was 114.2°F.
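The class mark steps can be sketched in Python as follows. The frequency table in `classes` is an assumption: the table itself is not reproduced in this text, so the class limits and frequencies were chosen to agree with the figures stated in the lesson (n = 50 and Σf·x_m = 5710). The function name `grouped_mean` is hypothetical.

```python
# Assumed frequency table: (lower class limit, upper class limit, frequency).
# Chosen to be consistent with n = 50 and a sum of f * x_m equal to 5710.
classes = [
    (100, 104, 2), (105, 109, 8), (110, 114, 18), (115, 119, 13),
    (120, 124, 7), (125, 129, 1), (130, 134, 1),
]

def grouped_mean(table):
    """Mean of grouped data using the class mark formula."""
    n = sum(f for _, _, f in table)                       # total frequency
    # Steps 1-3: class mark = midpoint of the limits; sum up f * x_m
    total = sum(f * (lo + hi) / 2 for lo, hi, f in table)
    return total / n                                      # Step 4

x_bar_grouped = grouped_mean(classes)
print(round(x_bar_grouped, 1))
```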
Mean of Grouped Data

Steps:
1. Add “Coded Value” column and assign integer values to each class; the class whose class mark is the
assumed mean $x_0$ gets code 0.
2. Add “𝒇 ∙ 𝒙′” column and type in the values for every class (multiply the frequency and the
assigned coded value).
3. Find the sum of “𝒇 ∙ 𝒙′”.
4. Plug in the values to the formula.

$$\bar{x} = x_0 + \left(\frac{\sum f \cdot x'}{n}\right) i = 117 + \left(\frac{-28}{50}\right)(5) = 114.2$$
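A sketch of the coded formula is given here. The class marks and frequencies are assumptions chosen to agree with the figures in the lesson (x_0 = 117, i = 5, n = 50, Σf·x′ = −28), since the table is not reproduced in this text. The function name `coded_mean` is hypothetical.

```python
# Assumed class marks and frequencies (consistent with the lesson's totals).
class_marks = [102, 107, 112, 117, 122, 127, 132]
freqs = [2, 8, 18, 13, 7, 1, 1]

def coded_mean(marks, frequencies, assumed_mean, width):
    """Return (mean, sum of f * x') using the coded formula."""
    n = sum(frequencies)
    # Step 1: code each class; the class at the assumed mean gets code 0
    codes = [(xm - assumed_mean) / width for xm in marks]
    # Steps 2-3: multiply frequency by code and add up
    sum_fx = sum(f * c for f, c in zip(frequencies, codes))
    # Step 4: plug into x_bar = x_0 + (sum(f * x') / n) * i
    return assumed_mean + (sum_fx / n) * width, sum_fx

x_bar_coded, sum_f_coded = coded_mean(class_marks, freqs, assumed_mean=117, width=5)
print(sum_f_coded, round(x_bar_coded, 1))
```

Any class mark can serve as the assumed mean; the result agrees with the class mark formula because the coding only shifts and rescales the class marks.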
Median (Grouped Data)

❑ To compute for the median of grouped data,
▪ Median Formula:

$$MD = x_{LB} + \left(\frac{\frac{n}{2} - cf_p}{f_m}\right) i$$

Where: $x_{LB}$ = lower boundary of the median class
$f_m$ = frequency of the median class
$cf_p$ = cumulative frequency of the class preceding the median class
$i$ = class interval/class width

Median (Grouped Data)

Steps:
1. Add columns for “Class Boundaries” and “Cumulative Frequency”.
Note:
i. The formula requires/uses the lower class boundaries.
ii. The < 𝑐𝑓 is obtained by adding up the frequencies from the lowest class.
2. Determine the “Median Class”.
Note: The median class is the class containing the middle score.
→ n/2 = (50)/2 = 25th score
[Figure: frequency table with the median class, the class containing the 25th score, highlighted]
3. Plug in the values to the formula.

$$MD = x_{LB} + \left(\frac{\frac{n}{2} - cf_p}{f_m}\right) i = 109.5 + \left(\frac{25 - 10}{18}\right)(5) \approx 113.67$$

Thus, the median temperature was 113.67℉.
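The three steps for the grouped median can be sketched as a single pass over the table. The table below is an assumption: its lower class boundaries and frequencies are not reproduced in this text and were chosen to agree with the values used in the lesson (n = 50, cf_p = 10, f_m = 18, x_LB = 109.5, i = 5). The function name `grouped_median` is hypothetical.

```python
# Assumed table: (lower class boundary, frequency), lowest class first.
boundary_table = [
    (99.5, 2), (104.5, 8), (109.5, 18), (114.5, 13),
    (119.5, 7), (124.5, 1), (129.5, 1),
]

def grouped_median(table, width):
    """Median of grouped data: x_LB + ((n/2 - cf_p) / f_m) * i."""
    n = sum(f for _, f in table)
    half = n / 2              # position of the middle score
    cf_before = 0             # cumulative frequency of the preceding classes
    for lower_boundary, f in table:
        if cf_before + f >= half:   # this class contains the middle score
            return lower_boundary + (half - cf_before) / f * width
        cf_before += f

md_grouped = grouped_median(boundary_table, width=5)
print(round(md_grouped, 2))
```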
Mode (Grouped Data)

❑ To compute for the mode of grouped data,
▪ Mode Formula:

$$\hat{x} = x_{LB} + \left(\frac{d_1}{d_1 + d_2}\right) i$$

Where: $x_{LB}$ = lower boundary of the modal class
$d_1$ = difference of the frequency of the modal class and the frequency of the preceding class
$d_2$ = difference of the frequency of the modal class and the frequency of the succeeding class
$i$ = class interval/class width
Mode (Grouped Data)

Steps:
1. Add “Class Boundaries” column and type in the values.
2. Determine the modal class (the class with the highest frequency). → Modal class: 110 – 114
3. Determine $d_1$ and $d_2$.
4. Plug in the values to the formula.

$$\hat{x} = x_{LB} + \left(\frac{d_1}{d_1 + d_2}\right) i = 109.5 + \left(\frac{10}{10 + 5}\right)(5) \approx 112.83$$

Thus, the mode of the given data set is 112.83℉.
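A sketch of the grouped mode computation follows. The boundaries and frequencies are assumptions chosen to agree with the lesson's values (modal class 110 – 114 with lower boundary 109.5, d_1 = 10, d_2 = 5, i = 5), since the table is not reproduced in this text. The function name `grouped_mode` is hypothetical, and the code assumes a single modal class whose frequency is strictly higher than at least one neighbor.

```python
# Assumed lower class boundaries and frequencies, lowest class first.
lower_boundaries = [99.5, 104.5, 109.5, 114.5, 119.5, 124.5, 129.5]
frequencies = [2, 8, 18, 13, 7, 1, 1]

def grouped_mode(boundaries, freqs, width):
    """Mode of grouped data: x_LB + (d1 / (d1 + d2)) * i."""
    # Step 2: the modal class is the one with the highest frequency
    k = max(range(len(freqs)), key=freqs.__getitem__)
    # A missing neighbor (first or last class) is treated as frequency 0
    prev_f = freqs[k - 1] if k > 0 else 0
    next_f = freqs[k + 1] if k < len(freqs) - 1 else 0
    # Step 3: differences with the preceding and succeeding classes
    d1 = freqs[k] - prev_f
    d2 = freqs[k] - next_f
    # Step 4: plug into the formula
    return boundaries[k] + d1 / (d1 + d2) * width

mode_grouped = grouped_mode(lower_boundaries, frequencies, width=5)
print(round(mode_grouped, 2))
```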
MEASURES OF VARIABILITY

❑ Consider the following problem:

❑ A testing lab wishes to test two experimental brands of outdoor paints to see how long each will last
before fading. The testing lab makes 6 gallons of each paint to test. Since different chemical agents
are added to each group and only six cans are involved, these two groups constitute two small
populations. The results (in months) are shown in the table.

❑ Which brand is better? Why? What is your basis?



❑ In statistics, to describe the data set accurately, statisticians must know more than the measures of
central tendency.
❑ The “measures of variability” are a group of analytical tools that describe the “spread” or variability of a
data set.
❑ They indicate how close or widespread the scores are from the average.
❑ Different measures of variability:
▪ Range
▪ Interquartile Range and Interquartile Deviation
▪ Mean Deviation
▪ Variance and Standard deviation

Range
❑ The range is the highest value minus the lowest value. The symbol R is used for the range.
❑ It tells us the “width” of our data set.
❑ Advantages:
▪ Easy to calculate
❑ Disadvantages:
▪ It does not consider every value in the data set, so it cannot show whether most of the scores lie near the extremes
▪ Easily affected by extreme values
❑ The formula:
𝑅 = 𝐻𝑉 − 𝐿𝑉
where HV = highest value and LV = lowest value
Example:
Solution.
Identifying the highest value and lowest value in each data set and plugging in,
Brand A: R = 60 – 10 = 50 months
Brand B: R = 45 – 25 = 20 months
Interpretation:
The range of Brand A shows that 50 months separate the largest data value from the smallest data value. For Brand
B, 20 months separate the largest data value from the smallest data value, which is less than one-half of Brand A’s
range.
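The range computation can be sketched as below. The two lists are assumptions: the table of paint results is not reproduced in this text, so the values were chosen to be consistent with the ranges stated in the lesson (50 months for Brand A, 20 months for Brand B). The function name `value_range` is hypothetical.

```python
# Assumed results in months (consistent with the stated ranges of 50 and 20).
brand_a = [10, 60, 50, 30, 40, 20]
brand_b = [35, 45, 30, 35, 40, 25]

def value_range(values):
    """R = HV - LV: highest value minus lowest value."""
    return max(values) - min(values)

range_a = value_range(brand_a)
range_b = value_range(brand_b)
print(range_a, range_b)
```

Under these assumed values both brands have the same mean (35 months), so the difference between them shows up only in the spread.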

Interquartile Range and Interquartile Deviation


❑ The interquartile range, denoted by IQR, is the difference between the upper and lower quartiles.
That is,

$$IQR = Q_3 - Q_1$$

❑ It tells us the difference between the largest and smallest values in the middle 50% of a data set.
❑ The interquartile deviation, denoted by IQD, is half of the difference between the upper
and lower quartiles. That is,

$$IQD = \frac{Q_3 - Q_1}{2}$$

❑ It is not affected by extreme values; thus, it is a resistant measure of variability.
Example:
Compute for the IQR and IQD of the given data set below.
1 3 4 5 5 6 7 11
Solution. Determine $Q_1$ and $Q_3$,
$Q_1$: 3.5 → the middle value of the lower half (1, 3, 4, 5)
$Q_3$: 6.5 → the middle value of the upper half (5, 6, 7, 11)
Thus,

$$IQR = Q_3 - Q_1 = 6.5 - 3.5 = 3$$

and

$$IQD = \frac{IQR}{2} = \frac{3}{2} = 1.5$$
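The example above can be reproduced with a short sketch that takes the medians of the lower and upper halves. The function name `quartiles_by_halves` is hypothetical, and for an odd number of values the code assumes the convention of leaving the overall median out of both halves. Other quartile conventions, including those in statistical software, may give slightly different values.

```python
from statistics import median

def quartiles_by_halves(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For odd n, skip the middle value so it belongs to neither half
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return median(lower), median(upper)

data = [1, 3, 4, 5, 5, 6, 7, 11]   # data set from the example
q1, q3 = quartiles_by_halves(data)
iqr = q3 - q1       # interquartile range
iqd = iqr / 2       # interquartile deviation
print(q1, q3, iqr, iqd)
```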
Mean Absolute Deviation

❑ The mean absolute deviation, denoted by MAD, is the average distance of all the elements in a
data set from the mean of the same data set.
❑ It indicates how spread out your data set is. A large MAD indicates a data set that is more spread
out in relation to the mean. On the other hand, a smaller MAD would indicate data that is less spread
out and located closer to the mean.
❑ To compute for MAD, we use,

$$MAD = \frac{\sum |X - \mu|}{N}$$
Example:
Compute for the MAD of each data set.

Solution.
Using the formula,

$$MAD_A = \frac{\sum |X - \mu|}{N} = \frac{90}{6} = 15$$ → spread out about the mean

$$MAD_B = \frac{\sum |X - \mu|}{N} = \frac{30}{6} = 5$$ → relatively closer to the mean
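The MAD formula can be sketched as follows. The two lists are assumptions: the data table is not reproduced in this text, so the values were chosen to be consistent with the sums used in the lesson (Σ|X − μ| = 90 for Brand A and 30 for Brand B, with N = 6). The function name `mean_absolute_deviation` is hypothetical.

```python
# Assumed results in months (consistent with the stated sums of 90 and 30).
brand_a = [10, 60, 50, 30, 40, 20]
brand_b = [35, 45, 30, 35, 40, 25]

def mean_absolute_deviation(values):
    """MAD = sum(|X - mu|) / N, treating the data as a population."""
    mu = sum(values) / len(values)
    return sum(abs(x - mu) for x in values) / len(values)

mad_a = mean_absolute_deviation(brand_a)
mad_b = mean_absolute_deviation(brand_b)
print(mad_a, mad_b)
```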
Variance and Standard Deviation

❑ Generally, the variance is the average of the squares of the distance each value is from the mean.
❑ The symbol for the population variance is $\sigma^2$ ($\sigma$ is the Greek lowercase letter sigma).
❑ The formula,

$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$

Where: $X$ = individual value
$\mu$ = population mean
$N$ = population size
❑ The population standard deviation is the square root of the variance. The symbol for the population standard deviation is
$\sigma$.
❑ The corresponding formula,

$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (X - \mu)^2}{N}}$$
Example. Compute for the variance and standard deviation of each data set.

Solution.
Using the formula,

Brand A: $$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{1750}{6} \approx 291.67 \rightarrow \sigma = \sqrt{\sigma^2} = \sqrt{291.67} \approx 17.08$$

Brand B: $$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{250}{6} \approx 41.67 \rightarrow \sigma = \sqrt{\sigma^2} = \sqrt{41.67} \approx 6.45$$
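A sketch of the population variance and standard deviation is given here. The two lists are assumptions: the data table is not reproduced in this text, so the values were chosen to be consistent with the sums used in the lesson (Σ(X − μ)² = 1750 for Brand A and 250 for Brand B, with N = 6). The function name `population_variance` is hypothetical.

```python
from math import sqrt

# Assumed results in months (consistent with the stated sums of 1750 and 250).
brand_a = [10, 60, 50, 30, 40, 20]
brand_b = [35, 45, 30, 35, 40, 25]

def population_variance(values):
    """sigma^2 = sum((X - mu)^2) / N."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

var_a = population_variance(brand_a)
var_b = population_variance(brand_b)
sd_a = sqrt(var_a)   # population standard deviation of Brand A
sd_b = sqrt(var_b)   # population standard deviation of Brand B
print(round(var_a, 2), round(sd_a, 2))
print(round(var_b, 2), round(sd_b, 2))
```

Rounding is applied only when printing, in line with the general rounding rule of keeping full precision until the final answer.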
❑ On the other hand, the formula for the sample variance, denoted by $s^2$, is

$$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$

Where: $X$ = individual value
$\bar{X}$ = sample mean
$n$ = sample size
❑ Similarly, the corresponding formula for the sample standard deviation is

$$s = \sqrt{s^2} = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$$
If (for the sake of comparison) we treat the same data sets as sample data sets and use the formulas for sample variance
and standard deviation,

Solution.
Using the formula for sample variance and standard deviation,

Brand A: $$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{1750}{6 - 1} = 350 \rightarrow s = \sqrt{s^2} = \sqrt{350} \approx 18.71$$

Brand B: $$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{250}{6 - 1} = 50 \rightarrow s = \sqrt{s^2} = \sqrt{50} \approx 7.07$$
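As one possible use of statistical software, the sample and population versions can be compared with Python's standard `statistics` module. The two lists are assumptions: the data table is not reproduced in this text, so the values were chosen to be consistent with the sums used in the lesson (1750 and 250, with n = 6).

```python
from statistics import pvariance, stdev, variance

# Assumed results in months (consistent with the stated sums of 1750 and 250).
brand_a = [10, 60, 50, 30, 40, 20]
brand_b = [35, 45, 30, 35, 40, 25]

# variance() and stdev() divide by n - 1 (sample formulas)
s2_a, s_a = variance(brand_a), stdev(brand_a)
s2_b, s_b = variance(brand_b), stdev(brand_b)

# pvariance() divides by N (population formula), shown for comparison
sigma2_a = pvariance(brand_a)

print(s2_a, round(s_a, 2))
print(s2_b, round(s_b, 2))
print(round(sigma2_a, 2))
```

Dividing by n − 1 instead of N makes the sample variance larger than the population variance computed from the same numbers.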
