Lec 6

Uploaded by

shima abdeen
An Introduction to Statistics and Probability (S102)

Numerical Summaries
Measures of Central Tendency
Lecture 6

Azza Osman Mohamed, Ph.D

Faculty of Mathematical Sciences and Informatics


University of Khartoum

July 20, 2022

Outline

1 Measures of Central Tendency
    The Mean
    The Median
    The Mode
    The Midrange
    The Weighted Mean
    The Geometric Mean
    The Harmonic Mean
    Properties and Uses of Measures of Central Tendency

Lecture Objectives

Summarize data using measures of central tendency: the mean,
median, mode, midrange, weighted mean, geometric mean, and
harmonic mean.

Measures of Central Tendency

In previous lectures you learned how to gain useful information
from raw data by organizing them into a frequency distribution and
then presenting the data by using various graphs.
In this lecture, you will study measures of central tendency,
which are numerical methods for summarizing data briefly.
You can see examples of summarizing a large set of data in
day-to-day life:
- Average marks obtained by the students of a class in a test.
- Average rainfall in an area.
- Median income for college professors.
- The shoe size in greatest demand, which a manufacturer would
  like to know.

The Mean

The mean, or arithmetic mean, is the most commonly used
measure of central tendency.
It is defined as the sum of the values of all observations divided by
the number of observations, and is usually denoted by $\bar{x}$:

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n}$$

where n represents the total number of values in the sample.

For a population, µ is used for the mean:

$$\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + X_3 + \dots + X_N}{N}$$

where N represents the total number of values in the population.

The Mean

Example 1
The data represent the number of days off per year for a sample of
individuals selected from nine different countries.

20, 26, 40, 36, 23, 42, 35, 24, 30

Find the mean.

The Mean

Solution
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{20+26+40+36+23+42+35+24+30}{9} = \frac{276}{9} \approx 30.7$$

Hence, the mean number of days off is 30.7 ≈ 31 days.
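
As a quick check of Example 1, the definition can be evaluated directly. This is a minimal sketch in Python; the variable names are illustrative and not part of the lecture, and the standard-library `statistics.mean` is used only to cross-check the hand formula.

```python
from statistics import mean

# Days off per year for the nine sampled individuals (Example 1).
days_off = [20, 26, 40, 36, 23, 42, 35, 24, 30]

# Definition of the sample mean: sum of the observations / number of observations.
x_bar = sum(days_off) / len(days_off)

print(sum(days_off))              # 276
print(round(x_bar, 1))            # 30.7
print(round(mean(days_off), 1))   # 30.7, same value from the standard library
```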

The Mean

Example 2
The manager of the BiLo Supermarket gathered the following
information on the number of times a customer visits the store during a
month. The responses of 20 customers are summarized in the frequency
distribution below. Find the mean.

Class limits    Frequency
6 – 10          1
11 – 15         2
16 – 20         3
21 – 25         5
26 – 30         4
31 – 35         3
36 – 40         2
Total           20

The Mean

Solution
First, find the class boundaries.
Second, find the midpoint $x_m$ of each class.
Third, multiply each frequency f by its midpoint and sum the products.
Fourth, divide that sum by n to get the mean:

$$\bar{x} = \frac{\sum f \cdot x_m}{n}$$

The Mean

Solution:

Class boundaries    Frequency (f)    Midpoint (x_m)    f · x_m
5.5 – 10.5          1                8                 8
10.5 – 15.5         2                13                26
15.5 – 20.5         3                18                54
20.5 – 25.5         5                23                115
25.5 – 30.5         4                28                112
30.5 – 35.5         3                33                99
35.5 – 40.5         2                38                76
Total               20                                 490

$$\bar{x} = \frac{\sum f \cdot x_m}{n} = \frac{490}{20} = 24.5$$
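
The four steps above can be sketched in a few lines of Python. The function name `grouped_mean` and the tuple layout `(lower limit, upper limit, frequency)` are assumptions made for this illustration. The midpoint is computed from the class limits, which gives the same value as computing it from the class boundaries.

```python
# Each class is (lower limit, upper limit, frequency), taken from Example 2.
classes = [
    (6, 10, 1), (11, 15, 2), (16, 20, 3), (21, 25, 5),
    (26, 30, 4), (31, 35, 3), (36, 40, 2),
]

def grouped_mean(classes):
    """Mean of grouped data: sum(f * midpoint) / sum(f)."""
    n = sum(f for _, _, f in classes)
    total = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
    return total / n

print(grouped_mean(classes))  # 24.5
```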

The Median

The Median
When the data set is ordered, it is called a data array.
The median is the midpoint of the data array. The symbol for the
median is MD.
Example
The numbers of rooms in seven hotels are
713, 300, 618, 595, 311, 401, and 292. Find the median.

The Median

Solution:
First, arrange the data in order:
292, 300, 311, 401, 595, 618, 713
Second, find the median of the ordered data.
If the number of data values n is odd, the median is the value at
position $\frac{n+1}{2}$.
If n is even, the median is the average of the two middle values,
at positions $m_1 = \frac{n}{2}$ and $m_2 = \frac{n}{2} + 1$:

$$MD = \frac{x_{m_1} + x_{m_2}}{2}$$

In this example n = 7 is odd, so the median position is
$\frac{7+1}{2} = 4$; the median is the fourth element, which is 401.

The Median

Example
The number of children with asthma during a specific year in eight
cities is:

253, 125, 328, 417, 201, 70, 110, 90

Find the median.

Solution:

First, order the data:

70, 90, 110, 125, 201, 253, 328, 417

The number of data values is even (n = 8).
Second, find the positions of the two middle values:

$$m_1 = \frac{8}{2} = 4, \qquad m_2 = \frac{8}{2} + 1 = 5$$

$$MD = \frac{x_4 + x_5}{2} = \frac{125 + 201}{2} = \frac{326}{2} = 163$$
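
The odd and even cases used in the two median examples can be sketched as one Python function. The function and variable names are illustrative. Positions in the lecture are counted from 1, while Python lists are indexed from 0, hence the `- 1`.

```python
def median(values):
    """Median of raw data, following the odd/even position rule."""
    data = sorted(values)           # the ordered data array
    n = len(data)
    if n % 2 == 1:
        # Odd n: the value at position (n + 1) / 2.
        return data[(n + 1) // 2 - 1]
    # Even n: average of the values at positions n/2 and n/2 + 1.
    m1, m2 = n // 2, n // 2 + 1
    return (data[m1 - 1] + data[m2 - 1]) / 2

rooms = [713, 300, 618, 595, 311, 401, 292]
asthma = [253, 125, 328, 417, 201, 70, 110, 90]

print(median(rooms))   # 401
print(median(asthma))  # 163.0
```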

The Median

Example
Using the frequency table below, find the median.

Class limits    Frequency
6 – 10          1
11 – 15         2
16 – 20         3
21 – 25         5
26 – 30         4
31 – 35         3
36 – 40         2
Total           20

The Median

Solution
First, find the cumulative frequency of each class.
Second, determine the median class, which is the first class whose
cumulative frequency is greater than or equal to the median order 0.5 n.
Third, find the median value using the rule

$$MD = L + \frac{w}{f_m}\,(0.5\,n - cf_b)$$

where
L = lower class boundary of the class that contains the median
n = total frequency
$cf_b$ = the sum of frequencies (cumulative frequency)
for all classes before the median class
$f_m$ = frequency of the class interval containing the median
w = interval width

The Median

Solution

Class             Frequency    Cumulative frequency (CF)
less than 10.5    1            1
less than 15.5    2            3
less than 20.5    3            6
less than 25.5    5            11
less than 30.5    4            15
less than 35.5    3            18
less than 40.5    2            20
Total             20

The median order is 0.5 × 20 = 10. The first cumulative frequency that
reaches 10 is 11, so the median class is 21 – 25 (boundaries
20.5 – 25.5), with $cf_b = 6$, $f_m = 5$ and w = 5. Applying the median
rule, we get

$$MD = 20.5 + \frac{5}{5}\,(0.5 \times 20 - 6) = 24.5$$
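
The grouped-median rule can be sketched in Python as follows. The function name and argument layout are assumptions made for this illustration, and the median class is taken to be the first class whose cumulative frequency reaches 0.5 n, which is one common convention.

```python
def grouped_median(boundaries, freqs):
    """Median of grouped data: MD = L + (w / f_m) * (0.5 * n - cf_b).

    boundaries: list of (lower boundary, upper boundary) per class
    freqs:      list of class frequencies, in the same order
    """
    n = sum(freqs)
    if n == 0:
        raise ValueError("frequency table is empty")
    half = 0.5 * n
    cf_before = 0                       # cf_b: cumulative frequency before the class
    for (lower, upper), f in zip(boundaries, freqs):
        if cf_before + f >= half:       # first class whose CF reaches 0.5 * n
            width = upper - lower
            return lower + width / f * (half - cf_before)
        cf_before += f

boundaries = [(5.5, 10.5), (10.5, 15.5), (15.5, 20.5), (20.5, 25.5),
              (25.5, 30.5), (30.5, 35.5), (35.5, 40.5)]
freqs = [1, 2, 3, 5, 4, 3, 2]

print(grouped_median(boundaries, freqs))  # 24.5
```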

The Mode

The mode is the least used of the measures of central tendency, but it
is the only one of them that can be used with nominal data.

The Mode
The mode is the value that occurs most often in a data set.

The Mode

A data set that has only one value that occurs with the greatest
frequency is said to be unimodal.
If a data set has two values that occur with the same greatest frequency,
both values are considered to be the mode and the data set is said to be
bimodal.
If a data set has more than two values that occur with the same greatest
frequency, each value is used as the mode, and the data set is said to
be multimodal.
When no data value occurs more than once, the data set is said to have
no mode.
A data set can have more than one mode or no mode at all.

The Mode

Example
Find the mode for each of the following data sets.
Data set 1: 8, 9, 9, 14, 8, 8, 10, 7, 6, 9, 7, 8, 10, 14, 11, 8, 14, 11
Data set 2: 10, 31, 30, 84, 20, 18, 62, 77, 33, 52
Data set 3: 15, 18, 18, 18, 20, 22, 24, 24, 24, 26, 26

The Mode

solution
1 The mode is 8.
2 There is no mode.
3 The modes are 18 and 24.
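
The three answers can be reproduced by counting how often each value occurs. This is a small sketch using `collections.Counter`; the function name `modes` is illustrative, and it returns an empty list when no value occurs more than once, matching the "no mode" convention of this lecture.

```python
from collections import Counter

def modes(values):
    """Return all values that share the greatest frequency (empty list = no mode)."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        return []                      # no value occurs more than once
    return sorted(v for v, c in counts.items() if c == top)

set1 = [8, 9, 9, 14, 8, 8, 10, 7, 6, 9, 7, 8, 10, 14, 11, 8, 14, 11]
set2 = [10, 31, 30, 84, 20, 18, 62, 77, 33, 52]
set3 = [15, 18, 18, 18, 20, 22, 24, 24, 24, 26, 26]

print(modes(set1))  # [8]
print(modes(set2))  # []
print(modes(set3))  # [18, 24]
```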

The Mode

Example
Using the frequency table from the previous example, find the mode.

Class limits    Frequency
6 – 10          1
11 – 15         2
16 – 20         3
21 – 25         5
26 – 30         4
31 – 35         3
36 – 40         2
Total           20

The Mode

Solution
First, find the modal class, which is the class with the highest
frequency.
Second, find the mode value using the rule

$$\text{mode} = L + \frac{f_i - f_{i-1}}{(f_i - f_{i-1}) + (f_i - f_{i+1})} \times w$$

where
L = lower class boundary of the interval that contains
the mode (the modal class).
$f_i$ = the frequency of the modal class.
$f_{i-1}$ = the frequency of the class previous to the modal class.
$f_{i+1}$ = the frequency of the class after the modal class.
w = the width of the modal class.

The modal class is 21 – 25, with class boundaries 20.5 – 25.5.
Applying the previous rule,

$$\text{mode} = 20.5 + \frac{5-3}{(5-3)+(5-4)} \times 5 \approx 23.83$$
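
The interpolation rule for the mode of grouped data can be sketched as a small Python function. The function and parameter names are illustrative. The call below assumes that the lower class boundary 20.5 of the modal class 21 – 25 is used for L, the same boundary used in the grouped median, with class width 5.

```python
def grouped_mode(lower, width, f_modal, f_prev, f_next):
    """mode = L + (f_i - f_{i-1}) / ((f_i - f_{i-1}) + (f_i - f_{i+1})) * w"""
    d1 = f_modal - f_prev      # excess over the class before the modal class
    d2 = f_modal - f_next      # excess over the class after the modal class
    if d1 + d2 == 0:
        raise ValueError("modal class does not stand out from its neighbours")
    return lower + d1 / (d1 + d2) * width

# Modal class 21 - 25 (frequency 5), neighbours with frequencies 3 and 4.
print(round(grouped_mode(20.5, 5, 5, 3, 4), 2))  # 23.83
```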

The Midrange

The Midrange
The midrange is defined as the sum of the lowest and highest values in
the data set, divided by 2.

Midrange is useful for finding a quick average or midpoint of
certain data sets.
It is a very rough estimate of the average, though the mean is used
more often because it draws on every value.
Outliers (extremely high or low values) can cause difficulties in any
statistical analysis, and they are particularly damaging for the
midrange, which depends only on the maximum and minimum values.
The symbol MR is used for the midrange.

$$MR = \frac{\text{Lowest value} + \text{Highest value}}{2}$$

The Midrange

Example
For the following data set, find the midrange.

2, 3, 6, 8, 4, 1

Solution

$$MR = \frac{1 + 8}{2} = 4.5$$

The midrange is 4.5.

The Midrange

Example
Find the midrange of the yearly incomes for a sample of
employees in a company.

18.0, 14.0, 34.5, 10, 11.3, 10, 12.4, 10

Solution

$$MR = \frac{10 + 34.5}{2} = 22.25$$

Notice that the midrange doesn’t represent most of the data, because
the highest value is extremely large.
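
Both midrange examples can be checked with a one-line function. This is a minimal Python sketch with illustrative names.

```python
def midrange(values):
    """Midrange: (lowest value + highest value) / 2."""
    return (min(values) + max(values)) / 2

small_set = [2, 3, 6, 8, 4, 1]
incomes = [18.0, 14.0, 34.5, 10, 11.3, 10, 12.4, 10]

print(midrange(small_set))  # 4.5
print(midrange(incomes))    # 22.25
```

Only the two extreme values enter the calculation, which is why the single large income of 34.5 pulls the result well above most of the data.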

The Weighted Mean
The weighted mean is a generalization of the arithmetic mean
(average). Instead of each data point contributing equally to the
final mean, some data points carry more “weight” than others.
If all the weights are equal, then the weighted mean equals the
arithmetic mean.
The Weighted Mean
The weighted mean of a variable X is found by multiplying each value
by its corresponding weight and dividing the sum of the products by
the sum of the weights.

$$\bar{X} = \frac{w_1 X_1 + w_2 X_2 + w_3 X_3 + \dots + w_n X_n}{w_1 + w_2 + w_3 + \dots + w_n} = \frac{\sum_{i=1}^{n} w_i X_i}{\sum_{i=1}^{n} w_i}$$

where $w_1, w_2, w_3, \dots, w_n$ are the weights and
$X_1, X_2, X_3, \dots, X_n$ are the values. It is used when you need to
find the mean of a data set in which not all values are equally
represented.

The Weighted Mean

Example
A student received an A in STAT1 (72), a C in Comp (53), a B in
Mang (65), and a D in Math (47). STAT1 and Comp are 3 credit hours
each; Mang and Math are 2 credit hours each. What is the student’s
average?
Solution
Applying the weighted mean rule, we get

$$\bar{X} = \frac{72 \times 3 + 53 \times 3 + 65 \times 2 + 47 \times 2}{3+3+2+2} = \frac{599}{10} = 59.9$$

The student’s grade average is 59.9.
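
The same calculation can be sketched in Python, with the credit hours acting as weights. The function and variable names are illustrative.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w_i * X_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

marks = [72, 53, 65, 47]     # STAT1, Comp, Mang, Math
credits = [3, 3, 2, 2]       # credit hours used as weights

print(weighted_mean(marks, credits))        # 59.9
print(weighted_mean(marks, [1, 1, 1, 1]))   # 59.25, the plain arithmetic mean
```

With equal weights the result reduces to the ordinary mean, as stated in the definition.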

The Geometric Mean

The geometric mean is most common in business and finance,
where it is frequently used when dealing with percentages to
calculate growth rates and returns on a portfolio of securities. It is
also used in certain financial and stock market indexes.

The Geometric Mean

The geometric mean of a set of n positive numbers is defined as the
nth root of the product of the n values. The formula for the geometric
mean is written:

$$GM = \sqrt[n]{X_1 X_2 \cdots X_n}$$

Taking logarithms gives the equivalent form:

$$\log GM = \frac{\sum \log x}{n}$$

The Geometric Mean

Example
Find the GM of 1, 3 and 9.
Solution

$$GM = \sqrt[3]{(1)(3)(9)} = \sqrt[3]{27} = 3$$
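
A short Python sketch of the logarithmic form of the geometric mean is given here. The function name is illustrative; natural logarithms are used instead of base-10 logarithms, which gives the same result because the base cancels out.

```python
import math

def geometric_mean(values):
    """GM via logs: exp(mean of log x); defined for positive values only."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))

print(round(geometric_mean([1, 3, 9]), 6))  # 3.0
```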

The Geometric Mean

For the frequency distribution of weights of sorghum ear-heads given
in the table below, calculate the geometric mean.

Weight of ear head x (g)    No. of ear heads (f)
60 – 80                     22
80 – 100                    38
100 – 120                   45
120 – 140                   35
140 – 160                   20
Total                       160

The Geometric Mean

First, find the log (base 10) of the midpoint $x_i$ of each class.
Second, find the sum of frequency × log of the midpoint.
Third, find the geometric mean using the rule

$$\log GM = \frac{\sum f_i \log x_i}{\sum f_i}$$

Weight of ear head (g)    No. of ear heads (f)    Midpoint (x_i)    log x_i    f · log x_i
60 – 80                   22                      70                1.845      40.59
80 – 100                  38                      90                1.954      74.25
100 – 120                 45                      110               2.041      91.85
120 – 140                 35                      130               2.114      73.99
140 – 160                 20                      150               2.176      43.52
Total                     160                                                  324.2

$$\log GM = \frac{324.2}{160} = 2.02625$$

$$GM = 10^{2.02625} \approx 106.23$$
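
The tabular calculation can be sketched in Python as below. The names and the tuple layout are illustrative. Because the code keeps full precision in the logarithms rather than rounding them to three decimals as in the table, its result differs slightly in the second decimal place from the hand calculation.

```python
import math

# Each class is (lower limit, upper limit, frequency) from the sorghum table.
classes = [(60, 80, 22), (80, 100, 38), (100, 120, 45),
           (120, 140, 35), (140, 160, 20)]

def grouped_geometric_mean(classes):
    """log GM = sum(f * log10(midpoint)) / sum(f), then GM = 10 ** log GM."""
    n = sum(f for _, _, f in classes)
    log_sum = sum(f * math.log10((lo + hi) / 2) for lo, hi, f in classes)
    return 10 ** (log_sum / n)

gm = grouped_geometric_mean(classes)
print(round(gm, 2))  # about 106.27 with unrounded logs
```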

Harmonic mean
Harmonic mean
The harmonic mean (HM) is defined as the number of values divided
by the sum of the reciprocals of each value.

The harmonic mean is useful for averaging ratios and rates, such as
speed (speed is expressed as a ratio of two units of measurement,
such as km/hr).

$$HM = \frac{n}{\sum \frac{1}{x}}$$

or, for a frequency distribution,

$$HM = \frac{\sum f_i}{\sum_i f_i \frac{1}{x_i}}$$

For example, the HM of 1, 4, 5 and 2 is

$$HM = \frac{4}{\frac{1}{1} + \frac{1}{4} + \frac{1}{5} + \frac{1}{2}} = \frac{4}{1.95} \approx 2.05$$
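
The definition translates directly into Python. This is a minimal sketch with an illustrative function name. The second call uses hypothetical speeds, not data from the lecture, to show why the harmonic mean suits rates: travelling the same distance at 60 km/hr and then at 40 km/hr gives an average speed of 48 km/hr, not 50.

```python
def harmonic_mean(values):
    """HM = n / sum(1 / x); defined here for positive values only."""
    if any(v <= 0 for v in values):
        raise ValueError("harmonic mean requires positive values")
    return len(values) / sum(1 / v for v in values)

print(round(harmonic_mean([1, 4, 5, 2]), 2))  # 2.05
print(round(harmonic_mean([60, 40]), 6))      # 48.0 (hypothetical speeds in km/hr)
```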
Properties and Uses of Measures of Central Tendency (The Mean)

1 The mean is found by using all the values of the data.
2 The mean varies less than the median or mode when samples are
taken from the same population and all three measures are
computed for these samples.
3 The mean is used in computing other statistics, such as the
variance.
4 The mean for the data set is unique and not necessarily one of the
data values.
5 The mean cannot be computed for the data in a frequency
distribution that has an open-ended class.
6 The mean is affected by extremely high or low values, called
outliers, and may not be the appropriate average to use in these
situations.

Properties and Uses of Measures of Central Tendency (The Median)

1 The median is used to find the center or middle value of a data set.
2 The median is used when it is necessary to find out whether the
data values fall into the upper half or lower half of the distribution.
3 The median is used for an open-ended distribution.
4 The median is affected less than the mean by extremely high or
extremely low values.
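
The different sensitivity of the mean and the median to extreme values can be illustrated with the yearly-income data from the midrange example. This is a small sketch using Python's standard `statistics` module; the variable names are illustrative.

```python
from statistics import mean, median

# Yearly incomes from the midrange example; 34.5 is an extremely high value.
incomes = [18.0, 14.0, 34.5, 10, 11.3, 10, 12.4, 10]
without_outlier = [x for x in incomes if x != 34.5]

print(f"{mean(incomes):.3f}")            # 15.025
print(f"{median(incomes):.2f}")          # 11.85
print(f"{mean(without_outlier):.3f}")    # 12.243
print(f"{median(without_outlier):.2f}")  # 11.30
```

Dropping the single extreme value moves the mean by almost three units but the median by only about half a unit.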

Properties and Uses of Measures of Central Tendency (The Mode)

1 The mode is used when the most typical case is desired.
2 The mode is the easiest average to compute.
3 The mode can be used when the data are nominal or categorical,
such as gender.
4 The mode is not always unique. A data set can have more than
one mode, or the mode may not exist for a data set.

Properties and Uses of Measures of Central Tendency (The Midrange)

1 The midrange is easy to compute.
2 The midrange gives the midpoint.
3 The midrange is affected by extremely high or low values in a data
set.

Properties and Uses of Measures of Central Tendency (The Geometric Mean)

1 The geometric mean can be more representative than the
arithmetic mean for skewed data, because it is less affected by
extremely high values.
2 The geometric mean is the most appropriate measure for averaging
ratios and growth rates.
3 The geometric mean cannot be used if any data value is zero or
negative.
