Week7 - Measures of Central Tendency

Uploaded by

bonolobadire447
Copyright
© © All Rights Reserved

Week 7

Measures of Central Tendency/Location
Review

• Distinguish between population and sample, parameter and statistic, and good sampling methods: simple random sample, stratified sample, etc.
• Frequency distributions, summarizing data using graphs, describing the center, variation, distribution, outliers, and changing characteristics over time in a data set
Objectives
By the end of this lesson, you must be able to:
1. Distinguish between the measures of central tendency: arithmetic mean, median, mode and midrange.
2. Compute the mean, median and mode for grouped and ungrouped data.
3. Interpret the mean, median and mode.
4. Describe the effect of outliers on the measures of central tendency.
5. Describe quartiles and percentiles.
6. Construct and describe box plots for normal and skewed distributions.
Summary Definitions

• Measures of central tendency/location: the extent to which all the data values group around a typical or central value (concentration)
• Central location is the middle value around which the data are concentrated
• Non-central location measures identify values “off the centre”
Round-off Rule for Measures of Center/Location

• Carry one more decimal place than is present in the original set of values
Critical Thinking

• Think about whether the results are reasonable.
• Think about the method used to collect the sample data.
Measures of Central Tendency/Location

• The value at the center or middle of a data set
• These measures tend to lie near the center of a distribution when the data are arranged according to magnitude.
• If the frequency distribution is symmetrical and unimodal, then all three measures of central tendency (mean, median and mode) will be equal
Measures of Central Tendency/Location
Most common measures of central tendency are:
1. Mean/arithmetic mean/average
2. Median/second quartile/middle quartile/50th percentile
3. Mode (most frequent value)
All three can be used to analyse numerical data. The mode is the only one that can be used for nominal (unordered categorical) data; the median can also be used for ordinal data, since ordinal categories can be ranked.
Symmetric distribution
Arithmetic Mean

• The measure of center obtained by adding the values and dividing the total by the number of values
• What most people call an average.
Notation

• $\sum$ denotes the sum of a set of values.
• $x$ is the variable usually used to represent the individual data values.
• $n$ represents the number of data values in a sample (sample size).
• $N$ represents the number of data values in a population.
Notation

• $\bar{x}$ is pronounced ‘x-bar’ and denotes the mean of a set of sample values: $\bar{x} = \dfrac{\sum x}{n}$
• $\mu$ is pronounced ‘mu’ and denotes the mean of all values in a population: $\mu = \dfrac{\sum x}{N}$
Mean For Ungrouped Data

Find the mean: age of 5 statistics students:
22, 22, 26, 24, 23

$\bar{x} = \dfrac{\sum x}{n} = \dfrac{22 + 22 + 26 + 24 + 23}{5} = \dfrac{117}{5} = 23.4$

Interpretation?
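The same calculation can be sketched in a few lines of Python. The function name `arithmetic_mean` is only illustrative and does not come from the slides.

```python
# Ages of the five statistics students from the example above
ages = [22, 22, 26, 24, 23]


def arithmetic_mean(values):
    """Add the values and divide the total by the number of values."""
    return sum(values) / len(values)


mean_age = arithmetic_mean(ages)
print(round(mean_age, 1))  # 23.4 (one more decimal place than the raw ages)
```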
Mean for Grouped Data
Consider the grouped data below. How do we calculate the mean?

Age group    Frequency (f)
20-24        5
25-29        8
30-34        2
35-39        3
40-44        2
Calculating Mean from a Frequency Distribution - Grouped Data
• Calculate the midpoint of each group and assume that all sample values in each class are equal to the class midpoint. Use the variable x for the class midpoint.
• Multiply the midpoint by the frequency in each group
• The mean will therefore be:
Mean = (sum of midpoints × frequencies) / (sum of frequencies)
$\bar{x} = \dfrac{\sum (f \cdot x)}{\sum f}$
Mean Grouped Data…
Age group      Frequency (f)    Midpoint of age group (x)    Frequency × midpoint (f·x)
20-24          5                22                           110
25-29          8                27                           216
30-34          2                32                           64
35-39          3                37                           111
40-44          2                42                           84
Total (sum)    20                                            585
Mean Grouped Data…
Therefore,
$\bar{x} = \dfrac{\sum (f \cdot x)}{\sum f} = \dfrac{585}{20} = 29.25 \approx 29.3$
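A minimal Python sketch of the grouped-data mean, assuming each class is stored as a hypothetical (lower limit, upper limit, frequency) tuple. As on the slides, every value in a class is treated as if it were equal to the class midpoint.

```python
# (lower limit, upper limit, frequency) for each age group in the table
classes = [(20, 24, 5), (25, 29, 8), (30, 34, 2), (35, 39, 3), (40, 44, 2)]


def grouped_mean(classes):
    """Mean from a frequency distribution: sum(f * x) / sum(f)."""
    total_fx = 0
    total_f = 0
    for lower, upper, f in classes:
        midpoint = (lower + upper) / 2  # class midpoint x
        total_fx += f * midpoint
        total_f += f
    return total_fx / total_f


print(grouped_mean(classes))  # 29.25, reported as 29.3 after rounding
```

Because the raw ages are replaced by midpoints, the result is an approximation of the mean of the original values.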
Mean
Advantages
• Sample means drawn from the same population
tend to vary less than other measures of center
• Takes every data value into account
Disadvantages
• Is sensitive to every data value; one extreme value can affect it dramatically, so it is not a resistant measure of center (it is affected by OUTLIERS). The mean tends to follow the OUTLIER, so in a skewed distribution it is pulled toward the longer tail.
• Not used for categorical data
Skewness
Exercises

• Given the following data on the age of patients in years:
2, 5, 17, 8, 25, 20, 35, 70, 15, 45, 52, 68, 70, 55, 66, 82, 37, 59, 22, 19.
a) Construct a frequency distribution (Classes=8)
b) Use your frequency distribution to calculate the mean.
Median

• The middle value when the original data values are arranged in order of increasing (or decreasing) magnitude
• Denoted by $\tilde{x}$ (pronounced ‘x-tilde’)
• Is not affected by an extreme value: it is a resistant measure of the center
• Cannot be calculated for nominal data, because the categories have no natural order
Median for Ungrouped Data

First sort the values (arrange them in order). Then:
1. If the number of data values is odd, the median is the number located in the exact middle of the list.
2. If the number of data values is even, the median is found by computing the mean of the two middle numbers.
Median for Ungrouped Data
• The location of the median when the values are in numerical order (smallest to largest):
Median position = $\dfrac{n + 1}{2}$ position in the ordered data
• If the number of values is odd, the median is the middle number
• If the number of values is even, the median is the average of the two middle numbers
Note that $\dfrac{n + 1}{2}$ is not the value of the median, only the position of the median in the ranked data
Example

• The durations of hospital stay in hospital x are:
6, 6, 6, 1, 1, 2, 4, 4, 4, 2, 10, 38, 80, 3, 3, 4, 5, 6, 7, 8, 10.
• Calculate the median:
Arrange the numbers in ascending or descending order.
The middle number/score is the median.
i.e. 1, 1, 2, 2, 3, 3, 4, 4, 4, 4, 5, 6, 6, 6, 6, 7, 8, 10, 10, 38, 80
There are n = 21 values, so the median is in position (21 + 1)/2 = 11.
Therefore the median duration of stay = 5
Example

• For an even number of values:
1, 1, 2, 2, 3, 3, 4, 4, 4, 5, 6, 6, 6, 6, 7, 8, 10, 10, 38, 80.
• The median is the average of the two middle numbers (here the 10th and 11th of the n = 20 ordered values).
i.e. $\dfrac{5 + 6}{2} = 5.5$
Interpretation?
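The odd and even cases can be sketched together in Python. The helper name `median` and the variable names are illustrative; the data are the two hospital-stay lists from the examples above.

```python
def median(values):
    """Median of ungrouped data using the (n + 1) / 2 position rule."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        position = (n + 1) // 2       # 1-based position of the middle value
        return ordered[position - 1]  # Python lists are 0-based
    # Even n: average the two middle values (positions n/2 and n/2 + 1)
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


stay_odd = [6, 6, 6, 1, 1, 2, 4, 4, 4, 2, 10, 38, 80, 3, 3, 4, 5, 6, 7, 8, 10]
stay_even = [1, 1, 2, 2, 3, 3, 4, 4, 4, 5, 6, 6, 6, 6, 7, 8, 10, 10, 38, 80]

print(median(stay_odd))   # 5
print(median(stay_even))  # 5.5
```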
Median…

• The median is less often used than the mean.
• However, the median is more stable if the data are asymmetrical.
• The median is not affected by outliers
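A short Python sketch of this point, using the standard-library `statistics` module on the hospital-stay data from the earlier example. Replacing the longest stay (80) with 800 is a made-up change for illustration only.

```python
import statistics

# Ordered hospital-stay data from the earlier example (n = 21)
stays = [1, 1, 2, 2, 3, 3, 4, 4, 4, 4, 5, 6, 6, 6, 6, 7, 8, 10, 10, 38, 80]

# Hypothetical variant: the longest stay becomes far more extreme
extreme = stays[:-1] + [800]

print(statistics.mean(stays), statistics.median(stays))      # mean 10, median 5
print(statistics.mean(extreme), statistics.median(extreme))  # mean about 44.3, median still 5
```

The mean moves a long way in response to a single value, while the median stays where it was.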
Median for Grouped Data
Age group    Frequency (f)
20-24        5
25-29        8
30-34        2
35-39        3
40-44        2

• Determined by finding the age group at which we have 50% of the sample above and 50% below
• This can be done using frequency counts or cumulative frequency
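One way to locate that age group with cumulative frequencies is sketched below in Python. The function name `median_class` is an assumption for illustration, and the sketch only identifies the class containing the median; it does not interpolate a value inside the class.

```python
# (age group, frequency) pairs from the table above
classes = [("20-24", 5), ("25-29", 8), ("30-34", 2), ("35-39", 3), ("40-44", 2)]


def median_class(classes):
    """Return the first class whose cumulative frequency reaches n / 2."""
    n = sum(f for _, f in classes)
    half = n / 2
    cumulative = 0
    for label, f in classes:
        cumulative += f
        if cumulative >= half:
            return label, cumulative
    raise ValueError("no classes given")


print(median_class(classes))  # ('25-29', 13)
```

With n = 20, half the sample is 10 observations. The cumulative frequency is 5 after the first class and 13 after the second, so the median falls in the 25-29 group.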
Midrange

• The value midway between the maximum and minimum values

Midrange = $\dfrac{\text{maximum value} + \text{minimum value}}{2}$
Midrange

• Sensitive to extremes (outliers) because it uses only the maximum and minimum values.
• It is rarely used
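A small Python sketch of the midrange, applied to the hospital-stay data used in the median example. The function name `midrange` is illustrative.

```python
def midrange(values):
    """Value midway between the maximum and the minimum."""
    return (max(values) + min(values)) / 2


stays = [1, 1, 2, 2, 3, 3, 4, 4, 4, 4, 5, 6, 6, 6, 6, 7, 8, 10, 10, 38, 80]
print(midrange(stays))  # 40.5
```

The single longest stay of 80 pushes the midrange to 40.5, far from the median of 5 for the same data.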
Mode
• The value that occurs with the greatest frequency
• A data set can have one, more than one, or no mode
  Bimodal: two data values occur with the same greatest frequency
  Multimodal: more than two data values occur with the same greatest frequency
  No mode: no data value is repeated
• Mode is the only measure of central tendency that can be used with nominal data.
• If data has outliers, use median or mode
Mode.

e.g. if the durations of stay in ward x for patients are as shown below:
3, 4, 5, 7, 2, 3, 5, 9, 1, 5, 3, 5.
• By rearranging the numbers: 1, 2, 3, 3, 3, 4, 5, 5, 5, 5, 7, 9
The mode = 5.
Example

a. 5.40 1.10 0.42 0.73 0.48 1.10       Mode is 1.10
b. 27 27 27 55 55 55 88 88 99         Bimodal: 27 & 55
c. 1 2 3 6 7 8 9 10                   No mode
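The mode examples can be checked with a short Python sketch built on `collections.Counter`. The helper name `modes` is an assumption; it returns a list so that the unimodal, bimodal and no-mode cases can all be represented, and it expects a non-empty data set.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the greatest frequency.

    An empty list means no value is repeated, i.e. there is no mode.
    """
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([3, 4, 5, 7, 2, 3, 5, 9, 1, 5, 3, 5]))  # [5]
print(modes([5.40, 1.10, 0.42, 0.73, 0.48, 1.10]))  # [1.1]
print(modes([27, 27, 27, 55, 55, 55, 88, 88, 99]))  # [27, 55]
print(modes([1, 2, 3, 6, 7, 8, 9, 10]))             # []
```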
Unimodal distribution
Multimodal distribution
Critical Thinking

• When the mean and median are not close to each other in terms of their value, it’s a good idea to report both and let the reader interpret the results from there.
• Also, as a general rule, be sure to ask for the median if you are only given the mean.
Non-Central Location Measures

1. Quartiles
2. Percentiles
Quartiles

• Quartiles split the ordered data into 4 segments with an equal number of values per segment

   25%   |   25%   |   25%   |   25%
         Q1        Q2        Q3

• The first quartile, Q1, is the value for which 25% of the observations are smaller and 75% are larger
• Q2 is the same as the median (50% of the observations are smaller and 50% are larger)
• Only 25% of the observations are greater than the third quartile, Q3
Locating Quartiles
To find a quartile: rank/order the data and determine the value in the appropriate position in the ranked data, where
• First quartile position: Q1 = (n+1)/4 ranked value
• Second quartile position: Q2 = (n+1)/2 ranked value
• Third quartile position: Q3 = 3(n+1)/4 ranked value
Calculation Rules

• When calculating the ranked position use the following rules:
– If the result is a whole number then it is the ranked position to use
– If the result is a fractional half (e.g. 2.5, 7.5, 8.5, etc.) then average the two corresponding data values.
– If the result is not a whole number or a fractional half then round the result to the nearest integer to find the ranked position.
Locating Quartiles- Example

11 12 13 16 16 17 18 21 22
(n = 9)
Q1 is in the (n+1)/4 position; (9+1)/4 = 2.5 position of the ranked data. So use the value halfway between the 2nd and 3rd values:
So Q1 = (12+13)/2 = 12.5
Use the formula to find Q2 and Q3

Q1 and Q3 are measures of non-central location
Q2 = median, is a measure of central location
Locating Quartiles- Example

11 12 13 16 16 17 18 21 22
(n = 9)
Q1 is in the (n+1)/4 position; (9+1)/4 = 2.5 position of the ranked data. So use the value halfway between the 2nd and 3rd values:
So Q1 = (12+13)/2 = 12.5
Q2 is in the (n+1)/2 position; (9+1)/2 = 5th position
So Q2 = median = 16
Q3 is in the 3(n+1)/4 position; 3(9+1)/4 = 7.5 position
So Q3 = (18+21)/2 = 19.5
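The position formulas and calculation rules from these slides can be sketched in Python as follows. The function names are illustrative, and the sketch assumes the data set is large enough for every computed position to fall between 1 and n. Statistical software often uses other quartile conventions, so results may differ slightly from library functions.

```python
def value_at_position(ordered, position):
    """Apply the slides' calculation rules to a 1-based ranked position."""
    whole = int(position)
    fraction = position - whole
    if fraction == 0:
        # Whole number: use that ranked position directly
        return ordered[whole - 1]
    if fraction == 0.5:
        # Fractional half: average the two neighbouring data values
        return (ordered[whole - 1] + ordered[whole]) / 2
    # Otherwise: round to the nearest integer position
    return ordered[round(position) - 1]


def quartiles(values):
    """Return (Q1, Q2, Q3) using positions (n+1)/4, (n+1)/2 and 3(n+1)/4."""
    ordered = sorted(values)
    n = len(ordered)
    return tuple(value_at_position(ordered, k * (n + 1) / 4) for k in (1, 2, 3))


print(quartiles([11, 12, 13, 16, 16, 17, 18, 21, 22]))  # (12.5, 16, 19.5)
```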
Percentiles

11 12 13 16 16 17 18 21 22

• A percentile is a data point below which a given percentage of data points in the distribution fall.
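Read the other way round, this definition lets us ask what percentage of the data falls below a given value. The Python sketch below uses a hypothetical helper `percent_below` and counts strictly smaller values; some textbooks count values at or below the point instead.

```python
def percent_below(values, x):
    """Percentage of data points that are strictly smaller than x."""
    below = sum(1 for v in values if v < x)
    return 100 * below / len(values)


data = [11, 12, 13, 16, 16, 17, 18, 21, 22]
print(round(percent_below(data, 18), 1))  # 66.7
```

Six of the nine values are below 18, so 18 sits at roughly the 67th percentile of this small data set under the strict-inequality convention assumed here.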
Critical Thinking

Compare these two data sets:
1. What are the mean, median and mode?
2. What is the midrange?

• 199, 200, 201
• 0, 200, 400
Critical Thinking

• What if two sets of data have about the same average and the same median? Does that mean that the data are all the same?
• For example, the data sets 199, 200, 201, and 0, 200, 400 both have the same average, which is 200, and the same median, which is also 200. Yet they have very different amounts of variability.
• The first data set has a very small amount of variability compared to the second.
• Therefore, in addition to center we also measure variability.
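The comparison can be made concrete with a short Python sketch. The slides do not name a measure of variability at this point, so the range (maximum minus minimum) is used here purely as an illustrative assumption, and `summarize` is a hypothetical helper.

```python
import statistics


def summarize(values):
    """Center measures plus the range as a simple (assumed) spread measure."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "midrange": (max(values) + min(values)) / 2,
        "range": max(values) - min(values),
    }


first = summarize([199, 200, 201])
second = summarize([0, 200, 400])
print(first)   # mean, median and midrange are all 200; range is 2
print(second)  # mean, median and midrange are all 200; range is 400
```

Every measure of center agrees for the two data sets, and only the spread tells them apart.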
END WEEK 7
