Numerical Summary Measures

The document discusses numerical summary measures, focusing on central tendency and dispersion. It defines measures such as mean, median, mode, quartiles, and percentiles, explaining their calculations and properties. Additionally, it highlights the importance of choosing appropriate measures based on data types and variability.

Uploaded by

feredenatnael

Numerical Summary Measures

Mekdes W.(MPH)
Numerical summary measures
• A single number that quantifies a characteristic of a distribution of values.

Measures of central tendency (location)

Measures of dispersion (variability)

A. Measures of central tendency (location)

• A measure of central tendency (MCT) is a univariate statistic that indicates, in one manner or another,
– the average or typical observed value of a variable in a data set, or
– put otherwise, the center of the frequency distribution of the data.
• MCTs summarize, in a single number, the point around which the data tend to cluster.
• The term “number crunching” is used to illustrate this aspect of data description.
• The MCTs described here are the mean, the median and the mode.
1. Mean

• The sum of the observations divided by the number of observations.
• The mean is defined if and only if the variable is at least interval in nature (i.e., interval or ratio).
Reading assignment
• Read on the different types of mean:
– arithmetic mean
– weighted mean
– geometric mean (GM)
– harmonic mean (HM)
a) Ungrouped data
• If x_1, x_2, ..., x_n are n observed values, then
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
b) Grouped data
• It is calculated as follows:

$$\bar{x} = \frac{\sum_{i=1}^{k} m_i f_i}{\sum_{i=1}^{k} f_i}$$

• where,
k = the number of class intervals
m_i = the mid-point of the ith class interval
f_i = the frequency of the ith class interval

Example. Compute the mean age of 169 subjects from the grouped data.

Class interval   Mid-point (mi)   Frequency (fi)   mi × fi
[10-19]          14.5             4                58.0
[20-29]          24.5             66               1617.0
[30-39]          34.5             47               1621.5
[40-49]          44.5             36               1602.0
[50-59]          54.5             12               654.0
[60-69]          64.5             4                258.0
Total                             169              5810.5

Mean = 5810.5/169 = 34.38 years
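The grouped-mean calculation can be checked with a few lines of Python. This is only a minimal sketch: the function name `grouped_mean` and the variable names are illustrative, and the mid-points and frequencies are copied from the age table above.

```python
# Mid-points (m_i) and frequencies (f_i) copied from the age table.
midpoints = [14.5, 24.5, 34.5, 44.5, 54.5, 64.5]
frequencies = [4, 66, 47, 36, 12, 4]


def grouped_mean(midpoints, frequencies):
    """Return sum(m_i * f_i) / sum(f_i) for grouped data."""
    weighted_total = sum(m * f for m, f in zip(midpoints, frequencies))
    n = sum(frequencies)
    return weighted_total / n


mean_age = grouped_mean(midpoints, frequencies)
print(sum(frequencies))      # 169 subjects
print(round(mean_age, 2))    # 5810.5 / 169, rounded to two decimals
```

Because each observation is replaced by its class mid-point, the result is an approximation of the mean of the raw ages.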

Properties of the arithmetic mean
• For a given set of data there is one and only one arithmetic mean (uniqueness).
• It is easy to calculate and understand (simple).
• It is a poor measure of central location if the underlying distribution is not normal (Gaussian).
• It is influenced by each and every value in the data set and hence is affected by extreme values (outliers).
• In grouped data, if any class interval is open-ended, the arithmetic mean cannot be calculated.
Median
• With the observations arranged in increasing or decreasing order, the median is defined as the middle observation.

a) Ungrouped data
• If the number of observations n is odd, the median is the [(n+1)/2]th observation.
• If n is even, the median is the average of the two middle values, the (n/2)th and [(n/2)+1]th, i.e.
$$\tilde{x} = \frac{x_{(n/2)} + x_{(n/2+1)}}{2}$$
Example: Find the median for the following data.
• Raw data: 20 20 19 22 24 27 27 27 34 21 20
• Ordered data: 19 20 20 20 21 22 24 27 27 27 34
• n = 11 is odd, so the median is the (11+1)/2 = 6th ordered value, which is 22.
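The odd/even rule for ungrouped data can be sketched in Python as follows. The function name `median_ungrouped` is illustrative and not part of the slides; the data are the eleven values from the example.

```python
def median_ungrouped(values):
    """Median of raw data: middle value, or mean of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the [(n+1)/2]th value (1-based) sits at 0-based index n // 2.
        return ordered[mid]
    # Even n: average the (n/2)th and [(n/2)+1]th values.
    return (ordered[mid - 1] + ordered[mid]) / 2


ages = [20, 20, 19, 22, 24, 27, 27, 27, 34, 21, 20]
print(sorted(ages))
print(median_ungrouped(ages))   # 6th ordered value
```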
b) Grouped data
• We assume that the values within a class interval are evenly distributed through the interval.
– The first step is to locate the class interval in which the median lies.
– Compute n/2 and find the first class interval whose cumulative frequency is at least n/2.
Median for Grouped data…..
To find a unique median value, use the following formula:

$$\tilde{x} = L_m + \left(\frac{\frac{n}{2} - F_c}{f_m}\right) W$$

• where,
• L_m = lower true class boundary of the interval containing the median
• F_c = cumulative frequency of the interval just before the median class interval (the row above it in the table)
• f_m = frequency of the interval containing the median
• W = class interval width
• n = total number of observations
Example. Compute the median age of 169 subjects from the grouped data.

Class interval   Mid-point (mi)   Frequency (fi)   Cum. freq
[10-19]          14.5             4                4
[20-29]          24.5             66               70
[30-39]          34.5             47               117
[40-49]          44.5             36               153
[50-59]          54.5             12               165
[60-69]          64.5             4                169
Total                             169

• n/2 = 169/2 = 84.5, which falls in the 3rd class interval [30-39]
• Lower true class boundary = 29.5, upper true class boundary = 39.5, so W = 10
• Frequency of the median class (fm) = 47
• Fc = 70
• (n/2 – Fc) = 84.5 – 70 = 14.5
• Median = 29.5 + (14.5/47) × 10 = 32.59 ≈ 33 years
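The grouped-median steps (locate the median class, then interpolate within it) can be sketched in Python. The function name `grouped_median` is illustrative. The list of lower true class boundaries is an assumption extended from the single boundary given in the example (29.5 for the 30-39 class), with a common width of 10.

```python
def grouped_median(lower_boundaries, frequencies, width):
    """Median of grouped data by linear interpolation within the median class."""
    n = sum(frequencies)
    half = n / 2
    cumulative = 0  # F_c: cumulative frequency before the current class
    for lower, f in zip(lower_boundaries, frequencies):
        if f > 0 and cumulative + f >= half:
            # First class whose cumulative frequency reaches n/2.
            return lower + (half - cumulative) / f * width
        cumulative += f
    raise ValueError("frequencies must contain at least one positive count")


# Assumed lower true class boundaries for the classes 10-19, ..., 60-69.
lower_boundaries = [9.5, 19.5, 29.5, 39.5, 49.5, 59.5]
frequencies = [4, 66, 47, 36, 12, 4]

median_age = grouped_median(lower_boundaries, frequencies, width=10)
print(round(median_age, 2))   # 29.5 + (14.5 / 47) * 10
```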
Properties of median
• There is only one median for a given set of data (uniqueness).
• The median is easy to calculate.
• The median is a positional average and hence is not sensitive to very large or very small values.
• The median is a better measure of central tendency than the mean when the distribution is skewed (not normal).
• It can be calculated even in the case of open-ended intervals.

Quartiles
• If the data are divided into four equal parts, we speak of quartiles.
• The median divides the data into two equal parts.
a) The first quartile (Q1): 25% of all the ranked observations are less than Q1. [25th percentile]
b) The second quartile (Q2): 50% of all the ranked observations are less than Q2. [50th percentile] The second quartile is the median.
c) The third quartile (Q3): 75% of all the ranked observations are less than Q3. [75th percentile]
Percentiles
• Percentiles divide the ranked data into 100 equal parts.
• Commonly used percentiles:
→ 10, 20, ..., 90% (deciles)
→ 20, 40, 60, 80% (quintiles)
→ 25, 50, 75% (quartiles)
→ 33.3, 66.7% (tertiles)
– P0: The minimum.
– P25: 25% of the sample values are less than or equal to this value. P25 is the 1st quartile (25th percentile), given by the 0.25(n+1)th ordered observation.
– P50: 50% of the sample values are less than or equal to this value. P50 is the 2nd quartile (50th percentile), given by the 0.5(n+1)th ordered observation.
– P75: 75% of the sample values are less than or equal to this value. P75 is the 3rd quartile (75th percentile), given by the 0.75(n+1)th ordered observation.
– P100: The maximum.
Class exercise
1. The following data set is birth weight in grams. Find the 10th and 90th percentiles.
2069, 2581, 2759, 2834, 2838, 2841, 3031, 3101, 3200, 3245, 3248, 3260, 3265, 3314, 3323, 3484, 3541, 3609, 3649, 4146
Solution
• 10th percentile = 0.1(20+1) = 2.1th value, approximated here by the average of the 2nd and 3rd values = (2581+2759)/2 = 2670 g
• 90th percentile = 0.9(20+1) = 18.9th value, approximated here by the average of the 18th and 19th values = (3609+3649)/2 = 3629 g
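The percentile rule used in the exercise can be sketched in Python. The sketch follows the slide's convention: the position is p(n+1), and a fractional position is approximated by averaging the two neighbouring ordered values. Statistical packages usually interpolate linearly instead, so their results can differ. The function name `percentile_by_averaging` is illustrative.

```python
def percentile_by_averaging(values, percent):
    """Percentile at position percent/100 * (n + 1), averaging neighbours
    when the position is fractional. `percent` is a whole number 0..100."""
    ordered = sorted(values)
    n = len(ordered)
    scaled = percent * (n + 1)     # 100 times the 1-based position
    lower = scaled // 100          # whole part of the position
    if lower < 1:
        return ordered[0]          # at or below the first value: minimum
    if lower >= n:
        return ordered[-1]         # at or beyond the last value: maximum
    if scaled % 100 == 0:
        return ordered[lower - 1]  # position is a whole number
    return (ordered[lower - 1] + ordered[lower]) / 2


birth_weights = [2069, 2581, 2759, 2834, 2838, 2841, 3031, 3101, 3200, 3245,
                 3248, 3260, 3265, 3314, 3323, 3484, 3541, 3609, 3649, 4146]

print(percentile_by_averaging(birth_weights, 10))   # average of 2nd and 3rd values
print(percentile_by_averaging(birth_weights, 90))   # average of 18th and 19th values
```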
Mode
• It is the value that occurs most often.
• Most distributions have one peak and are described as unimodal.
• Some distributions have more than one mode:
– Unimodal: a distribution with one mode.
– Bimodal: a distribution with two modes.
– Trimodal: a distribution with three modes.

Mode (grouped data)
• The mode of grouped data usually refers to the modal class, the class interval with the highest frequency.
• If a single value for the mode of grouped data must be specified, it is taken as the mid-point of the modal class interval.
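Both cases can be sketched in Python. For raw data, `statistics.multimode` (Python 3.8+) returns every most-frequent value, which shows whether a data set is unimodal or bimodal. For grouped data the sketch returns the mid-point of the class with the highest frequency. The data are reused from the median and grouped-age examples; the variable names are illustrative.

```python
from statistics import multimode

# Raw data from the median example: two values tie for the highest count.
ages = [20, 20, 19, 22, 24, 27, 27, 27, 34, 21, 20]
modes = multimode(ages)
print(modes, "bimodal" if len(modes) == 2 else f"{len(modes)} mode(s)")

# Grouped data: the modal class is the one with the highest frequency.
midpoints = [14.5, 24.5, 34.5, 44.5, 54.5, 64.5]
frequencies = [4, 66, 47, 36, 12, 4]
modal_index = frequencies.index(max(frequencies))
modal_midpoint = midpoints[modal_index]
print(modal_midpoint)   # mid-point of the 20-29 class
```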
Properties of mode
• It is not affected by extreme values.
• Often its value is not unique (more than one mode is possible).
• The main drawback of the mode is that it often does not exist (for example, when every value occurs only once), so it is not always a good summary of the majority of the data.
• Given a continuous frequency curve:
– the mode is the value of the variable under the highest point of the frequency curve (the point with the greatest density of observed values).
Considerations for choosing a measure of central tendency
• For a nominal variable, the mode is the only measure that can be used.
• For ordinal variables, the mode and the median may be used. The median provides more information.
• For interval-ratio variables, the mode, median, and mean may all be calculated. The mean provides the most information about the distribution, but the median is preferred if the distribution is skewed.
B. Measures of dispersion

Consider the following two sets of data:

A: 177, 193, 195, 209, 226   Mean = 200
B: 192, 197, 200, 202, 209   Mean = 200

• Two or more data sets may have the same mean and/or median and still be quite different in how their values are spread.
• MCTs alone do not describe the variability or spread of the values.
Measures of dispersion
• Measures that quantify the variation or dispersion of a set of data around its central location.
• Dispersion refers to the variety exhibited by the values of the data.
• The amount of dispersion is small when the values are close together.
• If all the values are the same, there is no dispersion.
1. Range (R)
• The difference between the largest and smallest observations in a
data set.

• Range = Maximum value – Minimum value

• Example –

– Data values: 5, 9, 12, 16, 23, 34, 37, 42

– Range = 42-5 = 37
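As a quick check, the range can be computed in Python for the example above and for data sets A and B introduced earlier, which share a mean of 200 but differ in spread. The helper name `value_range` is illustrative.

```python
def value_range(values):
    """Return the largest observation minus the smallest."""
    return max(values) - min(values)


example = [5, 9, 12, 16, 23, 34, 37, 42]
set_a = [177, 193, 195, 209, 226]
set_b = [192, 197, 200, 202, 209]

print(value_range(example))                      # 42 - 5
print(value_range(set_a), value_range(set_b))    # A is far more spread out than B
```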
Properties of range
• It is the simplest, crudest measure of dispersion and is easily understood.
• It takes into account only two values, which makes it a poor measure of dispersion.
• It is very sensitive to extreme observations.

2. Inter-quartile range (IQR)
• Indicates the spread of the middle 50% of the observations, and is used together with the median.

IQR = Q3 – Q1

Example: Suppose the first and third quartiles for the weights of girls 12 months of age are 8.8 kg and 10.2 kg, respectively.

IQR = 10.2 kg – 8.8 kg = 1.4 kg

i.e., 50% of the infant girls weigh between 8.8 and 10.2 kg.
Example 2
• Given the following data set (age of patients):

18, 59, 24, 42, 21, 23, 24, 32

• Find the inter-quartile range.
• Solution: the ordered data are 18 21 23 24 24 32 42 59 (n = 8)
• 1st quartile = {(n+1)/4}th = (2.25)th value, approximated by the average of the 2nd and 3rd values = (21 + 23)/2 = 22
• 3rd quartile = {3(n+1)/4}th = (6.75)th value, approximated by the average of the 6th and 7th values = (32 + 42)/2 = 37
• Hence, IQR = 37 – 22 = 15
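The IQR calculation for the patient ages can be sketched in Python using the same convention as the worked solution: the quartile position is k(n+1)/4, and a fractional position is approximated by averaging the two neighbouring ordered values. The function names are illustrative, and other quartile conventions (such as linear interpolation) give slightly different results.

```python
def quartile_by_averaging(values, k):
    """k-th quartile (k = 1, 2, 3) at position k(n+1)/4, averaging the two
    neighbouring ordered values when the position is fractional."""
    ordered = sorted(values)
    n = len(ordered)
    scaled = k * (n + 1)           # 4 times the 1-based position
    lower = scaled // 4            # whole part of the position
    if lower < 1:
        return ordered[0]
    if lower >= n:
        return ordered[-1]
    if scaled % 4 == 0:
        return ordered[lower - 1]  # position is a whole number
    return (ordered[lower - 1] + ordered[lower]) / 2


def interquartile_range(values):
    """Return Q3 - Q1."""
    return quartile_by_averaging(values, 3) - quartile_by_averaging(values, 1)


patient_ages = [18, 59, 24, 42, 21, 23, 24, 32]
print(quartile_by_averaging(patient_ages, 1))   # Q1
print(quartile_by_averaging(patient_ages, 3))   # Q3
print(interquartile_range(patient_ages))        # Q3 - Q1
```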
Properties of IQR:
• It encloses the central 50% of the observations.
• It is not based on all observations but only on two specific values.
• It is important in selecting cut-off points in the formulation of clinical standards.
• Since it excludes the lowest and highest 25% of values, it is not affected by extreme values.
• It is less sensitive to the size of the sample.

3. Variance and standard deviation

$$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$$
Example. Compute the variance and SD of the age of 169 subjects from the grouped data.

Mean = 5810.5/169 = 34.38 years

Class interval   mi     fi    (mi – Mean)   (mi – Mean)²   (mi – Mean)² × fi
10-19            14.5   4     -19.88        395.2144       1580.8576
20-29            24.5   66    -9.88         97.6144        6442.5504
30-39            34.5   47    0.12          0.0144         0.6768
40-49            44.5   36    10.12         102.4144       3686.9184
50-59            54.5   12    20.12         404.8144       4857.7728
60-69            64.5   4     30.12         907.2144       3628.8576
Total                   169                 1907.2864      20197.6336

S² = 20197.63/(169 – 1) = 120.22
SD = √S² = √120.22 = 10.96 years
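The grouped variance and SD can be recomputed in Python as a check. This is a minimal sketch with illustrative names: it uses the class mid-points in place of the raw ages, the n – 1 divisor, and the unrounded mean computed from the same table, so the sum of squares may differ slightly from a hand calculation based on a rounded mean.

```python
import math

midpoints = [14.5, 24.5, 34.5, 44.5, 54.5, 64.5]
frequencies = [4, 66, 47, 36, 12, 4]

n = sum(frequencies)
mean = sum(m * f for m, f in zip(midpoints, frequencies)) / n

# Sum of squared deviations of the mid-points, weighted by frequency.
sum_sq = sum(f * (m - mean) ** 2 for m, f in zip(midpoints, frequencies))

variance = sum_sq / (n - 1)   # sample variance S^2
sd = math.sqrt(variance)      # standard deviation, in years

print(round(mean, 2), round(sum_sq, 2))
print(round(variance, 2), round(sd, 2))
```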
Properties of SD
• Has the advantage of being expressed in the same units of measurement as the mean.
• It is the best measure of dispersion and is widely used because of the properties of the theoretical normal curve.
• However, if the units of measurement of the variables in two data sets are not the same, their variability cannot be compared by comparing the values of the SD.
Coefficient of variation (CV)
• When two data sets have different units of measurement, the CV should be used as the measure of dispersion.
• It is the best measure for comparing the variability of two series or sets of observations.
• Data with a smaller coefficient of variation are considered more consistent.
• CV is the ratio of the SD to the mean, multiplied by 100:

$$CV = \frac{S}{\bar{x}} \times 100$$

              SD          Mean         CV (%)
SBP           15 mmHg     130 mmHg     11.5
Cholesterol   40 mg/dl    200 mg/dl    20.0

“Cholesterol is more variable than systolic blood pressure”
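The CV comparison can be reproduced in Python. The function name `coefficient_of_variation` is illustrative; the SD and mean values are those given for SBP and cholesterol.

```python
def coefficient_of_variation(sd, mean):
    """Return the SD as a percentage of the mean (unit-free)."""
    return 100 * sd / mean


cv_sbp = coefficient_of_variation(sd=15, mean=130)
cv_cholesterol = coefficient_of_variation(sd=40, mean=200)

print(round(cv_sbp, 1), round(cv_cholesterol, 1))
print("cholesterol more variable:", cv_cholesterol > cv_sbp)
```

Because the units cancel in the ratio, the two percentages can be compared directly even though the variables are measured on different scales.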
Skewed distributions
• Skewness: If extremely low or extremely high observations are present in a distribution, then the mean tends to shift towards those scores.
• Based on the type of skewness, distributions can be:
A. Positively skewed distribution: occurs when the majority of scores are at the left end of the curve and a few extremely large scores are scattered at the right end.
B. Negatively skewed distribution: occurs when the majority of scores are at the right end of the curve and a few small scores are scattered at the left end.
C. Symmetrical distribution: it is neither positively nor negatively skewed. A curve is symmetrical if one half of the curve is the mirror image of the other half.
Mean, Median & Mode: which measures to use?
• When the distribution is symmetric, summarize the data using the mean and standard deviation.
• When the data are skewed, it is preferable to use the median and IQR as summary statistics.
• Unlike the mean and standard deviation, the median and IQR are not easily influenced by extreme values in a skewed distribution.
• Remark:
– The mean and median of a symmetric distribution coincide.
– When a distribution is skewed to the right, its mean is larger than its median.
– When a distribution is skewed to the left, its mean is smaller than its median.
Fig. 2(a). Symmetric distribution: Mean = Median = Mode
Fig. 2(b). Distribution skewed to the right: Mean > Median > Mode
Fig. 2(c). Distribution skewed to the left: Mean < Median < Mode
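The mean-versus-median remark can be turned into a rough check in Python. This is only a rule-of-thumb sketch, not a formal skewness statistic. The function name `skew_direction` is illustrative, and the data are reused from the IQR example (patient ages) and from data set B.

```python
from statistics import mean, median


def skew_direction(values):
    """Compare mean and median: 'right' if mean > median,
    'left' if mean < median, otherwise 'symmetric'."""
    m, md = mean(values), median(values)
    if m > md:
        return "right"
    if m < md:
        return "left"
    return "symmetric"


patient_ages = [18, 59, 24, 42, 21, 23, 24, 32]   # a few large values pull the mean up
set_b = [192, 197, 200, 202, 209]                 # mean and median coincide

print(mean(patient_ages), median(patient_ages), skew_direction(patient_ages))
print(mean(set_b), median(set_b), skew_direction(set_b))
```

For the patient ages the mean exceeds the median, which is why the median and IQR would be the preferred summary for that data set.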

Any question?
