
02.1 Numerical Summary Measures


Numerical Summary Measures

Glyzel Grace M. Francisco


STAT2205 – Introduction to Biostatistics
2nd Semester, 2020-2021

CENTRAL LUZON STATE UNIVERSITY


DEPARTMENT of STATISTICS
Numerical Summary Measures

1. Measures of Location

2. Measures of Central Tendency

3. Measures of Variability

STAT2205 – INTRODUCTION TO BIOSTATISTICS | 2


Measures of Location
• used when it is of interest to describe parts of the distribution of the data other than its center
• numerical measures that give the position of a data value relative to the entire data set

Quantiles or Fractiles
Values that partition the data into groups with roughly the same number of values
• Percentiles
• Deciles
• Quartiles
STAT2205 – INTRODUCTION TO BIOSTATISTICS | 3
Percentiles

• the 99 intermediate values that divide an array of data into 100 equal parts

• The jth percentile, denoted by $P_j$, is the data value that separates the bottom j% of the data from the top (100 − j)%

STAT2205 – INTRODUCTION TO BIOSTATISTICS | 4


Percentiles

• $P_1$, read as the 1st percentile, is the value below which 1% of the values fall.
• $P_2$, read as the 2nd percentile, is the value below which 2% of the values fall.
  ...
• $P_{99}$, read as the 99th percentile, is the value below which 99% of the values fall.

STAT2205 – INTRODUCTION TO BIOSTATISTICS | 5


Percentiles
Finding the value/observation in the data set that corresponds to a given quantile/fractile

Finding the jth Percentile

1. Arrange the data in an array (ascending order).
2. Find the location of $P_j$ in the arranged list by computing
   $L = \frac{j}{100} \times N$
   where: N = the total number of data values
          j = the percentile of interest
STAT2205 – INTRODUCTION TO BIOSTATISTICS | 6
Percentiles

3. (a) If L is a whole number, then $P_j$ is the mean of the data values in positions L and L + 1.
   (b) If L is not a whole number, $P_j$ is the data value in the next higher whole-number position.

Remark: Percentiles are generally computed for large data sets.

STAT2205 – INTRODUCTION TO BIOSTATISTICS | 7


Quartiles

values that divide an array into four equal parts, each part having 25% of the data values, denoted by $Q_j$

$Q_1 = P_{25}$
$Q_2 = P_{50} = D_5 = \text{Median}$
$Q_3 = P_{75}$

STAT2205 – INTRODUCTION TO BIOSTATISTICS | 8


Quartiles
$Q_1$ (First quartile): Separates the bottom 25% of the sorted values from the top 75%. (To be more precise, at least 25% of the sorted values are less than or equal to $Q_1$, and at least 75% of the values are greater than or equal to $Q_1$.)
$Q_2$ (Second quartile): Same as the median; separates the bottom 50% of the sorted values from the top 50%.
$Q_3$ (Third quartile): Separates the bottom 75% of the sorted values from the top 25%. (To be more precise, at least 75% of the sorted values are less than or equal to $Q_3$, and at least 25% of the values are greater than or equal to $Q_3$.)
STAT2205 – INTRODUCTION TO BIOSTATISTICS | 9
Deciles

values that divide an array into ten equal parts, each part having ten percent of the data values, denoted by $D_j$

$D_1 = P_{10}$    $D_6 = P_{60}$
$D_2 = P_{20}$    $D_7 = P_{70}$
$D_3 = P_{30}$    $D_8 = P_{80}$
$D_4 = P_{40}$    $D_9 = P_{90}$
$D_5 = P_{50}$

STAT2205 – INTRODUCTION TO BIOSTATISTICS | 10


Measures of Location

[Figure: 2nd Decile]
STAT2205 – INTRODUCTION TO BIOSTATISTICS | 11
Measures of Location
1 5 5 6 8 10 12 13 14 15
15 17 18 18 18 18 18 20 24 25
26 26 26 26 27 28 31 31 34 37
37 37 39 40 40 41 41 42 43 44
44 46 46 47 49 50
The table lists the 46 sorted ages of dogs, in months.
Find the median.
$Q_2 = P_{50}$
$L = \frac{50}{100} \times 46 = 23$
L is a whole number, so $P_{50}$ is the mean of the values in positions L = 23 and L + 1 = 24 (refer to slide 7):
$Q_2 = P_{50} = \frac{23\text{rd} + 24\text{th data values}}{2} = \frac{26 + 26}{2} = \mathbf{26}$
Hence, the 50th percentile, or median, age of the dogs is 26 months.
STAT2205 – INTRODUCTION TO BIOSTATISTICS | 12
Measures of Location
1 5 5 6 8 10 12 13 14 15
15 17 18 18 18 18 18 20 24 25
26 26 26 26 27 28 31 31 34 37
37 37 39 40 40 41 41 42 43 44
44 46 46 47 49 50
The table lists the 46 sorted ages of dogs, in months.
Find the 40th percentile.
$D_4 = P_{40}$
$L = \frac{40}{100} \times 46 = 18.4$
L is not a whole number, so take the next higher position, the 19th (refer to slide 7):
$D_4 = P_{40} = 19\text{th data value} = \mathbf{24}$
Hence, the 40th percentile of the dogs' ages is 24 months.
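
The location rule (steps 1 to 3) and the two worked examples above can be sketched in Python. This is a minimal illustration, not part of the original slides: the function name `percentile_value` and the list name `dog_ages` are assumptions, and the sketch assumes j is a whole number from 1 to 99.

```python
def percentile_value(data, j):
    """Return P_j using the location rule L = (j / 100) * N from the slides."""
    if not 1 <= j <= 99:
        raise ValueError("j must be a whole number from 1 to 99")
    ordered = sorted(data)               # step 1: ascending array
    n = len(ordered)
    whole, remainder = divmod(j * n, 100)  # step 2: L = whole + remainder / 100
    if remainder == 0:
        # step 3(a): L is a whole number, so average positions L and L + 1
        # (positions are 1-based; Python indexes are 0-based)
        return (ordered[whole - 1] + ordered[whole]) / 2
    # step 3(b): L is not whole, so take the next higher position
    return ordered[whole]


dog_ages = [
    1, 5, 5, 6, 8, 10, 12, 13, 14, 15,
    15, 17, 18, 18, 18, 18, 18, 20, 24, 25,
    26, 26, 26, 26, 27, 28, 31, 31, 34, 37,
    37, 37, 39, 40, 40, 41, 41, 42, 43, 44,
    44, 46, 46, 47, 49, 50,
]

print(percentile_value(dog_ages, 50))  # median: 26.0
print(percentile_value(dog_ages, 40))  # 40th percentile: 24
```

Statistical software often uses interpolation-based percentile definitions, so library results may differ slightly from this rule on small data sets.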
STAT2205 – INTRODUCTION TO BIOSTATISTICS | 13
Measures of Location
Finding the quantile/fractile that corresponds to a particular value x

$\text{percentile of } x = \frac{\text{number of values less than } x}{\text{total number of values}} \times 100$

(round the result to the nearest whole number)

STAT2205 – INTRODUCTION TO BIOSTATISTICS | 14


Measures of Location
1 5 5 6 8 10 12 13 14 15
15 17 18 18 18 18 18 20 24 25
26 26 26 26 27 28 31 31 34 37
37 37 39 40 40 41 41 42 43 44
44 46 46 47 49 50

Find the percentile corresponding to 31.

$\text{percentile of } x = \frac{\text{number of values less than } x}{\text{total number of values}} \times 100$
$\text{percentile of } 31 = \frac{26}{46} \times 100 = 56.52 \approx 57$
31 months is the 57th percentile.
STAT2205 – INTRODUCTION TO BIOSTATISTICS | 15
Measures of Location
1 5 5 6 8 10 12 13 14 15
15 17 18 18 18 18 18 20 24 25
26 26 26 26 27 28 31 31 34 37
37 37 39 40 40 41 41 42 43 44
44 46 46 47 49 50

Find the percentile corresponding to 40.

$\text{percentile of } x = \frac{\text{number of values less than } x}{\text{total number of values}} \times 100$
$\text{percentile of } 40 = \frac{33}{46} \times 100 = 71.74 \approx 72$
40 months is the 72nd percentile.
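
The percentile-of-a-value formula used in the two examples above can be checked with a short Python sketch. The names `percentile_of` and `dog_ages` are illustrative assumptions, and rounding half up is one reasonable reading of "round to the nearest whole number".

```python
import math


def percentile_of(data, x):
    """Percentile of x = (number of values less than x) / (total) * 100, rounded."""
    below = sum(1 for value in data if value < x)  # strictly less than x
    percent = below / len(data) * 100
    return math.floor(percent + 0.5)               # round half up to a whole number


dog_ages = [
    1, 5, 5, 6, 8, 10, 12, 13, 14, 15,
    15, 17, 18, 18, 18, 18, 18, 20, 24, 25,
    26, 26, 26, 26, 27, 28, 31, 31, 34, 37,
    37, 37, 39, 40, 40, 41, 41, 42, 43, 44,
    44, 46, 46, 47, 49, 50,
]

print(percentile_of(dog_ages, 31))  # 26 of 46 values are below 31 -> 57
print(percentile_of(dog_ages, 40))  # 33 of 46 values are below 40 -> 72
```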
STAT2205 – INTRODUCTION TO BIOSTATISTICS | 16
Self-test
A sample of 30 women was randomly selected, and their weights in kilograms were measured an hour after their workout. The data are as follows:

75, 63, 57, 66, 47, 47, 54, 47, 55, 60,
78, 65, 64.3, 55, 57, 47, 60, 63, 58, 54,
58, 57, 70, 56.3, 45.2, 46, 51, 52, 57, 56

Find the following:

a. $P_1$
b. $Q_3$
c. $D_9$
d. $P_{56}$
e. The percentile corresponding to 55

GGMFRANCISCO STAT2205 – INTRODUCTION TO BIOSTATISTICS | 17


Measures of Central Tendency

- Value at the center or middle of a data set
- Value where the data tend to cluster
- Also called an average

• Mean
• Median
• Mode

STAT2205 – INTRODUCTION TO BIOSTATISTICS | 18


Notations

            Parameter            Statistic
            (Population Data)    (Sample Data)

Mean        $\mu$                $\bar{x}$
Median      Md                   $\tilde{x}$
Mode        Mo                   $\hat{x}$

STAT2205 – INTRODUCTION TO BIOSTATISTICS | 19


MEAN
The arithmetic average obtained by adding up all the data values and dividing by the total number of observations.

Population mean
$\mu = \frac{\sum x_i}{N}$

Sample mean
$\bar{x} = \frac{\sum X_i}{n}$

Weighted mean
$\bar{x} = \frac{\sum W_i X_i}{\sum W_i}$

where:
$x_i$ = value of the ith observation
N = no. of obs. in the population
n = no. of obs. in the sample
$W_i$ = weight of the ith observation
STAT2205 – INTRODUCTION TO BIOSTATISTICS | 20
EXAMPLES

Find the sample mean of each of the following data sets.

Set A: 2, 9, 8, 12, 1, 16, 3, 5, 3, 7 (n = 10)
Set B: 13.2, 11.0, 3.8, 10.1, 15.0, 18.3, 20.1 (n = 7)

$\bar{X}_A = \frac{\sum X_i}{n} = \frac{X_1 + X_2 + \dots + X_{10}}{n} = \frac{2 + 9 + 8 + 12 + 1 + 16 + 3 + 5 + 3 + 7}{10} = \frac{66}{10} = \mathbf{6.6}$

$\bar{X}_B = \frac{\sum X_i}{n} = \frac{X_1 + X_2 + \dots + X_7}{n} = \frac{13.2 + 11.0 + 3.8 + 10.1 + 15.0 + 18.3 + 20.1}{7} = \frac{91.5}{7} = \mathbf{13.0714}$
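
The sample mean and weighted mean formulas can be sketched in Python as follows. The function names are illustrative, and the scores and weights in the weighted-mean call are hypothetical values that do not come from the slides.

```python
def mean(values):
    """Sample mean: sum of the values divided by the number of observations."""
    return sum(values) / len(values)


def weighted_mean(values, weights):
    """Weighted mean: sum(W_i * X_i) / sum(W_i)."""
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)


set_a = [2, 9, 8, 12, 1, 16, 3, 5, 3, 7]
set_b = [13.2, 11.0, 3.8, 10.1, 15.0, 18.3, 20.1]

print(mean(set_a))             # 6.6
print(round(mean(set_b), 4))   # 13.0714

# Hypothetical example: three scores weighted by course units 3, 2, and 1.
print(round(weighted_mean([80, 90, 70], [3, 2, 1]), 4))  # 490 / 6 = 81.6667
```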

GGMFRANCISCO STAT2205 – INTRODUCTION TO BIOSTATISTICS | 21


MEDIAN

The value that divides the distribution into two equal parts, so that half of the cases are above it and half are below it.

The median is the middle value, or the average of the two middle values, in a distribution.

STAT2205 – INTRODUCTION TO BIOSTATISTICS | 22


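MEDIAN

$Md = \begin{cases} X_{\frac{n+1}{2}}, & \text{if } n \text{ is odd} \\ \dfrac{X_{\frac{n}{2}} + X_{\frac{n+2}{2}}}{2}, & \text{if } n \text{ is even} \end{cases}$

where $X_k$ is the kth value in the sorted array.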

STAT2205 – INTRODUCTION TO BIOSTATISTICS | 23


EXAMPLES
Find the sample median of each of the following data sets.
Set A: 2, 9, 8, 12, 1, 16, 3, 5, 3, 7
Set B: 13.2, 11.0, 3.8, 10.1, 15.0, 18.3, 20.1
Step 1. Arrange the data into an array (lowest to highest or highest to lowest).
Step 2. Find the middle value, or the average of the two middle values.
Set A: 1, 2, 3, 3, 5, 7, 8, 9, 12, 16
Set B: 3.8, 10.1, 11.0, 13.2, 15.0, 18.3, 20.1
n is even (n = 10): $\tilde{x}_A = \frac{X_{\frac{10}{2}} + X_{\frac{10+2}{2}}}{2} = \frac{X_5 + X_6}{2} = \frac{5 + 7}{2} = \mathbf{6}$
n is odd (n = 7): $\tilde{x}_B = X_{\frac{7+1}{2}} = X_4 = \mathbf{13.2}$
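
The two-step median procedure can be sketched in Python. The function name `median` is illustrative; the sketch sorts the data and then applies the odd/even rule.

```python
def median(values):
    """Middle value of the sorted data, or the average of the two middle values."""
    ordered = sorted(values)      # step 1: arrange into an array
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]       # odd n: position (n + 1) / 2, 1-based
    # even n: average of positions n / 2 and (n + 2) / 2, 1-based
    return (ordered[mid - 1] + ordered[mid]) / 2


set_a = [2, 9, 8, 12, 1, 16, 3, 5, 3, 7]
set_b = [13.2, 11.0, 3.8, 10.1, 15.0, 18.3, 20.1]

print(median(set_a))  # (5 + 7) / 2 = 6.0
print(median(set_b))  # 4th sorted value = 13.2
```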
GGMFRANCISCO STAT2205 – INTRODUCTION TO BIOSTATISTICS | 24
MODE
The value (quantitative) or category (qualitative) with the largest frequency (or percentage) in the distribution.
Example: Find the sample mode of each of the following data sets.
Set A: 2, 9, 8, 12, 1, 16, 3, 5, 3, 7
Set B: 13.2, 11.0, 3.8, 10.1, 15.0, 18.3, 20.1
Step 1. Arrange the data into an array (lowest to highest or highest to lowest).
Step 2. Find the category/ies or value/s with the largest frequency.
Set A: 1, 2, 3, 3, 5, 7, 8, 9, 12, 16
Set B: 3.8, 10.1, 11.0, 13.2, 15.0, 18.3, 20.1

$\hat{x}_A = \mathbf{3}$
$\hat{x}_B$ = mode does not exist (no value occurs more than once)
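
A short Python sketch of the mode procedure follows. The function name `modes` is illustrative, and it adopts the convention used in the example above that a data set has no mode when no value repeats; other texts handle ties differently.

```python
from collections import Counter


def modes(values):
    """Return the value(s) with the largest frequency, or [] if nothing repeats."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        return []                 # every value occurs once: mode does not exist
    return sorted(v for v, c in counts.items() if c == top)


set_a = [2, 9, 8, 12, 1, 16, 3, 5, 3, 7]
set_b = [13.2, 11.0, 3.8, 10.1, 15.0, 18.3, 20.1]

print(modes(set_a))  # [3]
print(modes(set_b))  # []
```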

GGMFRANCISCO STAT2205 – INTRODUCTION TO BIOSTATISTICS | 25


COMPARISON

Average   Existence                Takes every observation   Affected by extreme   Advantage/Disadvantage
                                   into account?             values (outliers)?
Mean      Always exists            Yes                       Yes                   works well with many statistical methods
Median    Always exists            No                        No                    good choice if there are outliers
Mode      Might not exist;         No                        No                    appropriate for data at the nominal level
          may be more than one
STAT2205 – INTRODUCTION TO BIOSTATISTICS | 26
MEASURES OF VARIABILITY/DISPERSION

– indicate how the observations in a data set are scattered about an average

1. Range
2. Variance
3. Standard Deviation
4. Coefficient of Variation
5. Standard Error

STAT2205 – INTRODUCTION TO BIOSTATISTICS | 27


RANGE
– measures how far the highest value is from the lowest value
– a rough measure of dispersion
– it uses only the extreme values
– it fails to communicate any information about the clustering or the lack of clustering of the values between the extremes
– a weakness is that an outlier can greatly alter its value
R = max – min, or R = highest value – lowest value
Example: Find the range of each of the following data sets.
Set A: 2, 9, 8, 12, 1, 16, 3, 5, 3, 7            $R_A = 16 - 1 = \mathbf{15}$
Set B: 13.2, 11.0, 3.8, 10.1, 15.0, 18.3, 20.1   $R_B = 20.1 - 3.8 = \mathbf{16.3}$
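
As a quick check of the range formula, here is a minimal Python sketch. The function name `data_range` is illustrative; the result for Set B is rounded only to hide floating-point noise.

```python
def data_range(values):
    """Range: highest value minus lowest value."""
    return max(values) - min(values)


set_a = [2, 9, 8, 12, 1, 16, 3, 5, 3, 7]
set_b = [13.2, 11.0, 3.8, 10.1, 15.0, 18.3, 20.1]

print(data_range(set_a))            # 16 - 1 = 15
print(round(data_range(set_b), 1))  # 20.1 - 3.8 = 16.3
```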

GGMFRANCISCO STAT2205 – INTRODUCTION TO BIOSTATISTICS | 28


VARIANCE
– the average squared difference of the observations from the mean
– comes in the square of the unit of measure of the given set of values

Population
$\sigma^2 = \frac{\sum (X_i - \mu)^2}{N}$

Sample
$s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$
or
$s^2 = \frac{\sum X_i^2 - \frac{(\sum X_i)^2}{n}}{n - 1}$

Note: Some texts use the denominator n if n ≥ 30 and n − 1 if n < 30. The n − 1 form is the usual sample variance for any n and is the one used in the examples here.

Characteristics of the Variance
• Always non-negative
• A large variance corresponds to a highly dispersed set of values
• Easy to manipulate for further mathematical computations
• Makes use of all the observations in the data
• Comes in a unit that is the square of the unit of the data
STAT2205 – INTRODUCTION TO BIOSTATISTICS | 29
EXAMPLE
Find the sample variance of the data set below.
2, 9, 8, 12, 1, 16, 3, 5, 3, 7
n = 10, $\bar{X} = 6.6$ (see slide 21)

           $X_i$    $X_i^2$          $(X_i - \bar{X})^2$
$X_1$       2       $2^2 = 4$        $(2 - 6.6)^2 = 21.16$
$X_2$       9       $9^2 = 81$       $(9 - 6.6)^2 = 5.76$
$X_3$       8       $8^2 = 64$       $(8 - 6.6)^2 = 1.96$
$X_4$      12       $12^2 = 144$     $(12 - 6.6)^2 = 29.16$
$X_5$       1       $1^2 = 1$        $(1 - 6.6)^2 = 31.36$
$X_6$      16       $16^2 = 256$     $(16 - 6.6)^2 = 88.36$
$X_7$       3       $3^2 = 9$        $(3 - 6.6)^2 = 12.96$
$X_8$       5       $5^2 = 25$       $(5 - 6.6)^2 = 2.56$
$X_9$       3       $3^2 = 9$        $(3 - 6.6)^2 = 12.96$
$X_{10}$    7       $7^2 = 49$       $(7 - 6.6)^2 = 0.16$
Sum        $\sum X_i = 66$   $\sum X_i^2 = 642$   $\sum (X_i - \bar{X})^2 = 206.4$

$s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1} = \frac{206.4}{10 - 1} = \mathbf{22.9333}$

or

$s^2 = \frac{\sum X_i^2 - \frac{(\sum X_i)^2}{n}}{n - 1} = \frac{642 - \frac{(66)^2}{10}}{10 - 1} = \mathbf{22.9333}$
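
Both forms of the sample variance can be sketched in Python to confirm that they agree on this data set. The function names are illustrative, and both use the n − 1 denominator, as in the worked example.

```python
def sample_variance_definitional(values):
    """s^2 = sum((X_i - mean)^2) / (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def sample_variance_shortcut(values):
    """s^2 = (sum(X_i^2) - (sum(X_i))^2 / n) / (n - 1)."""
    n = len(values)
    total = sum(values)
    total_of_squares = sum(x * x for x in values)
    return (total_of_squares - total ** 2 / n) / (n - 1)


data = [2, 9, 8, 12, 1, 16, 3, 5, 3, 7]

print(round(sample_variance_definitional(data), 4))  # 206.4 / 9 = 22.9333
print(round(sample_variance_shortcut(data), 4))      # (642 - 435.6) / 9 = 22.9333
```

The shortcut form needs only the running totals of $X_i$ and $X_i^2$, which is convenient for hand computation, although it can lose precision in floating-point arithmetic when the values are large relative to their spread.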

GGMFRANCISCO STAT2205 – INTRODUCTION TO BIOSTATISTICS | 30


STANDARD DEVIATION
- the square root of the variance; roughly, the typical deviation of the individual values in the distribution from the mean of the distribution
- has the same units of measurement (such as minutes, grams, or dollars) as the original data values
- values that are close together have a small standard deviation, while values with much more variation have a larger standard deviation
- it is affected by the value of every observation, so it may be distorted by a few extreme values

Population: $\sigma = \sqrt{\sigma^2}$        Sample: $s = \sqrt{s^2}$

Example:
Find the sample standard deviation of the data set below.
2, 9, 8, 12, 1, 16, 3, 5, 3, 7
$s^2 = 22.9333$ (see slide 30)
$s = \sqrt{s^2} = \sqrt{22.9333} = \mathbf{4.7889}$
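
The standard deviation example can be reproduced with a short Python sketch. The function name `sample_std` is illustrative; the standard library's `statistics.stdev`, which also uses the n − 1 denominator, is included as a cross-check.

```python
import math
import statistics


def sample_std(values):
    """Sample standard deviation: square root of the sample variance (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    return math.sqrt(variance)


data = [2, 9, 8, 12, 1, 16, 3, 5, 3, 7]

print(round(sample_std(data), 4))        # sqrt(22.9333...) = 4.7889
print(round(statistics.stdev(data), 4))  # same value from the standard library
```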
GGMFRANCISCO STAT2205 – INTRODUCTION TO BIOSTATISTICS | 31
COEFFICIENT OF VARIATION
– a measure that indicates the magnitude of variation relative to the magnitude of the mean
– expressed in percent
– used to compare the dispersion of two or more data sets with the same or different units

Population: $CV = \frac{\sigma}{\mu} \times 100\%$        Sample: $CV = \frac{s}{\bar{X}} \times 100\%$

Example:
Find the sample coefficient of variation of the data set below.
2, 9, 8, 12, 1, 16, 3, 5, 3, 7
s = 4.7889 (see slide 31), $\bar{X} = 6.6$ (see slide 21)
$CV = \frac{s}{\bar{X}} \times 100\% = \frac{4.7889}{6.6} \times 100\% = \mathbf{72.56\%}$
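
A minimal Python sketch of the sample coefficient of variation follows, using the standard library's `statistics.stdev` and `statistics.mean`. The function name `coefficient_of_variation` is illustrative.

```python
import statistics


def coefficient_of_variation(values):
    """Sample CV in percent: (s / mean) * 100."""
    return statistics.stdev(values) / statistics.mean(values) * 100


data = [2, 9, 8, 12, 1, 16, 3, 5, 3, 7]

print(round(coefficient_of_variation(data), 2))  # 4.7889 / 6.6 * 100 = 72.56
```

Because the CV is a unitless percentage, it can be compared across data sets measured in different units, provided each mean is positive and not close to zero.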

GGMFRANCISCO STAT2205 – INTRODUCTION TO BIOSTATISTICS | 32


STANDARD ERROR of the MEAN
– a measure of the statistical accuracy of an estimate
– an indication of the reliability of the mean
– a small SE is an indication that the sample mean is a more accurate reflection of the true mean
– the standard deviation of the sampling distribution of the mean

Population: $SE = \frac{\sigma}{\sqrt{N}}$        Sample: $SE = \frac{s}{\sqrt{n}}$

Example:
Find the sample standard error of the data set below.
2, 9, 8, 12, 1, 16, 3, 5, 3, 7
s = 4.7889 (see slide 31)
$SE = \frac{s}{\sqrt{n}} = \frac{4.7889}{\sqrt{10}} = \mathbf{1.5144}$
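
The sample standard error of the mean can be sketched in Python in the same way. The function name `standard_error` is illustrative.

```python
import math
import statistics


def standard_error(values):
    """Sample standard error of the mean: s / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


data = [2, 9, 8, 12, 1, 16, 3, 5, 3, 7]

print(round(standard_error(data), 4))  # 4.7889 / sqrt(10) = 1.5144
```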
GGMFRANCISCO STAT2205 – INTRODUCTION TO BIOSTATISTICS | 33
Supplement to the Use of Standard Deviation
1. Empirical Rule
For data having a bell-shaped distribution:
o Approximately 68% of the data values will be within 1 sd of the mean.
o Approximately 95% of the data values will be within 2 sd of the mean.
o Almost all (about 99.7%) of the data values will be within 3 sd of the mean.
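
The percentages in the empirical rule come from the normal distribution. A small Python sketch, assuming an exactly normal distribution and using `statistics.NormalDist` from the standard library (Python 3.8 or later), shows the underlying values; the function name `normal_coverage` is illustrative.

```python
from statistics import NormalDist


def normal_coverage(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(normal_coverage(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

Real data are only approximately bell-shaped, so observed proportions will usually differ somewhat from these values.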

GGMFRANCISCO STAT2205 – INTRODUCTION TO BIOSTATISTICS | 34


Supplement to the Use of Standard Deviation
2. Chebyshev's Theorem
o enables us to make statements about the proportion of data values that must be within a specified number of standard deviations of the mean
o can be applied to any data set, regardless of the shape of the distribution
o states that at least $(1 - 1/z^2)$ of the data values must be within z standard deviations of the mean, where z is any value greater than 1

Some implications of this theorem, with z = 2, 3, and 4 standard deviations of the mean:
▪ At least 75% of the data values must be within z = 2 standard deviations of the mean.
▪ At least 89% of the data values must be within z = 3 standard deviations of the mean.
▪ At least 94% of the data values must be within z = 4 standard deviations of the mean.
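
The three implications above follow directly from the bound $1 - 1/z^2$, as this minimal Python sketch shows. The function name `chebyshev_lower_bound` is illustrative.

```python
def chebyshev_lower_bound(z):
    """Minimum proportion of values within z standard deviations of the mean."""
    if z <= 1:
        raise ValueError("Chebyshev's bound is only informative for z > 1")
    return 1 - 1 / z ** 2


for z in (2, 3, 4):
    print(z, f"{chebyshev_lower_bound(z):.2%}")
# 2 75.00%
# 3 88.89%
# 4 93.75%
```

The slide's 89% and 94% are these bounds rounded to the nearest whole percent.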
GGMFRANCISCO STAT2205 – INTRODUCTION TO BIOSTATISTICS | 35
Self-test
A sample of 9 patients was randomly selected, and their weights in kilograms were measured an hour after taking a medication. The data are as follows:

63, 57, 66, 47, 47, 54, 47, 55, 60

Calculate the following:

a. Mean
b. Median
c. Mode
d. Range
e. Variance
f. Standard Deviation
g. Coefficient of Variation
h. Standard Error
GGMFRANCISCO STAT2205 – INTRODUCTION TO BIOSTATISTICS | 36
