Chapter 5

The document discusses measures of central tendency, including mean, median, and mode, explaining their calculations and properties. It also covers measures of position (quartiles, deciles, percentiles) and measures of variability (range, interquartile range, variance, standard deviation). Additionally, it addresses skewness and kurtosis, providing insights into data distribution and outlier detection.

Uploaded by Resha Gordon
A. Measures of Central Tendency

Measures of central tendency, also known as measures of center or central location, summarize a dataset by identifying a single value that represents the middle or central point of the data distribution. The three primary measures of central tendency are:

• Mean
• Median
• Mode

1. Mean (Arithmetic Mean) (x̅)

The mean is calculated by summing all the data points and dividing by the total number of observations. It is the most commonly used measure of central tendency.

Formulas:

• Sample mean: x̅ = Σx / n
• Population mean: μ = Σx / N

Example:

A hospital records the systolic blood pressure (SBP) readings of 5 patients:
120, 135, 140, 125, 130 mmHg

Mean SBP = (120 + 135 + 140 + 125 + 130) / 5
Mean SBP = 130 mmHg

Properties of the Mean (Triola, 2018):

✅ Uses all data values
✅ Less variation in repeated sampling than other measures
❌ Sensitive to outliers (a single extreme value can significantly change the mean)

Weighted Mean Formula:

x̅ = Σ(w · x) / Σw

Used when different data points have different levels of importance (weights).

2. Median

The median is the middle value of an ordered dataset. It divides the dataset into two equal halves.

Finding the Median:

• If n is odd, the median is the value at position (n + 1) / 2 in the ordered data.
• If n is even, the median is the mean of the values at positions n / 2 and (n / 2) + 1.

Example:

• Find the median of pulse rates (BPM): 84, 74, 50, 60, 52

1. Arrange in ascending order: 50, 52, 60, 74, 84
2. Middle value = 60 BPM → Median = 60

• If a 6th value (62 BPM) is added:

1. Ordered values: 50, 52, 60, 62, 74, 84
2. Median = (60 + 62) / 2 = 61 BPM

Properties of the Median:

✅ Resistant to outliers (extreme values do not significantly affect it)
❌ Does not consider every data value

3. Mode

The mode is the value(s) that appear most frequently in a dataset.

Types of Mode:

• Unimodal → One mode
• Bimodal → Two modes
• Trimodal → Three modes
• Multimodal → More than three modes
• No Mode → No repeating values

Example:

Find the mode of pulse rates (BPM):
58, 58, 58, 58, 60, 60, 62, 64

The mode = 58 BPM (most frequent value)
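The worked examples for the mean, median, and mode can be checked with a few lines of Python. This is only a minimal sketch using the standard `statistics` module; the `weighted_mean` helper and the scores and weights passed to it are hypothetical and are not taken from the chapter.

```python
from statistics import mean, median, multimode

# Systolic blood pressure readings (mmHg) from the mean example.
sbp = [120, 135, 140, 125, 130]
mean_sbp = mean(sbp)

# Pulse rates (BPM) from the median example, without and with the 6th value.
pulse_odd = [84, 74, 50, 60, 52]
pulse_even = pulse_odd + [62]
median_odd = median(pulse_odd)    # middle value of the sorted data
median_even = median(pulse_even)  # mean of the two middle values

# Pulse rates (BPM) from the mode example; multimode returns every mode found.
pulse_mode = [58, 58, 58, 58, 60, 60, 62, 64]
modes = multimode(pulse_mode)


def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Hypothetical illustration: three scores with weights 2, 3, and 5.
example_weighted = weighted_mean([80, 90, 70], [2, 3, 5])

print(mean_sbp, median_odd, median_even, modes, example_weighted)
```

`multimode` returns a list, so a bimodal or trimodal dataset yields two or three values rather than raising an error.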


Properties of the Mode:

✅ Can be used for qualitative (categorical) data
✅ Can handle multi-modal distributions
❌ May not exist or may have multiple values

4. Midrange

The midrange is the value midway between the minimum and maximum values.

Formula:

Midrange = (Min + Max) / 2

Example:

Find the midrange of pulse rates: 84, 74, 50, 60, 52 BPM

• Min = 50, Max = 84
• Midrange = (50 + 84) / 2 = 67 BPM

Properties of the Midrange:

✅ Easy to compute
❌ Highly sensitive to outliers

Comparison of Measures of Central Tendency

Measure | Key Interpretation | Sensitive to Outliers? | Best Used When… | Avoid When…
--- | --- | --- | --- | ---
Mean | Higher mean: greater overall tendency. Lower mean: lower typical value. | ✅ Yes | Data is normally distributed | Data is skewed or has extreme values
Median | Higher median: higher tendency. Lower median: lower trend. | ❌ No | Data is skewed or has outliers | Data is evenly distributed
Mode | Most frequent value | ❌ No | Data is categorical or has peaks | Data has no repetition
Midrange | Middle of min & max | ✅ Yes | A quick estimate of the center | Data has extreme values

B. Measures of Position

Measures of position determine the relative standing of a single value in relation to other values in a sample or population. They help identify where a particular data point falls within a dataset.

Types of Measures of Position

1. Quartiles (Q1, Q2, Q3)
2. Deciles (D1–D9)
3. Percentiles (P1–P99)

Quartiles

Quartiles divide a dataset into four equal parts:

• First Quartile (Q1): 25% of the data falls below this point.
• Second Quartile (Q2 or Median): 50% of the data falls below this point.
• Third Quartile (Q3): 75% of the data falls below this point.

Steps to Find Quartiles

1. Arrange the data from lowest to highest.
2. Determine Qk using the formula:

[Formula image not preserved.]

where k = 1, 2, or 3, and n = number of data points.

o If the result is a whole number, use that observation.
o If it is a decimal, take the next highest integer as the position.
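Statistical software uses several slightly different quartile conventions. As a rough sketch, the snippet below applies the "exclusive" method of Python's `statistics.quantiles`, which places the k-th cut point at position k(n + 1)/4 and interpolates between neighbors. That convention is an assumption here and may not match the round-up rule in the steps above. The data are the six pulse rates from the median example.

```python
from statistics import quantiles

# Pulse rates (BPM) reused from the median example; quantiles() sorts internally.
pulse = [50, 52, 60, 62, 74, 84]

# n=4 gives three cut points (Q1, Q2, Q3).
# method="exclusive" is an assumed convention, not necessarily the chapter's rule.
q1, q2, q3 = quantiles(pulse, n=4, method="exclusive")

# The same call with n=10 or n=100 gives deciles (D1..D9) or percentiles (P1..P99).
deciles = quantiles(pulse, n=10, method="exclusive")
percentiles = quantiles(pulse, n=100, method="exclusive")

print(q1, q2, q3)                    # Q2 is the median
print(deciles[4], percentiles[49])   # D5 and P50 are also the median
```

With only six observations the extreme percentiles are extrapolated from the two nearest values, so they are not meaningful; the call is shown only to illustrate how quartiles, deciles, and percentiles relate.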

Deciles

Deciles divide the dataset into ten equal parts:

• D1: 10% of data falls below this point.
• D2: 20% falls below.
• D5: 50% falls below (Median).
• D9: 90% falls below.

Steps to Find Deciles

1. Arrange the data from lowest to highest.
2. Determine Dk using the formula:

[Formula image not preserved.]

where k = 1, 2, 3, ..., 9 and n = total observations.

Percentiles

Percentiles divide the dataset into 100 equal parts:

• P1: 1% of data falls below.
• P25: 25% falls below (Q1).
• P50: 50% falls below (Median).
• P75: 75% falls below (Q3).
• P99: 99% falls below.

Steps to Find Percentiles

1. Arrange the data in increasing order.
2. Determine Pk using the formula:

[Formula image not preserved.]

where k = percentile rank and n = total number of observations.

Key Interpretations

Measure | Interpretation
--- | ---
Quartiles (Q1, Q2, Q3) | Divide data into four parts; useful in box plots and detecting skewness.
Deciles (D1–D9) | Help classify data into ten equal groups, commonly used in income distribution.
Percentiles (P1–P99) | Used in standardized tests, medical research, and growth charts to compare an individual to a population.

Example Applications

Measure | Application
--- | ---
Percentiles | Standardized testing (e.g., SAT, GRE), growth charts for babies (e.g., 25th percentile for weight).
Quartiles | Income distribution, box plots, analyzing the spread of medical data.
Deciles | Economic research, top 10% or bottom 10% income classification.

C. Measures of Variability

Measures of variability (also called measures of spread or dispersion) describe how similar or varied a set of observed values is for a particular dataset. These measures help determine the consistency or dispersion of data points around a central value.

Types of Measures of Variability:

1. Range
2. Interquartile Range (IQR)
3. Variance
4. Standard Deviation
5. Measures of Relative Dispersion (Z-score, Coefficient of Variation)
1. Range (Max − Min)

• Definition: The difference between the highest and lowest values in a dataset.
• Example: The birthweights (in ounces) of newborns at a certain hospital are as follows: 112, 111, 107, 119, 92, 80, 81, 84, 118, 106, 103, and 94. Calculate the range.
Range = 119 − 80 = 39 ounces
• Key Point: The range is sensitive to extreme values (outliers), making it a non-resistant measure of variability.

2. Interquartile Range (IQR) (Q3 − Q1)

• Definition: The difference between the third quartile (Q3) and the first quartile (Q1). It represents the spread of the middle 50% of the data.
• Example: Using the same birthweight dataset, calculating Q1 and Q3 will provide the IQR.
• Key Point: The IQR is resistant to outliers because it ignores the highest and lowest 25% of the data.

3. Variance (s² or σ²)

• Definition: The average of the squared differences between each data point and the mean.
• Formulas:
o Sample variance: s² = Σ(x − x̅)² / (n − 1)
o Population variance: σ² = Σ(x − μ)² / N
• Key Interpretation:
o Higher variance → Data points are more spread out.
o Lower variance → Data points are closer to the mean.

4. Standard Deviation (s or σ)

• Definition: The square root of the variance, measuring how much data deviates from the mean.
• Formulas:
o Sample standard deviation: s = √[Σ(x − x̅)² / (n − 1)]
o Population standard deviation: σ = √[Σ(x − μ)² / N]
• Key Interpretation:
o Smaller standard deviation → Data is consistent and clustered around the mean.
o Larger standard deviation → Data is widely spread, indicating greater variability.

General Interpretation of Standard Deviation:

Standard Deviation (σ) | Interpretation
--- | ---
σ < 5% of the mean | Data points are tightly packed (low variability).
σ ≈ 10% of the mean | Some variability, but values stay close to the mean.
σ > 20% of the mean | High variability, data is widely spread.
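The range, IQR, variance, and standard deviation of the birthweight data can be sketched in Python as follows. Two assumptions are worth flagging: the quartiles come from the default ("exclusive") method of `statistics.quantiles`, which may differ from a hand-calculation rule, and both the sample (n − 1) and population (N) versions of the variance are shown because the text names both symbols.

```python
from statistics import pstdev, pvariance, quantiles, stdev, variance

# Birthweights in ounces, from the range example.
birthweights = [112, 111, 107, 119, 92, 80, 81, 84, 118, 106, 103, 94]

# Range: highest value minus lowest value.
data_range = max(birthweights) - min(birthweights)

# IQR: Q3 minus Q1 (quartile convention is an assumption, see above).
q1, _, q3 = quantiles(birthweights, n=4)
iqr = q3 - q1

# Sample statistics divide by n - 1; population statistics divide by N.
sample_var = variance(birthweights)
sample_sd = stdev(birthweights)
pop_var = pvariance(birthweights)
pop_sd = pstdev(birthweights)

print(data_range, iqr)
print(round(sample_var, 2), round(sample_sd, 2))
print(round(pop_var, 2), round(pop_sd, 2))
```

The population variance is always slightly smaller than the sample variance for the same data, because it divides the same sum of squared deviations by N instead of n − 1.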

5. Measures of Relative Dispersion

These measures standardize data variability, making it easier to compare different datasets.
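To make that comparison concrete, here is a minimal Python sketch. It assumes the usual definitions, coefficient of variation = (standard deviation / mean) × 100 and z = (value − mean) / standard deviation, both with the sample standard deviation. The function names are illustrative, and the two datasets are the SBP readings and birthweights used in earlier examples.

```python
from statistics import mean, stdev


def coefficient_of_variation(data):
    """Sample standard deviation as a percentage of the mean."""
    return stdev(data) / mean(data) * 100


def z_scores(data):
    """Number of sample standard deviations each value lies from the mean."""
    m, s = mean(data), stdev(data)
    return [(x - m) / s for x in data]


sbp = [120, 135, 140, 125, 130]                  # mmHg
birthweights = [112, 111, 107, 119, 92, 80,
                81, 84, 118, 106, 103, 94]       # ounces

# The units differ, but the CVs are unit-free percentages and can be compared.
cv_sbp = coefficient_of_variation(sbp)
cv_birthweights = coefficient_of_variation(birthweights)

z_sbp = z_scores(sbp)

print(round(cv_sbp, 1), round(cv_birthweights, 1))
print([round(z, 2) for z in z_sbp])
```

Here the birthweights vary more relative to their mean (a CV of about 13.9%) than the SBP readings (about 6.1%), even though the two are measured in different units. The reading of 130 mmHg equals the SBP mean, so its z-score is 0.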
i. Coefficient of Variation (CV)

• Definition: The ratio of the standard deviation to the mean, expressed as a percentage.
• Formula: CV = (s / x̅) × 100%
• Key Interpretation:
o Higher CV → More variability relative to the mean.
o Lower CV → More consistency in the dataset.

ii. Standard Score (Z-Score)

• Definition: Measures how many standard deviations a value is from the mean.
• Formula: Z = (x − x̅) / s
• Key Interpretation:
o Z > 0 → Value is above the mean.
o Z < 0 → Value is below the mean.
o Z = 0 → Value is equal to the mean.
o |Z| > 2 (or 3) → Indicates an outlier or rare occurrence.

D. Measures of Skewness & Kurtosis

Skewness

Skewness measures the asymmetry of a distribution. It indicates the direction and extent to which a dataset deviates from a symmetric distribution such as the normal distribution.

• Interpretation:
o Positive Skew (Right-Skewed): The tail on the right side is longer, meaning most values are concentrated on the left.
o Negative Skew (Left-Skewed): The tail on the left side is longer, meaning most values are concentrated on the right.
o Zero Skew: A perfectly symmetrical distribution.

Formula for Moment Coefficient of Skewness

[Formula image not preserved.]

where:

• SK = Skewness coefficient
• x = Each data point
• x̅ = Mean of the dataset
• s = Standard deviation
• n = Sample size

Kurtosis

• Definition: Kurtosis measures the "tailedness" or peakedness of a distribution compared to a normal distribution.
• Types of Kurtosis:
1. Leptokurtic (KU > 3): High peak, heavy tails (more extreme values).
2. Mesokurtic (KU = 3): Normal distribution.
3. Platykurtic (KU < 3): Flat peak, light tails (fewer extreme values).

Formula for Coefficient of Kurtosis

[Formula image not preserved.]

where:

• KU = Kurtosis coefficient
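One common way to compute moment-based skewness and kurtosis is as ratios of central moments: the third central moment divided by the 1.5 power of the second, and the fourth divided by the square of the second. The Python sketch below assumes that definition, with every moment divided by n. Textbooks differ on whether n or n − 1 is used, so this may not be exactly the chapter's formula. On this scale a normal distribution has kurtosis 3, which matches the KU thresholds above.

```python
from statistics import fmean


def central_moment(data, order):
    """Average of (x - mean) ** order, dividing by n (assumed convention)."""
    m = fmean(data)
    return fmean([(x - m) ** order for x in data])


def moment_skewness(data):
    """Third central moment over the 1.5 power of the second central moment."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5


def moment_kurtosis(data):
    """Fourth central moment over the squared second central moment."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2


# Pulse rates (BPM) from the mode example: values pile up at 58, tail to the right.
pulse = [58, 58, 58, 58, 60, 60, 62, 64]
# A perfectly symmetric toy dataset for comparison (hypothetical values).
symmetric = [1, 2, 3, 4, 5]

print(moment_skewness(pulse), moment_kurtosis(pulse))
print(moment_skewness(symmetric), moment_kurtosis(symmetric))
```

The pulse data give a positive skewness because the few larger values stretch the right tail, while the symmetric dataset gives zero skewness and a kurtosis below 3 (platykurtic).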

Applications of Skewness & Kurtosis:

Measure | When to Use | Example Applications
--- | --- | ---
Skewness | Identifying asymmetry in data distribution | Survey biases, financial stock returns, revenue growth trends
Kurtosis | Checking extreme values and outliers | Risk analysis in finance, test score outlier detection, product defect rates

Key Interpretations:

• Higher positive skewness → Longer right tail; most values cluster on the left.
• Higher negative skewness → Longer left tail; most values cluster on the right.
• Higher kurtosis (>3) → Heavier tails, more extreme outliers.
• Lower kurtosis (<3) → Lighter tails, fewer outliers.

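Skewness & Kurtosis in Excel

• Skewness Calculation: =SKEW(range)
• Kurtosis Calculation: =KURT(range)
• Note: KURT returns excess kurtosis, so a normal distribution scores about 0 rather than 3. Add 3 to the result before comparing it with the KU thresholds above.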
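Excel's two functions apply small-sample adjustments, and KURT reports excess kurtosis (a normal distribution scores near 0, not 3). The Python sketch below reproduces the adjusted formulas as documented for those functions; the function names `excel_skew` and `excel_kurt` and the ten-value sample are illustrative only.

```python
from statistics import mean, stdev


def excel_skew(data):
    """Adjusted sample skewness, as computed by Excel's SKEW (needs n >= 3)."""
    n = len(data)
    m, s = mean(data), stdev(data)
    total = sum(((x - m) / s) ** 3 for x in data)
    return n / ((n - 1) * (n - 2)) * total


def excel_kurt(data):
    """Adjusted excess kurtosis, as computed by Excel's KURT (needs n >= 4)."""
    n = len(data)
    m, s = mean(data), stdev(data)
    total = sum(((x - m) / s) ** 4 for x in data)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * total - correction


# Small illustrative sample (hypothetical values).
sample = [3, 4, 5, 2, 3, 4, 5, 6, 4, 7]

print(round(excel_skew(sample), 4))   # about 0.3595
print(round(excel_kurt(sample), 4))   # about -0.1518
```

Because of the adjustments, these results will not match the plain moment-based coefficients exactly, and the difference is largest for small samples.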
