
Measures of Variability: Range

This document discusses various measures of variability in data including range, variance, standard deviation, and coefficient of variation. It also discusses analyzing distributions through percentiles, quartiles, z-scores, the empirical rule, identifying outliers, and box plots. The range is the difference between the maximum and minimum values. Variance and standard deviation utilize all data values and measure deviation from the mean. Percentiles and quartiles divide data into percentage points or four equal parts. Z-scores standardize values relative to the mean and standard deviation. The empirical rule describes how many data points fall within standard deviations of the mean.

Uploaded by

gladysann church

Measures of Variability

Range

Variance

Standard Deviation

Coefficient of Variation

Range

The range is a simple measure that tells you the spread of values in a data set. It has a simple definition:

• Found by subtracting the smallest value from the largest value in a data set

Range = maximum value – minimum value

So if you have a set of data such as 4, 2, 5, 8, 12, 15, the range is the highest number (15) minus
the lowest number (2). In this case:

Range = 15-2 = 13

• Illustration: Consider the data on home sales in a Cincinnati, Ohio, suburb

Home Sale    Selling Price ($)

 1           138,000
 2           254,000
 3           186,000
 4           257,500
 5           108,000
 6           254,000
 7           138,000
 8           298,000
 9           199,500
10           208,000
11           142,000
12           456,250

• Largest home sales price: $456,250

• Smallest home sales price: $108,000

• Range = Largest value – Smallest value

= $456,250 – $108,000

= $348,250

• Drawback: Range is based on only two of the observations and thus is highly influenced by
extreme values
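
The range calculation and its drawback can be sketched in a few lines of Python. This is only an illustration: the function name `value_range` and the variable names are assumptions, and the prices are the twelve selling prices listed above.

```python
# Selling prices ($) from the home sales illustration above.
home_sales = [138000, 254000, 186000, 257500, 108000, 254000,
              138000, 298000, 199500, 208000, 142000, 456250]


def value_range(values):
    """Return the range: maximum value minus minimum value."""
    return max(values) - min(values)


small_range = value_range([4, 2, 5, 8, 12, 15])
sales_range = value_range(home_sales)

# Drop the single largest sale to see how much one extreme value matters.
without_max = sorted(home_sales)[:-1]
trimmed_range = value_range(without_max)

print(small_range)    # 13
print(sales_range)    # 348250
print(trimmed_range)  # 190000
```

Removing one observation cuts the range from $348,250 to $190,000, which is the sensitivity to extreme values described above.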

Variance

• Measure of variability that utilizes all the data


• It is based on the deviation about the mean, which is the difference between the value of each
observation (xi) and the mean

• The deviations about the mean are squared when computing the variance

• Sample variance: $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$

• Population variance: $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$

Table 2.12: Computation of Deviations and Squared Deviations about the Mean for the Class Size Data

Computation of Sample Variance:
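
A minimal sketch of this computation, assuming the five class sizes 46, 54, 42, 46, 32 that this document uses in its coefficient of variation illustration. It prints each value with its deviation and squared deviation about the mean, then applies the sample variance formula. The variable names are illustrative only.

```python
class_sizes = [46, 54, 42, 46, 32]

n = len(class_sizes)
mean = sum(class_sizes) / n                    # sample mean

deviations = [x - mean for x in class_sizes]   # x_i minus the mean
squared = [d ** 2 for d in deviations]         # squared deviations

sample_variance = sum(squared) / (n - 1)       # divide by n - 1
# For comparison only: the population formula divides by N instead.
population_variance = sum(squared) / n

for x, d, sq in zip(class_sizes, deviations, squared):
    print(f"{x:>4} {d:>8.1f} {sq:>8.1f}")
print("sum of squared deviations:", sum(squared))
print("sample variance:", sample_variance)
```

The squared deviations sum to 256, so the sample variance is 256 / 4 = 64.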

Standard Deviation

• Positive square root of the variance

• Measured in the same units as the original data

• For a sample, $s = \sqrt{s^2}$

• For a population, $\sigma = \sqrt{\sigma^2}$

Coefficient of Variation

• Coefficient of variation = $\left(\frac{\text{Standard deviation}}{\text{Mean}} \times 100\right)\%$

• Measures the standard deviation relative to the mean

• Expressed as a percentage

Illustration:

• Consider the class size data:

46 54 42 46 32

• Mean, 𝑥̅ = 44
• Standard deviation, s = 8
• Coefficient of variation = $\left(\frac{8}{44} \times 100\right)\% = 18.2\%$
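
The same figures can be reproduced with Python's standard `statistics` module. This is a sketch rather than a prescribed method: `statistics.stdev` uses the sample (n - 1) formula, and the variable names are assumptions.

```python
import statistics

class_sizes = [46, 54, 42, 46, 32]

mean = statistics.mean(class_sizes)
s = statistics.stdev(class_sizes)   # sample standard deviation (n - 1 divisor)

# Standard deviation relative to the mean, expressed as a percentage.
cv_percent = s / mean * 100

print(f"{cv_percent:.1f}%")  # 18.2%
```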

Analyzing Distributions

Percentiles

Quartiles

Z-Scores

Empirical Rule

Identifying Outliers

Box Plots

Percentiles

• Value of a variable at which a specified (approximate) percentage of observations are below that value

• The pth percentile tells us the point in the data where:

• Approximately p percent of the observations have values less than the pth percentile

• Approximately (100 – p) percent of the observations have values greater than the pth
percentile

• Steps to calculate the pth percentile:

• Arrange the data in ascending order (smallest to largest value)

• Compute k = (n + 1) × p, with p written as a decimal (for example, p = 0.85 for the 85th percentile)

• Divide k into its integer component, i, and its decimal component, d

• If d = 0, the value in position k of the sorted data is the pth percentile

• If d > 0, the percentile is between the values in positions i and i + 1 in the sorted
data; to find this percentile, we must interpolate between these two values:

• Calculate the difference between the values in positions i and i + 1 in the sorted data set; we define this difference between the two values as m

• Multiply this difference by d: t = m × d

• To find the pth percentile, add t to the value in position i of the sorted
data
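
The steps above can be sketched as a small Python function. The function name `percentile`, the check that position k lies inside the data, and the sample values in `scores` are assumptions added for illustration; they do not come from the source. Here p is passed as a decimal between 0 and 1.

```python
def percentile(values, p):
    """Percentile by the (n + 1) * p position rule; p is a decimal, e.g. 0.85."""
    data = sorted(values)              # arrange in ascending order
    n = len(data)
    k = (n + 1) * p                    # position of the percentile
    if not 1 <= k <= n:
        # Assumption: refuse positions that fall outside the sorted data.
        raise ValueError("position k falls outside the data")
    i = int(k)                         # integer component
    d = k - i                          # decimal component
    if d == 0:
        return data[i - 1]             # positions are 1-based, lists 0-based
    m = data[i] - data[i - 1]          # difference between positions i and i + 1
    t = m * d
    return data[i - 1] + t             # interpolate upward from position i


# Hypothetical values, used only to exercise both branches.
scores = [10, 20, 30, 40, 50, 60, 70]
median = percentile(scores, 0.5)       # k = 4, so d = 0
p60 = percentile(scores, 0.6)          # k = 4.8, so interpolation is needed

print(median)         # 40
print(round(p60, 6))  # 48.0
```

Because k is computed in floating point, d can differ from its exact value by a tiny amount; the interpolation is continuous, so the effect on the result is negligible.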

• Illustration

• To determine the 85th percentile for the home sales data in Table 2.9:

1. Arrange the data in ascending order

108,000 138,000 138,000 142,000 186,000 199,500
208,000 254,000 254,000 257,500 298,000 456,250

2. Compute k = (n + 1) × p = (12 + 1) × 0.85 = 11.05

3. Dividing 11.05 into the integer and decimal components gives us i = 11 and d = 0.05

4. Because d > 0, interpolate between the values in the 11th and 12th positions in the sorted data


• The value in the 11th position is 298,000

• The value in the 12th position is 456,250

m = 456,250 – 298,000 = 158,250

t = m × d = 158,250 × 0.05 = 7,912.5

85th percentile = 298,000 + 7,912.5 = 305,912.5

$305,912.50 represents the 85th percentile of the home sales data
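
As a check on the arithmetic, the sketch below repeats the interpolation in Python and compares it with `statistics.quantiles` from the standard library. The cross-check rests on an assumption worth stating: the library's `"exclusive"` method is documented as placing the i-th of m sorted points at i / (m + 1), which corresponds to the (n + 1) × p position used here. Variable names are illustrative.

```python
import statistics

home_sales = [138000, 254000, 186000, 257500, 108000, 254000,
              138000, 298000, 199500, 208000, 142000, 456250]

data = sorted(home_sales)
n = len(data)

k = (n + 1) * 0.85           # about 11.05
i = int(k)                   # integer component: 11
d = k - i                    # decimal component: about 0.05
m = data[i] - data[i - 1]    # 456250 - 298000
t = m * d
p85 = data[i - 1] + t

# quantiles(..., n=100) returns 99 cut points; index 84 is the 85th percentile.
p85_check = statistics.quantiles(home_sales, n=100, method="exclusive")[84]

print(round(p85, 2), round(p85_check, 2))
```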

Quartiles

• When the data is divided into four equal parts:

• Each part contains approximately 25% of the observations

• Division points are referred to as quartiles

• 𝑄1 = first quartile, or 25th percentile

• 𝑄2 = second quartile, or 50th percentile (also the median)

• 𝑄3 = third quartile, or 75th percentile
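
A short sketch of the quartiles for the home sales data, assuming the same (n + 1) × p position rule used for percentiles. It relies on `statistics.quantiles` with `method="exclusive"`, which is taken here to correspond to that rule; the variable names are assumptions.

```python
import statistics

home_sales = [138000, 254000, 186000, 257500, 108000, 254000,
              138000, 298000, 199500, 208000, 142000, 456250]

# n=4 splits the data into four parts and returns the three cut points.
q1, q2, q3 = statistics.quantiles(home_sales, n=4, method="exclusive")

print(q1, q2, q3)                     # 139000.0 203750.0 256625.0
print(statistics.median(home_sales))  # 203750.0
```

The second quartile matches the median, as stated above.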

Z-Scores

• Measures the relative location of a value in the data set

• Helps to determine how far a particular value is from the mean relative to the data set’s
standard deviation

• Standardized value

• If $x_1, x_2, \ldots, x_n$ is a sample of n observations, then

$z_i = \frac{x_i - \bar{x}}{s}$

• $z_i$ = z-score for $x_i$

• $\bar{x}$ = sample mean

• s = sample standard deviation

• For class size data, 𝑥̅ = 44 and s = 8

• For observations with a value > mean, z-score > 0

• For observations with a value < mean, z-score < 0
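
A minimal sketch of the z-score calculation for the class size data (46, 54, 42, 46, 32), using the sample mean and sample standard deviation. The variable names are illustrative.

```python
import statistics

class_sizes = [46, 54, 42, 46, 32]

mean = statistics.mean(class_sizes)   # 44
s = statistics.stdev(class_sizes)     # sample standard deviation, 8

# z_i = (x_i - mean) / s for every observation.
z_scores = [(x - mean) / s for x in class_sizes]

for x, z in zip(class_sizes, z_scores):
    print(x, round(z, 2))
```

Values above the mean of 44 get positive z-scores (for example, 54 gives 1.25) and values below it get negative ones (32 gives -1.5).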


Empirical Rule

• For data having a bell-shaped distribution:

• Approximately 68% of the data values are within 1 standard deviation of the mean

• Approximately 95% of the data values are within 2 standard deviations of the mean

• Almost all of the data values are within 3 standard deviations of the mean
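
These percentages can be checked against an exactly normal distribution, which is an idealized version of "bell-shaped" and an assumption of this sketch. It uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, f"{share_within(k):.1%}")
# 1 68.3%
# 2 95.4%
# 3 99.7%
```

Real data are only approximately bell-shaped, so observed shares will usually differ somewhat from these values.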

Identifying Outliers

• Outliers: Extreme values in a data set

• They can be identified using standardized values (z-scores)

• Any data value with a z-score less than –3 or greater than +3 is treated as an outlier
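
The rule can be applied to the home sales data as a sketch. The ±3 threshold is the one stated above; the variable names are assumptions.

```python
import statistics

home_sales = [138000, 254000, 186000, 257500, 108000, 254000,
              138000, 298000, 199500, 208000, 142000, 456250]

mean = statistics.mean(home_sales)
s = statistics.stdev(home_sales)      # sample standard deviation

# Pair each price with its z-score, then keep those beyond +/- 3.
scored = [(price, (price - mean) / s) for price in home_sales]
outliers = [price for price, z in scored if abs(z) > 3]
largest_z = max(z for _, z in scored)

print(outliers)             # []
print(round(largest_z, 2))  # z-score of the $456,250 sale
```

Under this rule no sale is flagged: even the $456,250 home has a z-score of roughly 2.5, which is inside the ±3 band.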

Box Plots

• Graphical summary of the distribution of data

• Developed from the quartiles for a data set
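
As a sketch of the numbers a box plot is drawn from, the snippet below collects the quartiles of the home sales data together with the smallest and largest values. The box conventionally spans Q1 to Q3 with a line at the median; the name `iqr` (the width of the box) is a common convention and an assumption here, since the source does not define it. No plotting library is used.

```python
import statistics

home_sales = [138000, 254000, 186000, 257500, 108000, 254000,
              138000, 298000, 199500, 208000, 142000, 456250]

q1, median, q3 = statistics.quantiles(home_sales, n=4, method="exclusive")

summary = {
    "min": min(home_sales),
    "Q1": q1,
    "median": median,
    "Q3": q3,
    "max": max(home_sales),
}
iqr = q3 - q1   # width of the box (assumed name, not from the source)

for label, value in summary.items():
    print(f"{label:>6}: {value:,.0f}")
print(f"   IQR: {iqr:,.0f}")
```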


Figure 2.22: Box Plot for the Home Sales Data

Figure 2.23: Box Plots Comparing Home Sale Prices in Different Communities
