
Describing Distributions With Numbers

1) Converting heights to feet divides every value by 12, rescaling the data, so the mean, median, standard deviation, and IQR are all divided by 12. 2) Adding 2 inches to each height shifts every value up by 2 inches, raising the mean and median by 2 inches but leaving the standard deviation and IQR unchanged. 3) Adding 4 inches to each height and then converting to feet first shifts every value up by 4 inches and then rescales the data by dividing by 12. The mean and median increase by 4 inches and are then divided by 12; the standard deviation and IQR are unaffected by the shift and are simply divided by 12.

Uploaded by

krothroc
Copyright
© Attribution Non-Commercial (BY-NC)

Describing Distributions with Numbers
Section 1.2
What Does That Mean?
• We are about to learn specific ways to calculate the
center and spread of a distribution. You can
calculate these numerical values for any
quantitative variable. But to interpret these
measures of center and spread, and to choose
among the several methods you will learn, you
must think about the shape of the distribution
and the meaning of the data. The numbers, like
the graphs, are aids to understanding, not "the
answer" in themselves.
Measures of Center
Mean:
of a sample: $\bar{x} = \dfrac{\sum x_i}{n}$
of a population: $\mu = \dfrac{\sum x_i}{n}$
Median:
the value that divides the data into
equal halves (*it may or may not
be a value in the data set)
The median divides the distribution into two equal areas.
The mean is the balance point of the distribution.
Practice:
1. Find the mean and median for
each list and contrast their
behavior:
a. 1, 2, 6
b. 1, 2, 9
c. 1, 2, 297
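
One way to check this practice item is a short Python sketch using the standard-library `statistics` module. The variable names are illustrative only; the three lists are the ones given in the exercise.

```python
from statistics import mean, median

# The three lists from the practice item; only the last value changes.
lists = [[1, 2, 6], [1, 2, 9], [1, 2, 297]]

means = [mean(values) for values in lists]
medians = [median(values) for values in lists]

for values, m, med in zip(lists, means, medians):
    print(values, "mean =", m, "median =", med)
```

Only the largest value differs between the lists, so comparing the two columns of output shows which measure of center follows that value and which does not.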
Measures of Spread
Range: maximum - minimum

Percentiles: the pth percentile of a distribution is the
value such that p percent of the observations
fall at or below it (the median is the 50th
percentile)

Quartiles: the lower quartile (Q1) is the 25th
percentile (or the median of the lower half) and
the upper quartile (Q3) is the 75th percentile (or the
median of the upper half)

Interquartile Range (IQR) = Q3 - Q1
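
The "median of each half" description of the quartiles can be sketched in Python as follows. The helper name `quartiles` is hypothetical, and the sketch assumes that when the number of observations is odd, the overall median is left out of both halves. Calculators and software may use other conventions and give slightly different quartiles. The IQR is taken as Q3 minus Q1.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: with an odd count, the middle value belongs to neither half.
    Needs at least two observations.
    """
    ordered = sorted(data)
    half = len(ordered) // 2
    lower = ordered[:half]    # values below the median position
    upper = ordered[-half:]   # values above the median position
    return median(lower), median(upper)

sample = [1, 2, 4, 6, 9]
q1, q3 = quartiles(sample)
iqr = q3 - q1
data_range = max(sample) - min(sample)
print("Q1 =", q1, "Q3 =", q3, "IQR =", iqr, "range =", data_range)
```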


Another Measure of Spread
Standard Deviation (Std Dev):
of a sample: $s_x = \sqrt{\dfrac{1}{n-1} \sum (x_i - \bar{x})^2}$
of a population: $\sigma_x = \sqrt{\dfrac{1}{n} \sum (x_i - \mu)^2}$
More About Standard Deviation
• The differences of each value from
the mean are deviations: $x_i - \bar{x}$
• Since the mean is the balance point
of the distribution, the set of all
deviations from the mean will always
add to zero: $\sum (x_i - \bar{x}) = 0$
• The Variance is: $s_x^2 = \dfrac{1}{n-1} \sum (x_i - \bar{x})^2$
• The Standard Deviation is: $s_x = \sqrt{\dfrac{1}{n-1} \sum (x_i - \bar{x})^2}$
Practice:
2. For the sample: 1, 2, 4, 6, 9
a. Verify that the sum of the deviations from
the mean is 0.
b. Find the standard deviation by hand.
c. Find the standard deviation on the
graphing calculator.
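
As a stand-in for the graphing calculator, the same steps can be sketched in Python. The variable names are illustrative. The hand calculation follows the sample formula (dividing by n - 1), and `statistics.stdev` from the standard library is used only as a cross-check.

```python
from math import sqrt
from statistics import mean, stdev

sample = [1, 2, 4, 6, 9]

# a. Deviations from the mean and their sum
x_bar = mean(sample)
deviations = [x - x_bar for x in sample]
sum_of_deviations = sum(deviations)  # zero, up to floating-point rounding

# b. "By hand": sum the squared deviations, divide by n - 1, take the root
variance = sum(d ** 2 for d in deviations) / (len(sample) - 1)
s_x = sqrt(variance)

# c. Cross-check against the library's sample standard deviation
print("sum of deviations:", round(sum_of_deviations, 10))
print("by hand:", s_x, "library:", stdev(sample))
```

Because of floating-point arithmetic the sum of deviations may print as a tiny nonzero number before rounding; that is rounding error, not a failure of the rule.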
Practice:
3. Without computing, match each list of numbers
on the left with its SD on the right:

a. 1, 1, 1, 1                    i. 0
b. 1, 2, 2                       ii. 0.058
c. 1, 2, 3, 4, 5                 iii. 0.577
d. 10, 20, 20                    iv. 1.581
e. 0.1, 0.2, 0.2                 v. 3.162
f. 0, 2, 4, 6, 8                 vi. 3.606
g. 0, 0, 0, 0, 5, 6, 6, 8, 8     vii. 5.774
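
After attempting the match by reasoning alone, the answers can be checked with a short Python sketch. It assumes the listed SDs are sample standard deviations (dividing by n - 1), which is what `statistics.stdev` computes. The dictionary name and labels are illustrative.

```python
from statistics import stdev

# Lists a through g from the practice item
lists = {
    "a": [1, 1, 1, 1],
    "b": [1, 2, 2],
    "c": [1, 2, 3, 4, 5],
    "d": [10, 20, 20],
    "e": [0.1, 0.2, 0.2],
    "f": [0, 2, 4, 6, 8],
    "g": [0, 0, 0, 0, 5, 6, 6, 8, 8],
}

# Sample standard deviation of each list, rounded to three decimals
rounded_sds = {label: round(stdev(values), 3) for label, values in lists.items()}

for label, sd in rounded_sds.items():
    print(label, sd)
```

Lists b, d, and e are rescaled copies of one another, and f is c with every value doubled and then shifted, which is the pattern the exercise is meant to bring out.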
The Five-Number Summary
• The Five-Number Summary includes:
minimum, Q1, median, Q3, maximum
• It is used to create Boxplots.
• The five-number summary is usually better than
the mean and std dev for describing a skewed
distribution or a distribution with strong outliers.
Use $\bar{x}$ and $s_x$ only for reasonably symmetric
distributions that are free of outliers.
Danger Will Robinson, danger!!!
• While all of the methods
discussed for computing numerical
measures are very useful, they
should not be applied blindly.
• Statistical measures and
methods based on them are
generally meaningful only for
distributions of sufficiently
regular shape.
What Happened to the Whiskers?
Side-by-Side Boxplots:
[Figure: side-by-side boxplots with the minimum, Q1, median, Q3, and maximum labeled]
Calculating Outliers
• An observation is considered an Outlier if it falls
outside the interval:
(Q1 - 1.5 × IQR, Q3 + 1.5 × IQR)
• In general, it is not a good idea to
just ignore or delete outliers, but
they do have a strong influence on
many numerical summaries, so sometimes
calculations are done with and without the
outliers and then compared.
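
The 1.5 × IQR rule can be sketched in Python as below. The data list is hypothetical (the sample 1, 2, 4, 6, 9 with an extra value of 30 added for illustration), and the helper names `quartiles` and `find_outliers` are not from the slides. Quartiles are taken as medians of the lower and upper halves, leaving the middle value out when the count is odd; other quartile conventions can move the fences slightly.

```python
from statistics import median

def quartiles(data):
    """(Q1, Q3) as medians of the lower and upper halves (assumed convention)."""
    ordered = sorted(data)
    half = len(ordered) // 2
    return median(ordered[:half]), median(ordered[-half:])

def find_outliers(data):
    """Return the values outside (Q1 - 1.5*IQR, Q3 + 1.5*IQR) and the fences."""
    q1, q3 = quartiles(data)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    flagged = [x for x in data if x < low_fence or x > high_fence]
    return flagged, (low_fence, high_fence)

values = [1, 2, 4, 6, 9, 30]  # hypothetical data, not from the slides
outliers, fences = find_outliers(values)
print("fences:", fences, "outliers:", outliers)
```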
The Influence of Outliers
• Resistant – a summary statistic is resistant
to outliers if it does not change very much when
the outlier is removed from the data set:
  - median, IQR
• Sensitive – a summary statistic is sensitive
to outliers if it tends to be affected by
outliers:
  - mean, range, standard deviation
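
A rough way to see the difference between resistant and sensitive statistics is to compute them with and without a suspected outlier. The sketch below uses hypothetical data (1, 2, 4, 6, 9 plus an added value of 30) and an illustrative helper named `summarize`. It covers the mean, median, sample standard deviation, and range.

```python
from statistics import mean, median, stdev

with_outlier = [1, 2, 4, 6, 9, 30]  # hypothetical data; 30 is the suspect value
without_outlier = [x for x in with_outlier if x != 30]

def summarize(values):
    """Collect a few summary statistics for side-by-side comparison."""
    return {
        "mean": mean(values),
        "median": median(values),
        "std dev": stdev(values),
        "range": max(values) - min(values),
    }

before = summarize(with_outlier)
after = summarize(without_outlier)

# How far each statistic moves when the suspect value is dropped
changes = {name: before[name] - after[name] for name in before}
for name, change in changes.items():
    print(name, "changes by", round(change, 3))
```

In this example the median moves far less than the mean, standard deviation, or range, which is what "resistant" means in practice.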
Give Me a Graph, Baby!
• Remember that a graph gives the best
overall picture of a distribution. Numerical
measures of center and spread report
specific facts about a distribution, but they
do not describe its entire shape. Always
plot your data!
Changing the Unit of Measurement
• A change in the measurement unit is called
a Linear Transformation.
• A linear transformation changes the
original variable $x$ into the new variable $x_{\text{new}}$
by using the equation:

$x_{\text{new}} = a + bx$
So What Does That Mean?
• Adding the constant a shifts all
of the values of x left or right by
the same amount (the data are
recentered).
• Multiplying by the positive
constant b changes the size of
the unit of measurement (the
data are rescaled).
What Measures Are Affected?
• Adding a (recentering) changes the mean,
median, and quartiles by a. However,
none of the measures of spread change.
• Multiplying by b (rescaling) multiplies both
the measures of center and the measures of spread by b.
• Linear transformations do not change the
shape of a distribution!
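
These rules can be checked numerically with a small Python sketch. The list of heights is hypothetical (made up for illustration), and `transform` is an illustrative helper that applies x_new = a + bx to every value.

```python
from statistics import mean, median, stdev

heights_in = [44, 45, 45, 47, 50, 52]  # hypothetical heights in inches

def transform(values, a, b):
    """Apply the linear transformation x_new = a + b*x to each value."""
    return [a + b * x for x in values]

shifted = transform(heights_in, a=2, b=1)        # recentering only
rescaled = transform(heights_in, a=0, b=1 / 12)  # rescaling only (inches to feet)

for label, data in [("original", heights_in), ("shifted", shifted), ("rescaled", rescaled)]:
    print(label, "mean =", round(mean(data), 4),
          "median =", round(median(data), 4),
          "std dev =", round(stdev(data), 4))
```

Comparing the three rows shows the shift moving the mean and median while leaving the standard deviation alone, and the rescaling multiplying all three by b.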
Practice
4. The mean height of a class of 15 children
is 48 inches, the median is 45 inches, the
standard deviation is 2.4 inches, and the
IQR is 3 inches. Find the mean, median,
standard deviation, and IQR if…
a. you convert each height to feet.
b. each child grows 2 inches.
c. each child grows 4 inches and you
convert their heights to feet.
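
Since only summary statistics are given here, the transformation rules have to be applied to the summaries directly. A minimal Python sketch, assuming the rules for x_new = a + bx with positive b (centers become a + b × center, spreads become b × spread); the helper name `transform_summary` is hypothetical.

```python
def transform_summary(summary, a, b):
    """Apply x_new = a + b*x (b > 0) to a set of summary statistics."""
    return {
        "mean": a + b * summary["mean"],      # centers shift and rescale
        "median": a + b * summary["median"],
        "std dev": b * summary["std dev"],    # spreads only rescale
        "IQR": b * summary["IQR"],
    }

inches = {"mean": 48, "median": 45, "std dev": 2.4, "IQR": 3}

in_feet = transform_summary(inches, a=0, b=1 / 12)               # part a
grown_2 = transform_summary(inches, a=2, b=1)                    # part b
# Part c: (x + 4) / 12 is the same as 4/12 + (1/12) * x
grown_4_in_feet = transform_summary(inches, a=4 / 12, b=1 / 12)

for label, result in [("a", in_feet), ("b", grown_2), ("c", grown_4_in_feet)]:
    print(label, {name: round(value, 4) for name, value in result.items()})
```

Writing part c as a single transformation with a = 4/12 and b = 1/12 makes it clear that the 4-inch shift affects only the mean and median, while the standard deviation and IQR are divided by 12 exactly as in part a.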
