Descriptive Statistics - Part 3


* Measures of dispersion express quantitatively the degree of
variation or dispersion of values in a population or in
a sample. They measure how spread out a set of data is.
* Along with measures of central tendency, measures of
dispersion are widely used in practice as descriptive
statistics.
* Some measures of dispersion are the variance, the standard
deviation, the range, the coefficient of variation and the
interquartile range.
Measures of Dispersion I (looks at the
spread of the data)

Range
  Description: Difference between the largest and the smallest value in the data.
  Applicability: 1. Interval/ratio data. 2. No outliers exist.
  Advantage: 1. Simple to calculate.
  Disadvantage: 1. Highly influenced by outliers. 2. Does not use all the data.

Variance/standard deviation
  Description: Variance is the average of the squared deviations from the mean. Standard deviation is the square root of the variance. Commonly used.
  Applicability: 1. Interval/ratio data. 2. When no outliers exist.
  Advantage: 1. Uses all the data.
  Disadvantage: 1. Not resistant to outliers. 2. Variance depends on the units of measurement, therefore it is not easy to make comparisons.

Coefficient of Variation
  Description: Measures variability relative to the magnitude of the data. It compares the variability of different data.
  Applicability: 1. Interval/ratio data. 2. When no outliers exist.
  Advantage: 1. Ensures fairness in comparison. 2. Uses all the data.
  Disadvantage: 1. Not resistant to outliers.

Measures of Dispersion II

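Range
  Raw data: Max. value − Min. value
  Ungrouped frequency distribution: Same as raw data
  Grouped frequency distribution: NA

Variance/standard deviation
  Raw data: $s^2 = \dfrac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}$
  Ungrouped frequency distribution: $s^2 = \dfrac{\sum f x^2 - \frac{(\sum f x)^2}{n}}{n - 1}$
  Grouped frequency distribution: $s^2 = \dfrac{\sum f x^2 - \frac{(\sum f x)^2}{n}}{n - 1}$

Coefficient of variation
  Raw data: $CV = \left(\dfrac{\text{stdev}}{\text{mean}}\right) \times 100\%$
  Ungrouped frequency distribution: Same as raw data
  Grouped frequency distribution: Same as raw data

Notations
  x = the actual values (for raw data and ungrouped freq. dist.)
    = midpoints (for grouped freq. dist.)
  f = frequency
  n = sample size
  Σ = summation or sum of
  s² = sample variance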
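The variance formula for raw data is the computational (shortcut) form of the usual definition of the sample variance. The short derivation below is a sketch of why the two agree; the symbol $\bar{x} = \sum x / n$ for the sample mean is an assumption introduced here and does not appear in the notation list.

```latex
\begin{aligned}
\sum (x - \bar{x})^2
  &= \sum x^2 - 2\bar{x}\sum x + n\bar{x}^2 \\
  &= \sum x^2 - \frac{2(\sum x)^2}{n} + \frac{(\sum x)^2}{n} \\
  &= \sum x^2 - \frac{(\sum x)^2}{n},
\qquad\text{so}\qquad
s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}
    = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}.
\end{aligned}
```

The frequency-distribution version follows the same steps with each term weighted by its frequency f and n taken as the sum of the frequencies.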
A computer salesperson, X, sells the following number of
computers in 12 months:
34, 47, 1, 15, 57, 24, 20, 11, 19, 50, 28, 37.

Calculate the:
a. Mean
b. Variance
c. Standard deviation
d. Coefficient of variation
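One way to check the hand calculation for the salesperson data is a short Python sketch. It applies the computational variance formula for raw data; the variable names are illustrative only.

```python
import math

# Monthly computer sales for salesperson X (from the example above).
sales = [34, 47, 1, 15, 57, 24, 20, 11, 19, 50, 28, 37]

n = len(sales)
sum_x = sum(sales)
sum_x2 = sum(x * x for x in sales)

mean = sum_x / n
# Computational formula: s^2 = (sum(x^2) - (sum(x))^2 / n) / (n - 1)
variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)
std_dev = math.sqrt(variance)
cv = std_dev / mean * 100  # coefficient of variation, in percent

print(f"mean = {mean:.2f}")
print(f"variance = {variance:.2f}")
print(f"standard deviation = {std_dev:.2f}")
print(f"CV = {cv:.2f}%")
```

The results should come out at roughly 28.58 for the mean, 286.08 for the variance, 16.91 for the standard deviation and 59.17% for the coefficient of variation.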
The data below shows the age distribution of a sample of persons
from Country X in 1990.

Age (years)     No. of Persons
Less than 20    10
20 - <30        14
30 - <40        25
40 - <50        27
50 - <60        15
60 and over     9

Calculate the mean, standard deviation and the coefficient of
variation for the data and interpret your findings.
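A grouped calculation for the age data could be sketched as follows. The two open-ended classes have no midpoint, so the sketch assumes (this is not stated in the source) that "Less than 20" means 10 - <20 and "60 and over" means 60 - <70, giving every class a width of 10. A different assumption changes the answers.

```python
import math

# Frequencies from the age table.
freqs = [10, 14, 25, 27, 15, 9]
# Assumed midpoints: open-ended classes are closed at 10 and 70 (an assumption).
midpoints = [15, 25, 35, 45, 55, 65]

n = sum(freqs)
sum_fx = sum(f * x for f, x in zip(freqs, midpoints))
sum_fx2 = sum(f * x * x for f, x in zip(freqs, midpoints))

mean = sum_fx / n
# Grouped computational formula: s^2 = (sum(f x^2) - (sum(f x))^2 / n) / (n - 1)
variance = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)
std_dev = math.sqrt(variance)
cv = std_dev / mean * 100  # in percent

print(f"n = {n}, mean = {mean:.2f}")
print(f"standard deviation = {std_dev:.2f}, CV = {cv:.2f}%")
```

Under these assumed midpoints the mean is 40 years, the standard deviation is about 14.11 years and the coefficient of variation is about 35.27%.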
Age group    Number unemployed
15-19        3 688
20-24        4 031
25-34        5 432
35-44        4 360
45-54        3 162
55-64        1 702

Calculate the standard deviation and coefficient of variation.
Interpret your results.
* The table provides information on the days to maturity
of 40 short-term investments.

Days to Maturity Frequency


30–39 3
40–49 1
50–59 8
60–69 10
70–79 7
80–89 7
90–99 4

Estimate the sample mean and sample standard
deviation of the days-to-maturity data.
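The days-to-maturity estimate can also be sketched with the definitional form of the variance, which gives the same result as the computational formula. The sketch assumes each class is represented by the midpoint of its stated limits (for example 34.5 for 30–39).

```python
import math

# Class limits and frequencies from the days-to-maturity table.
classes = [(30, 39), (40, 49), (50, 59), (60, 69), (70, 79), (80, 89), (90, 99)]
freqs = [3, 1, 8, 10, 7, 7, 4]

# Assumption: use the midpoint of the stated class limits.
midpoints = [(lo + hi) / 2 for lo, hi in classes]

n = sum(freqs)
mean = sum(f * x for f, x in zip(freqs, midpoints)) / n
# Definitional form: s^2 = sum(f * (x - mean)^2) / (n - 1)
variance = sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints)) / (n - 1)
std_dev = math.sqrt(variance)

print(f"n = {n}, mean = {mean:.1f} days, standard deviation = {std_dev:.2f} days")
```

With these midpoints the estimated mean is 68 days and the estimated standard deviation is about 16.42 days.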
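*Empirical Rule
* The empirical rule states that a symmetric, approximately normal distribution with
population mean μ and standard deviation σ has the following
properties:
i. about 68% of observations lie within μ ± σ;
ii. about 95% of observations lie within μ ± 2σ;
iii. about 99.7% of observations lie within μ ± 3σ.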

A group of 220 Year 10 students were asked how much time they spent
watching television per week. The results are shown in the table given
below. The mean and standard deviation of hours spent watching television
by the 220 students was 30.32 and 6.04 respectively.

Hours    No. of students
10-14    2
15-19    12
20-24    23
25-29    60
30-34    77
35-39    38
40-44    8
Assuming the frequency distribution is approximately normal,
calculate the interval within which:

i. 68% of all observations would be expected to occur;
ii. 95% of all observations would be expected to occur;
iii. 99.7% of all observations would be expected to occur.
E.g. answer for 99.7% of all observations:

This means that about 99.7% of observations are expected
to lie between about 12 hours and 48 hours
(30.32 ± 3 × 6.04). That is, almost every student in the sample
is expected to watch between 12 and 48 hours of
television each week.
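The three intervals for the television data are the mean plus or minus one, two and three standard deviations. A minimal Python sketch, using the mean and standard deviation given in the question:

```python
# Mean and standard deviation of weekly viewing hours, as given in the question.
mean = 30.32
std_dev = 6.04

# Empirical rule: k standard deviations either side of the mean.
coverage = {1: "68%", 2: "95%", 3: "99.7%"}
intervals = {k: (mean - k * std_dev, mean + k * std_dev) for k in coverage}

for k, label in coverage.items():
    low, high = intervals[k]
    print(f"{label}: {low:.2f} to {high:.2f} hours")
```

This gives 24.28 to 36.36 hours for 68%, 18.24 to 42.40 hours for 95%, and 12.20 to 48.44 hours for 99.7%.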
*Measure of Shape I
*Like central tendency and variation, shape is an important
feature of a distribution.

*Skewness is an important measure of shape. It is a lack of symmetry.

*A distribution is symmetric if it looks the same to the left
and right of the mean.

*A distribution is skewed if one of its tails is longer than the
other, often because of outliers.
Measure of Shape II
Positive skewness: Mode < Median < Mean
Zero skewness or symmetric: Mean = Median = Mode
Negative skewness: Mean < Median < Mode

Pearson's 1st coefficient of skewness: $\dfrac{\text{mean} - \text{mode}}{\text{S.D.}}$
Pearson's 2nd coefficient of skewness: $\dfrac{3(\text{mean} - \text{median})}{\text{S.D.}}$
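Both Pearson coefficients can be computed directly from the mean, median, mode and standard deviation. The sketch below uses a small hypothetical data set (not taken from the slides) with a single mode and a longer right tail, and uses the sample standard deviation for S.D.

```python
import statistics

# Hypothetical data, chosen only for illustration.
data = [2, 3, 3, 4, 5, 6, 9]

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)
std_dev = statistics.stdev(data)  # sample standard deviation (n - 1 divisor)

# Pearson's 1st coefficient: (mean - mode) / S.D.
sk1 = (mean - mode) / std_dev
# Pearson's 2nd coefficient: 3 * (mean - median) / S.D.
sk2 = 3 * (mean - median) / std_dev

print(f"mode = {mode}, median = {median}, mean = {mean:.2f}")
print(f"Pearson 1st = {sk1:.2f}, Pearson 2nd = {sk2:.2f}")
```

Here Mode < Median < Mean and both coefficients are positive, which matches the positive skewness case.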
*Example VI
1. Using the data from Example III, compute Pearson's 1st and 2nd
coefficients of skewness and sketch the diagram.

*Bonus Question
The marks out of 10 for forty students who attempted a SOCI1005 test were
recorded as follows:

9, 10, 7, 8, 9, 6, 5, 9, 4, 7, 0, 7, 2, 7, 8, 5, 4, 3, 10, 7,
3, 7, 8, 6, 9, 7, 4, 2, 3, 9, 4, 3, 7, 5, 5, 2, 7, 9, 7, 1.

* a) Construct a grouped frequency table of the scores using 6 classes.
* b) Using the table, calculate the mean, median and mode.
* c) Calculate the variance, standard deviation and coefficient of variation.
* d) How would you interpret these results?
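As a starting point for part a), the tally into classes could be sketched in Python. The class boundaries are an assumption: six equal-width classes of width 2 starting at 0 (0-1, 2-3, ..., 10-11). Other choices of six classes are possible and would give a different table.

```python
# Marks out of 10 for the forty students (from the question above).
marks = [9, 10, 7, 8, 9, 6, 5, 9, 4, 7, 0, 7, 2, 7, 8, 5, 4, 3, 10, 7,
         3, 7, 8, 6, 9, 7, 4, 2, 3, 9, 4, 3, 7, 5, 5, 2, 7, 9, 7, 1]

num_classes = 6
width = 2  # assumed class width; covers marks 0 to 11 in six classes

# Inclusive class limits: (0, 1), (2, 3), ..., (10, 11)
classes = [(lo, lo + width - 1) for lo in range(0, num_classes * width, width)]
# Count how many marks fall inside each class.
freqs = [sum(lo <= m <= hi for m in marks) for lo, hi in classes]

for (lo, hi), f in zip(classes, freqs):
    print(f"{lo}-{hi}: {f}")
```

The frequencies should add up to 40, which is a quick check that no mark was missed or counted twice.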
