Central Tendency and Variability

This document discusses descriptive statistics and measures of central tendency and variability. It defines statistics as the use of quantitative representations and models to analyze empirical data. Descriptive statistics aims to summarize data in a clear way by looking at patterns, central values, and how data is clustered or spread. Measures of central tendency, like the mean, median, and mode, describe the central location of data. The mean is the average value, the median is the middle value, and the mode is the most frequent value. Measures of variability, like the range and variance, describe how spread out the data is. Variance measures how far data deviates from the mean on average. Understanding measures of central tendency and variability is important for describing data

Uploaded by

g23033
Central Tendency and Variability
Statistics
A type of mathematical analysis involving the use of quantified representations, models, and summaries for a given set of empirical data or real-world observations. Statistical analysis involves the process of collecting and analyzing data and then summarizing the data into a numerical form.
Descriptive Statistics
The goal of descriptive statistics is to summarize a collection of data in a clear and understandable way.
• What is the pattern of data over the range of possible values?
• Where, on the scale of possible data, is a point that best represents the set of data?
• Do the data cluster about their central point or do they spread out around it?
Central Tendency
Measure of Central Tendency:
• A single summary figure that best describes the central location of an entire distribution of data.
• A typical value.
• The center of the distribution.
• One distribution can have multiple locations where data cluster.
• The analyst must decide which measure is best for a given situation.
Central Tendency
Measures of Central Tendency:
• Mean
  • The sum of all scores divided by the number of scores.
• Median
  • The value that divides the distribution in half when observations are ordered.
• Mode
  • The most frequent score.
Central Tendency Example: Mode
52, 76, 100, 136, 186, 196, 205, 250,
254, 264, 264, 280, 282, 283, 303, 313,
317, 317, 325, 373, 384, 384, 400, 402,
417, 422, 472, 480, 643, 693, 732, 749,
750, 791, 891
Mode: most frequent observation
Mode(s) for hotel rates:
• 264, 317, 384 (each occurs twice)
Pros and Cons of the Mode
Pros:
• Good for nominal data (data that are used as labels only).
• Easiest to compute and understand.
• The average comes from the data set.
Cons:
• Ignores most of the information in a distribution.
• Small samples may not have a mode.
Central Tendency Example: Median
52, 76, 100, 136, 186, 196, 205, 250, 254, 264,
264, 280, 282, 283, 303, 313, 317, 317, 325,
373, 384, 384, 400, 402, 417, 422, 472, 480,
643, 693, 732, 749, 750, 791, 891
The median is the middle value when observations are ordered.
• To find the middle, count in (N+1)/2 scores when observations are ordered lowest to highest.
Median hotel rate:
• (35+1)/2 = 18, so the median is the 18th score.
• The 18th score is 317.
Finding the median with an even number of scores.
2, 2, 3, 5, 6, 7, 7, 7, 8, 9
With an even number of scores, the median is the average of the middle two observations when observations are ordered.
• Find the average of the N/2 and the (N+2)/2 score.
• N/2 = 5th score, (N+2)/2 = 6th score
• Add middle two observations and divide by two.
• (6+7)/2 = 6.5
Median is 6.5
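The two counting rules, (N+1)/2 for an odd number of scores and the average of the N/2 and (N+2)/2 scores for an even number, can be written as one small function. This is a sketch: the name `median_by_position` is hypothetical, and the odd-length list is simply the slide's ten scores with the last one dropped.

```python
from statistics import median

def median_by_position(values):
    """Median via the slide's counting rules on the ordered scores."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        # Odd N: the (N+1)/2-th score (1-based) is the middle one.
        return ordered[(n + 1) // 2 - 1]
    # Even N: average the N/2-th and (N+2)/2-th scores (1-based).
    lower = ordered[n // 2 - 1]
    upper = ordered[(n + 2) // 2 - 1]
    return (lower + upper) / 2

even_scores = [2, 2, 3, 5, 6, 7, 7, 7, 8, 9]   # from the slide
odd_scores = even_scores[:-1]                   # hypothetical odd-length variant

even_median = median_by_position(even_scores)
odd_median = median_by_position(odd_scores)
print(even_median, odd_median)

# The standard library agrees with the hand-rolled rule.
assert even_median == median(even_scores)
assert odd_median == median(odd_scores)
```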
Pros and Cons of Median
Pros:
• Not influenced by extreme scores or skewed distributions.
• Good with ordinal data (data that can be ordered).
• Easier to compute than the mean.
Cons:
• May not exist in the data set.
• Doesn't take actual values into account.
Mean
Is the balance point of a distribution.
The sum of negative deviations from the mean exactly equals the sum of positive deviations from the mean.
Population mean ("mu"):
\[ \mu = \frac{\sum X}{N} \]
where \(\sum X\) ("sigma") is the sum of X (add up all scores) and \(N\) is the total number of scores in a population.
Sample mean ("X bar"):
\[ \bar{X} = \frac{\sum X}{n} \]
where \(\sum X\) ("sigma") is the sum of X (add up all scores) and \(n\) is the total number of scores in a sample.
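The balance-point property can be checked numerically. The sketch below uses the ten scores from this deck's variance worked example (3, 4, 4, 4, 6, 7, 7, 8, 8, 9); the variable names are illustrative only.

```python
# Scores from the deck's variance worked example.
scores = [3, 4, 4, 4, 6, 7, 7, 8, 8, 9]

mean = sum(scores) / len(scores)           # sum of X divided by n
deviations = [x - mean for x in scores]    # X minus the mean, per score

negative_total = sum(d for d in deviations if d < 0)
positive_total = sum(d for d in deviations if d > 0)

print(mean, negative_total, positive_total)
```

The negative and positive totals have the same magnitude, so all deviations together sum to zero.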
Central Tendency Example: Mean
52, 76, 100, 136, 186, 196, 205, 250, 254, 264, 264, 280, 282, 283,
303, 313, 317, 317, 325, 373, 384, 384, 400, 402, 417, 422, 472, 480,
643, 693, 732, 749, 750, 791, 891

Mean hotel rate:
\[ \bar{X} = \frac{\sum X}{n} = \frac{13386}{35} = 382.46 \]
• Mean hotel rate: $382.46
Pros and Cons of the Mean
Pros:
• Mathematical center of a distribution.
• Just as far from scores above it as it is from scores below it.
• Good for interval and ratio data, i.e. numerical data.
• Does not ignore any information.
• Inferential statistics is based on mathematical properties of the mean.
Cons:
• Influenced by extreme scores and skewed distributions.
• May not exist in the data.
The effect of skew on average.
In a normal distribution, the mean, median, and mode are the same.
In a skewed distribution, the mean is pulled toward the tail.
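A small numerical sketch of the pull toward the tail, using the ten monthly family incomes (Rs) tabulated later in this deck. Treating that table as a right-skewed sample is an interpretation made here, not something the slides state.

```python
from statistics import mean, median

# Monthly family incomes (Rs) from the deck's "Monthly Income" table.
incomes = [400, 850, 1750, 200, 375, 4250, 2225, 750, 500, 1300]

mean_all = mean(incomes)
median_all = median(incomes)

# Drop the single largest income to see how much each measure moves.
trimmed = sorted(incomes)[:-1]
mean_trimmed = mean(trimmed)
median_trimmed = median(trimmed)

print(mean_all, median_all)
print(round(mean_trimmed, 2), median_trimmed)
```

Removing one extreme income shifts the mean by several hundred rupees but moves the median by only 50, which is the sensitivity the slide describes.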
Which average?
Each measure contains a different kind of information.
• For example, all three measures are useful for summarizing the distribution of American household incomes.
  • In 1998, the income common to the greatest number of households was $25,000.
  • Half the households earned less than $38,885.
  • The mean income was $50,600.
• Reporting only one measure of central tendency might be misleading and perhaps reflect a bias.
Measures of Variability
A single summary figure that describes the spread of observations within a distribution.
Measures of Variability
Range
• Difference between the smallest and largest observations.
Variance
• Mean of all squared deviations from the mean.
Standard Deviation
• Rough measure of the average amount by which observations deviate from the mean. The square root of the variance.
Variability Example: Range
Las Vegas Hotel Rates
52, 76, 100, 136, 186, 196, 205, 250,
254, 264, 264, 280, 282, 283, 303, 313,
317, 317, 325, 373, 384, 384, 400, 402,
417, 422, 472, 480, 643, 693, 732, 749,
750, 791, 891
Range: 891 - 52 = 839
Pros and Cons of the Range
Pros:
• Very easy to compute.
Cons:
• Value depends only on two scores.
• Very sensitive to outliers.
• Influenced by sample size (the larger the sample, the larger the range).
Variance
The average amount that a score deviates from the typical score.
• Score - Mean = Difference Score
• Average of Difference Scores = 0
• In order to make this number not 0, square the difference scores (no negatives to cancel out the positives).
Variance: Definitional Formula
Population:
\[ \sigma^2 = \frac{\sum (X - \mu)^2}{N} \]
Sample:
\[ S^2 = \frac{\sum (X - \bar{X})^2}{n - 1} \]
The symbol \(\sigma\) is read "sigma".
Variance
Use the definitional formula to calculate the variance.
\[ \sigma^2 = \frac{\sum (X - \bar{X})^2}{n} \]
\[ \sigma^2 = \frac{(3-6)^2 + (4-6)^2 + (4-6)^2 + (4-6)^2 + (6-6)^2 + (7-6)^2 + (7-6)^2 + (8-6)^2 + (8-6)^2 + (9-6)^2}{10} \]
\[ \sigma^2 = \frac{40}{10} = 4.0 \]
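The definitional formula translates directly into code. The sketch below uses the slide's ten scores; the function names are hypothetical, and the sample version (divisor n - 1) is included only to contrast it with the population version used in the worked example.

```python
from statistics import pvariance, variance

scores = [3, 4, 4, 4, 6, 7, 7, 8, 8, 9]

def population_variance(values):
    """Mean of squared deviations from the mean (divisor N)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n

def sample_variance(values):
    """Same sum of squared deviations, but divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

pop_var = population_variance(scores)
samp_var = sample_variance(scores)
print(pop_var, round(samp_var, 3))

# Cross-check against the standard library.
assert pop_var == pvariance(scores)
assert abs(samp_var - variance(scores)) < 1e-12
```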
Variance: Computational Formula
Population:
\[ \sigma^2 = \frac{N \sum X^2 - (\sum X)^2}{N^2} \]
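As a rough sketch, the computational formula needs only two running totals, the sum of X and the sum of X squared. The scores are the ten values used in this deck's variance example; the function name is hypothetical.

```python
scores = [3, 4, 4, 4, 6, 7, 7, 8, 8, 9]

def population_variance_computational(values):
    """(N * sum(X^2) - (sum X)^2) / N^2, with no explicit deviations."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return (n * sum_x2 - sum_x ** 2) / n ** 2

sum_x = sum(scores)
sum_x2 = sum(x * x for x in scores)
comp_var = population_variance_computational(scores)
print(sum_x, sum_x2, comp_var)
```

The result matches the definitional formula for the same scores, since the two expressions are algebraically equivalent.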
Variance
Use the computational formula to calculate the variance.

X       X²
3       9
4       16
4       16
4       16
6       36
7       49
7       49
8       64
8       64
9       81
Sum: 60 Sum: 400

\[ \sigma^2 = \frac{n \sum X^2 - (\sum X)^2}{n^2} \]
\[ \sigma^2 = \frac{10(400) - (60)^2}{10^2} \]
\[ \sigma^2 = \frac{4000 - 3600}{100} \]
\[ \sigma^2 = 4.0 \]
Variability Example: Variance
Las Vegas Hotel Rates

X       X²
472     222784
303     91809
280     78400
282     79524
417     173889
400     160000
254     64516
205     42025
384     147456
264     69696
317     100489
76      5776
643     413449
480     230400
136     18496
250     62500
100     10000
732     535824
317     100489
264     69696
384     147456
750     562500
402     161604
422     178084
373     139129
325     105625
313     97969
749     561001
791     625681
196     38416
891     793881
283     80089
52      2704
186     34596
693     480249
Sum: 13386      Sum: 6686202

\[ \sigma^2 = \frac{n \sum X^2 - (\sum X)^2}{n^2} \]
\[ \sigma^2 = \frac{35(6686202) - (13386)^2}{35^2} \]
\[ \sigma^2 = \frac{234017070 - 179184996}{1225} \]
\[ \sigma^2 = 44760.88 \]
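The hotel-rate calculation can be reproduced with the computational formula. This sketch takes the 35 rates exactly as listed in the X column of the table and treats them as a population (divisor n squared, as on the slide); taking the square root then gives the standard deviation in dollars.

```python
from math import sqrt

# The 35 hotel rates from the X column of the variance table.
rates = [
    472, 303, 280, 282, 417, 400, 254, 205, 384, 264, 317, 76,
    643, 480, 136, 250, 100, 732, 317, 264, 384, 750, 402, 422,
    373, 325, 313, 749, 791, 196, 891, 283, 52, 186, 693,
]

n = len(rates)
sum_x = sum(rates)                    # sum of X
sum_x2 = sum(x * x for x in rates)    # sum of X squared

variance = (n * sum_x2 - sum_x ** 2) / n ** 2
std_dev = sqrt(variance)

print(n, sum_x, sum_x2)
print(round(variance, 2), round(std_dev, 2))
```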
Pros and Cons of Variance
Pros:
• Takes all data into account.
• Lends itself to computation of other stable measures.
Cons:
• Hard to interpret.
• Can be influenced by extreme scores.
Standard Deviation
To "undo" the squaring of difference scores, take the square root of the variance.
Return to original units rather than squared units.
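Standard Deviation
Rough measure of the average amount by which observations deviate on either side of the mean. The square root of the variance.
Population:
\[ \sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (X - \mu)^2}{N}} \]
Sample:
\[ S = \sqrt{S^2} = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} \]
Population, computational form:
\[ \sigma = \sqrt{\frac{N \sum X^2 - (\sum X)^2}{N^2}} \]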
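Variability Example: Standard Deviation
Definitional formula:
\[ \sigma = \sqrt{\frac{\sum (X - \bar{X})^2}{n}} = \sqrt{\frac{(3-6)^2 + (4-6)^2 + (4-6)^2 + (4-6)^2 + (6-6)^2 + (7-6)^2 + (7-6)^2 + (8-6)^2 + (8-6)^2 + (9-6)^2}{10}} = \sqrt{\frac{40}{10}} = \sqrt{4.0} = 2.0 \]
Computational formula:
\[ \sigma = \sqrt{\frac{n \sum X^2 - (\sum X)^2}{n^2}} = \sqrt{\frac{10(400) - (60)^2}{10^2}} = \sqrt{\frac{4000 - 3600}{100}} = \sqrt{4.0} = 2.0 \]
Mean: 6
Standard Deviation: 2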
Variability Example: Standard Deviation
[Figure: histogram "Las Vegas Hotel Rates"; x-axis: Rates, in bins 0-99 through 800-899; y-axis: Frequency]
\[ \sigma = \sqrt{\frac{35(6686202) - (13386)^2}{35^2}} = \sqrt{\frac{234017070 - 179184996}{1225}} = \sqrt{44760.88} = 211.57 \]
Mean: $382.46 (13386 / 35)
Standard Deviation: $211.57
Pros and Cons of Standard Deviation
Pros:
• Lends itself to computation of other stable measures.
• Average of deviations around the mean.
Cons:
• Influenced by extreme scores.
Mean and Standard Deviation
Using the mean and standard deviation together:
• Is an efficient way to describe a distribution with just two numbers.
• Allows a direct comparison between distributions.
Monthly Income
Family Income (Rs)
1 400
2 850
3 1750
4 200
5 375
6 4250
7 2225
8 750
9 500
10 1300
Daily Wages
Wages/hr in (Rs) No. of Workers
15 2
18 3
20 5
25 10
30 12
35 10
40 5
42 2
45 1
Marks of Students
Marks No of Students
10-20 2
20-30 4
30-40 8
40-50 40
50-60 24
60-70 10
70-80 6
80-90 6
Coin Tossing
Eight coins were tossed.
No of Heads Frequency
0 1
1 9
2 26
3 59
4 72
5 52
6 29
7 7
8 2

Determine the Median.
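One way to approach this exercise is to apply the (N+1)/2 counting rule to cumulative frequencies instead of a raw list of scores. The sketch below assumes the table's frequencies are counts of trials; the function name is hypothetical.

```python
from itertools import accumulate

heads = [0, 1, 2, 3, 4, 5, 6, 7, 8]
freq = [1, 9, 26, 59, 72, 52, 29, 7, 2]

def median_from_frequencies(values, frequencies):
    """Median of discrete data given as ascending values with their frequencies."""
    total = sum(frequencies)
    cumulative = list(accumulate(frequencies))

    def value_at(position):
        # Return the value holding the given 1-based position in the ordered data.
        for v, cf in zip(values, cumulative):
            if position <= cf:
                return v

    if total % 2 == 1:
        return value_at((total + 1) // 2)
    return (value_at(total // 2) + value_at(total // 2 + 1)) / 2

total_tosses = sum(freq)
median_heads = median_from_frequencies(heads, freq)
print(total_tosses, median_heads)
```

With 257 observations the median is the 129th ordered value, which falls in the row whose cumulative frequency first reaches 129.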


Daily Income
Daily Income (Rs.) Frequency
Below 200 4
200-250 10
250-300 16
300-350 25
350-400 40
400-450 30
450-500 20
500-550 15
550-600 6
600-650 4

Determine the Median income.
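For grouped data, a common textbook approach (not stated in these slides, so treat it as an assumption) is linear interpolation inside the median class: median = L + ((N/2 - cf) / f) × h, where L is the lower limit of the median class, cf the cumulative frequency before it, f its frequency, and h its width. A minimal sketch with a hypothetical function name:

```python
# (lower, upper, frequency); the open-ended first class has no known lower limit.
income_classes = [
    (None, 200, 4), (200, 250, 10), (250, 300, 16), (300, 350, 25),
    (350, 400, 40), (400, 450, 30), (450, 500, 20), (500, 550, 15),
    (550, 600, 6), (600, 650, 4),
]

def grouped_median(classes):
    """Interpolated median for contiguous classes listed in ascending order."""
    total = sum(f for _, _, f in classes)
    half = total / 2
    cumulative_before = 0
    for lower, upper, f in classes:
        if cumulative_before + f >= half:
            if lower is None:
                raise ValueError("median falls in an open-ended class")
            width = upper - lower
            return lower + (half - cumulative_before) / f * width
        cumulative_before += f

total_workers = sum(f for _, _, f in income_classes)
median_income = grouped_median(income_classes)
print(total_workers, median_income)
```

The open-ended "Below 200" class does not affect the result here because the median falls in the 350-400 class.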


Daily Income
Daily Income (Rs.) No. of Workers
Below 300 5
300-400 12
400-500 15
500-600 20
600-700 30
700-800 25
800-900 18
900-1000 10
1000 and above 5

Find the Modal Wage.
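A standard textbook formula for the mode of grouped data, assumed here because the slides do not give one, is mode = L + ((f1 - f0) / (2·f1 - f0 - f2)) × h, where f1 is the frequency of the modal class, f0 and f2 are the frequencies of the classes before and after it, L is its lower limit, and h its width. The limits used for the two open-ended classes are placeholders and do not influence the result.

```python
# Lower class limits; 200 for "Below 300" is a placeholder assumption.
lower_limits = [200, 300, 400, 500, 600, 700, 800, 900, 1000]
workers = [5, 12, 15, 20, 30, 25, 18, 10, 5]
class_width = 100

def grouped_mode(lowers, width, frequencies):
    """Interpolated mode; assumes equal-width classes and a single modal class."""
    i = max(range(len(frequencies)), key=frequencies.__getitem__)
    f1 = frequencies[i]
    f0 = frequencies[i - 1] if i > 0 else 0
    f2 = frequencies[i + 1] if i + 1 < len(frequencies) else 0
    return lowers[i] + (f1 - f0) / (2 * f1 - f0 - f2) * width

modal_wage = grouped_mode(lower_limits, class_width, workers)
print(round(modal_wage, 2))
```

The modal class is 600-700 (30 workers), and the interpolation leans toward the 700-800 side because that neighbor has the higher frequency.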


Sample Problems
1. The means of two groups of sizes 50 and 100 are 54.12 and 50.3, and their standard deviations are 8 and 7, respectively. Obtain the mean and s.d. of the combined group.
2. The average runs scored by three batsmen A, B, and C in the same series are 50, 48, and 12, and the corresponding s.d. are 5, 12, and 2. Who is the most consistent of the three? If one of the three is to be selected, who will be selected?
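Both problems can be explored with two small helpers. The sketch assumes the given standard deviations are population values (divisor N) and measures consistency with the coefficient of variation (s.d. / mean × 100), a common convention that the slides do not name. Function names are hypothetical.

```python
from math import sqrt

def combine(groups):
    """Pool groups given as (n, mean, sd); returns (n, mean, sd) of the combined group."""
    total_n = sum(n for n, _, _ in groups)
    pooled_mean = sum(n * m for n, m, _ in groups) / total_n
    # Each group contributes its own variance plus its squared shift from the pooled mean.
    pooled_var = sum(n * (s ** 2 + (m - pooled_mean) ** 2) for n, m, s in groups) / total_n
    return total_n, pooled_mean, sqrt(pooled_var)

def coefficient_of_variation(mean, sd):
    """Relative spread in percent; lower means more consistent."""
    return sd / mean * 100

# Problem 1: two groups of sizes 50 and 100.
combined_n, combined_mean, combined_sd = combine([(50, 54.12, 8), (100, 50.3, 7)])
print(combined_n, round(combined_mean, 2), round(combined_sd, 2))

# Problem 2: batsmen A, B, C as (mean runs, s.d.).
batsmen = {"A": (50, 5), "B": (48, 12), "C": (12, 2)}
cv = {name: coefficient_of_variation(m, s) for name, (m, s) in batsmen.items()}
most_consistent = min(cv, key=cv.get)
print(cv, most_consistent)
```

Under these assumptions the batsman with the lowest coefficient of variation is also the one with the highest average.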
Missing Information
Gr-1 Gr-2 Gr-3 Combined

No. 50 ? 90 200
Sd 6 7 ? 7.746
Mean 113 ? 115 116
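The missing entries can be recovered by running the combined-mean and combined-variance relationships backwards. The sketch assumes population standard deviations and uses the identity N·σ² = Σ nᵢ(σᵢ² + (meanᵢ - mean)²); variable names are illustrative.

```python
from math import sqrt

# Known entries from the table.
n1, n3, n_all = 50, 90, 200
mean1, mean3, mean_all = 113, 115, 116
sd1, sd2, sd_all = 6, 7, 7.746

# Group sizes must add up to the combined size.
n2 = n_all - n1 - n3

# The combined total (N * mean) is the sum of the group totals.
mean2 = (n_all * mean_all - n1 * mean1 - n3 * mean3) / n2

# Contribution of groups 1 and 2 to the combined sum of squares.
known = (n1 * (sd1 ** 2 + (mean1 - mean_all) ** 2)
         + n2 * (sd2 ** 2 + (mean2 - mean_all) ** 2))

# What remains belongs to group 3; remove its squared shift from the combined mean.
var3 = (n_all * sd_all ** 2 - known) / n3 - (mean3 - mean_all) ** 2
sd3 = sqrt(var3)

print(n2, mean2, round(sd3, 2))
```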
