Central Tendency and Variability
Statistics
A type of mathematical analysis that uses
quantified representations, models, and
summaries of a given set of empirical data or
real-world observations. Statistical
analysis involves collecting and analyzing
data and then summarizing the data in
numerical form.
Descriptive Statistics
The goal of descriptive statistics is to
summarize a collection of data in a clear and
understandable way.
What is the pattern of data over the range of
possible values?
Where, on the scale of possible data, is a point
that best represents the set of data?
Do the data cluster about their central point or do
they spread out around it?
Central Tendency
Measure of Central Tendency:
A single summary figure that best describes the
central location of an entire distribution of data.
A typical value.
The center of the distribution.
One distribution can have multiple locations where
data cluster.
Must decide which measure is best for a given situation.
Central Tendency
Measures of Central Tendency:
Mean
The sum of all scores divided by the number of
scores.
Median
The value that divides the distribution in half
when observations are ordered.
Mode
The most frequent score.
Central Tendency Example:
Mode
52, 76, 100, 136, 186, 196, 205, 250,
254, 264, 264, 280, 282, 283, 303, 313,
317, 317, 325, 373, 384, 384, 400, 402,
417, 422, 472, 480, 643, 693, 732, 749,
750, 791, 891
Mode: most frequent observation
Mode(s) for hotel rates:
264, 317, 384
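The mode search above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides; the name `hotel_rates` is an assumed label for the 35 rates in this example.

```python
from collections import Counter

# The 35 hotel rates from the example, ordered lowest to highest.
hotel_rates = [
    52, 76, 100, 136, 186, 196, 205, 250,
    254, 264, 264, 280, 282, 283, 303, 313,
    317, 317, 325, 373, 384, 384, 400, 402,
    417, 422, 472, 480, 643, 693, 732, 749,
    750, 791, 891,
]

counts = Counter(hotel_rates)        # how often each rate occurs
highest = max(counts.values())       # frequency of the most common rate(s)
modes = sorted(rate for rate, c in counts.items() if c == highest)

print(modes)    # [264, 317, 384]
print(highest)  # 2
```

Because three rates tie at a frequency of 2, the data set has three modes rather than one.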
Pros and Cons of the Mode
Pros:
Good for nominal data (data that are used as labels only).
Easiest to compute and understand.
The average comes from the data set.
Cons:
Ignores most of the information in a distribution.
Small samples may not have a mode.
Central Tendency Example:
Median
52, 76, 100, 136, 186, 196, 205, 250, 254, 264,
264, 280, 282, 283, 303, 313, 317, 317, 325,
373, 384, 384, 400, 402, 417, 422, 472, 480,
643, 693, 732, 749, 750, 791, 891
The median is the middle value when
observations are ordered.
To find the middle, count in (N+1)/2 scores when
observations are ordered lowest to highest.
Median hotel rate:
(35+1)/2 = 18, so the median is the 18th score.
The 18th score is 317.
Finding the median with an
even number of scores.
2, 2, 3, 5, 6, 7, 7, 7, 8, 9
With an even number of scores, the median is
the average of the middle two observations
when observations are ordered.
Find the average of the (N/2)th and the ((N+2)/2)th
scores.
With N = 10: N/2 = 5th score, (N+2)/2 = 6th score
Add middle two observations and divide by two.
(6+7)/2 = 6.5
Median is 6.5
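The two counting rules for the median, (N+1)/2 for an odd number of scores and the average of the (N/2)th and ((N+2)/2)th scores for an even number, can be sketched as follows. The function name `median` and the three-score list are illustrative assumptions; only the ten-score list comes from the example above.

```python
def median(scores):
    """Middle value of the ordered scores (average of the middle two if N is even)."""
    ordered = sorted(scores)
    n = len(ordered)
    if n % 2 == 1:
        # Odd N: take the (N+1)/2-th score; subtract 1 for 0-based indexing.
        return ordered[(n + 1) // 2 - 1]
    # Even N: average the N/2-th and (N+2)/2-th scores.
    lower = ordered[n // 2 - 1]
    upper = ordered[(n + 2) // 2 - 1]
    return (lower + upper) / 2

even_scores = [2, 2, 3, 5, 6, 7, 7, 7, 8, 9]  # from the example above
odd_scores = [7, 3, 5]                        # hypothetical, for the odd case

print(median(even_scores))  # 6.5
print(median(odd_scores))   # 5
```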
Pros and Cons of Median
Pros:
Not influenced by extreme scores or skewed distributions.
Good with ordinal data (data that can be ordered).
Easier to compute than the mean.
Cons:
May not exist in the data set.
Doesn’t take actual values into account.
Mean
Is the balance point of a distribution.
The sum of negative deviations from the
mean exactly equals the sum of positive
deviations from the mean.
Mean
Population:
$$\mu = \frac{\sum X}{N}$$
$\mu$ ("mu"): the population mean.
$\sum$ ("sigma"): the sum of X; add up all scores.
$N$: the total number of scores in a population.
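The balance-point property can be checked numerically. The sketch below is an illustration only: it uses ten scores with a mean of 6 (the same scores that appear in this document's variance example), and the variable names are assumptions.

```python
scores = [3, 4, 4, 4, 6, 7, 7, 8, 8, 9]

mean = sum(scores) / len(scores)          # sum of X divided by N
deviations = [x - mean for x in scores]   # distance of each score from the mean

negative_total = sum(d for d in deviations if d < 0)
positive_total = sum(d for d in deviations if d > 0)

print(mean)            # 6.0
print(negative_total)  # -9.0
print(positive_total)  # 9.0
```

The negative and positive deviations cancel exactly, which is what makes the mean the balance point.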
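Variance:
Definitional Formula
Population:
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$
Sample:
$$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$
$\sigma$ is read "sigma".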
Variance
Use the definitional formula to calculate the variance.
$$\sigma^2 = \frac{\sum (X - \bar{X})^2}{n}$$
$$\sigma^2 = \frac{(3-6)^2 + (4-6)^2 + (4-6)^2 + (4-6)^2 + (6-6)^2 + (7-6)^2 + (7-6)^2 + (8-6)^2 + (8-6)^2 + (9-6)^2}{10}$$
$$\sigma^2 = \frac{40}{10} = 4.0$$
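The definitional formula translates directly into code: subtract the mean from each score, square, sum, and divide by n. A minimal sketch with the ten scores from this example (variable names are illustrative):

```python
scores = [3, 4, 4, 4, 6, 7, 7, 8, 8, 9]

n = len(scores)
mean = sum(scores) / n

# Squared deviation of each score from the mean.
squared_deviations = [(x - mean) ** 2 for x in scores]
sum_of_squares = sum(squared_deviations)

variance = sum_of_squares / n   # dividing by n, as in the example

print(sum_of_squares)  # 40.0
print(variance)        # 4.0
```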
Variance:
Computational Formula
Population:
$$\sigma^2 = \frac{N \sum X^2 - (\sum X)^2}{N^2}$$
Variance
Use the computational formula to calculate the variance.
X       X^2
3       9
4       16
4       16
4       16
6       36
7       49
7       49
8       64
8       64
9       81
Sum: 60     Sum: 400
$$\sigma^2 = \frac{n \sum X^2 - (\sum X)^2}{n^2} = \frac{10(400) - (60)^2}{10^2} = \frac{4000 - 3600}{100} = 4.0$$
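The computational formula needs only two running totals, the sum of X and the sum of X squared. The sketch below is illustrative (the variable names are assumptions) and reproduces the totals of 60 and 400.

```python
scores = [3, 4, 4, 4, 6, 7, 7, 8, 8, 9]

n = len(scores)
sum_x = sum(scores)                         # sum of X
sum_x_squared = sum(x * x for x in scores)  # sum of X^2

# Computational formula: (n * sum(X^2) - (sum X)^2) / n^2
variance = (n * sum_x_squared - sum_x ** 2) / n ** 2

print(sum_x, sum_x_squared)  # 60 400
print(variance)              # 4.0
```

The result matches the definitional formula without computing any individual deviations.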
Variability Example: Variance
X       X^2
52      2704
76      5776
100     10000
136     18496
186     34596
196     38416
205     42025
250     62500
254     64516
264     69696
264     69696
280     78400
282     79524
283     80089
303     91809
313     97969
317     100489
317     100489
325     105625
373     139129
384     147456
384     147456
400     160000
402     161604
417     173889
422     178084
472     222784
480     230400
643     413449
693     480249
732     535824
749     561001
750     562500
791     625681
891     793881
Sum: 13386      Sum: 6686202
$$\sigma^2 = \frac{n \sum X^2 - (\sum X)^2}{n^2} = \frac{35(6686202) - (13386)^2}{35^2} = \frac{234017070 - 179184996}{1225} = 44760.88$$
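The hotel-rate totals and the resulting variance can be reproduced with the computational formula. This is a sketch under the assumption that the 35 rates are the ones listed in the central tendency examples; `hotel_rates` and the other names are illustrative.

```python
# The 35 hotel rates (ordered lowest to highest).
hotel_rates = [
    52, 76, 100, 136, 186, 196, 205, 250,
    254, 264, 264, 280, 282, 283, 303, 313,
    317, 317, 325, 373, 384, 384, 400, 402,
    417, 422, 472, 480, 643, 693, 732, 749,
    750, 791, 891,
]

n = len(hotel_rates)
sum_x = sum(hotel_rates)                         # sum of X
sum_x_squared = sum(x * x for x in hotel_rates)  # sum of X^2

numerator = n * sum_x_squared - sum_x ** 2
variance = numerator / n ** 2

print(n, sum_x, sum_x_squared)  # 35 13386 6686202
print(round(variance, 2))       # 44760.88
```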
Pros and Cons of Variance
Pros:
Takes all data into account.
Lends itself to computation of other stable measures.
Cons:
Hard to interpret.
Can be influenced by extreme scores.
Standard Deviation
To “undo” the squaring of difference
scores, take the square root of the
variance.
Return to original units rather than
squared units.
Standard Deviation
Rough measure of the average amount by which
observations deviate on either side of the mean.
The square root of the variance.
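Population:
$$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$$
Sample:
$$S = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$$
Computational formula (population):
$$\sigma = \sqrt{\frac{N \sum X^2 - (\sum X)^2}{N^2}}$$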
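Variability Example: Standard Deviation
Definitional formula:
$$\sigma = \sqrt{\frac{\sum (X - \bar{X})^2}{n}} = \sqrt{\frac{(3-6)^2 + (4-6)^2 + (4-6)^2 + (4-6)^2 + (6-6)^2 + (7-6)^2 + (7-6)^2 + (8-6)^2 + (8-6)^2 + (9-6)^2}{10}} = \sqrt{\frac{40}{10}} = \sqrt{4.0} = 2.0$$
Computational formula:
$$\sigma = \sqrt{\frac{n \sum X^2 - (\sum X)^2}{n^2}} = \sqrt{\frac{10(400) - (60)^2}{10^2}} = \sqrt{\frac{4000 - 3600}{100}} = \sqrt{4.0} = 2.0$$
Mean: 6
Standard Deviation: 2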
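Taking the square root of the variance returns the measure to the original units. The sketch below is illustrative: it computes the population standard deviation for the ten scores (dividing by n, as in the example) and, for contrast, the sample version that divides by n - 1. Variable names are assumptions.

```python
import math

scores = [3, 4, 4, 4, 6, 7, 7, 8, 8, 9]

n = len(scores)
mean = sum(scores) / n
sum_of_squares = sum((x - mean) ** 2 for x in scores)

population_sd = math.sqrt(sum_of_squares / n)    # divide by N
sample_sd = math.sqrt(sum_of_squares / (n - 1))  # divide by n - 1

print(mean, population_sd)  # 6.0 2.0
print(round(sample_sd, 3))  # 2.108
```

The sample formula gives a slightly larger value because it divides the same sum of squares by a smaller number.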
Variability Example: Standard Deviation
[Figure: histogram titled "Las Vegas Hotel Rates". Horizontal axis: Rates, in bins 0-99, 100-199, 200-299, 300-399, 400-499, 500-599, 600-699, 700-799, 800-899. Vertical axis: Frequency.]
$$\sigma = \sqrt{\frac{35(6686202) - (13386)^2}{35^2}} = \sqrt{\frac{234017070 - 179184996}{1225}} = \sqrt{44760.88} = 211.57$$
Mean: 13386/35 = $382.46
Standard Deviation: $211.57
Pros and Cons of Standard Deviation
Pros:
Lends itself to computation of other stable measures.
Average of deviations around the mean.
Cons:
Influenced by extreme scores.
Mean and Standard Deviation
Using the mean and standard deviation
together:
Is an efficient way to describe a distribution with
just two numbers.
Allows a direct comparison between distributions.
Monthly Income
Family Income (Rs)
1 400
2 850
3 1750
4 200
5 375
6 4250
7 2225
8 750
9 500
10 1300
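As a hedged illustration of how an extreme score affects the mean but not the median, the ten family incomes above can be summarized as follows. The variable names are assumptions, and the median uses the even-N rule from earlier in this document.

```python
# Monthly income (Rs) for families 1 to 10, copied from the table above.
incomes = [400, 850, 1750, 200, 375, 4250, 2225, 750, 500, 1300]

n = len(incomes)
mean = sum(incomes) / n

ordered = sorted(incomes)
# Even N: average the N/2-th and (N+2)/2-th scores (0-based indices n//2 - 1 and n//2).
median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2

print(mean)    # 1260.0
print(median)  # 800.0
```

The mean is pulled upward by the largest income (Rs 4250), while the median stays near the middle of the ordered incomes.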
Daily Wages
Wages/hr (Rs)    No. of Workers
15 2
18 3
20 5
25 10
30 12
35 10
40 5
42 2
45 1
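For a frequency table like this one, each wage can be weighted by its number of workers before applying the computational formula. The sketch below is one possible approach, not a method stated in the source; `wage_counts` and the other names are illustrative.

```python
import math

# Wage (Rs per hour) -> number of workers, copied from the table above.
wage_counts = {15: 2, 18: 3, 20: 5, 25: 10, 30: 12, 35: 10, 40: 5, 42: 2, 45: 1}

n = sum(wage_counts.values())                                         # total workers
sum_x = sum(wage * f for wage, f in wage_counts.items())              # sum of f * X
sum_x_squared = sum(wage ** 2 * f for wage, f in wage_counts.items()) # sum of f * X^2

mean = sum_x / n
variance = (n * sum_x_squared - sum_x ** 2) / n ** 2  # population formula
std_dev = math.sqrt(variance)
mode = max(wage_counts, key=wage_counts.get)          # wage with the most workers

print(n, sum_x, sum_x_squared)              # 50 1473 46275
print(round(mean, 2), round(std_dev, 2))    # 29.46 7.59
print(mode)                                 # 30
```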
Marks of Students
Marks No of Students
10-20 2
20-30 4
30-40 8
40-50 40
50-60 24
60-70 10
70-80 6
80-90 6
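With grouped data the individual marks are not known, so the mean can only be estimated. The sketch below assumes every score in a class sits at the class midpoint, a common convention that is not stated in the source; all names are illustrative.

```python
# (lower limit, upper limit, number of students), copied from the table above.
mark_classes = [
    (10, 20, 2), (20, 30, 4), (30, 40, 8), (40, 50, 40),
    (50, 60, 24), (60, 70, 10), (70, 80, 6), (80, 90, 6),
]

n = sum(f for _, _, f in mark_classes)

# Assumption: each class is represented by its midpoint.
weighted_total = sum((lo + hi) / 2 * f for lo, hi, f in mark_classes)
estimated_mean = weighted_total / n

# Modal class: the class interval with the highest frequency.
modal_class = max(mark_classes, key=lambda c: c[2])[:2]

print(n)                         # 100
print(round(estimated_mean, 1))  # 51.4
print(modal_class)               # (40, 50)
```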
Coin Tossing
Eight coins were tossed.
No of Heads Frequency
0 1
1 9
2 26
3 59
4 72
5 52
6 29
7 7
8 2
No. 50 ? 90 200
Sd 6 7 ? 7.746
Mean 113 ? 115 116