2) Summarization of Data: Mean, Median, Mode, SD, CV
Presentation of Data
* Measures of Central Tendency
* Measures of variation
* Coefficient of Variation
Especially with quantitative data, it is often
important to give, in a single value, an indication
of the general level of a set of measurements.
- Are the data values all rather close to the mean, or do some of
them scatter widely around it?
a – MEAN
The most commonly used measure of central tendency is
the mean, more appropriately called the "ARITHMETIC MEAN".
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$
Example: The ages at death of 7 surgeons are:
70, 68, 68, 40, 71, 73, 65.
The mean is 455 / 7 = 65 years
Example: Data {1, 3, 6, 7, 2, 3, 5}, n = 7
$$\bar{X} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{1 + 3 + 6 + 7 + 2 + 3 + 5}{7} = \frac{27}{7} \approx 3.9 \text{ years}$$
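The two mean calculations above can be checked with a few lines of Python. This is only a minimal sketch; the function name `arithmetic_mean` and the variable names are illustrative choices, not part of the lecture material.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by how many there are."""
    return sum(values) / len(values)


# Data taken from the two examples above.
ages_at_death = [70, 68, 68, 40, 71, 73, 65]
small_sample = [1, 3, 6, 7, 2, 3, 5]

mean_age = arithmetic_mean(ages_at_death)    # 455 / 7
mean_small = arithmetic_mean(small_sample)   # 27 / 7

print(mean_age)              # 65.0
print(round(mean_small, 1))  # 3.9
```

Note how the single low value (40) pulls the surgeons' mean down to 65, even though six of the seven values are 65 or higher.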
Example (Mode)
The mode is the value of x that occurs most frequently.
Data {1,3,7,3,2,3,6,7}
Mode : 3
Data {1,3,7,3,2,3,6,7,1,1}
Mode : 1 and 3
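As a quick check of both mode examples, the sketch below uses `multimode` from Python's standard `statistics` module (available from Python 3.8), which returns every value tied for the highest frequency. The variable names are illustrative.

```python
from statistics import multimode

# Data taken from the two examples above.
first_data = [1, 3, 7, 3, 2, 3, 6, 7]
second_data = [1, 3, 7, 3, 2, 3, 6, 7, 1, 1]

modes_first = multimode(first_data)
modes_second = multimode(second_data)

print(modes_first)   # [3]
print(modes_second)  # [1, 3]
```

The second data set is bimodal: 1 and 3 each occur three times.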
Geometric mean (G):
$$G = \sqrt[n]{x_1 x_2 \cdots x_n} \quad \text{or} \quad \log(G) = \frac{\sum_{i=1}^{n} \log(x_i)}{n}$$
For example,
in the area of psychometrics it is well known that
the rated intensity of a stimulus (e.g., brightness of a light)
is often a logarithmic function of the actual
intensity of the stimulus (brightness measured in units of Lux).
In this instance, the geometric mean is a better
"summary" of ratings than the simple mean.
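The two equivalent forms of the geometric mean (the n-th root of the product, and the antilog of the mean logarithm) can be compared in a short Python sketch. The intensity values below are hypothetical and chosen only because they are spread over a logarithmic scale; the function names are likewise illustrative.

```python
import math


def geometric_mean_root(values):
    """n-th root of the product of the values."""
    return math.prod(values) ** (1 / len(values))


def geometric_mean_log(values):
    """Antilog of the arithmetic mean of the logarithms."""
    mean_log = sum(math.log(x) for x in values) / len(values)
    return math.exp(mean_log)


# Hypothetical intensities, not taken from the lecture.
intensities = [1, 10, 100]

g_root = geometric_mean_root(intensities)
g_log = geometric_mean_log(intensities)
arithmetic = sum(intensities) / len(intensities)

print(g_root, g_log)  # both approximately 10
print(arithmetic)     # 37.0
```

For these values the geometric mean sits at the middle of the logarithmic scale (about 10), while the arithmetic mean (37) is dominated by the largest value.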
Harmonic Mean is a "summary" statistic used in
analyses of frequency data.
The harmonic mean is sometimes used to average
values that change in time.
$$HM = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$$
Example: for the values 10, 8, 4, 1:
$$HM = \frac{4}{\frac{1}{10} + \frac{1}{8} + \frac{1}{4} + \frac{1}{1}} \approx 2.71$$
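The harmonic mean of the four values 10, 8, 4 and 1 can be reproduced as follows. This is a minimal sketch: `harmonic_mean_manual` is an illustrative name, and the standard library function `statistics.harmonic_mean` is used only as a cross-check.

```python
from statistics import harmonic_mean


def harmonic_mean_manual(values):
    """Number of values divided by the sum of their reciprocals."""
    return len(values) / sum(1 / x for x in values)


values = [10, 8, 4, 1]

hm = harmonic_mean_manual(values)       # 4 / 1.475
hm_library = harmonic_mean(values)

print(round(hm, 2))          # 2.71
print(round(hm_library, 2))  # 2.71
```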
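$$SD = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}}$$
(when the data are a sample, n - 1 replaces n in the denominator, as in the examples that follow)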
• The units and dimensions of the standard deviation will be
those of the quantity expressed by $x_i$.
Example: $\bar{x} = 56/7 = 8$
$$SD^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{100}{7 - 1} = 16.67, \qquad SD = \sqrt{16.67} = 4.08$$
It is inherent in the definition of a mean
that the algebraic sum of all the deviations
(both positive and negative) is equal to 0,
so that the average deviation is 0.
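This property is easy to verify numerically. The sketch below reuses the surgeons' ages from the arithmetic mean example (mean 65); the variable names are illustrative.

```python
# Ages at death of the 7 surgeons from the arithmetic mean example.
ages = [70, 68, 68, 40, 71, 73, 65]

mean_age = sum(ages) / len(ages)

# Deviation of each value from the mean (positive and negative).
deviations = [x - mean_age for x in ages]
total_deviation = sum(deviations)

print(deviations)       # [5.0, 3.0, 3.0, -25.0, 6.0, 8.0, 0.0]
print(total_deviation)  # 0.0
```

Because the deviations always cancel, they are squared before being averaged when measuring variation.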
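First sample:

   x     (x - x̄)    (x - x̄)^2
   3       -2          4
   3       -2          4
   5        0          0
   6        1          1
   8        3          9
  Sum: 25   0         18

$$\bar{x} = \frac{\sum x}{n} = \frac{25}{5} = 5, \qquad s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{18}{4} = 4.5, \qquad s = \sqrt{s^2} = \sqrt{4.5} = 2.12$$

Second sample:

   x     (x - x̄)    (x - x̄)^2
   1       -4         16
   3       -2          4
   5        0          0
   6        1          1
  10        5         25
  Sum: 25   0         46

$$\bar{x} = \frac{\sum x}{n} = \frac{25}{5} = 5, \qquad s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{46}{4} = 11.5, \qquad s = \sqrt{s^2} = \sqrt{11.5} = 3.39$$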
NOTE: The sum of the deviations, $\sum_{i=1}^{n} (x_i - \bar{x})$, is always zero.
For example: Comparison of variances
The second set of data is more dispersed than
the first set, and therefore its variance is larger.
First sample:   3  3  5  6  8     s^2 = 4.5
Second sample:  1  3  5  6  10    s^2 = 11.5
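The comparison can be reproduced with a short Python sketch, taking the first sample as 3, 3, 5, 6, 8 (the values consistent with a mean of 5 and s^2 = 4.5). The function name `sample_variance` is illustrative; it uses the n - 1 divisor, as in the worked calculations.

```python
import math


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


first_sample = [3, 3, 5, 6, 8]
second_sample = [1, 3, 5, 6, 10]

var_first = sample_variance(first_sample)
var_second = sample_variance(second_sample)
sd_first = math.sqrt(var_first)
sd_second = math.sqrt(var_second)

print(var_first, round(sd_first, 2))    # 4.5 2.12
print(var_second, round(sd_second, 2))  # 11.5 3.39
```

Both samples have the same mean (5), so the difference in variance reflects only how widely the values are spread around it.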
iii - COEFFICIENT OF VARIATION (CV)
It is occasionally useful to describe the
variability by expressing the standard deviation SD
as a proportion, or a percentage, of the mean.
The resulting measure, called the COEFFICIENT
OF VARIATION, is thus a dimensionless
quantity (a pure number). In symbols:
$$CV = \frac{SD}{\bar{X}} \times 100$$
The CV is most useful as a descriptive tool to detect
normality of data series.
$$CV = \frac{4.08}{8} \times 100 = 51\%$$
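The same calculation as a minimal Python sketch, using the SD (4.08) and mean (8) from the standard deviation example. The function name `coefficient_of_variation` is an illustrative choice.

```python
def coefficient_of_variation(sd, mean):
    """Express the standard deviation as a percentage of the mean."""
    return sd / mean * 100


cv = coefficient_of_variation(4.08, 8)

print(round(cv))  # 51
```

Since SD and mean share the same units, the units cancel and the result is a pure percentage, which is what allows series measured in different units to be compared.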