BT Stat Unit 2
SKEWNESS, KURTOSIS, CORRELATION, REGRESSION
(i) skewness (ii) kurtosis.
Skewness: In a perfectly symmetrical distribution the mean, median and mode
coincide. Skewness measures the degree of asymmetry of a statistical
distribution. If a distribution is not symmetrical, we say that it is skewed.
$-0.7 = \dfrac{3(\text{Mean} - \text{Median})}{\sigma}$ => Mean - Median = -1.4
Mean - 12.8 = -1.4 => Mean = 12.8 - 1.4
Mean = 11.4
4. In a frequency distribution, the coefficient of skewness based upon
quartiles is 0.6. If the sum of the upper and lower quartiles is 100 and the
median is 38, estimate the value of the upper quartile.
Solution:
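Given $S_k = 0.6$, $Q_3 + Q_1 = 100$ ---(1), $M = 38$
$S_k = \dfrac{Q_3 + Q_1 - 2M}{Q_3 - Q_1}$
$0.6 = \dfrac{100 - 2(38)}{Q_3 - Q_1} = \dfrac{24}{Q_3 - Q_1}$
$Q_3 - Q_1 = \dfrac{24}{0.6} = 40$ ---(2)
Adding (1) and (2): $2Q_3 = 140$, so $Q_3 = 70$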
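5. Find the coefficient of skewness if the difference between the two quartiles
is 8, the sum of the two quartiles is 22 and the median is 10.5.
Solution:
Given $Q_3 + Q_1 = 22$, $Q_3 - Q_1 = 8$, $M = 10.5$
$S_k = \dfrac{Q_3 + Q_1 - 2M}{Q_3 - Q_1} = \dfrac{22 - 2(10.5)}{8} = \dfrac{22 - 21}{8} = \dfrac{1}{8} = 0.125$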
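The quartile-based calculations in Problems 4 and 5 can be checked with a short Python sketch. The function names `bowley_skewness` and `quartiles_from_sum_and_difference` are illustrative choices, not part of the notes.

```python
def bowley_skewness(q1, median, q3):
    """Quartile (Bowley) coefficient: (Q3 + Q1 - 2*Median) / (Q3 - Q1)."""
    return (q3 + q1 - 2 * median) / (q3 - q1)


def quartiles_from_sum_and_difference(total, difference):
    """Recover (Q1, Q3) from Q3 + Q1 = total and Q3 - Q1 = difference."""
    q3 = (total + difference) / 2
    q1 = (total - difference) / 2
    return q1, q3


# Problem 5: Q3 + Q1 = 22, Q3 - Q1 = 8, median = 10.5
q1, q3 = quartiles_from_sum_and_difference(22, 8)
print(bowley_skewness(q1, 10.5, q3))  # 0.125
```

For Problem 4 the same function returns 0.6 when called with Q1 = 30, median = 38 and Q3 = 70.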
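6. Calculate the coefficient of variation if Karl Pearson's coefficient of skewness
is 0.42, the mean is 86 and the median is 80.
Solution:
Given Pearson's coefficient of skewness $S_k = 0.42$, Mean = 86, Median = 80.
$S_k = \dfrac{3(\text{Mean} - \text{Median})}{\sigma}$
=> $0.42 = \dfrac{3(86 - 80)}{\sigma}$ => $\sigma = \dfrac{18}{0.42} = 42.857$
Coefficient of variation $= \dfrac{\sigma}{\text{Mean}} \times 100 = \dfrac{42.857}{86} \times 100 = 49.83$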
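As a rough check of Problem 6, the sketch below solves Pearson's median-based formula for the standard deviation and then forms the coefficient of variation. The helper names are hypothetical.

```python
def sd_from_pearson_skewness(sk, mean, median):
    """Solve S_k = 3 * (mean - median) / sd for sd."""
    return 3 * (mean - median) / sk


def coefficient_of_variation(sd, mean):
    """Standard deviation expressed as a percentage of the mean."""
    return sd / mean * 100


sd = sd_from_pearson_skewness(0.42, 86, 80)
print(round(sd, 3), round(coefficient_of_variation(sd, 86), 2))
```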
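7. The first four central moments of a distribution are 0, 2.5, 0.7 and
18.75. Write the skewness and kurtosis of the distribution.
Solution:
Here $\mu_1 = 0$, $\mu_2 = 2.5$, $\mu_3 = 0.7$, $\mu_4 = 18.75$.
The coefficient of skewness is given by
$\beta_1 = \dfrac{\mu_3^2}{\mu_2^3} = \dfrac{(0.7)^2}{(2.5)^3} = \dfrac{0.49}{15.625} = 0.0314$
Since $\mu_3$ is positive, the distribution is
positively skewed.
The measure of kurtosis is given by
$\beta_2 = \dfrac{\mu_4}{\mu_2^2} = \dfrac{18.75}{(2.5)^2} = \dfrac{18.75}{6.25} = 3$
Since $\beta_2 = 3$, the distribution is mesokurtic, i.e. as peaked as the normal curve.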
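The moment coefficients of Problem 7 can be sketched in Python as follows. The function name `beta_coefficients` is an illustrative assumption, and the example takes the fourth moment as 18.75, the value that yields a kurtosis of 3 as stated in the solution.

```python
def beta_coefficients(mu2, mu3, mu4):
    """Return (beta1, beta2) from the 2nd, 3rd and 4th central moments."""
    beta1 = mu3 ** 2 / mu2 ** 3   # skewness measure
    beta2 = mu4 / mu2 ** 2        # kurtosis measure (3 for a normal curve)
    return beta1, beta2


beta1, beta2 = beta_coefficients(2.5, 0.7, 18.75)
print(round(beta1, 4), beta2)
```

The sign of the skewness is read from the third moment, since the first coefficient is a squared quantity and is never negative.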
8. Karl Pearson's coefficient of skewness of a distribution is 0.32, its
standard deviation is 6.5 and the mean is 29.6. Calculate the mode and
the median. (L3)
Solution:
Given $S_k = 0.32$, $\sigma = 6.5$, Mean = 29.6
$S_k = \dfrac{3(\text{Mean} - \text{Median})}{\sigma}$ => $0.32 = \dfrac{3(29.6 - \text{Median})}{6.5}$
=> 0.32 x 6.5 = 88.8 - 3 Median
=> 3 Median = 88.8 - 2.08 = 86.72
Median = 86.72/3 = 28.91
Mean - Mode = 3(Mean - Median)
29.6 - Mode = 3(29.6 - 28.9067)
= 2.08
Mode = 29.6 - 2.08 = 27.52
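A minimal sketch of the Problem 8 calculation, assuming the empirical relation Mean - Mode = 3(Mean - Median) used in the solution. The function name `median_and_mode` is hypothetical.

```python
def median_and_mode(sk, sd, mean):
    """Invert S_k = 3*(mean - median)/sd, then apply mean - mode = 3*(mean - median)."""
    median = mean - sk * sd / 3
    mode = mean - 3 * (mean - median)
    return median, mode


median, mode = median_and_mode(0.32, 6.5, 29.6)
print(round(median, 2), round(mode, 2))
```

Carrying the median unrounded into the second step avoids the small rounding drift that appears when 28.90 is substituted by hand.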
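9. Compute the first four central moments for the following data: 8,
10, 11, 12, 14. (L3)
Solution:
$\bar{x} = \dfrac{\sum x}{n} = \dfrac{55}{5} = 11$
x        x-x̄     (x-x̄)^2   (x-x̄)^3   (x-x̄)^4
8        -3        9        -27        81
10       -1        1         -1         1
11        0        0          0         0
12        1        1          1         1
14        3        9         27        81
Total 55  0       20          0       164
$\mu_1 = \dfrac{0}{5} = 0$, $\mu_2 = \dfrac{20}{5} = 4$, $\mu_3 = \dfrac{0}{5} = 0$, $\mu_4 = \dfrac{164}{5} = 32.8$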
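The column sums in the table can be verified with a few lines of Python. This is a sketch for ungrouped data only; `central_moments` is an illustrative name.

```python
def central_moments(values, max_order=4):
    """Central moments mu_1..mu_max_order of ungrouped data (divisor n)."""
    n = len(values)
    mean = sum(values) / n
    return [sum((x - mean) ** r for x in values) / n
            for r in range(1, max_order + 1)]


print(central_moments([8, 10, 11, 12, 14]))
```

The odd-order moments come out as zero here because the data are symmetric about the mean of 11.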
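Solution:
Marks    Mid value (x)    f      d     fd     fd^2
0-10          5          10     -3    -30     90
10-20        15          15     -2    -30     60
20-30        25          24     -1    -24     24
30-40        35          25      0      0      0
40-50        45          10      1     10     10
50-60        55          10      2     20     40
60-70        65           6      3     18     54
Total                   100           -36    278
A = 35, c = 10, $d = \dfrac{x - A}{c}$
$\bar{x} = A + \dfrac{\sum fd}{N}\,c = 35 + \dfrac{-36}{100}(10) = 31.4$
Mode $= l + \dfrac{f_1 - f_0}{2f_1 - f_0 - f_2}\,c = 30 + \dfrac{25 - 24}{2(25) - 24 - 10}(10)$
$= 30 + \dfrac{10}{16} = 30.625$
$\sigma = c\sqrt{\dfrac{\sum fd^2}{N} - \left(\dfrac{\sum fd}{N}\right)^2} = 10\sqrt{\dfrac{278}{100} - \left(\dfrac{-36}{100}\right)^2}$
$= 10\sqrt{2.6504} = 16.28$
Coefficient of skewness $= \dfrac{\bar{x} - \text{Mode}}{\sigma} = \dfrac{31.4 - 30.625}{16.28} = 0.0476$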
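The step-deviation procedure used for grouped data (assumed mean A, class width c, interpolated mode) can be sketched as below. The function name and argument layout are assumptions made for illustration, and the sketch presumes equal class widths and a single modal class.

```python
import math


def pearson_skewness_grouped(lower_bounds, width, freqs, assumed_mean):
    """Return (mean, mode, sd, skewness) for equal-width classes."""
    n = sum(freqs)
    mids = [lb + width / 2 for lb in lower_bounds]
    d = [(m - assumed_mean) / width for m in mids]          # step deviations
    sum_fd = sum(f * di for f, di in zip(freqs, d))
    sum_fd2 = sum(f * di ** 2 for f, di in zip(freqs, d))
    mean = assumed_mean + sum_fd / n * width
    sd = width * math.sqrt(sum_fd2 / n - (sum_fd / n) ** 2)
    k = freqs.index(max(freqs))                              # modal class
    f1 = freqs[k]
    f0 = freqs[k - 1] if k > 0 else 0
    f2 = freqs[k + 1] if k + 1 < len(freqs) else 0
    mode = lower_bounds[k] + (f1 - f0) / (2 * f1 - f0 - f2) * width
    return mean, mode, sd, (mean - mode) / sd


# Marks table: classes 0-10, ..., 60-70 with A = 35
mean, mode, sd, sk = pearson_skewness_grouped(
    [0, 10, 20, 30, 40, 50, 60], 10, [10, 15, 24, 25, 10, 10, 6], 35)
print(round(mean, 2), round(mode, 3), round(sd, 2), round(sk, 4))
```

For classes written inclusively (10-19, 20-29, ...), the lower class boundaries (9.5, 19.5, ...) would be passed as `lower_bounds`.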
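1. Calculate Pearson's coefficient of skewness for the following data. (L3)
Class      10-19  20-29  30-39  40-49  50-59  60-69  70-79  80-89
Frequency    5      9     14     20     25     15      8      4
Solution:
Class        Mid value (x)    f      d     fd     fd^2
9.5-19.5        14.5          5     -3    -15     45
19.5-29.5       24.5          9     -2    -18     36
29.5-39.5       34.5         14     -1    -14     14
39.5-49.5       44.5         20      0      0      0
49.5-59.5       54.5         25      1     25     25
59.5-69.5       64.5         15      2     30     60
69.5-79.5       74.5          8      3     24     72
79.5-89.5       84.5          4      4     16     64
Total                       100            48    316
A = 44.5, c = 10, $d = \dfrac{x - A}{c}$
$\bar{x} = A + \dfrac{\sum fd}{N}\,c = 44.5 + \dfrac{48}{100}(10) = 49.3$
$\sigma = c\sqrt{\dfrac{\sum fd^2}{N} - \left(\dfrac{\sum fd}{N}\right)^2}$
$= 10\sqrt{\dfrac{316}{100} - \left(\dfrac{48}{100}\right)^2}$
$= 10\sqrt{2.9296}$
= 17.12
Mode $= l + \dfrac{f_1 - f_0}{2f_1 - f_0 - f_2}\,c$
$= 49.5 + \dfrac{25 - 20}{2(25) - 20 - 15}(10)$
$= 49.5 + \dfrac{50}{15} = 52.83$
Pearson's coefficient of skewness $= \dfrac{\bar{x} - \text{Mode}}{\sigma} = \dfrac{49.3 - 52.83}{17.12} = -0.206$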
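2. Calculate Pearson's coefficient of skewness for the following data. (L3)
Class      3-7  8-12  13-17  18-22  23-27  28-32  33-37  38-42
Frequency   2   108    580    175     80     32     18      5
Solution:
Class     Mid value (x)     f      d      fd     fd^2
3-7            5            2     -3      -6      18
8-12          10          108     -2    -216     432
13-17         15          580     -1    -580     580
18-22         20          175      0       0       0
23-27         25           80      1      80      80
28-32         30           32      2      64     128
33-37         35           18      3      54     162
38-42         40            5      4      20      80
Total                    1000           -584    1480
A = 20, c = 5, $d = \dfrac{x - A}{c}$
Mean $= A + \dfrac{\sum fd}{N}\,c = 20 + \dfrac{-584}{1000}(5) = 17.08$
The modal class is 13-17, with class boundaries 12.5-17.5.
Mode $= l + \dfrac{f_1 - f_0}{2f_1 - f_0 - f_2}\,c$
$= 12.5 + \dfrac{580 - 108}{2(580) - 108 - 175}(5)$
$= 12.5 + \dfrac{2360}{877} = 15.19$
$\sigma = c\sqrt{\dfrac{\sum fd^2}{N} - \left(\dfrac{\sum fd}{N}\right)^2}$
$= 5\sqrt{1.48 - (0.584)^2} = 5\sqrt{1.1389} = 5.34$
Pearson's coefficient of skewness $= \dfrac{\text{Mean} - \text{Mode}}{\sigma} = \dfrac{17.08 - 15.19}{5.34} = 0.354$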
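Solution:
This is discrete data.
The maximum frequency corresponds to x = 10, so Mode = 10. Take d = x - 10.
x      f     d     fd    fd^2
7      2    -3     -6     18
8     11    -2    -22     44
9     36    -1    -36     36
10    64     0      0      0
11    39     1     39     39
12    30     2     60    120
13    22     3     66    198
14     2     4      8     32
Total 206         109    487
$\sigma = \sqrt{\dfrac{\sum fd^2}{N} - \left(\dfrac{\sum fd}{N}\right)^2} = \sqrt{\dfrac{487}{206} - \left(\dfrac{109}{206}\right)^2} = \sqrt{2.0841} = 1.44$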
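$Q_1 = l_1 + \dfrac{N/4 - cf}{f}\,c = 58.07$ (lower boundary of the quartile class $l_1 = 50$)
$Q_3 = l_3 + \dfrac{3N/4 - cf}{f}\,c = 78.125$ (lower boundary of the quartile class $l_3 = 70$)
Bowley's coefficient of skewness $= \dfrac{Q_3 + Q_1 - 2\,\text{Median}}{Q_3 - Q_1}$ = 19.61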
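MOMENTS
$\mu_1 = \dfrac{\sum f(x - \bar{x})}{N}$
$\mu_2 = \dfrac{\sum f(x - \bar{x})^2}{N}$
$\mu_3 = \dfrac{\sum f(x - \bar{x})^3}{N}$
$\mu_4 = \dfrac{\sum f(x - \bar{x})^4}{N}$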
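6. Calculate the first four central moments for the following frequency
distribution. (L3)
x   0   1   2    3    4    5    6   7   8
f   1   8   28   56   70   56   28  8   1
Solution:
$\bar{x} = \dfrac{\sum fx}{N} = \dfrac{1024}{256} = 4$
x      f    x-x̄   f(x-x̄)   f(x-x̄)^2   f(x-x̄)^3   f(x-x̄)^4
0      1    -4      -4        16         -64         256
1      8    -3     -24        72        -216         648
2     28    -2     -56       112        -224         448
3     56    -1     -56        56         -56          56
4     70     0       0         0           0           0
5     56     1      56        56          56          56
6     28     2      56       112         224         448
7      8     3      24        72         216         648
8      1     4       4        16          64         256
Total 256            0       512           0        2816
$\mu_1 = \dfrac{\sum f(x - \bar{x})}{N} = \dfrac{0}{256} = 0$
$\mu_2 = \dfrac{\sum f(x - \bar{x})^2}{N} = \dfrac{512}{256} = 2$
$\mu_3 = \dfrac{\sum f(x - \bar{x})^3}{N} = \dfrac{0}{256} = 0$
$\mu_4 = \dfrac{\sum f(x - \bar{x})^4}{N} = \dfrac{2816}{256} = 11$
Since the distribution is symmetrical about x = 4, the odd central moments $\mu_1$ and $\mu_3$ vanish.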
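For a discrete frequency distribution the same moments can be computed directly from the x and f rows. The sketch below assumes the function name `frequency_central_moments`, which is not from the notes.

```python
def frequency_central_moments(xs, fs, max_order=4):
    """Return (mean, [mu_1..mu_max_order]) for values xs with frequencies fs."""
    n = sum(fs)
    mean = sum(f * x for x, f in zip(xs, fs)) / n
    moments = [sum(f * (x - mean) ** r for x, f in zip(xs, fs)) / n
               for r in range(1, max_order + 1)]
    return mean, moments


xs = [0, 1, 2, 3, 4, 5, 6, 7, 8]
fs = [1, 8, 28, 56, 70, 56, 28, 8, 1]
mean, moments = frequency_central_moments(xs, fs)
print(mean, moments)
```

Recomputing the column totals this way is a quick guard against arithmetic slips in the hand-built table.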
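7. Calculate the first four central moments for the following frequency
distribution. (L4)
Marks less than   80   70   60   50   40   30   20   10
Frequency        100   90   80   60   32   20   13    5
Solution:
$\bar{x} = A + \dfrac{\sum fd}{N}\,c$
Let $d = \dfrac{x - A}{c}$, c = 10, where x is the class mid-value and A is an assumed mean.
$\mu_2' = \dfrac{\sum fd^2}{N}\,c^2$
$\mu_3' = \dfrac{\sum fd^3}{N}\,c^3$
$\mu_4' = \dfrac{\sum fd^4}{N}\,c^4$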
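A "less than" table has to be converted to ordinary class frequencies before any moments are taken. The sketch below illustrates that step and then computes the central moments from class mid-values. It assumes equal classes of width 10 starting at 0 (0-10, 10-20, ..., 70-80), which the problem statement implies but does not spell out, and both function names are hypothetical.

```python
def class_frequencies(upper_limits, less_than_counts):
    """Turn cumulative 'less than' counts into per-class frequencies."""
    pairs = sorted(zip(upper_limits, less_than_counts))
    freqs, previous = [], 0
    for _, cumulative in pairs:
        freqs.append(cumulative - previous)
        previous = cumulative
    return [u for u, _ in pairs], freqs


def grouped_central_moments(upper_limits, freqs, width, max_order=4):
    """Mean and central moments using class mid-values (assumes equal widths)."""
    n = sum(freqs)
    mids = [u - width / 2 for u in upper_limits]
    mean = sum(f * m for f, m in zip(freqs, mids)) / n
    moments = [sum(f * (m - mean) ** r for f, m in zip(freqs, mids)) / n
               for r in range(1, max_order + 1)]
    return mean, moments


uppers, freqs = class_frequencies(
    [80, 70, 60, 50, 40, 30, 20, 10], [100, 90, 80, 60, 32, 20, 13, 5])
print(freqs)
print(grouped_central_moments(uppers, freqs, 10))
```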
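8. Calculate the moment measure of kurtosis from the following data. (L4)
x   2   4    6    8    10   12   14
f   4   11   18   27   20   16   8
Solution:
Take A = 8, c = 2, $d = \dfrac{x - A}{c}$
x      f     d     fd    fd^2   fd^3   fd^4
2      4    -3    -12     36    -108    324
4     11    -2    -22     44     -88    176
6     18    -1    -18     18     -18     18
8     27     0      0      0       0      0
10    20     1     20     20      20     20
12    16     2     32     64     128    256
14     8     3     24     72     216    648
Total 104          24    254     150   1442
Moments about A in units of d: $\mu_1' = \dfrac{24}{104} = 0.2308$, $\mu_2' = \dfrac{254}{104} = 2.4423$, $\mu_3' = \dfrac{150}{104} = 1.4423$, $\mu_4' = \dfrac{1442}{104} = 13.8654$
$\mu_2 = \mu_2' - \mu_1'^2 = 2.3891$
$\mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'\mu_1'^2 - 3\mu_1'^4 = 13.3059$
$\beta_2 = \dfrac{\mu_4}{\mu_2^2} = \dfrac{13.3059}{(2.3891)^2} = 2.33$ (the scale factor c cancels in this ratio)
Since $\beta_2 < 3$, the distribution is platykurtic.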
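The kurtosis calculation can be sketched in Python by converting moments about an assumed mean into central moments. The example uses the frequencies from the solution table (18 at x = 6) with A = 8 and c = 2; the function name `moment_kurtosis` is an illustrative assumption.

```python
def moment_kurtosis(xs, fs, assumed_mean, width):
    """beta_2 = mu_4 / mu_2**2 via raw moments of d = (x - A) / c."""
    n = sum(fs)
    d = [(x - assumed_mean) / width for x in xs]
    # Raw moments about the assumed mean, in units of d
    m1, m2, m3, m4 = (sum(f * di ** r for f, di in zip(fs, d)) / n
                      for r in (1, 2, 3, 4))
    mu2 = m2 - m1 ** 2
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4
    return mu4 / mu2 ** 2   # the width factor cancels in the ratio


xs = [2, 4, 6, 8, 10, 12, 14]
fs = [4, 11, 18, 27, 20, 16, 8]
print(round(moment_kurtosis(xs, fs, 8, 2), 2))
```

A value below 3 indicates a flatter-than-normal (platykurtic) curve, and a value above 3 a more peaked (leptokurtic) one.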
Types of correlation:
(i) Positive and negative
(ii) Simple, partial and multiple
(iii) Linear and non-linear
lines of regression.
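3. Calculate the coefficient of correlation between x and y from the following
data. (L3)
x   1   3   5   8    9    10
y   3   4   8   10   12   11
Solution:
x      y     x-x̄   y-ȳ   (x-x̄)^2   (y-ȳ)^2   (x-x̄)(y-ȳ)
1      3     -5    -5      25        25         25
3      4     -3    -4       9        16         12
5      8     -1     0       1         0          0
8     10      2     2       4         4          4
9     12      3     4       9        16         12
10    11      4     3      16         9         12
Total 36  48  0     0      64        70         65
$\bar{x} = \dfrac{\sum x}{n} = \dfrac{36}{6} = 6$, $\bar{y} = \dfrac{\sum y}{n} = \dfrac{48}{6} = 8$
$r = \dfrac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2}\,\sqrt{\sum (y - \bar{y})^2}} = \dfrac{65}{\sqrt{64}\,\sqrt{70}} = \dfrac{65}{66.93} = 0.971$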
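The deviation-from-mean formula for the correlation coefficient translates directly into code. The following is a minimal sketch using only the standard library; `pearson_r` is an illustrative name.

```python
import math


def pearson_r(xs, ys):
    """Karl Pearson's r from deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


print(round(pearson_r([1, 3, 5, 8, 9, 10], [3, 4, 8, 10, 12, 11]), 3))
```

The same function applies unchanged to any paired data set of equal length.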
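4. Calculate the coefficient of correlation between x and y. (L3)
x   1    2    3    4    5    6    7    8    9
y   12   11   13   15   14   17   16   19   18
Solution:
x      y      x-x̄   y-ȳ   (x-x̄)^2   (y-ȳ)^2   (x-x̄)(y-ȳ)
1     12      -4    -3      16         9         12
2     11      -3    -4       9        16         12
3     13      -2    -2       4         4          4
4     15      -1     0       1         0          0
5     14       0    -1       0         1          0
6     17       1     2       1         4          2
7     16       2     1       4         1          2
8     19       3     4       9        16         12
9     18       4     3      16         9         12
Total 45  135  0     0      60        60         56
$\bar{x} = \dfrac{45}{9} = 5$, $\bar{y} = \dfrac{135}{9} = 15$
$r = \dfrac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2}\,\sqrt{\sum (y - \bar{y})^2}} = \dfrac{56}{\sqrt{60}\,\sqrt{60}} = \dfrac{56}{60} = 0.933$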
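Solution:
x     y     z    d1=x-y   d2=y-z   d3=x-z   d1^2   d2^2   d3^2
1     3     6     -2       -3       -5        4      9     25
6     5     4      1        1        2        1      1      4
5     8     9     -3       -1       -4        9      1     16
10    4     8      6       -4        2       36     16      4
3     7     1     -4        6        2       16     36      4
2     10    2     -8        8        0       64     64      0
4     2     3      2       -1        1        4      1      1
9     1     10     8       -9       -1       64     81      1
7     6     5      1        1        2        1      1      4
8     9     7     -1        2        1        1      4      1
Total                                        200    214     60
The rank correlation between x & y is
$\rho(x, y) = 1 - \dfrac{6\sum d_1^2}{n(n^2 - 1)} = 1 - \dfrac{6(200)}{10(99)} = 1 - 1.2121 = -0.2121$
The rank correlation between y & z is
$\rho(y, z) = 1 - \dfrac{6\sum d_2^2}{n(n^2 - 1)} = 1 - \dfrac{6(214)}{10(99)} = 1 - 1.2970 = -0.2970$
The rank correlation between x & z is
$\rho(x, z) = 1 - \dfrac{6\sum d_3^2}{n(n^2 - 1)} = 1 - \dfrac{6(60)}{10(99)} = 1 - 0.3636 = 0.6364$
Since $\rho(x, z)$ is maximum and also positive, we conclude that the pair of
judges x & z has the nearest approach to common likings in music.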
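The three rank correlations can be reproduced with a short sketch of Spearman's formula. It assumes the inputs are already ranks with no ties, as in the table, and the name `spearman_rho` is hypothetical.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman's rank correlation, 1 - 6*sum(d^2) / (n*(n^2 - 1)); no ties assumed."""
    n = len(ranks_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


x = [1, 6, 5, 10, 3, 2, 4, 9, 7, 8]
y = [3, 5, 8, 4, 7, 10, 2, 1, 6, 9]
z = [6, 4, 9, 8, 1, 2, 3, 10, 5, 7]
for label, (a, b) in {"x,y": (x, y), "y,z": (y, z), "x,z": (x, z)}.items():
    print(label, round(spearman_rho(a, b), 4))
```

The pair with the largest positive coefficient is the one whose rankings agree most closely.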
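Solution:
x      y     x-x̄ = x-32   y-ȳ = y-38   (x-x̄)^2   (y-ȳ)^2   (x-x̄)(y-ȳ)
25    43        -7            5           49        25        -35
28    46        -4            8           16        64        -32
35    49         3           11            9       121         33
32    41         0            3            0         9          0
31    36        -1           -2            1         4          2
36    32         4           -6           16        36        -24
29    31        -3           -7            9        49         21
38    30         6           -8           36        64        -48
34    33         2           -5            4        25        -10
32    39         0            1            0         1          0
Total 320  380   0            0          140       398        -93
Here $\bar{x} = \dfrac{\sum x}{n} = \dfrac{320}{10} = 32$ and $\bar{y} = \dfrac{\sum y}{n} = \dfrac{380}{10} = 38$
Coefficient of regression of y on x is
$b_{yx} = \dfrac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \dfrac{-93}{140} = -0.6643$
Coefficient of regression of x on y is
$b_{xy} = \dfrac{\sum (x - \bar{x})(y - \bar{y})}{\sum (y - \bar{y})^2} = \dfrac{-93}{398} = -0.2337$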
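Both regression coefficients share the same numerator and differ only in the sum of squares used as the denominator. A minimal Python sketch, with the hypothetical name `regression_coefficients`:

```python
def regression_coefficients(xs, ys):
    """Return (b_yx, b_xy) computed from deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sxx, sxy / syy


xs = [25, 28, 35, 32, 31, 36, 29, 38, 34, 32]
ys = [43, 46, 49, 41, 36, 32, 31, 30, 33, 39]
b_yx, b_xy = regression_coefficients(xs, ys)
print(round(b_yx, 4), round(b_xy, 4))
```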
Solution:
Let production be denoted by the variable x and capacity utilization by y.
The regression equation of y on x is
$y - \bar{y} = b_{yx}(x - \bar{x})$ ----------------------(1)
where $b_{yx} = r\dfrac{\sigma_y}{\sigma_x} = 0.62\,\dfrac{\sigma_y}{\sigma_x} = 0.5019$
and $\bar{x} = 35.6$, $\bar{y} = 84.8$
(1) => y - 84.8 = 0.5019 (x - 35.6)
y = 66.9324 + 0.5019 x
which is the required regression of capacity utilization on production.
The regression equation of x on y is
$x - \bar{x} = b_{xy}(y - \bar{y})$ -------------------------(2)
where $b_{xy} = r\dfrac{\sigma_x}{\sigma_y} = 0.62\,\dfrac{\sigma_x}{\sigma_y} = 0.7659$
(2) => x - 35.6 = 0.7659 (y - 84.8)
x = 35.6 + 0.7659 y - 64.9483
= 0.7659 y - 29.3483
When y = 70, x = 0.7659(70) - 29.3483
= 24.2647
Hence the estimated production is 24.2647 (in the units in which production is
measured) when the capacity utilization is 70 percent.
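The two regression lines can be sketched in code from the means and the regression coefficients quoted in the solution (35.6, 84.8, 0.5019 and 0.7659). The helper names `regression_line` and `predict` are illustrative.

```python
def regression_line(mean_dependent, mean_independent, slope):
    """Return (intercept, slope) of: dependent = intercept + slope * independent."""
    return mean_dependent - slope * mean_independent, slope


def predict(line, value):
    intercept, slope = line
    return intercept + slope * value


# y on x (capacity utilization on production) and x on y
y_on_x = regression_line(84.8, 35.6, 0.5019)
x_on_y = regression_line(35.6, 84.8, 0.7659)
print(round(y_on_x[0], 4), round(x_on_y[0], 4))
print(round(predict(x_on_y, 70), 4))
```

To estimate x from a given y the line of x on y is used, not the inverse of the line of y on x.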
$r = \pm\sqrt{b_{yx}\,b_{xy}} = \pm 0.6$
Since both the regression coefficients are positive, r must be positive:
r = 0.6
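The sign rule can be made explicit in a small sketch: the correlation coefficient is the geometric mean of the two regression coefficients and takes their common sign. The input values 0.9 and 0.4 below are hypothetical, chosen only so that their product is 0.36, and the function name is an assumption.

```python
import math


def correlation_from_regression(b_yx, b_xy):
    """r = +/- sqrt(b_yx * b_xy), carrying the common sign of the coefficients."""
    if b_yx * b_xy < 0:
        raise ValueError("regression coefficients must have the same sign")
    r = math.sqrt(b_yx * b_xy)
    return r if b_yx >= 0 else -r


print(round(correlation_from_regression(0.9, 0.4), 4))    # hypothetical inputs
print(round(correlation_from_regression(-0.9, -0.4), 4))  # both negative gives negative r
```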