FYBCOM Correlation Regression
COM
Sem-II
Mathematics
Module-I: Derivatives and their Applications (Marks: 20)
Module-II: Interest, Annuity (Marks: 20)
Statistics
Module-III: Correlation, Regression (Marks: 20)
Module-IV: Index Numbers, Time Series (Marks: 20)
Module-V: Probability Distribution (Marks: 20)
Prof. Anil Khadse, NKTT College, Thane
Correlation
Correlation measures the relationship between two quantitative variables.
It describes how the variables move together, but it does not by itself
allow us to infer a causal relationship.
Scatter Diagram
The points are plotted on a graph, and from the direction in which the
points move we can draw a conclusion about the relationship.
[Figures: scatter diagrams illustrating No Correlation, Perfect Positive and Perfect Negative correlation; an example plot with axes Reliability and Age of Car]
Karl Pearson's Coefficient of Correlation (r) / Product Moment
\[ r = \frac{\mathrm{Cov}(x,y)}{\sigma_x\,\sigma_y} = \frac{\sum (x-\bar{x})(y-\bar{y})}{n\,\sigma_x\,\sigma_y} \qquad \text{(iii)} \]
5. If r = 0, there is no correlation
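X   12   10    8   13    7
Y   15   20   25   18   22
Here the mean of x is 10 and the mean of y is 20, so Σ(x − x̄)(y − ȳ) = −32, Σ(x − x̄)² = 26 and Σ(y − ȳ)² = 58.
\[ r = \frac{\sum (x-\bar{x})(y-\bar{y})}{\sqrt{\sum (x-\bar{x})^2}\,\sqrt{\sum (y-\bar{y})^2}} = \frac{-32}{\sqrt{26}\,\sqrt{58}} = \frac{-32}{5.099 \times 7.6157} = \frac{-32}{38.8324} = -0.8240 \]
There is negative correlation.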
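The deviation method used above can be checked with a short script. This is only a minimal sketch: the function name `pearson_r` is an illustrative choice and not part of the slides, and the data are the X and Y values of the example above.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Karl Pearson's r computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]          # x - x_bar
    dy = [y - mean_y for y in ys]          # y - y_bar
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    sum_dx2 = sum(a * a for a in dx)
    sum_dy2 = sum(b * b for b in dy)
    return sum_dxdy / sqrt(sum_dx2 * sum_dy2)

x = [12, 10, 8, 13, 7]
y = [15, 20, 25, 18, 22]
print(round(pearson_r(x, y), 4))  # -0.824
```

The three sums inside the function correspond to −32, 26 and 58 in the hand calculation.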
Example # 2
Find Karl Pearson’s coefficient of correlation for the following data
Serial No.   Age (years)   Weight (kg)
1            7             12
2            6             8
3            8             12
4            5             10
5            6             11
6            9             13
\[ r = \frac{461-451}{\sqrt{291-280.1666}\,\sqrt{742-726}} = \frac{10}{\sqrt{10.8334}\,\sqrt{16}} = \frac{10}{3.2914 \times 4} = \frac{10}{13.1656} = 0.7595 \]
Example #3: Calculate the coefficient of correlation from the following
information
n = 12, Σx = 35, Σy = 60, Σx² = 148, Σy² = 450, Σxy = 105
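\[ r = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sqrt{\sum x^2 - \frac{(\sum x)^2}{n}}\;\sqrt{\sum y^2 - \frac{(\sum y)^2}{n}}} \]
\[ r = \frac{105 - \frac{35 \times 60}{12}}{\sqrt{148 - \frac{35^2}{12}}\;\sqrt{450 - \frac{60^2}{12}}} = \frac{105 - 175}{\sqrt{148 - \frac{1225}{12}}\;\sqrt{450 - \frac{3600}{12}}} \]
\[ r = \frac{-70}{\sqrt{148 - 102.0833}\;\sqrt{450 - 300}} = \frac{-70}{\sqrt{45.9167}\;\sqrt{150}} = \frac{-70}{6.7762 \times 12.2474} = \frac{-70}{82.9908} \]
\[ r = -0.84 \]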
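When only the totals are known, as in Example #3, the same formula can be evaluated directly from the sums. The helper below is a sketch under that assumption; the name `r_from_sums` and its argument order are illustrative and do not come from the slides.

```python
from math import sqrt

def r_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Pearson's r from n and the five raw totals."""
    numerator = sum_xy - sum_x * sum_y / n
    spread_x = sum_x2 - sum_x ** 2 / n     # equals sum of (x - x_bar)^2
    spread_y = sum_y2 - sum_y ** 2 / n     # equals sum of (y - y_bar)^2
    return numerator / (sqrt(spread_x) * sqrt(spread_y))

# Example #3: totals given in the problem statement
r3 = r_from_sums(12, 35, 60, 148, 450, 105)
print(round(r3, 2))  # -0.84

# Example #2: totals worked out from the age/weight table
r2 = r_from_sums(6, 41, 66, 291, 742, 461)
print(r2)
```

For Example #2 the script gives a value of about 0.76, in line with the hand calculation; small differences in the fourth decimal place come from rounding the intermediate square roots by hand.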
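\[ r = \frac{\sum (x-\bar{x})(y-\bar{y})}{\sqrt{\sum (x-\bar{x})^2}\,\sqrt{\sum (y-\bar{y})^2}} = \frac{111}{\sqrt{84}\,\sqrt{158}} = \frac{111}{9.1651 \times 12.5698} = \frac{111}{115.2035} = 0.9635 \]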
Example #5: Calculate the coefficient of correlation from the following results
n = 8, s.d. of x = 3.86, s.d. of y = 6.57, Σ(x − x̄)(y − ȳ) = 192
\[ r = \frac{\sum (x-\bar{x})(y-\bar{y})}{n\,\sigma_x\,\sigma_y} = \frac{192}{8 \times 3.86 \times 6.57} = \frac{192}{202.8816} = 0.9463 \approx 0.95 \]
X   14    8   10   11    9   13    5
Y   14    9   11   13   11   12    4
Ans: 0.9231
R1   R2   d = R1 − R2   d²
5    4    1             1
2    2    0             0
1    3    -2            4
4    5    -1            1
3    1    2             4
                        Σd² = 10
n = 5
\[ R = 1 - \frac{6\sum d^2}{n(n^2-1)} = 1 - \frac{6 \times 10}{5(5^2-1)} = 1 - \frac{60}{5(25-1)} = 1 - \frac{60}{5(24)} = 1 - \frac{60}{120} = 1 - 0.5 = 0.5 \]
Example #2: Find the rank correlation coefficient from the following data
Demand   15   22   20   30   25
Supply   18   25   26   28   20
Sol: Let R1 denote the rank for demand and R2 the rank for supply
x    y    R1   R2   d    d²
15   18   5    5    0    0
22   25   3    3    0    0
20   26   4    2    2    4
30   28   1    1    0    0
25   20   2    4    -2   4
                    Σd² = 8
n = 5
\[ R = 1 - \frac{6\sum d^2}{n(n^2-1)} = 1 - \frac{6 \times 8}{5(5^2-1)} = 1 - \frac{48}{5(25-1)} = 1 - \frac{48}{5(24)} = 1 - \frac{48}{120} = 1 - 0.4 = 0.6 \]
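The ranking and the formula for R can be sketched in a few lines. The helper names `ranks_desc` and `spearman_r` are illustrative assumptions; as in the table for the demand and supply data, rank 1 goes to the largest value, and the sketch assumes that no value is repeated.

```python
def ranks_desc(values):
    """Rank 1 for the largest value; assumes no repeated values."""
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 for v in values]

def spearman_r(xs, ys):
    """Rank correlation R = 1 - 6*sum(d^2) / (n(n^2 - 1))."""
    n = len(xs)
    r1, r2 = ranks_desc(xs), ranks_desc(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))

demand = [15, 22, 20, 30, 25]
supply = [18, 25, 26, 28, 20]
print(ranks_desc(demand))                     # [5, 3, 4, 1, 2]
print(ranks_desc(supply))                     # [5, 3, 2, 1, 4]
print(round(spearman_r(demand, supply), 4))   # 0.6
```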
Example# 3
The coefficient of rank correlation between marks in two subjects
obtained by a group of students is 0.8. If the sum of squares of the
differences in ranks is 33. Find the number of students in the group.
Given: Rank correlation R = 0.8, Σd² = 33, n = ?
\[ R = 1 - \frac{6\sum d^2}{n(n^2-1)} \]
\[ 0.8 = 1 - \frac{6 \times 33}{n(n^2-1)} \]
\[ 0.8 - 1 = -\frac{198}{n(n^2-1)} \]
\[ -0.2 = -\frac{198}{n(n^2-1)} \]
\[ n(n^2-1) = \frac{198}{0.2} = 990 = 10(10^2-1) \]
\[ n = 10 \]
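The last step, finding the n for which n(n² − 1) = 990, is done by inspection on the slide. A small search does the same thing; this is a sketch only, and the function name and the upper limit `max_n` are assumptions made for illustration.

```python
def students_from_rank_corr(R, sum_d2, max_n=1000):
    """Find n such that R = 1 - 6*sum_d2 / (n(n^2 - 1)), by trial."""
    target = 6 * sum_d2 / (1 - R)          # required value of n(n^2 - 1)
    for n in range(2, max_n + 1):
        # small tolerance because 1 - R is not exact in floating point
        if abs(n * (n * n - 1) - target) < 1e-6:
            return n
    return None                            # no whole number fits

print(students_from_rank_corr(0.8, 33))  # 10
```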
Rank correlation for repeated values:
\[ C.F = \frac{1}{12}\left[ m_1(m_1^2-1) + m_2(m_2^2-1) + \cdots \right] \]
\[ R = 1 - \frac{6\left(\sum d^2 + C.F\right)}{n(n^2-1)} \]
Example #2: Find Rank correlation coefficient from the following data
X 25 20 20 18 32 35
y 58 55 48 62 55 40
Sol: Let R1 denote the rank of x and R2 the rank of y
x    y    R1    R2    d      d²
25   58   3     2     1      1
20   55   4.5   3.5   1      1
20   48   4.5   5     -0.5   0.25
18   62   6     1     5      25
32   55   2     3.5   -1.5   2.25
35   40   1     6     -5     25
                      Σd² = 54.5
In x, 20 is repeated twice: Rank = (4 + 5)/2 = 4.5, m1 = 2
In y, 55 is repeated twice: Rank = (3 + 4)/2 = 3.5, m2 = 2
\[ C.F = \frac{1}{12}\left[ m_1(m_1^2-1) + m_2(m_2^2-1) \right] = \frac{1}{12}\left[ 2(2^2-1) + 2(2^2-1) \right] = \frac{1}{12}\left[ 2(4-1) + 2(4-1) \right] = \frac{1}{12}\left[ 6 + 6 \right] = \frac{12}{12} = 1 \]
\[ R = 1 - \frac{6\left(\sum d^2 + C.F\right)}{n(n^2-1)} = 1 - \frac{6(54.5 + 1)}{6(6^2-1)} = 1 - \frac{333}{6(36-1)} = 1 - \frac{333}{6 \times 35} = 1 - \frac{333}{210} \]
\[ R = 1 - 1.5857 \]
\[ R = -0.5857 \]
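For repeated values, the averaging of ranks and the correction factor C.F can be sketched as below. The function names are illustrative assumptions; the sketch ranks the largest value as 1, gives tied values the mean of the positions they occupy, and adds m(m² − 1)/12 for every group of m repeated values in either series.

```python
from collections import Counter

def average_ranks_desc(values):
    """Rank 1 for the largest value; tied values share the mean rank."""
    order = sorted(values, reverse=True)
    ranks = []
    for v in values:
        first = order.index(v) + 1         # first position taken by v
        count = order.count(v)             # how many times v occurs
        ranks.append(first + (count - 1) / 2)
    return ranks

def correction_factor(*series):
    """C.F = (1/12) * sum of m(m^2 - 1) over every repeated value."""
    total = 0
    for values in series:
        for m in Counter(values).values():
            if m > 1:
                total += m * (m * m - 1)
    return total / 12

def spearman_r_ties(xs, ys):
    n = len(xs)
    r1, r2 = average_ranks_desc(xs), average_ranks_desc(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
    cf = correction_factor(xs, ys)
    return 1 - 6 * (sum_d2 + cf) / (n * (n * n - 1))

x = [25, 20, 20, 18, 32, 35]
y = [58, 55, 48, 62, 55, 40]
print(average_ranks_desc(x))              # [3.0, 4.5, 4.5, 6.0, 2.0, 1.0]
print(correction_factor(x, y))            # 1.0
print(round(spearman_r_ties(x, y), 4))    # -0.5857
```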
Regression Equation of y on x
Regression Equation of x on y
X 8 7 10 9 5 6
y 11 8 12 13 8 10
x     y     x²     y²     xy
8     11    64     121    88
7     8     49     64     56
10    12    100    144    120
9     13    81     169    117
5     8     25     64     40
6     10    36     100    60
Σx = 45, Σy = 62, Σx² = 355, Σy² = 662, Σxy = 481
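n = 6
\[ \bar{x} = \frac{\sum x}{n} = \frac{45}{6} = 7.5, \qquad \bar{y} = \frac{\sum y}{n} = \frac{62}{6} = 10.33 \]
\[ b_{yx} = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}} = \frac{481 - \frac{45 \times 62}{6}}{355 - \frac{45^2}{6}} = \frac{481 - 465}{355 - 337.5} = \frac{16}{17.5} = 0.91 \]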
\[ b_{xy} = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sum y^2 - \frac{(\sum y)^2}{n}} = \frac{16}{662 - \frac{62^2}{6}} = \frac{16}{662 - 640.666} = \frac{16}{21.34} = 0.75 \]
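Regression equation of y on x is given by
\[ y = b_{yx}(x - \bar{x}) + \bar{y} = 0.91(x - 7.5) + 10.33 = 0.91x - 6.825 + 10.33 \]
\[ y = 0.91x + 3.505 \]
Regression equation of x on y is given by
\[ x = b_{xy}(y - \bar{y}) + \bar{x} = 0.75(y - 10.33) + 7.5 = 0.75y - 7.7475 + 7.5 \]
\[ x = 0.75y - 0.2475 \]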
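The two regression coefficients and both lines for the six (x, y) pairs in the table can be reproduced with the sketch below. The function name `regression_coefficients` is an illustrative assumption, and the formulas are the raw-sum formulas for byx and bxy used in the hand calculation.

```python
def regression_coefficients(xs, ys):
    """Return (byx, bxy) computed from the raw sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sxy = sum_xy - sum_x * sum_y / n           # shared numerator
    byx = sxy / (sum_x2 - sum_x ** 2 / n)      # slope of y on x
    bxy = sxy / (sum_y2 - sum_y ** 2 / n)      # slope of x on y
    return byx, bxy

x = [8, 7, 10, 9, 5, 6]
y = [11, 8, 12, 13, 8, 10]
byx, bxy = regression_coefficients(x, y)
mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)

# Each line passes through (mean_x, mean_y), which fixes its intercept.
a_yx = mean_y - byx * mean_x
a_xy = mean_x - bxy * mean_y
print(f"y = {byx:.2f}x {a_yx:+.3f}")
print(f"x = {bxy:.2f}y {a_xy:+.3f}")
```

Because the script does not round byx, bxy and the means before combining them, its intercepts differ slightly from the hand-worked 3.505 and −0.2475.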
𝑦 = 54.18 − 2.9
Regression equation of x on y is given by
\[ x = b_{xy}(y - \bar{y}) + \bar{x} \]
\[ x = 0.71(y - 53) + 65 \]
\[ x = 0.71y - 0.71 \times 53 + 65 \]
\[ x = 0.71y - 37.63 + 65 \]
\[ x = 0.71y + 27.37 \]
When y = 50
\[ x = 0.71 \times 50 + 27.37 = 35.5 + 27.37 = 62.87 \]
Example # 3
For bivariate distribution, Mean of x = 43, Mean of y = 37
Regression coeff. of y on x = 0.59
Regression Coeff of x on y = 0.72
Find two regression equations and estimate
i) Likely value of y when x = 40
ii) Likely value of x when y = 35
Given: x̄ = 43, ȳ = 37, byx = 0.59, bxy = 0.72
Regression equation of y on x is given by
\[ y = b_{yx}(x - \bar{x}) + \bar{y} \]
\[ y = 0.59(x - 43) + 37 \]
\[ y = 0.59x - 25.37 + 37 \]
\[ y = 0.59x + 11.63 \]
When x = 40
\[ y = 0.59 \times 40 + 11.63 = 23.6 + 11.63 \]
\[ y = 35.23 \]
Regression equation of x on y is given by
\[ x = b_{xy}(y - \bar{y}) + \bar{x} \]
\[ x = 0.72(y - 37) + 43 \]
\[ x = 0.72y - 26.64 + 43 \]
\[ x = 0.72y + 16.36 \]
Put y = 35
\[ x = 0.72 \times 35 + 16.36 \]
\[ x = 25.2 + 16.36 \]
\[ x = 41.56 \]
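When only the means and the regression coefficients are given, each line follows from the point-slope form. The sketch below uses the values stated in Example #3 (x̄ = 43, ȳ = 37, byx = 0.59, bxy = 0.72); the helper name `line_through_means` is an assumption made for illustration.

```python
def line_through_means(slope, mean_predictor, mean_response):
    """Return (slope, intercept) of
    response = slope * (predictor - mean_predictor) + mean_response."""
    intercept = mean_response - slope * mean_predictor
    return slope, intercept

mean_x, mean_y = 43, 37
byx, bxy = 0.59, 0.72

# y on x: used to estimate y from a given x
slope_yx, intercept_yx = line_through_means(byx, mean_x, mean_y)
y_at_40 = slope_yx * 40 + intercept_yx

# x on y: used to estimate x from a given y
slope_xy, intercept_xy = line_through_means(bxy, mean_y, mean_x)
x_at_35 = slope_xy * 35 + intercept_xy

print(round(intercept_yx, 2), round(y_at_40, 2))
print(round(intercept_xy, 2), round(x_at_35, 2))
```

The line of y on x is used only to estimate y, and the line of x on y only to estimate x.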
Properties of regression lines
\[ r = \pm\sqrt{b_{yx}\, b_{xy}} \]
The sign of the correlation coefficient r depends on the sign of the regression coefficients.
i) r is positive if both the regression coefficients are positive
ii) r is negative if both the regression coefficients are negative
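\[ b_{yx} = -\frac{\text{Coefficient of } x}{\text{Coefficient of } y} = -\frac{1}{3} \]
\[ r = \pm\sqrt{b_{yx}\, b_{xy}} = \pm\sqrt{\left(-\frac{1}{3}\right) \times \left(-\frac{1}{2}\right)} = -\sqrt{\frac{1}{6}} \]
\[ r = -\sqrt{0.1666} \]
\[ r = -0.4082 \]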
Example # 6
Given two regression equations 2𝑥 + 3𝑦 = 5 and 𝑥 + 𝑦 = 2.
Find i) Mean values of x and y
ii) Coefficient of correlation
SOL: To find the mean values, solve the two regression equations
2x + 3y = 5 ... (i)
x + y = 2 ... (ii)
Multiply (ii) by 2 and subtract it from (i):
2x + 3y = 5
2x + 2y = 4
y = 1
Put y = 1 in equation (ii): x + 1 = 2 ⇒ x = 1
x̄ = 1 and ȳ = 1
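ii) To find r
Let the regression equation of y on x be 2x + 3y = 5
\[ b_{yx} = -\frac{\text{Coefficient of } x}{\text{Coefficient of } y} = -\frac{2}{3} \]
Then the regression equation of x on y is x + y = 2
\[ b_{xy} = -\frac{\text{Coefficient of } y}{\text{Coefficient of } x} = -\frac{1}{1} = -1 \]
Both regression coefficients are negative, so r is negative:
\[ r = -\sqrt{b_{yx}\, b_{xy}} = -\sqrt{\frac{2}{3}} = -0.8165 \]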
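Both parts of Example # 6 can be checked with a short script. This is a sketch under stated assumptions: each regression equation is written as a tuple (a, b, c) meaning ax + by = c, the function names are illustrative, and the first equation is taken as the line of y on x, as in the solution above.

```python
from math import sqrt

def solve_means(eq1, eq2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule.
    The regression lines intersect at (mean of x, mean of y)."""
    a1, b1, c1 = eq1
    a2, b2, c2 = eq2
    det = a1 * b2 - a2 * b1
    return (c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det

def r_from_lines(y_on_x, x_on_y):
    """r = +/- sqrt(byx * bxy), with the common sign of the coefficients."""
    a1, b1, _ = y_on_x
    a2, b2, _ = x_on_y
    byx = -a1 / b1                 # -(coefficient of x)/(coefficient of y)
    bxy = -b2 / a2                 # -(coefficient of y)/(coefficient of x)
    product = byx * bxy
    if not 0 <= product <= 1:
        raise ValueError("byx * bxy must lie in [0, 1]; swap the two lines")
    sign = 1 if byx > 0 else -1
    return sign * sqrt(product)

eq_i = (2, 3, 5)    # 2x + 3y = 5, assumed to be y on x
eq_ii = (1, 1, 2)   # x + y = 2, assumed to be x on y
print(solve_means(eq_i, eq_ii))              # (1.0, 1.0)
print(round(r_from_lines(eq_i, eq_ii), 4))   # -0.8165
```

If the product byx · bxy came out greater than 1, the assumption about which equation is y on x would be wrong and the two roles would have to be swapped; the `ValueError` flags that case.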
\[ b_{xy} = r\,\frac{\sigma_x}{\sigma_y} \]
\[ \frac{1}{3} = 0.4714 \times \frac{\sigma_x}{2} \]
\[ \frac{2}{3} = 0.4714\,\sigma_x \]
\[ \sigma_x = 1.41 \]