PS - Module 3 - ViRa
Vignesh Ravi
Contact: [email protected]
1 / 104
Outline
Correlation
Types of correlation
Coefficient of correlation
Problems
Rank Correlation Coefficient
Regression
Problems
Partial Correlation
Multi-linear Regression
Problems for Practice
2 / 104
Correlation
3 / 104
Types of correlation
4 / 104
Types of correlation
Positive correlation
If two variables tend to move together in the same direction, that is,
an increase in the value of one variable is accompanied by an
increase in the value of the other variable, or a decrease in the
value of one variable is accompanied by a decrease in the value of
the other variable, then the correlation is called “positive or direct
correlation”.
Example
height and weight, rainfall and yield of crops, price and supply
5 / 104
Types of correlation
Negative correlation
If two variables tend to move in opposite directions, that
is, an increase (or decrease) in the value of one variable is
accompanied by a decrease (or increase) in the value of the other
variable, then the correlation is called “negative or inverse
correlation”.
Example
6 / 104
Types of correlation
Simple correlation
When only two variables are studied, the relationship is described
as simple correlation.
Example
Quantity of money and price level, demand and price.
7 / 104
Types of correlation
Multiple correlation
When more than two variables are studied simultaneously, the
relationship is described as multiple correlation.
Example
The relationship of price, demand and supply of a commodity.
8 / 104
Types of correlation
Partial correlation
The study of two variables excluding some other variables is called
“partial correlation”.
Example
Price and demand, eliminating the supply side.
Note
In total correlation, all the facts are taken into account.
9 / 104
Types of correlation
Linear correlation
If the ratio of change between two variables is uniform, then there
will be linear correlation between them.
10 / 104
Types of correlation
11 / 104
Coefficient of correlation
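\[ r = \frac{\text{covariance of } x \text{ and } y}{\sigma_x \sigma_y} \]
Direct method
\[ r = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{\sqrt{n\sum x_i^2 - \left(\sum x_i\right)^2}\,\sqrt{n\sum y_i^2 - \left(\sum y_i\right)^2}} \]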
12 / 104
Coefficient of correlation
\[ r = \frac{\sum XY}{\sqrt{\sum X^2}\,\sqrt{\sum Y^2}} \]
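The two forms of r can be checked against each other numerically. The sketch below is illustrative only: the function names are hypothetical, X and Y in the second form are assumed to be deviations from the respective means, and the data are taken from practice problem 9 of this module.

```python
from math import sqrt

def pearson_direct(xs, ys):
    """Direct method: works on raw sums, no means required."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / (sqrt(n * sxx - sx ** 2) * sqrt(n * syy - sy ** 2))

def pearson_deviation(xs, ys):
    """Deviation method: X, Y are assumed to be deviations from the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    dx = [x - mx for x in xs]
    dy = [y - my for y in ys]
    num = sum(a * b for a, b in zip(dx, dy))
    return num / (sqrt(sum(a * a for a in dx)) * sqrt(sum(b * b for b in dy)))

xs = [1, 2, 3, 4, 5]
ys = [2, 5, 3, 8, 7]
print(round(pearson_direct(xs, ys), 4))     # 0.8062
print(round(pearson_deviation(xs, ys), 4))  # 0.8062
```

The direct method avoids fractional means and is usually easier by hand; the deviation method is convenient when the means are whole numbers.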
13 / 104
Properties of correlation coefficient
14 / 104
Problem 1
Solution:
15 / 104
Problem 1 Contd.
16 / 104
Problem 2
Solution:
17 / 104
Problem 2 Contd.
18 / 104
Problem 3
19 / 104
Problem 3 Contd.
Solution:
20 / 104
Problem 3 Contd.
21 / 104
Problem 4
22 / 104
Problem 4 Contd.
23 / 104
Problem 4 Contd.
24 / 104
Problem 5
25 / 104
Problem 5 Contd.
26 / 104
Problem 6
27 / 104
Problem 6 Contd.
28 / 104
Problem 6 Contd.
29 / 104
Problem 7
30 / 104
Problem 7 Contd.
31 / 104
Problem 7 Contd.
32 / 104
Problem 8
33 / 104
Problem 8 Contd.
34 / 104
Problem 8 Contd.
35 / 104
Problem 8 Contd.
36 / 104
Rank Correlation Coefficient
\[ \rho = 1 - \frac{6\sum D^2}{N(N^2 - 1)} \]
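A minimal sketch of this formula in Python, assuming no tied values (repeated ranks need a correction that is not covered here). The helper names are hypothetical; the data are the marks from practice problem 4 of this module.

```python
def ranks(values):
    """Rank 1 = largest value. Assumes all values are distinct (no ties)."""
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 for v in values]

def spearman_rho(xs, ys):
    """rho = 1 - 6 * sum(D^2) / (N (N^2 - 1)), D = difference of ranks."""
    n = len(xs)
    rx, ry = ranks(xs), ranks(ys)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

x = [10, 15, 12, 17, 13, 16, 24, 14, 22]
y = [30, 42, 45, 46, 33, 34, 40, 35, 39]
print(round(spearman_rho(x, y), 4))  # 0.4
```

Ranking in ascending instead of descending order gives the same result, because every D only changes sign.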
37 / 104
Rank Correlation Coefficient
38 / 104
Equal and repeated ranks
39 / 104
Problem 1
40 / 104
Problem 1 Contd.
41 / 104
Problem 2
42 / 104
Problem 2 Contd.
43 / 104
Problem 3
44 / 104
Problem 3 Contd.
45 / 104
Problem 4
46 / 104
Problem 4 Contd.
47 / 104
Problem 4 Contd.
48 / 104
Problem 5
49 / 104
Problem 5 Contd.
50 / 104
Problem 5 Contd.
51 / 104
Problem 6
A sample of 12 fathers and their elder sons gave the following data
about their elder sons. Calculate the coefficient of rank correlation.
52 / 104
Problem 6 Contd.
53 / 104
Problem 6 Contd.
54 / 104
Regression
Definition
In regression, we estimate the value of one variable from the
known value of another, related variable. The statistical method
which helps us to estimate the unknown value of one variable from
the known value of the related variable is called regression.
Line of regression
The line describing the average relationship between two
variables is known as the line of regression.
Example
▶ Used to estimate the relation between two economic variables
like income and expenditure, and in prediction analysis.
▶ Useful in the statistical estimation of demand curves, supply
curves, production functions, cost functions, consumption
functions, etc.
55 / 104
Regression equation
Regression equation of Y on X
Y = a + bX
56 / 104
Regression equation
Regression equation of X on Y
X = a + bY
57 / 104
Deviations taken from the arithmetic means of X and Y
Regression equation of Y on X
\[ Y - \bar{Y} = r\,\frac{\sigma_y}{\sigma_x}\,(X - \bar{X}) \]
The regression coefficient of Y on X is
\[ b_{yx} = r\,\frac{\sigma_y}{\sigma_x} = \frac{\sum xy}{\sum x^2} \]
where x = X − X̄ and y = Y − Ȳ.
58 / 104
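Deviations taken from the arithmetic means of X and Y
Regression equation of X on Y
\[ X - \bar{X} = r\,\frac{\sigma_x}{\sigma_y}\,(Y - \bar{Y}) \]
The regression coefficient of X on Y is
\[ b_{xy} = r\,\frac{\sigma_x}{\sigma_y} = \frac{\sum xy}{\sum y^2} \]
where x = X − X̄ and y = Y − Ȳ.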
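Both regression lines can be computed from the deviations x = X − X̄ and y = Y − Ȳ, using b_yx = Σxy / Σx² and b_xy = Σxy / Σy². The following is a sketch under those definitions, with a hypothetical function name and the data of practice problem 9 of this module.

```python
def regression_lines(xs, ys):
    """Return ((a_yx, b_yx), (a_xy, b_xy)) for Y = a_yx + b_yx X and X = a_xy + b_xy Y."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    b_yx = sxy / sxx          # regression coefficient of Y on X
    b_xy = sxy / syy          # regression coefficient of X on Y
    # Both lines pass through the point of means (mx, my).
    return (my - b_yx * mx, b_yx), (mx - b_xy * my, b_xy)

(a_yx, b_yx), (a_xy, b_xy) = regression_lines([1, 2, 3, 4, 5], [2, 5, 3, 8, 7])
print(f"Y = {a_yx:.4f} + {b_yx:.4f} X")  # Y = 1.1000 + 1.3000 X
print(f"X = {a_xy:.4f} + {b_xy:.4f} Y")  # X = 0.5000 + 0.5000 Y
```

Since b_yx · b_xy = r², the correlation coefficient can be recovered as the square root of the product, taking the common sign of the two coefficients.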
59 / 104
Problem 1
Determine the equation of a straight line which best fits the data.
60 / 104
Problem 1 Contd.
61 / 104
Problem 2
62 / 104
Problem 2 Contd.
63 / 104
Problem 2 Contd.
64 / 104
Problem 3
From the following data, calculate
65 / 104
Problem 4
66 / 104
Problem 4 Contd.
67 / 104
Problem 5
Determine the equation of a straight line which best fits the data:
68 / 104
Problem 5 Contd.
69 / 104
Problem 5 Contd.
70 / 104
Problem 5 Contd.
71 / 104
Problem 5 Contd.
72 / 104
Problem 6
73 / 104
Problem 6 Contd.
74 / 104
Partial Correlation
75 / 104
Partial Correlation
76 / 104
Formula
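The first-order partial correlation coefficient between variables 1 and 2, with variable 3 held constant, is commonly written as

\[ r_{12.3} = \frac{r_{12} - r_{13}\, r_{23}}{\sqrt{(1 - r_{13}^2)(1 - r_{23}^2)}} \]

with r13.2 and r23.1 obtained by permuting the indices. A minimal Python sketch, assuming this standard definition (the function name is hypothetical; the input values are those of practice problem 17 of this module):

```python
from math import sqrt

def partial_corr(r_ab, r_ac, r_bc):
    """r_ab.c: correlation between a and b with c held constant."""
    return (r_ab - r_ac * r_bc) / sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))

r12, r13, r23 = 0.6, 0.5, 0.45
print(round(partial_corr(r12, r13, r23), 2))  # r12.3 -> 0.48
print(round(partial_corr(r13, r12, r23), 2))  # r13.2 -> 0.32
print(round(partial_corr(r23, r12, r13), 2))  # r23.1 -> 0.22
```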
77 / 104
Problem 1
78 / 104
Problem 1 Contd.
79 / 104
Problem 1 Contd.
80 / 104
Multiple Correlation
81 / 104
Multiple Correlation
82 / 104
Formula
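\[ R_{1.23} = \sqrt{\frac{r_{12}^2 + r_{13}^2 - 2\, r_{12}\, r_{13}\, r_{23}}{1 - r_{23}^2}} \]
\[ R_{2.13} = \sqrt{\frac{r_{12}^2 + r_{23}^2 - 2\, r_{12}\, r_{13}\, r_{23}}{1 - r_{13}^2}} \]
\[ R_{3.12} = \sqrt{\frac{r_{13}^2 + r_{23}^2 - 2\, r_{12}\, r_{13}\, r_{23}}{1 - r_{12}^2}} \]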
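The three formulas share one pattern: the two correlations involving the dependent variable appear in the numerator, and the correlation between the two remaining variables appears in the denominator. A short sketch, with a hypothetical function name and the values of practice problem 14 of this module:

```python
from math import sqrt

def multiple_corr(r_ab, r_ac, r_bc):
    """R_a.bc: multiple correlation of a on b and c."""
    return sqrt((r_ab ** 2 + r_ac ** 2 - 2 * r_ab * r_ac * r_bc) / (1 - r_bc ** 2))

r12, r13, r23 = 0.7, 0.75, 0.85
# R2.13: variable 2 is dependent, so pass r21 (= r12), r23, and then r13.
print(round(multiple_corr(r12, r23, r13), 4))  # 0.8552
```

R1.23 and R3.12 follow the same call shape: `multiple_corr(r12, r13, r23)` and `multiple_corr(r13, r23, r12)` respectively.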
83 / 104
Problem 1
84 / 104
Problem 1 Contd.
85 / 104
Problem 1 Contd.
86 / 104
Multi-linear Regression
\[ Y = a + b_1 X_1 + b_2 X_2 + b_3 X_3 + \dots + b_k X_k \]
where X1, X2, ..., Xk are the k independent variables and Y is the
dependent variable.
87 / 104
Steps to follow
3. Calculate b0, b1, and b2.
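One way to carry out this step for two predictors is to solve the least-squares normal equations. The sketch below is an assumption about the method, not necessarily the procedure used on the slides; b0 plays the role of the intercept a in Y = a + b1 X1 + b2 X2, the function names are hypothetical, and the data are synthetic (generated from Y = 1 + 2 X1 − 3 X2 with no noise).

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_two_predictors(y, x1, x2):
    """Return [b0, b1, b2] minimising the squared error of Y = b0 + b1 X1 + b2 X2."""
    n = len(y)
    dot = lambda u, v: sum(p * q for p, q in zip(u, v))
    # Normal equations for the unknowns b0, b1, b2.
    a = [
        [n,       sum(x1),     sum(x2)],
        [sum(x1), dot(x1, x1), dot(x1, x2)],
        [sum(x2), dot(x1, x2), dot(x2, x2)],
    ]
    b = [sum(y), dot(x1, y), dot(x2, y)]
    return solve_linear(a, b)

x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]
y = [1 + 2 * p - 3 * q for p, q in zip(x1, x2)]  # synthetic, noise-free
b0, b1, b2 = fit_two_predictors(y, x1, x2)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

On this noise-free data the fitted coefficients come out at approximately 1, 2 and −3. A prediction for new values is then b0 + b1·X1 + b2·X2.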
88 / 104
Problem 1
89 / 104
Problem 1 Contd.
90 / 104
Problem 1 Contd.
91 / 104
Problem 1 Contd.
92 / 104
Problem 1 Contd.
93 / 104
Problem 1 Contd.
94 / 104
Problem 1 Contd.
95 / 104
Problem 2
Y X1 X2
4 15 30
6 12 24
7 8 20
9 6 14
13 4 10
15 3 4
96 / 104
Practice Problems I
1. Find the correlation co-efficient for the following data.
Sales 15 18 25 27 30 35
Advertising Expense 50 65 82 95 110 120
Ans: r = 0.99
2. Find the correlation co-efficient for the following data.
x 65 66 67 67 68 69 70 72
y 67 68 65 68 72 72 69 71
Ans: r = 0.6030
3. A computer, while calculating rxy from 25 pairs of
observations, obtained the following constants: n = 25,
Σx = 125, Σx² = 650, Σy = 100, Σy² = 460 and
Σxy = 508. A recheck showed that two pairs of values (6,
14), (8, 6) were wrong while the correct values were (8, 12),
(6, 8). Obtain the correct value of correlation co-efficient.
Ans: r = 0.6670
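The correction in problem 3 amounts to subtracting the contribution of the wrong pairs from each sum, adding that of the correct pairs, and then applying the direct method. A sketch of that arithmetic, reading the constants as n = 25, Σx = 125, Σx² = 650, Σy = 100, Σy² = 460, Σxy = 508 (the function name is hypothetical):

```python
from math import sqrt

def corrected_r(n, sx, sxx, sy, syy, sxy, wrong, right):
    """Fix the running sums for mis-recorded pairs, then apply the direct method."""
    for x, y in wrong:   # remove what the wrong pairs contributed
        sx, sxx, sy, syy, sxy = sx - x, sxx - x * x, sy - y, syy - y * y, sxy - x * y
    for x, y in right:   # add what the correct pairs contribute
        sx, sxx, sy, syy, sxy = sx + x, sxx + x * x, sy + y, syy + y * y, sxy + x * y
    return (n * sxy - sx * sy) / (sqrt(n * sxx - sx ** 2) * sqrt(n * syy - sy ** 2))

r = corrected_r(25, 125, 650, 100, 460, 508,
                wrong=[(6, 14), (8, 6)], right=[(8, 12), (6, 8)])
print(round(r, 4))  # 0.6667
```

Here Σx, Σy and Σx² are unchanged by the correction, while Σy² becomes 436 and Σxy becomes 520, giving r = 500 / (25 · 30) = 2/3.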
97 / 104
Practice Problems II
4. The marks secured by the recruits in the selection test X and
in the proficiency test Y are given below. Calculate the rank
correlation co-efficient.
S. No. 1 2 3 4 5 6 7 8 9
X 10 15 12 17 13 16 24 14 22
Y 30 42 45 46 33 34 40 35 39
Ans: ρ = 0.4
5. Calculate rank correlation coefficient from the following data:
Ans: 0.96
6. Calculate rank correlation coefficient from the following data:
Expenditure on Ads. 10 15 14 25 14 14 20 22
Profit 6 25 12 18 25 40 10 7
Ans: -0.024
98 / 104
Practice Problems III
7. Calculate the rank co-efficient of correlation for the following
data:
X 68 64 75 50 64 80 75 40 55 64
Y 62 58 68 45 81 60 68 48 50 70
Ans: ρ = 0.5450
8. The ranking of 10 students in two subjects, maths and
physics, are as follows:
Maths 3 5 8 4 7 10 2 1 6 9
Physics 6 4 9 8 1 2 3 10 5 7
Ans: r = −0.2970
9. Find the correlation co-efficient and the equation of the
regression lines for the following data
X 1 2 3 4 5
Y 2 5 3 8 7
Ans: r = 0.8062, y = 1.3x + 1.1, x = 0.5y + 0.5.
99 / 104
Practice Problems IV
10. Marks obtained by ten students in Mathematics X and
Statistics Y are given below. Find the two regression lines.
Also, find y when x = 55.
X 60 34 40 50 45 40 22 43 42 64
Y 75 32 33 40 45 33 12 30 34 51
Ans: Y = 1.1865X − 13.7060, X = 0.6414Y + 19.3061;
Y = 51.55 when X = 55.
11. In a correlation analysis the equations of the two regression
lines are 3x + 12y = 19 and 3y + 9x = 46. Find
i.) Correlation Co-efficient. Ans: r = −1/(2√3)
ii.) Mean values of X and Y. Ans: x̄ = 5, ȳ = 1/3.
12. For the following data, find the most likely price at Madras
corresponding to the price 70 at Bombay and that at Bombay
corresponding to the price 68 at Madras. S. D. of the
difference between the prices at Madras and Bombay is 3.1.
                 Madras  Bombay
Average Price    65      67
S. D. of Price   0.5     3.5
Ans: For the price 68 at Madras, the most likely price at
Bombay is 84.43.
Ans: For the price 70 at Bombay, the most likely price at
Madras is 65.36.
100 / 104
Practice Problems V
13. If r12 = 0.75, r13 = 0.80, r23 = 0.70. Find the partial
correlation r13.2 . Ans: r13.2 = 0.5823
14. Given that r12 = 0.7, r32 = 0.85, r31 = 0.75. Determine R2.13 .
Ans: R2.13 = 0.8552
15. Determine all the multiple correlation co-efficients for the
following data:
X1 Number of Students 35 45 60 64
X2 Marks Obtained 60 72 68 80
X3 Number of Activity 4 3 7 5
Ans: r12 = 0.77435, r13 = 0.66799, r23 = 0.04688; R1.23 =
0.9997, R2.13 = 0.9995, R3.12 = 0.9994.
101 / 104
Practice Problems VI
16. Determine the multiple correlation co-efficient R1.23 for the
following data:
X1 2 5 7 11
X2 3 6 10 12
X3 1 3 6 10
Ans: r12 = 0.9692, r13 = 0.9922, r23 = 0.9713;
R1.23 = 0.9937.
17. Given that r12 = 0.6, r32 = 0.45, r31 = 0.5. Determine all the
partial correlation co-efficients. Ans:
r12.3 = 0.48, r13.2 = 0.32, r23.1 = 0.22.
18. Determine all the partial correlation co-efficients from the
following data:
X1 20 15 25 26 28 40 38
X2 12 13 16 15 23 15 28
X3 13 15 12 16 14 18 14
Ans: r12 = 0.59, r13 = 0.59, r23 = −0.18, r12.3 =
0.88, r13.2 = 0.88, r23.1 = −0.81.
102 / 104
Practice Problems VII
19. Determine the R1.23 , R2.13 and R3.12 for the data given in 18.
20. Predict the value of Y for subject 6 from the given dataset
that contains values for X1 , X2 , and Y by using a Multiple
Regression Model.
Subject Y X1 X2
S1 −3.7 3 8
S2 3.5 4 5
S3 2.5 5 7
S4 11.5 6 3
S5 5.7 2 1
S6 ? 3 2
103 / 104
Thank You
104 / 104