Correlation & Regression (Complete) .PDF Theory Module-6-B
Examples based on Coefficient of Correlation

Ex.1 If var (x) = 8.25, var (y) = 33.96 and cov (x, y) = 10.2, then the correlation coefficient is
(A) 0.89 (B) – 0.64
(C) 0.61 (D) – 0.16

Sol. $r_{xy} = \frac{\operatorname{cov}(x, y)}{\sqrt{\operatorname{var}(x)\,\operatorname{var}(y)}} = \frac{10.2}{\sqrt{(8.25)(33.96)}} = 0.61$ Ans.[C]

Ex.2 The coefficient of correlation between the heights (in inches) of fathers and sons from the following data

Heights of fathers (x)   65  66  67  68  69  70  71
Heights of sons (y)      67  68  66  69  72  72  69

will be
(A) 0.60 (B) – 0.60
(C) – 0.67 (D) 0.67

Examples based on Rank Correlation

Ex.3 From a random sample, 5 students have been selected. Their marks in Maths and Statistics are given below -

Roll No.              1   2   3   4   5
Marks in Maths        85  60  73  40  90
Marks in Statistics   93  75  65  50  80

The rank correlation coefficient is
(A) 0.8 (B) 0.6
(C) 0.5 (D) – 0.8

Sol.
Rank in Maths                      2  4  3  5  1
Rank in Statistics                 1  3  4  5  2
Squared difference in rank (d^2)   1  1  1  0  1

$\rho = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} = 1 - \frac{6 \times 4}{5 \times 24} = \frac{4}{5} = 0.8$ Ans.[A]
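The arithmetic in Ex.1 to Ex.3 can be checked with a short Python sketch. The helper names (`pearson_from_moments`, `pearson_r`, `ranks_descending`, `rank_correlation`) are illustrative choices, not part of the module, and the rank helper assumes there are no tied marks, which holds for the Ex.3 data.

```python
import math

def pearson_from_moments(cov_xy, var_x, var_y):
    """r = cov(x, y) / sqrt(var(x) * var(y))."""
    return cov_xy / math.sqrt(var_x * var_y)

def pearson_r(xs, ys):
    """Pearson correlation coefficient from paired raw data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ranks_descending(values):
    """Rank 1 goes to the largest value; assumes no tied values."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]

def rank_correlation(xs, ys):
    """Rank correlation 1 - 6*sum(d^2) / (n*(n^2 - 1)), valid without ties."""
    n = len(xs)
    rank_x = ranks_descending(xs)
    rank_y = ranks_descending(ys)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Ex.1: variances and covariance are given directly.
r1 = pearson_from_moments(10.2, 8.25, 33.96)

# Ex.2: heights of fathers and sons.
fathers = [65, 66, 67, 68, 69, 70, 71]
sons = [67, 68, 66, 69, 72, 72, 69]
r2 = pearson_r(fathers, sons)

# Ex.3: marks in Maths and Statistics.
maths = [85, 60, 73, 40, 90]
stats_marks = [93, 75, 65, 50, 80]
r3 = rank_correlation(maths, stats_marks)

print(round(r1, 2), round(r2, 2), round(r3, 2))
```

For the Ex.2 data the deviations from the means (68 and 69) give a cross-product sum of 20 and squared-deviation sums of 28 and 32, so the sketch evaluates r = 20 / sqrt(28 × 32), which is about 0.67.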
6. PROPERTIES OF CORRELATION COEFFICIENT (r)

(a) r lies between – 1 and + 1
(b) the correlation is
    (i) perfect and positive if r = + 1
    (ii) perfect and negative if r = – 1
    (iii) absent (the variables are not correlated) if r = 0
    (iv) positive if r > 0
    (v) negative if r < 0
(c) It is independent of the change of origin and scale.
(d) It is a pure number and hence unitless
(e) If x and y are independent then r = 0

7. REGRESSION ANALYSIS

In the previous article we have seen that correlation is merely a tool for ascertaining the degree of relationship between two variables. It does not tell anything about the functional relationship or the nature of the relationship between the two variables. Regression analysis attempts to study this functional relationship so that one can predict the value of one variable for a given value of the other. Regression analysis is thus a statistical device with the help of which we can estimate or predict the unknown values of one variable from the known values of the other variable.

8. LINE OF REGRESSION

The regression line is a graphical method which describes the average relationship between the two variables.
For two variables x and y we shall have two lines of regression, because there are two variables:
(i) Y on X (ii) X on Y

8.1 Line of regression of y on x :

$y - \bar{y} = \frac{\operatorname{cov}(x, y)}{\sigma_x^2}\,(x - \bar{x})$
or $y - \bar{y} = r\,\frac{\sigma_y}{\sigma_x}\,(x - \bar{x})$

8.2 Line of regression of x on y :
The line of regression of x on y gives the most probable values of x for given values of y and so it is used to estimate the value of x for a given value of y. Its equation is -

$x - \bar{x} = \frac{\operatorname{cov}(x, y)}{\sigma_y^2}\,(y - \bar{y})$
or $x - \bar{x} = r\,\frac{\sigma_x}{\sigma_y}\,(y - \bar{y})$

Examples based on Line of Regression

Ex.4 For the following data

                          x     y
Mean                      65    67
Standard deviation        5.0   2.5
Correlation coefficient   0.8

Then the equation of line of regression of y on x is
(A) $y - 67 = \frac{2}{5}(x - 65)$
(B) $y - 67 = \frac{1}{5}(x - 65)$
(C) $x - 65 = \frac{2}{5}(y - 67)$
(D) $x - 65 = \frac{1}{5}(y - 67)$

Sol. Since the line of regression of y on x is $y - \bar{y} = r\,\frac{\sigma_y}{\sigma_x}\,(x - \bar{x})$,
$y - 67 = 0.8 \times \frac{2.5}{5}\,(x - 65)$
$\Rightarrow y - 67 = \frac{2}{5}(x - 65)$ Ans.(A)

Sol. $3x + 2y = 26 \Rightarrow y = -\frac{3}{2}x + 13$, so $b_{yx} = -\frac{3}{2}$
$6x + y = 31 \Rightarrow x = -\frac{1}{6}y + \frac{31}{6}$, so $b_{xy} = -\frac{1}{6}$
Both coefficients are negative, hence
$r = -\sqrt{\left(\frac{3}{2}\right)\left(\frac{1}{6}\right)} = -\frac{1}{2}$ Ans.[C]

9. REGRESSION COEFFICIENT

(i) The regression coefficient of y on x is denoted by $b_{yx}$ and is given by
$b_{yx} = r\,\frac{\sigma_y}{\sigma_x} = \frac{\operatorname{cov}(x, y)}{\sigma_x^2}$
This represents the change in the value of y corresponding to a unit change in x.
(ii) The regression coefficient of x on y is denoted by $b_{xy}$ and is given by
$b_{xy} = r\,\frac{\sigma_x}{\sigma_y} = \frac{\operatorname{cov}(x, y)}{\sigma_y^2}$
This represents the change in the value of x corresponding to a unit change in y.

10. PROPERTIES OF REGRESSION COEFFICIENTS

(i) $r = \sqrt{b_{yx} \cdot b_{xy}}$ (taken with the common sign of the two coefficients), i.e. the coefficient of correlation is the geometric mean between the two regression coefficients.
(ii) If $b_{yx} > 1$, then $b_{xy} < 1$, i.e. if one of the regression coefficients is greater than unity then the other will be less than unity.
(iii) If the correlation between the variables is not perfect then the regression lines intersect at $(\bar{x}, \bar{y})$.
(iv) $b_{yx}$ is called the slope of regression line y on x and $b_{xy}$ is called the slope of regression line x on y.
(v) For positive coefficients, $b_{yx} + b_{xy} \ge 2\sqrt{b_{yx} \cdot b_{xy}}$ or $b_{yx} + b_{xy} \ge 2r$, i.e. the arithmetic mean of the regression coefficients is not less than the correlation coefficient (they are equal only when $b_{yx} = b_{xy}$).
(vi) Regression coefficients are independent of change of origin but not of scale.
(vii) The product of the gradients of the lines of regression is given by $\frac{\sigma_y^2}{\sigma_x^2}$.
(viii) If the angle between the lines of regression is $\theta$, then
$\tan\theta = \left(\frac{1 - r^2}{r}\right)\left(\frac{\sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2}\right)$
(ix) If both the lines of regression coincide, then the correlation will be perfect linear.
(x) If both $b_{yx}$ and $b_{xy}$ are positive, then r will be positive and if both $b_{yx}$ and $b_{xy}$ are negative then r will be negative.

Examples based on Regression coefficient

Ex.6 If regression coefficient of y on x is 0.40, then the regression coefficient of x on y will be
(A) 1.6 (B) 6.4
(C) 5.1 (D) 3.2

Sol. We know that the product of both regression coefficients cannot exceed 1 (since $b_{yx} \cdot b_{xy} = r^2 \le 1$). Only option (A) satisfies this: 1.6 × 0.40 = 0.64 < 1. Ans.[A]
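The regression-coefficient relations used in Ex.4, in the solution for the lines 3x + 2y = 26 and 6x + y = 31, and in Ex.6 can be sketched in Python as follows. The function names are hypothetical helpers introduced only for illustration; the sign convention in `correlation_from_coefficients` follows the rule that r carries the common sign of the two coefficients.

```python
import math

def slope_y_on_x(r, sd_x, sd_y):
    """b_yx = r * sigma_y / sigma_x."""
    return r * sd_y / sd_x

def slope_x_on_y(r, sd_x, sd_y):
    """b_xy = r * sigma_x / sigma_y."""
    return r * sd_x / sd_y

def correlation_from_coefficients(b_yx, b_xy):
    """r = +/- sqrt(b_yx * b_xy), carrying the common sign of the coefficients."""
    product = b_yx * b_xy
    if product < 0 or product > 1:
        raise ValueError("b_yx * b_xy must lie in [0, 1]")
    return math.copysign(math.sqrt(product), b_yx)

# Ex.4: means 65 and 67, standard deviations 5.0 and 2.5, r = 0.8.
b_yx = slope_y_on_x(0.8, 5.0, 2.5)   # slope of the line y - 67 = b_yx * (x - 65)

# Lines 3x + 2y = 26 and 6x + y = 31: b_yx = -3/2 and b_xy = -1/6.
r_lines = correlation_from_coefficients(-3 / 2, -1 / 6)

# Ex.6: keep only the options whose product with 0.40 does not exceed 1.
options = [1.6, 6.4, 5.1, 3.2]
feasible = [b for b in options if 0.40 * b <= 1]

print(b_yx, r_lines, feasible)
```

Checking feasibility through the product b_yx × b_xy is enough for Ex.6 because the question only asks which option is admissible, not for a unique value of the coefficient.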
Ex.7 If the two regression coefficients between x and y are 0.8 and 0.2, then the coefficient of correlation between them is
(A) 0.4 (B) 0.6
(C) 0.3 (D) 0.5

Sol. If the two regression coefficients are 0.8 and 0.2, then the coefficient of correlation will be
$r = +\sqrt{0.8 \times 0.2} = 0.4$ Ans.[A]

11. IMPORTANT POINTS ON REGRESSION LINES

(i) If r = 0, then $\tan\theta$ is not defined, i.e. $\theta = \frac{\pi}{2}$. Thus if two variables are not correlated, then the lines of regression are perpendicular to each other.
(ii) If r = ± 1, then $\tan\theta = 0$, i.e. $\theta = 0$. Thus the regression lines are coincident.
(iii) If the regression lines are y = ax + b and x = cy + d, then
$\bar{x} = \frac{bc + d}{1 - ac}$ and $\bar{y} = \frac{ad + b}{1 - ac}$
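As a check on points (i) to (iii), the sketch below computes the point where two regression lines meet and the tangent of the angle between them. It uses exact fractions to avoid rounding; `mean_point` and `tan_angle` are illustrative names, and the test lines 3x + 2y = 26 and 6x + y = 31 are the pair that appears in an earlier solution, rewritten as y = ax + b and x = cy + d.

```python
from fractions import Fraction

def mean_point(a, b, c, d):
    """Intersection of y = a*x + b and x = c*y + d, i.e. (x-bar, y-bar)."""
    denom = 1 - a * c
    if denom == 0:
        raise ValueError("a*c == 1: the lines are parallel or coincident")
    return (b * c + d) / denom, (a * d + b) / denom

def tan_angle(r, sd_x, sd_y):
    """tan(theta) = ((1 - r^2) / r) * sd_x*sd_y / (sd_x^2 + sd_y^2)."""
    if r == 0:
        # Point (i): tan(theta) is undefined, the lines are perpendicular.
        raise ZeroDivisionError("r = 0: tan(theta) is not defined")
    return (1 - r * r) / r * (sd_x * sd_y) / (sd_x ** 2 + sd_y ** 2)

# y = -(3/2)x + 13 and x = -(1/6)y + 31/6
x_bar, y_bar = mean_point(Fraction(-3, 2), Fraction(13), Fraction(-1, 6), Fraction(31, 6))

# Point (ii): perfect correlation gives tan(theta) = 0.
t_perfect = tan_angle(Fraction(1), Fraction(5), Fraction(5, 2))

# Values from Ex.4 (r = 0.8, sigma_x = 5.0, sigma_y = 2.5).
t_ex4 = tan_angle(Fraction(4, 5), Fraction(5), Fraction(5, 2))

print(x_bar, y_bar, t_perfect, t_ex4)
```

The means found this way satisfy both original equations (3·4 + 2·7 = 26 and 6·4 + 7 = 31), which is a quick way to confirm formula (iii).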
12. STANDARD ERROR OF PREDICTION

$S_y = \sqrt{\frac{\sum (y - y_p)^2}{n}}$

where y is the actual value and $y_p$ is the predicted value.
In relation to the coefficient of correlation, it is given by -
(i) Standard error of estimate of x is
$S_x = \sigma_x \sqrt{1 - r^2}$
(ii) Standard error of estimate of y is
$S_y = \sigma_y \sqrt{1 - r^2}$
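The two expressions for S_y agree when the predicted values come from the least-squares line of y on x. A minimal numerical sketch, reusing the father and son heights of Ex.2 purely as illustrative data and assuming population standard deviations (division by n, as in the formula above):

```python
import math

def standard_error_direct(ys, predicted):
    """S_y = sqrt(sum((y - y_p)^2) / n)."""
    n = len(ys)
    return math.sqrt(sum((y - yp) ** 2 for y, yp in zip(ys, predicted)) / n)

def fit_y_on_x(xs, ys):
    """Least-squares line of y on x: returns (b_yx, intercept, r, sigma_y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b_yx = sxy / sxx
    r = sxy / math.sqrt(sxx * syy)
    sigma_y = math.sqrt(syy / n)  # population standard deviation (assumption)
    return b_yx, mean_y - b_yx * mean_x, r, sigma_y

xs = [65, 66, 67, 68, 69, 70, 71]
ys = [67, 68, 66, 69, 72, 72, 69]

b_yx, intercept, r, sigma_y = fit_y_on_x(xs, ys)
predicted = [b_yx * x + intercept for x in xs]

s_direct = standard_error_direct(ys, predicted)   # from the residuals
s_formula = sigma_y * math.sqrt(1 - r * r)        # from sigma_y and r

print(round(s_direct, 4), round(s_formula, 4))
```

If the predictions came from some other line, only the residual-based definition would apply; the shortcut through r holds for the fitted regression line.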
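… $= \frac{\sigma_X^2\ [\ldots]\ \sigma_Y^2}{\sigma_X^2\ [\ldots]\ \sigma_Y^2}$ Ans.[B]

Ex.5 The coefficient of rank correlation between marks in Mathematics and marks in Physics obtained by a certain group of students is 0.8. If the sum of the squares of the differences in ranks is given to be 33, then the number of students in the group is -
(A) 11 (B) 10
(C) 30 (D) None of these

Sol. We have, $r = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$
$\Rightarrow 0.8 = 1 - \frac{6 \times 33}{n(n^2 - 1)} \Rightarrow n(n^2 - 1) = 990 \Rightarrow n = 10$

Ex.7 Let X and Y be two variables with the same mean. If the lines of regression of Y on X and X on Y are respectively y = ax + b and x = […]y + […], then the value of the common mean is -
(A) $\frac{b}{1\ [\ldots]\ a}$ (B) $\frac{1\ [\ldots]\ a}{b}$
(C) $\frac{b}{1\ [\ldots]\ a}$ (D) $\frac{[\ldots]}{1\ [\ldots]}$

Sol. We have $\bar{X} = \bar{Y}$ (given). Since the two lines of regression pass through $(\bar{X}, \bar{Y})$, therefore $\bar{Y} = a\bar{X} + b$. Putting $\bar{X} = \bar{Y}$ gives the common mean $\frac{b}{1 - a}$.

… Since $\tan\theta \le \frac{1 - r^2}{2r}$ (because $\frac{\sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2} \le \frac{1}{2}$) and $\sin^2\theta = \frac{\tan^2\theta}{1 + \tan^2\theta}$,

$\sin^2\theta \le \frac{\left(\frac{1 - r^2}{2r}\right)^2}{1 + \left(\frac{1 - r^2}{2r}\right)^2}$

$\Rightarrow \sin^2\theta \le \frac{(1 - r^2)^2}{(1 + r^2)^2}$

Since $1 + r^2 \ge 1$, we have $\frac{1 - r^2}{1 + r^2} \le 1 - r^2$
$\Rightarrow \sin\theta \le 1 - r^2$ Ans.[C]

… $b_{XY}$ = slope of the line of regression of X on …

Ex.9 If $b_{YX}$ and $b_{XY}$ are regression coefficients of Y on X and X on Y respectively, then -
(A) $b_{YX} + b_{XY} = 2r(X, Y)$
(B) $b_{YX} + b_{XY} < 2r(X, Y)$
(C) $b_{YX} + b_{XY} > 2r(X, Y)$
(D) None of these

Sol. Since $b_{YX} = \frac{r\,\sigma_Y}{\sigma_X}$ and $b_{XY} = \frac{r\,\sigma_X}{\sigma_Y}$, therefore
$b_{YX} \cdot b_{XY} = r^2 \Rightarrow$ r is the GM of $b_{YX}$ and $b_{XY}$
But $\frac{b_{YX} + b_{XY}}{2}$ is the AM of $b_{YX}$ and $b_{XY}$, and AM > GM whenever the two coefficients are positive and unequal.
Therefore, $\frac{b_{YX} + b_{XY}}{2} > r \Rightarrow b_{YX} + b_{XY} > 2r$
Ans.[C]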
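Two of the results above lend themselves to a quick numerical check: the group size in Ex.5 and the AM versus GM comparison in Ex.9. The sketch below uses exact fractions; `group_size` and `am_minus_r` are hypothetical helper names, and the standard deviations passed to `am_minus_r` are the Ex.4 values, used only as sample inputs.

```python
from fractions import Fraction

def group_size(rho, sum_d2, max_n=10_000):
    """Smallest integer n >= 2 with rho = 1 - 6*sum_d2 / (n*(n^2 - 1)), else None.

    Assumes rho != 1, so that the right-hand side can be inverted.
    """
    target = Fraction(6 * sum_d2) / (1 - Fraction(rho))
    for n in range(2, max_n + 1):
        value = n * (n * n - 1)
        if value == target:
            return n
        if value > target:
            return None  # n*(n^2 - 1) is increasing, so no later n can match
    return None

def am_minus_r(r, sd_x, sd_y):
    """(b_yx + b_xy)/2 - r, with b_yx = r*sd_y/sd_x and b_xy = r*sd_x/sd_y."""
    b_yx = r * sd_y / sd_x
    b_xy = r * sd_x / sd_y
    return (b_yx + b_xy) / 2 - r

# Ex.5: rank correlation 0.8 and sum of squared rank differences 33.
n_students = group_size(Fraction(4, 5), 33)

# Ex.9: unequal standard deviations give a strictly positive gap,
# equal standard deviations give AM = GM = r.
gap_unequal = am_minus_r(Fraction(4, 5), Fraction(5), Fraction(5, 2))
gap_equal = am_minus_r(Fraction(4, 5), Fraction(3), Fraction(3))

print(n_students, gap_unequal, gap_equal)
```

The equal-deviation case shows why the inequality in Ex.9 is strict only when the two regression coefficients differ.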