
Correlation and Regression

Correlation measures the strength of association between variables, while regression predicts the value of one variable based on the other. Correlation coefficients range from -1 to 1, where 1 is total positive correlation, -1 is total negative correlation, and 0 is no correlation. Regression lines plot the linear relationship between variables and their coefficients can be used to predict changes in one variable from the other. The correlation coefficient and regression lines help determine how much of the variation in one variable can be explained by changes in the other variable.



Rajib Dolai

Correlation and Regression https://rajib1.weebly.com/

 Correlation is concerned with measuring the 'strength of association' between variables.
 Regression is concerned with the 'prediction' of the most likely value of one variable when the
value of the other variable is known.

 When statistical data consist of simultaneous measurements on two variables, each pair of
observations can be represented geometrically as a point. That representation is known as a
Scatter Diagram.

 The word correlation is used to denote the degree of association between variables.


 If y tends to increase as x increases the variables are said to be positively correlated.
 If y tends to decrease as x increases the variables are negatively correlated.
 If the values of y are not affected by changes in the values of x , the variables are said to be
uncorrelated.

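$\mathrm{Cov}(x,y) = \frac{1}{n}\sum (x_i - \bar{x})(y_i - \bar{y})$
$= \frac{1}{n}\sum x_i y_i - \left(\frac{1}{n}\sum x_i\right)\left(\frac{1}{n}\sum y_i\right)$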
 Variance is always non-negative; covariance may be positive, negative or zero.
 If x and y are two independent variables, then their covariance is zero, i.e. Cov(x,y) = 0.
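
The two forms of the covariance formula can be checked against each other numerically. The following is a minimal Python sketch; the function names and the sample data are illustrative assumptions, not part of the original notes, and the divisor n (population covariance) is used throughout.

```python
def covariance(xs, ys):
    """Population covariance: mean of the products of deviations from the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def covariance_shortcut(xs, ys):
    """Same quantity via the shortcut form: mean of xy minus product of the means."""
    n = len(xs)
    return sum(x * y for x, y in zip(xs, ys)) / n - (sum(xs) / n) * (sum(ys) / n)


# Hypothetical sample data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

cov_xy = covariance(xs, ys)
cov_xy_shortcut = covariance_shortcut(xs, ys)
var_x = covariance(xs, xs)  # variance is the covariance of a variable with itself

print(cov_xy, cov_xy_shortcut, var_x)
```

Both functions should agree up to floating-point rounding, and the variance obtained as Cov(x,x) can never be negative because it is a mean of squares.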

ASSUMPTIONS:
1. X and Y are linearly related.
2. Both variables should be normally distributed.
3. The variables are homoscedastic.

$r = \frac{\mathrm{Cov}(x,y)}{\sigma_x \sigma_y}$
 r is independent of the choice of both origin and scale of observation.
 Correlation coefficient between x and y = correlation coefficient between u and v,
if $u = \frac{x - c}{d}$ and $v = \frac{y - c'}{d'}$, where c, c′, d, d′ are constants and d, d′ have the same sign.
 r is a pure number and is unit free.
 r lies between −1 and +1: $-1 \le r \le 1$.
When r = +1 there is perfect positive correlation between the variables.
When r = −1 there is perfect negative correlation between the variables.
 r is a measure of degree of association between two variables.
 The correlation coefficient was developed by Karl Pearson.
 If two variables are independent, their correlation coefficient is zero. But the converse is not true.
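
The properties listed above can be illustrated with a short Python sketch. The helper `pearson_r`, the sample data and the particular shift and scale constants are assumptions made for the illustration; the last example uses y = x² on symmetric x values to show a pair that is clearly dependent yet has zero correlation.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (sd_x * sd_y)


# Hypothetical sample data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r_xy = pearson_r(xs, ys)

# Change of origin and scale: u = (x - 3) / 2, v = (y - 10) / 5 (both divisors positive).
us = [(x - 3) / 2 for x in xs]
vs = [(y - 10) / 5 for y in ys]
r_uv = pearson_r(us, vs)

# Dependent but uncorrelated: w is completely determined by z, yet r is zero.
zs = [-2, -1, 0, 1, 2]
ws = [z * z for z in zs]
r_zw = pearson_r(zs, ws)

print(r_xy, r_uv, r_zw)
```

Here `r_xy` and `r_uv` should coincide up to rounding, while `r_zw` should be zero even though w depends entirely on z, which is why zero correlation does not imply independence.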

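 Total variation (TV) = Unexplained variation (UV) + Explained variation (EV)

 Dividing by TV: $1 = \frac{UV}{TV} + \frac{EV}{TV}$
Now, $r^2 = \frac{\text{Explained variation (EV)}}{\text{Total variation (TV)}}$
 When $r^2 = 1$, TV = EV and UV = 0.
 When $r^2 = 0$, EV = 0.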
 The sign of r only indicates whether x and y move in the same direction or in opposite directions, but $r^2$ is
never negative.
 Coefficient of Non-Determination: $K^2 = \frac{\text{Unexplained variation}}{\text{Total variation}} = 1 - \frac{EV}{TV} = 1 - r^2$
 Coefficient of Alienation: $K = \pm\sqrt{1 - r^2}$
 The correlation coefficient is a symmetric function of x and y, i.e. $r_{xy} = r_{yx}$.
But the regression coefficients are not symmetric functions of x and y, i.e. $b_{yx} \ne b_{xy}$.
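
The split of total variation into explained and unexplained parts can be verified numerically. This is a minimal Python sketch, assuming a least-squares line of y on x and hypothetical sample data; the function name is illustrative.

```python
import math


def variation_decomposition(xs, ys):
    """Return (total, explained, unexplained) variation for the least-squares line of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    fitted = [intercept + slope * x for x in xs]
    total = sum((y - mean_y) ** 2 for y in ys)
    explained = sum((f - mean_y) ** 2 for f in fitted)
    unexplained = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    return total, explained, unexplained


# Hypothetical sample data.
total, explained, unexplained = variation_decomposition([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])

r_squared = explained / total      # coefficient of determination
k_squared = unexplained / total    # coefficient of non-determination
k = math.sqrt(k_squared)           # magnitude of the coefficient of alienation

print(total, explained, unexplained, r_squared, k_squared, k)
```

For any data set fitted this way, `explained + unexplained` should equal `total` up to rounding, so `r_squared + k_squared` comes out as 1.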

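$b_{yx} = \frac{\mathrm{Cov}(x,y)}{\sigma_x^2} = r\,\frac{\sigma_y}{\sigma_x}$, and similarly $b_{xy} = \frac{\mathrm{Cov}(x,y)}{\sigma_y^2} = r\,\frac{\sigma_x}{\sigma_y}$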

1. $y - \bar{y} = b_{yx}(x - \bar{x})$
$x - \bar{x} = b_{xy}(y - \bar{y})$
where $b_{yx}$ and $b_{xy}$ are respectively the regression coefficient of y on x and the regression
coefficient of x on y.
2. The product of the two regression coefficients is equal to the square of the correlation coefficient:
$b_{yx} \cdot b_{xy} = r^2$
3. r, $b_{yx}$ and $b_{xy}$ all have the same sign. If the correlation coefficient r is zero, the regression coefficients
$b_{yx}$ and $b_{xy}$ are also zero.
4. The regression lines always intersect at the point $(\bar{x}, \bar{y})$. The slopes of the regression line of y on x and
the regression line of x on y are respectively $b_{yx}$ and $1/b_{xy}$.
5. The angle between the two regression lines depends on the correlation coefficient r. When r = 0, the
two lines are perpendicular to each other; when r = +1 or r = −1, they coincide. As r increases
numerically from 0 to 1, the angle between the regression lines diminishes from 90° to 0°.
6. The two regression equations are usually different. However, when r = ±1, they become identical,
and in this case there is an exact linear relationship between the variables. When r = 0, the regression
equations reduce to $y = \bar{y}$ and $x = \bar{x}$, and neither y nor x can be estimated from linear regression
equations.
7. If the variables are uncorrelated i.e. r = 0 then the lines are perpendicular.
8. If one of the regression coefficients is numerically greater than one, the other must be numerically less than one.
9. The A.M. of the regression coefficients, $\frac{b_{yx} + b_{xy}}{2}$, is numerically greater than or equal to the correlation coefficient.
10. Regression coefficients are independent of change of origin but not of scale.
 Correlation need not imply a cause and effect relationship between the variables. Regression
analysis, by contrast, designates one variable as dependent and the other as independent, and so presumes a direction of cause and effect.
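
Several of the numbered properties above can be checked on a small data set. The following Python sketch is illustrative only: the function name and the sample data are assumptions, and the angle is computed from the two slopes $b_{yx}$ and $1/b_{xy}$ with the usual formula for the angle between two lines.

```python
import math


def regression_summary(xs, ys):
    """Return (mean_x, mean_y, b_yx, b_xy, r) computed from sums of squares and products."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    b_yx = sxy / sxx                  # regression coefficient of y on x
    b_xy = sxy / syy                  # regression coefficient of x on y
    r = sxy / math.sqrt(sxx * syy)    # correlation coefficient
    return mean_x, mean_y, b_yx, b_xy, r


# Hypothetical sample data.
mean_x, mean_y, b_yx, b_xy, r = regression_summary([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])

product = b_yx * b_xy                 # property 2: should equal r squared
arithmetic_mean = (b_yx + b_xy) / 2   # property 9: compared with r

# Property 5: angle between the lines, whose slopes are b_yx and 1 / b_xy.
slope_1 = b_yx
slope_2 = 1 / b_xy
angle_degrees = math.degrees(math.atan(abs((slope_2 - slope_1) / (1 + slope_1 * slope_2))))

print(b_yx, b_xy, r, product, arithmetic_mean, angle_degrees)
```

The sketch assumes $b_{xy} \ne 0$; when r = 0 the second slope is undefined and the lines are simply $y = \bar{y}$ and $x = \bar{x}$, which are perpendicular.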
Example 1:
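 Let the two regression lines be given as 3x = 10 + 5y and 4y = 5 + 15x. Then the correlation
coefficient between x and y is …….

 Suppose first that the first line is the regression of x on y and the second is the regression of y on x:
$x = \frac{10}{3} + \frac{5}{3}y$ ………….. (1)
$y = \frac{5}{4} + \frac{15}{4}x$ ………….. (2)

$r^2 = b_{xy} \times b_{yx} = \frac{5}{3} \times \frac{15}{4} = \frac{25}{4}$, so $|r| = \frac{5}{2} = 2.5 > 1$ [this is impossible]
So the roles must be reversed, and the 1st and 2nd equations become
$y = -\frac{10}{5} + \frac{3}{5}x$ and $x = -\frac{5}{15} + \frac{4}{15}y$
$r^2 = \frac{3}{5} \times \frac{4}{15} = \frac{4}{25}$, so $r = \frac{2}{5} = 0.4 < 1$ (positive, because both regression coefficients are positive).
So the answer is 0.4.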
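
The trial-and-check reasoning in Example 1 can be sketched in Python. The functions below are hypothetical helpers written for this illustration: each line is passed as the pair (a, b) from the form a·x + b·y = c, both ways of assigning the roles "y on x" and "x on y" are tried, and the assignment whose product of regression coefficients does not exceed 1 is kept.

```python
import math
from fractions import Fraction


def slopes(a, b):
    """For a*x + b*y = c, return (coefficient if read as y on x, coefficient if read as x on y)."""
    return Fraction(-a, b), Fraction(-b, a)


def correlation_from_lines(line1, line2):
    """Find r from two regression lines given as (a, b) pairs of a*x + b*y = c."""
    y_on_x_1, x_on_y_1 = slopes(*line1)
    y_on_x_2, x_on_y_2 = slopes(*line2)
    # Try both role assignments; a valid one must give 0 <= b_yx * b_xy <= 1.
    for b_yx, b_xy in ((y_on_x_1, x_on_y_2), (y_on_x_2, x_on_y_1)):
        product = b_yx * b_xy
        if 0 <= product <= 1:
            sign = 1 if b_yx > 0 else -1   # r takes the sign of the regression coefficients
            return sign * math.sqrt(product)
    raise ValueError("no valid assignment of regression lines")


# 3x = 10 + 5y  ->  3x - 5y = 10 ;  4y = 5 + 15x  ->  -15x + 4y = 5
r = correlation_from_lines((3, -5), (-15, 4))
rejected_product = Fraction(5, 3) * Fraction(15, 4)   # the assignment ruled out in the example

print(r, rejected_product)
```

Exact fractions are used for the coefficients so that the rejected product 25/4 and the accepted product 4/25 are compared without rounding; the sketch assumes neither line is horizontal or vertical.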

Example 2:
 In a two-variable regression, Y is the dependent variable and X is the independent variable. The correlation
coefficient between Y and X is 0.6. What percentage of the variation in Y is explained by X?

 Y = a + bX, where Y = dependent variable and X = independent variable.
Here r = 0.6
$r^2 = 0.36$

So 36% variations in Y are explained by X.
