Document

This document discusses correlation and regression. It defines correlation as the study of the natural relationship between two or more variables. Correlation can be positive, negative, simple, multiple or partial. Regression is used to estimate the average relationship between variables and predict unknown values. The document provides examples of calculating correlation coefficients using Karl Pearson's and Spearman's methods. It also demonstrates solving regression equations to find regression coefficients.

Uploaded by

vasanthi_cyber
Copyright
© © All Rights Reserved

UNIT -4

CORRELATION AND REGRESSION

Correlation: Correlation is the study of the natural relationship between two or more variables.

Uses of correlation: Correlation is widely used in the physical and social sciences, business and
economics. Economists use it to study the relationship between price and demand, and to estimate
costs, sales, prices and other related variables.

Variables: Cost, sales, price, income, expenditure, investment, return, loans given to customers, loan
amounts received from customers, students who appeared for an examination, students who passed an
examination, etc.

Types of Correlation:

Positive correlation, Negative correlation, Simple correlation, Multiple correlation, Partial
correlation, Linear correlation and Non-linear correlation.

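Positive correlation:

Both variables move in the same direction: as one increases, the other also increases.

Sales (Rs.)  : 1000  1600  2100  3000

Profit (Rs.) :  500   750   900  1500

Negative correlation:

The variables move in opposite directions: as one increases, the other decreases.

Supply of rice (in tonnes) : 100   98   85  120   70

Price                      :  35   36   38   30   40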

Simple correlation:

When only two variables are studied, it is said to be simple correlation.

For example, the study of age and consumption of milk.

Multiple correlation:

When more than two variables are studied simultaneously, the correlation is said to be multiple.

For example, the study of the price, demand and supply of a product.

Partial Correlation:

The partial correlation coefficient measures the relationship between a dependent variable and
a particular independent variable when all other variables involved are kept constant, that is, when
the effect of all other variables is removed.

Linear Correlation:

Correlation is said to be linear if the amount of change in one variable tends to bear a constant
ratio to the amount of change in the other.

For example, y = 3x + 2: when x = 1, y = 5; when x = 2, y = 8; when x = 3, y = 11.

Degree of correlation:


γ (correlation coefficient)    Degree of correlation
-----------------------------------------------------------------
 0                             No (or zero) correlation
+1                             Perfect positive correlation
-1                             Perfect negative correlation
 0.7 to 0.99                   High degree of positive correlation
-0.7 to -0.99                  High degree of negative correlation
 0.36 to 0.69                  Moderate degree of positive correlation
-0.36 to -0.69                 Moderate degree of negative correlation
 0.01 to 0.35                  Low degree of positive correlation
-0.01 to -0.35                 Low degree of negative correlation

The value of the correlation coefficient always lies between -1 and +1.
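The degree table can be read off mechanically. The following is a minimal sketch in Python; the function name degree_of_correlation is made up for illustration, and placing the cut-offs at exactly 0.36 and 0.7 is an assumption, since the table leaves small gaps between its bands (for example between 0.35 and 0.36).

```python
def degree_of_correlation(gamma: float) -> str:
    """Return the verbal degree for a correlation coefficient.

    Assumption: band boundaries sit at 0.36 and 0.7, because the table
    does not say where values such as 0.355 or 0.695 belong.
    """
    if not -1.0 <= gamma <= 1.0:
        raise ValueError("correlation coefficient must lie between -1 and +1")
    if gamma == 0:
        return "No (or zero) correlation"

    sign = "positive" if gamma > 0 else "negative"
    size = abs(gamma)
    if size == 1:
        return f"Perfect {sign} correlation"
    if size >= 0.7:
        return f"High degree of {sign} correlation"
    if size >= 0.36:
        return f"Moderate degree of {sign} correlation"
    return f"Low degree of {sign} correlation"


for value in (0.0, 1.0, -0.92, 0.48, -0.2):
    print(value, "->", degree_of_correlation(value))
```

A value outside the range -1 to +1 raises an error, since no such correlation coefficient can exist.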

Karl Pearson’s Coefficient of correlation

Formula:

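1. γ = Covariance of (x, y) / (σx σy)

2. γ = Σxy / (N σx σy)

Where x = X - Mean of X; y = Y - Mean of Y

σx = standard deviation of X; σy = standard deviation of Y

3. γ = Σxy / √(Σx² Σy²) = Σxy / (√Σx² √Σy²)

4. γ = (NΣXY - (ΣX)(ΣY)) / (√(NΣX² - (ΣX)²) √(NΣY² - (ΣY)²))

When deviations are taken from assumed means (dx = X - A and dy = Y - B, where A and B are the assumed
means of X and Y), formula 4 applies with dx and dy in place of X and Y:

γ = (NΣdxdy - (Σdx)(Σdy)) / (√(NΣdx² - (Σdx)²) √(NΣdy² - (Σdy)²))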
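The direct method (formula 4) needs only five running totals, so it is easy to sketch in code. The Python below is an illustration, not part of the original notes; the function name pearson_direct is an assumption. It is applied to the sales/profit and supply/price figures given earlier under positive and negative correlation.

```python
import math


def pearson_direct(X, Y):
    """Karl Pearson's coefficient by the direct method:
    (N*SXY - SX*SY) / (sqrt(N*SX2 - SX^2) * sqrt(N*SY2 - SY^2)).
    """
    n = len(X)
    if n != len(Y) or n < 2:
        raise ValueError("X and Y must have the same length (at least 2)")

    sum_x, sum_y = sum(X), sum(Y)
    sum_xy = sum(a * b for a, b in zip(X, Y))
    sum_x2 = sum(a * a for a in X)
    sum_y2 = sum(b * b for b in Y)

    numerator = n * sum_xy - sum_x * sum_y
    # The denominator is zero if either series is constant; that case is
    # left to raise ZeroDivisionError in this sketch.
    denominator = math.sqrt(n * sum_x2 - sum_x ** 2) * math.sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator


sales = [1000, 1600, 2100, 3000]
profit = [500, 750, 900, 1500]
supply = [100, 98, 85, 120, 70]
price = [35, 36, 38, 30, 40]

r_pos = pearson_direct(sales, profit)   # expected to be positive
r_neg = pearson_direct(supply, price)   # expected to be negative
print("sales vs profit:", round(r_pos, 4))
print("supply vs price:", round(r_neg, 4))
```

The sign of the result shows the direction of the relationship and the size shows its strength.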

Spearman’s Rank Correlation coefficient


In 1904, the British psychologist Charles Edward Spearman developed the method of rank
correlation.

Rank correlation is applicable to individual observations.

This measure is useful in dealing with qualitative characteristics that can be ranked but not
measured directly.

R = 1 - (6ΣD²) / (N³ - N)    Where R = Rank coefficient of correlation

D = Difference between the two ranks of an item

ΣD² = Sum of the squares of the differences between the two ranks

N = Number of pairs of observations
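As an illustration, the rank formula in its standard form, R = 1 - 6ΣD² / (N³ - N), can be sketched in Python as follows. The function name and the two judges' rankings are hypothetical data, not taken from these notes, and the sketch assumes there are no tied ranks.

```python
def spearman_rank_correlation(rank_1, rank_2):
    """Spearman's rank coefficient R = 1 - 6*sum(D^2) / (N^3 - N).

    Assumes both inputs are already ranks (1..N) with no ties.
    """
    n = len(rank_1)
    if n != len(rank_2) or n < 2:
        raise ValueError("both rankings must have the same length (at least 2)")

    # D is the difference between the two ranks given to the same item.
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_1, rank_2))
    return 1 - (6 * sum_d2) / (n ** 3 - n)


# Hypothetical example: two judges rank the same five contestants.
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]

R = spearman_rank_correlation(judge_a, judge_b)
print("Rank correlation:", R)   # D = -1, 1, -1, 1, 0, so sum of D^2 = 4
```

Here ΣD² = 4 and N³ - N = 120, so R = 1 - 24/120 = 0.8, a high degree of positive agreement between the two rankings.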

Problem:1

Find Karl Pearson's coefficient of correlation.

X: 6 2 10 4 8

Y: 9 11 5 8 7

-----------------------------------------
   X      Y      X²      Y²      XY
-----------------------------------------
   6      9      36      81      54
   2     11       4     121      22
  10      5     100      25      50
   4      8      16      64      32
   8      7      64      49      56
-----------------------------------------
  30     40     220     340     214
-----------------------------------------

N = 5; ΣX = 30; ΣY = 40; ΣX² = 220; ΣY² = 340; ΣXY = 214

γ = ((5 × 214) - (30 × 40)) / (√(5 × 220 - (30)²) × √(5 × 340 - (40)²))

= (1070 - 1200) / (√200 × √100)

= -130 / 141.42 = -0.9192


Problem:2

Calculate coefficient of correlation from the following data:

X: 100 101 102 102 100 99 97 98 96 95

Y: 98 99 99 97 95 92 95 94 90 91

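Coefficient of correlation γ = Σxy / (√Σx² √Σy²)

Mean of X = ΣX / N = 990 / 10 = 99;  Mean of Y = ΣY / N = 950 / 10 = 95

--------------------------------------------------------------------
   X    x = X - Mean    x²      Y    y = Y - Mean    y²     xy
--------------------------------------------------------------------
 100         1           1     98         3           9      3
 101         2           4     99         4          16      8
 102         3           9     99         4          16     12
 102         3           9     97         2           4      6
 100         1           1     95         0           0      0
  99         0           0     92        -3           9      0
  97        -2           4     95         0           0      0
  98        -1           1     94        -1           1      1
  96        -3           9     90        -5          25     15
  95        -4          16     91        -4          16     16
--------------------------------------------------------------------
 990                    54    950                    96     61
--------------------------------------------------------------------

ΣX = 990; Σx² = 54; ΣY = 950; Σy² = 96; Σxy = 61

γ = Σxy / (√Σx² √Σy²)

= 61 / (√54 × √96) = 61 / 72 = 0.847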
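The working of Problem 2 can be checked with a short Python sketch that builds the deviation columns and applies γ = Σxy / (√Σx² √Σy²). The variable names are chosen for illustration only.

```python
import math

X = [100, 101, 102, 102, 100, 99, 97, 98, 96, 95]
Y = [98, 99, 99, 97, 95, 92, 95, 94, 90, 91]

mean_x = sum(X) / len(X)   # 990 / 10
mean_y = sum(Y) / len(Y)   # 950 / 10

# Deviations of each value from its own mean (the x and y columns).
x = [value - mean_x for value in X]
y = [value - mean_y for value in Y]

sum_x2 = sum(d * d for d in x)
sum_y2 = sum(d * d for d in y)
sum_xy = sum(dx * dy for dx, dy in zip(x, y))

gamma = sum_xy / (math.sqrt(sum_x2) * math.sqrt(sum_y2))

print("means:", mean_x, mean_y)
print("sum x^2, sum y^2, sum xy:", sum_x2, sum_y2, sum_xy)
print("gamma:", round(gamma, 3))
```

Because the product Σx² × Σy² = 54 × 96 = 5184 is a perfect square (72²), the result is exactly 61/72.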

Problem : 3

The covariance between variables X and Y is 10.6, and the variances of X and Y are 16 and 9
respectively. Find the correlation coefficient.

Variance = (Standard deviation)²; Standard deviation = √Variance

Correlation coefficient γ = Covariance / (σx σy)

= 10.6 / (√16 × √9)

= 10.6 / 12 = 0.8833

Problem:4

The coefficient of correlation between two variables X and Y is 0.48. Their covariance is 36. The
variance of X is 16. Find the standard deviation of the Y series.

γ = 0.48; Covariance = 36

Variance of X = σx² = 16, so σx = √16 = 4

γ = Covariance / (σx σy)

0.48 = 36 / (4 σy); 4 σy (0.48) = 36; σy = 36 / (4 × 0.48)

= 36 / 1.92 = 18.75
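Problems 3 and 4 both rest on the single relation γ = Covariance / (σx σy), solved once for γ and once for σy. A minimal Python sketch follows; the two function names are made up for illustration.

```python
import math


def correlation_from_covariance(covariance, variance_x, variance_y):
    """gamma = covariance / (sd_x * sd_y), with sd = sqrt(variance)."""
    return covariance / (math.sqrt(variance_x) * math.sqrt(variance_y))


def sd_y_from_correlation(gamma, covariance, variance_x):
    """Rearranged form: sd_y = covariance / (gamma * sd_x)."""
    return covariance / (gamma * math.sqrt(variance_x))


# Problem 3: covariance 10.6, variances 16 and 9.
gamma_3 = correlation_from_covariance(10.6, 16, 9)
print("Problem 3 gamma:", round(gamma_3, 4))

# Problem 4: gamma 0.48, covariance 36, variance of X 16.
sd_y_4 = sd_y_from_correlation(0.48, 36, 16)
print("Problem 4 sd of Y:", round(sd_y_4, 2))
```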

Regression:

Regression is the measure of the average relationship between two or more variables in terms of the
original units of the data.

Uses:

It is used to estimate the relationship between two variables, to predict unknown values, to
forecast business situations, and to estimate the error in sampling.

Equation

Regression equation of X on Y

X = a + bY

Regression equation of Y on X

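Y = a + bX (this form also gives the trend line of a time series)

Here a is the constant (intercept) and b is the regression coefficient (slope).

To determine the values of a and b, the following two normal equations are solved
simultaneously:

ΣY = Na + bΣX

ΣXY = aΣX + bΣX²

When X is measured as deviations from its mean, so that ΣX = 0 (as in a time series measured from
the middle period), these reduce to a = ΣY / N and b = ΣXY / ΣX².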
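The two normal equations for Y on X, ΣY = Na + bΣX and ΣXY = aΣX + bΣX², can be solved by elimination. The Python below is a sketch under that approach; the function name fit_y_on_x is an assumption. It is tried on the points x = 1, 2, 3 with y = 5, 8, 11, which lie exactly on y = 3x + 2.

```python
def fit_y_on_x(X, Y):
    """Solve the normal equations
        sum(Y)  = N*a + b*sum(X)
        sum(XY) = a*sum(X) + b*sum(X^2)
    and return (a, b) for the line Y = a + bX.
    """
    n = len(X)
    if n != len(Y) or n < 2:
        raise ValueError("X and Y must have the same length (at least 2)")

    sum_x, sum_y = sum(X), sum(Y)
    sum_xy = sum(p * q for p, q in zip(X, Y))
    sum_x2 = sum(p * p for p in X)

    # Eliminating a between the two equations gives b; a then follows
    # from the first equation.
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b


a, b = fit_y_on_x([1, 2, 3], [5, 8, 11])
print("a =", a, "b =", b)   # the points lie on y = 3x + 2
```

Since the points fit the line exactly, the solution recovers a = 2 and b = 3 with no error term.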

Problem:5

Given: Mean of X = 16, σx = 4.8; Mean of Y = 20, σy = 9.6. The coefficient of correlation between
x and y is 0.6. What is the regression coefficient of x on y?

σx = 4.8; σy = 9.6

Regression coefficient of x on y: bxy = γ (σx / σy)

= 0.6 (4.8 / 9.6) = 0.6 (0.5)

= 0.3

Problem:6

The correlation coefficient between x and y is -1/2. The value of bxy = -1/8. Find byx.

γ² = bxy × byx

(-1/2)² = (-1/8) × byx

1/4 = (-1/8) × byx

byx = (1/4) × (-8) = -8/4 = -2
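The relation γ² = bxy × byx used in Problem 6 can be checked numerically. The sketch below is illustrative only; the function names are assumptions. It computes both regression coefficients from the Problem 1 data (as Σxy / Σy² and Σxy / Σx², with deviations from the means) and recovers γ from them.

```python
import math


def regression_coefficients(X, Y):
    """Return (bxy, byx): the regression coefficients of X on Y and Y on X."""
    n = len(X)
    mean_x, mean_y = sum(X) / n, sum(Y) / n
    sum_xy = sum((p - mean_x) * (q - mean_y) for p, q in zip(X, Y))
    sum_x2 = sum((p - mean_x) ** 2 for p in X)
    sum_y2 = sum((q - mean_y) ** 2 for q in Y)
    return sum_xy / sum_y2, sum_xy / sum_x2


def correlation_from_regression(bxy, byx):
    """gamma = +/- sqrt(bxy * byx), taking the common sign of bxy and byx."""
    if bxy * byx < 0:
        raise ValueError("the two regression coefficients must share a sign")
    magnitude = math.sqrt(bxy * byx)
    return -magnitude if bxy < 0 else magnitude


# Problem 6: bxy = -1/8 and byx = -2 should give gamma = -1/2.
print(correlation_from_regression(-1 / 8, -2))

# Problem 1 data: the result should match the coefficient found there.
bxy, byx = regression_coefficients([6, 2, 10, 4, 8], [9, 11, 5, 8, 7])
print(bxy, byx, round(correlation_from_regression(bxy, byx), 4))
```

Both regression coefficients always carry the same sign as γ, which is why the square root takes its sign from them.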

POINTS:

Bivariate data: Data collected on two variables simultaneously.

Uncorrelated variables: A change in one variable does not affect the other variable.

Karl Pearson's method is the most widely used method of calculating the correlation coefficient.
Its main limitation is that it applies only to linear relationships.

The quickest method of finding correlation is the concurrent deviation method.

Spurious correlation: Correlation between two variables that have no causal relation.

The method applied for deriving the regression equation is the method of least squares.

Observed value - Estimated value = Error or residual

The two lines of regression coincide when γ = -1 or γ = +1.

The correlation coefficient is independent of the units of measurement.

If the sum of the product of deviations of x and y series from their means is zero, then the
coefficient of correlation will be zero.

The linear equations y = a + bx and
