REGRESSION

The document discusses linear regression and its properties. Linear regression finds the line of best fit describing the relationship between two variables. The key properties discussed are that the correlation coefficient is the geometric mean of the two regression coefficients (so their product equals the squared correlation coefficient) and that the arithmetic mean of the regression coefficients is not less than the correlation coefficient.

Uploaded by

AARUSH SABOO

Regression is the estimation or prediction of unknown values of one variable from known values of another
variable. Once correlation between two variables has been established, it is natural to ask to what extent
one variable varies in response to a given variation in the other, i.e., one is interested in the
nature of the relationship between the two variables.
Regression measures the nature and extent of correlation.

LINEAR REGRESSION:
If two variates x and y are correlated, i.e., there exists an association or relationship between them, then the
points of the scatter diagram will be more or less concentrated round a curve. This curve is called the curve of
regression and the relationship is said to be expressed by means of curvilinear regression. In the particular case
when the curve is a straight line, it is called a line of regression and the regression is said to be linear.
A line of regression is the straight line which gives the best fit, in the least squares sense, to the given
data.
If the line of regression is so chosen that the sum of squares of deviations parallel to the axis of y is
minimised, it is called the line of regression of y on x, and it gives the best estimate of y for any given value of x.
If the line of regression is so chosen that the sum of squares of deviations parallel to the axis of x is
minimised, it is called the line of regression of x on y, and it gives the best estimate of x for any given value of y.

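[Figure: two scatter diagrams on axes OX and OY, each with a line AB and a data point P_i(x_i, y_i). In one diagram the deviation P_iH is drawn parallel to the y-axis to the point H(x_i, y_j) on the line; in the other it is drawn parallel to the x-axis to the point H(x_j, y_i).]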
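The equation of the line of regression of y on x is $y - \bar{y} = r\,\frac{\sigma_y}{\sigma_x}\,(x - \bar{x})$, where $\bar{x}$, $\bar{y}$ are the means, $\sigma_x$, $\sigma_y$ the standard deviations and r the coefficient of correlation.

Similarly, the equation of the line of regression of x on y is $x - \bar{x} = r\,\frac{\sigma_x}{\sigma_y}\,(y - \bar{y})$.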
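
As a rough sketch of how the two lines of regression can be obtained from raw data, the Python function below computes the means, the standard deviations and r, and then forms the slope of each line. The function name `regression_lines` and the sample data are illustrative assumptions, not part of the original notes, and population (divide by n) formulas are assumed throughout.

```python
from math import sqrt

def regression_lines(xs, ys):
    """Return (x_bar, y_bar, b_yx, b_xy, r) using population (divide-by-n) formulas."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Variances and covariance about the means
    var_x = sum((x - x_bar) ** 2 for x in xs) / n
    var_y = sum((y - y_bar) ** 2 for y in ys) / n
    cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n
    sigma_x, sigma_y = sqrt(var_x), sqrt(var_y)
    r = cov_xy / (sigma_x * sigma_y)
    b_yx = r * sigma_y / sigma_x  # slope of the line of regression of y on x
    b_xy = r * sigma_x / sigma_y  # slope of the line of regression of x on y
    return x_bar, y_bar, b_yx, b_xy, r

# Made-up illustrative data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
x_bar, y_bar, b_yx, b_xy, r = regression_lines(xs, ys)
print(f"y - {y_bar} = {b_yx:.3f} (x - {x_bar})")
print(f"x - {x_bar} = {b_xy:.3f} (y - {y_bar})")
```

Both lines pass through the point of means, so only the two slopes differ.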

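Alternative Method:
Instead of calculating $\bar{x}$, $\bar{y}$, $\sigma_x$, $\sigma_y$ and r, we may use the following method for the line of regression of y on x.
Find the sums $\sum x$, $\sum y$, $\sum x^2$ and $\sum xy$.
Solve the equations $\sum y = na + b\sum x$ and $\sum xy = a\sum x + b\sum x^2$ simultaneously for a and b;
we get the required equation $y = a + bx$. The above equations are called normal equations.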
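
The alternative method can be sketched in a few lines of Python. The helper below (the name `fit_by_normal_equations` and the sample points are assumptions made for illustration) forms the four sums and solves the two normal equations for the line y = a + bx by elimination.

```python
def fit_by_normal_equations(xs, ys):
    """Solve  sum(y) = n*a + b*sum(x)  and  sum(xy) = a*sum(x) + b*sum(x^2)  for a, b."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx  # zero only if all x values are equal
    if det == 0:
        raise ValueError("all x values are equal; the line is not determined")
    b = (n * sxy - sx * sy) / det
    a = (sy - b * sx) / n
    return a, b

# Points lying exactly on y = 1 + 2x, so the fit should recover a = 1, b = 2
a, b = fit_by_normal_equations([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)
```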

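Note: (i) $r\,\frac{\sigma_y}{\sigma_x}$ is called the regression coefficient of y on x and is denoted by $b_{yx}$.

(ii) $r\,\frac{\sigma_x}{\sigma_y}$ is called the regression coefficient of x on y and is denoted by $b_{xy}$.

(iii) If r = 0, the two lines of regression become $y = \bar{y}$ and $x = \bar{x}$, which are two straight lines parallel to the X
and Y axes respectively and passing through the point of means $(\bar{x}, \bar{y})$. They are mutually perpendicular.
(iv) If $r = \pm 1$, the two lines of regression coincide.
(v) $b_{yx} = \frac{\overline{xy} - \bar{x}\,\bar{y}}{\overline{x^2} - \bar{x}^2}$ OR $b_{yx} = \frac{\sum (x-\bar{x})(y-\bar{y})}{\sum (x-\bar{x})^2}$

(vi) $b_{xy} = \frac{\overline{xy} - \bar{x}\,\bar{y}}{\overline{y^2} - \bar{y}^2}$ OR $b_{xy} = \frac{\sum (x-\bar{x})(y-\bar{y})}{\sum (y-\bar{y})^2}$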
PROPERTIES OF REGRESSION COEFFICIENTS:


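Property I. The correlation coefficient is the geometric mean between the regression coefficients.
Proof: The coefficients of regression are $r\,\frac{\sigma_y}{\sigma_x}$ and $r\,\frac{\sigma_x}{\sigma_y}$.

G.M. between them $= \sqrt{r\,\frac{\sigma_y}{\sigma_x} \cdot r\,\frac{\sigma_x}{\sigma_y}} = \sqrt{r^2} = r$ = coefficient of correlation.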

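Property II. If one of the regression coefficients is greater than unity, the other must be less than unity.
Proof: Let $b_{yx} > 1$; then $\frac{1}{b_{yx}} < 1$.

Since $b_{yx}\, b_{xy} = r^2 \le 1$, we get $b_{xy} \le \frac{1}{b_{yx}} < 1$.

Similarly, if $b_{xy} > 1$ then $b_{yx} < 1$.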

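Property III. The arithmetic mean of the regression coefficients is greater than the correlation coefficient.

Proof: We have to prove that $\frac{b_{yx} + b_{xy}}{2} > r$, or $r\,\frac{\sigma_y}{\sigma_x} + r\,\frac{\sigma_x}{\sigma_y} > 2r$,

or $\sigma_y^2 + \sigma_x^2 > 2\sigma_x\sigma_y$ (dividing by r > 0 and multiplying by $\sigma_x\sigma_y$), or $(\sigma_x - \sigma_y)^2 > 0$, which is true whenever $\sigma_x \ne \sigma_y$; when $\sigma_x = \sigma_y$ the two sides are equal.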
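
Properties I to III can be checked numerically. The sketch below uses assumed values of r, $\sigma_x$ and $\sigma_y$ (chosen only for illustration) and the definitions $b_{yx} = r\sigma_y/\sigma_x$, $b_{xy} = r\sigma_x/\sigma_y$; the helper name `regression_coefficients` is hypothetical.

```python
from math import sqrt, isclose

def regression_coefficients(r, sigma_x, sigma_y):
    """Return (b_yx, b_xy) from r and the two standard deviations."""
    return r * sigma_y / sigma_x, r * sigma_x / sigma_y

# Illustrative values only (assumed, not taken from the notes)
r, sigma_x, sigma_y = 0.5, 2.0, 6.0
b_yx, b_xy = regression_coefficients(r, sigma_x, sigma_y)

geometric_mean = sqrt(b_yx * b_xy)
arithmetic_mean = (b_yx + b_xy) / 2

print(b_yx, b_xy)                        # Property II: one exceeds 1, the other is below 1
print(isclose(geometric_mean, abs(r)))   # Property I: G.M. equals r in magnitude
print(arithmetic_mean >= r)              # Property III: A.M. is not less than r
```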

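Property IV: Regression coefficients are independent of the origin but not of scale.
Proof: Let $u = \frac{x - a}{h}$, $v = \frac{y - b}{k}$, where a, b, h and k are constants (h, k > 0). Then $\sigma_x = h\sigma_u$, $\sigma_y = k\sigma_v$ and $r_{xy} = r_{uv}$, so

$b_{yx} = r_{xy}\,\frac{\sigma_y}{\sigma_x} = r_{uv}\,\frac{k\sigma_v}{h\sigma_u} = \frac{k}{h}\left(r_{uv}\,\frac{\sigma_v}{\sigma_u}\right) = \frac{k}{h}\, b_{vu}$

Similarly $b_{xy} = \frac{h}{k}\, b_{uv}$.
Thus, $b_{yx}$ and $b_{xy}$ are both independent of a and b but not of h and k.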
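
A small numerical check of Property IV, assuming the substitution u = (x - a)/h, v = (y - b)/k: shifting the origin leaves the slope unchanged, while a change of scale multiplies it by k/h. The helper name `slope_y_on_x` and the constants a, b, h, k below are illustrative assumptions.

```python
def slope_y_on_x(xs, ys):
    """Regression coefficient of y on x: sum((x-x_bar)(y-y_bar)) / sum((x-x_bar)^2)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

xs = [100, 110, 120, 130, 140]
ys = [45, 51, 54, 61, 66]

a, b, h, k = 120, 50, 10, 2  # assumed origin shifts (a, b) and scale factors (h, k)
us = [(x - a) / h for x in xs]
vs = [(y - b) / k for y in ys]

b_yx = slope_y_on_x(xs, ys)
b_vu = slope_y_on_x(us, vs)
shift_only = slope_y_on_x([x - a for x in xs], [y - b for y in ys])

print(b_yx, shift_only)      # equal: a change of origin has no effect
print(b_yx, (k / h) * b_vu)  # equal: b_yx = (k/h) * b_vu
```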

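Property V. The correlation coefficient and the two regression coefficients have the same sign.
Proof: Regression coefficient of y on x $= b_{yx} = r\,\frac{\sigma_y}{\sigma_x}$

Regression coefficient of x on y $= b_{xy} = r\,\frac{\sigma_x}{\sigma_y}$

Since $\sigma_x$ and $\sigma_y$ are both positive, $b_{yx}$, $b_{xy}$ and r have the same sign.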

ANGLE BETWEEN TWO LINES OF REGRESSION:


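If θ is the acute angle between the two regression lines in the case of two variables X and Y, then

$\tan\theta = \frac{1 - r^2}{|r|} \cdot \frac{\sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2}$,

where r, $\sigma_x$, $\sigma_y$ have their usual meaning. Explain the significance of the formula when
r = 0 and r = ±1.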
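
The behaviour of the angle at the two extremes can be seen with a short sketch. It assumes the standard form $\tan\theta = \frac{1-r^2}{|r|}\cdot\frac{\sigma_x\sigma_y}{\sigma_x^2+\sigma_y^2}$; the function name and the sample values are assumptions for illustration.

```python
from math import atan, degrees

def angle_between_regression_lines(r, sigma_x, sigma_y):
    """Acute angle (in degrees) between the two regression lines."""
    if r == 0:
        return 90.0  # the lines are parallel to the axes, hence perpendicular
    tan_theta = (1 - r * r) / abs(r) * (sigma_x * sigma_y) / (sigma_x ** 2 + sigma_y ** 2)
    return degrees(atan(tan_theta))

print(angle_between_regression_lines(1.0, 2.0, 3.0))  # perfect correlation: lines coincide
print(angle_between_regression_lines(0.0, 2.0, 3.0))  # no correlation: lines perpendicular
print(angle_between_regression_lines(0.5, 2.0, 2.0))  # an intermediate case
```

As |r| grows from 0 to 1 the angle shrinks from a right angle to zero, so a small angle between the lines indicates a high degree of correlation.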

SOME SOLVED EXAMPLES:


1. A chemical engineer is investigating the effect of process operating temperature on product yield. The
study results in the following data.
Temperature (x): 100 110 120 130 140 150 160 170 180 190
Yield (y):       45  51  54  61  66  70  74  78  85  89
Find the equation of the least squares line which will enable us to predict yield on the basis of temperature.
Find also the degree of relationship between the temperature and the yield. Also verify that the sum of the
coefficients of regression is greater than 2r.
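Solution: Take X = x - 150 and Y = y - 70.

Sr. no.    x     X = x - 150   X²/100    y    Y = y - 70    Y²     XY
1         100       -50          25      45      -25        625   1250
2         110       -40          16      51      -19        361    760
3         120       -30           9      54      -16        256    480
4         130       -20           4      61       -9         81    180
5         140       -10           1      66       -4         16     40
6         150         0           0      70        0          0      0
7         160        10           1      74        4         16     40
8         170        20           4      78        8         64    160
9         180        30           9      85       15        225    450
10        190        40          16      89       19        361    760
Total               -50          85              -27       2005   4120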

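Calculation of $\bar{x}$, $\bar{y}$, $b_{yx}$, $b_{xy}$ etc.
With X = x - 150 and Y = y - 70 we have $\sum X = -50$, $\sum Y = -27$, $\sum X^2 = 8500$, $\sum Y^2 = 2005$, $\sum XY = 4120$ and n = 10.
$\bar{X} = \frac{-50}{10} = -5$, so $\bar{x} = 150 + \bar{X} = 145$
$\bar{Y} = \frac{-27}{10} = -2.7$, so $\bar{y} = 70 + \bar{Y} = 67.3$
$\overline{XY} = 412$, $\overline{X^2} = 850$, $\overline{Y^2} = 200.5$
$b_{yx} = \frac{\overline{XY} - \bar{X}\,\bar{Y}}{\overline{X^2} - \bar{X}^2} = \frac{412 - (-5)(-2.7)}{850 - (-5)^2} = \frac{398.5}{825} = 0.4830$
$b_{xy} = \frac{\overline{XY} - \bar{X}\,\bar{Y}}{\overline{Y^2} - \bar{Y}^2} = \frac{412 - (-5)(-2.7)}{200.5 - (-2.7)^2} = \frac{398.5}{193.21} = 2.0625$

The line of regression of y on x is $y - \bar{y} = b_{yx}\,(x - \bar{x})$, i.e.

$y - 67.3 = 0.4830\,(x - 145)$, or $y = 0.4830\,x - 2.74$

The coefficient of correlation $r = \sqrt{b_{yx}\, b_{xy}} = \sqrt{0.4830 \times 2.0625} = 0.998$

Now, $b_{yx} + b_{xy} = 0.4830 + 2.0625 = 2.5455$
And $2r = 1.996$
Hence, we see that $b_{yx} + b_{xy} > 2r$.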
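
The arithmetic of this example can be re-checked directly from the ten data pairs. The sketch below is one possible way to do so in Python; it works with deviations from the actual means rather than from the assumed origins 150 and 70, and the variable names are illustrative.

```python
from math import sqrt

temperature = [100, 110, 120, 130, 140, 150, 160, 170, 180, 190]
product_yield = [45, 51, 54, 61, 66, 70, 74, 78, 85, 89]

n = len(temperature)
x_bar = sum(temperature) / n
y_bar = sum(product_yield) / n

# Sums of squares and products of deviations from the means
sxx = sum((x - x_bar) ** 2 for x in temperature)
syy = sum((y - y_bar) ** 2 for y in product_yield)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(temperature, product_yield))

b_yx = sxy / sxx            # regression coefficient of yield on temperature
b_xy = sxy / syy            # regression coefficient of temperature on yield
r = sxy / sqrt(sxx * syy)   # coefficient of correlation
intercept = y_bar - b_yx * x_bar

print(f"x_bar = {x_bar}, y_bar = {y_bar}")
print(f"y = {b_yx:.4f} x + ({intercept:.4f})")
print(f"r = {r:.4f}, b_yx + b_xy = {b_yx + b_xy:.4f}, 2r = {2 * r:.4f}")
```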
2. A panel of two judges graded dramatic performances by independently awarding marks as follows:
Performance No.:        1  2  3  4  5  6  7
Marks by first judge:   36 32 34 31 31 32 35
Marks by second judge:  35 33 31 30 34 32 36
The eighth performance, however, which the second judge could not attend, got 38 marks from the first judge. If the second judge had
also been present, how many marks would he be expected to have awarded to the eighth performance?
Solution: We have to find the marks that would have been awarded by the second judge. Therefore, let the marks
given by the first judge be denoted by x and those given by the second judge by y.
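For calculation of the coefficient of correlation,
$\bar{x} = \frac{\sum x}{n} = \frac{231}{7} = 33$, $\bar{y} = \frac{\sum y}{n} = \frac{231}{7} = 33$

Now $r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2}\,\sqrt{\sum (y - \bar{y})^2}} = \frac{21}{\sqrt{24}\,\sqrt{28}} = 0.81$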

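Calculation of coefficient of correlation etc.

Sr. no.    x    x - x̄   (x - x̄)²    y    y - ȳ   (y - ȳ)²   Product (x - x̄)(y - ȳ)
1         36      3        9       35      2        4          6
2         32     -1        1       33      0        0          0
3         34      1        1       31     -2        4         -2
4         31     -2        4       30     -3        9          6
5         31     -2        4       34      1        1         -2
6         32     -1        1       32     -1        1          1
7         35      2        4       36      3        9          6
Total    231      0       24      231      0       28         21

$\sigma_x = \sqrt{\frac{\sum (x - \bar{x})^2}{n}} = \sqrt{\frac{24}{7}} = 1.85$, $\sigma_y = \sqrt{\frac{\sum (y - \bar{y})^2}{n}} = \sqrt{\frac{28}{7}} = \sqrt{4} = 2$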

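The equation of the line of regression of y on x is $y - \bar{y} = r\,\frac{\sigma_y}{\sigma_x}\,(x - \bar{x})$

$y - 33 = 0.81 \times \frac{2}{1.85}\,(x - 33)$, i.e. $y - 33 \approx 0.875\,(x - 33)$
To find the value of y when x = 38, put x = 38 in the above equation:
$y - 33 = 0.875\,(38 - 33)$

$y = 37.375 = 37$ approximately.
Therefore, the second judge would have given 37 marks to the eighth performance.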

3. The following data regarding the heights ( ) and weights ( ) of 100 college students are given
∑ ∑ ∑ ∑ ∑ Find the coefficient of
correlation between height and weight and also the equation of regression of height and weight
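Solution: The coefficients of regression are given by
$b_{yx} = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}$

$b_{xy} = \frac{n\sum xy - \sum x \sum y}{n\sum y^2 - \left(\sum y\right)^2}$

$r = \sqrt{b_{yx}\, b_{xy}}$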
The equation of the lines of regression of on is ̅ ( ̅)
( )

4. From 10 observations on price and supply of a commodity the following summary figures were
obtained ∑ ∑ ∑ ∑ Compute the equation of the line of
regression of on and interpret the result. Estimate the supply when price is 16 units
Solution: We obtain the values of a and b of the equation of the line of regression of supply (y) on price (x), i.e. of the equation
$y = a + bx$, from the normal equations
$\sum y = na + b\sum x$
$\sum xy = a\sum x + b\sum x^2$
But ∑ ∑ ∑ ∑

Multiply the first equation by 13 and subtract the result from the second equation

Substituting this value of in any of the two equations we get,


Hence, the equation of the line of regression is,
When putting this value,
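
When only summary sums are available, as in this example, the normal equations $\sum y = na + b\sum x$ and $\sum xy = a\sum x + b\sum x^2$ can be solved directly. The sketch below uses hypothetical sums (deliberately not the figures of this example) and a hypothetical helper name `line_from_sums`.

```python
def line_from_sums(n, sum_x, sum_y, sum_x2, sum_xy):
    """Solve the normal equations for y = a + b x using only the summary sums."""
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b

# Hypothetical sums for x = 1..5 and y = 2 + x (NOT the figures of Example 4)
a, b = line_from_sums(n=5, sum_x=15, sum_y=25, sum_x2=55, sum_xy=85)
estimate = a + b * 16  # estimated y at x = 16
print(a, b, estimate)
```

Here b is the estimated change in y for a unit change in x, which is the usual way to interpret the fitted line.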

5. Given the following results of weights X and heights Y of 1000 men

where $\bar{X}$, $\bar{Y}$ are the means of X and Y, $\sigma_X$, $\sigma_Y$ are the standard deviations of X and Y and r is the correlation
coefficient between X and Y. John weighs 200 lbs and Smith is 5 feet tall. Estimate the height of John and the weight
of Smith. From the estimated height of John, estimate his weight. Why is it different from 200?

Solution: With the given notation the line of regression of Y on X is $Y - \bar{Y} = r\,\frac{\sigma_Y}{\sigma_X}\,(X - \bar{X})$

Substituting the given values, ( ) ( )


Put X = 200:
( )
inches
Now the line of regression of X on Y is $X - \bar{X} = r\,\frac{\sigma_X}{\sigma_Y}\,(Y - \bar{Y})$

Substituting the given values, ( ) ( )


Put Y = 5 feet = 60 inches:
( )

lbs
Hence, the height of John inches and weight of Smith lbs
To estimate the weight of John from his height 71.25 we have to use the equation of the line of regression
of X on Y (and not of Y on X),
i.e. ( )

Putting we get ( )
The difference is due to the fact that for estimating Y we use one equation and for estimating X we
use another equation.
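
The reason the round trip does not return 200 can be illustrated numerically. The sketch below uses assumed summary values (not the figures of this example) and two hypothetical helper functions, one for each line of regression.

```python
def predict_height(weight, p):
    """Line of regression of Y (height) on X (weight)."""
    return p["y_bar"] + p["r"] * p["sigma_y"] / p["sigma_x"] * (weight - p["x_bar"])

def predict_weight(height, p):
    """Line of regression of X (weight) on Y (height)."""
    return p["x_bar"] + p["r"] * p["sigma_x"] / p["sigma_y"] * (height - p["y_bar"])

# Assumed summary values for illustration only (not the figures of Example 5)
params = {"x_bar": 160.0, "y_bar": 66.0, "sigma_x": 25.0, "sigma_y": 3.0, "r": 0.5}

height = predict_height(200.0, params)       # estimate height from weight 200
weight_back = predict_weight(height, params) # estimate weight from that height
print(height, weight_back)
```

Going from X to Y and back multiplies the deviation from the mean by $b_{yx} b_{xy} = r^2$, so the original value is recovered only when r = ±1.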

6. It is given that the means of and are 5 and 10. If the line of regression of on is parallel to the line
estimate the value of for
Solution: The line of regression of y on x is $y - \bar{y} = b_{yx}\,(x - \bar{x})$
Its slope is But this line is parallel to i.e. whose slope is

But by data ̅ and ̅ Hence, the equation of the line of regression of on is


( ) i.e.

When

7. Given Find (i) ̅ and ̅ (ii) and (iii)


Solution: (i) To find ̅ and ̅ We solve the given equations simultaneously. Multiply the first equation by 3
and add
̅
Putting this value in any of the given equations ̅
(ii) To find Suppose the first equation represents, the line of regression of on
Writing it as we find
Suppose the second equation represents the line of regression of on
Writing it as we find

√ √ √ √

But the value of r can never be greater than 1 numerically. Hence, our supposition is wrong.
Now treating the first equation as representing the line of regression of on we write it as,

Treating the second equation as representing the line of regression of on we write it as,

√ √ √

(iii) For
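
The trial-and-check procedure used in this example (suppose one assignment of the two lines, reject it if it gives $r^2 > 1$) can be sketched as follows. The function name and the pair of equations are assumptions for illustration, not the equations of this example, and the sketch assumes valid input in which both coefficients have the same sign.

```python
from math import sqrt

def analyse_regression_lines(line1, line2):
    """Each line is (A, B, C) meaning A*x + B*y = C. Returns (x_bar, y_bar, b_yx, b_xy, r)."""
    (a1, b1, c1), (a2, b2, c2) = line1, line2
    # Both lines pass through (x_bar, y_bar): solve the 2x2 system by Cramer's rule
    det = a1 * b2 - a2 * b1
    x_bar = (c1 * b2 - c2 * b1) / det
    y_bar = (a1 * c2 - a2 * c1) / det
    # Supposition: line1 is y on x, line2 is x on y
    b_yx, b_xy = -a1 / b1, -b2 / a2
    if b_yx * b_xy > 1:  # r^2 > 1 is impossible, so the supposition was wrong
        b_yx, b_xy = -a2 / b2, -b1 / a1
    r = sqrt(b_yx * b_xy)
    if b_yx < 0:  # r has the same sign as the regression coefficients
        r = -r
    return x_bar, y_bar, b_yx, b_xy, r

# Assumed pair of lines for illustration (not the equations of Example 7)
result = analyse_regression_lines((6, 1, 31), (3, 2, 26))
print(result)
```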

8. The regression lines of a sample are and Find (i) Sample means ̅ and ̅
(ii) coefficient of correlation between and Also estimate when
Also verify that the sum of the coefficients of regression is greater than 2r.
Solution: (i) Mean ̅ and ̅ are obtained by solving the two given equations.

(ii) If the line is the line of regression of on then


i.e.
If the line is the line of regression of then
i.e.

√ √( ) ( ) √

Since $b_{yx}$ and $b_{xy}$ are negative, r is negative.



Since (Numerically) and we see that

(iii) To estimate when we use the line of regression of on i.e. when

9. Find the angle between the lines of regression using the following data ∑ ∑
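Solution: The angle θ between the lines of regression is given by $\tan\theta = \left(\frac{1 - r^2}{|r|}\right)\left(\frac{\sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2}\right)$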


( )
Putting the given values * +( )

10. If the tangent of the angle made by the line of regression of on is 0.6 and find the correlation
coefficient between and
Solution: If the equation of the line of regression of y on x is $y - \bar{y} = b_{yx}\,(x - \bar{x})$ then we know that $b_{yx}$ is the
slope of the line of regression. We are thus given $b_{yx} = 0.6$.
But and

Putting these values

11. If the tangent of the angle made by the lines of regression is 0.6 and find the coefficient of
correlation between and .
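Solution: We know that the tangent of the angle between two lines of regression is given by

$\tan\theta = \left(\frac{1 - r^2}{|r|}\right)\left(\frac{\sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2}\right)$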

But and

( )( ) ( )

( )( )
or ⁄ (| | cannot be )
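
Finding r from the tangent of the angle amounts to solving a quadratic. The sketch below assumes the formula $\tan\theta = \frac{1-r^2}{r}\cdot k$ with $k = \frac{\sigma_x\sigma_y}{\sigma_x^2+\sigma_y^2}$ and r > 0; the function name and the standard deviations used in the call are assumptions for illustration.

```python
from math import sqrt

def r_from_angle(tan_theta, sigma_x, sigma_y):
    """Solve tan(theta) = ((1 - r^2) / r) * k for r, where k = sx*sy / (sx^2 + sy^2)."""
    k = sigma_x * sigma_y / (sigma_x ** 2 + sigma_y ** 2)
    # Rearranged as a quadratic in r: k*r^2 + tan_theta*r - k = 0
    disc = sqrt(tan_theta ** 2 + 4 * k * k)
    roots = ((-tan_theta + disc) / (2 * k), (-tan_theta - disc) / (2 * k))
    # The roots multiply to -1, so at most one of them lies strictly inside [-1, 1]
    return [root for root in roots if abs(root) <= 1]

# Assumed values: tan(theta) = 0.6 with sigma_y twice sigma_x
valid_roots = r_from_angle(0.6, 1.0, 2.0)
print(valid_roots)
```

The root with |r| > 1 is discarded because a correlation coefficient cannot exceed 1 numerically.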

12. If and the angle between the lines of regression is find the coefficient of correlation

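Solution: We have $\tan\theta = \left(\frac{1 - r^2}{|r|}\right)\left(\frac{\sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2}\right)$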

( )


Since cannot be greater than 1, √

13. If the arithmetic mean of regression coefficients is and their difference is find the correlation
coefficient.
Solution: Let the coefficients of regression be and Now by data and
and
and
Coefficient of correlation √ √
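
The steps of this solution (recover the two coefficients from their mean and difference, then take the geometric mean) can be sketched as below. The function name and the input values are hypothetical and are not the figures of this example.

```python
from math import sqrt

def r_from_mean_and_difference(mean, difference):
    """Recover r when (b1 + b2)/2 = mean and b1 - b2 = difference."""
    b1 = mean + difference / 2
    b2 = mean - difference / 2
    product = b1 * b2  # equals r^2
    if product < 0 or product > 1:
        raise ValueError("not a valid pair of regression coefficients")
    r = sqrt(product)
    return -r if b1 < 0 else r  # r carries the sign of the coefficients

# Hypothetical inputs: mean 0.65 and difference 0.5 give coefficients 0.9 and 0.4
r_value = r_from_mean_and_difference(0.65, 0.5)
print(r_value)
```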
