
Regression Analysis

7.1 INTRODUCTION
After having established the fact that two variables are closely related, we may be interested in estimating (predicting) the value of one variable given the value of the other. For example, if we know that advertising and sales are correlated, we may find out the expected amount of sales for a given advertising expenditure, or the expenditure required for attaining a given amount of sales. Similarly, if we know that rainfall and the yield of a crop are closely related, we may find out the amount of rain required to achieve a certain production figure. Regression analysis reveals the average relationship between two variables and this makes estimation or prediction possible.

The dictionary meaning of the term 'regression' is the act of returning or going back. The term 'regression' was first used by Sir Francis Galton in 1877 while studying the relationship between the heights of fathers and sons. The term was introduced by him in the paper "Regression towards Mediocrity in Hereditary Stature." His study of the heights of about one thousand fathers and sons revealed a very interesting relationship, i.e., tall fathers tend to have tall sons and short fathers short sons, but the average height of the sons of a group of short fathers is more than that of the fathers. The line describing this tendency to regress or go back was called by Galton a 'Regression Line'. The term is still used to describe the line drawn for a group of points to represent the trend present, but it no longer necessarily carries the original implication of "stepping back" that Galton intended.

These days there is a growing tendency among modern writers to use the term estimating line instead of regression line, because the expression estimating line is more clarificatory in character. Let us examine a few definitions of the term regression.
1. "Regression is the mcasure of the
terms of the original units of the data. average relationship betwcen two or more vanabics
-Bar
2. *The term regression analysis' refers to the
values of a variable from a knowledge of the values mcthods by which estimates are made ot
measurement of the errors involved in this estimation of one or more other variables and o
process." -Morris Hambrg
3. "One of the most frequently uscd
technigue in economics and business researen,
relation between two or more variables that are related casually, is regression
analy -Iarv
si lam
4. "Regression analysis attempts to establish the nature of the relationship' betwecn van
ables-that is, to study the functional relationship between the variables and thereby prosnk
mechanisn for prediction, or forecasting. -la-Lw-(h
It is clear from the above definitions that regression analysis is a statistical device with the help of which we are in a position to estimate (or predict) the unknown values of one variable from known values of another variable. The variable which is used to predict the variable of interest is called the independent variable or explanatory variable, and the variable we are trying to predict is called the dependent variable or 'explained' variable. The independent variable is denoted by X and the dependent variable by Y. The analysis used is called simple linear regression analysis - simple because there is only one predictor or independent variable, and linear because of the assumed linear relationship between the dependent and the independent variables. The term 'linear' means that an equation of a straight line of the form Y = a + bX, where a and b are constants, is used to describe the average relationship that exists between the two variables. Two variables are said to have a linear relationship when a change in the independent variable (say X) by one unit leads to a constant absolute change in the dependent variable. If the two variables have a linear relationship, regression lines can be used to find out the value of one variable given the value of the other.


When we plot two variables (say X and Y) on a scatter diagram and draw lines of best fit which pass through the plotted points, these lines are called regression lines. In linear regression these lines are straight ones. The regression lines are based on two equations, called regression equations, which give the best estimate of one variable when the other is given.

It should be noted that the terms 'dependent' and 'independent' refer to the mathematical or functional meaning of dependence - they do not imply that there is necessarily any cause and effect relationship between the variables. What is meant is simply that estimates of values of the dependent variable Y may be obtained for given values of the independent variable X from a mathematical function involving X and Y. In that sense, the values of Y are dependent upon the values of X. The X variable may or may not be causing changes in the Y variable. For example, in estimating sales of a product from figures on advertising expenditure, sales is generally taken as the dependent variable and advertising expenditure as the independent variable. However, there may or may not be a causal connection between these two factors in the sense that changes in advertising expenditure cause changes in sales. In fact, in certain cases the cause-effect relation may be just the opposite of what appears to be the obvious one.
7.2 USES OF REGRESSION ANALYSIS
Regression analysis is a branch of statistical theory that is widely used in almost all the scientific disciplines. In economics it is the basic technique for measuring or estimating the relationship among economic variables that constitute the essence of economic theory and economic life. For example, if we know that two variables, price (X) and demand (Y), are closely related, we can find out the most probable value of X for a given value of Y or the most probable value of Y for a given value of X. Similarly, if we know that the amount of tax and the rise in the price of a commodity are closely related, we can find out the expected price for a certain amount of tax levy. Thus we find that the study of regression is of considerable help to economists and businessmen. The uses of regression are not confined to economics and business fields only. Its applications extend to almost all the natural, physical and social sciences. Regression analysis attempts to accomplish the following:

1. Regression analysis provides estimates of values of the dependent variable from values of the independent variable. The device used to accomplish this estimation procedure is the regression line. The regression line describes the average relationship existing between the X and Y variables, i.e., it displays mean values of X for given values of Y. The equation of this line, known as the regression equation, provides estimates of the dependent variable when values of the independent variable are inserted into the equation.

2. A second goal of regression analysis is to obtain a measure of the error involved in using the regression line as a basis for estimation. For this purpose the standard error of estimate is calculated; it measures the scatter of the observed values around the corresponding values estimated from the regression line. If the line fits the data closely, that is, if there is little scatter of the observations around the regression line, good estimates can be made of the Y variable. On the other hand, if there is a great deal of scatter of the observations around the fitted regression line, the line will not produce accurate estimates of the dependent variable.

3. With the help of regression coefficients we can calculate the correlation coefficient. The square of the correlation coefficient (r²), called the coefficient of determination, measures the degree of association or correlation that exists between the two variables. It assesses the proportion of variance in the dependent variable that has been accounted for by the regression equation. In general, the greater the value of r², the better is the fit and the more useful the regression equation as a predictive device.
7.3 DIFFERENCE BETWEEN CORRELATION & REGRESSION ANALYSIS

Correlation and regression analysis are constructed under different assumptions and they furnish different types of information, and it is not always clear which measure should be used in a given problem or situation. The following are the points of difference between the two:

1. Whereas the correlation coefficient is a measure of the degree of covariability between X and Y, the objective of regression analysis is to study the 'nature of relationship' between the variables so that we may be able to predict the value of one on the basis of the other. The closer the relationship between the two variables, the greater the confidence that may be placed in the estimates.

2. Correlation is merely a tool for ascertaining the degree of relationship between two variables and, therefore, we cannot say that one variable is the cause and the other the effect. For example, a high degree of correlation between price and demand for a certain commodity at a particular point of time may not suggest which is the cause and which is the effect. However, in regression analysis one variable is taken as dependent while the other as independent, thus making it possible to study the cause and effect relationship. It should be noted that statistical association does not imply causation, but the existence of causation always implies association. Statistical evidence can only establish the presence or absence of association between variables; whether causation exists or not depends purely on reasoning.

3. In correlation analysis r is a measure of the direction and degree of the linear relationship between two variables X and Y. rxy and ryx are symmetric (rxy = ryx), i.e., it is immaterial which of X and Y is the dependent variable and which is the independent variable. In regression analysis the regression coefficients bxy and byx are not symmetric, i.e., bxy ≠ byx, and hence it definitely makes a difference which variable is dependent and which is independent.

4. There may be nonsense correlation between two variables which is purely due to chance and has no practical relevance, such as an increase in income and an increase in the weight of a group of people. However, there is nothing like nonsense regression.

5. The correlation coefficient is independent of change of scale and origin, while regression coefficients are independent of change of origin but not of scale. The coefficient of correlation (r) takes the same sign as the regression coefficients (bxy and byx). Also, if b is significant at a given level of significance, r is also significant at that level.
7.4 REGRESSION LINES
If we take the case of two variables X and Y, we shall have two regression lines: the regression of X on Y and the regression of Y on X. The regression line of Y on X gives the most probable values of Y for given values of X, and the regression line of X on Y gives the most probable values of X for given values of Y. However, when there is either perfect positive or perfect negative correlation between the two variables (r = ±1), the two regression lines will coincide, i.e., we will have only one line. The farther the two regression lines are from each other, the lesser is the degree of correlation, and the nearer the two regression lines are to each other, the higher is the degree of correlation. If the variables are independent, r is zero and the lines of regression are at right angles, i.e., parallel to OX and OY.

It should be noted that the regression lines cut each other at the point of the averages of X and Y, i.e., if from the point where both the regression lines cut each other a perpendicular is drawn on the X-axis, we will get the mean value of X, and if from that point a horizontal line is drawn on the Y-axis, we will get the mean value of Y.

It is important to note that the regression lines are drawn on the least squares assumption, which stipulates that the sum of squares of the deviations of the observed values from the line shall be minimum. The total of the squares of the deviations of the various points is minimum only from the line of best fit. The deviations from the points to the line of best fit may be measured in two ways - vertical, i.e., parallel to the Y-axis, and horizontal, i.e., parallel to the X-axis. For minimising the two totals of squares separately it is essential to have two regression lines. The regression line of Y on X is drawn in such a way that it minimises the total of the squares of the vertical deviations, and the regression line of X on Y minimises the total of the squares of the horizontal deviations. This can be best appreciated with the help of the following example:

Height of father (inches):  65   63   67   64   68   62   70   66   68
Height of son (inches):     71   68   …    65   …

The two regression equalions corresponding to these varables are:


-3.38 L.036
| )
Y 35.82 2.476 X
ecRESSION ANALYSIS 7.5

Rassuming anv alucs of wc can find out corresponding values ofX liom Eq. (i).
cxample if = Iwould be -3.38+ L.036 (65)- 63.96
For 70, Nwould be
Sinilarlv,if) 3.38 + L036 (70)= 69.14.
S nlot these points on the graph and obtain regression ine of Xon Y.
ry by assigning any valucs to Xin Eq (n) we can obtain corresponding values of
FThes if=ci, would be
35.82 + 2.476 (63) = 6$.8O8 or 65.81
would be 35.82 +2.476(70) =69,14
andfer '=
70,
ornh of original data and these lines would be as follows:

(INCHES)
70
sON 69

OF 68
HEIGHT 67 REGAESSION
LINE OFY ON X
66

65

REGRESSION
63 LINE OFX ON Y

62 63 64 65 66 67 68 69 70 71
HEIGHT OF FATHERS (INCHES)
(GRAPH OF ORIGINAL DATA)
Y

1
(INCHES)
70

SON 69
68
OF
HEIGHT57

65
64
a
63
62
63 64 65 66 67 68 69 70 71

HEIGHT OF FATHERS (INCHES)


OF YON X(Y- Y)
7.6
REGRESSION ANALEs
71
(INCHES)
70

69
SON
68

OF 67
HEIGHT
6E

65
64
hx=a+ by
63
a
62 63 64 65 66 67 68 69 70 71
HEIGHT OF FATHERS (INCHES)

7.5 REGRESSION EQUATIONS


Regression equations, also known as estimating equations, are algebraic expressions of the regression lines. Since there are two regression lines, there are two regression equations: the regression equation of X on Y is used to describe the variation in the values of X for given changes in Y, and the regression equation of Y on X is used to describe the variation in the values of Y for given changes in X.
Regression Equation of Y on X
The regression equation of Y on X is expressed as follows:
Yc = a + bX
In this equation a and b are constants (fixed numerical values) which determine the position of the line completely. These constants are called the parameters of the line. If the value of either or both of them is changed, another line is determined. The parameter 'a' determines the Y-intercept, i.e., what will be the value of Y (the dependent variable) when X (the independent variable) takes the value zero. The parameter 'b' determines the slope of the line, i.e., the change in Y per unit change in X. The symbol Yc stands for the value of Y computed from the relationship for a given X.

If the values of the constants 'a' and 'b' are obtained, the line is completely determined. But the question is how to obtain these values. The answer is provided by the method of Least Squares, which states that the line should be drawn through the plotted points in such a manner that the sum of the squares of the deviations of the actual Y values from the computed Y values is the least; in other words, in order to obtain the line which best fits the points, Σ(Y - Yc)² should be minimum. Such a line is known as the line of 'best fit'.
A straight line fitted by least squares has the following characteristics:
1. It gives the best fit to the data in the sense that it makes the sum of the squared deviations from the line, Σ(Y - Yc)², smaller than it would be from any other straight line. This property accounts for the name 'Least Squares'.
2. The deviations above the line equal those below the line, on the average. This means that the total of the positive and negative deviations is zero, or Σ(Y - Yc) = 0.
3. The straight line goes through the overall means of the data (X̄, Ȳ).
4. When the data represent a sample from a larger population, the least squares line is a 'best' estimate of the population regression line.
With a little algebra and differential calculus it can be shown that the following two equations, if solved simultaneously, will yield values of the parameters a and b such that the least squares requirement is fulfilled:
ΣY = Na + bΣX
ΣXY = aΣX + bΣX²
These equations are usually called the normal equations. In the equations, ΣX, ΣY, ΣXY and ΣX² indicate totals which are computed from the observed pairs of values of the two variables X and Y to which the least squares estimating line is to be fitted, and N is the number of observed pairs of values.
Regression Equation of X on Y
The regression equation of X on Y is expressed as follows:
Xc = a + bY
To determine the values of a and b the following two normal equations are to be solved simultaneously:
ΣX = Na + bΣY
ΣXY = aΣY + bΣY²
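These two pairs of normal equations translate directly into a small linear solve. The sketch below is illustrative only and is not part of the original text; the function name and the sample x and y values are made up for demonstration, and NumPy is assumed to be available.

```python
import numpy as np

def fit_by_normal_equations(x, y):
    """Solve  sum(Y) = N*a + b*sum(X)  and  sum(XY) = a*sum(X) + b*sum(X^2)  for a and b."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    coeffs = np.array([[n,       x.sum()],
                       [x.sum(), (x * x).sum()]])
    rhs = np.array([y.sum(), (x * y).sum()])
    a, b = np.linalg.solve(coeffs, rhs)   # intercept a, slope b
    return a, b

# Hypothetical paired observations, purely for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 4.3, 5.9, 8.2, 9.8]
a_yx, b_yx = fit_by_normal_equations(x, y)   # regression of Y on X
a_xy, b_xy = fit_by_normal_equations(y, x)   # regression of X on Y (roles swapped)
print(f"Y on X:  Y = {a_yx:.3f} + {b_yx:.3f} X")
print(f"X on Y:  X = {a_xy:.3f} + {b_xy:.3f} Y")
```

Swapping the roles of the two series reuses the same pair of normal equations for the X-on-Y line, mirroring the two sets of equations above.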
Illustration 1. From the following data obtain the two regression equations:
X:   6    2   10    4    8
Y:   9   11    5    8    7
Solution.
OBTAINING REGRESSION EQUATIONS
X      Y      XY     X²     Y²
6      9      54     36     81
2     11      22      4    121
10     5      50    100     25
4      8      32     16     64
8      7      56     64     49
ΣX = 30   ΣY = 40   ΣXY = 214   ΣX² = 220   ΣY² = 340

Regression equation of Y on X: Yc = a + bX
To determine the values of a and b the following two normal equations are to be solved:
ΣY = Na + bΣX
ΣXY = aΣX + bΣX²
Substituting the values:
40 = 5a + 30b          ...(i)
214 = 30a + 220b       ...(ii)
Multiplying equation (i) by 6:
240 = 30a + 180b       ...(iii)
214 = 30a + 220b       ...(iv)
Subtracting equation (iii) from equation (iv): 40b = -26, or b = -0.65
Substituting the value of b in equation (i): 40 = 5a + 30(-0.65), or 5a = 40 + 19.5 = 59.5, or a = 11.9
Putting the values of a and b in the equation, the regression of Y on X is
Y = 11.9 - 0.65X

[Note. The normal equations follow from the least squares requirement: Σ(Y - a - bX)² should be minimum (since Yc = a + bX). Differentiating partially with respect to a and b and setting the derivatives equal to zero:
2Σ(Y - a - bX)(-1) = 0, i.e., Σ(Y - a - bX) = 0, which gives ΣY = Na + bΣX
2Σ(Y - a - bX)(-X) = 0, i.e., Σ(XY - aX - bX²) = 0, which gives ΣXY = aΣX + bΣX²]

Regression line of X on Y: Xc = a + bY, and the two normal equations are:
ΣX = Na + bΣY
ΣXY = aΣY + bΣY²
30 = 5a + 40b          ...(i)
214 = 40a + 340b       ...(ii)
Multiplying equation (i) by 8:
240 = 40a + 320b       ...(iii)
214 = 40a + 340b       ...(iv)
From equations (iii) and (iv): 20b = -26, or b = -1.3
Substituting the value of b in equation (i): 30 = 5a + 40(-1.3), so 5a = 30 + 52 = 82 and a = 16.4
Putting the values of a and b in the equation, the regression line of X on Y is
X = 16.4 - 1.3Y
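The same result can be checked with NumPy's built-in least squares fit. This snippet is only a verification sketch and not part of the original text; np.polyfit returns the slope first, then the intercept.

```python
import numpy as np

x = np.array([6, 2, 10, 4, 8], dtype=float)
y = np.array([9, 11, 5, 8, 7], dtype=float)

b_yx, a_yx = np.polyfit(x, y, 1)   # slope and intercept of Y on X
b_xy, a_xy = np.polyfit(y, x, 1)   # slope and intercept of X on Y

print(f"Y = {a_yx:.2f} + ({b_yx:.2f})X")   # expected: Y = 11.90 + (-0.65)X
print(f"X = {a_xy:.2f} + ({b_xy:.2f})Y")   # expected: X = 16.40 + (-1.30)Y
```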
Deviations taken from Arithmetic Means of X and Y
The above method of finding out the regression equations is tedious. The calculations can be very much simplified if, instead of dealing with the actual values of X and Y, we take the deviations of the X and Y series from their respective means. In such a case the two regression equations are written as follows:

(i) Regression Equation of X on Y
X - X̄ = r(σx/σy)(Y - Ȳ)
where X̄ is the mean of the X series, Ȳ is the mean of the Y series, and r(σx/σy) is known as the regression coefficient of X on Y.

The regression coefficient of X on Y is denoted by the symbol bxy. It measures the amount of change in X corresponding to a unit change in Y. When deviations are taken from the means of X and Y, the regression coefficient of X on Y is obtained as follows:
bxy or r(σx/σy) = Σxy / Σy²
where x = X - X̄ and y = Y - Ȳ. Instead of finding out the value of the correlation coefficient, σx and σy, we can find the value of the regression coefficient directly by calculating Σxy and Σy² and dividing the former by the latter.
[Since r = Σxy/(Nσxσy), we also have bxy = r(σx/σy) = Σxy/(Nσxσy) × (σx/σy) = Σxy/(Nσy²) = Σxy/Σy².]
(ii) Regression Equation of Y on X
Y - Ȳ = r(σy/σx)(X - X̄)
r(σy/σx) is the regression coefficient of Y on X. It is denoted by byx. It measures the change in Y corresponding to a unit change in X. When deviations are taken from the means, the regression coefficient of Y on X is obtained as follows:
byx or r(σy/σx) = Σxy / Σx²

It may be noted that the square root of the product of the two regression coefficients gives us the value of the correlation coefficient. Symbolically:
byx = r(σy/σx) and bxy = r(σx/σy)
byx × bxy = r(σy/σx) × r(σx/σy) = r², so that r = √(byx × bxy)
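A short sketch of these deviation formulas (not from the original text; the helper name is illustrative) computes bxy, byx and the recovered r for any paired series:

```python
import math

def regression_coefficients(xs, ys):
    """bxy = Σxy/Σy² and byx = Σxy/Σx², with x and y measured as deviations from the means."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    bxy, byx = sxy / syy, sxy / sxx
    r = math.copysign(math.sqrt(bxy * byx), sxy)   # r carries the common sign of the coefficients
    return bxy, byx, r

# Data of Illustration 1
print(regression_coefficients([6, 2, 10, 4, 8], [9, 11, 5, 8, 7]))   # (-1.3, -0.65, about -0.919)
```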


The following points should be noted about the regression coefficients:
1. Both the regression coefficients will have the same sign, i.e., either both will be positive or both negative. It is never possible that one of the regression coefficients is negative and the other positive.
2. Since the value of the coefficient of correlation (r) cannot exceed one, one of the regression coefficients must be less than one; in other words, both the regression coefficients cannot be greater than one. For example, if bxy = 1.2 and byx = 1.4, the value of the correlation coefficient would be √(1.2 × 1.4) = 1.296, which is not possible.
3. The coefficient of correlation will have the same sign as that of the regression coefficients, i.e., if the regression coefficients have a negative sign, r will also be negative, and if the regression coefficients have a positive sign, r will also be positive. For example, if bxy = -0.8 and byx = -1.2, r would be -√(0.8 × 1.2) = -0.98 and not +0.98.
4. Since bxy = r(σx/σy), we can find out any one of the four values given the other three. For example, if we know that r = 0.6, σx = 4 and bxy = 0.8, we can find σy:
bxy = r(σx/σy)
Substituting the given values: 0.8 = (0.6 × 4)/σy, or σy = 2.4/0.8 = 3.
Illustration 2. From the data of Illustration 1, calculate the regression equations by taking deviations from the actual means.
Solution.
CALCULATION OF REGRESSION EQUATIONS
X    x = (X - 6)   x²     Y    y = (Y - 8)   y²     xy
6         0         0     9        +1         1      0
2        -4        16    11        +3         9    -12
10       +4        16     5        -3         9    -12
4        -2         4     8         0         0      0
8        +2         4     7        -1         1     -2
ΣX = 30   Σx = 0   Σx² = 40   ΣY = 40   Σy = 0   Σy² = 20   Σxy = -26

X̄ = 30/5 = 6,  Ȳ = 40/5 = 8
Regression Equation of X on Y: X - X̄ = bxy(Y - Ȳ)
bxy = Σxy/Σy² = -26/20 = -1.3
Hence X - 6 = -1.3(Y - 8), i.e., X = -1.3Y + 10.4 + 6, or X = 16.4 - 1.3Y
Regression Equation of Y on X: Y - Ȳ = byx(X - X̄)
byx = Σxy/Σx² = -26/40 = -0.65
Y - 8 = -0.65(X - 6) = -0.65X + 3.9, so Y = -0.65X + 11.9, or Y = 11.9 - 0.65X
Thus we find that the answers are the same as obtained earlier. However, the calculations are very much simplified without the use of the normal equations.
Deviations taken from Assumed Means
When the actual means of the X and Y variables are in fractions, the calculations can be simplified by taking the deviations from assumed means. When deviations are taken from assumed means, the entire procedure of finding the regression equations remains the same; the only difference is that, instead of taking deviations from the actual means, we take the deviations from assumed means. The two regression equations are:

Regression equation of X on Y:  X - X̄ = bxy(Y - Ȳ)
The value of bxy will now be obtained as follows:
bxy = [Σdxdy - (Σdx)(Σdy)/N] / [Σdy² - (Σdy)²/N]
where dx = (X - A) and dy = (Y - B), A and B being the assumed means of X and Y.

Similarly, the regression equation of Y on X is
Y - Ȳ = byx(X - X̄)
byx = [Σdxdy - (Σdx)(Σdy)/N] / [Σdx² - (Σdx)²/N]

It should be noted that in both the cases the numerator is the same; the only difference is in the denominator. When the regression coefficients are calculated from a correlation table (grouped data with class intervals), their values are obtained as follows:
bxy = {[Σfdxdy - (Σfdx × Σfdy)/N] / [Σfdy² - (Σfdy)²/N]} × (ix/iy)
byx = {[Σfdxdy - (Σfdx × Σfdy)/N] / [Σfdx² - (Σfdx)²/N]} × (iy/ix)
where ix = class interval of the X variable and iy = class interval of the Y variable.
As is clear from the above, the formulae for calculating regression coefficients in a correlation table are essentially the same; the only difference is that in a correlation table we are given frequencies and hence we have to multiply every value by f.
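As a quick sketch (not part of the original text; the function name and the particular choice of assumed means are arbitrary), the assumed-mean formula returns the same coefficients whatever origin is chosen, since regression coefficients are independent of a change of origin:

```python
def coeffs_from_assumed_means(xs, ys, a, b):
    """bxy and byx from deviations dx = X - A, dy = Y - B about assumed means A and B."""
    n = len(xs)
    dx = [x - a for x in xs]
    dy = [y - b for y in ys]
    sdxdy = sum(p * q for p, q in zip(dx, dy))
    sdx, sdy = sum(dx), sum(dy)
    sdx2, sdy2 = sum(d * d for d in dx), sum(d * d for d in dy)
    bxy = (sdxdy - sdx * sdy / n) / (sdy2 - sdy ** 2 / n)
    byx = (sdxdy - sdx * sdy / n) / (sdx2 - sdx ** 2 / n)
    return bxy, byx

x = [6, 2, 10, 4, 8]
y = [9, 11, 5, 8, 7]
print(coeffs_from_assumed_means(x, y, 4, 10))   # (-1.3, -0.65)
print(coeffs_from_assumed_means(x, y, 0, 0))    # same coefficients with a different origin
```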
Illustration 3. From the data of Illustration 1, obtain the regression equations taking deviations from 5 in case of X and 7 in case of Y.
Solution.
CALCULATION FOR REGRESSION EQUATIONS
X    dx = (X - 5)   dx²    Y    dy = (Y - 7)   dy²    dxdy
6        +1          1     9        +2          4      +2
2        -3          9    11        +4         16     -12
10       +5         25     5        -2          4     -10
4        -1          1     8        +1          1      -1
8        +3          9     7         0          0       0
ΣX = 30   Σdx = +5   Σdx² = 45   ΣY = 40   Σdy = +5   Σdy² = 25   Σdxdy = -21

Regression Equation of X on Y: X - X̄ = bxy(Y - Ȳ)
bxy = [Σdxdy - (Σdx)(Σdy)/N] / [Σdy² - (Σdy)²/N] = [-21 - (5)(5)/5] / [25 - (5)²/5] = (-21 - 5)/(25 - 5) = -26/20 = -1.3
X̄ = 30/5 = 6, Ȳ = 40/5 = 8
So the regression equation is X - 6 = -1.3(Y - 8), i.e., X = -1.3Y + 10.4 + 6, or X = 16.4 - 1.3Y
Regression Equation of Y on X: Y - Ȳ = byx(X - X̄)
byx = [Σdxdy - (Σdx)(Σdy)/N] / [Σdx² - (Σdx)²/N] = (-21 - 5)/(45 - 5) = -26/40 = -0.65
So the regression equation is Y - 8 = -0.65(X - 6) = -0.65X + 3.9, or Y = 11.9 - 0.65X.
(Note that regression coefficients are independent of change of origin but not of scale, which is why the assumed-mean deviations give the same coefficients as the actual-mean deviations.)
Graphing Regression Lines. It is quite easy to graph the regression lines once the regression equations have been computed. All one has to do is to:
(a) choose any two values (preferably well apart) for the unknown variable on the right-hand side of the equation,
(b) compute the other variable,
(c) plot the two pairs of values, and
(d) draw a straight line through the plotted points.
Illustration 4. Show graphically the regression equations of Illustration 3.
Solution. (a) Regression line of Y on X: Y = 11.9 - 0.65X
(i) Let X = 2, then Y = 11.9 - 0.65(2) = 11.9 - 1.3 = 10.6
(ii) Let X = 10, then Y = 11.9 - 0.65 × 10 = 5.4
These points and the regression line through them are shown in the following graph:

[Graph showing the regression line of Y on X and the regression line of X on Y.]

(b) Regression line of X on Y: X = 16.4 - 1.3Y
(i) Let Y = 10, then X = 16.4 - 1.3(10) = 16.4 - 13 = 3.4
(ii) Let Y = 6, then X = 16.4 - 1.3(6) = 16.4 - 7.8 = 8.6
These points and the regression line through them are shown in the same graph. Whichever pair of values is chosen, the slope of each plotted line comes out the same, since it is fixed by the regression coefficient.
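A plotting sketch along these lines (illustrative only, assuming Matplotlib is available; the labels and chosen endpoints mirror Illustration 4):

```python
import matplotlib.pyplot as plt

# Two points on each line, as computed in Illustration 4.
x_line_yx = [2, 10]
y_line_yx = [11.9 - 0.65 * x for x in x_line_yx]      # Y on X: Y = 11.9 - 0.65X
y_line_xy = [10, 6]
x_line_xy = [16.4 - 1.3 * y for y in y_line_xy]       # X on Y: X = 16.4 - 1.3Y

plt.plot(x_line_yx, y_line_yx, label="Regression line of Y on X")
plt.plot(x_line_xy, y_line_xy, label="Regression line of X on Y")
plt.scatter([6, 2, 10, 4, 8], [9, 11, 5, 8, 7], marker="o")   # original data of Illustration 1
plt.xlabel("X")
plt.ylabel("Y")
plt.legend()
plt.show()
```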
Illustration 5. Given the bivariate data:
(a) Fit a regression line of Y on X and hence predict Y if X = 25.
(b) Fit a regression line of X on Y and hence predict X if Y = 50.
(c) Calculate Karl Pearson's coefficient of correlation.
Solution. Taking deviations of X and Y from assumed means and applying
byx = [Σdxdy - (Σdx)(Σdy)/N] / [Σdx² - (Σdx)²/N] and bxy = [Σdxdy - (Σdx)(Σdy)/N] / [Σdy² - (Σdy)²/N]
to the given observations:
(a) The regression line of Y on X works out to Y = 34.31 - 0.278X. When X = 25, Y is equal to 34.31 - (0.278 × 25) = 34.31 - 6.95 = 27.36.
(b) The regression line of X on Y works out to X = 28.74 - 0.304Y. When Y = 50, X is equal to 28.74 - (0.304 × 50) = 28.74 - 15.2 = 13.54.
(c) r = -√(byx × bxy) = -√(0.278 × 0.304) = -0.291, the negative sign being taken because both regression coefficients are negative.
Illustration 6. From the following data of the ages of husbands and the ages of wives, determine the two regression lines and estimate the husband's age when the wife's age is 16.
Husband's age (X): 36, 23, 27, 28, 28, 29, 30, 31, 38, 35
Wife's age (Y):    28, 18, 20, 22, 27, 21, 29, 27, 29, 28
[B.Com. (H), BHU, 2008]
Solution. Let the age of husbands be denoted by X and that of wives by Y.
CALCULATION OF REGRESSION EQUATIONS
X    dx = (X - 30)   dx²    Y    dy = (Y - 27)   dy²    dxdy
36       +6          36    28       +1            1      +6
23       -7          49    18       -9           81     +63
27       -3           9    20       -7           49     +21
28       -2           4    22       -5           25     +10
28       -2           4    27        0            0       0
29       -1           1    21       -6           36      +6
30        0           0    29       +2            4       0
31       +1           1    27        0            0       0
38       +8          64    29       +2            4     +16
35       +5          25    28       +1            1      +5
ΣX = 305   Σdx = +5   Σdx² = 193   ΣY = 249   Σdy = -21   Σdy² = 201   Σdxdy = +127

Regression equation of Y on X: Y - Ȳ = byx(X - X̄)
Ȳ = 249/10 = 24.9;  X̄ = 305/10 = 30.5
byx = [Σdxdy - (Σdx)(Σdy)/N] / [Σdx² - (Σdx)²/N] = [127 - (5)(-21)/10] / [193 - (5)²/10] = (127 + 10.5)/(193 - 2.5) = 137.5/190.5 = 0.722
Y - 24.9 = 0.722(X - 30.5)
Y = 0.722X - 22.02 + 24.9, or Y = 0.722X + 2.88

Regression equation of X on Y: X - X̄ = bxy(Y - Ȳ)
bxy = [Σdxdy - (Σdx)(Σdy)/N] / [Σdy² - (Σdy)²/N] = (127 + 10.5)/(201 - 44.1) = 137.5/156.9 = 0.876
X - 30.5 = 0.876(Y - 24.9)
X - 30.5 = 0.876Y - 21.81
X = 0.876Y - 21.81 + 30.5, or X = 0.876Y + 8.69

When Y = 16, X = 0.876(16) + 8.69 = 14.016 + 8.69 = 22.706, or about 23 years.
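A compact cross-check of Illustration 6 (an illustrative sketch, not part of the original solution; it uses plain deviations from the actual means, which give the same coefficients as the assumed-mean method):

```python
def line_of_best_fit(xs, ys):
    """Return (slope, intercept) of the least squares line of ys on xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx

husband = [36, 23, 27, 28, 28, 29, 30, 31, 38, 35]
wife = [28, 18, 20, 22, 27, 21, 29, 27, 29, 28]

byx, a_yx = line_of_best_fit(husband, wife)   # wife's age on husband's age
bxy, a_xy = line_of_best_fit(wife, husband)   # husband's age on wife's age
print(round(byx, 3), round(bxy, 3))           # about 0.722 and 0.876
print(round(bxy * 16 + a_xy, 1))              # husband's age when the wife is 16, about 22.7
```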
Illustration 7. You are given the following information:
                          Price (Rs.)    Amount demanded ('000 units)
Arithmetic mean               10                 35
Standard deviation             2                  5
Coefficient of correlation between price and amount demanded = 0.8
Obtain the regression equation of amount demanded on price and estimate the likely demand when the price is Rs. 12.5. [B.A. (H) Econ., Delhi Univ.]
Solution. Let the price be denoted by X and the amount demanded by Y. The regression equation of amount demanded (Y) on price (X) will be:
Y - Ȳ = r(σy/σx)(X - X̄)
Ȳ = 35, σy = 5, X̄ = 10, σx = 2, r = 0.8
Y - 35 = 0.8 × (5/2)(X - 10)
Y - 35 = 2(X - 10)
Y - 35 = 2X - 20
Y = 2X + 15
The estimated value of Y when X = 12.5:
Y = 2(12.5) + 15 = 25 + 15 = 40
Thus the likely demand when the price is Rs. 12.5 is 40 ('000 units).
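In code form (a small sketch, not from the text; the variable names are arbitrary), the same estimate follows directly from the summary figures:

```python
mean_x, sd_x = 10.0, 2.0     # price
mean_y, sd_y = 35.0, 5.0     # amount demanded ('000 units)
r = 0.8

b_yx = r * sd_y / sd_x                      # regression coefficient of Y on X = 2.0
intercept = mean_y - b_yx * mean_x          # 35 - 2*10 = 15
demand_at_12_5 = b_yx * 12.5 + intercept    # 2*12.5 + 15 = 40
print(f"Y = {b_yx}X + {intercept}; estimated demand at Rs. 12.5 = {demand_at_12_5} ('000 units)")
```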
Illustration 8. In a partially destroyed laboratory record of an analysis of correlation data, only the following results are legible:
Variance of X = 9
Regression equations: 8X - 10Y + 66 = 0 and 40X - 18Y = 214
What are, on the basis of the above information,
(i) the mean values of X and Y,
(ii) the coefficient of correlation between X and Y, and
(iii) the standard deviation of Y? [B.A. (H) Econ., Delhi Univ., 1994; MCA, Madras Univ., 2002]
Also calculate the standard errors of estimate of the regression of Y on X and of X on Y.

Solution. (i) Mean values of X and Y. Since both regression lines pass through (X̄, Ȳ), the means satisfy both equations:
8X - 10Y = -66       ...(i)
40X - 18Y = 214      ...(ii)
Multiplying equation (i) by 5:
40X - 50Y = -330
40X - 18Y = 214
Subtracting, -32Y = -544, so Y = 17.
Substituting the value of Y in equation (i): 8X - 10(17) = -66, so 8X = -66 + 170 = 104 and X = 13.
Thus X̄ = 13 and Ȳ = 17.

(ii) For finding out the correlation coefficient we must first decide which of the two equations is the regression of X on Y and which is the regression of Y on X. Let us take eq. (i) as the regression equation of X on Y:
8X = -66 + 10Y, i.e., X = -66/8 + (10/8)Y, so that bxy = 10/8 = 1.25
From eq. (ii) we can then calculate byx:
40X - 18Y = 214, i.e., 18Y = 40X - 214, or Y = (40/18)X - 214/18, so that byx = 40/18 = 2.22
Since both regression coefficients exceed 1, our assumption is wrong. Hence eq. (i) is the regression equation of Y on X. From eq. (i):
10Y = 8X + 66, i.e., Y = 0.8X + 6.6, so byx = 8/10 = 0.8
From eq. (ii): 40X = 18Y + 214, i.e., X = (18/40)Y + 214/40, so bxy = 18/40 = 0.45
r = √(byx × bxy) = √(0.8 × 0.45) = √0.36 = 0.6

(iii) σx = √9 = 3
bxy = r(σx/σy)
0.45 = 0.6 × 3/σy, so 0.45σy = 1.8 and σy = 1.8/0.45 = 4
Hence the standard deviation of Y is 4.

(iv) Standard errors of estimate:
Syx = σy√(1 - r²); with σy = 4 and r = 0.6, Syx = 4√(1 - 0.36) = 4 × 0.8 = 3.2
Sxy = σx√(1 - r²); with σx = 3 and r = 0.6, Sxy = 3√(1 - 0.36) = 3 × 0.8 = 2.4
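A sketch of the same recovery in code (not part of the original; NumPy is assumed and the equation coefficients are hard-coded from Illustration 8):

```python
import numpy as np

# 8X - 10Y = -66  and  40X - 18Y = 214: both lines pass through the means.
A = np.array([[8.0, -10.0], [40.0, -18.0]])
b = np.array([-66.0, 214.0])
mean_x, mean_y = np.linalg.solve(A, b)        # 13.0, 17.0

# Try eq. (i) as X on Y; if the product of coefficients exceeds 1, swap the assignment.
bxy, byx = 10 / 8, 40 / 18
if bxy * byx > 1:
    byx, bxy = 8 / 10, 18 / 40                # eq. (i) is really Y on X
r = (byx * bxy) ** 0.5                        # 0.6
sd_x = 9 ** 0.5                               # variance of X = 9
sd_y = r * sd_x / bxy                         # from bxy = r*sd_x/sd_y  ->  4.0
s_yx = sd_y * (1 - r ** 2) ** 0.5             # 3.2
s_xy = sd_x * (1 - r ** 2) ** 0.5             # 2.4
print(mean_x, mean_y, r, sd_y, s_yx, s_xy)
```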
Illustration 9. For 50 students of a class, the regression equation of marks in Statistics (X) on marks in Accountancy (Y) is 3Y - 5X + 180 = 0. The mean marks in Accountancy is 44 and the variance of marks in Statistics is 9/16th of the variance of marks in Accountancy. Find the mean marks in Statistics and the coefficient of correlation between marks in the two subjects. [B.Com. (H), Delhi Univ., 1994]
Solution. Let X denote marks in Statistics and Y marks in Accountancy.
The regression equation of X on Y is 3Y - 5X + 180 = 0, i.e., 5X = 3Y + 180, or X = 0.6Y + 36.
When Ȳ = 44, X̄ will be
X̄ = 0.6(44) + 36 = 26.4 + 36 = 62.4
The mean marks in Statistics is 62.4.
From the equation, bxy = 0.6.
Given: σx²/σy² = 9/16, so σx/σy = 3/4.
bxy = r(σx/σy)
0.6 = r × 3/4, so r = 0.6 × 4/3 = 0.8
The coefficient of correlation between marks in the two subjects is 0.8.
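A tiny sketch of this calculation (illustrative, not part of the original text):

```python
from math import sqrt

b_xy = 3 / 5                       # from 3Y - 5X + 180 = 0  ->  X = 0.6Y + 36
mean_y = 44
mean_x = b_xy * mean_y + 36        # 62.4, the mean marks in Statistics

var_ratio = 9 / 16                 # var(X) / var(Y)
r = b_xy / sqrt(var_ratio)         # b_xy = r * (sd_x / sd_y)  ->  r = 0.6 / (3/4) = 0.8
print(mean_x, r)
```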
Illustration 10. You are given the following data:
                          X      Y
Arithmetic mean          36     85
Standard deviation       11      8
Correlation coefficient between X and Y = 0.66
(i) Find the two regression equations.
(ii) Estimate the value of X when Y = 75.
Solution. Regression Equation of X on Y:
X - X̄ = r(σx/σy)(Y - Ȳ)
X̄ = 36, r = 0.66, σx = 11, σy = 8, Ȳ = 85
X - 36 = 0.66 × (11/8)(Y - 85)
X - 36 = 0.9075(Y - 85)
X = 0.9075Y - 77.1375 + 36
X = 0.9075Y - 41.1375
Regression Equation of Y on X:
Y - Ȳ = r(σy/σx)(X - X̄)
Y - 85 = 0.66 × (8/11)(X - 36)
Y - 85 = 0.48(X - 36)
Y - 85 = 0.48X - 17.28
Y = 0.48X + 67.72
(ii) From the regression equation of X on Y we can find the estimated value of X when Y = 75:
X = 0.9075(75) - 41.1375 = 68.0625 - 41.1375 = 26.925
Thus X = 26.925 when Y = 75.
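The same two equations can be generated from the summary figures with a few lines of code (an illustrative sketch, not part of the original solution):

```python
mean_x, sd_x = 36.0, 11.0
mean_y, sd_y = 85.0, 8.0
r = 0.66

b_xy = r * sd_x / sd_y                              # 0.9075
b_yx = r * sd_y / sd_x                              # 0.48
print(f"X = {b_xy}Y + {mean_x - b_xy * mean_y}")    # X = 0.9075Y - 41.1375
print(f"Y = {b_yx}X + {mean_y - b_yx * mean_x}")    # Y = 0.48X + 67.72
print(b_xy * 75 + (mean_x - b_xy * mean_y))         # 26.925
```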
Illustration 11. For a bivariate distribution, the lines of regression are
3X + 12Y = 19 and 3Y + 9X = 46
Find the means and the correlation coefficient. [B.Sc. (H) Chemistry, Delhi Univ.]
Solution. Mean values of X and Y. The means satisfy both equations:
3X + 12Y = 19      ...(i)
9X + 3Y = 46       ...(ii)
Multiplying equation (i) by 3:
9X + 36Y = 57
9X + 3Y = 46
Subtracting: 33Y = 11, so Y = 0.333.
Putting the value of Y in eq. (i): 3X + 12(0.333) = 19, or 3X + 4 = 19, so X = 5.
Hence the means of X and Y are 5 and 0.333.
Correlation coefficient:
Let eq. (i) be the regression of Y on X:
3X + 12Y = 19, i.e., 12Y = 19 - 3X, so byx = -3/12 = -1/4
From eq. (ii): 3Y + 9X = 46, i.e., 9X = 46 - 3Y, so bxy = -3/9 = -1/3
Both coefficients are less than one in absolute value, so the assumption is consistent, and since both are negative, r is negative:
r = -√(byx × bxy) = -√((1/4) × (1/3)) = -0.289
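A closing sketch (illustrative only; NumPy assumed) that recovers the means and r from the two regression lines of Illustration 11:

```python
import numpy as np

# 3X + 12Y = 19 and 9X + 3Y = 46: the point of intersection gives the means.
A = np.array([[3.0, 12.0], [9.0, 3.0]])
b = np.array([19.0, 46.0])
mean_x, mean_y = np.linalg.solve(A, b)              # 5.0 and 0.333...

b_yx = -3 / 12                                      # slope of eq. (i) read as Y on X
b_xy = -3 / 9                                       # slope of eq. (ii) read as X on Y
r = -np.sqrt(b_yx * b_xy)                           # negative because both coefficients are negative
print(mean_x, mean_y, round(r, 3))                  # 5.0 0.333 -0.289
```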
