REGRESSION ANALYSIS
DISTINCTION BETWEEN CORRELATION AND REGRESSION
Correlation differs from Regression in the following respects:

| Basis of Distinction | Correlation | Regression |
|---|---|---|
| 1. What measures? | Correlation measures degree and direction of relationship between the variables. | Regression measures the nature and extent of average relationship between two or more variables in terms of the original units of the data. |
| 2. Whether relative or absolute measure | It is a relative measure showing association between variables. | It is an absolute measure of relationship. |
| 3. Whether independent of choice of both origin and scale | Correlation coefficient is independent of change of both origin and scale. | Regression coefficient is independent of change of origin and not scale. |
| 4. Whether independent of units of measurement | Correlation Coefficient is independent of units of measurement. | Regression Coefficient is not independent of units of measurement. |
| 5. Expression of relationship | Expression of the relationship between the variables ranges from -1 to +1. | Expression of the relationship between the variables may be in any of the forms like $Y = a + bX$ or $Y = a + bX + cX^2$. |
| 6. Whether a forecasting device? | It is not a forecasting device. | It is a forecasting device which can be used to predict the value of dependent variable from the given value of independent variable. |
| 7. Non-sense | There may be non-sense correlation such as weight of girls and income of boys. | There is nothing like non-sense regression. |
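Points 3 and 4 of the distinction can be checked numerically. The sketch below is only an illustration, not part of the text: the helper name `summary` is hypothetical, and the five pairs are the advertising and sales figures used in the illustrations later in this chapter. It shifts the origin of X and then changes its scale (lakhs to thousands, i.e. multiplying by 100) and compares r with the regression coefficient of Y on X.

```python
from math import sqrt

def summary(xs, ys):
    """Return (r, b_yx) from raw sums; hypothetical helper for illustration."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    # Corrected sums of products and squares (deviations about the means).
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    s_yy = sum(y * y for y in ys) - sum_y ** 2 / n
    r = s_xy / sqrt(s_xx * s_yy)
    b_yx = s_xy / s_xx
    return r, b_yx

ad_lakhs = [1, 2, 3, 4, 5]          # advertising expenditure (Rs. lakhs)
sales_lakhs = [10, 20, 30, 50, 40]  # sales (Rs. lakhs)

r_base, b_base = summary(ad_lakhs, sales_lakhs)
# Change of origin: add a constant to every X value.
r_shift, b_shift = summary([x + 100 for x in ad_lakhs], sales_lakhs)
# Change of scale: express X in thousands instead of lakhs.
r_scale, b_scale = summary([x * 100 for x in ad_lakhs], sales_lakhs)

print("base :", r_base, b_base)
print("shift:", r_shift, b_shift)
print("scale:", r_scale, b_scale)
```

Under these assumptions r stays the same in all three runs, while the regression coefficient is unchanged by the shift but is divided by 100 when the unit of X changes.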
REGRESSION LINES (OR REGRESSION MODELS)
In the case of a simple linear regression model (i.e. when there is only one independent variable and there is a linear relationship between the dependent and independent variable), there are two regression lines as follows:
1. Regression Line of X on Y
$X = a + bY$
where, X = Dependent Variable, Y = Independent Variable
a = X intercept (i.e. value of dependent variable when value of independent variable is zero).
b = Slope of the said line (i.e. the amount of change in the value of the dependent variable per unit change in independent variable).
The values of the two constants 'a' and 'b' can be calculated for the given data of X and Y variables by solving the following two algebraic normal equations:
$\Sigma X = Na + b\Sigma Y$
$\Sigma XY = a\Sigma Y + b\Sigma Y^2$
where, N = Number of pairs of X and Y variables
$\Sigma X$ = Sum of values of variable X
$\Sigma Y$ = Sum of values of variable Y
$\Sigma Y^2$ = Sum of squares of values of variable Y
$\Sigma XY$ = Sum of products of values of X and Y variables
Use of Regression Line of X on Y: This line gives the probable value of X for any given value of Y.
Another way of expressing Regression Line of X on Y: This line can also be expressed as follows:
$(X - \bar{X}) = b_{xy}(Y - \bar{Y})$
or
$(X - \bar{X}) = r\dfrac{\sigma_x}{\sigma_y}(Y - \bar{Y})$, since $b_{xy} = r\dfrac{\sigma_x}{\sigma_y}$
where,
$\bar{X}$ = Arithmetic Mean of X Series
$\bar{Y}$ = Arithmetic Mean of Y Series
$\sigma_x$ = Standard Deviation of X Series
$\sigma_y$ = Standard Deviation of Y Series
$r$ = Coefficient of Correlation between two variables X and Y
2. Regression Line of Y on X
$Y = a + bX$
where, X = Independent Variable, Y = Dependent Variable
a = Y intercept (i.e. value of dependent variable when value of independent variable is zero).
b = Slope of the said line (i.e. the amount of change in the value of the dependent variable per unit change in independent variable).
The values of the two constants 'a' and 'b' can be calculated for the given data of X and Y variables by solving the following two algebraic normal equations:
$\Sigma Y = Na + b\Sigma X$
$\Sigma XY = a\Sigma X + b\Sigma X^2$
where, N = Number of pairs of X and Y variables
$\Sigma X$ = Sum of values of variable X
$\Sigma Y$ = Sum of values of variable Y
$\Sigma X^2$ = Sum of squares of values of variable X
$\Sigma XY$ = Sum of products of values of X and Y variables
Use of Regression Line of Y on X: This line gives the probable value of Y for any given value of X.
Another way of expressing Regression Equation of Y on X: This line can also be expressed as follows:
$(Y - \bar{Y}) = b_{yx}(X - \bar{X})$
or
$(Y - \bar{Y}) = r\dfrac{\sigma_y}{\sigma_x}(X - \bar{X})$, since $b_{yx} = r\dfrac{\sigma_y}{\sigma_x}$
where,
$\bar{X}$ = Arithmetic Mean of X Series
$\bar{Y}$ = Arithmetic Mean of Y Series
$\sigma_x$ = Standard Deviation of X Series
$\sigma_y$ = Standard Deviation of Y Series
$r$ = Coefficient of Correlation between two variables X and Y
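The intercept-and-slope form and the deviation form describe the same line. A sketch of the algebra for the line of Y on X is given below; it assumes standard deviations and covariance are computed with divisor N, and the line of X on Y follows by interchanging X and Y.

$$
\begin{aligned}
\Sigma Y = Na + b\Sigma X \;&\Rightarrow\; a = \bar{Y} - b\bar{X} \\
\Sigma XY = a\Sigma X + b\Sigma X^2 \;&\Rightarrow\; \Sigma XY = (\bar{Y} - b\bar{X})\,N\bar{X} + b\Sigma X^2 \\
&\Rightarrow\; b = \frac{\Sigma XY - N\bar{X}\bar{Y}}{\Sigma X^2 - N\bar{X}^2}
 = \frac{\operatorname{Cov}(X,Y)}{\sigma_x^2}
 = r\frac{\sigma_y}{\sigma_x}
\end{aligned}
$$

The last step uses $r = \operatorname{Cov}(X,Y)/(\sigma_x\sigma_y)$. Substituting $a = \bar{Y} - b\bar{X}$ into $Y = a + bX$ then gives $Y - \bar{Y} = b(X - \bar{X})$.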
PROPERTIES OF LINEAR REGRESSION
1. Two Regression Equations: There are two linear regression equations.
(i) Regression equation of Y on X:
$Y - \bar{Y} = b_{yx}(X - \bar{X})$
(ii) Regression equation of X on Y:
$X - \bar{X} = b_{xy}(Y - \bar{Y})$
where $b_{yx}$ and $b_{xy}$ are respectively the regression coefficient (or slope) of Y on X and the regression coefficient (or slope) of X on Y:
$b_{yx} = r\dfrac{\sigma_y}{\sigma_x}$, $b_{xy} = r\dfrac{\sigma_x}{\sigma_y}$
2. Product of Regression Coefficients: The product of the two regression coefficients is equal to the square of the correlation coefficient.
$b_{xy} \times b_{yx} = r^2$
3. Signs of Regression Coefficients and Correlation Coefficient: $r$, $b_{xy}$ and $b_{yx}$ all have the same sign. If the correlation coefficient $r$ is zero, the regression coefficients $b_{xy}$ and $b_{yx}$ are also zero.
4. Intersection at Means: The regression lines always intersect at the point of their means, $(\bar{X}, \bar{Y})$.
5. Slopes: The slopes of the regression line of Y on X and the regression line of X on Y are respectively $b_{yx}$ and $b_{xy}$.
6. Angle between Regression Lines: The angle between the two regression lines depends on the Correlation Coefficient (r).

| Value of r | Angle between Regression Lines |
|---|---|
| (a) If r = 0 | Regression lines are perpendicular to each other. |
| (b) If r = +1 or -1 | Regression lines coincide (i.e. become identical). |

As the value of r increases numerically from 0 to 1, the angle between the regression lines decreases from 90° to 0°. In other words, the farther the two regression lines are from each other, the lesser is the degree of correlation, and the nearer the two regression lines are to each other, the higher is the degree of correlation.
7. Estimation of Value: The value of X or Y can be estimated from linear equations if $r \neq 0$.
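Property 6 can be illustrated with a short calculation. This is a sketch under stated assumptions, not the book's method: the function name is hypothetical and the standard deviations 2 and 3 are arbitrary illustrative values. Each line is represented by a direction vector in the (X, Y) plane, and the acute angle between the two vectors is taken.

```python
from math import acos, degrees, sqrt

def angle_between_regression_lines(r, sd_x, sd_y):
    """Acute angle (degrees) between the two regression lines.

    Hypothetical helper: uses b_yx = r*sd_y/sd_x and b_xy = r*sd_x/sd_y.
    """
    b_yx = r * sd_y / sd_x
    b_xy = r * sd_x / sd_y
    # Y on X:  Y = a + b_yx * X  has direction (1, b_yx).
    # X on Y:  X = a' + b_xy * Y has direction (b_xy, 1).
    dot = abs(1 * b_xy + b_yx * 1)
    norm = sqrt(1 + b_yx ** 2) * sqrt(1 + b_xy ** 2)
    # Clamp to guard against rounding slightly above 1.
    cos_theta = min(1.0, dot / norm)
    return degrees(acos(cos_theta))

# Arbitrary illustrative standard deviations.
angles = [angle_between_regression_lines(r, 2.0, 3.0)
          for r in (0.0, 0.25, 0.5, 0.75, 1.0)]
print(angles)
```

With r = 0 the lines come out perpendicular, with r = 1 they coincide, and the angle shrinks steadily in between, which is the behaviour the property describes.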
PROPERTIES OF REGRESSION COEFFICIENTS
The following are the important properties of Regression Coefficients:
1. Same sign: Both regression coefficients have the same sign, i.e. either both will be positive or both will be negative.
2. Both cannot be greater than one: If one of the regression coefficients is greater than unity, the other must be less than unity to the extent that the product of both regression coefficients does not exceed unity. In other words, both the regression coefficients cannot be greater than one.
3. Independent of origin: Regression coefficients are independent of the origin but not of scale.
4. A.M. ≥ r: The arithmetic mean of the regression coefficients is greater than or equal to the correlation coefficient (in numerical value).
5. r is G.M.: The correlation coefficient is the geometric mean between the regression coefficients.
6. r, $b_{xy}$ and $b_{yx}$ have the same sign: The coefficient of correlation will have the same sign as that of the regression coefficients, i.e. if the regression coefficients have a positive sign, r will also be positive, and if the regression coefficients have a negative sign, r will also be negative.
FORMATION OF REGRESSION EQUATIONS BY SOLVING NORMAL EQUATIONS
PRACTICAL STEPS INVOLVED IN THE FORMATION OF REGRESSION
EQUATIONS BY SOLVING NORMAL EQUATIONS
Step 1 → Denote one variable by X and another variable by Y.
Step 2 → Obtain the total of actual values of X, i.e. $\Sigma X$.
Step 3 → Calculate squares of actual values of variable X and obtain their total, i.e. $\Sigma X^2$.
Step 4 → Obtain the total of actual values of Y, i.e. $\Sigma Y$.
Step 5 → Calculate squares of actual values of variable Y and obtain their total, i.e. $\Sigma Y^2$.
Step 6 → Multiply the actual values of X by the corresponding actual values of Y and obtain the total, i.e. $\Sigma XY$.
Step 7 → Determine the values of 'a' and 'b' by solving the following two normal equations for the regression equation of X on Y:
$\Sigma X = Na + b\Sigma Y$ ... Eq. I (Note: Eq. I is the summation of X = a + bY)
$\Sigma XY = a\Sigma Y + b\Sigma Y^2$ ... Eq. II (Note: Eq. II is the summation of X = a + bY multiplied through by Y)
Then put the values of 'a' and 'b' in the following equation:
$X = a + bY$
Step 8 → Determine the values of 'a' and 'b' by solving the following two normal equations for the regression equation of Y on X:
$\Sigma Y = Na + b\Sigma X$ ... Eq. I (Note: Eq. I is the summation of Y = a + bX)
$\Sigma XY = a\Sigma X + b\Sigma X^2$ ... Eq. II (Note: Eq. II is the summation of Y = a + bX multiplied through by X)
Then put the values of 'a' and 'b' in the following equation:
$Y = a + bX$
ILLUSTRATION 1 [Finding out Regression Equations by solving normal equations]
The following data relate to advertising expenditure and sales.

| Advertising expenditure (Rs. lakhs) | 1 | 2 | 3 | 4 | 5 |
| Sales (Rs. lakhs) | 10 | 20 | 30 | 50 | 40 |

Required:
(a) Find out two Regression Equations.
(b) Estimate the likely sales when advertising expenditure is Rs. 7 lakhs.
(c) What should be the advertising expenditure if the firm wants to attain sales target of Rs. 80 lakhs?
(d) Calculate Coefficient of Correlation.
Solution
(a) Regression Equations
Step 1 → Let advertising expenditure be denoted by X and sales by Y.
Step 2 → Computation Table for finding out Regression Equations

| X | X² | Y | Y² | XY |
|---|---|---|---|---|
| 1 | 1 | 10 | 100 | 10 |
| 2 | 4 | 20 | 400 | 40 |
| 3 | 9 | 30 | 900 | 90 |
| 4 | 16 | 50 | 2500 | 200 |
| 5 | 25 | 40 | 1600 | 200 |
| ΣX = 15 | ΣX² = 55 | ΣY = 150 | ΣY² = 5500 | ΣXY = 540 |

N = No. of pairs = 5
Step 3 → Regression Equation of X on Y: $X = a + bY$
To determine the values of a and b the following two normal equations are to be solved:
$\Sigma X = Na + b\Sigma Y$
$\Sigma XY = a\Sigma Y + b\Sigma Y^2$
Substituting the values in the above equations:
15 = 5a + 150b ... Eq. I
540 = 150a + 5500b ... Eq. II
Multiplying Eq. I by 30:
450 = 150a + 4500b ... Eq. III
540 = 150a + 5500b ... Eq. IV
Subtracting Eq. IV from Eq. III:
-90 = -1000b
b = 90/1000 = 0.09
Substituting the value of b in Eq. I:
15 = 5a + 150(0.09)
15 = 5a + 13.5
5a = 15 - 13.5 = 1.5
a = 1.5/5 = 0.3
Putting the values of a and b in the equation, the regression line of X on Y is:
X = 0.3 + 0.09Y
Step 4 → Regression Equation of Y on X: $Y = a + bX$
To determine the values of a and b the following two normal equations are to be solved:
$\Sigma Y = Na + b\Sigma X$
$\Sigma XY = a\Sigma X + b\Sigma X^2$
Substituting the values in the above equations:
150 = 5a + 15b ... Eq. I
540 = 15a + 55b ... Eq. II
Multiplying Eq. I by 3:
450 = 15a + 45b ... Eq. III
540 = 15a + 55b ... Eq. IV
Subtracting Eq. IV from Eq. III:
-90 = -10b
b = 90/10 = 9
Substituting the value of b in Eq. I:
150 = 5a + 15(9)
150 = 5a + 135
5a = 150 - 135 = 15
a = 15/5 = 3
Putting the values of a and b in the equation, the regression line of Y on X is:
Y = 3 + 9X
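The elimination carried out in Steps 3 and 4 can be cross-checked by solving each pair of normal equations directly. The sketch below uses Cramer's rule rather than the multiply-and-subtract route shown in the solution; the function name and argument names are hypothetical.

```python
def solve_normal_equations(n, sum_dep, sum_ind, sum_ind_sq, sum_prod):
    """Solve  Σdep = n*a + b*Σind  and  Σ(dep*ind) = a*Σind + b*Σind²  for (a, b).

    'dep' is the dependent variable, 'ind' the independent one.
    Hypothetical helper using Cramer's rule on the 2x2 system.
    """
    det = n * sum_ind_sq - sum_ind * sum_ind
    if det == 0:
        raise ValueError("independent variable shows no variation")
    a = (sum_dep * sum_ind_sq - sum_ind * sum_prod) / det
    b = (n * sum_prod - sum_ind * sum_dep) / det
    return a, b

X = [1, 2, 3, 4, 5]        # advertising expenditure (Rs. lakhs)
Y = [10, 20, 30, 50, 40]   # sales (Rs. lakhs)
N = len(X)
sum_x, sum_y = sum(X), sum(Y)
sum_xx = sum(x * x for x in X)
sum_yy = sum(y * y for y in Y)
sum_xy = sum(x * y for x, y in zip(X, Y))

# X on Y: X is dependent, Y is independent.
a_xy, b_xy = solve_normal_equations(N, sum_x, sum_y, sum_yy, sum_xy)
# Y on X: Y is dependent, X is independent.
a_yx, b_yx = solve_normal_equations(N, sum_y, sum_x, sum_xx, sum_xy)

print(f"X = {a_xy} + {b_xy}Y")
print(f"Y = {a_yx} + {b_yx}X")
```

The computed constants should agree with the hand calculation: a = 0.3, b = 0.09 for X on Y, and a = 3, b = 9 for Y on X.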
(b) Sales (Y) when Advertising Expenditure (X) is Rs. 7 lakhs:
Y = 3 + 9 × 7 = 3 + 63 = Rs. 66 lakhs
(c) Advertising Expenditure (X) to attain sales (Y) target of Rs. 80 lakhs:
X = 0.3 + 0.09(80) = 0.3 + 7.2 = Rs. 7.5 lakhs
(d) Coefficient of Correlation (r) = $\sqrt{b_{xy} \times b_{yx}} = \sqrt{0.09 \times 9} = \sqrt{0.81} = +0.9$
PRACTICAL STEPS INVOLVED IN THE FORMATION OF REGRESSION EQUATIONS
BY DIRECT METHOD WHEN NO DEVIATIONS ARE TAKEN
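Step 1 → Denote one variable by X and another variable by Y.
Step 2 → Obtain the total of actual values of X, i.e. $\Sigma X$.
Step 3 → Calculate squares of actual values of variable X and obtain their total, i.e. $\Sigma X^2$.
Step 4 → Obtain the total of actual values of Y, i.e. $\Sigma Y$.
Step 5 → Calculate squares of actual values of variable Y and obtain their total, i.e. $\Sigma Y^2$.
Step 6 → Multiply the actual values of X by the corresponding actual values of Y and obtain the total, i.e. $\Sigma XY$.
Step 7 → Calculate $b_{xy}$ and $b_{yx}$ as follows:
$$b_{xy} = \frac{\Sigma XY - \dfrac{(\Sigma X)(\Sigma Y)}{N}}{\Sigma Y^2 - \dfrac{(\Sigma Y)^2}{N}} \qquad b_{yx} = \frac{\Sigma XY - \dfrac{(\Sigma X)(\Sigma Y)}{N}}{\Sigma X^2 - \dfrac{(\Sigma X)^2}{N}}$$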
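The two formulas of the direct method translate almost line for line into code. The following is a minimal sketch; the function name `direct_method` is hypothetical, and the sample call uses the advertising and sales figures from Illustration 1.

```python
def direct_method(xs, ys):
    """Return (b_xy, b_yx) from raw totals, with no deviations taken."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    # Common numerator: ΣXY - (ΣX)(ΣY)/N
    numerator = sum_xy - sum_x * sum_y / n
    b_xy = numerator / (sum_yy - sum_y ** 2 / n)  # X on Y
    b_yx = numerator / (sum_xx - sum_x ** 2 / n)  # Y on X
    return b_xy, b_yx

b_xy, b_yx = direct_method([1, 2, 3, 4, 5], [10, 20, 30, 50, 40])
print(b_xy, b_yx)
```

Both coefficients share the same numerator; only the denominator changes, depending on which variable is treated as independent.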
ILLUSTRATION 2 [Finding out Regression Equations by Direct Method]
The following data relate to advertising expenditure and sales.

| Advertising expenditure (Rs. lakhs) | 1 | 2 | 3 | 4 | 5 |
| Sales (Rs. lakhs) | 10 | 20 | 30 | 50 | 40 |

Required:
(a) Find out two Regression Equations.
(b) Estimate the likely sales when advertising expenditure is Rs. 7 lakhs.
(c) What should be the advertising expenditure if the firm wants to attain sales target of Rs. 80 lakhs?
(d) Calculate Coefficient of Correlation.
Solution
(a) Regression Equations
Step 1 → Let advertising expenditure be denoted by X and sales by Y.
Step 2 → Computation Table for finding out Regression Equations

| X | X² | Y | Y² | XY |
|---|---|---|---|---|
| 1 | 1 | 10 | 100 | 10 |
| 2 | 4 | 20 | 400 | 40 |
| 3 | 9 | 30 | 900 | 90 |
| 4 | 16 | 50 | 2500 | 200 |
| 5 | 25 | 40 | 1600 | 200 |
| ΣX = 15 | ΣX² = 55 | ΣY = 150 | ΣY² = 5500 | ΣXY = 540 |

N = No. of pairs = 5
Step 3 → Calculation of $b_{xy}$ and $b_{yx}$
$$b_{xy} = \frac{\Sigma XY - \dfrac{(\Sigma X)(\Sigma Y)}{N}}{\Sigma Y^2 - \dfrac{(\Sigma Y)^2}{N}} = \frac{540 - \dfrac{(15)(150)}{5}}{5500 - \dfrac{(150)^2}{5}} = \frac{540 - 450}{5500 - 4500} = \frac{90}{1000} = 0.09$$
$$b_{yx} = \frac{\Sigma XY - \dfrac{(\Sigma X)(\Sigma Y)}{N}}{\Sigma X^2 - \dfrac{(\Sigma X)^2}{N}} = \frac{540 - \dfrac{(15)(150)}{5}}{55 - \dfrac{(15)^2}{5}} = \frac{540 - 450}{55 - 45} = \frac{90}{10} = 9$$
Step 4 → Calculation of Mean
$\bar{X} = \dfrac{\Sigma X}{N} = \dfrac{15}{5} = 3$; $\bar{Y} = \dfrac{\Sigma Y}{N} = \dfrac{150}{5} = 30$
Step 5 → Formation of Regression Equations
(i) Regression Equation of X on Y: $(X - \bar{X}) = b_{xy}(Y - \bar{Y})$
(X - 3) = 0.09(Y - 30)
X = 0.3 + 0.09Y
(ii) Regression Equation of Y on X: $(Y - \bar{Y}) = b_{yx}(X - \bar{X})$
(Y - 30) = 9(X - 3)
Y = 3 + 9X
(b) Sales (Y) when Advertising Expenditure (X) is Rs. 7 lakhs:
Y = 3 + 9 × 7 = 3 + 63 = Rs. 66 lakhs
(c) Advertising Expenditure (X) to attain sales (Y) target of Rs. 80 lakhs:
X = 0.3 + 0.09(80) = 0.3 + 7.2 = Rs. 7.5 lakhs
(d) Coefficient of Correlation (r) = $\sqrt{b_{xy} \times b_{yx}} = \sqrt{0.09 \times 9} = \sqrt{0.81} = +0.9$
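Parts (b) to (d) apply the fitted lines, and they can be sketched in code as follows. The function names are hypothetical; the constants are the ones obtained in part (a). Since the square root alone loses the sign, the sketch takes the sign of r from the regression coefficients, in line with the property that r, $b_{xy}$ and $b_{yx}$ share the same sign.

```python
from math import copysign, sqrt

# Constants from part (a) of the illustration.
A_X_ON_Y, B_XY = 0.3, 0.09   # X = 0.3 + 0.09Y
A_Y_ON_X, B_YX = 3.0, 9.0    # Y = 3 + 9X

def estimate_sales(advertising):
    """Use the line of Y on X to estimate Y for a given X."""
    return A_Y_ON_X + B_YX * advertising

def estimate_advertising(sales):
    """Use the line of X on Y to estimate X for a given Y."""
    return A_X_ON_Y + B_XY * sales

def correlation_from_coefficients(b_xy, b_yx):
    """r is the geometric mean of the coefficients, carrying their sign."""
    if b_xy * b_yx < 0:
        raise ValueError("regression coefficients must have the same sign")
    return copysign(sqrt(b_xy * b_yx), b_yx)

sales_at_7 = estimate_sales(7)
advertising_for_80 = estimate_advertising(80)
r = correlation_from_coefficients(B_XY, B_YX)
print(sales_at_7, advertising_for_80, r)
```

Note that part (c) uses the line of X on Y rather than inverting Y = 3 + 9X; the inversion would give X = 77/9, roughly 8.56, instead of 7.5. The two lines give the same answer only when r = +1 or -1, when they coincide.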
FORMATION OF REGRESSION EQUATIONS BY TAKING DEVIATIONS FROM THE
ACTUAL MEANS
PRACTICAL STEPS INVOLVED IN THE FORMATION OF REGRESSION
EQUATIONS BY TAKING DEVIATIONS FROM THE ACTUAL MEANS
Step 1 → Denote one variable by X and another variable by Y.
Step 2 → Obtain the total of actual values of X, i.e. $\Sigma X$, and calculate the actual mean of X, i.e. $\bar{X}$.
Step 3 → Take deviations from the actual mean of variable X and denote them by x.
Step 4 → Calculate squares of deviations from the actual mean of variable X and obtain their total, i.e. $\Sigma x^2$.
Step 5 → Obtain the total of actual values of Y, i.e. $\Sigma Y$, and calculate the actual mean of Y, i.e. $\bar{Y}$.
Step 6 → Take deviations from the actual mean of variable Y and denote them by y.
Step 7 → Calculate squares of deviations from the actual mean of variable Y and obtain their total, i.e. $\Sigma y^2$.
Step 8 → Multiply the deviations from the actual mean of X by the corresponding deviations from the actual mean of Y and obtain the total, i.e. $\Sigma xy$.
Step 9 → Put the relevant values in the following equations:
(i) Regression Equation of X on Y:
$X - \bar{X} = r\dfrac{\sigma_x}{\sigma_y}(Y - \bar{Y})$, where $r\dfrac{\sigma_x}{\sigma_y} = \dfrac{\Sigma xy}{\Sigma y^2}$
(ii) Regression Equation of Y on X:
$Y - \bar{Y} = r\dfrac{\sigma_y}{\sigma_x}(X - \bar{X})$, where $r\dfrac{\sigma_y}{\sigma_x} = \dfrac{\Sigma xy}{\Sigma x^2}$
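The steps above can be sketched as a short routine. The function name is hypothetical, and the sample call reuses the advertising and sales figures from the earlier illustrations; it returns the two means together with the two regression coefficients.

```python
def coefficients_from_actual_means(xs, ys):
    """Return (mean_x, mean_y, b_xy, b_yx) using deviations from actual means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Deviations x = X - mean_x and y = Y - mean_y.
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    sum_xy = sum(p * q for p, q in zip(dev_x, dev_y))
    sum_xx = sum(p * p for p in dev_x)
    sum_yy = sum(q * q for q in dev_y)
    b_xy = sum_xy / sum_yy   # coefficient of X on Y
    b_yx = sum_xy / sum_xx   # coefficient of Y on X
    return mean_x, mean_y, b_xy, b_yx

mean_x, mean_y, b_xy, b_yx = coefficients_from_actual_means(
    [1, 2, 3, 4, 5], [10, 20, 30, 50, 40]
)
print(mean_x, mean_y, b_xy, b_yx)
```

Because deviations from the actual means always sum to zero, no correction term of the form (Σx)(Σy)/N is needed, which is what makes this method shorter than the direct method when the means are whole numbers.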
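ILLUSTRATION 3 [Finding out Regression Equations when deviations are taken from actual mean]
The following data relate to advertising expenditure and sales.

| Advertising expenditure (Rs. lakhs) | 1 | 2 | 3 | 4 | 5 |
| Sales (Rs. lakhs) | 10 | 20 | 30 | 50 | 40 |

Required:
(a) Find out two Regression Equations.
(b) Estimate the likely sales when advertising expenditure is Rs. 7 lakhs.
(c) What should be the advertising expenditure if the firm wants to attain sales target of Rs. 80 lakhs?
(d) Calculate Coefficient of Correlation.
Solution
Part (a) Regression Equations
Step 1 → Let advertising expenditure be denoted by X and sales by Y.
Step 2 → Computation Table for finding out Regression Equations

| X | x = X - X̄ | x² | Y | y = Y - Ȳ | y² | xy |
|---|---|---|---|---|---|---|
| 1 | -2 | 4 | 10 | -20 | 400 | 40 |
| 2 | -1 | 1 | 20 | -10 | 100 | 10 |
| 3 | 0 | 0 | 30 | 0 | 0 | 0 |
| 4 | 1 | 1 | 50 | 20 | 400 | 20 |
| 5 | 2 | 4 | 40 | 10 | 100 | 20 |
| ΣX = 15 | Σx = 0 | Σx² = 10 | ΣY = 150 | Σy = 0 | Σy² = 1000 | Σxy = 90 |

N = No. of pairs = 5
Step 3 → $\bar{X} = \dfrac{\Sigma X}{N} = \dfrac{15}{5} = 3$; $\bar{Y} = \dfrac{\Sigma Y}{N} = \dfrac{150}{5} = 30$
Step 4 → Regression Equation of X on Y: $X - \bar{X} = r\dfrac{\sigma_x}{\sigma_y}(Y - \bar{Y})$
where $r\dfrac{\sigma_x}{\sigma_y} = \dfrac{\Sigma xy}{\Sigma y^2} = \dfrac{90}{1000} = 0.09$
Hence, X - 3 = 0.09(Y - 30) = 0.09Y - 2.7
X = 0.09Y - 2.7 + 3 or X = 0.3 + 0.09Y
Step 5 → Regression Equation of Y on X: $Y - \bar{Y} = r\dfrac{\sigma_y}{\sigma_x}(X - \bar{X})$
where $r\dfrac{\sigma_y}{\sigma_x} = \dfrac{\Sigma xy}{\Sigma x^2} = \dfrac{90}{10} = 9$
Hence, Y - 30 = 9(X - 3) = 9X - 27
Y = 9X - 27 + 30 or Y = 3 + 9X
Part (b) Sales (Y) when Advertising Expenditure (X) is Rs. 7 lakhs:
Y = 3 + 9 × 7 = 3 + 63 = Rs. 66 lakhs
Part (c) Advertising Expenditure (X) to attain sales (Y) target of Rs. 80 lakhs:
X = 0.3 + 0.09(80) = 0.3 + 7.2 = Rs. 7.5 lakhs
Part (d) Coefficient of Correlation (r) = $\sqrt{b_{xy} \times b_{yx}} = \sqrt{0.09 \times 9} = \sqrt{0.81} = +0.9$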
FORMATION OF REGRESSION EQUATION BY TAKING DEVIATIONS FROM THE
ASSUMED MEANS
PRACTICAL STEPS INVOLVED IN THE FORMATION OF REGRESSION EQUATION
BY TAKING DEVIATIONS FROM THE ASSUMED MEANS
Step 1 → Denote one variable by X and another variable by Y.
Step 2 → Obtain the total of actual values of X, i.e. $\Sigma X$, and calculate the actual mean of X, i.e. $\bar{X}$.
Step 3 → Take deviations from the assumed mean of variable X and denote them by $d_x$.
Step 4 → Calculate squares of deviations from the assumed mean of variable X and obtain their total, i.e. $\Sigma d_x^2$.
Step 5 → Obtain the total of actual values of Y, i.e. $\Sigma Y$, and calculate the actual mean of Y, i.e. $\bar{Y}$.
Step 6 → Take deviations from the assumed mean of variable Y and denote them by $d_y$.
Step 7 → Calculate squares of deviations from the assumed mean of variable Y and obtain their total, i.e. $\Sigma d_y^2$.
Step 8 → Multiply the deviations from the assumed mean of X by the corresponding deviations from the assumed mean of Y and obtain the total, i.e. $\Sigma d_x d_y$.
Step 9 → Put the relevant values in the following equations:
(i) Regression Equation of X on Y:
$X - \bar{X} = r\dfrac{\sigma_x}{\sigma_y}(Y - \bar{Y})$
$$\text{where, } r\frac{\sigma_x}{\sigma_y} = \frac{\Sigma d_x d_y - \dfrac{(\Sigma d_x)(\Sigma d_y)}{N}}{\Sigma d_y^2 - \dfrac{(\Sigma d_y)^2}{N}}$$
(ii) Regression Equation of Y on X:
$Y - \bar{Y} = r\dfrac{\sigma_y}{\sigma_x}(X - \bar{X})$
$$\text{where, } r\frac{\sigma_y}{\sigma_x} = \frac{\Sigma d_x d_y - \dfrac{(\Sigma d_x)(\Sigma d_y)}{N}}{\Sigma d_x^2 - \dfrac{(\Sigma d_x)^2}{N}}$$
In both cases $d_x = X - A_x$ and $d_y = Y - A_y$, where $A_x$ and $A_y$ are the assumed means of X and Y.
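A useful check on the assumed-mean formulas is that the result does not depend on which assumed means are picked, since regression coefficients are independent of origin. The sketch below is illustrative only: the function name is hypothetical, the data are the advertising and sales figures used in the illustrations, and the pairs of assumed means are arbitrary choices.

```python
def coefficients_from_assumed_means(xs, ys, assumed_x, assumed_y):
    """Return (b_xy, b_yx) using deviations from assumed means A_x and A_y."""
    n = len(xs)
    d_x = [x - assumed_x for x in xs]
    d_y = [y - assumed_y for y in ys]
    sum_dx, sum_dy = sum(d_x), sum(d_y)
    sum_dxdy = sum(p * q for p, q in zip(d_x, d_y))
    sum_dx_sq = sum(p * p for p in d_x)
    sum_dy_sq = sum(q * q for q in d_y)
    # The correction terms remove the effect of the arbitrary origin.
    numerator = sum_dxdy - sum_dx * sum_dy / n
    b_xy = numerator / (sum_dy_sq - sum_dy ** 2 / n)
    b_yx = numerator / (sum_dx_sq - sum_dx ** 2 / n)
    return b_xy, b_yx

X = [1, 2, 3, 4, 5]
Y = [10, 20, 30, 50, 40]

# Arbitrary assumed means; (0, 0) reduces to the direct method.
results = [coefficients_from_assumed_means(X, Y, ax, ay)
           for ax, ay in [(4, 40), (2, 25), (0, 0)]]
print(results)
```

Every pair of assumed means should yield the same two coefficients, so in hand calculation the assumed means can be chosen purely for arithmetic convenience.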
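ILLUSTRATION 4 [Finding out Regression Equations when deviations are taken from assumed mean]
The following data relate to advertising expenditure and sales.

| Advertising expenditure (Rs. lakhs) | 1 | 2 | 3 | 4 | 5 |
| Sales (Rs. lakhs) | 10 | 20 | 30 | 50 | 40 |

Required:
(a) Find out two Regression Equations.
(b) Estimate the likely sales when advertising expenditure is Rs. 7 lakhs.
(c) What should be the advertising expenditure if the firm wants to attain sales target of Rs. 80 lakhs?
(d) Calculate Coefficient of Correlation.
Solution
Part (a) Regression Equations
Step 1 → Let advertising expenditure be denoted by X and sales by Y.
Step 2 → Computation Table for Finding Out Regression Equations

| X | d_x = X - 4 | d_x² | Y | d_y = Y - 40 | d_y² | d_x d_y |
|---|---|---|---|---|---|---|
| 1 | -3 | 9 | 10 | -30 | 900 | 90 |
| 2 | -2 | 4 | 20 | -20 | 400 | 40 |
| 3 | -1 | 1 | 30 | -10 | 100 | 10 |
| 4 | 0 | 0 | 50 | 10 | 100 | 0 |
| 5 | 1 | 1 | 40 | 0 | 0 | 0 |
| ΣX = 15 | Σd_x = -5 | Σd_x² = 15 | ΣY = 150 | Σd_y = -50 | Σd_y² = 1500 | Σd_x d_y = 140 |

N = No. of pairs = 5
Step 3 → $\bar{X} = \dfrac{\Sigma X}{N} = \dfrac{15}{5} = 3$; $\bar{Y} = \dfrac{\Sigma Y}{N} = \dfrac{150}{5} = 30$
Step 4 → Regression Equation of X on Y: $X - \bar{X} = b_{xy}(Y - \bar{Y})$
$$b_{xy} = \frac{\Sigma d_x d_y - \dfrac{(\Sigma d_x)(\Sigma d_y)}{N}}{\Sigma d_y^2 - \dfrac{(\Sigma d_y)^2}{N}} = \frac{140 - \dfrac{(-5)(-50)}{5}}{1500 - \dfrac{(-50)^2}{5}} = \frac{140 - 50}{1500 - 500} = \frac{90}{1000} = 0.09$$
Hence, X - 3 = 0.09(Y - 30) = 0.09Y - 2.7
X = 0.09Y - 2.7 + 3 or X = 0.3 + 0.09Y
Step 5 → Regression Equation of Y on X: $Y - \bar{Y} = b_{yx}(X - \bar{X})$
$$b_{yx} = \frac{\Sigma d_x d_y - \dfrac{(\Sigma d_x)(\Sigma d_y)}{N}}{\Sigma d_x^2 - \dfrac{(\Sigma d_x)^2}{N}} = \frac{140 - \dfrac{(-5)(-50)}{5}}{15 - \dfrac{(-5)^2}{5}} = \frac{140 - 50}{15 - 5} = \frac{90}{10} = 9$$
Hence, Y - 30 = 9(X - 3) = 9X - 27
Y = 9X - 27 + 30 or Y = 3 + 9X
Part (b) Sales (Y) when Advertising Expenditure (X) is Rs. 7 lakhs:
Y = 3 + 9 × 7 = 3 + 63 = Rs. 66 lakhs
Part (c) Advertising Expenditure (X) to attain sales (Y) target of Rs. 80 lakhs:
X = 0.3 + 0.09(80) = 0.3 + 7.2 = Rs. 7.5 lakhs
Part (d) Coefficient of Correlation (r) = $\sqrt{b_{xy} \times b_{yx}} = \sqrt{0.09 \times 9} = \sqrt{0.81} = +0.9$
FORMATION OF REGRESSION EQUATIONS IN CASE OF FREQUENCY DISTRIBUTION TABLE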
PRACTICAL STEPS INVOLVED IN THE FORMATION OF REGRESSION
EQUATIONS IN CASE OF FREQUENCY DISTRIBUTION TABLE
Step 1 → Prepare a Frequency Distribution Table (if not given).
Step 2 → List the class intervals for Y series in the column headings and those for X series in the row headings. Note: Their order can also be reversed.
Step 3 → Calculate the mid-point of each class interval of X series and Y series.
Step 4 → Calculate the step deviations of variable X and denote these deviations by $d_x$.
Step 5 → Multiply the frequencies of the variable X by the deviations of X and obtain the total $\Sigma f d_x$.
Step 6 → Take the squares of the deviations of the variable X, multiply them by the respective frequencies and obtain $\Sigma f d_x^2$.
Step 7 → Calculate the step deviations of the variable Y and denote these deviations by $d_y$.
Step 8 → Multiply the frequencies of the variable Y by the deviations of Y and obtain the total $\Sigma f d_y$.
Step 9 → Take the squares of the deviations of the variable Y, multiply them by the respective frequencies and obtain $\Sigma f d_y^2$.
Step 10 → Multiply $d_x d_y$ by the respective frequency of each cell and write the figure obtained in the right-hand upper corner of each cell.
Step 11 → Add together all the cornered values as calculated in Step 10 and obtain the total $\Sigma f d_x d_y$.
Step 12 → Put the relevant values in the following equations:
(i) Regression Equation of X on Y
$X - \bar{X} = r\dfrac{\sigma_x}{\sigma_y}(Y - \bar{Y})$
$$\text{where, } r\frac{\sigma_x}{\sigma_y} = \frac{\Sigma f d_x d_y - \dfrac{(\Sigma f d_x)(\Sigma f d_y)}{N}}{\Sigma f d_y^2 - \dfrac{(\Sigma f d_y)^2}{N}} \times \frac{i_x}{i_y}$$
$i_x$ = width of class interval of X variable, $i_y$ = width of class interval of Y variable
(ii) Regression Equation of Y on X
$Y - \bar{Y} = r\dfrac{\sigma_y}{\sigma_x}(X - \bar{X})$
$$\text{where, } r\frac{\sigma_y}{\sigma_x} = \frac{\Sigma f d_x d_y - \dfrac{(\Sigma f d_x)(\Sigma f d_y)}{N}}{\Sigma f d_x^2 - \dfrac{(\Sigma f d_x)^2}{N}} \times \frac{i_y}{i_x}$$
$i_x$ = width of class interval of X variable, $i_y$ = width of class interval of Y variable
ILLUSTRATION 6
The following table gives the ages of husbands and wives for 50 newly married couples. Find the two regression lines and estimate the age of husband when the age of wife is 20 and the age of wife when the age of husband is 30. Also calculate the coefficient of correlation.

| Age of Wives | Age of Husbands: 20-25 | 25-30 | 30-35 | Total |
|---|---|---|---|---|
| 16-20 | 9 | 14 | - | 23 |
| 20-24 | 6 | 11 | 3 | 20 |
| 24-28 | - | - | 7 | 7 |
| Total | 15 | 25 | 10 | 50 |

Solution
Step 1 → Let Age of Wives be denoted by X and Age of Husbands by Y.
Step 2 → Calculation of $\bar{X}$, $\bar{Y}$, $\Sigma f d_x$, $\Sigma f d_x^2$, $\Sigma f d_y$, $\Sigma f d_y^2$ and $\Sigma f d_x d_y$.
Computation Table I showing the calculation of $\Sigma f d_x$, $\Sigma f d_x^2$ and $\bar{X}$

| Age of Wives (X) | Mid-point m_x | d_x = m_x - 22 | f | f d_x | f d_x² |
|---|---|---|---|---|---|
| 16-20 | 18 | -4 | 23 | -92 | 368 |
| 20-24 | 22 | 0 | 20 | 0 | 0 |
| 24-28 | 26 | 4 | 7 | 28 | 112 |
| Total | | | N = 50 | Σf d_x = -64 | Σf d_x² = 480 |
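The totals in Computation Table I can be reproduced with a few lines of code. This is a sketch under stated assumptions: the variable names are hypothetical, the assumed mean of 22 is read from the deviation column of the table, and the mean is obtained with the usual shortcut $\bar{X} = A + \Sigma f d_x / N$, which the excerpt itself does not write out.

```python
# Wives' age classes and their marginal frequencies from the table.
wife_classes = [(16, 20), (20, 24), (24, 28)]
wife_freq = [23, 20, 7]
assumed_mean = 22  # origin used for d_x in Computation Table I

mid_points = [(lower + upper) / 2 for lower, upper in wife_classes]
d_x = [m - assumed_mean for m in mid_points]

n = sum(wife_freq)
sum_fd = sum(f * d for f, d in zip(wife_freq, d_x))
sum_fd_sq = sum(f * d * d for f, d in zip(wife_freq, d_x))

# Shortcut for the mean (assumption: standard assumed-mean formula).
mean_x = assumed_mean + sum_fd / n

print(n, sum_fd, sum_fd_sq, mean_x)
```

The same pattern, applied to the husbands' classes with their own assumed mean, would give the corresponding totals for the Y series.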