Regression

This is for studying purposes

Uploaded by

merulidavid
Copyright
© © All Rights Reserved
REGRESSION ANALYSIS

Regression is the measure of the average relationship between two or more variables in terms of the original units of the data. Regression analysis is a statistical tool used to study the nature and extent of the functional relationship between two or more variables, and to estimate (or predict) the unknown value of the dependent variable from the known value of the independent variable.

Dependent variable: the variable that is predicted on the basis of another variable. It is usually denoted by Y.

Independent variable: the variable that is used to predict another variable. It is usually denoted by X.

DISTINCTION BETWEEN CORRELATION AND REGRESSION

Correlation differs from regression in the following respects:

Basis of Distinction | Correlation | Regression
1. What it measures | Correlation measures the degree and direction of relationship between the variables. | Regression measures the nature and extent of the average relationship between two or more variables in terms of the original units of the data.
2. Whether relative or absolute measure | It is a relative measure showing association between variables. | It is an absolute measure of relationship.
3. Whether independent of choice of both origin and scale | Correlation coefficient is independent of change of both origin and scale. | Regression coefficient is independent of change of origin but not of scale.
4. Whether independent of units of measurement | Correlation coefficient is independent of units of measurement. | Regression coefficient is not independent of units of measurement.
5. Expression of the relationship | Expression of the relationship between the variables ranges from -1 to +1. | Expression of the relationship between the variables may be in any of the forms like $Y = a + bX$ or $Y = a + bX + cX^2$.
6. Whether a forecasting device | It is not a forecasting device. | It is a forecasting device which can be used to predict the value of the dependent variable from the given value of the independent variable.
7. Non-sense | There may be non-sense correlation, such as between weight of girls and income of boys. | There is nothing like non-sense regression.

REGRESSION LINES (OR REGRESSION MODELS)

In the case of the simple linear regression model (i.e. when there is only one independent variable and there is a linear relationship between the dependent and independent variables), there are two regression lines as follows:

1. Regression Line of X on Y

$$X = a + bY$$

where,
X = Dependent variable, Y = Independent variable
a = X-intercept (i.e. the value of the dependent variable when the value of the independent variable is zero)
b = Slope of the said line (i.e. the amount of change in the value of the dependent variable per unit change in the independent variable)

The values of the two constants 'a' and 'b' can be calculated for the given data of the X and Y variables by solving the following two algebraic normal equations:

$$\sum X = Na + b\sum Y$$
$$\sum XY = a\sum Y + b\sum Y^2$$

where,
N = Number of pairs of X and Y variables
$\sum X$ = Sum of values of variable X
$\sum Y$ = Sum of values of variable Y
$\sum Y^2$ = Sum of squares of values of variable Y
$\sum XY$ = Sum of products of values of X and Y variables

Use of Regression Line of X on Y: This line gives the probable value of X for any given value of Y.

Another way of expressing Regression Line of X on Y: This line can also be expressed as follows:

$$(X - \bar{X}) = b_{xy}(Y - \bar{Y}) \quad \text{or} \quad (X - \bar{X}) = r\frac{\sigma_x}{\sigma_y}(Y - \bar{Y}), \quad \text{since } b_{xy} = r\frac{\sigma_x}{\sigma_y}$$

where,
$\bar{X}$ = Arithmetic Mean of X Series
$\bar{Y}$ = Arithmetic Mean of Y Series
$\sigma_x$ = Standard Deviation of X Series
$\sigma_y$ = Standard Deviation of Y Series
r = Coefficient of Correlation between the two variables X and Y

2. Regression Line of Y on X

$$Y = a + bX$$

where,
X = Independent variable, Y = Dependent variable
a = Y-intercept (i.e. the value of the dependent variable when the value of the independent variable is zero)
b = Slope of the said line (i.e. the amount of change in the value of the dependent variable per unit change in the independent variable)

The values of the two constants 'a' and 'b' can be calculated for the given data of the X and Y variables by solving the following two algebraic normal equations:

$$\sum Y = Na + b\sum X$$
$$\sum XY = a\sum X + b\sum X^2$$

where,
N = Number of pairs of X and Y variables
$\sum X$ = Sum of values of variable X
$\sum Y$ = Sum of values of variable Y
$\sum X^2$ = Sum of squares of values of variable X
$\sum XY$ = Sum of products of values of X and Y variables

Use of Regression Line of Y on X: This line gives the probable value of Y for any given value of X.

Another way of expressing Regression Equation of Y on X: This line can also be expressed as follows:

$$(Y - \bar{Y}) = b_{yx}(X - \bar{X}) \quad \text{or} \quad (Y - \bar{Y}) = r\frac{\sigma_y}{\sigma_x}(X - \bar{X}), \quad \text{since } b_{yx} = r\frac{\sigma_y}{\sigma_x}$$

where,
$\bar{X}$ = Arithmetic Mean of X Series
$\bar{Y}$ = Arithmetic Mean of Y Series
$\sigma_x$ = Standard Deviation of X Series
$\sigma_y$ = Standard Deviation of Y Series
r = Coefficient of Correlation between the two variables X and Y

PROPERTIES OF LINEAR REGRESSION

1. Two Regression Equations: There are two linear regression equations.
(i) Regression equation of Y on X: $Y - \bar{Y} = b_{yx}(X - \bar{X})$
(ii) Regression equation of X on Y: $X - \bar{X} = b_{xy}(Y - \bar{Y})$
where $b_{yx}$ and $b_{xy}$ are respectively the regression coefficient (or slope) of Y on X and the regression coefficient (or slope) of X on Y:
$$b_{yx} = r\frac{\sigma_y}{\sigma_x}, \qquad b_{xy} = r\frac{\sigma_x}{\sigma_y}$$

2. Product of Regression Coefficients: The product of the two regression coefficients is equal to the square of the correlation coefficient.
$$b_{xy} \cdot b_{yx} = r^2$$

3. Signs of Regression Coefficients and Correlation Coefficient: r, $b_{xy}$ and $b_{yx}$ all have the same sign. If the correlation coefficient r is zero, the regression coefficients $b_{xy}$ and $b_{yx}$ are also zero.

4. Intersection at Means: The two regression lines always intersect at the point of means $(\bar{X}, \bar{Y})$.

5. Slopes: The slopes of the regression line of Y on X and the regression line of X on Y are respectively $b_{yx}$ (change in Y per unit change in X) and $b_{xy}$ (change in X per unit change in Y).

6. Angle between Regression Lines: The angle between the two regression lines depends on the Correlation Coefficient (r).
Value of r | Angle between Regression Lines
(a) If r = 0 | Regression lines are perpendicular to each other.
(b) If r = +1 or -1 | Regression lines coincide (i.e. become identical).

As the value of r increases numerically from 0 to 1, the angle between the regression lines decreases from 90° to 0°. In other words, the farther the two regression lines are from each other, the lesser is the degree of correlation, and the nearer the two regression lines are to each other, the higher is the degree of correlation.

7. Estimation of Value: The value of X or Y can be estimated from the linear equations if $r \neq 0$.

PROPERTIES OF REGRESSION COEFFICIENTS

The following are the important properties of Regression Coefficients:

1. Same sign: Both regression coefficients have the same sign, i.e. either both will be positive or both will be negative.

2. Both cannot be greater than one: If one of the regression coefficients is numerically greater than unity, the other must be less than unity, because the product of the two regression coefficients equals $r^2$ and so cannot exceed unity. In other words, both the regression coefficients cannot be greater than one.

3. Independent of origin: Regression coefficients are independent of the origin but not of scale.

4. A.M. ≥ r: The arithmetic mean of the regression coefficients is greater than or equal to the correlation coefficient (numerically).

5. r is G.M.: The correlation coefficient is the geometric mean of the regression coefficients, i.e. $r = \pm\sqrt{b_{xy} \times b_{yx}}$.

6. r, $b_{xy}$ and $b_{yx}$ have the same sign: The coefficient of correlation has the same sign as the regression coefficients, i.e. if the regression coefficients have a positive sign, r will also be positive, and if the regression coefficients have a negative sign, r will also be negative.

FORMATION OF REGRESSION EQUATIONS BY SOLVING NORMAL EQUATIONS

PRACTICAL STEPS INVOLVED IN THE FORMATION OF REGRESSION EQUATIONS BY SOLVING NORMAL EQUATIONS

Step 1: Denote one variable by X and another variable by Y.
Step 2: Obtain the total of actual values of X, i.e. $\sum X$.
Step 3: Calculate squares of actual values of variable X and obtain their total, i.e. $\sum X^2$.
Step 4: Obtain the total of actual values of Y, i.e. $\sum Y$.
Step 5: Calculate squares of actual values of variable Y and obtain their total, i.e. $\sum Y^2$.
Step 6: Multiply the actual values of X by the corresponding actual values of Y and obtain the total, i.e. $\sum XY$.
Step 7: Determine the values of 'a' and 'b' by solving the following two normal equations for the regression equation of X on Y.
$$\sum X = Na + b\sum Y \quad \ldots \text{Eq. I}$$
(Note: Eq. I is the summation of X = a + bY.)
$$\sum XY = a\sum Y + b\sum Y^2 \quad \ldots \text{Eq. II}$$
(Note: Eq. II is X = a + bY multiplied by Y and then summed.)
Then, put the values of 'a' and 'b' in the following equation: X = a + bY
Step 8: Determine the values of 'a' and 'b' by solving the following two normal equations for the regression equation of Y on X.
$$\sum Y = Na + b\sum X \quad \ldots \text{Eq. I}$$
(Note: Eq. I is the summation of Y = a + bX.)
$$\sum XY = a\sum X + b\sum X^2 \quad \ldots \text{Eq. II}$$
(Note: Eq. II is Y = a + bX multiplied by X and then summed.)
Then, put the values of 'a' and 'b' in the following equation: Y = a + bX

ILLUSTRATION 1 [Finding out Regression Equations by solving normal equations]

The following data relate to advertising expenditure and sales.

Advertising expenditure (Rs. lakhs) | 1 | 2 | 3 | 4 | 5
Sales (Rs. lakhs) | 10 | 20 | 30 | 50 | 40

Required:
(a) Find out the two Regression Equations.
(b) Estimate the likely sales when advertising expenditure is Rs. 7 lakhs.
(c) What should be the advertising expenditure if the firm wants to attain a sales target of Rs. 80 lakhs?
(d) Calculate the Coefficient of Correlation.

Solution

(a) Regression Equations

Step 1: Let advertising expenditure be denoted by X and sales by Y.

Step 2: Computation Table for finding out Regression Equations

X | X² | Y | Y² | XY
1 | 1 | 10 | 100 | 10
2 | 4 | 20 | 400 | 40
3 | 9 | 30 | 900 | 90
4 | 16 | 50 | 2500 | 200
5 | 25 | 40 | 1600 | 200
ΣX = 15 | ΣX² = 55 | ΣY = 150 | ΣY² = 5500 | ΣXY = 540

N = No. of pairs = 5

Step 3: Regression Equation of X on Y: X = a + bY

To determine the values of a and b, the following two normal equations are to be solved:
$$\sum X = Na + b\sum Y$$
$$\sum XY = a\sum Y + b\sum Y^2$$
Substituting the values in the above equations:
15 = 5a + 150b ... Eq. I
540 = 150a + 5500b ... Eq. II
Multiplying Eq. I by 30:
450 = 150a + 4500b ... Eq. III
540 = 150a + 5500b ... Eq. IV
Subtracting Eq. IV from Eq. III:
-90 = -1000b
b = 90/1000 = 0.09
Substituting the value of b in Eq. I:
15 = 5a + 150(0.09)
15 = 5a + 13.5
5a = 15 - 13.5 = 1.5
a = 1.5/5 = 0.3
Putting the values of a and b in the equation, the regression line of X on Y is:
X = 0.3 + 0.09Y

Step 4: Regression Equation of Y on X: Y = a + bX

To determine the values of a and b, the following two normal equations are to be solved:
$$\sum Y = Na + b\sum X$$
$$\sum XY = a\sum X + b\sum X^2$$
Substituting the values in the above equations:
150 = 5a + 15b ... Eq. I
540 = 15a + 55b ... Eq. II
Multiplying Eq. I by 3:
450 = 15a + 45b ... Eq. III
540 = 15a + 55b ... Eq. IV
Subtracting Eq. IV from Eq. III:
-90 = -10b
b = 90/10 = 9
Substituting the value of b in Eq. I:
150 = 5a + 15(9)
150 = 5a + 135
5a = 150 - 135 = 15
a = 15/5 = 3
Putting the values of a and b in the equation, the regression line of Y on X is:
Y = 3 + 9X

(b) Sales (Y) when Advertising Expenditure (X) is Rs. 7 lakhs:
Y = 3 + 9 × 7 = 3 + 63 = Rs. 66 lakhs

(c) Advertising Expenditure (X) to attain sales (Y) target of Rs. 80 lakhs:
X = 0.3 + 0.09(80) = 0.3 + 7.2 = Rs. 7.5 lakhs

(d) Coefficient of Correlation:
$$r = \sqrt{b_{xy} \times b_{yx}} = \sqrt{0.09 \times 9} = +0.9$$

PRACTICAL STEPS INVOLVED IN THE FORMATION OF REGRESSION EQUATIONS BY DIRECT METHOD WHEN NO DEVIATIONS ARE TAKEN

Step 1: Denote one variable by X and another variable by Y.
Step 2: Obtain the total of actual values of X, i.e. $\sum X$.
Step 3: Calculate squares of actual values of variable X and obtain their total, i.e. $\sum X^2$.
Step 4: Obtain the total of actual values of Y, i.e. $\sum Y$.
Step 5: Calculate squares of actual values of variable Y and obtain their total, i.e. $\sum Y^2$.
Step 6: Multiply the actual values of X by the corresponding actual values of Y and obtain the total, i.e. $\sum XY$.
Step 7: Calculate $b_{xy}$ and $b_{yx}$ as follows:
$$b_{xy} = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{N}}{\sum Y^2 - \frac{(\sum Y)^2}{N}}, \qquad b_{yx} = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{N}}{\sum X^2 - \frac{(\sum X)^2}{N}}$$

ILLUSTRATION 2 [Finding out Regression Equations by Direct Method]

The following data relate to advertising expenditure and sales.

Advertising expenditure (Rs. lakhs) | 1 | 2 | 3 | 4 | 5
Sales (Rs. lakhs) | 10 | 20 | 30 | 50 | 40

Required:
(a) Find out the two Regression Equations.
(b) Estimate the likely sales when advertising expenditure is Rs. 7 lakhs.
(c) What should be the advertising expenditure if the firm wants to attain a sales target of Rs. 80 lakhs?
(d) Calculate the Coefficient of Correlation.

Solution

(a) Regression Equations

Step 1: Let advertising expenditure be denoted by X and sales by Y.

Step 2: Computation Table for finding out Regression Equations

X | X² | Y | Y² | XY
1 | 1 | 10 | 100 | 10
2 | 4 | 20 | 400 | 40
3 | 9 | 30 | 900 | 90
4 | 16 | 50 | 2500 | 200
5 | 25 | 40 | 1600 | 200
ΣX = 15 | ΣX² = 55 | ΣY = 150 | ΣY² = 5500 | ΣXY = 540

N = No. of pairs = 5

Step 3: Calculation of $b_{xy}$ and $b_{yx}$
$$b_{xy} = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{N}}{\sum Y^2 - \frac{(\sum Y)^2}{N}} = \frac{540 - \frac{(15)(150)}{5}}{5500 - \frac{(150)^2}{5}} = \frac{540 - 450}{5500 - 4500} = \frac{90}{1000} = 0.09$$
$$b_{yx} = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{N}}{\sum X^2 - \frac{(\sum X)^2}{N}} = \frac{540 - \frac{(15)(150)}{5}}{55 - \frac{(15)^2}{5}} = \frac{540 - 450}{55 - 45} = \frac{90}{10} = 9$$

Step 4: Calculation of Means
$$\bar{X} = \frac{\sum X}{N} = \frac{15}{5} = 3, \qquad \bar{Y} = \frac{\sum Y}{N} = \frac{150}{5} = 30$$

Step 5: Formation of Regression Equations
(i) Regression Equation of X on Y: $(X - \bar{X}) = b_{xy}(Y - \bar{Y})$
(X - 3) = 0.09(Y - 30)
X = 0.3 + 0.09Y
(ii) Regression Equation of Y on X: $(Y - \bar{Y}) = b_{yx}(X - \bar{X})$
(Y - 30) = 9(X - 3)
Y = 3 + 9X

(b) Sales (Y) when Advertising Expenditure (X) is Rs. 7 lakhs:
Y = 3 + 9 × 7 = 3 + 63 = Rs. 66 lakhs

(c) Advertising Expenditure (X) to attain sales (Y) target of Rs. 80 lakhs:
X = 0.3 + 0.09(80) = 0.3 + 7.2 = Rs. 7.5 lakhs

(d) Coefficient of Correlation:
$$r = \sqrt{b_{xy} \times b_{yx}} = \sqrt{0.09 \times 9} = +0.9$$

FORMATION OF REGRESSION EQUATIONS BY TAKING DEVIATIONS FROM THE ACTUAL MEANS

PRACTICAL STEPS INVOLVED IN THE FORMATION OF REGRESSION EQUATIONS BY TAKING DEVIATIONS FROM THE ACTUAL MEANS

Step 1: Denote one variable by X and another variable by Y.
Step 2: Obtain the total of actual values of X, i.e. $\sum X$, and calculate the actual mean of X, i.e. $\bar{X}$.
Step 3: Take deviations from the actual mean of variable X and denote them by x.
Step 4: Calculate squares of deviations from the actual mean of variable X and obtain their total, i.e. $\sum x^2$.
Step 5: Obtain the total of actual values of Y, i.e. $\sum Y$, and calculate the actual mean of Y, i.e. $\bar{Y}$.
Step 6: Take deviations from the actual mean of variable Y and denote them by y.
Step 7: Calculate squares of deviations from the actual mean of variable Y and obtain their total, i.e. $\sum y^2$.
Step 8: Multiply the deviations from the actual mean of X by the corresponding deviations from the actual mean of Y and obtain the total, i.e. $\sum xy$.
Step 9: Put the relevant values in the following equations:
(i) Regression Equation of X on Y:
$$X - \bar{X} = r\frac{\sigma_x}{\sigma_y}(Y - \bar{Y}), \quad \text{where } r\frac{\sigma_x}{\sigma_y} = b_{xy} = \frac{\sum xy}{\sum y^2}$$
(ii) Regression Equation of Y on X:
$$Y - \bar{Y} = r\frac{\sigma_y}{\sigma_x}(X - \bar{X}), \quad \text{where } r\frac{\sigma_y}{\sigma_x} = b_{yx} = \frac{\sum xy}{\sum x^2}$$

ILLUSTRATION 3 [Finding out Regression Equations when deviations are taken from actual mean]

The following data relate to advertising expenditure and sales.

Advertising expenditure (Rs. lakhs) | 1 | 2 | 3 | 4 | 5
Sales (Rs. lakhs) | 10 | 20 | 30 | 50 | 40

Required:
(a) Find out the two Regression Equations.
(b) Estimate the likely sales when advertising expenditure is Rs. 7 lakhs.
(c) What should be the advertising expenditure if the firm wants to attain a sales target of Rs. 80 lakhs?
(d) Calculate the Coefficient of Correlation.

Solution

Part (a) Regression Equations

Step 1: Let advertising expenditure be denoted by X and sales by Y.

Step 2: Computation Table for finding out Regression Equations

X | x = X - X̄ | x² | Y | y = Y - Ȳ | y² | xy
1 | -2 | 4 | 10 | -20 | 400 | 40
2 | -1 | 1 | 20 | -10 | 100 | 10
3 | 0 | 0 | 30 | 0 | 0 | 0
4 | 1 | 1 | 50 | 20 | 400 | 20
5 | 2 | 4 | 40 | 10 | 100 | 20
ΣX = 15 | Σx = 0 | Σx² = 10 | ΣY = 150 | Σy = 0 | Σy² = 1000 | Σxy = 90

N = No. of pairs = 5

Step 3: Calculation of Means
$$\bar{X} = \frac{\sum X}{N} = \frac{15}{5} = 3, \qquad \bar{Y} = \frac{\sum Y}{N} = \frac{150}{5} = 30$$

Step 4: Regression Equation of X on Y: $X - \bar{X} = b_{xy}(Y - \bar{Y})$
$$b_{xy} = \frac{\sum xy}{\sum y^2} = \frac{90}{1000} = 0.09$$
Hence, X - 3 = 0.09(Y - 30) = 0.09Y - 2.7
X = 0.09Y - 2.7 + 3 or X = 0.3 + 0.09Y

Step 5: Regression Equation of Y on X: $Y - \bar{Y} = b_{yx}(X - \bar{X})$
$$b_{yx} = \frac{\sum xy}{\sum x^2} = \frac{90}{10} = 9$$
Hence, Y - 30 = 9(X - 3) = 9X - 27
Y = 9X - 27 + 30 or Y = 3 + 9X

Part (b) Sales (Y) when Advertising Expenditure (X) is Rs. 7 lakhs:
Y = 3 + 9 × 7 = 3 + 63 = Rs. 66 lakhs

Part (c) Advertising Expenditure (X) to attain sales (Y) target of Rs. 80 lakhs:
X = 0.3 + 0.09(80) = 0.3 + 7.2 = Rs. 7.5 lakhs

Part (d) Coefficient of Correlation:
$$r = \sqrt{b_{xy} \times b_{yx}} = \sqrt{0.09 \times 9} = +0.9$$

FORMATION OF REGRESSION EQUATION BY TAKING DEVIATIONS FROM THE ASSUMED MEANS

PRACTICAL STEPS INVOLVED IN THE FORMATION OF REGRESSION EQUATION BY TAKING DEVIATIONS FROM THE ASSUMED MEANS

Step 1: Denote one variable by X and another variable by Y.
Step 2: Obtain the total of actual values of X, i.e. $\sum X$, and calculate the actual mean of X, i.e. $\bar{X}$.
Step 3: Take deviations from the assumed mean of variable X and denote them by $d_x$.
Step 4: Calculate squares of deviations from the assumed mean of variable X and obtain their total, i.e. $\sum d_x^2$.
Step 5: Obtain the total of actual values of Y, i.e. $\sum Y$, and calculate the actual mean of Y, i.e. $\bar{Y}$.
Step 6: Take deviations from the assumed mean of variable Y and denote them by $d_y$.
Step 7: Calculate squares of deviations from the assumed mean of variable Y and obtain their total, i.e. $\sum d_y^2$.
Step 8: Multiply the deviations from the assumed mean of X by the corresponding deviations from the assumed mean of Y and obtain the total, i.e. $\sum d_x d_y$.
Step 9: Put the relevant values in the following equations:
(i) Regression Equation of X on Y:
$$X - \bar{X} = r\frac{\sigma_x}{\sigma_y}(Y - \bar{Y}), \quad \text{where } r\frac{\sigma_x}{\sigma_y} = b_{xy} = \frac{\sum d_x d_y - \frac{(\sum d_x)(\sum d_y)}{N}}{\sum d_y^2 - \frac{(\sum d_y)^2}{N}}$$
(ii) Regression Equation of Y on X:
$$Y - \bar{Y} = r\frac{\sigma_y}{\sigma_x}(X - \bar{X}), \quad \text{where } r\frac{\sigma_y}{\sigma_x} = b_{yx} = \frac{\sum d_x d_y - \frac{(\sum d_x)(\sum d_y)}{N}}{\sum d_x^2 - \frac{(\sum d_x)^2}{N}}$$
In both equations, $d_x = X - A_x$ and $d_y = Y - A_y$, where $A_x$ and $A_y$ are the assumed means of X and Y.

ILLUSTRATION 4 [Finding out Regression Equations when deviations are taken from assumed mean]

The following data relate to advertising expenditure and sales.

Advertising expenditure (Rs. lakhs) | 1 | 2 | 3 | 4 | 5
Sales (Rs. lakhs) | 10 | 20 | 30 | 50 | 40

Required:
(a) Find out the two Regression Equations.
(b) Estimate the likely sales when advertising expenditure is Rs. 7 lakhs.
(c) What should be the advertising expenditure if the firm wants to attain a sales target of Rs. 80 lakhs?
(d) Calculate the Coefficient of Correlation.

Solution

Part (a) Regression Equations

Step 1: Let advertising expenditure be denoted by X and sales by Y.

Step 2: Computation Table for Finding Out Regression Equations

X | d_x = X - 4 | d_x² | Y | d_y = Y - 40 | d_y² | d_x d_y
1 | -3 | 9 | 10 | -30 | 900 | 90
2 | -2 | 4 | 20 | -20 | 400 | 40
3 | -1 | 1 | 30 | -10 | 100 | 10
4 | 0 | 0 | 50 | 10 | 100 | 0
5 | 1 | 1 | 40 | 0 | 0 | 0
ΣX = 15 | Σd_x = -5 | Σd_x² = 15 | ΣY = 150 | Σd_y = -50 | Σd_y² = 1500 | Σd_x d_y = 140

N = No. of pairs = 5

Step 3: Calculation of Means
$$\bar{X} = \frac{\sum X}{N} = \frac{15}{5} = 3, \qquad \bar{Y} = \frac{\sum Y}{N} = \frac{150}{5} = 30$$

Step 4: Regression Equation of X on Y: $X - \bar{X} = b_{xy}(Y - \bar{Y})$
$$b_{xy} = \frac{\sum d_x d_y - \frac{(\sum d_x)(\sum d_y)}{N}}{\sum d_y^2 - \frac{(\sum d_y)^2}{N}} = \frac{140 - \frac{(-5)(-50)}{5}}{1500 - \frac{(-50)^2}{5}} = \frac{140 - 50}{1500 - 500} = \frac{90}{1000} = 0.09$$
Hence, X - 3 = 0.09(Y - 30) = 0.09Y - 2.7
X = 0.09Y - 2.7 + 3 or X = 0.3 + 0.09Y

Step 5: Regression Equation of Y on X: $Y - \bar{Y} = b_{yx}(X - \bar{X})$
$$b_{yx} = \frac{\sum d_x d_y - \frac{(\sum d_x)(\sum d_y)}{N}}{\sum d_x^2 - \frac{(\sum d_x)^2}{N}} = \frac{140 - \frac{(-5)(-50)}{5}}{15 - \frac{(-5)^2}{5}} = \frac{140 - 50}{15 - 5} = \frac{90}{10} = 9$$
Hence, Y - 30 = 9(X - 3) = 9X - 27
Y = 9X - 27 + 30 or Y = 3 + 9X

Part (b) Sales (Y) when Advertising Expenditure (X) is Rs. 7 lakhs:
Y = 3 + 9 × 7 = 3 + 63 = Rs. 66 lakhs

Part (c) Advertising Expenditure (X) to attain sales (Y) target of Rs. 80 lakhs:
X = 0.3 + 0.09(80) = 0.3 + 7.2 = Rs. 7.5 lakhs

Part (d) Coefficient of Correlation:
$$r = \sqrt{b_{xy} \times b_{yx}} = \sqrt{0.09 \times 9} = +0.9$$

FORMATION OF REGRESSION EQUATIONS IN CASE OF FREQUENCY DISTRIBUTION TABLE

PRACTICAL STEPS INVOLVED IN THE FORMATION OF REGRESSION EQUATIONS IN CASE OF FREQUENCY DISTRIBUTION TABLE

Step 1: Prepare a Frequency Distribution Table (if not given).
Step 2: List the class intervals for the Y series in the column headings and those for the X series in the row headings. Note: Their order can also be reversed.
Step 3: Calculate the mid-point of each class interval of the X series and the Y series.
Step 4: Calculate the step deviations of variable X and denote these deviations by $d_x$.
Step 5: Multiply the frequencies of the variable X by the deviations of X and obtain the total $\sum f d_x$.
Step 6: Take the squares of the deviations of the variable X, multiply them by the respective frequencies and obtain $\sum f d_x^2$.
Step 7: Calculate the step deviations of the variable Y and denote these deviations by $d_y$.
Step 8: Multiply the frequencies of the variable Y by the deviations of Y and obtain the total $\sum f d_y$.
Step 9: Take the squares of the deviations of the variable Y, multiply them by the respective frequencies and obtain $\sum f d_y^2$.
Step 10: Multiply $d_x d_y$ by the respective frequency of each cell and write the figure obtained in the right-hand upper corner of each cell.
Step 11: Add together all the cornered values as calculated in Step 10 and obtain the total $\sum f d_x d_y$.
Step 12: Put the relevant values in the following equations:
(i) Regression Equation of X on Y:
$$X - \bar{X} = r\frac{\sigma_x}{\sigma_y}(Y - \bar{Y}), \quad \text{where } r\frac{\sigma_x}{\sigma_y} = \frac{\sum f d_x d_y - \frac{(\sum f d_x)(\sum f d_y)}{N}}{\sum f d_y^2 - \frac{(\sum f d_y)^2}{N}} \times \frac{i_x}{i_y}$$
(ii) Regression Equation of Y on X:
$$Y - \bar{Y} = r\frac{\sigma_y}{\sigma_x}(X - \bar{X}), \quad \text{where } r\frac{\sigma_y}{\sigma_x} = \frac{\sum f d_x d_y - \frac{(\sum f d_x)(\sum f d_y)}{N}}{\sum f d_x^2 - \frac{(\sum f d_x)^2}{N}} \times \frac{i_y}{i_x}$$
In both equations, $i_x$ = width of class interval of the X variable and $i_y$ = width of class interval of the Y variable.

ILLUSTRATION 6

The following table gives the ages of husbands and wives for 50 newly married couples. Find the two regression lines and estimate the age of the husband when the age of the wife is 20 and the age of the wife when the age of the husband is 30. Also calculate the coefficient of correlation.

Age of Wives | Age of Husbands 20-25 | 25-30 | 30-35 | Total
16-20 | 9 | 14 | - | 23
20-24 | 6 | 11 | 3 | 20
24-28 | - | - | 7 | 7
Total | 15 | 25 | 10 | 50

Solution

Step 1: Let Age of Wives be denoted by X and Age of Husbands by Y.

Step 2: Calculation of $\bar{X}$, $\bar{Y}$, $\sum f d_x$, $\sum f d_x^2$, $\sum f d_y$, $\sum f d_y^2$ and $\sum f d_x d_y$.

Computation Table I showing the calculation of $\sum f d_x$, $\sum f d_x^2$ and $\bar{X}$

Age of Wives (X) | Mid-point m_x | d_x = m_x - 22 | f | f d_x | f d_x²
16-20 | 18 | -4 | 23 | -92 | 368
20-24 | 22 | 0 | 20 | 0 | 0
24-28 | 26 | 4 | 7 | 28 | 112
Total | | | N = 50 | Σf d_x = -64 | Σf d_x² = 480
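The direct-method formulas used throughout these illustrations can be checked numerically. The following Python sketch is an illustration only, not part of the original text: the function name `regression_lines` and the `(x, y, f)` input layout are assumptions made here. It treats ungrouped data as pairs with frequency 1 and grouped data as class mid-points weighted by cell frequency, which assumes that each mid-point represents its whole class.

```python
from math import copysign, sqrt


def regression_lines(points):
    """Return both regression lines from (x, y, f) triples.

    x and y are values (or class mid-points); f is the frequency of the
    pair. Use f = 1 for ungrouped data. (Hypothetical helper.)
    """
    points = list(points)
    n = sum(f for _, _, f in points)
    sum_x = sum(f * x for x, _, f in points)
    sum_y = sum(f * y for _, y, f in points)
    sum_xx = sum(f * x * x for x, _, f in points)
    sum_yy = sum(f * y * y for _, y, f in points)
    sum_xy = sum(f * x * y for x, y, f in points)

    # Numerator and denominators of the direct-method formulas.
    c_xy = sum_xy - sum_x * sum_y / n
    c_xx = sum_xx - sum_x ** 2 / n
    c_yy = sum_yy - sum_y ** 2 / n

    b_yx = c_xy / c_xx  # regression coefficient of Y on X
    b_xy = c_xy / c_yy  # regression coefficient of X on Y
    mean_x, mean_y = sum_x / n, sum_y / n
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "b_yx": b_yx,
        "b_xy": b_xy,
        "a_yx": mean_y - b_yx * mean_x,  # intercept of Y = a + bX
        "a_xy": mean_x - b_xy * mean_y,  # intercept of X = a + bY
        # r is the geometric mean of the coefficients, with their sign.
        "r": copysign(sqrt(b_xy * b_yx), c_xy),
    }


# Advertising expenditure (X) and sales (Y) from Illustrations 1 to 4.
illustration_1 = [(1, 10, 1), (2, 20, 1), (3, 30, 1), (4, 50, 1), (5, 40, 1)]
res_1 = regression_lines(illustration_1)

# Illustration 6: (wife mid-point, husband mid-point, cell frequency).
illustration_6 = [
    (18, 22.5, 9), (18, 27.5, 14),
    (22, 22.5, 6), (22, 27.5, 11), (22, 32.5, 3),
    (26, 32.5, 7),
]
res_6 = regression_lines(illustration_6)

print(res_1)
print(res_6)
```

For the advertising data this reproduces the values obtained by hand in Illustrations 1 to 4 (b_xy = 0.09, b_yx = 9, r = 0.9). Working with the mid-points directly avoids the step-deviation scaling by i_x and i_y, because the regression coefficients are independent of origin and the scale factor cancels when the deviations are converted back to original units.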
