
DISCRETE MATH Chapter-8

Uploaded by

noter3848
Regression Analysis

INTRODUCTION

After having established the fact that two variables are closely related, we may be interested in estimating (predicting) the value of one variable given the value of another. For example, if we know that advertising and sales are correlated, we may find out the expected amount of sales for a given advertising expenditure or the required amount of expenditure for achieving a fixed sales target. The statistical tool with the help of which we are in a position to estimate (or predict) the unknown values of one variable from known values of another variable is called regression. With the help of regression analysis, we are in a position to find out the average probable change in one variable given a certain amount of change in another.

The dictionary meaning of the term 'regression' is the act of returning or going back. The term 'regression' was first used in 1877 by Francis Galton while studying the relationship between the heights of fathers and sons. His study of the heights of about one thousand fathers and sons revealed a very interesting relationship, i.e., tall fathers tend to have tall sons and short fathers, short sons; but the average height of the sons of a group of tall fathers is less than that of the tall fathers, and the average height of the sons of a group of short fathers is greater than that of the short fathers. The line describing this tendency to regress or go back was called by Galton a 'Regression Line'. The term is still used to describe the line drawn for a group of points to represent the trend present, but it no longer necessarily carries the original implication that Galton intended. These days there is a growing tendency among modern writers to use the term estimating line or predicting line instead of regression line.

Regression analysis is a branch of statistical theory that is widely used in almost all the scientific disciplines. In economics it is the basic technique for measuring or estimating the relationship among economic variables that constitute the essence of economic theory and economic life.
For example, if we know that two variables, price (X) and demand (Y), are closely related, we can find out the most probable value of X for a given value of Y or the most probable value of Y for a given value of X. Similarly, if we know that the amount of tax and the rise in the price of a commodity are closely related, we can find out the expected price for a certain amount of tax levy.

The regression analysis helps in three important ways:

1. It provides estimates of values of the dependent variable from values of the independent variable. The device used to accomplish this estimation procedure is the regression line, which describes the average relationship existing between the X and Y variables.

2. The second goal of regression analysis is to obtain a measure of the error involved in using the regression line as a basis for estimation. For this purpose, the standard error of estimate is calculated. If the line fits the data closely, that is, if there is relatively little scatter of the observations around the regression line, good estimates can be made of the Y variable. On the other hand, if there is a great deal of scatter of the observations around the fitted regression line, the line will not produce accurate estimates of the dependent variable.

3. With the help of regression analysis, we can obtain a measure of the degree of association or correlation that exists between the two variables. The coefficient of determination calculated for this purpose measures the strength of the relationship that exists between the variables. It assesses the proportion of variance that has been accounted for by the regression equation.

The tool of regression analysis can be extended to three or more variables, but in this text we shall confine ourselves to the problems of two variables only, i.e., simple regression.
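The second and third uses listed above can be made concrete with a small calculation. The following is a minimal sketch, not the text's own procedure: it fits a least-squares line to the length-of-service and income figures used in Illustration 24 of this chapter, then computes the standard error of estimate and the coefficient of determination. The function names `fit_line` and `fit_quality` are hypothetical, and dividing the unexplained variation by N (rather than N - 2, which many texts prefer) is an assumption.

```python
import math


def fit_line(xs, ys):
    """Return (a, b) of the least-squares line Y = a + bX."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of products and squares of deviations from the actual means.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b


def fit_quality(xs, ys):
    """Return (standard error of estimate, coefficient of determination)."""
    a, b = fit_line(xs, ys)
    n = len(ys)
    y_bar = sum(ys) / n
    # Scatter of the observations around the fitted line.
    unexplained = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    # Scatter of the observations around their own mean.
    total = sum((y - y_bar) ** 2 for y in ys)
    # Assumption: divisor N; many texts divide by N - 2 instead.
    std_error = math.sqrt(unexplained / n)
    r_squared = 1 - unexplained / total
    return std_error, r_squared


# Length of service (years) and income (Rs. hundred), from Illustration 24.
service_years = [11, 7, 2, 5, 8, 6, 10]
income = [7, 5, 3, 2, 6, 4, 8]

a, b = fit_line(service_years, income)
std_error, r_squared = fit_quality(service_years, income)
print(f"Y = {a:.3f} + {b:.3f}X")
print(f"standard error of estimate = {std_error:.3f}")
print(f"coefficient of determination = {r_squared:.3f}")
```

A small standard error relative to the spread of Y, together with a coefficient of determination close to 1, indicates that the line fits the data closely; here about 78 per cent of the variation in income is accounted for by the line.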
Difference between Correlation and Regression Analysis

There are two important points of difference between correlation and regression analysis:

1. Whereas the correlation coefficient is a measure of the degree of relationship between X and Y, the objective of regression analysis is to study the 'nature of relationship' between the variables.

2. The cause and effect relation is more clearly indicated through regression analysis than by correlation. Correlation is merely a tool for ascertaining the degree of relationship between two variables and, therefore, we cannot say that one variable is the cause and the other the effect.

THE LINEAR BIVARIATE REGRESSION MODEL

In regression analysis, as in other types of statistical studies, we usually proceed by observing the sample data and using the results obtained as estimates of the corresponding population relationship. To make valid inferences, we must assume some population model. For a bivariate population, there are many possible models that can be constructed to describe the mutual variations of the two variables. The particular one in which we are interested is called the simple linear regression model. This model is constructed under the following set of assumptions:

(1) The value of the dependent variable, Y, is dependent in some degree upon the value of the independent variable, X. The dependent variable is assumed to be a random variable, but the values of X are assumed to be fixed quantities that are selected and controlled by the experimenter. The requirement that the independent variable assumes fixed values, however, is not a critical one. Useful results can still be obtained by regression analysis in the case where both X and Y are random variables.
(2) The average relationship between X and Y can be adequately described by a linear equation $Y = a + bX$ whose geometrical presentation is a straight line, as in the diagram that follows:

[Diagram: the straight line Y = a + bX, with X on the horizontal axis and Y on the vertical axis]

As is clear from the above diagram, the height of the line tells the average value of Y at a fixed value of X. When X = 0, the average value of Y is equal to a. The value of a is called the Y-intercept, since it is the point at which the straight line crosses the Y-axis. The slope of the line is measured by b, which gives the average amount of change of Y per unit change in the value of X. The sign of b also indicates the type of relationship between Y and X.

(3) Associated with each value of X there is a sub-population of Y. The distribution of the sub-population may be assumed to be normal or non-specified in the sense that it is unknown. In any event, the distribution of each sub-population of Y is conditional on the value of X.

(4) The mean of each sub-population of Y is called the expected value of Y for a given X: $E(Y|X) = \mu_{yx}$. Furthermore, under the assumption of a linear relationship between X and Y, all values of $E(Y|X)$ or $\mu_{yx}$ must fall on a straight line. That is,

$E(Y|X) = \mu_{yx} = a + bX$

which is the population regression equation for our bivariate linear model. In this equation a and b are called the population regression coefficients.

(5) An individual value in each sub-population of Y may be expressed as:

$Y = E(Y|X) + e$

where e is the deviation of a particular value of Y from $\mu_{yx}$ and is called the error term or the stochastic disturbance term. The errors are assumed to be independent random variables because the Y's are random variables and independent. The expectations of these errors are zero: $E(e) = 0$. Moreover, if the Y's are normal variables, the error can also be assumed to be normal.

(6) It is assumed that the variances of all sub-populations, called variances of the regression, are identical.

Regression Lines

If we take the case of two variables X and Y, we shall have two regression lines: the regression line of X on Y and the regression line of Y on X. The regression line of Y on X gives the most probable values of Y for given values of X, and the regression line of X on Y gives the most probable values of X for given values of Y. Thus, we have two regression lines. However, when there is either perfect positive or perfect negative correlation between the two variables, the two regression lines will coincide, i.e., we will have one line. The farther the two regression lines are from each other, the lesser is the degree of correlation, and the nearer the two regression lines are to each other, the higher is the degree of correlation. If the variables are independent, r is zero and the lines of regression are at right angles, i.e., parallel to the X-axis and Y-axis.

It should be noted that the regression lines cut each other at the point of average of X and Y, i.e., if from the point where both the regression lines cut each other a perpendicular is drawn on the X-axis, we will get the mean value of X, and if from that point a horizontal line is drawn on the Y-axis, we will get the mean value of Y.

Regression Equations

Regression equations are algebraic expressions of the regression lines. Since there are two regression lines, there are two regression equations: the regression equation of X on Y is used to describe the variations in the values of X for given changes in Y, and the regression equation of Y on X is used to describe the variation in the values of Y for given changes in X.

Regression Equation of Y on X

The regression equation of Y on X is expressed as follows:

$Y_c = a + bX$

where $Y_c$ is the dependent variable to be estimated and X is the independent variable. In this equation a and b are two unknown constants (fixed numerical values) which determine the position of the line completely. The constants are called the parameters of the line. If the value of either or both of them is changed, another line is determined.
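The statements about the two regression lines can be checked numerically. The sketch below is illustrative only: it uses the aptitude-score and productivity-index figures from Illustration 20 of this chapter and the deviation-from-mean formulas for the regression coefficients given later in the chapter. The function name `regression_coefficients` and the variable names are hypothetical.

```python
import math


def regression_coefficients(xs, ys):
    """Return (x_bar, y_bar, b_yx, b_xy) from deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    b_yx = s_xy / s_xx  # slope of the regression line of Y on X
    b_xy = s_xy / s_yy  # slope of the regression line of X on Y
    return x_bar, y_bar, b_yx, b_xy


# Aptitude scores (X) and productivity indices (Y), from Illustration 20.
aptitude = [60, 62, 65, 70, 72, 48, 53, 73, 65, 82]
productivity = [68, 60, 62, 80, 85, 40, 52, 62, 60, 81]

x_bar, y_bar, b_yx, b_xy = regression_coefficients(aptitude, productivity)

# Intercepts of the two lines: Y = a_yx + b_yx * X and X = a_xy + b_xy * Y.
a_yx = y_bar - b_yx * x_bar
a_xy = x_bar - b_xy * y_bar

# r is the geometric mean of the two coefficients and carries their sign.
r = math.copysign(math.sqrt(b_yx * b_xy), b_yx)

print(f"Y on X: Y = {a_yx:.2f} + {b_yx:.3f}X")
print(f"X on Y: X = {a_xy:.2f} + {b_xy:.3f}Y")
print(f"r = {r:.3f}")

# Both lines pass through the point of means (x_bar, y_bar).
print(abs(a_yx + b_yx * x_bar - y_bar) < 1e-9)
print(abs(a_xy + b_xy * y_bar - x_bar) < 1e-9)
```

The two slopes differ because each line minimises errors in a different direction: the line of Y on X is used to estimate Y from X, and the line of X on Y to estimate X from Y. They coincide only when r is +1 or -1.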
$\bar{Y} = \frac{\sum Y}{N} = \frac{297}{8} = 37.125$

$b_{yx} = \frac{N\sum d_x d_y - \sum d_x \sum d_y}{N\sum d_x^2 - (\sum d_x)^2} = \frac{8 \times 238 - (-3)(1)}{8 \times 57 - (-3)^2} = \frac{1907}{447} = 4.266$

$Y - 37.125 = 4.266\,(X - 5.625)$
or $Y - 37.125 = 4.266X - 23.996$
$Y = 13.129 + 4.266X$

When X is 100, Y shall be
$Y = 13.129 + 4.266\,(100) = 439.729$

Thus the likely expenditure on Research and Development for an allocation of Rs. 100,000 is Rs. 439.729.

Regression Coefficients

The quantity b in the regression equations is called the "regression coefficient" or "slope coefficient". Since there are two regression equations, there are two regression coefficients: the regression coefficient of X on Y and the regression coefficient of Y on X.

Regression Coefficient of X on Y

The regression coefficient of X on Y is represented by the symbol $b_{xy}$. It measures the amount of change in X corresponding to a unit change in Y. The regression coefficient of X on Y is given by

$b_{xy} = r\,\frac{\sigma_x}{\sigma_y}$

When deviations are taken from the means of X and Y, the regression coefficient is obtained by

$b_{xy} = \frac{\sum xy}{\sum y^2}$

When deviations are taken from assumed means, the value of $b_{xy}$ is obtained as follows:

$b_{xy} = \frac{N\sum d_x d_y - \sum d_x \sum d_y}{N\sum d_y^2 - (\sum d_y)^2}$

Regression Coefficient of Y on X

The regression coefficient of Y on X is represented by $b_{yx}$. It measures the amount of change in Y corresponding to a unit change in X. The value of $b_{yx}$ is given by

$b_{yx} = r\,\frac{\sigma_y}{\sigma_x}$

When deviations are taken from actual means of X and Y,

$b_{yx} = \frac{\sum xy}{\sum x^2}$

When deviations are taken from assumed means,

$b_{yx} = \frac{N\sum d_x d_y - \sum d_x \sum d_y}{N\sum d_x^2 - (\sum d_x)^2}$

Properties of the Regression Coefficients

(1) The coefficient of correlation is the geometric mean of the two regression coefficients. Symbolically:

$r = \sqrt{b_{xy} \times b_{yx}}$

Proof. $b_{xy} = r\,\frac{\sigma_x}{\sigma_y}$ and $b_{yx} = r\,\frac{\sigma_y}{\sigma_x}$, so that $b_{xy} \times b_{yx} = r\,\frac{\sigma_x}{\sigma_y} \times r\,\frac{\sigma_y}{\sigma_x} = r^2$, and hence $r = \sqrt{b_{xy} \times b_{yx}}$.

(2) If one of the regression coefficients is greater than unity, the other must be less than unity, since the value of the coefficient of correlation cannot exceed unity. For example, if $b_{xy} = 1.2$ and $b_{yx} = 1.4$, r would be $\sqrt{1.2 \times 1.4} = \sqrt{1.68} \approx 1.296$, which is not possible.

(3) Both the regression coefficients will have the same sign, i.e., they will be either positive or negative.
In other words, it is not possible that one of the regression coefficients has a minus sign and the other a plus sign.

(4) The coefficient of correlation will have the same sign as that of the regression coefficients, i.e., if the regression coefficients have a negative sign, r will also have a negative sign, and if the regression coefficients have a positive sign, r would also be positive. For example, if $b_{xy} = -0.2$ and $b_{yx} = -0.8$,

$r = -\sqrt{0.2 \times 0.8} = -0.4$

(5) The average value of the two regression coefficients would be greater than the value of the coefficient of correlation (or equal to it when the two coefficients are equal). In symbols, $(b_{xy} + b_{yx})/2 > r$. For example, if $b_{xy} = 0.8$ and $b_{yx} = 0.4$, the average of the two would be $(0.8 + 0.4)/2 = 0.6$ and the value of r would be $\sqrt{0.8 \times 0.4} = 0.566$, which is less than 0.6.

(6) Regression coefficients are independent of change of origin but not of scale.*

*Proof. $b_{yx} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2}$

Let $u = \frac{X - a}{h}$ and $v = \frac{Y - b}{k}$. Then $X = a + hu$ and $Y = b + kv$, so that $\bar{X} = a + h\bar{u}$ and $\bar{Y} = b + k\bar{v}$. Subtracting, we get

$X - \bar{X} = h(u - \bar{u})$ and $Y - \bar{Y} = k(v - \bar{v})$

Substituting these values in the above formula, we get

$b_{yx} = \frac{hk \sum (u - \bar{u})(v - \bar{v})}{h^2 \sum (u - \bar{u})^2} = \frac{k}{h}\, b_{vu}$

Similarly, we have $b_{xy} = \frac{h}{k}\, b_{uv}$. Hence the result.

Illustration 5. On the basis of figures recorded below for 'Supply' and 'Price' for nine years, calculate the regression coefficients and the value of r.

| Year | 2002 | 2003 | 2004 | 2005 | 2006 | 2007 | 2008 | 2009 | 2010 |
|---|---|---|---|---|---|---|---|---|---|
| Supply | 80 | 82 | 86 | 91 | 83 | 85 | 89 | 96 | 93 |
| Price | 145 | 140 | 130 | 124 | 133 | 127 | 120 | 110 | 116 |

Solution. Let the price be denoted by Y and supply by X.

CALCULATION OF REGRESSION COEFFICIENTS

| Year | Supply X | $d_x = X - 90$ | $d_x^2$ | Price Y | $d_y = Y - 127$ | $d_y^2$ | $d_x d_y$ |
|---|---|---|---|---|---|---|---|
| 2002 | 80 | -10 | 100 | 145 | +18 | 324 | -180 |
| 2003 | 82 | -8 | 64 | 140 | +13 | 169 | -104 |
| 2004 | 86 | -4 | 16 | 130 | +3 | 9 | -12 |
| 2005 | 91 | +1 | 1 | 124 | -3 | 9 | -3 |
| 2006 | 83 | -7 | 49 | 133 | +6 | 36 | -42 |
| 2007 | 85 | -5 | 25 | 127 | 0 | 0 | 0 |
| 2008 | 89 | -1 | 1 | 120 | -7 | 49 | +7 |
| 2009 | 96 | +6 | 36 | 110 | -17 | 289 | -102 |
| 2010 | 93 | +3 | 9 | 116 | -11 | 121 | -33 |

[Figure: Regression of Y on X]

Illustration 17. The General Sales Manager of Kiran Enterprises, an enterprise dealing in the sale of ready-made men's wear, is toying with the idea of increasing his sales to Rs. 80,000.
On checking the records of sales during the last 10 years, it was found that the annual sale proceeds and advertisement expenditure were highly correlated to the extent of 0.8. It was further noted that the annual average sale has been Rs. 45,000 and annual average advertisement expenditure Rs. 30,000, with a variance of Rs. 1,600 and Rs. 625 in sales and advertisement expenditure respectively. In view of the above, how much expenditure on advertisement would you suggest the General Sales Manager of the enterprise incur to meet his target of sales? (MBA, Kurukshetra Univ., 2004)

Solution. Let sales be denoted by X and advertisement expenditure by Y. Since the advertisement expenditure is to be estimated for a given level of sales, we are required to find the regression equation of Y on X, given by

$Y - \bar{Y} = r\,\frac{\sigma_y}{\sigma_x}\,(X - \bar{X})$

Here $\bar{X} = 45{,}000$, $\bar{Y} = 30{,}000$, $\sigma_x = \sqrt{1600} = 40$, $\sigma_y = \sqrt{625} = 25$ and $r = 0.8$. Substituting the values, we get

$Y - 30{,}000 = 0.8 \times \frac{25}{40}\,(X - 45{,}000) = 0.5\,(X - 45{,}000)$
$Y = 30{,}000 + 0.5X - 22{,}500 = 7{,}500 + 0.5X$

When X = 80,000,
$Y = 7{,}500 + 0.5 \times 80{,}000 = 7{,}500 + 40{,}000 = 47{,}500$

Hence the General Sales Manager should spend Rs. 47,500 to have the target sales of Rs. 80,000.

Illustration 18. Suppose that you are interested in using past expenditure on research and development by a firm to predict current expenditures on R & D. You got the following data by taking a random sample of firms, where X is the amount spent on R & D (in lakhs of rupees) 5 years ago and Y is the amount spent on R & D (in lakhs of rupees) in the current year:

| X | 30 | 50 | 20 | 80 | 10 | 20 | 20 | 40 |
|---|---|---|---|---|---|---|---|---|
| Y | 50 | 80 | 30 | 110 | 20 | 20 | 40 | 50 |

(i) Find the regression equation of Y on X.
(ii) If a firm is chosen randomly and X = 10, can you use the regression to predict the value of Y? Discuss.
(MBA, Madurai-Kamaraj Univ., 2000)

Solution.

CALCULATION OF REGRESSION EQUATION

| X | $d_x = X - 33$ | $d_x^2$ | Y | $d_y = Y - 50$ | $d_y^2$ | $d_x d_y$ |
|---|---|---|---|---|---|---|
| 30 | -3 | 9 | 50 | 0 | 0 | 0 |
| 50 | +17 | 289 | 80 | +30 | 900 | +510 |
| 20 | -13 | 169 | 30 | -20 | 400 | +260 |
| 80 | +47 | 2209 | 110 | +60 | 3600 | +2820 |
| 10 | -23 | 529 | 20 | -30 | 900 | +690 |
| 20 | -13 | 169 | 20 | -30 | 900 | +390 |
| 20 | -13 | 169 | 40 | -10 | 100 | +130 |
| 40 | +7 | 49 | 50 | 0 | 0 | 0 |
| $\sum X = 270$ | $\sum d_x = +6$ | $\sum d_x^2 = 3592$ | $\sum Y = 400$ | $\sum d_y = 0$ | $\sum d_y^2 = 6800$ | $\sum d_x d_y = 4800$ |
(i) Regression equation of Y on X: $Y - \bar{Y} = b_{yx}\,(X - \bar{X})$

$\bar{X} = \frac{270}{8} = 33.75$, $\bar{Y} = \frac{400}{8} = 50$

$b_{yx} = \frac{N\sum d_x d_y - \sum d_x \sum d_y}{N\sum d_x^2 - (\sum d_x)^2} = \frac{8 \times 4800 - 6 \times 0}{8 \times 3592 - (6)^2} = \frac{38400}{28700} = 1.338$

$Y - 50 = 1.338\,(X - 33.75)$ or $Y = 4.84 + 1.338X$

(ii) When X is 10: $Y = 4.84 + 1.338\,(10) = 18.22$. For X = 10, Y is 18.22.

Illustration 19. You are given the following information about advertising expenditure and sales:

| | Adv. Exp. (X) (Rs. lakhs) | Sales (Y) (Rs. lakhs) |
|---|---|---|
| Mean | 10 | 90 |
| Standard deviation | 3 | 12 |

Correlation coefficient = 0.8

(i) Obtain the two regression equations.
(ii) Find the likely sales when advertisement budget is Rs. 15 lakhs.
(iii) What should be the advertisement budget if the company wants to attain sales target of Rs. 120 lakhs?
(MBA, Kumaun Univ., 2000; MBA, DU, 2002; MBA (HCA), DU, 2003)

Solution. (i) Regression equation of X on Y:

$X - \bar{X} = r\,\frac{\sigma_x}{\sigma_y}\,(Y - \bar{Y})$

$\bar{X} = 10$, $\bar{Y} = 90$, $r = 0.8$, $\sigma_x = 3$, $\sigma_y = 12$

$X - 10 = 0.8 \times \frac{3}{12}\,(Y - 90)$
$X - 10 = 0.2\,(Y - 90)$
$X - 10 = 0.2Y - 18$ or $X = -8 + 0.2Y$

Regression equation of Y on X:

$Y - \bar{Y} = r\,\frac{\sigma_y}{\sigma_x}\,(X - \bar{X})$

$Y - 90 = 0.8 \times \frac{12}{3}\,(X - 10)$
$Y - 90 = 3.2\,(X - 10)$ or $Y = 58 + 3.2X$

(ii) By putting X = 15 in the regression equation of Y on X, we can find out the likely sales:
$Y = 58 + 3.2\,(15) = 58 + 48 = 106$
Thus the likely sales for an advertisement budget of Rs. 15 lakhs is Rs. 106 lakhs.

(iii) By putting Y = 120 in the regression equation of X on Y, we can find what the advertisement budget should be:
$X = -8 + 0.2\,(120) = 16$
Thus for attaining a sales target of Rs. 120 lakhs, the advertisement budget should be Rs. 16 lakhs.

Illustration 20. The following table gives the aptitude test scores and productivity indices of 10 workers selected at random:

| Aptitude scores | 60 | 62 | 65 | 70 | 72 | 48 | 53 | 73 | 65 | 82 |
|---|---|---|---|---|---|---|---|---|---|---|
| Productivity index | 68 | 60 | 62 | 80 | 85 | 40 | 52 | 62 | 60 | 81 |

Estimate (i) the productivity index of a worker whose test score is 92, (ii) the test score of a worker whose productivity index is 75.
(MBA, Delhi Univ., 2001; MBA, Hyderabad Univ., 2004)

Solution. Since productivity depends on aptitude scores, let Y denote the productivity and X the aptitude score.
CALCULATION OF REGRESSION EQUATIONS

| Aptitude score X | $x = X - 65$ | $x^2$ | Productivity index Y | $y = Y - 65$ | $y^2$ | $xy$ |
|---|---|---|---|---|---|---|
| 60 | -5 | 25 | 68 | +3 | 9 | -15 |
| 62 | -3 | 9 | 60 | -5 | 25 | +15 |
| 65 | 0 | 0 | 62 | -3 | 9 | 0 |
| 70 | +5 | 25 | 80 | +15 | 225 | +75 |
| 72 | +7 | 49 | 85 | +20 | 400 | +140 |
| 48 | -17 | 289 | 40 | -25 | 625 | +425 |
| 53 | -12 | 144 | 52 | -13 | 169 | +156 |
| 73 | +8 | 64 | 62 | -3 | 9 | -24 |
| 65 | 0 | 0 | 60 | -5 | 25 | 0 |
| 82 | +17 | 289 | 81 | +16 | 256 | +272 |
| $\sum X = 650$ | $\sum x = 0$ | $\sum x^2 = 894$ | $\sum Y = 650$ | $\sum y = 0$ | $\sum y^2 = 1752$ | $\sum xy = 1044$ |

For answering part (i) of the question we have to fit a regression equation of Y on X.

$Y - \bar{Y} = b_{yx}\,(X - \bar{X})$

$b_{yx} = \frac{\sum xy}{\sum x^2} = \frac{1044}{894} = 1.168$

$Y - 65 = 1.168\,(X - 65)$
$Y - 65 = 1.168X - 75.92$ or $Y = 1.168X - 10.92$
$Y_{92} = 1.168\,(92) - 10.92 = 107.456 - 10.92 = 96.536$

For answering part (ii) of the question we have to fit a regression equation of X on Y.

$X - \bar{X} = b_{xy}\,(Y - \bar{Y})$

$b_{xy} = \frac{\sum xy}{\sum y^2} = \frac{1044}{1752} = 0.596$

$X - 65 = 0.596\,(Y - 65)$ or $X - 65 = 0.596Y - 38.74$
$X = 0.596Y + 26.26$
$X_{75} = 0.596\,(75) + 26.26 = 44.7 + 26.26 = 70.96$

Illustration 21. In a partially destroyed laboratory record of an analysis of correlation data, the following results only are legible:

Variance of X = 9
Regression equations: $8X - 10Y + 66 = 0$; $40X - 18Y = 214$

Find on the basis of the above information:
(i) the mean values of X and Y,
(ii) the coefficient of correlation between X and Y, and
(iii) the standard deviation of Y.
(MBA, Pune Univ., 2002; MBA, Anna Univ., 2003)

Solution. (i) Calculating mean values of X and Y:

$8X - 10Y = -66$ ... (i)
$40X - 18Y = 214$ ... (ii)

Multiplying eq. (i) by 5 and subtracting eq. (ii):

$40X - 50Y = -330$
$40X - 18Y = 214$
$-32Y = -544$, so $Y = 17$ or $\bar{Y} = 17$

Putting the value of Y in eq. (i):
$8X - 10\,(17) = -66$
$8X = -66 + 170$
$8X = 104$, so $X = 13$ or $\bar{X} = 13$

(ii) Coefficient of correlation between X and Y. For finding the value of r, we have to determine the value of the regression coefficients. Since we do not know which equation is the regression of X on Y and which is of Y on X, we have to make an assumption. Assuming eq. (i) as the regression of X on Y:

$8X = 10Y - 66$, so $X = \frac{10}{8}Y - \frac{66}{8}$ and $b_{xy} = \frac{10}{8}$

From eq. (ii):
$-18Y = 214 - 40X$, so $Y = \frac{40}{18}X - \frac{214}{18}$ and $b_{yx} = \frac{40}{18}$

Since both the regression coefficients are greater than 1, our assumption is wrong. Hence eq. (i) is the regression equation of Y on X:

$-10Y = -66 - 8X$, so $Y = \frac{66}{10} + \frac{8}{10}X$ and $b_{yx} = \frac{8}{10}$

From eq. (ii):
$40X = 214 + 18Y$, so $X = \frac{214}{40} + \frac{18}{40}Y$ and $b_{xy} = \frac{18}{40}$

$r = \sqrt{b_{xy} \times b_{yx}} = \sqrt{\frac{18}{40} \times \frac{8}{10}} = \sqrt{0.36} = 0.6$

(iii) The value of the standard deviation of Y can be determined from any regression coefficient:

$b_{xy} = r\,\frac{\sigma_x}{\sigma_y}$, with $b_{xy} = \frac{18}{40}$, $r = 0.6$, $\sigma_x = \sqrt{9} = 3$

Substituting the values, $\frac{18}{40} = 0.6 \times \frac{3}{\sigma_y}$, or $18\,\sigma_y = 72$, or $\sigma_y = 4$.

Illustration 22. The coefficient of correlation between the ages of husbands and wives in a community was found to be +0.8, the average of husbands' age was 25 years and that of wives' age 22 years. Their standard deviations were 4 and 5 years respectively. Find with the help of regression equations:
(a) the expected age of husband when wife's age is 16 years, and
(b) the expected age of wife when husband's age is 33 years.
(MBA, Osmania Univ., 2000)

Solution. Let age of wife be denoted by Y and age of husband by X. We are given

$\bar{X} = 25$, $\bar{Y} = 22$, $\sigma_x = 4$, $\sigma_y = 5$, $r = 0.8$

For answering part (a) we have to fit a regression equation of X on Y:

$X - \bar{X} = r\,\frac{\sigma_x}{\sigma_y}\,(Y - \bar{Y})$
$X - 25 = 0.8 \times \frac{4}{5}\,(Y - 22)$ or $X - 25 = 0.64\,(Y - 22)$
$X - 25 = 0.64Y - 14.08$ or $X = 10.92 + 0.64Y$

When Y = 16, $X = 10.92 + 0.64\,(16) = 10.92 + 10.24 = 21.16$

Thus, the expected age of husband when wife's age is 16 years shall be 21.16 years.

For answering part (b) we have to fit a regression equation of Y on X:

$Y - \bar{Y} = r\,\frac{\sigma_y}{\sigma_x}\,(X - \bar{X})$
$Y - 22 = 0.8 \times \frac{5}{4}\,(X - 25)$
$Y - 22 = (X - 25)$ or $Y = -3 + X$

When X = 33, $Y = -3 + 33 = 30$

Thus, the expected age of wife when husband's age is 33 is 30 years.

Illustration 23. The following data relate to marks obtained by 250 students in Accountancy and Statistics examination of a university:

| Subject | Arithmetic Mean | Standard Deviation |
|---|---|---|
| Accountancy | 48 | 4 |
| Statistics | 55 | 5 |

Coefficient of correlation between marks in accountancy and statistics is +0.8. Find the two regression equations and estimate the marks obtained by a student in Statistics who secured 50 marks in Accountancy. (M.Com., Sukhadia Univ.)

Solution. Let marks in accountancy be denoted by X and in statistics by Y.
Regression equation of X on Y:

$X - \bar{X} = r\,\frac{\sigma_x}{\sigma_y}\,(Y - \bar{Y})$

$\bar{X} = 48$, $\bar{Y} = 55$, $\sigma_x = 4$, $\sigma_y = 5$, $r = 0.8$

$X - 48 = 0.8 \times \frac{4}{5}\,(Y - 55)$
$X - 48 = 0.64\,(Y - 55)$
$X = 0.64Y + 12.8$

Regression equation of Y on X:

$Y - \bar{Y} = r\,\frac{\sigma_y}{\sigma_x}\,(X - \bar{X})$
$Y - 55 = 0.8 \times \frac{5}{4}\,(X - 48)$
$Y - 55 = (X - 48)$ or $Y = 7 + X$

If marks in accountancy, i.e., X, is 50, the marks in statistics shall be 57.

Illustration 24. The following figures relate to length of service and income of the employees of an organisation:

| Length of Service (Years) | 11 | 7 | 2 | 5 | 8 | 6 | 10 |
|---|---|---|---|---|---|---|---|
| Income (Rs. hundred) | 7 | 5 | 3 | 2 | 6 | 4 | 8 |

Compute the coefficient of correlation for the above data. Find the two regression equations and examine the relationship.

Solution. Let length of service be denoted by X and income by Y.

CALCULATION OF REGRESSION EQUATIONS AND CORRELATION COEFFICIENT

| X | $x = X - 7$ | $x^2$ | Y | $y = Y - 5$ | $y^2$ | $xy$ |
|---|---|---|---|---|---|---|
| 11 | +4 | 16 | 7 | +2 | 4 | +8 |
| 7 | 0 | 0 | 5 | 0 | 0 | 0 |
| 2 | -5 | 25 | 3 | -2 | 4 | +10 |
| 5 | -2 | 4 | 2 | -3 | 9 | +6 |
| 8 | +1 | 1 | 6 | +1 | 1 | +1 |
| 6 | -1 | 1 | 4 | -1 | 1 | +1 |
| 10 | +3 | 9 | 8 | +3 | 9 | +9 |
| $\sum X = 49$ | $\sum x = 0$ | $\sum x^2 = 56$ | $\sum Y = 35$ | $\sum y = 0$ | $\sum y^2 = 28$ | $\sum xy = 35$ |

Regression equation of X on Y: $X - \bar{X} = b_{xy}\,(Y - \bar{Y})$

$b_{xy} = \frac{\sum xy}{\sum y^2} = \frac{35}{28} = 1.25$

$X - 7 = 1.25\,(Y - 5)$
$X - 7 = 1.25Y - 6.25$ or $X = 0.75 + 1.25Y$

Regression equation of Y on X: $Y - \bar{Y} = b_{yx}\,(X - \bar{X})$

$b_{yx} = \frac{\sum xy}{\sum x^2} = \frac{35}{56} = 0.625$

$Y - 5 = 0.625\,(X - 7)$
$Y - 5 = 0.625X - 4.375$ or $Y = 0.625 + 0.625X$

$r = \sqrt{b_{xy} \times b_{yx}} = \sqrt{1.25 \times 0.625} = 0.884$

Thus, there is a high degree of positive correlation between length of service and income.

Illustration 25. In a correlation study the following values are obtained:

| | X | Y |
|---|---|---|
| Mean | 65 | 67 |
| S.D. | 2.5 | 3.5 |

Coefficient of correlation r = 0.8. Find the two regression equations. (M.Com., Madurai-Kamaraj Univ., 2007)

Solution. Regression equation of X on Y:

$X - \bar{X} = r\,\frac{\sigma_x}{\sigma_y}\,(Y - \bar{Y})$

$\bar{X} = 65$, $\bar{Y} = 67$, $\sigma_x = 2.5$, $\sigma_y = 3.5$, $r = 0.8$

$X - 65 = 0.8 \times \frac{2.5}{3.5}\,(Y - 67)$
$X - 65 = 0.571\,(Y - 67)$
$X - 65 = 0.571Y - 38.26$
$X = 0.571Y + 26.74$

Regression equation of Y on X:

$Y - \bar{Y} = r\,\frac{\sigma_y}{\sigma_x}\,(X - \bar{X})$
$Y - 67 = 0.8 \times \frac{3.5}{2.5}\,(X - 65)$
$Y - 67 = 1.12\,(X - 65)$
$Y - 67 = 1.12X - 72.8$
$Y = 1.12X - 5.8$

Illustration 26. In trying to evaluate the effectiveness of its advertising campaign, a firm compiled the following information:

| Year | 2003 | 2004 | 2005 | 2006 | 2007 | 2008 | 2009 | 2010 |
|---|---|---|---|---|---|---|---|---|
| Adv. Expenditure ('000 Rs.) | 12 | 15 | 15 | 23 | 24 | 38 | 42 | 48 |
| Sales (Rs. lakh) | 5.0 | 5.6 | 5.8 | 7.0 | 7.2 | 8.8 | 9.2 | 9.5 |

Calculate the regression equation of sales on advertising expenditure. Estimate the probable sales when advertisement expenditure is Rs. 60 thousand.

Solution.

CALCULATION OF REGRESSION EQUATION

| X | $d_x = X - 24$ | $d_x^2$ | Y | $d_y = Y - 7.0$ | $d_y^2$ | $d_x d_y$ |
|---|---|---|---|---|---|---|
| 12 | -12 | 144 | 5.0 | -2.0 | 4.00 | 24.0 |
| 15 | -9 | 81 | 5.6 | -1.4 | 1.96 | 12.6 |
| 15 | -9 | 81 | 5.8 | -1.2 | 1.44 | 10.8 |
| 23 | -1 | 1 | 7.0 | 0 | 0 | 0 |
| 24 | 0 | 0 | 7.2 | +0.2 | 0.04 | 0 |
| 38 | +14 | 196 | 8.8 | +1.8 | 3.24 | 25.2 |
| 42 | +18 | 324 | 9.2 | +2.2 | 4.84 | 39.6 |
| 48 | +24 | 576 | 9.5 | +2.5 | 6.25 | 60.0 |
| $\sum X = 217$ | $\sum d_x = 25$ | $\sum d_x^2 = 1403$ | $\sum Y = 58.1$ | $\sum d_y = 2.1$ | $\sum d_y^2 = 21.77$ | $\sum d_x d_y = 172.2$ |

$\bar{X} = \frac{\sum X}{N} = \frac{217}{8} = 27.125$; $\bar{Y} = \frac{\sum Y}{N} = \frac{58.1}{8} = 7.2625$

Regression equation of sales on advertisement expenditure is given by:

$(Y - \bar{Y}) = b_{yx}\,(X - \bar{X})$

$b_{yx} = \frac{N\sum d_x d_y - (\sum d_x)(\sum d_y)}{N\sum d_x^2 - (\sum d_x)^2} = \frac{8\,(172.2) - (25)(2.1)}{8\,(1403) - (25)^2} = \frac{1377.6 - 52.5}{11224 - 625} = \frac{1325.1}{10599} = 0.125$

Substituting the values, we have

$Y - 7.2625 = 0.125\,(X - 27.125)$
$Y - 7.2625 = 0.125X - 3.3906$
$Y = 3.8719 + 0.1250X$

When X = 60, the estimated value of Y shall be:
$Y = 3.8719 + 0.1250\,(60) = 3.8719 + 7.5 = 11.37$

Illustration 27. A research company summarized advertising expenditure and sales results as follows:

| | Ad. Expenditure (Rs. crore) | Sales (Rs. crore) |
|---|---|---|
| Mean | 20 | 200 |
| S.D. | 18 | 170 |

r = 0.6. Derive the two regression equations. (MBA, GGDIP Univ., 2009)

Solution. Since sales depend on advertisement expenditure, we take sales as Y and advertisement expenditure as X.

Regression equation of X on Y:

$X - \bar{X} = r\,\frac{\sigma_x}{\sigma_y}\,(Y - \bar{Y})$

$\bar{X} = 20$, $\sigma_x = 18$, $\sigma_y = 170$, $r = 0.6$, $\bar{Y} = 200$

$X - 20 = 0.6 \times \frac{18}{170}\,(Y - 200)$
$X - 20 = 0.064\,(Y - 200)$
$X - 20 = 0.064Y - 12.8$
$X = 0.064Y + 7.2$

Regression equation of Y on X:

$Y - \bar{Y} = r\,\frac{\sigma_y}{\sigma_x}\,(X - \bar{X})$

$\bar{Y} = 200$, $\sigma_y = 170$, $\sigma_x = 18$, $r = 0.6$, $\bar{X} = 20$

$Y - 200 = 0.6 \times \frac{170}{18}\,(X - 20)$
$Y - 200 = 5.667\,(X - 20)$
$Y - 200 = 5.667X - 113.34$
$Y = 5.667X + 86.66$

Answer the following questions, each question carries one mark:
(i) What is regression?
(ii) What is the use of studying regression?
(iii) When will regression coefficients become coefficient of correlation? (MBA, Madurai-Kamaraj Univ., 2003)
(iv) Write down the two regression equations.
(v) Write down the formula for regression coefficient of x and y.
(vi) What do you understand by the term 'regression line'? (M.Com., M.K. Univ., 2003)
(vii) What are regression coefficients?
(viii) Can both the regression coefficients exceed one?
(ix) Are regression coefficients independent of change of scale and origin or only origin?
(x) In the regression equation of y on x, how do you interpret the values of 'a' and 'b'?
(xi) Who coined the term 'regression'?

Answer the following questions, each question carries four marks:
(i) Distinguish between 'correlation' and 'regression analysis'. Why are there two regression lines? (MBA, UP Tech. Univ., 2007)
(ii) What are regression coefficients? How do you interpret them?
(iii) What are the important characteristics of regression coefficients?
(iv) If two regression coefficients are -1.2 and -0.8, what would be the value of r?
(v) What are the important uses of regression analysis?

(a) Explain the concept of regression and point out its usefulness in dealing with business problems.
(b) Distinguish between correlation and regression. Also point out the properties of regression coefficients.

(a) Compare and contrast the role of correlation and regression in studying the interdependence of two variates.
(b) Explain the concept of regression and point out its importance in business forecasting.

Under what conditions can there be one regression line? Explain.

"The regression line gives only the best estimate of the value of the quantity in question. We may assess the degree of uncertainty in this estimate by calculating a quantity known as the standard error of estimate." Elucidate.

Do you agree with the view that regression equations are irreversible, i.e., we cannot find out the regression of X on Y from that of Y on X?

(a) Point out the usefulness of regression analysis in business and industry.
(b) What is linear regression? When is it used? (MBA, Madurai-Kamaraj Univ., 2003)
(c) Discuss the role of correlation and regression analysis in business. Illustrate.

What are regression lines? With the help of an example, illustrate how they help in business decision-making. (MBA, Delhi Univ., 2004)

What do you understand by the term 'regression analysis'? Point out the role of regression analysis in business decision making. What are the important properties of regression coefficients? (MBA, Osmania Univ.; MBA, Delhi Univ., 2006)

(a) Write any two differences between correlation and regression. (M.Com., Madras Univ., 2009)
(b) What are regression coefficients? State some of the important properties of regression coefficients.
(c) Write down the mathematical properties of Correlation Coefficient and Regression Coefficient. (MBA, Hyderabad Univ., 2005)

(a) State the utility of regression in economic analysis.
The following data give the hardness (X) and tensile strength (Y) of 7 samples of metal in certain units. Find the linear regression equation of Y on X.

| X | 146 | 152 | 158 | 164 | 170 | 176 | 182 |
|---|---|---|---|---|---|---|---|
| Y | 75 | 78 | 77 | 89 | 82 | 85 | 86 |

[Y = 29.45 + 0.314X]

12. The average daily wage for the working class in Nagpur is Rs. 12 and for that in Delhi Rs. 18, their respective standard deviations are Rs. 2 and Rs. 3 and the coefficient of correlation is 0.67. Find the most likely wage in Delhi corresponding to the wage of Rs. 20 in Nagpur. [26.04]

13. There are two series of index numbers, D for disposable personal income and S for a salary of the company. The mean and standard deviations of the D series are 120 and 15 respectively and of the S series 115 and 10. The coefficient of correlation between the two series is 0.75. From the given information obtain a linear equation for estimating the values of S for different values of D. How will you interpret the values of S corresponding to different values of D obtained from the equation?
Can the same equation be used for estimating values of D for different values of S? [S = 0.5D + 55; No]

14. The following calculations have been made for closing prices of 12 stocks (X) on the Bombay Stock Exchange on a certain day along with the volume of sales in thousand of shares (Y). From these calculations find the regression equations.
$\sum X = 580$, $\sum Y = 370$, $\sum XY = 11{,}494$, $\sum X^2 = 41{,}658$, $\sum Y^2 = 17{,}206$
[Y = 53.55 - 0.47X, X = 79.16 - 1.1Y]

Given the following data, what will be the possible yield when the rainfall is 29"?

| | Rainfall | Production |
|---|---|---|
| Mean | 29" | 40 units per acre |
| S.D. | 3" | 6 units per acre |

Coefficient of correlation between rainfall and production = 0.8. [40 units]

16. In the following table are recorded data showing the test scores made by salesmen on an intelligence test and their weekly sales:
Salesmen: 1 2 3 4 5 6 7 8 9 10
Test Scores: 45 15 50 60 80 90 85 40 80 35
Sales ('000): 20 GS EAS: re PO arg 60 oS. PS ea <5
Calculate the regression line of sales on test score and estimate the most probable weekly sales volume if a salesman makes a score of 70. [Y = -0.541 + 0.078X, 4.919]

17. The following marks have been obtained by a group of students in Statistics (out of 100):
Paper I: 80 45 55 56 58 60 65 68 70 15 85
Paper II: 82 56 Oo it, 60 eo 64 65 70 4 90
Compute the coefficient of correlation for the above data. Find the lines of regression and examine the relationship. [r = 0.75, 1+0,75.X, X = 4.25 + 0.75Y]

18. The following table gives marks out of 50 awarded in a French and a German test to the same group of boys. Assuming there is a linear relation between the sets of marks, calculate the equations of the lines of regression.
French: 10 10 18 25 28 33 34 39 2 4B
German: MW 22 2 19 35 27 33 40 42 47
[7=6.25 + 0.13 X, ¥=-0:34 + 0.96 Y]

19. You are given the following result of the height (X) and weight (Y) of 1,000 managers:
Mean (X) = 68.00"
Mean (Y) = 150 lbs
Standard deviation (X) = 2.50"
Standard deviation (Y) = 20 lbs
Coefficient of correlation between X and Y = 0.6.
Estimate from the above data the height of a manager whose weight is 200 lbs. (MBA, Kurukshetra Univ., 2002)

20. The following table shows the mean and standard deviation of the prices of two shares on a stock exchange:

| Shares | Mean (in Rs.) | Standard deviation (in Rs.) |
|---|---|---|
| A Ltd. | 39.5 | 10.8 |
| B Ltd. | 47.5 | 16.8 |

If the coefficient of correlation between the prices of two shares is 0.42, find the most likely price of share A corresponding to a price of Rs. 55 observed in the case of share B.

Catalogues listing textbooks were examined to discover the relationship between the cost of a book and the number of pages it contains. The perusal gives the following data for ten books:
Pages: 700 540-210 62538090 610.420750 00
Price (Rs.): 12 n 5 10 7 15 9 8 2 9
(a) Obtain the line of regression for estimating the price of a book.
(b) What is your estimate for the price of a book containing 500 pages?
(c) What increase would you expect for a book if it is decided to increase the number of pages of the book by 100?
(d) Calculate the standard error of the estimate.

From the data given below find the two regression equations.

| Age of wife | Age of husband 20-25 | 25-30 | 30-35 | Total |
|---|---|---|---|---|
| 16-20 | 4 | 9 | 0 | 13 |
| 20-24 | 1 | 4 | 1 | 6 |
| 24-28 | 4 | 4 | 3 | 11 |
| Total | 9 | 17 | 4 | 30 |

(M.Phil., Kurukshetra Univ., 2003)

The data given below relate to the scores obtained by 9 salesmen in an intelligence test and their weekly sales, in lakh of rupees:

| Salesman | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| Test Score | 50 | 60 | 50 | 60 | 80 | 50 | 80 | 40 | 70 |
| Sales (Rs. lakh) | 3 | 6 | 4 | 5 | 6 | 3 | 7 | 5 | 6 |

Obtain the regression equation of sales on the intelligence test score. If a salesman has obtained a score of 65, what would be his expected weekly sales? [Y = 0.075X + 0.5, Rs. 5.375 lakh]

The following figures relate to advertisement expenditure and sales:
Adv. Exp. (in lakh of Rs.): 60 62 65 70 B 75 7
Sales (in crore of Rs.): 10 " 13 15 16 19 14
Estimate (i) the sales for advertisement expenditure of Rs. 80 lakh and (ii) the advertisement expenditure for a sales target of Rs. 25 crore.
[20.1; 87.75]

You are given the following data about the sales and advertisement expenditure of a firm:
                      Sales (Rs. crore)    Advertisement Expenditure (Rs. crore)
Arithmetic Mean       50                   10
Standard Deviation    10                   2
Coefficient of Correlation: +0.9
(a) Calculate the two regression equations.
(b) Estimate the likely sales for a proposed advertisement expenditure of Rs. 13.5 crore.
(c) What should be the advertisement budget if the company wants to achieve a sales target of Rs. 70 crore? (MBA, Delhi Univ., 2005)
[(a) Y = 4.5X + 5, X = 0.18Y + 1; (b) Rs. 65.75 crore; (c) Rs. 13.6 crore]

The following bivariate frequency distribution relates to sales turnover (in lakh Rs.) and money spent on advertising (in thousand Rs.). Obtain the two regression equations.
Sales Turnover    Advertising budget (in thousand Rs.)
(in lakh Rs.)     50-60    60-70    70-80    80-90
25-50             2        1        2        3
50-75             3        4        id       6
75-100            1        5        8        6
100-125           2        7        9        2
Estimate (i) the sales turnover corresponding to an advertising budget of Rs. 150 thousand, (ii) the advertising budget to achieve a sales turnover of Rs. 200 lakh, and (iii) compute the coefficient of correlation. (MBA, Delhi Univ., 2008)

The following data give the test scores and sales made by nine salesmen during the last one year:
Test Scores:         14  19  24  21  26  22  15  20  19
Sales ('000 Rs.):    31  36  48  37  50  45  33  41  39
Obtain (i) the regression equation of test scores on sales, (ii) the regression equation of sales on test scores, and (iii) the coefficient of correlation.
[(i) X = -2.312 + 0.5578Y; (ii) Y = 7.834 + 1.6083X; (iii) r = 0.947]

28. A study of share prices of the Textile group and Fertiliser group of companies yielded the following results:
                      Textiles    Fertilisers
Mean                  128         985.0
Standard Deviation    16          70.1
Coefficient of Correlation: +0.52
The financial expert has estimated the likely price of textile shares at the close of the next accounting year as 92. What would be your estimate of the likely price of fertiliser shares at the corresponding time?
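The share-price exercise above supplies only means, standard deviations and the coefficient of correlation, which is enough to write the regression line of one variable on the other. The following is a minimal Python sketch, assuming the usual summary form Y - mean(Y) = r (S.D. of Y / S.D. of X) (X - mean(X)); the function name `predict_from_summary` and its parameter names are illustrative and do not come from the text.

```python
def predict_from_summary(mean_x, sd_x, mean_y, sd_y, r, x):
    """Estimate Y at a given X from the regression line of Y on X."""
    slope = r * sd_y / sd_x  # regression coefficient of Y on X
    return mean_y + slope * (x - mean_x)


# Textiles taken as X and fertilisers as Y, figures as printed in the exercise.
fertiliser_price = predict_from_summary(
    mean_x=128, sd_x=16, mean_y=985.0, sd_y=70.1, r=0.52, x=92
)
print(round(fertiliser_price, 2))  # about 902.98
```

The same helper should serve any exercise of this type by choosing which variable plays the role of X, for example weight as X and height as Y in the managers exercise.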
29. Following are the data on business turnover and staff of a company for eight years from 2003 to 2010:
Year:                             2003   2004   2005   2006   2007   2008   2009   2010
Business Turnover (Rs. crore):    45 30 60 5 800 P50 4,170
Staff:                            2,600  3,000  3,100  3,530  3,850  4,300  5,870  7,150
Fit a proper regression equation to estimate manpower in terms of business turnover. Estimate the staff requirement when the business turnover reaches Rs. 200 crore.
[Y = 33.24X + 1100.3; 7748.3]

30. The data on sales and promotion expenditure on a product for 10 years are given below:
Sales (Rs. lakh):                 8   10  9   12  10  11  12  13  14  15
Promotion Exp. (Rs. thousand):    2   2   3   4   5   5   5   6   7   8
Use the two-variable regression model to estimate the effect of promotion on sales. Forecast the sales for next year when the company hopes to spend Rs. 10 thousand on promotion.
[X = 0.815Y - 4.591; Y = 1.003X + 6.686; Y_{10} = 16.716]

31. The table below shows the power and top speeds of different brands of sports cars:
Brand:             A    B    C    D    E    F
Power X [kW]:      70   63   2    60   66   70
Speed Y [km/h]:    155  150  180  135  156  168
Brand:             G    H    I    J    K    L
Power X [kW]:      "    65   62   6    65   68
Speed Y [km/h]:    178  160  132  145  139  152
(i) Find the best linear relationship that fits the given data.
(ii) Estimate the speed of a car that has a power of 63 kW and find a 95% confidence interval for this estimate.
(iii) Determine how much of the variability in speed may be explained by the regression hypothesis.

32. Calculate the coefficient of correlation from the following data:
X:   1   2   3   4   5   6   7   8   9
Y:   9   8   10  12  11  13  14  16  15
Also obtain the regression equations and find an estimate of Y which should correspond on an average to X = 6.2.
[Y = 0.95X + 7.25; Y_{6.2} = 13.14] (MBA, Madurai-Kamaraj Univ., 2006)

33. Family income and its percentage spent on food gave the following bivariate frequency table:
Food Expenditure    Monthly Family Income (in hundred Rs.)
(in %)              25-35    35-45    45-55    55-65    65-70
15-20               8        9        2        13       8
20-25               6        3        6        ul       4
25-30               7        2        _-       4
30-35               5        8        10       14       ee
(i) Estimate the family income for a food expenditure of 40%.
(ii) What amount should be spent on food for a monthly family income of Rs. 10,000?
(iii) Compute the coefficient of correlation.

34. You are given below the following information about advertisement and sales:
                           Adv. Exp. (X) (Rs. crore)    Sales (Y) (Rs. crore)
Mean                       20                           120
S.D.                       5                            25
Correlation coefficient: +0.8
(i) Calculate the two regression equations.
(ii) Find the likely sales when advertisement expenditure is Rs. 25 crore.
(iii) What should be the advertisement budget if the company wants to attain a sales target of Rs. 150 crore?
[Y = 4X + 40; X = 0.16Y + 0.8; Y_{25} = 140; X_{150} = 24.8]
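For the exercises that supply raw pairs rather than summary figures, both regression lines follow from the sums of squares and products taken about the means. The following is a minimal Python sketch for the exercise above that pairs X = 1, ..., 9 with nine Y values. The Y values and the point X = 6.2 used here are an assumed reading of the printed figures, chosen because they reproduce the printed answer Y = 0.95X + 7.25; `fit_line` is a hypothetical helper name, not something taken from the text.

```python
from math import copysign, sqrt


def fit_line(x, y):
    """Least-squares line of y on x; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products and sum of squares about the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Assumed reading of the data (see the note above).
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [9, 8, 10, 12, 11, 13, 14, 16, 15]

a_yx, b_yx = fit_line(x, y)  # regression of Y on X
a_xy, b_xy = fit_line(y, x)  # regression of X on Y

# r is the geometric mean of the two regression coefficients,
# carrying their common sign.
r = copysign(sqrt(b_yx * b_xy), b_yx)

y_at_6_2 = a_yx + b_yx * 6.2
print(round(b_yx, 2), round(a_yx, 2), round(r, 2), round(y_at_6_2, 2))
```

Swapping the arguments of `fit_line` gives the regression of X on Y, so one helper covers both lines asked for in these exercises.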
