Adobe Scan 23 Aug 2022

1. Correlation refers to the relationship between two variables. It can be positive, negative, or no correlation. 2. A scatter diagram visually depicts the correlation between two variables with data points. It helps analyze the nature, extent, and direction of correlation. 3. Karl Pearson's coefficient of correlation measures the strength and direction of association between two variables on a scale of -1 to +1. It is calculated from sample data using a formula.

Uploaded by

Shubham Palkar

EXERCISE 8.1

1. Explain the concept of correlation. Clearly explain with suitable illustrations its role in dealing with business problems.
2. (a) Define correlation. Explain various types of correlation with suitable examples.
[Delhi Univ. B.Com. (Pass), 2000]
(b) State the nature of the following correlations (positive, negative or no correlation):
(i) Sale of woollen garments and the day temperature;
(ii) The colour of the saree and the intelligence of the lady who wears it; and
(iii) Amount of rainfall and yield of crop.
8-6 FUNDAMENTALS OF STATISTICS
3. Define correlation. Discuss its significance. Does correlation always signify causal relationship between two
variables ? Explain with illustration.
4. (a) Does the high degree of correlation between the two variables signify the existence of cause and effect relationship between the two variables?
(b) Does correlation imply causation between two variables? [Delhi Univ. B.Com. (Hons.), 2008]
5. (a) What is 'spurious correlation' and 'non-sense or chance correlation'? Explain with the help of an example. [Delhi Univ. B.Com. (Pass), 1997]
(b) Comment on the following statement: "A high degree of positive correlation between the 'size of the shoe' and the 'intelligence' of a group of individuals implies that people with bigger shoe size are more intelligent than the people with lower shoe size".

6. How far do you agree with the conclusion drawn in the following case? Why?
Two series, quantity of money in circulation and general price index, are found to possess positive correlation of a fairly high order. From this, it is concluded that one is the cause and the other the effect in a direct causal relationship.
7. (a) Distinguish clearly between:
(i) Positive and Negative correlation; (ii) Linear and non-linear correlation.
(b) "If the two or more quantities vary in sympathy so that movements in one tend to be accompanied by corresponding movements in other(s), then they are said to be correlated." Discuss.
8. What is correlation? What is a scatter diagram? How does it help in studying correlation between two variables, in respect of both its nature and extent?
9. (a) What is correlation? Explain the implications of positive and negative correlation. Show by means of scatter diagram, the presence of perfect positive and perfect negative correlation.
[C.A. (Foundation), May 1996]
(b) Illustrate a perfect negative correlation on a scatter diagram. [Delhi Univ. B.Com. (Pass), 1998]
10. (a) What is a scatter diagram? How is it useful in the study of correlation between two variables? Explain with suitable examples.
[Delhi Univ. B.Com. (Hons.), (External), 2006]
(b) Write a note on scatter diagram. Draw sketches of scatter diagram to show the following correlation between two variables x and y:
(i) linear; (ii) linear and perfect; (iii) non-linear; (iv) x and y uncorrelated.
(c) While drawing a scatter diagram, if all points appear to form a straight line going downward from left to right, then it is inferred that there is
(i) Perfect positive correlation; (ii) Simple positive correlation;
(iii) Perfect negative correlation; (iv) No correlation.
Ans. (iii)
11. Given the following pairs of values:
Capital employed (Crores of Rs.) 5 6 8 9
Profits (Lakhs of Rs.) 5 12 11
(a) Make a scatter diagram.
(b) Do you think that there is any correlation between profits and capital employed? Is it positive or negative? Is it high or low?
(c) By graphic inspection, draw an estimating line.


12. "Even a high degree of correlation does not mean that a relationship of cause and effect exists between the two correlated variables". Discuss.
13. Draw a scatter diagram from the following data:
Height (inches) 62 72 70 67 70 64 65 60 70
Weight (lbs) 50 65 63 52 56 60 59 58 54 65
Also indicate whether correlation is positive or negative.
Ans. Positive Correlation.
14. Draw a scatter diagram for the data given below and interpret it.
10 20 30 40 50 60 70 80
32 20 24 36 40 28 38 44
EXERCISE 8.2
1. Explain the meaning and significance of the concept of correlation. How will you calculate it from statistical point of view?
2. (a) Define Karl Pearson's coefficient of correlation. What is it intended to measure?
(b) What are the special characteristics of Karl Pearson's coefficient of correlation? What are the underlying assumptions on which this formula is based?
(c) How do you interpret a calculated value of Karl Pearson's coefficient of correlation? Discuss in particular the values of r = 0, r = -1 and r = +1.

3. (a) Explain what is meant by coefficient of correlation between two variables. What are the different methods of finding correlation? Distinguish between positive and negative correlation. [Calicut Univ. B.Com., 1997]
(b) Write down an expression for the Karl Pearson's coefficient of linear correlation. Why is it termed as the coefficient of linear correlation? Explain. [Delhi Univ. B.A. (Econ. Hons.), 1997]
4. (a) The value of correlation coefficient ranges from -1 to +1. Why? Give the explanations by giving examples.
(b) Prove that $|r| \le 1$.
(c) Prove that Karl Pearson's correlation coefficient cannot exceed the limits $\pm 1$, i.e., $-1 \le r \le 1$.
(d) Define product moment correlation coefficient between two variables x and y and prove that it lies between $\pm 1$. Draw the scatter diagram for the extreme cases.
5. (a) If x and y are independent variables then prove that they are uncorrelated. Is the converse true? Explain your answer with the help of an example.
(b) Prove that two independent variables are uncorrelated. By giving an example, show that the converse is not true. Explain the reason. [Guru Nanak Dev Univ. MBA, 1994]
CORRELATION ANALYSIS 8-27
(c) Comment on the following statement:
"If the coefficient of correlation between two variables is zero, it does not mean that the variables are unrelated."
[Delhi Univ. B.Com. (Hons.), 2002]
6. (a) Show that the coefficient of correlation r(X, Y) is independent of the change of scale and origin of the variables. [Delhi Univ. B.A. (Econ. Hons.), 2008]
(b) Prove that $r_{XY} = r_{UV}$ where $U = (X - A)/h$ and $V = (Y - B)/k$. [Delhi Univ. B.A. (Econ. Hons.), 2007]
7. State, giving reasons, whether the following statements are true or false.
(a) Coefficient of correlation between two variables must be in the same units as the original data.
(b) The correlation coefficient between rainfall and wheat yield per hectare was found to be 0.8. Hence more rainfall means more agricultural production.
8. Discuss the statistical validity of the following statements:
(a) "High positive coefficient of correlation between increase in the sale of a newspaper and increase in the number of crimes leads to the conclusion that newspaper reading may be responsible for the increase in the number of crimes."
(b) "A high positive value of r between the increase in cigarette smoking and increase in lung cancer establishes that cigarette smoking is responsible for lung cancer."
(c) If the coefficient of correlation between the annual value of exports during the last ten years and the annual number of children born during the same period is +0.9, what inference, if any, would you draw?
[Delhi Univ. B.A. (Econ. Hons.), 1996]
9. Comment on the following:
(a) "Positive correlation r = 0.9, is found between the number of children born and exports over last decade."
[Delhi Univ. B.Com. (Hons.), 2001]
(b) The correlation coefficient between the railway accidents in a particular year and the babies born in that year was found to be 0.8.
10. (a) Define a scatter diagram. Draw the scatter diagram when (i) r = +1, (ii) r = -1 and (iii) r = 0, where r is the correlation coefficient. [I.C.W.A. (Intermediate), Dec. 2001]
(b) What is a scatter diagram? Give the procedure of drawing a scatter diagram. Draw scatter diagrams when the coefficient of correlation r = +1 and r = -1. [C.A. (Foundation), May 2000]
11. The production manager of a company maintains that the flow time in days (y) depends on the number of operations (x) to be performed. The following data give the necessary information:
x: 2 2 3 4 5 6 6 7
y: 8 13 14 11 20 10 22 26 22 25
Plot a scatter diagram. Calculate the value of the Karl Pearson's Product Moment Correlation Coefficient.
[I.C.W.A. (Intermediate), Dec. 1995]
Ans. r(x, y) = 0.78.
12. Discuss importance of scatter diagram in finding the degree of relationship between two variables. Sketch scatter diagrams for r = 0, 0.8, 1, 1, and interpret these values.
-
-

13. Calculate Karl Pearson's coefficient of correlation from the following data:
X: 6 8 12 15 18 20 24 28 31
Y: 10 12 15 15 18 25 22 26 28
Ans. 0.9587.
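The calculation asked for in Exercise 13 can be sketched in Python as follows. This is a minimal illustration that takes deviations from the actual means; the function name `pearson_r` is a hypothetical helper, not something defined in the text.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Karl Pearson's r computed from deviations about the actual means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    sum_dx2 = sum(a * a for a in dx)
    sum_dy2 = sum(b * b for b in dy)
    return sum_dxdy / sqrt(sum_dx2 * sum_dy2)

# Data of Exercise 13.
X = [6, 8, 12, 15, 18, 20, 24, 28, 31]
Y = [10, 12, 15, 15, 18, 25, 22, 26, 28]
r = pearson_r(X, Y)
print(round(r, 4))
```

Here the means are 18 and 19, so the deviations are whole numbers and the sums (431, 598 and 338) are easy to check by hand.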
14. Calculate Karl Pearson's coefficient of correlation between x and y from the following data:
x-series: 80 60 51 69 58 62 64 72 56 58
y-series: 45 71 60 57 62 58 48 50 62 69
Ans. -0.7199.
15. Making use of the data given below, calculate the coefficient of correlation $r_{12}$.
Case A B C D E F G
10 6 9 10 12 13 9
9 6 9 11 13 4
Ans. $r_{12}$ = 0.8958.
16. Calculate product moment coefficient of correlation for the following data of sales (x) and expenses (y) in lakhs of rupees of 10 firms.
46 37 50 40
33 41 38 36 45 34
12 13 21 17 19 19
24 16 15 14
Ans. -0.0213.

17. Calculate the Karl Pearson's coefficient of correlation from the following data, using 20 as the working mean for price and 70 as the working mean for demand.
Price 14 16 17 18 19 20 21 22 23
Demand 84 78 70 75 66 67 62 58 60
[Delhi Univ. B.Com. (Pass), 1999]
Ans. r = -0.954
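The working-mean (assumed-mean) method named in Exercise 17 can be sketched as below. The function name is hypothetical, and the data are an assumed reading of the scanned table: the price 19 and the demands 70 and 75 are inferred from damaged digits, a reading that reproduces the printed answer.

```python
from math import sqrt

def pearson_r_working_mean(xs, ys, a, b):
    """Karl Pearson's r from deviations u = x - a, v = y - b about working means."""
    n = len(xs)
    u = [x - a for x in xs]
    v = [y - b for y in ys]
    su, sv = sum(u), sum(v)
    suv = sum(p * q for p, q in zip(u, v))
    su2 = sum(p * p for p in u)
    sv2 = sum(q * q for q in v)
    # The correction terms su*sv, su**2, sv**2 account for a, b not being the true means.
    return (n * suv - su * sv) / sqrt((n * su2 - su ** 2) * (n * sv2 - sv ** 2))

# Assumed reading of the scanned data (19, 70 and 75 are inferred digits).
price = [14, 16, 17, 18, 19, 20, 21, 22, 23]
demand = [84, 78, 70, 75, 66, 67, 62, 58, 60]
r = pearson_r_working_mean(price, demand, a=20, b=70)
print(round(r, 3))
```

Because r does not depend on the origin, any other pair of working means would give the same value; 20 and 70 simply keep the deviations small.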
18. Calculate the Karl Pearson's coefficient of correlation from the following data:

                       Percentage of Marks
No.  Subject        First Term   Second Term
1.   Hindi          75           62
2.   English        81           68
3.   Economics      70           65
4.   Accounts       76           60
5.   Commerce       77           69
6.   Mathematics    81           72
7.   Statistics     84           76
8.   Costing        75           72
[Delhi Univ. B.Com. (Pass), 2000]
Ans. r = 0.623
19. Compute Karl Pearson's coefficient of correlation in the following series relating to cost of living and wages.
Wages (Rs.) 100 101 103 102 100 99 97 98 96 95
Cost of living 98 99 99 97 95 2 95 94 90 91
Ans. r = 0.8472.
20. Calculate the coefficient of correlation by Karl Pearson's method from the following data relating to overhead expenses and cost of production.
Overheads (in '000 Rs.) 80 90 100 110 120 130 140 150 160
Cost (in '000 Rs.) 15 15 16 19 17 16 18 19
Ans. r = 0.6928.
21. The following table shows the trend of cinema admissions and the growth of TV sets in a locality during
1974-1980. Calculate the product moment correlation coefficient between the two variates.
Year 1974 1975 1976 1977 1978 1979 1980
Admissions (in 000) 13 12 9 9 8 6 6
No. of TV Sets 54 53 57 61 67 72 70
Ans. r = - 0.9374.

22. Calculate the Karl Pearson's coefficient of correlation for the following ages of husbands and wives at the time of their marriage:
28
Age of husband (in years) 23
18
27
20 22
28
21
30 30 33 35 38
Age of wife (in years) 29 27 29 28 29
Ans. r = 0.8013.
23. Calculate the Pearson's coefficient of correlation from the following data using 44 and 26 respectively as the origin of X and Y:
40 44 42 45
X: 43 44 46 42 38 40 42 57
31 19 18 19 27 27 29 41
Y: 29 30 26 10
[Osmania Univ. B.Com., 1998]
Ans. r = -0.7326.
24. Calculate Pearson's coefficient of correlation from the following data. Take 6 and 70 as the assumed average of the variates X and Y respectively.
58 65 8 70
45 55 75 80 85
48 60 62 64 5 70
50 74 82 90
Y 56
Ans. r = 0.9188.
25. From the following data examine whether there exists any correlation between X and Y.
X: 6.9 8.5 5.8 8.6 9.6 8.0 9.7
Y: 2.9 3.8 6.5 2.3 5.5 3.5 3.2
Ans. -0.34.
26. The following table gives the distribution of the total population and those who are totally or partially blind among them. Find out if there is any relation between age and blindness.
Age (Years)            0-10  10-20  20-30  30-40  40-50  50-60  60-70  70-80
No. of Persons ('000)  100   60     40     36     24     11     6      3
Blind                  55    40     40     40     36     22     18     15
Hint. Here we shall find the correlation coefficient between age (X) and the number of blinds per lakh (Y) as given in the following table.
X: 5 15 25 35 45 55 65 75
Y: 55 67 100 111 150 200 300 500
Ans. r = 0.8982.
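The hint for Exercise 26 can be followed step by step in a short Python sketch: convert each age group to its mid-point, express blindness as a rate per lakh, then correlate the two. The variable and function names are illustrative. The figure 11 (thousand persons in the 50-60 group) is an assumption, inferred from the hint's rate of 200 per lakh for that group.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Karl Pearson's r from deviations about the actual means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Mid-points of the age groups 0-10, 10-20, ..., 70-80.
age_mid = [5, 15, 25, 35, 45, 55, 65, 75]
# Persons in thousands; the value 11 is inferred (assumption), see lead-in.
persons_thousand = [100, 60, 40, 36, 24, 11, 6, 3]
blind = [55, 40, 40, 40, 36, 22, 18, 15]

# Blind per lakh = blind / (persons * 1000) * 100000 = blind * 100 / persons,
# rounded to whole numbers as in the hint.
rate_per_lakh = [round(b * 100 / p) for b, p in zip(blind, persons_thousand)]
r = pearson_r(age_mid, rate_per_lakh)
print(rate_per_lakh, round(r, 4))
```

Correlating the raw counts of blind persons with age would mislead, because the age groups differ greatly in size; the rate removes that effect.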
27. With the following data in 6 cities, calculate the coefficient of correlation by Pearson's method between the density of population and the death rate.
Cities  Area in square miles  Population (in '000)  No. of deaths
A       150                   30                    300
B       180                   90                    1440
C       100                   40                    560
D       60                    42                    840
E       120                   72                    1224
F       80                    24                    312
[C.A. (Intermediate), May 1981]
Hint. Find r between Density = Population / Area and Death Rate = (No. of deaths / Population) × 1000.
Ans. r = 0.9876.
28. Calculate the values of $Y = (X - 6)^5$ corresponding to X = 1, 2, 3, 4 and 5 and obtain the correlation coefficient between X and Y. Explain why the obtained coefficient differs from unity.
Ans.
X: 1 2 3 4 5
Y: -3125 -1024 -243 -32 -1
r = 0.8679.
This value is different from unity because the relation between X and Y is $Y = (X - 6)^5$, which is not linear.
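The point of Exercise 28, that an exact but non-linear relation gives r below 1, can be checked with a small Python sketch. The helper `pearson_r` and the comparison series `Z = 2X + 10` are illustrative additions, not part of the exercise.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Karl Pearson's r from deviations about the actual means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

X = [1, 2, 3, 4, 5]
Y = [(x - 6) ** 5 for x in X]   # exact functional relation, but not linear
Z = [2 * x + 10 for x in X]     # hypothetical linear series for comparison

r_xy = pearson_r(X, Y)
r_xz = pearson_r(X, Z)
print(Y, round(r_xy, 4), round(r_xz, 4))
```

Y is completely determined by X in both cases; only the linear series attains r = 1, which is why r is described as a measure of linear relationship.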
29. Calculate the correlation coefficient from the following data:
X: 12 9 8 10 13
Y: 14 8 6 9 12
Let now each value of X be multiplied by 2 and then 6 be added to it. Similarly multiply each value of Y by 3 and subtract 2 from it. What will be the correlation coefficient between the new series of X and Y?
[C.A. (Foundation), May 1997]
Ans. Let U = 2X + 6, V = 3Y - 2. Since correlation coefficient is independent of change of origin and scale, r(U, V) = r(X, Y) = 0.9485.

30.
X: 1 2 3 4 5
Y: 1 4 9 16 25
Z: 12 14 16 18 20
(i) Compute the correlation coefficient between X and Z and comment on the value of $r_{xz}$.
(ii) Let X be time, Y be population and Z be agricultural output. Compute the correlation coefficient between X and Y. What explains the difference in the two correlation coefficients $r_{xy}$ and $r_{xz}$?
Ans. (i) $r_{xz} = 1$; (ii) $r_{xy} = 0.98$.
r(x, z) = 1, because X and Z have the linear relationship Z = 2X + 10.
r(x, y) ≠ 1, because $Y = X^2$ is not a linear but a polynomial relation.
31. (a) Given: $\sum X = 125$, $\sum Y = 100$, $\sum X^2 = 650$, $\sum Y^2 = 436$, $\sum XY = 520$ and $n = 25$.
Obtain the value of Karl Pearson's correlation coefficient r(X, Y).
(b) If $n = 9$, $\sum X = 45$, $\sum X^2 = 285$, $\sum Y = 108$, $\sum Y^2 = 1356$, $\sum XY = 597$, find r(X, Y).
Ans. (a) 0.67. (b) 0.95.
32. Given the following information relating to a frequency distribution comprising of 10 observations:
$\bar{X} = 5.5$, $\bar{Y} = 4.0$, $\sum X^2 = 385$, $\sum Y^2 = 192$, $\sum (X + Y)^2 = 947$.
Find r(X, Y). [Punjab Univ. B.Com., 1994]
Hint. Use $\sum (X + Y)^2 = \sum X^2 + \sum Y^2 + 2\sum XY$ and find $\sum XY = 185$.
Ans. r(X, Y) = -0.681.
33. A computer while calculating the correlation coefficient between the variables X and Y obtained the following results:
$N = 30$, $\sum X = 120$, $\sum X^2 = 600$, $\sum Y = 90$, $\sum Y^2 = 250$, $\sum XY = 356$.
It was, however, later discovered at the time of checking that it had copied down two pairs of observations as
X   Y
8   10
12  7
while the correct values were
X   Y
8   12
10  8
Obtain the correct value of the correlation coefficient between X and Y. [I.C.W.A. Dec., 2003]
Ans. r = 0.0504
34. Coefficient of correlation between X and Y for 20 items is 0.3; mean of X is 15 and that of Y 20, standard deviations are 4 and 5 respectively. At the time of calculations one pair (x = 27, y = 30) was wrongly taken as (x = 17, y = 35). Find the correct coefficient of correlation. [Delhi Univ. B.Com. (Hons.), (External), 2007]
Ans. Correct value of correlation coefficient = 0.5153.

35. In order to find the correlation coefficient between two variables X and Y from 12 pairs of observations, the following calculations were made:
$\sum X = 30$, $\sum Y = 5$, $\sum X^2 = 670$, $\sum Y^2 = 285$, $\sum XY = 334$.
On subsequent verification it was found that the pair (X = 11, Y = 4) was copied wrongly, the correct value being (X = 10, Y = 14). Find the correct value of correlation coefficient.
Ans. 0.78.
36. What do you understand by the probable error of correlation coefficient? Explain how it can be used to
(i) Interpret the significance of an observed value of sample correlation coefficient.
(ii) Determine the limits for the population correlation coefficient.
37. Calculate the coefficient of correlation and find its probable error from the following data:
X: 7 6 5 4 3 2 1
Y: 18 16 14 12 10 6 8
Ans. $r_{xy}$ = 0.9643; P.E.(r) = 0.0179.
38. Find Karl Pearson's correlation coefficient between age and playing habit of the following students:
Age (years)      15   16   17   18   19   20
No. of students  250  200  150  120  100  80
Regular players  200  150  90   48   30   12
Also calculate the probable error and point out if coefficient of correlation is significant.
[Himachal Pradesh Univ. M.B.A., 1998; Delhi Univ. B.Com. (Hons.), 1996]
Hint. Find r between age (X) and percentage of regular players (Y).
Ans. r = -0.9912; P.E.(r) = 0.0048; r is highly significant.
39. Students of B.A. obtained the following percentage of marks in English in the Internal Assessment Test (X) and University Examination (Y). Calculate Karl Pearson's coefficient of correlation from actual means and its probable error.
50 60 75 84 47 52
50 65 59 4
45 52 40 65 33
Y: 50 0 46
32 51
Ans. r = 0.5457; P.E.(r) = 0.1498.
40. The following table gives the distribution of production and also the relatively defective items among them, according to size-groups. Find the correlation coefficient between size and defect in quality and its probable error.
Size-group              15-16  16-17  17-18  18-19  19-20  20-21
No. of items            200    270    340    360    400    300
No. of defective items  150    162    170    180    180    114
Hint. Find correlation coefficient between x: mid-value of size-group and y: percentage of defectives.
Ans. r(x, y) = -0.95; P.E.(r) = 0.0269.
41. Calculate coefficient of correlation between X and Y series from the following data and calculate its probable error also.
X: 78 89 96 69 59 79 68 61
Y: 125 137 156 112 107 136 123 108
(Take 69 as working mean for X and 112 for Y.)
Ans. r = 0.9544; P.E.(r) = 0.0212.
42. Calculate Karl Pearson's coefficient of correlation for the
following series.
Price (in Rs.) 110-111 111-112 112-113 113-114 114-115 115-116
Demand (in kg.) 600 640 640 680 700 780
Price (in Rs.) 116-117 117-118 118-119
Demand (in kg.) 830 900 1,000
Also calculate the probable error of the correlation coefficient. From your result can you assert that the demand is
correlated with price ?
Ans. r = 0.9651; P.E.(r) = 0.0154.
43. (a) A student calculates the value of r as 0.7 when the number of items (n) in the sample is 25. Find the limits within which r lies for another sample from the same universe.
Ans. Required limits for r are 0.767 and 0.633.
(b) If r = 0.6 for a pair of 64 observations, find the probable error of r and determine the limits for the measure of the population.
Ans. P.E.(r) = 0.054. Limits for population correlation coefficient are 0.546 and 0.654.
(c) A student calculates the value of r as 0.7 when the value of N is 5 and concludes that r is highly significant. Is he correct? [Delhi Univ. B.Com. (Hons.), 1997]
Ans. $\dfrac{r}{\text{P.E.}(r)} = \dfrac{0.7\sqrt{5}}{0.6745 \times 0.51} = 4.55 < 6$. Not significant.
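The answers to 43(b) and 43(c) rest on the probable-error formula $\text{P.E.}(r) = 0.6745\,(1 - r^2)/\sqrt{n}$ and on the rule, used in the answer to (c), that r is treated as significant only when it exceeds six times its probable error. A minimal Python sketch, with hypothetical names:

```python
from math import sqrt

def probable_error(r, n):
    """Probable error of r: 0.6745 * (1 - r^2) / sqrt(n)."""
    return 0.6745 * (1 - r * r) / sqrt(n)

# Part (b): r = 0.6 from 64 pairs of observations.
pe_b = probable_error(0.6, 64)
limits_b = (0.6 - pe_b, 0.6 + pe_b)     # r - P.E. and r + P.E.

# Part (c): r = 0.7 from only 5 pairs.
ratio_c = 0.7 / probable_error(0.7, 5)  # compare with the threshold 6
significant_c = ratio_c > 6

print(round(pe_b, 3), [round(v, 3) for v in limits_b])
print(round(ratio_c, 2), significant_c)
```

The same r = 0.7 would pass the test with a larger sample, since the probable error shrinks in proportion to $1/\sqrt{n}$.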
44. The correlation coefficient between Physics and Mathematics final marks for a group of 21 students was computed to be 0.80. Find 95% confidence limits for the coefficient.
Ans. Required limits = r ± 1.96 P.E.(r) = 0.8 ± 1.96 × 0.05299 = 0.6961 and 0.9039.
45. Calculate Karl Pearson's coefficient of correlation from the following data:
(i) Sum of deviations of x = 5; (ii) Sum of deviations of y = 4;
(iii) Sum of squares of deviations of x = 40; (iv) Sum of squares of deviations of y = 50;
(v) Sum of products of deviations of x and y = 32; (vi) No. of pairs of observations = 10.
Ans. 0.7042.
46. The deviations from the respective means of X and Y series are given below:
- 4
2 -1
y: -3 -4 4 2
Calculate Karl Pearson's coefficient of correlation from the above data. [Delhi Univ. B.Com. (Pass), 1995]
Hint. $\text{Cov}(X, Y) = \frac{1}{n}\sum xy = 0$.
Ans. r(X, Y) = 0.
47. Calculate the coefficient of correlation between X and Y series from the following data:
X series Y series
No. of observations 15
Arithmetic mean 25 18
Standard deviation
$\sum[(X - 25)(Y - 18)] = 125$. [Delhi Univ. B.Com. (Pass), 1995]
LINEAR REGRESSION ANALYSIS
9-11
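Aliter. We have to prove:
$$\tfrac{1}{2}\,(b_{yx} + b_{xy}) \ge r, \quad\text{i.e.,}\quad \tfrac{1}{2}\left(r\,\frac{\sigma_y}{\sigma_x} + r\,\frac{\sigma_x}{\sigma_y}\right) \ge r,$$
i.e.,
$$\sigma_x^2 + \sigma_y^2 \ge 2\sigma_x\sigma_y \;\Rightarrow\; \sigma_x^2 + \sigma_y^2 - 2\sigma_x\sigma_y \ge 0 \;\Rightarrow\; (\sigma_x - \sigma_y)^2 \ge 0,$$
which is always true, since the square of a real quantity is never negative.

Theorem 9.4. Regression coefficients are independent of change of origin but not of scale.
Symbolically, if we transform from x and y to new variables u and v by change of origin and scale, viz.,
$$u = \frac{x - a}{h}, \qquad v = \frac{y - b}{k}, \qquad (9.43)$$
where a, b, h (> 0) and k (> 0) are constants, then
$$b_{yx} = \frac{k}{h}\, b_{vu} \quad\text{and}\quad b_{xy} = \frac{h}{k}\, b_{uv}. \qquad (9.44)$$
Proof. Since the correlation coefficient is independent of change of origin and scale, we have
$$r_{xy} = r_{uv}. \qquad (9.45)$$
Also transformation (9.43) gives:
$$\sigma_x = h\,\sigma_u \quad\text{and}\quad \sigma_y = k\,\sigma_v, \qquad (9.45a)$$
since standard deviation is independent of change of origin but not of scale. Hence
$$b_{yx} = r_{xy}\,\frac{\sigma_y}{\sigma_x} = r_{uv}\,\frac{k\,\sigma_v}{h\,\sigma_u} = \frac{k}{h}\, b_{vu}, \qquad (9.46)$$
$$b_{xy} = r_{xy}\,\frac{\sigma_x}{\sigma_y} = r_{uv}\,\frac{h\,\sigma_u}{k\,\sigma_v} = \frac{h}{k}\, b_{uv}. \qquad (9.46a)$$
From (9.46) and (9.46a), it is obvious that the regression coefficients are independent of change of origin but not of scale.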
In particular if we take h = k = 1, i.e., we transform the variables x and y to u and v by the relation
$$u = x - a \quad\text{and}\quad v = y - b, \qquad (9.47)$$
i.e., by change of origin only, then from (9.46) and (9.46a), we get
$$b_{yx} = b_{vu} = \frac{n\sum uv - (\sum u)(\sum v)}{n\sum u^2 - (\sum u)^2} \qquad (9.47a)$$
and
$$b_{xy} = b_{uv} = \frac{n\sum uv - (\sum u)(\sum v)}{n\sum v^2 - (\sum v)^2}. \qquad (9.47b)$$
These formulae are very useful for obtaining the equations of the lines of regression if the mean values $\bar{x}$ and/or $\bar{y}$ come out to be in fractions or if the values of x and y are large.
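Theorem 9.4 and the short-cut formula for $b_{yx}$ can be checked numerically. The sketch below uses the judges' marks of Example 9.3 with the origins a = 35 and b = 30 used there; the scale factors h = 5 and k = 2 and the function name are arbitrary choices made only for illustration.

```python
def slope_y_on_x(xs, ys):
    """b_yx = [n*Sum(xy) - Sum(x)*Sum(y)] / [n*Sum(x^2) - (Sum(x))^2]."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2 = sum(x * x for x in xs)
    return (n * sxy - sx * sy) / (n * sx2 - sx ** 2)

x = [40, 34, 28, 30, 44, 38, 31]   # marks by Judge A (Example 9.3)
y = [32, 39, 26, 30, 38, 34, 28]   # marks by Judge B
a, b = 35, 30                      # origins used in Example 9.3
h, k = 5, 2                        # arbitrary scale factors (illustrative)

# Change of origin only: the regression coefficient is unchanged.
u = [xi - a for xi in x]
v = [yi - b for yi in y]
b_direct = slope_y_on_x(x, y)
b_shifted = slope_y_on_x(u, v)

# Change of origin and scale: b_yx = (k / h) * b_vu.
u_scaled = [(xi - a) / h for xi in x]
v_scaled = [(yi - b) / k for yi in y]
b_scaled = slope_y_on_x(u_scaled, v_scaled)

print(round(b_direct, 4), round(b_shifted, 4), round(k / h * b_scaled, 4))
```

All three printed values agree, while `b_scaled` itself differs from `b_direct` by the factor h/k.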

Example 9.1. From the following data, obtain the two regression equations:
Sales      91  97  108  121  67  124  51  73  111  57
Purchases  71  75  69   97   70  91   39  61  80   47
Solution. Let us denote the sales by the variable x and the purchases by the variable y.

CALCULATIONS FOR REGRESSION EQUATIONS
x     y     dx = x - 90   dy = y - 70   dx²    dy²    dx dy
91    71    1             1             1      1      1
97    75    7             5             49     25     35
108   69    18            -1            324    1      -18
121   97    31            27            961    729    837
67    70    -23           0             529    0      0
124   91    34            21            1156   441    714
51    39    -39           -31           1521   961    1209
73    61    -17           -9            289    81     153
111   80    21            10            441    100    210
57    47    -33           -23           1089   529    759
Σx = 900   Σy = 700   Σdx = 0   Σdy = 0   Σdx² = 6360   Σdy² = 2868   Σdx dy = 3900

We have
$$\bar{x} = \frac{\sum x}{n} = \frac{900}{10} = 90 \quad\text{and}\quad \bar{y} = \frac{\sum y}{n} = \frac{700}{10} = 70.$$
$$b_{yx} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{\sum dx\,dy}{\sum dx^2} = \frac{3900}{6360} = 0.6132$$
$$b_{xy} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (y - \bar{y})^2} = \frac{\sum dx\,dy}{\sum dy^2} = \frac{3900}{2868} = 1.360$$

Regression Equations
Equation of line of regression of y on x is
$y - \bar{y} = b_{yx}(x - \bar{x})$
$y - 70 = 0.6132\,(x - 90) = 0.6132x - 55.188$
$y = 0.6132x - 55.188 + 70.000$
$y = 0.6132x + 14.812$
Equation of line of regression of x on y is
$x - \bar{x} = b_{xy}(y - \bar{y})$
$x - 90 = 1.360\,(y - 70) = 1.360y - 95.20$
$x = 1.360y - 95.20 + 90.00$
$x = 1.360y - 5.20$
Remark. We have
$$r^2 = b_{yx} \cdot b_{xy} = 0.6132 \times 1.360 = 0.8340 \;\Rightarrow\; r = \pm\sqrt{0.8340} = \pm 0.9132$$
But since both the regression coefficients are positive, r must be positive. Hence, r = 0.9132.
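The arithmetic of Example 9.1 can be reproduced with a few lines of Python. This is only a sketch: the ten data pairs are read off the calculation table, on the assumption that the row with dx = 21 and dy = 10 corresponds to sales 111 and purchases 80, and the variable names are illustrative.

```python
from math import copysign, sqrt

sales = [91, 97, 108, 121, 67, 124, 51, 73, 111, 57]
purchases = [71, 75, 69, 97, 70, 91, 39, 61, 80, 47]

n = len(sales)
mean_x = sum(sales) / n
mean_y = sum(purchases) / n
dx = [x - mean_x for x in sales]
dy = [y - mean_y for y in purchases]

s_dxdy = sum(a * b for a, b in zip(dx, dy))
s_dx2 = sum(a * a for a in dx)
s_dy2 = sum(b * b for b in dy)

b_yx = s_dxdy / s_dx2   # regression coefficient of y on x
b_xy = s_dxdy / s_dy2   # regression coefficient of x on y

# r is the geometric mean of the two coefficients, taking their common sign.
r = copysign(sqrt(b_yx * b_xy), b_yx)

print(round(b_yx, 4), round(b_xy, 4), round(r, 4))
# Intercepts of the two lines: y = b_yx*x + c1 and x = b_xy*y + c2.
print(round(mean_y - b_yx * mean_x, 3), round(mean_x - b_xy * mean_y, 3))
```

The quotient 3900/2868 evaluates to about 1.360, so small differences in the last digit of the intercepts come only from where the coefficients are rounded.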

Example 9.2. From the data given below find:
(a) The two regression coefficients. (b) The two regression equations.
(c) The coefficient of correlation between the marks in Economics and Statistics.
(d) The most likely marks in Statistics when marks in Economics are 30.
Marks in Economics (x)   25  28  35  32  31  36  29  38  34  32
Marks in Statistics (y)  43  46  49  41  36  32  31  30  33  39
[Himachal Pradesh Univ. M.A. (Econ.), 2003]
Solution. Let us denote the marks in Economics by the variable x and the marks in Statistics by the variable y.

CALCULATIONS FOR REGRESSION EQUATIONS
x    y    dx = x - 32   dy = y - 38   dx²   dy²   dx dy
25   43   -7            5             49    25    -35
28   46   -4            8             16    64    -32
35   49   3             11            9     121   33
32   41   0             3             0     9     0
31   36   -1            -2            1     4     2
36   32   4             -6            16    36    -24
29   31   -3            -7            9     49    21
38   30   6             -8            36    64    -48
34   33   2             -5            4     25    -10
32   39   0             1             0     1     0
Σx = 320   Σy = 380   Σdx = 0   Σdy = 0   Σdx² = 140   Σdy² = 398   Σdx dy = -93

Here, $\bar{x} = \dfrac{\sum x}{n} = \dfrac{320}{10} = 32$ and $\bar{y} = \dfrac{\sum y}{n} = \dfrac{380}{10} = 38$.

(a) Regression Coefficients
Coefficient of regression of y on x:
$$b_{yx} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{\sum dx\,dy}{\sum dx^2} = \frac{-93}{140} = -0.6643$$
Coefficient of regression of x on y:
$$b_{xy} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (y - \bar{y})^2} = \frac{\sum dx\,dy}{\sum dy^2} = \frac{-93}{398} = -0.2337$$

(b) Regression Equations
Equation of the line of regression of x on y is:
$x - \bar{x} = b_{xy}(y - \bar{y})$
$x - 32 = -0.2337\,(y - 38) = -0.2337y + 0.2337 \times 38 = -0.2337y + 8.8806$
$x = -0.2337y + 32 + 8.8806$
$x = -0.2337y + 40.8806$
Equation of the line of regression of y on x is:
$y - \bar{y} = b_{yx}(x - \bar{x})$
$y - 38 = -0.6643\,(x - 32) = -0.6643x + 0.6643 \times 32 = -0.6643x + 21.2576$
$y = -0.6643x + 38 + 21.2576$
$y = -0.6643x + 59.2576$   ...(*)

(c) Correlation Coefficient. We have
$$r^2 = b_{yx} \cdot b_{xy} = (-0.6643) \times (-0.2337) = 0.1552 \;\Rightarrow\; r = \pm\sqrt{0.1552} = \pm 0.394$$
Since both the regression coefficients are negative, r must be negative. Hence, we get r = -0.394.

(d) In order to estimate the most likely marks in Statistics (y) when marks in Economics (x) are 30, we shall use the line of regression of y on x, viz., the equation (*). Taking x = 30 in (*), the required estimate is given by
$y = -0.6643 \times 30 + 59.2576 = -19.929 + 59.2576 = 39.3286$
Hence, the most likely marks in Statistics when marks in Economics are 30, are 39.3286 ≈ 39.
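Part (d) of Example 9.2 uses the line of regression of y on x for prediction. A minimal Python sketch is given below; the helper `regression_line` is hypothetical, and the statistics mark 31 (for economics mark 29) is read from the calculation table, which is an assumption about the scanned data.

```python
def regression_line(xs, ys):
    """Return (slope, intercept) of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    # The line passes through the point of means (mean_x, mean_y).
    return slope, mean_y - slope * mean_x

econ = [25, 28, 35, 32, 31, 36, 29, 38, 34, 32]
stats = [43, 46, 49, 41, 36, 32, 31, 30, 33, 39]

slope, intercept = regression_line(econ, stats)
predicted = slope * 30 + intercept   # most likely Statistics marks at x = 30

print(round(slope, 4), round(intercept, 4), round(predicted, 2))
```

To estimate marks in Economics from marks in Statistics, the roles of the two lists would be swapped, which gives the other regression line rather than an algebraic inversion of this one.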
Example 9.3. A panel of judges A and B graded seven debators and independently awarded the following marks:
Debator     1   2   3   4   5   6   7
Marks by A  40  34  28  30  44  38  31
Marks by B  32  39  26  30  38  34  28
An eighth debator was awarded 36 marks by Judge A while Judge B was not present. If Judge B was also present, how many marks would you expect him to award to the eighth debator assuming the same degree of relationship exists in judgement?
[Delhi Univ. B.Com. (Hons.), 1993; Himachal Pradesh Univ. M.A. (Econ.), June 1999; Allahabad Univ. M.Com., 2002]
Solution. Let the marks awarded by Judge 'A' be denoted by the variable x and the marks awarded by Judge 'B' by the variable y.

CALCULATIONS FOR REGRESSION EQUATIONS
Debator   x    y    u = x - A = x - 35   v = y - B = y - 30   u²    v²    uv
1         40   32   5                    2                    25    4     10
2         34   39   -1                   9                    1     81    -9
3         28   26   -7                   -4                   49    16    28
4         30   30   -5                   0                    25    0     0
5         44   38   9                    8                    81    64    72
6         38   34   3                    4                    9     16    12
7         31   28   -4                   -2                   16    4     8
Total               Σu = 0               Σv = 17              Σu² = 206   Σv² = 185   Σuv = 121

The marks awarded by Judge A to the eighth debator are given to be 36, i.e., we are given x = 36. We want to find the marks which would have been given to the 8th debator by Judge B, if he were present. In other words, we want to find y when x = 36. To do this we need the equation of line of regression of y on x.
In the usual notations we have:
$$\bar{x} = A + \frac{\sum u}{n} = 35 + \frac{0}{7} = 35, \qquad \bar{y} = B + \frac{\sum v}{n} = 30 + \frac{17}{7} = 32.4286$$
$$b_{yx} = b_{vu} = \frac{n\sum uv - (\sum u)(\sum v)}{n\sum u^2 - (\sum u)^2} = \frac{7 \times 121 - 0 \times 17}{7 \times 206 - 0} = \frac{121}{206} = 0.5874$$
The equation of line of regression of y on x is given by
$y - \bar{y} = b_{yx}(x - \bar{x})$
$y - 32.4286 = 0.5874\,(x - 35) = 0.5874x - 0.5874 \times 35$
$y = 0.5874x - 20.5590 + 32.4286$
$y = 0.5874x + 11.8696$
When x = 36, $y = 0.5874 \times 36 + 11.8696 = 21.1464 + 11.8696 = 33.016$
Hence, if Judge B were also present, he would have given 33 marks to the eighth debator.

Example 9.4. A departmental store gives in-service training to its salesmen which is followed by a test. It is considering whether it should terminate the service of any salesman who does not do well in the test. The following data give the test scores and sales made by nine salesmen during a certain period:
Test scores       14  19  24  21  26  22  15  20  19
Sales ('000 Rs.)  31  36  48  37  50  45  33  41  39
Calculate the coefficient of correlation between the test scores and the sales. Does it indicate that the termination of services of low test scores is justified? If the firm wants a minimum sales volume of Rs. 30,000, what is the minimum test score that will ensure continuation of service? Also estimate the most probable sales volume of a salesman making a score of 28. [Delhi Univ. B.Com. (Hons.), 2003]
Solution. Let x denote the test scores of the salesmen and y denote their corresponding sales (in '000 Rs.).

CALCULATIONS FOR REGRESSION LINES
x    y    dx = x - 20   dy = y - 40   dx²   dy²   dx dy
14   31   -6            -9            36    81    54
19   36   -1            -4            1     16    4
24   48   4             8             16    64    32
21   37   1             -3            1     9     -3
26   50   6             10            36    100   60
22   45   2             5             4     25    10
15   33   -5            -7            25    49    35
20   41   0             1             0     1     0
19   39   -1            -1            1     1     1
Σx = 180   Σy = 360   Σdx = 0   Σdy = 0   Σdx² = 120   Σdy² = 346   Σdx dy = 193

Then $\bar{x} = \dfrac{\sum x}{n} = \dfrac{180}{9} = 20$ and $\bar{y} = \dfrac{\sum y}{n} = \dfrac{360}{9} = 40$.
Coefficient of regression of y on x:
$$b_{yx} = \frac{\sum dx\,dy}{\sum dx^2} = \frac{193}{120} = 1.6083$$
Coefficient of regression of x on y:
$$b_{xy} = \frac{\sum dx\,dy}{\sum dy^2} = \frac{193}{346} = 0.5578$$
Karl Pearson's correlation coefficient r between x and y is given by
$$r^2 = b_{yx} \cdot b_{xy} = 1.6083 \times 0.5578 = 0.8971 \;\Rightarrow\; r = \pm\sqrt{0.8971} = \pm 0.9471$$
Since the regression coefficients are positive, r is also positive. Hence r = +0.9471.
Aliter.
$$r = \frac{\sum dx\,dy}{\sqrt{\sum dx^2 \cdot \sum dy^2}} = \frac{193}{\sqrt{120 \times 346}} = \frac{193}{\sqrt{41520}} = \frac{193}{203.7646} = 0.9472$$
Thus, we see that there is a very high degree of positive correlation between the test scores (x) and the sales ('000 Rs.) (y). This justifies the proposal for the termination of service of those with low test scores.

Regression Equations
To obtain the test score (x) for given sales (y), we use the equation of the line of regression of x on y. To estimate the sales volume (y) of a salesman with given test score (x), we use the line of regression of y on x.
The equation of line of regression of x on y is
$x - 20 = 0.5578\,(y - 40) = 0.5578y - 22.312$
$x = 0.5578y - 22.312 + 20$
$x = 0.5578y - 2.312$   ...(*)
The equation of line of regression of y on x is
$y - 40 = 1.6083\,(x - 20) = 1.6083x - 32.1660$
$y = 1.6083x - 32.1660 + 40$
$y = 1.6083x + 7.8340$   ...(**)
Hence to ensure the continuation of service, the minimum test score (x) corresponding to a minimum sales volume (y) of Rs. 30,000 (i.e., 30 in '000 Rs.) is obtained on putting y = 30 in (*) and is given by
$x = 0.5578 \times 30 - 2.312 = 16.734 - 2.312 = 14.422 \approx 14$
Hence the estimated sales volume of a salesman with test score of 28 is (in '000 Rs.)
$y = 1.6083 \times 28 + 7.8340 = 45.0324 + 7.8340 = 52.8664$ ('000 Rs.) = Rs. 52,866.40
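Example 9.4 uses both regression lines, each for its own purpose: the line of x on y to find a test score from a sales target, and the line of y on x to estimate sales from a test score. The following Python sketch mirrors that choice; all names are illustrative.

```python
scores = [14, 19, 24, 21, 26, 22, 15, 20, 19]   # test scores (x)
sales = [31, 36, 48, 37, 50, 45, 33, 41, 39]    # sales in '000 Rs. (y)

n = len(scores)
mean_x = sum(scores) / n
mean_y = sum(sales) / n
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(scores, sales))
s_xx = sum((x - mean_x) ** 2 for x in scores)
s_yy = sum((y - mean_y) ** 2 for y in sales)

b_yx = s_xy / s_xx   # slope of the line of regression of y on x
b_xy = s_xy / s_yy   # slope of the line of regression of x on y

def score_for_sales(y):
    """Estimate the test score from sales, using the line of x on y."""
    return mean_x + b_xy * (y - mean_y)

def sales_for_score(x):
    """Estimate sales ('000 Rs.) from the test score, using the line of y on x."""
    return mean_y + b_yx * (x - mean_x)

min_score = score_for_sales(30)
est_sales = sales_for_score(28)
print(round(b_yx, 4), round(b_xy, 4), round(min_score, 2), round(est_sales, 2))
```

Solving the y-on-x line for x at y = 30 would give a different score (about 13.8), which is why the two lines are not interchangeable.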
2( ) 2y-8 +3 =

680 -64 +9 625: 9-21


(x).=2xy-3 x 8+8x3 =400-24 +24=
(o, 400
21250125 20 25
[Cov (xv). = ) .
[Cov (zy). = -(T.)x(y.)= "-ix-
ov (r. p)1.
40-
25/4
25/4 1.
Corrected line of (o,
regression of x on y becomes
x-()e (v-)
1-) X =y- 3.

EXERCISE 9.1
1. (a) Explain the concept of regression and point out its usefulness in dealing with business problems.
(b) What is a scatter diagram? Indicate by means of suitable scatter diagrams different types of correlation that may exist between the variables in bivariate data. What are regression lines? Write down the main points of distinction between correlation analysis and regression analysis.
2. Distinguish between correlation and regression analysis and indicate the utility of regression analysis in economic activities. [C.A. (Foundation), Nov. 1996]
3. (a) What is regression analysis? How does it differ from correlation? Why are there, in general, two regression equations?
(b) Comment on the following:
"Regression equations are irreversible". [Delhi Univ. B.Com. (Hons.), 2002]
4. Given a scatter diagram of bivariate data involving variables X and Y. Find the conditions of minimisation of $\sum (Y - \hat{Y})^2$ and hence derive normal equations for the linear regression of Y upon X. What sum is to be minimised when X is regressed upon Y and what are the normal equations in this case?
5. Derive the normal equations for the regression of Y on X for a data comprising of n pairs of values of X and Y. Show that the mean of the error terms is zero. [Delhi Univ. B.A. (Econ. Hons.), 2005]
Hint. $Y = a + bX$   ...(*)   (Regression equation of Y on X)
Normal equations are
$\sum Y = na + b\sum X$   ...(i)   and   $\sum XY = a\sum X + b\sum X^2$   ...(ii)
Mean of error terms is given by:
$$\bar{e} = \frac{1}{n}\sum (Y - \hat{Y}) = \frac{1}{n}\sum (Y - a - bX) \qquad [\text{From (*)}]$$
$$= \frac{1}{n}\left[\sum Y - na - b\sum X\right] = 0. \qquad [\text{From (i)}]$$
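The hint to Question 5 can be illustrated numerically: solve the two normal equations for a and b, then check that the error terms average to zero. The sketch below solves the equations by Cramer's rule and borrows the test-score and sales figures of Example 9.4 purely as sample data; the names are illustrative.

```python
X = [14, 19, 24, 21, 26, 22, 15, 20, 19]
Y = [31, 36, 48, 37, 50, 45, 33, 41, 39]

n = len(X)
sum_x = sum(X)
sum_y = sum(Y)
sum_xy = sum(x * y for x, y in zip(X, Y))
sum_x2 = sum(x * x for x in X)

# Normal equations:  sum_y  = n*a     + b*sum_x      ...(i)
#                    sum_xy = a*sum_x + b*sum_x2     ...(ii)
det = n * sum_x2 - sum_x ** 2
a = (sum_y * sum_x2 - sum_x * sum_xy) / det
b = (n * sum_xy - sum_x * sum_y) / det

# Error terms e = Y - (a + bX).
errors = [y - (a + b * x) for x, y in zip(X, Y)]
mean_error = sum(errors) / n
sum_x_error = sum(x * e for x, e in zip(X, errors))

print(round(a, 4), round(b, 4), abs(mean_error) < 1e-9)
```

Equation (i) forces the errors to sum to zero, and equation (ii) forces them to be uncorrelated with X, which is what `sum_x_error` checks.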
6. What is linear
the use of
regression ? Why are there, in general, two regression l1nes ? When do they coincide ? Explain
regression equations in economic enquiry.
7. (a) It is said that regression equations are irreversible, meaning thereby that you cannot find out the regression equation of x on y from that of y on x. Justify the comment with special reference to the principle of least squares.
(b) Explain the term 'Regression'. Why do we take, in general, two regression lines? When are the regression lines (i) perpendicular to each other and (ii) coincident?
8. What are regression lines? Why is it necessary to consider two lines of regression? In case the two lines are identical, prove that the correlation coefficient is +1 or -1. If the two variables are independent, show that the two regression lines are perpendicular.
9. Obtain the angle between the two lines of regression and discuss the nature of the lines for the following particular cases:
(i) r = ±1, (ii) r = 0.
10. (a) What is the difference between correlation and regression coefficients? Can the correlation coefficient be computed out of regression coefficients? If yes, how?
(b) Prove that the estimated slope of the regression of Y on X will equal the reciprocal of the estimated slope of the regression of X on Y only if r = 1. [Delhi Univ. B.A. (Econ. Hons.), 2007]
9-22
FUNDAMENTALS OF STATISTICS

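Hint. Estimated slope of regression of Y on X $= b_{YX} = r\,\sigma_Y/\sigma_X$.
Estimated slope of regression of X on Y $= b_{XY} = r\,\sigma_X/\sigma_Y$.
$b_{YX} = 1/b_{XY}$ if and only if $r\,\sigma_Y/\sigma_X = \sigma_Y/(r\,\sigma_X)$, i.e., $r^2 = 1$.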
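Illustration. The relation behind Exercise 10(b) is that the product of the two regression coefficients equals r squared, so the two slopes can be reciprocals of each other only when r squared is 1. The sketch below is a hedged numerical check in Python; both data sets and the helper name `regression_coefficients` are made up for illustration and are not from the text.

```python
from math import sqrt


def regression_coefficients(xs, ys):
    """Return (b_yx, b_xy, r) computed from deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / s_xx, s_xy / s_yy, s_xy / sqrt(s_xx * s_yy)


# Illustrative data only.
scattered = ([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])    # points off a line, |r| < 1
collinear = ([1, 2, 3, 4, 5], [3, 5, 7, 9, 11])   # points on y = 2x + 1, r = 1

for label, (xs, ys) in (("scattered", scattered), ("collinear", collinear)):
    b_yx, b_xy, r = regression_coefficients(xs, ys)
    print(f"{label}: b_yx = {b_yx:.4f}, 1/b_xy = {1 / b_xy:.4f}, "
          f"b_yx * b_xy = {b_yx * b_xy:.4f}, r^2 = {r * r:.4f}")
```

For the scattered data the two printed slopes differ, while for the collinear data they agree; in both cases the product of the coefficients matches r squared.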

11. (a) Prove that the regression lines of Y on X and X on Y intersect at the point $(\bar{X}, \bar{Y})$.
(b) What are regression coefficients? Deduce the expressions for regression coefficients by the method of least squares. Point out important properties of these estimates.
12. (a) Define regression coefficients. What information do they supply?
(b) Let $b_1$ and $b_2$ stand for the coefficients of regression of Y on X and X on Y respectively. Show that

[Delhi Univ. B.A. (Econ. Hons.), 1997]
13. Prove that:
(a) The correlation coefficient between two variables X and Y is the geometric mean of the regression coefficients of Y on X and X on Y.
(b) The arithmetic mean of the regression coefficients of Y on X and X on Y is greater than the correlation coefficient between X and Y.
14. Given the following values of x and y:
6
2 4 6 8
find the equation of regression of (i) y on x and (ii) x on y. Interpret the results.
Ans. y = 0.7143x 0.3334; x = 1.2857y + 1.0001.
15. Obtain the equations of the two lines of regression for the data given below:
X 2 3 4 5 6 9
Y 9 8 10 12 13 14 16 15
Ans. Y = 0.95X + 7.25; X = 0.95Y + 7.25.
16. Fit a least square line to the following data: (a) using x as independent variable; (b) using x as dependent variable.
3 8 9 14
2 4 5 7 8
Hence obtain:
(i) the regression coefficients of y on x and x on y;
(ii) the coefficient of correlation between x and y;
(iii) $\bar{x}$, $\bar{y}$;
(iv) the estimated value of y when x = 10 and of x when y = 6.

Ans. (a) y = 0.6332x + 0.619, (b) x = 1.5104y - 0.6235; (i) $b_{yx}$ = 0.6332, $b_{xy}$ = 1.5104, (ii) r = 0.9779, (iii) $\bar{x}$ = 7.14, $\bar{y}$ = 5.14, (iv) 6.951 and 8.4389.
17. What are regression coefficients? Show that

where the symbols have their usual meanings (which you are to explain in the course of your demonstration). What can you say about the angle between the regression lines when (i) r = 0, (ii) r = ±1, (iii) r increases from 0 to 1? Obtain the equation of the line of regression of Y on X from the following data:
X : 12 18 24 30 36 42 48
Y : 5.27 5.68 6.25 7.21 8.02 8.71 8.42
Estimate the most probable value of Y when X = 40.

Ans. Y = 3.9943 + 0.1028X; 8.1063.
18. From the following data of the age of husband and the age of wife, form two regression lines and calculate the husband's age when the wife's age is 16.
Husband's age : 36 23 27 28 28 29 30 31 33 35
Wife's age    : 29 18 20 22 27 21 29 27 29 28
Ans. Husband's age: x; Wife's age: y.
y = 0.95x - 3.5; x = 0.8y + 10; 22.8.
19. Find the regression equation of y on x where y and x are the marks obtained by 10 students as given below:
20 60 55 45 10 50
35 25 90
20 45 65 40 25 50
55 35 15 80
[C.A. (Foundation), May 2002]
Ans. h, 1-105; y= 1 105x 1015.
20. The following data give the experience of machine operators and their performance ratings as given by the number of good parts turned out per 100 pieces:
Operator              : 1 2 3 4 5 6 7 8
Experience (in years) : 16 12 18 4 3 10 5 12
Performance Ratings   : 87 88 89 68 78 80 75 83
Calculate the regression line of performance ratings on experience and estimate the probable performance if an operator has 7 years' experience. [Himachal Pradesh Univ. B.Com., 1996]
Ans. Y = 69.67 + 1.133X; 77.601.
21. Will it be useful to fit a linear regression to the following data? Support your argument by computing the relevant measures.

X 12 10 14 18 16
Y 10 8 7
[Delhi Univ. B.A. (Econ. Hons.), 1997]
Ans. r = 0.5547. Since r = r(X, Y), which is a measure of the linear relationship between X and Y, is not high (much less than 1), it is not advisable to fit a linear regression to the given data.
22. You are given the data relating to purchases and sales. Obtain the two regression equations by the method of least squares and estimate the likely sales when the purchases equal 100.
Purchases : 62 72 98 76 81 56 76 92 88 49
Sales     : 112 124 131 117 132 96 120 136 97 85
Ans. Purchases: x; Sales: y. x = 0.6515y + 0.0775; y = 0.7825x + 56.3125; 134.5625.
23. The height of fathers and sons is given in the following table. Find the two lines of regression and estimate the expected average height of the son when the height of the father is 67.5 inches.
Height of father (in inches) : 65 66 67 67 68 69 71 73
Height of son (in inches)    : 67 68 64 68 72 70 69 70
Ans. y = 0.4242x + 39.5484; x = 0.525y + 32.2875; 68.18 inches.
24. The following table gives the age of cars of a certain make and annual maintenance costs. Obtain the regression equation for costs related to age.
Age of cars (in years)                : 2 4 6 8
Maintenance cost (in hundreds of Rs.) : 10 20 25 30
Ans. x: Age; y: Cost. y = 3.25x + 5.
25. Assume that we conduct an experiment with eight fields planted with corn, four fields having no nitrogen fertilizer and four fields having 80 kgs. of nitrogen fertilizer. The resulting corn yields are shown in the table in bushels per acre:
Field                : 1 2 3 4 5 6 7 8
Nitrogen (kgs.)      : 0 0 0 0 80 80 80 80
Corn Yield / Hectare : 120 360 60 180 1,280 1,120 1,120 760
(a) Compute a linear regression equation by least squares. Explain the meaning of the regression equation in terms of fertilizer and corn yield.
(b) Predict corn yield for a field treated with 60 kgs. of fertilizer.
Ans. (a) Y = 11.125X + 180, (b) 847.5.
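Illustration. The answer to Exercise 25 can be reproduced with a few lines of Python. This is a sketch only, and it assumes the table is read as four fields with 0 kgs. of nitrogen followed by four fields with 80 kgs.; the variable names are illustrative.

```python
# Data as read from the table of Exercise 25 (assumed reading of the table).
nitrogen = [0, 0, 0, 0, 80, 80, 80, 80]
corn_yield = [120, 360, 60, 180, 1280, 1120, 1120, 760]

n = len(nitrogen)
mean_x = sum(nitrogen) / n
mean_y = sum(corn_yield) / n

# Sums of products and squares of deviations about the means.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(nitrogen, corn_yield))
s_xx = sum((x - mean_x) ** 2 for x in nitrogen)

slope = s_xy / s_xx                    # regression coefficient of Y on X
intercept = mean_y - slope * mean_x    # the line passes through the means
predicted_at_60 = intercept + slope * 60

print(slope, intercept, predicted_at_60)   # prints 11.125 180.0 847.5
```

Because X takes only two values here, the fitted line simply joins the two group means (180 at 0 kgs. and 1,070 at 80 kgs.), so the slope is the rise in mean yield per kg. of fertilizer.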
26. The following table gives the ages and blood pressure of 10 women.
Age (X)            : 56 42 36 47 49 42 60 72 63 55
Blood Pressure (Y) : 147 125 118 128 145 140 155 160 149 150
(i) Find the correlation coefficient between X and Y.
(ii) Determine the least square regression equation of Y on X.
(iii) Estimate the blood pressure of a woman whose age is 45 years.
Ans. (i) r = 0.89, (ii) Y = 83.758 + 1.11X, (iii) When X = 45, Y = 134.
27. After investigation it has been found that the demand for automobiles in a town depends mainly, if not entirely, upon the number of families residing in that town. In the adjoining table figures are given for the sales of automobiles in the five cities for the year 1995, and the number of families residing in those cities.
City   No. of families in 10,000's (X)   Sales of autos in 1,000's (Y)
A      70                                25.2
B      75                                28.6
C      80                                30.2
D      60                                22.3
E      90                                35.4
Fit a linear equation of Y on X by the least square method and estimate the sales for the year 1996 for city F which is estimated to have 30 lakh families, assuming that the same relationship holds true.
Ans. y = 0.443x - 4.885; 1,28,015.
28. A panel of two judges P and Q graded seven dramatic performances by independently awarding marks as follows:
Performance : 1 2 3 4 5 6 7
Marks by P  : 46 42 44 40 43 41 45
Marks by Q  : 40 38 36 35 39 37 41
The eighth performance, which Judge Q could not attend, was awarded 37 marks by Judge P. If Judge Q had also been present, how many marks would be expected to have been awarded by him to the eighth performance?
Ans. 33.5 ≈ 34.
29. The following table gives the normal weight of a baby during the first six months of life:
Age in months  : 0 2 3 5 6
Weight in lbs. : 5 7 8 10 12
Estimate the weight of a baby at the age of 4 months.
Ans. 9.2982 lbs.
30. The weight (in lbs.) of a new born calf is taken at weekly intervals. Below are the observations for 10 weeks:
Age X    : 1 2 3 4 5 6 7 8 9 10
Weight Y : 52.5 58.7 65.0 70.2 75.4 81.1 87.2 95.5 102.2 108.4
Let Y = a + bU, where U = 2X - 11. Use normal equations to estimate a and b. (Given: The sum of the products of the form UY for these 10 observations = 1016.8.) Hence obtain the line of best fit of Y on X. Now write down the average rate of growth of the calf per week.
Ans. a = 79.62, b = 3.08; Y = 6.16X + 45.74; 6.16 lbs.
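Illustration. Exercise 30 uses the substitution U = 2X - 11 so that the U values sum to zero, which decouples the two normal equations: a becomes the mean of Y and b becomes the sum of UY divided by the sum of U squared. The Python sketch below follows that route; it assumes the ages run from 1 to 10 weeks and the weights are read in increasing order as 52.5, 58.7, ..., 108.4, and the variable names are illustrative.

```python
# Weekly weights of the calf (assumed reading of the table of Exercise 30).
weights = [52.5, 58.7, 65.0, 70.2, 75.4, 81.1, 87.2, 95.5, 102.2, 108.4]
weeks = range(1, 11)

# Change of variable U = 2X - 11 gives -9, -7, ..., 7, 9, which sums to zero.
u = [2 * x - 11 for x in weeks]
n = len(weights)

# With sum(U) = 0 the normal equations reduce to:
#   sum(Y)  = n * a        ->  a = mean of Y
#   sum(UY) = b * sum(U^2) ->  b = sum(UY) / sum(U^2)
a = sum(weights) / n
sum_uy = sum(ui * y for ui, y in zip(u, weights))
b = sum_uy / sum(ui * ui for ui in u)

# Substitute U = 2X - 11 back: Y = (a - 11b) + (2b) X.
slope_x = 2 * b
intercept_x = a - 11 * b

print(f"sum(UY) = {sum_uy:.1f}, a = {a:.2f}, b = {b:.4f}")
print(f"Y = {slope_x:.2f} X + {intercept_x:.2f}")
```

The slope 2b is the average weekly gain in weight. The printed intercept can differ slightly from the book's 45.74 because the book rounds b to 3.08 before substituting back.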
31. You are given the following data:
                         x     y
Arithmetic Mean          36    85
Standard Deviation       11    8
Correlation coefficient between x and y = 0.66
(i) Find two regression equations. (ii) Estimate value of x when y = 75.
Ans. (i) y = 0.48x + 67.72; x = 0.9075y - 41.1375, (ii) 26.925.
32. Given the information: Sum of X = 5; Sum of Y = 4; Sum of squares of deviations from the mean of X = 40; Sum of squares of deviations from the mean of Y = 50; Sum of the products of deviations from the means of X and Y = 32; Number of pairs of observations = 10.
Calculate:
(i) regression coefficient of Y on X; (ii) regression coefficient of X on Y;
(iii) Karl Pearson's coefficient of correlation.
[Delhi Univ. B.A. (Econ. Hons.), 1999]
Ans. $b_{YX}$ = 0.80; $b_{XY}$ = 0.64; r(X, Y) = 0.7156.
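Illustration. In Exercise 32 everything follows from the three sums of deviations: each regression coefficient is the sum of products divided by the sum of squares of the independent variable, and r is the geometric mean of the two coefficients with the sign of the sum of products. A minimal Python sketch, with illustrative variable names:

```python
from math import sqrt

# Sums of deviations about the means, as given in Exercise 32.
sum_dx2 = 40     # sum of squared deviations of X
sum_dy2 = 50     # sum of squared deviations of Y
sum_dxdy = 32    # sum of products of deviations

b_yx = sum_dxdy / sum_dx2    # regression coefficient of Y on X
b_xy = sum_dxdy / sum_dy2    # regression coefficient of X on Y

# r is the geometric mean of the two coefficients and carries
# the sign of the sum of products.
r = sqrt(b_yx * b_xy)
if sum_dxdy < 0:
    r = -r

print(f"b_yx = {b_yx:.2f}, b_xy = {b_xy:.2f}, r = {r:.3f}")
```

The computed r agrees with the printed answer to three decimal places.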
33. For some bi-variate data, the following results were obtained:
Mean value of variable X = 53.2 and of Y = 39.5.
Regression coefficient of Y on X = -1.5 and of X on Y = -0.38.
What should be the most likely value of X when Y = 50? Also find the coefficient of correlation between the two variables. [Delhi Univ. B.Com. (Hons.), 2005]
Ans. X = 53.2 + (-0.38)(50 - 39.5) = 49.21; $r = -\sqrt{(-1.5)(-0.38)} = -\sqrt{0.57} = -0.7549$.
34. For a particular product, the sales (y) and the advertisement expenditure (x) for 10 years provide the results:
$\sum x = 15$, $\sum y = 110$, $\sum xy = 400$, $\sum x^2 = 250$, $\sum y^2 = 3200$.
Find the regression line of y on x and the estimated value of y for x = 10. [I.C.W.A. (Intermediate), Dec. 2001]
Ans. y = 1.033x + 9.4505; $(\hat{y})_{x=10}$ = 19.781.
35. Calculate the correlation coefficient from the following results:
N = 10, $\sum X = 350$, $\sum Y = 310$, $\sum (X - 35)^2 = 162$, $\sum (Y - 31)^2 = 222$, $\sum (X - 35)(Y - 31) = 92$.
Also find the regression line of Y on X. [Delhi Univ. B.A. (Econ. Hons.), 2007]
Hint. $\bar{X} = 35$, $\bar{Y} = 31$; $\sum (X - 35)^2 = \sum (X - \bar{X})^2 = 162$, and so on.
Ans. r(X, Y) = 0.485; Y = 0.568X + 11.12.
36. For bivariate data, you are given the following:
$\sum (X - 58) = 46$, $\sum (Y - 58) = 9$, $\sum (X - 58)^2 = 3086$, $\sum (Y - 58)^2 = 483$, $\sum (X - 58)(Y - 58) = 1095$.
Number of pairs of observations is 7. You are required to determine the two regression equations and the coefficient of correlation between X and Y. [Delhi Univ. B.Com. (Hons.), 2000]
Hint. Let U = X - 58, V = Y - 58. Then we are given $\sum U$, $\sum V$, $\sum U^2$, $\sum V^2$ and $\sum UV$.
$\bar{X} = 58 + \bar{U}$; $\bar{Y} = 58 + \bar{V}$; $b_{YX} = b_{VU}$ and $b_{XY} = b_{UV}$.
Ans. Regression equations:
Y on X: Y = 0.372X + 35.266; X on Y: X = 2.197Y - 65.680; r(X, Y) = 0.904.
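Illustration. The hint to Exercise 36 rests on the fact that regression coefficients and r are unaffected by a change of origin, so they can be computed directly from the sums about 58; only the means need the origin added back. The Python sketch below follows that reasoning; the variable names are illustrative.

```python
from math import sqrt

n = 7
origin = 58                      # U = X - 58, V = Y - 58
sum_u, sum_v = 46, 9
sum_uu, sum_vv, sum_uv = 3086, 483, 1095

# Convert sums about the working origin into sums about the means.
s_uu = sum_uu - sum_u ** 2 / n
s_vv = sum_vv - sum_v ** 2 / n
s_uv = sum_uv - sum_u * sum_v / n

b_yx = s_uv / s_uu               # equals b_VU (unaffected by change of origin)
b_xy = s_uv / s_vv               # equals b_UV
r = s_uv / sqrt(s_uu * s_vv)

# Means in the original units, then intercepts of the two lines.
mean_x = origin + sum_u / n
mean_y = origin + sum_v / n
intercept_yx = mean_y - b_yx * mean_x
intercept_xy = mean_x - b_xy * mean_y

print(f"Y on X: Y = {b_yx:.3f} X + {intercept_yx:.3f}")
print(f"X on Y: X = {b_xy:.3f} Y + {intercept_xy:.3f}")
print(f"r = {r:.3f}")
```

The intercepts printed here may differ from the book's in the second decimal place, since the book rounds the coefficients to three decimals before computing them.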
37. Obtain the lines of regression of y on x and x on y for the data given below:
$\sum x = 50$, $\sum y = 60$, $\sum xy = 350$, n = 10, $\sigma_x^2 = 4$, $\sigma_y^2 = 9$.
[Delhi Univ. B.A. (Econ. Hons.), 1996]
Ans. y = 1.25x - 0.25; x = 0.556y + 1.664.
38. If the two regression lines corresponding to two variables X and Y meet at a point (2, 3), V(X) = 4, V(Y) = 1 and the correlation coefficient between X and Y is 0.5, the estimated value of Y for X = 6 is:

(i) 2, (ii) 4, (iii) 7, (iv) None of these.

[I.C.W.A. (Intermediate), Dec. 1999]
Hint. Lines of regression intersect at the point $(\bar{x}, \bar{y})$ = (2, 3).

Ans. (ii).
39. Let the two variables X and Y have the covariance and correlation coefficient between them as 2 and 0.5 respectively and V(X) = 2V(Y), then the regression coefficient of X on Y is
(i)1. in (ii) (iv) None of these.
[I.C.W.A. (Intermediate), June 2001]

Ans. (iv); $b_{XY} = 1/\sqrt{2}$.


40. The correlation coefficient between the variables x and y is r = 0.60. If $\sigma_x$ = 1.50, $\sigma_y$ = 2.00, $\bar{x}$ = 10 and $\bar{y}$ = 20, find the equations of the regression lines
(i) y on x, (ii) x on y.

Ans. y = 0.8x + 12; x = 0.45y + 1.
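Illustration. When only the means, standard deviations and r are known, as in Exercise 40, each regression coefficient is r times the ratio of the standard deviations, and each line passes through the point of means. A short Python sketch with illustrative variable names:

```python
# Summary measures given in Exercise 40.
r = 0.60
sd_x, sd_y = 1.50, 2.00
mean_x, mean_y = 10, 20

# Regression coefficients from r and the standard deviations.
b_yx = r * sd_y / sd_x           # slope of y on x
b_xy = r * sd_x / sd_y           # slope of x on y

# Each line passes through (mean_x, mean_y), which fixes the intercepts.
intercept_yx = mean_y - b_yx * mean_x
intercept_xy = mean_x - b_xy * mean_y

print(f"y = {b_yx:.2f} x + {intercept_yx:.2f}")
print(f"x = {b_xy:.2f} y + {intercept_xy:.2f}")
```

The product of the two coefficients is 0.36, which equals r squared, a quick consistency check on any pair of regression equations.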
