
Correlation & Regression Analysis - Exercise2

This document provides solutions to 7 exercises on correlation and regression analysis. The exercises include calculating the coefficient of correlation, estimating values using regression lines, and determining how removing an outlier affects the original correlation coefficient value. The solutions show the step-by-step workings and calculations for each exercise.

Solution to Exercises on Correlation & Regression Analysis

Correlation & Regression_Exercise 1


Exercise # 1
From the following data, calculate the coefficient of correlation between the percentage yield on securities and the wholesale price index for the years shown:

| Year | 1982 | 1983 | 1984 | 1985 | 1986 | 1987 | 1988 |
|---|---|---|---|---|---|---|---|
| % yield on securities | 5.0 | 5.1 | 5.2 | 4.9 | 4.8 | 5.3 | 5.4 |
| Index no. of wholesale prices | 140 | 138 | 126 | 132 | 140 | 135 | 132 |

Also calculate the two regression lines. Estimate the percentage yield on securities when the index no. of wholesale prices is 150, and estimate the index no. of wholesale prices when the yield on securities is 6.0.



Solution: Exercise # 1
Let X denote the % yield on securities and Y the index no. of wholesale prices.

| X | Y | X² | Y² | XY |
|---|---|---|---|---|
| 5.0 | 140 | 25.00 | 19600 | 700.0 |
| 5.1 | 138 | 26.01 | 19044 | 703.8 |
| 5.2 | 126 | 27.04 | 15876 | 655.2 |
| 4.9 | 132 | 24.01 | 17424 | 646.8 |
| 4.8 | 140 | 23.04 | 19600 | 672.0 |
| 5.3 | 135 | 28.09 | 18225 | 715.5 |
| 5.4 | 132 | 29.16 | 17424 | 712.8 |
| Total: 35.7 | 943 | 182.35 | 127193 | 4806.1 |



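Here $\sum X = 35.7$, $\sum Y = 943$, $\sum X^2 = 182.35$, $\sum Y^2 = 127193$ and $\sum XY = 4806.1$. With $n = 7$, the means are $\bar{x} = 35.7/7 = 5.1$ and $\bar{y} = 943/7 = 134.7143$.

$$r_{xy} = \frac{\sum xy - n\,\bar{x}\,\bar{y}}{\sqrt{\sum (x-\bar{x})^2 \cdot \sum (y-\bar{y})^2}} = \frac{4806.1 - 7 \times 5.1 \times 134.7143}{\sqrt{(182.35 - 7 \times 5.1^2)\,(127193 - 7 \times 134.7143^2)}} = -0.482$$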



$$b_1 = \frac{\sum xy - n\,\bar{x}\,\bar{y}}{\sum x^2 - n\,\bar{x}^2} = \frac{4806.1 - 7 \times 5.1 \times 134.7143}{182.35 - 7 \times 5.1^2} = -11.43$$

$$a_1 = \bar{y} - b_1\bar{x} = 134.7143 - (-11.43) \times 5.1 = 193$$
The equation of the line of regression is Y = 193 – 11.43X
Therefore, the estimated index no. of wholesale prices
when yield on securities is 6.0 = 193 – 11.43 × 6 = 124.42



$$b_2 = \frac{\sum xy - n\,\bar{x}\,\bar{y}}{\sum y^2 - n\,\bar{y}^2} = \frac{4806.1 - 7 \times 5.1 \times 134.7143}{127193 - 7 \times 134.7143^2} = -0.02$$

$$a_2 = \bar{x} - b_2\bar{y} = 5.1 - (-0.02) \times 134.7143 = 7.79$$

The equation of the line of regression is X = 7.79 – 0.02Y


Therefore, the estimated percentage yield on securities
when index no. of wholesale prices is 150 = 7.79 - 0.02 × 150
= 4.79
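
As a cross-check, the Exercise 1 arithmetic can be reproduced from the raw data. This is only a sketch: the helper name `summary_sums` and the variable names are illustrative and do not come from the slides. Because the code keeps full precision while the slides round the slopes to two decimals, the intercepts and estimates may differ slightly in the last digit or two.

```python
from math import sqrt

# Data from Exercise 1: X = % yield on securities, Y = wholesale price index
x = [5.0, 5.1, 5.2, 4.9, 4.8, 5.3, 5.4]
y = [140, 138, 126, 132, 140, 135, 132]


def summary_sums(x, y):
    """Return n, the two means, and the corrected sums Sxy, Sxx, Syy."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum(a * b for a, b in zip(x, y)) - n * x_bar * y_bar
    sxx = sum(a * a for a in x) - n * x_bar ** 2
    syy = sum(b * b for b in y) - n * y_bar ** 2
    return n, x_bar, y_bar, sxy, sxx, syy


n, x_bar, y_bar, sxy, sxx, syy = summary_sums(x, y)

r = sxy / sqrt(sxx * syy)      # coefficient of correlation
b1 = sxy / sxx                 # slope of the regression of Y on X
a1 = y_bar - b1 * x_bar        # intercept of the regression of Y on X
b2 = sxy / syy                 # slope of the regression of X on Y
a2 = x_bar - b2 * y_bar        # intercept of the regression of X on Y

y_at_6 = a1 + b1 * 6.0         # estimated index when the yield is 6.0
x_at_150 = a2 + b2 * 150       # estimated yield when the index is 150

print(round(r, 3), round(b1, 2), round(y_at_6, 2), round(x_at_150, 2))
```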



Exercise # 2
The following data relate to advertising expenditure (in lakh taka) and the corresponding sales (in crores of taka):

| Advertising expenditure | 10 | 12 | 15 | 23 | 20 |
|---|---|---|---|---|---|
| Sales | 14 | 17 | 13 | 25 | 21 |

Estimate
i) the sales corresponding to an advertising expenditure of taka 30 lakh, and
ii) the advertising expenditure for a sales target of tk. 35 crores.



Solution: Exercise # 2
Suppose the advertising expenditure is denoted by X and the sales by Y.

| X | Y | X² | Y² | XY |
|---|---|---|---|---|
| 10 | 14 | 100 | 196 | 140 |
| 12 | 17 | 144 | 289 | 204 |
| 15 | 13 | 225 | 169 | 195 |
| 23 | 25 | 529 | 625 | 575 |
| 20 | 21 | 400 | 441 | 420 |
| Total: 80 | 90 | 1398 | 1720 | 1534 |



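Now, $\bar{X} = 16$ and $\bar{Y} = 18$.

$$\sigma_x^2 = \tfrac{1}{5} \times 1398 - 16^2 = 23.6, \qquad \sigma_x = 4.858$$

$$\sigma_y^2 = \tfrac{1}{5} \times 1720 - 18^2 = 20.0, \qquad \sigma_y = 4.472$$

$$\mathrm{COV}(X, Y) = \tfrac{1}{5} \times 1534 - 16 \times 18 = 18.8$$

The correlation coefficient is

$$r_{xy} = \frac{\mathrm{COV}(X, Y)}{\sigma_x \sigma_y} = \frac{18.8}{4.858 \times 4.472} = 0.865$$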

$$b_1 = b_{yx} = \frac{\sigma_{xy}}{\sigma_x^2} = \frac{18.8}{23.6} = 0.797, \qquad a_1 = \bar{y} - b_1\bar{x} = 5.248$$

Thus the regression equation of sales on advertising expenditure is

$$Y = 5.248 + 0.797X$$

The estimated sales when advertising expenditure is 30 lakh is

$$\hat{Y} = 5.248 + 0.797 \times 30 = 29.158$$

$$b_2 = b_{xy} = \frac{\sigma_{xy}}{\sigma_y^2} = \frac{18.8}{20} = 0.94, \qquad a_2 = \bar{x} - b_2\bar{y} = -0.92$$

Thus the regression equation of advertising expenditure on sales is

$$X = -0.92 + 0.94Y$$

The estimated advertising expenditure when the sales target is 35 crores is

$$\hat{X} = -0.92 + 0.94 \times 35 = 31.98$$
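
The Exercise 2 solution works from population variances and the population covariance (dividing by n, not n - 1). A minimal sketch of the same route follows. The helper `pcovariance` is a hypothetical name introduced here, while `mean` and `pvariance` come from Python's standard `statistics` module. Full-precision results can differ from the rounded slide values in the second or third decimal.

```python
from math import sqrt
from statistics import mean, pvariance

advertising = [10, 12, 15, 23, 20]   # X, in lakh taka
sales = [14, 17, 13, 25, 21]         # Y, in crores of taka


def pcovariance(x, y):
    """Population covariance: mean of the products minus product of the means."""
    return mean(a * b for a, b in zip(x, y)) - mean(x) * mean(y)


cov = pcovariance(advertising, sales)
var_x, var_y = pvariance(advertising), pvariance(sales)

r = cov / sqrt(var_x * var_y)

# Regression of sales (Y) on advertising expenditure (X)
b_yx = cov / var_x
a_yx = mean(sales) - b_yx * mean(advertising)
sales_at_30 = a_yx + b_yx * 30

# Regression of advertising expenditure (X) on sales (Y)
b_xy = cov / var_y
a_xy = mean(advertising) - b_xy * mean(sales)
advertising_at_35 = a_xy + b_xy * 35

print(round(r, 3), round(sales_at_30, 2), round(advertising_at_35, 2))
```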
Exercise # 3

The following table gives the distribution of items of production and also the relatively defective items among them, according to size groups. Is there any correlation between size and defect in quality?

| Size group | 15-16 | 16-17 | 17-18 | 18-19 | 19-20 | 20-21 |
|---|---|---|---|---|---|---|
| No. of items | 200 | 270 | 340 | 360 | 400 | 300 |
| No. of defective items | 150 | 162 | 170 | 180 | 180 | 120 |

Also calculate the two regression lines. Estimate the no. of defective items when the size group is 22-23.



Solution: Exercise # 3
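
One plausible way to set this exercise up is sketched below. It rests on two assumptions that are not stated in the slides: each size group is represented by its mid-value, and "defect in quality" is measured by the percentage of defective items in the group, not by the raw count. Under those assumptions the code computes the correlation and extrapolates the regression of percentage defective on size to the 22-23 group. Converting that percentage into a number of defective items would additionally need the number of items in that group, which the exercise does not give.

```python
from math import sqrt

# Assumption: each size group is represented by its mid-value
size_mid = [15.5, 16.5, 17.5, 18.5, 19.5, 20.5]
items = [200, 270, 340, 360, 400, 300]
defective = [150, 162, 170, 180, 180, 120]

# Assumption: quality is measured by the percentage defective in each group
pct_defective = [100 * d / n for d, n in zip(defective, items)]

k = len(size_mid)
x_bar = sum(size_mid) / k
y_bar = sum(pct_defective) / k
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(size_mid, pct_defective))
sxx = sum((x - x_bar) ** 2 for x in size_mid)
syy = sum((y - y_bar) ** 2 for y in pct_defective)

r = sxy / sqrt(sxx * syy)

# Regression of percentage defective on size, extrapolated to mid-value 22.5
b_yx = sxy / sxx
a_yx = y_bar - b_yx * x_bar
pct_at_22_5 = a_yx + b_yx * 22.5

print(round(r, 2), round(pct_at_22_5, 2))
```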



Exercise # 4

From the following data compute the coefficient of correlation between X and Y.

| | X-series | Y-series |
|---|---|---|
| No. of items | 15 | 15 |
| Average | 25 | 18 |
| Sum of squares of deviations from mean | 136 | 138 |

The sum of the products of the deviations of the X and Y series from their respective arithmetic means is 122.



Solution: Exercise # 4

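Here we have $n = 15$, $\bar{X} = 25$ and $\bar{Y} = 18$, with

$$\sum (X - \bar{X})^2 = 136, \qquad \sum (Y - \bar{Y})^2 = 138 \qquad \text{and} \qquad \sum (X - \bar{X})(Y - \bar{Y}) = 122$$

Now,

$$r_{xy} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \, \sum (Y - \bar{Y})^2}} = \frac{122}{\sqrt{136 \times 138}} = 0.891$$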



Exercise # 5

Find out the regression equation showing the regression of capacity utilization on production from the following data:

| | Average | SD |
|---|---|---|
| Production (in lakh units) | 35.6 | 10.5 |
| Capacity utilization (in %) | 84.8 | 8.5 |

Correlation coefficient, r = 0.62

Estimate the production when the capacity utilization is 70 percent.



Solution: Exercise # 5
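Let production be denoted by Y and capacity utilization be denoted by X.
Here we have $\bar{Y} = 35.6$, $\sigma_y = 10.5$, $\bar{X} = 84.8$, $\sigma_x = 8.5$ and $r_{xy} = 0.62$.

$$r_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y} \;\Rightarrow\; 0.62 = \frac{\sigma_{xy}}{\sigma_x \sigma_y} \;\Rightarrow\; \sigma_{xy} = 0.62 \times 10.5 \times 8.5 = 55.335$$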
Now the regression coefficient of Y on X is

$$b_{yx} = \frac{\sigma_{xy}}{\sigma_x^2} = \frac{55.335}{8.5^2} = 0.766$$

and the constant is

$$a = \bar{y} - b\bar{x} = 35.6 - 0.766 \times 84.8 = -29.357$$

The regression equation is

$$Y = -29.357 + 0.766X$$

When the capacity utilization is 70 percent, the predicted value of Y is

$$\hat{Y} = -29.357 + 0.766 \times 70 = 24.26$$
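
The Exercise 5 calculation needs only the means, the standard deviations and r, so it can be wrapped in a small helper. The sketch below uses a hypothetical function name, `regression_from_summary`, and relies on the identity slope = r × sd_y / sd_x, which is the same quantity as covariance divided by the variance of X. It is an illustration, not the slides' own procedure.

```python
def regression_from_summary(mean_x, sd_x, mean_y, sd_y, r):
    """Return (intercept, slope) of the regression of Y on X from summary statistics."""
    slope = r * sd_y / sd_x              # same as cov(X, Y) / var(X)
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Exercise 5: X = capacity utilization (%), Y = production (lakh units)
a, b = regression_from_summary(mean_x=84.8, sd_x=8.5, mean_y=35.6, sd_y=10.5, r=0.62)
production_at_70 = a + b * 70

print(round(b, 3), round(a, 3), round(production_at_70, 2))
```

Swapping the roles of the two variables in the call gives the regression of X on Y.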
Exercise # 6
Coefficient of correlation between two variables X and Y
is 0.32. Their Co-variance is 7.86. The variance of X is 10.
Find the SD of Y series.

Solution: Exercise # 6
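Here, $r_{xy} = 0.32$, $\sigma_{xy} = 7.86$ and $\sigma_x^2 = 10$.

We know

$$r_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y} \;\Rightarrow\; \sigma_y = \frac{\sigma_{xy}}{\sigma_x \, r_{xy}} = \frac{7.86}{\sqrt{10} \times 0.32} = 7.767$$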



Exercise # 7

In two series of variables X and Y with 50 observations each, the following data were observed:

$$\bar{X} = 10, \quad \sigma_X = 3, \quad \bar{Y} = 6, \quad \sigma_Y = 2 \quad \text{and} \quad r_{XY} = 0.3$$

But on subsequent verification it was found that one pair of values [X (=10) and Y (=6)] was incorrect and was weeded out. With the remaining 49 pairs of values, how is the original value of r affected?



Solution: Exercise # 7

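Here $\bar{Y} = 6$, $\sigma_y = 2$ and $\bar{X} = 10$, $\sigma_x = 3$.

For 50 pairs of observations,

$$\sum X = 10 \times 50 = 500 \quad \text{and} \quad \sum X^2 = n(\sigma_x^2 + \bar{x}^2) = 50(9 + 100) = 5450$$

Again,

$$\sum Y = 50 \times 6 = 300 \quad \text{and} \quad \sum Y^2 = 50(4 + 36) = 2000$$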



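Now if the pair of observations (x = 10, y = 6) is weeded out, then for the remaining 49 pairs of observations we have

$$\sum X = 500 - 10 = 490 \;\Rightarrow\; \bar{x} = 10$$

$$\sum Y = 300 - 6 = 294 \;\Rightarrow\; \bar{y} = 6$$

$$\sum X^2 = 5450 - 100 = 5350, \qquad \sum Y^2 = 2000 - 36 = 1964$$

Now

$$\sigma_x^2 = \frac{5350}{49} - 100 = 9.184 \;\Rightarrow\; \sigma_x = 3.03$$

and

$$\sigma_y^2 = \frac{1964}{49} - 36 = 4.082 \;\Rightarrow\; \sigma_y = 2.02$$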



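Again, for 50 observations,

$$\sigma_{xy} = r \, \sigma_x \sigma_y = 0.3 \times 3 \times 2 = 1.8$$

So

$$\sum XY = n(\sigma_{xy} + \bar{x}\bar{y}) = 50(1.8 + 10 \times 6) = 3090$$

So for 49 pairs of observations,

$$\sum XY = 3090 - 10 \times 6 = 3030$$

$$r_{xy} = \frac{3030 / 49 - 10 \times 6}{3.03 \times 2.02} = 0.3$$

The value of r is therefore unaffected: the pair that was removed coincides with the means of both series, so it contributed nothing to the covariance or to either sum of squared deviations.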
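
The bookkeeping in Exercise 7 (recover the raw sums, subtract the rejected pair, recompute r) can be sketched as follows. The function names are illustrative, and the standard deviations are treated as population values (divisor n), as in the solution above.

```python
from math import sqrt


def sums_from_summary(n, mean_x, sd_x, mean_y, sd_y, r):
    """Recover the raw sums from the means, population SDs and r."""
    sum_x, sum_y = n * mean_x, n * mean_y
    sum_x2 = n * (sd_x ** 2 + mean_x ** 2)
    sum_y2 = n * (sd_y ** 2 + mean_y ** 2)
    sum_xy = n * (r * sd_x * sd_y + mean_x * mean_y)
    return sum_x, sum_y, sum_x2, sum_y2, sum_xy


def r_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Correlation coefficient computed from raw sums."""
    cov = sum_xy / n - (sum_x / n) * (sum_y / n)
    var_x = sum_x2 / n - (sum_x / n) ** 2
    var_y = sum_y2 / n - (sum_y / n) ** 2
    return cov / sqrt(var_x * var_y)


sx, sy, sx2, sy2, sxy = sums_from_summary(50, 10, 3, 6, 2, 0.3)

# Weed out the incorrect pair (10, 6) and recompute with 49 pairs
x0, y0 = 10, 6
r_after = r_from_sums(49, sx - x0, sy - y0, sx2 - x0 ** 2, sy2 - y0 ** 2, sxy - x0 * y0)

print(round(r_after, 4))
```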



Exercise # 8
The General Manager of Kiran Enterprises, an enterprise dealing in the sale of readymade men's wear, is toying with the idea of increasing his sales to Tk. 80,000. On checking the records of sales during the last 10 years, it was found that the annual sales proceeds and advertising expenditure were highly correlated, to the extent of 0.8. It was further noted that the annual average sales have been Tk. 45,000 and the annual average expenditure Tk. 30,000, with variances of 1600 and 626 in annual average sales and annual average expenditure respectively.

In view of the above, how much expenditure on advertisement would you suggest the general manager of the enterprise incur to meet his sales target?



Solution: Exercise # 8
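Suppose sales is denoted by Y and advertisement expenditure by X.
Here, $n = 10$, $r_{xy} = 0.8$, $\bar{x} = 30000$, $\bar{y} = 45000$, $\sigma_x^2 = 626$ and $\sigma_y^2 = 1600$.

Now,

$$\sigma_{xy} = r_{xy} \, \sigma_x \sigma_y = 0.8 \times \sqrt{626} \times \sqrt{1600} = 800.64$$

So,

$$b_{xy} = \frac{\sigma_{xy}}{\sigma_y^2} = \frac{800.64}{1600} = 0.5004 \approx 0.5$$

$$a = \bar{x} - b\bar{y} = 30000 - 0.5 \times 45000 = 7500$$

The regression equation of X on Y is

$$x = 7500 + 0.5y$$

The expected value of x when y is 80000 is

$$\hat{x} = 7500 + 0.5 \times 80000 = 47500$$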
Exercise # 9

For 10 observations on price (X) and supply (Y), the following data were obtained:

$$\sum X = 130, \quad \sum Y = 220, \quad \sum X^2 = 2288, \quad \sum Y^2 = 5506 \quad \text{and} \quad \sum XY = 3467$$

Obtain the lines of regression of Y on X and of X on Y, and estimate the supply when the price is 16 units.



Solution: Exercise # 9

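Here,

$$\bar{x} = \frac{\sum x}{n} = 130 \div 10 = 13, \qquad \bar{y} = 220 \div 10 = 22$$

The regression equation of y on x is y = a1 + b1x, where

$$b_1 = b_{yx} = \frac{\sum xy - n\,\bar{x}\,\bar{y}}{\sum x^2 - n\,\bar{x}^2} = \frac{3467 - 10 \times 13 \times 22}{2288 - 10 \times 13^2} = 1.015$$

$$a_1 = \bar{y} - b_1\bar{x} = 22 - 1.015 \times 13 = 8.805$$

So the equation is y = 8.805 + 1.015x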



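The regression equation of X on Y is X = a2 + b2Y, where

$$b_2 = b_{xy} = \frac{\sum xy - n\,\bar{x}\,\bar{y}}{\sum y^2 - n\,\bar{y}^2} = \frac{3467 - 10 \times 13 \times 22}{5506 - 10 \times 22^2} = 0.911$$

$$a_2 = \bar{x} - b_2\bar{y} = 13 - 0.911 \times 22 = -7.042$$

So the regression equation of X on Y is

$$X = -7.042 + 0.911Y$$

The estimated supply when the price is 16 is obtained from the regression of Y on X:

$$\hat{Y} = 8.805 + 1.015 \times 16 = 25.045$$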

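
When only the totals are given, as in Exercise 9, both regression lines follow directly from the five sums. The helper below is a sketch with an illustrative name. Note that the intercept of the X-on-Y line is the mean of X minus the slope times the mean of Y. Full-precision output differs slightly from the hand calculation, which rounds each slope to three decimals before computing the intercept.

```python
def regression_lines_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Return ((a1, b1), (a2, b2)) for Y = a1 + b1*X and X = a2 + b2*Y."""
    x_bar, y_bar = sum_x / n, sum_y / n
    sxy = sum_xy - n * x_bar * y_bar
    b1 = sxy / (sum_x2 - n * x_bar ** 2)   # slope of Y on X
    b2 = sxy / (sum_y2 - n * y_bar ** 2)   # slope of X on Y
    a1 = y_bar - b1 * x_bar                # intercept of Y on X
    a2 = x_bar - b2 * y_bar                # intercept of X on Y
    return (a1, b1), (a2, b2)


# Exercise 9: price (X) and supply (Y), n = 10
(a1, b1), (a2, b2) = regression_lines_from_sums(10, 130, 220, 2288, 5506, 3467)
supply_at_16 = a1 + b1 * 16

print(round(b1, 3), round(a1, 3), round(b2, 3), round(a2, 3), round(supply_at_16, 3))
```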


Exercise # 10

Find the coefficient of correlation for the distribution in which the SD of X is 3.0 units, the SD of Y is 1.4 units and the coefficient of regression of Y on X is 0.28.



Solution: Exercise # 10

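Here, $\sigma_x = 3.0$, $\sigma_y = 1.4$ and $b_{yx} = 0.28$.

We know the correlation coefficient r is

$$r_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$$

Now we have

$$b_{yx} = \frac{\sigma_{xy}}{\sigma_x^2} \;\Rightarrow\; \sigma_{xy} = 0.28 \times 9 = 2.52$$

Now

$$r_{xy} = \frac{2.52}{3 \times 1.4} = 0.6$$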
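
Exercises 6 and 10 are both rearrangements of the same two identities, r = cov / (sd_x × sd_y) and b_yx = cov / sd_x². A short sketch with hypothetical function names makes the rearrangements explicit:

```python
from math import sqrt


def r_from_slope(b_yx, sd_x, sd_y):
    """r = b_yx * sd_x / sd_y, since cov = b_yx * sd_x**2 and r = cov / (sd_x * sd_y)."""
    return b_yx * sd_x / sd_y


def sd_y_from_r(cov, r, var_x):
    """Solve r = cov / (sd_x * sd_y) for sd_y."""
    return cov / (r * sqrt(var_x))


r_ex10 = r_from_slope(b_yx=0.28, sd_x=3.0, sd_y=1.4)     # Exercise 10
sd_y_ex6 = sd_y_from_r(cov=7.86, r=0.32, var_x=10)       # Exercise 6

print(round(r_ex10, 3), round(sd_y_ex6, 3))
```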



Spearman's Rank Correlation Coefficient

This method is applied in situations where a quantitative measure of certain qualitative factors, such as beauty, intelligence, judgment, leadership, colour or taste, cannot be fixed, but the individual observations can be arranged in a definite order called rank.



Definition: Suppose the ranks 1, 2, 3, ..., n are assigned to the x observations in order of magnitude and similarly to the y observations. Then the simple correlation coefficient between the two sets of ranks is called Spearman's rank correlation coefficient. It is denoted by R.

$$R = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$

[When there are no ties among either set of observations; d is the difference between the two ranks of an observation]

$$R = 1 - \frac{6\left[\sum d^2 + \frac{1}{12}\sum (m^3 - m)\right]}{n(n^2 - 1)}$$

[When there are ties among either set of observations, where m is the number of times an observation is repeated]
Properties of Rank Correlation Coefficient
- The rank correlation coefficient lies between -1 and +1.
- R = 1 when the ranks of x completely agree with the ranks of y.
- R = -1 when the ranks of x completely disagree with the ranks of y, that is, when they are in exactly the reverse order.
- Only method for finding the relationship between two qualitative variables.
- Only method for finding the relationship between two variables when only ranks are given.



Limitations
- Cannot be used to find the correlation for grouped data.



Example: A social scientist wants to see whether there is any association between intelligence and beauty among female students. To verify this he randomly selected 6 female students from a class. The scores on intelligence and beauty are found as follows:

| Student | A | B | C | D | E | F |
|---|---|---|---|---|---|---|
| Scores on intelligence | 80 | 75 | 90 | 70 | 65 | 60 |
| Scores on beauty | 65 | 70 | 60 | 75 | 85 | 80 |

Compute the rank correlation coefficient and comment.



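| X | Y | Rx | Ry | d = Rx - Ry | d² |
|---|---|---|---|---|---|
| 80 | 65 | 2 | 5 | -3 | 9 |
| 75 | 70 | 3 | 4 | -1 | 1 |
| 90 | 60 | 1 | 6 | -5 | 25 |
| 70 | 75 | 4 | 3 | 1 | 1 |
| 65 | 85 | 5 | 1 | 4 | 16 |
| 60 | 80 | 6 | 2 | 4 | 16 |
| Total | | | | | 68 |

With $\sum d^2 = 68$ and $n = 6$, and no tied ranks,

$$R = 1 - \frac{6 \times 68}{6(6^2 - 1)} = -0.943$$

Conclusion: There exists an almost perfect negative relationship between intelligence and beauty.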
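
The ranking and the no-ties formula R = 1 - 6Σd² / (n(n² - 1)) can be checked with a few lines of code. The function names below are illustrative, and rank 1 is given to the largest score, as in the table above.

```python
def ranks_descending(values):
    """Rank 1 goes to the largest value; assumes there are no ties."""
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 for v in values]


def spearman_no_ties(x, y):
    """R = 1 - 6*sum(d^2) / (n*(n^2 - 1)), valid only when no ranks are tied."""
    n = len(x)
    rx, ry = ranks_descending(x), ranks_descending(y)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


intelligence = [80, 75, 90, 70, 65, 60]
beauty = [65, 70, 60, 75, 85, 80]

R = spearman_no_ties(intelligence, beauty)
print(ranks_descending(intelligence), ranks_descending(beauty), round(R, 3))
```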



Example: (Repetitive Ranks)
The following data refer to the marks obtained by 8 students in Mathematics and Statistics.

| Marks in Mathematics | 20 | 80 | 40 | 62 | 28 | 20 | 45 | 60 |
|---|---|---|---|---|---|---|---|---|
| Marks in Statistics | 35 | 60 | 45 | 60 | 50 | 45 | 55 | 45 |

Compute the rank correlation coefficient and comment.



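| M | S | RM | RS | d = RM - RS | d² |
|---|---|---|---|---|---|
| 20 | 35 | 7.5 | 8 | -0.50 | 0.25 |
| 80 | 60 | 1 | 1.5 | -0.50 | 0.25 |
| 40 | 45 | 5 | 6 | -1.00 | 1.00 |
| 62 | 60 | 2 | 1.5 | 0.50 | 0.25 |
| 28 | 50 | 6 | 4 | 2.00 | 4.00 |
| 20 | 45 | 7.5 | 6 | 1.50 | 2.25 |
| 45 | 55 | 4 | 3 | 1.00 | 1.00 |
| 60 | 45 | 3 | 6 | -3.00 | 9.00 |
| Total | | | | | 18 |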

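
For the repeated-ranks example (marks in Mathematics and Statistics), a sketch of the computation is given below. It assumes the usual tie-corrected form R = 1 - 6[Σd² + Σ(m³ - m)/12] / (n(n² - 1)), with one correction term for each group of m tied values in either series, and it gives tied values the average of the ranks they occupy. The function names are illustrative.

```python
from collections import Counter


def average_ranks_descending(values):
    """Rank 1 goes to the largest value; tied values share the average of their ranks."""
    order = sorted(values, reverse=True)
    positions = {}
    for i, v in enumerate(order, start=1):
        positions.setdefault(v, []).append(i)
    return [sum(positions[v]) / len(positions[v]) for v in values]


def spearman_with_ties(x, y):
    """R = 1 - 6*(sum(d^2) + sum((m^3 - m)/12)) / (n*(n^2 - 1))."""
    n = len(x)
    rx, ry = average_ranks_descending(x), average_ranks_descending(y)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    # One correction term per group of m tied values, taken over both series
    correction = sum((m ** 3 - m) / 12
                     for series in (x, y)
                     for m in Counter(series).values() if m > 1)
    return 1 - 6 * (d2 + correction) / (n * (n ** 2 - 1))


maths = [20, 80, 40, 62, 28, 20, 45, 60]
stats = [35, 60, 45, 60, 50, 45, 55, 45]

R = spearman_with_ties(maths, stats)
print(average_ranks_descending(maths), average_ranks_descending(stats), R)
```

Here the tied groups are 20 (twice) in Mathematics, and 60 (twice) and 45 (three times) in Statistics, so the correction adds 0.5 + 0.5 + 2 = 3 to Σd² = 18.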


Exercise
The coefficient of correlation between the ages of husbands and wives in a community was found to be +0.8. The average age of husbands is 25 years and that of wives is 22 years. The standard deviations were 4 and 5 years respectively. Find, with the help of regression equations:
a. the expected age of the husband when the wife's age is 16 years, and
b. the expected age of the wife when the husband's age is 33 years.



Assignment:
Correlation: Illustration: 20, 31 and 32.
Regression: Illustration: 1,3, 6,11 25,22*
