Correlation and Regression-2023
Introduction to Correlation
In the chapter on measures of central tendency, we studied problems based on one variable, which is
called univariate analysis. In the real world, however, many problems involve two or more variables. The
study of the relationship between two variables is called bivariate analysis. The extent of the
relationship between the variables can be measured with the help of correlation, and the measure itself is
called the correlation coefficient. For example, there exists some relationship between the height of a
father and the height of his son, price and demand, wage and price index, yield and rainfall, height and
weight, and so on. Correlation is the statistical analysis which measures and analyses the degree or
extent to which two variables are associated, or the closeness between them.
Definition:
“Correlation analysis attempts to determine the degree of relationship between variables or the degree of
association between the variables.”
Thus, the association of any two variates is known as correlation. It depicts the relationship or
interdependence of two sets of variables in such a way that a change in one variable is accompanied by a
corresponding change in the other. The correlation coefficient is the numerical value showing the degree
of correlation between the variables. One variable is called the independent (subject) variable and the
other the dependent (relative) variable. For example, consider rainfall and agricultural production.
Rainfall affects agricultural production, while agricultural production cannot cause rainfall; thus
rainfall is the independent variable and production is the dependent variable.
Uses of Correlation
Correlation is used in the physical and social sciences as well as in business and economics.
1. Correlation is very useful to economists to study the relationship between variables, like price and
quantity demanded. For a businessman, it helps to estimate costs, sales, prices and other related
characteristics.
2. Correlation analysis helps in measuring the degree of relationship between the variables like
demand and supply, price and supply, income and expenditure, etc.
3. The measure of correlation can be further tested for significance in the research work.
4. The effect of correlation is to reduce the uncertainty of our prediction.
5. Correlation is the basis for the concept of regression and ratio of variation.
The correlation is said to be positive or direct if the variables move in the same direction, i.e., when
an increase (decrease) in one variable is accompanied by an increase (decrease) in the value of the other
variable. For example, price and supply, height and weight, etc.
FOR PRIVATE CIRCULATION ONLY
QUANTITATIVE TECHNIQUES II
If the two variables tend to move in opposite directions, so that an increase (decrease) in one variable
is accompanied by a decrease (increase) in the other variable, then the correlation is called negative,
inverse or indirect correlation. For example, price and demand, yield of crops and price, etc. In short, an
increase in one variable is associated with a decrease in the other variable and vice versa.
Correlation is a statistical technique used for analyzing the behavior of two or more variables.
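Karl Pearson, a reputed statistician, constructed a formula based on mathematical treatment for
determining the coefficient of correlation:
\[ r = \frac{\sum dx\,dy}{N\,\sigma_x\,\sigma_y} = \frac{\sum dx\,dy}{\sqrt{\sum dx^2 \cdot \sum dy^2}} \]
where dx = X − X̄ and dy = Y − Ȳ are the deviations of X and Y from their arithmetic means, N is the
number of pairs of observations, and σx and σy are the standard deviations of X and Y.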
The degree of correlation between the variables can be determined from the quantitative value of the
coefficient of correlation. On the basis of the formula given by Karl Pearson, we can state:
1. Perfect correlation: +1 (positive), -1 (negative)
7. No correlation: 0
X 1 2 3 4 5 6 7 8 9
Y 9 8 10 12 11 13 14 16 15
Solution : COMPUTATION OF COEFFICIENT OF CORRELATION
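Taking deviations from the means X̄ = 5 and Ȳ = 12 gives ∑dxdy = 57, ∑dx² = 60 and ∑dy² = 60, so
\[ r = \frac{\sum dx\,dy}{\sqrt{\sum dx^2 \cdot \sum dy^2}} = \frac{57}{\sqrt{60 \times 60}} = 0.95 \]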
Correlation coefficient is +0.95 and hence there is a very high degree of positive correlation.
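The computation above can be checked with a short sketch. It assumes the deviation form of Karl Pearson's formula, r = ∑dxdy / √(∑dx² · ∑dy²), with deviations taken from the actual means; the function name `pearson_r` is chosen here for illustration only.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r using deviations from the actual means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    sum_dx2 = sum(a * a for a in dx)
    sum_dy2 = sum(b * b for b in dy)
    return sum_dxdy / math.sqrt(sum_dx2 * sum_dy2)

# Data of the example above
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [9, 8, 10, 12, 11, 13, 14, 16, 15]

r = pearson_r(x, y)
print(round(r, 2))  # 0.95
```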
Example 2. The following table gives the marks obtained by A and B in ten tests during the year 2009-10.
Calculate the correlation coefficient.
Test No.              1   2   3   4   5   6   7   8   9  10
Marks in Statistics  70  54  27  52  21  35  90  25  56  60
Marks in Maths       35  58  60  40  50  40  35  56  34  42
Solution :
Let the marks in statistics be taken as X and that of marks in maths as Y.
Computation of coefficient of correlation
Deviations are taken from the means: dx = X − 49 and dy = Y − 45.
X     Y     dx    dy    dx²    dy²   dxdy
70    35    21    -10   441    100   -210
54    58    5     13    25     169   65
27    60    -22   15    484    225   -330
52    40    3     -5    9      25    -15
21    50    -28   5     784    25    -140
35    40    -14   -5    196    25    70
90    35    41    -10   1681   100   -410
25    56    -24   11    576    121   -264
56    34    7     -11   49     121   -77
60    42    11    -3    121    9     -33
∑X = 490, ∑Y = 450, ∑dx² = 4366, ∑dy² = 920, ∑dxdy = -1344
\[ r = \frac{\sum dx\,dy}{\sqrt{\sum dx^2 \cdot \sum dy^2}} = \frac{-1344}{\sqrt{4366 \times 920}} = -0.671 \]
The coefficient of correlation is -0.671 and hence there is a moderate degree of negative correlation.
Example 4. From the following table, find correlation coefficient between age and the playing habits of
students
Age (years)       15   16   17   18   19   20   21
No. of students   250  200  150  120  100  80   90
Regular players   200  150  90   48   30   12   50
What conclusions do you draw from the result obtained?
Solution :
First calculate the percentage of regular players in each age group and then calculate the correlation
coefficient. Let age be denoted by X and the percentage of regular players by Y.
The assumed mean is taken as 18 for the X series and 60 for the Y series.
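The two steps described in this solution can be sketched as follows. The sketch assumes the usual assumed-mean form of Pearson's coefficient, r = (N∑dxdy − ∑dx∑dy) / √[(N∑dx² − (∑dx)²)(N∑dy² − (∑dy)²)], which is not printed in the text above; the function name `r_assumed_mean` is illustrative.

```python
import math

ages = [15, 16, 17, 18, 19, 20, 21]             # X
students = [250, 200, 150, 120, 100, 80, 90]
players = [200, 150, 90, 48, 30, 12, 50]

# Step 1: Y = percentage of regular players in each age group
pct = [100 * p / s for p, s in zip(players, students)]

def r_assumed_mean(xs, ys, a, b):
    """Pearson's r with deviations taken from assumed means a (for X) and b (for Y)."""
    n = len(xs)
    dx = [x - a for x in xs]
    dy = [y - b for y in ys]
    s_dx, s_dy = sum(dx), sum(dy)
    s_dxdy = sum(u * v for u, v in zip(dx, dy))
    s_dx2 = sum(u * u for u in dx)
    s_dy2 = sum(v * v for v in dy)
    num = n * s_dxdy - s_dx * s_dy
    den = math.sqrt((n * s_dx2 - s_dx ** 2) * (n * s_dy2 - s_dy ** 2))
    return num / den

# Step 2: correlation with assumed means 18 (age) and 60 (percentage)
r = r_assumed_mean(ages, pct, 18, 60)
print(round(r, 3))
```

The choice of assumed means does not affect the result; any other pair gives the same r, which is why convenient round values are picked.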
Example 5 :
The following table gives the distribution of the population and those who are totally and partially blind
among them. Find out if there exists any relation between age and blindness.
Age                     0 – 10  10 – 20  20 – 30  30 – 40  40 – 50  50 – 60  60 – 70  70 – 80
No. of persons ('000)   100     80       50       40       35       20       16       8
No. of blind            60      50       49       38       28       39       20       14
Solution:
In order to make the data comparable, it is necessary to find the number of blind persons out of a fixed
number (a common unit). We have to find the number of blind persons per one lakh of population in each
group.
The first figure: (60/100) × 100 = 60
The second figure: (50/80) × 100 = 62.5
Here X is the mid-point of the class interval, Y is the number of blind persons per lakh, dx = X − 40 and
dy = Y − 111.312.
CI        Total persons ('000)   No. of blind   X    Y      dx    dx²    dy       dy²        dxdy
0 – 10    100                    60             5    60     -35   1225   -51.31   2633.024   1795.955
10 – 20   80                     50             15   62.5   -25   625    -48.81   2382.709   1220.325
20 – 30   50                     49             25   98     -15   225    -13.31   177.236    199.695
30 – 40   40                     38             35   95     -5    25     -16.31   266.114    81.565
40 – 50   35                     28             45   80     5     25     -31.31   980.504    -156.565
50 – 60   20                     39             55   195    15    225    83.687   7003.514   1255.305
60 – 70   16                     20             65   125    25    625    13.687   187.334    342.175
70 – 80   8                      14             75   175    35    1225   63.687   4056.034   2229.045
∑dx² = 4200, ∑dy² = 17686.47, ∑dxdy = 6967.5
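As a cross-check of the working table, the sketch below recomputes the blind-per-lakh figures and the coefficient from the raw data. It assumes the group sizes shown in the working table (in particular 20 thousand persons for the 50 – 60 group) and the deviation form of Pearson's formula; variable names are illustrative.

```python
import math

midpoints = [5, 15, 25, 35, 45, 55, 65, 75]            # X: class mid-points
persons_thousands = [100, 80, 50, 40, 35, 20, 16, 8]   # as in the working table
blind = [60, 50, 49, 38, 28, 39, 20, 14]

# Y: number of blind persons per lakh (per 100 thousand) in each group
per_lakh = [100 * b / p for b, p in zip(blind, persons_thousands)]

n = len(midpoints)
mean_x = sum(midpoints) / n
mean_y = sum(per_lakh) / n
dx = [x - mean_x for x in midpoints]
dy = [y - mean_y for y in per_lakh]

sum_dx2 = sum(a * a for a in dx)
sum_dy2 = sum(b * b for b in dy)
sum_dxdy = sum(a * b for a, b in zip(dx, dy))

r = sum_dxdy / math.sqrt(sum_dx2 * sum_dy2)
print(round(r, 3))  # 0.808
```

A value of this size would indicate a high degree of positive correlation between age and blindness.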
Example 7:
Calculate the coefficient of correlation by Pearson’s method between the density of population and the
death rate. Find significance of coefficient of correlation; also find the limits of probable error.
Cities A B C D E F
Density 200 500 400 700 600 300
Death rate 10 16 14 20 17 13
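A possible way to work this example is sketched below. It assumes the standard probable-error formula PE = 0.6745 (1 − r²)/√N and the conventional rule that r is regarded as significant when it exceeds 6 × PE; neither is printed in the text above, so treat both as assumptions of this sketch.

```python
import math

density = [200, 500, 400, 700, 600, 300]    # X
death_rate = [10, 16, 14, 20, 17, 13]       # Y

n = len(density)
mean_x = sum(density) / n
mean_y = sum(death_rate) / n
dx = [x - mean_x for x in density]
dy = [y - mean_y for y in death_rate]

r = sum(a * b for a, b in zip(dx, dy)) / math.sqrt(
    sum(a * a for a in dx) * sum(b * b for b in dy)
)

# Probable error of r (assumed formula): 0.6745 * (1 - r^2) / sqrt(N)
pe = 0.6745 * (1 - r ** 2) / math.sqrt(n)

# Conventional rule of thumb: r is significant if it exceeds 6 * PE
significant = abs(r) > 6 * pe
limits = (r - pe, r + pe)   # limits of the probable error

print(round(r, 4), round(pe, 4), significant)  # 0.9875 0.0068 True
```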
Example 9:
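If the covariance between X and Y variables is 15 and the variances of X and Y are respectively 25 and 9,
find the coefficient of correlation.
Solution:
Covariance = 15, Variance of X = σx² = 25, so σx = 5
Variance of Y = σy² = 9, so σy = 3
\[ r = \frac{\text{Covariance}}{\sigma_x \, \sigma_y} = \frac{15}{5 \times 3} = 1 \]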
There is perfect positive correlation between the variables.
Example 10: With the following data of cities, calculate the coefficient of correlation between the
D 60 42 840
E 120 72 1224
F 80 24 312
Solution:
r = 0.9875
Example 11
What inference do you draw when the correlation coefficient between the two variables is.
Solution:
(i) No correlation.
Rank correlation
In 1904, Charles Edward Spearman, a British psychologist, devised a method of ascertaining the
coefficient of correlation by ranks. This method is used for dealing with qualitative characteristics
such as intelligence, beauty, morality, character, etc., which cannot be measured quantitatively as
required by Karl Pearson's coefficient of correlation. Because it uses the ranks of the observations
rather than their values, the measure is also suitable when the data contain erratic, irregular, extreme
or inaccurate values, and it is not based on the assumption of normality of the data.
The rank correlation method gives only approximate results, as it uses ranks instead of the original
values. Rank correlation is applicable only to individual observations.
The formula for Spearman's rank correlation, which is denoted by r, is:
\[ r = 1 - \frac{6 \sum d^2}{N(N^2 - 1)} \]
where d is the difference between the two ranks of an item and N is the number of pairs of observations.
Common ranks
Sometimes two or more items have the same value and therefore the same rank. In such cases a common
rank is given to all the items having the same value, obtained by averaging the normal ranks which the
items would have received had they differed slightly from each other.
For example:
X 50 46 30 50 60 20 70 50
Ranks 4 6 7 4 2 8 1 4
The item 50 is repeated thrice. Against these items there are three ranks, i.e., 3, 4 and 5. If we take the
average of these ranks, we get the common rank 4 as under:
Common rank = (sum of normal ranks) / (no. of ranks pooled) = (3 + 4 + 5) / 3 = 4
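The averaging rule for common ranks can be sketched in a few lines. The sketch assumes that rank 1 goes to the largest value, as in the table above; the function name `average_ranks` is illustrative.

```python
def average_ranks(values):
    """Rank in descending order (largest value = rank 1); tied items share the mean of their ranks."""
    ranks = []
    for v in values:
        higher = sum(1 for w in values if w > v)   # items ranked ahead of v
        ties = sum(1 for w in values if w == v)    # items sharing v's value (including v itself)
        # The tied items occupy ranks higher+1 .. higher+ties; their mean is higher + (ties+1)/2
        ranks.append(higher + (ties + 1) / 2)
    return ranks

x = [50, 46, 30, 50, 60, 20, 70, 50]
print(average_ranks(x))  # [4.0, 6.0, 7.0, 4.0, 2.0, 8.0, 1.0, 4.0]
```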
When there are common ranks in the series, the correlation coefficient formula is modified by an
adjustment to ∑d².
If there are 'm' items in the series sharing a common rank, the quantity (m³ − m)/12 is added to ∑d²,
so that the formula becomes:
\[ r = 1 - \frac{6\left[\sum d^2 + \frac{1}{12}(m^3 - m)\right]}{N(N^2 - 1)} \]
If there is more than one such group of items with common ranks, the above value is added once for each
group, i.e., as many times as there are groups of tied items.
For example, if in the X series there are two items having the same value with the common rank 5.5
(i.e., the average of the 6th and 7th ranks), and in the Y series there are three items with the rank 4
(i.e., the 3rd, 4th and 5th) and four items with the rank 8.5 (i.e., the 7th, 8th, 9th and 10th), we have
to add to the value of ∑d² three times as under:
\[ \frac{1}{12}(2^3 - 2) + \frac{1}{12}(3^3 - 3) + \frac{1}{12}(4^3 - 4) \]
Thus for two items m = 2, for three items m = 3 and for four items m = 4.
Example 12: Ten students have obtained the following marks in Statistics and Economics. Calculate the
rank coefficient of correlation.
Statistics 28 30 45 60 90 88 65 55 76 50
Economics 50 40 80 90 20 40 45 66 44 70
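One way to work this example is sketched below. It assumes ranks are assigned in descending order of marks, that tied marks share the average rank, and that (m³ − m)/12 is added to ∑d² for each group of m tied items; the helper names are illustrative.

```python
from collections import Counter

def average_ranks(values):
    """Descending ranks; tied items share the mean of the ranks they occupy."""
    return [
        sum(1 for w in values if w > v) + (sum(1 for w in values if w == v) + 1) / 2
        for v in values
    ]

def tie_correction(values):
    """Sum of (m^3 - m)/12 over every group of m tied items."""
    return sum((m ** 3 - m) / 12 for m in Counter(values).values() if m > 1)

def spearman(xs, ys):
    n = len(xs)
    rx, ry = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    adjusted = sum_d2 + tie_correction(xs) + tie_correction(ys)
    return 1 - 6 * adjusted / (n * (n ** 2 - 1))

stats_marks = [28, 30, 45, 60, 90, 88, 65, 55, 76, 50]
econ_marks = [50, 40, 80, 90, 20, 40, 45, 66, 44, 70]

r = spearman(stats_marks, econ_marks)
print(round(r, 3))  # -0.461
```

Here the only tie is the pair of 40s in Economics (common rank 8.5), so ∑d² = 240.5 is increased by 0.5 before it enters the formula.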
Example 13: Calculate the rank correlation between the order of merit and years of service for the
following data.
Employee A B C D E F G H I
Shelf life in months 10 24 15 19 20 15 14 22 20
Actual usage (months) 9 25 18 15 10 11 19 22 18
Solution:
In the series X , there are two 20’s and hence their ranks are shared as: (3+4)/2=3.5
There are two 15’s and hence their ranks are shared as (6+7)/2=6.5
In the series Y, there are two 18’s and hence their ranks are shared as (4+5)/2=4.5
Example 14: Fifteen industries of the state have been ranked according to profit earned in 2007-2008 and
the working capital for that year. Calculate the rank correlation coefficient.
Industries A B C D E F G H I J K L M N O
Rank (profit) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Rank( working Capital) 14 5 15 13 12 10 11 9 7 8 1 6 4 3 2
Solution: The ranks are already assigned and hence just find the d and its square.
Computation of rank correlation coefficient
Rx Ry d d2
1 14 -13 169
2 5 -3 9
3 15 -12 144
4 13 -9 81
5 12 -7 49
6 10 -4 16
7 11 -4 16
8 9 -1 1
9 7 2 4
10 8 2 4
11 1 10 100
12 6 6 36
13 4 9 81
14 3 11 121
15 2 13 169
∑d² = 1000
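Substituting this total into Spearman's formula (assuming the untied form, since each rank from 1 to 15 appears exactly once in both series) gives the following worked step:

```latex
\[
r = 1 - \frac{6 \sum d^2}{N(N^2 - 1)}
  = 1 - \frac{6 \times 1000}{15\,(15^2 - 1)}
  = 1 - \frac{6000}{3360}
  \approx -0.786
\]
```

A value of this size would indicate a high degree of negative correlation between the profit ranks and the working-capital ranks.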
Example 15 : Ten competitors in a beauty contest are ranked by three judges in the following order:
First judge    1  6  5  10  3   2  4  9  7  8
Second judge   7  5  8  3   10  4  9  2  1  6
Third judge    5  6  7  3   2   6  1  8  9  10
Solution: The ranks are already assigned. The rank correlation should be found for each pair of judges
to ascertain which two judges are closest in their judgment.
Judge 1 is taken as X, judge 2 as Y and judge 3 as Z.
Computation of Rank correlation Coefficient
Rx Ry Rz d=Rx-Ry d² d=Ry-Rz d² d=Rz-Rx d²
1 7 5 -6 36 2 4 4 16
6 5 6 1 1 -1 1 0 0
5 8 7 -3 9 1 1 2 4
10 3 3 7 49 0 0 -7 49
3 10 2 -7 49 8 64 -1 1
2 4 6 -2 4 -2 4 4 16
4 9 1 -5 25 8 64 -3 9
9 2 8 7 49 -6 36 -1 1
7 1 9 6 36 -8 64 2 4
8 6 10 2 4 -4 16 2 4
∑d² = 262 (X, Y)   ∑d² = 254 (Y, Z)   ∑d² = 104 (Z, X)
Judge 3 and judge 1 are closest in their approach to judgment, as theirs is the only pair of rankings
that is positively correlated.
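The pairwise comparison can be sketched as follows, using the ranks exactly as printed in the table and the untied formula r = 1 − 6∑d² / (N(N² − 1)). The function name `spearman_from_ranks` is illustrative.

```python
judge1 = [1, 6, 5, 10, 3, 2, 4, 9, 7, 8]    # X
judge2 = [7, 5, 8, 3, 10, 4, 9, 2, 1, 6]    # Y
judge3 = [5, 6, 7, 3, 2, 6, 1, 8, 9, 10]    # Z, as printed in the table

def spearman_from_ranks(ra, rb):
    """Return (sum of squared rank differences, rank correlation) for two rank lists."""
    n = len(ra)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ra, rb))
    return sum_d2, 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

results = {
    "judges 1 and 2": spearman_from_ranks(judge1, judge2),
    "judges 2 and 3": spearman_from_ranks(judge2, judge3),
    "judges 3 and 1": spearman_from_ranks(judge3, judge1),
}

for pair, (sum_d2, r) in results.items():
    print(pair, sum_d2, round(r, 3))

closest = max(results, key=lambda pair: results[pair][1])
print(closest)  # judges 3 and 1
```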
Merits and demerits of Rank correlation
Merits
1. It is simple to understand and easy to calculate.
2. It is very useful in the case of data which are of qualitative nature, like intelligence, honesty,
beauty, efficiency, etc.
3. Ranks are assigned only in this method which becomes easy for computation.
Demerits
1. It cannot be used for quantitative distribution.
2. If the number of items is greater than 20, the method becomes tedious and requires a lot of time.
Exercise
1. Calculate the Karl Pearson’s coefficient of correlation and interpret the result for the deviations
from their mean of the given two series X and Y.
X -4 -3 -2 -1 0 1 2 3 4
Y 3 -3 -4 0 4 1 2 -2 -1
2. The data relating to import price(Y) and import quantity (X) in respect of a given commodity are
as under:
Year 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006
Import price        2  3  6  5  4  3   5  7  8  7
Quantity imported   6  5  4  5  7  10  9  7  8  9
Calculate Karl Pearson’s coefficient of correlation between x and y and comment on it.
3. Calculate the Karl Pearson’s coefficient of correlation from the following data, using 20 as the working
mean for price, and 70 as the working mean for demand:
Price 14 16 17 18 19 20 21 22 23
Demand 84 78 70 75 66 67 62 58 60
4. Given is the data relating to the aptitude scores and productivity index.
Aptitude scores 9 18 18 20 20 23
Productivity Index 33 23 33 42 29 32
Find the coefficient of correlation between aptitude scores and productivity index.
5. Given :
X series Y series
Arithmetic Mean 74.50 125.50
Assumed Mean 69.00 112.00
Standard Deviation 13.07 15.85
Summation of corresponding deviations of X and Y series = 2176.
Calculate the coefficient of correlation between the series.
6. From the following table calculate the coefficient of correlation by Karl Pearson’s method :
X 6 2 10 4 8
Y 9 - 5 8 7
Arithmetic means of x and Y series are 6 and 8 respectively.
Also find the probable error.
7. From the following data calculate coefficient of correlation between age and playing habits. How do
you interpret the result
Age 20 21 22 23 24 25
9. The marking of trainees in two skills, programming and analysis are as follows. What is the
coefficient of rank correlation?
Programming 3 5 8 4 7 10 2 1 6 9
Analysis 6 4 9 8 1 2 3 10 5 7
10. Calculate the rank correlation coefficient for the following table of marks of students in two
subjects.
First subject    80  64  54  49  48  35  32  29  20  18  15  10
Second subject   36  38  39  41  27  43  45  52  51  42  40  52
11. Ten competitors in a voice contest are ranked by three judges in the following orders:
First judge    1  6  5  10  3  2   4  9   7  8
Second judge   3  5  8  4   7  10  2  1   6  9
Third judge    6  4  9  8   1  2   3  10  5  7
12. Calculate the coefficient of correlation between age of cars and the annual maintenance cost and
comment:
Age of cars 2 4 6 7 8 10 12
Annual maintenance cost 1,600 1,500 1,800 1,900 1,700 2,100 2,000
13. Quotations of index number of security prices of a certain joint stock company and of prices of
preferences shares and the debentures are given below:
Preference share price   73.2  85.8  78.9  75.8  77.2  81.2  83.8
Debenture price          97.8  99.2  98.8  98.3  98.3  96.7  97.1
Calculate the rank coefficient of correlation between the preference shares prices and debenture prices.
14. Following are the scores of ten students in a class and their IQ. Use the method of rank correlation
to determine the relationship between scores and IQ.
Students 1 2 3 4 5 6 7 8 9 10
Scores 35 40 25 55 85 90 65 55 45 50
IQ 100 100 110 140 150 130 100 120 140 110
15. The average daily wages for working class in Nagpur is Rs.12 and for that in Delhi Rs.18, their
respective standard deviations are Rs.2 and Rs.3 and the coefficient of correlation is 0.67. Find the
most likely wage in Delhi corresponding to the wage of Rs.20 in Nagpur.
16. Given the following values, find the expected value of X when Y is 12
Average of X series = 25 Average of Y series = 22
18. The coefficient of correlation between marks obtained in Mathematics and marks obtained in English is
-0.4; the average marks are 80 and 50 respectively. The standard deviations of marks in Mathematics and
English are 15 and 10 respectively. Estimate the marks in Mathematics of a student who has secured 64
marks in English.
Answer
1. r = 0, 3. r=0.954 4. 0.034 5. 0.955 6. 11, 0.919 7. -0.071
8. r= -0.991 9 . -0.297 10. -0.685 11. I and II=-0.212: II and III=-0.297: I and III=0.636 12.
0.84 13.0.125 14. 0.47 15. Y20 = 26.04 18. 28.6 19. Marks in Maths =94
PART B
Introduction to Regression
Correlation measures the direction and the strength of the relationship between the variables, and so,
knowing the degree of association between them, we can predict the value of one variable from a given
value of the other. For example, demand and supply are correlated. We can find the expected demand for
a given supply in the market.
Regression analysis is widely used for deriving an appropriate functional relationship between the
variables. It helps us to estimate one variable, the dependent variable, from the other, the
independent variable. The prediction is based on the average relationship arrived at statistically by
regression analysis.
The literal meaning of regression is ‘moving backward’, ‘going back’ or ‘return to the mean value’.
“Regression is a technique which estimates the value of the unknown from the known values. Regression
is also defined as predicting or estimating the dependent values with the help of the independent values.”
In regression analysis there are two types of variables. The variable whose value is influenced or is to
be predicted is called the dependent variable, and the variable which influences the value or is used for
prediction is called the independent variable. In regression analysis the independent variable is also
known as the regressor, predictor or explanatory variable.
Uses of regression analysis
1. Regression analysis is used in almost every field where two or more related variables have the
tendency to go back to their averages. It is very useful for prediction in statistics, economics,
the natural and physical sciences and many other applied fields. It is widely adopted for
predicting sales, production or demand in any business entity planning for a better profit.
2. Regression analysis predicts the unknown values of one variable from the known values of the other.
3. We can calculate the coefficient of correlation with the help of the regression coefficients.
4. Regression analysis is used in the statistical estimation of demand curves, supply curves, production
functions, cost functions, consumption functions, etc.
Regression Equations
Regression equations are the algebraic expressions of the regression lines. Since there are two regression
lines, there are two regression equations. The regression equation of X on Y is used to describe the
variation in the values of X for given changes in Y, and the regression equation of Y on X is used to
describe the variation in the values of Y for given changes in X.
Regression Equation of Y on X
Yc = a + bX
In this equation ‘a’ and ‘b’ are two unknown constants (fixed numerical values) which determine the
position of the line completely. The constant ‘a’ determines the level of the fitted line (the
Y-intercept), and the constant ‘b’ determines its slope, i.e., the change in Y for a unit change in X.
Once the values of the constants ‘a’ and ‘b’ are obtained, the line is completely determined. These
values are obtained by the method of least squares, which states that the line should be drawn through
the plotted points in such a manner that the sum of the squares of the vertical deviations of the actual
Y values from the estimated Y values is the least. In other words, in order to obtain the line which fits
the points best, ∑(Y − Yc)² should be a minimum. Such a line is known as the line of best fit.
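The least-squares idea can be illustrated with a short sketch. It assumes the usual normal equations, ∑Y = Na + b∑X and ∑XY = a∑X + b∑X², solved for a and b, and borrows the X = 1, ..., 9 data from the first correlation example purely for illustration; the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Solve the normal equations for Yc = a + bX."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    a = (sy - b * sx) / n
    return a, b

def sum_squared_deviations(xs, ys, a, b):
    """Sum of (Y - Yc)^2 for the line Yc = a + bX."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [9, 8, 10, 12, 11, 13, 14, 16, 15]

a, b = fit_line(x, y)
print(round(a, 2), round(b, 2))  # 7.25 0.95

# Any other line gives a larger sum of squared vertical deviations
best = sum_squared_deviations(x, y, a, b)
print(best < sum_squared_deviations(x, y, a + 0.5, b))   # True
print(best < sum_squared_deviations(x, y, a, b - 0.1))   # True
```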
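1. Regression equation of X on Y
\[ X - \bar{X} = b_{xy}\,(Y - \bar{Y}), \qquad b_{xy} = r\,\frac{\sigma_x}{\sigma_y} = \frac{\sum dx\,dy}{\sum dy^2} \]
where the deviations dx and dy are taken from the arithmetic means of X and Y.
2. Regression equation of Y on X
\[ Y - \bar{Y} = b_{yx}\,(X - \bar{X}), \qquad b_{yx} = r\,\frac{\sigma_y}{\sigma_x} = \frac{\sum dx\,dy}{\sum dx^2} \]
The relation between the coefficient of correlation and the coefficients of regression is given by
\[ r = \pm\sqrt{b_{xy} \cdot b_{yx}} \]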
(c) If one regression coefficient is greater than unity, then the other regression coefficient must be
less than unity.
Example 1: Calculate the two regression equations of X on Y and Y on X from the data given below,
Demand 12 13 15 13 12 20 20
Supply 45 40 43 37 40 39 43
Demand is taken as X and supply as Y; dx = X − 15 and dy = Y − 41.
X    Y    dx   dx²   dy   dy²   dxdy
12   45   -3   9     4    16    -12
13   40   -2   4     -1   1     2
15   43   0    0     2    4     0
13   37   -2   4     -4   16    8
12   40   -3   9     -1   1     3
20   39   5    25    -2   4     -10
20   43   5    25    2    4     10
∑X = 105, ∑Y = 287, ∑dx = 0, ∑dx² = 76, ∑dy = 0, ∑dy² = 46, ∑dxdy = 1
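From these totals the two regression equations can be obtained as sketched below. The sketch assumes the deviation forms b_xy = ∑dxdy / ∑dy² and b_yx = ∑dxdy / ∑dx², with demand as X and supply as Y; variable names are illustrative.

```python
demand = [12, 13, 15, 13, 12, 20, 20]   # X
supply = [45, 40, 43, 37, 40, 39, 43]   # Y

n = len(demand)
mean_x = sum(demand) / n    # 15
mean_y = sum(supply) / n    # 41
dx = [x - mean_x for x in demand]
dy = [y - mean_y for y in supply]

sum_dxdy = sum(a * b for a, b in zip(dx, dy))   # 1
sum_dx2 = sum(a * a for a in dx)                # 76
sum_dy2 = sum(b * b for b in dy)                # 46

b_xy = sum_dxdy / sum_dy2   # regression coefficient of X on Y
b_yx = sum_dxdy / sum_dx2   # regression coefficient of Y on X

# X - mean_x = b_xy (Y - mean_y)  and  Y - mean_y = b_yx (X - mean_x), written as X = bY + a and Y = bX + a
a_xy = mean_x - b_xy * mean_y
a_yx = mean_y - b_yx * mean_x

print(f"X = {b_xy:.4f}Y + {a_xy:.3f}")  # X = 0.0217Y + 14.109
print(f"Y = {b_yx:.4f}X + {a_yx:.3f}")  # Y = 0.0132X + 40.803
```

Both coefficients are close to zero, which reflects the very weak relationship between the two series in this data set.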
EXAMPLE 2:
The following data relate to ages of husbands and wives. Obtain the two regression equations and
determine the most likely age of husbands for the age of wife 25 years and most likely age of wife age of
Age of husbands   27  25  29  28  30  33  37  35  40  42
Age of wives      24  20  27  25  24  28  34  28  44  38
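Solution: The age of the husband is taken as X and the age of the wife as Y. Deviations are taken from
the means: dx = X − 32.6 and dy = Y − 29.2.
X    Y    dx     dx²     dy     dy²      dxdy
27   24   -5.6   31.36   -5.2   27.04    29.12
25   20   -7.6   57.76   -9.2   84.64    69.92
29   27   -3.6   12.96   -2.2   4.84     7.92
28   25   -4.6   21.16   -4.2   17.64    19.32
30   24   -2.6   6.76    -5.2   27.04    13.52
33   28   0.4    0.16    -1.2   1.44     -0.48
37   34   4.4    19.36   4.8    23.04    21.12
35   28   2.4    5.76    -1.2   1.44     -2.88
40   44   7.4    54.76   14.8   219.04   109.52
42   38   9.4    88.36   8.8    77.44    82.72
∑X = 326, ∑Y = 292, ∑dx = 0, ∑dx² = 298.4, ∑dy = 0, ∑dy² = 483.6, ∑dxdy = 349.8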
Thus the likely age of husband for age of wife being 25 years is
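The estimate can be worked from the totals of the table, assuming the regression equation of X on Y in deviation form, X − X̄ = b_xy (Y − Ȳ) with b_xy = ∑dxdy / ∑dy²:

```latex
\[
b_{xy} = \frac{\sum dx\,dy}{\sum dy^2} = \frac{349.8}{483.6} \approx 0.7233
\]
\[
X = 32.6 + 0.7233\,(25 - 29.2) \approx 32.6 - 3.04 \approx 29.56 \text{ years}
\]
```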
The following data relates to advertising expenditure (in lakhs of rupees ) and their corresponding
Advertising expenditure   15  16  19  20  21  23
Sales                     9   12  17  23  21  26
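Regression equation – short cut method or deviations taken from assumed means
When the actual means of the X and Y series are fractions, the calculation of the deviations becomes
tedious, and hence the deviations are taken from assumed means. The values of the regression
coefficients are then
\[ b_{xy} = \frac{N\sum dx\,dy - \sum dx \sum dy}{N\sum dy^2 - (\sum dy)^2}, \qquad b_{yx} = \frac{N\sum dx\,dy - \sum dx \sum dy}{N\sum dx^2 - (\sum dx)^2} \]
where dx = x − A and dy = y − B, A and B being the assumed means of the X and Y series.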
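A small sketch of the short-cut method follows. It assumes the formula b_yx = (N∑dxdy − ∑dx∑dy) / (N∑dx² − (∑dx)²) and uses the advertising and sales figures quoted earlier; the assumed means 20 and 20 are arbitrary choices made here for illustration.

```python
advertising = [15, 16, 19, 20, 21, 23]   # X
sales = [9, 12, 17, 23, 21, 26]          # Y

def b_yx_shortcut(xs, ys, a, b):
    """Regression coefficient of Y on X with deviations from assumed means a (X) and b (Y)."""
    n = len(xs)
    dx = [x - a for x in xs]
    dy = [y - b for y in ys]
    num = n * sum(u * v for u, v in zip(dx, dy)) - sum(dx) * sum(dy)
    den = n * sum(u * u for u in dx) - sum(dx) ** 2
    return num / den

# Assumed means (hypothetical choice): A = 20 for X, B = 20 for Y
b_short = b_yx_shortcut(advertising, sales, 20, 20)

# Deviations from the actual means (19 and 18) give the same coefficient
b_actual = b_yx_shortcut(advertising, sales, 19, 18)

print(round(b_short, 4), round(b_actual, 4))  # 2.1087 2.1087
```

The coefficient is unaffected by the choice of assumed means, which is what makes the short-cut method safe to use.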
Example 4: A company wants to assess the impact of Exports on its annual profit. The following table
exports.
The annual profit for exports of Rs.12 is Y = 1.755(12) + 33.524 = Rs. 54.584 (in thousands)
Example 5: Calculate the coefficient of correlation and regression equations between X and Y series
from the following data:
X series Y series
Number of pairs of observation 15
Arithmetic mean 25 18
Sum of square of deviations from
arithmetic mean 286 136
summation of product deviation of X and Y series from their respective arithmetic mean = 169
Solution:
Let us express the given data in notation:
Example 6: From the following data on rainfall and the production of rice, find the most likely production
corresponding to the rainfall of 35”
Rain fall(inches) production(tonnes)
Mean 25 50
SD 6 8
Coefficient of correlation = +0.85
Solution: Rainfall is taken as X and production as Y.
Regression equation of X on Y Regression equation of Y on X
Example 7: The coefficient of correlation between the ages of boys and girls in a community was found
to be +0.89, the average of boys was 13 years and that of girls 10 years. Their standard deviations were 3
and 2 years respectively. Find with the help of regression equations:
(a) The expected age of boy when the girl’s age is 17.
(b) The expected age of girl when the boy’s age is 18.
Solution:
Let the boys age be X and girls age Y.
(b) The expected age of a girl when the boy’s age is 18 is
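Both parts of this example can be worked with the regression equations written in terms of the means, standard deviations and r, i.e. X − X̄ = r (σx/σy)(Y − Ȳ) and Y − Ȳ = r (σy/σx)(X − X̄). The sketch below assumes these standard forms and uses the figures quoted in the example; the function name `predict_from_summary` is illustrative.

```python
def predict_from_summary(value, mean_known, mean_target, sd_known, sd_target, r):
    """Regression estimate: mean_target + r * (sd_target / sd_known) * (value - mean_known)."""
    return mean_target + r * (sd_target / sd_known) * (value - mean_known)

r = 0.89
mean_boys, mean_girls = 13, 10    # X = boys' age, Y = girls' age
sd_boys, sd_girls = 3, 2

# (a) expected age of the boy when the girl's age is 17 (regression of X on Y)
boy_age = predict_from_summary(17, mean_girls, mean_boys, sd_girls, sd_boys, r)

# (b) expected age of the girl when the boy's age is 18 (regression of Y on X)
girl_age = predict_from_summary(18, mean_boys, mean_girls, sd_boys, sd_girls, r)

print(round(boy_age, 3), round(girl_age, 3))  # 22.345 12.967
```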
Example 8: The following calculations have been made for the closing prices of eight stocks (Y) on the
National Stock Exchange on a certain day, along with the volume of sales in thousands of shares (X). From
these calculations find the regression equation of volume of shares on stock prices.
∑X = 56, ∑Y = 40, ∑XY = 364, ∑X² = 524, ∑Y² = 256
Solution :
Regression equation of X on Y
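One way to obtain this equation directly from the given sums is sketched below. It assumes the raw-sum form of the regression coefficient, b_xy = (N∑XY − ∑X∑Y) / (N∑Y² − (∑Y)²), with N = 8 stocks; variable names are illustrative.

```python
n = 8
sum_x, sum_y = 56, 40
sum_xy, sum_x2, sum_y2 = 364, 524, 256

mean_x = sum_x / n   # 7.0
mean_y = sum_y / n   # 5.0

# Regression coefficient of X on Y from raw sums (assumed formula, see lead-in)
b_xy = (n * sum_xy - sum_x * sum_y) / (n * sum_y2 - sum_y ** 2)

# X - mean_x = b_xy (Y - mean_y), rewritten as X = b_xy * Y + a
a = mean_x - b_xy * mean_y

print(f"X = {b_xy:.1f}Y {a:+.1f}")  # X = 1.5Y -0.5
```

The sum ∑X² is not needed for the equation of X on Y; it would enter only the regression of Y on X.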
Exercise
1. The average daily wages for working class in Nagpur is Rs.12 and for that in Delhi Rs.18, their
respective standard deviations are Rs.2 and Rs.3 and the coefficient of correlation is 0.67. Find the
most likely wage in Delhi corresponding to the wage of Rs.20 in Nagpur.
3. The coefficient of correlation between marks obtained in Mathematics and marks obtained in English is
-0.4; the average marks are 80 and 50 respectively. The standard deviations of marks in Mathematics and
English are 15 and 10 respectively. Estimate the marks in Mathematics of a student who has secured 64
marks in English.
4. Prices indices of cotton and wool are given below for 6 months of a year. Obtain the equations of
regression between the indices.
Prices index of cotton (X) 78 77 85 88 87 82
Y 56 49 53 58 65 76 51
Determine the regression equations which may be associated with these values and calculate Karl
Pearson’s coefficient of correlation
6. The following table gives the aptitude test scores and the productivity indices of 10 workers
selected at random.
Aptitude Scores(X) 60 62 65 70 72 48 53 73 65 82
Productivity index(Y) 68 60 62 80 85 40 52 62 60 81
Calculate the two regression equations and estimate the productivity index of a worker whose test score
is 92.
7. To study the relationship between expenditure on accommodation X and expenditure on food and
entertainment Y , an enquiry into 50 families gave the following results:
∑X=8500, ∑y =9600, σx = 60 , σy = 20 and r = 0.6
Estimate the expenditure on food and entertainment when expenditure on accommodation is Rs.200.
8. The following are the data on business turnover and the staff of a company for eight years from
2002 to 2009:
Years 2002 2003 2004 2005 2006 2007 2008 2009
Fit a proper regression equation to estimate manpower in terms of business turnover. Estimate the
staff requirement when the business turnover reaches Rs.200 crores.
9. Calculate the two regression equations of X on Y and Y on X from the data given below taking
deviations from actual means of X and Y:
Price (Rs) 10 12 13 12 16 15
Amount demanded 40 38 43 45 37 43
Estimate the likely demand when the price is Rs.20.
10. An industrial engineer collected the following data on experience and performance rating of 8
operators:
Operators            1   2   3   4   5   6   7   8
Experience (years)   16  12  18  4   3   10  5   12
Performance rating   87  88  89  68  58  80  70  85
(a) Does the data give evidence that experience improves performance?
(b) Estimate the performance rating of an operator having (a) 9 years (b) 15 years of experience.
Answer
1. Y 20 = 26.04 2. 28.6 3. Marks in Maths =94 4. X=4.78Y+42.084, Y = 0.265X+63.365