Regression and Correlation (Ch.14)
14
SIMPLE REGRESSION AND CORRELATION
Discuss the term regression.
The dependence of one variable on one or more other variables is called regression.
OR
The process in which we estimate one variable on the basis of another variable is called regression.
Note: The term regression was introduced by the English biometrician, Sir Francis Galton in 1885.
The term regression means “to step back or to regress”.
Define regression analysis.
In regression analysis, we obtain an equation that can be used to estimate the values of the
dependent variable on the basis of an independent variable whose values are known.
OR
The technique used to develop the equation and provide the estimates is called regression analysis.
Explain the simple linear regression model.
The simple linear regression model is 𝑌 = 𝛼 + 𝛽𝑋, where X is the independent variable and Y is the dependent variable.
In this model, α is the y-intercept and β is the slope of the line, also called the regression coefficient.
Define independent variable or Regressor.
A variable whose values do not depend on any other variable is called an independent variable or
regressor. OR
The variable that provides the basis for estimation or prediction is called the independent variable or
regressor.
Define regressand or dependent variable.
A variable whose values depend on the values of another variable is called a dependent variable or
regressand. OR
The variable that we want to estimate or predict on the basis of the independent variable is
called the dependent variable or regressand.
What is a scatter diagram?
The graphical representation of the paired observations $(x_i, y_i)$, plotted as points on the XY-plane, is called a scatter diagram.
What is meant by y-intercept or intercept?
The y-intercept, or simply the intercept, is the value of y when x = 0, i.e. the point where the line crosses the y-axis.
Define slope or regression coefficient for the simple regression line.
What is regression coefficient?
The slope or regression coefficient is the rate of change in the dependent variable per unit change in the
independent variable. It is denoted by β.
Give the properties of regression coefficient.
1. The regression coefficients $b_{yx}$ and $b_{xy}$ always have the same sign.
2. If the regression coefficient $b_{yx}$ is greater than 1, then $b_{xy}$ will be less than 1.
5. The correlation coefficient is the geometric mean of the two regression coefficients: $r = \pm\sqrt{b_{yx} \times b_{xy}}$, where r takes the common sign of the two coefficients.
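The properties listed above can be traced to one identity. The short derivation below is a sketch that assumes the usual least squares definitions of $b_{yx}$ and $b_{xy}$ in terms of deviations from the means (the same definitions used in the worked examples of this chapter):

```latex
\begin{align*}
b_{yx} \times b_{xy}
  &= \frac{\sum (X-\bar{X})(Y-\bar{Y})}{\sum (X-\bar{X})^2}
     \times
     \frac{\sum (X-\bar{X})(Y-\bar{Y})}{\sum (Y-\bar{Y})^2} \\
  &= \frac{\left[\sum (X-\bar{X})(Y-\bar{Y})\right]^2}
          {\sum (X-\bar{X})^2 \, \sum (Y-\bar{Y})^2}
   = r^2
\end{align*}
```

Both coefficients share the numerator $\sum (X-\bar{X})(Y-\bar{Y})$ and have positive denominators, so they carry the same sign. Because $r^2 \le 1$, their product cannot exceed 1, so if one coefficient is greater than 1 in magnitude the other must be less than 1.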
What do you understand by simple linear regression?
When the dependent variable Y is expressed as a linear function of a single independent
variable, the relationship is called simple linear regression.
OR
A model in which the dependent variable depends on a single independent variable is called a simple linear
regression model. The simple linear regression model is $Y = a + bX + e_i$,
where $Y$ = dependent variable;
$X$ = independent variable;
$a$ = intercept, i.e. the average value of Y when X = 0;
$b$ = regression coefficient (coefficient of the independent variable), i.e. the slope of the regression line;
$e_i$ = random error.
List the properties of the regression line.
1. The least squares regression line always passes through the point of means, i.e. $(\bar{X}, \bar{Y})$.
5. The sum of squared deviations of the observed values from the estimated values is minimum,
i.e. $\sum (Y - \hat{Y})^2$ = minimum.
Define the principle of least squares for fitting a regression line.
OR
State the principle of least squares.
The principle of least squares states that the sum of squared deviations of the observed
values from the estimated values should be least or minimum.
OR
The principle of the method of least squares consists of determining the values of the unknown
parameters that will minimize the sum of squares of errors or residuals.
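The least squares idea can be illustrated with a short Python sketch. The data below are hypothetical values chosen only for illustration, and the function names are assumptions of this sketch, not part of the chapter. The code estimates a and b from deviations about the means and then computes the sum of squared residuals.

```python
# Hypothetical paired observations, used only to illustrate the method.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]


def fit_least_squares(xs, ys):
    """Return (a, b) for the fitted line Y-hat = a + b*X."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-products and sum of squares of deviations from the means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx            # slope (regression coefficient of Y on X)
    a = y_bar - b * x_bar    # intercept; forces the line through (x_bar, y_bar)
    return a, b


def sum_squared_residuals(a, b, xs, ys):
    """Sum of (Y - Y-hat)^2 for the line Y-hat = a + b*X."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


a, b = fit_least_squares(xs, ys)
sse = sum_squared_residuals(a, b, xs, ys)
print(f"a = {a:.2f}, b = {b:.2f}, SSE = {sse:.2f}")
```

Changing a or b slightly away from the fitted values and recomputing `sum_squared_residuals` gives a larger total, which is what "minimum" means in the principle.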
Define intercept of straight line or regression line.
In a regression model, the intercept is the average value of the dependent variable when the
independent variable is zero. In the simple linear regression model $Y = a + bX + e_i$,
$a$ = intercept, i.e. the average value of Y when X = 0.
The correlation is said to be positive if the two random variables tend to move in the same
direction i.e. increase or decrease simultaneously.
The correlation is said to be negative if the two random variables tend to move in opposite
directions, i.e. one random variable increases as the other random variable decreases.
$r_{xy} = \dfrac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \times \sum (Y - \bar{Y})^2}}$
Given $\sum (X - \bar{X})(Y - \bar{Y}) = 92$, $\sum (X - \bar{X})^2 = 170$, $\sum (Y - \bar{Y})^2 = 140$ and $n = 10$, find $b_{yx}$, $b_{xy}$ and $r_{xy}$.
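Solution:
$\sum (X - \bar{X})(Y - \bar{Y}) = 92$, $\sum (X - \bar{X})^2 = 170$, $\sum (Y - \bar{Y})^2 = 140$, $n = 10$, $b_{xy} = ?$, $b_{yx} = ?$, $r_{xy} = ?$
$b_{yx} = \dfrac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2} = \dfrac{92}{170} = 0.54$
$b_{xy} = \dfrac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (Y - \bar{Y})^2} = \dfrac{92}{140} = 0.66$
$r_{xy} = \dfrac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \times \sum (Y - \bar{Y})^2}} = \dfrac{92}{\sqrt{170 \times 140}} = 0.60$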
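The arithmetic for the sums 92, 170 and 140 can be checked with a few lines of Python. This is only a sketch; the variable names are chosen here for illustration. It also checks that the correlation coefficient agrees with the geometric mean of the two regression coefficients.

```python
import math

# Sums given in the problem statement.
sum_xy_dev = 92.0      # sum of (X - X_bar)(Y - Y_bar)
sum_x_dev_sq = 170.0   # sum of (X - X_bar)^2
sum_y_dev_sq = 140.0   # sum of (Y - Y_bar)^2

b_yx = sum_xy_dev / sum_x_dev_sq   # regression coefficient of Y on X
b_xy = sum_xy_dev / sum_y_dev_sq   # regression coefficient of X on Y
r_xy = sum_xy_dev / math.sqrt(sum_x_dev_sq * sum_y_dev_sq)

# Geometric mean of the two coefficients; positive because both are positive.
r_from_b = math.sqrt(b_yx * b_xy)

print(f"{b_yx:.2f} {b_xy:.2f} {r_xy:.2f}")  # prints "0.54 0.66 0.60"
```

Note that n = 10 is not needed here, because the factor n cancels in each of the three ratios.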
Solution:
byx = 1.15
$r_{xy} = \dfrac{S_{xy}}{S_x \cdot S_y}$
$0.8 = \dfrac{20}{(4) \cdot S_y}$
$S_y = \dfrac{20}{(4)(0.8)} = 6.25$
$r_{xy} = \dfrac{\sum XY - n\bar{X}\bar{Y}}{n S_x S_y}$
$r_{xy} = \dfrac{350 - (10)(5)(6)}{(10)(2)(3)}$
$r_{xy} = \dfrac{350 - 300}{60}$
$r_{xy} = \dfrac{50}{60}$
$r_{xy} = 0.8333$
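The same substitution can be written as a small Python sketch. The variable names are illustrative, and the values are the ones substituted in the working (ΣXY = 350, n = 10, means 5 and 6, standard deviations 2 and 3).

```python
# Values substituted in the working.
n = 10
sum_xy = 350.0             # sum of the products X*Y
x_bar, y_bar = 5.0, 6.0    # means of X and Y
s_x, s_y = 2.0, 3.0        # standard deviations of X and Y

# Numerator: sum of XY minus n * X_bar * Y_bar.
cross_product_sum = sum_xy - n * x_bar * y_bar

# Correlation coefficient from the shortcut formula.
r_xy = cross_product_sum / (n * s_x * s_y)

print(f"{r_xy:.4f}")
```

A value of r outside the range -1 to +1 would signal an inconsistency in the given figures, so this kind of check is a useful habit.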
Given $S_x = 2.12$, $S_y = 2.34$, $r_{xy} = 0.605$ and $\sum (X - \bar{X})(Y - \bar{Y}) = 24$, compute the
number of pairs.
Solution:
$r_{xy} = \dfrac{\sum (X - \bar{X})(Y - \bar{Y})}{n S_x S_y}$
$0.605 = \dfrac{24}{n (2.12)(2.34)}$
$0.605 = \dfrac{24}{4.96\, n}$
$n = \dfrac{24}{4.96 \times 0.605}$
$n = 8.00 \approx 8$
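As a cross-check, the number of pairs can be recomputed in Python. This sketch assumes the relation r = Σ(X − X̄)(Y − Ȳ) / (n·Sx·Sy), which is the same relation as the shortcut formula rxy = (ΣXY − nX̄Ȳ) / (n·Sx·Sy), with Sx and Sy computed using divisor n. The variable names are illustrative.

```python
# Values given in the problem statement.
s_x, s_y = 2.12, 2.34   # standard deviations of X and Y
r_xy = 0.605            # correlation coefficient
sum_xy_dev = 24.0       # sum of (X - X_bar)(Y - Y_bar)

# Rearranging r = sum_xy_dev / (n * s_x * s_y) to solve for n.
n_exact = sum_xy_dev / (r_xy * s_x * s_y)

# n must be a whole number of pairs, so round to the nearest integer.
n_pairs = round(n_exact)

print(f"n = {n_exact:.2f}, rounded to {n_pairs}")
```

Under this assumption the given figures are consistent with 8 pairs of observations.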