Module 3 Regression Notes
Regression
Regression is the measure of the average relationship between two or more variables in terms of the
original units of the data.
Regression Analysis
The term regression analysis refers to the methods by which estimates are made of the values of a
variable from a knowledge of the values of one or more other variables and to the measurement of
the errors involved in this estimation process.
Regression Co-efficient
The rate of change of one variable for a unit change in the other variable is called the regression
coefficient of the former on the latter. Since there are two regression lines, there are two regression
coefficients. The rate of change of x for a unit change in y is called the regression coefficient of x on y.
It is the coefficient of y in the regression equation when the equation is written in the form x = a + by,
and it is denoted by bxy. Similarly, the regression coefficient of y on x, denoted by byx, is the
coefficient of x in the equation y = a + bx.
Uses of Regression Analysis
1) Regression analysis is used in all fields where two or more related variables tend to go back
to the average.
2) It predicts the values of dependent variables from the known values of independent
variables.
3) It is used in the statistical estimation of demand curves, supply curves, production functions,
cost functions, consumption functions, etc.
4) It can be used to study the correlation between variables.
Regression Lines
MISS LAXMI RAJPUT
FY BMS SEMESTER I BUSINESS STATISTICS
If we take two variables X and Y, we shall have two regression lines: the regression of X on Y and
the regression of Y on X. The regression line of Y on X gives the most probable values of Y for given
values of X, and the regression line of X on Y gives the most probable values of X for given values of Y.
However, when there is either perfect positive or perfect negative correlation between the two variables,
the regression lines will coincide, i.e., we will have only one line.
The regression line is the line that best fits the data, such that the overall distance from the line to
the points (variable values) plotted on a graph is the smallest. In other words, the regression line is
the line that minimizes the sum of the squared deviations of the observed values from the predicted values.
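The least-squares idea above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function names `fit_line` and `sum_squared_deviations` and the sample data are assumptions made for the example.

```python
def fit_line(xs, ys):
    """Return (a, b) of the least-squares line ys = a + b * xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations and sum of squared deviations of xs
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx               # slope
    a = mean_y - b * mean_x     # intercept
    return a, b


def sum_squared_deviations(xs, ys, a, b):
    """Sum of squared vertical deviations of the points from the line a + b * x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Illustrative data (hypothetical, chosen only for easy arithmetic)
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

a, b = fit_line(X, Y)
best = sum_squared_deviations(X, Y, a, b)
# Moving the line away from the fitted position increases the sum of squares
worse = sum_squared_deviations(X, Y, a + 0.5, b)
print(a, b, best, worse)
```

Shifting the fitted line up by 0.5 is just one way to see that any other line gives a larger sum of squared deviations.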
The farther the two regression lines are from each other, the lesser is the degree of correlation, and the
nearer the two regression lines are to each other, the higher is the degree of correlation. If the variables
are independent, r is zero and the lines of regression are at right angles, i.e., parallel to OX and OY.
For two variables X and Y, there are always two lines of regression:
Regression line of X on Y:
gives the best estimate of the value of X for any given value of Y.
X = a + bY
a = X-intercept
b = Slope of the line (regression coefficient of X on Y, bxy)
X = Dependent variable
Y = Independent variable
Regression line of Y on X:
gives the best estimate of the value of Y for any given value of X.
Y = a + bX
a = Y-intercept
b = Slope of the line (regression coefficient of Y on X, byx)
Y = Dependent variable
X = Independent variable
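As a rough sketch of how the two lines are obtained and used, the Python below fits both lines to the same data and makes one estimate from each. The helper name `regression_line` and the sample data are assumptions made for the illustration.

```python
def regression_line(dep, indep):
    """Return (a, b) for the least-squares line dep = a + b * indep."""
    n = len(dep)
    mean_dep = sum(dep) / n
    mean_indep = sum(indep) / n
    s_cross = sum((d - mean_dep) * (i - mean_indep) for d, i in zip(dep, indep))
    s_indep = sum((i - mean_indep) ** 2 for i in indep)
    b = s_cross / s_indep
    a = mean_dep - b * mean_indep
    return a, b


# Illustrative data (hypothetical)
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

a_yx, b_yx = regression_line(Y, X)   # Y on X: Y = a_yx + b_yx * X
a_xy, b_xy = regression_line(X, Y)   # X on Y: X = a_xy + b_xy * Y

estimate_y = a_yx + b_yx * 6   # best estimate of Y when X = 6
estimate_x = a_xy + b_xy * 3   # best estimate of X when Y = 3
print(estimate_y, estimate_x)
```

The two calls differ only in which variable is treated as dependent, which is why the same data generally yields two different lines.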
r = ±√(byx × bxy), where r takes the same sign as the regression coefficients.
If byx is positive, then bxy must also be positive, and vice versa.
If one regression coefficient is greater than one, the other must be less than one, because their
product equals r², which cannot exceed one. The coefficient of correlation will have the same sign as
the regression coefficients.
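These properties can be checked numerically. The sketch below is an illustration only: the function names `slope` and `correlation_from_coefficients` and both data sets are assumptions, not part of the notes.

```python
import math


def slope(dep, indep):
    """Regression coefficient of dep on indep (slope of the least-squares line)."""
    n = len(dep)
    mean_dep = sum(dep) / n
    mean_indep = sum(indep) / n
    s_cross = sum((d - mean_dep) * (i - mean_indep) for d, i in zip(dep, indep))
    s_indep = sum((i - mean_indep) ** 2 for i in indep)
    return s_cross / s_indep


def correlation_from_coefficients(b_yx, b_xy):
    """r = +/- sqrt(byx * bxy), taking the common sign of the two coefficients."""
    if b_yx * b_xy < 0:
        raise ValueError("regression coefficients must have the same sign")
    return math.copysign(math.sqrt(b_yx * b_xy), b_yx)


# Illustrative data (hypothetical): one rising and one falling relationship
X = [1, 2, 3, 4, 5]
Y_up = [2, 4, 5, 4, 5]
Y_down = [5, 4, 5, 4, 2]

r_up = correlation_from_coefficients(slope(Y_up, X), slope(X, Y_up))
r_down = correlation_from_coefficients(slope(Y_down, X), slope(X, Y_down))
print(r_up, r_down)
```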
The arithmetic mean of byx and bxy is equal to or greater than the coefficient of correlation:
(byx + bxy) / 2 ≥ r
Regression coefficients are independent of a change of origin but not of a change of scale.
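A small numerical check of the last two properties, written as a sketch: the helper `slope`, the shift amounts (+100, -50), the scale factor (10) and the sample data are all assumptions chosen for the illustration.

```python
import math


def slope(dep, indep):
    """Regression coefficient of dep on indep."""
    n = len(dep)
    mean_dep = sum(dep) / n
    mean_indep = sum(indep) / n
    s_cross = sum((d - mean_dep) * (i - mean_indep) for d, i in zip(dep, indep))
    s_indep = sum((i - mean_indep) ** 2 for i in indep)
    return s_cross / s_indep


# Illustrative data (hypothetical)
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

b_yx = slope(Y, X)
b_xy = slope(X, Y)
r = math.sqrt(b_yx * b_xy)          # both coefficients are positive here
arithmetic_mean = (b_yx + b_xy) / 2

# Change of origin: adding constants to X and Y leaves byx unchanged
shifted = slope([y - 50 for y in Y], [x + 100 for x in X])

# Change of scale: multiplying X by 10 divides byx by 10
scaled = slope(Y, [10 * x for x in X])

print(arithmetic_mean, r, b_yx, shifted, scaled)
```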
Standard Error of Estimate is the measure of variation around the computed regression line.
The standard error of estimate (SE) of Y measures the variability of the observed values of Y around the
regression line. It gives us a measure of the scatter of the observations about the line of regression.
Y = Observed value of Y
Ye = Estimated value of Y from the regression equation, corresponding to each observed Y value
a = Y intercept.
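One common way to compute the standard error of estimate is the square root of the average squared deviation of Y from Ye. The sketch below assumes that form. The divisor is an assumption: many introductory texts divide by N, while others divide by N - 2, so the hypothetical `ddof` parameter lets either convention be used. The data and the line Ye = 2.2 + 0.6X are illustrative only.

```python
import math


def standard_error_of_estimate(observed, estimated, ddof=0):
    """sqrt( sum((Y - Ye)^2) / (N - ddof) ); ddof=0 divides by N, ddof=2 by N - 2."""
    n = len(observed)
    sum_sq = sum((y - ye) ** 2 for y, ye in zip(observed, estimated))
    return math.sqrt(sum_sq / (n - ddof))


# Illustrative data (hypothetical) and the regression line of Y on X fitted to it
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
Ye = [2.2 + 0.6 * x for x in X]   # estimated values from Y = 2.2 + 0.6X

se_n = standard_error_of_estimate(Y, Ye)                 # divisor N
se_n_minus_2 = standard_error_of_estimate(Y, Ye, ddof=2)  # divisor N - 2
print(se_n, se_n_minus_2)
```

A smaller value means the observed points lie closer to the regression line.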
Regression Coefficient of Y on X: The symbol byx is used for the coefficient that measures the change in Y
corresponding to a unit change in X. Symbolically, it can be represented as:
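A standard textbook form is given here, assuming r denotes the coefficient of correlation and σx, σy the standard deviations of X and Y (symbols these notes do not define explicitly):

```latex
b_{yx} = r \, \frac{\sigma_y}{\sigma_x}
```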
In case the deviations are taken from the actual means, the following formula is used:
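The usual form is given here, assuming x and y denote deviations from the actual means (notation introduced for this sketch):

```latex
b_{yx} = \frac{\sum xy}{\sum x^{2}},
\qquad x = X - \bar{X}, \quad y = Y - \bar{Y}
```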
The byx can be calculated by using the following formula when the deviations are taken from the
assumed means:
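The usual form is given here, assuming dx and dy denote deviations from assumed means A and B, and N is the number of pairs of observations (notation introduced for this sketch):

```latex
b_{yx} = \frac{N \sum d_x d_y - \sum d_x \sum d_y}{N \sum d_x^{2} - \left( \sum d_x \right)^{2}},
\qquad d_x = X - A, \quad d_y = Y - B
```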
The regression coefficient is also called the slope coefficient because it determines the slope of the
line, i.e., the change in the dependent variable for a unit change in the independent variable.