Regression
Sekhar
REGRESSION ANALYSIS
What is regression?
Regression analysis is a statistical method which gives the relationship between a dependent variable and one or more independent variables.
The main aim of regression analysis is to describe the relationship between the variables, including its nature and strength, and to make predictions.
(OR)
Regression analysis can also be viewed as a supervised learning technique, where supervised learning is the analysis or prediction of data based on previously available data.
For supervised learning, we have both training data and test data. Regression analysis
is one of the statistical methods for the analysis and prediction of data.
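1. Simple Linear Regression
It is one of the most basic forms of regression analysis. In simple linear regression, there is a single independent variable.
This linear regression model gives the linear relationship between the independent variable and the dependent variable:
y=ax+b
where x is the independent variable, y is the dependent variable, a is the slope, and b is the intercept.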
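The line y=ax+b can be fitted to data by least squares. The following is a minimal sketch, assuming NumPy is available; the sample data and the helper name `predict` are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical sample data (not from these notes): x is the independent
# variable and y the dependent variable; the points lie on y = 2x + 1.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.0, 5.0, 7.0, 9.0, 11.0])

# With deg=1, np.polyfit returns the least-squares slope a and intercept b.
a, b = np.polyfit(x, y, deg=1)


def predict(x_new):
    """Predict y for a new x using the fitted line y = a*x + b."""
    return a * x_new + b


print(f"slope a = {a:.3f}, intercept b = {b:.3f}")
print(f"prediction at x = 6: {predict(6.0):.3f}")
```

Because the hypothetical points lie exactly on a line, the fitted slope and intercept recover that line; with noisy data they would be the least-squares estimates instead.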
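2. Multiple Linear Regression
Multiple linear regression analysis gives the relationship between two or more independent variables and a dependent variable.
It is similar to simple linear regression; the major difference is the number of independent variables.
Multiple linear regression analysis is used in fields such as real estate and finance.
y=a0x1+a1x2+a2x3+....+b
where x1, x2, x3, ... are the independent variables, a0, a1, a2, ... are their coefficients, and b is the intercept.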
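A model of the form y=a0x1+a1x2+a2x3+b can be fitted with an ordinary least-squares solve. This is a sketch under stated assumptions: NumPy is available, and the data matrix `X` and the coefficients used to generate `y` are hypothetical.

```python
import numpy as np

# Hypothetical data: each row is one observation, columns are x1, x2, x3.
X = np.array([
    [1.0, 2.0, 0.0],
    [2.0, 0.0, 1.0],
    [3.0, 1.0, 2.0],
    [4.0, 3.0, 1.0],
    [5.0, 2.0, 3.0],
    [6.0, 4.0, 2.0],
])
# Assumed "true" relationship used only to generate y: 2*x1 + 0.5*x2 - x3 + 4
y = X @ np.array([2.0, 0.5, -1.0]) + 4.0

# Append a column of ones so the last fitted parameter is the intercept b.
A = np.column_stack([X, np.ones(len(X))])
solution, _, _, _ = np.linalg.lstsq(A, y, rcond=None)
coeffs, intercept = solution[:-1], solution[-1]

print("coefficients a0, a1, a2:", np.round(coeffs, 3))
print("intercept b:", round(float(intercept), 3))
```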
Dr. K. R. Sekhar
3. Polynomial Regression
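Polynomial regression analysis helps with flexible curve fitting of the data. The polynomial regression equation is shown below.
y=a0+a1x+a2x^2+...+anx^n
where x is the independent variable, y is the dependent variable, and a0, a1, ..., an are the coefficients.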
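As an illustration of the polynomial equation with n = 2, the sketch below fits a quadratic by least squares. It assumes NumPy; the data are hypothetical. `numpy.polynomial.polynomial.polyfit` is used because it returns the coefficients in increasing order (a0, a1, a2), matching the notation above.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Hypothetical data generated from y = 1 + 2x + 3x^2 (illustration only).
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x + 3.0 * x**2

# Least-squares fit of a degree-2 polynomial; result is [a0, a1, a2].
coeffs = P.polyfit(x, y, deg=2)

# Evaluate the fitted polynomial at the original x values.
y_hat = P.polyval(x, coeffs)

print("a0, a1, a2 =", np.round(coeffs, 3))
```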
4. Exponential Regression
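Exponential regression fits data in which the dependent variable grows or decays exponentially with the independent variable.
y=ae^(bx)
where x is the independent variable, y is the dependent variable, and a and b are constants.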
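One common way to fit y=ae^(bx) is to take logarithms, ln(y) = ln(a) + bx, and fit a straight line to (x, ln y). The sketch below assumes NumPy, hypothetical data, and y > 0; note that this linearization minimizes squared error in ln(y) rather than in y, which is an assumption of this example and not something stated in the notes.

```python
import numpy as np

# Hypothetical data generated from y = 2 * e^(0.5 x) (illustration only).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * np.exp(0.5 * x)

# Fit ln(y) = ln(a) + b*x as a straight line; requires all y > 0.
b, log_a = np.polyfit(x, np.log(y), deg=1)
a = np.exp(log_a)

print(f"a = {a:.3f}, b = {b:.3f}")
```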
5. Logistic Regression
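Logistic regression analysis can be used for classification as well as regression.
The equation given here has the logarithmic form, in which y varies linearly with ln(x):
y=a+b*ln(x)
where x is the independent variable (x > 0), y is the dependent variable, and a and b are constants.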
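The equation y=a+b*ln(x) is linear in ln(x), so it can be fitted by regressing y on ln(x). A minimal sketch, assuming NumPy and hypothetical data with x > 0:

```python
import numpy as np

# Hypothetical data generated from y = 1 + 3*ln(x) (illustration only).
x = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
y = 1.0 + 3.0 * np.log(x)

# Straight-line fit of y against ln(x): slope is b, intercept is a.
b, a = np.polyfit(np.log(x), y, deg=1)

print(f"a = {a:.3f}, b = {b:.3f}")
```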
Applications of regression analysis:
1. Regression analysis is used for the prediction of stock prices based on past data, and for analyzing the relationship between the interest rate and consumer spending.
2. It can be used to analyze the impact of price changes on a product.
3. It can be used in real estate for predicting the value of a property based on the location.
5. It is also used for the prediction of crop yield based on the weather.
6. It can be used in the analysis of product quality.
7. It can be used to predict the performance of sports players.
The literal meaning of regression is "stepping back towards the average", a phrase used by the British biometrician Sir Francis Galton (1822-1911).
Regression analysis measures the average relationship between two or more variables. There are two types of variables in regression analysis:
1. Independent variable
2. Dependent variable
The variable which is used for prediction is called the independent variable. The variable whose value is to be predicted is called the dependent variable.
LINES OF REGRESSION
Regression lines are the lines of best fit which express the average relationship
between variables. Here, the concept of lines of best fit is based on the principle of
least squares.
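The regression line of y on x is
$(y - \bar{y}) = \dfrac{r\sigma_y}{\sigma_x}(x - \bar{x})$
and the regression coefficient of y on x is denoted by $b_{yx}$, where
$b_{yx} = \dfrac{r\sigma_y}{\sigma_x}$
Slope: $b_{yx} = \dfrac{\mathrm{Cov}(x, y)}{\sigma_x^2}$; $\mathrm{Cov}(x, y) = r\sigma_x\sigma_y$
$b_{yx} = \dfrac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}$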
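The regression line of x on y is
$(x - \bar{x}) = \dfrac{r\sigma_x}{\sigma_y}(y - \bar{y})$
and the regression coefficient of x on y is denoted by $b_{xy}$, where
$b_{xy} = \dfrac{r\sigma_x}{\sigma_y}$
$b_{xy} = \dfrac{\sum (x - \bar{x})(y - \bar{y})}{\sum (y - \bar{y})^2}$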
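The two regression coefficients can be computed directly from the sums of deviations and cross-checked against the forms involving r and the standard deviations. The sketch below assumes NumPy and a small hypothetical data set; the population standard deviation (NumPy's default `ddof=0`) is used, although the ratio of the two standard deviations is the same either way.

```python
import numpy as np

# Hypothetical paired observations (illustration only).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# Deviations from the means.
dx = x - x.mean()
dy = y - y.mean()

# Regression coefficients from the sum-of-deviations formulas.
b_yx = np.sum(dx * dy) / np.sum(dx**2)   # y on x
b_xy = np.sum(dx * dy) / np.sum(dy**2)   # x on y

# Correlation coefficient and standard deviations for the cross-check.
r = np.corrcoef(x, y)[0, 1]
sigma_x, sigma_y = x.std(), y.std()

print(f"b_yx = {b_yx:.4f}, r*sigma_y/sigma_x = {r * sigma_y / sigma_x:.4f}")
print(f"b_xy = {b_xy:.4f}, r*sigma_x/sigma_y = {r * sigma_x / sigma_y:.4f}")
print(f"b_yx * b_xy = {b_yx * b_xy:.4f}, r^2 = {r**2:.4f}")
```

For this data both forms of each coefficient agree, and the product of the two coefficients equals r squared.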
2. If one of the regression coefficients is greater than one, then the other must be less than one,
i.e., since $b_{yx}\, b_{xy} = r^2 \le 1$, if $b_{xy} > 1$ then $b_{yx} \le \dfrac{1}{b_{xy}} < 1$.
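5. If $m_1$ and $m_2$ are the slopes of two lines and $\theta$ is the angle between them, then
$\tan\theta = \left| \dfrac{m_1 - m_2}{1 + m_1 m_2} \right|$
For the two regression lines this gives
$\theta = \tan^{-1}\left\{ \dfrac{1 - r^2}{r} \cdot \dfrac{\sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2} \right\}$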
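The angle formula can be checked numerically by comparing it with the general two-slope formula. In the sketch below (NumPy assumed, hypothetical data, r nonzero), the slope of the y-on-x line is r*sigma_y/sigma_x, and the slope of the x-on-y line, written as dy/dx in the same plane, is sigma_y/(r*sigma_x).

```python
import numpy as np

# Hypothetical paired observations (illustration only).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

r = np.corrcoef(x, y)[0, 1]
sigma_x, sigma_y = x.std(), y.std()

# Slopes of the two regression lines in the (x, y) plane; requires r != 0.
m1 = r * sigma_y / sigma_x        # line of y on x
m2 = sigma_y / (r * sigma_x)      # line of x on y, expressed as dy/dx

# tan(theta) from the general two-slope formula.
tan_from_slopes = abs((m1 - m2) / (1 + m1 * m2))

# tan(theta) from the closed form in r, sigma_x and sigma_y
# (abs(r) gives the acute angle).
tan_from_formula = (
    (1 - r**2) / abs(r) * (sigma_x * sigma_y) / (sigma_x**2 + sigma_y**2)
)

theta_degrees = np.degrees(np.arctan(tan_from_formula))
print(f"tan(theta): {tan_from_slopes:.4f} (slopes), {tan_from_formula:.4f} (formula)")
print(f"acute angle between the regression lines: {theta_degrees:.2f} degrees")
```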
NOTE:
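1. If $r = 0$, i.e., the variables are uncorrelated, then $\tan\theta = \infty \Rightarrow \theta = \dfrac{\pi}{2}$, so the regression lines are perpendicular.
2. If $r = \pm 1$, then $\tan\theta = 0 \Rightarrow \theta = 0$ or $\pi$, so the regression lines coincide.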
3. There are two angles between the regression lines whenever the two lines intersect each other: an acute angle and an obtuse angle.
$\tan\theta$ is greater than zero if $\theta$ lies between 0 and $\dfrac{\pi}{2}$; then $\theta$ is called the acute angle.
$\tan\theta$ is less than zero if $\theta$ lies between $\dfrac{\pi}{2}$ and $\pi$; then $\theta$ is called the obtuse angle.