Regression Analysis
• Best of all, you can use the equation to make predictions. For example,
how much snow will fall in 2017?
y = -2.2923(2017) + 4624.4 ≈ 0.8 inches.
• Regression also gives you an R² value, which for this graph is 0.702.
This number tells you how well your model fits the data.
• R² values range from 0 to 1, with 0 being a terrible model and 1
being a perfect model. As you can see, 0.702 is a fairly decent fit,
so you can be fairly confident in your weather prediction!
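The prediction above is just the trend line evaluated at x = 2017; a minimal sketch, assuming the slide's fitted coefficients:

```python
# Evaluate the snowfall trend line from the slide: y = -2.2923x + 4624.4,
# where x is the year and y is the predicted snowfall in inches.
def predict_snowfall(year):
    return -2.2923 * year + 4624.4

prediction = predict_snowfall(2017)
print(round(prediction, 1))  # about 0.8 inches, matching the slide
```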
Regression Analysis
There are multiple benefits of using regression analysis:
• It indicates significant relationships between the dependent variable
and the independent variables.
• It indicates the strength of the impact of multiple independent
variables on a dependent variable.
x (math aptitude score) | 95 | 85 | 80 | 70 | 60
y (statistics grade)    | 85 | 95 | 70 | 65 | 70
What linear regression equation best predicts statistics performance, based on math aptitude scores?
If a student made an 80 on the aptitude test, what grade would we expect her to make in statistics?
How well does the regression equation fit the data?
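The three questions can be answered with an ordinary least-squares fit on the table above; a minimal sketch using numpy:

```python
import numpy as np

# Math aptitude scores (x) and statistics grades (y) from the table above.
x = np.array([95, 85, 80, 70, 60], dtype=float)
y = np.array([85, 95, 70, 65, 70], dtype=float)

# Least-squares fit of y = b0 + b1*x (np.polyfit returns slope first).
b1, b0 = np.polyfit(x, y, 1)

# Predicted statistics grade for an aptitude score of 80.
pred = b0 + b1 * 80

# Coefficient of determination R^2 measures how well the line fits.
residuals = y - (b0 + b1 * x)
r2 = 1 - np.sum(residuals**2) / np.sum((y - y.mean())**2)

print(round(b1, 3), round(b0, 2))  # slope ~ 0.644, intercept ~ 26.78
print(round(pred, 2))              # ~ 78.29
print(round(r2, 2))                # ~ 0.48
```

So the fitted equation is roughly ŷ = 26.78 + 0.644x, a student scoring 80 on the aptitude test is predicted to earn about 78 in statistics, and R² ≈ 0.48 indicates a moderate fit.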
3. After the derivatives are calculated, the slope (m) and intercept (b) are updated
with the help of the following equations, where α is the learning rate:
m = m - α * (derivative of the cost with respect to m)
b = b - α * (derivative of the cost with respect to b)
4. The process of updating the values of m and b continues until the cost function
reaches the ideal value of 0 or comes close to it.
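The update steps above can be sketched as plain gradient descent on the mean squared error; a minimal example on synthetic data (the data, learning rate, and iteration count are illustrative assumptions):

```python
import numpy as np

# Gradient-descent sketch for simple linear regression (steps above),
# minimizing the cost J = mean((y - (m*x + b))**2).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2 * x + 1                      # synthetic data: true m = 2, b = 1

m, b = 0.0, 0.0                    # initial guesses
alpha = 0.01                       # learning rate
for _ in range(20000):
    error = y - (m * x + b)
    dm = -2 * np.mean(x * error)   # derivative of J with respect to m
    db = -2 * np.mean(error)       # derivative of J with respect to b
    m -= alpha * dm                # m = m - alpha * derivative of m
    b -= alpha * db                # b = b - alpha * derivative of b

print(round(m, 3), round(b, 3))   # converges toward m = 2, b = 1
```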
Multiple Linear Regression
• The main difference is the number of independent variables that they
take as inputs. Simple linear regression just takes a single feature,
while multiple linear regression takes multiple x values.
Y = Xb
b = (X'X)⁻¹X'Y
Multiple Linear Regression: Example
i | y   | x1  | x2
1 | 100 | 110 | 40
2 | 90  | 120 | 30
3 | 80  | 100 | 20
4 | 70  | 90  | 0
5 | 60  | 80  | 10
•Define X.
•Define X'.
•Compute X'X.
•Compute (X'X)⁻¹.
•Define Y.
•Compute X'Y.
•Compute b = (X'X)⁻¹X'Y.
b = (X'X)⁻¹X'Y
ŷ = 20 + 0.5x1 + 0.5x2
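The normal equation b = (X'X)⁻¹X'Y can be evaluated directly on the example table; a minimal numpy sketch that reproduces the fitted coefficients:

```python
import numpy as np

# Normal-equation sketch b = (X'X)^-1 X'Y for the example table
# (columns y, x1, x2), reproducing y-hat = 20 + 0.5*x1 + 0.5*x2.
y  = np.array([100, 90, 80, 70, 60], dtype=float)
x1 = np.array([110, 120, 100, 90, 80], dtype=float)
x2 = np.array([40, 30, 20, 0, 10], dtype=float)

# Prepend a column of ones so b[0] is the intercept.
X = np.column_stack([np.ones(5), x1, x2])
b = np.linalg.inv(X.T @ X) @ X.T @ y   # b = (X'X)^-1 X'Y

print(np.round(b, 3))  # [20.   0.5  0.5]
```

In practice `np.linalg.lstsq` is preferred over explicitly inverting X'X, but the explicit form mirrors the slide's derivation.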
Regression Types
1. Simple and Multiple Linear Regression
2. Logistic Regression
3. Polynomial Regression
4. Ridge Regression and Lasso Regression (upgrades to
Linear Regression)
5. Decision Tree Regression
6. Support Vector Regression (SVR)
Logistic Regression
In linear regression, we draw a straight line (the best-fit line) L1 such that the sum of squared
vertical distances from all the data points to the line is minimal.
Logistic Regression Equation
It uses the sigmoid function to map any predicted value onto a probability
between 0 and 1:
σ(z) = 1 / (1 + e^(-z))
Hypothesis: h(x) = σ(b0 + b1x) = 1 / (1 + e^-(b0 + b1x))
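The sigmoid and the logistic-regression hypothesis can be sketched in a few lines; the coefficients b0 and b1 below are hypothetical placeholders, not fitted values:

```python
import math

# The sigmoid squashes any real-valued score z into the interval (0, 1).
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Logistic-regression hypothesis: estimated probability that y = 1 given x,
# using illustrative (hypothetical) coefficients b0 and b1.
def predict_proba(x, b0=-4.0, b1=1.0):
    return sigmoid(b0 + b1 * x)

print(round(sigmoid(0), 2))        # 0.5, the midpoint of the curve
print(round(predict_proba(6), 2))  # sigmoid(2), about 0.88
```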
Overfitting and Underfitting
Underfitting is a situation where your model is too simple for your data, or
your hypothesis about the data distribution is wrong and too simple. For
example, your data is quadratic but your model is linear.
Overfitting is the opposite: the model is too complex, so it fits the noise in
the training data and generalizes poorly to new data.
Lasso Regression
In lasso regression, the cost function is altered by adding a penalty
equivalent to the absolute value of the magnitude of the coefficients.
It is also called “L1 regularization”.
This type of regularization (L1) can lead to zero coefficients, i.e., some of the
features are completely neglected when evaluating the output. So lasso
regression not only helps in reducing over-fitting but can also help us with
feature selection.
Ridge Regression
In ridge regression, the cost function is altered by adding a penalty
equivalent to the square of the magnitude of the coefficients.
The penalty term (lambda) regularizes the coefficients so that if the
coefficients take large values, the optimization function is penalized.
Ridge regression shrinks the coefficients, which helps to reduce model
complexity and multicollinearity.
It is also called “L2 regularization”.
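The ridge penalty changes the normal equation to b = (X'X + λI)⁻¹X'Y; a minimal sketch on synthetic, nearly collinear data (the data and λ value are illustrative assumptions):

```python
import numpy as np

# Ridge-regression sketch: the penalty lambda * sum(b_j^2) turns the
# normal equation into b = (X'X + lambda*I)^-1 X'Y, shrinking coefficients.
rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = x1 + 0.01 * rng.normal(size=50)       # nearly collinear with x1
y = 3 * x1 + 2 * x2 + rng.normal(scale=0.1, size=50)

X = np.column_stack([x1, x2])

def ridge(X, y, lam):
    n_features = X.shape[1]
    return np.linalg.inv(X.T @ X + lam * np.eye(n_features)) @ X.T @ y

b_ols   = ridge(X, y, 0.0)   # lambda = 0: ordinary least squares, unstable here
b_ridge = ridge(X, y, 1.0)   # lambda > 0: smaller, more stable coefficients

print(np.round(b_ols, 2), np.round(b_ridge, 2))
```

The ridge solution always has a smaller coefficient norm than the unpenalized one, which is exactly the shrinkage effect described above.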
Polynomial Regression
• A regression equation is a polynomial regression equation if the power
of the independent variable is greater than 1. The equation below
represents a polynomial equation:
y = b0 + b1x + b2x² + … + bnxⁿ
• In this regression technique, the best fit line is not a straight line. It is
rather a curve that fits into the data points.
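Fitting such a curve is straightforward with numpy; a minimal sketch on noiseless quadratic data (the data values are illustrative assumptions):

```python
import numpy as np

# Polynomial-regression sketch: fit y = b0 + b1*x + b2*x^2 to quadratic data.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = 1 + 2 * x + 3 * x**2          # noiseless quadratic: y = 1 + 2x + 3x^2

coeffs = np.polyfit(x, y, 2)      # highest power first: [b2, b1, b0]
print(np.round(coeffs, 3))        # [3. 2. 1.]

# The fitted curve (not a straight line) passes through the data points.
y_hat = np.polyval(coeffs, x)
print(np.allclose(y_hat, y))      # True
```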
Lasso vs Ridge
Lasso (L1) can shrink some coefficients exactly to zero, performing feature
selection; ridge (L2) only shrinks coefficients toward zero without
eliminating any of them.
Regression vs Classification
• Classification predictions can be evaluated using accuracy, whereas
regression predictions cannot.
• Regression predictions can be evaluated using root mean squared
error, whereas classification predictions cannot.
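The two metrics can be sketched side by side; the label and value arrays below are illustrative assumptions:

```python
import math

# Evaluation-metric sketch: accuracy for classification labels,
# root mean squared error (RMSE) for regression values.
def accuracy(y_true, y_pred):
    # Fraction of predicted labels that exactly match the true labels.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    # Square root of the mean squared difference between values.
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))    # 0.75
print(rmse([3.0, 5.0, 7.0], [2.0, 5.0, 9.0]))  # sqrt(5/3), about 1.29
```

Accuracy only makes sense when predictions are discrete labels; RMSE only makes sense when predictions are continuous values, which is why the two tasks need different metrics.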