Analytics Vidhya (http://www.analyticsvidhya.com)
7 Types of Regression Techniques you should know!
Introduction
Linear and logistic regression are usually the first algorithms people learn in predictive
modeling. Because of their popularity, many analysts end up thinking that they are the
only forms of regression. Those who are slightly more involved think that they are
the most important among all forms of regression analysis.
The truth is that there are many forms of regression that can be performed. Each
form has its own importance and specific conditions under which it is best applied. In
this article, I explain the 7 most commonly used forms of regression in a simple
manner. Through this article, I also hope that people develop an idea of the breadth of
regression techniques, instead of just applying linear / logistic regression to every problem
they come across and hoping that it will fit!
Table of Contents
1. What is Regression Analysis?
2. Why do we use Regression Analysis?
3. What are the types of Regressions?
Linear Regression
Logistic Regression
Polynomial Regression
Stepwise Regression
Ridge Regression
Lasso Regression
ElasticNet Regression
4. How to select the right Regression Model?
What is Regression Analysis?
Regression analysis is a form of predictive modelling technique which investigates the
relationship between a dependent (target) variable and one or more independent (predictor)
variables. This technique is used for forecasting, time series modelling and finding the
causal effect relationship (http://www.analyticsvidhya.com/blog/2015/06/establish-causality-events/)
between variables. For example, the relationship between rash driving and the number of road
accidents by a driver is best studied through regression.
Regression analysis is an important tool for modelling and analyzing data. Here, we fit a
curve / line to the data points in such a manner that the distances of the data points
from the curve or line are minimized. I'll explain this in more detail in the coming
sections.
[Figure: a regression line fitted to data points (Regression_Line.png)]
Why do we use Regression Analysis?
As mentioned above, regression analysis estimates the relationship between two or more
variables. Let's understand this with an easy example:
Let's say you want to estimate the growth in sales of a company based on current economic
conditions. You have recent company data which indicates that the growth in sales is
around two and a half times the growth in the economy. Using this insight, we can predict
future sales of the company based on current and past information.
There are multiple benefits of using regression analysis. They are as follows:
1. It indicates the significant relationships between the dependent variable and the
independent variables.
2. It indicates the strength of impact of multiple independent variables on a dependent
variable.
Regression analysis also allows us to compare the effects of variables measured on
different scales, such as the effect of price changes and the number of promotional
activities. These benefits help market researchers / data analysts / data scientists to
evaluate candidate variables and select the best set to be used for building predictive models.
How many types of regression techniques do we have?
There are various kinds of regression techniques available to make predictions. These
techniques are mostly driven by three metrics (number of independent variables, type of
dependent variables and shape of regression line). We'll discuss them in detail in the
following sections.
[Figure: regression techniques are characterized by the number of independent variables, the type of dependent variable and the shape of the regression line (Regression_Type.png)]
For the creative ones, you can even cook up new regressions if you feel the need to use a
combination of the parameters above that people haven't used before. But before you
start that, let us understand the most commonly used regressions:
1. Linear Regression
It is one of the most widely known modeling techniques. Linear regression is usually among
the first few topics which people pick while learning predictive modeling. In this
technique, the dependent variable is continuous, the independent variable(s) can be continuous
or discrete (https://en.wikipedia.org/wiki/Continuous_and_discrete_variables), and the nature
of the regression line is linear.
Linear Regression establishes a relationship between dependent variable (Y) and one or
more independent variables (X) using a best fit straight line (also known as regression
line).
It is represented by the equation Y = a + b*X + e, where a is the intercept, b is the slope
of the line and e is the error term. This equation can be used to predict the value of the
target variable based on given predictor variable(s).
[Figure: relation between weight and height, with a fitted regression line (Linear_Regression1.png)]
The difference between simple linear regression and multiple linear regression is that
multiple linear regression has more than one independent variable, whereas simple linear
regression has only one. Now, the question is "How do we obtain the best fit line?"
How to obtain the best fit line (values of a and b)?
This task can be accomplished by the Least Square Method. It is the most common
method used for fitting a regression line. It calculates the best-fit line for the observed data
by minimizing the sum of the squares of the vertical deviations from each data point to the
line. Because the deviations are first squared, when added, there is no cancelling out
between positive and negative values.
$$\min_{w} \|Xw - y\|_2^2$$
[Figure: regression line with the errors (vertical deviations) of the data points (reg_error.gif)]
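To make the least squares idea concrete, here is a minimal sketch in plain Python for the single-predictor case Y = a + b*X. The function name fit_simple_linear and the height/weight numbers are made up for illustration; they are not from the article's data.

```python
def fit_simple_linear(xs, ys):
    """Return (a, b) that minimize the sum of squared vertical deviations
    between ys and the line a + b*x (closed-form least squares)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Slope: co-variation of x and y divided by the variation of x.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    b = sxy / sxx
    # The least squares line always passes through the point of means.
    a = y_mean - b * x_mean
    return a, b


# Hypothetical data: height in cm, weight in kg.
heights = [150, 160, 170, 180, 190]
weights = [52, 58, 66, 72, 80]

a, b = fit_simple_linear(heights, weights)
residuals = [w - (a + b * h) for h, w in zip(heights, weights)]
print("intercept:", round(a, 3), "slope:", round(b, 3))
print("sum of squared errors:", round(sum(r * r for r in residuals), 3))
```

With an intercept in the model, the residuals of a least squares fit sum to zero, which is a quick sanity check on any implementation.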
We can evaluate the model performance using the metric R-square. To know more details
about these metrics, you can read: Model Performance metrics Part 1
(http://www.analyticsvidhya.com/blog/2015/01/model-performance-metrics-classification/),
Part 2 (http://www.analyticsvidhya.com/blog/2015/01/model-perform-part-2/).
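As a rough illustration of R-square, the sketch below computes it as one minus the ratio of the residual sum of squares to the total sum of squares. The helper name r_squared and the observed/predicted values are assumptions chosen only for this example.

```python
def r_squared(observed, predicted):
    """R-square = 1 - (residual sum of squares / total sum of squares)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Hypothetical observed values and model predictions.
observed = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 7.0, 9.0]

r2 = r_squared(observed, predicted)
print("R-square:", r2)
```

A value close to 1 means the model explains most of the variation in the observed values; a perfect fit gives exactly 1.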
Important Points:
* There must be a linear relationship between the independent and dependent variables.
* Multiple regression can suffer from multicollinearity, autocorrelation and heteroskedasticity.
* Linear regression is very sensitive to outliers. They can badly distort the regression line
and, eventually, the forecasted values.
* Multicollinearity can increase the variance of the coefficient estimates and make the
estimates very sensitive to minor changes in the model. The result is that the coefficient
estimates are unstable.
* In case of multiple independent variables, we can go with forward selection, backward
elimination and the stepwise approach for selecting the most significant independent variables.
2. Logistic Regression
Logistic regression is used to find the probability of event = Success and event = Failure. We
should use logistic regression when the dependent variable is binary (0/1, True/False,
Yes/No) in nature. Here the value of Y ranges from 0 to 1 and it can be represented by the
following equations:
odds = p / (1-p) = probability of event occurrence / probability of event non-occurrence
ln(odds) = ln(p / (1-p))
logit(p) = ln(p / (1-p)) = b0 + b1X1 + b2X2 + b3X3 + ... + bkXk
Above, p is the probability of presence of the characteristic of interest. A question that you
should ask here is "why have we used log in the equation?"
Since we are working here with a binomial distribution (dependent variable), we need to
choose a link function which is best suited for this distribution. That function is the logit
(https://en.wikipedia.org/wiki/Logistic_function). In the equation above, the
parameters are chosen to maximize the likelihood of observing the sample values rather
than to minimize the sum of squared errors (as in ordinary regression).
[Figure: logistic regression curve (Logistic_Regression.png)]
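The odds, logit and maximum likelihood ideas above can be sketched in a few lines of plain Python. This is only an illustration with one predictor: the data are invented, and plain gradient ascent with a fixed learning rate is my choice for the sketch, not necessarily what a statistics package does internally.

```python
import math


def odds(p):
    return p / (1 - p)


def logit(p):
    return math.log(odds(p))


def sigmoid(z):
    """Inverse of the logit: maps b0 + b1*x back to a probability."""
    return 1 / (1 + math.exp(-z))


def log_likelihood(b0, b1, xs, ys):
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return total


def fit_logistic(xs, ys, learning_rate=0.1, steps=20000):
    """Maximize the log-likelihood by gradient ascent (illustrative only)."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys)]
        b0 += learning_rate * sum(errors) / n
        b1 += learning_rate * sum(e * x for e, x in zip(errors, xs)) / n
    return b0, b1


# Hypothetical binary outcomes (the two classes overlap around x = 4 and x = 5).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(xs, ys)
print("P(y=1 | x=1):", round(sigmoid(b0 + b1 * 1), 3))
print("P(y=1 | x=8):", round(sigmoid(b0 + b1 * 8), 3))
```

Note that the fitted model is linear in the log-odds, b0 + b1*x, while the predicted probability follows the S-shaped curve.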
Important Points:
* It is widely used for classification problems.
* Logistic regression doesn't require a linear relationship between the dependent and independent
variables. It can handle various types of relationships because it applies a non-linear log
transformation to the predicted odds ratio (it does assume that the log-odds are linear in the
predictors).
* To avoid overfitting and underfitting, we should include all significant variables. A good
approach to ensure this practice is to use a stepwise method to estimate the logistic
regression.
* It requires large sample sizes because maximum likelihood estimates are less powerful at
low sample sizes than ordinary least squares.
* The independent variables should not be correlated with each other, i.e. no
multicollinearity. However, we have the option to include interaction effects of categorical
variables in the analysis and in the model.
* If the values of the dependent variable are ordinal, then it is called ordinal logistic regression.
* If the dependent variable is multi-class, then it is known as multinomial logistic regression.
3. Polynomial Regression
A regression equation is a polynomial regression equation if the power of the independent
variable is more than 1. The equation below represents a polynomial equation:
y = a + b*x^2
In this regression technique, the best fit line is not a straight line. It is rather a curve
that fits the data points.
[Figure: polynomial regression curve fitted to data points (Polynomial.png)]
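One way to see the equation y = a + b*x^2 in action: the model is still linear in a and b, so ordinary least squares can be applied to the transformed predictor z = x^2. The sketch below assumes noise-free, made-up data and hypothetical helper names.

```python
def fit_line(zs, ys):
    """Least squares fit of ys = a + b*zs for a single predictor."""
    n = len(zs)
    z_mean = sum(zs) / n
    y_mean = sum(ys) / n
    b = (sum((z - z_mean) * (y - y_mean) for z, y in zip(zs, ys))
         / sum((z - z_mean) ** 2 for z in zs))
    return y_mean - b * z_mean, b


def sse(ys, predictions):
    return sum((y - p) ** 2 for y, p in zip(ys, predictions))


# Hypothetical data generated from y = 1 + 3*x^2.
xs = [-2, -1, 0, 1, 2]
ys = [13, 4, 1, 4, 13]

# Polynomial fit: regress y on the squared predictor.
a, b = fit_line([x ** 2 for x in xs], ys)
sse_quadratic = sse(ys, [a + b * x ** 2 for x in xs])

# Straight-line fit on the raw predictor, for comparison.
a_line, b_line = fit_line(xs, ys)
sse_line = sse(ys, [a_line + b_line * x for x in xs])

print("quadratic fit: a =", a, "b =", b, "SSE =", sse_quadratic)
print("straight line: a =", a_line, "b =", b_line, "SSE =", sse_line)
```

On this curved data the straight line cannot follow the shape at all, while the squared term recovers it.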
Important Points:
* While there might be a temptation to fit a higher degree polynomial to get lower error, this
can result in over-fitting. Always plot the relationships to see the fit and focus on making
sure that the curve fits the nature of the problem. Here is an example of how plotting can
help:
[Figure: three fits labelled Underfitting, Just right and Overfitting (underfitting-overfitting.png)]
* Especially look out for the curve towards the ends and see whether those shapes and trends
make sense. Higher polynomials can end up producing weird results on extrapolation.
4. Stepwise Regression
This form of regression is used when we deal with multiple independent variables. In this
technique, the selection of independent variables is done with the help of an automatic
process, which involves no human intervention.
This feat is achieved by observing statistical values like R-square, t-stats and the AIC metric to
discern significant variables. Stepwise regression basically fits the regression model by
adding/dropping covariates one at a time based on a specified criterion. Some of the most
commonly used stepwise regression methods are listed below:
* Standard stepwise regression does two things. It adds and removes predictors as needed
at each step.
* Forward selection starts with the most significant predictor in the model and adds a variable
at each step.
* Backward elimination starts with all predictors in the model and removes the least
significant variable at each step.
The aim of this modeling technique is to maximize the prediction power with the minimum
number of predictor variables. It is one of the methods to handle higher dimensionality
(http://www.analyticsvidhya.com/blog/2015/07/dimension-reduction-methods/) of a data
set.
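Forward selection can be sketched as a simple greedy loop. The version below is an assumption-laden illustration: it uses the gain in R-square as the criterion (real tools often use t-stats or AIC instead), the min_gain threshold is arbitrary, and the feature names and data are invented.

```python
def solve(A, b):
    """Solve the linear system A x = b by Gauss-Jordan elimination."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [v - factor * p for v, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def rss(columns, y):
    """Residual sum of squares of a least squares fit with intercept."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    XtX = [[sum(X[i][p] * X[i][q] for i in range(n)) for q in range(k)]
           for p in range(k)]
    Xty = [sum(X[i][p] * y[i] for i in range(n)) for p in range(k)]
    beta = solve(XtX, Xty)
    return sum((y[i] - sum(bj * xj for bj, xj in zip(beta, X[i]))) ** 2
               for i in range(n))


def forward_selection(features, y, min_gain=0.05):
    """Greedily add the feature that lowers RSS the most; stop when the
    gain in R-square falls below min_gain (an arbitrary threshold)."""
    total = rss([], y)          # total sum of squares (intercept-only model)
    current = total
    selected = []
    while len(selected) < len(features):
        scores = {name: rss([features[s] for s in selected + [name]], y)
                  for name in features if name not in selected}
        best = min(scores, key=scores.get)
        if current - scores[best] < min_gain * total:
            break
        selected.append(best)
        current = scores[best]
    return selected


# Hypothetical data: y depends on x1 and x2 only; x3 is irrelevant.
features = {
    "x1": [1, 2, 3, 4, 5, 6],
    "x2": [1, 0, 1, 0, 1, 0],
    "x3": [2, 1, 2, 1, 1, 2],
}
y = [2 * a + 3 * b for a, b in zip(features["x1"], features["x2"])]

print(forward_selection(features, y))
```

Backward elimination would run the same loop in reverse: start with all features and drop the one whose removal hurts the fit the least.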
5. Ridge Regression
Ridge regression is a technique used when the data suffers from multicollinearity
(independent variables are highly correlated). Under multicollinearity, even though the
ordinary least squares (OLS) estimates are unbiased, their variances are large, so the
estimated values may be far from the true values. By adding a degree of bias to the
regression estimates, ridge regression reduces the standard errors.
Above, we saw the equation for linear regression. Remember? It can be represented as:
y = a + b*x
This equation also has an error term. The complete equation becomes:
y = a + b*x + e (error term), [the error term is the value needed to correct for a prediction
error between the observed and predicted value]
=> y = a + b1x1 + b2x2 + .... + e, for multiple independent variables.
In a linear equation, prediction errors can be decomposed into two sub-components. The first
is due to bias and the second is due to variance. Prediction error can occur due to
either one of these or both. Here, we'll discuss the error caused by
variance.
Ridge regression solves the multicollinearity problem through the shrinkage parameter
(https://en.wikipedia.org/wiki/Shrinkage_estimator) λ (lambda). Look at the equation below.
$$\hat{\beta} = \operatorname{argmin}_{\beta} \; \underbrace{\|y - X\beta\|_2^2}_{\text{Loss}} + \underbrace{\lambda \|\beta\|_2^2}_{\text{Penalty}}$$
In this equation, we have two components. The first is the least squares term and the other is
lambda times the summation of β² (beta squared), where β is the coefficient. This is added to the
least squares term in order to shrink the parameters so that they have a very low variance.
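To see the shrinkage at work, here is a small sketch for two nearly collinear predictors. It assumes centered data (so no intercept is needed) and solves the ridge normal equations (X'X + lambda*I) beta = X'y directly for the 2x2 case; the function name and the numbers are hypothetical.

```python
def ridge_two_predictors(x1, x2, y, lam):
    """Ridge coefficients for two centered predictors, from the
    2x2 system (X'X + lam*I) beta = X'y solved by Cramer's rule."""
    a11 = sum(v * v for v in x1) + lam
    a22 = sum(v * v for v in x2) + lam
    a12 = sum(p * q for p, q in zip(x1, x2))
    b1 = sum(p * q for p, q in zip(x1, y))
    b2 = sum(p * q for p, q in zip(x2, y))
    det = a11 * a22 - a12 * a12
    return (a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det


# Hypothetical centered data; x2 is almost a copy of x1 (multicollinearity).
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [-2.1, -0.9, 0.1, 0.9, 2.0]
y = [-4.1, -1.9, 0.2, 1.8, 4.0]

results = {}
for lam in [0, 1, 10]:
    beta1, beta2 = ridge_two_predictors(x1, x2, y, lam)
    results[lam] = (beta1, beta2)
    print("lambda =", lam, "beta =", (round(beta1, 3), round(beta2, 3)))
```

With lambda = 0 this is plain least squares, and the two coefficients come out quite different even though the predictors are nearly identical. As lambda grows, the overall size of the coefficient vector shrinks and the two estimates move closer together, but neither becomes exactly zero.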
Important Points:
* The assumptions of this regression are the same as those of least squares regression, except
that normality is not assumed
* It shrinks the values of the coefficients but they do not reach zero, which means it does not
perform feature selection
* This is a regularization method and uses L2 regularization
(https://en.wikipedia.org/wiki/Regularization_(mathematics)).
6. Lasso Regression
Similar to ridge regression, Lasso (Least Absolute Shrinkage and Selection Operator) also
penalizes the absolute size of the regression coefficients. In addition, it is capable of
reducing the variability and improving the accuracy of linear regression models. Look at the
equation below:
$$\hat{\beta} = \operatorname{argmin}_{\beta} \; \|y - X\beta\|_2^2 + \lambda \|\beta\|_1$$
Lasso regression differs from ridge regression in that it uses absolute values in the
penalty function instead of squares. This leads to penalizing (or equivalently, constraining
the sum of the absolute values of the estimates) values, which causes some of the parameter
estimates to turn out exactly zero. The larger the penalty applied, the further the estimates
get shrunk towards absolute zero. This results in variable selection out of the given n variables.
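The "exactly zero" behaviour is easiest to see with a single centered predictor, where the lasso objective above has a closed-form solution based on soft-thresholding. The sketch below is a simplified illustration under that assumption, with made-up data; the ridge function is included only for comparison.

```python
def soft_threshold(value, threshold):
    """Move value towards zero by threshold; return 0 if it would cross zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coef(xs, ys, lam):
    """Minimizer of sum((y - b*x)^2) + lam*|b| for one centered predictor."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return soft_threshold(sxy, lam / 2) / sxx


def ridge_coef(xs, ys, lam):
    """Minimizer of sum((y - b*x)^2) + lam*b^2, shown for comparison."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


# Hypothetical centered data.
xs = [-2, -1, 0, 1, 2]
ys = [-1.0, -0.4, 0.1, 0.5, 0.8]

for lam in [0, 2, 10]:
    print("lambda =", lam,
          "lasso:", round(lasso_coef(xs, ys, lam), 4),
          "ridge:", round(ridge_coef(xs, ys, lam), 4))
```

Once the penalty is large enough, the lasso coefficient is set to zero outright, whereas the ridge coefficient only gets smaller.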
Important Points:
* The assumptions of this regression are the same as those of least squares regression, except
that normality is not assumed
* It shrinks coefficients to zero (exactly zero), which certainly helps in feature selection
* This is a regularization method and uses L1 regularization
(https://en.wikipedia.org/wiki/Regularization_(mathematics))
* If a group of predictors is highly correlated, lasso picks only one of them and shrinks the
others to zero
7. ElasticNet Regression
ElasticNet is a hybrid of the Lasso and Ridge regression techniques. It is trained with L1 and
L2 priors as regularizer. Elastic-net is useful when there are multiple features which are
correlated. Lasso is likely to pick one of these at random, while elastic-net is likely to pick
both.
$$\hat{\beta} = \operatorname{argmin}_{\beta} \left( \|y - X\beta\|_2^2 + \lambda_2 \|\beta\|_2^2 + \lambda_1 \|\beta\|_1 \right)$$
A practical advantage of trading off between Lasso and Ridge is that it allows Elastic-Net
to inherit some of Ridge's stability under rotation.
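The "pick one versus pick both" behaviour can be illustrated with an extreme case: two identical features. The sketch below runs a simple coordinate descent on the elastic-net objective given above; the function names, the data and the choice of coordinate descent are assumptions made for this illustration. Setting lam2 = 0 gives the lasso.

```python
def soft_threshold(value, threshold):
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net_cd(columns, y, lam1, lam2, sweeps=200):
    """Coordinate descent for
    sum((y - X b)^2) + lam2 * sum(b^2) + lam1 * sum(|b|),
    assuming centered columns (no intercept)."""
    coefs = [0.0] * len(columns)
    n = len(y)
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            # Residual of y after removing every feature except feature j.
            partial = [y[i] - sum(coefs[k] * columns[k][i]
                                  for k in range(len(columns)) if k != j)
                       for i in range(n)]
            rho = sum(col[i] * partial[i] for i in range(n))
            sjj = sum(v * v for v in col)
            coefs[j] = soft_threshold(rho, lam1 / 2) / (sjj + lam2)
    return coefs


# Hypothetical centered data with two perfectly correlated (identical) features.
x = [-2, -1, 0, 1, 2]
y = [-1.0, -0.4, 0.1, 0.5, 0.8]
columns = [x, x]

lasso = elastic_net_cd(columns, y, lam1=2, lam2=0)
enet = elastic_net_cd(columns, y, lam1=2, lam2=5)
print("lasso:      ", [round(c, 4) for c in lasso])
print("elastic net:", [round(c, 4) for c in enet])
```

In this run the lasso puts essentially all the weight on the first feature and leaves the second at (or numerically next to) zero, while the added L2 term makes elastic net split the weight evenly between the two copies. This is the group effect mentioned in the points below.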
Important Points:
* It encourages a group effect in case of highly correlated variables
* There are no limitations on the number of selected variables
* It can suffer from double shrinkage
Beyond these 7 most commonly used regression techniques, you can also look at other
models like Bayesian (https://en.wikipedia.org/wiki/Bayesian_linear_regression), Ecological
(https://en.wikipedia.org/wiki/Ecological_regression) and Robust regression
(https://en.wikipedia.org/wiki/Robust_regression).
How to select the right regression model?
Life is usually simple when you know only one or two techniques. One of the training
institutes I know of tells its students: if the outcome is continuous, apply linear
regression; if it is binary, use logistic regression! However, the higher the number of options
at our disposal, the more difficult it becomes to choose the right one. A similar case
happens with regression models.
Among the multiple types of regression models, it is important to choose the best suited
technique based on the type of independent and dependent variables, the dimensionality of the
data and other essential characteristics of the data. Below are the key factors that you
should consider to select the right regression model:
1. Data exploration is an inevitable part of building a predictive model. It should be your first
step before selecting the right model, for example to identify the relationships and impact of
variables.
2. To compare the goodness of fit for different models, we can analyse different metrics like
statistical significance of parameters, R-square, Adjusted R-square, AIC, BIC and the error
term. Another one is the Mallows' Cp
(http://support.minitab.com/en-us/minitab/17/topic-library/modeling-statistics/regression-and-correlation/goodness-of-fit-statistics/what-is-mallows-cp/)
criterion. This essentially checks for possible bias in your model by
comparing the model with all possible submodels (or a careful selection of them).
3. Cross-validation is the best way to evaluate models used for prediction. Here you divide
your data set into two groups (train and validate). A simple mean squared difference between
the observed and predicted values gives you a measure of the prediction accuracy.
4. If your data set has multiple confounding variables, you should not choose an automatic model
selection method because you do not want to put these in a model at the same time.
5. It'll also depend on your objective. It can occur that a less powerful model is easy to
implement as compared to a highly statistically significant model.
Regression regularization methods (Lasso, Ridge and ElasticNet) work well in case of high
dimensionality and multicollinearity among the variables in the data set.
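The train/validate idea from the cross-validation point can be sketched as follows. This is a deliberately simplified hold-out split: the data are invented, the split just alternates rows instead of shuffling, and the "model" is a single-predictor least squares line compared against a mean-only baseline.

```python
def fit_simple_linear(xs, ys):
    """Least squares fit of ys = a + b*xs."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    b = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
         / sum((x - x_mean) ** 2 for x in xs))
    return y_mean - b * x_mean, b


def mean_squared_error(observed, predicted):
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical data: roughly y = 2x + 1 with small noise.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [3.1, 4.8, 7.0, 9.2, 10.9, 13.1, 14.8, 17.0, 19.2, 20.9]

# Split into two groups: alternate rows go to train and validate.
x_train, y_train = xs[::2], ys[::2]
x_valid, y_valid = xs[1::2], ys[1::2]

# Candidate model: straight line fitted on the training group only.
a, b = fit_simple_linear(x_train, y_train)
mse_linear = mean_squared_error(y_valid, [a + b * x for x in x_valid])

# Baseline model: always predict the training mean.
train_mean = sum(y_train) / len(y_train)
mse_baseline = mean_squared_error(y_valid, [train_mean] * len(y_valid))

print("validation MSE, linear model:", round(mse_linear, 4))
print("validation MSE, mean baseline:", round(mse_baseline, 4))
```

The model with the lower error on the validation group is the better candidate for prediction; in practice you would shuffle the rows before splitting, or repeat the split several times (k-fold).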
End Note
By now, I hope you have got an overview of regression. These regression techniques
should be applied considering the conditions of the data. One of the best tricks to find out
which technique to use is checking the family of variables, i.e. discrete or continuous.
In this article, I discussed 7 types of regression and some key facts associated with
each technique. As somebody who's new in this industry, I'd advise you to learn these
techniques and later implement them in your models.
Did you find this article useful? Share your opinions / views in the comments section
below.
Author
Sunil Ray (http://www.analyticsvidhya.com/blog/author/sunil-ray/)
I am a Business Analytics and Intelligence professional with deep experience in the Indian
Insurance industry. I have worked for various multi-national insurance companies in the last 7
years.
20 COMMENTS
AUGUST 14, 2015 AT 6:02 AM
Hi Sunil, Really a nice article for understanding the regression models. Especially for
novice like me who are stepping into Analytic
Sunil Ray says:
AUGUST 14, 2015 AT 7:37 AM
Thanks for the comment
AUGUST 14, 2015 AT 6:24 AM
Could you please provide a material (book/website) where I can understand the concepts
underlying such regression techniques?
Sunil Ray says:
AUGUST 14, 2015 AT 7:40 AM
Thanks for the comment.
You can read the book "The Elements of Statistical Learning"; it has a detailed explanation of
these regression models.
Regards,
Sunil
Julius says:
AUGUST 14, 2015 AT 9:23 AM
I agree with you Sunil, but before reading "The Elements of Statistical Learning", I would
recommend reading An Introduction to Statistical Learning: with Applications in R, which
is more practical because you have to practise with R code. Or you may take the Statistical
Learning course which is offered by the authors of these books; in addition, they are the
inventors of some of these models as well (e.g. Lasso by Tibshirani).
regards,
Julius
AUGUST 14, 2015 AT 8:04 AM
A good refresher on Regression techniques
Tom says:
AUGUST 14, 2015 AT 8:49 AM
Hi Sunil
Thanks for posting this. Very nice summary on a technique used so often but
underutilised when looking at the different forms available. You wouldn't be interested
in doing something similar for classification techniques? Quite a few here as well.
Tom
Sunil Ray says:
AUGUST 14, 2015 AT 8:57 AM
Thanks Tom... you can refer to the article on the most common machine learning algorithms
(http://www.analyticsvidhya.com/blog/2015/08/common-machine-learning-algorithms/).
Here I have discussed various types of classification algorithms
like decision tree, random forest, KNN, Naive Bayes.
Regards,
Sunil
AUGUST 14, 2015 AT 9:14 AM
Dear sir,
Is the regression applied for predicting the results of a college for the coming academic
year.
PratzJoshi says:
AUGUST 14, 2015 AT 9:34 AM
Hi Sunil,
The difference given between linear regression and multiple regression needs
correction. When there is just one independent and one dependent variable, it is called
"simple linear regression", not just linear regression.
Sunil Ray says:
AUGUST 14, 2015 AT 9:43 AM
Hi PratzJoshi,
Thanks for highlighting it.
Regards,
Sunil
AUGUST 14, 2015 AT 11:47 AM
Hey, quite a nice article. It did help me broaden my perspective regarding the regression
techniques (specially ElasticNet), but still it would be nice to elucidate upon the
differences between L1 and L2 regularization techniques. For this,
http://www.quora.com/What-is-the-difference-between-L1-and-L2-regularization
will be very helpful. Though it could be incorporated into a new article, I think.
Paul ga ¥SdirTPwrivW/ANALYTICSVIDHYA COH/BLOG/2015/08/COMPREHENSIVE GUIDEREGRESSIONY/ 7REPLYTOCOM-9Z743HRESPOND)
AUGUST 14, 2015 AT 17 PM (HTTP://WWW.ANALYTICSVIDHYA.COM/BLOG/2015/08/COMPREHENSIVE-GUIDE.
REGRESSION/#COMMENT.92743)
I'm sorry but I am going to complain again.
This is an excellent article. However, when I go to save it or print it, it is a mess!
If I print from IE, the only browser allowed on my network, all the ads and hypertext links
cover the article text: you cannot read the article.
I had suggested having a feature where you use a button to convert the article to a PDF,
which can then be printed without the ads and hypertext. You did it once, then
stopped. Please start again!
Shashi says:
AUGUST 14, 2015 AT 6:27 PM (http://www.analyticsvidhya.com/blog/2015/08/comprehensive-guide-regression/#comment-92754)
Good consolidation of concepts. Sunil, do you have a comprehensive data set on
which we can apply all or a few of the above techniques and see how each regression
behaves? Thanks again.
Sunil Ray says:
AUGUST 18, 2015 AT 9:24 AM (http://www.analyticsvidhya.com/blog/2015/08/comprehensive-guide-regression/#comment-93043)
Shashi,
You can look at scikit-learn example data sets
Regards,
Sunil
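A minimal sketch of that suggestion, assuming scikit-learn is installed: it loads the diabetes data set bundled with scikit-learn and compares a few of the regression techniques from the article by cross-validated R^2. The choice of data set, the penalty strengths and the standardization step are assumptions made for the example, not settings from the article.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import ElasticNet, Lasso, LinearRegression, Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# One of the example data sets shipped with scikit-learn (no download needed).
X, y = load_diabetes(return_X_y=True)

# Illustrative, untuned penalty strengths.
models = {
    "linear": LinearRegression(),
    "ridge": Ridge(alpha=1.0),
    "lasso": Lasso(alpha=1.0),
    "elastic_net": ElasticNet(alpha=1.0, l1_ratio=0.5),
}

scores = {}
for name, model in models.items():
    # Standardize inside the pipeline so each fold is scaled on its own training part.
    pipeline = make_pipeline(StandardScaler(), model)
    # For regressors, cross_val_score uses the estimator's default score, R^2.
    scores[name] = cross_val_score(pipeline, X, y, cv=5).mean()
    print(f"{name}: mean R^2 = {scores[name]:.3f}")
```

In practice the penalty strengths would be tuned, for example with a grid search, before drawing conclusions about which technique behaves best.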
Lalit says:
AUGUST 17, 2015 AT 5:33 AM (http://www.analyticsvidhya.com/blog/2015/08/comprehensive-guide-regression/#comment-92929)
Hi Sunil,
Nice compilation. Suggesting a correction: the elastic net penalty has another parameter
too and is written as lambda * summation(alpha * L2 penalty + (1 - alpha) * L1 penalty).
Also quoting the book by Trevor Hastie et al.: "The elastic-net selects variables like the lasso,
and shrinks together the coefficients of correlated predictors like ridge."
I like your posts, very informative.
-Lalit
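Written out, the penalty described in this comment would look roughly as follows. The symbols are assumptions introduced here: \beta_j are the p regression coefficients, \lambda \ge 0 is the overall penalty strength, and \alpha \in [0, 1] is the mixing weight.

```latex
\lambda \sum_{j=1}^{p} \left( \alpha \, \beta_j^{2} + (1 - \alpha) \, \lvert \beta_j \rvert \right)
```

Setting \alpha = 1 leaves only the ridge (L2) term and \alpha = 0 only the lasso (L1) term. Conventions differ between tools: scikit-learn's `ElasticNet`, for instance, attaches its mixing parameter `l1_ratio` to the L1 term and puts a factor of 0.5 on the L2 term, so the parameterization should be checked before comparing values across libraries.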
Seema says:
AUGUST 18, 2015 AT 6:59 AM (http://www.analyticsvidhya.com/blog/2015/08/comprehensive-guide-regression/#comment-93036)
Hi,
Can you please explain this point mentioned under the logistic regression
multicollinearity part: "However, we have the options to include interaction effects of
categorical variables in the analysis and in the model."
Sunil Ray says:
AUGUST 18, 2015 AT 9:22 AM (http://www.analyticsvidhya.com/blog/2015/08/comprehensive-guide-regression/#comment-93042)
Hi Seema,
Read this article to understand the effect of interaction in detail:
http://www.theanalysisfactor.com/interpreting-interactions-in-regression/
Regards,
Sunil
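To make the idea of an interaction between categorical variables concrete, here is a small sketch using two hypothetical binary predictors `a` and `b` and a made-up, noise-free response. It uses an ordinary least squares fit for simplicity; the same interaction column (the product of the two dummy variables) could equally be added to the design matrix of a logistic model.

```python
import numpy as np

# Two hypothetical binary (0/1) categorical predictors; every combination appears 5 times.
a = np.tile([0.0, 0.0, 1.0, 1.0], 5)
b = np.tile([0.0, 1.0, 0.0, 1.0], 5)

# Made-up response: the effect of `a` is 2.0 when b == 0 but 2.0 + 1.5 when b == 1.
# That dependence of one effect on the level of the other variable is the interaction.
y = 1.0 + 2.0 * a + 0.5 * b + 1.5 * a * b

# Design matrix: intercept, the two main effects, and their product as the interaction column.
X = np.column_stack([np.ones_like(a), a, b, a * b])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

for name, value in zip(["intercept", "a", "b", "a:b"], coef):
    print(f"{name}: {value:.3f}")
```

Without the `a * b` column the model would be forced to assume that the effect of `a` is the same at both levels of `b`.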
Gaurav says:
SEPTEMBER 23, 2015 AT 9:34 PM (http://www.analyticsvidhya.com/blog/2015/08/comprehensive-guide-regression/#comment-95801)
Hi Sunil,
The article seems very interesting. Could you please let me know how we can implement
forward stepwise regression in Python, as we don't have any inbuilt library for it?
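One possible way to do this by hand is sketched below with NumPy only. It is a simplified greedy variant, not a reference implementation: at each step it adds the predictor that most reduces the residual sum of squares and stops when the relative improvement falls below a threshold. The function names, the stopping rule (`min_improvement`) and the synthetic data are all assumptions for the example; classical stepwise procedures usually use an F-test, AIC or BIC instead. Newer scikit-learn releases also provide `sklearn.feature_selection.SequentialFeatureSelector`, which performs a related cross-validated forward selection.

```python
import numpy as np

def rss(X, y, cols):
    """Residual sum of squares of an OLS fit on the given columns plus an intercept."""
    A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)

def forward_stepwise(X, y, max_features, min_improvement=0.05):
    """Greedily add the column that lowers the RSS most; stop on small relative gains."""
    selected = []
    remaining = list(range(X.shape[1]))
    current = rss(X, y, selected)  # intercept-only model
    while remaining and len(selected) < max_features:
        candidates = {c: rss(X, y, selected + [c]) for c in remaining}
        best = min(candidates, key=candidates.get)
        if current - candidates[best] < min_improvement * current:
            break  # the best candidate no longer helps enough
        selected.append(best)
        remaining.remove(best)
        current = candidates[best]
    return selected

# Synthetic data: only columns 2 and 4 actually influence y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 4.0 * X[:, 2] - 3.0 * X[:, 4] + rng.normal(scale=0.1, size=200)

selected = forward_stepwise(X, y, max_features=6)
print("selected columns, in order of entry:", selected)
```

Because selection is greedy, the result depends on the order in which variables enter, and the fit statistics of the final model are optimistic; validating on held-out data is advisable.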
[name illegible] says:
OCTOBER 27, 2015 AT 2:02 PM (http://www.analyticsvidhya.com/blog/2015/08/comprehensive-guide-regression/#comment-98284)
Good article, especially for computer science students. Please provide this and further
articles in PDF. Thank you.
© Copyright 2015 Analytics Vidhya