
School of Computing Science and Engineering

Course Code: R1UC402T    Course Name: Data Analytics

Unit 2
Data Analysis I: Introduction to Regression Modeling,
Techniques and Applications

Name of the Faculty: Dr. Saurabh Raj Sangwan
Program Name: B.Tech (IV Sem)
Regression Modeling
 The process of fitting a line to data.
 Sir Francis Galton (1822-1911), a British anthropologist and
meteorologist, coined the term "regression".
 In "Regression towards mediocrity in hereditary stature" he described
the tendency of offspring to be smaller than large parents and larger than
small parents, now referred to as "regression towards the mean".

Ŷ = Ȳ + (2/3)(X − X̄)

where Ŷ is the expected offspring height, Ȳ the height of average-sized
offspring, X the parent's height, and X̄ the mean height of parents; the
factor 2/3 is the adjustment for how far the parent is from the mean of
parents.
• Regression Modeling is a statistical method to model the relationship between a
dependent (target) variable and one or more independent (predictor) variables.
• More specifically, Regression Modeling helps us to understand how the value of the
dependent variable changes with an independent variable when the other
independent variables are held fixed.
• It predicts continuous/real values such as temperature, age, salary, price, etc.
Terminology

• Dependent Variable:
The main factor in regression analysis which we want to predict or understand is
called the dependent variable. It is also called the target variable.

• Independent Variable:
The factors which affect the dependent variable, or which are used to predict its
values, are called independent variables, also known as predictors.

• Outliers:
An outlier is an observation with either a very low or a very high value in
comparison to the other observed values. An outlier may distort the results, so it should be
handled carefully.

• Multicollinearity:
If the independent variables are highly correlated with each other, the condition
is called multicollinearity. It should not be present in the
dataset, because it creates problems when ranking the most influential variables.

• Underfitting and Overfitting:
If our algorithm works well with the training dataset but not with the test dataset,
the problem is called overfitting.
If our algorithm does not perform well even on the training dataset, the
problem is called underfitting.
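Overfitting can be illustrated with a small sketch: a degree-4 polynomial passes through five nearly linear training points exactly (training error ~0) but predicts a held-out point much worse than a plain straight line. The data below is purely illustrative:

```python
import numpy as np

# Illustrative data: 5 nearly linear training points and one held-out point.
train_x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
train_y = np.array([2.0, 4.1, 5.9, 8.2, 9.9])   # roughly y = 2x
test_x, test_y = 6.0, 12.1                      # continues the same trend

# Degree-1 fit (an appropriate model) vs degree-4 fit (overfits: with 5
# points it interpolates every training point, so training error is ~0).
line = np.poly1d(np.polyfit(train_x, train_y, 1))
quartic = np.poly1d(np.polyfit(train_x, train_y, 4))

print(abs(line(test_x) - test_y))     # small test error
print(abs(quartic(test_x) - test_y))  # much larger test error
```

The quartic "memorizes" the training noise, so it generalizes worse even though its training error is lower.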
• We can understand the concept of regression analysis using the example below:

Example: Suppose there is a marketing company A which runs various advertisements every
year and gets sales from them. The list below shows the advertisements made by the company
in the last 5 years and the corresponding sales:

• Now, the company wants to place an advertisement worth $200 in the year 2019 and wants to
predict the sales for this year.

• To solve such prediction problems in machine learning, we need regression analysis.
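A minimal sketch of this prediction follows. The original table of spend and sales is not reproduced here, so the figures below are purely illustrative:

```python
import numpy as np

# Hypothetical advertisement spend (in $) and sales for the last 5 years;
# the course's original table is not available, so these are made up.
ad_spend = np.array([90.0, 120.0, 150.0, 100.0, 130.0])
sales = np.array([1000.0, 1300.0, 1800.0, 1200.0, 1380.0])

# Fit a straight line  sales = slope * spend + intercept  by least squares.
slope, intercept = np.polyfit(ad_spend, sales, 1)

# Predict sales for the planned $200 advertisement.
predicted = slope * 200 + intercept
print(round(predicted, 2))
```

The fitted line extrapolates past the observed spends to give an estimate for the new $200 campaign.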
• Regression is a supervised learning technique which helps in finding the correlation
between variables and enables us to predict a continuous output variable based on
one or more predictor variables.

• It is mainly used for prediction, forecasting, time series modeling, and determining
the causal-effect relationship between variables.

• In regression, we plot the line or curve that best fits the given
datapoints; using this plot, the machine learning model can make predictions about the
data.

• In simple words, "Regression shows a line or curve that passes through the
datapoints on the target-predictor graph in such a way that the vertical distance between
the datapoints and the regression line is minimum."

• Regression is a statistical procedure that determines the equation for the straight line that
best fits a specific set of data.

• The distance between the datapoints and the line indicates whether the model has captured a strong
relationship or not.
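The "minimum vertical distance" idea can be sketched directly. The data and helper names below are illustrative, not part of the course material:

```python
# Least squares: the fitted line minimizes the sum of squared vertical
# distances (residuals) between the datapoints and the line.
def fit_line(xs, ys):
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
             / sum((x - x_mean) ** 2 for x in xs))
    intercept = y_mean - slope * x_mean
    return slope, intercept

def sum_squared_residuals(xs, ys, slope, intercept):
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 8.0, 9.8]
slope, intercept = fit_line(xs, ys)

# Any other line yields a larger sum of squared residuals than the fit.
best = sum_squared_residuals(xs, ys, slope, intercept)
worse = sum_squared_residuals(xs, ys, slope + 0.5, intercept)
print(best < worse)  # True
```

Tilting or shifting the fitted line in any direction only increases the total squared distance, which is what "best fit" means here.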
Some examples of regression are:

• Prediction of rain using temperature and other factors

• Determining Market trends

• Prediction of road accidents due to rash driving.


Why do we use Regression Analysis?

• As mentioned earlier, regression analysis helps in the prediction of a continuous
variable.

• There are various real-world scenarios where we need future predictions, such
as weather conditions, sales, and marketing trends; for such cases we need a
technique that can make accurate predictions.

• Regression analysis, a statistical method used in machine learning and data
science, serves this purpose.

• Below are some other reasons for using Regression analysis:

• Regression estimates the relationship between the target and the independent variable.
• It is used to find the trends in data.
• It helps to predict real/continuous values.
• By performing the regression, we can confidently determine the most important
factor, the least important factor, and how each factor affects the
others.
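The "most important factor" point can be sketched with a two-predictor regression. Note that comparing coefficient magnitudes is only meaningful when the predictors are on comparable scales, as in this illustrative data:

```python
import numpy as np

# Hypothetical data: y depends strongly on x1 and only weakly on x2.
x1 = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
x2 = np.array([1.0, 0.0, 2.0, 1.0, 3.0])
y = 3.0 * x1 + 0.5 * x2 + 2.0

# Design matrix with an intercept column; solve by least squares.
X = np.column_stack([np.ones_like(x1), x1, x2])
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, b1, b2 = coeffs
print(round(b1, 2), round(b2, 2))  # x1's coefficient dominates
```

The larger coefficient on x1 identifies it as the more influential factor for this dataset.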
Types of Regression

• There are various types of regression used in data science and machine
learning.

• Each type has its own importance in different scenarios, but at the core, all
regression methods analyze the effect of the independent variables on the dependent
variable.

• Here we discuss some important types of regression, given below:

 Linear Regression
 Logistic Regression
 Polynomial Regression
 Support Vector Regression
 Decision Tree Regression
 Random Forest Regression
 Ridge Regression
 Lasso Regression
Techniques for Regression Modeling

1. Ordinary Least Squares (OLS)

2. Maximum Likelihood Estimation (MLE)

3. Gradient Descent

4. Cross-Validation

5. Regularization Techniques (Ridge, Lasso)
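As an illustrative sketch of one of these techniques, here is gradient descent for a simple linear regression; the learning rate and step count below are arbitrary choices, not prescribed values:

```python
# Gradient descent for simple linear regression: repeatedly adjust the
# slope m and intercept b in the direction that reduces the mean squared
# error (MSE) on the data.
def gradient_descent(xs, ys, lr=0.01, steps=5000):
    m, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of MSE with respect to m and b.
        grad_m = (-2 / n) * sum(x * (y - (m * x + b)) for x, y in zip(xs, ys))
        grad_b = (-2 / n) * sum(y - (m * x + b) for x, y in zip(xs, ys))
        m -= lr * grad_m
        b -= lr * grad_b
    return m, b

xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]  # exactly y = 2x + 1
m, b = gradient_descent(xs, ys)
print(round(m, 2), round(b, 2))  # converges toward (2.0, 1.0)
```

On this exactly linear data, gradient descent and the OLS closed-form solution arrive at the same line; gradient descent becomes preferable when the dataset or model is too large for the closed form.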
