Regression Unit-2

Regression analysis is a statistical method used to model the relationship between a dependent variable and one or more independent variables, enabling predictions of continuous values. It is a supervised learning technique primarily used for forecasting, determining causal relationships, and analyzing trends. Types of regression include simple linear regression, multiple linear regression, and nonlinear regression, with applications in various fields such as marketing, real estate, and traffic analysis.


Regression

Regression Analysis
Regression analysis is a statistical method for modeling the relationship between a dependent
(target) variable and one or more independent (predictor) variables. More specifically,
regression analysis helps us understand how the value of the dependent variable changes with
respect to one independent variable while the other independent variables are held fixed. It
predicts continuous/real values such as temperature, age, salary, price, etc.
We can understand the concept of regression analysis with the following example:
Example: Suppose a marketing company A runs various advertisements every year and earns
sales from them. A table (not reproduced here) lists the amount the company spent on
advertising in each of the last 5 years and the corresponding sales:

Now, the company plans to spend $200 on advertising in the year 2023 and wants to predict
the sales for that year. To solve such prediction problems in machine learning, we use
regression analysis.
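As a sketch of this workflow, the snippet below fits a straight line to hypothetical ad-spend/sales figures (the numbers are illustrative, not the company's actual table) and predicts sales for a $200 advertising budget:

```python
import numpy as np

# Hypothetical ad-spend vs. sales figures for the last 5 years
# (illustrative numbers only; the original table is not reproduced here)
ad_spend = np.array([90.0, 120.0, 150.0, 100.0, 130.0])
sales = np.array([1000.0, 1300.0, 1800.0, 1200.0, 1380.0])

# Fit y = a*x + b by least squares (polyfit returns slope first)
a, b = np.polyfit(ad_spend, sales, deg=1)

# Predict sales for a $200 advertisement budget
predicted_sales = a * 200 + b
print(round(predicted_sales, 1))
```

Because the fitted slope is positive and $200 lies above all past budgets, the prediction extrapolates beyond the observed sales.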
Regression is a supervised learning technique that helps find correlations between
variables and enables us to predict a continuous output variable from one or more
predictor variables. It is mainly used for prediction, forecasting, time-series modeling, and
determining cause-and-effect relationships between variables.
In regression, we fit a line or curve that best matches the given data points; using this fit,
the machine learning model can make predictions about the data.
In simple words, "regression finds a line or curve through the data points on the
target-predictor graph such that the vertical distance between the data points and the
regression line is minimized." The distance between the data points and the line tells whether
the model has captured a strong relationship or not.
Some examples of regression are:
 Prediction of rain using temperature and other factors
 Determining Market trends
 Prediction of road accidents due to rash driving.
Terminologies Related to the Regression Analysis:
 Dependent Variable: The main factor in regression analysis that we want to
predict or understand is called the dependent variable. It is also called the target variable.
 Independent Variable: The factors that affect the dependent variable, or that are
used to predict the values of the dependent variable, are called independent variables,
also called predictors.
Outliers: An outlier is an observation with a very low or very high value in
comparison to the other observed values. An outlier can distort the fitted model, so it should
be detected and handled carefully.
Below is the mathematical equation for linear regression:
Y = aX + b
Here, Y is the dependent (target) variable,
X is the independent (predictor) variable, and a and b are the linear coefficients.
A linear regression model's main aim is to find the best-fit line, i.e., the optimal values
of the intercept and coefficients, such that the error is minimized. The error is the difference
between the actual value and the predicted value, and the goal is to reduce this difference.

In a typical diagram of a fitted simple linear regression (figure not reproduced here):

 x is the independent variable, plotted on the x-axis, and y is the dependent
variable, plotted on the y-axis.
 Black dots are the data points, i.e., the actual values; b0 is the intercept (10 in the
original diagram) and b1 is the slope of the x variable.
 The blue line is the best-fit line predicted by the model, i.e., the predicted values lie on
the blue line.
The vertical distance between the data point and the regression line is known as error or
residual. Each data point has one residual and the sum of all the differences is known as the
Sum of Residuals/Errors.
Mathematical approach:
Residual/Error = Actual value − Predicted value
Sum of Residuals/Errors = Σ (Actual − Predicted)
Sum of Squared Residuals/Errors = Σ (Actual − Predicted)²
Note that the square is applied to each residual before summing; minimizing this sum of
squared errors is the basis of the least-squares method.
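The residual calculations above can be sketched in a few lines (the actual and predicted arrays are made-up values for illustration):

```python
import numpy as np

# Hypothetical actual and predicted values (illustrative only)
actual = np.array([3.0, 5.0, 7.0, 9.0])
predicted = np.array([2.5, 5.5, 6.5, 9.5])

residuals = actual - predicted        # one residual per data point
sum_of_residuals = residuals.sum()    # can be near zero even for a poor fit
sse = (residuals ** 2).sum()          # sum of squared errors: what least squares minimizes
print(residuals, sum_of_residuals, sse)
```

Note how the plain sum of residuals cancels to zero here while the sum of squares does not, which is why least squares minimizes the squared errors.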

Some popular applications of linear regression are:


 Analyzing trends and sales estimates
 Salary forecasting
 Real estate prediction
 Arriving at ETAs in traffic.

Types of Linear Regression


Linear regression can be further divided into two types:
 Simple Linear Regression: If a single independent variable is used to predict the
value of a numerical dependent variable, the algorithm is
called simple linear regression.
 Multiple Linear Regression: If more than one independent variable is used to predict
the value of a numerical dependent variable, the algorithm
is called multiple linear regression.
Equation of simple linear regression, where b0 is the intercept, b1 is the coefficient (slope), x
is the independent variable, and y is the dependent variable:
y = b0 + b1x
Equation of multiple linear regression, where b0 is the intercept, b1, b2, b3, …, bn are the
coefficients (slopes) of the independent variables x1, x2, x3, …, xn, and y is the dependent
variable:
y = b0 + b1x1 + b2x2 + b3x3 + … + bnxn
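Both equations can be fitted by ordinary least squares. A minimal sketch for the multiple-regression case, using made-up data generated noiselessly from y = 1 + 1·x1 + 2·x2:

```python
import numpy as np

# Hypothetical data: two predictors (x1, x2) and a response y,
# generated exactly from y = 1 + 1*x1 + 2*x2
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0]])
y = np.array([6.0, 5.0, 12.0, 11.0])

# Prepend a column of ones so the intercept b0 is estimated alongside b1, b2
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
b0, b1, b2 = coef
print(b0, b1, b2)
```

Since the data are noiseless, least squares recovers the generating coefficients b0 = 1, b1 = 1, b2 = 2 (up to floating-point error).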
Linear Regression:
 Straight-line regression analysis involves a response variable, y, and a single predictor
variable, x.
 It is the simplest form of regression and models y as a linear function of x. That is,
y = b + wx, where the variance of y is assumed to be constant, and b and w are regression
coefficients specifying the y-intercept and slope of the line.
 The regression coefficients, w and b, can also be thought of as weights, so we can
equivalently write y = w0 + w1x. These coefficients can be solved for by the method of
least squares, which estimates the best-fitting straight line as the one that minimizes
the error between the actual data and the estimate of the line.
 Let D be a training set consisting of values of the predictor variable, x, for some
population and their associated values of the response variable, y. The training set
contains |D| data points of the form (x1, y1), (x2, y2), …, (x|D|, y|D|).
The regression coefficients can be estimated using this method with the following equations:
w1 = Σᵢ (xᵢ − x̄)(yᵢ − ȳ) / Σᵢ (xᵢ − x̄)²
w0 = ȳ − w1 x̄
where x̄ is the mean value of x1, x2, …, x|D|, and ȳ is the mean value of y1, y2, …, y|D|. The
coefficients w0 and w1 often provide good approximations to otherwise complicated
regression equations.
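These closed-form estimates can be computed directly (the x and y values below are hypothetical, roughly following y ≈ 2x):

```python
import numpy as np

# Hypothetical training set D of (x, y) pairs
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])

x_bar, y_bar = x.mean(), y.mean()

# Closed-form least-squares estimates:
# w1 = sum((xi - x_bar)(yi - y_bar)) / sum((xi - x_bar)^2)
# w0 = y_bar - w1 * x_bar
w1 = ((x - x_bar) * (y - y_bar)).sum() / ((x - x_bar) ** 2).sum()
w0 = y_bar - w1 * x_bar
print(w0, w1)
```

The fitted slope comes out close to 2, matching the trend the data were chosen to follow.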
Multiple Linear Regression:
 It is an extension of straight-line regression that involves more than one predictor
variable.
 It allows the response variable y to be modeled as a linear function of, say, n predictor
variables or attributes, A1, A2, …, An, describing a tuple, X.
An example of a multiple linear regression model based on two predictor attributes or
variables, A1 and A2, is
y = w0 + w1x1 + w2x2
where x1 and x2 are the values of attributes A1 and A2, respectively, in X.
 Multiple regression problems are commonly solved with statistical
software packages, such as SAS, SPSS, and S-Plus.
Nonlinear Regression:
 It can be modeled by adding polynomial terms to the basic linear model.
 By applying transformations to the variables, we can convert the nonlinear model into
a linear one that can then be solved by the method of least squares.
 Polynomial regression is a special case of multiple regression. That is, the addition of
higher-order terms like x², x³, and so on, which are simple functions of the single
variable x, can be considered equivalent to adding new independent variables.
Transformation of a polynomial regression model to a linear regression model:
Consider a cubic polynomial relationship given by
y = w0 + w1x + w2x² + w3x³
To convert this equation to linear form, we define new variables:
x1 = x, x2 = x², x3 = x³
It can then be converted to linear form by applying the above assignments, resulting in the
equation
y = w0 + w1x1 + w2x2 + w3x3
which is easily solved by the method of least squares using software for regression analysis.
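A minimal sketch of this transformation, using noiseless data generated from an assumed cubic (the coefficients are made up for illustration):

```python
import numpy as np

# Hypothetical data generated exactly from y = 1 + 2x + 0.5x^2 - 0.1x^3
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = 1 + 2 * x + 0.5 * x ** 2 - 0.1 * x ** 3

# Transformation: x1 = x, x2 = x^2, x3 = x^3, then fit an ordinary
# linear model y = w0 + w1*x1 + w2*x2 + w3*x3 by least squares
A = np.column_stack([np.ones_like(x), x, x ** 2, x ** 3])
w, *_ = np.linalg.lstsq(A, y, rcond=None)
print(w)
```

Because the data are noiseless and the x values are distinct, least squares recovers the generating coefficients [1.0, 2.0, 0.5, −0.1] up to floating-point error, confirming that the polynomial model is linear in the transformed variables.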
