Regression Methods in
Machine Learning
Multiple Linear Regression
Portland Data Science Group
Created by Andrew Ferlitsch
Community Outreach Officer
July, 2017
Multiple Linear Regression
• Used to model the relationship between two or more independent
variables and a dependent variable.
e.g., Income and Age are correlated with Spending.
• When the data is plotted on a graph, the points appear to lie close
to a hyperplane (a flat plane when there are two independent variables).
[Figure: 3D plot with axes X1 (Independent Variable), X2 (Independent Variable) and Y (Dependent Variable), showing the hyperplane.]
Simple vs. Multiple Linear Regression
• Simple Linear Regression – one independent variable.
y = b0 + b1x1
• Multiple Linear Regression – multiple independent
variables.
y = b0 + b1x1 + b2x2 + … + bnxn
where b0 is the constant, b2x2 is the 2nd independent variable (x2) and
its weight (coefficient, b2), and bnxn is the nth independent variable (xn)
and its weight (coefficient, bn).
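The equation above is a constant plus a weighted sum, which maps directly onto a dot product. The following is a minimal sketch; the function name `predict` and the coefficient values are made up for illustration and do not come from the slides.

```python
import numpy as np

def predict(b0, b, x):
    """Evaluate y = b0 + b1*x1 + ... + bn*xn.

    b0: constant (intercept)
    b:  1-D array of weights (b1 .. bn)
    x:  1-D array of one sample's features, or 2-D array of shape (rows, n)
    """
    return b0 + np.dot(x, b)

# Hypothetical coefficients, chosen only to make the arithmetic easy to follow.
b0 = 1.0
b = np.array([2.0, 3.0])

# One sample with x1 = 4, x2 = 5: y = 1 + 2*4 + 3*5
y_single = predict(b0, b, np.array([4.0, 5.0]))

# Several samples at once: one prediction per row.
y_batch = predict(b0, b, np.array([[4.0, 5.0], [0.0, 0.0]]))
```

With a single independent variable (`b` of length 1) the same function reduces to simple linear regression.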
Feature Elimination
ID   Income   Age   Height   Spending
37   18000    18    5'8"
38   75000    40    5'9"
39   27000    28    6'1"
40   24000    26    5'6"
41   45000    34    6'2"
• In a dataset, we may not want to keep all the independent
variables (features) in our model:
• More features = more complex model.
• If a feature does not contribute to the prediction, it adds noise
to the model.
• ID fields are like random numbers; they do not contribute
to the prediction.
• Height is likely to have little or no influence on spending.
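Dropping the ID and Height columns from the table above could look like the sketch below. It assumes the data is held in a pandas DataFrame, which the slides do not state; the Spending column is left out because its values do not appear on the slide.

```python
import pandas as pd

# Rows copied from the slide's table (Spending values are not given there).
df = pd.DataFrame({
    "ID":     [37, 38, 39, 40, 41],
    "Income": [18000, 75000, 27000, 24000, 45000],
    "Age":    [18, 40, 28, 26, 34],
    "Height": ["5'8\"", "5'9\"", "6'1\"", "5'6\"", "6'2\""],
})

# Remove the features judged not to contribute to the prediction.
X = df.drop(columns=["ID", "Height"])
```

This is manual elimination based on domain judgment; backward elimination replaces that judgment with a statistical test.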
Backward Elimination
• A method for identifying and removing independent
variables that do not contribute enough to the model.
• Steps:
• Fit (train) the model with all the independent variables.
• Calculate the P-value of each independent variable.
• If the highest P-value is above the threshold
(e.g., 0.05 [5 percent]), eliminate that independent variable.
• Repeat (re-fit) until there are no independent variables with a
P-value above the threshold.
[Flowchart: All Variables → Train → Is the highest P-value > Threshold? If yes, eliminate the variable and train again; if no, DONE.]
Multiple Linear Regression in Python
• Perform Linear Regression with all independent variables.
from sklearn.linear_model import LinearRegression  # scikit-learn class for linear regression
regressor = LinearRegression()                     # instantiate linear regression object
regressor.fit(X_train, y_train)                    # train (fit) the model
• Run (Predict) the model on the test data.
y_pred = regressor.predict(X_test)                 # y_pred is the array of predicted results
• Analyze (View) the results by comparing the predicted values (y_pred) to the actual values (y_test).
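The slide's snippet assumes X_train, X_test, y_train and y_test already exist. The sketch below fills in those steps under stated assumptions: the data is synthetic (generated from made-up coefficients, not taken from the slides), the split uses scikit-learn's `train_test_split`, and the comparison of y_pred to y_test uses mean absolute error and R², which are one reasonable choice rather than the deck's prescribed method.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic data: Spending generated from Income and Age plus noise.
# The coefficients (500, 0.1, 20) are hypothetical.
rng = np.random.default_rng(42)
n = 200
income = rng.uniform(18000, 75000, size=n)
age = rng.uniform(18, 40, size=n)
X = np.column_stack([income, age])
y = 500.0 + 0.1 * income + 20.0 * age + rng.normal(scale=100.0, size=n)

# Hold out 20% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

regressor = LinearRegression()
regressor.fit(X_train, y_train)       # train (fit) the model
y_pred = regressor.predict(X_test)    # predict on unseen rows

# Compare predictions to the actual values.
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"MAE: {mae:.1f}  R^2: {r2:.3f}")
print("intercept b0:", regressor.intercept_, "weights:", regressor.coef_)
```

Note that `LinearRegression` fits the constant b0 itself (exposed as `intercept_`), so no column of ones is needed here.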
Backward Elimination in Python
• Prepare for Backward Elimination.
• The statsmodels OLS class does not add the constant b0 automatically.
• Need to supply it by adding an x0 = 1 independent variable (a column of ones) for b0.
import numpy as np
import statsmodels.api as sm  # sm.OLS is exposed by statsmodels.api
# np.ones((nrows, 1)) creates an array of one column of nrows ones;
# np.append(..., axis=1) appends the columns of X to this array, so the ones become column 0.
X = np.append(arr=np.ones((nrows, 1)).astype(int), values=X, axis=1)
• Create array of optional independent variables (features) from which we
will eliminate independent variables.
# ":" selects all rows; start with all columns (i.e., 0, 1, 2 .. N)
X_opt = X[:, [0, 1, 2, 3, 4]]
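To make the column-of-ones step concrete, the sketch below builds it on a tiny array and checks that two NumPy spellings give the same result. The three sample rows reuse Income and Age values from the earlier table; everything else is illustrative. statsmodels also ships a helper, `add_constant` in `statsmodels.api`, that serves the same purpose.

```python
import numpy as np

# Three rows, two features (Income, Age) taken from the earlier table.
X = np.array([
    [18000.0, 18.0],
    [75000.0, 40.0],
    [27000.0, 28.0],
])
nrows = X.shape[0]

# The slide's approach: start from a column of ones and append X to it.
X_append = np.append(arr=np.ones((nrows, 1)), values=X, axis=1)

# An equivalent spelling that some readers find easier to scan.
X_const = np.column_stack([np.ones(nrows), X])

# Column 0 is x0 = 1 (it carries b0); the original features shift right by one.
print(X_const)
```

Because the ones go in front, index 0 always refers to the constant and index k refers to the k-th original feature, which is what the column lists in X_opt rely on.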
Backward Elimination in Python (2)
• Use the Ordinary Least Squares (OLS) class from statsmodels to train (fit)
the model and get P-values.
# endog is the dependent variable (label); exog holds the independent variables
ols = sm.OLS(endog=y, exog=X_opt).fit()  # create OLS object and fit the model
print(ols.summary())                     # display statistical metrics including P-values
• Example elimination of an independent variable (x2).
X_opt = X[:, [0, 1, 3, 4]]               # eliminate x2 (index 2); index 0 is x0 (constant)
ols = sm.OLS(endog=y, exog=X_opt).fit()
print(ols.summary())
• Repeat elimination until the highest P-value among the independent variables is not
greater than the threshold (e.g., 0.05).
