Linear Regression with Multiple Variables
Summary: Linear Regression with Multiple Variables.
Table of Contents
1. Multivariate Linear Regression
– 1a. Multiple Features (Variables)
– 1b. Gradient Descent for Multiple Variables
– 1c. Gradient Descent: Feature Scaling
– 1d. Gradient Descent: Checking
– 1e. Gradient Descent: Learning Rate
– 1f. Features and Polynomial Regression
2. Computing Parameters Analytically
– 2a. Normal Equation
– 2b. Normal Equation Non-invertibility
1. Multivariate Linear Regression
I would like to give full credit to the respective authors, as these are my personal Python notebooks taken from deep learning courses by Andrew Ng, Data School and Udemy :) This is a simple Python notebook hosted generously through GitHub Pages, part of my main personal notes repository at https://github.com/ritchieng/ritchieng.github.io. The notes are meant for my personal review, but I have open-sourced the repository because many people have found it useful.
1a. Multiple Features (Variables)
Instead of a single input variable, we now have multiple features: x₁, x₂, x₃, x₄, and so on
New hypothesis for multivariate linear regression:
h_θ(x) = θ₀ + θ₁x₁ + θ₂x₂ + … + θₙxₙ
Defining x₀ = 1, the hypothesis reduces to a single number: the transposed θ vector multiplied by the feature vector x, i.e. h_θ(x) = θᵀx
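A minimal NumPy sketch of that reduction (the numbers are my own, not from the course):

    import numpy as np

    theta = np.array([1.0, 0.5, 2.0])   # parameters theta_0, theta_1, theta_2
    x = np.array([1.0, 3.0, 4.0])       # feature vector with x_0 = 1 for the intercept

    h = theta @ x                       # theta' * x: the inner product gives a single number
    print(h)                            # 10.5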
1b. Gradient Descent for Multiple Variables
Summary: the cost function J(θ) and the gradient descent update generalize directly from the one-variable case, now over n + 1 parameters
New algorithm: repeat { θⱼ := θⱼ − α · (1/m) · Σᵢ (h_θ(x⁽ⁱ⁾) − y⁽ⁱ⁾) · xⱼ⁽ⁱ⁾ }, simultaneously updating θⱼ for all j = 0, …, n
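A minimal NumPy sketch of this update rule; the function name and defaults are my own, and X is assumed to already include the leading column of ones:

    import numpy as np

    def gradient_descent(X, y, alpha=0.01, num_iters=1500):
        # X: (m, n + 1) design matrix with a column of ones for theta_0
        # y: (m,) vector of targets
        m = len(y)
        theta = np.zeros(X.shape[1])
        for _ in range(num_iters):
            error = X @ theta - y                        # h_theta(x^(i)) - y^(i) for every example
            theta = theta - alpha * (X.T @ error) / m    # simultaneous update of all theta_j
        return theta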
1c. Gradient Descent: Feature Scaling
Ensure features are on a similar scale
Gradient descent takes longer to reach the global minimum when the features are not on a similar scale
Feature scaling lets gradient descent reach the global minimum faster
As long as the features are roughly in that range, they need not be exactly between −1 and 1
Mean normalization: replace xᵢ with (xᵢ − μᵢ) / sᵢ, where μᵢ is the mean of the feature and sᵢ its range (max − min) or standard deviation
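A sketch of mean normalization in NumPy, assuming X holds only the raw features (the column of ones is added afterwards, since the intercept column must not be scaled):

    import numpy as np

    def mean_normalize(X):
        # replace each feature x_i with (x_i - mu_i) / s_i
        mu = X.mean(axis=0)                  # per-feature mean
        s = X.max(axis=0) - X.min(axis=0)    # per-feature range; X.std(axis=0) also works
        return (X - mu) / s, mu, s           # keep mu and s to scale future inputs the same way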
1d. Gradient Descent: Checking
You can use a graph to check that gradient descent is working:
x-axis: number of iterations
y-axis: the value of J(θ) after that many iterations; it should decrease on every iteration
Alternatively, use an automatic convergence test: declare convergence if J(θ) decreases by less than some small ε in one iteration
However, it is tough to choose an appropriate ε
If J(θ) increases instead, gradient descent is not working; the usual culprit is a learning rate that is too large
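A sketch combining both checks; it records J(θ) each iteration for plotting and stops early when the decrease falls below a small ε (the helper name and the value of ε are my own):

    import numpy as np

    def gradient_descent_with_history(X, y, alpha, num_iters, epsilon=1e-3):
        m = len(y)
        theta = np.zeros(X.shape[1])
        J_history = []                                    # J(theta) per iteration, for the plot above
        for _ in range(num_iters):
            error = X @ theta - y
            theta = theta - alpha * (X.T @ error) / m
            error = X @ theta - y                         # residuals at the updated theta
            J_history.append(error @ error / (2 * m))     # J(theta) = (1/2m) * sum of squared errors
            # automatic convergence test: stop once J decreases by less than epsilon
            if len(J_history) > 1 and 0 <= J_history[-2] - J_history[-1] < epsilon:
                break
        return theta, J_history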
1e. Gradient Descent: Learning Rate
Alpha (Learning Rate) too small: slow convergence
Alpha (Learning Rate) too large:
J(θ) may not decrease on every iteration
May fail to converge, or even diverge
Start with α = 0.001 and increase roughly 3× each time (0.001, 0.003, 0.01, 0.03, …) until you reach the largest acceptable α
Then choose a value slightly smaller than that
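A sketch of that search, reusing the gradient_descent_with_history helper from the previous section (the synthetic data is my own):

    import numpy as np

    rng = np.random.default_rng(0)
    X = np.c_[np.ones(50), rng.uniform(0, 10, size=(50, 2))]    # design matrix with bias column
    y = X @ np.array([1.0, 2.0, 3.0]) + rng.normal(0, 0.1, 50)  # targets from known parameters

    # candidate alphas spaced roughly 3x apart, as suggested above
    for alpha in (0.001, 0.003, 0.01, 0.03, 0.1, 0.3):
        theta, J_history = gradient_descent_with_history(X, y, alpha, num_iters=200)
        print(alpha, J_history[-1])   # a diverging run shows a huge (or inf/nan) final cost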
1f. Features and Polynomial Regression
Ensure the chosen features capture the pattern in the data
For house prices versus size, a quadratic model does not make sense: it would eventually curve back down as size grows
A cubic or square-root model fits better
There are algorithms that choose features automatically; these are discussed later
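A sketch of hand-crafting such features for house size (the numbers are invented for illustration):

    import numpy as np

    size = np.array([1000.0, 1500.0, 2000.0, 2500.0])   # house size in square feet

    # cubic model: h = theta_0 + theta_1*size + theta_2*size^2 + theta_3*size^3
    X_cubic = np.c_[np.ones_like(size), size, size**2, size**3]

    # square-root model: h = theta_0 + theta_1*size + theta_2*sqrt(size)
    X_sqrt = np.c_[np.ones_like(size), size, np.sqrt(size)]

    # note: size ~ 10^3, size^2 ~ 10^6, size^3 ~ 10^9,
    # so feature scaling becomes essential with polynomial features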
2. Computing Parameters Analytically
2a. Normal Equation
A method to solve for θ analytically, in a single step
If θ is a single real number (a scalar):
To minimise J(θ), take the derivative dJ/dθ and set it equal to zero
Solve for θ
If θ is a vector of parameters:
Take the partial derivative of J with respect to each θⱼ and set each equal to zero
Solve for all the θⱼ simultaneously
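In matrix form, the same steps give the closed-form solution in one pass (a standard least-squares derivation, writing the cost with the design matrix X):

    J(θ) = (1/2m) ‖Xθ − y‖²
    ∇θ J(θ) = (1/m) Xᵀ(Xθ − y) = 0
    ⟹ XᵀX θ = Xᵀ y
    ⟹ θ = (XᵀX)⁻¹ Xᵀ y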
Minimise Cost Function: Specific Example
X: m × (n + 1) design matrix
m: number of training examples
n: number of features
Xᵀ: (n + 1) × m
XᵀX: [(n + 1) × m] · [m × (n + 1)] = (n + 1) × (n + 1)
(XᵀX)⁻¹Xᵀ: [(n + 1) × (n + 1)] · [(n + 1) × m] = (n + 1) × m
θ = (XᵀX)⁻¹Xᵀy: [(n + 1) × m] · [m × 1] = (n + 1) × 1
Minimise Cost Function: General
In general, the solution is θ = (XᵀX)⁻¹Xᵀy, where y is the m × 1 vector of training targets
Minimise Cost: Octave Code
There is no need for feature scaling when using the normal equation
theta = pinv(X' * X) * X' * y
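For comparison, a NumPy equivalent of that Octave one-liner (my own sketch):

    import numpy as np

    def normal_equation(X, y):
        # closed-form least squares: theta = pinv(X' * X) * X' * y
        return np.linalg.pinv(X.T @ X) @ X.T @ y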
Gradient Descent vs Normal Equation
Gradient Descent                            | Normal Equation
--------------------------------------------|------------------------------------------------
Need to choose alpha                        | No need to choose alpha
Needs many iterations                       | No need to iterate
Works well even when n is large (10,000+)   | Slow if n is large (n = 100 or 1,000 is fine)
Use when the number of features > 1000      | Use so long as the number of features < 1000
2b. Normal Equation Non-invertibility
What happens if XᵀX is non-invertible (singular or degenerate)?
Octave: theta = pinv(X' * X) * X' * y
pinv computes the pseudo-inverse, so this works regardless of whether XᵀX is invertible
Intuition and causes of non-invertibility:
Redundant features: two features are linearly dependent, e.g. house size in feet² and in m²
Too many features: more features than training examples (m ≤ n); delete some features or use regularization
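A small sketch of the redundant-feature case; size in m² is an exact multiple of size in feet², so XᵀX is singular, yet pinv still returns a usable θ (the data is invented):

    import numpy as np

    size_ft2 = np.array([1000.0, 1500.0, 2000.0, 3000.0])
    size_m2 = size_ft2 * 0.092903                    # linearly dependent duplicate feature
    X = np.c_[np.ones_like(size_ft2), size_ft2, size_m2]
    y = np.array([200.0, 290.0, 410.0, 590.0])       # prices, in thousands

    # np.linalg.inv(X.T @ X) is unreliable here: the matrix is singular
    theta = np.linalg.pinv(X.T @ X) @ X.T @ y        # the pseudo-inverse works regardless
    print(X @ theta)                                 # predictions still fit the data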