Notes 2. Linear Regression with Multiple Variables
Similarly, instead of thinking of J as a function of the n+1 numbers θ0, θ1, ..., θn, think of J as just a function of the parameter vector θ:
J(θ)
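For reference, J is the same squared-error cost as in the single-variable case:
J(θ) = 1/(2m) * Σ_{i=1..m} (hθ(x^(i)) - y^(i))^2, where hθ(x) = θ^T x = θ0x0 + θ1x1 + ... + θnxn (with x0 = 1)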
Gradient descent
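Each θj is updated simultaneously using the partial derivative of J(θ): θj := θj - α * (1/m) * Σ_{i=1..m} (hθ(x^(i)) - y^(i)) * xj^(i). A minimal vectorized sketch in Octave (the names X, y, theta, alpha and num_iters are illustrative; X is assumed to already contain the x0 = 1 column):

    function theta = gradient_descent(X, y, theta, alpha, num_iters)
      % X: [m x (n+1)] design matrix, first column all ones (x0 = 1)
      % y: [m x 1] targets; theta: [(n+1) x 1] parameters
      m = length(y);
      for iter = 1:num_iters
        % Vectorized simultaneous update of every theta_j
        theta = theta - (alpha / m) * (X' * (X * theta - y));
      end
    end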
Learning Rate α
Focus on the learning rate (α)
Topics
Update rule
Debugging
How to choose α
If, for example, after 1000 iterations you find J(θ) is being reduced by nearly nothing, you could choose to run only 1000 iterations in the future
Make sure you don't accidentally hard-code thresholds like this and then forget why they're there, though!
But if α is too large you overshoot, so reduce the learning rate so you actually reach the minimum (green line)
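One way to implement the debugging described above is to record J(θ) on every iteration and plot it against iteration number; with a well-chosen α it should decrease on every iteration. A sketch in Octave, reusing the X, y, theta, alpha, m and num_iters names assumed above (the 1e-3 threshold is exactly the kind of hard-coded value to be careful with):

    J_history = zeros(num_iters, 1);
    for iter = 1:num_iters
      theta = theta - (alpha / m) * (X' * (X * theta - y));
      J_history(iter) = (1 / (2 * m)) * sum((X * theta - y) .^ 2);  % J(theta)
      % Optional automatic convergence test
      if iter > 1 && (J_history(iter - 1) - J_history(iter)) < 1e-3
        break;
      end
    end
    plot(1:iter, J_history(1:iter));  % should fall on every iteration
    xlabel('Number of iterations'); ylabel('J(theta)');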
Example
House price prediction
Two features
Frontage - width of the plot of land along the road (x1)
Depth - depth of the plot away from the road (x2)
You don't have to use just two features
Can create new features
Might decide that an important feature is the land area
So, create a new feature: x3 = frontage * depth (see the sketch after this list)
hθ(x) = θ0 + θ1x3
Area is a better indicator
Often, by defining new features you may get a better model
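A sketch of the feature construction above in Octave, assuming frontage and depth are [m x 1] column vectors of raw data:

    x3 = frontage .* depth;    % new land-area feature
    X  = [ones(m, 1), x3];     % design matrix for h(x) = theta0 + theta1 * x3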
Polynomial regression
May fit the data better
e.g. hθ(x) = θ0 + θ1x + θ2x^2, a quadratic function
For housing data you could use a quadratic function
But it may not fit the data so well: a quadratic eventually turns back down, which would mean housing prices decrease when size gets really big
So instead use a cubic function, hθ(x) = θ0 + θ1x + θ2x^2 + θ3x^3 (see the sketch below)
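A sketch of building the cubic hypothesis's features in Octave, with x the [m x 1] size feature; feature scaling matters here because x, x^2 and x^3 take on very different ranges of values:

    X_poly = [ones(m, 1), x, x .^ 2, x .^ 3];
    % Mean-normalize the non-constant columns, since the powers of x
    % differ by orders of magnitude:
    mu    = mean(X_poly(:, 2:end));
    sigma = std(X_poly(:, 2:end));
    X_poly(:, 2:end) = (X_poly(:, 2:end) - mu) ./ sigma;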
Normal equation
For some linear regression problems the normal equation provides a better solution
So far we've been using gradient descent
An iterative algorithm which takes steps to converge
The normal equation solves for θ analytically
Solves for the optimum value of θ directly, in one step
Has some advantages and disadvantages
In the example here: m = 4 (training examples), n = 4 (features)
To implement the normal equation
Take examples
Add an extra column (x0 feature)
Construct a matrix X (the design matrix) which contains all the training-data features; it is an [m x (n+1)] matrix
Do something similar for y
Construct a column vector y, an [m x 1] vector
Use the following equation: θ = (X^T X)^-1 X^T y
If you compute this, you get the value of θ which minimizes the cost function
General case
Vector y
Obtained by stacking all the y values into a column vector
In Octave: theta = pinv(X' * X) * X' * y
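Putting the pieces together, a minimal end-to-end sketch in Octave; the feature and price values below are made up for illustration:

    data = [2104 5; 1416 3; 1534 3; 852 2];   % m = 4 examples, n = 2 features (illustrative)
    y    = [460; 232; 315; 178];              % illustrative prices
    m    = size(data, 1);
    X    = [ones(m, 1), data];                % add the x0 = 1 column -> [m x (n+1)]
    theta = pinv(X' * X) * X' * y;            % the theta that minimizes J(theta)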
When should you use gradient descent and when should you use the normal equation?
Gradient descent
Need to choose a learning rate
Needs many iterations - could make it slower
Works well even when n is massive (millions)
Better suited to big data
What is a big n, though?
100 or even 1000 is still (relatively) small
If n is 10,000 then look at using gradient descent
Normal equation
No need to choose a learning rate
No need to iterate, check for convergence etc.
The normal equation needs to compute (X^T X)^-1
This is the inverse of an n x n matrix
With most implementations, computing a matrix inverse costs roughly O(n^3)
So not great
Slow if n is large
Can be much slower