Linear Regression With One Variable
MODEL REPRESENTATION:
➢ Other Notations:
X = space of input values
Y = space of output values
Dataset = list of m training examples → (x^(i), y^(i)); i = 1, 2, . . . , m (see the sketch below)
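A minimal Python sketch of this notation, assuming NumPy; the data values and all variable names (x, y, m, h) are illustrative, not taken from the notes:

import numpy as np

# m = 4 training examples: x holds the inputs x^(i), y the outputs y^(i).
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 8.1])
m = len(x)

# Hypothesis h(x) = Θ0 + Θ1*x for a given choice of parameters.
def h(x, theta0, theta1):
    return theta0 + theta1 * x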
➢ Vertical lines in the plot represent the difference between the
predicted value h(x^(i)) and the actual value y^(i) for each training
example. The cost function J(Θ0, Θ1) = (1/2m) * Σ (h(x^(i)) − y^(i))^2
sums the squares of these differences (computed in the snippet below).
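Continuing the sketch above (same assumed x, y, m and h), the cost can
be computed directly from these vertical differences:

def J(theta0, theta1):
    # Sum of squared vertical differences, scaled by 1/(2m).
    return np.sum((h(x, theta0, theta1) - y) ** 2) / (2 * m)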
≫ For a more complex hypothesis function like h(x) = Θ0 + Θ1x, we have
to plot J(Θ0, Θ1) in 3D, since J takes a different value for each
combination of Θ0 and Θ1.
These surfaces are more easily represented using contour figures: a
contour plot is a graph that contains many contour lines, and a contour
line of a two-variable function has a constant value at all points on
the line.
Graph between Θ0 and Θ1: each ellipse is the set of combinations of Θ0
and Θ1 for which the value of J is the same. Points not lying on an
ellipse are also valid; each of them likewise corresponds to a unique
value of J.
➢ The best combination of Θ0 and Θ1 (the one which minimizes
J(Θ0, Θ1)) lies around the centre of the innermost ellipse; a sketch of
such a plot follows.
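A minimal sketch of such a contour figure, assuming matplotlib and the
J defined earlier; the grid ranges are arbitrary illustrative choices:

import matplotlib.pyplot as plt

theta0_vals = np.linspace(-10.0, 10.0, 100)
theta1_vals = np.linspace(-2.0, 4.0, 100)
T0, T1 = np.meshgrid(theta0_vals, theta1_vals)

# Evaluate J at every (Θ0, Θ1) combination on the grid.
J_grid = np.vectorize(J)(T0, T1)

# Each contour line joins combinations with the same value of J;
# the minimum lies near the centre of the innermost ellipse.
plt.contour(T0, T1, J_grid, levels=30)
plt.xlabel("theta0")
plt.ylabel("theta1")
plt.show()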
The slope of the tangent is the derivative of J at that point, and it
gives us the direction to move in. We take steps down the cost function
in the direction of steepest descent. The size of each step is
determined by the parameter α, called the learning rate, so each
parameter is updated as Θj := Θj − α * (∂/∂Θj) J(Θ0, Θ1).
α = Learning rate
SIMULTANEOUS UPDATE: first we calculate the new values for both Θ0
and Θ1, and only then do we assign them. So the order of execution of
the statements is:
temp0 := Θ0 − α * (∂/∂Θ0) J(Θ0, Θ1)
temp1 := Θ1 − α * (∂/∂Θ1) J(Θ0, Θ1)
Θ0 := temp0
Θ1 := temp1
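A minimal Python sketch of this order of execution, reusing the x, y, m
and h assumed in the earlier snippets; α = 0.01 is an illustrative
value:

alpha = 0.01  # illustrative learning rate

def step(theta0, theta1):
    error = h(x, theta0, theta1) - y
    # Compute both new values first ...
    temp0 = theta0 - alpha * np.sum(error) / m       # α * ∂J/∂Θ0
    temp1 = theta1 - alpha * np.sum(error * x) / m   # α * ∂J/∂Θ1
    # ... and only then hand both back to be assigned together.
    return temp0, temp1

theta0, theta1 = 0.0, 0.0
for _ in range(1000):
    theta0, theta1 = step(theta0, theta1)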
If the value of α is too small: gradient descent takes tiny steps
towards the minimum, so convergence is slow.
If α is too large: gradient descent takes huge steps. In that case it
may overshoot the minimum whenever the difference between the current
Θ and Θmin is smaller than the size of the jump in Θ (α * derivative
of J), and it can then keep moving further and further away from the
minimum (see the small experiment below).
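A small experiment, under the same assumptions as the snippets above,
that makes both failure modes visible; the two α values are
illustrative, not tuned:

for alpha in (0.001, 0.5):
    t0, t1 = 0.0, 0.0
    for _ in range(20):
        error = h(x, t0, t1) - y
        t0, t1 = (t0 - alpha * np.sum(error) / m,
                  t1 - alpha * np.sum(error * x) / m)
    print(f"alpha = {alpha}: J = {J(t0, t1):.3g}")

# For this data the small alpha leaves J barely reduced after 20 steps,
# while the large one overshoots the minimum and J blows up.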