
Cost function
For the parameter vector $\theta$ (in $\mathbb{R}^{n+1}$, or equivalently $\mathbb{R}^{(n+1)\times 1}$), the cost function is:

$$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

The vectorized version is:

$$J(\theta) = \frac{1}{2m} (X\theta - \vec{y})^T (X\theta - \vec{y})$$

where $\vec{y}$ denotes the vector of all the target values $y^{(i)}$.
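
For concreteness, here is a minimal NumPy sketch of the vectorized cost (not part of the original notes); it assumes the design matrix X already includes the column of ones for $x_0 = 1$:

```python
import numpy as np

def compute_cost(X, y, theta):
    """Vectorized cost: J(theta) = (1/(2m)) * (X@theta - y).T @ (X@theta - y).

    X:     (m, n+1) design matrix, first column assumed to be all ones
    y:     (m,)     vector of target values
    theta: (n+1,)   parameter vector
    """
    m = y.shape[0]
    errors = X @ theta - y           # h_theta(x^(i)) - y^(i) for every example
    return (errors @ errors) / (2 * m)
```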

Gradient Descent for Multiple Variables


The gradient descent equation itself is generally the same form; we just have to repeat it for our 'n' features:

repeat until convergence: {


$$\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x_0^{(i)}$$
$$\theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x_1^{(i)}$$
$$\theta_2 := \theta_2 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x_2^{(i)}$$
$$\cdots$$

}

In other words:

repeat until convergence: {


$$\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x_j^{(i)} \quad \text{for } j := 0 \dots n$$
}
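
As an illustration (not from the original notes), a minimal sketch of this update loop in NumPy; the parameters `alpha` and `num_iters` are assumed hyperparameters, and X is again assumed to include the $x_0 = 1$ column. All parameters $\theta_j$ are updated simultaneously, since the gradient is computed from the current $\theta$ before any component changes:

```python
import numpy as np

def gradient_descent(X, y, theta, alpha, num_iters):
    """Run batch gradient descent for num_iters iterations.

    Each step applies the vectorized form of the update rule above:
        theta := theta - (alpha/m) * X.T @ (X @ theta - y)
    """
    m = y.shape[0]
    for _ in range(num_iters):
        gradient = (X.T @ (X @ theta - y)) / m    # partial derivatives of J(theta)
        theta = theta - alpha * gradient           # simultaneous update of every theta_j
    return theta

# Example usage on made-up data (hypothetical, for illustration only):
# X = np.c_[np.ones(m), features]                  # prepend the x_0 = 1 column
# theta = gradient_descent(X, y, np.zeros(X.shape[1]), alpha=0.01, num_iters=1500)
```

In practice, "repeat until convergence" is usually approximated by running a fixed number of iterations or by stopping once the decrease in $J(\theta)$ between iterations falls below a small tolerance.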

