Gradient Descent Algorithm
Gradient Descent
Gradient Descent is just like Agile Methodology:
- Build something quickly
- Make changes depending upon the feedback
Algorithm:
- initialize the θ's randomly
- keep changing the θ's to reduce J(θ) until we hopefully end up at a minimum
Gradient Descent
Let's have some function J(θ).
Algorithm:
- initialize the θ's randomly
- repeat until convergence {
$\theta_i := \theta_i - \alpha \frac{\partial}{\partial \theta_i} J(\theta)$
}
Here α is the learning rate.
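The loop above can be sketched in Python roughly as follows. This is a minimal sketch, not the deck's own code: the function name `gradient_descent`, the `grad` callable, the stopping tolerance `tol`, the iteration cap, the fixed starting point, and the example cost function are all assumptions made for illustration.

```python
def gradient_descent(grad, theta, alpha=0.1, tol=1e-8, max_iters=10_000):
    """Repeat theta_i := theta_i - alpha * dJ/dtheta_i until the step is tiny.

    grad  -- hypothetical callable: takes the list of thetas and returns
             the list of partial derivatives of J, one per theta
    theta -- list of initial parameter values
    """
    theta = list(theta)
    for _ in range(max_iters):
        g = grad(theta)
        # Compute every new theta from the old thetas before overwriting.
        new_theta = [t - alpha * gi for t, gi in zip(theta, g)]
        # "Convergence" here means the largest change fell below tol (assumed criterion).
        if max(abs(n - t) for n, t in zip(new_theta, theta)) < tol:
            return new_theta
        theta = new_theta
    return theta


# Illustrative cost: J(theta) = (theta_0 - 1)^2 + (theta_1 + 2)^2, minimum at (1, -2).
result = gradient_descent(lambda th: [2 * (th[0] - 1), 2 * (th[1] + 2)], [0.0, 0.0])
print(result)
```

The starting point is fixed at (0, 0) here so the run is repeatable; the slides initialize randomly.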
Gradient Descent
Algorithm:
- initialize θ1 randomly
- keep changing θ1 to reduce J(θ1) until we hopefully end up at a minimum
Gradient Descent
Let's have some function J(θ1).
Algorithm:
- initialize θ1 randomly
- repeat until convergence {
$\theta_1 := \theta_1 - \alpha \frac{\partial}{\partial \theta_1} J(\theta_1)$
}
Gradient Descent
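$J(\theta_1) = (\theta_1 - 3)^2 + 5$
$\frac{\partial}{\partial \theta_1} J(\theta_1) = 2(\theta_1 - 3)$
Update rule, with α = 0.1:
$\theta_1 := \theta_1 - \alpha \frac{\partial}{\partial \theta_1} J(\theta_1)$

θ1     J(θ1)
-6     86
-5     69
-4     54
-3     41
-2     30
-1     21
 0     14
 1      9
 2      6
 3      5
 4      6
 5      9
 6     14
 7     21
 8     30
 9     41
10     54
11     69
12     86
13    105

Two starting points are considered: θ1 = 10 and θ1 = -5.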
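As a worked check of the update rule, here are the first two steps from each starting point, using α = 0.1 and the derivative 2(θ1 - 3). The arithmetic is added here for illustration and is not taken from the slides.

$$
\begin{aligned}
\theta_1 = 10:&\quad 10 - 0.1 \cdot 2(10 - 3) = 10 - 1.4 = 8.6, \qquad 8.6 - 0.1 \cdot 2(8.6 - 3) = 8.6 - 1.12 = 7.48 \\
\theta_1 = -5:&\quad -5 - 0.1 \cdot 2(-5 - 3) = -5 + 1.6 = -3.4, \qquad -3.4 - 0.1 \cdot 2(-3.4 - 3) = -3.4 + 1.28 = -2.12
\end{aligned}
$$

From either side, θ1 moves toward the minimum at θ1 = 3, and the steps shrink as the derivative gets smaller.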
Q&A
Impact of learning rate in Gradient Descent
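The figures for this section did not survive in the text, so the following is only a rough illustration, assuming the deck's example J(θ1) = (θ1 - 3)^2 + 5 and a start at θ1 = 10. For this particular J, each update multiplies the distance to the minimum by (1 - 2α), so a very small α converges slowly and an α above 1 moves away from the minimum. The function name `run`, the step count, and the α values tried are assumptions chosen for the demonstration.

```python
def run(alpha, theta1=10.0, steps=20):
    """Apply theta1 := theta1 - alpha * 2 * (theta1 - 3) a fixed number of times."""
    for _ in range(steps):
        theta1 = theta1 - alpha * 2 * (theta1 - 3)
    return theta1


# Illustrative learning rates: too small, reasonable, oscillating, diverging.
for alpha in (0.01, 0.1, 0.9, 1.1):
    print(f"alpha={alpha}: theta1 after 20 steps = {run(alpha):.4f}")
```

With α = 0.01 the result is still far from 3 after 20 steps, with α = 0.1 it is close to 3, with α = 0.9 it also ends close to 3 but jumps across the minimum on every step, and with α = 1.1 it ends much farther from 3 than where it started.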
Q&A
How to implement Gradient Descent
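$J(\theta_1) = (\theta_1 - 3)^2 + 5$
$\frac{\partial}{\partial \theta_1} J(\theta_1) = 2(\theta_1 - 3)$

Algorithm:
- initialize θ1 randomly
- repeat until convergence {
$\theta_1 := \theta_1 - \alpha \frac{\partial}{\partial \theta_1} J(\theta_1)$
}

Substituting the derivative, for either initialization (θ1 = 10 or θ1 = -5):
Repeat until convergence {
$\theta_1 := \theta_1 - \alpha \cdot 2(\theta_1 - 3)$
}

θ1     J(θ1)
-6     86
-5     69
-4     54
-3     41
-2     30
-1     21
 0     14
 1      9
 2      6
 3      5
 4      6
 5      9
 6     14
 7     21
 8     30
 9     41
10     54
11     69
12     86
13    105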
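One possible way to implement this one-parameter case in Python is sketched below. It uses the deck's J(θ1), its derivative 2(θ1 - 3), α = 0.1, and the two initializations θ1 = 10 and θ1 = -5. The names `J`, `dJ`, `descend`, the tolerance `tol`, and the rule "stop when the step is smaller than tol" are assumptions, since the slides do not define convergence precisely.

```python
def J(theta1):
    """Cost from the slides: J(theta1) = (theta1 - 3)^2 + 5."""
    return (theta1 - 3) ** 2 + 5


def dJ(theta1):
    """Derivative of J with respect to theta1."""
    return 2 * (theta1 - 3)


def descend(theta1, alpha=0.1, tol=1e-6, max_iters=1000):
    """Repeat theta1 := theta1 - alpha * dJ(theta1) until the step is below tol."""
    for i in range(max_iters):
        step = alpha * dJ(theta1)
        theta1 = theta1 - step
        if abs(step) < tol:  # assumed convergence criterion
            return theta1, i + 1
    return theta1, max_iters


from_10, iters_10 = descend(10.0)
from_neg5, iters_neg5 = descend(-5.0)
print(f"start 10: theta1={from_10:.5f}, J={J(from_10):.5f}, iterations={iters_10}")
print(f"start -5: theta1={from_neg5:.5f}, J={J(from_neg5):.5f}, iterations={iters_neg5}")
```

Both runs should end near θ1 = 3, where J takes its minimum value of 5.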
How to implement Gradient Descent
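Goal: $\min_{\theta_0, \theta_1} J(\theta_0, \theta_1)$
Algorithm:
- initialize the θ's randomly
- repeat until convergence {
$\theta_i := \theta_i - \alpha \frac{\partial}{\partial \theta_i} J(\theta_0, \theta_1)$ (for i = 0 and i = 1)
}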
How to implement Gradient Descent
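Cost function: J(θ0, θ1)
Goal: $\min_{\theta_0, \theta_1} J(\theta_0, \theta_1)$
Algorithm:
- initialize the θ's randomly
- repeat until convergence {
$\theta_i := \theta_i - \alpha \frac{\partial}{\partial \theta_i} J(\theta_0, \theta_1)$ (for i = 0 and i = 1)
}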
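Correct: Simultaneous Update
$\text{temp0} := \theta_0 - \alpha \frac{\partial}{\partial \theta_0} J(\theta_0, \theta_1)$
$\text{temp1} := \theta_1 - \alpha \frac{\partial}{\partial \theta_1} J(\theta_0, \theta_1)$
$\theta_0 := \text{temp0}$
$\theta_1 := \text{temp1}$

Incorrect
$\text{temp0} := \theta_0 - \alpha \frac{\partial}{\partial \theta_0} J(\theta_0, \theta_1)$
$\theta_0 := \text{temp0}$
$\text{temp1} := \theta_1 - \alpha \frac{\partial}{\partial \theta_1} J(\theta_0, \theta_1)$
$\theta_1 := \text{temp1}$

In the incorrect version, θ0 is overwritten before temp1 is computed, so the derivative for θ1 is evaluated at the new θ0 instead of the old one.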
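The difference between the two orderings shows up after a single step whenever the derivative for θ1 depends on θ0. The sketch below assumes a made-up cost J(θ0, θ1) = θ0^2 + θ0·θ1 + θ1^2, which is not from the slides and is chosen only because its partial derivatives couple the two parameters. The function names and the starting point (1, 1) are also hypothetical.

```python
ALPHA = 0.1  # learning rate, same value as in the one-parameter example


def dJ_dtheta0(theta0, theta1):
    """Partial derivative of the assumed J = theta0^2 + theta0*theta1 + theta1^2."""
    return 2 * theta0 + theta1


def dJ_dtheta1(theta0, theta1):
    return theta0 + 2 * theta1


def simultaneous_step(theta0, theta1):
    """Correct: both temps are computed from the old thetas, then assigned."""
    temp0 = theta0 - ALPHA * dJ_dtheta0(theta0, theta1)
    temp1 = theta1 - ALPHA * dJ_dtheta1(theta0, theta1)
    return temp0, temp1


def sequential_step(theta0, theta1):
    """Incorrect: theta0 is overwritten before temp1 is computed."""
    temp0 = theta0 - ALPHA * dJ_dtheta0(theta0, theta1)
    theta0 = temp0
    temp1 = theta1 - ALPHA * dJ_dtheta1(theta0, theta1)  # uses the NEW theta0
    theta1 = temp1
    return theta0, theta1


correct = simultaneous_step(1.0, 1.0)
incorrect = sequential_step(1.0, 1.0)
print("simultaneous:", correct)
print("sequential:  ", incorrect)
```

Starting from (1, 1), the simultaneous step gives approximately (0.7, 0.7), while the sequential version gives approximately (0.7, 0.73), because its θ1 update already sees θ0 = 0.7.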
Q&A