Lecture1 Slides 1
Attendance Code:
26444542
#Code
● Classifier: $f(w, x) = \operatorname{sign}(w^T x)$, with $f: \mathbb{R}^D \to \{-1, +1\}$
[Figure: decision boundary $w_0^T x = 0$ with the point $y_1 = -1$ in class “didn’t buy” ($y = -1$)]
○ $y_1 = -1$: correctly classified, $\mathbb{I}[f(w, x_1) \neq y_1] = 0$
○ Perceptron error: the sum, over every misclassified data point, of its perpendicular distance to the decision boundary (up to a scale factor that depends only on $w$),
○ $F(w) = \sum_{i=1}^{N} \max(0, -y_i w^T x_i)$
■ ‘Penalizes’ incorrect decisions by their distance from the decision boundary $w^T x = 0$, measured in the direction of $w$ (perpendicular distance).
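The classifier and the perceptron error can be sketched in a few lines of plain Python. The function names, the toy weights, and the three toy points below are hypothetical and are not the lecture's data. Mapping a score of exactly zero to +1 is also an assumed convention, since the slides do not say how sign(0) is handled.

```python
def score(w, x):
    """Inner product w^T x; x carries a leading 1 for the bias term."""
    return sum(wj * xj for wj, xj in zip(w, x))

def predict(w, x):
    """f(w, x) = sign(w^T x), with sign(0) mapped to +1 (assumed convention)."""
    return 1 if score(w, x) >= 0 else -1

def perceptron_error(w, X, y):
    """F(w) = sum_i max(0, -y_i * w^T x_i); only misclassified points contribute."""
    return sum(max(0, -yi * score(w, xi)) for xi, yi in zip(X, y))

# Hypothetical toy example: bias -1, one feature with weight 1.
w_toy = [-1, 1]
X_toy = [[1, 3], [1, 0], [1, 2]]
y_toy = [+1, -1, -1]

print([predict(w_toy, x) for x in X_toy])    # [1, -1, 1]
print(perceptron_error(w_toy, X_toy, y_toy)) # 1
```

Only the third toy point is misclassified (score 1, label −1), so it alone contributes to the error.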
Example: health insurance company
● Initial guess: $w_0 = (-130, 2, 1)^T$
[Figure: data points and the decision boundary $w_0^T x = 0$, separating class “bought” ($y = +1$) from class “didn’t buy” ($y = -1$)]
○ $y_1 = -1$: correctly classified, $\mathbb{I}[f(w, x_1) \neq y_1] = 0$
○ $y_2 = +1$: correctly classified, $\mathbb{I}[f(w, x_2) \neq y_2] = 0$
○ Now let’s look at $y_3$: $y_3 = +1$ is incorrectly classified, $\mathbb{I}[f(w, x_3) \neq y_3] = 1$
○ How about $y_7$? $y_7 = -1$ is incorrectly classified, $\mathbb{I}[f(w, x_7) \neq y_7] = 1$
○ Perceptron error: sum of perpendicular distances of every misclassified data point to the decision boundary,
○ $F(w) = \sum_{i=1}^{N} \max(0, -y_i w^T x_i)$
■ ‘Penalizes’ incorrect decisions by the distance from the decision boundary $w^T x = 0$ in the direction $w$ (perpendicular distance).
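The checks on this slide can be reproduced numerically. The sketch below assumes the feature vectors $x_2 = (1, 45, 60)^T$, $x_3 = (1, 30, 50)^T$ and $x_7 = (1, 40, 55)^T$, which are the vectors used in the update steps later in the lecture. Point 1 is left out because its features do not appear anywhere in the slides. The helper names are hypothetical.

```python
def score(w, x):
    """Inner product w^T x; x carries a leading 1 for the bias term."""
    return sum(wj * xj for wj, xj in zip(w, x))

def predict(w, x):
    """sign(w^T x), with sign(0) mapped to +1 (assumed convention)."""
    return 1 if score(w, x) >= 0 else -1

w0 = [-130, 2, 1]  # initial guess from the slide

# index -> (feature vector with leading 1, label); taken from the later update steps
points = {
    2: ([1, 45, 60], +1),
    3: ([1, 30, 50], +1),
    7: ([1, 40, 55], -1),
}

report = {}
for i, (x, y) in points.items():
    s = score(w0, x)
    report[i] = {
        "score": s,
        "misclassified": int(predict(w0, x) != y),  # the indicator I[f(w, x_i) != y_i]
        "penalty": max(0, -y * s),                  # this point's term in F(w)
    }
    print(i, report[i])
```

Under these assumptions point 2 gets a score of 20 and is correct, while points 3 and 7 get scores of −20 and 5, are misclassified, and contribute 20 and 5 to $F(w_0)$.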
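Perceptron classification
● Perceptron error: $F(w) = \sum_{i=1}^{N} \max(0, -y_i w^T x_i)$
● Gradient with respect to $w$: $F_w(w) = -\sum_{i=1}^{N} y_i x_i \, \mathbb{I}[-y_i w^T x_i \geq 0]$
● Data table (rows 8 to 10):
  i    x_1   x_2   Bought
  8    60    80    Yes
  9    50    40    No
  10   28    35    No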
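As a brief sketch of where the gradient on this slide comes from and how it turns into a weight update: the step size $\alpha$ is borrowed from the worked example later in the lecture, and the per-point update in the last line is the standard perceptron rule, stated here as an assumption about what the slides intend.

```latex
\begin{align*}
\frac{\partial}{\partial w}\max\left(0, -y_i w^T x_i\right)
  &= \begin{cases}
       -y_i x_i, & -y_i w^T x_i > 0 \\
       0,        & -y_i w^T x_i < 0
     \end{cases} \\
F_w(w) &= -\sum_{i=1}^{N} y_i x_i \, \mathbb{I}\left[-y_i w^T x_i \geq 0\right] \\
w &\leftarrow w - \alpha F_w(w)
   = w + \alpha \sum_{i=1}^{N} y_i x_i \, \mathbb{I}\left[-y_i w^T x_i \geq 0\right] \\
w &\leftarrow w + \alpha\, y_i x_i
   \quad \text{(one misclassified point $i$ at a time)}
\end{align*}
```

At $-y_i w^T x_i = 0$ the max is not differentiable. The $\geq$ in the indicator selects $-y_i x_i$ there, so a point lying exactly on the boundary is treated as misclassified.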
Example: health insurance company
● Perceptron algorithm in action
○ $n = 0$, $w_0 = (-130, 2, 1)^T$, $R = 10$, $\alpha = 0.1$:
■ $i = 3$: $w_1 = w_0 + 0.1 \times (+1) \times (1, 30, 50)^T = (-129.9, 5, 6)^T$
■ $i = 4$: $w_1 = (-129.9, 5, 6)^T + 0.1 \times (-1) \times (1, 22, 25)^T = (-130, 2.8, 3.5)^T$
■ $i = 7$: $w_1 = (-129.9, 5, 6)^T + 0.1 \times (-1) \times (1, 40, 55)^T = (-130, 1, 0.5)^T$
■ $i = 9$: $w_1 = (-130, 1, 0.5)^T + 0.1 \times (-1) \times (1, 50, 40)^T = (-130.1, -4, -3.5)^T$
○ $n = 1$, $w_1 = (-130.1, -4, -3.5)^T$:
■ $i = 2$: $w_2 = w_1 + 0.1 \times (+1) \times (1, 45, 60)^T = (-130, 0.5, 2.5)^T$
■ $i = 3$: $w_2 = (-130, 0.5, 2.5)^T + 0.1 \times (+1) \times (1, 30, 50)^T = (-129.9, 3.5, 7.5)^T$
[Figure: decision boundary $w_0^T x = 0$ with points $y_1 = -1$, $y_2 = +1$, $y_3 = +1$; class “bought” ($y = +1$) and class “didn’t buy” ($y = -1$)]
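The arithmetic of a single update can be checked with a short sketch. It reproduces the first two updates on this slide ($i = 3$, then $i = 4$) from $w_0$ with $\alpha = 0.1$. The function name `perceptron_update` is hypothetical, and values are rounded only for display because of floating-point error.

```python
def perceptron_update(w, x, y, alpha):
    """Return w + alpha * y * x, the update applied for one misclassified point."""
    return [wj + alpha * y * xj for wj, xj in zip(w, x)]

w0 = [-130.0, 2.0, 1.0]

# i = 3: x_3 = (1, 30, 50), y_3 = +1
w_after_3 = perceptron_update(w0, [1, 30, 50], +1, 0.1)
# i = 4: x_4 = (1, 22, 25), y_4 = -1, applied to the result of the i = 3 update
w_after_4 = perceptron_update(w_after_3, [1, 22, 25], -1, 0.1)

print([round(v, 4) for v in w_after_3])  # [-129.9, 5.0, 6.0]
print([round(v, 4) for v in w_after_4])  # [-130.0, 2.8, 3.5]
```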
Example: health insurance company
● Perceptron algorithm in action
○ $n = 0$, $w_0 = (-130, 2, 1)^T$, $R = 1000$, $\alpha = 0.1$:
■ $i = 3$: $w_1 = w_0 + 0.1 \times (+1) \times (1, 30, 50)^T = (-129.9, 5, 6)^T$
■ $i = 4$: $w_1 = (-129.9, 5, 6)^T + 0.1 \times (-1) \times (1, 22, 25)^T = (-130, 2.8, 3.5)^T$
■ $i = 7$: $w_1 = (-130, 2.8, 3.5)^T + 0.1 \times (-1) \times (1, 40, 55)^T = (-130.1, -1.2, -2)^T$
….
○ $n = 1$, $w_1 = (-130, -0.2, 2)^T$:
■ Update weights, and so on… until $n = R$
[Figure: data points $y_1 = -1$, $y_2 = +1$, $y_3 = +1$, $y_4 = -1$, $y_5 = +1$, $y_6 = +1$, $y_7 = -1$; class “bought” ($y = +1$) and class “didn’t buy” ($y = -1$)]
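The whole procedure (visit each point in turn, update on every misclassified one, repeat for up to $R$ passes) can be sketched as below. This is a minimal sketch under stated assumptions, not the lecture's reference implementation. The function name `train_perceptron` is hypothetical, and the early exit once a pass makes no mistakes is an added convenience, since the slide only says to continue until $n = R$. The data are only the seven customers whose features are visible in the slides (2, 3, 4, 7, 8, 9, 10, with Yes/No mapped to +1/−1). Points 1, 5 and 6 are missing, so the weights it produces need not match the slide's values exactly.

```python
def dot(w, x):
    """Inner product w^T x."""
    return sum(wj * xj for wj, xj in zip(w, x))

def train_perceptron(X, y, w0, alpha, R):
    """Run at most R passes over the data, updating w on each misclassified point."""
    w = list(w0)
    for n in range(R):
        mistakes = 0
        for x_i, y_i in zip(X, y):
            # Same condition as the indicator in the gradient: -y_i * w^T x_i >= 0
            if -y_i * dot(w, x_i) >= 0:
                w = [wj + alpha * y_i * xj for wj, xj in zip(w, x_i)]
                mistakes += 1
        if mistakes == 0:  # assumed early exit; not stated on the slide
            break
    return w

# Customers 2, 3, 4, 7, 8, 9, 10; each x has a leading 1 for the bias term.
X = [[1, 45, 60], [1, 30, 50], [1, 22, 25], [1, 40, 55],
     [1, 60, 80], [1, 50, 40], [1, 28, 35]]
y = [+1, +1, -1, -1, +1, -1, -1]

w_after_one_pass = train_perceptron(X, y, [-130.0, 2.0, 1.0], alpha=0.1, R=1)
print([round(v, 4) for v in w_after_one_pass])
```

Raising `R` runs more passes. Because the visible subset is incomplete, treat any result as illustrative only.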
Further Reading
● PRML, Section 4.1.7
● R&N, Section 18.6.3
● H&T, Section 4.5.1