Lecture1 Slides 1

The document discusses classification using the perceptron algorithm. It uses an example of predicting whether customers will buy a health insurance plan based on their age and income. The goal is to create a linear classifier that splits the data into two classes using a decision boundary learned from labeled training data.


● Last lecture: classification in ML

● This lecture: classification using the perceptron algorithm


Example: health insurance company

● Data on whether customers bought the plan:

    Client   Age (yrs)   Income (k£)   Bought?
    1        25          30            No
    2        45          60            Yes
    3        30          50            Yes
    4        22          25            No
    5        35          45            Yes
    6        55          70            Yes
    7        40          55            No
    8        60          80            Yes
    9        50          40            No
    10       28          35            No

● Task: predict whether a new customer is likely to buy the plan, given their age and income.
  ○ Goal: split the data into 2 classes (bought / didn't buy) that best match the class-labeled training data.
  ○ Classification model: linear classifier
      $f(w, x) = \mathrm{sign}(w^T x)$, with $f: \mathbb{R}^D \to \{-1, +1\}$
  ○ Initial guess: $w_0 = (-130, 2, 1)^T$
● Hypothesis: there is some decision boundary in the data which makes this classification possible.
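To make the setup concrete, here is a minimal sketch (not part of the original slides) that encodes the table as NumPy arrays and evaluates the initial guess. The names `X`, `y`, `w0` and the helper `f` are my own; a constant first feature is assumed so that the first entry of $w_0$ acts as the intercept.

```python
import numpy as np

# Columns: constant 1 (so w[0] acts as the intercept), age (yrs), income (k GBP)
X = np.array([
    [1, 25, 30], [1, 45, 60], [1, 30, 50], [1, 22, 25], [1, 35, 45],
    [1, 55, 70], [1, 40, 55], [1, 60, 80], [1, 50, 40], [1, 28, 35],
], dtype=float)
# Labels: +1 = bought, -1 = didn't buy
y = np.array([-1, +1, +1, -1, +1, +1, -1, +1, -1, -1], dtype=float)

w0 = np.array([-130.0, 2.0, 1.0])  # initial guess from the slide

def f(w, x):
    """Linear classifier f(w, x) = sign(w^T x), mapping onto {-1, +1}."""
    return 1.0 if w @ x >= 0 else -1.0

preds = np.array([f(w0, x) for x in X])
print(preds == y)  # w0 misclassifies clients 3, 5, 7 and 9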
Example: health insurance company

[Figure: age–income scatter plot of the training data, split by the initial decision boundary $w_0^T x = 0$ into the class "bought" ($y = +1$) and the class "didn't buy" ($y = -1$); the points $y_1 = -1$, $y_2 = +1$, $y_3 = +1$ and $y_7 = -1$ are labeled.]
Example: health insurance company

[Figure: the same scatter plot with the initial decision boundary; $x_1$ and $x_2$ are correctly classified, while $x_3$ and $x_7$ fall on the wrong side of the boundary.]

○ Misclassification error: number of misclassified data points

    $F(w) = \sum_{i=1}^{N} \mathbb{I}(f(w, x_i) \neq y_i)$

  ■ assigns the same penalty to all incorrect decisions, regardless of how 'bad' they are.
  ■ For the labeled points: $\mathbb{I}(f(w, x_1) \neq y_1) = 0$ and $\mathbb{I}(f(w, x_2) \neq y_2) = 0$ (correctly classified); $\mathbb{I}(f(w, x_3) \neq y_3) = 1$ and $\mathbb{I}(f(w, x_7) \neq y_7) = 1$ (incorrectly classified).

○ Perceptron error: sum of perpendicular distances of every misclassified data point to the decision boundary,

    $F(w) = \sum_{i=1}^{N} \max(0, -y_i w^T x_i)$

  ■ 'Penalizes' incorrect decisions by the distance from the decision boundary $w^T x$ in the direction $w$ (perpendicular distance).
  ■ For the labeled points: $\max(0, -y_1 w^T x_1) = 0$ and $\max(0, -y_2 w^T x_2) = 0$; $\max(0, -y_3 w^T x_3) = -w^T x_3 = 20$ and $\max(0, -y_7 w^T x_7) = w^T x_7 = 5$.
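As a concrete check, here is a minimal sketch of both error functions (assuming the `X`, `y`, `w0` arrays from the earlier snippet; the function names are my own). Besides the labeled points $x_3$ and $x_7$, clients 5 and 9 are also misclassified by $w_0$.

```python
# Assumes the X, y, w0 arrays from the earlier sketch.
import numpy as np

def misclassification_error(w, X, y):
    """F(w) = sum_i I(f(w, x_i) != y_i): counts the misclassified points."""
    preds = np.where(X @ w >= 0, 1.0, -1.0)
    return int(np.sum(preds != y))

def perceptron_error(w, X, y):
    """F(w) = sum_i max(0, -y_i w^T x_i): penalises mistakes by distance."""
    return float(np.sum(np.maximum(0.0, -y * (X @ w))))

print(misclassification_error(w0, X, y))  # 4  (clients 3, 5, 7 and 9)
print(perceptron_error(w0, X, y))         # 50.0 = 20 (x3) + 15 (x5) + 5 (x7) + 10 (x9)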
SGD: algorithm (Section 9 Lecture Notes)

● Step 1. Initialization: Select an initial guess for $w_0$, a convergence tolerance $\varepsilon > 0$ and a step size (learning rate) parameter $\alpha > 0$; set iteration number $n = 0$.
● Step 2. Gradient descent step: Compute new model parameters,
    $w_{n+1} = w_n - \alpha \nabla F(w_n)$
● Step 3. Convergence test: Compute the new loss function value $F(w_{n+1})$ and the loss function improvement $\Delta F = |F(w_{n+1}) - F(w_n)|$; if $\Delta F < \varepsilon$, exit with solution $w^* = w_{n+1}$.
● Step 4. Iteration: update $n = n + 1$ and go to step 2.
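A minimal sketch of this loop, assuming the loss `F` and its gradient `grad_F` are supplied by the caller; the iteration cap `max_iter` is an added safeguard that the four steps above do not include.

```python
import numpy as np

def gradient_descent(F, grad_F, w0, alpha=0.1, eps=1e-6, max_iter=10_000):
    """Steps 1-4 of the slide, for a generic loss F with gradient grad_F."""
    w = np.asarray(w0, dtype=float)        # Step 1: initial guess, n = 0
    for _ in range(max_iter):              # (iteration cap added as a safeguard)
        w_new = w - alpha * grad_F(w)      # Step 2: w_{n+1} = w_n - alpha * grad F(w_n)
        if abs(F(w_new) - F(w)) < eps:     # Step 3: Delta F < eps => converged
            return w_new
        w = w_new                          # Step 4: n = n + 1, back to Step 2
    return w

# Example: minimizing F(w) = ||w||^2, whose gradient is 2w
print(gradient_descent(lambda w: float(w @ w), lambda w: 2 * w, [3.0, -2.0]))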
Perceptron classification

● Classification model (D-dimensional):

    $f(w, x) = \mathrm{sign}(w_1 x^{(1)} + w_2 x^{(2)} + \cdots + w_D x^{(D)}) = \mathrm{sign}(w^T x)$

● Perceptron error function:

    $F(w) = \sum_{i=1}^{N} \max(0, -y_i w^T x_i)$

● Gradient with respect to w:

    $\nabla F(w) = -\sum_{i=1}^{N} y_i x_i \, \mathbb{I}(-y_i w^T x_i \geq 0)$

● Intuitively, the gradient is just the sum of $-y_i x_i$ over the incorrectly classified points.
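As a check on the gradient formula, a minimal sketch (again assuming the `X`, `y`, `w0` arrays defined earlier; the function name is my own):

```python
# Assumes the X, y, w0 arrays from the earlier sketch.
import numpy as np

def perceptron_grad(w, X, y):
    """grad F(w) = -sum_i y_i x_i I(-y_i w^T x_i >= 0)."""
    wrong = (-y * (X @ w)) >= 0                   # indicator, one entry per point
    return -(y[wrong, None] * X[wrong]).sum(axis=0)

print(perceptron_grad(w0, X, y))  # [0. 25. 0.] at w0 on this dataset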
Perceptron training: algorithm

● Step 1. Initialization: Select a starting candidate classification model $w_0$, set iteration number $n = 0$, choose a maximum number of iterations $R$ and a learning rate $\alpha > 0$.
● Step 2. Gradient descent step: Compute new model parameters: taking each $i = 1, 2, \ldots, N$ in turn, if $\mathrm{sign}(w_n^T x_i) \neq y_i$, then
    $w_{n+1} = w_n + \alpha y_i x_i$
● Step 3. Iteration: If $n < R$, update $n = n + 1$ and go to step 2; otherwise exit with solution $w^* = w_n$.
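A minimal runnable sketch of this procedure, assuming the `X`, `y` arrays defined earlier. Treating $w^T x = 0$ as a positive prediction is a convention the slide leaves open, so the exact final weights can differ slightly from the $w^\star$ reported in the worked example below.

```python
# Assumes the X, y arrays from the earlier sketch.
import numpy as np

def train_perceptron(X, y, w0, alpha=0.1, R=1000):
    w = np.asarray(w0, dtype=float)                     # Step 1: initialization
    for n in range(R):                                  # Step 3: stop after R passes
        for x_i, y_i in zip(X, y):                      # Step 2: take each i in turn...
            if (1.0 if w @ x_i >= 0 else -1.0) != y_i:  # ...and update on a mistake
                w = w + alpha * y_i * x_i               # w <- w + alpha * y_i * x_i
    return w

w_star = train_perceptron(X, y, np.array([-130.0, 2.0, 1.0]))
print(w_star)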
Example: health insurance company

● Perceptron algorithm in action
  ○ $n = 0$, $w_0 = (-130, 2, 1)^T$, $R = 1000$, $\alpha = 0.1$:
    ■ $i = 3$: $w_1 = w_0 + 0.1 \times (+1) \times (1, 30, 50)^T = (-129.9, 5, 6)^T$
    ■ $i = 4$: $w_1 = (-129.9, 5, 6)^T + 0.1 \times (-1) \times (1, 22, 25)^T = (-130, 2.8, 3.5)^T$
    ■ $i = 7$: $w_1 = (-130, 2.8, 3.5)^T + 0.1 \times (-1) \times (1, 40, 55)^T = (-130.1, -1.2, -2)^T$
    ■ …
  ○ $n = 1$, $w_1 = (-130, -0.2, 2)^T$: update weights, and so on… until $n = R$
  ○ $n = 999$, $w^\star = (-128.98, -1.85, 3.55)^T$

[Figure: the scatter plot with the labeled points $y_1 = -1$, $y_2 = +1$, $y_3 = +1$, $y_4 = -1$, $y_5 = +1$, $y_6 = +1$, $y_7 = -1$ and the decision boundary as it is updated.]
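A minimal sketch that reproduces the first pass ($n = 0$) of this trace, assuming the `X`, `y` arrays from earlier; the first three printed updates should match the slide.

```python
# Assumes the X, y arrays from the earlier sketch.
import numpy as np

w = np.array([-130.0, 2.0, 1.0])  # w0
alpha = 0.1
for i, (x_i, y_i) in enumerate(zip(X, y), start=1):   # one pass, n = 0
    if (1.0 if w @ x_i >= 0 else -1.0) != y_i:
        w = w + alpha * y_i * x_i
        print(f"i = {i}: w = {w}")
# First three updates match the slide:
#   i = 3: w = [-129.9   5.    6. ]
#   i = 4: w = [-130.    2.8   3.5]
#   i = 7: w = [-130.1  -1.2  -2. ]
# (clients 8 and 9 also trigger updates later in the same pass)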
Perceptron algorithm: analysis

● If the data is linearly separable, the perceptron algorithm always converges to a decision boundary with zero error, but there is no guarantee on the number of iterations needed to reach that fixed point.
● If the data is not linearly separable, there is no convergence guarantee – the algorithm can cycle between local optima of the perceptron error function, so we need to stop after some number of iterations R (a practical early-stopping variant is sketched below).
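A minimal sketch of that early-stopping variant, assuming the `X`, `y` arrays from earlier: when the data is linearly separable, a full pass with no updates means zero training error, so the loop can exit before reaching R.

```python
# Assumes the X, y arrays from the earlier sketch.
import numpy as np

def train_until_consistent(X, y, w0, alpha=0.1, R=1000):
    w = np.asarray(w0, dtype=float)
    for n in range(R):
        updated = False
        for x_i, y_i in zip(X, y):
            if (1.0 if w @ x_i >= 0 else -1.0) != y_i:
                w = w + alpha * y_i * x_i
                updated = True
        if not updated:   # a clean pass: zero training error, so stop early
            print(f"separating boundary found after {n + 1} passes")
            return w
    return w              # not separable (or R too small): return the last iterate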
Perceptron training in action

[Figure: a sequence of slides showing the decision boundary moving through the data over successive training iterations.]
Some remarks about the perceptron

● A simple linear classifier based on the perceptron error function rather than the misclassification error function.
● A very important classic algorithm in the history of ML, and a direct precursor to modern deep learning algorithms.
● Extremely simple; there are mathematically better "linear single-layer" algorithms (e.g. support vector machines), so the perceptron is rarely used in practice today.
● Understanding the perceptron is critical to understanding most of the main principles of modern ML classification.
To recap

● We learned the perceptron algorithm for classifying data.
● It is only guaranteed to converge if the training data is linearly separable (and the solution may not be unique).
● Question: how would you generalize the algorithm to K > 2 classes?
● Next: Neural networks (we will answer the question above)

Further Reading
● Bishop, Pattern Recognition and Machine Learning (PRML), Section 4.1.7
● Russell & Norvig, Artificial Intelligence: A Modern Approach (R&N), Section 18.6.3
● Hastie, Tibshirani & Friedman, The Elements of Statistical Learning (H&T), Section 4.5.1
