Evaluation Metrics
[Figure: taxonomy of neural networks: feedforward networks (perceptron, multilayer perceptron) and recurrent networks (Hopfield network, Kohonen's SOM)]
Difference between Feedforward and Recurrent Networks
• Feedforward network: the network has no memory or feedback connections.
• Recurrent network: the network has memory or feedback (the same network with feedback connections).
Hopfield Network
• A Hopfield network is a form of recurrent artificial neural network. It is also one
of the oldest neural networks.
• A Hopfield network is single-layered and the neurons are fully connected, i.e., each neuron is connected to every other neuron.
Training Algorithm
Initialize the weights with the Hebb rule:
$W = \sum_{k=1}^{m} X^k (X^k)^T - mI$
Step 2. Set the initial activations of the network equal to the external input vector:
$y_i = x_i, \quad (i = 1, 2, \ldots, n)$
Step 3. Do Steps 4-6 for each unit $y_i$ (units should be updated in random order).
Step 4. Calculate the total input (net) of the network $y_{in_i}$ using the equation given below:
$y_{in_i} = x_i + \sum_{j} y_j \, w_{ji}$
Training Algorithm (Cont.)
Step 5. Apply the activation function:
$y_i = \begin{cases} 1 & \text{if } y_{in_i} > 0 \\ y_i & \text{if } y_{in_i} = 0 \\ 0 & \text{if } y_{in_i} < 0 \end{cases}$
Step 6. Broadcast the value of $y_i$ to all other units. (This updates the activation vector.)
Step 7. Test the network for convergence.
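A minimal NumPy sketch of this procedure (the function names, the `max_sweeps` cap, and the random update order are illustrative choices, not from the slides). Training builds W with the outer-product rule above from the stored patterns in bipolar form; recall repeats Steps 3-7 on a binary state vector until it stops changing.

```python
import numpy as np

def hopfield_train(bipolar_patterns):
    """Hebb rule: W = sum_k X^k (X^k)^T - m*I  (zero self-connections)."""
    X = np.asarray(bipolar_patterns, dtype=float)   # shape (m, n)
    m, n = X.shape
    return X.T @ X - m * np.eye(n)

def hopfield_recall(W, x, max_sweeps=10, seed=None):
    """Asynchronous recall of a binary (0/1) input vector x."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = x.copy()                                    # Step 2: y_i = x_i
    for _ in range(max_sweeps):
        prev = y.copy()
        for i in rng.permutation(len(y)):           # Step 3: random update order
            y_in = x[i] + y @ W[:, i]               # Step 4: net input
            if y_in > 0:                            # Step 5: activation
                y[i] = 1.0
            elif y_in < 0:
                y[i] = 0.0                          # y_in == 0 keeps the old value
        if np.array_equal(y, prev):                 # Step 7: converged
            break
    return y

W = hopfield_train([[1, 1, 1, -1]])                 # store (1,1,1,0) in bipolar form
print(hopfield_recall(W, [1, 0, 1, 0], seed=0))     # recovers [1. 1. 1. 0.]
```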
Hopfield Example
Hopfield Example (Cont.)
$W = \sum_{k=1}^{m} X^k (X^k)^T - mI$
$W = \begin{bmatrix} 1 \\ 1 \\ 1 \\ -1 \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 & -1 \end{bmatrix} - 1 \times \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 1 & 1 & -1 \\ 1 & 0 & 1 & -1 \\ 1 & 1 & 0 & -1 \\ -1 & -1 & -1 & 0 \end{bmatrix}$
Hopfield Example (Cont.)
$w = \begin{bmatrix} 0 & 1 & 1 & -1 \\ 1 & 0 & 1 & -1 \\ 1 & 1 & 0 & -1 \\ -1 & -1 & -1 & 0 \end{bmatrix}$
Hopfield Example (Cont.)
i. Calculate the net input of unit 4:
$y_{in_4} = 0 + \begin{bmatrix} 1 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} -1 \\ -1 \\ -1 \\ 0 \end{bmatrix} = -2$
ii. Applying the activation, $y_{in_4} < 0 \Rightarrow y_4 = 0$
iii. Giving feedback to the other units, we get y = (1, 0, 1, 0).
i. Calculate the net input of unit 3:
$y_{in_3} = 1 + \begin{bmatrix} 1 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 0 \\ -1 \end{bmatrix} = 2$
ii. Applying the activation, $y_{in_3} > 0 \Rightarrow y_3 = 1$
iii. Giving feedback to the other units, we get y = (1, 0, 1, 0), which is not equal to the input vector x = (1, 1, 1, 0).
Hopfield Example (Cont.)
i. Calculate the net input of unit 2:
$y_{in_2} = 0 + \begin{bmatrix} 1 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 1 \\ -1 \end{bmatrix} = 2$
ii. Applying the activation, $y_{in_2} > 0 \Rightarrow y_2 = 1$
iii. Giving feedback to the other units, we get y = (1, 1, 1, 0), which is equal to the input vector x = (1, 1, 1, 0). Hence, the network converges to the vector x.
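The slide with the test vector and the first unit update is not shown above, but the state after that update, y = (1, 0, 1, 0), is, so the remaining three updates can be replayed with NumPy. This is a sketch: the external input x is assumed to be (1, 0, 1, 0); only its last three components, which match the 0, 1, 0 used in the calculations above, actually enter these updates.

```python
import numpy as np

W = np.array([[ 0,  1,  1, -1],
              [ 1,  0,  1, -1],
              [ 1,  1,  0, -1],
              [-1, -1, -1,  0]], dtype=float)

x = np.array([1, 0, 1, 0], dtype=float)   # assumed external input (x2, x3, x4 as in the slides)
y = np.array([1, 0, 1, 0], dtype=float)   # state after the first update

for i in (3, 2, 1):                       # units 4, 3 and 2, in the order used above
    y_in = x[i] + y @ W[:, i]
    y[i] = 1.0 if y_in > 0 else (0.0 if y_in < 0 else y[i])
    print(f"unit {i + 1}: y_in = {y_in:+.0f} -> y = {y.astype(int)}")
# prints y_in = -2, +2, +2 and ends with y = [1 1 1 0], the stored pattern
```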
Hopfield Example 2
$X_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \quad X_2 = \begin{bmatrix} -1 \\ -1 \\ -1 \end{bmatrix} \quad \text{or} \quad X_1^T = \begin{bmatrix} 1 & 1 & 1 \end{bmatrix}, \quad X_2^T = \begin{bmatrix} -1 & -1 & -1 \end{bmatrix}$
$W = \sum_{k=1}^{m} X^k (X^k)^T - mI$
$W = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 \end{bmatrix} + \begin{bmatrix} -1 \\ -1 \\ -1 \end{bmatrix} \begin{bmatrix} -1 & -1 & -1 \end{bmatrix} - 2 \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 2 & 2 \\ 2 & 0 & 2 \\ 2 & 2 & 0 \end{bmatrix}$
Hopfield Example 2 (Cont.)
$Y_1 = f\!\left( \begin{bmatrix} 0 & 2 & 2 \\ 2 & 0 & 2 \\ 2 & 2 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \right) = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$
$Y_2 = f\!\left( \begin{bmatrix} 0 & 2 & 2 \\ 2 & 0 & 2 \\ 2 & 2 & 0 \end{bmatrix} \begin{bmatrix} -1 \\ -1 \\ -1 \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \right) = \begin{bmatrix} -1 \\ -1 \\ -1 \end{bmatrix}$
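Both stored patterns are thus recalled unchanged. A quick numerical check (a sketch, with `np.sign` standing in for the activation f):

```python
import numpy as np

X1 = np.array([1, 1, 1])
X2 = np.array([-1, -1, -1])
W = np.outer(X1, X1) + np.outer(X2, X2) - 2 * np.eye(3)   # W = sum_k X_k X_k^T - m*I

print(np.sign(W @ X1))   # [ 1.  1.  1.]  -> X1 is a fixed point
print(np.sign(W @ X2))   # [-1. -1. -1.]  -> X2 is a fixed point
```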
Self Organizing Map (SOM)
• The self-organizing map (SOM), also known as a Kohonen map, was introduced by Teuvo Kohonen in the 1980s.
• SOM is a type of neural network, but unlike other NNs (feedforward, CNN) it uses unsupervised learning.
o It does not require class labels.
• An SOM is trained using competitive learning rather than error-correction learning (e.g., backpropagation with gradient descent).
• In competitive learning, neurons compete among themselves to be activated, and only a single output neuron is active at any time.
• The output neuron that wins the "competition" is called the winner-takes-all neuron or Best Matching Unit (BMU).
• Only the weight vectors of the winner and its neighboring units are updated.
• SOM is used for clustering and mapping (or dimensionality reduction).
SOM Architecture
• SOMs have two layers; the first one is the input layer and the second
one is the output layer or the feature map.
• The output layer is a two-dimensional competitive layer, organized as
a 2D grid of units.
• Each neuron in the competitive layer is fully connected to all the
source units in the input layer.
SOM Algorithm
The Kohonen learning algorithm starts by initializing the synaptic weights to small random values, so that no prior order is imposed on the feature map. It then performs the following three processes:
• Competition:
o For each input pattern, the neurons compute their respective values of a discriminant function. The particular neuron with the smallest value is declared the winner.
• Cooperation:
o The winning neuron determines the spatial location of a topological
neighborhood of excited neurons, thereby providing the basis for
cooperation among such neighboring neurons.
• Synaptic Weight Adaptation:
o Adjustments are applied to the synaptic weights of the excited neurons such that the response of the winning neuron is enhanced for similar input patterns.
Competition Process
• For an input vector x, each output neuron j computes the value of a discriminant function (the Euclidean distance between x and its weight vector), and the winner is the neuron that minimizes it:
$j(\mathbf{x}) = \arg\min_j \lVert \mathbf{x} - \mathbf{w}_j \rVert$
where j is the index of the particular neuron that satisfies this condition; it is called the best-matching, or winning, neuron for the vector x.
Cooperative Process
Cooperative Process (Cont.)
$h_{j,i(\mathbf{x})} = \exp\!\left( -\frac{d_{j,i}^2}{2\sigma^2} \right), \quad j \in [1, l]$
• For stability, the size of the topological neighborhood should shrink with time t:
$\sigma(t) = \sigma_0 \exp\!\left( -\frac{t}{\tau_1} \right), \quad t = 0, 1, 2, \ldots$
• Then the neighborhood function is
$h_{j,i(\mathbf{x})}(t) = \exp\!\left( -\frac{d_{j,i}^2}{2\sigma^2(t)} \right), \quad t = 0, 1, 2, \ldots$
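A small sketch of this neighborhood function (the parameter values σ0 = 2 and τ1 = 1000 are illustrative, not from the slides):

```python
import numpy as np

def neighborhood(d, t, sigma0=2.0, tau1=1000.0):
    """h_{j,i}(t) = exp(-d^2 / (2*sigma(t)^2)) with sigma(t) = sigma0*exp(-t/tau1)."""
    sigma_t = sigma0 * np.exp(-t / tau1)
    return np.exp(-d ** 2 / (2.0 * sigma_t ** 2))

print(neighborhood(d=0, t=0))      # 1.0    -> the winner itself is always fully excited
print(neighborhood(d=2, t=0))      # ~0.61  -> nearby neurons are partially excited
print(neighborhood(d=2, t=1000))   # ~0.025 -> the neighborhood shrinks as t grows
```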
Cooperative Process (Cont.)
• The neighborhood function shows that the value of $h_{j,i}(t)$ depends on the distance $d_{j,i}$ from the position of the neuron being assessed to the position of the BMU.
Adaptive Process
• During training, the winner neuron and its topological neighbors are
adapted to make their weight vectors more similar to the input pattern
that caused the activation
Adaptive Process (Cont.)
• Neurons that are closer to the winner will adapt more heavily than
neurons that are further away
Summary of the SOM Algorithm
• Step 0: Initialize the weights $w_{ij}$, the topological neighborhood radius R, and the learning rate $\eta$.
Summary of the SOM Algorithm (Cont.)
• Step 3: For each neuron j, compute the (squared) Euclidean distance
$D(j) = \sum_{i=1}^{n} (x_i - w_{ij})^2$
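A minimal sketch of one SOM update built from these pieces (the names `som_step` and `positions`, and the 3x3 grid in the usage lines, are illustrative): find the winner by the squared Euclidean distance D(j), weight every neuron by the Gaussian neighborhood centered on the winner, and move the weights toward the input.

```python
import numpy as np

def som_step(W, x, positions, eta, sigma):
    """One training step: W has one column of weights per output neuron."""
    D = ((x[:, None] - W) ** 2).sum(axis=0)               # D(j) = sum_i (x_i - w_ij)^2
    J = int(np.argmin(D))                                 # winning neuron (BMU)
    d = np.linalg.norm(positions - positions[J], axis=1)  # lateral distance to the BMU on the grid
    h = np.exp(-d ** 2 / (2.0 * sigma ** 2))              # Gaussian neighborhood
    W += eta * h[None, :] * (x[:, None] - W)              # w_ij += eta * h * (x_i - w_ij)
    return W, J

# Example: a 3x3 output grid for 4-dimensional inputs
positions = np.array([(r, c) for r in range(3) for c in range(3)], dtype=float)
W = np.random.default_rng(0).random((4, 9))
W, J = som_step(W, np.array([1.0, 1.0, 0.0, 0.0]), positions, eta=0.6, sigma=1.0)
print("winner:", J)
```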
SOM Example
x1  x2  x3  x4
 1   1   0   0
 0   0   0   1
 1   0   0   0
 0   0   1   1
• Classify these patterns into two clusters using an SOM.
SOM Example Continued
• The learning rate is initialized to 0.6 and decreases after each epoch according to the following expression:
$\eta(t = 0) = 0.6, \qquad \eta(t + 1) = 0.5\,\eta(t)$
• For the vector (1, 1, 0, 0) (we use the squared Euclidean distance for convenience):
$D(1) = (1 - 0.2)^2 + (1 - 0.6)^2 + (0 - 0.5)^2 + (0 - 0.9)^2 = 1.86$
$D(1) = 1.86, \quad D(2) = 0.98$
Hence $J = 2$. Note that $R = 0$, so we need not update the weights of any neighboring neurons.
$w_{ij}(t + 1) = w_{ij}(t) + \eta(t)\, h(t)\, \big(x_i - w_{ij}(t)\big)$
• The new weight matrix is
$\begin{bmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \\ w_{31} & w_{32} \\ w_{41} & w_{42} \end{bmatrix} = \begin{bmatrix} 0.2 & 0.92 \\ 0.6 & 0.76 \\ 0.5 & 0.28 \\ 0.9 & 0.12 \end{bmatrix}$, e.g., $w_{12}(\text{new}) = 0.8 + 0.6\,(1 - 0.8) = 0.92$
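These numbers can be checked directly. A sketch: the initial second column (0.8, 0.4, 0.7, 0.3) is back-solved from the w12 value and the updated values shown above, since the slide with the initial weight matrix is not included.

```python
import numpy as np

W = np.array([[0.2, 0.8],       # initial weights; column j holds the weights of output neuron j
              [0.6, 0.4],
              [0.5, 0.7],
              [0.9, 0.3]])
x = np.array([1, 1, 0, 0], dtype=float)
eta = 0.6

D = ((x[:, None] - W) ** 2).sum(axis=0)
print(D)                        # [1.86 0.98]  -> winner J = 2 (second column)
J = int(np.argmin(D))
W[:, J] += eta * (x - W[:, J])  # R = 0: only the winner is updated
print(W[:, J])                  # [0.92 0.76 0.28 0.12]
```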
SOM Example Continued
Hence $J = 1$.
$\begin{bmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \\ w_{31} & w_{32} \\ w_{41} & w_{42} \end{bmatrix} = \begin{bmatrix} 0.032 & 0.968 \\ 0.096 & 0.304 \\ 0.680 & 0.112 \\ 0.984 & 0.048 \end{bmatrix}$
SOM Example Continued
x1  x2  x3  x4  Cluster
 1   1   0   0     2
 0   0   0   1     1
 1   0   0   0     2
 0   0   1   1     1
Evaluation/Classification
Evaluation/Classification Continued
                     True class: Positive    True class: Negative
Predicted class: Y   True Positive (TP)      False Positive (FP)
Predicted class: N   False Negative (FN)     True Negative (TN)

□ The numbers across the major diagonal represent the correct decisions of the model.
□ The numbers outside the major diagonal represent the errors (i.e., the confusion) between the various classes.
Evaluation/Classification Continued
[Figure: the confusion matrix built up step by step, with True Class (Positive / Negative) against Predicted Class (Yes / No); comparing the true class to the predicted class gives the four cells TP, FP, FN, and TN.]
Evaluation/Classification Continued
$\text{Sensitivity} = \dfrac{TP}{TP + FN}$
Evaluation/Classification Continued
$\text{Specificity} = \dfrac{TN}{TN + FP}$
Evaluation/Classification Continued
→ $\text{FP rate} = \dfrac{FP}{FP + TN}$
→ $\text{TP rate} = \dfrac{TP}{TP + FN} = \text{Recall}$
→ $\text{Precision} = \dfrac{TP}{TP + FP}$
→ $\text{F1 score} = 2 \cdot \dfrac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$
→ $\text{Accuracy} = \dfrac{TP + TN}{P + N}$
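A small helper that evaluates all of these metrics from the four confusion-matrix counts (the function name is illustrative); the counts in the usage line are those of the 20-instance ROC example that follows, where P and N are the total positives and negatives:

```python
def classification_metrics(tp, fp, fn, tn):
    """All metrics above, computed from raw confusion-matrix counts."""
    recall      = tp / (tp + fn)             # TP rate = sensitivity
    specificity = tn / (tn + fp)
    fp_rate     = fp / (fp + tn)             # = 1 - specificity
    precision   = tp / (tp + fp)
    f1          = 2 * precision * recall / (precision + recall)
    accuracy    = (tp + tn) / (tp + fp + fn + tn)   # = (TP + TN) / (P + N)
    return {"recall": recall, "specificity": specificity, "fp_rate": fp_rate,
            "precision": precision, "f1": f1, "accuracy": accuracy}

print(classification_metrics(tp=5, fp=1, fn=5, tn=9))
# accuracy = 0.70, TP rate (recall) = 0.5, FP rate = 0.1
```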
ROC curve
The better the test, the closer the curve comes to the upper-left corner.
ROC curve Continued
Example on a test set of 20 instances:
$\text{Accuracy} = \dfrac{TP + TN}{TP + FN + FP + TN} = \dfrac{5 + 9}{5 + 5 + 1 + 9} = \dfrac{14}{20} = 70\%$
[Figure: ROC curve (TP rate against false positive rate); the ROC point at (0.1, 0.5) produces its highest accuracy (70%).]
Assignment
Evaluation/Segmentation
                     Doctor: Mass    Doctor: No Mass
Algorithm: Mass           TP               FP
Algorithm: No Mass        FN               TN

$\text{Sensitivity} = \dfrac{TP}{TP + FN}, \qquad \text{Specificity} = \dfrac{TN}{TN + FP}$
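A sketch of how these pixel-wise counts and the two metrics could be computed for a pair of binary masks (the function name and the tiny 3x3 masks are illustrative):

```python
import numpy as np

def segmentation_metrics(algorithm_mask, doctor_mask):
    """Pixel-wise comparison of the algorithm's mask against the doctor's (1 = mass)."""
    pred  = np.asarray(algorithm_mask, dtype=bool)
    truth = np.asarray(doctor_mask, dtype=bool)
    tp = np.sum(pred & truth)
    fp = np.sum(pred & ~truth)
    fn = np.sum(~pred & truth)
    tn = np.sum(~pred & ~truth)
    return {"sensitivity": tp / (tp + fn), "specificity": tn / (tn + fp)}

algorithm = np.array([[1, 1, 0], [0, 1, 0], [0, 0, 0]])
doctor    = np.array([[1, 1, 0], [1, 1, 0], [0, 0, 0]])
print(segmentation_metrics(algorithm, doctor))   # sensitivity = 0.75, specificity = 1.0
```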