Evaluation Metrics

The document discusses various types of neural networks, focusing on the differences between feedforward and recurrent networks, particularly the Hopfield network. It explains the characteristics of Hopfield networks, including their ability to store and reconstruct patterns, and outlines the training algorithm used for these networks. Additionally, it introduces Self Organizing Maps (SOMs) and their competitive learning process, emphasizing their application in clustering and dimensionality reduction.

Neural networks
• Feedforward networks
  o Single layer perceptron
  o Multilayer perceptron
• Recurrent networks
  o Hopfield network
  o Kohonen's SOM
Difference between Feedforward and Recurrent Networks

• Feedforward network: the network has no memory or feedback.
• Recurrent network: the network has memory or feedback (the same network with feedback connections).
Hopfield Network
• A Hopfield network is a form of recurrent artificial neural network. It is also one of the oldest neural networks.
• A Hopfield network is single-layered and fully connected, i.e., each neuron is connected to every other neuron.
Hopfield Network (Cont.)

• It behaves in a discrete manner, i.e., it gives finite distinct outputs, generally of two types:
  o Binary (0/1)
  o Bipolar (-1/1)
• The weights associated with this network are symmetric (given two neurons $i$ and $j$, $w_{ij} = w_{ji}$), and the neurons are not self-connected ($w_{ii} = 0$).
• The number of input neurons should always be equal to the number of output neurons.
• A Hopfield network stores a set of patterns and can then reconstruct a pattern from a partial or corrupted version. These networks are therefore sometimes called Associative Memories or Hopfield Memories.
Training Algorithm

• For storing a set of input patterns $x(p)$, $p = 1, \dots, P$, where $x(p) = \big(x_1(p), \dots, x_i(p), \dots, x_n(p)\big)$:
• Step 0: Initialize the weights to store the patterns (use the Hebbian learning rule).
• The weights are calculated using the formula

$$w_{ij} = \begin{cases} \sum_{k=1}^{m} x_i^k x_j^k & i \neq j \\ 0 & i = j \ \text{(to avoid self-feedback)} \end{cases}$$

• In matrix form, it is the sum of the outer products of the patterns:

$$W = \sum_{k=1}^{m} X^k (X^k)^T - mI$$

where $m$ is the number of states (patterns) to be memorized by the network, $I$ is the $n \times n$ identity matrix, and the superscript $T$ denotes the matrix transpose.
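To make the storage rule concrete, here is a minimal NumPy sketch (not from the slides); the function name, the bipolar pattern encoding, and the (patterns × components) array layout are my own assumptions.

```python
import numpy as np

def hopfield_weights(patterns):
    """Build the Hopfield weight matrix from bipolar patterns.

    patterns : array of shape (m, n), one stored pattern per row, entries in {-1, +1}.
    Returns the symmetric n x n matrix W = sum_k X^k (X^k)^T - m*I (zero diagonal).
    """
    X = np.asarray(patterns, dtype=float)
    m, n = X.shape
    return X.T @ X - m * np.eye(n)   # outer-product sum; subtracting m*I removes self-feedback

# Example: store the single bipolar pattern (1, 1, 1, -1)
print(hopfield_weights([[1, 1, 1, -1]]))
```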
Training Algorithm (Cont.)
Step 1. For each input pattern $x$, do Steps 2-6.
Step 2. Set the initial activations of the net $y$ equal to the external input vector $x$:

$$y_i = x_i, \quad i = 1, 2, \dots, n$$

Step 3. Do Steps 4-6 for each unit $y_i$ (units should be updated in random order).
Step 4. Calculate the total (net) input $y_{in,i}$ using the equation below:

$$y_{in,i} = x_i + \sum_{j} y_j w_{ji}$$
Training Algorithm (Cont.)

Step 5. Determine the activation (output signal):

$$y_i = \begin{cases} 1 & \text{if } y_{in,i} > 0 \\ y_i & \text{if } y_{in,i} = 0 \\ 0 & \text{if } y_{in,i} < 0 \end{cases}$$

Step 6. Broadcast the value of $y_i$ to all other units. (This updates the activation vector.)
Step 7. Test the network for convergence.
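The update loop in Steps 1-7 can be sketched as follows. This is an illustrative asynchronous-recall routine under the binary (0/1) convention used above; the function and argument names are assumptions, not part of the slides.

```python
import numpy as np

def hopfield_recall(W, x, order=None, rng=None):
    """Asynchronously update a binary (0/1) state until no unit changes.

    W     : symmetric weight matrix with zero diagonal.
    x     : external input vector, also used as the initial activation y (Step 2).
    order : optional fixed update order of unit indices; random each sweep if None.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    y = x.copy()
    while True:                                     # Step 1: repeat until convergence (Step 7)
        y_prev = y.copy()
        idx = order if order is not None else rng.permutation(len(y))
        for i in idx:                               # Step 3: visit units one at a time
            net = x[i] + y @ W[:, i]                # Step 4: y_in,i = x_i + sum_j y_j w_ji
            if net > 0:                             # Step 5: threshold activation
                y[i] = 1.0
            elif net < 0:
                y[i] = 0.0
            # net == 0: keep the previous activation (Step 6 feedback happens through y itself)
        if np.array_equal(y, y_prev):
            return y
```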
Hopfield Example

• Consider an example in which the vector (1,1,1,0) (or its bipolar equivalent (1,1,1,−1)) was stored in a net. Test the Hopfield network with missing entries in the first and second components of the stored vector (i.e., [0 0 1 0]). The units update their activations in a random order. For this example, the update order is $y_1, y_4, y_3, y_2$.
  o How many neurons? 4 activations = 4 neurons.
  o How many synapses (connections between neurons)? $(4 \times 4) - 4 = 12$.
Hopfield Example (Cont.)

• Step 0. Initialize the weights to store the pattern (use the Hebbian learning rule):

$$W = \sum_{k=1}^{m} X^k (X^k)^T - mI$$

$$W = \begin{pmatrix} 1 \\ 1 \\ 1 \\ -1 \end{pmatrix} \begin{pmatrix} 1 & 1 & 1 & -1 \end{pmatrix} - 1 \times \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 1 & 1 & -1 \\ 1 & 0 & 1 & -1 \\ 1 & 1 & 0 & -1 \\ -1 & -1 & -1 & 0 \end{pmatrix}$$
Hopfield Example (Cont.)

Recall that

$$w = \begin{pmatrix} 0 & 1 & 1 & -1 \\ 1 & 0 & 1 & -1 \\ 1 & 1 & 0 & -1 \\ -1 & -1 & -1 & 0 \end{pmatrix}$$

• Step 1. The input vector is $x = (0, 0, 1, 0)$. For this vector, $y = (0, 0, 1, 0)$.
• Step 2. Choose unit $y_1$ to update its activation:
  i. $y_{in,1} = x_1 + \sum_j y_j w_{j1} = 0 + \begin{pmatrix} 0 & 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \\ 1 \\ -1 \end{pmatrix} = 0 + 1 = 1$
  ii. Applying the activation, $y_{in,1} > 0 \Rightarrow y_1 = 1$.
  iii. Giving feedback to the other units, we get $y = (1, 0, 1, 0)$, which is not equal to the stored vector $(1, 1, 1, 0)$.
Hopfield Example (Cont.)

• Step 3. Choose unit $y_4$ to update its activation:
  i. $y_{in,4} = x_4 + \sum_j y_j w_{j4} = 0 + \begin{pmatrix} 1 & 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} -1 \\ -1 \\ -1 \\ 0 \end{pmatrix} = 0 + (-2) = -2$
  ii. Applying the activation, $y_{in,4} < 0 \Rightarrow y_4 = 0$.
  iii. Giving feedback to the other units, we get $y = (1, 0, 1, 0)$, which is not equal to the stored vector $(1, 1, 1, 0)$.
Hopfield Example (Cont.)

• Step 4. Choose unit $y_3$ to update its activation:
  i. $y_{in,3} = x_3 + \sum_j y_j w_{j3} = 1 + \begin{pmatrix} 1 & 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 0 \\ -1 \end{pmatrix} = 1 + 1 = 2$
  ii. Applying the activation, $y_{in,3} > 0 \Rightarrow y_3 = 1$.
  iii. Giving feedback to the other units, we get $y = (1, 0, 1, 0)$, which is not equal to the stored vector $(1, 1, 1, 0)$.
Hopfield Example (Cont.)

• Step 5. Choose unit $y_2$ to update its activation:
  i. $y_{in,2} = x_2 + \sum_j y_j w_{j2} = 0 + \begin{pmatrix} 1 & 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \\ 1 \\ -1 \end{pmatrix} = 0 + 2 = 2$
  ii. Applying the activation, $y_{in,2} > 0 \Rightarrow y_2 = 1$.
  iii. Giving feedback to the other units, we get $y = (1, 1, 1, 0)$, which is equal to the stored vector $(1, 1, 1, 0)$. Hence, the network has converged to the vector $x$.
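As a check, the whole walkthrough can be reproduced with a short NumPy sketch that mirrors the slide's numbers; the update order is fixed to $y_1, y_4, y_3, y_2$, and the variable names are my own.

```python
import numpy as np

# Stored pattern (1,1,1,0) in bipolar form; a single pattern, so m = 1.
x_stored = np.array([[1, 1, 1, -1]])
W = x_stored.T @ x_stored - 1 * np.eye(4)          # W = X X^T - m*I, as in Step 0

x = np.array([0.0, 0.0, 1.0, 0.0])                 # corrupted probe [0 0 1 0]
y = x.copy()
for i in [0, 3, 2, 1]:                             # update order y1, y4, y3, y2
    net = x[i] + y @ W[:, i]                       # y_in,i = x_i + sum_j y_j w_ji
    y[i] = 1.0 if net > 0 else (0.0 if net < 0 else y[i])
print(y)                                           # -> [1. 1. 1. 0.], the stored pattern
```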
Hopfield Example 2

• Suppose, for instance, that our network is required to memorize two opposite states (patterns), (1,1,1) and (−1,−1,−1). Thus,

$$X_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \quad X_2 = \begin{pmatrix} -1 \\ -1 \\ -1 \end{pmatrix} \qquad \text{or} \qquad X_1^T = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix}, \quad X_2^T = \begin{pmatrix} -1 & -1 & -1 \end{pmatrix}$$

• The $3 \times 3$ identity matrix $I$ is

$$I = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

• The number of input patterns is $m = 2$.
• How many neurons? 3 activations = 3 neurons.
• How many synapses (connections between neurons)? $(3 \times 3) - 3 = 6$.
Hopfield Example 2 (Cont.)

$$W = \sum_{k=1}^{m} X^k (X^k)^T - mI$$

• Thus, we can now determine the weight matrix as follows:

$$W = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \begin{pmatrix} 1 & 1 & 1 \end{pmatrix} + \begin{pmatrix} -1 \\ -1 \\ -1 \end{pmatrix} \begin{pmatrix} -1 & -1 & -1 \end{pmatrix} - 2 \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 2 & 2 \\ 2 & 0 & 2 \\ 2 & 2 & 0 \end{pmatrix}$$

• Next, the network is tested with the sequence of input vectors $X_1$ and $X_2$, which are equal to the output (or target) vectors, respectively.
Hopfield Example 2 (Cont.)

• First, we activate the Hopfield network by applying the input vector $X$. Then, we calculate the actual output vector $Y$, and finally, we compare the result with the initial input vector $X$.

$$Y_1 = f\!\left( \begin{pmatrix} 0 & 2 & 2 \\ 2 & 0 & 2 \\ 2 & 2 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} + \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \right) = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$$

$$Y_2 = f\!\left( \begin{pmatrix} 0 & 2 & 2 \\ 2 & 0 & 2 \\ 2 & 2 & 0 \end{pmatrix} \begin{pmatrix} -1 \\ -1 \\ -1 \end{pmatrix} + \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \right) = \begin{pmatrix} -1 \\ -1 \\ -1 \end{pmatrix}$$
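A quick NumPy check of this example (a sketch, with sign() standing in for the activation $f$ and zero thresholds; names are my own):

```python
import numpy as np

# Two stored patterns (1,1,1) and (-1,-1,-1), so m = 2 and n = 3.
X = np.array([[1, 1, 1],
              [-1, -1, -1]])
W = X.T @ X - 2 * np.eye(3)        # W = sum_k X^k (X^k)^T - m*I
print(W)                            # -> [[0,2,2],[2,0,2],[2,2,0]]

for x in X:                         # both stored patterns are fixed points
    print(x, "->", np.sign(W @ x))  # f(W x) maps each pattern back onto itself
```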
Self Organizing Map (SOM)
• The self-organizing map (SOM), also known as a Kohonen map, was introduced by Teuvo Kohonen in the 1980s.
• A SOM is a type of neural network, but unlike other NNs (feedforward networks, CNNs) it uses unsupervised learning:
  o It does not require class labels.
• A SOM is trained using competitive learning rather than error-correction learning (e.g., backpropagation with gradient descent).
• In competitive learning, neurons compete among themselves to be activated, and only a single output neuron is active at any time.
• The output neuron that wins the "competition" is called the winner-takes-all neuron or Best Matching Unit (BMU).
• Only the weight vectors of the winner and its neighboring units are updated.
• SOMs are used for clustering and mapping (or dimensionality reduction).
SOM Architecture
• SOMs have two layers; the first one is the input layer and the second
one is the output layer or the feature map.
• The output layer is a two-dimensional competitive layer, organized as
a 2D grid of units.
• Each neuron in the competitive layer is fully connected to all the
source units in the input layer.

SOM Algorithm
The Kohonen learning algorithm starts by initializing the synaptic weights to small random values, so that no prior order is imposed on the feature map. It then performs the following three processes:
• Competition:
  o For each input pattern, the neurons compute their respective values of a discriminant function. The particular neuron with the smallest value is declared the winner.
• Cooperation:
  o The winning neuron determines the spatial location of a topological neighborhood of excited neurons, thereby providing the basis for cooperation among such neighboring neurons.
• Synaptic weight adaptation:
  o Adjustments are applied to the synaptic weights of the excited neurons such that the response of the winning neuron is enhanced for similar input patterns.
Competition Process

• Suppose $m$ is the dimension of the input pattern:

$$\mathbf{x} = (x_1, x_2, \dots, x_m)^T$$

• If $l$ is the number of neurons in the output layer, then the synaptic weight vector of each neuron is

$$\mathbf{w}_j = (w_{j1}, w_{j2}, \dots, w_{jm})^T, \quad j \in [1, l]$$

• Now we compute the (squared) Euclidean distance for each neuron and select the smallest, i.e., we find the neuron $j$ that minimizes

$$D(j) = \sum_{i=1}^{m} (x_i - w_{ij})^2$$

The particular neuron that satisfies this condition is called the best-matching, or winning, neuron for the vector $\mathbf{x}$.
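A minimal sketch of the competition step (the helper name and the (l, m) weight layout are assumptions):

```python
import numpy as np

def best_matching_unit(x, W):
    """Index of the winning neuron for input x.

    x : input pattern of shape (m,).
    W : weights of shape (l, m), one weight vector w_j per output neuron.
    """
    D = np.sum((W - x) ** 2, axis=1)   # squared Euclidean distance D(j) for every neuron
    return int(np.argmin(D))           # the neuron with the smallest distance wins
```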
Cooperative Process

• The winning neuron locates the center of a topological neighborhood of cooperating neurons at distance $d_{j,i}$.
• We may assume that the topological neighborhood function $h_{j,i}$ satisfies:
  o The neighborhood $h_{j,i}$ is symmetric around the winning neuron.
  o The amplitude of $h_{j,i}$ decreases monotonically with increasing $d_{j,i}$.
Cooperative Process (Cont.)

• A good choice for the neighborhood function $h_{j,i}$, given the aforementioned requirements, is a Gaussian:

$$h_{j,i(x)} = \exp\left(-\frac{d_{j,i}^2}{2\sigma^2}\right), \quad j \in [1, l]$$

• For stability, the size of the topological neighborhood should shrink with time $t$:

$$\sigma(t) = \sigma_0 \exp\left(-\frac{t}{\tau_1}\right), \quad t = 0, 1, 2, \dots$$

• The time-dependent neighborhood function is then

$$h_{j,i}(t) = \exp\left(-\frac{d_{j,i}^2}{2\sigma^2(t)}\right), \quad t = 0, 1, 2, \dots$$
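The two formulas above can be combined into one small helper; this is a sketch and the parameter names are assumptions:

```python
import numpy as np

def neighborhood(d, t, sigma0, tau1):
    """Gaussian neighborhood h_{j,i}(t) with a width sigma(t) that shrinks over time.

    d : lattice distance(s) d_{j,i} between neuron j and the winning neuron i.
    """
    sigma_t = sigma0 * np.exp(-t / tau1)                      # sigma(t) = sigma_0 exp(-t/tau_1)
    return np.exp(-np.asarray(d, dtype=float) ** 2 / (2.0 * sigma_t ** 2))
```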
Cooperative Process (Cont.)

• The neighborhood function shows that the value of $h_{j,i}(t)$ depends on the distance $d_{j,i}$ from the position of the neuron being assessed to the position of the BMU.
• If $d_{j,i} = 0$ (the BMU is the neuron being assessed), then $h_{j,i}(t) = 1$.
• If $d_{j,i} = N_i(t)$ (the neuron being assessed is at the farthest position within the neighborhood radius $N_i(t)$), then $h_{j,i}(t)$ takes its smallest value within the neighborhood.
Adaptive Process

• During training, the winning neuron and its topological neighbors are adapted to make their weight vectors more similar to the input pattern that caused the activation.
• The adaptation rule (weight update) is

$$\mathbf{w}_j(t+1) = \mathbf{w}_j(t) + \eta(t)\, h_{j,i}(t)\, \big(\mathbf{x}(t) - \mathbf{w}_j(t)\big)$$

• The magnitude of the adaptation is controlled by a learning rate, which decays over time to ensure convergence of the SOM.
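In vectorized form, the adaptation rule can be applied to all neurons at once. This is a sketch, assuming the neighborhood values h have already been computed (e.g., with the helper above); the function name is my own.

```python
import numpy as np

def som_update(W, x, eta, h):
    """One adaptation step: w_j(t+1) = w_j(t) + eta(t) * h_{j,i}(t) * (x(t) - w_j(t)).

    W   : weights of shape (l, m);   x : input pattern of shape (m,)
    eta : learning rate at this iteration;   h : neighborhood values, shape (l,)
    """
    return W + eta * h[:, None] * (x - W)
```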
Adaptive Process (Cont.)

• The mathematical form of the learning rate decay is

$$\eta(t) = \eta_0 \exp\left(-\frac{t}{\tau_2}\right), \quad t = 0, 1, 2, \dots$$

where $\tau_2$ is a time constant.
• Training takes many iterations (1000 or more). The parameters should be chosen carefully so that $\eta$ starts at about 0.1 and stays above 0.01. The following are good choices:
  o $\eta_0 = 0.1$, $\tau_2 = 1000$, and $\tau_1 = \dfrac{1000}{\log \sigma_0}$
• Neurons that are closer to the winner adapt more heavily than neurons that are farther away.
Summary of the SOM Algorithm

Step 0:
• Initialize the synaptic weights $w_{ij}$ to small random values, say in the interval [0, 1].
• Set a maximum value for the neighborhood radius $R$.
• Set the learning rate parameter (a small positive value).

Step 1: While the stopping condition is false, do Steps 2-8.

Step 2: For each input pattern $\mathbf{x}$, chosen at random from the set of training data and presented to the network, do Steps 3-5.
Summary of the SOM Algorithm (Cont.)
• Step 3: For each neuron $j$, compute the Euclidean distance

$$D(j) = \sum_{i=1}^{n} (x_i - w_{ij})^2$$

• Step 4: Find the index $J$ such that $D(J)$ is a minimum.
• Step 5: Learning (update the synaptic weights). For all units $j$ within a specified neighborhood of $J$, and for all $i$:

$$w_{ij}(t+1) = w_{ij}(t) + \eta(t)\, h(t)\, \big(x_i - w_{ij}(t)\big)$$

• Step 6: Update the learning rate $\eta(t)$. It is a decreasing function of the number of epochs.
• Step 7: Reduce the radius of the topological neighborhood at specified times.
• Step 8: Test the stopping condition.
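The eight steps can be assembled into a compact training loop. This is a minimal sketch assuming a 1-D output layer, a fixed neighborhood radius, and the simple learning-rate halving used in the worked example that follows; the function and parameter names are illustrative, not from the slides.

```python
import numpy as np

def train_som(data, n_units=2, epochs=100, eta0=0.6, radius=0, seed=0):
    """Minimal 1-D SOM training loop following Steps 0-8 above."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    W = rng.random((n_units, data.shape[1]))              # Step 0: small random weights in [0, 1]
    eta = eta0
    for _ in range(epochs):                               # Step 1: repeat until stopping
        for x in rng.permutation(data):                   # Step 2: present patterns at random
            D = np.sum((W - x) ** 2, axis=1)              # Step 3: squared distances D(j)
            J = int(np.argmin(D))                         # Step 4: winning unit J
            for j in range(n_units):                      # Step 5: update the neighborhood of J
                if abs(j - J) <= radius:
                    W[j] += eta * (x - W[j])
        eta *= 0.5                                        # Steps 6-7: decay eta (radius kept fixed here)
    return W                                              # Step 8: stop after `epochs` epochs
```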
SOM Example

• Consider a simple example in which there are only 4 input training patterns:

x1  x2  x3  x4
 1   1   0   0
 0   0   0   1
 1   0   0   0
 0   0   1   1

• Classify these patterns using a SOM.
SOM Example Continued

• Let the initial weight matrix be

$$\begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \\ w_{31} & w_{32} \\ w_{41} & w_{42} \end{pmatrix} = \begin{pmatrix} 0.2 & 0.8 \\ 0.6 & 0.4 \\ 0.5 & 0.7 \\ 0.9 & 0.3 \end{pmatrix}$$

• The learning rate is initialized to 0.6 and decreases in each epoch according to the following expression:

$$\eta(t = 0) = 0.6, \qquad \eta(t + 1) = 0.5\,\eta(t)$$

• Let the topological radius be $R = 0$.
SOM Example Continued

• For vector (1,1,0,0) (we use the squared Euclidean distance for convenience):

$$D(1) = (1 - 0.2)^2 + (1 - 0.6)^2 + (0 - 0.5)^2 + (0 - 0.9)^2 = 1.86$$

$$D(1) = 1.86, \qquad D(2) = 0.98$$

Hence $J = 2$. Note that $R = 0$, so we need not update the weights of any neighboring neurons.

$$w_{ij}(t+1) = w_{ij}(t) + \eta(t)\, h(t)\, \big(x_i - w_{ij}(t)\big)$$

For example, $w_{12} = 0.8 + 0.6\,(1 - 0.8) = 0.92$.
• The new weight matrix is

$$\begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \\ w_{31} & w_{32} \\ w_{41} & w_{42} \end{pmatrix} = \begin{pmatrix} 0.2 & 0.92 \\ 0.6 & 0.76 \\ 0.5 & 0.28 \\ 0.9 & 0.12 \end{pmatrix}$$
SOM Example Continued

• Likewise, for the remaining training patterns:

• For vector (0,0,0,1): $D(1) = 0.66$, $D(2) = 2.2768$. Hence $J = 1$.

$$\begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \\ w_{31} & w_{32} \\ w_{41} & w_{42} \end{pmatrix} = \begin{pmatrix} 0.08 & 0.92 \\ 0.24 & 0.76 \\ 0.20 & 0.28 \\ 0.96 & 0.12 \end{pmatrix}$$

• For vector (1,0,0,0): $D(1) = 1.8656$, $D(2) = 0.6768$. Hence $J = 2$.

$$\begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \\ w_{31} & w_{32} \\ w_{41} & w_{42} \end{pmatrix} = \begin{pmatrix} 0.08 & 0.968 \\ 0.24 & 0.304 \\ 0.20 & 0.112 \\ 0.96 & 0.048 \end{pmatrix}$$

• For vector (0,0,1,1): hence $J = 1$.

$$\begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \\ w_{31} & w_{32} \\ w_{41} & w_{42} \end{pmatrix} = \begin{pmatrix} 0.032 & 0.968 \\ 0.096 & 0.304 \\ 0.680 & 0.112 \\ 0.984 & 0.048 \end{pmatrix}$$
SOM Example Continued

• Now reduce the learning rate (Step 6):

$$\eta(1) = \frac{\eta(0)}{2} = \frac{0.6}{2} = 0.3$$

• It can be shown that after 100 presentations (epochs) of all the input vectors, the final weight matrix is

$$\begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \\ w_{31} & w_{32} \\ w_{41} & w_{42} \end{pmatrix} = \begin{pmatrix} 6.7 \times 10^{-17} & 1 \\ 2 \times 10^{-16} & 0.49 \\ 0.51 & 2.3 \times 10^{-16} \\ 1 & 1 \times 10^{-16} \end{pmatrix}$$

• This matrix converges to (column 1 = Cluster 1, column 2 = Cluster 2):

$$\begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \\ w_{31} & w_{32} \\ w_{41} & w_{42} \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0 & 0.5 \\ 0.5 & 0 \\ 1 & 0 \end{pmatrix}$$
SOM Example Continued
• Test the network. Suppose the input pattern is (1,1,0,0). Then

$$D(j) = (w_{1j} - x_1)^2 + (w_{2j} - x_2)^2 + (w_{3j} - x_3)^2 + (w_{4j} - x_4)^2$$

$$D(1) = (0 - 1)^2 + (0 - 1)^2 + (0.5 - 0)^2 + (1 - 0)^2 = 3.25$$

$$D(2) = (1 - 1)^2 + (0.5 - 1)^2 + (0 - 0)^2 + (0 - 0)^2 = 0.25$$

• Thus neuron 2 is the "winner" and is the localized active region of the SOM. Notice that we may label this input pattern as belonging to cluster 2.
• For all the other patterns, we find the clusters listed below.

x1  x2  x3  x4  Cluster
 1   1   0   0     2
 0   0   0   1     1
 1   0   0   0     2
 0   0   1   1     1
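The entire worked example can be reproduced in a few lines. The sketch below presents the patterns in the listed order with R = 0 and halves the learning rate each epoch, as in the slides; variable names are my own.

```python
import numpy as np

X = np.array([[1, 1, 0, 0],      # the four training patterns, one per row
              [0, 0, 0, 1],
              [1, 0, 0, 0],
              [0, 0, 1, 1]], dtype=float)

W = np.array([[0.2, 0.8],        # initial weights: column j = weight vector of neuron j
              [0.6, 0.4],
              [0.5, 0.7],
              [0.9, 0.3]])

eta = 0.6
for epoch in range(100):
    for x in X:                                     # patterns presented in the listed order
        D = np.sum((W - x[:, None]) ** 2, axis=0)   # D(1), D(2)
        J = int(np.argmin(D))                       # winner (R = 0: update only this column)
        W[:, J] += eta * (x - W[:, J])
    eta *= 0.5                                      # halve the learning rate each epoch

print(np.round(W, 1))                               # approx. [[0,1],[0,0.5],[0.5,0],[1,0]]
for x in X:
    J = int(np.argmin(np.sum((W - x[:, None]) ** 2, axis=0)))
    print(x, "-> cluster", J + 1)                   # clusters 2, 1, 2, 1 as in the table
```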
Evaluation/Classification

• Sensitivity: the number of abnormal cases correctly identified divided by the total number of abnormal cases. That is, it is related to the test's ability to identify positive results.
• Specificity: the number of normal cases correctly identified as negative divided by the total number of normal cases. That is, it is related to the test's ability to identify negative results.
• Receiver Operating Characteristics (ROC) curves are practical for visualizing, organizing, and selecting classifiers based on their performance.
• They are extensively used in medical diagnosis.
• They have become an important tool for evaluating how accurate a test is in determining if a patient has a certain disease.
Evaluation/Classification Continued

• Let's consider a classification problem using only two classes.
  ▪ Each pattern $I$ is mapped to one element of the set $\{p, n\}$ of positive and negative class labels.
  ▪ A classification model (classifier) is a mapping from instances to predicted labels.
  ▪ We use the labels $\{Y, N\}$ for the class predictions made by a model.
• Given a classifier and an instance, four possible outcomes can occur:
  ▪ True positive: the instance is positive, and it is classified as positive.
  ▪ False negative: the instance is positive, and it is classified as negative.
  ▪ True negative: the instance is negative, and it is classified as negative.
  ▪ False positive: the instance is negative, and it is classified as positive.
Evaluation/Classification Continued

□ Given a classifier and a set of instances (the test set), a two-by-two confusion matrix can be constructed to represent the dispositions of the instances:

                      True Class
                      p                  n
Predicted Class   Y   True Positive      False Positive
                  N   False Negative     True Negative

□ The numbers along the major diagonal represent the correct decisions made by the model.
□ The numbers outside the major diagonal represent the errors (i.e., the confusion) between the various classes.
Evaluation/Classification Continued

[Figures: the confusion-matrix cells illustrated by comparing the True Class (Positive/Negative) with the Predicted Class (Yes/No), giving TP, FP, FN, and TN.]
Evaluation/Classification Continued

□ Sensitivity: the number of abnormal cases correctly identified divided by the total number of abnormal cases. That is, it is related to the test's ability to identify positive results.

$$\text{Sensitivity} = \frac{TP}{TP + FN}$$
Evaluation/Classification Continued

□ Specificity: the number of normal cases correctly identified as negative divided by the total number of normal cases. That is, it is related to the test's ability to identify negative results.

$$\text{Specificity} = \frac{TN}{TN + FP}$$
Evaluation/Classification Continued
$$\text{FP rate} = \frac{FP}{FP + TN}$$

$$\text{TP rate} = \frac{TP}{TP + FN} = \text{Recall}$$

$$\text{Precision} = \frac{TP}{TP + FP}$$

$$F_1\ \text{score} = 2 \cdot \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$

$$\text{Accuracy} = \frac{TP + TN}{P + n}$$
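These definitions translate directly into code; a small sketch (the returned dictionary of names is my own choice):

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard metrics computed from the four confusion-matrix counts."""
    recall = tp / (tp + fn)                     # sensitivity = TP rate
    precision = tp / (tp + fp)
    return {
        "sensitivity": recall,
        "specificity": tn / (tn + fp),
        "fp_rate": fp / (fp + tn),
        "precision": precision,
        "f1": 2 * precision * recall / (precision + recall),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }

# Counts matching the ROC example below (TP=5, FN=5, FP=1, TN=9): accuracy = 0.7
print(classification_metrics(tp=5, fp=1, fn=5, tn=9))
```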
Evaluation/Classification Continued

• Recall is crucial when we want to identify the positives.
• Specificity is crucial when we want to identify the negatives.
• Precision is crucial when we want to be more confident about our predicted positives.

• High recall: few patients with the disease are missed.
• High precision: few healthy individuals are incorrectly diagnosed as having the disease.
ROC curve

• An ROC curve is a graph showing the performance of a classification model at all classification thresholds.
• ROC graphs are two-dimensional graphs in which the TP rate is plotted on the Y axis and the FP rate is plotted on the X axis.
• Area Under the Curve (AUC): the bigger the area under the curve (the closer to 1), the better our model is.
• The better the test, the closer the curve comes to the upper left corner.
ROC curve Continued

• Example on a test set of 20 instances.
• [Figure: ROC curve with the true positive rate on the Y axis and the false positive rate on the X axis.]
• The ROC point at (0.1, 0.5) produces the highest accuracy (70%):

$$\text{Accuracy} = \frac{TP + TN}{TP + FN + FP + TN} = \frac{5 + 9}{5 + 5 + 1 + 9} = \frac{14}{20} = 70\%$$
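A sketch of how ROC points are obtained by sweeping a decision threshold over classifier scores (the helper name and the toy labels/scores below are invented for illustration, not taken from the slides):

```python
import numpy as np

def roc_points(labels, scores):
    """(FP rate, TP rate) pairs, using each distinct score as a threshold.

    labels : true classes, 1 for positive (p) and 0 for negative (n).
    scores : classifier scores; an instance is predicted positive if score >= threshold.
    """
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    P, N = labels.sum(), len(labels) - labels.sum()
    pts = []
    for thr in np.unique(scores)[::-1]:              # sweep thresholds from high to low
        pred = scores >= thr
        tp = int(np.sum(pred & (labels == 1)))
        fp = int(np.sum(pred & (labels == 0)))
        pts.append((fp / N, tp / P))
    return pts

# Toy example with invented labels and scores:
print(roc_points([1, 1, 0, 1, 0, 0], [0.9, 0.7, 0.6, 0.55, 0.4, 0.2]))
```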
Assignment

Determine the accuracy at a threshold = 0.5

Instance Class Score Instance Class Score


1 P 0.99 11 P 0.41
2 P 0.85 12 n 0.4
3 P 0.7 13 n 0.28
4 P 0.6 14 n 0.27
5 n 0.55 15 n 0.26
6 P 0.54 16 n 0.25
7 0 0.53 17 n 0.24
8 P 0.52 18 n 0.23
9 n 0.51 19 n 0.2
10 P 0.49 20 n 0.1

Evaluation/Segmentation

□ For segmentation, the algorithm's output is compared against the doctor's annotation:

                        Doctor
                        Mass     No Mass
Algorithm   Mass        TP       FP
            No Mass     FN       TN

□ Sensitivity and specificity are then computed from these counts as defined above.
Evaluation/Segmentation Continued

□ Jaccard Index: it is defined as the size of the intersection divided by the size of the union of the sample sets.

$$J(A, B) = \frac{|A \cap B|}{|A \cup B|}$$
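A sketch of the Jaccard index for two binary segmentation masks (the function name is assumed):

```python
import numpy as np

def jaccard_index(mask_a, mask_b):
    """Jaccard index |A ∩ B| / |A ∪ B| for two binary masks of the same shape."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 1.0

# Example: intersection = 2 pixels, union = 4 pixels -> J = 0.5
print(jaccard_index([1, 1, 1, 0, 0], [0, 1, 1, 1, 0]))
```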