Module 7 (Neural Network & Evaluation)

The document discusses neural networks and performance evaluation. It explains neural networks, backpropagation, and using neural networks for classification. It also explains the importance of performance evaluation, confusion matrix, and metrics like accuracy, precision, recall, and F1 score that can be derived from the confusion matrix.


Image Processing

Habibullah Akbar, PhD

SESSION 7

Neural Network & Evaluation

www.esaunggul.ac.id
Learning Outcomes for Session 7

o After studying the module in this session,

• Students can summarize neural network models
• Students can explain the importance of evaluating the performance of classification models such as KNN, ANN, and CNN
• Students can explain what a confusion matrix table is
• Students can explain how to measure accuracy, precision, recall, and F measure

Outline

• Neural Networks
• Introduction to Performance Evaluation
• Confusion Matrix
• Accuracy
• Precision
• Recall
• F measure
Let's start with the Perceptron

Output Y is 1 if at least two of the three inputs are equal to 1.

X1  X2  X3  Y
1   0   0   0
1   0   1   1
1   1   0   1
1   1   1   1
0   0   1   0
0   1   0   0
0   1   1   1
0   0   0   0

[Figure: a black box with inputs X1, X2, X3 and output Y]
Perceptron

Model is an assembly of inter-connected nodes and weighted links.

Output node sums up each of its input values according to the weights of its links.

[Figure: input nodes X1, X2, X3 connect through links weighted w1, w2, w3 to a summing output node Y with threshold t]
Perceptron Model

Compare output node against some threshold t:

$Y = I\left(\sum_i w_i X_i - t > 0\right)$ or

$Y = \operatorname{sign}\left(\sum_i w_i X_i - t\right)$
Perceptron

[Figure: the perceptron applied to the truth table above, with input nodes X1, X2, X3, each link weighted 0.3, and output node Y with threshold t = 0.4]

$Y = I(0.3 X_1 + 0.3 X_2 + 0.3 X_3 - 0.4 > 0)$

where $I(z) = 1$ if $z$ is true, and $0$ otherwise.
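
As a quick check of the weights on this slide, the following minimal Python sketch evaluates the perceptron on all eight input combinations and compares the result with the rule "Y is 1 if at least two of the three inputs are 1". The function name `perceptron` and the constants `WEIGHTS` and `THRESHOLD` are illustrative names, not part of the original slides.

```python
from itertools import product

def perceptron(x, weights, t):
    """Return 1 if the weighted sum of the inputs exceeds threshold t, else 0."""
    weighted_sum = sum(w * xi for w, xi in zip(weights, x))
    return 1 if weighted_sum - t > 0 else 0

# Values taken from the slide: w1 = w2 = w3 = 0.3, t = 0.4
WEIGHTS = (0.3, 0.3, 0.3)
THRESHOLD = 0.4

# Enumerate every (X1, X2, X3) combination and compare with the target rule.
results = []
for x in product((0, 1), repeat=3):
    y = perceptron(x, WEIGHTS, THRESHOLD)
    expected = 1 if sum(x) >= 2 else 0
    results.append((x, y, expected))
    print(x, "->", y, "(expected", expected, ")")
```

One active input gives 0.3 - 0.4 < 0, while two or three active inputs give a positive value, so the single threshold separates the two classes.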
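Artificial Neural Network

[Figure: a network with inputs x1 to x5 feeding an input layer, a hidden layer, and an output layer that produces y. Inset: neuron i receives inputs I1, I2, I3 over weights wi1, wi2, wi3, sums them into Si with threshold t, and emits output Oi = g(Si), where g is the activation function.]

Training ANN means learning the weights of the neurons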
How A Multi-Layer ANN Works
• The inputs to the network correspond to the attributes measured for each
training tuple
• Inputs are fed simultaneously into the units making up the input layer
• They are then weighted and fed simultaneously to a hidden layer
• The number of hidden layers is arbitrary, although usually only one
• The weighted outputs of the last hidden layer are input to units making up
the output layer, which emits the network's prediction
• The network is feed-forward: None of the weights cycles back to an input unit
or to an output unit of a previous layer
• From a statistical point of view, networks perform nonlinear regression: Given
enough hidden units and enough training samples, they can closely
approximate any function
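
The feed-forward flow described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the sigmoid activation, the helper names `layer` and `feed_forward`, and all weight and threshold values are hypothetical and not taken from the slides.

```python
import math

def sigmoid(s):
    """Assumed activation function g; the slides do not fix a particular one."""
    return 1.0 / (1.0 + math.exp(-s))

def layer(inputs, weights, thresholds):
    """One fully connected layer: each unit emits g(sum_j w_j * I_j - t)."""
    return [
        sigmoid(sum(w * x for w, x in zip(unit_weights, inputs)) - t)
        for unit_weights, t in zip(weights, thresholds)
    ]

def feed_forward(x, hidden_w, hidden_t, output_w, output_t):
    """Inputs -> hidden layer -> output layer, with no connections going back."""
    hidden = layer(x, hidden_w, hidden_t)
    return layer(hidden, output_w, output_t)

# Hypothetical network: 3 inputs, 2 hidden units, 1 output unit.
hidden_w = [[0.2, -0.4, 0.1], [0.5, 0.3, -0.6]]
hidden_t = [0.1, -0.2]
output_w = [[0.7, -0.3]]
output_t = [0.05]

prediction = feed_forward([1.0, 0.0, 1.0], hidden_w, hidden_t, output_w, output_t)
print(prediction)
```

Each call to `layer` uses only the outputs of the previous layer, which is what makes the network feed-forward.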
Defining a Network Topology

• Decide the network topology: Specify # of units in the input layer, # of hidden layers (if > 1), # of units in each hidden layer, and # of units in the output layer
• Normalize the input values for each attribute measured in the training tuples to [0.0, 1.0]
• One input unit per domain value, each initialized to 0
• For classification with more than two classes, one output unit per class is used
• If a trained network's accuracy is unacceptable, repeat the training process with a different network topology or a different set of initial weights
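
Two of the preparation steps above, scaling an attribute into the range 0.0 to 1.0 and using one output unit per class, can be sketched as follows. Min-max scaling is assumed here as the normalization method, and the helper names `min_max_normalize` and `one_hot` are illustrative only.

```python
def min_max_normalize(values):
    """Rescale one attribute's values linearly so they fall in [0.0, 1.0]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant attribute carries no information; map it to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def one_hot(label, classes):
    """Target vector with one output unit per class: 1 for the true class."""
    return [1 if label == c else 0 for c in classes]

print(min_max_normalize([10, 20, 30]))      # [0.0, 0.5, 1.0]
print(one_hot("b", ["a", "b", "c"]))        # [0, 1, 0]
```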
Learning Algorithm: Backpropagation
• Iteratively process a set of training tuples & compare the network's prediction
with the actual known target value
• For each training tuple, the weights are modified to minimize the mean squared
error between the network's prediction and the actual target value
• Modifications are made in the “backwards” direction: from the output layer,
through each hidden layer down to the first hidden layer, hence
“backpropagation”
• Steps
– Initialize weights to small random numbers, associated with biases
– Propagate the inputs forward (by applying activation function)
– Backpropagate the error (by updating weights and biases)
– Terminating condition (when error is very small, etc.)
Learning Algorithm: Backpropagation
• Initialize the weights (w0, w1, …, wk)
• Adjust the weights in such a way that the output of ANN is
consistent with class labels of training examples
– Objective function:

$E = \sum_i \left[ Y_i - f(w_i, X_i) \right]^2$

– Find the weights $w_i$'s that minimize the above objective function
• e.g., backpropagation algorithm (see lecture notes)
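
The backpropagation steps can be sketched on the "at least two of three inputs" data from the perceptron slides. This is a minimal illustration, not the lecture's implementation: the 3-2-1 topology, sigmoid activation, fixed initial weights (used instead of random ones so the run is repeatable), learning rate, and epoch count are all assumptions.

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

# Truth table from the perceptron slides: Y = 1 if at least two inputs are 1.
DATA = [((1, 0, 0), 0), ((1, 0, 1), 1), ((1, 1, 0), 1), ((1, 1, 1), 1),
        ((0, 0, 1), 0), ((0, 1, 0), 0), ((0, 1, 1), 1), ((0, 0, 0), 0)]

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Propagate the inputs forward through one hidden layer."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(w_hidden, b_hidden)]
    out = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)
    return hidden, out

def squared_error(w_hidden, b_hidden, w_out, b_out):
    """Objective E: sum of squared differences between target and output."""
    return sum((y - forward(x, w_hidden, b_hidden, w_out, b_out)[1]) ** 2
               for x, y in DATA)

def train(epochs=2000, lr=0.5):
    # Small fixed initial weights (assumed values) and zero biases.
    w_hidden = [[0.1, -0.2, 0.3], [-0.3, 0.2, 0.1]]
    b_hidden = [0.0, 0.0]
    w_out = [0.2, -0.1]
    b_out = 0.0
    initial = squared_error(w_hidden, b_hidden, w_out, b_out)
    for _ in range(epochs):
        for x, y in DATA:
            hidden, out = forward(x, w_hidden, b_hidden, w_out, b_out)
            # Error term at the output unit (derivative of the sigmoid included).
            delta_out = (y - out) * out * (1 - out)
            # Propagate the error backwards to each hidden unit.
            delta_hidden = [delta_out * w_out[j] * hidden[j] * (1 - hidden[j])
                            for j in range(len(hidden))]
            # Update output-layer weights and bias.
            for j in range(len(hidden)):
                w_out[j] += lr * delta_out * hidden[j]
            b_out += lr * delta_out
            # Update hidden-layer weights and biases.
            for j in range(len(hidden)):
                for i in range(len(x)):
                    w_hidden[j][i] += lr * delta_hidden[j] * x[i]
                b_hidden[j] += lr * delta_hidden[j]
    final = squared_error(w_hidden, b_hidden, w_out, b_out)
    return initial, final, (w_hidden, b_hidden, w_out, b_out)

initial_error, final_error, params = train()
print("E before training:", initial_error)
print("E after training: ", final_error)
```

The hidden-layer error terms are computed before the output weights are changed, so each update uses the weights that produced the current prediction.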
Efficiency and Interpretability
• Efficiency of backpropagation: Each epoch (one iteration through the training set) takes O(|D| * w), with |D| tuples and w weights, but the # of epochs can be exponential in n, the number of inputs, in the worst case
• For easier comprehension: Rule extraction by network pruning
– Simplify the network structure by removing weighted links that
have the least effect on the trained network
– Then perform link, unit, or activation value clustering
– The set of input and activation values are studied to derive rules
describing the relationship between the input and hidden unit
layers
• Sensitivity analysis: assess the impact that a given input variable has
on a network output. The knowledge gained from this analysis can be
represented in rules
Neural Network as a Classifier
Weaknesses

– Long training time
– Requires a number of parameters typically best determined empirically, e.g., the network topology or “structure.”
– Poor interpretability: Difficult to interpret the symbolic meaning behind the learned weights and of “hidden units” in the network

Strengths

– High tolerance to noisy data
– Ability to classify untrained patterns
– Well-suited for continuous-valued inputs and outputs
– Successful on an array of real-world data, e.g., hand-written letters
– Algorithms are inherently parallel
– Techniques have recently been developed for the extraction of rules from trained neural networks
Introduction to Performance Evaluation

• Evaluating the performance of classification models such as KNN, ANN, and CNN is important.
• Take a Covid-19 dataset as an example: we certainly do not want a wrong classification, such as a patient who is actually Covid-19 positive being detected as negative.
Introduction to Performance Evaluation

• The first step in evaluation is to split the dataset into a training set and a test set.
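
A minimal sketch of such a split in plain Python is shown below. The 80/20 ratio, the fixed seed, and the function name `train_test_split` are assumptions for illustration; the slides do not specify a ratio.

```python
import random

def train_test_split(samples, test_fraction=0.2, seed=0):
    """Shuffle the samples and hold out a fraction of them as the test set."""
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# 100 hypothetical sample indices: 80 go to training, 20 to testing.
train_set, test_set = train_test_split(range(100))
print(len(train_set), len(test_set))   # 80 20
```

Shuffling before the split avoids a test set that contains only the last records of the file, which may all belong to one class.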
Introduction to Performance Evaluation

• Next, the classification model (such as KNN, ANN, or CNN) is trained on the training set.
• In the VGG example (modules 11 and 12), training means searching for the best VGG parameters.
Introduction to Performance Evaluation

• The performance of the resulting model then needs to be evaluated before the model can be used in business or the real world.
Performance Evaluation

• Evaluation can be carried out using metrics.
• Examples of metrics are accuracy, precision, recall, and F1 score.
• These metrics can be derived from the confusion matrix.
Confusion Matrix

• Suppose we want to detect Covid-19.
• A confusion matrix is a table for evaluating the performance of a classification method.

                          PREDICTED RESULT
                          positive               negative
ACTUAL LABEL   positive   true positives (TP)    false negatives (FN)
               negative   false positives (FP)   true negatives (TN)
Confusion Matrix

• A positive label means the patient actually has Covid-19, while a negative label means the patient actually does not have Covid-19. In the table this is the ACTUAL LABEL.
• The PREDICTED RESULT of the trained classification method can be wrong.

                          PREDICTED RESULT
                          positive               negative
ACTUAL LABEL   positive   true positives (TP)    false negatives (FN)
               negative   false positives (FP)   true negatives (TN)
Confusion Matrix

• When the classification method predicts positive and the actual label is positive, the matrix records a true positive (correctly positive). Likewise, when the prediction is negative and the label is indeed negative, we call it a true negative.
• However, when the prediction says the patient is Covid-19 positive but the label is actually negative, the case is called a false positive (an error in predicting positive). The opposite case is called a false negative.

                          PREDICTED RESULT
                          positive               negative
ACTUAL LABEL   positive   true positives (TP)    false negatives (FN)
               negative   false positives (FP)   true negatives (TN)
Example

• In the example table, 10 samples are predicted incorrectly. Four patients whose label is negative are predicted Covid-19 positive. As a result, they may be quarantined even though this is unnecessary.
• Conversely, 6 patients who do have Covid-19 are predicted negative. As a result, the disease may spread further because these patients can move around freely.

                          PREDICTED RESULT
                          positive   negative
ACTUAL LABEL   positive   60         6
               negative   4          30
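
The four cells of this table can be counted directly from paired lists of actual labels and predictions. The sketch below rebuilds the example (60 TP, 6 FN, 4 FP, 30 TN) from synthetic label lists; the function name `confusion_counts` and the English label strings are illustrative assumptions.

```python
def confusion_counts(actual, predicted, positive="positive"):
    """Count TP, FN, FP, TN by comparing each actual label with its prediction."""
    counts = {"TP": 0, "FN": 0, "FP": 0, "TN": 0}
    for a, p in zip(actual, predicted):
        if a == positive:
            counts["TP" if p == positive else "FN"] += 1
        else:
            counts["FP" if p == positive else "TN"] += 1
    return counts

# Synthetic labels matching the example table: 66 actual positives, 34 negatives.
actual = ["positive"] * 66 + ["negative"] * 34
predicted = (["positive"] * 60 + ["negative"] * 6      # for the actual positives
             + ["positive"] * 4 + ["negative"] * 30)   # for the actual negatives

counts = confusion_counts(actual, predicted)
print(counts)
print("wrong predictions:", counts["FP"] + counts["FN"])   # 10
```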
Metrics
• Several metrics for measuring the performance of a classification model can be derived from the confusion matrix, such as:
– Accuracy
– Precision
– Recall
– F measure
Accuracy

• Accuracy is the share of all predictions that are correct, that is, the True Positives (TP) and True Negatives (TN) out of all samples. It is calculated as follows:

$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$

• The weakness of accuracy appears when the dataset is imbalanced, for example when the dataset contains far too many positive samples and only a few negative samples. To address this, the F measure can be used.
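
A short sketch of the accuracy calculation, (TP + TN) divided by the total number of samples, applied to the example table (60, 6, 4, 30). The second call uses a hypothetical imbalanced dataset, invented here for illustration, in which a model predicts "positive" for every patient.

```python
def accuracy(tp, fn, fp, tn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fn + fp + tn)

# Example table from the slides: 90 of 100 predictions are correct.
example = accuracy(tp=60, fn=6, fp=4, tn=30)
print(example)            # 0.9

# Hypothetical imbalanced data: 95 positives, 5 negatives, everything predicted positive.
always_positive = accuracy(tp=95, fn=0, fp=5, tn=0)
print(always_positive)    # 0.95
```

In the second case not a single negative patient is identified correctly, yet accuracy is still 0.95, which is why accuracy alone can mislead on imbalanced data.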
Precision

• Precision is the ratio of correct positive predictions to all positive predictions. It is calculated as follows:

$\text{Precision} = \frac{TP}{TP + FP}$

• Precision can be read as: of all patients predicted Covid-19 positive, how many actually have a positive label.
Recall

• Recall (also called sensitivity) is the ratio of correct positive predictions to all samples whose actual label is positive:

$\text{Recall} = \frac{TP}{TP + FN}$

• Recall is used to evaluate the risk that a patient who is actually Covid-19 positive is detected as negative.
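F measure
• The F measure is the weighted harmonic mean of precision and recall. With equal weights it is the F1 score:

$F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$

• The F measure is more appropriate when the labeled dataset is imbalanced.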
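
Precision, recall, and the F measure can be computed from the example confusion matrix (TP = 60, FN = 6, FP = 4, TN = 30) as sketched below. The function names and the `beta` weighting parameter are illustrative assumptions; `beta=1.0` gives the F1 score, the harmonic mean of precision and recall.

```python
def precision(tp, fp):
    """Correct positive predictions divided by all positive predictions."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Correct positive predictions divided by all actual positives."""
    return tp / (tp + fn)

def f_measure(p, r, beta=1.0):
    """Weighted harmonic mean of precision p and recall r (beta=1 gives F1)."""
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)

p = precision(tp=60, fp=4)    # 60 / 64
r = recall(tp=60, fn=6)       # 60 / 66
f1 = f_measure(p, r)          # equals 2*TP / (2*TP + FP + FN) = 120 / 130

print(f"precision={p:.4f} recall={r:.4f} F1={f1:.4f}")
```

A `beta` above 1 weights recall more heavily, which may suit the Covid-19 case where a missed positive patient is the costlier error.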
Notes

• Besides splitting the dataset into a training set and a test set, there are other performance evaluation methods, such as:
– Cross validation
– Receiver operating characteristic (ROC) curve