ML Module 4
Bayes' Theorem is a fundamental concept in probability theory and forms the foundation of Bayesian
learning in machine learning. It allows you to update the probability of a hypothesis (or event) based on
new evidence.
At its core, Bayes' Theorem relates current knowledge or belief about an event (the prior probability) to
new data or evidence (the likelihood) to produce an updated belief (the posterior probability).
P(H | D) = P(D | H) * P(H) / P(D)
Where:
• P(H | D) is the posterior probability: the probability of the hypothesis H being true given the data D.
• P(D | H) is the likelihood: the probability of observing the data D given that hypothesis H is true.
• P(H) is the prior probability: the initial belief about the hypothesis H before any data is observed.
• P(D) is the marginal likelihood or evidence: the total probability of the data under all possible
hypotheses. This acts as a normalizing constant to ensure that the posterior is a valid probability
distribution.
1. Prior Probability (P(H)):
o This is the initial belief about the hypothesis before any data is observed.
o Example: In a medical test scenario, it could be the prior probability of a person having a
disease before considering the test results (e.g., based on the general population statistics).
2. Likelihood (P(D | H)):
o This is the probability of observing the data, assuming the hypothesis is true. It expresses how
likely it is to see the given data under the assumption of the hypothesis.
o Example: The likelihood would be the probability of getting a positive test result assuming the
person has the disease.
3. Evidence (P(D)):
o This is the total probability of the data across all hypotheses. It serves to normalize the
posterior probability so that it sums to 1.
o Example: The probability of getting a positive test result across all people, whether they have
the disease or not.
4. Posterior Probability (P(H | D)):
o This is the updated belief about the hypothesis after considering the new data (the evidence).
o Example: The posterior would give the probability of a person having the disease after
considering both the prior knowledge and the test results.
• Before you collect any data, you have a prior belief about a hypothesis (e.g., the probability of a patient
having a disease).
• After seeing new data (e.g., the result of a medical test), you update your belief about the hypothesis to
reflect this new evidence.
Bayes’ Theorem lets you do this systematically, ensuring that your updated belief (posterior) is
proportional to the prior belief and the likelihood of observing the new data.
2. Likelihood (P(D | H)):
o This is the probability of getting a positive test result if the person has the disease. Suppose the
test correctly identifies the disease 95% of the time, so P(D | H) = 0.95.
3. Evidence (P(D)):
o This is the total probability of a positive test result in the population. It includes both people
who have the disease and those who do not.
4. Posterior Probability (P(H | D)):
o After receiving a positive test result, we want to calculate the probability that the person
actually has the disease.
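The calculation can be sketched in Python as below. This is a minimal illustration: the likelihood P(D | H) = 0.95 is taken from the example above, while the prior P(H) = 0.01 and the false-positive rate P(D | not H) = 0.05 are assumed values chosen only to make the example concrete.

# A minimal sketch of the medical-test example using Bayes' Theorem.
# P(D|H) = 0.95 comes from the text; the prior P(H) = 0.01 and the
# false-positive rate P(D|not H) = 0.05 are illustrative assumptions.

def posterior(prior, likelihood, false_positive_rate):
    """Return P(H | D) for a positive test result D."""
    # Evidence P(D): total probability of a positive result over both hypotheses
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

p_disease_given_positive = posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(round(p_disease_given_positive, 3))  # ~0.161: still low, because the disease is rare

Even with a highly accurate test, the posterior stays small when the prior is small, which is exactly the point Bayes' Theorem makes explicit.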
Chapter 10
Artificial Neural Networks
The term "Artificial neural network" refers to a biologically inspired sub-field of artificial intelligence
modelled after the brain.
An artificial neural network is a computational network inspired by the biological neural networks
that make up the structure of the human brain.
Just as the human brain has neurons interconnected with each other, artificial neural networks also have
neurons that are linked to each other in various layers of the network. These neurons are known as
nodes.
• Dendrites are tree-like networks made of nerve fibre connected to the cell body.
• An Axon is a single, long connection extending from the cell body and carrying signals from the
neuron. The end of the axon splits into fine strands. Each strand terminates in a small bulb-like
organ called a synapse. It is through the synapse that the neuron passes its signals to other nearby
neurons. The receiving ends of these synapses on the nearby neurons can be found both on the
dendrites and on the cell body. There are approximately 10^4 synapses per neuron in the human
body. An electric impulse is passed between the synapse and the dendrites. It is a chemical process
which results in an increase/decrease in the electric potential inside the body of the receiving cell. If
the electric potential reaches a threshold value, the receiving cell fires and a pulse/action potential of
fixed strength and duration is sent through the axon to the synaptic junctions of the cell. After that,
the cell has to wait for a period called the refractory period.
ARTIFICIAL NEURONS:
Artificial neurons are like biological neurons that are linked to each other in various layers of the
network. These neurons are known as nodes.
A node or a neuron can receive one or more inputs and process them. Artificial neurons are
connected by connection links to other neurons. Each connection link is associated with a synaptic
weight. The structure of a single neuron is shown below:
Fig: McCulloch-Pitts Neuron mathematical model.
Basically, a neuron takes an input signal (dendrite), processes it like the CPU (soma), and passes
the output through a cable-like structure to other connected neurons (axon to synapse to
another neuron's dendrite).
OR
Working:
The received inputs are computed as a weighted sum which is given to the activation function,
and if the sum exceeds the threshold value the neuron gets fired. The neuron is the basic
processing unit that receives a set of inputs x1, x2, x3, ..., xn and their associated weights
w1, w2, w3, ..., wn. The summation function computes the weighted sum of the inputs
received by the neuron.
Sum = Σ xi * wi
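A minimal sketch of this computation in Python; the inputs, weights and threshold below are illustrative assumptions, not values from the notes.

# A single neuron: weighted sum followed by a threshold (step) activation.
# The inputs, weights and threshold theta are illustrative assumptions.

def neuron_output(inputs, weights, theta):
    # Weighted sum of the inputs: Sum = sum(x_i * w_i)
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    # Binary step activation: fire (1) only if the sum reaches the threshold
    return 1 if weighted_sum >= theta else 0

print(neuron_output(inputs=[1, 0, 1], weights=[0.5, -0.2, 0.3], theta=0.6))  # -> 1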
Activation functions:
• To make the work more efficient and to obtain an exact output, some force or activation is
given. Likewise, an activation function is applied over the net input to calculate the output of an ANN.
• Information processing of a processing element has two major parts: input and output.
1. Identity function (linear function): The output is the same as the input, i.e. the weighted sum.
The function is useful when we do not apply any threshold. The output value ranges
between -∞ and +∞.
2. Binary step function: This function can be defined as
f(x) = 1 if x ≥ θ
       0 if x < θ
Where θ represents the threshold value. It is used in single layer nets to convert
the net input to an output that is binary (0 or 1).
3. Bipolar step function: This function can be defined as
f(x) = 1 if x ≥ θ
      -1 if x < θ
Where θ represents the threshold value. It is used in single layer nets to convert
the net input to an output that is bipolar (+1 or -1).
4. Sigmoid function: It is used in back propagation nets.
Two types:
a) Binary sigmoid function: It is also termed the logistic sigmoid function or unipolar
sigmoid function. It is defined as
f(x) = 1 / (1 + e^(-λx))
where λ represents the steepness parameter. The range of the sigmoid function is 0
to 1.
b) Bipolar sigmoid function: This function is defined as
f(x) = (1 - e^(-λx)) / (1 + e^(-λx))
where λ represents the steepness parameter. The range of this function is between -1
and +1.
6. Tanh function: The hyperbolic tangent (Tanh) function is very similar to the sigmoid
activation function, and even has the same S-shape, with the difference of an output range of -1 to
1. In Tanh, the larger the input (more positive), the closer the output value will be to 1.0,
whereas the smaller the input (more negative), the closer the output will be to -1.0.
7. ReLU Function
ReLU stands for Rectified Linear Unit and is defined as f(x) = max(0, x).
Although it gives an impression of a linear function, ReLU has a derivative function and
allows for backpropagation while simultaneously being computationally efficient.
The main catch here is that the ReLU function does not activate all the neurons at the same
time. A neuron will be deactivated only if the output of the linear transformation is less than 0.
8. Softmax function: Softmax is an activation function that scales numbers/logits into
probabilities. The output of a Softmax is a vector (say v) with probabilities of each
possible outcome. The probabilities in vector v sum to one over all possible outcomes or
classes.
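A minimal sketch of these activation functions in Python/NumPy; the steepness parameter λ defaults to 1 here as an assumption, since the notes do not fix a value.

import numpy as np

def binary_step(x, theta=0.0):
    return np.where(x >= theta, 1, 0)          # output 0 or 1

def bipolar_step(x, theta=0.0):
    return np.where(x >= theta, 1, -1)         # output +1 or -1

def binary_sigmoid(x, lam=1.0):
    return 1.0 / (1.0 + np.exp(-lam * x))      # range (0, 1)

def bipolar_sigmoid(x, lam=1.0):
    return (1.0 - np.exp(-lam * x)) / (1.0 + np.exp(-lam * x))  # range (-1, 1)

def relu(x):
    return np.maximum(0, x)                    # f(x) = max(0, x)

def softmax(x):
    e = np.exp(x - np.max(x))                  # subtract max for numerical stability
    return e / e.sum()                         # probabilities summing to 1

print(softmax(np.array([2.0, 1.0, 0.1])))      # e.g. [0.659 0.242 0.099]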
• Knowledge is acquired by the network from its environment through a learning process.
PERCEPTRON AND LEARNING THEORY
• The perceptron makes its predictions based on a linear predictor function combining a set of
weights with the feature vector.
• One type of ANN system is based on a unit called a perceptron.
OR
• The perceptron can represent all the primitive Boolean functions AND, OR, NAND and NOR.
• Some Boolean functions cannot be represented.
– E.g. the XOR function.
Major components of a perceptron
• Input
• Weight
• Bias
• Weighted summation
• Step/activation function
• Output
WORKING:
• Feed the features of the model that is to be trained as input to the first layer. All
weights and inputs will be multiplied, and the multiplied results of each weight and input will be
added up. The bias value will be added to shift the output function. This value is
presented to the activation function (the type of activation function will depend on the need).
The value received after the last step is the output value.
The activation function is a binary step function which outputs a value 1 if f(x) is above the
threshold value Θ and a 0 if f(x) is below the threshold value Θ. Then the output of a neuron
is:
o = 1 if Σ wi * xi ≥ Θ, otherwise 0
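A minimal sketch of a perceptron with the perceptron learning rule in Python; the training data, learning rate and initial weights below are illustrative assumptions, not values from the notes.

import numpy as np

class Perceptron:
    """Single perceptron: weighted sum + bias followed by a binary step."""
    def __init__(self, n_inputs, lr=0.4):
        self.w = np.zeros(n_inputs)   # weights
        self.b = 0.0                  # bias
        self.lr = lr                  # learning rate

    def predict(self, x):
        return 1 if np.dot(self.w, x) + self.b >= 0 else 0

    def train(self, X, y, epochs=20):
        for _ in range(epochs):
            for xi, target in zip(X, y):
                error = target - self.predict(xi)   # perceptron learning rule
                self.w += self.lr * error * xi
                self.b += self.lr * error

# Example: learning the AND function (illustrative data)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])
p = Perceptron(n_inputs=2)
p.train(X, y)
print([p.predict(xi) for xi in X])   # expected: [0, 0, 0, 1]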
PROBLEM:
Design a 2-layer network of perceptrons to implement the NAND gate. Assume your own weights and
biases in the range [-0.5, 0.5]. Use a learning rate of 0.4.
Solution:
Fig: Network with inputs X1, X2, bias input X0, hidden unit X3 and output unit X4, connected by
weights w13, w23, w34 and biases Θ3, Θ4.
Truth table:
X1  X2  AND  NOT(AND) = NAND
0   0   0    1
0   1   0    1
1   0   0    1
1   1   1    0
ITERATION 1:
Step 1: FORWARD PROPAGATION
1. Calculate net inputs and outputs in the input layer as shown in Table 3.
Table 3: Net Input and Output Calculation
Input Layer   Ij   Oj
X1            0    0
X2            1    1
2. Calculate net inputs and outputs in the hidden and output layer as shown in Table 4.
Table 4: Net Input and Output Calculation in Hidden and Output Layer
Unit j   Net Input Ij                           Net Output Oj
X3       I3 = X1*W13 + X2*W23 + X0*Θ3 = -0.2    O3 = 1 / (1 + e^(-I3)) = 1 / (1 + e^(0.2)) = 0.450
Unit k   Net Input Ik                            Net Output Ok
X4       I4 = O3*W34 + X0*Θ4                     O4 = 1 / (1 + e^(-I4))
         = (0.450 * 0.3) + 1(-0.3) = -0.165      = 1 / (1 + e^(0.165)) = 0.458
3. Calculate Error
Error = Odesired - Oestimated
      = 1 - 0.458
Error = 0.542
Step 2: BACKWARD PROPAGATION
For the output layer unit k:
Error4 = O4(1 - O4)(Odesired - O4)
       = 0.458(1 - 0.458)(1 - 0.458)
       = 0.134
For each hidden layer unit j:
Error3 = O3(1 - O3) * Error4 * W34
       = 0.450(1 - 0.450)(0.134)(0.3)
       = 0.0099
Update the weights and biases (learning rate η = 0.4):
W34 = W34 + η * Error4 * O3 = 0.3 + 0.4(0.134)(0.450) = 0.324
Θ4 = Θ4 + η * Error4 = -0.3 + 0.4(0.134) = -0.246
W23 = W23 + η * Error3 * O2 and Θ3 = Θ3 + η * Error3, giving the values -0.396 and 0.203 used in
Iteration 2; W13 is unchanged since X1 = 0.
ITERATION 2:
Step 1: FORWARD PROPAGATION
Unit j   Net Input Ij                                  Net Output Oj
X3       I3 = X1*W13 + X2*W23 + X0*Θ3                  O3 = 1 / (1 + e^(-I3))
         = 0(0.1) + 1(-0.396) + 1(0.203) = -0.193      = 1 / (1 + e^(0.193)) = 0.451
Unit k   Net Input Ik                                  Net Output Ok
X4       I4 = O3*W34 + X0*Θ4                           O4 = 1 / (1 + e^(-I4))
         = (0.451 * 0.324) + 1(-0.246) = -0.099        = 1 / (1 + e^(0.099)) = 0.475
2. Calculate Error
Error = Odesired - Oestimated
      = 1 - 0.475
Error = 0.525

Iteration   Error
1           0.542
2           0.525
Difference = 0.542 - 0.525 = 0.017
In iteration 2 the error is reduced to 0.525. This process will continue until the desired output
is achieved.
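A minimal Python sketch of the same forward/backward computation (sigmoid units, one hidden unit X3 feeding one output unit X4, learning rate 0.4). The initial values W13 = 0.1, W34 = 0.3 and Θ4 = -0.3 appear in the worked example above; W23 = -0.4 and Θ3 = 0.2 are assumed starting values for illustration.

import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# W13, W34 and theta4 are taken from the worked example;
# W23 = -0.4 and theta3 = 0.2 are assumed starting values.
W13, W23, theta3 = 0.1, -0.4, 0.2
W34, theta4 = 0.3, -0.3
eta = 0.4                      # learning rate

x1, x2, target = 0, 1, 1       # training sample (X1=0, X2=1, NAND output = 1)

for iteration in range(2):
    # Forward propagation
    I3 = x1 * W13 + x2 * W23 + theta3
    O3 = sigmoid(I3)
    I4 = O3 * W34 + theta4
    O4 = sigmoid(I4)
    print(f"Iteration {iteration + 1}: output = {O4:.3f}, error = {target - O4:.3f}")

    # Backward propagation (error terms as in the tables above)
    err4 = O4 * (1 - O4) * (target - O4)
    err3 = O3 * (1 - O3) * err4 * W34

    # Weight and bias updates
    W34 += eta * err4 * O3
    theta4 += eta * err4
    W13 += eta * err3 * x1
    W23 += eta * err3 * x2
    theta3 += eta * err3

Running this sketch reproduces the outputs of the two iterations above (approximately 0.458 and 0.475).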
How does a Multi-Layer Perceptron solve the XOR problem? Design an MLP with back
propagation to implement the XOR Boolean function.
Solution:
X1  X2  Y
0   0   0
0   1   1
1   0   1
1   1   0
Fig: MLP with inputs X1, X2, bias input X0 = 1, hidden units X3 and X4, and output unit X5. Initial
weights: W13 = -0.2, W14 = 0.4, W23 = 0.2, W24 = -0.3, W35 = 0.2, W45 = -0.3; biases Θ3 = 0.4,
Θ4 = 0.1, Θ5 = -0.3. Learning rate α = 0.8.
Iteration 1:
Step 1: FORWARD PROPAGATION
For the training sample X1 = 1, X2 = 0 (target output 1), forward propagation with the initial weights
gives O3 = 0.549, O4 = 0.622 and O5 = 0.407, which are used in the error calculations below.
Table 11: Error Calculation for each unit in the Output layer and Hidden layer
For the Output Layer, Errork
Unit k
X5   Error5 = O5(1 - O5)(Odesired - O5)
            = 0.407 * (1 - 0.407) * (1 - 0.407)
            = 0.143
For the Hidden Layer, Errorj
Unit j
X4   Error4 = O4(1 - O4) Σk Errork Wjk = O4(1 - O4) Error5 W45
            = 0.622 * (1 - 0.622) * (-0.3) * 0.143
            = -0.010
X3   Error3 = O3(1 - O3) Σk Errork Wjk = O3(1 - O3) Error5 W35
            = 0.549 * (1 - 0.549) * 0.143 * 0.2
            = 0.007
Table 12: Weight Updation (α = 0.8)
Wij    Wij = Wij + α * Errorj * Oi             New Weight
W23    W23 = W23 + 0.8 * Error3 * O2           0.2
       = 0.2 + 0.8 * 0.007 * 0
W24    W24 = W24 + 0.8 * Error4 * O2           -0.3
       = -0.3 + 0.8 * (-0.010) * 0
W35    W35 = W35 + 0.8 * Error5 * O3           0.154
Δθj = α * Errorj
θj = θj + Δθj
Table 13: Bias Updation
θj    θj = θj + α * Errorj           New Bias
θ3    θ3 = θ3 + α * Error3           0.405
      = 0.4 + 0.8 * 0.007
θ4    θ4 = θ4 + α * Error4           0.092
      = 0.1 + 0.8 * (-0.01)
θ5    θ5 = θ5 + α * Error5           -0.185
      = -0.3 + 0.8 * 0.143
Iteration 2
Now with the updated weights and biases:
1. Calculate Input and Output in the Input Layer shown in Table 14.
Table 14: Net Input and Output Calculation
Input Layer   Ij   Oj
X1            1    1
X2            0    0
2. Calculate Net Input and Output in the Hidden Layer and Output Layer shown in Table 15.
Table 15: Net Input and Output Calculation in the Hidden Layer and Output Layer
Unit j   Net Input Ij                                          Output Oj
X3       I3 = X1*W13 + X2*W23 + X0*θ3                          O3 = 1 / (1 + e^(-I3)) = 1 / (1 + e^(-0.211)) = 0.552
         = 1*(-0.194) + 0*(0.2) + 1*(0.405) = 0.211
X4       I4 = X1*W14 + X2*W24 + X0*θ4                          O4 = 1 / (1 + e^(-I4)) = 1 / (1 + e^(-0.484)) = 0.618
         = 1*(0.392) + 0*(-0.3) + 1*(0.092) = 0.484
X5       I5 = O3*W35 + O4*W45 + X0*θ5                          O5 = 1 / (1 + e^(-I5)) = 1 / (1 + e^(0.282)) = 0.429
         = 0.552*(0.154) + 0.618*(-0.288) + 1*(-0.185) = -0.282
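The Table 15 forward pass can be reproduced with a few lines of Python. The weight and bias values are taken from the tables above; W45 = -0.288 is the updated value used in Table 15 (its update row is not shown above).

import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Updated weights and biases after Iteration 1 (from Tables 12, 13 and 15)
W13, W14, W23, W24 = -0.194, 0.392, 0.2, -0.3
W35, W45 = 0.154, -0.288
t3, t4, t5 = 0.405, 0.092, -0.185

x1, x2 = 1, 0                                  # training sample

O3 = sigmoid(x1 * W13 + x2 * W23 + t3)         # -> ~0.553
O4 = sigmoid(x1 * W14 + x2 * W24 + t4)         # -> ~0.619
O5 = sigmoid(O3 * W35 + O4 * W45 + t5)         # -> ~0.43 (Table 15 gives 0.429)
print(round(O3, 3), round(O4, 3), round(O5, 3))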
Consider a network architecture with 4 input units and 2 output units. Consider four training
samples, each a vector of length 4.
Training samples:
i1: (1, 1, 1, 0)
i2: (0, 0, 1, 1)
i3: (1, 0, 0, 1)
i4: (0, 0, 1, 0)
Output Units: Unit 1, Unit 2
Learning rate η(t) = 0.6
Initial Weight matrix:
Unit 1: [0.2  0.8  0.5  0.1]
Unit 2: [0.3  0.5  0.4  0.6]
Identify an algorithm that can learn without supervision. How would you cluster the samples as
expected?
Solution:
Use Self Organizing Feature Map (SOFM).
Iteration 1:
Training Sample X1: (1, 1, 1, 0)
Weight matrix:
Unit 1: [0.2  0.8  0.5  0.1]
Unit 2: [0.3  0.5  0.4  0.6]
Unit 1 wins.
Update the weights of the winning unit:
New Unit 1 weights = [0.2 0.8 0.5 0.2] + 0.6 ([1 1 1 0] - [0.2 0.8 0.5 0.2])
                   = [0.2 0.8 0.5 0.2] + 0.6 [0.8 0.2 0.5 -0.2]
                   = [0.2 0.8 0.5 0.2] + [0.48 0.12 0.30 -0.12]
                   = [0.68 0.92 0.80 0.08]
Iteration 2:
Training Sample X2: (0, 0, 1, 1)
Unit 2 wins. Update the weights of the winning unit:
New Unit 2 weights = [0.3 0.5 0.4 0.6] + 0.6 ([0 0 1 1] - [0.3 0.5 0.4 0.6])
                   = [0.3 0.5 0.4 0.6] + [-0.18 -0.30 0.36 0.24]
                   = [0.12 0.2 0.76 0.84]
Iteration 4:
Training Sample X4: (0, 0, 1, 0)
Weight matrix:
Unit 1: [0.68  0.92  0.80  0.08]
Unit 2: [0.65  0.08  0.30  0.94]
Compute the Euclidean distance between X4: (0, 0, 1, 0) and Unit 1 weights:
d1 = (0.68 - 0)^2 + (0.92 - 0)^2 + (0.80 - 1)^2 + (0.08 - 0)^2
   = 1.36
Compute the Euclidean distance between X4: (0, 0, 1, 0) and Unit 2 weights:
d2 = (0.65 - 0)^2 + (0.08 - 0)^2 + (0.3 - 1)^2 + (0.94 - 0)^2
   = 1.8025
Since d1 < d2, Unit 1 wins.
This process is continued for many epochs until the feature map no longer changes.
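A minimal sketch of this winner-take-all (SOFM-style) update in Python, using the training samples and initial weights from the problem statement; a single pass over the data is shown and neighbourhood effects are omitted for simplicity.

import numpy as np

# Training samples and initial weights from the problem statement
samples = np.array([[1, 1, 1, 0],
                    [0, 0, 1, 1],
                    [1, 0, 0, 1],
                    [0, 0, 1, 0]], dtype=float)
weights = np.array([[0.2, 0.8, 0.5, 0.1],    # Unit 1
                    [0.3, 0.5, 0.4, 0.6]])   # Unit 2
eta = 0.6                                    # learning rate

for x in samples:
    # Winner = unit whose weight vector is closest to the input (squared Euclidean distance)
    distances = np.sum((weights - x) ** 2, axis=1)
    winner = np.argmin(distances)
    # Move only the winning unit's weights towards the input
    weights[winner] += eta * (x - weights[winner])
    print(f"sample {x} -> Unit {winner + 1} wins, new weights {np.round(weights[winner], 2)}")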
Learning Rules
Learning in a NN is performed by adjusting the network weights in order to minimize the
difference between the desired and estimated output.
• The delta (difference) is measured as an error function, also called a cost function.
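One simple illustration of this idea is the delta (Widrow-Hoff) rule, where each weight is adjusted in proportion to the error between the desired and estimated output. A minimal sketch; the learning rate, inputs, weights and outputs below are illustrative assumptions.

# Delta rule: w_i <- w_i + eta * (t - o) * x_i
# eta, x, w, t and o below are illustrative values, not from the notes.
eta = 0.1
x = [1.0, 0.5]          # inputs
w = [0.2, -0.3]         # current weights
t, o = 1.0, 0.4         # desired output t and estimated output o

w = [wi + eta * (t - o) * xi for wi, xi in zip(w, x)]
print(w)                # -> approximately [0.26, -0.27]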
TYPES OF ANN
1. Feed Forward Neural Network
2. Fully connected Neural Network
3. Multilayer Perceptron
4. Feedback Neural Network
Feed Forward Neural Network:
A feed-forward neural network in its simplest form is a single-layer perceptron. A sequence of inputs enters the
layer and is multiplied by the weights in this model. The weighted input values are then summed together to form a total.
If the sum of the values is above a predetermined threshold, which is normally set at zero, the output
value is usually 1, and if the sum is below the threshold, the output value is usually -1.
The single-layer perceptron is a popular feed-forward neural network model that is frequently used for
classification.
The model may or may not contain a hidden layer, and there is no backpropagation.
Based on the number of hidden layers, feed-forward networks are further classified into single-layered
and multi-layered feed-forward networks.
Fully Connected Neural Network:
• A fully connected neural network consists of a series of fully connected layers that connect
every neuron in one layer to every neuron in the next layer.
• The major advantage of fully connected networks is that they are "structure agnostic", i.e. no
special assumptions need to be made about the input.
Multilayer Perceptron:
A multi-layer perceptron has one input layer with one neuron (or node) for each input, one output
layer with a single node for each output, and it can have any number of hidden layers, where each
hidden layer can have any number of nodes.
Information flows forward through the network, while the error signals used for weight adjustment
are propagated backward during training (backpropagation).
Every node in the multi-layer perceptron uses a sigmoid activation function. The sigmoid activation
function takes real values as input and converts them to numbers between 0 and 1 using the sigmoid
formula f(x) = 1 / (1 + e^(-x)).
Feedback Neural Network:
It allows feedback loops in the network. Feedback networks are dynamic and powerful in nature.
Limitations of ANN
Challenges of Artificial Neural Networks