
ARTIFICIAL NEURAL NETWORK

Dipesh Koirala
INTRODUCTION TO DEEP LEARNING

▪ Deep learning is a subset of machine learning that uses neural networks to model complex patterns in data.

▪ A neural network is a computational model inspired by the structure of the human brain.
INTRODUCTION TO DEEP LEARNING

▪ A neuron is a cell in the brain whose principal function is the collection, processing, and dissemination of electrical signals.

▪ The brain's information-processing capacity comes from networks of such neurons.
NEURAL NETWORK

▪ An ANN is an information processing paradigm inspired by the way biological nervous systems, such as the brain, process information.

▪ It is composed of a large number of highly interconnected processing elements (neurons) working in unison to solve specific problems.

▪ ANNs, like people, learn by example.

▪ They have revolutionized fields like computer vision, natural language processing (NLP), speech recognition, and more.
NEURAL NETWORK

Units of a Neural Network

Nodes (units): Nodes represent the cells of a neural network.

Links: Links are directed arrows that show the propagation of information from one node to another.

Activation: Activations are the inputs to or outputs from a unit.

Weight: Each link has a weight associated with it, which determines the strength and sign of the connection.

Activation function: A function used to derive the output activation from the input activations to a given unit.

Bias weight: The bias weight is used to set the threshold for a unit.
NEURAL NETWORK

1 if x  c
 ( x) = 
0 if x  c
Example: Consider the following neuron and compute its output, assuming the activation function F(x) = 1 if x > 5 and F(x) = 0 otherwise.

Inputs: x1 = 2, x2 = 1, x3 = 2; weights: w1 = 1.5, w2 = 2, w3 = 0.5; bias: b = 1

u = x1·w1 + x2·w2 + x3·w3 = 2 × 1.5 + 1 × 2 + 2 × 0.5 = 6

v = u + b = 6 + 1 = 7

y = F(v) = 1 (since 7 > 5)
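As a quick check, here is a minimal Python sketch of this computation (the function names are illustrative):

def step(x, threshold=5):
    # Hard-limiter activation: 1 if x > threshold, else 0
    return 1 if x > threshold else 0

def neuron(inputs, weights, bias):
    u = sum(x * w for x, w in zip(inputs, weights))  # weighted sum of inputs
    v = u + bias                                     # add the bias
    return step(v)

print(neuron([2, 1, 2], [1.5, 2, 0.5], bias=1))      # prints 1, since v = 7 > 5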
NEURAL NETWORK

Other Activation Functions

▪ Sigmoid function
▪ Tanh function
▪ ReLU function
▪ Softmax function
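For reference, here are minimal NumPy sketches of the functions listed above (the implementations are illustrative):

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))       # squashes values into (0, 1)

def tanh(x):
    return np.tanh(x)                 # squashes values into (-1, 1)

def relu(x):
    return np.maximum(0, x)           # zero for negatives, identity otherwise

def softmax(x):
    e = np.exp(x - np.max(x))         # shift by the max for numerical stability
    return e / e.sum()                # normalizes a vector to probabilities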
STRUCTURE OF NEURAL NETWORK

Single Layer Network

▪ It is the simplest form of network architecture.
▪ In this architecture, an input layer of source nodes is connected directly to an output layer of neurons.
STRUCTURE OF NEURAL NETWORK

Multi Layer Neural Network

▪ In this type of network architecture, one or more hidden layers are present between the input and output layers.
▪ These layers are not directly visible, and information flows only in the direction from the input layer to the output layer.
STRUCTURE OF NEURAL NETWORK

Example: Consider the following neural network and compute its output using the activation function f(x) = 2x − 1. The weights of the synaptic links are given above each link (the figure is omitted here; the inputs are x1 = 2 and x2 = 3).

For Node 1
u1 = 2 × 0.8 + 3 × 1 = 4.6  =>  y1 = f(u1) = 2 × 4.6 − 1 = 8.2
For Node 2
u2 = 2 × 0.4 + 3 × 0.6 = 2.6  =>  y2 = f(u2) = 2 × 2.6 − 1 = 4.2
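The same computation can be written as a matrix product. A minimal sketch, assuming the inputs and weights from the example above:

import numpy as np

f = lambda u: 2 * u - 1              # activation function from the example
x = np.array([2.0, 3.0])             # inputs x1, x2
W = np.array([[0.8, 1.0],            # weights into node 1
              [0.4, 0.6]])           # weights into node 2
u = W @ x                            # net inputs: [4.6, 2.6]
print(f(u))                          # outputs: [8.2, 4.2]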
LEARNING IN NEURAL NETWORKS

▪ Learning in neural networks is carried out by adjusting the connection weights among neurons.

▪ There is no algorithm that determines how the weights should be assigned in order to solve specific problems.

▪ Hence, the weights are determined by a learning process.
LEARNING IN NEURAL NETWORKS

Perceptron Learning

▪ The term "perceptron" was coined by Frank Rosenblatt in 1962.

▪ The perceptron is the simplest form of a neural network, used for classifying linearly separable patterns.

▪ Basically, a perceptron consists of a single neuron with adjustable synaptic weights and bias.
LEARNING IN NEURAL NETWORKS

Perceptron Learning Algorithm

1. Initialize all weights and the bias to zero.
2. For each training vector s and target t, perform steps 3 to 6.
3. Set xi = si for i = 1 to n.
4. Compute the output using the hard-limiter activation function:

   y_in = b + Σ_{i=1..n} wi·xi,   y = f(y_in)

5. Adapt the weights: wi = wi + α·(t − y)·xi for i = 1 to n.
6. Adapt the bias: b = b + α·(t − y).
7. Test for the stopping criterion.
LEARNING IN NEURAL NETWORKS

Perceptron Training Example

Train the following perceptron (a single neuron with inputs x1, x2, weights w1, w2, and bias b) using the given training set:

x1   x2   t
 1    1    1
 1   -1   -1
-1    1   -1
-1   -1   -1
LEARNING IN NEURAL NETWORKS

Solution: α = 1
Epoch #1

Assumed hard-limiter activation function:
φ(x) = 1 if x > 0
φ(x) = 0 if x = 0
φ(x) = −1 if x < 0
LEARNING IN NEURAL NETWORKS

Epoch #2

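For reference, here is a minimal Python sketch of the training loop above applied to this training set, with α = 1 and the bipolar hard limiter (variable names are illustrative):

def hard_limiter(x):
    return 1 if x > 0 else (-1 if x < 0 else 0)

data = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]
w = [0.0, 0.0]          # weights initialized to zero (step 1)
b = 0.0                 # bias initialized to zero
alpha = 1.0

for epoch in range(2):
    for x, t in data:
        y_in = b + sum(wi * xi for wi, xi in zip(w, x))          # step 4
        y = hard_limiter(y_in)
        w = [wi + alpha * (t - y) * xi for wi, xi in zip(w, x)]  # step 5
        b = b + alpha * (t - y)                                  # step 6
    print(f"epoch {epoch + 1}: w = {w}, b = {b}")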
BACKPROPAGATION ALGORITHM

• A method for training multilayer perceptrons.

i. Forward Phase:
- In this phase, the synaptic weights of the network are fixed and the input signal is propagated through the network, layer by layer, until it reaches the output.
- Thus, in this phase, changes are confined to the activation potentials and outputs of the neurons in the network.

ii. Backward Phase:
- In this phase, an error signal is produced by comparing the output of the network with the desired response.
- The resulting error signal is propagated in the backward direction through the network, again layer by layer. In this second phase, successive adjustments are made to the synaptic weights of the network.
BACKPROPAGATION ALGORITHM

▪ Reference diagram of the Backpropagation algorithm (figure omitted)
BACKPROPAGATION ALGORITHM

Algorithm (taking logistic as the activation function)

1. Initialize all weights and biases in the network.
2. Repeat while the terminating condition is not satisfied:
3. for each training tuple x in D  // Forward Pass
   • for each input-layer unit j:
     yj = xj  // the output of an input unit is its actual input value
   • for each hidden- or output-layer unit j:
     compute the net input of unit j:
       zj = bj + Σi wji·yi
     compute the output of unit j:
       yj = 1 / (1 + e^(−zj))
BACKPROPAGATION ALGORITHM

Algorithm (taking logistic as the activation function)

4. for each unit j in the output layer, compute the error gradient as (Backward Pass, steps 4–6):
   δj = yj·(1 − yj)·(tj − yj)
5. for each unit j in the hidden layers, compute the error gradient as:
   δj = yj·(1 − yj)·Σk δk·wjk
6. for each weight wij in the network, update the weight as:
   wij = wij + α·δj·yi
7. for each bias bj in the network, update the bias as:
   bj = bj + α·δj
BACKPROPAGATION ALGORITHM

Algorithm (for other activation functions)

φ − activation function;  φ′ − derivative of the activation function

4. for each unit j in the output layer, compute the error gradient as:
   δj = φ′j·(tj − yj)

5. for each unit j in the hidden layers, compute the error gradient as:
   δj = φ′j·Σk δk·wjk
BACKPROPAGATION ALGORITHM

Example
▪ Consider the MLP given below. Let the learning rate be 1. The initial weights of the network are given in the table below. Assume that the first training tuple is (1, 0, 1) and its target output is 1. Calculate the weight updates using the back-propagation algorithm.

Assume φ(x) = 1 / (1 + e^(−x))

w14   w15   w24   w25   w34   w35   w46   w47   w56   w57   w68   w78
0.6   0.4   0.2   -0.3  0.7   -0.6  0.4   0.7   0.1   0.8   0.2   0.5
BACKPROPAGATION ALGORITHM

Solution
Forward Pass

z4 = 1 × 0.6 + 0 × 0.2 + 1 × 0.7 = 1.3          y4 = 1/(1 + e^(−1.3)) = 0.786

z5 = 1 × 0.4 + 0 × (−0.3) + 1 × (−0.6) = −0.2   y5 = 1/(1 + e^(0.2)) = 0.45

z6 = 0.786 × 0.4 + 0.45 × 0.1 = 0.36            y6 = 1/(1 + e^(−0.36)) = 0.59

z7 = 0.786 × 0.7 + 0.45 × 0.8 = 0.91            y7 = 1/(1 + e^(−0.91)) = 0.71

z8 = 0.59 × 0.2 + 0.71 × 0.5 = 0.47             y8 = 1/(1 + e^(−0.47)) = 0.61
BACKPROPAGATION ALGORITHM

Solution
Backward Pass

δ8 = y8·(1 − y8)·(t − y8) = 0.61 × (1 − 0.61) × (1 − 0.61) = 0.093

δ7 = y7·(1 − y7)·δ8·w78 = 0.71 × (1 − 0.71) × 0.093 × 0.5 = 0.0096

δ6 = y6·(1 − y6)·δ8·w68 = 0.59 × (1 − 0.59) × 0.093 × 0.2 = 0.0045

δ5 = y5·(1 − y5)·(δ7·w57 + δ6·w56)
   = 0.45 × (1 − 0.45) × (0.0096 × 0.8 + 0.0045 × 0.1) = 0.002

δ4 = y4·(1 − y4)·(δ7·w47 + δ6·w46)
   = 0.786 × (1 − 0.786) × (0.0096 × 0.7 + 0.0045 × 0.4) = 0.0014
BACKPROPAGATION ALGORITHM

Solution
Update Weights

w14 = w14 + 1·δ4·y1 = 0.6 + 1 × 0.0014 × 1 = ?

w15 = w15 + 1·δ5·y1 = ?      w24 = w24 + 1·δ4·y2 = ?
w25 = w25 + 1·δ5·y2 = ?      w34 = w34 + 1·δ4·y3 = ?
w35 = w35 + 1·δ5·y3 = ?      w46 = w46 + 1·δ6·y4 = ?
w47 = w47 + 1·δ7·y4 = ?      w56 = w56 + 1·δ6·y5 = ?
w57 = w57 + 1·δ7·y5 = ?      w68 = w68 + 1·δ8·y7 = ?
w78 = w78 + 1·δ8·y7 = ?
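As a sanity check, here is a minimal NumPy sketch of one forward and backward pass for this network (nodes 1–3 are inputs, 4–5 and 6–7 are hidden units, 8 is the output; the matrix layout is an assumption made for illustration, and biases are omitted to match the worked numbers):

import numpy as np

sigmoid = lambda z: 1 / (1 + np.exp(-z))

x = np.array([1.0, 0.0, 1.0]); t = 1.0
W1 = np.array([[0.6, 0.2, 0.7],     # weights into node 4 (w14, w24, w34)
               [0.4, -0.3, -0.6]])  # weights into node 5
W2 = np.array([[0.4, 0.1],          # weights into node 6 (w46, w56)
               [0.7, 0.8]])         # weights into node 7
W3 = np.array([[0.2, 0.5]])         # weights into node 8 (w68, w78)

# Forward pass
y1 = sigmoid(W1 @ x)     # [y4, y5] ≈ [0.786, 0.450]
y2 = sigmoid(W2 @ y1)    # [y6, y7] ≈ [0.59, 0.71]
y3 = sigmoid(W3 @ y2)    # [y8] ≈ [0.61]

# Backward pass: δ = y(1 − y) × (error arriving at the unit)
d3 = y3 * (1 - y3) * (t - y3)       # δ8 ≈ 0.093
d2 = y2 * (1 - y2) * (W3.T @ d3)    # [δ6, δ7]
d1 = y1 * (1 - y1) * (W2.T @ d2)    # [δ4, δ5]

# Weight updates with learning rate α = 1: w = w + α·δ·y_input
alpha = 1.0
W3 += alpha * np.outer(d3, y2)
W2 += alpha * np.outer(d2, y1)
W1 += alpha * np.outer(d1, x)
print(W1, W2, W3, sep="\n")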
EXTRA MATERIALS

▪ Gradient Descent

GRADIENT DESCENT

▪ An iterative approach for finding the minimum of a function.

▪ Gradient descent is an optimization algorithm used to minimize a function (typically a convex one) by iteratively moving in the direction of steepest descent, as defined by the negative of the gradient.

▪ In machine learning, gradient descent is used to update the parameters or weights of the model.
GRADIENT DESCENT

▪ Said more mathematically, the gradient is the vector of partial derivatives of a function with respect to its inputs.

▪ The higher the gradient, the steeper the slope and the faster a model can learn. But if the slope is zero, the model stops learning.

▪ The size of the steps that gradient descent takes toward the local minimum is determined by the learning rate.
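For illustration, here is a minimal sketch of gradient descent on the toy function f(x) = x², whose gradient is 2x; the learning rate lr scales each step (names are illustrative):

def gradient_descent(grad, x0, lr=0.1, steps=50):
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)    # step in the direction of the negative gradient
    return x

print(gradient_descent(lambda x: 2 * x, x0=5.0))   # approaches the minimum at x = 0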
LINEAR REGRESSION

y = w0 + w1·x

▪ Let us suppose that {(x1, y1), (x2, y2), …, (xn, yn)} are the given data points. The loss function for the n data points is given by:

L = (1/2n) · Σ_{i=1..n} ei²

L = (1/2n) · Σ_{i=1..n} (yi − w0 − w1·xi)²
LINEAR REGRESSION

▪ Now, the coefficients or weights can be determined or updated using the gradient descent method as below:

w0 = w0 − α·∂L/∂w0 = w0 + (α/n) · Σ_{i=1..n} (yi − w0 − w1·xi)

w1 = w1 − α·∂L/∂w1 = w1 + (α/n) · Σ_{i=1..n} (yi − w0 − w1·xi)·xi
LINEAR REGRESSION

▪ E.g., Fit a straight line through the following data using SGD. Show one epoch of training. (The data table is omitted here; the first two points, used below, are (1, 3) and (2, 5).)

Solution
The general form of the linear regression equation is: y = w0 + w1·x
Let us assume that the initial values of the parameters are:
w0 = w1 = 0
LINEAR REGRESSION

Iteration 1: x = 1, y = f(x) = 3, α = 0.01

w0 = w0 + α·(y − w0 − w1·x) = 0 + 0.01 × 3 = 0.03
w1 = w1 + α·(y − w0 − w1·x)·x = 0 + 0.01 × 3 × 1 = 0.03

Iteration 2: x = 2, y = f(x) = 5

w0 = w0 + α·(y − w0 − w1·x) = ?
w1 = w1 + α·(y − w0 − w1·x)·x = ?

In the same way, perform iterations 3 and 4.
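Here is a minimal sketch of this SGD loop, using only the two data points recoverable from the example; the remaining points would extend the data list:

data = [(1, 3), (2, 5)]   # (x, y) pairs; add the remaining points to complete the epoch
w0, w1 = 0.0, 0.0         # initial parameters
alpha = 0.01              # learning rate

for x, y in data:         # one epoch = one pass over the data
    error = y - (w0 + w1 * x)     # residual for this sample
    w0 += alpha * error           # update the intercept
    w1 += alpha * error * x       # update the slope
    print(f"x={x}: w0={w0:.4f}, w1={w1:.4f}")
# The first iteration gives w0 = w1 = 0.03, matching the slide.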
THANK YOU
