Lecture 7 - Neural Networks

Notes on Neural Networks

Uploaded by

Nathaniel Adika

Lecture 7: Neural Networks

CPEN 405: Artificial Intelligence

Instructor: Gifty Osei


Outline

• Biological Neuron

• Artificial neural network

• Artificial neuron architecture

• Activation functions

• Learning in neural networks

• Neural network models/architectures


The Biological Neural System
• The biological nervous system receives information about the
environment around us (sensation) and generates responses to
that information (motor responses).

• The biological nervous system can be structured as a 3-stage
system:

• Receptor system
Converts stimuli from the human body or the external environment into
electrical pulses which convey the intended information to the brain
(neural network)

• Brain system (biological neural network)
Serves as the control center and is represented by the nerve (neural
network), which continuously receives information, processes the
information received, and makes an appropriate decision

• Effector system
Converts the electrical impulses generated by the neural network
(output of the brain) into discernible responses as system output
The Biological Neuron

• The human nervous system is made up of cells called neurons

• Neurons of the nervous system are specialized to pass or carry
signals (messages) via an electrochemical process.

• A biological neuron has three main components:
dendrites, soma (or cell body), and axon.
– Dendrites receive signals from other neurons.

– The soma sums the incoming signals. When sufficient
input is received, the cell fires; that is, it transmits a signal
over its axon to other cells.
The Biological Neuron
• The synapses (synaptic terminals) are the points of contact
between the axon of one neuron and the dendrites of another
neuron, regulating the chemical connection whose strength
affects the input to the receiving neuron
The Neural Network
• A neuron is a brain cell in the biological neural network.

• An artificial neuron is a simple processing unit that
approximates a biological neuron. It may be a physical
device or a mathematical construction.

• A neural network is a coordinated system with neurons as
the basic elements, connected in a specific hierarchical
structure (or model or architecture) to perform a particular
task.
The Artificial Neural Network

• An artificial neural network (ANN) is a massively parallel
distributed processor system that consists of many simple
processing elements (neurons) with a natural ability to store
knowledge gained from experience and make it available for use.

• ANNs are statistical data modeling or decision tools used to
model complex relationships between inputs and outputs or
to find inherent patterns in data.

• ANN architecture is a layered architecture (single or
multiple layers) of neurons that are interconnected in
feedforward or feedback loops. This gives rise to
different models.
The Artificial Neural Network

• An ANN resembles the human brain in two respects:

• Knowledge is acquired by the network from its environment
through a learning procedure (called a learning
algorithm).

• Synaptic connection or link strengths are used to store
the acquired knowledge.

• Learning is achieved by modifying the connection
link strengths of the neurons, using an update rule derived via
optimization.
The Artificial Neuron

Input: x1, x2, …, xm

Processing: ∑ = x1 + x2 + … + xm = y

Output: y
The Artificial Neuron

Input: x1, x2, …, xm

Weights: w1, w2, …, wm

Processing: ∑ = x1w1 + x2w2 + … + xmwm = y

Output: y

Not all inputs are equal
The Artificial Neuron

Input: x1, x2, …, xm

Weights: w1, w2, …, wm

Processing: ∑, followed by the transfer function (activation function)

Output: y

The signal is not passed down to the next neuron verbatim
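
The three neuron slides above can be put together in a few lines of Python. This is only a minimal sketch: the function names `neuron_output` and `unit_step` and the numeric values are illustrative assumptions, not part of the lecture.

```python
def neuron_output(inputs, weights, activation):
    """Weighted sum of the inputs, passed through an activation function."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    net = sum(x * w for x, w in zip(inputs, weights))
    return activation(net)


def unit_step(net):
    """Hard-limit activation (assumed convention: 1 when net >= 0, else 0)."""
    return 1 if net >= 0 else 0


# Illustrative values only; they do not come from the lecture.
x = [1.0, 0.5, -1.0]
w = [0.2, 0.4, 0.1]
y = neuron_output(x, w, unit_step)  # net = 0.2 + 0.2 - 0.1 = 0.3, so y = 1
```

Swapping `unit_step` for another function changes how the summed signal is passed on, which is the point of the last slide: the neuron does not forward the raw sum.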


Artificial Neuron Architecture

• A single-neuron architectural model with multiple inputs and a
single output (McCulloch-Pitts neuron model)
Neuron Activation Functions

• Various forms of activation functions are used for neurons, but
the most used functions for neural network models are:

• Unit step (threshold or hard limit) function
• Sign (signum) function
• Piecewise linear function
• Linear (pure linear) function
• Logistic sigmoid function
• Hyperbolic tangent (tanh) function
• ReLU (Rectified Linear Unit) function

• The choice of activation function depends on the problem
statement and the form of output desired by the designer
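
The listed functions can be written down directly. The sketch below uses common textbook definitions; the values returned exactly at zero for the step and sign functions, and the particular saturating form chosen for the piecewise linear function, are assumptions, since conventions differ between texts.

```python
import math


def unit_step(x):
    """Threshold / hard limit (assumed: 1 at x == 0)."""
    return 1.0 if x >= 0 else 0.0


def sign(x):
    """Signum (assumed: 0 at x == 0)."""
    return 1.0 if x > 0 else (-1.0 if x < 0 else 0.0)


def piecewise_linear(x):
    """One common saturating form: 0 below -0.5, 1 above 0.5, linear between."""
    return min(1.0, max(0.0, x + 0.5))


def linear(x):
    """Pure linear: output equals the net input."""
    return x


def logistic(x):
    """Logistic sigmoid, written to avoid overflow for large negative x."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def tanh(x):
    """Hyperbolic tangent, output in (-1, 1)."""
    return math.tanh(x)


def relu(x):
    """Rectified Linear Unit: passes positive values, clips negatives to 0."""
    return max(0.0, x)
```

The step and sign functions suit binary or bipolar decisions, while the sigmoid, tanh, and ReLU are differentiable (almost everywhere, in the case of ReLU) and so fit gradient-based learning.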
Neuron Activation Functions
Learning Rules for Neural Networks
Learning Function
Perceptron network learning example 1
Perceptron network learning solution 1
x1   x2   T
 1    1    1
 1   -1    1
-1    1    1
-1   -1   -1
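
The worked steps of this solution did not survive in the notes, so the following is only a sketch of how discrete perceptron learning could run on the table above. The zero initial weights and bias, the learning rate of 1, and the bipolar threshold at net >= 0 are all assumptions, not values from the lecture.

```python
# Training table: ((x1, x2), T), read as two inputs and one target.
SAMPLES = [((1, 1), 1), ((1, -1), 1), ((-1, 1), 1), ((-1, -1), -1)]


def predict(w, b, x):
    """Bipolar hard-limit output (assumed threshold: net >= 0 gives +1)."""
    net = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if net >= 0 else -1


def train_perceptron(samples, eta=1.0, max_epochs=100):
    """Discrete perceptron rule: w <- w + eta * (t - y) * x on each mistake."""
    w = [0.0, 0.0]  # assumed initial weights
    b = 0.0         # assumed initial bias
    for epoch in range(1, max_epochs + 1):
        mistakes = 0
        for x, t in samples:
            y = predict(w, b, x)
            if y != t:
                delta = eta * (t - y)
                w = [wi + delta * xi for wi, xi in zip(w, x)]
                b += delta
                mistakes += 1
        if mistakes == 0:  # an epoch with no mistakes means training is done
            return w, b, epoch
    return w, b, max_epochs


w, b, epochs = train_perceptron(SAMPLES)
```

With different initial values or a different learning rate the intermediate weights change, so these numbers should not be read as the lecture's own solution.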
Perceptron network learning solution 1

• Now on your own, go through the same exercise and find the
weights and bias at the end of one epoch, but using the
following conditions:

• Sigmoid activation function
• Continuous (delta) perceptron learning rule
Neural Network Models

• Neural networks primarily come in two forms based on the
activation or propagation function of the network:

• Linear networks
• Linearity from input to output makes the network linear
regardless of the number of neurons or layers in the network.

• The activation function is linear with respect to the weights

• The output of a linear network is linearly dependent on the
weights and the biases of the connecting layers.

• Nonlinear networks
• Networks whose activation function performs a nonlinear
operation, which changes how the weights of the network affect
the output.

• The activation function is nonlinear with respect to the weights
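
The claim that a linear network stays linear regardless of the number of layers can be checked numerically. In the sketch below, two linear layers are collapsed into a single equivalent layer; the weight values and the indexing convention (W[i][j] connects input i to neuron j) are illustrative assumptions.

```python
def layer(x, W, b):
    """Linear layer: out[j] = sum_i x[i] * W[i][j] + b[j]."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
            for j in range(len(b))]


# Illustrative parameters: 2 inputs -> 3 hidden neurons -> 2 outputs.
W1 = [[1.0, 0.0, 3.0],
      [2.0, -1.0, 1.0]]
b1 = [0.5, -1.0, 2.0]
W2 = [[1.0, 2.0],
      [0.0, 1.0],
      [-2.0, 1.0]]
b2 = [1.0, 0.0]


def two_layer(x):
    """Two stacked layers, both with linear activation f(x) = x."""
    return layer(layer(x, W1, b1), W2, b2)


# Collapse: (x W1 + b1) W2 + b2 = x (W1 W2) + (b1 W2 + b2).
W_eq = [[sum(W1[i][k] * W2[k][j] for k in range(3)) for j in range(2)]
        for i in range(2)]
b_eq = layer(b1, W2, b2)


def one_layer(x):
    """Single linear layer that reproduces the two-layer network."""
    return layer(x, W_eq, b_eq)
```

Placing a nonlinear activation between the two layers breaks this equivalence, which is what lets a nonlinear network represent mappings that no single linear layer can.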


Single Layer Neural Network Models

• Single layer network architecture is characterized by only
one layer with one or more neurons in the layer and
one or more inputs to the layer.

[Figure: Inputs → Single layer → Outputs]


Single Layer Neural Network Models

• For a 3-input, 2-neuron single layer network

• The network parameter representations for computations can be
defined for the multiple neurons and inputs in the form as
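
The parameter layout shown on this slide is not recoverable from the notes. One common way to organize a 3-input, 2-neuron layer is sketched below as an assumption: a 3 × 2 weight table in which W[i][j] connects input i to neuron j, plus one bias per neuron. All numeric values are illustrative.

```python
def single_layer(x, W, b, activation):
    """out[j] = activation(sum_i x[i] * W[i][j] + b[j])."""
    return [activation(sum(x[i] * W[i][j] for i in range(len(x))) + b[j])
            for j in range(len(b))]


def linear(net):
    """Pure linear activation, used here to keep the arithmetic visible."""
    return net


x = [1.0, 0.0, -1.0]   # 3 inputs
W = [[0.5, 1.0],       # weights from input 1 to neurons 1 and 2
     [-1.0, 1.0],      # weights from input 2
     [0.25, 1.0]]      # weights from input 3
b = [0.0, 0.5]         # one bias per neuron

y = single_layer(x, W, b, linear)  # [0.25, 0.5]
```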
Multi Layer Neural Network Models

• Multilayer network architecture is a generalized m-layer
feedforward perceptron network (MLP)

• This network is characterized by multiple layers, with neurons
often numbered by layer.

• The most common multilayer network consists of three layers:
a layer of “input” neurons connected to a layer of “hidden”
neurons, which is connected to a layer of “output” neurons.

• The input layer neurons are different from the input data
source: this layer receives the input data and processes it
for the next layer.

• The input source is not a layer and does not have neurons
Multi Layer Neural Network Models

• The activity of each hidden layer neuron is determined by the activities
of the input layer neurons and the connection weights between the input
layer neurons and the hidden layer neurons.

• In multilayer networks, the weights are updated using a
method called backpropagation.

[Figure: Input source → Input layer → Hidden layer → Output layer]
Multi Layer Neural Network Example 1

A 3-layer MLP network is designed with 2 inputs [x1, x2], one hidden
layer with 2 neurons, and an output layer with 2 neurons.

The network is trained with a sample X = [x1, x2] = [0, 1] and the
desired outputs T = [T1, T2] = [1, 0]. The learning rate is η = 0.10.

The initial weights of the network from the input to the
hidden layer and from the hidden layer to the output layer are defined
as:
Input-hidden layer: [v11, v12; v21, v22] = [−1, 0; 0, 1]
Hidden-output layer: [w11, w12; w21, w22] = [1, −1; 0, 1]

Bias for hidden layer neurons: [bh1, bh2] = [1, 1] and bias for output
neurons: [bo1, bo2] = [1, 1].

Assume the network uses the linear activation function f(x) = x.

Find the weights at which the network can correctly map arbitrary
inputs to the outputs.
Multi Layer Neural Network Example 1

[Figure: network diagram with inputs x1, x2; input-hidden weights v11, v12, v21, v22; hidden biases bh1, bh2; hidden-output weights w11, w12, w21, w22; output biases bo1, bo2]

[x1, x2] = [0, 1]
[T1, T2] = [1, 0]
[v11, v12; v21, v22] = [−1, 0; 0, 1]
[w11, w12; w21, w22] = [1, −1; 0, 1]
[bh1, bh2] = [1, 1]
[bo1, bo2] = [1, 1]

First Layer:
Multi Layer Neural Network Example 1

[Figure: network diagram with inputs x1, x2; weights v11, v12, v21, v22 and w11, w12, w21, w22; biases bh1, bh2, bo1, bo2; outputs y1, y2]

[x1, x2] = [0, 1]
[T1, T2] = [1, 0]
[v11, v12; v21, v22] = [−1, 0; 0, 1]
[w11, w12; w21, w22] = [1, −1; 0, 1]
[bh1, bh2] = [1, 1]
[bo1, bo2] = [1, 1]

Output Layer:
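
The layer-by-layer calculations for this example are missing from the notes. The sketch below runs a forward pass with the given values under two assumptions that the notes do not confirm: a weight with subscript ij connects unit i of the sending layer to unit j of the receiving layer, and the input layer passes its values through unchanged. Under the opposite indexing convention the output-layer numbers would differ.

```python
def layer(x, W, b, f):
    """out[j] = f(sum_i x[i] * W[i][j] + b[j]); W[i][j] goes from unit i to unit j."""
    return [f(sum(x[i] * W[i][j] for i in range(len(x))) + b[j])
            for j in range(len(b))]


def identity(net):
    """Linear activation f(x) = x, as stated in the example."""
    return net


x = [0.0, 1.0]                   # sample [x1, x2]
T = [1.0, 0.0]                   # desired outputs [T1, T2]
V = [[-1.0, 0.0], [0.0, 1.0]]    # input-hidden weights [v11, v12; v21, v22]
W = [[1.0, -1.0], [0.0, 1.0]]    # hidden-output weights [w11, w12; w21, w22]
bh = [1.0, 1.0]                  # hidden biases
bo = [1.0, 1.0]                  # output biases

h = layer(x, V, bh, identity)    # hidden activations: [1.0, 2.0]
y = layer(h, W, bo, identity)    # network outputs:    [2.0, 2.0]
error = [t - yk for t, yk in zip(T, y)]  # [-1.0, -2.0]
```

The nonzero error is what the weight update has to reduce in the next step of the example.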
Backpropagation Learning Concept

• Backpropagation (error backpropagation) is a method for
learning the weights and biases in feedforward networks.
• The method consists of two passes through the network
layers:
• Forward Pass:
• The process begins with a forward pass, where input data is fed
into the neural network, and the network's weights are used to
compute the predicted output.
• The predicted output is compared to the actual target
values, and the error is calculated using a predefined loss or
cost function.

• Backward Pass (Backpropagation):
• The backpropagation algorithm propagates the error
backward through the network to update the weights.
• The gradient of the loss with respect to each weight in the
network is computed using the chain rule of calculus.
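
The two passes can be traced on the small network from Example 1. This is a sketch under stated assumptions: the loss is taken to be the squared error E = ½ Σ (T − y)², activations are linear, weights are indexed from sending unit to receiving unit, and the update is plain gradient descent with the example's learning rate of 0.10. The notes do not confirm any of these choices.

```python
def forward(x, V, bh, W, bo):
    """Forward pass with linear activations; returns hidden and output values."""
    h = [sum(x[i] * V[i][j] for i in range(len(x))) + bh[j] for j in range(len(bh))]
    y = [sum(h[j] * W[j][k] for j in range(len(h))) + bo[k] for k in range(len(bo))]
    return h, y


def loss(t, y):
    """Assumed loss: half the sum of squared errors."""
    return 0.5 * sum((tk - yk) ** 2 for tk, yk in zip(t, y))


def backprop_step(x, t, V, bh, W, bo, eta):
    """One gradient-descent update using the chain rule."""
    h, y = forward(x, V, bh, W, bo)
    # Output layer: dE/dnet_k = y_k - t_k (derivative of linear activation is 1).
    delta_o = [y[k] - t[k] for k in range(len(y))]
    # Hidden layer: error propagated backward through the old output weights.
    delta_h = [sum(W[j][k] * delta_o[k] for k in range(len(delta_o)))
               for j in range(len(h))]
    new_W = [[W[j][k] - eta * h[j] * delta_o[k] for k in range(len(delta_o))]
             for j in range(len(h))]
    new_bo = [bo[k] - eta * delta_o[k] for k in range(len(bo))]
    new_V = [[V[i][j] - eta * x[i] * delta_h[j] for j in range(len(delta_h))]
             for i in range(len(x))]
    new_bh = [bh[j] - eta * delta_h[j] for j in range(len(bh))]
    return new_V, new_bh, new_W, new_bo


x, t = [0.0, 1.0], [1.0, 0.0]
V, bh = [[-1.0, 0.0], [0.0, 1.0]], [1.0, 1.0]
W, bo = [[1.0, -1.0], [0.0, 1.0]], [1.0, 1.0]

loss_before = loss(t, forward(x, V, bh, W, bo)[1])
V2, bh2, W2, bo2 = backprop_step(x, t, V, bh, W, bo, eta=0.10)
loss_after = loss(t, forward(x, V2, bh2, W2, bo2)[1])
```

Repeating `backprop_step` over many samples and epochs is what drives the loss down; a single step only shows the direction of the update.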
Neural Network Architectures

• Feedforward network architecture (FFNN)

• Multilayer network with data flow from input to hidden
layer to output without looping back.

• Most common form of neural network model


Neural Network Architectures
• Recurrent network architecture (RNN)

• Derived from the feedforward network but with feedback
loops in its structure (positive, negative, or both).

• Connections between nodes can create a cycle, allowing
output from some nodes to affect subsequent input to
the same nodes.

• Recurrent Neural Networks (RNNs) are designed to work
with sequential data.
Neural Network Architectures

• Recurrent network architecture (RNN)

• A neuron in the hidden layer has a connection to itself,
forming a loop. This loop allows information to be passed
from one step in the sequence to the next, capturing
dependencies and patterns in sequential data.

• At each time step, the hidden layer produces an output
and a hidden state, which serves as the memory of the
network, retaining information about past inputs.

• Mathematically, the hidden state h_t at time step t is a
function of the current input x_t and the previous hidden
state: h_t = f(x_t, h_{t-1})

• RNNs are trained using a process called
backpropagation through time (BPTT). It's a variant
of backpropagation designed for sequences.
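
The recurrence h_t = f(x_t, h_{t-1}) can be illustrated with a single hidden neuron. The choice of f as tanh of a weighted sum, and the parameter names and values `w_x`, `w_h`, `b`, are assumptions made for this sketch; the slide leaves f unspecified.

```python
import math


def rnn_step(x_t, h_prev, w_x, w_h, b):
    """One time step: h_t = tanh(w_x * x_t + w_h * h_prev + b) (assumed form of f)."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)


def run_rnn(xs, h0=0.0, w_x=0.5, w_h=0.8, b=0.0):
    """Feed a sequence through the same neuron, carrying the hidden state forward."""
    h = h0
    states = []
    for x_t in xs:
        h = rnn_step(x_t, h, w_x, w_h, b)
        states.append(h)
    return states


# A single pulse followed by zeros: later states stay nonzero because the
# hidden state carries information about the earlier input.
states = run_rnn([1.0, 0.0, 0.0])
```

The same weights are reused at every step, which is why training unrolls the loop over time and applies backpropagation to the unrolled sequence.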
Neural Network Architectures

• Convolutional neural network architecture

• Similar to a feedforward network, with hidden layers that
perform convolution and pooling operations.

• Convolution and pooling layers alternate in the structure

• It is a member of the deep neural network family
Neural Network Architectures

• Convolutional neural network architecture

• A convolutional kernel is applied to the inputs. The kernel
weights are multiplied elementwise with the corresponding
numbers in the local receptive field, and the products are
summed to give one output value.
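
The kernel operation can be sketched with nested lists. The version below slides the kernel without flipping it (cross-correlation, which is what most deep learning libraries call convolution) and uses no padding and a stride of 1; those choices, the function name `conv2d_valid`, and the sample values are assumptions.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image; each output is the sum of elementwise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0.0
            for i in range(kh):          # rows of the local receptive field
                for j in range(kw):      # columns of the local receptive field
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Illustrative 3 x 3 input and 2 x 2 kernel.
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]
feature_map = conv2d_valid(image, kernel)  # [[-4.0, -4.0], [-4.0, -4.0]]
```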
Neural Network Architectures

• Convolutional neural network architecture

• The first convolutional layer will learn small local patterns
such as edges, and the second layer will learn patterns
made up of the features of the first layer.

• This is known as a spatial hierarchy

[Figure: Input image → 1st convolutional layer → 2nd convolutional layer]
Neural Network Architectures
• Convolutional neural network architecture

• Pooling layers are used for downsampling (subsampling)
the spatial dimensions of the input.

• Max pooling and average pooling are common pooling
operations. Max pooling retains the maximum value from a
group of neighboring pixels, while average pooling
computes the average.
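
Both pooling operations can share one helper that differs only in how each window is reduced. In this sketch the windows are square and non-overlapping (stride equal to the window size), which is an assumption; the function names and sample values are illustrative.

```python
def pool2d(feature_map, size, reduce_fn):
    """Apply reduce_fn to each non-overlapping size x size window."""
    pooled = []
    for r in range(0, len(feature_map) - size + 1, size):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            row.append(reduce_fn(window))
        pooled.append(row)
    return pooled


def average(values):
    """Arithmetic mean of the values in one pooling window."""
    return sum(values) / len(values)


fm = [[1, 3, 2, 4],
      [5, 6, 1, 2],
      [7, 2, 9, 0],
      [1, 4, 3, 8]]

max_pooled = pool2d(fm, 2, max)      # [[6, 4], [7, 9]]
avg_pooled = pool2d(fm, 2, average)  # [[3.75, 2.25], [3.5, 5.0]]
```

Either way a 4 × 4 map becomes 2 × 2, which is the downsampling described above.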
Neural Network Architectures
• Self organizing network architecture (SOM)

• Similar to a feedforward network but with a one-layer type
of structure based on the topographical organization of the brain.

• Its neurons are arranged in an x × y lattice


Neural Network Architectures
• Self organizing network architecture (SOM)

• In each training step, one sample x from the dataset is
chosen. The distance (usually Euclidean) between x and
each of the weight vectors of the SOM is computed

• The neuron whose weight vector is closest to the input
vector is called the best matching unit (BMU) or the
“winner”.

• The weight vectors are then updated so that the BMU
moves closer to the input vector in the input space.

• The weight vectors of neighbors of the BMU are also
adjusted.
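
One SOM training step, as described in the bullets above, might look like the sketch below. The Gaussian neighborhood function, the lattice-distance cutoff `radius` (assumed to be greater than 0), the learning rate, and the 2 × 2 lattice values are assumptions chosen for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def som_step(weights, x, learning_rate, radius):
    """One training step on a lattice given as {(row, col): weight_vector}.

    Finds the best matching unit (BMU), then moves the BMU and its lattice
    neighbors toward x. Returns the BMU's lattice position.
    """
    bmu = min(weights, key=lambda pos: euclidean(weights[pos], x))
    for pos, w in list(weights.items()):
        grid_dist = math.hypot(pos[0] - bmu[0], pos[1] - bmu[1])
        if grid_dist <= radius:
            # Assumed Gaussian neighborhood: 1 at the BMU, smaller farther away.
            influence = math.exp(-(grid_dist ** 2) / (2 * radius ** 2))
            weights[pos] = [wi + learning_rate * influence * (xi - wi)
                            for wi, xi in zip(w, x)]
    return bmu


# Illustrative 2 x 2 lattice with 2-dimensional weight vectors.
lattice = {(0, 0): [0.0, 0.0], (0, 1): [0.0, 1.0],
           (1, 0): [1.0, 0.0], (1, 1): [1.0, 1.0]}
sample = [0.9, 0.8]
neighbor_dist_before = euclidean(lattice[(0, 1)], sample)
bmu = som_step(lattice, sample, learning_rate=0.5, radius=1.0)
```

In practice both the learning rate and the radius are usually reduced over the course of training so that the map settles.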
Neural Network Architectures

• Deep network architecture

• Derived from the feedforward network, with multiple (deep)
layers between the input and output layers and a deep
structured form of learning.

• Deep learning architectures may be MLP, RBFN, RNN,
CNN, etc.
