Lecture 5: Introduction to Neural Networks

The document discusses the fundamentals of Artificial Neural Networks (ANN), including their structure, function, and comparison to biological neurons. It covers key concepts such as perceptrons, activation functions, and the architecture of single-layer and multi-layer perceptrons, along with guidelines for determining the number of hidden layers and neurons in a neural network. The content is aimed at students in a Biomedical Engineering course, providing a foundational understanding of machine learning approaches in medical pattern recognition.

BIO3603: Medical Pattern Recognition

Lecture 6
Dr. Lamees Nasser
E-mail: [email protected]
Third Year – Biomedical Engineering Department
Academic Year 2023–2024

Artificial Neural Networks

• An artificial neural network (ANN) is a machine learning approach that models (mimics) the human brain.
• An ANN is composed of many artificial neurons that are linked together to perform a desired function.
• A neuron only fires (generates an electrical signal) if its input signal exceeds a certain amount (the threshold) within a short time period.
Artificial Neural Networks (cont’d)
Biological Neuron
• A neuron is a single cell that transmits chemical and electrical signals in the brain.
• The human brain contains billions of neurons connected into a network.
• A biological neuron is composed of three main parts (dendrite, cell body, axon) and an external part called the synapse:
Biological Neuron Structure
1. Dendrite - input links
Dendrites are responsible for receiving incoming signals from the axons of other neurons, forming a neural network.
2. Soma - processor
The soma is the cell body responsible for processing the input signals and deciding whether the neuron should fire an output signal. A neuron only fires if its input signal exceeds a certain amount (the threshold) within a short time period.
3. Axon - output
The axon is responsible for carrying the processed signal from the neuron to the target cell.
4. Synapse - interconnections
The synapse is the connection between an axon and the dendrites of other neurons.
- The size of a synapse determines the strength of its signal transmission: the larger the synapse, the stronger the signal it transmits.
Mathematical model of a biological neuron
(Artificial Neuron)
• The output of a neuron is a function of the weighted sum of
the inputs plus a bias

The McCulloch-Pitts model
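
To make this concrete, here is a minimal sketch of a McCulloch-Pitts-style neuron in Python (the weight, bias, and input values are illustrative, not from the slides): it computes the weighted sum of the inputs plus a bias and fires (outputs 1) only when that sum exceeds the threshold of 0.

def neuron(inputs, weights, bias):
    # Weighted sum of the inputs plus the bias
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Fire (output 1) only if the net input exceeds the threshold (0 here)
    return 1 if z > 0 else 0

print(neuron([1, 0], [0.6, 0.6], -0.5))  # 1: net input 0.1 > 0
print(neuron([0, 0], [0.6, 0.6], -0.5))  # 0: net input -0.5 <= 0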


Comparison between biological neuron &
artificial neuron

Biological Neuron   | Artificial Neuron
Dendrites           | Inputs
Soma (cell body)    | Node (neuron)
Synapse             | Weights or interconnections
Axon                | Output
Perceptron

• A perceptron (artificial neuron) is the most fundamental unit of a neural network: it takes an input, processes it, passes it through an activation function such as the sigmoid, and returns the activated output.
• The perceptron is a linear model for supervised learning, used for binary classification tasks.

An artificial neuron: the basic unit of a neural network


Perceptron (cont'd)

• Input nodes or input layer: take the initial data into the model for further processing.
x_1, x_2, x_3, ..., x_m
• Weights and bias:
• Weight: indicates the influence of a particular input. The greater the weight of an input, the greater its impact on the network.
• Bias: the same as the intercept added in a linear equation. It is an additional parameter whose task is to adjust the output along with the weighted sum of the inputs to the neuron.

w_1, w_2, w_3, ..., w_m and b
Perceptron (cont’d)
• Net sum: calculates the total weighted sum:

\sum_{i=1}^{m} x_i w_i + b

• Activation function: decides whether a neuron should be activated by computing the weighted sum and adding the bias to it. The purpose of the activation function is to introduce non-linearity into the output of a neuron (see the sketch after this list):

y = f\left( \sum_{i=1}^{m} x_i w_i + b \right)

• Two types of perceptron:
a) Single-layer perceptron
b) Multi-layer perceptron
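
A minimal NumPy sketch of this forward pass, assuming a sigmoid activation (the input and weight values are illustrative, not from the slides):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def perceptron_forward(x, w, b):
    z = np.dot(x, w) + b   # net sum: weighted sum of inputs plus bias
    return sigmoid(z)      # activation introduces non-linearity

x = np.array([0.5, -1.0, 2.0])          # inputs x_1..x_m (illustrative)
w = np.array([0.4, 0.7, -0.2])          # weights w_1..w_m (illustrative)
print(perceptron_forward(x, w, b=0.1))  # an activated output between 0 and 1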
Effect of Bias in a Neural Network

• Bias is like the intercept added in a linear equation:

y = ax + b

• It is an additional parameter in the neural network which is used to adjust the output along with the weighted sum of the inputs to the neuron:

y = \sum_{i=1}^{m} x_i w_i + b

• Therefore, bias is a constant that helps the model fit the given data as well as possible.
• The bias shifts the decision boundary away from the origin.
• Recall the equation of a straight line: b is the y-intercept.

[Figures: the same line y = ax + b plotted with b positive, b zero, and b negative; increasing b shifts the line up, and b = -ve shifts it down.]
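
A short worked example of this shift (values illustrative): for a single-input neuron with net input z = wx + b and a step threshold at z = 0, the decision boundary sits at x = -b/w. With w = 1, a bias of b = 0 puts the boundary at the origin, b = 2 moves it to x = -2, and b = -2 moves it to x = 2.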
Bias of a Neuron

• The same concept applies to the bias term of a neuron:

\sum_{i=1}^{m} x_i w_i + \text{bias}
Common activation functions
• The activation function is used to determine the output of a neural network node, e.g. 0 or 1. It maps the resulting values into a range such as 0 to 1 or -1 to 1.

[Figure: plots of the step, ReLU, and sigmoid functions.]
Activation functions

• Linear transfer function (identity function)

▪ The signal passes through it unchanged, so the output remains a linear function of the input. Almost never used.
Activation functions (cont'd)

• Step function
▪ Produces a binary output of 0 or 1. Mainly used in binary
classification to give a discrete value.

\text{output} = \begin{cases} 0 & \text{if } w \cdot x + b \le 0 \\ 1 & \text{if } w \cdot x + b > 0 \end{cases}
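
A minimal sketch of the step activation in Python (NumPy assumed; the function name and values are illustrative):

import numpy as np

def step_output(x, w, b):
    # Fires (returns 1) only when the net input w . x + b is positive
    return 1 if np.dot(w, x) + b > 0 else 0

print(step_output([1, 1], [0.5, 0.5], -0.4))  # 1: net input 0.6 > 0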
Activation functions (cont'd)

• Sigmoid/logistic function
▪ Squishes all the values to a
probability between 0 and 1, which
reduces extreme values or outliers in
the data. Usually used to classify
two classes.
\sigma(z) = \frac{1}{1 + e^{-z}}
• SoftMax function
▪ A generalization of the sigmoid
function. Used to obtain
classification probabilities when we
have more than two classes.
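
Minimal NumPy sketches of both functions (softmax uses the standard max-subtraction trick for numerical stability; the input values are illustrative):

import numpy as np

def sigmoid(z):
    # Maps any real value into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Subtracting the max avoids overflow without changing the result
    e = np.exp(z - np.max(z))
    return e / e.sum()

print(sigmoid(0.0))                        # 0.5
print(softmax(np.array([2.0, 1.0, 0.1])))  # class probabilities summing to 1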
Activation functions (cont'd)

• Hyperbolic tangent function (tanh)


▪ Squishes all values to the range of -1 to 1. Tanh almost always works better than the sigmoid function in hidden layers.
Activation functions (cont'd)
• Rectified linear unit (ReLU)
▪ Activates a node only if the input is above zero. Often recommended as the default for hidden layers; it generally works better than tanh.

• Leaky ReLU
▪ Instead of having the function be zero when x < 0, leaky ReLU introduces a small negative slope (around 0.01) when x is negative.
▪ It often works better than the ReLU function, although it is not used as much in practice.
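
Minimal NumPy sketches of both (the sample inputs are illustrative):

import numpy as np

def relu(z):
    # Zero for negative inputs, identity for positive inputs
    return np.maximum(0.0, z)

def leaky_relu(z, alpha=0.01):
    # A small slope alpha keeps a gradient flowing when z < 0
    return np.where(z > 0, z, alpha * z)

print(relu(np.array([-2.0, 3.0])))        # [0. 3.]
print(leaky_relu(np.array([-2.0, 3.0])))  # [-0.02  3.]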
Activation functions (cont'd)

• Activation for hidden layers:


▪ Rectified linear unit (ReLU)
▪ Sigmoid/logistic function
▪ Hyperbolic tangent function (tanh)

• Activation for output layer:


▪ Linear
▪ Sigmoid/logistic function
▪ SoftMax
Single-layer Perceptron
• A single-layer perceptron has just two layers: an input layer and an output layer. It does not contain hidden layers, unlike the multilayer perceptron.
• Input nodes are fully connected to a node or multiple nodes in the next layer. A node in the next layer takes a weighted sum of all its inputs.

Single-layer Perceptron Limitation


➢ A single-layer perceptron can only learn linearly separable problems.

• Linearly separable means that classes of patterns represented as n-dimensional vectors can be separated by a single linear decision boundary.
• The Boolean AND function is linearly separable, whereas the Boolean XOR function is not (a sketch demonstrating this follows the figure below).
• The multilayer perceptron was developed to tackle this limitation.

[Figure: the four Boolean input points (0,0), (0,1), (1,0), (1,1) plotted for AND and for XOR. A single line separates the AND classes (linearly separable), but no single line separates the XOR classes (nonlinearly separable).]
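
A small sketch illustrating the point (weights hand-picked, not learned): a single threshold neuron with weights (1, 1) and bias -1.5 reproduces AND on all four inputs, while an exhaustive search over a coarse grid of weights and biases finds no single neuron that reproduces XOR.

def fires(x, w, b):
    return 1 if x[0] * w[0] + x[1] * w[1] + b > 0 else 0

points = [(0, 0), (0, 1), (1, 0), (1, 1)]

# AND is linearly separable: one hand-picked neuron suffices
print([fires(p, (1, 1), -1.5) for p in points])  # [0, 0, 0, 1]

# XOR is not: a brute-force search over a grid finds no solution
grid = [i / 2 for i in range(-8, 9)]  # -4.0 .. 4.0 in steps of 0.5
found = any(
    [fires(p, (w1, w2), b) for p in points] == [0, 1, 1, 0]
    for w1 in grid for w2 in grid for b in grid
)
print(found)  # False: no single-layer perceptron reproduces XOR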
Multi-Layer Perceptron
• The multilayer perceptron (MLP) is a hierarchical structure of several perceptrons that overcomes the limitations of the single-layer perceptron.
• The multilayer perceptron is used to solve nonlinear problems.
• Nonlinear problems can be represented by a multilayer perceptron whose nodes use nonlinear activation functions.
• It consists of input, hidden, and output layers.

[Figure: a network with an input layer, a hidden layer, and an output layer.]
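
To show how a hidden layer overcomes the XOR limitation, here is a hand-wired two-layer sketch (weights chosen by hand, not learned): one hidden neuron computes OR, the other computes AND, and the output neuron fires for "OR but not AND", which is exactly XOR.

def fires(x, w, b):
    return 1 if sum(xi * wi for xi, wi in zip(x, w)) + b > 0 else 0

def xor_mlp(x):
    h1 = fires(x, (1, 1), -0.5)             # hidden neuron 1: OR
    h2 = fires(x, (1, 1), -1.5)             # hidden neuron 2: AND
    return fires((h1, h2), (1, -2), -0.5)   # output: OR and not AND

for p in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(p, xor_mlp(p))  # outputs 0, 1, 1, 0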


How many hidden layers and neurons do you need
in your artificial neural network?
Guidelines for determining the number of hidden layers and neurons
1. Based on the data, draw an expected decision boundary to separate the classes.
2. Express the decision boundary as a set of lines, where each line will be modeled as a perceptron in the ANN.
• The number of selected lines represents the number of hidden neurons in the first hidden layer.
3. To connect the lines created by the previous layer, a new hidden layer is added. Note that a new hidden layer is added each time you need to create connections among the lines in the previous hidden layer.
▪ The number of hidden neurons in each new hidden layer equals the number of connections to be made.
Guidelines for determining the number of hidden layers and neurons (cont'd)
Number of hidden layers
• If the data is linearly separable, then you don't need any hidden layers at all.
• If the data is less complex, with fewer dimensions or features, then a neural network with 1 to 2 hidden layers will usually work.
• If the data has many dimensions or features, then 3 to 5 hidden layers can be used to reach an optimum solution.
• The general rule is this: the deeper your network is, the more it will fit the training data.
• But too much depth is not a good thing, because the network can fit the training data so well that it fails to generalize when you show it new data (overfitting); it also becomes more computationally expensive.
Guidelines for determining the number of hidden layers and neurons (cont'd)
• Start from that point, maybe three to five layers, and observe the network performance. If it is performing poorly (underfitting), add more layers. If you see signs of overfitting, then decrease the number of layers.
Number of nodes in each hidden layer
• There are many rule-of-thumb methods for determining the
correct number of neurons to use in the hidden layers, such as
the following:
▪ The number of hidden neurons should be between the size
of the input layer and the size of the output layer.
▪ The number of hidden neurons should be 2/3 the size of
the input layer, plus the size of the output layer.
Guidelines for determining the number of hidden layers and neurons (cont'd)
▪ The number of hidden neurons should be less than twice the size of the input layer.
▪ These three rules provide a starting point to consider (a small sketch evaluating them follows below).
• The number of hidden neurons should keep decreasing in subsequent layers, getting closer and closer to pattern and feature extraction and to identifying the target class.
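
A tiny sketch that evaluates the three rules of thumb for given layer sizes (the function name and example sizes are illustrative):

def hidden_neuron_rules(n_in, n_out):
    # Three rule-of-thumb starting points from the slides
    return {
        "between input and output size": (min(n_in, n_out), max(n_in, n_out)),
        "2/3 of input size plus output size": round(2 * n_in / 3 + n_out),
        "less than twice the input size": 2 * n_in - 1,  # largest integer below 2 * n_in
    }

print(hidden_neuron_rules(n_in=10, n_out=2))
# {'between input and output size': (2, 10),
#  '2/3 of input size plus output size': 9,
#  'less than twice the input size': 19}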
Example 1
• Let's start with a simple example of a classification problem with two classes. Each sample has two inputs and one output that represents the class label. It is very similar to the XOR problem.
• To split a nonlinear dataset, we need more than one line. This means we need an architecture with hidden neurons; more complex datasets may require tens or hundreds of them.
Step 1

• Draw the decision boundary that splits the two classes.

[Figure: the two classes are non-linearly separable.]
Step 2

• Express the decision boundary as a set of lines.
• Two lines are required. In other words, there are two single-layer perceptron networks, and each perceptron produces a line.
• Because two lines are required to represent the decision boundary, the first hidden layer has two hidden neurons.
Step 3

• The output neuron merges the two lines generated previously so that there is only one output from the network.
Network architecture

• After determining the number of hidden layers and their neurons, the network architecture is complete (a hedged Keras sketch of this architecture follows).
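
A sketch of this 2-2-1 architecture in Keras (assuming TensorFlow is installed; the layer sizes follow the example, while the activation and training settings are illustrative choices, not from the slides):

import tensorflow as tf

# 2 inputs -> 2 hidden neurons (one per boundary line) -> 1 output neuron
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(2,)),
    tf.keras.layers.Dense(2, activation="sigmoid"),  # first hidden layer: two "lines"
    tf.keras.layers.Dense(1, activation="sigmoid"),  # output neuron merges them
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()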
Example 2
• There are two classes, where each sample has two inputs and one output.
