
Artificial Neural Networks

Dr. Kavita Pabreja


Outline
• Types of Machine Learning
• Artificial Neural Network
• Biological Neuron
• Hebbian Learning Rule
• Hebbian Learning Rule with Implementation of AND Gate
• Perceptron, Perceptron Example, Perceptron Terminology
• Artificial Neural Network: Architecture, Loss Function, Hyperparameters
• Gradient Descent algorithm for training artificial neural networks (ANNs)
• ANN: Benefits, Disadvantages, Applications



What is Machine Learning?

• Machine learning is an approach in which the user provides the input and the expected output.
• It expects the machine to learn (find) the algorithm (model) that produces the desired (expected) output.



Types of Machine Learning
Supervised Learning:
• A model learns from data samples that contain both the input and the expected output.
Unsupervised Learning:
• The model is expected to discover the patterns on its own.
• There is no guidance as to what it is expected to learn.
• Unsupervised learning algorithms simply group or cluster similar items.
Reinforcement Learning:
• It is like unsupervised learning as there is no available class label.
• But based on the solution provided by the model, positive or negative feedback
is given.
• This feedback is used by the model to make a better decision.
Artificial Neural Network and Biological Neuron…cont’d
• The term "Artificial Neural Network" is derived from Biological neural
networks that develop the structure of a human brain.
• Similar to the human brain that has neurons interconnected to one
another, artificial neural networks also have neurons that are
interconnected to one another in various layers of the networks.
These neurons are known as nodes.
• Deep learning is a technique that uses artificial neural networks (ANNs).
• Artificial neural networks can be used to solve both regression and classification problems.



Artificial Neural Network and Biological Neuron
• The biological neural network refers to a group of biological nerve cells
that are connected to one another.
• A typical biological neural network is the brain.
• The brain is composed of a number of neurons that are interlinked to
form a huge network.
• This network is used to transmit information between any two points.
• The brain uses different routes to transmit different types of
information between the two points.



Biological Neuron
• The electrical signals produced in one neuron are passed to the next neuron along a long branch-like structure called an axon.
• When a nerve signal reaches the end of the neuron, it triggers the release of
neurotransmitters which carry the signal across the synapse to the next neuron.
• This allows signals to pass from one neuron to the next.



• An artificial neural network is a mathematical model of biological neurons.
• Like biological neurons, it has many connected neurons (nodes).
• Dendrites (inputs) receive signals from other neurons. The cell body sums all the input signals to generate an output.
• The axon (output) transmits the output when the sum reaches a threshold.
• Synapses (weights) are the points of interaction between neurons; they transmit electrical or chemical signals to the next neuron.
• The word "synapse" is derived from the Greek for "conjunction."
Hebbian Learning Rule
• The Hebbian Learning Rule, also known as the Hebb Learning Rule, was proposed by Donald O. Hebb.
• It is one of the earliest and simplest learning rules for neural networks.
• It is used for pattern classification.
• It is applied in a single-layer neural network, i.e. one input layer and one output layer.
• The input layer can have many units, say n. The output layer has only one unit.
• Hebbian rule works by updating the weights between neurons in the neural
network for each training sample.
• The core idea of Hebb's rule is that when the axon of one neuron consistently
and repeatedly stimulates another neuron, the connection between them
becomes stronger. In simpler terms, if neuron A repeatedly triggers neuron B,
the connection between A and B will be reinforced, making it more likely for
neuron B to fire when neuron A fires.
Hebbian Learning Rule Algorithm
1. Set all weights to zero, wi = 0 for i = 1 to n, and set the bias to zero.
2. For each training pair S : t (input vector and target output), repeat steps 3-5.
3. Set the activations of the input units to the input vector: xi = si for i = 1 to n.
4. Set the output value of the output neuron to the target: y = t.
5. Update the weights and bias by applying the Hebb rule for all i = 1 to n:
   wi(new) = wi(old) + xi * y
   b(new) = b(old) + y



Hebbian Learning Rule with Implementation of
AND Gate
• Implementing the concept of the AND gate using Hebbian learning in a simplified
neural network involves adjusting the synaptic strengths between neurons to
produce an output similar to the logical AND operation.
• In the context of an AND gate, here's a basic example using Hebbian learning:
• Imagine a neural network with two input neurons (x1 and x2) and one output neuron (y). The goal is for the output neuron (y) to fire (produce an output) when both input neurons (x1 and x2) are activated simultaneously (see the example below).



Hebbian Learning Rule with Implementation of
AND Gate

• Truth table of the AND gate using the bipolar sigmoidal function (produces outputs in a range between -1 and 1):

  x1   x2    t
   1    1    1
   1   -1   -1
  -1    1   -1
  -1   -1   -1

• There are 4 training samples, so there will be 4 iterations.
• The activation function used here is the bipolar sigmoidal function, so the range is [-1, 1].
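Below is a minimal Python sketch of this training procedure (illustrative, not from the original slides); the sample list follows directly from the bipolar truth table above, and the final weights match the hand-worked iterations.

```python
import numpy as np

# Bipolar training samples for the AND gate: (input vector, target)
samples = [
    (np.array([ 1,  1]),  1),
    (np.array([ 1, -1]), -1),
    (np.array([-1,  1]), -1),
    (np.array([-1, -1]), -1),
]

w = np.zeros(2)  # step 1: weights start at zero
b = 0.0          # ... and so does the bias

for x, t in samples:   # step 2: one pass over the four training pairs
    y = t              # steps 3-4: clamp the output to the target
    w = w + x * y      # step 5: wi(new) = wi(old) + xi * y
    b = b + y          #         b(new)  = b(old)  + y

print(w, b)  # -> [2. 2.] -2.0
```

Running the four iterations by hand gives the same result: the weights grow to w1 = w2 = 2 and the bias settles at b = -2.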
Testing the Network

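Continuing the sketch above, a quick test of the trained network: the output is the sign of the net input w·x + b.

```python
def predict(x, w=np.array([2.0, 2.0]), b=-2.0):
    net = np.dot(w, x) + b        # net input to the output neuron
    return 1 if net > 0 else -1   # hard bipolar threshold

for x, _ in samples:
    print(x, "->", predict(x))    # reproduces the bipolar AND column t
```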


What is the application/principle of Hebb's
rule?
• Hebb's Rule describes how, when a cell persistently activates another nearby cell, the connection between the two cells becomes stronger.
• Specifically, when neuron A's axon repeatedly activates neuron B, a growth process occurs that increases how effective neuron A is in activating neuron B.
• Hebbian theory asserts that “neurons that fire together, wire
together,” meaning that when activity in one cell repeatedly
stimulates action potentials in a second cell, synaptic strength is
potentiated.



Perceptron
• A Perceptron is an Artificial Neuron.
• It is the simplest possible Neural Network.
• The perceptron consists of 4 parts:
1. Input values or one input layer
2. Weights and Bias
3. Net sum
4. Activation Function
• Neural networks work the same way as the perceptron, so if you want to know how a neural network works, learn how the perceptron works.



History of Perceptrons
• Frank Rosenblatt (1928 – 1971) was an American psychologist notable in
the field of Artificial Intelligence.
• In 1957 he "invented" a Perceptron program, on an IBM 704 computer at
Cornell Aeronautical Laboratory.
• Scientists had discovered that brain cells (Neurons) receive input from our
senses by electrical signals.
• The neurons, in turn, use electrical signals to store information and to make decisions based on previous input.
• Frank had the idea that Perceptrons could simulate brain principles, with
the ability to learn and make decisions.



The Perceptron
• The original Perceptron was designed to take a number of binary
inputs, and produce one binary output (0 or 1).
• The idea was to use different weights to represent the importance of
each input, and that the sum of the values should be greater than a
threshold value before making a decision like yes or no (true or false)
(0 or 1).



Perceptron Example
• Imagine a perceptron (in your brain).
• The perceptron tries to decide if you should go to a concert.
• Is the artist good? Is the weather good? Will a friend come? Is food served? Is alcohol served?
• What weights should these facts have?

Criteria             Input          Weight
Artist is Good       x1 = 0 or 1    w1 = 0.7
Weather is Good      x2 = 0 or 1    w2 = 0.6
Friend will Come     x3 = 0 or 1    w3 = 0.5
Food is Served       x4 = 0 or 1    w4 = 0.3
Alcohol is Served    x5 = 0 or 1    w5 = 0.4
The Perceptron Algorithm
Frank Rosenblatt suggested this algorithm:
1. Set a threshold value
2. Multiply all inputs by their weights
3. Sum all the results
4. Activate the output

Applied to the concert example:
1. Set a threshold value: Threshold = 1.5
2. Multiply all inputs by their weights:
x1 * w1 = 1 * 0.7 = 0.7
x2 * w2 = 0 * 0.6 = 0
x3 * w3 = 1 * 0.5 = 0.5
x4 * w4 = 0 * 0.3 = 0
x5 * w5 = 1 * 0.4 = 0.4
3. Sum all the results:
0.7 + 0 + 0.5 + 0 + 0.4 = 1.6 (the weighted sum)
4. Activate the output:
Return true if the sum > 1.5 ("Yes, I will go to the concert")

Note
If the weather weight is 0.6 for you, it might be different for someone else. A higher weight means that the weather is more important to them.
If the threshold value is 1.5 for you, it might be different for someone else. A lower threshold means they are more willing to go to any concert.
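The same computation as a short Python sketch (illustrative; the variable names are mine, the numbers come from the example above):

```python
# Concert decision: weighted sum of binary inputs against a threshold
inputs    = [1, 0, 1, 0, 1]            # x1..x5 from the table above
weights   = [0.7, 0.6, 0.5, 0.3, 0.4]  # w1..w5
threshold = 1.5

weighted_sum = sum(x * w for x, w in zip(inputs, weights))
print(round(weighted_sum, 2))          # 1.6 (the weighted sum)
print(weighted_sum > threshold)        # True -> "Yes, I will go to the concert"
```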
Perceptron Terminology
• Nodes (Perceptron Inputs)
• Perceptron inputs are called nodes.
• The nodes have both a value and a weight.
• Node Values (Input Values)
• Each input node has a binary value of 1 or 0.
• This can be interpreted as true or false / yes or no.
• In the example above, the node values are: 1, 0, 1, 0, 1
• Node Weights
• Weights show the strength of each node.
• In the example above, the node weights are: 0.7, 0.6, 0.5, 0.3, 0.4
• The Activation Function
• The activation function maps the weighted sum into a binary value of 1 or 0.
• This can be interpreted as true or false / yes or no.
• In the example above, the activation function is simple: (sum > 1.5)
Artificial Neural Network
• An Artificial Neural Network (ANN) is a computational model inspired
by the human brain’s neural structure.
• It consists of interconnected nodes (neurons) organized into layers.
• Information flows through these nodes, and the network adjusts the
connection strengths (weights) during training to learn from data,
enabling it to recognize patterns, make predictions, and solve various
tasks in machine learning and artificial intelligence.





Artificial Neural Network Architecture
1. There are three layers in the network architecture: the input layer, the hidden layer (there can be more than one), and the output layer. Because of the numerous layers, it is sometimes referred to as an MLP (Multi-Layer Perceptron).
2. It is possible to think of the hidden layer as a "distillation layer," which extracts some of the most relevant patterns from the inputs and sends them on to the next layer for further analysis. It accelerates and improves the efficiency of the network by recognizing just the most important information from the inputs and discarding the redundant information.
3. The activation function is important for two reasons:
• It captures the presence of non-linear relationships between the inputs.
• It contributes to the conversion of the input into a more usable output.
Artificial Neural Network Architecture
4. Finding the optimal values of the weights W that minimize prediction error is critical to building a successful model. The "backpropagation algorithm" does this, turning the ANN into a learning algorithm that learns from its mistakes.
5. The optimization approach uses a "gradient descent" technique to reduce prediction error. To find the optimum values of W, small adjustments to W are tried, and the impact on prediction error is examined. Those W values are finally chosen as ideal, since further changes to W do not reduce the error.



Loss Function
• In artificial neural networks (ANN), a loss function, also known as a cost
function or objective function, measures the disparity between the
predicted output and the actual target output.
• It quantifies the model's performance by calculating the difference
between predicted values and the ground truth across all training samples.
• The primary goal of training an ANN is to minimize this loss function.
• Minimizing the loss function involves adjusting the model's parameters
(weights and biases) through optimization algorithms like gradient descent
or its variants. These adjustments aim to make the predicted output as
close as possible to the actual target output.
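As a concrete illustration, here is a minimal mean squared error (MSE) loss in Python; the function name and sample numbers are mine:

```python
import numpy as np

def mse(y_true, y_pred):
    # Mean squared difference between targets and predictions
    return np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)

# Predictions close to the targets give a small loss
print(mse([1.0, 0.0, 1.0], [0.9, 0.2, 0.7]))  # ~0.0467
```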



Hyperparameters of ANN…cont'd
• Hyperparameters in artificial neural networks (ANNs) are parameters whose
values are set before the training process begins.
• They are not learned during the training phase but are crucial as they determine
the architecture, behavior, and performance of the neural network. Tuning these
hyperparameters can significantly impact the network's learning ability and final
performance. Some of the key hyperparameters in ANNs include:

1. Number of Hidden Layers
2. Number of Neurons in Each Layer
3. Activation Functions
4. Learning Rate
5. Batch Size
6. Epochs
7. Optimizer
8. Regularization Parameters
9. Dropout Rate
10. Initialization Method
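To make the list concrete, these ten hyperparameters could be collected in a configuration such as the following sketch; every name and value here is hypothetical, not a specific library's API:

```python
# Hypothetical hyperparameter configuration for a small ANN
hyperparams = {
    "num_hidden_layers": 2,         # 1. number of hidden layers
    "neurons_per_layer": [64, 32],  # 2. neurons in each hidden layer
    "activation": "relu",           # 3. activation function
    "learning_rate": 1e-3,          # 4. step size for gradient descent
    "batch_size": 32,               # 5. samples per parameter update
    "epochs": 50,                   # 6. full passes over the training set
    "optimizer": "adam",            # 7. optimization algorithm
    "l2_penalty": 1e-4,             # 8. regularization strength
    "dropout_rate": 0.2,            # 9. fraction of neurons dropped per step
    "weight_init": "he_normal",     # 10. weight initialization method
}
```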



Hyperparameters of ANN…cont'd
1. Number of Hidden Layers: This refers to the layers between the input
and output layers in the neural network. Determining the right number
of hidden layers and neurons in each layer is essential for the network's
capacity to learn complex patterns without overfitting or underfitting.
2. Number of Neurons in Each Layer: The number of neurons or units in
each hidden layer is a critical hyperparameter. Too few neurons might
result in underfitting, while too many might lead to overfitting.
3. Activation Functions: These functions introduce non-linearity into the network. Choosing the appropriate activation function (such as ReLU (Rectified Linear Unit), Sigmoid, Tanh, etc.) for each layer can significantly impact learning and convergence; see the sketch below.
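A small NumPy sketch of three common activation functions (illustrative):

```python
import numpy as np

def relu(z):     return np.maximum(0.0, z)        # output in [0, inf)
def sigmoid(z):  return 1.0 / (1.0 + np.exp(-z))  # output in (0, 1)

z = np.array([-2.0, 0.0, 2.0])
print(relu(z), sigmoid(z), np.tanh(z))  # tanh gives output in (-1, 1)
```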
Hyperparameters of ANN…cont'd
4. Learning Rate: This hyperparameter controls the size of the steps taken during
the optimization process (e.g., gradient descent) to update the model's
parameters. A high learning rate may lead to overshooting the optimal values,
while a low learning rate may slow down convergence.
5. Batch Size: It refers to the number of training samples utilized in one iteration.
Smaller batch sizes might provide more noise in the parameter updates but can
converge faster, while larger batch sizes might offer more accurate gradients
but can be computationally expensive.
6. Epochs: An epoch represents one complete pass through the entire training
dataset. The number of epochs determines how many times the learning
algorithm will work through the entire training dataset. It's a trade-off between
computational resources and the model's performance.
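The relationship between dataset size, batch size, and epochs is simple arithmetic; a quick sketch with illustrative numbers:

```python
import math

n_samples, batch_size, epochs = 10_000, 32, 50
steps_per_epoch = math.ceil(n_samples / batch_size)  # 313 updates per epoch
total_updates = steps_per_epoch * epochs             # 15,650 updates overall
print(steps_per_epoch, total_updates)
```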
Hyperparameters of ANN
7. Optimizer: The optimization algorithm used to update the weights
and biases of the neural network during training, such as Stochastic
Gradient Descent (SGD), Adam, RMSprop, etc.
8. Regularization Parameters: Techniques like L1 or L2 regularization
(or a combination known as Elastic Net) to prevent overfitting by
penalizing large weights.
9. Dropout Rate: Dropout is a regularization technique where a
certain proportion of neurons are randomly dropped out during
each training iteration to prevent overfitting.
10. Initialization Method: The method used to initialize the weights of
the neural network, such as random initialization, Xavier
initialization, or He initialization.
Gradient Descent algorithm for training
artificial neural networks (ANNs)
• Imagine you're on a mountain in thick fog and want to get to the
bottom, but you can't see the way down. What you do is feel the
slope of the ground beneath your feet and take small steps downhill.
By repeatedly checking the slope and moving in the steepest
downward direction, you eventually reach the base of the mountain.
• Similarly, Gradient Descent is like finding the way down a
mathematical "mountain" to minimize a "cost" or "loss" function in
an ANN during training. The goal is to adjust the neural network's
parameters (weights and biases) to minimize the difference between
predicted outputs and actual targets.



Gradient Descent algorithm-step-by-step
1. Initialize Parameters: Start with random values for the weights and
biases in the neural network.
2. Make Predictions: Pass the input data through the network to
make predictions. These initial predictions are likely far from the
actual target values.
3. Calculate Error: Measure how far off these predictions are from the
actual targets using a loss function (e.g., Mean Squared Error).
4. Calculate Gradient: Compute the gradient, which represents the slope of the loss function with respect to each parameter (weight and bias). The gradient points in the direction of the steepest increase in the loss.
Gradient Descent
• Gradient descent is a method of updating b0 and b1 to reduce the cost function (MSE).
• The idea is that we start with some values for b0 and b1 and then change these values iteratively to reduce the cost.
• Gradient descent tells us how to change the values.
• To draw an analogy, imagine a pit in the shape of a U. You are standing at the topmost point of the pit, and your objective is to reach the bottom. There is a catch: you can only take a discrete number of steps to reach the bottom.
• If you decide to take one small step at a time, you will eventually reach the bottom of the pit, but this will take longer.
• If you choose to take longer steps each time, you will get there sooner, but there is a chance that you could overshoot the bottom of the pit and not land exactly at the bottom.
• In the gradient descent algorithm, the size of the steps you take is the learning rate. This decides how fast the algorithm converges to the minima.
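A minimal sketch of this idea, fitting b0 and b1 by gradient descent on toy data (all numbers and names are illustrative):

```python
import numpy as np

# Toy data generated roughly from y = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

b0, b1 = 0.0, 0.0  # start with some values for intercept and slope
lr = 0.05          # learning rate: the size of each step

for _ in range(1000):
    error = (b0 + b1 * x) - y         # prediction error per sample
    grad_b0 = 2 * error.mean()        # dMSE/db0
    grad_b1 = 2 * (error * x).mean()  # dMSE/db1
    b0 -= lr * grad_b0                # step opposite to the gradient
    b1 -= lr * grad_b1

print(round(b0, 2), round(b1, 2))  # converges close to 1 and 2
```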
Gradient Descent algorithm-step-by-step
5. Update Parameters: Adjust the parameters (weights and biases) in the
opposite direction of the gradient, thereby moving towards reducing the loss.
The size of this update is determined by a factor called the learning rate.
6. Iterate: Repeat steps 2 to 5 for multiple iterations or epochs. Each iteration
gradually reduces the loss, and the parameters get closer to values that
minimize the loss function.
7. Stop Criteria: The process stops when either the loss reaches an acceptable
level or after a fixed number of iterations.
The key idea is to adjust the parameters iteratively by "descending" the slope of the loss
function to reach the lowest point, where the error between predictions and actual targets
is minimized.
It's crucial to find a balance in the learning rate: too small, and training might be slow; too large, and the algorithm might overshoot or fail to converge to the minimum.
Avoiding overfitting through Regularization in
ANN…cont’d
• Overfitting occurs in ANNs when the model learns not only the
underlying patterns in the training data but also the noise or random
fluctuations present in that data. This leads to poor generalization,
meaning the model performs well on the training data but fails to
generalize to new, unseen data.
• Regularization techniques in ANNs are used to prevent overfitting by
imposing constraints on the neural network's parameters during
training. Regularization aims to find a balance between fitting the
training data well and ensuring the model generalizes effectively to
unseen data.



Avoiding overfitting through Regularization in ANN..cont’d
• L1 regularization (Lasso)
• Adds a penalty based on the absolute value of the weights, which can drive some weights to exactly zero.
• This effectively removes irrelevant features: the model keeps only the most important ones, which simplifies it.



Avoiding overfitting through Regularization in ANN

• L2 regularization (Ridge)
• It smooths the model, preventing extreme weight values but using
all features.
• It adds a penalty based on the square of the weights, leading to
smaller weights overall but usually not zero.
• It smooths out the model without discarding features completely.
• It spreads the "importance" more evenly across features, reducing
the chance of any one feature dominating and leading to
overfitting.
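As a sketch, both penalties can be added directly to the data loss; this helper and its numbers are illustrative, not from the slides:

```python
import numpy as np

def regularized_loss(data_loss, weights, l1=0.0, l2=0.0):
    # L1 term pushes some weights to exactly zero (feature removal);
    # L2 term shrinks all weights toward zero without discarding features.
    w = np.asarray(weights)
    return data_loss + l1 * np.sum(np.abs(w)) + l2 * np.sum(w ** 2)

print(regularized_loss(0.03, [0.5, -1.2, 0.0], l1=0.01, l2=0.01))  # ~0.0639
```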



Benefits of Artificial Neural Networks
1. Prediction accuracy is generally high.
2. Robust, works when training examples contain errors.
3. Output may be discrete, real-valued, or a vector of several discrete or real-valued attributes.
4. Able to classify patterns on which they have not been trained.
5. ANNs can learn and model non-linear and complicated interactions, which is critical since many of
the relationships between inputs and outputs in real life are non-linear and complex.
6. ANNs can generalize – after learning from the original inputs and their associations, the model can infer unseen relationships from new data, allowing it to generalize and predict on unknown data. They can be used when you have little knowledge of the relationships between attribute values and classes.
7. ANN does not impose any constraints on the input variables, unlike many other prediction
approaches (like how they should be distributed).
8. ANNs can better simulate heteroskedasticity (describes a situation where the variability (spread) of
errors or residuals in a dataset is not constant across all levels of an independent variable), because
of their capacity to discover latent correlations in the data without imposing any preset associations.



Disadvantages of Artificial Neural Networks
1. Hardware dependence:
Artificial Neural Networks require processors with parallel processing power. Realizing a network therefore depends on suitable equipment, and training takes a long time on simple machines.
2. Understanding the network's operation:
This is the most serious issue with ANNs. When an ANN provides an answer, it does not explain why or how that answer was chosen. The learned function (the weights) is difficult to interpret, which erodes confidence in the network.
3. Determining the network structure:
No precise rule determines the structure of artificial neural networks. A suitable network structure is arrived at through experience and trial and error.
4. Difficulty in presenting the problem to the network:
ANNs can only work with numerical data, so problems must be converted into numerical values before being introduced to the network.
5. The training duration is unknown:
Training is considered complete when the network's error on the sample falls to a specific value, but that value does not guarantee the best possible outcome.
6. It is not easy to incorporate domain knowledge.
Applications of ANN….cont’d
• Image and Pattern Recognition: ANNs excel in tasks such as image classification,
object detection, facial recognition, and handwriting recognition (OCR).
Convolutional Neural Networks (CNNs), a specialized type of ANN, are particularly
effective in image-related tasks.
• Natural Language Processing (NLP): ANNs are used for tasks like text
classification, sentiment analysis, machine translation, speech recognition,
language generation, and language modeling. Recurrent Neural Networks (RNNs)
and Transformer-based models like the GPT series and BERT are prominent
architectures in this domain.
• Speech and Audio Recognition: ANNs are employed in speech recognition
systems, speaker identification, speech synthesis, and audio analysis, where they
learn features and patterns from audio data.



Applications of ANN….cont’d
• Predictive Analytics and Time Series Forecasting: ANNs are used to
forecast trends, make predictions, and analyze time-series data in various
fields, including finance, weather forecasting, stock market analysis, and
demand forecasting.
• Medical Diagnosis and Healthcare: ANNs are utilized in medical image
analysis (MRI, CT scans), disease diagnosis, drug discovery, personalized
medicine, patient monitoring, and analyzing medical records to assist in
diagnosis and treatment.
• Financial Services and Trading: ANNs are employed in credit scoring, risk
assessment, fraud detection, algorithmic trading, and stock market analysis
for making predictions and decisions based on historical data.
Applications of ANN
• Robotics and Control Systems: ANNs are used in robotics for tasks like object
recognition, path planning, robot control, and reinforcement learning-based robotic
systems for autonomous decision-making.
• Recommendation Systems: ANNs power recommendation engines in e-commerce,
streaming platforms, and social media to provide personalized suggestions based
on user behavior and preferences.
• Gaming and Virtual Reality: ANNs are used in game development for creating
intelligent non-player characters (NPCs), adaptive game environments, and
improving user experience in virtual reality applications.
• Environmental Modeling and Energy Forecasting: ANNs are applied in
environmental sciences for modeling ecosystems, climate prediction, and in energy
sectors for load forecasting and optimizing energy consumption.



References
• https://www.youtube.com/watch?v=aircAruvnKk
• https://www.youtube.com/watch?v=bfmFfD2RIcg
• https://www.geeksforgeeks.org/hebbian-learning-rule-with-implementation-of-and-gate/
• https://www.w3schools.com/ai/ai_perceptrons.asp
• The following are for multilayer NNs:
• https://www.analyticsvidhya.com/blog/2021/09/introduction-to-artificial-neural-networks/#h-what-is-artificial-neural-network-ann
• https://www.geeksforgeeks.org/artificial-neural-networks-and-its-applications/
• https://medium.com/machine-learning-researcher/artificial-neural-network-ann-4481fa33d85a
• https://aws.amazon.com/what-is/neural-network/
All the BEST

