NNDL – Unit I Notes
NEURAL NETWORKS – INTRODUCTION
Neural networks, also known as artificial neural networks (ANNs) or simulated neural
networks (SNNs), are a subset of machine learning and are at the heart of deep learning
algorithms.
The term "Artificial Neural Network" is derived from Biological neural networks that
develop the structure of a human brain.
Similar to the human brain that has neurons interconnected to one another, artificial
neural networks also have neurons that are interconnected to one another in various
layers of the networks.
The figure below illustrates a typical biological neural network.
A typical artificial neural network looks like the following figure.
Artificial Neural Network (ANN) is a type of neural network based on a feed-forward
strategy.
It is so called because information passes through the nodes in one direction until it
reaches the output node. This is also the simplest type of neural network.
Dendrites receive signals from other neurons, the soma sums all the incoming signals,
and the axon transmits the signals to other cells.
Comparison of Biological and Artificial Neural Networks:
Operating environment – Biological: poorly defined and un-constrained; Artificial: well-defined and well-constrained.
Fault tolerance – Biological: has the potential for fault tolerance; Artificial: performance degrades even on partial damage.
Each node, or artificial neuron, connects to another and has an associated weight and
threshold.
If the output of any individual node is above the specified threshold value, that
node is activated, sending data to the next layer of the network.
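As a toy illustration, this firing rule can be written in a few lines of Python (the inputs, weights, and threshold below are made-up values):

def neuron_fires(inputs, weights, threshold):
    # Weighted sum of inputs; the node "fires" only above the threshold.
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if weighted_sum > threshold else 0

print(neuron_fires([1.0, 0.5], [0.8, -0.2], threshold=0.5))  # prints 1 (0.7 > 0.5)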
Neural networks rely on training data to learn and improve their accuracy over time.
However, once these learning algorithms are fine-tuned for accuracy, they are
powerful tools in computer science and artificial intelligence, allowing us to classify
and cluster data at a high velocity.
Tasks in speech recognition or image recognition can take minutes, versus the hours it
would take human experts to identify the same items manually.
Applications of Artificial Neural Networks:
1. Image and Character Recognition:
ANNs play a significant part in image and character recognition because of their
capacity to take in many inputs, process them, and infer hidden, complicated,
non-linear correlations.
2. Forecasting:
Forecasting problems are frequently complex; for example, predicting stock prices is
complicated by many underlying variables (some known, some unseen).
Traditional forecasting models have flaws when it comes to accounting for these
complicated, non-linear interactions.
Given its capacity to model and extract previously unknown characteristics and
correlations, ANNs can provide a reliable alternative when used correctly.
ANN also has no restrictions on the input and residual distributions, unlike
conventional models.
3. Other Applications of Artificial Neural Networks:
1. ANNs can be used for fraud detection regarding credit cards, insurance, or taxes by
analysing past records.
2. Recommender systems:
ANNs are used in recommendation engines to analyze user preferences and provide
personalized recommendations for products, movies, music, and more.
3. Financial analysis:
ANNs can be employed for tasks like stock market prediction, fraud detection,
credit risk assessment, and algorithmic trading.
4. Medical diagnosis:
ANNs support medical diagnosis by analysing patient data and medical images to help
detect and classify diseases.
5. Autonomous vehicles:
ANNs are crucial for self-driving cars, enabling perception, object recognition,
decision-making, and control.
6. Robotics:
ANNs play a vital role in robotics applications, such as robot motion planning,
object manipulation, and human-robot interaction.
7. Every new technology needs assistance from the previous one, i.e. data from earlier
systems; these data are analysed so that the pros and cons can be studied correctly.
All of this is possible with the help of neural networks.
8. Neural networks can be used in betting on horse races, sporting events, and, most
importantly, the stock market.
9. They can be used to predict the likely judgment for a crime by using a large dataset of
crime details as input and the resulting sentences as output.
10. By analysing data and determining which records are faulty (files diverging from
their peers), data mining, cleaning, and validation can be achieved through neural
networks.
11. Neural networks can be used to predict targets with the help of echo patterns obtained
from sonar, radar, seismic, and magnetic instruments.
12. They can be used efficiently in employee hiring, so that a company can hire the right
employee depending on the skills the employee has and their likely future productivity.
EVOLUTION OF NEURAL NETWORKS
The evolution of neural networks is a fascinating journey that spans several decades.
Here's a brief overview of the key milestones in the evolution of neural networks:
1. Biological Inspiration and Early Models:
- The concept of neural networks was inspired by the structure and function of the
human brain.
- Warren McCulloch and Walter Pitts introduced the first mathematical model of a
neuron in 1943.
2. The Perceptron:
- In the 1950s, Frank Rosenblatt developed the perceptron, a single-layer neural network
capable of binary classification.
3. Limitations and the AI Winter:
- The perceptron's limitations were exposed, particularly its inability to solve problems
that were not linearly separable.
- The perceptron fell out of favor, and research in neural networks waned during the AI
winter.
4. The Rise of SVMs and the Neural Network Winter:
- Support Vector Machines (SVMs) gained popularity for their success in classification
tasks, overshadowing neural networks for a while.
- Lack of computational power and difficulties in training deep networks led to another
decline in interest, known as the "neural network winter."
5. The Deep Learning Resurgence:
- The advent of powerful GPUs and the availability of large datasets enabled the training
of deeper neural networks.
- Key contributions from researchers like Geoffrey Hinton, Yann LeCun, and Yoshua
Bengio played a crucial role in the resurgence of neural networks.
6. The Deep Learning Era:
- Deep learning has become the dominant paradigm in machine learning, achieving
remarkable success in various applications such as image recognition, natural language
processing, and speech recognition.
- Architectures like Long Short-Term Memory (LSTM) networks and Transformers have
been instrumental in handling sequential and attention-based tasks.
- Transfer learning and pre-trained models have become common, allowing for efficient
training on smaller datasets.
7. Current Trends:
- Neural architecture search (NAS) and AutoML techniques aim to automate the design of
neural network architectures.
The field continues to evolve rapidly, and new developments emerge every year.
ANN Layers:
➢ Input Layer:
The input layer accepts inputs in several different formats provided by the
programmer and passes them on to the hidden layer.
➢ Hidden Layer:
The hidden layer sits between the input and output layers. It performs
all the calculations needed to find hidden features and patterns.
➢ Output Layer:
The input goes through a series of transformations via the hidden layer, and
the output layer conveys the final result.
The artificial neural network takes the inputs, computes the weighted
sum of the inputs, and includes a bias. This computation is represented
in the form of a transfer function:
F(x) = Σ (wᵢ · xᵢ) + b
where wᵢ are the weights, xᵢ the inputs, and b the bias. The weighted total
is passed as input to an activation function, which determines whether a
node fires. Only those nodes that fire make it to the output layer. There are
distinctive activation functions available that can be applied depending on the
sort of task we are performing.
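A minimal sketch of this transfer computation in Python, with a sigmoid chosen as an illustrative activation function (the weights and bias are made-up values):

import math

def transfer(inputs, weights, bias):
    # F(x) = w1*x1 + w2*x2 + ... + wn*xn + b
    return sum(x * w for x, w in zip(inputs, weights)) + bias

def sigmoid(z):
    # Squashes the weighted total into (0, 1): the node's firing strength.
    return 1.0 / (1.0 + math.exp(-z))

z = transfer([0.6, 0.9], [0.4, 0.7], bias=0.1)
print(sigmoid(z))  # about 0.73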
Architecture Diagram:
There are three layers in the network architecture: the input layer, the hidden layer
(there can be more than one), and the output layer. Because of its numerous layers,
this architecture is sometimes referred to as a Multi-Layer Perceptron (MLP).
Architecture of ANN
It is possible to think of the hidden layer as a “distillation layer,” which extracts some
of the most relevant patterns from the inputs and sends them on to the next layer for
further analysis.
It accelerates and improves the efficiency of the network by retaining just the
most important information from the inputs and discarding the redundant
information.
The activation function is important for two reasons: first, it captures the presence
of non-linear relationships between the inputs; second, it helps convert the input
into a more useful output.
Finding the “optimal values W - weights” that minimize prediction error is critical to
building a successful model
The backpropagation algorithm makes this possible by turning the ANN into a
learning algorithm that learns from its mistakes.
The optimization approach uses a "gradient descent" technique to quantify prediction
errors.
To find the optimum values for W, small adjustments to W are tried, and their impact
on the prediction error is examined.
Finally, those values of W are chosen as optimal, since further changes to W do not
reduce the error.
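The procedure can be sketched with a single weight: nudge W, observe the squared error, and step downhill (a toy example; the data point and learning rate are made up):

# Fit y = W * x to the point (x=2, y=4) by gradient descent.
x, y = 2.0, 4.0
W, lr = 0.0, 0.1            # initial weight and learning rate

for step in range(50):
    error = W * x - y       # prediction error
    grad = 2 * error * x    # gradient of error**2 with respect to W
    W -= lr * grad          # small adjustment that reduces the error
print(W)                    # approaches 2.0; further changes no longer help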
Types of Artificial Neural Networks:
The majority of artificial neural networks bear some similarity to their more complex
biological counterparts and are very effective at their intended tasks.
1. Feedback ANN:
In this type of ANN, the output is fed back into the network to internally achieve
the best-evolved results.
Feedback networks feed information back into themselves and are well suited to
solving optimization problems. Internal system error corrections utilize
feedback ANNs.
2. Feed-Forward ANN:
By assessing the network's output relative to its input, the strength of the
network can be observed from the collective behavior of the associated neurons,
and the output is determined.
The primary advantage of this network is that it learns to evaluate and
recognize input patterns.
The information flows only in one direction, from the input layer through
the hidden layers to the output layer.
Each neuron in the network processes the input data and passes the output
to the next layer without any feedback loop.
FNNs are commonly used for tasks such as classification and regression.
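A minimal NumPy sketch of this one-directional flow through a single hidden layer (the layer sizes and random weights are arbitrary assumptions):

import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 4)), np.zeros(4)  # input (3) -> hidden (4)
W2, b2 = rng.normal(size=(4, 2)), np.zeros(2)  # hidden (4) -> output (2)

def forward(x):
    h = np.tanh(x @ W1 + b1)  # hidden layer; no feedback loop anywhere
    return h @ W2 + b2        # output layer

print(forward(np.array([0.5, -1.0, 2.0])))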
3. Radial Basis Function (RBF) Networks:
These networks evaluate the distance between the input data and a set of
learned centers in a multidimensional space.
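A sketch of one Gaussian radial basis unit, whose response shrinks with distance from its learned center (the centers and width are illustrative):

import numpy as np

def rbf(x, center, sigma=1.0):
    # Largest response when x sits on the center; decays with distance.
    return np.exp(-np.sum((x - center) ** 2) / (2 * sigma ** 2))

centers = np.array([[0.0, 0.0], [1.0, 1.0]])
x = np.array([0.9, 1.1])
print([rbf(x, c) for c in centers])  # the nearer center responds more strongly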
4. Kohonen Self-Organizing Neural Network:
The Kohonen Self-Organizing Neural Network, named after its inventor
Teuvo Kohonen, is an unsupervised learning network.
The network organizes its neurons into a grid, and during the learning
process, similar input patterns lead to the activation of nearby neurons,
causing the network to self-organize and learn the underlying data
distribution.
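One learning step of this self-organization can be sketched as: find the best-matching neuron on the grid and pull its weights toward the input (a toy sketch; a full SOM also updates the winner's grid neighbours and shrinks the neighbourhood over time):

import numpy as np

rng = np.random.default_rng(1)
weights = rng.random((5, 5, 2))  # a 5x5 grid of neurons with 2-D weights

def som_step(x, weights, lr=0.5):
    dists = np.linalg.norm(weights - x, axis=2)              # distance to every neuron
    i, j = np.unravel_index(np.argmin(dists), dists.shape)   # best-matching unit
    weights[i, j] += lr * (x - weights[i, j])                # move winner toward x
    return (i, j)

print(som_step(np.array([0.2, 0.8]), weights))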
5. Long Short-Term Memory (LSTM) Networks:
LSTMs are designed to address the vanishing gradient problem, which can
occur in traditional RNNs when learning long-term dependencies.
LSTMs use memory cells with gating mechanisms that allow them to
remember or forget information over extended sequences.
They are widely used in tasks involving sequential data, such as language
modeling, machine translation, and speech recognition.
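A minimal usage sketch with PyTorch's nn.LSTM, assuming PyTorch is available (the batch, sequence, and feature sizes are arbitrary):

import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)
x = torch.randn(4, 10, 8)   # 4 sequences, 10 time steps, 8 features each
out, (h, c) = lstm(x)       # c is the memory cell state maintained by the gates
print(out.shape)            # torch.Size([4, 10, 16])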
Basic Terminology:
1) Neuron: A fundamental unit of a neural network that receives input, applies an activation
function, and produces an output.
2) Input Layer: The first layer of a neural network that receives the initial input data.
3) Hidden Layer: Intermediate layers between the input and output layers that perform
computations and feature extraction.
4) Output Layer: The final layer of a neural network that produces the desired output
or prediction.
5) Activation Function: A mathematical function applied to the output of a neuron to
introduce non-linearity and control the neuron's firing behavior.
6) Weight: A parameter associated with each connection between neurons, determining the
strength or importance of the connection.
7) Bias: An additional parameter added to each neuron that allows for shifting the activation
function.
8) Forward Propagation: The process of passing input data through a neural network to
compute the output.
9) Backpropagation: An algorithm for updating the weights and biases of a neural network
by propagating the error from the output layer back to the input layer.
10) Loss Function: A function that quantifies the difference between the predicted output of
a neural network and the true output, used to guide the training process.
11) Gradient Descent: An optimization algorithm used to minimize the loss function by
iteratively adjusting the weights and biases of the neural network.
12) Epoch: One complete pass through the entire training dataset during the training phase of
a neural network.
13) Batch Size: The number of training examples used in each iteration of gradient descent
during training.
14) Learning Rate: A hyperparameter that determines the step size at each iteration of
gradient descent, influencing the rate at which the neural network learns.
15) Dropout: A regularization technique that randomly drops out a certain percentage of
neurons during training to prevent overfitting
16) Overfitting: A condition where a neural network performs well on the training data but
fails to generalize to unseen data due to excessively fitting the training data.
17) Activation Layer: A layer in a neural network that applies an activation function to its
inputs.
18) Recurrent Neural Network (RNN): A type of neural network designed for sequential
data processing, capable of capturing dependencies and patterns over time.
19) Long Short-Term Memory (LSTM): A variant of RNN that addresses the vanishing
gradient problem and is well-suited for learning long-term dependencies.
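To make the activation-function entries above concrete, here is a sketch of three common choices (the input values are made up):

import numpy as np

def sigmoid(z): return 1 / (1 + np.exp(-z))   # squashes to (0, 1)
def tanh(z):    return np.tanh(z)             # squashes to (-1, 1)
def relu(z):    return np.maximum(0, z)       # zero for negative inputs

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z), tanh(z), relu(z))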
➢ Feedforward Neural Networks (FNNs):
1. These are the most basic type of ANN, where information flows in a single direction,
from the input layer to the output layer.
2. They are commonly used for tasks such as pattern recognition, classification, and
regression.
➢ Convolutional Neural Networks (CNNs):
1. CNNs are designed for processing grid-like data, most notably images.
2. They employ specialized layers, such as convolutional and pooling layers, to extract
features from images and learn hierarchical representations.
➢ Recurrent Neural Networks (RNNs):
1. RNNs are designed for sequential data analysis, where the output of a previous step is
fed back as input to the current step (see the sketch after this list).
2. They are used for tasks like natural language processing, speech recognition, and time
series analysis.
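The feedback in point 1 can be sketched as a single recurrent step, where the previous hidden state is combined with the current input (the shapes and random weights are illustrative):

import numpy as np

rng = np.random.default_rng(2)
Wx = rng.normal(size=(3, 5))  # input -> hidden
Wh = rng.normal(size=(5, 5))  # hidden -> hidden: the feedback path

def rnn_step(x_t, h_prev):
    # The previous step's output feeds back into the current step.
    return np.tanh(x_t @ Wx + h_prev @ Wh)

h = np.zeros(5)
for x_t in rng.normal(size=(4, 3)):  # a sequence of 4 inputs
    h = rnn_step(x_t, h)
print(h)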
➢ Generative Adversarial Networks (GANs):
1. GANs consist of two networks, a generator and a discriminator, that compete against
each other to generate realistic data samples.
2. They have been used for tasks such as image synthesis, style transfer, and
data augmentation.
SUPERVISED LEARNING
Supervised learning, also known as supervised machine learning, is a subcategory of machine learning
and artificial intelligence.
It is defined by its use of labelled datasets to train algorithms to classify data or predict outcomes
accurately.
As input data is fed into the model, it adjusts its weights until the model has been fitted appropriately,
which occurs as part of the cross validation process.
Supervised learning helps organizations solve for a variety of real-world problems at scale, such as
classifying spam in a separate folder from your inbox.
Supervised learning uses a training set to teach models to yield the desired output. This training dataset
includes inputs and correct outputs, which allow the model to learn over time.
The algorithm measures its accuracy through the loss function, adjusting until the error has been
sufficiently minimized.
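That training loop can be sketched in plain Python: predict, measure the loss, and adjust the weights until the error is small (the labelled data points are made up):

# Fit y = w*x + b to labelled examples by minimizing squared error.
data = [(1.0, 3.1), (2.0, 4.9), (3.0, 7.2)]  # (input, correct output)
w, b, lr = 0.0, 0.0, 0.05

for epoch in range(200):
    for x, y in data:
        error = (w * x + b) - y   # loss for this example is error**2
        w -= lr * 2 * error * x   # adjust weights to reduce the loss
        b -= lr * 2 * error
print(w, b)  # roughly w ≈ 2, b ≈ 1 for this data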
Supervised learning can be separated into two types of problems when data mining:
classification and regression.
Classification:
Classification uses an algorithm to accurately assign test data into specific categories.
It recognizes specific entities within the dataset and attempts to draw some conclusions on how
those entities should be labeled or defined.
Common classification algorithms are linear classifiers, support vector machines (SVM), decision
trees, k-nearest neighbor, and random forest, which are described in more detail below.
Regression:
Regression is used to understand the relationship between dependent and independent variables.
It is commonly used to make projections, such as for sales revenue for a given business. Linear
regression, logistic regression, and polynomial regression are popular regression algorithms.
Various algorithms and computational techniques are used in supervised machine learning processes.
Below are brief explanations of some of the most commonly used learning methods, typically calculated
through the use of programs like R or Python:
Neural networks:
O Primarily leveraged for deep learning algorithms, neural networks process training data by mimicking
the interconnectivity of the human brain through layers of nodes.
O If that output value exceeds a given threshold, it "fires" or activates the node, passing data to
the next layer in the network.
O Neural networks learn this mapping function through supervised learning, adjusting based
on the loss function through the process of gradient descent.
O When the cost function is at or near zero, we can be confident in the model's accuracy to yield
the correct answer.
Naive Bayes:
Naive Bayes is a classification approach that adopts the principle of class-conditional
independence from Bayes' theorem.
This means that the presence of one feature does not impact the presence of another in the
probability of a given outcome, and each predictor has an equal effect on that result.
There are three types of Naïve Bayes classifiers: Multinomial Naïve Bayes, Bernoulli Naïve
Bayes, and Gaussian Naïve Bayes.
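A minimal usage sketch with scikit-learn's GaussianNB, assuming scikit-learn is installed (the tiny dataset is made up):

from sklearn.naive_bayes import GaussianNB

X = [[1.0, 2.0], [1.2, 1.8], [6.0, 9.0], [5.8, 8.5]]  # feature vectors
y = [0, 0, 1, 1]                                      # class labels

model = GaussianNB()                 # Gaussian variant: continuous features
model.fit(X, y)                      # each predictor treated independently
print(model.predict([[1.1, 2.1]]))   # -> [0]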
Linear regression:
O Linear regression is used to identify the relationship between a dependent variable and
one or more independent variables and is typically leveraged to make predictions about
future outcomes.
O When there is only one independent variable and one dependent variable, it is known as
simple linear regression.
O For each type of linear regression, it seeks to plot a line of best fit, which is calculated
through the method of least squares.
O However, unlike other regression models, this line is straight when plotted on a graph.
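The least-squares line of best fit can be sketched directly with NumPy (the data points are made up):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.9, 5.1, 7.0, 9.2])

# Method of least squares: slope and intercept of the best-fit line.
slope, intercept = np.polyfit(x, y, deg=1)
print(slope, intercept)  # roughly 2.08 and 0.85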
Logistic regression:
O While linear regression is leveraged when dependent variables are continuous, logistic
regression is selected when the dependent variable is categorical, meaning it has
binary outputs, such as "true" and "false" or "yes" and "no."
O While both regression models seek to understand relationships between data inputs,
logistic regression is mainly used to solve binary classification problems, such as spam
identification.
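A sketch of binary classification with scikit-learn's LogisticRegression, assuming scikit-learn is installed (the toy spam-score data is made up):

from sklearn.linear_model import LogisticRegression

X = [[0.1], [0.4], [2.5], [3.0]]  # e.g. a single "spammy words" score
y = [0, 0, 1, 1]                  # 0 = not spam, 1 = spam

clf = LogisticRegression().fit(X, y)
print(clf.predict([[0.2], [2.8]]))  # -> [0 1]
print(clf.predict_proba([[2.8]]))   # class probabilities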
Support vector machines (SVM):
O A support vector machine is a popular supervised learning model used for classification,
constructing a hyperplane where the distance between two classes of data points is at its
maximum.
O This hyperplane is known as the decision boundary, separating the classes of data points
(e.g., oranges vs. apples) on either side of the plane.
K-nearest neighbor:
O K-nearest neighbor, also known as the KNN algorithm, is a non-parametric algorithm that
classifies data points based on their proximity and association to other available data.
O This algorithm assumes that similar data points can be found near each other.
O As a result, it seeks to calculate the distance between data points, usually through
Euclidean distance, and then assigns a category based on the most frequent category
or average.
O However, as the test dataset grows, the processing time lengthens, making it less
appealing for classification tasks. KNN is typically used for recommendation
engines and image recognition.
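A from-scratch sketch of the KNN idea with Euclidean distance and a majority vote (the toy points and k=3 are made up for illustration):

import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    dists = np.linalg.norm(np.asarray(X_train) - np.asarray(x), axis=1)
    nearest = np.argsort(dists)[:k]             # the k closest points
    votes = [y_train[i] for i in nearest]
    return Counter(votes).most_common(1)[0][0]  # most frequent category

X = [[0, 0], [0, 1], [5, 5], [6, 5], [1, 0]]
y = ['a', 'a', 'b', 'b', 'a']
print(knn_predict(X, y, [0.5, 0.5]))  # -> 'a'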
Supervised learning models can be used to build and advance a number of business applications,
including the following:
Image and object recognition:
Supervised learning algorithms can be used to locate, isolate, and categorize objects in videos
or images, making them useful when applied to various computer vision techniques and imagery
analysis.
Predictive analytics:
A widespread use case for supervised learning models is in creating predictive analytics systems
to provide deep insights into various business data points.
This allows enterprises to anticipate certain results based on a given output variable, helping
business leaders justify decisions or pivot for the benefit of the organization.
This can be incredibly useful when gaining a better understanding of customer interactions and
can be used to improve brand engagement efforts.
Spam detection:
Using supervised classification algorithms, organizations can train models to recognize
patterns or anomalies in new data, organizing spam and non-spam correspondence effectively.
Challenges of supervised learning:
Supervised learning models can require certain levels of expertise to structure accurately.
Datasets can have a higher likelihood of human error, resulting in algorithms learning incorrectly.