Neural Networks
UNIT 1: INTRODUCTION
Neural Networks-Application Scope of Neural Networks-Artificial Neural Network: An
Introduction – Perceptron Learning Algorithm - Activation Functions – Need for non-
linear activation functions – Chain Rule and Backpropagation – Deep Neural Networks
– Shallow vs Deep Networks
Neural Networks:
In the fast-evolving era of artificial intelligence, Deep Learning stands as a cornerstone
technology, revolutionizing how machines understand, learn, and interact with complex
data. At its essence, Deep Learning AI mimics the intricate neural networks of the human
brain, enabling computers to autonomously discover patterns and make decisions from
vast amounts of unstructured data. This transformative field has propelled breakthroughs
across various domains, from computer vision and natural language processing to
healthcare diagnostics and autonomous driving.
What is Deep Learning?
Deep learning is the branch of machine learning based on artificial neural network architectures. An artificial neural network (ANN) uses layers of interconnected nodes, called neurons, that work together to process and learn from the input data.
In a fully connected Deep neural network, there is an input layer and one or more hidden
layers connected one after the other. Each neuron receives input from the previous layer
neurons or the input layer. The output of one neuron becomes the input to other neurons
in the next layer of the network, and this process continues until the final layer produces
the output of the network. The layers of the neural network transform the input data
through a series of nonlinear transformations, allowing the network to learn complex
representations of the input data.
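As a minimal sketch of this forward flow in Python (the layer sizes, random weights, and the tanh nonlinearity here are illustrative assumptions, not values from the text):

import numpy as np

def forward(x, layers):
    # Pass input x through a list of (W, b) layers, applying a nonlinear
    # transformation at each layer; the final activation is the network output.
    a = x
    for W, b in layers:
        a = np.tanh(W @ a + b)
    return a

rng = np.random.default_rng(0)
# A toy fully connected network: 3 inputs -> 4 hidden neurons -> 2 outputs
layers = [(rng.normal(size=(4, 3)), np.zeros(4)),
          (rng.normal(size=(2, 4)), np.zeros(2))]
print(forward(np.array([1.0, 0.5, -0.2]), layers))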
Loss Function: A loss function measures how far the network's predicted output is from the actual output; training seeks to make this error as small as possible.
Gradient Descent: Gradient descent is then used by the network to reduce the loss. To lower the error, each weight is adjusted based on the derivative of the loss with respect to that weight (see the sketch after this list).
Adjusting weights: The weights are adjusted at each connection by applying
this iterative process, or backpropagation, backward across the network.
Training: During training with different data samples, the entire process of
forward propagation, loss calculation, and backpropagation is done iteratively,
enabling the network to adapt and learn patterns from the data.
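As a sketch of how a single weight moves under gradient descent (the data point, initial weight, and learning rate below are illustrative assumptions):

# Gradient descent on one weight for the squared-error loss L(w) = (w*x - y)^2
x, y = 2.0, 6.0      # one training sample; the weight that fits it exactly is 3.0
w, lr = 0.0, 0.05    # initial weight and learning rate

for step in range(50):
    y_pred = w * x
    grad = 2 * (y_pred - y) * x   # derivative of the loss with respect to w
    w -= lr * grad                # step against the gradient to reduce the loss
print(w)                          # approaches 3.0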
1. Computer vision:
The first deep learning application is computer vision. In computer vision, deep learning models enable machines to identify and understand visual data. Some of the main applications of deep learning in computer vision include:
Object detection and recognition: Deep learning models can identify and locate objects within images and videos, enabling applications such as self-driving cars, surveillance, and robotics.
Image classification: Deep learning models can be used to classify images into
categories such as animals, plants, and buildings. This is used in applications
such as medical imaging, quality control, and image retrieval.
Image segmentation: Deep learning models can segment an image into different regions, making it possible to identify specific features within images.
2. Natural language processing (NLP):
The second deep learning application is NLP. In NLP, deep learning models enable machines to understand and generate human language. Some of the main applications of deep learning in NLP include:
Automatic text generation: Deep learning models can learn from a corpus of text, and new text such as summaries and essays can then be generated automatically by the trained models.
Language translation: Deep learning models can translate text from one
language to another, making it possible to communicate with people from
different linguistic backgrounds.
Sentiment analysis: Deep learning models can analyze the sentiment of a piece
of text, making it possible to determine whether the text is positive, negative, or
neutral. This is used in applications such as customer service, social media
monitoring, and political analysis.
Speech recognition: Deep learning models can recognize and transcribe
spoken words, making it possible to perform tasks such as speech-to-text
conversion, voice search, and voice-controlled devices.
3. Reinforcement learning:
In reinforcement learning, deep learning is used to train agents that take actions in an environment so as to maximize a reward. Some of the main applications of deep learning in reinforcement learning include:
Game playing: Deep reinforcement learning models have been able to beat
human experts at games such as Go, Chess, and Atari.
Robotics: Deep reinforcement learning models can be used to train robots to
perform complex tasks such as grasping objects, navigation, and manipulation.
Control systems: Deep reinforcement learning models can be used to control
complex systems such as power grids, traffic management, and supply chain
optimization.
Perceptron Learning Algorithm:
Concept
Binary Classification: The perceptron aims to classify input data into one of two
possible classes.
Linear Separability: The algorithm works best when the two classes are linearly
separable, meaning a straight line (or hyperplane in higher dimensions) can separate
the data points of the two classes.
Components
Weights (w): The algorithm maintains a set of weights, one for each feature in the
input data.
Bias (b): An additional parameter that allows the decision boundary to shift so that it need not pass through the origin.
Activation Function: A step function that outputs 1 if the weighted sum of the inputs
is greater than or equal to a threshold (usually 0), and -1 otherwise.
Algorithm Steps
1. Initialization: Initialize the weights and bias to small random values (or simply to zero).
2. For each training example (x, y):
   o Compute the output using the current weights and bias:
     y_pred = sign(w · x + b)
   o If there is a misclassification, i.e. y_true ≠ y_pred, update the weights and bias:
     w := w + Δw, where Δw = η · (y_true · x)
     b := b + Δb, where Δb = η · y_true
   o Here, η is the learning rate, a positive constant that determines the step size.
Pseudocode
initialize weights w and bias b to small random values
set learning rate η
repeat until convergence:
    for each misclassified example (x, y): update w and b as above
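A runnable Python version of this pseudocode (a minimal sketch; the NumPy encoding, function name, and epoch limit are my own choices, not from the text):

import numpy as np

def perceptron_train(X, y, lr=1.0, epochs=100):
    # X is (n_samples, n_features); y holds class labels in {-1, +1}.
    w = np.zeros(X.shape[1])   # one weight per feature
    b = 0.0                    # bias
    for _ in range(epochs):
        errors = 0
        for xi, yi in zip(X, y):
            y_pred = 1 if (w @ xi + b) >= 0 else -1   # step activation
            if y_pred != yi:                          # misclassification
                w += lr * yi * xi                     # w := w + η·y_true·x
                b += lr * yi                          # b := b + η·y_true
                errors += 1
        if errors == 0:        # converged: every sample classified correctly
            break
    return w, b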
Example
Let's consider a simple example with two features and binary class labels:
Training data:
Starting with initial weights (0, 0) and bias 0, and a learning rate of 1, the algorithm updates
the weights and bias based on misclassifications until it finds a suitable separating
hyperplane.
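The training data table is not reproduced in the text; as a stand-in, here is how the routine above behaves on a hypothetical linearly separable set (an OR-style problem with labels in {-1, +1}):

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([-1, 1, 1, 1])   # hypothetical labels: logical OR of the two features

w, b = perceptron_train(X, y, lr=1.0)
print(w, b)   # reaches a separating hyperplane, e.g. w = (1, 1) with b = -1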
Activation Functions:
The net input to a neuron can take any value from −∞ to +∞, so the neuron on its own has no way to bound this value or decide on a firing pattern. The activation function is therefore an important part of an artificial neural network: it decides whether a neuron should be activated and bounds the value of the net input. The activation function is a non-linear transformation applied to the input before sending it to the next layer of neurons or finalizing it as output.
A threshold activation function (or simply the activation function, also known as a squashing function) produces an output signal only when the input signal exceeds a specific threshold value. It is similar in behavior to a biological neuron, which transmits a signal only when the total input meets the firing threshold.
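A sketch contrasting a threshold (step) activation with a smooth alternative; the threshold of 0 is the usual convention and is assumed here:

import numpy as np

def step(z, threshold=0.0):
    # Fires (outputs 1) only when the net input reaches the threshold.
    return np.where(z >= threshold, 1.0, 0.0)

def sigmoid(z):
    # Squashes any net input into the bounded range (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-5.0, -0.5, 0.0, 2.0])   # net inputs can range over the real line
print(step(z))      # [0. 0. 1. 1.]
print(sigmoid(z))   # bounded values in (0, 1)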
Chain Rule and Backpropagation:
The chain rule is a fundamental concept in calculus used to find the derivative of a composite function. If you have two functions, f and g, and you want to differentiate their composition h(x) = f(g(x)), the chain rule states that:

h′(x) = f′(g(x)) · g′(x)
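For example, with f(u) = sin(u) and g(x) = x², the composition is h(x) = sin(x²) and the chain rule gives h′(x) = cos(x²) · 2x. A quick numerical check in Python (the sample point 1.5 is arbitrary):

import math

def h(x):
    return math.sin(x ** 2)   # composite f(g(x)) with f = sin, g = square

x = 1.5
analytic = math.cos(x ** 2) * 2 * x              # chain rule: f'(g(x)) * g'(x)
eps = 1e-6
numeric = (h(x + eps) - h(x - eps)) / (2 * eps)  # central finite difference
print(analytic, numeric)                         # the two agree closely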
In neural networks, backpropagation applies this rule repeatedly, layer by layer, to compute the gradient of the loss with respect to every weight. One training step proceeds as follows:
1. Forward Pass: Compute the predicted output by passing the input through the
network.
2. Compute Loss: Calculate the loss (or error) using a loss function, such as mean
squared error (MSE) or cross-entropy loss.
3. Backward Pass:
o Compute Gradients: Use the chain rule to compute the gradient of the loss
function with respect to each weight in the network. This is where the chain
rule plays a crucial role.
o Update Weights: Adjust the weights by subtracting a fraction of the gradient
(learning rate) from the current weights.
Consider a simple neural network with one input layer, one hidden layer, and one output layer; the sketch below walks through the gradient computation for such a network.
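A minimal sketch of these gradients (scalar input and output, one sigmoid hidden unit, squared-error loss; the sizes, the loss, and the initial parameter values are my own simplifying assumptions):

import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# One input -> one sigmoid hidden unit -> one linear output
x, y_true = 0.5, 1.0
w1, b1, w2, b2 = 0.3, 0.0, -0.2, 0.0   # illustrative initial parameters

# Forward pass
z1 = w1 * x + b1
a1 = sigmoid(z1)
y_pred = w2 * a1 + b2
loss = 0.5 * (y_pred - y_true) ** 2

# Backward pass: each gradient is a product of local derivatives (chain rule)
dL_dy = y_pred - y_true           # dL/dy_pred
dL_dw2 = dL_dy * a1               # dL/dw2 = dL/dy_pred * dy_pred/dw2
dL_db2 = dL_dy
dL_da1 = dL_dy * w2               # propagate the error into the hidden layer
dL_dz1 = dL_da1 * a1 * (1 - a1)   # sigmoid'(z1) = a1 * (1 - a1)
dL_dw1 = dL_dz1 * x
dL_db1 = dL_dz1
print(dL_dw1, dL_db1, dL_dw2, dL_db2)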
Deep Neural Networks:
Deep Neural Networks (DNNs) are a type of artificial neural network (ANN) with multiple
hidden layers between the input and output layers. These networks are designed to model
complex patterns and relationships in data. They are called "deep" because of the depth
(number of layers) in the network.
A DNN consists of the following components:
1. Input Layer: This layer receives the input data. Each neuron in this layer represents
a feature from the input data.
2. Hidden Layers: These are intermediate layers between the input and output layers.
DNNs have multiple hidden layers, allowing them to learn complex representations.
Each neuron in a hidden layer receives input from neurons in the previous layer and
sends output to neurons in the next layer.
3. Output Layer: This layer produces the final output of the network. The number of
neurons in this layer depends on the task (e.g., one neuron for binary classification,
multiple neurons for multi-class classification).
4. Weights and Biases: Each connection between neurons has an associated weight,
and each neuron has a bias. These parameters are adjusted during training to
minimize the error in the network's predictions.
5. Activation Functions: These functions introduce non-linearity into the network,
allowing it to learn more complex patterns. Common activation functions include
ReLU (Rectified Linear Unit), Sigmoid, and Tanh.
Training a DNN involves adjusting the weights and biases to minimize the error between the
predicted output and the actual output. This is typically done using the backpropagation
algorithm combined with an optimization technique such as gradient descent.
1. Forward Pass: Input data is passed through the network, and the output is
computed.
2. Loss Calculation: The loss function measures the difference between the predicted
output and the actual output. Common loss functions include Mean Squared Error
(MSE) for regression and Cross-Entropy Loss for classification.
3. Backward Pass: The gradients of the loss function with respect to the weights and
biases are computed using the chain rule. This step involves backpropagation.
4. Weight Update: The weights and biases are updated using an optimization
algorithm, such as stochastic gradient descent (SGD) or Adam. The learning rate
controls the size of the updates.
5. Iterative Process: The forward and backward passes are repeated for many iterations (epochs) until the network's performance converges; the sketch below shows these steps as a training loop.
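These five steps map directly onto a typical training loop. Here is a sketch using PyTorch (the architecture, synthetic data, and hyperparameters are illustrative assumptions, not values from the text):

import torch
import torch.nn as nn

# A small DNN: 10 inputs -> two ReLU hidden layers -> 1 output (binary task)
model = nn.Sequential(
    nn.Linear(10, 32), nn.ReLU(),
    nn.Linear(32, 16), nn.ReLU(),
    nn.Linear(16, 1),
)
loss_fn = nn.BCEWithLogitsLoss()                         # cross-entropy loss for binary labels
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # learning rate sets the update size

X = torch.randn(64, 10)                    # synthetic inputs
y = torch.randint(0, 2, (64, 1)).float()   # synthetic binary labels

for epoch in range(100):       # 5. repeat until performance converges
    logits = model(X)          # 1. forward pass
    loss = loss_fn(logits, y)  # 2. loss calculation
    optimizer.zero_grad()
    loss.backward()            # 3. backward pass (backpropagation via the chain rule)
    optimizer.step()           # 4. weight update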