Unit-5: Introduction To Deep Learning: Artificial Neural Networks
In the majority of neural networks, units are interconnected from one layer to another. Each
of these connections has weights that determine the influence of one unit on another unit. As
the data passes from one unit to another, the neural network learns more and more about the
data, eventually producing an output from the output layer.
Dendrites from Biological Neural Network represent inputs in Artificial Neural Networks,
cell nucleus represents Nodes, synapse represents Weights, and Axon represents Output.
Relationship between Biological Neural Network and Artificial Neural Network:
Biological Neural Network        Artificial Neural Network
Dendrites                        Inputs
Cell nucleus                     Nodes
Synapse                          Weights
Axon                             Output
An artificial neural network consists of a large number of artificial neurons, termed units,
arranged in a sequence of layers. Let us look at the various types of layers available in an
artificial neural network.
Artificial Neural Network primarily consists of three layers:
• Input Layer: As the name suggests, it accepts inputs in several different formats
provided by the programmer.
• Hidden Layer: The hidden layer lies between the input and output layers. It
performs all the calculations needed to find hidden features and patterns.
• Output Layer: The input goes through a series of transformations using the hidden
layer, which finally results in output that is conveyed using this layer.
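As a rough illustration of this three-layer structure, the following sketch builds a small fully
connected network with the Keras API. The layer sizes (4 inputs, 8 hidden units, 3 outputs) and
the activation choices are assumed example values, not part of these notes.

# A minimal sketch of the input / hidden / output structure using Keras.
# All sizes and activations below are assumed values for illustration only.
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Input(shape=(4,)),               # input layer: 4 features
    keras.layers.Dense(8, activation="relu"),     # hidden layer: learns patterns
    keras.layers.Dense(3, activation="softmax"),  # output layer: 3 classes
])
model.summary()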
Perceptron EX-OR problem:
The XOR, or "exclusive OR", problem is a classic problem in the field of artificial
intelligence and machine learning. It is a problem that cannot be solved by a single layer
perceptron, and therefore requires a multi-layer perceptron or a deep learning model. This
section aims to provide a comprehensive understanding of the XOR problem and how it can
be solved using a neural network.
The XOR function
The XOR function is a binary function that takes two binary inputs and returns a binary output.
The output is true if the number of true inputs is odd, and false otherwise. In other words, it
returns true if exactly one of the inputs is true, and false otherwise.
The following table shows the truth table for the XOR function:
x    y    XOR(x, y)
0    0    0
0    1    1
1    0    1
1    1    0
If these four points are plotted, with a circle drawn when x and y are the same and a diamond
when they are different, no single straight line can separate the circles from the diamonds.
Hence the XOR function is not linearly separable. This is where the XOR problem in neural
networks arises: a single-layer perceptron, due to its linear nature, fails to model the XOR
function.
Overcoming the XOR problem
The XOR problem can be overcome by using a multi-layer perceptron (MLP), also known as
a neural network. An MLP consists of multiple layers of perceptrons, allowing it to model
more complex, non-linear functions. In such a structure, the first layer is the input layer.
The second layer (hidden layer) transforms the original non-linearly separable problem into a
linearly separable one, which the third layer (output layer) can then solve.
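As a hedged sketch of this idea, the snippet below trains a small multi-layer perceptron on the
four XOR examples using scikit-learn (which is not part of these notes and is used here only for
brevity); the hidden-layer size, solver, and random seed are assumed example choices.

import numpy as np
from sklearn.neural_network import MLPClassifier

# The four XOR examples and their labels
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])

# A single perceptron (no hidden layer) cannot fit this data; the hidden layer
# remaps it into a space where the output layer can separate it linearly.
mlp = MLPClassifier(hidden_layer_sizes=(4,), activation="logistic",
                    solver="lbfgs", random_state=1, max_iter=1000)
mlp.fit(X, y)
print(mlp.predict(X))  # expected [0 1 1 0]; another seed may be needed if training stalls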
Feedforward Propagation:
Feedforward propagation is the process in a neural network where the input data is passed
through the network's layers, from the input layer through the hidden layers to the output layer,
to generate a prediction or output. Each layer in the network performs a transformation on the
input data using weights and biases associated with the connections between neurons in
adjacent layers, typically followed by an activation function.
Here's a basic outline of feedforward propagation in a neural network:
Input Layer: The input data is fed into the input layer neurons. Each neuron in the input layer
corresponds to a feature in the input data.
Hidden Layers: The input data is then passed through one or more hidden layers. Each neuron
in a hidden layer receives input from all neurons in the previous layer, applies a weighted sum
of inputs, adds a bias term, and then applies an activation function (like ReLU, sigmoid, or
tanh) to produce an output.
Output Layer: The final hidden layer output is passed to the output layer, which processes
the inputs in a similar way as the hidden layers but typically uses a different activation function
(e.g., softmax for classification or linear for regression) to produce the final output of the
network.
Output: The output of the output layer is the prediction or output of the neural network for
the given input data. During the training phase, the weights and biases in the network are
adjusted based on the error between the predicted output and the actual target output, using
techniques like backpropagation and optimization algorithms like gradient descent, to
minimize the error and improve the network's performance.
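The following NumPy sketch shows a single forward pass through a network with one hidden
layer; the layer sizes, random weights, and the ReLU/softmax pairing are assumed values chosen
only to make the steps above concrete.

import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))  # numerically stable
    return e / e.sum(axis=1, keepdims=True)

# Assumed sizes: 3 input features, 5 hidden units, 2 output classes.
rng = np.random.default_rng(42)
W1, b1 = rng.normal(size=(3, 5)), np.zeros(5)   # input -> hidden weights and biases
W2, b2 = rng.normal(size=(5, 2)), np.zeros(2)   # hidden -> output weights and biases

x = np.array([[0.2, -1.0, 0.5]])                # one input example

h = relu(x @ W1 + b1)          # hidden layer: weighted sum + bias, then activation
y_hat = softmax(h @ W2 + b2)   # output layer: weighted sum + bias, then softmax
print(y_hat)                   # two class probabilities summing to 1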
Back Propagation:
Backpropagation is the process used in training neural networks to update the weights of the
network in order to minimize the error between the predicted output and the actual target
output. It is essentially a way of calculating the gradient of the loss function with respect to
the weights of the network, which is then used to update the weights using an optimization
algorithm like gradient descent.
Here's a basic outline of the backpropagation process:
Forward Pass: During the forward pass (feedforward propagation), the input data is passed
through the network, and the output is calculated.
Calculate Loss: The output of the network is compared to the actual target output, and a loss
function is calculated to measure the difference between them. Common loss functions
include mean squared error (MSE) for regression problems and cross-entropy loss for
classification problems.
Backward Pass (Backpropagation): The goal of backpropagation is to calculate the gradient
of the loss function with respect to each weight in the network. This is done using the chain
rule of calculus to propagate the error backwards through the network.
Update Weights: Once the gradients have been calculated, the weights of the network are
updated using an optimization algorithm like gradient descent. The weights are updated in the
opposite direction of the gradient in order to minimize the loss function.
Repeat: Steps 1-4 are repeated for each batch of training data until the model converges and
the weights have been optimized. Backpropagation is a key component of training neural
networks and allows them to learn complex patterns in data by iteratively adjusting the
weights of the network to minimize the error.
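The sketch below runs these four steps for a tiny one-hidden-layer regression network in NumPy
(sigmoid hidden layer, linear output, mean squared error); the data, layer sizes, and learning
rate are all assumed toy values.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))     # 8 toy examples with 3 features
y = rng.normal(size=(8, 1))     # toy regression targets

W1, b1 = rng.normal(size=(3, 4)), np.zeros((1, 4))
W2, b2 = rng.normal(size=(4, 1)), np.zeros((1, 1))
lr = 0.1                        # learning rate (assumed)

for step in range(200):
    # 1. Forward pass
    h = sigmoid(X @ W1 + b1)
    y_hat = h @ W2 + b2                          # linear output for regression
    # 2. Calculate loss (mean squared error)
    loss = np.mean((y_hat - y) ** 2)
    # 3. Backward pass: chain rule, propagating the error backwards
    d_yhat = 2 * (y_hat - y) / len(X)            # dLoss / dy_hat
    dW2 = h.T @ d_yhat
    db2 = d_yhat.sum(axis=0, keepdims=True)
    d_h = (d_yhat @ W2.T) * h * (1 - h)          # through the sigmoid derivative
    dW1 = X.T @ d_h
    db1 = d_h.sum(axis=0, keepdims=True)
    # 4. Update weights in the opposite direction of the gradient
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

print(round(loss, 4))  # the loss typically decreases over the iterations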
Losses:
In artificial neural networks (ANNs), losses are used to measure the difference between the
predicted output and the actual target. The choice of loss function depends on the task at hand
(e.g., regression, classification) and the network's output (e.g., scalar, vector, matrix).
• Mean Squared Error (MSE): Commonly used for regression tasks, MSE calculates
the average of the squared differences between predicted and actual values. It
penalizes large errors more than small ones.
• Binary Cross-Entropy: Used for binary classification tasks, this loss function
measures the difference between two probability distributions (predicted and actual)
for a binary outcome.
• Categorical Cross-Entropy: Used for multiclass classification tasks, categorical
cross-entropy calculates the difference between predicted and actual class
probabilities across all classes.
• Sparse Categorical Cross-Entropy: Similar to categorical cross-entropy but used
when the target labels are integers (e.g., 0, 1, 2) instead of one-hot encoded vectors.
• Kullback-Leibler Divergence (KL Divergence): Measures how one probability
distribution diverges from a second, expected probability distribution. It's often used
in scenarios where you have a target distribution and want to measure how well your
model captures it.
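As a sketch, the functions below compute three of these losses directly in NumPy; predictions
are assumed to already be probabilities, and a small epsilon guards the logarithms.

import numpy as np

EPS = 1e-12  # guards against log(0)

def mse(y_true, y_pred):
    # Mean of squared differences; large errors are penalized quadratically.
    return np.mean((y_true - y_pred) ** 2)

def binary_cross_entropy(y_true, p_pred):
    # y_true in {0, 1}; p_pred is the predicted probability of class 1.
    p = np.clip(p_pred, EPS, 1 - EPS)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

def categorical_cross_entropy(y_true_onehot, p_pred):
    # Rows of y_true_onehot are one-hot vectors; rows of p_pred sum to 1.
    p = np.clip(p_pred, EPS, 1.0)
    return -np.mean(np.sum(y_true_onehot * np.log(p), axis=1))

# Toy usage with made-up numbers:
print(mse(np.array([1.0, 2.0]), np.array([1.5, 1.0])))                    # 0.625
print(binary_cross_entropy(np.array([1, 0]), np.array([0.9, 0.2])))
print(categorical_cross_entropy(np.array([[0, 1, 0]]), np.array([[0.1, 0.7, 0.2]])))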
Activation Function:
1. Sigmoid / Logistic Activation Function
This function takes any real value as input and outputs values in the range of 0 to 1. The larger
the input (more positive), the closer the output value will be to 1.0, whereas the smaller the
input (more negative), the closer the output will be to 0.0, as shown below.
Mathematically it can be represented as:
f(x) = 1 / (1 + e^(-x))
It is commonly used for models where we have to predict the probability as an output. Since
probability of anything exists only between the range of 0 and 1, sigmoid is the right choice
because of its range.
The function is differentiable and provides a smooth gradient, preventing abrupt jumps in
output values. This smoothness is reflected in the characteristic S-shape of the sigmoid curve.
2. ReLU Function
ReLU stands for Rectified Linear Unit. Although it gives an impression of a linear function,
ReLU has a derivative function and allows for backpropagation while simultaneously making
it computationally efficient.
The main catch here is that the ReLU function does not activate all the neurons at the same
time: a neuron is deactivated only if the output of its linear transformation is less than 0.
Mathematically it can be represented as:
f(x) = max(0, x)
The advantages of using ReLU as an activation function are as follows:
• Since only a certain number of neurons are activated, the ReLU function is far more
computationally efficient when compared to the sigmoid and tanh functions.
• ReLU accelerates the convergence of gradient descent towards the global minimum
of the loss function due to its linear, non-saturating property.
Leaky ReLU is an improved version of ReLU function to solve the Dying ReLU problem as
it has a small positive slope in the negative area.
Parametric ReLU is another variant of ReLU that aims to solve the problem of the gradient
becoming zero for the left half of the axis.
This function provides the slope of the negative part of the function as an argument a. By
performing backpropagation, the most appropriate value of a is learnt.
Exponential Linear Unit, or ELU for short, is also a variant of ReLU that modifies the slope
of the negative part of the function.
ELU uses a log curve to define the negative values, unlike the Leaky ReLU and Parametric
ReLU functions, which use a straight line.
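The sketch below implements the activation functions discussed above in NumPy; the slope a for
Leaky/Parametric ReLU and alpha for ELU are assumed example values (in a real Parametric ReLU
the slope would be learned during training).

import numpy as np

def sigmoid(x):
    # Squashes any real value into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # Outputs x for positive inputs and 0 otherwise.
    return np.maximum(0.0, x)

def leaky_relu(x, a=0.01):
    # Small positive slope 'a' on the negative side avoids the dying ReLU problem.
    return np.where(x > 0, x, a * x)

def elu(x, alpha=1.0):
    # Smooth exponential curve on the negative side instead of a straight line.
    return np.where(x > 0, x, alpha * (np.exp(x) - 1))

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(sigmoid(x))
print(relu(x))         # [0.  0.  0.  1.5]
print(leaky_relu(x))   # [-0.02  -0.005  0.  1.5]
print(elu(x))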
Hyperparameters:
In artificial neural networks (ANNs), hyperparameters are parameters that are set before the
learning process begins. They control aspects of the learning process such as the network
architecture, the optimization algorithm, and the training process. Here are some basic
hyperparameters in ANNs:
1. Number of hidden layers: The number of layers in the neural network, not including the
input and output layers. More layers can potentially capture more complex patterns in the data
but can also lead to overfitting.
2. Number of neurons per hidden layer: The number of neurons (nodes) in each hidden layer.
A larger number of neurons can increase the model's capacity to learn complex patterns but
can also lead to overfitting.
3. Activation function: The function applied to the output of each neuron in the network.
Common activation functions include ReLU (Rectified Linear Unit), sigmoid, and tanh. The
choice of activation function can affect the model's ability to learn and the speed of
convergence during training.
4. Learning rate: The size of the step taken during the optimization process (e.g., gradient
descent) to update the weights of the network. A larger learning rate can lead to faster
convergence but may cause the model to overshoot the optimal weights, while a smaller
learning rate can lead to slower convergence but may result in more stable training.
5. Batch size: The number of data points used in each iteration of the training process. Larger
batch sizes can lead to faster training and smoother, less noisy gradient estimates, while
smaller batch sizes produce noisier gradient estimates and slower training, though the added
noise can sometimes help the model escape poor local minima.
6. Epochs: The number of times the entire dataset is passed through the network during
training. One epoch is completed when the model has seen all the training data once. Training
for more epochs can lead to better performance but may also increase the risk of overfitting.
7. Optimizer: The algorithm used to update the weights of the network during training.
Common optimizers include stochastic gradient descent (SGD), Adam, and RMSprop.
8. Regularization: Techniques used to prevent overfitting, such as L1 or L2 regularization,
dropout, or early stopping. These techniques introduce additional hyperparameters that
control the strength of regularization.
These are just a few examples of hyperparameters in ANNs. The choice of hyperparameters
can significantly impact the performance of the model, and it often requires experimentation
and tuning to find the optimal set of hyperparameters for a specific problem.
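As a sketch of where these hyperparameters appear in practice, the Keras snippet below wires
several of them together; every specific value here (two hidden layers of 64 units, learning
rate 0.001, batch size 32, 20 epochs, dropout rate 0.2, 10 input features) is an assumed example
choice, not a recommendation.

from tensorflow import keras

# Hyperparameters (assumed example values)
HIDDEN_LAYERS = 2
UNITS_PER_LAYER = 64
ACTIVATION = "relu"
LEARNING_RATE = 1e-3
BATCH_SIZE = 32
EPOCHS = 20
DROPOUT_RATE = 0.2          # regularization strength

model = keras.Sequential([keras.layers.Input(shape=(10,))])  # 10 input features (assumed)
for _ in range(HIDDEN_LAYERS):
    model.add(keras.layers.Dense(UNITS_PER_LAYER, activation=ACTIVATION))
    model.add(keras.layers.Dropout(DROPOUT_RATE))
model.add(keras.layers.Dense(1, activation="sigmoid"))        # binary classification output

model.compile(optimizer=keras.optimizers.Adam(learning_rate=LEARNING_RATE),
              loss="binary_crossentropy", metrics=["accuracy"])

# Training would then use the batch size and epoch count, e.g.:
# model.fit(X_train, y_train, batch_size=BATCH_SIZE, epochs=EPOCHS,
#           validation_split=0.2)   # X_train / y_train are placeholders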