Activation Functions and Keras Metrics
v7labs.com/blog/neural-networks-activation-functions
As it turns out, our brains and machine learning systems have something in common.
Every single moment, our brain tries to segregate the incoming information into “useful” and “not-so-useful” categories.
A similar process occurs in artificial neural network architectures in deep learning.
The segregation plays a key role in helping a neural network function properly, ensuring that it learns
from the useful information rather than getting stuck analyzing the not-so-useful part.
And this is also where activation functions come into the picture.
💡 An activation function helps the neural network use important information while suppressing irrelevant data points.
Sounds a little confusing? Worry not!
What is a Neural Network Activation Function?
An Activation Function decides whether a neuron should be activated or not. This means that it uses
simpler mathematical operations to decide whether the neuron’s input to the network is important or not
in the process of prediction.
The role of the Activation Function is to derive output from a set of input values fed to a node (or a layer).
But what does that mean in practice?
Well, if we compare the neural network to our brain, a node is a replica of a neuron that receives a set of
input signals—external stimuli.
Depending on the nature and intensity of these input signals, the brain processes them and decides
whether the neuron should be activated (“fired”) or not.
In deep learning, this is also the role of the Activation Function—that’s why it’s often referred to as a
transfer function in artificial neural networks.
The primary role of the Activation Function is to transform the summed weighted input from the node into
an output value to be fed to the next hidden layer or as output.
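To make this concrete, here’s a minimal sketch of what a single node computes; the inputs, weights, and bias below are made-up values, and sigmoid simply stands in for whatever activation function the layer uses.

```python
import numpy as np

def sigmoid(z):
    # Squashes any real value into the (0, 1) range
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical inputs, weights, and bias for one node
x = np.array([0.5, -1.2, 3.0])   # input signals
w = np.array([0.4, 0.7, -0.2])   # learned weights
b = 0.1                          # learned bias

z = np.dot(w, x) + b             # summed weighted input
a = sigmoid(z)                   # activation: the node's output
print(a)
```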
Now, let's have a look at the Neural Networks Architecture.
If you don’t understand the concept of neural networks and how they work, diving deeper into the topic of
activation functions might be challenging.
That’s why it’s a good idea to refresh your knowledge and take a quick look at the structure of the Neural
Networks Architecture and its components. Here it is.
In the image above, you can see a neural network made of interconnected neurons. Each of them is
characterized by its weight, bias, and activation function.
Input Layer
The input layer takes raw input from the domain. No computation is performed at this layer. Nodes here
just pass on the information (features) to the hidden layer.
Hidden Layer
As the name suggests, the nodes of this layer are not exposed. They provide an abstraction to the neural
network.
The hidden layer performs all kinds of computation on the features entered through the input layer and
transfers the result to the output layer.
Output Layer
It’s the final layer of the network; it brings together the information learned through the hidden layers and
delivers the final value as a result.
📢 Note: All hidden layers usually use the same activation function. However, the output layer will
typically use a different activation function from the hidden layers. The choice depends on the goal or type
of prediction made by the model.
Feedforward vs. Backpropagation
When learning about neural networks, you will come across two essential terms describing the movement
of information—feedforward and backpropagation.
💡 Feedforward Propagation - the flow of information occurs in the forward direction. The input is used to calculate
some intermediate function in the hidden layer, which is then used to calculate an output.
In the feedforward propagation, the Activation Function is a mathematical “gate” in between the input
feeding the current neuron and its output going to the next layer.
💡 Backpropagation - the weights of the network connections are repeatedly adjusted to minimize the difference
between the actual output vector of the net and the desired output vector.
To put it simply—backpropagation aims to minimize the cost function by adjusting the network’s weights
and biases. The gradients of the cost function with respect to parameters such as the weights and biases
determine the level of adjustment.
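As a toy illustration of one such adjustment (a single weight, a squared-error cost, and made-up numbers; real backpropagation applies the same chain rule across all layers):

```python
# Toy example: one gradient-descent step on a single weight.
# Model: prediction = w * x, cost = (prediction - target)**2
x, target = 2.0, 10.0
w = 1.0                          # initial weight
lr = 0.1                         # learning rate

pred = w * x                     # feedforward pass
grad = 2 * (pred - target) * x   # d(cost)/dw via the chain rule
w = w - lr * grad                # backpropagation-style weight update
print(w)                         # weight moves toward target / x = 5.0
```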
Why do Neural Networks Need an Activation Function?
Well, the purpose of an activation function is to add non-linearity to the neural network.
Activation functions introduce an additional step at each layer during the forward propagation, but their
computation is worth it. Here is why—
Let’s suppose we have a neural network working without the activation functions.
In that case, every neuron will only be performing a linear transformation on the inputs using the weights
and biases. It wouldn’t matter how many hidden layers we attach to the neural network; all layers would
behave the same way, because the composition of two linear functions is itself a linear function.
Although the neural network becomes simpler, learning any complex task is impossible, and our model
would be just a linear regression model.
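You can verify this collapse numerically. In the sketch below (with random, made-up layer sizes), two stacked linear layers give exactly the same output as one suitably chosen linear layer:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4,))          # a random input vector

# Two "hidden layers" with no activation function
W1, b1 = rng.normal(size=(3, 4)), rng.normal(size=(3,))
W2, b2 = rng.normal(size=(2, 3)), rng.normal(size=(2,))
two_layers = W2 @ (W1 @ x + b1) + b2

# The equivalent single linear layer
W, b = W2 @ W1, W2 @ b1 + b2
one_layer = W @ x + b

print(np.allclose(two_layers, one_layer))  # True: the layers collapse
```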
Binary Step Function
The binary step function depends on a threshold value that decides whether a neuron should be activated or
not.
The input fed to the activation function is compared to a certain threshold; if the input is greater than it,
the neuron is activated; otherwise it is deactivated, meaning that its output is not passed on to the next
hidden layer.
Binary Step Function
Here are some of the limitations of the binary step function:
It cannot provide multi-value outputs—for example, it cannot be used for multi-class classification
problems.
The gradient of the step function is zero, which causes a hindrance in the backpropagation process.
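A minimal NumPy sketch of the binary step function, assuming the usual threshold of 0:

```python
import numpy as np

def binary_step(x, threshold=0.0):
    # Outputs 1 where the input exceeds the threshold, else 0
    return np.where(x > threshold, 1.0, 0.0)

print(binary_step(np.array([-2.0, -0.5, 0.0, 0.5, 2.0])))
# [0. 0. 0. 1. 1.]
```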
Linear Activation Function
The linear activation function, also known as "no activation" or the "identity function" (multiplied by 1.0), is
a function where the activation is proportional to the input.
The function doesn't do anything to the weighted sum of the input; it simply spits out the value it was
given.
Linear Activation Function
However, a linear activation function has two major problems:
It’s not possible to use backpropagation, as the derivative of the function is a constant and has no
relation to the input x.
All layers of the neural network will collapse into one if a linear activation function is used. No matter
the number of layers in the neural network, the last layer will still be a linear function of the first
layer. So, essentially, a linear activation function turns the neural network into just one layer.
The linear activation function shown above is simply a linear regression model.
Because of its limited power, this does not allow the model to create complex mappings between the
network’s inputs and outputs.
Non-Linear Activation Functions
Non-linear activation functions solve the following limitations of linear activation functions:
They allow backpropagation because now the derivative function would be related to the input, and
it’s possible to go back and understand which weights in the input neurons can provide a better
prediction.
They allow the stacking of multiple layers of neurons as the output would now be a non-linear
combination of input passed through multiple layers. Any output can be represented as a functional
computation in a neural network.
Now, let’s have a look at ten different non-linear neural network activation functions and their
characteristics.
Sigmoid / Logistic Activation Function
This function takes any real value as input and outputs values in the range of 0 to 1.
The larger the input (more positive), the closer the output value will be to 1.0, whereas the smaller the
input (more negative), the closer the output will be to 0.0, as shown below.
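Mathematically, the sigmoid is f(x) = 1 / (1 + e^(-x)), and its derivative is f(x) · (1 − f(x)). A small NumPy sketch of both:

```python
import numpy as np

def sigmoid(x):
    # f(x) = 1 / (1 + e^(-x)), output in (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(x):
    # f'(x) = f(x) * (1 - f(x)); largest at x = 0, tiny for |x| > 3
    s = sigmoid(x)
    return s * (1.0 - s)

x = np.array([-5.0, -3.0, 0.0, 3.0, 5.0])
print(sigmoid(x))             # ~[0.007 0.047 0.5   0.953 0.993]
print(sigmoid_derivative(x))  # gradients shrink toward zero at the tails
```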
Here’s why the sigmoid/logistic activation function is one of the most widely used functions:
It is commonly used for models where we have to predict the probability as an output. Since
probability of anything exists only between the range of 0 and 1, sigmoid is the right choice because
of its range.
The function is differentiable and provides a smooth gradient, i.e., preventing jumps in output
values. This is represented by an S-shape of the sigmoid activation function.
The derivative of the Sigmoid Activation Function
As we can see from the figure above, the gradient values are only significant in the range of -3 to 3, and the
graph gets much flatter in other regions.
It implies that for values greater than 3 or less than -3, the function will have very small gradients. As the
gradient value approaches zero, the network ceases to learn and suffers from the Vanishing gradient
problem.
The output of the logistic function is not symmetric around zero. So the output of all the neurons will
be of the same sign. This makes the training of the neural network more difficult and unstable.
Tanh Function (Hyperbolic Tangent)
The tanh function is very similar to the sigmoid/logistic activation function and even has the same S-shape,
the difference being an output range of -1 to 1. In tanh, the larger the input (more positive), the closer the
output value will be to 1.0, whereas the smaller the input (more negative), the closer the output will be to
-1.0.
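Tanh is defined as f(x) = (e^x − e^(-x)) / (e^x + e^(-x)); NumPy ships it directly as np.tanh:

```python
import numpy as np

def tanh(x):
    # f(x) = (e^x - e^(-x)) / (e^x + e^(-x)), output in (-1, 1)
    return np.tanh(x)

x = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])
print(tanh(x))  # zero-centered: ~[-0.995 -0.762  0.     0.762  0.995]
```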
Tanh Function (Hyperbolic Tangent)
Advantages of using this activation function are:
The output of the tanh activation function is zero-centered; hence we can easily map the output
values as strongly negative, neutral, or strongly positive.
It is usually used in hidden layers of a neural network, as its values lie between -1 and 1; therefore, the
mean for the hidden layer comes out to be 0 or very close to it. This helps in centering the data and
makes learning for the next layer much easier.
Have a look at the gradient of the tanh activation function to understand its limitations.
As you can see— it also faces the problem of vanishing gradients similar to the sigmoid activation
function. Plus the gradient of the tanh function is much steeper as compared to the sigmoid function.
💡 Note: Although both sigmoid and tanh face the vanishing gradient issue, tanh is zero-centered, and the gradients
are not restricted to move in a certain direction. Therefore, in practice, tanh nonlinearity is always preferred to sigmoid
nonlinearity.
ReLU Function
ReLU stands for Rectified Linear Unit. Although it gives an impression of a linear function, ReLU has a
derivative function and allows for backpropagation while simultaneously making it computationally
efficient.
The main catch here is that the ReLU function does not activate all the neurons at the same time.
The neurons will only be deactivated if the output of the linear transformation is less than 0.
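ReLU is simply f(x) = max(0, x):

```python
import numpy as np

def relu(x):
    # f(x) = max(0, x): negative inputs are zeroed out
    return np.maximum(0.0, x)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))  # [0.  0.  0.  0.5 2. ]
```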
ReLU Activation Function
Since only a certain number of neurons are activated, the ReLU function is far more computationally
efficient when compared to the sigmoid and tanh functions.
ReLU accelerates the convergence of gradient descent towards the global minimum of the loss
function due to its linear, non-saturating property.
The Dying ReLU problem
The negative side of the graph makes the gradient value zero. Due to this reason, during the
backpropagation process, the weights and biases for some neurons are not updated. This can create
dead neurons which never get activated.
All the negative input values become zero immediately, which decreases the model’s ability to fit or
train from the data properly.
Leaky ReLU Function
Leaky ReLU is an improved version of the ReLU function designed to solve the Dying ReLU problem, as it
has a small positive slope in the negative area.
Leaky ReLU
The advantages of Leaky ReLU are the same as those of ReLU, in addition to the fact that it does enable
backpropagation, even for negative input values.
By making this minor modification for negative input values, the gradient of the left side of the graph
comes out to be a non-zero value. Therefore, we would no longer encounter dead neurons in that region.
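A NumPy sketch of Leaky ReLU, f(x) = max(0.01x, x); the 0.01 slope is the conventional default:

```python
import numpy as np

def leaky_relu(x, negative_slope=0.01):
    # f(x) = x for x > 0, negative_slope * x otherwise
    return np.where(x > 0, x, negative_slope * x)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(leaky_relu(x))  # [-0.02  -0.005  0.     0.5    2.   ]
```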
The derivative of the Leaky ReLU function
Parametric ReLU Function
Parametric ReLU is another variant of ReLU that aims to solve the problem of the gradient becoming zero
for the left half of the axis.
This function provides the slope of the negative part of the function as an argument a. By performing
backpropagation, the most appropriate value of a is learnt.
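Functionally, Parametric ReLU is the same as Leaky ReLU except that the slope a is a learnable parameter; the initial value of 0.25 below is only an illustrative assumption:

```python
import numpy as np

def parametric_relu(x, a):
    # f(x) = x for x > 0, a * x otherwise; a is learned via backpropagation
    return np.where(x > 0, x, a * x)

a = 0.25  # initial value; updated during training like any other weight
x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(parametric_relu(x, a))  # [-0.5   -0.125  0.     0.5    2.   ]
```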
Parametric ReLU
The parameterized ReLU function is used when the leaky ReLU function still fails at solving the problem
of dead neurons, and the relevant information is not successfully passed to the next layer.
This function’s limitation is that it may perform differently for different problems depending upon the value
of slope parameter a.
Exponential Linear Units (ELUs) Function
Exponential Linear Unit, or ELU for short, is also a variant of ReLU that modifies the slope of the negative
part of the function.
Unlike the leaky ReLU and Parametric ReLU functions, which use a straight line, ELU uses an exponential
curve to define the negative values.
ELU is a strong alternative to ReLU because of the following advantages:
ELU becomes smooth slowly until its output equals -α, whereas ReLU smoothes sharply.
It avoids the dead ReLU problem by introducing an exponential curve for negative values of input,
which helps the network nudge weights and biases in the right direction.
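A NumPy sketch of ELU, f(x) = x for x > 0 and α(e^x − 1) otherwise, with the common default α = 1.0:

```python
import numpy as np

def elu(x, alpha=1.0):
    # f(x) = x for x > 0, alpha * (e^x - 1) otherwise
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

x = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])
print(elu(x))  # negative outputs saturate smoothly toward -alpha
```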
Softmax Function
Before exploring the ins and outs of the Softmax activation function, we should focus on its building block
—the sigmoid/logistic activation function that works on calculating probability values.
The output of the sigmoid function was in the range of 0 to 1, which can be thought of as probability.
But—
Let’s suppose we have five output values of 0.8, 0.9, 0.7, 0.8, and 0.6, respectively. How can we move
forward with it?
The above values don’t make sense as the sum of all the classes/output probabilities should be equal to
1.
This is where the Softmax activation function comes into the picture. It calculates the relative
probabilities. Similar to the sigmoid/logistic activation function, the Softmax function returns the
probability of each class.
It is most commonly used as an activation function for the last layer of the neural network in the case of
multi-class classification.
Softmax Function
Assume that you have three classes, meaning that there would be three neurons in the output layer. Now,
suppose that your output from the neurons is [1.8, 0.9, 0.68].
Applying the softmax function over these values to give a probabilistic view will result in the following
outcome: [0.58, 0.23, 0.19].
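Softmax exponentiates each score and divides by the sum of all the exponentials, softmax(x_i) = e^(x_i) / sum_j e^(x_j), which is easy to verify against the numbers above:

```python
import numpy as np

def softmax(x):
    # Subtracting the max first is a standard numerical-stability trick
    e = np.exp(x - np.max(x))
    return e / e.sum()

logits = np.array([1.8, 0.9, 0.68])
probs = softmax(logits)
print(probs.round(2))   # [0.58 0.23 0.19]
print(probs.sum())      # sums to 1: a proper probability distribution
```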
The predicted class is then the index with the largest probability. Here, index 0 carries by far the most
weight, so the output would be the class corresponding to the first neuron (index 0) out of the three.
You can see now how the softmax activation function makes things easy for multi-class classification
problems.
Swish
Swish is a self-gated activation function developed by researchers at Google. It consistently matches or
outperforms the ReLU activation function on deep networks applied to various challenging domains such
as image classification and machine translation.
This function is bounded below but unbounded above, i.e., Y approaches a constant value as X
approaches negative infinity, but Y approaches infinity as X approaches infinity.
Here are a few advantages of the Swish activation function over ReLU:
Swish is a smooth function, which means that it does not abruptly change direction like ReLU does
near x = 0. Rather, it smoothly bends from 0 towards values < 0 and then upwards again.
Small negative values were zeroed out in the ReLU activation function. However, those negative
values may still be relevant for capturing patterns underlying the data, while large negative values
can be zeroed out for reasons of sparsity, making it a win-win situation.
The swish function, being non-monotonic, enhances the expression of the input data and the weights
to be learnt.
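A NumPy sketch of Swish in its β = 1 form, f(x) = x · sigmoid(x) (also known as SiLU):

```python
import numpy as np

def swish(x):
    # f(x) = x * sigmoid(x): smooth, non-monotonic, bounded below
    return x / (1.0 + np.exp(-x))

x = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])
print(swish(x))  # small negatives pass through slightly instead of being zeroed
```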
Gaussian Error Linear Unit (GELU) Function
The Gaussian Error Linear Unit (GELU) activation function is compatible with BERT, ROBERTa, ALBERT,
and other top NLP models. This activation function is motivated by combining properties from dropout,
zoneout, and ReLUs.
ReLU and dropout together yield a neuron’s output: ReLU does it deterministically by multiplying the input
by zero or one (depending upon the input value being positive or negative), while dropout does it
stochastically by multiplying by zero.
We merge this functionality by multiplying the input by either zero or one, which is stochastically
determined and dependent upon the input. Specifically, we multiply the neuron input x by
m ∼ Bernoulli(Φ(x)), where Φ(x) = P(X ≤ x), X ∼ N(0, 1), is the cumulative distribution function of the
standard normal distribution.
This distribution is chosen since neuron inputs tend to follow a normal distribution, especially with Batch
Normalization.
Gaussian Error Linear Unit (GELU) Activation Function
GELU nonlinearity is better than ReLU and ELU activations and finds performance improvements across
all tasks in domains of computer vision, natural language processing, and speech recognition.
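The exact form is GELU(x) = x · Φ(x). In practice it is usually computed with the tanh approximation sketched below, the same approximation many frameworks implement:

```python
import numpy as np

def gelu(x):
    # Tanh approximation of GELU(x) = x * Phi(x),
    # where Phi is the standard normal CDF
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi)
                                    * (x + 0.044715 * x**3)))

x = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])
print(gelu(x))  # smooth and non-monotonic around zero, like Swish
```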
Scaled Exponential Linear Unit (SELU)
SELU was defined in self-normalizing networks and takes care of internal normalization, which means
each layer preserves the mean and variance from the previous layers. SELU enables this normalization
by adjusting the mean and variance.
SELU has both positive and negative values to shift the mean, which was impossible for the ReLU
activation function as it cannot output negative values.
Gradients can be used to adjust the variance. The activation function needs a region with a gradient
larger than one to increase it.
SELU Activation Function
SELU has the values of alpha α and lambda λ predefined (α ≈ 1.6733 and λ ≈ 1.0507).
Internal normalization is faster than external normalization, which means the network converges
faster.
SELU is a relatively newer activation function and needs more papers on architectures such as CNNs
and RNNs, where it is comparatively less explored.
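A NumPy sketch of SELU, f(x) = λx for x > 0 and λα(e^x − 1) otherwise, using its predefined constants:

```python
import numpy as np

def selu(x, alpha=1.6732632423543772, scale=1.0507009873554805):
    # f(x) = scale * x for x > 0, scale * alpha * (e^x - 1) otherwise
    return scale * np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

x = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])
print(selu(x))  # positive outputs are scaled up; negatives saturate
```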
Why are deep neural networks hard to train?
There are two challenges you might encounter when training your deep neural networks.
Vanishing Gradients
Like the sigmoid function, certain activation functions squish an ample input space into a small output
space between 0 and 1.
Therefore, a large change in the input of the sigmoid function will cause a small change in the output.
Hence, the derivative becomes small. For shallow networks with only a few layers that use these
activations, this isn’t a big problem.
However, when more layers are used, it can cause the gradient to be too small for training to work
effectively.
Exploding Gradients
Exploding gradients are problems where significant error gradients accumulate and result in very large
updates to neural network model weights during training.
An unstable network can result when there are exploding gradients, and the learning cannot be
completed.
The values of the weights can also become so large as to overflow and result in something called NaN
values.
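One common mitigation for exploding gradients is gradient clipping. As a minimal sketch using TensorFlow’s Keras API (the learning rate, clip value, and layer sizes are illustrative assumptions):

```python
import tensorflow as tf

# Clipping the global gradient norm to 1.0 caps the size of weight updates,
# which guards against exploding gradients during training
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01, clipnorm=1.0)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer=optimizer, loss="mse")
```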
How to Choose the Right Activation Function?
As a rule of thumb, you can begin with the ReLU activation function and then move over to other
activation functions if ReLU doesn’t provide optimum results.
Finally, a few rules for choosing the activation function for your output layer based on the type of
prediction problem that you are solving:
1. Regression - Linear Activation Function
2. Binary Classification - Sigmoid/Logistic Activation Function
3. Multiclass Classification - Softmax
4. Multilabel Classification - Sigmoid
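As an illustrative Keras sketch (the input size, layer width, and 10-class setup are made-up assumptions), the output activation is chosen to match the prediction problem:

```python
import tensorflow as tf

# Hypothetical 10-class classifier: softmax on the output layer
# turns the final layer's scores into class probabilities
model = tf.keras.Sequential([
    tf.keras.Input(shape=(32,)),
    tf.keras.layers.Dense(64, activation="relu"),    # hidden layer: ReLU
    tf.keras.layers.Dense(10, activation="softmax")  # output layer: softmax
])
# For regression, the last layer would be Dense(1) with no (i.e., linear)
# activation; for binary or multilabel classification, a sigmoid output.
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```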
The activation function used in hidden layers is typically chosen based on the type of neural network
architecture: convolutional neural networks (CNNs) and multilayer perceptrons typically use ReLU, while
recurrent neural networks (RNNs) often use tanh or sigmoid.
And hey—use this cheatsheet to consolidate all the knowledge on the Neural Network Activation
Functions that you've just acquired :)
Neural Networks Activation Functions in a Nutshell
Well done!
You’ve made it this far ;-) Now, let’s have a quick recap of everything you’ve learnt in this tutorial:
A neural network will almost always have the same activation function in all hidden layers. This
activation function should be differentiable so that the parameters of the network are learned in
backpropagation.
ReLU is the most commonly used activation function for hidden layers.
While selecting an activation function, you must consider the problems it might face: vanishing and
exploding gradients.
Regarding the output layer, we must always consider the expected value range of the predictions. If
it can be any numeric value (as in the case of a regression problem), you can use the linear activation
function or ReLU.
Pragati Baheti
Microsoft
Pragati is a software developer at Microsoft, and a deep learning enthusiast. She writes about the
fundamental mathematics behind deep neural networks.