
Notes of ANN
Subject: Artificial Neural Networks
Section: Alpha

3. Fully Connected, Partially Connected, and Linearly Connected Neural Networks
 Fully Connected: Every neuron in one layer is connected to every neuron in the next layer.
 Partially Connected: Only a subset of neurons is connected between layers, reducing computation.
 Linearly Connected: Neurons are connected in a linear sequence, often used in recurrent networks.

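A minimal NumPy sketch contrasting a fully connected layer with a partially connected one expressed as a masked weight matrix (the layer sizes and the mask pattern are made up for illustration):

import numpy as np

rng = np.random.default_rng(0)

x = rng.normal(size=4)        # 4 input features
W = rng.normal(size=(3, 4))   # fully connected: every output neuron sees every input
b = np.zeros(3)

fully = W @ x + b             # dense layer output

# A partial connection can be emulated by zeroing some weights,
# so each output neuron only sees a subset of the inputs.
mask = np.array([[1, 1, 0, 0],
                 [0, 1, 1, 0],
                 [0, 0, 1, 1]])
partial = (W * mask) @ x + b

print(fully)
print(partial)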

6. Vanishing Gradient Problem
In deep networks, gradients shrink as they are propagated backward through many layers, so the weights of early layers barely update. Sigmoid and Tanh activations are especially prone to this.
Solution: Use ReLU, Batch Normalization, or Skip Connections (ResNets).

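The effect is easy to see numerically: the sigmoid derivative never exceeds 0.25, so the chain-rule product across many layers shrinks toward zero. A minimal NumPy sketch (the 20-layer depth and the pre-activation value 2.0 are arbitrary choices):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)  # at most 0.25

# Chain-rule product of 20 sigmoid derivatives, as backprop would form across 20 layers
grad = 1.0
for _ in range(20):
    grad *= sigmoid_grad(2.0)

print(grad)  # on the order of 1e-20: the gradient has effectively vanished
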
7. Exploding Gradient Problem
When gradients become too large, the weight updates blow up and training diverges. This also happens in deep networks.
Solution: Use gradient clipping and proper weight initialization.

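A minimal sketch of gradient clipping by global norm in NumPy (the max_norm of 1.0 and the example gradients are arbitrary):

import numpy as np

def clip_by_global_norm(grads, max_norm=1.0):
    # Rescale all gradient arrays so their combined L2 norm is at most max_norm
    total_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    scale = min(1.0, max_norm / (total_norm + 1e-12))
    return [g * scale for g in grads]

grads = [np.array([30.0, -40.0]), np.array([5.0])]  # unusually large gradients
clipped = clip_by_global_norm(grads)
print(clipped)  # same direction, combined norm scaled down to 1.0
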
8. Hyperparameters and Their Types
Hyperparameters control how a network learns:
 Learning Rate – Controls how much weights update.
 Batch Size – Number of samples per training step.
 Epochs – Number of times the model sees the full dataset.
 Hidden Layers – Number of layers between input and output.
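
A sketch of where these hyperparameters appear in a typical mini-batch training loop (all values below are made up for illustration):

learning_rate = 0.01       # how far weights move on each update
batch_size = 32            # samples per training step
epochs = 10                # full passes over the dataset
hidden_layers = [64, 64]   # sizes of the layers between input and output

n_samples = 1000
steps_per_epoch = n_samples // batch_size

for epoch in range(epochs):
    for step in range(steps_per_epoch):
        # forward pass on one batch, compute loss, backpropagate, then:
        # weights -= learning_rate * gradients
        pass
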
9. Forward Propagation
The process of calculating the network's output from its inputs by passing them through the weights and activation functions, layer by layer.

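A minimal forward pass through one hidden layer in NumPy (the sizes, weights, and inputs are arbitrary):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([0.5, -1.2, 3.0])               # input features
W1, b1 = np.full((2, 3), 0.1), np.zeros(2)   # hidden-layer weights and biases
W2, b2 = np.full((1, 2), 0.1), np.zeros(1)   # output-layer weights and biases

h = sigmoid(W1 @ x + b1)                     # hidden activations
y_hat = sigmoid(W2 @ h + b2)                 # network output
print(y_hat)
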
10. Backward Propagation
The process by which the network updates its weights using gradients of the loss function, computed with the chain rule of differentiation.

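A hand-rolled chain-rule sketch for a single sigmoid neuron trained with squared error (the data, learning rate, and step count are made up):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([0.5, -1.2, 3.0])     # one training example
y = 1.0                            # its target
w, b = np.array([0.1, 0.1, 0.1]), 0.0
lr = 0.1

for _ in range(100):
    z = w @ x + b                  # forward pass
    y_hat = sigmoid(z)
    # Chain rule: dL/dw = dL/dy_hat * dy_hat/dz * dz/dw
    dL_dyhat = 2.0 * (y_hat - y)   # derivative of squared error
    dyhat_dz = y_hat * (1.0 - y_hat)
    w -= lr * dL_dyhat * dyhat_dz * x
    b -= lr * dL_dyhat * dyhat_dz

print(sigmoid(w @ x + b))          # prediction has moved toward the target 1.0
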
11. Loss Function (Cost Function)
The loss function calculates the error between the predicted and the actual output.
Common types:
 Mean Squared Error (MSE) – For regression.
 Cross-Entropy Loss – For classification.

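Both losses written out in NumPy (the toy predictions are made up):

import numpy as np

def mse(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    y_pred = np.clip(y_pred, eps, 1 - eps)  # avoid log(0)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

print(mse(np.array([3.0, 5.0]), np.array([2.5, 5.5])))                # 0.25
print(binary_cross_entropy(np.array([1.0, 0.0]), np.array([0.9, 0.2])))
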
12. Concepts: Epoch, Learning Rate, Iteration, Batch Size, Normalization, Overfitting, Underfitting
 Epoch: One complete pass through the dataset.
 Learning Rate: Step size for weight updates.
 Iteration: One batch processed in training.
 Batch Size: Number of samples processed before updating weights.
 Normalization: Scaling inputs for better performance.
 Overfitting: The model memorizes the training data and performs poorly on new data.
 Underfitting: The model is too simple and fails to learn the underlying pattern.

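A small worked example tying these terms together (all numbers are made up):

import numpy as np

n_samples = 1000
batch_size = 50
epochs = 5

iterations_per_epoch = n_samples // batch_size     # 20 weight updates per epoch
total_iterations = iterations_per_epoch * epochs   # 100 updates over all epochs
print(iterations_per_epoch, total_iterations)

# Min-max normalization: scale each feature into [0, 1]
X = np.array([[10.0, 200.0], [20.0, 400.0], [30.0, 600.0]])
X_norm = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
print(X_norm)
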
Activation Functions: Explanation & Formulas

1. Binary Step Function
The simplest activation function, used in perceptrons for classification tasks.
f(x) = 1 if x ≥ 0, else 0
 Use Case: Used in early perceptron models and simple classification tasks.
 Limitation: Not useful for deep learning, since its derivative is zero everywhere it is defined and therefore provides no gradient for learning.

2. Linear Activation Function
A basic function where the output is proportional to the input:
f(x) = ax
where a is a constant.
 Use Case: Used in regression problems.
 Limitation: Cannot introduce non-linearity; stacking linear layers collapses into a single linear transformation, making it unsuitable for deep networks.

3. Sigmoid (Logistic) Function
A smooth S-shaped function that maps any real number to a range between 0 and 1:
f(x) = 1 / (1 + e^(-x))
 Use Case: Used in binary classification problems.
 Limitations:
o Causes vanishing gradients for large or small x.
o Output is not centered around 0, which slows convergence.

4. Hyperbolic Tangent (Tanh) Function
An improved version of the sigmoid that maps values between -1 and 1:
f(x) = (e^x - e^(-x)) / (e^x + e^(-x))
 Use Case: Used in hidden layers to center activations around zero.
 Limitation: Still suffers from vanishing gradients for large or small x.

5. Rectified Linear Unit (ReLU)
The most commonly used activation function in deep learning:
f(x) = max(0, x)
 Use Case: Used in hidden layers of deep neural networks.
 Advantages:
o Efficient and fast to compute.
o Reduces the vanishing gradient problem.
 Limitation: Suffers from the dying ReLU problem (neurons stop learning when x ≤ 0).

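A tiny illustration of the dying ReLU issue: for negative pre-activations both the output and the gradient are zero, so the neuron receives no learning signal (the values below are arbitrary):

import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    return (x > 0).astype(float)

z = np.array([-3.0, -0.1, 0.5, 2.0])   # pre-activations
print(relu(z))       # [0.  0.  0.5 2. ]
print(relu_grad(z))  # [0. 0. 1. 1.]  -> zero gradient wherever x <= 0
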
6. Leaky ReLU
A modified version of ReLU that allows small negative values instead of zero:
f(x) = x if x > 0, else αx
where α is a small constant (e.g., 0.01).
 Use Case: Helps avoid the dying ReLU problem.

7. Exponential Linear Unit (ELU)
Similar to Leaky ReLU but with a smoother transition for negative inputs:
f(x) = x if x > 0, else α(e^x - 1)
 Use Case: Reduces bias shift and helps deeper networks train better.

8. Softmax Function
Converts a vector of scores into probabilities that sum to 1:
f(x_i) = e^(x_i) / Σ_j e^(x_j)
 Use Case: Used in the output layer for multi-class classification.
 Limitation: Large input values can cause numerical instability (usually handled by subtracting the maximum before exponentiating).

9. Swish Function
A self-gated activation function proposed by Google researchers:
f(x) = x · sigmoid(x)
 Use Case: Used in modern deep learning architectures.
 Advantage: Often performs better than ReLU on some tasks.

10. GELU (Gaussian Error Linear Unit)
Improves upon ReLU and Swish by weighting inputs with the Gaussian CDF:
f(x) = x · Φ(x)
where Φ(x) is the cumulative distribution function of the standard Gaussian distribution.
 Use Case: Used in transformer models like BERT.

11. SELU (Scaled Exponential Linear Unit)
A variation of ELU that normalizes activations automatically:
f(x) = λx if x > 0, else λα(e^x - 1)
where λ and α are fixed constants (λ ≈ 1.0507, α ≈ 1.6733).
 Use Case: Used in self-normalizing neural networks (SNNs).

Choosing the Right Activation Function

Activation Function | Use Case                   | Limitation
Binary Step         | Simple classifiers         | Not differentiable
Linear              | Regression                 | No non-linearity
Sigmoid             | Binary classification      | Vanishing gradient
Tanh                | Hidden layers              | Still vanishes
ReLU                | Deep learning              | Dying ReLU problem
Leaky ReLU          | Fixing dying ReLU          | May not always help
ELU                 | Improved ReLU              | More computational cost
Softmax             | Multi-class classification | Large input values cause instability
Swish               | Advanced deep learning     | Computationally expensive
GELU                | Transformer models         | High complexity
SELU                | Self-normalizing networks  | Requires specific initialization
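
For reference, the functions in the table can be sketched in NumPy as follows (the GELU uses the common tanh approximation, and the SELU constants are the standard published values; everything else follows the formulas above):

import numpy as np

def binary_step(x): return np.where(x >= 0, 1.0, 0.0)
def linear(x, a=1.0): return a * x
def sigmoid(x): return 1.0 / (1.0 + np.exp(-x))
def tanh(x): return np.tanh(x)
def relu(x): return np.maximum(0.0, x)
def leaky_relu(x, alpha=0.01): return np.where(x > 0, x, alpha * x)
def elu(x, alpha=1.0): return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))
def swish(x): return x * sigmoid(x)
def selu(x, lam=1.0507, alpha=1.6733): return lam * np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def softmax(x):
    e = np.exp(x - np.max(x))  # subtract the max for numerical stability
    return e / e.sum()

def gelu(x):
    # tanh approximation of x * Phi(x), widely used in transformer implementations
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))
print(swish(x))
print(gelu(x))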