Batch Normalization in AI/ML: Accelerating Deep Learning

This presentation covers the basics of Batch Norm, why it is revolutionary, and how it substantially speeds up training and stabilizes models.
What is Batch Normalization?

Batch Normalization is a technique that normalizes layer inputs in a neural network.
It helps the model learn faster and become more stable during training.
Imagine making all inputs to a layer have a similar scale and distribution.
This avoids huge jumps or slow progress, making learning smoother and easier.
The Problem: Internal Covariate Shift

Definition
Internal Covariate Shift means the distribution of the inputs to each layer keeps changing during training, because the parameters of the previous layers are changing.

Impact
• Slower training
• Optimization becomes tough
• Layer interference
• Worse generalization
Example: You are playing basketball.
Every time you shoot, the basket moves a little bit! Sometimes it's higher, sometimes it's lower, sometimes it's to the side.
You have to adjust every single time before you shoot. That's really hard, right?
It slows you down because you are always guessing.
This moving basket is just like internal covariate shift in a neural network.
What is Normalization?

Normalization: Layer inputs are normalized within each mini-batch.
Standardization: Zero-mean, unit-variance inputs improve stability.
Placement: Applied before activation functions.
A Mini Real-World Example

Data Example
Imagine you have 5 numbers coming into a layer:
Data: 100, 102, 98, 101, 99
Mean = (100 + 102 + 98 + 101 + 99) / 5 = 100
Variance measures how much they spread out (here it is small because the values are close): ((0)² + (2)² + (-2)² + (1)² + (-1)²) / 5 = 2, so the standard deviation is √2 ≈ 1.41.
Normalize step by step:
• Subtract the mean (100) from each value → center around 0
• Divide by the standard deviation → so the numbers aren't too big or too small
Now they look like:
Normalized data ≈ 0.0, 1.41, -1.41, 0.71, -0.71

Soccer Analogy
Imagine you're a coach training a soccer team.
Some players run super fast ⚡️, some players are slow 🐢, and some players shoot hard while others shoot soft.
It's chaos because everyone is on a different "scale."
To train the team properly, you first make everyone play at similar speeds (not too fast, not too slow) so the coaching becomes easier.
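As a quick check of this arithmetic, here is a minimal NumPy sketch (the variable names are just for illustration) that normalizes the same five numbers:

```python
import numpy as np

x = np.array([100.0, 102.0, 98.0, 101.0, 99.0])

mean = x.mean()              # 100.0
std = x.std()                # population std = sqrt(2) ≈ 1.41
x_hat = (x - mean) / std

print(x_hat)                 # approximately [ 0.  1.41 -1.41  0.71 -0.71]
```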
What Is an Activation Function?

In a neural network, each neuron processes input data and decides whether to "activate" or not. This decision-making process is governed by an activation function.
• If the input is strong enough, the switch turns on (the neuron activates).
• If the input is weak, the switch stays off (the neuron doesn't activate).
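For instance, ReLU is one widely used activation function that follows this idea: weak (negative) inputs are switched off and strong (positive) inputs pass through. A minimal sketch (not taken from the slides):

```python
import numpy as np

def relu(x):
    # Negative inputs are "switched off" to 0; positive inputs pass through unchanged.
    return np.maximum(0.0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5, 3.0])))  # [0.  0.  0.  1.5 3. ]
```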
The Math Behind Batch Norm

The normalization helps reduce the internal covariate shift, which can speed up training and improve the model's performance. The scaling and shifting parameters allow the model to learn an optimal transformation of the normalized inputs. The small constant ε ensures numerical stability and prevents division by zero.

1. Compute Mean: compute the mean of the inputs in the current batch.
2. Compute Variance: compute the variance of the inputs in the current batch.
3. Normalize: normalize the inputs by subtracting the mean and dividing by the square root of the variance plus a small constant ε.
4. Scale and Shift: scale and shift the normalized inputs using two learnable parameters γ and β.
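A minimal NumPy sketch of these four steps (the function name, array shapes, and default ε value are illustrative assumptions, not taken from the slides):

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    # x: mini-batch of activations, shape (batch_size, num_features)
    mu = x.mean(axis=0)                      # 1. mini-batch mean (per feature)
    var = x.var(axis=0)                      # 2. mini-batch variance (per feature)
    x_hat = (x - mu) / np.sqrt(var + eps)    # 3. normalize; eps prevents division by zero
    return gamma * x_hat + beta              # 4. scale and shift with learnable gamma, beta

# Example: 8 samples, 4 features, with gamma = 1 and beta = 0 (an identity scale/shift)
x = np.random.randn(8, 4) * 10 + 100
out = batch_norm_forward(x, gamma=np.ones(4), beta=np.zeros(4))
print(out.mean(axis=0), out.var(axis=0))     # per-feature mean ≈ 0, variance ≈ 1
```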
How Batch Norm Works in Practice

Insertion Point
Applied after linear layers, before activation.
• The linear layer performs a mathematical operation on the input data, combining it with weights and biases to produce a new output. Think of it as mixing ingredients in a specific ratio.

Batch Statistics
Mini-batch statistics approximate the population statistics during training. For each mini-batch, batch norm computes the mean and variance of the activations:
• Mini-batch mean (μ_B)
• Mini-batch variance (σ²_B)

Inference Use
Use moving averages of the mean and variance for predictions.
• When you're baking cookies to sell in a store, you want every batch to be consistent regardless of the room temperature. So you use the average room temperature you've recorded over time to adjust your recipe, ensuring every batch turns out the same.
• In the same way, during inference (when the trained model is used to make predictions), BatchNorm uses the moving averages of the mean and variance it calculated during training. This ensures that the data is normalized consistently, leading to stable and reliable predictions.

Framework Support
TensorFlow, PyTorch, and Keras support batch norm natively.
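To illustrate this placement and the training-versus-inference behavior, here is a minimal PyTorch sketch (the layer sizes and batch size are arbitrary choices for the example):

```python
import torch
import torch.nn as nn

# "Linear -> batch norm -> activation" placement, as described above.
model = nn.Sequential(
    nn.Linear(16, 32),
    nn.BatchNorm1d(32),   # normalizes the linear layer's outputs before ReLU
    nn.ReLU(),
    nn.Linear(32, 1),
)

x = torch.randn(8, 16)    # a mini-batch of 8 examples
model.train()
y_train = model(x)        # uses this mini-batch's mean and variance

model.eval()              # switch to inference mode
with torch.no_grad():
    y_eval = model(x)     # uses the running (moving-average) statistics instead
```

Calling model.eval() is what switches the batch norm layer from mini-batch statistics to the moving averages accumulated during training.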
Results: Faster Training, Higher Accuracy

• 40% less training time
• Supports higher learning rates
• Better generalization
• Less sensitive to initialization
Benefits of Batch Normalization

1. Accelerates Training: 2x to 10x faster convergence.
2. Higher Learning Rates: allows aggressive optimization without divergence.
3. Reduces Initialization Sensitivity: simplifies weight initialization requirements.
4. Acts as Regularizer: decreases overfitting, boosting robustness.
Summary: Batch Norm is Essential

• Addresses Internal Covariate Shift
• Simple and Effective
• Improves Performance
• Modern Deep Learning Standard
Batch normalization remains a cornerstone technique accelerating modern AI/ML development.
