
Regularization in Deep Learning
Different Scenarios
p-norms visualized

[Figure: curves in the (w1, w2) plane where the p-norm penalty equals 1, for several values of p.]

For example, if w1 = 0.5 and the penalty is held at 1, the corresponding w2 on each curve is:

p = 1   → w2 = 0.5
p = 1.5 → w2 = 0.75
p = 2   → w2 = 0.87
p = 3   → w2 = 0.95
p = ∞   → w2 = 1
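These values can be reproduced numerically. Below is a small sketch (not from the original slides); the helper name max_w2 is illustrative:

import numpy as np

def max_w2(w1, p, penalty=1.0):
    """Largest w2 such that the p-norm of (w1, w2) equals the given penalty."""
    if np.isinf(p):
        return penalty                              # infinity norm: max(|w1|, |w2|) = penalty
    return (penalty ** p - abs(w1) ** p) ** (1.0 / p)

for p in [1, 1.5, 2, 3, np.inf]:
    print(p, round(max_w2(0.5, p), 2))              # ≈ the w2 values in the table above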
p-norms visualized

• All p-norms penalize larger weights.
• p < 2 tends to create sparse solutions (i.e. lots of 0 weights).
• p > 2 tends to favor similar weights.
Regularizers summarized
• L1 is popular because it tends to result in sparse
solutions (i.e. lots of zero weights).
However, it is not differentiable, so it only works for gradient
descent solvers.
• L2 is also popular because for some loss functions it
can be solved directly (no gradient descent required,
though iterative solvers are often still used).
• Lp norms with p > 2 are less popular since they don't tend to shrink the
weights enough.
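As a minimal numerical sketch (not from the original slides), the L1 and L2 penalty terms that get added to the training loss can be written directly in NumPy; the weight vector and lambda value below are illustrative:

import numpy as np

def p_norm(w, p):
    """p-norm of a weight vector, matching the 'penalty' in the earlier table."""
    return np.sum(np.abs(w) ** p) ** (1.0 / p)

w = np.array([0.5, -1.5, 0.0, 2.0])   # example weights
lam = 0.01                            # regularization strength (lambda)

l1_term = lam * p_norm(w, 1)          # L1: promotes sparse (exactly zero) weights
l2_term = lam * p_norm(w, 2)          # L2: shrinks weights towards zero
# Note: frameworks such as Keras implement L2 as lam * sum(w**2), i.e. the
# squared norm, which has the same shrinking effect. Either term is added
# to the data loss during training.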
Regularization: Dropout

• In addition to L1 and L2 regularization, another well-known and
powerful regularization technique is dropout. The procedure behind
dropout regularization is quite simple.
• In a nutshell, dropout means that during training, each neuron of the
neural network gets turned off with some probability P. Let's look at a
visual example.
Dropout

• Assume that on the left side we have a feedforward neural
network with no dropout. Applying dropout with, say, a
probability of P = 0.5 that each neuron gets turned off
during training would result in the neural network shown
on the right side.
Dropout
• In this case, you can observe that approximately half of
the neurons are not active and are not considered part of
the neural network. As a result, the neural network
becomes simpler.
• A simpler version of the neural network has less
complexity, which can reduce overfitting. The deactivation
of neurons with a certain probability P is applied at each
forward propagation and weight update step.
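As a minimal sketch of these mechanics (not part of the original slides), a dropout mask can be applied to a layer's activations as follows; the array and function names are illustrative, and the 1/(1-p) rescaling is the commonly used "inverted dropout" convention:

import numpy as np

rng = np.random.default_rng(0)

def dropout_forward(activations, p=0.5, training=True):
    """Zero out each activation with probability p during training."""
    if not training:
        return activations                            # no dropout at test time
    keep_mask = rng.random(activations.shape) >= p    # keep with probability 1 - p
    return activations * keep_mask / (1.0 - p)        # rescale survivors (inverted dropout)

h = np.array([0.2, 1.5, -0.7, 0.9])   # example hidden-layer activations
print(dropout_forward(h, p=0.5))      # roughly half the entries become 0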
In a nutshell….
• Overfitting occurs in more complex neural network models (many
layers, many neurons)
• Complexity of the neural network can be reduced by using L1 and
L2 regularization as well as dropout
• L1 regularization forces the weight parameters to become zero
• L2 regularization forces the weight parameters towards zero (but
never exactly zero)
• Smaller weight parameters make some neurons negligible →
neural network becomes less complex → less overfitting
• During dropout, some neurons get deactivated with a random
probability P → Neural network becomes less complex → less
overfitting
Dataset augmentation
• Dataset augmentation is the process of generating data
artificially from the existing training data by applying minor
changes like rotations, flips, adding blur to some pixels of
the original image, or translations. Augmenting with
more data will make it harder for the neural network to
drive the training error to zero.
Data Augmentation
• By generating more data, the network will have a better chance of
performing well on the test data. Depending on the task at hand,
we might use all of the augmentation techniques and generate more
training data.
• To apply data augmentation, we can make use of the existing
methods in frameworks like Keras and PyTorch.
• In Keras, we can use ImageDataGenerator to augment or create more
data by applying transformations, and similarly, we can use the
transforms module in torchvision from PyTorch to augment data.
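For instance, here is a hedged sketch of the torchvision approach mentioned above; the specific transforms and parameters are illustrative, not prescribed by the slides:

from torchvision import transforms

# Example augmentation pipeline: random flips, small rotations, and blur
train_transforms = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomRotation(degrees=10),
    transforms.GaussianBlur(kernel_size=3),
    transforms.ToTensor(),
])
# Typically passed to a dataset, e.g.:
# dataset = torchvision.datasets.ImageFolder("data/train", transform=train_transforms)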
Early Stopping

• The idea behind early stopping is that while we're fitting a neural
network on the training data, the model is evaluated on unseen
data after each iteration. If the performance of the model on the
validation data is not improving, i.e. the validation error is increasing or
staying the same for a certain number of iterations, then there is no point in
training the model further. This process of stopping model training
before it reaches the lowest training error is known as early stopping.
Early Stopping

• Let's say we have set a patience of 5 epochs
(i.e. the number of epochs to wait before stopping early). For
5 epochs, we'll monitor the validation error, and if it
isn't improving (it either remains constant or increases)
while the training error decreases, then we don't want
to train any further, as sketched below.
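A minimal sketch of this patience logic (not from the original slides); max_epochs, train_one_epoch, evaluate, model and val_data are hypothetical placeholders for your own training setup:

best_val_error = float("inf")
patience = 5
epochs_without_improvement = 0

for epoch in range(max_epochs):
    train_one_epoch(model)                 # your training step (hypothetical helper)
    val_error = evaluate(model, val_data)  # your validation step (hypothetical helper)

    if val_error < best_val_error:
        best_val_error = val_error
        epochs_without_improvement = 0     # improvement: reset the counter
    else:
        epochs_without_improvement += 1    # no improvement this epoch

    if epochs_without_improvement >= patience:
        print(f"Early stopping at epoch {epoch}")
        break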
Early Stopping

• By using the early stopping technique, we're making
sure that the model doesn't memorize the patterns and
noise present in the training data. Instead, we're
pushing it towards generalizing beyond the training data.
• Early stopping can be applied manually during the
training process, or you can do even better by
integrating these rules into your experiment through the
hooks/callbacks provided in most common frameworks
like PyTorch, Keras and TensorFlow.
Sample code: L2 regularization on a Dense layer

from keras import regularizers
from keras.layers import Dense

# assumes an existing Sequential model; L2 penalty applied to this layer's weights
model.add(Dense(64, input_dim=64,
                kernel_regularizer=regularizers.l2(0.01)))

Here the value 0.01 is the regularization parameter, i.e. lambda, which we need to tune further.
We can optimize it using the grid-search method.
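A hedged sketch of such a grid search (the candidate lambda values and the build_model helper are illustrative, not from the slides):

candidate_lambdas = [0.0001, 0.001, 0.01, 0.1]
results = {}

for lam in candidate_lambdas:
    model = build_model(l2_strength=lam)   # hypothetical builder that sets regularizers.l2(lam)
    model.fit(x_train, y_train, epochs=10,
              validation_data=(x_val, y_val), verbose=0)
    results[lam] = model.evaluate(x_val, y_val, verbose=0)  # validation loss (no extra metrics assumed)

best_lambda = min(results, key=results.get)  # pick the lambda with the lowest validation loss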
Code: Dropout

In Keras, we can implement dropout using the core Dropout layer.
Below is the Python code for it:

from keras.layers import Dense, Dropout
from keras.models import Sequential

model = Sequential([
    Dense(hidden1_num_units, input_dim=input_num_units, activation='relu'),
    Dropout(0.25),                     # drop 25% of the previous layer's units
    Dense(output_num_units, activation='softmax'),
])

As you can see, we have defined 0.25 as the probability of dropping a unit.
We can tune it further for better results using the grid-search method.
Data Augmentation
Data augmentation can be a big leap in improving the accuracy of the model.
It can be considered an almost mandatory trick for improving our predictions.
In Keras, we can perform all of these transformations
using ImageDataGenerator. It has a long list of arguments which you
can use to pre-process your training data.
Below is the sample code to implement it.

from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(horizontal_flip=True)
datagen.fit(train)   # 'train' is the array of training images
Callbacks API

• A callback is an object that can perform actions at various
stages of training (e.g. at the start or end of an epoch, before
or after a single batch, etc.).
• You can use callbacks to:
• Write TensorBoard logs after every batch of training to monitor
your metrics
• Periodically save your model to disk
• Do early stopping
• Get a view on internal states and statistics of a model during
training
• ...and more
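A hedged sketch of wiring such callbacks into training (the log directory, file name and epoch count are illustrative, and the model and data variables are assumed to already exist):

from keras.callbacks import TensorBoard, ModelCheckpoint

callbacks = [
    TensorBoard(log_dir="./logs"),                       # write TensorBoard logs during training
    ModelCheckpoint("model.h5", save_best_only=True),    # periodically save the model to disk
]

model.fit(x_train, y_train,
          validation_data=(x_val, y_val),
          epochs=50,
          callbacks=callbacks)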
Code: Early Stopping

• In Keras, we can apply early stopping using a callback. Below is the sample
code for it.
from keras.callbacks import EarlyStopping

early_stop = EarlyStopping(monitor='val_loss', patience=5)

• Here, monitor denotes the quantity that needs to be monitored, and 'val_loss' denotes
the validation loss.
• Patience denotes the number of epochs with no further improvement after which
training will be stopped. For a better understanding, consider the validation error curve
again. After the dotted line, each epoch results in a higher validation error.
Therefore, 5 epochs after the dotted line (since our patience is equal to 5), our model
will stop training because no further improvement is seen.
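To take effect, the callback has to be passed to model.fit. A hedged usage sketch (the data variables and epoch count are illustrative; restore_best_weights is an optional Keras argument, not mentioned in the slides):

early_stop = EarlyStopping(monitor='val_loss', patience=5,
                           restore_best_weights=True)   # roll back to the best epoch's weights

model.fit(x_train, y_train,
          validation_data=(x_val, y_val),
          epochs=100,                 # upper bound; training may stop earlier
          callbacks=[early_stop])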
