
What is Regularization?

Regularization is a technique used in machine learning and deep learning to prevent overfitting and improve a model’s generalization performance. It involves adding a penalty term to the loss function during training.

This penalty discourages the model from becoming too complex or having large parameter values, which helps control the model’s ability to fit noise in the training data. Regularization methods in deep learning include L1 and L2 regularization, dropout, early stopping, and more. By applying regularization, deep learning models become more robust and better at making accurate predictions on unseen data.

Before we dive deep into the topic, take a look at this image:

Have you seen this image before? As we move towards the right in this image, our model tries to learn the details and the noise in the training data too well, which ultimately results in poor performance on unseen data.

In other words, while going toward the right, the complexity of the
model increases such that the training error reduces but the testing
error doesn’t. This is shown in the image below:

If you’ve built a neural network before, you know how complex they
are. This makes them more prone to overfitting.

Regularization is a technique that modifies the learning algorithm slightly so that the model generalizes better. This, in turn, improves the model’s performance on unseen data as well.
How does Regularization help Reduce Overfitting?

Let’s consider a neural network that is overfitting on the training data, as shown in the image below:

Assume that our regularization coefficient is so high that some of the weight matrices are nearly equal to zero.

This will result in a much simpler linear network and slight underfitting of the training data.

Such a large value of the regularization coefficient is not that useful. We need to optimize the value of the regularization coefficient to obtain a well-fitted model, as shown in the image below:


Different Regularization Techniques in Deep Learning

Now that we understand how regularization helps reduce overfitting, we’ll learn a few different techniques for applying regularization in deep learning.

L1 and L2 Regularization

L1 and L2 are the most common types of regularization in deep learning. These update the general cost function by adding another term known as the regularization term.

Cost function = Loss (say, binary cross-entropy) + Regularization term

Due to the addition of this regularization term, the values of the weight matrices decrease, because the penalty assumes that a neural network with smaller weight matrices leads to a simpler model. Therefore, it also reduces overfitting to quite an extent.

However, this regularization term differs in L1 and L2.

For L2: Cost function = Loss + λ · ||w||^2

For L1: Cost function = Loss + λ · ||w||

In L2, we have: ||w||^2 = Σ w_i^2. This is known as ridge regression, where lambda is the regularization parameter. It is the hyperparameter whose value is optimized for better results. L2 regularization is also known as weight decay, as it forces the weights to decay towards zero (but not exactly zero).
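To see where the name comes from, consider a plain gradient-descent update once the L2 penalty λ · ||w||^2 has been added to the cost (here η is the learning rate):

w ← w − η · (∂Loss/∂w + 2λw) = (1 − 2ηλ) · w − η · ∂Loss/∂w

Every update multiplies the weight by a factor slightly smaller than one before applying the usual gradient step, so the weights steadily decay towards zero.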

In L1, we have: ||w|| = Σ |w_i|. Here, we penalize the absolute value of the weights. Unlike L2, the weights may be reduced to zero. L1 regularization is also called lasso regression. Hence, it is very useful when we are trying to compress our model. Otherwise, we usually prefer L2 over it.

In Keras, we can directly apply regularization to any layer using the regularizers module.

Below is a sample of applying L2 regularization to a Dense layer:
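A minimal sketch using the tf.keras API; the layer width (64 units) and penalty strength (0.01) are illustrative values, not tuned ones:

from tensorflow.keras import layers, regularizers

# Dense layer whose kernel (weight matrix) carries an L2 penalty.
# The term 0.01 * sum(w_i^2) is added to the loss during training.
dense = layers.Dense(
    64,
    activation="relu",
    kernel_regularizer=regularizers.l2(0.01),
)

The same argument also accepts regularizers.l1(...) if an L1 penalty is wanted instead.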


Dropout

This is one of the most interesting types of regularization techniques. It also produces very good results and is consequently the most frequently used regularization technique in the field of deep learning.

To understand dropout, let’s say our neural network structure is similar to the one shown below:

So what does dropout do? At every iteration, it randomly selects some nodes and removes them along with all of their incoming and outgoing connections, as shown below:

Each iteration has a different set of nodes, which results in a different set of outputs. This can also be thought of as an ensemble technique in machine learning.

Ensemble models usually perform better than a single model as they capture more randomness. Similarly, dropout models also perform better than normal neural network models.

The probability of dropping each node is the hyperparameter of the dropout function. As seen in the image above, dropout can be applied to both the hidden layers and the input layer.

Due to these reasons, dropout is usually preferred when we have a large neural network structure, in order to introduce more randomness.
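As a minimal sketch in tf.keras (the layer sizes and input shape here are illustrative), a Dropout layer is simply placed between layers with a drop probability of 0.25:

from tensorflow.keras import layers, models

# During training, Dropout randomly zeroes 25% of the previous layer's
# activations on each forward pass; at inference time it is a no-op.
model = models.Sequential([
    layers.Dense(128, activation="relu", input_shape=(20,)),
    layers.Dropout(0.25),
    layers.Dense(1, activation="sigmoid"),
])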

As you can see, we have defined 0.25 as the probability of dropping. We can tune it further for better results using the grid search method.

Data Augmentation

The simplest way to reduce overfitting is to increase the size of the training data. In machine learning, however, increasing the training data size is often not feasible because labeled data is too costly.

But now, let’s consider we are dealing with images. In this case, there are a few ways of increasing the size of the training data: rotating the image, flipping, scaling, shifting, and so on. In the image below, some transformations have been applied to the handwritten digits dataset.

This technique is known as data augmentation. It usually provides a big leap in improving the accuracy of the model, and it can be considered a mandatory trick to improve our predictions.

In Keras, we can perform all of these transformations using ImageDataGenerator. It has a big list of arguments that you can use to pre-process your training data.
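A minimal sketch of setting up such a generator; the specific ranges below are illustrative values, not tuned ones:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Each argument turns on a random transformation applied on the fly.
datagen = ImageDataGenerator(
    rotation_range=20,        # rotate by up to 20 degrees
    width_shift_range=0.1,    # shift horizontally by up to 10% of the width
    height_shift_range=0.1,   # shift vertically by up to 10% of the height
    zoom_range=0.1,           # zoom in or out by up to 10%
    horizontal_flip=True,     # randomly flip images left to right
)

# datagen.flow(x_train, y_train, batch_size=32) then yields augmented
# batches that can be passed to model.fit()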

Early Stopping

Early stopping is a cross-validation strategy in which we keep one part of the training set as the validation set. When we see that the performance on the validation set is getting worse, we immediately stop training the model.

In the above image, we will stop training at the dotted line since, after
that, our model will start overfitting on the training data.
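In Keras, early stopping can be applied with the EarlyStopping callback. A minimal sketch that monitors the validation loss with a patience of 5 (the monitored metric and the epoch budget are illustrative choices):

from tensorflow.keras.callbacks import EarlyStopping

# Stop training once the validation loss has not improved for 5 epochs.
early_stopping = EarlyStopping(monitor="val_loss", patience=5)

# Used during training, for example:
# model.fit(x_train, y_train, validation_split=0.2,
#           epochs=100, callbacks=[early_stopping])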
Patience denotes the number of epochs with no further improvement
after which the training will be stopped. For a better understanding,
let’s look at the above image again. After the dotted line, each epoch
will result in a higher validation error value. Therefore, our model will
stop 5 epochs after the dotted line (since our patience equals 5) because
no further improvement is seen.
