
Regularization L1, L2

Early Stopping
Data Augmentation
Mr. Sivadasan E T
Associate Professor
Vidya Academy of Science and Technology, Thrissur
What is Regularization

Regularization is a method in machine learning used to reduce overfitting and improve the model's ability to generalize to unseen data.

It works by adding a penalty or constraint to the model's objective function (usually the loss function), discouraging it from fitting too closely to the training data.
Why Regularization

When a machine learning model is too complex, it can memorize the training data instead of learning the underlying patterns.

This leads to overfitting, where the model performs well on training data but poorly on new, unseen data. Regularization helps control this complexity.
How Does Regularization Work?
In a typical machine learning model, the goal is to minimize
the loss function, which measures the difference between the
model’s predictions and the actual values.

Regularization modifies this loss function by adding a penalty term based on the model's parameters.

New loss function:
Loss = Original Loss + Penalty Term
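As a minimal sketch (the slides themselves contain no code), the penalized loss for a simple linear model could be computed as below; the function name and the hyperparameter lam are illustrative, with lam playing the role of the regularization strength.

import numpy as np

# Illustrative only: the original loss is mean squared error for a linear
# model with weight vector w, and a penalty term is added on top of it.
def penalized_loss(w, X, y, lam=0.1, penalty="l2"):
    preds = X @ w                                # linear-model predictions
    original_loss = np.mean((preds - y) ** 2)    # original loss (MSE)
    if penalty == "l1":
        penalty_term = lam * np.sum(np.abs(w))   # L1: sum of absolute parameter values
    else:
        penalty_term = lam * np.sum(w ** 2)      # L2: sum of squared parameter values
    return original_loss + penalty_term          # Loss = Original Loss + Penalty Term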
L1 Regularization (Lasso):
Adds the sum of the absolute values of the model parameters
as a penalty.
Drives some parameters to zero, effectively removing them
(feature selection).
Formula for the penalty: λ Σ |wᵢ|  (λ controls the penalty strength, wᵢ are the model parameters)
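As an illustrative example of this sparsity effect (not part of the slides), scikit-learn's Lasso estimator applies an L1 penalty whose strength is set by alpha; with synthetic data where only two features matter, the remaining coefficients are driven to exactly zero.

import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
# Only the first two features actually matter; the rest are noise.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

lasso = Lasso(alpha=0.1)   # alpha sets the strength of the L1 penalty
lasso.fit(X, y)
print(lasso.coef_)         # coefficients of the irrelevant features are driven to 0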
L2 Regularization (Ridge):
Adds the sum of the squared values of the model parameters
as a penalty.
Reduces the magnitude of parameters but doesn't set them to
zero.
Formula for the penalty: λ Σ wᵢ²
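For comparison, a parallel sketch with scikit-learn's Ridge estimator (again illustrative, using the same kind of synthetic data): the L2 penalty shrinks all coefficients but does not set any of them to zero.

import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

ridge = Ridge(alpha=1.0)   # alpha sets the strength of the L2 penalty
ridge.fit(X, y)
print(ridge.coef_)         # all coefficients shrunk toward zero, none exactly zero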
Early Stopping

Neural networks are trained using variations of gradient-descent methods.

In most optimization models, gradient-descent methods are executed to convergence.

However, executing gradient descent to convergence optimizes the loss on the training data, but not necessarily on the out-of-sample test data.
Early Stopping

This is because the final few steps often overfit to the specific
nuances of the training data, which might not generalize well
to the test data.
Another common form of regularization is early stopping, in
which the gradient descent is ended after only a few
iterations.
One way to decide the stopping point is by holding out a part
of the training data, and then testing the error of the model on
the held-out set.
Early Stopping

The gradient-descent approach is terminated when the error on the held-out set begins to rise.

Early stopping essentially reduces the size of the parameter space to a smaller neighborhood around the initial values of the parameters.

From this point of view, early stopping acts as a regularizer because it effectively restricts the parameter space.
Early Stopping

In this method, a portion of the training data is held out as a validation set.

The backpropagation-based training is only applied to the portion of the training data that does not include the validation set.

At the same time, the error of the model on the validation set is continuously monitored.
Early Stopping

At some point, this error begins to rise on the validation set, even though it continues to decrease on the training set.
This is the point at which further training causes overfitting.
Therefore, this point can be chosen for termination.
Early Stopping

It is important to keep track of the best solution achieved so far in the learning process (as computed on the validation data).

This is because one does not perform early stopping after tiny increases in the out-of-sample error (which might be caused by noisy variations); rather, it is advisable to continue to train and check whether the error continues to rise.
Early Stopping

In other words, the termination point is chosen in hindsight, after the error on the validation set continues to rise and all hope of improving the error performance on the validation set is lost.
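The following is a minimal sketch of this procedure (not from the slides), using a simple SGD-trained linear model as a stand-in for a neural network; the synthetic data, patience value, and epoch budget are illustrative choices.

import numpy as np
from copy import deepcopy
from sklearn.linear_model import SGDRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))
y = X @ rng.normal(size=20) + rng.normal(scale=0.5, size=1000)

# Hold out a portion of the training data as a validation set.
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

model = SGDRegressor(learning_rate="constant", eta0=0.01)
best_err, best_model = float("inf"), None
patience, bad_epochs = 5, 0

for epoch in range(200):
    model.partial_fit(X_train, y_train)                      # gradient-descent updates on the training portion only
    val_err = np.mean((model.predict(X_val) - y_val) ** 2)   # monitor the error on the validation set
    if val_err < best_err:
        best_err, best_model, bad_epochs = val_err, deepcopy(model), 0   # keep the best solution so far
    else:
        bad_epochs += 1                                       # tolerate small, possibly noisy increases
        if bad_epochs >= patience:
            break                                             # terminate once the validation error keeps rising

model = best_model    # the termination point is chosen in hindsight: keep the best validation-error model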
Data Augmentation

A common trick to reduce overfitting in convolutional neural networks is the idea of data augmentation.

In data augmentation, new training examples are generated by using transformations on the original examples.
Data Augmentation

It works better in some domains than others. Image processing is one domain to which data augmentation is very well suited.

This is because many transformations, such as translation, rotation, patch extraction, and reflection, do not fundamentally change the properties of the object in an image.
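As an illustrative sketch (the slides do not name a library), such transformations can be expressed as a torchvision augmentation pipeline; the specific parameter values below are arbitrary.

from torchvision import transforms

# Augmentation pipeline combining the transformations mentioned above.
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),   # reflection
    transforms.RandomRotation(degrees=15),    # rotation
    transforms.RandomCrop(224, padding=8),    # translation / patch extraction
    transforms.ToTensor(),
])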
Data Augmentation

However, they do increase the generalization power of the model when it is trained with the augmented data set.

For example, if a data set is augmented with mirror images and reflected versions of all the bananas in it, then the model is able to better recognize bananas in different orientations.
Data Augmentation

Many of these forms of data augmentation require very little computation, and therefore the augmented images do not need to be explicitly generated up front.

Rather, they can be created at training time, when an image is being processed.
Data Augmentation

For example, while processing an image of a banana, it can be reflected into a modified banana at training time.

Similarly, the same banana might be represented in somewhat different color intensities in different images, and therefore it might be helpful to create representations of the same image in different color intensities.
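A sketch of this on-the-fly approach, again assuming PyTorch/torchvision and a hypothetical image folder "data/train": each time an image is drawn during training, a fresh random reflection and color-intensity change is applied, so the augmented images never need to be stored.

from torch.utils.data import DataLoader
from torchvision import datasets, transforms

train_transform = transforms.Compose([
    transforms.RandomHorizontalFlip(),                      # reflection at training time
    transforms.ColorJitter(brightness=0.3, contrast=0.3),   # vary color intensities
    transforms.ToTensor(),
])

# "data/train" is a hypothetical folder of images arranged by class.
train_set = datasets.ImageFolder("data/train", transform=train_transform)
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)
# The augmented images are generated as batches are drawn, not stored up front.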
Thank You!
