Module 3: Introduction to Regularization (03-Aug-2020)

Regularization techniques are used to reduce overfitting in machine learning models. They modify the learning algorithm to favor simpler models by adding constraints or penalty terms to the objective function. Common regularization strategies include L2 regularization, which penalizes weights with large magnitudes, driving them closer to zero. This helps control model complexity and improve generalization to new data.


Regularization for Deep Learning

• Goal: the model should perform well on the training data and on new inputs

• Regularization strategies are designed to reduce the test error,
  possibly at the expense of increased training error

• Regularization: modifying the learning algorithm to reduce its
  generalization (test) error but not its training error

• General regularization strategies
• Adding extra constraints on the ML model, e.g. restrictions on parameter values
• Introducing extra (penalty) terms in the cost / objective function

• Other approaches – ensemble methods


Regularization Strategies
• Generally based on regularizing estimators
• An effective regularizer makes a profitable trade-off: it decreases
  variance significantly without overly increasing bias
• Generalization and overfitting – three regimes of the model relative to
  the true data-generating process
  – Excludes the data-generating process (underfitting)
  – Matches the true data-generating process
  – Includes the generating process but also many others (overfitting)
• Model complexity
  – Rather than finding the model of exactly the right size with the right
    number of parameters,
  – it is usually better to fit a large model that has been regularized properly
  – The intention is to create a large, deep, regularized model
Parameter Norm Penalties
• Limit the model's capacity (neural networks, linear regression, logistic regression)
  – Add a parameter norm penalty Ω(θ) to the objective function:
    J̃(θ; X, y) = J(θ; X, y) + α Ω(θ), where α ∈ [0, ∞) weights the relative
    contribution of the penalty term (a minimal sketch follows this slide)
• For NNs the parameter norm penalty (PNP) affects the weights of each layer,
  while the biases remain unregularized
• w – vector of weights affected by the penalty
• θ – vector of all parameters, comprising w and the unregularized parameters
• Alternatively, NNs may use a different α coefficient for each layer
  of the network.
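The slide does not give code, so here is a minimal Python/NumPy sketch of a penalized objective J̃ = J + α·Ω for a linear model. The loss choice (mean squared error), the function names, and the value of α are illustrative assumptions; the key point is that only the weight vector w enters the penalty, not the bias.

```python
import numpy as np

def unregularized_loss(w, b, X, y):
    # J(theta; X, y): mean squared error of a linear model (an assumed example loss)
    residual = X @ w + b - y
    return 0.5 * np.mean(residual ** 2)

def l2_penalty(w):
    # Omega(theta) = 0.5 * ||w||_2^2 -- the bias b is deliberately left unregularized
    return 0.5 * np.dot(w, w)

def regularized_loss(w, b, X, y, alpha=0.1):
    # J_tilde = J + alpha * Omega; alpha >= 0 weights the penalty's contribution
    return unregularized_loss(w, b, X, y) + alpha * l2_penalty(w)
```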
L2 Parameter Regularization
• The simplest and most commonly used parameter norm penalty
• The L2 PNP is known as weight decay
• At each update, the gradient ∇w J is combined with an extra term λ·w that is
  subtracted from the weights, so the weights decay towards zero – weight update
  w ← w − ε(∇w J + λ·w)   (sketched below)
• Drives the weights closer to the origin by adding a regularization term
  Ω(w) = ½‖w‖₂² to the objective function
• Also known as ridge regression / Tikhonov regularization
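A minimal sketch of one weight-decay gradient step, assuming `grad_J` is the gradient of the unregularized loss at the current weights; the learning rate and λ values are illustrative, not taken from the slides.

```python
import numpy as np

def weight_decay_step(w, grad_J, lr=0.01, lam=0.1):
    # w <- w - lr * (grad_J + lam * w)
    #    = (1 - lr * lam) * w - lr * grad_J
    # The multiplicative (1 - lr*lam) factor shrinks w toward the origin on
    # every step, which is why the L2 penalty is called "weight decay".
    return (1.0 - lr * lam) * w - lr * grad_J
```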
Regularization (revisited)
• Regularization refers to the act of modifying a learning
  algorithm to favor "simpler" prediction rules to avoid
  overfitting.
• Most commonly, regularization refers to modifying the
  loss function to penalize certain values of the weights being
  learned.
• Specifically, penalize weights that are large.
• Identify large weights using the L2 norm of w – the vector's length /
  Euclidean norm: ‖w‖₂ = √(Σᵢ wᵢ²)
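A one-line sketch of the Euclidean norm computation, with made-up numbers for illustration:

```python
import numpy as np

w = np.array([3.0, -4.0])            # illustrative weight vector
l2_norm = np.sqrt(np.sum(w ** 2))    # same as np.linalg.norm(w)
print(l2_norm)                       # 5.0
```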
L2 Regularization (ctd..)

• New goal for minimization: the loss function plus the penalty,
  L(w) + λ‖w‖₂²

• By minimizing this, we prefer solutions where w is closer to 0.

• λ – hyperparameter that adjusts the trade-off between having low training
  loss and having low weights (illustrated in the sketch below)
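The λ trade-off can be made concrete with a small experiment. The sketch below fits a linear model by gradient descent on the penalized objective 0.5·MSE + λ‖w‖₂² for several λ values; the data, learning rate, and step count are illustrative assumptions. Larger λ should give a smaller weight norm at the price of a slightly higher training error.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=50)

def fit(lmbda, lr=0.05, steps=2000):
    w = np.zeros(3)
    for _ in range(steps):
        # gradient of 0.5*MSE + lmbda*||w||^2
        grad = X.T @ (X @ w - y) / len(y) + 2.0 * lmbda * w
        w -= lr * grad
    return w

for lmbda in (0.0, 0.1, 1.0):
    w = fit(lmbda)
    mse = np.mean((X @ w - y) ** 2)
    print(f"lambda={lmbda:<4} ||w||={np.linalg.norm(w):.3f}  train MSE={mse:.4f}")
```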
L2 Regularization (ctd..)
• Assuming no bias (i.e. θ is just w), the regularized cost function is
  J̃(w; X, y) = (α/2)·wᵀw + J(w; X, y)
• And the gradient is
  ∇w J̃(w; X, y) = α·w + ∇w J(w; X, y)
• Updating the weights with learning rate ε:
  w ← w − ε(α·w + ∇w J) = (1 − εα)·w − ε·∇w J

• To analyse further, make a quadratic approximation to J around
  w* = arg min J(w), the weights that yield the minimal unregularized
  training cost
L2 Regularization (ctd..)
•  
• Then J becomes
• H – Hessian matrix of J
• Minimum of J occurs at
• Adding weight decay gradient

– When =0, approaches


– When  grows perform eigen decomposition of H
L2 Regularization (ctd..)
• H is decomposed into a diagonal matrix of eigenvalues Λ and an orthonormal
  basis of eigenvectors Q as H = Q Λ Qᵀ
• Therefore, w̃ becomes
  w̃ = Q (Λ + αI)⁻¹ Λ Qᵀ w*
  i.e. the component of w* along the i-th eigenvector of H is rescaled by
  the factor λᵢ / (λᵢ + α)  (verified numerically below)
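A small numerical check of the two expressions above, using a toy symmetric positive-definite matrix as the "Hessian" and an assumed unregularized minimizer w*; the numbers are made up for illustration.

```python
import numpy as np

H = np.array([[3.0, 1.0],
              [1.0, 2.0]])           # toy symmetric positive-definite Hessian
w_star = np.array([1.0, -2.0])       # assumed unregularized minimizer
alpha = 0.5

# Direct formula: w_tilde = (H + alpha*I)^{-1} H w*
w_tilde = np.linalg.solve(H + alpha * np.eye(2), H @ w_star)

# Via eigendecomposition H = Q diag(lam) Q^T
lam, Q = np.linalg.eigh(H)
w_tilde_eig = Q @ np.diag(lam / (lam + alpha)) @ Q.T @ w_star

print(np.allclose(w_tilde, w_tilde_eig))   # True: both forms agree
print(lam / (lam + alpha))                 # per-eigendirection shrinkage factors
```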
L2 Regularization (ctd..)
• Extending to linear regression – the cost function J is the sum of squared errors
  J = (Xw − y)ᵀ (Xw − y)

• Applying L2 regularization modifies J to
  (Xw − y)ᵀ (Xw − y) + (α/2) wᵀw

• Therefore the weight-decay solution becomes
  w = (XᵀX + αI)⁻¹ Xᵀ y
  instead of the ordinary least-squares solution w = (XᵀX)⁻¹ Xᵀ y
  (see the sketch below)