REGULARIZATION
DATA SET AUGMENTATION
• Regularization: Overview, Parameter Norm Penalties, Norm Penalties as Constrained Optimization, Regularization and Underconstrained Problems, Data Augmentation, Noise Robustness, Batch Normalization, Semi-Supervised Learning, Multi-Task Learning, Early Stopping, Parameter Tying and Parameter Sharing, Sparse Representations, Bagging, Dropout. Tuning Neural Networks, Hyperparameters

DATA AUGMENTATION
• More data is better:
• The best way to make an ML model generalize better is to train it on more data
• In practice, the amount of data is limited
• We can get around the problem by creating synthesized data
• For some ML tasks, it is straightforward to synthesize new data

• Augmentation for classification:
• Data augmentation is easiest for classification
• A classifier takes a high-dimensional input x and summarizes it with a single category identity y
• The main task of a classifier is to be invariant to a wide variety of transformations
• We can generate new samples (x, y) just by transforming the inputs
• This approach is not easily generalized to other problems
• For a density estimation problem, it is not possible to generate new data without first solving the density estimation problem

• Effective for object recognition:
• Dataset augmentation is very effective for the classification problem of object recognition
• Images are high-dimensional and include a variety of variations, many of which are easily simulated
• Translating the images a few pixels can greatly improve performance
• Even when the model is designed to be invariant using convolution and pooling
• Rotating and scaling the images are also effective

• Main data augmentation methods:

• Caution in data augmentation:
• Do not apply a transformation that would change the class
• OCR example: 'b' vs 'd' and '6' vs '9'
• Horizontal flips and 180-degree rotations are not appropriate for such tasks
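The generate-new-samples-by-transforming-inputs idea above can be sketched as a label-preserving translation. This is a minimal numpy illustration, not from the slides; the 4x4 array stands in for an image:

```python
import numpy as np

def translate(img, dx, dy):
    """Shift an image dx pixels right and dy pixels down.
    np.roll wraps around, so the wrapped-in border is zeroed instead."""
    out = np.roll(img, (dy, dx), axis=(0, 1))
    if dy > 0:
        out[:dy, :] = 0
    elif dy < 0:
        out[dy:, :] = 0
    if dx > 0:
        out[:, :dx] = 0
    elif dx < 0:
        out[:, dx:] = 0
    return out

def augment(x, y, rng):
    """New labeled sample (x', y): small random shift, label unchanged.
    Translation is assumed label-preserving here; a horizontal flip would
    NOT be safe for OCR classes like 'b' vs 'd'."""
    dx, dy = (int(v) for v in rng.integers(-2, 3, size=2))
    return translate(x, dx, dy), y

rng = np.random.default_rng(0)
img = np.arange(16.0).reshape(4, 4)   # toy 4x4 "image"
x_new, y_new = augment(img, 7, rng)   # same label, shifted pixels
```

Each call yields another training pair with the same category identity, which is exactly the invariance the classifier is being asked to learn.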
• Some transformations are not easy to perform
• Out-of-plane rotation cannot be implemented as a simple geometric operation on the pixels

REGULARIZATION
NOISE ROBUSTNESS

• Noise injection:
• Noise applied to the inputs is a form of data augmentation
• For some models, the addition of noise with infinitesimal variance at the input is equivalent to imposing a penalty on the norm of the weights, e.g., λwᵀw

• Noise applied to hidden units:
• Noise injection can be much more powerful than simply shrinking the parameters
• Noise applied to the hidden units is so important that it merits its own separate discussion
• Dropout is the main development of this approach

• Adding noise to weights:
• This technique is primarily used with RNNs
• It can be interpreted as a stochastic implementation of Bayesian inference over the weights
• Bayesian learning considers model weights to be uncertain and representable via a probability distribution p(w) that reflects that uncertainty
• Adding noise to the weights is a practical, stochastic way to reflect this uncertainty

• Adding noise to weights (regression view):
• Noise applied to the weights is equivalent to a traditional form of regularization, encouraging stability
• This can be seen in a regression setting
• Train ŷ(x) to map x to a scalar using the least-squares cost between the model prediction and the true values y:
  J = E_{p(x,y)}[(ŷ(x) − y)²]
• We perturb the weights with random noise ε_W ∼ N(ε; 0, ηI), so each presentation of an example uses the perturbed model ŷ_{ε_W}(x)
• For small η, this is equivalent to adding a regularization term η E_{p(x,y)}[‖∇_W ŷ(x)‖²]
• It encourages the parameters to go to regions where small perturbations of the weights have a small influence on the output

• Injecting noise at the output targets:
• Most datasets have some mistakes in the y labels
• It is harmful to maximize log p(y|x) when y is a mistake
• To prevent this, we explicitly model the noise on the labels
• Example: assume the training-set label y is correct with probability 1 − ε, and otherwise any of the other labels may be correct
• This assumption can be incorporated into the cost function
• Example: label smoothing regularizes a model based on a softmax with k output values by replacing the hard 0 and 1 classification targets with ε/(k − 1) and 1 − ε, respectively

REGULARIZATION
SEMI-SUPERVISED LEARNING

• Both unlabeled examples from P(x) and labeled examples from P(x, y) are used to estimate P(y|x) or predict y from x
• In the context of deep learning, semi-supervised learning usually refers to learning a representation h = f(x)
• The goal is to learn a representation so that examples from the same class have similar representations
• Unsupervised learning can provide useful clues for how to group examples in representation space
• Examples that cluster tightly in the input space should be mapped to similar representations
• A linear classifier in the new space may then achieve better generalization
• A variant is the application of PCA as a preprocessing step before applying a classifier to the projected data
• Instead of separate unsupervised and supervised components in the model, one can construct models in which a generative model of either P(x) or P(x, y) shares parameters with a discriminative model of P(y|x)
• One can then trade off the supervised criterion −log P(y|x) with the unsupervised or generative one (such as −log P(x) or −log P(x, y))
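The PCA-preprocessing variant of semi-supervised learning mentioned above can be sketched as follows. This is a minimal numpy illustration, not from the slides; the nearest-centroid rule stands in for an arbitrary simple classifier, and the data is synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)

# Plenty of unlabeled data from P(x): two elongated clusters in 2-D.
unlabeled = np.vstack([
    rng.normal([0, 0], [3.0, 0.3], size=(500, 2)),
    rng.normal([8, 0], [3.0, 0.3], size=(500, 2)),
])

# Unsupervised step: fit PCA on the unlabeled data alone.
mean = unlabeled.mean(axis=0)
_, _, vt = np.linalg.svd(unlabeled - mean, full_matrices=False)
components = vt[:1]                      # keep the top principal direction

def project(x):
    """The learned representation h = f(x): a 1-D projection."""
    return (x - mean) @ components.T

# Supervised step: only a handful of labeled examples from P(x, y).
labeled_x = np.array([[0.5, 0.2], [-1.0, -0.1], [7.5, 0.1], [9.0, -0.2]])
labeled_y = np.array([0, 0, 1, 1])

# Nearest-centroid classifier in the projected space.
h = project(labeled_x)
centroids = np.array([h[labeled_y == c].mean(axis=0) for c in (0, 1)])

def predict(x):
    d = np.abs(project(x) - centroids.ravel())  # distance to each centroid
    return int(np.argmin(d))
```

The unlabeled examples shape the representation, so the classifier fit on just four labeled points can still separate the clusters.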
• The generative criterion then expresses a prior belief about the solution to the supervised problem

REGULARIZATION
MULTI-TASK LEARNING

• Sharing parameters over tasks:
• Multi-task learning is a way to improve generalization by pooling the examples of several tasks
• The extra examples can be seen as providing soft constraints on the parameters
• In the same way that additional training examples put more pressure on the parameters of the model towards values that generalize well
• Different supervised tasks predict y(i) given x
• They share the same input x, as well as some intermediate representation h(shared) capturing a common pool of factors

• Common multi-task situation:
• A common input but different target random variables
• The lower layers (whether the network is feedforward or includes a generative component with downward arrows) can be shared across such tasks
• Task-specific parameters h(1), h(2) can be learned on top of those, yielding a shared representation h(shared)
• A common pool of factors explains the variations of the input x, while each task is associated with a subset of these factors

• The model can be divided into two parts:
1. Task-specific parameters
• These only benefit from the examples of their own task to achieve good generalization
• These are the upper layers of the neural network
2. Generic parameters
• These are shared across all tasks
• They benefit from the pooled data of all tasks
• These are the lower layers of the neural network
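The shared-trunk structure above can be sketched as a forward pass. This is a minimal numpy illustration; the layer sizes, task choices, and names are assumptions for the example, not part of the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

# Generic parameters: shared lower layer, trained on pooled data from all tasks.
W_shared = 0.1 * rng.normal(size=(4, 8))   # input dim 4 -> h(shared) dim 8

# Task-specific parameters: one head per task, trained only on its own examples.
W_task1 = 0.1 * rng.normal(size=(8, 3))    # task 1: 3-way classification scores
W_task2 = 0.1 * rng.normal(size=(8, 1))    # task 2: scalar regression

def shared_representation(x):
    """h(shared): lower layers common to every task (ReLU trunk)."""
    return np.maximum(0.0, x @ W_shared)

def predict_task1(x):
    """h(1): task-specific upper layer for task 1."""
    return shared_representation(x) @ W_task1

def predict_task2(x):
    """h(2): task-specific upper layer for task 2."""
    return shared_representation(x) @ W_task2

x = rng.normal(size=(5, 4))                # a batch of 5 inputs, common to both tasks
y1 = predict_task1(x)                      # shape (5, 3)
y2 = predict_task2(x)                      # shape (5, 1)
```

Gradients from both tasks would flow into W_shared, which is how the pooled examples act as soft constraints on the generic parameters.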