
EC 9170
Deep Learning for Electrical & Computer Engineers

Lecture 01: Deep feedforward networks
Part II

11th March 2024

Faculty of Engineering, University of Jaffna
Backpropagation Algorithm

Chain Rule of Calculus
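For reference, a standard statement of the rule (not necessarily the slide's exact notation): if y = g(x) and z = f(y), then

$$\frac{dz}{dx} \;=\; \frac{dz}{dy}\,\frac{dy}{dx}$$

Backpropagation is a systematic, repeated application of this rule through the layers of the network.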


Backpropagation Algorithm Cont…

y depends on two variables, y1 and y2

y1 and y2 both depend on x
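Written out for this setting (a standard form, assuming y = f(y1, y2) with y1 and y2 both functions of x):

$$\frac{dy}{dx} \;=\; \frac{\partial y}{\partial y_1}\,\frac{dy_1}{dx} \;+\; \frac{\partial y}{\partial y_2}\,\frac{dy_2}{dx}$$

The derivative sums the contributions from every path through which x influences y.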
Backpropagation Algorithm Cont…

1. Backpropagation - Forward Pass
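As a concrete illustration, a minimal NumPy sketch of the forward pass for a one-hidden-layer network; the sigmoid activation and the matrix notation are assumptions, not taken from the slide:

import numpy as np

def sigmoid(z):
    # Logistic activation, assumed here for the whole sketch.
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, W2, b2):
    # Hidden layer: weighted sum of the inputs plus bias, then the activation.
    net_h = W1 @ x + b1
    out_h = sigmoid(net_h)
    # Output layer: weighted sum of the hidden activations plus bias, then the activation.
    net_o = W2 @ out_h + b2
    out_o = sigmoid(net_o)
    # The intermediate values are kept because the backward pass reuses them.
    return net_h, out_h, net_o, out_o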


Backpropagation Algorithm Cont…
2. Backpropagation - Backward Pass
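Continuing the sketch above under the same assumptions, plus a squared-error loss E = 0.5 * sum((t - out_o)**2), the backward pass applies the chain rule layer by layer, starting at the output:

import numpy as np

def backward(x, t, W2, out_h, out_o):
    # dE/d(net_o): loss derivative times the sigmoid derivative at the output units.
    delta_o = (out_o - t) * out_o * (1.0 - out_o)
    # Gradients of the output-layer weights and bias.
    dW2 = np.outer(delta_o, out_h)
    db2 = delta_o
    # Propagate the error signal back through W2 to the hidden units.
    delta_h = (W2.T @ delta_o) * out_h * (1.0 - out_h)
    dW1 = np.outer(delta_h, x)
    db1 = delta_h
    return dW1, db1, dW2, db2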
➢ Backpropagation Example

Network used in the example (values from the figure):
• Inputs: x1 = 0.05, x2 = 0.10
• Hidden layer (h1, h2): w1 = 0.15 and w2 = 0.20 into h1; w3 = 0.25 and w4 = 0.30 into h2; bias b1 = 0.35
• Output layer (o1, o2): w5 = 0.40 and w6 = 0.45 into o1; w7 = 0.50 and w8 = 0.55 into o2; bias b2 = 0.60
• Targets: o1 = 0.01, o2 = 0.99
➢ Backpropagation Example Cont…

1. Compute the output of the hidden layer

2. Compute the output of the output layer
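A worked version of steps 1 and 2 with the numbers from the figure, assuming sigmoid activations (the activation is an assumption; expected intermediate values are shown as comments):

import math
sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))

x1, x2 = 0.05, 0.10
w1, w2, w3, w4, b1 = 0.15, 0.20, 0.25, 0.30, 0.35
w5, w6, w7, w8, b2 = 0.40, 0.45, 0.50, 0.55, 0.60

# Step 1: output of the hidden layer
net_h1 = w1 * x1 + w2 * x2 + b1      # 0.3775
out_h1 = sigmoid(net_h1)             # ~0.5933
net_h2 = w3 * x1 + w4 * x2 + b1      # 0.3925
out_h2 = sigmoid(net_h2)             # ~0.5969

# Step 2: output of the output layer
net_o1 = w5 * out_h1 + w6 * out_h2 + b2   # ~1.1059
out_o1 = sigmoid(net_o1)                  # ~0.7514
net_o2 = w7 * out_h1 + w8 * out_h2 + b2   # ~1.2249
out_o2 = sigmoid(net_o2)                  # ~0.7729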


➢ Backpropagation Example Cont…

3. Calculate the error for each output
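Continuing the snippet above, step 3 with a squared-error loss (the ½-factor convention is an assumption):

# Step 3: error of each output and the total error
target_o1, target_o2 = 0.01, 0.99
E_o1 = 0.5 * (target_o1 - out_o1) ** 2   # ~0.2748
E_o2 = 0.5 * (target_o2 - out_o2) ** 2   # ~0.0236
E_total = E_o1 + E_o2                    # ~0.2984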


➢ Backpropagation Example Cont…
4.1 Backpropagation - Backward Pass - Output layer

➢ Backpropagation Example Cont…
Putting it all together:
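Continuing the same snippet, a hedged sketch of what "putting it all together" amounts to for one output-layer weight, w5: the three local derivatives of the chain are multiplied.

# Step 4.1: gradient of the total error with respect to w5
dE_dout_o1   = out_o1 - target_o1          # ~0.7414  (d E_total / d out_o1)
dout_dnet_o1 = out_o1 * (1.0 - out_o1)     # ~0.1868  (sigmoid derivative)
dnet_dw5     = out_h1                      # ~0.5933  (d net_o1 / d w5)
dE_dw5 = dE_dout_o1 * dout_dnet_o1 * dnet_dw5   # ~0.0822

# A gradient-descent update would then be w5 <- w5 - eta * dE_dw5,
# where the learning rate eta is not given on the slides.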
➢ Backpropagation Example Cont…
4.2 Backpropagation - Backward Pass - Hidden layer

Step 01:
Step 02: Plugging them together:
Step 03:
Step 04:
Step 05:
Step 06:
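Under the same assumptions, the steps above combine into the gradient for a hidden-layer weight such as w1; the error signals from both outputs are first propagated back to h1:

# Step 4.2: gradient of the total error with respect to w1
delta_o1 = (out_o1 - target_o1) * out_o1 * (1.0 - out_o1)   # ~0.1385
delta_o2 = (out_o2 - target_o2) * out_o2 * (1.0 - out_o2)   # ~-0.0381

dE_dout_h1   = delta_o1 * w5 + delta_o2 * w7   # ~0.0364  (contributions from o1 and o2)
dout_dnet_h1 = out_h1 * (1.0 - out_h1)         # ~0.2413  (sigmoid derivative)
dnet_dw1     = x1                              # 0.05     (d net_h1 / d w1)
dE_dw1 = dE_dout_h1 * dout_dnet_h1 * dnet_dw1  # ~0.00044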


Underfitting and Overfitting
• Underfitting: low accuracy on the training dataset, low accuracy on the test dataset
• Optimal: high accuracy on the training dataset, high accuracy on the test dataset
• Overfitting: high accuracy on the training dataset, low accuracy on the test dataset
Solutions for Overfitting
• Increase the size of the dataset, e.g. data augmentation
• Regularization
• L1
• L2
• Dropout
• Bagging/Ensemble models
• Early stopping
➢ Data Augmentation
Data augmentation techniques in computer vision:
• Cropping.
• Flipping.
• Rotation.
• Translation.
• Brightness.
• Contrast.
• Color Augmentation.
• Saturation.
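As an illustration (not from the slides), most of these operations are available off the shelf; a minimal sketch using torchvision, which is an assumed choice of library:

# Sketch of an augmentation pipeline; torchvision is an assumption, the slides name no library.
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomCrop(224, padding=8),                      # cropping
    transforms.RandomHorizontalFlip(p=0.5),                     # flipping
    transforms.RandomRotation(degrees=15),                      # rotation
    transforms.RandomAffine(degrees=0, translate=(0.1, 0.1)),   # translation
    transforms.ColorJitter(brightness=0.2, contrast=0.2,
                           saturation=0.2),                     # brightness / contrast / colour / saturation
    transforms.ToTensor(),
])

# Applied to a PIL image each time it is loaded: augmented = augment(image)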
➢ Regularization for deep learning
• Regularization is any modification made to the learning algorithm with the intention of lowering
the generalization error but not the training error.
• In the context of deep learning, most regularization strategies involve regularizing estimators.
This is done by reducing variance at the expense of increasing the estimator's bias.
• An effective regularizer is one that decreases the variance significantly while not overly
increasing the bias.
• Controlling the complexity of the model is not simply a matter of finding the right model size and the right number of parameters.
• Instead, deep learning relies on finding the best-fitting model, a large model that has been
properly regularized.
➢ L1/L2 Regularization
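For reference, the standard penalised objectives (a common formulation; the slide's own notation may differ), with loss J(w), weights w and regularization strength λ:

$$\tilde{J}_{L2}(w) \;=\; J(w) + \frac{\lambda}{2}\,\lVert w \rVert_2^2
\qquad\qquad
\tilde{J}_{L1}(w) \;=\; J(w) + \lambda\,\lVert w \rVert_1$$

The L2 penalty shrinks all weights towards zero, while the L1 penalty tends to drive many weights exactly to zero and therefore produces sparse models.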
➢ Dropout
• Dropout provides a computationally inexpensive but powerful method of
regularizing a broad family of models.
• Dropout provides an inexpensive approximation to training and evaluating a
bagged ensemble of exponentially many neural networks.
• Specifically, dropout trains the ensemble consisting of all sub-networks that can
be formed by removing non-output units from an underlying base network.
➢ Training with Dropout
• To train with dropout, we use a minibatch-based learning algorithm that makes
small steps, such as stochastic gradient descent.
• Each time we load an example into a minibatch, we randomly sample a different
binary mask to apply to all of the input and hidden units in the network.
• The mask for each unit is sampled independently from all of the others.
• Typically, the probability of including a hidden unit is 0.5, while the probability of
including an input unit is 0.8.
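A minimal NumPy sketch of the mask sampling just described, using the keep probabilities from the slide (0.8 for input units, 0.5 for hidden units); the ReLU hidden layer and the division by the keep probability ("inverted dropout") are added assumptions:

import numpy as np

def dropout_mask(shape, keep_prob, rng):
    # Independent binary mask: each unit is kept with probability keep_prob.
    return (rng.random(shape) < keep_prob).astype(np.float32)

def forward_with_dropout(x, W1, b1, W2, b2, rng):
    # A fresh mask is sampled for every example loaded into a minibatch.
    m_in = dropout_mask(x.shape, 0.8, rng)        # input units kept with probability 0.8
    x = x * m_in / 0.8                            # inverted-dropout rescaling (assumption)
    h = np.maximum(0.0, W1 @ x + b1)              # ReLU hidden layer (assumption)
    m_h = dropout_mask(h.shape, 0.5, rng)         # hidden units kept with probability 0.5
    h = h * m_h / 0.5
    return W2 @ h + b2                            # output units are never dropped

# Usage: rng = np.random.default_rng(0); y = forward_with_dropout(x, W1, b1, W2, b2, rng)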
➢ Bagging/Ensemble models
• Bagging (short for bootstrap aggregating) is a technique for reducing
generalization error by combining several models.
• Bagging is defined as follows:
• Train k different models on k different subsets of training data, constructed to
have the same number of examples as the original dataset through random
sampling from that dataset with replacement.
• Have all of the models vote on the output for test examples.
• Techniques employing bagging are called ensemble models.
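A sketch of this recipe; train(X, y) and the models' predict method are hypothetical placeholders for whatever base learner is used:

import numpy as np

def bagging(X, y, k, train, rng):
    # Train k models, each on a bootstrap sample of the same size as the original dataset.
    n = len(X)
    models = []
    for _ in range(k):
        idx = rng.integers(0, n, size=n)      # sampling with replacement
        models.append(train(X[idx], y[idx]))
    return models

def vote(models, X_test):
    # Majority vote over the class predictions (assumes non-negative integer labels).
    preds = np.stack([m.predict(X_test) for m in models])   # shape (k, n_test)
    return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, preds)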
➢ Bagging/Ensemble models
• Bagging works because different models will usually not all make the same errors
on the test set.
• This is a direct result of training on k different subsets of the training data, each
subset missing some of the examples from the original dataset.
• Other factors, such as differences in random initialization, random selection of
mini-batches, differences in hyperparameters, or different outcomes of
non-deterministic neural network implementations, are often enough to cause
different members of the ensemble to make partially independent errors.
➢ Early stopping
• When training models with sufficient representational capacity to overfit the
task, we often observe that training error decreases steadily over time while the
error on the validation set begins to rise again.
• In the applications we care about, this behaviour is almost certain to occur.
• This means we can obtain a model with better validation set error (and thus,
hopefully, better test set error) by returning to the parameter setting at the
point in time with the lowest validation set error.
• This is termed Early Stopping.
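A minimal sketch of the procedure; train_one_epoch and validation_error are hypothetical helpers, not part of the lecture:

import copy

def train_with_early_stopping(model, train_one_epoch, validation_error,
                              max_epochs, patience):
    best_err = float("inf")
    best_model = copy.deepcopy(model)
    epochs_since_best = 0
    for _ in range(max_epochs):
        train_one_epoch(model)
        err = validation_error(model)
        if err < best_err:
            # Remember the parameter setting with the lowest validation error so far.
            best_err, best_model = err, copy.deepcopy(model)
            epochs_since_best = 0
        else:
            epochs_since_best += 1
            if epochs_since_best >= patience:
                break                          # validation error has stopped improving
    return best_model, best_err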
Thank you!
