2024/2025
Deep Learning for Computer Vision
Lecture 7: Batch Normalization and Avoiding Overfitting
Outline
• Batch Normalization
• Batch Normalization Algorithm
• Underfitting and Overfitting
• Avoiding Overfitting
➢ Cleaning Dataset
➢ Dropout Regularization
➢ Dataset Augmentation
➢ Regularization
➢ Other regularization techniques
Batch Normalization
• Although using ELU (or any variant of ReLU) can remarkably reduce the vanishing/exploding gradients
problem at the beginning of training, it cannot guarantee that vanishing/exploding
gradients won't come back during training.
• To address this issue, Batch Normalization (BN) can be implemented.
• BN consists of adding an operation to the model just before or after the activation function of each
hidden layer.
• We will simply zero-center and normalize each input.
• This operation allows the model to learn the optimal scale and the mean of each layer's inputs.
• To zero-center and normalize the inputs, we need to estimate each input's mean (μ_B)
and standard deviation (σ_B) over the current mini-batch.
Batch Normalization Algorithm
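For a mini-batch B of size m_B, the standard BN operation computes:

$\mu_B = \frac{1}{m_B} \sum_{i=1}^{m_B} \mathbf{x}^{(i)}$ (the mini-batch mean)

$\sigma_B^2 = \frac{1}{m_B} \sum_{i=1}^{m_B} \left( \mathbf{x}^{(i)} - \mu_B \right)^2$ (the mini-batch variance)

$\hat{\mathbf{x}}^{(i)} = \dfrac{\mathbf{x}^{(i)} - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}$ (the zero-centered, normalized input)

$\mathbf{z}^{(i)} = \gamma \otimes \hat{\mathbf{x}}^{(i)} + \beta$ (the scaled and shifted output)

where γ (scale) and β (shift) are parameters learned per layer, and ε is a small smoothing term that avoids division by zero.

In Keras, BN is a single layer. A minimal sketch (the layer sizes, ELU activations, and 28x28 input shape are illustrative assumptions, not fixed by the lecture):

from tensorflow import keras

# BN inserted after each hidden layer's activation; it can equally be
# placed before the activation by using a separate Activation layer.
model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    keras.layers.Dense(300, activation="elu"),
    keras.layers.BatchNormalization(),
    keras.layers.Dense(100, activation="elu"),
    keras.layers.BatchNormalization(),
    keras.layers.Dense(10, activation="softmax"),
])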
Underfitting and Overfitting
❑ Underfitting: The trained model does not fit the dataset well; in other words, the trained model
underestimates the dataset.
❑ Overfitting: The trained model fits the dataset exceptionally well, but not the unseen data.
❑ Good fitting: The trained model fits the dataset relatively well, as well as the unseen data.
[Figure: sample data fitted three ways, illustrating Underfitting, Good Fitting, and Overfitting]
Avoiding Overfitting Through Regularization
The challenge of overfitting can be addressed through:
• Early Stopping (see the sketch below).
• Cleaning data.
• Dropout.
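Early stopping can be implemented in Keras with a callback that halts training once the validation loss stops improving. A minimal sketch (the patience value is an illustrative choice; the model and data splits are assumed to exist):

from tensorflow import keras

# Stop when the validation loss has not improved for 10 consecutive
# epochs, and roll back to the best weights seen during training.
early_stopping = keras.callbacks.EarlyStopping(
    patience=10, restore_best_weights=True)

# Hypothetical usage, assuming a compiled model and data splits exist:
# model.fit(X_train, y_train, epochs=100,
#           validation_data=(X_valid, y_valid),
#           callbacks=[early_stopping])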
Avoiding Overfitting
Cleaning Dataset
• Reduce the number of features: manually select which features to keep (a minimal sketch follows).
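As an illustration, manual feature selection can be as simple as keeping a chosen subset of columns. A minimal sketch using pandas (the file name and column names are hypothetical):

import pandas as pd

# Keep only the manually selected features; drop the rest.
df = pd.read_csv("data.csv")                        # hypothetical dataset
selected = ["feature_a", "feature_b", "feature_c"]  # hypothetical columns
X = df[selected]
y = df["label"]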
Avoiding Overfitting
Dropout Regularization
• Dropout randomly removes some neurons at every training iteration, as sketched below.
• Models trained with dropout normally generalize better and memorize less of the training data.
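In Keras, dropout is a layer that randomly drops a fraction of its inputs at every training step and is disabled automatically at inference. A minimal sketch (the 20% rate, layer sizes, and input shape are illustrative assumptions):

from tensorflow import keras

# Each Dropout layer randomly "drops" 20% of its inputs during training.
model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    keras.layers.Dropout(rate=0.2),
    keras.layers.Dense(300, activation="elu"),
    keras.layers.Dropout(rate=0.2),
    keras.layers.Dense(100, activation="elu"),
    keras.layers.Dropout(rate=0.2),
    keras.layers.Dense(10, activation="softmax"),
])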
Dataset Augmentation
➢ More data means better model.
➢ Data can be augmented by
creating fake data and adding
them to the dataset.
➢ This can be done by applying
transformation on the existing
dataset to get synthesized data.
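For example, Keras's ImageDataGenerator synthesizes transformed copies of the training images on the fly. A minimal sketch (the transformation parameters and data names are illustrative assumptions):

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Generate randomly shifted, rotated, zoomed and flipped variants of
# the training images, so each epoch sees newly synthesized examples.
datagen = ImageDataGenerator(
    rotation_range=15,       # rotations of up to 15 degrees
    width_shift_range=0.1,   # horizontal shifts
    height_shift_range=0.1,  # vertical shifts
    zoom_range=0.1,          # zooming in/out
    horizontal_flip=True,    # left-right flips
)

# Hypothetical usage, assuming image arrays X_train and y_train exist:
# model.fit(datagen.flow(X_train, y_train, batch_size=32), epochs=30)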
Regularization
• Regularization is used to tweak the loss function by adding a penalty term to the error function.
• The additional term helps control an excessively fluctuating function by penalizing coefficients that take extreme values.
• This way we help the network keep its weights small.
• Large weights in an ANN normally indicate a more complex network that is overfitting.
• An L1 or L2 vector-norm penalty can be added to the optimization of the NN to produce smaller weights.
• L1 and L2 regularization constrain the NN's connection weights (but typically not the biases).
L1: sum of the absolute weights.
L2: sum of the squared weights.
L1L2: sum of the absolute and the squared weights.
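In symbols, for a loss function $\mathcal{L}$, weights $w_i$, and regularization factor(s) $\lambda$ (standard formulation):

L1: $\mathcal{L}_{reg} = \mathcal{L} + \lambda \sum_i |w_i|$

L2: $\mathcal{L}_{reg} = \mathcal{L} + \lambda \sum_i w_i^2$

L1L2: $\mathcal{L}_{reg} = \mathcal{L} + \lambda_1 \sum_i |w_i| + \lambda_2 \sum_i w_i^2$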
Regularization
• If you want to apply L1 regularization, replace keras.regularizers.l2() with
keras.regularizers.l1(); for both L1 and L2, use keras.regularizers.l1_l2(), as sketched below.
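A minimal sketch of attaching an L2 penalty to a layer's connection weights (the 0.01 factor and layer size are illustrative assumptions):

from tensorflow import keras

# The penalty applies to the connection weights (kernel) only; the
# bias is left unregularized, matching the typical practice above.
layer = keras.layers.Dense(
    100,
    activation="elu",
    kernel_regularizer=keras.regularizers.l2(0.01),
)

# Variants:
# kernel_regularizer=keras.regularizers.l1(0.01)
# kernel_regularizer=keras.regularizers.l1_l2(l1=0.01, l2=0.01)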