What Is Regularization?
Before we dive deep into the topic, take a look at this image:
Have you seen this image before? As we move toward the right in this
image, our model tries to learn the details and the noise in the
training data too well, ultimately resulting in poor performance on
unseen data.
In other words, as we move toward the right, the complexity of the
model increases such that the training error keeps reducing but the
testing error doesn’t. This is shown in the image below:
If you’ve built neural networks before, you know how complex they
are. This complexity makes them more prone to overfitting.
Regularization combats this by penalizing or constraining the model’s
complexity so that it generalizes better to unseen data. Below are some
of the most widely used regularization techniques.
L2 & L1 Regularization
L1 and L2 are the most common types of regularization. Both update the
general cost function by adding a regularization term that penalizes
large weights. Here, $\lambda$ is the regularization hyperparameter,
$m$ is the number of training examples, and the sum runs over all
weights $w$ of the network.
For L2:

$$\text{Cost function} = \text{Loss} + \frac{\lambda}{2m} \sum \lVert w \rVert^2$$

For L1:

$$\text{Cost function} = \text{Loss} + \frac{\lambda}{2m} \sum \lVert w \rVert$$
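To make this concrete, here is a minimal sketch of how these penalties are typically attached to layers in Keras (an assumed framework here; the layer sizes and the factor 0.01 are arbitrary illustrative choices, with that factor playing the role of $\lambda$):

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers

# kernel_regularizer adds the penalty to the training loss:
# l2(0.01) contributes 0.01 * sum(w^2); l1(0.01) contributes 0.01 * sum(|w|).
model = keras.Sequential([
    layers.Dense(128, activation="relu", input_shape=(784,),
                 kernel_regularizer=regularizers.l2(0.01)),  # L2 penalty
    layers.Dense(64, activation="relu",
                 kernel_regularizer=regularizers.l1(0.01)),  # L1 penalty
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

Larger values of the factor push the weights harder toward zero: L2 shrinks them smoothly, while L1 tends to drive some weights exactly to zero.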
Data Augmentation
Another way to reduce overfitting is to enlarge the training data
itself. Suppose we are dealing with images. In that case, there are a
few ways of increasing the size of the training data: rotating the
image, flipping, scaling, shifting, etc. In the image below, some such
transformations have been applied to the handwritten digits dataset.
This technique is known as data augmentation. It usually provides a
big leap in the accuracy of the model, and it can be considered a
mandatory trick for improving our predictions.
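As a sketch of how such transformations are typically generated, assuming the TensorFlow/Keras ImageDataGenerator API (the data here is a random placeholder, not a real dataset):

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Placeholder images and labels standing in for a real dataset:
# shape (num_samples, height, width, channels) plus integer labels.
x_train = np.random.rand(100, 28, 28, 1)
y_train = np.random.randint(0, 10, size=(100,))

# Randomly rotate, shift, zoom (scale), and flip each image.
datagen = ImageDataGenerator(
    rotation_range=20,       # rotate by up to 20 degrees
    width_shift_range=0.1,   # shift horizontally by up to 10%
    height_shift_range=0.1,  # shift vertically by up to 10%
    zoom_range=0.1,          # zoom in or out by up to 10%
    horizontal_flip=True,    # flip left-right; use with care on digits
)

# flow() yields an endless stream of augmented batches.
batches = datagen.flow(x_train, y_train, batch_size=32)
images, labels = next(batches)
```

Note that the sensible set of transformations depends on the data: flipping a handwritten 2 horizontally no longer looks like a 2, so for digits one would usually drop the flips and keep small rotations and shifts.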
Early Stopping
Early stopping holds out part of the training set as a validation set
and halts training as soon as performance on that validation set stops
improving. In the above image, we will stop training at the dotted line
since, after that, our model will start overfitting on the training data.
Patience denotes the number of epochs with no further improvement
after which training will be stopped. For a better understanding,
let’s look at the above image again. After the dotted line, each epoch
results in a higher validation error. Therefore, our model will stop
5 epochs after the dotted line (since our patience equals 5) because
no further improvement is seen.
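This behavior maps directly onto Keras’s EarlyStopping callback; here is a minimal self-contained sketch (the model, random placeholder data, and epoch count are illustrative assumptions):

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.callbacks import EarlyStopping

# Random placeholder data standing in for a real dataset.
x_train = np.random.rand(1000, 784)
y_train = np.random.randint(0, 10, size=(1000,))

model = keras.Sequential([
    layers.Dense(64, activation="relu", input_shape=(784,)),
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Stop once validation loss has gone 5 epochs without improving
# (patience=5, as in the text), and keep the best weights seen.
early_stopping = EarlyStopping(monitor="val_loss", patience=5,
                               restore_best_weights=True)

model.fit(x_train, y_train,
          validation_split=0.2,  # hold out 20% of the data for validation
          epochs=100,            # an upper bound; training may stop earlier
          callbacks=[early_stopping])
```

With restore_best_weights=True, the model is rolled back to the epoch with the lowest validation error rather than keeping the weights from the final, already overfitting, epochs.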