Module-4_3
The left image shows the constraint function (green area) for the
L1 regularization and the right image shows the constraint
function for the L2 regularization. The red ellipses are contours
of the loss function that is used during gradient descent. In
the center of the contours there is a set of optimal weights for
which the loss function has a global minimum.
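The two pictures can be written as one constrained problem. This is a sketch in notation the slide does not use: $L$ stands for the loss, and $t$ for an assumed budget that sets the size of the green area.

```latex
\min_{W_1, W_2} \; L(W_1, W_2)
\quad \text{subject to} \quad
\begin{cases}
|W_1| + |W_2| \le t & \text{(L1: diamond)} \\
W_1^2 + W_2^2 \le t^2 & \text{(L2: circle of radius } t\text{)}
\end{cases}
```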
Visualization
• In both L1 and L2 regularization, the estimates of W1 and W2
are given by the first point where the loss contours (ellipses)
touch the green constraint area.
• Since L2 regularization has a circular constraint area, the
intersection generally won’t occur on an axis, and thus the
estimates for W1 and W2 will generally both be non-zero.
• In the case of L1, the constraint area has a diamond shape
with corners on the axes, so the contours of the loss function
will often intersect the constraint region at an axis. When this
occurs, one of the estimates (W1 or W2) will be zero.
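The geometric argument in the bullets can be checked numerically. The following is a minimal sketch, not the course's own code. It assumes a hypothetical quadratic loss with elliptical contours (`w_star` and `curv` are made-up values) and a constraint budget `t = 1`, and it simply searches points on the boundary of each constraint region for the lowest loss. Because the assumed optimum lies outside both regions, the constrained estimate must sit on the boundary.

```python
import numpy as np

# Hypothetical quadratic loss with elliptical contours (assumed values).
w_star = np.array([2.0, 0.3])  # assumed unregularized optimum (contour center)
curv = np.array([1.0, 2.0])    # assumed curvature along W1 and W2


def loss(w):
    """Loss for one point or an array of points of shape (n, 2)."""
    return np.sum(curv * (w - w_star) ** 2, axis=-1)


def boundary(norm, t=1.0, n=3600):
    """Sample n points on the boundary of the L1 diamond or the L2 circle."""
    theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    pts = np.stack([np.cos(theta), np.sin(theta)], axis=1)
    if norm == "l1":
        # Rescale each direction so that |W1| + |W2| = 1 (diamond shape).
        pts = pts / np.abs(pts).sum(axis=1, keepdims=True)
    return t * pts


def constrained_estimate(norm):
    """Boundary point with the smallest loss (brute-force search)."""
    pts = boundary(norm)
    return pts[np.argmin(loss(pts))]


w_l1 = constrained_estimate("l1")
w_l2 = constrained_estimate("l2")
print("L1 estimate:", w_l1)
print("L2 estimate:", w_l2)
```

For these assumed values the L1 estimate lands on the diamond's corner on the W1 axis, so W2 is zero, while the L2 estimate keeps both weights non-zero. Moving `w_star` closer to the diagonal (for example `[2.0, 1.8]`) makes the L1 contact point fall on an edge instead, which is why the slide says the axis intersection happens often rather than always.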