Regularization

Underfitting occurs when a hypothesis function is too simple to model the trends in the data, while overfitting occurs when a hypothesis function fits the available data too closely and does not generalize to new data. To address underfitting, more features should be added. To address overfitting, features should be manually selected, model selection algorithms implemented, or regularization used. Regularization works by adding a penalty term for large parameter values to prevent overfitting.

Uploaded by

Toshinari Tong
Copyright
© All Rights Reserved

overfitting

[Figure: three example fits, left to right: underfitting / high bias, ok fit, overfitting / high variance]

Underfitting happens when the form of our hypothesis function maps poorly to the trend of the
data. It is usually caused by a function that is too simple or uses too few features.

Overfitting is caused by a hypothesis function that fits the available data but does not generalize
well to predict new data. It is usually caused by a complicated function that creates a lot of
unnecessary curves and angles unrelated to the data.
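The difference between the two cases can be seen by fitting polynomials of different degrees to the
same data. The following is a minimal sketch, not part of the original notes: the quadratic trend,
the noise level, and the names `x_train`, `x_test`, and `fit_errors` are all assumptions chosen for
illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: a quadratic trend plus Gaussian noise (an assumption).
def make_data(x):
    return 1.0 + 2.0 * x - 3.0 * x**2 + rng.normal(0.0, 0.2, x.size)

x_train = np.linspace(-1.0, 1.0, 12)
y_train = make_data(x_train)
x_test = np.linspace(-0.95, 0.95, 50)
y_test = make_data(x_test)

def fit_errors(degree):
    """Fit a polynomial of the given degree; return (train MSE, test MSE)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_err, test_err

# Degree 1 is too simple for this data, degree 2 matches the trend,
# and degree 9 has enough freedom to chase the noise in 12 points.
errors = {d: fit_errors(d) for d in (1, 2, 9)}
for degree, (train_err, test_err) in errors.items():
    print(f"degree {degree}: train MSE {train_err:.4f}, test MSE {test_err:.4f}")
```

The training error can only go down as the degree increases, so a low training error alone does not
tell us whether the hypothesis generalizes. An underfit model typically shows high error on both
sets, while an overfit model typically shows a low training error and a noticeably higher test error.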

To solve the problem of underfitting:

• Add more features, for example higher-degree polynomial terms.

To solve the problem of overfitting:

• Manually select which features to keep.
• Implement a model selection algorithm.
• Implement regularization.

regularization
$$
J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \mathrm{cost}\left(h_\theta(x^{(i)}), y^{(i)}\right) + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^2
$$

To reduce the effect of every parameter except the bias term, we multiply the sum of the squared
parameters by lambda (λ) and add the result to the cost function. If the parameters are large,
which tends to cause overfitting, the regularization term penalizes them and the cost becomes
very large. Because we minimize the cost function, the parameters are pushed toward smaller
values and the hypothesis smooths out. The larger λ is, the stronger this effect. Note that we
do not regularize the bias term θ_0, the first parameter.
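As an illustration, the regularized cost can be sketched in NumPy. This is a minimal sketch under
stated assumptions, not the notes' own code: it assumes a linear hypothesis h(x) = θᵀx, a
squared-error cost, and a design matrix `X` whose first column is all ones; the function name
`regularized_cost` is hypothetical.

```python
import numpy as np

def regularized_cost(theta, X, y, lam):
    """Squared-error cost plus an L2 penalty on every parameter except theta[0].

    Assumes X has shape (m, n + 1) with a leading column of ones,
    so theta[0] is the bias term.
    """
    m = y.size
    residual = X @ theta - y
    data_term = (residual @ residual) / (2 * m)
    # theta[1:] skips the bias term, which is not regularized.
    penalty = lam / (2 * m) * np.sum(theta[1:] ** 2)
    return data_term + penalty

X = np.array([[1.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, 2.0])

# theta = [0, 1] fits this data exactly, so only the penalty contributes.
cost_fit = regularized_cost(np.array([0.0, 1.0]), X, y, lam=4.0)
# theta = [5, 0] has a large bias but no penalized parameters.
cost_bias = regularized_cost(np.array([5.0, 0.0]), X, y, lam=4.0)
```

With these inputs the first call is pure penalty and the second is pure data error, which shows
that the bias term contributes nothing to the regularization term.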

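$$
\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right) x_j^{(i)} + \frac{\lambda}{m} \theta_j
$$

$$
\theta_j := \theta_j - \alpha \frac{\partial J(\theta)}{\partial \theta_j}
$$

$$
\theta_j := \theta_j - \frac{\alpha}{m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right) x_j^{(i)} - \frac{\alpha\lambda}{m} \theta_j
$$

$$
\theta_j := \left(1 - \frac{\alpha\lambda}{m}\right) \theta_j - \frac{\alpha}{m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right) x_j^{(i)}
$$
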
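The last step only regroups terms, which a quick numeric check makes concrete. The values below
(`alpha`, `lam`, `m`, `theta_j`, and `data_grad`, standing for the averaged sum in the update) are
arbitrary assumptions chosen for illustration.

```python
# Arbitrary illustrative values (assumptions, not from the notes).
alpha, lam, m = 0.1, 1.0, 100
theta_j = 2.0
data_grad = 0.5  # stands for (1/m) * sum((h(x) - y) * x_j)

# Form 1: subtract alpha times the full regularized gradient.
two_step = theta_j - alpha * (data_grad + (lam / m) * theta_j)

# Form 2: shrink theta_j first, then take the usual gradient step.
shrink_factor = 1 - alpha * lam / m
shrink_form = shrink_factor * theta_j - alpha * data_grad

print(two_step, shrink_form, shrink_factor)
```

Both forms give the same new value of the parameter. The factor multiplying the parameter is
slightly below 1 for typical choices of α, λ, and m, so each iteration shrinks the parameter a
little before applying the ordinary gradient step.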
vectorized implementation:

$$
\theta := \left(1 - \frac{\alpha\lambda}{m}\right) \theta - \frac{\alpha}{m} X^T \left(h_\theta(X) - y\right)
$$
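The vectorized update can be sketched in NumPy as follows. This is a minimal sketch that assumes a
linear hypothesis h(X) = Xθ and a design matrix whose first column is all ones; the function name
`regularized_gradient_descent` and the sample data are hypothetical. The vectorized formula as
written applies the shrink factor to every component of θ, so the sketch sets the factor to 1 for
the first component, following the note that the bias term is not regularized.

```python
import numpy as np

def regularized_gradient_descent(X, y, alpha, lam, num_iters):
    """Batch gradient descent for L2-regularized linear regression.

    Assumes X has shape (m, n + 1) with a leading column of ones.
    """
    m, num_params = X.shape
    theta = np.zeros(num_params)
    # Per-parameter shrink factor (1 - alpha * lam / m); the bias term
    # gets a factor of 1 so that it is not regularized.
    shrink = np.full(num_params, 1 - alpha * lam / m)
    shrink[0] = 1.0
    for _ in range(num_iters):
        error = X @ theta - y  # h(X) - y for a linear hypothesis
        theta = shrink * theta - (alpha / m) * (X.T @ error)
    return theta

# Hypothetical noise-free data on the line y = 1 + 2x.
x = np.linspace(0.0, 1.0, 20)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x

theta_plain = regularized_gradient_descent(X, y, alpha=0.5, lam=0.0, num_iters=2000)
theta_reg = regularized_gradient_descent(X, y, alpha=0.5, lam=10.0, num_iters=2000)
```

With λ = 0 the procedure reduces to ordinary gradient descent and recovers the line used to
generate the data. With λ > 0 the slope is pulled toward zero, which is the smoothing effect
described above.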

regularized normal equation


$$
\theta = \left(X^T X + \lambda L\right)^{-1} X^T y
$$

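$$
L = \begin{bmatrix} 0 & & & \\ & 1 & & \\ & & \ddots & \\ & & & 1 \end{bmatrix}
$$

L is the (n+1) × (n+1) identity matrix with its top-left entry set to 0, so that the bias term is
not regularized.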
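A closed-form solve along these lines can be sketched in NumPy. The sketch assumes that L is the
identity matrix with its top-left entry set to zero, so that the bias term is not penalized, and
that `X` has a leading column of ones; the function name `regularized_normal_equation` and the
sample data are hypothetical.

```python
import numpy as np

def regularized_normal_equation(X, y, lam):
    """Solve (X^T X + lam * L) theta = X^T y for theta.

    Assumes X has shape (m, n + 1) with a leading column of ones, and that
    L is the identity with L[0, 0] = 0 so the bias term is not regularized.
    """
    L = np.eye(X.shape[1])
    L[0, 0] = 0.0
    # Solving the linear system avoids forming the inverse explicitly.
    return np.linalg.solve(X.T @ X + lam * L, X.T @ y)

# Hypothetical noise-free data on the line y = 1 + 2x.
x = np.linspace(0.0, 1.0, 20)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x

theta_exact = regularized_normal_equation(X, y, lam=0.0)
theta_ridge = regularized_normal_equation(X, y, lam=10.0)
```

Using `np.linalg.solve` rather than computing the inverse is a design choice made here for
numerical stability; it yields the same θ as the formula when the matrix is invertible.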
