Machine Learning Math Essentials - 12.02.2025
• Reducible errors: These errors can be reduced to improve model
accuracy. They can be further classified into bias and variance.
• Irreducible errors: These errors will always be present in the
model, regardless of which algorithm is used. They are caused by
unknown variables whose effect cannot be reduced.
What is bias in machine learning?
• For example, a linear regression model may have a high bias if the
data has a non-linear relationship.
• A linear algorithm tends to have high bias; its strong assumptions
make it fast to learn. The simpler the algorithm, the more bias it is
likely to introduce, whereas a nonlinear algorithm often has low bias.
• Some examples of machine learning algorithms with low bias are
Decision Trees, k-Nearest Neighbours and Support Vector
Machines. Algorithms with high bias include Linear
Regression, Linear Discriminant Analysis and Logistic
Regression.
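The contrast between a high-bias linear model and a low-bias nearest-neighbour model can be sketched on a small example. This is only an illustration under assumptions not in the slides: the data are a noise-free quadratic y = x^2, and the helper names (fit_line, knn_predict, mse) are hypothetical.

```python
def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = a*x + b."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = my - a * mx
    return a, b


def knn_predict(x, xs, ys, k=1):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x))[:k]
    return sum(ys[i] for i in nearest) / k


def mse(preds, targets):
    """Mean squared error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)


# Assumed toy data: a non-linear (quadratic) relationship without noise.
train_x = [i / 2 for i in range(-6, 7)]      # -3.0, -2.5, ..., 3.0
train_y = [x ** 2 for x in train_x]
test_x = [x + 0.2 for x in train_x[:-1]]     # points between the training inputs
test_y = [x ** 2 for x in test_x]

a, b = fit_line(train_x, train_y)
line_mse = mse([a * x + b for x in test_x], test_y)
knn_mse = mse([knn_predict(x, train_x, train_y) for x in test_x], test_y)

print("linear regression test MSE:", line_mse)
print("1-nearest-neighbour test MSE:", knn_mse)
```

Because the data are symmetric around zero, the fitted line is flat and cannot follow the curve however much data it sees. That systematic gap is what the slides call high bias.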
Characteristics of a high bias model include:
1. Low-Bias, Low-Variance:
The combination of low bias and low variance describes an ideal
machine learning model. However, this is not achievable in
practice.
2. Low-Bias, High-Variance: With low bias and high variance,
model predictions are inconsistent but accurate on average.
This case occurs when the model has a large number of
parameters, and it leads to overfitting.
3. High-Bias, Low-Variance: With high bias and low
variance, predictions are consistent but inaccurate on
average. This case occurs when a model does not learn well
from the training dataset or uses too few parameters.
It leads to underfitting.
4. High-Bias, High-Variance:
With high bias and high variance, predictions are
inconsistent and also inaccurate on average.
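The "consistent vs. accurate on average" wording can be made concrete by retraining a model on many noisy copies of the same data and looking at how its predictions spread. The sketch below is an illustration under stated assumptions, not a method from the slides: the true function x^2, the noise level, and the two toy models (a constant mean predictor and a 1-nearest-neighbour predictor) are all chosen here for demonstration.

```python
import random


def true_f(x):
    """Assumed ground-truth function for the simulation."""
    return x ** 2


def mean_model(train_x, train_y):
    """High-bias, low-variance model: always predicts the training mean."""
    m = sum(train_y) / len(train_y)
    return lambda x: m


def one_nn_model(train_x, train_y):
    """Low-bias, high-variance model: copies the nearest training label."""
    def predict(x):
        i = min(range(len(train_x)), key=lambda j: abs(train_x[j] - x))
        return train_y[i]
    return predict


def bias2_and_variance(make_model, n_repeats=300, noise_sd=1.0, seed=0):
    """Estimate average squared bias and variance over resampled training sets."""
    rng = random.Random(seed)
    train_x = [i / 2 for i in range(-6, 7)]
    test_x = [x + 0.2 for x in train_x[:-1]]
    preds = [[] for _ in test_x]
    for _ in range(n_repeats):
        # Same inputs each time, fresh noise on the targets.
        train_y = [true_f(x) + rng.gauss(0, noise_sd) for x in train_x]
        model = make_model(train_x, train_y)
        for k, x in enumerate(test_x):
            preds[k].append(model(x))
    bias2 = 0.0
    variance = 0.0
    for k, x in enumerate(test_x):
        avg = sum(preds[k]) / n_repeats
        bias2 += (avg - true_f(x)) ** 2
        variance += sum((p - avg) ** 2 for p in preds[k]) / n_repeats
    return bias2 / len(test_x), variance / len(test_x)


mean_bias2, mean_var = bias2_and_variance(mean_model)
knn_bias2, knn_var = bias2_and_variance(one_nn_model)

print(f"mean model: bias^2={mean_bias2:.3f}  variance={mean_var:.3f}")
print(f"1-NN model: bias^2={knn_bias2:.3f}  variance={knn_var:.3f}")
```

Under these assumptions the mean model falls into case 3 (consistent but inaccurate on average), while the 1-NN model falls into case 2 (accurate on average but inconsistent).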
Bias-Variance Trade-Off
• If the algorithm is too simple (a hypothesis with a linear
equation), it may have high bias and low variance, and is thus
error-prone.
• If the algorithm is too complex (a hypothesis with a
high-degree equation), it may have high variance and low bias.
In this case, the model will not perform well on new entries.
• For accurate predictions, the algorithm needs both low
variance and low bias.
• The total error is the sum of the squared bias and the variance
(plus the irreducible error). The optimal region is where bias
and variance are balanced, giving the model complexity with
minimum total error.
Mathematical Representation
• The prediction error of a machine learning model
can be written mathematically as follows:
Error = Bias² + Variance + Irreducible error
• To minimize the model's prediction error, we need to
choose the model complexity so that bias and variance
are balanced.
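For reference, the error formula can be written out term by term. The notation here is an assumption, since the slides do not define symbols: data are generated as y = f(x) + ε with E[ε] = 0 and Var(ε) = σ², and f̂ is the model fitted on a randomly drawn training set, with the expectation taken over training sets and noise.

```latex
\[
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{Bias}^2}
  + \underbrace{\mathbb{E}\big[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^2\big]}_{\text{Variance}}
  + \underbrace{\sigma^2}_{\text{Irreducible error}}
\]
```

Only the first two terms depend on the chosen model, which is why model complexity can trade bias against variance but cannot reduce the third term.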
Techniques to Balance Bias and Variance