
Machine Learning Math Essentials - 12.02.2025

The document discusses the concepts of probability in relation to machine learning, differentiating between frequentist and Bayesian probabilities. It explains bias and variance as sources of error in predictive models, emphasizing the importance of balancing these errors to avoid underfitting and overfitting. Techniques for reducing bias and variance are also outlined to improve model accuracy.


Turning data into probabilities

Probability represents a degree of certainty: the weight you assign to the chance that an event will happen.
E.g. when rolling a die, the chance of getting a 6 is 1/6 ≈ 16.67%.
That is the certainty you assign to that particular event. This is called probability, or more precisely frequentist probability.
Frequentist probability denotes the frequency with which an event occurs across many trials.
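As a quick illustrative sketch (a synthetic simulation using NumPy; the trial counts are arbitrary choices, not from the slides), the empirical frequency of rolling a 6 approaches the theoretical 1/6 as the number of trials grows:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Roll a fair die many times and compare the empirical frequency of a 6
# with the theoretical probability 1/6.
for n_trials in (100, 10_000, 1_000_000):
    rolls = rng.integers(1, 7, size=n_trials)  # uniform integers 1..6
    freq = np.mean(rolls == 6)
    print(f"{n_trials:>9} trials: empirical P(6) = {freq:.4f}  (theory {1/6:.4f})")
```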
Not all scenarios are frequency-related in the sense of the example above.
• Consider an ML algorithm in which we estimate the probability of inflation or deflation of the price of a fuel.
• The latter phenomenon is called Bayesian probability. Rather than considering the frequency with which an event repeats, we quantify our belief.
• For example: there is a 32% chance that a diabetic patient is going to develop heart failure.
• This statement is not amenable to repetition; we cannot create infinite replicas of the patient and their symptoms. We instead quantify, with 32% certainty, our belief that heart failure could happen.
• Altogether, probability measures the extent of certainty pertaining to an uncertain event.
• Formulating a simple but uncertain rule is better than formulating a complex but certain rule: it is cheaper to generate and analyze.
• Moreover, a certain rule does not always guarantee generating the right and required output. An uncertain rule, though non-deterministic, helps in reaching a generalized conclusion.
Turning Data into Probabilities
• Machine Learning is an interdisciplinary field that
uses statistics, probability, algorithms to learn from
data and provide insights which can be used to
build intelligent applications.
• In probability theory, an event is a set of outcomes
of an experiment to which a probability is assigned.
• If E represents an event, then P(E) represents the probability that E will occur. A situation where E might happen (success) or might not happen (failure) is called a trial.
• This event can be anything like tossing a coin, rolling a
die or pulling a colored ball out of a bag. In these
examples the outcome of the event is random, so the
variable that represents the outcome of these events is
called a random variable.
• Theoretical probability, on the other hand, is given by the number of ways the particular event can occur divided by the total number of possible outcomes. A head can occur in one way and there are two possible outcomes (head, tail), so the true (theoretical) probability of a head is 1/2.
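To make the counting definition concrete, here is a small sketch (an assumed example, not from the slides) that enumerates a sample space and computes a theoretical probability as favourable outcomes over total outcomes:

```python
from fractions import Fraction
from itertools import product

# The full sample space of two fair dice: 36 equally likely outcomes.
sample_space = list(product(range(1, 7), repeat=2))

# Theoretical probability = favourable outcomes / total outcomes.
favourable = [roll for roll in sample_space if sum(roll) == 7]
p = Fraction(len(favourable), len(sample_space))
print(f"P(sum of two dice = 7) = {p} = {float(p):.4f}")  # 6/36 = 1/6
```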
Bias and Variance in Machine Learning
• Machine learning is a branch of Artificial Intelligence,
which allows machines to perform data analysis and
make predictions.
• However, if the machine learning model is not accurate, it can make prediction errors, and these prediction errors are usually known as bias and variance.
• In machine learning, these errors will always be present, as there is always a slight difference between the model's predictions and the actual values.
• The main aim of ML/data science analysts is to reduce
these errors in order to get more accurate results.
• In this topic, we are going to discuss bias and variance, the bias-variance trade-off, underfitting and overfitting. But before starting, let's first understand what errors in machine learning are.
Errors in Machine Learning
• An error is a measure of how accurately an algorithm can make predictions for a previously unseen dataset. On the basis of these errors, we select the machine learning model that performs best on the particular dataset. There are mainly two types of errors in machine learning:
• Reducible errors: These errors can be reduced to improve the model accuracy. Such errors can further be classified into bias and variance.
• Irreducible errors: These errors will always be present in the model regardless of which algorithm has been used. Their cause is unknown variables whose influence cannot be reduced.
What is bias in machine learning?

• Bias is simply defined as the inability of the model to capture the true relationship in the data, because of which some difference, or error, occurs between the model's predicted value and the actual value. These differences between actual or expected values and the predicted values are known as bias error, or error due to bias.
• Bias is a systematic error that occurs due to
wrong assumptions in the machine learning
process.
• In general, a machine learning model analyses the data, finds patterns in it and makes predictions. While training, the model learns these patterns in the dataset and applies them to test data for prediction. While making predictions, a difference occurs between the values predicted by the model and the actual/expected values, and this difference is known as bias error, or error due to bias. It can be seen as the inability of machine learning algorithms such as linear regression to capture the true relationship between the data points.
Bias vs. variance, and the tradeoff

•Bias and variance are two sources of error in predictive models. Getting the right balance in the bias-variance tradeoff is fundamental to effective machine learning algorithms. Here is a quick explanation of these concepts:
•Bias. Bias refers to error caused by a model that is oversimplified for the complex problem it is solving, makes significant assumptions, and misses important relationships in your data.
•Variance. Variance is an error caused by an
algorithm that is too sensitive to fluctuations
in data, creating an overly complex model
that sees patterns in data that are actually
just randomness.
•Bias–variance tradeoff. Minimizing errors
caused by oversimplification and excessive
complication requires finding the right
balance or tradeoff between the two.
Bias
• Low Bias: Low bias value means fewer assumptions are taken to
build the target function. In this case, the model will closely match the
training dataset.
•High Bias: High bias value means more assumptions are taken to
build the target function. In this case, the model will not match the
training dataset closely. A high bias model also cannot perform
well on new data.
•When bias is high, the assumptions made by our model are too basic, and the model can't capture the important features of our data. This means that our model hasn't captured patterns in the training data, and hence cannot perform well on the testing data either. If this is the case, our model cannot handle new data and cannot be sent into production.
•A high-bias model will not be able to capture the dataset's trend. It is considered an underfitting model with a high error rate, caused by an overly simplified algorithm.
• This instance, where the model cannot find patterns in our training set
and hence fails for both seen and unseen data, is called Underfitting.
• In a typical underfitting plot, the model has found no patterns in the data: the line of best fit is a straight line that does not pass through any of the data points. The model has failed to train properly on the given data and cannot predict new data either.

• For example, a linear regression model may have a high bias if the
data has a non-linear relationship.
• A linear algorithm often has high bias, which makes it learn fast. The simpler the algorithm, the more bias is likely to be introduced, whereas a nonlinear algorithm often has low bias.
• Some examples of machine learning algorithms with low bias are
Decision Trees, k-Nearest Neighbours and Support Vector
Machines. At the same time, an algorithm with high bias is Linear
Regression, Linear Discriminant Analysis and Logistic
Regression.
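As a minimal sketch of that point (the synthetic data and scikit-learn's LinearRegression are illustrative assumptions, not from the slides), a straight line fit to quadratic data scores poorly even on its own training data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(seed=0)

# Synthetic data with a clearly non-linear (quadratic) relationship.
X = rng.uniform(-3, 3, size=(200, 1))
y = X[:, 0] ** 2 + rng.normal(scale=0.3, size=200)

# A straight line cannot capture the curve: high bias, underfitting,
# so even the training score is poor.
linear = LinearRegression().fit(X, y)
print(f"Linear model R^2 on its own training data: {linear.score(X, y):.3f}")
```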
Characteristics of a high bias model include:

• Failure to capture proper data trends

• Potential towards underfitting

• More generalized/overly simplified

• High error rate


Ways to reduce high bias in Machine Learning:

• Use more complex models, such as including some polynomial features (see the sketch below).
• Increase the number of input features, as the model is underfitted.
• Reduce regularization of the model.
• Increase the size of the training data.
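For instance (a hedged sketch continuing the synthetic quadratic example above; PolynomialFeatures with LinearRegression from scikit-learn is one common way to add polynomial features):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(seed=0)
X = rng.uniform(-3, 3, size=(200, 1))
y = X[:, 0] ** 2 + rng.normal(scale=0.3, size=200)

# Adding polynomial features gives the linear model enough flexibility
# to fit the curve, reducing the bias error.
poly = make_pipeline(PolynomialFeatures(degree=2), LinearRegression()).fit(X, y)
print(f"Degree-2 model R^2 on training data: {poly.score(X, y):.3f}")
```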
What is a Variance Error?
• Variance is the very opposite of Bias
• Variance is the measure of spread in data from its
mean position.
• In machine learning variance is the amount by
which the performance of a predictive model
changes when it is trained on different subsets of
the training data
• More specifically, variance is the variability of the model: how sensitive it is to a different subset of the training dataset, i.e. how much its predictions adjust when it is fit on a new subset of the training data.
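One way to see this sensitivity (a hypothetical sketch; an unconstrained DecisionTreeRegressor is chosen here because deep trees are a classic high-variance model):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(seed=0)
X = rng.uniform(0, 10, size=(300, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.4, size=300)

x_query = np.array([[5.0]])  # a single point at which to compare predictions

# Fit the same model class on different random subsets of the training data
# and watch how much its prediction at one point moves around: that spread
# is the model's variance.
preds = []
for _ in range(5):
    idx = rng.choice(len(X), size=100, replace=False)
    tree = DecisionTreeRegressor().fit(X[idx], y[idx])
    preds.append(tree.predict(x_query)[0])
print("Predictions at x = 5 across subsets:", np.round(preds, 3))
```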
Variance
• If model accuracies on training
and test data vary greatly, the
model has high variance.

• A model with high variance can even fit the noise in the training data, but it lacks generalization to new, unseen data.
• Low variance: models with high bias will have low variance.
• High variance: models with high variance will have low bias.
• During training, the algorithm allows our model to ‘see’ the data a certain number of times to find patterns in it. If it does not work on the data for long enough, it will not find patterns, and bias occurs.
• On the other hand, if our model is allowed to view the data
too many times, it will learn very well for only that data. It
will capture most patterns in the data, but it will also learn
from the unnecessary data present, or from the noise.
• We can define variance as the model’s sensitivity to
fluctuations in the data. Our model may learn from noise.
This will cause our model to consider trivial features as
important.
• Suppose our model has learned extremely well from our training data, which has taught it to identify cats. But when given new data, such as the picture of a fox, our model predicts it as a cat, because that is what it has learned. This happens when variance is high: our model captures all the features of the data given to it, including the noise, tunes itself to the data, and predicts it very well; but when given new data, it cannot predict it, as it is too specific to the training data.
• Such a model will perform really well on the training data and get high accuracy, but will fail on new, unseen data. New data may not have exactly the same features, and the model won't be able to predict it very well. This is called Overfitting.
Types of Variance
• High Variance
High variance models capture noise along with the hidden pattern, which leads to overfitting. High variance models show high training accuracy but low test accuracy. Features of a high variance model include an overly complex model, overfitting, low error on training data, and high error on test data.
High variance shows a large variation in the prediction of the target function with changes in the training dataset.
• Low Variance
A model with low variance is unable to capture the hidden pattern in the data. Low variance may occur when we have a very small amount of data or use a very simplified model. Low variance leads to underfitting.
There is a small variation in the prediction of the target function with changes in the training data set.
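A minimal sketch of the high-variance symptom described above (synthetic data; train_test_split and DecisionTreeRegressor from scikit-learn are illustrative choices): near-perfect training accuracy, noticeably worse test accuracy.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(seed=0)
X = rng.uniform(0, 10, size=(300, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.4, size=300)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# An unconstrained tree memorizes the training set, noise included:
# near-perfect training score, noticeably worse test score.
tree = DecisionTreeRegressor(random_state=0).fit(X_train, y_train)
print(f"Train R^2: {tree.score(X_train, y_train):.3f}")
print(f"Test  R^2: {tree.score(X_test, y_test):.3f}")
```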
Underfitting and overfitting

• Underfitting happens when your model is too simple to capture the variations and patterns in your data. The machine doesn't learn the right characteristics and relationships from the training data, and thus performs poorly on subsequent data sets. It might be trained on a red apple and mistake a red cherry for an apple.

• Overfitting happens when a model is too complex, capturing too much detail and the random fluctuations or noise in the training data set. The machine erroneously sees this noise as true patterns, and thus is not able to generalize and see the real patterns in subsequent data sets. It might be trained on many details of a specific type of apple and thus cannot find apples that don't have all of those specific details.
Underfitting and Overfitting
High Variance and Low Variance

Characteristics of a high variance model

• Noise in the data set
• Potential towards overfitting
• Complex models
• Trying to fit all data points as closely as possible

Ways to Reduce High Variance

• Reduce the input features or number of parameters when a model is overfitted (see the sketch below).
• Do not use an overly complex model.
• Increase the training data.
• Increase the regularization term.
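For example (a sketch under assumed settings; Ridge from scikit-learn with an arbitrary alpha=1.0), increasing the regularization term penalizes large coefficients in an over-flexible model:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(seed=0)
X = rng.uniform(-3, 3, size=(40, 1))  # deliberately few samples
y = X[:, 0] ** 2 + rng.normal(scale=0.5, size=40)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# The same over-flexible degree-12 polynomial model, with and without
# an L2 (ridge) penalty on the coefficients.
for name, reg in [("no penalty", LinearRegression()), ("ridge", Ridge(alpha=1.0))]:
    model = make_pipeline(PolynomialFeatures(degree=12), reg).fit(X_tr, y_tr)
    print(f"{name:10s} train R^2 = {model.score(X_tr, y_tr):.3f}, "
          f"test R^2 = {model.score(X_te, y_te):.3f}")
```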
Different Combinations of Bias-Variance

1. Low-Bias, Low-Variance: The combination of low bias and low variance is the ideal machine learning model. However, it is very hard to achieve in practice.
2. Low-Bias, High-Variance: With low bias and high variance, model predictions are inconsistent but accurate on average. This case occurs when the model learns with a large number of parameters, and hence leads to overfitting.
3. High-Bias, Low-Variance: With high bias and low variance, predictions are consistent but inaccurate on average. This case occurs when a model does not learn well from the training dataset or uses a small number of parameters. It leads to underfitting problems in the model.
4. High-Bias, High-Variance: With high bias and high variance, predictions are inconsistent and also inaccurate on average.

Bias-Variance Trade-Off


• If the algorithm is too simple (a hypothesis with a linear equation), it may end up in a high-bias and low-variance condition, and thus be error-prone.
• If the algorithm is too complex (a hypothesis with a high-degree equation), it may end up in a high-variance and low-bias condition. In the latter condition, the model will not perform well on new entries.
• For an accurate prediction of the model, algorithms need low variance and low bias.
• The total error is the sum of the bias error and the variance error. In a plot of error against model complexity, the optimal region is where bias and variance balance, giving the model complexity with minimum total error.
Mathematical Representation
• The prediction error of a machine learning model can be written mathematically as follows:
Error = Bias² + Variance + Irreducible Error
• To minimize the model prediction error, we need to
choose model complexity in such a way so that a
balance between these two errors can be met.
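This decomposition can be estimated empirically when the true function is known (a hypothetical sketch: we draw many training sets from a known ground truth and measure bias² and variance at a single point):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(seed=0)
true_f = np.sin                 # the known ground-truth function
x0 = np.array([[2.0]])          # the point at which we decompose the error
noise_sd = 0.3

# Train the same model class on many independently drawn training sets
# and collect its predictions at x0.
preds = []
for _ in range(200):
    X = rng.uniform(0, 10, size=(100, 1))
    y = true_f(X[:, 0]) + rng.normal(scale=noise_sd, size=100)
    model = DecisionTreeRegressor(max_depth=2).fit(X, y)
    preds.append(model.predict(x0)[0])
preds = np.array(preds)

bias_sq = (preds.mean() - true_f(x0[0, 0])) ** 2  # systematic error, squared
variance = preds.var()                            # spread across training sets
print(f"bias^2 = {bias_sq:.4f}, variance = {variance:.4f}, "
      f"irreducible = {noise_sd ** 2:.4f}")
```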
Techniques to Balance Bias and Variance

Reducing High Bias


• Choosing a more complex model: as discussed above, choosing a more complex model may reduce the bias error of the model prediction.
• Adding more features: adding more features can increase the complexity of the model so that it can capture hidden patterns even better, which will decrease the bias error of the model.
• Reducing regularization: regularization prevents overfitting, but while decreasing the variance it can increase bias. So, reducing the regularization parameters or removing regularization altogether can reduce bias errors.
Reducing High Variance
• Applying regularization techniques: regularization techniques add a penalty to complex models, which eventually results in reduced model complexity. A less complex model shows less variance.
• Simplifying model complexity: a less complex model will have low variance. You can reduce the variance by using a simpler algorithm.
• Adding more data: adding more data to the dataset can help the model perform better, showing less variance.
• Cross-validation: cross-validation can be useful to identify overfitting by comparing performance on the training and validation splits of the dataset (see the sketch below).
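As a closing sketch (an assumed example using cross_val_score from scikit-learn), comparing the cross-validated score with the training score is a quick overfitting check:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(seed=0)
X = rng.uniform(0, 10, size=(300, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.4, size=300)

model = DecisionTreeRegressor(random_state=0)
cv_scores = cross_val_score(model, X, y, cv=5)  # R^2 on held-out folds
train_score = model.fit(X, y).score(X, y)       # R^2 on the data it saw

# A large gap between the training score and the cross-validated score
# signals high variance / overfitting.
print(f"train R^2 = {train_score:.3f}, mean CV R^2 = {cv_scores.mean():.3f}")
```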
