
Machine Learning Math Essentials - 12.02.2025

The document discusses the concepts of probability in relation to machine learning, differentiating between frequentist and Bayesian probabilities. It explains bias and variance as sources of error in predictive models, emphasizing the importance of balancing these errors to avoid underfitting and overfitting. Techniques for reducing bias and variance are also outlined to improve model accuracy.


Turning data into probabilities

Probability represents a degree of certainty: the weight you assign to the chance that an event will happen.
E.g. when rolling a die, the chance of getting a 6 is 1/6 ≈ 16.67%.
That is the certainty you assign to that particular event. This is called probability, or more precisely frequentist probability.
Frequentist probability denotes the frequency with which an event occurs across many trials.
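As a quick illustrative sketch (a synthetic simulation using NumPy; the trial counts are arbitrary choices, not from the slides), the empirical frequency of rolling a 6 approaches the theoretical 1/6 as the number of trials grows:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Roll a fair die many times and compare the empirical frequency of a 6
# with the theoretical probability 1/6.
for n_trials in (100, 10_000, 1_000_000):
    rolls = rng.integers(1, 7, size=n_trials)  # uniform integers 1..6
    freq = np.mean(rolls == 6)
    print(f"{n_trials:>9} trials: empirical P(6) = {freq:.4f}  (theory {1/6:.4f})")
```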
Not all scenarios are frequency-related in the sense of the example above.
• Consider an ML algorithm in which we estimate the probability of inflation or deflation of the price of a fuel.
• The latter phenomenon is called Bayesian probability. Rather than considering the frequency with which an event repeats, we quantify our belief.
• For example: there is a 32% chance that a diabetic patient is going to develop heart failure.
• This statement is not amenable to repetition; we cannot create infinite replicas of the patient and their symptoms. We instead quantify, with 32% certainty, our belief that heart failure could happen.
• Altogether, probability measures the extent of certainty pertaining to an uncertain event.
• Formulating a simple but uncertain rule is better than formulating a complex but certain rule: it is cheaper to generate and analyze.
• Moreover, a certain rule does not always guarantee generating the right and required output. An uncertain rule, though non-deterministic, helps in reaching a generalized conclusion.
Turning Data into Probabilities
• Machine Learning is an interdisciplinary field that
uses statistics, probability, algorithms to learn from
data and provide insights which can be used to
build intelligent applications.
• In probability theory, an event is a set of outcomes
of an experiment to which a probability is assigned.
• If E represents an event, then P(E) represents the probability that E will occur. A situation where E might happen (success) or might not happen (failure) is called a trial.
• This event can be anything like tossing a coin, rolling a
die or pulling a colored ball out of a bag. In these
examples the outcome of the event is random, so the
variable that represents the outcome of these events is
called a random variable.
• Theoretical probability, on the other hand, is given by the number of ways the particular event can occur divided by the total number of possible outcomes. A head can occur in one way and there are two possible outcomes (head, tail), so the true (theoretical) probability of a head is 1/2.
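To make the counting definition concrete, here is a small sketch (an assumed example, not from the slides) that enumerates a sample space and computes a theoretical probability as favourable outcomes over total outcomes:

```python
from fractions import Fraction
from itertools import product

# The full sample space of two fair dice: 36 equally likely outcomes.
sample_space = list(product(range(1, 7), repeat=2))

# Theoretical probability = favourable outcomes / total outcomes.
favourable = [roll for roll in sample_space if sum(roll) == 7]
p = Fraction(len(favourable), len(sample_space))
print(f"P(sum of two dice = 7) = {p} = {float(p):.4f}")  # 6/36 = 1/6
```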
Bias and Variance in Machine Learning
• Machine learning is a branch of Artificial Intelligence,
which allows machines to perform data analysis and
make predictions.
• However, if the machine learning model is not accurate, it can make prediction errors, and these prediction errors are usually known as bias and variance.
• In machine learning, these errors will always be present, as there is always a slight difference between the model's predictions and the actual values.
• The main aim of ML/data science analysts is to reduce
these errors in order to get more accurate results.
• In this topic, we are going to discuss bias and variance, the bias-variance trade-off, underfitting and overfitting. But before starting, let's first understand what errors in machine learning are.
Errors in Machine Learning
• An error is a measure of how accurately an algorithm can make predictions for a previously unseen dataset. On the basis of these errors, we select the machine learning model that performs best on the particular dataset. There are mainly two types of errors in machine learning:
• Reducible errors: These errors can be reduced to improve the model accuracy. Such errors can further be classified into bias and variance.
• Irreducible errors: These errors will always be present in the model regardless of which algorithm has been used. Their cause is unknown variables whose influence cannot be reduced.
What is bias in machine learning?

• Bias is simply defined as the inability of the model to capture the true relationship in the data, because of which some difference, or error, occurs between the model's predicted value and the actual value. These differences between actual or expected values and the predicted values are known as bias error, or error due to bias.
• Bias is a systematic error that occurs due to
wrong assumptions in the machine learning
process.
• In general, a machine learning model analyses the data, finds patterns in it and makes predictions. While training, the model learns these patterns in the dataset and applies them to test data for prediction. While making predictions, a difference occurs between the values predicted by the model and the actual/expected values, and this difference is known as bias error, or error due to bias. It can be seen as the inability of machine learning algorithms such as linear regression to capture the true relationship between the data points.
Bias vs. variance, and the tradeoff

•Bias and variance are two sources of error in predictive models. Getting the right balance in the bias-variance tradeoff is fundamental to effective machine learning algorithms. Here is a quick explanation of these concepts:
•Bias. Bias refers to error caused by a model that is oversimplified for the complex problem it is solving, makes significant assumptions, and misses important relationships in your data.
•Variance. Variance is an error caused by an
algorithm that is too sensitive to fluctuations
in data, creating an overly complex model
that sees patterns in data that are actually
just randomness.
•Bias–variance tradeoff. Minimizing errors
caused by oversimplification and excessive
complication requires finding the right
balance or tradeoff between the two.
Bias
• Low Bias: Low bias value means fewer assumptions are taken to
build the target function. In this case, the model will closely match the
training dataset.
•High Bias: High bias value means more assumptions are taken to
build the target function. In this case, the model will not match the
training dataset closely. A high bias model also cannot perform
well on new data.
•When bias is high, the assumptions made by our model are too basic, and the model can't capture the important features of our data. This means that our model hasn't captured patterns in the training data, and hence cannot perform well on the testing data either. If this is the case, our model cannot handle new data and cannot be sent into production.
•A high-bias model will not be able to capture the dataset's trend. It is considered an underfitting model with a high error rate, caused by an overly simplified algorithm.
• This instance, where the model cannot find patterns in our training set
and hence fails for both seen and unseen data, is called Underfitting.
• In a typical underfitting plot, the model has found no patterns in the data: the line of best fit is a straight line that does not pass through any of the data points. The model has failed to train properly on the given data and cannot predict new data either.

• For example, a linear regression model may have a high bias if the
data has a non-linear relationship.
• A linear algorithm often has high bias, which makes it learn fast. The simpler the algorithm, the more bias is likely to be introduced, whereas a nonlinear algorithm often has low bias.
• Some examples of machine learning algorithms with low bias are
Decision Trees, k-Nearest Neighbours and Support Vector
Machines. At the same time, an algorithm with high bias is Linear
Regression, Linear Discriminant Analysis and Logistic
Regression.
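As a minimal sketch of that point (the synthetic data and scikit-learn's LinearRegression are illustrative assumptions, not from the slides), a straight line fit to quadratic data scores poorly even on its own training data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(seed=0)

# Synthetic data with a clearly non-linear (quadratic) relationship.
X = rng.uniform(-3, 3, size=(200, 1))
y = X[:, 0] ** 2 + rng.normal(scale=0.3, size=200)

# A straight line cannot capture the curve: high bias, underfitting,
# so even the training score is poor.
linear = LinearRegression().fit(X, y)
print(f"Linear model R^2 on its own training data: {linear.score(X, y):.3f}")
```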
Characteristics of a high bias model include:

• Failure to capture proper data trends

• Potential towards underfitting

• More generalized/overly simplified

• High error rate


Ways to reduce high bias in Machine Learning:

• Use more complex models, such as including some polynomial features (see the sketch below).
• Increase the number of input features, as the model is underfitted.
• Reduce regularization of the model.
• Increase the size of the training data.
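For instance (a hedged sketch continuing the synthetic quadratic example above; PolynomialFeatures with LinearRegression from scikit-learn is one common way to add polynomial features):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(seed=0)
X = rng.uniform(-3, 3, size=(200, 1))
y = X[:, 0] ** 2 + rng.normal(scale=0.3, size=200)

# Adding polynomial features gives the linear model enough flexibility
# to fit the curve, reducing the bias error.
poly = make_pipeline(PolynomialFeatures(degree=2), LinearRegression()).fit(X, y)
print(f"Degree-2 model R^2 on training data: {poly.score(X, y):.3f}")
```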
What is a Variance Error?
• Variance is the very opposite of Bias
• Variance is the measure of spread in data from its
mean position.
• In machine learning variance is the amount by
which the performance of a predictive model
changes when it is trained on different subsets of
the training data
• More specifically, variance is the variability of the model: how sensitive it is to a different subset of the training dataset, i.e. how much its predictions adjust when it is fit on a new subset of the training data.
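One way to see this sensitivity (a hypothetical sketch; an unconstrained DecisionTreeRegressor is chosen here because deep trees are a classic high-variance model):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(seed=0)
X = rng.uniform(0, 10, size=(300, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.4, size=300)

x_query = np.array([[5.0]])  # a single point at which to compare predictions

# Fit the same model class on different random subsets of the training data
# and watch how much its prediction at one point moves around: that spread
# is the model's variance.
preds = []
for _ in range(5):
    idx = rng.choice(len(X), size=100, replace=False)
    tree = DecisionTreeRegressor().fit(X[idx], y[idx])
    preds.append(tree.predict(x_query)[0])
print("Predictions at x = 5 across subsets:", np.round(preds, 3))
```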
Variance
• If model accuracies on training
and test data vary greatly, the
model has high variance.

• A model with high variance can even fit the noise in the training data, but it lacks generalization to new, unseen data.
• Low variance: models with high bias will have low variance.
• High variance: models with high variance will have low bias.
• During training, the algorithm allows our model to ‘see’ the data a certain number of times to find patterns in it. If it does not work on the data for long enough, it will not find patterns, and bias occurs.
• On the other hand, if our model is allowed to view the data
too many times, it will learn very well for only that data. It
will capture most patterns in the data, but it will also learn
from the unnecessary data present, or from the noise.
• We can define variance as the model’s sensitivity to
fluctuations in the data. Our model may learn from noise.
This will cause our model to consider trivial features as
important.
• Suppose our model has learned extremely well from our training data, which has taught it to identify cats. But when given new data, such as the picture of a fox, our model predicts it as a cat, because that is what it has learned. This happens when variance is high: our model captures all the features of the data given to it, including the noise, tunes itself to the data, and predicts it very well; but when given new data, it cannot predict it, as it is too specific to the training data.
• Such a model will perform really well on the training data and get high accuracy, but will fail on new, unseen data. New data may not have exactly the same features, and the model won't be able to predict it very well. This is called Overfitting.
Types of Variance
• High Variance
High variance models capture noise along with the hidden pattern, which leads to overfitting. High variance models show high training accuracy but low test accuracy. Features of a high variance model include an overly complex model, overfitting, low error on training data, and high error on test data.
High variance shows a large variation in the prediction of the target function with changes in the training dataset.
• Low Variance
A model with low variance is unable to capture the hidden pattern in the data. Low variance may occur when we have a very small amount of data or use a very simplified model. Low variance leads to underfitting.
There is a small variation in the prediction of the target function with changes in the training data set.
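A minimal sketch of the high-variance symptom described above (synthetic data; train_test_split and DecisionTreeRegressor from scikit-learn are illustrative choices): near-perfect training accuracy, noticeably worse test accuracy.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(seed=0)
X = rng.uniform(0, 10, size=(300, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.4, size=300)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# An unconstrained tree memorizes the training set, noise included:
# near-perfect training score, noticeably worse test score.
tree = DecisionTreeRegressor(random_state=0).fit(X_train, y_train)
print(f"Train R^2: {tree.score(X_train, y_train):.3f}")
print(f"Test  R^2: {tree.score(X_test, y_test):.3f}")
```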
Underfitting and overfitting

• Underfitting happens when your model is too simple to capture the variations and patterns in your data. The machine doesn't learn the right characteristics and relationships from the training data, and thus performs poorly on subsequent data sets. It might be trained on a red apple and mistake a red cherry for an apple.

• Overfitting happens when a model is too complex, capturing too much detail and the random fluctuations or noise in the training data set. The machine erroneously sees this noise as true patterns, and thus is not able to generalize and see the real patterns in subsequent data sets. It might be trained on many details of a specific type of apple and thus cannot find apples that don't have all of those specific details.
Underfitting and Overfitting
High Variance and Low Variance

Characteristics of a high variance model

• Noise in the data set
• Potential towards overfitting
• Complex models
• Trying to fit all data points as closely as possible

Ways to Reduce High Variance

• Reduce the input features or number of parameters when a model is overfitted (see the sketch below).
• Do not use an overly complex model.
• Increase the training data.
• Increase the regularization term.
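For example (a sketch under assumed settings; Ridge from scikit-learn with an arbitrary alpha=1.0), increasing the regularization term penalizes large coefficients in an over-flexible model:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(seed=0)
X = rng.uniform(-3, 3, size=(40, 1))  # deliberately few samples
y = X[:, 0] ** 2 + rng.normal(scale=0.5, size=40)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# The same over-flexible degree-12 polynomial model, with and without
# an L2 (ridge) penalty on the coefficients.
for name, reg in [("no penalty", LinearRegression()), ("ridge", Ridge(alpha=1.0))]:
    model = make_pipeline(PolynomialFeatures(degree=12), reg).fit(X_tr, y_tr)
    print(f"{name:10s} train R^2 = {model.score(X_tr, y_tr):.3f}, "
          f"test R^2 = {model.score(X_te, y_te):.3f}")
```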
Different Combinations of Bias-Variance

1. Low-Bias, Low-Variance: The combination of low bias and low variance is the ideal machine learning model. However, it is very hard to achieve in practice.
2. Low-Bias, High-Variance: With low bias and high variance, model predictions are inconsistent but accurate on average. This case occurs when the model learns with a large number of parameters, and hence leads to overfitting.
3. High-Bias, Low-Variance: With high bias and low variance, predictions are consistent but inaccurate on average. This case occurs when a model does not learn well from the training dataset or uses a small number of parameters. It leads to underfitting problems in the model.
4. High-Bias, High-Variance: With high bias and high variance, predictions are inconsistent and also inaccurate on average.

Bias-Variance Trade-Off


• If the algorithm is too simple (a hypothesis with a linear equation), it may end up in a high-bias and low-variance condition, and thus be error-prone.
• If the algorithm is too complex (a hypothesis with a high-degree equation), it may end up in a high-variance and low-bias condition. In the latter condition, the model will not perform well on new entries.
• For an accurate prediction of the model, algorithms need low variance and low bias.
• The total error is the sum of the bias error and the variance error. In a plot of error against model complexity, the optimal region is where bias and variance balance, giving the model complexity with minimum total error.
Mathematical Representation
• The prediction error of a machine learning model can be written mathematically as follows:
Error = Bias² + Variance + Irreducible Error
• To minimize the model prediction error, we need to
choose model complexity in such a way so that a
balance between these two errors can be met.
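This decomposition can be estimated empirically when the true function is known (a hypothetical sketch: we draw many training sets from a known ground truth and measure bias² and variance at a single point):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(seed=0)
true_f = np.sin                 # the known ground-truth function
x0 = np.array([[2.0]])          # the point at which we decompose the error
noise_sd = 0.3

# Train the same model class on many independently drawn training sets
# and collect its predictions at x0.
preds = []
for _ in range(200):
    X = rng.uniform(0, 10, size=(100, 1))
    y = true_f(X[:, 0]) + rng.normal(scale=noise_sd, size=100)
    model = DecisionTreeRegressor(max_depth=2).fit(X, y)
    preds.append(model.predict(x0)[0])
preds = np.array(preds)

bias_sq = (preds.mean() - true_f(x0[0, 0])) ** 2  # systematic error, squared
variance = preds.var()                            # spread across training sets
print(f"bias^2 = {bias_sq:.4f}, variance = {variance:.4f}, "
      f"irreducible = {noise_sd ** 2:.4f}")
```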
Techniques to Balance Bias and Variance

Reducing High Bias


• Choosing a more complex model: as discussed above, choosing a more complex model may reduce the bias error of the model prediction.
• Adding more features: adding more features can increase the complexity of the model so that it can capture hidden patterns even better, which will decrease the bias error of the model.
• Reducing regularization: regularization prevents overfitting, but while decreasing the variance it can increase bias. So, reducing the regularization parameters or removing regularization altogether can reduce bias errors.
Reducing High Variance
• Applying regularization techniques: regularization techniques add a penalty to complex models, which eventually results in reduced model complexity. A less complex model shows less variance.
• Simplifying model complexity: a less complex model will have low variance. You can reduce the variance by using a simpler algorithm.
• Adding more data: adding more data to the dataset can help the model perform better, showing less variance.
• Cross-validation: cross-validation can be useful to identify overfitting by comparing performance on the training and validation splits of the dataset (see the sketch below).
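As a closing sketch (an assumed example using cross_val_score from scikit-learn), comparing the cross-validated score with the training score is a quick overfitting check:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(seed=0)
X = rng.uniform(0, 10, size=(300, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.4, size=300)

model = DecisionTreeRegressor(random_state=0)
cv_scores = cross_val_score(model, X, y, cv=5)  # R^2 on held-out folds
train_score = model.fit(X, y).score(X, y)       # R^2 on the data it saw

# A large gap between the training score and the cross-validated score
# signals high variance / overfitting.
print(f"train R^2 = {train_score:.3f}, mean CV R^2 = {cv_scores.mean():.3f}")
```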
