Bias and Variance in Machine Learning _ GeeksforGeeks

The document discusses the concepts of bias and variance in machine learning, explaining how they affect model performance and error. Bias refers to errors due to incorrect assumptions in the model, while variance measures sensitivity to changes in training data. The text also outlines methods to reduce bias and variance, emphasizing the importance of achieving a balance between the two for optimal model generalization.

Uploaded by

Bhagya Lakshmi
Copyright
© © All Rights Reserved

07/05/2025, 19:13 Bias and Variance in Machine Learning | GeeksforGeeks

Bias and Variance in Machine Learning


Last Updated : 03 Apr, 2025

There are various ways to evaluate a machine-learning model. We can
use MSE (Mean Squared Error) or absolute error for regression, and
precision, recall, and ROC (Receiver Operating Characteristic) curves
for classification. In a similar way, bias and variance help us tune
parameters and choose the better-fitted model among several candidates.
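
As a small illustration of the metrics named above, the following sketch computes MSE for a regression output and precision and recall for a binary classification output in plain Python. The toy label lists and the helper names `mse` and `precision_recall` are made up for this example and are not part of the original article.

```python
def mse(y_true, y_pred):
    """Mean squared error: average of the squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall(y_true, y_pred, positive=1):
    """Precision = TP / (TP + FP), recall = TP / (TP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy data, invented for illustration only
reg_true, reg_pred = [3.0, 5.0, 7.0], [2.0, 5.0, 9.0]
clf_true, clf_pred = [1, 0, 1, 1, 0, 0], [1, 1, 1, 0, 0, 0]

regression_mse = mse(reg_true, reg_pred)
precision, recall = precision_recall(clf_true, clf_pred)
print(regression_mse, precision, recall)
```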

Bias is a type of error that arises from wrong assumptions about the
data, such as assuming the data is linear when it actually follows a
more complex function. Variance, on the other hand, arises from high
sensitivity to variations in the training data. It is also a type of
error, because we want the model to be robust against noise. Errors in
machine learning are of two types: reducible and irreducible. Bias and
variance are both reducible errors.

What is Bias?
Bias is the inability of a model to capture the true relationship in
the data, which shows up as a difference between the model's predicted
values and the actual values. This difference is known as bias error,
or error due to bias. Bias is a systematic error that occurs due to
wrong assumptions in the machine learning process.

Let $Y$ be the true value of a parameter, and let $\hat{Y}$ be an
estimator of $Y$ based on a sample of data. Then the bias of the
estimator $\hat{Y}$ is given by:

$$\mathrm{Bias}(\hat{Y}) = E(\hat{Y}) - Y$$

where $E(\hat{Y})$ is the expected value of the estimator $\hat{Y}$. It
is a measurement of how well the model fits the data.
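
To make the formula Bias = E(estimator) − true value concrete, here is a minimal sketch that evaluates it exactly for two estimators of a population variance. The population (a fair six-sided die), the sample size, and the helper name `expected_estimate` are assumptions chosen for illustration. Exact fractions are used so that no sampling noise enters the result.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4, 5, 6]  # outcomes of a fair die (assumed toy population)
n = 3                            # sample size

# True parameter Y: the population variance
mu = Fraction(sum(population), len(population))
true_var = sum((v - mu) ** 2 for v in population) / len(population)


def expected_estimate(divisor):
    """E(Y_hat): average the variance estimate over every possible sample."""
    samples = list(product(population, repeat=n))
    total = Fraction(0)
    for s in samples:
        mean = Fraction(sum(s), n)
        total += sum((v - mean) ** 2 for v in s) / divisor
    return total / len(samples)


# Bias(Y_hat) = E(Y_hat) - Y for two choices of divisor
bias_div_n = expected_estimate(n) - true_var
bias_div_n_minus_1 = expected_estimate(n - 1) - true_var
print(bias_div_n, bias_div_n_minus_1)
```

Dividing by n gives a systematically low estimate (non-zero bias), while dividing by n − 1 gives zero bias for this population.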

Low Bias: A low bias value means fewer assumptions are made when
building the target function. In this case, the model will closely
match the training dataset.
High Bias: A high bias value means more assumptions are made when
building the target function. In this case, the model will not match
the training dataset closely.

A high-bias model is not able to capture the trend in the dataset. It
is considered an underfitting model and has a high error rate, because
the algorithm is too simple.

For example, a linear regression model may have a high bias if the data
has a non-linear relationship.
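
The linear-regression example above can be sketched in a few lines. The data here is a noise-free quadratic invented for the demonstration, and `fit_line` is a hypothetical helper implementing the closed-form least-squares fit for one input variable.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x (closed form)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]          # non-linear (quadratic) target, no noise

# Straight line in x: the linearity assumption is wrong for this data
a, b = fit_line(xs, ys)
linear_mse = mse(ys, [a + b * x for x in xs])

# Same machinery on the feature x**2: the assumption now matches the data
x2 = [x ** 2 for x in xs]
a2, b2 = fit_line(x2, ys)
quadratic_mse = mse(ys, [a2 + b2 * z for z in x2])

print(linear_mse, quadratic_mse)
```

Even on its own training data the straight line keeps a large error, which is the signature of high bias; the error disappears once the model is allowed a quadratic term.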

Ways to reduce high bias in Machine Learning:

Use a more complex model: One of the main reasons for high bias is an
overly simple model that cannot capture the complexity of the data. In
such cases, we can make the model more complex, for example by
increasing the number of hidden layers in a deep neural network, or by
switching to a more expressive model such as polynomial regression for
non-linear datasets, a CNN for image processing, or an RNN for
sequence learning.
Increase the number of features: Adding more features to the training
dataset increases the complexity of the model and improves its ability
to capture the underlying patterns in the data.
Reduce regularization of the model: Regularization techniques such as
L1 or L2 regularization help prevent overfitting and improve the
generalization ability of the model. If the model has high bias,
reducing the strength of regularization or removing it altogether can
help improve its performance.
Increase the size of the training data: Increasing the size of the
training data can help in some cases by providing the model with more
examples to learn from, although more data mainly reduces variance and
will not fix a model that is too simple.
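
The effect of regularization strength on bias can be sketched with a one-parameter ridge (L2) regression. This is a minimal illustration under assumed toy data (y = 2x exactly); `ridge_slope` is a hypothetical helper that minimizes the sum of squared errors plus `lam * b**2` for a model y = b * x without intercept.

```python
def ridge_slope(xs, ys, lam):
    """Slope of y = b * x with an L2 penalty lam * b**2 (no intercept)."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxy / (sxx + lam)


xs = [-2, -1, 0, 1, 2]
ys = [2 * x for x in xs]            # true relationship: y = 2x

for lam in (0, 10, 100):            # illustrative penalty strengths
    b = ridge_slope(xs, ys, lam)
    train_mse = sum((y - b * x) ** 2 for x, y in zip(xs, ys)) / len(xs)
    print(f"lambda={lam:>3}  slope={b:.3f}  training MSE={train_mse:.3f}")
```

With no penalty the true slope is recovered; as the penalty grows, the slope is shrunk toward zero and the systematic error rises, which is why weakening regularization is one remedy for high bias.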

What is Variance?
Variance is a measure of how far data spreads from its mean. In
machine learning, variance is the amount by which the performance of a
predictive model changes when it is trained on different subsets of the
training data. More specifically, it is the model's sensitivity to the
particular subset of the training data it sees, i.e. how much the
fitted model changes when the subset changes.

Let $Y$ be the actual values of the target variable, and $\hat{Y}$ be
the predicted values of the target variable. Then the variance of a
model can be measured as the expected value of the squared difference
between the predicted values and the expected value of the predicted
values.

$$\mathrm{Variance} = E[(\hat{Y} - E[\hat{Y}])^2]$$

where $E[\hat{Y}]$ is the expected value of the predicted values. Here
the expected value is averaged over all the training data.
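
The variance formula can be estimated empirically by training the same kind of model on many random subsets and looking at how much its prediction at one point moves. The sketch below does this for two deliberately extreme models. The data-generating function (a sine curve with Gaussian noise), the query point `x0`, and all helper names are assumptions made for the illustration.

```python
import math
import random

random.seed(0)

# Assumed toy data: y = sin(x) + Gaussian noise
xs = [random.uniform(0, 2 * math.pi) for _ in range(200)]
data = [(x, math.sin(x) + random.gauss(0, 0.5)) for x in xs]
x0 = 1.5                                   # query point


def predict_mean(train, x):
    """Very simple model: always predict the average target."""
    return sum(y for _, y in train) / len(train)


def predict_1nn(train, x):
    """Very flexible model: copy the target of the nearest training point."""
    return min(train, key=lambda p: abs(p[0] - x))[1]


def prediction_variance(predict, n_subsets=500, subset_size=20):
    """Variance = E[(Y_hat - E[Y_hat])^2], estimated over random subsets."""
    preds = [predict(random.sample(data, subset_size), x0)
             for _ in range(n_subsets)]
    mean_pred = sum(preds) / len(preds)
    return sum((p - mean_pred) ** 2 for p in preds) / len(preds)


var_mean_model = prediction_variance(predict_mean)
var_1nn_model = prediction_variance(predict_1nn)
print(var_mean_model, var_1nn_model)
```

The nearest-neighbour model's prediction changes far more from subset to subset than the mean model's, so its estimated variance is larger.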

A model's variance error can be either low or high.

Low variance: Low variance means that the model is less sensitive to
changes in the training data and can produce consistent estimates of
the target function with different subsets of data from the same
distribution. However, low variance can also indicate underfitting if
the model is too simple and fails to capture the underlying patterns
in the data. This is when the model performs poorly on both the
training data and testing data.
High variance: High variance means that the model is very sensitive
to changes in the training data and can result in significant changes
in the estimate of the target function when trained on different
subsets of data from the same distribution. This is the case of
overfitting when the model performs well on the training data but
poorly on new, unseen test data. It fits the training data so closely
that it fails to generalize to new data.

Ways to Reduce Variance in Machine Learning:


Cross-validation: By splitting the data into training and testing sets
multiple times, cross-validation can help identify if a model is
overfitting or underfitting and can be used to tune hyperparameters
to reduce variance.
Feature selection: Keeping only the relevant features decreases the
model's complexity, which can reduce the variance error.
Regularization: We can use L1 or L2 regularization to reduce
variance in machine learning models.
Ensemble methods: These combine multiple models to improve
generalization performance. Bagging, boosting, and stacking are
common ensemble methods that can help reduce variance.
Simplifying the model: Reducing the complexity of the model, such
as decreasing the number of parameters or layers in a neural
network, can also help reduce variance and improve generalization
performance.
Early stopping: Early stopping is a technique used to prevent
overfitting by stopping the training of the deep learning model when
the performance on the validation set stops improving.
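
Of the techniques listed above, cross-validation is the easiest to sketch without any library. The following is a minimal k-fold implementation under stated assumptions: `k_fold_indices` and `cross_val_mse` are hypothetical helper names, and the "model" in the usage example is just a mean predictor chosen to keep the code short.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs; every sample is tested exactly once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def cross_val_mse(xs, ys, fit, predict, k=5):
    """Average held-out MSE over k folds for any fit/predict pair."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        errors = [(ys[i] - predict(model, xs[i])) ** 2 for i in test_idx]
        scores.append(sum(errors) / len(errors))
    return sum(scores) / k


def fit_mean(train_x, train_y):
    """Hypothetical baseline: the 'model' is just the mean training target."""
    return sum(train_y) / len(train_y)


def predict_mean(model, x):
    return model


# Toy usage on made-up data
xs = list(range(20))
ys = [2.0 * x + 1.0 for x in xs]
cv_score = cross_val_mse(xs, ys, fit_mean, predict_mean)
print(cv_score)
```

A large gap between the training error and the cross-validated error is a hint of high variance; when both are high, the model is more likely underfitting.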

Mathematical Derivation for Total Error


For a single prediction, the squared error can be expanded by adding
and subtracting $E(\hat{Y})$:

$$
\begin{aligned}
(Y - \hat{Y})^2 &= (Y - E(\hat{Y}) + E(\hat{Y}) - \hat{Y})^2 \\
&= (Y - E(\hat{Y}))^2 + (E(\hat{Y}) - \hat{Y})^2 + 2(Y - E(\hat{Y}))(E(\hat{Y}) - \hat{Y})
\end{aligned}
$$

Applying the expectation on both sides, and noting that $Y - E(\hat{Y})$
is a constant that can be taken out of the expectation:

$$
\begin{aligned}
\mathrm{MSE} = E[(Y - \hat{Y})^2] &= E[(Y - E(\hat{Y}))^2] + E[(E(\hat{Y}) - \hat{Y})^2] + 2E[(Y - E(\hat{Y}))(E(\hat{Y}) - \hat{Y})] \\
&= (Y - E(\hat{Y}))^2 + E[(E(\hat{Y}) - \hat{Y})^2] + 2(Y - E(\hat{Y}))\,E[E(\hat{Y}) - \hat{Y}] \\
&= (Y - E(\hat{Y}))^2 + E[(E(\hat{Y}) - \hat{Y})^2] + 2(Y - E(\hat{Y}))\,(E(\hat{Y}) - E(\hat{Y})) \\
&= (Y - E(\hat{Y}))^2 + E[(E(\hat{Y}) - \hat{Y})^2] + 0 \\
&= \mathrm{Bias}^2 + \mathrm{Variance}
\end{aligned}
$$
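
The identity derived above can be checked numerically. The sketch below simulates predictions from many retrained models for one fixed true value and confirms that the mean squared error equals squared bias plus variance. The true value, the size of the systematic offset, and the noise level are all arbitrary numbers picked for the demonstration.

```python
import random

random.seed(1)

Y = 10.0                       # fixed true value (assumed for the demo)

# Simulated predictions from many retrained models: systematically low by 1.5
# and noisy with standard deviation 2.0 (both numbers are illustrative)
preds = [Y - 1.5 + random.gauss(0, 2.0) for _ in range(10_000)]

mean_pred = sum(preds) / len(preds)                          # E(Y_hat)
mse = sum((Y - p) ** 2 for p in preds) / len(preds)          # E[(Y - Y_hat)^2]
bias_sq = (Y - mean_pred) ** 2                               # (Y - E(Y_hat))^2
variance = sum((mean_pred - p) ** 2 for p in preds) / len(preds)

print(f"MSE           = {mse:.4f}")
print(f"Bias^2 + Var  = {bias_sq + variance:.4f}")
```

The two printed numbers agree up to floating-point rounding, because the cross term averages to exactly zero when the expectations are taken over the same set of predictions.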

Different Combinations of Bias-Variance


There can be four combinations between bias and variance.
High Bias, Low Variance: A model with high bias and low variance is
said to be underfitting.
High Variance, Low Bias: A model with high variance and low bias is
said to be overfitting.
High-Bias, High-Variance: A model has both high bias and high
variance, which means that the model is not able to capture the
underlying patterns in the data (high bias) and is also too sensitive to
changes in the training data (high variance). As a result, the model
will produce inconsistent and inaccurate predictions on average.
Low Bias, Low Variance: A model that has low bias and low variance
means that the model is able to capture the underlying patterns in
the data (low bias) and is not too sensitive to changes in the training
data (low variance). This is the ideal scenario for a machine learning
model, as it is able to generalize well to new, unseen data and
produce consistent and accurate predictions. In practice, however,
this is rarely fully achievable.

Figure: Bias-Variance Combinations

Now we know that the ideal case is low bias and low variance, but in
practice this is rarely achievable. So we trade off bias against
variance to reach a balance between the two.

A model with balanced bias and variance is said to have optimal
generalization performance. This means that the model is able to
capture the underlying patterns in the data without overfitting or
underfitting. The model is likely to be just complex enough to capture
the complexity of the data, but not so complex that it overfits the
training data. This can happen when the model has been carefully tuned
to achieve a good balance between bias and variance, by adjusting the
hyperparameters and selecting an appropriate model architecture.

Machine Learning Algorithm    Bias        Variance
Linear Regression             High bias   Low variance
Decision Tree                 Low bias    High variance
Random Forest                 Low bias    Lower variance than a single decision tree
Bagging                       Low bias    Lower variance than a single decision tree

Bias Variance Tradeoff


If the algorithm is too simple (a hypothesis with a linear equation),
it may have high bias and low variance, and is therefore error-prone.
If the algorithm is too complex (a hypothesis with a high-degree
equation), it may have high variance and low bias, and will not perform
well on new entries. Between these two conditions lies a balance known
as the trade-off, or bias-variance trade-off. This trade-off in
complexity is why there is a trade-off between bias and variance: an
algorithm can't be more complex and less complex at the same time. On
a graph, the ideal trade-off looks like this.


Figure: Bias-Variance Tradeoff
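
The trade-off between model complexity, bias, and variance can also be seen in a small simulation. The sketch below is not taken from the original article: it uses k-nearest-neighbour regression on an assumed sine-shaped target, where a small `k` stands in for a complex model and a large `k` for a simple one. The noise level, grid, query points, and helper names are all illustrative choices.

```python
import math
import random

random.seed(2)
SIGMA = 0.5                                            # noise level (assumed)
train_x = [2 * math.pi * i / 40 for i in range(40)]    # fixed training inputs
test_x = [0.4, 1.3, 2.2, 4.1, 5.0]                     # off-grid query points


def knn_predict(xs, ys, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x))[:k]
    return sum(ys[i] for i in nearest) / k


def bias_variance(k, n_trials=300):
    """Estimate average bias^2 and variance of k-NN over redrawn noise."""
    preds = {x: [] for x in test_x}
    for _ in range(n_trials):
        ys = [math.sin(x) + random.gauss(0, SIGMA) for x in train_x]
        for x in test_x:
            preds[x].append(knn_predict(train_x, ys, x, k))
    bias_sq = var = 0.0
    for x, p in preds.items():
        mean_p = sum(p) / len(p)
        bias_sq += (math.sin(x) - mean_p) ** 2
        var += sum((v - mean_p) ** 2 for v in p) / len(p)
    return bias_sq / len(test_x), var / len(test_x)


results = {k: bias_variance(k) for k in (1, 5, 40)}    # small k = more complex
for k, (b2, v) in results.items():
    print(f"k={k:>2}  bias^2={b2:.3f}  variance={v:.3f}")
```

As `k` grows the variance falls and the squared bias rises; an intermediate `k` keeps both moderate, which is the balance the trade-off refers to.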

The technique by which we analyze the performance of a machine
learning model in terms of these components is known as bias-variance
decomposition. Below is one example each of bias-variance decomposition
for classification and for regression.

Bias Variance Decomposition for Classification and Regression

As derived above, the total error is the sum of squared bias and
variance. We try to keep the bias and the variance comparable, so that
neither exceeds the other by too much.

# Import the necessary libraries
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import BaggingClassifier
from mlxtend.evaluate import bias_variance_decomp
import warnings
warnings.filterwarnings('ignore')

# Load the dataset
X, y = load_iris(return_X_y=True)

# Split train and test dataset
X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.25,
    random_state=23,
    shuffle=True,
    stratify=y)

# Build the classification model
# Note: scikit-learn >= 1.2 names this parameter `estimator`;
# older releases call it `base_estimator`.
tree = DecisionTreeClassifier(random_state=123)
clf = BaggingClassifier(estimator=tree,
                        n_estimators=50,
                        random_state=23)

# Bias variance decomposition
avg_expected_loss, avg_bias, avg_var = bias_variance_decomp(
    clf,
    X_train, y_train,
    X_test, y_test,
    loss='0-1_loss',
    random_seed=23)

# Print the values
print('Average expected loss: %.2f' % avg_expected_loss)
print('Average bias: %.2f' % avg_bias)
print('Average variance: %.2f' % avg_var)


Output:

Average expected loss: 0.06
Average bias: 0.05
Average variance: 0.02

Now let's perform the same decomposition on a regression task and
check the values of the bias and variance.

# Load the necessary libraries
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
import tensorflow as tf
from mlxtend.evaluate import bias_variance_decomp
import warnings
warnings.filterwarnings('ignore')

# Load the dataset
X, y = fetch_california_housing(return_X_y=True)

# Split train and test dataset
X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.25,
    random_state=23,
    shuffle=True)

# Build the regression model
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation=tf.nn.relu),
    tf.keras.layers.Dense(1)
])

# Set optimizer and loss
optimizer = tf.keras.optimizers.Adam()
model.compile(loss='mean_squared_error',
              optimizer=optimizer)

# Train the model
model.fit(X_train, y_train, epochs=25, verbose=0)

# Evaluate on the test set (the returned value is the MSE loss, not an accuracy)
test_loss = model.evaluate(X_test, y_test)
print('Average: %.2f' % test_loss)

# Bias variance decomposition
# (epochs and verbose are passed through to the model's fit method)
avg_expected_loss, avg_bias, avg_var = bias_variance_decomp(
    model,
    X_train, y_train,
    X_test, y_test,
    loss='mse',
    random_seed=23,
    epochs=5,
    verbose=0)

# Print the result
print('Average expected loss: %.2f' % avg_expected_loss)
print('Average bias: %.2f' % avg_bias)
print('Average variance: %.2f' % avg_var)

Output:

162/162 [==============================] - 0s 802us/step - loss: 0.9195
Average: 0.92
Average expected loss: 2.30
Average bias: 0.72
Average variance: 1.58
