
Introduction to Machine Learning

Unit-2
Understanding Bias and Variance
The inability of a machine learning algorithm to capture the true relationship between the
variables and the outcome is known as bias. Figure 6.15 shows a straight line trying to fit all
the points. Because it doesn't cut through all of the points, it has a high bias.

In machine learning, the variability of a model's fit across different datasets is known as
variance. In this example, the curvy line has high variance because it produces vastly different
RSS (residual sum of squares) values on different datasets. That is, you can't really predict how
well it will perform on future datasets; sometimes it will do well with certain datasets, and at
other times it may fail badly. The straight line, on the other hand, has low variance, as its RSS
is similar across different datasets.
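
To make this concrete, here is a minimal sketch in Python, assuming NumPy is installed and using
synthetic data invented for illustration (the helper make_dataset is hypothetical, not from the
text). It fits a straight line and a curvy degree-9 polynomial to one dataset, then measures RSS
on several new datasets drawn from the same process:

    import numpy as np
    from numpy.polynomial import Polynomial

    rng = np.random.default_rng(42)

    def make_dataset(n=30):
        # Synthetic data: a gentle curve plus noise (assumed for illustration).
        x = np.sort(rng.uniform(0, 10, n))
        y = np.sin(x) + 0.3 * rng.standard_normal(n)
        return x, y

    # Fit both models once, on a single training dataset.
    x_train, y_train = make_dataset()
    line = Polynomial.fit(x_train, y_train, deg=1)   # straight line: high bias
    curve = Polynomial.fit(x_train, y_train, deg=9)  # curvy line: low bias on its training data

    # Measure RSS on five future datasets from the same process.
    for name, model in [("straight line", line), ("curvy line", curve)]:
        rss = [np.sum((y - model(x)) ** 2) for x, y in (make_dataset() for _ in range(5))]
        print(name, np.round(rss, 1))

    # Typical outcome: the straight line's RSS is similarly high everywhere
    # (high bias, low variance), while the curvy line's RSS swings from
    # dataset to dataset (high variance).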
Ideally, you want a model with:
• Low bias, with the line hugging as many points as possible
• Low variance, with the line giving consistent predictions across different datasets

Figure 6.19 shows such an ideal curve: low bias and low variance. To strike a balance between
a simple model and a complex one, you can use techniques such as Regularization, Bagging, and
Boosting, each illustrated with a short sketch after this list:
• Regularization is a technique that automatically penalizes the extra features you use in your
model, shrinking their coefficients so the model stays simple.
• Bagging (or bootstrap aggregation) is a type of ensemble learning. It trains each weak learner
on a random subset of the data; the weak learners are then combined (through averaging or
majority vote) to create a strong learner that can make accurate predictions.
• Boosting is similar to Bagging, except that it uses all of the data to train each learner.
Data points misclassified by previous learners are given more weight, so that subsequent
learners focus more on them during training.
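
The following is a minimal sketch of regularization, assuming scikit-learn is available; it
compares ordinary least squares with ridge regression (one regularization method among several)
on the same polynomial features. The data is synthetic, invented for illustration:

    import numpy as np
    from sklearn.linear_model import LinearRegression, Ridge
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import PolynomialFeatures

    rng = np.random.default_rng(0)
    x = np.sort(rng.uniform(0, 1, 30)).reshape(-1, 1)
    y = np.sin(2 * np.pi * x).ravel() + 0.3 * rng.standard_normal(30)

    # Same extra polynomial features, with and without a coefficient penalty.
    plain = make_pipeline(PolynomialFeatures(degree=9), LinearRegression()).fit(x, y)
    ridged = make_pipeline(PolynomialFeatures(degree=9), Ridge(alpha=1.0)).fit(x, y)

    print("largest unpenalized coefficient:", np.abs(plain[-1].coef_).max())
    print("largest ridge coefficient:      ", np.abs(ridged[-1].coef_).max())
    # Ridge's penalty shrinks the coefficients of the extra features,
    # keeping the complex model closer to a simple one.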
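Here is a sketch of Bagging under the same assumptions (scikit-learn, synthetic classification
data invented for illustration), with a shallow decision tree as the weak learner:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import BaggingClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=500, n_features=10, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # One weak learner vs. 50 weak learners, each trained on a bootstrap
    # sample of the data and combined by majority vote.
    weak = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X_tr, y_tr)
    bagged = BaggingClassifier(DecisionTreeClassifier(max_depth=2),
                               n_estimators=50, random_state=0).fit(X_tr, y_tr)

    print("single weak tree accuracy:", weak.score(X_te, y_te))
    print("bagged ensemble accuracy: ", bagged.score(X_te, y_te))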
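And a matching sketch of Boosting using AdaBoost (one boosting algorithm among several), which
reweights misclassified points as described above; the same synthetic-data assumptions apply:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=500, n_features=10, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # Each round trains a weak learner on all of the data, then increases
    # the weight of the points the previous learners got wrong.
    boosted = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X_tr, y_tr)
    print("boosted ensemble accuracy:", boosted.score(X_te, y_te))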