
MACHINE LEARNING WITH PYTHON
SEMESTER 5
UNIT - 2
SUPERVISED LEARNING ALGORITHMS:
Supervised learning algorithms are machine learning techniques that use
labeled data to train a model to make predictions or decisions on new,
unseen data. Here are some commonly used supervised learning algorithms:

1. Decision Trees: Decision trees are a popular supervised learning algorithm
used for both classification and regression tasks. They work by recursively
splitting the data into smaller subsets based on the values of its features, until a
decision is reached. The resulting tree can be used to make predictions by
following the path from the root node to a leaf node based on the feature
values of the input data (see the sketch after this list).

2. Tree Pruning: Tree pruning is a technique used to improve the performance
of decision trees by removing unnecessary branches or nodes from the tree.
This helps to reduce overfitting and improve generalization performance.

3. Rule-based Classification: Rule-based classification assigns labels using a set
of IF-THEN rules. A decision tree can be converted into such rules, since each
path from the root node to a leaf node represents a rule or condition that must
be met for that decision. This approach is particularly useful for interpretability
and explainability, as the rules can be easily understood by humans.

4. Naïve Bayes: Naïve Bayes is a probabilistic algorithm used mainly for
classification tasks. It works by calculating the probability of each class given
the feature values of the input data, based on Bayes' theorem and the
simplifying assumption that the features are independent of each other given
the class. The class with the highest probability is then selected as the
prediction (see the sketch after this list).

5. Bayesian Networks: Bayesian networks are a type of graphical model used
for probabilistic reasoning and decision making. They consist of a directed
acyclic graph (DAG) representing the conditional dependencies between
variables, along with a set of probability distributions for each variable given its
parents in the graph. Bayesian networks can be used for both classification and
regression tasks, as well as more complex tasks such as causal inference and
uncertainty propagation.
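
The notes above stop at the descriptions, so here is a minimal scikit-learn
sketch of a decision tree and a Naïve Bayes classifier. The iris dataset and the
max_depth setting are illustrative choices, not part of the syllabus:

# Minimal sketch: decision tree (with a simple form of pruning via max_depth)
# and Gaussian Naive Bayes on the built-in iris dataset.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Decision tree: recursively splits on feature values; limiting the depth
# acts like pruning and reduces overfitting.
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X_train, y_train)
print("Decision tree accuracy:", tree.score(X_test, y_test))

# Naive Bayes: picks the class with the highest posterior probability.
nb = GaussianNB()
nb.fit(X_train, y_train)
print("Naive Bayes accuracy:", nb.score(X_test, y_test))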

SUPPORT VECTOR MACHINES
Support Vector Machines (SVMs) are a type of supervised learning algorithm
used for both classification and regression tasks. SVMs work by finding the best
hyperplane (a line or plane in a high-dimensional space) that separates the data
into two classes, with the largest margin possible between the hyperplane and
the data points.

The margin is the distance between the hyperplane and the closest data points,
called support vectors. By maximizing the margin, SVMs can achieve better
generalization performance and reduce overfitting.

SVMs can also handle non-linear decision boundaries by mapping the input
data into a higher-dimensional space using a kernel function, such as a
polynomial, radial basis function (RBF), or sigmoid kernel. This allows SVMs to
learn complex, non-linear decision boundaries.

In classification tasks, SVMs output a label for each new input data point based
on which side of the hyperplane it falls on. In regression tasks (Support Vector
Regression), the model instead fits a function that keeps most of the training
points within a small margin (epsilon) of its predictions, and it outputs a
continuous value for each new input.

SVMs have several advantages over other supervised learning algorithms, such
as their ability to handle high-dimensional input spaces, their robustness to
outliers and noise, and their ability to provide interpretable decision
boundaries. However, SVMs can be computationally expensive for large
datasets due to their optimization requirements.
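
As a small illustration (not part of the notes), here is a scikit-learn sketch of an
SVM classifier with an RBF kernel and a support vector regressor; the toy data
and parameter values are made up:

# Minimal sketch: SVM classification (SVC) and regression (SVR) with an
# RBF kernel, which maps the data into a higher-dimensional space so a
# non-linear boundary can be learned.
import numpy as np
from sklearn.svm import SVC, SVR

# Toy data: two features, two classes (values are illustrative only).
X = np.array([[0, 0], [1, 1], [1, 0], [0, 1], [2, 2], [3, 3]])
y = np.array([0, 1, 0, 0, 1, 1])

clf = SVC(kernel="rbf", C=1.0)       # C trades off margin size vs. errors
clf.fit(X, y)
print(clf.predict([[2, 1]]))         # label depends on the learned boundary

# SVR fits a function that keeps most points within an epsilon-wide tube.
reg = SVR(kernel="rbf", epsilon=0.1)
reg.fit(X, y.astype(float))
print(reg.predict([[2, 1]]))         # a continuous value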

K-NEAREST NEIGHBOUR
Imagine you're lost in a forest and need to find your way to a specific tree. You
wouldn't draw a map, right? Instead, you'd look around and ask nearby trees
which way to go.

k-Nearest Neighbours (k-NN) is like that! It's a simple way to make predictions
in machine learning. Here's how:

1. Imagine your data points as trees: Each point has features (like color, size) and
a label (like type of tree).
2. You're lost: You have a new data point (a lost person) with features but no
label (no idea what tree they are).
3. Ask the neighbours: Find the k closest trees (k-NN) to the lost person based on
their features.
4. Follow the majority: Look at the labels of the k closest trees. The most
common label becomes the prediction for the lost person!

Example:

Imagine you want to classify fruits as apples or oranges. You have a basket of
fruits with features like size and color. When you encounter a new fruit, k-NN
would:

* Find the 3 closest fruits (k=3) in the basket based on size and color.
* If 2 are apples and 1 is an orange, the new fruit is likely an apple!

k-NN is simple, but powerful for many tasks like image recognition and
recommendation systems. It's like asking your friends for advice when you're
unsure, just in a more mathematical way.
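
Here is a minimal scikit-learn sketch of the fruit example with k = 3; the size
and colour numbers are made up for illustration:

# Minimal sketch: k-NN on the fruit example with k = 3.
from sklearn.neighbors import KNeighborsClassifier

# Each fruit: [size in cm, colour score]; the values are illustrative only.
X = [[7.0, 0.3], [6.5, 0.2], [7.5, 0.4],   # apples
     [8.0, 0.9], [8.5, 1.0]]               # oranges
y = ["apple", "apple", "apple", "orange", "orange"]

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X, y)

# The 3 nearest fruits vote on the label of the new fruit.
print(knn.predict([[7.2, 0.35]]))   # expected: ['apple']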

For a deeper understanding, refer to: Machine Learning | K-Nearest Neighbor (KNN)

ENSEMBLE LEARNING
Ensemble learning is a machine learning technique that combines multiple
machine learning models to improve the accuracy and robustness of
predictions. Ensemble methods can be used for both classification and
regression tasks.

There are several types of ensemble learning algorithms (a short code sketch of
each follows this list):

1. Bagging (Bootstrap Aggregating): Bagging is a technique that creates
multiple models on different subsets of the training data, called bootstrap
samples, and combines their predictions using a voting or averaging scheme.
This helps to reduce overfitting and improve the stability of the model.

2. Boosting: Boosting is a technique that iteratively trains multiple models, with
each subsequent model focusing on the misclassified or mispredicted
examples from the previous model. This helps to improve the accuracy of the
model by focusing on the most difficult examples.

3. Stacking: Stacking is a technique that combines multiple models trained on
different feature subsets or different algorithms, and uses a meta-model to
combine their predictions. This helps to improve the accuracy and robustness
of the model by leveraging the strengths of multiple algorithms.

4. Gradient Boosting: Gradient Boosting is a type of boosting algorithm that
iteratively adds new trees to the model, with each tree focusing on correcting
the errors of the previous trees. This helps to improve the accuracy of the
model, since each new tree can be read as a correction to the previous trees'
errors.
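
The notes stop at the descriptions, so here is a hedged scikit-learn sketch of the
four approaches above; the synthetic dataset, base models, and estimator
counts are illustrative choices:

# Minimal sketch: bagging, boosting, gradient boosting and stacking
# compared on a synthetic dataset.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import (BaggingClassifier, AdaBoostClassifier,
                              GradientBoostingClassifier, StackingClassifier)
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    # Bagging: many decision trees on bootstrap samples, combined by voting.
    "bagging": BaggingClassifier(n_estimators=50, random_state=0),
    # Boosting: each new model focuses on the previous model's mistakes.
    "boosting": AdaBoostClassifier(n_estimators=50, random_state=0),
    # Gradient boosting: each new tree corrects the previous trees' errors.
    "gradient boosting": GradientBoostingClassifier(random_state=0),
    # Stacking: a meta-model combines the predictions of the base models.
    "stacking": StackingClassifier(
        estimators=[("knn", KNeighborsClassifier()), ("svc", SVC())],
        final_estimator=LogisticRegression()),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, "accuracy:", model.score(X_test, y_test))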

RANDOM FOREST ALGORITHM
The Random Forest algorithm is a versatile and powerful tool in the machine
learning toolbox. Imagine it like a team of detectives, each with their own
unique approach to solving a mystery.

Here's the basic idea:

1. Building the Forest:

Instead of relying on one detective, you gather a whole team (the forest).
Each detective builds their own "case file" (decision tree) based on a random
subset of features and data points from the investigation. This diversity
prevents any one detective from getting stuck on irrelevant details.

2. Investigating the Evidence:

When you encounter a new "suspect" (data point), each detective examines
it through their unique lens, asking different questions and analyzing
different clues (features).

3. Collaborating for Answers:

No single detective has all the answers, so they share their findings. Each
tree makes a prediction (e.g., guilty or innocent) based on its own analysis.
The final verdict? The majority vote wins! The most common prediction from
the forest becomes the overall prediction for the new data point.

Benefits of the Random Forest:

- Accuracy Boost: By combining diverse perspectives, the forest reduces the risk
of overfitting and improves overall accuracy compared to single decision trees.
- Robustness: Even if some detectives make mistakes, the majority vote can still
lead to a correct conclusion.
- Versatility: The forest can handle various tasks, including classification (e.g.,
spam vs. not spam) and regression (e.g., predicting house prices).
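
To make the detective analogy concrete, here is a hedged scikit-learn sketch of
a random forest; the iris dataset and the parameter values are illustrative
choices, not part of the notes:

# Minimal sketch: a random forest, where each tree (detective) sees a
# bootstrap sample of the data and a random subset of features, and the
# forest's final prediction is a majority vote.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

forest = RandomForestClassifier(
    n_estimators=100,      # 100 "detectives" (trees)
    max_features="sqrt",   # each split looks at a random subset of features
    random_state=0)
forest.fit(X_train, y_train)

print("Accuracy:", forest.score(X_test, y_test))
print("Prediction for one flower:", forest.predict(X_test[:1]))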
