Machine Learning

The document discusses decision trees and random forests as machine learning algorithms, highlighting their structure, advantages, and the concept of entropy and information gain. Decision trees can suffer from overfitting, which random forests mitigate by using bagging techniques to combine multiple trees for improved accuracy and reduced variance. Random forests also provide insights into feature importance, making them effective for high-dimensional data and robust against noise.


Machine Learning (goal: low bias, low variance)

1. Decision Tree: In a decision tree there are two types of nodes: decision nodes and leaf nodes. Decision nodes are used to make decisions and have multiple branches, whereas leaf nodes hold the outputs of those decisions and do not contain any further branches.

A decision tree is essentially a giant structure of nested if-else statements. It is easy to understand, and prediction is cheap because traversing a (roughly balanced) tree takes logarithmic time. However, the algorithm suffers from overfitting (which the random forest helps solve).
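
For intuition, here is a minimal sketch (assuming scikit-learn and its bundled iris dataset, neither of which is named in these notes) that fits a decision tree and prints the learned splits as nested if-else style rules:

# Sketch only: fit a tree and print its splits as nested if-else rules.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(random_state=0).fit(iris.data, iris.target)

# export_text renders the fitted tree as indented if-else style rules
print(export_text(tree, feature_names=iris.feature_names))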

I. Entropy
Entropy is a measure of disorder.
The mathematical formula for entropy is:
E(S) = − Σ_{i=1}^{c} p_i log₂(p_i)

where p_i is the proportion of the data points belonging to class i and c is the number of classes.


Entropy in the context of a decision tree algorithm is a measure of uncertainty
or impurity in a dataset. It is a fundamental concept used to determine how a
decision tree splits the data at each node.
 The more the uncertainty, the higher the entropy.
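
A small sketch of the entropy formula in Python (the helper name and the NumPy usage are my own, not from these notes):

import numpy as np

def entropy(labels):
    # E(S) = -sum_i p_i * log2(p_i), computed over the class proportions
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

print(entropy(np.array([0, 0, 1, 1])))  # 1.0: maximum disorder for two classes
print(entropy(np.array([0, 0, 0, 0])))  # -0.0, i.e. zero: a pure, single-class set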
II. Information Gain:
Information gain is a metric used to train decision trees; specifically, it measures the quality of a split. Whichever column has the highest information gain (i.e., the largest decrease in entropy) is the one the algorithm selects to split the data.
For example, if we have data with 4 columns, we first split the data according to each of these columns, compute the information gain of each split, and then split on the column that gives the highest information gain.
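
A sketch of information gain for one candidate split (the function names are my own); it is simply the parent's entropy minus the weighted entropy of the child branches:

import numpy as np

def entropy(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(parent, children):
    # children: one label array per branch produced by the candidate split
    n = len(parent)
    weighted_child_entropy = sum(len(c) / n * entropy(c) for c in children)
    return entropy(parent) - weighted_child_entropy

# Splitting [0, 0, 1, 1] into two pure branches removes all uncertainty,
# so the gain equals the parent entropy (1.0).
print(information_gain(np.array([0, 0, 1, 1]),
                       [np.array([0, 0]), np.array([1, 1])]))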

Overfitting:
Overfitting can be controlled with hyperparameters such as max_depth (the maximum depth of the tree). There are other hyperparameters as well, for example min_samples_split.
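
As an illustration (assuming scikit-learn; the specific values are arbitrary, not recommendations), these hyperparameters are passed straight to the tree constructor to limit its growth:

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

deep = DecisionTreeClassifier(random_state=0)          # unrestricted, can overfit
pruned = DecisionTreeClassifier(max_depth=3,           # cap the depth of the tree
                                min_samples_split=10,  # need 10 samples to split a node
                                random_state=0)

for name, model in [("deep", deep), ("pruned", pruned)]:
    print(name, cross_val_score(model, X, y, cv=5).mean())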
Random Forest (very important)
Random forest is a strong general-purpose algorithm that works well on most machine learning projects, and it can be used for both regression and classification tasks.
Random forest means bagging a group of trees: if we use decision trees as the base models in the bagging technique, the result is called a random forest.

Bagging (bootstrapping and aggregation)

Bagging (short for Bootstrap Aggregating) is an ensemble learning technique in machine learning. It combines the predictions of multiple models (often of the same type, like decision trees) to produce a more robust and accurate prediction. Bagging reduces variance and helps prevent overfitting.

How Bagging Works

1. Bootstrap Sampling:
o Create multiple subsets of the training data by sampling with
replacement.
o Each subset has the same size as the original training set but may
contain duplicate samples and exclude others.

2. Train Multiple Models:
o Train a separate model (base learner) on each bootstrap sample.
o Commonly used base models include decision trees (e.g., in Random Forest).

3. Aggregate Predictions:
o For classification: Use majority voting (the most common class
among models).
o For regression: Use the average of predictions from all models.
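
A sketch of this whole procedure using scikit-learn's BaggingClassifier (my assumption that this is the tool being described; its default base learner is a decision tree):

from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier

X, y = load_iris(return_X_y=True)

bag = BaggingClassifier(
    n_estimators=100,  # 100 bootstrap samples, one base model per sample
    bootstrap=True,    # sample rows with replacement
    random_state=0,
).fit(X, y)

# Predictions of the 100 models are aggregated by majority vote
print(bag.predict(X[:5]))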

Why Use Bagging?

1. Reduces Variance:
o By averaging predictions, bagging stabilizes the results and
reduces the likelihood of overfitting.
2. Improves Accuracy:
o Even if individual models are weak learners, their combined
output can be much stronger.

3. Handles Overfitting:
o Particularly effective with high-variance models like decision trees.

Suppose we have a large dataset. In a random forest we take decision trees as the base models and build n models to train on the dataset. We distribute the data but do not give all of it to any one model; instead we sample the data by rows, by columns, or by a combination of both.
After training we have n decision tree models, each trained on its own sample of our dataset. When we make a prediction, we collect the prediction outputs from all the trained models.

For Classification: It uses majority voting (the most common class).
For Regression: It averages the predictions of all the trees.
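
A toy sketch (made-up numbers) of these two aggregation rules:

import numpy as np

# Classification: each tree votes for a class; the majority wins.
votes = np.array([1, 0, 1, 1, 0])                    # predictions from 5 trees
classes, counts = np.unique(votes, return_counts=True)
print("majority vote:", classes[np.argmax(counts)])  # -> 1

# Regression: the per-tree predictions are averaged.
preds = np.array([2.1, 1.9, 2.3, 2.0])
print("average:", preds.mean())                      # -> 2.075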

3. How Does Random Forest Work?

1. Bootstrap Sampling:
o Create multiple subsets of the dataset by sampling with
replacement.
o Each subset is called a bootstrap sample and is used to train one
decision tree.
o Some data points will be repeated in a subset, while others may
be left out.

2. Train Decision Trees:
o Train each decision tree independently on its bootstrap sample.
o When splitting nodes in the tree, a random subset of features is considered instead of all features. This reduces correlation among trees.

3. Aggregate Predictions:
o For classification: Each tree votes for a class, and the majority vote
becomes the final prediction.
o For regression: The predictions are averaged to get the final
output.
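
These three steps are what scikit-learn's RandomForestClassifier performs internally; a brief sketch of how that looks in code (parameter values are illustrative):

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)

forest = RandomForestClassifier(
    n_estimators=200,     # number of trees
    max_features="sqrt",  # random subset of features tried at each split
    bootstrap=True,       # each tree is trained on a bootstrap sample of rows
    random_state=0,
).fit(X, y)

print(forest.predict(X[:5]))  # majority vote across the 200 trees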

In machine learning we want low bias and low variance, but a single model usually cannot achieve both because of the bias-variance tradeoff. Random forest comes close to this goal: a decision tree is a low-bias, high-variance model, and by averaging many such trees the random forest brings the high variance down.

4. Why Use Random Forest?

Advantages:

1. Reduces Overfitting:
o Individual decision trees may overfit the data, but combining
many trees reduces this risk.
2. Handles High-Dimensional Data:
o Random Forest works well even with many features and datasets
with high dimensionality.
3. Robust to Noise:
o Since it averages predictions, outliers and noisy data have less
impact.
4. Feature Importance:
o Random Forest provides a measure of feature importance, which
can help you identify the most influential features in your dataset.

Feature Importance in Random Forest

Random Forest calculates feature importance by measuring how much each feature reduces impurity across all trees. Features with higher importance contribute more to the model's predictions.

 Measure the decrease in entropy or Gini impurity at each split caused by a feature.
 Average this decrease across all trees.
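
A short sketch of reading these impurity-based importances from a fitted scikit-learn forest (the iris dataset is just a stand-in):

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

data = load_iris()
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(data.data, data.target)

# feature_importances_ sums to 1; higher means a larger average impurity decrease
for name, score in sorted(zip(data.feature_names, forest.feature_importances_),
                          key=lambda pair: pair[1], reverse=True):
    print(f"{name}: {score:.3f}")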
