
Experiment No 4 Vanraj

This document discusses implementing a decision tree classifier in Python. It describes decision trees as a supervised learning algorithm that partitions data into progressively smaller subsets based on decision rules, producing a tree-like structure. It explains key concepts such as information gain, nodes, branches, and pruning, noting that decision trees can handle both categorical and numeric data and are easy to understand but prone to overfitting. Finally, it provides code implementing a decision tree classifier in Python and analyzes the output, concluding that the experiment's goal of understanding decision tree implementation was achieved.

Uploaded by

Anurag Singh

Sr. No: 39
Name: Vanraj Pardeshi
Experiment No. 4
Aim: Implementation of a decision tree classifier using Python
LO: LO4 - Implement various data mining algorithms from scratch using languages like Python/Java, etc.
Theory:

Decision Tree Classifier:


A decision tree classifier is a supervised learning algorithm used for classification tasks. It works
by partitioning the data into smaller and smaller subsets based on a set of decision rules until the subsets are
homogeneous or a stopping criterion is met.
The decision rules split the data at each node of the tree on the feature that provides the most
information gain, i.e. the best split. Information gain is calculated as the reduction in entropy (impurity) of
the data after a split.
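The entropy and information-gain calculation described above can be sketched from scratch; the helper names `entropy` and `information_gain` below are illustrative, not part of the original experiment's code:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (impurity) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(parent, left, right):
    """Reduction in entropy after splitting parent into left and right."""
    weight = lambda subset: len(subset) / len(parent)
    return entropy(parent) - (weight(left) * entropy(left)
                              + weight(right) * entropy(right))

# A perfectly pure split on a balanced binary parent yields the
# maximum possible gain of 1 bit.
parent = ['yes'] * 5 + ['no'] * 5
gain = information_gain(parent, ['yes'] * 5, ['no'] * 5)
print(round(gain, 3))  # 1.0
```

A split that leaves both subsets mixed would score lower, which is why the tree-building algorithm prefers the feature with the highest gain at each node.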
The decision tree is constructed recursively, starting from the root node, which contains the entire dataset. Each
internal node represents a test on a feature, and each leaf node represents a class label. The path from the root to
a leaf node represents the decision rules that are used to classify an instance.
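The root-to-leaf path of decision rules can be illustrated with a small hand-built tree; the weather features (`outlook`, `humidity`, `wind`) are a hypothetical example, not data from this experiment:

```python
# A hypothetical hand-built tree: each internal node tests one feature,
# each leaf is a plain class label.
tree = {
    "feature": "outlook",
    "branches": {
        "sunny": {"feature": "humidity",
                  "branches": {"high": "no", "normal": "yes"}},
        "overcast": "yes",
        "rainy": {"feature": "wind",
                  "branches": {"strong": "no", "weak": "yes"}},
    },
}

def classify(node, instance):
    """Follow the decision rules from the root until a leaf is reached."""
    while isinstance(node, dict):
        node = node["branches"][instance[node["feature"]]]
    return node

print(classify(tree, {"outlook": "sunny", "humidity": "normal"}))  # yes
```

Each dictionary lookup corresponds to one internal-node test, so the sequence of lookups traces exactly one root-to-leaf path.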
Decision tree classifiers have several advantages, including:
• They can handle both categorical and numerical data.
• They can handle missing values and outliers.
• They can handle irrelevant and redundant features.
• They can be used for both classification and regression tasks.
• They are easy to understand and interpret, and the decision rules can be visualized in a tree-like structure.
However, decision trees also have some limitations, including:
• They are prone to overfitting, especially if the tree is too deep or if there are too many irrelevant features.
• They can be sensitive to small changes in the data, which can result in a different tree structure.
• They can be biased towards features with many levels or values.
• They can be computationally expensive for large datasets.
Why use Decision Trees?
There are various algorithms in machine learning, so choosing the best algorithm for the given dataset and
problem is the key point to remember while creating a machine learning model. Below are two reasons for
using a decision tree:
• Decision trees mimic the way humans think when making a decision, so they are easy to
understand.
• The logic behind a decision tree can be easily followed because it has a tree-like
structure.
Decision Tree Terminologies
• Root Node: Root node is from where the decision tree starts. It represents the entire dataset, which
further gets divided into two or more homogeneous sets.
• Leaf Node: Leaf nodes are the final output node, and the tree cannot be segregated further after getting a
leaf node.
• Splitting: Splitting is the process of dividing the decision node/root node into sub-nodes according to the
given conditions.
• Branch/Sub-Tree: A subtree formed by splitting a node of the tree.
• Pruning: Pruning is the process of removing the unwanted branches from the tree.
• Parent/Child node: A node that splits into sub-nodes is called the parent node, and its sub-nodes are called
child nodes.
Code & Output:
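The original code and output are not reproduced here; a minimal sketch of the experiment, assuming scikit-learn and its built-in Iris dataset stand in for the original code, might look like this:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text
from sklearn.metrics import accuracy_score

# Load a standard dataset and hold out 30% of it for testing.
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42)

# criterion="entropy" selects splits by information gain;
# max_depth limits tree growth (pre-pruning) to reduce overfitting.
clf = DecisionTreeClassifier(criterion="entropy", max_depth=3,
                             random_state=42)
clf.fit(X_train, y_train)

# Visualize the learned decision rules as a text tree and evaluate.
print(export_text(clf, feature_names=list(iris.feature_names)))
print("Accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```

`export_text` prints the tree's decision rules node by node, which makes the root, branches, and leaf labels from the terminology section directly visible in the output.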

Conclusion: From the above experiment, I understood how to implement a decision tree classifier using Python.
