
Decision Trees

Concept

A decision tree is a flowchart-like tree structure where each internal node represents a feature (or attribute), each branch represents a decision rule, and each leaf node represents an outcome.

The topmost node in a decision tree is known as the root node. The tree learns to partition the data on the basis of attribute values, and it does so recursively in a process called recursive partitioning. This flowchart-like structure helps you in decision-making. Its visualization, like a flowchart diagram, easily mimics human-level thinking, which is why decision trees are easy to understand and interpret.

[Image: decision tree structure | Abid Ali Awan]

A decision tree is a white-box type of ML algorithm: it exposes its internal decision-making logic, which is not available in black-box algorithms such as neural networks. Its training time is also faster than that of a neural network.

The time complexity of decision trees is a function of the number of records and attributes in the given data. The decision tree is a distribution-free or non-parametric method, which does not depend on probability distribution assumptions. Decision trees can handle high-dimensional data with good accuracy.
Attribute Selection
The basic idea behind any decision tree algorithm is as follows (a minimal sketch of this loop is shown after the list):
1. Select the best attribute using an Attribute Selection Measure (ASM) to split the
records.
2. Make that attribute a decision node and break the dataset into smaller subsets.
3. Build the tree by repeating this process recursively for each child until one
of the following conditions is met:
 All the tuples in the subset belong to the same class.
 There are no remaining attributes.
 There are no remaining instances.
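
As a rough illustration, the following Python sketch implements this recursive-partitioning loop. It is not code from the document: the row format (a list of dicts with a "label" key) and the score_fn argument (any attribute selection measure, such as information gain) are assumptions made for the example.

from collections import Counter

def build_tree(rows, attributes, score_fn):
    """Recursively partition `rows` (dicts with a 'label' key) into a tree.

    `score_fn(rows, attr)` is any attribute selection measure (higher = better);
    it stands in for the measures described in the next section.
    """
    if not rows:                                   # no instances left
        return None
    labels = [r["label"] for r in rows]
    if len(set(labels)) == 1:                      # all tuples share one class
        return labels[0]
    if not attributes:                             # no remaining attributes
        return Counter(labels).most_common(1)[0][0]

    # 1. Pick the best attribute according to the selection measure.
    best = max(attributes, key=lambda a: score_fn(rows, a))

    # 2. Make it a decision node and split the dataset into subsets.
    node = {"attribute": best, "children": {}}
    remaining = [a for a in attributes if a != best]

    # 3. Recurse on each child subset.
    for value in {r[best] for r in rows}:
        subset = [r for r in rows if r[best] == value]
        node["children"][value] = build_tree(subset, remaining, score_fn)
    return node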

Attribute Selection Measures

An attribute selection measure is a heuristic for selecting the splitting criterion that
partitions the data in the best possible manner. It is also known as a splitting rule because
it helps us determine breakpoints for tuples on a given node. An ASM assigns a rank
to each feature (or attribute) by evaluating how well it explains the given dataset, and
the attribute with the best score is selected as the splitting attribute. In the case of a
continuous-valued attribute, split points for the branches also need to be defined. The
most popular selection measures are Information Gain, Gain Ratio, and the Gini Index.
Information Gain

Claude Shannon introduced the concept of entropy, which measures the impurity of the
input set. In physics and mathematics, entropy refers to the randomness or impurity in a
system; in information theory, it refers to the impurity in a group of examples. Information
gain is the decrease in entropy: it computes the difference between the entropy before the
split and the weighted average entropy after the split of the dataset on the given attribute
values. The ID3 (Iterative Dichotomiser) decision tree algorithm uses information gain.

The entropy of a dataset D is:

Info(D) = − Σi pi log2(pi), summing over the m classes

Where pi is the probability that an arbitrary tuple in D belongs to class Ci.

The expected information required to classify a tuple from D after splitting on attribute A,
and the resulting information gain, are:

InfoA(D) = Σj (|Dj| / |D|) × Info(Dj), summing over the v partitions produced by A

Gain(A) = Info(D) − InfoA(D)

Where:
 Info(D) is the average amount of information needed to identify the class label of a tuple in D.
 |Dj|/|D| acts as the weight of the jth partition.
 InfoA(D) is the expected information required to classify a tuple from D based on the partitioning by A.

The attribute A with the highest information gain, Gain(A), is chosen as the splitting attribute at node N.
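
To make the formulas concrete, here is a small Python sketch of Info(D), InfoA(D), and Gain(A) for a discrete attribute. It is illustrative only; the row format (a list of dicts with a "label" key), the function names, and the toy rows at the end are assumptions, not code or data from the document.

import math
from collections import Counter

def entropy(labels):
    """Info(D): expected bits needed to identify the class of a tuple."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, attr):
    """Gain(A) = Info(D) - InfoA(D) for a discrete-valued attribute `attr`."""
    labels = [r["label"] for r in rows]
    info_d = entropy(labels)
    info_a = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r["label"] for r in rows if r[attr] == value]
        info_a += len(subset) / len(rows) * entropy(subset)  # weighted Info(Dj)
    return info_d - info_a

# Toy usage: the attribute with the highest gain would be chosen at this node.
rows = [{"outlook": "sunny", "label": "no"},
        {"outlook": "rainy", "label": "yes"},
        {"outlook": "sunny", "label": "no"},
        {"outlook": "overcast", "label": "yes"}]
print(information_gain(rows, "outlook"))
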
Gain Ratio

Information gain is biased toward attributes with many outcomes, i.e. it prefers
attributes with a large number of distinct values. For instance, consider an attribute
that is a unique identifier, such as customer_ID: splitting on it produces pure,
single-tuple partitions, so InfoA(D) is zero. This maximizes the information gain while
creating useless partitions. C4.5, an improvement of ID3, uses an extension of
information gain known as the gain ratio. The gain ratio handles this bias by normalizing
the information gain with a split information term (Split Info). The Java implementation
of the C4.5 algorithm is known as J48 and is available in the WEKA data mining tool.

The split information is defined as:

SplitInfoA(D) = − Σj (|Dj| / |D|) × log2(|Dj| / |D|)

Where:
 |Dj|/|D| acts as the weight of the jth partition.
 v is the number of discrete values of attribute A (the index j runs from 1 to v).

The gain ratio can then be defined as:

GainRatio(A) = Gain(A) / SplitInfoA(D)

The attribute with the highest gain ratio is chosen as the splitting attribute.
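
A corresponding sketch of SplitInfoA(D) and the gain ratio, reusing the hypothetical information_gain helper from the sketch above (again an illustration under the same assumed row format, not the document's code):

import math
from collections import Counter

def split_info(rows, attr):
    """SplitInfoA(D): entropy of the partition sizes produced by `attr`."""
    total = len(rows)
    sizes = Counter(r[attr] for r in rows).values()
    return -sum((n / total) * math.log2(n / total) for n in sizes)

def gain_ratio(rows, attr):
    """GainRatio(A) = Gain(A) / SplitInfoA(D); guards against zero split info."""
    si = split_info(rows, attr)
    return information_gain(rows, attr) / si if si > 0 else 0.0
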
Gini index

Another decision tree algorithm, CART (Classification and Regression Trees), uses the
Gini index to create split points:

Gini(D) = 1 − Σi pi², summing over the m classes

Where pi is the probability that a tuple in D belongs to class Ci.

The Gini index considers a binary split for each attribute: you compute a weighted sum of
the impurity of each partition. If a binary split on attribute A partitions the data D into
D1 and D2, the Gini index of D given that split is:

GiniA(D) = (|D1| / |D|) × Gini(D1) + (|D2| / |D|) × Gini(D2)

In the case of a discrete-valued attribute, the subset that gives the minimum Gini index
for that attribute is selected as its splitting subset. In the case of continuous-valued
attributes, the strategy is to consider each pair of adjacent values as a possible split
point, and the point with the smaller Gini index is chosen as the splitting point.

The attribute with the minimum Gini index is chosen as the splitting attribute.
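
The Gini computations can be sketched the same way. In the helper below, the set of attribute values routed to the left branch is passed as an argument; that interface, like the row format, is an illustrative choice rather than anything prescribed by the document.

from collections import Counter

def gini(labels):
    """Gini(D) = 1 - sum(pi^2) over the classes present in `labels`."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

def gini_after_binary_split(rows, attr, left_values):
    """GiniA(D) for the binary split `attr in left_values` vs. everything else."""
    left = [r["label"] for r in rows if r[attr] in left_values]
    right = [r["label"] for r in rows if r[attr] not in left_values]
    total = len(rows)
    return len(left) / total * gini(left) + len(right) / total * gini(right)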

Usage of Decision Trees

Decision trees can be used for two kinds of supervised learning tasks, as illustrated in the example below:

Regression: predicting a continuous value.

Classification: predicting a discrete class label.
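
In scikit-learn, for example, the two tasks are handled by DecisionTreeClassifier and DecisionTreeRegressor. The snippet below is a minimal sketch using built-in toy datasets; the datasets and parameter values are chosen only for illustration.

from sklearn.datasets import load_diabetes, load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Classification: predict a discrete class label (iris species).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("classification accuracy:", clf.score(X_test, y_test))

# Regression: predict a continuous value (diabetes disease progression).
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
reg = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X_train, y_train)
print("regression R^2:", reg.score(X_test, y_test))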

Optimizing Decision Tree Performance

The key tuning parameters of scikit-learn's DecisionTreeClassifier are listed below, followed by a short example.

 criterion : string, optional (default="gini"), the attribute selection measure. This parameter lets us choose among different attribute selection measures. Supported criteria are "gini" for the Gini index and "entropy" for information gain.
 splitter : string, optional (default="best"), the split strategy. This parameter lets us choose the split strategy. Supported strategies are "best" to choose the best split and "random" to choose the best random split.
 max_depth : int or None, optional (default=None), the maximum depth of the tree. If None, nodes are expanded until all leaves are pure or contain fewer than min_samples_split samples. A higher maximum depth tends to cause overfitting, while a lower value tends to cause underfitting.
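
A short sketch of how these parameters might be set when training DecisionTreeClassifier; the iris dataset and the specific values (max_depth=3, the criterion loop) are illustrative, not part of the document.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Compare the two attribute selection measures with a capped tree depth.
for criterion in ("gini", "entropy"):
    model = DecisionTreeClassifier(
        criterion=criterion,   # attribute selection measure: "gini" or "entropy"
        splitter="best",       # split strategy: "best" or "random"
        max_depth=3,           # limit depth to reduce overfitting
        random_state=0,
    )
    model.fit(X_train, y_train)
    print(criterion, "accuracy:", model.score(X_test, y_test))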
