Decision Tree
Contents
◦ Decision Tree Induction
◦ Classification by Decision Tree Induction with a Training Dataset
◦ Algorithm For Decision Tree Induction
◦ Attribute Selection Measures
◦ Extracting Classification Rules from Trees
◦ Overfitting in Classification
Why Decision Trees Are So Popular
• The construction of decision tree classifiers does not require any domain knowledge.
• The learning and classification steps of decision trees are simple and fast.
• Applications include medicine, manufacturing and production, financial analysis, and molecular biology.
Classification by Decision Tree Induction
A flow-chart-like tree structure
• Internal node denotes a test on an attribute
• Branch represents an outcome of the test
• Leaf nodes represent class labels or class distribution
Decision tree generation consists of two phases
• Tree construction
• At start, all the training examples are at the root
• Partition examples recursively based on selected attributes
• Tree pruning
• Identify and remove branches that reflect noise or outliers
Use of decision tree: Classifying an unknown sample
• Test the attribute values of the sample against the decision tree
Training Dataset
Output: A Decision Tree for “buys_computer”
[Figure: decision tree for "buys_computer": the root node tests age?, with branches <=30, 30..40, and >40 leading to leaves labeled yes / no.]
Decision Tree Classification Task
Example of a Decision Tree
[Figure: training data with categorical attributes (Home Owner, Marital Status), a continuous attribute (Income), and a class label; the splitting attributes are marked on the internal nodes of the tree. Root: Home Owner (Yes -> NO; No -> MarSt); MarSt (Married -> NO; Single, Divorced -> Income); Income (< 80K -> NO; >= 80K -> YES).]
[Figure: an alternative tree over the same attributes (Home Owner, Income) that also fits the data.]
There could be more than one tree that fits the same data!
Apply Model to Test Data
Start from the root of the tree and test the record's attribute values against each internal node, following the matching branch at every step: first Home Owner, then Marital Status, then Income (< 80K vs. >= 80K), until a leaf is reached. The class label at that leaf is then assigned to the record; for the example test record, the model assigns Defaulted = "No".
[Figure: the test record traced down the Home Owner -> MarSt -> Income tree, step by step, ending at a NO leaf.]
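This walk-through can be expressed in a few lines of code. The nested-dict tree below is a hypothetical encoding of the Home Owner / MarSt / Income tree from the figure (with Income already discretized into "<80K" / ">=80K" for simplicity); it is a sketch, not the slides' implementation.

```python
# Hypothetical nested-dict encoding of the example tree from the figure.
tree = {
    "attribute": "Home Owner",
    "branches": {
        "Yes": "No",                      # leaf: Defaulted = No
        "No": {
            "attribute": "MarSt",
            "branches": {
                "Married": "No",
                "Single": {
                    "attribute": "Income",
                    "branches": {"<80K": "No", ">=80K": "Yes"},
                },
                "Divorced": {
                    "attribute": "Income",
                    "branches": {"<80K": "No", ">=80K": "Yes"},
                },
            },
        },
    },
}

def predict(node, record):
    # Walk from the root, following the branch that matches the record's
    # attribute value, until a leaf (a plain class label) is reached.
    while isinstance(node, dict):
        node = node["branches"][record[node["attribute"]]]
    return node

print(predict(tree, {"Home Owner": "No", "MarSt": "Married", "Income": "<80K"}))  # -> "No"
```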
Decision Tree Induction
Decision Tree Algorithms
CART (Classification and Regression Trees)
Methods for Expressing Test Conditions
• Depends on attribute types
• Binary
• Nominal
• Ordinal
• Continuous
Test Condition for Nominal Attributes
• Multi-way split:
• Use as many partitions as distinct values.
• Binary split:
• Divides values into two subsets
Test Condition for Ordinal Attributes
• Multi-way split: use as many partitions as there are distinct values.
• Binary split: divides values into two subsets; the grouping must preserve the order property among attribute values.
[Figure: example splits on an ordinal attribute; one of the shown binary groupings violates the order property.]
Test Condition for Continuous Attributes
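As a brief note (a standard formulation, not taken from the slide itself): for a continuous attribute, the test condition is typically expressed either as a binary comparison test (A < v versus A >= v for some split point v) or as a multi-way split into disjoint ranges (vi <= A < vi+1).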
Splitting Criteria for Decision Tree Induction
Three possible scenarios:
1. The splitting attribute is discrete-valued
2. The splitting attribute is continuous-valued
3. The splitting attribute is discrete-valued and a binary tree must be produced
Algorithm for Decision Tree Induction
Basic algorithm (a greedy algorithm):
• The tree is constructed in a top-down, recursive, divide-and-conquer manner.
• At the start, all the training examples are at the root.
• Attributes are categorical (continuous-valued attributes are discretized in advance).
• Test attributes are selected on the basis of a heuristic or statistical measure (e.g., information gain).
Algorithm for Decision Tree Induction
Conditions for stopping partitioning:
• All samples for a given node belong to the same class.
• There are no remaining attributes for further partitioning; majority voting is employed to label the leaf.
• There are no samples left.
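The basic algorithm can be summarized in code. Below is a minimal, illustrative sketch under the assumptions above (categorical attributes, information gain as the selection measure); the names entropy, info_gain, and build_tree are hypothetical, and the dataset is assumed to be a list of (attribute_dict, class_label) pairs.

```python
import math
from collections import Counter

def entropy(labels):
    """Info(D): expected number of bits needed to classify a tuple in D."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def info_gain(rows, attr):
    """Gain(attr) = Info(D) - Info_attr(D), for rows of (attributes, label)."""
    labels = [label for _, label in rows]
    expected_info = 0.0
    for value in {attrs[attr] for attrs, _ in rows}:
        subset = [label for attrs, label in rows if attrs[attr] == value]
        expected_info += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - expected_info

def build_tree(rows, attributes):
    labels = [label for _, label in rows]
    # Stopping conditions: all samples in one class, or no attributes left
    # (majority voting labels the leaf).
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: info_gain(rows, a))  # greedy choice
    node = {"attribute": best, "branches": {}}
    for value in {attrs[best] for attrs, _ in rows}:
        subset = [(attrs, label) for attrs, label in rows if attrs[best] == value]
        node["branches"][value] = build_tree(subset, [a for a in attributes if a != best])
    return node
```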
Attribute Selection Measure
• The expected information needed to classify a tuple in D is Info(D) = -Σi pi log2(pi), where pi is the nonzero probability that an arbitrary tuple in D belongs to class Ci.
• Information is encoded in bits, which is why the logarithm (base 2) is taken.
• Info(D) is also known as the entropy of D.
• After partitioning D on attribute A, more information may still be needed to arrive at an exact classification; InfoA(D) = Σj (|Dj|/|D|) × Info(Dj) is the expected information required to classify a tuple after the split.
Attribute Selection Measure
• Information gain is defined as the difference between the original information requirement and the expected information requirement after partitioning: Gain(A) = Info(D) - InfoA(D).
• The attribute with the highest information gain is chosen as the splitting attribute.
• For a good split, Info(D) should be high and InfoA(D) should be low.
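As a brief worked example of these formulas (the numbers are illustrative, not taken from the slides): if D contains 14 tuples, 9 of one class and 5 of the other, then Info(D) = -(9/14)log2(9/14) - (5/14)log2(5/14) ≈ 0.940 bits, and Gain(A) is obtained by subtracting InfoA(D), the weighted entropy of the partitions induced by A, from this value.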
Decision Tree After Single Partition
Avoid Overfitting in Classification
The generated tree may overfit the training data:
• Too many branches, some of which may reflect anomalies due to noise or outliers.
• The result is poor accuracy on unseen samples.
Two approaches to avoid overfitting:
• Prepruning: the construction of the tree is halted early by deciding not to split or partition the training data further.
• Postpruning:
1. Subtrees are removed from a fully grown tree.
2. A subtree is pruned by removing its branches and replacing it with a leaf.
3. The leaf is labeled with the most frequent class among the subtree it replaces.
Advantages
• Inexpensive to construct.
• Extremely fast at classifying unknown records.
• Easy to interpret for small-sized trees.
• Robust to noise (especially when methods to avoid overfitting are employed).
• Can easily handle redundant or irrelevant attributes (unless the attributes are interacting).
Disadvantages
• The space of possible decision trees is exponentially large; greedy approaches are often unable to find the best tree.
• Does not take into account interactions between attributes.
• Each decision boundary involves only a single attribute.
Entropy
• Entropy measures the impurity of an arbitrary collection of examples.
• For a collection S, entropy is given as Entropy(S) = -Σi pi log2(pi), where pi is the proportion of examples in S belonging to class i.
[Figure: example decision tree that first tests Feathers (Y/N) and then Lays eggs / Warm-blooded, separating animals that lay eggs from those that do not.]
Example 2
• Factors affecting sunburn.
Repeating again:
S = [Sarah, Dana, Annie, Katie]
S: [2+, 2-]
Entropy(S) = -(2/4)log2(2/4) - (2/4)log2(2/4) = 1
[Figure: decision tree for the sunburn data splitting on hair colour: Blonde -> test Lotion (Y -> not sunburned, N -> sunburned); Red -> sunburned; Brown -> not sunburned.]
Example Data
• Try creating a tree using the information-gain splitting criterion.
Gini Index
The Gini index measures how often a randomly chosen element would be incorrectly classified if it were labeled randomly according to the class distribution in the partition. An attribute with a lower Gini index should therefore be preferred.
The Gini index is defined as Gini(D) = 1 - Σi pi², where pi is the probability that a tuple in D belongs to class Ci.
• The Gini index is calculated by subtracting the sum of the squared class probabilities from one. It favors larger partitions.
• Information gain multiplies the probability of each class by the log (base 2) of that class probability. Information gain favors smaller partitions with many distinct values.
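As a small illustration of the "one minus the sum of squared class probabilities" definition, here is a hedged Python sketch (the function name gini is illustrative):

```python
from collections import Counter

def gini(labels):
    # Gini(D) = 1 - sum of squared class probabilities in the partition.
    total = len(labels)
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())

print(gini(["yes", "yes", "yes", "yes"]))  # 0.0  (pure node)
print(gini(["yes", "yes", "no", "no"]))    # 0.5  (maximally impure two-class node)
```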
Gain Ratio
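For reference, the standard C4.5 definition (stated here because the formula is not spelled out in the text): the gain ratio normalizes information gain by the split information of the attribute, GainRatio(A) = Gain(A) / SplitInfoA(D), where SplitInfoA(D) = -Σj (|Dj|/|D|) log2(|Dj|/|D|), and the attribute with the maximum gain ratio is selected as the splitting attribute.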
Tree Replication Problem
Overfitting
• Overfitting is a practical problem when building a decision tree model.
• A model is considered to be overfitting when the algorithm continues to go deeper and deeper into the tree to reduce the training-set error but ends up with an increased test-set error, i.e., the prediction accuracy of the model goes down.
• It generally happens when the tree builds many branches due to outliers and irregularities in the data.
Two approaches which we can use to avoid overfitting are:
• Pre-pruning
• Post-pruning
Different Types of Pruning
1) Prepruning:
• In this approach, the construction of the decision tree is stopped early: it is decided not to partition the branches any further. The last node constructed becomes a leaf node, and this leaf node may hold the most frequent class among the tuples.
• Attribute selection measures are used to evaluate the worth of a split. Threshold values are prescribed to decide which splits are regarded as useful; if partitioning a node would produce a split that falls below the threshold, the process is halted.
2) Postpruning:
• This method removes outlier branches from a fully grown tree. The unwanted branches are removed and replaced by a leaf node denoting the most frequent class label. This technique requires more computation than prepruning; however, it is more reliable.
• Pruned trees are more precise and compact than unpruned trees, but they carry the disadvantages of replication and repetition.
• Repetition occurs when the same attribute is tested again and again along a branch of a tree. Replication occurs when duplicate subtrees are present within the tree. These issues can be addressed with multivariate splits.
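As a hedged illustration of how these two pruning styles map onto scikit-learn (which implements a cost-complexity variant of postpruning rather than the subtree-replacement scheme described above; the iris data and the chosen thresholds are assumptions used only to make the snippet runnable):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Prepruning: halt growth early with depth / leaf-size thresholds.
prepruned = DecisionTreeClassifier(max_depth=3, min_samples_leaf=5).fit(X, y)

# Postpruning: compute the cost-complexity pruning path of a fully grown
# tree, then refit with a chosen ccp_alpha to prune away weak subtrees.
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X, y)
postpruned = DecisionTreeClassifier(ccp_alpha=path.ccp_alphas[-2], random_state=0).fit(X, y)
```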
Unpruned Vs Pruned Tree
Decision Tree Algorithm Advantages and Disadvantages
Advantages:
• Decision trees are easy to explain and result in a set of rules.
• They follow the same approach humans generally follow when making decisions.
• Interpretation of a complex decision tree model can be simplified by visualization; even a non-expert can understand the logic.
• The number of hyperparameters to be tuned is small.
Disadvantages:
• There is a high probability of overfitting with decision trees.
• They generally give lower prediction accuracy on a dataset compared to other machine-learning algorithms.
• Information gain with categorical variables gives a biased response toward attributes with a greater number of categories.
• Calculations can become complex when there are many class labels.
DecisionTreeClassifier(): Python Class
• This is the classifier class for decision trees in scikit-learn.
• It is the main class for implementing the algorithm. Some important parameters are:
• criterion: defines the function used to measure the quality of a split. Sklearn supports the "gini" criterion for the Gini index and "entropy" for information gain. By default, it takes the "gini" value.
• splitter: defines the strategy used to choose the split at each node. It supports "best" to choose the best split and "random" to choose the best random split. By default, it takes the "best" value.
• max_features: defines the number of features to consider when looking for the best split. We can input an integer, float, string, or None value.
❖ If an integer is given, it is taken as the maximum number of features at each split.
❖ If a float is given, it is interpreted as the fraction of features to consider at each split.
❖ If "auto" or "sqrt" is given, then max_features = sqrt(n_features).
❖ If "log2" is given, then max_features = log2(n_features).
❖ If None, then max_features = n_features. By default, it takes the None value.
• max_depth: denotes the maximum depth of the tree. It can take any integer value or None. If None, nodes are expanded until all leaves are pure or until all leaves contain fewer than min_samples_split samples. By default, it takes the None value.
• min_samples_split: the minimum number of samples required to split an internal node. If an integer value is given, it is taken as the minimum number; if a float, it is interpreted as a fraction. By default, it takes the value 2.
• min_samples_leaf: the minimum number of samples required to be at a leaf node. If an integer value is given, it is taken as the minimum number; if a float, it is interpreted as a fraction. By default, it takes the value 1.
• max_leaf_nodes: defines the maximum number of possible leaf nodes. If None, an unlimited number of leaf nodes is allowed. By default, it takes the None value.
• min_impurity_split: defines the threshold for early stopping of tree growth. A node will split if its impurity is above the threshold; otherwise it becomes a leaf. (Note: this parameter is deprecated in recent scikit-learn versions in favor of min_impurity_decrease.)
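A minimal usage sketch of DecisionTreeClassifier with some of the parameters above (the iris dataset and the specific parameter values are illustrative assumptions, not from the slides):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

clf = DecisionTreeClassifier(
    criterion="entropy",   # information gain instead of the default "gini"
    max_depth=3,           # prepruning: limit the depth of the tree
    min_samples_leaf=5,    # prepruning: require at least 5 samples per leaf
    random_state=42,
)
clf.fit(X_train, y_train)

print("test accuracy:", clf.score(X_test, y_test))
print(export_text(clf, feature_names=load_iris().feature_names))  # tree as rules
```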
Are tree-based algorithms better than linear models?
• If the relationship between the dependent and independent variables is well approximated by a linear model, linear regression will outperform a tree-based model.
• If there is high non-linearity and a complex relationship between the dependent and independent variables, a tree model will outperform a classical regression method.
• If you need to build a model that is easy to explain to people, a decision tree model will always do better than a linear model. Decision tree models are even simpler to interpret than linear regression!
Thank You