Decision Tree

Decision Tree Representation


• A decision tree is an arrangement of tests that provides an appropriate
classification at every step in an analysis.
• "In general, decision trees represent a disjunction of conjunctions of
constraints on the attribute-values of instances. Each path from the tree root
to a leaf corresponds to a conjunction of attribute tests, and the tree itself to
a disjunction of these conjunctions"
• More specifically, decision trees classify instances by sorting them down
the tree from the root node to some leaf node, which provides the
classification of the instance. Each node in the tree specifies a test of
some attribute of the instance, and each branch descending from that node
corresponds to one of the possible values for this attribute.
Decision Tree (Contd.)
• An instance is classified by starting at the root node of the decision
tree, testing the attribute specified by this node, then moving down the
tree branch corresponding to the value of the attribute. This process is
then repeated at the node on this branch and so on until a leaf node is
reached.
• Each node is connected to a set of possible answers.
• Each nonleaf node is connected to a test that splits its set of possible
answers into subsets corresponding to different test results.
• Each branch carries a particular test result's subset to another node.
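Below is a minimal Python sketch of the traversal just described. The nested-dict tree representation and the attribute names (from the classic PlayTennis example) are illustrative assumptions, not from these slides:

```python
# Minimal sketch: classify an instance by walking from the root to a leaf.
# The nested-dict tree format is a hypothetical representation.

def classify(tree, instance):
    """Follow the branch matching the instance's value at each tested node."""
    while isinstance(tree, dict):          # internal node: test an attribute
        attribute = tree["attribute"]      # attribute tested at this node
        value = instance[attribute]        # the instance's value for it
        tree = tree["branches"][value]     # descend the matching branch
    return tree                            # leaf: holds the classification

# A tiny hand-built tree for the classic PlayTennis data.
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": {"attribute": "humidity",
                  "branches": {"high": "no", "normal": "yes"}},
        "overcast": "yes",
        "rain": {"attribute": "wind",
                 "branches": {"strong": "no", "weak": "yes"}},
    },
}

print(classify(tree, {"outlook": "sunny", "humidity": "high"}))  # -> no
```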
Appropriate Problems for Decision Tree Learning
• Decision tree learning is generally best suited to problems with the
following characteristics:
• Instances are represented by attribute-value pairs.
• There is a finite list of attributes (e.g., hair color), and each instance stores a
value for each attribute (e.g., blonde).
• When each attribute has a small number of distinct values (e.g., blonde, brown,
red), it is easier for the decision tree to reach a useful solution.
• The algorithm can be extended to handle real-valued attributes (e.g., a
floating-point temperature).
Contd.
• The target function has discrete output values.
• A decision tree classifies each example as one of the output values.
• The simplest case exists when there are only two possible classes (Boolean classification).
• However, it is easy to extend the decision tree to produce a target function with more than two
possible output values.
• Although it is less common, the algorithm can also be extended to produce a target
function with real-valued outputs.
• Disjunctive descriptions may be required.
• Decision trees naturally represent disjunctive expressions.
• The training data may contain errors.
• Errors in the classification of examples, or in the attribute values describing those
examples, are handled well by decision trees, making them a robust learning method.
• Decision trees are, however, prone to overfitting the data.
Contd.
• The training data may contain missing attribute values.
• Decision tree methods can be used even when some training examples have
unknown values (e.g., humidity is known for only a fraction of the examples).
• After a decision tree is learned, it can also be re-represented as a set of
if-then rules in order to improve readability.
Basic Decision Tree Learning
• A decision tree is constructed by looking for regularities in data.
• Quinlan's ID3 Algorithm for Constructing a Decision Tree
Algorithm for Decision Tree Induction
• Basic algorithm (a greedy algorithm)
• Tree is constructed in a top-down recursive divide-and-conquer
manner
• At start, all the training examples are at the root
• Attributes are categorical (if continuous-valued, they are
discretized in advance)
• Examples are partitioned recursively based on selected
attributes
• Test attributes are selected on the basis of a heuristic or
statistical measure (e.g., information gain)
• Conditions for stopping partitioning
• All samples for a given node belong to the same class
• There are no remaining attributes for further partitioning –
majority voting is employed for classifying the leaf
• There are no samples left
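The loop above can be written compactly. The following is an illustrative Python sketch of ID3-style induction under the same stopping conditions; the data layout (a list of dicts plus a target key) and the helper names are assumptions for this sketch, not from the slides:

```python
# Greedy, top-down, recursive divide-and-conquer induction (ID3-style),
# selecting test attributes by information gain.
import math
from collections import Counter

def entropy(examples, target):
    """Entropy of the class distribution among `examples`."""
    counts = Counter(ex[target] for ex in examples)
    total = len(examples)
    return -sum(n / total * math.log2(n / total) for n in counts.values())

def info_gain(examples, attribute, target):
    """Entropy reduction from partitioning `examples` on `attribute`."""
    total = len(examples)
    remainder = 0.0
    for value in {ex[attribute] for ex in examples}:
        subset = [ex for ex in examples if ex[attribute] == value]
        remainder += len(subset) / total * entropy(subset, target)
    return entropy(examples, target) - remainder

def id3(examples, attributes, target):
    classes = [ex[target] for ex in examples]
    if len(set(classes)) == 1:           # all samples in one class: pure leaf
        return classes[0]
    if not attributes:                   # no attributes left: majority vote
        return Counter(classes).most_common(1)[0][0]
    # Greedy choice: the attribute with the highest information gain.
    best = max(attributes, key=lambda a: info_gain(examples, a, target))
    node = {"attribute": best, "branches": {}}
    # Branches are created only for values actually present in the data, so
    # the "no samples left" case does not arise in this sketch; with a fixed
    # global value list, an empty branch would take the parent's majority class.
    for value in {ex[best] for ex in examples}:
        subset = [ex for ex in examples if ex[best] == value]
        node["branches"][value] = id3(
            subset, [a for a in attributes if a != best], target)
    return node
```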
Brief Review of Entropy

Entropy measures the impurity of a collection of tuples. For $m$ classes occurring with probabilities $p_1, \dots, p_m$:

$$H = -\sum_{i=1}^{m} p_i \log_2(p_i)$$

For the binary case ($m = 2$), $H = -p \log_2(p) - (1 - p)\log_2(1 - p)$; it equals 0 for a pure collection ($p = 0$ or $p = 1$) and reaches its maximum of 1 bit at $p = 0.5$.

[Figure: entropy curve for $m = 2$]
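As a quick numeric check of the $m = 2$ case (an illustrative Python snippet, not from the slides):

```python
# Binary (m = 2) entropy at a few probabilities.
import math

def binary_entropy(p):
    if p in (0.0, 1.0):        # lim x->0 of x*log2(x) is 0
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(binary_entropy(0.5))    # 1.0   (maximum uncertainty)
print(binary_entropy(0.9))    # ~0.469
print(binary_entropy(1.0))    # 0.0   (pure collection)
```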
Attribute Selection Measure: Information Gain (ID3/C4.5)

■ Select the attribute with the highest information gain.
■ Let $p_i$ be the probability that an arbitrary tuple in $D$ belongs to
class $C_i$, estimated by $|C_{i,D}|/|D|$.
■ Expected information (entropy) needed to classify a tuple in $D$:

$$\mathrm{Info}(D) = -\sum_{i=1}^{m} p_i \log_2(p_i)$$

■ Information needed (after using $A$ to split $D$ into $v$ partitions) to
classify $D$:

$$\mathrm{Info}_A(D) = \sum_{j=1}^{v} \frac{|D_j|}{|D|} \times \mathrm{Info}(D_j)$$

■ Information gained by branching on attribute $A$:

$$\mathrm{Gain}(A) = \mathrm{Info}(D) - \mathrm{Info}_A(D)$$
Example: Attribute Selection by Information Gain

■ Class P: buys_computer = “yes” (9 tuples)
■ Class N: buys_computer = “no” (5 tuples)

$$\mathrm{Info}(D) = I(9,5) = -\frac{9}{14}\log_2\frac{9}{14} - \frac{5}{14}\log_2\frac{5}{14} = 0.940$$

$$\mathrm{Info}_{age}(D) = \frac{5}{14} I(2,3) + \frac{4}{14} I(4,0) + \frac{5}{14} I(3,2) = 0.694$$

Here $\frac{5}{14} I(2,3)$ means “age <=30” has 5 out of 14 samples, with 2 yes’es and 3 no’s. Hence

$$\mathrm{Gain}(age) = \mathrm{Info}(D) - \mathrm{Info}_{age}(D) = 0.246$$

Similarly, $\mathrm{Gain}(income) = 0.029$, $\mathrm{Gain}(student) = 0.151$, and $\mathrm{Gain}(credit\_rating) = 0.048$; age has the highest gain, so it is selected as the splitting attribute.
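These figures can be checked numerically; the short Python sketch below assumes the standard 14-example buys_computer dataset used on this slide:

```python
# Numeric check of the hand computation above.
import math

def I(*counts):
    """Expected information I(c1, c2, ...) for a class distribution."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c)

info_D = I(9, 5)                                             # 0.940
info_age = 5/14 * I(2, 3) + 4/14 * I(4, 0) + 5/14 * I(3, 2)  # 0.694
print(round(info_D, 3), round(info_age, 3), round(info_D - info_age, 3))
# -> 0.94 0.694 0.247  (0.246 on the slide, which rounds intermediate values)
```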
Resulting Decision Tree
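Written as nested conditionals, the tree induced from this example looks as follows (a sketch assuming the standard buys_computer dataset, where age has the highest gain and becomes the root):

```python
# The induced tree as nested conditionals: one test per internal node.
def buys_computer(age, student, credit_rating):
    if age == "<=30":
        return "yes" if student == "yes" else "no"
    elif age == "31..40":
        return "yes"          # this partition is pure: all examples are "yes"
    else:  # age == ">40"
        return "yes" if credit_rating == "fair" else "no"
```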
Overfitting and Tree Pruning
• Overfitting: An induced tree may overfit the training data
• Too many branches, some may reflect anomalies due to
noise or outliers
• Poor accuracy for unseen samples
• Two approaches to avoid overfitting
• Prepruning: Halt tree construction early; do not split a node if
this would result in the goodness measure falling below a
threshold
• Difficult to choose an appropriate threshold
• Postpruning: Remove branches from a “fully grown”
tree—get a sequence of progressively pruned trees
• Use a set of data different from the training data to decide which is
the “best pruned tree”
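As an illustration of both approaches, here is a sketch using scikit-learn (the slides do not prescribe a library; the dataset and thresholds are arbitrary choices): pre-pruning via growth limits set up front, and post-pruning via cost-complexity pruning scored on held-out validation data.

```python
# Pre-pruning: stop growth early via thresholds. Post-pruning: grow a full
# tree, then pick the best pruned tree using a separate validation set.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

# Pre-pruning: halt splitting when a node is too deep or too small.
pre = DecisionTreeClassifier(max_depth=4, min_samples_leaf=10)
pre.fit(X_train, y_train)

# Post-pruning: the cost-complexity path yields a sequence of progressively
# pruned trees (one per alpha); keep the one that scores best on validation.
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(
    X_train, y_train)
best = max(
    (DecisionTreeClassifier(random_state=0, ccp_alpha=a).fit(X_train, y_train)
     for a in path.ccp_alphas),
    key=lambda t: t.score(X_val, y_val),
)
print(pre.score(X_val, y_val), best.score(X_val, y_val))
```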

If-Then Rules
• Extracting Classification Rules from Trees
• Goal: Represent the knowledge in the form of IF-THEN rules
• One rule is created for each path from the root to a leaf;
• Each attribute-value pair along a path forms a conjunction;
• The leaf node holds the class prediction
• Rules are easier to understand
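A sketch of this extraction, reusing the hypothetical nested-dict tree format from the earlier classification example:

```python
# One IF-THEN rule per root-to-leaf path; the attribute tests along the
# path are conjoined, and the leaf supplies the class prediction.
def extract_rules(tree, conditions=()):
    if not isinstance(tree, dict):            # leaf: emit one finished rule
        body = " AND ".join(conditions) if conditions else "TRUE"
        return [f"IF {body} THEN class = {tree}"]
    rules = []
    for value, subtree in tree["branches"].items():
        test = f"{tree['attribute']} = {value}"
        rules.extend(extract_rules(subtree, conditions + (test,)))
    return rules

# With the buys_computer tree this yields rules such as:
#   IF age = <=30 AND student = no THEN class = no
#   IF age = 31..40 THEN class = yes
```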
Exercise
