2b Decision Tree 18may

The document provides an introduction to decision trees, including definitions, examples, and issues in generating decision trees from training data. It discusses decision trees as classifiers with decision and leaf nodes, and shows examples of possible decision tree structures for classifying loans and whether to play tennis. It also outlines top-down induction of decision trees using a divide-and-conquer approach to build the tree.


Foundations of Machine Learning

Module 2: Linear Regression and Decision Tree

Part B: Introduction to Decision Tree

Sudeshna Sarkar
IIT Kharagpur
Definition
• A decision tree is a classifier in the form of a
tree structure with two types of nodes:
– Decision node: Specifies a choice or test of
some attribute, with one branch for each
outcome
– Leaf node: Indicates the classification of an example
Decision Tree Example 1
Whether to approve a loan
Employed?
  No → Credit Score?
    High → Approve
    Low → Reject
  Yes → Income?
    High → Approve
    Low → Reject


Decision Tree Example 3
Issues
• Given some training examples, what decision tree
should be generated?
• One proposal: prefer the smallest tree that is
consistent with the data (Bias)
– the tree with the least depth?
– the tree with the fewest nodes?
• Possible method:
– search the space of decision trees for the smallest decision
tree that fits the data
Example Data
Training Examples:

       Action  Author   Thread  Length  Where
  e1   skips   known    new     long    home
  e2   reads   unknown  new     short   work
  e3   skips   unknown  old     long    work
  e4   skips   known    old     long    home
  e5   reads   known    new     short   home
  e6   skips   known    old     long    work

New Examples:
  e7   ???     known    new     short   work
  e8   ???     unknown  new     short   work
Possible splits
Before splitting: skips 9, reads 9

Split on length:
  long:  skips 7, reads 0
  short: skips 2, reads 9

Split on thread:
  new: skips 3, reads 7
  old: skips 6, reads 2
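Counts like these can be reproduced in code. A minimal sketch, assuming a dict-per-example encoding of the six training examples listed earlier (the 9/9 counts above come from the full dataset, so the numbers for these six rows are smaller):

```python
from collections import Counter

# The six training examples from the slide, encoded as dicts (encoding assumed).
examples = [
    {"action": "skips", "author": "known",   "thread": "new", "length": "long",  "where": "home"},
    {"action": "reads", "author": "unknown", "thread": "new", "length": "short", "where": "work"},
    {"action": "skips", "author": "unknown", "thread": "old", "length": "long",  "where": "work"},
    {"action": "skips", "author": "known",   "thread": "old", "length": "long",  "where": "home"},
    {"action": "reads", "author": "known",   "thread": "new", "length": "short", "where": "home"},
    {"action": "skips", "author": "known",   "thread": "old", "length": "long",  "where": "work"},
]

def split_counts(examples, attribute):
    """Class distribution in each branch of a split on `attribute`."""
    counts = {}
    for e in examples:
        counts.setdefault(e[attribute], Counter())[e["action"]] += 1
    return counts

print(split_counts(examples, "length"))  # long: all skips; short: all reads
print(split_counts(examples, "thread"))  # old: all skips; new: 1 skip, 2 reads
```

On these six examples, splitting on length already separates the classes perfectly, which is exactly the comparison the split diagrams are making.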
Two Example DTs
Decision Tree for PlayTennis
• Attributes and their values:
– Outlook: Sunny, Overcast, Rain
– Humidity: High, Normal
– Wind: Strong, Weak
– Temperature: Hot, Mild, Cool

• Target concept - Play Tennis: Yes, No


Decision Tree for PlayTennis
Outlook
  Sunny → Humidity
    High → No
    Normal → Yes
  Overcast → Yes
  Rain → Wind
    Strong → No
    Weak → Yes
Decision Tree for PlayTennis
Outlook
  Sunny → Humidity
    High → No
    Normal → Yes
  Overcast → Yes
  Rain → Wind
    Strong → No
    Weak → Yes

Each internal node tests an attribute
Each branch corresponds to an attribute value
Each leaf node assigns a classification

Decision Tree for PlayTennis
Query instance:
  Outlook  Temperature  Humidity  Wind  PlayTennis
  Sunny    Hot          High      Weak  ?  → No

Outlook
  Sunny → Humidity
    High → No
    Normal → Yes
  Overcast → Yes
  Rain → Wind
    Strong → No
    Weak → Yes
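The lookup the slide walks through can be written directly as nested conditionals. This is a hand transcription of the tree above, not a learned model:

```python
def play_tennis(outlook, humidity, wind):
    """Hand-coded version of the PlayTennis tree from the slide."""
    if outlook == "Sunny":
        return "Yes" if humidity == "Normal" else "No"
    if outlook == "Overcast":
        return "Yes"
    if outlook == "Rain":
        return "Yes" if wind == "Weak" else "No"
    raise ValueError(f"unknown outlook: {outlook}")

# The query instance (Sunny, Hot, High, Weak): Temperature is never
# tested by this tree, so it is not a parameter.
print(play_tennis("Sunny", "High", "Weak"))  # -> No
```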
Decision Tree
Decision trees represent disjunctions of conjunctions:

Outlook
  Sunny → Humidity
    High → No
    Normal → Yes
  Overcast → Yes
  Rain → Wind
    Strong → No
    Weak → Yes

(Outlook=Sunny ∧ Humidity=Normal)
∨ (Outlook=Overcast)
∨ (Outlook=Rain ∧ Wind=Weak)
Searching for a good tree
• How should you go about building a decision tree?
• The space of decision trees is too big for systematic search.
• Instead, build the tree greedily, top down. At each node, either:

• Stop, and
  – return a value for the target feature, or
  – return a distribution over target feature values

• Or choose a test (e.g. an input feature) to split on.
  – For each value of the test, build a subtree for those examples with this value for the test.
Top-Down Induction of Decision Trees: ID3

1. A ← the "best" decision attribute for the next node
2. Assign A as the decision attribute for the node
3. For each value of A, create a new descendant
4. Sort the training examples to the leaf nodes according to the attribute value of the branch
5. If all training examples are perfectly classified (same value of the target attribute), stop; else iterate over the new leaf nodes.

Two key questions: 1. Which node to proceed with? 2. When to stop?
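The five steps above can be sketched as a recursive procedure. This is an outline, not ID3 proper: the attribute-selection heuristic is passed in as `choose`, since this part of the lecture deliberately leaves "best" undefined:

```python
from collections import Counter

def id3(examples, attributes, target, choose):
    """Recursive top-down induction following the slide's five steps.
    `choose(examples, attributes)` picks the "best" attribute (step 1);
    the heuristic itself is left open here, as on the slide."""
    labels = [e[target] for e in examples]
    if len(set(labels)) == 1:              # step 5: node is pure -> stop
        return labels[0]
    if not attributes:                     # no tests left -> majority leaf
        return Counter(labels).most_common(1)[0][0]
    a = choose(examples, attributes)       # steps 1-2: pick and assign A
    subtree = {}
    for v in sorted({e[a] for e in examples}):  # steps 3-4: one branch per
        subset = [e for e in examples if e[a] == v]  # value; sort examples down
        subtree[v] = id3(subset, [x for x in attributes if x != a],
                         target, choose)
    return {a: subtree}

# Toy run on three of the earlier examples, always choosing "length":
data = [
    {"length": "long",  "thread": "new", "action": "skips"},
    {"length": "short", "thread": "new", "action": "reads"},
    {"length": "long",  "thread": "old", "action": "skips"},
]
tree = id3(data, ["length", "thread"], "action", lambda ex, attrs: "length")
print(tree)  # {'length': {'long': 'skips', 'short': 'reads'}}
```

Splitting on length makes both branches pure, so the recursion stops immediately at two leaves.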
Choices
• When to stop
  – no more input features
  – all examples are classified the same
  – too few examples to make an informative split

• Which test to split on
  – the split that gives the smallest error
  – With multi-valued features:
    • split on all values, or
    • split the values into two subsets
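The "split gives smallest error" criterion can be made concrete: score each candidate attribute by how many training examples would be misclassified if every branch predicted its majority class. A sketch over the earlier e1–e6 data (dict encoding assumed, author column omitted for brevity):

```python
from collections import Counter

examples = [
    {"action": "skips", "thread": "new", "length": "long",  "where": "home"},
    {"action": "reads", "thread": "new", "length": "short", "where": "work"},
    {"action": "skips", "thread": "old", "length": "long",  "where": "work"},
    {"action": "skips", "thread": "old", "length": "long",  "where": "home"},
    {"action": "reads", "thread": "new", "length": "short", "where": "home"},
    {"action": "skips", "thread": "old", "length": "long",  "where": "work"},
]

def split_error(examples, attribute, target="action"):
    """Training errors if each branch of the split predicts its majority class."""
    error = 0
    for v in {e[attribute] for e in examples}:
        labels = [e[target] for e in examples if e[attribute] == v]
        error += len(labels) - Counter(labels).most_common(1)[0][1]
    return error

for a in ("length", "thread", "where"):
    print(a, split_error(examples, a))
# length 0, thread 1, where 2 -> splitting on length gives the smallest error
```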
