Classification by Decision Tree Induction
Decision tree induction is the learning of decision trees from class-labeled training tuples.
A decision tree is a flowchart-like tree structure, where each internal node (non-leaf node) denotes a test on an attribute,
each branch represents an outcome of the test, and each leaf node (or terminal node) holds a class label.
(Figure: an example decision tree for the concept buys_computer, indicating whether a customer is likely to purchase a computer; each internal node denotes a test on an attribute and each leaf holds a class label.)
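To make the flowchart analogy concrete, here is a minimal Python sketch of how a tuple is classified by following attribute tests from the root to a leaf. The dict-based node layout and the helper name classify are illustrative assumptions, and the small hand-built tree is simply consistent with the example worked out later in this section rather than quoted from it:

```python
# Minimal sketch: a node is either a class label (leaf) or a dict of the form
# {"attribute": name, "branches": {outcome: subtree}}.
# This layout is an illustrative assumption, not the text's notation.

def classify(node, tuple_):
    """Follow attribute tests from the root until a leaf (class label) is reached."""
    while isinstance(node, dict):             # internal node: test an attribute
        value = tuple_[node["attribute"]]     # outcome of the test
        node = node["branches"][value]        # follow the matching branch
    return node                               # leaf node: the class label

# A tiny hand-built tree in the spirit of the buys_computer example.
tree = {
    "attribute": "age",
    "branches": {
        "youth":       {"attribute": "student",
                        "branches": {"yes": "yes", "no": "no"}},
        "middle_aged": "yes",
        "senior":      {"attribute": "credit_rating",
                        "branches": {"fair": "yes", "excellent": "no"}},
    },
}

print(classify(tree, {"age": "youth", "student": "yes"}))   # -> yes
```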
During the late 1970s and early 1980s, J. Ross Quinlan, a researcher in machine learning, developed a decision tree algorithm known as ID3 (Iterative Dichotomiser).
This work expanded on earlier work on concept learning systems, described by E. B. Hunt, J. Marin, and P. T. Stone.
In 1984, a group of statisticians (L. Breiman, J. Friedman, R. Olshen, and C. Stone) published the book Classification
and Regression Trees (CART), which described the generation of binary decision trees.
Let A be the splitting attribute at node N. The tuples are partitioned according to one of three scenarios (see the sketch after this list):
1. A is discrete-valued: In this case, the outcomes of the test at node N correspond directly to the known values of A. A branch is created for each known value, aj, of A and labeled with that value. Because all of the tuples in a given partition have the same value for A, A need not be considered in any future partitioning of the tuples.
2. A is continuous-valued: In this case, the test at node N has two possible outcomes, corresponding to the conditions A <= split_point and A > split_point, where split_point is returned by Attribute selection method as part of the splitting criterion.
3. A is discrete-valued and a binary tree must be produced: The test at node N is of the form "A ∈ SA?", where SA is the splitting subset for A, returned by Attribute selection method as part of the splitting criterion. It is a subset of the known values of A. If a given tuple has value aj of A and aj ∈ SA, then the test at node N is satisfied.
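The three kinds of tests can be sketched as follows; the function names, the dict-based tuple layout, and the continuous attribute income_value are assumptions made for illustration only:

```python
# Illustrative sketch of the three test types described above.

def multiway_branch(tuple_, attribute):
    """Discrete-valued A: one branch per known value of A."""
    return tuple_[attribute]                    # e.g. "youth" / "middle_aged" / "senior"

def threshold_branch(tuple_, attribute, split_point):
    """Continuous-valued A: two branches, A <= split_point and A > split_point."""
    return "<=" if tuple_[attribute] <= split_point else ">"

def subset_branch(tuple_, attribute, splitting_subset):
    """Discrete-valued A with a binary tree: test whether A is in S_A."""
    return "yes" if tuple_[attribute] in splitting_subset else "no"

t = {"age": "youth", "income_value": 42_000}    # income_value is a made-up continuous attribute
print(multiway_branch(t, "age"))                          # -> youth
print(threshold_branch(t, "income_value", 50_000))        # -> <=
print(subset_branch(t, "age", {"youth", "middle_aged"}))  # -> yes
```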
An attribute selection measure is a heuristic for selecting the splitting criterion that "best" separates a given data partition, D, of class-labeled training tuples into individual classes.
Attribute selection measures are also known as splitting rules because they determine how the tuples at a given node are to be split. Three popular attribute selection measures are:
1. Information gain
2. Gain ratio
3. Gini index
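Only information gain is worked through in detail below. For reference, the sketch that follows gives the standard formulas behind the impurity side of each measure (entropy for information gain, split information for gain ratio, and Gini impurity for the Gini index); these definitions are assumed from the usual literature rather than quoted from this text:

```python
from math import log2

# p is a list of class proportions for a data partition D (they should sum to 1).

def entropy(p):
    """Info(D), the quantity used by information gain."""
    return -sum(pi * log2(pi) for pi in p if pi > 0)

def split_info(partition_sizes):
    """SplitInfo_A(D), the normaliser used by gain ratio."""
    total = sum(partition_sizes)
    return -sum((n / total) * log2(n / total) for n in partition_sizes if n > 0)

def gini(p):
    """Gini(D), the impurity used by the Gini index measure."""
    return 1 - sum(pi ** 2 for pi in p)

print(round(entropy([9/14, 5/14]), 3))   # -> 0.94 (matches the 0.940 bits used below)
print(round(gini([9/14, 5/14]), 3))      # -> 0.459
```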
Information gain
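For reference, the quantities used in the worked example below follow the standard definitions (the notation is the usual one and is assumed here, since it is not introduced elsewhere in this excerpt): p_i is the proportion of tuples in D belonging to class C_i, m is the number of classes, and D_1, ..., D_v are the partitions produced by splitting D on the v distinct values of attribute A.

\[
\mathrm{Info}(D) = -\sum_{i=1}^{m} p_i \log_2 p_i,
\qquad
\mathrm{Info}_A(D) = \sum_{j=1}^{v} \frac{|D_j|}{|D|}\,\mathrm{Info}(D_j),
\qquad
\mathrm{Gain}(A) = \mathrm{Info}(D) - \mathrm{Info}_A(D)
\]

The attribute with the highest Gain(A) is chosen as the splitting attribute.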
The training set, D, consists of class-labeled tuples randomly selected from the AllElectronics customer database; each tuple is described by the attributes age, income, student, and credit_rating. In this example, each attribute is discrete-valued.
The class label attribute, buys_computer, has two distinct values (namely, {yes, no}); therefore, there are two distinct classes.
There are nine tuples of class 'yes' and five tuples of class 'no', so the expected information needed to classify a tuple in D is Info(D) = -(9/14)log2(9/14) - (5/14)log2(5/14) = 0.940 bits.
Calculate information gain for age:
For age = youth there are 2 'yes' and 3 'no' tuples, for age = middle_aged there are 4 'yes' and 0 'no' tuples, and for age = senior there are 3 'yes' and 2 'no' tuples.
Info_age(D) = (5/14) × (-(2/5)log2(2/5) - (3/5)log2(3/5))
            + (4/14) × 0 (the middle_aged partition is pure)
            + (5/14) × (-(3/5)log2(3/5) - (2/5)log2(2/5))
            = 0.694 bits
Gain(age) = 0.940 - 0.694 = 0.246 bits
Calculate information gain for income:
For income = low there are 3 'yes' and 1 'no' tuples, for income = medium there are 4 'yes' and 2 'no' tuples, and for income = high there are 2 'yes' and 2 'no' tuples.
Info_income(D) = (4/14) × (-(3/4)log2(3/4) - (1/4)log2(1/4))
              + (6/14) × (-(4/6)log2(4/6) - (2/6)log2(2/6))
              + (4/14) × (-(2/4)log2(2/4) - (2/4)log2(2/4))
              = 0.911 bits
Gain(income) = 0.940 - 0.911 = 0.029 bits
Calculate information gain for student:
For student = yes there are 6 'yes' and 1 'no' tuples, and for student = no there are 3 'yes' and 4 'no' tuples.
Info_student(D) = (7/14) × (-(6/7)log2(6/7) - (1/7)log2(1/7))
               + (7/14) × (-(3/7)log2(3/7) - (4/7)log2(4/7))
               = 0.789 bits
Gain(student) = 0.940 - 0.789 = 0.151 bits
Calculate information gain for credit_rating:
For credit_rating = fair there are 6 'yes' and 2 'no' tuples, and for credit_rating = excellent there are 3 'yes' and 3 'no' tuples.
Info_credit_rating(D) = (8/14) × (-(6/8)log2(6/8) - (2/8)log2(2/8))
                      + (6/14) × (-(3/6)log2(3/6) - (3/6)log2(3/6))
                      = 0.892 bits
Gain(credit_rating) = 0.940 - 0.892 = 0.048 bits
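The arithmetic above is easy to check programmatically. The sketch below recomputes the four gains from the per-value class counts used in the calculations (the helper names are mine, not from the text):

```python
from math import log2

def info(counts):
    """Entropy Info(D) of a partition given its class counts, e.g. (9, 5)."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)

def gain(partitions):
    """Information gain of an attribute whose values split D into `partitions`
    (a list of per-value class-count tuples, here (yes, no))."""
    overall = tuple(map(sum, zip(*partitions)))   # class counts for all of D
    total = sum(overall)
    expected = sum(sum(p) / total * info(p) for p in partitions)
    return info(overall) - expected

# (yes, no) counts per attribute value, as used in the calculations above.
print(round(gain([(2, 3), (4, 0), (3, 2)]), 3))   # age           -> 0.247
print(round(gain([(3, 1), (4, 2), (2, 2)]), 3))   # income        -> 0.029
print(round(gain([(6, 1), (3, 4)]), 3))           # student       -> 0.152
print(round(gain([(6, 2), (3, 3)]), 3))           # credit_rating -> 0.048
# The small differences from 0.246 and 0.151 above come from rounding the
# intermediate entropies to three decimals in the hand calculation.
```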
Because age has the highest information gain among the attributes, it is selected as the splitting attribute: node N is labeled with age, and a branch is grown for each of the attribute's values, with the tuples partitioned accordingly.
Notice that the tuples falling into the partition for age = middle_aged all belong to the same class. Because they all belong to class "yes," a leaf is created at the end of this branch and labeled "yes."
The final decision tree returned by the algorithm is shown in the figure below.
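The construction just described (choose the attribute with the highest information gain, grow a branch for each of its values, and stop with a leaf when a partition is pure) is naturally recursive. The following is a minimal, illustrative Python sketch of that idea, not the exact algorithm from the text; it assumes tuples are plain dicts keyed by attribute name, labels are strings, and majority voting handles the case where no attributes remain:

```python
from collections import Counter
from math import log2

def info(labels):
    """Entropy of a list of class labels."""
    total = len(labels)
    return -sum(c / total * log2(c / total) for c in Counter(labels).values())

def info_gain(rows, labels, attr):
    """Information gain of splitting (rows, labels) on a discrete attribute."""
    parts = {}
    for row, label in zip(rows, labels):
        parts.setdefault(row[attr], []).append(label)
    expected = sum(len(ls) / len(labels) * info(ls) for ls in parts.values())
    return info(labels) - expected

def build_tree(rows, labels, attrs):
    """Top-down, recursive, greedy induction using information gain."""
    if len(set(labels)) == 1:                   # pure partition -> leaf
        return labels[0]
    if not attrs:                               # no attributes left -> majority class
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: info_gain(rows, labels, a))
    node = {"attribute": best, "branches": {}}
    for value in {row[best] for row in rows}:   # one branch per known value
        idx = [i for i, row in enumerate(rows) if row[best] == value]
        node["branches"][value] = build_tree(
            [rows[i] for i in idx], [labels[i] for i in idx],
            [a for a in attrs if a != best])
    return node
```

Called on the fourteen AllElectronics tuples with attrs = ["age", "income", "student", "credit_rating"], this sketch would place age at the root, in line with the hand calculation above.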