
ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING (21CS54)

MODULE 4
CHAPTER 6
DECISION TREE LEARNING
6.1 Introduction

Why is it called a decision tree?

- Because it starts from a root node and branches out to reach a number of possible solutions.

The benefits of having a decision tree are as follows:

- It does not require any domain knowledge.
- It is easy to comprehend.
- The learning and classification steps of a decision tree are simple and fast.

Example: a toll-free number helpline, where each caller response decides the next branch of the menu.

6.1.1 Structure of a Decision Tree

A decision tree is a structure that includes a root node, branches, and leaf nodes. Each internal node denotes a test on an attribute, each branch denotes the outcome of a test, and each leaf node holds a class label. The topmost node in the tree is the root node.

Decision trees apply to both classification and regression models.
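As an illustrative sketch of this structure in Python (the class and field names are assumptions for illustration, not taken from the textbook):

from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class DecisionNode:
    attribute: Optional[str] = None      # test applied at an internal node
    branches: Dict[str, "DecisionNode"] = field(default_factory=dict)  # test outcome -> child node
    label: Optional[str] = None          # class label held by a leaf node

# A toy tree: the root tests "weather"; each branch outcome ends in a leaf label.
root = DecisionNode(attribute="weather", branches={
    "sunny": DecisionNode(label="play"),
    "rainy": DecisionNode(label="do not play"),
})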

The decision tree consists of two major procedures:

1) Building the tree, and

2) Knowledge inference or classification.

Building the Tree

Knowledge Inference or Classification

Advantages of Decision Trees


Disadvantages of Decision Trees

6.1.2 Fundamentals of Entropy

How to draw a decision tree? The choice of the attribute to split on at each node is based on two quantities:

- Entropy
- Information gain
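For reference, the standard definitions of these two quantities (stated here in their usual textbook form) are:

Entropy(S) = - Σ_i p_i log2(p_i)

Gain(S, A) = Entropy(S) - Σ_v (|S_v| / |S|) Entropy(S_v)

where p_i is the proportion of examples in S belonging to class i, and S_v is the subset of S for which attribute A takes the value v. The attribute with the highest information gain is chosen as the split at each node.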


Algorithm 6.1: General Algorithm for Decision Trees

6.2 DECISION TREE INDUCTION ALGORITHMS

6.2.1 ID3 Tree Construction (ID3 stands for Iterative Dichotomiser 3)


A decision tree is one of the most powerful supervised learning algorithms, used for both classification and regression tasks. It builds a flowchart-like tree structure where each internal node denotes a test on an attribute, each branch represents an outcome of the test, and each leaf node (terminal node) holds a class label. The tree is constructed by recursively splitting the training data into subsets based on the values of the attributes until a stopping criterion is met, such as the maximum depth of the tree or the minimum number of samples required to split a node.
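A minimal ID3-style sketch in Python is given below. It is illustrative only; the helper names, the nested-dictionary tree representation, and the toy data are assumptions, not the textbook's Algorithm 6.1. It assumes nominal attributes, with each example stored as a dictionary mapping attribute names to values.

from collections import Counter
import math

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting on one attribute."""
    base = entropy(labels)
    remainder = 0.0
    for value in set(row[attr] for row in rows):
        subset = [lbl for row, lbl in zip(rows, labels) if row[attr] == value]
        remainder += (len(subset) / len(labels)) * entropy(subset)
    return base - remainder

def id3(rows, labels, attributes):
    """Recursively build a tree as nested dicts; leaves are class labels."""
    if len(set(labels)) == 1:          # pure node -> leaf
        return labels[0]
    if not attributes:                 # no attributes left -> majority class
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: information_gain(rows, labels, a))
    tree = {best: {}}
    for value in set(row[best] for row in rows):
        idx = [i for i, row in enumerate(rows) if row[best] == value]
        tree[best][value] = id3([rows[i] for i in idx],
                                [labels[i] for i in idx],
                                [a for a in attributes if a != best])
    return tree

# Example usage on a tiny toy dataset:
rows = [{"weather": "sunny", "wind": "weak"},
        {"weather": "sunny", "wind": "strong"},
        {"weather": "rainy", "wind": "weak"}]
print(id3(rows, ["play", "no", "no"], ["weather", "wind"]))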


6.2.2 C4.5 Construction


C4.5 is a widely used algorithm for constructing decision trees from a dataset.

The disadvantages of ID3 are: attributes must have nominal values, the dataset must not include missing data, and the algorithm tends to fall into overfitting.

To overcome these disadvantages, Ross Quinlan, the inventor of ID3, made improvements to address these bottlenecks and created a new algorithm named C4.5. The new algorithm can create more generalized models, including models over continuous data, and can handle missing data. It also works with discrete data and supports post-pruning.
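As a hedged sketch of how C4.5-style construction typically handles a continuous attribute (candidate thresholds are taken at midpoints between successive sorted values and scored by information gain; the function names and toy data below are assumptions, not the textbook's listing):

import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_threshold(values, labels):
    """Return (threshold, information_gain) for the best binary split value <= t."""
    base = entropy(labels)
    pairs = sorted(zip(values, labels))
    best = (None, 0.0)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue                                  # no boundary between equal values
        t = (pairs[i - 1][0] + pairs[i][0]) / 2       # candidate midpoint threshold
        left = [lbl for v, lbl in pairs if v <= t]
        right = [lbl for v, lbl in pairs if v > t]
        gain = base - (len(left) / len(pairs)) * entropy(left) \
                    - (len(right) / len(pairs)) * entropy(right)
        if gain > best[1]:
            best = (t, gain)
    return best

# Example: split a continuous "humidity" attribute against play/no-play labels.
print(best_threshold([70, 90, 85, 95, 70], ["yes", "no", "no", "no", "yes"]))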


Dealing with Continuous Attributes in C4.5


6.2.3 Classification and Regression Trees Construction


Classification and Regression Trees (CART) is a widely used algorithm for constructing decision trees that can be applied to both classification and regression tasks. CART is similar to C4.5 but has some differences in its construction and splitting criteria.

For classification, CART constructs the decision tree based on Gini's impurity index. It serves as an example of how the values of other variables can be used to predict the values of a target variable. It functions as a fundamental machine learning method and provides a wide range of use cases.
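A short sketch of Gini's impurity index as CART uses it to score a binary split (standard definition; the example labels are illustrative assumptions):

from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gini_of_split(left_labels, right_labels):
    """Weighted Gini impurity of a binary split, as CART evaluates it."""
    n = len(left_labels) + len(right_labels)
    return (len(left_labels) / n) * gini(left_labels) \
         + (len(right_labels) / n) * gini(right_labels)

# A pure split scores 0.0; a maximally mixed two-class node scores 0.5.
print(gini(["yes", "yes", "no", "no"]))              # 0.5
print(gini_of_split(["yes", "yes"], ["no", "no"]))   # 0.0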


6.2.4 Regression Trees


6.3 VALIDATING AND PRUNING OF DECISION TREES

Validating and pruning decision trees is a crucial part of building accurate and robust
machine learning models. Decision trees are prone to overfitting, which means they can
learn to capture noise and details in the training data that do not generalize well to new,
unseen data.

Validation and pruning are techniques used to mitigate this issue and improve the
performance of decision tree models.

The pre-pruning technique for decision trees is to tune the hyperparameters prior to the training pipeline. It involves the heuristic known as 'early stopping', which stops the growth of the decision tree, preventing it from reaching its full depth. It stops the tree-building process to avoid producing leaves with small samples. During each stage of the splitting of the tree, the cross-validation error is monitored. If the value of the error does not decrease any more, the growth of the decision tree is stopped.

The hyperparameters that can be tuned for early stopping and preventing overfitting are: max_depth, min_samples_leaf, and min_samples_split.

These same parameters can also be tuned to obtain a robust model.
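A hedged scikit-learn sketch of pre-pruning with these hyperparameters (the dataset and the particular values chosen are illustrative assumptions):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Early stopping: cap the depth and require minimum sample counts per split and leaf.
clf = DecisionTreeClassifier(max_depth=3,
                             min_samples_split=10,
                             min_samples_leaf=5,
                             random_state=0)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))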

Post-pruning does the opposite of pre-pruning and allows the decision tree model to grow to its full depth. Once the model grows to its full depth, tree branches are removed to prevent the model from overfitting. The algorithm continues to partition the data into smaller subsets until the final subsets produced are similar in terms of the outcome variable. The final subsets of the tree consist of only a few data points, allowing the tree to have learned the data to a T. However, when a new data point is introduced that differs from the learned data, it may not be predicted well.

The hyperparameter that can be tuned for post-pruning and preventing overfitting is: ccp_alpha.

ccp stands for Cost Complexity Pruning and can be used as another option to control the size of a tree. A higher value of ccp_alpha leads to an increase in the number of nodes pruned.
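A hedged scikit-learn sketch of post-pruning with ccp_alpha, using cost_complexity_pruning_path to enumerate candidate alphas (the dataset and the way the final alpha is selected are illustrative assumptions):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Grow a full tree, then obtain the effective alphas along its pruning path.
full_tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
path = full_tree.cost_complexity_pruning_path(X_train, y_train)

# Refit once per candidate alpha and keep the one that scores best on held-out data.
best_alpha, best_score = 0.0, 0.0
for alpha in path.ccp_alphas:
    pruned = DecisionTreeClassifier(ccp_alpha=alpha, random_state=0).fit(X_train, y_train)
    score = pruned.score(X_test, y_test)
    if score > best_score:
        best_alpha, best_score = alpha, score

print("best ccp_alpha:", best_alpha, "test accuracy:", best_score)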
