
Decision Tree Learning

Presented by
Don Baechtel
Decision tree learning
• Used in statistics, data mining and machine learning.
• Uses a decision tree as a predictive model which maps observations about an item to conclusions about the item's target value.
• More descriptive names for such tree models are
classification trees or regression trees.
• In these tree structures, leaves represent
classifications and branches represent conjunctions
of features that lead to those classifications.
Decision Analysis
• A decision tree can be used to visually and explicitly represent decisions and decision making.
• In data mining, a decision tree describes data but not decisions; rather, the resulting classification tree can be an input for decision making.
Example Decision Tree
Decision tree learning
• Decision tree learning is a common method used in data mining.
• The goal is to create a model that predicts the value of a target
variable based on several input variables.
• Each interior node corresponds to one of the input variables;
• there are edges to children for each of the possible values of that
input variable.
• Each leaf represents a value of the target variable given the
values of the input variables represented by the path from the
root to the leaf.
• Trees can also be described as a combination of mathematical and computational techniques to aid the description, categorization and generalization of a given set of data.
Tree Learning
• A tree can be "learned" by splitting the source set
into subsets based on an attribute value test.
• This process is repeated on each derived subset in
a recursive manner called recursive partitioning.
• The recursion is complete when all items in the subset at a node have the same value of the target variable, or when splitting no longer adds value to the predictions, as sketched below.
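A minimal Python sketch of recursive partitioning, assuming toy data rows stored as dictionaries and a naive attribute ordering instead of a true "best split" criterion (the row format, attribute names and learn_tree function are illustrative, not part of the original slides):

def all_same_target(rows):
    return len({r["target"] for r in rows}) <= 1

def learn_tree(rows, attributes):
    # Stop when the subset is pure or there are no attributes left to split on.
    if all_same_target(rows) or not attributes:
        targets = [r["target"] for r in rows]
        return max(set(targets), key=targets.count)   # leaf: majority target value
    attribute = attributes[0]   # naive choice; real learners pick the "best" attribute
    children = {}
    for value in {r[attribute] for r in rows}:
        subset = [r for r in rows if r[attribute] == value]
        children[value] = learn_tree(subset, attributes[1:])
    return (attribute, children)   # interior node: attribute plus one child per value

data = [
    {"outlook": "sunny", "windy": "no",  "target": "play"},
    {"outlook": "sunny", "windy": "yes", "target": "stay"},
    {"outlook": "rain",  "windy": "no",  "target": "play"},
    {"outlook": "rain",  "windy": "yes", "target": "stay"},
]
print(learn_tree(data, ["windy", "outlook"]))   # ('windy', {'no': 'play', 'yes': 'stay'})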
Decision Tree Types
• Classification tree analysis is when the predicted outcome is the
class to which the data belongs.
• Regression tree analysis is when the predicted outcome can be
considered a real number (e.g. the price of a house, or a
patient’s length of stay in a hospital).
• Classification And Regression Tree (CART) analysis is used to
refer to both of the above procedures.
• CHi-squared Automatic Interaction Detection (CHAID) performs multi-level splits when computing classification trees.
• A Random Forest classifier uses a number of decision trees, in
order to improve the classification rate.
• Boosted Trees can be used for regression-type and classification-
type problems.
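A short sketch contrasting classification and regression trees, assuming the scikit-learn Python library (not one of the tools named in these slides) is installed; the tiny datasets are made up for illustration:

from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

X = [[0, 0], [1, 0], [0, 1], [1, 1]]

# Classification tree: the predicted outcome is a class label.
clf = DecisionTreeClassifier().fit(X, ["no", "no", "yes", "yes"])
print(clf.predict([[0, 1]]))    # expected: ['yes']

# Regression tree: the predicted outcome is a real number (e.g. a price).
reg = DecisionTreeRegressor().fit(X, [100.0, 110.0, 180.0, 200.0])
print(reg.predict([[1, 1]]))    # expected: [200.]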
Formulae
• The algorithms that are used for constructing decision trees
usually work top-down by choosing a variable at each step that
is the next best variable to use in splitting the set of items.
• "Best" is defined by how well the variable splits the set into
homogeneous subsets that have the same value of the target
variable.
• Different algorithms use different formulae for measuring
"best".
• These formulae are applied to each candidate subset, and the
resulting values are combined (e.g., averaged) to provide a
measure of the quality of the split.
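A minimal sketch of scoring one candidate split, assuming a simple misclassification-rate impurity; the function names and the size-weighted average used to combine the per-subset values are illustrative choices, not prescribed by the slides:

def misclassification_impurity(labels):
    # Fraction of items that do not belong to the majority class of the subset.
    majority = max(set(labels), key=labels.count)
    return 1.0 - labels.count(majority) / len(labels)

def split_score(subsets):
    # Size-weighted average of the per-subset impurities; lower is better.
    total = sum(len(s) for s in subsets)
    return sum(len(s) / total * misclassification_impurity(s) for s in subsets)

print(split_score([["yes", "yes"], ["no", "no"]]))   # 0.0 - a perfect split
print(split_score([["yes", "no"], ["yes", "no"]]))   # 0.5 - an uninformative split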
Gini impurity
• Used by the CART algorithm, Gini impurity is a
measure of how often a randomly chosen element
from the set would be incorrectly labeled if it were
randomly labeled according to the distribution of
labels in the subset.
• Gini impurity can be computed by summing the
probability of each item being chosen times the
probability of a mistake in categorizing that item.
• It reaches its minimum (zero) when all cases in the
node fall into a single target category.
Gini impurity
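The original slide presumably showed the usual formula, Gini(p) = 1 - sum_i p_i^2 (equivalently sum_i p_i * (1 - p_i), where p_i is the fraction of items with label i). A minimal Python sketch of that definition:

from collections import Counter

def gini_impurity(labels):
    # One minus the sum of squared class proportions in the subset.
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

print(gini_impurity(["a", "a", "a"]))        # 0.0 - pure node, the minimum
print(gini_impurity(["a", "b", "a", "b"]))   # 0.5 - evenly mixed two-class node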
Decision tree advantages
• Simple to understand and interpret. People are able to understand decision tree models
after a brief explanation.
• Requires little data preparation. Other techniques often require data normalization, dummy variables to be created, and blank values to be removed.
• Able to handle both numerical and categorical data. Other techniques are usually
specialized in analyzing datasets that have only one type of variable. Ex: relation rules can
be used only with nominal variables while neural networks can be used only with numerical
variables.
• Uses a white box model. If a given situation is observable in a model the explanation for
the condition is easily explained by Boolean logic. An example of a black box model is an
artificial neural network since the explanation for the results is difficult to understand.
• Possible to validate a model using statistical tests. That makes it possible to account for
the reliability of the model.
• Robust. Performs well even if its assumptions are somewhat violated by the true model
from which the data were generated.
• Performs well with large data sets in a short time. Large amounts of data can be analyzed using personal computers in a time short enough to enable stakeholders to make decisions based on the analysis.
Limitations
• The problem of learning an optimal decision tree is known to be NP-complete
under several aspects of optimality and even for simple concepts. Consequently,
practical decision-tree learning algorithms are based on heuristic algorithms such
as the greedy algorithm where locally optimal decisions are made at each node.
Such algorithms cannot guarantee to return the globally optimal decision tree.
Recent developments suggest the use of genetic algorithms to avoid locally optimal decisions and to search the decision tree space with little a priori bias.
• Decision-tree learners can create over-complex trees that do not generalize the
data well. This is called overfitting. Mechanisms such as pruning are necessary to
avoid this problem.
• There are concepts that are hard to learn because decision trees do not express
them easily, such as XOR, parity or multiplexer problems. In such cases, the
decision tree becomes prohibitively large. Approaches to solve the problem involve
either changing the representation of the problem domain (known as
propositionalization) or using learning algorithms based on more expressive
representations (such as statistical relational learning or
inductive logic programming).
Extending decision trees
with decision graphs
• In a decision tree, all paths from the root node to the leaf node
proceed by way of conjunction, or AND.
• In a decision graph, it is possible to use disjunctions (ORs) to join two or more paths together using Minimum Message Length (MML).
• Decision graphs have been further extended to allow for
previously unstated new attributes to be learnt dynamically and
used at different places within the graph.
• The more general coding scheme results in better predictive
accuracy and log-loss probabilistic scoring.
• In general, decision graphs infer models with fewer leaves than
decision trees.
Implementations
• Weka, a free and open-source data mining
suite, contains many decision tree algorithms.
• Orange, a free data mining software suite,
module orngTree.
• Sipina, a free decision tree software, including
an interactive tree builder.
Reference Materials
• Building Decision Trees in Python, from O'Reilly.
• An Addendum to "Building Decision Trees in Python", from O'Reilly.
• Decision Trees page at aaai.org, a page with
commented links.
• Decision tree implementation in Ruby (AI4R) at http://ai4r.rubyforge.org
