
Random Forest & Decision Tree

Basic Concept
 A decision tree is an important data structure that can be used to solve many computational problems.

Example 9.1: Binary Decision Tree

A B C f
0 0 0 m0
0 0 1 m1
0 1 0 m2
0 1 1 m3
1 0 0 m4
1 0 1 m5
1 1 0 m6
1 1 1 m7
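
As a rough, illustrative sketch (not part of the original slides), the truth table of Example 9.1 can be evaluated by a binary decision tree that tests A at the root, then B, then C; the minterm labels m0–m7 are kept as opaque outcomes.

```python
# A minimal sketch: the truth table of Example 9.1 as a binary decision tree.
# Each internal node tests one attribute (A, B, or C); each leaf holds an outcome m0..m7.

def classify(a: int, b: int, c: int) -> str:
    """Walk the tree root (A) -> B -> C and return the minterm label."""
    if a == 0:
        if b == 0:
            return "m0" if c == 0 else "m1"
        return "m2" if c == 0 else "m3"
    if b == 0:
        return "m4" if c == 0 else "m5"
    return "m6" if c == 0 else "m7"

# Example: the tuple (A=1, B=0, C=1) reaches the leaf labeled m5.
print(classify(1, 0, 1))  # m5
```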

Basic Concept
 In Example 9.1, we considered a decision tree in which the value of every attribute is binary. A decision tree is also possible where attributes are of a continuous data type.

Example 9.2: Decision Tree with numeric data

Some Characteristics
 Decision tree may be n-ary, n ≥ 2.

 There is a special node called root node.

 All nodes drawn as circles (ellipses) are called internal nodes.

 All nodes drawn as rectangular boxes are called terminal nodes or leaf nodes.

 The edges of a node represent the outcomes for the values of the attribute tested at that node.

 In a path, a node with the same label is never repeated.

 A decision tree is not unique, as different orderings of the internal nodes can give different decision trees; a node structure reflecting these properties is sketched below.
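
As a rough, hypothetical sketch (not from the slides), the characteristics above can be captured by a small node structure: internal nodes carry an attribute and one labeled edge per outcome, while leaf nodes carry only a class label.

```python
# A minimal sketch of an n-ary decision tree node, assuming the structure
# described above: internal nodes test an attribute, edges are labeled with
# attribute values, and leaves hold a class label.
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Leaf:
    label: str                               # class label, e.g. "Medium"


@dataclass
class Internal:
    attribute: str                           # attribute tested at this node, e.g. "Gender"
    children: Dict[object, Union["Internal", Leaf]] = field(default_factory=dict)
    # keys are the edge labels (attribute values), values are the subtrees


def classify(node, record: Dict[str, object]) -> str:
    """Follow the edge matching the record's value at each internal node until a leaf is reached."""
    while isinstance(node, Internal):
        node = node.children[record[node.attribute]]
    return node.label
```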

Decision Tree and Classification Task
 A decision tree helps us classify data.

 Internal nodes correspond to attributes

 Edges correspond to the values of those attributes

 External (leaf) nodes are the outcomes of the classification

 Such a classification is, in fact, made by posing questions, starting from the root node and following a path down to a terminal node.

Decision Tree and Classification Task
 Example 9.3 illustrates how we can solve a classification
problem by asking a series of questions about the attributes.
 Each time we receive an answer, a follow-up question is asked until we
reach a conclusion about the class label of the test record.

 The series of questions and their answers can be organized in
the form of a decision tree
 As a hierarchical structure consisting of nodes and edges

 Once a decision tree is built, it can be applied to any test record to
classify it.

Definition of Decision Tree

Definition 9.1: Decision Tree

Given a database D = {t1, t2, …, tn}, where ti denotes a tuple defined by a set of
attributes A = {A1, A2, …, Am}, and given a set of classes C = {c1, c2, …, ck}.

A decision tree T is a tree associated with D that has the following properties:
• Each internal node is labeled with an attribute Ai
• Each edge is labeled with a predicate that can be applied to the attribute
associated with the parent node of that edge
• Each leaf node is labeled with a class cj

Building Decision Tree
 In principle, there are exponentially many decision trees that can be
constructed from a given database (also called training data).
 Some of the trees may not be optimal

 Some of them may give inaccurate results

 Two approaches are known

 Greedy strategy
 A top-down recursive divide-and-conquer

 Modification of greedy strategy
 ID3
 C4.5
 CART, etc.

Build Decision Tree Algorithm
 Algorithm BuildDT
 Input: D : Training data set
 Output: T : Decision tree
Steps
1. If all tuples in D belong to the same class Cj
Add a leaf node labeled as Cj
Return // Termination condition

2. Select an attribute Ai (so that it is not selected twice in the same branch)
3. Partition D = { D1, D2, …, Dp} based on the p different values of Ai in D
4. For each Dk ∈ D
Create a node for Dk and add an edge between D and Dk labeled with the value of Ai in Dk

5. For each Dk ∈ D
BuildDT(Dk) // Recursive call
6. Stop
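
For concreteness, here is a minimal Python sketch of the BuildDT steps above. It assumes categorical attributes and tuples represented as dictionaries with a "class" key; the representation and helper names are illustrative, not prescribed by the slides.

```python
# A minimal sketch of BuildDT, assuming categorical attributes and a training
# set given as a list of dicts, each with a "class" key.
from collections import defaultdict


def build_dt(D, attributes):
    classes = {t["class"] for t in D}
    # Step 1: if all tuples belong to the same class, return a leaf node.
    if len(classes) == 1:
        return {"leaf": classes.pop()}
    if not attributes:
        # No attribute left on this branch: fall back to the majority class
        # (a common tie-breaking choice, not spelled out in the slides).
        majority = max(classes, key=lambda c: sum(t["class"] == c for t in D))
        return {"leaf": majority}

    # Step 2: select an attribute not yet used on this branch (here simply the
    # first one; ID3, C4.5 and CART use a purity measure instead).
    A = attributes[0]

    # Step 3: partition D by the distinct values of A.
    partitions = defaultdict(list)
    for t in D:
        partitions[t[A]].append(t)

    # Steps 4-5: create a child node per partition, label the edge with the
    # attribute value, and recurse.
    node = {"attribute": A, "children": {}}
    for value, Dk in partitions.items():
        node["children"][value] = build_dt(Dk, attributes[1:])
    return node


# Tiny usage example with hypothetical records:
data = [{"Gender": "F", "class": "S"},
        {"Gender": "M", "class": "M"},
        {"Gender": "M", "class": "M"}]
print(build_dt(data, ["Gender"]))
```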

Node Splitting in BuildDT Algorithm
 The BuildDT algorithm must provide a method for expressing an attribute test
condition and the corresponding outcomes for different attribute types

 Case: Binary attribute


 This is the simplest case of node splitting

 The test condition for a binary attribute generates only two outcomes

Node Splitting in BuildDT Algorithm
 Case: Nominal attribute
 Since a nominal attribute can have many values, its test condition can be expressed
in two ways:
 A multi-way split
 A binary split

 Multi-way split: the number of outcomes depends on the number of distinct values of the
corresponding attribute

 Binary splitting by grouping attribute values (see the sketch below)
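
A small, hypothetical illustration of the two options for a nominal attribute (the attribute "Marital Status" and its values are assumed here, not taken from the slides):

```python
# Multi-way vs. binary split for a nominal attribute (values are illustrative).
values = ["Single", "Married", "Divorced"]

# Multi-way split: one outcome (child branch) per distinct value -> 3 outcomes.
multi_way = {v: f"branch_{v}" for v in values}

# Binary split: group the values into two subsets; for a nominal attribute any
# 2-way grouping is allowed, since there is no order to preserve.
binary_split = ({"Single"}, {"Married", "Divorced"})


def follow_binary(value):
    """Return which side of the binary split a record's value falls on."""
    left, _right = binary_split
    return "left" if value in left else "right"


print(multi_way)
print(follow_binary("Married"))   # right
```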

Node Splitting in BuildDT Algorithm
 Case: Ordinal attribute
 It also can be expressed in two ways:

 A multi-way split
 A binary split

 Multi-way split: it is the same as in the case of a nominal attribute

 Binary split: attribute values should be grouped while maintaining the order property
of the attribute values (see the check sketched below)
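
For example, for an ordinal attribute such as Shirt Size with the order Small < Medium < Large (attribute and values assumed for illustration), the grouping {Small, Medium} vs {Large} respects the order, while {Small, Large} vs {Medium} does not:

```python
# Check whether a binary grouping of an ordinal attribute preserves its order.
order = ["Small", "Medium", "Large"]      # assumed ordering, for illustration


def is_contiguous(group):
    """True if the group occupies consecutive positions in the ordering."""
    ranks = sorted(order.index(v) for v in group)
    return ranks == list(range(ranks[0], ranks[0] + len(ranks)))


def respects_order(left, right):
    """A valid binary split keeps each group contiguous in the ordering."""
    return is_contiguous(left) and is_contiguous(right)


print(respects_order({"Small", "Medium"}, {"Large"}))   # True
print(respects_order({"Small", "Large"}, {"Medium"}))   # False
```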

Node Splitting in BuildDT Algorithm
 Case: Numerical attribute
 For a numeric attribute (with discrete or continuous values), a test condition can be
expressed as a comparison test

 Binary outcome: A > v or A ≤ v

 In this case, decision tree induction must consider all possible split positions

 Range query: vi ≤ A < vi+1 for i = 1, 2, …, q (if q ranges are chosen)

 Here, q should be decided a priori

 For a numeric attribute, decision tree induction is therefore a combinatorial
optimization problem
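
One common way to enumerate candidate positions for a binary split A ≤ v (a sketch; the slides do not prescribe a particular method, and the values below are illustrative) is to sort the distinct values of the attribute and take the midpoints between consecutive ones:

```python
# Candidate thresholds v for a binary test A <= v on a numeric attribute:
# midpoints between consecutive distinct values observed in the training data.
values = [1.6, 1.7, 1.75, 1.8, 1.85, 1.9, 2.0, 2.1]   # illustrative sample

distinct = sorted(set(values))
candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
print(candidates)   # prints the candidate thresholds

# Induction would evaluate the purity of the split A <= v for every candidate v
# and keep the best one; with a range query, q ranges must be fixed a priori.
```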

Illustration : BuildDT Algorithm
Example 9.4: Illustration of BuildDT Algorithm
 Consider a training data set as shown.

Person   Gender   Height   Class
1        F        1.6      S
2        M        2.0      M
3        F        1.9      M
4        F        1.88     M
5        F        1.7      S
6        M        1.85     M
7        F        1.6      S
8        M        1.7      S
9        M        2.2      T
10       M        2.1      T
11       F        1.8      M
12       M        1.95     M
13       F        1.9      M
14       F        1.8      M
15       F        1.75     S

Attributes:
Gender = {Male (M), Female (F)}              // Binary attribute
Height = {1.5, …, 2.5}                       // Continuous attribute
Class  = {Short (S), Medium (M), Tall (T)}

Given a person, we are to test in which class s/he belongs

Illustration : BuildDT Algorithm
 To build a decision tree, we can select the attributes in two different orderings:
<Gender, Height> or <Height, Gender>

 Further, for each ordering, we can choose different ways of splitting

 Different instances are shown in the following.

 Approach 1 : <Gender, Height>
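
The original slides present the resulting trees as figures. As a rough reconstruction from the training data above (the thresholds are read off Example 9.4, and the exact tree drawn in the slides may differ), selecting Gender first and then splitting on Height gives a tree such as:

```python
# One possible tree for Approach 1 <Gender, Height>; the Height thresholds are
# inferred from the Example 9.4 data and are only one consistent choice.

def classify(gender: str, height: float) -> str:
    if gender == "F":
        # Female records: heights <= 1.75 are Short, the rest are Medium.
        return "S" if height <= 1.75 else "M"
    # Male records: <= 1.7 Short, <= 2.0 Medium, above 2.0 Tall.
    if height <= 1.7:
        return "S"
    if height <= 2.0:
        return "M"
    return "T"


# Person 9 (M, 2.2) -> T and Person 15 (F, 1.75) -> S, matching the table.
print(classify("M", 2.2), classify("F", 1.75))
```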

Illustration : BuildDT Algorithm
 Approach 2 : <Height, Gender>

