Supervised Learning:
Classification-I
M M Awais
SPJCM
Decision tree induction
Classification - Decision Tree 2
Introduction
It is a method that induces concepts from
examples (inductive learning)
Most widely used & practical learning
method
The learning is supervised: i.e. the classes
or categories of the data instances are
known
It represents concepts as decision trees
(which can be rewritten as if-then rules)
Classification - Decision Tree 3
Introduction
Decision tree learning is one of the most
widely used techniques for classification.
Its classification accuracy is competitive with
other methods, and
it is very efficient.
The classification model is a tree, called
decision tree.
C4.5 by Ross Quinlan is perhaps the best
known system. It can be downloaded from
the Web.
Classification - Decision Tree 4
Introduction
The target function can be Boolean or
discrete valued
Classification - Decision Tree 5
Decision Trees
Example: “is it a good day to play golf?”
The set of attributes and their possible values:
  outlook: sunny, overcast, rain
  temperature: cool, mild, hot
  humidity: high, normal
  windy: true, false
A particular instance in the training set might be:
  <overcast, hot, normal, false>: play
In this case, the target class
is a binary attribute, so each
instance represents a positive
or a negative example.
Classification - Decision Tree 6
Decision Tree Representation
1. Each node corresponds to an attribute
2. Each branch corresponds to an attribute
value
3. Each leaf node assigns a classification
Classification - Decision Tree 7
Example
Classification - Decision Tree 8
Example
Outlook
  Sunny → Humidity (High / Normal)
  Overcast
  Rain → Wind (Strong / Weak)
A Decision Tree for the concept PlayTennis
An unknown observation is classified by testing its
attributes and reaching a leaf node
Classification - Decision Tree 9
Using Decision Trees for Classification
Examples can be classified as follows
1. look at the example's value for the feature specified
2. move along the edge labeled with this value
3. if you reach a leaf, return the label of the leaf
4. otherwise, repeat from step 1
Example (a decision tree to decide whether to go on a picnic):
  outlook
    sunny → humidity: high → N, normal → P
    overcast → P
    rain → windy: true → N, false → P
So a new instance <rainy, hot, normal, true>: ? will be classified as “noplay”.
Classification - Decision Tree 10
Decision Trees and Decision Rules
If attributes are continuous, internal nodes may test against a threshold:
  outlook
    sunny → humidity: > 75% → no, <= 75% → yes
    overcast → yes
    rain → windy: > 20 → no, <= 20 → yes
Each path in the tree represents a decision rule:
  Rule 1: If (outlook = “sunny”) AND (humidity <= 0.75) Then (play = “yes”)
  Rule 2: If (outlook = “rainy”) AND (wind > 20) Then (play = “no”)
  Rule 3: If (outlook = “overcast”) Then (play = “yes”)
  ...
Classification - Decision Tree 11
DECISION TREES
Basic Decision Tree Learning Algorithm
Most algorithms for growing decision trees
are variants of a basic algorithm
An example of this core algorithm is the ID3
algorithm developed by Quinlan (1986)
It employs a top-down, greedy search through
the space of possible decision trees
12
DECISION TREES
Basic Decision Tree Learning Algorithm
First of all, we select the best attribute to be
tested at the root of the tree.
For making this selection, each attribute is
evaluated using a statistical test to determine
how well it alone classifies the training
examples.
13
DECISION TREES
Basic Decision Tree Learning Algorithm
We have
  - 14 observations, D1 through D14
  - 4 attributes: Outlook, Temperature, Humidity, Wind
  - 2 classes (Yes, No)
14
DECISION TREES
Basic Decision Tree Learning Algorithm
Outlook
  Sunny: D1, D2, D8, D9, D11
  Overcast: D3, D7, D12, D13
  Rain: D4, D5, D6, D10, D14
15
DECISION TREES
Basic Decision Tree Learning Algorithm
The selection process is then repeated using
the training examples associated with each
descendant node to select the best attribute
to test at that point in the tree
16
DECISION TREES
Outlook
  Sunny: D1, D2, D8, D9, D11
  Overcast: D3, D7, D12, D13
  Rain: D4, D5, D6, D10, D14
What is the “best” attribute to test at this point? The possible
choices are Temperature, Wind & Humidity
17
DECISION TREES
Basic Decision Tree Learning Algorithm
This forms a greedy search for an acceptable
decision tree, in which the algorithm never
backtracks to reconsider earlier choices.
18
DECISION TREES
Which Attribute is the Best Classifier?
The central choice in the ID3 algorithm is
selecting which attribute to test at each node
in the tree
We would like to select the attribute which is
most useful for classifying examples
For this we need a good quantitative measure.
For this purpose, statistical properties such as
information gain or the Gini index are used.
19
Top-Down Decision Tree Generation
The basic approach usually consists of two phases:
Tree construction
At the start, all the training examples are at the
root
Examples are partitioned recursively based on
selected attributes
Tree pruning
remove tree branches that may reflect noise in
the training data and lead to errors when
classifying test data
improve classification accuracy
Classification - Decision Tree 20
Top-Down Decision Tree Generation
Basic Steps in Decision Tree Construction
  The tree starts as a single node representing all data.
  If the samples are all of the same class, the node becomes a leaf labeled with that class label.
  Otherwise, select the feature that best separates the samples into individual classes (how to select this feature is the question addressed next).
  Recursion stops when:
    samples in a node belong to the same class, or
    there are no remaining attributes on which to split (the majority class labels the leaf).
Classification - Decision Tree 21
How to find Feature to split?
Many methods are available, but our focus
will be on the following:
  Information theory (information gain)
  Gain ratio
  Gini index
Classification - Decision Tree 22
Information
High Uncertainty
No Uncertainty
Classification - Decision Tree 23
Valuable Information
Which information is more valuable:
Of high uncertain region, or
Of no uncertain region
Answer: the information about the high-uncertainty region.
Classification - Decision Tree 24
Information theory
Information theory provides a mathematical basis
for measuring the information content.
To understand the notion of information, think
about it as providing the answer to a question, for
example, whether a coin will come up heads.
If one already has a good guess about the answer,
then the actual answer is less informative.
If one already knows that the coin is rigged so that it
will come up heads with probability 0.99, then a
message (advance information) about the actual
outcome of a flip is worth less than it would be for an
honest coin (50-50).
Classification - Decision Tree 25
Information theory (cont …)
For a fair (honest) coin, you have no
information, and you are willing to pay more
(say, in dollars) for advance information:
the less you know, the more valuable the
information.
Information theory uses this same intuition,
but instead of measuring the value for
information in dollars, it measures information
contents in bits.
One bit of information is enough to answer a
yes/no question about which one has no
idea, such as the flip of a fair coin
Classification - Decision Tree 26
Information Basic
Classification - Decision Tree 27
Entropy
Classification - Decision Tree 28
Classification - Decision Tree 29
Classification - Decision Tree 30
Classification - Decision Tree 31
Information: Basics
Information (entropy) for a single event is:
  E = −pᵢ log₂ pᵢ, where pᵢ is the probability of event i (−pᵢ log pᵢ is always +ve)
For multiple events:
  E(I) = −Σᵢ pᵢ log₂ pᵢ
Suppose you toss a fair coin; find the information (entropy) when the probability of head or tail is 0.5 each.
  Possible events: 2, pᵢ = 0.5
  E(I) = −0.5·log₂ 0.5 − 0.5·log₂ 0.5 = 1.0
If the coin is biased, i.e., the chance of heads is 0.75 and of tails
is 0.25, then E(I) = −0.75·log₂ 0.75 − 0.25·log₂ 0.25 < 1.0
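These two coin entropies can be reproduced with a few lines of Python (an illustrative sketch of the formula above; the helper name entropy is ours, not from the slides):

```python
import math

def entropy(probs):
    """E(I) = -sum(p_i * log2 p_i), in bits; zero-probability events are skipped."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))    # fair coin   -> 1.0
print(entropy([0.75, 0.25]))  # biased coin -> ~0.811, i.e. less than 1.0
```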
Classification - Decision Tree 32
Information: Basics
Suppose you have a die and you roll it; find the entropy of the roll if the probabilities of each outcome (1 to 6) are equal.
  Possible events: 6, pᵢ = 1/6
  E(I) = 6 × (−(1/6)·log₂(1/6)) = 2.585
If the die is biased, i.e., the chance of a ‘6’ is 0.75, then what is the entropy?
  p(6) = 0.75
  p(all other) = 0.25, so p(any other single number) = 0.25/5 = 0.05 (equally divided among 1 to 5)
  Then E(I) = −0.75·log₂ 0.75 − 5·(0.05)·log₂(0.05) = 1.39
Classification - Decision Tree 33
Information: Basics
(The same die example as on the previous slide.)
Key point: as the probability of an event increases, the uncertainty, and with it the entropy, decreases.
Classification - Decision Tree 34
Information: Basics
(The same die example again.)
Key point: in decision making, we choose the variable whose known values most reduce this uncertainty.
Classification - Decision Tree 35
Decision Trees
The most notable types of decision tree algorithms are:-
Iterative Dichotomiser 3 (ID3): This algorithm uses information
gain to decide which attribute is to be used to classify the current
subset of the data. For each level of the tree, information gain is
calculated for the remaining data recursively.
C4.5: This algorithm is the successor of the ID3 algorithm. It
uses either information gain or gain ratio to decide upon
the classifying attribute. It is a direct improvement over the ID3
algorithm, as it can handle both continuous and missing attribute
values.
Classification and Regression Tree (CART): A dynamic
learning algorithm which can produce a regression tree as well as a
classification tree, depending upon the dependent variable.
Classification - Decision Tree 36
DT: Entropy – A measuring Value
Entropy is a concept that originated in thermodynamics
but later found its way into information theory.
In the decision tree construction process, the definition of
entropy as a measure of disorder suits well.
If the class values of the data in a node are equally
divided among the possible class values,
we say entropy (disorder) is maximum.
If the class value of the data in a node is the same for
all data, entropy (disorder) is minimum.
Classification - Decision Tree 37
DT: Entropy – A measuring Value
A decision tree is built top-down from a root
node and involves partitioning the data into
subsets that contain instances with similar
values (homogenous).
ID3 algorithm uses entropy to calculate the
homogeneity of a sample.
If the sample is completely homogeneous, the
entropy is zero; if the sample is
equally divided, it has an entropy of one.
Classification - Decision Tree 38
Entropy
The entropy curve for a two-class problem touches its highest point, 1,
at the center, where the class probability is 0.5.
Classification - Decision Tree 39
Information theory: Entropy measure
The entropy formula:
  entropy(D) = −Σ_{j=1}^{|C|} Pr(c_j) · log₂ Pr(c_j),   with   Σ_{j=1}^{|C|} Pr(c_j) = 1
Pr(cj) is the probability of class cj in data set D
We use entropy as a measure of impurity or
disorder of data set D. (or, a measure of
information in a tree)
Classification - Decision Tree 40
Entropy measure: E = −(p/s)·log₂(p/s) − (n/s)·log₂(n/s)
  p = number of +ve examples, n = number of −ve examples, s = total examples
As the data become purer and purer, the entropy value
becomes smaller and smaller. This is useful for classification
Classification - Decision Tree 41
Information gain
Given a set of examples D, we first compute its
entropy for the ‘c’ classes:
  entropy(D) = −Σ_{j=1}^{|C|} Pr(c_j) · log₂ Pr(c_j)
If we choose attribute Aᵢ, with v values, as the root of the
current tree, this will partition D into v subsets D₁, D₂,
…, Dᵥ. The expected entropy if Aᵢ is used as the
current root:
  entropy_{Aᵢ}(D) = Σ_{j=1}^{v} (|Dⱼ| / |D|) · entropy(Dⱼ)
Classification - Decision Tree 42
Information Gain
Classification - Decision Tree 43
Information gain (cont …)
Information gained by selecting attribute Ai to
branch or to partition the data is
  gain(D, Aᵢ) = entropy(D) − entropy_{Aᵢ}(D)
We choose the attribute with the highest gain to
branch/split the current tree.
As the information gain increases for a variable,
the uncertainty in decision making reduces.
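As a sketch of how entropy(D), entropy_Ai(D) and gain(D, Ai) fit together, the following Python snippet computes them for a list of records; the record layout and helper names are our own illustration, not part of the slides:

```python
import math
from collections import Counter

def entropy(labels):
    """entropy(D) over the class labels of a data set D."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def expected_entropy(rows, attr):
    """entropy_Ai(D): entropy of each subset D_j, weighted by |D_j| / |D|."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row["class"])
    return sum(len(g) / len(rows) * entropy(g) for g in groups.values())

def gain(rows, attr):
    """gain(D, Ai) = entropy(D) - entropy_Ai(D)."""
    return entropy([r["class"] for r in rows]) - expected_entropy(rows, attr)

# Tiny usage example on four hypothetical records:
rows = [{"outlook": "sunny", "class": "No"}, {"outlook": "overcast", "class": "Yes"},
        {"outlook": "sunny", "class": "No"}, {"outlook": "rain", "class": "Yes"}]
print(round(gain(rows, "outlook"), 3))   # -> 1.0 (outlook separates the classes perfectly)
```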
Classification - Decision Tree 44
Day outlook temp humidity wind play
D1 sunny hot high weak No
D2 sunny hot high strong No
D3 overcast hot high weak Yes
D4 rain mild high weak Yes
D5 rain cool normal weak Yes
D6 rain cool normal strong No
D7 overcast cool normal strong Yes
D8 sunny mild high weak No
D9 sunny cool normal weak Yes
D10 rain mild normal weak Yes
D11 sunny mild normal strong Yes
D12 overcast mild high strong Yes
D13 overcast hot normal weak Yes
D14 rain mild high strong No
Classification - Decision Tree 45
To build a decision tree, we need to calculate two types
of entropy using frequency tables as follows:
a) Entropy using the frequency table of one attribute:
Note: P(play = Yes) = 9/14 = 0.64 and P(play = No) = 5/14 = 0.36
Classification - Decision Tree 46
How to calculate log base 2?
To calculate a log₂-based value, use log(value)/log(2); for example,
log₂(0.36) = log₁₀(0.36)/log₁₀(2)
http://logbase2.blogspot.com/2008/08/log-calculator.html
Classification - Decision Tree 47
b) Entropy using the frequency table of two
attributes :
Classification - Decision Tree 48
Calculate the following entropies (hint: first build the frequency table; a worked check follows below):
  E(PlayGolf, temp) = P(Hot)·E(Yes, No) + P(Mild)·E(Yes, No) + P(Cool)·E(Yes, No)
  E(PlayGolf, humidity)
  E(PlayGolf, windy)
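A worked check for this exercise, using the 14-row table from the earlier slide (our own snippet; the expected values 0.911, 0.788 and 0.892 also reappear on the gain-ratio slide later):

```python
import math
from collections import Counter

data = [  # (outlook, temp, humidity, wind, play)
    ("sunny", "hot", "high", "weak", "No"),          ("sunny", "hot", "high", "strong", "No"),
    ("overcast", "hot", "high", "weak", "Yes"),      ("rain", "mild", "high", "weak", "Yes"),
    ("rain", "cool", "normal", "weak", "Yes"),       ("rain", "cool", "normal", "strong", "No"),
    ("overcast", "cool", "normal", "strong", "Yes"), ("sunny", "mild", "high", "weak", "No"),
    ("sunny", "cool", "normal", "weak", "Yes"),      ("rain", "mild", "normal", "weak", "Yes"),
    ("sunny", "mild", "normal", "strong", "Yes"),    ("overcast", "mild", "high", "strong", "Yes"),
    ("overcast", "hot", "normal", "weak", "Yes"),    ("rain", "mild", "high", "strong", "No"),
]

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def cond_entropy(col):
    """E(PlayGolf, attribute) = sum over attribute values of P(value) * E(classes | value)."""
    groups = {}
    for row in data:
        groups.setdefault(row[col], []).append(row[4])
    return sum(len(g) / len(data) * entropy(g) for g in groups.values())

for name, col in [("temp", 1), ("humidity", 2), ("windy", 3)]:
    print(name, round(cond_entropy(col), 3))   # temp 0.911, humidity 0.788, windy 0.892
```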
Classification - Decision Tree 49
Information Gain
The information gain is based on the
decrease in entropy after a dataset is split
on an attribute.
Constructing a decision tree is all about
finding the attribute that returns the highest
information gain (i.e., the most homogeneous
branches).
Classification - Decision Tree 50
Information Gain
Step 3: Choose attribute with the largest
information gain as the decision node.
Information Gain
Step 4a: A branch with entropy of 0 is a leaf
node.
Information Gain
Step 4b: A branch with entropy more than 0
needs further splitting.
Information Gain
Step 5: The ID3 algorithm is run recursively
on the non-leaf branches, until all data is
classified.
Decision Tree to Decision Rules
Decision Tree to Decision Rules
Example
Owns House   Married   Gender   Employed   Credit History   Risk Class
Yes          Yes       M        Yes        A                B
No           No        F        Yes        A                A
Yes          Yes       F        Yes        B                C
Yes          No        M        No         B                B
No           Yes       F        Yes        B                C
No           No        F        Yes        B                A
No           No        M        No         B                B
Yes          No        F        Yes        A                A
No           Yes       F        Yes        A                C
Yes          Yes       F        Yes        A                C
Classification - Decision Tree 62
Choosing the “Best” Feature
Candidate root features:
  Own House? (Yes / No)     Credit rating (A / B)
  Married? (Yes / No)       Gender (M / F)
Classification - Decision Tree 63
Choosing the “Best” Feature: Own House? (Yes / No)

Find the overall entropy first:
  Total samples: 10
  Class A: 3, Class B: 3, Class C: 4
  Entropy(D) = -(3/10)log(3/10) - (3/10)log(3/10) - (4/10)log(4/10) = 1.57

Own House has two values, Yes (5 instances) and No (5 instances), out of 10 in total, so the probability of each is 0.5.
Find entropy(Dj) for each of Yes and No, then add the two, weighted by these probabilities.
In the Yes branch, 1 of 5 instances is class A, 2 of 5 are class B, and 2 of 5 are class C:
  E(yes) = -(1/5)log(1/5) - (2/5)log(2/5) - (2/5)log(2/5) = 1.52
  E(no)  = -(2/5)log(2/5) - (1/5)log(1/5) - (2/5)log(2/5) = 1.52
  E(Dj)  = 0.5*E(yes) + 0.5*E(no) = 1.52
  Gain(D, Own House) = 1.57 - 1.52 = 0.05
Classification - Decision Tree 65
Similarly, find the gain values for all the other variables:
  Own House: 0.05
  Married: 0.72
  Gender: 0.88 (selected as the root node)
  Employed: 0.45
  Credit rating: 0.05
Classification - Decision Tree 72
Choosing the “Best” Feature
Gender
  M: Class A: 0, Class B: 3, Class C: 0. No further split is required here; it identifies B fully.
  F: Class A: 3, Class B: 0, Class C: 4. A further split is required here; it cannot identify A and C fully.
Apply the same procedure again on the other variables, leaving
out the column for Gender, and the rows for class B as it has been fully
determined.
Classification - Decision Tree 73
Choosing the “Best” Feature
Own House? (Yes / No)
  E(D) = 1.33
  Own House: 0.96
  Married: 0.00
  Etc…
Married is the best node as E(Dj) = 0;
hence its information gain will be maximum.
Classification - Decision Tree 75
Completing DT
Gender
  M → Class B: 3
  F → Class A: 3, Class C: 4 → split on Married
        Married = Yes → Class C: 4
        Married = No → Class A: 3
Classification - Decision Tree 76
Completing DT
Gender
  M → Class B: 3
  F → Married
        Yes → Class C: 4
        No → Class A: 3
Rules:
  R1: If Gender = M then Class B
  R2: If Gender = F and Married = Yes then Class C
  R3: If Gender = F and Married = No then Class A
Classification - Decision Tree 77
Table 6.1 Class‐labeled training tuples from AllElectronics customer database.
78
Classification - Decision Tree 79
Classification - Decision Tree 80
Classification - Decision Tree 81
Trees Construction Algorithm (ID3)
Decision Tree Learning Method (ID3)
Input: a set of examples S, a set of features F, and a target set T (target
class T represents the type of instance we want to classify, e.g., whether
“to play golf”)
1. If every element of S is already in T, return “yes”; if no element of S is in
T return “no”
2. Otherwise, choose the best feature f from F (if there are no features
remaining, then return failure);
3. Extend tree from f by adding a new branch for each attribute value
4. Distribute training examples to leaf nodes (so each leaf node S is now
the set of examples at that node, and F is the remaining set of features not
yet selected)
5. Repeat steps 1-4 for each leaf node
Main Question:
how do we choose the best feature at each step?
Note: the ID3 algorithm only deals with categorical attributes, but can be extended
(as in C4.5) to handle continuous attributes
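The steps above can be condensed into a short recursive sketch (a minimal Python illustration assuming categorical attributes and the information-gain criterion; it is not Quinlan's original code):

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def gain(rows, attr, target):
    """Information gain of splitting rows (dicts: attribute name -> value) on attr."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attr], []).append(r[target])
    return base - sum(len(g) / len(rows) * entropy(g) for g in groups.values())

def id3(rows, attrs, target):
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1:                  # all examples in one class -> leaf
        return labels[0]
    if not attrs:                              # no attributes left -> majority-class leaf
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: gain(rows, a, target))
    tree = {best: {}}
    for value in {r[best] for r in rows}:      # one branch per observed attribute value
        subset = [r for r in rows if r[best] == value]
        tree[best][value] = id3(subset, [a for a in attrs if a != best], target)
    return tree
```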
Classification - Decision Tree 82
Choosing the “Best” Feature
Using Information Gain to find the “best” (most discriminating) feature
Entropy E(I) of a set of instances I containing p positive and n negative examples:
  E(I) = −(p/(p+n))·log₂(p/(p+n)) − (n/(p+n))·log₂(n/(p+n))
Gain(A, I) is the expected reduction in entropy due to feature (attribute) A:
  Gain(A, I) = E(I) − Σ_over descendants j ((p_j + n_j)/(p + n))·E(I_j)
where the j-th descendant of I is the set of instances with value v_j for A.

Example: S: [9+, 5−], testing Outlook?
  E = −(9/14)·log(9/14) − (5/14)·log(5/14) = 0.940
  Branches: sunny [2+, 3−], overcast [4+, 0−] (“yes”, since all positive examples), rainy [3+, 2−]
Classification - Decision Tree 83
Decision Tree Learning - Example
(Using the 14 play-tennis examples in the table above.)

S: [9+, 5−]  (E = 0.940)

Split on humidity?
  high: [3+, 4−] (E = 0.985)     normal: [6+, 1−] (E = 0.592)
  Gain(S, humidity) = .940 - (7/14)*.985 - (7/14)*.592 = .151

Split on wind?
  weak: [6+, 2−] (E = 0.811)     strong: [3+, 3−] (E = 1.00)
  Gain(S, wind) = .940 - (8/14)*.811 - (6/14)*1.0 = .048

So, classifying examples by humidity provides more information gain than by
wind. In this case, however, you can verify that outlook has the largest
information gain, so it will be selected as root.
Classification - Decision Tree 84
Decision Tree Learning - Example
(Same data.)

S: [9+, 5−], Outlook {D1, D2, …, D14}

Split on outlook?
  sunny: [2+, 3−] (E = .970)     overcast: [4+, 0−] (E = 0) → yes     rainy: [3+, 2−] (E = .970)
  Gain(S, outlook) = .940 - (5/14)*.97 - (4/14)*0 - (5/14)*.97 = .247

Outlook therefore has the largest information gain of all the attributes,
so it is selected as the root.
Classification - Decision Tree 85
Decision Tree Learning - Example
Partially learned decision tree

S: [9+, 5−], Outlook {D1, D2, …, D14}
  sunny: ? (E = .970), [2+, 3−], {D1, D2, D8, D9, D11}
  overcast: yes (E = 0), [4+, 0−], {D3, D7, D12, D13}
  rainy: ? (E = .970), [3+, 2−], {D4, D5, D6, D10, D14}

Which attribute should be tested at the sunny node?
  Ssunny = {D1, D2, D8, D9, D11}
  Gain(Ssunny, humidity) = .970 - (3/5)*0.0 - (2/5)*0.0 = .970
  Gain(Ssunny, temp) = .970 - (2/5)*0.0 - (2/5)*1.0 - (1/5)*0.0 = .570
  Gain(Ssunny, wind) = .970 - (2/5)*1.0 - (3/5)*.918 = .019
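The three Ssunny gains can be checked with the following snippet (our own verification code; small differences from the slide come only from rounding):

```python
import math
from collections import Counter

sunny = [  # (temp, humidity, wind, play) for D1, D2, D8, D9, D11
    ("hot", "high", "weak", "No"), ("hot", "high", "strong", "No"),
    ("mild", "high", "weak", "No"), ("cool", "normal", "weak", "Yes"),
    ("mild", "normal", "strong", "Yes"),
]

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def gain(rows, col):
    groups = {}
    for row in rows:
        groups.setdefault(row[col], []).append(row[-1])
    split = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([r[-1] for r in rows]) - split

for name, col in [("humidity", 1), ("temp", 0), ("wind", 2)]:
    print(name, round(gain(sunny, col), 3))   # humidity 0.971, temp 0.571, wind 0.020
```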
Classification - Decision Tree 86
Highly-branching attributes
Problematic: attributes with a large number of
values (extreme case: ID code)
Subsets are more likely to be pure if there is
a large number of values
Information gain is biased towards choosing
attributes with a large number of values
This may result in overfitting (selection of an
attribute that is non-optimal for prediction)
Classification - Decision Tree 87
Gain Ratio for Attribute Selection (C4.5)
Classification - Decision Tree 88
Another Alternative to avoid selecting
attributes with large -domains
Classification - Decision Tree 89
Gini Index (CART, SLIQ, ibm
IntellegentMiner)
Classification - Decision Tree 90
Contd..!!!
Classification - Decision Tree 91
Classification - Decision Tree 92
Comparing Attribute Selection Measures
Classification - Decision Tree 93
Split Algorithm with Gini Index
Basic concept taken from economics, given
by Corrado Gini (1884 to 1965)
The index varies from 0 to 1:
  ZERO means no uncertainty
  ONE means maximum uncertainty
Income distribution of various countries (Gini index):
  Brazil 0.59
  India 0.32
  China 0.45
  USA 0.41
  Japan 0.25 (most evenly distributed income)
Classification - Decision Tree 94
Gini Index
The Gini index is a measure of impurity developed by the
Italian statistician Corrado Gini in 1912.
It is usually used to measure income inequality but
can be used to measure any form of uneven
distribution.
The Gini index is a number between 0 and 1, where 0
corresponds to perfect equality (where everyone
has the same income) and 1 corresponds to perfect
inequality (where one person has all the income and
everyone else has zero income).
  GINI(t) = 1 − Σⱼ p²(j | t)
Classification - Decision Tree 95
Diversity and Gini Index
High diversity, low purity:
  G = 1 − (3/8)² − (3/8)² − (1/8)² − (1/8)² = .69   (E = 1.811)
Low diversity, high purity:
  G = 1 − (6/7)² − (1/7)² = .24   (E = 0.592)
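Both pairs of figures can be verified with a short snippet (our own helpers, taking class counts):

```python
import math

def gini(counts):
    """Gini impurity 1 - sum(p_j^2) from a list of class counts."""
    n = sum(counts)
    return 1 - sum((c / n) ** 2 for c in counts)

def entropy(counts):
    n = sum(counts)
    return -sum(c / n * math.log2(c / n) for c in counts if c)

print(round(gini([3, 3, 1, 1]), 2), round(entropy([3, 3, 1, 1]), 3))  # 0.69 1.811
print(round(gini([6, 1]), 2),       round(entropy([6, 1]), 3))        # 0.24 0.592
```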
Classification - Decision Tree 96
Choosing the “Best” Feature (Gini)
Own House? (Yes / No)

Find the overall Gini first:
  Total samples: 10
  Gt = 1 - (3/10)² - (3/10)² - (4/10)² = 0.66
  Gain = Gt - Gi

Attribute: Own House
  G(y) = 1 - (1/5)² - (2/5)² - (2/5)² = 0.64
  G(n) = 0.64
  Total G = 0.5·G(y) + 0.5·G(n) = 0.64
Attribute: Married: Total G = 0.5·G(y) + 0.5·G(n) = 0.40
Attribute: Gender: G = 0.511
Attribute: Employed: G = 0.475
Attribute: Credit Rating: G = 0.64

Gains:
  Own House: 0.02
  Married: 0.26
  Gender: 0.302
  Employed: 0.66 - 0.475 = 0.185
Classification - Decision Tree 97
Choosing the “Best” Feature
Gender (M / F)
Choose Gender.
Apply the same procedure again on the other variables, leaving
out the column for Gender, and the rows for class B as it has been fully
determined.
Check if you can get the same DT or not.
Classification - Decision Tree 98
Categorical Attributes: Computing Gini Index
• For each distinct value, gather counts for each class in the set.
• Use the count matrix to make decisions.

Multi-way split on Outlook:
            Overcast   Rain   Sunny
  C1            4        3      2
  C2            0        2      3
  Gini          0       .48    .48      weighted Gini = 0.34

Two-way splits (find the best partition of values):
  {Overcast} vs {Rain, Sunny}: counts C1 (4, 5), C2 (0, 5); Gini 0 and .5; weighted Gini = 0.36
  {Overcast, Rain} vs {Sunny}: counts C1 (7, 2), C2 (2, 3); Gini .345 and .48; weighted Gini = 0.391
Classification - Decision Tree 99
Continuous Attributes: Computing
Gini Index…
Sorted values of Taxable Income with the class Cheat, and the candidate split positions:

  Cheat:            No   No   No   Yes  Yes  Yes  No   No   No   No
  Taxable Income:   60   70   75   85   90   95   100  120  125  220
  Split positions:  55   65   72   80   87   92   97   110  122  172  230

For each split position, count Yes/No on each side (≤ and >) and compute the weighted Gini:

  Gini:             0.420 0.400 0.375 0.343 0.417 0.400 0.300 0.343 0.375 0.400 0.420

The best split is at 97, with the lowest weighted Gini (0.300).
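A sketch of this split-position search in Python, using midpoints between consecutive sorted values as candidate thresholds (the slide lists slightly different positions, but the best split is the same region around 97, with weighted Gini 0.300):

```python
def gini(counts):
    n = sum(counts)
    return 1 - sum((c / n) ** 2 for c in counts) if n else 0

income = [60, 70, 75, 85, 90, 95, 100, 120, 125, 220]
cheat  = ["No", "No", "No", "Yes", "Yes", "Yes", "No", "No", "No", "No"]

pairs = sorted(zip(income, cheat))
best = None
for i in range(len(pairs) - 1):
    threshold = (pairs[i][0] + pairs[i + 1][0]) / 2          # midpoint candidate
    left  = [c for v, c in pairs if v <= threshold]
    right = [c for v, c in pairs if v > threshold]
    weighted = (len(left) * gini([left.count("Yes"), left.count("No")]) +
                len(right) * gini([right.count("Yes"), right.count("No")])) / len(pairs)
    if best is None or weighted < best[1]:
        best = (threshold, weighted)

print(best)   # -> (97.5, 0.3): the lowest weighted Gini, matching the slide's best split
```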
Classification - Decision Tree 100
Gini index (CART)
E.g., two classes, Pos and Neg, and dataset S
with p Pos-elements and n Neg-elements.
  f_p = p / (p+n),   f_n = n / (p+n)
  gini(S) = 1 − f_p² − f_n²
If dataset S is split into S1, S2 then
  gini_split(S1, S2) = gini(S1)·(p1+n1)/(p+n) + gini(S2)·(p2+n2)/(p+n)
Classification - Decision Tree 101
Gini index - play tennis example
(The 14 play-tennis instances from the earlier table, with classes P and N.)

Two top best splits at the root node:
  Split on outlook:  S1: {overcast} (4 Pos, 0 Neg), S2: {sunny, rain}; the overcast branch is 100% P
  Split on humidity: S1: {normal} (6 Pos, 1 Neg), S2: {high}; the normal branch is 86% P
Classification - Decision Tree 102
Calculations
Outlook
  Sunny or Rainy: Yes = 5, No = 5, Gini = .5
  Overcast: Yes = 4, No = 0, Gini = 0
  Weighted Gini of the split = 0.36
Temperature
  Hot or Cool: Yes = 5, No = 3, Gini = 0.47
  Mild: Yes = 4, No = 2, Gini = 0.44
  Weighted Gini of the split = 0.46
Humidity
  High: Yes = 3, No = 4, Gini = 0.49
  Normal: Yes = 6, No = 1, Gini = 0.25
  Weighted Gini of the split = 0.37
Windy
  FALSE: Yes = 6, No = 2, Gini = 0.38
  TRUE: Yes = 3, No = 3, Gini = 0.5
  Weighted Gini of the split = 0.43
Outlook has the lowest weighted Gini, so it is chosen.
Classification - Decision Tree 103
Calculations at Node 0
Outlook
  GINI(outlook = sunny or rainy) = 1 − (5/10)² − (5/10)² = 1/2
  GINI(outlook = overcast) = 1 − (4/4)² − (0/4)² = 0
  GINI(split based on outlook) = (10/14)·(1/2) + (4/14)·(0) = 0.3571
Classification - Decision Tree 104
Temperature
  GINI(temperature = hot or cool) = 1 − (5/8)² − (3/8)² = 0.46875
  GINI(temperature = mild) = 1 − (4/6)² − (2/6)² = 0.44
  GINI(split based on temperature) = (8/14)·0.46875 + (6/14)·0.44 = 0.456
Classification - Decision Tree 105
Humidity
  GINI(humidity = high) = 1 − (3/7)² − (4/7)² = 24/49
  GINI(humidity = normal) = 1 − (6/7)² − (1/7)² = 12/49
  GINI(split based on humidity) = (7/14)·(24/49) + (7/14)·(12/49) = 0.37
Classification - Decision Tree 106
Windy
  GINI(windy = FALSE) = 1 − (6/8)² − (2/8)² = 0.375
  GINI(windy = TRUE) = 1 − (3/6)² − (3/6)² = 0.5
  GINI(split based on windy) = (8/14)·0.375 + (6/14)·0.5 = 0.43
Classification - Decision Tree 107
The resulting Gini-based tree:
  V = outlook (N = 14)
    Overcast → 4 yes, 0 no
    Rain/Sunny → Humidity (N = 10)
      Normal (4 yes, 1 no) → windy (N = 5)
        False → 3 yes, 0 no
        True → V = outlook (N = 2)
          Rain → 1 no, 0 yes
          Sunny → 1 yes, 0 no
Classification - Decision Tree 108
Classification - Decision Tree 109
Dealing With Continuous Variables
Partition continuous attribute into a discrete set of
intervals
sort the examples according to the continuous attribute A
identify adjacent examples that differ in their target classification
generate a set of candidate thresholds midway
problem: may generate too many intervals
Another Solution:
take a minimum threshold M of the examples of the majority class in each
adjacent partition; then merge adjacent partitions with the same majority class
Example: M = 3 (candidate thresholds 70.5 and 77.5)
  Temperature: 64  65  68  69  70  71  72  72  75  75  80  81  83  85
  Play?:       yes no  yes yes yes no  no  yes yes yes no  yes yes no
  Same majority on both sides of 70.5, so those partitions are merged.
  Final mapping: temperature ≤ 77.5 ==> “yes”; temperature > 77.5 ==> “---”
Classification - Decision Tree 110
Improving on Information Gain
Info. Gain tends to favor attributes with a large number of values
larger distribution ==> lower entropy ==> larger Gain
Quinlan suggests using Gain Ratio
penalize for large number of values
  SplitInfo(A, S) = −Σᵢ (|Sᵢ| / |S|) · log₂(|Sᵢ| / |S|)
  GainRatio(A, S) = Gain(A, S) / SplitInfo(A, S)

Example: “outlook”, S: [9+, 5−], split into S1: [4+, 0−] (overcast), S2: [2+, 3−] (sunny), S3: [3+, 2−] (rainy)
  SplitInfo(outlook, S) = -(4/14).log(4/14) - (5/14).log(5/14) - (5/14).log(5/14) = 1.577
  GainRatio(outlook, S) = 0.246 / 1.577 = 0.156
Classification - Decision Tree 111
Gain Ratios of Decision Variables
Outlook:     Info = 0.693   Gain = 0.940 - .693 = 0.247   Split info = info([5, 4, 5]) = 1.577   Gain ratio = 0.247/1.577 = 0.156
Temperature: Info = 0.911   Gain = 0.940 - .911 = 0.029   Split info = info([4, 6, 4]) = 1.362   Gain ratio = 0.029/1.362 = 0.021
Humidity:    Info = 0.788   Gain = 0.940 - .788 = 0.152   Split info = info([7, 7]) = 1          Gain ratio = 0.152/1 = 0.152
Windy:       Info = 0.892   Gain = 0.940 - .892 = 0.048   Split info = info([8, 6]) = .985       Gain ratio = 0.048/.985 = 0.049
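The outlook row can be reproduced as follows (our own snippet; the helper info takes a list of class counts or subset sizes):

```python
import math

def info(counts):
    """Entropy in bits of a list of counts (class counts or subset sizes)."""
    n = sum(counts)
    return -sum(c / n * math.log2(c / n) for c in counts if c)

gain_outlook = info([9, 5]) - (5/14)*info([2, 3]) - (4/14)*info([4, 0]) - (5/14)*info([3, 2])
split_info   = info([5, 4, 5])          # sizes of the sunny / overcast / rainy subsets
print(round(gain_outlook, 3), round(split_info, 3), round(gain_outlook / split_info, 3))
# -> 0.247 1.577 0.156
```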
Classification - Decision Tree 112
Over-fitting in Classification
A tree generated may over-fit the training examples due to noise or too small
a set of training data
Two approaches to avoid over-fitting:
(Stop earlier): Stop growing the tree earlier
(Post-prune): Allow over-fit and then post-prune the tree
Approaches to determine the correct final tree size:
Separate training and testing sets or use cross-validation
Use all the data for training, but apply a statistical test (e.g., chi-square) to
estimate whether expanding or pruning a node may improve over entire
distribution
Use Minimum Description Length (MDL) principle: halting growth of the
tree when the encoding is minimized.
Rule post-pruning (C4.5): converting to rules before pruning
Classification - Decision Tree 113
The loan data (reproduced)
Approved or not
Classification - Decision Tree 114
A decision tree from the loan data
Decision nodes and leaf nodes (classes)
Classification - Decision Tree 115
Use the decision tree
No
Classification - Decision Tree 116
Is the decision tree unique?
No. Here is a simpler tree.
We want a smaller and accurate tree:
it is easier to understand and usually performs better.
Finding the best tree is
NP-hard.
All current tree building
algorithms are heuristic
algorithms
Classification - Decision Tree 117
From a decision tree to a set of rules
A decision tree can
be converted to a
set of rules
Each path from the
root to a leaf is a
rule.
Classification - Decision Tree 118
Algorithm for decision tree learning
Basic algorithm (a greedy divide-and-conquer algorithm)
Assume attributes are categorical now (continuous attributes
can be handled too)
Tree is constructed in a top-down recursive manner
At start, all the training examples are at the root
Examples are partitioned recursively based on selected
attributes
Attributes are selected on the basis of an impurity function
(e.g., information gain)
Conditions for stopping partitioning
All examples for a given node belong to the same class
There are no remaining attributes for further partitioning –
majority class is the leaf
There are no examples left
Classification - Decision Tree 119
Decision tree learning algorithm
Classification - Decision Tree 120
Choose an attribute to partition data
The key to building a decision tree - which
attribute to choose in order to branch.
The objective is to reduce impurity or
uncertainty in data as much as possible.
A subset of data is pure if all instances belong to
the same class.
The heuristic in C4.5 is to choose the attribute
with the maximum Information Gain or Gain
Ratio based on information theory.
Classification - Decision Tree 121
The loan data (reproduced)
Approved or not
Classification - Decision Tree 122
Two possible roots, which is better?
Fig. (B) seems to be better.
Classification - Decision Tree 123
An example
  entropy(D) = −(6/15)·log₂(6/15) − (9/15)·log₂(9/15) = 0.971

  entropy_Own_house(D) = (6/15)·entropy(D1) + (9/15)·entropy(D2)
                       = (6/15)·0 + (9/15)·0.918
                       = 0.551

  entropy_Age(D) = (5/15)·entropy(D1) + (5/15)·entropy(D2) + (5/15)·entropy(D3)
                 = (5/15)·0.971 + (5/15)·0.971 + (5/15)·0.722
                 = 0.888

  Age      Yes  No  entropy(Di)
  young     2    3    0.971
  middle    3    2    0.971
  old       4    1    0.722

Own_house is the best choice for the root.
Classification - Decision Tree 124
We build the final tree
We can use information gain ratio to evaluate the
impurity as well
Classification - Decision Tree 125
Handling continuous attributes
Handle continuous attribute by splitting into
two intervals (can be more) at each node.
How to find the best threshold to divide?
Use information gain or gain ratio again
Sort all the values of a continuous attribute in
increasing order: {v₁, v₂, …, vᵣ}.
One possible threshold lies between each two adjacent
values vᵢ and vᵢ₊₁. Try all possible thresholds and
find the one that maximizes the gain (or gain
ratio).
Classification - Decision Tree 126
An example in a continuous space
Classification - Decision Tree 127
Avoid overfitting in classification
Overfitting: A tree may overfit the training data
Good accuracy on training data but poor on test data
Symptoms: tree too deep and too many branches,
some may reflect anomalies due to noise or outliers
Two approaches to avoid overfitting
Pre-pruning: Halt tree construction early
Difficult to decide because we do not know what may
happen subsequently if we keep growing the tree.
Post-pruning: Remove branches or sub-trees from a
“fully grown” tree.
This method is commonly used. C4.5 uses a statistical
method to estimate the errors at each node for pruning.
A validation set may be used for pruning as well.
Classification - Decision Tree 128
An example: a tree likely to overfit the data
Classification - Decision Tree 129
Other issues in decision tree learning
From tree to rules, and rule pruning
Handling of missing values
Handling skewed distributions
Handling attributes and classes with different
costs.
Attribute construction (adding a new one)
etc.
Classification - Decision Tree 130
DT Example (1)

Name       Gender  Height   Output1  Output2
Kristina   F       1.6 m    Short    Medium
Jim        M       2 m      Tall     Medium
Maggie     F       1.9 m    Medium   Tall
Martha     F       1.88 m   Medium   Tall
Stephanie  F       1.7 m    Short    Medium
Bob        M       1.85 m   Medium   Medium
Kathy      F       1.6 m    Short    Medium
Dave       M       1.7 m    Short    Medium
Worth      M       2.2 m    Tall     Tall
Steven     M       2.1 m    Tall     Tall
Debbie     F       1.8 m    Medium   Medium
Todd       M       1.95 m   Medium   Medium
Kim        F       1.9 m    Medium   Tall
Amy        F       1.8 m    Medium   Medium
Wynette    F       1.75 m   Medium   Medium

Considering the data in the table and the correct classification in Output1, we have:
  Short (4/15), Medium (8/15), Tall (3/15)
  Entropy = 4/15 log(15/4) + 8/15 log(15/8) + 3/15 log(15/3) = 0.4384   (base-10 logarithms)

Choosing the gender as the splitting attribute we have:
  Entropy(F) = 3/9 log(9/3) + 6/9 log(9/6) = 0.2764
  Entropy(M) = 1/6 log(6/1) + 2/6 log(6/2) + 3/6 log(6/3) = 0.4392
Classification - Decision Tree 131
DT Example (2)
The algorithm must determine what the gain
in information is by using this split.
To do this, we calculate the weighted sum of
these last two entropies to get:
((9/15) × 0.2764) + ((6/15) × 0.4392) = 0.34152
The gain in entropy by using the gender
attribute is thus:
0.4384 – 0.34152 = 0.09688
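A quick Python check of these numbers; note that this example uses base-10 logarithms, which is how 0.4384, 0.2764 and 0.4392 arise (the small differences below come from the slide rounding intermediate values):

```python
import math

def entropy10(counts):
    """Entropy with base-10 logs from a list of class counts."""
    n = sum(counts)
    return sum(c / n * math.log10(n / c) for c in counts if c)

e_all = entropy10([4, 8, 3])            # Short / Medium / Tall over all 15 people
e_f   = entropy10([3, 6, 0])            # females: 3 Short, 6 Medium
e_m   = entropy10([1, 2, 3])            # males: 1 Short, 2 Medium, 3 Tall
weighted = (9/15) * e_f + (6/15) * e_m
print(round(e_all, 4), round(weighted, 5), round(e_all - weighted, 5))
# -> 0.4385 0.34156 0.09691  (the slide's 0.4384 / 0.34152 / 0.09688 use rounded intermediates)
```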
Classification - Decision Tree 132
DT Example (3)
(Data as in the table in DT Example (1).)

Looking at the height attribute, we divide it into ranges:
  (0, 1.6], (1.6, 1.7], (1.7, 1.8], (1.8, 1.9], (1.9, 2.0], (2.0, ∞)

Now we can compute the entropy of each range:
  2 in (0, 1.6]:    (2/2(0) + 0 + 0) = 0
  2 in (1.6, 1.7]:  (2/2(0) + 0 + 0) = 0
  3 in (1.7, 1.8]:  (0 + 3/3(0) + 0) = 0
  4 in (1.8, 1.9]:  (0 + 4/4(0) + 0) = 0
  2 in (1.9, 2.0]:  (0 + 1/2(0.301) + 1/2(0.301)) = 0.301
  2 in (2.0, ∞):    (0 + 0 + 2/2(0)) = 0
Classification - Decision Tree 133
DT Example (4)
All the states are completely ordered (entropy 0)
except for the (1.9, 2.0] state.
The gain in entropy by using the height attribute is:
0.4384 − (2/15)(0.301) = 0.3983
Thus, this has the greater gain, and we choose this
over gender as the first splitting attribute
Classification - Decision Tree 134
DT Example (5)

Initial split on Height:
  ≤ 1.6 m → Short
  (1.6, 1.7] → Short
  (1.7, 1.8] → Medium
  (1.8, 1.9] → Medium
  (1.9, 2.0] → ?  (this range is too large: a further subdivision on height is needed)
  > 2.0 m → Tall

Subdividing (1.9, 2.0]:
  Height ≤ 1.95 m → Medium
  Height > 1.95 m → Tall

We can optimize the tree:
  Height ≤ 1.7 m → Short
  1.7 m < Height ≤ 1.95 m → Medium
  Height > 1.95 m → Tall
Classification - Decision Tree 135
Quinlan’s ID3 and C4.5 decision tree
algorithms
Given dataset T
Attribute1 Attribute2 Attribute3 Class
A 70 True CLASS1
A 90 True CLASS2
A 85 False CLASS2
A 95 False CLASS2
A 70 False CLASS1
B 90 True CLASS1
B 78 False CLASS1
B 65 True CLASS1
B 75 False CLASS1
C 80 True CLASS2
C 70 True CLASS2
C 80 False CLASS1
C 80 False CLASS1
C 96 False CLASS1
Classification - Decision Tree 136
Quinlan’s ID3 and C4.5 decision tree
algorithms
Consider test on attribute 1
freq(class,value) CLASS1 CLASS2
A 2 3 5
B 4 0 4
C 3 2 5
9 5 14
Info(T) 0.4098 0.5305 0.9403
Info(S) CLASS1 CLASS2 Info(S) Weight
A 0.5288 0.4422 0.9710 0.3571
B 0.0000 0.0000 0.0000 0.2857
C 0.4422 0.5288 0.9710 0.3571
0.6935
Gain .9403 - .6935 = 0.2467
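The Info(T) figure and the gains for Attribute 1 (and, on the next slide, Attribute 3) can be recomputed from the table with the following sketch (our own code and column layout):

```python
import math
from collections import Counter

rows = [  # (attribute1, attribute3, class) from the 14-row dataset above
    ("A", True, 1), ("A", True, 2), ("A", False, 2), ("A", False, 2), ("A", False, 1),
    ("B", True, 1), ("B", False, 1), ("B", True, 1), ("B", False, 1),
    ("C", True, 2), ("C", True, 2), ("C", False, 1), ("C", False, 1), ("C", False, 1),
]

def info(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def gain(col):
    groups = {}
    for row in rows:
        groups.setdefault(row[col], []).append(row[2])
    split = sum(len(g) / len(rows) * info(g) for g in groups.values())
    return info([r[2] for r in rows]) - split

print(round(info([r[2] for r in rows]), 4))   # Info(T) -> 0.9403
print(round(gain(0), 4), round(gain(1), 4))   # Attribute 1 -> 0.2467, Attribute 3 -> 0.0481
```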
Classification - Decision Tree 137
Quinlan’s ID3 and C4.5 decision tree
algorithms
Consider test on attribute 3
freq(class,value) CLASS1 CLASS2
True 3 3 6
False 6 2 8
9 5 14
Info(T) 0.4098 0.5305 0.9403
Info(S) CLASS1 CLASS2 Info(S) Weight
True 0.5000 0.5000 1.0000 0.4286
False 0.3113 0.5000 0.8113 0.5714
0.8922
Gain .9403 - .8922 = 0.0481
Classification - Decision Tree 138
Quinlan’s ID3 and C4.5 decision tree algorithms
Summary
Gain(Attribute1) = 0.2467
Gain(Attribute3) = 0.0481
In ID3 Attribute 2 not considered
Since it is numeric
Thus split on Attribute 1 – highest gain
Classification - Decision Tree 139
C4.5 decision tree algorithms
What about numeric attribute 2 (as done by C4.5)?
Consider as categorical
Then gain = 0.4039 – should split on it
But – 9 branches, of which 6 with only one
instance
Tree too wide – not compact
Since it is really numerical – what to do with a
different value?
Use threshold Z, and split into two subsets
Y <= Z and Y > Z
More complex tests, assuming discrete values and
variable number of subsets
Classification - Decision Tree 140
C4.5 decision tree algorithms
C4.5 and continuous attribute
Sort values into v1,…,vm
Try Zi = vi or Zi = (vi + vi+1) / 2 for i=1,…,m-1
C4.5 uses Z = vi – more explainable decision rule
Select splitting value Z*
So that gain(Z*) = max {gain(Zi), i=1,…,m-1)}
For last example – Attribute 2 (see next slide)
Z* = 80
Gain = 0.1022
So even with this approach – would have split on Attribute
1
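A sketch of this threshold search for Attribute 2 in Python, using Z_i = v_i as the slide describes for C4.5 (values and classes copied from the 14-row dataset; expected best split Z* = 80 with gain about 0.102):

```python
import math
from collections import Counter

values  = [70, 90, 85, 95, 70, 90, 78, 65, 75, 80, 70, 80, 80, 96]
classes = [1, 2, 2, 2, 1, 1, 1, 1, 1, 2, 2, 1, 1, 1]

def info(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

base = info(classes)
best = None
for z in sorted(set(values))[:-1]:                      # candidate thresholds Z_i = v_i
    left  = [c for v, c in zip(values, classes) if v <= z]
    right = [c for v, c in zip(values, classes) if v > z]
    g = base - (len(left) * info(left) + len(right) * info(right)) / len(values)
    if best is None or g > best[1]:
        best = (z, g)

print(best)   # -> (80, 0.1022...): split Attribute 2 at Z* = 80
```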
Classification - Decision Tree 141
C4.5 decision tree algorithms
Attribute 2 (for each Zi, the first row is the subset with Att2 ≤ Zi, the second the subset with Att2 > Zi)
Zi   count   freq CLASS1   freq CLASS2   Info terms (CLASS1, CLASS2)   Info(S)   Weight   Info(Tx)   Gain
65 1 1 0 0.0000 0.0000 0.0000 0.0714
13 8 5 0.4310 0.5302 0.9612 0.9286 0.8926 0.0477
70 4 3 1 0.3113 0.5000 0.8113 0.2857
10 6 4 0.4422 0.5288 0.9710 0.7143 0.9253 0.0150
75 5 4 1 0.2575 0.4644 0.7219 0.3571
9 5 4 0.4711 0.5200 0.9911 0.6429 0.8950 0.0453
78 6 5 1 0.2192 0.4308 0.6500 0.4286
8 4 4 0.5000 0.5000 1.0000 0.5714 0.8500 0.0903
80 9 7 2 0.2820 0.4822 0.7642 0.6429
5 2 3 0.5288 0.4422 0.9710 0.3571 0.8380 0.1022
85 10 7 3 0.3602 0.5211 0.8813 0.7143
4 2 2 0.5000 0.5000 1.0000 0.2857 0.9152 0.0251
90 12 8 4 0.3900 0.5283 0.9183 0.8571
2 1 1 0.5000 0.5000 1.0000 0.1429 0.9300 0.0103
95 13 8 5 0.4310 0.5302 0.9612 0.9286
1 1 0 0.0000 0.0000 0.0000 0.0714 0.8926 0.0477
Classification - Decision Tree 142
C4.5 decision tree algorithms
All same class – so T2 is a leaf
Classification - Decision Tree 143
C4.5 decision tree algorithms
Consider test on attribute 3 FOR SUBSET T1
freq(class,value) CLASS1 CLASS2
True 1 1 2
False 1 2 3
2 3 5
Info(T) 0.5288 0.4422 0.9710 Book has 0.940
Info(S) CLASS1 CLASS2 Info(S) Weight
True 0.5000 0.5000 1.0000 0.4000
False 0.5283 0.3900 0.9183 0.6000
0.9510
Gain .971 - .951 = 0.0200
Attribute 2 (within subset T1; for each Zi, first row: Att2 ≤ Zi, second row: Att2 > Zi)
Zi   count   freq CLASS1   freq CLASS2   Info terms (CLASS1, CLASS2)   Info(S)   Weight   Info(Tx)   Gain
70 2 2 0 0.0000 0.0000 0.0000 0.4000
3 0 3 0.0000 0.0000 0.0000 0.6000 0.0000 0.9710
85 3 2 1 0.3900 0.5283 0.9183 0.6000
2 0 2 0.0000 0.0000 0.0000 0.4000 0.5510 0.4200
90 4 2 2 0.5000 0.5000 1.0000 0.8000
1 0 1 0.0000 0.0000 0.0000 0.2000 0.8000 0.1710
Max gain on Attribute 2 - split on Z* = 70
Classification - Decision Tree 144
C4.5 decision tree algorithms
Consider test on attribute 3 FOR SUBSET T3
freq(class,value) CLASS1 CLASS2
True 0 2 2
False 3 0 3
3 2 5
Info(T) 0.4422 0.5288 0.9710 Book has 0.940
Info(S) CLASS1 CLASS2 Info(S) Weight
True 0.0000 0.0000 0.0000 0.4000
False 0.0000 0.0000 0.0000 0.6000
0.0000
Gain .971 - 0.000 = 0.9710
Attribute 2 (within subset T3; for each Zi, first row: Att2 ≤ Zi, second row: Att2 > Zi)
Zi   count   freq CLASS1   freq CLASS2   Info terms (CLASS1, CLASS2)   Info(S)   Weight   Info(Tx)   Gain
70 1 0 1 0.0000 0.0000 0.0000 0.2000
4 3 1 0.3113 0.5000 0.8113 0.8000 0.6490 0.3219
80 4 2 2 0.5000 0.5000 1.0000 0.8000
1 1 0 0.0000 0.0000 0.0000 0.2000 0.8000 0.1710
Max gain on Attribute 3
Classification - Decision Tree 145
C4.5 decision tree algorithms
Classification - Decision Tree 146
C4.5 decision tree algorithms
Classification - Decision Tree 147
C4.5 decision tree algorithms
We used entropy of T after splitting into T1,…,Tn
  Info(T_j) = −Σ_{i=1}^{k} (freq(C_i, T_j) / |T_j|) · log₂(freq(C_i, T_j) / |T_j|)
  Info_x(T) = Σ_{j=1}^{n} (|T_j| / |T|) · Info(T_j)
  Gain(X) = Info(T) − Info_x(T)
This is biased in favor of tests X with many outcomes:
a split on ID will generate one subset for each unique value, i.e.,
one for every instance, with each subset containing 1 instance.
It has maximal gain, as Info_x(T) = 0.
But result is a one level tree with one branch for each instance
Thus divide by number of branches – to measure average gain
Classification - Decision Tree 148
C4.5 decision tree algorithms
So define
  Split-info(X) = −Σ_{j=1}^{n} (|T_j| / |T|) · log₂(|T_j| / |T|)
the potential information generated by splitting T into T_1, …, T_n
(similar to the definition of Info(T_j)).
Use the entropy of T after splitting into T_1, …, T_n as before:
  Info(T_j) = −Σ_{i=1}^{k} (freq(C_i, T_j) / |T_j|) · log₂(freq(C_i, T_j) / |T_j|)
  Info_x(T) = Σ_{j=1}^{n} (|T_j| / |T|) · Info(T_j)
  Gain(X) = Info(T) − Info_x(T)
Selection criterion:
  Gain-ratio(X) = Gain(X) / Split-info(X)
the proportion of information generated by a “useful” compact split.
Select X* so that Gain-ratio(X*) = max over attributes X of {Gain-ratio(X)}.
Classification - Decision Tree 149
C4.5 decision tree algorithms
Splitting the root
Attribute1 Attribute2 Attribute3
Gain(X) 0.2467 0.1022 0.0481
|T1| 5 9 6
|T2| 4 5 8
|T3| 5
|T| 14 14 14
|T1|/|T|*log(|T1|/|T|) 0.5305 0.4098 0.5239
|T2|/|T|*log(|T2|/|T|) 0.5164 0.5305 0.4613
|T3|/|T|*log(|T3|/|T|) 0.5305
Split-info(X) 1.5774 0.9403 0.9852
Gain-ratio(X) 0.1564 0.1087 0.0488
Still split on Attribute 1
Classification - Decision Tree 150
C4.5 decision tree algorithms
Missing data
Unknown
Not recorded
Data entry error
What to do with missing data?
Eliminate instances with missing data
Only useful when there are few
Replace missing data with some values
Fixed values, mean, mode, from distribution
Modify algorithm to work with missing data
Classification - Decision Tree 151
C4.5 decision tree algorithms
Issues with modified algorithm
How compare subsets with different number of
unknown values
With what class to associate instances with
unknown values
C4.5 replaces unknown values
Based on the distribution (=relative frequency) of
known values
Classification - Decision Tree 152
C4.5 decision tree algorithms
For Split-info(X):
  Add one subset for the missing values; that is, if there are n known values, use T_{n+1} for the missing values.
For Info(T) and Info_x(T) for a certain attribute:
  Use only known values.
  Compute F = (number of instances with a known value) / (total number of instances in the data set)
  Use Gain(X) = F · [Info(T) − Info_x(T)]
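A small sketch of this adjusted gain for Attribute 1 of the dataset on the next slide, where one Attribute 1 value is missing (our own code; the expected result, 0.1990, appears two slides later):

```python
import math
from collections import Counter

known = [  # (attribute1, class) for the 13 rows where Attribute 1 is known
    ("A", 1), ("A", 2), ("A", 2), ("A", 2), ("A", 1),
    ("B", 1), ("B", 1), ("B", 1),
    ("C", 2), ("C", 2), ("C", 1), ("C", 1), ("C", 1),
]
total_rows = 14

def info(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

labels = [c for _, c in known]
groups = {}
for value, cls in known:
    groups.setdefault(value, []).append(cls)

info_x = sum(len(g) / len(known) * info(g) for g in groups.values())
F = len(known) / total_rows                      # F = 13/14
print(round(F * (info(labels) - info_x), 3))     # -> 0.199, the slide reports 0.1990
```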
Classification - Decision Tree 153
C4.5 decision tree algorithms
Given dataset T
Attribute1 Attribute2 Attribute3 Class
A 70 True CLASS1
A 90 True CLASS2
A 85 False CLASS2
A 95 False CLASS2
A 70 False CLASS1
????? 90 True CLASS1
B 78 False CLASS1
B 65 True CLASS1
B 75 False CLASS1
C 80 True CLASS2
C 70 True CLASS2
C 80 False CLASS1
C 80 False CLASS1
C 96 False CLASS1
Classification - Decision Tree 154
C4.5 decision tree algorithms
Given dataset T (Attribute 1 missing in one row), consider the test on Attribute 1.

freq(class, value)   CLASS1   CLASS2
A                       2        3       5
B                       3        0       3
C                       3        2       5
                        8        5      13

Factor F = 13 / 14 = 0.9286
Info(T) = 0.4310 + 0.5302 = 0.9612

Info(S)   CLASS1   CLASS2   Info(S)   Weight
A         0.5288   0.4422   0.9710    0.3846
B         0.0000   0.0000   0.0000    0.2308
C         0.4422   0.5288   0.9710    0.3846
                             Info_x(T) = 0.7469

Original gain equation: 0.9612 - 0.7469 = 0.2144
New gain equation: F × original gain = 0.1990

The weight is calculated as n / (N − m), where n = number of instances with that
attribute value, N = total number of tuples, and m = number of tuples with a missing value for the attribute.
Classification - Decision Tree 155
C4.5 decision tree algorithms
Splitting the root
Attribute1 Attribute2 Attribute3
Gain(X) 0.1990 0.0587 -0.0205
|T1| 5 9 6
|T2| 3 5 8
|T3| 5
????? 1
|T| 13 14 14
|T1|/|T|*log(|T1|/|T|) 0.5302 0.4098 0.5239
|T2|/|T|*log(|T2|/|T|) 0.4882 0.5305 0.4613
|T3|/|T|*log(|T3|/|T|) 0.5302
Unknown 0.2846
Split-info(X) 1.8332 0.9403 0.9852
Gain-ratio(X) 0.1086 0.0625 -0.0208
Still split on Attribute 1
Classification - Decision Tree 156
C4.5 decision tree algorithms
At this point, with unknown values
Test attributes selected for each node
Subsets defined for instances with known values
But what to do with the unknown?
C4.5 assigns it to ALL the subsets T1,…,Tn
With probability (or weight)
P(Ti) = w = |Ti known values| / |T known values|
Classification - Decision Tree 157
C4.5 decision tree algorithms
Classification - Decision Tree 158
C4.5 decision tree algorithms
Given dataset T
Attribute1 Attribute2 Attribute3 Class
A 70 True CLASS1
A 90 True CLASS2
A 85 False CLASS2
A 95 False CLASS2
A 70 False CLASS1
????? 90 True CLASS1
B 78 False CLASS1
B 65 True CLASS1
B 75 False CLASS1
C 80 True CLASS2
C 70 True CLASS2
C 80 False CLASS1
C 80 False CLASS1
C 96 False CLASS1
Classification - Decision Tree 159
C4.5 decision tree algorithms
Classification = CLASS2 (3.4 / 0.4) means:
  3.4 = |updated T_i| = 3 + 5/13 = 3.3846
  0.4 = number of instances in T_i without a (known) value
Thus 3 / 3.3846 = 88.64% belong to CLASS2.
The balance (~12%) is the error rate: it belongs to other classes, in this case CLASS1.
Classification - Decision Tree 160
C4.5 decision tree algorithms
Prediction
Same approach – with probabilities – is used
If values of attributes known – class is well
defined
Else all paths from the root explored
Probability of each class is determined for all classes
Which is a sum of probabilities along paths
Class with highest probability is selected
Classification - Decision Tree 161