Example_Classification

The document walks through decision tree induction on a training dataset called Buys_computer, building a tree from the attributes age, income, student, and credit_rating. It explains information gain for attribute selection and the recursive tree-building procedure, shows how to compute the Gini index for binary, categorical, and continuous attributes to evaluate candidate splits, and closes with a worked naïve Bayes classification example and an ROC curve calculation.

Uploaded by

khlykun0209
Copyright
© © All Rights Reserved
Available Formats
Download as DOC, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
15 views

Example_Classification

The document discusses decision tree induction using a training dataset called Buys_computer, illustrating the process of creating a decision tree based on attributes like age, income, student status, and credit rating. It explains the concept of information gain for attribute selection and the recursive procedure for building the tree. Additionally, it covers the calculation of Gini index for binary and categorical attributes to evaluate the effectiveness of splits in the decision tree.

Uploaded by

khlykun0209
Copyright
© © All Rights Reserved
Available Formats
Download as DOC, PDF, TXT or read online on Scribd
You are on page 1/ 71

Decision Tree Induction: An Example

- Training data set: Buys_computer
- The data set follows an example of Quinlan's ID3 (Playing Tennis)

  age    income  student  credit_rating  buys_computer
  <=30   high    no       fair           no
  <=30   high    no       excellent      no
  31…40  high    no       fair           yes
  >40    medium  no       fair           yes
  >40    low     yes      fair           yes
  >40    low     yes      excellent      no
  31…40  low     yes      excellent      yes
  <=30   medium  no       fair           no
  <=30   low     yes      fair           yes
  >40    medium  yes      fair           yes
  <=30   medium  yes      excellent      yes
  31…40  medium  no       excellent      yes
  31…40  high    yes      fair           yes
  >40    medium  no       excellent      no

- Resulting tree:

  age?
  ├─ <=30:  student?
  │           ├─ no:  no
  │           └─ yes: yes
  ├─ 31…40: yes
  └─ >40:   credit_rating?
              ├─ excellent: no
              └─ fair:      yes

4
Attribute Selection: Information Gain

- Class P: buys_computer = "yes" (9 tuples)
- Class N: buys_computer = "no" (5 tuples)

  Info(D) = I(9,5) = -(9/14) log2(9/14) - (5/14) log2(5/14) = 0.940

(Buys_computer training data as above.) Look at "age":

  age     p_i  n_i  I(p_i, n_i)
  <=30     2    3   0.971
  31…40    4    0   0
  >40      3    2   0.971

  Info_age(D) = (5/14) I(2,3) + (4/14) I(4,0) + (5/14) I(3,2) = 0.694

10
Attribute Selection: Information Gain

- Class P: buys_computer = "yes"
- Class N: buys_computer = "no"

  Info(D) = I(9,5) = -(9/14) log2(9/14) - (5/14) log2(5/14) = 0.940

  Info_age(D) = (5/14) I(2,3) + (4/14) I(4,0) + (5/14) I(3,2) = 0.694

  Gain(age) = Info(D) - Info_age(D) = 0.246

Similarly,

  Gain(income) = 0.029
  Gain(student) = 0.151
  Gain(credit_rating) = 0.048

11
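These numbers can be reproduced with a short Python sketch (not part of the original slides; the DATA list and helper names below are mine):

```python
import math
from collections import Counter

# Buys_computer training data: (age, income, student, credit_rating, buys_computer)
DATA = [
    ("<=30", "high", "no", "fair", "no"),
    ("<=30", "high", "no", "excellent", "no"),
    ("31…40", "high", "no", "fair", "yes"),
    (">40", "medium", "no", "fair", "yes"),
    (">40", "low", "yes", "fair", "yes"),
    (">40", "low", "yes", "excellent", "no"),
    ("31…40", "low", "yes", "excellent", "yes"),
    ("<=30", "medium", "no", "fair", "no"),
    ("<=30", "low", "yes", "fair", "yes"),
    (">40", "medium", "yes", "fair", "yes"),
    ("<=30", "medium", "yes", "excellent", "yes"),
    ("31…40", "medium", "no", "excellent", "yes"),
    ("31…40", "high", "yes", "fair", "yes"),
    (">40", "medium", "no", "excellent", "no"),
]
ATTRS = ["age", "income", "student", "credit_rating"]

def entropy(labels):
    """I(p, n): expected information needed to classify a tuple."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values() if c)

def info_gain(rows, attr_index):
    """Gain(A) = Info(D) - Info_A(D), with Info_A(D) = sum_v |D_v|/|D| * Info(D_v)."""
    info_d = entropy([r[-1] for r in rows])
    info_a = 0.0
    for value in set(r[attr_index] for r in rows):
        subset = [r[-1] for r in rows if r[attr_index] == value]
        info_a += len(subset) / len(rows) * entropy(subset)
    return info_d - info_a

for i, name in enumerate(ATTRS):
    print(f"Gain({name}) = {info_gain(DATA, i):.3f}")
# Expected: age 0.246, income 0.029, student 0.151, credit_rating 0.048
```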
Recursive Procedure

1. After selecting age at the root node, we create three child nodes.
2. One child node is associated with the data tuples highlighted in red on the slide (the tuples that fall into that branch).
3. How to continue for this child node? Make D = {red data tuples} and then select the best attribute to further split D.

A recursive procedure.

12
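A rough sketch of this recursive procedure, reusing DATA, ATTRS, and info_gain from the previous snippet (the stopping rules shown are common ID3-style choices, assumed rather than taken from the slides):

```python
def build_tree(rows, attr_indices):
    labels = [r[-1] for r in rows]
    # Stop when the partition is pure or no attributes remain; otherwise majority vote.
    if len(set(labels)) == 1 or not attr_indices:
        return max(set(labels), key=labels.count)
    # Select the attribute with the highest information gain for this partition D.
    best = max(attr_indices, key=lambda i: info_gain(rows, i))
    remaining = [i for i in attr_indices if i != best]
    tree = {ATTRS[best]: {}}
    for value in set(r[best] for r in rows):
        subset = [r for r in rows if r[best] == value]   # D restricted to this branch
        tree[ATTRS[best]][value] = build_tree(subset, remaining)
    return tree

print(build_tree(DATA, list(range(len(ATTRS)))))
# The root is "age"; the 31…40 branch is already pure ("yes").
```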
Binary Attributes: Computing Gini Index

- Splits into two partitions.
- Effect of weighing partitions: larger and purer partitions are sought.

  gini(D) = 1 - sum_{j=1..n} (p_j)^2

  Parent: C1 = 6, C2 = 6

Split on B? (Yes -> Node N1, No -> Node N2). Gini = ?

22
Binary Attributes: Computing Gini Index

- Splits into two partitions.
- Effect of weighing partitions: prefer larger and purer partitions.

  gini(D) = 1 - sum_{j=1..n} (p_j)^2

  Parent: C1 = 6, C2 = 6, Gini = 0.500

Split on B? (Yes -> Node N1, No -> Node N2):

        N1   N2
  C1     5    1
  C2     2    4

  Gini(N1) = 1 - (5/7)^2 - (2/7)^2 = 0.408
  Gini(N2) = 1 - (1/5)^2 - (4/5)^2 = 0.320

  Gini(Children) = (7/12) × 0.408 + (5/12) × 0.320 = 0.371
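A short Python check of these numbers (the helper names are mine, not from the slides):

```python
def gini(counts):
    """gini(D) = 1 - sum_j p_j^2, computed from the class counts of one partition."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts) if total else 0.0

def gini_split(partitions):
    """Weighted Gini of the children: sum_k |D_k|/|D| * gini(D_k)."""
    n = sum(sum(p) for p in partitions)
    return sum(sum(p) / n * gini(p) for p in partitions)

print(gini([6, 6]))                    # parent: 0.500
print(gini([5, 2]), gini([1, 4]))      # N1: 0.408..., N2: 0.320
print(gini_split([[5, 2], [1, 4]]))    # weighted children Gini: 0.371...
```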
Categorical Attributes: Computing Gini Index

- For each distinct value, gather the counts for each class in the dataset.
- Use the count matrix to make decisions.

Multi-way split:

  CarType   Family  Sports  Luxury
  C1           1       2       1
  C2           4       1       1
  Gini 0.393

Two-way split (find the best partition of values):

  CarType   {Sports, Luxury}  {Family}
  C1                3             1
  C2                2             4
  Gini 0.400

  CarType   {Sports}  {Family, Luxury}
  C1            2             2
  C2            1             5
  Gini 0.419

25
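Using gini_split from the previous snippet, the three candidate splits on CarType can be compared directly (the count matrices below are copied from the slide; each inner list is [C1, C2] for one partition):

```python
splits = {
    "multi-way (Family | Sports | Luxury)": [[1, 4], [2, 1], [1, 1]],
    "{Sports, Luxury} | {Family}":          [[3, 2], [1, 4]],
    "{Sports} | {Family, Luxury}":          [[2, 1], [2, 5]],
}
for name, partitions in splits.items():
    print(f"{name}: Gini = {gini_split(partitions):.3f}")
# Expected: 0.393, 0.400, 0.419; the multi-way split gives the lowest Gini here.
```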
Continuous Attributes: Computing Gini Index or Information Gain

- To discretize the attribute values:
  - Use binary decisions based on one splitting value.
- Several choices for the splitting value:
  - Number of possible splitting values = number of distinct values - 1.
  - Typically, the midpoint between each pair of adjacent values is considered as a possible split point: (a_i + a_{i+1})/2 is the midpoint between the values a_i and a_{i+1}.
- Each splitting value has a count matrix associated with it:
  - Class counts in each of the partitions, A < v and A >= v.
- Simple method to choose the best v:
  - For each v, scan the database to gather the count matrix and compute its Gini index.
  - Computationally inefficient! Repetition of work.

  Tid  Refund  Marital Status  Taxable Income  Cheat
  1    Yes     Single          125K            No
  2    No      Married         100K            No
  3    No      Single          70K             No
  4    Yes     Married         120K            No
  5    No      Divorced        95K             Yes
  6    No      Married         60K             No
  7    Yes     Divorced        220K            No
  8    No      Single          85K             Yes
  9    No      Married         75K             No
  10   No      Single          90K             Yes

Example split: Taxable Income > 80K? (Yes / No)

26
Continuous Attributes: Computing Gini Index or Expected Information Requirement

First decide the splitting value to discretize the attribute.
- For efficient computation, for each attribute:
  - Step 1: Sort the attribute on its values.

Sorted by Taxable Income (the Cheat labels follow the sorted order):

  Taxable Income   60  70  75  85   90   95   100  120  125  220
  Cheat            No  No  No  Yes  Yes  Yes  No   No   No   No

Possible splitting values (using midpoints): 55  65  72  80  87  92  97  110  122  172  230

Count matrix and Gini index for each candidate splitting value v:

  v     Yes<=v  Yes>v  No<=v  No>v   Gini
  55      0       3      0      7    0.420
  65      0       3      1      6    0.400
  72      0       3      2      5    0.375
  80      0       3      3      4    0.343
  87      1       2      3      4    0.417
  92      2       1      3      4    0.400
  97      3       0      3      4    0.300
  110     3       0      4      3    0.343
  122     3       0      5      2    0.375
  172     3       0      6      1    0.400
  230     3       0      7      0    0.420

27
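A sketch of this scan in Python, reusing gini_split from the binary-split snippet (the candidate splitting values are copied from the slide, variable names are mine, and for clarity the count matrix is recomputed per candidate rather than updated incrementally):

```python
incomes = [125, 100, 70, 120, 95, 60, 220, 85, 75, 90]          # Tid 1..10
cheat   = ["No", "No", "No", "No", "Yes", "No", "No", "Yes", "No", "Yes"]

pairs = sorted(zip(incomes, cheat))                              # Step 1: sort on values
candidates = [55, 65, 72, 80, 87, 92, 97, 110, 122, 172, 230]    # splitting values from the slide

def yes_no_counts(labels):
    """Count matrix column for one partition: [Yes count, No count]."""
    return [labels.count("Yes"), labels.count("No")]

best_v, best_g = None, float("inf")
for v in candidates:                                             # Step 2: scan the candidates
    left  = [c for inc, c in pairs if inc <= v]                  # labels with Taxable Income <= v
    right = [c for inc, c in pairs if inc > v]                   # labels with Taxable Income >  v
    g = gini_split([yes_no_counts(left), yes_no_counts(right)])
    print(f"v = {v}: Gini = {g:.3f}")
    if g < best_g:
        best_v, best_g = v, g

print(f"Least Gini at Taxable Income <= {best_v} (Gini = {best_g:.3f})")   # expect v = 97, 0.300
```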
Continuous Attributes: Computing Gini Index or Expected Information Requirement

First decide the splitting value to discretize the attribute.
- For efficient computation, for each attribute:
  - Step 1: Sort the attribute on its values.
  - Step 2: Linearly scan these values, each time updating the count matrix.

(Sorted values, possible splitting values, count matrices, and Gini values as in the table above.)

For each splitting value v (the slides step through v = 65, 72, 80, and 172), get its count matrix: how many data tuples have (a) Taxable Income <= v with class label "Yes", (b) Taxable Income <= v with class label "No", (c) Taxable Income > v with class label "Yes", and (d) Taxable Income > v with class label "No".

28-31
Continuous Attributes: Computing Gini Index or Expected Information Requirement

First decide the splitting value to discretize the attribute.
- For efficient computation, for each attribute:
  - Step 1: Sort the attribute on its values.
  - Step 2: Linearly scan these values, each time updating the count matrix.
  - Step 3: Compute the Gini index and choose the split position that has the least Gini index.

For each splitting value v (e.g., 65 or 72), compute its Gini index:

  gini_{Taxable Income}(D) = (|D1|/|D|) gini(D1) + (|D2|/|D|) gini(D2)

where D1 and D2 are the two partitions based on v: D1 has Taxable Income <= v and D2 has Taxable Income > v.

32-33
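As a quick numerical check of this formula, here is the count matrix at v = 97 from the table above, fed through gini_split from the earlier snippet:

```python
# D1 (Taxable Income <= 97): 3 Yes, 3 No; D2 (Taxable Income > 97): 0 Yes, 4 No.
print(gini_split([[3, 3], [0, 4]]))   # (6/10) * 0.5 + (4/10) * 0.0 = 0.300
```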
Continuous Attributes: Computing Gini Index or Expected Information Requirement

Steps 1-3 as above. (Count matrices and Gini values as in the table above; the least Gini index, 0.300, occurs at v = 97.)

Choose this splitting value (= 97) with the least Gini index to discretize Taxable Income.

34
Continuous Attributes: Computing Gini Index or Expected Information Requirement

Steps 1 and 2 as above. Step 3: Compute the expected information requirement and choose the split position that has the least value.

If Information Gain is used for attribute selection, then, similarly to calculating the Gini index, compute for each splitting value:

  Info_{Taxable Income}(D) = sum_{j=1..2} (|Dj|/|D|) × Info(Dj)

35
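A small sketch of this expected-information computation, reusing entropy from the information-gain snippet and pairs from the Taxable Income scan (the helper name info_split is mine):

```python
def info_split(pairs, v):
    """Info_A(D) for the binary split Taxable Income <= v versus > v."""
    left  = [c for inc, c in pairs if inc <= v]
    right = [c for inc, c in pairs if inc > v]
    n = len(pairs)
    return len(left) / n * entropy(left) + len(right) / n * entropy(right)

print(info_split(pairs, 97))   # 0.6 bits; the lowest among the candidate splits here
```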
Continuous Attributes: Computing Gini Index or Expected Information Requirement

Steps 1-3 as above.

Choose the splitting value (= 97 here) with the least Gini index or expected information requirement to discretize Taxable Income.

36
Continuous Attributes: Computing Gini Index or Expected Information Requirement

Steps 1-3 as above. For each attribute, the data tuples only need to be scanned once.

At each level of the decision tree, for attribute selection: (1) first, discretize a continuous attribute by deciding its splitting value; (2) then, compare the discretized attribute with the other attributes in terms of Gini index reduction or Information Gain.

37-38
Naïve Bayes Classifier: Training Dataset

Classes:
  C1: buys_computer = 'yes'
  C2: buys_computer = 'no'

Data to be classified:
  X = (age <= 30, income = medium, student = yes, credit_rating = fair)

(Buys_computer training data as in the decision-tree example above.)

118
Naïve Bayes Classifier: An Example

- Prior probability P(Ci):
  P(buys_computer = "yes") = 9/14 = 0.643
  P(buys_computer = "no") = 5/14 = 0.357

119
Naïve Bayes Classifier: An Example

- P(Ci): P(buys_computer = "yes") = 9/14 = 0.643; P(buys_computer = "no") = 5/14 = 0.357
- Compute P(X|Ci) for each class, where X = (age <= 30, income = medium, student = yes, credit_rating = fair).

According to the naïve (conditional independence) assumption, first get:

  P(age = "<=30" | buys_computer = "yes") = 2/9 = 0.222

120
Naïve Bayes Classifier: An Example

- P(Ci): P(buys_computer = "yes") = 9/14 = 0.643; P(buys_computer = "no") = 5/14 = 0.357
- Compute P(X|Ci) for each class. The conditional probabilities from the training data are:

  P(age = "<=30" | buys_computer = "yes") = 2/9 = 0.222
  P(age = "<=30" | buys_computer = "no") = 3/5 = 0.6
  P(income = "medium" | buys_computer = "yes") = 4/9 = 0.444
  P(income = "medium" | buys_computer = "no") = 2/5 = 0.4
  P(student = "yes" | buys_computer = "yes") = 6/9 = 0.667
  P(student = "yes" | buys_computer = "no") = 1/5 = 0.2
  P(credit_rating = "fair" | buys_computer = "yes") = 6/9 = 0.667
  P(credit_rating = "fair" | buys_computer = "no") = 2/5 = 0.4

121
Naïve Bayes Classifier: An Example

For X = (age <= 30, income = medium, student = yes, credit_rating = fair), using the conditional probabilities above:

  P(X | buys_computer = "yes")
    = P(age = "<=30" | yes) × P(income = "medium" | yes) × P(student = "yes" | yes) × P(credit_rating = "fair" | yes)
    = 0.222 × 0.444 × 0.667 × 0.667 = 0.044

122
Naïve Bayes Classifier: An Example

  P(X | buys_computer = "yes") = 0.222 × 0.444 × 0.667 × 0.667 = 0.044
  P(X | buys_computer = "no") = 0.6 × 0.4 × 0.2 × 0.4 = 0.019

Take the prior probabilities into account:

  P(X | buys_computer = "yes") × P(buys_computer = "yes") = 0.044 × 0.643 = 0.028
  P(X | buys_computer = "no") × P(buys_computer = "no") = 0.019 × 0.357 = 0.007

123
Naïve Bayes Classifier: An Example

  P(X | buys_computer = "yes") × P(buys_computer = "yes") = 0.028
  P(X | buys_computer = "no") × P(buys_computer = "no") = 0.007

Since 0.028 > 0.007, X is assigned to the class buys_computer = "yes".

124
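The whole example can be reproduced with a small naïve Bayes sketch over the same DATA list used earlier (function names are mine; no smoothing is applied, matching the slides' raw relative frequencies):

```python
from collections import Counter, defaultdict

def nb_train(rows):
    priors = Counter(r[-1] for r in rows)            # class counts
    cond = defaultdict(Counter)                      # value counts per (attribute index, class)
    for r in rows:
        for i, v in enumerate(r[:-1]):
            cond[(i, r[-1])][v] += 1
    return priors, cond

def nb_score(x, cls, priors, cond, n):
    score = priors[cls] / n                          # P(Ci)
    for i, v in enumerate(x):
        score *= cond[(i, cls)][v] / priors[cls]     # P(x_i | Ci), naïve independence assumption
    return score

priors, cond = nb_train(DATA)
x = ("<=30", "medium", "yes", "fair")
for cls in ("yes", "no"):
    print(cls, round(nb_score(x, cls, priors, cond, len(DATA)), 3))
# Expect roughly 0.028 for "yes" and 0.007 for "no", so predict buys_computer = "yes".
```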
ROC Calculation

- Rank the test examples by prediction probability in descending order.
- Gradually decrease the classification threshold from 1.0 to 0.0 and calculate the true positive rate (TPR) and false positive rate (FPR) along the way.

  Input  Probability of prediction  Actual class
  x1     0.95                       Yes
  x2     0.85                       Yes
  x3     0.75                       No
  x4     0.65                       Yes
  x5     0.4                        No
  x6     0.3                        No

With threshold p >= 1.0, no example is predicted "Yes": TPR = 0.0, FPR = 0.0.

83
ROC Calculation (continued)

Lowering the threshold step by step over the ranked examples gives the following true positive and false positive rates (slides 84-89 step through these thresholds one at a time):

  Threshold    Predicted "Yes"     TPR     FPR
  p >= 0.9     x1                  0.334   0.0
  p >= 0.8     x1, x2              0.666   0.0
  p >= 0.7     x1, x2, x3          0.666   0.334
  p >= 0.5     x1, x2, x3, x4      1.0     0.334
  p >= 0.4     x1, ..., x5         1.0     0.666
  p >= 0.3     x1, ..., x6         1.0     1.0

84-89
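A short Python sketch (not from the slides) that reproduces these TPR/FPR pairs by sweeping the threshold over the ranked predictions:

```python
probs  = [0.95, 0.85, 0.75, 0.65, 0.4, 0.3]           # x1 .. x6, already in descending order
actual = ["Yes", "Yes", "No", "Yes", "No", "No"]

P = actual.count("Yes")                                # total actual positives
N = actual.count("No")                                 # total actual negatives

for t in [1.0, 0.9, 0.8, 0.7, 0.5, 0.4, 0.3]:          # thresholds used on the slides
    predicted_yes = [a for p, a in zip(probs, actual) if p >= t]
    tp = predicted_yes.count("Yes")                    # actual Yes predicted as Yes
    fp = predicted_yes.count("No")                     # actual No predicted as Yes
    print(f"threshold {t:>4}: TPR = {tp / P:.3f}, FPR = {fp / N:.3f}")
# Plotting the (FPR, TPR) pairs traces out the ROC curve.
```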
