
Machine Learning: Lecture 13

Decision Trees

Prof. Viswanath Gopalakrishnan


IIIT-Bangalore
[Recap slides: "PCA Recap: Covariance Matrix" and "Topics Covered So Far" — figures not recovered in this extract.]
Decision Trees
Used for Classification & Regression

Example Problem: Predict whether a customer will buy a laptop or not.

Training Data: [table not recovered in this extract]

- Decision trees make no assumptions about relationships between features.
- They can handle multi-collinearity among features.
- They can handle high-dimensional feature data.


Decision Trees Classifier

[Figure: a trained decision tree — the root node tests feature 𝑋1 with Y/N branches, an internal node tests feature 𝑋0, and each Y/N path ends in a leaf.]


Source: https://www.quora.com/What-is-the-interpretation-and-intuitive-explanation-of-Gini-impurity-in-decision-trees
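As an illustrative sketch (not from the lecture), a small tree like the one pictured can be trained with scikit-learn; the feature matrix and labels below are invented for illustration:

    from sklearn.tree import DecisionTreeClassifier, export_text
    import numpy as np

    # Invented 2-feature toy data (columns X0, X1) with binary labels.
    X = np.array([[1.0, 5.0], [2.0, 1.0], [3.0, 4.0],
                  [6.0, 2.0], [7.0, 6.0]])
    y = np.array([0, 1, 0, 1, 0])

    # Fit a shallow tree and print its learned threshold tests.
    clf = DecisionTreeClassifier(max_depth=2).fit(X, y)
    print(export_text(clf, feature_names=["X0", "X1"]))
    print(clf.predict([[2.5, 3.0]]))  # route a new point to a leaf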

Gini Index

Data with labels:

Index   Category   Feature
0       C          2.85
1       B          2.03
2       A          7.82
3       C          2.7
4       B          1.5
…       …          …

Objective: Find a split on the X-axis that maximizes the 'homogeneity' of the subsets after the split.

Gini Index

Case 0: C = 2 (two classes)

Let's try a measure for impurity. For C classes with class proportions p_i, the Gini impurity is

G = 1 − Σ_i p_i²

For C = 2, with class probabilities p and 1 − p, this becomes G = 1 − p² − (1 − p)² = 2p(1 − p).

Maximum 'impurity' occurs at equal probability for both classes: p = 0.5 gives G = 0.5, while a pure node (p = 0 or p = 1) gives G = 0.
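A quick numeric check of this claim, as a sketch (the helper gini2 below is invented, not from the slides):

    # Two-class Gini impurity: G(p) = 1 - p^2 - (1 - p)^2 = 2p(1 - p).
    def gini2(p):
        return 2 * p * (1 - p)

    for p in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(p, gini2(p))  # peaks at p = 0.5 (G = 0.5); 0 for pure nodes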

Gini Index

[Figure: Gini impurity as a function of class probability — not recovered in this extract; see the Quora link above.]

Finding Optimal Point using Gini Index
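The slide's figure was not recovered; below is a minimal sketch of the split search (assumed details: midpoint thresholds, and the toy data from the table above):

    import numpy as np

    def gini(labels):
        # Gini impurity 1 - sum_i p_i^2 over the class proportions.
        _, counts = np.unique(labels, return_counts=True)
        p = counts / counts.sum()
        return 1.0 - np.sum(p ** 2)

    def best_split(x, y):
        # Try midpoints between consecutive sorted feature values and
        # keep the threshold with the lowest weighted Gini impurity.
        order = np.argsort(x)
        xs, ys = x[order], y[order]
        best_t, best_g = None, np.inf
        for i in range(1, len(xs)):
            if xs[i] == xs[i - 1]:
                continue
            t = (xs[i] + xs[i - 1]) / 2
            g = (i * gini(ys[:i]) + (len(ys) - i) * gini(ys[i:])) / len(ys)
            if g < best_g:
                best_t, best_g = t, g
        return best_t, best_g

    x = np.array([2.85, 2.03, 7.82, 2.7, 1.5])
    y = np.array(["C", "B", "A", "C", "B"])
    print(best_split(x, y))  # -> (2.365, ~0.267) on this toy data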


Entropy

An alternative impurity measure is the entropy over the class proportions p_i:

H = − Σ_i p_i log₂ p_i

Gini and entropy give similar results; entropy is more computationally expensive because it requires evaluating logarithms.
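To see the similarity numerically, a small sketch comparing the two measures for a two-class node (the helper functions are invented here):

    import numpy as np

    def gini(p):
        # Two-class Gini impurity.
        return 1 - p ** 2 - (1 - p) ** 2

    def entropy(p):
        # Two-class entropy in bits, with 0*log(0) taken as 0.
        if p in (0.0, 1.0):
            return 0.0
        return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

    for p in (0.1, 0.3, 0.5):
        print(f"p={p}: gini={gini(p):.3f}, entropy={entropy(p):.3f}")
    # Both vanish at pure nodes and peak at p = 0.5; only scale and shape differ.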
Decision Tree Regression

[Figure: three panels — "Regression Problem", "Input Plane" (axes 𝑥_0, 𝑥_1), and "Trained Decision Tree" — showing how the tree partitions the input plane for a target 𝑓(𝑥_0, 𝑥_1).]

The prediction at a leaf is the mean of the training targets that fall in that leaf's region, e.g.

𝑓(16, −2) = (130 + 254 + 158) / 3 ≈ 180.67
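A sketch reproducing this leaf-mean behaviour with scikit-learn; the training points are invented so that the slide's three targets (130, 254, 158) land in one leaf:

    from sklearn.tree import DecisionTreeRegressor
    import numpy as np

    # Invented inputs; the first three targets match the slide's example leaf.
    X = np.array([[15.0, -1.0], [17.0, -3.0], [16.5, -2.5], [2.0, 8.0]])
    y = np.array([130.0, 254.0, 158.0, 40.0])

    reg = DecisionTreeRegressor(max_depth=1).fit(X, y)
    print(reg.predict([[16.0, -2.0]]))  # mean of the leaf's targets, ~180.67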
Decision Tree Regression – Splitting Criteria
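This slide's content was not recovered. The standard criterion for regression splits is variance (sum-of-squared-errors) reduction rather than Gini; a sketch under that assumption:

    import numpy as np

    def sse(t):
        # Sum of squared errors around the mean: node impurity for regression.
        return float(np.sum((t - t.mean()) ** 2)) if len(t) else 0.0

    def best_regression_split(x, t):
        # Pick the threshold minimizing the combined SSE of the two children.
        order = np.argsort(x)
        xs, ts = x[order], t[order]
        best_thr, best_err = None, np.inf
        for i in range(1, len(xs)):
            if xs[i] == xs[i - 1]:
                continue
            err = sse(ts[:i]) + sse(ts[i:])
            if err < best_err:
                best_thr, best_err = (xs[i] + xs[i - 1]) / 2, err
        return best_thr, best_err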
