ShortCourse-QTT-Lecture2
Tho Quan
[email protected]
Agenda
• Inductive learning
• Decision Tree: ID3 and C4.5
• From Decision Tree to Random Forest
Five Tribes of Machine Learning
Inference Mechanisms
• IF temperature high AND NOT (water level low) THEN pressure high
• IF transducer output low THEN water level low
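The two rules above can be chained: the conclusion of the second rule feeds the negated condition of the first. A minimal sketch follows. The function names are hypothetical, and it assumes a closed world, meaning the water level is treated as "not low" whenever the transducer rule does not fire.

```python
def water_level_low(transducer_output_low: bool) -> bool:
    # Rule 2: IF transducer output low THEN water level low.
    # Closed-world assumption: if the rule does not fire, the level is not low.
    return transducer_output_low


def pressure_high(temperature_high: bool, transducer_output_low: bool) -> bool:
    # Rule 1: IF temperature high AND NOT (water level low) THEN pressure high.
    # The water-level fact is derived by calling rule 2 first.
    return temperature_high and not water_level_low(transducer_output_low)


if __name__ == "__main__":
    print(pressure_high(temperature_high=True, transducer_output_low=False))  # True
    print(pressure_high(temperature_high=True, transducer_output_low=True))   # False
```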
Deduction and Induction
Rule 1 : If Travel cost/km is expensive then mode = car
Rule 2 : If Travel cost/km is standard then mode = train
Rule 3 : If Travel cost/km is cheap and gender is male then mode = bus
Rule 4 : If Travel cost/km is cheap and gender is female and she owns no car then mode = bus
Rule 5 : If Travel cost/km is cheap and gender is female and she owns 1 car then mode = train
Gender : Male
Car Ownership : 1
Travel Cost/Km : Standard
Income Level : High
Transportation Mode ?
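Applying the five rules to the example passenger is deduction: the rules are given and the answer follows from them. A minimal sketch of the rules as a function is shown here. The function name and the string encodings of the attribute values are assumptions made for illustration.

```python
from typing import Optional


def predict_mode(cost_per_km: str, gender: str, cars_owned: int) -> Optional[str]:
    """Apply Rules 1-5 in order; return None when no rule matches."""
    if cost_per_km == "expensive":          # Rule 1
        return "car"
    if cost_per_km == "standard":           # Rule 2
        return "train"
    if cost_per_km == "cheap":
        if gender == "male":                # Rule 3
            return "bus"
        if gender == "female":
            if cars_owned == 0:             # Rule 4
                return "bus"
            if cars_owned == 1:             # Rule 5
                return "train"
    return None


if __name__ == "__main__":
    # Male, 1 car, standard travel cost (income level is not used by any rule).
    print(predict_mode("standard", "male", 1))  # train
```

Only Rule 2 is needed for this passenger, because the travel cost alone decides the outcome.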
Decision Tree
- Choose best attribute
- Split data set
- Recurse until each data item classified correctly
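The three steps above can be sketched as a small recursive procedure in the style of ID3. This is a simplified illustration, not a full implementation: it assumes categorical attributes, uses information gain to pick the "best" attribute, and the four-row data set and all names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(rows, attr, target):
    """Entropy of the parent minus the weighted entropy of the children."""
    parent = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return parent - remainder


def build_tree(rows, attrs, target):
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1:            # every item classified correctly
        return labels[0]
    if not attrs:                        # nothing left to split on: majority class
        return Counter(labels).most_common(1)[0][0]
    # Step 1: choose the best attribute.
    best = max(attrs, key=lambda a: information_gain(rows, a, target))
    remaining = [a for a in attrs if a != best]
    # Steps 2 and 3: split the data set and recurse on each part.
    return {best: {value: build_tree([r for r in rows if r[best] == value],
                                     remaining, target)
                   for value in {r[best] for r in rows}}}


# Hypothetical toy data, not the data set used in the lecture.
ROWS = [
    {"cost": "expensive", "gender": "male", "mode": "car"},
    {"cost": "standard", "gender": "female", "mode": "train"},
    {"cost": "cheap", "gender": "male", "mode": "bus"},
    {"cost": "cheap", "gender": "female", "mode": "bus"},
]

if __name__ == "__main__":
    print(build_tree(ROWS, ["cost", "gender"], "mode"))
```

On this toy data the travel cost separates the classes completely, so the tree has a single split on `cost`.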
Generate a Decision Tree
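• Measure Impurity : Entropy = – Σ p_i log2(p_i), Gini Index = 1 – Σ p_i^2, where p_i is the proportion of class i
• Information Gain : Entropy(parent) – Σ_k (n_k / n) × Entropy(child k), where n_k is the number of items in child k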
Generate a Decision Tree
• Pro(Bus) = 4/10
• Pro(Car) = 3/10
• Pro(Train) = 3/10
• Entropy = – 0.4 log2(0.4) – 0.3 log2(0.3) – 0.3 log2(0.3) = 1.571
• Gini Index = 1 – (0.4^2 + 0.3^2 + 0.3^2) = 0.660
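The two impurity values above can be checked numerically. The sketch below assumes base-2 logarithms, which is the base that reproduces 1.571. The function names are chosen for illustration.

```python
from math import log2


def entropy(probs):
    """Shannon entropy in bits; zero-probability classes contribute nothing."""
    return -sum(p * log2(p) for p in probs if p > 0)


def gini(probs):
    """Gini index: one minus the sum of squared class probabilities."""
    return 1 - sum(p * p for p in probs)


PROBS = [4 / 10, 3 / 10, 3 / 10]  # Bus, Car, Train

if __name__ == "__main__":
    print(round(entropy(PROBS), 3))  # 1.571
    print(round(gini(PROBS), 3))     # 0.66
```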
How to Use a Decision Tree
Data -> Decision Tree -> Decision Rule -> Make Predictions on unseen Data
How to Use a Decision Tree
• Gender : Male
• Car Ownership : 1
• Travel Cost/Km : Standard
• Income Level : High
• Transportation Mode ?
• Avoid overfitting
• Deal with continuous attributes
• Deal with missing data
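One common way to deal with a continuous attribute is to turn it into a binary test of the form "value <= threshold" and to try the midpoints between adjacent sorted values as candidate thresholds. The sketch below illustrates that idea. It scores candidates with plain information gain as a simplification, and the travel-cost numbers are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, gain) for the best split of the form value <= threshold."""
    parent = entropy(labels)
    distinct = sorted(set(values))
    best = (None, 0.0)
    # Candidate thresholds are midpoints between adjacent distinct values.
    for low, high in zip(distinct, distinct[1:]):
        t = (low + high) / 2
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        gain = parent - weighted
        if gain > best[1]:
            best = (t, gain)
    return best


# Hypothetical travel cost per km and the observed transportation mode.
COSTS = [1.0, 2.0, 3.0, 8.0, 9.0, 10.0]
MODES = ["bus", "bus", "bus", "car", "car", "car"]

if __name__ == "__main__":
    print(best_threshold(COSTS, MODES))  # (5.5, 1.0)
```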
Pruning
Man ? Woman
Problem with All Margin-based Discriminative Classifiers
• Ensemble Learning
- Average out biases
- Reduce the variance
- Unlikely to overfit
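A rough way to see why combining models helps is to compute how often a majority vote is right. The sketch below assumes a binary task, an odd number of models, and models that err independently with the same accuracy. Real ensembles only approximate that independence, so this is an idealized illustration rather than a guarantee.

```python
from math import comb


def majority_vote_accuracy(n_models: int, p_correct: float) -> float:
    """Probability that more than half of n independent models are correct."""
    k_min = n_models // 2 + 1
    return sum(
        comb(n_models, k) * p_correct**k * (1 - p_correct) ** (n_models - k)
        for k in range(k_min, n_models + 1)
    )


if __name__ == "__main__":
    for n in (1, 3, 25, 101):
        print(n, majority_vote_accuracy(n, 0.6))
```

With a per-model accuracy of 0.6, three voters are correct with probability 0.648, and the value keeps rising as more independent voters are added.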
Bias-variance Decomposition
Explanation
- Input: A selected flight in future (a flight from Changi to Tan Son Nhat for the next 48 hours by Singapore Airlines).
- Output: Delay prediction (Y/N)
Arrival hour: 0.25467993054
Airline: 0.253308988692
Origin: 0.158077791536
Departure time: 0.1364141321
Destination: 0.105243518586
Duration: 0.0660441127126
Type: 0.0219200955523
Arrival DoW: 0.00245074824256
Departure DoW: 0.00186059074095
Operation Type: 9.12956600922e-08
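The scores listed above look like normalized feature importances from a tree ensemble. That reading is an assumption, since the slide does not name them. The sketch below copies the values as given, ranks them, and checks that they sum to roughly 1.

```python
# Scores copied from the slide; interpreting them as feature importances
# is an assumption, not something the slide states.
SCORES = {
    "Arrival hour": 0.25467993054,
    "Airline": 0.253308988692,
    "Origin": 0.158077791536,
    "Departure time": 0.1364141321,
    "Destination": 0.105243518586,
    "Duration": 0.0660441127126,
    "Type": 0.0219200955523,
    "Arrival DoW": 0.00245074824256,
    "Departure DoW": 0.00186059074095,
    "Operation Type": 9.12956600922e-08,
}

# Rank features from most to least influential.
RANKED = sorted(SCORES.items(), key=lambda kv: kv[1], reverse=True)
TOTAL = sum(SCORES.values())
TOP3_SHARE = sum(score for _, score in RANKED[:3])

if __name__ == "__main__":
    for name, score in RANKED:
        print(f"{name:15s} {score:.4f}")
    print("total:", round(TOTAL, 6))
    print("share of top 3:", round(TOP3_SHARE, 3))
```

Arrival hour, airline, and origin together account for about two thirds of the total, while the day-of-week and operation-type scores are close to zero.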