Lecture 5 Classification P2 Decision Tree
Hanoi, 09/2023
Recap: Key Issues in Machine Learning
● What are good hypothesis spaces?
○ Which spaces have been useful in practical applications and why?
● What algorithms can work with these spaces?
○ Are there general design principles for machine learning algorithms?
● How can we find the best hypothesis in an efficient way?
○ How to find the optimal solution efficiently (“optimization” question)
● How can we optimize accuracy on future data?
○ Known as the “overfitting” problem (i.e., “generalization” theory)
● How can we have confidence in the results?
○ How much training data is required to find an accurate hypothesis? (“statistical” question)
● Are some learning problems computationally intractable? (“computational” question)
● How can we formulate application problems as machine learning problems? (“engineering” question)
FIT-CS INT3405 - Machine Learning
Recap: Model Representation
How do we represent h?
[Diagram: Training Set → Learning Algorithm → h; x (size of house) → Hypothesis h → y (estimated house price)]
Linear regression with one variable.
“Univariate Linear Regression”
● Analytical solution
Takes O(mn² + n³) time (m training examples, n features)
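The analytical solution can be sketched for the one-variable case as follows. This is a minimal illustration, assuming the hypothesis has the form h(x) = theta0 + theta1 * x and using made-up house sizes and prices; the function name `fit_univariate` and the data are hypothetical, not taken from the lecture. With a single feature the normal equations (XᵀX)θ = Xᵀy reduce to a 2×2 system, solved here directly.

```python
def fit_univariate(xs, ys):
    """Solve the normal equations for h(x) = theta0 + theta1 * x."""
    m = len(xs)
    # Entries of X^T X and X^T y when X has a column of ones and a column of x.
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = m * sxx - sx * sx  # zero when all x values are identical
    if det == 0:
        raise ValueError("x values must not all be equal")
    theta1 = (m * sxy - sx * sy) / det
    theta0 = (sy - theta1 * sx) / m
    return theta0, theta1


# Made-up data lying exactly on y = 50 + 2x (illustrative only).
sizes = [50.0, 80.0, 100.0, 120.0]
prices = [150.0, 210.0, 250.0, 290.0]
theta0, theta1 = fit_univariate(sizes, prices)
print(theta0, theta1)
```

With n features the same idea requires forming XᵀX, which costs O(mn²), and solving an n×n system, which costs O(n³).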
Source: https://regenerativetoday.com/simple-explanation-on-how-decision-tree-algorithm-makes-decisions/
● Binary split:
○ Divides values into two subsets
○ For an ordinal attribute, a grouping that puts non-adjacent values in the same subset violates the order property
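A short sketch of how binary splits can be enumerated, under the assumption that a nominal attribute allows any division into two non-empty subsets while an ordinal attribute allows only cuts along its order. The function names and the example values Small / Medium / Large are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def nominal_binary_splits(values):
    """All ways to divide distinct nominal values into two non-empty subsets."""
    values = list(values)
    first, rest = values[0], values[1:]
    splits = []
    # Keep `first` in the left subset so each split is listed once, not twice.
    for k in range(len(rest)):
        for extra in combinations(rest, k):
            left = [first, *extra]
            right = [v for v in rest if v not in extra]
            splits.append((left, right))
    return splits


def ordinal_binary_splits(ordered_values):
    """Only cut points along the given order, so each subset is a contiguous run."""
    ordered_values = list(ordered_values)
    return [
        (ordered_values[:i], ordered_values[i:])
        for i in range(1, len(ordered_values))
    ]


sizes = ["Small", "Medium", "Large"]  # hypothetical attribute values
nominal = nominal_binary_splits(sizes)
ordinal = ordinal_binary_splits(sizes)
print(nominal)
print(ordinal)
```

Treating the values as nominal yields the split {Small, Large} versus {Medium}, which skips over Medium; treating them as ordinal excludes it. For k distinct values there are 2^(k-1) - 1 nominal binary splits but only k - 1 order-preserving ones.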
● Entropy
● Misclassification error
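The two impurity measures listed above can be sketched as follows, using the standard definitions: entropy is the sum over classes of p · log2(1/p), and misclassification error is 1 minus the largest class proportion. The function names and the example labels are hypothetical, for illustration only.

```python
from collections import Counter
from math import log2


def class_proportions(labels):
    """Fraction of the node's examples that belong to each class."""
    counts = Counter(labels)
    total = len(labels)
    return [c / total for c in counts.values()]


def entropy(labels):
    """Entropy in bits: sum of p * log2(1 / p) over the classes present."""
    return sum(p * log2(1 / p) for p in class_proportions(labels))


def misclassification_error(labels):
    """Error made by always predicting the majority class of the node."""
    return 1.0 - max(class_proportions(labels))


balanced = ["yes", "yes", "no", "no"]  # made-up node with an even class mix
pure = ["yes", "yes", "yes"]           # made-up node with a single class
print(entropy(balanced), misclassification_error(balanced))
print(entropy(pure), misclassification_error(pure))
```

Both measures are zero for a pure node and largest when the classes are evenly mixed, which is why a split that lowers them is preferred.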
Thank you