Intro
Landscape
Xudong Liu
Assistant Professor
School of Computing
University of North Florida
“A field of study that gives computers the ability to learn without being
explicitly programmed.”
Features
Types of ML Systems
1. K-Nearest Neighbors
2. Linear Regression
3. Logistic Regression
4. Support Vector Machines
5. Decision Trees and Random Forests
6. Neural Networks
Types of ML Systems
1. K-Means (Clustering)
2. Principal Component Analysis (Dimensionality Reduction)
3. Deep Neural Networks
Types of ML Systems
Clustering
• Use Cases
Decision Trees: Disadvantages
• Prone to overfitting
• Regularize by setting a maximum depth
• Produce only orthogonal (axis-aligned) decision boundaries
• Sensitive to training set rotation; applying PCA first often gives a better orientation
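The orthogonal-boundary and rotation points can be illustrated with a small sketch. It assumes a toy 2-D grid and a depth-1 tree (a single axis-aligned split); the helper names `best_stump_accuracy` and `rotate` are hypothetical and exist only for this illustration.

```python
import math

def best_stump_accuracy(points, labels):
    """Best accuracy achievable by one axis-aligned split (a depth-1 tree)."""
    n = len(points)
    best = 0.0
    for axis in (0, 1):
        values = sorted({p[axis] for p in points})
        # Candidate thresholds: midpoints between consecutive distinct values.
        thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])]
        for t in thresholds:
            correct = sum((p[axis] > t) == bool(y) for p, y in zip(points, labels))
            # Either side of the split may be assigned either class.
            best = max(best, correct / n, 1 - correct / n)
    return best

def rotate(points, degrees):
    """Rotate 2-D points counterclockwise around the origin."""
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [(c * x - s * y, s * x + c * y) for x, y in points]

# Toy grid (an assumption for this sketch), labeled by the sign of x:
# one vertical boundary separates the classes perfectly.
points = [(x, y) for x in (-2, -1, 1, 2) for y in (-2, -1, 1, 2)]
labels = [1 if x > 0 else 0 for x, _ in points]

acc_original = best_stump_accuracy(points, labels)
acc_rotated = best_stump_accuracy(rotate(points, 45), labels)
print(acc_original, acc_rotated)
```

On the original data one split is enough. After a 45-degree rotation the true boundary is diagonal, so no single axis-aligned split classifies every point, and a tree would need several splits to approximate it.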
ML Landscape
Ensemble Methods
Basic Idea
• Two Decision Trees may each overfit on their own, but combining their predictions often generalizes better.
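One simple way to combine predictions is a majority vote. The sketch below assumes three classifiers rather than two, so that a vote can never tie; the prediction lists are made-up values for illustration, not results from trained trees.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the most common label among the models' predictions for one sample."""
    return Counter(predictions).most_common(1)[0][0]

def accuracy(predicted, truth):
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)

# Hypothetical labels and per-tree predictions; each tree errs on a different sample.
truth  = [1, 0, 1, 1, 0]
tree_a = [0, 0, 1, 1, 0]
tree_b = [1, 1, 1, 1, 0]
tree_c = [1, 0, 0, 1, 0]

ensemble = [majority_vote(votes) for votes in zip(tree_a, tree_b, tree_c)]
print(accuracy(tree_a, truth), accuracy(ensemble, truth))
```

Because the three trees make their mistakes on different samples, the vote outnumbers each individual error. The benefit shrinks when the models make the same mistakes.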
Bagging
Bagging = Bootstrap Aggregating
• Use the same training algorithm for every predictor, but train each one on a different random subset of the training set, sampled with replacement.
Random Forest is an Ensemble of Decision Trees, generally trained via the bagging method.
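A minimal sketch of bagging, under simplifying assumptions: the base learner is a 1-D threshold classifier (a decision stump) instead of a full Decision Tree, and the data set is a toy example. The names `fit_stump`, `bagging_fit`, and `bagging_predict` are hypothetical.

```python
import random
from collections import Counter

def fit_stump(xs, ys):
    """Fit a 1-D threshold classifier: predict `right` if x > t, else `left`."""
    values = sorted(set(xs))
    # Midpoints between distinct values; fall back to the single value if only one.
    thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])] or [values[0]]
    best = None
    for t in thresholds:
        for left, right in ((0, 1), (1, 0)):
            errors = sum((right if x > t else left) != y for x, y in zip(xs, ys))
            if best is None or errors < best[0]:
                best = (errors, t, left, right)
    _, t, left, right = best
    return lambda x: right if x > t else left

def bagging_fit(xs, ys, n_estimators=25, seed=0):
    """Train the same algorithm on bootstrap samples of the training set."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_estimators):
        # Bootstrap: draw n training indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def bagging_predict(models, x):
    """Aggregate the predictors by majority vote."""
    return Counter(m(x) for m in models).most_common(1)[0][0]

# Toy data (an assumption for this sketch).
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 6.0, 6.5, 7.0, 7.5, 8.0]
ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

models = bagging_fit(xs, ys)
print(bagging_predict(models, 1.0), bagging_predict(models, 7.0))
```

Each predictor sees a slightly different sample, so the thresholds differ from one predictor to the next; the vote smooths out those differences.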
Boosting
Basic Idea
• Train several weak learners sequentially, each trying to correct the errors made by its predecessor.
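As one concrete instance of this idea, the sketch below uses AdaBoost with 1-D decision stumps as the weak learners. AdaBoost is a choice made here for illustration; the slide does not name a specific boosting algorithm. Labels are assumed to be -1 or +1, and all function names are hypothetical.

```python
import math

def stump_predict(x, t, polarity):
    """Predict `polarity` if x > t, otherwise the opposite label."""
    return polarity if x > t else -polarity

def fit_weighted_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best 1-D stump."""
    values = sorted(set(xs))
    # One threshold below all points, plus midpoints between distinct values.
    thresholds = [values[0] - 1.0] + [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for t in thresholds:
        for polarity in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(x, t, polarity) != y)
            if best is None or err < best[0]:
                best = (err, t, polarity)
    return best

def adaboost_fit(xs, ys, n_rounds=3):
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []  # list of (alpha, threshold, polarity)
    for _ in range(n_rounds):
        err, t, polarity = fit_weighted_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, t, polarity))
        # Raise the weight of misclassified samples so the next learner focuses on them.
        weights = [w * math.exp(-alpha * y * stump_predict(x, t, polarity))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def adaboost_predict(ensemble, x):
    score = sum(alpha * stump_predict(x, t, p) for alpha, t, p in ensemble)
    return 1 if score >= 0 else -1

# Toy 1-D data (an assumption) that no single threshold classifies perfectly.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1]

ensemble = adaboost_fit(xs, ys, n_rounds=3)
print([adaboost_predict(ensemble, x) for x in xs])
```

No single stump fits this data, but the weighted combination of three sequentially trained stumps does, because each round reweights the samples its predecessor got wrong.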
Best Performance