03 01 Machine Learning
Two key problems
• Classification
• Regression
Classification vs Regression
• Classification
– Map input to discrete value
– Output values (class labels) are unordered
– Evaluate by number of correct classifications (accuracy)
• Regression
– Map input to continuous value
– Output values are ordered
– Evaluate by root mean squared error
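The two evaluation measures above can be sketched in a few lines of plain Python. The function names and the toy labels and values are assumptions made for illustration; they are not taken from the course notebooks.

```python
import math

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true class labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Classification: labels are discrete and unordered (illustrative data).
labels_true = ["cat", "dog", "dog", "bird"]
labels_pred = ["cat", "dog", "cat", "bird"]
acc = accuracy(labels_true, labels_pred)

# Regression: targets are continuous and ordered (illustrative data).
values_true = [1.0, 2.0, 3.0, 4.0]
values_pred = [1.0, 2.0, 3.0, 6.0]
err = rmse(values_true, values_pred)

print(f"accuracy = {acc:.2f}, RMSE = {err:.2f}")
```

Accuracy only asks whether a label is right or wrong, while RMSE measures how far off a prediction is, which only makes sense when the outputs are ordered.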
Regression Example
Jupyter Notebook
• See RegressionExample.ipynb for more examples, including some with Gaussian functions
Classification
• Support Vector Classifier
– Finds hyperplane(s) to split data
SVC margin
• Find hyperplane with largest margin
– Sometimes you allow some samples to be miscategorized
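To make "largest margin" concrete, here is a minimal sketch that compares two candidate hyperplanes by hand. It does not train a classifier. The helper names, the toy points, and the hinge-style slack used to score miscategorized samples are all assumptions made for illustration.

```python
import math

def decision_value(w, b, x):
    """Value of w.x + b for point x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def margin(w, b, points, labels):
    """Smallest label-signed distance to the hyperplane w.x + b = 0.
    Negative if any sample lies on the wrong side."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(y * decision_value(w, b, x) / norm
               for x, y in zip(points, labels))

def total_slack(w, b, points, labels):
    """Sum of hinge slacks max(0, 1 - y (w.x + b)).
    Zero when every sample is on or outside the margin."""
    return sum(max(0.0, 1.0 - y * decision_value(w, b, x))
               for x, y in zip(points, labels))

# Illustrative 2-D samples with labels -1 / +1.
points = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
labels = [-1, -1, +1, +1]

# Two hyperplanes that both separate the data: x = 2 and x = 1.5.
centered = margin((1.0, 0.0), -2.0, points, labels)
off_center = margin((1.0, 0.0), -1.5, points, labels)

# Adding one sample on the wrong side gives a nonzero slack.
noisy_points = points + [(2.5, 0.0)]
noisy_labels = labels + [-1]
slack = total_slack((1.0, 0.0), -2.0, noisy_points, noisy_labels)

print(centered, off_center, slack)
```

Both hyperplanes classify the clean data correctly, but the centered one has the larger margin. A soft-margin classifier trades margin width against total slack instead of requiring the slack to be zero.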
SVC Kernels
• Sometimes data isn’t linearly separable
– Use kernel function to map to linearly separable feature space
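A small sketch of the kernel idea, using one-dimensional data that no single threshold can split. The feature map phi(x) = (x, x²), the matching kernel, and the toy data are assumptions chosen for illustration. Practical kernels such as the RBF kernel work the same way but with a feature space that is never built explicitly.

```python
def phi(x):
    """Explicit feature map from 1-D input to a 2-D feature space."""
    return (x, x * x)

def kernel(x, z):
    """Equals the inner product phi(x) . phi(z), computed without building phi."""
    return x * z + (x * z) ** 2

def separable_by_threshold(xs, ys):
    """True if some threshold and orientation classify every 1-D point correctly."""
    ordered = sorted(xs)
    thresholds = [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]
    for t in thresholds:
        for s in (+1, -1):
            if all((s if x > t else -s) == y for x, y in zip(xs, ys)):
                return True
    return False

# Illustrative data: class -1 sits in the middle, class +1 on both sides.
xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [+1, +1, -1, -1, -1, +1, +1]

raw_separable = separable_by_threshold(xs, ys)

# In feature space the hyperplane 0*f1 + 1*f2 - 2.5 = 0 splits the classes.
w, b = (0.0, 1.0), -2.5
predictions = []
for x in xs:
    f = phi(x)
    score = w[0] * f[0] + w[1] * f[1] + b
    predictions.append(+1 if score > 0 else -1)

print(raw_separable, predictions == ys)
```

The kernel function gives the same inner products as the explicit map, which is why a classifier can work in the feature space without computing phi.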
SVC Notebook
Gaussian Mixture Models
• Extends k-means
– K-means is inflexible in cluster shape
– K-means lacks probabilistic cluster assignment
1. Choose starting “means”
2. Repeat until convergence
– For each point, find probability in each cluster
– For each cluster, update location and shape based on all data points, using the probabilities as weights
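The steps above can be sketched for one-dimensional data in plain Python. This is a minimal illustration, not the implementation used in the course notebooks. The function names, the toy data, the starting variance of 1.0, the variance floor, and the convergence test on the means are all assumptions.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm_1d(data, start_means, max_iter=100, tol=1e-8):
    """Fit a 1-D Gaussian mixture by alternating the two steps in the list above."""
    k = len(start_means)
    means = list(start_means)      # step 1: starting "means"
    variances = [1.0] * k          # assumed starting shape
    weights = [1.0 / k] * k        # assumed equal starting cluster weights
    resp = []
    for _ in range(max_iter):      # step 2: repeat until convergence
        # For each point, find its probability in each cluster.
        resp = []
        for x in data:
            likes = [weights[j] * gaussian_pdf(x, means[j], variances[j])
                     for j in range(k)]
            total = sum(likes)
            resp.append([v / total for v in likes])
        # For each cluster, update location and shape from all points,
        # using the probabilities as weights.
        shift = 0.0
        for j in range(k):
            nj = sum(r[j] for r in resp)
            new_mean = sum(r[j] * x for r, x in zip(resp, data)) / nj
            new_var = sum(r[j] * (x - new_mean) ** 2
                          for r, x in zip(resp, data)) / nj
            shift = max(shift, abs(new_mean - means[j]))
            means[j] = new_mean
            variances[j] = max(new_var, 1e-6)  # floor avoids a zero variance
            weights[j] = nj / len(data)
        if shift < tol:
            break
    return means, variances, weights, resp

# Illustrative data: two well-separated groups near 1 and 5.
data = [0.8, 1.0, 1.2, 0.9, 1.1, 4.9, 5.0, 5.1, 5.2, 4.8]
means, variances, weights, resp = fit_gmm_1d(data, start_means=[0.0, 6.0])
print(means, variances, weights)
```

Unlike k-means, every point contributes to every cluster, weighted by its probability of belonging there. The fitted variances give each cluster its own shape.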
Gaussian Mixture Modeling