
Machine Learning

Dr. Martin “Doc” Carlisle


Machine Learning

Computer Algorithms that improve with “experience”


Do we have labeled data? (review)
• Supervised
– Can train on data with labeled instances of normal vs. anomaly classes
– Not very common
• Semi-supervised
– Labeled instances for only the normal data
• Unsupervised
– No labeled data
Two key problems
• Classification
• Regression
Classification vs Regression
• Classification
– Map input to discrete value
– Data is unordered
– Evaluate by # of correct classifications
• Regression
– Map input to continuous value
– Data is ordered
– Evaluate by root mean squared error
Regression Example
Jupyter Notebook
• See RegressionExample.ipynb for more examples, including with Gaussian functions
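The notebook itself is not reproduced here; as a rough sketch of the idea (assuming scikit-learn, with synthetic data invented purely for illustration), a simple linear regression can be fit and then scored by root mean squared error, the evaluation metric mentioned above.

# Minimal sketch (not taken from RegressionExample.ipynb): fit a line to noisy data
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(50, 1))                  # one input feature
y = 2.5 * X.ravel() + 1.0 + rng.normal(0, 1, 50)      # noisy, roughly linear target

model = LinearRegression().fit(X, y)
pred = model.predict(X)
rmse = np.sqrt(mean_squared_error(y, pred))           # root mean squared error
print(model.coef_, model.intercept_, rmse)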
Classification
• Support Vector Classifier
– Finds hyperplane(s) to split data
SVC margin
• Find the hyperplane with the largest margin
– Sometimes you allow some samples to be misclassified (a soft margin)
SVC Kernels
• Sometimes data isn’t linearly separable
– Use a kernel function to map to a linearly separable feature space
SVC Notebook
• Jupyter Notebook SVC_Example.ipynb has examples of SVC
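As a hedged sketch (assumed, not from SVC_Example.ipynb), the following uses scikit-learn's SVC with an RBF kernel on the two-moons toy data, which is not linearly separable in the original space; the dataset and parameters are chosen only for illustration.

# Minimal sketch: RBF-kernel support vector classifier on non-linearly-separable data
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = SVC(kernel="rbf", C=1.0, gamma="scale")   # kernel maps data to a separable feature space
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))                # fraction of correct classifications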
Principal Component Analysis
• Fast and flexible way to reduce dimensionality of data (e.g. our faces)
– Computes eigenvectors of the data’s covariance matrix
PCA Notebook
• Jupyter Notebook PCA_Example.ipynb has an example of PCA, and the same data with an ISOMAP (http://www-clmc.usc.edu/publications/T/tenenbaum-Science2000.pdf)
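PCA_Example.ipynb is not reproduced here; the sketch below is an assumed illustration using scikit-learn's PCA on the LFW face images (the "eigenfaces" idea), with the number of components chosen arbitrarily.

# Minimal sketch: reduce face images to a small number of principal components
from sklearn.datasets import fetch_lfw_people
from sklearn.decomposition import PCA

faces = fetch_lfw_people(min_faces_per_person=60)     # downloads the data on first run
pca = PCA(n_components=150, whiten=True, random_state=0)
components = pca.fit_transform(faces.data)            # project pixel vectors onto eigenvectors
print(faces.data.shape, components.shape)             # original dimensionality vs. 150 components
print(pca.explained_variance_ratio_[:5])              # variance captured by the leading components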
Bayesian Classifier
• Bayes’ Theorem: $P(X \mid Y) = \dfrac{P(Y \mid X)\,P(X)}{P(Y)}$
• If we have a bunch of independent Ys, then:
– $P(\mathrm{Class} \mid Y_1, Y_2, \ldots, Y_n) \propto P(\mathrm{Class}) \prod_{i=1}^{n} P(Y_i \mid \mathrm{Class})$
• So we guess a class by just picking the biggest probability!
Naïve Bayesian Classifier
• $P(\mathrm{Class} \mid Y_1, Y_2, \ldots, Y_n) \propto P(\mathrm{Class}) \prod_{i=1}^{n} P(Y_i \mid \mathrm{Class})$
• Need a probability distribution to compute $P(Y_i \mid \mathrm{Class})$
– Gaussian: compute a mean and variance for each class
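A minimal sketch of the Gaussian case, assuming scikit-learn and the iris data (chosen here only as a convenient example): GaussianNB fits a per-class mean and variance for each feature and predicts the class with the largest posterior probability.

# Minimal sketch: Gaussian naive Bayes
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

nb = GaussianNB().fit(X_train, y_train)
print(nb.theta_)                    # per-class feature means learned from the data
print(nb.score(X_test, y_test))     # accuracy on held-out data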
Multinomial Bayesian Classifier
• Use a multinomial distribution instead, useful for data with “counts” (e.g. word counts in text)
Naïve Bayesian Summary
• Good for well-separated categories
• Good for high-dimensional data
• Good when the naïve assumptions match (independence, distributions)
• Creates fast, explainable models
Multinomial Bayesian Notebook
• Jupyter notebook Multinomial_Naive_Bayes.ipynb with Multinomial Naïve Bayes
Scikit TF-IDF
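The original code from this slide is not preserved; the sketch below is an assumed illustration of scikit's TF-IDF features feeding a multinomial naive Bayes text classifier, using the 20 Newsgroups data with two arbitrarily chosen categories.

# Minimal sketch: TF-IDF word weights + multinomial naive Bayes for text classification
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

categories = ["sci.space", "rec.autos"]             # illustrative choice of classes
train = fetch_20newsgroups(subset="train", categories=categories)
test = fetch_20newsgroups(subset="test", categories=categories)

model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(train.data, train.target)
print(model.score(test.data, test.target))          # accuracy on unseen posts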
Decision Trees
• Repeatedly split the space with hyperplanes (axis-aligned feature thresholds)
Decision Tree
• Automatically generated trees are not always so easily explainable and may exhibit strange behavior
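As a small illustrative sketch (assumed, not from the course materials), a shallow tree can be fit and its learned rules printed, which is one way to inspect whether the automatically generated splits look sensible or strange.

# Minimal sketch: a decision tree's axis-aligned splits, printed as readable rules
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(data.data, data.target)
print(export_text(tree, feature_names=list(data.feature_names)))   # the learned if/else rules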
Random Forests
Create many trees to reduce such oddities: each decision tree is trained on a random (bootstrap) sample of the data, and each split considers only a random subset of the input features
Random Forests

• Revisit digits in Jupyter Notebook Random_Forests.ipynb
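Random_Forests.ipynb is not reproduced here; the following is a rough sketch assuming scikit-learn's RandomForestClassifier on the digits data, with the number of trees chosen arbitrarily.

# Minimal sketch: a random forest of 200 trees on the digits data
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)                 # each tree: bootstrap sample + random feature subsets per split
print(forest.score(X_test, y_test))          # accuracy on held-out digits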


K-Means Clustering
• Partition n observations into k clusters, each observation belonging to the cluster with the nearest mean
1. Select k initial “means”
2. Partition the observations
3. Update the means
4. Repeat 2-3 until convergence
K-Means Clustering
• Not necessarily optimal (depends on selection of initial “means”)
• Must know # of clusters in advance
• Might also require mapping to new space
K-Means Clustering
• Revisit digits in Jupyter Notebook
K_Means_Clustering.ipynb
• Uses t-distributed Stochastic Neighbor Embedding
(TSNE) to visualize high-dimensional data
– converts affinities of data points to probabilities (Gaussian joint
probabilities)
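K_Means_Clustering.ipynb is not reproduced here; the sketch below is an assumed illustration of k-means with k = 10 on the digits data, with t-SNE used only to project the high-dimensional points down to 2-D for plotting.

# Minimal sketch: k-means on digits, plus a t-SNE embedding for visualization
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

X, y = load_digits(return_X_y=True)
kmeans = KMeans(n_clusters=10, n_init=10, random_state=0)   # k must be chosen in advance
labels = kmeans.fit_predict(X)                              # cluster assignment per image

X_2d = TSNE(n_components=2, random_state=0).fit_transform(X)  # 2-D coordinates for plotting only
print(labels[:10], X_2d.shape)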
Gaussian Mixture Models
• Extends k-means
– K-means lacks flexibility in cluster shape
– K-means lacks probabilistic cluster assignment
Gaussian Mixture Models
1. Choose starting “means”
2. Repeat until convergence:
– For each point, find the probability that it belongs to each cluster
– For each cluster, update its location and shape based on all data points, using those probabilities as weights
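A minimal sketch of fitting such a model with scikit-learn's GaussianMixture (the blob data and number of components here are invented for illustration); predict_proba exposes the soft, probabilistic cluster assignments that k-means lacks.

# Minimal sketch: Gaussian mixture model fit by expectation-maximization
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)
gmm = GaussianMixture(n_components=4, covariance_type="full", random_state=0)
gmm.fit(X)
print(gmm.means_)                   # cluster locations
print(gmm.predict_proba(X[:5]))     # per-point cluster membership probabilities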
Gaussian Mixture Modeling
• Revisit digits in Jupyter Notebook Gaussian_Mixture_Model.ipynb, now using Kernel Density Estimation
– Uses a mixture of one Gaussian per point
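Gaussian_Mixture_Model.ipynb is not reproduced here; as an assumed sketch of the kernel density idea (one Gaussian centered on every data point), scikit-learn's KernelDensity can be fit to 1-D synthetic data invented for illustration.

# Minimal sketch: Gaussian kernel density estimation
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(-2, 1, 200), rng.normal(3, 0.5, 100)])[:, None]

kde = KernelDensity(kernel="gaussian", bandwidth=0.3).fit(X)   # one Gaussian per training point
grid = np.linspace(-6, 6, 200)[:, None]
density = np.exp(kde.score_samples(grid))                      # score_samples returns the log-density
print(density.max())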
