
Machine Learning

Dr. Martin “Doc” Carlisle


Machine Learning

Computer Algorithms that improve with “experience”


Do we have labeled data? (review)
• Supervised
– Can train on data with labeled instances of normal vs. anomaly classes
– Not very common
• Semi-supervised
– Labeled instances for only the normal data
• Unsupervised
– No labeled data
Two key problems
• Classification
• Regression
Classification vs Regression
• Classification
– Map input to discrete value
– Data is unordered
– Evaluate by # of correct classifications
• Regression
– Map input to continuous value
– Data is ordered
– Evaluate by root mean squared error
Regression Example
Jupyter Notebook
• See RegressionExample.ipynb for more examples, including with Gaussian functions
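The notebook itself is not reproduced here; as a rough sketch of the idea (assuming scikit-learn, with synthetic data invented purely for illustration), a simple linear regression can be fit and then scored by root mean squared error, the evaluation metric mentioned above.

# Minimal sketch (not taken from RegressionExample.ipynb): fit a line to noisy data
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(50, 1))                  # one input feature
y = 2.5 * X.ravel() + 1.0 + rng.normal(0, 1, 50)      # noisy, roughly linear target

model = LinearRegression().fit(X, y)
pred = model.predict(X)
rmse = np.sqrt(mean_squared_error(y, pred))           # root mean squared error
print(model.coef_, model.intercept_, rmse)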
Classification
• Support Vector Classifier
– Finds hyperplane(s) to split data
SVC margin
• Find the hyperplane with the largest margin
– Sometimes you allow some samples to be misclassified (a soft margin)
SVC Kernels
• Sometimes data isn’t linearly separable
– Use a kernel function to map to a linearly separable feature space
SVC Notebook
• Jupyter Notebook SVC_Example.ipynb has examples of SVC
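As a hedged sketch (assumed, not from SVC_Example.ipynb), the following uses scikit-learn's SVC with an RBF kernel on the two-moons toy data, which is not linearly separable in the original space; the dataset and parameters are chosen only for illustration.

# Minimal sketch: RBF-kernel support vector classifier on non-linearly-separable data
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = SVC(kernel="rbf", C=1.0, gamma="scale")   # kernel maps data to a separable feature space
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))                # fraction of correct classifications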
Principal Component Analysis
• Fast and flexible way to reduce dimensionality of data (e.g. our faces)
– Computes eigenvectors of the data’s covariance matrix
PCA Notebook
• Jupyter Notebook PCA_Example.ipynb has an example of PCA, and the same data with an ISOMAP (http://www-clmc.usc.edu/publications/T/tenenbaum-Science2000.pdf)
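PCA_Example.ipynb is not reproduced here; the sketch below is an assumed illustration using scikit-learn's PCA on the LFW face images (the "eigenfaces" idea), with the number of components chosen arbitrarily.

# Minimal sketch: reduce face images to a small number of principal components
from sklearn.datasets import fetch_lfw_people
from sklearn.decomposition import PCA

faces = fetch_lfw_people(min_faces_per_person=60)     # downloads the data on first run
pca = PCA(n_components=150, whiten=True, random_state=0)
components = pca.fit_transform(faces.data)            # project pixel vectors onto eigenvectors
print(faces.data.shape, components.shape)             # original dimensionality vs. 150 components
print(pca.explained_variance_ratio_[:5])              # variance captured by the leading components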
Bayesian Classifier
• Bayes’ Theorem: $P(X \mid Y) = \dfrac{P(Y \mid X)\,P(X)}{P(Y)}$
• If we have a bunch of independent Ys, then:
– $P(\mathrm{Class} \mid Y_1, Y_2, \ldots, Y_n) \propto P(\mathrm{Class}) \prod_{i=1}^{n} P(Y_i \mid \mathrm{Class})$
• So we guess a class by just picking the biggest probability!
Naïve Bayesian Classifier
• $P(\mathrm{Class} \mid Y_1, Y_2, \ldots, Y_n) \propto P(\mathrm{Class}) \prod_{i=1}^{n} P(Y_i \mid \mathrm{Class})$
• Need a probability distribution to compute $P(Y_i \mid \mathrm{Class})$
– Gaussian: compute a mean and variance for each class
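A minimal sketch of the Gaussian case, assuming scikit-learn and the iris data (chosen here only as a convenient example): GaussianNB fits a per-class mean and variance for each feature and predicts the class with the largest posterior probability.

# Minimal sketch: Gaussian naive Bayes
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

nb = GaussianNB().fit(X_train, y_train)
print(nb.theta_)                    # per-class feature means learned from the data
print(nb.score(X_test, y_test))     # accuracy on held-out data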
Multinomial Bayesian Classifier
• Use a multinomial distribution instead, useful for data with “counts” (e.g. word counts in text)
Naïve Bayesian Summary
• Good for well-separated categories
• Good for high-dimensional data
• Good when the naïve assumptions match (independence, distributions)
• Creates fast, explainable models
Multinomial Bayesian Notebook
• Jupyter notebook Multinomial_Naive_Bayes.ipynb with Multinomial Naïve Bayes
Scikit TF-IDF
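The original code from this slide is not preserved; the sketch below is an assumed illustration of scikit's TF-IDF features feeding a multinomial naive Bayes text classifier, using the 20 Newsgroups data with two arbitrarily chosen categories.

# Minimal sketch: TF-IDF word weights + multinomial naive Bayes for text classification
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

categories = ["sci.space", "rec.autos"]             # illustrative choice of classes
train = fetch_20newsgroups(subset="train", categories=categories)
test = fetch_20newsgroups(subset="test", categories=categories)

model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(train.data, train.target)
print(model.score(test.data, test.target))          # accuracy on unseen posts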
Decision Trees
• Repeatedly split the space with hyperplanes (axis-aligned feature thresholds)
Decision Tree
• Automatically generated trees are not always so easily explainable and may exhibit strange behavior
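As a small illustrative sketch (assumed, not from the course materials), a shallow tree can be fit and its learned rules printed, which is one way to inspect whether the automatically generated splits look sensible or strange.

# Minimal sketch: a decision tree's axis-aligned splits, printed as readable rules
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(data.data, data.target)
print(export_text(tree, feature_names=list(data.feature_names)))   # the learned if/else rules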
Random Forests
Create many trees to reduce such oddities: each decision tree is trained on a random (bootstrap) sample of the data, and each split considers only a random subset of the input features
Random Forests

• Revisit digits in Jupyter Notebook Random_Forests.ipynb
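Random_Forests.ipynb is not reproduced here; the following is a rough sketch assuming scikit-learn's RandomForestClassifier on the digits data, with the number of trees chosen arbitrarily.

# Minimal sketch: a random forest of 200 trees on the digits data
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)                 # each tree: bootstrap sample + random feature subsets per split
print(forest.score(X_test, y_test))          # accuracy on held-out digits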


K-Means Clustering
• Partition n observations into k clusters, each observation belonging to the cluster with the nearest mean
1. Select k initial “means”
2. Partition the observations
3. Update the means
4. Repeat 2-3 until convergence
K-Means Clustering
• Not necessarily optimal (depends on selection of initial “means”)
• Must know # of clusters in advance
• Might also require mapping to new space
K-Means Clustering
• Revisit digits in Jupyter Notebook
K_Means_Clustering.ipynb
• Uses t-distributed Stochastic Neighbor Embedding
(TSNE) to visualize high-dimensional data
– converts affinities of data points to probabilities (Gaussian joint
probabilities)
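K_Means_Clustering.ipynb is not reproduced here; the sketch below is an assumed illustration of k-means with k = 10 on the digits data, with t-SNE used only to project the high-dimensional points down to 2-D for plotting.

# Minimal sketch: k-means on digits, plus a t-SNE embedding for visualization
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

X, y = load_digits(return_X_y=True)
kmeans = KMeans(n_clusters=10, n_init=10, random_state=0)   # k must be chosen in advance
labels = kmeans.fit_predict(X)                              # cluster assignment per image

X_2d = TSNE(n_components=2, random_state=0).fit_transform(X)  # 2-D coordinates for plotting only
print(labels[:10], X_2d.shape)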
Gaussian Mixture Models
• Extends k-means
– K-means lacks flexibility in cluster shape
– K-means lacks probabilistic cluster assignment
Gaussian Mixture Models
1. Choose starting “means”
2. Repeat until convergence:
– For each point, find the probability that it belongs to each cluster
– For each cluster, update its location and shape based on all data points, using those probabilities as weights
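A minimal sketch of fitting such a model with scikit-learn's GaussianMixture (the blob data and number of components here are invented for illustration); predict_proba exposes the soft, probabilistic cluster assignments that k-means lacks.

# Minimal sketch: Gaussian mixture model fit by expectation-maximization
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)
gmm = GaussianMixture(n_components=4, covariance_type="full", random_state=0)
gmm.fit(X)
print(gmm.means_)                   # cluster locations
print(gmm.predict_proba(X[:5]))     # per-point cluster membership probabilities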
Gaussian Mixture Modeling
• Revisit digits in Jupyter Notebook Gaussian_Mixture_Model.ipynb, now using Kernel Density Estimation
– Uses a mixture of one Gaussian per point
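Gaussian_Mixture_Model.ipynb is not reproduced here; as an assumed sketch of the kernel density idea (one Gaussian centered on every data point), scikit-learn's KernelDensity can be fit to 1-D synthetic data invented for illustration.

# Minimal sketch: Gaussian kernel density estimation
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(-2, 1, 200), rng.normal(3, 0.5, 100)])[:, None]

kde = KernelDensity(kernel="gaussian", bandwidth=0.3).fit(X)   # one Gaussian per training point
grid = np.linspace(-6, 6, 200)[:, None]
density = np.exp(kde.score_samples(grid))                      # score_samples returns the log-density
print(density.max())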
