
Practical Data Science

An Introduction to Supervised Machine Learning
and Pattern Classification: The Big Picture

Sebastian Raschka
Michigan State University
NextGen Bioinformatics Seminars - 2015

Feb. 11, 2015

A Little Bit About Myself ...


PhD candidate in Dr. L. Kuhn's lab:
Developing software & methods for
- Protein-ligand docking
- Large-scale drug/inhibitor discovery

and some other machine learning side-projects

What is Machine Learning?


"Field of study that gives computers the
ability to learn without being explicitly
programmed.
(Arthur Samuel, 1959)

By Phillip Taylor [CC BY 2.0]

http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram

Examples of Machine Learning


Text Recognition

Biology

http://commons.wikimedia.org/wiki/
File:American_book_company_1916._letter_envelope-2.JPG#filelinks
[public domain]

Spam Filtering
https://flic.kr/p/5BLW6G [CC BY 2.0]

Examples of Machine Learning


Self-driving cars
Recommendation systems

http://commons.wikimedia.org/wiki/File:Netflix_logo.svg [public domain]


By Steve Jurvetson [CC BY 2.0]

Photo search
and many, many
more ...
http://googleresearch.blogspot.com/2014/11/a-picture-is-worth-thousand-coherent.html

How many of you have used machine learning before?

Our Agenda

Concepts and the big picture

Workflow

Practical tips & good habits

Supervised learning:
- Labeled data
- Direct feedback
- Predict outcome/future

Unsupervised learning:
- No labels
- No feedback
- Find hidden structure

Reinforcement learning:
- Decision process
- Reward system
- Learn series of actions

Unsupervised Learning
- Clustering: [DBSCAN on a toy dataset]

Supervised Learning
- Regression: [Soccer Fantasy Score prediction]
- Classification: [SVM on 2 classes of the Wine dataset]  <- Today's topic

Nomenclature

IRIS dataset
https://archive.ics.uci.edu/ml/datasets/Iris

         sepal_length   sepal_width   petal_length   petal_width   class
  1      5.1            3.5           1.4            0.2           setosa
  2      4.9            3.0           1.4            0.2           setosa
  ...
  50     6.4            3.2           4.5            1.5           versicolor
  ...
  150    5.9            3.0           5.1            1.8           virginica

Rows: instances (samples, observations)
Columns: features (attributes, dimensions)
Last column: classes (targets)
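For readers who want to follow along in code, here is a minimal sketch (assuming scikit-learn, one of the tools listed at the end, is installed) that loads the Iris data and shows the instances/features/classes terminology:

```python
# Minimal sketch: load the Iris data and inspect instances, features, and targets.
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target               # X: 150 x 4 feature matrix, y: 150 class labels
print(X.shape)                              # (150, 4) -> instances x features
print(iris.feature_names)                   # sepal/petal length and width in cm
print(iris.target_names)                    # ['setosa' 'versicolor' 'virginica']
print(X[0], iris.target_names[y[0]])        # first instance and its class
```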

Classification
1) Learn from training data
2) Map unseen (new) data

[Scatter plot: class1 and class2 samples in a two-dimensional feature space (x1, x2)]

Supervised Learning

[Workflow diagram:
Raw Data Collection
  -> Pre-Processing (Missing Data, Feature Extraction, Sampling)
  -> Split into Training Dataset and Test Dataset
  -> Pre-Processing of the training data (Feature Selection, Feature Scaling, Dimensionality Reduction)
  -> Learning Algorithm, Training
       (loop: Cross Validation, Hyperparameter Optimization, Performance Metrics, Model Selection, Refinement)
  -> Final Model Evaluation on the Test Dataset
  -> Final Classification/Regression Model
  -> Prediction on New Data (with Pre-Processing and Post-Processing)]

Sebastian Raschka 2014
This work is licensed under a Creative Commons Attribution 4.0 International License.


A Few Common Classifiers


Perceptron

Naive Bayes

Decision Tree
K-Nearest Neighbor
Logistic Regression
Artificial Neural Network / Deep Learning
Support Vector Machine
Ensemble Methods: Random Forest, Bagging, AdaBoost

Discriminative Algorithms
- Map x -> y directly.
- E.g., distinguish between people speaking different languages without learning the languages.
- Logistic Regression, SVM, Neural Networks

Generative Algorithms
- Model a more general problem: how the data was generated.
- I.e., the distribution of the class; the joint probability distribution p(x, y).
- Naive Bayes, Bayesian Belief Network classifier, Restricted Boltzmann Machine

Examples of Discriminative Classifiers:
Perceptron
F. Rosenblatt. The perceptron, a perceiving and recognizing automaton (Project Para). Cornell Aeronautical Laboratory, 1957.

[Diagram: inputs 1, x_i1, x_i2 weighted by w_0, w_1, w_2 and passed through a unit step function to produce the output ŷ_i; y ∈ {-1, 1}; decision boundary in the (x_1, x_2) plane]

Net input:  w^T x = w_0 + w_1 x_1 + w_2 x_2

where
  w_j = weight
  x_i = training sample
  y_i = desired output
  ŷ_i = actual output
  t   = iteration step
  η   = learning rate
  θ   = threshold (here: 0)

Unit step function:
  ŷ_i =  1  if w^T x_i ≥ θ
        -1  otherwise

Update rule:
  w_j(t+1) = w_j(t) + η (y_i − ŷ_i) x_i

until t+1 = max. iterations or error = 0
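As a concrete illustration of this update rule, here is a small NumPy sketch of perceptron training (the variable names and defaults are illustrative; assumes labels y in {-1, 1}):

```python
import numpy as np

def train_perceptron(X, y, eta=0.1, n_iter=10):
    """Minimal perceptron sketch: X is (n_samples, n_features), y holds labels in {-1, 1}."""
    w = np.zeros(X.shape[1] + 1)                 # w[0] is the bias weight w_0
    for t in range(n_iter):                      # stop after max. iterations ...
        errors = 0
        for xi, target in zip(X, y):
            y_hat = 1 if (w[0] + np.dot(w[1:], xi)) >= 0 else -1   # unit step, threshold 0
            update = eta * (target - y_hat)                         # eta * (y_i - y_hat_i)
            w[1:] += update * xi                                    # update feature weights
            w[0] += update                                          # update bias weight
            errors += int(update != 0.0)
        if errors == 0:                          # ... or when the training error reaches 0
            break
    return w
```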

Discriminative Classifiers:
Perceptron
F. Rosenblatt. The perceptron, a perceiving and recognizing automaton (Project Para). Cornell Aeronautical Laboratory, 1957.

[Diagram: perceptron unit with inputs 1, x_i1, x_i2, weights w_0, w_1, w_2, output ŷ_i; y ∈ {-1, 1}; linear decision boundary in the (x_1, x_2) plane]

- Binary classifier (one vs. all, OVA)
- Convergence problems (set a maximum number of iterations)
- Modification: stochastic gradient descent
- "Modern" perceptron: Support Vector Machine (maximize the margin)
- Multilayer perceptron (MLP)

Generative Classifiers:
Naive Bayes

Bayes' Theorem:

  P(ω_j | x_i) = P(x_i | ω_j) P(ω_j) / P(x_i)

  Posterior probability = (Likelihood × Prior probability) / Evidence

Iris example:  P("Setosa" | x_i),  x_i = [4.5 cm, 7.4 cm]

Generative Classifiers:
Naive Bayes

Bayes' Theorem:

  P(ω_j | x_i) = P(x_i | ω_j) P(ω_j) / P(x_i)

Decision Rule:

  predicted class label ω_j = argmax_j P(ω_j | x_i),  j = 1, …, m

  e.g., ω_j ∈ {Setosa, Versicolor, Virginica}

Generative Classifiers:
Naive Bayes

  P(ω_j | x_i) = P(x_i | ω_j) P(ω_j) / P(x_i)

Evidence: P(x_i)  (cancels out in the decision rule)

Prior probability:  P(ω_j) = N_ω_j / N_c   (class frequency)

Class-conditional probability (here: Gaussian kernel):

  P(x_ik | ω_j) = (2π σ_ω_j²)^(-1/2) · exp( −(x_ik − μ_ω_j)² / (2 σ_ω_j²) )

  P(x_i | ω_j) = ∏_k P(x_ik | ω_j)
Generative Classifiers:
Naive Bayes

- Naive conditional independence assumption typically violated
- Works well for small datasets
- Multinomial model still quite popular for text classification (e.g., spam filter)
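A minimal scikit-learn sketch of a Gaussian naive Bayes classifier on the Iris data (the 70/30 split is an illustrative choice):

```python
# Minimal sketch: Gaussian naive Bayes on the Iris data.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=1)   # hold out 30% for testing

nb = GaussianNB().fit(X_train, y_train)     # estimates class priors and per-class Gaussians
print(nb.score(X_test, y_test))             # mean accuracy on the held-out data
```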

Non-Parametric Classifiers:
K-Nearest Neighbor

[Illustration: a query point is assigned the majority class among its k nearest neighbors, e.g., k=1 vs. k=3]

- Simple!
- Lazy learner
- Very susceptible to the curse of dimensionality
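For reference, a short scikit-learn sketch of a k-nearest-neighbor classifier (k=3 and the split are illustrative choices):

```python
# Minimal sketch: k-nearest neighbors on the Iris data.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=1)

knn = KNeighborsClassifier(n_neighbors=3, weights='uniform')   # k=3, uniform vote
knn.fit(X_train, y_train)          # "lazy" learner: fit essentially stores the training data
print(knn.score(X_test, y_test))
```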

Iris Example

[Decision-region plot on the Iris data, C=3 classes (Setosa, Versicolor, Virginica): k=3, Mahalanobis distance, uniform weights; decision tree with depth = 2]

Decision Tree

  petal length <= 2.45?
    Y -> Setosa
    N -> petal length <= 4.75?
           Y -> Versicolor
           N -> Virginica

[Decision-region plot for a tree of depth = 4]

  Entropy = − Σ_i p_i log_k(p_i)

  e.g., 2 · (−0.5 · log₂(0.5)) = 1

  Information Gain = entropy(parent) − [avg entropy(children)]
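To make the entropy and information-gain calculation concrete, a small NumPy sketch (the split proportions are made-up illustrative values):

```python
import numpy as np

def entropy(p):
    """Entropy of a class-probability distribution p (log base 2)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                              # treat 0 * log(0) as 0
    return -np.sum(p * np.log2(p))

print(entropy([0.5, 0.5]))                    # 1.0 bit, as in the example above

# Information gain = entropy(parent) - weighted avg. entropy(children)
parent = entropy([0.5, 0.5])
children = 0.5 * entropy([0.8, 0.2]) + 0.5 * entropy([0.2, 0.8])
print(parent - children)                      # > 0: the split reduces impurity
```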

"No Free Lunch" :(


D. H. Wolpert. The supervised learning no-free-lunch theorems. In Soft Computing and Industry, pages 25–42. Springer, 2002.

Our model is a simplification of reality

Simplification is based on assumptions (model bias)

Assumptions fail in certain situations

Roughly speaking:
No one model works best for all possible situations.

Which Algorithm?
What is the size and dimensionality of my training set?
Is the data linearly separable?
How much do I care about computational efficiency?
- Model building vs. real-time prediction time
- Eager vs. lazy learning / on-line vs. batch learning
- prediction performance vs. speed
Do I care about interpretability, or should it "just work well"?
...

[Supervised learning workflow diagram (see above), shown again as a section divider.]

Missing Values:
- Remove features (columns)
- Remove samples (rows)
- Imputation (mean, nearest neighbor, …)

Sampling:
- Random split into training and validation sets
- Typically 60/40, 70/30, 80/20
- Don't use the validation set until the very end! (overfitting)
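A brief scikit-learn sketch of both steps (mean imputation and a 70/30 split; the toy data is illustrative):

```python
# Minimal sketch: mean imputation of missing values and a random 70/30 split.
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split

X = np.array([[5.1, 3.5], [4.9, np.nan], [6.4, 3.2], [5.9, 3.0]])   # one missing value
y = np.array([0, 0, 1, 2])

X_imp = SimpleImputer(strategy='mean').fit_transform(X)    # replace NaN with the column mean

X_train, X_val, y_train, y_val = train_test_split(
    X_imp, y, test_size=0.3, random_state=1)                # keep the validation set untouched until the end
```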

Categorical Variables

     color   size   prize   class label
0    green   M      10.1    class1
1    red     L      13.5    class2
2    blue    XL     15.3    class1

     color: nominal     size: ordinal

Mapping to numbers:
- size (ordinal):  M -> 1, L -> 2, XL -> 3
- class label:     class1 -> 0, class2 -> 1
- color (nominal, one-hot):  green -> (1,0,0), red -> (0,1,0), blue -> (0,0,1)

     color=blue   color=green   color=red   prize   size   class label
0    0            1             0           10.1    1      0
1    0            0             1           13.5    2      1
2    1            0             0           15.3    3      0
[Supervised learning workflow diagram (see above), shown again as a section divider.]

Generalization Error and Overfitting

How well does the model perform on unseen data?

Generalization Error and Overfitting

Error Metrics: Confusion Matrix

here: setosa = positive

                     predicted positive    predicted negative
actual positive      TP                    FN
actual negative      FP                    TN

[Linear SVM on sepal/petal lengths]

Error Metrics

here: setosa = positive

[Confusion matrix as above; Linear SVM on sepal/petal lengths]

(micro and macro averaging for multi-class problems)

  Accuracy = (TP + TN) / (FP + FN + TP + TN) = 1 − Error

  False Positive Rate = FP / N

  True Positive Rate = TP / P   (Recall)

  Precision = TP / (TP + FP)
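A scikit-learn sketch computing these metrics from predictions (toy binary labels; 1 = positive class):

```python
# Minimal sketch: confusion matrix and derived metrics.
from sklearn.metrics import confusion_matrix, accuracy_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

print(confusion_matrix(y_true, y_pred))     # rows: actual class, columns: predicted class
print(accuracy_score(y_true, y_pred))       # (TP + TN) / total
print(precision_score(y_true, y_pred))      # TP / (TP + FP)
print(recall_score(y_true, y_pred))         # TP / P  (true positive rate)
# precision_score/recall_score accept average='micro' or 'macro' for multi-class problems.
```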

Receiver Operating Characteristic (ROC) Curves

Model Selection

Complete dataset -> split into Training dataset and Test dataset

k-fold cross-validation (k=4) on the training dataset:

  1st iteration:  fold 1 = test set, folds 2-4 = training     -> calc. error
  2nd iteration:  fold 2 = test set, folds 1, 3, 4 = training -> calc. error
  3rd iteration:  fold 3 = test set, folds 1, 2, 4 = training -> calc. error
  4th iteration:  fold 4 = test set, folds 1, 2, 3 = training -> calc. error

  -> calculate avg. error
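The same procedure as a scikit-learn sketch (k=4; the classifier is an illustrative choice):

```python
# Minimal sketch: 4-fold cross-validation on the training portion of the Iris data.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=1)

scores = cross_val_score(KNeighborsClassifier(n_neighbors=3),
                         X_train, y_train, cv=4)    # one accuracy score per fold
print(scores, scores.mean())                        # per-fold scores and their average
```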

k-fold CV and ROC

Feature Selection

IMPORTANT! (Noise, overfitting, curse of dimensionality, efficiency)

- Domain knowledge
- Variance threshold
- Exhaustive search
- Decision trees
- …

Simplest example: Greedy Backward Selection

  start:  X = [x1, x2, x3, x4]
     ->   X = [x1, x3, x4]
     ->   X = [x1, x3]
  stop (if d = k)
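A rough sketch of greedy backward selection (scoring by cross-validated accuracy; the classifier and the target size k are illustrative assumptions):

```python
# Minimal sketch: greedy backward feature selection by cross-validated accuracy.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X, y = iris.data, iris.target
clf = KNeighborsClassifier(n_neighbors=3)

features = list(range(X.shape[1]))     # start with all feature indices
k = 2                                  # stop when d = k features remain
while len(features) > k:
    # drop the feature whose removal hurts the cross-validated score the least
    scores = {f: cross_val_score(clf, X[:, [g for g in features if g != f]], y, cv=4).mean()
              for f in features}
    features.remove(max(scores, key=scores.get))
print(features)                        # indices of the k features that were kept
```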

Dimensionality Reduction

Transformation onto a new feature subspace

e.g., Principal Component Analysis (PCA)

Find directions of maximum variance

Retain most of the information

PCA in 3 Steps

0. Standardize the data:

   z = (x_ik − μ_k) / σ_k

1. Compute the covariance matrix:

   σ_jk = 1/(n − 1) Σ_i (x_ij − μ_j)(x_ik − μ_k)

   Σ = | σ₁²   σ₁₂   σ₁₃   σ₁₄ |
       | σ₂₁   σ₂²   σ₂₃   σ₂₄ |
       | σ₃₁   σ₃₂   σ₃²   σ₃₄ |
       | σ₄₁   σ₄₂   σ₄₃   σ₄² |

PCA in 3 Steps

2. Eigendecomposition and sorting of the eigenvalues:

   Σ v = λ v

Eigenvectors
[[ 0.52237162  -0.37231836  -0.72101681   0.26199559]
 [-0.26335492  -0.92555649   0.24203288  -0.12413481]
 [ 0.58125401  -0.02109478   0.14089226  -0.80115427]
 [ 0.56561105  -0.06541577   0.6338014    0.52354627]]

Eigenvalues (from high to low)
[ 2.93035378   0.92740362   0.14834223   0.02074601]

PCA in 3 Steps

3. Select the top k eigenvectors and transform the data:

Eigenvectors
[[ 0.52237162  -0.37231836  -0.72101681   0.26199559]
 [-0.26335492  -0.92555649   0.24203288  -0.12413481]
 [ 0.58125401  -0.02109478   0.14089226  -0.80115427]
 [ 0.56561105  -0.06541577   0.6338014    0.52354627]]

Eigenvalues
[ 2.93035378   0.92740362   0.14834223   0.02074601]

[Plot: the Iris data projected onto the first 2 principal components]
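The three steps as a NumPy sketch (standardize, covariance matrix, eigendecomposition, projection onto the top k=2 components):

```python
# Minimal sketch: PCA "by hand" on the Iris data.
import numpy as np
from sklearn.datasets import load_iris

X = load_iris().data

# 0. Standardize the data
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# 1. Covariance matrix (features x features)
cov = np.cov(Z.T)

# 2. Eigendecomposition; sort eigenvalues from high to low
eig_vals, eig_vecs = np.linalg.eigh(cov)        # eigh: the covariance matrix is symmetric
order = np.argsort(eig_vals)[::-1]
eig_vals, eig_vecs = eig_vals[order], eig_vecs[:, order]

# 3. Project onto the top k=2 eigenvectors
W = eig_vecs[:, :2]                             # 4 x 2 projection matrix
X_pca = Z @ W                                   # 150 x 2 transformed data
print(eig_vals)                                 # approx. [2.93, 0.93, 0.15, 0.02], as above
```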

Hyperparameter Optimization:
GridSearch in scikit-learn

[Decision-region plots for different hyperparameter settings, e.g., SVM with C=1000, gamma=0.1; C=1; k-NN with k=11, uniform weights]
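A sketch of such a grid search (the parameter grid values are illustrative):

```python
# Minimal sketch: exhaustive grid search over SVM hyperparameters with cross-validation.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

iris = load_iris()
param_grid = {'C': [1, 10, 100, 1000],
              'gamma': [0.001, 0.01, 0.1, 1.0],
              'kernel': ['rbf']}

grid = GridSearchCV(SVC(), param_grid, cv=5)    # 5-fold CV for every parameter combination
grid.fit(iris.data, iris.target)
print(grid.best_params_, grid.best_score_)      # best setting and its cross-validated accuracy
```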

Non-Linear Problems
- e.g., the XOR gate

[Decision-region plot on XOR data, e.g., a decision tree of depth = 4]

Kernel Trick
Kernel function: map onto a high-dimensional space (non-linear combinations of the original features)

Kernel Trick
Trick: No explicit dot product!
Radial Basis Function (RBF) Kernel:

  k(x_i, x_j) = exp(−γ ‖x_i − x_j‖²)
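A sketch of the kernel trick in practice (an RBF-kernel SVM on toy XOR-style data; the data generation and parameters are illustrative):

```python
# Minimal sketch: an RBF-kernel SVM separates XOR-style data that no linear model can.
import numpy as np
from sklearn.svm import SVC

rng = np.random.RandomState(0)
X = rng.randn(200, 2)                                        # toy 2D points
y = np.logical_xor(X[:, 0] > 0, X[:, 1] > 0).astype(int)     # XOR-style class labels

svm = SVC(kernel='rbf', gamma=1.0, C=1.0)    # k(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2)
svm.fit(X, y)
print(svm.score(X, y))                       # high accuracy despite the non-linear structure
```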

Kernel PCA

[Plots: projection onto PC1 with linear PCA vs. PC1 with kernel PCA]

[Supervised learning workflow diagram (see above), shown again as a section divider.]

Thanks!
Questions?
@rasbt
[email protected]
https://github.com/rasbt

Additional Slides

Inspiring Literature
P. N. Klein. Coding the Matrix: Linear
Algebra Through Computer Science
Applications. Newtonian Press, 2013.

S. Gutierrez. Data Scientists at Work. Apress, 2014.

R. Schutt and C. O'Neil. Doing Data Science: Straight Talk from the Frontline. O'Reilly Media, Inc., 2013.

R. O. Duda, P. E. Hart, and D. G. Stork. Pattern Classification. 2nd edition. New York, 2001.

Useful Online Resources

https://www.coursera.org/course/ml

http://stats.stackexchange.com

http://www.kaggle.com

My Favorite Tools
http://scikit-learn.org/stable/
http://www.numpy.org
http://pandas.pydata.org

Seaborn

http://stanford.edu/~mwaskom/software/seaborn/
http://ipython.org/notebook.html

Which one to pick?

[Two decision boundaries fit to the same class1/class2 data: a simple model and a more complex one]

Generalization error!
The problem of overfitting
