13 Practical Machine Learning

Machine Learning
Verena Kaynig-Fittkau ([email protected])
Unsupervised Learning
- K-means, mean shift
Supervised Learning
- data points, labels, features
[Figure: labeled data points (features x1, x2) divided by a separating hyperplane]
Features are important
- roundness, weight
- shape, color
Google's Self-Driving Car
Car Features:
- Laser scan
- Intensity model
- Elevation model
- Lane model
- Camera vision
- 2D stationary map
So just measure everything? More features = better classification?
- Practical issues: data volume, computation overhead
- Theoretical issues: generalization performance, curse of dimensionality
Supervised Learning (recap)
[Figure: labeled data points (features x1, x2) divided by a separating hyperplane]
Perceptron
- x: data point
- y: label
- w: weight vector
- b: bias
[Figure: inputs x1, x2, x3 weighted by w1, w2, w3 plus bias b; the unit outputs +1 or -1]
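The perceptron above can be sketched in a few lines of plain Python (a toy implementation; the data, learning rate, and epoch count are illustrative, not from the slides):

```python
# Perceptron sketch: learn w and b so that sign(w.x + b) matches the +1/-1 labels.

def perceptron_train(points, labels, epochs=20, lr=1.0):
    dim = len(points[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            # Update only when the point is misclassified.
            if y * activation <= 0:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def perceptron_predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

# Linearly separable toy data: +1 above the line x1 + x2 = 1.
points = [(0.0, 0.0), (1.0, 1.0), (0.2, 0.1), (1.5, 0.8)]
labels = [-1, 1, -1, 1]
w, b = perceptron_train(points, labels)
```

On linearly separable data like this, the update rule is guaranteed to converge; on XOR-style data (next slide) it never will.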
The XOR Problem
[Figure: XOR-labeled points are not linearly separable]
Support Vector Machine
- Widely used for all sorts of classification problems
  www.clopinet.com/isabelle/Projects/SVM/applist.html
[Figure: two candidate separating hyperplanes in (x1, x2); the SVM chooses the one with maximal margin]
What about outliers?
- ξ: slack variables
[Figure: soft-margin SVM in (x1, x2) with slack variables for points violating the margin]
XOR Problem Revisited
[Figure: XOR data made separable by mapping into a higher-dimensional feature space]

Kernel Trick for SVMs
- Polynomial: K(x, x') = (x · x' + c)^d
- Radial basis function (RBF): K(x, x') = exp(-γ ‖x - x'‖²)
- Arbitrarily many dimensions at little computational cost
- Maximal margin helps with the curse of dimensionality
SVM Applet
http://www.ml.inf.ethz.ch/education/lectures_and_seminars/annex_estat/Classifier/JSupportVectorApplet.html
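The two kernels named above, in their standard forms (the hyperparameters c, d, and gamma are assumed defaults, not values from the slides):

```python
import math

def polynomial_kernel(x, y, c=1.0, d=2):
    # (x . y + c)^d: inner product in an implicit polynomial feature space.
    return (sum(a * b for a, b in zip(x, y)) + c) ** d

def rbf_kernel(x, y, gamma=1.0):
    # exp(-gamma * ||x - y||^2): 1 for identical points, decays with distance.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)
```

The RBF kernel corresponds to an infinite-dimensional feature space, yet each evaluation costs only one squared distance, which is the point of the kernel trick.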
Tips and Tricks
- SVMs are not scale invariant
- Check if your library normalizes by default
- Normalize your data: mean 0, std 1, or map to [0, 1] or [-1, 1]
- Normalize the test set in the same way!
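A minimal sketch of the normalization advice: fit mean and std on the training data only, then apply the same statistics to the test set (the feature values are made up):

```python
def fit_standardizer(column):
    n = len(column)
    mean = sum(column) / n
    var = sum((v - mean) ** 2 for v in column) / n
    std = var ** 0.5 or 1.0  # guard against constant features
    return mean, std

train = [2.0, 4.0, 6.0, 8.0]
test = [5.0, 10.0]

mean, std = fit_standardizer(train)
train_scaled = [(v - mean) / std for v in train]
test_scaled = [(v - mean) / std for v in test]  # reuse the TRAIN mean/std!
```

Refitting the standardizer on the test set would leak test statistics and silently shift the feature scale the classifier was trained on.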
Tips and Tricks
- RBF kernel is a good default
- For parameters, try exponential sequences
- Read: Chih-Wei Hsu et al., A Practical Guide to Support Vector Classification, Bioinformatics (2010)
Parameter Tuning
Given a classification task:
- Which kernel?
- Which kernel parameter values?
- Which value for C?
Try different combinations and take the best: Grid Search
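Grid search over exponential sequences can be sketched as below (the ranges follow common practice from guides like Hsu et al.; the scoring function is a stand-in for cross-validated accuracy):

```python
def grid_search(score, Cs, gammas):
    # Evaluate every (C, gamma) combination and keep the best-scoring one.
    best = max((score(C, g), C, g) for C in Cs for g in gammas)
    return best[1], best[2]

# Exponential sequences, e.g. C in 2^-5 .. 2^15, gamma in 2^-15 .. 2^3:
Cs = [2.0 ** k for k in range(-5, 16, 2)]
gammas = [2.0 ** k for k in range(-15, 4, 2)]

# Toy score with a known optimum at C = 2, gamma = 2^-3:
def toy_score(C, g):
    return -abs(C - 2.0) - abs(g - 0.125)

best_C, best_gamma = grid_search(toy_score, Cs, gammas)
```

In practice the score would be cross-validated accuracy of an SVM trained with those parameters, and a second, finer grid is often run around the best coarse-grid point.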
[Figure: toy decision tree choosing between "call friend", "read book", "watch tv", and "play computer" via questions such as "got electricity?" and "got new dvd?"]
Decision Trees
- Fast training
- Fast prediction
- Easy to understand
- Easy to interpret
Decision Tree - Idea
[Figure: axis-aligned splits partition the feature space into regions A-E]
(Bishop, Pattern Recognition and Machine Learning, Springer, 2006)
Decision Tree - Prediction
[Figure: a query descends the tree from the root to one of the leaf regions A-E]
Decision Tree - Training
Learn the tree structure:
- which feature to query
- which threshold to choose
[Figure: tree with leaf regions A-E]
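Choosing the feature and threshold for one node can be sketched as an exhaustive scan that minimizes the weighted Gini impurity of the two children (the data and class names are made up for illustration):

```python
def gini(labels):
    # Gini impurity of one node: 1 - sum of squared class fractions.
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(points, labels):
    # Try every feature and every observed threshold; keep the split
    # whose children have the lowest weighted impurity.
    best = (None, None, float("inf"))
    for f in range(len(points[0])):
        for threshold in sorted({x[f] for x in points}):
            left = [y for x, y in zip(points, labels) if x[f] <= threshold]
            right = [y for x, y in zip(points, labels) if x[f] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best[2]:
                best = (f, threshold, score)
    return best

points = [(1.0, 5.0), (2.0, 4.0), (7.0, 5.5), (8.0, 4.5)]
labels = ["A", "A", "B", "B"]
feature, threshold, impurity = best_split(points, labels)
```

Real implementations sort each feature once and sweep thresholds incrementally instead of recomputing counts, but the criterion is the same.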
Node Purity
[Figure: candidate splits compared by the class counts they produce in regions A-E]
When to Stop
- node contains only one class
- node contains less than x data points
- max depth is reached
- node purity is sufficient
- you start to overfit => cross-validation
Decision
Trees
-
Disadvantages
Sensi%ve
to
small
changes
in
the
data
Overtng
Only
axis
aligned
splits
Decision Trees vs SVM
(Hastie et al., The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer, 2009)
Wisdom of Crowds
"The collective knowledge of a diverse and independent body of people typically exceeds the knowledge of any single individual, and can be harnessed by voting." (James Surowiecki)
http://socialmedia4srm.wordpress.com/
Ensemble Methods
- A single decision tree does not perform well
- But, it is super fast
- What if we learn multiple trees?
[Figure: decision boundaries of multiple trees combined in (x1, x2)]
Bagging
- Reduces overfitting (variance)
- Normally uses one type of classifier
- Decision trees are popular
- Easy to parallelize
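The core of bagging is the bootstrap sample: each ensemble member trains on n points drawn with replacement, so on average only about 63% of the distinct points appear in any one sample. A minimal sketch (sample size and seed are arbitrary):

```python
import random

random.seed(0)

def bootstrap_sample(n):
    # Draw n indices with replacement from 0..n-1.
    return [random.randrange(n) for _ in range(n)]

n = 1000
sample = bootstrap_sample(n)
unique_fraction = len(set(sample)) / n  # close to 1 - 1/e, about 0.632
```

Each tree would be trained on `[data[i] for i in bootstrap_sample(n)]`, and predictions are combined by majority vote.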
Boosting
- Also an ensemble method like bagging
- But: weak learners evolve over time, votes are weighted
[Figure: boosted decision boundary in (x1, x2)]
AdaBoost
- Initialize weights for data points
- For each iteration:
  - Fit classifier to training data
  - Compute weighted classification error
  - Compute weight for classifier from the error
  - Update weights for data points
- Final classifier is weighted sum of all single classifiers
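The steps above can be sketched with 1-D threshold stumps as weak learners (a toy setup, not the slide's example; the data and iteration count are made up):

```python
import math

xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1, 1, -1, -1, 1, 1]  # not separable by any single threshold

def make_stump(threshold, sign):
    return lambda x: sign if x < threshold else -sign

stumps = [make_stump(t + 0.5, s) for t in range(6) for s in (1, -1)]

n = len(xs)
weights = [1.0 / n] * n          # initialize weights for data points
ensemble = []                    # (alpha, stump) pairs
for _ in range(5):               # for each iteration:
    def weighted_error(h):       # fit: pick stump with lowest weighted error
        return sum(w for w, x, y in zip(weights, xs, ys) if h(x) != y)
    h = min(stumps, key=weighted_error)
    err = max(weighted_error(h), 1e-10)
    alpha = 0.5 * math.log((1 - err) / err)  # classifier weight from error
    ensemble.append((alpha, h))
    # Update data point weights: up-weight the mistakes, then renormalize.
    weights = [w * math.exp(-alpha * y * h(x))
               for w, x, y in zip(weights, xs, ys)]
    total = sum(weights)
    weights = [w / total for w in weights]

def predict(x):  # final classifier: weighted vote of all stumps
    return 1 if sum(a * h(x) for a, h in ensemble) > 0 else -1
```

After a few rounds the weighted vote of stumps fits this non-separable pattern, even though every individual stump is wrong on at least a third of the points.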
AdaBoost
(Hastie et al., The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer, 2009)
AdaBoost
- Introduced by Freund and Schapire in 1995
- Worked great, nobody understood why!
http://www.andrewbuntine.com/articles/about/fun
Random Forest
- All trees are fully grown, no pruning
- Two parameters: number of trees, number of features
Random Forest Error Rate
Error depends on:
- Correlation between trees (higher is worse)
- Strength of single trees (higher is better)
Out of Bag Error
[Figure: each bootstrap sample trains a tree; the unused data points serve as that tree's test set]
- Very similar to cross-validation
- Measured during training
- Can be too optimistic
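The out-of-bag set for one tree is simply whatever the bootstrap sample missed; a minimal sketch (sample size and seed are arbitrary):

```python
import random

random.seed(1)
n = 20
bootstrap = [random.randrange(n) for _ in range(n)]  # indices drawn with replacement
in_bag = set(bootstrap)
out_of_bag = set(range(n)) - in_bag  # roughly 37% of points, free test set for this tree
```

The OOB error averages, over all points, each tree's error on the points it never saw during training, which is why it behaves like built-in cross-validation.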
Variable Importance
- Again use out-of-bag samples
- Predict class for these samples
- Randomly permute values of one feature
- Predict classes again
- Measure decrease in accuracy
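The permutation idea above can be sketched with a stand-in classifier: feature 0 carries the signal and feature 1 is noise, so shuffling feature 0 should hurt accuracy more (the data and model are made up for illustration):

```python
import random

random.seed(2)
points = [(float(i), random.random()) for i in range(10)]
labels = [1 if x[0] >= 5 else -1 for x in points]

def classify(x):
    # Toy model that only looks at feature 0.
    return 1 if x[0] >= 5 else -1

def accuracy(pts):
    return sum(classify(x) == y for x, y in zip(pts, labels)) / len(pts)

def permuted_accuracy(pts, feature):
    # Shuffle one feature column, leave the others untouched.
    column = [x[feature] for x in pts]
    random.shuffle(column)
    shuffled = [tuple(column[i] if f == feature else v
                      for f, v in enumerate(x))
                for i, x in enumerate(pts)]
    return accuracy(shuffled)

baseline = accuracy(points)                    # 1.0 by construction
drop_f0 = baseline - permuted_accuracy(points, 0)
drop_f1 = baseline - permuted_accuracy(points, 1)
```

In a random forest the same measurement is done per tree on its out-of-bag samples and averaged, which avoids reusing training points.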
Tempting Scenario
- Run random forest with all features
- Reduce number of features based on importance weights
- Run again with reduced feature set and report out-of-bag error
[Figure: oversampling and subsampling of the training data]
Random Forest
- Similar to bagging
- Easy to parallelize
- Packaged with some neat functions:
  - Out-of-bag error
  - Feature importance measure
  - Proximity estimation
Cascade Classifier
- Ensemble methods are strong
- But prediction is slow
- Solution: make prediction faster
- Idea: build a cascade
Cascade Classifier
http://en.wikipedia.org/wiki/Viola%E2%80%93Jones_object_detection_framework
Viola Jones Face Detection
http://cvdazzle.com/
Viola Jones Face Detection
- Takes long to train
- Prediction in real time!
TPR and FPR
Confusion matrix:
             predicted 1   predicted -1
true  1          tp            fn (false negative)
true -1          fp            tn
- True Positive Rate: TPR = tp / (tp + fn)
- False Positive Rate: FPR = fp / (fp + tn)
Precision Recall
             predicted 1   predicted -1
true  1          tp            fn
true -1          fp            tn
- Recall: tp / (tp + fn)
- Precision: tp / (tp + fp)

Precision Recall Curve
[Figure: precision (y-axis, 0 to 1) plotted against recall (x-axis, 0 to 1)]
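The four counts and the derived rates can be computed directly from the +1/-1 labels used in the slides (the example vectors are made up):

```python
true_labels = [1, 1, 1, 1, -1, -1, -1, -1, -1, -1]
predicted   = [1, 1, 1, -1, 1, -1, -1, -1, -1, -1]

# Confusion matrix counts.
tp = sum(t == 1 and p == 1 for t, p in zip(true_labels, predicted))
fn = sum(t == 1 and p == -1 for t, p in zip(true_labels, predicted))
fp = sum(t == -1 and p == 1 for t, p in zip(true_labels, predicted))
tn = sum(t == -1 and p == -1 for t, p in zip(true_labels, predicted))

tpr = recall = tp / (tp + fn)   # true positive rate = recall
fpr = fp / (fp + tn)            # false positive rate
precision = tp / (tp + fp)
```

Note that TPR and recall are the same quantity; the pairings differ only in whether the second axis is FPR (ROC analysis) or precision.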
Comparison
- Usual case: increasing β (in the F_β measure) allocates more weight to recall
Clustering Evaluation Criteria
- Based on expert knowledge
- Debatable for real data
- Hidden/unknown structures could be present
- Do we even want to just reproduce known structure?
Rand Index
- Percentage of correct classifications
- Compare pairs of elements: tp, tn, fn, fp
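A minimal sketch of the pair-counting idea: a pair counts as correct when the clustering and the ground truth agree on whether its two elements belong together (the labels and clusters are made up):

```python
from itertools import combinations

def rand_index(truth, clustering):
    # Fraction of pairs on which truth and clustering agree
    # (both put the pair together, or both keep it apart).
    agree = 0
    pairs = list(combinations(range(len(truth)), 2))
    for i, j in pairs:
        same_truth = truth[i] == truth[j]
        same_cluster = clustering[i] == clustering[j]
        if same_truth == same_cluster:
            agree += 1
    return agree / len(pairs)

truth      = ["red", "red", "green", "green", "blue"]
clustering = [0, 0, 1, 1, 1]
ri = rand_index(truth, clustering)
```

Here the clustering merges "green" and "blue", so the two pairs that mix those classes count against it: 8 of 10 pairs agree, giving a Rand index of 0.8.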
Random sample: red: 4/10, green: 3/10, blue: 3/10
Misclassification: red: 4/10 * (3/10 + 3/10)
[Figure: true vs. wrong class prediction]
Gini Impurity
- Number of classes: m
- Number of data points: n
- Number of data points of class i: n_i
- I_G = sum over i of (n_i / n) * (1 - n_i / n)
http://en.wikipedia.org/wiki/C4.5_algorithm
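The quantities above translate directly into code: with n_i points of class i out of n, the class fractions are f_i = n_i / n and the impurity is the sum of f_i * (1 - f_i):

```python
def gini_impurity(class_counts):
    # class_counts[i] = number of data points of class i in the node.
    n = sum(class_counts)
    fractions = [c / n for c in class_counts]
    return sum(f * (1 - f) for f in fractions)  # equivalently 1 - sum(f^2)

pure = gini_impurity([10, 0])   # all points in one class: impurity 0
mixed = gini_impurity([5, 5])   # 50/50 split: worst case for two classes
```

This is the probability of mislabeling a randomly drawn point if labels were assigned at random according to the node's class fractions, exactly the red/green/blue calculation above.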