
Machine learning is the study of algorithms that improve their performance on tasks through experience, utilizing statistics for inference and computer science for efficient algorithms. It has applications in various fields such as speech recognition, natural language processing, and medical analysis, with methods including supervised, unsupervised, and reinforcement learning. The document also discusses classification, regression, and clustering strategies, emphasizing the importance of model generalization and the bias-variance trade-off in building effective classifiers.

Uploaded by

kushalgangwar98


Machine Learning

What is Machine Learning?

• Machine Learning
– Study of algorithms that
– improve their performance
– at some task
– with experience
• Optimize a performance criterion using example data
or experience.
• Role of Statistics: Inference from a sample
• Role of Computer science: Efficient algorithms to
– Solve the optimization problem
– Represent and evaluate the model for inference
Growth of Machine Learning
• Machine learning is the preferred approach to
– Speech recognition, Natural language processing
– Computer vision
– Medical outcomes analysis
– Robot control
– Computational biology
• This trend is accelerating
– Improved machine learning algorithms
– Improved data capture, networking, faster computers
– Software too complex to write by hand
– New sensors / IO devices
– Demand for self-customization to user, environment
– It turns out to be difficult to extract knowledge from human experts →
failure of expert systems in the 1980s.
Alpaydin & Ch. Eick: ML Topic1
Applications
• Association Analysis
• Supervised Learning
– Classification
– Regression/Prediction
• Unsupervised Learning
• Reinforcement Learning

Learning Associations
• Basket analysis:
P (Y | X ) probability that somebody who buys X also
buys Y where X and Y are products/services.

Example: P ( chips | beer ) = 0.7

Market-Basket transactions
TID   Items
1     Bread, Milk
2     Bread, Diaper, Beer, Eggs
3     Milk, Diaper, Beer, Coke
4     Bread, Milk, Diaper, Beer
5     Bread, Milk, Diaper, Coke
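
The conditional probability P(Y | X) can be estimated directly from the transactions in the table above. The following is a minimal sketch, assuming P(Y | X) is estimated as the fraction of baskets containing X that also contain Y; the function name conditional_probability is hypothetical.

```python
# Market-basket transactions from the table above (TID -> items).
transactions = {
    1: {"Bread", "Milk"},
    2: {"Bread", "Diaper", "Beer", "Eggs"},
    3: {"Milk", "Diaper", "Beer", "Coke"},
    4: {"Bread", "Milk", "Diaper", "Beer"},
    5: {"Bread", "Milk", "Diaper", "Coke"},
}


def conditional_probability(y, x, baskets):
    """Estimate P(y | x): fraction of baskets containing x that also contain y."""
    with_x = [items for items in baskets.values() if x in items]
    if not with_x:
        raise ValueError(f"no basket contains {x!r}")
    return sum(1 for items in with_x if y in items) / len(with_x)


# Diaper appears in baskets 2, 3, 4, 5; Beer appears in three of those.
p_beer_given_diaper = conditional_probability("Beer", "Diaper", transactions)
print(p_beer_given_diaper)  # 0.75
```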
Classification
• Example: Credit scoring
• Differentiating between low-risk and high-risk customers from their income and savings
Discriminant: IF income > θ1 AND savings > θ2
THEN low-risk ELSE high-risk
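
The discriminant above translates directly into code. This is only a sketch: the threshold values THETA1 and THETA2 are hypothetical placeholders, since in practice θ1 and θ2 would be learned from labeled customer data.

```python
# Hypothetical thresholds; in practice theta1 and theta2 are learned from data.
THETA1 = 30_000  # income threshold (assumed value)
THETA2 = 5_000   # savings threshold (assumed value)


def credit_risk(income, savings, theta1=THETA1, theta2=THETA2):
    """IF income > theta1 AND savings > theta2 THEN low-risk ELSE high-risk."""
    if income > theta1 and savings > theta2:
        return "low-risk"
    return "high-risk"


print(credit_risk(50_000, 10_000))  # low-risk
print(credit_risk(50_000, 1_000))   # high-risk
```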

Classification: Applications
• Aka Pattern recognition
• Face recognition: Pose, lighting, occlusion (glasses,
beard), make-up, hair style
• Character recognition: Different handwriting styles.
• Speech recognition: Temporal dependency.
– Use of a dictionary or the syntax of the language.
– Sensor fusion: Combine multiple modalities; e.g., visual (lip
image) and acoustic for speech
• Medical diagnosis: From symptoms to illnesses.
Face Recognition

Training examples of a person

Test images

AT&T Laboratories, Cambridge UK

http://www.uk.research.att.com/facedatabase.html
Prediction: Regression

• Example: Price of a used car
• x : car attributes
y : price
y = wx + w0
• y = g (x | θ ), where g ( ) is the model and θ are its parameters
Regression Applications

• Navigating a car: Angle of the steering wheel.

Different Data Analysis Tasks

• Classification
– Assign a category (i.e., a class) for a new instance
• Clustering
– Form clusters (i.e., groups) with a set of instances
• Pattern detection
– Identify regularities (i.e., patterns) in temporal or spatial data
• Simulation
– Define mathematical formulas that can generate data similar to observations collected
Supervised Learning: Uses
Example: decision tree tools that create rules

• Prediction of future cases: Use the rule to predict the output for future inputs
• Knowledge extraction: The rule is easy to understand
• Compression: The rule is simpler than the data it
explains
• Outlier detection: Exceptions that are not covered by
the rule, e.g., fraud

Unsupervised Learning
• Unsupervised learning is a type of machine learning
algorithm used to draw inferences from datasets
consisting of input data without labeled responses.
• Clustering: Grouping similar instances
• Other applications:
– Predicting the weather
– Calculating the height of a person in the school.
– Summarization.

Reinforcement Learning
• Topics:
– Policies: what actions should an agent take in a particular
situation
– Utility estimation: how good is a state (→used by policy)
• No supervised output but delayed reward
• Credit assignment problem (what was responsible for
the outcome)
• Applications:
– Game playing
– Robot in a maze
– Multiple agents, partial observability, ...
Clustering Strategies
• K-means
– Iteratively re-assign points to the nearest cluster
center
• Agglomerative clustering
– Start with each point as its own cluster and iteratively
merge the closest clusters
• Mean-shift clustering
– Estimate modes of pdf
• Spectral clustering
– Split the nodes in a graph based on assigned links with
similarity weights

As we go down this chart, the clustering strategies have a greater tendency to transitively group points even if they are not nearby in feature space
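
The K-means step of iteratively re-assigning points to the nearest cluster center can be sketched as follows. This is a minimal illustration, assuming Euclidean distance, a fixed number of iterations, and caller-supplied initial centers; the example points are made up.

```python
import math


def kmeans(points, centers, iterations=10):
    """Alternate between assigning points to the nearest center and
    moving each center to the mean of its assigned points."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: an empty cluster keeps its previous center.
        centers = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
    return centers, clusters


points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centers, clusters = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
print(centers)
```

The result depends on the initial centers, which is why K-means is often run several times with different starting points.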
The machine learning framework
• Apply a prediction function to a feature representation of the image to get the desired output:

f([image]) = “apple”
f([image]) = “tomato”
f([image]) = “cow”
Slide credit: L. Lazebnik
The machine learning framework
y = f(x)
where y is the output, f is the prediction function, and x is the image feature

• Training: given a training set of labeled examples {(x1,y1), …, (xN,yN)}, estimate the prediction function f by minimizing the prediction error on the training set
• Testing: apply f to a never-before-seen test example x and output the predicted value y = f(x)
Slide credit: L. Lazebnik
Steps
• Training: Training images → Image features → Training (with training labels) → Learned model
• Testing: Test image → Image features → Learned model → Prediction
Slide credit: D. Hoiem and L. Lazebnik
Features
• Raw pixels

• Histograms

• GIST descriptors

• …
Slide credit: L. Lazebnik
Classifiers: Nearest neighbor

[Figure: training examples from class 1, training examples from class 2, and a test example]

f(x) = label of the training example nearest to x

• All we need is a distance function for our inputs

• No training required!
Slide credit: L. Lazebnik
Classifiers: Linear

• Find a linear function to separate the classes:

f(x) = sgn(w · x + b)

Slide credit: L. Lazebnik
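
The linear decision rule f(x) = sgn(w · x + b) can be sketched in a few lines. The weights w and bias b below are hypothetical values chosen for illustration; mapping a score of exactly 0 to +1 is also an assumption, since the slides do not say how ties are handled.

```python
def linear_classify(w, x, b):
    """Return +1 or -1 according to the sign of w . x + b."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1  # assumption: score 0 maps to +1


w = (1.0, 1.0)  # hypothetical weights
b = -5.0        # hypothetical bias

print(linear_classify(w, (4.0, 3.0), b))  # score  2.0 -> 1
print(linear_classify(w, (1.0, 1.0), b))  # score -3.0 -> -1
```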

Many classifiers to choose from
• SVM
• Neural networks
• Naïve Bayes
• Bayesian network
• Logistic regression
• Randomized Forests
• Boosted Decision Trees
• K-nearest neighbor
• RBMs
• Etc.
Which is the best one?

Slide credit: D. Hoiem

Recognition task and supervision
• Images in the training set must be annotated with the
“correct answer” that the model is expected to produce

Contains a motorbike

Slide credit: L. Lazebnik

Generalization

[Figure: training set (labels known) and test set (labels unknown)]

• How well does a learned model generalize from the data it was trained on to a new test set?
Slide credit: L. Lazebnik
Generalization
• Components of generalization error
– Bias: how much does the average model over all training sets differ
from the true model?
• Error due to inaccurate assumptions/simplifications made by
the model
– Variance: how much models estimated from different training
sets differ from each other
• Underfitting: model is too “simple” to represent all the
relevant class characteristics
– High bias and low variance
– High training error and high test error
• Overfitting: model is too “complex” and fits irrelevant
characteristics (noise) in the data
– Low bias and high variance
– Low training error and high test error
Slide credit: L. Lazebnik
Bias-Variance Trade-off

• Models with too few parameters are inaccurate because of a large bias (not enough flexibility).

• Models with too many parameters are inaccurate because of a large variance (too much sensitivity to the sample).

Slide credit: D. Hoiem

Bias-variance tradeoff

[Figure: training error and test error plotted against model complexity. Low complexity: underfitting, high bias, low variance. High complexity: overfitting, low bias, high variance.]

Slide credit: D. Hoiem

Remember…
• No classifier is inherently better than any other: you need to make assumptions to generalize

• Three kinds of error
– Inherent: unavoidable
– Bias: due to over-simplifications
– Variance: due to inability to perfectly estimate parameters from limited data

Slide credit: D. Hoiem
How to reduce variance?

• Choose a simpler classifier

• Regularize the parameters

• Get more training data

Slide credit: D. Hoiem

Very brief tour of some classifiers
• K-nearest neighbor
• SVM
• Boosted Decision Trees
• Neural networks
• Naïve Bayes
• Bayesian network
• Logistic regression
• Randomized Forests
• RBMs
• Etc.
Classification
• Assign input vector to one of two or more
classes
• Any decision rule divides input space into
decision regions separated by decision
boundaries

Slide credit: L. Lazebnik

Nearest Neighbor Classifier

• Assign label of nearest training data point to each test data point

[Figure: Voronoi partitioning of feature space for two-category 2D and 3D data, from Duda et al.]
Source: D. Lowe
K-nearest neighbor

[Figure: 2D scatter plot of training points from two classes (x and o) with two test points (+); axes x1, x2. The same plot is repeated for 1-nearest neighbor, 3-nearest neighbor, and 5-nearest neighbor.]
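
A k-nearest-neighbor prediction takes a vote among the k training points closest to the query. The sketch below assumes Euclidean distance and uses made-up 2D points; knn_predict is a hypothetical helper name. It shows how the predicted label can change between k = 1 and k = 3 when the single nearest point is an outlier.

```python
import math
from collections import Counter


def knn_predict(train, query, k=1):
    """Predict the majority label among the k training points nearest to query.

    train is a list of (point, label) pairs.
    """
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Made-up training data: class "x" in the lower left, class "o" in the
# upper right, plus one "o" point lying close to the "x" group.
train = [
    ((1.0, 1.0), "x"), ((1.0, 2.0), "x"), ((2.0, 2.0), "x"),
    ((6.0, 6.0), "o"), ((7.0, 6.0), "o"), ((6.0, 7.0), "o"),
    ((3.0, 3.0), "o"),
]
query = (3.2, 3.2)

print(knn_predict(train, query, k=1))  # o
print(knn_predict(train, query, k=3))  # x
print(knn_predict(train, query, k=5))  # x
```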
Classifiers: Logistic Regression

• Maximize likelihood of label given data, assuming a log-linear model

[Figure: male and female examples plotted by pitch of voice (x1) and height]

log [ P(x1, x2 | y = 1) / P(x1, x2 | y = −1) ] = w^T x

P(y = 1 | x1, x2) = 1 / (1 + exp(−w^T x))
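
The posterior formula above is a logistic (sigmoid) function of the score w^T x. A minimal sketch, assuming the weight vector w has already been estimated; the weights and inputs below are hypothetical, and fitting w by maximum likelihood is not shown.

```python
import math


def predict_proba(w, x):
    """P(y = 1 | x) = 1 / (1 + exp(-w^T x)) for a given weight vector w."""
    score = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-score))


w = (1.0, -1.0)  # hypothetical weights
x = (3.0, 1.0)   # hypothetical feature vector (x1, x2)

p = predict_proba(w, x)  # score w^T x = 2.0
print(p)
```

A score of 0 gives a probability of exactly 0.5, so the decision boundary w^T x = 0 is linear in x.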
Classifiers: Linear SVM

[Figure: 2D scatter plot of two classes (x and o); axes x1, x2]

• Find a linear function to separate the classes:
f(x) = sgn(w · x + b)
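Nonlinear SVMs
• Datasets that are linearly separable work out great:

[Figure: data plotted along the x axis]

• But what if the dataset is just too hard?

[Figure: data plotted along the x axis]

• We can map it to a higher-dimensional space:

[Figure: data plotted against x]
Slide credit: Andrew Moore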

Nonlinear SVMs
• General idea: the original input space can
always be mapped to some higher-dimensional
feature space where the training set is
separable:

Φ: x → φ(x)

Slide credit: Andrew Moore
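
The idea of mapping to a higher-dimensional feature space can be illustrated with a small 1D example. Everything specific here is an assumption made for illustration: the data points, the map φ(x) = (x, x²), and the separating weights w and b are hypothetical and not taken from the slides.

```python
def phi(x):
    """Hypothetical feature map from 1D input to 2D feature space: x -> (x, x^2)."""
    return (x, x * x)


# Made-up 1D data: class -1 sits in the middle and class +1 on both sides,
# so no single threshold on x separates the two classes.
samples = [(-3, 1), (-2, 1), (-1, -1), (0, -1), (1, -1), (2, 1), (3, 1)]

# In the mapped space the classes are separated by the line x^2 = 2.5.
w, b = (0.0, 1.0), -2.5


def classify(x):
    z = phi(x)
    return 1 if w[0] * z[0] + w[1] * z[1] + b > 0 else -1


predictions = [classify(x) for x, _ in samples]
print(predictions)  # [1, 1, -1, -1, -1, 1, 1]
```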

Classifiers: Decision Trees

[Figure: 2D scatter plot of two classes (x and o); axes x1, x2]
Classification Process

1. Classification tasks
2. Building a classifier
3. Evaluating a classifier

Classifying Mushrooms

◆ What mushrooms are edible, i.e., not poisonous?
◆ Book lists many kinds of mushrooms identified as either edible, poisonous, or unknown edibility
◆ Given a new kind of mushroom not listed in the book, is it edible?

https://archive.ics.uci.edu/ml/datasets/Mushroom
Classifying Iris Plants

◆ Iris flowers have different sepal and petal shapes:
  ◆ Iris Setosa
  ◆ Iris Versicolour
  ◆ Iris Virginica
◆ Suppose you are shown lots of examples of each type. Given a new iris flower, what type is it?

https://en.wikipedia.org/wiki/Iris_setosa
https://en.wikipedia.org/wiki/Iris_versicolor
1. Classification Tasks

Classification Tasks

◆ Given:
  ◆ A set of classes
  ◆ Instances (examples) of each class
◆ Generate: A method (aka model) that, when given a new instance, will determine its class

http://www.business-insight.com/html/intelligence/bi_overfitting.html
Classification Tasks

◆ Given:
  ◆ A set of classes
  ◆ Instances of each class
◆ Generate: A method that, when given a new instance, will determine its class
◆ Instances are described as a set of features or attributes and their values
◆ The class that the instance belongs to is also called its “label”
◆ Input is a set of “labeled instances”
Classification Tasks

◆ Given: A set of labeled instances
◆ Generate: A method (aka model) that, when given a new instance, will hypothesize its class
Classifying a New Instance

Classifying New Instances

Training and Test Sets
Training instances
(training set)

Test instances
(test set)

Contamination
Training instances
(training set)

Test instances
(test set)

When training and test sets overlap – this should NEVER happen
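
One simple way to avoid contamination is to shuffle the labeled instances once and cut the shuffled list into two disjoint parts. A minimal sketch, assuming a 25% test fraction and a fixed random seed; train_test_split here is a hypothetical helper, not a library call.

```python
import random


def train_test_split(instances, test_fraction=0.25, seed=0):
    """Shuffle a copy of the instances and cut it into disjoint train/test lists."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


instances = list(range(20))  # stand-ins for 20 labeled instances
train, test = train_test_split(instances)

# Contamination check: no instance may appear in both sets.
overlap = set(train) & set(test)
print(len(train), len(test), overlap)  # 15 5 set()
```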

About Classification Tasks

◆ Classes must be disjoint, i.e., each instance belongs to only one class
◆ Classification tasks are “binary” if there are only two classes
◆ The classification method will rarely be perfect; it will make mistakes in its classification of new instances
2. Building a Classifier

What is a Modeler?
◆ A mathematical/algorithmic approach to generalize from instances so it can make predictions about instances that it has not seen before
◆ Its output is called a model
Types of Modelers/Models

◆ Logistic regression

◆ Naïve Bayes classifiers

◆ Support vector machines (SVMs)

◆ Decision trees

◆ Random forests

◆ Kernel methods

◆ Genetic algorithms

◆ Neural networks
http://tjo-en.hatenablog.com/entry/2014/01/06/234155
What Modeler to Choose?

◆ Logistic regression
◆ Naïve Bayes classifiers
◆ Support vector machines (SVMs)
◆ Decision trees
◆ Random forests
◆ Kernel methods
◆ Genetic algorithms (GAs)
◆ Neural networks: perceptrons

◆ Data scientists try different modelers, with different parameters, and check the accuracy to figure out which one works best for the data at hand
Ensembles
◆ An ensemble method uses several algorithms that do the same task, and combines their results
  ◆ “Ensemble learning”
◆ A combination function joins the results
  ◆ Majority vote: each algorithm gets a vote
  ◆ Weighted voting: each algorithm’s vote has a weight
  ◆ Other complex combination functions
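
The two combination functions named above, majority vote and weighted voting, can be sketched as follows. The predictions and weights are made up for illustration, and how the weights would be chosen is left open.

```python
from collections import Counter, defaultdict


def majority_vote(predictions):
    """Each algorithm gets one vote; return the most common label."""
    return Counter(predictions).most_common(1)[0][0]


def weighted_vote(predictions, weights):
    """Each algorithm's vote counts with its weight; return the heaviest label."""
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


# Made-up outputs of three classifiers for one mushroom.
predictions = ["edible", "poisonous", "edible"]
weights = [0.2, 0.7, 0.1]  # hypothetical weights

print(majority_vote(predictions))           # edible
print(weighted_vote(predictions, weights))  # poisonous
```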

http://magizbox.com/index.php/machine-learning/ds-model-building/ensemble/
3. Evaluating a Classifier

Classification Accuracy

◆ Accuracy: percentage of correct classifications

Accuracy = (Total test instances classified correctly) / (Total number of test instances)
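
Accuracy is straightforward to compute from the true and predicted labels of the test instances. A minimal sketch with made-up labels; it returns a fraction between 0 and 1, which can be multiplied by 100 to get a percentage.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of test instances whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Made-up labels for four test instances.
true_labels = ["edible", "poisonous", "edible", "edible"]
predicted_labels = ["edible", "poisonous", "poisonous", "edible"]

acc = accuracy(true_labels, predicted_labels)
print(acc)  # 0.75
```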

Evaluating a Classifier:
n-fold Cross Validation
◆ Suppose m labeled
instances
◆ Divide into n subsets
(“folds”) of equal
size

◆ Run classifier n times,

with each of the subsets
as the test set
◆ The rest (n-1) for
training
◆ Each run gives an
accuracy result
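
The fold construction described above can be sketched as an index generator. This is an illustration only: it splits the instances in their given order without shuffling, and when m is not divisible by n it gives the first few folds one extra instance, which is an assumption beyond the equal-size folds on the slide.

```python
def n_fold_splits(m, n):
    """Yield (train_indices, test_indices) for each of n folds over m instances."""
    fold_sizes = [m // n + (1 if i < m % n else 0) for i in range(n)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, m))
        yield train, test
        start += size


splits = list(n_fold_splits(10, 5))
for train_idx, test_idx in splits:
    print(test_idx, train_idx)
```

Each instance appears in exactly one test fold, so every run evaluates the classifier on data it was not trained on.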
Translated from image by Joan.domenech91 (Own work) [CC BY-SA 3.0
(http://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons
(https://commons.wikimedia.org/wiki/File:K-fold_cross_validation.jpg)
Evaluating a Classifier:
Confusion Matrix

                  Classified positive    Classified negative
Actual positive   True positive (TP)     False negative (FN)
Actual negative   False positive (FP)    True negative (TN)

TP: number of positive examples classified correctly
FN: number of positive examples classified incorrectly
FP: number of negative examples classified incorrectly
TN: number of negative examples classified correctly
Evaluating a Classifier:
Precision and Recall

TP: number of positive examples classified correctly
FN: number of positive examples classified incorrectly
FP: number of negative examples classified incorrectly
TN: number of negative examples classified correctly

Precision = TP / (TP + FP)
Recall = TP / (TP + FN)

Note that the focus is on the positive class
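
The four confusion-matrix counts and the two ratios built from them can be computed as follows. A minimal sketch with made-up labels, where 1 marks the positive class; confusion_counts is a hypothetical helper name.

```python
def confusion_counts(actual, predicted, positive=1):
    """Return (TP, FN, FP, TN) for a binary classification task."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1
        elif a == positive:
            fn += 1
        elif p == positive:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


# Made-up labels for ten test instances (1 = positive, 0 = negative).
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

tp, fn, fp, tn = confusion_counts(actual, predicted)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(tp, fn, fp, tn)     # 3 1 2 4
print(precision, recall)  # 0.6 0.75
```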

Evaluating a Classifier:
Other Metrics

◆ Other accuracy metrics
  ◆ F1-score
  ◆ Receiver Operating Characteristic (ROC) curve
  ◆ Area Under the Curve (AUC)
◆ Other concerns
  ◆ Explainability of classifier results
  ◆ Cost of examples
  ◆ Cost of feature values
  ◆ Labeling
Overfitting
◆ A model overfits the training data when it is very accurate
with that data, and may not do so well with new test data

[Figure: Model 1 and Model 2 shown on the training data and on the test data]
Induction

◆ Induction requires inferring general rules about examples seen in the past
◆ Contrast with deduction: inferring things that are a logical consequence of what we have seen in the past
◆ Classifiers use induction: they generate general rules about the target classes
  ◆ The rules are used to make predictions about new data
  ◆ These predictions can be wrong
When Facing a Classification
Task
◆ What features to choose
  ◆ Try defining different features
  ◆ For some problems, hundreds and maybe thousands of features may be possible
  ◆ Sometimes the features are not directly observable (i.e., there are “latent” variables)
◆ What classes to choose
  ◆ Edible / poisonous?
  ◆ Edible / poisonous / unknown?
◆ How many labeled examples
  ◆ May require a lot of work
◆ What modeler to choose
  ◆ Better to try different ones
What to remember about classifiers

• No free lunch: machine learning algorithms are tools, not dogmas

• Try simple classifiers first

• Better to have smart features and simple classifiers than simple features and smart classifiers

• Use increasingly powerful classifiers with more training data (bias-variance tradeoff)

Slide credit: D. Hoiem

Resources: Datasets
• UCI Repository: http://www.ics.uci.edu/~mlearn/MLRepository.html
• UCI KDD Archive: http://kdd.ics.uci.edu/summary.data.application.html
• Statlib: http://lib.stat.cmu.edu/
• Delve: http://www.cs.utoronto.ca/~delve/
Resources: Journals
• Journal of Machine Learning Research
www.jmlr.org
• Machine Learning
• IEEE Transactions on Neural Networks
• IEEE Transactions on Pattern Analysis and
Machine Intelligence
• Annals of Statistics
• Journal of the American Statistical Association
• ...
Resources: Conferences

• International Conference on Machine Learning (ICML)

• European Conference on Machine Learning (ECML)
• Neural Information Processing Systems (NIPS)
• Computational Learning
• International Joint Conference on Artificial Intelligence (IJCAI)
• ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD)
• IEEE Int. Conf. on Data Mining (ICDM)

Some Machine Learning References

• General
– Tom Mitchell, Machine Learning, McGraw Hill, 1997
– Christopher Bishop, Neural Networks for Pattern
Recognition, Oxford University Press, 1995
• Adaboost
– Friedman, Hastie, and Tibshirani, “Additive logistic
regression: a statistical view of boosting”, Annals of
Statistics, 2000
• SVMs
– http://www.support-vector.net/icml-tutorial.pdf

Slide credit: D. Hoiem
