
Learning Based Vision - II

Fundamentals of Computer Vision – Week 3


Assist. Prof. Özge Öztimur Karadağ
ALKÜ – Department of Computer Engineering
Alanya
Last Week
• A computer program is said to learn from experience E with respect to
some class of tasks T and performance measure P, if its performance
at tasks in T, as measured by P, improves with experience E.
• Task: regression, classification…
• Experience: dataset, Learning: Supervised vs. Unsupervised
• Performance: Mean Squared Error…
• Python example on regression
Today
• Definition of classification
• Types of classifiers
• Example applications
Classification
• In classification problems, each entity in some domain can be placed in one
of a discrete set of categories: yes/no, friend/foe, good/bad/indifferent,
blue/red/green, etc.
• Given a training set of labeled entities, develop a rule for assigning labels to
entities in a test set
• Many variations on this theme:
• binary classification
• multi-category classification
• non-exclusive categories
• ranking
• Many criteria to assess rules and their predictions
• overall errors
• costs associated with different kinds of errors
• operating points
Representation of Objects
• Each object to be classified is represented as a pair
(x, y):
• where x is a description of the object (see examples of
data types in the following slides)
• where y is a label (assumed binary for now)
• Success or failure of a machine learning classifier
often depends on choosing good descriptions of
objects
• the choice of description can also be viewed as a learning
problem, but good human intuitions are often needed
here
Data Types
• Vectorial data:
• physical attributes
• behavioral attributes
• context
• history
• etc
Data Types
• text and hypertext

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">


<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Welcome to FairmontNET</title>
</head>
<STYLE type="text/css">
.stdtext {font-family: Verdana, Arial, Helvetica, sans-serif; font-size: 11px; color: #1F3D4E;}
.stdtext_wh {font-family: Verdana, Arial, Helvetica, sans-serif; font-size: 11px; color: WHITE;}
</STYLE>

<body leftmargin="0" topmargin="0" marginwidth="0" marginheight="0" bgcolor="BLACK">


<TABLE cellpadding="0" cellspacing="0" width="100%" border="0">
<TR>
<TD width=50% background="/TFN/en/CDA/Images/common/labels/decorative_2px_blk.gif">&nbsp;</TD>
<TD><img src="/TFN/en/CDA/Images/common/labels/decorative.gif"></td>
<TD width=50% background="/TFN/en/CDA/Images/common/labels/decorative_2px_blk.gif">&nbsp;</TD>
</TR>
</TABLE>
<tr>
<td align="right" valign="middle"><IMG src="/TFN/en/CDA/Images/common/labels/centrino_logo_blk.gif"></td>
</tr>
</body>
</html>
Data Types
• email
Return-path <[email protected]>Received from relay2.EECS.Berkeley.EDU
(relay2.EECS.Berkeley.EDU [169.229.60.28]) by imap4.CS.Berkeley.EDU (iPlanet Messaging Server
5.2 HotFix 1.16 (built May 14 2003)) with ESMTP id <[email protected]>;
Tue, 08 Jun 2004 11:40:43 -0700 (PDT)Received from relay3.EECS.Berkeley.EDU (localhost
[127.0.0.1]) by relay2.EECS.Berkeley.EDU (8.12.10/8.9.3) with ESMTP id i58Ieg3N000927; Tue, 08
Jun 2004 11:40:43 -0700 (PDT)Received from redbirds (dhcp-168-35.EECS.Berkeley.EDU
[128.32.168.35]) by relay3.EECS.Berkeley.EDU (8.12.10/8.9.3) with ESMTP id i58IegFp007613;
Tue, 08 Jun 2004 11:40:42 -0700 (PDT)Date Tue, 08 Jun 2004 11:40:42 -0700From Robert Miller
<[email protected]>Subject RE: SLT headcount = 25In-reply-
to <[email protected]>To 'Randy Katz'
<[email protected]>Cc "'Glenda J. Smith'" <[email protected]>, 'Gert Lanckriet'
<[email protected]>Message-
id <[email protected]>MIME-version 1.0X-
MIMEOLE Produced By Microsoft MimeOLE V6.00.2800.1409X-Mailer Microsoft Office Outlook, Build
11.0.5510Content-type multipart/alternative; boundary="----
=_NextPart_000_0033_01C44D4D.6DD93AF0"Thread-
index AcRMtQRp+R26lVFaRiuz4BfImikTRAA0wf3Qthe headcount is now 32. ---------------------------
------------- Robert Miller, Administrative Specialist University of California, Berkeley Electronics
Research Lab 634 Soda Hall #1776 Berkeley, CA 94720-1776 Phone: 510-642-6037 fax: 510-
643-1289
Data Types
• protein sequences
Data Types
• sequences of Unix system calls
Data Types
• network layout: graph
Data Types
• images
Example: Spam Filter
• Input: email
• Output: spam/ham
• Setup:
• Get a large collection of example emails, each labeled “spam” or “ham”
• Note: someone has to hand label all this data
• Want to learn to predict labels of new, future emails
• Features: The attributes used to make the ham / spam decision
• Words: FREE!
• Text Patterns: $dd, CAPS
• Non-text: SenderInContacts
• …

Example emails shown on the slide:
“Dear Sir. First, I must solicit your confidence in this transaction, this is by virture of its nature as being utterly confidencial and top secret. …”
“TO BE REMOVED FROM FUTURE MAILINGS, SIMPLY REPLY TO THIS MESSAGE AND PUT "REMOVE" IN THE SUBJECT. 99 MILLION EMAIL ADDRESSES FOR ONLY $99”
“Ok, I know this is blatantly OT but I'm beginning to go insane. Had an old Dell Dimension XPS sitting in the corner and decided to put it to use, I know it was working pre being stuck in the corner, but when I plugged it in, hit the power nothing happened.”
Example: Digit Recognition
• Input: images / pixel grids
• Output: a digit 0-9
• Setup:
• Get a large collection of example images, each labeled with a digit
• Note: someone has to hand label all this data
• Want to learn to predict labels of new, future digit images
• Features: The attributes used to make the digit decision
• Pixels: (6,8)=ON
• Shape Patterns: NumComponents, AspectRatio, NumLoops
• …
• Current state-of-the-art: Human-level performance ??
Other Examples of Real-World Classification Tasks
• Fraud detection (input: account activity, classes: fraud / no fraud)
• Web page spam detection (input: HTML/rendered page, classes: spam /
ham)
• Speech recognition and speaker recognition (input: waveform, classes:
phonemes or words)
• Medical diagnosis (input: symptoms, classes: diseases)
• Automatic essay grader (input: document, classes: grades)
• Customer service email routing and foldering
• Link prediction in social networks
• Catalytic activity in drug design
• … many many more

• Classification is an important commercial technology


Training and Validation
• Data: labeled instances, e.g. emails marked spam/ham
• Training set
• Validation set
• Test set
• Training (see the split sketch after this list)
• Estimate parameters on training set
• Tune hyperparameters on validation set
• Report results on test set
• Anything short of this yields over-optimistic claims
• Evaluation
• Many different metrics
• Ideally, the criteria used to train the classifier should be closely related to those used to evaluate the classifier
• Statistical issues
• Want a classifier which does well on test data
• Overfitting: fitting the training data very closely, but not generalizing well
• Error bars: want realistic (conservative) estimates of accuracy
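A minimal sketch of such a three-way split, assuming scikit-learn is available; the 60/20/20 ratios are illustrative, not prescribed by the lecture.

# Train / validation / test split sketch (assumes scikit-learn).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# First hold out 20% of the data as the test set.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Then split the remainder into training and validation (60% / 20% of the full data).
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.25, random_state=0)

# Parameters are estimated on X_train, hyperparameters are tuned on X_val,
# and results are reported once on X_test.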
Some State-of-the-art Classifiers
• Support vector machine
• Decision Trees
• Artificial neural networks
• Bayesian classifiers
• Nearest neighbor
• …
Intuitive Picture of the Problem

[Figure: data points from two classes, Class1 and Class2]
Some Issues
• There may be a simple separator (e.g., a straight line in 2D or a
hyperplane in general) or there may not
• There may be “noise” of various kinds
• There may be “overlap”
• One should not be deceived by one’s low-dimensional geometrical
intuition
• Some classifiers explicitly represent separators (e.g., straight lines),
while for other classifiers the separation is done implicitly
• Some classifiers just make a decision as to which class an object is in;
others estimate class probabilities
Methods
I) Instance-based methods:
1) Nearest neighbor
II) Probabilistic models:
1) Naïve Bayes
2) Logistic Regression
III) Linear Models:
1) Perceptron
2) Support Vector Machine
IV) Decision Models:
1) Decision Trees
2) Boosted Decision Trees
3) Random Forest
Naive Bayes Classifier
Classifier
• Given a data point x, what is the probability of x belonging to some
class c?
• Directly model this statement as a conditional probability: p(c|x).
• Example:
• if there are 3 classes c₁, c₂, c₃, and x consists of 2 features x₁, x₂, the result of a classifier could
be something like p(c₁|x₁, x₂)=0.3, p(c₂|x₁, x₂)=0.5 and p(c₃|x₁, x₂)=0.2.
• If we care for a single label as the output, we would choose the one with the highest
probability, i.e. c₂ with a probability of 50% here.
The Naive Bayes Classifier
• Naive Bayes Classifier
• The naive Bayes classifier tries to compute the probabilities p(c|x) for all classes directly.

• max p(c|x) returns the maximum probability, while argmax p(c|x) returns the class c with that
highest probability; the classifier outputs argmax_c p(c|x).
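A tiny illustrative sketch of this decision rule, using the hard-coded posteriors from the three-class example above:

# Illustrative only: posteriors p(c|x) taken from the example above.
posteriors = {"c1": 0.3, "c2": 0.5, "c3": 0.2}

best_class = max(posteriors, key=posteriors.get)  # argmax_c p(c|x)
best_prob = posteriors[best_class]                # max_c p(c|x)
print(best_class, best_prob)                      # -> c2 0.5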
Bayes Theorem
• Estimates conditional probability.
• Conditional probability is the likelihood of an outcome occurring, based on a
previous outcome occurring in similar circumstances.

P(A|B) = P(A) P(B|A) / P(B)
Bayes Theorem
• P(A|B) = P(A) P(B|A) / P(B) = P(A∩B) / P(B)

P(A) = The probability of A occurring
P(B) = The probability of B occurring
P(A|B) = The probability of A given B
P(B|A) = The probability of B given A
P(A∩B) = The probability of both A and B occurring


Bayes Theorem: P(A|B) = P(A) P(B|A) / P(B)
• Ex. Cancer diagnosis
• Binary classification problem
• Assume P(Cancer)=0.008
• Assume there is a blood test whose result is either + or -
• The test result is + with a probability of 98% in cancer patients
• The test result is - with a probability of 97% in cancer-free people
• A person takes the test and the result is +; the questions are:
• What is the probability that the person has cancer?
• What is the probability that the person does not have cancer?
• What is the diagnosis?
• P(cancer)=0.008
• P(notcancer)=0.992
• P(+|cancer)=0.98
• P(-|cancer)=0.02
• P(-|notcancer)=0.97
• P(+|notcancer)=0.03
• P(cancer|+)=?
• P(notcancer|+)=?

• P(cancer|+) = P(+|cancer)*P(cancer)/P(+) = 0.98*0.008/P(+) = 0.0078/P(+)
• P(notcancer|+) = P(+|notcancer)*P(notcancer)/P(+) = 0.03*0.992/P(+) = 0.0298/P(+)
• P(cancer|+) < P(notcancer|+) ⇒ not cancer
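The same computation in a few lines of Python; all numbers come from the slide, and P(+) is obtained by marginalizing over the two classes:

# Bayes rule for the cancer-test example (numbers from the slide).
p_cancer = 0.008
p_not_cancer = 0.992
p_pos_given_cancer = 0.98
p_pos_given_not_cancer = 0.03

# Marginal probability of a positive test.
p_pos = p_pos_given_cancer * p_cancer + p_pos_given_not_cancer * p_not_cancer

p_cancer_given_pos = p_pos_given_cancer * p_cancer / p_pos
p_not_cancer_given_pos = p_pos_given_not_cancer * p_not_cancer / p_pos

print(round(p_cancer_given_pos, 3))      # -> 0.209
print(round(p_not_cancer_given_pos, 3))  # -> 0.791, so the diagnosis is: not cancer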
Bayes Theorem
• Ex: Imagine 100 people at a party. You are told how many of them wear pink or not, and how many are men or not, and get these numbers:
• From these numbers, we can find the probabilities P(Man), P(Pink), etc.

• P(Man)=0.4
• P(Pink)=0.25
• P(Pink|Man)=0.125

• What is the probability that a person wearing pink is a man, P(Man|Pink) = ?


• Even if we do not have access to the complete table, using only three numbers, we can
estimate it!
• P(Man|Pink) = P(Pink|Man) * P(Man) / P(Pink)
• If we still have the raw data, we can calculate it directly.
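Plugging in the three numbers above: P(Man|Pink) = 0.125 * 0.4 / 0.25 = 0.2, so a person wearing pink is a man with probability 20%.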
Python Example on Classification
• iris_classification_example.py
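The script itself is not reproduced here; a minimal sketch of what such an iris classification example might look like, assuming scikit-learn's Gaussian naive Bayes (the actual iris_classification_example.py may differ):

# A guess at the shape of the example script, using Gaussian naive Bayes.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

model = GaussianNB()
model.fit(X_train, y_train)           # estimate class priors and per-feature Gaussians
y_pred = model.predict(X_test)        # argmax_c p(c|x) for each test sample

print("Accuracy:", accuracy_score(y_test, y_pred))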
K - Nearest Neighbor (KNN) Classifier
• The KNN algorithm is one of the simplest classification algorithms
• non-parametric
• it does not make any assumptions on the underlying data distribution
• Its purpose is to use a dataset in which the data points are
separated into several classes to predict the classification of a new
sample point.
• KNN Algorithm is based on feature similarity
• How closely out-of-sample features resemble our training set
determines how we classify a given data point
1-Nearest Neighbor
3-Nearest Neighbor
Classification steps
1. Training phase: a model is constructed from the training instances.
• classification algorithm finds relationships between predictors and targets
• relationships are summarized in a model
2. Testing phase: test the model on a test sample whose class
labels are known but not used for training the model
3. Usage phase: use the model for classification on new data
whose class labels are unknown
K-Nearest Neighbor
• Features
• All instances correspond to points in an n-dimensional Euclidean space
• Classification is delayed till a new instance arrives
• Classification done by comparing feature vectors of the different points
• Target function may be discrete or real-valued
K-Nearest Neighbor
• An arbitrary instance is represented by (a1(x), a2(x), a3(x), ..., an(x)), where ai(x) denotes the i-th feature of x
• Euclidean distance between two instances:
d(xi, xj) = sqrt( Σ_{r=1..n} ( ar(xi) − ar(xj) )² )
• Continuous-valued target function: predict the mean value of the k nearest training examples
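A small from-scratch sketch of both cases described above (majority vote for a discrete target, mean of the neighbours for a continuous one), assuming NumPy; function and variable names are illustrative:

# From-scratch k-NN prediction; X_train has shape (m, n), y_train has shape (m,).
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k=3, discrete=True):
    # Euclidean distance d(xi, x_new) = sqrt(sum_r (ar(xi) - ar(x_new))^2)
    dists = np.sqrt(((X_train - x_new) ** 2).sum(axis=1))
    nearest = np.argsort(dists)[:k]   # indices of the k closest training examples
    if discrete:
        # Discrete target: majority vote among the k nearest neighbours.
        return Counter(y_train[nearest]).most_common(1)[0][0]
    # Continuous target: mean value of the k nearest training examples.
    return y_train[nearest].mean()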
What to determine in KNN?
•K
• K-nearest neighbours uses the local neighborhood to obtain a prediction
• The K memorized examples most similar to the one that is being classified are retrieved

• Distance metric
• A distance function is needed to compare the similarity of examples
KNN - features
• If the ranges of the features differ, features with
bigger values will dominate the decision
• In general feature values are normalized prior to
distance calculation
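For example, a min-max rescaling of each feature to [0, 1] before computing distances; this is one common choice (z-score standardization is another), not the only option:

# Min-max normalization of each feature column to [0, 1], assuming NumPy.
import numpy as np

def min_max_normalize(X_train, X_test):
    lo, hi = X_train.min(axis=0), X_train.max(axis=0)
    span = np.where(hi - lo == 0, 1.0, hi - lo)   # avoid division by zero for constant features
    # The test set is rescaled with the training-set statistics.
    return (X_train - lo) / span, (X_test - lo) / span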
Python Example on Classification
• iris_classification_example.py
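Again only a guess at what the script might contain, this time with scikit-learn's KNeighborsClassifier; k=3 is an arbitrary choice:

# A k-NN version of the iris example (sketch, not the actual course script).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)             # k-NN just memorizes the training set
print("Accuracy:", accuracy_score(y_test, knn.predict(X_test)))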
Homework
• Measure the performance of the classifiers (NB and KNN).
• Compare the two methods.
References
• Michael I. Jordan, University of California, Berkeley, Lecture Notes.
