Linear Classifiers PPT 1

Late penalty is 1 point off for each day late
Final project progress report: meet with me the week of November 22-26
5 points off if I see that you have done NOTHING yet
Assignment 4 due December 1 (include a short description file of what the data is)
Final project due December 8
Linear Discriminant Functions: Basic Idea
[Figure: bass and salmon samples in the lightness-length plane, showing a bad boundary and a good boundary]
Have samples from 2 classes x1, x2, ..., xn
Assume the 2 classes can be separated by a linear boundary l(θ) with some unknown parameters θ
Fit the "best" boundary to the data by optimizing over the parameters θ
Need to estimate the parameters of the discriminant function (the parameters of the line in the case of a linear discriminant)
What is "best"?
Minimize the classification error on the training data?
Does not guarantee a small testing error

LDF: Introduction
For now, we will study linear discriminant functions
Simple model (should try simpler models first)
Analytically tractable
Linear discriminant functions are optimal for Gaussian distributions with equal covariance
May not be optimal for other data distributions, but they are very simple to use
Knowledge of the class densities is not required when using linear discriminant functions, so we can say that this is a non-parametric approach
Parametric Methods vs. Discriminant Functions
Parametric methods:
  Assume the shape of the densities for the classes is known: p1(x|θ1), p2(x|θ2), ...
  Estimate θ1, θ2, ... from data
  Use a Bayesian classifier to find the decision regions
  [Figure: decision regions for classes c1, c2, c3 in the (x(1), x(2)) plane]
Discriminant functions:
  Assume the discriminant functions are of known shape l(θ1), l(θ2), ..., with parameters θ1, θ2, ...
  Estimate θ1, θ2, ... from data
  Use the discriminant functions for classification
In theory, the Bayesian classifier minimizes the risk
In practice, we do not have confidence in the assumed model shapes
In practice, we do not really need the actual density functions in the end
Estimating accurate density functions is much harder than estimating accurate discriminant functions
Some argue that estimating densities should be skipped: why solve a harder problem than needed?

LDF: 2 Classes
A discriminant function is linear if it can be written as
  g(x) = w^t x + w0
w is called the weight vector and w0 is called the bias or threshold
  g(x) > 0  =>  x ∈ class 1
  g(x) < 0  =>  x ∈ class 2
  g(x) = 0  =>  either class
[Figure: decision regions ℜ1 and ℜ2 in the (x(1), x(2)) plane, separated by the decision boundary g(x) = 0, with g(x) > 0 on the ℜ1 side and g(x) < 0 on the ℜ2 side]
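A minimal sketch of the 2-class rule above in Python; the weight vector w, bias w0, and test points are made-up values for illustration, not taken from the lecture.

import numpy as np

def classify(x, w, w0):
    """Return 1 if g(x) > 0, 2 if g(x) < 0, None on the boundary."""
    g = w @ x + w0                 # g(x) = w^t x + w0
    if g > 0:
        return 1
    if g < 0:
        return 2
    return None                    # g(x) = 0: either class

w = np.array([1.0, 2.0])           # assumed weight vector
w0 = -3.0                          # assumed bias (threshold)
print(classify(np.array([2.0, 2.0]), w, w0))   # g(x) = 3 > 0  -> class 1
print(classify(np.array([0.0, 1.0]), w, w0))   # g(x) = -1 < 0 -> class 2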
LDF: 2 Classes
The decision boundary g(x) = w^t x + w0 = 0 is a hyperplane:
the set of vectors x which, for some scalars α0, ..., αd, satisfy
  α0 + α1 x(1) + ... + αd x(d) = 0
A hyperplane is
  a point in 1D
  a line in 2D
  a plane in 3D
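For example (made-up numbers), in 2D with w = (1, 2)^t and w0 = -3, the decision boundary is the line x(1) + 2x(2) - 3 = 0.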
LDF: 2 Classes
g(x) = w^t x + w0
w determines the orientation of the decision hyperplane
w0 determines the location of the decision surface
[Figure: in the (x(1), x(2)) plane, the hyperplane g(x) = 0 with normal vector w; the distance from x to the hyperplane is g(x)/||w||, and the hyperplane's offset from the origin is w0/||w||]

LDF: Many Classes
Suppose we have m classes
Define m linear discriminant functions
  gi(x) = wi^t x + wi0,  i = 1, ..., m
Given x, assign class ci if
  gi(x) ≥ gj(x)  ∀j ≠ i
Such a classifier is called a linear machine
A linear machine divides the feature space into m decision regions, with gi(x) being the largest discriminant if x is in region Ri
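A minimal sketch of a linear machine in Python; the weight matrix W and biases w0 are made-up values for illustration.

import numpy as np

def linear_machine(x, W, w0):
    """W holds the m weight vectors as rows, w0 the m biases.
    Returns the index i of the largest discriminant gi(x) = wi^t x + wi0."""
    g = W @ x + w0                 # all m discriminants at once
    return int(np.argmax(g))       # assign the class with the largest gi(x)

W = np.array([[ 1.0,  0.0],        # example: m = 3 classes in 2D (assumed values)
              [ 0.0,  1.0],
              [-1.0, -1.0]])
w0 = np.array([0.0, 0.0, 0.5])
print(linear_machine(np.array([2.0, 1.0]), W, w0))   # g = [2, 1, -2.5] -> class index 0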
LDF: Many Classes
Decision regions of a linear machine are convex:
  y, z ∈ Ri  =>  αy + (1-α)z ∈ Ri  for 0 ≤ α ≤ 1
Indeed, since each gi is linear,
  ∀j ≠ i  gi(y) ≥ gj(y) and gi(z) ≥ gj(z)  =>  ∀j ≠ i  gi(αy + (1-α)z) ≥ gj(αy + (1-α)z)
In particular, decision regions must be spatially contiguous
[Figure: a region Rj adjacent to Ri is a valid decision region; a region Rj split into two disconnected pieces separated by Ri is not a valid decision region]
LDF: Many Classes
For two contiguous regions Ri and Rj, the boundary that separates them is a portion of the hyperplane Hij defined by:
  gi(x) = gj(x)  ⇔  wi^t x + wi0 = wj^t x + wj0  ⇔  (wi - wj)^t x + (wi0 - wj0) = 0
Thus wi - wj is normal to Hij
And the distance from x to Hij is given by
  d(x, Hij) = (gi(x) - gj(x)) / ||wi - wj||

LDF: Many Classes
Thus the applicability of the linear machine is mostly limited to unimodal conditional densities p(x|θ), even though we did not assume any parametric models
Example:
[Figure: two classes whose samples need non-contiguous decision regions; a linear machine will fail here]
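A small numeric check of the distance formula d(x, Hij) above, with made-up weight vectors and biases.

import numpy as np

wi, wi0 = np.array([2.0, 0.0]), 1.0    # assumed gi(x) = 2*x(1) + 1
wj, wj0 = np.array([0.0, 1.0]), 0.0    # assumed gj(x) = x(2)
x = np.array([1.0, 1.0])

gi = wi @ x + wi0                      # gi(x) = 3
gj = wj @ x + wj0                      # gj(x) = 1
d = (gi - gj) / np.linalg.norm(wi - wj)
print(d)                               # (3 - 1) / ||(2, -1)|| = 2 / sqrt(5) ≈ 0.894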
LDF: Augmented feature vector
Linear discriminant function: g(x) = w^t x + w0
Can rewrite it as
  g(x) = [w0 w^t] [1; x] = a^t y = g(y)
with the new weight vector a = [w0; w] and the new feature vector y = [1; x]
y is called the augmented feature vector
We added a dummy dimension to get a completely equivalent new homogeneous problem:
  old problem: g(x) = w^t x + w0 with x = [x1; ...; xd]
  new problem: g(y) = a^t y with y = [1; x1; ...; xd]

LDF: Training Error
For the rest of the lecture, assume we have 2 classes
Samples y1, ..., yn, some in class 1, some in class 2
Use these samples to determine the weights a in the discriminant function g(y) = a^t y
What should be our criterion for determining a?
For now, suppose we want to minimize the training error (that is, the number of misclassified samples y1, ..., yn)
Recall that
  g(yi) > 0  =>  yi classified as c1
  g(yi) < 0  =>  yi classified as c2
Thus the training error is 0 if
  g(yi) > 0  ∀yi ∈ c1
  g(yi) < 0  ∀yi ∈ c2
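A minimal sketch of the augmentation in Python: prepend a dummy 1 to each sample so that g(x) = w^t x + w0 becomes g(y) = a^t y; the sample values and weights below are made up for illustration.

import numpy as np

X = np.array([[1.0, 2.0],                       # original samples x1, ..., xn as rows
              [3.0, 0.5]])
Y = np.hstack([np.ones((X.shape[0], 1)), X])    # augmented samples y1, ..., yn

w, w0 = np.array([2.0, -1.0]), 0.5              # assumed weights and bias
a = np.concatenate(([w0], w))                   # a = [w0; w]

print(X @ w + w0)                               # g(x) for each sample: [0.5, 6.0]
print(Y @ a)                                    # g(y) = a^t y gives the same: [0.5, 6.0]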
LDF: Augmented feature vector
Feature augmenting is done for simpler notation
From now on we always assume that we have augmented feature vectors
Given samples x1, ..., xn, convert them to augmented samples y1, ..., yn by adding a new dimension of value 1: yi = [1; xi]
[Figure: regions ℜ1 and ℜ2 in the (y(1), y(2)) plane separated by the hyperplane g(y) = 0 through the origin, with normal a; the distance from y to the hyperplane is g(y)/||a||]

LDF: Problem “Normalization”
Thus the training error is 0 if
  a^t yi > 0  ∀yi ∈ c1
  a^t yi < 0  ∀yi ∈ c2
Equivalently, the training error is 0 if
  a^t yi > 0     ∀yi ∈ c1
  a^t (-yi) > 0  ∀yi ∈ c2
This suggests problem “normalization”:
1. Replace all examples from class c2 by their negatives:
     yi → -yi  ∀yi ∈ c2
2. Seek a weight vector a s.t.
     a^t yi > 0  ∀yi
If such an a exists, it is called a separating or solution vector; the original samples x1, ..., xn can then indeed be separated by a line
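A minimal sketch of the “normalization” step in Python: negate the augmented class-2 samples so that a single condition a^t yi > 0 covers both classes; the samples, labels, and candidate a are made up for illustration.

import numpy as np

Y = np.array([[1.0,  2.0,  1.0],     # augmented samples as rows
              [1.0,  1.0,  2.0],
              [1.0, -1.0, -1.0],
              [1.0, -2.0,  0.0]])
labels = np.array([1, 1, 2, 2])      # class of each sample

Z = Y.copy()
Z[labels == 2] *= -1                 # yi -> -yi for every yi in class c2

a = np.array([0.0, 1.0, 1.0])        # a candidate weight vector (assumed)
print(np.all(Z @ a > 0))             # True -> a is a separating (solution) vector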
LDF: Problem “Normalization”
[Figure: before normalization, we seek a hyperplane that separates the patterns from the different categories; after “normalization”, we seek a hyperplane that puts all the normalized patterns on the same (positive) side]

LDF: Solution Region
Solution region for a: the set of all possible solution vectors, defined in terms of the normal a to the separating hyperplane
[Figure: normalized samples in the (y(1), y(2)) plane with the shaded solution region of weight vectors a]
LDF: Solution Region
Find a weight vector a s.t. for all samples y1, ..., yn
  a^t yi = Σ_{k=0..d} ak yi(k) > 0
In general, there are many such solutions a
[Figure: several separating vectors a in the solution region, with one marked as the “best” a]

Optimization
Need to minimize a function of many variables
  J(x) = J(x1, ..., xd)
We know how to minimize J(x): take the partial derivatives and set them to zero
  ∇J(x) = [∂J/∂x1, ..., ∂J/∂xd]^t = 0   (the gradient)
However, solving analytically is not always easy
Would you like to solve this system of nonlinear equations?
  sin(x1^2 + x2^3) + e^(x4^2) = 0
  cos(x1^2 + x2^3) + log(x5^3) / x4^2 = 0
Sometimes it is not even possible to write down an analytical expression for the derivative; we will see an example later today
Optimization: Gradient Descent
The gradient ∇J(x) points in the direction of steepest increase of J(x), and -∇J(x) in the direction of steepest decrease
[Figure: in one dimension, -dJ(a)/dx at a point a points downhill; in two dimensions, -∇J(a) points downhill]

Optimization: Gradient Descent
Gradient descent is guaranteed to find only a local minimum
[Figure: J(x) with iterates x(1), x(2), x(3), ..., x(k) converging to a local minimum while the global minimum lies elsewhere]
Nevertheless gradient descent is very popular because it is simple and applicable to any function

Optimization: Gradient Descent
[Figure: iterates x(1), x(2), x(3), ..., x(k) with steps s(1), s(2), ... taken along -∇J(x(1)), -∇J(x(2)), ..., stopping where ∇J(x(k)) = 0]
Gradient descent for minimizing any function J(x):
  set k = 1 and x(1) to some initial guess for the weight vector
  while η(k) ||∇J(x(k))|| > ε
    choose the learning rate η(k)
    x(k+1) = x(k) - η(k) ∇J(x(k))   (update rule)
    k = k + 1

Optimization: Gradient Descent
Main issue: how to set the parameter η (the learning rate)
If η is too small, we need too many iterations
If η is too large, we may overshoot the minimum and possibly never find it (if we keep overshooting)
[Figure: with small η the iterates crawl toward the minimum of J(x); with large η the iterates x(1), x(2), ... jump back and forth across it]
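A minimal sketch of this gradient descent loop in Python; the objective J, the fixed learning rate eta, and the threshold eps are made-up choices for illustration (the lecture leaves η(k) and ε abstract).

import numpy as np

def grad_J(x):
    # gradient of the toy objective J(x) = (x1 - 3)^2 + 2*(x2 + 1)^2
    return np.array([2.0 * (x[0] - 3.0), 4.0 * (x[1] + 1.0)])

x = np.array([0.0, 0.0])             # x(1): initial guess
eta, eps = 0.1, 1e-6                 # fixed learning rate and stopping threshold
k = 1
while eta * np.linalg.norm(grad_J(x)) > eps:
    x = x - eta * grad_J(x)          # update rule: x(k+1) = x(k) - eta(k) * grad J(x(k))
    k += 1
print(k, x)                          # ends near the minimum at (3, -1)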
Today
Continue Linear Discriminant Functions
  Perceptron criterion function
  Batch perceptron rule
  Single sample perceptron rule

LDF
Augmented and “normalized” samples y1, ..., yn
Seek a weight vector a s.t. a^t yi > 0 ∀yi
[Figure: samples in the (y(1), y(2)) plane before normalization and after “normalization”]
If such an a exists, it is called a separating or solution vector; the original samples x1, ..., xn can then indeed be separated by a line
LDF: Augmented feature vector
Linear discriminant function: g(x) = w^t x + w0
Need to estimate the parameters w and w0 from data
[Figure: bass and salmon samples in the lightness-length plane separated by a line]
Augment the samples x to get an equivalent homogeneous problem in terms of samples y:
  g(x) = [w0 w^t] [1; x] = a^t y = g(y)
“Normalize” by replacing all examples from class c2 by their negatives:
  yi → -yi  ∀yi ∈ c2

Optimization: Gradient Descent
[Figure: J(x) with iterates x(1), x(2), x(3), ..., x(k) and steps s(1), s(2), ... along -∇J(x(1)), -∇J(x(2)), ..., stopping where ∇J(x(k)) = 0; each step is s(k+1) = x(k+1) - x(k) = η(k)(-∇J(x(k)))]
Gradient descent for minimizing any function J(x):
  set k = 1 and x(1) to some initial guess for the weight vector
  while η(k) ||∇J(x(k))|| > ε
    choose the learning rate η(k)
    x(k+1) = x(k) - η(k) ∇J(x(k))   (update rule)
    k = k + 1
LDF: Criterion Function
Find a weight vector a s.t. for all samples y1, ..., yn
  a^t yi = Σ_{k=0..d} ak yi(k) > 0
Need a criterion function J(a) which is minimized when a is a solution vector
Let YM be the set of examples misclassified by a:
  YM(a) = { sample yi s.t. a^t yi < 0 }
First natural choice: the number of misclassified examples
  J(a) = |YM(a)|
[Figure: J(a) is piecewise constant in a, so gradient descent is useless]

LDF: Perceptron Criterion Function
Better choice: the Perceptron criterion function
  Jp(a) = Σ_{y ∈ YM} (-a^t y)
If y is misclassified, a^t y ≤ 0
Thus Jp(a) ≥ 0
Geometric interpretation: Jp(a) is ||a|| times the sum of the distances of the misclassified examples to the decision boundary (the distance from y to the boundary is |a^t y| / ||a||)
Jp(a) is piecewise linear and thus suitable for gradient descent
[Figure: Jp(a) as a piecewise linear function of a]
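A minimal sketch of evaluating the Perceptron criterion Jp(a) in Python, on already “normalized” samples (class-2 samples negated); the data and candidate vectors a are made up for illustration.

import numpy as np

Z = np.array([[ 1.0, 2.0, 1.0],      # normalized augmented samples as rows
              [ 1.0, 1.0, 2.0],
              [-1.0, 1.0, 1.0],
              [-1.0, 2.0, 0.0]])

def Jp(a, Z):
    scores = Z @ a
    YM = Z[scores <= 0]              # misclassified samples (a^t y <= 0)
    return np.sum(-(YM @ a))         # Jp(a) = sum over YM of (-a^t y)

print(Jp(np.array([0.0, 1.0, 1.0]), Z))    # 0.0: a is a solution vector
print(Jp(np.array([1.0, -1.0, 0.0]), Z))   # 6.0: positive, some samples misclassified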
LDF: Perceptron Batch Rule
Recall Jp(a) = Σ_{y ∈ YM} (-a^t y), where YM are the samples misclassified by a(k)
The gradient of Jp(a) is
  ∇Jp(a) = Σ_{y ∈ YM} (-y)
It is not possible to solve ∇Jp(a) = 0 analytically because of YM
Recall the gradient descent update rule: x(k+1) = x(k) - η(k) ∇J(x(k))
Thus the gradient descent batch update rule for Jp(a) is:
  a(k+1) = a(k) + η(k) Σ_{y ∈ YM} y
It is called the batch rule because it is based on all misclassified examples

LDF: Perceptron Single Sample Rule
Thus the gradient descent single sample rule for Jp(a) is:
  a(k+1) = a(k) + η(k) yM
note that yM is one sample misclassified by a(k)
must have a consistent way of visiting the samples
Geometric interpretation: yM is misclassified by a(k), that is (a(k))^t yM ≤ 0, so yM is on the wrong side of the decision hyperplane
Adding η yM to a moves the new decision hyperplane in the right direction with respect to yM
[Figure: the weight vector moves from a(k) to a(k+1) = a(k) + η yM, rotating the decision hyperplane toward the correct side of yM]
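A minimal sketch of the batch and single sample perceptron updates in Python, on already “normalized” samples Z; the data, initial a, and fixed eta are made up for illustration, and the loops terminate here because this toy data is linearly separable.

import numpy as np

Z = np.array([[ 1.0, 2.0, 1.0],      # normalized augmented samples as rows
              [ 1.0, 1.0, 2.0],
              [-1.0, 1.0, 1.0],
              [-1.0, 2.0, 0.0]])
eta = 1.0                            # fixed learning rate (assumed)

# Batch rule: add the sum of all currently misclassified samples
a = np.zeros(3)
while True:
    YM = Z[Z @ a <= 0]               # samples misclassified by the current a
    if len(YM) == 0:
        break
    a = a + eta * YM.sum(axis=0)     # a(k+1) = a(k) + eta(k) * sum over YM of y
print("batch rule:", a)

# Single sample rule: visit samples in a fixed order, update on each mistake
a = np.zeros(3)
changed = True
while changed:
    changed = False
    for y in Z:                      # a consistent way of visiting the samples
        if a @ y <= 0:
            a = a + eta * y          # a(k+1) = a(k) + eta(k) * yM
            changed = True
print("single sample rule:", a)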
LDF: Perceptron Single Sample Rule
  a(k+1) = a(k) + η(k) yM

LDF Example: Augment feature vector
name | extra | good attendance? | tall?   | sleeps in class? | chews gum? | grade
Jane |   1   |     yes (1)      | yes (1) |     no (-1)      |   no (-1)  |   A
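As a quick illustration (only Jane's row is given here), with the ±1 encoding in the table Jane's augmented feature vector is y = (1, 1, 1, -1, -1)^t, where the leading 1 is the “extra” dummy dimension added by augmentation.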