This document summarizes a lecture on statistical methods in artificial intelligence. It begins with a recap of linear algebra concepts like vectors, vector operations, and the equation of a plane. It then discusses linear discriminant functions for two-class classification problems, using a linear boundary to separate classes. The perceptron algorithm is introduced as a method for learning these linear discriminant functions using a gradient descent approach to minimize classification error. Generalized linear discriminant functions are also discussed, allowing nonlinear decision boundaries through mapping to higher dimensions.

Statistical Methods in Artificial Intelligence

CSE471 - Monsoon 2015: Lecture 03

Avinash Sharma
CVIT, IIIT Hyderabad

Lecture 03: Plan

- Linear Algebra Recap
- Linear Discriminant Functions (LDFs)
- The Perceptron
- Generalized LDFs
- The Two-Category Linearly Separable Case
- Learning LDF: Basic Gradient Descent
- Perceptron Criterion Function

Basic Linear Algebra

- Vector Operations
  - Scaling
  - Transpose
  - Addition
  - Subtraction
  - Dot Product
- Equation of a Plane

Vector Operations

- Transpose: turns a column vector into a row vector, and vice versa.
- Scaling: only the magnitude changes.
- Dot Product (Inner Product) of two vectors is a scalar.
- The dot product of two perpendicular vectors is 0.
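
The slide's equations were lost in extraction; as a sketch, the standard definitions the bullets above refer to are:

```latex
% Dot product of a, b in R^n (standard definitions; the slide's own
% equations did not survive extraction)
\[
  \mathbf{a} \cdot \mathbf{b}
  = \mathbf{a}^{T}\mathbf{b}
  = \sum_{i=1}^{n} a_i b_i
  = \lVert\mathbf{a}\rVert \, \lVert\mathbf{b}\rVert \cos\theta
\]
% For perpendicular vectors, \theta = 90 degrees and \cos\theta = 0,
% so the dot product is 0.
```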

Equation of a Plane
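
The equation itself did not survive extraction; a standard form, consistent with the notation in the perceptron slides below, is:

```latex
% A (hyper)plane with normal vector w and bias w_0:
\[
  \mathbf{w}^{T}\mathbf{x} + w_0 = 0
\]
% w fixes the plane's orientation; w_0 fixes its location, since the
% signed distance of the plane from the origin is -w_0 / \lVert\mathbf{w}\rVert.
```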

Linear Discriminant Functions

- Assumes a 2-class classification setup.
- The decision boundary is represented explicitly in terms of the components of the feature vector x.
- The aim is to seek the parameters of a linear discriminant function which minimize the training error.
- Why linear? It is the simplest possible form, and it can be generalized (see Generalized LDFs below).

Linear Discriminant Functions

[Figure: a linear decision boundary separating Class A from Class B]
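
The discriminant function is likewise missing from the extracted text; the standard two-class linear form, which the perceptron summary below relies on, is:

```latex
% Two-class linear discriminant (standard form; the slide's equation
% did not survive extraction):
\[
  g(\mathbf{x}) = \mathbf{w}^{T}\mathbf{x} + w_0
\]
% Decide Class A if g(x) > 0 and Class B if g(x) < 0; the set
% g(x) = 0 is the decision boundary.
```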

The Perceptron

Perceptron Decision Boundary

Perceptron Summary

- The decision boundary surface (a hyperplane) divides the feature space into two regions.
- The orientation of the boundary surface is decided by the normal vector w.
- The location of the boundary surface is determined by the bias term w_0.
- g(x) is proportional to the distance of x from the boundary surface.
- g(x) > 0 on the positive side and g(x) < 0 on the negative side.
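
As a concrete illustration of this summary, here is a minimal sketch of the classic perceptron rule in Python, assuming labels in {-1, +1}; the function names and the fixed learning rate are illustrative, not from the lecture:

```python
import numpy as np

def perceptron_train(X, y, lr=1.0, epochs=100):
    """Learn w, w0 for g(x) = w.x + w0, with labels y in {-1, +1}."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)  # normal vector: sets the boundary's orientation
    w0 = 0.0                  # bias term: sets the boundary's location
    for _ in range(epochs):
        errors = 0
        for xi, yi in zip(X, y):
            if yi * (xi @ w + w0) <= 0:  # xi is on the wrong side of the plane
                w += lr * yi * xi        # rotate the normal toward/away from xi
                w0 += lr * yi            # shift the plane
                errors += 1
        if errors == 0:                  # every sample on its correct side
            break
    return w, w0

def perceptron_predict(X, w, w0):
    """+1 on the positive side of the hyperplane, -1 on the negative side."""
    return np.where(X @ w + w0 > 0, 1, -1)
```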

Generalized LDFs

- Linear: g(x) = w_0 + Σ_i w_i x_i
- Non-linear (Quadratic): adds second-order terms, g(x) = w_0 + Σ_i w_i x_i + Σ_i Σ_j w_ij x_i x_j

Generalized LDFs

- Linear: g(x) = a^T y with the augmented vector y = (1, x_1, ..., x_d)^T, i.e. y_1 = 1.
- Non-linear: g(x) = a^T y with y = φ(x), an arbitrary mapping to a higher-dimensional space.


Generalized LDFs Summary

- φ can be any arbitrary mapping function that projects the original data points x to points y, where y = φ(x).
- The hyperplane decision surface a^T y = 0 passes through the origin of the mapped space.
- Advantage: in the mapped higher-dimensional space the data might be linearly separable.
- Disadvantage: the mapping is computationally intensive, and learning the classification parameters can be non-trivial.
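
A small worked example of the advantage above (hypothetical data, not from the slides): in one dimension, a class occupying an interval is not separable by any single threshold on x, but becomes linearly separable under the quadratic map φ(x) = (1, x, x²):

```python
import numpy as np

# Class +1 lies inside (-1, 1); class -1 outside. No single threshold on x
# separates them, but after the quadratic map phi(x) = (1, x, x^2) the
# hyperplane a^T y = 1 - x^2 = 0 separates the classes.
x = np.array([-3.0, -2.0, -0.5, 0.0, 0.5, 2.0, 3.0])
y = np.where(np.abs(x) < 1.0, 1, -1)

phi = np.stack([np.ones_like(x), x, x**2], axis=1)  # map 1-D points to 3-D
a = np.array([1.0, 0.0, -1.0])                      # weight vector in mapped space

pred = np.where(phi @ a > 0, 1, -1)
assert np.all(pred == y)   # linearly separable after the mapping
```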

Two-Category Linearly Separable Case

Normalized Case

Data vector
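
The equations for these slides were lost in extraction; a sketch of the standard formulation (following Duda & Hart, which the deck's terminology matches) uses the augmented data vector and the sign-normalization trick:

```latex
% Augmented ("data") vector y and weight vector a:
\[
  \mathbf{y} = (1,\ x_1,\ \dots,\ x_d)^{T}, \qquad
  \mathbf{a} = (w_0,\ w_1,\ \dots,\ w_d)^{T}, \qquad
  g(\mathbf{x}) = \mathbf{a}^{T}\mathbf{y}
\]
% Normalized case: replace every sample of the second class by its
% negative; then a single condition covers both classes -- seek a with
\[
  \mathbf{a}^{T}\mathbf{y}_i > 0 \quad \text{for all } i
\]
```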

Learning LDF: Basic Gradient Descent

- Define a scalar function J(a) which captures the classification error for a specific boundary plane described by the parameter vector a.
- Minimize J(a) using gradient descent:
  - Start with an arbitrary value a(1) for a.
  - Iteratively refine the estimate of a: a(k+1) = a(k) − η(k) ∇J(a(k)).
- η(k) is a positive scale factor, also known as the learning rate:
  - A too-small η makes the convergence very slow.
  - A too-large η can diverge due to overshooting of the minimum.
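
A minimal sketch of this procedure in Python, applied to the perceptron criterion function named in the lecture plan, J_p(a) = Σ over misclassified samples of (−a^T y); the function name, fixed learning rate, and stopping rule are illustrative:

```python
import numpy as np

def gd_perceptron_criterion(Y, eta=0.1, epochs=1000):
    """Minimize J_p(a) = sum over misclassified y of (-a^T y) by basic
    gradient descent, a(k+1) = a(k) - eta * grad J_p(a(k)).
    Y holds the sign-normalized augmented samples, one per row."""
    a = np.zeros(Y.shape[1])       # arbitrary starting value a(1)
    for _ in range(epochs):
        mis = Y[Y @ a <= 0]        # misclassified samples: a^T y <= 0
        if len(mis) == 0:          # J_p(a) = 0 -> a separates the data
            break
        grad = -mis.sum(axis=0)    # grad J_p(a) = sum of (-y) over mis
        a -= eta * grad            # step along the negative gradient
    return a
```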
