Introduction to Information Retrieval
Hinrich Schütze and Christina Lioma
Lecture 14: Vector Space Classification
Overview
Recap
Feature selection
Intro vector space classification
Rocchio
kNN
Linear classifiers
> two classes
Relevance feedback: Basic idea
The user issues a (short, simple) query.
The search engine returns a set of documents.
User marks some docs as relevant, some as
nonrelevant.
Search engine computes a new representation of the information need that should be better than the initial query.
Search engine runs new query and returns new
results.
New results have (hopefully) better recall.
Rocchio illustrated
Take-away today
Feature selection for text classification: How to
select a subset of available dimensions
Vector space classification: Basic idea of doing text classification for documents that are represented as vectors
Rocchio classifier: Rocchio relevance feedback
idea applied to text classification
k nearest neighbor classification
Linear classifiers
More than two classes
Feature selection
In text classification, we usually represent
documents in a high-dimensional space, with
each dimension corresponding to a term.
In this lecture: axis = dimension = word = term
= feature
Many dimensions correspond to rare words.
Rare words can mislead the classifier.
Rare misleading features are called noise
features.
Eliminating noise features from the representation increases efficiency and effectiveness of text classification.
Example for a noise feature
Let's say we're doing text classification for the class China.
Suppose a rare term, say ARACHNOCENTRIC, has no
information about China . . .
. . . but all instances of ARACHNOCENTRIC happen
to occur in
China documents in our training set.
Then we may learn a classifier that incorrectly
interprets ARACHNOCENTRIC as evidence for the
class China.
Such an incorrect generalization from an accidental property of the training set is called overfitting.
Basic feature selection algorithm
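In outline, the basic algorithm scores every term in the vocabulary with a utility measure A(t, c) for the target class and keeps the k highest-scoring terms. A minimal Python sketch of that idea (function and parameter names are illustrative, not from the slides):

```python
def select_features(vocabulary, utility, k):
    """Keep the k terms with the highest utility A(t, c).

    `utility` maps a term to its score for the target class,
    e.g. document frequency, mutual information, or chi-square."""
    ranked = sorted(vocabulary, key=utility, reverse=True)
    return ranked[:k]
```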
Different feature selection
methods
A feature selection method is mainly defined by
the feature utility measure it employs
Feature utility measures:
Frequency: select the most frequent terms
Mutual information: select the terms with the highest mutual information
Mutual information is also called information gain
in this context.
Chi-square (see book)
Mutual information
Compute the feature utility A(t, c) as the
expected mutual information (MI) of term t and
class c.
MI tells us how much information the term
contains about the class and vice versa.
For example, if a term's occurrence is
independent of the class (same proportion of
docs within/without class contain the term), then
MI is 0.
Definition:
I(U; C) = Σ_{et∈{1,0}} Σ_{ec∈{1,0}} P(U = et, C = ec) · log2 [ P(U = et, C = ec) / (P(U = et) · P(C = ec)) ]
How to compute MI values
Based on maximum likelihood estimates, the formula we actually use is:
I(U; C) = (N11/N) log2 (N·N11 / (N1.·N.1)) + (N01/N) log2 (N·N01 / (N0.·N.1)) + (N10/N) log2 (N·N10 / (N1.·N.0)) + (N00/N) log2 (N·N00 / (N0.·N.0))
N10: number of documents that contain t (et = 1) and are not in c (ec = 0); N11: number of documents that contain t (et = 1) and are in c (ec = 1); N01: number of documents that do not contain t (et = 0) and are in c (ec = 1); N00: number of documents that do not contain t (et = 0) and are not in c (ec = 0).
N = N00 + N01 + N10 + N11 is the total number of documents; N1. = N10 + N11, N0. = N00 + N01, N.1 = N01 + N11, N.0 = N00 + N10.
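As a quick illustration of this formula, here is a small Python function (a sketch, not from the slides) that computes MI from the four contingency counts; the example counts at the bottom are illustrative, in the spirit of the export/POULTRY example:

```python
import math

def mutual_information(n11, n10, n01, n00):
    """Expected mutual information I(U;C) of a term and a class,
    computed from the 2x2 contingency counts (MLE estimates)."""
    n = n11 + n10 + n01 + n00
    n1_, n0_ = n11 + n10, n01 + n00   # docs with / without the term
    n_1, n_0 = n11 + n01, n10 + n00   # docs in / not in the class
    mi = 0.0
    for n_ec, row, col in [(n11, n1_, n_1), (n10, n1_, n_0),
                           (n01, n0_, n_1), (n00, n0_, n_0)]:
        if n_ec > 0:                   # 0 * log 0 is treated as 0
            mi += (n_ec / n) * math.log2(n * n_ec / (row * col))
    return mi

# Illustrative counts for a rare term in a small class
print(mutual_information(n11=49, n10=27652, n01=141, n00=774106))
```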
MI example for poultry/EXPORT in
Reuters
MI feature selection on Reuters
Naive Bayes: Effect of feature
selection
(multinomial = multinomial Naive Bayes, binomial = Bernoulli Naive Bayes)
Feature selection for Naive Bayes
In general, feature selection is necessary for
Naive Bayes to get decent performance.
Also true for most other learning methods in text
classification: you need feature selection for
optimal performance.
Exercise
(i) Compute the Kyoto/JAPAN contingency table for the collection given below, in analogy to the export/POULTRY table. (ii) Make up a contingency table for which MI is 0, that is, term and class are independent of each other. export/POULTRY table:
Recall vector space representation
Each document is a vector, one component for
each term.
Terms are axes.
High dimensionality: 100,000s of dimensions
Normalize vectors (documents) to unit length
How can we do classification in this space?
Vector space classification
As before, the training set is a set of documents,
each labeled with its class.
In vector space classification, this set
corresponds to a labeled set of points or vectors
in the vector space.
Premise 1: Documents in the same class form a
contiguous region.
Premise 2: Documents from different classes don't overlap.
We define lines, surfaces, hypersurfaces to
divide regions.
Classes in the vector space
Should the ★ document be assigned to China, UK or Kenya?
Find separators between the classes.
Based on these separators: ★ should be assigned to China.
How do we find separators that do this well?
Aside: 2D/3D graphs can be
misleading
Left: A projection of the 2D semicircle to 1D. For the points x1, x2, x3, x4, x5 at x coordinates −0.9, −0.2, 0, 0.2, 0.9 the distance |x2x3| ≈ 0.201 only differs by 0.5% from the projected distance |x'2x'3| = 0.2; but |x1x3|/|x'1x'3| = d_true/d_projected ≈ 1.06/0.9 ≈ 1.18 is an example of a large distortion (18%) when projecting a large area. Right: The corresponding projection of the 3D hemisphere to 2D.
Relevance feedback
In relevance feedback, the user marks
documents as relevant/nonrelevant.
Relevant/nonrelevant can be viewed as classes
or categories.
For each document, the user decides which of
these two classes is correct.
The IR system then uses these class assignments
to build a better query (model) of the
information need . . .
. . . and returns better documents.
Relevance feedback is a form of text classification.
Using Rocchio for vector space
classification
The principal difference between relevance
feedback and text classification:
The training set is given as part of the input in text
classification.
It is interactively created in relevance feedback.
Rocchio classification: Basic idea
Compute a centroid for each class
The centroid is the average of all documents in the
class.
Assign each test document to the class of its
closest centroid.
Recall definition of centroid
μ(c) = (1/|Dc|) Σ_{d∈Dc} v(d)
where Dc is the set of all documents that belong to class c and v(d) is the vector space representation of d.
Rocchio algorithm
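A minimal Python sketch of the two Rocchio steps (training computes one centroid per class, application assigns a document to the class of the nearest centroid); the function names, NumPy representation and Euclidean distance are illustrative assumptions, not taken from the slides:

```python
import numpy as np

def train_rocchio(docs, labels):
    """Compute one centroid per class: the mean of its document vectors."""
    centroids = {}
    for c in set(labels):
        class_docs = np.array([d for d, y in zip(docs, labels) if y == c])
        centroids[c] = class_docs.mean(axis=0)
    return centroids

def apply_rocchio(centroids, doc):
    """Assign a test document to the class of its closest centroid."""
    return min(centroids, key=lambda c: np.linalg.norm(centroids[c] - doc))
```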
Rocchio illustrated: a1 = a2, b1 = b2, c1 = c2
Rocchio properties
Rocchio forms a simple representation for each
class: the centroid
We can interpret the centroid as the prototype of
the class.
Classification is based on similarity to / distance
from centroid/prototype.
Does not guarantee that classifications are
consistent with the training data!
Time complexity of Rocchio
Rocchio vs. Naive Bayes
In many cases, Rocchio performs worse than
Naive Bayes.
One reason: Rocchio does not handle nonconvex,
multimodal classes correctly.
Rocchio cannot handle nonconvex,
multimodal classes
(Figure: the a documents form two separate clusters, with the b documents lying between them; o is a test point, and A and B are the centroids of the a's and b's.)
Exercise: Why is Rocchio not expected to do well for the classification task a vs. b here?
A is the centroid of the a's, B is the centroid of the b's.
The point o is closer to A than to B.
But o is a better fit for the b class.
a is a multimodal class with two prototypes.
kNN classification
kNN classification is another vector space
classification method.
It also is very simple and easy to implement.
kNN is more accurate (in most cases) than Naive
Bayes and Rocchio.
If you need to get a pretty accurate classifier up
and running in a short time . . .
. . . and you don't care about efficiency that much . . .
. . . use kNN.
kNN classification
kNN = k nearest neighbors
kNN classification rule for k = 1 (1NN): Assign
each test document to the class of its nearest
neighbor in the training set.
1NN is not very robust: one document can be mislabeled or atypical.
kNN classification rule for k > 1 (kNN): Assign
each test document to the majority class of its k
nearest neighbors in the training set.
Rationale of kNN: contiguity hypothesis
We expect a test document d to have the same
label as the training documents located in the
local region surrounding d.
Probabilistic kNN
Probabilistic version of kNN: P(c|d) = fraction of k
neighbors of d that are in c
kNN classification rule for probabilistic kNN:
Assign d to class c with highest P(c|d)
Probabilistic kNN
1NN, 3NN classification decision for ★?
kNN algorithm
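A minimal Python sketch of kNN (names, NumPy vectors and Euclidean distance are illustrative assumptions); it also returns the probabilistic-kNN estimate P(c|d), the fraction of the k neighbors that are in the winning class:

```python
import numpy as np
from collections import Counter

def knn_classify(train_docs, train_labels, doc, k=3):
    """Find the k nearest training documents and take a majority vote."""
    dists = np.linalg.norm(np.array(train_docs) - doc, axis=1)
    neighbors = [train_labels[i] for i in np.argsort(dists)[:k]]
    best_class, count = Counter(neighbors).most_common(1)[0]
    return best_class, count / k   # winning class and its P(c|d) estimate
```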
Exercise
How is star classified by:
(i) 1-NN (ii) 3-NN (iii) 9-NN (iv) 15-NN (v) Rocchio?
Time complexity of kNN
kNN with preprocessing of training set
kNN test time proportional to the size of the
training set!
The larger the training set, the longer it takes to
classify a test document.
kNN is inefficient for very large training sets.
kNN: Discussion
No training necessary
But linear preprocessing of documents is as
expensive as training Naive Bayes.
We always preprocess the training set, so in reality
training time of kNN is linear.
kNN is very accurate if training set is large.
Optimality result: asymptotically zero error if
Bayes rate is zero.
But kNN can be very inaccurate if training set is
small.
Linear classifiers
Definition:
A linear classifier computes a linear combination or weighted sum Σi wi di of the feature values.
Classification decision: Σi wi di > θ, where θ (the threshold) is a parameter.
(First, we only consider binary classifiers.)
Geometrically, this corresponds to a line (2D), a
plane (3D) or a hyperplane (higher
dimensionalities), the separator.
We find this separator based on the training set.
Methods for finding the separator: Perceptron, Rocchio, Naive Bayes, as we will explain on the next slides.
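To make the decision rule concrete, a tiny Python sketch of a binary linear classifier; the weights and the 0/1 document representation below are illustrative (they anticipate the interest example later in this lecture):

```python
def linear_classify(weights, theta, doc):
    """Return True if the weighted sum of the feature values exceeds theta."""
    score = sum(weights.get(term, 0.0) * value for term, value in doc.items())
    return score > theta

weights = {"rate": 0.67, "discount": 0.46, "dlrs": -0.71, "world": -0.35}
print(linear_classify(weights, theta=0.0,
                      doc={"rate": 1, "discount": 1, "dlrs": 1, "world": 1}))
```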
A linear classifier in 1D
A linear classifier in 1D is a point described by the equation w1d1 = θ
The point is at θ/w1.
Points (d1) with w1d1 ≥ θ are in the class c.
Points (d1) with w1d1 < θ are in the complement class.
A linear classifier in 2D
A linear classifier in 2D is a line described by the equation w1d1 + w2d2 = θ
Example for a 2D linear classifier
Points (d1, d2) with w1d1 + w2d2 ≥ θ are in the class c.
Points (d1, d2) with w1d1 + w2d2 < θ are in the complement class.
A linear classifier in 3D
A linear classifier in 3D is a plane described by the equation w1d1 + w2d2 + w3d3 = θ
Example for a 3D linear classifier
Points (d1, d2, d3) with w1d1 + w2d2 + w3d3 ≥ θ are in the class c.
Points (d1, d2, d3) with w1d1 + w2d2 + w3d3 < θ are in the complement class.
Rocchio as a linear classifier
Rocchio is a linear classifier defined by:
w · d = b
where w = μ(c1) − μ(c2) is the normal vector of the decision boundary
and b = 0.5 · (|μ(c1)|² − |μ(c2)|²).
Naive Bayes as a linear classifier
Multinomial Naive Bayes is a linear classifier (in log space) defined by:
Σ_{i=1..M} wi di = b
where wi = log [ P̂(ti|c) / P̂(ti|c̄) ], di = number of occurrences of ti in d, and b = −log [ P̂(c) / P̂(c̄) ].
Here, the index i, 1 ≤ i ≤ M, refers to terms of the vocabulary (not to positions in d as k did in our original definition of Naive Bayes).
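A small sketch, under the assumption that smoothed Naive Bayes estimates are already available, showing how the parameters map onto linear weights and a threshold (variable names are illustrative):

```python
import math

def nb_as_linear(p_t_given_c, p_t_given_not_c, p_c):
    """Convert multinomial NB estimates into weights w_i and bias b,
    so that 'assign d to c' becomes: sum_i w_i * d_i > b."""
    weights = {t: math.log(p_t_given_c[t] / p_t_given_not_c[t])
               for t in p_t_given_c}
    b = -math.log(p_c / (1.0 - p_c))
    return weights, b
```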
kNN is not a linear classifier
Classification
decision based on
majority of k nearest
neighbors.
The decision
boundaries between
classes are
piecewise linear . . .
. . . but they are in general not linear classifiers that can be described as w · x = b.
Example of a linear two-class
classifier
This is for the class interest in Reuters-21578.
For simplicity: assume a simple 0/1 vector representation
d1: "rate discount dlrs world"
d2: "prime dlrs"
θ = 0
Exercise: Which class is d1 assigned to? Which class is d2 assigned to?
We assign document d1 "rate discount dlrs world" to interest since
w · d1 = 0.67 · 1 + 0.46 · 1 + (−0.71) · 1 + (−0.35) · 1 = 0.07 > 0 = θ.
Which hyperplane?
Learning algorithms for vector space
classification
In terms of actual computation, there are two
types of learning algorithms.
(i) Simple learning algorithms that estimate the
parameters of the classifier directly from the
training data, often in one linear pass.
Naive Bayes, Rocchio, kNN are all examples of this.
(ii) Iterative algorithms
Support vector machines
Perceptron (example available as PDF on website:
http://ifnlp.org/ir/pdf/p.pdf)
The best performing learning algorithms usually
require iterative learning.
Which hyperplane?
For linearly separable training sets: there are
infinitely many separating hyperplanes.
They all separate the training set perfectly . . .
. . . but they behave differently on test data.
Error rates on new data are low for some, high
for others.
How do we find a low-error separator?
Perceptron: generally bad; Naive Bayes, Rocchio:
ok; linear SVM: good
Linear classifiers: Discussion
Many common text classifiers are linear
classifiers: Naive Bayes, Rocchio, logistic
regression, linear support vector machines etc.
Each method has a different way of selecting the
separating hyperplane
Huge differences in performance on test
documents
Can we get better performance with more
powerful nonlinear classifiers?
Not in general: A given amount of training data may suffice for estimating a linear boundary, but not for estimating a more complex nonlinear boundary.
A nonlinear problem
A linear classifier like Rocchio does badly on this task.
kNN will do well (assuming enough training data)
Which classifier do I use for a given TC
problem?
Is there a learning method that is optimal for all
text classification problems?
No, because there is a tradeoff between bias and
variance.
Factors to take into account:
How much training data is available?
How simple/complex is the problem? (linear vs.
nonlinear decision boundary)
How noisy is the problem?
How stable is the problem over time?
For an unstable problem, it's better to use a simple and robust classifier.
How to combine hyperplanes for > 2
classes?
One-of problems
One-of or multiclass classification
Classes are mutually exclusive.
Each document belongs to exactly one class.
Example: language of a document (assumption: no
document
contains multiple languages)
One-of classification with linear
classifiers
Combine two-class linear classifiers as follows for
one-of classification:
Run each classifier separately
Rank classifiers (e.g., according to score)
Pick the class with the highest score
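A minimal Python sketch of this combination, assuming each per-class classifier exposes a real-valued score function (names are illustrative):

```python
def one_of_classify(scorers, doc):
    """One-of (multiclass): run every two-class classifier and pick the
    class whose classifier assigns the highest score to the document."""
    return max(scorers, key=lambda c: scorers[c](doc))
```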
Any-of problems
Any-of or multilabel classification
A document can be a member of 0, 1, or many
classes.
A decision on one class leaves decisions open on
all other classes.
A type of independence (but not statistical
independence)
Example: topic classification
Usually: make decisions on the region, on the
subject area, on the industry and so on
independently
Any-of classification with linear
classifiers
Combine two-class linear classifiers as follows for
any-of classification:
Simply run each two-class classifier separately on
the test document and assign document
accordingly
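Again a minimal sketch, assuming each class has an independent binary decision function (names are illustrative):

```python
def any_of_classify(deciders, doc):
    """Any-of (multilabel): each two-class classifier decides independently;
    the document may receive zero, one, or several labels."""
    return [c for c, decide in deciders.items() if decide(doc)]
```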
Take-away today
Feature selection for text classification: How to
select a subset of available dimensions
Vector space classification: Basic idea of doing
text classification for documents that are
represented as vectors
Rocchio classifier: Rocchio relevance feedback
idea applied to text classification
k nearest neighbor classification
Linear classifiers
More than two classes
Resources
Chapter 13 of IIR (feature selection)
Chapter 14 of IIR
Resources at http://ifnlp.org/ir
Perceptron example
General overview of text classification: Sebastiani
(2002)
Text classification chapter on decision trees and perceptrons: Manning & Schütze (1999)
One of the best machine learning textbooks:
Hastie, Tibshirani & Friedman (2003)