
Natural Language Processing

Machine learning for NLP
Slides: CS583, Bing Liu, UIC


Supervised vs. unsupervised learning
• Supervised learning: classification is seen as
supervised learning from examples.
– Supervision: the data (observations,
measurements, etc.) are labeled with pre-defined
classes, as if a “teacher” had assigned the classes
(hence “supervision”).
– Test data are then classified into these same classes.
• Unsupervised learning (clustering)
– Class labels of the data are unknown.
– Given a set of data, the task is to establish the
existence of classes or clusters in the data; a short
code sketch of the contrast follows below.
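The following is a minimal sketch of the contrast, assuming scikit-learn and synthetic data (neither is part of the original slides): the classifier is given the labels y, while the clustering algorithm sees only the observations X.

# Minimal sketch: supervised classification vs. unsupervised clustering.
# scikit-learn is an assumed library choice; the data are synthetic.
from sklearn.datasets import make_blobs
from sklearn.svm import LinearSVC
from sklearn.cluster import KMeans

# Synthetic 2-class data: X holds the observations, y the "teacher's" labels.
X, y = make_blobs(n_samples=100, centers=2, random_state=0)

# Supervised: the learner sees both the data and the pre-defined labels.
clf = LinearSVC().fit(X, y)

# Unsupervised: the learner sees only the data and must find the clusters.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

print(clf.predict(X[:5]))   # predicted class labels
print(km.labels_[:5])       # discovered cluster assignments

The classifier can assign new data to the pre-defined classes; the clustering model can only report which discovered cluster each point falls into.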



Supervised learning process: two steps
• Learning (training): learn a model using the
training data.
• Testing: test the model on unseen test data to
assess its accuracy:

Accuracy = Number of correct classifications / Total number of test cases
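As a concrete sketch of the two steps (scikit-learn and synthetic data are illustrative assumptions, not part of the slides):

# Sketch of the train/test process and the accuracy measure above.
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC
from sklearn.metrics import accuracy_score

X, y = make_blobs(n_samples=200, centers=2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Step 1 (learning): fit a model on the training data only.
model = LinearSVC().fit(X_train, y_train)

# Step 2 (testing): accuracy = correct classifications / total test cases.
y_pred = model.predict(X_test)
print(accuracy_score(y_test, y_pred))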



What do we mean by learning?
• Given
– a data set D,
– a task T, and
– a performance measure M,
a computer system is said to learn from D to
perform task T if, after learning, the system’s
performance on T improves as measured by M.
• In other words, the learned model helps the
system perform T better than it would with no
learning; the sketch below makes this concrete.
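A sketch of the definition in code, assuming scikit-learn: performance M (here, accuracy) on task T is measured with and without learning from D.

# Sketch: a system "learns" if its performance measure M improves over
# a no-learning baseline. scikit-learn is an assumed library choice.
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.dummy import DummyClassifier
from sklearn.svm import LinearSVC

X, y = make_blobs(n_samples=200, centers=2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# No learning: always predict the most frequent training class.
baseline = DummyClassifier(strategy="most_frequent").fit(X_tr, y_tr)
# Learning from D: fit a real model.
learned = LinearSVC().fit(X_tr, y_tr)

print("M without learning:", baseline.score(X_te, y_te))
print("M with learning:   ", learned.score(X_te, y_te))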
Support vector machines
• SVMs are linear classifiers that find a hyperplane to
separate two classes of data, positive and negative.
• Kernel functions are used for nonlinear separation.
• SVMs not only have a rigorous theoretical foundation, but
also classify more accurately than most other methods in
practice, especially for high-dimensional data.
• They are perhaps the best classifiers for text
classification; a sketch follows below.
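A minimal text-classification sketch; the toy corpus, labels, and TF-IDF feature choice are illustrative assumptions rather than part of the lecture.

# Sketch: SVM text classification with TF-IDF features.
# The toy corpus and the feature choice are illustrative assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

docs = ["great movie, loved it", "terrible plot, boring",
        "wonderful acting", "awful and dull"]
labels = [1, -1, 1, -1]          # 1: positive class, -1: negative class

# TF-IDF turns each document into a high-dimensional sparse vector.
vec = TfidfVectorizer()
X = vec.fit_transform(docs)

clf = LinearSVC().fit(X, labels)
print(clf.predict(vec.transform(["loved the acting"])))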



Basic concepts
• Let the set of training examples D be
{(x1, y1), (x2, y2), …, (xr, yr)},
where xi = (xi1, xi2, …, xin) is an input vector in a real-
valued space X ⊆ R^n and yi is its class label (output value),
yi ∈ {1, -1}.
1: positive class and -1: negative class.
• SVM finds a linear function of the form (w: weight
vector)
f(x) = ⟨w · x⟩ + b
and classifies each example by the sign of this function:
yi = 1 if ⟨w · xi⟩ + b ≥ 0
yi = -1 if ⟨w · xi⟩ + b < 0
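A sketch showing the learned linear function directly, assuming scikit-learn's LinearSVC, whose coef_ and intercept_ play the roles of w and b:

# Sketch: recover w and b from a fitted linear SVM and apply
# f(x) = <w . x> + b directly. scikit-learn is an assumed choice.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import LinearSVC

X, y = make_blobs(n_samples=100, centers=2, random_state=0)
y = np.where(y == 1, 1, -1)           # relabel the classes as {1, -1}

clf = LinearSVC().fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]

f = X @ w + b                          # f(x) = <w . x> + b for every example
y_pred = np.where(f >= 0, 1, -1)       # classify by the sign of f(x)
print((y_pred == clf.predict(X)).all())  # matches the library's predictions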
The hyperplane
• The hyperplane that separates positive and negative
training data is
⟨w · x⟩ + b = 0
• It is also called the decision boundary (or decision
surface).
• There are many possible separating hyperplanes; which
one should we choose?



Maximal margin hyperplane
• SVM looks for the separating hyperplane with the largest
margin; a sketch of the margin computation follows below.
• Machine learning theory shows that this maximal margin
hyperplane minimizes the bound on the generalization error.
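For a hyperplane in canonical form, the margin is 2/||w|| (the distance between the supporting hyperplanes ⟨w · x⟩ + b = ±1), so maximizing the margin amounts to minimizing ||w||. A sketch of reading it off a fitted SVM, assuming scikit-learn:

# Sketch: the geometric margin of a fitted linear SVM is 2 / ||w||.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=100, centers=2, random_state=0)

# A large C approximates the hard-margin SVM on separable data.
clf = SVC(kernel="linear", C=1e6).fit(X, y)
w = clf.coef_[0]

print("margin =", 2 / np.linalg.norm(w))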

