Lecture 1
Course Requirements
Background
• Machine Learning
• Proficiency in Python: Programming assignments and projects will
require use of Python, NumPy and Pandas.
Books:
1. Practical Natural Language Processing with Python by Mathangi Sri.
2. Daniel Jurafsky and James H. Martin. 2009. Speech and Language
Processing: An Introduction to Natural Language Processing, Speech
Recognition, and Computational Linguistics. 2nd edition. Prentice-Hall.
Grading
• Mid-Term (30%)
• Quiz & Assignments (20%)
• End term (30%)
• Final project (20%): Teams of 4/5 students
Why Study NLP?
• Large volumes of textual data
news articles, web pages, scientific articles, patents, emails, government documents, Tweets, Facebook posts,
comments, Quora ...
Natural language processing helps computers communicate with humans in their own language and scales
other language-related tasks.
NLP makes it possible for computers to read text, hear speech, interpret it, measure sentiment and determine
which parts are important.
Given the amount of unstructured data generated every day, from medical records to social media,
automation will be critical to analyze text and speech data fully and efficiently.
“little French” is ambiguous: we don’t know whether it refers to the French language or to a person.
IDIOMS
Dark horse
Ball in your court
Burn the midnight oil
Steps of Natural Language Processing
Lexical Analysis/Morphological Processing
In lexical analysis, we divide a whole chunk of text into paragraphs, sentences,
and words. It involves identifying and analyzing the structure of words.
In this phase, paragraphs and sentences are broken into tokens. These tokens are
the smallest units of text. A paragraph may also be divided into sentences.
For example, “He is from IIMK.” is divided into [‘He’, ‘is’, ‘from’, ‘IIMK’, ‘.’].
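The tokenization step above can be sketched with a regular expression. This is a minimal illustration, not a production tokenizer: the function name `tokenize` and the pattern are assumptions made for this example, and real tokenizers also handle abbreviations, contractions and URLs, which this one does not.

```python
import re

def tokenize(text):
    """Split text into word tokens and single punctuation tokens."""
    # \w+ matches a run of letters, digits or underscores (a word);
    # [^\w\s] matches one character that is neither a word character
    # nor whitespace (a punctuation mark).
    return re.findall(r"\w+|[^\w\s]", text)

print(tokenize("He is from IIMK."))
# ['He', 'is', 'from', 'IIMK', '.']
```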
Syntax Analysis
Checks whether a sentence is well formed and breaks it up into a structure that shows the
syntactic relationships between the different words.
Ex: “The school goes to the boy” would be rejected by a syntax analyzer or parser.
Syntactic analysis (Parser): Read the tokens and output a parse tree or report syntax
errors.
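The parser behaviour described above (read tokens, output a parse tree or report a syntax error) can be sketched with a tiny hand-written parser. Everything here is an assumption made for illustration: the grammar (S → NP VP, NP → Det N, VP → V PP, PP → P NP), the five-word lexicon, and the names `LEXICON`, `ParseError` and `parse`. Real parsers use far larger grammars or statistical models.

```python
# Toy lexicon: maps each known word to its part-of-speech tag.
LEXICON = {"the": "Det", "boy": "N", "school": "N", "goes": "V", "to": "P"}

class ParseError(Exception):
    """Raised when the tokens do not fit the toy grammar."""

def parse(tokens):
    """Return a nested-tuple parse tree, or raise ParseError."""
    pos = 0  # index of the next token to consume

    def expect(tag):
        nonlocal pos
        if pos >= len(tokens):
            raise ParseError(f"expected {tag}, found end of input")
        word = tokens[pos]
        if LEXICON.get(word.lower()) != tag:
            raise ParseError(f"expected {tag}, found {word!r}")
        pos += 1
        return (tag, word)

    def noun_phrase():       # NP -> Det N
        return ("NP", expect("Det"), expect("N"))

    def verb_phrase():       # VP -> V PP, with PP -> P NP
        verb = expect("V")
        prep = expect("P")
        return ("VP", verb, ("PP", prep, noun_phrase()))

    tree = ("S", noun_phrase(), verb_phrase())   # S -> NP VP
    if pos != len(tokens):
        raise ParseError(f"unexpected token {tokens[pos]!r}")
    return tree

print(parse("the boy goes to the school".split()))

try:
    parse("boy the goes school to".split())
except ParseError as err:
    print("syntax error:", err)
```

The first call prints a tree rooted at `S`; the scrambled word order in the second call fails at the first token, because the grammar expects a determiner there.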
Semantic Analysis
• Semantic analysis captures the meaning of the given text while taking
into account context, the logical structure of sentences and grammatical
roles.
• Ex: “I ate hot ice cream” will get rejected by the semantic analyzer
because it doesn’t make sense.
• In semantic analysis, the relations between lexical items are identified.
• Sen 1: Students love IIMK.
• Sen 2: IIMK loves students.
• It tries to understand how combinations of individual words form the
meaning of the text.
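Sen 1 and Sen 2 above contain essentially the same words, yet the relation between them is reversed. A minimal sketch of that idea follows, assuming three-word subject-verb-object sentences only; the function name `svo_triple`, the role labels and the crude "strip a final s" verb normalisation are all assumptions for this example, not a real semantic analyzer.

```python
def svo_triple(sentence):
    """Map a three-word 'subject verb object' sentence to its roles."""
    words = sentence.rstrip(".").split()
    if len(words) != 3:
        raise ValueError("this toy only handles three-word sentences")
    subject, verb, obj = (w.lower() for w in words)
    # Crude normalisation so that 'love' and 'loves' compare equal.
    predicate = verb[:-1] if verb.endswith("s") else verb
    return {"agent": subject, "predicate": predicate, "patient": obj}

sen1 = svo_triple("Students love IIMK.")
sen2 = svo_triple("IIMK loves students.")
print(sen1)
print(sen2)
print(sen1 == sen2)  # False: same words, different roles
```

Both sentences share the predicate `love`, but the agent and patient are swapped, which is the kind of relation between lexical items that semantic analysis has to recover.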
Pragmatic Analysis
• Pragmatic analysis helps users discover the intended
effect of an utterance by applying a set of rules.
• Ex: “close the window?” should be interpreted as a request
instead of an order.
An example of tasks in natural language understanding
The NLP pyramid
Morphology
In morphology, most of the operations are at the word level.
Prefixes/Suffixes
Singularization/Pluralization
Gender detection
Word inflection
Lemmatization
Spellchecking
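As an illustration of a word-level operation from the list above, here is a minimal rule-based pluralizer. The function name `pluralize` and the three rules are assumptions chosen for this sketch; it covers only regular English nouns and deliberately ignores irregular forms such as "child" or "mouse".

```python
def pluralize(noun):
    """Return a regular English plural for a singular noun (toy rules)."""
    vowels = "aeiou"
    # Sibilant endings take -es: box -> boxes, church -> churches.
    if noun.endswith(("s", "x", "z", "ch", "sh")):
        return noun + "es"
    # Consonant + y becomes -ies: city -> cities (but boy -> boys).
    if noun.endswith("y") and len(noun) > 1 and noun[-2] not in vowels:
        return noun[:-1] + "ies"
    # Default rule: add -s.
    return noun + "s"

for word in ["word", "box", "city", "boy"]:
    print(word, "->", pluralize(word))
```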
Syntax
Syntax usually works on sentences.
Building Syntax Trees/Parse Trees
Building Dependency Trees
Semantics
Semantics derives meaning from text
Named Entity Extraction
Relation Extraction
Semantic Role Labelling
Word Sense Disambiguation
Pragmatics
Pragmatics analyses the text as a whole. It’s about
determining underlying narrative threads, topics, references.
Some tasks are:
Coreference / Anaphora resolution (find out which word
refers to what.
Example: Rahul is fine. He[Rahul]’s in no danger.)
Topic segmentation
Lexical chains
Text Summarization
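The coreference example in the task list above (Rahul / He) can be reproduced with a very naive heuristic: link each pronoun to the most recently mentioned known name. This is only a sketch under strong assumptions; the function name `annotate`, the `PRONOUNS` set and the caller-supplied `known_names` set are hypothetical, and real coreference resolvers use syntax, gender, number and learned models.

```python
import re

# Small, assumed pronoun list for this illustration.
PRONOUNS = {"he", "she", "him", "her"}

def annotate(text, known_names):
    """Tag each pronoun with the most recent preceding known name."""
    last_name = None
    pieces = []
    # Split into alternating word and non-word chunks so that joining
    # the pieces reproduces the original spacing and punctuation.
    for token in re.findall(r"\w+|\W+", text):
        if token in known_names:
            last_name = token
        elif token.lower() in PRONOUNS and last_name is not None:
            token = f"{token}[{last_name}]"
        pieces.append(token)
    return "".join(pieces)

print(annotate("Rahul is fine. He's in no danger.", {"Rahul"}))
# Rahul is fine. He[Rahul]'s in no danger.
```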
Illustration of different levels of text representation