NLP Syllabus
Word Level and Syntactic Analysis: Word Level Analysis: Regular Expressions, Finite-State Automata,
Morphological Parsing, Spelling Error Detection and Correction, Words and Word Classes,
Part-of-Speech Tagging. Syntactic Analysis: Context-Free Grammar, Constituency, Parsing,
Probabilistic Parsing.
Textbook 1: Ch. 3, 4 (08 Hours)
Module – III
Information Retrieval and Lexical Resources: Information Retrieval: Design Features of Information
Retrieval Systems, Classical, Non-classical, and Alternative Models of Information Retrieval, Evaluation.
Lexical Resources: WordNet, FrameNet, Stemmers, POS Tagger, Research Corpora.
Textbook 1: Ch. 9, 12 (08 Hours)
Module – IV
Annotating Linguistic Structure: Context-Free Grammars and Constituency Parsing, Dependency
Parsing.
Textbook 2: Ch. 17, 18 (08 Hours)
Module – V
NLP Applications:
Case Study: Machine Translation, Question Answering and Information Retrieval.
Textbook 2: Ch. 13, 14 (08 Hours)
Lab Components:
1. Develop a Python program to perform the following:
a. Read a Word file and extract the email IDs present in it using regular expressions.
b. Read text data, convert the text to lowercase, and remove punctuation and stop words.
2. Develop a Python program to illustrate text standardization and spell correction.
3. Develop a Python program to illustrate tokenization, stemming, and lemmatization.
4. Develop a Python program for text-to-feature conversion using:
a. One-hot encoding
b. Count Vectorizer
c. TF-IDF
5. Develop a Python program for generating N-grams.
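As a starting point for lab components 1, 4 (TF-IDF only), and 5, the following is a minimal sketch using only the Python standard library. It is not a prescribed solution. The function names, the short stop-word list, and the simplified email pattern are assumptions made for illustration, and the text is assumed to have already been read from the input file into a string. The TF-IDF variant shown (term count divided by document length, times log(N / df)) is one common textbook form; library implementations such as scikit-learn's use smoothed variants and will give different numbers.

```python
import math
import re
import string
from collections import Counter

# Illustrative stop-word list (an assumption; a real lab would likely use a fuller list).
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "and", "of", "in", "on", "or"}

# Simplified email pattern (an assumption); it does not cover every valid address form.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(text):
    """Lab 1a: return all email-like substrings found in the text."""
    return EMAIL_RE.findall(text)


def preprocess(text):
    """Lab 1b: lowercase, strip punctuation, drop stop words; returns a token list."""
    lowered = text.lower()
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in no_punct.split() if tok not in STOP_WORDS]


def ngrams(tokens, n):
    """Lab 5: return the n-grams (as tuples) over a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tf_idf(docs):
    """Lab 4 (TF-IDF): tf = count / document length, idf = log(N / document frequency)."""
    n_docs = len(docs)
    # Document frequency: number of documents in which each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


if __name__ == "__main__":
    sample = "Contact alice@example.com or bob.smith@mail.example.org."
    print(extract_emails(sample))
    tokens = preprocess("The cat sat on the mat, and the dog barked!")
    print(tokens)
    print(ngrams(tokens, 2))
    print(tf_idf([["cat", "sat"], ["cat", "ran"]]))
```

A term that occurs in every document gets a weight of zero under this TF-IDF form, which is why "cat" contributes nothing in the last example while "sat" and "ran" do.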
Course Outcomes:
CO1: Understand the fundamental concepts and techniques of Natural Language Processing.
CO2: Apply appropriate natural language generation, probabilistic classification, and semantic techniques
to solve real-world problems.
CO3: Apply information retrieval techniques to real-world problems.
CO4: Discover the usage of Context-Free Grammars and parsing.
CO5: Develop real-time applications for a given NLP problem.
Text Books:
1. Tanveer Siddiqui, U.S. Tiwary, “Natural Language Processing and Information Retrieval”,
Oxford University Press, 2008.
2. Daniel Jurafsky and James H. Martin, “Speech and Language Processing: An Introduction to Natural
Language Processing, Computational Linguistics and Speech Recognition”, 3rd Edition,
Prentice Hall, 2019.
Reference Books:
1. Anne Kao and Stephen R. Poteet (Eds.), “Natural Language Processing and Text Mining”,
Springer-Verlag London Limited, 2007.
2. James Allen, “Natural Language Understanding”, 2nd Edition, Benjamin/Cummings Publishing
Company, 1995.
3. Gerald J. Kowalski and Mark T. Maybury, “Information Storage and Retrieval Systems”, Kluwer
Academic Publishers, 2000.
Alternate Assessment Tools (AATs) suggested:
Experiential Learning / MOOC / Certification Courses (Infosys Springboard, GeeksforGeeks, IBM,
HackerEarth, MathWorks)
Model presentation
Video
Web links / e-resources:
1. https://nptel.ac.in/courses/106105158
2. https://archive.nptel.ac.in/courses/106/106/106106211/