Introduction
Syllabus
Syllabus - CLOs
At the end of the course, students will be able to –
6. Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep Learning, The MIT Press.
7. Lewis Tunstall, Leandro von Werra, and Thomas Wolf, Natural Language Processing with Transformers, O'Reilly Media.
Moodle Site
https://lms.nirmauni.ac.in/course/view.php?id=9264
Teaching & Evaluation Scheme
Teaching Scheme:
Theory: 3   Tutorial: 0   Practical: 2   Credits: 4
Evaluation Methodology:
Component   Details                        Weightage
SEE         Exam duration: 3.0 Hrs.        0.4
CE          Quizzes + Sess. + Assignment   0.3
LPW         Continuous + End Sem.          0.3 (0.6 + 0.4)
Breakup of CE
Exam      Quizzes (Unit 1 - Unit 3)   Assignments   Sessional
Numbers   2                           1             1
Marks     20                                        20
Assignments
1.
Practical List
Sr. No.   Title                                                           Hours
1         Introduction to text processing libraries                       02
2         Introduction to PyTorch for NLP                                 02
3         Word2Vec implementation                                         04
4         Sequence models for (i) Sequence Classification,
          (ii) Named-Entity Recognition, (iii) Machine Translation        06
➢ POS Tagging:
➢ Each token is assigned a part-of-speech (POS) tag.
➢ "The" (Determiner, DT), "quick" (Adjective, JJ), "brown" (Adjective, JJ), "fox" (Noun, NN), "jumps" (Verb, VBZ), "over" (Preposition, IN), "the" (Determiner, DT), "lazy" (Adjective, JJ), "dog" (Noun, NN).
➢ Annotated sentence: The/DT quick/JJ brown/JJ fox/NN jumps/VBZ over/IN the/DT lazy/JJ dog/NN
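The tagging above can be sketched as a minimal lookup tagger. The tag table is an assumption copied from the slide's example; real taggers (e.g. NLTK's `nltk.pos_tag`) use trained statistical models rather than a fixed table.

```python
# Toy POS tagger: table lookup over the slide's example tags (an assumption,
# not a trained model). Unknown words fall back to NN.
TAGS = {"the": "DT", "quick": "JJ", "brown": "JJ", "fox": "NN",
        "jumps": "VBZ", "over": "IN", "lazy": "JJ", "dog": "NN"}

def pos_tag_toy(sentence):
    """Tag each whitespace-separated token via table lookup."""
    return [(tok, TAGS.get(tok.lower(), "NN")) for tok in sentence.split()]

tagged = pos_tag_toy("The quick brown fox jumps over the lazy dog")
# Join tokens and tags into the slide's token/TAG notation.
annotated = " ".join(f"{tok}/{tag}" for tok, tag in tagged)
# annotated == "The/DT quick/JJ brown/JJ fox/NN jumps/VBZ over/IN the/DT lazy/JJ dog/NN"
```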
Introduction to NLP
➢ Feature Extraction
➢ Bag of Words (BoW):
➢ Represents text as a collection (bag) of words, ignoring grammar and word order.
➢ Each word's occurrence is counted.
➢ Example: for the sentences "I like NLP" and "NLP is great":
➢ Vocabulary: [I, like, NLP, is, great]
➢ "I like NLP": [1, 1, 1, 0, 0]
➢ "NLP is great": [0, 0, 1, 1, 1]
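The example vectors above can be reproduced with a minimal bag-of-words sketch. (In practice one would use scikit-learn's `CountVectorizer`, but note that its default tokenizer drops one-letter tokens such as "I", so its output would differ from the slide.)

```python
# Minimal bag-of-words: count each vocabulary word's occurrences per sentence.
vocab = ["I", "like", "NLP", "is", "great"]  # fixed order, as on the slide

def bow_vector(sentence, vocab):
    """Return the count of each vocabulary word in the sentence."""
    tokens = sentence.split()
    return [tokens.count(word) for word in vocab]

bow_vector("I like NLP", vocab)    # [1, 1, 1, 0, 0]
bow_vector("NLP is great", vocab)  # [0, 0, 1, 1, 1]
```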
➢ Term Frequency-Inverse Document Frequency (TF-IDF):
➢ Term Frequency (TF) measures how frequently a term (word) appears in a document.
➢ It is calculated as the ratio of the number of times the term appears in the document to the total number of terms in the document: TF(t, d) = count(t, d) / (total terms in d).
➢ Inverse Document Frequency (IDF) measures how important a
term is.
➢ While computing TF, all terms are considered equally
important. However, certain terms like "is", "of", and "that"
may appear frequently but have little importance.
➢ Thus, we need to weigh down the frequent terms while scaling up the rare ones by computing the following:
➢ IDF(t) = log(N / df(t)), where N is the total number of documents and df(t) is the number of documents containing the term t.
➢ TF-IDF combines the two measures to give a composite
weight for each term in a document, reflecting both the
term's frequency in the document and its rarity across the
entire document set.
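The definitions above can be combined in a minimal sketch, using the two bag-of-words example sentences as the document set. This uses the plain (unsmoothed) IDF; library implementations such as scikit-learn's `TfidfVectorizer` apply smoothing and normalization, so their numbers differ.

```python
import math

# Toy corpus from the BoW example, tokenized by whitespace.
docs = [doc.split() for doc in ["I like NLP", "NLP is great"]]

def tf(term, doc):
    """Term frequency: count of the term divided by document length."""
    return doc.count(term) / len(doc)

def idf(term, docs):
    """Inverse document frequency: log(N / df), unsmoothed."""
    df = sum(1 for d in docs if term in d)
    return math.log(len(docs) / df)

def tf_idf(term, doc, docs):
    """Composite weight: frequent in this document, rare across the set."""
    return tf(term, doc) * idf(term, docs)

# "NLP" appears in every document, so its IDF (and TF-IDF) is 0;
# "like" appears in only one document, so it gets a positive weight.
tf_idf("NLP", docs[0], docs)   # 0.0
tf_idf("like", docs[0], docs)  # (1/3) * log(2) ≈ 0.231
```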