Introduction
Syllabus
Syllabus - CLOs
At the end of the course, students will be able to –
6. Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep Learning, The MIT Press.
7. Lewis Tunstall, Leandro von Werra, and Thomas Wolf, Natural Language Processing with Transformers, O'Reilly Media.
Moodle Site
https://lms.nirmauni.ac.in/course/view.php?id=9264
Teaching & Evaluation Scheme
Teaching Scheme:
Theory: 3   Tutorial: 0   Practical: 2   Credits: 4
Evaluation Methodology:
Component   Details                        Weightage
SEE         Exam duration: 3.0 Hrs.        0.4
CE          Quizzes + Sess. + Assignment   0.3
LPW         Continuous + End Sem.          0.3 (0.6 + 0.4)
Breakup of CE
Exam      Quizzes (Unit 1 - Unit 3)   Assignments   Sessional
Numbers   2                           1             1
Marks     20                                        20
Assignments
1.
Practical List
Sr. No.   Title                                                           Hours
1         Introduction to text processing libraries                       02
2         Introduction to PyTorch for NLP                                 02
3         Word2Vec implementation                                         04
4         Sequence models for (i) Sequence Classification,
          (ii) Named-Entity Recognition, (iii) Machine Translation        06
➢ POS Tagging:
➢ Each token is assigned a part-of-speech (POS) tag.
➢ "The" (Determiner, DT), "quick" (Adjective, JJ), "brown" (Adjective, JJ), "fox" (Noun, NN), "jumps" (Verb, VBZ), "over" (Preposition, IN), "the" (Determiner, DT), "lazy" (Adjective, JJ), "dog" (Noun, NN).
➢ Annotated sentence: The/DT quick/JJ brown/JJ fox/NN jumps/VBZ over/IN the/DT lazy/JJ dog/NN
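The tagging above can be sketched as a minimal lookup tagger. The tag table is an assumption copied from the slide's example; real taggers (e.g. NLTK's `nltk.pos_tag`) use trained statistical models rather than a fixed table.

```python
# Toy POS tagger: table lookup over the slide's example tags (an assumption,
# not a trained model). Unknown words fall back to NN.
TAGS = {"the": "DT", "quick": "JJ", "brown": "JJ", "fox": "NN",
        "jumps": "VBZ", "over": "IN", "lazy": "JJ", "dog": "NN"}

def pos_tag_toy(sentence):
    """Tag each whitespace-separated token via table lookup."""
    return [(tok, TAGS.get(tok.lower(), "NN")) for tok in sentence.split()]

tagged = pos_tag_toy("The quick brown fox jumps over the lazy dog")
# Join tokens and tags into the slide's token/TAG notation.
annotated = " ".join(f"{tok}/{tag}" for tok, tag in tagged)
# annotated == "The/DT quick/JJ brown/JJ fox/NN jumps/VBZ over/IN the/DT lazy/JJ dog/NN"
```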
Introduction to NLP
➢ Feature Extraction
➢ Bag of Words (BoW):
➢ Represents text as a collection (bag) of words, ignoring grammar and word order.
➢ Each word's occurrence is counted.
➢ Example: for the sentences "I like NLP" and "NLP is great":
➢ Vocabulary: [I, like, NLP, is, great]
➢ "I like NLP": [1, 1, 1, 0, 0]
➢ "NLP is great": [0, 0, 1, 1, 1]
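The example vectors above can be reproduced with a minimal bag-of-words sketch. (In practice one would use scikit-learn's `CountVectorizer`, but note that its default tokenizer drops one-letter tokens such as "I", so its output would differ from the slide.)

```python
# Minimal bag-of-words: count each vocabulary word's occurrences per sentence.
vocab = ["I", "like", "NLP", "is", "great"]  # fixed order, as on the slide

def bow_vector(sentence, vocab):
    """Return the count of each vocabulary word in the sentence."""
    tokens = sentence.split()
    return [tokens.count(word) for word in vocab]

bow_vector("I like NLP", vocab)    # [1, 1, 1, 0, 0]
bow_vector("NLP is great", vocab)  # [0, 0, 1, 1, 1]
```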
➢ Term Frequency-Inverse Document Frequency (TF-IDF):
➢ Term Frequency (TF) measures how frequently a term (word) appears in a document.
➢ It is calculated as the ratio of the number of times the term appears in the document to the total number of terms in the document: TF(t, d) = count(t, d) / (total terms in d).
➢ Inverse Document Frequency (IDF) measures how important a
term is.
➢ While computing TF, all terms are considered equally
important. However, certain terms like "is", "of", and "that"
may appear frequently but have little importance.
➢ Thus, we need to weigh down the frequent terms while scaling up the rare ones by computing the following:
➢ IDF(t) = log(N / df(t)), where N is the total number of documents and df(t) is the number of documents containing the term t.
➢ TF-IDF combines the two measures to give a composite
weight for each term in a document, reflecting both the
term's frequency in the document and its rarity across the
entire document set.
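The definitions above can be combined in a minimal sketch, using the two bag-of-words example sentences as the document set. This uses the plain (unsmoothed) IDF; library implementations such as scikit-learn's `TfidfVectorizer` apply smoothing and normalization, so their numbers differ.

```python
import math

# Toy corpus from the BoW example, tokenized by whitespace.
docs = [doc.split() for doc in ["I like NLP", "NLP is great"]]

def tf(term, doc):
    """Term frequency: count of the term divided by document length."""
    return doc.count(term) / len(doc)

def idf(term, docs):
    """Inverse document frequency: log(N / df), unsmoothed."""
    df = sum(1 for d in docs if term in d)
    return math.log(len(docs) / df)

def tf_idf(term, doc, docs):
    """Composite weight: frequent in this document, rare across the set."""
    return tf(term, doc) * idf(term, docs)

# "NLP" appears in every document, so its IDF (and TF-IDF) is 0;
# "like" appears in only one document, so it gets a positive weight.
tf_idf("NLP", docs[0], docs)   # 0.0
tf_idf("like", docs[0], docs)  # (1/3) * log(2) ≈ 0.231
```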