NLP Session I - Unit I and II
on
Natural Language Processing
[Elective V: 410252 A]
BE Computer Engineering 2019 Course
Organized By
S.T.E.S.'s Sinhgad Institute of Technology
in association with
BOS Computer Engineering, SPPU, Pune
(20th January 2023)
Course Objectives

01. To be familiar with fundamental concepts and techniques of natural language processing (NLP)
03. To develop the various language modeling techniques for NLP
04. To use appropriate tools and techniques for processing natural languages
06. To describe applications of NLP and Machine Translation

F.O.P. on NLP – Prof. Deptii Chaudhari, Pune
Course Outcomes

CO1. Describe the fundamental concepts of NLP, challenges and issues in NLP
CO4. Integrate the NLP techniques for the information retrieval task
CO-PO Mapping

CO/PO  PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12
CO1     2   2   1   -   -   -   -   -   -   -    -    -
CO2     3   3   2   2   2   -   -   -   -   -    -    1
CO3     2   3   3   2   2   -   -   -   -   -    -    2
CO4     2   2   3   3   3   -   2   2   -   -    -    3
CO5     2   2   3   3   3   -   -   -   -   -    -    3
CO6     3   3   3   3   3   2   1   1   -   -    -    3
Sample Questions

Question | CO | BTL
Differentiate between natural languages and programming languages. | CO1 | BTL 2
Explain the various types of ambiguities in natural languages. | CO1 | BTL 2
… example. | CO2 | BTL 4
Teaching Methodologies

▪ Research Papers: identify research papers & ask students to read & present
▪ Beyond Syllabus: Generative Models, Transformers for NLP
▪ Mini Projects: covering all units
Learning Resources

Text Books:
1. Jurafsky, David, and James H. Martin, "Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition", Pearson Publication
2. Manning, Christopher D., and Hinrich Schütze, "Foundations of Statistical Natural Language Processing", Cambridge, MA: MIT Press

Reference Books:
1. Steven Bird, Ewan Klein, Edward Loper, "Natural Language Processing with Python – Analyzing Text with the Natural Language Toolkit", O'Reilly Media
2. Dipanjan Sarkar, "Text Analytics with Python: A Practical Real-World Approach to Gaining Actionable Insights from your Data", Apress Publication, ISBN: 9781484223871
3. Alexander Clark, Chris Fox, and Shalom Lappin, "The Handbook of Computational Linguistics and Natural Language Processing", Wiley Blackwell Publications
4. Jacob Eisenstein, "Natural Language Processing", MIT Press
5. Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze, "An Introduction to Information Retrieval", Cambridge University Press
Unit I: Introduction to Natural Language Processing

Introduction:
✓ What is Natural Language Processing? Why is NLP hard?
✓ Programming Languages vs Natural Languages
✓ Are natural languages regular?
✓ Finite automata for NLP
✓ Stages of NLP
✓ Tokenization
✓ Stemming
✓ Lemmatization
✓ Part of Speech Tagging
Case Study: Why English is not a regular language
Mapping to CO: CO1
What is Natural Language Processing?

▪ Concerned with the computational processing of natural language: its understanding, generation, acquisition.
What is Natural Language Processing?

[Diagram: Natural Language → Computer → Natural Language]
▪ Natural Language Understanding (NLU)
▪ Natural Language Generation (NLG)
What is Natural Language Processing?

[Diagram: NLU — Morphological Analysis, Syntactic Analysis, Semantic Analysis, Discourse Analysis; NLG — working out a way to express it: Deep Planning, Syntactic Parsing]
Core NLP Pipeline

[Diagram: Input Data → Information Management → Core NLP Processing (Deterministic + Probabilistic) → Decision Making → Output Consumption, with Data Storage underneath]
Alternative Views of NLP

Goals of NLP
Levels of Linguistic Representation

Pragmatics
Discourse
Semantics
Syntax
Lexemes
Morphology
Phonology / Orthography
Phonetics
Speech / Text
Stages of NLP

▪ Lexical Analysis: process of converting a sequence of characters into a sequence of tokens.
▪ Syntactic Analysis: arranges the words of a sentence to show the grammatical relationships between them.
▪ Semantic Analysis: process of looking for meaning in a statement.
▪ Discourse Integration: the meaning of the preceding sentences influences the sentence in consideration.
▪ Pragmatic Analysis: uses a set of rules that describe cooperative dialogues to help you find the intended result.
Why is NLP hard?

Ambiguity at Many Levels
➢ At acoustic level
  ➢ Homophones: words that sound similar but mean different things.
    E.g., "I am going to buy an apple." — "Apple" (company) vs. "apple" (fruit)
  ➢ Word Boundary: 'Aajayenge' (will come) vs. 'Aaj ayenge' (will come today)
  ➢ Phrase Boundary: the spoken sentence "I got a plate" can be broken up in two different ways:
    ➢ either as "I got up late", meaning I woke up late,
    ➢ or as "I got a plate", meaning I have a plate with me.
  ➢ Disfluency: concerned with how a speaker intersperses his sentences with meaningless sounds just to be able to continue speaking.
➢ At Lexical level
  ➢ Different structures lead to different interpretations.

Ambiguity at Many Levels
➢ Structural
  ➢ "The cameraman shot the man with the gun when he was near Tendulkar."
  ➢ "Aid for kins of cops, killed in terrorist attacks"
➢ At Semantic Level
  ➢ Word Sense Ambiguity
    ➢ They put money in the bank.
      ➢ = financial institution? or = river bank (buried in mud)?
    ➢ I saw a boy with a telescope.
Programming Languages vs Natural Languages

▪ Natural languages communicate in both logical and emotional ways; none of this applies to programming languages.
Programming Languages vs Natural Languages

▪ Programming languages are artificial creations: rules and definitions were designed beforehand, which allows them to be fully described and studied in their entirety. They have a self-defining grammar which doesn't change depending on the context.
▪ Natural languages are natural creations whose grammar changes as per context, with irregular punctuation.
Challenges and Issues (Open Problems) in NLP

▪ Intentions
▪ Domain specific / Low Resource Languages
▪ Lack of data, benchmarks, standards
▪ Tokenization: process of breaking down a text into tokens, or a given paragraph into a list of sentences or words.
▪ Stemming: process of reducing the inflectional words to their root forms; maps the words to a same stem even if the stem is not a valid word in the language.
▪ Lemmatization: unlike stemming, lemmatization reduces the inflected words properly, ensuring that the root word belongs to the language.
▪ Part of Speech Tagging: a process that attaches each word in a sentence with a suitable tag from a given set of tags.
Tokenization

Why Tokenize?
▪ Unstructured data and natural language text is broken into chunks of information that can be understood by a machine.
▪ Converts an unstructured string (text document) into a numerical data structure suitable for machine learning, which allows the machines to understand each of the words by themselves, as well as how they function in the larger text.
▪ First crucial step of the NLP process, as it converts sentences into understandable bits of data.
▪ Without proper/correct tokenization, the NLP process can quickly devolve into a chaotic task.

Challenges
▪ Dealing with segmenting words when spaces or punctuation marks define the boundaries of the word. For example: don't
▪ Dealing with symbols that might change the meaning of the word significantly. For example: ₹100 vs 100
▪ Contractions such as 'you're' and 'I'm'
▪ Not directly applicable for symbol-based languages like Chinese, Japanese, Korean, Thai, Hindi, Urdu, Tamil, and others
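The contraction and currency-symbol challenges above can be sketched with a small regular-expression tokenizer. This is an illustrative toy (the pattern is my own, not from the slides, and not a production tokenizer):

```python
import re

# Keep contractions like "don't" together; split currency symbols from
# amounts; emit punctuation as its own token.
TOKEN_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?|\d+(?:\.\d+)?|[^\w\s]")

def tokenize(text):
    """Split text into word, number, and symbol tokens."""
    return TOKEN_RE.findall(text)

print(tokenize("I'm paying ₹100, don't wait!"))
# → ["I'm", 'paying', '₹', '100', ',', "don't", 'wait', '!']
```

Note how the ₹ sign becomes a separate token while the apostrophe stays inside the contraction — exactly the two cases listed as challenges.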
Types of Tokenization

1. Word Tokenization
▪ Most common way of tokenization; uses natural breaks, like pauses in speech or spaces in text, and splits the data into its respective words using delimiters (characters like ',' or ';').
▪ Word tokenization's accuracy is based on the vocabulary it is trained with. Unknown or Out-Of-Vocabulary (OOV) words cannot be tokenized.
2. White Space Tokenization
▪ Simplest technique; uses white spaces as the basis of splitting.
▪ Works well for languages in which white space breaks apart the sentence into meaningful words.
3. Rule-Based Tokenization
▪ Uses a set of rules that are created for the specific problem.
Stemming Vs Lemmatization

Word      Stem     Lemma
change    chang    change
changing  chang    change
changes   chang    change
changed   chang    change
changer   chang    changer

Stemming
▪ Porter Stemming: uses suffix stripping to produce stems.
▪ Lancaster Stemming: works with a table containing about 120 rules indexed by the last letter of a suffix.

Lemmatization
▪ WordNet Lemmatization: uses the WordNet database to look up lemmas of the words.
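The difference can be sketched in a few lines: stemming strips suffixes mechanically, while lemmatization needs a vocabulary lookup. Both the suffix list and the lemma dictionary below are toy assumptions (not Porter's or WordNet's actual rules), mirroring the "chang" vs "change" table above:

```python
# Toy suffix stripping: mechanically chop a known suffix off the word.
SUFFIXES = ["ing", "es", "ed", "er", "e", "s"]

def crude_stem(word):
    for suf in SUFFIXES:
        if word.endswith(suf) and len(word) > len(suf) + 2:
            return word[: -len(suf)]
    return word

# Toy lemmatizer: a hand-made lemma dictionary standing in for WordNet.
LEMMAS = {"changing": "change", "changes": "change",
          "changed": "change", "changer": "changer", "change": "change"}

def lemmatize(word):
    return LEMMAS.get(word, word)

for w in ["change", "changing", "changes", "changed", "changer"]:
    print(w, "->", crude_stem(w), "/", lemmatize(w))
```

All five forms stem to the non-word "chang", while the lemmatizer returns real words ("change", "changer") — the exact contrast the table illustrates.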
Parts of Speech Tagging

▪ Part-of-speech tagging (or just tagging for short) is the process of assigning a part-of-speech or other syntactic class marker to each word in a corpus.
▪ Because tags are generally also applied to punctuation, tagging requires that the punctuation marks (period, comma, etc.) be separated off of the words.
▪ Thus tokenization is usually performed before, or as part of, the tagging process, separating commas, quotation marks, etc., from words.
▪ Example: Hand me that book. → book is a noun.
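The simplest tagging idea can be sketched as a unigram "most frequent tag" tagger, trained on a toy hand-tagged corpus (the corpus and tags below are illustrative assumptions, not real data):

```python
from collections import Counter, defaultdict

# Toy tagged corpus: "book" is seen twice as a noun, once as a verb.
tagged_corpus = [("hand", "VB"), ("me", "PRP"), ("that", "DT"),
                 ("book", "NN"), ("book", "VB"), ("book", "NN"),
                 ("that", "DT"), ("flight", "NN")]

counts = defaultdict(Counter)
for word, tag in tagged_corpus:
    counts[word][tag] += 1

def unigram_tag(word):
    """Return the most frequent tag seen for word, NN as fallback."""
    return counts[word].most_common(1)[0][0] if word in counts else "NN"

print(unigram_tag("book"))  # → NN (2 NN vs 1 VB in the toy corpus)
```

This context-free approach is exactly what the "simplest stochastic taggers" described in the next slide do, and it fails on "Book that flight", where "book" should be a verb.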
Types of POS Taggers

Rule-Based Taggers
▪ Involve a large database of hand-written rules.
▪ Disambiguation is done by analyzing the linguistic features of the word, its preceding word, its following word, and other aspects.
▪ Example of a rule: if an ambiguous/unknown word X is preceded by a determiner and followed by a noun, tag it as an adjective.
▪ Example of a rule-based tagger is Brill's Tagger.

Stochastic Taggers
▪ Any model which somehow incorporates frequency or probability.
▪ The simplest stochastic taggers disambiguate words based solely on the probability that a word occurs with a particular tag.
▪ The problem with this approach is that while it may yield a valid tag for a given word, it can also yield inadmissible sequences of tags.
▪ An alternative approach is to calculate the probability of a given sequence of tags occurring, known as the n-gram approach, referring to the fact that the best tag for a given word is determined by the probability that it occurs with the n previous tags.
Hidden Markov Model Tagging

• Section 5.5 – HMM Part-of-Speech Tagging – Jurafsky, David, and James H. Martin, "Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition"
• https://www.freecodecamp.org/news/an-introduction-to-part-of-speech-tagging-and-the-hidden-markov-model-953d45338f24/
• https://www.freecodecamp.org/news/a-deep-dive-into-part-of-speech-tagging-using-viterbi-algorithm-17c8de32e8bc
• https://medium.com/data-science-in-your-pocket/pos-tagging-using-hidden-markov-models-hmm-viterbi-algorithm-in-nlp-mathematics-explained-d43ca89347c4
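The Viterbi decoding the references above describe can be sketched with a toy HMM. The transition and emission probabilities below are illustrative numbers I set by hand, not estimates from any corpus:

```python
import math

states = ["DT", "NN", "VB"]
start_p = {"DT": 0.6, "NN": 0.3, "VB": 0.1}
trans_p = {"DT": {"DT": 0.05, "NN": 0.9, "VB": 0.05},
           "NN": {"DT": 0.1, "NN": 0.3, "VB": 0.6},
           "VB": {"DT": 0.5, "NN": 0.4, "VB": 0.1}}
emit_p = {"DT": {"the": 0.7, "a": 0.3},
          "NN": {"dog": 0.4, "barks": 0.1, "the": 0.01},
          "VB": {"barks": 0.5, "dog": 0.05}}

def viterbi(words):
    # V[t][s] = best log-prob of any tag sequence ending in state s at time t
    V = [{s: math.log(start_p[s]) + math.log(emit_p[s].get(words[0], 1e-8))
          for s in states}]
    back = []
    for w in words[1:]:
        col, ptr = {}, {}
        for s in states:
            prev = max(states, key=lambda p: V[-1][p] + math.log(trans_p[p][s]))
            col[s] = (V[-1][prev] + math.log(trans_p[prev][s])
                      + math.log(emit_p[s].get(w, 1e-8)))
            ptr[s] = prev
        V.append(col)
        back.append(ptr)
    # Trace back the best tag path from the best final state.
    last = max(states, key=lambda s: V[-1][s])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))

print(viterbi(["the", "dog", "barks"]))  # → ['DT', 'NN', 'VB']
```

Unlike the unigram idea, the transition probabilities let the tagger penalize inadmissible tag sequences, which is the point made on the previous slide.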
Unit II: Language Syntax and Semantics

Morphological Analysis:
✓ What is Morphology?
✓ Types of Morphemes
✓ Inflectional morphology & Derivational morphology
✓ Morphological parsing with Finite State Transducers (FST)
Syntactic Analysis:
✓ Syntactic Representations of Natural Language
✓ Parsing Algorithms
Semantic Analysis:
✓ Lexical Semantics
✓ Relations among lexemes & their senses – Homonymy, Polysemy, Synonymy, Hyponymy, WordNet, Word Sense Disambiguation (WSD)
✓ Dictionary based approach
✓ Latent Semantic Analysis
Case Studies: Study of Stanford Parser and POS Tagger
https://nlp.stanford.edu/software/lex-parser.html, https://nlp.stanford.edu/software/tagger.html
Mapping to CO: CO2
Morphological Analysis

▪ Morpheme is the important unit of morphology, defined as the "minimal unit of meaning" or "the minimal unit of grammatical analysis".
▪ Example: "Jason feels very un-happi-ness today" — there are three morphemes in "unhappiness", each carrying a unit of meaning.
Types of Morphology

▪ Inflectional Morphology: modifies a word to express different grammatical categories without changing its word class.
▪ Derivational Morphology: creates a new word, often of a different grammatical category.
▪ Cliticization: combines a word stem with a clitic.
Morphological Parsing

To build a morphological parser, we need:
1. lexicon: the list of stems and affixes, together with basic information about them (whether a stem is a Noun stem or a Verb stem, etc.).
2. morphotactics: the model of morpheme ordering that explains which classes of morphemes can follow other classes of morphemes inside a word.
3. orthographic rules: these spelling rules are used to model the changes that occur in a word, usually when two morphemes combine (e.g., the y→ie spelling rule that changes city + -s to cities rather than citys).
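The y→ie orthographic rule can be sketched as a tiny spelling function. This is a toy illustration of the rule, not a full finite-state transducer:

```python
import re

def pluralize(stem):
    """Apply toy English spelling rules when combining stem + -s."""
    if re.search(r"[^aeiou]y$", stem):        # consonant + y: y -> ie
        return stem[:-1] + "ies"
    if re.search(r"(s|x|z|ch|sh)$", stem):    # sibilant ending: insert e
        return stem + "es"
    return stem + "s"

print(pluralize("city"), pluralize("fox"), pluralize("dog"))
# → cities foxes dogs
```

A real FST implements the same idea as transitions over character pairs, so the rule applies in both the analysis (cities → city + -s) and generation directions.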
Syntactic Analysis

▪ The word syntax comes from the Greek sýntaxis, meaning "setting out together or arrangement", and refers to the way words are arranged together.
Notions of Syntax and Grammar

▪ Constituency
  ▪ Groups of words may behave as a single unit or phrase, called a constituent.
  ▪ Example: On September seventeenth, I'd like to fly from Pune to Delhi.
▪ Grammatical relations
  ▪ A formalization of ideas from traditional grammar such as SUBJECTS and OBJECTS.
  ▪ Example: She ate a mammoth breakfast. Here the noun phrase She is the SUBJECT and a mammoth breakfast is the OBJECT.
▪ Subcategorization
  ▪ Verbs differ in the complements they can take; these are called facts about the subcategorization of the verb.
Context Free Grammars

▪ The symbols that correspond to words in the language are called terminal symbols; the lexicon is the set of rules that introduce these terminal symbols.
▪ The symbols that express clusters or generalizations of these are called non-terminals.
▪ In each context free rule, the item to the right of the arrow (→) is an ordered list of one or more terminals and non-terminals, while to the left of the arrow is a single non-terminal symbol expressing some cluster or generalization.
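A CFG of this form can be represented as a plain dictionary mapping each non-terminal to its right-hand sides, and used to generate sentences. The grammar below is a toy of my own, not one from the slides:

```python
import random

# Each non-terminal (left of ->) maps to a list of right-hand sides,
# which mix non-terminals and terminal words.
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["DT", "NN"]],
    "VP": [["VB", "NP"]],
    "DT": [["the"], ["a"]],
    "NN": [["pilot"], ["plane"]],
    "VB": [["flies"], ["lands"]],
}

def generate(symbol="S"):
    """Expand a symbol by picking one rule until only terminals remain."""
    if symbol not in GRAMMAR:          # terminal: emit the word itself
        return [symbol]
    rhs = random.choice(GRAMMAR[symbol])
    return [word for part in rhs for word in generate(part)]

random.seed(0)
print(" ".join(generate()))
```

Every derivation replaces the single non-terminal on the left with the ordered list on the right, which is exactly the arrow notation described above.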
Context Free Grammars

A context-free grammar G is defined by four parameters (N, Σ, R, S):
▪ N: a set of non-terminal symbols
▪ Σ: a set of terminal symbols (disjoint from N)
▪ R: a set of rules, each of the form A → β, where A is a non-terminal and β is a string of symbols from the infinite set of strings (Σ ∪ N)∗
▪ S: a designated start symbol
Parsing Algorithms
Parsing Algorithms – Ambiguity
Dependency Parsing

▪ The syntactic structure of a sentence is described in terms of its words and the binary grammatical Dependency Relations between them.
▪ Dependency parsing is the task of extracting a dependency parse of a sentence that represents its grammatical structure and defines the relationships between "head" words and words, which modify those heads.
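A dependency parse can be represented simply as (dependent, head, relation) triples. The example below is illustrative only; the relation labels follow the Universal Dependencies convention:

```python
# Hand-written dependency parse for "She ate a mammoth breakfast":
# the verb "ate" is the root, every other word has exactly one head.
sentence = ["She", "ate", "a", "mammoth", "breakfast"]
arcs = [
    ("She", "ate", "nsubj"),        # subject depends on the head verb
    ("ate", "ROOT", "root"),        # main verb is the root of the tree
    ("a", "breakfast", "det"),
    ("mammoth", "breakfast", "amod"),
    ("breakfast", "ate", "obj"),    # object depends on the head verb
]

heads = {dep: head for dep, head, _ in arcs}
print(heads["She"])  # → ate
```

Since each word has exactly one head and there is one root, the arcs form a tree over the words — the structure a dependency parser must produce.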
Probabilistic Context Free Grammars

▪ A PCFG is a context-free grammar in which each rule is associated with a probability; the probabilities of all rules expanding the same non-terminal sum to 1.
Probabilistic Context Free Grammars

[Figure: two alternative parse trees for a sentence containing "bullets"; one parse uses the rules DT 1.0, NN 0.5, VP 0.4, PP 1.0, the other DT 1.0, NN 0.5, VBD 1.0, NP 0.2]
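In a PCFG, the probability of a parse is the product of the probabilities of all rules used in its derivation, so competing parses of an ambiguous sentence can be compared. The rule set below is an assumption that echoes the figure's fragments (DT 1.0, NN 0.5, VP 0.4, ...); the sentence and full grammar are not given in the slides:

```python
from functools import reduce

# One candidate parse, listed as (rule, probability) pairs.
parse_rules = [
    ("S -> NP VP", 1.0),
    ("NP -> DT NN", 0.5),
    ("DT -> the", 1.0),
    ("NN -> gunman", 0.5),   # hypothetical lexical rule
    ("VP -> VBD NP PP", 0.4),
]

# P(parse) = product of the probabilities of the rules used.
prob = reduce(lambda p, rule: p * rule[1], parse_rules, 1.0)
print(prob)  # → 0.1
```

Computing this product for each of the two trees in the figure and picking the larger one is how a PCFG resolves the attachment ambiguity.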
Lexical Semantics

▪ Lexical semantics is concerned with the intrinsic characteristics of word meaning, semantic relationships between words, and how word meaning is related to syntactic structure.
Relations between lexical items

▪ Homonymy refers to words that are spelled and pronounced the same but have different, unrelated meanings (e.g., bank).
▪ Polysemy refers to a single word having multiple related senses.
▪ Synonymy refers to words that are pronounced and spelled differently but have the same meaning.
▪ Hyponymy refers to a word whose meaning is included in that of a more general word (e.g., dog is a hyponym of animal).
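The dictionary-based approach to Word Sense Disambiguation from the Unit II outline can be sketched with simplified Lesk: pick the sense whose gloss shares the most words with the context. The two "bank" glosses below are hand-written toy entries, not a real dictionary:

```python
# Toy sense inventory: sense name -> gloss text.
SENSES = {
    "bank_financial": "an institution where people deposit money",
    "bank_river": "sloping land beside a body of water or river",
}

def lesk(context_words, senses=SENSES):
    """Return the sense whose gloss overlaps most with the context."""
    def overlap(gloss):
        return len(set(gloss.split()) & set(context_words))
    return max(senses, key=lambda s: overlap(senses[s]))

print(lesk("they put money in the bank".split()))  # → bank_financial
```

Here "money" appears in the financial gloss but not the river gloss, so the financial sense wins — the same disambiguation the "They put money in the bank" example calls for.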
NLP Online Courses

https://lenavoita.github.io/nlp_course/word_embeddings.html
https://www.fast.ai/posts/2019-07-08-fastai-nlp.html
https://www.udemy.com/course/natural-language-processing-with-bert/
Write to Me..