4.chapter5 - Syntactic and Semantic Representations
AC3110E
Chapter 5: Syntactic and Semantic Representations
References:
+ Jurafsky, Daniel, and James H. Martin. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition
+ CS224N: Natural Language Processing with Deep Learning, Stanford / Winter 2023
Syntactic Representations
• Deal with the grammatical structure of sentences and the relations between their words and phrases
• Part-of-speech (POS) tagging:
• identifies the part-of-speech category of each word in a sentence based on its grammatical role and context
• [Syntactic] parsing:
• assigns a syntactic structure to a sentence, or identifies the syntactic relationships between its words, given the grammar rules of a language
1. Word Classes – POS tag set
• Open classes
• Nouns: for people, places, or things
• proper nouns, common nouns
• can occur with determiners, and may occur in the plural
• count nouns, mass nouns
• Verbs
• refer to actions and processes
• verbs can have inflections (eat – eats – eating – eaten)
• Adjectives
• describe properties or qualities of nouns
• Adverbs
• generally modify something (often verbs, but also other adverbs and entire verb phrases)
• manner adverbs (slowly), locative adverbs (here), temporal adverbs (yesterday), frequency adverbs (usually, rarely), degree adverbs (extremely, very)
• Interjections
• exclamation, greeting, yes/no response, etc. (oh, um, yes, hello)
• Closed classes
• Preposition/Postposition: marks a noun’s spatial, temporal, or other relation (in, on, by, under)
• Auxiliary: helping verb marking tense, aspect, mood, etc. (can, may, should, are)
• Coordinating Conjunction: joins two phrases/clauses (and, or, but)
• Determiner: marks noun phrase properties (a, an, the, this)
• Numeral (one, two, first, second)
• Particle: a preposition-like form used together with a verb (up, down, on, off, in, out,
at, by)
• Pronoun: a shorthand for referring to an entity or event (she, who, I, others)
• Possessive pronouns (my, your, his, her)
• Wh-pronouns (what, who, whom)
• Subordinating Conjunction: joins a main clause with a subordinate clause such as a sentential complement (that, which)
• Other
• Punctuation (. , ( ))
• Symbols ($,%) or emoji
• Universal Dependencies tagset
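The word classes above can be written down as a small lookup table. The sketch below assumes the 17 tag abbreviations of the Universal Dependencies tagset (they are not spelled out on the slide), and the example sentence is tagged by hand for illustration only.

```python
# UD part-of-speech tags grouped into the classes listed on the slide.
# The abbreviations come from the Universal Dependencies tagset, not from the slide.
UPOS_CLASSES = {
    "open": ["ADJ", "ADV", "INTJ", "NOUN", "PROPN", "VERB"],
    "closed": ["ADP", "AUX", "CCONJ", "DET", "NUM", "PART", "PRON", "SCONJ"],
    "other": ["PUNCT", "SYM", "X"],
}


def word_class(tag: str) -> str:
    """Return 'open', 'closed', or 'other' for a UD POS tag."""
    for cls, tags in UPOS_CLASSES.items():
        if tag in tags:
            return cls
    raise ValueError(f"unknown UD POS tag: {tag}")


# Hand-tagged example sentence (hypothetical, for illustration).
tagged = [("the", "DET"), ("dog", "NOUN"), ("barked", "VERB"),
          ("loudly", "ADV"), (".", "PUNCT")]

for word, tag in tagged:
    print(f"{word}\t{tag}\t{word_class(tag)}")
```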
2. Grammar
• Grammar rules:
S → NP VP
NP → John | garbage
VP → laughed | walks
• Productions: John laughed. John walks. Garbage laughed. Garbage walks. etc.
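As a rough illustration of how the toy grammar (S → NP VP, with John/garbage as NP and laughed/walks as VP) produces its sentences, the sketch below enumerates every derivation. The dictionary encoding and the `expand` helper are assumptions made for this example; it only works for grammars without recursion.

```python
from itertools import product

# Toy grammar from the slide: each non-terminal maps to its right-hand sides.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["John"], ["garbage"]],
    "VP": [["laughed"], ["walks"]],
}


def expand(symbol):
    """Return every word sequence derivable from `symbol` (non-recursive grammars only)."""
    if symbol not in GRAMMAR:  # a terminal expands to itself
        return [[symbol]]
    results = []
    for rhs in GRAMMAR[symbol]:
        # Expand each symbol of the right-hand side, then pick one expansion per symbol.
        parts = [expand(s) for s in rhs]
        for combo in product(*parts):
            results.append([word for part in combo for word in part])
    return results


sentences = [" ".join(words) for words in expand("S")]
print(sentences)
# ['John laughed', 'John walks', 'garbage laughed', 'garbage walks']
```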
3. Constituency Parsing
https://www.nltk.org/book
Constituency Parsing Approaches
• Rule-based approach
• a top-down approach
• based on rules/grammar
• grammar rules are written by hand as a CFG (context-free grammar)
• e.g., regex-based parser, CKY (Cocke-Kasami-Younger) parser, span-based neural constituency parsing, etc.
• Probabilistic approach
• a bottom-up approach
• learns rules/grammar using probabilistic models
• uses a PCFG (probabilistic context-free grammar), in which each rule is associated with a probability
3.1. Rule-based approach
sent="Mr. Obama played a big role in the Health insurance bill"
=>
(S
  Mr./NNP
  Obama/NNP
  (VP
    (V played/VBD)
    (NP a/DT big/JJ role/NN)
    (PP (P in/IN) (NP the/DT)))
  Health/NNP
  (NP insurance/NN bill/NN))
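The slide does not show the grammar that produced this output. As a minimal sketch of the regex-based idea, the code below finds only the NP chunks, assuming the rule "optional DT, then any JJ, then any NN" and using the POS tags copied from the output above. The `np_chunks` helper and the pattern are assumptions for illustration; the V, PP, and VP levels are left out.

```python
import re

# Tagged tokens copied from the slide's output.
tagged = [("Mr.", "NNP"), ("Obama", "NNP"), ("played", "VBD"), ("a", "DT"),
          ("big", "JJ"), ("role", "NN"), ("in", "IN"), ("the", "DT"),
          ("Health", "NNP"), ("insurance", "NN"), ("bill", "NN")]

# Assumed NP rule: optional determiner, any adjectives, any common nouns.
NP_PATTERN = re.compile(r"(?:<DT>)?(?:<JJ>)*(?:<NN>)*")


def np_chunks(tagged_tokens):
    """Return the word lists of all NP chunks matched over the tag sequence."""
    # Encode the tag sequence as one string, e.g. "<DT><JJ><NN>".
    tag_string = "".join(f"<{tag}>" for _, tag in tagged_tokens)
    chunks = []
    for match in NP_PATTERN.finditer(tag_string):
        if not match.group():  # the pattern can match the empty string; skip those
            continue
        start = tag_string[:match.start()].count("<")  # index of first token in the chunk
        length = match.group().count("<")              # number of tokens in the chunk
        chunks.append([word for word, _ in tagged_tokens[start:start + length]])
    return chunks


chunks = np_chunks(tagged)
print(chunks)
# [['a', 'big', 'role'], ['the'], ['insurance', 'bill']]
```

These three chunks correspond to the NP brackets in the output above.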
3.2 CKY (Cocke-Kasami-Younger) Parser
• CKY Recognition
[Figure: CKY recognition chart. Recoverable cell entries, by row:
row 0: S, VP, X2, Nominal, Verb
row 1: Det, NP, NP
row 2: Nominal, Nominal
row 3: Prep, PP
row 4: NP, Proper-Noun]
• CKY Parsing
“Book the flight through Houston”
Parsing 1:
(S (Verb book) (NP (Det the) (NN (NN flight) (PP (Prep through) (NP (NNP Houston))))))
Parsing 2:
(S (VP (Verb book) (NP (DT the) (NN flight))) (PP (IN through) (NP (NNP Houston))))
Parsing 3:
(S (X2 (Verb book) (NP (DT the) (NN flight))) (PP (IN through) (NP (NNP Houston))))
=> ambiguity problem
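A compact sketch of CKY on this sentence is given below. The grammar is not shown on the slide, so the rules and lexicon here are an assumed toy grammar in Chomsky normal form, chosen to be consistent with the chart symbols (S, VP, X2, Nominal, PP, ...). Words are lower-cased and the function names are hypothetical. The chart is filled bottom-up, and each cell stores the trees found for its span, so the same table supports both recognition and parsing.

```python
from collections import defaultdict

# Assumed binary rules (LHS, left child, right child); not taken from the slide.
BINARY_RULES = [
    ("S", "Verb", "NP"), ("S", "VP", "PP"), ("S", "X2", "PP"),
    ("VP", "Verb", "NP"), ("VP", "VP", "PP"), ("VP", "X2", "PP"),
    ("X2", "Verb", "NP"),
    ("NP", "Det", "Nominal"),
    ("Nominal", "Nominal", "PP"),
    ("PP", "Prep", "NP"),
]
# Assumed lexicon: the non-terminals that can directly produce each word.
LEXICON = {
    "book": ["S", "VP", "Verb", "Nominal"],
    "the": ["Det"],
    "flight": ["Nominal"],
    "through": ["Prep"],
    "houston": ["NP", "Proper-Noun"],
}


def cky(words):
    """Fill the chart: table[(i, j)][A] lists the trees rooted in A over words[i:j]."""
    n = len(words)
    table = defaultdict(lambda: defaultdict(list))
    for i, word in enumerate(words):  # length-1 spans come from the lexicon
        for tag in LEXICON.get(word, []):
            table[(i, i + 1)][tag].append(f"({tag} {word})")
    for span in range(2, n + 1):  # longer spans are built from shorter ones
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point between the two children
                for lhs, left_sym, right_sym in BINARY_RULES:
                    for left in table[(i, k)].get(left_sym, []):
                        for right in table[(k, j)].get(right_sym, []):
                            table[(i, j)][lhs].append(f"({lhs} {left} {right})")
    return table


def recognize(words):
    """CKY recognition: is there at least one S spanning the whole input?"""
    return bool(cky(words)[(0, len(words))].get("S"))


words = "book the flight through houston".split()
parses = cky(words)[(0, len(words))]["S"]
print(recognize(words), len(parses))
for tree in parses:
    print(tree)
```

Under this assumed grammar the sentence is recognized and the top cell holds three S trees (S → Verb NP, S → VP PP, S → X2 PP), matching the three parsings listed above.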
3.3. Statistical Constituency Parsing
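• Probability of a Tree:
• Each rule i in the tree is expressed as $\mathrm{LHS}_i \rightarrow \mathrm{RHS}_i$; the probability of the tree is the product of the probabilities of the n rules it uses:
$$P(T) = \prod_{i=1}^{n} P(\mathrm{RHS}_i \mid \mathrm{LHS}_i)$$
• Two candidate parse trees:
$$P(T_1) = .05 \times .20 \times .20 \times .20 \times .75 \times .30 \times .60 \times .10 \times .40 = 2.2 \times 10^{-6}$$
$$P(T_2) = .05 \times .10 \times .20 \times .15 \times .75 \times .75 \times .30 \times .60 \times .10 \times .40 = 6.1 \times 10^{-7}$$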
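The two products can be checked with a few lines of code. The rule probabilities are the ones listed above; the variable and function names are made up for this sketch. Summing log-probabilities gives the same ranking and avoids underflow on larger trees.

```python
import math

# Rule probabilities of the two candidate trees, copied from the slide.
first_tree = [.05, .20, .20, .20, .75, .30, .60, .10, .40]
second_tree = [.05, .10, .20, .15, .75, .75, .30, .60, .10, .40]


def tree_probability(rule_probs):
    """P(T): product of the probabilities of all rules used in the tree."""
    return math.prod(rule_probs)


def tree_log_probability(rule_probs):
    """log P(T): sum of log rule probabilities (numerically safer)."""
    return sum(math.log(p) for p in rule_probs)


p1 = tree_probability(first_tree)
p2 = tree_probability(second_tree)
print(f"{p1:.2e} {p2:.2e}")  # 2.16e-06 6.08e-07

# A PCFG parser prefers the more probable tree.
best = "first" if p1 > p2 else "second"
print(best)  # first
```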
3.4. Evaluating Constituency Parsers
• PARSEVAL metric
• Measures how much the constituents in the hypothesis parse tree look like the
constituents in a reference parse tree (hand-labeled)
• A constituent in the hypothesis parse $C_h$ is labeled correct if there is a constituent in the reference parse $C_r$ with the same starting point, ending point, and non-terminal symbol.
• Cross-bracket metric:
• The number of constituents for which the reference parse has a bracketing such as
((A B) C) but the hypothesis parse has a bracketing such as (A (B C))
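A minimal sketch of both measures follows, assuming each constituent is stored as a (label, start, end) triple over word positions. The labeled precision, recall, and F1 are the usual PARSEVAL summary scores. The two parses are hypothetical bracketings of "Book the flight through Houston", and which one counts as the reference is an assumption made for the example.

```python
def parseval(reference, hypothesis):
    """Labeled precision, recall, and F1 over sets of (label, start, end)."""
    correct = reference & hypothesis  # same label, same start, same end
    precision = len(correct) / len(hypothesis)
    recall = len(correct) / len(reference)
    f1 = 2 * precision * recall / (precision + recall) if correct else 0.0
    return precision, recall, f1


def crosses(a, b):
    """True if spans a and b overlap without one containing the other."""
    (i, j), (k, l) = a, b
    return i < k < j < l or k < i < l < j


def cross_brackets(reference, hypothesis):
    """Number of hypothesis constituents that cross some reference constituent."""
    ref_spans = {(start, end) for _, start, end in reference}
    return sum(
        any(crosses((start, end), ref) for ref in ref_spans)
        for _, start, end in hypothesis
    )


# Hypothetical reference: the PP attaches to the Nominal "flight".
reference = {("S", 0, 5), ("NP", 1, 5), ("Nominal", 2, 5),
             ("Nominal", 2, 3), ("PP", 3, 5), ("NP", 4, 5)}
# Hypothetical parser output: the PP attaches to the verb phrase instead.
hypothesis = {("S", 0, 5), ("VP", 0, 3), ("NP", 1, 3),
              ("Nominal", 2, 3), ("PP", 3, 5), ("NP", 4, 5)}

precision, recall, f1 = parseval(reference, hypothesis)
crossing = cross_brackets(reference, hypothesis)
print(round(precision, 3), round(recall, 3), round(f1, 3), crossing)
# 0.667 0.667 0.667 2
```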
4. Dependency parsing
• Dependency Relations
• head and its dependent
• label: grammatical functions of dependent
• Universal Dependencies (UD) project
• The largest open community project for building dependency trees
• more than 100 languages, ~ 200 dependency treebanks
• 37 dependency relations
https://universaldependencies.org/
4.1 Transition-Based Dependency Parsing
Nivre algorithm
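The slide names the algorithm without giving its steps. As a rough sketch, the code below assumes the arc-standard transition system (SHIFT, LEFT-ARC, RIGHT-ARC over a stack and a buffer) and drives it with a static oracle that reads the gold heads. A real parser would replace the oracle with a trained classifier. The function name and the example sentence "the big dog chased the cat" are chosen for illustration.

```python
def arc_standard_parse(heads):
    """Replay the arc-standard transitions that rebuild a projective gold tree.

    heads[i] is the gold head of word i (words are numbered from 1, 0 is ROOT);
    heads[0] is unused.
    """
    n = len(heads) - 1
    stack, buffer = [0], list(range(1, n + 1))
    arcs, transitions = set(), []
    # pending[h] = number of dependents that h still has to collect.
    pending = [0] * (n + 1)
    for dep in range(1, n + 1):
        pending[heads[dep]] += 1
    while buffer or len(stack) > 1:
        if len(stack) >= 2:
            top, below = stack[-1], stack[-2]
            if below != 0 and heads[below] == top:
                # LEFT-ARC: top becomes the head of the word below it.
                arcs.add((top, below))
                pending[top] -= 1
                stack.pop(-2)
                transitions.append("LEFT-ARC")
                continue
            if heads[top] == below and pending[top] == 0:
                # RIGHT-ARC: allowed only once top has all of its own dependents.
                arcs.add((below, top))
                pending[below] -= 1
                stack.pop()
                transitions.append("RIGHT-ARC")
                continue
        if not buffer:
            raise ValueError("gold tree is not projective")
        stack.append(buffer.pop(0))  # SHIFT: move the next word onto the stack
        transitions.append("SHIFT")
    return transitions, arcs


# "the big dog chased the cat": gold head of each word, index 0 unused.
gold_heads = [None, 3, 3, 4, 0, 6, 4]
transitions, arcs = arc_standard_parse(gold_heads)
print(transitions)
print(sorted(arcs))
```

Each word is shifted once and attached once, so a sentence of n words takes 2n transitions.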
4.2. Graph-Based Dependency Parsing
• Parser searches through the space of possible trees $\mathcal{T}$ for a given sentence S for a tree (or trees) t that maximizes a score:
$$\hat{t} = \arg\max_{t \in \mathcal{T}} \mathrm{Score}(t, S) = \arg\max_{t \in \mathcal{T}} \sum_{e \in t} \mathrm{Score}(e)$$
e: edges of the tree
• Parsing via finding the maximum spanning tree:
• Encode the search space as a directed graph and employ graph theory methods to search for the optimal solution.
• Score calculation:
• Feature-based:
• Neural algorithm
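The edge-factored search can be illustrated on a tiny example. The sketch below is a brute-force enumeration of all head assignments, not a maximum spanning tree algorithm, so it is only practical for very short sentences. The edge scores are invented for illustration; in practice they would come from a feature-based or neural scorer. The function names are hypothetical.

```python
from itertools import product


def is_tree(heads):
    """heads[d - 1] is the head of word d (0 = ROOT). Valid if every word reaches ROOT."""
    for dep in range(1, len(heads) + 1):
        seen, node = set(), dep
        while node != 0:
            if node in seen:  # revisiting a node means there is a cycle
                return False
            seen.add(node)
            node = heads[node - 1]
    return True


def best_tree(scores):
    """Return the highest-scoring tree, where scores[h][d] scores the edge h -> d."""
    n = len(scores) - 1
    best, best_score = None, float("-inf")
    for heads in product(range(n + 1), repeat=n):  # one candidate head per word
        if not is_tree(heads):
            continue
        total = sum(scores[h][d] for d, h in enumerate(heads, start=1))
        if total > best_score:
            best, best_score = heads, total
    return best, best_score


# Made-up edge scores for "book that flight"; index 0 is ROOT, column 0 is unused.
scores = [
    [0, 12, 4, 4],  # ROOT   -> book, that, flight
    [0, 0, 5, 8],   # book   -> that, flight
    [0, 6, 0, 7],   # that   -> book, flight
    [0, 5, 8, 0],   # flight -> book, that
]
heads, total = best_tree(scores)
print(heads, total)
# (0, 3, 1) 28
```

Here the best tree attaches "book" to ROOT, "flight" to "book", and "that" to "flight".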
4.3 Evaluation
Treebank
Common Syntactic Parsers
Stanford parser
('the', 3, 'det')
('big', 3, 'amod')
('dog', 4, 'nsubj')
('chased', 0, 'root')
('the', 6, 'det')
('cat', 4, 'obj')
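One way to read this output is sketched below, assuming each triple is (word, index of its head counted from 1, relation label) and that head index 0 marks the root. That reading is consistent with the values shown but is not stated on the slide, and the `to_arcs` helper is hypothetical.

```python
# Parser output copied from the slide.
parsed = [
    ("the", 3, "det"),
    ("big", 3, "amod"),
    ("dog", 4, "nsubj"),
    ("chased", 0, "root"),
    ("the", 6, "det"),
    ("cat", 4, "obj"),
]


def to_arcs(triples):
    """Turn (word, head index, relation) triples into (head word, relation, dependent)."""
    words = ["ROOT"] + [word for word, _, _ in triples]  # position 0 is the root
    return [(words[head], relation, word) for word, head, relation in triples]


arcs = to_arcs(parsed)
for head, relation, dependent in arcs:
    print(f"{head} --{relation}--> {dependent}")
```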
Application of Syntactic parsing
• Machine translation
(Alshawi 1996, Wu 1997, ...)
• etc.
Semantic Representations
Semantic Analysis
Tasks of Semantic Analysis
Meaning representations
Semantic Analysis Methods
Conducting Semantic Analysis
Corpus
• end of Chapter 5.