
Natural Language Understanding

Introduction to Dependency Parsing

Dependency parsing is different from constituent parsing

In ANLP and FNLP, we've already seen various parsing algorithms for context-free languages (shift-reduce, CKY, active chart). Why consider dependency parsing as a distinct topic?

• context-free parsing algorithms base their decisions on adjacency;
• in a dependency structure, a dependent need not be adjacent to its head (even if the structure is projective);
• we need new parsing algorithms to deal with non-adjacency (and with non-projectivity if present).

There are many ways to parse dependencies

We will consider two types of dependency parsers:

1. graph-based dependency parsing, based on maximum spanning trees (the MST parser of McDonald et al., 2005);
2. transition-based dependency parsing, an extension of shift-reduce parsing (the MALT parser of Nivre et al., 2006).

Alternative 3: map dependency trees to phrase structure trees and do standard CFG parsing (for projective trees) or LCFRS variants (for non-projective trees). We will not cover this here.

Note that each of these approaches arises from a different view of syntactic structure: as a set of constraints (MST), as the actions of an automaton (transition-based), or as the derivations of a grammar (CFG parsing). It is often possible to translate between these views, with some effort.
Graph-based dependency parsing as tagging

Goal: find the highest scoring dependency tree in the space of all
possible trees for a sentence.

Let x = x1 · · · xn be the input sentence, and y a dependency tree for x. Here, y is a set of dependency edges, with (i, j) ∈ y if there is an edge from xi to xj.

Intuition: since each word has exactly one parent, this is like a tagging problem, where the possible tags are the other words in the sentence (or a dummy node called root). If we edge-factorize the score of a tree so that it is simply the sum of its edge scores, then we can simply select the best incoming edge for each word... subject to the constraint that the result must be a tree.

Formalizing graph-based dependency parsing

The score of a dependency edge (i, j) is a function s(i, j). We'll discuss the form of this function a little bit later.

Then the score of dependency tree y for sentence x is:

s(x, y) = Σ_{(i,j) ∈ y} s(i, j)

Dependency parsing is the task of finding the tree y with the highest score for a given sentence x.

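To make the edge-factored view concrete, here is a minimal Python sketch (the function names and the random score matrix are illustrative assumptions, not from the lecture): it computes a tree's score as a sum of edge scores, and performs the greedy "best incoming edge" step, which is not yet guaranteed to produce a tree.

```python
import numpy as np

def tree_score(scores, heads):
    """Edge-factored score: sum of s(heads[j], j) over all words j.

    scores[i][j] = s(i, j), the score of an edge from head i to
    dependent j; node 0 is the dummy root, so heads[0] is unused.
    """
    return sum(scores[heads[j]][j] for j in range(1, len(heads)))

def greedy_heads(scores):
    """The 'tagging' step: each word picks its best incoming edge.

    The result may contain cycles; repairing them is exactly the job
    of the Chu-Liu-Edmonds algorithm on the following slides.
    """
    n = len(scores)
    heads = [-1] * n  # node 0 is the root and gets no head
    for j in range(1, n):
        heads[j] = max((i for i in range(n) if i != j),
                       key=lambda i: scores[i][j])
    return heads

# Toy usage with random scores over 3 words plus the root node.
rng = np.random.default_rng(0)
S = rng.normal(size=(4, 4))
h = greedy_heads(S)
print(h, tree_score(S, h))
```

With the John saw Mary scores from the next slides, greedy_heads would put John and saw in a cycle, motivating the repair step below.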
The best dependency parse is the maximum spanning tree

This task can be achieved using the following approach (McDonald et al., 2005):

• start with a fully connected graph G, i.e., assume a directed edge between every pair of words;
• assume you have a scoring function that assigns a score s(i, j) to every edge (i, j);
• find the maximum spanning tree (MST) of G, i.e., the directed tree with the highest overall score that includes all nodes of G;
• this is possible in O(n²) time using the Chu-Liu-Edmonds algorithm; it finds an MST which is not guaranteed to be projective;
• the highest-scoring parse is the MST of G.

Chu-Liu-Edmonds (CLE) Algorithm

Example: x = John saw Mary, with graph Gx. Start with the fully
connected graph, with scores:

[Figure: fully connected graph Gx over root, John, saw, Mary; recoverable edge scores: root→John 9, root→saw 10, root→Mary 9, John→saw 20, saw→John 30, saw→Mary 30, Mary→saw 0, Mary→John 11]
Chu-Liu-Edmonds (CLE) Algorithm

Each node j in the graph greedily selects the incoming edge with
the highest score s(i, j):
[Figure: best incoming edges: saw→John 30, John→saw 20, saw→Mary 30; John and saw form a cycle]

If a tree results, it is the maximum spanning tree. If not, there must be a cycle.
Intuition: we can break the cycle by replacing a single incoming edge to one of the nodes in the cycle. Which one? Decide recursively.
CLE Algorithm: Recursion

Identify the cycle, contract it into a single node, and recalculate the scores of incoming and outgoing edges.
Intuition: the score of an edge into the cycle is the weight of the cycle with only the incoming edge of the target word replaced. For example, the edge root→wjs scores 40: the cycle weight (20 + 30) minus John→saw (20) plus root→saw (10).

[Figure: contracted graph with node wjs = {John, saw}: root→wjs 40, root→Mary 9, wjs→Mary 30, Mary→wjs 31]

Now call CLE recursively on this contracted graph. The MST of the contracted graph is equivalent to the MST of the original graph.
CLE Algorithm: Recursion

Again, greedily collect incoming edges to all nodes:

[Figure: best incoming edges on the contracted graph: root→wjs 40, wjs→Mary 30]

This is a tree, hence it must be the MST of the graph.

CLE Algorithm: Reconstruction

Now reconstruct the uncontracted graph: the edge from wjs to Mary was originally from saw. The edge from ROOT to wjs corresponded to ROOT→saw, which breaks the cycle at saw; the remaining cycle edge saw→John is kept, so we include these edges too:

[Figure: final MST: root→saw 10, saw→John 30, saw→Mary 30]

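Putting the worked example together, here is a compact recursive sketch of Chu-Liu-Edmonds in Python (an illustrative implementation, not the course's reference code). Note it drops the constant total cycle weight when rescoring edges into the contracted node; this does not change the argmax, so its intermediate scores differ from the 40/31 on the slides, but the resulting tree is the same.

```python
def _find_cycle(heads):
    """Return the nodes of a cycle in the head assignment, or None."""
    n = len(heads)
    color = [0] * n          # 0 = unvisited, 1 = on current path, 2 = done
    color[0] = 2             # the root cannot be part of a cycle
    for start in range(1, n):
        if color[start]:
            continue
        path, j = [], start
        while color[j] == 0: # follow head pointers until we hit a seen node
            color[j] = 1
            path.append(j)
            j = heads[j]
        if color[j] == 1:    # walked back into the current path: a cycle
            return path[path.index(j):]
        for v in path:
            color[v] = 2
    return None

def cle(scores):
    """Chu-Liu-Edmonds: maximum spanning tree rooted at node 0.

    scores[i][j] = s(i, j); returns heads with heads[j] the head of j.
    """
    n = len(scores)
    heads = [-1] * n
    for j in range(1, n):    # greedy step: best incoming edge per node
        heads[j] = max((i for i in range(n) if i != j),
                       key=lambda i: scores[i][j])
    cycle = _find_cycle(heads)
    if cycle is None:
        return heads         # already a tree: this is the MST

    in_cycle = set(cycle)
    rest = [v for v in range(n) if v not in in_cycle]  # keeps root at 0
    old2new = {v: k for k, v in enumerate(rest)}
    c = len(rest)            # index of the contracted node
    NEG = float("-inf")
    sub = [[NEG] * (c + 1) for _ in range(c + 1)]
    enter, leave = {}, {}    # which original edges the new edges stand for
    for i in rest:
        for j in rest:
            if i != j:
                sub[old2new[i]][old2new[j]] = scores[i][j]
        # Edge into the cycle: replace the cycle edge of the target word
        # (the constant cycle weight is omitted).
        tgt = max(in_cycle, key=lambda j: scores[i][j] - scores[heads[j]][j])
        sub[old2new[i]][c] = scores[i][tgt] - scores[heads[tgt]][tgt]
        enter[old2new[i]] = tgt
        # Edge out of the cycle: best cycle-internal source.
        src = max(in_cycle, key=lambda j: scores[j][i])
        sub[c][old2new[i]] = scores[src][i]
        leave[old2new[i]] = src

    sub_heads = cle(sub)     # recurse on the contracted graph
    out = [-1] * n
    for new_j in range(1, c + 1):
        new_i = sub_heads[new_j]
        if new_j == c:       # the chosen edge into the cycle breaks it here
            out[enter[new_i]] = rest[new_i]
        elif new_i == c:     # an edge leaving the cycle
            out[rest[new_j]] = leave[new_j]
        else:
            out[rest[new_j]] = rest[new_i]
    for j in in_cycle:       # keep the remaining cycle edges
        if out[j] == -1:
            out[j] = heads[j]
    return out

# Nodes: 0 = root, 1 = John, 2 = saw, 3 = Mary. Scores from the example;
# the John→Mary score was not recoverable from the slide, so 3 is assumed
# here (any value below 30 yields the same tree).
NEG = float("-inf")
S = [[NEG,   9,  10,   9],
     [NEG, NEG,  20,   3],
     [NEG,  30, NEG,  30],
     [NEG,  11,   0, NEG]]
print(cle(S))  # [-1, 2, 0, 2]: root→saw, saw→John, saw→Mary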
Where do we get edge scores s(i, j) from?

s(x, y) = Σ_{(i,j) ∈ y} s(i, j)

For the decade after 2005: linear models trained with clever variants of SVMs, MIRA, etc.

More recently: neural networks, of course.
Scoring edges with a neural network

There are a few different formulations of this. An effective one from Zhang and Lapata (2016):

s(i, j) = P_head(wj | wi, x) = exp(g(aj, ai)) / Σ_{k=0}^{|x|} exp(g(ak, ai))

We get ai by concatenating the hidden states of a forward and backward RNN at position i.

The function g(aj, ai) computes an association score telling us how much word wi prefers word wj as its head. A simple option from among many:

g(aj, ai) = va^T · tanh(Ua · aj + Wa · ai)

Association scores are a useful way to select from a dynamic group of candidates, and underlie the idea of attention used in MT.
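A minimal numpy sketch of this head-selection scorer (the dimensions, random parameters, and random stand-ins for the BiRNN states are all assumptions for illustration; in a real parser everything below would be trained):

```python
import numpy as np

rng = np.random.default_rng(0)
d, h, n = 8, 16, 3   # encoder size, scorer hidden size, sentence length

# a[i]: stand-in for the concatenated forward/backward RNN states at
# position i; a[0] encodes the dummy root node.
a = rng.normal(size=(n + 1, d))

# Parameters of g(a_j, a_i) = v_a^T tanh(U_a a_j + W_a a_i)
U_a = rng.normal(size=(h, d))
W_a = rng.normal(size=(h, d))
v_a = rng.normal(size=(h,))

def g(aj, ai):
    """Association score: how much word i prefers word j as its head."""
    return v_a @ np.tanh(U_a @ aj + W_a @ ai)

def head_distribution(i):
    """P_head(w_j | w_i, x): softmax over all candidate heads j."""
    raw = np.array([g(a[j], a[i]) for j in range(n + 1)])
    e = np.exp(raw - raw.max())   # numerically stabilized softmax
    return e / e.sum()

# Edge scores for the MST parser: log-probabilities, so that summing
# them over a tree corresponds to multiplying the head probabilities.
S = np.full((n + 1, n + 1), -np.inf)
for i in range(1, n + 1):        # column i = dependent, row j = head
    S[:, i] = np.log(head_distribution(i))
```

The resulting matrix uses the scores[head][dependent] layout of the cle sketch above, so the two pieces compose directly.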
Transition-based Dependency Parsing

An MST parser builds a dependency tree through graph surgery. An alternative is transition-based parsing:

• for a given parse state, the transition system defines a set of actions T which the parser can take;
• if more than one action is applicable, a classifier (e.g., an SVM) is used to decide which action to take;
• just like in the MST model, this requires a mechanism to compute scores over a set of (possibly dynamic) candidates.

Transition-based Dependency Parsing

The arc-standard transition system:

• configuration c = (s, b, A) with stack s, buffer b, set of dependency arcs A;
• initial configuration for sentence w1, . . . , wn is s = [ROOT], b = [w1, . . . , wn], A = ∅;
• c is terminal if the buffer is empty and the stack contains only ROOT; the parse tree is then given by Ac;
• if si is the ith topmost element on the stack, and bi the ith element in the buffer, then we have the following transitions:
  • LEFT-ARC(l): adds arc s1 → s2 with label l and removes s2 from the stack; precondition: |s| ≥ 2;
  • RIGHT-ARC(l): adds arc s2 → s1 with label l and removes s1 from the stack; precondition: |s| ≥ 2;
  • SHIFT: moves b1 from the buffer to the stack; precondition: |b| ≥ 1.
Transition-based Dependency Parsing

[Figure: dependency tree for "He has good control ." (PRP VBZ JJ NN .): root(ROOT, has), nsubj(has, He), dobj(has, control), amod(control, good), punct(has, .)]

Transition          Stack                    Buffer                   A
                    [ROOT]                   [He has good control .]  ∅
SHIFT               [ROOT He]                [has good control .]
SHIFT               [ROOT He has]            [good control .]
LEFT-ARC(nsubj)     [ROOT has]               [good control .]         A ∪ nsubj(has, He)
SHIFT               [ROOT has good]          [control .]
SHIFT               [ROOT has good control]  [.]
LEFT-ARC(amod)      [ROOT has control]       [.]                      A ∪ amod(control, good)
RIGHT-ARC(dobj)     [ROOT has]               [.]                      A ∪ dobj(has, control)
...                 ...                      ...                      ...
RIGHT-ARC(root)     [ROOT]                   []                       A ∪ root(ROOT, has)

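A minimal Python sketch of the arc-standard system, replaying the derivation from the table above (the two elided "..." steps are filled in here as an assumption consistent with the final tree; in a real parser a classifier would pick each action):

```python
def parse(words, actions):
    """Run an arc-standard derivation and return the set of labeled arcs.

    Stack and buffer hold word indices; index 0 is ROOT.
    Each action is 'SHIFT', ('LEFT-ARC', label), or ('RIGHT-ARC', label).
    """
    stack, buffer = [0], list(range(1, len(words) + 1))
    arcs = set()  # (head, label, dependent) triples
    for action in actions:
        if action == "SHIFT":
            assert buffer, "SHIFT needs a non-empty buffer"
            stack.append(buffer.pop(0))
        else:
            kind, label = action
            assert len(stack) >= 2, "ARC actions need two stack items"
            s1, s2 = stack[-1], stack[-2]
            if kind == "LEFT-ARC":     # adds s1 -> s2, removes s2
                arcs.add((s1, label, s2))
                del stack[-2]
            elif kind == "RIGHT-ARC":  # adds s2 -> s1, removes s1
                arcs.add((s2, label, s1))
                stack.pop()
    assert stack == [0] and not buffer, "derivation did not terminate"
    return arcs

words = ["He", "has", "good", "control", "."]
actions = ["SHIFT", "SHIFT", ("LEFT-ARC", "nsubj"),
           "SHIFT", "SHIFT", ("LEFT-ARC", "amod"), ("RIGHT-ARC", "dobj"),
           "SHIFT", ("RIGHT-ARC", "punct"), ("RIGHT-ARC", "root")]
print(parse(words, actions))
# {(2, 'nsubj', 1), (4, 'amod', 3), (2, 'dobj', 4),
#  (2, 'punct', 5), (0, 'root', 2)}
```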
Summary

Comparing MST and transition-based parsers:

• the MST parser selects the globally optimal tree, given a set of edges with scores;
• it can naturally handle projective and non-projective trees;
• a transition-based parser makes a sequence of local decisions about the best parse action;
• it can be extended to non-projective dependency trees by changing the transition set;
• accuracies are similar, but transition-based parsing is faster;
• both require classifiers over dynamic candidate sets, and these can be implemented using neural networks, conditioned on bidirectional RNN encodings of the sentence.
