
Massachusetts Institute of Technology

6.863J/9.611J, Natural Language Processing, Spring, 2001


Department of Electrical Engineering and Computer Science
Department of Brain and Cognitive Sciences

Handout 8: Computation & Hierarchical parsing II

1 FTN parsing redux


FTN Parser vs. Earley Parser

Initialize: compute the initial state set S0.
    FTN parser:     1. S0 ← q0
                    2. S0 ← eta-closure(S0)
                    q0 = [Start → •S, 0]
                    eta-closure = transitive closure of jump arcs
    Earley parser:  1. S0 ← q0
                    2. S0 ← eta-closure(S0)
                    q0 = [Start → •S, 0, 0]
                    eta-closure = transitive closure of Predict and Complete

Loop: for each word wi, i = 1, ..., n, compute Si from Si−1.
    FTN parser:     Si ← ∪ δ(q, wi) over all q ∈ Si−1
                    Si ← e-closure(Si)
    Earley parser:  Si ← ∪ δ(q, wi) over all items q ∈ Si−1  (= Scan(Si−1))
                    Si ← e-closure(Si), where e-closure = closure(Predict, Complete)

Final: accept or reject.
    FTN parser:     If qf ∈ Sn then accept; else reject.  qf = [Start → S•, 0]
    Earley parser:  If qf ∈ Sn then accept; else reject.  qf = [Start → S•, 0, n]

2 Earley’s algorithm
Earley’s algorithm is like the state-set simulation of a nondeterministic FTN presented earlier,
with the addition of a single new integer representing the starting point of a hierarchical
phrase (since now phrases can start at any point in the input). Note that with simple linear
concatenation this information is implicitly encoded by the word position we are at; the
stopping or end point of a phrase is likewise encoded by the word position. Given an input
of length n, the algorithm builds a series of state sets S0, S1, ..., Sn, where Si contains all the
valid items after reading i words. The algorithm as presented is a simple recognizer; as usual,
full parsing involves more work.
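To make the data structure concrete, here is one way such items might be represented in Lisp (a minimal sketch; the structure and function names are ours, not those of the laboratory 2 parser):

(defstruct item
  lhs          ; left-hand-side symbol, e.g. NP
  rhs          ; right-hand side as a list, e.g. (DET NOUN)
  (dot 0)      ; number of RHS symbols recognized so far
  (start 0))   ; index of the state set where this phrase began

;; The end point of an item is implicit: it is the index of the
;; state set that contains the item, exactly as described above.

(defun next-symbol (item)
  "The symbol just after the dot, or NIL if the item is complete."
  (nth (item-dot item) (item-rhs item)))

(defun complete-p (item)
  "True when the dot has advanced past the entire right-hand side."
  (= (item-dot item) (length (item-rhs item))))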

In theorem-proving terms, the Earley algorithm selects the leftmost nonterminal (phrase)
in a rule as the next candidate to see if one can find a “proof” for it in the input. (By varying
which nonterminal is selected, one can come up with a different strategy for parsing.)

To recognize a sentence using a context-free grammar G and Earley’s algorithm:

1  Compute the initial state set, S0:
   1a  Put the start state, (Start → •S, 0, 0), in S0.
   1b  Execute the following steps until no new state triples are added:
       1b1  Apply complete to S0.
       1b2  Apply predict to S0.

2  For each word wi, i = 1, 2, ..., n, build state set Si given state set Si−1:
   2a  Apply scan to state set Si−1, adding the resulting items to Si.
   2b  Execute the following steps until no new state triples are added to state set Si:
       2b1  Apply complete to Si.
       2b2  Apply predict to Si.
   2c  If state set Si is empty, reject the sentence; else increment i.
   2d  If i ≤ n, go to Step 2a; else go to Step 3.

3  If state set Sn includes the accept state (Start → S•, 0, n), then accept; else reject.
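The steps above translate almost line for line into code. Here is a sketch of the top-level recognizer in Lisp (our own illustration, not the laboratory 2 parser; it assumes the item structure sketched in Section 2, a grammar given as a list of (LHS . RHS) pairs, and the scan, predict, and complete functions sketched below, after their definitions):

(defun earley-recognize (words grammar)
  "Return a true value iff WORDS is generated by GRAMMAR (start symbol S)."
  (let* ((n (length words))
         (sets (make-array (1+ n) :initial-element nil)))
    ;; Step 1: seed S0 with the start item and close it.
    (push (make-item :lhs 'start :rhs '(s)) (aref sets 0))
    (close-state-set sets 0 grammar)
    ;; Step 2: build each Si from Si-1 by scanning word wi, then
    ;; closing Si under predict and complete.
    (loop for i from 1 to n
          for word in words
          do (scan sets i word)
             (when (null (aref sets i))           ; step 2c: dead end
               (return-from earley-recognize nil))
             (close-state-set sets i grammar))
    ;; Step 3: accept iff (Start -> S ., 0, n) is in Sn.
    (find-if (lambda (it)
               (and (eq (item-lhs it) 'start)
                    (complete-p it)
                    (zerop (item-start it))))
             (aref sets n))))

(defun close-state-set (sets i grammar)
  "Apply predict and complete to state set I until nothing new is added."
  (loop
    (let ((before (length (aref sets i))))
      (predict sets i grammar)
      (complete sets i)
      (when (= before (length (aref sets i)))
        (return)))))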

Defining the basic operations on items


Definition 1: Scan: For all items (A → α • tβ, k, i − 1) in state set Si−1, if wi = t, then add
(A → αt • β, k, i) to state set Si.
Definition 2: Predict (Push): Given an item (A → α • Bβ, k, i) in state set Si, add all
items of the form (B → •γ, i, i) to state set Si.
Definition 3: Complete (Pop): If state set Si contains the item (B → γ•, k, i), then, for all
items in state set Sk of the form (A → α • Bβ, l, k), add (A → αB • β, l, i) to state set Si. (If
there are no such items, do nothing.)
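Definitions 1–3 translate directly into Lisp, continuing the sketch above (again our own illustration; item= is the duplicate test that keeps the closure loop in the driver from running forever):

(defun item= (a b)
  "Two items are equal iff they are the same dotted rule with the
same return pointer."
  (and (eq (item-lhs a) (item-lhs b))
       (equal (item-rhs a) (item-rhs b))
       (= (item-dot a) (item-dot b))
       (= (item-start a) (item-start b))))

(defun add-item (item sets i)
  "Add ITEM to state set I unless an equal item is already there."
  (pushnew item (aref sets i) :test #'item=))

;; Definition 1 (Scan): for each item in Si-1 whose next symbol
;; matches the word wi, add the item with its dot advanced to Si.
(defun scan (sets i word)
  (dolist (it (aref sets (1- i)))
    (when (eq (next-symbol it) word)
      (add-item (make-item :lhs (item-lhs it) :rhs (item-rhs it)
                           :dot (1+ (item-dot it)) :start (item-start it))
                sets i))))

;; Definition 2 (Predict/Push): for each item in Si with nonterminal B
;; after its dot, add (B -> . gamma, i, i) for every rule B -> gamma.
(defun predict (sets i grammar)
  (dolist (it (aref sets i))
    (let ((b (next-symbol it)))
      (when b
        (dolist (rule grammar)
          (when (eq (car rule) b)
            (add-item (make-item :lhs b :rhs (cdr rule) :dot 0 :start i)
                      sets i)))))))

;; Definition 3 (Complete/Pop): for each finished item (B -> gamma ., k, i)
;; in Si, advance the dot of every item in Sk that was waiting for a B.
(defun complete (sets i)
  (dolist (it (aref sets i))
    (when (complete-p it)
      (dolist (waiting (aref sets (item-start it)))
        (when (eq (next-symbol waiting) (item-lhs it))
          (add-item (make-item :lhs (item-lhs waiting)
                               :rhs (item-rhs waiting)
                               :dot (1+ (item-dot waiting))
                               :start (item-start waiting))
                    sets i))))))

With the grammar of Section 4 flattened into such (LHS . RHS) pairs, lexical rules included — e.g. '((s np vp) (np name) (np name pp) (np det noun) (vp v np) (vp v np pp) (pp prep np) (v ate) (name john) (name ice-cream) (noun ice-cream) (noun table) (det the) (prep on)) — the call (earley-recognize '(john ate ice-cream on the table) grammar) should return a true value.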

3 Comparison of FTN and Earley state set parsing


The FTN and Earley parsers are almost identical in terms of representations and algorithmic
structure. Both construct a sequence of state sets S0 , S1 , . . . , Sn . Both algorithms consist of
three parts: an initialization stage, a loop stage, and an acceptance stage. The only difference
is that since the Earley parser must handle an expanded notion of an item (it is now a partial
tree rather than a partial linear sequence), one must add a single new integer index to mark
the return address in hierarchical structure.
Note that prediction and completion both act like ε-transitions: they spark parser operations
without consuming any input; hence, one must close each state set construction under
these operations (= we must add all states we can reach after reading i words, including those
reached under ε-transitions).
Question: where is the stack in the Earley algorithm? (Since we need a stack for return
pointers.)

FTN Parser vs. Earley Parser

Initialize: compute the initial state set S0.
    FTN parser:     1. S0 ← q0
                    2. S0 ← eta-closure(S0)
                    q0 = [Start → •S, 0]
                    eta-closure = transitive closure of jump arcs
    Earley parser:  1. S0 ← q0
                    2. S0 ← eta-closure(S0)
                    q0 = [Start → •S, 0, 0]
                    eta-closure = transitive closure of Predict and Complete

Loop: for each word wi, i = 1, ..., n, compute Si from Si−1.
    FTN parser:     Si ← ∪ δ(q, wi) over all q ∈ Si−1
                    Si ← e-closure(Si)
    Earley parser:  Si ← ∪ δ(q, wi) over all items q ∈ Si−1  (= Scan(Si−1))
                    Si ← e-closure(Si), where e-closure = closure(Predict, Complete)

Final: accept or reject.
    FTN parser:     If qf ∈ Sn then accept; else reject.  qf = [Start → S•, 0]
    Earley parser:  If qf ∈ Sn then accept; else reject.  qf = [Start → S•, 0, n]

4 A simple example of the algorithm in action


Let’s now see how this works with a simple grammar and then examine how parses may be
retrieved. There have been several schemes proposed for parse storage and retrieval.
Here is a simple grammar plus an example parse for John ate ice-cream on the table (ambiguous
as to the placement of the prepositional phrase on the table).

Start → S          S → NP VP
NP → Name          NP → Det Noun
NP → Name PP       PP → Prep NP
VP → V NP          VP → V NP PP
V → ate            Noun → ice-cream
Name → John        Name → ice-cream
Noun → table       Det → the
Prep → on

Let’s follow how this parse works using Earley’s algorithm and the parser used in laboratory
2. (The headings and running count of state numbers aren’t supplied by the parser. Also note
that Start is replaced by *DO*. Some additional duplicated states that are printed during
tracing have been removed for clarity, and comments added.)

(in-package 'gpsg)
(remove-rule-set 'testrules)
(remove-rule-set 'testdict)
(add-rule-set 'testrules 'CFG)
(add-rule-list 'testrules
  '((S ==> NP VP)
    (NP ==> Name)
    (NP ==> Name PP)
    (VP ==> V NP)
    (NP ==> Det Noun)
    (PP ==> Prep NP)
    (VP ==> V NP PP)))

(add-rule-set 'testdict 'DICTIONARY)
(add-rule-list 'testdict
  '((ate V)
    (John Name)
    (table Noun)
    (ice-cream Noun)
    (ice-cream Name)
    (on Prep)
    (the Det)))

(create-cfg-table 'testg 'testrules 's 0)

? (pprint (p "john ate ice-cream on the table"
             :grammar 'testg :dictionary 'testdict :print-states t))

State set   Return ptr   Dotted rule   (state number)   (comment)


(nothing)
0 0 *DO* ==> . S $ (1) (start state)
0 0 S ==> . NP VP (2) (predict from 1)
0 0 NP ==> . NAME (3) (predict from 2)
0 0 NP ==> . NAME PP (4) (predict from 2)
0 0 NP ==> . DET NOUN (5) (predict from 2)

John [Name]
1 0 NP ==> NAME . (6) (scan over 3)
1 0 NP ==> NAME . PP (7) (scan over 4)
1 0 S ==> NP . VP (8) (complete 6 to 2)
1 1 PP ==> . PREP NP (9) (predict from 7)
1 1 VP ==> . V NP (10) (predict from 8)
1 1 VP ==> . V NP PP (11) (predict from 8)

ate [V]
2 1 VP ==> V . NP (12) (scan over 10)
2 1 VP ==> V . NP PP (13) (scan over 11)
2 2 NP ==> . NAME (14) (predict from 12/13)
2 2 NP ==> . NAME PP (15) (predict from 12/13)
2 2 NP ==> . DET NOUN (16) (predict from 12/13)

ice-cream [Name, Noun]


3 2 NP ==> NAME . (17) (scan over 14)
3 2 NP ==> NAME . PP (18) (scan over 15)
3 1 VP ==> V NP . PP (19) (complete 17 to 13)
3 1 VP ==> V NP . (20) (complete 17 to 12)
3 3 PP ==> . PREP NP (21) (predict from 18/19)
3 0 S ==> NP VP . (22) (complete 20 to 8)
3 0 *DO* ==> S . $ (23) (complete 22 to 1)

on [Prep]
4 3 PP ==> PREP . NP (24) (scan over 21)
4 4 NP ==> . NAME (25) (predict from 24)
4 4 NP ==> . NAME PP (26) (predict from 24)
4 4 NP ==> . DET NOUN (27) (predict from 24)

the [Det]
5 4 NP ==> DET . NOUN (28) (scan over 27)

table [Noun]
6 4 NP ==> DET NOUN . (29) (scan over 28)
6 3 PP ==> PREP NP . (30) (complete 29 to 24)
6 1 VP ==> V NP PP . (31) (complete 30 to 19)
6 2 NP ==> NAME PP . (32) (complete 30 to 18)
6 0 S ==> NP VP . (33) (complete 31 to 8)
6 0 *DO* ==> S . (34) (complete 33 to 1) [parse 1]
6 1 VP ==> V NP . (35) (complete 32 to 12)
6 0 S ==> NP VP . (36) (complete 35 to 8) = 33
[Figure 1: Distinct parses lead to distinct state triple paths in the Earley algorithm. The figure traces the two completion paths down from *DO*→S• (34): S→NP VP• (33) is built either from VP→V NP PP• (31), or from VP→V NP• (35) together with NP→Name PP• (32); both paths bottom out in PP→Prep NP• (30) and NP→Det Noun• (29).]

6 0 *DO* ==> S . (37) (complete 36 to 1) = 34 [parse 2]
6 1 VP ==> V NP . PP (38) (complete 32 to 13)
6 6 PP ==> . PREP NP (39) (predict from 38)

((START
(S (NP (NAME JOHN))
(VP (V ATE) (NP (NAME ICE-CREAM))
(PP (PREP ON) (NP (DET THE) (NOUN TABLE))))))
(START
(S (NP (NAME JOHN))
(VP (V ATE)
(NP (NAME ICE-CREAM) (PP (PREP ON) (NP (DET THE) (NOUN TABLE))))))))

5 Time complexity of the Earley algorithm


The worst case time complexity of the Earley algorithm is dominated by the time to construct
the state sets. This in turn is decomposed into the time to process a single item in a state
set times the maximum number of items in a state set (assuming no duplicates; thus, we are
assuming some implementation that allows us to quickly check for duplicate states in a state

set). In the worst case, the maximum number of distinct items is the maximum number of
dotted rules times the maximum number of distinct return values, or |G| · n. The time to
process a single item can be found by considering separately the scan, predict and complete
actions. Scan and predict are effectively constant time (we can build in advance all the
possible single next-state transitions, given a possible category). The complete action could
force the algorithm to advance the dot in all the items in a state set, which from the previous
calculation, could be |G| · n items, hence proportional to that much time. We can combine
the values as shown below to get an upper bound on the execution time, assuming that the
primitive operations of our computer allow us to maintain lists without duplicates without any
additional overhead (say, by using bit-vectors; if this is not done, then searching through or
ordering the list of states could add another |G| factor). Note, as before, that grammar size
(measured by the total number of symbols in the rule system) is an important component of
this bound; more so than the input sentence length, as you will see in Laboratory 2.
If there is no ambiguity, then this worst case does not arise (why?). The parse is then linear
time (why?). If there is only finite ambiguity in the grammar (at each step there is only a
finite, bounded-in-advance number of ambiguous attachment possibilities), then the worst-case
time is proportional to n².

Maximum time = maximum number of state sets
               × maximum time to build ONE state set

Maximum time to build ONE state set = maximum number of items
                                      × maximum time to process ONE item

Maximum possible number of items = maximum number of dotted rules
                                   × maximum number of distinct return values
                                 = |G| · n
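Multiplying these out (nothing new, just the combination described above, with c an implementation-dependent constant for the per-item work):

Maximum time ≤ (n + 1) state sets × (|G| · n) items per set × (c · |G| · n) time per item,

which is O(|G|² · n³).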
