Notes 4
Accept/reject:
Final: If qf ∈ Sn then accept; else reject,
where qf = [Start→S•, 0] (equivalently, with the end point made explicit, qf = [Start→S•, 0, n])
2 Earley’s algorithm
Earley’s algorithm is like the state set simulation of a nondeterministic FTN presented earlier,
with the addition of a single new integer representing the starting point of a hierarchical
phrase (since now phrases can start at any point in the input). Note that with simple linear
concatenation this information is implicitly encoded via the word position we are at. The
stopping or end point of a phrase will be encoded by the word position. To proceed, given an
input of n words, a series of state sets S0, S1, ..., Sn is built, where Si contains all the valid items after
reading i words. The algorithm as presented is a simple recognizer; as usual, parsing involves
more work.
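To make the state-set construction concrete, here is a minimal Earley recognizer sketched in Python. This is not the laboratory parser; the item encoding (rule, dot position, start position) and the grammar/lexicon dictionaries are assumptions made for this sketch, using the example grammar that appears below.

```python
# A minimal Earley recognizer sketch.  An item is a triple
# (rule, dot, start): `rule` is an (lhs, rhs) pair, `dot` indexes the
# next symbol in rhs, and `start` is the state set where the phrase began.

def earley_recognize(grammar, lexicon, start_symbol, words):
    """Return True iff `words` is derivable from `start_symbol`.

    grammar: nonterminal -> list of right-hand sides (tuples of symbols)
    lexicon: word -> set of preterminal categories
    """
    n = len(words)
    top = (("Start", (start_symbol,)), 0, 0)   # the item [Start -> .S, 0]
    chart = [set() for _ in range(n + 1)]      # state sets S0 .. Sn
    chart[0].add(top)

    for i in range(n + 1):
        agenda = list(chart[i])
        while agenda:
            (lhs, rhs), dot, start = agenda.pop()
            if dot < len(rhs):
                sym = rhs[dot]
                if sym in grammar:                                 # predict
                    for expansion in grammar[sym]:
                        new = ((sym, tuple(expansion)), 0, i)
                        if new not in chart[i]:
                            chart[i].add(new)
                            agenda.append(new)
                elif i < n and sym in lexicon.get(words[i], ()):   # scan
                    chart[i + 1].add(((lhs, rhs), dot + 1, start))
            else:                                                  # complete
                for (l2, r2), d2, s2 in list(chart[start]):
                    if d2 < len(r2) and r2[d2] == lhs:
                        new = ((l2, r2), d2 + 1, s2)
                        if new not in chart[i]:
                            chart[i].add(new)
                            agenda.append(new)

    # Accept iff the final state set contains the completed start item.
    return (("Start", (start_symbol,)), 1, 0) in chart[n]


# The example grammar and lexicon used later in these notes:
grammar = {
    "S":  [("NP", "VP")],
    "NP": [("Name",), ("Name", "PP"), ("Det", "Noun")],
    "VP": [("V", "NP"), ("V", "NP", "PP")],
    "PP": [("Prep", "NP")],
}
lexicon = {
    "John": {"Name"}, "ate": {"V"}, "ice-cream": {"Name", "Noun"},
    "on": {"Prep"}, "the": {"Det"}, "table": {"Noun"},
}
```

For example, `earley_recognize(grammar, lexicon, "S", "John ate ice-cream on the table".split())` returns True, matching the trace worked through below.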
6.863J Handout 8, Spring, 2001
In theorem-proving terms, the Earley algorithm selects the leftmost nonterminal (phrase)
in a rule as the next candidate to see if one can find a “proof” for it in the input. (By varying
which nonterminal is selected, one can come up with a different strategy for parsing.)
Start → S        S → NP VP
NP → Name        NP → Det Noun
NP → Name PP     PP → Prep NP
VP → V NP        VP → V NP PP
V → ate          Noun → ice-cream
Name → John      Name → ice-cream
Noun → table     Det → the
Prep → on
Let’s follow how this parse works using Earley’s algorithm and the parser used in Laboratory
2. (The headings and running count of state numbers aren’t supplied by the parser. Also note
that Start is replaced by *DO*. Some additional duplicated states that are printed during
tracing have been removed for clarity, and comments have been added.)
(in-package 'gpsg)
(remove-rule-set 'testrules)
(remove-rule-set 'testdict)
(add-rule-set 'testrules 'CFG)
(add-rule-list 'testrules
'((S ==> NP VP)
(NP ==> Name)
(NP ==> Name PP)
(VP ==> V NP)
(NP ==> Det Noun)
(PP ==> Prep NP)
(VP ==> V NP PP)))
John [Name]
1 0 NP ==> NAME . (6) (scan over 3)
1 0 NP ==> NAME . PP (7) (scan over 4)
1 0 S ==> NP . VP (8) (complete 6 to 2)
1 1 PP ==> . PREP NP (9) (predict from 7)
1 1 VP ==> . V NP (10) (predict from 8)
1 1 VP ==> . V NP PP (11) (predict from 8)
ate [V]
2 1 VP ==> V . NP (12) (scan over 10)
2 1 VP ==> V . NP PP (13) (scan over 11)
2 2 NP ==> . NAME (14) (predict from 12/13)
2 2 NP ==> . NAME PP (15) (predict from 12/13)
2 2 NP ==> . DET NOUN (16) (predict from 12/13)
on [Prep]
4 3 PP ==> PREP . NP (24) (scan over 21)
4 4 NP ==> . NAME (25) (predict from 24)
4 4 NP ==> . NAME PP (26) (predict from 24)
4 4 NP ==> . DET NOUN (27) (predict from 24)
the [Det]
5 4 NP ==> DET . NOUN (28) (scan over 27)
table [Noun]
6 4 NP ==> DET NOUN . (29) (scan over 28)
6 3 PP ==> PREP NP . (30) (complete 29 to 24)
6 1 VP ==> V NP PP . (31) (complete 24 to 19)
6 2 NP ==> NAME PP . (32) (complete 24 to 18)
6 0 S ==> NP VP . (33) (complete 8 to 1)
6 0 *DO* ==> S . (34) (complete 1) [parse 1]
6 1 VP ==> V NP . (35) (complete 18 to 12)
6 0 S ==> NP VP . (36) (complete 12 to 1) = 33
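The final trace line notes that item 36 duplicates item 33: same dotted rule, same start position, in the same state set. The algorithm must detect such duplicates to avoid redundant work. A minimal sketch of why this check is cheap, assuming items are encoded as hashable (rule, dot, start) triples (an illustrative representation, not the lab parser's internals):

```python
# Items as hashable (rule, dot, start) triples; a Python set then
# deduplicates them automatically, mirroring how state set 6 keeps
# only one copy of S ==> NP VP . with start position 0.
item_33 = (("S", ("NP", "VP")), 2, 0)   # S ==> NP VP . , start 0
item_36 = (("S", ("NP", "VP")), 2, 0)   # same rule, dot, and start

state_set_6 = set()
state_set_6.add(item_33)
state_set_6.add(item_36)   # no effect: an equal triple is already present
```

After both additions the set holds a single item, so the completer never re-advances the dot over the same constituent twice.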
[Figure 1: Distinct parses lead to distinct state triple paths in the Earley algorithm; the figure shows the items *DO* → S • (34), NP → Name PP • (32), and PP → Prep NP • (30).]
((START
(S (NP (NAME JOHN))
(VP (V ATE) (NP (NAME ICE-CREAM))
(PP (PREP ON) (NP (DET THE) (NOUN TABLE))))))
(START
(S (NP (NAME JOHN))
(VP (V ATE)
(NP (NAME ICE-CREAM) (PP (PREP ON) (NP (DET THE) (NOUN TABLE))))))))
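The two parses above differ only in where the PP attaches: to the VP (eating happens on the table) or to the object NP (the ice-cream is on the table). As a sketch, the trees can be written as nested tuples to check that they cover the same word string while differing in structure (a hypothetical encoding for illustration, not the parser's output format):

```python
# The two parse trees as nested tuples, PP attached to VP vs. to NP.
vp_attach = ("START",
             ("S", ("NP", ("NAME", "JOHN")),
              ("VP", ("V", "ATE"),
               ("NP", ("NAME", "ICE-CREAM")),
               ("PP", ("PREP", "ON"),
                ("NP", ("DET", "THE"), ("NOUN", "TABLE"))))))

np_attach = ("START",
             ("S", ("NP", ("NAME", "JOHN")),
              ("VP", ("V", "ATE"),
               ("NP", ("NAME", "ICE-CREAM"),
                ("PP", ("PREP", "ON"),
                 ("NP", ("DET", "THE"), ("NOUN", "TABLE")))))))

def leaves(tree):
    """Collect the terminal words of a nested-tuple tree, left to right."""
    _head, *rest = tree
    if len(rest) == 1 and isinstance(rest[0], str):
        return [rest[0]]                 # preterminal node: one word
    out = []
    for child in rest:
        out += leaves(child)
    return out
```

Both trees flatten to the same six words, yet compare unequal as structures, which is exactly why the recognizer reaches state 34 along two distinct paths.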
set). In the worst case, the maximum number of distinct items is the maximum number of
dotted rules times the maximum number of distinct return values, or |G| · n. The time to
process a single item can be found by considering separately the scan, predict and complete
actions. Scan and predict are effectively constant time (we can build in advance all the
possible single next-state transitions, given a possible category). The complete action could
force the algorithm to advance the dot in all the items in a state set, which from the previous
calculation, could be |G| · n items, hence proportional to that much time. We can combine
the values as shown below to get an upper bound on the execution time, assuming that the
primitive operations of our computer allow us to maintain duplicate-free lists without any
additional overhead, say by using bit-vectors (if this is not done, then searching through or
ordering the list of states could add another |G| factor). Note as before that grammar size
(measured by the total number of symbols in the rule system) is an important component of
this bound, more so than the input sentence length, as you will see in Laboratory 2.
If there is no ambiguity, then this worst case does not arise (why?). The parse then takes linear
time (why?). If the grammar is only finitely ambiguous (at each step there is only a finite
number, bounded in advance, of ambiguous attachment possibilities), then the worst-case
time is proportional to n².
Maximum number of state sets × Maximum time to build ONE state set
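Filling in the bounds derived above (at most n + 1 state sets, at most |G| · n items per set, and up to |G| · n work to complete a single item), the product works out to the familiar cubic bound:

```latex
\underbrace{(n+1)}_{\text{state sets}} \;\times\;
\underbrace{|G|\,n}_{\text{items per set}} \;\times\;
\underbrace{|G|\,n}_{\text{work per item (complete)}}
\;=\; O\!\left(|G|^{2}\, n^{3}\right).
```

The unambiguous and finitely ambiguous cases shave the per-item completion factor, giving the linear and n² behavior noted above.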