Earley Parser

The Earley parser is a chart-parsing algorithm that uses dynamic programming to parse strings belonging to a context-free language. Unlike LR and LL parsers, it can handle all context-free languages. It runs in cubic time in the general case, quadratic time for unambiguous grammars, and linear time for almost all LR(k) grammars. The algorithm builds one state set per input position by repeatedly applying prediction, scanning, and completion.

Uploaded by

ppghoshin


Earley parser
In computer science, the Earley parser is an algorithm for parsing strings that belong to a given context-free
language, though (depending on the variant) it may suffer problems with certain nullable grammars. The algorithm,
named after its inventor, Jay Earley, is a chart parser that uses dynamic programming; it is mainly used for parsing in
computational linguistics. It was first introduced in his dissertation (and later appeared in abbreviated, more legible
form in a journal).
Earley parsers are appealing because they can parse all context-free languages, unlike LR
parsers and LL parsers, which are more typically used in compilers but which can only handle restricted classes of
languages. The Earley parser executes in cubic time in the general case, O(n³), where n is the length of the parsed
string, quadratic time for unambiguous grammars, O(n²), and linear time, O(n), for almost all LR(k) grammars. It
performs particularly well when the rules are written left-recursively.

Earley Recognizer
The following algorithm describes the Earley recognizer. The recognizer can be easily modified to create a parse tree
as it recognizes, and in that way can be turned into a parser.

The algorithm
In the following descriptions, α, β, and γ represent any string of terminals/nonterminals (including the empty string),
X and Y represent single nonterminals, and a represents a terminal symbol.
Earley's algorithm is a top-down dynamic programming algorithm. In the following, we use Earley's dot notation:
given a production X → αβ, the notation X → α • β represents a condition in which α has already been parsed and β
is expected.
Input position 0 is the position prior to input. Input position n is the position after accepting the nth token.
(Informally, input positions can be thought of as locations at token boundaries.) For every input position, the parser
generates a state set. Each state is a tuple (X → α • β, i), consisting of
• the production currently being matched (X → α β)
• our current position in that production (represented by the dot)
• the position i in the input at which the matching of this production began: the origin position
(Earley's original algorithm included a look-ahead in the state; later research showed this to have little practical
effect on the parsing efficiency, and it has subsequently been dropped from most implementations.)
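The state tuple described above can be sketched as a small immutable record. This is only an illustration: the class name State and its field and method names are assumptions made here, not part of Earley's description, and the look-ahead component is omitted.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class State:
    """One Earley state (X → α • β, i). All names here are illustrative."""
    lhs: str                # X, the left-hand side of the production
    rhs: Tuple[str, ...]    # the symbols of α β, in order
    dot: int                # how many right-hand-side symbols are already parsed
    origin: int             # input position i at which matching began

    def next_symbol(self) -> Optional[str]:
        """Return the symbol right after the dot, or None if the state is complete."""
        return self.rhs[self.dot] if self.dot < len(self.rhs) else None

    def advance(self) -> "State":
        """Return a new state with the dot moved one symbol to the right."""
        return State(self.lhs, self.rhs, self.dot + 1, self.origin)

    def __str__(self) -> str:
        symbols = list(self.rhs[:self.dot]) + ["•"] + list(self.rhs[self.dot:])
        return f"({self.lhs} → {' '.join(symbols)}, {self.origin})"


s = State("S", ("S", "+", "M"), 1, 0)
print(s)                  # (S → S • + M, 0)
print(s.next_symbol())    # +
```

Making the record frozen keeps it hashable, so states can be stored in a set and compared for equality when checking for duplicates.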
The state set at input position k is called S(k). The parser is seeded with S(0) consisting of only the top-level rule.
The parser then repeatedly executes three operations: prediction, scanning, and completion.
• Prediction: For every state in S(k) of the form (X → α • Y β, j) (where j is the origin position as above), add (Y
→ • γ, k) to S(k) for every production in the grammar with Y on the left-hand side (Y → γ).
• Scanning: If a is the next symbol in the input stream, for every state in S(k) of the form (X → α • a β, j), add (X
→ α a • β, j) to S(k+1).
• Completion: For every state in S(k) of the form (X → γ •, j), find states in S(j) of the form (Y → α • X β, i) and
add (Y → α X • β, i) to S(k).
It is important to note that duplicate states are not added to the state set, only new ones. These three operations are
repeated until no new states can be added to the set. The set is generally implemented as a queue of states to process,
with the operation to be performed depending on what kind of state it is.
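The "queue without duplicates" idea can be sketched independently of any grammar. In the sketch below, add_to_set and process_until_stable are hypothetical helper names, and step stands for whichever operation applies to a state and returns the states it wants to add to the same set.

```python
def add_to_set(state, state_set, seen):
    """Append state to the ordered state_set unless it is already present."""
    if state in seen:
        return False
    seen.add(state)
    state_set.append(state)
    return True


def process_until_stable(initial, step):
    """Process states in order; states produced by step are queued only once."""
    state_set, seen = [], set()
    for state in initial:
        add_to_set(state, state_set, seen)
    i = 0
    while i < len(state_set):          # the list may grow while it is walked
        for new_state in step(state_set[i]):
            add_to_set(new_state, state_set, seen)
        i += 1
    return state_set


# Toy step function: each positive number proposes its half twice.
# The second proposal is a duplicate and is silently dropped.
print(process_until_stable([8], lambda n: [n // 2, n // 2] if n else []))
# [8, 4, 2, 1, 0]
```

Walking the list by index rather than with a for loop is what lets newly added states be picked up in the same pass, so the loop ends exactly when no new states can be added.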

Pseudocode
Adapted from Daniel Jurafsky and James H. Martin.

function EARLEY-PARSE(words, grammar)
    ENQUEUE((γ → •S, 0), chart[0])
    for i ← 0 to LENGTH(words) do
        for each state in chart[i] do
            if INCOMPLETE?(state) then
                if NEXT-CAT(state) is a nonterminal then
                    PREDICTOR(state, i, grammar)    // non-terminal
                else
                    SCANNER(state, i)               // terminal
            else
                COMPLETER(state, i)
        end
    end
    return chart

procedure PREDICTOR((A → α•Bβ, i), j, grammar)
    for each (B → γ) in GRAMMAR-RULES-FOR(B, grammar) do
        ADD-TO-SET((B → •γ, j), chart[j])
    end

procedure SCANNER((A → α•Bβ, i), j)
    // B is a part of speech; the new state starts at position j
    if B ∈ PARTS-OF-SPEECH(word[j]) then
        ADD-TO-SET((B → word[j]•, j), chart[j + 1])
    end

procedure COMPLETER((B → γ•, j), k)
    for each (A → α•Bβ, i) in chart[j] do
        ADD-TO-SET((A → αB•β, i), chart[k])
    end
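The pseudocode can be turned into a short runnable recognizer. The following is a minimal sketch under several assumptions that are not in the source: the grammar is a dict mapping each nonterminal to a list of right-hand-side tuples, any symbol that is not a key of that dict is a terminal matched literally against the input token, states are plain tuples (lhs, rhs, dot, origin), and the grammar has no empty productions. The function names are illustrative.

```python
def earley_recognize(tokens, grammar, start):
    """Build the chart; chart[k] is the ordered state set S(k)."""
    n = len(tokens)
    chart = [[] for _ in range(n + 1)]
    seen = [set() for _ in range(n + 1)]

    def add(state, k):
        # Duplicate states are never added twice to the same set.
        if state not in seen[k]:
            seen[k].add(state)
            chart[k].append(state)

    for rhs in grammar[start]:
        add((start, rhs, 0, 0), 0)

    for k in range(n + 1):
        i = 0
        while i < len(chart[k]):            # chart[k] grows while it is processed
            lhs, rhs, dot, origin = chart[k][i]
            i += 1
            if dot < len(rhs):
                symbol = rhs[dot]
                if symbol in grammar:       # prediction
                    for production in grammar[symbol]:
                        add((symbol, production, 0, k), k)
                elif k < n and tokens[k] == symbol:     # scanning
                    add((lhs, rhs, dot + 1, origin), k + 1)
            else:                           # completion
                for l2, r2, d2, o2 in list(chart[origin]):
                    if d2 < len(r2) and r2[d2] == lhs:
                        add((l2, r2, d2 + 1, o2), k)
    return chart


def accepts(tokens, grammar, start):
    """True if a completed start production spans the whole input."""
    chart = earley_recognize(tokens, grammar, start)
    return any(lhs == start and dot == len(rhs) and origin == 0
               for lhs, rhs, dot, origin in chart[len(tokens)])


grammar = {
    "S": [("S", "+", "M"), ("M",)],
    "M": [("M", "*", "T"), ("T",)],
    "T": [("1",), ("2",), ("3",), ("4",)],
}
print(accepts("2 + 3 * 4".split(), grammar, "S"))   # True
print(accepts("2 + * 4".split(), grammar, "S"))     # False
```

Unlike the pseudocode, this sketch scans terminals directly instead of consulting a parts-of-speech lookup, which is why the scanner advances the dot in the existing state rather than adding a separate lexical state.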

Example
Consider the following simple grammar for arithmetic expressions:

<P> ::= <S>      # the start rule

<S> ::= <S> "+" <M>|<M>
<M> ::= <M> "*" <T>|<T>
<T> ::= "1" | "2" | "3" | "4"

With the input:

2 + 3 * 4

This is the sequence of state sets. To keep the listing short, number stands for any of the terminals "1" to "4":

(state no.) Production    (Origin)  # Comment
----------------------------------------------------------------

S(0): • 2 + 3 * 4
(1) P → • S               (0)       # start rule
(2) S → • S + M           (0)       # predict from (1)
(3) S → • M               (0)       # predict from (1)
(4) M → • M * T           (0)       # predict from (3)
(5) M → • T               (0)       # predict from (3)
(6) T → • number          (0)       # predict from (5)

S(1): 2 • + 3 * 4
(1) T → number •          (0)       # scan from S(0)(6)
(2) M → T •               (0)       # complete from (1) and S(0)(5)
(3) M → M • * T           (0)       # complete from (2) and S(0)(4)
(4) S → M •               (0)       # complete from (2) and S(0)(3)
(5) S → S • + M           (0)       # complete from (4) and S(0)(2)
(6) P → S •               (0)       # complete from (4) and S(0)(1)

S(2): 2 + • 3 * 4
(1) S → S + • M           (0)       # scan from S(1)(5)
(2) M → • M * T           (2)       # predict from (1)
(3) M → • T               (2)       # predict from (1)
(4) T → • number          (2)       # predict from (3)

S(3): 2 + 3 • * 4
(1) T → number •          (2)       # scan from S(2)(4)
(2) M → T •               (2)       # complete from (1) and S(2)(3)
(3) M → M • * T           (2)       # complete from (2) and S(2)(2)
(4) S → S + M •           (0)       # complete from (2) and S(2)(1)
(5) S → S • + M           (0)       # complete from (4) and S(0)(2)
(6) P → S •               (0)       # complete from (4) and S(0)(1)

S(4): 2 + 3 * • 4
(1) M → M * • T           (2)       # scan from S(3)(3)
(2) T → • number          (4)       # predict from (1)

S(5): 2 + 3 * 4 •
(1) T → number •          (4)       # scan from S(4)(2)
(2) M → M * T •           (2)       # complete from (1) and S(4)(1)
(3) M → M • * T           (2)       # complete from (2) and S(2)(2)
(4) S → S + M •           (0)       # complete from (2) and S(2)(1)
(5) S → S • + M           (0)       # complete from (4) and S(0)(2)
(6) P → S •               (0)       # complete from (4) and S(0)(1)

The state (P → S •, 0) in S(5) represents a completed parse of the whole input. The same state also appears in S(3)
and S(1), because the prefixes "2 + 3" and "2" are themselves complete sentences of the grammar.
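The claim about S(1), S(3), and S(5) can be checked mechanically. The sketch below is an assumption-laden illustration rather than the article's algorithm verbatim: it stores each S(k) as a Python set of tuples (lhs, rhs, dot, origin), applies prediction and completion until the set stops growing, and matches the digit terminals literally instead of using the number shorthand from the listing. The function names are made up for this example, and empty productions are not handled.

```python
GRAMMAR = {
    "P": [("S",)],
    "S": [("S", "+", "M"), ("M",)],
    "M": [("M", "*", "T"), ("T",)],
    "T": [("1",), ("2",), ("3",), ("4",)],
}


def state_sets(tokens, grammar, start="P"):
    """Return [S(0), ..., S(n)] as sets of (lhs, rhs, dot, origin) tuples."""
    sets = [set() for _ in range(len(tokens) + 1)]
    sets[0] = {(start, rhs, 0, 0) for rhs in grammar[start]}
    for k in range(len(tokens) + 1):
        changed = True
        while changed:                       # repeat until S(k) stops growing
            changed = False
            for lhs, rhs, dot, origin in list(sets[k]):
                if dot < len(rhs) and rhs[dot] in grammar:      # prediction
                    new = {(rhs[dot], r, 0, k) for r in grammar[rhs[dot]]}
                elif dot == len(rhs):                           # completion
                    new = {(l, r, d + 1, o) for l, r, d, o in sets[origin]
                           if d < len(r) and r[d] == lhs}
                else:
                    new = set()
                if not new <= sets[k]:
                    sets[k] |= new
                    changed = True
        if k < len(tokens):                                     # scanning
            sets[k + 1] = {(l, r, d + 1, o) for l, r, d, o in sets[k]
                           if d < len(r) and r[d] == tokens[k]}
    return sets


def complete_prefixes(tokens, grammar=GRAMMAR, start="P"):
    """Positions k whose state set contains a completed start rule from origin 0."""
    return [k for k, states in enumerate(state_sets(tokens, grammar, start))
            if any(l == start and d == len(r) and o == 0
                   for l, r, d, o in states)]


print(complete_prefixes("2 + 3 * 4".split()))   # [1, 3, 5]
```

Because T is expanded into its four literal alternatives here, the individual sets contain more T states than the listing shows; the positions at which (P → S •, 0) appears are unaffected.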

Other Reference Materials

• Aycock, John; Horspool, R. Nigel (2002). "Practical Earley Parsing". The Computer Journal 45 (6). pp. 620–630.
doi: 10.1093/comjnl/45.6.620 (http://dx.doi.org/10.1093/comjnl/45.6.620).
• Leo, Joop M. I. M. (1991), "A general context-free parsing algorithm running in linear time on every LR(k)
grammar without using lookahead", Theoretical Computer Science 82 (1): 165–176, doi:
10.1016/0304-3975(91)90180-A (http://dx.doi.org/10.1016/0304-3975(91)90180-A), MR 1112117
(http://www.ams.org/mathscinet-getitem?mr=1112117).
• Tomita, Masaru (1984). "LR parsers for natural languages". COLING. 10th International Conference on
Computational Linguistics. pp. 354–357.

External links

C Implementations
• 'early' (http://cocom.sourceforge.net/ammunition-13.html) An Earley parser C library.
• 'C Earley Parser' (https://bitbucket.org/abki/c-earley-parser/src) An Earley parser in C.

Java Implementations
• PEN (http://linguateca.dei.uc.pt/index.php?sep=recursos) A Java library that implements the Earley
algorithm.
• Pep (http://www.ling.ohio-state.edu/~scott/#projects-pep) A Java library that implements the Earley
algorithm and provides charts and parse trees as parsing artifacts.
• (http://www.cs.umanitoba.ca/~comp4190/Earley/Earley.java) A Java implementation of Earley parser.

Perl Implementations
• Marpa::R2 (https://metacpan.org/module/Marpa::R2) and Marpa::XS (https://metacpan.org/module/Marpa::XS),
Perl modules. Marpa (http://jeffreykegler.github.com/Marpa-web-site/) is an implementation of Earley's algorithm
that includes the improvements made by Joop Leo, and by Aycock and Horspool.
• Parse::Earley (https://metacpan.org/module/Parse::Earley) A Perl module that implements Jay Earley's original
algorithm.

Python Implementations
• Charty (http://www.cavar.me/damir/charty/python/) A Python implementation of an Earley parser.
• NLTK (http://nltk.org/) A Python toolkit that has an Earley parser.
• Spark (http://pages.cpsc.ucalgary.ca/~aycock/spark/) An object-oriented "little language framework" for
Python that implements an Earley parser.
• earley3.py (http://github.com/tomerfiliba/tau/blob/master/earley3.py) A stand-alone implementation of the
algorithm in less than 150 lines of code, including generation of the parsing-forest and samples.

Common Lisp Implementations

• CL-EARLEY-PARSER (http://www.cliki.net/CL-EARLEY-PARSER) A Common Lisp library that
implements an Earley parser.

Scheme/Racket Implementations
• Charty-Racket (http://www.cavar.me/damir/charty/scheme/) A Scheme / Racket implementation of an Earley
parser.

Resources
• The Accent compiler-compiler (http://accent.compilertools.net/Entire.html)

Article Sources and Contributors

Earley parser Source: http://en.wikipedia.org/w/index.php?oldid=576537591 Contributors: 1&only, AlexChurchill, Architectual, Borsotti, Brynosaurus, Cadr, Chentz, ChrisGualtieri, Clément
Pillias, Conversion script, David Eppstein, Derek Ross, DixonD, EnTerr, Fimbulvetr, Frap, Idmillington, JYOuyang, Jamelan, Jason Quinn, Jeffreykegler, John of Reading, Jonsafari, Khabs,
Kimiko, Kwi, Limited Atonement, Luqui, MCiura, Macrakis, Mkartic me, Opaldraggy, Paul Foxworthy, Peak, RA0808, Rfc1394, Simon_J_Kissane, Two Bananas, UKoch, Woogyun, Zacchiro,
71 anonymous edits

License
Creative Commons Attribution-Share Alike 3.0
//creativecommons.org/licenses/by-sa/3.0/
