Parsers

The document provides an overview of parsing techniques, categorizing them into top-down and bottom-up parsers, with specific focus on recursive descent parsing, LL parsers, and LR parsers. It discusses the advantages and disadvantages of each method, including their handling of grammar types and efficiency. Additionally, it explains the concepts of First and Follow sets, as well as the construction of parsing tables for LL(1) and SLR(1) parsers.

Uploaded by

mraice9028

M. Tayyab
Parsers are classified based on how they build parse trees:
 Top-down parsers
   Recursive-descent parsers
     Backtracking
     Non-backtracking
   LL parsers
 Bottom-up parsers
   Shift-reduce parsing
   LR parsing
     LR(0)
     SLR
     LALR
     CLR
Top-down parsing builds the parse tree from the root (start symbol) down to the
terminal symbols, applying grammar rules iteratively to non-terminals to match the
input string.
Advantages Hint: SBP
 Simplicity: Simpler to implement and understand, especially for basic languages.
 Predictive Parsing: LL Parsing uses a lookahead symbol to determine the next rule
to apply.
 Backtracking: Deterministic top-down parsers (TDP), such as LL parsers, eliminate
the need for backtracking, which makes them more efficient.
Disadvantages Hint: LG BOC
 Limited Grammar Handling: Top-down parsers cannot parse left-recursive grammars
directly, so the class of grammars they accept is limited.
 Backtracking Overhead: Non-deterministic top-down parsers, such as recursive descent
with backtracking, may re-read the same input several times, which is computationally
expensive.
 Not Suitable for Complex Grammars: Complex grammars usually have to be rewritten
(left recursion removed, common prefixes factored out) before a top-down parser can
handle them; otherwise parsing is inefficient or fails.
Recursive-descent parsing (RDP) breaks a language down into its constituent parts by
using a set of recursive procedures, where each non-terminal in the grammar corresponds
to a function.
How it works:
The parser begins at the start symbol of the grammar.
Recursively expands non-terminals based on the grammar rules.
When a terminal is encountered, it’s matched against the input string.
If the string matches the grammar, the parser succeeds; otherwise, it fails.
Advantages:
 Simple and intuitive for small grammars.

 No extra data structures are usually required: functions and recursion handle the
logic.
Disadvantages:
 Not suitable for left-recursive grammars (can cause infinite recursion).

 Inefficient for more complex grammars, especially if backtracking is needed.

Grammar Rules (for arithmetic expressions with addition and multiplication):
 Expression → Term + Expression | Term
 Term → Factor * Term | Factor
 Factor → ( Expression ) | Number
Example Input: 2 * (3 + 4)
The parsing begins at the start symbol (Expression):
1. The Expression function checks if the input starts with a Term followed by + and
another Expression, or just a Term.
2. The Term function checks if the input starts with a Factor followed by * and
another Term, or just a Factor.
3. The Factor function checks if the input starts with a number (like 2) or an
expression enclosed in parentheses ( ).
 Expression: Calls Term first.
   Term: Calls Factor first.
     Factor: Matches the number 2. Success!
   Term: Encounters *, so it calls itself (Term) to parse the right-hand operand.
     Factor: Encounters (, so it calls Expression inside the parentheses.
       Expression: Calls Term first.
         Term: Calls Factor first.
           Factor: Matches the number 3. Success!
       Expression: Encounters +, so it calls itself (Expression) to parse the rest.
         Term: Calls Factor first.
           Factor: Matches the number 4. Success!
       Expression: Successfully parses 3 + 4. Returns to Factor.
     Factor: Successfully parses (3 + 4). Returns to Term.
   Term: Successfully parses 2 * (3 + 4). Returns to Expression.
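The walk-through above can be sketched in Python, with one method per non-terminal. This is a minimal illustration rather than a reference implementation: the names `tokenize`, `Parser` and `parse` are hypothetical, numbers are assumed to be unsigned integers, and the choice between the two alternatives of Expression and Term is made by peeking at the next token, so no backtracking is needed.

```python
import re

def tokenize(text):
    """Split the input into integer literals and the symbols + * ( )."""
    return re.findall(r"\d+|[+*()]", text)

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        """Return the next token without consuming it (None at end of input)."""
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def match(self, expected):
        """Consume the next token if it is the expected terminal."""
        if self.peek() != expected:
            raise SyntaxError(f"expected {expected!r}, found {self.peek()!r}")
        self.pos += 1

    def expression(self):
        # Expression -> Term + Expression | Term
        left = self.term()
        if self.peek() == "+":
            self.match("+")
            return ("+", left, self.expression())
        return left

    def term(self):
        # Term -> Factor * Term | Factor
        left = self.factor()
        if self.peek() == "*":
            self.match("*")
            return ("*", left, self.term())
        return left

    def factor(self):
        # Factor -> ( Expression ) | Number
        tok = self.peek()
        if tok == "(":
            self.match("(")
            inner = self.expression()
            self.match(")")
            return inner
        if tok is not None and tok.isdigit():
            self.pos += 1
            return int(tok)
        raise SyntaxError(f"unexpected token {tok!r}")

def parse(text):
    """Parse the whole input and return a nested tuple as the parse result."""
    parser = Parser(tokenize(text))
    tree = parser.expression()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected trailing token {parser.peek()!r}")
    return tree

print(parse("2 * (3 + 4)"))
```

For the example input the call returns `('*', 2, ('+', 3, 4))`, which mirrors the order in which Term, Factor and Expression return in the trace.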
Recursive descent parsing with backtracking is a method where the parser explores
multiple possible ways of parsing an input string. If a chosen path fails, the parser
"backtracks" to a previous decision point and tries a different path. This approach is
useful for grammars where the correct production rule cannot be determined with just a
single lookahead token.
How It Works:
1. Recursive Descent Parsing: Each non-terminal in the grammar corresponds to a
function in the parser. These functions recursively process the input.
2. Backtracking: If a function encounters a mismatch (i.e., the input doesn't match the
current rule), the function returns failure, and the parser backtracks to try other rules.

Backtracking would happen in our RDP example if the parser tries a wrong production
rule.
For example:
If Expression → Term + Expression doesn’t fit: It backtracks to try Expression → Term.
If Factor → Number doesn’t match: Parser backtracks to try Factor → ( Expression ).
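A backtracking version of the same parser might look like the sketch below. Each function takes a position and returns the position after the matched text, or `None` on failure; because the caller keeps its own `pos`, "backtracking" is simply retrying the next alternative from that saved position. The names `expect` and `accepts` are hypothetical. The sketch commits to the first alternative that succeeds, which is enough for this grammar; a fully general backtracking parser would also have to revisit earlier choices when a later step fails.

```python
import re

def tokenize(text):
    return re.findall(r"\d+|[+*()]", text)

def expect(tokens, pos, symbol):
    """Return pos + 1 if tokens[pos] is the given symbol, otherwise None."""
    if pos is not None and pos < len(tokens) and tokens[pos] == symbol:
        return pos + 1
    return None

def expression(tokens, pos):
    # First try: Expression -> Term + Expression
    p = term(tokens, pos)
    p = expect(tokens, p, "+")
    if p is not None:
        p = expression(tokens, p)
    if p is not None:
        return p
    # That path failed: backtrack to pos and try Expression -> Term
    return term(tokens, pos)

def term(tokens, pos):
    # First try: Term -> Factor * Term
    p = factor(tokens, pos)
    p = expect(tokens, p, "*")
    if p is not None:
        p = term(tokens, p)
    if p is not None:
        return p
    # Backtrack to pos and try Term -> Factor
    return factor(tokens, pos)

def factor(tokens, pos):
    # First try: Factor -> Number
    if pos < len(tokens) and tokens[pos].isdigit():
        return pos + 1
    # Backtrack and try Factor -> ( Expression )
    p = expect(tokens, pos, "(")
    if p is not None:
        p = expression(tokens, p)
    return expect(tokens, p, ")")

def accepts(text):
    """True if the whole input can be derived from Expression."""
    tokens = tokenize(text)
    return expression(tokens, 0) == len(tokens)

print(accepts("2 * (3 + 4)"), accepts("2 + * 3"))
```

Note that `term` parses the same Factor twice whenever the first alternative fails; this repeated work is the backtracking overhead mentioned earlier.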
The First set of a non-terminal contains all the terminals that can appear as the first
symbol in strings derived from that non-terminal, plus ε if the non-terminal can derive
the empty string.
Steps to Compute First Set:
1. For Terminals:
 The First set of a terminal is itself. For example: First(id) = { id }
2. For Non-Terminals:
 If the production rule is A → α, where α starts with a terminal a, then add a to
First(A).
 If α starts with a non-terminal B, then add First(B), excluding ε, to First(A).
 If α can produce ε, then add ε to First(A).
3. For Productions with Multiple Symbols:
 Consider A → X1 X2 X3 ...:
 Add First(X1) to First(A).
 If X1 can produce ε, add First(X2), and so on.
 Stop when a symbol does not produce ε or when you’ve reached the end.
Grammar                   First()
S → abc | def | ghi       First(S) = {a, d, g}

S → ABC | ghi | jkl       First(S) = {a, b, c, g, j}
A → a | b | c             First(A) = {a, b, c}
B → b                     First(B) = {b}
C → c                     First(C) = {c}

S → ABC                   First(S) = {a, b, c, d, e, f, ε}
A → a | b | ε             First(A) = {a, b, ε}
B → c | d | ε             First(B) = {c, d, ε}
C → e | f | ε             First(C) = {e, f, ε}
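The steps for computing First sets can be sketched as a fixed-point loop: keep applying the rules until no set grows. The representation is an assumption made for this sketch: a grammar is a dict from non-terminal to a list of right-hand sides, each a list of symbols, with the empty list standing for ε. The names `first_sets` and `first_of_sequence` are hypothetical.

```python
EPS = "ε"

def first_of_sequence(symbols, first):
    """First set of a sequence X1 X2 ... using the First sets computed so far."""
    result = set()
    for sym in symbols:
        if sym not in first:            # a terminal: it starts the string
            result.add(sym)
            return result
        result |= first[sym] - {EPS}
        if EPS not in first[sym]:       # stop at the first non-nullable symbol
            return result
    result.add(EPS)                     # every symbol can vanish (or sequence is empty)
    return result

def first_sets(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                      # repeat until no First set grows
        changed = False
        for nt, productions in grammar.items():
            for rhs in productions:
                before = len(first[nt])
                first[nt] |= first_of_sequence(rhs, first)
                changed |= len(first[nt]) != before
    return first

grammar = {
    "S": [["A", "B", "C"]],
    "A": [["a"], ["b"], []],
    "B": [["c"], ["d"], []],
    "C": [["e"], ["f"], []],
}
print(sorted(first_sets(grammar)["S"]))
```

For the grammar S → ABC with nullable A, B and C, the result for S is {a, b, c, d, e, f, ε}: since A and B can both vanish, the terminals that start B and C can also start S.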
The Follow set of a non-terminal contains all the terminals that can appear immediately
after that non-terminal in some string derived from the start symbol.
Steps to Compute Follow Set:
1. Start Symbol:
 Add $ (end-of-input marker) to the Follow set of the start symbol.
2. For Non-Terminals in Rules:
 If a rule is A → αBβ, add all terminals in First(β) (except ε) to Follow(B).
 If β can produce ε or if B is the last symbol, add Follow(A) to Follow(B).
3. Repeat Until Stabilized:
 Iterate through all rules until no more terminals can be added.
Grammar Example           Follow()
S → Abc                   Follow(A) = {b}

S → ACD                   Follow(S) = {$}
C → a | b                 Follow(A) = First(C) = {a, b}
                          Follow(C) = {$} (this holds if D can derive ε; the productions
                          for D are not shown)
                          Follow(D) = {$}

S → aSbS | bSaS | ε       Follow(S) = {$, a, b}

S → AaAb | BbBa           Follow(A) = {a, b}
A → ε                     Follow(B) = {a, b}
B → ε

S → ABC                   Follow(S) = {$}
S → DEF                   Follow(A) = {$}: First(B) and First(C) contain only ε, so
B → ε                     Follow(A) falls through to Follow(S)
C → ε
D → ε
E → ε
F → ε
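The Follow rules can be sketched in the same fixed-point style. The sketch below is self-contained, so it repeats a small First computation; the grammar representation (dict of non-terminal to list of right-hand sides, empty list for ε) and all function names are assumptions made for illustration.

```python
EPS, END = "ε", "$"

def first_of_sequence(symbols, first):
    result = set()
    for sym in symbols:
        if sym not in first:                 # terminal
            result.add(sym)
            return result
        result |= first[sym] - {EPS}
        if EPS not in first[sym]:
            return result
    result.add(EPS)
    return result

def first_sets(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, productions in grammar.items():
            for rhs in productions:
                before = len(first[nt])
                first[nt] |= first_of_sequence(rhs, first)
                changed |= len(first[nt]) != before
    return first

def follow_sets(grammar, start):
    first = first_sets(grammar)
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)                   # rule 1: $ follows the start symbol
    changed = True
    while changed:                           # rule 3: repeat until stable
        changed = False
        for head, productions in grammar.items():
            for rhs in productions:
                for i, sym in enumerate(rhs):
                    if sym not in grammar:   # only non-terminals have Follow sets
                        continue
                    rest = first_of_sequence(rhs[i + 1:], first)
                    new = rest - {EPS}       # rule 2a: First(β) without ε
                    if EPS in rest:          # rule 2b: β is empty or nullable
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

g = {"S": [["A", "a", "A", "b"], ["B", "b", "B", "a"]], "A": [[]], "B": [[]]}
print(follow_sets(g, "S"))
```

For S → AaAb | BbBa with A → ε and B → ε this gives Follow(A) = Follow(B) = {a, b} and Follow(S) = {$}.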
An LL parser is a type of top-down parser used for analyzing a given formal language.
The "LL" stands for Left-to-right scanning of the input and Leftmost derivation in its
parsing process.
 Left-to-right (L): The parser reads the input from left to right, one symbol at a time.
 Leftmost derivation (L): It constructs the parse tree by expanding the leftmost non-
terminal first.
Key Concepts:
1. Top-Down Parsing: LL parsers start with the grammar's initial symbol and derive
the string by applying production rules from top to bottom, reaching the input
tokens.
2. LL(1): The "1" indicates that the parser examines one symbol ahead to choose the
appropriate production rule, relying on the next input token and the current non-
terminal.
3. Context-Free Grammar: LL parsers handle context-free grammars (CFGs) with
rules that replace non-terminals using terminals and other non-terminals.
Characteristics of LL Parsers:
 Predictive Parsing: LL parsers are considered "predictive" because they make
decisions about which rule to apply based solely on the next symbol in the input
(and sometimes a small lookahead).
 Non-recursive: Table-driven LL parsers use an explicit stack to manage parsing rather
than recursion, though an LL parser can also be written as a set of recursive functions.
 Efficiency: They are relatively simple to implement and efficient for certain types of
grammars. However, not all context-free grammars can be parsed by an LL parser.
Limitations:
 LL parsers can only handle LL(k) grammars (where k is the lookahead), which are a
subset of all context-free grammars.
 Ambiguity: If a grammar has multiple possible derivations at a certain point, it can
lead to conflicts that make LL parsing difficult or impossible.
 Starting Point
   The parser begins with the start symbol of the grammar.
   It attempts to derive the input string by applying production rules.
 Using Lookahead (1 Token)
   The parser examines the next token before making a decision.
   It consults a parse table that determines the correct rule to apply.
 Building the Parse Table
  The LL(1) parse table is constructed using:
   First Sets → Determine which terminals can begin the strings derived from each
    production.
   Follow Sets → Determine which terminals can come right after a non-terminal; they
    decide where ε-productions go in the table.
 Parsing Process
  The action can be one of the following:
   Pop and Push: If the top stack symbol is a non-terminal, the parser pops it and
    pushes the production rule's right-hand side onto the stack in reverse order, so
    that its leftmost symbol ends up on top.
   Match: If the top stack symbol is a terminal and matches the next input, the
    parser pops it and consumes the input.
   Error: If no valid production rule is found, or a terminal does not match, a
    parsing error occurs.
   Success: Parsing succeeds if the input ends and the stack is empty.
Advantages of LL(1) Parsers
 Fast & Deterministic → No backtracking is needed.

 Simple Table-Driven Parsing → Easy to implement in compilers.

 Error Detection → Immediate identification of syntax errors.

Limitations
 Limited Expressiveness → Some complex grammars cannot be parsed using
LL(1).
 Ambiguous or Left-Recursive Grammars Need Modification → Rewriting (removing left
recursion, left factoring) might be required for proper parsing.

The Parsing Process

Grammar 1               Symbol   First    Follow
 S → ( L ) | a         S        ( a      $ , )
 L → S L’              L        ( a      )
 L’ → ε | , S L’       L’       ε ,      )

LL(1) Parse Table
      (            )          a            ,               $
S     S → ( L )               S → a
L     L → S L’                L → S L’
L’                 L’ → ε                  L’ → , S L’
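The parse table for Grammar 1 can drive a stack-based parser using the pop-and-push, match, error and success actions described earlier. The sketch below hard-codes that table; the names `TABLE`, `NONTERMINALS` and `ll1_parse` are hypothetical, and each input character is assumed to be one token.

```python
END = "$"

# LL(1) table for: S -> ( L ) | a ;  L -> S L' ;  L' -> , S L' | ε
TABLE = {
    ("S", "("): ["(", "L", ")"],
    ("S", "a"): ["a"],
    ("L", "("): ["S", "L'"],
    ("L", "a"): ["S", "L'"],
    ("L'", ","): [",", "S", "L'"],
    ("L'", ")"): [],                       # L' -> ε
}
NONTERMINALS = {"S", "L", "L'"}

def ll1_parse(tokens, start="S"):
    """Return the productions applied (a leftmost derivation) or raise SyntaxError."""
    tokens = list(tokens) + [END]
    stack = [END, start]                   # the end marker sits below the start symbol
    pos = 0
    applied = []
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            rhs = TABLE.get((top, look))
            if rhs is None:                # empty table cell: error
                raise SyntaxError(f"no rule for {top} on {look!r}")
            applied.append((top, rhs))
            stack.extend(reversed(rhs))    # leftmost symbol of the rule ends up on top
        elif top == look:
            pos += 1                       # match a terminal (or the end marker)
        else:
            raise SyntaxError(f"expected {top!r}, found {look!r}")
    return applied

for head, rhs in ll1_parse("(a,a)"):
    print(head, "->", " ".join(rhs) or "ε")
```

For the input `(a,a)` the parser applies six productions in order: S → ( L ), L → S L', S → a, L' → , S L', S → a, L' → ε.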

Grammar 2
S → aSbS | bSaS | ε
First(S) = {a, b, ε}
Follow(S) = {$, a, b}

      a             b             $
S     S → aSbS      S → bSaS      S → ε
      S → ε         S → ε

Multiple productions in one cell: the grammar is not LL(1).
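Filling the table mechanically makes the conflict visible. In the sketch below, a production A → α is placed under every terminal in First(α), and under every terminal in Follow(A) when α can derive ε; a cell that receives more than one production is reported as a conflict. The First and Follow sets are typed in by hand from the notes, and the function names are hypothetical.

```python
EPS = "ε"

def first_of_sequence(symbols, first):
    """First set of a sequence of symbols, given First sets of the non-terminals."""
    result = set()
    for sym in symbols:
        if sym not in first:               # terminal
            result.add(sym)
            return result
        result |= first[sym] - {EPS}
        if EPS not in first[sym]:
            return result
    result.add(EPS)
    return result

def build_ll1_table(grammar, first, follow):
    """Return (table, conflicts); each cell holds a list of candidate right-hand sides."""
    table = {}
    for head, productions in grammar.items():
        for rhs in productions:
            select = first_of_sequence(rhs, first)
            targets = select - {EPS}
            if EPS in select:              # ε-derivable: use Follow(head) as well
                targets |= follow[head]
            for terminal in targets:
                table.setdefault((head, terminal), []).append(rhs)
    conflicts = {cell: rules for cell, rules in table.items() if len(rules) > 1}
    return table, conflicts

# Grammar 2: S -> aSbS | bSaS | ε
grammar2 = {"S": [["a", "S", "b", "S"], ["b", "S", "a", "S"], []]}
first2 = {"S": {"a", "b", EPS}}
follow2 = {"S": {"$", "a", "b"}}

_, conflicts = build_ll1_table(grammar2, first2, follow2)
print(sorted(conflicts))                   # cells with more than one production
```

For Grammar 2 the cells (S, a) and (S, b) each hold two productions, which is exactly the situation shown in the table.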
Bottom-up parsing constructs the parse tree from the leaves upward, reducing the input
string to the start symbol by applying production rules in reverse.
Advantages
 Handles Complex Grammars: Bottom-up parsers can efficiently handle left-
recursive grammars.
 Efficient: LR parsers, a type of bottom-up parser, are highly efficient and
powerful for parsing complex context-free grammars with minimal limitations.
 No Backtracking: LR parsers avoid backtracking, enhancing their performance
efficiency.
Disadvantages
 Complex Implementation: Implementing and understanding bottom-up parsers,
especially LR parsers, is challenging.
 Table-driven Parsing: Parsing tables in bottom-up parsers can become large and
cumbersome with complex grammars.
Shift-reduce parsing is a bottom-up parsing technique used in syntax analysis. It works
by iteratively shifting input symbols onto a stack and reducing them based on
predefined grammar rules until a valid parse tree is formed or an error is detected.
Here's how it works:
Shift: Move the next input symbol onto the stack.
Reduce: Apply a production rule in reverse to replace elements on the stack with a
non-terminal.
Repeat: Continue shifting and reducing until the stack contains only the start symbol
and the input is consumed.
Example: In shift-reduce parsing (SRP), each step involves either a Shift or a Reduce
operation.
Grammar:
1. E′ → E
2. E → E+T
3. E → T
4. T → (E)
5. T → id
Example input id+id:
 Shift “id”
 T+id     Reduce (T → id)
 E+id     Reduce (E → T)
 Shift “+”
 Shift “id”
 E+T      Reduce (T → id)
 E        Reduce (E → E+T)
 E′       Reduce (E′ → E)

This method is commonly used in LR parsers, including SLR(1), LR(1), and LALR(1)
parsers, which are efficient for programming language parsing.
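The shift and reduce steps for this example grammar can be replayed with a small sketch. It uses a naive strategy, which is an assumption of this sketch and not how LR parsers decide: reduce whenever the top of the stack matches a right-hand side (longer right-hand sides are tried first), otherwise shift. This happens to work for this grammar; a real LR parser takes each decision from its parsing table. The name `shift_reduce` is hypothetical.

```python
# Productions as (head, body); longer bodies come first so that
# E -> E + T is preferred over E -> T when both match the stack top.
PRODUCTIONS = [
    ("E", ["E", "+", "T"]),
    ("T", ["(", "E", ")"]),
    ("T", ["id"]),
    ("E", ["T"]),
]

def shift_reduce(tokens):
    """Return the list of actions taken, or raise SyntaxError."""
    stack, actions = [], []
    remaining = list(tokens)
    while True:
        for head, body in PRODUCTIONS:
            if stack[-len(body):] == body:        # handle found on top of the stack
                del stack[-len(body):]
                stack.append(head)
                actions.append(f"reduce {head} -> {' '.join(body)}")
                break
        else:                                     # no reduction applies
            if remaining:
                stack.append(remaining.pop(0))
                actions.append(f"shift {stack[-1]}")
            elif stack == ["E"]:                  # only E left, input consumed
                actions.append("reduce E' -> E (accept)")
                return actions
            else:
                raise SyntaxError(f"cannot reduce stack {stack}")

for step in shift_reduce(["id", "+", "id"]):
    print(step)
```

For `id + id` the printed actions follow the same order as the worked example: shift id, reduce T → id, reduce E → T, shift +, shift id, reduce T → id, reduce E → E + T, and finally the reduction to E′.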
LR(0) parsing relies on a parsing table, which consists of:
1. Action Table: Specifies, for each state and input terminal, whether to shift, reduce, or accept.

2. Goto Table: Directs the parser to the next state after recognizing a non-terminal.

Steps:
1. Define the Grammar
2. Build LR(0) States
3. Construct Action Table (Shift-Reduce Decisions)
4. Construct Goto Table

The LR(0) parsing table allows systematic bottom-up parsing, enabling a Shift-Reduce
mechanism to process strings. However, LR(0) decides on a reduction without looking at
the next input symbol, so many grammars produce shift-reduce or reduce-reduce conflicts.
More advanced techniques like SLR(1), LALR(1), and LR(1) parsing resolve more of these
conflicts by using lookahead symbols.
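Grammar (augmented with E’ → E):
 (r1) E → T+E
 (r2) E → T
 (r3) T → id

LR(0) parsing table
State   Action                       Go to
        id      +         $          E     T
0       S3                           1     2
1                         Accept
2       r2      S4, r2    r2
3       r3      r3        r3
4       S3                           5     2
5       r1      r1        r1

State 2 holds both a shift (S4) and a reduce (r2) under +, so this grammar is not LR(0).

LR(0) item sets
State 0: E’ → .E,  E → .T+E,  E → .T,  T → .id     (E → State 1, T → State 2, id → State 3)
State 1: E’ → E.                                     (Accept)
State 2: E → T.+E,  E → T.                           (+ → State 4; E → T. is a reduction)
State 3: T → id.                                     (Reduction)
State 4: E → T+.E,  E → .T+E,  E → .T,  T → .id      (E → State 5, T → State 2, id → State 3)
State 5: E → T+E.                                    (Reduction)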
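The LR(0) states for this grammar can be derived with the usual closure and goto operations. The sketch below is an illustration under its own conventions: an item is a pair (production index, dot position), symbols are explored in sorted order, and the names `closure`, `goto`, `canonical_collection` and `lr0_conflicts` are hypothetical. With these conventions the states come out numbered 0 to 5 as in the notes.

```python
GRAMMAR = [
    ("E'", ("E",)),             # production 0: augmented start
    ("E", ("T", "+", "E")),     # production 1 (r1)
    ("E", ("T",)),              # production 2 (r2)
    ("T", ("id",)),             # production 3 (r3)
]
NONTERMINALS = {head for head, _ in GRAMMAR}

def closure(items):
    """Add B -> .γ for every item with the dot in front of non-terminal B."""
    result = set(items)
    work = list(items)
    while work:
        prod, dot = work.pop()
        body = GRAMMAR[prod][1]
        if dot < len(body) and body[dot] in NONTERMINALS:
            for i, (head, _) in enumerate(GRAMMAR):
                if head == body[dot] and (i, 0) not in result:
                    result.add((i, 0))
                    work.append((i, 0))
    return frozenset(result)

def goto(items, symbol):
    """Move the dot over the given symbol, then take the closure."""
    moved = {(p, d + 1) for p, d in items
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved)

def canonical_collection():
    states, transitions = [closure({(0, 0)})], {}
    for state in states:                       # the list grows while we iterate
        symbols = {GRAMMAR[p][1][d] for p, d in state if d < len(GRAMMAR[p][1])}
        for sym in sorted(symbols):
            target = goto(state, sym)
            if target not in states:
                states.append(target)
            transitions[(states.index(state), sym)] = states.index(target)
    return states, transitions

def lr0_conflicts(states):
    """States that mix a completed item with a shift, or hold two completed items."""
    bad = []
    for n, state in enumerate(states):
        complete = [p for p, d in state if d == len(GRAMMAR[p][1])]
        shifts = [p for p, d in state if d < len(GRAMMAR[p][1])
                  and GRAMMAR[p][1][d] not in NONTERMINALS]
        if (complete and shifts) or len(complete) > 1:
            bad.append(n)
    return bad

states, transitions = canonical_collection()
print(len(states), transitions, lr0_conflicts(states))
```

The run reports six states and a single conflicting state, State 2, where E → T. (reduce) sits next to E → T.+E (shift on +).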
SLR(1) (Simple LR) parsing is an improvement over LR(0) parsing, using lookahead
symbols to resolve conflicts and enhance decision-making. It builds on LR(0) parsing
by checking Follow sets of non-terminals before reducing.
Steps:
1. Define the Grammar
2. Find First and Follow
3. Build the LR(0) States (SLR(1) uses the same item sets as LR(0))
4. Construct Action Table (Shift-Reduce Decisions)
5. Construct Goto Table
Difference Between LR(0) and SLR(1)
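 SLR(1) places the reduce action for A → α only under the terminals in Follow(A),
which removes many of the conflicts that appear in LR(0) tables.

 A reduction is therefore performed only when the next input symbol is in the Follow
set of the production's left-hand side.

As a result, SLR(1) can parse more grammars than LR(0) while using the same states.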
Grammar (augmented with E’ → E):
 (r1) E → T+E
 (r2) E → T
 (r3) T → id
Follow(E) = {$}, Follow(T) = {+, $}

SLR(1) parsing table
State   Action                   Go to
        id      +       $        E     T
0       S3                       1     2
1                       Accept
2               S4      r2
3               r3      r3
4       S3                       5     2
5                       r1

The item sets (States 0 to 5) and their transitions are the same as for LR(0); only the
reduce entries change.
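An LR driver loop that uses the SLR(1) table for E → T+E | T, T → id might look like the sketch below. The table is typed in by hand under the assumption Follow(E) = {$} and Follow(T) = {+, $}, so State 2 shifts on + and reduces by r2 only on $. The names `ACTION`, `GOTO`, `PRODUCTIONS` and `lr_parse` are hypothetical.

```python
# Production number -> (head, length of the right-hand side)
PRODUCTIONS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 1)}   # E -> T+E, E -> T, T -> id

ACTION = {
    (0, "id"): ("s", 3),
    (1, "$"): ("acc", None),
    (2, "+"): ("s", 4),
    (2, "$"): ("r", 2),
    (3, "+"): ("r", 3),
    (3, "$"): ("r", 3),
    (4, "id"): ("s", 3),
    (5, "$"): ("r", 1),
}
GOTO = {(0, "E"): 1, (0, "T"): 2, (4, "E"): 5, (4, "T"): 2}

def lr_parse(tokens):
    """Return the sequence of reductions performed, or raise SyntaxError."""
    tokens = list(tokens) + ["$"]
    stack = [0]                                  # stack of state numbers
    pos = 0
    reductions = []
    while True:
        state, look = stack[-1], tokens[pos]
        kind, arg = ACTION.get((state, look), ("err", None))
        if kind == "s":                          # shift: push the new state
            stack.append(arg)
            pos += 1
        elif kind == "r":                        # reduce: pop |rhs| states, then goto
            head, length = PRODUCTIONS[arg]
            del stack[-length:]
            stack.append(GOTO[(stack[-1], head)])
            reductions.append(arg)
        elif kind == "acc":
            return reductions
        else:
            raise SyntaxError(f"unexpected {look!r} in state {state}")

print(lr_parse(["id", "+", "id"]))
```

For `id + id` the reductions are r3, r3, r2, r1: both id tokens become T, the second T becomes E, and T+E is finally reduced to E before the parser accepts.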
In Canonical LR(1) (CLR) parsing, lookahead symbols play a crucial role in
determining when to reduce a production. Unlike SLR(1), which relies on Follow sets,
CLR(1) attaches a specific lookahead symbol to each LR(1) item, refining parsing
decisions.
Steps:
1. Augment the Grammar
2. Compute First Sets
3. Build LR(1) States (item sets)
4. Assign Lookahead Symbols Using First Sets

5. Construct Action and Goto Tables
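How lookaheads are attached to items can be sketched with the LR(1) closure rule: for an item [A → α.Bβ, a], add [B → .γ, b] for every b in First(βa). The sketch below applies this to the grammar E’ → E, E → T+E | T, T → id. It assumes a grammar without ε-productions, so First(βa) is decided by the first symbol of β; the First sets are typed in by hand and all names are hypothetical.

```python
GRAMMAR = [
    ("E'", ("E",)),             # production 0: augmented start
    ("E", ("T", "+", "E")),     # production 1
    ("E", ("T",)),              # production 2
    ("T", ("id",)),             # production 3
]
FIRST = {"E": {"id"}, "T": {"id"}}   # assumption: no non-terminal derives ε

def first_of(symbols, lookahead):
    """First(β a) when no symbol is nullable: look only at the first symbol."""
    if not symbols:
        return {lookahead}
    head = symbols[0]
    return FIRST.get(head, {head})   # a terminal is its own First set

def closure_lr1(items):
    """An item is (production index, dot position, lookahead terminal)."""
    result = set(items)
    work = list(items)
    while work:
        prod, dot, look = work.pop()
        body = GRAMMAR[prod][1]
        if dot == len(body) or body[dot] not in FIRST:
            continue                 # dot at the end, or in front of a terminal
        for b in first_of(body[dot + 1:], look):
            for i, (head, _) in enumerate(GRAMMAR):
                item = (i, 0, b)
                if head == body[dot] and item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)

def goto_lr1(items, symbol):
    moved = {(p, d + 1, a) for p, d, a in items
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure_lr1(moved)

def reduce_lookaheads(items):
    """Map each completed production in a state to the lookaheads that trigger it."""
    table = {}
    for p, d, a in items:
        if d == len(GRAMMAR[p][1]):
            table.setdefault(p, set()).add(a)
    return table

start = closure_lr1({(0, 0, "$")})
print(reduce_lookaheads(goto_lr1(start, "T")))
print(reduce_lookaheads(goto_lr1(start, "id")))
```

In the state reached on T, production 2 (E → T) is reduced only on $, and in the state reached on id, production 3 (T → id) is reduced on + or $. For this small grammar these lookaheads coincide with the Follow sets; in general the per-item lookaheads of CLR(1) can be smaller.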

