Parsing Problems and Solutions
Jordi Cortadella
February 7, 2022
Parsing
a) The set of all strings of a’s and b’s that are palindromes.
Solution:
S → ε | a | b | aSa | bSb
b) Strings that match the pattern a∗b∗ and have more a’s than b’s.
Solution:
S → AX
A → a | aA
X → aXb | ε
c) Well-nested strings of parentheses and square brackets, e.g.,
( [ [ ] ( ( ) [ ( ) ] [ ] ) ] )
Solution:
S → S ( S ) | S [ S ] | ε
d) The set of all strings of a’s and b’s such that every a is immediately followed by at least one b.
Solution:
S → bS | abS | ε
e) The set of all strings of a’s and b’s with an equal number of a’s and b’s.
Solution: S generates the language that has an equal number of a’s and b’s. A generates the
language that has one more a than b. B generates the language that has one more b than a.
S → aB | bA | ε
A → aS | bAA
B → bS | aBB
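The roles claimed for S, A and B can be checked by brute force. The following is a minimal sketch, not part of the original solution: the function names `derives` and `check` are hypothetical, and the test only covers strings up to a chosen length, so it is evidence rather than a proof.

```python
from functools import lru_cache
from itertools import product


@lru_cache(maxsize=None)
def derives(symbol: str, s: str) -> bool:
    """Return True if the nonterminal `symbol` (S, A or B) derives the string s."""
    if symbol == "S":
        if s == "":
            return True                      # S → ε
        other = "B" if s[0] == "a" else "A"  # S → aB | bA
        return derives(other, s[1:])
    # A → aS | bAA and B → bS | aBB are mirror images of each other.
    own = "a" if symbol == "A" else "b"
    if s == "":
        return False                         # A and B never derive ε
    rest = s[1:]
    if s[0] == own:
        return derives("S", rest)            # A → aS  (B → bS)
    # A → bAA (B → aBB): try every split of the rest into two non-empty parts.
    return any(derives(symbol, rest[:i]) and derives(symbol, rest[i:])
               for i in range(1, len(rest)))


def check(max_len: int = 8) -> bool:
    """Compare the grammar with direct counting on all strings up to max_len."""
    for n in range(max_len + 1):
        for chars in product("ab", repeat=n):
            s = "".join(chars)
            if derives("S", s) != (s.count("a") == s.count("b")):
                return False
    return True
```

Calling `check()` compares the grammar against a direct count of a’s and b’s for every string over {a, b} of length at most 8.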
f) The set of all strings of a’s and b’s with a different number of a’s and b’s.
Solution:
g) Blocks of statements in Pascal, where the semicolons separate the statements, e.g.,
Solution:
Let I be a statement.
BP → ( LI )
LI → I ; LI | I
I → statement | BP
Solution:
Let I be a statement.
BC → { LI }
LI → I LI | ε
I → statement ; | BC
2. Consider the following grammar:
S → SS+ | SS∗ | a
and the string aa+a∗.
Give a leftmost derivation for the string.
Solution:
S → SS∗ → SS+S∗ → aS+S∗ → aa+S∗ → aa+a∗
Solution:
S → SS∗ → Sa∗ → SS+a∗ → Sa+a∗ → aa+a∗
Solution:
             S
         /   |   \
      S      S     ∗
    / | \    |
   S  S  +   a
   |  |
   a  a
Solution: The grammar is unambiguous since only one parse tree can be generated for each
string. Each of the three productions of S ends with a different terminal symbol (a, + or ∗).
Hence, when the string is parsed from right to left, the symbol being read determines the
only production that can be applied.
Solution: This grammar generates arithmetic expressions in reverse Polish (postfix) notation,
where + and ∗ are the operators and a represents the operands.
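The right-to-left argument for unambiguity can be made concrete with a small parser. This is a sketch under stated assumptions: the function name `parse_postfix` is hypothetical, the ASCII character `*` stands for ∗, and the parse tree is represented as nested tuples `(left, right, operator)`.

```python
def parse_postfix(s: str):
    """Parse s from right to left with S → SS+ | SS* | a.

    The last unread symbol selects the production, so no backtracking
    is needed. Returns "a" for a leaf or a tuple (left, right, operator).
    """
    pos = len(s) - 1  # index of the next symbol to read, moving leftwards

    def parse_s():
        nonlocal pos
        if pos < 0:
            raise SyntaxError("operand missing")
        symbol = s[pos]
        pos -= 1
        if symbol == "a":                 # S → a
            return "a"
        if symbol in ("+", "*"):          # S → SS+ or S → SS*
            right = parse_s()             # the right operand is read first
            left = parse_s()
            return (left, right, symbol)
        raise SyntaxError(f"unexpected symbol {symbol!r}")

    tree = parse_s()
    if pos != -1:
        raise SyntaxError("symbols left over at the start of the string")
    return tree


print(parse_postfix("aa+a*"))
```

For the string aa+a∗ of this problem, the result has the same shape as the parse tree above: the root is a ∗ node whose left operand is the + node.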
3. Calculate Nullable, First and Follow of the non-terminal symbols in the following grammar:
A → B | a
B → b | ε
C → c | ABC
Solution:
     Nullable   First        Follow
A    Yes        {a, b}       {a, b, c}
B    Yes        {b}          {a, b, c}
C    No         {a, b, c}    ∅
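The table can be reproduced with the usual fixed-point iteration. The following is a minimal sketch, not the author's method: the dictionary encoding of the grammar and the function name `analyze` are assumptions, and no end-of-input marker is added because the problem does not name a start symbol.

```python
# Each nonterminal maps to its list of production bodies; [] encodes ε.
GRAMMAR = {
    "A": [["B"], ["a"]],
    "B": [["b"], []],
    "C": [["c"], ["A", "B", "C"]],
}


def analyze(grammar):
    """Compute Nullable, First and Follow by iterating until nothing changes."""
    nullable = set()
    first = {n: set() for n in grammar}
    follow = {n: set() for n in grammar}
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                # Nullable: every symbol of the body is nullable.
                if head not in nullable and all(x in nullable for x in body):
                    nullable.add(head)
                    changed = True
                # First: scan the body until a non-nullable symbol is found.
                for x in body:
                    add = first[x] if x in grammar else {x}
                    if not add <= first[head]:
                        first[head] |= add
                        changed = True
                    if x not in nullable:
                        break
                # Follow: scan right to left, carrying what can come next.
                trailer = set(follow[head])
                for x in reversed(body):
                    if x in grammar:
                        if not trailer <= follow[x]:
                            follow[x] |= trailer
                            changed = True
                        trailer = (trailer | first[x]) if x in nullable else set(first[x])
                    else:
                        trailer = {x}
    return nullable, first, follow


nullable, first, follow = analyze(GRAMMAR)
for n in GRAMMAR:
    print(n, n in nullable, sorted(first[n]), sorted(follow[n]))
```

Follow(C) stays empty here only because no start symbol is declared; if C were taken as the start symbol, the end-of-input marker would be added to it.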
4. Consider the following grammar:
S → cABc
A → aAa | c
B → bBb | c
Construct the LL(1) parsing table and check whether it is an LL(1) grammar.
Solution:
5. Calculate Nullable, First and Follow for the following grammar:
S → uBDz
B → Bv | w
D → EF
E → y | ε
F → x | ε
Construct the LL(1) parsing table and give evidence that this grammar is not LL(1). Modify the
grammar as little as possible to make an LL(1) grammar that accepts the same language.
Solution:
It is not an LL(1) grammar since there is a conflict in cell ⟨B, w⟩: both B → Bv and B → w
are selected on w. The conflict is caused by the left recursion in the production rules of B.
The language generated by B is wv∗. The same language can be generated using an additional
symbol (B′) and right recursion, e.g., B → wB′ and B′ → vB′ | ε.
With the new rules, B′ is nullable, First(B′) = {v} and Follow(B′) = Follow(B) = {x, y, z}.
The other definitions of Nullable, First and Follow remain the same.
With this transformation, the LL(1) parsing table would be as follows:
      u           v           w           x          y          z
S     S → uBDz
B                             B → wB′
B′                B′ → vB′                B′ → ε     B′ → ε     B′ → ε
D                                         D → EF     D → EF     D → EF
E                                         E → ε      E → y      E → ε
F                                         F → x                 F → ε
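The conflict and its repair can be checked mechanically. The sketch below is an illustration under stated assumptions: the names `analyze`, `ll1_table` and `conflicts` are hypothetical, the string `B'` stands for the new symbol, `$` marks the end of the input, and `[]` encodes ε.

```python
def analyze(grammar, start):
    """Fixed-point computation of Nullable, First and Follow."""
    nullable = set()
    first = {n: set() for n in grammar}
    follow = {n: set() for n in grammar}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                if head not in nullable and all(x in nullable for x in body):
                    nullable.add(head)
                    changed = True
                for x in body:
                    add = first[x] if x in grammar else {x}
                    if not add <= first[head]:
                        first[head] |= add
                        changed = True
                    if x not in nullable:
                        break
                trailer = set(follow[head])
                for x in reversed(body):
                    if x in grammar:
                        if not trailer <= follow[x]:
                            follow[x] |= trailer
                            changed = True
                        trailer = (trailer | first[x]) if x in nullable else set(first[x])
                    else:
                        trailer = {x}
    return nullable, first, follow


def ll1_table(grammar, start):
    """Map each cell (nonterminal, terminal) to the list of selected bodies."""
    nullable, first, follow = analyze(grammar, start)
    table = {}
    for head, bodies in grammar.items():
        for body in bodies:
            lookahead, body_nullable = set(), True
            for x in body:
                lookahead |= first[x] if x in grammar else {x}
                if x not in nullable:
                    body_nullable = False
                    break
            if body_nullable:            # the body can vanish: use Follow(head)
                lookahead |= follow[head]
            for t in lookahead:
                table.setdefault((head, t), []).append(body)
    return table


def conflicts(table):
    """Cells that hold more than one production."""
    return {cell for cell, bodies in table.items() if len(bodies) > 1}


ORIGINAL = {
    "S": [["u", "B", "D", "z"]],
    "B": [["B", "v"], ["w"]],
    "D": [["E", "F"]],
    "E": [["y"], []],
    "F": [["x"], []],
}
FIXED = dict(ORIGINAL, B=[["w", "B'"]])
FIXED["B'"] = [["v", "B'"], []]

print(conflicts(ll1_table(ORIGINAL, "S")))
print(conflicts(ll1_table(FIXED, "S")))
```

The first call reports the single conflicting cell of the original grammar, and the second call reports none for the transformed grammar.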
6. Design a table-driven top-down parser for the following grammar:
S → E
E → T + E | T
T → num ∗ T | num
Solution: First of all, the grammar is not LL(1) since E (and also T) have productions with
common prefixes. We need to transform the grammar:
S → E
E → T E′
E′ → + E | ε
T → num T′
T′ → ∗ T | ε
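A table-driven parser for the transformed grammar could look as follows. This is a sketch, not the original solution: the parsing table below was derived here from First and Follow of the transformed grammar (Follow(E′) = {$} and Follow(T′) = {+, $}), the names `TABLE` and `parse` are hypothetical, `E'` and `T'` stand for E′ and T′, and `*` stands for ∗.

```python
# LL(1) table: (nonterminal, lookahead) -> body of the production to apply.
TABLE = {
    ("S", "num"): ["E"],
    ("E", "num"): ["T", "E'"],
    ("E'", "+"): ["+", "E"],
    ("E'", "$"): [],             # E' → ε
    ("T", "num"): ["num", "T'"],
    ("T'", "*"): ["*", "T"],
    ("T'", "+"): [],             # T' → ε
    ("T'", "$"): [],             # T' → ε
}
NONTERMINALS = {"S", "E", "E'", "T", "T'"}


def parse(tokens) -> bool:
    """Return True if the token sequence is generated by the grammar."""
    tokens = list(tokens) + ["$"]   # end-of-input marker
    stack = ["$", "S"]              # the start symbol sits on top of the marker
    i = 0
    while stack:
        top = stack.pop()
        lookahead = tokens[i]
        if top in NONTERMINALS:
            body = TABLE.get((top, lookahead))
            if body is None:        # empty cell: syntax error
                return False
            stack.extend(reversed(body))  # leftmost symbol ends up on top
        elif top == lookahead:      # terminal on the stack matches the input
            i += 1
        else:
            return False
    return i == len(tokens)


print(parse("num * num + num".split()))
print(parse("num +".split()))
```

The stack holds the part of the sentential form that still has to be matched, so each iteration either expands a nonterminal using one table cell or consumes one token.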
S′ → S            (1)
S → V ; S | ε     (2) (3)
V → int id        (4)
Solution:
      Nullable   First    Follow
S′    yes        {int}    {$}
S     yes        {int}    {$}
V     no         {int}    {;}
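Automaton:
State 0:  S′ → . S
          S → . V ; S
          S → .
          V → . int id
State 1:  V → int . id
State 2:  S′ → S .
State 3:  S → V . ; S
State 4:  V → int id .
State 5:  S → V ; . S
          S → . V ; S
          S → .
          V → . int id
State 6:  S → V ; S .
Transitions: 0 --int--> 1, 0 --S--> 2, 0 --V--> 3, 1 --id--> 4, 3 --;--> 5,
             5 --int--> 1, 5 --V--> 3, 5 --S--> 6

LR(1) table:
State   int   id    ;     $     S    V
0       s1                r3    2    3
1             s4
2                         acc
3                   s5
4                   r4
5       s1                r3    6    3
6                         r2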
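The table can be exercised with a generic shift-reduce driver. The following is a minimal sketch: the action and goto entries are copied from the table above, the production numbers are those of the grammar, and the names `ACTION`, `GOTO`, `PRODUCTIONS` and `parse` are hypothetical.

```python
# Production number -> (left-hand side, length of the right-hand side).
PRODUCTIONS = {2: ("S", 3), 3: ("S", 0), 4: ("V", 2)}

ACTION = {
    (0, "int"): ("s", 1), (0, "$"): ("r", 3),
    (1, "id"): ("s", 4),
    (2, "$"): ("acc", None),
    (3, ";"): ("s", 5),
    (4, ";"): ("r", 4),
    (5, "int"): ("s", 1), (5, "$"): ("r", 3),
    (6, "$"): ("r", 2),
}
GOTO = {(0, "S"): 2, (0, "V"): 3, (5, "S"): 6, (5, "V"): 3}


def parse(tokens):
    """Return the list of reductions performed, or None on a syntax error."""
    tokens = list(tokens) + ["$"]
    stack = [0]                     # stack of automaton states
    i = 0
    reductions = []
    while True:
        action = ACTION.get((stack[-1], tokens[i]))
        if action is None:          # empty cell: syntax error
            return None
        kind, arg = action
        if kind == "s":             # shift: push the new state, consume token
            stack.append(arg)
            i += 1
        elif kind == "r":           # reduce: pop |rhs| states, then goto
            head, length = PRODUCTIONS[arg]
            if length:
                del stack[-length:]
            stack.append(GOTO[(stack[-1], head)])
            reductions.append(arg)
        else:                       # accept
            return reductions


print(parse("int id ; int id ;".split()))
print(parse("int id".split()))
```

The returned list is the sequence of productions applied, which read in reverse order gives a rightmost derivation of the input.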
8. Design an LR(1) parser for the following grammar:
S′ → S          (1)
S → ddX         (2)
X → aX | ε      (3) (4)
Solution:
      Nullable   First   Follow
S′    no         {d}     {$}
S     no         {d}     {$}
X     yes        {a}     {$}
9. Consider the following EBNF grammar, where the tokens are the symbols within quotes (e.g., 'if'),
ID and INTEGER.
program : statement+ ;
statement :
      'if' paren_expr statement ('else' statement)?
    | 'while' paren_expr statement
    | 'do' statement 'while' paren_expr ';'
    | '{' statement* '}'
    | expr ';'
    | ';'
    ;
match(T) {
  if (Token == T) nexttoken();
  else SyntaxError();
}
In the code you can also use expressions like
Solution:
First
  program      'if' 'while' 'do' '{' ';' ID INTEGER '('
  statement    'if' 'while' 'do' '{' ';' ID INTEGER '('
  paren_expr   '('
  expr         ID INTEGER '('
  test         ID INTEGER '('
  sum          ID INTEGER '('
  term         ID INTEGER '('
Follow
  program      EOF
  statement    'if' 'while' 'do' '{' ';' ID INTEGER '(' 'else' '}' EOF
  paren_expr   'if' 'while' 'do' '{' ';' ID INTEGER '(' ')' '+' '-' '<'
  expr         ')' ';'
  test         ')' ';'
  sum          '<' ')' ';'
  term         '+' '-' '<' ')' ';'
void program() {
  do {
    statement();
  } while (Token in First(statement));
}

void statement() {
  if (Token == 'if') {
    nexttoken(); paren_expr(); statement();
    if (Token == 'else') {
      // Greedy option since 'else' is also in Follow(statement):
      // the 'else' is attached to the nearest 'if'.
      nexttoken(); statement();
    }
  } else if (Token == 'while') {
    nexttoken(); paren_expr(); statement();
  } else if (Token == 'do') {
    nexttoken(); statement(); match('while'); paren_expr(); match(';');
  } else if (Token == '{') {
    nexttoken();
    while (Token in First(statement)) statement();
    match('}');
  } else if (Token in First(expr)) {
    expr(); match(';');
  } else if (Token == ';') nexttoken();
  else SyntaxError();
}

void paren_expr() {
  match('('); expr(); match(')');
}

void expr() {
  // Here is a conflict since ID is in First(test): one token of lookahead
  // cannot tell a test from an assignment. As written, the first branch
  // always wins and the assignment branch below is never reached.
  if (Token in First(test)) test();
  else if (Token == ID) {
    nexttoken(); match('='); expr();
  } else SyntaxError();
}

void test() {
  sum();
  if (Token == '<') {
    nexttoken(); sum();
  }
}

void sum() {
  term();
  while (Token == '+' or Token == '-') {
    nexttoken(); term(); // Code optimization: a loop replaces recursion
  }
}

void term() {
  if (Token == ID) nexttoken();
  else if (Token == INTEGER) nexttoken();
  else if (Token == '(') paren_expr(); // paren_expr() matches the '(' itself
  else SyntaxError();
}