How To Create A Recursive Parser
7.1
The basic idea of a recursive descent parser is to use the current input symbol
to decide which alternative to choose. Grammars which have the property that
it is possible to do this are called LL(1) grammars.
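First we introduce an end marker \(\$\). For a given grammar \(G = (V, \Sigma, S, P)\) we define the
augmented grammar \(G_{\$} = (V', \Sigma', S', P')\) where

- \(V' = V \cup \{S'\}\), where \(S'\) is chosen s.t. \(S' \notin V \cup \Sigma\),
- \(\Sigma' = \Sigma \cup \{\$\}\), where \(\$\) is chosen s.t. \(\$ \notin V \cup \Sigma\),
- \(P' = P \cup \{S' \to S\$\}\).

The idea is that

\[ L(G_{\$}) = \{\, w\$ \mid w \in L(G) \,\} \]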
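Now for each symbol \(A \in V' \cup \Sigma'\) we define

\[ \mathrm{First}(A) = \{\, a \mid a \in \Sigma' \wedge A \Rightarrow^{*} a\beta \,\} \]
\[ \mathrm{Follow}(A) = \{\, a \mid a \in \Sigma' \wedge S' \Rightarrow^{*} \alpha A a \beta \,\} \]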
i.e. First(A) is the set of terminal symbols with which a word derived from A
may start and Follow(A) is the set of symbols which may occur directly after
A. We use the augmented grammar to have a marker for the end of the word.
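The two definitions can be turned into a small fixpoint computation. The following is a minimal sketch, not a definitive implementation: the grammar in PRODS is an assumed arithmetic example (the notes do not list it in full at this point), "S" stands for the new start symbol S', and the class name and all field names are made up for illustration.

```java
import java.util.*;

class FirstFollowSketch {
    // Assumed example grammar; each row is {lhs, rhs symbols...}.
    // "S" stands for the new start symbol S', E1 and T1 for E' and T'.
    static final String[][] PRODS = {
        {"S", "E", "$"},
        {"E", "T", "E1"},
        {"E1", "+", "T", "E1"}, {"E1"},          // {"E1"} is E' -> epsilon
        {"T", "F", "T1"},
        {"T1", "*", "F", "T1"}, {"T1"},
        {"F", "(", "E", ")"}, {"F", "a"}
    };
    static final Set<String> nonterminals = new TreeSet<>();
    static final Set<String> nullable = new TreeSet<>();   // symbols deriving epsilon
    static final Map<String, Set<String>> first = new TreeMap<>();
    static final Map<String, Set<String>> follow = new TreeMap<>();

    static void compute() {
        for (String[] p : PRODS) nonterminals.add(p[0]);
        for (String[] p : PRODS) {
            for (String s : p) {
                first.putIfAbsent(s, new TreeSet<>());
                follow.putIfAbsent(s, new TreeSet<>());
                if (!nonterminals.contains(s)) first.get(s).add(s); // First(a) = {a}
            }
        }
        boolean changed = true;
        while (changed) {                       // repeat until no set grows
            changed = false;
            for (String[] p : PRODS) {
                String a = p[0];
                // First: add First(Bi) as long as B1..B(i-1) all derive epsilon.
                boolean prefixNullable = true;
                for (int i = 1; i < p.length && prefixNullable; i++) {
                    changed |= first.get(a).addAll(first.get(p[i]));
                    prefixNullable = nullable.contains(p[i]);
                }
                if (prefixNullable) changed |= nullable.add(a);
                // Follow: walk the right-hand side from right to left; trailer holds
                // the terminals that may come directly after position i.
                Set<String> trailer = new TreeSet<>(follow.get(a));
                for (int i = p.length - 1; i >= 1; i--) {
                    changed |= follow.get(p[i]).addAll(trailer);
                    if (!nullable.contains(p[i])) trailer = new TreeSet<>();
                    trailer.addAll(first.get(p[i]));
                }
            }
        }
    }

    public static void main(String[] args) {
        compute();
        for (String n : nonterminals) {
            System.out.println(n + ": First=" + first.get(n) + " Follow=" + follow.get(n));
        }
    }
}
```

For this assumed grammar the computation yields Follow(E1) = {), $}, which matches the symbols the parser for E' checks for in the empty alternative.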
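For each production \(A \to \alpha \in P\) we define the set \(\mathrm{Lookahead}(A \to \alpha)\), which
is the set of symbols indicating that we are in this alternative:

\[
\mathrm{Lookahead}(A \to B_1 B_2 \dots B_n) = \bigcup \{\, \mathrm{First}(B_i) \mid \forall\, 1 \le k < i.\ B_k \Rightarrow^{*} \epsilon \,\}
\;\cup\;
\begin{cases}
\mathrm{Follow}(A) & \text{if } B_1 B_2 \dots B_n \Rightarrow^{*} \epsilon \\
\emptyset & \text{otherwise}
\end{cases}
\]

We now say a grammar \(G\) is LL(1) iff for each pair \(A \to \alpha,\ A \to \beta \in P\) with
\(\alpha \neq \beta\) it is the case that \(\mathrm{Lookahead}(A \to \alpha) \cap \mathrm{Lookahead}(A \to \beta) = \emptyset\).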
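As a small worked instance, consider the two alternatives for E' that are translated into parseE1 later in this section. The value Follow(E') = {), $} is taken from the tests in that method; the rest follows from the definition above.

```latex
\[
\begin{aligned}
\mathrm{Lookahead}(E' \to +\,T\,E') &= \mathrm{First}(+) \cup \emptyset = \{+\} \\
\mathrm{Lookahead}(E' \to \epsilon) &= \emptyset \cup \mathrm{Follow}(E') = \{\,)\,,\ \$\,\} \\
\mathrm{Lookahead}(E' \to +\,T\,E') \cap \mathrm{Lookahead}(E' \to \epsilon) &= \emptyset
\end{aligned}
\]
```

In the first line only First(B1) contributes, because B1 = + does not derive the empty word. In the second line the right-hand side is empty, so the union over First is empty and Follow(E') is added. The two sets are disjoint, so the current input symbol decides between the alternatives.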
7.2
7.3
7.4
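static void next() {
    try {
        // intern() returns the canonical copy, so tokens can be compared with ==
        curr = st.nextToken().intern();
    } catch (NoSuchElementException e) {
        // no tokens left
        curr = null;
    }
}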
We also implement a convenience method error(String) to report an error
and terminate the program.
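The scaffolding around next() and error(String) might look as follows. This is a sketch under stated assumptions: tokens are separated by blanks, the end marker "$" is appended to the input line, st is a java.util.StringTokenizer, and the class name TokenDemo and the helper init are hypothetical.

```java
import java.util.NoSuchElementException;
import java.util.StringTokenizer;

class TokenDemo {
    static StringTokenizer st;  // remaining input tokens
    static String curr;         // current token, null once the input is exhausted

    // Assumption: blank-separated tokens, "$" appended as the end marker.
    static void init(String line) {
        st = new StringTokenizer(line + " $");
    }

    static void next() {
        try {
            // intern() makes comparisons such as curr == "+" reliable
            curr = st.nextToken().intern();
        } catch (NoSuchElementException e) {
            curr = null;
        }
    }

    // Report an error and terminate the program.
    static void error(String msg) {
        System.err.println(msg);
        System.exit(1);
    }

    public static void main(String[] args) {
        init("( a + a ) * a");
        // Print every token, including the end marker, one per line.
        for (next(); curr != null; next()) {
            System.out.println(curr);
        }
    }
}
```

Interning matters because == on strings compares references; without intern() a freshly tokenized "+" would in general not be the same object as the literal "+".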
Now we can translate all productions into methods using the Lookahead sets to
determine which alternative to choose. E.g. we translate
\(E' \to +\,T\,E' \mid \epsilon\)
into (using E1 for \(E'\) to follow JAVA rules):
static void parseE1() {
    if (curr == "+") {
        // alternative E' -> + T E'
        next();
        parseT();
        parseE1();
    } else if (curr == ")" || curr == "$") {
        // empty alternative: consume nothing
    } else {
        error("Unexpected :" + curr);
    }
}
The basic idea is to

- Translate each occurrence of a terminal symbol into a test that this symbol has been read and a call of next().
- Translate each nonterminal symbol into a call of the method with the same name.
- If you have to decide between different productions, use the lookahead sets to determine which one to use.
- If you find that there is no way to continue, call error().
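Applying these rules to a whole grammar gives one method per nonterminal. The sketch below assumes the usual arithmetic grammar E -> T E', E' -> + T E' | epsilon, T -> F T', T' -> * F T' | epsilon, F -> ( E ) | a, which is not spelled out in full here. The class name, the accepts helper, the error messages in parseT1 and parseF, and the exception-throwing variant of error() are illustrative choices, not the original program.

```java
import java.util.NoSuchElementException;
import java.util.StringTokenizer;

class RecognizerSketch {
    static StringTokenizer st;
    static String curr;

    static void next() {
        try { curr = st.nextToken().intern(); }
        catch (NoSuchElementException e) { curr = null; }
    }

    // Hypothetical variant: throws instead of terminating the program,
    // so the recognizer can be called several times.
    static void error(String msg) { throw new IllegalStateException(msg); }

    // E -> T E'
    static void parseE() { parseT(); parseE1(); }

    // E' -> + T E' | epsilon
    static void parseE1() {
        if (curr == "+") { next(); parseT(); parseE1(); }
        else if (curr == ")" || curr == "$") { /* empty alternative */ }
        else error("Unexpected :" + curr);
    }

    // T -> F T'
    static void parseT() { parseF(); parseT1(); }

    // T' -> * F T' | epsilon, with Follow(T') = {+, ), $} under the assumed grammar
    static void parseT1() {
        if (curr == "*") { next(); parseF(); parseT1(); }
        else if (curr == "+" || curr == ")" || curr == "$") { /* empty alternative */ }
        else error("Unexpected :" + curr);
    }

    // F -> ( E ) | a : every terminal becomes a test followed by next()
    static void parseF() {
        if (curr == "(") {
            next();
            parseE();
            if (curr == ")") next(); else error("Closing bracket expected");
        } else if (curr == "a") {
            next();
        } else {
            error("Unexpected :" + curr);
        }
    }

    // Returns true iff the blank-separated input is derivable from E.
    static boolean accepts(String line) {
        st = new StringTokenizer(line + " $");
        try {
            next();
            parseE();
            // also reject input that contains a stray "$" before the real end
            if (curr != "$" || st.hasMoreTokens()) error("End expected");
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(accepts("( a + a ) * a"));
        System.out.println(accepts("a + * a"));
    }
}
```

Note how parseF follows the first rule of the list twice: once for "(" and once for ")", each as a test on curr followed by next().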
We initiate the parsing process by calling next() to read the first symbol and
then call parseE(). If after processing parseE() we are at the end marker,
then the parsing has been successful.
next();
parseE();
if(curr=="$") {
System.out.println("OK ");
} else {
error("End expected");
}
The complete parser can be found at
http://www.cs.nott.ac.uk/~txa/g51mal/ParseE0.java.
Actually, we can be a bit more realistic and turn the parser into a simple evaluator by making two changes:

- Translate the current token into a number. JAVA will raise an exception if this fails.
- Calculate the value of the expression read, i.e. we have to change the method interfaces:
static int parseE()
static int parseE1(int x)
static int parseT()
static int parseT1(int x)
static int parseF()
7.5
The idea behind parseE1 and parseT1 is to pass the result calculated
so far and leave it to the method to incorporate the missing part of the
expression. I.e. in the case of parseE1
static int parseE1(int x) {
if (curr=="+") {
next();
int y = parseT();
return parseE1(x+y);
} else if(curr==")" || curr=="$" ) {
return x;
} else {
error("Unexpected :"+curr);
return x;
}
}
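Putting the changed interfaces together, a complete evaluator could look like the sketch below. It rests on several assumptions: the grammar is the arithmetic one with + and *, numbers are converted with Integer.parseInt (the notes only say that JAVA raises an exception on failure), error() throws instead of terminating, and the names EvaluatorSketch and eval are hypothetical.

```java
import java.util.NoSuchElementException;
import java.util.StringTokenizer;

class EvaluatorSketch {
    static StringTokenizer st;
    static String curr;

    static void next() {
        try { curr = st.nextToken().intern(); }
        catch (NoSuchElementException e) { curr = null; }
    }

    // Hypothetical variant of error(): throws instead of terminating the program.
    static void error(String msg) { throw new IllegalStateException(msg); }

    // E -> T E' : the value of T is handed on to parseE1
    static int parseE() { return parseE1(parseT()); }

    static int parseE1(int x) {
        if (curr == "+") {
            next();
            int y = parseT();
            return parseE1(x + y);   // x is the result calculated so far
        } else if (curr == ")" || curr == "$") {
            return x;
        } else {
            error("Unexpected :" + curr);
            return x;
        }
    }

    // T -> F T'
    static int parseT() { return parseT1(parseF()); }

    static int parseT1(int x) {
        if (curr == "*") {
            next();
            int y = parseF();
            return parseT1(x * y);
        } else if (curr == "+" || curr == ")" || curr == "$") {
            return x;
        } else {
            error("Unexpected :" + curr);
            return x;
        }
    }

    // F -> ( E ) | integer
    static int parseF() {
        if (curr == "(") {
            next();
            int x = parseE();
            if (curr == ")") next(); else error("Closing bracket expected");
            return x;
        } else {
            // Assumption: Integer.parseInt does the conversion; it throws
            // NumberFormatException if the token is not an integer.
            int x = Integer.parseInt(curr);
            next();
            return x;
        }
    }

    static int eval(String line) {
        st = new StringTokenizer(line + " $");
        next();
        int value = parseE();
        if (curr != "$") error("End expected");
        return value;
    }

    public static void main(String[] args) {
        System.out.println(eval("2 + 3 * 4"));      // prints 14
        System.out.println(eval("( 2 + 3 ) * 4"));  // prints 20
    }
}
```

Passing the accumulated value as the parameter x means the additions and multiplications are carried out from left to right, even though the grammar itself recurses to the right.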