Lecture 05 Parsing
Reading material: These notes and an implementation (see course web page).

The best way to prepare [to be a programmer] is to write programs, and to study great programs that other people have written. In my case, I went to the garbage cans at the Computer Science Center and fished out listings of their operating system. (Bill Gates)

First learn computer science and all the theory. Next develop a programming style. Then forget all that and just hack. (George Carrette)

Look at this website:

A grammar: set of rules for generating sentences of a language.
Sentence ::= Noun Verb Noun
Noun ::= boys
Noun ::= girls
Verb ::= like
Verb ::= see

Example of Sentence:
boys see girls
girls like boys

This grammar has 5 rules. The first says:
A Sentence can be a Noun followed by a Verb followed by a Noun.

Note: Whitespace between words doesn't matter.
Grammar is boring because the set of Sentences (8) is finite.
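
The count of 8 can be checked by generating every sentence: each of the two Nouns, then each of the two Verbs, then each of the two Nouns again. The following is a minimal sketch; the class name ToySentences and the method all() are made up for this illustration and are not part of the course code.

```java
import java.util.ArrayList;
import java.util.List;

/** Enumerates every sentence of the toy grammar Sentence ::= Noun Verb Noun.
    Hypothetical helper for illustration only. */
class ToySentences {
    static final String[] NOUNS = {"boys", "girls"};   // Noun ::= boys, Noun ::= girls
    static final String[] VERBS = {"like", "see"};     // Verb ::= like, Verb ::= see

    /** Return all sentences, one per choice of (Noun, Verb, Noun). */
    static List<String> all() {
        List<String> result = new ArrayList<>();
        for (String first : NOUNS)
            for (String verb : VERBS)
                for (String second : NOUNS)
                    result.add(first + " " + verb + " " + second);
        return result;
    }

    public static void main(String[] args) {
        List<String> sentences = all();
        System.out.println(sentences.size() + " sentences");
        for (String s : sentences) System.out.println(s);
    }
}
```

Because no rule refers back to itself, the three loops cover the whole language: 2 * 2 * 2 = 8 sentences.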

[Figure: the forms T+T+T and T+T-T-T+T, and the example inputs 2 $ and 2 + 3 $.]

Trees
[Figure: tree for 2 + 3 $ with nodes Expression, E, T and F, and with labels marking the root of the tree and a leaf of the tree.]
The node labeled E is the parent of the nodes labeled T and +. These nodes are the children of node E. Nodes labeled T and + are siblings.

Grammar gives precedence to * over +
Expression ::= E $
E ::= T { <+ | -> T }
T ::= F { <* | /> F }
F ::= Integer
F ::= -F
F ::= ( E )
[Figure: trees over the inputs 2 + 3 * 4 and ( 2 + 3 ) * 4.]
After doing the plus first, the tree cannot be completed.
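
The tree vocabulary (root, leaf, parent, children, siblings) can be made concrete with a small data structure. The sketch below builds a tree for 2 + 3 $ by hand, following the grammar rules; the class TreeNode and its methods are assumptions made for this illustration, not the course's implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** A node of a tree: a label plus an ordered list of children.
    Hypothetical class for illustration only. */
class TreeNode {
    final String label;
    final List<TreeNode> children = new ArrayList<>();

    TreeNode(String label, TreeNode... kids) {
        this.label = label;
        for (TreeNode k : kids) children.add(k);
    }

    /** A leaf is a node with no children. */
    boolean isLeaf() { return children.isEmpty(); }

    /** Append the labels of the leaves below this node, left to right. */
    void collectLeaves(List<String> out) {
        if (isLeaf()) { out.add(label); return; }
        for (TreeNode c : children) c.collectLeaves(out);
    }

    /** Build the tree for "2 + 3 $" by hand. */
    static TreeNode example() {
        TreeNode left  = new TreeNode("T", new TreeNode("F", new TreeNode("2")));
        TreeNode right = new TreeNode("T", new TreeNode("F", new TreeNode("3")));
        // E is the parent of T, + and T; those three nodes are siblings.
        TreeNode e = new TreeNode("E", left, new TreeNode("+"), right);
        // Expression is the root; its children are E and $.
        return new TreeNode("Expression", e, new TreeNode("$"));
    }

    public static void main(String[] args) {
        List<String> leaves = new ArrayList<>();
        example().collectLeaves(leaves);
        System.out.println(String.join(" ", leaves));
    }
}
```

Reading the leaves from left to right gives back the input, which is one way to check that a tree really belongs to a given sentence.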

Writing a parser for the language
Expression ::= E $
E ::= T { <+ | -> T }
T ::= F { <* | /> F }
F ::= Integer
F ::= -F
F ::= ( E )

The scanner is the part of the program that reads in characters and produces tokens from them, deleting all whitespace.
[Figure: the input 22 + 35 * - 46 / 2 $ turned into a sequence of Token objects for 22, +, 35, ...]

Scan.getToken() always gives you the token being processed.
Scan.scan() deletes the token being processed, making the next one in the input the one being processed, and returns the new one.
[Figure: the input 22 + 35 * - 46 / 2 $ shown with progressively fewer leading tokens: + 35 * - 46 / 2 $, then 35 * - 46 / 2 $, then * - 46 / 2 $, then - 46 / 2 $.]
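
To make the two scanner operations concrete, here is a minimal sketch of a scanner with the same getToken()/scan() behaviour. It is not the course's Scan class: the name MiniScan, the init method, and the use of plain Strings instead of Token objects are all assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the course's Scan class; tokens are plain Strings. */
class MiniScan {
    private static List<String> tokens = new ArrayList<>();
    private static int pos = 0;

    /** Split input into tokens: a run of digits is one token, any other
        non-blank character is a token by itself; whitespace is dropped. */
    static void init(String input) {
        tokens = new ArrayList<>();
        pos = 0;
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) { i++; continue; }
            int start = i;
            if (Character.isDigit(c)) {
                while (i < input.length() && Character.isDigit(input.charAt(i))) i++;
            } else {
                i++;
            }
            tokens.add(input.substring(start, i));
        }
    }

    /** The token being processed, or null if the input is used up. */
    static String getToken() {
        return pos < tokens.size() ? tokens.get(pos) : null;
    }

    /** Discard the token being processed; the next one becomes the token
        being processed and is returned. */
    static String scan() {
        if (pos < tokens.size()) pos++;
        return getToken();
    }

    public static void main(String[] args) {
        init("22 + 35 * - 46 / 2 $");
        for (String t = getToken(); t != null; t = scan()) System.out.println(t);
    }
}
```

Calling getToken() repeatedly does not move through the input; only scan() does.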

For E, T, F, write a method with this spec (we show only E, for the rule E ::= T { <+ | -> T }):

/** Token Scan.getToken() is the first token of a sentence for E.
    Parse it, giving error messages if there are mistakes. After the parse,
    Scan.getToken() should be the symbol following the parsed E. */
public static void parseE() {
    parseT();
    // The loop condition is pseudocode: the token being processed is + or -.
    while (Scan.getToken() is + or - ) {
        Scan.scan();
        parseT();
    }
}

[Figure: the input 2 + ( 3 + 4 * 5 ) + 6 with a marker at Scan.getToken(), and the same input showing the situation after the call parseE().]
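
The same pattern extends to T and F: one method per nonterminal, each leaving the token after the parsed phrase as the token being processed. The sketch below is self-contained and is not the program on the course website. Its assumptions: the class name MiniParser, String tokens, and methods that return the value of the phrase instead of being void (added so that the effect of precedence is visible). It also reads { ... } as "zero or more repetitions" and < a | b > as "a or b", which is an assumption about the notation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch of a recursive-descent parser for Expression ::= E $ that also
    evaluates what it parses. Hypothetical class for illustration only. */
class MiniParser {
    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    MiniParser(String input) {
        // A run of digits is one token; any other non-blank character is a token.
        Matcher m = Pattern.compile("\\d+|\\S").matcher(input);
        while (m.find()) tokens.add(m.group());
    }

    /** The token being processed ("" once the input is used up). */
    private String getToken() { return pos < tokens.size() ? tokens.get(pos) : ""; }

    /** Make the next token the one being processed. */
    private void scan() { pos++; }

    private static IllegalArgumentException error(String expected, String found) {
        return new IllegalArgumentException("expected " + expected + " but found '" + found + "'");
    }

    /** Expression ::= E $ */
    int parseExpression() {
        int value = parseE();
        if (!getToken().equals("$")) throw error("$", getToken());
        return value;
    }

    /** E ::= T { <+ | -> T } */
    int parseE() {
        int value = parseT();
        while (getToken().equals("+") || getToken().equals("-")) {
            String op = getToken();
            scan();
            int right = parseT();
            value = op.equals("+") ? value + right : value - right;
        }
        return value;
    }

    /** T ::= F { <* | /> F } */
    int parseT() {
        int value = parseF();
        while (getToken().equals("*") || getToken().equals("/")) {
            String op = getToken();
            scan();
            int right = parseF();
            value = op.equals("*") ? value * right : value / right;
        }
        return value;
    }

    /** F ::= Integer, F ::= -F, F ::= ( E ) */
    int parseF() {
        String t = getToken();
        if (t.equals("-")) { scan(); return -parseF(); }
        if (t.equals("(")) {
            scan();
            int value = parseE();
            if (!getToken().equals(")")) throw error(")", getToken());
            scan();
            return value;
        }
        if (!t.isEmpty() && Character.isDigit(t.charAt(0))) { scan(); return Integer.parseInt(t); }
        throw error("an integer, - or (", t);
    }

    public static void main(String[] args) {
        System.out.println(new MiniParser("2 + 3 * 4 $").parseExpression());
        System.out.println(new MiniParser("( 2 + 3 ) * 4 $").parseExpression());
    }
}
```

Because parseE calls parseT for each operand, a product such as 3 * 4 is consumed as a unit before the + is applied; that is how the grammar's precedence of * over + shows up in the code.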
We now use the blackboard. You should look at the final program,
which is on the course website. Download it and play with it. Parts
of it will be discussed in recitation this week.