Lecture 05 Parsing

The document discusses parsing arithmetic expressions using grammars. It provides: 1) A grammar for arithmetic expressions with rules defining Expressions (E) as Terms (T) separated by addition or subtraction, Terms as Factors (F) separated by multiplication or division, and Factors as integers, negative numbers, or expressions in parentheses. 2) Explanations of syntax trees, which represent the structure of expressions, and how grammars define precedence of operators like multiplication over addition. 3) An overview of how to write a parser for the language of arithmetic expressions, including using a scanner to tokenize input and handling operator precedence.

Uploaded by

Washington Brown

Parsing arithmetic expressions

Reading material: These notes and an implementation (see course web page).

The best way to prepare [to be a programmer] is to write programs, and to study great programs that other people have written. In my case, I went to the garbage cans at the Computer Science Center and fished out listings of their operating system. (Bill Gates)

First learn computer science and all the theory. Next develop a programming style. Then forget all that and just hack. (George Carrette)

Look at this website:

Grammar

A grammar: set of rules for generating sentences of a language.

Sentence ::= Noun Verb Noun
Noun ::= boys
Noun ::= girls
Verb ::= like
Verb ::= see

Example of Sentence:
boys see girls
girls like boys

This grammar has 5 rules. The first says:
A Sentence can be a Noun followed by a Verb followed by a Noun.
Note: Whitespace between words doesn’t matter.
Grammar is boring because the set of Sentences (8) is finite.
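
The count of 8 can be checked by generating every Sentence: 2 Nouns times 2 Verbs times 2 Nouns. The following is a small sketch of that enumeration; the class name SentenceEnumerator and its method are made up for illustration and are not part of the course code.

```java
import java.util.ArrayList;
import java.util.List;

/** Enumerates every Sentence of the grammar Sentence ::= Noun Verb Noun. */
class SentenceEnumerator {
    // The two Noun rules and the two Verb rules of the grammar.
    static final String[] NOUNS = {"boys", "girls"};
    static final String[] VERBS = {"like", "see"};

    /** Returns all sentences, one per choice of Noun, Verb, Noun. */
    static List<String> allSentences() {
        List<String> result = new ArrayList<>();
        for (String subject : NOUNS) {
            for (String verb : VERBS) {
                for (String object : NOUNS) {
                    result.add(subject + " " + verb + " " + object);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> sentences = allSentences();
        for (String s : sentences) {
            System.out.println(s);
        }
        System.out.println(sentences.size() + " sentences");
    }
}
```

Because no rule refers back to Sentence, the three nested loops are enough to list the whole language.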

Recursive grammar

Sentence ::= Sentence and Sentence
Sentence ::= Noun Verb Noun
Noun ::= boys
Noun ::= girls
Verb ::= like
Verb ::= see

Grammar has an infinite number of Sentences, because Sentence is defined recursively.

Example of Sentence:
boys see girls
girls like boys and boys see girls
girls like boys and boys see girls and boys see girls
girls like boys and boys see girls and boys like girls and boys like girls

Notations used in grammars

Notation used to make grammars easier to write:

{ … } stands for zero or more occurrences of …
Example: Noun phrase ::= { Adjective } Noun
Meaning: A Noun phrase is zero or more Adjectives followed by a Noun.

<b | c> stands for either a b or a c.
Example: Expression ::= Term <+ | –> Term
Meaning: An Expression is a Term followed by (either + or –) followed by a Term
Alternative: Expression is Term + Term or Term – Term
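
The { … } notation maps naturally onto a loop. As a sketch, the recursive Sentence grammar generates the same sentences as the rule Sentence ::= Noun Verb Noun { and Noun Verb Noun }. This rewriting is ours, not from the slides, and the class name SentenceRecognizer is made up. A recognizer for it needs one loop:

```java
/** Recognizer for: Sentence ::= Noun Verb Noun { and Noun Verb Noun } */
class SentenceRecognizer {
    static boolean isNoun(String w) { return w.equals("boys") || w.equals("girls"); }
    static boolean isVerb(String w) { return w.equals("like") || w.equals("see"); }

    /** Returns true if text is a Sentence of the recursive grammar. */
    static boolean isSentence(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) return false;
        String[] words = trimmed.split("\\s+");  // whitespace between words doesn't matter
        int i = 0;
        while (true) {
            // One "Noun Verb Noun" group must start at position i.
            if (i + 3 > words.length) return false;
            if (!isNoun(words[i]) || !isVerb(words[i + 1]) || !isNoun(words[i + 2])) return false;
            i += 3;
            if (i == words.length) return true;        // zero more occurrences: done
            if (!words[i].equals("and")) return false;  // otherwise "and" must introduce the next group
            i++;
        }
    }

    public static void main(String[] args) {
        System.out.println(isSentence("girls like boys and boys see girls"));
        System.out.println(isSentence("boys see"));
    }
}
```

The loop body corresponds to one occurrence of the part in braces; the grammar places no bound on the number of iterations, which is why the set of Sentences is infinite.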

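Grammar for arithmetic expressions

Expression ::= E $      -- $ marks the end of the Expression
E ::= T { <+ | –> T }
T ::= F { <* | /> F }
F ::= Integer
F ::= – F
F ::= ( E )

An E is a T followed by any number of things of the form <+ | –> T

Here are four Es:
T
T+T
T+T+T
T+T-T-T+T

Syntax trees

Expression ::= E $        F ::= Integer
E ::= T { <+ | –> T }     F ::= –F
T ::= F { <* | /> F }     F ::= ( E )

[Figure: syntax trees for the sentences 2 $ and 2 + 3 $. In both, Expression has the children E and $. For 2 $, E has one child T, T has child F, and F has child 2. For 2 + 3 $, E has the children T, + and T; each T has a child F, and the Fs have the children 2 and 3.]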
Trees

[Figure: syntax tree for 2 + 3 $. The Expression node at the top is marked "root of tree"; a node at the bottom is marked "leaf of tree".]

The node labeled E is the parent of the nodes labeled T and +. These nodes are the children of node E. Nodes labeled T and + are siblings.

Grammar gives precedence to * over +

Expression ::= E $        F ::= Integer
E ::= T { <+ | –> T }     F ::= –F
T ::= F { <* | /> F }     F ::= ( E )

[Figure: two attempts at a syntax tree for 2 + 3 * 4. The attempt that handles the + first is annotated: "After doing the plus first, the tree cannot be completed."]
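
The tree vocabulary above (root, parent, children, siblings, leaf) can be made concrete with a small node class. This is only a sketch: the class SyntaxNode and all of its members are hypothetical names, not taken from the course implementation. It builds the tree for 2 + 3 $ by hand.

```java
import java.util.ArrayList;
import java.util.List;

/** A node of a syntax tree: a label plus an ordered list of children. */
class SyntaxNode {
    final String label;
    SyntaxNode parent;  // null for the root
    final List<SyntaxNode> children = new ArrayList<>();

    SyntaxNode(String label, SyntaxNode... kids) {
        this.label = label;
        for (SyntaxNode kid : kids) {
            kid.parent = this;
            children.add(kid);
        }
    }

    boolean isRoot() { return parent == null; }
    boolean isLeaf() { return children.isEmpty(); }

    /** Two different nodes are siblings if they have the same parent. */
    boolean isSiblingOf(SyntaxNode other) {
        return this != other && parent != null && parent == other.parent;
    }

    /** Leaf labels from left to right: the sentence that the tree derives. */
    String leaves() {
        if (isLeaf()) return label;
        StringBuilder sb = new StringBuilder();
        for (SyntaxNode child : children) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(child.leaves());
        }
        return sb.toString();
    }

    /** Builds the syntax tree for the sentence 2 + 3 $. */
    static SyntaxNode twoPlusThree() {
        SyntaxNode left = new SyntaxNode("T", new SyntaxNode("F", new SyntaxNode("2")));
        SyntaxNode plus = new SyntaxNode("+");
        SyntaxNode right = new SyntaxNode("T", new SyntaxNode("F", new SyntaxNode("3")));
        SyntaxNode e = new SyntaxNode("E", left, plus, right);
        return new SyntaxNode("Expression", e, new SyntaxNode("$"));
    }
}
```

Reading the leaves of twoPlusThree() from left to right gives back the sentence 2 + 3 $.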

Grammar gives precedence to * over +

Expression ::= E $        F ::= Integer
E ::= T { <+ | –> T }     F ::= –F
T ::= F { <* | /> F }     F ::= ( E )

Mutual recursion. Defined:
E in terms of T,
T in terms of F,
F in terms of E.

[Figure: syntax trees for 2 + 3 * 4 and ( 2 + 3 ) * 4.]

Writing a parser for the language

Expression ::= E $        F ::= Integer
E ::= T { <+ | –> T }     F ::= –F
T ::= F { <* | /> F }     F ::= ( E )

A parser for a language is a program that reads in a string of characters and tells whether it is a sentence of the language or not.
In addition, it might construct a syntax tree for the sentence.

We will write a parser for the language of expressions that appears above.
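
The precedence built into the grammar can be made visible by following the three rules with three mutually recursive methods and printing the grouping they find. The sketch below is not the course parser: the class Grouping and its methods are hypothetical, it works directly on characters, it expects the ASCII characters + - * / for the operators, and the end marker $ is left out of the input.

```java
/** Returns a fully parenthesized form of an expression, following the grammar's rules. */
class Grouping {
    private final String src;  // input with all whitespace removed
    private int pos = 0;       // index of the next unread character

    private Grouping(String text) { src = text.replaceAll("\\s+", ""); }

    static String group(String text) {
        Grouping g = new Grouping(text);
        String result = g.e();
        if (g.pos != g.src.length()) throw new IllegalArgumentException("unexpected input at " + g.pos);
        return result;
    }

    /** Next character, or '$' as an end marker once the input is used up. */
    private char peek() { return pos < src.length() ? src.charAt(pos) : '$'; }

    // E ::= T { <+ | -> T }
    private String e() {
        String left = t();
        while (peek() == '+' || peek() == '-') {
            char op = src.charAt(pos++);
            left = "(" + left + " " + op + " " + t() + ")";
        }
        return left;
    }

    // T ::= F { <* | /> F }
    private String t() {
        String left = f();
        while (peek() == '*' || peek() == '/') {
            char op = src.charAt(pos++);
            left = "(" + left + " " + op + " " + f() + ")";
        }
        return left;
    }

    // F ::= Integer | - F | ( E )
    private String f() {
        char c = peek();
        if (c == '-') { pos++; return "-" + f(); }
        if (c == '(') {
            pos++;
            String inner = e();
            if (peek() != ')') throw new IllegalArgumentException("expected ) at " + pos);
            pos++;
            return inner;
        }
        int start = pos;
        while (Character.isDigit(peek())) pos++;
        if (start == pos) throw new IllegalArgumentException("expected integer at " + pos);
        return src.substring(start, pos);
    }
}
```

Grouping.group("2 + 3 * 4") returns "(2 + (3 * 4))", while Grouping.group("( 2 + 3 ) * 4") returns "((2 + 3) * 4)". The * binds tighter because an E can only combine whole Ts, and 3 * 4 is a single T.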

Writing a parser for the language

Expression ::= E $        F ::= Integer
E ::= T { <+ | –> T }     F ::= –F
T ::= F { <* | /> F }     F ::= ( E )

The scanner is the part of the program that reads in characters and produces tokens from them, deleting all whitespace.

[Figure: the input 22 + 35 * – 46 / 2 $ is turned into a sequence of Token objects: 22, +, 35, …]

Writing a parser for the language

Expression ::= E $        F ::= Integer
E ::= T { <+ | –> T }     F ::= –F
T ::= F { <* | /> F }     F ::= ( E )

Scan.getToken() always gives you the token being processed.
Scan.scan() deletes the token being processed, making the next one in the input the one being processed, and returns the new one.

[Figure: the remaining input after successive calls of Scan.scan():
22 + 35 * – 46 / 2 $
+ 35 * – 46 / 2 $
35 * – 46 / 2 $
* – 46 / 2 $
– 46 / 2 $]
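
A minimal sketch of such a scanner follows. It is not the scanner from the course website: tokens are plain Strings rather than Token objects, the method init is a made-up way to hand the scanner its input, and every character that is neither a digit nor whitespace is treated as a one-character token. Only getToken() and scan() follow the behavior described above.

```java
import java.util.ArrayList;
import java.util.List;

/** Turns a string of characters into tokens, deleting all whitespace. */
class Scan {
    private static List<String> tokens = new ArrayList<>();
    private static int current = 0;  // index of the token being processed

    static { init(""); }  // start with just the end marker

    /** Hypothetical setup method: tokenize input and make its first token current. */
    static void init(String input) {
        tokens = new ArrayList<>();
        current = 0;
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;  // whitespace is deleted
            } else if (Character.isDigit(c)) {
                int start = i;  // an integer is a maximal run of digits
                while (i < input.length() && Character.isDigit(input.charAt(i))) i++;
                tokens.add(input.substring(start, i));
            } else {
                tokens.add(String.valueOf(c));  // operator, parenthesis or $
                i++;
            }
        }
        // Make sure the token sequence ends with the end marker $.
        if (tokens.isEmpty() || !tokens.get(tokens.size() - 1).equals("$")) tokens.add("$");
    }

    /** The token being processed. */
    static String getToken() { return tokens.get(current); }

    /** Move on to the next token and return it; the end marker $ is never passed. */
    static String scan() {
        if (current < tokens.size() - 1) current++;
        return getToken();
    }
}
```

After Scan.init("22 + 35 * - 46 / 2 $"), getToken() is "22", and successive calls of scan() return "+", "35", "*", "-", "46", "/", "2" and "$".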
Writing a parser for the language

Expression ::= E $        F ::= Integer
E ::= T { <+ | –> T }     F ::= –F
T ::= F { <* | /> F }     F ::= ( E )

For E, T, F, write a method with this spec (we show only E):

/** Token Scan.getToken() is first token of a sentence for E.
    Parse it, giving an error message if there are mistakes. After the parse,
    Scan.getToken() should be the symbol following the parsed E. */
public static void parseE()

[Figure: the input 2 + ( 3 + 4 * 5 ) + 6 with Scan.getToken() marking the current token.]
After the call of parseE(), the situation is this:
[Figure: the same input, with Scan.getToken() now marking the symbol that follows the parsed E.]

Writing a parser for the language

E ::= T { <+ | –> T }

/** Token Scan.getToken() is first token of a sentence for E.
    Parse it, giving an error message if there are mistakes. After the parse,
    Scan.getToken() should be the symbol following the parsed E. */
public static void parseE() {
    // Every E starts with a T.
    parseT();
    // Pseudocode condition: repeat while the current token is + or –.
    while (Scan.getToken() is + or - ) {
        Scan.scan();   // move past the operator
        parseT();      // the operator must be followed by another T
    }
}
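
The same pattern extends to T and F. The sketch below is one possible way to fill in all three methods; it is not the final program from the course website. To stay self-contained it makes several assumptions: the class name ExprParser is made up, tokens are Strings that are separated by whitespace in the input, the minus sign is the ASCII character -, and getToken() and scan() are private stand-ins for Scan.getToken() and Scan.scan(). Mistakes are reported by throwing an exception that carries the error message.

```java
/** Recognizer for Expression ::= E $ with one parse method per nonterminal. */
class ExprParser {
    private final String[] tokens;  // whitespace-separated tokens, ending in "$"
    private int current = 0;        // index of the token being processed

    private ExprParser(String input) {
        String text = input.trim();
        if (!text.endsWith("$")) text = (text + " $").trim();  // add a missing end marker
        tokens = text.split("\\s+");
    }

    private String getToken() { return tokens[current]; }
    private void scan() { if (current < tokens.length - 1) current++; }

    /** Returns true if input is a sentence for Expression. */
    static boolean isExpression(String input) {
        ExprParser p = new ExprParser(input);
        try {
            p.parseE();
            // After the E, only the final end marker may remain.
            return p.getToken().equals("$") && p.current == p.tokens.length - 1;
        } catch (IllegalArgumentException ex) {
            return false;
        }
    }

    // E ::= T { <+ | -> T }
    private void parseE() {
        parseT();
        while (getToken().equals("+") || getToken().equals("-")) {
            scan();
            parseT();
        }
    }

    // T ::= F { <* | /> F }
    private void parseT() {
        parseF();
        while (getToken().equals("*") || getToken().equals("/")) {
            scan();
            parseF();
        }
    }

    // F ::= Integer | - F | ( E )
    private void parseF() {
        String tok = getToken();
        if (tok.equals("-")) {
            scan();
            parseF();
        } else if (tok.equals("(")) {
            scan();
            parseE();
            if (!getToken().equals(")")) throw new IllegalArgumentException("expected ) but found " + getToken());
            scan();
        } else if (tok.matches("\\d+")) {
            scan();
        } else {
            throw new IllegalArgumentException("expected integer, - or ( but found " + tok);
        }
    }
}
```

Each method leaves the current token at the symbol following what it parsed, as the spec for parseE requires. For example, ExprParser.isExpression("2 + ( 3 + 4 * 5 ) + 6 $") is true, and ExprParser.isExpression("2 + * 3 $") is false.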

Writing a parser for the language


Expression ::= E $ F ::= Integer
E ::= T { <+ | –> T } F ::= –F
T ::= F { <* | /> F } F ::= ( E )

We now use the blackboard. You should look at the final program,
which is on the course website. Download it and play with it. Parts
of it will be discussed in recitation this week.
