CS 3723 - Programming Language: 1. Introductory Stuff
1. Introductory Stuff.
Question: Given a simple DFA or NFA, describe the language recognized by it.
[Figure: NFA]
[Figure: DFA]
case gotos are not available or not allowed (say, by an instructor in a course).
Simulating an NFA: As you process each input character, your simulating program
should keep track of the set of all possible states that you might be in. You start
out with the singleton set {start state}. For each input character, replace the set
with every state reachable on that character from some state in the current set. At
the end of the input string, if your set of states includes a terminal (accepting)
state, then you accept. Otherwise you reject. This same process is essentially the
"subset algorithm" that lets one take an NFA and construct a DFA that accepts the
same language as the NFA.
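The set-of-states simulation described above can be sketched in a few lines of Python. This is only an illustration, under the assumption that the NFA has no empty (epsilon) moves. The function name simulate_nfa and the sample machine (strings over {a, b} that end in "ab") are made up for this example and do not come from the course notes.

```python
def simulate_nfa(transitions, start, accepting, text):
    """Return True if the NFA accepts text (assumes no epsilon moves)."""
    current = {start}                       # the singleton set {start state}
    for ch in text:
        # Every state reachable on ch from some state in the current set.
        current = {nxt
                   for state in current
                   for nxt in transitions.get((state, ch), set())}
    # Accept if at least one possible state is a terminal (accepting) state.
    return bool(current & accepting)


# Hypothetical NFA: strings over {a, b} that end in "ab".
TRANSITIONS = {
    ("q0", "a"): {"q0", "q1"},   # the nondeterministic choice
    ("q0", "b"): {"q0"},
    ("q1", "b"): {"q2"},
}
ACCEPTING = {"q2"}


def ends_in_ab(text):
    return simulate_nfa(TRANSITIONS, "q0", ACCEPTING, text)
```

For instance, ends_in_ab("aab") is True and ends_in_ab("aba") is False. Each distinct set that current takes on would become one state of the DFA built by the subset algorithm.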
3. Lexical Analysis.
4. Context-free Grammars.
A context-free grammar (CFG) is a set of recursive replacement rules
(or rewriting rules, or productions, or just rules) that are used to generate
patterns of strings.
E ----> E + E
E ----> E * E
E ----> ( E )
E ----> a | b | c | ...
Sentence: (a+b)*c
E ===> E * E ===> E * c ===> ( E ) * c ===> ( E + E ) * c ===> ( E + b ) * c
  ===> ( a + b ) * c
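The derivation above always rewrites the rightmost E, so it can be replayed mechanically. The sketch below does so in Python. The helper names rewrite_rightmost and derive and the STEPS list are illustrative only. STEPS simply lists the right-hand side chosen at each step of the derivation.

```python
def rewrite_rightmost(form, nonterminal, replacement):
    """Replace the rightmost occurrence of nonterminal in a sentential form."""
    index = len(form) - 1 - form[::-1].index(nonterminal)
    return form[:index] + replacement + form[index + 1:]


# Right-hand side applied at each step of the derivation of ( a + b ) * c.
STEPS = [
    ["E", "*", "E"],
    ["c"],
    ["(", "E", ")"],
    ["E", "+", "E"],
    ["b"],
    ["a"],
]


def derive(steps, start="E"):
    """Return every sentential form, from the start symbol to the sentence."""
    forms = [[start]]
    for rhs in steps:
        forms.append(rewrite_rightmost(forms[-1], start, rhs))
    return [" ".join(form) for form in forms]
```

Calling derive(STEPS) yields seven sentential forms, beginning with "E" and ending with "( a + b ) * c".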
5. Ambiguous Grammars.
Question: Given a grammar, show that it is ambiguous. (Show the two distinct
parse trees)
Ambiguity: There are other sentences derived from E above that have more than
one parse tree, and corresponding left- and rightmost derivations.
For example, the very simple sentence a + b * c. The table looks at leftmost
derivations and parse trees:
1st Leftmost Der.        2nd Leftmost Der.
E ===> E + E             E ===> E * E
  ===> a + E               ===> E + E * E
  ===> a + E * E           ===> a + E * E
  ===> a + b * E           ===> a + b * E
  ===> a + b * c           ===> a + b * c
1st Parse Tree        2nd Parse Tree
      E                     E
     /|\                   /|\
    / | \                 / | \
   E  +  E               E  *  E
   |    /|\             /|\    |
   |   / | \           / | \   |
   a  E  *  E         E  +  E  c
      |     |         |     |
      b     c         a     b
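One way to check ambiguity by machine is to enumerate every parse tree of a sentence. The sketch below does this for the arithmetic-expression grammar E ----> E + E | E * E | ( E ) | a | b | c by trying every operator as the top-level split. The function name parse_trees and the tuple representation of trees are assumptions made for this illustration. This brute-force search is not how a real parser works.

```python
from functools import lru_cache


def parse_trees(tokens):
    """Return all parse trees of tokens under E -> E+E | E*E | (E) | a | b | c."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def trees(lo, hi):
        span = tokens[lo:hi]
        found = []
        if len(span) == 1 and span[0] in ("a", "b", "c"):
            found.append(span[0])                      # E -> a | b | c
        if len(span) >= 3 and span[0] == "(" and span[-1] == ")":
            for inner in trees(lo + 1, hi - 1):        # E -> ( E )
                found.append(("()", inner))
        for mid in range(lo + 1, hi - 1):              # E -> E + E | E * E
            if tokens[mid] in ("+", "*"):
                for left in trees(lo, mid):
                    for right in trees(mid + 1, hi):
                        found.append((tokens[mid], left, right))
        return tuple(found)

    return list(trees(0, len(tokens)))
```

Here parse_trees("a+b*c") returns two trees, ('+', 'a', ('*', 'b', 'c')) and ('*', ('+', 'a', 'b'), 'c'), which correspond to the 1st and 2nd parse trees above, while parse_trees("(a+b)*c") returns only one.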
Grammar: Arith. Exp.
E ----> E + E
E ----> E * E
E ----> ( E )
E ----> a | b | c
Even if most sentences have a unique parse tree, the grammar is called ambiguous
if at least one sentence has more than one parse tree. In a programming language it
is not acceptable to have more than one possible reading of a construct. We can't
flip a coin to decide which parse tree to use. There are several ways around this
problem:
1. Rewrite the grammar so that it is no longer ambiguous yet still accepts
exactly the same language. This is not always possible.
2. Introduce extra rules that allow the program to decide which of multiple
parse trees to use. These are called disambiguating rules. (Ah, yes,
"disambiguating", one of my favorite words.)
3. An ambiguous grammar may signal problems with language design, and the
programming language itself might be changed.
6. Unambiguous CF Grammars.
Question: Given a sentence, construct the leftmost derivation for it and the parse
tree (both unique).
E ----> E + E
E ----> E * E
E ----> ( E )
E ----> a | b | c | ...
Parse Tree:
7. Reverse Polish Notation.
What it is.
8. Shift-Reduce Parsers
Grammar: Arithmetic Expressions
| id | * | + | ( | ) | $ |
-----+-----+-----+-----+-----+-----+-----+
P | | | | | | acc | (s = "shift")
E | | | s | | s | r |
T | | s | r | | r | r | (r = "reduce")
F | | r | r | | r | r |
id | | r | r | | r | r | (acc = "accept")
* | s | | | s | | |
+ | s | | | s | | |
( | s | | | s | | |
) | | r | r | | r | r |
$ | s | | | s | | |
-----+-----+-----+-----+-----+-----+-----+
The table below shows the shift-reduce parse of the following sentence, showing the
stack, current symbol, remaining symbols, and next action to take at each stage. (This
sentence has the extra artificial symbol $ stuck in at the beginning and the end.)
Input Sentence
$ ( id + id ) * id $
Shift-Reduce Actions
Stack         Current  Remaining           Action
$             (        id + id ) * id $    shift
$ (           id       + id ) * id $       shift
$ ( id        +        id ) * id $         reduce: F ---> id
$ ( F         +        id ) * id $         reduce: T ---> F
$ ( T         +        id ) * id $         reduce: E ---> T
$ ( E         +        id ) * id $         shift
$ ( E +       id       ) * id $            shift
$ ( E + id    )        * id $              reduce: F ---> id
$ ( E + F     )        * id $              reduce: T ---> F
$ ( E + T     )        * id $              reduce: E ---> E + T
$ ( E         )        * id $              shift
$ ( E )       *        id $                reduce: F ---> ( E )
$ F           *        id $                reduce: T ---> F
$ T           *        id $                shift
$ T *         id       $                   shift
$ T * id      $                            reduce: F ---> id
$ T * F       $                            reduce: T ---> T * F
$ T           $                            reduce: E ---> T
$ E           $                            reduce: P ---> E
$ P           $                            accept
Notice that the sequence of reductions gives the following rightmost derivation in
reverse:
Rightmost Derivations
( id + id ) * id
P ===> E
===> T
===> T * F
===> T * id
===> F * id
===> ( E ) * id
===> ( E + T ) * id
===> ( E + F ) * id
===> ( E + id ) * id
===> ( T + id ) * id
===> ( F + id ) * id
===> ( id + id ) * id
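The table-driven parse can be sketched in Python as follows. The ACTION dictionary is transcribed from the table above (row = top of stack, column = current symbol). Two things are assumptions of this sketch and are not stated in the notes: the productions in RULES are inferred from the reductions in the trace, with P ---> E as the start production to match the P row of the table, and a "reduce" entry is resolved by taking the first rule in RULES (longest right-hand sides listed first) that matches the top of the stack.

```python
# ACTION[top of stack][current symbol]: "s" = shift, "r" = reduce, "acc" = accept.
ACTION = {
    "P":  {"$": "acc"},
    "E":  {"+": "s", ")": "s", "$": "r"},
    "T":  {"*": "s", "+": "r", ")": "r", "$": "r"},
    "F":  {"*": "r", "+": "r", ")": "r", "$": "r"},
    "id": {"*": "r", "+": "r", ")": "r", "$": "r"},
    "*":  {"id": "s", "(": "s"},
    "+":  {"id": "s", "(": "s"},
    "(":  {"id": "s", "(": "s"},
    ")":  {"*": "r", "+": "r", ")": "r", "$": "r"},
    "$":  {"id": "s", "(": "s"},
}

# Productions inferred from the trace (an assumption), longest right side first.
RULES = [
    ("E", ["E", "+", "T"]),
    ("T", ["T", "*", "F"]),
    ("F", ["(", "E", ")"]),
    ("F", ["id"]),
    ("T", ["F"]),
    ("E", ["T"]),
    ("P", ["E"]),
]


def shift_reduce(tokens):
    """Parse tokens (given without the $ markers); return the reductions used."""
    stack, rest, reductions = ["$"], list(tokens) + ["$"], []
    while True:
        action = ACTION[stack[-1]].get(rest[0])
        if action == "acc":
            return reductions
        if action == "s":
            stack.append(rest.pop(0))
        elif action == "r":
            for lhs, rhs in RULES:
                if stack[-len(rhs):] == rhs:        # handle found on top of stack
                    stack[-len(rhs):] = [lhs]
                    reductions.append(f"{lhs} ---> {' '.join(rhs)}")
                    break
            else:
                raise SyntaxError(f"no handle on stack {stack}")
        else:
            raise SyntaxError(f"unexpected {rest[0]!r} after {stack[-1]!r}")
```

Calling shift_reduce(["(", "id", "+", "id", ")", "*", "id"]) returns twelve reductions, starting with "F ---> id" and ending with "P ---> E". Read from last to first, they are the steps of the rightmost derivation listed above.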
9. Semantic Actions
The table below shows the shift-reduce parse of the same sentence, showing the
stack, current symbol, remaining symbols, and next action to take at each stage.
The semantic tags are shown in red below the id items and the stack items.
Shift-Reduce Actions Tag field below stack in red
10. Recursive-Descent Parsers
A recursive-descent parser is a top-down parser, so called because it builds a parse
tree from the top (the start symbol) down, and from left to right, using the input
sentence as a target as it is scanned from left to right. The actual tree is not
constructed but is implicit in the sequence of function calls.
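A minimal recursive-descent recognizer might look like the Python sketch below. It assumes the arithmetic-expression grammar rewritten without left recursion as E -> T { + T }, T -> F { * F }, F -> ( E ) | id, where { ... } means "zero or more times". That rewriting, the class name Parser, and the calls list are choices made for this illustration. The calls list records the order in which the functions are entered, which is the implicit parse tree read top-down and left to right.

```python
class Parser:
    """Recognizer for E -> T { + T },  T -> F { * F },  F -> ( E ) | id."""

    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]   # $ marks the end of the input
        self.pos = 0
        self.calls = []                      # order in which functions are entered

    def peek(self):
        return self.tokens[self.pos]

    def expect(self, symbol):
        if self.peek() != symbol:
            raise SyntaxError(f"expected {symbol!r}, found {self.peek()!r}")
        self.pos += 1

    def E(self):
        self.calls.append("E")
        self.T()
        while self.peek() == "+":
            self.expect("+")
            self.T()

    def T(self):
        self.calls.append("T")
        self.F()
        while self.peek() == "*":
            self.expect("*")
            self.F()

    def F(self):
        self.calls.append("F")
        if self.peek() == "(":
            self.expect("(")
            self.E()
            self.expect(")")
        else:
            self.expect("id")


def accepts(tokens):
    """Return True if tokens form a sentence of the grammar."""
    parser = Parser(tokens)
    try:
        parser.E()
        parser.expect("$")                   # all input must be consumed
    except SyntaxError:
        return False
    return True
```

For example, accepts(["(", "id", "+", "id", ")", "*", "id"]) is True, while accepts(["id", "+"]) is False.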