Compilers Lecture 5

* has higher precedence than +. + is left-associative. - is left-associative. * and / are left-associative.

Uploaded by

Fatma Sakr
Copyright: © All Rights Reserved

Compilers Design

Introduction to parsing

Parsing Process
Call the scanner to get tokens
Build a parse tree from the stream of tokens
A parse tree shows the syntactic structure of the source
program.
Add information about identifiers in the symbol table
Report errors when they are found, and recover from them

Context-free grammars
Regular languages are the weakest formal languages
widely used. They cannot describe nested (recursive)
structure, so they aren't used to specify the syntax of a
programming language.
The rules used to specify the syntax of a programming
language can be formalized using the concept of a
Context-free Grammar (CFG).
Programming languages have recursive structure.
Context-free grammars are a natural notation for this
recursive structure.
Formal Definition of a CFG
G = (V,T,P,S)
V: finite set of nonterminal symbols
T: finite set of terminal symbols, V and T are disjoint
P: finite set of productions of the form
A → α, where A ∈ V and α ∈ (T ∪ V)*
S ∈ V: start symbol
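The four components of the definition map directly onto a small data structure. The sketch below is only an illustration: the class name CFG, its field names, and the validate method are assumptions made here, not part of the lecture. It checks the side conditions stated above (V and T disjoint, S in V, every head in V, every body over T ∪ V).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    """Hypothetical container for G = (V, T, P, S)."""
    nonterminals: frozenset  # V
    terminals: frozenset     # T
    productions: tuple       # P: pairs (head, body), body is a tuple of symbols
    start: str               # S

    def validate(self) -> bool:
        """Check the side conditions of the formal definition."""
        symbols = self.nonterminals | self.terminals
        return (
            self.nonterminals.isdisjoint(self.terminals)
            and self.start in self.nonterminals
            and all(
                head in self.nonterminals and all(s in symbols for s in body)
                for head, body in self.productions
            )
        )


# Example grammar (chosen for illustration): S -> 0 1 | 0 S 1
g = CFG(
    nonterminals=frozenset({"S"}),
    terminals=frozenset({"0", "1"}),
    productions=(("S", ("0", "1")), ("S", ("0", "S", "1"))),
    start="S",
)
print(g.validate())
```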
Context in CFGs
The notion of context in CFGs has nothing to do with
the ordinary meaning of the word context in language.
All it really means is that the nonterminal on the left-hand
side of a rule stands all by itself (free of context):
A → B C
I.e., an A can be rewritten as a B followed by a C regardless
of the context in which A is found.
CFG Formalism
Terminals = symbols of the alphabet of the language
being defined.
Variables = nonterminals = a finite set of other
symbols, each of which represents a language.
Start symbol = the variable whose language is the one
being defined.

Productions
A production has the form variable (head) -> string of
variables and terminals (body).
Convention:
A, B, C,… and also S are variables.
a, b, c,… are terminals.
…, X, Y, Z are either terminals or variables.
…, w, x, y, z are strings of terminals only.
α, β, γ,… are strings of terminals and/or variables.

CFG Example
Here is a formal CFG for { 0^n 1^n | n ≥ 1 }.
Terminals = {0, 1}.
Variables = {S}.
Start symbol = S.
Productions =
S -> 01
S -> 0S1
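The two productions can be read as a recursive membership test, and applying them in order generates the strings of the language. The following is a minimal sketch; the function names derives and generate are made up for this illustration.

```python
def derives(s: str) -> bool:
    """Return True if S =>* s using S -> 01 | 0S1."""
    if s == "01":                       # production S -> 01
        return True
    if len(s) >= 4 and s[0] == "0" and s[-1] == "1":
        return derives(s[1:-1])         # production S -> 0S1
    return False


def generate(n: int) -> str:
    """Apply S -> 0S1 (n - 1) times, then S -> 01, for n >= 1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    form = "S"
    for _ in range(n - 1):
        form = form.replace("S", "0S1")
    return form.replace("S", "01")


print(generate(3))         # 000111
print(derives("000111"))   # True
print(derives("0101"))     # False
```

Every string that generate returns is accepted by derives, and strings such as the empty string or 0101 are rejected because no sequence of productions yields them.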

CFG Example
Many possible CFGs for English, here is an example
(fragment):
S → NP VP
VP → V NP
NP → DetP NP | AdjP NP
AdjP → Adj | Adv AdjP
NP → N
N → boy | girl
V → sees | likes
Adj → big | small
Adv → very
DetP → a | the
Derivations in a CFG
Grammar:
S → NP VP
VP → V NP
NP → DetP N | AdjP NP
AdjP → Adj | Adv AdjP
N → boy | girl
V → sees | likes
Adj → big | small
Adv → very
DetP → a | the

Derivation of "the boy likes a girl" (⇒* marks several steps shown as one):
S
⇒ NP VP
⇒ DetP N VP
⇒* the boy VP
⇒* the boy likes NP
⇒* the boy likes a girl
Derivations of CFGs
String rewriting system: we derive a string (= derived
structure).
But the derivation history is represented by a phrase-structure
tree (= derivation structure):

S
  NP
    DetP: the
    N: boy
  VP
    V: likes
    NP
      DetP: a
      N: girl

Derived string: the boy likes a girl
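The difference between the derived string and the derivation tree can be made concrete: the tree keeps the structure, and reading its leaves from left to right gives back the string. This is a small sketch that encodes the tree above as nested tuples; the encoding and the function name leaves are choices made here for illustration.

```python
# Phrase-structure tree for "the boy likes a girl":
# each node is (label, child, child, ...); a plain string is a word (leaf).
tree = (
    "S",
    ("NP", ("DetP", "the"), ("N", "boy")),
    ("VP", ("V", "likes"),
           ("NP", ("DetP", "a"), ("N", "girl"))),
)


def leaves(node):
    """Collect the words at the leaves, left to right."""
    if isinstance(node, str):
        return [node]
    _label, *children = node
    words = []
    for child in children:
        words.extend(leaves(child))
    return words


sentence = " ".join(leaves(tree))
print(sentence)   # the boy likes a girl
```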
Grammar: Example
List of parameters in:
Function definition
function sub(a,b,c)
<Fdef> → function id ( <argList> )
<argList> → id , <argList> | id
Function call
sub(a,1,2)
<Fcall> → id ( <parList> )
<parList> → <par> , <parList> | <par>
<par> → id | const

With the right-recursive rule:
<argList>
⇒ id , <argList>
⇒ id , id , <argList>
⇒ … ⇒ (id ,)* id

The same lists with left-recursive rules:
<Fdef> → function id ( <argList> )
<argList> → <argList> , id | id
<Fcall> → id ( <parList> )
<parList> → <parList> , <par> | <par>
<par> → id | const

<argList>
⇒ <argList> , id
⇒ <argList> , id , id
⇒ … ⇒ id (, id)*
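Both versions of the argument-list rule describe the same strings, id (, id)*, so a hand-written recognizer can use a simple loop. The sketch below assumes the scanner has already split the input into a list of token strings and treats any Python identifier as an id; the function name parse_arg_list is hypothetical.

```python
def parse_arg_list(tokens):
    """Recognise <argList> = id (, id)* and return the identifiers."""
    if not tokens or not tokens[0].isidentifier():
        raise SyntaxError("expected id")
    args = [tokens[0]]
    i = 1
    while i < len(tokens):
        # Each further argument must be a comma followed by an id.
        if (tokens[i] != "," or i + 1 >= len(tokens)
                or not tokens[i + 1].isidentifier()):
            raise SyntaxError("expected ', id'")
        args.append(tokens[i + 1])
        i += 2
    return args


print(parse_arg_list(["a", ",", "b", ",", "c"]))   # ['a', 'b', 'c']
```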
Leftmost Derivation
Each step of the derivation is a replacement of the
leftmost nonterminal in a sentential form.
E
⇒ E O E
⇒ (E) O E
⇒ (E O E) O E
⇒ (id O E) O E
⇒ (id + E) O E
⇒ (id + id) O E
⇒ (id + id) * E
⇒ (id + id) * id

Rightmost Derivation
Each step of the derivation is a replacement of the
rightmost nonterminal in a sentential form.
E
⇒ E O E
⇒ E O id
⇒ E * id
⇒ (E) * id
⇒ (E O E) * id
⇒ (E O id) * id
⇒ (E + id) * id
⇒ (id + id) * id
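The two derivations of (id + id) * id can be replayed mechanically: at each step, find the leftmost (or rightmost) nonterminal and replace it with the chosen body. The sketch below assumes the nonterminals are E and O, as the sentential forms suggest; the function name derive and the step encoding are illustrative only.

```python
NONTERMINALS = {"E", "O"}


def derive(steps, leftmost=True):
    """Replay a derivation from E; each step is (head, body)."""
    form = ["E"]
    history = [" ".join(form)]
    for head, body in steps:
        positions = [i for i, s in enumerate(form) if s in NONTERMINALS]
        i = positions[0] if leftmost else positions[-1]
        if form[i] != head:
            raise ValueError(f"expected to rewrite {form[i]}, not {head}")
        form[i:i + 1] = body            # replace the nonterminal by the body
        history.append(" ".join(form))
    return history


EOE, PAREN, ID = ["E", "O", "E"], ["(", "E", ")"], ["id"]

left_steps = [("E", EOE), ("E", PAREN), ("E", EOE), ("E", ID),
              ("O", ["+"]), ("E", ID), ("O", ["*"]), ("E", ID)]
right_steps = [("E", EOE), ("E", ID), ("O", ["*"]), ("E", PAREN),
               ("E", EOE), ("E", ID), ("O", ["+"]), ("E", ID)]

left = derive(left_steps, leftmost=True)
right = derive(right_steps, leftmost=False)
print(left[-1])    # ( id + id ) * id
print(right[-1])   # ( id + id ) * id
```

Both histories end in the same string; only the order in which the nonterminals are replaced differs.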
Parse Tree
A labeled tree, corresponding to a derivation, in which
the interior nodes are labeled by nonterminals
leaf nodes are labeled by terminals
the children of an interior node represent a replacement
of the associated nonterminal in the derivation

[Figure: example parse tree for an expression over id, + and *]
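Abstract Syntax Tree for Expression
[Figure: parse tree for id + id * id (root E with children E + E, where the left E derives id and the right E derives E * E) next to its abstract syntax tree (+ with children id and *, where * has children id and id)]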

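Abstract Syntax Tree for If Statement
[Figure: parse tree with root ifStatement and children if ( exp ) st elsePart, where exp derives true, st derives return, and elsePart derives else st with st deriving return; next to it the abstract syntax tree: if with children true, return, return]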

Ambiguous Grammar
A grammar is ambiguous if it can generate two or more
different parse trees for one string.
Ambiguous grammars can cause inconsistency in
parsing: the same string can be given different structures.

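Example: Ambiguous Grammar
Given the following CFG:
E → E + E
E → E - E
E → E * E
E → E / E
E → id
The string id + id * id has two parse trees.

Tree 1 (+ at the root), grouping id + (id * id):
E
  E: id
  +
  E
    E: id
    *
    E: id

Tree 2 (* at the root), grouping (id + id) * id:
E
  E
    E: id
    +
    E: id
  *
  E: id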
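One way to see the ambiguity is to count parse trees. For this particular grammar every tree for a string longer than one token has some operator at its root, so the number of trees can be computed by trying each operator as the root and multiplying the counts of the two sides. This is a sketch under that observation; the function name count_parse_trees is made up here.

```python
from functools import lru_cache

OPERATORS = {"+", "-", "*", "/"}


def count_parse_trees(tokens):
    """Count parse trees for E -> E+E | E-E | E*E | E/E | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        """Number of ways E derives tokens[i:j]."""
        if j - i == 1:
            return 1 if tokens[i] == "id" else 0
        total = 0
        for k in range(i + 1, j - 1):       # k = position of the root operator
            if tokens[k] in OPERATORS:
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_parse_trees("id + id * id".split()))   # 2
```

A count greater than 1 for any string shows that the grammar is ambiguous.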
Ambiguity in Expressions
Which operation is to be done first?
solved by precedence
- An operator with higher precedence is done before one with
  lower precedence.
- An operator with higher precedence is placed in a rule
  (logically) further from the start symbol.
solved by associativity
- If an operator is right-associative (or left-associative), an
  operand in between two operators is associated to the operator
  to the right (left).
- Right-associated : W + (X + (Y + Z))
- Left-associated : ((W + X) + Y) + Z

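Precedence
Ambiguous grammar:
E → E + E
E → E - E
E → E * E
E → E / E
E → id
[Figure: the two parse trees for id + id * id under this grammar, one with + at the root and one with * at the root]

Grammar with precedence (* and / moved into a rule further from the start symbol):
E → E + E
E → E - E
E → F
F → F * F
F → F / F
F → id
[Figure: parse tree for id + id * id under this grammar: E → E + E, where the left E derives F → id and the right E derives F → F * F with each F deriving id]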
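Precedence (cont’d)
E → E + E | E - E | F
F → F * F | F / F | X
X → ( E ) | id

Example: (id + id) * id * id
[Figure: parse tree for (id + id) * id * id: E → F → F * F; the left F derives X → ( E ) with E → E + E and each inner E deriving F → X → id; the right F derives F * F with each F deriving X → id]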

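Associativity
Left-associative operators
E → E + F | E - F | F
F → F * X | F / X | X
X → ( E ) | id

(id + id) * id / id
= (((id + id) * id) / id)
[Figure: parse tree for (id + id) * id / id: E → F → F / X; the inner F derives F * X; the innermost F derives X → ( E ) with E → E + F; each remaining X derives id]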
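A hand-written recursive-descent parser can follow this grammar closely. Because the rules for E and F are left-recursive, the sketch below replaces the left recursion with loops, which is a common technique and not something the slides prescribe; the loop builds the tree to the left, so the result is still left-associative. The function names and the tuple encoding (operator, left, right) are assumptions made for this illustration.

```python
def parse(tokens):
    """Parse E -> E+F | E-F | F, F -> F*X | F/X | X, X -> (E) | id."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def parse_E():
        node = parse_F()
        while peek() in ("+", "-"):        # lowest precedence, left-assoc
            op = advance()
            node = (op, node, parse_F())
        return node

    def parse_F():
        node = parse_X()
        while peek() in ("*", "/"):        # higher precedence, left-assoc
            op = advance()
            node = (op, node, parse_X())
        return node

    def parse_X():
        if peek() == "id":
            advance()
            return "id"
        if peek() == "(":
            advance()
            node = parse_E()
            if peek() != ")":
                raise SyntaxError("expected )")
            advance()
            return node
        raise SyntaxError("expected id or (")

    tree = parse_E()
    if pos != len(tokens):
        raise SyntaxError("unexpected token after expression")
    return tree


print(parse("( id + id ) * id / id".split()))
# ('/', ('*', ('+', 'id', 'id'), 'id'), 'id')
```

The nesting of the result mirrors (((id + id) * id) / id): the division is at the root and the multiplication is its left operand.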
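Ambiguity in Dangling Else
St → IfSt | ...
IfSt → if ( E ) St | if ( E ) St else St
E → 0 | 1 | …

The statement if (0) if (1) St else St can be read in two ways:
{ if (0)
    { if (1) St }
  else St }
{ if (0)
    { if (1) St else St } }
[Figure: the two parse trees. In one, the outer IfSt is if ( E ) St else St with E = 0 and the inner IfSt is if ( E ) St with E = 1. In the other, the outer IfSt is if ( E ) St with E = 0 and the inner IfSt is if ( E ) St else St with E = 1.]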

Disambiguating Rules for Dangling Else
St → MatchedSt | UnmatchedSt
UnmatchedSt → if (E) St |
              if (E) MatchedSt else UnmatchedSt
MatchedSt → if (E) MatchedSt else MatchedSt |
            ...
E → 0 | 1

if (0) if (1) St else St
= if (0)
    if (1) St else St
[Figure: parse tree: St → UnmatchedSt → if ( E ) St, where the inner St → MatchedSt → if ( E ) MatchedSt else MatchedSt]
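The effect of the matched/unmatched grammar is that each else belongs to the nearest if that does not yet have one. A recursive-descent parser gets the same result by taking an else as soon as it sees one. The sketch below is not a direct implementation of the grammar above; it assumes a pre-split token list, uses the literal token St for any other statement, and the function name parse_statement is hypothetical.

```python
def parse_statement(tokens, pos=0):
    """Parse one statement; return (tree, next_position).

    An if statement becomes ("if", condition, then_part, else_part),
    with else_part set to None when there is no else.
    """
    if pos < len(tokens) and tokens[pos] == "if":
        if tokens[pos + 1:pos + 2] != ["("] or tokens[pos + 3:pos + 4] != [")"]:
            raise SyntaxError("expected ( E )")
        cond = tokens[pos + 2]
        if cond not in ("0", "1"):
            raise SyntaxError("expected 0 or 1")
        then_part, pos = parse_statement(tokens, pos + 4)
        else_part = None
        # Greedy choice: an else directly after the then-part is
        # attached to this (the nearest) if.
        if pos < len(tokens) and tokens[pos] == "else":
            else_part, pos = parse_statement(tokens, pos + 1)
        return ("if", cond, then_part, else_part), pos
    if pos < len(tokens) and tokens[pos] == "St":
        return "St", pos + 1
    raise SyntaxError("expected a statement")


tree, end = parse_statement("if ( 0 ) if ( 1 ) St else St".split())
print(tree)   # ('if', '0', ('if', '1', 'St', 'St'), None)
```

The else ends up inside the inner if, and the outer if has no else part, which matches the reading if (0) { if (1) St else St }.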

Questions?
