
CS 352 Compilers: Principles and Practice

1. Introduction
2. Lexical analysis
3. LL parsing
4. LR parsing
5. JavaCC and JTB
6. Semantic analysis
7. Translation and simplification
8. Liveness analysis and register allocation
9. Activation Records

Chapter 1: Introduction

Things to do

- make sure you have a working mentor account
- start brushing up on Java
- review Java development tools
- find http://www.cs.purdue.edu/homes/palsberg/cs352/F00/index.html
- add yourself to the course mailing list by writing (on a CS computer)
  mailer add me to cs352

Copyright c 2000 by Antony L. Hosking. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and full citation on the first page. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or fee. Request permission to publish from [email protected].

Compilers

What is a compiler?

- a program that translates an executable program in one language into an executable program in another language
- we expect the program produced by the compiler to be better, in some way, than the original

What is an interpreter?

- a program that reads an executable program and produces the results of running that program
- usually, this involves executing the source program in some fashion

This course deals mainly with compilers

Many of the same issues arise in interpreters

Motivation

Why study compiler construction?

Why build compilers?

Why attend class?

Interest

Compiler construction is a microcosm of computer science

- artificial intelligence: greedy algorithms, learning algorithms
- algorithms: graph algorithms, union-find, dynamic programming
- theory: DFAs for scanning, parser generators, lattice theory for analysis
- systems: allocation and naming, locality, synchronization
- architecture: pipeline management, hierarchy management, instruction set use

Inside a compiler, all these things come together

Isn't it a solved problem?

Machines are constantly changing

Changes in architecture lead to changes in compilers:
- new features pose new problems
- changing costs lead to different concerns
- old solutions need re-engineering

Changes in compilers should prompt changes in architecture:
- new languages and features

Intrinsic Merit

Compiler construction is challenging and fun
- interesting problems
- primary responsibility for performance (blame)
- new architectures pose new challenges
- real results
- extremely complex interactions

Compilers have an impact on how computers are used

Compiler construction poses some of the most interesting problems in computing

Experience

You have used several compilers

What qualities are important in a compiler?

1. Correct code
2. Output runs fast
3. Compiler runs fast
4. Compile time proportional to program size
5. Support for separate compilation
6. Good diagnostics for syntax errors
7. Works well with the debugger
8. Good diagnostics for flow anomalies
9. Cross language calls
10. Consistent, predictable optimization

Each of these shapes your feelings about the correct contents of this course

Abstract view

source code -> compiler -> machine code (reporting errors along the way)

Implications:
- recognize legal (and illegal) programs
- generate correct code
- manage storage of all variables and code
- agreement on format for object (or assembly) code

Big step up from assembler: higher level notations

Traditional two pass compiler

source code -> front end -> IR -> back end -> machine code (reporting errors along the way)

Implications:
- intermediate representation (IR)
- front end maps legal code into IR
- back end maps IR onto target machine
- simplify retargeting
- allows multiple front ends
- multiple passes lead to better code

A fallacy

[Diagram: four front ends (FORTRAN, C++, CLU, Smalltalk) all producing a shared IR, which feeds three back ends producing target 1, target 2, and target 3.]

Can we build n x m compilers with n + m components?

- must encode all the knowledge in each front end
- must represent all the features in one IR
- must handle all the features in each back end

Limited success with low-level IRs

Front end

source code -> scanner -> tokens -> parser -> IR (reporting errors along the way)

Responsibilities:
- recognize legal programs
- report errors
- produce IR
- preliminary storage map
- shape the code for the back end

Much of front end construction can be automated

Front end

Scanner:
- maps characters into tokens, the basic unit of syntax
    x = x + y  becomes  <id,x> = <id,x> + <id,y>
- character string value for a token is a lexeme
- typical tokens: number, id, +, -, *, /, do, end
- eliminates white space (tabs, blanks, comments)
- a key issue is speed, so use a specialized recognizer (as opposed to lex)


Front end

Parser:
- recognize context-free syntax
- guide context-sensitive analysis
- construct IR(s)
- produce meaningful error messages
- attempt error correction

Parser generators mechanize much of the work

Front end

Context-free syntax is specified with a grammar

<sheep noise> ::= baa <sheep noise>
               |  baa

This grammar defines the set of noises that a sheep makes under normal circumstances

The format is called Backus-Naur form (BNF)

Formally, a grammar G = (S, N, T, P) where:

S is the start symbol
N is a set of non-terminal symbols
T is a set of terminal symbols
P is a set of productions or rewrite rules (P : N -> (N ∪ T)*)

Front end

Context-free syntax can be put to better use

1 <goal> ::= <expr>
2 <expr> ::= <expr> <op> <term>
3         |  <term>
4 <term> ::= num
5         |  id
6 <op>   ::= +
7         |  -

This grammar defines simple expressions with addition and subtraction over the tokens id and num

S = <goal>
T = { num, id, +, - }
N = { <goal>, <expr>, <term>, <op> }
P = { 1, 2, 3, 4, 5, 6, 7 }

Front end

Given a grammar, valid sentences can be derived by repeated substitution.

Prod'n  Result
-       goal
1       expr
2       expr op term
5       expr op y
7       expr - y
2       expr op term - y
4       expr op 2 - y
6       expr + 2 - y
3       term + 2 - y
5       x + 2 - y

To recognize a valid sentence in some CFG, we reverse this process and build up a parse


Front end

A parse can be represented by a tree called a parse or syntax tree

For x + 2 - y:

              goal
                |
              expr
            /   |    \
        expr   op    term
       /  |  \   |      |
   expr  op  term -  <id:y>
     |    |    |
  term    +  <num:2>
     |
 <id:x>

Obviously, this contains a lot of unnecessary information

Front end

So, compilers often use an abstract syntax tree

          -
        /   \
       +    <id:y>
     /   \
<id:x>  <num:2>

This is much more concise

Abstract syntax trees (ASTs) are often used as an IR between front end and back end

Back end

IR -> instruction selection -> register allocation -> machine code (reporting errors along the way)

Responsibilities:
- translate IR into target machine code
- choose instructions for each IR operation
- decide what to keep in registers at each point
- ensure conformance with system interfaces

Automation has been less successful here

Back end

Instruction selection:
- produce compact, fast code
- use available addressing modes
- pattern matching problem
  - ad hoc techniques
  - tree pattern matching
  - string pattern matching
  - dynamic programming


Back end

Register Allocation:
- have value in a register when used
- limited resources
- changes instruction choices
- can move loads and stores
- optimal allocation is difficult

Modern allocators often use an analogy to graph coloring

Traditional three pass compiler

source code -> front end -> IR -> middle end -> IR -> back end -> machine code (reporting errors along the way)

Code Improvement:
- analyzes and changes IR
- goal is to reduce runtime
- must preserve values

Optimizer (middle end)

IR -> opt 1 -> IR -> ... -> IR -> opt n -> IR (reporting errors along the way)

Modern optimizers are usually built as a set of passes

Typical passes:
- constant propagation and folding
- code motion
- reduction of operator strength
- common subexpression elimination
- redundant store elimination
- dead code elimination

The Tiger compiler

[Figure: the phases of the Tiger compiler as a pipeline.
Pass 1: Lex (Source Program -> Tokens), Parse (-> Reductions), Parsing Actions (-> Abstract Syntax).
Pass 2: Semantic Analysis (consulting Environments and Tables), Frame Layout (Frame), Translate (-> IR Trees).
Pass 3: Canonicalize (IR Trees -> IR Trees).
Pass 4: Instruction Selection (-> Assem).
Pass 5: Control Flow Analysis (-> Flow Graph).
Pass 6: Data Flow Analysis (-> Interference Graph).
Pass 7: Register Allocation (-> Register Assignment).
Pass 8: Code Emission (-> Assembly Language).
Pass 9: Assembler (-> Relocatable Object Code).
Pass 10: Linker (-> Machine Language).]


The Tiger compiler phases

Lex: Break source file into individual words, or tokens
Parse: Analyse the phrase structure of program
Parsing Actions: Build a piece of abstract syntax tree for each phrase
Semantic Analysis: Determine what each phrase means, relate uses of variables to their definitions, check types of expressions, request translation of each phrase
Frame Layout: Place variables, function parameters, etc., into activation records (stack frames) in a machine-dependent way
Translate: Produce intermediate representation trees (IR trees), a notation that is not tied to any particular source language or target machine
Canonicalize: Hoist side effects out of expressions, and clean up conditional branches, for convenience of later phases
Instruction Selection: Group IR-tree nodes into clumps that correspond to actions of target-machine instructions
Control Flow Analysis: Analyse sequence of instructions into control flow graph showing all possible flows of control program might follow when it runs
Data Flow Analysis: Gather information about flow of data through variables of program; e.g., liveness analysis calculates places where each variable holds a still-needed (live) value
Register Allocation: Choose registers for variables and temporary values; variables not simultaneously live can share same register
Code Emission: Replace temporary names in each machine instruction with registers

A straight-line programming language

A straight-line programming language (no loops or conditionals):

Stm -> Stm ; Stm              CompoundStm
Stm -> id := Exp              AssignStm
Stm -> print ( ExpList )      PrintStm
Exp -> id                     IdExp
Exp -> num                    NumExp
Exp -> Exp Binop Exp          OpExp
Exp -> ( Stm , Exp )          EseqExp
ExpList -> Exp , ExpList      PairExpList
ExpList -> Exp                LastExpList
Binop -> +                    Plus
Binop -> -                    Minus
Binop -> *                    Times
Binop -> /                    Div

e.g., a := 5 + 3; b := (print(a, a - 1), 10 * a); print(b)

prints: 8 7 80

Tree representation

a := 5 + 3; b := (print(a, a - 1), 10 * a); print(b)

CompoundStm
+- AssignStm a
|    +- OpExp (NumExp 5) Plus (NumExp 3)
+- CompoundStm
   +- AssignStm b
   |    +- EseqExp
   |         +- PrintStm
   |         |    +- PairExpList
   |         |         +- IdExp a
   |         |         +- LastExpList
   |         |              +- OpExp (IdExp a) Minus (NumExp 1)
   |         +- OpExp (NumExp 10) Times (IdExp a)
   +- PrintStm
        +- LastExpList
             +- IdExp b

This is a convenient internal representation for a compiler to use.

Java classes for trees

abstract class Stm {}

class CompoundStm extends Stm {
    Stm stm1, stm2;
    CompoundStm(Stm s1, Stm s2) { stm1 = s1; stm2 = s2; }
}

class AssignStm extends Stm {
    String id; Exp exp;
    AssignStm(String i, Exp e) { id = i; exp = e; }
}

class PrintStm extends Stm {
    ExpList exps;
    PrintStm(ExpList e) { exps = e; }
}

abstract class Exp {}

class IdExp extends Exp {
    String id;
    IdExp(String i) { id = i; }
}

class NumExp extends Exp {
    int num;
    NumExp(int n) { num = n; }
}

class OpExp extends Exp {
    Exp left, right; int oper;
    final static int Plus = 1, Minus = 2, Times = 3, Div = 4;
    OpExp(Exp l, int o, Exp r) { left = l; oper = o; right = r; }
}

class EseqExp extends Exp {
    Stm stm; Exp exp;
    EseqExp(Stm s, Exp e) { stm = s; exp = e; }
}

abstract class ExpList {}

class PairExpList extends ExpList {
    Exp head; ExpList tail;
    PairExpList(Exp h, ExpList t) { head = h; tail = t; }
}

class LastExpList extends ExpList {
    Exp head;
    LastExpList(Exp h) { head = h; }
}
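
To make the tree representation concrete, here is the example program built as a tree of these classes. This follows the presentation in Appel's Tiger book, on which these notes are based; the wrapping class and main method are added here just so the fragment compiles alongside the class definitions above.

class BuildExample {
    public static void main(String[] args) {
        // a := 5 + 3; b := (print(a, a - 1), 10 * a); print(b)
        Stm prog =
            new CompoundStm(
                new AssignStm("a",
                    new OpExp(new NumExp(5), OpExp.Plus, new NumExp(3))),
                new CompoundStm(
                    new AssignStm("b",
                        new EseqExp(
                            new PrintStm(
                                new PairExpList(new IdExp("a"),
                                    new LastExpList(
                                        new OpExp(new IdExp("a"), OpExp.Minus,
                                                  new NumExp(1))))),
                            new OpExp(new NumExp(10), OpExp.Times,
                                      new IdExp("a")))),
                    new PrintStm(new LastExpList(new IdExp("b")))));
    }
}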

Chapter 2: Lexical Analysis

Scanner

source code -> scanner -> tokens -> parser -> IR (reporting errors along the way)

- maps characters into tokens, the basic unit of syntax
    x = x + y  becomes  <id,x> = <id,x> + <id,y>
- character string value for a token is a lexeme
- typical tokens: number, id, +, -, *, /, do, end
- eliminates white space (tabs, blanks, comments)
- a key issue is speed, so use a specialized recognizer (as opposed to lex)

Specifying patterns

A scanner must recognize various parts of the language's syntax

Some parts are easy:

white space
  <ws> ::= <ws> ' '
        |  <ws> '\t'
        |  ' '
        |  '\t'

keywords and operators
  specified as literal patterns: do, end

comments
  opening and closing delimiters: /* ... */

Specifying patterns

A scanner must recognize various parts of the language's syntax

Other parts are much harder:

identifiers
  alphabetic followed by k alphanumerics (_, $, &, ...)

numbers
  integers: 0 or digit from 1-9 followed by digits from 0-9
  decimals: integer '.' digits from 0-9
  reals: (integer or decimal) 'E' (+ or -) digits from 0-9
  complex: '(' real ',' real ')'

We need a powerful notation to specify these patterns


Operations on languages

Operation                  Definition
union of L and M           L ∪ M = { s | s ∈ L or s ∈ M }
  written L ∪ M
concatenation of L and M   LM = { st | s ∈ L and t ∈ M }
  written LM
Kleene closure of L        L* = the union of L^i for i = 0 to ∞
  written L*
positive closure of L      L+ = the union of L^i for i = 1 to ∞
  written L+

Regular expressions

Patterns are often specified as regular languages

Notations used to describe a regular language (or a regular set) include both regular expressions and regular grammars

Regular expressions (over an alphabet Σ):

1. ε is a RE denoting the set {ε}
2. if a ∈ Σ, then a is a RE denoting {a}
3. if r and s are REs, denoting L(r) and L(s), then:
   (r) is a RE denoting L(r)
   (r) | (s) is a RE denoting L(r) ∪ L(s)
   (r)(s) is a RE denoting L(r)L(s)
   (r)* is a RE denoting (L(r))*

If we adopt a precedence for operators, the extra parentheses can go away. We assume closure, then concatenation, then alternation as the order of precedence.

Examples

identifier
  letter -> (a | b | c | ... | z | A | B | C | ... | Z)
  digit  -> (0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9)
  id     -> letter (letter | digit)*

numbers
  integer -> (+ | - | ε) (0 | (1 | 2 | 3 | ... | 9) digit*)
  decimal -> integer . digit*
  real    -> (integer | decimal) E (+ | -) digit*
  complex -> '(' real , real ')'

Numbers can get much more complicated

Most programming language tokens can be described with REs

We can use REs to build scanners automatically

Algebraic properties of REs

Axiom                    Description
r | s = s | r            | is commutative
r | (s | t) = (r | s) | t    | is associative
(rs)t = r(st)            concatenation is associative
r(s | t) = rs | rt       concatenation distributes over |
(s | t)r = sr | tr
εr = r                   ε is the identity for concatenation
rε = r
r* = (r | ε)*            relation between * and ε
r** = r*                 * is idempotent
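
As a side note (not part of the original slides), the number patterns above carry over almost symbol-for-symbol into java.util.regex, which makes them easy to experiment with; the class and constant names below are invented for this example.

import java.util.regex.Pattern;

public class NumberPatterns {
    // integer: optional sign, then 0 or a nonzero digit followed by digits
    static final Pattern INTEGER = Pattern.compile("[+-]?(0|[1-9][0-9]*)");
    // decimal: integer '.' digits
    static final Pattern DECIMAL = Pattern.compile("[+-]?(0|[1-9][0-9]*)\\.[0-9]*");
    // real: (integer or decimal) 'E' sign digits
    static final Pattern REAL =
        Pattern.compile("[+-]?(0|[1-9][0-9]*)(\\.[0-9]*)?E[+-][0-9]*");

    public static void main(String[] args) {
        System.out.println(INTEGER.matcher("-42").matches());   // true
        System.out.println(DECIMAL.matcher("3.14").matches());  // true
        System.out.println(REAL.matcher("1.5E-3").matches());   // true
        System.out.println(INTEGER.matcher("007").matches());   // false: leading zero
    }
}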

Examples

Let Σ = {a, b}

1. a | b denotes {a, b}

2. (a | b)(a | b) denotes {aa, ab, ba, bb}
   i.e., (a | b)(a | b) = aa | ab | ba | bb

3. a* denotes {ε, a, aa, aaa, ...}

4. (a | b)* denotes the set of all strings of a's and b's (including ε)
   i.e., (a | b)* = (a*b*)*

5. a | a*b denotes {a, b, ab, aab, aaab, aaaab, ...}

Recognizers

From a regular expression we can construct a deterministic finite automaton (DFA)

Recognizer for identifier:

  state 0: on letter, go to state 1; on digit or other, go to state 3 (error)
  state 1: on letter or digit, stay in state 1; on other, go to state 2 (accept)

identifier
  letter -> (a | b | c | ... | z | A | B | C | ... | Z)
  digit  -> (0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9)
  id     -> letter (letter | digit)*

Code for the recognizer

char <- next char();
state <- 0;                    /* code for state 0 */
done <- false;
token value <- "";             /* empty string */
while (not done) {
    class <- char class[char];
    state <- next state[class, state];
    switch (state) {
    case 1:                    /* building an id */
        token value <- token value + char;
        char <- next char();
        break;
    case 2:                    /* accept state */
        token type <- identifier;
        done <- true;
        break;
    case 3:                    /* error */
        token type <- error;
        done <- true;
        break;
    }
}
return token type;

Tables for the recognizer

Two tables control the recognizer

char class:
  char     a-z     A-Z     0-9    other
  value    letter  letter  digit  other

next state:
  class    0  1  2  3
  letter   1  1  -  -
  digit    3  1  -  -
  other    3  2  -  -

To change languages, we can just change tables


Automatic construction

Scanner generators automatically construct code from regular expression-like descriptions
- construct a dfa
- use state minimization techniques
- emit code for the scanner (table driven or direct code)

A key issue in automation is an interface to the parser

lex is a scanner generator supplied with UNIX
- emits C code for scanner
- provides macro definitions for each token (used in the parser)

Grammars for regular languages

Can we place a restriction on the form of a grammar to ensure that it describes a regular language?

Provable fact:
  For any RE r, there is a grammar g such that L(r) = L(g).

The grammars that generate regular sets are called regular grammars

Definition:
  In a regular grammar, all productions have one of two forms:

  1. A -> aA
  2. A -> a

  where A is any non-terminal and a is any terminal symbol

These are also called type 3 grammars (Chomsky)

More regular languages

Example: the set of strings containing an even number of zeros and an even number of ones

[DFA with four states s0..s3: 1-transitions pair s0 with s1 and s2 with s3; 0-transitions pair s0 with s2 and s1 with s3; s0 is both the start state and the accepting state.]

The RE is (00 | 11)* ((01 | 10)(00 | 11)* (01 | 10)(00 | 11)*)*

More regular expressions

What about the RE (a | b)* abb?

[NFA: s0 loops to itself on a and b, and also moves to s1 on a; s1 moves to s2 on b; s2 moves to s3 on b; s3 accepts.]

State s0 has multiple transitions on a!
This is a nondeterministic finite automaton:

         a           b
  s0   {s0, s1}    {s0}
  s1   -           {s2}
  s2   -           {s3}


Finite automata

A non-deterministic finite automaton (NFA) consists of:

1. a set of states S = { s0, ..., sn }
2. a set of input symbols Σ (the alphabet)
3. a transition function move mapping state-symbol pairs to sets of states
4. a distinguished start state s0
5. a set of distinguished accepting or final states F

A Deterministic Finite Automaton (DFA) is a special case of an NFA:

1. no state has a ε-transition, and
2. for each state s and input symbol a, there is at most one edge labelled a leaving s.

A DFA accepts x iff. there exists a unique path through the transition graph from s0 to an accepting state such that the labels along the edges spell x.

DFAs and NFAs are equivalent

1. DFAs are clearly a subset of NFAs

2. Any NFA can be converted into a DFA, by simulating sets of simultaneous states:
   - each DFA state corresponds to a set of NFA states
   - possible exponential blowup

NFA to DFA using the subset construction: example 1

(a | b)* abb

Simulating the NFA of the previous slide on sets of states gives a DFA; each DFA state is a set of NFA states:

                  a           b
  {s0}        {s0, s1}     {s0}
  {s0, s1}    {s0, s1}     {s0, s2}
  {s0, s2}    {s0, s1}     {s0, s3}
  {s0, s3}    {s0, s1}     {s0}

On input a b b the DFA moves {s0} -> {s0, s1} -> {s0, s2} -> {s0, s3}, the accepting state.

Constructing a DFA from a regular expression

RE -> NFA with ε moves
  build NFA for each term; connect them with ε moves

NFA with ε moves -> DFA
  construct the simulation: the "subset" construction

DFA -> minimized DFA
  merge compatible states

DFA -> RE
  construct R[k,i,j] = R[k-1,i,k] (R[k-1,k,k])* R[k-1,k,j] ∪ R[k-1,i,j]


RE to NFA

N(ε):     a start state with an ε-transition to an accepting state
N(a):     a start state with an a-transition to an accepting state
N(AB):    N(A) followed by N(B); the accepting state of N(A) feeds the start state of N(B)
N(A | B): a new start state with ε moves into both N(A) and N(B), and ε moves from each into a new accepting state
N(A*):    new start and accepting states; ε moves let the input enter N(A), loop from its accepting state back to its start, or bypass it entirely

RE to NFA: example

(a | b)* abb

a | b:      1 -ε-> 2 -a-> 3 -ε-> 6
            1 -ε-> 4 -b-> 5 -ε-> 6

(a | b)*:   0 -ε-> 1,  6 -ε-> 7,  6 -ε-> 1,  0 -ε-> 7

abb:        7 -a-> 8 -b-> 9 -b-> 10

NFA to DFA: the subset construction

Input: NFA N
Output: A DFA D with states Dstates and transitions Dtrans such that L(D) = L(N)
Method: Let s be a state in N and T be a set of states, and use the following operations:

Operation      Definition
ε-closure(s)   set of NFA states reachable from NFA state s on ε-transitions alone
ε-closure(T)   set of NFA states reachable from some NFA state s in T on ε-transitions alone
move(T, a)     set of NFA states to which there is a transition on input symbol a from some NFA state s in T

add state T = ε-closure(s0) unmarked to Dstates
while ∃ unmarked state T in Dstates
    mark T
    for each input symbol a
        U = ε-closure(move(T, a))
        if U ∉ Dstates then add U to Dstates unmarked
        Dtrans[T, a] = U
    endfor
endwhile

ε-closure(s0) is the start state of D
A state of D is accepting if it contains at least one accepting state in N

NFA to DFA using subset construction: example 2

Applying the construction to the NFA for (a | b)* abb (states 0-10, previous slide):

  A = {0, 1, 2, 4, 7}                     a   b
  B = {1, 2, 3, 4, 6, 7, 8}          A    B   C
  C = {1, 2, 4, 5, 6, 7}             B    B   D
  D = {1, 2, 4, 5, 6, 7, 9}          C    B   C
  E = {1, 2, 4, 5, 6, 7, 10}         D    B   E
                                     E    B   C
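
A compact Java sketch of the algorithm above (not the notes' own code; the representation is an assumption made for this example). States are integers, ε-transitions are marked with a sentinel symbol, and the NFA is a nested map from state to symbol to successor states.

import java.util.*;

public class SubsetConstruction {
    static final char EPS = 'ε';

    // ε-closure(T): all NFA states reachable from T on ε-transitions alone.
    static Set<Integer> closure(Set<Integer> t,
            Map<Integer, Map<Character, Set<Integer>>> nfa) {
        Deque<Integer> work = new ArrayDeque<>(t);
        Set<Integer> result = new HashSet<>(t);
        while (!work.isEmpty()) {
            int s = work.pop();
            for (int u : nfa.getOrDefault(s, Map.of()).getOrDefault(EPS, Set.of()))
                if (result.add(u)) work.push(u);
        }
        return result;
    }

    // move(T, a): states reachable from T on one a-transition.
    static Set<Integer> move(Set<Integer> t, char a,
            Map<Integer, Map<Character, Set<Integer>>> nfa) {
        Set<Integer> result = new HashSet<>();
        for (int s : t)
            result.addAll(nfa.getOrDefault(s, Map.of()).getOrDefault(a, Set.of()));
        return result;
    }

    // The subset construction: Dtrans keyed by DFA state (a set of NFA states).
    static Map<Set<Integer>, Map<Character, Set<Integer>>> subsetConstruct(
            Map<Integer, Map<Character, Set<Integer>>> nfa,
            int start, Set<Character> alphabet) {
        Set<Integer> startState = closure(Set.of(start), nfa);
        Map<Set<Integer>, Map<Character, Set<Integer>>> dtrans = new HashMap<>();
        Deque<Set<Integer>> unmarked = new ArrayDeque<>();
        unmarked.push(startState);
        while (!unmarked.isEmpty()) {
            Set<Integer> t = unmarked.pop();
            if (dtrans.containsKey(t)) continue;      // already marked
            Map<Character, Set<Integer>> row = new HashMap<>();
            dtrans.put(t, row);
            for (char a : alphabet) {
                Set<Integer> u = closure(move(t, a, nfa), nfa);
                row.put(a, u);
                if (!dtrans.containsKey(u)) unmarked.push(u);
            }
        }
        return dtrans;
    }
}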

Limits of regular languages

Not all languages are regular

One cannot construct DFAs to recognize these languages:

  L = { p^k q^k }
  L = { wcw^r | w ∈ Σ* }

Note: neither of these is a regular expression!
(DFAs cannot count!)

But, this is a little subtle. One can construct DFAs for:

- alternating 0's and 1's:  (ε | 1)(01)*(ε | 0)
- sets of pairs of 0's and 1's:  (01 | 10)+

So what is hard?

Language features that can cause problems:

reserved words
  PL/I had no reserved words
  if then then then = else; else else = then;

significant blanks
  FORTRAN and Algol68 ignore blanks
  do 10 i = 1,25
  do 10 i = 1.25

string constants
  special characters in strings
  newline, tab, quote, comment delimiters

finite closures
  some languages limit identifier lengths
  adds states to count length
  FORTRAN 66: 6 characters

These can be swept under the rug in the language design

How bad can it get?

[This slide shows a pathological FORTRAN fragment. With insignificant blanks and no reserved words, a prefix such as DO9E1=1 cannot be classified when first seen: DO9E1=1,2 opens a DO loop, while DO9E1=1.2 is an assignment to the variable DO9E1.]

Chapter 3: LL Parsing


The role of the parser

source code -> scanner -> tokens -> parser -> IR (reporting errors along the way)

Parser:
- performs context-free syntax analysis
- guides context-sensitive analysis
- constructs an intermediate representation
- produces meaningful error messages
- attempts error correction

For the next few weeks, we will look at parser construction

Syntax analysis

Context-free syntax is specified with a context-free grammar.

Formally, a CFG G is a 4-tuple (Vt, Vn, S, P), where:

Vt is the set of terminal symbols in the grammar. For our purposes, Vt is the set of tokens returned by the scanner.

Vn, the nonterminals, is a set of syntactic variables that denote sets of (sub)strings occurring in the language. These are used to impose a structure on the grammar.

S is a distinguished nonterminal (S ∈ Vn) denoting the entire set of strings in L(G). This is sometimes called a goal symbol.

P is a finite set of productions specifying how terminals and non-terminals can be combined to form strings in the language. Each production must have a single non-terminal on its left hand side.

The set V = Vt ∪ Vn is called the vocabulary of G

Notation and terminology

a, b, c, ... ∈ Vt
A, B, C, ... ∈ Vn
U, V, W, ... ∈ V
α, β, γ, ... ∈ V*
u, v, w, ... ∈ Vt*

If A -> γ then αAβ ⇒ αγβ is a single-step derivation using A -> γ

Similarly, ⇒* and ⇒+ denote derivations of zero or more and one or more steps

If S ⇒* β then β is said to be a sentential form of G

L(G) = { w ∈ Vt* | S ⇒+ w }; w ∈ L(G) is called a sentence of G

Note, L(G) = { β ∈ V* | S ⇒* β } ∩ Vt*

Syntax analysis

Grammars are often written in Backus-Naur form (BNF).

Example:

1 <goal> ::= <expr>
2 <expr> ::= <expr> <op> <expr>
3         |  num
4         |  id
5 <op>   ::= +
6         |  -
7         |  *
8         |  /

This describes simple expressions over numbers and identifiers.

In a BNF for a grammar, we represent

1. non-terminals with angle brackets or capital letters
2. terminals with typewriter font or underline
3. productions as in the example


Scanning vs. parsing

Where do we draw the line?

term ::= [a-zA-Z] ([a-zA-Z] | [0-9])*
      |  0 | [1-9][0-9]*
op   ::= + | - | * | /
expr ::= (term op)* term

Regular expressions are used to classify:
- identifiers, numbers, keywords
- REs are more concise and simpler for tokens than a grammar
- more efficient scanners can be built from REs (DFAs) than grammars

Context-free grammars are used to count:
- brackets: (), begin ... end, if ... then ... else
- imparting structure: expressions

Syntactic analysis is complicated enough: grammar for C has around 200 productions. Factoring out lexical analysis as a separate phase makes compiler more manageable.

Derivations

We can view the productions of a CFG as rewriting rules.

Using our example CFG:

goal ⇒ expr
     ⇒ expr op expr
     ⇒ expr op expr op expr
     ⇒ <id,x> op expr op expr
     ⇒ <id,x> + expr op expr
     ⇒ <id,x> + <num,2> op expr
     ⇒ <id,x> + <num,2> * expr
     ⇒ <id,x> + <num,2> * <id,y>

We have derived the sentence x + 2 * y.
We denote this: goal ⇒* id + num * id

Such a sequence of rewrites is a derivation or a parse.

The process of discovering a derivation is called parsing.

Derivations

At each step, we chose a non-terminal to replace.

This choice can lead to different derivations.

Two are of particular interest:

leftmost derivation
  the leftmost non-terminal is replaced at each step

rightmost derivation
  the rightmost non-terminal is replaced at each step

The previous example was a leftmost derivation.

Rightmost derivation

For the string x + 2 * y:

goal ⇒ expr
     ⇒ expr op expr
     ⇒ expr op <id,y>
     ⇒ expr * <id,y>
     ⇒ expr op expr * <id,y>
     ⇒ expr op <num,2> * <id,y>
     ⇒ expr + <num,2> * <id,y>
     ⇒ <id,x> + <num,2> * <id,y>

Again, goal ⇒* id + num * id

65 66

Precedence

These two derivations point out a problem with the grammar.

It has no notion of precedence, or implied order of evaluation.

The parse tree built from the rightmost derivation of x + 2 * y:

              goal
                |
              expr
            /   |    \
        expr   op    expr
       /  |  \   |      |
   expr  op  expr *  <id,y>
     |    |    |
 <id,x>   +  <num,2>

Treewalk evaluation computes (x + 2) * y: the "wrong" answer!
Should be x + (2 * y)

Precedence

To add precedence takes additional machinery:

1 <goal>   ::= <expr>
2 <expr>   ::= <expr> + <term>
3           |  <expr> - <term>
4           |  <term>
5 <term>   ::= <term> * <factor>
6           |  <term> / <factor>
7           |  <factor>
8 <factor> ::= num
9           |  id

This grammar enforces a precedence on the derivation:
- terms must be derived from expressions
- forces the "correct" tree

Precedence

Now, for the string x + 2 * y:

goal ⇒ expr
     ⇒ expr + term
     ⇒ expr + term * factor
     ⇒ expr + term * <id,y>
     ⇒ expr + factor * <id,y>
     ⇒ expr + <num,2> * <id,y>
     ⇒ term + <num,2> * <id,y>
     ⇒ factor + <num,2> * <id,y>
     ⇒ <id,x> + <num,2> * <id,y>

Again, goal ⇒* id + num * id, but this time, we build the desired tree.

Precedence

              goal
                |
              expr
            /   |   \
        expr    +    term
          |        /  |   \
        term    term  *  factor
          |       |        |
      factor   factor   <id,y>
          |       |
      <id,x>  <num,2>

Treewalk evaluation computes x + (2 * y)


Ambiguity

If a grammar has more than one derivation for a single sentential form, then it is ambiguous

Example:

<stmt> ::= if <expr> then <stmt>
        |  if <expr> then <stmt> else <stmt>
        |  other stmts

Consider deriving the sentential form:

  if E1 then if E2 then S1 else S2

It has two derivations.
This ambiguity is purely grammatical.
It is a context-free ambiguity.

Ambiguity

May be able to eliminate ambiguities by rearranging the grammar:

<stmt>      ::= <matched>
             |  <unmatched>
<matched>   ::= if <expr> then <matched> else <matched>
             |  other stmts
<unmatched> ::= if <expr> then <stmt>
             |  if <expr> then <matched> else <unmatched>

This generates the same language as the ambiguous grammar, but applies the common sense rule:

  match each else with the closest unmatched then

This is most likely the language designer's intent.

Ambiguity

Ambiguity is often due to confusion in the context-free specification.

Context-sensitive confusions can arise from overloading.

Example: in many Algol-like languages, a reference like f(17) could be a function call or a subscripted variable.

Disambiguating this statement requires context:
- need values of declarations
- not context-free
- really an issue of type

Rather than complicate parsing, we will handle this separately.

Parsing: the big picture

tokens -> parser -> IR

grammar -> parser generator -> parser (code and parsing tables)

Our goal is a flexible parser generator system


Top-down versus bottom-up

Top-down parsers
- start at the root of derivation tree and fill in
- picks a production and tries to match the input
- may require backtracking
- some grammars are backtrack-free (predictive)

Bottom-up parsers
- start at the leaves and fill in
- start in a state valid for legal first tokens
- as input is consumed, change state to encode possibilities (recognize valid prefixes)
- use a stack to store both state and sentential forms

Top-down parsing

A top-down parser starts with the root of the parse tree, labelled with the start or goal symbol of the grammar.

To build a parse, it repeats the following steps until the fringe of the parse tree matches the input string

1. At a node labelled A, select a production A -> α and construct the appropriate child for each symbol of α
2. When a terminal is added to the fringe that doesn't match the input string, backtrack
3. Find the next node to be expanded (must have a label in Vn)

The key is selecting the right production in step 1
- should be guided by input string

Simple expression grammar

Recall our grammar for simple expressions:

1 <goal>   ::= <expr>
2 <expr>   ::= <expr> + <term>
3           |  <expr> - <term>
4           |  <term>
5 <term>   ::= <term> * <factor>
6           |  <term> / <factor>
7           |  <factor>
8 <factor> ::= num
9           |  id

Consider the input string x - 2 * y

Example

(the ↑ marks the parser's position in the input)

Prod'n  Sentential form            Input
-       goal                       ↑x - 2 * y
1       expr                       ↑x - 2 * y
2       expr + term                ↑x - 2 * y
4       term + term                ↑x - 2 * y
7       factor + term              ↑x - 2 * y
9       id + term                  ↑x - 2 * y
-       id + term                  x ↑- 2 * y
-       goal (backtrack)           ↑x - 2 * y
1       expr                       ↑x - 2 * y
3       expr - term                ↑x - 2 * y
4       term - term                ↑x - 2 * y
7       factor - term              ↑x - 2 * y
9       id - term                  ↑x - 2 * y
-       id - term                  x ↑- 2 * y
-       id - term                  x - ↑2 * y
5       id - term * factor         x - ↑2 * y
7       id - factor * factor       x - ↑2 * y
8       id - num * factor          x - ↑2 * y
-       id - num * factor          x - 2 ↑* y
-       id - num * factor          x - 2 * ↑y
9       id - num * id              x - 2 * ↑y
-       id - num * id              x - 2 * y↑


Example

Another possible parse for x - 2 * y

Prod'n  Sentential form                 Input
-       goal                            ↑x - 2 * y
1       expr                            ↑x - 2 * y
2       expr + term                     ↑x - 2 * y
2       expr + term + term              ↑x - 2 * y
2       expr + term + term + term       ↑x - 2 * y
2       ...                             ↑x - 2 * y

If the parser makes the wrong choices, expansion doesn't terminate.
This isn't a good property for a parser to have.
(Parsers should terminate!)

Left-recursion

Top-down parsers cannot handle left-recursion in a grammar

Formally, a grammar is left-recursive if

  ∃ A ∈ Vn such that A ⇒+ Aα for some string α

Our simple expression grammar is left-recursive

Eliminating left-recursion

To remove left-recursion, we can transform the grammar

Consider the grammar fragment:

<foo> ::= <foo> α
       |  β

where α and β do not start with <foo>

We can rewrite this as:

<foo> ::= β <bar>
<bar> ::= α <bar>
       |  ε

where <bar> is a new non-terminal

This fragment contains no left-recursion

Example

Our expression grammar contains two cases of left-recursion

<expr> ::= <expr> + <term>        <term> ::= <term> * <factor>
        |  <expr> - <term>                |  <term> / <factor>
        |  <term>                         |  <factor>

Applying the transformation gives

<expr>  ::= <term> <expr'>
<expr'> ::= + <term> <expr'>
         |  - <term> <expr'>
         |  ε

<term>  ::= <factor> <term'>
<term'> ::= * <factor> <term'>
         |  / <factor> <term'>
         |  ε

With this grammar, a top-down parser will
- terminate
- backtrack on some inputs


Example

This cleaner grammar defines the same language

1 <goal>   ::= <expr>
2 <expr>   ::= <term> + <expr>
3           |  <term> - <expr>
4           |  <term>
5 <term>   ::= <factor> * <term>
6           |  <factor> / <term>
7           |  <factor>
8 <factor> ::= num
9           |  id

It is
- right-recursive
- free of ε productions

Unfortunately, it generates different associativity
Same syntax, different meaning

Example

Our long-suffering expression grammar:

1  <goal>   ::= <expr>
2  <expr>   ::= <term> <expr'>
3  <expr'>  ::= + <term> <expr'>
4            |  - <term> <expr'>
5            |  ε
6  <term>   ::= <factor> <term'>
7  <term'>  ::= * <factor> <term'>
8            |  / <factor> <term'>
9            |  ε
10 <factor> ::= num
11           |  id

Recall, we factored out left-recursion

How much lookahead is needed?

We saw that top-down parsers may need to backtrack when they select the wrong production

Do we need arbitrary lookahead to parse CFGs?
- in general, yes
- use the Earley or Cocke-Younger-Kasami algorithms
  (Aho, Hopcroft, and Ullman, Problem 2.34; Parsing, Translation and Compiling, Chapter 4)

Fortunately
- large subclasses of CFGs can be parsed with limited lookahead
- most programming language constructs can be expressed in a grammar that falls in these subclasses

Among the interesting subclasses are:
LL(1): left to right scan, left-most derivation, 1-token lookahead; and
LR(1): left to right scan, right-most derivation, 1-token lookahead

Predictive parsing

Basic idea:
For any two productions A -> α | β, we would like a distinct way of choosing the correct production to expand.

For some RHS α ∈ G, define FIRST(α) as the set of tokens that appear first in some string derived from α.
That is, for some w ∈ Vt*, w ∈ FIRST(α) iff. α ⇒* wγ.

Key property:
Whenever two productions A -> α and A -> β both appear in the grammar, we would like

  FIRST(α) ∩ FIRST(β) = φ

This would allow the parser to make a correct choice with a lookahead of only one symbol!

The example grammar has this property!


Left factoring

What if a grammar does not have this property?

Sometimes, we can transform a grammar to have this property.

For each non-terminal A find the longest prefix α common to two or more of its alternatives.

if α ≠ ε then replace all of the A productions

  A ::= αβ1 | αβ2 | ... | αβn

with

  A  ::= α A'
  A' ::= β1 | β2 | ... | βn

where A' is a new non-terminal.

Repeat until no two alternatives for a single non-terminal have a common prefix.

Example

Consider a right-recursive version of the expression grammar:

1 <goal>   ::= <expr>
2 <expr>   ::= <term> + <expr>
3           |  <term> - <expr>
4           |  <term>
5 <term>   ::= <factor> * <term>
6           |  <factor> / <term>
7           |  <factor>
8 <factor> ::= num
9           |  id

To choose between productions 2, 3, & 4, the parser must see past the num or id and look at the +, -, *, or /.

  FIRST(2) ∩ FIRST(3) ∩ FIRST(4) ≠ φ

This grammar fails the test.

Note: This grammar is right-associative.

Example

There are two nonterminals that must be left factored:

<expr> ::= <term> + <expr>
        |  <term> - <expr>
        |  <term>

<term> ::= <factor> * <term>
        |  <factor> / <term>
        |  <factor>

Applying the transformation gives us:

<expr>  ::= <term> <expr'>
<expr'> ::= + <expr>
         |  - <expr>
         |  ε

<term>  ::= <factor> <term'>
<term'> ::= * <term>
         |  / <term>
         |  ε

Example

Substituting back into the grammar yields

1  <goal>   ::= <expr>
2  <expr>   ::= <term> <expr'>
3  <expr'>  ::= + <expr>
4            |  - <expr>
5            |  ε
6  <term>   ::= <factor> <term'>
7  <term'>  ::= * <term>
8            |  / <term>
9            |  ε
10 <factor> ::= num
11           |  id

Now, selection requires only a single token lookahead.

Note: This grammar is still right-associative.


Example

Sentential form                        Input
-  goal                                ↑x - 2 * y
1  expr                                ↑x - 2 * y
2  term expr'                          ↑x - 2 * y
6  factor term' expr'                  ↑x - 2 * y
11 id term' expr'                      ↑x - 2 * y
-  id term' expr'                      x ↑- 2 * y
9  id expr'                            x ↑- 2 * y
4  id - expr                           x ↑- 2 * y
-  id - expr                           x - ↑2 * y
2  id - term expr'                     x - ↑2 * y
6  id - factor term' expr'             x - ↑2 * y
10 id - num term' expr'                x - ↑2 * y
-  id - num term' expr'                x - 2 ↑* y
7  id - num * term expr'               x - 2 ↑* y
-  id - num * term expr'               x - 2 * ↑y
6  id - num * factor term' expr'       x - 2 * ↑y
11 id - num * id term' expr'           x - 2 * ↑y
-  id - num * id term' expr'           x - 2 * y↑
9  id - num * id expr'                 x - 2 * y↑
5  id - num * id                       x - 2 * y↑

The next symbol determined each choice correctly.

Back to left-recursion elimination

Given a left-factored CFG, to eliminate left-recursion:

if ∃ A ::= Aα then replace all of the A productions

  A ::= Aα | β | ... | γ

with

  A  ::= N A'
  N  ::= β | ... | γ
  A' ::= αA' | ε

where N and A' are new productions.

Repeat until there are no left-recursive productions.

Generality

Question:
  By left factoring and eliminating left-recursion, can we transform an arbitrary context-free grammar to a form where it can be predictively parsed with a single token lookahead?

Answer:
  Given a context-free grammar that doesn't meet our conditions, it is undecidable whether an equivalent grammar exists that does meet our conditions.

Many context-free languages do not have such a grammar:

  { a^n 0 b^n | n ≥ 1 } ∪ { a^n 1 b^2n | n ≥ 1 }

Must look past an arbitrary number of a's to discover the 0 or the 1 and so determine the derivation.

Recursive descent parsing

Now, we can produce a simple recursive descent parser from the (right-associative) grammar.

goal:
    token <- next token();
    if (expr() = ERROR | token ≠ EOF) then
        return ERROR;

expr:
    if (term() = ERROR) then
        return ERROR;
    else return expr prime();

expr prime:
    if (token = PLUS) then
        token <- next token();
        return expr();
    else if (token = MINUS) then
        token <- next token();
        return expr();
    else return OK;


Recursive descent parsing

term:
    if (factor() = ERROR) then
        return ERROR;
    else return term prime();

term prime:
    if (token = MULT) then
        token <- next token();
        return term();
    else if (token = DIV) then
        token <- next token();
        return term();
    else return OK;

factor:
    if (token = NUM) then
        token <- next token();
        return OK;
    else if (token = ID) then
        token <- next token();
        return OK;
    else return ERROR;

Building the tree

One of the key jobs of the parser is to build an intermediate representation of the source code.

To build an abstract syntax tree, we can simply insert code at the appropriate points (a Java rendering follows):
- factor() can stack nodes id, num
- term prime() can stack nodes *, /
- term prime() can pop 3, build and push subtree
- expr prime() can stack nodes +, -
- expr prime() can pop 3, build and push subtree
- goal() can pop and return tree
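
A compact Java sketch of the same parser with tree building folded in (not the notes' own code; the token representation and Node class are assumptions made for this example). Rather than an explicit stack of nodes, it passes partial trees through return values, which is the same idea in recursive form.

import java.util.*;

public class RDParser {
    // AST node: a leaf (num/id) or an operator with two children.
    static class Node {
        String label; Node left, right;
        Node(String label) { this.label = label; }
        Node(String label, Node l, Node r) { this.label = label; left = l; right = r; }
        public String toString() {
            return left == null ? label : "(" + left + " " + label + " " + right + ")";
        }
    }

    private final List<String> tokens;     // e.g. ["x", "-", "2", "*", "y"]
    private int pos = 0;
    RDParser(List<String> tokens) { this.tokens = tokens; }

    private String token() { return pos < tokens.size() ? tokens.get(pos) : "$"; }
    private void advance() { pos++; }

    Node goal() {                           // <goal> ::= <expr>
        Node tree = expr();
        if (!token().equals("$")) throw new IllegalStateException("junk after expr");
        return tree;                        // pop and return tree
    }
    Node expr() {                           // <expr> ::= <term> <expr'>
        return exprPrime(term());
    }
    Node exprPrime(Node left) {             // <expr'> ::= + <expr> | - <expr> | ε
        if (token().equals("+") || token().equals("-")) {
            String op = token(); advance();
            return new Node(op, left, expr());   // build and push subtree
        }
        return left;                             // ε
    }
    Node term() {                           // <term> ::= <factor> <term'>
        return termPrime(factor());
    }
    Node termPrime(Node left) {             // <term'> ::= * <term> | / <term> | ε
        if (token().equals("*") || token().equals("/")) {
            String op = token(); advance();
            return new Node(op, left, term());
        }
        return left;
    }
    Node factor() {                         // <factor> ::= num | id
        String t = token(); advance();
        return new Node(t);                      // stack node id, num
    }

    public static void main(String[] args) {
        Node tree = new RDParser(List.of("x", "-", "2", "*", "y")).goal();
        System.out.println(tree);   // (x - (2 * y)) : note the right-associativity
    }
}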
Non-recursive predictive parsing

Observation:
  Our recursive descent parser encodes state information in its run-time stack, or call stack.

Using recursive procedure calls to implement a stack abstraction may not be particularly efficient.

This suggests other implementation methods:
- explicit stack, hand-coded parser
- stack-based, table-driven parser

Non-recursive predictive parsing

Now, a predictive parser looks like:

source code -> scanner -> tokens -> table-driven parser -> IR
                                    (consulting a stack and parsing tables)

Rather than writing code, we build tables.

Building tables can be automated!


Table-driven parsers

A parser generator system often looks like:

grammar -> parser generator -> parser (code) + parsing tables
source code -> scanner -> tokens -> table-driven parser -> IR

This is true for both top-down (LL) and bottom-up (LR) parsers

Non-recursive predictive parsing

Input: a string w and a parsing table M for G

push EOF onto the stack
push the Start Symbol onto the stack
token <- next token()
repeat
    let X be the top stack symbol and a the next input token
    if X is a terminal or EOF then
        if X = a then
            pop X and advance: token <- next token()
        else error()
    else    /* X is a non-terminal */
        if M[X, a] = X -> Y1 Y2 ... Yk then
            pop X
            push Yk, Yk-1, ..., Y1 in that order
        else error()
until X = EOF
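
A runnable Java sketch of this driver loop (assumed names, not the notes' code). The table M maps a (non-terminal, lookahead) pair to the right-hand side to push; symbols are plain strings, "$" plays EOF, and an ε production is an empty right-hand side.

import java.util.*;

public class LL1Driver {
    static boolean parse(List<String> input, String start, Set<String> nonterminals,
                         Map<String, Map<String, List<String>>> M) {
        Deque<String> stack = new ArrayDeque<>();
        stack.push("$");
        stack.push(start);
        int pos = 0;
        while (true) {
            String X = stack.peek();
            String a = pos < input.size() ? input.get(pos) : "$";
            if (X.equals("$")) return a.equals("$");       // accept iff input exhausted
            if (!nonterminals.contains(X)) {               // X is a terminal
                if (!X.equals(a)) return false;            // error
                stack.pop(); pos++;                        // match and advance
            } else {
                List<String> rhs = M.getOrDefault(X, Map.of()).get(a);
                if (rhs == null) return false;             // error entry in M
                stack.pop();
                for (int i = rhs.size() - 1; i >= 0; i--)  // push Yk, ..., Y1
                    stack.push(rhs.get(i));                // (ε = empty rhs)
            }
        }
    }
}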
Non-recursive predictive parsing

What we need now is a parsing table M.

Our expression grammar:

1  <goal>   ::= <expr>
2  <expr>   ::= <term> <expr'>
3  <expr'>  ::= + <expr>
4            |  - <expr>
5            |  ε
6  <term>   ::= <factor> <term'>
7  <term'>  ::= * <term>
8            |  / <term>
9            |  ε
10 <factor> ::= num
11           |  id

Its parse table (entries are production numbers; blank means error):

          id   num   +   -   *   /   $†
goal       1    1    -   -   -   -   -
expr       2    2    -   -   -   -   -
expr'      -    -    3   4   -   -   5
term       6    6    -   -   -   -   -
term'      -    -    9   9   7   8   9
factor    11   10    -   -   -   -   -

† we use $ to represent EOF

FIRST

For a string of grammar symbols α, define FIRST(α) as:

- the set of terminal symbols that begin strings derived from α:
    { a ∈ Vt | α ⇒* aβ }
- If α ⇒* ε then ε ∈ FIRST(α)

FIRST(α) contains the set of tokens valid in the initial position in α

To build FIRST(X):

1. If X ∈ Vt then FIRST(X) is {X}
2. If X -> ε then add ε to FIRST(X).
3. If X -> Y1 Y2 ... Yk:
   (a) Put FIRST(Y1) - {ε} in FIRST(X)
   (b) ∀i : 1 < i ≤ k, if ε ∈ FIRST(Y1) ∩ ... ∩ FIRST(Yi-1)
       (i.e., Y1 ... Yi-1 ⇒* ε)
       then put FIRST(Yi) - {ε} in FIRST(X)
   (c) If ε ∈ FIRST(Y1) ∩ ... ∩ FIRST(Yk) then put ε in FIRST(X)

Repeat until no more additions can be made.


FOLLOW

For a non-terminal A, define FOLLOW(A) as

  the set of terminals that can appear immediately to the right of A in some sentential form

Thus, a non-terminal's FOLLOW set specifies the tokens that can legally appear after it.

A terminal symbol has no FOLLOW set.

To build FOLLOW(A):

1. Put $ in FOLLOW(<goal>)
2. If A -> αBβ:
   (a) Put FIRST(β) - {ε} in FOLLOW(B)
   (b) If β = ε (i.e., A -> αB) or ε ∈ FIRST(β) (i.e., β ⇒* ε) then put FOLLOW(A) in FOLLOW(B)

Repeat until no more additions can be made

LL(1) grammars

Previous definition
  A grammar G is LL(1) iff. for all non-terminals A, each distinct pair of productions A -> β and A -> γ satisfy the condition FIRST(β) ∩ FIRST(γ) = φ.

What if A ⇒* ε?

Revised definition
  A grammar G is LL(1) iff. for each set of productions A -> α1 | α2 | ... | αn:

  1. FIRST(α1), FIRST(α2), ..., FIRST(αn) are all pairwise disjoint
  2. If αi ⇒* ε then FIRST(αj) ∩ FOLLOW(A) = φ, ∀ 1 ≤ j ≤ n, i ≠ j.

If G is ε-free, condition 1 is sufficient.
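
The FIRST construction above is a fixed-point computation, and a short Java sketch makes that concrete (assumed representation, not the notes' code). A grammar maps each non-terminal to its alternative right-hand sides; "ε" stands for the empty string, and any symbol that is not a key of the grammar map is treated as a terminal. FOLLOW can be computed by an analogous loop over rule 2 above.

import java.util.*;

public class FirstSets {
    static Map<String, Set<String>> first(Map<String, List<List<String>>> grammar) {
        Map<String, Set<String>> FIRST = new HashMap<>();
        for (String nt : grammar.keySet()) FIRST.put(nt, new HashSet<>());
        boolean changed = true;
        while (changed) {                          // repeat until no more additions
            changed = false;
            for (var e : grammar.entrySet()) {
                Set<String> fx = FIRST.get(e.getKey());
                for (List<String> rhs : e.getValue()) {
                    boolean allNullable = true;    // Y1 ... Yi-1 ⇒* ε so far?
                    for (String y : rhs) {
                        Set<String> fy = grammar.containsKey(y)
                            ? FIRST.get(y) : Set.of(y);   // terminal: FIRST(a) = {a}
                        for (String s : fy)
                            if (!s.equals("ε") && fx.add(s)) changed = true;
                        if (!fy.contains("ε")) { allNullable = false; break; }
                    }
                    if (allNullable && fx.add("ε")) changed = true;  // X ⇒* ε
                }
            }
        }
        return FIRST;
    }
}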
LL(1) grammars

Provable facts about LL(1) grammars:

1. No left-recursive grammar is LL(1)
2. No ambiguous grammar is LL(1)
3. Some languages have no LL(1) grammar
4. An ε-free grammar where each alternative expansion for A begins with a distinct terminal is a simple LL(1) grammar.

Example

  S -> aS | a
  is not LL(1) because FIRST(aS) = FIRST(a) = {a}

  S  -> aS'
  S' -> aS' | ε
  accepts the same language and is LL(1)

LL(1) parse table construction

Input: Grammar G
Output: Parsing table M
Method:

1. ∀ productions A -> α:
   (a) ∀ a ∈ FIRST(α), add A -> α to M[A, a]
   (b) If ε ∈ FIRST(α):
       i. ∀ b ∈ FOLLOW(A), add A -> α to M[A, b]
       ii. If $ ∈ FOLLOW(A) then add A -> α to M[A, $]
2. Set each undefined entry of M to error

If ∃ M[A, a] with multiple entries then grammar is not LL(1).

Note: recall a, b ∈ Vt, so a, b ≠ ε
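
A Java sketch of this table-construction method (assumed names, continuing the representation of the FirstSets sketch; FIRST and FOLLOW are taken as already computed, per non-terminal).

import java.util.*;

public class LL1Table {
    static Map<String, Map<String, List<String>>> build(
            Map<String, List<List<String>>> grammar,
            Map<String, Set<String>> FIRST,
            Map<String, Set<String>> FOLLOW) {
        Map<String, Map<String, List<String>>> M = new HashMap<>();
        for (var e : grammar.entrySet()) {
            String A = e.getKey();
            M.putIfAbsent(A, new HashMap<>());
            for (List<String> alpha : e.getValue()) {
                Set<String> f = firstOfString(alpha, grammar, FIRST);
                for (String a : f)
                    if (!a.equals("ε")) addEntry(M, A, a, alpha);   // rule 1(a)
                if (f.contains("ε"))                                // rule 1(b):
                    for (String b : FOLLOW.get(A))                  // FOLLOW(A),
                        addEntry(M, A, b, alpha);                   // including $
            }
        }
        return M;   // absent entries mean error
    }

    // FIRST of a symbol string α = Y1...Yk, from the per-symbol FIRST sets.
    static Set<String> firstOfString(List<String> alpha,
            Map<String, List<List<String>>> grammar,
            Map<String, Set<String>> FIRST) {
        Set<String> result = new HashSet<>();
        for (String y : alpha) {
            Set<String> fy = grammar.containsKey(y) ? FIRST.get(y) : Set.of(y);
            result.addAll(fy);
            result.remove("ε");
            if (!fy.contains("ε")) return result;
        }
        result.add("ε");   // every Yi can derive ε
        return result;
    }

    static void addEntry(Map<String, Map<String, List<String>>> M,
                         String A, String a, List<String> alpha) {
        if (M.get(A).put(a, alpha) != null)        // multiple entries in M[A, a]
            throw new IllegalStateException("conflict: grammar is not LL(1)");
    }
}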

Example

Our long-suffering expression grammar:

S  -> E             T  -> F T'
E  -> T E'          T' -> * T | / T | ε
E' -> + E | - E | ε F  -> num | id

       FIRST           FOLLOW
S      {num, id}       {$}
E      {num, id}       {$}
E'     {ε, +, -}       {$}
T      {num, id}       {+, -, $}
T'     {ε, *, /}       {+, -, $}
F      {num, id}       {+, -, *, /, $}

and the resulting parse table (blank entries mean error):

      id       num      +        -        *        /        $
S     S->E     S->E
E     E->TE'   E->TE'
E'                      E'->+E   E'->-E                      E'->ε
T     T->FT'   T->FT'
T'                      T'->ε    T'->ε    T'->*T   T'->/T    T'->ε
F     F->id    F->num

A grammar that is not LL(1)

<stmt> ::= if <expr> then <stmt> else <stmt>
        |  if <expr> then <stmt>
        |  ...

Left-factored:

<stmt>  ::= if <expr> then <stmt> <stmt'> | ...
<stmt'> ::= else <stmt> | ε

Now, FIRST(<stmt'>) = {ε, else}
Also, FOLLOW(<stmt'>) = {else, $}
But, FIRST(<stmt'>) ∩ FOLLOW(<stmt'>) = {else} ≠ φ

On seeing else, there is a conflict between choosing

  <stmt'> ::= else <stmt>   and   <stmt'> ::= ε

so the grammar is not LL(1)!

The fix:
  Put priority on <stmt'> ::= else <stmt> to associate else with the closest previous then.

Error recovery

Key notion:
- For each non-terminal, construct a set of terminals on which the parser can synchronize
- When an error occurs looking for A, scan until an element of SYNCH(A) is found

Building SYNCH:

1. a ∈ FOLLOW(A) implies a ∈ SYNCH(A)
2. place keywords that start statements in SYNCH(A)
3. add symbols in FIRST(A) to SYNCH(A)

If we can't match a terminal on top of stack:

1. pop the terminal
2. print a message saying the terminal was inserted
3. continue the parse

(i.e., SYNCH(a) = Vt - {a})

Chapter 4: LR Parsing


Some definitions

Recall

For a grammar G, with start symbol S, any string α such that S ⇒* α is called a sentential form

- If α ∈ Vt*, then α is called a sentence in L(G)
- Otherwise it is just a sentential form (not a sentence in L(G))

A left-sentential form is a sentential form that occurs in the leftmost derivation of some sentence.

A right-sentential form is a sentential form that occurs in the rightmost derivation of some sentence.

Bottom-up parsing

Goal:
  Given an input string w and a grammar G, construct a parse tree by starting at the leaves and working to the root.

The parser repeatedly matches a right-sentential form from the language against the tree's upper frontier.

At each match, it applies a reduction to build on the frontier:
- each reduction matches an upper frontier of the partially built tree to the RHS of some production
- each reduction adds a node on top of the frontier

The final result is a rightmost derivation, in reverse.

Example

Consider the grammar

1 S -> aABe
2 A -> Abc
3   |  b
4 B -> d

and the input string abbcde

Prod'n.  Sentential Form
-        abbcde
3        a A bcde
2        a A de
4        a A B e
1        S

The trick appears to be scanning the input and finding valid sentential forms.

Handles

What are we trying to find?

A substring α of the tree's upper frontier that matches some production A -> α where reducing α to A is one step in the reverse of a rightmost derivation

We call such a string a handle.

Formally:
  a handle of a right-sentential form γ is a production A -> β and a position in γ where β may be found and replaced by A to produce the previous right-sentential form in a rightmost derivation of γ

  i.e., if S ⇒*rm αAw ⇒rm αβw then A -> β in the position following α is a handle of αβw

Because γ is a right-sentential form, the substring to the right of a handle contains only terminal symbols.


Handles

[Figure: a parse tree with root S whose fringe is αβw; the subtree for A sits above β, so the handle A -> β occupies the position immediately after α, with only terminals (w) to its right.]

The handle A -> β in the parse tree for αβw

Handles

Theorem:
  If G is unambiguous then every right-sentential form has a unique handle.

Proof: (by definition)

1. G is unambiguous implies the rightmost derivation is unique
2. hence a unique production A -> β applied to take γi-1 to γi
3. hence a unique position k at which A -> β is applied
4. hence a unique handle A -> β

Example

The left-recursive expression grammar (original form)

1 <goal>   ::= <expr>
2 <expr>   ::= <expr> + <term>
3           |  <expr> - <term>
4           |  <term>
5 <term>   ::= <term> * <factor>
6           |  <term> / <factor>
7           |  <factor>
8 <factor> ::= num
9           |  id

Prod'n.  Sentential Form
-        goal
1        expr
3        expr - term
5        expr - term * factor
9        expr - term * id
7        expr - factor * id
8        expr - num * id
4        term - num * id
7        factor - num * id
9        id - num * id

Handle-pruning

The process to construct a bottom-up parse is called handle-pruning.

To construct a rightmost derivation

  S = γ0 ⇒ γ1 ⇒ γ2 ⇒ ... ⇒ γn-1 ⇒ γn = w

we set i to n and apply the following simple algorithm

for i = n down to 1:
  1. find the handle Ai -> βi in γi
  2. replace βi with Ai to generate γi-1

This takes 2n steps, where n is the length of the derivation


Stack implementation

One scheme to implement a handle-pruning, bottom-up parser is called a shift-reduce parser.

Shift-reduce parsers use a stack and an input buffer

1. initialize stack with $

2. Repeat until the top of the stack is the goal symbol and the input token is $
   a) find the handle
      if we don't have a handle on top of the stack, shift an input symbol onto the stack
   b) prune the handle
      if we have a handle A -> β on the stack, reduce
      i) pop |β| symbols off the stack
      ii) push A onto the stack

Example: back to x - 2 * y

1 <goal>   ::= <expr>
2 <expr>   ::= <expr> + <term>
3           |  <expr> - <term>
4           |  <term>
5 <term>   ::= <term> * <factor>
6           |  <term> / <factor>
7           |  <factor>
8 <factor> ::= num
9           |  id

Stack                      Input           Action
$                          id - num * id   shift
$ id                       - num * id      reduce 9
$ factor                   - num * id      reduce 7
$ term                     - num * id      reduce 4
$ expr                     - num * id      shift
$ expr -                   num * id        shift
$ expr - num               * id            reduce 8
$ expr - factor            * id            reduce 7
$ expr - term              * id            shift
$ expr - term *            id              shift
$ expr - term * id                         reduce 9
$ expr - term * factor                     reduce 5
$ expr - term                              reduce 3
$ expr                                     reduce 1
$ goal                                     accept

1. Shift until top of stack is the right end of a handle
2. Find the left end of the handle and reduce

5 shifts + 9 reduces + 1 accept
Shift-reduce parsing

Shift-reduce parsers are simple to understand.

A shift-reduce parser has just four canonical actions:

1. shift — next input symbol is shifted onto the top of the stack
2. reduce — right end of handle is on top of stack;
   locate left end of handle within the stack;
   pop handle off stack and push appropriate non-terminal LHS
3. accept — terminate parsing and signal success
4. error — call an error recovery routine

The key problem: to recognize handles (not covered in this course).

LR(k) grammars

Informally, we say that a grammar G is LR(k) if, given a rightmost
derivation

    S = γ0 ⇒ γ1 ⇒ γ2 ⇒ ... ⇒ γn = w

we can, for each right-sentential form in the derivation,

1. isolate the handle of each right-sentential form, and
2. determine the production by which to reduce,

by scanning γi from left to right, going at most k symbols beyond the
right end of the handle of γi.
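
The four actions above are exactly the cases in the driver loop of a
table-driven shift-reduce (LR) parser. The following Java sketch shows
such a driver; the action and goto tables, their encoding, and the
Production class are hypothetical stand-ins for what a parser generator
would emit, not anything defined in these notes:

    import java.util.ArrayDeque;
    import java.util.Deque;

    class Production { int lhs; int rhsLength; }   // A ::= beta, |beta| symbols

    abstract class Action {}                       // one of the four canonical actions
    class Shift  extends Action { int nextState; }
    class Reduce extends Action { Production p; }
    class Accept extends Action {}
    class Error  extends Action {}

    class LRDriver {
      Action[][] action;   // action[state][token]   (assumed generator output)
      int[][]    goTo;     // goTo[state][nonterminal]

      boolean parse(int[] tokens) {                // tokens ends with EOF token 0
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(0);                             // start state plays the role of $
        int i = 0;
        while (true) {
          Action a = action[stack.peek()][tokens[i]];
          if (a instanceof Shift) {                // 1. shift next input symbol
            stack.push(((Shift) a).nextState);
            i++;
          } else if (a instanceof Reduce) {        // 2. reduce: handle on top of stack
            Production p = ((Reduce) a).p;
            for (int k = 0; k < p.rhsLength; k++)  // pop |beta| states
              stack.pop();
            stack.push(goTo[stack.peek()][p.lhs]); // push state for the LHS
          } else if (a instanceof Accept) {        // 3. accept
            return true;
          } else {                                 // 4. error (recovery not shown)
            return false;
          }
        }
      }
    }
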

LR(k) grammars

Formally, a grammar G is LR(k) iff.:

1. S ⇒*rm αAw ⇒rm αβw, and
2. S ⇒*rm γBx ⇒rm αβy, and
3. FIRSTk(w) = FIRSTk(y)

imply αAy = γBx.

i.e., assume sentential forms αβw and αβy, with common prefix αβ and
common k-symbol lookahead FIRSTk(y) = FIRSTk(w), such that αβw reduces
to αAw and αβy reduces to γBx.

But, the common prefix means αβy also reduces to αAy, for the same
result.

Thus αAy = γBx.

Why study LR grammars?

LR(1) grammars are often used to construct parsers.

We call these parsers LR(1) parsers.

• everyone's favorite parser
• virtually all context-free programming language constructs can be
  expressed in an LR(1) form
• LR grammars are the most general grammars parsable by a
  deterministic, bottom-up parser
• efficient parsers can be implemented for LR(1) grammars
• LR parsers detect an error as soon as possible in a left-to-right
  scan of the input
• LR grammars describe a proper superset of the languages recognized
  by predictive (i.e., LL) parsers
  – LL(k): recognize use of a production A → β seeing first k symbols
    of β
  – LR(k): recognize occurrence of β (the handle) having seen all of
    what is derived from β plus k symbols of lookahead
Left versus right recursion

Right Recursion:

• needed for termination in predictive parsers
• requires more stack space
• right associative operators

Left Recursion:

• works fine in bottom-up parsers
• limits required stack space
• left associative operators

Rule of thumb:

• right recursion for top-down parsers
• left recursion for bottom-up parsers

Parsing review

Recursive descent
A hand coded recursive descent parser directly encodes a grammar
(typically an LL(1) grammar) into a series of mutually recursive
procedures. It has most of the linguistic limitations of LL(1).

LL(k)
An LL(k) parser must be able to recognize the use of a production after
seeing only the first k symbols of its right hand side.

LR(k)
An LR(k) parser must be able to recognize the occurrence of the right
hand side of a production after having seen all that is derived from
that right hand side with k symbols of lookahead.

Chapter 5: JavaCC and JTB

The Java Compiler Compiler

• Can be thought of as "Lex and Yacc for Java."
• It is based on LL(k) rather than LALR(1).
• Grammars are written in EBNF.
• The Java Compiler Compiler transforms an EBNF grammar into an LL(k)
  parser.
• The JavaCC grammar can have embedded action code written in Java,
  just like a Yacc grammar can have embedded action code written in C.
• The lookahead can be changed by writing LOOKAHEAD(...).
• The whole input is given in just one file (not two).
The JavaCC input format

One file:

• header
• token specifications for lexical analysis
• grammar

Example of a token specification:

    TOKEN :
    {
      < INTEGER_LITERAL: ( ["1"-"9"] (["0"-"9"])* | "0" ) >
    }

Example of a production:

    void StatementListReturn() :
    {}
    {
      ( Statement() )* "return" Expression() ";"
    }

Generating a parser with JavaCC

    javacc fortran.jj    // generates a parser with a specified name
    javac Main.java      // Main.java contains a call of the parser
    java Main < prog.f   // parses the program prog.f

The Visitor Pattern

For object-oriented programming, the Visitor pattern enables

• the definition of a new operation on an object structure
• without changing the classes of the objects.

Gamma, Helm, Johnson, Vlissides: Design Patterns, 1995.
Sneak Preview

When using the Visitor pattern,

• the set of classes must be fixed in advance, and
• each class must have an accept method.

First Approach: Instanceof and Type Casts

The running Java example: summing an integer list.

    interface List {}

    class Nil implements List {}

    class Cons implements List {
      int head;
      List tail;
    }

First Approach: Instanceof and Type Casts

    List l; // The List-object
    int sum = 0;
    boolean proceed = true;
    while (proceed) {
      if (l instanceof Nil)
        proceed = false;
      else if (l instanceof Cons) {
        sum = sum + ((Cons) l).head;
        l = ((Cons) l).tail;
      }
    }

Advantage: The code is written without touching the classes Nil and
Cons.

Drawback: The code constantly uses type casts and instanceof to
determine what class of object it is considering.

Second Approach: Dedicated Methods

The first approach is not object-oriented!

To access parts of an object, the classical approach is to use
dedicated methods which both access and act on the subobjects.

    interface List {
      int sum();
    }

We can now compute the sum of all components of a given List-object l
by writing l.sum().
Second Approach: Dedicated Methods

    class Nil implements List {
      public int sum() {
        return 0;
      }
    }

    class Cons implements List {
      int head;
      List tail;
      public int sum() {
        return head + tail.sum();
      }
    }

Advantage: The type casts and instanceof operations have disappeared,
and the code can be written in a systematic way.

Disadvantage: For each new operation on List-objects, new dedicated
methods have to be written, and all classes must be recompiled.

Third Approach: The Visitor Pattern

The Idea:

• Divide the code into an object structure and a Visitor (akin to
  Functional Programming!)
• Insert an accept method in each class. Each accept method takes a
  Visitor as argument.
• A Visitor contains a visit method for each class (overloading!). A
  visit method for a class C takes an argument of type C.

    interface List {
      void accept(Visitor v);
    }

    interface Visitor {
      void visit(Nil x);
      void visit(Cons x);
    }

Third Approach: The Visitor Pattern

The purpose of the accept methods is to invoke the visit method in the
Visitor which can handle the current object.

    class Nil implements List {
      public void accept(Visitor v) {
        v.visit(this);
      }
    }

    class Cons implements List {
      int head;
      List tail;
      public void accept(Visitor v) {
        v.visit(this);
      }
    }

The control flow goes back and forth between the visit methods in the
Visitor and the accept methods in the object structure.

    class SumVisitor implements Visitor {
      int sum = 0;
      public void visit(Nil x) {}
      public void visit(Cons x) {
        sum = sum + x.head;
        x.tail.accept(this);
      }
    }

    ...
    SumVisitor sv = new SumVisitor();
    l.accept(sv);
    System.out.println(sv.sum);

Notice: The visit methods describe both 1) actions, and 2) access of
subobjects.
Comparison

The Visitor pattern combines the advantages of the two other
approaches.

                                Frequent type casts?   Frequent recompilation?
    Instanceof and type casts   Yes                    No
    Dedicated methods           No                     Yes
    The Visitor pattern         No                     No

The advantage of Visitors: New methods without recompilation!

Requirement for using Visitors: All classes must have an accept method.

Tools that use the Visitor pattern: JJTree (from Sun Microsystems) and
the Java Tree Builder (from Purdue University), both frontends for The
Java Compiler Compiler from Sun Microsystems.

Visitors: Summary

• Visitor makes adding new operations easy. Simply write a new
  visitor.
• A visitor gathers related operations. It also separates unrelated
  ones.
• Adding new classes to the object structure is hard. Key
  consideration: are you most likely to change the algorithm applied
  over an object structure, or are you most likely to change the
  classes of objects that make up the structure?
• Visitors can accumulate state.
• Visitor can break encapsulation. Visitor's approach assumes that the
  interface of the data structure classes is powerful enough to let
  visitors do their job. As a result, the pattern often forces you to
  provide public operations that access internal state, which may
  compromise its encapsulation.

The Java Tree Builder

• The Java Tree Builder (JTB) has been developed here at Purdue in my
  group.
• JTB is a frontend for The Java Compiler Compiler.
• JTB supports the building of syntax trees which can be traversed
  using visitors.
• JTB transforms a bare JavaCC grammar into three components:
  – a JavaCC grammar with embedded Java code for building a syntax
    tree;
  – one class for every form of syntax tree node; and
  – a default visitor which can do a depth-first traversal of a syntax
    tree.

The Java Tree Builder

The produced JavaCC grammar can then be processed by the Java Compiler
Compiler to give a parser which produces syntax trees.

The produced syntax trees can now be traversed by a Java program by
writing subclasses of the default visitor.

    [figure: a JavaCC grammar is fed through JTB, producing (1) a
     JavaCC grammar with embedded Java code, which the Java Compiler
     Compiler turns into a parser, (2) syntax-tree-node classes with
     accept methods, and (3) a default visitor; running the parser on a
     program yields a syntax tree]
Example (simplified)

For example, consider the Java 1.1 production:

    void Assignment() : {}
    { PrimaryExpression() AssignmentOperator() Expression() }

JTB produces:

    Assignment Assignment () :
    { PrimaryExpression n0;
      AssignmentOperator n1;
      Expression n2; {} }
    {
      n0=PrimaryExpression()
      n1=AssignmentOperator()
      n2=Expression()
      { return new Assignment(n0,n1,n2); }
    }

Notice that the production returns a syntax tree represented as an
Assignment object.

Using JTB

    jtb fortran.jj       // generates jtb.out.jj
    javacc jtb.out.jj    // generates a parser with a specified name
    javac Main.java      // Main.java contains calls of the parser and visitors
    java Main < prog.f   // builds a syntax tree for prog.f and executes the visitors

Example (simplified)

JTB produces a syntax-tree-node class for Assignment:

    public class Assignment implements Node {
      PrimaryExpression f0;
      AssignmentOperator f1;
      Expression f2;

      public Assignment(PrimaryExpression n0,
                        AssignmentOperator n1,
                        Expression n2) {
        f0 = n0; f1 = n1; f2 = n2;
      }

      public void accept(visitor.Visitor v) {
        v.visit(this);
      }
    }

Notice the accept method; it invokes the visit method for Assignment in
the default visitor.

Example (simplified)

The default visitor looks like this:

    public class DepthFirstVisitor implements Visitor {
      ...
      //
      // f0 -> PrimaryExpression()
      // f1 -> AssignmentOperator()
      // f2 -> Expression()
      //
      public void visit(Assignment n) {
        n.f0.accept(this);
        n.f1.accept(this);
        n.f2.accept(this);
      }
    }

Notice the body of the visit method which visits each of the three
subtrees of the Assignment node.
Example (simplified)

Here is an example of a program which operates on syntax trees for Java
1.1 programs. The program prints the right-hand side of every
assignment. The entire program is six lines:

    public class VprintAssignRHS extends DepthFirstVisitor {
      void visit(Assignment n) {
        VPrettyPrinter v = new VPrettyPrinter();
        n.f2.accept(v); v.out.println();
        n.f2.accept(this);
      }
    }

When this visitor is passed to the root of the syntax tree, the
depth-first traversal will begin, and when Assignment nodes are
reached, the method visit in VprintAssignRHS is executed.

Notice the use of VPrettyPrinter. It is a visitor which pretty prints
Java 1.1 programs.

JTB is bootstrapped.

Chapter 6: Semantic Analysis

Semantic Analysis

The compilation process is driven by the syntactic structure of the
program as discovered by the parser.

Semantic routines:

• interpret meaning of the program based on its syntactic structure
• two purposes:
  – finish analysis by deriving context-sensitive information
  – begin synthesis by generating the IR or target code
• associated with individual productions of a context free grammar or
  subtrees of a syntax tree

Context-sensitive analysis

What context-sensitive questions might the compiler ask?

1. Is x scalar, an array, or a function?
2. Is x declared before it is used?
3. Are any names declared but not used?
4. Which declaration of x does this reference?
5. Is an expression type-consistent?
6. Does the dimension of a reference match the declaration?
7. Where can x be stored? (heap, stack, ...)
8. Does *p reference the result of a malloc()?
9. Is x defined before it is used?
10. Is an array reference in bounds?
11. Does function foo produce a constant value?
12. Can p be implemented as a memo-function?

These cannot be answered with a context-free grammar.
Context-sensitive analysis

Why is context-sensitive analysis hard?

• answers depend on values, not syntax
• questions and answers involve non-local information
• answers may involve computation

Several alternatives:

    abstract syntax tree     specify non-local computations
    (attribute grammars)     automatic evaluators

    symbol tables            central store for facts
                             express checking code

    language design          simplify language
                             avoid problems

Symbol tables

For compile-time efficiency, compilers often use a symbol table:
associates lexical names (symbols) with their attributes.

What items should be entered?

• variable names
• defined constants
• procedure and function names
• literal constants and strings
• source text labels
• compiler-generated temporaries (we'll get there)

Separate table for structure layouts (types) (field offsets and
lengths).

A symbol table is a compile-time structure.

Symbol table information

What kind of information might the compiler need?

• textual name
• data type
• dimension information (for aggregates)
• declaring procedure
• lexical level of declaration
• storage class (base address)
• offset in storage
• if record, pointer to structure table
• if parameter, by-reference or by-value?
• can it be aliased? to what other names?
• number and type of arguments to functions

Nested scopes: block-structured symbol tables

What information is needed?

• when we ask about a name, we want the most recent declaration
• the declaration may be from the current scope or some enclosing
  scope
• innermost scope overrides declarations from outer scopes

Key point: new declarations (usually) occur only in current scope

What operations do we need? (a sketch of one implementation follows
below)

• void put(Symbol key, Object value) – binds key to value
• Object get(Symbol key) – returns value bound to key
• void beginScope() – remembers current state of table
• void endScope() – restores table to state at most recent scope that
  has not been ended

May need to preserve list of locals for the debugger.
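
A minimal sketch of one way to implement these four operations, using a
hash table of binding stacks plus an undo stack (the approach taken in
Appel's Modern Compiler Implementation in Java); the names here are
illustrative, with Strings standing in for interned symbols:

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.HashMap;
    import java.util.Map;

    class SymbolTable {
      // current bindings; shadowed ones sit deeper in each per-name stack
      private final Map<String, Deque<Object>> bindings = new HashMap<>();
      private final Deque<String> undo = new ArrayDeque<>(); // names bound since beginScope
      private static final String MARK = "<scope-mark>";     // sentinel; assumes no
                                                             // identifier has this name

      public void put(String key, Object value) {
        bindings.computeIfAbsent(key, k -> new ArrayDeque<>()).push(value);
        undo.push(key);
      }

      public Object get(String key) {               // most recent declaration wins
        Deque<Object> stack = bindings.get(key);
        return (stack == null || stack.isEmpty()) ? null : stack.peek();
      }

      public void beginScope() { undo.push(MARK); }

      public void endScope() {                      // pop all bindings of this scope
        while (!undo.isEmpty()) {
          String key = undo.pop();
          if (key.equals(MARK)) break;
          bindings.get(key).pop();
        }
      }
    }
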
Attribute information

Attributes are internal representation of declarations.

Symbol table associates names with attributes.

Names may have different attributes depending on their meaning:

• variables: type, procedure level, frame offset
• types: type descriptor, data size/alignment
• constants: type, value
• procedures: formals (names/types), result type, block information
  (local decls.), frame size

Type expressions

Type expressions are a textual representation for types:

1. basic types: boolean, char, integer, real, etc.
2. type names
3. constructed types (constructors applied to type expressions):
   (a) array(I, T) denotes array of elements type T, index type I
       e.g., array(1..10, integer)
   (b) T1 × T2 denotes Cartesian product of type expressions T1 and T2
   (c) records: fields have names
       e.g., record((a × integer), (b × real))
   (d) pointer(T) denotes the type "pointer to object of type T"
   (e) D → R denotes type of function mapping domain D to range R
       e.g., integer × integer → integer

Type descriptors

Type descriptors are compile-time structures representing type
expressions.

e.g., char × char → pointer(integer):

    [figure: expression tree for char × char → pointer(integer), drawn
     once as a tree and once as a DAG with the two char leaves shared]

Type compatibility

Type checking needs to determine type equivalence.

Two approaches:

Name equivalence: each type name is a distinct type.

Structural equivalence: two types are equivalent iff. they have the
same structure (after substituting type expressions for type names):

    s ≡ t                           iff. s and t are the same basic types
    array(s1, s2) ≡ array(t1, t2)   iff. s1 ≡ t1 and s2 ≡ t2
    s1 × s2 ≡ t1 × t2               iff. s1 ≡ t1 and s2 ≡ t2
    pointer(s) ≡ pointer(t)         iff. s ≡ t
    s1 → s2 ≡ t1 → t2               iff. s1 ≡ t1 and s2 ≡ t2
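
These rules translate directly into a recursive comparison over type
descriptors. A minimal Java sketch, assuming one descriptor class per
constructor (the class names are illustrative, and acyclic descriptors
are assumed — the recursive types discussed below would need a cycle
check):

    abstract class Type {}
    class Basic   extends Type { String name;      Basic(String n)         { name = n; } }
    class Array   extends Type { Type index, elem; Array(Type i, Type e)   { index = i; elem = e; } }
    class Product extends Type { Type left, right; Product(Type l, Type r) { left = l; right = r; } }
    class Pointer extends Type { Type to;          Pointer(Type t)         { to = t; } }
    class Arrow   extends Type { Type dom, rng;    Arrow(Type d, Type r)   { dom = d; rng = r; } }

    class StructuralEquiv {
      static boolean equiv(Type s, Type t) {
        if (s instanceof Basic && t instanceof Basic)     // same basic type?
          return ((Basic) s).name.equals(((Basic) t).name);
        if (s instanceof Array && t instanceof Array)
          return equiv(((Array) s).index, ((Array) t).index)
              && equiv(((Array) s).elem,  ((Array) t).elem);
        if (s instanceof Product && t instanceof Product)
          return equiv(((Product) s).left,  ((Product) t).left)
              && equiv(((Product) s).right, ((Product) t).right);
        if (s instanceof Pointer && t instanceof Pointer)
          return equiv(((Pointer) s).to, ((Pointer) t).to);
        if (s instanceof Arrow && t instanceof Arrow)
          return equiv(((Arrow) s).dom, ((Arrow) t).dom)
              && equiv(((Arrow) s).rng, ((Arrow) t).rng);
        return false;                                     // different constructors
      }
    }
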
Type compatibility: example

Consider:

    type link = ^cell;
    var  next : link;
         last : link;
         p    : ^cell;
         q, r : ^cell;

Under name equivalence:

• next and last have the same type
• p, q and r have the same type
• p and next have different type

Under structural equivalence all variables have the same type.

Ada/Pascal/Modula-2/Tiger are somewhat confusing: they treat distinct
type definitions as distinct types, so

• p has different type from q and r

Type compatibility: Pascal-style name equivalence

Build compile-time structure called a type graph:

• each constructor or basic type creates a node
• each name creates a leaf (associated with the type's descriptor)

    [figure: next and last share the leaf for link, which points to a
     pointer(cell) node; p has its own pointer(cell) node; q and r
     share a third pointer(cell) node]

Type expressions are equivalent if they are represented by the same
node in the graph.

Type compatibility: recursive types

Consider:

    type link = ^cell;
         cell = record
                  info : integer;
                  next : link;
                end;

We may want to eliminate the names from the type graph.

Eliminating name link from type graph for record cell:

    [figure: record node for cell with fields info → integer and
     next → pointer, where the pointer node still refers to the name
     cell]

Type compatibility: recursive types

Allowing cycles in the type graph eliminates cell:

    [figure: the next field's pointer node now points back to the
     record node itself, forming a cycle]
Chapter 7: Translation and Simplification

Tiger IR trees: Expressions

    CONST(i)              Integer constant i

    NAME(n)               Symbolic constant n [a code label]

    TEMP(t)               Temporary t [one of any number of
                          "registers"]

    BINOP(o, e1, e2)      Application of binary operator o:
                            PLUS, MINUS, MUL, DIV
                            AND, OR, XOR [bitwise logical]
                            LSHIFT, RSHIFT [logical shifts]
                            ARSHIFT [arithmetic right-shift]
                          to integer operands e1 (evaluated first) and
                          e2 (evaluated second)

    MEM(e)                Contents of a word of memory starting at
                          address e

    CALL(f, e1, ..., en)  Procedure call; expression f is evaluated
                          before arguments e1, ..., en

    ESEQ(s, e)            Expression sequence; evaluate s for
                          side-effects, then e for result

Tiger IR trees: Statements

    MOVE(TEMP t, e)         Evaluate e into temporary t

    MOVE(MEM(e1), e2)       Evaluate e1 yielding address a, e2 into
                            word at a

    EXP(e)                  Evaluate e and discard result

    JUMP(e, l1, ..., ln)    Transfer control to address e; l1, ..., ln
                            are all possible values for e

    CJUMP(o, e1, e2, t, f)  Evaluate e1 then e2, yielding a and b,
                            respectively; compare a with b using
                            relational operator o:
                              EQ, NE [signed and unsigned integers]
                              LT, GT, LE, GE [signed]
                              ULT, ULE, UGT, UGE [unsigned]
                            jump to t if true, f if false

    SEQ(s1, s2)             Statement s1 followed by s2

    LABEL(n)                Define constant value of name n as current
                            code address; NAME(n) can be used as target
                            of jumps, calls, etc.

Kinds of expressions

Expression kinds indicate "how expression might be used":

    Ex(exp)                expressions that compute a value
    Nx(stm)                statements: expressions that compute no
                           value
    Cx                     conditionals (jump to true and false
                           destinations):
                             RelCx(op, left, right)
                             IfThenElseExp — expression/statement
                                             depending on use

Conversion operators allow use of one form in context of another:

    unEx        convert to tree expression that computes value of inner
                tree
    unNx        convert to tree statement that computes inner tree but
                returns no value
    unCx(t, f)  convert to statement that evaluates inner tree and
                branches to true destination if non-zero, false
                destination otherwise
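
In Appel's Java formulation these kinds become an abstract class with
the three conversion methods. The sketch below follows that shape but
is only illustrative: the tree-node classes are simplified stand-ins
for the IR constructors above, and labels are plain Strings for
brevity:

    abstract class TExp {}                        // IR tree expressions (stand-in)
    abstract class TStm {}                        // IR tree statements (stand-in)
    class CONST extends TExp { int v;  CONST(int v)  { this.v = v; } }
    class EXPstm extends TStm { TExp e; EXPstm(TExp e) { this.e = e; } }
    class CJUMPstm extends TStm {
      static final int NE = 0;
      int op; TExp l, r; String t, f;
      CJUMPstm(int op, TExp l, TExp r, String t, String f) {
        this.op = op; this.l = l; this.r = r; this.t = t; this.f = f;
      }
    }

    abstract class Exp {
      abstract TExp unEx();                       // use as a value
      abstract TStm unNx();                       // use for side effects only
      abstract TStm unCx(String t, String f);     // branch to label t or f
    }

    class Ex extends Exp {
      TExp exp;
      Ex(TExp e) { exp = e; }
      TExp unEx() { return exp; }
      TStm unNx() { return new EXPstm(exp); }     // evaluate and discard
      TStm unCx(String t, String f) {             // non-zero means true
        return new CJUMPstm(CJUMPstm.NE, exp, new CONST(0), t, f);
      }
    }

    class Nx extends Exp {
      TStm stm;
      Nx(TStm s) { stm = s; }
      TExp unEx() { throw new Error("Nx used as a value"); }
      TStm unNx() { return stm; }
      TStm unCx(String t, String f) { throw new Error("Nx used as a conditional"); }
    }

A Cx subclass would implement unCx directly (e.g., RelCx emits the
CJUMP shown under Comparisons below) and derive unEx via the
CONST 0 / CONST 1 pattern shown there.
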
Translating Tiger

Simple variables: fetch with a MEM:

    Ex(MEM(+(TEMP fp, CONST k)))

where fp is home frame of variable, found by following static links; k
is offset of variable in that level.

Tiger array variables: Tiger arrays are pointers to array base, so
fetch with a MEM like any other variable:

    Ex(MEM(+(TEMP fp, CONST k)))

Thus, for e[i]:

    Ex(MEM(+(e.unEx, ×(i.unEx, CONST w))))

i is index expression and w is word size – all values are word-sized
(scalar) in Tiger.

Note: must first check array index 0 ≤ i < size(e); runtime will put
size in word preceding array base.

Tiger record variables: Again, records are pointers to record base, so
fetch like other variables. For e.f:

    Ex(MEM(+(e.unEx, CONST o)))

where o is the byte offset of the field f in the record.

Note: must check record pointer is non-nil (i.e., non-zero).

String literals: Statically allocated, so just use the string's label:

    Ex(NAME(label))

where the literal will be emitted as, e.g.:

    label:  .word 11
            .ascii "hello world"

Record creation: T{f1 = e1, f2 = e2, ..., fn = en}: in the (preferably
GC'd) heap, first allocate the space then initialize it:

    Ex(ESEQ(SEQ(MOVE(TEMP r, externalCall("allocRecord", [CONST n])),
            SEQ(MOVE(MEM(TEMP r), e1.unEx),
            SEQ(...,
                MOVE(MEM(+(TEMP r, CONST (n−1)w)), en.unEx)))),
        TEMP r))

where w is the word size.

Array creation: t[e1] of e2:

    Ex(externalCall("initArray", [e1.unEx, e2.unEx]))

Control structures

Basic blocks:

• a sequence of straight-line code
• if one instruction executes then they all execute
• a maximal sequence of instructions without branches
• a label starts a new basic block

Overview of control structure translation:

• control flow links up the basic blocks
• ideas are simple
• implementation requires bookkeeping
• some care is needed for good code

while loops

while c do s:

1. evaluate c
2. if false jump to next statement after loop
3. if true fall into loop body
4. branch to top of loop

e.g.,

    test:
        if not(c) jump done
        s
        jump test
    done:

    Nx(SEQ(SEQ(SEQ(LABEL test, c.unCx(body, done)),
           SEQ(SEQ(LABEL body, s.unNx), JUMP(NAME test))),
       LABEL done))

repeat e1 until e2 ⇒ evaluate/compare/branch at bottom of loop
for loops

for i := e1 to e2 do s

1. evaluate lower bound into index variable
2. evaluate upper bound into limit variable
3. if index > limit jump to next statement after loop
4. fall through to loop body
5. increment index
6. if index ≤ limit jump to top of loop body

        t1 := e1
        t2 := e2
        if t1 > t2 jump done
    body:
        s
        t1 := t1 + 1
        if t1 ≤ t2 jump body
    done:

For break statements (a sketch of this bookkeeping appears after the
next slide):

• when translating a loop push the done label on some stack
• break simply jumps to label on top of stack
• when done translating loop and its body, pop the label

Function calls

f(e1, ..., en):

    Ex(CALL(NAME label_f, [sl, e1, ..., en]))

where sl is the static link for the callee f, found by following n
static links from the caller, n being the difference between the levels
of the caller and the callee.
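
A minimal sketch of the break-label bookkeeping described above; the
class and method names are illustrative, and the translator is reduced
to just the label stack:

    import java.util.ArrayDeque;
    import java.util.Deque;

    class LoopTranslator {
      // "done" labels of the loops being translated, innermost on top
      private final Deque<String> breakLabels = new ArrayDeque<>();

      void translateLoop(Runnable translateBody, String doneLabel) {
        breakLabels.push(doneLabel);   // entering the loop
        translateBody.run();           // body may translate nested loops/breaks
        breakLabels.pop();             // leaving the loop
      }

      String breakTarget() {           // caller emits JUMP(NAME target)
        if (breakLabels.isEmpty()) throw new Error("break outside any loop");
        return breakLabels.peek();
      }
    }
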

Comparisons

Translate a op b as:

    RelCx(op, a.unEx, b.unEx)

When used as a conditional, unCx(t, f) yields:

    CJUMP(op, a.unEx, b.unEx, t, f)

where t and f are labels.

When used as a value, unEx yields:

    ESEQ(SEQ(MOVE(TEMP r, CONST 1),
             SEQ(unCx(t, f),
                 SEQ(LABEL f,
                     SEQ(MOVE(TEMP r, CONST 0), LABEL t)))),
         TEMP r)

Conditionals

The short-circuiting Boolean operators have already been transformed
into if-expressions in Tiger abstract syntax:

e.g., x < 5 & a > b turns into if x < 5 then a > b else 0

Translate if e1 then e2 else e3 into: IfThenElseExp(e1, e2, e3)

When used as a value, unEx yields:

    ESEQ(SEQ(SEQ(e1.unCx(t, f),
                 SEQ(SEQ(LABEL t,
                         SEQ(MOVE(TEMP r, e2.unEx),
                             JUMP join)),
                     SEQ(LABEL f,
                         SEQ(MOVE(TEMP r, e3.unEx),
                             JUMP join)))),
             LABEL join),
         TEMP r)

As a conditional, unCx(t, f) yields:

    SEQ(e1.unCx(tt, ff),
        SEQ(SEQ(LABEL tt, e2.unCx(t, f)),
            SEQ(LABEL ff, e3.unCx(t, f))))
Conditionals: Example

Applying unCx(t, f) to if x < 5 then a > b else 0:

    SEQ(CJUMP(LT, x.unEx, CONST 5, tt, ff),
        SEQ(SEQ(LABEL tt, CJUMP(GT, a.unEx, b.unEx, t, f)),
            SEQ(LABEL ff, JUMP f)))

or more optimally:

    SEQ(CJUMP(LT, x.unEx, CONST 5, tt, f),
        SEQ(LABEL tt, CJUMP(GT, a.unEx, b.unEx, t, f)))

One-dimensional fixed arrays

    var A : array [2..5] of integer;
    ...
    A[e]

translates to:

    MEM(+(TEMP fp, +(CONST k − 2w, ×(CONST w, e.unEx))))

where k is offset of static array from fp, w is word size.

In Pascal, multidimensional arrays are treated as arrays of arrays, so
A[i,j] is equivalent to A[i][j], so can translate as above.

Multidimensional arrays

Array allocation:

• constant bounds
  – allocate in static area, stack, or heap
  – no run-time descriptor is needed
• dynamic arrays: bounds fixed at run-time
  – allocate in stack or heap
  – descriptor is needed
• dynamic arrays: bounds can change at run-time
  – allocate in heap
  – descriptor is needed

Multidimensional arrays

Array layout:

• Contiguous:
  1. Row major
     Rightmost subscript varies most quickly:
         A[1,1], A[1,2], ...
         A[2,1], A[2,2], ...
     Used in PL/1, Algol, Pascal, C, Ada, Modula-3
  2. Column major
     Leftmost subscript varies most quickly:
         A[1,1], A[2,1], ...
         A[1,2], A[2,2], ...
     Used in FORTRAN
• By vectors
  Contiguous vector of pointers to (non-contiguous) subarrays
Multi-dimensional arrays: row-major layout

    A : array [L1..U1] of array [L2..U2] of ... array [Ln..Un] of elt

no. of elt's in dimension j:

    Dj = Uj − Lj + 1

position of A[i1, ..., in]:

      (in − Ln)
    + (in−1 − Ln−1)·Dn
    + (in−2 − Ln−2)·Dn·Dn−1
    + ...
    + (i1 − L1)·D2·...·Dn

which can be rewritten as

    variable part:
        i1·D2·...·Dn + i2·D3·...·Dn + ... + in−1·Dn + in
    constant part:
        L1·D2·...·Dn + L2·D3·...·Dn + ... + Ln−1·Dn + Ln

address of A[i1, ..., in]:

    address(A) + ((variable part − constant part) × element size)

case statements

case E of V1: S1 ... Vn: Sn end

1. evaluate the expression
2. find value in case list equal to value of expression
3. execute statement associated with value found
4. jump to next statement after case

Key issue: finding the right case

• sequence of conditional jumps (small case set)
  O(|cases|)
• binary search of an ordered jump table (sparse case set)
  O(log2 |cases|)
• hash table (dense case set)
  O(1)
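
As a quick check of the address formula, take a hypothetical array
A : array [1..2] of array [1..3] of integer with 4-byte elements, so
D2 = 3. For A[2,3]:

    variable part = i1·D2 + i2 = 2·3 + 3 = 9
    constant part = L1·D2 + L2 = 1·3 + 1 = 4
    address(A[2,3]) = address(A) + (9 − 4) × 4 = address(A) + 20

In row-major order A[2,3] is the sixth element, so its offset should
indeed be 5 × 4 = 20.
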

case statements

case E of V1: S1 ... Vn: Sn end

One translation approach:

        t := expr
        jump test
    L1: code for S1
        jump next
    L2: code for S2
        jump next
        ...
    Ln: code for Sn
        jump next
    test:
        if t = V1 jump L1
        if t = V2 jump L2
        ...
        if t = Vn jump Ln
        code to raise run-time exception
    next:

Simplification

• Goal 1: No SEQ or ESEQ.
• Goal 2: CALL can only be subtree of EXP(...) or MOVE(TEMP t, ...).

Transformations:

• lift ESEQs up tree until they can become SEQs
• turn SEQs into linear list

    ESEQ(s1, ESEQ(s2, e))             = ESEQ(SEQ(s1, s2), e)
    BINOP(op, ESEQ(s, e1), e2)        = ESEQ(s, BINOP(op, e1, e2))
    MEM(ESEQ(s, e1))                  = ESEQ(s, MEM(e1))
    JUMP(ESEQ(s, e1))                 = SEQ(s, JUMP(e1))
    CJUMP(op, ESEQ(s, e1), e2, l1, l2)
                                      = SEQ(s, CJUMP(op, e1, e2, l1, l2))
    BINOP(op, e1, ESEQ(s, e2))        = ESEQ(MOVE(TEMP t, e1),
                                             ESEQ(s, BINOP(op, TEMP t, e2)))
    CJUMP(op, e1, ESEQ(s, e2), l1, l2)
                                      = SEQ(MOVE(TEMP t, e1),
                                            SEQ(s, CJUMP(op, TEMP t, e2, l1, l2)))
    MOVE(ESEQ(s, e1), e2)             = SEQ(s, MOVE(e1, e2))
    CALL(f, a)                        = ESEQ(MOVE(TEMP t, CALL(f, a)), TEMP t)
Chapter 8: Liveness Analysis and Register Allocation

Register allocation

    IR → instruction selection → register allocation → machine code
                                        ↓
                                      errors

Register allocation:

• have value in a register when used
• limited resources
• changes instruction choices
• can move loads and stores
• optimal allocation is difficult
  – NP-complete for k ≥ 1 registers

Liveness analysis

Problem:

• IR contains an unbounded number of temporaries
• machine has bounded number of registers

Approach:

• temporaries with disjoint live ranges can map to same register
• if not enough registers then spill some temporaries (i.e., keep them
  in memory)

The compiler must perform liveness analysis for each temporary:

    It is live if it holds a value that may be needed in future

Control flow analysis

Before performing liveness analysis, need to understand the control
flow by building a control flow graph (CFG):

• nodes may be individual program statements or basic blocks
• edges represent potential flow of control

Out-edges from node n lead to successor nodes, succ(n)

In-edges to node n come from predecessor nodes, pred(n)

Example:

        a := 0
    L1: b := a + 1
        c := c + b
        a := b × 2
        if a < N goto L1
        return c
Liveness analysis

Gathering liveness information is a form of data flow analysis
operating over the CFG:

• liveness of variables "flows" around the edges of the graph
• assignments define a variable, v:
  – def(v) = set of graph nodes that define v
  – def(n) = set of variables defined by n
• occurrences of v in expressions use it:
  – use(v) = set of nodes that use v
  – use(n) = set of variables used in n

Liveness: v is live on edge e if there is a directed path from e to a
use of v that does not pass through any def(v)

• v is live-in at node n if live on any of n's in-edges
• v is live-out at n if live on any of n's out-edges
• v ∈ use(n) ⇒ v live-in at n
• v live-in at n ⇒ v live-out at all m ∈ pred(n)
• v live-out at n, v ∉ def(n) ⇒ v live-in at n

Liveness analysis

Define:

    in(n):  variables live-in at n
    out(n): variables live-out at n

Then:

    out(n) = ∪ { in(s) | s ∈ succ(n) }
    succ(n) = ∅ ⇒ out(n) = ∅

Note:

    in(n) ⊇ use(n)
    in(n) ⊇ out(n) − def(n)

use(n) and def(n) are constant (independent of control flow)

Now, v ∈ in(n) iff. v ∈ use(n) or v ∈ out(n) − def(n)

Thus, in(n) = use(n) ∪ (out(n) − def(n))
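
Applying these equations by hand to the six-statement example CFG above
(node 5, the conditional branch, has successors 2 and 6) reaches the
following fixed point; this worked solution is added here purely as an
illustration:

    node              use    def   in     out
    1  a := 0                a     c      a,c
    2  b := a + 1     a      b     a,c    b,c
    3  c := c + b     b,c    c     b,c    b,c
    4  a := b × 2     b      a     b,c    a,c
    5  if a < N       a            a,c    a,c
    6  return c       c            c      ∅

Note that c is live-in at node 1: it is used (at node 3) before it is
ever defined, so it must be live on entry to the whole fragment.
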

Iterative solution for liveness

    foreach n { in(n) := ∅; out(n) := ∅ }
    repeat
      foreach n
        in′(n) := in(n); out′(n) := out(n)
        in(n)  := use(n) ∪ (out(n) − def(n))
        out(n) := ∪ { in(s) | s ∈ succ(n) }
    until in′(n) = in(n) and out′(n) = out(n) for all n

Notes:

• should order computation of inner loop to follow the "flow"
• liveness flows backward along control-flow arcs, from out to in
• nodes can just as easily be basic blocks to reduce CFG size
• could do one variable at a time, from uses back to defs, noting
  liveness along the way

Iterative solution for liveness

Complexity: for input program of size N

• ≤ N nodes in CFG
  ⇒ ≤ N variables
  ⇒ N elements per in/out
  ⇒ O(N) time per set-union
• for loop performs constant number of set operations per node
  ⇒ O(N²) time for for loop
• each iteration of repeat loop can only add to each set
  (sets can contain at most every variable)
  ⇒ sizes of all in and out sets sum to 2N²,
    bounding the number of iterations of the repeat loop
• worst-case complexity of O(N⁴)
• ordering can cut repeat loop down to 2-3 iterations
  ⇒ O(N) or O(N²) in practice
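
A minimal Java sketch of the iterative algorithm, with variables
numbered 0..V−1 and one BitSet per node; the CFG arrays are assumed to
be built elsewhere:

    import java.util.BitSet;

    class Liveness {
      int n;                  // number of CFG nodes
      int[][] succ;           // succ[i] = successor node indices
      BitSet[] use, def;      // per-node use/def sets over variables
      BitSet[] in, out;

      void solve() {
        in = new BitSet[n]; out = new BitSet[n];
        for (int i = 0; i < n; i++) { in[i] = new BitSet(); out[i] = new BitSet(); }
        boolean changed = true;
        while (changed) {                           // repeat until fixed point
          changed = false;
          for (int i = n - 1; i >= 0; i--) {        // reverse order follows the flow
            BitSet newOut = new BitSet();
            for (int s : succ[i]) newOut.or(in[s]); // out(n) = union of in(s)
            BitSet newIn = (BitSet) newOut.clone();
            newIn.andNot(def[i]);                   // out(n) − def(n)
            newIn.or(use[i]);                       // in(n) = use(n) ∪ ...
            if (!newIn.equals(in[i]) || !newOut.equals(out[i])) changed = true;
            in[i] = newIn; out[i] = newOut;
          }
        }
      }
    }
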
Iterative solution for liveness

Least fixed points

There is often more than one solution for a given dataflow problem (see
example).

Any solution to dataflow equations is a conservative approximation:

• v has some later use downstream from n ⇒ v ∈ out(n)
• but not the converse

Conservatively assuming a variable is live does not break the program;
it just means more registers may be needed.

Assuming a variable is dead when it is really live will break things.

May be many possible solutions but want the "smallest": the least
fixpoint.

The iterative liveness computation computes this least fixpoint.

Register allocation by simplification

1. Build interference graph G: for each program point
   (a) compute set of temporaries simultaneously live
   (b) add edge to graph for each pair in set
2. Simplify: color graph using a simple heuristic
   (a) suppose G has node m with degree < K
   (b) if G′ = G − {m} can be colored then so can G, since nodes
       adjacent to m have at most K − 1 colors
   (c) each such simplification will reduce degree of remaining nodes,
       leading to more opportunity for simplification
   (d) leads to recursive coloring algorithm
3. Spill: suppose no node of degree < K
   (a) target some node (temporary) for spilling (optimistically,
       spilling node will allow coloring of remaining nodes)
   (b) remove and continue simplifying

Register allocation by simplification (continued)

4. Select: assign colors to nodes
   (a) start with empty graph
   (b) if adding non-spill node there must be a color for it, as that
       was the basis for its removal
   (c) if adding a spill node and no color available (neighbors already
       K-colored) then mark as an actual spill
   (d) repeat select
5. Start over: if select has no actual spills then finished, otherwise
   (a) rewrite program to fetch actual spills before each use and store
       after each definition
   (b) recalculate liveness and repeat
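
A compact Java sketch of the simplify and select phases over an
adjacency-set interference graph; coalescing and the spill path are
omitted, and the graph representation is illustrative:

    import java.util.*;

    class ColorBySimplification {
      Map<Integer, Set<Integer>> adj;   // adj.get(v) = neighbors of temporary v
      int K;                            // number of machine registers

      Map<Integer, Integer> color() {   // null if a spill would be needed
        Map<Integer, Set<Integer>> g = copy(adj);
        Deque<Integer> stack = new ArrayDeque<>();
        while (!g.isEmpty()) {          // simplify: remove low-degree nodes
          Integer m = null;
          for (Integer v : g.keySet())
            if (g.get(v).size() < K) { m = v; break; }
          if (m == null) return null;   // stuck: would have to pick a spill
          for (Integer w : g.get(m)) g.get(w).remove(m);
          g.remove(m);
          stack.push(m);
        }
        Map<Integer, Integer> color = new HashMap<>();
        while (!stack.isEmpty()) {      // select: rebuild, assigning colors
          int v = stack.pop();
          BitSet used = new BitSet(K);
          for (Integer w : adj.get(v))
            if (color.containsKey(w)) used.set(color.get(w));
          color.put(v, used.nextClearBit(0)); // < K, by the simplify invariant
        }
        return color;
      }

      private static Map<Integer, Set<Integer>> copy(Map<Integer, Set<Integer>> g) {
        Map<Integer, Set<Integer>> c = new HashMap<>();
        for (Map.Entry<Integer, Set<Integer>> e : g.entrySet())
          c.put(e.getKey(), new HashSet<>(e.getValue()));
        return c;
      }
    }
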
Coalescing

• Can delete a move instruction when source s and destination d do not
  interfere:
  – coalesce them into a new node whose edges are the union of those
    of s and d
• In principle, any pair of non-interfering nodes can be coalesced
  – unfortunately, the union is more constrained and the new graph may
    no longer be K-colorable
  – overly aggressive

Simplification with aggressive coalescing

    [flow diagram: build → simplify ⇄ coalesce (each applied to any
     candidate, in any order, until neither applies) → spill (any node,
     then back to simplify) → select → done]

Conservative coalescing

Apply tests for coalescing that preserve colorability.

Suppose a and b are candidates for coalescing into node ab.

Briggs: coalesce only if ab has < K neighbors of significant degree
(i.e., degree ≥ K)

• simplify will first remove all insignificant-degree neighbors
• ab will then be adjacent to < K neighbors
• simplify can then remove ab

George: coalesce only if all significant-degree neighbors of a already
interfere with b

• simplify can remove all insignificant-degree neighbors of a
• remaining significant-degree neighbors of a already interfere with
  b, so coalescing does not increase the degree of any node

(A sketch of the Briggs test follows below.)

Iterated register coalescing

Interleave simplification with coalescing to eliminate most moves
while avoiding extra spills:

1. Build interference graph G; distinguish move-related from
   non-move-related nodes
2. Simplify: remove non-move-related nodes of low degree one at a time
3. Coalesce: conservatively coalesce move-related nodes
   • remove associated move instruction
   • if resulting node is non-move-related it can now be simplified
   • repeat simplify and coalesce until only significant-degree or
     uncoalesced moves remain
4. Freeze: if unable to simplify or coalesce
   (a) look for move-related node of low degree
   (b) freeze its associated moves (give up hope of coalescing them)
   (c) now treat it as non-move-related and resume iteration of
       simplify and coalesce
5. Spill: if no low-degree nodes
   (a) select candidate for spilling
   (b) remove to stack and continue simplifying
6. Select: pop stack assigning colors (including actual spills)
7. Start over: if select has no actual spills then finished, otherwise
   (a) rewrite code to fetch actual spills before each use and store
       after each definition
   (b) recalculate liveness and repeat
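
A sketch of the Briggs test on the same adjacency-set representation
used above; one subtlety worth a comment is that a neighbor of both a
and b loses a degree when the two are merged:

    import java.util.HashSet;
    import java.util.Map;
    import java.util.Set;

    class BriggsTest {
      // Safe to coalesce a and b if the combined node ab would have
      // fewer than K neighbors of significant degree (>= K).
      static boolean canCoalesce(Map<Integer, Set<Integer>> adj,
                                 int a, int b, int K) {
        Set<Integer> neighbors = new HashSet<>(adj.get(a));
        neighbors.addAll(adj.get(b));
        neighbors.remove(a);
        neighbors.remove(b);
        int significant = 0;
        for (Integer w : neighbors) {
          int degree = adj.get(w).size();
          // w adjacent to both a and b: merging lowers w's degree by one
          if (adj.get(w).contains(a) && adj.get(w).contains(b)) degree--;
          if (degree >= K) significant++;
        }
        return significant < K;
      }
    }
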
Spilling

• Spills require repeating build and simplify on the whole program
• To avoid increasing the number of spills in future rounds of build,
  can simply discard coalescences
• Alternatively, preserve coalescences from before the first potential
  spill, discard those after that point
• Move-related spilled temporaries can be aggressively coalesced,
  since (unlike registers) there is no limit on the number of
  stack-frame locations

Iterated register coalescing

    [flow diagram: (optional SSA constant propagation) → build →
     simplify ⇄ conservative coalesce ⇄ freeze → potential spill →
     select → if there are actual spills, insert spill code and start
     over at build; otherwise done]

Precolored nodes

Precolored nodes correspond to machine registers (e.g., stack pointer,
arguments, return address, return value)

• select and coalesce can give an ordinary temporary the same color as
  a precolored register, if they don't interfere
• e.g., argument registers can be reused inside procedures for a
  temporary
• simplify, freeze and spill cannot be performed on them
• also, precolored nodes interfere with other precolored nodes

So, treat precolored nodes as having infinite degree.

This also avoids needing to store large adjacency lists for precolored
nodes; coalescing can use the George criterion.

Temporary copies of machine registers

Since precolored nodes don't spill, their live ranges must be kept
short:

1. use move instructions
2. move callee-save registers to fresh temporaries on procedure entry,
   and back on exit, spilling between as necessary
3. register pressure will spill the fresh temporaries as necessary;
   otherwise they can be coalesced with their precolored counterpart
   and the moves deleted
Caller-save and callee-save registers

Variables whose live ranges span calls should go to callee-save
registers, otherwise to caller-save.

This is easy for graph coloring allocation with spilling:

• calls interfere with caller-save registers
• a cross-call variable interferes with all precolored caller-save
  registers, as well as with the fresh temporaries created for
  callee-save copies, forcing a spill
• choose nodes with high degree but few uses, to spill the fresh
  callee-save temporary instead of the cross-call variable
• this makes the original callee-save register available for coloring
  the cross-call variable

Example

    enter:  c := r3
            a := r1
            b := r2
            d := 0
            e := a
    loop:   d := d + b
            e := e − 1
            if e > 0 goto loop
            r1 := d
            r3 := c
            return          [r1, r3 live out]

Temporaries are a, b, c, d, e.

Assume target machine with K = 3 registers: r1, r2
(caller-save/argument/result), r3 (callee-save).

The code generator has already made arrangements to save r3 explicitly
by copying into temporary c and back again.

Example (cont.)

Interference graph:

    [figure: interference graph over r1, r2, r3 and a, b, c, d, e;
     dashed move edges connect c–r3, a–r1, b–r2, e–a and d–r1]

• No opportunity for simplify or freeze (all non-precolored nodes have
  significant degree ≥ K)
• Any coalesce will produce a new node adjacent to ≥ K
  significant-degree nodes
• Must spill based on priorities
  (priority = (uses+defs outside loop + 10 × uses+defs inside loop)
   / degree):

    Node   uses+defs      uses+defs     degree   priority
           outside loop   inside loop
    a      2              0             4        0.50
    b      1              1             4        2.75
    c      2              0             6        0.33
    d      2              2             4        5.50
    e      1              3             3        10.33

Node c has lowest priority so spill it.

Example (cont.)

Interference graph with c removed:

    [figure: same graph without c]

• Only possibility is to coalesce a and e: ae will have < K
  significant-degree neighbors (after coalescing d will be low-degree,
  though high-degree before)

    [figure: graph with a and e coalesced into ae]
Example (cont.)

• Can now coalesce b with r2 (or coalesce ae and r1):

    [figure: graph after coalescing b and r2]

• Coalescing ae and r1 (could also coalesce b with r2):

    [figure: graph after coalescing ae and r1]

Example (cont.)

• Cannot coalesce r1ae with d because the move is constrained: the
  nodes interfere. Must simplify d:

    [figure: graph with d removed]

• Graph now has only precolored nodes, so pop nodes from stack,
  coloring along the way
  – d gets r3
  – a, b, e have colors by coalescing
  – c must spill since no color can be found for it
• Introduce new temporaries c1 and c2 for each use/def, add loads
  before each use and stores after each def

Example (cont.)

The rewritten program with spill code:

    enter:  c1 := r3
            M[c_loc] := c1
            a := r1
            b := r2
            d := 0
            e := a
    loop:   d := d + b
            e := e − 1
            if e > 0 goto loop
            r1 := d
            c2 := M[c_loc]
            r3 := c2
            return          [r1, r3 live out]

Example (cont.)

New interference graph:

    [figure: interference graph with c replaced by c1 and c2]

Coalesce c1 with r3, then c2 with r3:

    [figure: graph with combined node r3c1c2]

As before, coalesce a with e, then b with r2:

    [figure: graph with r1, r2b, r3c1c2, ae and d]
Example (cont.)

As before, coalesce ae with r1 and simplify d:

    [figure: only r1ae, r2b and r3c1c2 remain]

Pop d from stack: select r3. All other nodes were coalesced or
precolored. So, the coloring is:

    a → r1
    b → r2
    c → r3
    d → r3
    e → r1

Example (cont.)

Rewrite the program with this assignment:

    enter:  r3 := r3
            M[c_loc] := r3
            r1 := r1
            r2 := r2
            r3 := 0
            r1 := r1
    loop:   r3 := r3 + r2
            r1 := r1 − 1
            if r1 > 0 goto loop
            r1 := r3
            r3 := M[c_loc]
            r3 := r3
            return

Example (cont.)

Delete moves with source and destination the same (coalesced):

    enter:  M[c_loc] := r3
            r3 := 0
    loop:   r3 := r3 + r2
            r1 := r1 − 1
            if r1 > 0 goto loop
            r1 := r3
            r3 := M[c_loc]
            return

One uncoalesced move remains (r1 := r3, whose source and destination
interfere).

Chapter 9: Activation Records
The procedure abstraction

Separate compilation:

• allows us to build large programs
• keeps compile times reasonable
• requires independent procedures

The linkage convention:

• a social contract
• machine dependent
• division of responsibility

The linkage convention ensures that procedures inherit a valid run-time
environment and that they restore one for their parents.

• Linkages execute at run time
• Code to make the linkage is generated at compile time

The procedure abstraction

The essentials:

• on entry, establish p's environment
• at a call, preserve p's environment
• on exit, tear down p's environment
• in between, addressability and proper lifetimes

    [figure: procedure P and procedure Q side by side, each with a
     prologue and an epilogue; P's pre-call and post-call sequences
     bracket the call into Q]

Each system has a standard linkage.

Procedure linkages

Assume that each procedure activation has an associated activation
record or frame (at run time).

Assumptions:

• RISC architecture
• can always expand an allocated block
• locals stored in frame

    [figure: frame layout — in the previous (caller's) frame, at higher
     addresses, the incoming arguments (argument n, ..., argument 2,
     argument 1); the frame pointer marks the current frame, which
     holds local variables, return address, temporaries and saved
     registers; below them the outgoing arguments (argument m, ...,
     argument 2, argument 1) end at the stack pointer, where the next
     frame begins; lower addresses toward the bottom]

Procedure linkages

The linkage divides responsibility between caller and callee:

            Caller                         Callee

    Call    pre-call                       prologue
            1. allocate basic frame        1. save registers, state
            2. evaluate & store params.    2. store FP (dynamic link)
            3. store return address        3. set new FP
            4. jump to child               4. store static link
                                           5. extend basic frame
                                              (for local data)
                                           6. initialize locals
                                           7. fall through to code

    Return  post-call                      epilogue
            1. copy return value           1. store return value
            2. deallocate basic frame      2. restore state
            3. restore parameters          3. cut back to basic frame
               (if copy out)               4. restore parent's FP
                                           5. jump to return address

At compile time, generate the code to do this.

At run time, that code manipulates the frame & data areas.
Run-time storage organization

To maintain the illusion of procedures, the compiler can adopt some
conventions to govern memory use.

Code space

• fixed size
• statically allocated (link time)

Data space

• fixed-sized data may be statically allocated
• variable-sized data must be dynamically allocated
• some data is dynamically allocated in code

Control stack

• dynamic slice of activation tree
• return addresses
• may be implemented in hardware

Run-time storage organization

Typical memory layout:

    high address
        stack
          ↓
        free memory
          ↑
        heap
        static data
        code
    low address

The classical scheme:

• allows both stack and heap maximal freedom
• code and static data may be separate or intermingled

Run-time storage organization

Where do local variables go?

When can we allocate them on a stack?

Key issue is lifetime of local names.

Downward exposure:

• called procedures may reference my variables
• dynamic scoping
• lexical scoping

Upward exposure:

• can I return a reference to my variables?
• functions that return functions
• continuation-passing style

With only downward exposure, the compiler can allocate the frames on
the run-time call stack.

Storage classes

Each variable must be assigned a storage class (base address).

Static variables:

• addresses compiled into code (relocatable)
• (usually) allocated at compile-time
• limited to fixed size objects
• control access with naming scheme

Global variables:

• almost identical to static variables
• layout may be important (exposed)
• naming scheme ensures universal access

Link editor must handle duplicate definitions.
Storage classes (cont.)

Procedure local variables

Put them on the stack —

• if sizes are fixed
• if lifetimes are limited
• if values are not preserved

Dynamically allocated variables

Must be treated differently —

• call-by-reference, pointers, lead to non-local lifetimes
• (usually) an explicit allocation
• explicit or implicit deallocation

Access to non-local data

How does the code find non-local data at run-time?

Real globals

• visible everywhere
• naming convention gives an address
• initialization requires cooperation

Lexical nesting

• view variables as (level, offset) pairs (compile-time)
• chain of non-local access links
• more expensive to find (at run-time)

Access to non-local data

Two important problems arise:

• How do we map a name into a (level, offset) pair?
  Use a block-structured symbol table (remember last lecture?)
  – look up a name, want its most recent declaration
  – declaration may be at current level or any lower level
• Given a (level, offset) pair, what's the address?
  Two classic approaches:
  – access links (or static links)
  – displays

Access to non-local data

To find the value specified by (l, o):

• need current procedure level, k
• k = l ⇒ local value
• k > l ⇒ find l's activation record
• k < l cannot occur

Maintaining access links (static links):

• calling level k + 1 procedure:
  1. pass my FP as access link
  2. my backward chain will work for lower levels
• calling procedure at level l ≤ k:
  1. find link to level l − 1 and pass it
  2. its access link will work for lower levels
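
A minimal sketch of the (l, o) address computation by chasing access
links; mem(), the flat memory array, and STATIC_LINK_OFFSET are
hypothetical stand-ins for the target machine's load and the link's
slot in the frame:

    class AccessLinks {
      static int[] memory = new int[1 << 20];     // toy flat memory
      static final int STATIC_LINK_OFFSET = 0;    // assumed slot of the access link

      static int mem(int addr) { return memory[addr]; }

      // Address of the variable at (level l, offset o), starting from the
      // frame pointer fp of the current procedure, which is at level k.
      static int addressOf(int l, int o, int k, int fp) {
        if (k < l) throw new Error("cannot occur: k < l");
        while (k > l) {                           // follow k − l access links
          fp = mem(fp + STATIC_LINK_OFFSET);
          k--;
        }
        return fp + o;                            // k == l: local to that frame
      }
    }
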
The display

To improve run-time access costs, use a display:

• table of access links for lower levels
• lookup is index from known offset
• takes slight amount of time at call
• a single display or one per frame
• for level k procedure, need k − 1 slots

Access with the display: assume a value described by (l, o)

• find slot as display[l]
• add offset to pointer from slot (display[l][o])

"Setting up the basic frame" now includes display manipulation.

Calls: Saving and restoring registers

                      caller's registers   callee's registers   all registers
    callee saves      1                    3                    5
    caller saves      2                    4                    6

1. Call includes bitmap of caller's registers to be saved/restored
   (best with save/restore instructions to interpret bitmap directly)
2. Caller saves and restores its own registers
   Unstructured returns (e.g., non-local gotos, exceptions) create some
   problems, since code to restore must be located and executed
3. Backpatch code to save registers used in callee on entry, restore on
   exit
   e.g., VAX places bitmap in callee's stack frame for use on
   call/return/non-local goto/exception
   Non-local gotos and exceptions must unwind dynamic chain restoring
   callee-saved registers
4. Bitmap in callee's stack frame is used by caller to save/restore
   (best with save/restore instructions to interpret bitmap directly)
   Unwind dynamic chain as for 3
5. Easy
   Non-local gotos and exceptions must restore all registers from
   "outermost callee"
6. Easy (use utility routine to keep calls compact)
   Non-local gotos and exceptions need only restore original registers
   from caller

Top-left is best: saves fewer registers, compact calling sequences.

Call/return

Assuming callee saves:

1. caller pushes space for return value
2. caller pushes SP
3. caller pushes space for: return address, static chain, saved
   registers
4. caller evaluates and pushes actuals onto stack
5. caller sets return address, callee's static chain, performs call
6. callee saves registers in register-save area
7. callee copies by-value arrays/records using addresses passed as
   actuals
8. callee allocates dynamic arrays as needed
9. on return, callee restores saved registers
10. jumps to return address

Caller must allocate much of stack frame, because it computes the
actual parameters.

Alternative is to put actuals below callee's stack frame in caller's:
common when hardware supports stack management (e.g., VAX).

MIPS procedure call convention

Registers:

    Number   Name     Usage
    0        zero     Constant 0
    1        at       Reserved for assembler
    2, 3     v0, v1   Expression evaluation, scalar function results
    4–7      a0–a3    First 4 scalar arguments
    8–15     t0–t7    Temporaries, caller-saved; caller must save to
                      preserve across calls
    16–23    s0–s7    Callee-saved; must be preserved across calls
    24, 25   t8, t9   Temporaries, caller-saved; caller must save to
                      preserve across calls
    26, 27   k0, k1   Reserved for OS kernel
    28       gp       Pointer to global area
    29       sp       Stack pointer
    30       s8 (fp)  Callee-saved; must be preserved across calls
    31       ra       Expression evaluation, pass return address in
                      calls
MIPS procedure call convention

Philosophy:

Use full, general calling sequence only when necessary; omit portions
of it where possible (e.g., avoid using fp register whenever possible).

Classify routines as:

• non-leaf routines: routines that call other routines
• leaf routines: routines that do not themselves call other routines
  – leaf routines that require stack storage for locals
  – leaf routines that do not require stack storage for locals

MIPS procedure call convention

The stack frame:

    high memory
        argument n
        ...
        argument 1
    ← virtual frame pointer ($fp)
        static link              ─┐ frame offset
        locals                    │
        saved $ra                 │ framesize
        temporaries               │
        other saved registers     │
        argument build           ─┘
    ← stack pointer ($sp)
    low memory

MIPS procedure call convention

Pre-call:

1. Pass arguments: use registers a0 ... a3; remaining arguments are
   pushed on the stack along with save space for a0 ... a3
2. Save caller-saved registers if necessary
3. Execute a jal instruction: jumps to target address (callee's first
   instruction), saves return address in register ra

MIPS procedure call convention

Prologue:

1. Leaf procedures that use the stack and non-leaf procedures:
   (a) Allocate all stack space needed by routine:
       • local variables
       • saved registers
       • sufficient space for arguments to routines called by this
         routine

           subu $sp, framesize

   (b) Save registers (ra, etc.):

           sw $31, framesize+frameoffset($sp)
           sw $17, framesize+frameoffset-4($sp)
           sw $16, framesize+frameoffset-8($sp)

       where framesize and frameoffset (usually negative) are
       compile-time constants
2. Emit code for routine
MIPS procedure call convention

Epilogue:

1. Copy return values into result registers (if not already there)
2. Restore saved registers:

       lw $16, framesize+frameoffset-8($sp)
       lw $17, framesize+frameoffset-4($sp)

3. Get return address:

       lw $31, framesize+frameoffset($sp)

4. Clean up stack:

       addu $sp, framesize

5. Return:

       j $31
