Lecture7 AST

Syntax Tree

Abstract Syntax Tree


AST Processing

Compiling Techniques
Lecture 7: Abstract Syntax

Christophe Dubach

13 October 2015

Christophe Dubach Compiling Techniques



Table of contents

1 Syntax Tree
    Semantic Actions
    Examples
    Abstract Grammar
2 Abstract Syntax Tree
    Internal Representation
    AST Builder
3 AST Processing
    Object-Oriented Processing
    Visitor Processing


A parser does more than simply recognising syntax. It can:
- evaluate code (interpreter)
- emit code (simple compiler)
- build an internal representation of the program (multi-pass compiler)

In general, a parser performs semantic actions:
- recursive descent parser: integrate the actions with the parsing functions
- bottom-up parser (automatically generated): add actions to the grammar
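
As an illustration of actions integrated with the parsing functions, here is a minimal sketch of a recursive descent parser that evaluates an arithmetic expression while parsing it (the interpreter case). The class EvalParser, its helper at(), and the character-level scanning are assumptions made for this example; the grammar is given in the comments.

```java
// A minimal sketch: semantic actions embedded in a recursive descent parser.
// EvalParser and its helpers are illustrative names, not part of the lecture.
class EvalParser {
    private final String src;  // input with whitespace stripped
    private int pos = 0;       // index of the current character

    EvalParser(String input) { this.src = input.replaceAll("\\s+", ""); }

    // Expr ::= Term (('+' | '-') Term)*
    int parseExpr() {
        int value = parseTerm();
        while (at('+') || at('-')) {
            char op = src.charAt(pos++);
            int rhs = parseTerm();
            value = (op == '+') ? value + rhs : value - rhs;  // semantic action
        }
        return value;
    }

    // Term ::= Factor (('*' | '/') Factor)*
    int parseTerm() {
        int value = parseFactor();
        while (at('*') || at('/')) {
            char op = src.charAt(pos++);
            int rhs = parseFactor();
            value = (op == '*') ? value * rhs : value / rhs;  // semantic action
        }
        return value;
    }

    // Factor ::= number | '(' Expr ')'
    int parseFactor() {
        if (at('(')) {
            pos++;                                   // consume '('
            int value = parseExpr();
            if (!at(')')) throw new IllegalStateException("')' expected at " + pos);
            pos++;                                   // consume ')'
            return value;
        }
        int start = pos;
        while (pos < src.length() && Character.isDigit(src.charAt(pos))) pos++;
        if (start == pos) throw new IllegalStateException("number expected at " + pos);
        return Integer.parseInt(src.substring(start, pos));
    }

    // True if the current character is c (does not consume it).
    private boolean at(char c) { return pos < src.length() && src.charAt(pos) == c; }

    static int evaluate(String input) {
        EvalParser p = new EvalParser(input);
        int value = p.parseExpr();
        if (p.pos != p.src.length())
            throw new IllegalStateException("unexpected input at " + p.pos);
        return value;
    }

    public static void main(String[] args) {
        System.out.println(evaluate("3 * (4 + 5)"));  // prints 27
    }
}
```

Each parsing function returns the value of the phrase it recognised, so no tree is ever built. A multi-pass compiler would return a tree node from the same places instead.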


Syntax Tree

In a multi-pass compiler, the parser builds a syntax tree which is used by the subsequent passes.

A syntax tree can be:
- a concrete syntax tree (or parse tree) if it directly corresponds to the context-free grammar
- an abstract syntax tree if it corresponds to a simplified (or abstract) grammar

The abstract syntax tree (AST) is usually used in compilers.


Example: Concrete Syntax Tree (Parse Tree)


Example: CFG for arithmetic expressions (EBNF form)

    Expr   ::= Term (('+' | '-') Term)*
    Term   ::= Factor (('*' | '/') Factor)*
    Factor ::= number | '(' Expr ')'

After removal of EBNF syntax

    Expr    ::= Term Terms
    Terms   ::= ('+' | '-') Term Terms | ε
    Term    ::= Factor Factors
    Factors ::= ('*' | '/') Factor Factors | ε
    Factor  ::= number | '(' Expr ')'

After further simplification

    Expr   ::= Term (('+' | '-') Expr | ε)
    Term   ::= Factor (('*' | '/') Term | ε)
    Factor ::= number | '(' Expr ')'

Example: Concrete Syntax Tree (Parse Tree)

CFG for arithmetic expressions

    Expr   ::= Term (('+' | '-') Expr | ε)
    Term   ::= Factor (('*' | '/') Term | ε)
    Factor ::= number | '(' Expr ')'

Concrete Syntax Tree for 5 * 3

[Figure: concrete syntax tree for 5 * 3, rooted at Term, with subtrees deriving number '5', the token '*', and number '3'.]

The concrete syntax tree contains a lot of unnecessary information.

It is possible to simplify the concrete syntax tree to remove the redundant information. For instance, parentheses are not necessary, because the shape of the tree already encodes the grouping.

Exercise
1 Write the concrete syntax tree for 3 * (4 + 5)
2 Simplify the tree.


Abstract Grammar
These simplifications lead to a new, simpler context-free grammar called the abstract grammar.

Example: abstract grammar for arithmetic expressions

    Expr  ::= BinOp | intLiteral
    BinOp ::= Expr Op Expr
    Op    ::= add | sub | mul | div

Tree for 5 * 3:

    BinOp
      intLiteral(5)
      mul
      intLiteral(3)

This is called an Abstract Syntax Tree.



Example: abstract grammar for arithmetic expressions

    Expr  ::= BinOp | intLiteral
    BinOp ::= Expr Op Expr
    Op    ::= add | sub | mul | div

Note that for a given concrete grammar, there exist numerous abstract grammars:

    Expr  ::= AddOp | SubOp | MulOp | DivOp | intLiteral
    AddOp ::= Expr add Expr
    SubOp ::= Expr sub Expr
    MulOp ::= Expr mul Expr
    DivOp ::= Expr div Expr

We pick the most suitable grammar for the compiler.


Abstract Syntax Tree


The Abstract Syntax Tree (AST) forms the main intermediate representation of the compiler's front-end.

For each non-terminal or terminal in the abstract grammar, we define a class.
- If a non-terminal has several alternatives on the rhs (right-hand side), then the class is abstract (it cannot be instantiated).
- The terminals or non-terminals appearing as alternatives on the rhs are subclasses of the non-terminal on the lhs.
- The sub-trees are represented as instance variables in the class.
- Each non-abstract class has a unique constructor.
- If a terminal does not store any information, then we can use an enum type in Java instead of a class.


Example: abstract grammar for arithmetic expressions

    Expr  ::= BinOp | intLiteral
    BinOp ::= Expr Op Expr
    Op    ::= add | sub | mul | div

Corresponding Java Classes

    abstract class Expr { }

    class IntLiteral extends Expr {
        int i;
        IntLiteral(int i) { ... }
    }

    class BinOp extends Expr {
        Op op;
        Expr lhs;
        Expr rhs;
        BinOp(Op op, Expr lhs, Expr rhs) { ... }
    }

    enum Op { ADD, SUB, MUL, DIV }



CFG for arithmetic expressions

    Expr   ::= Term (('+' | '-') Expr | ε)
    Term   ::= Factor (('*' | '/') Term | ε)
    Factor ::= number | '(' Expr ')'

Current Parser (class)

    void parseExpr() {
        parseTerm();
        if (accept(PLUS | MINUS)) {
            nextToken();
            parseExpr();
        }
    }

    void parseTerm() {
        parseFactor();
        if (accept(TIMES | DIV)) {
            nextToken();
            parseTerm();
        }
    }

    void parseFactor() {
        if (accept(LPAR)) {
            nextToken();  // consume '(' (assuming accept() only tests the current token, as in parseExpr)
            parseExpr();
            expect(RPAR);
        } else {
            expect(NUMBER);
        }
    }

AST building (modified Parser)

Current Parser

    void parseExpr() {
        parseTerm();
        if (accept(PLUS | MINUS)) {
            nextToken();
            parseExpr();
        }
    }

Modified Parser

    Expr parseExpr() {
        Expr lhs = parseTerm();
        if (accept(PLUS | MINUS)) {
            Op op;
            if (token == PLUS)
                op = Op.ADD;
            else // token == MINUS
                op = Op.SUB;
            nextToken();
            Expr rhs = parseExpr();
            return new BinOp(op, lhs, rhs);
        }
        return lhs;
    }


AST building (modified Parser)

Current Parser

    void parseTerm() {
        parseFactor();
        if (accept(TIMES | DIV)) {
            nextToken();
            parseTerm();
        }
    }

Modified Parser

    Expr parseTerm() {
        Expr lhs = parseFactor();
        if (accept(TIMES | DIV)) {
            Op op;
            if (token == TIMES)
                op = Op.MUL;
            else // token == DIV
                op = Op.DIV;
            nextToken();
            Expr rhs = parseTerm();
            return new BinOp(op, lhs, rhs);
        }
        return lhs;
    }


AST building (modified Parser)

Current Parser

    void parseFactor() {
        if (accept(LPAR)) {
            nextToken();  // consume '(' (assuming accept() only tests the current token)
            parseExpr();
            expect(RPAR);
        } else {
            expect(NUMBER);
        }
    }

Modified Parser

    Expr parseFactor() {
        if (accept(LPAR)) {
            nextToken();  // consume '(' (assuming accept() only tests the current token)
            Expr e = parseExpr();
            expect(RPAR);
            return e;
        } else {
            IntLiteral il = parseNumber();
            return il;
        }
    }

    IntLiteral parseNumber() {
        Token n = expect(NUMBER);
        int i = Integer.parseInt(n.data);
        return new IntLiteral(i);
    }
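
Putting the three modified parsing functions together, the following is a self-contained sketch of an AST-building parser for the simplified grammar. The tokeniser is an assumption made to keep the example short: whitespace is stripped up front and each operator or parenthesis is a single character, so accept() and expect() work on characters rather than on Token objects. The class name AstParser and the static parse() helper are also hypothetical.

```java
// AST classes following the abstract grammar (constructors filled in for this sketch).
abstract class Expr { }

class IntLiteral extends Expr {
    final int i;
    IntLiteral(int i) { this.i = i; }
}

enum Op { ADD, SUB, MUL, DIV }

class BinOp extends Expr {
    final Op op;
    final Expr lhs, rhs;
    BinOp(Op op, Expr lhs, Expr rhs) { this.op = op; this.lhs = lhs; this.rhs = rhs; }
}

class AstParser {
    private final String src;  // input with whitespace stripped
    private int pos = 0;       // index of the current character

    AstParser(String input) { this.src = input.replaceAll("\\s+", ""); }

    // accept: test whether the current character is one of chars, without consuming it.
    private boolean accept(String chars) {
        return pos < src.length() && chars.indexOf(src.charAt(pos)) >= 0;
    }

    // expect: consume the given character or fail.
    private void expect(char c) {
        if (!accept(String.valueOf(c)))
            throw new IllegalStateException("expected '" + c + "' at position " + pos);
        pos++;
    }

    // Expr ::= Term (('+' | '-') Expr | ε)
    Expr parseExpr() {
        Expr lhs = parseTerm();
        if (accept("+-")) {
            Op op = (src.charAt(pos) == '+') ? Op.ADD : Op.SUB;
            pos++;  // plays the role of nextToken()
            Expr rhs = parseExpr();
            return new BinOp(op, lhs, rhs);
        }
        return lhs;
    }

    // Term ::= Factor (('*' | '/') Term | ε)
    Expr parseTerm() {
        Expr lhs = parseFactor();
        if (accept("*/")) {
            Op op = (src.charAt(pos) == '*') ? Op.MUL : Op.DIV;
            pos++;
            Expr rhs = parseTerm();
            return new BinOp(op, lhs, rhs);
        }
        return lhs;
    }

    // Factor ::= number | '(' Expr ')'
    Expr parseFactor() {
        if (accept("(")) {
            pos++;  // consume '('
            Expr e = parseExpr();
            expect(')');
            return e;
        }
        return parseNumber();
    }

    IntLiteral parseNumber() {
        int start = pos;
        while (pos < src.length() && Character.isDigit(src.charAt(pos))) pos++;
        if (start == pos)
            throw new IllegalStateException("expected number at position " + pos);
        return new IntLiteral(Integer.parseInt(src.substring(start, pos)));
    }

    static Expr parse(String input) {
        AstParser p = new AstParser(input);
        Expr e = p.parseExpr();
        if (p.pos != p.src.length())
            throw new IllegalStateException("unexpected input at position " + p.pos);
        return e;
    }

    public static void main(String[] args) {
        BinOp root = (BinOp) parse("3 * (4 + 5)");
        System.out.println(root.op);  // prints MUL
    }
}
```

Because the grammar is right-recursive, an input such as 10 - 2 - 3 produces a tree that nests to the right, i.e. BinOp(SUB, 10, BinOp(SUB, 2, 3)). The parentheses of the input leave no node in the tree; only the nesting of BinOp nodes records the grouping.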


Compiler Pass

AST pass
An AST pass is an action that processes the AST in a single traversal.

A pass can for instance:
- assign a type to each node of the AST
- perform an optimisation
- generate code

It is important to ensure that the different passes can access the AST in a flexible way. An inefficient solution would be to use instanceof to find the type of each syntax node.

Example

    if (tree instanceof IntLiteral)
        int value = ((IntLiteral) tree).i;
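
To make the drawback concrete, here is a sketch of what a whole evaluation pass looks like when written with instanceof tests and casts. The class InstanceofEval is a hypothetical name, and the AST classes are repeated (with constructors filled in) so that the example is self-contained.

```java
abstract class Expr { }

class IntLiteral extends Expr {
    final int i;
    IntLiteral(int i) { this.i = i; }
}

enum Op { ADD, SUB, MUL, DIV }

class BinOp extends Expr {
    final Op op;
    final Expr lhs, rhs;
    BinOp(Op op, Expr lhs, Expr rhs) { this.op = op; this.lhs = lhs; this.rhs = rhs; }
}

class InstanceofEval {
    // One chain of type tests per pass; every new node type needs a new branch here.
    static int eval(Expr tree) {
        if (tree instanceof IntLiteral) {
            return ((IntLiteral) tree).i;
        } else if (tree instanceof BinOp) {
            BinOp b = (BinOp) tree;
            int l = eval(b.lhs);
            int r = eval(b.rhs);
            switch (b.op) {
                case ADD: return l + r;
                case SUB: return l - r;
                case MUL: return l * r;
                case DIV: return l / r;
            }
        }
        // Reached only for a node type (or operator) this pass does not know about.
        throw new IllegalArgumentException("unknown node: " + tree);
    }

    public static void main(String[] args) {
        Expr e = new BinOp(Op.MUL, new IntLiteral(5), new IntLiteral(3));
        System.out.println(eval(e));  // prints 15
    }
}
```

The compiler cannot check that the chain covers every node type: a forgotten case only shows up at run time, as the exception at the end of eval.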


Object-Oriented Processing

Using this technique, a compiler pass is represented by a function f() in each of the AST classes.
- The method is abstract if the class is abstract.
- To process an instance e of an AST class, we simply call e.f().
- The exact behaviour will depend on the concrete class implementations.

Example for the arithmetic expression
- A pass to print the AST: String toStr()
- A pass to evaluate the AST: int eval()


    abstract class Expr {
        abstract String toStr();
        abstract int eval();
    }

    class IntLiteral extends Expr {
        int i;
        String toStr() { return "" + i; }
        int eval() { return i; }
    }

    class BinOp extends Expr {
        Op op;
        Expr lhs;
        Expr rhs;
        String toStr() { return lhs.toStr() + op.name() + rhs.toStr(); }
        int eval() {
            switch (op) {
                case ADD: return lhs.eval() + rhs.eval();
                case SUB: return lhs.eval() - rhs.eval();
                case MUL: return lhs.eval() * rhs.eval();
                case DIV: return lhs.eval() / rhs.eval();
                default:  throw new IllegalStateException("unknown operator: " + op);
            }
        }
    }


Main class

    class Main {
        public static void main(String[] args) {
            Expr expr = ExprParser.parse(someInputFile);  // someInputFile: placeholder for some input file
            String str = expr.toStr();
            int result = expr.eval();
        }
    }
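
The object-oriented scheme can be tried without a parser by building the tree by hand. The sketch below does this for 3 * (4 + 5); the constructors and the class OoDemo are assumptions added so that the example compiles on its own.

```java
abstract class Expr {
    abstract String toStr();  // printing pass
    abstract int eval();      // evaluation pass
}

class IntLiteral extends Expr {
    final int i;
    IntLiteral(int i) { this.i = i; }
    @Override String toStr() { return "" + i; }
    @Override int eval() { return i; }
}

enum Op { ADD, SUB, MUL, DIV }

class BinOp extends Expr {
    final Op op;
    final Expr lhs, rhs;
    BinOp(Op op, Expr lhs, Expr rhs) { this.op = op; this.lhs = lhs; this.rhs = rhs; }

    @Override String toStr() { return lhs.toStr() + op.name() + rhs.toStr(); }

    @Override int eval() {
        switch (op) {
            case ADD: return lhs.eval() + rhs.eval();
            case SUB: return lhs.eval() - rhs.eval();
            case MUL: return lhs.eval() * rhs.eval();
            case DIV: return lhs.eval() / rhs.eval();
            default:  throw new IllegalStateException("unknown operator: " + op);
        }
    }
}

class OoDemo {
    public static void main(String[] args) {
        // Hand-built AST for 3 * (4 + 5); a real compiler would get it from the parser.
        Expr expr = new BinOp(Op.MUL,
                              new IntLiteral(3),
                              new BinOp(Op.ADD, new IntLiteral(4), new IntLiteral(5)));
        System.out.println(expr.toStr());  // prints 3MUL4ADD5
        System.out.println(expr.eval());   // prints 27
    }
}
```

The variable expr has static type Expr, yet each call runs the method of the node's concrete class. This is single dispatch doing the work that the instanceof tests did by hand.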


Visitor Processing

With this technique, all the methods from a pass are grouped in a visitor.

For this, we need a language that implements single dispatch: the method is chosen based on the dynamic type of the object (the AST node).

The visitor design pattern allows us to implement double dispatch, where the method is chosen based on:
- the dynamic type of the object (the AST node)
- the dynamic type of the argument (the visitor)

Note that if the language supports pattern matching, a visitor is not needed, since double dispatch can be implemented more effectively.


Single vs. double dispatch

In Java:
Single dispatch

    class A {
        void print() { System.out.print("A"); }
    }
    class B extends A {
        void print() { System.out.print("B"); }
    }
    A a = new A();
    B b = new B();
    a.print(); // outputs A
    b.print(); // outputs B


Single vs. double dispatch

In Java:
Double dispatch (Java does not support double dispatch)

    class A { }
    class B extends A { }
    class Print {
        void print(A a) { System.out.print("A"); }
        void print(B b) { System.out.print("B"); }
    }
    A a = new A();
    B b = new B();
    A b2 = new B();
    Print p = new Print();
    p.print(a);  // outputs A
    p.print(b);  // outputs B
    p.print(b2); // outputs A: the overload is chosen from the static type of b2
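
One way to get B for b2 is to let the object itself make the call, which is the idea behind the visitor pattern. The sketch below applies it to the A/B example; the interface PrintVisitor, the accept() method, and the use of return values instead of printing are assumptions made for illustration.

```java
// Hypothetical visitor interface: one method per concrete class.
interface PrintVisitor {
    String visitA(A a);
    String visitB(B b);
}

class A {
    // Single dispatch on the receiver selects this method or the override in B.
    String accept(PrintVisitor v) { return v.visitA(this); }
}

class B extends A {
    @Override
    String accept(PrintVisitor v) { return v.visitB(this); }
}

class Print implements PrintVisitor {
    public String visitA(A a) { return "A"; }
    public String visitB(B b) { return "B"; }
}

class DoubleDispatchDemo {
    public static void main(String[] args) {
        A b2 = new B();                    // static type A, dynamic type B
        Print p = new Print();
        System.out.println(b2.accept(p));  // prints B
    }
}
```

The first dispatch (on b2) picks B.accept, and inside it the call v.visitB(this) is a second dispatch on the dynamic type of the visitor. Two single dispatches in a row give the effect of double dispatch.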


Visitor Interface

    interface Visitor<T> {
        T visitIntLiteral(IntLiteral il);
        T visitBinOp(BinOp bo);
    }

Modified AST classes

    abstract class Expr {
        abstract <T> T accept(Visitor<T> v);
    }
    class IntLiteral extends Expr {
        ...
        <T> T accept(Visitor<T> v) {
            return v.visitIntLiteral(this);
        }
    }
    class BinOp extends Expr {
        ...
        <T> T accept(Visitor<T> v) {
            return v.visitBinOp(this);
        }
    }


ToStr Visitor

    class ToStr implements Visitor<String> {
        public String visitIntLiteral(IntLiteral il) {
            return "" + il.i;
        }
        public String visitBinOp(BinOp bo) {
            return bo.lhs.accept(this) + bo.op.name() + bo.rhs.accept(this);
        }
    }

Eval Visitor

    class Eval implements Visitor<Integer> {
        public Integer visitIntLiteral(IntLiteral il) {
            return il.i;
        }
        public Integer visitBinOp(BinOp bo) {
            switch (bo.op) {
                case ADD: return bo.lhs.accept(this) + bo.rhs.accept(this);
                case SUB: return bo.lhs.accept(this) - bo.rhs.accept(this);
                case MUL: return bo.lhs.accept(this) * bo.rhs.accept(this);
                case DIV: return bo.lhs.accept(this) / bo.rhs.accept(this);
                default:  throw new IllegalStateException("unknown operator: " + bo.op);
            }
        }
    }


Main class

    class Main {
        public static void main(String[] args) {
            Expr expr = ExprParser.parse(someInputFile);  // someInputFile: placeholder for some input file
            String str = expr.accept(new ToStr());
            int result = expr.accept(new Eval());
        }
    }
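
For reference, here are the visitor pieces assembled into one compilable sketch, with the tree for 5 * 3 built by hand instead of parsed. The constructors, the generic method declaration <T> on accept(), and the class VisitorDemo are assumptions added to make the example self-contained.

```java
interface Visitor<T> {
    T visitIntLiteral(IntLiteral il);
    T visitBinOp(BinOp bo);
}

abstract class Expr {
    // Generic method: each visitor chooses its own result type T.
    abstract <T> T accept(Visitor<T> v);
}

class IntLiteral extends Expr {
    final int i;
    IntLiteral(int i) { this.i = i; }
    @Override <T> T accept(Visitor<T> v) { return v.visitIntLiteral(this); }
}

enum Op { ADD, SUB, MUL, DIV }

class BinOp extends Expr {
    final Op op;
    final Expr lhs, rhs;
    BinOp(Op op, Expr lhs, Expr rhs) { this.op = op; this.lhs = lhs; this.rhs = rhs; }
    @Override <T> T accept(Visitor<T> v) { return v.visitBinOp(this); }
}

// Printing pass: all of its methods live in one class.
class ToStr implements Visitor<String> {
    public String visitIntLiteral(IntLiteral il) { return "" + il.i; }
    public String visitBinOp(BinOp bo) {
        return bo.lhs.accept(this) + bo.op.name() + bo.rhs.accept(this);
    }
}

// Evaluation pass.
class Eval implements Visitor<Integer> {
    public Integer visitIntLiteral(IntLiteral il) { return il.i; }
    public Integer visitBinOp(BinOp bo) {
        int l = bo.lhs.accept(this);
        int r = bo.rhs.accept(this);
        switch (bo.op) {
            case ADD: return l + r;
            case SUB: return l - r;
            case MUL: return l * r;
            case DIV: return l / r;
            default:  throw new IllegalStateException("unknown operator: " + bo.op);
        }
    }
}

class VisitorDemo {
    public static void main(String[] args) {
        Expr expr = new BinOp(Op.MUL, new IntLiteral(5), new IntLiteral(3));
        String str = expr.accept(new ToStr());
        int result = expr.accept(new Eval());
        System.out.println(str);     // prints 5MUL3
        System.out.println(result);  // prints 15
    }
}
```

Declaring accept() as a generic method is one possible design; it lets passes with different result types (String, Integer) share the same AST classes without casts.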


Extensibility

With an AST, there can be extensions in two dimensions:

1 Adding a new AST node
    - For the object-oriented processing, this means adding a new sub-class.
    - In the case of the visitor, we need to add a new method in every visitor.
2 Adding a new pass
    - For the object-oriented processing, this means adding a function in every single AST node class.
    - For the visitor case, simply create a new visitor.

Picking the right design

Facilitate extensibility:
- the object-oriented design makes it easy to add new types of AST node
- the visitor-based scheme makes it easy to write new passes

Facilitate modularity:
- the object-oriented design allows for code and data to be stored in the AST node and be shared between phases (e.g., types)
- the visitor design allows for code and data to be shared among the methods of the same pass


Next lecture

Context-sensitive Analysis
