Unit 4

The document discusses the semantic analysis phase of compilation, highlighting its importance in identifying program correctness beyond syntax analysis. It covers the need for compilers to understand variable declarations, types, and function consistency, and introduces attribute grammars as a method for managing semantic rules. The document also explains the use of synthesized and inherited attributes, dependency graphs, and expression trees in the context of compiler design.

UNIT-IV

Semantic Analysis: Syntax Directed
Translation
Dr. Banee Bandana Das
Department of CSE
1
The Course covers

o Compiler Basics
o Lexical Analysis
o Syntax Analysis
o Semantic Analysis
o Runtime environments
o Code Generation
o Code Optimization
Phases of Compilation

3
Beyond syntax analysis
• The parser cannot catch all program errors

• There is a level of correctness that is deeper than syntax analysis

• Some language features cannot be modeled using the context free grammar
formalism

• Whether an identifier has been declared before use

• This is the problem of identifying the language

{wαw | w ∈ Σ*}

• This language is not context free

4
Beyond syntax analysis
• Example 1
  string x; int y;
  y = x + 3;
  the use of x is a type error

• Example 2
  int a, b;
  a = b + c;
  c is not declared

• An identifier may refer to different variables in different parts of the program

• An identifier may be usable in one part of the program but not another

5
Compiler needs to know?
• Whether a variable has been declared?

• Are there variables which have not been declared?

• What is the type of the variable?

• Whether a variable is a scalar, an array, or a function?

• What declaration of the variable does each reference use?

• If an expression is type consistent?

• If an array use like A[i,j,k] is consistent with the declaration? Does it have three dimensions?

6
Compiler needs to know?
• How many arguments does a function take?

• Are all invocations of a function consistent with the declaration?

• If an operator/function is overloaded, which function is being invoked?

• Inheritance relationship

• Classes not multiply defined

• Methods in a class are not multiply defined

• The exact requirements depend upon the language


7
How to answer these questions?

• These issues are part of semantic analysis phase

• Answers to these questions depend upon values like type information, number of
parameters etc.

• Compiler will have to do some computation to arrive at answers

• The information required by computations may be non local in some cases

8
How to answer these questions?
• Use formal methods
• Context sensitive grammars
• Extended attribute grammars

• Use ad-hoc techniques


• Symbol table
• Ad-hoc code

• Something in between !!!


• Use attributes
• Do analysis along with parsing
• Use code for attribute value computation
• However, code is developed in a systematic way

9
Why attributes ?

• For lexical analysis and syntax analysis formal techniques were used.

• However, we still had code in form of actions along with regular expressions
and context free grammar

• The attribute grammar formalism is important


• However, it is very difficult to implement
• But makes many points clear
• Makes “ad-hoc” code more organized
• Helps in doing non local computations

10
Attribute Grammar Framework
• Generalization of CFG where each grammar symbol has an associated set of attributes

• Values of attributes are computed by semantic rules

• Two notations for associating semantic rules with productions

• Syntax directed definition


• high level specifications
• hides implementation details
• explicit order of evaluation is not specified

• Translation schemes
• indicate order in which semantic rules are to be evaluated
• allow some implementation details to be shown

11
Attribute Grammars
• Context-Free Grammars (CFGs) are used to specify the syntax of
programming languages
• E.g. arithmetic expressions
• How do we tie these rules to mathematical
concepts?
• Attribute grammars are annotated CFGs in
which annotations are used to establish
meaning relationships among symbols
– Annotations are also known as decorations

12
Attribute Grammars: Example
• Each grammar symbol has a set
of attributes
• E.g. the value of E1 is the attribute
E1.val
• Each grammar rule has a set of
rules over the symbol attributes
• Copy rules
• Semantic Function rules
• E.g. sum, quotient
Attribute Flow : Example
❖ Context-free grammars are not tied to a specific
parsing order
❖ E.g. Recursive descent, LR parsing
❖ Attribute grammars are not tied to a specific
evaluation order
❖ This evaluation is known as the annotation or
decoration of the parse tree
• The figure shows the result of annotating the
parse tree for (1+3)*2
• Each symbol has at most one attribute shown in
the corresponding box
• Numerical value in this example
• Operator symbols have no value
• Arrows represent attribute flow 14
Attribute Flow : Example

(1+3)*2

15
Example
• Consider a grammar for signed binary numbers

Number → sign list
sign → + | -
list → list bit | bit
bit → 0 | 1

• Build an attribute grammar that annotates Number with the value it represents

• Associate attributes with grammar symbols

symbol    attributes
Number    value
sign      negative
list      position, value
bit       position, value
16
Example
production            Attribute rule

number → sign list    list.position ← 0
                      if sign.negative
                      then number.value ← -list.value
                      else number.value ← list.value
sign → +              sign.negative ← false
sign → -              sign.negative ← true
list → bit            bit.position ← list.position
                      list.value ← bit.value
list0 → list1 bit     list1.position ← list0.position + 1
                      bit.position ← list0.position
                      list0.value ← list1.value + bit.value
bit → 0               bit.value ← 0
bit → 1               bit.value ← 2^bit.position
17
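The attribute rules above can be traced in code. The following is a minimal sketch (not from the slides): the inherited attribute position flows top-down, the synthesized attribute value flows bottom-up, and the function names are assumptions for illustration.

```python
# Sketch of evaluating the attribute grammar for signed binary numbers.
# "position" is the inherited attribute, "value" the synthesized one.

def bit_value(bit, position):
    # bit -> 0 : bit.value = 0 ; bit -> 1 : bit.value = 2^bit.position
    return 0 if bit == "0" else 2 ** position

def list_value(bits, position):
    # list -> bit            : list.value = bit.value
    # list0 -> list1 bit     : list1.position = list0.position + 1
    #                          list0.value = list1.value + bit.value
    if len(bits) == 1:
        return bit_value(bits[0], position)
    return list_value(bits[:-1], position + 1) + bit_value(bits[-1], position)

def number_value(text):
    # number -> sign list : list.position = 0;
    # if sign.negative then number.value = -list.value
    sign, bits = text[0], text[1:]
    value = list_value(bits, 0)
    return -value if sign == "-" else value

print(number_value("-101"))  # the slide's example: -5
```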
Parse tree and the dependence graph

Annotated parse tree for -101 (value = -5):

                        Number (val = -5)
                       /                 \
        sign (neg = true)           list (pos = 0, val = 5)
              |                    /                        \
              -        list (pos = 1, val = 4)      bit (pos = 0, val = 1)
                      /                      \                |
        list (pos = 2, val = 4)    bit (pos = 1, val = 0)     1
              |                              |
      bit (pos = 2, val = 4)                 0
              |
              1

18
Attributes

• Attributes fall into two classes: synthesized and inherited

• The value of a synthesized attribute is computed from the values at its
children nodes

• The value of an inherited attribute is computed from the sibling and parent
nodes

20
Attributes
• Each grammar production A → α has associated with it a set of semantic rules of
the form

b = f (c1, c2, ..., ck)

where f is a function, and either

• b is a synthesized attribute of A
OR
• b is an inherited attribute of one of the grammar symbols on the right

• attribute b depends on attributes c1, c2, ..., ck

21
Syntax Directed Definitions

• A SDD is a context free grammar with attributes and rules


• Attributes are associated with grammar symbols and rules with productions
• Attributes may be of many kinds: numbers, types, table references, strings, etc.
• Synthesized attributes
• A synthesized attribute at node N is defined only in terms of attribute values at the children
of N and at N itself
• Inherited attributes
• An inherited attribute at node N is defined only in terms of attribute values at N’s parent,
N itself and N’s siblings

22
Synthesized Attributes

• A syntax directed definition that uses only synthesized attributes is said to be


an S-attributed definition

• A parse tree for an S-attributed definition can be annotated by

evaluating semantic rules for attributes

23
Example of S-attributed SDD

Production     Semantic Rules

L → E n        print(E.val)
E → E1 + T     E.val = E1.val + T.val
E → T          E.val = T.val
T → T1 * F     T.val = T1.val * F.val
T → F          T.val = F.val
F → (E)        F.val = E.val
F → digit      F.val = digit.lexval

• terminals are assumed to have only synthesized attribute values of which are
supplied by lexical analyzer
• start symbol does not have any inherited attribute
24
Example: Parse tree for 3 * 4 + 5 n

                          L          print 17
                        /   \
             (val=17) E      n
                    /  |  \
         (val=12) E    +   T (val=5)
                  |        |
         (val=12) T        F (val=5)
                / | \      |
        (val=3) T * F      digit (5)
                |   (val=4)
        (val=3) F   |
                |   digit (4)
           digit (3)

25
Inherited Attributes
• an inherited attribute is one whose value is defined in terms of attributes at the parent and/or siblings

• Used for finding out the context in which it appears

• possible to use only S-attributes but more natural to use inherited attributes

D → T L        L.in = T.type

T → real       T.type = real

T → int        T.type = int

L → L1, id     L1.in = L.in; addtype(id.entry, L.in)

L → id         addtype(id.entry, L.in)

26
Example: Parse tree for real x, y, z

                    D
                  /   \
      (type=real) T    L (in=real)     addtype(z, real)
           |         / | \
         real       L  ,  z
             (in=real)                 addtype(y, real)
             / | \
            L  ,  y
      (in=real)                        addtype(x, real)
            |
            x

27
Dependence Graph

• If an attribute b depends on an attribute c then the semantic rule for b must


be evaluated after the semantic rule for c

• The dependencies among the nodes can be depicted by a directed graph called
dependency graph

28
Algorithm to construct dependency graph

for each node n in the parse tree do
    for each attribute a of the grammar symbol at n do
        construct a node in the dependency graph for a

for each node n in the parse tree do
    for each semantic rule b = f(c1, c2, ..., ck)
            { associated with the production at n } do
        for i = 1 to k do
            construct an edge from the node for ci to the node for b

29
Example: Dependence Graph

• Suppose A.a = f(X.x , Y.y) is a semantic rule for A → X Y


A A.a

X Y X.x Y.y
• If production A → X Y has the semantic rule X.x = g(A.a, Y.y)

A.a
A

X.x Y.y
X Y

30
Example: Dependence Graph
• Whenever following production is used in a parse tree
E→ E1 + E2 E.val = E1.val + E2.val
we create a dependency graph

E.val

E1.val E2.val

31
Example: Dependence Graph

• dependency graph for real id1, id2, id3

• put a dummy synthesized attribute b for a semantic rule that consists of a procedure
call (such as addtype)

                    D
                  /   \
      (type=real) T    L (in=real)     addtype(id3, real)
           |         / | \
         real       L  ,  id3
             (in=real)                 addtype(id2, real)
             / | \
            L  ,  id2
      (in=real)                        addtype(id1, real)
            |
           id1
32
Evaluation Order
• Any topological sort of dependency graph gives a valid order in which semantic rules must be
evaluated
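The graph construction and the topological-sort evaluation order can be sketched together. This is an illustrative sketch, not from the slides; the attribute names in the example are assumptions based on the signed-binary grammar.

```python
# Sketch: dependency graph with an edge c -> b whenever a semantic rule
# b = f(..., c, ...) makes b depend on c; evaluate in topological order.
from collections import defaultdict, deque

def topological_order(nodes, edges):
    indegree = {n: 0 for n in nodes}
    succ = defaultdict(list)
    for c, b in edges:              # rule b = f(..., c, ...) adds edge c -> b
        succ[c].append(b)
        indegree[b] += 1
    ready = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        n = ready.popleft()
        order.append(n)
        for m in succ[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                ready.append(m)
    if len(order) != len(nodes):
        raise ValueError("cycle: attributes cannot be evaluated")
    return order

# Attributes of a tiny tree: position flows down, value flows up.
nodes = ["sign.neg", "list.pos", "list.val", "number.val"]
edges = [("list.pos", "list.val"), ("list.val", "number.val"),
         ("sign.neg", "number.val")]
print(topological_order(nodes, edges))
```

Any order the function returns is a valid evaluation order; the slides' point is that every topological sort works, not one in particular.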

Expression Tree
• Condensed form of parse tree,
• useful for representing language constructs.
• The production S → if B then s1 else s2
may appear as

        if-then-else
        /     |     \
       B      s1     s2
33
Expression tree
• Chains of single productions may be collapsed, and operators move
to the parent nodes

Parse tree for id1 + id2 * id3:        Expression tree:

        E                                     +
      / | \                                 /   \
     E  +  T                             id1     *
     |    / | \                                 / \
     T   T  *  F                             id2   id3
     |   |     |
     F   F    id3
     |   |
   id1  id2
34
Constructing expression tree
• Each node can be represented as a record

• operator: one field for the operator, remaining fields are pointers to the operands
    mknode(op, left, right)

• identifier: one field with label id and another with a
pointer to the symbol table entry
    mkleaf(id, entry)

• number: one field with label num and another
to keep the value of the number
    mkleaf(num, val)
35
Example

The following sequence of function calls creates
an expression tree for a - 4 + c

P1 = mkleaf(id, entry.a)
P2 = mkleaf(num, 4)
P3 = mknode(-, P1, P2)
P4 = mkleaf(id, entry.c)
P5 = mknode(+, P3, P4)

            P5: +
           /     \
       P3: -      P4: id (entry of c)
      /     \
 P1: id    P2: num 4
 (entry of a)

36
A syntax directed definition for constructing Expression tree

E → E1 + T     E.ptr = mknode(+, E1.ptr, T.ptr)
E → T          E.ptr = T.ptr
T → T1 * F     T.ptr = mknode(*, T1.ptr, F.ptr)
T → F          T.ptr = F.ptr
F → (E)        F.ptr = E.ptr
F → id         F.ptr = mkleaf(id, id.entry)
F → num        F.ptr = mkleaf(num, num.val)

37
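The mknode/mkleaf calls from the slides can be mimicked directly. A minimal sketch, assuming tuples stand in for records and strings stand in for symbol-table entries:

```python
# Sketch of the slides' mknode/mkleaf, with tuples as node records.

def mknode(op, left, right):
    # operator node: operator plus pointers to the two operand nodes
    return (op, left, right)

def mkleaf(label, value):
    # leaf node: label "id"/"num" plus entry pointer or numeric value
    return (label, value)

# The call sequence for a - 4 + c from the slides:
p1 = mkleaf("id", "entry_a")
p2 = mkleaf("num", 4)
p3 = mknode("-", p1, p2)
p4 = mkleaf("id", "entry_c")
p5 = mknode("+", p3, p4)
print(p5)  # ('+', ('-', ('id', 'entry_a'), ('num', 4)), ('id', 'entry_c'))
```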
DAG for Expressions

Expression (a + a * (b - c)) + (b - c) * d: make a leaf or node if not
present, otherwise return a pointer to the existing node

P1  = makeleaf(id, a)
P2  = makeleaf(id, a)       returns P1 (already present)
P3  = makeleaf(id, b)
P4  = makeleaf(id, c)
P5  = makenode(-, P3, P4)
P6  = makenode(*, P2, P5)
P7  = makenode(+, P1, P6)
P8  = makeleaf(id, b)       returns P3
P9  = makeleaf(id, c)       returns P4
P10 = makenode(-, P8, P9)   returns P5
P11 = makeleaf(id, d)
P12 = makenode(*, P10, P11)
P13 = makenode(+, P7, P12)

The leaves a, b, c and the node for b - c are each created once and shared.
38
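The "make a node only if not present" behavior is easy to sketch with a hash table keyed on (op, operands), a technique commonly called value numbering. The function names follow the slide; the dictionary-based implementation is an assumption for illustration.

```python
# Sketch of DAG construction: identical nodes are shared, so common
# subexpressions like b - c become a single node.

_nodes = {}

def _lookup(key):
    # return the existing node number for this key, or create a new one
    if key not in _nodes:
        _nodes[key] = len(_nodes) + 1   # node numbers like P1, P2, ...
    return _nodes[key]

def makeleaf(label, value):
    return _lookup((label, value))

def makenode(op, left, right):
    return _lookup((op, left, right))

# (a + a*(b-c)) + (b-c)*d : the two b-c subtrees become one node.
p3 = makeleaf("id", "b")
p4 = makeleaf("id", "c")
p5 = makenode("-", p3, p4)
p9 = makeleaf("id", "c")                      # same node as p4
p10 = makenode("-", makeleaf("id", "b"), p9)  # same node as p5
print(p5 == p10)  # True: the shared b - c node
```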
Bottom-up evaluation of S-attributed definitions
• Can be evaluated while parsing

• Whenever reduction is made, value of new synthesized attribute is


computed from the attributes on the stack

• Extend the stack to hold the attribute values also

    state stack | value stack

• The current top of stack is indicated by the pointer top
39
Bottom-up evaluation of S-attributed definitions
• Suppose semantic rule A.a = f(X.x, Y.y, Z.z) is
associated with production A → XYZ

• Before reducing XYZ to A, value of Z is in val(top), value of Y is in


val(top-1) and value of X is in val(top-2)

• If a symbol has no attribute, then the entry is undefined

• After the reduction, pop the values of X, Y and Z and push the value of A.a, computed
in terms of x, y and z.

40
Type system
• A type is a set of values

• Certain operations are legal for values of each type

• A language’s type system specifies which


operations are valid for a type

• The aim of type checking is to ensure that operations are used on the
variable/expressions of the correct types

41
Type system
• Languages can be divided into three categories with respect to the
type:

– “untyped”
• No type checking needs to be done

– Statically typed
• All type checking is done at compile time
• Often also (loosely) called strongly typed

– Dynamically typed
• Type checking is done at run time
• Mostly functional languages like Lisp, Scheme etc.

42
Type system
• Static typing
– Catches most common programming errors at
compile time
– Avoids runtime overhead
– May be restrictive in some situations
– Rapid prototyping may be difficult

• A type system is a collection of rules for assigning type expressions to


various parts of a program
• Most code is written using statically typed languages

• In fact, most people insist that code be strongly type checked at compile time even if the
language is not strongly typed (use of Lint for C code, code compliance checkers)

43
Type expression
• The type of a language construct is denoted by a type expression

• Basic types in type expressions:
  – Primitive data types defined in the language, like int, float, ...
  – Basic types: integer, char, float, boolean
  – type_error: error during type checking
  – void: no type value

• Derived / constructed types: array, record, pointer, product, function
  – Sub-range type: 1 .. 100
  – Enumerated type: (violet, indigo, red)

• Examples:
  – If both operands of the arithmetic operators +, -, x are integers
    then the result is of type integer
  – The result of the unary & operator is a pointer to the object referred to
    by the operand: if the type of the operand is X, the type of the result is pointer(X)

44
Type expressions for Array and Product
• If T is a type expression then array(I, T) is a type expression denoting the
type of an array with elements of type T and index set I

– C lang:
• Declaration : int A[10]
• Type expression: array([0..9],int)

– Pascal :
• Declaration : var A:array [-10 .. 10] of integer
• Type Expression: array([-10..10],int)

• If T1 and T2 are type expressions, then their Cartesian


product T1 x T2 is a type expression
45
Type expressions for Record
• It applies to a tuple formed from field names and field types.
Consider the declaration
struct row {
    int addr;
    char lexeme[15];
};
struct row table[10];

type expression of row: int x array(0 .. 14, char) and
type expression of table: array(0 .. 9, row)

46
Type expression of Pointer
• If T is a type expression, then pointer( T ) is a type expression denoting type
pointer to an object of type T

Example:
statement
type expression
pointer(int)
int *a
pointer(struct row)
int *struct row

47
Type expression of function
• A function maps a domain set to a range set. It is denoted by the type
expression D → R

– For example, mod has type expression int x int → int

– int *f(char a, char b) has type expression char x char → pointer(int)

– function f(a, b: char): ^integer; is denoted by char x char → pointer(integer)
48
Specifications of a type checker
Consider a language which consists of a sequence of declarations
followed by a single expression

P → D ; E
D → D ; D | T id | T id[num] | T * id
T → char | int
E → id | num | E mod E | E[E] | *E

• A program generated by this grammar is
int a;
key mod 1999

• Assume the following:
– basic types are char, int, type_error
– all arrays start at 1
– array[256] of char has type expression
  array(1 .. 256, char)
49
Rules for Symbol Table entry
D → id : T              addtype(id.entry, T.type)
T → char                T.type = char
T → int                 T.type = int
T → *T1                 T.type = pointer(T1.type)
T → array [num] of T1   T.type = array(1 .. num, T1.type)

Type checking of functions

E → E1 (E2)    E.type = if E2.type == s and E1.type == s → t
               then t
               else type_error
50
Type checking for expressions
E → num         E.type = int
E → id          E.type = lookup(id.entry)
E → E1 mod E2   E.type = if E1.type == int and E2.type == int
                then int
                else type_error
E → E1[E2]      E.type = if E2.type == int and E1.type == array(s, t)
                then t
                else type_error
E → *E1         E.type = if E1.type == pointer(t)
                then t
                else type_error
51
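The rules above translate almost line for line into a checker. A minimal sketch, not from the slides: type expressions are encoded as strings and tuples ("int", ("array", size, t), ("pointer", t)), and the env dictionary stands in for lookup(id.entry); all of these encodings are assumptions.

```python
# Sketch of the expression type-checking rules.

def check(e, env):
    kind = e[0]
    if kind == "num":                       # E -> num : E.type = int
        return "int"
    if kind == "id":                        # E -> id : lookup(id.entry)
        return env.get(e[1], "type_error")
    if kind == "mod":                       # E -> E1 mod E2
        ok = check(e[1], env) == "int" and check(e[2], env) == "int"
        return "int" if ok else "type_error"
    if kind == "index":                     # E -> E1[E2]
        t1, t2 = check(e[1], env), check(e[2], env)
        if t2 == "int" and isinstance(t1, tuple) and t1[0] == "array":
            return t1[2]                    # element type t of array(s, t)
        return "type_error"
    if kind == "deref":                     # E -> *E1
        t1 = check(e[1], env)
        if isinstance(t1, tuple) and t1[0] == "pointer":
            return t1[1]                    # pointed-to type t
        return "type_error"
    return "type_error"

env = {"a": ("array", 10, "int"), "p": ("pointer", "char"), "i": "int"}
print(check(("index", ("id", "a"), ("id", "i")), env))  # int
print(check(("deref", ("id", "i")), env))               # type_error
```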
Type checking for statements

• Statements typically do not have values. The special basic type void can be assigned to them.

S → id := E          S.type = if id.type == E.type
                     then void
                     else type_error

S → if E then S1     S.type = if E.type == boolean
                     then S1.type
                     else type_error

S → while E do S1    S.type = if E.type == boolean
                     then S1.type
                     else type_error

S → S1 ; S2          S.type = if S1.type == void and S2.type == void
                     then void
                     else type_error
52
Type conversion
• Consider expression like x + i where x is of type real and i is of type
integer

• Internal representations of integers and reals are different in a computer


– different machine instructions are used for operations on integers and reals

• The compiler has to convert both the operands to the same type

• Language definition specifies what conversions are necessary.


• Type conversion is called implicit if done by compiler.

• Conversions are explicit if programmer must write something to cause


conversion
53
Type conversion
• Usually conversion is to the type of the left hand side

• The type checker is used to insert conversion operations:

    x + i   →   x real+ inttoreal(i)

• Type conversion is called implicit/coercion if done by compiler.

• It is limited to the situations where no information is lost

• Conversions are explicit if programmer has to write something to


cause conversion

54
Implicit type conversion for expressions
E → num         E.type = int
E → num.num     E.type = real
E → id          E.type = lookup(id.entry)
E → E1 op E2    E.type = if E1.type == int && E2.type == int
                then int
                elseif E1.type == int && E2.type == real
                then real
                elseif E1.type == real && E2.type == int
                then real
                elseif E1.type == real && E2.type == real
                then real

55
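The widening rule for E → E1 op E2 and the insertion of inttoreal can be sketched as follows. This is an illustrative sketch, not from the slides; widen and convert are assumed helper names.

```python
# Sketch: the result is int only when both operands are int; any real
# operand makes the result real, and the checker inserts an inttoreal
# conversion on the int side (coercion without information loss).

def widen(t1, t2):
    if t1 == "int" and t2 == "int":
        return "int"
    if t1 in ("int", "real") and t2 in ("int", "real"):
        return "real"
    return "type_error"

def convert(expr, t_from, t_to):
    # emit the conversion the slides write as inttoreal(i)
    return ("inttoreal", expr) if (t_from, t_to) == ("int", "real") else expr

# x + i with x:real, i:int becomes x real+ inttoreal(i)
t = widen("real", "int")
rhs = convert("i", "int", t)
print(t, rhs)  # real ('inttoreal', 'i')
```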
Type resolution
• Try all possible types of each overloaded function (possible but brute force method!)

• Keep track of all possible types

• Discard invalid possibilities

• At the end, check if there is a single unique type

• Overloading can be resolved in two passes:


• Bottom up: compute set of all possible types for each expression
• Top down: narrow set of possible types based on what could be used in an expression

56
Phases of Compilation

57
INTERMEDIATE CODE GENERATION
◻ Goal: Generate a Machine Independent Intermediate Form that is Suitable for Optimization and Portability.
◻ Facilitates retargeting: enables attaching a back end for the new machine to an existing front end.
◻ Enables machine-independent code optimization.

58
Motivation
◻ What we have so far...
  A parse tree
  ■ With all the program information
  ■ Known to be correct
  ■ Well-typed
  ■ Nothing missing
  ■ No ambiguities
◻ What we need...
  Something "executable"
  ■ Closer to an operations schedule
  ■ Closer to the actual machine level
◻ What we want: a representation that
  ■ Is closer to the actual machine
  ■ Is easy to manipulate
  ■ Is target neutral (hardware independent)
  ■ Can be interpreted
59
Recall ASTs and DAGs

◻ Intermediate Forms of a Program:
  ASTs: Abstract Syntax Trees
  DAGs: Directed Acyclic Graphs
  What is the Expression?  a := b * -c + b * -c

AST (the two b * -c subtrees are duplicated):

        assign
       /      \
      a        +
             /   \
            *     *
           / \   / \
          b   uminus b uminus
              |          |
              c          c

DAG: the same, except both children of + point to a single shared
* node, i.e. assign(a, +(t, t)) where t = *(b, uminus(c))
60
Representations
● Syntax Trees
  – Maintains structure of the construct
  – Suitable for high-level representations

        +
       / \
      *   4
     / \
    3   5

● Three-Address Code (3AC)
  – At most three addresses per instruction
  – Suitable for both high- and low-level representations
    t1 = 3 * 5
    t2 = t1 + 4

● Two-Address Code (2AC), e.g. C
    mult 3, 5
    add 4

● One-Address Code (1AC), e.g. a stack machine
    push 3
    push 5
    mult
    add 4
61
We can represent three-address code in the following ways:
- Quadruple
- Triple
- Indirect Triple

62
65
Representation
◻ Two Different Forms:
  Linked Data Structure
  Multi-Dimensional Array

        assign
       /      \
      a        +
             /   \
            *     *
           / \   / \
          b   uminus b uminus
              |          |
              c          c

66
Abstract Syntax Trees (ASTs) and DAG
◻ Directly Generate Code From AST or DAG as a Side Effect of Parsing
Process.
◻ Consider the code below:

Each is referred to as "3-Address Code (3AC)" since there are at most 3
addresses per statement: one for the result and at most 2 for operands.
67
What is Three-Address Coding?
◻ A simple type of instruction with 3 / 2 operands x, y, z
  Each operand could be
  ■ A literal
  ■ A variable
  ■ A temporary
◻ An address can be a name, constant or temporary.
◻ Forms:
  Assignments          x := y op z ;  x := op y
  Copy                 x := y
  Unconditional jump   goto L
  Conditional jump     if x relop y goto L
  Parameters           param x
  Function call        y := call p
◻ Example: x + y * z
  t0 := y * z
  t1 := x + t0
68
Types of Three Address Statements
◻ Indexed Assignments of Form:
X := Y[i] (Set X to i-th memory location of Y)
X[i] := Y (Set i-th memory location of X to Y)
Note the limit of 3 Addresses (X, Y, i)
Cannot do: x[i] := y[j]; (4 addresses!)
◻ Address and Pointer Assignments of Form:
X := & Y (X set to the Address of Y)
X := * Y (X set to the contents pointed to by Y)
* X := Y (Contents of X set to Value of Y)

69
Quadruples
◻ In the quadruple representation, there are four fields for each instruction: op, arg1,
arg2, result
Binary ops have the obvious representation
Unary ops don’t use arg2
Operators like param don’t use either arg2 or result
Jumps put the target label into result
◻ The quadruples implement the three-address code in (a) for the expression
a = b * -c + b * -c

70
Three address code VS Quadruples
◻ The quadruples implement the three-address code in (a) for the expression a = b * -c + b * -c

71
Syntax tree VS Triples

Representations of a = b * - c + b * - c

72
Indirect Triples
◻ These consist of a listing of pointers to triples, rather than a listing of
the triples themselves.
◻ An optimizing compiler can move an instruction by reordering the instruction list, without affecting the triples
themselves.

73
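The quadruple representation can be sketched with a simple generator for the running example a = b * -c + b * -c. This is an illustrative sketch, not from the slides; the helper names new_temp and emit are assumptions.

```python
# Sketch: generating quadruples (op, arg1, arg2, result) with fresh
# temporaries t1, t2, ... Unary ops leave arg2 unused (None).

quads = []
_temp = 0

def new_temp():
    global _temp
    _temp += 1
    return f"t{_temp}"

def emit(op, arg1, arg2=None):
    result = new_temp()
    quads.append((op, arg1, arg2, result))
    return result

# a = b * -c + b * -c  (no common-subexpression sharing here)
t1 = emit("minus", "c")          # t1 = -c
t2 = emit("*", "b", t1)          # t2 = b * t1
t3 = emit("minus", "c")          # t3 = -c
t4 = emit("*", "b", t3)          # t4 = b * t3
t5 = emit("+", t2, t4)           # t5 = t2 + t4
quads.append(("=", t5, None, "a"))
for q in quads:
    print(q)
```

A triple representation would drop the result field and refer to instructions by position; indirect triples would add the separate list of pointers described above.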
Control Flow

Conditionals
– if, if-else, switch
Loops
– for, while, do-while, repeat-until
We need to worry about
– Boolean expressions
– Jumps (and labels)

74
Code Generation for Boolean Expressions

◻ Implicit representation
For the boolean expressions which are used in flow-of-control statements
(such as if- statements, while-statements etc.) boolean expressions do not
have to explicitly compute a value, they just need to branch to the right
instruction.
Generate code for boolean expressions which branch to the appropriate
instruction based on the result of the boolean expression.

75
Generated Code
◻ Consider: a < b or c < d and e < f

100: if a < b goto 103
101: t1 := 0
102: goto 104
103: t1 := 1
104: if c < d goto 107
105: t2 := 0
106: goto 108
107: t2 := 1
108: if e < f goto 111
109: t3 := 0
110: goto 112
111: t3 := 1
112: t4 := t2 and t3
113: t5 := t1 or t4
76
Control Flow: If Statement (Generated Code)

77
Generated Code

78
Run-Time Environment

79
Run-time Environment
• Compiler must cooperate with OS and other system software to support
implementation of different abstractions (names, scopes, bindings, data
types, operators, procedures, parameters, flow-of-control) on the target
machine.

• Compiler does this by Run-Time Environment in which it assumes its


target programs are being executed.

• Run-Time Environment deals with


– Layout and allocation of storage
– Access to variable and data
– Linkage between procedures
– Parameter passing
80
– Interface to OS, I/O devices, etc.
Storage Organization
• Compiler deals with logical address space
• OS maps the logical addresses to physical addresses

Code          Memory locations for code are determined at compile time. Usually placed in the
              low end of memory
Static        Sizes of some program data are known at compile time – they can be placed in
              another statically determined area
Heap          • Grows towards higher addresses
              • Stores data allocated under program control
Free Memory   Dynamic space – its size changes during program execution
Stack         • Grows towards lower addresses
              • Stores activation records
81
Typical subdivision of run-time memory
Static vs. Dynamic Allocation
• How do we allocate the space for the generated target code and the data
object of our source programs?

• The places of the data objects that can be determined at compile time will
be allocated statically.
• But the places for some of the data objects will be allocated at run-time.

• The allocation and de-allocation of the data objects is managed by the


run-time support package.
– run-time support package is loaded together with the generated target
code.
– the structure of the run-time support package depends on the semantics of the
programming language (especially the semantics of procedures in that language).
82
Static vs. Dynamic Allocation
• Static: Compile time, Dynamic: Runtime allocation

• Many compilers use some combination of following


• Stack storage: for local variables, parameters and so on.
• Heap storage: Data that may outlive the call to the procedure that created it.

• Stack allocation is a valid allocation for procedures since procedure calls are
nested.

83
Procedure Activations
• Each execution of a procedure is called an activation of that procedure.
• An execution of a procedure P starts at the beginning of the procedure
body;
• When a procedure P is completed, it returns control to the point immediately
after the place where P was called.
• Lifetime of an activation of a procedure P is the sequence of steps between
the first and the last steps in execution of P (including the other procedures
called by P).
• If A and B are procedure activations, then their lifetimes are either
non-overlapping or are nested.
• If a procedure is recursive, a new activation can begin before an earlier
activation of the same procedure has ended.
84
Call Graph
A call graph is a directed multi-graph where:
• the nodes are the procedures of the program and
• the edges represent calls between these procedures.

85
Call Graph: Example

var a: array [0 .. 10] of integer;

procedure readarray
  var i: integer
  begin ... a[i] ... end

function partition(y, z: integer): integer
  var i, j, x, v: integer
  begin ... end

procedure quicksort(m, n: integer)
  var i: integer
  begin i := partition(m, n); quicksort(m, i-1); quicksort(i+1, n) end

procedure main
  begin readarray(); quicksort(1, 9); end

Call graph edges: main → readarray, main → quicksort,
quicksort → partition, quicksort → quicksort
86
Activation Tree/ Call Tree
• We can use a tree (called activation tree) to show the way control
enters and leaves activations.

• In an activation tree:
– Each node represents an activation of a procedure.
– The root represents the activation of the main program.
– The node A is a parent of the node B iff control flows from A to B.
– The node A is to the left of the node B iff the lifetime of A occurs before the lifetime
of B.

87
Activation Tree (cont.)

program main;
  procedure s;
  begin ... end;
  procedure p;
    procedure q;
    begin ... end;
  begin q; s; end;
begin p; s; end;
88
Sketch of a quicksort program

89
Run-time Control Flow for Quicksort

Activation tree representing calls during an execution of quicksort
(r = readarray, q = quicksort, p = partition):

main
├── r()
└── q(1,9)
    ├── p(1,9)
    ├── q(1,3)
    │   ├── p(1,3)
    │   ├── q(1,0)
    │   └── q(2,3)
    │       ├── p(2,3)
    │       ├── q(2,1)
    │       └── q(3,3)
    └── q(5,9)
        ├── p(5,9)
        ├── q(5,5)
        └── q(7,9)
            ├── p(7,9)
            ├── q(7,7)
            └── q(9,9)
90
Implementing Run-time control flow

• Procedure call/return – when a procedure activation terminates,


control returns to the caller of this procedure.

• Parameter/return value – values are passed into a procedure activation


upon call. For a function, a value may be returned to the caller.

• Variable addressing – when using an identifier, language scope


rules dictate the binding.

91
Activation Records

• Information needed by a single execution of a procedure is managed using


a contiguous block of storage called activation record.

• An activation record is allocated when a procedure is entered, and it is


de-allocated when that procedure exited.

• Size of each field can be determined at compile time (Although actual


location of the activation record is determined at run-time).
– Except that if the procedure has a local variable and its size depends on
a parameter, its size is determined at the run time.

92
Activation Records
return value           The returned value of the called procedure is returned in this field to
                       the calling procedure. In practice, a machine register may be used for
                       the return value.
actual parameters      Used by the calling procedure to supply parameters to the called procedure.
optional control link  Points to the activation record of the caller.
optional access link   Used to refer to nonlocal data held in other activation records.
saved machine status   Holds information about the state of the machine before the procedure
                       is called.
local data             Holds data local to an execution of the procedure.
temporaries            Temporary values are stored in this field.

93
Activation Records: Example

program main;
  procedure p;
    var a: real;
    procedure q;
      var b: integer;
    begin ... end;
  begin q; end;
  procedure s;
    var c: integer;
  begin ... end;
begin p; s; end;

Stack after main calls p and p calls q:

  main
  p   (a: real)
  q   (b: integer)
94
Activation Records for Recursive Procedures

program main;
  procedure p;
    function q(a: integer): integer;
    begin
      if a = 1 then q := 1
      else q := a + q(a-1);
    end;
  begin q(3); end;
begin p; end;

Stack during the innermost call of q(3):

  main
  p
  q(3)  a: 3
  q(2)  a: 2
  q(1)  a: 1
95
Stack Allocation for quicksort 1

Call tree           Stack (growing downward)

Main                Main        a: array
└ readarray         readarray   i: integer

96
Stack Allocation for quicksort 2

Call tree           Stack (growing downward)

Main                Main        a: array
├ readarray         quick(1,9)  i: integer
└ quick(1,9)

97
Stack Allocation for quicksort 3

Call tree             Stack (growing downward)

Main                  Main        a: array
├ readarray           quick(1,9)  i: integer
└ quick(1,9)          quick(1,3)  i: integer
  ├ p(1,9)            quick(1,0)  i: integer
  └ quick(1,3)
    └ quick(1,0)

98
Layout of the stack frame

Frame pointer ($fp): points to
the first word of the frame.
Stack pointer ($sp): points
to the last word of the frame.
The frame consists of
the memory between the
locations pointed to by
$fp and $sp

99
Creation of An Activation Record

• Who allocates an activation record of a procedure?


– Some part of the activation record of a procedure is created by that procedure
immediately after that procedure is entered.
– Some part is created by the caller of that procedure before that procedure is
entered.

• Who deallocates?
– Callee de-allocates the part allocated by Callee.
– Caller de-allocates the part allocated by Caller.

100
Creation of An Activation Record

Caller's activation record (already complete):
  return value, actual parameters, optional control link,
  optional access link, saved machine status, local data, temporaries

Callee's activation record:
  return value, actual parameters          <- Caller's responsibility
  optional control link, optional access link,
  saved machine status, local data,
  temporaries                              <- Callee's responsibility
101
Designing Calling Sequences

• Values communicated between caller and callee are generally placed at the
beginning of callee’s activation record.
• Fixed-length items: are generally placed at the middle.
• Items whose size may not be known early enough: are placed at the end of
activation record.
• We must locate the top-of-stack pointer judiciously: a common approach is to
have it point to the end of fixed length fields.

102
Division of tasks between caller and callee

103
Calling sequence
• The caller evaluates the actual parameters.
• The caller stores a return address and the old value of top-sp into the
callee's activation record.
• The callee saves the register values and other status information.
• The callee initializes its local data and begins execution.

104
corresponding return sequence

• The callee places the return value next to the parameters.


• Using information in the machine-status field, the callee restores top-sp and
other registers, and then branches to the return address that the caller placed
in the status field.
• Although top-sp has been decremented, the caller knows where the return
value is, relative to the current value of top-sp; the caller therefore may use
that value.

105
Access to dynamically allocated arrays

106
Memory Manager
• Two basic functions:
  • Allocation
  • Deallocation
• Properties of memory managers:
  • Space efficiency
  • Program efficiency
  • Low overhead

107
