Compilers & Interpreters

1. Phases of Compilation
[Figure: Phases of a compiler]
Source Program
  → Lexical Analysis
  → Syntax Analysis
  → Semantic Analysis
  → Intermediate Code Generation
  → Code Optimization
  → Code Generation
  → Target Program
(The Symbol Table Manager and the Error Handler interact with every phase.)
Example:
position := initial + rate * 60

Lexical Analysis produces the tokens:
id1 := id2 + id3 * 60

Syntax Analysis builds the tree:
        :=
       /  \
     id1    +
           / \
         id2   *
              / \
           id3   60
Example contd.:

Semantic Analysis inserts an int-to-real conversion for the constant 60:
        :=
       /  \
     id1    +
           / \
         id2   *
              / \
           id3   inttoreal
                     |
                     60
Intermediate Code Generation:
Temp1 := inttoreal(60)
Temp2 := id3 * Temp1
Temp3 := id2 + Temp2
id1   := Temp3
Example contd.:

Code Optimization:
Temp1 := id3 * 60.0
id1   := id2 + Temp1

Code Generation:
MOVE id3, R2
MULR 60.0, R2
MOVE id2, R1
ADDR R2, R1
MOVE R1, id1
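As a rough illustration of the first phase above, a minimal tokenizer for the example statement might look like this (Python is used only for illustration; the token names and patterns are our own assumptions, not the book's):

```python
import re

# Token patterns for a tiny assignment language (illustrative names only).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ASSIGN", r":="),
    ("ID",     r"[A-Za-z_]\w*"),
    ("OP",     r"[+\-*/]"),
    ("SKIP",   r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))

def tokenize(src):
    """Return (kind, lexeme) pairs -- the output of lexical analysis."""
    tokens = []
    for m in MASTER.finditer(src):
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
    return tokens

print(tokenize("position := initial + rate * 60"))
```

In a real compiler each identifier would also be entered into the symbol table, which is how `position`, `initial`, `rate` become id1, id2, id3 in the later phases.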
• Each phase transforms the source program from one representation to another.
• The typical decomposition of a compiler into phases results in the conversion of the source program into the target program.
5. Code Optimization:
• Attempts to improve the intermediate code.
• Results in faster-running machine code.
• Optimization improves the running time of the target program.
• But it must not unduly slow down compilation.
6. Code Generation:
• Final phase of the compiler, which generates the target code.
• The target code consists of relocatable machine code or assembly code.
• Here memory locations are selected for each variable used in the program.
Aspects of Compilation

[Figure: a compiler/interpreter bridges the semantic gap between the PL domain and the execution domain.]
• Two aspects:
  – Generate code.
  – Provide diagnostics.
• To understand the implementation issues, we should know the PL features contributing to the semantic gap between the PL domain and the execution domain.
• PL Features:
  1. Data Types
  2. Data Structures
  3. Scope Rules
  4. Control Structures
1. Data Types
• Definition: A data type is the specification of
  (i) legal values for variables of the type, and
  (ii) legal operations on the legal values of the type.
• Tasks:
  1. Check the legality of an operation for the types of its operands.
  2. Use type conversion operations wherever necessary & permissible.
var
  x, y : real;
  i, j : integer;
begin
  y := 10;
  x := y + i;
Here type conversion of i is needed.

For i : integer; a, b : real; the statement a := b + i; is compiled into code such as:
  CONV_R AREG, I
  ADD_R  AREG, B
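The two tasks above — legality checking and implicit conversion — can be sketched as follows; the rules table and function name are illustrative assumptions, not the book's notation:

```python
# Illustrative type checker for a binary '+': decides the result type
# and records an int-to-real conversion where permissible.
CONVERTIBLE = {("integer", "real"), ("real", "integer")}

def check_add(t_left, t_right):
    """Return (result_type, conversions) or raise on an illegal mix."""
    if t_left == t_right:
        return t_left, []
    if (t_left, t_right) in CONVERTIBLE:
        conv = (["convert left to real"] if t_left == "integer"
                else ["convert right to real"])
        return "real", conv
    raise TypeError(f"cannot add {t_left} and {t_right}")

print(check_add("real", "integer"))   # the x := y + i case above
```

For `x := y + i` the checker reports result type real and an inserted conversion of `i`, which is exactly what the `CONV_R` instruction implements.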
2. Data Structures
• A PL permits the declaration of data structures (DS) and their use.
• To compile a reference to an element of a DS, the compiler must develop a memory mapping to access the allocated area.
• A record, a heterogeneous DS, leads to complex memory mapping.
• User-defined DSs require mappings of different kinds.
• A proper combination of mappings is required to manage such structural complexity.
• Two kinds of mapping are involved.
Example:
Program example (input, output);
type
  employee = record
    name : array [1..10] of character;
    sex  : character;
    id   : integer
  end;
  weekday = (mon, tue, wed, thur, fri, sat, sun);
var
  info  : array [1..500] of employee;
  today : weekday;
  i, j  : integer;
begin {main program}
  today := mon;
  info[i].id := j;
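The memory mapping the compiler must develop for `info[i].id` amounts to an offset computation; the sizes below are illustrative assumptions (1 byte per character, 2 bytes per integer, no padding), not fixed by the book:

```python
# Illustrative address mapping for info[i].id.
NAME_SIZE = 10 * 1                 # name : array [1..10] of character
SEX_SIZE  = 1                      # sex  : character
ID_OFFSET = NAME_SIZE + SEX_SIZE   # offset of the id field inside a record
RECORD_SIZE = ID_OFFSET + 2        # total size of one employee record

def address_of_id(base, i):
    """Address of info[i].id for a 1-based array starting at `base`."""
    return base + (i - 1) * RECORD_SIZE + ID_OFFSET

print(address_of_id(1000, 1))
print(address_of_id(1000, 3))
```

The two mappings involved are visible here: the array mapping `(i - 1) * RECORD_SIZE` and the record mapping `ID_OFFSET`.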
3. Scope Rules
• Determine the accessibility of variables declared in different blocks of a program.
• Eg: an outer block A declares x, y : real; a nested block B declares y, z : integer;
• The statement x := y in block B uses the y of block B and the x of block A.
• To determine the accessibility of a variable, the compiler performs:
  – Scope Analysis
  – Name Resolution
4. Control Structures
• Def: A control structure is the collection of language features for altering the flow of control during execution.
• This includes:
  – Conditional transfer of control
  – Conditional execution
  – Iterative control
  – Procedure calls
• The compiler must ensure non-violation of program semantics.
• Eg:
    for i := 1 to 100 do
    begin
      lab1 : if i = 10 then
      ……….
      ……….
    end
• Forbidden: transferring control to lab1 from outside the loop.
• Assignments to the loop control variable i are also not allowed.
Memory Allocation
• Three important tasks:
  – Determine memory requirement: to represent the values of data items.
  – Determine memory allocation: to implement the lifetime & scope of data items.
  – Determine memory mapping: to access values in non-scalar data items.
• Binding: A memory binding is an association between the memory address attribute of a data item & the address of a memory area.
• Topic list:
  A. Static and dynamic memory allocation
  B. Memory allocation in block-structured languages
  C. Array allocation & access
[A] Static & Dynamic Allocation

Static Memory Allocation:
1. Memory is allocated to a variable before execution of the program begins.
2. Performed during compilation.
3. At execution no allocation & de-allocation is performed.
4. A variable remains permanently allocated until execution ends.
5. Eg: FORTRAN.
6. No flavors or types.
7. Advantages: simplicity and faster access.
8. More memory wastage compared to dynamic allocation.

Dynamic Memory Allocation:
1. Memory allocation for a variable is done at the time of execution of the program.
2. Performed during execution.
3. Allocation & de-allocation actions occur during execution.
4. Variables swap between the allocated state & the free state.
5. Eg: PL/I, ADA, Pascal.
6. Two types/flavors: 1. automatic allocation, 2. program-controlled allocation.
7. Advantages: supports recursion and dynamically sized DS.
8. Less memory wastage compared to static allocation.

[Figure: Under static allocation, Code(A), Code(B), Code(C) and Data(A), Data(B), Data(C) are all resident throughout. Under dynamic allocation only the data of active units is allocated — Data(A) when only A is active; Data(A) and Data(B) when A & B are active.]
Automatic v/s Program-Controlled Dynamic Allocation

Automatic Dynamic Allocation:
1. Implies memory binding performed at the execution initiation time of a program unit.
2. Memory is allocated to declared variables when execution of the unit starts, purely based on scope.
3. De-allocated when the program unit is exited.
4. Different memory areas may be allocated to the same variable in different activations of the program unit.
5. Implemented using a stack, since entry and exit of program units follow a last-in, first-out order.

Program-Controlled Dynamic Allocation:
1. Implies memory binding performed during the execution of a program unit.
2. Memory is allocated not at execution initiation but when the variable is used for the very first time during execution.
3. De-allocated at arbitrary points during execution.
[B] Memory Allocation in Block-Structured Languages
• A block contains data declarations.
• There may be a nested structure of blocks.
• Block-structured languages use dynamic memory allocation.
• Eg: PL/I, Pascal, ADA
• Sub-topic list:
  1. Scope rules
  2. Memory allocation & access
  3. Accessing non-local variables
  4. Symbol table requirements
  5. Recursion
  6. Limitations of stack-based memory allocation
1. Scope Rules:
• Suppose a variable vari is created with name namei in block B.
• Rule 1: vari can be accessed in any statement situated in block B.
• Rule 2: vari can also be accessed in any statement situated in a block B' enclosed in B, unless B' contains a declaration using the same name namei.
• For a block, variables satisfying Rule 1 are local; variables satisfying Rule 2 are non-local.
• Eg: block A declares x, y, z : integer; nested block B declares g : real; block C (nested in B) declares h, z : real; block D declares i, j.

Block | Local variables | Non-local variables
A     | xA, yA, zA      | —
B     | gB              | xA, yA, zA
C     | hC, zC          | xA, yA, gB
D     | iD, jD          | xA, yA, zA
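The two rules can be sketched as a name lookup that walks outward through enclosing blocks (a hypothetical helper, not the book's algorithm):

```python
# Illustrative name resolution: each block has a name->variable dict
# and a link to its enclosing block (None for the outermost block).
class Block:
    def __init__(self, name, enclosing=None):
        self.name = name
        self.enclosing = enclosing
        self.decls = {}

def resolve(block, name):
    """Rule 1: look in the block itself; Rule 2: walk the enclosing blocks."""
    b = block
    while b is not None:
        if name in b.decls:
            return b.decls[name]
        b = b.enclosing
    raise NameError(name)

A = Block("A"); A.decls = {"x": "xA", "y": "yA", "z": "zA"}
B = Block("B", A); B.decls = {"g": "gB"}
C = Block("C", B); C.decls = {"h": "hC", "z": "zC"}

print(resolve(C, "z"))   # zC -- C's own z hides A's z
print(resolve(C, "x"))   # xA -- found by walking out to A
```

This reproduces the table above: from C, `z` resolves to zC (the redeclaration hides zA), while `x` resolves to the non-local xA.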
2. Memory Allocation & Access
• Automatic dynamic allocation is implemented using the extended stack model.
• Minor variation: each activation record has two reserved pointers, determining its scope.
• AR: Activation Record
• ARB: Activation Record Base
• See the following figure:
[Figure: Nested blocks A (x : real), B (y : char), C (z, w : integer); C contains statements such as z := 10; x := z. Each AR on the stack begins with two reserved pointers, 0(ARB) and 1(ARB); the stack holds AR of A (x), AR of B (y), AR of C (z, w), with TOS pointing at the stack top.]
• Allocation:
  1. TOS := TOS + 1;
  2. TOS* := ARB;
  3. ARB := TOS;
  4. TOS := TOS + 1;
  5. TOS* := ……….. (set the second reserved pointer)
  6. TOS := TOS + n;
• So the address of ‘z’ is <ARB> + dz.
• De-allocation:
  1. TOS := ARB – 1;
  2. ARB := ARB* ;
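Under stated assumptions (a flat Python list as the stack, list indices as addresses), the allocation and de-allocation steps above can be sketched as:

```python
# Illustrative extended stack model: ARB and TOS are indices into `stack`.
stack = [None] * 100
ARB, TOS = 0, 0

def enter_block(n_vars, static_ptr=None):
    """Allocation steps 1-6: push dynamic pointer, static pointer, n variables."""
    global ARB, TOS
    TOS += 1
    stack[TOS] = ARB          # first reserved pointer 0(ARB): old ARB
    ARB = TOS
    TOS += 1
    stack[TOS] = static_ptr   # second reserved pointer 1(ARB)
    TOS += n_vars             # reserve space for the block's variables

def exit_block():
    """De-allocation: pop the whole AR, restore the enclosing ARB."""
    global ARB, TOS
    TOS = ARB - 1
    ARB = stack[ARB]

enter_block(1)        # block A: x
enter_block(1)        # block B: y
enter_block(2)        # block C: z, w  -> address of z is ARB + 2
print(ARB, TOS)
exit_block()
print(ARB, TOS)
```

After entering C, the first variable z sits at ARB + 2 (just past the two reserved pointers), matching the <ARB> + dz addressing above.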
3. Accessing Non-Local Variables
• Notation:
  – nl_var : a non-local variable
  – b_def : the block in which nl_var is defined
  – b_use : the block using nl_var
• A textual ancestor of block b_use is a block which encloses b_use; here b_def is a textual ancestor of b_use.
• A level-m ancestor is the block which immediately encloses the level-(m–1) ancestor.
• s_nest_b_use : static nesting level of block b_use.
• Rule: when b_use is in execution, b_def must be active, i.e. AR(b_def) exists in the stack while AR(b_use) is being executed.
• nl_var is accessed via the start address of AR(b_def) + d(nl_var).
(i) Static Pointer:
• Access to non-local variables is implemented using the second reserved pointer in the AR.
• 0(ARB) is the dynamic pointer; 1(ARB) is the static pointer.
• At the time of creation of the AR for block B, its static pointer is set to point to the AR of the static ancestor of B.
• Access of a non-local variable at level difference m:
  1. r := ARB;
  2. Repeat step 3 m times:
  3.   r := 1(r);
  4. Access nl_var using the address <r> + d(nl_var).
• Example (status after execution):
  – When the AR is created: TOS := TOS + 1; TOS* := address of the AR of the level-1 ancestor.
  – To access x at the statement x := z (level difference 2):
      r := ARB;
      r := 1(r);
      r := 1(r);
[Figure: Stack of ARs for blocks A (x), B (y), C (z, w); each AR's dynamic pointer links to the previous AR and its static pointer links to the AR of its static ancestor; TOS marks the stack top.]
(ii) Displays:
• For a large level difference, it is expensive to access non-local variables using static pointers.
• A display is an array used to improve the efficiency of non-local variable access.
• For a block B at static nesting level s_nestB:
  – Display[1] = address of the AR of the level-(s_nestB – 1) ancestor of B.
  – Display[2] = address of the AR of the level-(s_nestB – 2) ancestor of B.
  – …
  – Display[s_nestB – 1] = address of the AR of the level-1 ancestor of B.
  – Display[s_nestB] = address of the AR of B itself.
[Figure: Stack of ARs for blocks A (x), B (y), C (z, w), with display entries #1, #2, #3 pointing at the three ARs; TOS marks the stack top.]
4. Symbol Table Requirements:
• To support dynamic allocation & access, the compiler should perform the following tasks:
  – Determine the static nesting level of b_current.
  – Determine the variable designated by the scope rules.
  – Determine the static nesting level of its defining block & its displacement dv.
  – Generate code.
• The extended stack model is used because it has:
  – The nesting level of b_current.
  – The symbol table for b_current.
• The symbol table records the nesting level and (symbol, displacement) pairs.

[Figure: ARs for blocks A, B, C at nesting levels 1, 2, 3; entries such as X|2, Y|2, Z|2, W|3 record each symbol with its displacement; TOS marks the stack top.]
5. Recursion:
• The extended stack model is well suited to recursion.
• See the program and figure in the book, pg. nos. 175 and 176.
[A] Optimizing Transformations:
• An optimizing transformation is a rule for ‘re-writing’ a segment of a program (its IR).
• It improves execution efficiency without affecting the program's meaning.
• Types:
  – Local (small segments)
  – Global (entire program segments)
• What is the need for this classification?
  – The reason is the difference in cost and benefits of the transformations.
  – What is needed is provided; not less, not more.
• Let’s see a few optimizing transformations:
a) Compile-Time Evaluation
• Certain actions are performed at compile time, certain at execution time.
• This redistribution improves execution efficiency, as certain actions are eliminated from execution.
• This is called ‘constant folding’.
• Constant folding:
  – When all operands of an operation are constants,
  – the operation is performed at compilation time;
  – the result is also a constant,
  – which can replace the original evaluation.
• Thus we eliminate the division operation at execution time.
• Eg: a = 3.14157/2 is replaced by a = 1.570785
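A minimal constant folder over a small expression tree might look like this (the tuple-based tree shape is our own illustration):

```python
import operator

# Expressions are nested tuples (op, left, right), numbers, or variable names.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def fold(expr):
    """Replace any operation whose operands are all constants by its value."""
    if not isinstance(expr, tuple):
        return expr                      # a constant or a variable name
    op, left, right = expr
    left, right = fold(left), fold(right)
    if isinstance(left, (int, float)) and isinstance(right, (int, float)):
        return OPS[op](left, right)      # evaluate at 'compile time'
    return (op, left, right)

print(fold(("/", 3.14157, 2)))           # the a = 3.14157/2 case
print(fold(("+", "a", ("*", 2, 30))))    # only the constant part folds
```

Note how the second call folds the inner `2 * 30` while leaving the variable `a` alone — folding applies only where all operands are constants.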
b) Elimination of Common Sub-Expressions
• See ex. 6.31.
• Common sub-expressions are occurrences of expressions yielding the same value, called ‘equivalent expressions’.
• Eg: the second occurrence of b*c can be eliminated.
• They are identified by using triples & quadruples.
• Some compilers use rules of ‘algebraic equivalence’ in common sub-expression elimination.
c) Dead Code
• Code which can be omitted from a program without affecting its results is called dead code.
• Now the question is how to detect whether code is dead or not:
• By checking whether the value assigned in an assignment statement is used anywhere in the program.
• If not, it is dead code.
• Eg: x := <exp> constitutes dead code if the value assigned to x is not used anywhere in the program.
• i.e. the expression constitutes dead code only if it does not produce any side effects.
d) Frequency Reduction
• Eg: 6.33.
• Computations that are independent of the ‘for loop’ are called loop invariant.
• Here, x is a loop invariant, which is moved out of the loop to perform frequency reduction.
• y is indirectly dependent on the loop (through z, i), so frequency reduction is not possible for it.
• Thus, the loop-optimization transformation moves loop-invariant code out of the loop.
e) Strength Reduction
• Strength reduction replaces the occurrence of a time-consuming (‘high strength’) operation by an occurrence of a faster (‘low strength’) operation.
• Example 6.34.
• Here, we are replacing a multiplication operation with addition.
• Beneficial in array references.
• Disadvantage: not recommended for floating-point operands, since it does not guarantee equivalence of results.
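The classic case — computing array-element addresses in a loop — replaces `i * size` by a running sum. A before/after sketch (loop bounds and names are illustrative):

```python
# Before: a multiplication on every iteration (high-strength operation).
def addresses_mul(base, size, n):
    return [base + i * size for i in range(n)]

# After strength reduction: repeated addition (low-strength operation).
def addresses_add(base, size, n):
    out, addr = [], base
    for _ in range(n):
        out.append(addr)
        addr += size          # '+' replaces '*'
    return out

print(addresses_mul(1000, 4, 5))
print(addresses_add(1000, 4, 5))   # same result for integer operands
```

With integers the two versions agree exactly; with floating-point operands the accumulated additions may round differently from the multiplications, which is why the transformation is not recommended there.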
f) Local & Global Optimization
• Two phases:
  – Local
  – Global
• Local optimization: applied over small segments consisting of a few statements.
• Global optimization: applied over a program unit, i.e. a function or procedure.
• Local optimization is a preparatory phase for global optimization, and simplifies certain aspects of it.
• Eg: global optimization need eliminate only the first occurrence of a+b; all other occurrences are eliminated automatically by local optimization.
[B] Local Optimization
• Provides limited benefits at low cost.
• Scope? A basic block, which is an essentially sequential segment of a program.
• Why is the cost low?
  – Sequential nature of a basic block
  – Simplified analysis
  – Applied to one basic block at a time
• Limitation? Loop optimization is beyond the scope of local optimization.
• See the definition of a basic block in the book, pg. no. 203: it has a single entry point and is essentially sequential.
Value Numbers:
• Provide a simple means to determine whether two occurrences of an expression in a basic block are equivalent.
• This technique is applied while identifying basic blocks.
• Condition for equivalence: two expressions ei and ej are equivalent if they are congruent and their operands have the same value numbers.
• See eg. 6.35 and 6.36, pg. 204.
• The value number of a variable starts at 0 and changes on each assignment to it.
• Value numbers are considered only when an operation is to be performed on variables.
• A flag checks whether a value needs to be stored in a temporary location.
• This scheme can be extended to implement ‘constant propagation’.
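A sketch of value numbering over a basic block (the quadruple format and helper names are our own, not the book's):

```python
# Illustrative value numbering over quadruples (op, arg1, arg2, result).
# Two expressions are equivalent if they are congruent (same op) and
# their operands carry the same value numbers.
def value_number(quads):
    var_vn = {}          # current value number of each variable
    next_vn = [0]
    expr_table = {}      # (op, vn1, vn2) -> variable holding that value
    redundant = []

    def vn(operand):
        if isinstance(operand, str):
            if operand not in var_vn:
                var_vn[operand] = next_vn[0]; next_vn[0] += 1
            return ("var", var_vn[operand])
        return ("const", operand)

    for op, a1, a2, res in quads:
        key = (op, vn(a1), vn(a2))
        if key in expr_table:
            redundant.append((res, expr_table[key]))  # reuse earlier value
        else:
            expr_table[key] = res
        var_vn[res] = next_vn[0]; next_vn[0] += 1     # res gets a fresh value number
    return redundant

quads = [("*", "b", "c", "t1"),
         ("*", "b", "c", "t2")]      # second b*c is redundant
print(value_number(quads))
```

Because every assignment gives the target a fresh value number, an intervening assignment to `b` or `c` automatically prevents the false reuse of an old `b*c`.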
[C] Global Optimization
• Requires more analytical effort to establish the feasibility of an optimization.
• Global common sub-expression elimination is done here.
• An occurrence of x*y in block bj can be eliminated if it satisfies two conditions:
  1. Basic block bj is executed only after some block bk ϵ SB has been executed one or more times (ensures x*y is evaluated before bj).
  2. No assignment to x or y has been executed after the last (or only) evaluation of x*y preceding block bj.
• x*y is saved to a temporary location in all blocks bk satisfying condition 1.
• Requirement? Ensure that every possible execution of the program satisfies both conditions.
• How would we do this? By analysing the program using two techniques:
  – Control Flow Analysis
  – Data Flow Analysis
PFG: Program Flow Graph
• Def: A PFG for a program P is a directed graph GP = (N, E, n0), where
  – N : the set of basic blocks
  – E : the set of directed edges (bi, bj) indicating the possibility of control flow from the last statement of bi (source node) to the first statement of bj (destination node)
  – n0 : the start node of the program.
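A PFG can be represented directly as an adjacency mapping; the blocks and edges below are an illustrative example, not one taken from the book:

```python
# Illustrative program flow graph G = (N, E, n0) as an adjacency dict.
pfg = {
    "b1": ["b2"],          # n0 = b1
    "b2": ["b3", "b4"],    # a conditional branch
    "b3": ["b2"],          # loop back edge
    "b4": [],              # exit block
}

def successors(b):
    return pfg[b]

def predecessors(b):
    return [src for src, dests in pfg.items() if b in dests]

print(successors("b2"))
print(predecessors("b2"))
```

Successor and predecessor queries like these are the building blocks of the control flow analysis discussed next.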
Control & Data Flow Analysis
• Control & data flow analysis are used to determine whether the conditions governing an optimizing transformation are satisfied.
1. Control Flow Analysis: collects information concerning the program's structure, i.e. nesting of loops.
• A few concepts:
  – Predecessors & successors: if (bi, bj) ϵ E, bi is a predecessor of bj & bj is a successor of bi.
  – Paths: a path is a sequence of edges such that the destination node of one edge is the source node of the following edge.
  – Ancestors & descendants: if a path exists from bi to bj, bi is an ancestor of bj and bj is a descendant of bi.
  – Dominators & post-dominators: block bi is a dominator of block bj if every path from n0 to bj passes through bi; bi is a post-dominator of bj if every path from bj to an exit node passes through bi.
2. Data Flow Analysis:
• Analyses the use of data in the program.
• Data flow information is computed, for the purpose of optimization, at the entry & exit of each basic block.
• Determines whether an optimizing transformation can be applied.
• Concepts:
  – Available expressions
  – Live variables
  – Reaching definitions
• Uses:
  – Common sub-expression elimination
  – Dead code elimination
  – Constant/variable propagation
Available Expressions:
• An occurrence of a global common sub-expression in bi can be eliminated only if:
  1. Conditions 1 & 2 are satisfied at the entry of bi, and
  2. No assignment to x or y precedes the occurrence of x*y in bi.
• How do we determine the availability of an expression at the entry or exit of basic block bi?
• Rules:
  1. Expression e is available at the exit of bi if
     (i) bi contains an evaluation of e not followed by an assignment to any operand of e, or
     (ii) the value of e is available at the entry to bi & bi does not contain an assignment to any operand of e.
  2. Expression e is available at the entry to bi if it is available at the exit of each predecessor of bi in GP.
• It is a forward data flow concept: availability at the exit of a node determines availability at the entry of its successors.
• We associate two Boolean properties with block bi, called the ‘local properties’ of bi, to capture the effect of its computations:
  – Evali : ‘true’ if e is evaluated in bi and the operands of e are not modified afterwards.
  – Modifyi : ‘true’ if an operand of e is modified in bi.
• See the equations on page 209 and the PFG of Example 6.38 on page 210.
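Using the rules above — entry availability is the AND over predecessors' exit availability, and exit availability is Eval or (entry availability and not Modify) — a round-robin fixed-point solver can be sketched as follows; the four-block graph is an illustrative assumption:

```python
# Illustrative available-expression analysis for a single expression e.
# avail_out[b] = eval_[b] or (avail_in[b] and not modify[b])
# avail_in[b]  = AND of avail_out over all predecessors (False for the entry node).
def available(preds, eval_, modify):
    blocks = list(preds)
    avail_in = {b: False for b in blocks}
    avail_out = {b: eval_[b] for b in blocks}
    changed = True
    while changed:                       # iterate to a fixed point
        changed = False
        for b in blocks:
            ain = all(avail_out[p] for p in preds[b]) if preds[b] else False
            aout = eval_[b] or (ain and not modify[b])
            if (ain, aout) != (avail_in[b], avail_out[b]):
                avail_in[b], avail_out[b] = ain, aout
                changed = True
    return avail_in

preds  = {"b1": [], "b2": ["b1"], "b3": ["b1"], "b4": ["b2", "b3"]}
eval_  = {"b1": True,  "b2": False, "b3": True,  "b4": False}
modify = {"b1": False, "b2": True,  "b3": False, "b4": False}
print(available(preds, eval_, modify))
```

Here e is available at the entry of b2 and b3 (both follow b1, which evaluates it), but not at b4: the b2 path modifies an operand, and availability demands "all paths".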
Live Variables:
• A variable var is said to be live at a program point pi in basic block bi if the value contained in it at pi is likely to be used during subsequent execution of the program.
• Otherwise, it is dead, and the code assigning to it can be eliminated as dead code.
• How to determine liveness? By the two properties specified on pg. no. 211.
• See the data flow information again on pg. no. 211.
• Liveness at the entry of a block determines liveness at the exit of its predecessors.
• Hence it is called a ‘backward data flow’ concept.
• It is also called an ‘any path’ concept. Why? Liveness at the entry of any one successor is sufficient to ensure liveness at the exit of a block.
• Eg:
  – a is live at the entry of each block except the fourth block.
  – b is live at all blocks.
  – x & y are live at blocks 1, 5, 6, 7, 8, 9.
INTERPRETERS

Interpreters
• Topic list:
  – Interpreters – use
  – Overview of interpreters
  – Toy interpreter
  – Pure & impure interpreters
Interpreters: Introduction
• Avoid the overhead of compilation.
• A program that is modified before every execution is handled well by interpreters.
• Disadvantage: expensive in terms of CPU time.
• Why? Each statement is subjected to the interpretation cycle:
  – Fetch the statement.
  – Analyse the statement & determine its meaning.
  – Execute the meaning of the statement.
• What is the difference between a compiler and an interpreter?
Compiler v/s Interpreter

Compiler:
• Next phase: during compilation, analysis of a statement is followed by code generation.
• .exe: the compiler converts the program into an executable only once.
• Development: one-time infrastructure development.
• Rate of development: slow.
• Access rate: faster access at a later stage.

Interpreter:
• During interpretation, analysis is followed by actions which implement the statement.
• We can never run the program without interpretation.
• Repetitive development.
• Rate of development: faster.
Compiler:
• Best for: static languages.
• Required: the compiler is required only once.
• Cost: proves cheaper in the longer run.
• Loading: the compiler is loaded only the first time.

Interpreter:
• Best for: dynamic languages.
• The interpreter is required each time the program is executed.
• Proves costly in the longer run.
• The interpreter is needed at each load.
Interpreter: Introduction
• Notation:
  – Tc : average compilation time per statement
  – Te : average execution time per statement
  – Ti : average interpretation time per statement
• Here we assume Ti ≈ Tc, and Tc = 20 · Te (so Te = Tc / 20).
• sizeP = number of statements in program P.
Example:
• Let
  – sizeP = 200,
  – 20 statements are executed for initialization,
  – a loop of 8 statements iterates 10 times,
  – the loop is followed by 20 statements for printing the result.
• Then stmts_executedP = 20 + (10 * 8) + 20 = 120.
• Total execution time:
  – For the compilation model:
    200 · Tc + 120 · Te = 200 · Tc + 6 · Tc = 206 · Tc
  – For the interpretation model:
    120 · Ti = 120 · Tc
• Conclusion: here interpretation is better than compilation as far as total time is concerned.
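The comparison above can be checked numerically; taking Tc = Ti = 1 time unit and Te = Tc/20 as assumed:

```python
# Illustrative cost comparison of the two execution models.
tc = ti = 1.0          # compilation time ~ interpretation time per statement
te = tc / 20           # compiled code executes much faster per statement

size_p = 200
stmts_executed = 20 + 10 * 8 + 20      # = 120

compile_model   = size_p * tc + stmts_executed * te   # compile all, then run
interpret_model = stmts_executed * ti                 # pay only per executed stmt

print(compile_model, interpret_model)
```

The compilation model pays Tc for every one of the 200 statements even though only 120 statement-executions occur, which is why interpretation wins in this scenario.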
Why Use an Interpreter?
• Simplicity.
• Efficiency & certain environmental benefits.
• Recommended when the program requires modification between runs, and when stmts_executedP is less than sizeP.
• Preferred during program development.
• Also when programs are not executed frequently / repeatedly.
Components:
• 1. Symbol Table: holds information concerning the entities present in the program.
• 2. Data Store: contains the values of the data items declared.
• 3. Data Manipulation Routines: a set containing a routine for every legal data manipulation action in the source language.
• Advantages:
  – The meaning of a source statement is implemented through an interpretation routine, which results in a simplified implementation.
  – Avoids generation of machine language instructions.
  – Helps make interpretation portable, since the interpreter itself can be coded in a high-level language.
A Toy Interpreter:
• Steps (for a := b + c, with b real and c integer):
  – ivar[5] = 7;
  – rvar[13] = 1.2;
  – add is called (for a = b + c);
  – addrealint is called;
  – rvar[r_tos] = rvar[13] + ivar[5];
  – the type conversion is made by the interpreter,
  – i.e. rvar[addr1] + ivar[addr2];
  – end.
• See the program of the interpreter on pg. no. 125.
• See the example for the above steps on pg. no. 126.
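A toy interpreter with the three components above — symbol table, data stores ivar/rvar, and data-manipulation routines — could be sketched as follows; the routine mirrors the steps above, but the code itself is our own illustration, not the book's program:

```python
# Illustrative toy interpreter: symbol table + data stores + a routine.
ivar = [0] * 20           # data store for integer variables
rvar = [0.0] * 20         # data store for real variables
symtab = {                # symbol table: name -> (type, address)
    "a": ("real", 14), "b": ("real", 13), "c": ("int", 5),
}

def add(dest, op1, op2):
    """Data manipulation routine for '+', with implicit int-to-real conversion."""
    t1, a1 = symtab[op1]
    t2, a2 = symtab[op2]
    v1 = rvar[a1] if t1 == "real" else ivar[a1]
    v2 = rvar[a2] if t2 == "real" else ivar[a2]
    td, ad = symtab[dest]
    if td == "real":
        rvar[ad] = float(v1 + v2)     # type conversion made by the interpreter
    else:
        ivar[ad] = int(v1 + v2)

ivar[5] = 7               # c
rvar[13] = 1.2            # b
add("a", "b", "c")        # interpret a := b + c
print(rvar[14])           # the interpreted result of a := b + c
```

Note that no machine code is generated anywhere: the meaning of `a := b + c` is carried out directly by the `add` routine, which is the simplification the slides describe.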
Pure & Impure Interpreters:
• Pure Interpreters:
  – Here the source program is retained in source form all through interpretation.
  – See fig. 6.34 (a), pg. no. 217.
  – Disadvantage: this arrangement incurs substantial analysis overheads while interpreting a statement.
• Impure Interpreters:
  – See fig. 6.34 (b), pg. no. 217.
  – Perform some preliminary processing of the source program to reduce the analysis overhead during interpretation.
  – A pre-processor converts the program to an IR (intermediate code, IC) which is used during interpretation.
  – See Ex. 6.42 and Ex. 6.43: IC in postfix notation.
  – This eliminates most of the analysis during interpretation except type analysis; for type analysis the pre-processor is needed.
  – The IC can be analysed more efficiently than the source program; this speeds up interpretation.
  – Disadvantage: the use of an IR implies that the entire program has to be pre-processed after any modification.
  – Thus it incurs a fixed overhead at the start of interpretation.
6th Chapter Ends Here.
• Please start preparing.
• You will not get 1 month of study leave.
• Start preparation now.
• Every day complete 1 chapter.
• Chapters 1, 3, 4 from class work.
• Chapters 5, 6, 7, 8 and Unit 6 from slides.
• From tomorrow we will start chapter 7.