Scott 4e 01 Compilation
• Pure interpretation
– interpreter stays around for execution of
program
– interpreter is locus of control during execution
– interpreter implements virtual machine
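A minimal sketch of pure interpretation, assuming a hypothetical toy language (integers and + - * in postfix order, not anything from the text). The function name interpret and the stack limit are illustrative choices. The point is that the interpreter reads the source directly and remains in control for the whole run.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical toy language: integers and + - * in postfix order,
   separated by spaces, e.g. "3 4 + 2 *". */
#define STACK_MAX 64

long interpret(const char *src) {
    long stack[STACK_MAX];
    int top = 0;                      /* number of values on the stack */

    /* The interpreter, not the program, holds control: it reads the
       source text and carries out each operation as it is reached. */
    while (*src != '\0') {
        if (isspace((unsigned char)*src)) {
            src++;
        } else if (isdigit((unsigned char)*src)) {
            char *end;
            long value = strtol(src, &end, 10);
            assert(top < STACK_MAX);
            stack[top++] = value;
            src = end;
        } else {
            assert(top >= 2);         /* every operator needs two operands */
            long right = stack[--top];
            long left = stack[--top];
            switch (*src) {
            case '+': stack[top++] = left + right; break;
            case '-': stack[top++] = left - right; break;
            case '*': stack[top++] = left * right; break;
            default:  assert(!"unknown operator");
            }
            src++;
        }
    }
    assert(top == 1);                 /* a well-formed program leaves one result */
    return stack[0];
}
```

Calling interpret("3 4 + 2 *") evaluates the program text each time it is run; no translated form of the program is kept.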
Compilation vs. Interpretation
• Interpretation
– greater flexibility
– better error messages (e.g., good source-level
debugger)
– dynamically create code and then execute it
• Compilation
– better performance
• Most language implementations mix
compilation and interpretation
• Common case: compilation or preprocessing,
followed by interpretation
• Tools
An Overview of Compilation
• Phases of Compilation
• Lexical Analysis (Scanning)
– recognize regular language using DFA
– take input character stream
– divide program into "tokens", the smallest meaningful
units; saves time later (char-by-char processing is slow)
– recognize identifiers, constants, keywords,
operators
– produce token stream
– do simple tasks early to reduce complexity later
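The scanning steps above can be sketched as a small hand-written scanner. This is an illustration only: the name scan, the TOKEN_LEN bound, and the set of two-character operators are assumptions, and the if/else branches stand in informally for the states of a DFA rather than being generated from one.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define TOKEN_LEN 32   /* assumed upper bound on token length */

/* Split src into tokens; returns the number of tokens stored. */
int scan(const char *src, char tokens[][TOKEN_LEN], int max_tokens) {
    int count = 0;
    while (*src != '\0') {
        if (isspace((unsigned char)*src)) { src++; continue; }
        assert(count < max_tokens);
        const char *start = src;
        if (isalpha((unsigned char)*src) || *src == '_') {
            /* identifier or keyword: letter, then letters/digits */
            while (isalnum((unsigned char)*src) || *src == '_') src++;
        } else if (isdigit((unsigned char)*src)) {
            /* integer constant */
            while (isdigit((unsigned char)*src)) src++;
        } else if (strchr("!=<>", *src) != NULL && src[1] == '=') {
            src += 2;   /* two-character operator such as != */
        } else {
            src++;      /* single-character operator or punctuation */
        }
        size_t len = (size_t)(src - start);
        assert(len < TOKEN_LEN);
        memcpy(tokens[count], start, len);
        tokens[count][len] = '\0';
        count++;
    }
    return count;
}
```

Given the input "while (i != j) {", this sketch yields the seven tokens while ( i != j ) {. A real scanner would also classify each token (keyword, identifier, constant, operator) rather than only splitting the text.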
• Semantic analysis
– recognize context-sensitive aspects of syntax (often called
static semantics, though misnamed in the instructor’s opinion)
– build symbol table
– take concrete syntax (parse) tree
– check that the types of variables and expressions match
– produce abstract syntax tree or some other intermediate
form
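One way to picture the type-matching step is a recursive walk over an expression tree. The sketch below is an assumption-laden illustration: the Type tags, the Expr layout, and the rule that both operands must have exactly the same type are all hypothetical simplifications.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { T_INT, T_REAL, T_ERROR } Type;   /* hypothetical type tags */

typedef struct Expr {
    Type declared;              /* type of a leaf (variable or constant) */
    struct Expr *left, *right;  /* both NULL for a leaf */
} Expr;

/* Compute the type of an expression, reporting a mismatch as T_ERROR. */
Type check(const Expr *e) {
    assert(e != NULL);
    if (e->left == NULL && e->right == NULL)
        return e->declared;     /* leaf: type comes from its declaration */
    Type lt = check(e->left);
    Type rt = check(e->right);
    return (lt == rt) ? lt : T_ERROR;
}
```

In a fuller implementation the leaf types would come from the symbol table, and a mismatch would produce an error message rather than just a tag.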
• Machine-independent optimization
– take intermediate-code program, optionally produce
equivalent but “better” program (faster, smaller, etc.)
– improves code; the result is rarely truly optimal
– produce another intermediate form program
– examples: common subexpression elimination, copy
propagation, dead code elimination, loop optimizations,
in-line function calls, tail recursion optimization
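Two of the improvements listed above, common subexpression elimination and dead code elimination, can be shown as a hand-written before/after pair. The functions here are made up for illustration and written at source level; a compiler would perform the same rewriting on its intermediate form.

```c
#include <assert.h>

/* Before: what a naive translation might compute. */
int before(int a, int b) {
    int x = (a + b) * 2;
    int y = (a + b) * 3;   /* a + b is computed a second time */
    int unused = a * b;    /* result is never used: dead code */
    (void)unused;
    return x + y;
}

/* After: the common subexpression is computed once and the dead
   assignment is removed. */
int after(int a, int b) {
    int t = a + b;         /* common subexpression */
    return t * 2 + t * 3;
}

/* An improvement is only valid if behavior is preserved. */
void check_equivalent(int a, int b) {
    assert(before(a, b) == after(a, b));
}
```

The improved version does less work but returns the same value for every input, which is the "equivalent but better" requirement.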
• Code generation
– produce assembly language or relocatable machine
language from intermediate form and symbol table
– assign memory locations, registers, etc.
• Machine-specific optimization
– take output of code generation
– optionally improve using specific details of machine,
e.g., special instructions, addressing modes, coprocessors
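A rough sketch of code generation from an intermediate form, under heavy simplification: the target is a hypothetical stack machine with made-up mnemonics (push, add, sub, mul), so no registers or memory locations need to be assigned. The Node layout and the name gen are likewise assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct Node {
    char op;                    /* '+', '-', '*', or 0 for a variable leaf */
    const char *name;           /* variable name when op == 0 */
    struct Node *left, *right;
} Node;

/* Append code for the tree rooted at n to out (capacity cap), in
   postorder: operands first, then the operator. */
void gen(const Node *n, char *out, size_t cap) {
    assert(n != NULL && strlen(out) < cap);
    if (n->op == 0) {
        size_t used = strlen(out);
        snprintf(out + used, cap - used, "push %s\n", n->name);
        return;
    }
    gen(n->left, out, cap);
    gen(n->right, out, cap);

    const char *mnemonic = NULL;   /* made-up instruction names */
    switch (n->op) {
    case '+': mnemonic = "add"; break;
    case '-': mnemonic = "sub"; break;
    case '*': mnemonic = "mul"; break;
    }
    assert(mnemonic != NULL);
    size_t used = strlen(out);
    snprintf(out + used, cap - used, "%s\n", mnemonic);
}
```

For the tree representing i - j, this emits push i, push j, sub. A generator for a real machine would instead choose registers and addressing modes, which is where the machine-specific improvements come in.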
• Symbol table
– track information about identifiers throughout all phases
– may be (partially) retained to support debugging, error
recovery, reflection/metaprogramming
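A symbol table can be as simple as a list of names with attached information. The sketch below assumes a fixed-size array, linear search, and a single scope; the type tags and the names declare and lookup are hypothetical. Production compilers typically use hash tables and handle nested scopes.

```c
#include <assert.h>
#include <string.h>

#define MAX_SYMBOLS 64
#define NAME_LEN 32

typedef enum { TYPE_INT, TYPE_FUNC } SymType;   /* hypothetical type tags */

typedef struct {
    char name[NAME_LEN];
    SymType type;
} Symbol;

typedef struct {
    Symbol entries[MAX_SYMBOLS];
    int count;
} SymbolTable;

/* Return the entry for name, or NULL if it has not been declared. */
Symbol *lookup(SymbolTable *table, const char *name) {
    for (int i = 0; i < table->count; i++)
        if (strcmp(table->entries[i].name, name) == 0)
            return &table->entries[i];
    return NULL;
}

/* Record a declaration; returns 0 on success, -1 on redeclaration. */
int declare(SymbolTable *table, const char *name, SymType type) {
    if (lookup(table, name) != NULL) return -1;
    assert(table->count < MAX_SYMBOLS && strlen(name) < NAME_LEN);
    Symbol *sym = &table->entries[table->count++];
    strcpy(sym->name, name);
    sym->type = type;
    return 0;
}
```

Later phases call lookup whenever they meet an identifier, for example to find the type of i while checking the expression i != j.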
• Lexical and Syntax Analysis: GCD
program (in C)
/* getint() and putint() are assumed to be I/O routines supplied elsewhere */
int main() {
    int i = getint(), j = getint();
    while (i != j) {
        if (i > j) i = i - j;
        else j = j - i;
    }
    putint(i);
}
• Lexical and Syntax Analysis: GCD program
tokens
– Scanning (lexical analysis) groups characters into
tokens; parsing recognizes the structure of the program
int main ( ) {
int i = getint ( ) , j = getint ( ) ;
while ( i != j ) {
if ( i > j ) i = i - j ;
else j = j - i ;
}
putint ( i ) ;
}
• Context-Free Grammar and Parsing (figures not reproduced)
• Syntax Tree: GCD Program Parse Tree