Scott 4e 01 Compilation

Uploaded by

Hany Atlam

Chapter 1 :: Introduction

Programming Language Pragmatics, Fourth Edition


Michael L. Scott

Copyright © 2016 Elsevier


Introduction

• Selected excerpt from Chapter 1 of Scott’s textbook slides
• Modified by H. Conrad Cunningham, Professor, Computer and
Information Science, University of Mississippi
• Included ideas from Mitchell’s textbook and other sources
Compilation vs. Interpretation
• Compilation vs. interpretation
– not opposites
– no absolute distinction
Compilation vs. Interpretation
• Pure compilation
– compiler translates source program into
equivalent target program, then goes away
– often high-level language (source code)
translated to machine language (object code)
– OS later executes target program on machine
– target program is locus of control
Compilation vs. Interpretation

• Pure interpretation
– interpreter stays around for execution of
program
– interpreter is locus of control during execution
– interpreter implements virtual machine
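
To make "interpreter is locus of control" concrete, here is a minimal sketch of an interpreter for a tiny stack-based virtual machine. The instruction set (`OP_PUSH`, `OP_ADD`, `OP_SUB`, `OP_HALT`) and the function `run` are hypothetical names invented for this illustration; they are not from the textbook. Bounds checks are omitted to keep it short.

```c
#include <stddef.h>

/* Hypothetical instruction set for a tiny stack-based virtual machine. */
enum opcode { OP_PUSH, OP_ADD, OP_SUB, OP_HALT };

struct instr {
    enum opcode op;
    int operand;                /* used only by OP_PUSH */
};

/* Interpret `code` until OP_HALT and return the value on top of the stack.
   This loop stays in control for the whole run: the interpreted program
   never executes directly on the hardware.
   Sketch only: no stack overflow or underflow checks. */
int run(const struct instr *code)
{
    int stack[64];
    size_t sp = 0;              /* number of values currently on the stack */

    for (size_t pc = 0; ; pc++) {
        switch (code[pc].op) {
        case OP_PUSH:
            stack[sp++] = code[pc].operand;
            break;
        case OP_ADD:
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_SUB:
            sp--;
            stack[sp - 1] -= stack[sp];
            break;
        case OP_HALT:
            return sp > 0 ? stack[sp - 1] : 0;
        }
    }
}
```

Calling `run` on the sequence push 18, push 12, subtract, halt returns 6. The "machine" here exists only as the `switch` statement, which is the sense in which an interpreter implements a virtual machine.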
Compilation vs. Interpretation

• Interpretation
– greater flexibility
– better error messages (e.g., good source-level
debugger)
– dynamically create code and then execute it

• Compilation
– better performance
Compilation vs. Interpretation
• Most language implementations mix
compilation and interpretation
• Common case: compilation or preprocessing,
followed by interpretation
Compilation vs. Interpretation

• Compilation not required to produce machine code for hardware
• Compilation translates one language into another,
fully analyzing input’s meaning
• Compilation requires semantic understanding of input
• Preprocessing does not require semantic
understanding, allows some errors through
• Compiler hides subsequent steps
• Preprocessor does not hide subsequent steps
Compilation vs. Interpretation

• Compiled languages have interpreted features
– input/output formats
• Compiled languages may use “virtual
instructions”
– set operations
– string operations
• Compiled languages might only produce
virtual instructions, e.g., Java byte code
Compilation vs. Interpretation

• Implementation strategy: Preprocessor
– removes comments and white space
– groups characters into tokens (keywords, identifiers,
numbers, symbols)
– expands abbreviations and textual macros
– identifies higher-level syntactic structures (loops,
subroutines)
– preserves structure of source in intermediate form
Compilation vs. Interpretation

• Implementation strategy: Library and linking
– compiler uses linker program to merge appropriate
subroutines from library
Compilation vs. Interpretation
• Implementation strategy: Post-compilation
assembly
– facilitates debugging (assembly easier to read)
– isolates compiler from changes in format of machine
code files (e.g., between OS releases)
Compilation vs. Interpretation

• Implementation strategy: Conditional compilation
– preprocessor deletes portions of code, several program
versions share same source
– e.g., C’s preprocessor
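
A small sketch of conditional compilation with C's preprocessor. The macro name `DEMO_TRACE` and the function `subtract` are hypothetical, chosen only for this illustration. Building with `-DDEMO_TRACE` would keep the tracing call; building without it removes the call before the compiler proper ever sees it, so both program versions share one source file.

```c
#include <stdio.h>

/* DEMO_TRACE is a hypothetical macro: define it on the command line
   (for example with -DDEMO_TRACE) to get the tracing version. */
#ifdef DEMO_TRACE
#define TRACE(msg) fprintf(stderr, "trace: %s\n", (msg))
#else
#define TRACE(msg) ((void)0)    /* tracing call is deleted from this build */
#endif

int subtract(int a, int b)
{
    TRACE("subtract called");
    return a - b;
}
```

Either build computes the same result; only the presence of the diagnostic output differs.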
Compilation vs. Interpretation

• Implementation strategy: Source-to-source translation
– generate intermediate program in another language
(e.g., C++ to C, various to JavaScript)
Compilation vs. Interpretation

• Implementation strategy: Compilation of interpreted languages
• Compiler generates code guessing about runtime
circumstances
• If correct, code is fast
• If not, dynamic check reverts to normal interpreter
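
The guess-then-check pattern above can be sketched in C with tagged values. Everything here is an assumption made for illustration: the `struct value` layout, the helper names, and the idea that `add_generic` stands in for the normal interpreter's general addition routine. The specialized routine guesses that both operands are integers and falls back when the dynamic check fails.

```c
/* Hypothetical run-time representation of a dynamically typed value. */
enum tag { TAG_INT, TAG_DOUBLE };

struct value {
    enum tag tag;
    union { int i; double d; } as;
};

struct value make_int(int i)
{
    struct value v;
    v.tag = TAG_INT;
    v.as.i = i;
    return v;
}

struct value make_double(double d)
{
    struct value v;
    v.tag = TAG_DOUBLE;
    v.as.d = d;
    return v;
}

/* General path: handles every combination of tags. It stands in for
   the normal interpreter in this sketch. */
struct value add_generic(struct value a, struct value b)
{
    if (a.tag == TAG_INT && b.tag == TAG_INT)
        return make_int(a.as.i + b.as.i);
    double x = (a.tag == TAG_INT) ? (double)a.as.i : a.as.d;
    double y = (b.tag == TAG_INT) ? (double)b.as.i : b.as.d;
    return make_double(x + y);
}

/* Specialized path: code generated under the guess "both are ints". */
struct value add_specialized(struct value a, struct value b)
{
    if (a.tag != TAG_INT || b.tag != TAG_INT)   /* dynamic check of the guess */
        return add_generic(a, b);               /* guess wrong: revert */
    return make_int(a.as.i + b.as.i);           /* guess right: fast path */
}
```

When the guess holds, the cost is one tag comparison plus an integer add; when it fails, the result is still correct, only slower.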
Compilation vs. Interpretation
• Implementation strategy: Bootstrapping
Compilation vs. Interpretation
• Implementation strategy: Dynamic and Just-in-Time compilation
– deliberately delay compilation until last possible moment
• compile source code on the fly (e.g., dynamically created
source), or optimize program for particular input
• use machine-independent intermediate code but compile to
machine code when executed (e.g., Java just-in-time
compiler, .NET CIL)
Compilation vs. Interpretation

• Implementation strategy: Microcode
• Assembly-level instruction set not implemented in
hardware; runs on interpreter.
• Interpreter written in low-level instructions
(microcode or firmware), stored in read-only
memory, executed by hardware
Compilation vs. Interpretation

• Compilers exist for some interpreted languages, but they are
not pure
– selective compilation of part + sophisticated preprocessing
of rest
– interpretation of part still necessary for reasons above
• Unconventional compilers
– text formatters
– silicon compilers
– query language processors
Programming Environment Tools

• Tools
An Overview of Compilation
• Phases of Compilation
An Overview of Compilation
• Lexical Analysis (Scanning)
– recognize regular language using DFA
– take input character stream
– divide program into "tokens", smallest meaningful
units to save time (char-by-char processing slow)
– recognize identifiers, constants, keywords,
operators
– produce token stream
– do simple tasks early to reduce complexity later
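
A minimal hand-written scanner sketch in C for the steps listed above. The token categories and the names `next_token` and `count_tokens` are hypothetical. It uses direct character tests rather than an explicit DFA table, does not separate keywords from identifiers, and knows only a few two-character operators.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token categories for this sketch. */
enum token_kind { TOK_IDENT, TOK_NUMBER, TOK_SYMBOL, TOK_END };

struct token {
    enum token_kind kind;
    const char *start;          /* first character of the lexeme */
    size_t length;              /* number of characters in the lexeme */
};

/* Scan one token starting at *cursor and advance *cursor past it. */
struct token next_token(const char **cursor)
{
    const char *p = *cursor;
    struct token t;

    while (isspace((unsigned char)*p))          /* skip white space */
        p++;

    t.start = p;
    if (*p == '\0') {
        t.kind = TOK_END;
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        t.kind = TOK_IDENT;                     /* identifier or keyword */
        while (isalnum((unsigned char)*p) || *p == '_')
            p++;
    } else if (isdigit((unsigned char)*p)) {
        t.kind = TOK_NUMBER;
        while (isdigit((unsigned char)*p))
            p++;
    } else {
        t.kind = TOK_SYMBOL;
        /* operators such as != and <= form a single two-character token */
        if ((*p == '!' || *p == '=' || *p == '<' || *p == '>') && p[1] == '=')
            p += 2;
        else
            p++;
    }
    t.length = (size_t)(p - t.start);
    *cursor = p;
    return t;
}

/* Count the tokens in a string, not including the end marker. */
size_t count_tokens(const char *source)
{
    size_t n = 0;
    while (next_token(&source).kind != TOK_END)
        n++;
    return n;
}
```

For the input `while (i != j) {` the scanner yields seven tokens: `while`, `(`, `i`, `!=`, `j`, `)`, `{`.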
An Overview of Compilation

• Syntax Analysis (Parsing)
– recognize context-free language (described by a CFG) using PDA
– take token stream (but could take character stream with
no scanner, might be quite messy)
– discover context-free grammatical structure of program
– output error messages
– produce concrete syntax (parse) tree
An Overview of Compilation

• Semantic analysis
– recognize context-sensitive aspects of syntax (often called
static semantics, but misnamed in instructor’s opinion)
– build symbol table
– take concrete syntax (parse) tree
– check type matches of variables and expressions
– produce abstract syntax tree or some other intermediate
form
An Overview of Compilation

• Intermediate form (IF)
– produced if no errors in syntax or static “semantics”
– machine code for idealized machine; e.g. stack machine or
with unlimited number of registers
– chosen to balance machine independence, ease of
optimization, ease of translation to final form, compactness
– might use several intermediate forms
– use abstract syntax trees and symbol table in our
interpreters
An Overview of Compilation

• Machine-independent optimization
– take intermediate-code program, optionally produce
equivalent but “better” program – faster, smaller, etc.
– improve code, not really optimize
– produce another intermediate form program
– examples: common subexpression elimination, copy
propagation, dead code elimination, loop optimizations,
in-line function calls, tail recursion optimization
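
As one illustration from the list above, tail recursion optimization can be pictured at the source level. The two functions below are a sketch (the names are hypothetical): the first is a tail-recursive form of the subtraction-based GCD, and the second is the loop a compiler could produce by reusing the current stack frame instead of making a new call. Both assume positive arguments.

```c
/* Tail-recursive GCD by repeated subtraction: each recursive call is
   the last action of the function. Assumes a > 0 and b > 0. */
int gcd_recursive(int a, int b)
{
    if (a == b)
        return a;
    if (a > b)
        return gcd_recursive(a - b, b);
    return gcd_recursive(a, b - a);
}

/* Equivalent loop: each tail call becomes an update of the parameters
   followed by a jump back to the top. Assumes a > 0 and b > 0. */
int gcd_loop(int a, int b)
{
    while (a != b) {
        if (a > b)
            a = a - b;
        else
            b = b - a;
    }
    return a;
}
```

The result is the same for every valid input; the improvement is that the loop version needs constant stack space.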
An Overview of Compilation

• Code generation
– produce assembly language or relocatable machine
language from intermediate form and symbol table
– assign memory locations, registers, etc.

• Machine-specific optimization
– take output of code generation
– optionally improve using specific details of machine,
e.g., special instructions, addressing modes, co-processors
An Overview of Compilation

• Symbol table
– track information about identifiers throughout all phases
– may be (partially) retained to support debugging, error
recovery, reflection/metaprogramming
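
A minimal symbol table sketch in C, assuming a fixed-size array with linear search and a single scope. The type names, the `MAX_SYMBOLS` limit, and the functions `lookup` and `declare` are hypothetical; a production compiler would more likely use a hash table and nested scopes.

```c
#include <string.h>

#define MAX_SYMBOLS 64          /* arbitrary limit for this sketch */

enum sym_type { TYPE_INT, TYPE_FUNCTION };

struct symbol {
    const char *name;           /* identifier as written in the source */
    enum sym_type type;         /* information tracked about it */
};

struct symbol_table {
    struct symbol entries[MAX_SYMBOLS];
    int count;
};

/* Return the entry for `name`, or NULL if it has not been declared. */
struct symbol *lookup(struct symbol_table *table, const char *name)
{
    for (int i = 0; i < table->count; i++)
        if (strcmp(table->entries[i].name, name) == 0)
            return &table->entries[i];
    return NULL;
}

/* Record a declaration. Returns 0 on success, or -1 if the name is
   already declared or the table is full. */
int declare(struct symbol_table *table, const char *name, enum sym_type type)
{
    if (lookup(table, name) != NULL || table->count == MAX_SYMBOLS)
        return -1;
    table->entries[table->count].name = name;
    table->entries[table->count].type = type;
    table->count++;
    return 0;
}
```

Semantic analysis would call `declare` at each declaration and `lookup` at each use; a failed lookup corresponds to an "undeclared identifier" error.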
An Overview of Compilation
• Lexical and Syntax Analysis: GCD
program (in C)
/* getint() and putint() are assumed integer I/O helpers,
   not standard C library functions */
int main() {
    int i = getint(), j = getint();
    while (i != j) {
        if (i > j) i = i - j;
        else j = j - i;
    }
    putint(i);
}
An Overview of Compilation
• Lexical and Syntax Analysis: GCD program
tokens
– Lexical analysis (scanning) and parsing recognize
structure of program, group characters into tokens

int main ( ) {
int i = getint ( ) , j = getint ( ) ;
while ( i != j ) {
if ( i > j ) i = i - j ;
else j = j - i ;
}
putint ( i ) ;
}
An Overview of Compilation

• Lexical and Syntax Analysis: Context-Free Grammar and Parsing
• Parsing organizes tokens into a parse tree that
represents higher-level constructs in terms of their
constituents
• Potentially recursive rules known as context-free
grammar define the ways in which these
constituents combine
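
One common way to turn recursive grammar rules into a parser is recursive descent, with one function per rule. The sketch below assumes a small expression grammar chosen for brevity (it is not the C grammar), and it evaluates the expression directly instead of building a parse tree. Function names are hypothetical and error handling is omitted.

```c
#include <ctype.h>

/* Grammar assumed for this sketch:
     expr -> term { '-' term }
     term -> NUMBER | '(' expr ')'        */

static int parse_expr(const char **p);

static void skip_spaces(const char **p)
{
    while (isspace((unsigned char)**p))
        (*p)++;
}

/* term -> NUMBER | '(' expr ')' */
static int parse_term(const char **p)
{
    int value = 0;

    skip_spaces(p);
    if (**p == '(') {
        (*p)++;                         /* consume '(' */
        value = parse_expr(p);          /* recursion mirrors the recursive rule */
        skip_spaces(p);
        if (**p == ')')
            (*p)++;                     /* consume ')' */
    } else {
        while (isdigit((unsigned char)**p)) {
            value = value * 10 + (**p - '0');
            (*p)++;
        }
    }
    return value;
}

/* expr -> term { '-' term } */
static int parse_expr(const char **p)
{
    int value = parse_term(p);

    skip_spaces(p);
    while (**p == '-') {
        (*p)++;                         /* consume '-' */
        value -= parse_term(p);
        skip_spaces(p);
    }
    return value;
}

/* Parse and evaluate a whole expression string. */
int evaluate(const char *text)
{
    return parse_expr(&text);
}
```

The call structure at run time traces out the parse tree: `evaluate("20 - (7 - 2) - 1")` enters `parse_expr`, which calls `parse_term` three times, and the middle call recurses into `parse_expr` for the parenthesized part.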
An Overview of Compilation
• Context-Free Grammar and Parsing:
Example (while loop in C)
iteration-statement → while ( expression ) statement

statement, in turn, is often a list enclosed in braces:

statement → compound-statement
compound-statement → { block-item-list_opt }
where
block-item-list_opt → block-item-list
or
block-item-list_opt → ϵ
and
block-item-list → block-item
block-item-list → block-item-list block-item
block-item → declaration
block-item → statement
An Overview of Compilation
• Context-Free Grammar and Parsing: GCD
Program Parse Tree

[Figure: parse tree for the GCD program, continued across two slides and joined at points A and B]
An Overview of Compilation
• Syntax Tree: GCD Program Parse Tree
