19CSE401 CD 01 Introduction
Introduction
Unit 2
Context-Sensitive Analysis: Attribute Grammar – Ad Hoc Syntax Directed
Translation. Intermediate Representations: Abstract Syntax Tree, Three Address
Code. Symbol Tables: Hash Table
Unit 3
Procedure Abstraction: Access Links. Optimization: Local Value Numbering, Superlocal
Value Numbering, Liveness Analysis.
26-07-2024 2
Text Book(s)
Cooper, Keith, and Linda Torczon, Engineering a Compiler, Second Edition, Morgan
Kaufmann, 2011.
Reference(s)
1. Parr T. Language implementation patterns: create your own domain-specific and
general programming languages. Pragmatic Bookshelf; First Edition, 2010.
2. Mak R. Writing compilers and interpreters: a software engineering approach. John
Wiley & Sons; Third Edition, 2009.
3. Appel, Andrew W., and Jens Palsberg, Modern Compiler Implementation in Java,
Cambridge University Press, Second Edition, 2002.
4. Aho, Alfred V., Monica S. Lam, Ravi Sethi, and Jeffrey Ullman, Compilers:
Principles, Techniques and Tools, Prentice Hall, Second Edition, 2006.
CO Course Outcomes
CO2 Apply theoretical concepts and ad hoc techniques to translate high level structures to
intermediate representations
CO3 Analyze the design of data structures for compile-time code generation
CO4 Analyze the design of data structures for run-time code generation
Outline
Introduction
Compiler Structure
Overview of Translation
Front End
Optimizer
Back End
Introduction
Compiler technology
Compilers are computer programs that translate a program written in one language into a program in
another language.
What is a compiler?
▪ A program that translates an executable program in one language into an executable program
in another language
▪ The compiler should improve the program, in some way
What is an interpreter?
▪ A program that reads an executable program and produces the results of executing that
program
C is typically compiled
Java is compiled to bytecodes (code for the Java VM)
▪ which are then interpreted
▪ or a hybrid strategy is used: just-in-time (JIT) compilation, which compiles bytecode to
machine code at run time
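The difference between an interpreter and a compiler can be sketched on a tiny expression language. This is only an illustration: the nested-tuple tree format, the toy stack machine, and every function name below are assumptions made for the example, not part of any real toolchain. `interpret` produces the result directly, while `compile_to_stack_code` produces another program, which a small virtual machine then executes, loosely mirroring the bytecode strategy described above.

```python
import operator

# Expression trees are nested tuples: ("num", 3) or (op, left, right).
# This representation is an assumption made for the example.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def interpret(tree):
    """Interpreter: walk the tree and produce the result directly."""
    if tree[0] == "num":
        return tree[1]
    op, left, right = tree
    return OPS[op](interpret(left), interpret(right))

def compile_to_stack_code(tree):
    """Compiler: translate the tree into a program for a toy stack machine."""
    if tree[0] == "num":
        return [("PUSH", tree[1])]
    op, left, right = tree
    return (compile_to_stack_code(left)
            + compile_to_stack_code(right)
            + [("APPLY", op)])

def run_stack_code(code):
    """Toy virtual machine that executes the compiled stack code."""
    stack = []
    for instr, arg in code:
        if instr == "PUSH":
            stack.append(arg)
        else:  # APPLY: pop two operands, push the result
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[arg](left, right))
    return stack.pop()

tree = ("+", ("num", 2), ("*", ("num", 3), ("num", 4)))  # 2 + 3 * 4
code = compile_to_stack_code(tree)
print(interpret(tree))       # 14
print(run_stack_code(code))  # 14
```

Both paths give the same answer; the compiled path simply separates translation from execution.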
What Do Compilers Do
A compiler acts as a translator, transforming human-oriented programming
languages into computer-oriented machine languages.
The compiler has a front end to deal with the source language.
It has a back end to deal with the target language.
Typical “source” languages might be C, C++, Fortran, or Java.
The “target” language is usually the instruction set of some processor.
o Connecting the front end and the back end, it has a formal structure for
representing the program in an intermediate form whose meaning is largely
independent of either language.
Instruction set
The set of operations supported by a processor. The overall design of an instruction
set is often called an Instruction Set Architecture, or ISA.
o Compilers that target programming languages rather than the instruction set of a
computer are often called source-to-source translators
What Do Interpreters Do
An interpreter takes as input an executable specification and produces as
output the result of executing the specification.
Some languages, such as Perl and Scheme, are more often implemented with
interpreters than with compilers.
o Some languages adopt translation schemes that include both compilation
and interpretation
A compiler deals with problems such as
o dynamic allocation
o synchronization
o naming
o locality
o memory hierarchy management
o pipeline scheduling
The Fundamental Principles of Compilation
The Structure of a Compiler
Front end focuses on understanding the source-language program.
Back end focuses on mapping programs to the target machine.
A compiler uses some set of data structures to represent the code that it
processes.
That form is called an Intermediate Representation, or IR.
Retargeting
The task of changing the compiler to generate code for a new processor is often called
retargeting the compiler.
o By using the IR as an interface, the compiler writer can insert a third phase, the
optimizer, with minimal disruption to the front end and back end
Three-Phase Compiler
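The role of the IR as an interface can be sketched as follows. Everything here is hypothetical: the tuple-based IR, the constant-folding pass, and the two toy back ends are assumptions chosen to keep the example small. The point is that the optimizer consumes and produces the same IR, so it can be added or removed without touching either end, and that retargeting only means writing another back end over the same IR.

```python
# IR: a list of three-address tuples (dest, op, arg1, arg2).
# The format is an assumption made for this sketch.

def optimize(ir):
    """Optional middle phase: fold operations whose operands are constants."""
    folded = []
    for dest, op, a, b in ir:
        if op in ("+", "*") and isinstance(a, int) and isinstance(b, int):
            value = a + b if op == "+" else a * b
            folded.append((dest, "const", value, None))
        else:
            folded.append((dest, op, a, b))
    return folded

def emit_register_style(ir):
    """Back end A: pseudo-assembly for a hypothetical register machine."""
    names = {"+": "ADD", "*": "MUL", "const": "LOADI"}
    lines = []
    for dest, op, a, b in ir:
        operands = [str(x) for x in (a, b) if x is not None]
        lines.append(f"{names[op]} {dest}, " + ", ".join(operands))
    return lines

def emit_c_style(ir):
    """Back end B: C-like assignments (a source-to-source style target)."""
    lines = []
    for dest, op, a, b in ir:
        rhs = str(a) if op == "const" else f"{a} {op} {b}"
        lines.append(f"{dest} = {rhs};")
    return lines

ir = [("t1", "*", 2, 3), ("t2", "+", "x", "t1")]
print(emit_c_style(ir))                   # ['t1 = 2 * 3;', 't2 = x + t1;']
print(emit_c_style(optimize(ir)))         # ['t1 = 6;', 't2 = x + t1;']
print(emit_register_style(optimize(ir)))  # ['LOADI t1, 6', 'ADD t2, x, t1']
```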
Structure of a Typical Compiler
Front end: analysis
Read source program and understand its structure and meaning
Implications:
• Must recognize legal programs (& complain about illegal ones)
• Must generate correct code
• Must manage storage of all variables/data
• Must agree with OS & linker on target format
• Need some sort of Intermediate Representation(s) (IR)
• Front end maps source into IR
• Back end maps IR to target machine code
• Often multiple IRs – higher level at first, lower level in later phase
Front End
[Figure: Source Program (Character Stream) → Scanner → Tokens → Parser → Syntactic Structure → Elaboration → Intermediate Representation; Infrastructure]
Scanner
[Figure: Source Program (Character Stream) → Scanner → Tokens → Parser → Syntactic Structure → Semantic Routines; Symbol and Attribute Tables (used by all phases of the compiler)]
➢ The scanner begins the analysis of the source program by
reading the input, character by character, and grouping
characters into individual words and symbols (tokens)
RE (Regular Expression)
NFA (Non-deterministic Finite Automaton)
DFA (Deterministic Finite Automaton)
LEX or FLEX
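A scanner generator such as LEX or FLEX turns regular expressions into a DFA. A hand-written, table-driven DFA gives a rough idea of what that automaton looks like. The states, character classes, and token names below are assumptions chosen for the sketch; it recognizes only identifiers and unsigned integers.

```python
def char_class(ch):
    """Map a character to the class used as the DFA's input alphabet."""
    if ch.isalpha() or ch == "_":
        return "letter"
    if ch.isdigit():
        return "digit"
    return "other"

# Transition table: state -> {character class -> next state}.
# A missing entry means the DFA has no move and the input is rejected.
TRANSITIONS = {
    "start":  {"letter": "ident", "digit": "number"},
    "ident":  {"letter": "ident", "digit": "ident"},
    "number": {"digit": "number"},
}

# Accepting states and the token kind each one reports.
ACCEPTING = {"ident": "ID", "number": "NUM"}

def classify(word):
    """Run the DFA over `word`; return the token kind, or None if rejected."""
    state = "start"
    for ch in word:
        state = TRANSITIONS[state].get(char_class(ch))
        if state is None:
            return None
    return ACCEPTING.get(state)

print(classify("result"))  # ID
print(classify("42"))      # NUM
print(classify("4x"))      # None
```

Because the automaton is deterministic, each character is examined exactly once, which is why scanners built this way run in time linear in the input length.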
Parser or Syntax Analyzer
[Figure: Source Program (Character Stream) → Scanner → Tokens → Parser → Syntactic Structure → Semantic Routines → Intermediate Representation → Optimizer → Code Generator; Symbol and Attribute Tables (used by all phases of the compiler)]
➢ Given a formal syntax specification (typically as a context-
free grammar [CFG]), the parser reads tokens and groups
them into units as specified by the productions of the CFG
being used.
➢ As syntactic structure is recognized, the parser either calls
corresponding semantic routines directly or builds a syntax
tree.
CFG (Context-Free Grammar)
BNF (Backus-Naur Form)
GAA (Grammar Analysis Algorithms)
LL, LR, SLR, LALR Parsers
YACC or Bison
Semantic Routines
[Figure: Semantic Routines → Intermediate Representation → Optimizer → Code Generator; Symbol and Attribute Tables (used by all phases of the compiler)]
➢ Perform two functions
◼ Check the static semantics of each construct
◼ Do the actual translation
➢ The heart of a compiler
Syntax Directed Translation
Semantic Processing Techniques
IR (Intermediate Representation)
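One common static-semantics check is that every identifier is declared before use. The sketch below assumes a syntax tree made of nested tuples and a symbol table held in a Python dict (which is a hash table); the function name, the tree format, and the type strings are all hypothetical.

```python
def check_assignment(target, expr, symbols):
    """Return a list of error messages for the statement `target = expr`.

    `symbols` maps declared names to their types.
    `expr` is a name (str) or a tuple (op, left, right).
    """
    errors = []

    def visit(node):
        if isinstance(node, tuple):
            _, left, right = node
            visit(left)
            visit(right)
        elif node not in symbols:
            errors.append(f"undeclared identifier: {node}")

    if target not in symbols:
        errors.append(f"undeclared identifier: {target}")
    visit(expr)
    return errors

# Symbol table for the example; `d` is deliberately left undeclared.
symbols = {"result": "float", "a": "float", "b": "float", "c": "float"}
expr = ("+", "a", ("*", "b", ("/", "c", "d")))

print(check_assignment("result", expr, symbols))
# ['undeclared identifier: d']
```

A real semantic analyzer would also use the table's attribute entries, for example to check that operand types are compatible.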
Optimizer
[Figure: Intermediate Representation → Optimizer → Code Generator; Symbol and Attribute Tables (used by all phases of the compiler)]
➢ The IR code generated by the semantic routines is analyzed
and transformed into functionally equivalent but improved IR
code
➢ This phase can be very complex and slow
➢ Peephole optimization
➢ Loop optimization, register allocation, code scheduling
Register and Temporary Management
Peephole Optimization
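A peephole optimizer looks at a small window of adjacent instructions and replaces it with a cheaper equivalent. The sketch below handles one pattern only: a copy into a temporary that is immediately copied again. The tuple format `(dest, op, arg1, arg2)`, the `"copy"` opcode, and the `temps` parameter are assumptions made for the example.

```python
def peephole_copies(code, temps):
    """Collapse `t = x` immediately followed by `y = t` into `y = x`.

    The rewrite is applied only when `t` is a compiler temporary
    (listed in `temps`) that is not read by any later instruction.
    """
    out = []
    i = 0
    while i < len(code):
        cur = code[i]
        nxt = code[i + 1] if i + 1 < len(code) else None
        removable = (
            nxt is not None
            and cur[1] == "copy" and nxt[1] == "copy"
            and nxt[2] == cur[0]
            and cur[0] in temps
            and not any(cur[0] in later[2:] for later in code[i + 2:])
        )
        if removable:
            out.append((nxt[0], "copy", cur[2], None))
            i += 2  # both instructions in the window are consumed
        else:
            out.append(cur)
            i += 1
    return out

code = [
    ("t1", "/", "c", "d"),
    ("t2", "*", "b", "t1"),
    ("t3", "+", "a", "t2"),
    ("t4", "copy", "t3", None),
    ("result", "copy", "t4", None),
]
optimized = peephole_copies(code, temps={"t1", "t2", "t3", "t4"})
print(optimized[-1])  # ('result', 'copy', 't3', None)
```

The result is functionally equivalent but one instruction shorter, which is the kind of improvement this phase is meant to make.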
✓ Compiler writing tools
Compiler generators or compiler-compilers
Back End
Responsibilities
• Translate IR into target machine code
• Should produce “good” code
“good” = fast, compact, low power consumption (pick some)
• Should use machine resources effectively
Registers
Instructions
Memory hierarchy
Example: Input: result = a + b * (c / d)
1. Lexical Analysis or Scanning:
Tokens:
‘result’, ‘=’, ‘a’, ‘+’, ‘b’, ‘*’, ‘(’, ‘c’, ‘/’, ‘d’, ‘)’
Identifiers: result, a, b, c, d
Operators: =, +, *, /
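The token list above can be reproduced with a small regular-expression scanner. This is a sketch using Python's standard `re` module; the token kind names (`ID`, `OP`, `LPAREN`, `RPAREN`) are assumptions chosen for the example.

```python
import re

# One named group per token kind; order matters because the first
# alternative that matches wins.
TOKEN_SPEC = [
    ("ID",       r"[A-Za-z_][A-Za-z0-9_]*"),
    ("OP",       r"[=+\-*/]"),
    ("LPAREN",   r"\("),
    ("RPAREN",   r"\)"),
    ("SKIP",     r"\s+"),
    ("MISMATCH", r"."),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)

def tokenize(source):
    """Group the characters of `source` into (kind, text) tokens."""
    tokens = []
    for match in TOKEN_RE.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "SKIP":
            continue  # whitespace separates tokens but is not one
        if kind == "MISMATCH":
            raise SyntaxError(f"unexpected character {text!r}")
        tokens.append((kind, text))
    return tokens

tokens = tokenize("result = a + b * (c / d)")
print([text for _, text in tokens])
# ['result', '=', 'a', '+', 'b', '*', '(', 'c', '/', 'd', ')']
```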
2. Syntax Analysis or Parsing:
Grammar:
Exp ::= Exp ‘+’ Exp
| Exp ‘-’ Exp
| Exp ‘*’ Exp
| Exp ‘/’ Exp
| ( Exp )
| ID
Assign ::= ID ‘=’ Exp
ID ::= a | b | c | d | result
Parse tree:
Assign
  ID (result)
  ‘=’
  Exp
    ID (a)
    ‘+’
    Exp
      ID (b)
      ‘*’
      ( Exp )
        ID (c)
        ‘/’
        ID (d)
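The grammar as written is ambiguous, since it does not say whether ‘*’ binds tighter than ‘+’. A recursive-descent parser usually resolves this by layering the grammar. The sketch below assumes the conventional precedence (‘*’ and ‘/’ above ‘+’ and ‘-’, all left-associative), which the slide does not state, and returns the tree as nested tuples; the function names and the tuple format are hypothetical.

```python
import re

def parse_assignment(source):
    """Parse `ID = Exp` and return a nested-tuple syntax tree."""
    # Minimal scanner: identifiers and single-character symbols only.
    # Characters outside this set are silently ignored in this sketch.
    tokens = re.findall(r"[A-Za-z_]\w*|[=+\-*/()]", source)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance(expected=None):
        nonlocal pos
        tok = peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if expected is not None and tok != expected:
            raise SyntaxError(f"expected {expected!r}, found {tok!r}")
        pos += 1
        return tok

    def identifier():
        tok = advance()
        if not (tok[0].isalpha() or tok[0] == "_"):
            raise SyntaxError(f"expected identifier, found {tok!r}")
        return tok

    def factor():  # factor ::= ID | '(' exp ')'
        if peek() == "(":
            advance("(")
            node = exp()
            advance(")")
            return node
        return identifier()

    def term():    # term ::= factor (('*' | '/') factor)*
        node = factor()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, factor())
        return node

    def exp():     # exp ::= term (('+' | '-') term)*
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    target = identifier()
    advance("=")
    tree = ("=", target, exp())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

print(parse_assignment("result = a + b * (c / d)"))
# ('=', 'result', ('+', 'a', ('*', 'b', ('/', 'c', 'd'))))
```

The tree drops the parentheses and the intermediate Exp nodes, so it is closer to an abstract syntax tree than to the full parse tree.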
Input: result = a + b * (c / d)
3. Semantic Analysis:
Abstract syntax tree:
‘=’
  ID (result)
  ‘+’
    ID (a)
    ‘*’
      ID (b)
      ‘/’
        ID (c)
        ID (d)
4. Intermediate Representation
t1 = c / d
t2 = b * t1
t3 = a + t2
t4 = t3
result = t4
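Three-address code of this kind can be produced by a post-order walk of the syntax tree, emitting one instruction per operator and a fresh temporary for each result. The sketch below assumes the tree is a nested tuple `("=", target, expr)`; the function name and tuple format are hypothetical. Unlike the listing above, it assigns the last temporary straight to the target, so the extra copy `t4 = t3` does not appear.

```python
from itertools import count

def to_three_address(tree):
    """Translate ("=", target, expr) into a list of three-address strings."""
    temps = count(1)  # generator of fresh temporary numbers: 1, 2, 3, ...
    code = []

    def gen(node):
        """Emit code for `node`; return the name that holds its value."""
        if isinstance(node, str):  # an identifier already names its value
            return node
        op, left, right = node
        left_name = gen(left)      # operands first (post-order walk)
        right_name = gen(right)
        temp = f"t{next(temps)}"
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    _, target, expr = tree
    value = gen(expr)
    code.append(f"{target} = {value}")
    return code

tree = ("=", "result", ("+", "a", ("*", "b", ("/", "c", "d"))))
for line in to_three_address(tree):
    print(line)
# t1 = c / d
# t2 = b * t1
# t3 = a + t2
# result = t3
```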
Thank You