19CSE401 CD 01 Introduction

Compiler design

Uploaded by

sampath reddy
19CSE401 – Compiler Design

Introduction

Dept. of Computer Science and Engineering

Amrita School of Computing, Bengaluru


Amrita Vishwa Vidyapeetham
19CSE401 COMPILER DESIGN 2033
Prerequisite course is 19CSE214 Theory of Computation
Unit 1
Compiler: Definition, Objectives, Structure – Overview of Translation. Scanners:
Table-Driven. Parsers: LL(1), LALR(1)

Unit 2
Context-Sensitive Analysis: Attribute Grammar – Ad Hoc Syntax Directed
Translation. Intermediate Representations: Abstract Syntax Tree, Three Address
Code. Symbol Tables: Hash Table

Unit 3
Procedure Abstraction: Access Links. Optimization: Local Value Numbering, Superlocal
Value Numbering, Liveness Analysis.
26-07-2024 2
Text Book(s)
Cooper, Keith, and Linda Torczon, Engineering a Compiler, Second Edition, Morgan
Kaufmann, 2011.

Reference(s)
1. Parr T. Language implementation patterns: create your own domain-specific and
general programming languages. Pragmatic Bookshelf; First Edition, 2010.
2. Mak R. Writing compilers and interpreters: a software engineering approach. John
Wiley & Sons; Third Edition, 2009.
3. Appel, Andrew W., and Jens Palsberg, Modern Compiler Implementation in Java,
Cambridge University Press, Second Edition, 2002.
4. Aho, Alfred V., Monica S. Lam, Ravi Sethi, and Jeffrey Ullman, Compilers:
Principles, Techniques and Tools, Prentice Hall, Second Edition, 2006.

CO Course Outcomes

CO1 Apply theoretical concepts for the analysis of program structure

CO2 Apply theoretical concepts and ad hoc techniques to translate high level structures to
intermediate representations
CO3 Analyze the design of data structures for compile-time code generation

CO4 Analyze the design of data structures for run-time code generation

CO5 Apply algorithms to improve the performance of the translated code

Outline
Introduction

Compiler Structure

Overview of Translation

Front End

Optimizer

Back End

Introduction
• Compiler technology
Compilers are computer programs that translate a program written in one language into a program in
another language.

• What is a compiler?
▪ A program that translates an executable program in one language into an executable program
in another language
▪ The compiler should improve the program in some way
• What is an interpreter?
▪ A program that reads an executable program and produces the results of executing that
program
• C is typically compiled
• Java is compiled to bytecode (code for the Java VM)
▪ which is then interpreted
▪ or a hybrid strategy is used: just-in-time (JIT) compilation, which compiles bytecode to machine code at run time

What Do Compilers Do
A compiler acts as a translator, transforming human-oriented programming
languages into computer-oriented machine languages.
The compiler has a front end to deal with the source language.
It has a back end to deal with the target language.
Typical “source” languages might be C, C++, Fortran, Java.
The “target” language is usually the instruction set of some processor.

[Figure: Source Program → Compiler → Target Program]

o Connecting the front end and the back end, it has a formal structure for
representing the program in an intermediate form whose meaning is largely
independent of either language.

o To improve the translation, a compiler often includes an optimizer that analyzes
and rewrites that intermediate form.

Instruction set
The set of operations supported by a processor. The overall design of an instruction
set is often called an Instruction Set Architecture or ISA.

[Figure: Source Program (C, C++, Java) → Compiler → Target Program (Instruction Set)]
o Many research compilers produce C programs as their output

o Because compilers for C are available on most computers, the generated program
can run on all of those systems

o The price is an extra compilation step to reach the final target

o Compilers that target programming languages rather than the instruction set of a
computer are often called source-to-source translators

What Do Interpreters Do
An interpreter takes as input an executable specification and produces as
output the result of executing the specification.
Some languages, such as Perl and Scheme, are more often implemented with
interpreters than with compilers.

[Figure: Source Program (Perl, Scheme) → Interpreter → Results]

o Some languages adopt translation schemes that include both compilation
and interpretation

o Java is compiled from source code into a form called bytecode, a
compact representation intended to decrease download times for Java
applications

o Java applications execute by running the bytecode on the
corresponding Java Virtual Machine (JVM), an interpreter for
bytecode.
Virtual machine
A virtual machine is a simulator for some processor
It is an interpreter for that machine’s instruction set
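
As a minimal sketch of the idea that a virtual machine is an interpreter for an instruction set, the Python below simulates a tiny stack machine. The opcodes (PUSH, ADD, MUL, DIV) and the function name run are hypothetical, chosen only for illustration; they are not JVM bytecodes.

```python
def run(program):
    """Interpret a list of (opcode, operand) pairs on a simple stack machine."""
    stack = []
    for op, arg in program:
        if op == "PUSH":
            stack.append(arg)
        elif op in ("ADD", "MUL", "DIV"):
            right = stack.pop()      # operands come off in reverse order
            left = stack.pop()
            if op == "ADD":
                stack.append(left + right)
            elif op == "MUL":
                stack.append(left * right)
            else:
                stack.append(left / right)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop()


# a + b * (c / d) with a=1, b=2, c=8, d=4 (values are made up for the demo)
program = [("PUSH", 1), ("PUSH", 2), ("PUSH", 8), ("PUSH", 4),
           ("DIV", None), ("MUL", None), ("ADD", None)]
print(run(program))
```

The loop in run plays the role of the processor's fetch-and-execute cycle: it reads one instruction at a time and produces the result of executing the program, rather than translating it into another language.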
A good compiler contains a microcosm of computer science.

It makes practical use of

o greedy algorithms (register allocation)
o heuristic search techniques (list scheduling)
o graph algorithms (dead-code elimination)
o dynamic programming (instruction selection)
o finite automata and push-down automata (scanning and parsing)
o fixed-point algorithms (data-flow analysis)

It deals with problems such as
o dynamic allocation
o synchronization
o naming
o locality
o memory hierarchy management
o pipeline scheduling

Working inside a compiler provides practical experience in software engineering that
is hard to obtain with smaller, less intricate systems.

The Fundamental Principles of Compilation

The first principle is inviolable:
The compiler must preserve the meaning of the program being compiled

The second principle that a compiler must observe is practical:
The compiler must improve the input program in some visible way

The Structure of a Compiler
Front end focuses on understanding the source-language program.
Back end focuses on mapping programs to the target machine.
A compiler uses some set of data structures to represent the code that it
processes.
That form is called an Intermediate Representation, or IR.
Retargeting
The task of changing the compiler to generate code for a new processor is often called
retargeting the compiler.

[Figure: Two-Phase compiler]

o The middle section of a compiler, called an optimizer, analyzes and transforms
the IR to improve it

o The optimizer takes an IR program as its input and produces a semantically
equivalent IR program as its output

o By using the IR as an interface, the compiler writer can insert this third phase
with minimal disruption to the front end and back end

o This leads to the following compiler structure, termed a
three-phase compiler

[Figure: Three-Phase compiler]
Structure of a Typical Compiler

Front end: analysis
Read source program and understand its structure and meaning

Back end: synthesis
Generate equivalent target language program

Implications:
• Must recognize legal programs (& complain about illegal ones)
• Must generate correct code
• Must manage storage of all variables/data
• Must agree with OS & linker on target format
• Need some sort of Intermediate Representation(s) (IR)
• Front end maps source into IR
• Back end maps IR to target machine code
• Often multiple IRs – higher level at first, lower level in later phases
Front End
[Figure: Source Program (Character Stream) → Scanner → Tokens → Parser → Syntactic Structure → Elaboration → Intermediate Representation; Infrastructure (Used by all Phases of The Compiler)]

Token stream: Each significant lexical chunk of the program is represented by a token
Operators & Punctuation: {}[]!+-=*;:
Keywords: if while return goto
Identifiers: id & actual name
Constants: kind & value; int, floating-point, character, string
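
The token categories above can be sketched with Python's re module. This is only an illustration under stated assumptions: the names TOKEN_SPEC, KEYWORDS and tokenize are hypothetical, the patterns cover just identifiers, integer and floating-point constants, and single-character operators, and character and string constants are left out.

```python
import re

KEYWORDS = {"if", "while", "return", "goto"}

# Order matters: FLOAT must be tried before INT so "3.5" is not split.
TOKEN_SPEC = [
    ("FLOAT", r"\d+\.\d+"),
    ("INT", r"\d+"),
    ("ID", r"[A-Za-z_]\w*"),
    ("OP", r"[{}\[\]()!+\-=*/;:]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (kind, lexeme) pairs; whitespace is discarded."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        kind, lexeme = m.lastgroup, m.group()
        pos = m.end()
        if kind == "SKIP":
            continue
        if kind == "ID" and lexeme in KEYWORDS:
            kind = "KEYWORD"      # keywords look like identifiers lexically
        tokens.append((kind, lexeme))
    return tokens


print(tokenize("while x1 = 3.5;"))
```

Each token carries its kind plus the actual text, which mirrors the "id & actual name" and "kind & value" pairs listed above.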
Scanner or Lexical Analyzer

[Figure: Source Program (Character Stream) → Scanner → Tokens → Parser → Syntactic Structure → Semantic Routines; Symbol and Attribute Tables (Used by all Phases of The Compiler)]

Scanner
➢ The scanner begins the analysis of the source program by
reading the input, character by character, and grouping
characters into individual words and symbols (tokens)
• RE (Regular Expression)
• NFA (Non-deterministic Finite Automaton)
• DFA (Deterministic Finite Automaton)
• LEX or FLEX
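
A table-driven scanner can be sketched as a DFA whose transitions live in a lookup table. The version below is a minimal sketch, assuming a made-up two-token language (identifiers and integers); the state names, TABLE, ACCEPTING and next_word are all hypothetical and not the output of LEX or FLEX.

```python
def char_class(ch):
    """Collapse characters into the few classes the DFA distinguishes."""
    if ch.isalpha() or ch == "_":
        return "letter"
    if ch.isdigit():
        return "digit"
    return "other"


# Transition table: TABLE[state][character class] -> next state.
# A missing entry means the DFA has no move and scanning stops.
TABLE = {
    "start":  {"letter": "in_id", "digit": "in_int"},
    "in_id":  {"letter": "in_id", "digit": "in_id"},
    "in_int": {"digit": "in_int"},
}
ACCEPTING = {"in_id": "ID", "in_int": "INT"}


def next_word(text, pos):
    """Return (kind, lexeme) for the longest word starting at pos, or None."""
    state = "start"
    last_accept = None
    i = pos
    while i < len(text):
        nxt = TABLE[state].get(char_class(text[i]))
        if nxt is None:
            break
        state = nxt
        i += 1
        if state in ACCEPTING:
            last_accept = (ACCEPTING[state], text[pos:i])
    return last_accept


print(next_word("count42 + 7", 0))
print(next_word("count42 + 7", 10))
```

The driver loop never changes; supporting a new kind of word means adding rows to the table, which is the appeal of the table-driven style.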

Parser or Syntax Analyzer
[Figure: Source Program (Character Stream) → Scanner → Tokens → Parser → Syntactic Structure → Semantic Routines → Intermediate Representation → Optimizer → Code Generator → Target machine code; Symbol and Attribute Tables (Used by all Phases of The Compiler)]

Parser
➢ Given a formal syntax specification (typically as a context-free
grammar [CFG]), the parser reads tokens and groups
them into units as specified by the productions of the CFG
being used.
➢ As syntactic structure is recognized, the parser either calls
corresponding semantic routines directly or builds a syntax
tree.
• CFG (Context-Free Grammar)
• BNF (Backus-Naur Form)
• GAA (Grammar Analysis Algorithms)
• LL, LR, SLR, LALR Parsers
• YACC (Yet Another Compiler Compiler) or Bison


Semantic Analyzer
[Figure: compiler pipeline from Source Program to Target machine code, with Symbol and Attribute Tables used by all phases]

Semantic Routines
➢ Perform two functions
◼ Check the static semantics of each construct
◼ Do the actual translation
➢ The heart of a compiler
• Syntax Directed Translation
• Semantic Processing Techniques
• IR (Intermediate Representation)
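
One static-semantics check can be sketched with a hash-table symbol table: every name must be declared before it is used. The statement format below (tuples tagged "decl" or "assign") and the function name check are hypothetical, invented only for this illustration.

```python
def check(statements):
    """Return a list of error messages for a straight-line program."""
    symbols = {}    # symbol table: name -> declared type (a Python dict is a hash table)
    errors = []
    for lineno, stmt in enumerate(statements, start=1):
        if stmt[0] == "decl":
            _, name, typ = stmt
            if name in symbols:
                errors.append(f"line {lineno}: '{name}' redeclared")
            else:
                symbols[name] = typ
        elif stmt[0] == "assign":
            _, target, operands = stmt
            # Both the assigned name and every operand must be declared.
            for name in [target, *operands]:
                if name not in symbols:
                    errors.append(f"line {lineno}: '{name}' undeclared")
    return errors


program = [
    ("decl", "a", "int"),
    ("decl", "b", "int"),
    ("assign", "result", ["a", "b"]),   # 'result' was never declared
]
print(check(program))
```

The parser alone cannot catch this error, because the program is syntactically legal; the check needs context collected from earlier statements.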


Optimizer
[Figure: compiler pipeline from Source Program to Target machine code, with Symbol and Attribute Tables used by all phases]

Optimizer
➢ The IR code generated by the semantic routines is analyzed
and transformed into functionally equivalent but improved IR
code
➢ This phase can be very complex and slow
➢ Peephole optimization
➢ Loop optimization, register allocation, code scheduling
• Register and Temporary Management
• Peephole Optimization
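
A peephole optimizer looks at a small window of adjacent instructions and rewrites it when a known pattern appears. The sketch below handles one pattern only: a temporary that is defined and then immediately copied. It assumes three-address instructions stored as (dest, op, arg1, arg2) tuples, with op set to None for a plain copy, and it assumes temporaries are named t1, t2, ... and are not needed after the block; these conventions and the function names are hypothetical.

```python
def is_temp(name):
    """Temporaries are assumed to be named t1, t2, ..."""
    return isinstance(name, str) and name[:1] == "t" and name[1:].isdigit()


def uses(instr, name):
    _, _, arg1, arg2 = instr
    return arg1 == name or arg2 == name


def peephole(code):
    """Fold 't = <rhs>' followed by 'd = t' into 'd = <rhs>' when t is not used later."""
    out = list(code)
    changed = True
    while changed:
        changed = False
        for i in range(len(out) - 1):
            dest1, op1, a1, b1 = out[i]
            dest2, op2, a2, _ = out[i + 1]
            is_copy = op2 is None and a2 == dest1
            used_later = any(uses(ins, dest1) for ins in out[i + 2:])
            if is_temp(dest1) and is_copy and not used_later:
                out[i:i + 2] = [(dest2, op1, a1, b1)]
                changed = True
                break           # restart the scan on the shortened list
    return out


code = [
    ("t1", "/", "c", "d"),
    ("t2", "*", "b", "t1"),
    ("t3", "+", "a", "t2"),
    ("t4", None, "t3", None),
    ("result", None, "t4", None),
]
for instr in peephole(code):
    print(instr)
```

On this input the two trailing copies disappear and the last instruction becomes result = a + t2, so five instructions shrink to three while computing the same value.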

✓ Compiler writing tools
Compiler generators or compiler-compilers

E.g. scanner and parser generators

Tools: Lex, Flex, JLex, Yacc (Yet Another Compiler Compiler), Bison

Back End
Responsibilities
• Translate IR into target machine code
• Should produce “good” code
“good” = fast, compact, low power consumption (pick some)
• Should use machine resources effectively
Registers
Instructions
Memory hierarchy
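
To make "translate IR into target machine code" concrete, here is a deliberately naive sketch of a code generator. The target is a hypothetical load/store machine with two registers and the mnemonics LOAD, STORE, ADD, SUB, MUL, DIV; none of this corresponds to a real instruction set. The IR format, (dest, op, arg1, arg2) tuples with op set to None for a copy, is likewise an assumption.

```python
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}


def gen(code):
    """Emit target instructions for a list of three-address tuples."""
    asm = []
    for dest, op, arg1, arg2 in code:
        asm.append(f"LOAD R1, {arg1}")
        if op is not None:                      # binary operation
            asm.append(f"LOAD R2, {arg2}")
            asm.append(f"{OPCODES[op]} R1, R1, R2")
        asm.append(f"STORE {dest}, R1")         # every result goes back to memory
    return asm


for line in gen([("t1", "/", "c", "d"), ("t2", "*", "b", "t1")]):
    print(line)
```

The output is correct but not "good": t1 is stored to memory and immediately loaded again. Keeping such values in registers is exactly the kind of resource decision the back end is responsible for.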

Eg: Input: result = a + b * (c / d)
1. Lexical Analysis or Scanning:
Tokens:
'result', '=', 'a', '+', 'b', '*', '(', 'c', '/', 'd', ')'
identifiers are result a b c d
operators are = + * /
punctuation symbols are ( )
2. Syntax Analysis or parsing:
Grammar:
Exp ::= Exp '+' Exp
| Exp '-' Exp
| Exp '*' Exp
| Exp '/' Exp
| (Exp)
| ID
Assign ::= ID '=' Exp
ID ::= a | b | c | d | result

Parse tree:
Assign
  ID (result)
  '='
  Exp
    Exp
      ID (a)
    '+'
    Exp
      Exp
        ID (b)
      '*'
      ( Exp )
        Exp
          ID (c)
        '/'
        Exp
          ID (d)
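
The parsing step can be sketched as a small recursive-descent parser. The grammar on the slide is ambiguous as written (it does not say that '*' binds tighter than '+'), so this sketch assumes the usual layered form with hypothetical nonterminals Exp, Term and Factor. The tokenizer, the Parser class and the tuple-based tree are all illustrative choices, not the course's implementation.

```python
import re


def tokenize(text):
    """Split into identifiers and single-character symbols (other characters are ignored)."""
    return re.findall(r"[A-Za-z_]\w*|[=+\-*/()]", text)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, tok):
        if self.peek() != tok:
            raise SyntaxError(f"expected {tok!r}, found {self.peek()!r}")
        self.pos += 1

    def assign(self):               # Assign ::= ID '=' Exp
        target = self.ident()
        self.expect("=")
        tree = ("=", target, self.exp())
        if self.peek() is not None:
            raise SyntaxError(f"unexpected {self.peek()!r}")
        return tree

    def exp(self):                  # Exp ::= Term (('+' | '-') Term)*
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.peek()
            self.pos += 1
            node = (op, node, self.term())
        return node

    def term(self):                 # Term ::= Factor (('*' | '/') Factor)*
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.peek()
            self.pos += 1
            node = (op, node, self.factor())
        return node

    def factor(self):               # Factor ::= ID | '(' Exp ')'
        if self.peek() == "(":
            self.pos += 1
            node = self.exp()
            self.expect(")")
            return node
        return self.ident()

    def ident(self):
        tok = self.peek()
        if tok is None or not (tok[0].isalpha() or tok[0] == "_"):
            raise SyntaxError(f"expected identifier, found {tok!r}")
        self.pos += 1
        return tok


tree = Parser(tokenize("result = a + b * (c / d)")).assign()
print(tree)
```

The nested tuples have the form (operator, left, right), so the printed tree is already an abstract syntax tree rather than the full parse tree: parentheses and intermediate nonterminals leave no trace in it.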
Input: result = a + b * (c / d)
3. Semantic Analysis:
4. Intermediate Representation

Abstract Syntax Tree (AST):
'='
  ID (result)
  '+'
    ID (a)
    '*'
      ID (b)
      '/'
        ID (c)
        ID (d)

Three Address code:
t1 = c / d
t2 = b * t1
t3 = a + t2
t4 = t3
result = t4
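
Three-address code can be produced from an AST by a post-order walk that gives every operator node a fresh temporary. The sketch below assumes the AST is written as nested tuples of the form (operator, left, right) with identifiers as plain strings; that encoding and the name to_tac are hypothetical.

```python
def to_tac(tree):
    """Translate an assignment AST into a list of three-address instructions."""
    code = []
    counter = 0

    def walk(node):
        nonlocal counter
        if isinstance(node, str):       # leaf: an identifier names its own value
            return node
        op, left, right = node
        left_name = walk(left)          # operands first (post-order)
        right_name = walk(right)
        counter += 1
        temp = f"t{counter}"
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    op, target, expr = tree
    if op != "=":
        raise ValueError("expected an assignment at the root")
    code.append(f"{target} = {walk(expr)}")
    return code


ast = ("=", "result", ("+", "a", ("*", "b", ("/", "c", "d"))))
for line in to_tac(ast):
    print(line)
```

This sketch assigns t3 to result directly, so it emits four instructions; the listing on the slide includes an extra copy through t4, which a later optimization pass could remove.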

Thank You