Lecture 01

Compiler Design

Introduction
Contents
•Introduction
•A Simple Syntax-Directed Translator
•Lexical Analysis
•Syntax Analysis
•Syntax-Directed Translation
•Intermediate-Code Generation
•Type Checking
•Run-time Environments
•Code Generation
•Code Optimization
Introduction
•The world depends on programming languages
•All software must be translated by a compiler before it can run
•Areas involved: compiler construction, programming languages,
machine architecture, language theory, algorithms, and software
engineering
Language Processors
•Compiler: a program that can read a program in one
language (the source language) and translate it into an
equivalent program in another language (the target language)
–The essential interface between applications & architectures

  source program → [ Compiler ] → target program
Language Processors
•The target program maps inputs to outputs:

  input → [ Target Program (exe) ] → output

•Interpreter: instead of producing a target program as a
translation, an interpreter appears to directly execute
the operations specified in the source program on inputs,
statement by statement:

  source program + input → [ Interpreter ] → output
Requirements
•Basic Requirements
–Work on your homework individually.
–Discussions are encouraged but don’t copy others’ work.
–Get your hands dirty!
–Experiment with ideas presented in class and gain first-hand
knowledge!
–Come to class and DON’T hesitate to speak if you have
any questions/comments/suggestions!
–Student participation is important!
Compiler vs. Interpreter (1/5)
•Compilers: Translate a source (human-writable)
program to an executable (machine-readable)
program
•Interpreters: Translate and execute a source program
at the same time
Compiler vs. Interpreter (2/5)

Ideal concept:

  source code → [ Compiler ] → executable
  input data  → [ Executable ] → output data

  source code + input data → [ Interpreter ] → output data
Compiler vs. Interpreter (3/5)
•Most languages are usually thought of as using
either one or the other:
–Compilers: FORTRAN, COBOL, C, C++, Pascal, PL/1
–Interpreters: Lisp, Scheme, BASIC, APL, Perl, Python,
Smalltalk
•BUT: not always implemented this way
–Virtual Machines (e.g., Java)
–Linking of executables at runtime
–JIT (Just-in-time) compiling
Compiler vs. Interpreter (4/5)
•Actually, there is no sharp boundary between them; the
general situation is a combination:

  source code → [ Translator ] → intermediate code
  intermediate code + input data → [ Virtual Machine ] → output
▪A Java source program may first be compiled into an
intermediate code (bytecodes)
•Bytecodes are then interpreted by a virtual machine
❑Benefit: bytecodes compiled on one machine can be interpreted
on another machine, e.g., across a network
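▪As a concrete illustration (file and class names here are
hypothetical), the standard JDK tools follow this two-stage
model exactly:

  javac Hello.java   # translator: compiles source into Hello.class (bytecodes)
  java Hello         # virtual machine: interprets/JIT-compiles the bytecodes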
Compiler vs. Interpreter (5/5)
•Compiler
–Pros: less space; fast execution
–Cons: slow processing (partly solved by separate
compilation); harder debugging (improved through IDEs)
•Interpreter
–Pros: easy debugging; fast development
–Cons: not suited to large projects (exceptions: Perl,
Python); requires more space; slower execution; the
interpreter stays in memory all the time
A Language Processing System

[Figure: before the compiler runs, a preprocessor expands
macros in the source program; the compiler's output is then
assembled and linked into an executable.]

Structure of A Compiler

  Compiler = Analysis (front end) + Synthesis (back end)

•Analysis (front end)
▪Breaks the source program into its constituent parts
▪Imposes a grammatical structure on them (lexical,
syntax, semantic)
▪Creates an intermediate representation of the source code
▪Checks for errors
❑Information about the source program is stored in the
symbol table
•Synthesis (back end)
▪Constructs the target code from the intermediate
representation and the symbol table

Phases of compilation
Scanning/Lexical analysis
❑Break program down into its smallest
meaningful symbols (tokens, atoms, lexemes)
❑Tools for this include lex, flex
❑Tokens include e.g.:
▪“Reserved words”: do if float while
▪Special characters: ( { , + - = ! /
▪Names & numbers: myValue 3.07e02
❑Start symbol table with new symbols found
Scanning/Lexical analysis
•For each lexeme, the lexical analyzer produces as
output a token: <token-name, attribute-value>
▪token-name: an abstract symbol that is used during
syntax analysis
▪attribute-value: points to an entry in the symbol table
for this token
Scanning/Lexical analysis
•Assignment statement (source program):

  position = initial + rate * 60

•Lexemes:
1.position is a lexeme that would be mapped into the token
<id, 1>, where id is an abstract symbol standing for
identifier and 1 points to the symbol-table entry for
position.
▪The symbol-table entry for an identifier holds information
about the identifier, such as its name and type.
Scanning/Lexical analysis
•2. The assignment symbol = is a lexeme that is
mapped into the token <=>.
•3. initial is a lexeme that is mapped into the token
<id, 2> , where 2 points to the symbol-table entry
for initial
•4. + is a lexeme that is mapped into token <+>
•5. rate is a lexeme that is mapped into the token
<id, 3> , where 3 points to the symbol-table entry
for rate.
Scanning/Lexical analysis
•6. * is a lexeme that is mapped into token <*>
•7. 60 is a lexeme that is mapped into the token
<60>
•Blanks (White Space) separating the lexemes
would be discarded by the lexical analyzer.

•After lexical analysis, the statement is represented
by the sequence of tokens:

  <id, 1> <=> <id, 2> <+> <id, 3> <*> <60>
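•The sketch below (not from the lecture; the token patterns and
the symbol-table layout are simplifying assumptions) shows how a
minimal lexer in Python could produce this token sequence:

  import re

  # Hypothetical token patterns for this tiny example language.
  TOKEN_SPEC = [
      ("NUM", r"\d+"),           # integer literals, e.g. 60
      ("ID",  r"[A-Za-z_]\w*"),  # identifiers, e.g. position
      ("OP",  r"[=+\-*/]"),      # single-character operators
      ("WS",  r"\s+"),           # blanks between lexemes: discarded
  ]
  MASTER = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))

  def tokenize(source):
      """Yield <token-name, attribute-value> pairs; each identifier
      is entered into a symbol table and referenced by entry number."""
      symbols = []                      # one record per distinct identifier
      for m in MASTER.finditer(source):
          kind, lexeme = m.lastgroup, m.group()
          if kind == "WS":
              continue                  # whitespace is discarded
          if kind == "ID":
              if lexeme not in symbols:
                  symbols.append(lexeme)
              yield ("id", symbols.index(lexeme) + 1)
          else:                         # operators and numbers are their own tokens
              yield (lexeme, None)

  print(list(tokenize("position = initial + rate * 60")))
  # [('id', 1), ('=', None), ('id', 2), ('+', None), ('id', 3), ('*', None), ('60', None)]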
Translation of an assignment statement
Parsing/Syntax Analysis
•The parser creates a tree-like intermediate
representation that depicts the grammatical
structure of the token stream.
•A typical representation is a syntax/parse tree in
which each interior node represents an
operation and the children of the node
represent the arguments of the operation.
Parsing/Syntax Analysis
•This tree shows the order in which the operations
in the assignment are to be performed:
position = initial + rate * 60
•The tree has an interior node labeled * with <id, 3>
as its left child and the integer 60 as its right child.
The node <id, 3> represents the identifier rate.
•The node labeled * makes it explicit that we must
first multiply the value of rate by 60.
•The node labeled + indicates that we must add the
result of this multiplication to the value of initial.
Parsing/Syntax Analysis
•The root of the tree, labeled =, indicates that we
must store the result of this addition into the
location for the identifier position.
•This ordering of operations is consistent with the
usual conventions of arithmetic, which tell us that
✔multiplication has higher precedence than
addition, and hence the multiplication is to
be performed before the addition.
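•A minimal sketch (my own illustration, not the lecture's code) of
this syntax tree as nested Python tuples, with a postorder walk
that makes the evaluation order explicit:

  # Syntax tree for: position = initial + rate * 60
  # Interior nodes are (operator, left, right); <id, n> is ("id", n).
  tree = ("=",
          ("id", 1),                # position
          ("+",
           ("id", 2),               # initial
           ("*",
            ("id", 3),              # rate
            60)))                   # integer literal 60

  def postorder(node):
      """Visit children before the operator: the order in which
      the operations must be performed (* before + before =)."""
      if isinstance(node, tuple) and node[0] in ("=", "+", "*"):
          _, left, right = node
          postorder(left)
          postorder(right)
          print(node[0])

  postorder(tree)                   # prints: * + =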
Semantic Analysis
•The semantic analyzer uses the syntax tree &
the information in the symbol table to check the
source program for semantic consistency with the
language definition.
•It also gathers type information & saves it in either
the syntax tree or the symbol table, for subsequent
use during intermediate-code generation.
Semantic Analysis
•Important part: type checking
▪compiler checks that each operator has
matching operands.
▪Ex: many programming language definitions
require an array index to be an integer;
the compiler must report an error if a
floating-point number is used to index an array.
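•A minimal sketch (my own illustration) of this check in Python:

  def check_array_index(index_type):
      """Reject non-integer array indices, as the language
      definition described above requires."""
      if index_type != "int":
          raise TypeError(f"array index must be an integer, got {index_type}")

  check_array_index("int")    # accepted
  check_array_index("float")  # raises: array index must be an integer, got float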
Semantic Analysis
•The language specification may permit some
type conversions called coercions.
•Suppose that position, initial, and rate have
been declared to be floating-point numbers, and
that the lexeme 60 by itself forms an integer.
•The type checker discovers that the operator * is
applied to a floating-point number rate & an
integer 60.
Semantic Analysis
•In this case, the integer may be converted into a
floating-point number.
•The output of the semantic analyzer has an extra
node for the operator inttofloat , which
explicitly converts its integer argument into a
floating-point number.
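•A sketch of this coercion (reusing the assumed tuple-tree
representation from the parsing example; not the lecture's code):

  def coerce_mul(left, left_type, right, right_type):
      """If * mixes a float and an int, wrap the integer operand in
      an explicit inttofloat node, so later phases see only floats."""
      if left_type == "float" and right_type == "int":
          right = ("inttofloat", right)
      elif left_type == "int" and right_type == "float":
          left = ("inttofloat", left)
      return ("*", left, right)

  print(coerce_mul(("id", 3), "float", 60, "int"))
  # ('*', ('id', 3), ('inttofloat', 60))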
Intermediate Code Generation
•In translating a source program into target code, a
compiler may construct one or more intermediate
representations, which can have a variety of forms.
•Syntax trees are a form of intermediate
representation; they are commonly used during
syntax and semantic analysis.
•Two important properties of an intermediate representation:
it should be easy to produce
it should be easy to translate into the target
machine
Intermediate Code Generation
•Three-address code: a sequence of
assembly-like instructions with three operands
per instruction.
•Each operand can act like a register.
Intermediate Code Generation
•Properties:
Each three-address assignment instruction has
at most one operator on the right side.
Compiler must generate a temporary name to
hold the value computed by a three-address
instruction.
Some "three-address instructions" have fewer
than three operands.
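•For the running example, the three-address code (reconstructed
here following the classic textbook treatment of this example) is:

  t1 = inttofloat(60)
  t2 = id3 * t1
  t3 = id2 + t2
  id1 = t3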
Code Optimization
•The code-optimization phase attempts to improve the
intermediate code so that better target code will result.
•Usually better means faster, but it may also mean
shorter code, or target code that consumes less power.
•Ex: a straightforward algorithm generates the intermediate
code, using an instruction for each operator in the tree
representation that comes from the semantic analyzer.
Code Optimization
•The optimizer can deduce that the conversion of
60 from integer to floating point can be done once
and for all at compile time,
✔so the inttofloat operation can be eliminated by
replacing the integer 60 by the floating-point
number 60.0.
•Moreover, t3 is used only once, to transmit its
value to id1, so the optimizer can transform the code
into a shorter sequence.
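•Reconstructed from the standard textbook version of this
example, the resulting shorter sequence is:

  t1 = id3 * 60.0
  id1 = id2 + t1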
Code Generation
•The code generator takes as input an
intermediate representation of the source
program and maps it into the target language.
•If the target language is machine code, registers
or memory locations are selected for each of the
variables used by the program.
•Then, the intermediate instructions are
translated into sequences of machine
instructions that perform the same task.
Code Generation
•Ex: using registers R1 and R2, the intermediate
code above might get translated into machine code.

❑The first operand of each instruction specifies a destination.
❑The F in each instruction tells us that it deals with
floating-point numbers.
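❑The machine-code sequence itself (reconstructed here from the
walkthrough on the next slide; the mnemonics are those of the
textbook's hypothetical target machine):

  LDF  R2, id3
  MULF R2, R2, #60.0
  LDF  R1, id2
  ADDF R1, R1, R2
  STF  id1, R1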
Code Generation
❑The code loads the contents of address id3 into register R2,
❑then multiplies it by the floating-point constant 60.0.
❑The # signifies that 60.0 is to be treated as an immediate
constant.
❑The third instruction moves id2 into register R1,
❑and the fourth adds to it the value previously computed in
register R2.
❑Finally, the value in register R1 is stored into the address
of id1,
❑so the code correctly implements the assignment statement:

  position = initial + rate * 60


Symbol-Table Management
•An essential function of a compiler is
✔to record the variable names used in the source program, and
✔to collect information about various attributes of each name.
❑These attributes may provide information about
the storage allocated for a name, its type, its scope (where in the
program its value may be used), and in the case of procedure
names, such things as the number and types of its arguments,
the method of passing each argument (for example, by value or
by reference), and the type returned.
❑The symbol table is a data structure containing a record for each
variable name, with fields for the attributes of the name.
❑The data structure should be designed to allow the compiler to
find the record for each name quickly & to store or retrieve data
from that record quickly.
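•A minimal sketch (my own illustration) of such a data structure
in Python: a dict keyed by name gives the fast store/retrieve
described above, and each record carries the name's attributes:

  from dataclasses import dataclass, field

  @dataclass
  class Symbol:
      name: str
      type: str = "unknown"        # e.g. "float"
      scope: str = "global"        # where in the program the name may be used
      attrs: dict = field(default_factory=dict)  # storage, parameters, ...

  class SymbolTable:
      def __init__(self):
          self._records = {}       # name -> Symbol; O(1) average lookup

      def enter(self, name, **info):
          """Create the record for a name, or return the existing one."""
          return self._records.setdefault(name, Symbol(name, **info))

      def lookup(self, name):
          return self._records.get(name)

  table = SymbolTable()
  table.enter("position", type="float")
  print(table.lookup("position").type)   # float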
Compiler Construction Tools
•The compiler writer can profitably use modern software
development environments:
▪Tools: language editors, debuggers, version managers,
profilers, test harnesses, and so on.
•Properties of the Most successful tools:
hide the details of the generation algorithm
produce components that can be easily integrated into
the remainder of the compiler.
Commonly used Compiler Construction Tools
•1. Parser generators that automatically produce syntax
analyzers from a grammatical description of a
programming language.
•2. Scanner generators that produce lexical analyzers from
a regular-expression description of the tokens of a
language.
•3. Syntax-directed translation engines that produce
collections of routines for walking a parse tree and
generating intermediate code.
Commonly used Compiler Construction Tools

•4. Code-generator generators that produce a code
generator from a collection of rules for translating each
operation of the intermediate language into the
machine language for a target machine.
•5. Data-flow analysis engines that facilitate the
gathering of information about how values are
transmitted from one part of a program to each other
part.
✔Data-flow analysis is a key part of code optimization.
•6. Compiler- construction toolkits that provide an
integrated set of routines for constructing various
phases of a compiler.
The Evolution of Programming Languages
•1940s: the first electronic computers appeared
•Programmed in machine language, by sequences of 0's
and 1's that explicitly told the computer what
operations to execute and in what order
•Limitation: operations were at a very low level:
move data from one location to another, add the
contents of two registers, compare two values, & so
on.
•Disadvantages: programming was slow, tedious, and error
prone; once written, the programs were hard to understand &
modify.
The Move to Higher-level Language
•Early 1950s: assembly languages (mnemonic operation codes)
•Later, macro instructions were added to
assembly languages so that a programmer could
define parameterized shorthands for frequently
used sequences of machine instructions.
The Move to Higher-level Language
•Latter half of the 1950's: A major step towards
higher-level languages was made
Fortran for scientific computation,
Cobol for business data processing,
Lisp for symbolic computation.
•The philosophy behind these languages was to create
higher-level notations with which programmers could
more easily write numerical computations, business
applications, and symbolic programs.
•These languages were so successful that they are still
in use today.
•Today, there are thousands of programming
languages.
•Classification:
•1. According to Generation
❑First-generation: machine languages
❑Second-generation: assembly languages,
❑Third-generation: higher-level languages (Fortran,
Cobol, Lisp, C, C++, C#, and Java)
❑Fourth-generation: designed for specific
applications like NOMAD for report generation, SQL
for database queries, and Postscript for text
formatting.
•Fifth-generation: applied to logic- and
constraint-based languages (Prolog and OPS5)
•2. According to how computation is specified:
imperative languages, in which a program specifies how
a computation is to be done, and declarative languages,
in which a program specifies what computation is to be
done.
•Languages such as C, C++, C#, and Java are
imperative languages.
•In imperative languages there is a notion of
program state and statements that change the
state.
•Functional languages such as ML and Haskell and
constraint logic languages such as Prolog are often
considered to be declarative languages.
•3. A von Neumann language: one whose computational
model is based on the von Neumann computer architecture.
•Fortran and C are von Neumann languages.
•4. An object-oriented language: one that supports
object-oriented programming, a programming style in
which a program consists of a collection of objects
that interact with one another.
Simula 67 and Smalltalk are the earliest major
object-oriented languages.
C++, C#, Java, and Ruby are more recent
object-oriented languages.
•5. Scripting languages: interpreted languages with
high-level operators designed for "gluing together"
computations; such computations were originally called
"scripts."
Awk, JavaScript, Perl, PHP, Python, Ruby, and Tcl
are popular examples of scripting languages.
Programs written in scripting languages are
often much shorter than equivalent programs
written in languages like C.
Application of Compiler Technology
▪Implementation of high-level programming languages
▪Optimizations for Computer Architecture: parallelism,
Memory Hierarchies
▪Design of New Computer Architecture: RISC,
Specialized architecture
▪Debugging
▪Fault location
▪Model checking in formal analysis
▪Model-driven development
▪Optimization techniques in software engineering
▪Program Translation: Binary translation, Hardware
synthesis, database query interpreters
▪Software productivity tools: Type checking, bounds
checking, memory-management, software
maintenance
▪Visualizations of analysis results
