Unit 1 - Overview of The Compiler and Its Structure

This document discusses a compiler design course. It covers: 1. The evolution of programming languages from machine languages to modern languages like Java. 2. The different types of translators - compilers, interpreters, and assemblers - and how they convert source code to target code. 3. The analysis-synthesis model of compilation, where the analysis phase breaks down source code and the synthesis phase constructs the target code from the intermediate representation.


Hawassa University, Daye Campus
Department of Computer Science
Compiler Design
Course code: CoSc4072
By: Mekonen M.
Unit one
Introduction to Compiler
Discuss on:
• What is a programming language? What are the different categories of programming languages?
• What is a translator in the context of programming languages? List the different kinds of translators.
• Why do we need a compiler?
• How is source code compiled and run? List the steps of compilation.
At the end of this unit you will be able to:
• Explain the role of a translator in converting high-level
programming languages to machine code.
• Differentiate between various types of translators.
• Comprehend the analysis-synthesis model and its
significance in compilation.
• Identify and explain the key components of the
analysis and synthesis phases.
• Explain the sequential steps involved in the
compilation process.
• Identify the purpose of each phase in the compiler.
Topics to be covered
• Evolution of programming language
• Translator
• Analysis synthesis model of compilation
• Phases of compiler
• Grouping of the Phases
• Difference between compiler & interpreter
• Context of compiler (Cousins of compiler)
Evolution of computer programming
• 1940's - the first electronic computers were invented
• They were programmed in machine language, as sequences of 0's and 1's
• Writing such programs was slow, tedious, error-prone, machine-dependent, and hard to understand and modify, although the programs were fast to run
• Early 1950's - mnemonic assembly languages were developed
• At first they were just mnemonic representations of machine instructions; later, macro instructions were added
• Latter half of the 1950's - higher-level languages were developed
• Fortran for scientific computation, Cobol for business data processing, and Lisp for symbolic computation

Mekonen M. # CoSc4072  Unit 1 – Introduction to compiler design 6


Evolution of computer programming cont…
• In the following decades, many more languages were created, and today there are thousands of programming languages
• They can be classified in a variety of ways.
• Classification based on generation
• First-generation languages - machine languages
• Second-generation languages - assembly languages
• Third-generation languages - higher-level languages like Fortran, Cobol, Lisp, C, C++, C#, and Java
• Fourth-generation languages - languages designed for specific applications, e.g. NOMAD for report generation, SQL for database queries
• Fifth-generation languages - logic- and constraint-based languages like Prolog and OPS5


Evolution of computer programming cont…
• Another classification
• Imperative languages - a program specifies how a computation is to be done. E.g. C, C++, C#, and Java
• Declarative languages - a program specifies what computation is to be done. E.g. Prolog, ML and Haskell
• Within the declarative and imperative families, there are several important subclasses.
• von Neumann languages, e.g. C, Fortran
• Object-oriented languages, e.g. C++, Java
• Scripting languages, e.g. Ruby, PHP, Perl, Python
• Logic- or constraint-based languages, e.g. Prolog


Translator
• A translator is a program that takes a program in one form as input and converts it into another form.
• Types of translators are:
1. Compiler
2. Interpreter
3. Assembler

Source Program → Translator → Target Program
The translator also reports Error Messages (if any).
Compiler
• A compiler is a program that reads a program written in a source language and translates it into an equivalent program in a target language.

Source Program → Compiler → Target Program
The compiler also reports Error Messages (if any).

Source program:
    #include <stdio.h>

    int main(void)
    {
        int a = 1, b = 2, c;
        c = a + b;
        printf("%d", c);
        return 0;
    }

Target program (illustrative bit pattern):
    0000 1100 0010 0100
    0111 1000 0001 1111
    0101 1110 1100 0000
    1000 1011


Interpreter
• An interpreter is also a program that reads a program written in a source language and translates it into an equivalent program in the target language, line by line.

Source Program → Interpreter → Target Program
The interpreter also reports Error Messages (if any).

Source program:
    #include <stdio.h>

    int main(void)
    {
        int a = 1, b = 2, c;
        c = a + b;
        printf("%d", c);
        return 0;
    }

Target program (illustrative bit pattern):
    0000 1100 0010 0000
    1111 1100 0010 1010
    1100 0010 0011 1100
    0010 1111


Assembler
• An assembler is a translator that takes assembly code as input and generates machine code as output.

Assembly Code → Assembler → Machine Code
The assembler also reports Error Messages (if any).

Assembly code:
    MOV id3, R1
    MUL #2.0, R1
    MOV id2, R2
    MUL R2, R1
    MOV id1, R2
    ADD R2, R1
    MOV R1, id1

Machine code (illustrative bit pattern):
    0000 1100 0010 0100
    0111 1000 0001 1111
    0101 1110 1100 0000
    1000 1011 1100 0000
    1000


Analysis synthesis model of compilation
• There are two parts of compilation:
1. Analysis Phase
2. Synthesis Phase

Source Code → Analysis Phase → Intermediate Representation → Synthesis Phase → Target Code

Source code:
    #include <stdio.h>

    int main(void)
    {
        int a = 1, b = 2, c;
        c = a + b;
        printf("%d", c);
        return 0;
    }

Target code (illustrative bit pattern):
    0000 1100 0111 1000
    0001 1111 0101 1000
    1011


Analysis phase & Synthesis phase
Analysis Phase
• The analysis part breaks up the source program into constituent pieces and creates an intermediate representation of the source program.
• The analysis phase consists of three sub-phases:
1. Lexical analysis
2. Syntax analysis
3. Semantic analysis
Synthesis Phase
• The synthesis part constructs the desired target program from the intermediate representation.
• The synthesis phase consists of the following sub-phases:
1. Code optimization
2. Code generation
Phases of compiler
Compiler
• Analysis phase: Lexical analysis → Syntax analysis → Semantic analysis
• Intermediate code generation
• Synthesis phase: Code optimization → Code generation


Lexical analysis
• Lexical analysis is also called linear analysis or scanning.
• The lexical analyzer divides the given source statement into tokens.
• Ex: Position = initial + rate * 60 would be grouped into the following tokens:
Position (identifier)
= (assignment symbol)
initial (identifier)
+ (plus symbol)
rate (identifier)
* (multiplication symbol)
60 (number)
• After lexical analysis the statement is represented as: id1 = id2 + id3 * 60
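The scanning step on this slide can be sketched in a few lines of Python. This is only a minimal illustration, not the course's reference implementation: the names TOKEN_SPEC, tokenize and to_id_form, and the token category names, are assumptions made for this example.

```python
import re

# Token categories for the slide's example statement; the category names are
# illustrative choices, not a fixed standard.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("ASSIGN", r"="),
    ("PLUS", r"\+"),
    ("TIMES", r"\*"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Split a source statement into (category, lexeme) pairs."""
    tokens = []
    position = 0
    while position < len(source):
        match = MASTER.match(source, position)
        if match is None:
            raise SyntaxError(f"invalid character {source[position]!r} at column {position}")
        if match.lastgroup != "SKIP":  # whitespace separates tokens but is not one
            tokens.append((match.lastgroup, match.group()))
        position = match.end()
    return tokens


def to_id_form(tokens):
    """Replace each distinct identifier with id1, id2, ... in order of first appearance."""
    names = {}
    parts = []
    for category, lexeme in tokens:
        if category == "IDENTIFIER":
            names.setdefault(lexeme, f"id{len(names) + 1}")
            parts.append(names[lexeme])
        else:
            parts.append(lexeme)
    return " ".join(parts)
```

Calling tokenize("Position = initial + rate * 60") yields the seven tokens listed on the slide, and to_id_form turns them into id1 = id2 + id3 * 60.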


Syntax analysis
• Syntax analysis is also called parsing or hierarchical analysis.
• The syntax analyzer checks whether the token sequence of each statement follows the grammar of the language and reports any syntax errors.
• If the code is error free, the syntax analyzer generates a syntax tree.
• Ex: Position = initial + rate*60 becomes id1 = id2 + id3 * 60 after lexical analysis, and syntax analysis produces the tree:

        =
       / \
    id1   +
         / \
      id2   *
           / \
        id3   60
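One common way to build such a tree is a recursive-descent parser. The sketch below is an assumption-laden illustration rather than the method prescribed by the course: the function name parse_assignment and the nested-tuple tree format (operator, left, right) are invented for this example, and the grammar covers only assignment, + and *.

```python
import re


def parse_assignment(source):
    """Parse 'name = expression' into a nested tuple tree (operator, left, right)."""
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|[=+*]", source)
    position = 0

    def peek():
        return tokens[position] if position < len(tokens) else None

    def advance():
        nonlocal position
        token = peek()
        position += 1
        return token

    def parse_factor():
        # A factor is a single identifier or number.
        token = advance()
        if token is None or token in {"=", "+", "*"}:
            raise SyntaxError(f"expected identifier or number, got {token!r}")
        return token

    def parse_term():
        # term := factor ('*' factor)*   ('*' binds tighter than '+')
        node = parse_factor()
        while peek() == "*":
            advance()
            node = ("*", node, parse_factor())
        return node

    def parse_expression():
        # expression := term ('+' term)*
        node = parse_term()
        while peek() == "+":
            advance()
            node = ("+", node, parse_term())
        return node

    target = parse_factor()
    if target[0].isdigit():
        raise SyntaxError("the assignment target must be an identifier")
    if advance() != "=":
        raise SyntaxError("expected '=' after the assignment target")
    tree = ("=", target, parse_expression())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree
```

For id1 = id2 + id3 * 60 this returns ("=", "id1", ("+", "id2", ("*", "id3", "60"))), which is the tree on the slide written as nested tuples. An invalid token sequence such as id1 = id2 + raises SyntaxError.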




Semantic analysis
• The semantic analyzer determines the meaning of a source string.
• It performs the following operations:
1. Matching of parentheses in the expression.
2. Matching of if..else statements.
3. Checking that arithmetic operations are performed on type-compatible operands.
4. Checking the scope of operation.
*Note: Consider id1, id2 and id3 to be real.
• Ex: since 60 is an integer and id3 is real, semantic analysis inserts an int-to-real conversion into the tree:

        =
       / \
    id1   +
         / \
      id2   *
           / \
        id3   inttoreal
                 |
                 60
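The type-compatibility check and the inserted conversion can be sketched as a recursive walk over the tree. This is a simplified illustration under stated assumptions: the tree is a nested tuple (operator, left, right), the function name check_types and the type names "int" and "real" are invented here, and only the int-to-real case from the slide is handled.

```python
def check_types(node, symbol_types):
    """Return (annotated_tree, type) for an expression tree of nested tuples."""
    if isinstance(node, str):
        if node.isdigit():
            return node, "int"            # integer literal such as 60
        if node not in symbol_types:
            raise NameError(f"undeclared identifier {node!r}")
        return node, symbol_types[node]

    operator, left, right = node
    left_tree, left_type = check_types(left, symbol_types)
    right_tree, right_type = check_types(right, symbol_types)

    # When an int meets a real, wrap the int operand in a conversion node.
    if left_type == "real" and right_type == "int":
        right_tree, right_type = ("inttoreal", right_tree), "real"
    elif left_type == "int" and right_type == "real":
        if operator == "=":
            raise TypeError("cannot assign a real value to an int variable")
        left_tree, left_type = ("inttoreal", left_tree), "real"

    return (operator, left_tree, right_tree), left_type
```

With id1, id2 and id3 declared as real, the tree for id1 = id2 + id3 * 60 comes back with 60 wrapped as ("inttoreal", "60"), matching the second tree on the slide.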


Intermediate code generator
• Two important properties of intermediate code:
1. It should be easy to produce.
2. It should be easy to translate into the target program.
• The intermediate form can be represented using “three address code”.
• Three address code consists of a sequence of instructions, each of which has at most three operands.
• Ex: for the tree produced by semantic analysis, using temporaries t1 (the inttoreal node), t2 (the * node) and t3 (the + node), the intermediate code is:
t1 = inttoreal(60)
t2 = id3 * t1
t3 = t2 + id2
id1 = t3
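Three address code can be produced by a post-order walk of the tree that creates one temporary per interior node. The sketch below assumes the nested-tuple tree format used for illustration here (it is not defined by the slides), and the function name generate_three_address_code is hypothetical.

```python
def generate_three_address_code(tree):
    """Flatten a nested-tuple syntax tree into three-address instructions."""
    instructions = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"t{counter}"

    def visit(node):
        if isinstance(node, str):      # identifier or constant: already an address
            return node
        if node[0] == "inttoreal":     # unary conversion node
            operand = visit(node[1])
            temp = new_temp()
            instructions.append(f"{temp} = inttoreal({operand})")
            return temp
        operator, left, right = node   # binary operator node
        left_address = visit(left)
        right_address = visit(right)
        temp = new_temp()
        instructions.append(f"{temp} = {left_address} {operator} {right_address}")
        return temp

    _, target, expression = tree       # the root is the assignment ('=', target, expr)
    result = visit(expression)
    instructions.append(f"{target} = {result}")
    return instructions
```

For the slide's tree this produces t1 = inttoreal(60), t2 = id3 * t1, t3 = id2 + t2 and id1 = t3. The only difference from the slide is the operand order in the third instruction, which does not change the result of the addition.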




Code optimization
• It improves the intermediate code.
• This is necessary for faster execution of the code or lower memory consumption.
Intermediate code (before optimization):
t1 = inttoreal(60)
t2 = id3 * t1
t3 = t2 + id2
id1 = t3
After code optimization:
t1 = id3 * 60.0
id1 = id2 + t1
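The two improvements visible on this slide are converting the constant 60 at compile time and removing the final copy. A minimal sketch of both follows; the tuple format (dest, op, arg1, arg2), the operation name "copy" and the function name optimize are assumptions for this example, and the copy removal assumes each temporary is used only once.

```python
def optimize(instructions):
    """Apply two simple optimizations to (dest, op, arg1, arg2) instructions."""
    constants = {}   # temporaries known to hold a constant value
    folded = []
    for dest, op, arg1, arg2 in instructions:
        # Substitute any temporary whose constant value is already known.
        arg1 = constants.get(arg1, arg1)
        arg2 = constants.get(arg2, arg2)
        if op == "inttoreal" and arg1.isdigit():
            # Do the conversion at compile time and remember the result.
            constants[dest] = f"{float(arg1)}"
            continue
        folded.append((dest, op, arg1, arg2))

    # Remove a trailing copy 'x = t' by writing the previous result straight into x.
    if len(folded) >= 2 and folded[-1][1] == "copy" and folded[-1][2] == folded[-2][0]:
        target = folded[-1][0]
        _, op, arg1, arg2 = folded[-2]
        folded[-2:] = [(target, op, arg1, arg2)]
    return folded
```

Given the four instructions from the slide, the result is t2 = id3 * 60.0 followed by id1 = t2 + id2. This is the slide's optimized code up to the name of the temporary and the operand order; a real optimizer would also renumber temporaries.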




Code generation
• The intermediate code instructions are translated into a sequence of machine instructions.
Optimized code:
t1 = id3 * 60.0
id1 = id2 + t1
After code generation:
MOV id3, R2
MUL #60.0, R2
MOV id2, R1
ADD R2, R1
MOV R1, id1
Register assignment: id3 → R2, id2 → R1
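A very small code generator for this two-address instruction style can be sketched as follows. It is an illustration under assumptions, not a real back end: the tuple format (dest, op, arg1, arg2), the function name generate_code, and the naive "one new register per instruction" allocation are all invented for this example, and registers are numbered upward from R1, so the numbering differs from the slide.

```python
def generate_code(instructions):
    """Translate (dest, op, arg1, arg2) three-address code into two-address assembly."""
    mnemonics = {"+": "ADD", "*": "MUL"}
    register_of = {}   # temporary name -> register currently holding it
    assembly = []

    def operand(name):
        if name in register_of:
            return register_of[name]   # temporary that lives in a register
        if name[0].isdigit():
            return f"#{name}"          # immediate constant
        return name                    # variable in memory

    for dest, op, arg1, arg2 in instructions:
        register = f"R{len(register_of) + 1}"
        assembly.append(f"MOV {operand(arg1)}, {register}")
        assembly.append(f"{mnemonics[op]} {operand(arg2)}, {register}")
        is_temporary = dest[0] == "t" and dest[1:].isdigit()
        if is_temporary:
            register_of[dest] = register              # keep the value in its register
        else:
            assembly.append(f"MOV {register}, {dest}")  # store the result in memory
    return assembly
```

For the optimized code t1 = id3 * 60.0 and id1 = id2 + t1, the sketch emits the same five instructions as the slide with R1 and R2 swapped.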
Phases of compiler
Source program → Lexical analysis → Syntax analysis → Semantic analysis → Intermediate code → Code optimization → Code generation → Target Program
• Analysis Phase: Lexical analysis, Syntax analysis, Semantic analysis
• Synthesis Phase: Code optimization, Code generation
• The symbol table and error detection and recovery interact with all of the phases.
Symbol table:
Variable Name | Type | Address
Position | Float | 0001
Initial | Float | 0005
Rate | Float | 0009


Symbol table
• Symbol table management is a part of the compiler that interacts with
several of the phases
– Identifiers and their values are found in lexical analysis and placed in the
symbol table
– During syntactical and semantic analysis, type and scope information is added
– During code generation, type information is used to determine what
instructions to use
– During optimization, the results of “live analysis” (which variables are still needed later) may be kept in the symbol table
• Most suitably implemented as a dynamic data structure (linear list, binary tree, hash table)
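A hash-table implementation of the symbol table can be sketched with a Python dict. The class name SymbolTable, its method names, and the 4-byte storage sizes are assumptions made for this illustration; the sizes were chosen so that the addresses match the table shown earlier (0001, 0005, 0009).

```python
class SymbolTable:
    """Symbol table backed by a hash table (a Python dict)."""

    SIZES = {"float": 4, "int": 4}   # assumed storage sizes in bytes

    def __init__(self, first_address=1):
        self.entries = {}
        self.next_address = first_address

    def declare(self, name, type_name):
        """Add an identifier; declaring the same name twice keeps one entry."""
        if name not in self.entries:
            self.entries[name] = {"type": type_name, "address": self.next_address}
            self.next_address += self.SIZES[type_name]
        return self.entries[name]

    def lookup(self, name):
        """Return the entry for a name, or None if it was never declared."""
        return self.entries.get(name)
```

Declaring position, initial and rate as float in that order assigns them the addresses 1, 5 and 9, and later phases can call lookup to read the type when choosing instructions.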


Handling Errors
• Error handling and reporting also occur across many phases
– Lexical analyzer reports invalid character sequences
– Syntactic analyzer reports invalid token sequences
– Semantic analyzer reports type and scope errors, and the like

• The compiler may be able to continue with some errors, but other
errors may stop the process
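As an illustration of continuing after an error, the sketch below shows a scanner that reports an invalid character and then keeps scanning instead of stopping. The name scan_with_recovery, the set of accepted characters and the message format are assumptions for this example.

```python
import re

# A token is an identifier, a number or an operator; any other non-space
# character is captured separately so that it can be reported.
TOKEN = re.compile(r"\s*(?:(?P<valid>[A-Za-z_]\w*|\d+|[=+\-*/()])|(?P<invalid>\S))")


def scan_with_recovery(source):
    """Return (tokens, errors); invalid characters are reported and skipped."""
    tokens, errors = [], []
    for match in TOKEN.finditer(source):
        if match.lastgroup == "valid":
            tokens.append(match.group("valid"))
        else:
            column = match.start("invalid")
            errors.append(f"column {column}: invalid character {match.group('invalid')!r}")
    return tokens, errors
```

Scanning x = b @ c * 2 reports the @ at column 6 and still returns the six valid tokens, so later phases could go on to find further errors in the same run.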



Exercise
• Write the output of all the phases of the compiler for the following statements:
1. x = b-c*2
2. I=p*n*r/100


Grouping of Phases
Front end & back end (Grouping of phases)
Front end
• Depends primarily on the source language and is largely independent of the target machine.
• It includes the following phases:
1. Lexical analysis
2. Syntax analysis
3. Semantic analysis
4. Intermediate code generation
5. Creation of symbol table
Back end
• Depends on the target machine and does not depend on the source program.
• It includes the following phases:
1. Code optimization
2. Code generation phase
3. Error handling and symbol table operations
Difference between compiler & interpreter
Compiler | Interpreter
Scans the entire program and translates it as a whole into machine code. | Translates the program one statement at a time.
It generates intermediate code. | It does not generate intermediate code.
An error is displayed after the entire program is checked. | An error, if any, is displayed for every instruction interpreted.
Memory requirement is more. | Memory requirement is less.
Example: C compiler | Example: Basic, Python, Ruby
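The difference in when errors are displayed can be imitated with a toy language. This sketch models only that one row of the comparison and nothing else about real compilers or interpreters: the mini language (statements of the form name = a + b and print name) and the function names execute, names_used, interpret and compile_and_run are all invented for this example.

```python
def names_used(statement):
    """Return (defined_name_or_None, [names read]) for one statement."""
    if statement.startswith("print "):
        return None, [statement[6:].strip()]
    target, expression = (part.strip() for part in statement.split("=", 1))
    terms = [term.strip() for term in expression.split("+")]
    return target, [term for term in terms if not term.isdigit()]


def execute(statement, variables, output):
    """Run one statement, reading and updating the variables dict."""
    if statement.startswith("print "):
        output.append(variables[statement[6:].strip()])
    else:
        target, expression = (part.strip() for part in statement.split("=", 1))
        variables[target] = sum(
            int(term) if term.strip().isdigit() else variables[term.strip()]
            for term in expression.split("+")
        )


def interpret(program):
    """Handle one statement at a time; stop at the first error."""
    variables, output = {}, []
    for number, statement in enumerate(program, start=1):
        try:
            execute(statement, variables, output)
        except KeyError as error:
            return output, [f"line {number}: undefined name {error.args[0]!r}"]
    return output, []


def compile_and_run(program):
    """Check the whole program first; run it only if no errors were found."""
    defined, errors = set(), []
    for number, statement in enumerate(program, start=1):
        target, used = names_used(statement)
        errors += [f"line {number}: undefined name {name!r}"
                   for name in used if name not in defined]
        if target is not None:
            defined.add(target)
    if errors:
        return [], errors
    return interpret(program)
```

For the program x = 1 + 2, print x, y = x + z, print w, the interpreter-style function prints 3 and then stops at the error on line 3, while the compiler-style function reports the errors on lines 3 and 4 together and produces no output.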


Context of compiler (Cousins of compiler)
• In addition to the compiler, many other system programs are required to generate absolute machine code.
• These system programs are:
• Preprocessor
• Assembler
• Linker
• Loader

Skeletal Source Program → Preprocessor → Source Program → Compiler → Target Assembly Program → Assembler → Relocatable Object Code → Linker / Loader (together with Libraries & Object Files) → Absolute Machine Code
Preprocessor
• Some of the tasks performed by the preprocessor:
1. Macro processing: Allows the user to define macros. Ex: #define PI 3.14159265358979323846
2. File inclusion: A preprocessor may include a header file into the program. Ex: #include<stdio.h>
3. Rational preprocessor: It augments an older language with built-in macros for constructs like the while statement or the if statement.
4. Language extensions: Add capabilities to the language by using built-in macros.
• Ex: the language Equel is a database query language embedded in C. Statements beginning with ## are taken by the preprocessor to be database access statements unrelated to C, and are translated into procedure calls on routines that perform the database access.
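Macro processing and file inclusion can be imitated with simple text substitution. The sketch below is a heavily simplified illustration and not how a real C preprocessor works: the function name preprocess and the headers dictionary (which stands in for header files on disk) are assumptions, only object-like macros with a value are handled, and macros defined inside an included header are not visible to the including file.

```python
import re


def preprocess(source, headers):
    """Expand object-like #define macros and #include lines in C-like text."""
    macros = {}
    output = []
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith("#define"):
            # '#define NAME value' records a macro and emits no text.
            _, name, value = stripped.split(None, 2)
            macros[name] = value
        elif stripped.startswith("#include"):
            # Replace the directive with the (preprocessed) header contents.
            header = stripped[len("#include"):].strip().strip('<>"')
            output.extend(preprocess(headers[header], headers).splitlines())
        else:
            # Replace whole-word occurrences of every macro name seen so far.
            for name, value in macros.items():
                line = re.sub(rf"\b{re.escape(name)}\b", lambda match: value, line)
            output.append(line)
    return "\n".join(output)
```

Given a source text that includes stdio.h, defines PI as on the slide and then uses PI in an expression, the output contains the header text followed by the expression with PI replaced by 3.14159265358979323846.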
Compiler
• A compiler is a program that reads a program written in a source language and translates it into an equivalent program in a target language.
Assembler
• An assembler is a translator that takes an assembly program (mnemonics) as input and generates machine code as output.
Linker
• A linker makes a single program from several files of relocatable machine code.
• These files may have been the result of several different compilations, and one or more may be library files.
Loader
• The process of loading consists of:
– Taking relocatable machine code
– Altering the relocatable addresses
– Placing the altered instructions and data in memory at the proper locations.
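The address adjustments done by a linker and a loader can be sketched on a toy instruction format. Everything in this example is an assumption made for illustration: each instruction is a tuple (opcode, operand, is_address), addresses count instructions rather than bytes, and the function names link and load are hypothetical. Symbol resolution, which real linkers also perform, is left out.

```python
def link(modules):
    """Concatenate modules, shifting each module's relocatable addresses by its offset."""
    program = []
    for module in modules:
        base = len(program)   # where this module starts in the combined program
        for opcode, operand, is_address in module:
            program.append((opcode, operand + base if is_address else operand, is_address))
    return program


def load(relocatable_code, load_address, memory):
    """Place relocatable instructions into memory starting at load_address."""
    for offset, (opcode, operand, is_address) in enumerate(relocatable_code):
        if is_address:
            operand += load_address   # alter the relocatable address
        memory[load_address + offset] = (opcode, operand)
    return memory
```

Linking two two-instruction modules and loading the result at address 100 places the four instructions at addresses 100 to 103, with every address operand shifted so that it still points at the same data item.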
Thank You
