Chapter 1 - Introduction

This document provides an introduction to Compiler Design, outlining its objectives, phases, and the roles of various components such as compilers, interpreters, assemblers, linkers, and loaders. It details the compilation process, which includes analysis (lexical, syntax, and semantic) and synthesis (code optimization and generation). Additionally, it discusses the tools used in compiler construction and the importance of understanding these concepts in computer science.

Uploaded by

alula girma


Compiler Design

(CoSc-3112)

Chapter One
Introduction

Gezahiegn Tessema (MSc.)

Department of Computer Science
Dilla University
Objective
At the end of this session, students will be able to:
 Understand the basic concepts and principles of Compiler Design
 Understand the term compiler, its functions, and how it works.

 Be familiar with the different classifications of compilers.

 Be familiar with cousins of the compiler: Linkers, Loaders, Interpreters, Assemblers
 Understand the need to study compiler design and construction

 Understand the phases of compilation and the steps of compilation

Outline
• Introduction
• Analysis and Synthesis in compilation
• Various phases in a compilation
• Compiler construction tools
Introduction to Language Processors

 A compiler is an executable program that reads a program in a high-level language and translates it into an equivalent executable program in machine language.
 More generally, a compiler is a computer program that translates a program in a source language into an equivalent program in a target language.
 A source program/code is a program/code written in the source language, which is usually a high-level language.
 A target program/code is a program/code written in the target language, which is often a machine language or an intermediate code (object code).
[Figure: source program (e.g. C++) → Compiler → target program (e.g. Assembly); the compiler also produces error messages]
Cousins of Compilers

A. Interpreter:- is another common kind of language processor. Instead of producing a target program, an interpreter appears to directly execute the operations specified in the source program on inputs supplied by the user.
 It produces the output of each statement as that statement is translated.
 It generally uses one of the following strategies for program execution:
i. Execute the source code directly
ii. Translate the source code into some efficient intermediate representation and immediately execute it
iii. Explicitly execute stored precompiled code made by a compiler that is part of the interpreter system

B. Assembler:- is a translator that converts programs written in assembly language into machine code.
 Translates mnemonic operation codes into their machine language equivalents.
 Assigns machine addresses to symbolic labels.
C. Linker:- is a program that takes one or more object files generated by a compiler and combines them into a single executable program.
D. Loader:- is the part of an operating system that is responsible for loading programs from executables (i.e., executable files) into memory, preparing them for execution, and then executing them.
There are four language translator phases: preprocessor, compiler, assembler, and linker/loader.
[Figure: source program → preprocessor → modified source program → compiler → target assembly program → assembler → relocatable machine code → linker/loader (together with library files and relocatable object files) → target machine code]
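The two assembler tasks listed above (translating mnemonics into opcodes and assigning addresses to symbolic labels) can be sketched as a tiny two-pass assembler. This is only an illustration: the mnemonics, opcode numbers, and the one-address-per-instruction layout are hypothetical and do not describe any real machine.

```python
# Hypothetical opcode table: mnemonic -> numeric opcode (not a real machine).
OPCODES = {"LOAD": 1, "ADD": 2, "STORE": 3, "JMP": 4, "HALT": 5}


def assemble(lines):
    """Two-pass assembly of 'label: MNEMONIC [operand]' lines.

    Assumption: every instruction occupies exactly one address.
    """
    # Pass 1: assign a machine address to each symbolic label.
    symbols = {}
    instructions = []
    for line in lines:
        line = line.strip()
        if ":" in line:
            label, line = line.split(":", 1)
            symbols[label.strip()] = len(instructions)
            line = line.strip()
        if line:
            instructions.append(line.split())

    # Pass 2: translate mnemonics and resolve label operands.
    code = []
    for parts in instructions:
        opcode = OPCODES[parts[0]]
        operand = 0
        if len(parts) > 1:
            arg = parts[1]
            operand = symbols[arg] if arg in symbols else int(arg)
        code.append((opcode, operand))
    return symbols, code


program = [
    "start: LOAD 10",
    "       ADD 11",
    "loop:  STORE 12",
    "       JMP loop",
    "       HALT",
]
symbols, code = assemble(program)
print(symbols)
print(code)
```

Two passes are used so that a jump may refer to a label defined later in the program.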
Analysis and Synthesis in compilation

There are two parts to compilation: analysis and synthesis.

 During analysis, the operations implied by the source program are determined and recorded in a hierarchical structure called a tree.
 During synthesis, the tree structure is taken and its operations are translated into the target program.

Analysis (Front End)
1. Lexical Analysis
2. Syntax Analysis
3. Semantic Analysis
 Breaks up the source program into constituent pieces
 Imposes a grammatical structure on these pieces
 Creates an intermediate representation of the source program
 Collects information about the source program and stores it in a symbol table

Synthesis (Back End)
4. Code Generation
5. Optimization
 Constructs the target program from the intermediate representation
 Takes the tree structure and translates the operations into the target program
Various phases in a compilation
Analysis
1. Linear/Lexical analysis (Scanning)
 The stream of characters is read from left to right and grouped into tokens.
 A token is a sequence of characters having a collective meaning.
 Token is the basic lexical unit.
 Examples of tokens:
• Identifiers (e.g. variable names)
• Keywords
• Symbols (+, -, …)
• Numbers
• Etc.
 Blanks, newlines, and tab characters are removed during lexical analysis.
 Example
DIST1 = DIST2 + 5 * 4
<IDENT,1> <ASSIGN> <IDENT,2> <PLUS> <NUMB,3> <MULT> <NUMB,4>
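The scanning step in this example can be sketched as a small hand-written scanner. The names `scan`, `enter`, and `format_tokens` are hypothetical, and the sketch assumes that identifiers and numbers share one table whose 1-based index is stored in the token, which is how the numbering in the example appears to work.

```python
def enter(table, lexeme):
    """Return the 1-based table index of a lexeme, adding it if new."""
    if lexeme not in table:
        table.append(lexeme)
    return table.index(lexeme) + 1


def scan(source):
    """Read characters left to right and group them into tokens."""
    operators = {"=": "ASSIGN", "+": "PLUS", "*": "MULT"}
    table = []    # lexemes of identifiers and numbers, in order of first use
    tokens = []
    i = 0
    while i < len(source):
        ch = source[i]
        if ch.isspace():                      # blanks are dropped
            i += 1
        elif ch.isalpha():                    # identifier: letter, then letters/digits
            j = i
            while j < len(source) and source[j].isalnum():
                j += 1
            tokens.append(("IDENT", enter(table, source[i:j])))
            i = j
        elif ch.isdigit():                    # number: run of digits
            j = i
            while j < len(source) and source[j].isdigit():
                j += 1
            tokens.append(("NUMB", enter(table, source[i:j])))
            i = j
        elif ch in operators:
            tokens.append((operators[ch],))
            i += 1
        else:
            raise ValueError(f"unexpected character {ch!r}")
    return tokens, table


def format_tokens(tokens):
    return " ".join("<" + ",".join(str(p) for p in tok) + ">" for tok in tokens)


tokens, table = scan("DIST1 = DIST2 + 5 * 4")
print(format_tokens(tokens))
# <IDENT,1> <ASSIGN> <IDENT,2> <PLUS> <NUMB,3> <MULT> <NUMB,4>
```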
2. Hierarchical/Syntax analysis (Parsing)
 Tokens are grouped hierarchically into nested collections with collective
meaning.
 The result is generally a parse tree.
 Most syntactic errors in the source program are caught in this phase.
 Syntactic rules of the source language are given via a Grammar.
 Consider the previous example, DIST1 = DIST2 + 5 * 4, whose token sequence is:
IDENT ASSIGN IDENT PLUS NUMB MULT NUMB
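As a rough illustration of hierarchical grouping, the sketch below parses that token sequence with a recursive-descent parser. The grammar is an assumption made for this example (assignment → IDENT ASSIGN expr; expr → term (PLUS term)*; term → factor (MULT factor)*; factor → IDENT | NUMB), and the result is a compact tree with operators as interior nodes rather than a full parse tree.

```python
class Parser:
    """Recursive-descent parser over (kind, lexeme) token pairs."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos][0] if self.pos < len(self.tokens) else "EOF"

    def expect(self, kind):
        if self.peek() != kind:
            raise SyntaxError(f"expected {kind}, found {self.peek()}")
        lexeme = self.tokens[self.pos][1]
        self.pos += 1
        return lexeme

    def assignment(self):
        target = self.expect("IDENT")
        self.expect("ASSIGN")
        value = self.expr()
        if self.peek() != "EOF":
            raise SyntaxError(f"unexpected {self.peek()} after expression")
        return ("ASSIGN", target, value)

    def expr(self):
        node = self.term()
        while self.peek() == "PLUS":
            self.expect("PLUS")
            node = ("PLUS", node, self.term())
        return node

    def term(self):
        # MULT is handled one level below PLUS, so it binds tighter.
        node = self.factor()
        while self.peek() == "MULT":
            self.expect("MULT")
            node = ("MULT", node, self.factor())
        return node

    def factor(self):
        if self.peek() == "IDENT":
            return self.expect("IDENT")
        return self.expect("NUMB")


tokens = [
    ("IDENT", "DIST1"), ("ASSIGN", "="), ("IDENT", "DIST2"),
    ("PLUS", "+"), ("NUMB", "5"), ("MULT", "*"), ("NUMB", "4"),
]
tree = Parser(tokens).assignment()
print(tree)
# ('ASSIGN', 'DIST1', ('PLUS', 'DIST2', ('MULT', '5', '4')))
```

A token sequence that does not fit the grammar, such as IDENT ASSIGN PLUS, raises `SyntaxError`, which corresponds to catching syntactic errors in this phase.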
3. Semantic analysis
 Certain checks are performed to make sure that the components of the program fit together meaningfully.
 Unlike parsing, this phase checks for semantic errors in the source program (e.g. type mismatch).
 Semantic analysis uses the symbol table.
 Symbol table:- is a data structure with a record for each identifier and its attributes
• Attributes include storage allocation, type, scope, etc.
• All the compiler phases insert into and modify the symbol table
 The result of semantic analysis is Intermediate Code (IC).
 The IC can be represented using either an abstract syntax tree or three-address code.
 The previous example in three-address code:
TEMP1 = 5 * 4
TEMP2 = inttoreal( TEMP1 )
TEMP3 = IDENT2 + TEMP2
IDENT1 = TEMP3
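The three-address code above can be reproduced by a short tree walk. This is a sketch under assumptions: the function name `gen_tac` and the tree shape are hypothetical, and IDENT1 and IDENT2 are assumed to be declared as real, which is what the `inttoreal` conversion in the example suggests.

```python
def gen_tac(tree, types):
    """Emit three-address code for ('=', target, expr).

    `types` plays the role of the symbol table: identifier -> 'int' or 'real'.
    """
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"TEMP{counter}"

    def to_real(name, typ):
        if typ == "real":
            return name
        temp = new_temp()
        code.append(f"{temp} = inttoreal( {name} )")
        return temp

    def walk(node):
        """Return (name holding the value, type of the value)."""
        if isinstance(node, str):
            return (node, "int") if node.isdigit() else (node, types[node])
        op, left, right = node
        lname, ltype = walk(left)
        rname, rtype = walk(right)
        result_type = "int"
        if "real" in (ltype, rtype):          # mixed operands: convert to real
            lname = to_real(lname, ltype)
            rname = to_real(rname, rtype)
            result_type = "real"
        temp = new_temp()
        code.append(f"{temp} = {lname} {op} {rname}")
        return temp, result_type

    _, target, expr = tree
    name, typ = walk(expr)
    if types[target] == "real":
        name = to_real(name, typ)
    code.append(f"{target} = {name}")
    return code


tree = ("=", "IDENT1", ("+", "IDENT2", ("*", "5", "4")))
tac = gen_tac(tree, {"IDENT1": "real", "IDENT2": "real"})
print("\n".join(tac))
```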
Synthesis
Synthesis is composed of two phases:
1. Code optimization
2. Code generation
1. Code optimization
 This phase changes the IC so that the code generator produces a faster and less memory-consuming program.
 The optimized code does the same thing as the original (non-optimized) code but at a lower cost in terms of CPU time and memory space.
 There are several techniques for optimizing code; they will be discussed in the last chapter.
 Example
Unnecessary lines of code in loops (i.e. code that could be executed outside of the loop) are moved out of the loop.
Before:
for(i=1; i<10; i++)
{ x = y+1;
z = x+i;
}
After:
x = y+1;
for(i=1; i<10; i++)
z = x+i;
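To check that moving the invariant statement out of the loop preserves behavior, the two versions can be compared directly. The sketch below applies the transformation by hand in Python; the function names are hypothetical. The move is safe here because the loop body is known to run at least once and `y` does not change inside the loop.

```python
def original(y):
    x = z = None
    for i in range(1, 10):
        x = y + 1        # loop-invariant: same value on every iteration
        z = x + i
    return x, z


def optimized(y):
    x = y + 1            # hoisted: computed once, before the loop
    z = None
    for i in range(1, 10):
        z = x + i
    return x, z


# Both versions leave x and z with identical final values.
print(original(5), optimized(5))
# (6, 15) (6, 15)
```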
2. Code generation
 The final phase of the compiler.
 Generates the target code in the target language (e.g. Assembly)
 The instructions in the IC are translated into a sequence of machine instructions that perform the same task.
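A minimal sketch of this translation, assuming a hypothetical single-register target: the mnemonics LOAD, STORE, ADD, SUB, MUL, DIV, and ITOR and the register R1 are invented for illustration and do not belong to a real instruction set. Each three-address instruction is expanded into a short fixed instruction sequence.

```python
import re

# Hypothetical mapping from IC operators to target mnemonics.
OPS = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}


def generate(tac_lines):
    """Translate three-address code lines into assembly-like text."""
    asm = []
    for line in tac_lines:
        dest, rhs = [part.strip() for part in line.split("=", 1)]
        binary = re.fullmatch(r"(\w+)\s*([-+*/])\s*(\w+)", rhs)
        convert = re.fullmatch(r"inttoreal\(\s*(\w+)\s*\)", rhs)
        if binary:                            # dest = left op right
            left, op, right = binary.groups()
            asm += [f"LOAD R1, {left}", f"{OPS[op]} R1, {right}"]
        elif convert:                         # dest = inttoreal( operand )
            asm += [f"LOAD R1, {convert.group(1)}", "ITOR R1"]
        else:                                 # plain copy: dest = operand
            asm.append(f"LOAD R1, {rhs}")
        asm.append(f"STORE {dest}, R1")
    return asm


asm = generate([
    "TEMP1 = 5 * 4",
    "TEMP2 = inttoreal( TEMP1 )",
    "IDENT1 = TEMP2",
])
print("\n".join(asm))
```

Storing every temporary back to memory is wasteful; a real code generator would keep values in registers where possible.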
Phase I: Lexical Analysis

 The low-level text processing portion of the compiler.
 The source file, a stream of characters, is broken into larger chunks called tokens.
For example:
void main()
{
int x;
x=3;
}
It will be broken into 13 tokens as below:
void main ( ) { int x ; x = 3 ; }
 The lexical analyzer (scanner) reads a stream of characters and puts them together into meaningful (with respect to the source language) units called tokens.

 Typically, spaces, tabs, end-of-line characters and comments are ignored by the lexical analyzer.
 To design a lexical analyzer: input a description (regular expressions) of the tokens in the language, and
output a lexical analyzer (a program).
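The idea of describing tokens with regular expressions can be sketched with Python's `re` module. The token classes and patterns below are assumptions that cover only the small example above, not a complete description of C.

```python
import re

# Token descriptions as (name, regular expression) pairs; order matters,
# so keywords are tried before identifiers.
TOKEN_SPEC = [
    ("COMMENT", r"/\*.*?\*/|//[^\n]*"),
    ("KEYWORD", r"\b(?:void|int)\b"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("NUMBER", r"\d+"),
    ("SYMBOL", r"[(){};=]"),
    ("SKIP", r"\s+"),
    ("ERROR", r"."),
]
SCANNER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
    re.DOTALL,
)


def tokenize(text):
    """Return (kind, lexeme) pairs, ignoring whitespace and comments."""
    tokens = []
    for match in SCANNER.finditer(text):
        kind = match.lastgroup
        if kind in ("SKIP", "COMMENT"):
            continue
        if kind == "ERROR":
            raise ValueError(f"unexpected character {match.group()!r}")
        tokens.append((kind, match.group()))
    return tokens


source = """void main()
{
    int x;   // declare x
    x=3;
}"""
tokens = tokenize(source)
print(len(tokens))                      # 13
print(" ".join(lexeme for _, lexeme in tokens))
# void main ( ) { int x ; x = 3 ; }
```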
Phase II: Parsing (Syntax Analysis)
 A parser gets a stream of tokens from the scanner and determines whether the syntax (structure) of the program is correct according to the (context-free) grammar of the source language.
 Then, it produces a data structure, called a parse tree or an abstract syntax tree, which describes the syntactic structure of the program.
 The parser ensures that the sequence of tokens returned by the lexical analyzer forms a syntactically correct program.
 It also builds a structured representation of the program called an abstract syntax tree that is easier for the type checker to analyze than a stream of tokens.
 It catches syntax errors such as the one in the statement below:

if if (x > 3) then x = x + 1
 Context-free grammars are used (as the input) by the parser generator to describe the syntax of the language being compiled.
 Most compilers do not generate a parse tree explicitly but rather go to intermediate code directly.
Parse Tree

 The output of parsing; it gives a top-down description of the program's syntax.
 The root node is the entire program and the leaves are the tokens that were identified during lexical analysis.
 Constructed by repeated application of the rules of a Context Free Grammar (CFG).
 Syntax structures are analyzed by a DPDA (Deterministic Push Down Automaton).

Example: parse tree for position := initial + rate * 60

Phase III: Semantic Analysis

 It gets the parse tree from the parser together with information about some syntactic elements.
 It determines whether the semantics (meaning) of the program is correct.
 It detects errors in the program, such as using variables before they are declared or assigning an integer value to a Boolean variable.
 This part deals with static semantics:
 the semantics of a program that can be checked by reading the program text alone;
 rules of the language that cannot be described by a context-free grammar.
 Mostly, a semantic analyzer does type checking (i.e. it gathers type information for subsequent code generation).
 It modifies the parse tree in order to obtain (statically) semantically correct code.
 In this phase, the abstract syntax tree that is produced by the parser is traversed, looking for semantic errors.
Contd.
 The main tool used by the semantic analyzer is a symbol table
 Symbol table:- is a data structure with a record for each identifier and its attributes
 Attributes include storage allocation, type, scope, etc
 All the compiler phases insert into and modify the symbol table
 Discovery of meaning in a program using the symbol table
 Perform static semantic checks
 Simplify the structure of the parse tree (from parse tree to abstract syntax tree (AST))
Static semantics check
 Making sure identifiers are declared before use
 Type checking for assignments and operators
 Checking types and number of parameters to subroutines
 Making sure functions contain return statements
 Making sure there are no repeats among switch statement labels
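The first two checks in this list (declaration before use and type checking for assignments) can be sketched with a dictionary standing in for the symbol table. The statement format and the function name `check` are hypothetical, chosen only to keep the example short.

```python
def check(statements):
    """Return the static semantic errors found in a list of statements.

    Statements are ('declare', name, type) or ('assign', target, value),
    where value is an int literal, a bool literal, or an identifier name.
    """
    symbol_table = {}    # identifier -> declared type
    errors = []

    def type_of(operand):
        if isinstance(operand, bool):     # test bool first: bool is a kind of int
            return "bool"
        if isinstance(operand, int):
            return "int"
        if operand not in symbol_table:
            errors.append(f"'{operand}' used before declaration")
            return None
        return symbol_table[operand]

    for stmt in statements:
        if stmt[0] == "declare":
            _, name, declared_type = stmt
            symbol_table[name] = declared_type
        elif stmt[0] == "assign":
            _, target, value = stmt
            target_type = type_of(target)
            value_type = type_of(value)
            if target_type and value_type and target_type != value_type:
                errors.append(
                    f"cannot assign {value_type} to {target_type} '{target}'"
                )
    return errors


program = [
    ("declare", "flag", "bool"),
    ("assign", "flag", 3),          # integer value assigned to a Boolean
    ("assign", "count", 1),         # 'count' was never declared
    ("declare", "n", "int"),
    ("assign", "n", 7),             # fine
]
errors = check(program)
print("\n".join(errors))
```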
Phase IV: Intermediate Code Generation

 An intermediate code generator
 takes a parse tree from the semantic analyzer
 generates a program in the intermediate language.
 In some compilers, a source program is translated into an intermediate code first and then the intermediate code is translated into the target language.
 In other compilers, a source program is translated directly into the target language.
 The compiler makes a second pass over the parse tree to produce the translated code.
 If there are no compile-time errors, the semantic analyzer translates the abstract syntax tree into the abstract assembly tree.
 The abstract assembly tree is passed to the code optimization and assembly code generation phase.
Contd.

 Using intermediate code is beneficial when a compiler must translate a single source language into many target languages.
 The front end of the compiler (scanner through intermediate code generator) can be reused for every target.
 A different back end (code optimizer and code generator) is required for each target language.
 One popular intermediate code is three-address code.
 A three-address code instruction has the form x = y op z.

Phase V: Assembly Code Generation

 The code generator converts the abstract assembly tree into the actual assembly code.
 To do code generation:
 the generator covers the abstract assembly tree with tiles (each tile represents a small portion of an abstract assembly tree), and
 outputs the actual assembly code associated with the tiles used to cover the tree.

Phase VI: Machine Code Generation and Linking

 The final phase of compilation converts the assembly code into machine code and links in (by a linker) the appropriate language libraries.

Code Optimization

 Replacing an inefficient sequence of instructions with a better sequence of instructions.
 Sometimes called code improvement.
 Code optimization can be done:
 after semantic analysis, performed on a parse tree
 after intermediate code generation, performed on intermediate code
 after code generation, performed on target code
 Two types of optimization
1. Local
2. Global
Local Optimization

 The compiler looks at a very small block of instructions and tries to determine how it can improve the efficiency of this local code block.

 Relatively easy; included as part of most compilers

Examples of possible local optimizations

1. Constant evaluation

2. Strength reduction

3. Eliminating unnecessary operations
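The three local optimizations listed above can be sketched on single three-address instructions. The tuple format `(dest, left, op, right)` and the function name `optimize` are assumptions made for this example, and only a few representative rewrite rules are shown.

```python
def optimize(instr):
    """Apply one local optimization to (dest, left, op, right), if any fits.

    Operands are int constants or variable names; a simplified result is
    returned as (dest, value).
    """
    dest, left, op, right = instr

    # 1. Constant evaluation: both operands are known at compile time.
    if isinstance(left, int) and isinstance(right, int):
        value = {"+": left + right, "-": left - right, "*": left * right}[op]
        return (dest, value)

    # 3. Eliminating unnecessary operations: x + 0 and x * 1 are just x.
    if (op == "+" and right == 0) or (op == "*" and right == 1):
        return (dest, left)

    # 2. Strength reduction: replace x * 2 with the cheaper x + x.
    if op == "*" and right == 2:
        return (dest, left, "+", left)

    return instr


print(optimize(("t1", 5, "*", 4)))        # ('t1', 20)
print(optimize(("t2", "x", "*", 2)))      # ('t2', 'x', '+', 'x')
print(optimize(("t3", "y", "+", 0)))      # ('t3', 'y')
```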

Global Optimization

 The compiler looks at large segments of the program to decide how to improve performance.
 Much more difficult; usually omitted from all but the most sophisticated and expensive production-level "optimizing compilers".
 Optimization cannot make an inefficient algorithm efficient.
Compiler construction tools

 Compiler writers, like other software developers, can use modern software development environments containing tools such as language editors, debuggers, version managers, profilers, test harnesses, and so on.
 In addition, more specialized tools have been created to help implement various phases of a compiler.
 Some commonly used compiler-construction tools include:
 Parser generators:- that automatically produce syntax analyzers from a grammatical description of a programming language.
 Compiler-construction toolkits:- that provide an integrated set of routines for constructing various phases of a compiler.
 Scanner generators:- that produce lexical analyzers from a regular-expression description of the tokens of a language.
 Syntax-directed translation engines:- that produce collections of routines for walking a parse tree and generating intermediate code.
 Code-generator generators:- that produce a code generator from a collection of rules for translating each operation of the intermediate language into the machine language for a target machine.
 Data-flow analysis engines:- that facilitate the gathering of information about how values are transmitted from one part of a program to each other part. Data-flow analysis is a key part of code optimization.
