COMPILER CONSTRUCTION

MUHAMMAD BASIT ALI GILANI

Policies and Guidelines
▪ Attendance policy: marking at start
▪ Plagiarism policy: as per outline
▪ Do’s
◦ Be interactive, ask questions
◦ Participate in the lecture
◦ Relax and learn

▪ Don’ts
◦ Use of cell phones
◦ Discussion with fellows during class (unless told otherwise)
Compilers
▪ A compiler is a program that can read a program in one language - the source
language - and translate it into an equivalent program in another language - the target
language
Compilers
▪ An important role of the compiler is to report any errors in the source program that it
detects during the translation process.
▪ If the target program is an executable machine-language program, it can then be called
by the user to process inputs and produce outputs.
Interpreters
▪ An interpreter translates and executes the source code line by line while the program is running,
instead of producing a target program first.
Compiler vs Interpreter
▪ A compiler analyzes the entire program at once, which takes comparatively long,
whereas an interpreter analyzes a single line of code at a time, which takes very little time per line.
▪ A compiler generates intermediate object code, whereas an interpreter does not produce
any intermediate object code.
▪ A compiler requires more memory because it creates object code, whereas an
interpreter requires less because it does not create intermediate object code.
Compiler vs Interpreter
▪ A compiler displays all errors together after compilation, whereas an interpreter displays
the errors of each line one by one.
▪ C, C++, and C# are examples of languages that are typically compiled, whereas Python is an
example of a language that is typically interpreted.
Example of Java Compilation Process
▪ Java language processors combine compilation and interpretation.
▪ A Java source program may first be compiled into an intermediate form called
bytecodes. The bytecodes are then interpreted by a virtual machine.
Example of Java Compilation Process
▪ A benefit of this arrangement is that bytecodes
compiled on one machine can be interpreted on
another machine.
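The compile-then-interpret arrangement can be tried out in Python, which works in a broadly similar way. This is only an analogy, not Java itself: the built-in compile() turns source text into a code object holding bytecode, and exec() hands that bytecode to the Python virtual machine. The variable names are illustrative.

```python
# Analogy only: CPython, like a Java toolchain, first compiles source text
# to bytecode and then runs that bytecode on a virtual machine.
source = "result = 6 * 7"

# Step 1: compile the source text into a code object that holds bytecode.
code_object = compile(source, "<demo>", "exec")
bytecode = code_object.co_code  # raw bytes understood by the Python VM

# Step 2: let the virtual machine interpret the bytecode.
namespace = {}
exec(code_object, namespace)

print(type(bytecode).__name__)  # bytes
print(namespace["result"])      # 42
```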
Why need to study Compiler Construction?
▪ A machine understands only binary language, so a compiler is needed
to convert a high-level language into a low-level language.
▪ Anyone who does any software development needs to use a compiler. It is a good idea
to understand what is going on inside the tools that you use.
Generations of Programming Languages
▪ First Generation of PL (Machine Language)
▪ Second Generation of PL (Assembly Language)
▪ Third Generation of PL (Procedural Language)
▪ Fourth Generation of PL (Very High Level Language)
▪ Fifth Generation of PL
First Generation of PL / Machine Language
▪ The first generation of languages is also called machine language or 1GL. This
language is machine-dependent. Machine-language statements are written in
binary code (0/1 form) because the computer can understand only binary language.
▪ The first electronic computers appeared in the 1940's and were programmed in
machine language by sequences of 0's and 1's that explicitly told the computer what
operations to execute and in what order.
▪ The operations themselves were very low level: move data from one location to
another, add the contents of two registers, compare two values, and so on.
First Generation of PL / Machine Language
▪ The main advantage of programming in 1GL is that the code can run very fast and very
efficiently, precisely because the instructions are executed directly by the central
processing unit (CPU).
▪ One of the main disadvantages of programming in a low level language is that when an
error occurs, the code is not as easy to fix.
Second Generation of PL / Assembly Language
▪ The second-generation programming language also belongs to the category of low-
level programming language. The second generation language comprises assembly
languages that use the concept of mnemonics for the writing program.
▪ Assembly languages were introduced in the 1950s to mitigate the error-prone and excessively
difficult nature of binary programming.
▪ MNEMONIC: The English word mnemonic means "a device such as a pattern of letters,
ideas, or associations that assists in remembering something." Assembly language
programmers use mnemonics to remember the operations a machine can perform,
such as "ADD", "MUL", and "MOV". The exact set of mnemonics is assembler-specific.
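The link between mnemonics and binary machine code can be sketched with a toy assembler. Everything here is an assumption made for illustration: the opcode and register bit patterns are invented and do not belong to any real processor, and assemble() is a hypothetical helper.

```python
# Hypothetical encoding tables: these bit patterns are invented for
# illustration and do not correspond to any real instruction set.
OPCODES = {"MOV": "0001", "ADD": "0010", "MUL": "0011"}
REGISTERS = {"R0": "00", "R1": "01", "R2": "10", "R3": "11"}

def assemble(line):
    """Translate one 'MNEMONIC Rdst, Rsrc' line into a binary string."""
    mnemonic, operands = line.split(maxsplit=1)
    dst, src = [name.strip() for name in operands.split(",")]
    return OPCODES[mnemonic] + REGISTERS[dst] + REGISTERS[src]

program = ["MOV R1, R0", "ADD R1, R2", "MUL R3, R1"]
machine_code = [assemble(line) for line in program]

for text, bits in zip(program, machine_code):
    print(f"{text:12} -> {bits}")
```

The programmer writes the left-hand column; the assembler produces the right-hand column of 0s and 1s.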
Third Generation of PL / Procedural Language
▪ The third-generation programming languages were designed to overcome the various
limitations of the first and second-generation programming languages.
▪ The third generation is also called procedural language. It uses a series
of English-like words that humans can understand easily to write instructions. It is also
called High-Level Programming Language.
▪ For execution, a program in this language needs to be translated into machine
language using a Compiler/ Interpreter.
▪ C, C++, C#, and Java are high-level languages
Fourth Generation of PL
▪ The fourth-generation programming language is one step ahead of the third-
generation programming language. The programs are much easier to write and debug
than 3GLs.
▪ There are built-in GUI (Graphical user interfaces) objects like buttons, dropdown
menus, add-ins, etc. and no separate code needs to be written for them. These
languages are particularly developed with the viewpoint of solving a particular class of
problems.
▪ Fourth-generation languages are languages designed for specific applications like SQL
for database queries
Fifth Generation of PL
▪ The fifth-generation languages are also called 5GL. It is based on the concept of
artificial intelligence.
▪ It uses the concept that rather than solving a problem algorithmically, an application
can be built to solve it based on some constraints, i.e., we make computers learn to
solve any problem.
▪ The general use of 5GLs has not become a reality yet and is still in the research
phase. 5GLs are mostly used in artificial intelligence research.
High Level Language
▪ These are programmer-friendly languages that are manageable, easy to understand,
debug, and widely used in today’s times.
▪ These are very easy to execute.
▪ High-level languages require the use of a compiler or an interpreter for their
translation into machine code.
▪ These languages have a very low memory efficiency. It means that they consume more
memory than any low-level language.
▪ High-level languages are human-friendly. They are, thus, very easy to understand and
learn by any programmer.
Low Level Language
▪ These are machine-friendly languages that are very difficult to understand by human
beings but easy to interpret by machines.
▪ These are very difficult to execute.
▪ These languages have a very high memory efficiency. It means that they consume less
memory as compared to any high-level language
▪ Low-level languages are machine-friendly. They are, thus, very difficult to understand
and learn by any human
Advantages of High Level Language
▪ Easy to understand and debug
▪ Easy to execute
▪ Portable from any one device to another.
▪ High-level languages are human-friendly
Cousins of Compiler / Language Processing System
▪ In addition to a compiler, several other programs may be required to create an
executable target program
▪ Preprocessor
▪ A preprocessor is a tool that produces input for compilers
▪ A source program may be divided into modules stored in separate files. The task of collecting
the source program is sometimes entrusted to a separate program, called a preprocessor.
▪ File Inclusion: A preprocessor may also include header files into the program text like
<iostream>
▪ Macro Processing: The preprocessor may also expand shorthand called macros into source
language statements
▪ The modified source program is then fed to a compiler.
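Macro processing can be sketched in a few lines. This is a simplified illustration, not how a real C preprocessor is implemented: it handles only object-like "#define NAME value" macros and would also expand names inside strings and comments. The preprocess() function and the sample source are assumptions for this sketch.

```python
import re

def preprocess(text):
    """Expand object-like '#define NAME value' macros (simplified sketch)."""
    macros = {}
    output = []
    for line in text.splitlines():
        match = re.match(r"\s*#define\s+(\w+)\s+(.+)", line)
        if match:
            # Remember the macro; the directive itself is not passed on.
            macros[match.group(1)] = match.group(2).strip()
            continue
        for name, value in macros.items():
            # Replace whole-word occurrences of the macro name by its value.
            pattern = r"\b" + re.escape(name) + r"\b"
            line = re.sub(pattern, lambda _m, v=value: v, line)
        output.append(line)
    return "\n".join(output)

source = "#define LIMIT 100\nint total = LIMIT * 2;"
expanded = preprocess(source)
print(expanded)  # int total = 100 * 2;
```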
Cousins of Compiler / Language Processing System
▪ Compiler
▪ The compiler may produce an assembly-language program as its output, because assembly
language is easier to produce as output and easier to debug
▪ Assembler
▪ The assembly language is then processed by a program called an assembler that produces
relocatable machine code as its output
▪ Linker/Loader
▪ A linker is a tool that links the parts of a program together into a single executable
file. A loader then loads this executable file into memory for execution
Two Pass Compiler
▪ There are two parts to compilation:
1. Analysis phase
2. Synthesis phase
Analysis-Synthesis Model of Compilation
▪ The analysis part breaks up the source program into constituent pieces and creates an
intermediate representation of the source program
▪ The synthesis part constructs the desired target program from the intermediate
representation.
Analysis Model of Compilation
▪ The analysis part breaks up the source program into constituent pieces and imposes a
grammatical structure on them.
▪ It then uses this structure to create an intermediate representation of the source
program.
▪ If the analysis part detects that the source program is either syntactically ill-formed or
semantically unsound, then it must provide informative messages, so the user can take
corrective action.
▪ The analysis part also collects information about the source program and stores it in a
data structure called a symbol table
Synthesis Model of Compilation
▪ The synthesis part constructs the desired target program from the intermediate
representation and the information in the symbol table.
▪ The analysis part is often called the front end of the compiler; the synthesis part is the
back end.
The Structure of a Compiler (Lexical Analysis)
▪ The first phase of a compiler is called lexical analysis or scanning
▪ The lexical analyzer reads the stream of characters making up the source program
▪ It groups the characters into meaningful sequences called lexemes
▪ For each lexeme, the lexical analyzer produces as output a token of the form
<token-name, attribute-value>
that it passes on to the subsequent phase, syntax analysis
The Structure of a Compiler (Lexical Analysis)
▪ In the token, the first component token-name is an abstract symbol that is used during
syntax analysis, and the second component attribute-value points to an entry in the
symbol table for this token.
▪ Information from the symbol-table entry is needed for semantic analysis and code
generation
position = initial + rate * 60
▪ The characters in this assignment could be grouped into the following lexemes and
mapped into the following tokens passed on to the syntax analyzer
The Structure of a Compiler (Lexical Analysis)
position = initial + rate * 60
▪ position is a lexeme that would be mapped into a token <id, 1>, where id is an abstract
symbol standing for identifier and 1 points to the symbol-table entry for position
▪ The assignment symbol = is a lexeme that is mapped into the token <=>. Since this
token needs no attribute value, we have omitted the second component.
▪ initial is a lexeme that is mapped into the token <id, 2>, where 2 points to the symbol-
table entry for initial.
The Structure of a Compiler (Lexical Analysis)
position = initial + rate * 60
▪ + is a lexeme that is mapped into the token <+>
▪ rate is a lexeme that is mapped into the token <id, 3>, where 3 points to the symbol-
table entry for rate
▪ * is a lexeme that is mapped into the token <*>
▪ 60 is a lexeme that is mapped into the token <60>
▪ Blanks separating the lexemes would be discarded by the lexical analyzer.
<id,1> <=> <id,2> <+> <id,3> <*> <60>
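The grouping of characters into lexemes and tokens can be sketched as a small scanner. This is a minimal illustration under assumptions: the regular expression, the tokenize() and show() helpers, and the list-based symbol table are all hypothetical choices, and only identifiers, unsigned integers, and a few operators are recognised.

```python
import re

# One alternative per lexeme class; leading blanks are skipped and discarded.
TOKEN_PATTERN = re.compile(
    r"\s*(?:(?P<id>[A-Za-z_]\w*)|(?P<number>\d+)|(?P<op>[=+\-*/()]))"
)

def tokenize(text):
    """Return (tokens, symbol_table) for a single statement."""
    symbol_table = []  # entry i (1-based) holds the i-th distinct identifier
    tokens = []
    text = text.rstrip()
    pos = 0
    while pos < len(text):
        match = TOKEN_PATTERN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        pos = match.end()
        if match.group("id"):
            name = match.group("id")
            if name not in symbol_table:
                symbol_table.append(name)
            tokens.append(("id", symbol_table.index(name) + 1))
        elif match.group("number"):
            tokens.append((match.group("number"),))
        else:
            tokens.append((match.group("op"),))  # operators carry no attribute
    return tokens, symbol_table

def show(tokens):
    """Format tokens in the <token-name,attribute-value> notation."""
    return " ".join("<" + ",".join(str(p) for p in tok) + ">" for tok in tokens)

tokens, symbol_table = tokenize("position = initial + rate * 60")
print(show(tokens))   # <id,1> <=> <id,2> <+> <id,3> <*> <60>
print(symbol_table)   # ['position', 'initial', 'rate']
```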
The Structure of a Compiler (Syntax Analysis)
▪ The second phase of the compiler is syntax analysis or parsing
▪ The parser uses the first components of the tokens produced by the lexical analyzer to
create a tree-like intermediate representation that depicts the grammatical structure
of the token stream.
<id,1> <=> <id,2> <+> <id,3> <*> <60>
The Structure of a Compiler (Syntax Analysis)
<id,1> <=> <id,2> <+> <id,3> <*> <60>
Syntax tree
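One way to build such a tree from the token stream is a recursive-descent parser. The sketch below assumes a small grammar (given in the docstring) in which * and / bind more tightly than + and -; the Parser class and the nested-tuple tree format are illustrative choices, not a prescribed design.

```python
# Token stream of the running example, written as Python tuples.
TOKENS = [("id", 1), ("=",), ("id", 2), ("+",), ("id", 3), ("*",), ("60",)]

class Parser:
    """Recursive descent for:  assign -> id = expr
                               expr   -> term { (+|-) term }
                               term   -> factor { (*|/) factor }
                               factor -> id | number | ( expr )"""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos][0] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def expect(self, name):
        if self.peek() != name:
            raise SyntaxError(f"expected {name!r}, found {self.peek()!r}")
        return self.advance()

    def assign(self):
        target = self.expect("id")
        self.expect("=")
        tree = ("=", target, self.expr())
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return tree

    def expr(self):
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()[0]
            node = (op, node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()[0]
            node = (op, node, self.factor())
        return node

    def factor(self):
        kind = self.peek()
        if kind == "id":
            return self.advance()
        if kind is not None and kind.isdigit():
            return ("num", int(self.advance()[0]))
        if kind == "(":
            self.advance()
            node = self.expr()
            self.expect(")")
            return node
        raise SyntaxError(f"unexpected token {kind!r}")

tree = Parser(TOKENS).assign()
print(tree)
# ('=', ('id', 1), ('+', ('id', 2), ('*', ('id', 3), ('num', 60))))
```

Each interior node is an operator and its children are the operands, so the multiplication sits below the addition and is evaluated first.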
The Structure of a Compiler (Semantic Analysis)
▪ The semantic analysis phase checks the source program for semantic errors and gathers
type information for the code-generation phase
▪ An important part of semantic analysis is type checking
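Type checking on the running example can be sketched as a walk over the syntax tree. Two assumptions are made here that the slides do not state: all three identifiers are declared as floats, and the language allows an integer to be converted to a float implicitly. The check() function and the inttofloat node name are illustrative.

```python
# Assumption for this sketch: all three identifiers were declared as floats.
SYMBOL_TYPES = {1: "float", 2: "float", 3: "float"}

def check(node):
    """Return (typed_node, type), inserting inttofloat where int meets float."""
    kind = node[0]
    if kind == "id":
        return node, SYMBOL_TYPES[node[1]]
    if kind == "num":
        return node, "int"
    left, left_type = check(node[1])
    right, right_type = check(node[2])
    if kind == "=":
        if left_type == "int" and right_type == "float":
            # Assumed rule: narrowing is an error in this toy language.
            raise TypeError("cannot assign a float to an int variable")
        if left_type == "float" and right_type == "int":
            right = ("inttofloat", right)
        return (kind, left, right), left_type
    if left_type != right_type:
        # Mixed int/float arithmetic: widen the integer operand.
        if left_type == "int":
            left = ("inttofloat", left)
        else:
            right = ("inttofloat", right)
        return (kind, left, right), "float"
    return (kind, left, right), left_type

tree = ("=", ("id", 1), ("+", ("id", 2), ("*", ("id", 3), ("num", 60))))
typed_tree, result_type = check(tree)
print(typed_tree)
# ('=', ('id', 1), ('+', ('id', 2), ('*', ('id', 3), ('inttofloat', ('num', 60)))))
```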
The Structure of a Compiler (Intermediate Code Generator)
▪ After syntax and semantic analysis of the source program, many compilers
generate an explicit low-level or machine-like intermediate representation.
▪ This intermediate representation should have two important properties:
▪ it should be easy to produce
▪ it should be easy to translate into the target machine
The Structure of a Compiler (Intermediate Code Generator)
▪ A commonly used intermediate form is three-address code, which consists of a sequence of
assembly-like instructions with three operands per instruction.
▪ The output of the intermediate code generator for position = initial + rate * 60:
t1 = inttofloat(60)
t2 = id3 * t1
t3 = id2 + t2
id1 = t3
The Structure of a Compiler (Intermediate Code Generator)
▪ There are several points about three-address instructions:
▪ First, each three-address assignment instruction has at most one operator on
the right side.
▪ Second, the compiler must generate a temporary name to hold the value
computed by a three-address instruction.
▪ Third, some "three-address instructions" like the first and last in the
sequence, above, have fewer than three operands.
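These points can be seen by generating three-address code from a type-checked tree. The sketch below is one possible way to do it, assuming a nested-tuple tree format and the temporary names t1, t2, ...; generate() is a hypothetical helper.

```python
def generate(tree):
    """Flatten an assignment tree into three-address instructions."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"t{counter}"

    def emit(node):
        kind = node[0]
        if kind == "id":
            return f"id{node[1]}"
        if kind == "num":
            return str(node[1])
        if kind == "inttofloat":
            operand = emit(node[1])
            temp = new_temp()  # every computed value gets a temporary name
            code.append(f"{temp} = inttofloat({operand})")
            return temp
        left = emit(node[1])
        right = emit(node[2])
        temp = new_temp()
        code.append(f"{temp} = {left} {kind} {right}")  # one operator only
        return temp

    target = emit(tree[1])
    value = emit(tree[2])
    code.append(f"{target} = {value}")
    return code

typed_tree = ("=", ("id", 1),
              ("+", ("id", 2), ("*", ("id", 3), ("inttofloat", ("num", 60)))))
three_address = generate(typed_tree)
print("\n".join(three_address))
# t1 = inttofloat(60)
# t2 = id3 * t1
# t3 = id2 + t2
# id1 = t3
```

The first and last instructions have fewer than three operands, and each right-hand side contains at most one operator.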
The Structure of a Compiler (Code Optimization)
▪ The machine-independent code-optimization phase attempts to improve the
intermediate code so that better target code will result.
▪ Better means faster, but other objectives may be desired, such as shorter code, or
target code that consumes less power.
▪ The optimizer can deduce that the conversion of 60 from integer to floating point can
be done once and for all at compile time.
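The compile-time conversion of 60 can be sketched as a small constant-folding pass. The instruction format (dest, op, arg1, arg2) and the optimize() function are assumptions made for this illustration; a real optimizer performs many more transformations.

```python
# Three-address code as (dest, op, arg1, arg2) tuples; "copy" means dest = arg1.
code = [
    ("t1", "inttofloat", "60", None),
    ("t2", "*", "id3", "t1"),
    ("t3", "+", "id2", "t2"),
    ("id1", "copy", "t3", None),
]

def optimize(instructions):
    """Fold inttofloat of an integer literal into a float constant."""
    constants = {}  # temporary name -> constant computed at compile time
    result = []
    for dest, op, arg1, arg2 in instructions:
        # Replace temporaries that are known constants by their values.
        arg1 = constants.get(arg1, arg1)
        arg2 = constants.get(arg2, arg2)
        if op == "inttofloat" and arg1.isdigit():
            constants[dest] = str(float(int(arg1)))  # "60" becomes "60.0"
            continue  # the conversion no longer happens at run time
        result.append((dest, op, arg1, arg2))
    return result

optimized = optimize(code)
for instruction in optimized:
    print(instruction)
# ('t2', '*', 'id3', '60.0')
# ('t3', '+', 'id2', 't2')
# ('id1', 'copy', 't3', None)
```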
The Structure of a Compiler (Code Generation)
▪ The code generator takes as input an intermediate representation of the source
program and maps it into the target language.
▪ Then, the intermediate instructions are translated into sequences of machine
instructions that perform the same task.
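The mapping from intermediate instructions to machine instructions can be sketched for a made-up target. Everything about the target is an assumption: the mnemonics LDF, STF, ADDF, SUBF, MULF and DIVF, the registers R1, R2, ..., and the "#" prefix for constants are hypothetical. The sketch handles only binary operations and assumes each temporary is used once.

```python
MNEMONICS = {"+": "ADDF", "-": "SUBF", "*": "MULF", "/": "DIVF"}

def is_constant(name):
    return name.replace(".", "", 1).isdigit()

def generate_target(code):
    """Translate (dest, op, left, right) tuples into hypothetical assembly."""
    assembly = []
    location = {}  # temporary name -> register currently holding its value
    registers_used = 0

    def into_register(name):
        nonlocal registers_used
        if name in location:
            return location[name]
        registers_used += 1
        register = f"R{registers_used}"
        source = "#" + name if is_constant(name) else name
        assembly.append(f"LDF {register}, {source}")  # load from memory
        return register

    def as_operand(name):
        # Constants are used as immediate operands, marked with '#'.
        return "#" + name if is_constant(name) else into_register(name)

    for dest, op, left, right in code:
        register = into_register(left)
        right_operand = as_operand(right)
        assembly.append(f"{MNEMONICS[op]} {register}, {register}, {right_operand}")
        if dest.startswith("t"):
            location[dest] = register  # the temporary stays in its register
        else:
            assembly.append(f"STF {dest}, {register}")  # store to the variable
    return assembly

intermediate = [("t1", "*", "id3", "60.0"), ("id1", "+", "id2", "t1")]
target = generate_target(intermediate)
print("\n".join(target))
# LDF R1, id3
# MULF R1, R1, #60.0
# LDF R2, id2
# ADDF R2, R2, R1
# STF id1, R2
```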
Class Activity
a = (b+10) / (c-20)
a = a + b * c * d
x = (a + (b * c) ) / (a - (b * c) )
x = a + (b/c) - 75
