
Compiler Construction

Hafiza Aniqa
Compiler:
A compiler is a software tool that translates high-level programming language source code
into low-level machine code that can be executed directly by the computer's hardware.
 A compiler is a specific kind of translator: it translates information from one representation to another.
 Not every translator is a compiler; an application that converts, say, a Word file into a PDF is a translator but not a compiler.
Issues in compilation:
No algorithm exists for an ideal translation; translation is a complex process. To manage this
complexity, the translation is carried out in multiple phases.
Types of compilers:
Compilers can be divided into two main categories:
1. Single-pass compiler – the source code is processed in a single pass, meaning the compiler
reads the source code, performs the necessary analysis, and generates the target code in one go.
2. Multi-pass compiler – several intermediate representations are created and the parse tree
is processed several times. A multi-pass compiler breaks the program into smaller parts that
are handled in successive passes.
Types of multi-pass compiler:
A multi-pass compiler can be further divided into two categories:
1. Two-pass compiler
2. Three-pass compiler
Two-pass compiler:
In this type of compilation, the program is translated in two passes: first by the front end and
then by the back end.

Front end:
The algorithms employed in the front end have polynomial time complexity. The front end maps
legal source code into an intermediate representation (IR).
Phases of Front End:
The front end consists of the following phases:
 Lexical analysis
 Syntax analysis / parser
 Semantic analysis
 Intermediate code generator

Back End:
The core problems of the back end are NP-complete. The back end of the compiler translates the
intermediate representation (IR) into target machine code. It decides which values to keep in
registers in order to avoid memory accesses. It is also responsible for instruction selection,
to produce fast and compact code.
Steps on the Intermediate Representation:
Translating the intermediate representation into machine code involves the following steps:
1. Instruction selection
2. Register allocation
3. Instruction scheduling

Register Allocation:
Registers in the CPU play an important role in providing high-speed access to operands. The
number of registers in a CPU is small, and some of them are pre-allocated for specialized uses,
such as the program counter, and are not available to the back end. Optimal register allocation
is NP-complete.
Instruction Scheduling:
The back end performs instruction scheduling to avoid hardware stalls and interlocks. Optimal
scheduling is NP-complete in nearly all cases.
Phases of Back End:
The back end consists of the following phases:
 Code optimization
 Target code generator
Modules of Front End:
The front end determines whether a program presented to it is legal or illegal. It consists of two modules:
 Scanner
 Parser
Scanner:
The first phase of the front end is lexical analysis, also called scanning. It takes the program
(source code) as input and converts it into a sequence of tokens. This sequence of tokens is then
given to the parser as input.
Token:
A lexical token is a sequence of characters that can be treated as a unit in the grammar of a
programming language. We call the pair <token type, word> a token.
Types of tokens:
Keywords – e.g. for, if, while, void, etc.
Identifiers – e.g. variable names, function names, class names, etc.
Symbols – e.g. +, -, %, etc.
Non-Tokens:
Preprocessor directives, macros, comments, tabs, newlines, etc. are all non-tokens.
Token definition:
Tokens can be defined using regular languages because regular languages:
 are based on a simple and useful theory,
 are easy to understand,
 and have efficient implementations.
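As an illustration of these ideas, here is a minimal hand-written scanner sketch in C (the keyword list and the input string are assumptions for the example, not taken from any particular language). It partitions the input into words and classifies each one as one of the <token type, word> pairs described above:

    #include <ctype.h>
    #include <stdio.h>
    #include <string.h>

    /* Minimal hand-written scanner sketch (illustrative assumptions). */
    static const char *keywords[] = { "for", "if", "void", "while" };

    static int is_keyword(const char *w) {
        for (size_t i = 0; i < sizeof keywords / sizeof *keywords; i++)
            if (strcmp(w, keywords[i]) == 0) return 1;
        return 0;
    }

    int main(void) {
        const char *src = "for x1 = 42 + y";   /* assumed sample input */
        const char *p = src;
        char word[64];

        while (*p) {
            if (isspace((unsigned char)*p)) { p++; continue; }  /* non-token */
            if (isalpha((unsigned char)*p)) {          /* [a-zA-Z][a-zA-Z0-9]* */
                int n = 0;
                while (isalnum((unsigned char)*p)) word[n++] = *p++;
                word[n] = '\0';
                printf("<%s, %s>\n",
                       is_keyword(word) ? "keyword" : "identifier", word);
            } else if (isdigit((unsigned char)*p)) {   /* [0-9]+ */
                int n = 0;
                while (isdigit((unsigned char)*p)) word[n++] = *p++;
                word[n] = '\0';
                printf("<number, %s>\n", word);
            } else {                                   /* single-char symbol */
                printf("<symbol, %c>\n", *p++);
            }
        }
        return 0;
    }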
Parser:
The second phase of the front end is also known as syntax analysis. It takes the sequence of
tokens from the previous phase (lexical analysis) as input, recognizes the structure specified
by the context-free grammar, and converts it into an intermediate representation (IR). If there
are any errors, it also reports them.
Context-Free Grammar (CFG):
The syntax of most programming languages is specified using a context-free grammar. A context-free
grammar consists of the following:
 S – start symbol
 N – non-terminals
 T – terminals
 P – set of production rules
Terminals: A symbol that cannot be replaced by any other symbol is called a terminal (a constant).
Terminals are denoted by small letters (a, b, c).
Non-terminals: A symbol that must be replaced by other symbols is called a non-terminal (a
variable). Non-terminals are denoted by capital letters (X, S, Y).
Productions: The grammatical rules are often called productions.
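For example, in the grammar S → aSb | ε, the start symbol is S, the non-terminals are N = {S},
the terminals are T = {a, b}, and P consists of the two productions shown. This grammar generates
strings such as ab and aabb.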
Parse representation:
A parse can be represented by using:
 Parse tree
 Syntax tree
 Abstract Syntax tree
These representations help in understanding the structure of the source code.
Parse tree:
A parse tree is a hierarchical representation of the terminals and non-terminals. It is also
known as a derivation tree. A parse tree is created by a parser.
Example:
Tree for string = baab
S→AA
A→AA | bA |Ab | a
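One leftmost derivation of baab from this grammar, which the parse tree encodes, is:
S ⇒ AA ⇒ bAA ⇒ baA ⇒ baAb ⇒ baab
(applying A → bA, A → a, A → Ab, and A → a in turn).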

Syntax Tree:
A syntax tree is a tree in which each leaf node represents an operand and each interior node
represents an operator.
Example:
3*4+5
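With the usual precedence (multiplication binds tighter than addition), 3*4+5 groups as (3*4)+5,
so the tree has + at the root:

        +
       / \
      *   5
     / \
    3   4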

Abstract Syntax Tree:
The parse tree often contains a lot of unneeded information. Compilers often use an abstract
syntax tree (AST) to get rid of this unneeded information.
Three-pass compiler:
An intermediate stage is used for code improvement or optimization. This middle end is introduced
between the front end and the back end; it analyses the IR and rewrites (transforms) it. Its
primary goal is to reduce the running time of the compiled code. It is generally termed the
optimizer.

Lexical analysis:
The scanner is the first component of the front end and the parser is the second. The task of the
scanner is to take a program (source code), written in some language such as Java or C++, as a
stream of characters and break that stream into tokens. This activity is known as lexical
analysis. The lexical analyser partitions the input string into substrings called words and
classifies them according to their role.
Specifying tokens:
We do not know what kind of token we are going to read after seeing only its first character; for
example, a token that starts with i can be either an identifier or a keyword. Regular languages
are the most popular formalism for specifying tokens because:
 they are based on a simple and useful theory,
 are easy to understand,
 and have efficient implementations.
Language:
A language over an alphabet ∑ is a set of strings (finite sequences of characters) over ∑. For
lexical analysis we care about regular languages.
Regular Language:
A regular language is one way of defining a language: a language that can be expressed using a
regular expression is called a regular language. A regular expression represents a set of strings
in an algebraic fashion. The tokens we want to recognize are encoded using regular expressions.
If A is a regular expression, then L(A) refers to the language denoted by A.
Acceptor:
We need a mechanism to determine whether an input string w belongs to the language L(R) denoted
by a regular expression R. Such a mechanism is called an acceptor. The acceptor is based on a
finite automaton.
Finite automaton:
A finite automaton, also known as a finite state machine, is a computational model for describing
and designing a system with a finite number of states. A finite automaton has a very limited
amount of memory. It has the following characteristics:
 An input alphabet (∑)
 A set of states
 A start (initial) state
 A set of transitions
 A set of accepting (final) states

Types:
Finite automata are divided into two main categories, each with its own subcategories:
1. Finite automata without output
 Deterministic Finite Automaton (DFA)
 Non-deterministic Finite Automaton (NFA)
 ε-Finite Automaton (ε-NFA)
2. Finite automata with output
 Moore Machine
 Mealy Machine

Table encoding of a FA:
A FA can be encoded as a table called a transition table. Each row represents a state, and each
column represents a character of the alphabet. Each cell of the table holds the next state. This
encoding makes the implementation of a FA simple and efficient.
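As a sketch of this encoding, the following C program hard-codes the transition table of a small
DFA that accepts strings over {a, b} ending in "ab" (the automaton itself is an assumed example).
Rows are states, columns are input symbols, and each cell holds the next state:

    #include <stdio.h>

    /* Transition-table encoding of a DFA for strings ending in "ab". */
    enum { SYM_A = 0, SYM_B = 1 };

    static const int table[3][2] = {
        /*            a  b           */
        /* state 0 */ {1, 0},   /* nothing useful seen yet   */
        /* state 1 */ {1, 2},   /* last character was 'a'    */
        /* state 2 */ {1, 0},   /* last two chars were "ab"  */
    };
    static const int start = 0;
    static const int accepting = 2;

    static int run_dfa(const char *input) {
        int state = start;
        for (const char *p = input; *p; p++)
            state = table[state][*p == 'a' ? SYM_A : SYM_B];
        return state == accepting;
    }

    int main(void) {
        printf("abab -> %s\n", run_dfa("abab") ? "accept" : "reject");
        printf("abba -> %s\n", run_dfa("abba") ? "accept" : "reject");
        return 0;
    }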

Deterministic Finite Automaton (DFA):
In a Deterministic Finite Automaton (DFA), there is exactly one transition per input symbol per
state, and there are no ε-moves.

Non-deterministic Finite Automaton (NFA):
A Non-deterministic Finite Automaton (NFA) can have multiple transitions for one input symbol in
a given state. It can also have ε-moves.

Comparison of DFA and NFA:
NFAs and DFAs recognize the same set of languages (the regular languages). DFAs are easy to
implement. For a given language, the NFA is often simpler than the DFA, but the DFA can be
exponentially larger than the NFA.

NFA construction:
An NFA can be constructed from a regular expression using an algorithm called Thompson's
construction, which first appeared in CACM in 1968. The algorithm builds an NFA for each piece of
the RE and then combines the NFAs using ε-moves.
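For example, to build an NFA for ab|c, Thompson's construction builds trivial NFAs for a, b, and
c, joins the NFAs for a and b with an ε-move to form ab, and then adds a new start state and a
new accepting state connected by ε-moves to both alternatives to form ab|c.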

DFA construction:
An NFA is converted into a DFA using an algorithm called subset construction. In this technique,
each state of the DFA represents a set of states of the original NFA. The DFA uses its states to
keep track of all the possible states the NFA could be in after reading each input symbol.
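A minimal sketch of the underlying idea, assuming a small hand-coded NFA for (a|b)*ab: the C
program below tracks the set of NFA states as a bitmask while reading input. Subset construction
simply tabulates these same state sets ahead of time as DFA states:

    #include <stdio.h>

    /* Tiny hand-coded NFA for (a|b)*ab (an assumed example). A set of
       NFA states is a bitmask: bit s is set when state s is possible. */
    #define NSTATES 3
    enum { SYM_A = 0, SYM_B = 1 };

    /* nfa[state][symbol] = bitmask of successor states */
    static const unsigned nfa[NSTATES][2] = {
        /* state 0: loops on a and b; on a may also move to 1 */
        { (1u << 0) | (1u << 1), (1u << 0) },
        /* state 1: moves to 2 on b */
        { 0, 1u << 2 },
        /* state 2 (accepting): no moves */
        { 0, 0 },
    };

    /* All states reachable from any state in `set` on symbol `sym`. */
    static unsigned move(unsigned set, int sym) {
        unsigned out = 0;
        for (int s = 0; s < NSTATES; s++)
            if (set & (1u << s)) out |= nfa[s][sym];
        return out;
    }

    int main(void) {
        const char *input = "abab";
        unsigned cur = 1u << 0;              /* start with state set {0} */
        for (const char *p = input; *p; p++)
            cur = move(cur, *p == 'a' ? SYM_A : SYM_B);
        printf("\"%s\": %s\n", input,
               (cur & (1u << 2)) ? "accept" : "reject");
        return 0;
    }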
DFA minimization:
The generated DFA may have a large number of states, which can be reduced using Hopcroft's
algorithm. This algorithm groups equivalent states together to reduce the overall number of
states while preserving the language recognized by the DFA.
Lexical analyser generators:
Two popular lexical analyser generators are:
 Flex – generates lexical analysers in C and C++; it is the modern version of the original Lex
tool, which was part of the AT&T Bell Labs version of UNIX.
 JLex – written in Java; generates lexical analysers in Java.
Flex:
To use Flex, one provides a specification file as input. Flex reads this file and produces an
output file containing the lexical analyser code in C or C++.
The input specification file consists of three sections, separated by the symbol "%%":
1. C or C++ and Flex definitions
%%
2. Token definitions and actions
%%
3. User code
It is customary to use the ".l" extension for Flex input files, e.g. "flex.l".
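A minimal Flex specification following this layout might look like the sketch below (the
particular token set is an assumption for illustration). Running "flex flex.l" generates
lex.yy.c, which is then compiled with a C compiler:

    %{
    /* Section 1: C definitions copied verbatim into the generated scanner */
    #include <stdio.h>
    %}
    %%
    [0-9]+                  { printf("<number, %s>\n", yytext); }
    [a-zA-Z_][a-zA-Z0-9_]*  { printf("<identifier, %s>\n", yytext); }
    [ \t\n]+                { /* whitespace is a non-token: skip it */ }
    .                       { printf("<symbol, %s>\n", yytext); }
    %%
    /* Section 3: user code */
    int yywrap(void) { return 1; }
    int main(void)   { yylex(); return 0; }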
Semantic Analysis:
This is the third phase of the compiler, following lexical and syntax analysis. It checks whether
the declarations and statements in the program make sense and are logically correct, i.e. that
the code is meaningful according to the rules of the programming language. It catches logical
errors that cannot be caught by syntax analysis alone.
Using Symbol Table:
Semantic analysis uses a symbol table, which is like a dictionary that keeps track of all the variables,
functions, and other elements in the program. This helps in checking the consistency and correctness of the
code.
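A minimal symbol-table sketch in C (the entry layout and function names are illustrative
assumptions, not from the notes): declarations are recorded, and later uses are looked up to
check that every name was declared:

    #include <stdio.h>
    #include <string.h>

    /* Each entry records a name and its declared type, so later uses
       can be checked against the declaration. */
    struct symbol {
        char name[32];
        char type[16];   /* e.g. "int", "float" */
    };

    static struct symbol table[100];
    static int count = 0;

    /* Returns the entry for `name`, or NULL if it was never declared. */
    static struct symbol *lookup(const char *name) {
        for (int i = 0; i < count; i++)
            if (strcmp(table[i].name, name) == 0) return &table[i];
        return NULL;
    }

    static void declare(const char *name, const char *type) {
        strcpy(table[count].name, name);
        strcpy(table[count].type, type);
        count++;
    }

    int main(void) {
        declare("x", "int");
        /* Using an undeclared variable is a semantic error: */
        printf("x declared: %s\n", lookup("x") ? "yes" : "no");
        printf("y declared: %s\n", lookup("y") ? "yes" : "no");
        return 0;
    }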
Error Detection: If the code contains semantic errors, such as using a variable before it's declared or
assigning the wrong type of value to a variable, semantic analysis detects these errors and reports them.
Ensuring Consistency: The goal of semantic analysis is to ensure that the code is not only syntactically
correct but also logically consistent and meaningful.
Parser:
Parsing is the second phase of the front end. The parser checks the tokens (the stream of words)
and their parts of speech for grammatical correctness. A scanner based on regular expressions
cannot detect syntax errors. Not every sequence of tokens is a program, so the parser must
distinguish between valid and invalid token sequences.
Parsing:
Parsing is the process of discovering a derivation for some sentence of a language. The
mathematical model of syntax is a grammar G. The syntax of most programming languages is
represented by a context-free grammar (CFG). The syntax of C/C++ and Java is heavily derived
from Algol-60.
Derivation:
A derivation is a sequence of production-rule applications used to obtain the input string.
During parsing we make two decisions for an input string:
 deciding which non-terminal to replace, and
 deciding which production rule to replace the non-terminal by.
Types of derivation:
On the basis of the decisions made for an input string, there are two types of derivation:
 Leftmost derivation – replacing the leftmost non-terminal at each step.
 Rightmost derivation – replacing the rightmost non-terminal at each step.
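For example, with the grammar S → AB, A → a, B → b, a leftmost derivation of ab is
S ⇒ AB ⇒ aB ⇒ ab, while the rightmost derivation is S ⇒ AB ⇒ Ab ⇒ ab.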
Parse tree:
A derivation can be represented in a tree-like fashion called a parse tree. In a parse tree:
 All leaf nodes are terminals.
 All interior nodes are non-terminals.
 An in-order traversal gives the original input string.
Precedence:
If two different operators share a common operand, the precedence of the operators decides which
one takes the operand. To add precedence to a grammar, create a non-terminal for each level of
precedence.
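For example, the usual expression grammar E → E + T | T, T → T * F | F, F → id uses one
non-terminal per precedence level, so * binds more tightly than +.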
Ambiguous grammar:
A grammar G is said to be ambiguous if it has more than one parse tree (equivalently, more than
one leftmost or rightmost derivation) for at least one string.
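For example, the grammar E → E + E | E * E | id is ambiguous: the string id + id * id has two
parse trees, one grouping it as (id + id) * id and the other as id + (id * id).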
Parsing techniques:
Parsing techniques are classified into top-down parsing and bottom-up parsing, described below.
Top-down parsing:
A top-down parser builds the parse tree starting at the root and growing towards the leaves. At
each node, the parser picks a production rule and tries to match it against the input string. In
simple terms, it starts from the start symbol of the grammar and works towards deriving the input
string.
Types:
Top-down parsing is further divided into two categories:
 Recursive descent parsing (see the sketch below)
 Predictive parsing (LL(1))
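Here is a recursive-descent parser sketch in C for a tiny expression grammar (the grammar is an
assumed example): E → T { + T }, T → F { * F }, F → digit | ( E ). There is one function per
non-terminal, and each function matches its productions against the input while evaluating the
expression:

    #include <ctype.h>
    #include <stdio.h>
    #include <stdlib.h>

    static const char *p;          /* current position in the input */

    static int parse_expr(void);   /* forward declaration: F uses E */

    static void fail(const char *msg) {
        fprintf(stderr, "syntax error: %s at '%s'\n", msg, p);
        exit(1);
    }

    /* F -> digit | '(' E ')' */
    static int parse_factor(void) {
        if (isdigit((unsigned char)*p)) return *p++ - '0';
        if (*p == '(') {
            p++;
            int v = parse_expr();
            if (*p++ != ')') fail("expected ')'");
            return v;
        }
        fail("expected digit or '('");
        return 0;
    }

    /* T -> F { '*' F } */
    static int parse_term(void) {
        int v = parse_factor();
        while (*p == '*') { p++; v *= parse_factor(); }
        return v;
    }

    /* E -> T { '+' T } */
    static int parse_expr(void) {
        int v = parse_term();
        while (*p == '+') { p++; v += parse_term(); }
        return v;
    }

    int main(void) {
        p = "3*4+5";
        int v = parse_expr();
        if (*p != '\0') fail("trailing input");
        printf("value = %d\n", v);   /* prints 17: '*' binds tighter */
        return 0;
    }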
Bottom-up parsing:
Bottom-up parsing starts at the leaf nodes and grows towards the root of the parse tree. It
handles a large class of grammars. It is also known as shift-reduce parsing: the process involves
shifting input symbols onto a stack and then reducing them according to the grammar rules, hence
the name "shift-reduce".
Types:
Bottom-up parsing can be further divided into the following subcategories:
 Operator precedence parsing
 LR parsing
Final term:
Parsing:
Parsing is the second phase of the front end in a two-pass compiler. It takes the sequence of
tokens from the previous phase, i.e. lexical analysis. Parsing is the process of deriving a
string from a given grammar. It is also known as syntax analysis.
Context-free grammar:
The parser uses the CFG to check whether a given string belongs to a particular grammar or not.
Types of parsers:
A parser can be one of the following types:
 Top-down parser
 Bottom-up parser
These can be further divided as described below.
Top-down parsing:
A top-down parser builds the parse tree starting at the root and growing towards the leaves. At
each node, the parser picks a production rule and tries to match it against the input string. In
simple terms, it starts from the start symbol of the grammar and works towards deriving the input
string.
Types:
Top-down parsing is further divided into two categories:
 Recursive descent parsing
 Predictive parsing (LL(1))
LL(1):
The LL(1) parsing technique uses the FIRST and FOLLOW sets of the grammar symbols.
First:
FIRST(A) contains all the terminals that can appear in the first position of some string derived
from A.
Note:
 FIRST(terminal) = terminal
 FIRST(ε) = ε
Capital letters denote non-terminals; small letters denote terminals.
Follow:
FOLLOW(A) contains the set of all terminals that can appear immediately to the right of A in
some derivation.
Note:
The FOLLOW set of the start symbol contains $.
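For example, for the grammar S → AB, A → a | ε, B → b:
FIRST(A) = {a, ε}, FIRST(B) = {b}, FIRST(S) = {a, b} (since A can derive ε),
FOLLOW(S) = {$}, FOLLOW(A) = {b}, and FOLLOW(B) = {$}.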
Bottom-up parsing:
Bottom-up parsing starts at the leaf nodes and grows towards the root of the parse tree. It
handles a large class of grammars. It is also known as shift-reduce parsing: the process involves
shifting input symbols onto a stack and then reducing them according to the grammar rules, hence
the name "shift-reduce".
Types of bottom-up parsing:
 Operator precedence parsing
 LR parsing
