Compiler Lec-One

Compiler

Uploaded by

mihretabdesta10
Chapter One

Introduction
Wachemo University (Durame Campus)
College of Engineering and Technology
Department of Computer Science
Mr. Abraham Wolde (2025)

OUTLINES
• Phases of a Compiler
• Computer Language Representation
• Compiler Construction Tools

Introduction to Compiler
Language Translator
• A translator is a program that takes an input program written in one language and produces an output program in another language.
• Besides translating programs, a translator performs another very important role: error detection.
• The important roles of a translator are:
   • Translating the high-level language input program into an equivalent machine language program.
   • Reporting error messages wherever the programmer violates the specification of the high-level language.
Types of Translators
1) Interpreter
2) Compiler
3) Assembler
What is a Compiler?
• A compiler is a program that reads a program written in one language (the source language) and translates it into an equivalent program in another language (the target language) without changing the meaning of the program.
• This process is called compilation.
• The compiler also checks for errors while converting from a high-level language (HLL) to machine language (ML).
• Compiler design covers the basic translation mechanism as well as error detection and recovery.
• A compiler can read a program written in a high-level language such as C, C++, or Java (the source language) and translate it into a low-level language (the target language, i.e. a machine program).
• It comprises lexical, syntax, and semantic analysis as the front end, and code optimization and code generation as the back end.
• Source Program:
   • It is normally a program written in a high-level programming language.
   • It is built from the set of rules, symbols, and special words that the language provides for constructing a computer program.
• Target Program:
   • It is normally the equivalent program in machine code.
   • It contains the binary representation of the instructions that the computer hardware can perform.
• Error Message:
   • A message issued by the compiler when it detects errors, such as syntax errors, in the source program.
Parts of Compilation
• There are two parts to compilation:
1) Analysis
2) Synthesis
• The analysis part breaks up the source program into its constituent pieces and creates an intermediate representation of the source program.
• The synthesis part constructs the desired target program from the intermediate representation. Of the two parts, synthesis requires the most specialized techniques.
• During analysis, the operations implied by the source program are determined and recorded in a hierarchical structure called a tree.
• Often, a special kind of tree called a syntax tree is used, in which each node represents an operation and the children of a node represent the arguments of the operation.
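As a rough illustration of the syntax tree idea, the Python sketch below stores an operation at each interior node and its arguments as children, then evaluates the tree bottom-up. The class names `Num` and `BinOp` and the function `evaluate` are invented for this example and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Num:
    value: float          # a leaf: a plain number


@dataclass
class BinOp:
    op: str               # the operation stored at this node
    left: "Node"          # first argument of the operation
    right: "Node"         # second argument of the operation


Node = Union[Num, BinOp]


def evaluate(node: Node) -> float:
    """Evaluate the children first, then apply the node's operation."""
    if isinstance(node, Num):
        return node.value
    left = evaluate(node.left)
    right = evaluate(node.right)
    if node.op == "+":
        return left + right
    if node.op == "*":
        return left * right
    raise ValueError(f"unknown operator {node.op!r}")


# Tree for 2 + 3 * 4: '*' binds tighter, so it sits below '+'.
tree = BinOp("+", Num(2), BinOp("*", Num(3), Num(4)))
print(evaluate(tree))  # 14
```

The shape of the tree, not parentheses in the text, is what records that the multiplication happens before the addition.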

• Software tools that manipulate source programs first perform some kind of analysis.
• Some examples of such tools are:
1) Structure editors
2) Pretty printers
3) Static checkers
4) Interpreters
Structure Editor
• A structure editor takes as input a sequence of commands to build a source program. Example: NetBeans IDE.
• The structure editor not only performs the text-creation and modification functions of an ordinary text editor, but also analyzes the program text, putting an appropriate hierarchical structure on the source program.
Pretty Printers
• A pretty printer analyzes a program and prints it in such a way that the structure of the program becomes clearly visible.
• For example, comments may appear in a special font, and statements may appear with an amount of indentation proportional to the depth of their nesting in the hierarchical organization of the statements.

Static Checkers
• A static checker reads a program, analyzes it, and attempts to discover potential bugs without running the program.
• For example, a static checker may detect that parts of the source program can never be executed. It can also catch logical errors such as trying to use a real variable as a pointer.
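A minimal sketch of one such check, written with Python's standard `ast` module: it reports statements that directly follow a `return`, `raise`, `break`, or `continue` in the same block and therefore can never run. The function name `find_unreachable` is made up for this example, and real static checkers perform far more thorough analyses.

```python
import ast


def find_unreachable(source: str) -> list[int]:
    """Return line numbers of statements that directly follow a return,
    raise, break, or continue inside the same block."""
    unreachable = []
    for node in ast.walk(ast.parse(source)):
        for field in ("body", "orelse", "finalbody"):
            block = getattr(node, field, None)
            if not isinstance(block, list):
                continue  # e.g. the body of a lambda is an expression
            for stmt, following in zip(block, block[1:]):
                if isinstance(stmt, (ast.Return, ast.Raise, ast.Break, ast.Continue)):
                    unreachable.append(following.lineno)
                    break  # report only the first dead statement per block
    return sorted(unreachable)


sample = """\
def f(x):
    return x + 1
    print("never runs")
"""
print(find_unreachable(sample))  # [3]
```

The program under inspection is only parsed, never executed, which is what makes the check static.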
Interpreters
• An interpreter performs the operations implied by the source program instead of producing a target program.
• Interpreters are frequently used to execute command languages, since each operator executed in a command language is usually an invocation of a complex routine such as an editor or compiler.

Analysis of the Source Program
• Analysis consists of three parts:
1) Linear Analysis: also called lexical analysis or scanning. The stream of characters making up the source program is read from left to right and grouped into tokens, which are sequences of characters having a collective meaning.
2) Hierarchical Analysis: also called syntax analysis or parsing. The characters or tokens are grouped hierarchically into nested collections with collective meaning.
3) Semantic Analysis: certain checks are performed to ensure that the components of a program fit together meaningfully; i.e. it checks the source program for semantic errors and gathers type information for the subsequent code generation phase.
The Phases of a Compiler
• Conceptually, a compiler operates in phases, each of which transforms the source program from one representation to another.
• Each phase takes its input from the previous phase, has its own representation of the source program, and feeds its output to the next phase of the compiler.

A compiler has six phases:

• Lexical Analyzer
• Syntax Analyzer
• Semantic Analyzer
• Intermediate Code Generator
• Code Optimizer
• Target Code Generator

• The first three phases form the bulk of the analysis portion of a compiler.
• Two other activities, symbol table management and error handling, interact with all six phases of the compiler.
• Each phase transforms the source program from one representation into another.

Symbol Table Management
• An essential function of a compiler is to record the identifiers used in the source program and collect information about various attributes of each identifier.
• A symbol table is a data structure containing a record for each identifier, with fields for the attributes of the identifier.
• The data structure allows us to find the record for each identifier quickly and to store or retrieve data from that record quickly.
• When an identifier in the source program is detected by the lexical analyzer, the identifier is entered into the symbol table.
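One simple way to get the fast lookup described above is a hash table keyed by identifier name. The sketch below uses a Python dictionary for that purpose; the names `SymbolRecord`, `SymbolTable`, `enter`, and `lookup` are assumptions made for this example, and production compilers usually add scoping on top of this.

```python
from dataclasses import dataclass, field


@dataclass
class SymbolRecord:
    name: str
    index: int                                      # position used in tokens such as <id, 1>
    attributes: dict = field(default_factory=dict)  # e.g. type, filled in by later phases


class SymbolTable:
    def __init__(self):
        self._records = {}                          # hash table: name -> SymbolRecord

    def enter(self, name: str) -> SymbolRecord:
        """Insert the identifier if it is new; return its record either way."""
        if name not in self._records:
            self._records[name] = SymbolRecord(name, len(self._records) + 1)
        return self._records[name]

    def lookup(self, name: str):
        """Return the record for name, or None if it was never entered."""
        return self._records.get(name)


table = SymbolTable()
for ident in ["position", "initial", "rate", "initial"]:
    table.enter(ident)                              # the repeated 'initial' reuses its record
table.lookup("rate").attributes["type"] = "real"    # an attribute added by a later phase
print(table.lookup("rate"))
```

Entering the same identifier twice returns the existing record, so every occurrence of a name shares one set of attributes.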

Error Detection and Reporting
• Each phase can encounter errors. After detecting an error, a phase must deal with it so that compilation can proceed, allowing further errors in the source program to be detected.
• The lexical phase can detect errors where the characters remaining in the input do not form any token of the language.
• Errors where the token stream violates the structure rules of the language are detected by the syntax analysis phase.
• During semantic analysis, the compiler tries to detect constructs that have the right syntactic structure but no meaning to the operation involved.
The Analysis Phases of the Compiler
• Lexical Analysis
• The first phase of compilation works as a text scanner.
• It reads the source code as a stream of characters and groups these characters into meaningful sequences called lexemes.
• The lexical analyzer then represents these lexemes in the form of tokens.
• The lexical analyzer, or scanner, reads the source program one character at a time, carving it into a sequence of atomic units called tokens.
<token-name, attribute-value> where
   • token-name is an abstract name that is used during syntax analysis.
   • attribute-value is a value that points to an entry in the symbol table.
• In a compiler, linear analysis is called lexical analysis or scanning.
• For example, in lexical analysis the characters in the assignment statement
position = initial + rate * 60
• would be grouped into the following tokens:
1) The identifier position is a lexeme that is mapped into the token <id, 1>
2) The assignment symbol = is a lexeme that is mapped into the token <=>
3) The identifier initial is a lexeme that is mapped into the token <id, 2>
4) The plus sign + is a lexeme that is mapped into the token <+>
5) The identifier rate is a lexeme that is mapped into the token <id, 3>
6) The multiplication sign * is a lexeme that is mapped into the token <*>
7) The number 60 is a lexeme that is mapped into the token <60>
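The grouping above can be sketched with Python's `re` module. This is an illustrative toy scanner, not a production lexer: the names `TOKEN_SPEC` and `tokenize` are invented here, and only identifiers, integers, and the operators `=`, `+`, `*` are recognized. Identifiers are numbered in order of first appearance, mimicking the symbol-table indices in the list.

```python
import re

# (token kind, regular expression) pairs; order matters for alternation.
TOKEN_SPEC = [
    ("number", r"\d+"),
    ("id",     r"[A-Za-z_]\w*"),
    ("op",     r"[=+*]"),
    ("skip",   r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{kind}>{pattern})" for kind, pattern in TOKEN_SPEC))


def tokenize(text):
    """Scan text left to right and return (tokens, symbols)."""
    symbols = {}   # identifier -> symbol-table index
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            # The remaining characters do not form any token: a lexical error.
            raise SyntaxError(f"lexical error at column {pos}: {text[pos]!r}")
        kind, lexeme = m.lastgroup, m.group()
        if kind == "id":
            index = symbols.setdefault(lexeme, len(symbols) + 1)
            tokens.append(f"<id, {index}>")
        elif kind in ("number", "op"):
            tokens.append(f"<{lexeme}>")
        pos = m.end()   # whitespace ('skip') is consumed without emitting a token
    return tokens, symbols


tokens, symbols = tokenize("position = initial + rate * 60")
print(" ".join(tokens))  # <id, 1> <=> <id, 2> <+> <id, 3> <*> <60>
```

Feeding it a character outside the specification, such as `@`, raises the lexical error, since no token pattern matches at that position.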

Syntax Analysis
• Syntax analysis is also called parsing.
• It is the second phase of the compiler. It takes the tokens produced by lexical analysis as input and generates a parse tree (or syntax tree).
• The parser checks the arrangement of tokens against the grammar of the source language, i.e. it checks whether the expression formed by the tokens is syntactically correct.
• The syntax tree describes the grammatical structure of the token stream.
   • Each interior node represents an operator.
   • The children of the node represent the arguments of the operation.
• Syntax analysis is aided by techniques based on the formal grammar of the programming language.
Parse tree for position = initial + rate * 60
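The tree for this statement can be produced by a small recursive-descent parser. The sketch below assumes a tiny grammar chosen for this example (assignment, `+`, `*`, identifiers, and integers only) and returns the tree as nested tuples of the form `(operator, left, right)`. The function name `parse_assignment` is hypothetical, and the simple `re.findall` scanner silently ignores characters it does not recognize.

```python
import re


def parse_assignment(text):
    """Parse 'id = expr' where expr uses + and *, with * binding tighter."""
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|[=+*]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def factor():                      # factor -> id | number
        tok = advance()
        if tok is None or tok in ("=", "+", "*"):
            raise SyntaxError(f"expected identifier or number, got {tok!r}")
        return tok

    def term():                        # term -> factor ('*' factor)*
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def expr():                        # expr -> term ('+' term)*
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    target = advance()
    if target is None or not (target[0].isalpha() or target[0] == "_"):
        raise SyntaxError("an assignment must start with an identifier")
    if advance() != "=":
        raise SyntaxError("expected '=' after the identifier")
    tree = ("=", target, expr())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


print(parse_assignment("position = initial + rate * 60"))
# ('=', 'position', ('+', 'initial', ('*', 'rate', '60')))
```

Because `term` is called from inside `expr`, the multiplication ends up deeper in the tree than the addition, which is how the grammar encodes operator precedence.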

Semantic Analysis
• Semantic analysis checks whether the constructed parse tree follows the rules of the language.
• For example, it checks that values are assigned between compatible data types, and it reports errors such as adding a string to an integer.
• The semantic analyzer also keeps track of identifiers, their types, and expressions, and checks whether identifiers are declared before use.
• The semantic analyzer produces an annotated syntax tree as its output.
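A minimal sketch of such a check on a tuple-based syntax tree: it looks up each identifier's declared type, rejects undeclared names and incompatible operands, and annotates the tree with a conversion where an integer meets a real. The function name `annotate`, the type names, and the `inttoreal` marker are assumptions made for this illustration.

```python
def annotate(node, types):
    """Return (annotated_tree, type). 'types' maps declared identifiers to their type."""
    if isinstance(node, str):                      # a leaf: number or identifier
        if node.isdigit():
            return node, "int"
        if node not in types:
            raise NameError(f"identifier {node!r} used before declaration")
        return node, types[node]

    op, left, right = node
    left, ltype = annotate(left, types)
    right, rtype = annotate(right, types)

    # Widen an int operand that meets a real one by inserting a conversion node.
    if ltype == "real" and rtype == "int":
        right, rtype = ("inttoreal", right), "real"
    elif ltype == "int" and rtype == "real" and op != "=":
        left, ltype = ("inttoreal", left), "real"

    if ltype != rtype:
        raise TypeError(f"operator {op!r} applied to {ltype} and {rtype}")
    return (op, left, right), ltype


declared = {"position": "real", "initial": "real", "rate": "real"}
tree = ("=", "position", ("+", "initial", ("*", "rate", "60")))
annotated, result_type = annotate(tree, declared)
print(annotated)
# ('=', 'position', ('+', 'initial', ('*', 'rate', ('inttoreal', '60'))))
```

With `{"n": "int", "s": "string"}` as the declarations, the tree `("+", "n", "s")` raises a `TypeError`, which corresponds to the "adding a string to an integer" case.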
Intermediate Code Generation
• After syntax and semantic analysis, some compilers generate an explicit intermediate representation of the source program (a low-level, machine-like form of the source code).
• This intermediate representation is a step toward the final machine language code.
• This phase bridges the analysis and synthesis phases of translation.
• The intermediate representation should have two important properties:
   • It should be easy to produce, and
   • It should be easy to translate into the target program (machine code).
• The intermediate representation can take a variety of forms. One of them is called "three-address code", which is like the assembly language for a machine in which every memory location can act like a register.
• Three-address code consists of a sequence of instructions, each of which has at most three operands.

Three-address code for the statement position := initial + rate * 60:
[Figure: three-address code listing]
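One plausible way to produce three-address code for this statement is a post-order walk of the annotated syntax tree that emits one instruction per interior node. In the sketch below, the temporary names `temp1`, `temp2`, ..., the `inttoreal` conversion, the use of `id1`, `id2`, `id3` for position, initial, and rate, and the function name `three_address` are all assumptions made for illustration.

```python
from itertools import count


def three_address(tree):
    """Translate ('=', target, expr) into a list of three-address instructions."""
    code = []
    temps = count(1)

    def emit(node):
        """Return the name that holds node's value, appending instructions as needed."""
        if isinstance(node, str):                  # identifier or constant: nothing to emit
            return node
        if node[0] == "inttoreal":                 # unary conversion node
            arg = emit(node[1])
            temp = f"temp{next(temps)}"
            code.append(f"{temp} := inttoreal({arg})")
            return temp
        op, left, right = node                     # binary operator node
        left_name = emit(left)
        right_name = emit(right)
        temp = f"temp{next(temps)}"
        code.append(f"{temp} := {left_name} {op} {right_name}")
        return temp

    _, target, value = tree
    result = emit(value)
    code.append(f"{target} := {result}")
    return code


tree = ("=", "id1", ("+", "id2", ("*", "id3", ("inttoreal", "60"))))
for line in three_address(tree):
    print(line)
# temp1 := inttoreal(60)
# temp2 := id3 * temp1
# temp3 := id2 + temp2
# id1 := temp3
```

Each instruction has at most one operator and at most three addresses, and the temporaries record the order in which the operations must be carried out.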

Code Optimization
• This phase takes the intermediate code as input and produces optimized intermediate code as output.
• Optimization removes unnecessary lines of code and rearranges the sequence of statements to speed up program execution without wasting resources.
• The code optimization phase attempts to improve the intermediate code so that the resulting machine code runs faster and takes less space.
• There is a better way to perform the same calculation than the three-address code above:
[Figure: optimized three-address code]
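Two improvements are possible for code of this shape: converting the integer constant to a real once at compile time, and dropping a temporary that is only copied into the final variable. The sketch below applies both to instructions stored as `(dest, op, arg1, arg2)` tuples. The instruction format, the `copy` and `inttoreal` operation names, and the function `optimize` are assumptions for this example, and the copy elimination assumes each temporary is used only once.

```python
def optimize(code):
    """Fold inttoreal(constant) at compile time, then remove a trailing copy."""
    constants = {}
    folded = []
    for dest, op, a, b in code:
        a = constants.get(a, a)                    # substitute already-known constants
        b = constants.get(b, b)
        if op == "inttoreal" and a.isdigit():
            constants[dest] = str(float(a))        # 60 -> 60.0, done once by the compiler
            continue                               # the conversion instruction disappears
        folded.append((dest, op, a, b))

    result = []
    for dest, op, a, b in folded:
        # 'x := temp' right after 'temp := ...' becomes 'x := ...'
        if op == "copy" and result and result[-1][0] == a and a.startswith("temp"):
            previous = result.pop()
            result.append((dest,) + previous[1:])
        else:
            result.append((dest, op, a, b))
    return result


unoptimized = [
    ("temp1", "inttoreal", "60", None),
    ("temp2", "*", "id3", "temp1"),
    ("temp3", "+", "id2", "temp2"),
    ("id1", "copy", "temp3", None),
]
for instruction in optimize(unoptimized):
    print(instruction)
# ('temp2', '*', 'id3', '60.0')
# ('id1', '+', 'id2', 'temp2')
```

Four instructions shrink to two. The sketch does not renumber the surviving temporary, so it keeps the name `temp2`.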

Code Generation
• The final phase of the compiler is the generation of target code, normally consisting of relocatable machine code or assembly code.
• Memory locations are selected for each of the variables used by the program.
• Then each intermediate instruction is translated into a sequence of machine instructions that perform the same task.
• The code generator takes the optimized representation of the intermediate code and maps it to the target machine language.
• The code generator translates the intermediate code into a sequence of relocatable machine code.
• This sequence of machine instructions performs the same task as the intermediate code.
• The intermediate code instructions are translated into a sequence of machine instructions.
• The first and second operands of each instruction specify a source and a destination, respectively.
• The F in each instruction tells us that the instruction deals with floating-point numbers.
• The # signifies that 60.0 is to be treated as an immediate constant.
• This code moves the contents of the address id3 into register 2, then multiplies it by the real constant 60.0.
• The third instruction moves id2 into register 1, and the fourth adds to it the value previously computed in register 2. Finally, the value in register 1 is moved into the address of id1.
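A sketch of how such a sequence could be generated from optimized three-address code is given below. The mnemonics `MOVF`, `MULF`, and `ADDF`, the register names `R1` and `R2`, the tuple instruction format, and the function names are assumptions for illustration. Registers are handed out from the highest number down purely so that the output matches the description above, and the sketch never frees or spills registers.

```python
OPCODES = {"*": "MULF", "+": "ADDF"}               # assumed floating-point mnemonics


def operand(name, location):
    """Render an operand: a register, an immediate constant, or a memory address."""
    if name in location:
        return location[name]                      # a temporary that lives in a register
    if name[0].isdigit():
        return "#" + name                          # '#' marks an immediate constant
    return name                                    # address of a variable such as id3


def generate(code, num_registers=2):
    """Translate (dest, op, arg1, arg2) instructions into two-operand machine code."""
    free = [f"R{i}" for i in range(1, num_registers + 1)]   # pop() takes the highest first
    location = {}                                  # temporary name -> register holding it
    asm = []
    for dest, op, a, b in code:
        reg = free.pop()
        asm.append(f"MOVF {operand(a, location)}, {reg}")          # load the first argument
        asm.append(f"{OPCODES[op]} {operand(b, location)}, {reg}") # combine with the second
        if dest.startswith("temp"):
            location[dest] = reg                   # keep the temporary in its register
        else:
            asm.append(f"MOVF {reg}, {dest}")      # store the result to memory
    return asm


optimized = [
    ("temp1", "*", "id3", "60.0"),
    ("id1", "+", "id2", "temp1"),
]
print("\n".join(generate(optimized)))
# MOVF id3, R2
# MULF #60.0, R2
# MOVF id2, R1
# ADDF R2, R1
# MOVF R1, id1
```

In every instruction the first operand is the source and the second is the destination, matching the convention stated in the bullets.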
Language Processing System
• Any computer system is made up of hardware and software.
• The hardware understands a language that humans cannot easily understand.
• So programs are written in a high-level language, which is easier for humans to understand and remember.
• These programs are then fed into a series of tools and OS components to obtain code that the machine can use.
• This is known as a language processing system.
• The high-level language is converted into binary language in several phases.
• Before going into the details of compilers, we should understand a few other tools that work closely with compilers.
Preprocessor
• A preprocessor, generally considered a part of the compiler, is a tool that produces input for compilers.
• It deals with macro processing, augmentation, file inclusion, language extension, etc.
• A preprocessor may perform the following functions:
   • Macro Processing: A preprocessor may allow a user to define macros that are shorthand for longer constructs.

   • File Inclusion: A preprocessor may include header files into the program text. For example, the C preprocessor causes the contents of the file <global.h> to replace the statement #include <global.h> when it processes a file containing this statement.
   • Rational Preprocessors: These preprocessors augment older languages with more modern flow-of-control and data-structuring facilities. For example, such a preprocessor might provide the user with built-in macros for constructs like while-statements or if-statements, where none exist in the programming language itself.
   • Language Extensions: These preprocessors attempt to add capabilities to the language by what amounts to built-in macros. For example, the language Equel is a database query language embedded in C.
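Macro processing can be illustrated with a toy preprocessor that handles only object-like `#define NAME body` lines. This is a simplified sketch, not the behavior of the real C preprocessor: it does not support macro arguments, nested expansion, or file inclusion, and the function name `preprocess` is made up for this example.

```python
import re


def preprocess(source: str) -> str:
    """Expand '#define NAME body' macros and drop the directive lines."""
    macros = {}
    output = []
    for line in source.splitlines():
        directive = re.match(r"\s*#define\s+(\w+)\s+(.*)", line)
        if directive:
            macros[directive.group(1)] = directive.group(2).strip()
            continue                               # the directive itself produces no output
        for name, body in macros.items():
            # \b restricts the match to whole words, so MAX does not touch MAXIMUM.
            line = re.sub(rf"\b{re.escape(name)}\b", lambda _m, body=body: body, line)
        output.append(line)
    return "\n".join(output)


program = """\
#define MAX 100
int a[MAX];
int MAXIMUM;
"""
print(preprocess(program))
# int a[100];
# int MAXIMUM;
```

The compiler proper never sees the name `MAX`; it receives only the expanded text, which is the sense in which a preprocessor produces input for compilers.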
Compiler
♦ A compiler is a program that translates a source program written in a high-level language (HLL) into an equivalent target program in a machine-level language (MLL).
♦ An important role of the compiler is reporting errors to the programmer.

Interpreter
• Like a compiler, an interpreter translates a high-level language into low-level machine language.
• The difference lies in the way they read the source code or input.
• A compiler reads the whole source code at once, creates tokens, checks semantics, generates intermediate code, translates the whole program, and may involve many passes.
• In contrast, an interpreter reads the program one statement at a time, converts it to intermediate code, executes it, and then takes the next statement in sequence.
• If an error occurs, an interpreter stops execution and reports it.
• A compiler, by contrast, reads the whole program even if it encounters errors.
Assemblers
• An assembler is a program that translates assembly language programs into machine code.
• The output of an assembler is called an object file.
• This object file contains a combination of machine instructions and the data required to place these instructions in memory.
Linker
• A linker is a program that links and merges various object files together to make an executable file.
• The major task of a linker is to search for and locate the referenced modules or routines in a program and to determine the memory locations where the code will be loaded.
Loader
• The loader is a part of the OS and is responsible for loading executable files into memory and executing them.
• It calculates the size of a program and creates memory space for it.
• It initializes various registers to initiate execution.

Compiler Construction Tools
• The compiler writer can use specialized tools that help in implementing the various phases of a compiler.
• These tools assist in the creation of an entire compiler or its parts.
• Some commonly used compiler construction tools are:
   • Parser generators
   • Scanner generators
   • Syntax-directed translation engines
   • Automatic code generators
   • Data flow engines
• Parser generators: produce syntax analyzers, normally from input that is based on a grammatical description of a programming language.
• Scanner generators: automatically generate lexical analyzers, normally from a specification based on regular expressions that describe the tokens of a language.
• Syntax-directed translation engines: produce collections of routines that walk the parse tree and produce the intermediate code.
• Automatic code generators: take a collection of rules that define the translation of each operation of the intermediate language into the machine language of the target machine.
• Data flow engines: used in code optimization to gather information about how values flow from one part of a program to another.
Features of compiler construction tools
• Lexical Analyzer Generator: This tool helps in generating the lexical analyzer, or scanner, of the compiler.
   • It takes as input a set of regular expressions that define the tokens of the language being compiled and produces a program that reads the input source code and tokenizes it based on these regular expressions.
• Parser Generator: This tool helps in generating the parser of the compiler.
   • It takes as input a context-free grammar that defines the syntax of the language being compiled and produces a program that parses the input tokens and builds an abstract syntax tree.
• Code Generation Tools: These tools help in generating the target code for the compiler.
   • They take as input the abstract syntax tree produced by the parser and produce code that can be executed on the target machine.
• Optimization Tools: These tools help in optimizing the generated code for efficiency and performance.
   • They can perform various optimizations such as dead code elimination, loop optimization, and register allocation.
• Debugging Tools: These tools help in debugging the compiler itself or the programs being compiled.
   • They can provide debugging information such as symbol tables, call stacks, and runtime errors.
• Profiling Tools: These tools help in profiling the compiler or the compiled code to identify performance bottlenecks and optimize the code accordingly.
• Documentation Tools: These tools help in generating documentation for the compiler and the programming language being compiled. They can generate documentation for the syntax, semantics, and usage of the language.
• Language Support: Compiler construction tools are designed to support a wide range of programming languages, including high-level languages such as C++, Java, and Python, as well as low-level languages such as assembly language.
• Cross-Platform Support: Compiler construction tools may be designed to work on multiple platforms, such as Windows, Mac, and Linux.
• User Interface: Some compiler construction tools come with a user interface that makes it easier for developers to work with the compiler and its associated tools.

Question & Answer

Thank You !!!
