CD Chapter 1
Course Outline
Instructor: Agegnehu Ashenafi (MSc.)
Course Objective:
To learn basic techniques used in compiler
construction such as lexical analysis, top-down
and bottom-up parsing, and intermediate code
generation.
To learn basic data structures used in compiler
construction such as abstract syntax trees,
symbol tables, and three-address code.
To learn software tools used in compiler
construction such as lexical analyzer generators,
and parser generators.
Course Content:
Chapter 1: Introduction
– 1.0. Compilers
– 1.1. Language Processing System
– 1.2. Analysis of the Source Program
– 1.3. Phases of a Compiler
– 1.4. Compiler Construction Tools
Chapter 2: Lexical analysis
– The role of the lexical analyzer
– Token Specification
– Recognition of Tokens
Chapter 3: Syntax analysis
– Role of a parser
– Syntax error handling
– Top down parsing
– Bottom up parsing
Chapter 4: Intermediate code generation
– Intermediate languages
– Declarations
– Assignment statements
Chapter 5: Code optimization and Code Generation
– Issues in design of a code generator
– Simple Code generator
– Introduction to code optimization
– Optimization of basic block
Text Books:
Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman, “Compilers: Principles,
Techniques, and Tools”, Addison-Wesley, 1988.
Chapter 1: Introduction to Compilers
What is Compiler Design?
A Compiler is computer software that transforms source
program code which is written in a high-level language into
low-level machine code.
Compiler design is the process of developing a program or
software that converts human-written code into machine
code. It involves many stages like lexical analysis, parsing,
semantic analysis, code generation, optimization, etc.
Compiler Design is the structure and set of principles that
guide the translation, analysis, and optimization process of a
compiler.
Design, in the dictionary sense (Oxford Advanced Learner's
Dictionary), is the process of deciding how the different parts
of something (in this case, a compiler) are arranged and how it
works.
Questions
What is a program, programming, a programmer, and a
translator?
What advantages do compilers have over
interpreters, and vice versa?
Why are lexical analysis, syntax analysis, and semantic
analysis needed?
Why do you learn compiler design?
How does compiling time compare with
interpreting time?
As a compiler design learner, how does the
machine detect lexical and syntax errors,
and how does it recover from them?
1.1. Language Processing System
a) Translator
A translator is a program that takes as input a program
written in one language and produces as output a program
in another language.
Besides translating the program, the translator performs
another very important role: error detection.
During translation, any violation of the high-level language
specification is detected and reported to the
programmer.
1. Preprocessors
A preprocessor is a computer program that modifies its input data to conform to the
input requirements of another program. For a compiler, it is a macro processor that
automatically transforms a program before actual compilation begins. It runs before the
compiler, so its output is the program text that the compiler actually sees.
They may perform the following functions:
i. Macro Processing: A preprocessor may allow a user to define macros that
are shorthands for longer constructs. Example: #define MaxNo 4
ii. File Inclusion: A preprocessor may include header files into the program
text. For example, the C preprocessor causes the contents of the file <global.h>
to replace the statement #include <global.h> when it processes a file
containing this statement.
iii. Rational Preprocessors: These processors augment older languages
with more modern flow-of-control and data-structuring facilities. For
example, such a preprocessor might provide the user with built-in macros for
constructs like while-statements or if-statements, where none exist in the
programming language itself.
iv. Language Extensions: These processors attempt to add capabilities
to the language by what amounts to built-in macros. For example, the
language Equel is a database query language embedded in C.
Statements beginning with ## are taken by the preprocessor
to be database-access statements, unrelated to C, and are
translated into procedure calls on routines that perform the
database access.
Macro processors deal with two kinds of statement:
macro definition
macro use
Definitions are normally indicated by some unique character or keyword,
like define or macro.
They consist of a name for the macro being defined and a body, forming
its definition.
The use of a macro consists of naming the macro and supplying actual
parameters, that is, values for its formal parameters.
The macro processor substitutes the actual parameters for the formal
parameters in the body of the macro; the transformed body then replaces
the macro use itself.
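The definition and use steps described above can be sketched in a few lines of Python. This is only an illustration, not the C preprocessor: the one-line syntax `define NAME(params) body`, and the function names `expand` and `substitute`, are assumptions made for this example. It also assumes that every use supplies as many actual parameters as the definition has formal parameters.

```python
import re

# Hypothetical macro syntax for this sketch: define NAME(p1, p2) body
DEFINE = re.compile(r"define\s+(\w+)\((.*?)\)\s+(.*)")
USE = re.compile(r"(\w+)\(([^()]*)\)")

def expand(source):
    """Record macro definitions and replace each macro use by its body."""
    macros = {}   # macro name -> (list of formal parameters, body)
    out = []
    for line in source.splitlines():
        definition = DEFINE.match(line)
        if definition:
            # Macro definition: remember name, formals and body; emit nothing.
            name, formals, body = definition.groups()
            macros[name] = ([f.strip() for f in formals.split(",")], body)
            continue
        # Ordinary line: rewrite every macro use found in it.
        out.append(USE.sub(lambda use: substitute(use, macros), line))
    return "\n".join(out)

def substitute(use, macros):
    """Substitute actual parameters for formal parameters in the body."""
    name, args = use.group(1), use.group(2)
    if name not in macros:
        return use.group(0)          # not a macro: leave the text unchanged
    formals, body = macros[name]
    actuals = [a.strip() for a in args.split(",")]
    binding = dict(zip(formals, actuals))
    # Replace all formals in one pass so that actuals are never re-substituted.
    pattern = r"\b(" + "|".join(re.escape(f) for f in formals) + r")\b"
    return re.sub(pattern, lambda m: binding[m.group(1)], body)

print(expand("define SQUARE(x) ((x) * (x))\ny = SQUARE(a);"))
# prints: y = ((a) * (a));
```

The transformed body replaces the macro use itself, so the definition line does not appear in the output.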
2. Compiler
A Compiler is computer software that transforms source program
code which is written in a high-level language into low-level
machine code.
To reduce the complexity of designing and building
computers, nearly all of them are made to execute relatively
simple commands (but to do so very quickly).
A program for a computer must be built by combining these
very simple commands into a program in what is called
machine language.
Since this is a tedious and error-prone process, most
programming is instead done using a high-level programming
language.
This language can be very different from the machine
language that the computer can execute, so some means
of bridging the gap is required.
This is where the compiler comes in.
A compiler translates (or compiles) a program written in a
high-level programming language that is suitable for human
programmers into low-level assembly or machine language.
During this process, the compiler will also attempt to
spot and report obvious programmer mistakes.
Using a high-level language for programming has a large
impact on how fast programs can be developed.
One main reason for this is that, compared to machine
language, the notation used by programming languages is
closer to the way humans think about problems.
PARTS AND MODULES OF COMPILATION
There are two parts to compilation:
i) Analysis part (lexical analysis, syntax analysis, semantic analysis)
ii) Synthesis part (intermediate code generation, code optimization, and code generation)
The analysis part breaks up the source program into constituent pieces
and creates an intermediate representation of the source program.
The synthesis part constructs the desired target program from the
intermediate representation. Of the two parts, synthesis requires the
most specialized techniques.
The compiler has two modules, namely the front end and the back end.
The front end translates the source code into an intermediate
representation. It consists of lexical analysis, syntax analysis, semantic
analysis, intermediate code generation, and creation of the symbol table.
The back end (code optimization and code generation) works
with the intermediate representation to produce code in the
target language.
The back end usually optimizes to produce code that runs faster.
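The front end / back end split can be illustrated with a toy compiler for arithmetic expressions. This is a minimal sketch under stated assumptions: the expression grammar (only `+`, `*`, parentheses, numbers, and identifiers), the tuple-based intermediate representation, and the stack-machine instructions `PUSH`, `LOAD`, `ADD`, `MUL` are all invented for this example. It performs no error handling and no optimization.

```python
import re

# ---- Front end (analysis): source text -> intermediate representation ----

def tokenize(text):
    """Lexical analysis: split the text into numbers, names and operators."""
    return re.findall(r"\d+|[A-Za-z_]\w*|[+*()]", text)

def parse(tokens):
    """Syntax analysis for: expr -> term ('+' term)*, term -> factor ('*' factor)*."""
    pos = 0

    def factor():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            node = expr()
            pos += 1                      # skip the closing ")"
            return node
        return ("num", int(tok)) if tok.isdigit() else ("var", tok)

    def term():
        nonlocal pos
        node = factor()
        while pos < len(tokens) and tokens[pos] == "*":
            pos += 1
            node = ("*", node, factor())
        return node

    def expr():
        nonlocal pos
        node = term()
        while pos < len(tokens) and tokens[pos] == "+":
            pos += 1
            node = ("+", node, term())
        return node

    return expr()

# ---- Back end (synthesis): intermediate representation -> target code ----

def generate(ir):
    """Code generation for a hypothetical stack machine."""
    kind = ir[0]
    if kind == "num":
        return [f"PUSH {ir[1]}"]
    if kind == "var":
        return [f"LOAD {ir[1]}"]
    op = "ADD" if kind == "+" else "MUL"
    return generate(ir[1]) + generate(ir[2]) + [op]

def compile_expr(text):
    return generate(parse(tokenize(text)))

print(compile_expr("a + 2 * b"))
# prints: ['LOAD a', 'PUSH 2', 'LOAD b', 'MUL', 'ADD']
```

Because the two halves communicate only through the intermediate representation, a different back end could be attached to the same front end without touching the analysis code.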
During analysis, the operations implied by the
source program are determined and recorded
in a hierarchical structure called a tree.
Often, a special kind of tree called a syntax
tree is used, in which each node represents an
operation and the children of a node
represent the arguments of the operation.
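As a small illustration of such a syntax tree, the sketch below builds the tree for the expression a + b * 60 by hand and evaluates it. The `Node` class, the `evaluate` function, and the sample expression are assumptions chosen for this example; a real compiler would build the tree during parsing.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str                                    # operator, identifier or constant
    children: list = field(default_factory=list)  # arguments of the operation

def evaluate(node, env):
    """Evaluate the tree bottom-up, looking identifiers up in env."""
    if not node.children:                         # leaf: identifier or constant
        return env[node.label] if node.label in env else int(node.label)
    args = [evaluate(child, env) for child in node.children]
    if node.label == "+":
        return args[0] + args[1]
    if node.label == "*":
        return args[0] * args[1]
    raise ValueError(f"unknown operation {node.label!r}")

# Tree for a + b * 60: the "+" node has children a and the "*" subtree.
tree = Node("+", [Node("a"), Node("*", [Node("b"), Node("60")])])
print(evaluate(tree, {"a": 10, "b": 2}))
# prints: 130
```

Each interior node is an operation and its children are the arguments, so the multiplication is evaluated before the addition without any explicit precedence rule.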
Many software tools that manipulate source
programs first perform some kind of analysis.
Some examples of such tools are
Structure editor
Pretty printers
Static checkers
Interpreters
Structure editor
A structure editor takes as input a sequence of
commands to build a source program.
The structure editor not only performs the text-creation
and modification functions of an ordinary text
editor, but it also analyzes the program text, putting an
appropriate hierarchical structure on the source
program.
For example, it can check that the input is correctly
formed, can supply keywords automatically (e.g., when
the user types while, the editor supplies the matching
do and reminds the user that a conditional must come
between them), and can jump from a begin or left
parenthesis to its matching end or right parenthesis.
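The jump from a left parenthesis to its matching right parenthesis can be sketched as a simple depth count. The function name `find_matching` is hypothetical, and the sketch ignores parentheses inside strings or comments, which a real structure editor would have to handle.

```python
def find_matching(text, index):
    """Return the index of the ')' matching the '(' at text[index], or -1."""
    if text[index] != "(":
        raise ValueError("index must point at a left parenthesis")
    depth = 0
    for i in range(index, len(text)):
        if text[i] == "(":
            depth += 1                # entering a nested group
        elif text[i] == ")":
            depth -= 1                # leaving a group
            if depth == 0:
                return i              # back at the starting level: match found
    return -1                         # unbalanced input

line = "f(a, (b + c), d)"
print(find_matching(line, 1), find_matching(line, 5))
# prints: 15 11
```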
Pretty printers
A pretty printer analyzes a program and prints it in such a
way that the structure of the program becomes clearly
visible.
For example, comments may appear in a special font, and
statements may appear with an amount of indentation
proportional to the depth of their nesting in the
hierarchical organization of the statements.
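Indentation proportional to nesting depth can be sketched as follows. The function `pretty_print` is a hypothetical example, and it assumes brace-delimited source with one statement per line, an opening brace at the end of a line, and a closing brace at the start of a line.

```python
def pretty_print(source, indent="    "):
    """Re-indent brace-delimited code according to its nesting depth."""
    depth = 0
    lines = []
    for raw in source.splitlines():
        line = raw.strip()
        if not line:
            continue                       # drop blank lines
        if line.startswith("}"):
            depth = max(depth - 1, 0)      # a closing brace ends a block
        lines.append(indent * depth + line)
        if line.endswith("{"):
            depth += 1                     # an opening brace starts a block
    return "\n".join(lines)

messy = "while (x) {\nif (y) {\nz = 1;\n}\n}"
print(pretty_print(messy))
```

For the sample input, the if statement is indented one level and the assignment two levels, which makes the nesting visible at a glance.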
Static checkers
A static checker reads a program, analyzes it, and attempts
to discover potential bugs without running the program.
For example, a static checker may detect that parts of the
source program can never be executed.
It can catch logical errors such as trying to use a real
variable as a pointer.
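Detecting code that can never be executed can be sketched with Python's standard `ast` module, which parses a program without running it. This is a deliberately narrow check, assumed for illustration: the hypothetical function `find_unreachable` only reports a statement that directly follows a return, raise, break, or continue in the same block, and the program being checked is itself Python source.

```python
import ast

def find_unreachable(source):
    """Return line numbers of statements placed right after a
    return, raise, break or continue in the same block."""
    terminators = (ast.Return, ast.Raise, ast.Break, ast.Continue)
    unreachable = []
    for node in ast.walk(ast.parse(source)):
        # Statement blocks live in these fields of compound statements.
        for name in ("body", "orelse", "finalbody"):
            block = getattr(node, name, None)
            if not isinstance(block, list):
                continue
            for prev, stmt in zip(block, block[1:]):
                if isinstance(prev, terminators):
                    unreachable.append(stmt.lineno)
    return sorted(unreachable)

program = "def f(x):\n    return x + 1\n    print('never runs')\n"
print(find_unreachable(program))
# prints: [3]
```

The checked program is never executed; the analysis works purely on the tree that the parser builds from the source text.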
3. Interpreters
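An interpreter, like a compiler, processes a program written in a
high-level language. Languages commonly implemented with interpreters
include Ruby, PHP, and JavaScript; Java is typically compiled to
bytecode, which a virtual machine then interprets.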
The difference lies in the way they read the source code or input.