Week 1-2
AND INTERPRETER
INSTRUCTOR: DR. SAKEENA JAVAID
OUTLINE OF THE LECTURE
Why Compilers?
Overview of high-level languages and translation
Levels of programming languages
Need for high-level languages
Advantages of high-level languages
Language translation: Compilation and Interpretation
Architecture of a compiler
Phases of the compilation process
Language definition
Syntax and semantic Specification
Using BNF notation to define syntax of a language
WHY COMPILERS?
A compiler is a fairly complex program, ranging from 10,000 to 1,000,000 lines of code
With the advent of the stored-program computer, pioneered by John von Neumann in the late 1940s, it became necessary to write programs that make the computer perform the desired computations
Initially, instructions and memory locations were written in machine language, which is a very tedious task
Later, assembly language was used, in which instructions and memory locations are written in symbolic form
An assembler translates the symbolic codes into numeric machine codes
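The translation step performed by an assembler can be sketched in a few lines of Python. This is only an illustration: the opcode table, the mnemonics and the symbol addresses below are invented for an imaginary machine and do not correspond to any real instruction set.

```python
# Hypothetical opcode table for an imaginary machine (not a real instruction set).
OPCODES = {"LOAD": 0x01, "ADD": 0x02, "STORE": 0x03, "HALT": 0xFF}


def assemble(source, symbols):
    """Translate symbolic instructions into a flat list of numeric codes."""
    machine_code = []
    for line in source.strip().splitlines():
        parts = line.split()
        if not parts:
            continue  # ignore blank lines
        mnemonic, operands = parts[0], parts[1:]
        machine_code.append(OPCODES[mnemonic])   # symbolic instruction -> number
        for name in operands:
            machine_code.append(symbols[name])   # symbolic location -> address
    return machine_code


program = """
LOAD X
ADD Y
STORE Z
HALT
"""
symbols = {"X": 0x10, "Y": 0x11, "Z": 0x12}  # assumed memory addresses
code = assemble(program, symbols)
print(code)
```

Each symbolic name is replaced by a number through a simple table lookup, which is why assembly code is easier to write than machine code but remains tied to one machine.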
WHY COMPILERS? CONT’D
Assembly language improved the speed and accuracy of programming; however, it still has a few defects
Still not easy to read and understand
Machine-dependent code
Development of FORTRAN and its compiler at IBM (by a team led by John Backus, between 1954 and 1957)
This project succeeded, although not all of the processes involved in translating programming languages were fully understood at the time
At about the same time that the first compiler was under development, Noam Chomsky began his study of the structure of natural languages
His findings made compiler construction easier and capable of partial automation
WHY COMPILERS? CONT’D
Chomsky’s study led to the classification of languages according
to the complexity of grammars (the rules specifying their structures)
and the power of the algorithms to recognize them
Chomsky’s hierarchy comprises four levels of grammars: type 0, type 1, type 2 and type 3
Each is a specialization of its predecessor
Type 2, or context-free, grammars are the most useful for programming languages (they are considered the standard way to represent the structure of a programming language)
Study of the parsing problem (finding efficient algorithms for recognizing context-free languages) was pursued during the 1960s and 1970s
This led to a fairly complete solution, which has become a standard part of compiler theory
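As a small illustration of how a context-free grammar describes program structure, the sketch below recognizes arithmetic expressions with a hand-written recursive-descent recognizer. The grammar and the function name `recognizes` are assumptions made for this example, and recursive descent is only one of several parsing techniques.

```python
# Grammar assumed for this sketch (braces mean "zero or more"):
#   <expr>   ::= <term>   { "+" <term> }
#   <term>   ::= <factor> { "*" <factor> }
#   <factor> ::= NUMBER | "(" <expr> ")"


def recognizes(tokens):
    """Return True if the token list forms a valid <expr>, else False."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def expr():
        term()
        while peek() == "+":
            advance()
            term()

    def term():
        factor()
        while peek() == "*":
            advance()
            factor()

    def factor():
        tok = peek()
        if tok is not None and tok.isdigit():
            advance()
        elif tok == "(":
            advance()
            expr()  # the grammar refers to itself here, allowing nesting
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
        else:
            raise SyntaxError(f"unexpected token {tok!r}")

    try:
        expr()
        return pos == len(tokens)  # every token must be consumed
    except SyntaxError:
        return False


print(recognizes(["2", "+", "3", "*", "(", "4", "+", "5", ")"]))
print(recognizes(["2", "+"]))
```

Each grammar rule becomes one function, and the recursive call inside `factor` is what lets parentheses nest to any depth.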
WHY COMPILERS? CONT’D
Closely related to context-free grammars are finite automata and regular expressions, which correspond to Chomsky’s type 3 grammars
Used for expressing the structure of words or tokens of a programming language
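A minimal sketch of this idea uses Python's `re` module to describe each kind of token with a regular expression. The token names (`NUMBER`, `IDENT`, `OP`, `SKIP`) and the tiny token set are assumptions chosen for illustration, not the tokens of any particular language.

```python
import re

# Each token class is described by one regular expression (assumed token set).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Split text into (kind, lexeme) pairs, discarding whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"illegal character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("total = price * 2"))
```

The regular expressions only describe the shape of individual words; how those words may be combined is left to the grammar of the language.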
Compilers:
Compilers convert (or ‘compile’) the source code to machine code all at once
This is then stored as an executable file which the computer can run (for example,
something ending with the ‘.exe’ file extension)
Errors in the source code can be spotted as the program is compiling and reported
to the programmer
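The "translate everything first, report errors before running" behaviour can be imitated with Python's built-in `compile()` function. Note the assumption: `compile()` produces a Python bytecode object, not a native executable file, so this is only an analogy. The helper name `check_and_run` is made up for this sketch.

```python
def check_and_run(source):
    """Translate the whole program first; run it only if translation succeeds."""
    try:
        code_object = compile(source, "<example>", "exec")
    except SyntaxError as err:
        # The error is reported at translation time; no statement has run yet.
        return f"compile error on line {err.lineno}: {err.msg}"
    namespace = {}
    exec(code_object, namespace)
    return namespace.get("result")


good = "x = 6\nresult = x * 7\n"
bad = "result = 1\nresult = = 2\n"  # line 2 is not valid syntax

print(check_and_run(good))
print(check_and_run(bad))
```

In the second program the first line is perfectly valid, yet it never executes, because the whole source is rejected before execution begins.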
OVERVIEW OF HIGH-LEVEL LANGUAGES AND TRANSLATION
Interpreters
Interpreters convert the code as it is running
They take a line of source code at a time and convert it to machine code (which the computer
runs straight away)
This is repeated until the end of the program
No executable file is created
If the interpreter comes across an error in the source code, the only thing it can do is report the error to the person trying to use the program (or it may just refuse to continue running)
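The line-at-a-time behaviour described above can be sketched with a toy interpreter. The three-instruction language (`set`, `add`, `print`) is invented for this example, and the sketch executes each line directly in Python rather than producing real machine code.

```python
def interpret(source):
    """Execute a program one line at a time; return (output, error message)."""
    variables = {}
    output = []
    for number, line in enumerate(source.strip().splitlines(), start=1):
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        try:
            if parts[0] == "set":
                variables[parts[1]] = int(parts[2])
            elif parts[0] == "add":
                variables[parts[1]] += int(parts[2])
            elif parts[0] == "print":
                output.append(variables[parts[1]])
            else:
                raise ValueError(f"unknown instruction {parts[0]!r}")
        except (ValueError, KeyError, IndexError) as err:
            # Stop at the first faulty line; earlier lines have already run.
            return output, f"line {number}: {err}"
    return output, None


program = """
set x 5
print x
mul x 2
print x
"""
output, error = interpret(program)
print(output)
print(error)
```

The first two lines run and produce output before the faulty third line is even looked at, which is the key difference from a compiler.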
OVERVIEW OF HIGH-LEVEL LANGUAGES AND TRANSLATION CONT’D
Understanding of the need for both high-level and low-level languages
Computers don’t understand high-level languages because they only understand binary (‘machine code’).
Humans struggle to understand exactly what a program does when it is in binary only.
High-level languages are more accessible to programmers.
High-level languages will work on different types of computers.
Low-level programming allows hardware to be controlled directly.
Low-level programs will only work with the processor they are designed for (machine-dependent).
OVERVIEW OF HIGH-LEVEL LANGUAGES AND TRANSLATION CONT’D
Need for compilers when translating programs written in a high-level language
Compilers
Translates the entire program from source (i.e. high-level language) to object code / machine code.
Produces an executable file (i.e. in binary / machine code)
Advantages
Fast code is produced
Source code remains hidden, so it cannot be modified by the customer
Compiled only once, so the translator is not needed each time the program runs
Disadvantages
Compilers use a lot of computer resources: the compiler has to be loaded into the computer’s memory at the same time as the source code, and there has to be sufficient memory to hold the object code
Difficult to pinpoint the source of an error in the original program
OVERVIEW OF HIGH-LEVEL LANGUAGES AND TRANSLATION CONT’D
Understanding of the use of interpreters with high-level language programs
With an interpreter, each instruction is taken in turn and translated to machine code. The instruction is then executed before the next instruction is translated
Advantages
Error messages are output as soon as an error is encountered, so debugging is easier
Useful for prototypes, as the program will run even when part of it has errors.
Disadvantages
Execution of a program is slow compared to that of a compiled program.
Instructions inside a loop have to be translated each time the loop is entered
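The cost of repeated translation inside a loop can be made visible by counting translations. In this sketch the two-instruction language (`add N`, `mul N`), the `translate` helper and the `stats` counter are all hypothetical; "translation" here means turning text into a Python function, not into real machine code.

```python
def translate(instruction, stats):
    """Turn one textual instruction into a Python function, counting the work."""
    stats["translations"] += 1
    op, operand = instruction.split()
    value = int(operand)
    if op == "add":
        return lambda acc: acc + value
    if op == "mul":
        return lambda acc: acc * value
    raise ValueError(f"unknown instruction {op!r}")


def run_interpreted(body, iterations):
    """Interpreter style: translate every instruction each time it is reached."""
    stats = {"translations": 0}
    acc = 0
    for _ in range(iterations):
        for instruction in body:
            acc = translate(instruction, stats)(acc)
    return acc, stats["translations"]


def run_translated_once(body, iterations):
    """Compiler style: translate the loop body once, then only execute it."""
    stats = {"translations": 0}
    steps = [translate(instruction, stats) for instruction in body]
    acc = 0
    for _ in range(iterations):
        for step in steps:
            acc = step(acc)
    return acc, stats["translations"]


body = ["add 1", "mul 2"]
print(run_interpreted(body, 3))
print(run_translated_once(body, 3))
```

Both versions compute the same result, but the interpreter-style run translates the two instructions on every iteration while the other translates them once.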
A compiler can broadly be divided into two phases based on the way it compiles
Analysis Phase
Known as the front-end of the compiler, the analysis phase reads the source program, divides it into core parts and then checks for lexical, grammar and syntax errors.
Synthesis Phase
Known as the back-end of the compiler, the synthesis phase generates the target program with the help of the intermediate representation of the source code and the symbol table.
A compiler can have many phases and passes.
Pass: A pass is one traversal of the entire program by the compiler.
Phase: A phase of a compiler is a distinguishable stage, which takes input from the previous stage, processes it and yields output that can be used as input for the next stage. A pass can have more than one phase.
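The idea that each phase consumes the output of the previous one can be sketched as a chain of functions. Everything below is an assumption made for illustration: the input language only supports sums of numbers separated by spaces, and the `PUSH`/`ADD` instructions target a hypothetical stack machine.

```python
def lexical_analysis(text):
    """Phase 1: characters -> list of tokens."""
    return text.split()


def syntax_analysis(tokens):
    """Phase 2: tokens -> tree for the pattern NUMBER { + NUMBER }."""
    if not tokens or not tokens[0].isdigit():
        raise SyntaxError("expected a number")
    tree = int(tokens[0])
    rest = tokens[1:]
    while rest:
        if len(rest) < 2 or rest[0] != "+" or not rest[1].isdigit():
            raise SyntaxError("expected '+ NUMBER'")
        tree = ("+", tree, int(rest[1]))
        rest = rest[2:]
    return tree


def code_generation(tree):
    """Phase 3: tree -> instructions for a hypothetical stack machine."""
    if isinstance(tree, int):
        return [f"PUSH {tree}"]
    _, left, right = tree
    return code_generation(left) + code_generation(right) + ["ADD"]


def compile_source(text):
    """Feed the output of each phase into the next one."""
    result = text
    for phase in (lexical_analysis, syntax_analysis, code_generation):
        result = phase(result)
    return result


print(compile_source("1 + 2 + 3"))
```

Here each phase works through the whole program representation before the next one starts; a real compiler may instead group several phases into a single pass.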
PHASES OF THE COMPILER
Denotational semantics
Occasionally, a language will have its semantics given by a formal
definition in mathematical terms
Several methods are currently used to do this, and no one method has
achieved the level of a standard
Denotational semantics has become one of the more common methods,
especially in the functional programming community
When a formal definition exists for a language, then it is (in theory)
possible to give a mathematical proof that a compiler conforms to the
definition
LANGUAGE DEFINITION CONT’D
Runtime Environment
One aspect of compiler construction that is particularly affected by the language definition is the structure and behavior of the runtime environment
The structure of data allowed in a programming language, and particularly the kinds of function calls and returned values allowed, have a decisive effect on the complexity of the runtime system
Three basic types of runtime environments in increasing order of complexity are as follows:
FORTRAN77: With no pointers or dynamic allocation and no recursive function calls
Allows a completely static runtime environment, where all memory allocation is done prior
to execution
Makes the job of allocation particularly easy for the compiler writer, as no code needs to be
generated to maintain the environment
LANGUAGE DEFINITION CONT’D
Pascal, C and other so-called Algol-like languages: allow a limited form of dynamic allocation and recursive function calls
Require a “semi-dynamic” or stack-based runtime environment with an additional dynamic structure called a heap
The programmer can schedule dynamic allocation from the heap
Functional and most object-oriented languages, such as LISP and Smalltalk: require a “fully dynamic” environment
In which all allocation is performed automatically via code generated by the compiler
This is complicated because it requires that memory also be freed automatically
Requires complex “garbage collection” algorithms
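To see why recursion forces the move from the first kind of environment to the second, the rough simulation below contrasts a single statically allocated slot with a stack of activation records. This is only a model: Python itself uses a stack, so the dictionary `STATIC_SLOTS` stands in for storage fixed before execution, and both function names are made up for this sketch.

```python
# One fixed storage location for the parameter n, allocated "before execution".
STATIC_SLOTS = {"n": 0}


def factorial_static(n):
    """Factorial with a single static slot for n: recursion clobbers the slot."""
    STATIC_SLOTS["n"] = n
    if STATIC_SLOTS["n"] <= 1:
        return 1
    sub_result = factorial_static(STATIC_SLOTS["n"] - 1)  # overwrites the slot
    return STATIC_SLOTS["n"] * sub_result  # n no longer holds this call's value


def factorial_stack(n):
    """Factorial with one activation record per call, kept on an explicit stack."""
    stack = [{"n": n}]
    while stack[-1]["n"] > 1:
        stack.append({"n": stack[-1]["n"] - 1})  # a "call" pushes a new record
    result = 1
    while stack:
        record = stack.pop()  # a "return" pops the record
        result *= record["n"] if record["n"] > 1 else 1
    return result


print(factorial_static(4))  # incorrect: the shared slot was overwritten
print(factorial_stack(4))   # correct: each call kept its own n
```

With static storage every call shares the same `n`, so the value needed after the recursive call is lost; the stack gives each call its own record, at the cost of code to push and pop records at run time.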
REFERRED TEXT BOOKS
1. Compiler Construction: Principles and Practice by Kenneth C. Louden, Course Technology, 1997, ISBN 978-0534939724.
2. Compilers: Principles, Techniques, and Tools by Alfred V. Aho, Monica S. Lam, Ravi Sethi, Jeffrey D. Ullman, Addison-Wesley, 2nd edition, 2006.
3. Modern Compiler Design by Dick Grune, Henri E. Bal, Ceriel J. H. Jacobs, Koen G. Langendoen, John Wiley & Sons, 2003.
Thanks