Compiler Construction: Chapter 1: Introduction To Compilation

The document provides an introduction to compiler construction, explaining that compilers translate source code written in a high-level language into machine-readable object code. It describes the main phases of compilation as scanning, parsing, semantic analysis, optimization of the source and target codes, and code generation. Additionally, it gives a brief history of compiler development and lists some related programs like interpreters, assemblers, linkers, preprocessors, editors, debuggers, and profilers.

Uploaded by

azimkhan

Compiler Construction

Chapter 1: Introduction to Compilation

CA – 501
Khan S. N.
8275517509
Compilers
• Compilers translate from a source language (typically a high-level
language) to a functionally equivalent target language
(typically the machine code of a particular machine or of a
machine-independent virtual machine).
• Compilers for high-level programming languages are among
the larger and more complex pieces of software.
– Original languages included Fortran and Cobol
• Often multi-pass compilers (to facilitate memory reuse)
– Compiler development helped in better programming language design
• Early development focused on syntactic analysis and optimization
– Commercially, compilers are developed by very large software groups
• Current focus is on optimization and smart use of resources for
modern RISC (reduced instruction set computer) architectures.
What is a compiler?
A compiler is a computer program that translates one language into another:
Source program → Compiler → Target program
A compiler is a complex program
  From 10,000 to 1,000,000 lines of code
Compilers are used in many forms of computing
  Command interpreters, interface programs
Brief History of Compiler

The first compiler was developed between 1954 and 1957:
  the FORTRAN language and its compiler, by a team at IBM led by John Backus
The structure of natural language was studied at about the same time by Noam Chomsky
Related theories and algorithms were developed in the 1960s and 1970s:
  The classification of languages: the Chomsky hierarchy
  The parsing problem was pursued:
    context-free languages, parsing algorithms
  Symbolic methods for expressing the structure of the words of a programming language:
    finite automata, regular expressions
  Methods were developed for generating efficient object code:
    optimization techniques, or code improvement techniques
Programs were developed to automate parts of compiler development:
  Parser generators, such as Yacc by Steve Johnson in 1975 for the Unix system
  Scanner generators, such as Lex by Mike Lesk for the Unix system at about the same time
Projects focused on automating the generation of other parts of a compiler
  Automated code generation was undertaken during the late 1970s and early 1980s
  These projects met with less success because those parts of a compiler are less well understood
Recent advances in compiler design
  More sophisticated algorithms for inferring and/or simplifying the information contained in a program,
  such as the unification algorithm of Hindley-Milner type checking
  Window-based interactive development environments (IDEs)
  that include editors, linkers, debuggers, and project managers
However, the basics of compiler design have not changed much in the last 20 years.

Programs Related to Compilers
Interpreters
  Execute the source program immediately rather than generating object code
  Examples: BASIC, LISP; often used in educational or development situations
  Execution can be slower than compiled code by a factor of 10 or more
  Share many of their operations with compilers
Assemblers
  A translator for the assembly language of a particular computer
  Assembly language is a symbolic form of the machine language of that computer
  A compiler may generate assembly language as its target language, and an assembler then finishes the translation into object code
Linkers
  Collect separate object files into a directly executable file
  Connect an object program to the code for standard library functions and to resources supplied by the OS
  Linking was originally one of the principal activities of a compiler; the process depends heavily on the OS and processor
Loaders
  Resolve all relocatable addresses relative to a given base address
  Make executable code more flexible
  Often part of the operating environment, rarely an actual separate program
Preprocessors
  Delete comments, include other files, and perform macro substitutions
  May be required by a language (as in C) or be later add-ons that provide additional facilities
Editors
  Compilers are often bundled together with an editor and other programs into an interactive development environment (IDE)
  An editor may be oriented toward the format or structure of the programming language (a structure-based editor)
  It may include some operations of a compiler, informing the programmer of some errors
Debuggers
  Used to determine execution errors in a compiled program
  Keep track of most or all of the source code information
  Halt execution at pre-specified locations called breakpoints
  Must be supplied with appropriate symbolic information by the compiler
Profilers
  Collect statistics on the behavior of an object program during execution
    Number of times each procedure is called
    Percentage of execution time spent in each procedure
  Used to improve the execution speed of the program
The Translation Process
The phases of a compiler

Six phases:
  Scanner
  Parser
  Semantic analyzer
  Source code optimizer
  Code generator
  Target code optimizer
Three auxiliary components:
  Literal table
  Symbol table
  Error handler
The Phases of a Compiler
Source code
  → Scanner → Tokens
  → Parser → Syntax tree
  → Semantic analyzer → Annotated tree
  → Source code optimizer → Intermediate code
  → Code generator → Target code
  → Target code optimizer → Target code
The symbol table and the error handler are auxiliary components used alongside these phases.
The Scanner

Lexical analysis: it collects sequences of characters into meaningful units called tokens
An example: a[index]=4+2
• a identifier
• [ left bracket
• index identifier
• ] right bracket
• = assignment
• 4 number
• + plus sign
• 2 number

Other operations: it may enter literals into the literal table
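
The token list above can be reproduced with a few lines of code. This is only a minimal sketch, assuming Python's standard `re` module; the token names (`IDENTIFIER`, `LBRACKET`, and so on) are chosen here for illustration, and a real scanner would also handle keywords, comments, and literal tables.

```python
import re

# Token classes for the example; names are illustrative, not from the slides.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("LBRACKET", r"\["),
    ("RBRACKET", r"\]"),
    ("ASSIGN", r"="),
    ("PLUS", r"\+"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def scan(source):
    """Collect the characters of `source` into (token class, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":  # whitespace separates tokens but is not one
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


for token_class, lexeme in scan("a[index]=4+2"):
    print(lexeme, token_class)
```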


The Parser

Syntax analysis: it determines the structure of the program
The result of syntax analysis is a parse tree or a syntax tree
An example: a[index]=4+2
  Parse tree
  Syntax tree (abstract syntax tree)
The Parse Tree

expression
  assign-expression
    expression
      subscript-expression
        expression
          identifier a
        [
        expression
          identifier index
        ]
    =
    expression
      additive-expression
        expression
          number 4
        +
        expression
          number 2
The Syntax Tree

assign-expression
  subscript-expression
    identifier a
    identifier index
  additive-expression
    number 4
    number 2
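
A syntax tree like this one can be built by a small recursive-descent parser. The sketch below is an illustration only: the grammar (one subscripted assignment whose right side is a sum of numbers and identifiers) and the tuple representation of tree nodes are assumptions made for this example.

```python
import re


def tokenize(text):
    # Numbers, identifiers, and the four punctuation tokens of the example.
    # Characters outside this set are silently skipped in this sketch.
    return re.findall(r"\d+|[A-Za-z_]\w*|[\[\]=+]", text)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, text):
        if self.peek() != text:
            raise SyntaxError(f"expected {text!r}, found {self.peek()!r}")
        self.pos += 1

    def primary(self):
        token = self.peek()
        if token is not None and token.isdigit():
            self.pos += 1
            return ("number", int(token))
        if token is not None and (token[0].isalpha() or token[0] == "_"):
            self.pos += 1
            return ("identifier", token)
        raise SyntaxError(f"unexpected token {token!r}")

    def additive(self):
        node = self.primary()
        while self.peek() == "+":
            self.pos += 1
            node = ("additive-expression", node, self.primary())
        return node

    def assignment(self):
        array = self.primary()
        if array[0] != "identifier":
            raise SyntaxError("assignment target must start with an identifier")
        self.expect("[")
        index = self.additive()
        self.expect("]")
        self.expect("=")
        value = self.additive()
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return ("assign-expression", ("subscript-expression", array, index), value)


def parse(text):
    return Parser(tokenize(text)).assignment()


print(parse("a[index]=4+2"))
```

The brackets and the equals sign guide the parse but do not appear in the result, which is the difference between the syntax tree and the full parse tree.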

The Semantic Analyzer

The semantics of a program are its “meaning”, as opposed to its syntax, or structure
The semantic analyzer determines some of the program's runtime behavior prior to execution
  Static semantics: declarations and type checking
  Attributes: the extra pieces of information computed by the semantic analyzer
An example: a[index]=4+2
  The syntax tree annotated with attributes
The Annotated Syntax Tree

assign-expression
  subscript-expression : integer
    identifier a : array of integer
    identifier index : integer
  additive-expression : integer
    number 4 : integer
    number 2 : integer
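
The type attributes in the annotated tree can be computed by a bottom-up walk that consults a symbol table. This is a minimal sketch under stated assumptions: the tree is encoded as nested tuples, the symbol table is a plain dictionary, and only the two types of the example (`integer` and `array of integer`) exist.

```python
def type_of(node, symbols):
    """Return the type attribute of `node`, raising on a static semantic error."""
    kind = node[0]
    if kind == "number":
        return "integer"
    if kind == "identifier":
        if node[1] not in symbols:
            raise NameError(f"undeclared identifier {node[1]!r}")
        return symbols[node[1]]
    if kind == "subscript-expression":
        if type_of(node[1], symbols) != "array of integer":
            raise TypeError("only an array can be subscripted")
        if type_of(node[2], symbols) != "integer":
            raise TypeError("array index must be an integer")
        return "integer"
    if kind == "additive-expression":
        if type_of(node[1], symbols) != "integer" or type_of(node[2], symbols) != "integer":
            raise TypeError("operands of + must be integers")
        return "integer"
    if kind == "assign-expression":
        target_type = type_of(node[1], symbols)
        if target_type != type_of(node[2], symbols):
            raise TypeError("assignment of mismatched types")
        return target_type
    raise ValueError(f"unknown node kind {kind!r}")


# Syntax tree and declarations for a[index]=4+2 (hypothetical encoding).
TREE = (
    "assign-expression",
    ("subscript-expression", ("identifier", "a"), ("identifier", "index")),
    ("additive-expression", ("number", 4), ("number", 2)),
)
SYMBOLS = {"a": "array of integer", "index": "integer"}

print(type_of(TREE, SYMBOLS))
```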

The Source Code Optimizer

The earliest point at which most optimization steps can be performed is just after semantic analysis
Some code improvements depend only on the source code and can be performed as a separate phase
Individual compilers exhibit a wide variation in the kinds of optimization performed as well as in their placement
An example: a[index]=4+2
  Constant folding can be performed directly on the annotated tree
  Or on intermediate code: three-address code, P-code
Optimizations on Annotated Tree

Before constant folding:
assign-expression
  subscript-expression : integer
    identifier a : array of integer
    identifier index : integer
  additive-expression : integer
    number 4 : integer
    number 2 : integer

After constant folding (4 + 2 is replaced by 6):
assign-expression
  subscript-expression : integer
    identifier a : array of integer
    identifier index : integer
  number 6 : integer
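
Constant folding on the tree can be sketched as a recursive rewrite that replaces an addition of two numbers with a single number node. The tuple encoding of the tree is an assumption made for this example, and type attributes are left out to keep it short.

```python
def fold(node):
    """Return a copy of the tree with constant additions evaluated."""
    kind = node[0]
    if kind in ("number", "identifier"):
        return node
    children = tuple(fold(child) for child in node[1:])
    if kind == "additive-expression":
        left, right = children
        if left[0] == "number" and right[0] == "number":
            return ("number", left[1] + right[1])
    return (kind,) + children


TREE = (
    "assign-expression",
    ("subscript-expression", ("identifier", "a"), ("identifier", "index")),
    ("additive-expression", ("number", 4), ("number", 2)),
)

print(fold(TREE))
```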
Optimization on Intermediate Code

t = 4 + 2
a[index] = t
  ↓ constant folding
t = 6
a[index] = t
  ↓ constant propagation
a[index] = 6
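
The two steps on intermediate code can be combined in one pass over a list of three-address instructions. In this sketch the instruction format (`("add", dest, left, right)` and `("store", array, index, value)`) is a hypothetical encoding, and the pass assumes that a folded temporary such as `t` is not needed after the instructions shown, so its assignment can be dropped.

```python
def optimize(code):
    """Fold constant additions and propagate the results into later operands."""
    constants = {}  # temporaries whose value is known at compile time
    result = []
    for op, *args in code:
        if op == "add":
            dest, left, right = args
            left = constants.get(left, left)
            right = constants.get(right, right)
            if isinstance(left, int) and isinstance(right, int):
                constants[dest] = left + right  # folded: no instruction is emitted
            else:
                constants.pop(dest, None)
                result.append(("add", dest, left, right))
        elif op == "store":
            array, index, value = args
            result.append(("store", array, constants.get(index, index), constants.get(value, value)))
        else:
            raise ValueError(f"unknown instruction {op!r}")
    return result


# t = 4 + 2 ; a[index] = t
CODE = [("add", "t", 4, 2), ("store", "a", "index", "t")]
print(optimize(CODE))
```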

The Code Generator

It takes the intermediate code or IR and generates code for the target machine
The properties of the target machine become the major factor:
  which instructions to use and how data is represented
An example: a[index]=4+2, already optimized to a[index]=6
  Code sequence in a hypothetical assembly language
A possible code sequence for a[index]=6:

MOV R0, index
MUL R0, 2
MOV R1, &a
ADD R1, R0
MOV *R1, 6

MUL R0, 2 scales the index by the element size (an integer is assumed to occupy two bytes)
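
Generating this sequence can be sketched as a function that fills in a fixed instruction pattern for an indexed store. The function name, the register choice, and the default element size of 2 are assumptions that mirror the hypothetical assembly language of the example, not a real instruction set.

```python
def generate_store(array, index, value, element_size=2):
    """Emit hypothetical assembly for array[index] = value."""
    return [
        f"MOV R0, {index}",         # value of index -> R0
        f"MUL R0, {element_size}",  # scale the index by the element size
        f"MOV R1, &{array}",        # address of the array -> R1
        "ADD R1, R0",               # address of the selected element
        f"MOV *R1, {value}",        # store the value at that address
    ]


for line in generate_store("a", "index", 6):
    print(line)
```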

The Target Code Optimizer

It improves the target code generated by the code generator:
  choosing addressing modes
  replacing slow instructions with faster ones
  eliminating redundant or unnecessary operations

Before:
MOV R0, index
MUL R0, 2
MOV R1, &a
ADD R1, R0
MOV *R1, 6

After:
MOV R0, index
SHL R0
MOV &a[R0], 6
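
Improvements of this kind are often done by a peephole pass that looks at a short window of instructions and rewrites known patterns. The sketch below handles only the two patterns of the example (multiply by 2 becomes a shift; address computation followed by an indirect store becomes an indexed store). The tuple instruction format is an assumption, and the second rewrite also assumes the address register is not used afterwards.

```python
def peephole(code):
    """Rewrite two known instruction patterns in hypothetical assembly."""
    out = []
    i = 0
    while i < len(code):
        window = code[i:i + 3]
        if (
            len(window) == 3
            and window[0][0] == "MOV" and window[0][2].startswith("&")
            and window[1][0] == "ADD" and window[1][1] == window[0][1]
            and window[2][0] == "MOV" and window[2][1] == "*" + window[0][1]
        ):
            # MOV R1,&a ; ADD R1,R0 ; MOV *R1,v  ->  MOV &a[R0],v
            out.append(("MOV", f"{window[0][2]}[{window[1][2]}]", window[2][2]))
            i += 3
        elif code[i][0] == "MUL" and code[i][2] == "2":
            # Multiplying by 2 is a left shift by one bit.
            out.append(("SHL", code[i][1]))
            i += 1
        else:
            out.append(code[i])
            i += 1
    return out


CODE = [
    ("MOV", "R0", "index"),
    ("MUL", "R0", "2"),
    ("MOV", "R1", "&a"),
    ("ADD", "R1", "R0"),
    ("MOV", "*R1", "6"),
]
for instruction in peephole(CODE):
    print(instruction[0], ", ".join(instruction[1:]))
```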
Other Issues in Compiler Structure
The Structure of a Compiler

Multiple views from different angles:
  Logical structure
  Physical structure
  Sequencing of the operations
The structure has a major impact on:
  Reliability, efficiency
  Usefulness, maintainability
Analysis and Synthesis
The analysis part of the compiler analyzes the source program to compute its properties
  Lexical analysis, syntax analysis, and semantic analysis, as well as optimization
  More mathematical and better understood
The synthesis part of the compiler produces the translated code
  Code generation, as well as optimization
  More specialized
The two parts can be changed independently of each other
Front End and Back End
The operations of the front end depend on the source
language
The scanner, parser, and semantic analyzer, as well as
intermediate code synthesis
The operations of the back end depend on the target
language
Code generation, as well as some optimization analysis
The intermediate representation is the medium of
communication between them
This structure is important for compiler portability
Passes
A compiler often processes the entire source program several times before generating code; these repetitions are referred to as passes.
Passes may or may not correspond to phases
  A pass often consists of several phases
  A compiler can be one-pass, which results in efficient compilation but less efficient target code
  Most compilers with optimization use more than one pass, for example:
    One pass for scanning and parsing
    A second pass for semantic analysis and source-level optimization
    A third pass for code generation and target-level optimization
Language Definition and Compilers
The lexical and syntactic structure of a programming language is usually specified formally:
  regular expressions
  context-free grammars
The semantics of a programming language are usually given by English descriptions
  collected in a language reference manual, or language definition
A language definition and a compiler are often developed simultaneously
  The techniques available have a major impact on the definition
  The definition has a major impact on the techniques needed
In other cases, the language to be implemented is well known and has an existing definition
  Writing a compiler that conforms to the definition is not an easy task
A language occasionally has its semantics given by a formal definition in mathematical terms
  So-called denotational semantics in the functional programming community
  In principle, a mathematical proof can then be given that a compiler conforms to the definition
The structure and behavior of the runtime environment affect compiler construction
  Static runtime environment
  Semi-dynamic or stack-based environment
  Fully dynamic or heap-based environment
Error Handling

Static (or compile-time) errors must be reported by a compiler
  Generate meaningful error messages and resume compilation after each error
  Each phase of a compiler needs a different kind of error handling
Exception handling
  Generate extra code to perform suitable runtime tests, guaranteeing that all such errors cause an appropriate event during execution

Bootstrapping and Porting
The Third Language in Compiler Construction
Besides the source and target languages, a compiler has an implementation language, which can be:
Machine language
  the compiler can execute immediately
Another language with an existing compiler on the same target machine (first scenario)
  Compile the new compiler with the existing compiler
Another language with an existing compiler on a different machine (second scenario)
  Compilation produces a cross compiler
T-Diagrams Describing Complex Situations
A compiler written in language H that translates language S into language T is drawn as a T-diagram:
S → T
  H
T-diagrams can be combined in two basic ways.
The First T-diagram Combination
A → B     B → C     A → C
  H    +    H    =    H

Two compilers run on the same machine H
  The first translates from A to B
  The second translates from B to C
  The result translates from A to C on H
The Second T-diagram Combination
A → B                    A → B
  H   →   H → K   →        K
            M

Translate the implementation language of a compiler from H to K
  Use another compiler from H to K (running on machine M)
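
The two combinations can be checked mechanically by treating a T-diagram as a triple of languages. This is a sketch only: the `Compiler` record and the function names `chain` and `recompile` are hypothetical, introduced here to mirror the diagrams.

```python
from collections import namedtuple

# A T-diagram: a compiler from `source` to `target`, written in `host`.
Compiler = namedtuple("Compiler", ["source", "target", "host"])


def chain(first, second):
    """First combination: A->B and B->C on the same machine give A->C."""
    if first.host != second.host:
        raise ValueError("both compilers must run on the same machine")
    if first.target != second.source:
        raise ValueError("the first compiler's target must be the second's source")
    return Compiler(first.source, second.target, first.host)


def recompile(program, translator):
    """Second combination: compile a compiler written in H with an H->K compiler."""
    if program.host != translator.source:
        raise ValueError("the translator must accept the implementation language")
    return Compiler(program.source, program.target, translator.target)


print(chain(Compiler("A", "B", "H"), Compiler("B", "C", "H")))
print(recompile(Compiler("A", "B", "H"), Compiler("H", "K", "M")))
```

With this encoding, compiling a compiler from A to H written in B with an existing B compiler that generates code for K yields `Compiler("A", "H", "K")`, a cross compiler.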
The First Scenario
A → H                    A → H
  B   →   B → H   →        H
            H

Translate a compiler from A to H written in B
  Use an existing compiler for language B on machine H
The Second Scenario
A → H                    A → H
  B   →   B → K   →        K
            K

Use an existing compiler for language B on a different machine K
  The result is a cross compiler (it runs on K but generates code for H)
Process of Bootstrapping

Write a compiler in the same language that it compiles:
S → T
  S
Two situations are considered:
  No compiler for the source language exists yet
  Porting to a new host machine
The First Step in Bootstrapping
A → H                    A → H
  A   →   A → H   →        H
            H

Compiler written in its own language A
  Compiled by a “quick and dirty” compiler written in machine language H
  The result is a running but inefficient compiler
The Second Step in Bootstrapping
A → H                    A → H
  A   →   A → H   →        H
            H

Compiler written in its own language A
  Compiled by the running but inefficient compiler
  The result is the final version of the compiler
Step 1 in Porting
A → K                    A → K
  A   →   A → H   →        H
            H

Compiler source code retargeted to K
  Compiled by the original compiler
  The result is a cross compiler
Step 2 in Porting
A → K                    A → K
  A   →   A → K   →        K
            H

Compiler source code retargeted to K
  Compiled by the cross compiler
  The result is the retargeted compiler

Compiler versus Interpreter
Compiler: translates to machine code

source code → scanner → parser → ... → code generator → machine code → loader

Interpreter: executes source code "directly"

source code → scanner → parser → interpretation
• statements in a loop are scanned and parsed again and again

Variant: interpretation of intermediate code

source code → ... compiler ... → intermediate code (e.g. Java bytecode) → VM
• source code is translated into the code of a virtual machine (VM)
• VM interprets the code, simulating the physical machine
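
The variant with intermediate code can be illustrated with a tiny stack machine: the expression is translated once, and the resulting code is then interpreted. The instruction set (`PUSH`, `ADD`) and the tuple encoding of the tree are assumptions made for this sketch and are unrelated to real Java bytecode.

```python
def compile_expression(node):
    """Translate an expression tree into code for a hypothetical stack VM."""
    kind = node[0]
    if kind == "number":
        return [("PUSH", node[1])]
    if kind == "additive-expression":
        return compile_expression(node[1]) + compile_expression(node[2]) + [("ADD",)]
    raise ValueError(f"unsupported node kind {kind!r}")


def run(code):
    """Interpret VM code: the loop simulates a machine that does not exist in hardware."""
    stack = []
    for instruction in code:
        if instruction[0] == "PUSH":
            stack.append(instruction[1])
        elif instruction[0] == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        else:
            raise ValueError(f"unknown instruction {instruction[0]!r}")
    return stack.pop()


CODE = compile_expression(("additive-expression", ("number", 4), ("number", 2)))
print(CODE)
print(run(CODE))
```

The code can be run many times without scanning or parsing the source again, which is the advantage over interpreting the source text directly.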
Compiler-Construction Tools:

The compiler writer, like any programmer, can profitably use tools such as
  Debuggers,
  Version managers,
  Profilers, and so on.
In addition to these software-development tools, other more specialized tools have been
developed for helping implement various phases of a compiler.
Shortly after the first compilers were written, systems to help with the compiler-writing process appeared.
These systems have often been referred to as
  Compiler-compilers,
  Compiler-generators,
  or Translator-writing systems.
Some general tools have been created for the automatic design of
specific compiler components.
These tools use specialized languages for specifying and
implementing the component, and many use algorithms that
are quite sophisticated.
The most successful tools are those that hide the details of the
generation algorithm and produce components that can be
easily integrated into the remainder of a compiler.
The following is a list of some useful compiler-construction tools:
  Parser generators
  Scanner generators
  Syntax-directed translation engines
  Automatic code generators
  Data-flow engines

Parser generators:
These produce syntax analyzers, normally from input that is based on a
context-free grammar.
In early compilers, syntax analysis consumed not only a large fraction of
the running time of a compiler, but a large fraction of the intellectual
effort of writing a compiler.
With parser generators, this phase is now considered one of the easiest to implement.

Scanner generators:
These tools automatically generate lexical analyzers, normally from a
specification based on regular expressions.
The basic organization of the resulting lexical analyzer is in effect a finite automaton.
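
To make the finite-automaton organization concrete, the sketch below hand-writes the kind of transition table a scanner generator might produce for two token classes, identifiers and numbers. The state names and the table are assumptions for illustration; they are not the output of Lex or any other actual tool.

```python
def char_class(ch):
    """Map a character to the input class used by the transition table."""
    if ch.isdigit():
        return "digit"
    if ch.isalpha() or ch == "_":
        return "letter"
    return "other"


# (state, input class) -> next state; a missing entry means "reject".
TRANSITIONS = {
    ("start", "letter"): "in_identifier",
    ("start", "digit"): "in_number",
    ("in_identifier", "letter"): "in_identifier",
    ("in_identifier", "digit"): "in_identifier",
    ("in_number", "digit"): "in_number",
}
# Accepting states and the token class each one recognizes.
ACCEPTING = {"in_identifier": "identifier", "in_number": "number"}


def classify(lexeme):
    """Run the automaton over `lexeme`; return its token class or None."""
    state = "start"
    for ch in lexeme:
        state = TRANSITIONS.get((state, char_class(ch)))
        if state is None:
            return None
    return ACCEPTING.get(state)


for word in ["index", "42", "4x"]:
    print(word, classify(word))
```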

Syntax-directed translation engines:
These produce collections of routines that walk the parse tree, generating
intermediate code.
The basic idea is that one or more “translations” are associated with each
node of the parse tree, and each translation is defined in terms of
translations at its neighbor nodes in the tree.
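
A hand-written version of such a tree walk is sketched below: the translation of each node is built from the translations of its children, and the result is three-address code. The tuple encoding of the tree and the temporary names `t1`, `t2`, ... are assumptions for this example; a translation engine would generate comparable routines from a specification.

```python
import itertools


def translate(node, code, temps):
    """Append code for `node` to `code`; return the name that holds its value."""
    kind = node[0]
    if kind in ("number", "identifier"):
        return str(node[1])
    if kind == "additive-expression":
        left = translate(node[1], code, temps)
        right = translate(node[2], code, temps)
        temp = f"t{next(temps)}"
        code.append(f"{temp} = {left} + {right}")
        return temp
    if kind == "subscript-expression":
        return f"{translate(node[1], code, temps)}[{translate(node[2], code, temps)}]"
    if kind == "assign-expression":
        target = translate(node[1], code, temps)
        value = translate(node[2], code, temps)
        code.append(f"{target} = {value}")
        return target
    raise ValueError(f"unknown node kind {kind!r}")


def to_three_address(tree):
    code = []
    translate(tree, code, itertools.count(1))
    return code


TREE = (
    "assign-expression",
    ("subscript-expression", ("identifier", "a"), ("identifier", "index")),
    ("additive-expression", ("number", 4), ("number", 2)),
)
for line in to_three_address(TREE):
    print(line)
```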

Automatic code generators:
Such a tool takes a collection of rules that define the translation of each
operation of the intermediate language into the machine language for
the target machine.
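
The rule-driven idea can be sketched with a table that maps each intermediate operation to a template of target instructions. Both the intermediate operations (`add`, `copy`) and the target instructions are hypothetical; real code-generator generators use much richer pattern matching, for example over trees with costs attached to each rule.

```python
# One rule per intermediate operation: a list of instruction templates.
RULES = {
    "add": ["MOV {dest}, {left}", "ADD {dest}, {right}"],
    "copy": ["MOV {dest}, {src}"],
}


def select(instructions):
    """Expand each intermediate instruction using its rule."""
    output = []
    for op, operands in instructions:
        if op not in RULES:
            raise KeyError(f"no rule for operation {op!r}")
        for template in RULES[op]:
            output.append(template.format(**operands))
    return output


INTERMEDIATE = [
    ("add", {"dest": "R0", "left": "x", "right": "y"}),
    ("copy", {"dest": "t", "src": "R0"}),
]
for line in select(INTERMEDIATE):
    print(line)
```

Retargeting the generator to another machine then means replacing the rule table rather than rewriting the code that applies it.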
