Compilation Stages - New

Compilation of a Computer Program

• Describe the program compilation stages: lexical and syntactic analysis, code generation and optimization.
• Demonstrate understanding of the lexical and syntactic analysis stages of program compilation.
• Demonstrate understanding of the code generation stage of program compilation.
• Demonstrate understanding of code optimization as a program compilation stage.
Assessment criteria:

All of you will be able to:


• List the program compilation steps.
• State the correct order of the compilation steps.
• Describe each step of program compilation.
Vocabulary
Term | Definition
Parsing | The process of analyzing a string of symbols to check that it conforms to the rules of a grammar.
Lexeme | A basic unit of a language.
Token | A structure representing a lexeme that indicates its category.
White Space | The spaces between words.
Source Code | A collection of computer instructions written using a computer language. The source code is transformed by a compiler into a collection of low-level machine code instructions called the Object Code.
Object Code | The machine code produced from the Source Code.
Keyword | A word that is reserved by a programming language because the word has a special meaning.
Keywords (Pascal) | absolute, and, array, asm, begin, case, const, constructor, destructor, div, do, downto, else, end, file, for, function, goto, if, implementation, in, inherited, inline, interface, label, mod, nil, not, object, of, operator, or, packed, procedure, program, record, reintroduce, repeat, self, set, shl, shr, string, then, to, type, unit, until, uses, var, while, with, xor.
What is a compiler?
Compiler
When a programmer uses a high-level computer language to write a program, the statements are called source code.
The compiler translates source code into machine code (low level).
The compiled code is stored as an object file, from which the final executable file is produced. When the executable file runs, the machine code is processed by the CPU.
Advantages and disadvantages
Advantages of compilers:
1. Source code is not included, therefore compiled code is more secure than interpreted code.
2. Tends to produce faster code than interpreting source code.
3. Produces an executable file, and therefore the program can be run without need of the source code.
Disadvantages of compilers:
1. Object code needs to be produced before a final executable file; this can be a slow process.
2. The source code must be 100% correct for the executable file to be produced.
The steps involved in compilation:

High-level source code, e.g. print("Hello world!")

Compiler:
1. Lexical Analysis
2. Syntactic Analysis
3. Code generation

Object File
Question:
Why would a company not want to distribute source code when it sells a software package?

• Retaining the source code ensures that control over it stays with the company or individual.
• The software cannot be so easily reverse engineered (taking design knowledge and reusing it).
• The code cannot easily be modified.
Can you suggest the correct order of the stages in compiling a computer program?

Code Generation | Semantic Analysis | Lexical Analysis | Syntactic Analysis | Optimization
How does a compiler work?
Compiler - 5 Stages of Compilation
The process of compilation can be split into five stages:

1. Lexical Analysis
2. Syntactic Analysis
3. Semantic Analysis
4. Code Generation
5. Optimization
The Stages of Compilation
Lexical Analysis
Syntactic Analysis:
-Syntax
-Semantic
Optimization (intermediate)
Code Generation
Optimization
Lexical analysis
Lexical analysis is the process of converting a stream of individual characters (normally arranged as lines) into a sequence of lexical tokens (tokenization of words and symbols) to feed into the parser.
It is roughly the equivalent of splitting text written in a natural language (e.g. English) into a sequence of words and punctuation symbols.
Lexical analysis in English
The cat sat on the mat.
Word (Lexeme) | Category (Token)
The | Article
cat | Noun
sat | Verb
Lexical analysis
What does time = 5 + 3; mean to you?
1 Lexical analysis

time = 5 + 3;

The only thing the computer sees is a line of characters.

The computer needs to make sense of the characters.


2 Lexical analysis
The compiler looks at the incoming stream of
characters and tries to decide where one ‘thing’
ends and another ‘thing’ begins:

time = 5 + 3;
The first thing the computer must do is
categorize the characters into ‘Tokens’.
3 Lexical analysis
The compiler looks at the incoming stream of characters and
tries to decide where one ‘thing’ ends and another ‘thing’
begins:

time = 5 + 3;
The compiler sees the word time. The word time is a basic unit and is called a Lexeme.

It does not recognize it as a keyword and assumes it must be a variable.
It calls the variable an identifier.
Lexical analysis - Token Table
time = 5 + 3;

Lexeme | Token Type
time | Identifier
(The row above is a Token.)

Lexeme: a basic unit of a language.

Token: a structure representing a lexeme that indicates its category.

Token Table: a table containing Tokens.


Lexical analysis - Token Table
time = 5 + 3;

Lexeme | Token Type
time | Identifier
= | Assignment Operator
5 | Number/Constant
+ | Operator (Addition)
3 | Number/Constant
; | Symbol (End of Statement)
(Each row above is a Token.)

Note: all white space has been removed.
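
The token table for time = 5 + 3; can be reproduced with a few lines of code. The following is a minimal sketch in Python, not how any particular compiler is implemented: the names TOKEN_SPEC and tokenize are assumptions made for this illustration, only the token types from the table are covered, and keywords are not handled.

```python
import re

# Token categories from the token table, tried in this order.
# "White Space" is recognized only so that it can be thrown away.
TOKEN_SPEC = [
    ("Number/Constant", r"\d+"),
    ("Identifier", r"[A-Za-z_]\w*"),
    ("Assignment Operator", r"="),
    ("Operator (Addition)", r"\+"),
    ("Symbol (End of Statement)", r";"),
    ("White Space", r"\s+"),
]


def tokenize(source):
    """Split source text into (lexeme, token type) pairs, dropping white space."""
    compiled = [(name, re.compile(pattern)) for name, pattern in TOKEN_SPEC]
    tokens = []
    position = 0
    while position < len(source):
        for name, regex in compiled:
            match = regex.match(source, position)
            if match:
                if name != "White Space":
                    tokens.append((match.group(), name))
                position = match.end()
                break
        else:
            # No category matched: the character is not part of this tiny language
            raise ValueError(f"Unexpected character {source[position]!r}")
    return tokens


for lexeme, token_type in tokenize("time = 5 + 3;"):
    print(f"{lexeme:<6}{token_type}")
```

Because white space is discarded, tokenize("time=5+3;") gives the same tokens as the spaced version.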


Lexical analysis - Activity
for (int j = 1; j <= i; j++)
{
}

Lexeme | Token Type
for | Keyword
( | Symbol
int | Keyword
j | Identifier
= | Assignment
1 | Constant
; | Symbol
j | Identifier
<= | Operator (Comparison)
i | Identifier
; | Symbol
j | Identifier
++ | Operator (Increment)
) | Symbol
Syntax Analysis
This is alternatively known as parsing. This stage analyses the syntax of the statements to ensure they conform to the rules of grammar for the computer language in question.

It is roughly the equivalent of checking that some ordinary text written in a natural language (e.g. English) is grammatically correct (without worrying about meaning).

The purpose of syntax analysis or parsing is to check that we have a valid sequence of tokens. Tokens may be symbols, keywords, identifiers etc.
1 Syntax Analysis (Parsing)
Syntax Analysis is used to confirm that a statement conforms to the rules of grammar for the computer language.

time = 5+3     Does not conform to the rules of grammar for C++.

It should read: time = 5+3;
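
A syntax check of this kind can be sketched as a function that compares a list of lexemes with one grammar rule. This is a simplified illustration in Python under stated assumptions: the rule identifier = number (+ number)* ; and the name is_valid_assignment are invented for this example and cover far less than the real C++ grammar (for instance, keywords are not excluded from identifiers).

```python
def is_valid_assignment(lexemes):
    """Check lexemes against the rule: identifier '=' number ('+' number)* ';'"""
    if len(lexemes) < 4:
        return False
    if not lexemes[0].isidentifier() or lexemes[1] != "=":
        return False
    if lexemes[-1] != ";":
        return False  # the statement is not terminated
    expression = lexemes[2:-1]
    # Numbers must sit at even positions and '+' at odd positions,
    # so a valid expression always has an odd number of lexemes.
    if len(expression) % 2 == 0:
        return False
    for index, lexeme in enumerate(expression):
        if index % 2 == 0 and not lexeme.isdigit():
            return False
        if index % 2 == 1 and lexeme != "+":
            return False
    return True


print(is_valid_assignment(["time", "=", "5", "+", "3", ";"]))  # True
print(is_valid_assignment(["time", "=", "5", "+", "3"]))       # False: missing ';'
```

The check looks only at the order and category of the lexemes; it says nothing about whether the statement makes sense.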


2 Syntax Analysis (Parsing)

This statement follows the grammar, but does it make sense?

time = 5+3;

What could go wrong?
3 Syntax Analysis (Parsing)
Semantic analysis is the task of ensuring that the declarations and statements of a program are semantically correct, i.e., that their meaning is clear and consistent with the way in which control structures and data types are supposed to be used.
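#include <iostream>
#include <string>
using namespace std;

int main() {
    string time;  // time is declared as a string

    for (int i = 1; i <= 10; i++) {
        // Grammatically valid, but an integer expression is assigned to a
        // string variable: semantic analysis has to decide what this means.
        time = 5 + 3;
    }

    return 0;
}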
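
The kind of check that semantic analysis performs on this program can be sketched with a symbol table that records the declared type of each variable. This is an illustrative Python sketch, not C++'s actual rules: the names symbol_table, expression_type and check_assignment are assumptions, and a real C++ compiler applies its own conversion rules before deciding whether to accept, warn about or reject such an assignment.

```python
# Hypothetical symbol table built from the declarations: name -> declared type
symbol_table = {"time": "string", "i": "int"}


def expression_type(lexemes):
    """Infer the type of a very simple expression of operands joined by '+'."""
    operand_types = set()
    for lexeme in lexemes:
        if lexeme == "+":
            continue
        if lexeme.isdigit():
            operand_types.add("int")
        elif lexeme.startswith('"') and lexeme.endswith('"'):
            operand_types.add("string")
        else:
            operand_types.add(symbol_table.get(lexeme, "undeclared"))
    return operand_types.pop() if len(operand_types) == 1 else "mixed"


def check_assignment(target, expression):
    """Return a list of semantic problems found in 'target = expression;'."""
    if target not in symbol_table:
        return [f"'{target}' is not declared"]
    declared = symbol_table[target]
    actual = expression_type(expression)
    if declared != actual:
        return [f"'{target}' is declared as {declared} "
                f"but is assigned a value of type {actual}"]
    return []


print(check_assignment("time", ["5", "+", "3"]))
print(check_assignment("i", ["5", "+", "3"]))
```

The first call reports the mismatch between the string variable and the integer expression; the second returns an empty list because the types agree.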
Code generation
Code generation creates the sequence of instructions and determines the order in which they are executed. It converts source code (via the output of lexical and syntactic analysis) into machine code.
Code generation
In computing, code generation is the process by which a
compiler's code generator converts some intermediate
representation of source code into a form (e.g., machine
code) that can be readily executed by a machine.

This involves:

1. Converting the Source Code to the Object Code.
2. Allocating memory locations to the variables and constants.
3. Working out the relative addressing so that the computer can move from one section of code to another.
4. Optimizing the code.
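
The first two steps can be illustrated for time = 5 + 3; with a small sketch in Python. The instruction names LOAD, ADD and STORE belong to a hypothetical single-register machine invented for this example, and generate_code and memory_map are assumed names; real machine code depends on the target processor.

```python
def generate_code(target, operands, memory_map):
    """Emit instructions for 'target = operand + operand + ...;'."""
    # Allocate the next free memory location to a variable seen for the first time
    if target not in memory_map:
        memory_map[target] = len(memory_map)
    instructions = [f"LOAD #{operands[0]}"]        # put the first value in the register
    for operand in operands[1:]:
        instructions.append(f"ADD #{operand}")     # add each further value to it
    instructions.append(f"STORE {memory_map[target]}")  # write the result to memory
    return instructions


memory_map = {}
for instruction in generate_code("time", [5, 3], memory_map):
    print(instruction)
```

Here the variable time is given memory location 0; a second variable compiled with the same memory_map would be given location 1.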
Optimization
Making the execution time of the compiled program as short as possible.
Optimization is a program transformation technique which tries to improve the code by making it consume fewer resources (e.g. CPU, memory) and run faster.
Code Optimization
Code optimization is any method of code modification to improve code quality and efficiency. A program may be optimized so that it becomes smaller, consumes less memory, executes more rapidly, or performs fewer input/output operations.
x = y + b;
z = x * 50;
Would become (provided x is not used anywhere else):
z = 50 * (y + b);
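
Another common optimization, different from the substitution shown above, is constant folding: arithmetic on constants is carried out once by the compiler instead of every time the program runs. The sketch below uses Python's standard ast module (ast.unparse needs Python 3.9 or later) and works on Python source rather than C++; the names ConstantFolder and fold_constants are assumptions made for this illustration.

```python
import ast
import operator

# Operators this sketch is willing to evaluate at "compile time"
FOLDABLE = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


class ConstantFolder(ast.NodeTransformer):
    """Replace arithmetic on two numeric constants with its computed result."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the operands first
        operation = FOLDABLE.get(type(node.op))
        left, right = node.left, node.right
        if (operation is not None
                and isinstance(left, ast.Constant)
                and isinstance(right, ast.Constant)
                and isinstance(left.value, (int, float))
                and isinstance(right.value, (int, float))):
            folded = ast.Constant(value=operation(left.value, right.value))
            return ast.copy_location(folded, node)
        return node


def fold_constants(source):
    """Parse the source, fold constant arithmetic and return the new source."""
    tree = ConstantFolder().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


print(fold_constants("time = 5 + 3"))            # time = 8
print(fold_constants("z = (y + b) * (10 * 5)"))  # z = (y + b) * 50
```

Expressions that contain variables, such as y + b, are left untouched because their values are not known until the program runs.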
Code Optimization in Compiler Design
Interpreter
A different type of language translator.

* It does not translate source code into machine code (unlike a compiler).
* It contains subroutines to carry out each high-level instruction.
* When a program is written and run, the interpreter looks at each line of code and, if the line contains no errors, uses its own subroutines to execute it.
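
The idea of an interpreter with one subroutine per instruction can be sketched as follows. This is a toy illustration in Python: the three-instruction language (LET, ADD, PRINT) and all function names are invented for the example and do not describe any real interpreter.

```python
def run_let(arguments, variables):
    """Subroutine for 'LET name value': store a number in a variable."""
    name, value = arguments
    variables[name] = int(value)


def run_add(arguments, variables):
    """Subroutine for 'ADD name value': add a number to a variable."""
    name, value = arguments
    variables[name] += int(value)


def run_print(arguments, variables):
    """Subroutine for 'PRINT name': show the value of a variable."""
    print(variables[arguments[0]])


# One subroutine for each high-level instruction
SUBROUTINES = {"LET": run_let, "ADD": run_add, "PRINT": run_print}


def interpret(program):
    """Look at each line in turn and execute it with the matching subroutine."""
    variables = {}
    for line_number, line in enumerate(program.splitlines(), start=1):
        if not line.strip():
            continue
        keyword, *arguments = line.split()
        if keyword not in SUBROUTINES:
            # Execution stops at the first line containing an error
            raise SyntaxError(f"line {line_number}: unknown instruction {keyword!r}")
        SUBROUTINES[keyword](arguments, variables)
    return variables


interpret("LET time 5\nADD time 3\nPRINT time")  # prints 8
```

No object file is produced: the lines before an error are executed, and the whole process is repeated every time the program is run.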
Let's recall the difference between a compiler and an interpreter
Aspect | Interpreter | Compiler
Execution Method | Executes code line-by-line at runtime. | Translates entire code into machine language before execution.
Speed | Slower due to real-time interpretation of each line. | Faster since it runs pre-translated machine code.
Error Handling | Stops at the first error during execution, making it easier to find and fix errors on the spot. | Identifies all errors during the compilation process, but can be harder to debug due to less context.
Output | Does not generate an intermediate file; runs directly from source code. | Generates an executable file or object code as the final output.
Platform Dependency | Platform-independent; requires the interpreter to run on each platform. | Platform-dependent; compiled code is specific to the architecture and OS it was compiled for.
Distribution | Requires both the source code and the interpreter to be available to execute the code. | Only the compiled executable needs to be distributed, keeping the source code secure.
Errors
Subroutines should be tested individually.
A 'test harness' should be used - a program
which tests the modules can be coded and run.
Once tested, the subroutine may then be stored
in a library and confidently used in many
different programs.
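
A test harness can be very small. The sketch below, in Python, assumes a hypothetical subroutine add_numbers and a hypothetical helper run_test_harness; it only illustrates the idea of running a subroutine against known inputs and expected results.

```python
def add_numbers(first, second):
    """Hypothetical subroutine under test."""
    return first + second


def run_test_harness(subroutine, test_cases):
    """Call the subroutine with each set of inputs and compare with the expected result."""
    failures = []
    for inputs, expected in test_cases:
        actual = subroutine(*inputs)
        if actual != expected:
            failures.append((inputs, expected, actual))
    return failures


# Each test case is (inputs, expected result)
test_cases = [((5, 3), 8), ((0, 0), 0), ((-2, 2), 0)]
failures = run_test_harness(add_numbers, test_cases)
print("All tests passed" if not failures else f"Failed cases: {failures}")
```

Once the list of failures is empty for a representative set of test cases, the subroutine is a candidate for storing in a library.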
Errors (types)
Translation errors: occur when the program is
being compiled (usually syntax errors).
Linking errors: may occur if a compiled
program is linked to library routines. For
example, a subroutine may not be present in the
library.
Execution errors (run-time errors): usually logical errors. The program compiles successfully but does not perform correctly.
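
The difference between translation errors and execution errors can be demonstrated with Python's built-in compile() and exec() functions. This is only an analogy, under the assumption that Python's translation to bytecode stands in for compilation to machine code; classify_error is a name invented for this sketch, and linking errors are not covered.

```python
def classify_error(source):
    """Report at which point a piece of Python source fails, if at all."""
    try:
        code = compile(source, "<example>", "exec")  # translation step
    except SyntaxError:
        return "translation error"
    try:
        exec(code, {})  # execution step
    except Exception:
        return "execution error"
    return "no error detected"


print(classify_error("time = (5 + 3"))          # translation error
print(classify_error("time = 5 / 0"))           # execution error
print(classify_error("average = (5 + 3) / 1"))  # no error detected
```

The last statement divides by 1 where an average of two numbers should divide by 2. It translates and runs without complaint, which is why logical errors have to be found by testing rather than by the translator.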
Discussion Questions
1. Give two examples of high-level languages:
Pascal, Python etc.
2. A compiler is used to translate them. What does it do?
A compiler translates a high-level language into machine code; each high-level program instruction translates into many machine code instructions.
3. What is an advantage of writing a program using Pascal or Python compared to writing the same program in assembly code?
High-level code is closer to human language, so it is easier to write, read and debug, and it is not tied to the instruction set of one processor.
Let's summarize

Editor | Allows the programmer to enter and edit the high-level language source code.
Compiler | Converts the source code into executable object code (machine code). Once compiled, a program can be run at any time.
Interpreter | Processes each line of the source code and executes it as it goes, without producing an object file. The process is repeated each time the program needs to be run.
Debugger | A program which helps to track down and identify errors in a program.
https://www.youtube.com/watch?v=CTUDhrsy6f0