Compiler Construction Design Phases
Outline
a) What are the Phases of Compiler Design?
b) Phase 1: Lexical Analysis
c) Phase 2: Syntax Analysis
d) Phase 3: Semantic Analysis
e) Phase 4: Intermediate Code Generation
f) Phase 5: Code Optimization
g) Phase 6: Code Generation
h) Symbol Table Management
i) Error Handling Routine
Together, these phases transform the source program step by step: it is divided into
tokens, arranged into parse trees, and then optimized before object code is produced.
Phase 1: Lexical Analysis
Lexical analysis is the first phase, in which the compiler scans the source code. The
scan proceeds left to right, character by character, and groups the characters into tokens.
Here, the character stream from the source program is grouped into meaningful
sequences by identifying the tokens. The lexical analyzer enters the corresponding
tokens into the symbol table and passes each token to the next phase.
Example:
x = y + 10
Tokens:
x     identifier
=     assignment operator
y     identifier
+     addition operator
10    number
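The grouping of characters into tokens can be sketched in a few lines of Python. This is only a minimal illustration, not a production scanner: the token names (NUMBER, IDENTIFIER, ASSIGN, PLUS) and their regular expressions are assumptions chosen to match the example above.

```python
import re

# Illustrative token classes (assumed names); order matters when patterns overlap.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("ASSIGN", r"="),
    ("PLUS", r"\+"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Scan left to right and return a list of (token class, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            # An invalid character sequence is a lexical error.
            raise SyntaxError(f"invalid character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":  # whitespace separates tokens but is not one
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("x = y + 10"))
```

Calling tokenize("x = y + 10") yields the same five tokens as the table: two identifiers, the assignment operator, the addition operator, and a number.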
Phase 2: Syntax Analysis
Syntax analysis is all about discovering structure in text.
Example
Any identifier or number is an expression.
If x is an identifier and y + 10 is an expression, then x = y + 10 is a statement.
Consider the parse tree for the following example:
(a+b)*c
In the parse tree:
a. Interior node: a record with an operator field and two fields for its children
b. Leaf: a record with two or more fields, one for the token and the others for
information about the token
c. Ensure that the components of the program fit together meaningfully
d. Gathers type information and checks for type compatibility
e. Checks that operands are permitted by the source language
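The interior-node and leaf records described above can be sketched with a small recursive-descent parser. This is an illustration under assumptions: the class names Interior and Leaf, the function parse, and the tiny grammar (only +, * and parentheses) are hypothetical, and the result is the condensed syntax tree rather than a full parse tree with every grammar symbol.

```python
import re
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    token: str   # token class, e.g. "identifier" or "number"
    value: str   # information about the token (here just its lexeme)


@dataclass
class Interior:
    operator: str
    left: "Node"
    right: "Node"


Node = Union[Leaf, Interior]


def parse(text):
    # Simplification: characters outside this pattern are silently skipped.
    tokens = re.findall(r"\d+|[A-Za-z_]\w*|[()+*]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def expr():    # expr -> term ('+' term)*
        node = term()
        while peek() == "+":
            advance()
            node = Interior("+", node, term())
        return node

    def term():    # term -> factor ('*' factor)*
        node = factor()
        while peek() == "*":
            advance()
            node = Interior("*", node, factor())
        return node

    def factor():  # factor -> identifier | number | '(' expr ')'
        tok = peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        advance()
        if tok == "(":
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        if tok.isdigit():
            return Leaf("number", tok)
        if tok[0].isalpha() or tok[0] == "_":
            return Leaf("identifier", tok)
        raise SyntaxError(f"unexpected token {tok!r}")

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


print(parse("(a+b)*c"))
```

For (a+b)*c the root is an interior node for *, whose left child is the interior node for + (with leaves a and b) and whose right child is the leaf c.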
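Phase 3: Semantic Analysis
Semantic analysis checks the semantic consistency of the code. Typical semantic errors
include a function called with improper arguments, an undeclared variable, etc.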
Functions of the semantic analysis phase are:
a. Stores the type information gathered, saving it in the symbol table or
syntax tree
b. Performs type checking
c. Reports a semantic error on a type mismatch for which no type conversion rule
satisfies the desired operation
d. Collects type information and checks for type compatibility
e. Checks whether the source language permits the operands
Example
float x = 20.2;
float y = x*30;
In the above code, the semantic analyzer will typecast the integer 30 to the float 30.0
before the multiplication.
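The typecast step can be sketched as a small type-checking helper. This is a simplified illustration: the functions coerce and check_binary, the two-type universe (int and float), and the symbol_table dictionary are all assumptions made for the example, not part of any particular compiler.

```python
NUMERIC = {"int", "float"}


def coerce(expr, from_type, to_type):
    """Wrap expr in a conversion if its type differs from the required type."""
    if from_type == to_type:
        return expr
    if from_type == "int" and to_type == "float":
        return f"int_to_float({expr})"
    raise TypeError(f"cannot convert {from_type} to {to_type}")


def check_binary(op, left, right):
    """left and right are (expression text, type name) pairs."""
    (l_expr, l_type), (r_expr, r_type) = left, right
    if l_type not in NUMERIC or r_type not in NUMERIC:
        # The source language does not permit these operands: a semantic error.
        raise TypeError(f"operator {op!r} not permitted for {l_type} and {r_type}")
    # If either operand is a float, the whole operation is carried out in float.
    result_type = "float" if "float" in (l_type, r_type) else "int"
    l_expr = coerce(l_expr, l_type, result_type)
    r_expr = coerce(r_expr, r_type, result_type)
    return f"{l_expr} {op} {r_expr}", result_type


symbol_table = {"x": "float"}  # hypothetical: filled in when x was declared
print(check_binary("*", ("x", symbol_table["x"]), ("30", "int")))
```

For x*30 the helper returns the expression x * int_to_float(30) with result type float, which mirrors the conversion of 30 to 30.0 described above.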
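Phase 4: Intermediate Code Generation
Once the semantic analysis phase is over, the compiler generates intermediate code for
the target machine.
Example
total = count + rate * 5
Intermediate code using the three-address code method is: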
t1 := int_to_float(5)
t2 := rate * t1
t3 := count + t2
total := t3
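The translation into three-address code can be sketched as a walk over the expression tree that introduces one temporary per operation. As a shortcut, this sketch borrows Python's own ast module to parse the statement instead of using a hand-written parser. The function name to_three_address and the float_names set, which stands in for the symbol table, are assumptions for illustration.

```python
import ast

OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}


def to_three_address(statement, float_names):
    """Translate one assignment into a list of three-address instructions."""
    assign = ast.parse(statement).body[0]
    code = []
    counter = 0

    def emit(expr_text):
        nonlocal counter
        counter += 1
        temp = f"t{counter}"
        code.append(f"{temp} := {expr_text}")
        return temp

    def visit(node):
        """Return (operand name, is_float) for a subexpression."""
        if isinstance(node, ast.Name):
            return node.id, node.id in float_names
        if isinstance(node, ast.Constant):
            return repr(node.value), isinstance(node.value, float)
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            left, left_float = visit(node.left)
            right, right_float = visit(node.right)
            # Mixed int/float operands: convert the int side first.
            if left_float and not right_float:
                right = emit(f"int_to_float({right})")
            elif right_float and not left_float:
                left = emit(f"int_to_float({left})")
            temp = emit(f"{left} {OPS[type(node.op)]} {right}")
            return temp, left_float or right_float
        raise NotImplementedError(type(node).__name__)

    value, _ = visit(assign.value)
    code.append(f"{assign.targets[0].id} := {value}")
    return code


for line in to_three_address("total = count + rate * 5", {"total", "count", "rate"}):
    print(line)
```

Assuming total, count and rate are floats, this prints the same four instructions as the example: the conversion of 5, the multiplication, the addition, and the final assignment to total.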
Phase 5: Code Optimization
The next phase is code optimization, which works on the intermediate code. This phase
removes unnecessary lines of code and rearranges the sequence of statements to speed up
execution of the program without wasting resources. Its main goal is to improve the
intermediate code so that the generated code runs faster and occupies less space.
Example:
Consider the following code:
a = int_to_float(10)
b = c * a
d = e + b
f = d
This can become:
b = c * 10.0
f = e + b
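The rewrite in this example can be sketched as two small passes over the instruction list: folding and propagating the constant, then forwarding a copy. This is a toy illustration, not a general optimizer. The tuple format (dest, op, arg1, arg2), the function name optimize, and the temporaries set are assumptions. It also assumes each temporary is assigned once and is not needed after this block.

```python
def optimize(code, temporaries):
    """code is a list of (dest, op, arg1, arg2) tuples; arg2 is None for unary ops."""
    # Pass 1: fold int_to_float(constant) and propagate the resulting constant.
    constants = {}
    folded = []
    for dest, op, a1, a2 in code:
        a1 = constants.get(a1, a1)
        a2 = constants.get(a2, a2)
        if op == "int_to_float" and a1.isdigit():
            value = str(float(int(a1)))  # "10" becomes "10.0" at compile time
            if dest in temporaries:
                constants[dest] = value  # later uses read the constant directly
                continue                 # the temporary's definition is dropped
            folded.append((dest, "copy", value, None))
        else:
            folded.append((dest, op, a1, a2))

    # Pass 2: if a temporary is computed and immediately copied, write the
    # result straight into the copy's destination instead.
    result = []
    for dest, op, a1, a2 in folded:
        if op == "copy" and result and result[-1][0] == a1 and a1 in temporaries:
            _, prev_op, p1, p2 = result.pop()
            result.append((dest, prev_op, p1, p2))
        else:
            result.append((dest, op, a1, a2))
    return result


before = [
    ("a", "int_to_float", "10", None),
    ("b", "*", "c", "a"),
    ("d", "+", "e", "b"),
    ("f", "copy", "d", None),
]
print(optimize(before, temporaries={"a", "d"}))
```

With a and d treated as temporaries, the four instructions shrink to two: b = c * 10.0 and f = e + b.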
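Phase 6: Code Generation
The code generation phase gets its input from the code optimization phase and produces
the object code as a result.
Example:
a = b + 60.0
This could be translated to register instructions such as:
MOV b, R1
MOV 60.0, R2
ADD R1, R2
MOV R2, a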
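A code generator for a single statement of this shape can be sketched as a template that loads both operands into registers, applies the operation, and stores the result. The sketch assumes a hypothetical two-register machine, the convention MOV source, destination, and that ADD R1, R2 leaves the sum in R2. The function name generate is also an assumption, and only commutative operators are handled so that operand order does not matter.

```python
MNEMONICS = {"+": "ADD", "*": "MUL"}  # commutative operators only, for simplicity


def generate(dest, left, op, right):
    """Emit instructions for dest = left op right on the assumed machine."""
    return [
        f"MOV {left}, R1",           # load the first operand
        f"MOV {right}, R2",          # load the second operand
        f"{MNEMONICS[op]} R1, R2",   # R2 := R2 op R1
        f"MOV R2, {dest}",           # store the result in the target variable
    ]


for instruction in generate("a", "b", "+", "60.0"):
    print(instruction)
```

For a = b + 60.0 this loads b and 60.0, adds them, and stores the sum in a. A real code generator would also choose registers and instructions based on the target machine.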
Error Handling Routine
The most common errors are invalid character sequences in scanning, invalid token
sequences in parsing, and type and scope errors in semantic analysis.
An error may be encountered in any of the above phases. After finding an error, the
phase must deal with it so that compilation can continue. Errors are reported to the
error handler, which handles them so that the compilation process can proceed.
Generally, errors are reported in the form of messages.
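The idea of reporting errors to a handler and then continuing can be sketched as follows. This is a minimal illustration: the ErrorHandler class, its report method, the message format, and the toy scan phase with its set of allowed characters are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class ErrorHandler:
    messages: list = field(default_factory=list)

    def report(self, phase, line, text):
        """Record an error as a message; the calling phase keeps running."""
        self.messages.append(f"line {line}: {phase} error: {text}")


def scan(source, errors):
    """Toy scanning phase: reports invalid characters, drops them, and continues."""
    kept = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for ch in line:
            if ch.isalnum() or ch.isspace() or ch in "=+*()._":
                kept.append(ch)
            else:
                errors.report("lexical", line_no, f"invalid character {ch!r}")
        kept.append("\n")
    return "".join(kept)


errors = ErrorHandler()
cleaned = scan("x = y $ 10\nz = @ 1", errors)
for message in errors.messages:
    print(message)
```

Both invalid characters are reported in a single run because the scanner recovers by skipping the offending character rather than stopping at the first error.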
Summary
a. A compiler operates in phases; each phase transforms the source program
from one representation to another
b. The six phases of compiler design are 1) lexical analysis, 2) syntax analysis,
3) semantic analysis, 4) intermediate code generation, 5) code optimization, and
6) code generation
c. Lexical analysis is the first phase, in which the compiler scans the source code
d. Syntax analysis is all about discovering structure in text
e. Semantic analysis checks the semantic consistency of the code
f. Once the semantic analysis phase is over, the compiler generates intermediate
code for the target machine
g. The code optimization phase removes unnecessary lines of code and rearranges the
sequence of statements
h. The code generation phase gets its input from the code optimization phase and
produces the target code or object code as a result
i. A symbol table contains a record for each identifier, with fields for the
attributes of the identifier
j. The error handling routine handles and reports errors during many phases