ACD Unit-2 Part-1
1. Introduction
COMPILERS
Contents :
1. Compiler Introduction
3. Compiler vs Interpreter
4. Phases of Compilation
Language processing system
High-Level Language –
A high-level language (HLL) program may contain #define or #include directives. High-level languages are closer to humans but far from machines. Lines that begin with # are called preprocessor directives; they direct the pre-processor about what to do.
Pre-Processor –
The pre-processor removes all the #include directives by including the necessary files, and all the #define directives by macro expansion. In other words, it performs file inclusion and macro-processing.
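As a minimal sketch of what the pre-processor does, the C fragment below uses one #include and two #define directives. The names PI, SQUARE, circle_area and print_area are illustrative assumptions, not part of these notes. After preprocessing, the #include line is replaced by the contents of the header and every macro use is replaced by its expansion, so the compiler proper never sees a # directive.

```c
#include <stdio.h>   /* replaced by the contents of stdio.h */

#define PI 3.14159              /* object-like macro: plain text substitution */
#define SQUARE(x) ((x) * (x))   /* function-like macro: expanded with its argument */

/* Hypothetical helper. After macro expansion the return statement reads:
   return 3.14159 * ((r) * (r)); */
double circle_area(double r)
{
    return PI * SQUARE(r);
}

/* printf is declared in the header pulled in by #include. */
void print_area(double r)
{
    printf("area = %f\n", circle_area(r));
}
```

With GCC or Clang, the -E option stops after preprocessing and prints the expanded source, which is a convenient way to inspect the result.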
Compiler –
A compiler is software that converts a program written in a high-level language (Source Language) to a low-level language (Object/Target/Machine Language).
Assembly Language program –
It is neither in binary form nor high level. It is a combination of machine instructions and some other useful data needed for execution.
Compiler vs Interpreter
Compiler: takes a large amount of time to analyze the entire source code, but the overall execution time of the program is comparatively faster.
Interpreter: takes less time to analyze the source code, but the overall execution time of the program is slower.
Phases of compiler
⮚ Lexical analysis
⮚ Syntax analysis
⮚ Semantic analysis
⮚ Intermediate code generation
⮚ Code optimization
⮚ Code generation
Phases of a Compiler
Source Program → Lexical Analyzer → Syntax Analyzer → Semantic Analyzer → Intermediate Code Generator → Code Optimizer → Code Generator → Target Program
Analysis phase : Lexical analyzer
Ex: Tokens for the statement newval := oldval + 12
newval    identifier
:=        assignment operator
oldval    identifier
+         add operator
12        constant
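The token list for newval := oldval + 12 can be reproduced with a small hand-written scanner. The sketch below is only an illustration: the names TokenKind, put and next_token are assumptions, and it recognizes just the four token classes of the example (a real lexical analyzer handles many more).

```c
#include <ctype.h>
#include <stddef.h>

/* Token classes from the example, plus end-of-input and error. */
typedef enum {
    TOK_IDENTIFIER, TOK_ASSIGN, TOK_ADD, TOK_CONSTANT, TOK_END, TOK_ERROR
} TokenKind;

/* Appends c to lexeme if there is room, always leaving space for '\0'. */
static void put(char *lexeme, size_t cap, size_t *n, char c)
{
    if (*n + 1 < cap) lexeme[(*n)++] = c;
}

/* Reads one token starting at *src, stores its text in lexeme and
   advances *src past the token. */
TokenKind next_token(const char **src, char *lexeme, size_t cap)
{
    const char *p = *src;
    size_t n = 0;
    TokenKind kind;

    while (isspace((unsigned char)*p)) p++;          /* skip blanks */

    if (*p == '\0') {
        kind = TOK_END;
    } else if (isalpha((unsigned char)*p)) {         /* letter (letter|digit)* */
        while (isalnum((unsigned char)*p)) put(lexeme, cap, &n, *p++);
        kind = TOK_IDENTIFIER;
    } else if (isdigit((unsigned char)*p)) {         /* digit+ */
        while (isdigit((unsigned char)*p)) put(lexeme, cap, &n, *p++);
        kind = TOK_CONSTANT;
    } else if (p[0] == ':' && p[1] == '=') {
        put(lexeme, cap, &n, ':');
        put(lexeme, cap, &n, '=');
        p += 2;
        kind = TOK_ASSIGN;
    } else if (*p == '+') {
        put(lexeme, cap, &n, *p++);
        kind = TOK_ADD;
    } else {                                         /* invalid character */
        put(lexeme, cap, &n, *p++);
        kind = TOK_ERROR;
    }

    if (cap > 0) lexeme[n] = '\0';
    *src = p;
    return kind;
}
```

Calling next_token repeatedly on the example statement yields identifier, assignment operator, identifier, add operator, constant, and then TOK_END.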
Semantic Analysis :-
The semantic analyzer uses the syntax tree of the previous phase along with the symbol table to verify that the given source code is semantically consistent.
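One typical consistency check is type checking. The sketch below shows the idea for an addition followed by an assignment. The Type enum, the function names, and the rule that an integer may be stored in a real variable are all assumptions made for illustration; each language defines its own rules.

```c
/* Hypothetical set of types known to the checker. */
typedef enum { TYPE_INT, TYPE_REAL, TYPE_STRING, TYPE_ERROR } Type;

/* Result type of a + b under the assumed rules: only numeric operands
   are allowed, and mixing int with real gives real. */
Type type_of_add(Type a, Type b)
{
    int a_numeric = (a == TYPE_INT || a == TYPE_REAL);
    int b_numeric = (b == TYPE_INT || b == TYPE_REAL);

    if (!a_numeric || !b_numeric) return TYPE_ERROR;   /* type mismatch */
    if (a == TYPE_INT && b == TYPE_INT) return TYPE_INT;
    return TYPE_REAL;
}

/* Returns 1 if a value of type 'value' may be stored in a variable of
   type 'target', 0 otherwise. */
int assignment_ok(Type target, Type value)
{
    if (value == TYPE_ERROR) return 0;
    if (target == value) return 1;
    return target == TYPE_REAL && value == TYPE_INT;   /* assumed widening */
}
```

For newval := oldval + 12 with both variables declared as integers, type_of_add gives TYPE_INT and assignment_ok accepts the statement; the types themselves would come from the symbol table.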
Code Optimization :-
This is an optional phase designed to improve the intermediate code so that the output runs faster and takes less space.
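Constant folding is one simple example of such an improvement: an operation whose operands are both known at compile time is replaced by its result. The sketch below applies it to a single intermediate instruction. The Instr layout and the name fold_constants are assumptions for illustration, not a fixed intermediate-code format.

```c
/* One intermediate instruction: result = left op right.
   Hypothetical layout; an operand is either a constant value or,
   when the flag is 0, the index of some variable or temporary. */
typedef struct {
    char op;              /* '+', '*', or '=' for a plain copy of left */
    int left_is_const;
    int left;
    int right_is_const;
    int right;            /* ignored when op is '=' */
} Instr;

/* Folds the instruction in place when both operands are constants.
   Returns 1 if the instruction was simplified, 0 if left unchanged. */
int fold_constants(Instr *ins)
{
    int value;

    if (!ins->left_is_const || !ins->right_is_const) return 0;

    if (ins->op == '+')      value = ins->left + ins->right;
    else if (ins->op == '*') value = ins->left * ins->right;
    else                     return 0;

    ins->op = '=';           /* result = value */
    ins->left = value;
    return 1;
}
```

An instruction such as t1 = 2 + 3 becomes t1 = 5, so the addition no longer has to be executed at run time.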
Code Generation:-
The last phase of translation is code generation. A number of optimizations to reduce the length of the machine language program are carried out during this phase. The output of the code generator is
Symbol table
This is the portion of the compiler that stores the names used by the program and records essential information about each. The data structure used to record this information is called a ‘Symbol Table’.
A symbol table contains a record for each identifier, constant and label, with fields for its attributes.
This component makes it easier for the compiler to search for an identifier record and retrieve it quickly.
The symbol table also helps with scope management.
The symbol table and the error handler interact with all the phases, and the symbol table is updated correspondingly.
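A hash table is one common way to get the quick search described above. The sketch below is a minimal fixed-size version with linear probing. The names, the sizes, and the choice of attributes (a type string and a scope level) are assumptions for illustration; names longer than NAME_LEN - 1 characters are truncated, and real scope management needs more than a stored level.

```c
#include <string.h>

#define TABLE_SIZE 64
#define NAME_LEN   32

/* One record per name, with a few illustrative attribute fields. */
typedef struct {
    char name[NAME_LEN];
    char type[16];
    int  scope_level;
    int  in_use;
} Symbol;

typedef struct { Symbol slots[TABLE_SIZE]; } SymbolTable;

static unsigned hash_name(const char *s)
{
    unsigned h = 5381;
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h % TABLE_SIZE;
}

void symtab_init(SymbolTable *t) { memset(t, 0, sizeof *t); }

/* Returns the record for name, or NULL if it has not been entered. */
Symbol *symtab_lookup(SymbolTable *t, const char *name)
{
    unsigned start = hash_name(name);
    for (unsigned i = 0; i < TABLE_SIZE; i++) {
        Symbol *s = &t->slots[(start + i) % TABLE_SIZE];
        if (!s->in_use) return NULL;              /* empty slot ends the probe */
        if (strcmp(s->name, name) == 0) return s;
    }
    return NULL;
}

/* Enters name with its attributes; returns the existing record if the
   name is already present, or NULL if the table is full. */
Symbol *symtab_insert(SymbolTable *t, const char *name,
                      const char *type, int scope_level)
{
    unsigned start = hash_name(name);
    for (unsigned i = 0; i < TABLE_SIZE; i++) {
        Symbol *s = &t->slots[(start + i) % TABLE_SIZE];
        if (s->in_use && strcmp(s->name, name) == 0) return s;
        if (!s->in_use) {
            /* slots are zero-filled, so the copies stay terminated */
            strncpy(s->name, name, NAME_LEN - 1);
            strncpy(s->type, type, sizeof s->type - 1);
            s->scope_level = scope_level;
            s->in_use = 1;
            return s;
        }
    }
    return NULL;
}
```

The lexical analyzer would typically call symtab_insert when it first sees an identifier, and later phases would call symtab_lookup to read or update its attributes.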
Error handler
It is invoked when an error in the source program is detected.
One of the most important functions of a compiler is the detection and reporting of errors in the source program. The error message should allow the programmer to determine exactly where the errors have occurred. Errors may occur in any or all of the phases of a compiler.
The most common errors are invalid character sequences in scanning (lexical analysis), invalid token sequences in parsing (syntax analysis), and scope errors and type mismatches in semantic analysis.
After finding errors, each phase needs to deal with them so that the compilation process can continue.
The errors are reported to the error handler, which handles them so that compilation can proceed.
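A minimal sketch of such an error handler is shown below: any phase can report an error with the line number where it occurred, and the handler formats the message and keeps a count so that compilation can continue and still be marked as failed at the end. The names Phase, ErrorHandler, report_error and compilation_failed, and the message format, are assumptions for illustration.

```c
#include <stdio.h>

/* Phases that may report errors in this sketch. */
typedef enum { PHASE_LEXICAL, PHASE_SYNTAX, PHASE_SEMANTIC } Phase;

typedef struct { int count; } ErrorHandler;

/* Records one error and formats "line N: <phase> error: <message>"
   into buf. Returns the value of snprintf. */
int report_error(ErrorHandler *h, Phase phase, int line,
                 const char *message, char *buf, size_t cap)
{
    static const char *names[] = { "lexical", "syntax", "semantic" };

    h->count++;
    return snprintf(buf, cap, "line %d: %s error: %s",
                    line, names[phase], message);
}

/* Non-zero if any phase reported an error. */
int compilation_failed(const ErrorHandler *h)
{
    return h->count > 0;
}
```

Because the handler only records and formats the error, the reporting phase is free to skip the offending input and carry on, which lets one run report several errors.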
LEX...
Rules:
Each rule in a LEX program consists of two parts:
1. The pattern to be matched
2. The corresponding action to be executed
Patterns are defined using regular expressions, and actions are specified using C code.
The Rules can be given as
R1 {Action1}
R2 {Action2}
.
.
.
Rn {Action n}
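The pattern/action idea can be imitated in plain C to show how a generated scanner uses its rules. In the sketch below, which is not Lex output, each pattern is a hypothetical matcher function standing in for a regular expression and each action is a C function. As in lex, the longest match wins, and when two rules match the same length the earlier rule is chosen.

```c
#include <ctype.h>
#include <stddef.h>

typedef struct { int identifiers, numbers, others; } Counts;

typedef size_t (*Pattern)(const char *s);  /* length matched at s, 0 if none */
typedef void   (*Action)(Counts *c);

typedef struct { Pattern pattern; Action action; } Rule;

static size_t match_identifier(const char *s)
{
    size_t n = 0;
    if (!isalpha((unsigned char)s[0])) return 0;
    while (isalnum((unsigned char)s[n])) n++;
    return n;
}
static size_t match_number(const char *s)
{
    size_t n = 0;
    while (isdigit((unsigned char)s[n])) n++;
    return n;
}
static size_t match_blank(const char *s)
{
    size_t n = 0;
    while (isspace((unsigned char)s[n])) n++;
    return n;
}
static size_t match_any(const char *s) { return s[0] ? 1 : 0; }

static void count_identifier(Counts *c) { c->identifiers++; }
static void count_number(Counts *c)     { c->numbers++; }
static void count_other(Counts *c)      { c->others++; }
static void ignore(Counts *c)           { (void)c; }

/* R1 {Action1} ... Rn {Action n}, in priority order. */
static const Rule rules[] = {
    { match_identifier, count_identifier },
    { match_number,     count_number },
    { match_blank,      ignore },
    { match_any,        count_other },
};

void scan(const char *input, Counts *c)
{
    while (*input) {
        size_t best = 0, best_len = 0;
        for (size_t i = 0; i < sizeof rules / sizeof rules[0]; i++) {
            size_t len = rules[i].pattern(input);
            if (len > best_len) { best_len = len; best = i; } /* ties keep the earlier rule */
        }
        rules[best].action(c);   /* match_any guarantees best_len >= 1 */
        input += best_len;
    }
}
```

Scanning newval := oldval + 12 with these rules counts two identifiers, one number, and three other characters (':', '=' and '+').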
Note: The function yywrap() is called by lex when input is exhausted. When the end of the file is reached, the return value of yywrap() is checked: if it is non-zero, scanning terminates; if it is 0, scanning continues with the next input file.
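The control flow around yywrap() can be imitated in plain C without Lex. In the sketch below, strings stand in for input files, and the names Scanner, scanner_wrap and scan_all are hypothetical. scanner_wrap plays the role of yywrap(): it returns 0 after switching to the next input, and non-zero when nothing is left.

```c
#include <stddef.h>

typedef struct {
    const char **inputs;   /* strings standing in for input files */
    size_t count;          /* number of inputs */
    size_t current;        /* index of the input being scanned */
    const char *pos;       /* current position in that input */
} Scanner;

/* Stand-in for yywrap(): called when the current input is exhausted. */
int scanner_wrap(Scanner *s)
{
    if (s->current + 1 >= s->count) return 1;   /* no more input: stop */
    s->current++;
    s->pos = s->inputs[s->current];             /* continue with next input */
    return 0;
}

/* Counts the characters of all inputs, asking scanner_wrap what to do
   each time the end of an input is reached. */
size_t scan_all(Scanner *s)
{
    size_t total = 0;

    if (s->count == 0) return 0;
    s->current = 0;
    s->pos = s->inputs[0];

    for (;;) {
        while (*s->pos) { total++; s->pos++; }
        if (scanner_wrap(s) != 0) break;        /* non-zero: scanning ends */
    }
    return total;
}
```

In an actual Lex program with a single input, the usual definition is simply a yywrap() that returns 1.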
Lex program to count tokens in a source program: