Midsem
UNIT 1
Pre-Processor:
Compiler Design 1
It expands all the #define directives using macro expansion. It performs file
inclusion, macro processing, expansion of shorthand operators, etc.
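The macro-expansion step can be sketched roughly as follows. This is a toy illustration only: it assumes object-like directives of the form "#define NAME replacement", and the function name expand_macros is hypothetical. A real preprocessor also handles file inclusion, function-like macros, and conditionals.

```python
import re

def expand_macros(source):
    """Toy preprocessor: handles only object-like '#define NAME replacement' lines."""
    macros = {}
    output = []
    for line in source.splitlines():
        parts = line.strip().split(None, 2)  # ['#define', NAME, replacement]
        if parts and parts[0] == "#define" and len(parts) >= 2:
            macros[parts[1]] = parts[2] if len(parts) == 3 else ""
            continue  # directive lines do not appear in the output
        # Replace whole-word occurrences of every macro defined so far.
        for name, value in macros.items():
            pattern = r"\b" + re.escape(name) + r"\b"
            line = re.sub(pattern, lambda _m, v=value: v, line)
        output.append(line)
    return "\n".join(output)

print(expand_macros("#define PI 3.14\narea = PI * r * r"))
```

The word-boundary match keeps a macro such as MAX from being expanded inside a longer identifier such as MAXIMUM.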
Assembly Language
Assembler
They are not universal: each platform has its own assembler.
Compiler:
Compiler Design
Synthesis phase creates an equivalent target program from the intermediate
representation.
The addresses within the program are assigned in such a way that they stay valid
when the program is moved in memory (relocatable code).
In the given picture, “I” is the memory location which is modifiable (that is, if
the location changes, I is automatically shifted/relocated).
Loader/Linker:
The linker combines a variety of object files into a single file to make it
executable. The loader then loads it into memory and executes it.
Pass: A pass is one traversal of the compiler through the entire program.
Phase: Phase of a compiler is a distinguishable stage, which takes input from the
previous stage, processes and yields output that can be used as input for the next
stage. A pass can have more than one phase.
Compiler Passes
A pass is a complete traversal of the source program. A compiler can traverse the
source program in two ways: in multiple passes or in a single pass.
Multi-pass Compiler
In the first pass, the compiler reads the source program, scans it, extracts the
tokens, and stores the result in an output file.
In the second pass, the compiler reads the output file produced by the first pass,
builds the syntax tree, and performs the syntactic analysis. The output of this
pass is a file that contains the syntax tree.
In the third pass, the compiler reads the output file produced by the second pass
and checks whether the tree follows the rules of the language. The output of the
semantic analysis phase is the annotated syntax tree.
One-pass Compiler
A one-pass compiler passes only once through the parts of each compilation
unit.
In a one-pass compiler, when a source line is processed, it is scanned and
the tokens are extracted.
Then the syntax of the line is analyzed and the tree structure is built. After
the semantic part, the code is generated.
The same process is repeated for each line of code until the entire program is
compiled.
PHASES OF COMPILER :
1. LEXICAL ANALYZER
Tokens are defined by regular expressions which are understood by the lexical
analyser.
It reads the source program one character at a time and converts it into
meaningful lexemes.
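A minimal sketch of a lexical analyser driven by regular expressions is shown below. The token classes (NUMBER, ID, OP, and so on) and the function name tokenize are assumptions made for illustration. They are not taken from any particular language or tool.

```python
import re

# Hypothetical token classes for a tiny language; each is defined by a regular expression.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ID",     r"[A-Za-z_]\w*"),
    ("OP",     r"[+\-*/=]"),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("SKIP",   r"\s+"),  # whitespace is matched but not emitted
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(text):
    """Scan text left to right and return a list of (token class, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"lexical error at position {pos}: {text[pos]!r}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

print(tokenize("x = 42 + y"))
```

A character that matches no pattern (for example "$") is reported as a lexical error.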
2. SYNTAX ANALYSIS
This phase takes the stream of tokens generated by the lexical analysis
phase and checks whether they conform to the grammar of the
programming language.
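As a rough illustration of a grammar check, here is a small recursive-descent recognizer. The expression grammar and the (kind, text) token format are assumptions chosen for this sketch. Real parsers are usually generated from a full grammar and also build a syntax tree.

```python
def conforms(tokens):
    """Return True if the list of (kind, text) tuples matches this assumed grammar:
         expr   -> term (('+' | '-') term)*
         term   -> factor (('*' | '/') factor)*
         factor -> NUMBER | ID | '(' expr ')'
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", "")

    def advance():
        nonlocal pos
        pos += 1

    def expr():
        term()
        while peek() in (("OP", "+"), ("OP", "-")):
            advance()
            term()

    def term():
        factor()
        while peek() in (("OP", "*"), ("OP", "/")):
            advance()
            factor()

    def factor():
        kind, text = peek()
        if kind in ("NUMBER", "ID"):
            advance()
        elif kind == "LPAREN":
            advance()
            expr()
            if peek()[0] != "RPAREN":
                raise SyntaxError("expected ')'")
            advance()
        else:
            raise SyntaxError(f"unexpected token {text!r}")

    try:
        expr()
        if peek()[0] != "EOF":
            raise SyntaxError("trailing tokens")
        return True
    except SyntaxError:
        return False

print(conforms([("ID", "a"), ("OP", "+"), ("NUMBER", "2")]))
print(conforms([("ID", "a"), ("OP", "+")]))
```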
3. SEMANTIC ANALYSIS
This phase checks whether the code is semantically correct, i.e., whether it
conforms to the language’s type system and other semantic rules.
The compiler checks the meaning of the source code to ensure that it makes
sense.
The compiler performs type checking, which ensures that variables are used
correctly and that operations are performed on compatible data types.
The compiler also checks for other semantic errors, such as undeclared variables
and incorrect function calls.
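The two checks mentioned above (undeclared variables and incompatible types) can be sketched like this. The statement format ('declare' and 'assign' tuples) and the function name check are hypothetical, chosen only to keep the example short.

```python
def check(statements):
    """statements: list of ('declare', name, type) or ('assign', name, value_type).
    Returns a list of semantic error messages (empty if none)."""
    declared = {}
    errors = []
    for stmt in statements:
        if stmt[0] == "declare":
            _, name, type_name = stmt
            declared[name] = type_name
        elif stmt[0] == "assign":
            _, name, value_type = stmt
            if name not in declared:
                errors.append(f"undeclared variable '{name}'")
            elif declared[name] != value_type:
                errors.append(
                    f"type mismatch: cannot assign {value_type} to {declared[name]} '{name}'"
                )
    return errors

program = [
    ("declare", "x", "int"),
    ("assign", "x", "string"),  # wrong type
    ("assign", "y", "int"),     # never declared
]
for message in check(program):
    print(message)
```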
4. INTERMEDIATE CODE
Up to intermediate code generation, the phases are the same regardless of the
target platform; after that, they depend on the platform.
To build a new compiler, we don't need to start from scratch: we can take the
intermediate code from an already existing compiler and build only the last two
phases.
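One common form of intermediate code is three-address code. The sketch below assumes the expression is already available as a nested tuple (op, left, right). The temporary names t1, t2, ... and the function name to_three_address are illustrative assumptions.

```python
def to_three_address(ast):
    """ast: a variable/constant given as a string, or a tuple (op, left, right).
    Returns (list of three-address instructions, name holding the result)."""
    code = []
    counter = 0

    def emit(node):
        nonlocal counter
        if isinstance(node, str):
            return node  # leaf: variable name or constant
        op, left, right = node
        left_place = emit(left)
        right_place = emit(right)
        counter += 1
        temp = f"t{counter}"
        code.append(f"{temp} = {left_place} {op} {right_place}")
        return temp

    result = emit(ast)
    return code, result

# a + b * c
code, result = to_three_address(("+", "a", ("*", "b", "c")))
print("\n".join(code))
```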
5. CODE OPTIMIZATION
It is an optional phase.
It improves the intermediate code so that the output program runs faster and
takes less space.
Code optimization transforms the code without altering its meaning.
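Constant folding is one simple example of such a transformation: the code changes, but the value it computes does not. The sketch below is only an illustration under assumed conventions (expressions as nested tuples, integer constants, and three operators). It is not a description of any specific compiler's optimizer.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold(node):
    """node: int constant, str variable name, or tuple (op, left, right).
    Sub-expressions made only of constants are evaluated at compile time."""
    if not isinstance(node, tuple):
        return node
    op, left, right = node
    left, right = fold(left), fold(right)
    if isinstance(left, int) and isinstance(right, int):
        return OPS[op](left, right)
    return (op, left, right)

# x + 2 * 3 becomes x + 6
print(fold(("+", "x", ("*", 2, 3))))
```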
6. CODE GENERATION
This phase takes the optimized intermediate code and generates the actual
machine code that can be executed by the target hardware.
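To give a feel for this phase, the sketch below translates three-address instructions into instructions for a hypothetical single-accumulator machine. The mnemonics LOAD, ADD, SUB, MUL, DIV, and STORE are invented for the example and do not correspond to a real instruction set.

```python
def generate(tac_lines):
    """Translate lines of the form 'dest = a op b' into instructions for a
    hypothetical one-register (accumulator) machine."""
    mnemonics = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}
    asm = []
    for line in tac_lines:
        dest, _, left, op, right = line.split()
        asm.append(f"LOAD {left}")              # accumulator = left
        asm.append(f"{mnemonics[op]} {right}")  # accumulator = accumulator op right
        asm.append(f"STORE {dest}")             # dest = accumulator
    return asm

for instruction in generate(["t1 = b * c", "t2 = a + t1"]):
    print(instruction)
```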
SYMBOL TABLE
Lexical analysis (LA) is the first phase to communicate with the symbol table;
the compiler begins building the symbol table during the lexical analysis phase.
Operations that can be performed on the symbol table are: insert,
lookup/search, modify, and delete.
Information stored in the symbol table about an identifier: name, type, scope,
size, and offset.
In general, during the first two phases, we store the information in the symbol
table and in memory, and in the later phases, we make use of the
information available in the symbol table.
Every phase of the compiler will be interacting with the symbol table.
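The four operations and the stored attributes listed above can be sketched with a dictionary-backed table. The class name SymbolTable and its method signatures are assumptions for illustration. Production compilers typically add nested scopes and hashing details that are omitted here.

```python
class SymbolTable:
    """Minimal symbol table keyed by identifier name."""

    def __init__(self):
        self._entries = {}

    def insert(self, name, type_name, scope, size, offset):
        if name in self._entries:
            raise KeyError(f"'{name}' is already declared")
        self._entries[name] = {
            "type": type_name, "scope": scope, "size": size, "offset": offset,
        }

    def lookup(self, name):
        # Returns the attribute record, or None if the name is unknown.
        return self._entries.get(name)

    def modify(self, name, **changes):
        self._entries[name].update(changes)

    def delete(self, name):
        del self._entries[name]

table = SymbolTable()
table.insert("count", "int", "global", 4, 0)
table.modify("count", offset=8)
print(table.lookup("count"))
```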
TYPES OF ERROR
Lexical Error: Happens when the compiler finds a word it doesn't know.
Syntax Error: Happens when the rules of the language are broken.
Semantic Error: Happens when the meaning of the sentence is wrong, even if
the sentence is written correctly.
Handling Errors:
The compiler finds errors, which are called exceptions. The programmer must
fix these exceptions.
During program execution, fatal errors can occur. These are serious, and the
system administrator must fix them.
Error Handler:
An error handler is a tool that helps the compiler keep going, even if errors
happen at different stages.
If no errors are found after the last stage (phase 3), the program is correct and
can be turned into an executable form.
If errors are still present after phase 3, they will be shown to the programmer.
LEXICAL ANALYZER
It reads the characters of the source program, groups them into lexically
meaningful units called lexemes, and produces as output tokens representing
these lexemes.
4. The input pattern is then converted into an NFA (nondeterministic finite
automaton).
5. The NFA is then converted into a DFA, and the DFA is minimized using one of
the available minimization methods.
6. The minimized DFA is used to recognize the patterns and break the input into
lexemes.
8. The tool then constructs a state table for the appropriate finite state machine
and creates program code which contains the table, the evaluation phases, and a
routine which uses them appropriately.
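The NFA-to-DFA conversion in step 5 is commonly done with the subset construction. A compact sketch follows, under these assumptions: the NFA is a dictionary from (state, symbol) to a set of states, and the symbol None stands for an epsilon move. Minimization is not shown. The function names nfa_to_dfa and dfa_accepts are hypothetical.

```python
from collections import deque

def nfa_to_dfa(nfa, start, accepting, alphabet):
    """Subset construction. Each DFA state is a frozenset of NFA states."""

    def closure(states):
        # Epsilon closure: everything reachable through None-labelled moves.
        stack, seen = list(states), set(states)
        while stack:
            state = stack.pop()
            for nxt in nfa.get((state, None), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return frozenset(seen)

    start_set = closure({start})
    transitions = {}
    queue, visited = deque([start_set]), {start_set}
    while queue:
        current = queue.popleft()
        for symbol in alphabet:
            moved = set()
            for state in current:
                moved |= set(nfa.get((state, symbol), ()))
            target = closure(moved)
            if not target:
                continue  # no move: treated as an implicit dead state
            transitions[(current, symbol)] = target
            if target not in visited:
                visited.add(target)
                queue.append(target)
    dfa_accepting = {s for s in visited if s & set(accepting)}
    return start_set, transitions, dfa_accepting

def dfa_accepts(dfa, text):
    state, transitions, accepting = dfa
    for ch in text:
        state = transitions.get((state, ch))
        if state is None:
            return False
    return state in accepting

# NFA for the pattern (a|b)*abb
nfa = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}, (2, "b"): {3}}
dfa = nfa_to_dfa(nfa, 0, {3}, "ab")
print(dfa_accepts(dfa, "aabb"), dfa_accepts(dfa, "ab"))
```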
For efficient design of compiler, various tools are used to automate the phases
of compiler.
LEX is a Unix utility which generates a lexical analyzer.
The generated analyzer scans the source program to produce a stream of tokens,
and these tokens can then be related together so that programming structures
such as expressions, block statements, control structures, and procedures can be
recognized.
LEX compiler
LEX specification file can be denoted using the extension .l (often pronounced
as dot L).
Example-
LEX specification file stores the regular expressions for the token.
In specification file, LEX actions are associated with every regular expression.
These actions are simply the pieces of C code that are directly carried over to
the lex.yy.c.
Generation of lexical analyzer using LEX
• Finally, the C compiler compiles the generated lex.yy.c and produces an object
program a.out.
• When an input stream is given to a.out, a sequence of tokens is
generated.
1. Declaration section
2. Rule section
R1 {action1}
R2 {action2}
Rn {actionn}
3. Auxiliary procedures section
All the procedures required by the actions in the rule section are defined
here.
a. main() function
b. yywrap() function