Compiler Design
The compiler writer, like any programmer, can profitably use software tools such as
debuggers, version managers, profilers and so on.
• Parser generators
• Scanner generators
• Dataflow engines
PARSER GENERATORS:
Eg: PIC, EQM
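SCANNER GENERATORS:
These produce lexical analysers, normally from a specification based on regular expressions.
SYNTAX-DIRECTED TRANSLATION ENGINES:
These produce intermediate code in three-address format, normally from input that is based
on the parse tree.
AUTOMATIC CODE GENERATORS:
• These take a collection of rules that define the translation of each operation of the
intermediate language into the machine language for the target machine.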
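The rule-driven translation described above can be sketched as follows. This is only an illustration: the instruction names (LOAD, ADD, STORE), the single register R0 and the tuple layout of the three-address code are assumptions made for the example, not the format of any particular tool.

```python
# Hypothetical rule table: one list of instruction templates per
# intermediate-language operator. The target instruction set is invented.
RULES = {
    "+": ["LOAD R0, {a}", "ADD R0, {b}", "STORE {dst}, R0"],
    "-": ["LOAD R0, {a}", "SUB R0, {b}", "STORE {dst}, R0"],
    "*": ["LOAD R0, {a}", "MUL R0, {b}", "STORE {dst}, R0"],
    "copy": ["LOAD R0, {a}", "STORE {dst}, R0"],
}


def generate(three_address_code):
    """Expand each (op, dst, a, b) tuple using the templates for its operator."""
    output = []
    for op, dst, a, b in three_address_code:
        for template in RULES[op]:
            output.append(template.format(dst=dst, a=a, b=b))
    return output


# a = b * c + d, already broken into three-address statements:
#   t1 = b * c
#   a  = t1 + d
program = [("*", "t1", "b", "c"), ("+", "a", "t1", "d")]
assembly = generate(program)
for line in assembly:
    print(line)
```

Retargeting this sketch to another machine would mean replacing only the rule table, which is the point of generating the code generator from rules.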
DATAFLOW ENGINES:
Much of the information needed to perform good code optimization involves “dataflow
analysis”, the gathering of information about how values are transmitted from one part of a
program to each other part.
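As one concrete instance of dataflow analysis, the sketch below computes live variables over a small flow graph by iterating to a fixed point. The block names, the use/def sets and the function name live_variables are assumptions chosen for the example. A real dataflow engine would accept many such analyses, not only this one.

```python
def live_variables(blocks, successors):
    """Iterative live-variable analysis.

    blocks:     block name -> (use, defs), where `use` holds variables read
                before any write in the block and `defs` holds variables written.
    successors: block name -> list of successor block names.
    Returns (live_in, live_out), each mapping block name -> set of variables.
    """
    live_in = {name: set() for name in blocks}
    live_out = {name: set() for name in blocks}
    changed = True
    while changed:  # repeat until no set grows any further
        changed = False
        for name, (use, defs) in blocks.items():
            out = set()
            for succ in successors.get(name, []):
                out |= live_in[succ]
            new_in = use | (out - defs)
            if out != live_out[name] or new_in != live_in[name]:
                live_out[name], live_in[name] = out, new_in
                changed = True
    return live_in, live_out


# B1: x = 1; y = 2      B2: z = x + y      B3: print(z)
blocks = {
    "B1": (set(), {"x", "y"}),
    "B2": ({"x", "y"}, {"z"}),
    "B3": ({"z"}, set()),
}
successors = {"B1": ["B2"], "B2": ["B3"]}
live_in, live_out = live_variables(blocks, successors)
print(live_out)
```

Here x and y are live on exit from B1 because B2 reads them, which is the kind of fact an optimizer uses, for example, to decide which values must stay in registers.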
Systems that help with the compiler-writing process are often referred to as:
• Compiler-compilers
• Compiler-generators
• Translator-writing systems
The main task of the lexical analyser is to read the input characters and produce as output a
sequence of tokens that the parser uses for syntax analysis.
[Figure: the lexical analyser passes tokens to the parser; the symbol table is also shown.]
• On receiving a “get next token” command from the parser, the lexical analyser reads input
characters until it can identify the next token.
1. One task is stripping out from the source program comments and white space in
the form of blank, tab and newline characters.
2. Another task is correlating error messages from the compiler with the source
program.
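A minimal sketch of these tasks, assuming a toy language in which a comment runs from '#' to the end of the line: the class below answers each "get next token" request, skips white space and comments, and records the line number so that an error message can be tied back to the source program. The class name, token kinds and operator set are illustrative choices only.

```python
class Lexer:
    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.line = 1  # kept up to date so errors can cite a source line

    def _skip_blanks_and_comments(self):
        while self.pos < len(self.text):
            ch = self.text[self.pos]
            if ch == "\n":
                self.line += 1
                self.pos += 1
            elif ch in " \t":
                self.pos += 1
            elif ch == "#":  # assumed comment syntax: '#' to end of line
                while self.pos < len(self.text) and self.text[self.pos] != "\n":
                    self.pos += 1
            else:
                break

    def get_next_token(self):
        """Return the next (kind, lexeme, line) triple."""
        self._skip_blanks_and_comments()
        if self.pos >= len(self.text):
            return ("EOF", "", self.line)
        start = self.pos
        ch = self.text[self.pos]
        if ch.isalpha() or ch == "_":
            while self.pos < len(self.text) and (
                self.text[self.pos].isalnum() or self.text[self.pos] == "_"
            ):
                self.pos += 1
            return ("ID", self.text[start:self.pos], self.line)
        if ch.isdigit():
            while self.pos < len(self.text) and self.text[self.pos].isdigit():
                self.pos += 1
            return ("NUM", self.text[start:self.pos], self.line)
        if ch in "+-*/=();":
            self.pos += 1
            return ("OP", ch, self.line)
        raise SyntaxError(f"line {self.line}: unexpected character {ch!r}")


# The parser would normally drive this loop, one request at a time.
lexer = Lexer("x = 1  # set x\n\ty = x + 2\n")
tokens = []
while True:
    token = lexer.get_next_token()
    tokens.append(token)
    if token[0] == "EOF":
        break
print(tokens)
```

Because the parser pulls tokens on demand, the comment and the tab never reach it, and every token still carries the line it came from.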
• Two phases
1. Scanning
2. Lexical analysis
FUNCTIONS:
3. It generates the symbol table, which stores information about the identifiers and constants
encountered in the input.
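A symbol table of this kind can be sketched as a dictionary keyed by lexeme. The record fields (index, kind, lines) and the method names are assumptions for illustration; real compilers store more, such as types and scopes.

```python
class SymbolTable:
    def __init__(self):
        self._entries = {}  # lexeme -> record

    def insert(self, lexeme, kind, line):
        """Return the entry for `lexeme`, creating it the first time it is seen."""
        entry = self._entries.get(lexeme)
        if entry is None:
            entry = {"index": len(self._entries), "kind": kind, "lines": []}
            self._entries[lexeme] = entry
        entry["lines"].append(line)  # remember every line where it occurs
        return entry

    def lookup(self, lexeme):
        """Return the entry for `lexeme`, or None if it was never inserted."""
        return self._entries.get(lexeme)


table = SymbolTable()
table.insert("count", "ID", 1)
table.insert("10", "CONST", 1)
table.insert("count", "ID", 2)  # second occurrence reuses the same entry
print(table.lookup("count"))
```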
The scanner is responsible for doing simple tasks, while the lexical analyser proper
does the more complex operations.
There are several reasons for separating the analysis phase of compiling into lexical
analysis and parsing.
• Simpler design.
TOKEN:
It is a sequence of characters that can be treated as a single logical entity. Typical tokens are:
1. Identifiers
2. Keywords
3. Operators
4. Special symbols
5. Constants
PATTERN:
A set of strings in the input for which the same token is produced as output. This set of
strings is described by a rule called a pattern associated with the token.
LEXEME:
It is a sequence of characters in the source program that is matched by the pattern for a token.
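The three terms can be seen side by side in the sketch below, which uses Python's re module. Each entry pairs a token name with its pattern, and the output pairs each token with the lexeme that matched. The token names and the tiny keyword set are assumptions made for the example.

```python
import re

# (token, pattern) pairs; order matters, so keywords are tried before identifiers.
TOKEN_PATTERNS = [
    ("KEYWORD", r"\b(?:if|else|while)\b"),
    ("ID", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("CONST", r"\d+"),
    ("OP", r"<=|>=|==|[+\-*/<>=]"),
    ("SPECIAL", r"[();{}]"),
    ("SKIP", r"\s+"),
    ("MISMATCH", r"."),  # any other character is an error
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_PATTERNS))


def tokenize(source):
    """Return a list of (token, lexeme) pairs for `source`."""
    pairs = []
    for match in MASTER.finditer(source):
        token, lexeme = match.lastgroup, match.group()
        if token == "SKIP":
            continue
        if token == "MISMATCH":
            raise SyntaxError(f"unexpected character {lexeme!r}")
        pairs.append((token, lexeme))
    return pairs


pairs = tokenize("if count <= 10")
print(pairs)
```

In this run the lexemes "count" and "10" are different strings from the source, while ID and CONST are the tokens whose patterns they match.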