Compiler Construction Tools & Introduction To LA

The document discusses Context Free Grammar (CFG) and Regular Grammar (RG), highlighting their definitions, properties, and differences, including their acceptance by Pushdown Automata and Finite State Automata respectively. It also covers compiler construction tools such as parser generators and lexical analyzers, explaining their roles in the compilation process. Additionally, it introduces lexical analysis, detailing the tokenization process and providing examples of tokens and non-tokens.

Uploaded by

itsabhi739

1. Context Free Grammar :


• The language generated by a context-free grammar is accepted by a pushdown automaton.
• Context-free (Type 2) grammars are a subset of Type 0 and Type 1 grammars and a superset of Type 3 grammars.
• Also called phrase structure grammar.
• Different context-free grammars can generate the same context-free language.
• A context-free grammar is classified by the number of parse trees it allows for a string:
  • Only one parse tree for every string -> Unambiguous.
  • More than one parse tree for some string -> Ambiguous.
Productions are of the form –
A –> β

where A ∈ N, i.e. A is a single non-terminal, and

β ∈ (N ∪ T)*, i.e. β is any string of terminals and non-terminals (possibly empty).

Example –
S –> AB

A –> a

B –> b
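
To make the example concrete, the sketch below stores the grammar as a Python dictionary and enumerates the strings it derives. The names GRAMMAR and generate are illustrative assumptions, not part of the original notes, and the approach only works for grammars without recursion (finite languages).

```python
from itertools import product

# Grammar from the example above: S -> AB, A -> a, B -> b.
# Dictionary keys are non-terminals; any other symbol is a terminal.
GRAMMAR = {
    "S": [["A", "B"]],
    "A": [["a"]],
    "B": [["b"]],
}

def generate(symbol, grammar):
    """Return the set of terminal strings derivable from `symbol`.

    Assumes the grammar has no recursion, so the language is finite.
    """
    if symbol not in grammar:  # a terminal derives only itself
        return {symbol}
    results = set()
    for production in grammar[symbol]:
        # Expand each right-hand-side symbol, then join every combination.
        parts = [generate(s, grammar) for s in production]
        for combo in product(*parts):
            results.add("".join(combo))
    return results

language = generate("S", GRAMMAR)
print(language)  # {'ab'}
```

The only derivation is S => AB => aB => ab, so the language contains the single string "ab".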

2. Regular Grammar :
• The language it generates is accepted by a finite-state automaton.
• Regular (Type 3) grammars are a subset of Type 0, Type 1 and Type 2 grammars.
• The language it generates is called a regular language.
• Regular languages are closed under operations such as union, intersection and complement.
• It is the most restricted grammar type in the Chomsky hierarchy.
Productions are of the form (V is a non-terminal, T is a terminal) –
V –> VT / T (left-linear grammar) (or)

V –> TV / T (right-linear grammar)

Example –
1. S –> ab.

2. S -> aS | bS | ∊
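
As a rough illustration of why a finite-state automaton is enough here, the second example grammar (S -> aS | bS | ∊) generates every string over {a, b}, which is the regular expression (a|b)*. The sketch below checks membership both with Python's re module and with a hand-written one-state DFA; the names accepts_regex, accepts_dfa, TRANSITIONS and ACCEPTING are assumptions made for this example.

```python
import re

# S -> aS | bS | ε generates every string over {a, b}, i.e. (a|b)*.
PATTERN = re.compile(r"(a|b)*")

def accepts_regex(s):
    """Membership test using the equivalent regular expression."""
    return PATTERN.fullmatch(s) is not None

# Equivalent DFA: one state S, which is both the start and the accepting state.
TRANSITIONS = {("S", "a"): "S", ("S", "b"): "S"}
ACCEPTING = {"S"}

def accepts_dfa(s):
    """Membership test by running the DFA over the input."""
    state = "S"
    for ch in s:
        state = TRANSITIONS.get((state, ch))
        if state is None:  # no transition defined: reject
            return False
    return state in ACCEPTING

for word in ["", "abba", "abc"]:
    print(repr(word), accepts_regex(word), accepts_dfa(word))
```

Both checks agree: the empty string and "abba" are accepted, while "abc" is rejected because c is not in the alphabet.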
Difference Between Context Free Grammar and Regular Grammar:

Parameter       | Context Free Grammar                               | Regular Grammar
----------------|----------------------------------------------------|----------------------------------------------------
Type            | Type-2                                             | Type-3
Recognizer      | Pushdown automaton                                 | Finite-state automaton
Rules           | Productions are of the form A –> β,                | Productions are of the form V –> VT / T
                | A ∈ N (non-terminal), β ∈ (N ∪ T)* (any string)    | (left-linear grammar) or V –> TV / T (right-linear grammar)
Restriction     | Less restricted than regular grammar               | More restricted than any other grammar type
Right-hand side | The right-hand side of a production has no         | The right-hand side of a production must be
                | restrictions.                                      | either left-linear or right-linear.
Set property    | Superset of regular grammars                       | Subset of context-free grammars
Intersection    | The intersection of two CFLs need not be a CFL.    | The intersection of two regular languages is regular.
Complement      | CFLs are not closed under complement.              | Regular languages are closed under complement.
Range           | The class of languages generated by CFGs is wide.  | The class of languages generated by RGs is narrower.
Examples        | S –> AB; A –> a; B –> b                            | S -> aS | bS | ∊


Compiler construction tools
The compiler writer can use some specialized tools that help in implementing various phases of a
compiler. These tools assist in the creation of an entire compiler or its parts. Some commonly
used compiler construction tools include:

1. Parser Generator –
It produces syntax analyzers (parsers) from a grammatical description of a programming
language, typically a context-free grammar. It is useful because the syntax analysis phase
is complex and writing it by hand takes considerable time and effort.
Example: PIC, EQM. Yacc and Bison are widely used parser generators.
2. Scanner Generator –
It generates lexical analyzers from regular-expression descriptions of the tokens of a
language. The generated analyzer is a finite automaton that recognizes those regular
expressions.
Example: Lex
3. Syntax directed translation engines –
They generate intermediate code in three-address format from an input parse tree. These
engines have routines that traverse the parse tree and produce the intermediate code. Each
node of the parse tree is associated with one or more translations.
4. Automatic code generators –
They generate machine language for a target machine from a collection of rules that define
how each operation of the intermediate language is translated. A template-matching process
is used: each intermediate-language statement is replaced by its equivalent
machine-language statement using templates.
5. Data-flow analysis engines –
They are used in code optimization. Data-flow analysis, a key part of code optimization,
gathers information about how values flow from one part of a program to another.
6. Compiler construction toolkits –
They provide an integrated set of routines that aid in building compiler components or in
constructing the various phases of a compiler.
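
To give a feel for what a syntax-directed translation engine (item 3) does, here is a minimal sketch that walks an expression tree in post-order and emits three-address code. The tuple-based tree encoding, the temporary names t1, t2, ... and the function three_address_code are assumptions made for this illustration, not the interface of any real tool.

```python
from itertools import count

def three_address_code(tree):
    """Translate an expression tree into a list of three-address statements.

    A tree is either a string (a variable or constant) or a tuple
    (operator, left_subtree, right_subtree).
    """
    temps = count(1)   # source of fresh temporary numbers
    code = []

    def visit(node):
        if isinstance(node, str):      # leaf: already a usable operand
            return node
        op, left, right = node
        left_name = visit(left)        # translate operands first (post-order)
        right_name = visit(right)
        temp = f"t{next(temps)}"
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    result = visit(tree)
    return code, result

# Tree for the expression a + b * c
tree = ("+", "a", ("*", "b", "c"))
code, result = three_address_code(tree)
for line in code:
    print(line)
# t1 = b * c
# t2 = a + t1
```

Each interior node contributes exactly one statement, which mirrors the idea that every parse-tree node carries its own translation.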
Introduction to Lexical Analysis
Lexical analysis is the first phase of the compiler. The component that performs it is also
known as a scanner. It converts the high-level input program into a sequence of tokens.
• Lexical analysis can be implemented with a deterministic finite automaton (DFA).
• The output is a sequence of tokens that is sent to the parser for syntax analysis.
What is a token?
A lexical token is a sequence of characters that can be treated as a unit in the grammar of the
programming language.
Example of tokens:
• Type tokens (id, number, real, . . . )
• Punctuation tokens ('(', ')', '{', '}', ';', . . . )
• Alphabetic tokens (keywords such as if, void, return, . . . )
Keywords; Examples: for, while, if, etc.

Identifiers; Examples: variable names, function names, etc.

Operators; Examples: '+', '++', '-', etc.

Separators; Examples: ',', ';', etc.

Example of Non-Tokens:
• Comments, preprocessor directives, macros, blanks, tabs, newlines, etc.
Lexeme: The sequence of input characters that matches the pattern for a token, and so makes up
a single instance of that token, is called a lexeme.
How the Lexical Analyzer functions
1. Tokenization, i.e. dividing the program into valid tokens.
2. Removing white space characters.
3. Removing comments.
4. Helping to generate error messages by providing row numbers and column numbers.
The lexical analyzer identifies an error with the help of the automaton and the grammar of the
language it is built for (such as C or C++), and reports the row number and column number of
the error.
Suppose we pass a statement through the lexical analyzer:

a=b+c;

It will generate a token sequence like this:

id=id+id;

where each id refers to its variable's entry in the symbol table, which holds all the details
of that variable.
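
The behaviour described above can be sketched with a small regular-expression lexer. This is only an illustration under simplifying assumptions: the token classes in TOKEN_SPEC, the function tokenize, and the symbol table that maps each identifier to an index are all hypothetical, and a real lexer would handle many more token kinds.

```python
import re

# Hypothetical token classes for a tiny C-like language.
TOKEN_SPEC = [
    ("NUMBER",  r"\d+"),
    ("ID",      r"[A-Za-z_]\w*"),
    ("OP",      r"[=+\-*/]"),
    ("SEP",     r"[;,(){}]"),
    ("NEWLINE", r"\n"),
    ("SKIP",    r"[ \t]+"),
    ("ERROR",   r"."),          # any other character is a lexical error
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(source):
    """Return (tokens, symbol_table); each token is (kind, lexeme, line, column)."""
    tokens, symbol_table = [], {}
    line, line_start = 1, 0
    for m in MASTER.finditer(source):
        kind, lexeme = m.lastgroup, m.group()
        column = m.start() - line_start + 1
        if kind == "NEWLINE":
            line += 1
            line_start = m.end()
        elif kind == "SKIP":
            continue                      # white space is not a token
        elif kind == "ERROR":
            raise SyntaxError(f"unexpected {lexeme!r} at line {line}, column {column}")
        else:
            if kind == "ID":              # record each identifier once
                symbol_table.setdefault(lexeme, len(symbol_table))
            tokens.append((kind, lexeme, line, column))
    return tokens, symbol_table

tokens, symbols = tokenize("a=b+c;")
shape = "".join("id" if kind == "ID" else lexeme for kind, lexeme, _, _ in tokens)
print(shape)    # id=id+id;
print(symbols)  # {'a': 0, 'b': 1, 'c': 2}
```

Because every token carries its line and column, an illegal character such as '@' can be reported with its exact position.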
For example, consider the program

int main()
{
    // 2 variables
    int a, b;
    a = 10;
    return 0;
}

All the valid tokens are:

'int' 'main' '(' ')' '{' 'int' 'a' ',' 'b' ';'

'a' '=' '10' ';' 'return' '0' ';' '}'

The comment "// 2 variables" is a non-token, so the program contains 18 tokens.

Exercise 1:
Count the number of tokens:

int main()
{
    int a = 10, b = 20;
    printf("sum is :%d",a+b);
    return 0;
}

Answer: Total number of tokens: 27.
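
The count can be cross-checked with a short script. The sketch below is an assumption-laden illustration rather than a full C lexer: TOKEN_RE and c_tokens are hypothetical names, only the token kinds needed here are covered, and the program text is assumed to have its body enclosed in braces as in a complete C function.

```python
import re

# Simplified C token pattern; comments and white space are matched but discarded.
TOKEN_RE = re.compile(r"""
    (?P<COMMENT>//[^\n]*|/\*.*?\*/)    # comments are non-tokens
  | (?P<SPACE>\s+)                     # blanks, tabs, newlines are non-tokens
  | (?P<STRING>"(?:\\.|[^"\\])*")      # a string literal is a single token
  | (?P<NUMBER>\d+)
  | (?P<WORD>[A-Za-z_]\w*)             # keyword or identifier
  | (?P<OP>\+\+|--|[-+*/=<>!]=?)
  | (?P<SEP>[(){};,])
""", re.VERBOSE | re.DOTALL)

def c_tokens(source):
    """Return the list of lexemes in `source`, skipping non-tokens."""
    lexemes, pos = [], 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if m.lastgroup not in ("COMMENT", "SPACE"):
            lexemes.append(m.group())
        pos = m.end()
    return lexemes

EXERCISE = '''int main()
{
    int a = 10, b = 20;
    printf("sum is :%d",a+b);
    return 0;
}'''

print(len(c_tokens(EXERCISE)))  # 27
```

The string literal "sum is :%d" counts as one token, which is why the printf line contributes 9 tokens rather than more.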
