
SAL Institute of Technology and Engineering Research

CE, CSE & ICT Department

Compiler Design (3170701)

Laboratory Manual

Year: 2023-2024
INDEX

Sr. No.   Experiment                                                              Date   Page No. (From - To)   Marks   Signature

1.   Study of Lex, Flex and YACC compiler tools.
2.   Construct finite automata for a regular expression.
3.   Design a lexical analyzer for given language.
4.   Operators and identifiers in lexical analysis:
     1) Write a C program to identify whether a given line is a comment or not.
     2) Write a C program to test whether a given identifier is valid or not.
     3) Write a C program to simulate lexical analyzer for validating operators.
5.   Write a program that will find the FIRST and FOLLOW set of the grammar.
6.   Write a program to implement a predictive parser for the grammar below:
         S -> A
         A -> Bb | Cd
         B -> aB | @
         C -> Cc | @
7.   LR parsers:
     1) Write a C program to implement an LR(1) parser.
     2) Write a C program to implement an LALR(1) parser.
8.   Write a C program to implement operator precedence parsing.
9.   Using the recursive descent parsing method, design a syntax analyzer for any
     expression in the C language.
10.  Using syntax-directed translation and the predictive parsing technique, generate
     intermediate code in three-address code format for an expression in the C language.
EXPERIMENT NO : 1

TITLE: Study of Lex, Flex and YACC compiler tools.


OBJECTIVE: On completion of this exercise students will be able to understand the
Lex, YACC and Flex compiler tools.

THEORY:
Some of the most time-consuming and tedious parts of writing a compiler involve lexical
scanning and syntax analysis. Luckily, there is freely available software to assist with these
functions. While they will not do everything for you, they enable faster implementation of
the basic functions. Lex and Yacc are the most commonly used packages, with Lex managing
token recognition and Yacc handling the syntax. They work well together, but each can
conceivably be used individually as well. Both operate in a similar manner: instructions for
token recognition or grammar rules are written in a special file format. These text files are then
read by lex and/or yacc to produce C code. The resulting source code is compiled to make the
final application. In practice, the lexical instruction file has a ".l" suffix and the grammar file
has a ".y" suffix. This process is shown in the figure below.

Figure: The Lex and Yacc process

The file format for a lex file consists of the following basic sections:
• The first is an area for C code that will be placed verbatim at the beginning of the generated
source code. Typically it is used for things like #include directives, #defines, and variable
declarations.
• The next section is for definitions of token patterns to be recognized. These are not mandatory,
but in general they make the rules section easier to read and shorter.
• The third section sets the pattern for each token that is to be recognized, and can also include
C code to be called when that token is identified.
• The last section is for more C code (generally subroutines) that will be appended to the end of
the generated C code. This would typically include a main function if lex is to be used by itself.
• The format is applied as follows (the use and placement of the % symbols are necessary):
%{
//header c code
%}
//definitions
%%
//rules
%%
//subroutines
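
As a concrete illustration of this layout, here is a minimal, self-contained lex specification;
the patterns and counters are our own example, not part of the manual. It tags numbers and
words read from standard input:

    %{
    /* header C code: declarations used by the rules below */
    #include <stdio.h>
    int nums = 0, words = 0;
    %}
    DIGIT   [0-9]
    ALPHA   [a-zA-Z]
    %%
    {DIGIT}+   { nums++; printf("NUMBER: %s\n", yytext); }
    {ALPHA}+   { words++; printf("WORD: %s\n", yytext); }
    [ \t\n]+   { /* ignore whitespace */ }
    .          { /* ignore anything else */ }
    %%
    /* subroutines: main is needed because lex is used by itself here */
    int yywrap(void) { return 1; }
    int main(void) { yylex(); printf("%d numbers, %d words\n", nums, words); return 0; }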

The format for a yacc file is similar, but includes a few extras.
• The first area (introduced by %token) is a list of terminal symbols. You do not need to list
single-character ASCII symbols, but anything else, including multi-character symbols such as
"==", needs to be in this list.
• The next is an area for C code that will be placed verbatim at the beginning of the generated
source code. Typically it is used for things like #include directives, #defines, and variable
declarations.
• The next section is for definitions - none of the following examples utilize this area.
• The fourth section sets the pattern for each token that is to be recognized, and can also
include C code to be called when that token is identified.
• The last section is for more C code (generally subroutines) that will be appended to the end of
the generated C code. This would typically include a main function if yacc is to drive the
program by itself.
• The format is applied as follows (the use and placement of the % symbols are necessary):
%token RESERVED WORDS GO HERE
%{
//header c code
%}
//definitions
%%
//rules
%%
//subroutines
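
A minimal, self-contained yacc file in the same layout might look as follows; the token name
NUMBER and the hand-written yylex below are our own illustrative choices, not prescribed by
the manual. It evaluates one line of integer arithmetic:

    %token NUMBER
    %left '+' '-'
    %left '*' '/'
    %{
    /* header C code */
    #include <stdio.h>
    #include <ctype.h>
    int yylex(void);
    void yyerror(const char *s);
    %}
    %%
    line : expr '\n'       { printf("= %d\n", $1); }
         ;
    expr : expr '+' expr   { $$ = $1 + $3; }
         | expr '-' expr   { $$ = $1 - $3; }
         | expr '*' expr   { $$ = $1 * $3; }
         | expr '/' expr   { $$ = $1 / $3; }
         | '(' expr ')'    { $$ = $2; }
         | NUMBER          /* default action: $$ = $1 */
         ;
    %%
    /* subroutines: a tiny hand-written lexer and main */
    int yylex(void) {
        int c = getchar();
        while (c == ' ' || c == '\t') c = getchar();
        if (c == EOF) return 0;                 /* 0 signals end of input to yacc */
        if (isdigit(c)) { ungetc(c, stdin); scanf("%d", &yylval); return NUMBER; }
        return c;                               /* single-character tokens and '\n' */
    }
    void yyerror(const char *s) { fprintf(stderr, "%s\n", s); }
    int main(void) { return yyparse(); }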

These formats and their general usage will be covered in greater detail in the following sections.
In general it is best not to modify the resulting C code, as it is overwritten each time lex or yacc
is run. Most desired functionality can be handled within the lexical and grammar files, but
there are some things that are difficult to achieve that way and may require editing the C file.
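
Assuming the two example files above are saved as words.l and calc.y (hypothetical names), a
typical build sequence is:

    flex words.l
    cc lex.yy.c -o words

    bison -y calc.y
    cc y.tab.c -o calc

No -lfl library is needed for the lex example because yywrap and main are supplied in the
specification, and bison -y is used to emulate yacc's traditional y.tab.c output naming.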

EXERCISE:

1. Study the Lex and Yacc tools and evaluate an arithmetic expression with parentheses,
unary and binary operators using Flex and Yacc (CALCULATOR).

EVALUATION:

Problem Analysis & Solution (3) | Understanding Level (3) | Timely Completion (2) | Mock (2) | Total (10)

Date: Signature:
EXPERIMENT NO : 2

TITLE: Construction of Finite Automata

OBJECTIVE: On completion of this exercise students will be able to construct a finite
automaton for a given grammar.

THEORY:

Procedure for program 1:


1) Input the string.
2) Scan the string from left to right, starting in the initial state q0 (state 1).
3) In state 1, if the character is 'a' then go to state 2; if it is 'b' then stay in state 1.
4) In state 2, if the character is 'a' then stay in state 2; if it is 'b' then go to state 3.
5) In state 3, if the character is 'a' then go to state 2; if it is 'b' then go to state 4.
6) In state 4, if the character is 'a' then go to state 2; if it is 'b' then go to state 1.
7) If the string terminates in state 4 it is accepted; otherwise it is not accepted by the
given DFA. (A C sketch of this DFA follows.)
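
The following minimal C sketch implements the DFA described above; the transition table
encodes states 1-4, with state 4 accepting. It is an illustration, not a prescribed solution:

    #include <stdio.h>
    #include <string.h>

    int main(void) {
        /* delta[state][0] = next state on 'a', delta[state][1] = next state on 'b' */
        int delta[5][2] = { {0,0}, {2,1}, {2,3}, {2,4}, {2,1} };
        char s[100];
        printf("Enter a string over {a,b}: ");
        scanf("%99s", s);
        int state = 1;                          /* initial state q0 */
        for (size_t i = 0; i < strlen(s); i++) {
            if (s[i] == 'a')      state = delta[state][0];
            else if (s[i] == 'b') state = delta[state][1];
            else { printf("Invalid symbol '%c'\n", s[i]); return 1; }
        }
        printf("%s\n", state == 4 ? "Accepted" : "Not accepted");
        return 0;
    }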

EXERCISE:

Construct a finite automaton for a regular expression describing strings that end with abb.

EVALUATION:

Problem Analysis & Solution (3) | Understanding Level (3) | Timely Completion (2) | Mock (2) | Total (10)

Date: Signature:
EXPERIMENT NO : 3

TITLE: Design a lexical analyzer.

OBJECTIVE: On completion of this exercise students will be able to understand the lexical
analyzer.

THEORY:
Lexical analysis is the first phase of a compiler. It takes the modified source code from the
language preprocessor, written in the form of sentences. The lexical analyzer breaks this
text into a series of tokens, removing any whitespace or comments in the source code.

If the lexical analyzer finds an invalid token, it generates an error. The lexical analyzer works
closely with the syntax analyzer. It reads the character stream from the source code, checks for
legal tokens, and passes the data to the syntax analyzer on demand.

Fig. Lexical Analysis

Lexemes
Lexemes are said to be a sequence of characters (alphanumeric) in a token. There are some
predefined rules for every lexeme to be identified as a valid token. These rules are defined
by grammar rules, by means of a pattern. A pattern explains what can be a token, and these
patterns are defined by means of regular expressions.

Language
A language is considered to be a finite set of strings over some finite set of symbols (an
alphabet). Computer languages are considered as finite sets, and mathematically, set
operations can be performed on them. Finite languages can be described by means of
regular expressions.

The lexical analyzer needs to scan and identify only a finite set of valid strings, tokens and
lexemes that belong to the language at hand. It searches for the patterns defined by the
language rules.

Regular expressions have the capability to express finite languages by defining a pattern for
finite strings of symbols. The grammar defined by regular expressions is known as regular
grammar. The language defined by regular grammar is known as regular language.

Regular expression is an important notation for specifying patterns. Each pattern matches a
set of strings, so regular expressions serve as names for a set of strings. Programming
language tokens can be described by regular languages. The specification of regular
expressions is an example of a recursive definition. Regular languages are easy to
understand and have efficient implementation.

There are a number of algebraic laws that are obeyed by regular expressions, which can be
used to manipulate regular expressions into equivalent forms.
EXERCISE:

Design a lexical analyzer for a given language. The lexical analyzer should ignore redundant
spaces, tabs and newlines, and it should also ignore comments. Although the syntax
specification states that identifiers can be arbitrarily long, you may restrict the length to
some reasonable value. Simulate the same in the C language.
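
A minimal sketch of such a scanner is shown below, assuming identifiers are capped at 31
characters and comments use the single-line // style; the token classes printed are
illustrative only:

    #include <stdio.h>
    #include <ctype.h>

    #define MAXIDLEN 31

    int main(void) {
        int c;
        char buf[MAXIDLEN + 1];
        while ((c = getchar()) != EOF) {
            if (isspace(c)) continue;                 /* skip blanks, tabs, newlines */
            if (c == '/') {
                int d = getchar();
                if (d == '/') {                       /* skip "//" comment to end of line */
                    while ((c = getchar()) != '\n' && c != EOF) ;
                    continue;
                }
                ungetc(d, stdin);                     /* lone '/': fall through as operator */
            }
            if (isalpha(c) || c == '_') {             /* identifier (length-limited) */
                int n = 0;
                do {
                    if (n < MAXIDLEN) buf[n++] = (char)c;
                    c = getchar();
                } while (isalnum(c) || c == '_');
                ungetc(c, stdin);
                buf[n] = '\0';
                printf("IDENTIFIER: %s\n", buf);
            } else if (isdigit(c)) {                  /* integer constant */
                int n = 0;
                do { if (n < MAXIDLEN) buf[n++] = (char)c; c = getchar(); } while (isdigit(c));
                ungetc(c, stdin);
                buf[n] = '\0';
                printf("NUMBER: %s\n", buf);
            } else {
                printf("OPERATOR/PUNCT: %c\n", c);
            }
        }
        return 0;
    }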

REVIEW QUESTIONS:

1. Define lexemes.
2. Define language.
3. Explain working of lexical analysis phase.

EVALUATION:

Problem Analysis & Solution (3) | Understanding Level (3) | Timely Completion (2) | Mock (2) | Total (10)

Date: Signature:
EXPERIMENT NO : 4

TITLE: To study about operators and identifiers in lexical analysis.

OBJECTIVE: On completion of this exercise students will be able to understand the lexical
analyzer.

THEORY:

The lexical analyzer is the first phase of a compiler. Its main task is to read the input
characters and produce as output a sequence of tokens, which the parser uses for syntax
analysis.
The sentences of a language consist of strings of tokens. A sequence of input characters that
comprises a single token is called a lexeme.
The lexical analyzer performs the following functions to convert lexemes into tokens:
1. Removal of white space and comments:
White space consists of blanks, tabs and newlines. White space and comments are
eliminated by the lexical analyzer, so the parser never has to consider them.
2. The lexical analyzer collects characters from the source file and groups them into one of
the following token classes:

I. Identifiers
II. Operators
III. Keywords
IV. Constants

In the C language, identifiers are the names given to variables, constants, functions and
user-defined data types. An identifier is defined against a set of rules.
Rules for an identifier:
1. An identifier can only have alphanumeric characters (a-z, A-Z, 0-9) and underscore (_).
2. The first character of an identifier can only be a letter (a-z, A-Z) or underscore (_).
3. Identifiers are case sensitive in C. For example, name and Name are two different
identifiers in C.
4. Keywords are not allowed to be used as identifiers.
5. No special characters, such as semicolon, period, whitespace, slash or comma, are
permitted to be used in or as an identifier.

A token is a string of characters, categorized according to the rules as a symbol (e.g.,
IDENTIFIER, NUMBER, COMMA). The process of forming tokens from an input
stream of characters is called tokenization, and the lexer categorizes them according to a
symbol type. A token can look like anything that is useful for processing an input text
stream or text file.

A lexical analyzer generally does nothing with combinations of tokens, a task left for a
parser. For example, a typical lexical analyzer recognizes parentheses as tokens, but does
nothing to ensure that each "(" is matched with a ")".
Consider this expression in the C programming language: sum = 3 + 2;
It is tokenized in the following table:

Lexeme   Token type
sum      Identifier
=        Assignment operator
3        Integer literal
+        Addition operator
2        Integer literal
;        End of statement
Tokens are frequently defined by regular expressions, which are understood by a lexical
analyzer generator such as lex. The lexical analyzer (either generated automatically by a tool
like lex, or hand-crafted) reads in a stream of characters, identifies the lexemes in the stream,
and categorizes them into tokens. This is called "tokenizing." If the lexer finds an invalid
token, it will report an error.

Operators:
Tokenizing is followed by parsing. From there, the interpreted data may be loaded into data
structures for general use, interpretation, or compiling.
We know that the addition operator plus ('+') operates on two operands.

The syntax analyzer will just check whether the plus operator has two operands or not. It does
not check the types of the operands.

Suppose one of the operands is a string and the other is an integer; the syntax analyzer does
not throw an error, as it only checks whether there are two operands associated with '+' or not.

Procedure:
1. Read the source file from left to right, one character at a time.
2. Using the valid token types, group the characters into tokens belonging to the same class.
3. Save each token into the destination file.
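
The following minimal C sketch checks the identifier rules above (exercise 2 below); the
keyword list is deliberately abbreviated and the function names are our own:

    #include <stdio.h>
    #include <string.h>
    #include <ctype.h>

    int is_keyword(const char *s) {
        const char *kw[] = { "int", "char", "float", "double", "if", "else",
                             "for", "while", "return", "void" };
        for (size_t i = 0; i < sizeof kw / sizeof kw[0]; i++)
            if (strcmp(s, kw[i]) == 0) return 1;
        return 0;
    }

    int is_valid_identifier(const char *s) {
        if (!isalpha((unsigned char)s[0]) && s[0] != '_') return 0;     /* rule 2 */
        for (size_t i = 1; s[i]; i++)
            if (!isalnum((unsigned char)s[i]) && s[i] != '_') return 0; /* rules 1, 5 */
        return !is_keyword(s);                                          /* rule 4 */
    }

    int main(void) {
        char s[64];
        printf("Enter an identifier: ");
        scanf("%63s", s);
        printf("%s\n", is_valid_identifier(s) ? "Valid" : "Invalid");
        return 0;
    }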

EXERCISE:

1) Write a C program to identify whether a given line is a comment or not.
2) Write a C program to test whether a given identifier is valid or not.
3) Write a C program to simulate a lexical analyzer for validating operators.

REVIEW QUESTIONS:

1. Define tokens.
2. Define operators.
3. Define identifiers.

EVALUATION:

Problem Analysis & Solution (3) | Understanding Level (3) | Timely Completion (2) | Mock (2) | Total (10)

Date: Signature:
EXPERIMENT NO: 5

TITLE: FIRST and FOLLOW set for the grammar

OBJECTIVE: On completion of this exercise students will be able to understand what the FIRST
and FOLLOW sets are and how to find the FIRST and FOLLOW sets of a given grammar.

THEORY:

FIRST set:
Given a non-terminal symbol, the next symbol on input should uniquely determine which
alternative of the production to choose. These input symbols are called director symbols.

A production alternative can generate a number of terminal strings. The first symbols of those
strings are director symbols for that alternative. To this end, we wish to calculate the set of
terminal symbols which form the set of first symbols for each non-terminal in the language.
This set of symbols is called the first set.

FIRST (α)
If α is any string of grammar symbols, let FIRST (α) be the set of terminals that begin the strings
derived from α. If α => ε then ε is also in FIRST (α).

To compute FIRST(X) for all grammar symbols X, apply the following rules until no more
terminals or ε can be added to any FIRST set:

1. If X is a terminal, then FIRST(X) is {X}.

2. If X → ε is a production, then add ε to FIRST(X).
3. If X is a nonterminal and X → Y1 Y2 ... Yk is a production, then place a in FIRST(X) if for
some i, a is in FIRST(Yi) and ε is in all of FIRST(Y1), ..., FIRST(Yi-1).

That is, Y1 ... Yi-1 => ε. If ε is in FIRST(Yj) for all j = 1, 2, ..., k, then add ε to FIRST(X). For
example, everything in FIRST(Y1) is surely in FIRST(X). If Y1 does not derive ε, then we add
nothing more to FIRST(X), but if Y1 => ε, then we add FIRST(Y2), and so on.

Now, we can compute FIRST for any string X1X2 . . . Xn as follows.


Add to FIRST(X1X2 ... Xn) all the non-ε symbols of FIRST(X1).
Also add the non-ε symbols of FIRST(X2) if ε is in FIRST(X1), the non-ε symbols of FIRST(X3)
if ε is in both FIRST(X1) and FIRST(X2), and so on. Finally, add ε to FIRST(X1X2 ... Xn) if, for
all i, FIRST (Xi) contains ε.
FOLLOW set:

Given a non-terminal symbol, the next symbol on input should uniquely determine which
alternative of the production to choose. These input symbols are called director symbols.

If a production alternative can generate the empty string, then the symbols that can FOLLOW
the production also qualify as director symbols. Hence, we are also interested in what terminal
symbols may follow a nonterminal symbol. This set of terminal symbols is called the follow
set.

FOLLOW (A)

Define FOLLOW (A), for nonterminal A, to be the set of terminals a that can appear
immediately to the right of A in some sentential form, that is, the set of terminals a such that
there exists a derivation of the form S => αAaβ for some α and β. Note that there may, at
some time during the derivation, have been symbols between A and a, but if so, they derived ε
and disappeared. If A can be the rightmost symbol in some sentential form, then $,
representing the input right endmarker, is in FOLLOW (A).

To compute FOLLOW (A) for all nonterminals A, apply the following rules until nothing can
be added to any FOLLOW set:
1. Place $ in FOLLOW(S), where S is the start symbol and $ is the input right endmarker.
2. If there is a production A → αBβ, then everything in FIRST(β), except for ε, is placed in
FOLLOW(B).
3. If there is a production A → αB, or a production A → αBβ where FIRST(β) contains ε
(i.e., β => ε), then everything in FOLLOW(A) is in FOLLOW(B).
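
As a worked illustration of the FIRST rules above (FOLLOW is computed analogously from its
three rules), here is a minimal fixed-point sketch in C. The string encoding of productions
(uppercase letters for nonterminals, lowercase for terminals, '#' for ε) is an assumption of this
sketch, and the sample grammar is the one from Experiment 6:

    #include <stdio.h>
    #include <string.h>

    #define NPROD 7
    const char *prod[NPROD] = { "S=A", "A=Bb", "A=Cd", "B=aB", "B=#", "C=Cc", "C=#" };

    char first[26][64];                       /* first[X-'A'] = FIRST set of X */

    int add(char *set, char c) {              /* add c if absent; return 1 if set grew */
        if (strchr(set, c)) return 0;
        size_t n = strlen(set);
        set[n] = c; set[n + 1] = '\0';
        return 1;
    }

    int main(void) {
        int changed = 1;
        while (changed) {                     /* iterate to a fixed point */
            changed = 0;
            for (int p = 0; p < NPROD; p++) {
                char X = prod[p][0];
                const char *rhs = prod[p] + 2;
                int i, all_eps = 1;           /* do Y1..Yi-1 all derive epsilon? */
                for (i = 0; rhs[i] && all_eps; i++) {
                    char Y = rhs[i];
                    if (Y >= 'A' && Y <= 'Z') {       /* nonterminal: add FIRST(Y) \ {#} */
                        all_eps = 0;
                        for (const char *q = first[Y - 'A']; *q; q++) {
                            if (*q == '#') all_eps = 1;
                            else changed |= add(first[X - 'A'], *q);
                        }
                    } else {                          /* terminal, or '#' itself */
                        changed |= add(first[X - 'A'], Y);
                        all_eps = 0;
                    }
                }
                if (all_eps)                          /* every symbol derived epsilon */
                    changed |= add(first[X - 'A'], '#');
            }
        }
        for (int x = 0; x < 26; x++)
            if (first[x][0])
                printf("FIRST(%c) = { %s }\n", 'A' + x, first[x]);
        return 0;
    }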

EXERCISE:

Write a program that will find the FIRST and FOLLOW set of the grammar.

EVALUATION:

Problem Analysis & Solution (3) | Understanding Level (3) | Timely Completion (2) | Mock (2) | Total (10)

Date: Signature:
EXPERIMENT NO : 6

TITLE: To study about predictive parser.

OBJECTIVE: On completion of this exercise students will be able to understand the predictive parser.

THEORY:

In computer science, a recursive descent parser is a kind of top-down parser built from
a set of mutually recursive procedures (or a non-recursive equivalent) where each such
procedure usually implements one of the production rules of the grammar. Thus the structure
of the resulting program closely mirrors that of the grammar it recognizes.

A predictive parser is a recursive descent parser that does not require backtracking. Predictive
parsing is possible only for the class of LL(k) grammars, which are the context-free
grammars for which there exists some positive integer k that allows a recursive descent parser
to decide which production to use by examining only the next k tokens of input. (The LL(k)
grammars therefore exclude all ambiguous grammars, as well as all grammars that contain
left recursion. Any context-free grammar can be transformed into an equivalent grammar that
has no left recursion, but removal of left recursion does not always yield an LL(k) grammar.)
A predictive parser runs in linear time. Recursive descent with backtracking is a technique
that determines which production to use by trying each production in turn. Recursive
descent with backtracking is not limited to LL(k) grammars, but is not guaranteed to terminate
unless the grammar is LL(k). Even when they terminate, parsers that use recursive descent
with backtracking may require exponential time.

LL grammars, particularly LL(1) grammars, are of great practical interest, as parsers for these
grammars are easy to construct, and many computer languages are designed to be LL(1) for
this reason. LL parsers are table-based parsers, similar to LR parsers. LL grammars can also
be parsed by recursive descent parsers.

EXERCISE:
Write a program to implement a predictive parser for the grammar below ('@' denotes ε).
S-> A
A-> Bb | Cd
B-> aB | @
C-> Cc | @
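
A minimal C sketch of such a parser for this grammar is given below. Because C -> Cc is
left-recursive, the sketch first rewrites it as the equivalent right-recursive C -> cC | @; the
choice between A's alternatives is made from the lookahead using FIRST(Bb) = {a, b} and
FIRST(Cd) = {c, d}. The procedure names mirror the nonterminals:

    #include <stdio.h>
    #include <stdlib.h>

    const char *ip;                      /* lookahead pointer into the input */

    void error(void) { fprintf(stderr, "rejected at '%c'\n", *ip); exit(1); }
    void match(char t) { if (*ip == t) ip++; else error(); }

    void B(void) {                       /* B -> aB | epsilon */
        if (*ip == 'a') { match('a'); B(); }
    }
    void C(void) {                       /* C -> cC | epsilon (left recursion removed) */
        if (*ip == 'c') { match('c'); C(); }
    }
    void A(void) {                       /* A -> Bb | Cd, chosen by the lookahead */
        if (*ip == 'a' || *ip == 'b') { B(); match('b'); }
        else if (*ip == 'c' || *ip == 'd') { C(); match('d'); }
        else error();
    }
    void S(void) { A(); }                /* S -> A */

    int main(void) {
        ip = "aab";                      /* try e.g. "aab", "b", "ccd", "d" */
        S();
        if (*ip == '\0') printf("accepted\n"); else error();
        return 0;
    }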
REVIEW QUESTIONS:

1. Explain working of Predictive Parser

EVALUATION:

Problem Analysis & Solution (3) | Understanding Level (3) | Timely Completion (2) | Mock (2) | Total (10)

Date: Signature:
EXPERIMENT NO : 7

TITLE: To study about LR Parsers.

OBJECTIVE: On completion of this exercise students will be able to understand LR parsers.

THEORY:

LR parsers are used to parse a large class of context-free grammars. The technique is
called LR(k) parsing.
• L is for left-to-right scanning of the input.
• R is for constructing a rightmost derivation in reverse.
• k is the number of input symbols of lookahead that are used in making parsing
decisions.

There are three widely used algorithms available for constructing an LR parser:
• SLR(1) - Simple LR
o Works on the smallest class of grammars.
o Few states, hence a very small table.
o Simple and fast construction.
• LR(1) - LR parser
o Also called the canonical LR parser.
o Works on the complete set of LR(1) grammars.
o Generates a large table and a large number of states.
o Slow construction.
• LALR(1) - Lookahead LR parser
o Works on an intermediate size of grammar.
o The number of states is the same as in SLR(1).

Reasons for the attractiveness of LR parsers


• LR parsers can handle a large class of context-free grammars.
• The LR parsing method is the most general non-backtracking shift-reduce parsing
method.
• An LR parser can detect syntax errors as soon as they occur.
• LR grammars can describe more languages than LL grammars.

Drawbacks of LR parsers
• It is too much work to construct an LR parser by hand. It needs an automated parser
generator.
• If the grammar contains ambiguities or other such constructs, it is difficult to parse in a
left-to-right scan of the input.

Model of LR Parser
An LR parser consists of an input, an output, a stack, a driver program and a parsing table
that has two functions:
• Action
• Goto

The driver program is the same for all LR parsers. Only the parsing table changes from one
parser to another.
The parsing program reads characters from an input buffer one at a time. Where a shift-reduce
parser would shift a symbol, an LR parser shifts a state. Each state summarizes the
information contained in the stack below it.
The stack holds a sequence of states s0 s1 ... sm, where sm is on top.

Fig. LR Parser

Action
This function takes as arguments a state i and a terminal a (or $, the input end
marker). The value of ACTION[i, a] can have one of four forms:
i) Shift j, where j is a state.
ii) Reduce by a grammar production A → β.
iii) Accept.
iv) Error.

Goto

This function takes a state and a grammar symbol as arguments and produces a state.
If GOTO[Ii, A] = Ij, then GOTO also maps state i and nonterminal A to state j.

Behaviour of the LR parser


1. If ACTION[sm, ai] = shift s, the parser executes a shift move: it shifts the next state s
onto the stack. Here sm is the state on top of the stack and ai is the current input symbol.
2. If ACTION[sm, ai] = reduce A → β, then the parser executes a reduce move, entering the
configuration (s0 s1 ... s(m-r) s, ai+1 ... an $),
a) where r is the length of β and s = GOTO[s(m-r), A].
b) First, r state symbols are popped off the stack, exposing state s(m-r).
c) Then s, the entry for GOTO[s(m-r), A], is pushed onto the stack.
3. If ACTION[sm, ai] = accept, parsing is completed.
4. If ACTION[sm, ai] = error, the parser has discovered an error and calls an error
recovery routine.
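
The following minimal C sketch shows this driver loop on a toy grammar S -> ( S ) | a, with
an SLR(1) ACTION/GOTO table built by hand for the sketch (the state numbering and table are
our own illustration, not taken from the manual):

    #include <stdio.h>

    enum { ERR, SH, RE, ACC };
    typedef struct { int kind, arg; } Act;   /* SH: arg = target state; RE: arg = production */

    /* ACTION[state][terminal]; terminal index: 0 = '(' 1 = ')' 2 = 'a' 3 = '$' */
    Act ACTION[6][4] = {
        {{SH,2},{ERR,0},{SH,3},{ERR,0}},     /* 0 */
        {{ERR,0},{ERR,0},{ERR,0},{ACC,0}},   /* 1 */
        {{SH,2},{ERR,0},{SH,3},{ERR,0}},     /* 2 */
        {{ERR,0},{RE,1},{ERR,0},{RE,1}},     /* 3: reduce S -> a */
        {{ERR,0},{SH,5},{ERR,0},{ERR,0}},    /* 4 */
        {{ERR,0},{RE,0},{ERR,0},{RE,0}},     /* 5: reduce S -> ( S ) */
    };
    int GOTO_S[6]  = { 1, -1, 4, -1, -1, -1 };   /* GOTO[state, S] */
    int rhs_len[2] = { 3, 1 };                   /* production 0: S -> ( S ), 1: S -> a */

    int tindex(char c) {
        switch (c) { case '(': return 0; case ')': return 1;
                     case 'a': return 2; case '$': return 3; }
        return -1;
    }

    int main(void) {
        const char *input = "((a))$";
        int stack[100], top = 0, ip = 0;
        stack[0] = 0;                            /* start state */
        for (;;) {
            int t = tindex(input[ip]);
            if (t < 0) { printf("bad symbol\n"); return 1; }
            Act act = ACTION[stack[top]][t];
            if (act.kind == SH) {                /* shift: push state, advance input */
                stack[++top] = act.arg; ip++;
            } else if (act.kind == RE) {         /* reduce: pop |rhs| states, push GOTO */
                top -= rhs_len[act.arg];
                stack[top + 1] = GOTO_S[stack[top]];
                top++;
                printf("reduce by S -> %s\n", act.arg ? "a" : "( S )");
            } else if (act.kind == ACC) {
                printf("input accepted\n"); return 0;
            } else {
                printf("syntax error at position %d\n", ip); return 1;
            }
        }
    }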

LR(0) Items

An LR(0) item of a grammar G is a production of G with a dot at some position of the body,
e.g.
A → •XYZ    A → X•YZ    A → XY•Z    A → XYZ•
The canonical LR(0) collection, a collection of sets of LR(0) items, provides the finite
automaton that is used to make parsing decisions. Such an automaton is called an LR(0)
automaton.

LR(0) Parser or SLR(1) Parser


An LR(0) parser is a shift-reduce parser that uses zero tokens of lookahead to determine what
action to take (hence the 0). This means that in any configuration of the parser, the parser
must have an unambiguous action to choose: either it shifts a specific symbol or applies a
specific reduction. If there are ever two or more choices to make, the parser fails and the
grammar is not LR(0).
An LR parser makes shift-reduce decisions by maintaining states to keep track of parsing.
States represent sets of items.

Closure of item sets


If I is a set of items for a grammar G, then CLOSURE(I) is the set of items constructed
from I by two rules:
• Initially, add every item in I to CLOSURE(I).
• If A → α•Bβ is in CLOSURE(I) and B → γ is a production, then add the item
B → •γ to CLOSURE(I), if it is not already there. Apply this rule until no more items can
be added to CLOSURE(I).

Construct the canonical LR(0) collection


• The canonical collection is built for the augmented grammar using the two functions
CLOSURE and GOTO. If G is a grammar with start symbol S, then the augmented grammar
G' is G with a new start symbol S' and production S' → S.
• The role of the augmented production is to stop parsing and signal acceptance of the
input; i.e., acceptance occurs when and only when the parser performs a reduction by
S' → S.
Limitations of the LR(0) parsing method
Consider the grammar for matched parentheses:
S' → S
S → ( S ) S
S → ε
The LR(0) DFA of this grammar is shown below. In states 0, 2 and 4 the parser can both
shift '(' and reduce ε to S, so the LR(0) method cannot decide on a unique action.

Fig. Canonical Set of Items for above grammar


SLR(1) grammars
• SLR(1) parsing increases the power of LR(0) significantly:
the lookahead token is used to make parsing decisions, and
the reduce action is applied more selectively, according to the FOLLOW set.
• A grammar is SLR(1) if two conditions are met in every state:
If A → α•xγ and B → β• are in the state, then x is not in FOLLOW(B).
If A → α• and B → β• are both in the state, then FOLLOW(A) ∩ FOLLOW(B) = Ø.
• Violation of the first condition results in a shift-reduce conflict:
if A → α•xγ and B → β• are in the state and x ∈ FOLLOW(B), then
the parser can both shift x and reduce by B → β.
• Violation of the second condition results in a reduce-reduce conflict:
if A → α• and B → β• are in the state and x ∈ FOLLOW(A) ∩ FOLLOW(B), then the parser
can reduce by A → α and by B → β.
• SLR(1) grammars are a superset of LR(0) grammars.

LR(1) Parser or Canonical LR (CLR)


• Even more powerful than SLR(1) is the LR(1) parsing method.
• LR(1) item sets include LR(0) items plus a lookahead token.
• An LR(1) item consists of:
o a grammar production rule,
o a right-hand position represented by the dot, and
o a lookahead token:
o A → X1 ··· Xi • Xi+1 ··· Xn, l   where l is a lookahead token.
• The • represents how much of the right-hand side has been seen:
o X1 ··· Xi appear on top of the stack.
o Xi+1 ··· Xn are expected to appear on the input buffer.
• The lookahead token l is expected after X1 ··· Xn appears on the stack.
• An LR(1) state is a set of LR(1) items.

Introduction to the LALR Parser


• LALR stands for lookahead LR parser.
• It is an extension of LR(0) items, introducing one symbol of lookahead on the
input.
• It supports a large class of grammars.
• The number of states in an LALR parser is smaller than that of an LR(1) parser. Hence,
LALR is preferable, as it can be used with reduced memory.
• Most syntactic constructs of programming languages can be stated conveniently.
• Steps to construct the LALR parsing table:
o Generate the LR(1) items.
o Find the items that have the same set of first components (core) and merge these
sets into one.
o Merge the GOTOs of the combined item sets.
o Revise the parsing table of the LR(1) parser by replacing states and GOTOs with
the combined states and combined GOTOs respectively.

EXERCISE:
1) Write a C program to implement an LR(1) parser.
2) Write a C program to implement an LALR(1) parser.

REVIEW QUESTIONS:

1. Differentiate between SLR, LR and LALR parsers.


2. Define augmented grammar. What is the need to create augmented grammar for
constructing canonical set of items?
EVALUATION:

Problem Analysis & Solution (3) | Understanding Level (3) | Timely Completion (2) | Mock (2) | Total (10)

Date: Signature:
EXPERIMENT NO : 8

TITLE: To study about operator precedence parser.

OBJECTIVE: On completion of this exercise students will be able to understand the operator
precedence parser.

THEORY:

Bottom-up parsers for a large class of context-free grammars can be easily developed
using operator grammars.
Operator grammars have the property that no production right side is empty or has two
adjacent nonterminals. This property enables the implementation of efficient operator-
precedence parsers. Such a parser relies on the following three precedence relations:

Relation   Meaning
a <· b     a yields precedence to b
a =· b     a has the same precedence as b
a ·> b     a takes precedence over b

These operator precedence relations allow us to delimit the handles in the right sentential
forms: <· marks the left end, =· appears in the interior of the handle, and ·> marks the right
end.
Let us assume that between the symbols ai and ai+1 there is exactly one precedence relation,
and that $ marks each end of the string. Then for all terminals b we can write $ <· b and
b ·> $. If we remove all nonterminals and place the correct precedence relation (<·, =· or ·>)
between the remaining terminals, we obtain strings that can be analyzed by an easily
developed parser.
For example, the following operator precedence relations can be introduced for simple
expressions, here applied to the input string id1+id2*id3:
After inserting the precedence relations, the string becomes $ <· id1 ·> + <· id2 ·> * <· id3 ·> $.
Having the precedence relations allows identifying the handles as follows:
• scan the string from the left until the first ·> is seen
• scan backwards from there, right to left, until a <· is seen
• everything between the <· and the ·> forms the handle

       id    +     *     $
id           ·>    ·>    ·>
+      <·    ·>    <·    ·>
*      <·    ·>    ·>    ·>
$      <·    <·    <·

Note that the entire sentential form need not be scanned to find the handle.
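
A minimal C sketch of this shift/reduce scheme for the table above follows, with 'i' standing
for id; it only traces shifts and handle pops rather than building a parse tree, and the table
encoding is our own:

    #include <stdio.h>

    int sym(char c) {                        /* 0 = id 1 = '+' 2 = '*' 3 = '$' */
        switch (c) { case 'i': return 0; case '+': return 1;
                     case '*': return 2; case '$': return 3; }
        return -1;
    }
    /* rel[a][b]: '<' for <·, '=' for =·, '>' for ·>, ' ' for error (rows/cols: id + * $) */
    char rel[4][4] = {
        { ' ', '>', '>', '>' },
        { '<', '>', '<', '>' },
        { '<', '>', '>', '>' },
        { '<', '<', '<', ' ' },
    };

    int main(void) {
        const char *input = "i+i*i$";        /* 'i' stands for id */
        char stack[100];
        int top = 0, ip = 0;
        stack[0] = '$';
        while (!(stack[top] == '$' && input[ip] == '$')) {
            char a = stack[top], b = input[ip];
            char r = rel[sym(a)][sym(b)];
            if (r == '<' || r == '=') {      /* shift the input terminal */
                stack[++top] = b; ip++;
                printf("shift %c\n", b);
            } else if (r == '>') {           /* reduce: pop the handle back to the <· */
                char c;
                do {
                    c = stack[top--];
                    printf("pop %c (part of handle)\n", c);
                } while (rel[sym(stack[top])][sym(c)] != '<');
            } else {
                printf("error between %c and %c\n", a, b);
                return 1;
            }
        }
        printf("input accepted\n");
        return 0;
    }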

EXERCISE:

Write a C program to implement operator precedence parsing.


REVIEW QUESTIONS:

1. Define properties for operator precedence grammar.


2. Explain rules for constructing operator precedence matrix.
EVALUATION:

Problem Analysis & Solution (3) | Understanding Level (3) | Timely Completion (2) | Mock (2) | Total (10)

Date: Signature:
EXPERIMENT NO : 9

TITLE: To study about Recursive Descent Parser.

OBJECTIVE: On completion of this exercise students will be able to understand how a recursive
descent parser works.

THEORY:

Recursive descent parsing is a top-down method of syntax analysis in which we execute a
set of recursive procedures to process the input. A procedure is associated with each non-
terminal of the grammar. Here we consider a special form of recursive descent parsing called
predictive parsing, in which the lookahead symbol unambiguously determines the
procedure selected for each nonterminal.

ALGORITHM:
procedure match(t : token)
begin
    if lookahead = t then
        lookahead := next token
    else
        error
end
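
A minimal C sketch of a predictive recursive descent analyzer for arithmetic expressions
follows; the grammar (expr/term/factor with single-digit operands) and the helper names are
our own illustrative choices, built around the match procedure above:

    #include <stdio.h>
    #include <stdlib.h>
    #include <ctype.h>

    const char *ip;                       /* input pointer: the lookahead symbol */

    void error(const char *msg) { fprintf(stderr, "error: %s at '%c'\n", msg, *ip); exit(1); }
    void match(char t) { if (*ip == t) ip++; else error("unexpected symbol"); }

    void expr(void);
    void factor(void) {                   /* factor -> ( expr ) | digit */
        if (*ip == '(') { match('('); expr(); match(')'); }
        else if (isdigit((unsigned char)*ip)) match(*ip);
        else error("expected '(' or digit");
    }
    void term(void) {                     /* term -> factor { (*|/) factor } */
        factor();
        while (*ip == '*' || *ip == '/') { match(*ip); factor(); }
    }
    void expr(void) {                     /* expr -> term { (+|-) term } */
        term();
        while (*ip == '+' || *ip == '-') { match(*ip); term(); }
    }

    int main(void) {
        ip = "(1+2)*3-4/5";
        expr();
        if (*ip == '\0') printf("valid expression\n");
        else error("trailing input");
        return 0;
    }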

EXERCISE:

Using the recursive descent parsing method, design a syntax analyzer for any expression in
the C language.

EVALUATION:

Problem Analysis & Solution (3) | Understanding Level (3) | Timely Completion (2) | Mock (2) | Total (10)

Date: Signature:
EXPERIMENT NO : 10

TITLE: To study about intermediate code in three-address code format.

OBJECTIVE: On completion of this exercise students will be able to implement intermediate
code in three-address code format.

THEORY:

The intermediate code generation phase of the compiler is responsible for generating code in
postfix notation, as a syntax tree, or in three-address code format. Three-address code is a
sequence of statements of the general form

x := y op z

where x, y and z are names, constants or compiler-generated temporaries, and op stands for
any operator, such as a fixed-point or floating-point arithmetic operator, or a logical operator
on Boolean-valued data.
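
A minimal C sketch of this idea follows: a recursive descent parser for expressions with + and *
that emits one three-address statement per operator, using compiler-generated temporaries
t1, t2, ...; the helper names (newtemp, expr, term, factor) are our own, for illustration. For the
input a+b*c it prints t1 := b * c followed by t2 := a + t1.

    #include <stdio.h>
    #include <stdlib.h>
    #include <ctype.h>

    const char *ip;                       /* input cursor */
    int tempno = 0;

    typedef struct { char s[8]; } Addr;   /* an address: operand letter or temporary */

    void error(void) { fprintf(stderr, "syntax error at '%c'\n", *ip); exit(1); }

    Addr newtemp(void) {                  /* fresh compiler-generated temporary */
        Addr a;
        snprintf(a.s, sizeof a.s, "t%d", ++tempno);
        return a;
    }

    Addr expr(void);

    Addr factor(void) {                   /* factor -> ( expr ) | letter */
        Addr a;
        if (*ip == '(') { ip++; a = expr(); if (*ip != ')') error(); ip++; }
        else if (isalpha((unsigned char)*ip)) { a.s[0] = *ip++; a.s[1] = '\0'; }
        else error();
        return a;
    }
    Addr term(void) {                     /* term -> factor { * factor } */
        Addr left = factor();
        while (*ip == '*') {
            ip++;
            Addr right = factor(), t = newtemp();
            printf("%s := %s * %s\n", t.s, left.s, right.s);
            left = t;
        }
        return left;
    }
    Addr expr(void) {                     /* expr -> term { + term } */
        Addr left = term();
        while (*ip == '+') {
            ip++;
            Addr right = term(), t = newtemp();
            printf("%s := %s + %s\n", t.s, left.s, right.s);
            left = t;
        }
        return left;
    }

    int main(void) {
        ip = "a+b*c";
        Addr r = expr();
        if (*ip) error();
        printf("result in %s\n", r.s);
        return 0;
    }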

EXERCISE:

Using syntax-directed translation and the predictive parsing technique, generate intermediate
code in three-address code format for an expression in the C language.

EVALUATION:

Problem Analysis & Solution (3) | Understanding Level (3) | Timely Completion (2) | Mock (2) | Total (10)

Date: Signature:
