Compiler Design
When two or more accepting states are reached, the first
action given in the Lex specification is executed.
POPULAR PUBLICATIONS
ε-closure and move examples:
ε-closure({0}) = {0, 1, 3, 7}
move({0, 1, 3, 7}, a) = {2, 4, 7}
ε-closure({2, 4, 7}) = {2, 4, 7}
move({2, 4, 7}, a) = {7}
ε-closure({7}) = {7}
move({7}, b) = {8}
ε-closure({8}) = {8}
move({8}, a) = ∅
State sets reached on input a a b a: {0, 1, 3, 7} → {2, 4, 7} → {7} → {8} → none
Also used to simulate NFAs
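The ε-closure and move computations above can be sketched in a few lines of Python. The transition table below is an assumption: the original figure is not available, so the edges are reconstructed to agree with the sets listed above, and the edges out of states 4, 5 and 8 on b are purely illustrative.

```python
EPS = ""  # label used for ε-edges in this sketch

# Assumed NFA: (state, symbol) -> set of next states.
NFA = {
    (0, EPS): {1, 3, 7},
    (1, "a"): {2},
    (3, "a"): {4},
    (4, "b"): {5},   # illustrative, not confirmed by the text
    (5, "b"): {6},   # illustrative, not confirmed by the text
    (7, "a"): {7},
    (7, "b"): {8},
    (8, "b"): {8},   # illustrative, not confirmed by the text
}

def eps_closure(states):
    """All states reachable from `states` using ε-edges only."""
    stack, closure = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in NFA.get((s, EPS), ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure

def move(states, symbol):
    """All states reachable from `states` on one edge labelled `symbol`."""
    result = set()
    for s in states:
        result |= NFA.get((s, symbol), set())
    return result

def trace(text):
    """State sets visited while simulating the NFA on `text`."""
    current = eps_closure({0})
    history = [current]
    for ch in text:
        current = eps_closure(move(current, ch))
        history.append(current)
    return history
```

Calling trace("aaba") walks through the same sequence of state sets as the worked example and ends in the empty set.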
Minimizing the Number of States of a DFA
5. Write short notes on the following:
a) YACC [WBUT 2006, 2007, 2009, 2012, 2013, 2014, 2016]
b) LEX
c) Thompson's Construction Rule [WBUT 2008, ...]
d) LEX and YACC [WBUT 2010, 2014, 2018]
Answer:
a) YACC:
yacc assumes that the user has supplied a function called yylex.
Each call to the function returns a token code, which is an integer value.
In normal practice, one uses LEX to generate the lexical analyzer, which is then
piggybacked into the parser code generated by yacc.
b) LEX:
Lex is a program that generates lexical analyzers. Lex is commonly used with the yacc
parser generator. Lex is the standard lexical analyzer generator on many Unix systems.
Lex reads an input stream specifying the lexical analyzer and outputs source code
implementing the lexical analyzer in the C programming language.
The structure of a lex file is intentionally similar to that of a yacc file; files are divided up
into three sections, separated by lines that contain only two percent signs, as follows:
Definition section
%%
Rules section
%%
C code section
The definition section is the place to define macros and to import header files written in
C. It is also possible to write any C code here, which will be copied verbatim into the
generated source file. The rules section is the most important section; it associates
patterns with C statements. Patterns are simply regular expressions. When the lexer sees
some text in the input matching a given pattern, it executes the associated C code. This is
the basis of how lex operates. The C code section contains C statements and functions
that are copied verbatim to the generated source file. These statements presumably
contain code called by the rules in the rules section. In large programs it is more
convenient to place this code in a separate file and link it in at compile time.
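The way the rules section pairs patterns with actions can be imitated in a short Python sketch. This is not how lex is implemented (lex compiles the rules into a DFA and emits C); it only illustrates the matching policy: take the longest match, and when two rules match the same length, prefer the rule listed first. The rule table and token names are hypothetical.

```python
import re

# Hypothetical rule table in specification order: (pattern, token name).
# A token name of None means "match and discard" (e.g. whitespace).
RULES = [
    (r"if", "IF"),
    (r"[A-Za-z_][A-Za-z0-9_]*", "ID"),
    (r"[0-9]+", "NUM"),
    (r"[ \t\n]+", None),
]

def tokenize(text):
    """Split `text` into (token name, lexeme) pairs."""
    compiled = [(re.compile(p), name) for p, name in RULES]
    pos, tokens = 0, []
    while pos < len(text):
        best_len, best_name = 0, None
        for regex, name in compiled:
            m = regex.match(text, pos)
            # A strictly longer match wins; on a tie the earlier rule is kept.
            if m and m.end() - pos > best_len:
                best_len, best_name = m.end() - pos, name
        if best_len == 0:
            raise ValueError(f"no rule matches at position {pos}")
        if best_name is not None:
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens
```

On the input "if iffy 42" the keyword rule wins for "if" because it is listed first, while "iffy" is an identifier because the identifier rule matches a longer lexeme.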
c) Thompson's Construction Rule:
Given any RE, it is possible to algorithmically construct an NFA such that the language
accepted by the NFA is exactly the language expressed by the RE.
This is done by systematically using the constructs for the basic primitives and operators
of an RE and building up using Thompson's Construction process. The steps are:
Step-1. Parse the given RE into its constituent sub-expressions.
Step-2. Construct NFA-s for each of the basic symbols in the given RE.
For ε, construct the NFA as in Fig-1. Here i is a new start state and f is a new accepting
state. It is clear that this NFA recognizes the RE {ε}.
For every a in the alphabet Σ, construct the NFA as shown in Fig-2. Here again i is a new
start state and f is a new accepting state. It is clear that this NFA recognizes the RE {a}.
If a occurs several times in the given RE, a separate NFA is constructed for each
occurrence of a.
Fig 1. NFA for ε     Fig 2. NFA for a character of the alphabet
Step-3. Combine the intermediate NFA-s inductively until the NFA for the entire
expression is obtained. Each intermediate NFA produced during the course of
construction corresponds to a sub-expression of the given RE and has several important
properties: it has exactly one final state, no edge enters the start state and no edge leaves
the final state.
Suppose N(s) and N(t) are the NFA-s for regular expressions s and t, respectively. The
composite NFA-s N(s|t), N(st) and N(s*) for regular expressions s|t, st and s*
are constructed as shown in Fig-3, Fig-4 and Fig-5, respectively.
Fig. 3     Fig. 4     Fig. 5
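The three steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the textbook figures: the class and function names are hypothetical, and concatenation links the two fragments with an ε-edge instead of merging the final state of N(s) with the start state of N(t), which is a common simplification that accepts the same language.

```python
from itertools import count

_ids = count()  # source of fresh state numbers

class Frag:
    """NFA fragment with one start state and one accepting state.
    edges is a list of (source, label, target); label None means ε."""
    def __init__(self, start, accept, edges):
        self.start, self.accept, self.edges = start, accept, edges

def symbol(a):
    i, f = next(_ids), next(_ids)
    return Frag(i, f, [(i, a, f)])

def union(s, t):  # N(s|t)
    i, f = next(_ids), next(_ids)
    extra = [(i, None, s.start), (i, None, t.start),
             (s.accept, None, f), (t.accept, None, f)]
    return Frag(i, f, s.edges + t.edges + extra)

def concat(s, t):  # N(st), linked by an ε-edge (simplification)
    return Frag(s.start, t.accept, s.edges + t.edges + [(s.accept, None, t.start)])

def star(s):  # N(s*)
    i, f = next(_ids), next(_ids)
    extra = [(i, None, s.start), (i, None, f),
             (s.accept, None, s.start), (s.accept, None, f)]
    return Frag(i, f, s.edges + extra)

def accepts(nfa, text):
    """Simulate the NFA on `text` using ε-closure and move."""
    def closure(states):
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for src, label, dst in nfa.edges:
                if src == q and label is None and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen
    current = closure({nfa.start})
    for ch in text:
        moved = {dst for src, label, dst in nfa.edges
                 if src in current and label == ch}
        current = closure(moved)
    return nfa.accept in current

# Example expression aa(a|b)*ab, built bottom-up from its sub-expressions.
EXAMPLE = concat(
    concat(concat(concat(symbol("a"), symbol("a")),
                  star(union(symbol("a"), symbol("b")))),
           symbol("a")),
    symbol("b"))
```

Each occurrence of a symbol gets its own fragment, as Step-2 requires, and every composite fragment again has exactly one start and one accepting state.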
d) LEX: Refer to Question No. 3(a) of Long Answer Type Questions.
YACC: Refer to Question No. 3(b) of Long Answer Type Questions.
PARSING AND CONTEXT FREE GRAMMAR
1. If all productions in a grammar G = (V, T, S, P) are of the form A → xB or A → x, where
A, B ∈ V and x ∈ T*, then it is called: [WBUT 2008]
a) Context-sensitive grammar b) Non-linear grammar
c) Right-linear grammar d) Left-linear grammar
Answer: (c)
2. An inherited attribute is one whose initial value at a parse tree node is
defined in terms of [WBUT 2009, 2015]
a) attributes at the parent and / or siblings of that node
b) attributes at children nodes only
c) attributes at both children nodes and parent and / or siblings of that node
d) none of these
Answer: (a)
3. The intersection of a regular language and a context free language is
[WBUT 2009]
a) always a regular language b) always a context free language
c) always a context sensitive language d) none of these
Answer: (b)
4. The grammar E → E+E | E*E | a is [WBUT 2010, 2013]
a) ambiguous b) unambiguous
c) not given-sufficient information d) none of these
Answer: (a)
5. Parse tree is generated in the phase of [WBUT 2011, 2018]
a) Syntax Analysis b) Semantic Analysis.
c) Code Optimization d) Intermediate Code Generation
Answer: (a)
6. If the attributes of the parent node depend on its children, then its attributes are
called [WBUT 2012]
a) TAC b) synthesized c) inherited d) directed
Answer: (b)
7. The expression wew where w belongs to {a, b}+ is [WBUT 2014]
a) regular b) context free
c) context sensitive d) none of these
Answer: (b)
8. The grammar S → Sα1 | Sα2 | β1 | β2 [WBUT 2015]
a) is left recursive
b) has common left factor
c) is left recursive and also has common left factor
d) is a CFG
Answer: (a)
9. An annotated parse tree is a parse tree [WBUT 2016]
a) with values of only some attributes shown at parse tree nodes
b) with attribute values shown at the parse tree nodes
c) without attribute values shown at the parse tree nodes
d) with grammar symbols at the parse tree nodes
Answer: (b)
10. The output of the parser is [WBUT 2017]
a) tokens b) syntax tree c) parse tree d) non-terminals
Answer: (c)
11. The grammar S → aSa | bS | c is [WBUT 2018]
a) LL(1) but not LR(1) b) LR(1) but not LL(1)
c) Both LL(1) and LR(1) d) None of these
Answer: (c)
12. Grammar of the programming language is checked at ........ phase of compiler. [WBUT 2019]
a) Semantic analysis b) Syntax analysis
c) Code optimization d) Code generation
Answer: (b)
13. A grammar that produces more than one parse tree [WBUT 20…]
a) Ambiguous b) Unambiguous c) Regular d) All of these
Answer: (a)
14. ........ is the most general phrase structure grammar. [WBUT 2019]
a) Context sensitive b) Regular
c) Context free d) All of these
Answer: (a)
15. Given a grammar G = (V, T, P, S), where every production in P is of the form A → α,
where A is in V and α is in (V ∪ T)*, then G is ........ [WBUT 2022]
Answer: context free grammar
Short Answer Type Questions
1. What is a 'handle'? Consider the grammar E → E + n | E * n | n. For a sentence n+n*n,
write the handles in the right-sentential forms of the reduction. [WBUT 2006]
What is predictive parsing?
OR,
What do you mean by a Handle? Give example. [WBUT 2013]
Answer:
A handle of a right-sentential form γ is a production A → β and a position in γ
where the string β may be found and replaced by A to get the previous right-sentential
form.
The rightmost derivation is shown below, followed by the handle of each right-sentential form.
E ⇒ E * n ⇒ E + n * n ⇒ n + n * n
In n + n * n the handle is the first n (E → n); in E + n * n it is E + n (E → E + n);
in E * n it is E * n (E → E * n).
A predictive parser is a recursive descent parser that does not require backtracking.
Predictive parsing is possible only for the class of LL(k) grammars, which are the
context-free grammars for which there exists some positive integer k that allows a
recursive descent parser to decide which production to use by examining only the next k
tokens of input. (The LL(k) grammars therefore exclude all ambiguous grammars, as well
as all grammars that contain left recursion. Any context-free grammar can be transformed
into an equivalent grammar that has no left recursion, but removal of left recursion does
not always yield an LL(k) grammar.) A predictive parser runs in linear time.
A table-driven non-recursive predictive parser (for an LL(1) grammar) uses an explicit
parsing stack and a parsing table M[A, a] (where A is a nonterminal and a is a terminal)
where an entry M[A, a] is either a production rule or an error. A predictive parsing
algorithm uses this table to carry out top-down parsing.
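The table-driven scheme can be sketched in Python as below. The grammar S → aSa | bS | c is used only as a small LL(1) example, and the table M is written out by hand here as an assumption; a real parser generator would compute it from FIRST and FOLLOW sets. Missing table entries play the role of error entries.

```python
# Hand-written LL(1) table for the example grammar S -> aSa | bS | c.
# Keys are (nonterminal, lookahead); values are right-hand sides.
M = {
    ("S", "a"): ["a", "S", "a"],
    ("S", "b"): ["b", "S"],
    ("S", "c"): ["c"],
}
NONTERMINALS = {"S"}

def parse(text, start="S"):
    """Return True if `text` is derivable from `start` using table M."""
    tokens = list(text) + ["$"]      # "$" marks the end of input
    stack = ["$", start]             # explicit parsing stack
    pos = 0
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            rhs = M.get((top, look))
            if rhs is None:          # empty table cell: syntax error
                return False
            stack.extend(reversed(rhs))  # leftmost symbol ends up on top
        elif top == look:            # terminal (or "$") matches the input
            pos += 1
        else:
            return False
    return pos == len(tokens)
```

Each loop iteration either expands one nonterminal or consumes one token without ever backing up, which is where the linear running time comes from.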
2. Consider the following context-free grammar: [WBUT 2008, 2009]
S → SS+ | SS* | a
a) Show how the string aa+a* can be generated by this grammar.
b) Construct a parse tree for the given string.
c) What language is generated by this grammar?
Answer:
a) The string aa+a* can be generated by the rightmost derivation:
S ⇒ SS* ⇒ Sa* ⇒ SS+a* ⇒ Sa+a* ⇒ aa+a*
b) The required parse tree is:
        S
      / | \
     S  S  *
   / | \ |
  S  S + a
  |  |
  a  a
c) This grammar generates binary postfix expressions involving a with + and * as the
only operators.
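Membership in this postfix language can be checked with a single counter, sketched below in Python. The function name is hypothetical; the idea is that each a pushes one operand and each operator replaces two operands by one, so a valid expression never runs short of operands and ends with exactly one.

```python
def is_postfix(expr):
    """True if `expr` is generated by S -> SS+ | SS* | a."""
    depth = 0                      # operands currently available
    for ch in expr:
        if ch == "a":
            depth += 1
        elif ch in "+*":
            if depth < 2:          # a binary operator needs two operands
                return False
            depth -= 1             # two operands are replaced by one result
        else:
            return False           # symbol outside the alphabet
    return depth == 1
```

For aa+a* the counter goes 1, 2, 1, 2, 1, so the string is accepted.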
3. Consider the following left-linear grammar: [WBUT 2009]
S → Sab | Aa
A → Abb | bb
Find out an equivalent right-linear grammar.
Answer:
This grammar generates the regular language (bb)+a(ab)*, where + means "one or more
occurrences". An equivalent right-linear grammar:
S → bbS | bbaA
A → abA | ε
4. What is a handle? [WBUT 2008, 2016, 2017]
Consider the grammar E → E + E | E * E | id
Find the handles of the right sentential forms of reduction of the string id + id * id.
Answer:
A handle of a right-sentential form γ is a production A → β and a position in
γ where the string β may be found and replaced by A to get the previous right-
sentential form.
The given grammar is ambiguous. There are two rightmost derivations for the string
id + id * id. We give both rightmost derivations, listing the handles for each case.
E ⇒ E + E ⇒ E + E * E ⇒ E + E * id ⇒ E + id * id ⇒ id + id * id
Handles, in the order they are reduced: id, id, id, E * E, E + E.
E ⇒ E * E ⇒ E * id ⇒ E + E * id ⇒ E + id * id ⇒ id + id * id
Handles, in the order they are reduced: id, id, E + E, id, E * E.
5. What is error handling? Describe the Panic Mode and Phrase level error
recovery technique with example. [WBUT 2011, 2018]
Answer:
Programs submitted to a compiler often have errors of various kinds. When a compiler
detects an error, i.e., when the symbols in a sentence do not match the compiler’s current
position in the syntax diagram, the error handler is invoked. The error handler warns the
programmer by issuing an appropriate message and then attempts to recover from it so
that it can detect more errors.
The simplest form of error recovery technique is "panic mode". A set of "safe
symbols" is used to delimit "clean points" in the input. When an error occurs, a panic
mode recovery algorithm deletes input tokens until it finds a safe symbol, then backs the
parser out to a context in which that symbol might appear. In the following fragment of
Pascal code: if a b then x else y;
the compiler discovers the error at b, and a panic-mode recovery algorithm very likely skips
forward to the semicolon, thereby missing the then. When the parser later finds the else, it
again produces a spurious error message.
In phrase-level error recovery, the quality of recovery is improved by employing different
sets of safe symbols in different contexts. When the compiler discovers an error in
an expression, for example, it skips input tokens up to a symbol that can
follow the erroneous expression. In our above example, a phrase-level
recovery would use the then and else tokens as the safe symbols and give a more
realistic error message.
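The difference between the two strategies comes down to which set of safe symbols is used when skipping. A minimal Python sketch follows; the function name, the token list and the error position are assumptions made for illustration, and a real parser would also unwind its stack to a matching context.

```python
def skip_to_safe(tokens, pos, safe):
    """Delete tokens from `pos` onward until a safe symbol is found.
    Returns the index of that symbol, or len(tokens) if none is left."""
    while pos < len(tokens) and tokens[pos] not in safe:
        pos += 1
    return pos

# Token stream for the Pascal fragment; the error is detected at "b".
TOKENS = ["if", "a", "b", "then", "x", "else", "y", ";"]
ERROR_AT = 2

# Panic mode: one global set of safe symbols (here only the semicolon).
panic_resume = skip_to_safe(TOKENS, ERROR_AT, {";"})

# Phrase level: safe symbols chosen for the context "inside a condition".
phrase_resume = skip_to_safe(TOKENS, ERROR_AT, {"then", "else"})
```

Panic mode resumes at the semicolon and so skips over both then and else, while the phrase-level choice resumes at then and lets the rest of the statement be parsed normally.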
6. Consider the following grammar G. Alter the productions so that it may be free
from backtracking. [WBUT 2012]
Statement → if Expression then Statement else Statement
Statement → if Expression then Statement
Answer:
Statement → if Expression then Statement Trailer
Trailer → else Statement
Trailer → ε
7. Explain left factoring with suitable example. [WBUT 2013, 2014]
Answer: Refer to Question 5(a) of Long Answer Type Questions.
8. Consider the context-free grammar: [WBUT 2017]
S → SS+ | SS* | a
a) How the string aa+a* can be generated by this grammar?
b) Construct a parse tree for this string.
Answer:
Similar to Question No. 2(a) of Short Answer Type questions.
9. Eliminate the left-recursion for the following grammar: [WBUT 2017]
S → (L) | a
L → L, S | S
Answer:
The general form of immediate left recursion is A → Aα | β.
Left recursion elimination replaces it by
A → βA'
A' → αA' | ε
Here L → L, S | S is left recursive, with α = , S and β = S.
After left recursion elimination:
S → (L) | a
L → SL'
L' → , SL' | ε (Ans)
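The rewriting rule for immediate left recursion (A → Aα | β becomes A → βA', A' → αA' | ε) can be sketched in Python as below. The function name and the list-of-symbols representation are assumptions for illustration; an empty list stands for ε, and indirect left recursion is not handled.

```python
def eliminate_left_recursion(nt, productions):
    """Remove immediate left recursion from the productions of `nt`.
    `productions` is a list of right-hand sides, each a list of symbols."""
    new_nt = nt + "'"
    # alpha parts: what follows the leading nt in left-recursive alternatives
    alphas = [rhs[1:] for rhs in productions if rhs and rhs[0] == nt]
    # beta parts: alternatives that do not start with nt
    betas = [rhs for rhs in productions if not rhs or rhs[0] != nt]
    if not alphas:
        return {nt: productions}   # nothing to do
    return {
        nt: [beta + [new_nt] for beta in betas],
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[]],  # [] is ε
    }
```

Applied to L → L , S | S it returns L → S L' and L' → , S L' | ε; the productions of S are returned unchanged because they are not left recursive.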
Long Answer Type Questions
1. Construct a predictive parsing table for the grammar: [WBUT 2010]
S → iEtSS' | a
S' → eS | ε
E → b
Here S is the start symbol, S, S' and E are non-terminals, and i, t, a, e, b are terminals.
Explain the steps in brief.
Answer:
From rules S → iEtSS' | a, we get FIRST(S) = {i, a}.
From rules S' → eS | ε, we get FIRST(S') = {e, ε}.
From rule E → b, we get FIRST(E) = {b}.
Using the FIRST sets and the rules, we get:
FOLLOW(S) = {$} ∪ (FIRST(S') − {ε}) = {e, $}.
FOLLOW(S') = {e, $}.
FOLLOW(E) = {t}.
Suppose the predictive parsing table is the 2-dimensional array P[A, a], where A is a
non-terminal and a is a terminal.
Since i ∈ FIRST(S), P[S, i] = S → iEtSS'.
Since a ∈ FIRST(S), P[S, a] = S → a.
Since e ∈ FIRST(S'), P[S', e] = S' → eS. Also, since ε ∈ FIRST(S'), P[S', e] = S' → ε
because e ∈ FOLLOW(S'). Hence, we have multiple entries for P[S', e]. Since
$ ∈ FOLLOW(S') and obviously ε ∈ FIRST(ε), we have P[S', $] = S' → ε.
Since b ∈ FIRST(E), P[E, b] = E → b.
No other rule is left to be scanned. So the predictive parsing table is:
Non-terminal | Input Symbol
             | a     | b     | e                | i           | t | $
S            | S → a |       |                  | S → iEtSS'  |   |
S'           |       |       | S' → eS, S' → ε  |             |   | S' → ε
E            |       | E → b |                  |             |   |
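The steps of this answer (FIRST sets, FOLLOW sets, then filling the table) can be sketched in Python as a fixed-point computation. The data layout and function names are assumptions made for illustration; each right-hand side is a tuple of symbols and the empty tuple stands for ε. A cell holding more than one production signals a conflict.

```python
EPS, END = "ε", "$"

# Grammar from the question: S -> iEtSS' | a, S' -> eS | ε, E -> b.
GRAMMAR = {
    "S":  [("i", "E", "t", "S", "S'"), ("a",)],
    "S'": [("e", "S"), ()],
    "E":  [("b",)],
}
START = "S"

def first_of(seq, first):
    """FIRST of a symbol sequence, given the FIRST sets found so far."""
    out = set()
    for sym in seq:
        sym_first = first[sym] if sym in GRAMMAR else {sym}
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out
    out.add(EPS)                      # every symbol can vanish
    return out

def compute_first():
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:                    # repeat until nothing new is added
        changed = False
        for nt, rules in GRAMMAR.items():
            for rhs in rules:
                new = first_of(rhs, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

def compute_follow(first):
    follow = {nt: set() for nt in GRAMMAR}
    follow[START].add(END)
    changed = True
    while changed:
        changed = False
        for nt, rules in GRAMMAR.items():
            for rhs in rules:
                for i, sym in enumerate(rhs):
                    if sym not in GRAMMAR:
                        continue
                    tail = first_of(rhs[i + 1:], first)
                    new = tail - {EPS}
                    if EPS in tail:   # what follows nt also follows sym
                        new |= follow[nt]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

def build_table(first, follow):
    table = {}
    for nt, rules in GRAMMAR.items():
        for rhs in rules:
            rhs_first = first_of(rhs, first)
            targets = rhs_first - {EPS}
            if EPS in rhs_first:
                targets |= follow[nt]
            for terminal in targets:
                table.setdefault((nt, terminal), []).append(rhs)
    return table

FIRST = compute_first()
FOLLOW = compute_follow(FIRST)
TABLE = build_table(FIRST, FOLLOW)
CONFLICTS = {cell: rules for cell, rules in TABLE.items() if len(rules) > 1}
```

For this grammar the only conflicting cell is (S', e), which holds both S' → eS and S' → ε, matching the multiple entry found by hand.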
2. a) Define LL(1) grammar. Consider the following grammar: [WBUT 20…]
S → AaAb | BbBa
A → ε
B → ε
Test whether the grammar is LL(1) or not and construct a predictive parsing table.
Answer:
FIRST(A) = {ε}, FIRST(B) = {ε}
FIRST(S) = {a, b}
FOLLOW(A) = {a, b}, FOLLOW(B) = {a, b}, FOLLOW(S) = {$}
The predictive parsing table is:
  | a        | b        | $
S | S → AaAb | S → BbBa |
A | A → ε    | A → ε    |
B | B → ε    | B → ε    |
Since there are no conflicts, the grammar is LL(1).
b) Consider the following Context Free Grammar (CFG) G and reduce the grammar
by removing all unit productions. Show each step of removal. [WBUT 2012]
S → AB
A → a
B → C | b
C → D
D → E
E → a
Answer:
Start:
S → AB
A → a
B → C | b
C → D
D → E
E → a
Step-1 (replace D → E by D → a):
S → AB
A → a
B → C | b
C → D
D → a
E → a
Step-2 (replace C → D by C → a):
S → AB
A → a
B → C | b
C → a
D → a
E → a
Step-3 (replace B → C by B → a):
S → AB
A → a
B → a | b
C → a
D → a
E → a
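Unit-production removal can also be done in one pass with the standard unit-closure method, sketched below in Python. This is a different route from the step-by-step substitution shown in the answer but yields the same grammar; the function name and tuple representation are assumptions for illustration.

```python
def remove_unit_productions(grammar):
    """Return an equivalent grammar without unit productions A -> B.
    `grammar` maps each non-terminal to a list of right-hand-side tuples."""
    nts = set(grammar)

    def is_unit(rhs):
        return len(rhs) == 1 and rhs[0] in nts

    # reach[A] = every non-terminal B with A =>* B using unit productions only
    reach = {a: {a} for a in nts}
    changed = True
    while changed:
        changed = False
        for a in nts:
            for b in list(reach[a]):
                for rhs in grammar[b]:
                    if is_unit(rhs) and rhs[0] not in reach[a]:
                        reach[a].add(rhs[0])
                        changed = True

    # A inherits every non-unit production of each B it can reach.
    result = {}
    for a in grammar:
        rules = []
        for b in grammar:
            if b in reach[a]:
                for rhs in grammar[b]:
                    if not is_unit(rhs) and rhs not in rules:
                        rules.append(rhs)
        result[a] = rules
    return result

G = {
    "S": [("A", "B")],
    "A": [("a",)],
    "B": [("C",), ("b",)],
    "C": [("D",)],
    "D": [("E",)],
    "E": [("a",)],
}
REDUCED = remove_unit_productions(G)
```

For the grammar of this question, B can reach C, D and E through unit productions, so B inherits E → a and ends up with B → a | b, as in Step-3.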
c) Consider the following grammar G. Show that the grammar is ambiguous by
constructing two different leftmost derivations for the sentence 'abab'. [WBUT 2012]
S → aSbS | bSaS | ε
Answer:
The two possible leftmost derivations for 'abab' are:
(i) S ⇒ aSbS ⇒ abS ⇒ abaSbS ⇒ ababS ⇒ abab
(ii) S ⇒ aSbS ⇒ abSaSbS ⇒ abaSbS ⇒ ababS ⇒ abab
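The ambiguity can be cross-checked by counting parse trees. The Python sketch below is an illustration with a hypothetical function name: it counts the parse trees of a string under S → aSbS | bSaS | ε by trying every position where the matching b (or a) of the first production symbol could sit. Each parse tree corresponds to exactly one leftmost derivation.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def count_trees(w):
    """Number of parse trees of `w` under S -> aSbS | bSaS | ε."""
    total = 1 if w == "" else 0            # S -> ε
    if w and w[0] in "ab":
        closer = "b" if w[0] == "a" else "a"
        for k in range(1, len(w)):
            if w[k] == closer:
                # w = w[0] · (inner S) · closer · (trailing S)
                total += count_trees(w[1:k]) * count_trees(w[k + 1:])
    return total
```

For abab the count is 2, one tree for each of the leftmost derivations (i) and (ii), while ab has a single tree.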
3. a) Consider the grammar: [WBUT 2014]
S → aSbS | bSaS | ε
(i) Show that this grammar is ambiguous by constructing two different left
most derivations for the sentence abab.
(ii) Construct the corresponding right most derivations for abab.
(iii) Construct the corresponding parse trees for abab.
(iv) What language does this grammar generate?
b) (i) Show that no left recursive grammar can be LL(1).
(ii) Show that no LL(1) grammar can be ambiguous.
Answer:
a) (i) Refer to Question No. 2(c) of Long Answer Type Questions.
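(ii) The rightmost derivations corresponding to the two leftmost derivations are:
S ⇒ aSbS ⇒ aSbaSbS ⇒ aSbaSb ⇒ aSbab ⇒ abab
S ⇒ aSbS ⇒ aSb ⇒ abSaSb ⇒ abSab ⇒ abab
(iii) Consider the string 'abab'. The two parse trees for deriving 'abab', written in
bracketed form with each S followed by its children, are:
S(a S(ε) b S(a S(ε) b S(ε)))
S(a S(b S(ε) a S(ε)) b S(ε))
(iv) This grammar generates the language which contains strings with an equal number of a's
and b's, including the empty string.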
b) (i) A production is left-recursive if the leftmost symbol on the right side is the same
as the non-terminal on the left side. For example,
expr → expr + term
Left-recursive grammars can never be LL(1): a top-down parser that expands the
left-recursive production calls itself again without consuming any input, which leads to an
infinite loop.
Consider the grammar
A → β | Aα
and try to parse the string βα.
• Every string derived from Aα begins with a string derived from β, so (assuming β does
not derive ε) every terminal in FIRST(β) is also in FIRST(Aα). Both A → β and A → Aα then
compete for the same cell of the parsing table, so the grammar is not LL(1).
• If we remove the left recursion the grammar becomes
A → βA'
A' → αA' | ε
which derives the same strings but towards the right instead of the left.
• With right recursion, we match part of the input string as we go along: β is matched
first, then each α.
• Left recursion indicates we are building the string from right to left.
• To eliminate left recursion, we turn it into right recursion and build the string left to
right.
(ii) Any LL(1) grammar is unambiguous because by definition there is at most one leftmost
derivation for any string. An LL(1) grammar cannot be ambiguous since the parsing
algorithm for LL(1) grammars builds the only possible parse tree for a sentence in a
deterministic way. (In general, an ambiguity in a grammar will manifest itself in the fact
that there are two productions competing for the same cell in the parse table.)
4. a) Construct NFA from the regular expression using Thompson's method
L = aa(a|b)*ab.
b) Write regular definition for the following language:
All strings of letters that contain the five vowels in order.
c) Construct the predictive parsing table for the following grammar:
S → AaAb | BbBa
A → ε
B → ε [WBUT 2017]
Answer:
a) L = aa(a|b)*ab
Thompson's method:
[Figures: i) NFA for a ... iii) NFA for aa(a|b)*ab]
b) Σ = {set of all alphabets}, Σ' = {set of all consonants}
Regular Exp:
Σ'* a Σ'* e Σ'* i Σ'* o Σ'* u Σ'*
where Σ'* means all possible combinations of consonants.
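A regular definition of this kind can be tried out with Python's re module. The sketch below assumes the reading in which each vowel occurs exactly once, in order, with only consonants in between; the consonant character class and the function name are assumptions for illustration.

```python
import re

# Any run of lowercase consonants (the letters other than a, e, i, o, u).
CONS = "[b-df-hj-np-tv-z]*"

# consonants a consonants e consonants i consonants o consonants u consonants
VOWELS_IN_ORDER = re.compile(
    CONS + "a" + CONS + "e" + CONS + "i" + CONS + "o" + CONS + "u" + CONS
)

def has_vowels_in_order(word):
    """True if `word` contains a, e, i, o, u once each, in that order."""
    return VOWELS_IN_ORDER.fullmatch(word.lower()) is not None
```

Words such as facetious and abstemious match, while education does not because its vowels appear in a different order.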
c) Predictive Parsing Table:
S → AaAb | BbBa
A → ε
B → ε
Production | First  | Follow
S → AaAb   | {a, b} | {$}
S → BbBa   | {a, b} | {$}
A → ε      | {ε}    | {a, b}
B → ε      | {ε}    | {a, b}
Parsing Table
  | a        | b        | $
S | S → AaAb | S → BbBa |
A | A → ε    | A → ε    |
B | B → ε    | B → ε    |
5, Write short note on the following:
a) left factoring [WBUT 2008, 2015]
b) context-free grammar [WBUT 2008, 2016]
Answer:
a) left factoring:
Left factoring is an important step required to transform a given grammar to one that is
suitable for building an LL (i.e., top-down) parser. This step is carried out after removing
all left recursion. Even if a context-free grammar is unambiguous and non-left-recursive,
it still may not be LL(1).
The problem is that there is only one token of look-ahead. The parser generated from such a
grammar is not efficient as it requires backtracking. To avoid this problem we left factor
the grammar.
To left factor a grammar, we collect all productions that have the same left hand side and
begin with the same symbols on the right hand side. We combine the common strings
into a single production and then append a new nonterminal symbol to the end of this
new production. Finally, we create a new set of productions using this new nonterminal
for each of the suffixes to the common production.
Suppose we have production rules:
A → αβ1 | αβ2 | ... | αβn
After left factoring the above grammar is transformed into
A → αA', A' → β1 | β2 | ... | βn
The above grammar is correct and is free from conflicts.
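One round of left factoring can be sketched in Python as follows. The function names and the list-of-symbols representation are assumptions for illustration; alternatives are grouped by their first symbol, the longest common prefix of each group is pulled out, and an empty list stands for ε. Repeated application may be needed when the suffixes still share a prefix.

```python
def common_prefix(seqs):
    """Longest list of symbols that every sequence in `seqs` starts with."""
    prefix = []
    for column in zip(*seqs):
        if len(set(column)) != 1:
            break
        prefix.append(column[0])
    return prefix

def left_factor(nt, productions):
    """Left factor the alternatives of `nt` once.
    `productions` is a list of right-hand sides, each a list of symbols."""
    groups = {}
    for rhs in productions:
        groups.setdefault(rhs[0] if rhs else None, []).append(rhs)

    result = {nt: []}
    fresh = nt
    for head, group in groups.items():
        if head is None or len(group) == 1:
            result[nt].extend(group)          # nothing shared, keep as is
            continue
        prefix = common_prefix(group)         # the common part α
        fresh += "'"                          # new non-terminal A', A'', ...
        result[nt].append(prefix + [fresh])   # A -> α A'
        result[fresh] = [rhs[len(prefix):] for rhs in group]  # A' -> β1 | β2 ...
    return result
```

Applied to the two if-statement alternatives (with and without else), this produces one production ending in a new non-terminal whose alternatives are the else part and ε.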
b) context-free grammar:
The formalism of Context-free grammars was developed in the mid-1950s by Noam
Chomsky, who used it in the study of human languages (i.e., Natural Languages). Later,
Context-free Grammars found an extremely important application in the specification and
compilation of programming languages.
A grammar for a programming language is the starting point in the design of compilers
and interpreters for programming languages. Most compilers and interpreters contain a
component called a parser that extracts the meaning of a program prior to generating the
compiled code or performing the interpreted execution. A number of methodologies
facilitate the construction of a parser once a Context-free grammar is available. Some
tools even automatically generate the parser from the grammar.
In terms of generative power, Context-free Grammars, or CFG-s, are more expressive
than Regular Grammars (or equivalent formalisms like Regular Expressions and Finite
Automata).
Formally, a Context-free Grammar is defined as
G=