0% found this document useful (0 votes)
483 views

Compiler Design

CD

Uploaded by

ssmukherjee2013
Copyright
© © All Rights Reserved
Available Formats
Download as PDF or read online on Scribd
0% found this document useful (0 votes)
483 views

Compiler Design

CD

Uploaded by

ssmukherjee2013
Copyright
© © All Rights Reserved
Available Formats
Download as PDF or read online on Scribd
You are on page 1/ 130
COMPILER DES IGN Introduction to Compiler 2 Lexical Analysis 14 parsing and Context Free Grammar 31 Operator Precedence Parsing janeee® 43 Top-Down Parsing 48 Bottom-Up Parsing and LR Parser Generation Theory 61 syntax Directed Translation 80 Type Checking 96 Runtime Environment 7 Intermediate Code Generation 102 117 Code Optimization NOTE: MAKAUT course structure and syllabus of 5” semester has been changed from 2020. Previously COMPILER DESIGN was in 7 Semester. This subject has been shifted in 5” semester in present curriculum. Subject organization has been changed slightly. Taking special care of this matter we are providing chapterwise relevant MAKAUT university solutions alongwith the complete solutions of new university papers, so that students can get an idea about university questions patterns. POPULAR PUBLICATIONS: ER INTRODUCTION TO coMmPIL ” yep ee mae no Questions: ' compiler? tod by the 4. Which one of tho following error will not bo dotactor y ines 2010, 2012, 2016 b) Syntactic error a) Loxical error dy Logleal orror ¢) Somantic orror Answer: (d) j BUT 2012 2. Which Is used to keep track of currently active eclivalots | pial 1 a) control stack —_b) activation c) execu Answer: (a) : 3. A compiler that runs on one machine and produces code for a SMUT 20h rear b) One pass compilation - a) Cross compilation c) 2 pass compilation Answer: (a) d) none of these 4. Which of the following translation program converts, assembly language [WBUT 2014] ° program to object program? 3 a) assembler b) compiler ¢) macro-processor d) linker Answer: (a) [WBUT 2015] 5. Role of preprocessor is to a) produce output data b) produce output to compilers ¢) produce input to compilers d) none of these Answer: (c) 6. Consider the program statement b = 2 where bis a Boolean variable. Which stage of compilation can detect this error? [WBUT 2015] a) Lexical analysis b) Syntax analysis c) Semantic analysis d) Code generation Answer: (c) my 7. Cross-compiler is a compiler, [WBUT 2017] a) which is written in a language that is different from the source | b) that generates object code for host machine Rhee c) which is written in a language that is same as the source language d) that runs on one machine but produces object code for another machine ‘Answer: (d) CMD-2 COMPILER DESIGN 8, An ideal compiler should {weurT 2017] a) be smaller in size b) take less time for compilation c) be written in a high level language d) produce object code that is smaller in size and executes faster “Answer? (4) 9. Compiler ean check .......... Error. [wBUT 2019] a) logical b) syntax ¢) content d) both (a) and (b) Answer: (b) 40. A compiler running on computes with a small memory would normally be [WBUT 2022] Answer: compiler with several passes 14. A garbage is a [WBUT 2022] Answer: allocated storage with all across path to it destroyed Short Answer Type Questions 4. With the help of a block diagram, show each phase including symbol table and error handler of a compiler. [WBUT 2007] z. oR, Using 2-block diagram indicate the phases of compilation explaining their activities. (WBUT 2014] oR, Describe analysis phase of a Compiler with a block diagram. [WBUT 2018] OR, Describe the errors encountered in different phases of compiler. [WBUT 2019] Answer: Various Stages of Compiler: Generally we have treated a compiler as a single box that maps a source program into a semantically equivalent target program. If we open up this box a little, we see that there are two parts to this mapping: analysis and synthesis. The analysis part breaks up the source program into constituent pieces and imposes a grammatical structure on them. It then uses this structure to create an intermediate representation of the source program. If the analysis part detects that the source program is either syntactically ill formed or semantically unsound, then it must provide informative messages, so the user can take corrective action, The analysis part also collects information about the source program and stores it in a data structure called a symbol table, which is passed along with the intermediate representation to the synthesis part. If we examine the compilation process in more details, we see that it operates 2s a sequence of phases, each of which transforms one representation of the source program to another. A typical decomposition of a compiler into phases in show in figure below. In CMD-3 POPULAR PUBLICATIONS and the intermediate representation uped together, explicitly. The symbol table, Which it be construc! 7 i hases of the compiler ¢ program, may be gro! ses need no! the entire soure Character stream practice, several phases between the grouped pha: stores information about # Lexical Analyzer ‘Token stream Syntax Analyzer Sales Syntax tree ‘Syntax Analyzer ‘Syntax tree Error Handler Intermediate Code Generator L__________ Intermediate — ‘Symbol Table Machine-Independent ‘Code Optimizer T . Intermediate representation Code Generator target-machine Code Machine-Dependent Code Optimizer ‘larget-machine Code Phases of a compiler gone complies Baye a machine-independents optimization phase between the front end and the back end. The purpose of this optimization ii i t d c phase is trans ions ‘on the imermedlate Tepresentation, so that the back end a pai a cote ies program than ise produced from ay imization i i 7 " iste cisr in accent in. ot representation. Since optimization is optional, one or the'two optim saci ie ot in above figure may be missing, Eee All phases are described using an exampl ab ria * 60 Texical Analyzer T 4,1) >) Gd,.2) (4) (8,3) (2) (60) Synlax Analyzer a 1 [positon Aidt Sh ia woah ig)” 0 3] rate [See aie co ~ (id,t SYMBOL Gay’ Buopow 4 60 Tntermediate code Generator (Gd,2y {1 inttofloat (60) 2=id3 tL B=id2+2 idi-8 ‘Code Optimizer l= id3 * 60.0 idl =id2 +t1 LDF R2, id3 MULF R2, R2, #60.0 LDF RI, id2 ADDF RI, RL, R2 STF idl, RI ‘Translation of an assignment statement UT 2010) 2. What are the analysis phase and synthesis phase of an assemble? [WB Answer: ‘There are two main stages in the compilirig process --- analysis and synthesis. The analysis stage breaks up the source program into pieces, and creates a generic (lasiguage-independent) intermediate representation of the program. Then, the synthesis stage constructs the desired target program from the intermediate representation. Usually, CMD-5 POPULAR PUBLICATIONS: snynthesls stage Is ity bak. the analysis stage of a compiler is catted Its frontend and the sy! ke end, eaaniliine There are four phases in the analysis stage of compiling: In rend from left (0 right ang Here, the stream of characters making up. souree {apr ative mening, grouped into tokens -~ sequences of characters wink’ # HT Tae Output of Here, the tokens found are promped together using @ cones oo oI this parsing phase is called a parse éree. ies Aarciali Here the ne tree cick for semantic errors Le, detection eles net i the use of undeclared variables, fmetion calls wilh Improper arguments, ations, incompatible operands, type mismatehes, etc, Here, the intermediate representation of the § like three-address code, i The synth tage can have up to three ph: . F Here, code in thet ermediate Fepaehtnlok lhree Address Code) is converted into a streamlined version, still in the intermediate representation, but with attempts to produce the smallest, fastest and most efficient runing result (collectively called optimization), Here, the target program is generated, as machine code or assembly code. Memory locations are selected for cach variable and instructions are chosen for cach operation, This is an optional optimization pass following code generation and transforms the object code into tighter, more efficient object code by considering features of the hardware to make efficient usage of processor(s), registers, specialized instructions, pipelining branch prediction and other peephole optimizations. cated, usually in a form tirce programm is cf 3. Describe analysis phase of a compiler in respect of the following example. position = initial + rate * 60 (WBUT 2011] Answer: The analysis phase of a compiler, also known as. its “front-end”, breaks up the source Program into pieces, and creates a generic (language independent) intermediate representation of the program. There are four stages in the analysis phase of compiling: Lexical Analysis: Here, the stream of characters making up a source program is read from deft to right and grouped into tokens—sequences of characters have a collective meaning. For the given example, tokens: idl =id2+id3_id4 idl, id2 and id3 will be ‘bound? to 8 position, initial, and rate, respectively, lexical analysis will break the sentence into a stream of seven 'ymbol Table entries corresponding to variables while id4 will be bound to the constant 60. Syntax Analysis or Parsing: Here, the tokens found are context-free grammar. The output o} tree for the given example will be: . " grouped together using 4 this parsing phase is called a parse tree, ‘The parse COMPILER DESIGN Ss af <4 id ids idy Semantic Analysis: Here, the parse tree is checked for semantic errors i.e., detection of such things like the use of undeclared variables, function calls with improper arguments, access violations, incompatible operands, type mismatches, etc. Assuming that in our code the variables are of float type, the semantic analysis will insert a int-to-float conversion of the constant 60 to give: ae Ny fn ids inet oeleat Ida Intermediate Code Generation: Here, the intermediate representation of the source program is created, usually in a form like three-address code. The set of three-address codes for the example will be tl =inttofloat(60) 12=id3 * tl 4. What is a cross-compiler? Create a cross-compiler for SML (Sensor Mark-up Language) using Java compiler, written in ATOM-450, producing code in ATOM- 450 and a SML language producing code for XML written in Java. [WBUT 2012] Answer: * A cross-compiler is a piece of software which runs on a particular environment (processor + OS) and generates code that can run on another environment. For example a “C’ cross-compiler running on X86-Windows generating code for ARM-Linux. Source eode | (in SML) Cross-compiler Java [™] IVM Results ‘Atom 450 system Atom 450 system CMD-7 POPULAR PUBLICATIONS ee Questions Answer Lon @ output of each hag, ing th show! ur ane of acompller, statement: 4. a) Explain the different phases using the example of the following Position: = initial + rate * 60 intl the difforent phases of compilation, How the following statement is trans WBUT 20 Position : = initial + rate ‘70. OR, oxample riouraed Explain different stages of compiler with a suitable Answer: oe | Refer to Question No. 3 of Short Answer Type Questiol [WBUT 2009 b) Compare compiler and interpreter. 7 f [WBUT 2013 Distinguish between interpreter and compiler. Answer: 3 ; Compiler scans whole code at once so it is fast on the ates hand interpreter scans the byte code line by line and converts in machine code so it is slow. . Compiler creates executable file that runs directly.on the CPU but interpreter does not create direct executable file. : Compiler uses more memory - all the execution code needs to be loaded into memory. Interpreter uses less memory, source code only has to be present one line at a time in memory. 3 2. a) Apply all the phases of compiler and show the corresponding output in every phase for the following code of the source program: while (y 2 thy=y—3; [WBUT 2012] Answer: Lexical analysis generates the tokens ‘while’, °C’, “id,”, >=, id’, “=, id’, 2, 3° and & The syntax analysis stage generates the parse tree A IN Hoo hide Lo bee i | if { The intermediate code generates 3-address code agi! 3} 100 : if (y>=1) goto 102 S CMD-8 COMPILER DESIGN 104 : goto 100 The code optimizer optimized this code to 100 : if (y
70 Intermediate Code Generator int to float (70) inti float Code Generator ¥ LDF Raids LDF Rs, MULF —Ri,Ri, Ry ADDF — Rs, Re, #70.0 LDF Ry idy ADDF > R, ADDF * Ra Bul 4. Explain in detail the process of compil aah t the compilation for the input a = (b si ilo Develop the output of each phas® of Answer: (b +a) #2 qweuT 20491 1" Part: Refer to Question No. 1 of Lo f ng Answer Ty, je pe Questions. CMD-10 i COMPILER DESIGN am Part: phase 1: Lexical Analysis Tokens for the expression t ___ Symbol Category Attribute A identifier # a operator Assignment(1) B Identifier #2 < operator Arithmetic(1) Cc Identifier 43 a operator Arithmetic(2) ( operator Open parenthesis(1) y operator Closed parenthesis(1) #4 Phase tax Analysis >> Ti ttoreal (2) >> TZ2=bee >> T3 T2 * T2 >> TS ee >a - Phase 4: Intermediate Code generation >> MOV R2,b >> ADD RZ,c >> MUL R2, R2 >> MUL R2, #2.0 >> MOV R2, a Phase 5: Code optimization >> MOV b,R2 >> ADD R2,c >> MUL RZ, R2 >> SHL R2 >> MOV R2, a 5. a) What is a compiler? b) Explain the different phases of compiler with an example. c) Compare and contrast between a compiler and an Interpreter. [WBUT 2022] Answer: a) A compiler is a special program that translates a programming language's source code into macine code, byte code or another programming language. CMD-11 POPULAR PUBLICATIONS | ing: , 2010, 2 6. Write short notes on the following: [WBUT 2008, ee 2H a) Cross Compiler b) Chomsky classification of grammar Answer: i a) Cross Compiler: 7 ble of ct cutable code for a oh tform Othe, A cross compiler is a compiler capable of errr iter tools are used 10 gepe than the one on which the compiler is a fe platforms. It is use d to compile a executables for embedded system or mul t ‘te compiling, like microcontrollers ta platform upon which it is not feasible to do i Os. ‘ ild environment fr The findemental use ofa cross compiler isto espa Ne build TOM the target environment. This is useful in a number of situal it sated rescinded “e Embedded computers where a device has extremely witb several ti © Compiling for multiple machines—a company suppot S i iraile #6 ferent versions of OS, a single build environment can be set up p each of these targets. * Compiling ona server farm. © Compiling native code for emulators. I 013 b) Chomsky classification of grammar: a Type 0, Grammar is any phrase structure grammar without any restrictions, To define the other types we need a definition. Ina production of the form ¢Ay->$a. y, where A is a variable, $ is called the left context y, the right context and da w the replacement string. Example: (a) i) In ‘abAbcd->abABbed, ab is the left context, bed is the right context, a=ab i) In AC->A, A is the left context, * is the right context, =A The production simply erases C when the left context is A and the right context is “, ii) a) In C-> *, the left and right context are a=" The Production simply erases C when in any context. Type 1, Grammars is a production of the form bAy->$0. y is called type 1 production if a #* In type I production erasing of A is not Hesse eel prod Permitted. and right context. . 1) AB-> AbBe is a type 1 production, The left dil) A> abA is atype | production. The both lone SA and ti ight context is left and right cont Hext is, Type 2, Grammars is a production of the fc In other words, the L.H.S has no left eqn; ot A context or right comers A,BeVy and & (Vx ») CMD-12 COMPILER DESIGI ‘A grammar is called Type 2 or context free grammar if all the productions are Type 2 productions. Example: (c) S->Aa, A->a, B >abe, A ->*, are type 2 productions. Type 3, Grammars is a production S -> 4 is allowed in type 3 grammar, but in this case S does not appear on the right-hand side of any production. ‘A grammar is called Type 3 or regular grammar if productions. Example: (d) S ->aS, A >a, A >aB, are type 3 productions. all the productions are Type 3 L over an alphabet A is a collection of words on A, A’ denotes Language: A language ibset of A’. the set of all words on A. Thus a language L is simply a su eg, == {a,b} then ' = {A, a, b, aa, ab, ba, bb, aab, bbb ...} The set {a, aa, ab} is a language on 5. Because i language. The set L= {a"b" : n20} is also a language on =. The string aabb and aaabbb are in the language L, but the string abb is not in L. This language is infinite. it has a finite number of sentences, we call it a finite POPULAR PUBLICATIONS LEXICAL ANALYSIS ions Very Short Answer pe Quest 2010, 2012, 2014, 2016, 2047) d) nono of thoag it x I inal then FIRST (x) Is 4. If x is a terminal the (x) Twaut 2007, 2008, + abe b) {x} 2) x Answer: (b) BUT 2009, 204 2. The regular expression(a|b)*abb denotes [wi 6] a)" all possible combinations of a's and b’s b) set of all strings endings with abb . c) set of all strings starting with a and ending with abb d) none of these Answer: (b) 3. The following productions of a regular grammar generates a language L. SaS|bS|a[b The regular expression for L is [WBUT 2009, 2015, 2016] a) at b) (a+b) c)(a+b)(a+b) — d)(aa+bb) a" Answer: (c) 4, White spaces and tabs are removed in [WBUT 2011, 2018] a) Lexical analysis b) Syntax analysis ¢) Semantic analysis d) all of these Answer: (a) 5. Which of the following is used for grouping of characters into tokens? a) Parser b) Code optimization BUT 2013) ¢) Code generator 4) Lexical analyzer 7 Answer: (d) 6. .........-+.. OF scanning is the process where the sti ' the source program is read from left to right Grouped Into tokens nee [IWBUT 2013, 2019] a) Lexical analysis b) Dive rsion c) Modeling d ‘oe ) None of these 7. The regular expression representing th xx beginning with y is 9 the set of all strings over (x, y) ending wit [WBUT 20° a) ox(xty)* yb) y(x+y) ox Answer: (b) hy (aty)ex a) y(xy) CMD-14 COMPILER DESI 8. The basic limitation of Finite State Machine is that [WBUT 2016] a) it cannot remember arbitrary large amount of information b) it cannot recognize grammars that are regular ¢) it sometimes recognize grammars that are not regular d) all of these Answer: (a) [WBUT 2018] 9. What is the output of lexical analyzer? d) None of these a) A parse tree b) Alistoftokens —_¢) A syntax tree ‘Answer: (b) 49. Regular expression (x/y) (x/y) denotes the set [WBUT 2019] a) (XY, XV} b) (x, x¥, WY) c) {x, y} d) {X, Ys XV} Answer: (b) 44. The regular expressions denote zero or more instances of an x or Y is [WBUT 2019] a) (x+y) by) x+y)" c) (xt +) d) (xy) Answer: (b) [WBUT 2022] 42. Input to LEX is Answer: string 43, Give a finite automation M = (Q, 5, 8, 40, F). If 5 maps Q* F to 7 Q, then [WBUT 2022] Short Answer estions 4. What do you understand by terminal table and literal table? [WBUT 2006, 2008, 2010, 2011] Answer: During lexical analysis, after a token is identifiéd, first the terminal table is examined. If the token is not found in the table, a new entry is made. Only the ‘name’ attribute of the token goes in, the remaining information is inserted in later phases. On the other hand, numbers, quoted character strings and other self-defining data are classified as ‘literals’ and go to the literal table. If the literal is not found in the table, a new entry is made. The lexical analyzer can determine all the attributes of a literal as well: as its internal representation, by looking at it. 2. What is a lookahead operator? Give an example. With the help of the look ahead concept show how identifiers can be distinguished from keywords. : [WBUT 2007, 2009, 2016] CMD-15_ POPULAR PUBLICATIONS ; rs may have to look ahead a ¢ ermine a token with certainty, In Answer: | For certain programming languages the teal ane symbols beyond the end of a lexeme before they LEX, the lookahead operator is ‘/’. i For example, in Fortran-77, the token IF ean be i a function/array as in IF(10, 5). It is the keywor seen, followed by a letter. Hence, in a Lexical Analyzer for Fortran-77, we would specify the aan IF /: {letter} where we assume that the definition { «{F” or it can be the name of d he key ir of parentheses has bee, only if a pai ketone or any letter. i a DFA for the a. Give the NFA forthe following Regular Expression. Thon find one language. (alb) * abb. Answer: The required NFA is: « [WBUT 2008, 2013) 4. Construct DFA directly from [not by generating NFA] the regular expression L=(a|b)* ab [WBUT 2010, 2011, 2018] Answer: . : The required DFA for the language L=(a|b)* ab is as given below: 5, Describe with diagram the working process of lexical Analyzer. : DWBUT 2011, 2014, 2018) Answer: é The operation of the Lexical Analyzer is explained in the diagram below. CMD-16 COMPILER DESIGN Token Steam Get Token Symbol Table program into tokens like keywords, ‘The Lexical Analyzer breaks down the source ‘A token is the smallest piece of constants, identifiers, operators and other simple tokens, text that the language defines. Keywords are words the. language define: the language like if, else, int, char, do, while, for, str stants are the literal valued items thatthe Innguage can recog Cy ind characters. Jdentifiers are names the programmer has given to something. r functions, classes (in C++ and other object-oriented languages), janguage has rules for specifying how these names can be written, Operators are the mathematical, logical, and other operators that the language can and whieh always have specific meaning in return, cf C++. like numbers, Phese include variables, numerations, ctc. Each recognize. see entifiers, the Lexical Analyze also uses a “Symbol Table” — a data structure it shares with the Parser and subsequent phases of compilation. ression over alphabet {4,6,c} tis dead state? Explain with [WBUT 2012] 6. Define regular expression. Write the regular exp! containing at least one 'a' and at least one 'b'. What suitable example. Answer: ‘A regular expression (RE) is defined recursively as follows: e isanRE ais an RE, where 'a'€& If R, and R, are REs, then so are R+R, BR, and RK. Required RE is: (((@+0+0) a(a+0+e) felt) (arbre) Blaby ala+d~e))) : ‘one such that once that state is reached, any In a finite state automata; a dead state is hed, any input keeps the automata in that input keeps the automata in that state is reacl state. iets CMD-17 POPULAR PUBLICATIONS Example: Here, C is 4 dead-state. ess of compilation? Consider ¢ ; a ical analyzer help in the proc’ ith type and v: 4 salowing statement and find the number of tokens with typ’ alue ts applicable: void main () int x; . ye 2 IWBUT 2019) The le le into a stream of ‘tokens’. The grammar ‘The lexical analyzer breaks down the source code rs of a language is expressed in terms of tokens where similar types of entities are treated similarly. For examples, variables and constants .are considered as “identifiers”. This helps in the compiler being based on a language of token types, which is not dependent ‘on the particular source code being compiled. There are total 13 tokens: Void: Keyword main: identifier ¢ + ) { int : keyword x : identifier x 3: identifier } 8. What is Token, Pattern and L : What exeme? [wBuT 201 31 Lexeme: group of characters that form a to Token: class of lexemes that match a mae ° ¢ Token may have attributes, if more than one lexeme in token ikea CMD-18 COMPILER DESIGN ypically defined using a regular oxpresston simplest language class that's powerflil enough (waut 2016) Answer? To Remove © Find Closure of all © Mark these states which have null moves «© Make a revised transition table without epsilon column and Find all possible jon for those marked states by using their Closures. © We will get nfa without null moves/ep silon moves: Initial transition table: States [a |b |e epsilon Q {Q Qs} _| {Qa Qu | Qs Qu Qs { Qe, Qr}_ | Qe Qs Q Qe Qs Q Finding Closure of Qi, Qs, Qs Closure (Q1) = (Qi, Qz, Qs, Qe, Qr} Closure (Qs) = {Qs, Qs, Qi} Closure (Qs) = (Qi, Qe} ; / We find a transition table after following above steps which is: lon moves we follow following steps: es which have null moves Ep States Input a b c Qu. Qs, Qa Qs Qs Qu Os Qu Qs Qu Q So if we change the nfa to a dfa the Transition table and the corresponding dfa will be as follows: States Tnput b a Q Qs, Qs Qs CMD-19 2 iS ~ Ifwe consider q3 as final state then the state diagram be like below POPULAR PUBLICATIONS Wa fone Qs eo 1 Ue 10. Find out regular expression corresponding to the finite automata: [WBUT 2016) Next Stato _| Next State PS i iF at [qt q2 = q2_| q3 q2, q2 [43 [a2 = Answer: The DFA table is : Present State [Next state ona | Next state on b ql qh q2_ = gz q3 qe (ql. q2) (al. 92 @3) 2 (ql. 42.43) | (al, 2, 3) @ 3 q So the regular expression is atb*(a a b* 11. Define NFA. Answer: A Non- deterministic Finite Automata or NFA has five states (Q, Y, q0, F, 5) where 1.- Q: finite set of states finite set of the input symbol . tial state 4, F: final state 5. 6: Transition function ‘An NFA can be represented by digraphs called state diagram. In which: {i) The state is represented by vertices. CMD-20 [WBUT 2019] COMPILER DESIGN i) The arc labeled with an input character show the transitions. The initial state is marked with an arrow. (iv) The final state is denoted by the double circle. Long Answer Type Questions 4, What do you mean by Thomson Construction? Explain with an example. [WBUT 2015] Answer: To construct an NFA from a regular expression, we present a technique that can be used as a recognizer for the tokens corresponding to a regular expression. In this technique, a regular expression is first broken into simpler subexpression, then the corresponding NEA are constructed and finally, these small NFAs are combined with the help of regular expression operations. This construction is known as Thompson’s construction. Thompson Construction Example: Develop an NFA for the RE: (xiy)* 2. What is Regular Expression? Write the regular expression for: [WBUT 2019] a) R=R1 + R2 (Union operation) b) R= R1.R2 (Concatenation operation) c) R=R1* (Kleen Clouser) d) R= R + (Positive Clouser) e) Write-a regular expression “abb” over 5 = {a, b}. f) Construct a regular expression for the language containing all strings having any number of a's and b’s except the full string. Answer: I" Part: Refer to Question No. 6(1% Part) of Short Answer Type Questions. for a language containing strings which end with CMD-21 POPULAR PUBLICATIONS na ‘ 2" Part: then RI-| R2 (also written 9s RIUR2or Ri, a) If RI and R2 are regular expressions, R2)is also a regular expression. L(RIJR2) = L(R1) U L(R2). : yaaa i b) If RI and R2 are regular expressions, then RIR2 (also written 28 R2) is also 4 regular expression. L(RIR2) = L(R1) concatenated with L(R2). ©) IFR1 is a regular expression, then R1* (the Kleene closure of RI) is also a regula expression. nit L(R1*) = epsilon U L(R1) U L(RIR1) U L(RIRI ae ; Closure has the highest precedence, followed by concatenation, followed by union, 4) RS is a regular expression whose language is L, M. R+ is a regular expression Whose language is L+. ©) The Regular expression for a language containing strings which end with “ab” is: R.E.= (a+ b)* abb : Asset of strings of a’s and b’s ending with the string abb. So L = {abb, aabb, babb, aaabb, ababb, ..... }. f) The regular expression for the language which containing all strings having any number of a’s and b’s except the full string is: Regular Expression (R.E) = (a + b)’. ‘This will give the set as L = {¢, a, aa, b, bb, ab, ba, aba, bab, .....} any combination of a and b. The (a + b)* shows any combination with a and b even a null string, 3. Consider the regular expression (atb)*a(a+b)(a+b) a) Augment the expression and cbiétruct the syntax tree for the above regular expression. Find Firstpos() and Lastpos() for every internal node in the:syntax tree Find Followpos() for every position in the syntax tree \Construct the corresponding DFA for the given RE using followpos() UT 2022) * CMD-22 COMPILE IGN Answer: a) r= (alb)*a(alb)(ala)# CMD-23 POPULAR PUBLICATIONS b) °) Firstpos{1, 2.3) (e) Lastpos(4, 5} Fistos( 1.2.3) (@) Lastpos{3} Firstpos{ 1, 2} Lastpos(i,2} Oo Firspos( 1,2} Lastpost1, 2} = b J 2 Firstpos(1} —_Firstpos(2) Lastpos{1} —_Lastpos{2} Followpos(1) = {1, 2, 3} Followpos(2) = {1, 2, 3} Followpos(3) = 4, 5} Followpos(4) = {6, 7} Followpos(5) = {6, 7} Followpos(6) = {8} Followpos(7) = {8} Followpos(8) = { } Fiat Bt (2) Tastpost 4 rirwpos(1.2.3) (0) f Lasipos(6, 7 Firstpos(8} Lastpos(8) Firstpos(6, 7} (1) Lasipos(6, 7} b a 5 ixsipos(6) " Fitstpos(7) Lastpos{6} _Lastpos{7} Firstpos(4, SP Lastpos(4,5) ( ’ a 2 Ficspas(4} > Fistpost5} 3 Lasypos(4}__ Lastpos(5} Firstpos(3) Lastpos(3} CMD-24 COMPILER DESIGN ret | : a) Node Followpos a 1 (1.2.3) b 2 (12.3) | a 3 (4.5) a 4 {6.7} b 5 {6.7} a 6 {8} b 7 {8} #8 O 4, a) What is LEX? [WBUT 2022] b) Explain the working of LEX. 7 c) Show the step by step construction of a lexical analyzer with the following three tokens ca * abb © atb+ Answer: 2) and b) Refer to Questions No. 3.(b) of Long Answer Type Questions. CMD-25 POPULAR PUBLICATIONS t NFA to DFA ¢) Design of a lexical analyzer generator = RE to * Lex specification with NFA regular expressions ChpO {action} action, 4 Ss 8] - © - action, ea 3 x 7 When two or more accepting states are reached, the first Lo ‘action given in the Lex specification is executed CMD-27 POPULAR PUBLICATION: e-closure and move Examples g-closure({0}) = {0.1,3, 7} move({0, 1,3, 7}.4)={2, 4, 7 e-closure({2, 4, 7} =12,4,7; move({2, 4, 7}, 4) = {7} e-closure({7}) = {7} move( {7}, b) = {8} e-closure({8}) = {8} move( {8}, a)=@ a a, ba} “+ none ° 2 7 8 . 1 4 3 7 4 E Also used to simulate NFAs Minimizing the Number of States of a DFA 5. Write short notes on the following: 2) Yaae IWBUT 2006, 2007, 2009, 2012, 2013, 2014, 2016] 3 Thompson's Construction Rule eur se g008, anon 5 ata and YAAC : Answer: DWBUT 2010, 2014, 2018] a) YAAC: yacc “assumes that the user has supplied i Function called yyles. The funetion returns ane ona 2eE Which is an integer-valued CMD.28 COMPILER DESIGN: In normal practice, one uses LEX to generate the lexical analyzer which is then pigeybacked into the parser code generated by yace. b) LEX: we is a program that generates lexical analyzers. Lex is commonly used with the yace parser generator. Lex is the standard lexical analyzer generator on many Unix systems. Lex reads an input stream specifying the lexical analyzer and outputs source code implementing the lexical analyzer in the C programming language. The structure of a lex file is intentionally similar to that of a yace file; files are divided up jnto three sections, separated by lines that contain only two percent signs, as follows: Definition section . %% Rules section %% C code section ‘The definition section is the place to define macros and to import header files written in C. It is also possible to write ‘any C code here, which will be copied verbatim into the generated source file. The rules section is the- most important section; it associates patterns with C statements. Patterns are simply regular expressions. When the lexer sees some text in the input matching a given pattern, it executes the associated C code. T! is is the basis of how lex operates. The C code section contains C statements and functions that are copied verbatim to the generated source file. These statements presumably sontain code called by the rules in the rules section. In large programs it is more convenient to place this code in a separate file and link it in at compile time. ©) Thompson’s Construction Rule: Given any RE, itis possible to algorithmically construct an NFA such that the language accepted by the NFA is exactly the language expressed by the RE. This is done by systematically using the constructs for the baste primitives and operators ofan RE and building ups using Thompson's Construction process, ‘The steps are: Step-1, Parse the given RE into its constituent sub-expressions. Step-2, Construct NFA-s for each of the basic symbols in the given RE. For e, construct the NFA as in Fig-1. Here i is a new start state and f is a new accepting state, It is clear that this NFA recognizes the RE {e}. For every a in the alphabet S, construct the NFA as shown in Fig-2. Here again iisanew start state and f is a new accepting state. It is clear that this NFA recognizes the RE {a}. If a occurs several times in the. given RE, a separate NFA is constructed for each ‘occurrence of a. : r-O Fig 1. NFA fore Fig 2. NFA for a character of the alphabet CMD-29 POPULAR PUBLICATIONS Step-3, Combine the intermedial expression is obtained. Each int construction corresponds to a sub-express properties — it has exactly one final state, the final state. ; nd t, respective PAS! egular expressions $ a! Y. The NFA-s for regtt ite NFA-s N(sIt), N(St) and N(st Suppose N(s) and N(v) are the NFA-s for regular expression s| s*, the compos! : are constructed as shown itt Fig: 4 En OO © Fig.:5 i NFA for the enti Fi ively until the enti inductive » duced during the course a RE and has several importan, state and no edge leave, diate NFA-S rermediate sion of the no edge ent given ters the start ft, st and Fig- 3, Fig-4 and Fig-5, respectively: Fig: 3 d) LEX: Refer to Question No. 3(a) of Lon; ie = a ig Answer Type Questions. YAAC: Refer to Question No. 3(b) of Long Answer Type Questions. COMPILER DESIGN PARSING AND CONTEXT FREE GRAMMAR 4. If all productions in a grammar G = (V,T,S,P) are of the form A->xB or A>x, A,B c Vand x & T*, then it is called: [WBUT 2008] a) Context-sensitive grammar b) Non-linear grammar c) Right-linear grammar d) Left-linear grammar Answer: (c) 2. An inherited attributes is the one whose initial value at a parse tree node is defined in terms of [WBUT 2009, 2015] a) attributes at the parent and / or siblings of that node b) attributes at children nodes only c) attributes at both children riodes and parent and / or siblings of that node d) none of these Answer: (c) 3. The intersection of a regular language and a context free language is [WBUT 2009] a) always a regular language . _ b) always a context free language c) always a context sensitive language —_d) none of these ‘Answer: (b) * 4. The grammar E E+E | E*E | ais [WBUT 2010, 2013] a) ambiguous b) unambiguous c) not given-sufficient information d) none of these Answer: (a) 5. Parse tree is generated in the phase of [WBUT 2011, 2018] a) Syntax Analysis b) Semantic Analysis. ¢) Code Optimization d) Intermediate Code Generation Answer: (a) 6. If the attributes of the parent node depends on its children, then its attributes are called [WBUT 2012] a) TAC b) synthesized ¢) inherited 4) directed Answer: (c) 7. The expression wew where w belongs to {a, b} + is [WBUT 2014] a) regular b) context free ¢) context sensitive ; d) none of these Answer: (b) CMD-31 POPULAR PUBLICATIONS [WBUT 2015) 8. The grammar S -» Sa, |Sa,}P, |B a) is left recursive b) has common left factor factor a is left recursive and also has common left d) is a CFG Answer: (a) [WBUT 2016} ihe shown at parse tree nodes arse node ree nodes 9. An annotated parse tree is a parse a) with values of only some attribute b) with attribute values shown at the i c) without attribute values shown at the Lediedy : d) with grammar symbols at the parse tree node! Answer: (b) : [WBUT 2017] 40. The output of the parser is Sell a) tokens. b) syntax tree c) parse tree d) non-terminals Answer: (c) 44. The grammar S — aSa|bS | cis [WBUT 2018] a) LL(1) but not LR (1) b) LR(1) but not LL(1) ¢) Both LL(1) and LR(1) d) None of these Answer: (c) 42. Grammar of the programing is checked at........ phase of compiler.[WBUT 2019] a) Semantic analysis b) Syntax analysis c) Code optimization d) Code generation Answer: (b) 43. A grammar that produce more than one parse tree BUT 204 a) Ambiguous b) Unambiguous _c) Regular Al of fred Answer: (a) 44, ssssseueee 16 the most general phase struct svar a) Context sensitive : Roa [WBUT 2019] ¢} Context tree 4) All of these ’ Answer: (a) 45. Given a grammar G = (V, T, P, §} : where AlsinV and ais in(¥ UT}then Gig roeucwen InP i ofthe form A> me Answer: context free grammar [WBUT 2022] CMD-32 COMPILER DESIGN Short Answer e Questions 4, What is “handle'? Consider the grammar EE +nlE 'n|tt For a sentence n+n*n, write the handles in the right-sentential forms of the reduction. : [WBUT 2006] What is predictive parsing? oR, What do you mean by a Handle? Give example. ‘Answer: handle of a right sentential form gamma is a production Alb where the string beta may be found and replaced by A to get the [WBUT 2013] and a position in gamma previous rij ight-sentential form. The right-most productions are shown below. ‘The handles are underlined. EDEtn > Etn*ndntntn The handles are underlined. A predictive parser is a recursive descent parser that does not require backtracking. Predictive parsing is possible only for the class of LL(k) grammars, which are the context-free grammars for which there exists some positive integer k that allows a recursive descent parser to decide which production to use by ‘examining only the next k tokens of input. (The LL(k) grammars therefore exclude all ambiguous grammars, as well as all grammars that contain left recursion. Any context-free grammar can be transformed into an equivalent grammar that has no eft recursion, but removal of left recursion does not always yield an LL(K) grammar.) A predictive parser runs in linear time. : ‘A table-driven non-recursive predictive parser (for an LL(1) grammar) uses an explicit parsing stack and a parsing table M[A;a] (where A is a nonterminal and ais a terminal) Phere an entry M[A:a] is either a production rule or an error, A predictive parsing algorithm uses this table to carry out top-down parsing. 2. Consider the following context-free grammar: [WBUT 2008, 2009] $—SS+|SS*|a 2) Show how the string aa+a’ can be generated by this gramynar. b) Construct a parse tree for the-given string. c) What language is generated by this grammar? Answer: a) The string aa+a* can be generated by the rightmost production: ‘5 > SS*—> Sat > SS +.a* > Sa+a*— > aata* b) The required parse tree i 4 adios: $s $s . ZAIN | a+ a 1 aa ©) This grammar generates Binary Postfix Expressions involving a with + and * as the : only operators. CMD-33 POPULAR PUBLICATIONS WELT 2009 mmar’ 3. Consider the following left-linear grat S— Sab) da Aa Abb bE - +, hi ar. Find out an equivalent right-tinear gramm cats tanste Answer: co (bb)tateb)* where More This grammar generates the regular language © oceurences”, An equivalent right-linear grammar: S—bdS|bbad AabAlad [WBUT 2008, 2016, 2017) 4, What is a handle? segieeslid Consider the grammar E> E+ Sat Gp ee Find the handles of the right sentential forms of reduction of the string id +id* id Answer: A handle of a right sentential form gamma where the string beta may be sentential form. ete re ‘The given grammar is ambiguous. There are two right-most derivations for the string id+ id* id. We give both the rightmost derivations, underlining the handles in each case. E>E+E->E+E*ESE+E*id + E+id*id id tid tid E+E*E+E*id +E+Etid + E+id*id aid tid*id n A—>b and a position in ma is a productio Dt , aaa i A to get the previous right. found and replaced by 5. What is error handling? Describe the Panic Mode and Phrase level error recovery technique with example. [WBUT 2011, 2018] “Answer: . Programs‘submitted to a compiler often have errors of various kinds. When a compiler detects an error, i.e., when the symbols in a sentence do not match the compiler’s current position in the syntax diagram, the error handler is invoked. The error handler warns the programmer by issuing an appropriate message and then attempts to recover from it so that it can detect more errors. The simplest form of error recovery technique is “panic ‘e a symbole ae used to delimit ‘clean pein’ in the tious, Who ag ene on mode recovery algorithm deletes input tokens until it finds a safe backs the parser out to 2 context in which that symbol might a aig symbol, ther tae jf Pascal code: if a b then x else y; appear. In the following fragment Compiler discovers the error at b and a panic- : i forward to the semicolon, thereby mnising the ss Wing ad Sleorithin very likely sr again produces a spurious error message, * When the parser later finds the els In phrase-level error recovery, the quality of recover sets of safe symbols in different contexts, Whe; ty is improved by employing differe™ nte: " compiler discovers an error in COMPILER DE! IGN a phrase level ur above example, a ore rent that follow the erroneous expression. In . eae : ort cafe symbols and give @™ recovery Would use the then and else tokens as the nm realistic error message. if eo 6, Consider the following grammar G. Alternate the production so that it may fret from backtracking. Statement — if Expression then Statement else Statement ut 2012] Statement + if Expression then Statement Answer? | Statement — if Expression then Statement Trailer Trailer + else Statement tat Trailer > & 7. Explain left factoring with suitable example. [WBUT 2013, 2014] Answer: Refer to Question 5(a) of Long Answer Type Questions. : 8. Consider the context-free grammar: [WBUT 2017] SSS +|SS*le. a) How the string aa+a* can be generated by this grammar? b) Construct a parse tree for this string. Answer: Similar to Question No. 2(a) of Short Answer Type questions. 8. Eliminate the left-recursion for the following grammar: [WBUT 2017] S3(L)la L—+Is|S Answer: sala | Ee [a isis| ft recursion elimination ap L + Tose A Aalp Losuite | | 4724, | orm of left recursion) : A'elad’ = Afr left recursion elimination + S—(L)\a L—st' \~ (Ans) Losule CMD-35 POPULAR PUBLICATION: an uestions Long Ans’ (weut 2o10y 18 rammar: 4. Construct a predictive parsing table for the 9! S + iEtSS'|a S'>eS|c : minals. oie to and ist 8, & B are for Here § is star symbol and S' are non-terminal Explain the steps in brief. Answer: sta From rules S — iE1SS’| a, we get FIRST(S)= {1,4} From rules .S"—> eSs|e, we get FIRST(S')= {2,6} - From rule E >, we get FIRST(E) = {b}- Using the FIRST sets and the rules, we get: FOLLOW (S) = {8} U FIRST(S') = {e,$} « FOLLOW (S')={e,8}. FOLLOW(E)={t}. Suppose the predictive parsing table is the 2-dimensional array PLA, a], where A isa non-terminal and @ is a terminal. Since i ¢ FIRST(S), P[S,i] = S > iEtSs'. Since a FIRST(S) , P[S,a]=S—a. Since e FIRST(S'), P[S',é]=5S'—>eS. Also, since £€ FIRST(S'), PIS'e]=S' > because e< FOLLOW(S'). Hence, we have multiple entries. for M[S’,e]. Since $< FOLLOW(S') and obviously © FIRST(e); we have P[S',$]=S' > 6. ‘Since b€ FIRST(E), P[E,b]=E->b. No other rule is left to be scanned. So the predictive parsing table is: Non- Input Symbol terminal i t s : - < S| sims’ Soa se SS = : Sie E L =< 20) Dein 40 arama Conelder the following grammar: BUT 202 Ate Boe Test whether the grammar is LL(1) or not ai ras nd construct a predictive parsing table CMD-36 AS a ER eaten Is SLU S COMPILER DESIG! Answer: FIRST (A)={e}, FIRST (B)= {«} FIRST (S)= {a,b} FOLLOW (A)= {a, b}, FOLLOW (B)= {a,b}, FOLLOW (S)= {3} ‘The predictive parsing table is: a b $ S__[_S—» AaAb_| S— BbBa A A>E ADE B Boe Boe Since there are no conflicts, the grammar is LL(1) e the grammar b) Consider the following Context Free Grammar (CFG) G and reduct [WBUT 2012] by removing all unit productions. Show each step of removal SAB ‘ Ava B>Clb C+D DoE E>a Answer: State: S— AB Awa BoC|b . cD DE E-a Step-1: S— AB Ava BoC|b c3D D-a Ea Step-2: SAB - Ava “BC\b Ca Da Ea CMD-37 POPULAR PUBLICATIONS Step-3: S > AB Aa Bald Coa D-a eae mmar is ambiguous by that the 9°! c) Consider the following gramma! G. Show r gitence ‘abab'. constructing two different leftmost derivations for the [WBUT 2012] S— aSbS|bSaS|¢ Answer: a ‘The two possible left-most derivations for bab! are: bases" () SE sas abs 25a ababS—"> abab Gi) SEA page Sab Sa8bS. soe bao 525 ababS. ins [WBUT 2014] 3. a) Consider the grammar: $= aSbs |bSaS |E (i) Show that. this. grammar is ambiguou most derivations for the sentence abab. - (i) Construct the corresponding right most derivations for abab. (ii) Construct the corresponding parse trees for abab. {iv) What language does this grammar generate? by () Show that no left recursive grammar can be LL (1). (i) Show that no LL(1) grammar can be ambiguous. Answer: 2) (i) Refer to Question No. 2(c) of Long Answer Type Questions. is by constructing two different left (ii) S + aSbS — aSb — abSaSb — abSab — abab (ili) Consider a string ‘abab’. We can construct parse trees for deriving ‘abab’ AN ASSN Ro SS ta i | (iv) This grammar generates the language whi : and b's the empty string also. Buage which contains strings of equal number of 'S CMD-38 COMPILER DESIGN: b) (i) The production is left-recursive if the leftmost symbol on the right side is the same as the non terminal on the left side. For example, expr — expr + term. ’ Left-recursive grammars can never be LL(1), because the left-recursion will lead to an infinite loop. Consider the grammar A> plAa and try to parse the string B a. « Ifwe removed the left recursion the grammar becomes A> BA' A sade which derives the same string but towards the right instead of the left = Now parse a @. # doesn’t match @, therefore try another alternative for A. There are none, so parse fails. With right recursion, we will be matching part of the input string as we go along = Left recursion indicates we are building the string from right-to-left = To eliminate left recursion, we turn it into right recursion and build the string left-to- right (ii) Any LL(1) grammar is unambiguous because by definition there is a at most one left nest derivation for any string. LL(1) grammar cannot be ambiguous since our parsing algorithm for LL(1) grammars builds the only possible parse tree for a sentence in a deterministic way, (In general, an ambiguity in a grammar will manifest itself in the fact that there are two productions competing for the same cell in the parse table.) 4. a) Construct NFA from the regular expression using Thompson’s method L=aa(a|b)* ab. b) Write regular definition for the following language: All strings of letter that. contain the five vowels in order. ¢) Construct the predictive parsing table for the following grammars: S Aadb| BbBa Ase Boe [WBUT 2017] Answer: a) L=aa{a{b)* ab Thompson method — i) a Oo CMD-39 POPULAR PuBLicaTiONS iii) aa(a|b)*ab b) D'= {set of all alphabets}, Y= {set of all consonants} Regular Exp: Del eLILL where >” means all possible combination of, alphabets (consonants). ©) Predictive Parsing Table: + S-» AaAb| BbBa Ate Boe. CMb-40 COMPILER DESIGN : Parsing Table |__Fitst_ | Follow a Bb é s>Aadb | {a,d} | {3} s | S—>Aadb | S— Aadb sg BbBa | {a,b} | {3} A Ase Ate Ate fe} | {ad} B Boe Boe Boe {e} {a,b} 5, Write short note on the following: a) left factoring [WBUT 2008, 201 5) b) context-free grammar [WBUT 2008, 2016] Answer: a) left factoring: Left factoring is an important step required to transform a given grammar to one that is suitable for building an LL (Le., top-down) parser. This step is carried out after removing all left recursion. Even if a context-free grammar is unambiguous and non-left-recursion, jt still may not be LL{1). 4 The problem is that there is only look-ahead buffer. The parser generated from such grammar is not efficient as it requires backtracking. To avoid this problem we left factor the grammar. To left factor a grammar, we collect all productions that have the same left hand side and begin with the same symbols on the right hand side. We combine the common strings into a single production and then append a new nonterminal symbol to the end of this new production. Finally, we create a new set of productions using this new nonterminal for each of the suffixes to the common production. : Suppose we have production rules: AaB, | OP, |---| OB, After left factoring the above grammar is transformed into Aad, > B|Bylmel B The above grammar is correct and is free from conflicts. b) context-free grammar? . The formalism of Context-free grammars was developed in the mid-1950s by Noam Chomsky who used in the study of human languages (ie., Natural Languages). Later, Context-free Grammars found an extremiely important application in the specification and compilation of programming languages.” A grammar for a programming language is the starting point in the design of compilers and interpreters for programming languages. Most compilers and interpreters contain a component called a.parser that extracts the meaning of a. program prior to generating the compiled code or performing the interpreted execution. A number of methodologies facilitate the construction of a parser once a Context-free grammar is available. Some tools even automatically generate the parser from the grammar. CMD-41 POPULAR PUBLICATIONS CFG-s, are more expressive In terms of generative power, Context-free Gramm Regular Expression and Finite than Regular Grammars (or equivaient formalisms |" Automata). Formaily, a Context-free Grammar is defin G=, where: ai = Vyisa finite set of terminals. The set of terminal language as well as e. = V, isa finite set of non is disjoint from the terminals. «Pisa finite set of production rules of the tyP° = gis an element of Vy, the distinguished starting non Symbol. ed as a 4-tuple constitute the alphabet S for te nals constitute 4 special alphabet thay terminals, The non-term! (VU Vo" ye V2 PV nial, often call the Star CMD-42 : COMPILER.DESIGN OPERATOR PRECEDENCE PARSING Very Short Answer Type Questions 4, hich data structure is mainly used during shift-roduco parsing? [WBUT 2010, 2012] a) Pointers b) Arrays 0) Stacks d) Quouos “Answer? (C) 2. In operator precedence parsing, precedonce rolations aro dofinod BUT 2013, 2019] b) for all pair of torminals a) for all pair of non-terminals d) only for a cortain pair of torminals ¢) to delimit the handle Answer: (4) Short Answer Type Questions 4. Define an operator grammar. [WBUT 2006] “OR, What is an operator grammar? Give an example. [WBUT 2009] Answer: ‘A grammar is called an Operator Grammar if there is no production and no right hand side of any production has two adjacent non-terminals. For example, the grammar: SDxlylziS + SIS - SIS * SIS / SS) is an operator grammar while the grammar: SPASA ADab isnot an operator grammar since the rule A > ASA has two non-terminals side by side. 2. Describe the algorithm for eliminating left recursion from a CFG. Eliminate loft recursion from the following grammar. S— Aalb A Ac|Sdle Answer: 1 Part: Algorithm for Eliminating General Left Recursion : Arrange nonterminals in some order Ai, Agr +++ + Ane for i to n do begin for j 1 to i-1 do begin’ Replace each production of the form Ai -> AB by the production: Ay = onBloaBl ~~» [ou [WBUT 2015] CMD-43 POPULAR PUBLICATIONS where 2 Aye a | om | tions+ * produc om the AL we [eh are all the current Aj ee sion £1" Remove immediate left recurs: | i sar’ productions, if necessary end { for i} 2” Part: =A, A= A?). jeft recursion from the g ' Do cetig ® Let’s use the ordering 8, A (S = ¢ immediate When i = 1, we skip the “for j” loop and remov oductions (there isnone). ficaat When i =2 and j = 1, we substitute the S-producti productions : A— Ac|Aad|bd|e a Avis picts aie . Ra arts left recursion from the A productions yiel gre It S— Aalb A— bdA’| A A’ CA’ | ada’ |e in A — Sd to obtain the a. | 3. Describe about the operator precedence parser. [WBUT 2017] Answer: . ets Operator precedence grammar is kinds of shift reduce parsing method. It is applied toa small class of operator grammars. 4 A grammar is said to be operator precedence grammar if it has two properties: * NoRH.S. of any production has a€. * Notwo non-terminals are adjacent. Operator precedence can only established between the terminals of the grammar. It ignores the non-terminal. There are the three operator precedence relations: z a> b means that terminal “a" has the higher precedence than termi inal "a an terminal "b". a is encountered. ¢ Scan towards left over all the e 3 ; encountered. ‘ual precedence untit the first left most < » Everything between left most < i < an p| SI i + Son means parsing is successful ett MOS" > is a handle, CMD.44 COMPILER DESIGN = Es C y id s = > < < > < > * > > < > < > C < < < = < x z 2 P x S X > id > > x > x Ed $ Ls < < x < x 4, Remove left recursion S > Aa/b, A> Ac/Sd/e. [WBUT 2019] Answer: Step-1: Remove immediate left recursion.in A — Ac A—SdA' eA’ AiacA' ls Step-2: @ S—SdA‘a|cBralb A'— SdA' eA" AiocA' |e (i After removing immediate left recursion is S— SdA‘a 7 S— eA’aS' | bS' Si dA‘aS' le A—SdA’|eA’ Als cA' |e The grammar is now free from left recursion. ‘Long Answer Type Questions 4. a) What is left-recursion? Illustrate with suitable example. Consider the following grammar G. Find out the left recursion and remove it: [WBUT 2012] S— Bola B Be|Sd|e ‘Answer:* 1“ Part: Left recursion: when one or more productions can be reached from themselves with no tokens consumed in-between. Left recursion is a particular form of recursion that cannot be directly handled by the simple LL (1) parsing algorithm. Left recursion just refers to any recursive non-terminal that, when it produces a sentential form containing itself, that New copy of itself appears on the left of the production rule. ULAR PU NS 2™ Part: Step-1: emove immediate lef recursion in B pe B— SdB' | eB’ Bi cB' |e Step-2: jw S$ — SdB’b | cB’b | a B’ — SdB' | eB" Bi cB’ |e (ii) After removing immediate le! S— SdB’b S— eB’bS' | aS" S' > dB'bS' |e B= SdB' | eB’ BY cB’ |e The grammar is now free from left recursion. ft recursion is b) What is Operator Precedence Parsing? Discuss about the advantage and disadvantage of Operator Precedence Parsing. [WBUT 2012, 2015} Consider the following grammar: E>TA As+TAle" TFB Bo*FBle jis grammar is Operator Precedence Grammar or not and show how the string w=id +id*id +id will be processed by this grammar. (WBUT 2012] Answer: 1" Part: ‘An operator-precedence parser is a simple shift-reduce is is parser capable of, a subset pa LR) panes More precisely, the operator precedence aero Geral LR() grammars where two consecuti -termi i i eae secutive non-terminals never appear in the right-hand side of The technique of parsing through this parser is Operator Precedence Parsing 2° Part: Advantage: - Because of its simplicity numerou ier usi i peed $ compiler using operator precedence parsité — Can have been built for the entire language. CMD.-46 Disadvantage: _ Hard to handle token like unary minus ~ Worse, since relation between a grammar for the language being parsed and the operator precedence parser itself is week. ‘That is cannot always be sure the parser accepts exactly the desired language. Last Part: The given grammar is NOT an operator precedence grammar. It is not even an-operator grammar because: (i) There are several productions (e.g. T —* FB) where two non-terminals occur side by side in the right hand side. (ii) There are several ¢ productions. +id* id +id CANNOT be parsed by the given grammar since the terminal ‘=" is not in the terminal-set of the grammar. 2. Write short note on Left Recursion. [WBUT 2018] Answer: ‘ Refer to Question No. 1(a) of Long Answer Type Questions. POPULAR PUBLICATIONS ARSING [WBUT 2006, 2008] a torage which is freed 4. A dangling reference is a) pointer pointing to s b) pointer pointing to nothing a sei ¢) pointer pointing to storage winch is $0 d) pointer pointing to uninitialized sto’ Hin use [WBUT 2008, 2009) Answer: (a) ? 2, Which of the following is not a loop oop jamming oe a) Loop unrolling . dg) Induction variable elimination c) Loop heading Answer: (c) ‘ , 2012, 2013, 201 3, Atop down parser generates ; peut ao on 5] a) leftmost — derivation b) rightmost — Cor ion in reverse ¢) leftmost derivation in reverse d) rightmost derivation ! Answer: (2) 4. FIRST(aB) is [WBUT 2011] a) FIRST(@) b) FIRST(a)U FIRST(B) ¢) FIRST(a)U FIRST(A)if FIRST(a) contains € else FIRST(a) d) none of these Answer: (c) 5. Agiven grammar is not LL(1) if the parsing table of a grammar may contain a) any blank filed b) any e-entry [WBUT 2014, 2019] ¢) duplicate entry of same production —_d) more : ‘Answer: (d) ) than one production rule 6. First pos of a (dot) node with Jeaves c1 and c2 a) firstpos(cl) W firstpos(c2) meals tweurzot b) firstpos(cl) A firstpos(c2) ¢) if (nullable(cl)) firstpos(el) U firstpos(c2)else fi a ‘st d) if (nullable(c2)) firstpos(cl) rea chvin teers Answer: (c) t a factoring puarantee a) not occurring of backtracking IWBUT 2011, 2018] b) cycle f z c) error free target code d) correct LLtt} panne a3 ing table Answer: (a) CMD-48 COMPILER DESIGN 8. Given the grammar S—> ABc, A>ale, B> bls FOLLOW(A) is the set ‘ [WBUT 2015] a) {3} by {b} c) {b, ¢} d) {a,b,c} ‘Answer: (C) Hi [WBUT 2017] 9, Which one of the following is a top-down parser? a) Recursive descent parser b) Operator precedence parser ¢) An LR(k) parser d) An LALR(k) parser ‘Answer: (2) Short Answer estions 4. Eliminate left recursion from the following grammar: EDE+TIT TOTFIF FOF*|alb Answer? ‘After eliminating the left recursion, the resulting grammar (wit terminals E’ and"T’) have the rules: [WBUT 2007, 2014] ith three additional non- E>TE' E'->+TE'\é TFT’ T'>FT'\e F->aF"|bF Fio*F'|e 2. What is recursive descent parsing? Describe the drawbacks of recursive descent parsing for generating the string abe from the grammar: = [WBUT 2011] S—aBx B+be{b Answer: Recursive descent parsing is a simple way to construct a top-down parser by regarding each production rule as a function, where: a «The name of the function is the non-terminal on the left hand side of the rule. © Each instance of a non-terminal on the right hand side is a call to the corresponding function. * Each instance of a terminal on the right hand side tells us to match it with the input symbol and thereby ‘consume’ it. * Parsing happens as the execution of the function corresponding to the start symbol. mi given grammar, we have problem is constructing the function for the non-terminal because both the B productions start with the terminal b. CMD-49 POPULAR PUBLICATIONS 3. What is recursive descent descent parsing for generating th Ss abe Bbeld ; ore uswer Type Questions, P'part: Refer to Question No. 2 (1 Part) of Short Anse parsing? Desc! backs of r ribe the drawbat aes tbe! from the grammar: cur e strl rs WBUT 2017, 29, 2™ Part: Recursive Decent Parsing: Spams } drawbacks of generating abe Bo beld = The recursive decent parser uses the concept of ‘backtracking String L Grammer |—> [RD Parser + (Fe Tee] Now, here we have the grammar — from the given grammar while parsing. S—aBe B->be|b String w:abe ; + ip pir Step 1: 3 s Now, i/p and decent ptr matched, so we go on increasing both te Lf [ Ne pointer. eB os t Decent pir Step 2: ‘ 2 CAS YN JN © oR TBs s 0s | heck ' Decent ptr t Decent pir CMD-50 Now, for the recursive decent parser, it will give chance to each not a matched production, then it will backtracked again use the other 0} production. For example: Here B+ be Now, the parser can’t determine whicl Now, it is a drawback of this parser as, every time if the wr the parser needs to backtrack and again use other available producti So, to solve this drawback of recursive decent parser, the concept 01 or Bob COMPILER DESIGN and every production, if ptional +h production is useful to generate abe. ong production is chosen then ions. , f predictive parsing is used frequently. 4, Differentiate between top-down and bottom-up approach. [WBUT 2019] Answer: Features Top-down Approach Bottom-up approach This technique uses right most T-Technique used This technique uses left most derivation. derivation. 2. Level strategy It is a parsing strategy that first looks at the highest level of the parse tree and works down the parse tree by using the rules of Tt is a parsing strategy that first looks at the lowest level of the parse tree and works up. the parse tree by using the rules of grammar. grammar. Bottom-up parsing can be decision is to select what production rule to use in order 3. Definition Top-down parsing attempts to find the left most. derivations | defined as an attempt-to reduce for an input string. the input string to start symbol i of a grammar. 4, Goal Tn this approach the main| In this approach the main decision is to select when to use a production rule to reduce the to construct the string. string to get the’ starting [ symbol. 5. Explain the model of a non recursive predictive parser with a diagram. [WBUT 2022] Answer: Refer 10 Question No. 4.) of Long Answer Type Questions. 1. Which parser is used for the implementation of recursive descent parsing? Draw a model diagram for that parser. Construct the parsing table for the grammar: CMD-51 ToF Ti tFT' | Fo (Eid onts how tho string id + id * id ig ‘With the help of parsing table and other compol _ DWBUT 2006] parsed? 58 Consider the following grammar: ESE+TT To TIF Fe bove grammar. i) Obtain the FIRST and FOLLOW sets for the abo i Construct the predictive parsing table for the above grammar. [WBUT 2015] Answer: 1" Part: c . i A commonly used non-recursive (in program code) implementation of a recursive descent parser is the Predictive Parser. we may perhaps remember that the If we recall our lessons in Formal Language Theory, Ne h automata for a CFG is a Push-Down Automata (PDA). It may be worthwhile mentioning here that an LL parser is in effect an implementation of PDA. Now, a PDA was defined as working with a “push-down store” or stack. Where is this stack in the recursive descent parser? The stack in recursive descent parser is the same stack that programs use while using subroutine calls. As we have seen, a trivial implementation of a recursive parser for a given grammar (without left-recursion, say) may require too many backtracking. However, the backtracking may be eliminated by using the concepts of left- recursion removal, FIRST, FOLLOW, etc. Predictive Parsing IEE DEETS Routine : Parsing Table M 2 ERER cK alk Predictive Parser in Action Clearly, FIRST(F) = {(\id} Also, FIRST(I") = {*,¢} and FIRST(E?) = (+,¢) Now FIRST(E) contains everything in FIRST(T) and FIRST(T) contains everything in FIRST(R), CMD-52 + COMPILER DESIGN From this we get: FIRST(E) = FIRST(T) = FIRST(F) = ((id}. Next, we can start with FOLLOW(E) = {),8}. FOLLOW(T) = FIRST(E’) - {e} = {+}. FOLLOW(E’) = FOLLOW) = ().5}. FOLLOW(T) = FOLLOW(T) = {+} because FOLLOW(T) contains all none gymbols in FIRST(E’) Also, since ¢ € FIRST(E'), we get FOLLOW(T) = FOLLOW(T’) = {+,),8}. FOLLOW(F) = FIRST(T’) — {e} = {*} because FOLLOW(F) contains all non-e symbols in FIRST(T’). Also, since ¢ € FIRST(T’), we get FOLLOW(F) = {+,*,),$}. The Predictive Parsing Table is: Non- INPUT SYMBOL. Terminal + 7 ( ) id $ E ETE’ ETE’ E’ E'— TE’ Ee Ee T TFT Torr T Toe [Tor Toe Toe EF F>() Fo id ‘The Predictive Parser’s action for input id + id *idS is as shown below: Stack Input Output SE id + id * idS SE'T id + id * idS ETE SE'T'F id + id * id$ E> FT SE'Tid id + id * idS Fo id SE'T’ + id * ids SE’ +id * idS Toe SE'T+ + id * id E’—+TE' SE'T id * id$ SE'T'F id * ids TFT SE'T'id id * idS F>id SET" * idd SE'T'F* * ids TT’ *FT" SE'T'F idS S$E'T'id id Eid SET’ $ SE’ $ Te CMD-53 POPULAR PUBLICATIONS at] $ towing grammar: tf (WBUT 2003) 2. Design an LL(1) parsing table for the fo! S>aAcd|BcAe ADble BSC Tid be Cote ing fefcbe is With the help of the parsing table, show how the string Ausyers ion — there is not even a Clearly, the given grammar is free from any Je IRS T and FOLLOW sets q immediate left-recursion. So, we begin by computing the It the non-terminals: , a = \d FIRST(C) = {f}. FIRST(S) = {a,d, f }, FIRST(A) = {b, c}, FIRST(B) = (4, £} ant eo rOLLOWEKe ia eae {ce}, FOLLOW(B) = 0. and FOLLOW(C) = {f}. parsed. The parsing table is therefore: a b c d ei B $ S| S= ated S— Bede S— Bede A A>blAse ABS B Bod Boo c Co fe The parse for the string fefcbe proceeds as follows: S — aAcd|BcAe A> ble BOCfid C—fe Stack Input Output/Action $s fefcbe$ START SedcB fefcbe$ S—>BcAe SedefC fefcbes B—=Cf SeAcfef fefcbes Co fe Sedcfe efcbes | POP SeAcf ficbe$ POP $eAc ches POP $e bes * pop oe bes Ai 2 es $ ; Pop SUCCESS CMD-54 COMPILER DESIGN: 3. a) Consider the following Grammar: [WBUT 2008) TFT" Tooter |e Folia i) Obtain the FIRST and FOLLOW sets for the above grammar. ii) Construct the Predictive Parsing table for the above grammar. b) Explain the Predictive Parser’s action by describing the moves it would make on an input id + id *id$, Answer: Refer to Question No. 1 of Long Answer Type Questions. 4, a) Describe with a block diagram the parsing technique of LL(1) parser. {WBUT 2011, 2018] b) Parse the string abba using LL(1) parser where the parsing table is given below: [WBUT 2011, 2018] A b s] Ss |[S—>aBa -| B B> B — bB ¢) Check whether the following grammar is LL(1) or not. S—iCtSE|a EveS|e [WBUT 2011] Cb Answer: 2) LL(1) parsers can be of various types. However from the question it appears that a description of a “Non-recursive Predictive Parser” is to be given: ‘A Non-Recursive Predictive Parser is an LL parser implementation based on a table driven algorithm that uses an explicit stack. The block diagram of a typical predictive parser is shown below: Predictive Parsing Routine _f Parsing Table M Predictive Parser in Action The parsing table is of type M[A,a] which indicates which production to use if the top of the stack is a non-terminal A and the current token is equal to a. If there is a valid (i.¢., CMD-55 POPULAR PUBLICATIONS stack and push all the right-hang a a). If that happens to be A~» jie., we push Z first, followed ie he nor “error") entry M[4,a], we pop 4 eal symbols of the production stored in the entry we push symbols into the stack in the reverse order, and X, We use a special symbol $ to denote the Assuming $ to be the start sym! is as follows: push(S); a= CurrentInput(): do { X= pop0 if (X is a terminal or ’$’) { if (X =a) { Consumelnput(); a= Currentinput(); } else Error(); else if (M[X,a] = "X 7 Y1 2... Yk") { push(Yk); rae de secisive predietive parsing algorithm 01, COMPILER DESIGN ¢) From grammar rules and properties of FIRST and FOLLOW we get FIRST(S) = {ia}. FIRST(E) = {c,¢}.FIRST(C) = {b}. FOLLOW(S) = {$}UFIRST(E) = {c,$}. FOLLOW(E) = {¢,$}. FOLLOW(C) = {t}. The predictive parsing table then becomes: ‘Non- } Input Symbol terminal 1 t € a b $ s | s>icisE. sa } E | E—>eS soe Ec [ie 2) [c>> Clearly, there is a duplicate entry in P[E,eJeontains multiple entries and hence the grammar is not LL(1). 5. Consider the following grammar: [WBUT 2013] S—aABb . Ascle Boodle Prove the above grammar is LL (1). Draw the parsing table. os Now check whether the string “ab” and “acdb” are the languages of the above grammar. (Derive each step with the help of a stack.) ‘Answer: S— aABb IRST(aABb) = {a} IRST(c) U FIRST(e) = (¢, & } : FIRST(B) = FIRST(d) U FIRST(@)=(4,e} Since the right-end marker $ is used to mark the bottom of the stack, $ will initially be immediately below S (the start symbol) on the stack; and hence, $ will be in the FOLLOW(S). Therefore: FOLLOW(S) = {3} Using S + aABb, we get: FOLLOW(A) = FIRST(Bb) = FIRST(B) - {e} U FIRST(b) = (d, c}— {e} U (b} = (4, b} FOLLOW(B) = FIRST(b) = {b} CMD-57

You might also like