lecture notes of compiler design lab

Uploaded by

vishalmenaria86

Module-1
Introduction to Compiling

1.1 INTRODUCTION OF LANGUAGE PROCESSING SYSTEM

[Figure: skeletal source program → preprocessor → source program → compiler → assembly program → assembler → relocatable machine code → loader/link-editor (which also takes library and relocatable object files) → absolute machine code]
Fig 1.1: Language Processing System

PREPROCESSOR
A preprocessor produces input to compilers. It may perform the following functions.
1. Macro processing: A preprocessor may allow a user to define macros that are shorthands for longer constructs.
2. File inclusion: A preprocessor may include header files into the program text.
3. Rational preprocessor: These preprocessors augment older languages with more modern flow-of-control and data-structuring facilities.
4. Language extensions: These preprocessors attempt to add capabilities to the language by what amounts to built-in macros.

COMPILER
A compiler is a translator program that takes a program written in a high-level language (HLL), the source program, and translates it into an equivalent program in machine-level language (MLL), the target program. An important part of a compiler's job is reporting errors to the programmer.

[Figure: source program → compiler → target program, with error messages as a second output]
Fig 1.2: Structure of Compiler

Executing a program written in an HLL basically consists of two parts. The source program must first be compiled (translated) into an object program. Then the resulting object program is loaded into memory and executed.

[Figure: source program → compiler → object program; object program input → object program → object program output]
Fig 1.3: Execution process of source program in Compiler

ASSEMBLER
Programmers found it difficult to write or read programs in machine language. They began to use a mnemonic (symbol) for each machine instruction, which they would subsequently translate into machine language. Such a mnemonic machine language is now called an assembly language. Programs known as assemblers were written to automate the translation of assembly language into machine language.
The input to an assembler program is called the source program; the output is a machine language translation (object program).

INTERPRETER
An interpreter is a program that appears to execute a source program as if it were machine language.

[Figure: input → process → output]
Fig 1.4: Execution in Interpreter

Languages such as BASIC, SNOBOL and LISP can be translated using interpreters. Java also uses an interpreter. The process of interpretation can be carried out in the following phases.
1. Lexical analysis
2. Syntax analysis
3. Semantic analysis
4. Direct execution

Advantages:
* Modifications to the user program can be easily made and implemented as execution proceeds.
* The type of object that a variable denotes may change dynamically.
* Debugging a program and finding errors is a simpler task for an interpreted program.
* The interpreter for the language makes it machine independent.

Disadvantages:
* Execution of the program is slower.
* Memory consumption is higher.

LOADER AND LINK-EDITOR
Once the assembler produces an object program, that program must be placed into memory and executed. The assembler could place the object program directly in memory and transfer control to it, thereby causing the machine language program to be executed. This would waste core by leaving the assembler in memory while the user's program was being executed. Also, the programmer would have to retranslate the program with each execution, thus wasting translation time. To overcome these problems of wasted translation time and memory, system programmers developed another component called the loader.

"A loader is a program that places programs into memory and prepares them for execution." It would be more efficient if subroutines could be translated into an object form that the loader could "relocate" directly behind the user's program. The task of adjusting programs so they may be placed in arbitrary core locations is called relocation. Relocation loaders perform four functions.
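The relocation step described above can be sketched in C. This is only an illustration under stated assumptions: the ObjectProgram layout, the relocation table and the function relocate are hypothetical names invented for this sketch, not the format of any real loader. The image is assumed to have been assembled as if it were loaded at address 0.

```c
#include <stddef.h>
#include <assert.h>

/* Hypothetical object-program layout: an array of words plus a table
   listing which word offsets hold addresses that must be adjusted. */
typedef struct {
    unsigned     *words;    /* program image, assembled as if loaded at 0 */
    size_t        nwords;
    const size_t *reloc;    /* offsets of the address-bearing words */
    size_t        nreloc;
} ObjectProgram;

/* Adjust every address-bearing word so the image can run at load_base. */
void relocate(ObjectProgram *obj, unsigned load_base)
{
    for (size_t i = 0; i < obj->nreloc; i++) {
        size_t off = obj->reloc[i];
        assert(off < obj->nwords);   /* entries must stay inside the image */
        obj->words[off] += load_base;
    }
}
```

Words that are not listed in the relocation table (plain constants, opcodes) are left untouched, which is why the loader needs the table in addition to the image.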
1.2 TRANSLATOR
A translator is a program that takes as input a program written in one language and produces as output a program in another language. Besides program translation, the translator performs another very important role: error detection. Any violation of the HLL specification is detected and reported to the programmer. The important roles of a translator are:
1. Translating the HLL program input into an equivalent machine language program.
2. Providing diagnostic messages wherever the programmer violates the specification of the HLL.

1.3 LIST OF COMPILERS
1. Ada compilers
2. ALGOL compilers
3. BASIC compilers
4. C# compilers
5. C compilers
6. C++ compilers
7. COBOL compilers
8. Common Lisp compilers
9. ECMAScript interpreters
10. Fortran compilers
11. Java compilers
12. Pascal compilers
13. PL/I compilers
14. Python compilers
15. Smalltalk compilers

1.4 STRUCTURE OF THE COMPILER DESIGN
Phases of a compiler: A compiler operates in phases. A phase is a logically interrelated operation that takes the source program in one representation and produces output in another representation. There are two phases of compilation.
a. Analysis (machine independent / language dependent)
b. Synthesis (machine dependent / language independent)
The compilation process is partitioned into a number of sub-processes called 'phases'.

Lexical Analysis:
The lexical analyzer (LA), or scanner, reads the source program one character at a time, carving the source program into a sequence of atomic units called tokens.

[Figure: source program → lexical analyzer → ... → code optimizer → code generator → target program, with the symbol table alongside the phases]
Fig 1.5: Phases of Compiler

Syntax Analysis:
The second stage of translation is called syntax analysis or parsing. In this phase expressions, statements, declarations etc. are identified by using the results of lexical analysis. Syntax analysis is aided by techniques based on the formal grammar of the programming language.
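The lexical analysis phase described above, which carves the character stream into tokens, can be sketched as follows. This is a minimal illustration, not the scanner of any particular compiler: the token codes TOK_NUM and TOK_ID and the function next_token are names assumed for this sketch, and the input is taken from a string rather than a file.

```c
#include <ctype.h>
#include <stddef.h>
#include <assert.h>

/* Hypothetical token codes, chosen above the range of single characters. */
enum { TOK_END = 0, TOK_NUM = 256, TOK_ID = 257 };

/* Scan one token starting at *pos in src and advance *pos past it.
   Identifiers and numbers are reported by kind; any other non-blank
   character is returned as itself. */
int next_token(const char *src, size_t *pos)
{
    assert(src != NULL && pos != NULL);
    while (src[*pos] == ' ' || src[*pos] == '\t' || src[*pos] == '\n')
        (*pos)++;                               /* skip white space */
    unsigned char c = (unsigned char)src[*pos];
    if (c == '\0')
        return TOK_END;
    if (isalpha(c)) {                           /* letter (letter | digit)* */
        while (isalnum((unsigned char)src[*pos]))
            (*pos)++;
        return TOK_ID;
    }
    if (isdigit(c)) {                           /* digit+ */
        while (isdigit((unsigned char)src[*pos]))
            (*pos)++;
        return TOK_NUM;
    }
    (*pos)++;                                   /* single-character token */
    return c;
}
```

Calling next_token repeatedly on "count = count1 + 42" yields the kinds id, =, id, +, num and then the end marker, which is the token stream the syntax analyzer works on.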
Intermediate Code Generation:
An intermediate representation of the final machine language code is produced. This phase bridges the analysis and synthesis phases of translation.

Code Optimization:
This is an optional phase designed to improve the intermediate code so that the output runs faster and takes less space.

Code Generation:
The last phase of translation is code generation. A number of optimizations to reduce the length of the machine language program are carried out during this phase. The output of the code generator is the machine language program for the specified computer.

Table Management (or) Book-keeping:
This is the portion that keeps the names used by the program and records essential information about each. The data structure used to record this information is called a 'Symbol Table'.

Error Handlers:
An error handler is invoked when a flaw (error) in the source program is detected.

The output of the LA is a stream of tokens, which is passed to the next phase, the syntax analyzer (SA) or parser. The SA groups the tokens together into syntactic structures called expressions. Expressions may further be combined to form statements. The syntactic structure can be regarded as a tree whose leaves are the tokens; such trees are called parse trees.

The parser has two functions. It checks whether the tokens from the lexical analyzer occur in patterns that are permitted by the specification of the source language. It also imposes on the tokens a tree-like structure that is used by the subsequent phases of the compiler.

For example, if a program contains the expression A+/B, then after lexical analysis this expression might appear to the syntax analyzer as the token sequence id+/id. On seeing the /, the syntax analyzer should detect an error, because the presence of these two adjacent binary operators violates the formation rules of an expression.

Syntax analysis makes explicit the hierarchical structure of the incoming token stream by identifying which parts of the token stream should be grouped.
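The id+/id example above can be turned into a small check. The sketch below assumes a simplified encoding invented for illustration: each token is one character, with 'i' standing for id, and find_adjacent_operators is a hypothetical helper. A real parser detects this error as a side effect of matching the grammar rather than by a separate scan.

```c
#include <string.h>
#include <assert.h>

/* In this sketch '+', '-', '*' and '/' are the binary operators. */
static int is_binary_op(char t)
{
    return t != '\0' && strchr("+-*/", t) != NULL;
}

/* Return the index of the second of two adjacent binary operators,
   or -1 if the token string contains no such pair. */
int find_adjacent_operators(const char *tokens)
{
    assert(tokens != NULL);
    for (int i = 1; tokens[i - 1] != '\0' && tokens[i] != '\0'; i++)
        if (is_binary_op(tokens[i - 1]) && is_binary_op(tokens[i]))
            return i;
    return -1;
}
```

For the token string "i+/i" the function reports index 2, the position of the /, which is exactly where the notes say the syntax analyzer should flag the error.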
For example, A/B*C has two possible interpretations:
1. divide A by B and then multiply by C, or
2. multiply B by C and then use the result to divide A.
Each of these two interpretations can be represented in terms of a parse tree.

Intermediate Code Generation:
The intermediate code generator uses the structure produced by the syntax analyzer to create a stream of simple instructions. Many styles of intermediate code are possible. One common style uses instructions with one operator and a small number of operands. The output of the syntax analyzer is some representation of a parse tree. The intermediate code generation phase transforms this parse tree into an intermediate language representation of the source program.

Code Optimization:
This is an optional phase designed to improve the intermediate code so that the output runs faster and takes less space. Its output is another intermediate code program that does the same job as the original, but in a way that saves time and/or space.

a. Local Optimization: There are local transformations that can be applied to a program to make an improvement. For example,
    If A > B goto L2
    Goto L3
    L2:
can be replaced by a single statement
    If A ...

...

    ... → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
* list, digit: grammar variables, grammar symbols
* 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, -, +: tokens, terminal symbols

Conventions for specifying a grammar
o Terminal symbols: boldface strings such as if, num, id
o Nonterminal symbols, grammar symbols: italicized names such as list, digit, or A, B

Grammar G = (N, T, P, S)
o N: a set of nonterminal symbols
o T: a set of terminal symbols, tokens
o P: a set of production rules
o S: the start symbol, S ∈ N

Grammar G for a language L = {9-5+2, 3-1, ...}
o G = (N, T, P, S)
    N = {list, digit}
    T = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, -, +}
    P: list → list + digit
       list → list - digit
       list → digit
       digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
    S = list

Some definitions for a language L and its grammar G
o Derivation: A sequence of replacements S ⇒ α1 ⇒ α2 ⇒ ... ⇒ αn is a derivation of αn.
Example: a derivation of 1+9 from the grammar G
o leftmost derivation: list ⇒ list + digit ⇒ digit + digit ⇒ 1 + digit ⇒ 1 + 9
o rightmost derivation: list ⇒ list + digit ⇒ list + 9 ⇒ digit + 9 ⇒ 1 + 9
o Language of a grammar, L(G): L(G) is the set of sentences that can be generated from the grammar G.
    L(G) = { x | S ⇒* x }, where x is a sequence of terminal symbols
* Example: Consider a grammar G = (N, T, P, S) with N = {S}, T = {a, b}, start symbol S, and P = { S → aSb | ε }
  o Is aabb a sentence of L(G)? (derivation of the string aabb)
    S ⇒ aSb ⇒ aaSbb ⇒ aaεbb ⇒ aabb (or S ⇒* aabb), so aabb ∈ L(G)
  o There is no derivation for aa, so aa ∉ L(G)
  o Note that L(G) = { a^n b^n | n ≥ 0 }, where a^n b^n means n a's followed by n b's.

Parse Tree
A derivation can be conveniently represented by a derivation tree (parse tree). The root is labeled by the start symbol. Each leaf is labeled by a token or ε. Each interior node is labeled by a nonterminal symbol. When a production A → x1 ... xn is applied, nodes labeled by x1 ... xn are made children of the node labeled by A.
* root: the start symbol
* internal nodes: nonterminals
* leaf nodes: terminals
Example G:
    list → list + digit | list - digit | digit
    digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
o leftmost derivation for 9-5+2:
    list ⇒ list+digit ⇒ list-digit+digit ⇒ digit-digit+digit ⇒ 9-digit+digit ⇒ 9-5+digit ⇒ 9-5+2
o rightmost derivation for 9-5+2:
    list ⇒ list+digit ⇒ list+2 ⇒ list-digit+2 ⇒ list-5+2 ⇒ digit-5+2 ⇒ 9-5+2

[Figure: parse tree for 9-5+2 with list at the root and digit nodes above the leaves 9, 5 and 2]
Fig 2.2. Parse tree for 9-5+2 according to the grammar in Example

Ambiguity
* A grammar is said to be ambiguous if it has more than one parse tree for a given string of tokens.
* Example 2.5. Suppose a grammar G that cannot distinguish between lists and digits as in Example 2.1.
  o G: string → string + string | string - string | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

[Figure: two parse trees for 9-5+2, one grouping the subtraction first and one grouping the addition first]
Fig 2.3. Two parse trees for 9-5+2

9-5+2 has 2 parse trees ⇒ grammar G is ambiguous.
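Why the two parse trees matter can be seen by evaluating each grouping. The sketch below is an illustration only; the helper names apply, eval_left_grouped and eval_right_grouped are invented here, and each function hard-codes the shape of one of the two trees instead of building a tree data structure.

```c
#include <assert.h>

/* Apply one binary operator ('+' or '-') to two operand values. */
static int apply(char op, int lhs, int rhs)
{
    assert(op == '+' || op == '-');
    return op == '+' ? lhs + rhs : lhs - rhs;
}

/* First parse tree: the left operator is grouped first, (a op1 b) op2 c. */
int eval_left_grouped(int a, char op1, int b, char op2, int c)
{
    return apply(op2, apply(op1, a, b), c);
}

/* Second parse tree: the right operator is grouped first, a op1 (b op2 c). */
int eval_right_grouped(int a, char op1, int b, char op2, int c)
{
    return apply(op1, a, apply(op2, b, c));
}
```

For 9-5+2 the first grouping computes (9-5)+2 = 6 and the second computes 9-(5+2) = 2, so an ambiguous grammar leaves the meaning of the string undetermined.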
Associativity of operators
An operator is said to be left associative if an operand with operators on both sides of it is taken by the operator to its left.
    eg) 9+5+2 ≡ (9+5)+2, a=b=c ≡ a=(b=c)
* Left associative grammar:
    list → list + digit | list - digit
    digit → 0 | 1 | ... | 9
* Right associative grammar:
    right → letter = right | letter
    letter → a | b | ... | z

[Figure: parse tree for 9-5-2 growing down to the left, and parse tree for a=b=c growing down to the right]
Fig 2.4. Parse trees for left- and right-associative operators.

Precedence of operators
We say that an operator (*) has higher precedence than another operator (+) if the operator (*) takes its operands before the other operator (+) does.
* ex. 9+5*2 ≡ 9+(5*2), 9*5+2 ≡ (9*5)+2
* left associative operators: +, -, *, /
* right associative operators: =, **

Syntax of full expressions
    operator | associativity | precedence
    + -      | left          | lower
    * /      | left          | higher
* expr → expr + term | expr - term | term
  term → term * factor | term / factor | factor
  factor → digit | ( expr )
  digit → 0 | 1 | ... | 9

Syntax of statements
    stmt → id = expr ;
         | if ( expr ) stmt ;
         | if ( expr ) stmt else stmt ;
         | while ( expr ) stmt ;
    expr → expr + term | expr - term | term
    term → term * factor | term / factor | factor
    factor → digit | ( expr )
    digit → 0 | 1 | ... | 9

2.3 SYNTAX-DIRECTED TRANSLATION (SDT)
A formalism for specifying translations for programming language constructs. (attributes of a construct: type, string, location, etc.)
* Syntax-directed definition (SDD) for the translation of constructs
* Syntax-directed translation scheme (SDTS) for specifying translation

Postfix notation for an expression E
* If E is a variable or constant, then the postfix notation for E is E itself (E.t ≡ E).
* If E is an expression of the form E1 op E2, where op is a binary operator,
  o E1' is the postfix of E1,
  o E2' is the postfix of E2,
  o then E1' E2' op is the postfix for E1 op E2.
* If E is (E1), and E1' is the postfix of E1, then E1' is the postfix for E.
    eg) (9-5)+2 ⇒ 95-2+, 9-(5+2) ⇒ 952+-

Syntax-Directed Definition (SDD) for translation
* An SDD is a set of semantic rules, one predefined for each production, used for translation.
* A translation is an input-output mapping. Procedure for the translation of an input X:
  o construct a parse tree for X.
  o synthesize attributes over the parse tree.
    - Suppose a node n in the parse tree is labeled by X, and X.a denotes the value of attribute a of X at that node.
    - Compute X's attributes X.a using the semantic rules associated with X.

Example 2.6. SDD for infix to postfix translation
    PRODUCTION              SEMANTIC RULE
    expr → expr1 + term     expr.t := expr1.t || term.t || '+'
    expr → expr1 - term     expr.t := expr1.t || term.t || '-'
    expr → term             expr.t := term.t
    term → 0                term.t := '0'
    term → 1                term.t := '1'
    ...
    term → 9                term.t := '9'
Fig 2.5. Syntax-directed definition for infix to postfix translation.

An example of synthesized attributes for input X = 9-5+2

[Figure: annotated parse tree with expr.t = 95-2+ at the root, then expr.t = 95- and term.t = 2, then expr.t = 9 and term.t = 5, then term.t = 9]
Fig 2.6. Attribute values at nodes in a parse tree.

Syntax-Directed Translation Schemes (SDTS)
* A translation scheme is a context-free grammar in which program fragments called translation actions are embedded within the right sides of the productions.
    production           SDD for infix to postfix              SDTS
    list → list + term   list.t = list.t || term.t || '+'      list → list + term {print("+");}
* {print("+");}: translation (semantic) action.
* An SDTS generates an output for each sentence x generated by the underlying grammar by executing the actions in the order they appear during a depth-first traversal of a parse tree for x.
  1. Design translation schemes (SDTS) for translation.
  2. Translate: a) parse the input string x and b) emit the action results encountered during the depth-first traversal of the parse tree.

[Figure: a tree with its nodes visited in depth-first order]
Fig 2.7. Example of a depth-first traversal of a tree.

[Figure: rest → + term {print('+')} rest, with the action drawn as an extra leaf]
Fig 2.8. An extra leaf is constructed for a semantic action.

Example 2.8
* SDD vs. SDTS for infix to postfix translation.
    productions          SDD                                  SDTS
    expr → list + term   expr.t = list.t || term.t || '+'     expr → list + term {printf("+")}
    expr → list - term   expr.t = list.t || term.t || '-'     expr → list - term {printf("-")}
    expr → term          expr.t = term.t                      expr → term
    term → 0             term.t = '0'                         term → 0 {printf("0")}
    term → 1             term.t = '1'                         term → 1 {printf("1")}
    ...
    term → 9             term.t = '9'                         term → 9 {printf("9")}
* Actions translating the input 9-5+2: 1) Parse. 2) Translate.

[Figure: parse tree for 9-5+2 with the embedded actions {print('9')}, {print('5')}, {print('-')}, {print('2')}, {print('+')}]
Fig 2.9. Actions translating 9-5+2 into 95-2+.

Do we have to maintain the whole parse tree? No. Semantic actions are performed during parsing, and we do not need the nodes whose semantic actions are already done.

2.4 PARSING
if the token string x ∈ L(G), then parse tree, else error message

Top-Down parsing
1. At node n labeled with nonterminal A, select one of the productions whose left part is A and construct children of node n with the symbols on the right side of that production.
2. Find the next node at which a sub-tree is to be constructed.
ex. G: type → simple | ↑id | array [ simple ] of type
       simple → integer | char | num dotdot num
Fig 2.10. Top-down parsing while scanning the input from left to right.

[Figure: successive partial parse trees for an array type, expanding type, then simple, then num dotdot num]
Fig 2.11. Steps in the top-down construction of a parse tree.

* The selection of a production for a nonterminal may involve trial and error. ⇒ backtracking
* G: { S → aSb | c | ab }
  According to the top-down parsing procedure, are acb and aabb ∈ L(G)?
  (Each step is written as sentential form / input; X marks a failed match.)
  o S/acb ⇒ aSb/acb ⇒ aSb/acb ⇒ aaSbb/acb ⇒ X
        (S→aSb)    move      (S→aSb)     backtracking
    ⇒ aSb/acb ⇒ acb/acb ⇒ acb/acb ⇒ acb/acb
                (S→c)     move      move
    so acb ∈ L(G). It is finished in 7 steps including one backtracking.
  o S/aabb ⇒ aSb/aabb ⇒ aSb/aabb ⇒ aaSbb/aabb ⇒ aaSbb/aabb ⇒ aaaSbbb/aabb ⇒ X
        (S→aSb)     move       (S→aSb)      move         (S→aSb)       backtracking
    ⇒ aaSbb/aabb ⇒ aacbb/aabb ⇒ X
                   (S→c)        backtracking
    ⇒ aaSbb/aabb ⇒ aaabbb/aabb ⇒ X
                   (S→ab)        backtracking
    ⇒ aaSbb/aabb ⇒ X
                   backtracking
    ⇒ aSb/aabb ⇒ acb/aabb
                 (S→c)     backtracking
    ⇒ aSb/aabb ⇒ aabb/aabb ⇒ aabb/aabb ⇒ aabb/aabb ⇒ aabb/aabb
                 (S→ab)      move        move        move
    so aabb ∈ L(G), but the process is too difficult. It needs 18 steps including 5 backtrackings.
* procedure of top-down parsing: let the pointed grammar symbol and the pointed input symbol be g and a respectively.
  o if (g ∈ N) select and expand a production whose left part equals g, taking the one next to the current production.
  o else if (g = a) then make g and a the symbols next to the current symbols.
  o else if (g ≠ a) backtracking
    - let the pointed input symbol a move back by as many steps as the number of symbols already matched for the underlying production
    - eliminate the right side symbols of the current production and let the pointed symbol g be the left side symbol of the current production.

Predictive parsing (Recursive Descent Parsing, RDP)
* A strategy for general top-down parsing: guess a production, see if it matches, and if not, backtrack and try another.
  ⇒ It may fail to recognize a correct string in some grammar G and is tedious in processing.
* Predictive parsing
  o is a kind of top-down parsing that predicts a production whose derived terminal symbol is equal to the next input symbol while expanding in top-down parsing,
  o without backtracking.
  o A recursive descent parser is a kind of predictive parser that is implemented by disjoint recursive procedures, one procedure for each nonterminal; the procedures are patterned after the productions.
* procedure of predictive parsing (RDP): let the pointed grammar symbol and the pointed input symbol be g and a respectively.
  o if (g ∈ N)
    - select the next production P whose left symbol equals g and whose set of first terminal symbols, derivable from the right side of P, includes the input symbol a.
    - expand the derivation with that production P.
  o else if (g = a) then make g and a the symbols next to the current symbols
  o else if (g ≠ a) error
* G: { S → aSb | c | ab } ⇒ G1: { S → aS' | c, S' → Sb | b }
  According to the predictive parsing procedure, are acb and aabb ∈ L(G)?
  o S/acb ⇒ confused between { S → aSb, S → ab }
  o so a predictive parser requires some restriction on the grammar: among the productions with the same left part A, the first terminal symbols derivable from the right sides must all be different.
* Requirements for a grammar to be suitable for RDP: for each nonterminal, either
  1. A → Bα, or
  2. A → a1α1 | a2α2 | ... | anαn
     1) for 1 ≤ i, j ≤ n and i ≠ j, ai ≠ aj
     2) A → ε may also occur if none of the ai can follow A in a derivation.
* If the grammar is suitable, we can parse efficiently without backtracking.
    General top-down parser with backtracking
      ↓
    Recursive descent parser without backtracking
      ↓
    Picture parsing (a kind of predictive parsing) without backtracking

Left Factoring
If a grammar contains two productions of the form S → aα and S → aβ, it is not suitable for top-down parsing without backtracking. Troubles of this form can sometimes be removed from the grammar by a technique called left factoring.
* In left factoring, we replace { S → aα, S → aβ } by { S → aS', S' → α, S' → β }, cf. S → a(α | β). (Hopefully α and β start with different symbols.)
* left factoring for G: { S → aSb | c | ab }
    S → aS' | c       cf. S (= aSb | ab | c = a(Sb | b) | c) → aS' | c
    S' → Sb | b
* A concrete example:
    ... → IF ... THEN ... | IF ... THEN ... ELSE ...
  is transformed into
    ... → IF ... THEN ... S'
    S' → ELSE ... | ε
* Example, for G: { S → aSb | c | ab }. According to the predictive parsing procedure, are acb and aabb ∈ L(G)?
  o S/aabb ⇒ unable to choose between { S → aSb, S → ab }
* For the left-factored grammar G1, are acb and aabb ∈ L(G)?
    G1: { S → aS' | c, S' → Sb | b } ⇐ { S = a(Sb | b) | c }
  o S/acb ⇒ aS'/acb ⇒ aS'/acb ⇒ aSb/acb ⇒ acb/acb ⇒ acb/acb ⇒ acb/acb
        (S→aS')    move      (S'→Sb)    (S→c)     move      move
    so acb ∈ L(G). It needs only 6 steps without any backtracking.
    cf. General top-down parsing needs 7 steps and 1 backtracking.
  o S/aabb ⇒ aS'/aabb ⇒ aS'/aabb ⇒ aSb/aabb ⇒ aaS'b/aabb ⇒ aaS'b/aabb ⇒ aabb/aabb ⇒ aabb/aabb ⇒ aabb/aabb
        (S→aS')     move       (S'→Sb)    (S→aS')      move         (S'→b)      move        move
    so aabb ∈ L(G), and the process is finished in 8 steps without any backtracking.
    cf. General top-down parsing needs 18 steps including 5 backtrackings.

Left Recursion
* A grammar is left recursive iff it contains a nonterminal A such that A ⇒+ Aα, where α is any string.
  o Grammar { S → Sa | c } is left recursive because of S ⇒ Sa
  o Grammar { S → Aa, A → Sb | c } is also left recursive because of S ⇒ Aa ⇒ Sba
* If a grammar is left recursive, you cannot build a predictive top-down parser for it.
  1) If a parser is trying to match S and S → Sa, it has no idea how many times S must be applied.
  2) Given a left recursive grammar, it is always possible to find another grammar that generates the same language and is not left recursive.
  3) The resulting grammar might or might not be suitable for RDP.
* After this, if we still need left factoring, the grammar is not suitable for RDP.
* Right recursion: special care / harder than left recursion / SDT can handle.

Eliminating Left Recursion
Let G be S → SA | A. Note that a top-down parser cannot parse the grammar G, regardless of the order in which the productions are tried.
⇒ The productions generate strings of the form AA...A
⇒ They can be replaced by S → A S' and S' → A S' | ε
Example:
    A → Aα | β    ⇒    A → βR, R → αR | ε

[Figure: left-recursive tree for A → Aα | β next to the right-recursive tree for A → βR, R → αR | ε, both generating β α α ... α]
Fig 2.12. Left- and right-recursive ways of generating a string.

* In general, the rule is that
  o If A → Aα1 | Aα2 | ... | Aαn and A → β1 | β2 | ... | βm (no βi starts with A), then replace by A → β1R | β2R | ...
| βmR and R → α1R | α2R | ... | αnR | ε

Exercise: Remove the left recursion in the following grammar
    expr → expr + term | expr - term
    expr → term
solution:
    expr → term rest
    rest → + term rest | - term rest | ε

2.5 A TRANSLATOR FOR SIMPLE EXPRESSIONS
* Convert infix into postfix (Polish notation) using SDT.
* Abstract syntax tree (annotated parse tree) vs. concrete syntax tree
  o Concrete syntax tree: parse tree
  o Abstract syntax tree: syntax tree
  o Concrete syntax: underlying grammar

Adapting the Translation Scheme
* Embed the semantic actions in the productions
* Design a translation scheme
* Left recursion elimination and left factoring
* Example 3) Design a translation scheme and eliminate left recursion
    E → E + T {'+'}                      E → T R
    E → E - T {'-'}                      R → + T {'+'} R
    E → T                                R → - T {'-'} R
    T → 0 {'0'} | ... | 9 {'9'}          R → ε
                                         T → 0 {'0'} | ... | 9 {'9'}
* Translation of the input string 9-5+2: parsing and SDT

[Figure: parse tree E → T R with the actions print('9'), print('5'), print('-'), print('2'), print('+') attached]
  Result: 95-2+

Example of translator design and execution
* A translation scheme with left recursion, and the same scheme with left recursion eliminated:
    Initial specification for the              With left recursion eliminated
    infix-to-postfix translator
    expr → expr + term {printf("+")}           expr → term rest
    expr → expr - term {printf("-")}           rest → + term {printf("+")} rest
    expr → term                                rest → - term {printf("-")} rest
    term → 0 {printf("0")}                     rest → ε
    term → 1 {printf("1")}                     term → 0 {printf("0")}
    ...                                        term → 1 {printf("1")}
    term → 9 {printf("9")}                     ...
                                               term → 9 {printf("9")}

[Figure: parse tree expr → term rest for 9-5+2 with the actions {print('9')}, {print('5')}, {print('-')}, {print('2')}, {print('+')}]
Fig 2.13. Translation of 9 - 5 + 2 into 95-2+.

Procedures for the nonterminals expr, term, and rest
    void expr()                 /* expr → term rest */
    {
        term(); rest();
    }
    void rest()                 /* rest → + term {print('+')} rest | - term {print('-')} rest | ε */
    {
        if (lookahead == '+') {
            match('+'); term(); print('+'); rest();
        } else if (lookahead == '-') {
            match('-'); term(); print('-'); rest();
        } else
            ;                   /* rest → ε */
    }
    void term()                 /* term → 0 {print('0')} | ... | 9 {print('9')} */
    {
        if (isdigit(lookahead)) {
            print(lookahead); match(lookahead);
        } else
            error();
    }
Fig 2.14. Functions for the nonterminals expr, rest, and term.

Optimizer and Translator
rest() with the tail-recursive calls:
    void rest()
    {
        if (lookahead == '+') {
            match('+'); term(); print('+'); rest();
        } else if (lookahead == '-') {
            match('-'); term(); print('-'); rest();
        } else
            ;
    }
rest() with each tail call replaced by a jump to the start of the procedure:
    void rest()
    {
    L:  if (lookahead == '+') {
            match('+'); term(); print('+'); goto L;
        } else if (lookahead == '-') {
            match('-'); term(); print('-'); goto L;
        } else
            ;
    }
expr() with rest() merged in as a loop:
    void expr()
    {
        term();
        while (1) {
            if (lookahead == '+') {
                match('+'); term(); print('+');
            } else if (lookahead == '-') {
                match('-'); term(); print('-');
            } else
                break;
        }
    }

2.6 LEXICAL ANALYSIS
* The lexical analyzer reads and converts the input into a stream of tokens to be analyzed by the parser.
* lexeme: a sequence of characters which comprises a single token.
* Lexical Analyzer → Lexeme / Token → Parser

Removal of White Space and Comments
* Remove white space (blank, tab, new line etc.) and comments

Constants
* Constants: for a while, consider only integers
* eg) for input 31 + 28, output (token representation)?
    input: 31 + 28
    output: <num, 31> <+, > <num, 28>
    num, +: tokens
    31, 28: attribute, value (or lexeme) of the integer token num

Recognizing Identifiers
* Identifiers are names of variables, arrays, functions, ...
* A grammar treats an identifier as a token.
* eg) input: count = count + increment;
    output: <id, ...> <=, > <id, ...> <+, > <id, ...> ;
    Symbol table
        tokens | attributes (lexeme)
        id     | count
        id     | increment
* Keywords are reserved, i.e. they cannot be used as identifiers. Then a character string forms an identifier only if it is not a keyword.
* punctuation symbols
  o operators:

Interface to lexical analyzer

[Figure: input → lexical analyzer → parser; the lexical analyzer reads characters from the input and pushes back characters, and passes tokens and their attributes to the parser]
Fig 2.15. Inserting a lexical analyzer between the input and the parser

A Lexical Analyzer

[Figure: the lexical analyzer lexan() uses getchar() to read a character, pushes back c using ungetc(c, stdin), returns the token to the caller, and sets the global variable tokenval to the attribute value]
Fig 2.16. Implementing the interactions in Fig. 2.15.
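The read and push-back interaction of Fig 2.16 can be sketched with the standard getc and ungetc calls. The function read_number below is a hypothetical helper written for this illustration; it takes a FILE pointer instead of reading stdin directly so that it can be tried on any stream.

```c
#include <stdio.h>
#include <ctype.h>
#include <assert.h>

/* Read an unsigned decimal number from in. The character that ends the
   number is pushed back with ungetc so that the next read sees it again.
   Returns -1 if the next character is not a digit. */
int read_number(FILE *in)
{
    assert(in != NULL);
    int c = getc(in);
    if (c == EOF)
        return -1;
    if (!isdigit(c)) {
        ungetc(c, in);              /* not part of a number: leave it for the caller */
        return -1;
    }
    int value = 0;
    while (c != EOF && isdigit(c)) {
        value = value * 10 + (c - '0');
        c = getc(in);
    }
    if (c != EOF)
        ungetc(c, in);              /* push back the one-character lookahead */
    return value;
}
```

The scanner has to read one character past the end of a number to know that the number is complete; pushing that character back is what lets the next token start in the right place.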
* c = getchar(); ungetc(c, stdin);
* token representation
  o #define NUM 256
* Function lexan()
    eg) input string 76 + a
        input    output (returned value)
        76       NUM, tokenval = 76 (integer)
        +        +
        a        id, tokenval = "a"
* A way that the parser handles the token NUM returned by lexan()
  o consider a translation scheme
        factor → ( expr ) | num { print(num.value) }

        #define NUM 256
        ...
        factor()
        {
            if (lookahead == '(') {
                match('('); expr(); match(')');
            } else if (lookahead == NUM) {
                printf(" %d ", tokenval); match(NUM);
            } else
                error();
        }
* The implementation of function lexan
        #include <stdio.h>
        #include <ctype.h>
        /* NUM and NONE are assumed to be defined elsewhere,
           NUM as in the token representation above */
        int lineno = 1;
        int tokenval = NONE;
        int lexan()
        {
            int t;
            while (1) {
                t = getchar();
                if (t == ' ' || t == '\t')
                    ;                       /* strip blanks and tabs */
                else if (t == '\n')
                    lineno += 1;
                else if (isdigit(t)) {
                    tokenval = t - '0';
                    t = getchar();
                    while (isdigit(t)) {
                        tokenval = tokenval * 10 + t - '0';
                        t = getchar();
                    }
                    ungetc(t, stdin);       /* push back the lookahead character */
                    return NUM;
                } else {
                    tokenval = NONE;
                    return t;
                }
            }
        }

2.7 INCORPORATING A SYMBOL TABLE
* The symbol table interface: operations, usually called by the parser.
  o insert(s, t): input s: lexeme, t: token; output: index of the new entry
  o lookup(s): input s: lexeme; output: index of the entry for string s, or 0 if s is not found in the symbol table.
* Handling reserved keywords
  1. Insert all keywords in the symbol table in advance.
     ex) insert("div", div); insert("mod", mod)
  2. While parsing, whenever an identifier s is encountered:
     if (lookup(s)'s token is in {keywords}) s is a keyword; else s is an identifier;
  o example
    preset: insert("div", div); insert("mod", mod);
    while parsing:
        lookup("count") ⇒ 0, insert("count", id);
        lookup("i") ⇒ 0, insert("i", id);
        lookup("i") ⇒ 4, id
        lookup("div") ⇒ 1, div

[Figure: ARRAY symtable with columns lexptr, token and attributes holding the entries div, mod, id, id; each lexptr points into ARRAY lexemes, which stores d i v EOS m o d EOS c o u n t EOS i EOS]
Fig 2.17. Symbol table and array for storing strings.

2.8 ABSTRACT STACK MACHINE
* An abstract machine is for intermediate code generation/execution.
* Instruction classes: arithmetic / stack manipulation / control flow
* 3 components of the abstract stack machine
  1) Instruction memory: abstract machine code, intermediate code (instructions)
  2) Stack
  3) Data memory
* An example of stack machine operation
  o for an input (5+a)*b, intermediate codes: push 5, rvalue 2, ...

[Figure: instruction memory holding push 5, rvalue 2, +, rvalue 3, ...; the stack; and data memory with locations 1 to 4]

L-value and r-value
* l-value of a: the address of location a
* r-value of a: if a is a location, then the content of location a; if a is a constant, then the value a
* eg) a := 5 + b;
    lvalue a ⇒ 2
    rvalue 5 ⇒ 5
    rvalue of b ⇒ 7

Stack Manipulation
Some instructions for the assignment operation
    push v: push v onto the stack.
    rvalue a: push the contents of data location a.
    lvalue a: push the address of data location a.
    pop: throw away the top element of the stack.
    := : assignment for the top 2 elements of the stack.
    copy: push a copy of the top element of the stack.

Translation of Expressions
* Infix expression (IE) → SDD/SDTS → abstract machine code of the postfix expression for stack machine evaluation.
    eg) IE: a + b, (PE: a b +), IC: rvalue a, rvalue b, +
    day := (1461 * y) div 4 + (153 * m + 2) div 5 + d
    (⇒ day 1461 y * 4 div 153 m * 2 + 5 div + d + :=)
    ⇒  1) lvalue day     7) push 153     13) div
        2) push 1461      8) rvalue m     14) +
        3) rvalue y       9) *            15) rvalue d
        4) *             10) push 2       16) +
        5) push 4        11) +            17) :=
        6) div           12) push 5
* A translation scheme for translating an assignment statement into abstract stack machine code can be expressed formally as follows:
    stmt → id := expr { stmt.t := 'lvalue' || id.lexeme || expr.t || ':=' }
    eg) day := a + b ⇒ lvalue day, rvalue a, rvalue b, +, :=

Control Flow
* 3 types of jump instructions:
  o Absolute target location
  o Relative target location (distance: current ↔ target)
  o Symbolic target location (i.e.
the machine supports labels)
* Control-flow instructions:
    label a: the jump's target a
    goto a: the next instruction is taken from the statement labeled a
    gofalse a: pop the top value and, if it is 0, jump to a
    gotrue a: pop the top value and, if it is nonzero, jump to a
    halt: stop execution

Translation of Statements
* Translation scheme for translating an if-statement into abstract machine code:
    stmt → if expr then stmt1 { out := newlabel;
                                stmt.t := expr.t || 'gofalse' out || stmt1.t || 'label' out }

    IF                      WHILE
                            label test
    code for expr           code for expr
    gofalse out             gofalse out
    code for stmt1          code for stmt1
    label out               goto test
                            label out
Fig 2.18. Code layout for conditional and while statements.
* Translation scheme for the while-statement?

Emitting a Translation
* Semantic actions (translation scheme):
  1. stmt → if expr { out := newlabel; emit('gofalse', out) } then stmt1 { emit('label', out) }
  2. stmt → id { emit('lvalue', id.lexeme) } := expr { emit(':=') }
  3. stmt → if expr { out := newlabel; emit('gofalse', out) } then stmt1 { out1 := newlabel; emit('goto', out1); emit('label', out) } else stmt2 { emit('label', out1) }
     The emitted code has the layout:
         if (expr == false) goto out
         stmt1
         goto out1
    out: stmt2
   out1:

Implementation
* procedure stmt()
    var test, out: integer;
    begin
        if lookahead = id then begin
            emit('lvalue', tokenval); match(id); match(':='); expr(); emit(':=');
        end
        else if lookahead = 'if' then begin
            match('if'); expr(); out := newlabel(); emit('gofalse', out);
            match('then'); stmt(); emit('label', out)
        end
        else error();
    end

Control Flow with Analysis
* if E1 or E2 then S vs if E1 and E2 then S
    E1 or E2 = if E1 then true else E2
    E1 and E2 = if E1 then E2 else false
* The code for E1 or E2:
    codes for E1      (evaluation result: e1)
    copy
    gotrue OUT
    pop
    codes for E2      (evaluation result: e2)
    label OUT
• The full code for if E1 or E2 then S:
    codes for E1
    copy
    gotrue OUT1
    pop
    codes for E2
    label OUT1
    gofalse OUT2
    code for S
    label OUT2
• Exercise: how about
    if E1 and E2 then S;
    if E1 and E2 then S1 else S2;

2.9 Putting the techniques together!
• infix expression ⇒ postfix expression
  e.g.) id + (id - id) * num / id ⇒ id id id - num * id / +

Description of the Translator
• Syntax directed translation scheme (SDTS) to translate infix expressions into postfix expressions:
  start  → list eof
  list   → expr ; list
         | ε
  expr   → expr + term        { print('+') }
         | expr - term        { print('-') }
         | term
  term   → term * factor      { print('*') }
         | term / factor      { print('/') }
         | term div factor    { print('DIV') }
         | term mod factor    { print('MOD') }
         | factor
  factor → ( expr )
         | id                 { print(id.lexeme) }
         | num                { print(num.value) }
  Fig 2.19. Specification for infix-to-postfix translation

Structure of the translator
[Figure: modules of the translator; infix expressions enter and postfix expressions leave; recoverable module label: symbol.c]
Fig 2.19. Modules of infix to postfix translator.
• global header file "header.h"

The Lexical Analysis Module lexer.c
• Description of tokens: + - * / DIV MOD ( ) ID NUM DONE

  LEXEME                                               TOKEN            ATTRIBUTE VALUE
  white space                                          -                -
  sequence of digits                                   NUM              numeric value of sequence
  div                                                  DIV              -
  mod                                                  MOD              -
  other sequences of a letter then letters and digits  ID               index into symtable
  end-of-file character                                DONE             -
  any other character                                  that character   NONE
Fig 2.20.
Description of tokens.

The Parser Module parser.c
SDTS ⇒ (left recursion elimination) ⇒ new SDTS

Original SDTS:
  start  → list eof
  list   → expr ; list | ε
  expr   → expr + term        { print('+') }
         | expr - term        { print('-') }
         | term
  term   → term * factor      { print('*') }
         | term / factor      { print('/') }
         | term div factor    { print('DIV') }
         | term mod factor    { print('MOD') }
         | factor
  factor → ( expr )
         | id                 { print(id.lexeme) }
         | num                { print(num.value) }

New SDTS:
  start       → list eof
  list        → expr ; list | ε
  expr        → term moreterms
  moreterms   → + term        { print('+') } moreterms
              | - term        { print('-') } moreterms
              | ε
  term        → factor morefactors
  morefactors → * factor      { print('*') } morefactors
              | / factor      { print('/') } morefactors
              | div factor    { print('DIV') } morefactors
              | mod factor    { print('MOD') } morefactors
              | ε
  factor      → ( expr )
              | id            { print(id.lexeme) }
              | num           { print(num.value) }
Fig 2.20. Specification for infix to postfix translator and syntax directed translation scheme after eliminating left recursion.

The Emitter Module emitter.c
  emit(t, tval)

The Symbol-Table Modules symbol.c and init.c
  symbol.c: data structure of the symbol table (Fig 2.29, p. 62)
    insert(s, t)
    lookup(s)

The Error Module error.c

Example of execution
  input:  12 div 5 + 2
  output: 12 5 div 2 +

3. Lexical Analysis

3.1 OVERVIEW OF LEXICAL ANALYSIS
• To identify the tokens we need some method of describing the possible tokens that can appear in the input stream. For this purpose we introduce regular expressions, a notation that can be used to describe essentially all the tokens of a programming language.
• Secondly, having decided what the tokens are, we need some mechanism to recognize them in the input stream. This is done by token recognizers, which are designed using transition diagrams and finite automata.

3.2 ROLE OF LEXICAL ANALYZER:
The LA is the first phase of a compiler. Its main task is to read the input characters and produce as output a sequence of tokens that the parser uses for syntax analysis.
[Figure: lexical analyzer and parser, both connected to the symbol table]
Fig.
3.1: Role of Lexical analyzer

Upon receiving a 'get next token' command from the parser, the lexical analyzer reads input characters until it can identify the next token. The LA returns to the parser a representation of the token it has found. The representation will be an integer code if the token is a simple construct such as a parenthesis, comma or colon.
The LA may also perform certain secondary tasks at the user interface. One such task is stripping out from the source program the comments and white space in the form of blank, tab and newline characters. Another is correlating error messages from the compiler with the source program.

3.3 TOKEN, LEXEME, PATTERN:
Token: A token is a sequence of characters that can be treated as a single logical entity. Typical tokens are 1) identifiers 2) keywords 3) operators 4) special symbols 5) constants.
Pattern: A set of strings in the input for which the same token is produced as output. This set of strings is described by a rule called a pattern associated with the token.
Lexeme: A lexeme is a sequence of characters in the source program that is matched by the pattern for a token.

  Token      Lexeme                  Pattern
  const      const                   const
  if         if                      if
  relation   <, <=, =, <>, >=, >     < or <= or = or <> or >= or >
  id         pi                      letter followed by letters and digits
  num        3.14                    any numeric constant
  literal    "core"                  any characters between " and " except "
Fig. 3.2: Example of Token, Lexeme and Pattern

3.4 LEXICAL ERRORS:
Lexical errors are the errors reported by the lexer when it is unable to continue, which means that there is no way to recognise a lexeme as a valid token for the lexer. Syntax errors, on the other hand, are reported by the parser when a given sequence of already recognised valid tokens does not match any of the right sides of the grammar rules. A simple panic-mode error handling system requires that we return to a high-level parsing function when a parsing or lexical error is detected.
Error-recovery actions are:
i.
Delete one character from the remaining input.
ii. Insert a missing character into the remaining input.
iii. Replace a character by another character.
iv. Transpose two adjacent characters.

3.5 REGULAR EXPRESSIONS
A regular expression is a formula that describes a possible set of strings. Components of a regular expression:
  x        the character x
  .        any character, usually except a newline
  [xyz]    any of the characters x, y, z, ...
  R?       an R or nothing (i.e. optionally an R)
  R*       zero or more occurrences of R
  R+       one or more occurrences of R
  R1R2     an R1 followed by an R2
  R1|R2    either an R1 or an R2
A token is either a single string or one of a collection of strings of a certain type. If we view the set of strings in each token class as a language, we can use the regular-expression notation to describe tokens. Consider an identifier, which is defined to be a letter followed by zero or more letters or digits. In regular expression notation we would write
  identifier = letter (letter | digit)*
Here are the rules that define the regular expressions over an alphabet Σ:
• ε is a regular expression denoting { ε }, that is, the language containing only the empty string.
• For each a in Σ, a is a regular expression denoting { a }, the language with only one string, consisting of the single symbol a.
• If R and S are regular expressions, then
    (R) | (S) denotes L(R) ∪ L(S)
    R.S denotes L(R).L(S)
    R* denotes (L(R))*

3.6 REGULAR DEFINITIONS
For notational convenience, we may wish to give names to regular expressions and to define regular expressions using these names as if they were symbols. Identifiers are the set of strings of letters and digits beginning with a letter. The following regular definition provides a precise specification for this class of strings.
Example-1: ab*|cd? is equivalent to (a(b*)) | (c(d?))
Pascal identifier:
  letter → A | B | ... | Z | a | b | ... | z
  digit  → 0 | 1 | 2 | ... | 9
  id     → letter (letter | digit)*

Recognition of tokens:
We have learned how to express patterns using regular expressions.
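As a small illustration of the identifier definition above, the following sketch writes letter (letter | digit)* with Python's standard `re` module. The names LETTER, DIGIT, IDENTIFIER and is_identifier are chosen here for illustration and are not part of the notes; the character classes assume ASCII letters and digits, as in the Pascal definition.

```python
import re

# letter and digit from the regular definition, as character classes
LETTER = r"[A-Za-z]"
DIGIT = r"[0-9]"

# id -> letter (letter | digit)*
IDENTIFIER = re.compile(rf"{LETTER}({LETTER}|{DIGIT})*")


def is_identifier(text: str) -> bool:
    """Return True when the whole string matches letter (letter | digit)*."""
    return IDENTIFIER.fullmatch(text) is not None


print(is_identifier("count1"))  # True: starts with a letter
print(is_identifier("1count"))  # False: starts with a digit
```

`fullmatch` is used rather than `match` so that the entire string, not just a prefix, has to fit the pattern; a lexer would instead look for the longest matching prefix.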
Now, we must study how to take the patterns for all the needed tokens and build a piece of code that examines the input string and finds a prefix that is a lexeme matching one of the patterns.
  stmt → if expr then stmt
       | if expr then stmt else stmt
       | ε
  expr → term relop term
       | term
  term → id
       | number
For relop, we use the comparison operators of languages like Pascal or SQL, where = is "equals" and <> is "not equals", because it presents an interesting structure of lexemes. The terminals of the grammar, which are if, then, else, relop, id and number, are the names of tokens as far as the lexical analyzer is concerned. The patterns for the tokens are described using regular definitions:
  digit  → [0-9]
  digits → digit+
  number → digits (. digits)? (E [+-]? digits)?
  letter → [A-Za-z]
  id     → letter (letter | digit)*
  if     → if
  then   → then
  else   → else
  relop  → < | > | <= | >= | = | <>
In addition, we assign the lexical analyzer the job of stripping out white space, by recognizing the "token" ws defined by:
  ws → (blank | tab | newline)+
Here, blank, tab and newline are abstract symbols that we use to express the ASCII characters of the same names. Token ws is different from the other tokens in that, when we recognize it, we do not return it to the parser, but rather restart the lexical analysis from the character that follows the white space. It is the following token that gets returned to the parser.

  Lexeme       Token Name   Attribute Value
  any ws       -            -
  if           if           -
  then         then         -
  else         else         -
  any id       id           pointer to table entry
  any number   number       pointer to table entry
  <            relop        LT
  <=           relop        LE
  =            relop        EQ
  <>           relop        NE
  >            relop        GT
  >=           relop        GE

3.7 TRANSITION DIAGRAM:
A transition diagram has a collection of nodes or circles, called states. Each state represents a condition that could occur during the process of scanning the input looking for a lexeme that matches one of several patterns. Edges are directed from one state of the transition diagram to another. Each edge is labeled by a symbol or set of symbols.
If we are in some state s, and the next input symbol is a, we look for an edge out of state s labeled by a. If we find such an edge, we advance the forward pointer and enter the state of the transition diagram to which that edge leads.
Some important conventions about transition diagrams are:
1. Certain states are said to be accepting or final. These states indicate that a lexeme has been found, although the actual lexeme may not consist of all positions between the lexemeBegin and forward pointers. We always indicate an accepting state by a double circle.
2. In addition, if it is necessary to retract the forward pointer one position, then we additionally place a * near that accepting state.
3. One state is designated the start state, or initial state. It is indicated by an edge labeled "start" entering from nowhere. The transition diagram always begins in the start state, before any input symbols have been used.
[Figure: transition diagram for relop; recoverable labels include return(relop, LT) and return(relop, GE)]
Fig. 3.3: Transition diagram of Relational operators

As an intermediate step in the construction of a LA, we first produce a stylized flowchart, called a transition diagram. Positions in a transition diagram are drawn as circles and are called states.
[Figure: transition diagram with a loop labeled "letter or digit" and the action return(gettoken(), installID())]
Fig. 3.4: Transition diagram of Identifier
The above TD is for an identifier, defined to be a letter followed by any number of letters or digits. A sequence of transition diagrams can be converted into a program that looks for the tokens specified by the diagrams. Each state gets a segment of code.

3.8 FINITE AUTOMATON
• A recognizer for a language is a program that takes a string x and answers "yes" if x is a sentence of that language, and "no" otherwise.
• We call the recognizer of the tokens a finite automaton.
• A finite automaton can be deterministic (DFA) or non-deterministic (NFA).
• This means that we may use a deterministic or a non-deterministic automaton as a lexical analyzer.
• Both deterministic and non-deterministic finite automata recognize regular sets.
• Which one?
  - deterministic: faster recognizer, but it may take more space
  - non-deterministic: slower, but it may take less space
  - Deterministic automata are widely used in lexical analyzers.
• First, we define regular expressions for tokens; then we convert them into a DFA to get a lexical analyzer for our tokens.

3.9 Non-Deterministic Finite Automaton (NFA)
• A non-deterministic finite automaton (NFA) is a mathematical model that consists of:
    S    - a set of states
    Σ    - a set of input symbols (alphabet)
    move - a transition function that maps state-symbol pairs to sets of states
    s0   - a start (initial) state
    F    - a set of accepting states (final states)
• ε-transitions are allowed in NFAs. In other words, we can move from one state to another without consuming any symbol.
• An NFA accepts a string x if and only if there is a path from the start state to one of the accepting states such that the edge labels along this path spell out x.
Example:
[Figure: transition graph of the NFA]
  0 is the start state s0; {2} is the set of final states F; Σ = {a, b}; S = {0, 1, 2}
  Transition function:
          a        b
    0   {0, 1}    {0}
    1     -       {2}
    2     -        -
  The language recognized by this NFA is (a|b)*ab.

3.10 Deterministic Finite Automaton (DFA)
• A Deterministic Finite Automaton (DFA) is a special form of an NFA.
• No state has an ε-transition.
• For each symbol a and state s, there is at most one edge labeled a leaving s, i.e. the transition function maps a state-symbol pair to a state (not a set of states).
Example:
The DFA to recognize the language (a|b)*ab is as follows.
[Figure: transition graph of the DFA]
  0 is the start state s0; {2} is the set of final states F; Σ = {a, b}; S = {0, 1, 2}
  Transition function:
          a    b
    0     1    0
    1     1    2
    2     1    0
  Note that the entries in this function are single values and not sets of values (unlike the NFA).

3.11 Converting RE to NFA
This is one way to convert a regular expression into an NFA. There can be other (much more efficient) ways to do the conversion. Thompson's Construction is a simple and systematic method.
It guarantees that the resulting NFA will have exactly one final state and one start state. Construction starts from the simplest parts (alphabet symbols). To create an NFA for a complex regular expression, the NFAs of its sub-expressions are combined to create its NFA.
• To recognize an empty string ε:
  [Figure: NFA fragment recognizing ε]
  N(r1) and N(r2) are NFAs for regular expressions r1 and r2.
• For regular expression r1 r2:
  [Figure: N(r1) followed by N(r2)]
  Here, the final state of N(r2) becomes the final state of N(r1 r2).
• For regular expression r*:
  [Figure: NFA fragment for r*]
Example: For the RE (a|b)*a, the NFA construction is shown below.
[Figure: step-by-step Thompson construction of the NFA for (a|b)*a]

3.12 Converting NFA to DFA (Subset Construction)
We merge together NFA states by looking at them from the point of view of the input characters:
• From the point of view of the input, any two states that are connected by an ε-transition may as well be the same, since we can move from one to the other without consuming any character. Thus states which are connected by an ε-transition will be represented by the same state in the DFA.
• If it is possible to have multiple transitions based on the same symbol, then we can regard a transition on a symbol as moving from a state to a set of states (i.e. the union of all those states reachable by a transition on the current symbol). Thus these states will be combined into a single DFA state.
To perform this operation, let us define two functions:
• The ε-closure function takes a state and returns the set of states reachable from it by zero or more ε-transitions. Note that this will always include the state itself. We should be able to get from a state to any state in its ε-closure without consuming any input.
• The function move takes a state and a character, and returns the set of states reachable by one transition on this character.
We can generalise both these functions to apply to sets of states by taking the union of the application to individual states.
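The two functions just defined, in their set-of-states form, can be sketched as follows. This is only an illustration under stated assumptions: the NFA is represented as a dictionary from (state, symbol) pairs to sets of states, the empty string is used as the label of an ε-transition, and the small NFA named `example` is hypothetical, not one taken from these notes.

```python
from typing import Dict, Iterable, Set, Tuple

EPSILON = ""  # assumption: the empty string labels an epsilon-transition

# Assumed representation: (state, symbol) -> set of next states
NFA = Dict[Tuple[int, str], Set[int]]


def epsilon_closure(nfa: NFA, states: Iterable[int]) -> Set[int]:
    """States reachable from `states` using zero or more epsilon-transitions."""
    closure = set(states)          # every state is in its own closure
    stack = list(closure)
    while stack:
        s = stack.pop()
        for t in nfa.get((s, EPSILON), set()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure


def move(nfa: NFA, states: Iterable[int], symbol: str) -> Set[int]:
    """Union of the one-step transitions on `symbol` from each state."""
    result: Set[int] = set()
    for s in states:
        result |= nfa.get((s, symbol), set())
    return result


# Hypothetical NFA for (a|b): 0 -eps-> 1, 3; 1 -a-> 2; 3 -b-> 4; 2, 4 -eps-> 5
example: NFA = {
    (0, EPSILON): {1, 3},
    (1, "a"): {2},
    (3, "b"): {4},
    (2, EPSILON): {5},
    (4, EPSILON): {5},
}

start = epsilon_closure(example, {0})
print(sorted(start))                                               # [0, 1, 3]
print(sorted(epsilon_closure(example, move(example, start, "a"))))  # [2, 5]
```

The composition epsilon_closure(move(S, a)) in the last line is exactly the step that the subset construction repeats for every DFA state S and input symbol a.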
For example, if A, B and C are states, move({A,B,C}, 'a') = move(A, 'a') ∪ move(B, 'a') ∪ move(C, 'a').
The Subset Construction Algorithm is as follows:
  put ε-closure({s0}) as an unmarked state into the set of DFA states (DS)
  while (there is an unmarked state S1 in DS) do
  begin
    mark S1
    for each input symbol a do
    begin
      S2 ← ε-closure(move(S1, a))
      if (S2 is not in DS) then
        add S2 into DS as an unmarked state
      transfunc[S1, a] ← S2
    end
  end
• a state S in DS is an accepting state of the DFA if a state in S is an accepting state of the NFA
• the start state of the DFA is ε-closure({s0})

3.13 Lexical Analyzer Generator
[Figure: creating a lexical analyzer with Lex; recoverable labels: Lex compiler, lex.yy.c, C compiler, a.out]

3.18 Lex specifications:
A Lex program (the .l file) consists of three parts:
  declarations
  %%
  translation rules
  %%
  auxiliary procedures
1. The declarations section includes declarations of variables, manifest constants (a manifest constant is an identifier that is declared to represent a constant, e.g. #define PIE 3.14), and regular definitions.
2. The translation rules of a Lex program are statements of the form
     p1 {action 1}
     p2 {action 2}
     p3 {action 3}
     ...
   where each p is a regular expression and each action is a program fragment describing what action the lexical analyzer should take when pattern p matches a lexeme. In Lex the actions are written in C.
3. The third section holds whatever auxiliary procedures are needed by the actions. Alternatively these procedures can be compiled separately and loaded with the lexical analyzer.
Note: You can refer to the sample Lex program given on page 109 of chapter 3 of the book Compilers: Principles, Techniques, and Tools by Aho, Sethi & Ullman for more clarity.

3.19 INPUT BUFFERING
The LA scans the characters of the source program one at a time to discover tokens. Because a large amount of time can be consumed scanning characters, specialized buffering techniques have been developed to reduce the amount of overhead required to process an input character.
Buffering techniques:
1. Buffer pairs
2.
Sentinels
The lexical analyzer scans the characters of the source program one at a time to discover tokens. Often, however, many characters beyond the next token may have to be examined before the next token itself can be determined. For this and other reasons, it is desirable for the lexical analyzer to read its input from an input buffer. The figure shows a buffer divided into two halves of, say, 100 characters each. One pointer marks the beginning of the token being discovered. A lookahead pointer scans ahead of the beginning point until the token is discovered. We view the position of each pointer as being between the character last read and the character next to be read. In practice each buffering scheme adopts one convention: either a pointer is at the symbol last read or at the symbol it is ready to read.
[Figure: input buffer divided into two halves, with a token-beginning pointer and a lookahead pointer]
The distance which the lookahead pointer may have to travel past the actual token may be large. For example, in a PL/I program we may see
  DECLARE (ARG1, ARG2, ..., ARGn)
without knowing whether DECLARE is a keyword or an array name until we see the character that follows the right parenthesis. In either case, the token itself ends at the second E. If the lookahead pointer travels beyond the buffer half in which it began, the other half must be loaded with the next characters from the source file. Since the buffer shown in the above figure is of limited size, there is an implied constraint on how much lookahead can be used before the next token is discovered. In the above example, if the lookahead traveled to the left half and all the way through the left half to the middle, we could not reload the right half, because we would lose characters that had not yet been grouped into tokens.
While we can make the buffer larger if we choose, or use another buffering scheme, we cannot ignore the fact that lookahead is limited.

4.1 ROLE OF THE PARSER
A parser for a grammar is a program that takes as input a string w (obtained as a string of tokens from the lexical analyzer) and produces as output either a parse tree for w, if w is a valid sentence of the grammar, or an error message indicating that w is not a valid sentence of the given grammar. The goal of the parser is to determine the syntactic validity of a source string. If the string is valid, a tree is built for use by the subsequent phases of the compiler. The tree reflects the sequence of derivations or reductions used during parsing. Hence, it is called a parse tree. If the string is invalid, the parser has to issue a diagnostic message identifying the nature and cause of the errors in the string. Every elementary subtree in the parse tree corresponds to a production of the grammar.
There are two ways of identifying an elementary subtree:
1. By deriving a string from a non-terminal, or
2. By reducing a string of symbols to a non-terminal.
The two types of parsers employed are:
a. Top-down parser: builds parse trees from the top (root) to the bottom (leaves)
b. Bottom-up parser: builds parse trees from the leaves and works up to the root
[Figure: source program → lexical analyzer → (token / get next token) → parser → parse tree → rest of front end → intermediate representation; the lexical analyzer and the parser both use the symbol table]
Fig. 4.1: Position of parser in compiler model.

4.2 CONTEXT FREE GRAMMARS
Inherently recursive structures of a programming language are defined by a context-free grammar. In a context-free grammar, we have a four-tuple G = (V, T, P, S). Here,
  V is a finite set of terminals (in our case, this will be the set of tokens)
  T is a finite set of non-terminals (syntactic variables)
  P is a finite set of production rules of the form A → α, where A is a non-terminal and α is a string of terminals and non-terminals (including the empty string)
  S is a start symbol (one of the non-terminal symbols)
L(G) is the language of G (the language generated by G), which is a set of sentences. A sentence of L(G) is a string of terminal symbols of G. If S is the start symbol of G, then ω is a sentence of L(G) iff S ⇒* ω, where ω is a string of terminals of G. If G is a context-free grammar, L(G) is a context-free language. Two grammars G1 and G2 are equivalent if they produce the same language.
Consider a derivation of the form S ⇒* α. If α contains non-terminals, it is called a sentential form of G. If α does not contain non-terminals, it is called a sentence of G.

4.2.1 Derivations
In general, a derivation step is αAβ ⇒ αγβ if there is a production rule A → γ in our grammar, where α and β are arbitrary strings of terminal and non-terminal symbols.
  α1 ⇒ α2 ⇒ ... ⇒ αn (αn is derived from α1, or α1 derives αn)
At each derivation step, we can choose any of the non-terminals in the sentential form of G for the replacement. Two systematic choices give two types of derivation:
1. If we always choose the left-most non-terminal in each derivation step, the derivation is called a left-most derivation.
2. If we always choose the right-most non-terminal in each derivation step, the derivation is called a right-most derivation.
Example grammar:
  E → E+E | E-E | E*E | E/E | -E
  E → (E)
  E → id
Leftmost derivation:
  E ⇒ E+E ⇒ E*E+E ⇒ id*E+E ⇒ id*id+E ⇒ id*id+id
The derived string w = id*id+id consists entirely of terminal symbols.
Rightmost derivation:
  E ⇒ E+E ⇒ E+E*E ⇒ E+E*id ⇒ E+id*id ⇒ id+id*id
Given grammar G: E → E+E | E*E | (E) | -E | id
Sentence to be derived: -(id+id)
  LEFTMOST DERIVATION      RIGHTMOST DERIVATION
  E ⇒ -E                   E ⇒ -E
  E ⇒ -(E)                 E ⇒ -(E)
  E ⇒ -(E+E)               E ⇒ -(E+E)
  E ⇒ -(id+E)              E ⇒ -(E+id)
  E ⇒ -(id+id)             E ⇒ -(id+id)
• Strings that appear in a leftmost derivation are called left-sentential forms.
• Strings that appear in a rightmost derivation are called right-sentential forms.
Sentential forms:
• Given a grammar G with start symbol S, if S ⇒* α, where α may contain non-terminals or terminals, then α is called a sentential form of G.
Yield or frontier of tree:
• Each interior node of a parse tree is a non-terminal. The children of a node can be terminals or non-terminals. The sentential form obtained by reading the leaves of the parse tree from left to right is called the yield or frontier of the tree.

4.2.2 PARSE TREE
• Inner nodes of a parse tree are non-terminal symbols.
• The leaves of a parse tree are terminal symbols.
• A parse tree can be seen as a graphical representation of a derivation.
[Figure: step-by-step construction of a parse tree following a derivation]

Ambiguity:
A grammar that produces more than one parse tree for some sentence is said to be an ambiguous grammar.
Example: Given grammar G: E → E+E | E*E | (E) | -E | id
The sentence id+id*id has the following two distinct leftmost derivations:
  E ⇒ E+E                  E ⇒ E*E
  E ⇒ id+E                 E ⇒ E+E*E
  E ⇒ id+E*E               E ⇒ id+E*E
  E ⇒ id+id*E              E ⇒ id+id*E
  E ⇒ id+id*id             E ⇒ id+id*id
[Figure: the two corresponding parse trees, one with + at the root and one with * at the root]
Example: To disambiguate the grammar E → E+E | E*E | E^E | id | (E), we can use the precedence of operators as follows:
  ^      (right to left)
  /, *   (left to right)
  -, +   (left to right)
We get the following unambiguous grammar:
  E → E+T | T
  T → T*F | F
  F → G^F | G
  G → id | (E)
Consider this example:
  G: stmt → if expr then stmt | if expr then stmt else stmt | other
This grammar is ambiguous, since the string if E1 then if E2 then S1 else S2 has two parse trees.
[Figure: the two parse trees for if E1 then if E2 then S1 else S2]
To eliminate the ambiguity, the following grammar may be used:
  stmt → matched_stmt | unmatched_stmt
  matched_stmt → if expr then matched_stmt else matched_stmt | other
  unmatched_stmt → if expr then stmt | if expr then matched_stmt else unmatched_stmt

Eliminating Left Recursion:
A grammar is said to be left recursive if it
has a non-terminal A such that there is a derivation A ⇒+ Aα for some string α. Top-down parsing methods cannot handle left-recursive grammars. Hence, left recursion can be eliminated as follows:
If there is a production A → Aα | β, it can be replaced with a sequence of two productions
  A → βA'
  A' → αA' | ε
without changing the set of strings derivable from A.
Example: Consider the following grammar for arithmetic expressions:
  E → E+T | T
  T → T*F | F
  F → (E) | id
First eliminate the left recursion for E:
  E → TE'
  E' → +TE' | ε
Then eliminate it for T:
  T → FT'
  T' → *FT' | ε
Thus the grammar obtained after eliminating left recursion is
  E → TE'
  E' → +TE' | ε
  T → FT'
  T' → *FT' | ε
  F → (E) | id
Algorithm to eliminate left recursion:
1. Arrange the non-terminals in some order A1, A2, ..., An.
2. for i := 1 to n do begin
     for j := 1 to i-1 do begin
       replace each production of the form Ai → Aj γ by the productions
       Ai → δ1 γ | δ2 γ | ... | δk γ,
       where Aj → δ1 | δ2 | ... | δk are all the current Aj-productions;
     end
     eliminate the immediate left recursion among the Ai-productions
   end

Left factoring:
Left factoring is a grammar transformation that is useful for producing a grammar suitable for predictive parsing. When it is not clear which of two alternative productions to use to expand a non-terminal A, we can rewrite the A-productions to defer the decision until we have seen enough of the input to make the right choice.
If there is any production A → αβ1 | αβ2, it can be rewritten as
  A → αA'
  A' → β1 | β2
Consider the grammar G:
  S → iEtS | iEtSeS | a
  E → b
Left factored, this grammar becomes
  S → iEtSS' | a
  S' → eS | ε
  E → b

TOP-DOWN PARSING
It can be viewed as an attempt to find a left-most derivation for an input string, or an attempt to construct a parse tree for the input starting from the root and working towards the leaves.
Types of top-down parsing:
1. Recursive descent parsing
2. Predictive parsing

1. RECURSIVE DESCENT PARSING
➢ Recursive descent parsing is one of the top-down parsing techniques that uses a set of recursive procedures to scan its input.
➢ This parsing method may involve backtracking, that is, making repeated scans of the input.
Example for backtracking:
Consider the grammar G:
  S → cAd
  A → ab | a
and the input string w = cad.
The parse tree can be constructed using the following top-down approach:
Step 1: Initially create a tree with a single node labeled S. An input pointer points to 'c', the first symbol of w. Expand the tree with the production of S.
[Figure: S with children c, A, d]
Step 2: The leftmost leaf 'c' matches the first symbol of w, so advance the input pointer to the second symbol of w, 'a', and consider the next leaf 'A'. Expand A using the first alternative.
[Figure: S with children c, A, d; A with children a, b]
Step 3: The second symbol 'a' of w also matches the second leaf of the tree. So advance the input pointer to the third symbol of w, 'd'. But the third leaf of the tree is b, which does not match the input symbol d. Hence discard the chosen production and reset the pointer to the second position. This is called backtracking.
Step 4: Now try the second alternative for A.
[Figure: S with children c, A, d; A with child a]
Now we can halt and announce the successful completion of parsing.

Example for recursive descent parsing:
A left-recursive grammar can cause a recursive-descent parser to go into an infinite loop. Hence, elimination of left recursion must be done before parsing.
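The immediate left-recursion elimination that this step relies on (A → Aα | β becomes A → βA', A' → αA' | ε) can be sketched as below. This is a minimal illustration, not a full implementation of the general algorithm: productions are assumed to be lists of symbol strings, the empty list stands for ε, and the new non-terminal is assumed to be free to take the name A'. The function name is hypothetical.

```python
from typing import Dict, List

Production = List[str]                 # a right-hand side as a list of symbols
Grammar = Dict[str, List[Production]]


def eliminate_immediate_left_recursion(
    nonterminal: str, productions: List[Production]
) -> Grammar:
    """Rewrite A -> A alpha | beta as A -> beta A', A' -> alpha A' | epsilon."""
    # alphas: the tails of the left-recursive alternatives A -> A alpha
    alphas = [p[1:] for p in productions if p and p[0] == nonterminal]
    # betas: every alternative that does not start with A
    betas = [p for p in productions if not p or p[0] != nonterminal]
    if not alphas:
        return {nonterminal: productions}   # nothing to do
    new = nonterminal + "'"                 # assumed not to clash with existing names
    return {
        nonterminal: [beta + [new] for beta in betas],
        new: [alpha + [new] for alpha in alphas] + [[]],   # [] stands for epsilon
    }


# E -> E + T | T
print(eliminate_immediate_left_recursion("E", [["E", "+", "T"], ["T"]]))
```

For the E-productions of the expression grammar this yields E → TE' and E' → +TE' | ε; applying the same call to T → T*F | F gives T → FT' and T' → *FT' | ε.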
Consider the grammar for arithmetic expressions
  E → E+T | T
  T → T*F | F
  F → (E) | id
After eliminating the left recursion the grammar becomes
  E → TE'
  E' → +TE' | ε
  T → FT'
  T' → *FT' | ε
  F → (E) | id
Now we can write the procedures for the grammar as follows:
Recursive procedures:
  Procedure E()
  begin
    T();
    EPRIME();
  end

  Procedure EPRIME()
  begin
    if input_symbol = '+' then begin
      ADVANCE();
      T();
      EPRIME();
    end
  end

  Procedure T()
  begin
    F();
    TPRIME();
  end

  Procedure TPRIME()
  begin
    if input_symbol = '*' then begin
      ADVANCE();
      F();
      TPRIME();
    end
  end

  Procedure F()
  begin
    if input_symbol = 'id' then
      ADVANCE();
    else if input_symbol = '(' then begin
      ADVANCE();
      E();
      if input_symbol = ')' then
        ADVANCE();
      else ERROR();
    end
    else ERROR();
  end

Stack implementation (input id+id*id):
  PROCEDURE     REMAINING INPUT WHEN CALLED
  E()           id+id*id
  T()           id+id*id
  F()           id+id*id
  ADVANCE()     id+id*id
  TPRIME()      +id*id
  EPRIME()      +id*id
  ADVANCE()     +id*id
  T()           id*id
  F()           id*id
  ADVANCE()     id*id
  TPRIME()      *id
  ADVANCE()     *id
  F()           id
  ADVANCE()     id
  TPRIME()      (empty)

PREDICTIVE PARSING
✓ Predictive parsing is a special case of recursive descent parsing where no backtracking is required.
✓ The key problem of predictive parsing is to determine the production to be applied for a non-terminal in case of alternatives.
Non-recursive predictive parser
[Figure: model of a non-recursive predictive parser; recoverable labels: INPUT, STACK, Predictive parsing program, Parsing Table M]
The table-driven predictive parser has an input buffer, a stack, a parsing table and an output stream.
Input buffer: It consists of the string to be parsed, followed by $ to indicate the end of the input string.
Stack: It contains a sequence of grammar symbols preceded by $ to indicate the bottom of the stack. Initially, the stack contains the start symbol on top of $.
Parsing table: It is a two-dimensional array M[A, a], where 'A' is a non-terminal and 'a' is a terminal.
Predictive parsing program: The parser is controlled by a program that considers X, the symbol on top of the stack, and a, the current input symbol. These two symbols determine the parser action. There are three possibilities:
1.
If X = a = $, the parser halts and announces successful completion of parsing.
2. If X = a ≠ $, the parser pops X off the stack and advances the input pointer to the next input symbol.
3. If X is a non-terminal, the program consults entry M[X, a] of the parsing table M. This entry will either be an X-production of the grammar or an error entry.
   If M[X, a] = {X → UVW}, the parser replaces X on top of the stack by UVW, with U on top.
   If M[X, a] = error, the parser calls an error recovery routine.

Algorithm for nonrecursive predictive parsing:
Input: A string w and a parsing table M for grammar G.
Output: If w is in L(G), a leftmost derivation of w; otherwise, an error indication.
Method: Initially, the parser has $S on the stack with S, the start symbol of G, on top, and w$ in the input buffer. The program that utilizes the predictive parsing table M to produce a parse for the input is as follows:
  set ip to point to the first symbol of w$;
  repeat
    let X be the top stack symbol and a the symbol pointed to by ip;
    if X is a terminal or $ then
      if X = a then
        pop X from the stack and advance ip
      else error()
    else /* X is a non-terminal */
      if M[X, a] = X → Y1 Y2 ... Yk then begin
        pop X from the stack;
        push Yk, Yk-1, ..., Y1 onto the stack, with Y1 on top;
        output the production X → Y1 Y2 ... Yk
      end
      else error()
  until X = $

Predictive parsing table construction:
The construction of a predictive parser is aided by two functions associated with a grammar G:
1. FIRST
2. FOLLOW
Rules for FIRST( ):
1. If X is a terminal, then FIRST(X) is {X}.
2. If X → ε is a production, then add ε to FIRST(X).
3. If X is a non-terminal and X → aα is a production, then add a to FIRST(X).
4. If X is a non-terminal and X → Y1 Y2 ... Yk is a production, then place a in FIRST(X) if for some i, a is in FIRST(Yi), and ε is in all of FIRST(Y1), ..., FIRST(Yi-1); that is, Y1 ... Yi-1 ⇒* ε. If ε is in FIRST(Yj) for all j = 1, 2, ..., k, then add ε to FIRST(X).
Rules for FOLLOW( ):
1. If S is the start symbol, then FOLLOW(S) contains $.
2.
If there is a production A → αBβ, then everything in FIRST(β) except ε is placed in FOLLOW(B).
3. If there is a production A → αB, or a production A → αBβ where FIRST(β) contains ε, then everything in FOLLOW(A) is in FOLLOW(B).

Algorithm for construction of predictive parsing table:
Input: Grammar G
Output: Parsing table M
Method:
1. For each production A → α of the grammar, do steps 2 and 3.
2. For each terminal a in FIRST(α), add A → α to M[A, a].
3. If ε is in FIRST(α), add A → α to M[A, b] for each terminal b in FOLLOW(A). If ε is in FIRST(α) and $ is in FOLLOW(A), add A → α to M[A, $].
4. Make each undefined entry of M be error.
Example: Consider the following grammar:
  E → E+T | T
  T → T*F | F
  F → (E) | id
After eliminating left recursion the grammar is
  E → TE'
  E' → +TE' | ε
  T → FT'
  T' → *FT' | ε
  F → (E) | id
FIRST( ):
  FIRST(E)  = { (, id }
  FIRST(E') = { +, ε }
  FIRST(T)  = { (, id }
  FIRST(T') = { *, ε }
  FIRST(F)  = { (, id }
FOLLOW( ):
  FOLLOW(E)  = { $, ) }
  FOLLOW(E') = { $, ) }
  FOLLOW(T)  = { +, $, ) }
  FOLLOW(T') = { +, $, ) }
  FOLLOW(F)  = { +, *, $, ) }
Predictive parsing table:
  NON-TERMINAL   id         +            *            (          )         $
  E              E → TE'                              E → TE'
  E'                        E' → +TE'                            E' → ε    E' → ε
  T              T → FT'                              T → FT'
  T'                        T' → ε       T' → *FT'               T' → ε    T' → ε
  F              F → id                               F → (E)
Stack implementation (the Output column shows the production applied to reach that row):
  Stack        Input         Output
  $E           id+id*id $
  $E'T         id+id*id $    E → TE'
  $E'T'F       id+id*id $    T → FT'
  $E'T'id      id+id*id $    F → id
  $E'T'        +id*id $
  $E'          +id*id $      T' → ε
  $E'T+        +id*id $      E' → +TE'
  $E'T         id*id $
  $E'T'F       id*id $       T → FT'
  $E'T'id      id*id $       F → id
  $E'T'        *id $
  $E'T'F*      *id $         T' → *FT'
  $E'T'F       id $
  $E'T'id      id $          F → id
  $E'T'        $
  $E'          $             T' → ε
  $            $             E' → ε

LL(1) grammar:
If every entry of the parsing table is a single entry, so that each location has no more than one production, the grammar is called an LL(1) grammar.
Consider the following grammar:
  S → iEtS | iEtSeS | a
  E → b
After left factoring, we have
  S → iEtSS' | a
  S' → eS | ε
  E → b
To construct a parsing table, we need FIRST( ) and FOLLOW( ) for all the non-terminals.
  FIRST(S)  = { i, a }
  FIRST(S') = { e, ε }
  FIRST(E)  = { b }
  FOLLOW(S)  = { $, e }
  FOLLOW(S') = { $, e }
  FOLLOW(E)  = { t }
Parsing table:
  NON-TERMINAL   a        b        e                   i             t    $
  S              S → a                                 S → iEtSS'
  S'                               S' → eS, S' → ε                        S' → ε
  E                       E → b
Since the entry M[S', e] contains more than one production, the grammar is not an LL(1) grammar.
Actions performed in predictive parsing:
1. Shift
2. Reduce
3.
Accept 4, Error Implementation of predictive parser: 1. Elimination of left recursion, left factoring and ambiguous grammar. 2. Construct FIRST() and FOLLOW( for all non-terminals. 3. Construct predictive parsing table. 4, Parse the given input string using stack and parsing table. BOTTOM-UP PARSING Constructing a parse tree for an input string beginning at the leaves and going towards the root is called bottom-up parsing. A general type of bottom-up parser is a shift-reduce parser. SHIFT-REDUCE PARSING Shift-reduce parsing is a type of bottom-up parsing that attempts to construct a parse tree for an input string beginning at the leaves (the bottom) and working up towards the root (the top), Example: Consider the grammar: S— aABe Ax Abc lb Bod The sentence to be recognized is abbede.REDUCTION (LEFTMOST) RIGHTMOST DERIVATION abbede (A> b) S—>aABe aAbede (A —> Abe) — adde aAde (Bd) — aAbede aABe (S— aABe) — abbede Ss The reductions trace out the right-most derivation in reverse. Handles A handle of a string is a substring that matches the right side of a production, and whose reduction to the non-terminal on the left side of the production represents one step along the reverse of a rightmost derivation. Example: Consider the grammar: And the input string id;+ids*id The rightmost derivation is : > Etids*ids > idyFids*ids In the above derivation the underlined substrings are called handles. dle prur A rightmost derivation in reverse can be obtained by “handle pruning”. (i.e.) if w is a sentence or string of the grammar at hand, then w = Yq, where is then right- sentinel form of some rightmost derivation.‘Stack Tnput ‘Action 5 dri S shift Sid Vidic § reduce by Eid SE Vidic $ shift 3 iri S shift SBtid; FS Teduce by Eid SEE Fibs shift SEE id § shift SEEMS $ Teduce by Eid SEE $ Teduce by E> E*E SEE $ reduce by E> EXE SE $ ‘accept — The next input symbol is shifted onto the top of the stack. reduce — The parser replaces the handle within a stack with a non-terminal. 
* accept - The parser announces successful completion of parsing.
* error - The parser discovers that a syntax error has occurred and calls an error recovery routine.

Conflicts in shift-reduce parsing:
There are two conflicts that occur in shift-reduce parsing:
1. Shift-reduce conflict: The parser cannot decide whether to shift or to reduce.
2. Reduce-reduce conflict: The parser cannot decide which of several reductions to make.

1. Shift-reduce conflict:
Example:
Consider the grammar E → E+E | E*E | id and input id+id*id. With $ E+E on the stack and *id $ remaining, the parser can either reduce or shift.

If the parser reduces first:
STACK    | INPUT  | ACTION
$ E+E    | *id $  | Reduce by E → E+E
$ E      | *id $  | Shift
$ E*     | id $   | Shift
$ E*id   | $      | Reduce by E → id
$ E*E    | $      | Reduce by E → E*E
$ E      | $      |

If the parser shifts first:
STACK      | INPUT  | ACTION
$ E+E      | *id $  | Shift
$ E+E*     | id $   | Shift
$ E+E*id   | $      | Reduce by E → id
$ E+E*E    | $      | Reduce by E → E*E
$ E+E      | $      | Reduce by E → E+E
$ E        | $      |

2. Reduce-reduce conflict:
Consider the grammar:
M → R+R | R+c | R
R → c
and input c+c. With $ R+c on the stack, the parser can reduce either by R → c or by M → R+c.

If the parser reduces by R → c:
STACK   | INPUT  | ACTION
$       | c+c $  | Shift
$ c     | +c $   | Reduce by R → c
$ R     | +c $   | Shift
$ R+    | c $    | Shift
$ R+c   | $      | Reduce by R → c
$ R+R   | $      | Reduce by M → R+R
$ M     | $      |

If the parser reduces by M → R+c:
STACK   | INPUT  | ACTION
$       | c+c $  | Shift
$ c     | +c $   | Reduce by R → c
$ R     | +c $   | Shift
$ R+    | c $    | Shift
$ R+c   | $      | Reduce by M → R+c
$ M     | $      |

Viable prefixes:
> α is a viable prefix of the grammar if there is w such that αw is a right-sentential form.
> The prefixes of right-sentential forms that can appear on the stack of a shift-reduce parser are called viable prefixes.
> The set of viable prefixes is a regular language.

OPERATOR-PRECEDENCE PARSING
An efficient way of constructing a shift-reduce parser is called operator-precedence parsing. An operator-precedence parser can be constructed from a grammar called an operator grammar. These grammars have the property that no production right side is ε or has two adjacent non-terminals.
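The operator-grammar property stated above can be checked mechanically. The following is a minimal Python sketch, not part of the original notes. It assumes a hypothetical encoding in which a grammar is a dictionary from each non-terminal to a list of right sides, each right side is a list of symbols, and an empty list stands for ε. The function name `is_operator_grammar` is likewise only illustrative.

```python
def is_operator_grammar(grammar):
    """Return True if no right side is empty (ε) and none has two adjacent non-terminals.

    `grammar` maps each non-terminal to a list of right sides (assumed encoding).
    """
    nonterminals = set(grammar)
    for right_sides in grammar.values():
        for rhs in right_sides:
            if not rhs:  # an ε-production violates the property
                return False
            # look at every pair of neighbouring symbols in the right side
            for left, right in zip(rhs, rhs[1:]):
                if left in nonterminals and right in nonterminals:
                    return False
    return True


# E -> E+E | E*E | id (the grammar used in the shift-reduce conflict example)
expr = {"E": [["E", "+", "E"], ["E", "*", "E"], ["id"]]}

# E -> TE', E' -> +TE' | ε, T -> FT', T' -> *FT' | ε, F -> (E) | id
ll1_expr = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}

print(is_operator_grammar(expr))      # True
print(is_operator_grammar(ll1_expr))  # False
```

The second grammar fails on both counts: it has ε-productions and right sides such as TE' with adjacent non-terminals.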
Example:
Consider the grammar:
E → EAE | (E) | -E | id
A → + | - | * | / | ↑
Since the right side EAE has three consecutive non-terminals, the grammar can be written as follows:
E → E+E | E-E | E*E | E/E | E↑E | (E) | -E | id

Operator precedence relations:
There are three disjoint precedence relations, namely
<·  (less than)
≐   (equal to)
·>  (greater than)
The relations give the following meaning:
a <· b : a yields precedence to b
a ≐ b  : a has the same precedence as b
a ·> b : a takes precedence over b

Rules for binary operations:
1. If operator θ1 has higher precedence than operator θ2, then make θ1 ·> θ2 and θ2 <· θ1.
2. If operators θ1 and θ2 are of equal precedence, then make
   θ1 ·> θ2 and θ2 ·> θ1 if the operators are left associative,
   θ1 <· θ2 and θ2 <· θ1 if they are right associative.
3. Make the following for all operators θ:
   θ <· id, id ·> θ
   θ <· (, ( <· θ
   ) ·> θ, θ ·> )
   θ ·> $, $ <· θ
Also make
   ( ≐ ), ( <· (, ) ·> ), ( <· id, id ·> ), $ <· id, id ·> $, $ <· (, ) ·> $

Example:
Operator-precedence relations for the grammar
E → E+E | E-E | E*E | E/E | E↑E | (E) | -E | id
are given in the following table, assuming
1. ↑ is of highest precedence and right-associative,
2. * and / are of next higher precedence and left-associative, and
3. + and - are of lowest precedence and left-associative.
Note that the blanks in the table denote error entries.

TABLE : Operator-precedence relations
     | +  | -  | *  | /  | ↑  | id | (  | )  | $
+    | ·> | ·> | <· | <· | <· | <· | <· | ·> | ·>
-    | ·> | ·> | <· | <· | <· | <· | <· | ·> | ·>
*    | ·> | ·> | ·> | ·> | <· | <· | <· | ·> | ·>
/    | ·> | ·> | ·> | ·> | <· | <· | <· | ·> | ·>
↑    | ·> | ·> | ·> | ·> | <· | <· | <· | ·> | ·>
id   | ·> | ·> | ·> | ·> | ·> |    |    | ·> | ·>
(    | <· | <· | <· | <· | <· | <· | <· | ≐  |
)    | ·> | ·> | ·> | ·> | ·> |    |    | ·> | ·>
$    | <· | <· | <· | <· | <· | <· | <· |    |

Operator precedence parsing algorithm:
Input : An input string w and a table of precedence relations.
Output : If w is well formed, a skeletal parse tree, with a placeholder non-terminal E labeling all interior nodes; otherwise, an error indication.
Method : Initially the stack contains $ and the input buffer the string w$.
To parse, we execute the following program:

set ip to point to the first symbol of w$;
repeat forever
    if $ is on top of the stack and ip points to $ then
        return
    else begin
        let a be the topmost terminal symbol on the stack
            and let b be the symbol pointed to by ip;
        if a <· b or a ≐ b then begin
            push b onto the stack;
            advance ip to the next input symbol
        end
        else if a ·> b then /* reduce */
            repeat
                pop the stack
            until the top stack terminal is related by <·
                to the terminal most recently popped
        else error()
    end

Stack implementation of operator precedence parsing:
Operator precedence parsing uses a stack and the precedence relation table to implement the above algorithm. It is a shift-reduce parsing method containing all four actions: shift, reduce, accept and error.
The initial configuration of an operator precedence parser is
STACK | INPUT
$     | w$
where w is the input string to be parsed.

Example:
Consider the grammar E → E+E | E-E | E*E | E/E | E↑E | (E) | id. The input string is id+id*id. The implementation is as follows:

STACK    | RELATION | INPUT       | COMMENT
$        | <·       | id+id*id $  | shift id
$ id     | ·>       | +id*id $    | pop the top of the stack id
$        | <·       | +id*id $    | shift +
$ +      | <·       | id*id $     | shift id
$ + id   | ·>       | *id $       | pop id
$ +      | <·       | *id $       | shift *
$ + *    | <·       | id $        | shift id
$ + * id | ·>       | $           | pop id
$ + *    | ·>       | $           | pop *
$ +      | ·>       | $           | pop +
$        |          | $           | accept

Advantages of operator precedence parsing:
1. It is easy to implement.
2. Once the operator precedence relations are made between all pairs of terminals of a grammar, the grammar can be ignored. The grammar is not referred to anymore during implementation.

Disadvantages of operator precedence parsing:
1. It is hard to handle tokens like the minus sign (-), which has two different precedences.
2. Only a small class of grammars can be parsed using an operator-precedence parser.

LR PARSERS
An efficient bottom-up syntax analysis technique that can be used to parse a large class of CFG is called LR(k) parsing. The 'L' is for left-to-right scanning of the input, the 'R' for constructing a rightmost derivation in reverse, and the 'k' for the number of input symbols of lookahead.
When 'k' is omitted, it is assumed to be 1.

Advantages of LR parsing:
✓ It recognizes virtually all programming language constructs for which a CFG can be written.
✓ It is an efficient non-backtracking shift-reduce parsing method.
✓ The class of grammars that can be parsed using LR methods is a proper superset of the class of grammars that can be parsed with predictive parsers.
✓ It detects a syntactic error as soon as possible.

Drawbacks of LR method:
It is too much work to construct an LR parser by hand for a programming language grammar. A specialized tool, called an LR parser generator, is needed. Example: YACC.

Types of LR parsing method:
1. SLR - Simple LR
   Easiest to implement, least powerful.
2. CLR - Canonical LR
   Most powerful, most expensive.
3. LALR - Look-Ahead LR
   Intermediate in size and cost between the other two methods.

The LR parsing algorithm:
The schematic form of an LR parser is as follows:
[Figure: model of an LR parser, showing the INPUT a1 ... ai ... an $, the STACK holding s0 X1 s1 ... Xm sm with sm on top, the LR parsing program, the parsing table with its action and goto parts, and the OUTPUT]

It consists of: an input, an output, a stack, a driver program, and a parsing table that has two parts (action and goto).
> The driver program is the same for all LR parsers.
> The parsing program reads characters from an input buffer one at a time.
> The program uses a stack to store a string of the form s0 X1 s1 X2 s2 ... Xm sm, where sm is on top. Each Xi is a grammar symbol and each si is a state.
> The parsing table consists of two parts: action and goto functions.

Action : The parsing program determines sm, the state currently on top of the stack, and ai, the current input symbol. It then consults action[sm, ai] in the action table, which can have one of four values:
1. shift s, where s is a state,
2. reduce by a grammar production A → β,
3. accept, and
4. error.

Goto : The function goto takes a state and a grammar symbol as arguments and produces a state.

LR Parsing algorithm:
Input : An input string w and an LR parsing table with functions action and goto for grammar G.
Output : If w is in L(G), a bottom-up parse for w; otherwise, an error indication.
Method : Initially, the parser has s0 on its stack, where s0 is the initial state, and w$ in the input buffer. The parser then executes the following program:

set ip to point to the first input symbol of w$;
repeat forever begin
    let s be the state on top of the stack and a the symbol pointed to by ip;
    if action[s, a] = shift s' then begin
        push a then s' on top of the stack;
        advance ip to the next input symbol
    end
    else if action[s, a] = reduce A → β then begin
        pop 2*|β| symbols off the stack;
        let s' be the state now on top of the stack;
        push A then goto[s', A] on top of the stack;
        output the production A → β
    end
    else if action[s, a] = accept then
        return
    else error()
end

CONSTRUCTING SLR(1) PARSING TABLE
To perform SLR parsing, take the grammar as input and do the following:
1. Find the LR(0) items.
2. Compute the closure of the item sets.
3. Compute goto(I, X), where I is a set of items and X is a grammar symbol.

LR(0) items:
An LR(0) item of a grammar G is a production of G with a dot at some position of the right side. For example, production A → XYZ yields the four items:
A → .XYZ
A → X.YZ
A → XY.Z
A → XYZ.

Closure operation:
If I is a set of items for a grammar G, then closure(I) is the set of items constructed from I by the two rules:
1. Initially, every item in I is added to closure(I).
2. If A → α.Bβ is in closure(I) and B → γ is a production, then add the item B → .γ to closure(I), if it is not already there. We apply this rule until no more new items can be added to closure(I).

Goto operation:
Goto(I, X) is defined to be the closure of the set of all items [A → αX.β] such that [A → α.Xβ] is in I.

Steps to construct SLR parsing table for grammar G are:
1. Augment G and produce G'.
2. Construct the canonical collection of sets of items C for G'.
3. Construct the parsing action function action and goto using the following algorithm, which requires FOLLOW(A) for each non-terminal of the grammar.
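Before turning to the table-construction algorithm, the closure and goto operations and the canonical collection described above can be sketched in Python. This is only an illustrative sketch, not the notes' own code. It assumes the augmented expression grammar E' → E, E → E+T | T, T → T*F | F, F → (E) | id, and a hypothetical item encoding in which an item is a pair (production index, dot position).

```python
# Augmented grammar; production 0 is E' -> E (encoding is an assumption of this sketch)
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]
NONTERMINALS = {head for head, _ in GRAMMAR}


def closure(items):
    """Close a set of (production index, dot position) items under rule 2."""
    result = set(items)
    worklist = list(items)
    while worklist:
        prod, dot = worklist.pop()
        body = GRAMMAR[prod][1]
        # dot stands before a non-terminal B: add every item B -> .γ
        if dot < len(body) and body[dot] in NONTERMINALS:
            for i, (head, _) in enumerate(GRAMMAR):
                if head == body[dot] and (i, 0) not in result:
                    result.add((i, 0))
                    worklist.append((i, 0))
    return frozenset(result)


def goto(items, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(p, d + 1) for p, d in items
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved)


def canonical_collection():
    """Build the collection of item sets reachable from closure({E' -> .E})."""
    states = [closure({(0, 0)})]
    symbols = sorted({s for _, body in GRAMMAR for s in body})
    for state in states:  # the list grows while it is being scanned
        for sym in symbols:
            nxt = goto(state, sym)
            if nxt and nxt not in states:
                states.append(nxt)
    return states


print(len(canonical_collection()))  # 12
```

The state numbering produced by this sketch depends on the order in which symbols are tried, so it need not match the I0 ... I11 numbering used in a hand construction; only the sets themselves are the same.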
Algorithm for construction of SLR parsing table:
Input : An augmented grammar G'
Output : The SLR parsing table functions action and goto for G'
Method :
1. Construct C = {I0, I1, ..., In}, the collection of sets of LR(0) items for G'.
2. State i is constructed from Ii. The parsing functions for state i are determined as follows:
   (a) If [A → α.aβ] is in Ii and goto(Ii, a) = Ij, then set action[i, a] to "shift j". Here a must be a terminal.
   (b) If [A → α.] is in Ii, then set action[i, a] to "reduce A → α" for all a in FOLLOW(A).
   (c) If [S' → S.] is in Ii, then set action[i, $] to "accept".
   If any conflicting actions are generated by the above rules, we say the grammar is not SLR(1).
3. The goto transitions for state i are constructed for all non-terminals A using the rule: if goto(Ii, A) = Ij, then goto[i, A] = j.
4. All entries not defined by rules (2) and (3) are made "error".
5. The initial state of the parser is the one constructed from the set of items containing [S' → .S].

Example for SLR parsing:
Construct the SLR parsing table for the following grammar:
G: E → E+T | T
   T → T*F | F
   F → (E) | id
The given grammar is
G: E → E+T   ------ (1)
   E → T     ------ (2)
   T → T*F   ------ (3)
   T → F     ------ (4)
   F → (E)   ------ (5)
   F → id    ------ (6)

Step 1 : Convert the given grammar into an augmented grammar.
Augmented grammar:
E' → E
E → E+T
E → T
T → T*F
T → F
F → (E)
F → id

Step 2 : Find the LR(0) items.
I0 : E' → .E
     E → .E+T
     E → .T
     T → .T*F
     T → .F
     F → .(E)
     F → .id
GOTO(I0, E)
I1 : E' → E.
     E → E.+T
GOTO(I0, T)
I2 : E → T.
     T → T.*F
GOTO(I0, F)
I3 : T → F.
GOTO(I0, ( )
I4 : F → (.E)
     E → .E+T
     E → .T
     T → .T*F
     T → .F
     F → .(E)
     F → .id
GOTO(I0, id)
I5 : F → id.
GOTO(I1, +)
I6 : E → E+.T
     T → .T*F
     T → .F
     F → .(E)
     F → .id
GOTO(I2, *)
I7 : T → T*.F
     F → .(E)
     F → .id
GOTO(I4, E)
I8 : F → (E.)
     E → E.+T
GOTO(I4, T) = I2
GOTO(I4, F) = I3
GOTO(I4, ( ) = I4
GOTO(I4, id) = I5
GOTO(I6, T)
I9 : E → E+T.
     T → T.*F
GOTO(I6, F) = I3
GOTO(I6, ( ) = I4
GOTO(I6, id) = I5
GOTO(I7, F)
I10 : T → T*F.
GOTO(I7, ( ) = I4
GOTO(I7, id) = I5
GOTO(I8, ) )
I11 : F → (E).
GOTO(I8, +) = I6
GOTO(I9, *) = I7

FOLLOW(E) = { $, ), + }
FOLLOW(T) = { $, +, ), * }
FOLLOW(F) = { *, +, ), $ }

SLR parsing table:
      |            ACTION                 |    GOTO
STATE | id | +  | *  | (  | )   | $   | E | T | F
0     | s5 |    |    | s4 |     |     | 1 | 2 | 3
1     |    | s6 |    |    |     | acc |   |   |
2     |    | r2 | s7 |    | r2  | r2  |   |   |
3     |    | r4 | r4 |    | r4  | r4  |   |   |
4     | s5 |    |    | s4 |     |     | 8 | 2 | 3
5     |    | r6 | r6 |    | r6  | r6  |   |   |
6     | s5 |    |    | s4 |     |     |   | 9 | 3
7     | s5 |    |    | s4 |     |     |   |   | 10
8     |    | s6 |    |    | s11 |     |   |   |
9     |    | r1 | s7 |    | r1  | r1  |   |   |
10    |    | r3 | r3 |    | r3  | r3  |   |   |
11    |    | r5 | r5 |    | r5  | r5  |   |   |
Blank entries are error entries.

Stack implementation:
Check whether the input id+id*id is valid or not.
STACK                  | INPUT       | ACTION
0                      | id+id*id $  | action[0, id] = s5; shift
0 id 5                 | +id*id $    | action[5, +] = r6; reduce by F → id; goto[0, F] = 3
0 F 3                  | +id*id $    | action[3, +] = r4; reduce by T → F; goto[0, T] = 2
0 T 2                  | +id*id $    | action[2, +] = r2; reduce by E → T; goto[0, E] = 1
0 E 1                  | +id*id $    | action[1, +] = s6; shift
0 E 1 + 6              | id*id $     | action[6, id] = s5; shift
0 E 1 + 6 id 5         | *id $       | action[5, *] = r6; reduce by F → id; goto[6, F] = 3
0 E 1 + 6 F 3          | *id $       | action[3, *] = r4; reduce by T → F; goto[6, T] = 9
0 E 1 + 6 T 9          | *id $       | action[9, *] = s7; shift
0 E 1 + 6 T 9 * 7      | id $        | action[7, id] = s5; shift
0 E 1 + 6 T 9 * 7 id 5 | $           | action[5, $] = r6; reduce by F → id; goto[7, F] = 10
0 E 1 + 6 T 9 * 7 F 10 | $           | action[10, $] = r3; reduce by T → T*F; goto[6, T] = 9
0 E 1 + 6 T 9          | $           | action[9, $] = r1; reduce by E → E+T; goto[0, E] = 1
0 E 1                  | $           | action[1, $] = accept
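The stack trace above can be reproduced with a small table-driven driver. The following Python sketch is illustrative only. It assumes the SLR table of the expression grammar E → E+T | T, T → T*F | F, F → (E) | id with productions numbered (1) to (6), encoded as hypothetical dictionaries `ACTION` and `GOTO`. Unlike the algorithm in the notes, it keeps only states on the stack, so a reduction by A → β pops |β| entries instead of 2*|β|.

```python
# production number -> (left side, length of right side)
PRODUCTIONS = {
    1: ("E", 3),  # E -> E + T
    2: ("E", 1),  # E -> T
    3: ("T", 3),  # T -> T * F
    4: ("T", 1),  # T -> F
    5: ("F", 3),  # F -> ( E )
    6: ("F", 1),  # F -> id
}

# "sN" = shift and go to state N, "rN" = reduce by production N, "acc" = accept
ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: {"+": "r4", "*": "r4", ")": "r4", "$": "r4"},
    4: {"id": "s5", "(": "s4"},
    5: {"+": "r6", "*": "r6", ")": "r6", "$": "r6"},
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: {"+": "r3", "*": "r3", ")": "r3", "$": "r3"},
    11: {"+": "r5", "*": "r5", ")": "r5", "$": "r5"},
}
GOTO = {
    0: {"E": 1, "T": 2, "F": 3},
    4: {"E": 8, "T": 2, "F": 3},
    6: {"T": 9, "F": 3},
    7: {"F": 10},
}


def lr_parse(tokens):
    """Return the production numbers used, in order, or raise SyntaxError."""
    tokens = list(tokens) + ["$"]
    stack = [0]   # states only; the grammar symbols are implied by the states
    ip = 0
    output = []
    while True:
        entry = ACTION[stack[-1]].get(tokens[ip])
        if entry is None:                      # blank table entry means error
            raise SyntaxError(f"unexpected {tokens[ip]!r} at position {ip}")
        if entry == "acc":
            return output
        if entry[0] == "s":                    # shift
            stack.append(int(entry[1:]))
            ip += 1
        else:                                  # reduce by production number
            number = int(entry[1:])
            head, length = PRODUCTIONS[number]
            del stack[-length:]
            stack.append(GOTO[stack[-1]][head])
            output.append(number)


print(lr_parse(["id", "+", "id", "*", "id"]))  # [6, 4, 2, 6, 4, 6, 3, 1]
```

Read in reverse, the printed sequence of productions is a rightmost derivation of id+id*id, which is what the trace table shows step by step.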
