0% found this document useful (0 votes)
46 views41 pages

CD Unit 1

Uploaded by

ayush bansal
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF or read online on Scribd
0% found this document useful (0 votes)
46 views41 pages

CD Unit 1

Uploaded by

ayush bansal
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF or read online on Scribd
You are on page 1/ 41
CONTENTS 16 Ww 18 CHAPTER 2. 21 22 23 2a. 25 26 27 28 29 ‘CHAPTER 3. CONTEXT FREE GRAMMAR ... By 32 33 I-27 INTRODUCTION . 1 aes Sseueros ‘Frres OF TRANSLATORS 134 Compilers 132 Assembler 13.3 Interpreters 13a Macros sors Tinker and toad 136 _ Unkers EOF A COM EQNSTRUCTION OF COMPILER ‘Syntax Analysis (Parse!) ie analysts Seman cade Generation ose Stn 6 Code Generation 8 7 Table Management (Bookkeeping) 136 Erortanding ASSES IN COMPILATION 1.6.1 Classification ‘of compiler w.r.t Number ‘of Passes 1.6.2 __One Pass vs Multi Pass’ ‘Compiler BOOTSTRAPPING = COMPILER CONSTRUCTION TOOLS . LEXICAL ANALYSIS .. INTRODUCTION 2.1.1 Role of Lexical Analyzer 312 fnpu Bufesng BeStGN OF LEXICAL ANALYZER Sei Gransition Diagram (Finite Automata) for Tokens : ‘SPECIFICATION OF TOKENS (REGULAR EXPRESSIONS) RECOGNITION OF TOKEN (FINITE AUTOMATA) 241 Types of Finite Automata ‘CONVERSION OF REGULAR EXPRESSION TO FINITE AUTOMATA (NFA) CONVERSION FROM NFA TO DFA (Subset Constructic MINIMIZING NUMBER OF STATES OF DFA mien LANGUAGE FOR SPECIFYING LEXICAL ANALYZERS IMPLEMENTATION OF A LEXICAL ANALYZER, vee 28-66 ‘wrrooucion Seavanons TREES (DERIVA yee TION TREES) 122 Conversion of Ambiguous to Unambiguous Grammar INTRODUCTION 1.1 DEFINITION OF COMPILER The compiler translates a human-readable Program into machine-executable code. . OR A compiler is a program that translates a high level language (e.g. C, C++ : Java) into a low level language. Source Language Target Program COMPILER }-—————"""> (A programme written (Apr rogram written in High-level language) in Machine code) e.g. C, C++, Java Fig. 1.1: Compiler 1.2. COMPILERS AND TRANSLATORS Over the years, Programming languages have progressed from machine languages written in binary 0s & 1s to the languages written in English terms which can be easily understandable by user. Computer languages can be classified into following categories :— (@) Machine language (Binary form 0's & 1's) (6) Assembly language (© High level language (e.g. C, C++, Java) (c) Machine language : There is only one language understood by computer ie, Machine language. It is normally written as strings of binary 1's and 0's. Obviously, it is difficult to read & understand the instructions of binary form. A Program written in machine language has following disadvantages :— _ 1. Difficult to read & understand 2, Machine dependent : 8. Error Prone. ISHAN'S Compe, some new langua va to these limitations, over the Years £8 han cue 2 shew Melly understandable by @ user and which arg yy. this, we also require a software Program which a boa, dependent. But t0 do h emt Gepesdent program into Machine language becauso o OP Ute, by SY introduction hy q translator which converte high uk Janguage. I 4 e instructions. Therefore, we require ‘As, Computor can directly execute mac! ‘Assembly language into machine; general way, we can det ras. a s one language into another or : A program whic! ‘convert this machine in Gin caly understand Machine language. That Software Program ig cally Tr ‘Translator (Compiler or Assembler). dt tanguay (@) Assembly language : Assembly language was introduced in 1959 ‘A program that translates between high level languages is called "language ‘overcome some of the sbove limitations of Machine language. Using A, Whicy_ translator", "Source to Source translator” or "language rewriter". language 2 programmer need not keep track of storage locations of aa | Program in ee Program ia 4 one Language ieee ‘Ancther Language instructions. Instron can be easily written in the form of letters & eymbon ols but notin Os & 1's For Bg. Toadd two numbers we write ADD A, B in Assembly language, Following are the advantages of Assembly language over by : aching language: (0) Reading is easier. ‘ (®) Addresses. © Aidreses are eynbole& programmer need not worry about ade mnemonic eg we use ST instead of 01010000 for store instr.a Assembly language cece ea (Easy w locate and carret errors (©) High level Iti wey e enguage :It is machine independent language whi This cakes ee nT: to know any thing about internal structures erm ee © So enn sg easier. e.g. C, C+, Java, ete, cae '¥o numbers we can simply write ¢=2+ bin high evel language, High level language. low Leet ee ‘Assembly &] or Machine langu 24139119 9001 [SseS8h lenguegs —— 219000 | | Complexity Decreases poe | SUB. High ievel BAB language gO | User D | intabe Understandability ti Increases, a Fig. 1.2: Translator TYPES OF TRANSLATORS Compilers Assemblers Interpreters Macros Preprocessors Linkers & Loaders. beaded 1.3.1 Compilers ‘A compiler is a translator which converts high level language (soure program) into low level language (object program or Machine program). Machine tore ome vores ae eal El 2 Fig. 1.8: Compiler Compiler also find out the various errors encountered during compilation of "Yow-level language using various ‘through various phases of Program. Compiler converts high level language into phases. A character stream inputted by user goes compilation which at last will give target language. 1.3.2 Assembler ° translator which translates Assembly language progran, ‘Assembler is & program of he comPuteE valent machine language ta eee sets Tite | ein Program T Fig. 14: Assembler e more time to run assembly language progr, : than its Computer will always tak machine language program. » Computer has to convert first assembly language prog, fam inty Because rachine program by running assembler program. But Assembler reduce, effort done by programmer. As user has to write instructions in form of Joy Xtra symbols but not inform of O's & 1's ters oy 133 Interpreters 4n Interpreter a program which execute the programmis / grammin, i instead of just trafblating it into another format. Tt translates: ann gee programming langulge statements one by one. Sxecutes |i tev tithe interpreter Jo Rona of Program Execution irect Execution) Fig. 16: Interpreter Interpreters translates (e compile) & exccutes etatement by statement of whole ile) & rogram in very slower manner, guages which were mainly used during 1970s use Introduction Difference between Compiler & Interpreter :- Interpreter Tt translates & executes the program ‘They are slower as they have to translate whole program line by line. Requires less memory for execution ‘Compiler T | Itonly translates the program 2. | They are faster as they translate the whole program at once 8, | Require more memory for execution. 4, | Difficult to write Compiler | Easy to write programs 6. | C, CH uses compiler Java uses Interpreter (1.3.4 Macros | “Macros are like functions having actual & formal arguments as we have in C, (C++. But, Rather Macros are used in Assembly language, not in High level language (c,CH) For e.g. A Macro named SUM having P and Q as its formal arguments is called by Macro calling statement i.e. SUM A,B where A and B are its actual arguments. A and B will be copied into formal arguments P & Q respectively. MACRO MACRO SUMP.Q Macro Translation for SUM AB LOAD Q — LOAD B ADDP SUM A,B ADDA STORE P STORE B MEND MEND 1.3.5 Preprocessors fare programs that, transform the source code before Preprocessors compilation. As, Compilers are called language processors, therefore these are called Preprocessors. Preprocessing occurs before compilation i.e preprocessor tells or directs the compiler, to include some header files into the program. Source Pree preprocessor} —»] Compiler —e Machine Languase ogra Fig. 1.7: Preprocessors ring « 1.36 Linkers and Loaders 4.3.6.1 Linkers: ‘A Linker ia a program that combines object modules to foy, executable program, A cftware oft conaat of multiple source progr retin which auld bo contained before execution. Bach modulo of « softy, to produce object programe (object modules), Linker to loader to load it into smomory ta may 8 op 1.3.6.2 Loaders Executable Program Fig. 1.8: Linking and Loading Wty tituto value 100 at every occur Nee a aro de loads them into main memory for over Linker perform task of" Symbol Resolution". ‘Symbol Resolution : Software consist of various programs (or modules). The 2 (or functions) in one program can be referenced by subprogram (or another program through symbols. Program 2. # include extorn void dsplay () 5 Program 1. # include ‘extern void show () ‘void main () void show () { ( show () display (); 9 : Symbol Resolution ‘The variablee & functions which uso Yoxtern" keyword i, for global declaration leclared in ono program and ean be roferonced in another program. accepts the input as linked modules & execution by the computer. from secondary storage to main memory for \ddroseos with physical addresses. & londor. Bas there tasks Londer is a program which Loaders load or copies program jaders can also replace virtual a‘ no clear difforence b/w tasks of ‘There slap with each other. As, the loader ean porform both linking & 1o Source Reault of Progra] Taneiator Enoeution Fig. 1.9: Loader and Linker Functions of Loar” to space in momory forthe Programs Allocation abel Resolution between object modules Linking rm adres dspendent loetions of 2427656 cong, om eto different part of@ Program a nd addrespine instructions & data into memory. Loading Schemes 1. Simple comp ion without loader (Compile & go Loading sep, me), SuurceProeram[ Comrie | onjet tee | Freer | Program Loaded. Fig. (2) 2. Compilation with loader (General Loading scheme) Source Progra 1 ‘ones ' Pron Sm P Loader Source Program 2 [Com sou SEREAE] Comper | —aPoO" Performa both Loading & Linking Free Memory Pi 7 ‘1.11: Londing Schemes ISHAN'S Compa De ty Introduction, 2 ‘As shown in Figure 1.11 (a). If compilation is done & object programs are directly loaded without a loader, than a translator (compiler or Assembler) program will also take lot of space in memory which will result in wastage of memory. In figure 1.11 (6). (On using Loaders with compilation, Loaders will replace the space taken by compiler or Assembler. But loaders will take less memory than compiler or ‘Assembler causing more free memory to be available to user. ‘Types of Loaders : (Absolute Loaders : It is a primitive type of loader which does only the Joading. It does not perform linking & program relocation. In Absolute loading, the programmer has to give the address in memory explicitly where he/she wants to store the program. This is the main limitation of Absolute loading. @ Relocatable Loader : A loader which also performs relocation with loading. It is the responsibility of Relocatable loader to load each function ‘or subprogram at non-overlapping addresses & to give each function a original load address. 1.4. THE STRUCTURE OF A COMPILER. A compile: system software or a program which takes as input the source language (e.g C, C++, Java) & produces its corresponding low level language (Machine language program). In this topic, we are basically going to discuss the internal structure of a compiler ie. how actually compiler converts a program writte: language which is understandable by machine or a compute ‘This operation or process is so complex & difficult such that it do this whole process in one go. auser toa ‘stern. impossible to So, the process of compilation is generally broken down into subtaske calle phases. Each phase is interdependent on other phase. Output of one phase will b input to another phase. First phase of compiler takes as input the source languag written by a programmer & the last phase produces the required object language ( target program). Figure 1.12 shows the structure of a compiler or various phases compiler. Inoducton : " 3. Semantic Analysis ‘Performe Type checking ie for each operator, type compatibilities of operators are checked 4, Intermediate code Generation = Converts output of ite previous phase into intermediate representation. E.g (a) Postfix Notation (0) Three Address code © Quadruples (@ Triples & Indirect Triples Error Handling (© Syntax Trees 5. Code Optimization Tati frtermediato Code Generation [rt code = Tt converts the intermediate representation of source program into fan efficient code. ie faster running code. Code optimization can be done in 3 ways (©) Local optimization : Elimination of common sub expression (&) Loop optimization : Removing loop invariants, (© Global Data Flow Analysis : Perform optimization using | Target Programa information flow. Ready for Execution) @.__Code Generation ‘The whol Fig. 1.12: Phases of Compiler = Te converts optimized intermediate code into Machine TAssembly ole Process of compilation - is carried out in diffe code. ferent phases “= Ttallocates memory locations for variables in the program. Tt reads the = Geanner) 7. Table Management Source program one character at a time & group th Keep Record of each token & ite attributes (je identifier name A token is a sec a i quence of . data type, location etc) t of chars ; = puma zments ldeatitey acne collective meaning Eg of + Symbol Table is a data structures which eontaine Tokens ___ ens in the e rs, labels, : ; 2. Syntax Analysis to table, eament %. Error Handling + Receives tok Detect & Report errors occured at each phase of the compiler. (lexical Analyei) <2, iaput Analysis) & eenerated - are PPOduet whicrarchal gee erence called Syntax ane compilation P rouse for an input A= BYCap @ Saree ON aera ore eae ge re 8 Examples of Toke: 1: Shot Example 1 Soe oF compiler et aon —oatpet__| Q Sporaiare ae bse es © 5.6, 60.5) [SNo|___—" "Source Program _ @ Identifiers sum 7, [Procrammer_ ecaana © “Labels [2 [besical Ansley ee Tee = [Syntax Analysis Ch B ‘c = is ‘Tree eck data | [Semantic Analysis [Parse whether Nuceet ARS : i Bg character to2! or n°) Find out Tokens & their types. cannot beT data yt Token Type Values. JArithmetic ene, Weed IDENTIFIER a etic operation IDENTIFIER, b —lintermediate code|Thres Address code|T1 = B+C RSSIGNOPERATOR = & [Generation J(ntermediate code) |T2=T14p ‘ADD-OPERATOR + JA=T2 CONSTANT 5 & [Code optimization optimized T1=B+C 2. Put information about Tokens into Symbol Table. lintermediatecode JA = T24D \ddress_ ‘Token Type, val 7, |GodeGeneration [Assembly code MOV BRT a id, integer valu en Mov cc Re 382 id, integer, value= b ADD R21 : MOV DR : [ADD R38 R} ‘360 ‘constant, integer, value =5 IMOV_R1 a A 15 DIFFERENT PHASESISTATES IN CONSTRUCTION OF COMPILER 15.1 Lexical Analysis (Scanner) Lexical Analyzer read: a is the source prograi tokens of source program A token i erry collective meaning then ora + it insert tokens into symbol of characters whic! table, character by character and returns +h have some Remember, Information regarding keyword, operators, delimiters need not | inserted into symbol table, as they are repeated number of times ina program have very common occurrence in a program, 3. After Sing out tokens & storing them into symbol table, a token stre is generated as follows 332] + (const, 360] is of form (token - type , index} token-type tells whether it is constant, identifier, label ete. index tells about the address of the token in the symbol table. A lee 4 jon performed by Lexical Analygi, S Brample 8 What will be the cpt perfo 498I8 Phas, ~ statement. WA= Jo) then’ co % ‘Ans. ‘Tokens will be © ken Tye hig! xeyworDs ——_‘Iethen, GOTO DENTIFIERS A ASSIGN-OPERATOR = = = LABEL 200 DELIMITERS —(,) (i) Symbol Table address | Token Type, value 236 id, integer, value = A 238 constant, integer, value = 10 288 label, value = 200 (ii) Token Stream will be, id, 236) = [constant, 298) then GOTO flabel, 288) 155.2 Syntax Analysis (Parser) Program, oe | aN aa | const (10) 1.5.3 Semantic Analysis} ‘Semantic refers to the "meaning of a program" as opposed to its syntax or structure. Functions performed by a semantic Analyzer (a) Type Checking : Checks or verifies that each operator has operands that are permitted by source language jon or there should be type compatibility between operator & operands. Implicit type conversion (coercion) : Coercion refers to changing one data type to another automatically when data type rands of an operator are different or mismatches. Bg in Pascal when "+" operator has operands of type real & integer, the integer type will be automatically converted to real type before the addition is done. ie int+ real ] Type conversion real + real = real Egg. of Semantic Anal If a & bare real no's & 10 toreal Semantic a+b* 10 Sait + a+b * int to real (10) © an integer then semantic Analysis will convert int eration rate code CF" j “54 Intermediate Code vox & semantic analyse on Program, comp ; sing sy! i termediate between r er which i basically Source ja Re inte Fig. atypesof Intermediate Code = 1. Postfix Notation 2, Three address code (@) Quadruples @ Triples 3. Syntax Trees 1. Postfix Notation : In postfix Notation, ‘ce operator fllows an operan: ee. () Postfix Notation for expres Eg Three Address code for the statement Intermediate Code Lid: Intermediate Code Generation operator comes after an Derang Postfix Notation for the expression (a+b) * (c+d) is abt edge (arb) - (¢#d) is ab* cas, 2. Three Address code : These are Statements of for: which there will be atmost three addresses, ie, two for operan Each instruction has atmost one operator on right hand side d=a*bie| Three Address code — oT koh ie ds & one for Res, nat tl=a*b t2=tlte d=t2 8 Syntax Trees : Ij ies 8 et condensed ill be ope Exgeyntax Tre fog Pe verators, id+iata oe form of parse Tre e in which leaves are ep ; Neus oy It is basically an oy 5:5 Code optimization improve the intermediate code, so that faster running object code (machine code)will be produced It performs following task ~ (@) Improve target code. (Eliminate redundant information (common sub expressions) (© Remove unnecessary operation. Ww instructions with faster ones. ) ‘Types of optimization: @ Local optimization (© Global Data flow Analysis, (@) Local optimization: Local optimization includes removing common sub ions or redundant information, Loop opti zation PS=P4+QtR Z=P+Q+M Unoptimized code In unoptimized code , we have to perform four additions. But in optimized code, only three additions\are required which take less time of execution. (©) Loop optimization } Program spend most of their time in loops. So, it is very important to concentrate on loops for increasing performance of the whole program. A statement which computes the same value every time when loop is executed is called "Loo . Those statements can be taken outside the loop’ which leads to decrease in execution time of loop & obviously program also. int a=5; . inta=6; inte; inte; for (int i= 1; i< =5; i++) loop c=atd; { cout<: >} for (int i= 151 <=5 it) cana: optimization { } cout < | Woap Geceration ADD C Source Program STOREA Assembly Code : 15.7 Table Management (Bookkeeping) , Esch phase of the compiler which we {able management phase in fig 1 Previous phases in the table, we have discussed earlier, is attached tp ~ ach phase uses information stored by ite fetected by following phases of compiler : Phase ‘Type of Error Description [Lexical phase | lexical error [when remaining unread characters do not form any token [Syntax [syntax or syntactic phase | when token stream do not follow analyzer errors structural rules of language E.g [Missing Parenthesis [Semantic [semantic errore Type checking errors analyzer Intermediate | Intermediate code Eg Incompatible types of operands. [Code generator |generation errors [Code optimizer |code optimization errors |some statements never reach while execution of program. [Code generator |code generation error [ome data values are very large to fit into register for allocation. _. Example 6. Apply lexical analysis, Syntax Analysis (parser) & Intermediate code generation inthe statement While (A>=B && A<=3+B-I0)do A=A+B : ‘Ans. 1. Lexical Analyzer will find otit tokens & will store tokens into symbol Table. * address | Token -Type, value 540 id, value= A 542 id, value= B rate following parse tree for the aboy. stream. ‘ 2 will gene : 2, Syntax analyzer 2 Example 6. Show the whole compilation process for the statement A= B*C+20, where A, a - : ‘are of real lypes. enti ae ocd — as xN Z 200 WAN < of PK | “ ee en eA 7 ap th ed TNS = aA ae nnioe const) 4% Intermediate codo ie thne Address code for the Statoment is given below : IfAZB goto (3) goto (9) TI=3*B Three address code T2=T1-10 =—_—_ IfA S T2 goto (7) Boto (9) ‘Ti=id,tidy A=A4B id,-T1420.0 oto (1) e F id, RL Three Address code, MULTF id, R1 ADDF # 24 ‘MOVP Ri, id, Assembly] code 200 208 22 216 ple 7 Shove Pane wrious phases of row object code TiHayidy TasideT1 Tasid,T2 igor, TisidJid, Tesidyrt idsider2 PASSES IN COMPILATION ‘Whe whole source program can be processed several times before generating sembly / Machine code. Pass+ One complete scan or processing of source program. Several tnases can be grouped into one pass. Lexical, syntax & semantic Analysis are often ‘ouped in single pass. Each Pass reads source Program & write output into ntermediate file, which then can be read by subsequent passes. ic. output of one Sage will be input to next Pass. 1 Classification of compiler w.r-t Number of Passes 1, One - Pass Compiler : One pass compiler reads the program only once & then translate it. 2. Multi -Pass compiler : It reads the program several times, each time transforming it to another form. Source code ‘Assembly Code! Machine Code Fig. 1.14: Passes ISHAWS og piler pass vs Multi Pass is [One Pass compiler Tt reads program translates it at same 7.6.2 One su ram several Tega Oe Rein ifforent form. E | They are “slower” jar passes means only time, 8) Farmore number| They are faster. ‘more execution) Less effici ie optimization & code tor cod [gencration.. ‘called "wide compiler" can sean cach & every pot of program. [Memory occupied by one paas can be} id by subsequent pass therefore} is required by It is also called * N; rrow As It has limited scope. “a i Reqanay Large memory compiler. E.g Pascal & C lan; |- pass compilation: #8 Weng tion. mpilers from existing compiler. Sometimes, we think “How the first compiler was compiled”? ‘This question is : ‘This question is answered by Boot strapping = Using Bootstr translate a more con rogram, = For Seottp ing a compiler is characterized § ~ Source language it compiles Te Teveet language it generates ~ Implement Se ibait tation language that it is written in guages can be represented using a T- diagram as s 7 : a which can be abbreviated as , SiT 1 Fig. 1.15 -T-diagram hich is used in turn can handle more complisin by three languages : for another macl 28 tetroduetion ‘Cross Compiler : A compiler may run on ene machine and produce target code e. Such a compiler is called as a cross — compiler.“ 1. Suppose we write a cross - compiler for a new languages ‘Lin implementation languaga’S' to generate code for machine 'N’ uN 1 ns on machine M and genorates code for M, ie, 2. Hfanexisting compiler for'S it is characterized by SuM. If LsN runs through SuM, wo get a compilor LwN, i.e. compiler from L to N that runs on M. ? i.e, LsN + SM = LN divean also be represented as = xl 8 when SsM runs on SsA, SaM will be generated sm fs are Al Ll La] io SM. SM SSMAN'S Com, ie Blisainate common sub ‘compiler construction enerate lexical ay ‘Automata which 2 Toole a eoanner generatcnt? in| | prosram. ee varus yrodves ernie: ana Babi aanagement «Keer seerd of each token found by lexical ‘yntax of Programming Language based or Whig | «ART landling : Bagh phase detects & report error. oleh oe mast) we Passes tion Engines ease coftware si 8 ¢ > ; 4 3 ; 3 é Fig. 2.4: Transition Diagram for Keywords ¥ Table 2.1: es Integer codes for different Tokens [Identifier| eo Tdentifier Integer. = fixed. oe des and values wh desig Different Programmers can chose different uppose, ifidentifier ig the Lexical Analyzer. is stor i ved at location 236 in symbol Table, then, letter digit not dette or digit return(G,install) Aateger Cate, Insta) ly | 6.236) Parser}. | Fig 2.5: Transition Diagram for Identifier ifeonstant is stored at locati es S08 than [Constant] cera Similarly, ical Anaya at thon Lt=ULIULZU...... Here * canbe 0, ‘ Lt ={e} UJUCLYU. & Lt = (e111, Operations on Languages If Li = (00, 10} & La = {01, 11} i 7 Operation Deseri ____ Example Union LAU La = (eet of string in Ls] LxUL2 =(00, 10, 01, 11) & strings in Le Concatenation _|LiLa= (Set of string in Li[Lile = (0001, 0011, 1001, followed by strings| ~ 1011} inl) 5 Kleen closure of la [13 =19 ur, uLzU = &, 00, 10, 1010, 0010, u 1000, 0000, 600000, 4 : or Rit (Ret Re) = (Rit Re) + Rs v= Oy 001000, Ea =u... Lj= _ {00, 10, 1010, 0010, : 1000, 0000,000000, ora 4-00 01000...) ig. 2.7 : Transition Di mf . “nsition Diagram for Relational operator f SPECI : s Soon OF TOKENS (RE Rules of Regular Expre: * To specify tokens, Re 1. is a Regular expression. . Bula It provides convenient a a 2. Union of two Regular Expressions Ri and Re. "Regular Expressions def, Sef notation for ie. Ri+Reor Ril Reis also a Regular Expression. ~ (ansition Diagram, 3. Concatenation of two Regular Expressions Riand Re * Satna Expressions are def ie. RiReis also a Regular Expression. UR ina Regular Epreceg vet alphi 4. Closure of Regular Expression R i.e , R* is also a Regular Expression. Teg expression, ™ then L(R) rep, 5. If Ris Regular Expression then (R) is also a Regular Expression. e can be denied by a %84 collection of atrin Algebric Laws iE iL danguage = 88 Over some fixed alphabet 1. Ril Re=Rel Ri or Ri+Ru=Re+ Rt then, Lnigg ott etrings of g 2 1a] Ra) = (Ril Ra) Ra i Ri ReR)= BRIE | 4. Rie | RY=RiRe | ist i or Rr (Re+ R= Rife + Ris Ore a : & eR=Re=R (x \xample 6. Write Regular Expression for 2 = {a, b} ' : neater,» (a) Le=set of strings having atleast 2: Find Regular Expression for following Langua, ten. * (0) ngs ig atleast one occurrence of double letter Example - sd ‘ns. (a+ 6)* (aa + bb) (a+ b)* of string. Pee0=1, Rey (b) L=set of strings having double letter at Beginning and bL=1)) gs. (aa bb) (a+ 8)* (aa+ bb) 8 (©) L=set of string having double letter at beginning or on ending of string. ns. (aa + bb) (a+ B)* + (a +b)* (aa + Bb) + (aa + bb) (a + b)* (aa + bb :mple 7. Write regular expression for set of words having a, ¢, i he order although not necessarity consecutively. }) ans. Ifa = ((b—d] + [f—h] + Gn) + [pa] + (v —z] + Blankspace)* (b= (4,11, 111, (Hint: *=0, 1,2 Ans. It () L=(e, 1, 1M, 1D Ans.(11)¢ @_ L=Seofoll strings of sand 1'¢= 6,0, 1,01, 11,00, 000, 19) ‘Ans. (0# 1)" or (011)* (@) L=Setof all strings of 0's and 1's ending with 11, ++ Required Regular Expression will be Ans. (0+ 1)*11 aaaeaiaoaua (© L=Set ofall strings of O's and 1's be i . ‘Ans.0(0+1)*1 ei 'th Oand Ending win, ) 14. RECOGNITION OF TOKEN (FINITE AUTOMATA) Regular Expression are used to represent various type of Tokens. From these ng language over x= j, , teevlar Expression, Finite Automata is generated. Finite Automata is basical =H) ther name of Lexical Analyzer which checks or recognize whether a string is Example 3 : Write Regular Expressions for follo (0) Strings of length zero or one. To Token or not. Strings of length two, Finite Automata: Ans.aa |0b | ba bb or (aa++ab+ ba + bby + It is a Machine or Recognizer for a language that is used to check © Strings of Been length, ” vhether string is accepted by a language or not. Ans (calablbal68)* oF (00 +0b-+ + Itgives answer "Yes" if string is accepted else “no”. © Ssofallsringsofa'sand bsharne + InFinite Automata, Finite means finite number of states & Automata . means, the Automatic Machine which works without any interference of havi ing atleast two occurrences of aa, human being. Ans. (0 +.8)* aa ( 2 (0+ 8) aa (a+ bye Representation of Finite Automata :There are 2 ways to represent finite Automata. 1. Transition Diagram : It is a directed graph or flow chart having states & edges. Final State Initial State 0 + Initial state 0, 1, 2+ States a,b Input symbol jon Table : 2+ Final State Finite Automata can be Represented by 5 tuple (Q, » Qisfinite non-empty set of states. @ + 90,8) (6) Sis finite set of input symbols - (0 dis transition function @ qe Qisinitial state. (© FE Qisthe set offinal states. Example 8: Design Finite Aulomata which accepts string “abby Ans. States Q= (4s, a1, 42, G3) Input symbols Z={a, b) ‘Transition Function 3 8 (go, a) = +) = 9 8 (41, 8) = G2, 5(qs, b) =, Initial Stave % Gee Final State (F) = tab 24.4 Types of Finite Automata (@) Deterministic Finite Automata (DFA) (6) Non-Deterministic Fi nite Automata (NI (@) Deterministic Fin; ee ae oat ite Automata (DFA) : “Deterministic” hs ane and onl; i tna enn 'Y one state to which the automata can bit a an a ‘S-tuple, 30, 0, F) (2) Qis finite ‘Ron-empty set of states cat Anaysis 38 Neat Anatly 8 (6) Non-Deterministic Finite Automata (NFA) : “Non-Deterministic’ rns there can be seyeral possible transitions. So, output is non-deterministic for a iven input. NFA is 5-tuple (@, 2, 6, go, F) where (@) Qis finite non-empty set of states. (®) Dis finite set of input symbols. (©) dis transition function to move from current state to next state. 5: QxE+2e (@ q€ Qis initial state (© FC Qisset of final states, + Difference Between DFA and NFA DFA NFA 7. Every transition from one state | There can be multiple transitions for to other is unique & | an input ie. non-deterministic. deterministic in nature. 2, Null transitions (¢) are not | Null Transitions (¢) are allowed means allowed. transition form current state to next state without any input. 3. ‘Transition function ¢ | Transition functic ye 6:QxE+Q 7 8:Qx E> 20, 4. Requires less memory as | Requires more memory. transitions & states are less 5. EDFA EgNFA Infig. In fig. Each State has unique input symbol | As, there are multiple transitions of E= (a, b} outgoing from it. | input symbol ‘a’ on state 0. & no transition from state 2, ie non- 2.5 CONVERSION OF REGULAR EXPRESSION TO FINITE AUTOMATA (NFA) ‘A Regular i Recognize @ token, We ae ae Bac (NFA). So We BB CONT Regular Expression into NP4. ae igerthm fr eopversion OF 1? NFA: eam oular Expression R ‘Ans. language denoted by R L ISHaNs Expression i8 basically @ representation of & ‘oken Recognizer which is nog "A accepting Fore, NFAis. Fora, NFAis Fora +b, or alb NFA is Forab, NFAis Porat, NFA is Toke, s ite, NY ca Anais hing by yaa yple 10: Draw NFA for Regular Expression a (a+b)* ab Example 11: Draw NFA for atb+ ab Ans. Ans. transitions 1 ish a sts Eo theme pete be repeated, ut Alort:etosare(D) ‘re otanes hse nuit be ound 1 Pus Allstate Pou stack. areas. it psure (54 enclosure) 66) all states are covered. forcachsttet,withedges—t a ‘ 6 6 {Uf tisnot present in e-closure (1) 1 i- (e-closue (1)=e-losure (T) U(t) tele) = Example thle 7 (O,eclaare( ecoare orf e-ctosure @)= 6) pFAare Algorit 2,6 CONVERSION F ‘To construct DPA €9! collection of states of NFA. Example 16: Draw NFA for DPA. ‘Ans, NPA for (atb)* abb is ROM NFA TO DFA (Subset Consruction Algo) aivalent to NPA, it should be remembered that states of ithm NFA -to-DFA: sA with set of states N= p= {dd set do unmarked while there is an { sotd marked {For achinput symbol‘ unmarked state din D’ re isa transition 0% in NBA to which the d ¢ pep ‘Add transition > labeled setd' unmarked y n, convert NFA lo regular Bxpresion (atbj"abb Then 7 wert it into DFA Tow we can apply algorithm fo con — -losure 0) lot 210,247) Tey p={a} 3 ‘Apply steps (4-12) of Algo on state A 4 by, ‘The transitions of symbols a, b from state A ‘ A= MAORI 4 TF vba by ay Penh + 3 6 B } ls 1 For Input Symbol a, For Input Symbol», 1 =63) Tei | 2 Beeelosure (1) Osea “clos =eosure (38)) seclosurcgy, 8 = eclosure (8)Ue-closure (8) = (5,67, 3 =18.6,7,1,2, 4)U(8) 4 B=(1,2,3,4, i, “ Bete Is | Cniaseey 1 Poange het “Ad transormation fom A toBand Atoc’ f (step 1 For input symbol + Te = 88) e-closure(Ts) = e-closure (3,8)" = (1,2,3,4, For input symbol b Te = (5,9) e-closure (Ts) = e-closure ((5,9)) = e-closure (5)Us-closure (9) = (6, 6,7, 1,2, 4} U (9} = {1,2,4,5,6,7,9} = D = D! = (A,B,C)UD}.={8,B,CD) «Add Transitions from B to B and from B to D e 7 @--® [For state O| Since, Ga ( te ee 6 7} Fabby t 4 ag Te ee ie ) For input symbol a For input symbol b T= (8,8) = 5) e-closure {3,8}=B | -. e-closure (5) = C Add Transition from C to B and C toC » Q ©+-® [For state D] : cs eee cee 4, GB & 7f 9) ab by le ele bh To ke For input symbol a For input symbol b T. = (3,8) ‘Ty = {5,10} e-closure (T:) = B e-closure ({5,10}) : = e-closure (6)Ue-closure (10) = {5, 6, 7, 1, 2, 4,}U{10} ={1, 2, 4,5, 6,7, 10}=E & »-D’ = (A.B,C,D) U (E} = A.B,.CD.E} Be (ER bet Synh % 1? day bb $ 4 ab i tm (ae BY For input symbol a For input symbol b T= (88) ‘Te = {5} e-closure (3,8)) = B e-closure (5) = C ‘Add Transition from E to B and E to C ® ‘Therefore states Dagan om fa of DEA wl be Dr =A, B, ©, D, E}. Joining alll transitos * It can also be represented in form of Transition Table as state) Fi stato) a B c D ® E contains state 10, which is final state in NFA be final state in DFA. ING NUMBER OF STATES OF DFA _-Minimizing means Reducing the number of states in DFA. We have to detect those states of DFA whose presence or absence in DFA does not affect language accepted by DFA. These States can be ‘Algorithm : Input: DFA D1 with sot of states Q with set of final states F. Output: DFA D2 which accepts same language as D1 and having minimum no. of states as possible. Method (2) Make a partition ‘1' of set of states with two subsets : (@ Final state (Non Final states ‘@-F” ; 2=(F,Q-F) : ) Apply following procedure to make tow from . For each set S of 2. Partition S into Subsets such that two states p & q of S are in same subset of S iff for each input symbol 'a' states p & q have transitions to states in same set of x. Replace S in zrmw by set of subsets formed, If stoew = 2, Let musi = 2 & continue with step 4. Else repeat step 2 for @ = Tom, Choose one state from each set of east as representative o ‘These statesjwill be states of Minimized DFA D2. @ f that set. Ans. Make a Transition Table inal state) G) |B 2 Makea patton of set of states i, x = (F, QF) (E}, {A, B, C, Dj} mn 3(¢) For input symbol a, on (A, B, C, D} of z0 c—t4p| A! Bele in same set (4, B,C, D} of xo D4) "4, (6) For input a, on (A, B, C} of o @ — g-8 ap] ® A.B) B—2+B} All Bs lic in same set of x1 C+ +B For input bon A,B, C)ofm \ ate c io in same set (A,B, C} of, Lio in other set (D) of 7, 1 will be split into {A, C} and {B} For input a, on {A, 0} of a2 lie in same set of CB. For input 6, on {A, C} of xz acl | Lie in same set of 12 ‘There will be 4 states of Minimized DFA corresponding to 6 states of ‘given-DFA. trai +. (AO) can be renamed as Al . B : D ; . E 4 States of Minimized DFA ie. Al, B, D,E will be joined by seeing the ns from the given DFA Table . :: Minimized or Reduced Automata will be Ans. 1. Make a Transition Table, (iti state) FW For inputb, on (1,2,8.4) of 0, » — Piacente mnt a—_h—s Lie in same set {5} of abe , 2, 3, 4} will-be split into {1, 3} and 2,4) 1, 3}, 2 4} som +4, (@)For input symbol a on {1,3} of 1—2+2 | Liein same set (2, 4} of ate 4! Similarly for input symbol a on (2,4) of a 2+ 3 Lie in same set (1, 3} of m1 4243 (&) For input symbol b on {1,3} of m1 it 3 Lie in same set{1, 3} of m1 ats Similarly for input symbol b on {2,4} of 1 at * Lie in same set {5} of 1 ‘ S 4} will not be splitted. aid eee a \ Gaitislotate) 39 24 24 inalstate) = © (@) Ausiliary Definitions : It denotes the Regular Expressions of the form 2 LANGUAGE FOR SPECIFYING LEXICAL ANALYZERS 2 . (Dy = By i D, = R, : stave we ever tought which language OF CO4Te® Program te be Bs produce lexical Analyzer. LEX source program represents the Jac," i Distinet Names}? {Regular Expressions pecfcation of lexical Analyzer Nguage 3 7 i LEK: j a = Re * iuie-a tool or sf which automatically generate lexical : ‘utomata) ~ Analg ct Names (Di) + Shortcut name of Regular Expression ‘It ibe ai input a LEX source program and produces lo «Regular Expression (Rj + Notation to represent eallection of MPM symbole asits output ical + And then , Lexical Analyzer will convert the input tri eo horns iti i Gis wien: ine entered ‘Auxiliary Definition for Identifiers + Lx Di Lotter =A R Seae—>| LEX \{—> Lesical De digit = 0111) Re isa Analyzer Ds |. identifier =letter Getter | digi 1 Re . ‘Auxiliary Definition for Input __.| Lexical 1 igit String —"] Analyzer [7 Tokens i Fig, 2.8: LEX T signedinteger= sif integer LEX Source ot ‘ool ignedinteger= sign intege) ee . iliars finition for Dee ymber Lexical Analyonr im : It is a language used for specifying or represen an Lena meoravenine _ . eximel = signedinteger .inteee + = Auxiliary Definition for Exponential ‘Numbers ‘Exponential - No= (decimal {signedinteges) E pignedintewer, + Auxiliary Definition for Real Number: r Real-No= decimal | ‘Exponential — ‘Lexical Analyzer Leger LE) X source program consist of 2 parts : “Foss (Translation Rules : It is et ofruls actions which tells ns what it has to do or what it has to return: to parser on ‘encountering the token. It consist of ‘statements of form: ! Fae Pi {Actiomi} as Translation vi Pa (Actions) Rules ‘ ~ Fig. 2.9 : Compan ompotients of LEX Source P; | ‘rogram Pa {Action ro — Er ee oe ion consistis Ream wit som or Regular Expression HOE of opus ; n names. ae Foray srry res of ode which 218 ex0ted When, Q ies specs ext of statements to be executed ype : etl expresin or pattern Pi matches ‘en, 7 ‘Then, NFA‘s for corresponding patterns will be. Bsample: 7 ‘Translation Rules for "Keywords' begin {return 1) Patterss | end. {return 2} pep if {return 3} > Actions Expressions then {return 4} clse {return 6} From Table 2.1, we can see, that if Lexical Analyzer is given the it will recognize the token "begin" & Lexical Analyzer will return 1 as int the parse. ‘Translation Rules for "Identifiers" + + A start state ie taken and using letter (etter +igit)* connected together to make combined NFA = Ile i ical Anelyzer is given the token which is ari "identifier", then the Ad taken by the Lexical analyzer is toi val 6 an inteper code tothe parse tare the name in symbol tablet 29 IMPLEMENTATION OF A LEXICAL ANALYZER LEX generates Lexi itsinput ‘cal Analyzer a ite output by taking LEX pret! LEX program is coll aerating Aton, fPalers (Rogular Expression & # atterns teaeriing te tokens to be recog a recognized by lexical analyze! Fig. 2.11: Combined NFA For each Pattern, 7 SSoreeponding NFA will be designed. «The Final state ofeach NFA show that it has found ta own token P. UD . ‘Shey Toabined NFA to DPA. As itis alivayg g Convert tht ‘ha program. Ry tg) ‘A with & PI ai ea which token we have found, 1f nop, w Tae Mies any final states of NFA then contro} me a y +5 Ny condition includes more than one final y ate of DFA inclu 2 state opyp : ee stipe pattern coming fret in the Translation ruje, wa ty bee, AUXILARY DEFINITIONS a Example 19: Concert the following LEX program into Lexical 4, ‘Auomata)- TRANSLATION RULES th abb {} art} Hint; Ans.1. Convert the patterns into NFA's mea a Gonvedt NFA to DRA: A eslosite = (01,3, 7 'he transition on eyrabot, bole a,b from state A th hRet- 2, < at oat Az 9G L bh bb gant e-closure (Ta) = e-closure (2, 4,7) =2,4,0=B ha at Bet 2 by Net e-closure ()=(7} =D T= at = {8} by T= (8) sceclosure (@)=¢ | e-closure (8) ={8}=C 4 7) 2247 at at a 74 bh bE = 8) =8 e-closure (Ts) ® : =eclosuret) GF =@=C 6 © 7 eth at at e aay ay by bt 5, 8 } = e-closure (5, 8) =16,3)=E 1-728 §-9 ‘zaclosure (1)=(7}=D | e-closure (8) = {8} = C [HersuseE ] / Po at at ae E= ( 5 8 } a by by es =f & 8 } =a} ae - ure )=$ | e-closure (6,8)) = (6,8} = F + ema) 6.6 + ecclosure (6) =, ; O=6 | econure 6) =(9)= 0 : : ® ®--O + Combini 6,840 fal etter NPA Diagram 8, We get complete DFA. Since ss* States in NPA havi tats in NEA having thre tte states ie, 247, 8, 5 8, 68, 68 are final states \ ‘Tokens Recognized + \ + maT Tokens Recognize | State | a [> ois? | 2aT[ 8 ‘one oat | 7 | 58 a \ a |e| 8 we 1 \7 | 8 one ss | o | 8 ab cs | #| 8 abb ¢ lel? none + 0137+ Nostate in {0,1,3,7) is Final state. ‘Therefore , no token will be | Recognized by this state, |, state 2 in these’ states is final eat s combined NFA. Therefore, 247 will accept & 8 ig Final State in combined NFA. It accepts bY io state 2 accepts @ in arf combined NFA. ad a_i not final state & there it . 58+ "yj Final eats but Bis sons acoopts nothing: jn combined NFA. Therefore GC) Both states 6 & 8 are final But 6 eccepts ebb and 8 ccopts a” b* in combined NF ‘bb comes before a bin ‘Translation rules given ‘in Question. Therefore, state 68 will accopt token ab —_— <_ISHANe ign Lexical Analyzer for following Lex Protos ‘pxample 20: Desi ‘The combined NFA: for various patterns will be :— \” ,yxILIARY DEFINITIONS © Ans. 1. \ etter = AIBICI IZ digit = 0/112 9 \ ‘TRANSLATION RULES begin {return 1} end {return 2} Nee {return 3} x then {return 4} \ else {return 5} letter (etter + digit)’ [value = Install Q3 return 6 | digitt (eee (3 \ return 7 } 5 value = 1 ;] return 8 = {rae =2; return 8 } : SL fraluo =a ; rss <> {ee =43 a > < value = 5 E ne return 8 } Fig. 2.12: Combined NFA + { value = ¢ i A | e for following language and convert it into DFA using Le (aa + (bb) c*) ‘Ans. NFA for about language will be: @—_—-® 7 Conversion from NFA to DFA e-closure (0) = (0,1, 4)=4 For input symbol @ . T= 2) : 2 starting state of DFA. In combined NF, transions from alate 1 to 2 and fom same. Because, Input 'b' is a letter. onto ne) 4 For input symbol b | Foy NT ty Th={} ‘vat, ‘: eclosure (Ts) . ed zy apy Mu, 12) Tae ‘ yo ~ + =o T= ‘Te = {10} a “. e-closure (¢) = a e-closure (10)= G Abs bee ge 4 e-closure or s os ; @=9 t{-, 4 + 1 ~} = Forinput symbola | For input symbol b | For input [For state D] ay == (19, teN=neg a SUAnnne * eelosure 1 e-closure (Ts) = é-closure (I\) =e-closure (T.)=$=D * = Be ee = 110, 9, ‘Transition Table and Diagram for DFA will be :— =F ees a b c Here : C={3, 13 (itial State) ATB D A=0,14 uy E D D B={} Tal-. od De G C= 6, 6,8,9, 11, 12) D D=9 Tag h=¢ he=¢ é D D : D D E={3, 12) closure or $ e-closure Ty =p e-closure (I)=¢ ao a pees . - f = ), 9, 11, 12} Resa] as D D G G={10, CHT & & 9 11, ay 7 Hobbes yy TH. a, to, Wad Aste T= 1 felosure (1s) e-closure = e-dlosure (7) ‘ KEY POINTS TO REMEMBER asia Analyeor:Ttreads the sour Program and eonyg Input Buffering :To recognize complete token, buffering iy, red illbbulfered uti its prefix tobe read combined to tay 1 » Regular Expression isa standard notation used to rep ate ais oftokens. TeSent te i + Fite Automate (Transition Diagrams) are used to "CORN; ‘Transition Diagram are flowcharts which consist of ‘Types of Finite Automata DFA Determinstic Finite Automata) NFA (Non-Determinstic Finite Automata) LEX: LEX isa software tool which automat eee tit is a language used for ‘LEX Program consist of two Parts Auxiliary Definitions ‘Translation Rules : tically genera te Lexi SPEFing cr Repl ' REVIEW QUESTIONS ize the number of states of a DFA andy a 8. Defi Ry example, an NFA into an equivalent DFA. Explain) fine Regular Express Q4. Explain Pressions and vari the Various rules t Sanat ne ‘te of constructing NFA ted construct Regular Eipe Qs, a mM a regular Expressie! CONTEXT FREE GRAMMAR 3.1 INTRODUCTION Grammar :A Grammar isa set-of rules which checks whether a string belongs to a particular language a not. A program consists of a various string of characters . But, every string is not a_ proper or meaningful string. So, to identify valid strings in a language, some rules should be specified to check whether string is valid or not. These rules are nothing but actually make a Grammar. E.g In English language, Grammar checks whether string of characters is acceptable or not i.e, checks whether nouns, verbs, adverbs ete are in proper sequence. stor [ Boxtsh ares) - Geamnier ‘Not Acceptable Fig. 3.1: Grammar Context Free Grammar : It is a notation used to specify the syntax of language. Context free Grammar are used to design parsers. As lexical analyzer generate a string of tokens which are given to parser to construct parse tree . But , before constructing parse tree, these tokens will be grouped so that result of grouping’ will be valid construct of a language. So, to specify constructs of language, a suitable notation is used which will be precise & easy to understand. ‘This notation is Context free Grammar. Formally, Context free Grammar (@) can defined as: It isa 4-tuple (V,Z, P, S) of Productions oF Set of rules 3. Pisaset sie sym ; 1 Pea ln etm ng Gis con! : e(VUE) rer: tritedown Grammar [or language Exar parce Ans, Let G=W,2P,8) \ v=) where vee P=(S-aS Soa } ‘These productions generate the language a* ie So a $8 2aS aa or at S saaaora® S ea8 > 8 2 aS anS saaaS In Context Free Grammar , variabl ane » variable or Non-Ter ide of » (Arrow). These symbols will be expanded titi all tan generated. ‘The terminal eymbols are basically tokens w: Example 2: Find out la G=((S), (0,b1S-+asb, HS >aSb, $+ Ans. S = ash 2 > anSbb aaaSdbp aaaaSbyny Language L= frrfnay} 018 occur ial symbols xy sed in a language, mnguage generated by a Grammar te 2 ted Free Grammar {ample 3 : Write down CFG G generating all integers. as.Let G =(,3,P,S) where ‘V =(S, ,,} E=(0,1,2,3, e008 +.) P ={ S + > + |- -+ | igit> + 0111218..roe 19 For e.g, the derivation for ~ 12 is given by : S> = ~ =~ = -l > -12. \ ample 4: Write a Contest free Grammar for palindrome Language L= {wew"|w € (a,b)"} having middle symbol c. Here R means reverse of string. ns. Language represents a palindrome with a middle symbol 'c! Grammar _G =(V, 3, P, S) will have productions, S+aSa S+bSb S+c For e.g. To generate Palindrome abc a, derivation is S= aSa = abSba = abeba ixample 5 : Write CFG for language which generates palindrome of binary numbers. ms, #0801181 S+o0llle For e.g. To generate string 0110, derivation is S=0S0 = 01810 = 01610 = 0110 pian w acing a given string’s non-terminal 2 2 DERIVATIONS r Y Tight \fre0 Grammar n ee rep ful erivation means ef application, of rules that = : aie rae, The et aartne symbol is called derivatioy Produces RS GPARSE ‘TREES (DERIVATION TREES) sagt emia! ering, EREOnIng TD tart Sol by \\ parse Tree for a CFG G = (V,,P, 8) is tree eatisying following conditions replacing a variable Y SO guage is called context Free lang, Set of: Root has label S ‘pale can deriv ne Z Buage, “leg, Every vertex has label which can be a variable (V) , terminal (E) or e. Darian a ee \. A+ CC production, then C1, C2 ....:4Ca are children of ‘Types o! : A dusivation’A Sw i node labeled A. vation : A derivation A =} w is called j : @ papal eae w only to the leftmost variable a etter Leaf Node’ are terminals (2) & Interior nodes are variables (V). we spy nr tuner of derivations, VEIY ston, Wigld ied of Derivation Tree is the concatenation ofthe labels ofthe leaves (b). Rightmost Adorivation A > w is a rightmog, a to right ordering. . aple 7: If CFG has productions. yroduction ightmost variable at eve; esoply Pied Canonical Derivation TY step, Example 6: Let G be a OFG with productions. S+AA : AvaB Bob Bes Find (1) Leftmost (2) Rightmost derivatioy Ans.: (1) Leftmost Derivation: for string ab s Saa 3 aBA 3 ava 3 abap 3 abab (2) Rightmost Derivation; 8 3S aa 3 aap 3 Aad 3 apap 3 abab S>aAS\d A +SbA|SS| ba Show that $ > aa bb aa & construct parse tree whose yield is aa bb aa. tn s3 aas = agbas = aabAS = aabbaS = aabbaa » Se aabbas Derivation Tree : Yield = Left to Right ordering of leaves. = aa bb aa mple 8 : Consider the CFG S > dBlaA aad Ab B +a|aS|bBB E Find (a) Leftmost (b) Rightmost derivation for string b aa baba. Also find vation Trees. ation NN ass ati Deri oN ge0B 2 bOBB bbe abbas abbaabB abhaabaS boas va \ co bban bab ; co bbaababe “N Derivation Tre for Rightinostp, ti /\ my B MAN S, uta WA \, / JN Frame Cie gonnar \ Solna a Is nt nd i 8 (eleanor: ‘, Derivation Tree for hing et Fe Canna eo (as(aa) efimost Derivation Rightmost Derivation 83 2 OS) Rightmost Derivation 83 ee rn . DB der Grammar given below : sy Bxample10 Cons! "po BHE |BeE lid sehtmast Derivation for string. / ext Five Grammer 5 1p Leto (2) RighimostF dente idtid+id 7. j. For string id + id « id, there exits two parse trees. ‘Ans. (1) Leftmost Derivation: ‘ EE E+E ia 4 | eid+E = EtE+E AA if Ogee eit E+E ¥ IN | wid +id«E eidtid+E fo 7 Pow eidtideid eid+id tid fa | a ES EE sE+EsE AN 2 id+ E+E sf sid+id*E 7 é Hee: oid + id id | 2 E+id+id | TA | \ ., same string is gonerated using two different leftmost id ia eidtid+ia id & { \ srivations. Bach having a different parse treo. | | ‘Two different parse trees exist for string id + id + id. 33.1 Ambiguity ia 4 Grammar is Ambiguous. 7 3.2 Conversion of Ambiguous to Unambiguous Grammar A context free Grammar G is non-terminal one derivation tree for a given string, ‘The productions containing more than one occurrence of a given Said to be ambiguous if there exists more th | iés right hand side are ambiguous productions. AGrammar that cae Ee. E+E*E pas rd 7 E Dustin rth ame eae att one Lettnost Derivation (or Right It contain two E's on RHS m i ightmon in convert ‘the of Ambiguous productions to Unambiguous 4. Follouing sentence in Eng ambiguous Grammar. 2 aa, a ee or Non serminal t reece sect of theo none “tn Books selected informa arminal farther down the Parse Tree. U gives wo diferent men Se” txample 12 : Consider the Grammar ~ In Books, “elected nie E+ E+E |BE| lid In Boole ett information is given Convert this ambiguous Grammar into unambiguous Grammar. So thegpalt tele, “information is giver Ans, Stop 1 Introduction a non terminal T term) which eannot be further addition ee ve Sentence can said as an a. Mf two terms. ple iguous one ir . -EtT 1. Ve us one in language of English. A Re, a jon E + T. Substitute E yas, we have production with 'T step 2:48 port nonterminal F (factor) which cannot bo g, yumbers. “ther _ other tan of 02 with F to get, T>TFIF F@lid ‘Unambiguous Grammar will be E+E+TIT T>TeFIF Fo@ lid Brample 18, Prove that following grammar is Ambiguous for the sy 2 then s] else s2 ~ if then < statement > cannot ‘Substitute T lif then else | other statement Convert it into Unambiguous grammar. Ans. Given grammar is Ambig in iguous. Since, there exists t ee b Wo parse Trees i Because else condition can belong to any if statement a it ond> then | a ' 2s 3 " then = else I= TO then else ! ASN | | 2 a In above parse Tree, else belongs to first if. Conversion to Unambiguous grammar ‘The Unambiguous grammar will be : > | ifcond >thenelse lother-statement -rif then |ifthen else ha Grammar: A set of rules which checks whether string belongs to a particular language or not. Context Tree Grammar: It is 4-tuple (V, 3, P, S) where production (P) is of form Ava where AEV anda € (VU). KEY POINTS TO REMEMBER Derivation : Replacing Non-terminal of string by R.H.S of production. ‘Types of Derivation : (@ _Leftmost Derivation (&) Rightmost Derivation Parse Trees: A tree for Grammar G = (V, 5, P, S) has following attributes = + Root has label S Every vertex can have label as a variable (V), terminal (©) or e. IfA+ CC: Ca, then C1, Cz ‘n, are children of node A ‘Ambiguous Grammar :A Grammar having two or more Parse Trees fore given string. oN 78 REVIEW QUESTIONS —~ Q.1. Find (a) Leftmost (6) Rightmost derivation for... . > 0B/1A “ne O04, A> 010S|1AA 4 B1/1S|0BB Q.2, IfG is the grammar S + SbS | a, show that G is a Q.3. What are Leftmost and Rightmost derivations, Q.4. Consider the grammar S+aB|bA A-a&S | bAAla B->bS | aBBIb For the string aaabbabbba, find (a) The Leftmost derivation (6) Rightmost definition (© Parse Tree Q.5. Consider one grammar S-@Qla L-LS/S _, Find parse tree for following 1 @- @a) @. @@,a)) Q.6. For the given grammar Ze S+aS/SBld B~ Bbic Show grammar is ambiguous for string “aadccb”. Mbiguoys

You might also like