CD Final

1. What is the difference between a compiler and an interpreter? / Define the following terms: Compiler and Interpreter.

➢ Compiler:- A compiler is a program that reads a program written in one language and translates it into an equivalent program in another language.
➢ Interpreter:- An interpreter is a program that translates high-level program statements into machine code and executes them directly, one statement at a time, without producing a separate object program. It can work on pre-compiled code, source code, and scripts.
| COMPILER | INTERPRETER |
| Takes the entire program as input. | Takes a single instruction as input. |
| Intermediate code is generated. | No intermediate code is generated. |
| Memory requirement is more. | Memory requirement is less. |
| Errors are displayed after the entire program is checked. | Errors are displayed for every instruction interpreted. |
| Compiled code runs faster. | Interpreted code runs slower. |
| Its basic working model is the linking-loading model. | Its basic working model is interpretation. |
| Generates an output file (e.g., .exe). | Does not generate any output file. |
| Errors are displayed all together after compiling. | Errors are displayed line by line. |
| Usually takes more time to analyze the source code. | Takes less time to analyze the source code. |
| Mostly used in production environments. | Mostly used in programming and development environments. |
| Examples: C, C++, C#. | Examples: Python, Ruby, Perl, SNOBOL, MATLAB. |
2. Difference between Top-down and Bottom-up parser.
| TOP-DOWN PARSING | BOTTOM-UP PARSING |
| A parsing strategy that starts at the highest level of the parse tree and works down the tree using the rules of the grammar. | A parsing strategy that starts at the lowest level of the parse tree and works up the tree using the rules of the grammar. |
| Attempts to find the leftmost derivation for an input string. | Attempts to reduce the input string to the start symbol of the grammar. |
| Parsing starts from the top (start symbol, root of the parse tree) and proceeds down to the leaf nodes. | Parsing starts from the bottom (leaf nodes of the parse tree) and proceeds up to the start symbol. |
| Uses leftmost derivation. | Uses rightmost derivation in reverse. |
| The main decision is which production rule to use in order to construct the string. | The main decision is when to apply a production rule to reduce the string toward the start symbol. |
| Example: recursive-descent parser. | Example: shift-reduce parser. |
3. Differentiate synthesized and inherited attributes. S-attribute and L-attribute.
| SYNTHESIZED ATTRIBUTE | INHERITED ATTRIBUTE |
| An attribute is synthesized if its value at a parse-tree node is determined by the attribute values at the child nodes. | An attribute is inherited if its value at a parse-tree node is determined by the attribute values at the parent and/or sibling nodes. |
| The production must have a non-terminal as its head. | The production must have a non-terminal symbol in its body. |
| A synthesized attribute at node n is defined only in terms of attribute values at the children of n and at n itself. | An inherited attribute at node n is defined only in terms of attribute values of n's parent, n itself, and n's siblings. |
| Can be evaluated in a single bottom-up traversal of the parse tree. | Can be evaluated in a single top-down and sideways traversal of the parse tree. |
| Can be associated with both terminals and non-terminals. | Can be associated only with non-terminals. |
| Used by both S-attributed and L-attributed SDTs. | Used only by L-attributed SDTs. |
4. Write the difference between static, stack and heap allocation.
| Static allocation | Stack allocation | Heap allocation |
| Storage for all data objects is laid out at compile time. | The stack is used to manage run-time storage. | The heap is used to manage dynamic memory allocation. |
| Data structures cannot be created dynamically, since the compiler must be able to determine the amount of storage required by each data object. | Data structures and data objects can be created dynamically. | Data structures and data objects can be created dynamically. |
| Memory allocation: the names of data objects are bound to storage at compile time. | Memory allocation: activation records and data objects are pushed onto the stack in Last In First Out (LIFO) order; addressing is done using an index and registers. | Memory allocation: a contiguous block of memory from the heap is allocated for an activation record or data object; a linked list is maintained for free blocks. |
| Merits and limitations: simple to implement, but supports static allocation only; recursive procedures are not supported. | Merits and limitations: supports dynamic memory allocation and recursive procedures, but is slower than static allocation, and values in an activation record cannot be retained after the activation ends. | Merits and limitations: efficient memory management using the free list; deallocated space can be reused; but since blocks are allocated using best fit, holes may be introduced in the memory. |
5. Differentiate between parse tree and syntax tree.
| Parse Tree | Syntax Tree |
| Typically more detailed and larger, since it records the complete derivation of the source code. | Simpler and more abstract; includes only the information necessary to generate intermediate or machine code. |
| Interior nodes are non-terminals; leaves are terminals. | Interior nodes are operators; leaves are operands. |
| Rarely constructed as an explicit data structure. | Usually the structure used when representing a program as a tree. |
| Represents the concrete syntax of a program. | Represents the abstract syntax (semantics) of a program. |
6. Difference between ambiguous and unambiguous grammar.
| AMBIGUOUS GRAMMAR | UNAMBIGUOUS GRAMMAR |
| For some string, the leftmost and rightmost derivations yield different parse trees. | For every string, the leftmost and rightmost derivations yield the same parse tree. |
| Tends to have fewer non-terminals than an equivalent unambiguous grammar. | Tends to have more non-terminals than an equivalent ambiguous grammar. |
| The parse tree is comparatively short. | The parse tree is comparatively large. |
| Derivation of a tree is faster than in an unambiguous grammar. | Derivation of a tree is slower than in an ambiguous grammar. |
| Generates more than one parse tree for some string. | Generates exactly one parse tree for each string. |
| Contains ambiguity. | Contains no ambiguity. |
7. Compare: Static v/s Dynamic Memory Allocation.
| STATIC MEMORY ALLOCATION | DYNAMIC MEMORY ALLOCATION |
| Memory is allocated before execution of the program begins. | Memory is allocated during execution of the program. |
| No memory allocation or de-allocation actions are performed during execution. | Memory bindings are established and destroyed during execution. |
| Variables remain permanently allocated. | Memory is allocated only while the program unit is active. |
| Implemented using the data segment, with fixed addresses. | Implemented using stacks and heaps. |
| No pointers are needed to access the variables. | Pointers are needed to access dynamically allocated memory. |
| Faster execution than dynamic. | Slower execution than static. |
| More memory space is reserved up front. | Memory is consumed only as needed. |
8. Give the difference between SLR, CLR and LALR Parser.
| SLR | CLR | LALR |
| Very easy and cheap to implement. | Expensive and difficult to implement. | Also easy and cheap to implement. |
| The parsing table is the smallest. | The parsing table is the largest, as the number of states is very large. | LALR and SLR tables have the same size, with fewer states than CLR. |
| Error detection is not immediate. | Error detection can be done immediately. | Error detection is not immediate. |
| Fails to produce a parsing table for certain classes of grammars. | Very powerful; works on a large class of grammars. | Intermediate in power between SLR and CLR, i.e., SLR ≤ LALR ≤ CLR. |
| Requires the least time and space. | Requires the most time and space. | Requires more time and space than SLR. |
9. Explain types of compiler.
➢ There are various types of compilers, as follows:
• Traditional Compilers (C, C++, Pascal):- These compilers transform a source program in a high-level language into its equivalent native machine program or object program.
• Interpreters (LISP, SNOBOL, Java 1.0):- These first convert source code into an intermediate code and then interpret (emulate) it, producing the effect of the equivalent machine code.
• Cross-Compilers:- These are compilers that run on one machine and produce code for another machine. A cross-compiler makes executable code for a platform other than the one on which the compiler itself runs.
• Incremental Compilers:- An incremental compiler recompiles only the changed source instead of recompiling the complete source code.
• Converters (e.g., COBOL to C++):- These programs compile from one high-level language to another.
• Just-In-Time (JIT) Compilers (Java):- These are runtime compilers from intermediate language to executable native machine code. They implement type-based verification, which makes the generated code more reliable.
• Ahead-of-Time (AOT) Compilers (e.g., .NET ngen):- These are pre-compilers to native code for Java and .NET.
• Binary Compilation:- These compilers translate the object code of one platform into the object code of another platform.
• Single-Pass Compiler:- When all the phases of the compiler are contained in a single module, it is called a single-pass compiler. It converts source code to machine code in one pass.
• Two-Pass Compiler:- A compiler in which the program is translated twice: once by the front end and once by the back end.
• Multi-Pass Compiler:- When several intermediate codes are created and the syntax tree (or source) is processed many times, it is called a multi-pass compiler. It breaks the code into smaller pieces.
10. Explain Shift reduce parser with suitable example. / Explain shift reduce parsing technique in brief.
➢ Shift: Moving symbols from the input buffer onto the stack; this action is called shift.
• Reduce: If a handle appears on top of the stack, it is reduced by the appropriate production rule; this action is called reduce.
• Accept: If the stack contains only the start symbol and the input buffer is empty at the same time, that action is called accept.
• Error: A situation in which the parser can neither shift nor reduce the symbols, and cannot perform the accept action, is called an error.
 Example:- Grammar: E → E + T | T, T → T * F | F, F → id. Perform shift-reduce parsing for the string id + id * id.
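 The trace, worked step by step ($ marks the bottom of the stack and the end of the input):
| Stack | Input | Action |
| $ | id + id * id $ | shift |
| $ id | + id * id $ | reduce F → id |
| $ F | + id * id $ | reduce T → F |
| $ T | + id * id $ | reduce E → T |
| $ E | + id * id $ | shift |
| $ E + | id * id $ | shift |
| $ E + id | * id $ | reduce F → id |
| $ E + F | * id $ | reduce T → F |
| $ E + T | * id $ | shift |
| $ E + T * | id $ | shift |
| $ E + T * id | $ | reduce F → id |
| $ E + T * F | $ | reduce T → T * F |
| $ E + T | $ | reduce E → E + T |
| $ E | $ | accept |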
11. What is front-end and back-end of compiler?
➢ Front end:- The front end consists of those phases that depend primarily on the source language and are largely independent of the target machine. It includes lexical analysis, syntax analysis, semantic analysis, intermediate code generation, and creation of the symbol table. A certain amount of code optimization can also be done by the front end.
➢ Back end:- The back end consists of those phases that depend on the target machine and not on the source language. It includes code optimization and code generation, with the necessary error handling and symbol table operations.
12. List the cousins of the compiler and explain the role of each. / What is linker and loader. / Explain the roles of linker,
loader and preprocessor.
➢ In addition to a compiler, several other programs may be required to create an executable target program.
• Preprocessor:- A preprocessor produces input to the compiler. It may perform:
 Macro processing: a preprocessor may allow a user to define macros that are shorthand for longer constructs.
 File inclusion: a preprocessor may include header files in the program text.
 Rational preprocessing: such a preprocessor provides the user with built-in macros for constructs like while statements or if statements.
 Language extensions: these preprocessors attempt to add capabilities to the language by what amounts to built-in macros.
• Assembler:- A translator that takes an assembly program as input and generates machine code as output. An assembly program is a mnemonic version of machine code, in which names are used instead of binary codes for operations.
• Linker:- The linker allows us to make a single program from several files of relocatable machine code. These files may be the result of several different compilations, and one or more may be library files of routines provided by the system.
• Loader:- The process of loading consists of taking relocatable machine code, altering the relocatable addresses, and placing the altered instructions and data in memory at the proper locations.
13. Describe code generator design issues. / Explain various issues in the design of code generator.
➢ Input to code generator:- The input to the code generator is the intermediate code generated by the front end,
along with information in the symbol table that determines the run-time addresses of the data objects denoted by
the names in the intermediate representation. Intermediate codes may be represented mostly in quadruples,
triples, indirect triples, Postfix notation, syntax trees, DAGs, etc.
➢ Target program: The target program is the output of the code generator. The output may be:
 Assembly language: it allows subprograms to be compiled separately.
 Relocatable machine language: it makes the process of code generation easier.
 Absolute machine language: it can be placed in a fixed location in memory and executed immediately.
➢ Memory management:- Mapping names in the source program to addresses of data objects in run-time memory is done cooperatively by the front end and the code generator. We assume that a name in a three-address statement refers to a symbol table entry for the name.
➢ Instruction selection:- The instruction set of the target machine should be complete and uniform. When considering the efficiency of the target machine, instruction speed and machine idioms are important factors. The quality of the generated code is determined by its speed and size.
➢ Register allocation issues:- Computations using registers are faster than those using memory, so efficient utilization of registers is important. The use of registers is subdivided into two subproblems:
 During register allocation, we select the set of variables that will reside in registers at each point in the program.
 During a subsequent register assignment phase, the specific register is picked for each variable.
➢ Choice of evaluation order:- The order in which computations are performed can affect the efficiency of the target code. Some computation orders require fewer registers to hold intermediate results than others. Picking a best order is another difficult, NP-complete problem.
➢ Approaches to code generation:- The most important criterion for a code generator is that it produces correct code. Correctness takes on special significance because of the number of special cases that a code generator must face.
 Given the premium on correctness, designing a code generator so it can be easily implemented, tested, and maintained is an important design goal.
14. Explain: Error recovery strategies in compiler. / Explain all error recovery strategies used by parser.
➢ There are mainly four error recovery strategies:
• Panic Mode:- In this method, on discovering an error the parser discards input symbols one at a time. This process continues until one of a designated set of synchronizing tokens is found. Synchronizing tokens are delimiters such as semicolons or end; these tokens indicate the end of a statement.
 If there are only a few errors in the same statement, this strategy is the best choice.
• Phrase-Level Recovery:- In this method, on discovering an error the parser performs a local correction on the remaining input.
 The local correction can be replacing a comma by a semicolon, deleting an extraneous semicolon, or inserting a missing semicolon.
 This type of local correction is decided by the compiler designer.
 This method is used in many error-repairing compilers.
• Error Productions:- If we have good knowledge of the common errors that might be encountered, we can augment the grammar of the language with error productions that generate the erroneous constructs.
 We then use the grammar augmented by these error productions to construct a parser. If an error production is used during parsing, an appropriate error message can be generated and parsing can continue.
• Global Correction:- Given an incorrect input string x and a grammar G, the algorithm finds a parse tree for a related string y such that the number of insertions, deletions and changes of tokens required to transform x into y is as small as possible. Such methods increase the time and space requirements at parsing time.
 Global correction is thus mostly a theoretical concept.
15. List the functions of lexical analyzer.
➢ Tokenization: The lexical analyzer breaks the input source code into meaningful units called tokens, such as identifiers,
keywords, operators, literals, and punctuation symbols. Each token represents a specific lexical element in the
programming language.
• Removing Comments and Whitespace: The lexical analyzer filters out comments and whitespace from the source
code, as they are not relevant to the compilation process but may aid readability for humans.
• Error Handling: It identifies and reports lexical errors, such as invalid characters or tokens, to the compiler or the
programmer. It ensures that the compiler can provide meaningful feedback to the user about syntax issues.
• Symbol Table Management: In some implementations, the lexical analyzer may also interact with the symbol table to
manage identifiers and their associated attributes. It may perform tasks like symbol table lookup and insertion for
identifiers encountered in the source code.
• Handling Preprocessor Directives: In languages with preprocessor directives (e.g., C/C++), the lexical analyzer may
process these directives before passing the modified source code to the parser. This includes tasks such as file
inclusion, macro expansion, and conditional compilation.
• Generating Output for the Parser: Finally, the lexical analyzer produces an output stream of tokens that serves as input
for the parser. This stream provides the parser with the necessary information to analyze the syntax of the source
code and construct a parse tree or AST.
16. Explain Storage allocation strategies. / List and explain various storage allocation strategies.
➢ Static allocation: lays out storage for all data objects at compile time.
 In static allocation, names are bound to storage as the program is compiled, so there is no need for a run-time support package. Since the bindings do not change at run time, every time a procedure is activated its names are bound to the same storage locations.
 Therefore the values of local names are retained across activations of a procedure: when control returns to a procedure, the values of the locals are the same as they were when control left the last time.
➢ Stack allocation: manages the run-time storage as a stack.
 All compilers for languages that use procedures, functions or methods as units of user-defined actions manage at least part of their run-time memory as a stack. Each time a procedure is called, space for its local variables is pushed onto the stack, and when the procedure terminates, the space is popped off the stack.
➢ Heap allocation: allocates and de-allocates storage as needed at run time from a data area known as the heap.
 The stack allocation strategy cannot be used if either of the following is possible: the values of local names must be retained when an activation ends, or a called activation outlives the caller.
 Heap allocation parcels out pieces of contiguous storage as needed for activation records or other objects.
 Pieces may be de-allocated in any order, so over time the heap will consist of alternating areas that are free and in use. The record for an activation of a procedure r is retained when the activation ends.
17. Explain various code optimization techniques. / Explain any three code-optimization technique in detail.
➢ Compile-time evaluation:- Compile-time evaluation means shifting computations from run time to compile time. There are two methods used to obtain compile-time evaluation.
 Folding:- In the folding technique the computation of constants is done at compile time instead of run time.
 Constant propagation:- In this technique the value of a variable is substituted and the computation of an expression is done at compile time.
• Common sub-expression elimination:- A common sub-expression is an expression that appears repeatedly in the program and has been computed previously. If the operands of this sub-expression do not change at all, the result of the earlier computation is used instead of re-computing it each time.
• Variable propagation:- Variable propagation means the use of one variable instead of another.
• Code movement:- There are two basic goals of code movement: i) to reduce the size of the code, and ii) to reduce the frequency of execution of code.
 Loop-invariant computation:- Loop-invariant optimization moves code whose result does not change inside the loop to just before the loop is entered. This method is also called code motion.
• Strength reduction:- The strength of certain operators is higher than that of others; for instance, the strength of * is higher than that of +. In this technique, higher-strength operators are replaced by lower-strength operators.
• Dead code elimination:- A variable is live at a point in a program if its value is used subsequently, and dead if its value is never used afterwards. Code that only computes values of dead variables is dead code, and an optimization can be performed by eliminating it. A combined before/after sketch follows.
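A hypothetical before/after fragment illustrating several of these techniques together (the function and variable names are invented for illustration):
```python
# Before optimization:
def before(a, b):
    x = 4 * 60              # evaluable at compile time
    y = a * b + 2
    z = a * b + 4           # a * b is a common sub-expression
    w = y * 2               # multiplication by 2: strength-reducible
    dead = z + 99           # value never used afterwards -> dead code
    return x + y + z + w

# After folding, common sub-expression elimination,
# strength reduction and dead-code elimination:
def after(a, b):
    x = 240                 # constant folding of 4 * 60
    t = a * b               # common sub-expression computed once
    y = t + 2
    z = t + 4
    w = y << 1              # strength reduction: * 2 -> shift (valid for integers)
    return x + y + z + w
```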
18. Explain LALR parser in detail. Support your answer with an example.
➢ LALR Parser stands for lookahead LR parser. It is a powerful parser that can handle large classes of grammars. The size of the CLR parsing table is quite large compared to other parsing tables; LALR reduces this size. LALR works like CLR, with one difference: it combines the similar states of the CLR parsing table into one single state.
 The general form of an item is [A → α·B, a], where A → α·B is a production and a is a terminal or the right-end marker $; an LR(1) item is an LR(0) item plus a lookahead.
• Example:-
S→CC
C→aC|d
Augmented grammar: S′ → S, with initial item S′ → ·S, $.
The LR(1) item sets (closures) are:
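I0: S′ → ·S, $ ; S → ·CC, $ ; C → ·aC, a|d ; C → ·d, a|d
I1 = goto(I0, S): S′ → S·, $
I2 = goto(I0, C): S → C·C, $ ; C → ·aC, $ ; C → ·d, $
I3 = goto(I0, a): C → a·C, a|d ; C → ·aC, a|d ; C → ·d, a|d
I4 = goto(I0, d): C → d·, a|d
I5 = goto(I2, C): S → CC·, $
I6 = goto(I2, a): C → a·C, $ ; C → ·aC, $ ; C → ·d, $
I7 = goto(I2, d): C → d·, $
I8 = goto(I3, C): C → aC·, a|d
I9 = goto(I6, C): C → aC·, $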
 Now we merge states 3 and 6, then 4 and 7, then 8 and 9:
I36: C → a·C, a|d|$
     C → ·aC, a|d|$
     C → ·d, a|d|$
I47: C → d·, a|d|$
I89: C → aC·, a|d|$
 Parsing table:
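| State | a | d | $ | S | C |
| 0 | s36 | s47 | | 1 | 2 |
| 1 | | | accept | | |
| 2 | s36 | s47 | | | 5 |
| 36 | s36 | s47 | | | 89 |
| 47 | r3 | r3 | r3 | | |
| 5 | | | r1 | | |
| 89 | r2 | r2 | r2 | | |
where r1 = reduce by S → CC, r2 = reduce by C → aC, r3 = reduce by C → d.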
19. Explain different types of intermediate code.
➢ There are three types of intermediate representation:-
• Abstract syntax tree:- A syntax tree depicts the natural hierarchical structure of a source program. A DAG (Directed Acyclic Graph) gives the same information in a more compact way, because common sub-expressions are identified. For the assignment statement a = b*-c + b*-c, the syntax tree has assign at the root and two identical subtrees for b * (uminus c); in the DAG that common subtree appears only once.
• Postfix notation:- Postfix notation is a linearization of a syntax tree. In postfix notation the operands occur first and then the operators are arranged. The postfix notation for the syntax tree above is: a b c uminus * b c uminus * + assign.
• Three-address code:- Three-address code is a sequence of statements of the general form a := b op c,
 where a, b and c are operands that can be names or constants, and op stands for any operator.
 An expression like a = b + c + d might be translated into the sequence: t1 = b + c; t2 = t1 + d; a = t2.
 Here t1 and t2 are temporary names generated by the compiler.
 There are at most three addresses allowed (two for operands and one for the result).
20. Explain different representation of three address code.
➢ There are 3 representations of three address code namely:-
• Quadruple:- It is a structure which consists of 4 fields namely op, arg1, arg2 and result. op denotes the operator and
arg1 and arg2 denotes the two operands and result is used to store the result of the expression.
 Example:- Consider the expression a = b * -c + b * -c. The three-address code is:
t1 = uminus c (unary minus operation on c)
t2 = b * t1
t3 = uminus c (another unary minus operation on c)
t4 = b * t3
t5 = t2 + t4
a = t5 (assignment of t5 to a)
The quadruple representation is:
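| # | op | arg1 | arg2 | result |
| (0) | uminus | c | | t1 |
| (1) | * | b | t1 | t2 |
| (2) | uminus | c | | t3 |
| (3) | * | b | t3 | t4 |
| (4) | + | t2 | t4 | t5 |
| (5) | = | t5 | | a |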
• Triples:- This representation doesn't use extra temporary variables to represent a single operation; instead, when a reference to another triple's value is needed, a pointer to that triple is used. So it consists of only three fields: op, arg1 and arg2.
 Example:- Consider the expression a = b * -c + b * -c; its triples are:
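| # | op | arg1 | arg2 |
| (0) | uminus | c | |
| (1) | * | b | (0) |
| (2) | uminus | c | |
| (3) | * | b | (2) |
| (4) | + | (1) | (3) |
| (5) | = | a | (4) |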

• Indirect Triples:- This representation uses a list of pointers to the triples, which is made and stored separately. It is similar in utility to the quadruple representation but requires less space. Temporaries are implicit, and it is easier to rearrange code.
 Example:- Consider the expression a = b * -c + b * -c. The instruction list points to the triples:
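| Instruction | Triple |
| 35 | (0) |
| 36 | (1) |
| 37 | (2) |
| 38 | (3) |
| 39 | (4) |
| 40 | (5) |
(The statement numbers 35-40 are arbitrary; the triples (0)-(5) are the same as in the triple representation above. Reordering code only requires permuting this list, not renumbering the triples.)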
21. Define lexemes, patterns and token.
➢ Token: A sequence of characters having a collective meaning is known as a token. Typical tokens are identifiers, keywords, operators, special symbols, and constants.
➢ Pattern: The set of rules that describes a token is called a pattern.
➢ Lexeme: A sequence of characters in the source program that matches the pattern for a token is called a lexeme. For example, in count = 10 the lexeme count matches the identifier pattern and yields an id token.
22. Explain the term “activation record” and explain activation tree?
➢ Activation Record:- An activation record typically contains the following fields:
• Temporary values: temporary variables needed during the evaluation of expressions are stored in this field of the activation record.
• Local variables: data that is local to the executing procedure is stored in this field.
• Saved machine registers: this field holds the status of the machine just before the procedure is called; it contains the registers and the program counter.
• Control link: this field is optional. It points to the activation record of the calling procedure and is also called the dynamic link.
• Access link: this field is also optional. It refers to non-local data in other activation records and is also called the static link.
• Actual parameters: this field holds information about the actual parameters passed to the called procedure.
• Return values: this field is used to store the result of a function call.
➢ Activation tree:- An activation tree is used to depict the way control enters and leaves activations. In an activation tree,
 Each node represents an activation of a procedure.
 The root represents the activation of the main program.
 The node for a is the parent of the node for b if and only if control flows from activation a to b.
 The node for a is to the left of the node for b if and only if the lifetime of a occurs before the lifetime of b.
23. Explain various parameter passing methods.
➢ Call by value:- This is the simplest method of parameter passing.
 The actual parameters are evaluated and their r-values are passed to the called procedure.
 Operations on the formal parameters do not change the values of the actual parameters.
 Example: languages like C and C++ use this parameter passing method by default.
• Call by reference:- This method is also called call by address or call by location.
 The l-value, i.e. the address of the actual parameter, is passed to the called routine's activation record.
• Copy restore:- This method is a hybrid between call by value and call by reference, also known as copy-in-copy-out or value-result. During execution of the called procedure, the actual parameter's value is not affected.
 If the actual parameter has an l-value, then at return the value of the formal parameter is copied back to the actual parameter.
• Call by name:- This is a less popular method of parameter passing.
 The procedure is treated like a macro: the procedure body is substituted for the call in the caller, with actual parameters substituted for formals. The actual parameters can be surrounded by parentheses to preserve their integrity.
 The local names of the called procedure are kept distinct from the names of the calling procedure.
24. Explain Basic Block with example. / Define basic block with simple example.
➢ A Basic Block is a straight-line code sequence with no branches in except to the entry and no branches out except at the end. A basic block is a set of statements that always execute one after another, in sequence.
 The first task is to partition a sequence of three-address codes into basic blocks. In the absence of a jump, control
moves further consecutively from one instruction to another. The idea is standardized in the algorithm below:
• Algorithm:- Partitioning three-address code into basic blocks.
• Input:- A sequence of three address instructions.
• Process:- The following are the rules used for finding a leader:
 The first three-address instruction of the intermediate code is a leader.
 Instructions that are targets of unconditional or conditional jump/goto statements are leaders.
 Instructions that immediately follow unconditional or conditional jump/goto statements are considered leaders.
• Example:- The following sequence of three-address statements forms a basic block:
t1 := a*a
t2 := a*b
t3 := 2*t2
t4 := t1+t3
t5 := b*b
t6 := t4+t5
 A three-address statement x := y + z is said to define x and to use y and z. A name in a basic block is said to be live at a given point if its value is used after that point in the program, perhaps in another basic block. A sketch of the partitioning algorithm follows.
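A minimal sketch of the leader-based partitioning described above (illustrative; representing jump targets as an explicit index map is an assumption of this sketch):
```python
def find_basic_blocks(instrs, jump_targets):
    """instrs: list of three-address instructions (strings).
    jump_targets: maps the index of each jump instruction to the
    index of its target instruction."""
    leaders = {0}                        # rule 1: first instruction is a leader
    for i, target in jump_targets.items():
        leaders.add(target)              # rule 2: the target of a jump is a leader
        if i + 1 < len(instrs):
            leaders.add(i + 1)           # rule 3: the instruction after a jump is a leader
    order = sorted(leaders)
    # each block runs from one leader up to (but not including) the next
    return [instrs[a:b] for a, b in zip(order, order[1:] + [len(instrs)])]

code = ["t1 := a*a", "t2 := a*b", "if t2 > 0 goto 4", "t3 := 2*t2", "t4 := t1+t3"]
print(find_basic_blocks(code, {2: 4}))
# [['t1 := a*a', 't2 := a*b', 'if t2 > 0 goto 4'], ['t3 := 2*t2'], ['t4 := t1+t3']]
```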
25. Explain Flow Graph. / Explain Dominators, Natural loop, inner loop.
➢ Dominators:- In a flow graph, a node d dominates a node n if every path from the initial node to n goes through d. This is denoted 'd dom n'. The initial node dominates all the remaining nodes in the flow graph, and every node dominates itself.
 Node 1 is the initial node, so it dominates every node.
 Node 2 dominates 3, 4 and 5.
 Node 3 dominates itself; similarly node 4 dominates itself.
➢ Natural loops:- A loop in a flow graph can be denoted by an edge n→d such that d dom n. Such an edge is called a back edge, and a loop can have more than one back edge. If there is an edge p → q, then q is the head and p is the tail, and the head dominates the tail.
 The loop in the graph above can be denoted by 4→1, i.e. 1 dom 4; similarly 5→4, i.e. 4 dom 5. The natural loop of a back edge n→d is the set of all nodes that can reach n without going through d, with d itself added to this set.
 6→1 is a natural loop because we can reach all the remaining nodes from 6.
➢ Inner loops:- The inner loop is a loop that contains no other loop. Here the inner loop is the one with back edge 4→2, i.e. the cycle given by 2-3-4.
➢ Pre-header:- The pre-header is a new block created such that its successor is the header block. All the computations that can be moved before the header block are placed in the pre-header block.
➢ Reducible flow graph:- A reducible flow graph is one in which the edges can be partitioned into two types, forward edges and back edges, with the following properties:
 The forward edges form an acyclic graph.
 The back edges are edges whose heads dominate their tails.
A sketch of the dominator computation follows.
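A sketch of the classic iterative dominator computation (illustrative; the example graph is hypothetical, chosen to mirror the node numbering used above):
```python
def dominators(succ, entry):
    """succ: dict node -> set of successor nodes; entry: initial node."""
    nodes = set(succ)
    pred = {n: {m for m in nodes if n in succ[m]} for n in nodes}
    dom = {n: set(nodes) for n in nodes}     # start from "everything dominates"
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            # dom(n) = {n} U intersection of dom(p) over all predecessors p
            new = {n} | set.intersection(*(dom[p] for p in pred[n]))
            if new != dom[n]:
                dom[n], changed = new, True
    return dom

# hypothetical flow graph: 1->2, 2->3, 2->4, 3->5, 4->5, 5->2
g = {1: {2}, 2: {3, 4}, 3: {5}, 4: {5}, 5: {2}}
print(dominators(g, 1))   # node 2 appears in dom(3), dom(4) and dom(5)
```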
26. What is input buffering? Why it is used?
➢ There are mainly two techniques for input buffering,
• Buffer pair:- The lexical analyzer scans the input string from left to right one character at a time, so specialized buffering techniques have been developed to reduce the amount of overhead required to process a single input character. We use a buffer divided into two N-character halves, where N is the number of characters on one disk block.
 Code to advance the forward pointer:
if forward at end of first half then begin
    reload second half;
    forward := forward + 1
end
else if forward at end of second half then begin
    reload first half;
    move forward to beginning of first half
end
else forward := forward + 1;
• Sentinels:- If we use the buffer-pair scheme, then each time we move the forward pointer we must check that we have not moved off one of the halves; if we have, we must reload the other half. Thus, for each character read, we make two tests. We can reduce this to one test per character by placing a sentinel character (eof) at the end of each half.
 Code with sentinels:
forward := forward + 1;
if forward↑ = eof then begin
    if forward at end of first half then begin
        reload second half;
        forward := forward + 1
    end
    else if forward at end of second half then begin
        reload first half;
        move forward to beginning of first half
    end
    else /* eof within a half marks the end of the input */
        terminate lexical analysis
end;
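A Python rendering of the sentinel scheme (an illustrative sketch; the class name, the buffer size N, and the use of '\0' as the eof sentinel are all assumptions of this sketch):
```python
import io

EOF = "\0"          # sentinel character, assumed not to occur in the input
N = 4096            # characters per buffer half

class TwoBufferReader:
    def __init__(self, f):
        self.f = f
        # layout: [0..N-1] first half, [N] sentinel, [N+1..2N] second half, [2N+1] sentinel
        self.buf = [EOF] * (2 * N + 2)
        self._reload(0)
        self.forward = 0

    def _reload(self, start):
        data = self.f.read(N)
        self.buf[start:start + len(data)] = list(data)
        self.buf[start + len(data)] = EOF   # sentinel: end of half, or true end of input

    def next_char(self):
        ch = self.buf[self.forward]
        if ch != EOF:
            self.forward += 1
            return ch
        if self.forward == N:               # sentinel at end of first half
            self._reload(N + 1)
            self.forward = N + 1
        elif self.forward == 2 * N + 1:     # sentinel at end of second half
            self._reload(0)
            self.forward = 0
        else:
            return None                     # eof inside a half: real end of input
        return self.next_char()             # at most one retry after a reload

reader = TwoBufferReader(io.StringIO("x = a + b * 50"))
print("".join(iter(reader.next_char, None)))   # x = a + b * 50
```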
27. Explain the functions of a translator.
➢ A translator converts code or text from one form to another, facilitating communication between systems, languages,
or formats. It analyzes syntax and semantics, ensures correctness, and generates output in the desired form.
28. Write a brief note on input buffering techniques.
➢ It is an important concept in compiler design that refers to the way in which the
compiler reads input from the source code. In many cases, the compiler reads input
one character at a time, which can be a slow and inefficient process.
 The basic idea behind input buffering is to read a block of input from the source
code into a buffer and then process that buffer before reading the next block.
 One of the main advantages of input buffering is that it can reduce the number
of system calls required to read input from the source code.
• One Buffer Scheme:- In this scheme, only one buffer is used to store the input string. The problem with this scheme is that if a lexeme is very long it crosses the buffer boundary; to scan the rest of the lexeme the buffer has to be refilled, which overwrites the first part of the lexeme.
• Two Buffer Scheme:- To overcome the problem of the one-buffer scheme, two buffers are used to store the input string. The two buffers are scanned alternately; when the end of the current buffer is reached, the other buffer is filled.
29. Explain various methods of peephole optimization.
➢ Peephole optimization is a code optimization technique that focuses on improving the efficiency of small, localized
code segments, known as "peepholes." The goal is to identify and replace specific code patterns with more efficient
alternatives, without changing the overall behavior of the program.
• Constant Folding:- Identifying expressions that can be evaluated at compile-time and replacing them with their
constant values. Example: Replacing `2 + 3` with `5`.
• Constant Propagation:- Tracking the values of variables that are assigned constant values and replacing subsequent
uses of those variables with the constant values. Example: Replacing `x = 5; y = x + 2;` with `x = 5; y = 7;`.
• Common Subexpression Elimination:- Identifying and removing redundant computations of the same expression.
Example: Replacing `a = b + c; d = b + c;` with `a = b + c; d = a;`.
• Dead Code Elimination:- Identifying and removing code that is never executed and has no observable effect on the
program's behavior. Example: Removing `if (false) { // some code }`.
• Strength Reduction:- Replacing expensive operations with cheaper, but semantically equivalent, operations. Example:
Replacing `x = x * 2;` with `x = x << 1;`.
• Instruction Scheduling:- Reordering instructions to improve the overall performance of the code, taking into account
processor pipeline constraints and dependencies. Example: Rearranging instructions to minimize stalls and maximize
instruction-level parallelism.
• Conditional Branch Simplification:- Simplifying conditional branches by exploiting known conditions or values. Example: Replacing `if (x != 0) { ... }` with `if (x) { ... }`, or removing the branch entirely when the condition is a known constant.
30. Write a short note on Symbol table management.
➢ Managing a symbol table involves inserting, searching, and deleting entries.
• Inserting Entries:- When the compiler encounters a new identifier, it needs to be added to the symbol table. This process involves creating a new entry with the identifier's name, type, scope, and any other relevant information. The insertion process also involves checking for a duplicate identifier in the same scope and raising an error if one is found.
• Searching Entries:- When the compiler encounters an identifier in an expression or statement, it must look it up in the
symbol table. Searching involves finding the relevant entry for the identifier based on its name and scope. For
example, when processing a statement like y = x + 1;, the compiler searches for x in the symbol table to determine its
type and memory location.
• Deleting Entries:- As the compiler processes the program, it may enter and exit various scopes. When exiting a scope,
the entries associated with that scope should be removed from the symbol table. This cleanup process prevents
memory leaks and ensures that identifiers from different scopes don't conflict.
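A minimal scoped symbol table sketch (a stack of dictionaries is one common design, assumed here for illustration):
```python
class SymbolTable:
    def __init__(self):
        self.scopes = [{}]                      # global scope

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()                       # drops all entries of the scope

    def insert(self, name, **attrs):
        scope = self.scopes[-1]
        if name in scope:                       # duplicate in the same scope
            raise KeyError(f"duplicate declaration of {name!r}")
        scope[name] = attrs

    def lookup(self, name):
        for scope in reversed(self.scopes):     # innermost scope first
            if name in scope:
                return scope[name]
        return None

st = SymbolTable()
st.insert("x", type="int", addr=0)
st.enter_scope()
st.insert("x", type="float", addr=4)            # shadows the outer x
print(st.lookup("x"))                           # {'type': 'float', 'addr': 4}
st.exit_scope()
print(st.lookup("x"))                           # {'type': 'int', 'addr': 0}
```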
31. Write R.E. for the language of all strings that do not end with 01.
➢ ε | 0 | 1 | (0|1)*(00 | 10 | 11)
• Let's break down the components of this regular expression:
 `ε | 0 | 1`: strings of length 0 or 1, which can never end with "01".
 `(0|1)*(00 | 10 | 11)`: any string of length ≥ 2 whose last two symbols are "00", "10" or "11", i.e. anything but "01".
 (Lookahead operators such as `(?!...)` exist only in programming-language regex engines; they are not part of formal regular expressions.)
32. Write the R.E. for: the language of strings that do not end with 01.
➢ The regular expression for strings that do not end with "01" can also be written as:
• 1 | ε | (0|1)*(0 | 11)
 `1 | ε`: covers the string "1" and the empty string.
 `(0|1)*(0 | 11)`: any string ending with '0' or with "11".
 A string ends with "01" only if its last symbol is '1' preceded by '0'; ending in '0', ending in "11", or having length ≤ 1 rules this out, so the expression matches exactly the strings that do not end with "01".
33. Explain symbol table. For what purpose, compiler uses symbol table?
➢ A symbol table is a data structure used by a language translator such as a compiler or interpreter. It is used to store the names encountered in the source program, along with the relevant attributes of those names.
➢ Use of Symbol Table:- Symbol tables are typically used in compilers. Basically, a compiler is a program which scans the application program (for instance, your C program) and produces machine code.
 During this scan the compiler stores the identifiers of the application program in the symbol table. These identifiers are stored in the form of name, value, address, and type.
 Here the name represents the name of identifier, value represents the value stored in an identifier, the address
represents memory location of that identifier and type represents the data type of identifier.
• Items stored in Symbol table:- Variable names and constants; Procedure and function names; Literal constants and
strings; Compiler generated temporaries; Labels in source languages
34. What do you mean by left recursion and how it is eliminated? / What is left Recursion in CFG?
➢ Left Recursion:- A grammar G(V, T, P, S) is left recursive if it has a production of the form A → Aα | β.
 This grammar is left recursive because the non-terminal on the left of the production occurs in the first position on the right side. Left recursion can be eliminated by replacing this pair of productions with A → βA′ and A′ → αA′ | ε.
➢ Elimination of left recursion:- A grammar is said to be left recursive if it has a non-terminal A such that there is a derivation A → Aα for some string α. Top-down parsing methods cannot handle left-recursive grammars, so a transformation that eliminates left recursion is needed.
• Algorithm to eliminate left recursion:- Assign an ordering A1, …, An to the non-terminals of the grammar.
for i := 1 to n do begin
    for j := 1 to i − 1 do begin
        replace each production of the form Ai → Ajγ
        by the productions Ai → δ1γ | δ2γ | … | δkγ,
        where Aj → δ1 | δ2 | … | δk are all the current Aj-productions
    end;
    eliminate the immediate left recursion among the Ai-productions
end
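A sketch of the immediate-left-recursion step used by the algorithm (the list-of-symbols representation of productions is an assumption of this sketch):
```python
def eliminate_immediate(head, productions):
    """A -> A a1 | ... | A am | b1 | ... | bn   becomes
       A  -> b1 A' | ... | bn A'
       A' -> a1 A' | ... | am A' | eps"""
    rec  = [p[1:] for p in productions if p and p[0] == head]     # the alphas
    base = [p     for p in productions if not p or p[0] != head]  # the betas
    if not rec:
        return {head: productions}       # nothing to do
    new = head + "'"
    return {
        head: [b + [new] for b in base],
        new:  [a + [new] for a in rec] + [[]],   # [] stands for epsilon
    }

# A -> A c | S d | f  (only the immediate part; S d and f are the betas)
print(eliminate_immediate("A", [["A", "c"], ["S", "d"], ["f"]]))
# {'A': [['S', 'd', "A'"], ['f', "A'"]], "A'": [['c', "A'"], []]}
```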
35. Give the rule to remove left recursion from a grammar, and eliminate left recursion from the following grammar:
S → Aa | b
A → Ac | Sd | f
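➢ Rule: replace A → Aα | β by A → βA′ and A′ → αA′ | ε.
➢ Worked elimination:
 S → Aa | b has no immediate left recursion, so it stays as it is.
 A → Sd is indirectly left recursive through S, so substitute S: A → Ac | Aad | bd | f.
 Now eliminate the immediate left recursion in A (α1 = c, α2 = ad; β1 = bd, β2 = f):
S → Aa | b
A → bdA′ | fA′
A′ → cA′ | adA′ | ε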
36. Explain a rule of Left factoring a grammar and give Example. / What is left factoring in CFG? / Do left factoring for
following grammar: S → iEtS | iEtSeS | a E→b
➢ S → iEtSS′ | a, S′ → eS | ε, E → b
 Left factoring converts a grammar with common prefixes into an equivalent left-factored grammar, removing the uncertainty for the top-down parser. In left factoring, we separate the common prefixes from the production rules.
 The following algorithm is used to perform left factoring on a grammar:
• Suppose the grammar is in the form:- A ⇒ αβ1 | αβ2 | αβ3 | …… | αβn | γ
 Where A is a non-terminal and α is the common prefix.
 We will separate those productions with a common prefix and then add a new production rule in which the new
non-terminal we introduced will derive those productions with a common prefix.
• A ⇒ αA′ | γ, where A′ ⇒ β1 | β2 | β3 | …… | βn
 The top-down parser can easily parse this grammar to derive a given string. So this is how left factoring in compiler
design is performed on a given grammar.
37. Define Handle, Handle pruning, Ambiguous grammar, Basic block and Constant folding.
➢ Handle: A “handle” of a string is a substring of the string that matches the right side of a production, and whose
reduction to the non-terminal of the production is one step along the reverse of rightmost derivation.
➢ Handle pruning: The process of discovering a handle and reducing it to the appropriate left-hand-side non-terminal is known as handle pruning.
➢ Ambiguous grammar: A CFG is said to be ambiguous if there exists more than one derivation tree for some input string, i.e., more than one LeftMost Derivation Tree (LMDT) or RightMost Derivation Tree (RMDT).
 Definition:- G = (V, T, P, S) is ambiguous if and only if there exists a string in T* that has more than one parse tree, where V is a finite set of variables, T is a finite set of terminals, P is a finite set of productions of the form A → α with A a variable and α ∈ (V ∪ T)*, and S is a designated variable called the start symbol.
➢ Basic block:- A basic block is a sequence of consecutive statements in which flow of control enters at the beginning
and leaves at the end without halt or possibility of branching except at the end.
➢ Constant folding:- Constant folding is an optimization technique in which expressions are calculated beforehand to save execution time. Expressions that generate a constant value are evaluated during compilation, and the results are stored in the designated variables. This method reduces code size as well.
38. What is global optimization? Name the 2 types of analysis performed for global optimization.
➢ Global Optimization is a code optimization technique employed by compilers to improve the overall performance
and efficiency of a program by considering the program as a whole, rather than focusing on individual code
segments or functions.
 The two main types of analysis performed for global optimization are:
• Inter-procedural Analysis:- Inter-procedural analysis involves analyzing the interactions and data flow between
different functions or procedures in a program.
 This analysis allows the compiler to make optimization decisions that span multiple functions, such as function
inlining, constant propagation across function boundaries, and inter-procedural dead code elimination.
• Whole-program Analysis:- Whole-program analysis considers the program as a single, integrated unit, allowing the
compiler to make optimization decisions that take the entire program into account.
 This type of analysis is particularly useful for optimizations that require a global view of the program, such as global
register allocation, inter-procedural code motion, and pointer analysis.
39. What is lexical analysis?
➢ Lexical analysis is the first phase of the compiler process where the source code is scanned to convert sequences of
characters into meaningful symbols known as tokens.
40. Describe the role of lexical analyzer. Which are the tasks performed by lexical analyzer.
➢ The lexical analyzer, also known as the lexer or scanner, is a fundamental component of a compiler. It performs the first phase of the compilation process, transforming the source code into a sequence of tokens. Here are the key roles of the lexical analyzer:
• Tasks Performed by the Lexical Analyzer:-
 Tokenization:- Converts the input stream of characters into tokens, which are the basic building blocks of syntax (e.g.,
keywords, operators, identifiers, literals).
 Removing Whitespaces and Comments:- Eliminates unnecessary whitespace and comments to streamline the input for
the syntax analyzer.
 Error Detection:- Identifies and reports lexical errors such as illegal characters or
malformed tokens.
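A toy tokenizer sketch illustrating these tasks (the token names and patterns are invented for illustration):
```python
import re

TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ID",     r"[A-Za-z_]\w*"),
    ("OP",     r"[+\-*/=]"),
    ("SKIP",   r"[ \t]+"),
]

def tokenize(text):
    pos = 0
    while pos < len(text):
        for name, pattern in TOKEN_SPEC:        # try each pattern in order
            m = re.match(pattern, text[pos:])
            if m:
                if name != "SKIP":              # whitespace is discarded
                    yield (name, m.group())
                pos += m.end()
                break
        else:
            # no pattern matched: report a lexical error
            raise SyntaxError(f"illegal character {text[pos]!r}")

print(list(tokenize("x = a + b * 50")))
# [('ID', 'x'), ('OP', '='), ('ID', 'a'), ('OP', '+'),
#  ('ID', 'b'), ('OP', '*'), ('NUMBER', '50')]
```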
41. Construct syntax tree and DAG for following expression: X = a * (b+c)- (b+c)* d
➢ The three-address code is:
t1 = b + c
t2 = a * t1
t3 = t1 * d
t4 = t2 - t3
X = t4
 In the syntax tree, b + c appears as two separate subtrees; in the DAG, the common subexpression b + c is a single node shared by both multiplications.
42. Write RE the following language.. (i) All string of 0’s and 1’s that do not contain 11.; (ii) All string of 0’s and 1’s that
every 1 is followed by 00.
➢ 1. All strings of 0's and 1's that do not contain "11":- The regular expression must ensure that no two consecutive '1's appear in the string.
• Regular Expression:- `(0 | 10)* (1 | ε)`
• Explanation:- `(0 | 10)*` matches any sequence in which every '1' is immediately followed by a '0', so "11" can never occur inside it.
 The final `(1 | ε)` allows the string to end with a single '1', which has nothing following it.
➢ 2. All strings of 0’s and 1’s where every '1' is followed by "00":- The regular expression for this language ensures that
every occurrence of '1' is immediately followed by "00".
• Regular Expression:- `(0*(100)*0*)*` (equivalently, `(0 | 100)*`)
• Explanation:- `0*` matches any number of 0's; `(100)*` ensures that every '1' comes packaged with the two '0's that must follow it.
 Repeating the whole group allows these segments to alternate freely; since a '1' can only ever be produced as part of a `100` segment, every '1' in any matched string is followed by "00".
43. Design FIRST and FOLLOW sets for the following grammar.
S → 1AB | ε
A → 1AC | 0C
B → 0S
C → 1
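➢ Worked answer:
• FIRST sets:- FIRST(S) = {1, ε}; FIRST(A) = {1, 0}; FIRST(B) = {0}; FIRST(C) = {1}.
• FOLLOW sets:-
 FOLLOW(S) = {$}: S is the start symbol, and the occurrence of S in B → 0S only adds FOLLOW(B), which itself equals FOLLOW(S).
 FOLLOW(A) = {0, 1}: in S → 1AB, A is followed by FIRST(B) = {0}; in A → 1AC, A is followed by FIRST(C) = {1}.
 FOLLOW(B) = {$}: B ends the body of S → 1AB, so FOLLOW(B) = FOLLOW(S).
 FOLLOW(C) = {0, 1}: C ends the bodies of A → 1AC and A → 0C, so FOLLOW(C) = FOLLOW(A).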
44. For the following productions, write the semantic actions:
S → E$
E → E1 + E2
E → E1 * E2
E → digit
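➢ The standard value-synthesizing semantic actions, with E.val as a synthesized attribute:
S → E$        { print(E.val) }
E → E1 + E2   { E.val := E1.val + E2.val }
E → E1 * E2   { E.val := E1.val * E2.val }
E → digit     { E.val := digit.lexval }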
45. Define the following terms and give suitable examples for it. (i) LR(0) Item (ii) LR(1) Item (iii) Augmented Grammar
➢ LR(0) Item:- An LR(0) item is a production of a grammar with a dot (•) at some position in the right-hand side of the
production. It is used in the context of LR parsing, which is a bottom-up method for syntax analysis in compilers. The
dot indicates how much of the production has been recognized so far.
• Example:- Consider the production E → E + T. Some possible LR(0) items derived from this production are:
E → •E + T    E → E• + T    E → E + •T    E → E + T•
➢ LR(1) Item:- An LR(1) item is an LR(0) item with an additional component called a lookahead. The lookahead is a
terminal symbol that is expected to appear immediately after the end of the production. This lookahead helps in
making more informed parsing decisions.
• Example:- Using the same production E → E + T and assuming the lookahead symbol is $, which typically represents the end of input:
E → •E + T, $    E → E• + T, $    E → E + •T, $    E → E + T•, $
➢ Augmented grammar:- An augmented grammar is a grammar that has been modified by adding an extra production
rule. This new production rule introduces a new start symbol, and the original start symbol becomes the right-hand
side of this new production. The purpose of augmenting a grammar is to facilitate the parsing process, especially for
LR parsers, by providing a clear and unambiguous starting point.
• Example:- Suppose we have a grammar with the start symbol S and production rules:
S → E
E → E + T
E → T
T → id
• To create an augmented grammar, we introduce a new start symbol S′ and add a production for it:
S′ → S
S → E
E → E + T
E → T
T → id
46. Write the regular expression R over {0,1} or {a,b}: (i) The set of all strings with an even number of a’s followed by an
odd number of b’s. (ii) The set of all strings that consist of alternating 0’s and 1’s.
➢ (i)The set of all strings with an even number of a’s followed by an odd number of b’s:
• Let's break this down:- An even number of `a`'s can be represented by `(aa)*` because every two `a`'s make an even
count. An odd number of `b`'s can be represented by `b(bb)*` because the first `b` ensures the count is odd and any
subsequent pairs `bb` maintain the odd count.
• Combining these, we get Regular Expression:- `(aa)*(b(bb)*)`
• Explanation:- `(aa)*` matches any sequence with an even number of `a`'s, including the empty string. `(b(bb)*)`
matches any sequence with an odd number of `b`'s.
➢ (ii) The set of all strings that consist of alternating 0's and 1's:
• An alternating string may start with either symbol and have any length, including single symbols, so we allow an optional leading '1', repeated "01" pairs, and an optional trailing '0'.
• Regular Expression:- `(1 | ε)(01)*(0 | ε)`
• Explanation:- `(01)*` produces the alternating core; the optional leading `1` and trailing `0` cover strings such as "1", "0", "10", "010" and "101".
 This also covers the even-length forms `01(01)*` and `10(10)*` of the simpler two-case construction, while additionally matching the odd-length alternating strings that those two cases miss.
47. Explain dynamic memory allocation strategy.
➢ Explicit Allocation:- Explicit allocation can be done for fixed-size and variable-size blocks.
 Explicit allocation for fixed-size blocks:- This is the simplest technique of explicit allocation, in which the size of every allocated block is fixed.
 In this technique a free list, i.e. a list of free blocks, is used. The free list is consulted when we want to allocate memory, and when memory is de-allocated the block is appended back to the free list.
 The advantage of this technique is that there is no space overhead.
 Explicit allocation of variable-size blocks:- Due to frequent memory allocation and de-allocation, the heap becomes fragmented: it may consist of some blocks that are free and some that are allocated.
• Implicit Allocation:- Implicit allocation is performed by the user program together with run-time packages. The run-time package is required to know when a storage block is no longer in use. There are two approaches used for implicit allocation; the first is sketched in code after this list.
 Reference count:- A reference count is a counter kept per block during implicit memory allocation. If a block is referred to by another block, its reference count is incremented by one; if the reference count of a particular block drops to 0, that block is no longer referenced and can be de-allocated. Reference counts work best when pointers between blocks never appear in cycles.
 Marking techniques:- This is an alternative approach to determine whether a block is in use. The user program is suspended temporarily, and the frozen pointers are followed to mark the blocks that are in use. There is one more technique, called compaction, in which all the used blocks are moved to one end of the heap.
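A toy sketch of the reference-count idea (illustrative only; a real allocator manages raw memory blocks, not Python objects):
```python
class Block:
    def __init__(self, name):
        self.name, self.refcount = name, 0

def point_to(block):
    block.refcount += 1          # a new pointer now refers to the block
    return block

def drop(block):
    block.refcount -= 1          # a pointer to the block goes away
    if block.refcount == 0:
        print(f"deallocating {block.name}")   # no references remain

b = Block("b1")
p = point_to(b)      # refcount = 1
q = point_to(b)      # refcount = 2
drop(p); drop(q)     # the second drop frees the block
```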
48. What is a dependency graph? Explain with examples.
➢ A dependency graph is a directed graph that shows how computed values depend on one another: an edge from node a to node b means b cannot be evaluated before a. Common kinds of dependence are:
• Data Dependencies: When a statement computes data that is later utilized by another statement; an instruction must wait for a result from a preceding instruction before it can complete its execution.
• Control Dependencies: Control Dependencies are those that come from a program’s well-ordered control flow. A
scenario in which a program instruction executes if the previous instruction evaluates in a fashion that permits it to
execute is known as control dependence.
• Flow Dependency: In computer science, a flow dependence occurs when a program statement refers to the data of
a previous statement.
• Anti Dependence: When an instruction needs a value that is modified later, this is known as an anti-dependency, or write-after-read (WAR).
• Output-Dependency: An output dependence, also known as write-after-write (WAW), happens when the sequence
in which instructions are executed has an impact on the variable’s ultimate output value.
• Control-Dependency: If the outcome of A determines whether B
should be performed or not, an instruction B has a control
dependence on a previous instruction A.
 Example:- For the productions below, the dependency graph records that E.val depends on E1.val and E2.val:
| PRODUCTION | SEMANTIC RULE |
| E → E1 + E2 | E.val := E1.val + E2.val |
| E → E1 * E2 | E.val := E1.val * E2.val |
49. Explain different phases of the compiler. / Describe the output for the various phases of compiler with example./
Explain input, output and action performed by each phases of compiler with example.
➢ Lexical Analysis:- This is the first phase of the compilation process. It reads the source program one character at a time and converts it into meaningful lexemes, which the lexical analyzer represents as tokens. Consider the statement x = a + b * 50:
 Input: stream of characters (source code).
 Output: tokens (e.g., identifiers, operators, constants).
 Action:
x → identifier (id, 1)
= → operator (assignment)
a → identifier (id, 2)
+ → operator (binary addition)
b → identifier (id, 3)
* → operator (multiplication)
50 → constant (integer)
 Final tokenized expression:- (id, 1) = (id, 2) + (id, 3) * 50
• Syntax Analysis:- Syntax analysis is the second phase of compilation process. It takes tokens as input and generates a
parse tree as output. In this parser checks that the expression made by the tokens is syntactically correct or not.
 Input: tokens (from the lexical analyzer). Output: parse tree (syntax structure).
 Action: checks the syntax against a context-free grammar (CFG) with rules:
S → id = E
E → E + T | T
T → T * F | F
F → id | integer constant
• Semantic Analysis:- This is the third phase of the compilation process. It checks whether the parse tree follows the rules of the language. The output of the semantic analysis phase is the annotated syntax tree.
 Input: Parse tree (from syntax analyzer). Output: Type checking and semantic actions.
 Action: Ensures type consistency and performs semantic checks. No errors in our example.
• Intermediate Code Generation:- The compiler generates the source code into the intermediate code. The
intermediate code should be generated in such a way that you can easily translate it into the target machine code.
 Input: modified parse tree. Output: three-address code (intermediate representation).
 Action: generates intermediate code:
t1 = b * 50.0
t2 = a + t1
x = t2
• Code Optimization:- It is an optional phase. It is used to improve the intermediate code so that the output of the program runs faster. It removes unnecessary lines and rearranges the sequence of statements.
 Input: Three-address code. Output: Optimized code.
 Action: Optimizes the code (e.g., constant folding, common subexpression elimination):
Original: t1 = b * 50.0, x = a + t1 Optimized: x = a + b * 50.0
• Code Generation:- It is the final stage of the compilation process. It takes the optimized intermediate code as input
and maps it to the target machine language.
 Input: Intermediate code. Output: Assembly code (target language).
 Action: Converts the expression into assembly code for the processor.
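• Sketch:- A toy final-phase translator in Python that maps the optimized three-address code to illustrative two-operand assembly; the mnemonics (MOVF, ADDF, MULF) and the naive register allocation are assumptions, not a real instruction set:
TAC = [("*", "b", "50.0", "t1"), ("+", "a", "t1", "x")]   # t1 = b * 50.0; x = a + t1
OPS = {"+": "ADDF", "*": "MULF"}

def codegen(tac):
    asm, loc = [], {}   # loc: where each temporary currently lives
    for n, (op, a1, a2, dst) in enumerate(tac):
        r = f"R{n}"
        asm.append(f"MOVF {loc.get(a1, a1)}, {r}")        # load the first operand
        asm.append(f"{OPS[op]} {loc.get(a2, a2)}, {r}")   # r = r op a2
        loc[dst] = r                                      # the result lives in r
    asm.append(f"MOVF {loc[tac[-1][3]]}, {tac[-1][3]}")   # store the final result
    return asm

print("\n".join(codegen(TAC)))
# MOVF b, R0 / MULF 50.0, R0 / MOVF a, R1 / ADDF R0, R1 / MOVF R1, x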
50. Explain the language dependent and machine independent phases of the compiler. Also List major functions done
by the compiler.
➢ Language Dependent Phases:- These are closely tied to the syntax and semantics of the source programming language:
 Lexical Analysis:- Converts the source code into a stream of tokens, identifying syntactic elements such as keywords,
operators, and identifiers.
 Syntax Analysis (Parsing):- Analyzes the token stream against the grammar of the language to build a parse tree (or
syntax tree), ensuring the code follows the correct syntactical structure.
 Semantic Analysis:- Ensures that the syntax tree follows the language's semantic rules, such as type checking, scope
resolution, and checking for undefined variables.
 Intermediate Code Generation:- Translates the syntax tree into an intermediate representation (IR) that is easier to
manipulate and optimize. This IR is still language-dependent but abstracts away some of the language-specific details.
• Machine Independent Phases:- These phases focus on optimization and code generation that are not specific to any
particular machine architecture:
 Intermediate Code Optimization:- Performs optimizations on the intermediate representation to improve performance
and reduce resource usage. These optimizations are general and not specific to any particular machine.
 Code Generation:- Converts the optimized intermediate representation into machine code. This phase takes into
account the specific architecture of the target machine.
 Machine Dependent Code Optimization:- Further optimizes the machine code for specific hardware features of the
target machine, such as instruction pipelining and register allocation.
• Major Functions of the Compiler:- (ANS NO: 49)
51. Explain the following with example (i) Lexical phase error. (ii) Syntactic phase error.
➢ (i) Lexical phase error:- Occurs when a sequence of characters does not form a recognizable token during scanning of the source program, so no valid token can be generated.
• Common Causes of Lexical Errors:- Adding an unnecessary character.
 Omitting a required character.
 Substituting a character incorrectly.
 Swapping two characters.
• Example:- In standard FORTRAN 77, an identifier longer than the allowed six characters is a lexical error.
 The presence of illegal characters like ~, &, and @ in a Pascal program constitutes a lexical error.
➢ (ii)Syntactic phase error:- Syntax errors occur due to coding mistakes made by programmers.
• Common Sources of Syntax Errors:- Omitting semicolons.
 Imbalanced parentheses and incorrect punctuation usage.
• Example:- Consider the code snippet: int x; int y //Syntax error
 The error arises from the missing semicolon after int y.
52. Discuss the functions of error handler.
➢ Error Detection :- Error handlers don't magically prevent errors, but they play a crucial role in identifying them. They
act as sentinels, constantly monitoring the program's execution for conditions that deviate from normal operation.
 Invalid user input (e.g., entering letters when a number is expected)
 File access issues (e.g., trying to open a non-existent file)
 Runtime errors (e.g., division by zero)
 Logical errors in program code (e.g., infinite loops)
• Error Reporting:- Once an error is detected, the error handler does not stay silent. It communicates the issue by generating an error report, which typically includes:
 Error type (e.g., "File not found," "Division by zero")
 Location of the error (e.g., line number in the code, specific function)
 Any relevant data that can help pinpoint the root cause (e.g., values of variables involved)
• Error Recovery:- In some cases, error handlers can go beyond just reporting the error. They may attempt to recover from the situation and allow the program to continue execution, albeit potentially in a limited way, by:
 Providing default values (e.g., using a default value for a missing input)
 Skipping problematic sections of code
 Attempting to retry operations (e.g., retrying a file access after a brief delay)
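• Sketch:- One classic recovery strategy in a compiler is panic mode: on an error, the parser discards input until a synchronizing token and resumes just after it. A minimal Python sketch, where the token list and the synchronizing set are assumptions for illustration:
# Panic-mode recovery: skip tokens until a synchronizing token (';' or '}').
SYNC = {";", "}"}

def recover(tokens, pos):
    while pos < len(tokens) and tokens[pos] not in SYNC:
        pos += 1        # skip the problematic region
    return pos + 1      # resume just past the synchronizing token

tokens = ["int", "x", "@", "#", ";", "int", "y", ";"]
print(recover(tokens, 2))   # 5 -> parsing resumes at "int y ;"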
53. Explain peephole optimization. Explain with example.
➢ Definition: Peephole optimization is a simple and effective technique for locally improving target code. It improves the performance of the target program by examining a short sequence of target instructions (called the peephole) and replacing these instructions by a shorter or faster sequence whenever possible. The peephole is a small, moving window on the target program.
• Objectives of Peephole Optimization:-
 Improve Performance:- Optimize code to execute more efficiently.
 Reduce Memory Footprint:- Minimize memory usage during execution.
 Decrease Code Size:- Make the compiled code smaller.
• Techniques (a toy pass combining two of them appears after this list):-
• Redundant Load and Store Elimination:- Identify and eliminate redundant loads, stores, and copies.
 Example:- Initial code:- y = x + 5; i = y; z = i; w = z * 3;
Optimized code:- y = x + 5; w = y * 3;
• Constant Folding:- Simplify expressions that can be evaluated at compile time.
 Example:- Initial code:- x = 2 * 3; Optimized code:- x = 6;
• Strength Reduction:- Replace expensive operations with cheaper alternatives.
 Example:- Initial code:- y = x * 2; Optimized code:- y = x + x; // Equivalent to multiplication by 2
• Null Sequences / Simplify Algebraic Expressions:- Remove useless operations.
 Example:- Initial code:- a = a + 0; a = a * 1; a = a / 1; a = a - 0;
Optimized code:- // No effect on 'a'
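• Sketch:- A toy peephole pass in Python over three-address instructions, applying two of the patterns above (constant folding and null-sequence removal). The instruction format dst = a op b and the use of eval as a folding shortcut are assumptions for illustration:
import re

def peephole(ins):
    out = []
    for s in ins:
        m = re.match(r"(\w+) = (\w+) ([+*/-]) (\w+)$", s)
        if m:
            dst, a, op, b = m.groups()
            if a.isdigit() and b.isdigit():                 # constant folding
                s = f"{dst} = {eval(a + op + b)}"           # eval: sketch only
            elif (op, b) in {("+", "0"), ("-", "0"), ("*", "1"), ("/", "1")}:
                if dst == a:                                # null sequence
                    continue                                # drop it entirely
                s = f"{dst} = {a}"                          # simplify to a copy
        out.append(s)
    return out

print(peephole(["x = 2 * 3", "a = a + 0", "y = b * 1"]))
# ['x = 6', 'y = b']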
54. What are conflicts in LR Parser? What are their types? Explain with an example.
➢ Shift-Reduce Conflict: A shift-reduce conflict occurs when the parser faces a choice between shifting the next input
symbol onto the stack or reducing an existing set of symbols to a non-terminal.
 Cause: This conflict arises when the parser can either continue reading input (shift) or apply a production rule.
• Example: Consider the ambiguous grammar:-
E -> E + E E -> E * E E -> id
 Suppose we have the input string: id + id * id.
 The parser shifts id, reduces it to E, shifts +, shifts the next id, and reduces it to E, so the stack holds E + E.
 With lookahead *, the parser can either reduce E + E to E or shift *, which is a shift-reduce conflict.
➢ Reduce-Reduce Conflict: A reduce-reduce conflict occurs when the parser has multiple production rules that can be applied to the same set of symbols.
 Cause: This conflict arises when the parser must decide which production rule to use for reduction.
• Example: Consider the grammar:-
 S → Aa | Bb   A → c   B → c
 After shifting c, the parser must reduce, but both A → c and B → c match the handle c.
 An LR(0) parser cannot choose between them; a parser with lookahead (SLR/LR(1)) can resolve this particular conflict because FOLLOW(A) = {a} and FOLLOW(B) = {b}.
➢ Handling Conflicts:-
• Precedence and Associativity: To resolve conflicts, LR parsers use rules based on operator precedence and
associativity. For example, multiplication (*) might have higher precedence than addition (+).
• Parsing Table: The parsing table contains action and go-to entries. Action entries guide shift and reduce decisions,
while go-to entries determine the next state.
• Shift Action: Shift the current terminal and update the state.
• Reduce Action: Apply a production rule and update the stack.
 Example Resolution: In the ambiguous grammar, if we assign higher precedence to * than +, the parser will correctly
reduce id * id first, followed by id + (id * id).
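• Sketch:- The precedence rule above can be phrased as a tiny decision procedure; the numeric precedence levels are assumptions for illustration:
PREC = {"+": 1, "*": 2}   # '*' binds tighter than '+'

def action(stack_op, lookahead_op):
    # Shift if the incoming operator binds tighter than the one on the
    # stack; otherwise reduce (equal precedence reduces, i.e. left
    # associativity for the same operator).
    return "shift" if PREC[lookahead_op] > PREC[stack_op] else "reduce"

print(action("+", "*"))   # shift  -> id * id is grouped first
print(action("*", "+"))   # reduce -> (id * id) + id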
55. Define DAG Give an example. / What is DAG? What are its advantages in context of optimization? How does it help
in eliminating common sub expression?
➢ A Directed Acyclic Graph (DAG) is a type of graph that is directed and contains no cycles. In other words, it is a graph
in which edges have a direction, and it is impossible to start at any node and follow a consistently directed sequence of
edges that eventually loops back to the starting node. DAGs are widely used in various fields such as computer science,
project scheduling, data processing, and more.
• Example:- Consider the following example to illustrate a DAG. Let's have a graph with vertices A, B, C, D, and E:
A→B A→C B→D C→D D→E
The graph can be visualized as:
      A
     / \
    B   C
     \ /
      D
      |
      E
➢ Advantages of DAG in Optimization:-
 Compact Representation: DAGs capture the essential computations without redundancy. They collapse repeated
expressions into a single node, reducing the overall size of intermediate code.
 Common Subexpression Elimination (CSE): DAGs enable the identification of common subexpressions within basic
blocks. CSE is a compiler optimization technique that replaces redundant subexpressions with temporary variables,
reducing the overall number of computations.
 Efficient Memory Usage: By sharing common subexpressions, DAGs minimize memory requirements during
intermediate code generation.
➢ Common Subexpression Elimination (CSE) using DAG: CSE aims to eliminate repeated computations by identifying
and reusing common subexpressions.
• Process:- Construct a DAG for the basic block (a sequence of code with no branches).
 Each node in the DAG represents an expression or computation.
 Repeated subexpressions are merged into a single node.
 Replace occurrences of the same subexpression with references to the shared node.
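• Sketch:- A value-numbering style construction in Python: each (operator, operand, operand) key is looked up in a table, so a repeated subexpression maps to the already-built DAG node. The basic block below is an assumed example:
nodes = {}   # (op, left, right) -> node id

def node(op, l=None, r=None):
    key = (op, l, r)
    if key not in nodes:   # reuse an existing DAG node when one matches
        nodes[key] = len(nodes)
    return nodes[key]

a, b, c = node("a"), node("b"), node("c")
t1 = node("+", a, b)   # a + b
t2 = node("*", t1, c)  # (a + b) * c
t3 = node("+", a, b)   # a + b again -> shared with t1
print(t1 == t3)        # True: the common subexpression is eliminated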
56. What is ambiguous grammar? Describe with an example. / Show that S → aS | Sa | a is an ambiguous grammar.
➢ A CFG is said to be ambiguous if there exists more than one derivation tree for some input string, i.e., more than one leftmost derivation (equivalently, more than one rightmost derivation).
 G = (V, T, P, S) is a CFG that is said to be ambiguous if and only if there exists a string in T* that has more than one parse tree.
 where V = finite set of variables; T = finite set of terminals; P = finite set of productions; S = start symbol.
• Example:- S → aS | Sa | a String :- aaa
 First leftmost derivation: S ⇒ aS ⇒ aaS ⇒ aaa
 Second leftmost derivation: S ⇒ Sa ⇒ Saa ⇒ aaa
 Since the string aaa has more than one leftmost derivation (and hence more than one parse tree), the given grammar is ambiguous. A brute-force check appears below.
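• Sketch:- A brute-force Python check that counts the leftmost derivations of aaa in this grammar; the pruning bound relies on the fact that no production shortens the sentential form:
def count(sent, target="aaa"):
    # Count the leftmost derivations of `target` from sentential form `sent`.
    if "S" not in sent:
        return 1 if sent == target else 0
    if len(sent) > len(target):   # productions never shrink the string
        return 0
    i = sent.index("S")           # expand the leftmost non-terminal
    return sum(count(sent[:i] + rhs + sent[i + 1:], target)
               for rhs in ("aS", "Sa", "a"))

print(count("S"))   # 4 distinct leftmost derivations, so the grammar is ambiguous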
57. Translate the expression –(a+b)*(c+d)*(a+b*c) into Quadruples, Triples, and Indirect triples.
➢ Three-address code:- t1 = a + b    t2 = -t1    t3 = c + d    t4 = t2 * t3
 t5 = b * c    t6 = a + t5    t7 = t4 * t6
• Quadruples:
  #   Operator   Arg1   Arg2   Result
  0   +          a      b      t1
  1   uminus     t1            t2
  2   +          c      d      t3
  3   *          t2     t3     t4
  4   *          b      c      t5
  5   +          a      t5     t6
  6   *          t4     t6     t7
• Triples:
  #   Operator   Arg1   Arg2
  0   +          a      b
  1   uminus     (0)
  2   +          c      d
  3   *          (1)    (2)
  4   *          b      c
  5   +          a      (4)
  6   *          (3)    (5)
• Indirect Triples:- The triple table above plus a statement list of pointers into it:
  Statement   Pointer
  100         (0)
  101         (1)
  102         (2)
  103         (3)
  104         (4)
  105         (5)
  106         (6)
58. Consider the following grammar:S’ = S# S → ABC A → a|bbD B → a| Ꜫ C → b| Ꜫ D → c| Ꜫ
Construct FIRST and FOLLOW for the grammar also design LL(1) parsing table for the grammar.
59. Consider the following grammar: S → AA A → aA A → b And construct the LALR parsing table.
60. Generate the three address code for the following program segment:-
While(a<c and b>d)
Do if a=1 then c = c+1
Else
While a<=d
Do a= a+b
➢ 1. if (a < c) goto (3)
2. goto (15)
3. if (b > d) goto (5)
4. goto (15)
5. if (a = 1) goto (7)
6. goto (10)
7. t1 = c + 1
8. c = t1
9. goto (1)
10. if (a <= d) goto (12)
11. goto (1)
12. t2 = a + b
13. a = t2
14. goto (10)
15. (next statement after the loop)
61. Translate the following expression into quadruple, triple, and indirect triple: -(a+b)*(c+d)-(a+b+c).
62. Explain operator grammar. Generate precedence function table for following grammar.
E → EAE | id
A→+|*
➢ A grammar that is used to define mathematical operators is called an operator grammar or operator-precedence grammar.
• A grammar is said to be an operator grammar if it satisfies the following two conditions:
 No production has an empty right-hand side (no null productions).
 No production has two adjacent non-terminals on its right-hand side.
• Substituting A into E → E A E gives the equivalent grammar E → E + E | E * E | id, for which the precedence relation table (with * binding tighter than +) is:
• Table:
        id    +     *     $
  id    -     .>    .>    .>
  +     <.    .>    <.    .>
  *     <.    .>    .>    .>
  $     <.    <.    <.    -
• Precedence functions:- One consistent assignment derived from these relations (by the standard graph/longest-path method) is:
        id    +     *     $
  f     4     2     4     0
  g     5     1     3     0
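• Sketch:- The f/g values above can be derived mechanically with the standard graph method: add an edge f_a → g_b for every a .> b and g_b → f_a for every a <. b, then take longest-path heights. A Python sketch, where the relation encoding is an assumption for illustration:
rel = {
    ("id", "+"): ">", ("id", "*"): ">", ("id", "$"): ">",
    ("+", "id"): "<", ("+", "+"): ">", ("+", "*"): "<", ("+", "$"): ">",
    ("*", "id"): "<", ("*", "+"): ">", ("*", "*"): ">", ("*", "$"): ">",
    ("$", "id"): "<", ("$", "+"): "<", ("$", "*"): "<",
}
edges = {}
for (a, b), r in rel.items():
    # a .> b gives edge f_a -> g_b;  a <. b gives edge g_b -> f_a
    src, dst = (("f", a), ("g", b)) if r == ">" else (("g", b), ("f", a))
    edges.setdefault(src, []).append(dst)

def height(n, memo={}):
    # Longest path out of n; the graph is acyclic when the relations
    # are consistent, so the recursion terminates.
    if n not in memo:
        memo[n] = max((1 + height(m) for m in edges.get(n, [])), default=0)
    return memo[n]

for s in ("id", "+", "*", "$"):
    print(s, "f =", height(("f", s)), "g =", height(("g", s)))
# id f=4 g=5 | + f=2 g=1 | * f=4 g=3 | $ f=0 g=0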
63. Give syntax directed definition for simple desk calculator. Also show annotated parse tree for 6*5+7n.
64. Construct the NFA for following regular expressions using Thompson’s construction. Apply a subset construction
method to convert into DFA. (a+b)*abb#
65. Define token, lexeme and pattern. Identify the lexemes that makes up the tokens for the following code
const p = 10; if( a < p) { a++ ; If(a== 5) continue ; }
➢ Token: a pair consisting of a token name and an optional attribute value; it represents a class of lexical units such as keywords, identifiers, operators, and separators.
 Lexeme: the actual sequence of characters in the source program that matches the pattern for a token.
 Pattern: the rule (typically a regular expression) describing the set of strings that can form the lexemes of a token.
• Lexemes and tokens for the given code, in source order:
 const → keyword    p → identifier    = → operator    10 → number    ; → separator
 if → keyword    ( → separator    a → identifier    < → operator    p → identifier    ) → separator
 { → separator    a → identifier    ++ → operator    ; → separator
 if → keyword    ( → separator    a → identifier    == → operator    5 → number    ) → separator
 continue → keyword    ; → separator    } → separator
66. Construct deterministic finite automata without constructing NFA for following regular expression. (a/b)*abb*
67. Generate the SLR parsing table for following grammar
S → Aa | bAc | bBa
A→d
B→d
68. Show the following grammar is LR(1) but not LALR(1).
S → Aa │bAc │Bc│bBa A→ d B→d
69. Construct SLR parsing table for the following grammar: S → (L) | a L→ L,S | S
70. Translate following arithmetic expression into (i) Quadruples (ii) Triple (iii) Indirect Triple. (a*b)+(c+d)-(a+b+c+d)
71. Construct CLR parsing table for following grammar.
S → aSA | €
A → bS | c