CD Guess Paper

GLOBAL INSTITUTE OF TECHNOLOGY

B. Tech. V Semester
5CS-04/ Compiler Design
Branch-AIDS/CS/IT
GUESS PAPER
Attempt all questions
Schematic diagrams must be shown wherever necessary. Any data you feel is missing may be
suitably assumed and stated clearly. No supplementary sheet shall be issued in any case.
Part A (All questions are compulsory)
Answer should be given up to 25 words only

Q. 1 Explain the term Grammar. [CO2]


Q. 2 What is Predictive parser? [CO2]
Q. 3 What do you mean by Peephole Optimization? [CO5]
Q. 4 Differentiate Abstract syntax tree and DAG representation of intermediate code. [CO5]
Q. 5 Define left factoring in the following grammar. E --> E+E/E*E/a/b. [CO2]
Q. 6 What do you mean by Activation code? [CO4]
Q. 7 Define Top down & Bottom up Parsing. [CO3]
Q. 8 Write short note on Storage allocation strategies. [CO4]
Q. 9 Explain the basic blocks. [CO5]
Q. 10 Define the syntax directed definition. [CO3]
Q. 11 Differentiate between Top down Parsing and Bottom up parsing. [CO2]
Q. 12 Define lexeme, token and pattern. [CO1]
Q. 13 Classify leftmost derivation and rightmost derivation. Show an example for each. [CO2]
Q. 14 Consider a compiler for P on machine N. Apply bootstrapping to obtain a compiler for P on machine
M. Also define cross-compilers. [CO1]
Q. 15 Write a short note on Symbol Table Management. [CO1]
Q. 16 Give the fundamental difference between Compiler, interpreter and assembler. [CO1]
Q. 17 Identify the various goals of Error Handler? [CO1]
Q. 18 State the meaning of intermediate code? [CO1]
Q. 19 Describe front end and back end of compiler. [CO1]
Q. 20 Check whether the grammar is ambiguous or not? [CO2]
S-> aS |Sa| Є
E->E +E | E*E| id
A -> AA | (A) | a
S ->SS|AB
A -> Aa|a
B -> Bb|b
Q.21 Define preprocessor. What are the functions of pre-processor? [CO3]
Q.22 Discuss about the Syntax Error Handling. [CO1]
Q.23 Differentiate between shift-reduce and Operator Precedence Parsers. [CO2]
Q.24 What are the benefits of intermediate code generation? [CO5]
Q.25 What are the various attributes of a Symbol Table? [CO4]
Q.26 Mention the issues to be considered while applying the techniques for
code optimization. [CO3]
Q.27 Briefly describe about the Lexical errors. [CO1]
Q.28 What are the functions used to create the nodes of syntax trees? [CO1]
Q.29 What are the three techniques for constructing an LR parsing table? [CO2]
Q.30 Discuss the evaluation of semantic rules. [CO3]

Part B (Analytical/Problem solving questions)


Attempt all questions (Word Limit 100)
Q. 1 Construct the DAG and generate the code for the given block: [CO5]
T1 = a + b
T2 = a – b
T3 = T1 * T2
T4 = T1 – T3
T5 = T4 + T3
Q. 2 What do you mean by basic block? Consider the following program segment: [CO5]
for r from 1 to 10 do
for c from 1 to 10 do
a [r, c] = 0.0;
for r from 1 to 10 do
a [r, c] = 1.0;
Find the basic block and construct the flow graph.
Q. 3 Translate the arithmetic expression. (a+b) * (c+d) + (a+b+c) into [CO3]
a) Syntax Tree b) Three Address code c) Quadruple d) Triples
Q. 4 What is an LALR (1) grammar? Construct LALR parsing table for the following grammar:

S -> CC
C -> cC / d [CO2]
Q. 5 Solve the input id+id*id, using operator precedence parser for the following grammar:
T->T+T/T*T/id [CO2]
Q.6 Construct DFA for following regular expression without constructing NFA.
(a | b) * a [CO1]
Q.7 Demonstrate Input buffering techniques. [CO1]
Q.8 Design the FIRST SET and FOLLOW SET for the following grammar. [CO2]
S→ Bb/Cd
B→ aB/Ɛ
C→ cC/Ɛ
Q.9 Justify that the following grammar is LL (1).
S→ AaAb | BbBa
A→ ϵ
B→ϵ [CO2]
Q.10 Describe the process of bootstrapping in detail. Also write a short note on error recovery
strategies. [CO1]
Q.11 A) Write a regular expression for identifiers and reserved words. Design the
transition diagrams for them. [CO1]
B) Explain the three general approaches for the implementation of a Lexical
analyzer. [CO2]
Q.12 Construct the predictive parser for the following grammar. [CO2]
S -> (L) | a
L ->L,S | S
Q.13 Translate the assignment x := A[y,z] into three address statement. [CO5]
Q.14 Write the quadruple, triple, indirect triple for the expression. [CO3]
-(a*b) + (c+d)-(a+b+c+d)
Q.15 Write regular expressions for the set of words having a,e,i,o,u appearing in that order,
although not necessarily consecutively. [CO4]

Part C (Descriptive /Analytical /Problem Solving/ Design Question)


(Attempt all questions)
Q. 1 Generate code for the following C statements for the simple/target machine assuming all
variables are static and three registers are available. [CO5]
t:= a - b
u:= a - c
v:= t + u
d:= v + u
Q. 2 Let G be a formal grammar with the following production rules [CO2]
D -> type tlist ;
tlist -> tlist , id / id
type -> int / float
a) Explain the role of terminal symbol $.
b) Construct a LR (0) parse table for the grammar.
c) What kind of conflict does the resulting parse table contain?
d) Explain two strategies to resolve this conflict.
Q. 3 Explain the following:
a) How data structure used in symbol table? Also explain static verses dynamic storage
allocation. [CO4]
b) What do you mean by CFG? Give distinction between regular and context free grammar and
limitation of context free grammar. [CO2]
Q.4 Elaborate left recursion. State the rules to remove left recursion from the grammar. Eliminate left
recursion from the following grammar. [CO2]
S-> L=R/R
L-> *R/id
R-> L
Q.5 Consider the following grammar. [CO2]
E→ E + T | T
T→ T* F |F
F→ (E) | id
(i) Remove left recursion from the grammar.
(ii) Construct a predictive parsing table.
(iii) Design Stack Implementation.
Q.6 Classify the different phases of a compiler and explain each phase in detail. Also give each phase's result for
the given statement. [CO1]
Q.7 Construct basic blocks, data flow graph and identify loop invariant statements
for the following: [CO5]
for (i=1 to n)
{
j=1;
while (j<=n)
{
A=B*C/D;
j=j+1;
}
}
Q.8 a) Construct the syntax tree and draw the DAG for the expression. [CO3]
(a*b) + (c-d) * (a*b)+b.
b) Write Syntax directed definition for constructing syntax tree of an expression derived from the
grammar. [CO4]
E -> E + T | E – T | T
T -> (E) | id | num
Q.9 a) Construct the collection of LR(0) item sets and draw the goto graph for the grammar
S -> S S | a | ϵ. Indicate the conflicts (if any) in the various states of the SLR parser. [CO2]
b) Discuss about error recovery strategies in predictive parsing. [CO4]

ANSWERS
Q.1 Differentiate between Top-down Parsing and Bottom-up parsing.
Ans.
 Top-down parsing starts evaluating the parse tree from the top (root) and moves downwards to parse the other nodes; bottom-up parsing starts evaluating the parse tree from the lowest level (leaves) and moves upwards towards the root.
 Top-down parsing uses leftmost derivation; bottom-up parsing uses rightmost derivation (traced out in reverse).

Q.2 Define lexeme, token, and pattern.


Ans. Token: A token is a sequence of characters that can be treated as a single logical entity.
Typical tokens are: 1) identifiers 2) keywords 3) operators 4) special symbols 5) constants.
Pattern: A set of strings in the input for which the same token is produced as output. This set of strings is described by a rule called a pattern associated with the token.
Lexeme: A lexeme is a sequence of characters in the source program that is matched by the pattern for a token.
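The three terms can be illustrated with a minimal regex-based scanner (an illustrative sketch; the token classes and patterns are invented for the example):

```python
import re

# Each token name is paired with the pattern (a regular expression)
# whose matches are that token's lexemes.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),           # pattern for integer constants
    ("ID",     r"[A-Za-z_]\w*"),  # pattern for identifiers
    ("OP",     r"[+\-*/=]"),      # pattern for operators
    ("SKIP",   r"\s+"),           # whitespace, discarded
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(src):
    """Return (token, lexeme) pairs; the lexeme is the matched text."""
    return [(m.lastgroup, m.group())
            for m in MASTER.finditer(src)
            if m.lastgroup != "SKIP"]
```

Here tokenize("x = 10") pairs the token ID with the lexeme x, because x matches the identifier pattern.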

Q.3 Classify leftmost derivation and rightmost derivation. Show an example for each.
Ans. Leftmost derivation: A leftmost derivation is obtained by applying a production
to the leftmost variable in each successive step.
Example:
Consider the grammar G with productions:
S→aSS (Rule 1)
S→b (Rule 2)
Compute the string w = 'aababbb' with leftmost derivation.
S⇒aSS (Rule:1)
S⇒aaSSS (Rule:1)
S⇒aabSS (Rule:2)
S⇒aabaSSS (Rule:1)
S⇒aababSS (Rule:2)
S⇒aababbS (Rule:2)
S⇒aababbb (Rule: 2)
To obtain the string w, the leftmost derivation applies the rule sequence “1121222”.
Rightmost derivation: A rightmost derivation is obtained by applying production to
the rightmost variable in each step.
Example:
Consider the grammar G with productions:
S→aSS (Rule 1)
S→b (Rule 2)
Compute the string w = 'aababbb' with rightmost derivation.
S⇒aSS (Rule:1)
S⇒aSb (Rule:2)
S⇒aaSSb (Rule:1)
S⇒aaSaSSb (Rule:1)
S⇒aaSaSbb (Rule:2)
S⇒aaSabbb (Rule:2)
S⇒aababbb (Rule: 2)
To obtain the string w, the rightmost derivation applies the rule sequence “1211222”.

Q.4 Consider a compiler for P on machine N. Apply boot strapping to obtain a


compiler for P on machine M. Also define cross-compilers.
Ans. Step 1 — Compiler A for P on machine N: we start with an existing compiler A for
language P that runs on machine N.
Step 2 — Build a cross-compiler: write (or retarget) the compiler's source code in
language P itself so that it generates code for machine M. Compiling this source with A
on machine N yields a compiler that runs on N but produces code for M.
Step 3 — Compiler for P on machine M: run the same compiler source through the
cross-compiler of step 2. The output both runs on M and generates code for M, giving a
native compiler for P on machine M. The compiler is now self-hosted: it can compile its
own source code.
A cross-compiler is a compiler that runs on one platform but generates code for a
different platform. In this bootstrapping process, the compiler produced in step 2 is the
cross-compiler for P on machine M.

Q.5 Write a short note on Symbol Table Management.


Ans. Symbol Table is an important data structure created and maintained by the
compiler in order to keep track of semantics of variables i.e. it stores information
about the scope and binding information about names, information about instances of
various entities such as variable and function names, classes, objects, etc.
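A minimal sketch of such a structure, assuming one hash table (dict) per scope, chained from innermost scope outward:

```python
class SymbolTable:
    """Chained hash tables: one dict per scope (a common textbook scheme)."""
    def __init__(self):
        self.scopes = [{}]              # the global scope

    def enter_scope(self):
        self.scopes.append({})          # open a new, inner scope

    def exit_scope(self):
        self.scopes.pop()               # discard the innermost scope

    def insert(self, name, info):
        self.scopes[-1][name] = info    # bind the name in the current scope

    def lookup(self, name):
        # Search innermost scope outward, as lexical scope rules require.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None
```

An inner declaration shadows an outer one until its scope is exited, which is exactly the scope/binding information the symbol table must track.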

Q.6 Give the fundamental difference between Compiler, interpreter and


assembler.
Ans. 1. A compiler converts the whole high-level program into machine language, while an
interpreter converts and executes the high-level instructions one at a time, often via an
intermediate form.
2. With a compiler, the entire program is translated before execution begins, whereas an
interpreter translates the first line, executes it, and so on.
3. A list of all errors is produced by the compiler after the compilation process, while an
interpreter stops translating after the first error.

Q.7 Identify the various goals of Error Handler?


Ans.
 Error Detection
 Error Reporting
 Error Recovery

Q.8 State the meaning of intermediate code?


Ans. During the translation of a source program into the object code for a target
machine, a compiler may generate a middle-level language code, which is known as
intermediate code or intermediate text. The complexity of this code lies between the
source language code and the object code.

Q.9 Describe front end and back end of compiler.


Ans. Front end:
The front end consists of those phases, or parts of phases, that depend primarily on the
source language and are largely independent of the target machine. These normally
include lexical and syntactic analysis, the creation of the symbol table, semantic
analysis, and the generation of intermediate code.
Back end:
The back end includes those portions of the compiler that depend on the target machine;
generally, these portions do not depend on the source language.

Q.10 Check whether the grammar is ambiguous or not?


S->aS|Sa|€
E->E+E|E*E|id
A->AA|(A)|a
S->SS|AB
A->Aa|a
B->Bb|b
Ans. Grammar S -> aS | Sa | €:
This grammar is ambiguous. Even the one-symbol string "a" has two distinct leftmost
derivations, S ⇒ aS ⇒ a and S ⇒ Sa ⇒ a, which correspond to different parse trees.

Grammar E -> E + E | E * E | id:


This grammar is ambiguous because it allows for ambiguous expressions. For
example, consider the expression "1 + 2 * 3." This expression can be interpreted as
either "(1 + 2) * 3" or "1 + (2 * 3)," leading to different parse trees.

Grammar A -> AA | (A) | a:


This grammar is ambiguous. For instance, the string "aaa" can be grouped as (AA)A or
A(AA), leading to different parse trees.

Grammar S -> SS | AB:


This grammar is ambiguous. Because of the production S -> SS, a string consisting of
three or more AB-segments (for example "ababab", with A ⇒ a and B ⇒ b) can be grouped
as (SS)S or S(SS), giving multiple parse trees for the same string.

Grammar A -> Aa | a:
This grammar is not ambiguous. It generates the strings a, aa, aaa, … (a+), and each
string has exactly one (left-recursive) derivation.

Grammar B -> Bb | b:
This grammar is not ambiguous. It generates the strings b, bb, bbb, … (b+), and each
string has exactly one derivation.

Part-B
Q.1 Construct DFA for following regular expression without constructing NFA.
(a|b)*a.
Ans. Given regular expression: (a|b)*a

Use the direct (firstpos/followpos) construction: augment the expression with an end
marker to get (a|b)*a #, and number the positions a:1, b:2, a:3, #:4.

followpos(1) = followpos(2) = {1, 2, 3}; followpos(3) = {4}.

Start state A = firstpos(root) = {1, 2, 3}.
 A on a → followpos(1) ∪ followpos(3) = {1, 2, 3, 4} = B
 A on b → followpos(2) = {1, 2, 3} = A
 B on a → {1, 2, 3, 4} = B
 B on b → {1, 2, 3} = A

B is accepting because it contains position 4, the position of the end marker #.

Resulting DFA:
State A (start): --a--> B, --b--> A
State B (accepting): --a--> B, --b--> A

The DFA accepts exactly the strings over {a, b} that end in a.
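A DFA for (a|b)*a can be checked mechanically; the following sketch hard-codes the standard two-state automaton, with states named A (start) and B (accepting):

```python
# Transition table: state B means "the last symbol read was a".
DELTA = {
    ("A", "a"): "B", ("A", "b"): "A",
    ("B", "a"): "B", ("B", "b"): "A",
}

def accepts(s):
    state = "A"                  # start state
    for ch in s:
        state = DELTA[(state, ch)]
    return state == "B"          # accept iff the string ends in a
```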

Q.2 Demonstrate Input buffering techniques.


Ans. The lexical analyzer scans the characters of the source program one at a time to
discover tokens. Because a large amount of time can be consumed scanning characters,
specialized buffering techniques have been developed to reduce the amount of overhead
required to process an input character.
Buffering techniques:
1. Buffer pairs
2. Sentinels
Often, many characters beyond the next token may have to be examined before the next
token itself can be determined. For this and other reasons, it is desirable for the lexical
analyzer to read its input from an input buffer. The buffer is divided into two halves of,
say, 100 characters each; when one half is exhausted, the other half is refilled from the
input. One pointer marks the beginning of the token being discovered, and a lookahead
pointer scans ahead of that point until the token is found. We view the position of each
pointer as being between the character last read and the character next to be read; in
practice, each buffering scheme adopts one convention, either that a pointer is at the
symbol last read or at the symbol it is ready to read. The distance the lookahead pointer
may have to travel past the actual token can be large. With sentinels, a special
end-of-buffer character is placed after each half, so the end-of-buffer test is combined
with the character test, requiring only one check per input character.
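The sentinel idea can be shown in miniature (a sketch only; real lexers implement this with character pointers in C, and the "\0" sentinel character is an assumption of this example):

```python
SENTINEL = "\0"  # a character that cannot occur in the source text

def chars_of_half(buffer_half):
    """Walk one buffer half. Appending the sentinel lets a single
    comparison per character detect the end of the half, instead of
    also checking the index bound on every step."""
    buf = buffer_half + SENTINEL
    out, i = [], 0
    while buf[i] != SENTINEL:   # one test per character
        out.append(buf[i])
        i += 1
    return out
```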

Q.3 Design the FIRST SET and FOLLOW SET for the following grammar.
S->Bb/Cd B->aB/€ C->cC/€
Ans.

1. FIRST set for each non-terminal:
 FIRST(B) = {a, ε} (because B → aB | ε)
 FIRST(C) = {c, ε} (because C → cC | ε)
 FIRST(S) = {a, b, c, d} (from S → Bb we get a, and also b because B can derive ε;
from S → Cd we get c and d likewise). Note that ε is not in FIRST(S): every string
derived from S ends in b or d.
2. FOLLOW set for each non-terminal:
 FOLLOW(S) = { $ } (where $ represents the end of the input, and S is the start symbol)
 FOLLOW(B) = { b } (B occurs only in S → Bb, immediately before b)
 FOLLOW(C) = { d } (C occurs only in S → Cd, immediately before d)

The construction of these sets is based on the production rules of the grammar:

 FIRST set: for a production A → X1 X2 … Xn, add FIRST(X1) minus ε; if X1 can
derive ε, also add FIRST(X2), and so on; if every Xi can derive ε, add ε to FIRST(A).
 FOLLOW set: initialize the FOLLOW set of the start symbol with $. For each
production A → αBβ, add FIRST(β) minus ε to FOLLOW(B); if β can derive ε (or is
empty), also add FOLLOW(A) to FOLLOW(B).
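The FIRST sets above can be verified with a small fixpoint computation (an illustrative sketch; ε is represented by the string "ε", and any symbol not appearing as a key of the grammar dict is treated as a terminal):

```python
EPS = "ε"
# Grammar: S -> Bb | Cd, B -> aB | ε, C -> cC | ε
GRAMMAR = {
    "S": [["B", "b"], ["C", "d"]],
    "B": [["a", "B"], [EPS]],
    "C": [["c", "C"], [EPS]],
}

def first_sets(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                       # iterate until nothing new is added
        changed = False
        for nt, prods in grammar.items():
            for prod in prods:
                derived, nullable = set(), True
                for sym in prod:
                    if sym == EPS:
                        continue                 # contributes only nullability
                    if sym not in grammar:       # terminal symbol
                        derived.add(sym)
                        nullable = False
                        break
                    derived |= first[sym] - {EPS}
                    if EPS not in first[sym]:    # sym cannot vanish: stop here
                        nullable = False
                        break
                if nullable:
                    derived.add(EPS)
                if not derived <= first[nt]:
                    first[nt] |= derived
                    changed = True
    return first
```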

Q.4 Justify that the following grammar is LL(1). S → AaAb | BbBa, A → ε, B → ε
Ans. To justify that a grammar is LL(1), we must show that it is suitable for predictive
parsing without backtracking: each entry of the parsing table must contain at most one
production for every (non-terminal, terminal) pair.
Given the grammar:
S → AaAb | BbBa
A → ε
B → ε
Let's construct the parsing table:
1. FIRST sets:
 FIRST(A) = {ε}
 FIRST(B) = {ε}
 FIRST(AaAb) = {a} and FIRST(BbBa) = {b}, so FIRST(S) = {a, b}
2. FOLLOW sets:
 FOLLOW(S) = { $ }
 FOLLOW(A) = { a, b } (the first A in AaAb is followed by a, the second by b)
 FOLLOW(B) = { a, b } (likewise in BbBa)
3. Parsing Table:
            a            b
 S     S → AaAb     S → BbBa
 A     A → ε        A → ε
 B     B → ε        B → ε
4. Justification:
 FIRST(AaAb) ∩ FIRST(BbBa) = {a} ∩ {b} = ∅, so the two S-alternatives never
compete for the same lookahead.
 Every (non-terminal, terminal) entry in the table holds at most one production;
there are no conflicts.
In summary, the grammar satisfies the LL(1) condition and its parsing table is
unambiguous, so it is LL(1).

Q.5 Describe the process of bootstrapping in detail. Also write short note on
error recovery strategies.
Ans. Bootstrapping — In compiler construction, bootstrapping is the technique of using a
compiler to compile itself. A compiler involves three languages: the source language S it
translates, the target language T it generates, and the implementation language I it is
written in. Bootstrapping a compiler for a language L on machine M typically proceeds
as follows:
1. Write a small core compiler for a subset of L in some existing language, generating
code for M.
2. Write a full compiler for L in L itself (restricted to the subset, if necessary).
3. Compile the full compiler with the core compiler. The result is a compiler for L,
written in L, running on M; it can now compile its own source code (it is self-hosting),
and recompiling it with itself serves as a strong consistency check.
Bootstrapping is also the basis for porting: compiling the compiler's source with a
retargeted code generator yields a cross-compiler, which can then build a native compiler
for the new machine.
Error recovery strategies —
panic mode, phrase-level (statement mode) recovery, error productions, global correction.
Panic mode: when a parser encounters an error anywhere in the statement, it ignores the
rest of the statement, discarding input from the erroneous token up to a synchronizing
delimiter such as a semicolon. This is the easiest error-recovery strategy, and it prevents
the parser from entering an infinite loop.
Phrase-level recovery: the parser performs a local correction on the remaining input, e.g.
replacing a comma by a semicolon, and continues parsing.
Error productions: the grammar is augmented with productions for common errors, so
that erroneous constructs can be recognized and reported precisely.
Global correction: the parser chooses the minimal sequence of changes that turns the
erroneous input into a valid program; this is costly and mainly of theoretical interest.

Part-C
Q.1 Elaborate left recursion? State the rules to remove left recursion from the
grammar. Eliminate left recursion from the following grammar.
S->L=R/R L->*R/id R->L
Ans. Left Recursion: Left recursion occurs in a grammar when a non-terminal A can
derive a string that starts with itself (A ⇒+ Aα), directly or indirectly. It leads to infinite
recursion during top-down parsing and must be eliminated for the grammar to be suitable
for predictive parsing.

Rule to remove direct left recursion: for a non-terminal with productions A → Aα | β,
where β does not begin with A, rewrite as:
 A → βA′
 A′ → αA′ | ε
where ε represents the empty string. Indirect left recursion is removed by first
substituting productions so that the recursion becomes direct, and then applying the
same rule.

The given grammar is:

S → L=R | R

L → *R | id
R → L

Observe that no non-terminal here derives a string beginning with itself: S's alternatives
begin with L or R, L's begin with the terminals * or id, and R's begins with L, and none of
these chains leads back to the non-terminal on the left. So this grammar contains no left
recursion, and the elimination rules leave it unchanged.

What the grammar does have is a common-prefix problem in S: since R → L, both
alternatives of S can begin with an L. Substituting L for R in S → R gives:
 S → L=R | L
and left factoring then yields:

S → LS′

S′ → =R ∣ ε

L → *R ∣ id

R → L

This grammar has no left recursion and no common prefixes, and is therefore suitable for
top-down parsing.
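The A → Aα | β rule can be written directly as code (a sketch; productions are lists of symbols, and the fresh non-terminal is named by appending a prime):

```python
def eliminate_direct_left_recursion(nt, productions):
    """Rewrite A -> Aα1 | ... | β1 | ... as A -> β A', A' -> α A' | ε."""
    alphas = [p[1:] for p in productions if p and p[0] == nt]  # recursive tails
    betas  = [p for p in productions if not p or p[0] != nt]
    if not alphas:
        return {nt: productions}       # no direct left recursion: unchanged
    new = nt + "'"
    return {
        nt:  [beta + [new] for beta in betas],
        new: [alpha + [new] for alpha in alphas] + [["ε"]],
    }
```

For E → E+T | T this yields E → TE', E' → +TE' | ε, while the S → L=R | R productions come back unchanged, confirming they are not left-recursive.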

Q.2 Consider the following grammar.


E->E+T|T
T->T*F|F
F->(E) | id
a. Remove left recursion from the grammar.
b. Construct a predictive parsing table.
c. Design Stack Implementation.
Ans. a. Remove Left Recursion:

The given grammar is:

E→E+T ∣ T

T→T∗F ∣ F

F→(E) ∣ id

To remove left recursion:


1. Factor out common prefixes.
2. Introduce new non-terminals for each left-recursive production.

Modified grammar:

E→TE′

E′→+TE′ ∣ ϵ

T→FT'

T′→∗FT′ ∣ ϵ

F→(E) ∣ id

b. Construct Predictive Parsing Table:

To construct the predictive parsing table, we need the First and Follow sets for each
non-terminal.

 First Sets:
 FIRST(E) = FIRST(T) = FIRST(F) = { (, id }
 FIRST(E′) = { +, ε }
 FIRST(T′) = { *, ε }
 Follow Sets:
 FOLLOW(E) = { ), $ }
 FOLLOW(E′) = { ), $ }
 FOLLOW(T) = { +, ), $ }
 FOLLOW(T′) = { +, ), $ }
 FOLLOW(F) = { +, *, ), $ }
 Predictive Parsing Table:

Non-terminal   id        (         )        +          *          $
E              E→TE′     E→TE′
E′                                 E′→ε     E′→+TE′               E′→ε
T              T→FT′     T→FT′
T′                                 T′→ε     T′→ε      T′→*FT′     T′→ε
F              F→id      F→(E)

c. Stack Implementation:

The parser uses a stack initialized to $E and an input buffer terminated by $. At each
step the stack top X is compared with the current input symbol a: if X = a, both are
popped/advanced; if X is a non-terminal, it is replaced on the stack by the production in
table entry M[X, a] (pushed in reverse); the input is accepted when stack and input both
reach $. The start of the parse of id+id*id:

Stack        Input          Action
$E           id+id*id$      E → TE′
$E′T         id+id*id$      T → FT′
$E′T′F       id+id*id$      F → id
$E′T′id      id+id*id$      match id
$E′T′        +id*id$        T′ → ε
$E′          +id*id$        E′ → +TE′
$E′T+        +id*id$        match +
$E′T         id*id$         T → FT′ … and so on, until both reach $ (accept).
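The table-driven algorithm can be sketched as follows (an illustrative sketch; the table entries are the standard LL(1) table for this grammar, with ε-productions stored as empty lists):

```python
# Predictive parsing table for the left-recursion-free expression grammar.
TABLE = {
    ("E",  "id"): ["T", "E'"], ("E",  "("): ["T", "E'"],
    ("E'", "+"):  ["+", "T", "E'"],
    ("E'", ")"):  [], ("E'", "$"): [],          # E' -> ε
    ("T",  "id"): ["F", "T'"], ("T",  "("): ["F", "T'"],
    ("T'", "*"):  ["*", "F", "T'"],
    ("T'", "+"):  [], ("T'", ")"): [], ("T'", "$"): [],
    ("F",  "id"): ["id"], ("F",  "("): ["(", "E", ")"],
}
NONTERMS = {"E", "E'", "T", "T'", "F"}

def ll1_parse(tokens):
    """Return True iff the token list (without $) is accepted."""
    stack = ["$", "E"]
    input_ = tokens + ["$"]
    i = 0
    while stack:
        top = stack.pop()
        a = input_[i]
        if top in NONTERMS:
            prod = TABLE.get((top, a))
            if prod is None:
                return False             # empty table entry: syntax error
            stack.extend(reversed(prod)) # push RHS, leftmost symbol on top
        elif top == a:
            i += 1                       # match terminal (or the end marker $)
        else:
            return False
    return i == len(input_)
```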

Q.3 Classify different phases of compiler ? Explain each phase in detail. Also
give each phase result for given statement.
Ans.
A compiler is a complex software system that translates high-level programming
languages into machine code or an intermediate code. The compilation process is
divided into several phases, each performing a specific set of tasks. The main phases of a
compiler are:

1. Lexical Analysis (Scanner):


 Task: Breaks the source code into tokens (smallest meaningful units)
and removes comments and whitespaces.
 Result for the Statement: For the statement int x = 10;, the lexical
analysis phase produces the tokens int, x, =, 10, and ;.
2. Syntax Analysis (Parser):
 Task: Constructs a parse tree or abstract syntax tree (AST) based on the
grammar rules of the programming language.
 Result for the Statement: The parse tree for int x = 10; represents the
syntactic structure of the statement according to the language's grammar.
3. Semantic Analysis:
 Task: Checks the semantics of the program, ensuring that it adheres to
the language's rules and constraints. Detects semantic errors.
 Result for the Statement: Verifies that the variable x is declared, and
the assignment is semantically correct.
4. Intermediate Code Generation:
 Task: Generates an intermediate representation of the source code,
which is easier to optimize and target various platforms.
 Result for the Statement: Produces an intermediate code representation
that captures the essential operations of the statement.
5. Code Optimization:
 Task: Improves the intermediate code to make the resulting machine
code more efficient.
 Result for the Statement: Optimizes the intermediate code for performance
improvements.
6. Code Generation:
 Task: Translates the optimized intermediate code into the target machine
code or assembly code.
 Result for the Statement: Generates machine code or assembly code
corresponding to the original statement for the target architecture.
7. Code Embedding and Linking:
 Task: Embeds the generated code into an executable file, resolves addresses,
and links it with other compiled code and libraries.
 Result for the Statement: Integrates the machine code for the statement
into the final executable file.
8. Error Handling:
 Task: Identifies and reports errors at various stages of compilation.
 Result for the Statement: Reports any lexical, syntactic, or semantic
errors in the source code, providing error messages and locations.
9. Symbol Table Management:
 Task: Manages information about identifiers, such as variables and
functions, to assist in the compilation process.
 Result for the Statement: Populates and maintains a symbol table with
information about the variable x and other symbols in the program.

These phases together ensure the correct translation of high-level source code into
executable machine code while performing necessary optimizations and error checks.
The exact details and sub-stages within each phase can vary based on the specific
compiler and language being used.

Part-A
Q.1 Explain the term Grammar.
Ans. It is a finite set of formal rules for generating syntactically correct sentences or
meaningful correct sentences.
Any Grammar can be represented by 4 tuples – <N, T, P, S>
 N – Finite Non-Empty Set of Non-Terminal Symbols.
 T – Finite Set of Terminal Symbols.
 P – Finite Non-Empty Set of Production Rules.
 S – Start Symbol (Symbol from where we start producing our sentences or
strings).

Q.2 What is Predictive parser?


Ans. A predictive parser is a method that implements the technique of top-down parsing
without backtracking. It is an efficient way of executing recursive-descent parsing by
explicitly managing a stack, choosing which production to apply by looking at the next
input symbol.

Q.3 What do you mean by Peephole Optimization?


Ans. Peephole optimization is an optimization technique performed on a small set of
compiler-generated instructions; the small set is known as the peephole or window.
Peephole optimization involves changing the small set of instructions to an equivalent
set that has better performance.
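A tiny pass in this style might look as follows (a sketch with made-up (op, src, dst) instruction tuples, not a real instruction set): it slides a two-instruction window over the code and deletes a load that immediately re-reads a value just stored.

```python
def peephole(instrs):
    """Remove the redundant pair MOV R,x ; MOV x,R — after the store,
    the load back into R is useless. Instructions are (op, src, dst)."""
    out = []
    i = 0
    while i < len(instrs):
        cur = instrs[i]
        nxt = instrs[i + 1] if i + 1 < len(instrs) else None
        if (nxt and cur[0] == "MOV" and nxt[0] == "MOV"
                and cur[1:] == (nxt[2], nxt[1])):
            out.append(cur)   # keep the store, drop the redundant load
            i += 2
        else:
            out.append(cur)
            i += 1
    return out
```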

Q.4 Differentiate Abstract syntax tree and DAG representation of intermediate


code.
Ans. An Abstract Syntax Tree (AST) is a simplified parse tree. It retains the syntactic
structure of the code.

A Directed Acyclic Graph (DAG) is a graphical representation of symbolic expressions
in which any two provably equal expressions share a single node.
Q.5 Define left factoring in the following grammar. E->E+E/E*E/a/b.

Ans. The alternatives E → E+E and E → E*E share the common prefix E. Factoring it
out gives the left-factored grammar:

E → E E′ | a | b

E′ → +E | *E

Q.6 What do you mean by Activation code?

Ans. In compiler design (run-time environments), this refers to the activation record: a
contiguous block of storage, usually on the run-time stack, that holds the information
needed by a single activation (execution) of a procedure, such as the return value, actual
parameters, control link, access link, saved machine status, local data, and temporaries.

Q.7 Define Top down & Bottom up Parsing.


Ans. Top-down parsing is a parsing technique that first looks at the highest level of
the parse tree and works down the parse tree by using the rules of grammar while
Bottom-up Parsing is a parsing technique that first looks at the lowest level of the
parse tree and works up the parse tree by using the rules of grammar.

Q.8 Write short note on Storage allocation strategies.


Ans. A compiler is a program that converts HLL (High-Level Language) to LLL
(Low-Level Language), such as machine language. Storage allocation strategies are
needed in compiler design because choosing the right strategy directly affects the
performance of the software.
Storage Allocation Strategies
There are mainly three types of Storage Allocation Strategies:
1. Static Allocation
2. Heap Allocation
3. Stack Allocation

Q.9 Explain the basic blocks.


Ans. A basic block is a straight-line code sequence with no branches in except to the
entry and no branches out except at the exit. A basic block is a set of statements that
always execute one after another, in sequence. The first task is to partition a sequence of
three-address code into basic blocks.

Q.10 Define the syntax directed definition.


Ans. Syntax Directed Definition (SDD) is a kind of abstract specification. It is a
generalization of context-free grammar in which each grammar production A → α has
associated with it a set of semantic rules of the form b = f(c1, c2, …, ck), where b is an
attribute computed by the function f. An attribute can be a string, a number, a type,
or a memory location.
Example: E --> E1 + T {E.val = E1.val + T.val}

Part-B
Q.1 Construct the DAG and generate the code for the given block:
T1 = a + b
T2 = a - b
T3 = T1 * T2
T4 = T1 - T3
T5 = T4 + T3
Ans. DAG construction: the leaves are a and b. The interior nodes are + for T1 = a + b,
− for T2 = a − b, * for T3 with children T1 and T2, − for T4 with children T1 and T3,
and + for T5 with children T4 and T3. Because T1 and T3 each have two parents in the
DAG, they are computed only once and reused.

Generated code (two-address instructions, dst := dst op src):
MOV a, R0
ADD b, R0      ; R0 = T1
MOV a, R1
SUB b, R1      ; R1 = T2
MOV R0, R2
MUL R1, R2     ; R2 = T3 = T1 * T2
SUB R2, R0     ; R0 = T4 = T1 - T3
ADD R2, R0     ; R0 = T5 = T4 + T3
MOV R0, T5

The DAG helps in identifying common subexpressions and optimizing the code by
reusing intermediate results: T1 and T3 are evaluated only once, and the registers
holding them are reused in subsequent operations.
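The node-sharing that makes a DAG useful can be sketched with local value numbering (an illustrative sketch; the input format (dst, left, op, right) is an assumption of the example):

```python
def build_dag(code):
    """Create one node per distinct (op, left, right) triple; a repeated
    triple reuses the existing node, exposing common subexpressions."""
    nodes = {}   # (op, left_id, right_id) -> node id
    var = {}     # name -> id of the node currently holding its value
    order = []   # nodes in creation order

    def node_of(x):
        if x not in var:                 # first use of a name: make a leaf
            var[x] = len(order)
            order.append(("leaf", x))
        return var[x]

    for dst, left, op, right in code:
        key = (op, node_of(left), node_of(right))
        if key not in nodes:             # create each interior node only once
            nodes[key] = len(order)
            order.append(key)
        var[dst] = nodes[key]
    return order, var
```

Feeding the block above (plus a deliberate duplicate of a + b) shows that the repeated expression maps to the node already built for T1 instead of creating a new one.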

Q.2 What do you mean by basic block? Consider the following program
segment:
for r from 1 to 10 do
for c from 1 to 10 do
a[r,c]=0.0;
for r from 1 to 10 do
a[r,c]=1.0;
Find the basic block and construct the flow graph.
Ans. 1) i=1 // Leader 1 (First statement)
2) j=1 // Leader 2 (Target of 11th statement)
3) t1 = 10 * i // Leader 3 (Target of 9th statement)
4) t2 = t1 + j
5) t3 = 8 * t2
6) t4 = t3 - 88
7) a[t4] = 0.0
8) j = j + 1
9) if j <= 10 goto (3) // Leader 4 (Immediately following Conditional goto
statement)
10) i = i + 1
11) if i <= 10 goto (2) // Leader 5 (Immediately following Conditional goto
statement)
12) i = 1 // Leader 6 (Immediately following Conditional goto
statement)
13) t5 = i - 1 // Leader 7 (Target of 17th statement)
14) t6 = 88 * t5
15) a[t6] = 1.0
16) i = i + 1
17) if i <= 10 goto (13) // Leader 8 (Immediately following Conditional goto
statement)
There are six basic blocks for the above-given code, which are:
 B1 for statement 1
 B2 for statement 2
 B3 for statements 3-9
 B4 for statements 10-11
 B5 for statement 12
 B6 for statements 13-17
Flow graph: B1 → B2; B2 → B3; B3 → B3 (if j <= 10 goto 3) and B3 → B4; B4 → B2
(if i <= 10 goto 2) and B4 → B5; B5 → B6; B6 → B6 (if i <= 10 goto 13) and B6 → exit.
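The leader rules used to find these blocks can be expressed as a short function (a sketch; instructions are modeled as (text, jump_target) pairs with 0-based indices, which is an assumption of the example):

```python
def find_leaders(tac):
    """Leaders: the first instruction, every jump target, and every
    instruction immediately following a (conditional) jump."""
    leaders = {0}
    for i, (_, target) in enumerate(tac):
        if target is not None:
            leaders.add(target)              # rule 2: target of a jump
            if i + 1 < len(tac):
                leaders.add(i + 1)           # rule 3: instruction after a jump
    return sorted(leaders)

def basic_blocks(tac):
    """A block runs from one leader up to (not including) the next."""
    ls = find_leaders(tac) + [len(tac)]
    return [list(range(ls[k], ls[k + 1])) for k in range(len(ls) - 1)]
```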

Q.3 Translate the arithmetic expression. (a+b)*(c+d)+(a+b+c) into


a) Syntax tree
b) Three Address code
c) Quadruple
d) Triples
Ans. Let's translate the given arithmetic expression (a+b)*(c+d)+(a+b+c) into the various
intermediate representations:

a) Syntax Tree:

+
 ├── *
 │    ├── + (a, b)
 │    └── + (c, d)
 └── +
      ├── + (a, b)
      └── c

b) Three Address Code:

1. t1 = a + b
2. t2 = c + d
3. t3 = t1 * t2
4. t4 = t1 + c   (a + b is a common subexpression, so t1 is reused)
5. t5 = t3 + t4

c) Quadruples:

 #    op   arg1  arg2  result
 (0)  +    a     b     t1
 (1)  +    c     d     t2
 (2)  *    t1    t2    t3
 (3)  +    t1    c     t4
 (4)  +    t3    t4    t5

d) Triples:

 #    op   arg1  arg2
 (0)  +    a     b
 (1)  +    c     d
 (2)  *    (0)   (1)
 (3)  +    (0)   c
 (4)  +    (2)   (3)

Q.4 What is an LALR(1) grammar? Construct LALR parsing table for the
following grammar:
S -> CC, C -> cC / d

Ans. LALR(1) stands for "Look-Ahead LR(1)". An LALR(1) parser is obtained from the
canonical LR(1) parser by merging states that have the same core (first component of the
items) but different lookaheads (second component). This keeps the table as small as an
SLR table while handling a broader class of grammars than SLR.

An LALR(1) grammar is a context-free grammar for which the merged table contains no
conflicts, so an LALR(1) parser, using one symbol of lookahead, can be constructed for it.

First, construct the LR(1) sets of items for the augmented grammar S' → S, S → CC,
C → cC | d:
I0: [S'→.S, $], [S→.CC, $], [C→.cC, c/d], [C→.d, c/d]
I1: [S'→S., $]
I2: [S→C.C, $], [C→.cC, $], [C→.d, $]
I3: [C→c.C, c/d], [C→.cC, c/d], [C→.d, c/d]
I4: [C→d., c/d]
I5: [S→CC., $]
I6: [C→c.C, $], [C→.cC, $], [C→.d, $]
I7: [C→d., $]
I8: [C→cC., c/d]
I9: [C→cC., $]

States I3 and I6 can be merged because they have the same core but a different second
component of lookahead; they combine into I36. Similarly, I4 and I7 combine into I47,
and I8 and I9 into I89.

The resulting LALR parsing table (productions: 1. S → CC, 2. C → cC, 3. C → d):

State    c      d      $      S    C
0        s36    s47           1    2
1                      acc
2        s36    s47                5
36       s36    s47                89
47       r3     r3     r3
5                      r1
89       r2     r2     r2

The table contains no conflicts, so the grammar is LALR(1).

Q.5 Solve the input id+id*id, using operator precedence parser for the following
grammar:
T->T+T/T*T/id
Ans. We construct the operator precedence table as-
id + x $

id > > >

+ < > < >

x < > > >

$ < < <

Operator Precedence Table

Parsing Given String-


Given string to be parsed is id + id * id.
We follow the following steps to parse the given string-
Step-01:
We insert the $ symbol at both ends of the string as-
$ id + id * id $
We insert precedence relations between the string symbols as-
$ < id > + < id > * < id > $
Step-02:
We scan and parse the string, reducing each handle as we find it-
$ < id > + < id > * < id > $
$ T + < id > * < id > $        (reduce T -> id)
$ T + T * < id > $             (reduce T -> id)
$ T + T * T $                  (reduce T -> id)
Considering only the terminals, the relations between them are-
$ < + < * > $
$ < + > $                      (reduce T -> T * T)
$ $                            (reduce T -> T + T)
Accepted.
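The table-driven steps above can be sketched in Python. This is an illustrative sketch under simplifying assumptions (a single non-terminal T, no error recovery); op_precedence_parse is a hypothetical helper, not a standard API.

```python
# Illustrative sketch of operator-precedence parsing for T -> T+T | T*T | id.
# PREC[top][lookahead]: '<' means shift, '>' means reduce (table from above).

PREC = {
    'id': {'+': '>', '*': '>', '$': '>'},
    '+':  {'id': '<', '+': '>', '*': '<', '$': '>'},
    '*':  {'id': '<', '+': '>', '*': '>', '$': '>'},
    '$':  {'id': '<', '+': '<', '*': '<'},
}

def op_precedence_parse(tokens):
    stack = ['$']
    tokens = tokens + ['$']
    i = 0
    while True:
        # topmost *terminal* on the stack (non-terminals are ignored)
        top = next(s for s in reversed(stack) if s != 'T')
        look = tokens[i]
        if top == '$' and look == '$':
            return True                          # accept
        rel = PREC.get(top, {}).get(look)
        if rel == '<':                           # shift
            stack.append(look)
            i += 1
        elif rel == '>':                         # reduce the handle on top
            if stack[-1] == 'id':
                stack[-1] = 'T'                  # T -> id
            elif stack[-3:] in (['T', '+', 'T'], ['T', '*', 'T']):
                del stack[-2:]                   # T -> T+T or T -> T*T
            else:
                return False
        else:
            return False                         # blank table entry: error

print(op_precedence_parse(['id', '+', 'id', '*', 'id']))   # True
```

Because + < * in the table, the * handle is reduced before the + handle, matching the trace above.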

Part-C
Q.1 Generate code for the following C statements for a simple target machine,
assuming all variables are static and three registers are available.
t:=a-b
u:=a-c
v:=t+u
d:=v+u
Ans.

Statement    | Code Generated | Register Descriptor | Address Descriptor
             |                | registers empty     |
t := a - b   | MOV a, R0      |                     |
             | SUB b, R0      | R0 contains t       | t in R0
u := a - c   | MOV a, R1      | R0 contains t       | t in R0
             | SUB c, R1      | R1 contains u       | u in R1
v := t + u   | ADD R1, R0     | R0 contains v       | v in R0
             |                | R1 contains u       | u in R1
d := v + u   | ADD R1, R0     | R0 contains d       | d in R0
             | MOV R0, d      |                     | d in R0 and memory

Register Descriptor:

 R0: Contains the value of 't' after the operation t := a - b. Later used for stor-
ing 'v' and 'd'.
 R1: Contains the value of 'u' after the operation u := a - c.

Address Descriptor:

 t: Represents the memory location or register where the variable 't' is stored. In
this case, it is in register R0.
 u: Represents the memory location or register where the variable 'u' is stored.
It is initially in register R1 after the operation u := a - c.
 v: Represents the memory location or register where the variable 'v' is stored.
In this case, it is in register R0 after the operation v := t + u.
 d: Represents the memory location or register where the variable 'd' is stored.
It is in register R0 after the operation d := v + u.
Code Generation: The generated assembly code performs arithmetic operations and
uses registers to store intermediate values. The operations are sequenced to achieve
the desired results for each assignment statement.
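The register/address-descriptor bookkeeping described above can be sketched as follows. This is a simplified illustration (no register spilling, no descriptor invalidation, only MOV/ADD/SUB); the codegen helper and statement encoding are assumptions for illustration.

```python
# Simplified sketch of code generation with register and address descriptors
# for statements of the form x := y OP z, given three free registers.

def codegen(stmts, nregs=3):
    regs = {f"R{i}": None for i in range(nregs)}   # register descriptor
    addr = {}                                      # address descriptor
    out = []
    for x, y, op, z in stmts:
        loc = addr.get(y)
        if loc in regs:                            # y is already in a register
            r = loc
        else:                                      # load y into a free register
            r = next(k for k, v in regs.items() if v is None)
            out.append(f"MOV {y}, {r}")
        instr = {'+': 'ADD', '-': 'SUB'}[op]
        out.append(f"{instr} {addr.get(z, z)}, {r}")
        regs[r] = x                                # r now holds x
        addr[x] = r
    return out, addr

stmts = [('t', 'a', '-', 'b'), ('u', 'a', '-', 'c'),
         ('v', 't', '+', 'u'), ('d', 'v', '+', 'u')]
code, addr = codegen(stmts)
print('\n'.join(code))
```

The final MOV R0, d store to memory is omitted here; a full generator would also store live values back to memory at the end of the basic block.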

Q.2 Let G be a formal grammar with the following production rules


D->type tlist;
tlist->tlist , id / id
type->int/float
a) Explain the role of terminal symbol $.
b) Construct a LR(0) parse table for the grammar.
c) What kind of conflict does the resulting parse table contain?
d) Explain two strategies to resolve this conflict.
Ans. a) Role of Terminal Symbol $:
In formal grammars and parsing, the terminal symbol $ typically represents the end-
of-input marker. It is appended to the input string to signify the end of the source
code being analyzed by the parser. The presence of $ helps the parser recognize when
it has reached the end of the input and can successfully complete parsing.

b) Constructing a LR(0) Parse Table:

The LR(0) parse table is constructed based on the LR(0) items for each production in
the grammar. The items indicate the current position of the parser within a
production. For the given grammar:

Grammar (productions numbered):
1. D -> type tlist ;
2. tlist -> tlist , id
3. tlist -> id
4. type -> int
5. type -> float

Augmented Grammar:
D' -> D

LR(0) Item Sets:
I0: D' -> .D          D -> .type tlist ;    type -> .int    type -> .float
I1: D' -> D.
I2: D -> type . tlist ;    tlist -> .tlist , id    tlist -> .id
I3: type -> int .
I4: type -> float .
I5: tlist -> id .
I6: D -> type tlist . ;    tlist -> tlist . , id
I7: D -> type tlist ; .
I8: tlist -> tlist , . id
I9: tlist -> tlist , id .

LR(0) Parse Table:

State |  int   float   id     ,      ;      $   |  D   tlist  type
------+------------------------------------------+------------------
  0   |  s3    s4                                |  1          2
  1   |                                    acc   |
  2   |                s5                        |       6
  3   |  r4    r4      r4     r4     r4     r4   |
  4   |  r5    r5      r5     r5     r5     r5   |
  5   |  r3    r3      r3     r3     r3     r3   |
  6   |                       s8     s7          |
  7   |  r1    r1      r1     r1     r1     r1   |
  8   |                s9                        |
  9   |  r2    r2      r2     r2     r2     r2   |
 sX: Shift to state X.
 rX: Reduce by production X.
 acc: Accept.

c) Kind of Conflict in the Parse Table:

With the left-recursive list as written (tlist -> tlist , id | id), the LR(0) table is in
fact conflict-free: every state either contains only shift items or consists of a single
complete item. A conflict does appear if the list is written right-recursively (tlist ->
id , tlist | id): the state containing both tlist -> id . , tlist and tlist -> id . then has a
shift-reduce conflict on ',', because the parser cannot decide whether to shift the
comma or reduce by tlist -> id.

d) Strategies to Resolve the Conflict:

1. Precedence and Associativity Declarations:


 Introduce precedence and associativity declarations for the conflicting
terminal symbols ('int', 'float', ',', ';', '$').
 Specify the precedence and associativity of these symbols using declara-
tions in the grammar.
 For example, you can declare that 'int' and 'float' have higher precedence
than ',' and ';'.
 The declarations guide the parser in making the correct reduction deci-
sions.
2. Grammar Restructuring:
 Restructure the grammar to eliminate the conflict.
 This may involve breaking down the conflicting production rules into
separate rules or introducing additional non-terminals.
 For example, you might modify the grammar to distinguish between dif-
ferent contexts where 'int' and 'float' are used, thereby resolving the am-
biguity.
Q.3 Explain the following:
a) How data structure used in symbol table? Also explain static verses
dynamic storage allocation.
b) What do you mean by CFG? Give distinction between regular and
context free grammar and limitation of context free grammar.
Ans. a) Data Structure Used in Symbol Table and Static vs. Dynamic Storage Allocation:

Data Structure Used in Symbol Table:


A symbol table is a data structure used by compilers to store information about the
variables, functions, and other entities encountered during the compilation process. It
maps identifiers in the source code to information about their properties, such as type,
scope, memory location, etc.

Common data structures used in symbol tables include:

1. Hash Tables: Efficient for quick lookups. The identifier is hashed, and the
hash value is used to index into the table.
2. Binary Search Trees (BST): Sorted structure where identifiers are arranged in
a tree. Allows for efficient searching.
3. Linked Lists: Simple structure where each entry points to the next one. Easy
to implement but may not be as efficient for large symbol tables.
4. Arrays: For small symbol tables, an array may be used with identifiers in-
dexed directly. The drawback is that it may waste space for unused entries.
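As an illustrative sketch, a hash-table-based symbol table with nested scopes can be written as a chain of Python dicts (Python dicts are hash tables); the class and method names here are assumptions for illustration.

```python
# Illustrative sketch: a symbol table as a chain of hash tables (dicts),
# one per scope, searched innermost-first.

class SymbolTable:
    def __init__(self, parent=None):
        self.entries = {}          # name -> attributes (type, storage, ...)
        self.parent = parent       # enclosing scope, if any

    def insert(self, name, **attrs):
        self.entries[name] = attrs

    def lookup(self, name):
        """Search this scope, then each enclosing scope in turn."""
        scope = self
        while scope is not None:
            if name in scope.entries:
                return scope.entries[name]
            scope = scope.parent
        return None                # undeclared

globals_ = SymbolTable()
globals_.insert('count', type='int', storage='static')
locals_ = SymbolTable(parent=globals_)
locals_.insert('x', type='float')
print(locals_.lookup('count'))     # found in the enclosing (global) scope
```

Opening a new scope creates a new table whose parent is the current one; closing the scope simply discards it.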
Static vs. Dynamic Storage Allocation:
 Static Storage Allocation:
 Memory is allocated for variables at compile-time.
 The size and layout of data structures are known at compile-time.
 Examples include global variables and statically declared arrays.
 The main advantage is efficiency, as memory is allocated once during
compilation.
 The main drawback is inflexibility; it may not support dynamic memory
needs.
 Dynamic Storage Allocation:
 Memory is allocated at run-time during program execution.
 Examples include dynamically allocated memory using malloc in C or
new in C++.
 Provides flexibility, but comes with the cost of runtime overhead.
 Dynamic allocation allows for structures like linked lists and dynamic
arrays.
b) CFG (Context-Free Grammar), Distinction Between Regular and Context-Free Grammar,
and Limitations of Context-Free Grammar:

CFG (Context-Free Grammar):


 Definition: A Context-Free Grammar (CFG) is a formal grammar with pro-
duction rules, consisting of variables, terminals, a start symbol, and production
rules that define how variables can be replaced by sequences of other variables
and terminals.
 Components of CFG:
 Variables (Non-terminals): Symbols representing sets of strings.
 Terminals: Symbols that appear in the final strings.
 Production Rules: Define how variables can be replaced.
 Start Symbol: The initial variable from which the derivation begins.

Distinction Between Regular and Context-Free Grammar:


 Regular Grammar:
 Simplest type of formal grammar.
 Expressive power is limited.
 Can be represented by regular expressions.
 Used to define regular languages.
 Context-Free Grammar:
 More expressive than regular grammars.
 Used to describe context-free languages.
 Supports nested structures, such as nested parentheses.
 Requires more powerful parsing algorithms.

Limitations of Context-Free Grammar:


1. Nested Structures: CFGs struggle to express nested structures efficiently. For
example, matching nested parentheses requires a context-free grammar but is
more elegantly handled with context-sensitive grammars or parsing algorithms.
2. Cross-Serial Dependencies: CFGs cannot capture cross-serial dependencies
where the relationship between non-terminals is not strictly hierarchical. This
is a limitation for certain natural language constructs.
3. Ambiguity: CFGs can be ambiguous, leading to multiple parse trees for the
same input. Ambiguity can be problematic in language design and compiler
construction.
4. Limited Expressiveness: CFGs have limitations in expressing certain lan-
guage features, leading to the need for more powerful formalisms like context-
sensitive grammars or programming language semantics.

1. Define preprocessor. What are the functions of pre-processor?

Preprocessors are programs that process the source code before compilation. In C, the
preprocessor runs as one of the steps between writing a program and executing it. Its
main functions are:
 File inclusion (#include): inserting the contents of header files into the source file.
 Macro expansion (#define): replacing macro names with their definitions.
 Conditional compilation (#ifdef, #ifndef, #else, #endif): including or excluding code at compile time.
 Line control (#line): adjusting the reported line numbers and file names for diagnostics.
2. Discuss about the Syntax Error Handling.
The tasks of the error handling process are to detect each error, report it to the user, and then
apply a recovery strategy to handle the error. Throughout this process, the processing time of
the program should not become noticeably slow.
Functions of Error Handler:
 Error Detection
 Error Report
 Error Recovery
3. Differentiate between shift-reduce and Operator Precedence Parsers.

Operator precedence parsing is a kind of shift-reduce parsing method. It is applied to a small class
of grammars called operator grammars.

A grammar is said to be an operator precedence grammar if it has two properties:

o No production has an ε (empty string) on its right-hand side.


o No two non-terminals are adjacent in any right-hand side.

Precedence relations can only be established between the terminals of the grammar; non-
terminals are ignored. A general shift-reduce parser, by contrast, decides between shifting and
reducing using a parse table built from the full grammar (for example, an LR table), rather than
from precedence relations between terminals.

4. What are the benefits of intermediate code generation?

Easier to implement: Intermediate code generation can simplify the code generation process by
reducing the complexity of the input code, making it easier to implement.
Facilitates code optimization: Intermediate code generation can enable the use of various code
optimization techniques, leading to improved performance and efficiency of the generated code.
Platform independence: Intermediate code is platform-independent, meaning that it can be
translated into machine code or bytecode for any platform.
Code reuse: Intermediate code can be reused in the future to generate code for other platforms or
languages.
Easier debugging: Intermediate code can be easier to debug than machine code or bytecode, as it
is closer to the original source code.

5. What are the various attributes of a Symbol Table?

Symbol table is an important data structure created and maintained by compilers in order to store
information about the occurrence of various entities such as variable names, function names, ob-
jects, classes, interfaces, etc. Symbol table is used by both the analysis and the synthesis parts of a
compiler.

A symbol table may serve the following purposes depending upon the language in hand:
 To store the names of all entities in a structured form in one place.
 To verify whether a variable has been declared.
 To implement type checking, by verifying that assignments and expressions are semantically correct.
 To determine the scope of a name (scope resolution).

6. Mention the issues to be considered while applying the techniques for


code optimization.
Increased compilation time: Code optimization can significantly increase the compilation time,
which can be a significant drawback when developing large software systems.
Increased complexity: Code optimization can result in more complex code, making it harder to
understand and debug.
Potential for introducing bugs: Code optimization can introduce bugs into the code if not done
carefully, leading to unexpected behavior and errors.
Difficulty in assessing the effectiveness: It can be difficult to determine the effectiveness of
code optimization, making it hard to justify the time and resources spent on the process.

7. Briefly describe about the Lexical errors.


When the token pattern does not match the prefix of the remaining input, the lexical analyzer gets
stuck and has to recover from this state to analyze the remaining input. In simple words, a lexical
error occurs when a sequence of characters does not match the pattern of any token. It is detected
during the lexical analysis phase of compilation.
Types of Lexical Error:
Types of lexical error that can occur in a lexical analyzer are as follows:
1. Exceeding the length of identifiers or numeric constants.
For example, int a = 2147483648; is a lexical error on a machine with 32-bit signed
integers, since a signed integer lies between −2,147,483,648 and 2,147,483,647.
2. Appearance of illegal characters.
For example, total = price + tax$; is a lexical error since the illegal character $ appears
at the end of the statement.
3. Unmatched string.

For example, a comment opened with /* but never closed is a lexical error since the
ending of the comment "*/" is not present but the beginning is.
4. Spelling Error
5. Replacing a character with an incorrect character.
8. What are the functions used to create the nodes of syntax trees?
A syntax tree’s nodes can all be performed as data with numerous fields. One element of the node
for an operator identifies the operator, while the remaining field contains a pointer to the operand
nodes. The operator is also known as the node’s label. The nodes of the syntax tree for expres-
sions with binary operators are created using the following functions. Each function returns a ref-
erence to the node that was most recently created.
1. mknode (op, left, right): It creates an operator node with the name op and two fields, contain-
ing left and right pointers.
2. mkleaf (id, entry): It creates an identifier node with the label id and the entry field, which is a
reference to the identifier’s symbol table entry.
3. mkleaf (num, val): It creates a number node with the label num and a field containing the
number's value, val. For example, to build a syntax tree for the expression a - 4 + c, these
functions are called in sequence; p1, p2, …, p5 are the pointers they return, and the mkleaf
calls receive pointers to the symbol table entries for the identifiers ‘a’ and ‘c’.
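A minimal sketch of these three functions in Python, using tuples in place of heap-allocated nodes and strings (entry_a, entry_c) in place of symbol-table pointers, builds the tree for a - 4 + c:

```python
# Illustrative sketch: mknode/mkleaf with tuples standing in for allocated
# nodes and strings standing in for symbol-table entry pointers.

def mknode(op, left, right):
    """Operator node labelled op with pointers to its two operand nodes."""
    return ('node', op, left, right)

def mkleaf_id(entry):
    """Identifier leaf holding a pointer to its symbol-table entry."""
    return ('id', entry)

def mkleaf_num(val):
    """Number leaf holding the constant's value."""
    return ('num', val)

# Build the syntax tree for a - 4 + c:
p1 = mkleaf_id('entry_a')          # leaf for identifier a
p2 = mkleaf_num(4)                 # leaf for the constant 4
p3 = mknode('-', p1, p2)           # node for a - 4
p4 = mkleaf_id('entry_c')          # leaf for identifier c
p5 = mknode('+', p3, p4)           # root node for (a - 4) + c
print(p5)
```

Each call returns a reference to the node it created, exactly as described in the text.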

9. What are the three techniques for constructing LR parsing table?

The three techniques for constructing an LR parsing table are:

o SLR(1) (Simple LR): built from the LR(0) item sets, using FOLLOW sets to decide
where to reduce. It is the simplest and produces the smallest tables, but handles the
fewest grammars.
o Canonical LR(1): built from LR(1) items that carry an explicit one-symbol lookahead.
It is the most powerful of the three, but produces the largest tables.
o LALR(1) (Look-Ahead LR): built by merging canonical LR(1) states that have the
same core. Its tables are as small as SLR's, yet it handles more grammars.

All three techniques start by computing closures of item sets:

o The start item from the given grammar adds itself to the first closure set.
o If an item of the form A -> α . B β is in a closure, where the symbol B after the dot
is a non-terminal, add B's production rules with the dot before the first symbol.
o Repeat the previous step for the newly added items.

10. Discuss the evaluation of semantic rules.

Semantic Analysis is the third phase of the compiler. It makes sure that the declarations
and statements of the program are semantically correct. It is a collection of procedures which
are called by the parser as and when required by the grammar. Both the syntax tree from the
previous phase and the symbol table are used to check the consistency of the given code. Type
checking is an important part of semantic analysis, where the compiler makes sure that each
operator has matching operands.
Semantic Analyzer:
It uses syntax tree and symbol table to check whether the given program is semantically
consistent with language definition. It gathers type information and stores it in either syntax tree
or symbol table. This type information is subsequently used by compiler during intermediate-code
generation.
Semantic Errors:
Errors recognized by semantic analyzer are as follows:
 Type mismatch
 Undeclared variables
 Reserved identifier misuse
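A minimal sketch of the type-checking part of semantic analysis, walking a syntax tree with a symbol table and collecting two of the error kinds listed above (type mismatch, undeclared variable); the node shapes and function name are assumptions for illustration.

```python
# Illustrative sketch of type checking during semantic analysis.
# A node is ('id', name) for a leaf or ('bin', op, left, right) for an operator.

def check_types(node, symtab, errors):
    """Return the node's type, recording semantic errors as they are found."""
    if node[0] == 'id':
        name = node[1]
        if name not in symtab:
            errors.append(f"undeclared variable: {name}")
            return None
        return symtab[name]
    _, op, left, right = node
    lt = check_types(left, symtab, errors)
    rt = check_types(right, symtab, errors)
    if lt and rt and lt != rt:                 # strict: no implicit coercion
        errors.append(f"type mismatch: {lt} {op} {rt}")
        return None
    return lt

symtab = {'x': 'int', 'y': 'float'}
errors = []
check_types(('bin', '+', ('id', 'x'), ('id', 'y')), symtab, errors)
check_types(('bin', '*', ('id', 'x'), ('id', 'z')), symtab, errors)
print(errors)
```

A real compiler would usually apply coercion rules (e.g. int to float) instead of rejecting every mixed-type operation outright.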

Q.11 A) Write a regular expression for identifiers and reserved words. Design the
transition diagrams for them.
B) Explain the three general approaches for the implementation of a Lexical
analyzer.

A) The lexical analyzer needs to scan and identify only a finite set of valid string/token/lexeme
that belongs to the language in hand. It searches for the pattern defined by the language
rules.

Regular expressions have the capability to express finite languages by defining a pattern for
finite strings of symbols. The grammar defined by regular expressions is known as regular
grammar. The language defined by regular grammar is known as regular language.

Operations
The various operations on languages are:
 Union of two languages L and M is written as
L U M = {s | s is in L or s is in M}
 Concatenation of two languages L and M is written as
LM = {st | s is in L and t is in M}
 The Kleene Closure of a language L is written as
L* = Zero or more occurrence of language L.
Notations
If r and s are regular expressions denoting the languages L(r) and L(s), then
 Union : (r)|(s) is a regular expression denoting L(r) U L(s)
 Concatenation : (r)(s) is a regular expression denoting L(r)L(s)
 Kleene closure : (r)* is a regular expression denoting (L(r))*
 (r) is a regular expression denoting L(r)
Precedence and Associativity
 *, concatenation (.), and | (pipe sign) are left associative
 * has the highest precedence
 Concatenation (.) has the second highest precedence.
 | (pipe sign) has the lowest precedence of all.
Representing valid tokens of a language in regular expression
If x is a regular expression, then:
 x* means zero or more occurrences of x,
i.e., it can generate { ε, x, xx, xxx, xxxx, … }
 x+ means one or more occurrences of x,
i.e., it can generate { x, xx, xxx, xxxx, … }, which equals x.x*
 x? means at most one occurrence of x,
i.e., it can generate either {x} or {ε}.
[a-z] is all lower-case alphabets of English language.
[A-Z] is all upper-case alphabets of English language.
[0-9] is all natural digits used in mathematics.
Representing occurrence of symbols using regular expressions
letter = [a – z] or [A – Z]
digit = 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 or [0-9]
sign = [ + | - ]
Representing language tokens using regular expressions
Decimal = (sign)?(digit)+
Identifier = (letter)(letter | digit)*
The only problem left with the lexical analyzer is how to verify the validity of a regular expression
used in specifying the patterns of keywords of a language. A well-accepted solution is to use finite
automata for verification.
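The identifier pattern letter (letter | digit)* and the reserved-word check can be sketched with Python's re module; the RESERVED set below is an illustrative sample, not a complete keyword list.

```python
import re

# Illustrative sketch: identifiers match letter (letter | digit)*; reserved
# words are separated from ordinary identifiers by a table lookup.

RESERVED = {'if', 'else', 'while', 'do', 'int', 'float'}
IDENT = re.compile(r'[A-Za-z][A-Za-z0-9]*')

def classify(lexeme):
    if IDENT.fullmatch(lexeme):
        return 'keyword' if lexeme in RESERVED else 'identifier'
    return 'invalid'

print(classify('while'))    # keyword
print(classify('count1'))   # identifier
print(classify('1count'))   # invalid: must start with a letter
```

This mirrors the transition diagram for identifiers: from the start state, a letter moves to an accepting state that loops on letters and digits; reserved words are then separated by the table lookup rather than by separate diagrams.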

B) Lexical Analysis is the first step of the compiler which reads the source code one character at a time
and transforms it into an array of tokens. The token is a meaningful collection of characters in a
program. These tokens can be keywords including do, if, while etc. and identifiers including x,
num, count, etc. and operator symbols including >,>=, +, etc., and punctuation symbols including
parenthesis or commas. The output of the lexical analyzer phase passes to the next phase, called the
syntax analyzer or parser.
The syntax analyser or parser is also known as parsing phase. It takes tokens as input from lexical
analyser phase. The syntax analyser groups tokens together into syntactic structures. The output of
this phase is parse tree.
Function of Lexical Analysis
The main function of lexical analysis are as follows −
 It can separate tokens from the program and return those tokens to the parser as requested by
it.
 It can eliminate comments, whitespaces, newline characters, etc. from the string.
 It can insert the token into the symbol table.
 Lexical Analysis will return an integer number for each token to the parser.
 Correlating the error messages produced by the compiler with positions in the source
program (for example, by keeping track of newline characters seen).
 It may implement the expansion of macros, if macro pre-processing is handled in this
phase.
LEX generates a Lexical Analyzer as its output by taking a LEX program as its input. A LEX
program is a collection of patterns (regular expressions) and their corresponding actions.
Patterns represent the tokens to be recognized by the lexical analyzer to be generated. For each
pattern, a corresponding NFA will be designed.
There can be n NFAs for n patterns.
Example − If the LEX rules are of the form
P1 { action1 }
P2 { action2 }
...
Pn { actionn }
then an NFA is designed for each corresponding pattern (the individual NFA diagrams are
omitted here).

A start state is taken and, using ε-transitions, all these NFAs are connected to make a combined
NFA (diagram omitted).

The final state of each NFA shows that it has found its token Pi.
It converts the combined NFA to DFA as it is always easy to simulate the behavior of DFA with a
program.
The final state shows which token we have found. If none of the states of DFA includes any final
states of NFA then control returns to an error condition.
If the final state of the DFA includes more than one final state of the NFA, then the pattern
coming first in the translation rules has priority.
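The pattern-priority behaviour described above (longest match wins; on a tie, the pattern declared first wins) can be sketched in Python. This simulates the combined automaton with per-pattern regular expressions rather than building a real DFA; the rule set is an illustrative sample.

```python
import re

# Illustrative sketch of LEX-style pattern priority: at each position, the
# longest match wins, and on a tie the rule declared first wins.

RULES = [
    ('KEYWORD', r'if|while|do'),
    ('ID',      r'[A-Za-z][A-Za-z0-9]*'),
    ('NUM',     r'[0-9]+'),
    ('WS',      r'[ \t\n]+'),
]

def tokenize(text):
    pos, tokens = 0, []
    while pos < len(text):
        best = None
        for name, pat in RULES:
            m = re.compile(pat).match(text, pos)
            # longest match wins; the earlier rule wins ties
            if m and (best is None or m.end() > best[1].end()):
                best = (name, m)
        if best is None:
            raise ValueError(f"lexical error at position {pos}: {text[pos]!r}")
        name, m = best
        if name != 'WS':                     # whitespace is stripped
            tokens.append((name, m.group()))
        pos = m.end()
    return tokens

print(tokenize('while whilex 42'))
```

Here 'while' ties between KEYWORD and ID, so the earlier KEYWORD rule wins; 'whilex' matches longer as an ID, so maximal munch makes it an identifier.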
