Mid Sem Solution (CPTT)
Tokens:
2. Syntax Analysis (Parsing): This phase checks whether the tokens follow correct grammar
rules. It creates a parse tree or an Abstract Syntax Tree (AST).
        =
       / \
      a   +
         / \
        +   3
       / \
      *   b
     / \
    a  -50
3. Semantic Analysis: This phase checks type correctness, declarations, scope, etc. Here, a and b
are declared as float, so the integer literals -50 and 3 are implicitly converted to float. All
operands in the expression are then of type float. The expression is valid, and a can be assigned
the result. So, there are no semantic errors here.
        =
       / \
      a   +
         / \
        +   3.0
       / \
      *   b
     / \
    a  -50.0
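Note: The implicit conversion step can be sketched in Python as below. This is only an illustration under assumptions: the tuple-based tree, the SYMBOLS table and the function coerce are hypothetical names, the statement a = a * -50 + b + 3 is read off the tree above, and every integer literal is widened to float because all declared variables are float.

```python
# Hypothetical AST: a leaf is an identifier (str) or a numeric literal;
# an inner node is a tuple (operator, left, right).
SYMBOLS = {"a": "float", "b": "float"}  # declarations assumed from the answer


def coerce(node):
    """Return (annotated_node, type), widening int literals to float."""
    if isinstance(node, str):
        if node not in SYMBOLS:
            raise NameError(f"undeclared identifier: {node}")
        return node, SYMBOLS[node]
    if isinstance(node, int):
        return float(node), "float"  # implicit int -> float conversion
    if isinstance(node, float):
        return node, "float"
    op, left, right = node
    left, left_type = coerce(left)
    right, right_type = coerce(right)
    if left_type != right_type:
        raise TypeError(f"type mismatch for {op}: {left_type} vs {right_type}")
    return (op, left, right), left_type


# a = a * -50 + b + 3, as in the parse tree above
tree = ("=", "a", ("+", ("+", ("*", "a", -50), "b"), 3))
typed_tree, result_type = coerce(tree)
print(typed_tree)
print(result_type)
```

The printed tree carries -50.0 and 3.0 in place of the integer literals, which corresponds to the annotated tree shown above.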
The forward pointer (fp) moves ahead to search for the end of the lexeme. A blank space
indicates the end of the lexeme: in the above example, as soon as fp encounters a blank space,
the lexeme “int” is identified. fp then skips over the white space, and both the begin pointer
(bp) and the forward pointer (fp) are set to the start of the next token. Reading the input one
character at a time from secondary storage in this way is costly. Hence a buffering technique is
used: a block of data is first read into a buffer and then scanned by the lexical analyser.
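Note: The movement of the two pointers can be sketched in Python as below. This is a simplified illustration, not a full lexer: the function name lexemes is hypothetical, white space is assumed to be the only delimiter, and the whole input is assumed to sit in one in-memory buffer.

```python
def lexemes(buffer):
    """Yield whitespace-delimited lexemes using begin (bp) and forward (fp) pointers."""
    bp = fp = 0
    n = len(buffer)
    while bp < n:
        if buffer[bp].isspace():      # skip white space between lexemes
            bp += 1
            fp = bp
            continue
        while fp < n and not buffer[fp].isspace():
            fp += 1                   # advance fp to the end of the lexeme
        yield buffer[bp:fp]
        bp = fp                       # both pointers restart after the lexeme


print(list(lexemes("int  a")))
```

A real lexical analyser would also end a lexeme at punctuation and operators, and would refill the buffer block by block.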
(c) Explain the language processing system for the modern computer programming
languages.
Answer: A computer is a combination of software and hardware. Hardware is simply a piece of
equipment whose functions are controlled by the relevant software. The hardware understands
instructions as electronic charge, which corresponds to the binary language in software
programming. The binary language has only 0s and 1s. To instruct the hardware, code has to be
written in binary format, which is just a series of 0s and 1s. Writing such code would be an
inconvenient and complicated task for programmers, so we write programs in a high-level
language, which is convenient for us to comprehend and remember. These programs are then fed
into a series of tools and operating system (OS) components (typically a preprocessor, compiler,
assembler, linker and loader) to obtain the code that can be run by the machine. This is known
as a language processing system.
Q. 2 (a) Write the regular expression for a floating point number with exponent. Examples
of floating point numbers with exponent are strings such as 6.5897E4, 1.8569E-4, 2.54e15
or 3.54e-15. [Hint: At least one digit should be before and after the dot (.)]
Solution:
digit → [0-9]
digits → digit+
Exponent → E | e
number → digits (. digits)? (Exponent (+|-)? digits)?
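Note: The regular definition above can be checked with Python's re module, as sketched below. The names NUMBER, STRICT, is_number and is_float_with_exponent are illustrative. Because the fraction and the exponent are both optional in the definition, it also accepts plain integers such as 65. The stricter pattern STRICT, which makes the dot and the exponent mandatory, is an assumption added here for comparison and is not part of the given solution.

```python
import re

DIGITS = r"[0-9]+"
# number -> digits (. digits)? (Exponent (+|-)? digits)?
NUMBER = re.compile(rf"{DIGITS}(\.{DIGITS})?([Ee][+-]?{DIGITS})?")
# Stricter variant (assumption): the dot and the exponent are both required.
STRICT = re.compile(rf"{DIGITS}\.{DIGITS}[Ee][+-]?{DIGITS}")


def is_number(text):
    """True if the whole string matches the regular definition 'number'."""
    return NUMBER.fullmatch(text) is not None


def is_float_with_exponent(text):
    """True if the whole string has digits, a dot, digits and an exponent."""
    return STRICT.fullmatch(text) is not None


for s in ["6.5897E4", "1.8569E-4", "2.54e15", "3.54e-15", "65", ".5e3"]:
    print(s, is_number(s), is_float_with_exponent(s))
```

All four strings from the question match both patterns, 65 matches only the first, and .5e3 matches neither because no digit precedes the dot.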
(b) Construct a transition diagram (finite state automaton) that represents the lexical
recognition of a floating-point number with an exponent.
Q. 3 (a) Consider the grammar G with the production rules given below:
G: S → if E then S | if E then S else S | a
   E → b
where if, then, else, a, b, c are the terminals. Determine whether the given grammar G is
ambiguous.
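Answer: The first two productions of G share the common prefix “if E then S”, and the grammar
has the dangling-else ambiguity. For example, the string if b then if b then a else a has two
distinct parse trees, because the else can be attached to either if. Hence G is ambiguous.
Note: Students can also show the ambiguity of the grammar by giving two distinct leftmost
derivations (LMD) or two distinct rightmost derivations (RMD) for an input string.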
(b) Determine whether the given grammar G in question 3(a) is LL (1) grammar.
Solution:
First, the grammar needs to be left factored by rewriting it.
S → if E then S S’ | a
S’ → else S | ɛ
E → b
Next, we need to construct the LL (1) parsing table to check whether the given grammar is
LL (1) or not. For this, we need to find FIRST and FOLLOW of each nonterminal in the left
factored grammar.
FIRST                       FOLLOW
FIRST (S) = {if, a}         FOLLOW (S) = {else, $}
FIRST (S’) = {else, ɛ}      FOLLOW (S’) = {else, $}
FIRST (E) = {b}             FOLLOW (E) = {then}
The parsing table has a conflict since it has two entries in the cell (S’, else): S’ → else S
(because else is in FIRST (else S)) and S’ → ɛ (because else is in FOLLOW (S’)). Therefore,
the given grammar is not LL (1).
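Note: The FIRST and FOLLOW sets and the conflicting cell can be reproduced with a short Python sketch. The function names (first_of, first_follow, ll1_table) and the dictionary encoding of the grammar are illustrative assumptions; an empty list stands for an ɛ-production.

```python
EPS = "ε"


def first_of(seq, first):
    """FIRST of a symbol sequence, given the FIRST sets of the nonterminals."""
    out = set()
    for sym in seq:
        f = first.get(sym, {sym})       # the FIRST set of a terminal is itself
        out |= f - {EPS}
        if EPS not in f:
            return out
    out.add(EPS)                        # every symbol in seq can derive ε
    return out


def first_follow(grammar, start):
    """Compute FIRST and FOLLOW by iterating until nothing changes."""
    first = {nt: set() for nt in grammar}
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                new = first_of(body, first)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
                for i, sym in enumerate(body):
                    if sym not in grammar:
                        continue        # FOLLOW is defined for nonterminals only
                    tail = first_of(body[i + 1:], first)
                    add = tail - {EPS}
                    if EPS in tail:
                        add |= follow[head]
                    if not add <= follow[sym]:
                        follow[sym] |= add
                        changed = True
    return first, follow


def ll1_table(grammar, start):
    """Return FIRST, FOLLOW and a table mapping (nonterminal, terminal) to bodies."""
    first, follow = first_follow(grammar, start)
    table = {}
    for head, bodies in grammar.items():
        for body in bodies:
            f = first_of(body, first)
            lookaheads = (f - {EPS}) | (follow[head] if EPS in f else set())
            for t in lookaheads:
                table.setdefault((head, t), []).append(body)
    return first, follow, table


G = {
    "S": [["if", "E", "then", "S", "S'"], ["a"]],
    "S'": [["else", "S"], []],
    "E": [["b"]],
}
first, follow, table = ll1_table(G, "S")
conflicts = {cell: prods for cell, prods in table.items() if len(prods) > 1}
print(conflicts)
```

The only cell with more than one production is (S', else), which agrees with the conclusion above.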
(c) Consider the following context-free grammar where the start symbol is S and the set
of terminals is {a, b, c, d}.
S → AaAb | BbBa
A → cS | ɛ
B → dS | ɛ
The following is a partially-filled LL (1) parsing table.
        a            b            c            d            $
S       S → AaAb     S → BbBa     (1)          (2)
A       A → ɛ        (3)          A → cS
B       (4)          B → ɛ                     B → dS
Write down the CORRECT productions for the numbered cells in the parsing table.
Solution:
First, we have to find FIRST and FOLLOW of each nonterminal in the grammar.
FIRST                        FOLLOW
FIRST (S) = {a, b, c, d}     FOLLOW (S) = {a, b, $}
FIRST (A) = {c, ɛ}           FOLLOW (A) = {a, b}
FIRST (B) = {d, ɛ}           FOLLOW (B) = {a, b}
The completed parsing table is:
        a            b            c            d            $
S       S → AaAb     S → BbBa     S → AaAb     S → BbBa
A       A → ɛ        A → ɛ        A → cS
B       B → ɛ        B → ɛ                     B → dS
Hence the numbered cells are: (1) S → AaAb, (2) S → BbBa, (3) A → ɛ, (4) B → ɛ.
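Note: The completed table can be exercised with a small table-driven predictive parser, sketched below in Python. The names TABLE, NONTERMINALS and parse are illustrative, each input character is assumed to be one terminal, and an empty tuple stands for ɛ.

```python
# Completed LL(1) table from the solution; an empty tuple stands for ɛ.
TABLE = {
    ("S", "a"): ("A", "a", "A", "b"),
    ("S", "b"): ("B", "b", "B", "a"),
    ("S", "c"): ("A", "a", "A", "b"),
    ("S", "d"): ("B", "b", "B", "a"),
    ("A", "a"): (), ("A", "b"): (), ("A", "c"): ("c", "S"),
    ("B", "a"): (), ("B", "b"): (), ("B", "d"): ("d", "S"),
}
NONTERMINALS = {"S", "A", "B"}


def parse(text):
    """Return True if text is derivable from S, using the table above."""
    tokens = list(text) + ["$"]
    stack = ["$", "S"]
    pos = 0
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            body = TABLE.get((top, look))
            if body is None:
                return False          # empty cell: syntax error
            stack.extend(reversed(body))
        elif top == look:
            pos += 1                  # match a terminal (or the end marker $)
        else:
            return False
    return pos == len(tokens)


for s in ["ab", "ba", "cabab", "dbaba", "aa"]:
    print(s, parse(s))
```

Since every cell holds at most one production, the parser never has to backtrack.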
(b) The attributes of three arithmetic operators in some programming language are given
below.
Operator    Precedence    Associativity    Arity
+           High          Left             Binary
-           Medium        Right            Binary
*           Low           Left             Binary
Compute the value of the expression 2-5+1-7*3 and create the parse tree for this
expression.
Solution:
Given expression is:
2 – 5 + 1 – 7 * 3
= 2 – 6 – 7 * 3 (‘+’ has the highest precedence: 5 + 1 = 6)
= 2 – (–1) * 3 (‘–’ has the second-highest precedence and is right associative: 6 – 7 = –1)
= 3 * 3 (‘–’ still binds tighter than ‘*’: 2 – (–1) = 3)
= 9 (‘*’ has the lowest precedence, so it is applied last)
Parse tree for the given expression is provided below:
        *
       / \
      –   3
     / \
    2   –
       / \
      +   7
     / \
    5   1
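Note: The value and the tree can be cross-checked with a precedence-climbing sketch in Python. The precedence numbers (3, 2, 1) are an assumed encoding of High, Medium and Low from the table in the question, and the names OPS, APPLY, parse and evaluate are illustrative.

```python
import re

# (precedence, associativity) from the question; a larger number binds tighter.
OPS = {"+": (3, "left"), "-": (2, "right"), "*": (1, "left")}
APPLY = {
    "+": lambda x, y: x + y,
    "-": lambda x, y: x - y,
    "*": lambda x, y: x * y,
}


def parse(tokens, min_prec=1):
    """Precedence climbing: return nested (op, left, right) tuples."""
    left = int(tokens.pop(0))
    while tokens and OPS[tokens[0]][0] >= min_prec:
        op = tokens.pop(0)
        prec, assoc = OPS[op]
        # A left-associative operator must not reappear in its own right operand.
        next_min = prec + 1 if assoc == "left" else prec
        right = parse(tokens, next_min)
        left = (op, left, right)
    return left


def evaluate(node):
    """Evaluate the tree bottom-up."""
    if isinstance(node, int):
        return node
    op, left, right = node
    return APPLY[op](evaluate(left), evaluate(right))


tree = parse(re.findall(r"\d+|[-+*]", "2-5+1-7*3"))
print(tree)
print(evaluate(tree))
```

The tree printed is ('*', ('-', 2, ('-', ('+', 5, 1), 7)), 3), which has the same shape as the parse tree above, and its value is 9.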
(c) Consider the following parse tree for the expression a#b$c$d#e#f, involving two
binary operators $ and #.
      #
     / \
    a   #
       / \
      /   \
     $     #
    / \   / \
   $   d e   f
  / \
 b   c
Determine which operator has the high precedence and what is the associativity of the
two operators $ and #.
Solution:
The highest-precedence operator appears at the lowest level of the expression tree, so that it
is evaluated first. Here $ is at the lowest level, so $ has higher precedence than #.
For an unambiguous grammar, we can read precedence and associativity directly from the
productions or from the expression tree. A left-associative operator corresponds to a
left-recursive production: in the expression tree, repeated occurrences of the same operator
nest in the left child. A right-associative operator nests in the right child.
In the given tree, $ nests on the left and # nests on the right. Therefore, $ is left
associative and # is right associative.
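Note: The same reading of the tree can be done mechanically, as sketched below in Python. The tuple encoding TREE and the function analyse are illustrative, and the heuristic assumes that the expression contains no parentheses.

```python
# The parse tree from the question as nested (op, left, right) tuples.
TREE = ("#", "a", ("#", ("$", ("$", "b", "c"), "d"), ("#", "e", "f")))


def analyse(node, assoc, tighter):
    """Record associativity and 'binds tighter than' facts seen in the tree."""
    if isinstance(node, str):
        return
    op, left, right = node
    for side, child in (("left", left), ("right", right)):
        if isinstance(child, tuple):
            if child[0] == op:
                # the same operator nests on this side
                assoc.setdefault(op, set()).add(side)
            else:
                # a different operator lower in the tree is evaluated first
                tighter.add((child[0], op))
            analyse(child, assoc, tighter)


assoc, tighter = {}, set()
analyse(TREE, assoc, tighter)
print(assoc)
print(tighter)
```

Here assoc maps # to {'right'} and $ to {'left'}, and tighter contains the single pair ('$', '#'), meaning that $ binds tighter than #.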
Q. 5 (a) Consider the following grammar with production rules as:
E → T + E | T
T → id
Compute the canonical collection of LR (0) items of the grammar.
Solution:
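Augmented Grammar
E’ → E
E → T + E
E → T
T → id
I0 : E’ → .E          GOTO (I0, E) → I1
     E → .T + E       GOTO (I0, T) → I2
     E → .T           GOTO (I0, id) → I3
     T → .id
I1 : E’ → E.
I2 : E → T. + E       GOTO (I2, +) → I4
     E → T.
I3 : T → id.
I4 : E → T + .E       GOTO (I4, E) → I5
     E → .T + E       GOTO (I4, T) → I2
     E → .T           GOTO (I4, id) → I3
     T → .id
I5 : E → T + E.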
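Note: The canonical collection can be computed with the usual closure and goto operations, as sketched below in Python. The encoding of an item as a (production index, dot position) pair and the function names are illustrative assumptions.

```python
# Augmented grammar; each production is (head, body).
PRODS = [("E'", ("E",)), ("E", ("T", "+", "E")), ("E", ("T",)), ("T", ("id",))]
NONTERMINALS = {"E'", "E", "T"}


def closure(items):
    """Items are (production index, dot position) pairs."""
    items = set(items)
    work = list(items)
    while work:
        p, dot = work.pop()
        body = PRODS[p][1]
        if dot < len(body) and body[dot] in NONTERMINALS:
            for q, (head, _) in enumerate(PRODS):
                if head == body[dot] and (q, 0) not in items:
                    items.add((q, 0))
                    work.append((q, 0))
    return frozenset(items)


def goto(items, symbol):
    """Move the dot over symbol and close the result; None if no item moves."""
    moved = {(p, dot + 1) for p, dot in items
             if dot < len(PRODS[p][1]) and PRODS[p][1][dot] == symbol}
    return closure(moved) if moved else None


def canonical_collection():
    states = [closure({(0, 0)})]
    edges = {}
    for i, state in enumerate(states):        # states grows while we iterate
        symbols = []
        for p, dot in sorted(state):
            body = PRODS[p][1]
            if dot < len(body) and body[dot] not in symbols:
                symbols.append(body[dot])
        for symbol in symbols:
            target = goto(state, symbol)
            if target not in states:
                states.append(target)
            edges[(i, symbol)] = states.index(target)
    return states, edges


def show(item):
    """Render an item with the dot in place."""
    head, body = PRODS[item[0]]
    return f"{head} -> {' '.join(body[:item[1]] + ('.',) + body[item[1]:])}"


states, edges = canonical_collection()
for i, state in enumerate(states):
    print(f"I{i}:", "; ".join(show(it) for it in sorted(state)))
print(edges)
```

With this ordering of symbols, the sketch produces six item sets I0 to I5, and the transitions out of I0 agree with the GOTO entries listed above.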
Answer:
A grammar is LR (0) if, in every item set,
(i) no shift/reduce conflict occurs
(ii) no reduce/reduce conflict occurs
I2 contains both E → T. + E (shift on +) and the complete item E → T. (reduce), which is a
shift-reduce conflict. Therefore, the grammar is not LR (0).
OR
Alternatively, students can solve it by constructing LR (0) parsing table.
(1) E → T + E
(2) E → T
(3) T → id
State          Action                 GOTO
         id      +        $          E     T
0        S3                          1     2
1                         Accept
2        R2      S4/R2    R2
3        R3      R3       R3
4        S3                          5     2
5        R1      R1       R1
It can be observed from the table that state 2 has a shift-reduce conflict on input +.
Therefore, the grammar is not LR (0).
(c) Construct the SLR (1) parsing table for the grammar given in question 5(a).
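Solution:
In an SLR (1) table, a reduction by A → α is entered only under the terminals in FOLLOW (A).
Here FOLLOW (E) = {$} and FOLLOW (T) = {+, $}.
State          Action                 GOTO
         id      +        $          E     T
0        S3                          1     2
1                         Accept
2                S4       R2
3                R3       R3
4        S3                          5     2
5                         R1
Every cell has at most one entry, so the grammar is SLR (1).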
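Note: The SLR (1) table can be exercised with a small shift-reduce driver, sketched below in Python. The dictionary encoding of the table and the function parse are illustrative. Production numbers follow the numbering (1) E → T + E, (2) E → T, (3) T → id used earlier.

```python
# Production number -> (head, length of body)
PRODS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 1)}
# ACTION entries from the SLR(1) table: "s" = shift, "r" = reduce, "acc" = accept.
ACTION = {
    (0, "id"): ("s", 3), (4, "id"): ("s", 3),
    (1, "$"): ("acc", 0),
    (2, "+"): ("s", 4), (2, "$"): ("r", 2),
    (3, "+"): ("r", 3), (3, "$"): ("r", 3),
    (5, "$"): ("r", 1),
}
GOTO = {(0, "E"): 1, (0, "T"): 2, (4, "E"): 5, (4, "T"): 2}


def parse(tokens):
    """Return the list of reductions (production numbers), or None on error."""
    tokens = list(tokens) + ["$"]
    stack, pos, reductions = [0], 0, []
    while True:
        act = ACTION.get((stack[-1], tokens[pos]))
        if act is None:
            return None               # empty cell: syntax error
        kind, arg = act
        if kind == "s":
            stack.append(arg)         # shift: push the new state, consume the token
            pos += 1
        elif kind == "r":
            head, length = PRODS[arg]
            del stack[-length:]       # pop one state per symbol of the body
            stack.append(GOTO[(stack[-1], head)])
            reductions.append(arg)
        else:
            return reductions         # accept


print(parse(["id", "+", "id"]))
print(parse(["id", "+"]))
```

For the input id + id the reductions are 3, 3, 2, 1, which is a rightmost derivation in reverse. For the incomplete input id + the driver reaches an empty cell and reports an error by returning None.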