VTU Exam Question Paper With Solution of 18CS61 System Software and Compilers July-2022-Sagarika Behera
MODULE-1
1
a) Explain in detail the SIC/XE machine architecture. (10 marks)
Solution:
1. Memory
Maximum memory is 1 megabyte (2^20 bytes). A 20-bit address cannot be fitted into the 15-bit address field used by standard SIC.
2. Registers
9 registers (the 5 registers of SIC: A, X, L, PC, SW, plus 4 additional registers: B, S, T, F)
3. Data Format
24-bit (3-byte) integers, represented in 2's complement
8-bit (1-byte) ASCII codes for characters
A 48-bit floating-point data type (1 sign bit, 11-bit exponent, 36-bit fraction)
The fraction is a value between 0 and 1
The exponent is an unsigned binary number between 0 and 2047
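The floating-point format can be made concrete with a small decoder. This is a minimal sketch, assuming the standard SIC/XE layout (1 sign bit, 11-bit exponent, 36-bit fraction) and the usual interpretation value = fraction * 2^(exponent - 1024); the function name is hypothetical.

```python
def decode_sicxe_float(word: int) -> float:
    """Decode a 48-bit SIC/XE floating-point word into a Python float.

    Assumed layout (high bit to low bit): 1 sign bit, 11 exponent bits,
    36 fraction bits. The fraction has its binary point before the high bit.
    """
    if not 0 <= word < (1 << 48):
        raise ValueError("expected a 48-bit unsigned value")
    if word == 0:                                # all bits zero represents 0.0
        return 0.0
    sign = (word >> 47) & 0x1
    exponent = (word >> 36) & 0x7FF              # unsigned, 0..2047
    fraction = (word & ((1 << 36) - 1)) / (1 << 36)   # value in [0, 1)
    value = fraction * 2.0 ** (exponent - 1024)
    return -value if sign else value


# 1.0 is stored as fraction 0.5 (high fraction bit set) with exponent 1025.
ONE = (1025 << 36) | (1 << 35)
```

Setting the top bit of `ONE` gives the encoding of -1.0 under the same assumptions.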
CSE-Dept 1 2021-22
VTU –6TH SEMESTER –CSE -SYSTEM SOFTWARE &COMPLIER DESIGN, JULY 2022 PAPER SOLUTION
5. Addressing Mode
6. Instruction Set
Load and store: LDA, LDX, STA, STX, LDB, STB, etc.
Subroutine linkage
b) List the various machine-independent assembler features. Explain control sections and
how the assembler converts them into object code. (10 marks)
Solution:
Machine-Independent Assembler Features are
1. Literals
2. Symbol-Defining Statements
3. Expressions
4. Program Blocks
5. Control Sections
6. Program Linking
Control Sections
A control section is a part of the program that maintains its identity after assembly; each such control
section can be loaded and relocated independently of the others.
Different control sections are most often used for subroutines or other logical subdivisions of a program.
Control sections differ from program blocks in that they are handled separately by the assembler.
The EXTDEF (external definition) statement in a control section names symbols, called external
symbols that are defined in this control section and may be used by other sections.
The EXTREF (external reference) statement names symbols that are used in this control section and
are defined elsewhere. We need two new record types (Define and Refer) in the object program.
A Define record gives information about external symbols that are defined in this control section –
that is, symbols named by EXTDEF.
A Refer record lists symbols that are used as external reference by the control section – that is,
symbols named by EXTREF.
An instruction that contains an external reference can only use extended format, to provide
enough room for the address (that is, relative addressing for an external reference is invalid).
The assembler generates information for each external reference that will allow the
loader to perform the required linking.
Case 2
190 0028 MAXLEN WORD BUFEND-BUFFER 000000
There are two external references in the expression, BUFEND and BUFFER.
The assembler inserts a value of zero; the loader computes the actual value once both addresses are known.
Case 3
On line 107, BUFEND and BUFFER are defined in the same control section and the expression
can be calculated immediately.
107 1000 MAXLEN EQU BUFEND-BUFFER
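The Define, Refer and Modification records mentioned above can be sketched as simple string builders. The column layout used here (6-character names, 6-digit hexadecimal addresses, a 2-digit length in half-bytes and a + or - flag) is an assumption based on the usual SIC/XE object program conventions, which this copy does not reproduce; the function names and the addresses in the usage note are illustrative.

```python
def define_record(symbols):
    """Define record: 'D' followed by (name, relative address) pairs."""
    return "D" + "".join(f"{name:<6}{addr:06X}" for name, addr in symbols.items())


def refer_record(names):
    """Refer record: 'R' followed by the names of the external references."""
    return "R" + "".join(f"{name:<6}" for name in names)


def modification_record(address, half_bytes, flag, symbol):
    """Modification record asking the loader to add (+) or subtract (-)
    the value of an external symbol at the given address."""
    if flag not in "+-" or len(flag) != 1:
        raise ValueError("flag must be '+' or '-'")
    return f"M{address:06X}{half_bytes:02X}{flag}{symbol}"


def records_for_word_expression(address, plus_symbols, minus_symbols):
    """A WORD holding an expression of external symbols is assembled as zero,
    with one Modification record per symbol (6 half-bytes = 3 bytes)."""
    recs = [modification_record(address, 6, "+", s) for s in plus_symbols]
    recs += [modification_record(address, 6, "-", s) for s in minus_symbols]
    return recs
```

For Case 2, `records_for_word_expression(0x28, ["BUFEND"], ["BUFFER"])` produces one record that adds BUFEND and one that subtracts BUFFER at address 0028, which is how the loader can fill in the zero that the assembler stored.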
Algorithm:
begin
  read first input line
  if OPCODE = 'START' then
    begin
      save #[OPERAND] as starting address
      initialize LOCCTR to starting address
      write line to intermediate file
      read next input line
    end {if START}
  else
    initialize LOCCTR to 0
  while OPCODE != 'END' do
    begin
      if this is not a comment line then
        begin
          if there is a symbol in the LABEL field then
            begin
              search SYMTAB for LABEL
              if found then
                set error flag (duplicate symbol)
              else
                insert (LABEL, LOCCTR) into SYMTAB
            end {if symbol}
          search OPTAB for OPCODE
          if found then
            add 3 (instruction length) to LOCCTR
          else if OPCODE = 'WORD' then
            add 3 to LOCCTR
          else if OPCODE = 'RESW' then
            add 3 * #[OPERAND] to LOCCTR
          else if OPCODE = 'RESB' then
            add #[OPERAND] to LOCCTR
          else if OPCODE = 'BYTE' then
            begin
              find length of constant in bytes
              add length to LOCCTR
            end {if BYTE}
          else
            set error flag (invalid operation code)
        end {if not a comment}
      write line to intermediate file
      read next input line
    end {while not END}
  write last line to intermediate file
  save (LOCCTR - starting address) as program length
end {Pass 1}
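The Pass 1 algorithm can be sketched in Python as follows. This is an illustrative version, not a full assembler: the input is assumed to be a list of already-split (label, opcode, operand) tuples, a label beginning with "." marks a comment line, the START operand is read as hexadecimal, and `OPTAB` holds only a hypothetical subset of mnemonics.

```python
OPTAB = {"LDA", "STA", "LDX", "STX", "ADD", "COMP", "J", "JLT", "JSUB", "RSUB"}


def byte_length(operand):
    """Length in bytes of a BYTE constant such as C'EOF' or X'F1'."""
    kind, body = operand[0], operand[2:-1]
    if kind == "C":
        return len(body)              # one byte per character
    if kind == "X":
        return (len(body) + 1) // 2   # two hex digits per byte
    raise ValueError(f"bad BYTE constant {operand!r}")


def pass1(lines):
    symtab, errors, intermediate = {}, [], []
    start = locctr = 0
    for index, (label, opcode, operand) in enumerate(lines):
        if index == 0 and opcode == "START":
            start = locctr = int(operand, 16)
            intermediate.append((locctr, label, opcode, operand))
            continue
        intermediate.append((locctr, label, opcode, operand))
        if opcode == "END":
            break
        if label.startswith("."):     # comment line
            continue
        if label:
            if label in symtab:
                errors.append(f"duplicate symbol {label}")
            else:
                symtab[label] = locctr
        if opcode in OPTAB or opcode == "WORD":
            locctr += 3
        elif opcode == "RESW":
            locctr += 3 * int(operand)
        elif opcode == "RESB":
            locctr += int(operand)
        elif opcode == "BYTE":
            locctr += byte_length(operand)
        else:
            errors.append(f"invalid operation code {opcode}")
    return symtab, locctr - start, errors, intermediate


SAMPLE = [
    ("COPY", "START", "1000"),
    ("FIRST", "LDA", "FIVE"),
    ("", "STA", "ALPHA"),
    ("FIVE", "WORD", "5"),
    ("ALPHA", "RESW", "1"),
    ("C1", "BYTE", "C'EOF'"),
    ("BUF", "RESB", "10"),
    ("", "END", "FIRST"),
]
```

Running `pass1(SAMPLE)` yields the symbol table and the program length that Pass 2 would need.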
Solution:
The basic functions of a loader are:
1. Allocation: It allocates memory for the program in the main memory.
2. Linking: It combines two or more separate object programs or modules and supplies the
information needed for references between them.
3. Relocation: It modifies the object program so that it can be loaded at an address different
from the location originally specified.
4. Loading: It brings the object program into the main memory for execution.
The absolute loader is a kind of loader that accepts object files whose addresses are
already fixed and places them at the specified locations in memory.
This type of loader is called an absolute loader because it needs no relocation
information; the load address is obtained from the programmer or the assembler.
The starting address of every module is known to the programmer and is stored in the
object file, so the task of the loader becomes very simple: it places the executable form
of the machine instructions at the locations mentioned in the object file.
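The work of an absolute loader can be sketched in a few lines. The sketch assumes the usual SIC object program records (Header, Text, End) with hexadecimal fields; that layout is not given in this copy, so treat the column positions as an assumption. The function name and the sample records are hypothetical.

```python
def absolute_load(records, memory):
    """Copy each Text record to the address it names and return the
    address at which execution should begin.

    Assumed layout:
      H + name(6) + start(6 hex) + length(6 hex)
      T + start(6 hex) + byte count(2 hex) + object code
      E + first executable address(6 hex)
    """
    entry = None
    for rec in records:
        kind = rec[0]
        if kind == "H":
            start, length = int(rec[7:13], 16), int(rec[13:19], 16)
            if start + length > len(memory):
                raise MemoryError("program does not fit in memory")
        elif kind == "T":
            addr = int(rec[1:7], 16)
            count = int(rec[7:9], 16)
            code = bytes.fromhex(rec[9:9 + 2 * count])
            memory[addr:addr + count] = code   # no relocation: addresses are final
        elif kind == "E":
            entry = int(rec[1:7], 16)
            break
    return entry


SAMPLE_OBJECT = [
    "HCOPY  001000000006",
    "T00100006141033482039",
    "E001000",
]
```

Because every address in the object program is already final, the loader does nothing except copy bytes and jump to the entry point.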
RELOCATABLE LOADERS
Absolute loaders have a number of advantages: they are small, fast and simple. But they
have a number of disadvantages, too.
The major problem deals with the need to assemble an entire program all at once. Since the
addresses for the program are determined at assembly time, the entire program must be
assembled at one time in order for proper addresses to be assigned to the different parts. This
means that a small change to one subroutine requires reassembly of the entire program. Also,
standard subroutines, which might be kept in a library of useful subroutines and functions, must
be physically copied and added to each program which uses them.
A relocatable loader is a loader which allows this delay of binding time. A relocatable
loader accepts as input a sequence of segments, each in a special relocatable load format, and
loads these segments into memory. The addresses into which segments are loaded are
determined by the relocatable loader, not by the assembler or the programmer.
Each segment is a subroutine, function, main program, block of global data, or some
similar set of memory locations which the programmer wishes to group together. Segments are
loaded into memory one after the other, to use as little space as possible. The relocatable load
format is defined so that separate segments can be assembled or compiled separately and
combined at load time.
Relocation
The relocation implied in the name "relocatable loader" refers to the fact that on two
separate loads, the same segment can be loaded into two different locations in memory. If any
of the segments which are loaded into memory before a segment change in size due to recoding
and reassembly between the two loads, then the addresses in memory into which the segment
is loaded will change by the same amount.
This program has four symbols, BEGIN, LOOP, LENGTH, and BUFFER. If the program
were to be loaded into memory starting at location 0, then the values of these symbols would
be 0, 1, 6, and 7, respectively. If the starting address were 1000, the values of the symbols
would be 1000, 1001, 1006, and 1007; if the base address were 1976, the values would
be 1976, 1977, 1982, and 1983. In all cases, the addresses, for a base BASE, would
be BASE+0, BASE+1, BASE+6, and BASE+7. Thus, relocating the program from starting at
an address BASE to starting at an address NEWBASE merely involves adding NEWBASE - BASE
to the values of all of the symbols. If the assembler produces all code as if it had
a base of 0, then relocating this code involves only adding the correct base.
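The arithmetic above is small enough to check directly. A minimal sketch, with a hypothetical `relocate` helper and the four symbol values from the text:

```python
def relocate(symbols, old_base, new_base):
    """Move every symbol by the difference between the two base addresses."""
    delta = new_base - old_base
    return {name: address + delta for name, address in symbols.items()}


# Symbol values as assembled with a base of 0.
SYMBOLS = {"BEGIN": 0, "LOOP": 1, "LENGTH": 6, "BUFFER": 7}
```

Relocating from a base of 0 to 1000 and then from 1000 to 1976 gives the same result as relocating from 0 to 1976 in one step.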
Solution:
The analysis phase creates an intermediate representation from the given source code.
The synthesis phase creates an equivalent target program from the intermediate
representation.
Symbol Table – It is a data structure being used and maintained by the compiler, consisting
of all the identifier’s names along with their types. It helps the compiler to function smoothly
by finding the identifiers quickly.
The compiler has two modules, namely the front end and the back end. The front end consists of
the lexical analyzer, syntax analyzer, semantic analyzer, and intermediate code generator.
The remaining phases form the back end.
1. Lexical Analyzer –
It is also called a scanner. It takes the output of the preprocessor (which performs file
inclusion and macro expansion) as the input which is in a pure high-level language. It reads
the characters from the source program and groups them into lexemes (sequence of characters
that “go together”). Each lexeme corresponds to a token. Tokens are defined by regular
expressions which are understood by the lexical analyzer. It also removes lexical errors (e.g.,
erroneous characters), comments, and white space.
2. Syntax Analyzer – It is sometimes called a parser. It constructs the parse tree. It takes all the
tokens one by one and uses Context-Free Grammar to construct the parse tree.
3. Semantic Analyzer – It verifies the parse tree, whether it’s meaningful or not. It
furthermore produces a verified parse tree. It also does type checking, Label checking, and
Flow control checking.
4. Intermediate Code Generator – It generates intermediate code, a form that can be
readily translated into target machine code. There are many popular intermediate codes, for
example three-address code. Intermediate code is converted to machine language by the last two
phases, which are platform dependent.
Up to intermediate code generation, the work is largely independent of the target machine; after
that, it depends on the platform. To build a compiler for a new platform we don't need to start
from scratch: we can take the intermediate code from an existing compiler and build only the last two phases.
5. Code Optimizer – It transforms the code so that it consumes fewer resources and runs
faster. The meaning of the code being transformed is not altered. Optimization can be
categorized into two types: machine-dependent and machine-independent.
6. Target Code Generator – The main purpose of the target code generator is to produce code
that the machine can understand; it also performs register allocation, instruction selection, etc. The
output depends on the type of assembler. This is the final stage of compilation. The
optimized code is converted into relocatable machine code, which then forms the input to the
linker and loader.
Solution:
1. Implementation of High-level Programming
A high-level programming language defines a programming abstraction: the programmer
specifies an algorithm in the language, and the compiler must translate it to the target
language. Higher-level programming languages are usually easier to program in, but the code they
produce tends to be less efficient, so the target programs run more slowly. Low-level language programmers
have more control over their computations and, in principle, can design more efficient code.
Lower-level programs, on the other hand, are more difficult to build and much more difficult
to maintain. They are less portable, more prone to errors, and more complex to manage.
Solution:
We write programs in a high-level language, which is convenient for us to read
and remember. These programs are then fed into a series of tools and operating
system (OS) components to obtain code that the machine can use.
This is known as a language processing system.
Preprocessor:
The preprocessor includes all header files and expands macros. A macro is a piece of
code that is given a name; whenever the name is used, it is replaced by the contents of the macro.
The purpose of macros is either to automate frequently used sequences or to enable more
powerful abstraction. The preprocessor takes source code as input and produces modified source code as
output. It is also known as a macro evaluator. Preprocessing is optional: a language that does
not support #include and macros does not need it.
Compiler:
The compiler takes the modified source code as input and produces the target code as output.
Assembler:
The assembler takes the target code as input and produces relocatable machine code as
output.
Linker:
A linker or link editor is a program that takes a collection of object files (created by assemblers
and compilers) and combines them into an executable program.
Loader:
The loader places the linked program in main memory for execution.
Executable code:
It is low-level, machine-specific code that the machine can execute directly. Once the
linker and loader have done their jobs, the object code has been converted into executable code.
Solution :
The lexical analyzer scans the input from left to right, one character at a time. It uses two
pointers, the begin pointer (bp) and the forward pointer (fp), to keep track of the portion of the input scanned.
The forward pointer moves ahead to search for the end of the lexeme. As soon as a blank space is
encountered, it indicates the end of the lexeme. In the example, as soon as fp encounters a
blank space the lexeme "int" is identified. When fp encounters white space it skips over it,
and then both bp and fp are set to the start of the next token.
Reading the input one character at a time from secondary storage is costly, so a buffering
technique is used: a block of data is first read into a buffer and then scanned by the lexical
analyzer. There are two methods used in this context: the One Buffer Scheme and the Two Buffer
Scheme. These are explained below.
Two Buffer Scheme: To overcome the problem of the one buffer scheme, this method uses two
buffers to store the input string. The first and second buffers are scanned
alternately: when the end of the current buffer is reached, the other buffer is filled. The only problem
with this method is that if a lexeme is longer than the buffer, the
input cannot be scanned completely. Initially both bp and fp point to the
first character of the first buffer. Then fp moves right in search of the end of the lexeme. As
soon as a blank character is recognized, the string between bp and fp is identified as the
corresponding token. To identify the boundary of the first buffer, an end-of-buffer character is
placed at the end of the first buffer. Similarly, the end of the second buffer is recognized by the
end-of-buffer mark at its end. When fp encounters the first eof, the end of the first buffer has
been reached, and filling of the second buffer is started. In the same way,
when the second eof is encountered, it indicates the end of the second buffer. The two buffers
are filled alternately in this way until the end of the input program, and the stream of tokens is identified.
This eof character placed at the end of each buffer is called a sentinel, and it is used to identify the end of the buffer.
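The two buffer scheme with sentinels can be sketched as below. This is an illustrative model rather than how a production lexer is written: the class and function names are hypothetical, the sentinel is assumed to be a NUL character that never occurs in the source, and the lexeme is accumulated in a list instead of being delimited by a begin pointer, which sidesteps the long-lexeme limitation mentioned above.

```python
import io

EOF = "\0"  # sentinel; assumed never to occur in the source text


class TwoBufferReader:
    """Two halves of `size` characters, each followed by a sentinel slot."""

    def __init__(self, stream, size=8):
        self.stream, self.size = stream, size
        self.buf = [EOF] * (2 * size + 2)   # [half 0][EOF][half 1][EOF]
        self.forward = 0
        self._fill(0)

    def _fill(self, half):
        start = half * (self.size + 1)
        chunk = self.stream.read(self.size)
        for i, ch in enumerate(chunk):
            self.buf[start + i] = ch
        # A full chunk puts the sentinel in the end-of-half slot; a short
        # chunk puts it inside the half, which marks the real end of input.
        self.buf[start + len(chunk)] = EOF

    def next_char(self):
        ch = self.buf[self.forward]
        if ch != EOF:
            self.forward += 1
            return ch
        if self.forward == self.size:            # end of first half
            self._fill(1)
            self.forward = self.size + 1
            return self.next_char()
        if self.forward == 2 * self.size + 1:    # end of second half
            self._fill(0)
            self.forward = 0
            return self.next_char()
        return None                              # end of input


def lexemes(reader):
    """Split the character stream into blank-separated lexemes."""
    current = []
    while (ch := reader.next_char()) is not None:
        if ch.isspace():
            if current:
                yield "".join(current)
                current = []
        else:
            current.append(ch)
    if current:
        yield "".join(current)
```

The point of the sentinel is that the common case needs only one comparison per character (`ch != EOF`); the buffer-boundary checks run only when a sentinel is actually seen.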
Solution:
A token is a pair consisting of a token name and an optional attribute value. The token name is
an abstract symbol representing a kind of lexical unit, e.g., a particular keyword, or a sequence
of input characters denoting an identifier. The token names are the input symbols that the parser
processes. We will often refer to a token by its token name.
A pattern is a description of the form that the lexemes of a token may take. In the case of a
keyword as a token, the pattern is just the sequence of characters that form the keyword. For
identifiers and some other tokens, the pattern is a more complex structure that is matched by
many strings.
A lexeme is a sequence of characters in the source program that matches the pattern for a token
and is identified by the lexical analyzer as an instance of that token.
Token: if    Informal description: characters i, f    Sample lexeme: if
In many programming languages, the following classes cover most or all of the tokens:
1. One token for each keyword. The pattern for a keyword is the same as the keyword
itself.
2. Tokens for the operators, either individually or in classes, such as one token for all the comparison operators.
3. One token representing all identifiers.
4. One or more tokens representing constants, such as numbers and literal strings.
5. Tokens for each punctuation symbol, such as left and right parentheses, comma, and
semicolon
Example: Consider the following C statement
printf("Total = %d\n", score);
Both printf and score are lexemes matching the pattern for token id, and
"Total = %d\n" is a lexeme matching the pattern for token literal.
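The relationship between tokens, patterns and lexemes can be tried out with a small regular-expression tokenizer. This is a minimal sketch: the token names `id` and `literal` come from the text, while `number`, `punct` and the patterns themselves are simplifying assumptions.

```python
import re

# (token name, pattern) pairs; earlier entries win when several could match.
TOKEN_SPEC = [
    ("literal", r'"(?:\\.|[^"\\])*"'),   # double-quoted string with escapes
    ("number", r"\d+"),
    ("id", r"[A-Za-z_]\w*"),
    ("punct", r"[(),;]"),
    ("ws", r"\s+"),                       # recognized, then discarded
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return (token name, lexeme) pairs for the input text."""
    tokens, pos = [], 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "ws":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


STATEMENT = r'printf("Total = %d\n", score);'
```

Calling `tokenize(STATEMENT)` reports printf and score as two different lexemes of the same token id.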
CFG stands for context-free grammar. It is a formal grammar which is used to generate all
possible strings in a given formal language. A context-free grammar G can be defined
by a 4-tuple:
G = (V, T, P, S)
Where,
G is the grammar, which consists of a set of production rules. It is used to generate the strings
of a language.
T is the finite set of terminal symbols, denoted by lower case letters.
V is the finite set of non-terminal symbols, denoted by capital letters.
P is a set of production rules, which are used for replacing non-terminal symbols (on the left
side of a production) in a string with other terminal or non-terminal symbols (on the right side
of the production).
S is the start symbol, which is used to derive the string. We can derive the string by repeatedly
replacing a non-terminal by the right-hand side of a production until all non-terminals have
been replaced by terminal symbols.
Solution:
S → aSb, (Rule: 1)
S → ab (Rule: 2)
First compute some strings generated by the production rules of the grammar G above.
The language generated by the grammar is L(G) = {ab, a^2b^2, a^3b^3, a^4b^4, a^5b^5, a^6b^6, a^7b^7, ...}.
Analyzing the strings generated from the grammar G shows a common pattern in
all of them, i.e. L(G) = {a^n b^n | n ≥ 1}.
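The derivation pattern can be checked mechanically. A small sketch with hypothetical helper names: `derive` applies Rule 1 a chosen number of times and then Rule 2 once, and `in_language` tests the a^n b^n shape directly.

```python
def derive(n):
    """Apply S -> aSb (n - 1) times, then S -> ab once."""
    if n < 1:
        raise ValueError("n must be at least 1")
    sentential_form = "S"
    for _ in range(n - 1):
        sentential_form = sentential_form.replace("S", "aSb")   # Rule 1
    return sentential_form.replace("S", "ab")                   # Rule 2


def in_language(s):
    """True when s consists of n a's followed by n b's, with n >= 1."""
    n = len(s) // 2
    return n >= 1 and s == "a" * n + "b" * n
```

Every string produced by `derive` passes `in_language`, while strings such as abab or aab do not.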
b)
Solution:
A CFG is said to be ambiguous if there exists more than one derivation tree for some
input string, i.e., more than one LeftMost Derivation Tree (LMDT)
or more than one RightMost Derivation Tree (RMDT) for the same string.
Consider the grammar E → E+E | id. We can construct two parse trees from this grammar for the
string id+id+id. The two leftmost derivations are:
E ⇒ E+E ⇒ id+E ⇒ id+E+E ⇒ id+id+E ⇒ id+id+id
E ⇒ E+E ⇒ E+E+E ⇒ id+E+E ⇒ id+id+E ⇒ id+id+id
Both parse trees are derived from the same grammar rules, but the parse trees are
different. Hence the grammar is ambiguous.
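The ambiguity can also be confirmed by enumerating every parse tree. The sketch below is specific to the grammar E → E+E | id; the tree encoding (nested tuples) and the function name are illustrative choices.

```python
from functools import lru_cache


def parse_trees(tokens):
    """Return every parse tree of `tokens` under E -> E + E | id.

    A tree is either the string "id" or a tuple (left, "+", right).
    """
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def trees(lo, hi):
        found = []
        if hi - lo == 1 and tokens[lo] == "id":
            found.append("id")                       # E -> id
        for k in range(lo + 1, hi - 1):              # try each + as the root
            if tokens[k] == "+":
                for left in trees(lo, k):
                    for right in trees(k + 1, hi):
                        found.append((left, "+", right))   # E -> E + E
        return tuple(found)

    return list(trees(0, len(tokens)))


SENTENCE = ["id", "+", "id", "+", "id"]
```

For `SENTENCE` the function returns two trees, one grouping the first two operands and one grouping the last two, which is exactly the ambiguity described above.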
Answer:
( a)
The general form for left recursion is
A → Aα1 | Aα2 | … | Aαm | β1 | β2 | … | βn
which can be replaced by
A → β1A′ | β2A′ | … | βnA′
A′ → α1A′ | α2A′ | … | αmA′ | ε
Algorithm:
- Arrange the non-terminals in some order: A1 ... An
- for i from 1 to n do {
    for j from 1 to i-1 do {
      replace each production of the form Ai → Aj γ by the productions
      Ai → δ1 γ | δ2 γ | … | δk γ, where Aj → δ1 | δ2 | … | δk are all the current Aj-productions
    }
    eliminate the immediate left recursion among the Ai-productions
  }
S → Aa | b
A → bdA′ | A′
A′ → cA′ | adA′ | ε
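The replacement rule for immediate left recursion can be sketched as a short function. The representation is an assumption made for illustration: each production body is a list of symbols, the empty list stands for ε, and the new non-terminal is named by appending an apostrophe.

```python
def eliminate_immediate(nonterminal, bodies):
    """Remove immediate left recursion from the productions of one non-terminal.

    A -> A a1 | ... | A am | b1 | ... | bn   becomes
    A -> b1 A' | ... | bn A'   and   A' -> a1 A' | ... | am A' | epsilon
    """
    alphas = [body[1:] for body in bodies if body and body[0] == nonterminal]
    betas = [body for body in bodies if not body or body[0] != nonterminal]
    if not alphas:                       # nothing to do
        return {nonterminal: bodies}
    new = nonterminal + "'"
    return {
        nonterminal: [beta + [new] for beta in betas],
        new: [alpha + [new] for alpha in alphas] + [[]],   # [] is epsilon
    }


# A -> A c | A a d | b d | epsilon
A_BODIES = [["A", "c"], ["A", "a", "d"], ["b", "d"], []]
```

Applied to `A_BODIES`, the function produces A → bdA′ | A′ and A′ → cA′ | adA′ | ε, matching the A and A′ productions listed above.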
The steps of shift-reduce parsing are as follows:
1. It uses a stack and an input buffer.
2. Insert $ at the bottom of the stack and at the right end of the input string in the input
buffer.
3. Shift − The parser shifts zero or more input symbols onto the stack until a handle
is on top of the stack.
4. Reduce − The parser replaces the handle on top of the stack with the left side
of the production, i.e., the R.H.S. of the production is popped and the L.H.S. is pushed.
5. Accept − Steps 3 and 4 are repeated until an error is detected or until
the stack contains only the start symbol (S) and the input buffer is empty, i.e., it contains only $.
Handle:
A handle is a substring that matches the right side of a production and whose replacement
by the left side is one step in the reverse of a rightmost derivation. Each such replacement
of the right side of a production by its left side is known as a "reduction".
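The shift and reduce steps can be traced with a deliberately naive parser. The sketch assumes the grammar S → aSb | ab from the earlier question and reduces greedily whenever a production body appears on top of the stack; that strategy happens to be correct for this grammar but is not a general LR parsing method. All names are hypothetical.

```python
GRAMMAR = [("S", ["a", "S", "b"]), ("S", ["a", "b"])]


def shift_reduce(tokens, grammar=GRAMMAR, start="S"):
    """Return (accepted, trace) for the token list."""
    stack = ["$"]
    buffer = list(tokens) + ["$"]
    trace = []
    while True:
        for head, body in grammar:
            if stack[-len(body):] == body:        # handle on top of the stack
                del stack[-len(body):]            # pop the right-hand side
                stack.append(head)                # push the left-hand side
                trace.append(f"reduce {head} -> {' '.join(body)}")
                break
        else:                                     # no reduction was possible
            if stack == ["$", start] and buffer == ["$"]:
                trace.append("accept")
                return True, trace
            if buffer == ["$"]:
                trace.append("error")
                return False, trace
            stack.append(buffer.pop(0))
            trace.append(f"shift {stack[-1]}")
```

For the input a a b b the trace shows three shifts, a reduction by S → ab, one more shift, a reduction by S → aSb, and then accept.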
Ans:
A LEX program consists of three parts:
declarations
%%
translation rules
%%
auxiliary procedures
The translation rules have the form
R1 {action 1}
R2 {action 2}
...
Rn {action n}
where each Ri is a regular expression and each action i is a program fragment
describing what action the lexical analyzer should take when pattern Ri matches a lexeme.
Typically, action i will return control to the parser. In Lex, actions are written in C; in
general, however, they can be in any implementation language.
The third section holds whatever auxiliary procedures are needed by the
actions.
(b)
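/* Lex program to count the number of lines, words, spaces and
   characters in a given input */
%{
#include <stdio.h>
int sc = 0, wc = 0, lc = 0, cc = 0;
%}
%%
  /* The rules were lost from this copy; the four below are a reconstruction. */
[ ]        { sc++; cc++; }           /* a space */
\t         { cc++; }                 /* a tab counts as a character only */
\n         { lc++; cc++; }           /* end of a line */
[^ \t\n]+  { wc++; cc += yyleng; }   /* a word */
%%
int main(void)
{
    yylex();
    printf("lines=%d words=%d spaces=%d characters=%d\n", lc, wc, sc, cc);
    return 0;
}
int yywrap(void)
{
    return 1;
}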
Ans:
The UNIX utility yacc (Yet Another Compiler Compiler) parses a stream of tokens,
typically generated by lex, according to a user-specified grammar.
A yacc file has the following structure:
definitions
%%
rules
%%
code
Definition: All code between %{ and %} is copied to the beginning of the resulting C
file.
Rules: A number of combinations of pattern and action: if the action is more than a single
command it needs to be in braces.
Code: This can be very elaborate, but the main ingredient is the call to yylex, the lexical
analyzer. If the code segment is left out, a default main is used which only calls yylex.
Definition section
There are three things that can go in the definitions section:
C code: Any code between %{ and %} is copied to the C file. This is typically used for
defining file variables, and for prototypes of routines that are defined in the code segment.
Definitions: The definition section of a lex file was concerned with characters; in yacc
this is tokens.
Example: %token NUMBER.
These token definitions are written to a .h file when yacc compiles this file.
If your lex program is supplying a tokenizer, the yacc program will repeatedly call the
yylex routine. The lex rules will probably function by calling return every time they have
parsed a token.
If lex is to return tokens that yacc will process, they have to agree on what tokens there
are. This is done as follows:
For Example
1.The yacc file will have token definition %token NUMBER in the definitions
section.
2. When the yacc file is translated with yacc –d , a header file y.tab.h is created that has
definitions like #define NUMBER 258.
3. The lex file can then call return NUMBER, and the yacc program can match on this token.
Rules section
The rules section contains the grammar of the language you want to parse.
The terminal symbols get matched with return codes from the lex tokenizer. They are typically
defines coming from %token definitions in the yacc program, or character values.
(b)
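.     Matches any character except \n.
{ }   1) Indicates how many times a pattern can be present. Example: A{1,3} implies one to three occurrences of A.
      2) Refers to a named definition. Ex: {digit}
^     Negation (as the first character inside a character class, e.g. [^0-9]).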
(a)
S-Attributed Definitions.
An SDD is S-attributed if every attribute is synthesized.
If an SDD is S-attributed, we evaluate its attributes in any bottom-up ordering of the parse
tree nodes.
It is simpler to perform a post-order tree traversal and evaluate the attributes at a
node N when the traversal leaves N for the last time.
These definitions are implemented during bottom-up parsing because a bottom-up parse
corresponds to a post-order traversal, in other words, a post order traversal corresponds to the
order that an LR parser reduces the production body to its head.
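Post-order evaluation of a synthesized attribute can be sketched on a small expression tree. The tree encoding (numbers for leaves, (operator, left, right) tuples for interior nodes) and the attribute name val are assumptions made for the example.

```python
def evaluate(node):
    """Compute the synthesized attribute `val` of a node in post-order:
    both children are evaluated before the node itself."""
    if isinstance(node, (int, float)):        # leaf: val comes from the lexer
        return node
    operator, left, right = node
    left_val = evaluate(left)
    right_val = evaluate(right)
    if operator == "+":
        return left_val + right_val
    if operator == "*":
        return left_val * right_val
    raise ValueError(f"unknown operator {operator!r}")


# Tree for 3 + 4 * 5
TREE = ("+", 3, ("*", 4, 5))
```

Each node's value depends only on its children, so a single bottom-up pass is enough; this is the order in which an LR parser would perform the corresponding reductions.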
L-Attributed Definitions.
The idea is that between attributes associated with a production body, the edges of a
dependency graph can go from left to right but not the other way round (right to left), hence
the name 'L-attributed'.
Each attribute must be either
1. Synthesized, or,
2. Inherited but with limited rules, i.e. suppose there is a production A → X1X2...Xn and
an inherited attribute Xi.a computed by a rule associated with this production; then the
rule only uses:
** Inherited attributes that are associated with the head A.
** Either inherited or synthesized attributes associated with the occurrences
of symbols X1, X2, ..., Xi-1 located to the left of Xi.
** Inherited or synthesized attributes that are associated with this occurrence
of Xi itself, only in such a way that no cycles exist in the dependency graph formed
by the attributes of Xi.
Syntax Tree
Parse Tree
(b)
1. Assignment Statement-
x = y op z and x = op y
Here,
x, y and z are the operands.
op represents the operator.
It assigns the result obtained after solving the right-side expression of the assignment
operator to the left side operand.
2. Copy Statement-
x=y
Here,
x and y are the operands.
= is an assignment operator.
3. Conditional Jump-
If x relop y goto X
Here,
x & y are the operands.
X is the tag or label of the target statement.
relop is a relational operator.
4. Unconditional Jump-
goto X
Triples
      Operator   Arg1   Arg2
(0)   uminus     c
(1)   +          a      b
(2)   -          (1)    (0)
Indirect Triples
100   (0)
101   (1)
102   (2)
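The difference between triples and indirect triples can be shown with a tiny interpreter. This is an illustrative sketch: the triples are the three listed above, an integer argument stands for a reference such as (1), and the variable values in the usage note are made up.

```python
# (operator, arg1, arg2); an int argument refers to the result of that triple.
TRIPLES = [
    ("uminus", "c", None),   # (0)
    ("+", "a", "b"),         # (1)
    ("-", 1, 0),             # (2)
]
# Indirect triples: a separate list of pointers gives the execution order.
STATEMENTS = [0, 1, 2]       # entries 100, 101, 102 in the table


def run(triples, order, env):
    """Execute triples in the given order and return the last result."""
    results = {}

    def value(arg):
        return results[arg] if isinstance(arg, int) else env[arg]

    for index in order:
        operator, arg1, arg2 = triples[index]
        if operator == "uminus":
            results[index] = -value(arg1)
        elif operator == "+":
            results[index] = value(arg1) + value(arg2)
        elif operator == "-":
            results[index] = value(arg1) - value(arg2)
        else:
            raise ValueError(f"unknown operator {operator!r}")
    return results[order[-1]]
```

With a = 2, b = 3, c = 4 the result is (2 + 3) - (-4) = 9. Reordering the statement list to [1, 0, 2] gives the same answer without touching the triples themselves, which is the advantage of the indirect form when an optimizer moves code.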
Solution:
Syntax directed definition specifies the values of attributes by associating semantic rules
with the grammar productions. It is a context free grammar with attributes and rules together
which are associated with grammar symbols and productions respectively.
The process of syntax directed translation is two-fold:
• Construction of the syntax tree, and
• Computing values of attributes at each node by visiting the nodes of the syntax tree.
Semantic actions
Semantic actions are fragments of code which are embedded within production bodies in
syntax directed translation.
They are usually enclosed within curly braces ({ }).
They can occur anywhere in a production but usually appear at the end of the production.
(eg.)
E → E1 + T {print ‘+’}
Types of translation
• L-attributed translation
o It performs translation during parsing itself.
o No need of explicit tree construction.
o L represents ‘left to right’.
• S-attributed translation
o It is performed in connection with bottom up parsing.
o ‘S’ represents synthesized.
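The effect of a semantic action such as {print '+'} can be seen in a small infix-to-postfix translator. This sketch assumes a second production, T → operand {print operand}, which the text does not list, and it handles only operands separated by +; the function name is hypothetical.

```python
def to_postfix(tokens):
    """Translate `t1 + t2 + ...` to postfix by emitting output at the points
    where the semantic actions would run during a bottom-up parse."""
    output = []
    stream = iter(tokens)
    first = next(stream, None)
    if first is None or first == "+":
        raise SyntaxError("expected an operand")
    output.append(first)                  # action of T (assumed): print operand
    for token in stream:
        if token != "+":
            raise SyntaxError(f"expected '+', got {token!r}")
        operand = next(stream, None)
        if operand is None or operand == "+":
            raise SyntaxError("expected an operand after '+'")
        output.append(operand)            # action of T for the right operand
        output.append("+")                # {print '+'} runs when E -> E1 + T is reduced
    return " ".join(output)
```

Because the action sits at the end of the production, the + is printed only after both operands have been translated, which is what produces postfix order.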
2. Target program:
The target program is the output of the code generator. The output can be:
3. Memory management
o During code generation, the symbol table entries have to be mapped to actual
addresses, and labels have to be mapped to instruction addresses.
o Mapping names in the source program to addresses of data is done cooperatively by the
front end and the code generator.
o Local variables are allocated on the stack in the activation record, while global variables are in the
static area.
4. Instruction selection:
o Nature of instruction set of the target machine should be complete and uniform.
o When you consider the efficiency of target machine then the instruction speed and
machine idioms are important factors.
o The quality of the generated code can be determined by its speed and size.
Example:
The three-address code is:
1. a := b + c
2. d := a + e
A straightforward, statement-by-statement translation gives:
1. MOV b, R0     R0 ← b
2. ADD c, R0     R0 ← c + R0
3. MOV R0, a     a ← R0
4. MOV a, R0     R0 ← a
5. ADD e, R0     R0 ← e + R0
6. MOV R0, d     d ← R0
The fourth instruction is redundant, because R0 already holds the value of a.
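The statement-by-statement translation in the example, and the redundant load it produces, can be reproduced with a short sketch. The function names, the tuple encoding of three-address statements and the SUB mnemonic are assumptions; only MOV and ADD appear in the text.

```python
MNEMONIC = {"+": "ADD", "-": "SUB"}   # SUB is an assumed mnemonic


def generate(three_address_code):
    """Translate each (dest, left, op, right) statement independently,
    always using register R0."""
    code = []
    for dest, left, op, right in three_address_code:
        code.append(f"MOV {left}, R0")
        code.append(f"{MNEMONIC[op]} {right}, R0")
        code.append(f"MOV R0, {dest}")
    return code


def drop_redundant_loads(code):
    """Remove `MOV x, R0` when the previous instruction was `MOV R0, x`,
    since R0 already holds the value of x."""
    kept = []
    for instruction in code:
        if instruction.startswith("MOV ") and instruction.endswith(", R0"):
            source = instruction[4:-4]
            if kept and kept[-1] == f"MOV R0, {source}":
                continue
        kept.append(instruction)
    return kept


PROGRAM = [("a", "b", "+", "c"), ("d", "a", "+", "e")]
```

`generate(PROGRAM)` reproduces the six instructions of the example, and `drop_redundant_loads` removes the fourth one. Doing this well in general is a matter of instruction selection and register allocation, not just a cleanup pass over adjacent instructions.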
5. Register allocation
Registers can be accessed faster than memory. Instructions involving operands in
registers are shorter and faster than those involving memory operands.
Register allocation: In register allocation, we select the set of variables that will reside
in registers.
Certain machines require even-odd pairs of registers for some operands and results.
6. Evaluation order
The efficiency of the target code can be affected by the order in which the
computations are performed. Some computation orders need fewer registers to hold
intermediate results than others.