UNIT-1 Objective:: Overview of A Language-Processing System

This document provides an overview of compiler design, focusing on lexical analysis and its role in language processing systems. It covers the functions of preprocessors, compilers, assemblers, interpreters, linkers, and loaders, as well as the phases of a compiler including lexical, syntax, and semantic analysis. Additionally, it explains key concepts such as tokens, patterns, and lexemes, and outlines the differences between compilers and interpreters.
COMPILER DESIGN UNIT-1

Objective:
To familiarize students with the lexical analyzer.

Syllabus:
Lexical analysis Overview of language processing, preprocessors, compiler,
assembler, interpreters, linkers & loaders, phases of a compiler. Lexical Analysis-
Role of lexical analysis, lexical analysis vs parsing, token, patterns and lexemes,
lexical errors, transition diagram for recognition of tokens, reserved words and
identifiers.

Learning Outcomes:
Students will be able to
• enumerate the components of a language-processing system.
• identify the differences between a compiler and an interpreter.
• design a lexical analyzer for a given language.

Learning Material
Overview of A language-processing System

Pre-processor:
A pre-processor is a program that processes its input data to produce output that is
used as input to another program.
Skeletal source program -> Preprocessor -> Source program

Functions of pre-processor:
1. Macro processing: A pre-processor may allow a user to define macros that
are short hands for longer constructs.
2. File inclusion: A pre-processor may include header files into the program
text.

DEPARTMENT OF INFORMATION TECHNOLOGY GEC Page 1


3. Rational pre-processor: These pre-processors augment older languages with
more modern flow-of-control and data-structuring facilities.
4. Language extensions: These pre-processors attempt to add capabilities to
the language by what amounts to built-in macros.
Example
A C pre-processor is a program that accepts C code with pre-processing
statements and produces a pure form of C code that contains no pre-processing
statements.
• Commands recognized by the pre-processor are called pre-processor directives;
they begin with the “#” symbol.
• #define - Defines a macro. A common use is to name a constant value, which
may be of any of the basic data types.
• #include <file_name> - The contents of the file “file_name” are inserted into
the C program at the point where “#include <file_name>” appears.

#include <stdio.h>                 /* header file inclusion */

/* macro definitions: each use below is replaced by macro expansion */
#define height 100
#define number 3.14
#define letter 'A'
#define letter_sequence "ABC"
#define backslash_char '\?'        /* escape sequence for the character ? */

int main(void)
{
    printf("value of height : %d \n", height);
    printf("value of number : %f \n", number);
    printf("value of letter : %c \n", letter);
    printf("value of letter_sequence : %s \n", letter_sequence);
    printf("value of backslash_char : %c \n", backslash_char);
    return 0;
}
Output:
value of height : 100
value of number : 3.140000
value of letter : A
value of letter_sequence : ABC
value of backslash_char : ?

Compiler:
• A compiler is a computer program that reads a program written in one
language (the source language) and translates it into an equivalent program
in another language (the target language).
• An important role of a compiler is to report any errors it detects in the
source program to the user.
• There are thousands of source languages, ranging from traditional
programming languages such as Fortran and Pascal to specialized
languages.
• The target language may be another programming language or the machine
language of any computer, from a microprocessor to a supercomputer.
• Compilers are sometimes classified as follows, depending on how
they have been constructed or on what function they are supposed to
perform:
single-pass
multi-pass
load-and-go
debugging
optimizing
• The first compilers started to appear in the early 1950s.
Assembler:
 An assembler is a translator that converts assembly code into machine code.
 Assembly code is a mnemonic version of machine code, in which names are
used instead of binary codes for operations, and names are also given to
memory addresses.
 A typical sequence of assembly instructions might be
MOV a, R1
ADD #2, R1
MOV R1, b
• Typically, assemblers make two passes over the assembly file:
- First pass: reads each line and records labels in a symbol table.
- Second pass: uses the information in the symbol table to produce the actual
machine code for each line.
Assembly language program -> Assembler -> Machine code

Interpreter:
• An interpreter is another common kind of language processor. Instead of
producing a target program as a translation, it appears to directly execute
the operations specified in the source program on inputs supplied by the
user.
• It provides a better debugging environment.
• An interpreter can usually give better error diagnostics than a compiler,
because it executes the source program statement by statement.
• Java language processors combine compilation and interpretation.
• A Java source program may first be compiled into an intermediate form
called bytecodes.
• These bytecodes are then interpreted by a virtual machine.
• A benefit of this arrangement is that bytecodes compiled on one machine
can be interpreted on another machine, perhaps across a network.
• To achieve faster processing of inputs to outputs, some Java
compilers, called just-in-time compilers, translate the bytecodes into
machine language immediately before they run the intermediate program to
process the input.
• Some languages that are commonly interpreted are
BASIC, LISP and Python.
Differences between a Compiler and an Interpreter

1. Translation
   Compiler: scans the entire program first and then translates it into
   equivalent machine code.
   Interpreter: scans and executes the program line by line.
2. Memory
   Compiler: compiled programs generally take more memory because the entire
   program has to reside in memory.
   Interpreter: interpreted programs generally take less memory because only
   one line of code is processed at a time.
3. Debugging
   Compiler: a compiled program is more difficult to debug.
   Interpreter: debugging is easier because the interpreter stops and reports
   errors as it encounters them.
4. Execution time
   Compiler: execution time is less.
   Interpreter: execution time is more.
5. Code optimization
   Compiler: optimization over the whole program is possible.
   Interpreter: optimization is limited, because only a small part of the
   program is seen at a time.
6. Examples
   Compiler: C, C++
   Interpreter: LISP, Python


Linker:
• A linker, or link editor, is a computer program that takes one or more object
files generated by a compiler and combines them into a single executable
file, library file, or another object file.
• The linker resolves external memory addresses, where the code in one file
may refer to code in another file.
• The compiler driver usually invokes the linker automatically as the last
step in compiling a program. The linker inserts code (or maps in shared
libraries) to resolve program library references, and/or combines object
modules into an executable image suitable for loading into memory.
• Static linking is the result of the linker copying all library routines used in
the program into the executable image. This may require more disk space
and memory, but the resulting program is more portable and can start faster,
since it does not require the presence of the library on the system where it
is run.
• Dynamic linking is accomplished by placing the name of a sharable library
in the executable image. Actual linking with the library routines does not
occur until the image is run, when both the executable and the library are
placed in memory. An advantage of dynamic linking is that multiple
programs can share a single copy of the library.
• If the linker cannot find the definition of a function or library that the
program refers to, it reports an unresolved-reference error and no
executable is produced.
• Usually a large program is divided into smaller subprograms called
modules. These modules must be combined before the program can be
executed; the linker performs this combination.
• The linker converts object code into an executable format that the
operating system can load.
Loader
• The loader places the executable program into memory and prepares it for
execution.
• Relocating loaders
- Some operating systems need relocating loaders, which adjust
addresses (pointers) in the executable to compensate for variations in
the address at which loading starts.
- The operating systems that need relocating loaders are those in which
a program is not always loaded into the same location in the address
space and in which pointers are absolute addresses rather than offsets
from the program's base address.
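The address adjustment performed by a relocating loader can be sketched in C as follows. This is only an illustration under simplifying assumptions: the program image is an array of 32-bit words linked as if it were loaded at address 0, and a separate table lists which words hold absolute addresses. The names relocate, image, reloc_offsets and load_base are hypothetical and do not belong to any real loader.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical image format: image[] holds the program's words, some of
   which are absolute addresses computed as if the program started at
   address 0. reloc_offsets[] lists the indices of those words. */
void relocate(uint32_t *image, const size_t *reloc_offsets,
              size_t n_relocs, uint32_t load_base)
{
    for (size_t i = 0; i < n_relocs; i++) {
        /* Shift each recorded address by the actual load address. */
        image[reloc_offsets[i]] += load_base;
    }
}
```

Words that are not listed in the relocation table (ordinary data or offsets relative to the base address) are left unchanged, which is why systems that use only base-relative addressing do not need this step.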
Linking and loading together provide four functions:
1. Allocation
2. Relocation
3. Linking
4. Loading

Analysis and Synthesis model of Compiler


• The analysis part breaks up the source program into constituent pieces and
creates an intermediate representation of the source program.
• In compiling, analysis consists of three phases:
1. Linear analysis or lexical analysis, in which the stream of
characters making up the source program is read from left to
right and grouped into tokens, which are sequences of characters
having a collective meaning.
2. Hierarchical analysis (syntax analysis), in which characters or
tokens are grouped hierarchically into nested collections with
collective meaning.
3. Semantic analysis, in which certain checks are performed to
ensure that the components of a program fit together
meaningfully.
• The synthesis part constructs the desired target program from the
intermediate representation.
• In compiling, synthesis consists of two phases:
- code optimization
- code generation
• The analysis part is often called the front end of the compiler, and the
synthesis part the back end.
Phases of a compiler


Symbol Table Management:


• An essential function of a compiler is to record the identifiers used in the
source program and collect information about various attributes of each
identifier.
• The symbol table is a data structure containing a record for each identifier,
with fields for the attributes of the identifier.
• The attributes of an identifier include:
- its type (determined during semantic analysis and intermediate code generation)
- its scope (determined during semantic analysis and intermediate code generation)
- the storage allocated for it (determined during code generation)
- for procedure names, the number and types of the arguments and
the type returned
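A symbol table along these lines might be sketched in C as shown here. The record layout, the fixed capacity and the function names symtab_lookup and symtab_insert are assumptions made for illustration; production compilers usually use a hash table rather than the linear search used in this sketch.

```c
#include <string.h>

#define MAX_SYMBOLS 64
#define MAX_NAME 32

/* One record per identifier; the fields are illustrative attributes. */
struct symbol {
    char name[MAX_NAME];
    const char *type;   /* e.g. "real"; filled in by semantic analysis */
    int scope;          /* nesting depth of the declaration */
    int offset;         /* storage assigned by code generation, -1 if none */
};

struct symtab {
    struct symbol entries[MAX_SYMBOLS];
    int count;
};

/* Return the index of name in the table, or -1 if it is absent. */
int symtab_lookup(const struct symtab *t, const char *name)
{
    for (int i = 0; i < t->count; i++)
        if (strcmp(t->entries[i].name, name) == 0)
            return i;
    return -1;
}

/* Return the index of name, inserting a fresh record if needed.
   Returns -1 if the table is full or the name is too long. */
int symtab_insert(struct symtab *t, const char *name)
{
    int i = symtab_lookup(t, name);
    if (i >= 0)
        return i;
    if (t->count == MAX_SYMBOLS || strlen(name) >= MAX_NAME)
        return -1;
    struct symbol *s = &t->entries[t->count];
    strcpy(s->name, name);
    s->type = NULL;     /* attributes are filled in by later phases */
    s->scope = 0;
    s->offset = -1;
    return t->count++;
}
```

The lexical analyzer would typically call symtab_insert each time it recognizes an identifier, so that repeated occurrences of the same name share one record.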

 For example for the statement below , the symbol table entries are shown
below

Error handler:
• Each phase can encounter errors.
• After detecting an error, a phase must somehow deal with that error, so that
compilation can proceed, allowing further errors in the source program to
be detected.
• The lexical analysis phase detects errors where the characters remaining in
the input do not form any token of the language.
• The syntax analysis phase detects token streams that violate the structure
(syntax) rules of the language.
• The semantic analysis phase detects constructs that have the right syntactic
structure but no meaning for the operation involved.
Lexical analysis
• Lexical analysis is the first phase of a compiler.
• The lexical analyzer is also called the scanner.
• The lexical analysis phase reads the characters of the source program and
groups them into a stream of tokens, in which each token represents a
logically cohesive sequence of characters, such as an identifier, a keyword
(if, while, etc.) or a punctuation character.
• For example, the characters in the statement position := initial + rate * 60
would be grouped into the following tokens:
1. The identifier position.
2. The assignment symbol :=.
3. The identifier initial.
4. The plus sign +.
5. The identifier rate.
6. The multiplication sign *.
7. The number 60.
After lexical analysis the statement is represented as

id1 := id2 + id3 * 60

• The blanks separating the characters of these tokens would normally be
eliminated during lexical analysis.
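The grouping of characters into tokens can be illustrated with a small hand-written scanner. The sketch below handles only the token kinds that occur in the example statement; the names token_kind, struct token and next_token are hypothetical, and lexemes longer than 31 characters are silently truncated.

```c
#include <ctype.h>
#include <string.h>

enum token_kind { TOK_ID, TOK_NUM, TOK_ASSIGN, TOK_PLUS, TOK_TIMES,
                  TOK_ERROR, TOK_EOF };

struct token {
    enum token_kind kind;
    char lexeme[32];
};

/* Scan one token starting at *src and advance *src past it. */
struct token next_token(const char **src)
{
    struct token tok = { TOK_EOF, "" };
    const char *p = *src;
    size_t n = 0;

    while (isspace((unsigned char)*p))      /* blanks are discarded */
        p++;

    if (*p == '\0') {
        tok.kind = TOK_EOF;
    } else if (isalpha((unsigned char)*p)) { /* letter (letter | digit)* */
        while (isalnum((unsigned char)*p)) {
            if (n < sizeof tok.lexeme - 1)
                tok.lexeme[n++] = *p;
            p++;
        }
        tok.kind = TOK_ID;
    } else if (isdigit((unsigned char)*p)) { /* digit digit* */
        while (isdigit((unsigned char)*p)) {
            if (n < sizeof tok.lexeme - 1)
                tok.lexeme[n++] = *p;
            p++;
        }
        tok.kind = TOK_NUM;
    } else if (p[0] == ':' && p[1] == '=') {
        tok.lexeme[n++] = ':';
        tok.lexeme[n++] = '=';
        p += 2;
        tok.kind = TOK_ASSIGN;
    } else {                                 /* single-character tokens */
        tok.kind = (*p == '+') ? TOK_PLUS
                 : (*p == '*') ? TOK_TIMES
                 : TOK_ERROR;
        tok.lexeme[n++] = *p;
        p++;
    }
    tok.lexeme[n] = '\0';
    *src = p;
    return tok;
}
```

Calling next_token repeatedly on the string "position := initial + rate * 60" yields the seven tokens listed above, followed by TOK_EOF. A character that starts no token is returned as TOK_ERROR, which is one simple way of reporting a lexical error.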

Syntax Analysis Phase:


• Syntax analysis imposes a hierarchical structure on the token stream. This
hierarchical structure is called a syntax tree.
• The syntax analyzer is also called the parser.
• The syntax analyzer checks whether the token stream conforms to the syntax
of the language.
• It takes the tokens from the lexical analyzer and groups them so that
programming constructs such as expressions and statements can be recognized.
• An interior node of a syntax tree is a record with a field for the
operator and two fields containing pointers to the records for the left and
right children.
• A leaf is a record with two or more fields: one to identify the token at the
leaf, and the others to record information about the token.

 Syntax trees for the example statement position := initial + rate * 60

Semantic analysis
• This phase checks the source program for semantic errors and gathers type
information for the subsequent code-generation phase.
• It uses the hierarchical structure determined by the syntax-analysis phase to
identify the operators and operands of expressions and statements.
• An important component of semantic analysis is type checking.
• Syntax tree after the semantic analysis phase for the example statement
position := initial + rate * 60


Intermediate code generation


• After syntax and semantic analysis, some compilers generate an explicit
intermediate representation of the source program.
• The intermediate representation can be thought of as a program for an
abstract machine.
• The intermediate representation should have two important properties:
- it should be easy to produce, and
- it should be easy to translate into the target program.
• Intermediate representations can take a variety of forms.
• One such form is three-address code, which is like the
assembly language for a machine in which every memory location can act
like a register.
• Three-address code consists of a sequence of instructions, each of which has
at most three operands.
• Three-address code after the intermediate code generation phase for the
example statement
position := initial + rate * 60
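One way three-address code could be produced from a syntax tree is a post-order walk that creates a fresh temporary for every interior node. The C sketch below is an illustration only: struct node, gen and the temporary names t1, t2, ... are hypothetical, leaf names are assumed to be shorter than 16 characters, and the caller is assumed to supply a large enough output buffer. It handles only the right-hand side id2 + id3 * 60 and omits type conversion and the final assignment to id1.

```c
#include <stdio.h>
#include <string.h>

/* Interior nodes carry an operator and two children; leaves carry a name. */
struct node {
    char op;                  /* '+' or '*', or 0 for a leaf */
    const char *name;         /* leaf text such as "id2" or "60" */
    struct node *left, *right;
};

static int temp_count;        /* number of temporaries created so far */

/* Append code for the subtree n to out, and store in result (16 bytes)
   the name that holds the subtree's value. */
void gen(const struct node *n, char *result, char *out)
{
    if (n->op == 0) {                      /* leaf: no code needed */
        snprintf(result, 16, "%s", n->name);
        return;
    }
    char l[16], r[16], line[64];
    gen(n->left, l, out);                  /* operands first (post-order) */
    gen(n->right, r, out);
    snprintf(result, 16, "t%d", ++temp_count);
    snprintf(line, sizeof line, "%s := %s %c %s\n", result, l, n->op, r);
    strcat(out, line);                     /* at most three operands per line */
}
```

For the tree of id2 + id3 * 60 this sketch emits t1 := id3 * 60 followed by t2 := id2 + t1. Each instruction names at most three locations, which is the defining property of three-address code.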

Code optimization
 Code optimization phase attempts to improve the intermediate code, so that
faster-running machine code will result.
 Optimized Three address code after Code Optimization phase for the
example statement
position := initial + rate * 60

Code generation
 The final phase of the compiler is the generation of target code, consisting
of relocatable machine code or assembly code.
 Memory locations are selected for each of the variables used by the
program.
 Then, each intermediate instruction is translated into a sequence of machine
instructions that perform the same task.
 A crucial aspect is the assignment of variables to registers.


Pass
A pass is one complete traversal of the source program or of its intermediate
representation. Several phases of compilation are usually grouped into one pass.

Phase
A phase is a logical unit of the compiler that performs one particular task.

Differences between pass and phase of a compiler

1. Space
   Pass: requires more space.
   Phase: requires less space.
2. Time
   Pass: a single pass takes more time to execute.
   Phase: a single phase takes less time to execute.
3. Examples
   Pass: single-pass compiler, multi-pass compiler.
   Phase: lexical analysis, syntax analysis.

Role of lexical analysis


• The lexical analyzer is the first phase of a compiler.
• Its main task is to read the input characters and produce as output a
sequence of tokens that the parser uses for syntax analysis.
• Another task of the lexical analyzer is stripping out comments and white
space (blank, tab and newline characters) from the source program.
• A further task is correlating error messages from the compiler with the
source program.
• The lexical analyzer may keep track of the number of newline characters
seen, so that a line number can be associated with an error message.
• In some compilers, the lexical analyzer is in charge of making a copy of the
source program with the error messages marked in it.
• If the lexical analyzer finds an invalid token, it generates an error.
• The lexical analyzer works closely with the syntax analyzer. It reads
character streams from the source code, checks for legal tokens, and passes
the next token to the syntax analyzer whenever the syntax analyzer asks
for it.
• The lexical analyzer collects information about tokens into their associated
attributes.


Token:
• A token is a sequence of characters that can be treated as a single logical
entity.
• Typical tokens are:
1) identifiers 2) keywords 3) operators 4) special symbols
5) constants
Pattern:
• A pattern is a rule that describes the set of strings associated with a token.
• It is expressed as a regular expression that describes how a particular token
can be formed. For example, a pattern for identifiers is [A-Za-z][A-Za-z_0-9]*
Lexeme:
• A lexeme is a sequence of characters in the source program that is matched
by the pattern for a token.
• Each lexeme corresponds to a token.

In many programming languages, the following classes cover most or all of
the tokens:
1. One token for each keyword. The pattern for a keyword is the same as the
keyword itself.
2. Tokens for the operators, either individually or in classes.
3. One token representing all identifiers.
4. One or more tokens representing constants, such as numbers and literal
strings.
5. Tokens for each punctuation symbol, such as left and right parentheses,
comma, and semicolon.

Attributes for tokens:


• In practice, a token usually has only a single attribute: a pointer to the
symbol-table entry in which the information about the token is kept. Some
tokens, such as operators, need no attribute, and a number token may carry
its value instead.
• The token names and associated attribute values for the statement
E = M * C + 2 are written below as a sequence of pairs.
<id, pointer to symbol-table entry for E>
<assign_op>
<id, pointer to symbol-table entry for M>
<mult_op>
<id, pointer to symbol-table entry for C>
<add_op>
<number, integer value 2>


Specification of Tokens
Regular expressions are an important notation for specifying lexeme patterns.

Regular expressions
The languages accepted by finite automata are easily described by simple
expressions called regular expressions.
Let Σ be an alphabet. The regular expressions over Σ and the sets that they denote
are defined recursively as follows.
1) Ø is a regular expression and denotes the empty set.
2) ε is a regular expression and denotes the set { ε }.
3) For each a in Σ, a is a regular expression and denotes the set {a}.
4) If r and s are regular expressions denoting the languages R and S, respectively,
then (r + s), (rs), and (r*) are regular expressions that denote the sets R ∪ S,
RS, and R*, respectively. The union (r + s) is also commonly written (r | s).

Example:
Regular expression for a Pascal identifier:
letter (letter | digit)*

Regular Definition
If Σ is an alphabet of basic symbols, then a regular definition is a sequence of
definitions of the form
d1 -> r1
d2 -> r2
...
dn-> rn
where each di is a distinct name and each ri is a regular expression over the
symbols in
Σ ∪ {d1, d2, ..., di-1}, that is, the basic symbols and the previously defined names.

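Examples:
Regular definition for identifiers in PASCAL
letter -> A | B | ... | Z | a | b | ... | z
digit -> 0 | 1 | ... | 9
id -> letter (letter | digit)*

Regular definition for white space in PASCAL or in C
delim -> blank | tab | newline
ws -> delim+

Regular definition for unsigned numbers in PASCAL or in C,
such as 5280, 39.37, 6.336E4 or 1.894E-4
digits -> digit digit*
optional_fraction -> . digits | ε
optional_exponent -> (E (+ | - | ε) digits) | ε
num -> digits optional_fraction optional_exponent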
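A pattern for unsigned numbers can be turned into code fairly directly. The C sketch below assumes the usual textbook form of the pattern, namely one or more digits, optionally followed by a fraction (a point and digits), optionally followed by an exponent (E, an optional sign, and digits). The function name match_unsigned_number is hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/* Return the length of the longest prefix of s that is an unsigned number
   of the form digits [. digits] [E [+|-] digits], or 0 if there is none. */
size_t match_unsigned_number(const char *s)
{
    size_t i = 0;

    if (!isdigit((unsigned char)s[i]))
        return 0;
    while (isdigit((unsigned char)s[i]))
        i++;

    /* Optional fraction: accept the point only if a digit follows it. */
    if (s[i] == '.' && isdigit((unsigned char)s[i + 1])) {
        i++;
        while (isdigit((unsigned char)s[i]))
            i++;
    }

    /* Optional exponent: accept it only if it ends in at least one digit. */
    if (s[i] == 'E') {
        size_t j = i + 1;
        if (s[j] == '+' || s[j] == '-')
            j++;
        if (isdigit((unsigned char)s[j])) {
            while (isdigit((unsigned char)s[j]))
                j++;
            i = j;
        }
    }
    return i;
}
```

The function matches the whole of each example above: 5280, 39.37, 6.336E4 and 1.894E-4. On an input such as "12." it matches only "12", because the optional parts are taken only when they are complete.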


Recognition of Tokens

The problem is how to take the patterns for all the needed tokens and build a
piece of code that examines the input string and finds a prefix that is a lexeme
matching one of the patterns.

Example:

Transition diagram for the keyword then in PASCAL

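[Transition diagram: a chain of states linked by edges labeled t, h, e, n, followed by an edge labeled non-letter/digit into the accepting state]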
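Instead of drawing a separate diagram for every keyword, a scanner can recognize any word with the identifier pattern and then look it up in a table of reserved words. The C sketch below illustrates this alternative; the contents of the reserved table and the names word_kind and classify_word are assumptions, and the comparison is case-sensitive although Pascal keywords are not.

```c
#include <ctype.h>
#include <string.h>

enum word_kind { WORD_NONE, WORD_KEYWORD, WORD_ID };

/* Illustrative reserved-word table; a real language has many more. */
static const char *const reserved[] = { "if", "then", "else", "while" };

/* Classify the word at the start of s and store its length in *len. */
enum word_kind classify_word(const char *s, size_t *len)
{
    size_t n = 0;

    if (!isalpha((unsigned char)s[0])) {
        *len = 0;
        return WORD_NONE;
    }
    while (isalnum((unsigned char)s[n]))   /* letter (letter | digit)* */
        n++;
    *len = n;

    /* The word ended at a non-letter/digit; now consult the table. */
    for (size_t k = 0; k < sizeof reserved / sizeof reserved[0]; k++)
        if (strlen(reserved[k]) == n && strncmp(s, reserved[k], n) == 0)
            return WORD_KEYWORD;
    return WORD_ID;
}
```

Because the whole word is read before the lookup, an input such as thenx is classified as an identifier rather than as the keyword then followed by x. The final non-letter/digit edge of the diagram serves the same purpose.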


Transition diagrams for white space in PASCAL or in C

Transition diagrams for unsigned numbers in PASCAL or in C

Transition diagrams for Relational operators in PASCAL or in C
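A transition diagram can be implemented as a loop around a switch on the current state. The C sketch below simulates a diagram for the Pascal relational operators <, <=, <>, =, > and >=. The state numbering and the names relop and match_relop are hypothetical; for C the operator set would differ (==, != and so on).

```c
enum relop { RELOP_NONE, RELOP_LT, RELOP_LE, RELOP_NE,
             RELOP_EQ, RELOP_GT, RELOP_GE };

/* Simulate the diagram on the start of s. The length of the matched
   lexeme is stored in *len (0 if no relational operator is found). */
enum relop match_relop(const char *s, int *len)
{
    int state = 0;   /* 0 = start, 1 = seen '<', 2 = seen '>' */
    int i = 0;       /* index of the next input character */

    for (;;) {
        char c = s[i];
        switch (state) {
        case 0:
            if (c == '<')      { state = 1; i++; }
            else if (c == '=') { *len = 1; return RELOP_EQ; }
            else if (c == '>') { state = 2; i++; }
            else               { *len = 0; return RELOP_NONE; }
            break;
        case 1:
            if (c == '=') { *len = 2; return RELOP_LE; }
            if (c == '>') { *len = 2; return RELOP_NE; }
            *len = 1;           /* other character: not part of the lexeme */
            return RELOP_LT;
        case 2:
            if (c == '=') { *len = 2; return RELOP_GE; }
            *len = 1;           /* other character: not part of the lexeme */
            return RELOP_GT;
        }
    }
}
```

States 1 and 2 look at one character beyond the operator to decide between the one-character and two-character forms. When that character does not belong to the operator it is left in the input, which corresponds to the retraction step in a transition diagram.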

Implementation of Lexical Analyzer


There are three general approaches to the implementation of a lexical analyzer.
1. Use a lexical-analyzer generator, such as the Lex compiler, to produce the
lexical analyzer from a regular-expression-based specification.
2. Write a lexical analyzer in a conventional systems-programming language,
using the I/O facilities of that language to read the input.
3. Write the lexical analyzer in assembly language and explicitly manage the
reading of input.


Lexical errors:
• It is hard for a lexical analyzer to tell, without the aid of other components,
that there is a source-code error.
• For instance, suppose the string fi is encountered for the first time in a C
program in the context:
fi ( a == f ( x ) ) ...
• A lexical analyzer cannot tell whether fi is a misspelling of the keyword if
or an undeclared function identifier.
• Since fi is a valid lexeme for the token id, the lexical analyzer must return
the token id to the parser and let some other phase of the compiler
(probably the parser in this case) handle the error due to the transposition
of the letters.
• When the lexical analyzer cannot proceed because no pattern matches a prefix
of the remaining input, possible error-recovery actions are:
1. Delete one character from the remaining input.
2. Insert a missing character into the remaining input.
3. Replace a character by another character.
4. Transpose two adjacent characters.

Lexical Analysis Vs Parsing


1. The separation of lexical and syntactic analysis often allows us to simplify
at least one of these tasks.
2. Compiler efficiency is improved.
3. Compiler portability is enhanced.

Assignment-Cum-Tutorial Questions

A. Questions testing the remembering / understanding level of students


I) Objective Questions
1. The output of a pre-processor is [ ]
a) absolute machine language program b) relocatable machine language program
c) assembly language program d) a high level language program
2. A compiler running on computers with small memory would normally be [ ]
a) a multi-pass compiler b) a single-pass compiler
c) a compiler with fewer phases d) none of these
3. A computer program that translates a program statement by statement into
machine language is called a _________________.
4. Front end of compiler does not include the phase ____________ [ ]
a) semantic analysis b) intermediate code generation
c) code optimization d) lexical analysis
5. Back end of compiler includes those phases that depend on [ ]
a) target machine b) source language
c) both a and b d) none of the above
6. Assembly language __________ [ ]
a) is usually the primary user interface b) requires fixed format commands
c) is a mnemonic form of machine language d) is quite different from the SCL interpreter
7. In a compiler, grouping of characters into tokens is done by _________________.
8. __________________ is a sequence of characters in the source program that
is matched to some pattern for a token.
9. In a compiler, keywords of a language are recognized during the ____________
phase.
10. Match the following [ ]
LIST-1 LIST-2
A. Pre-processor 1) resolving external references
B. Assembler 2) loading the program
C. Loader 3) producing relocatable machine code
D. Linker 4) allowing the user to define shorthand for longer constructs

A B C D
a) 4 3 2 1
b) 3 4 1 2
c) 4 3 1 2
d) 4 2 3 1
11. r+ represents _________________________.

II) Descriptive questions

1. What are the functions of pre-processing?
2. Explain the need and functionality of linkers, assemblers and loaders.
3. Draw a block diagram of phases of a compiler and indicate the main
functions of each phase.
4. What is the role of Lexical analyzer in a compiler?
5. Define token, lexeme and pattern.
6. Explain in brief about Lexical errors.
7. Explain the reasons why lexical analysis is separated from syntax analysis.
8. Define regular expression with notation.
B. Questions testing the ability of students in applying the concepts.
I) Multiple Choice Questions:
1. Relocating loaders perform four functions in which order? [ ]
a. Allocation, linking, relocation, loading
b. Loading, linking, relocation, allocation
c. Allocation, loading, relocation, linking
d. none of the above
2. Which of the following phases of the compilation process is optional? [ ]
a. Lexical analysis phase b. Syntax analysis phase
c. Code optimization d. Code generation
3. Storage mapping is done by [ ]
a. Operating system b. Compiler c. Linker d. Loader
4. Which of the following is the name of the data structure in a compiler that is
responsible for managing information about variables and their attributes? [ ]
a. Symbol table b. Attribute grammar c. Stack d. Syntax tree
5. The lexical analysis for a modern computer language such as Java needs the
power of which one of the following machine models in a necessary and
sufficient sense? [ ]
a. Finite state automata b. Deterministic pushdown automata
c. Non-deterministic pushdown automata d. Turing machine
6. How many tokens are there in the following code? [ ]
int max(i,j)
int i,j;
{
return i > j ? i : j;
}
a. 23 b. 20 c. 25 d. 19
7. Find the number of tokens in the following statement [ ]
printf("i = %d, &i = %x", i, &i);
a. 19 b. 10 c. 22 d. 20
8. The regular expression for the identifier is given by [ ]
a. letter(letter | digit)* b. digit(digit | letter)*
c. (letter | digit)* d. All of the above
9. Which of the following are aspects of high-level languages? [ ]
a. ease of understanding b. naturalness
c. portability d. All of the above


II)Problems

1. Write the output at all phases of a compiler for the statement x=a+b*c
2. Construct the transition diagram for identifiers in C.
3. Construct syntax tree for the expression a=b*-c+b*-c.
4. Identify the lexemes that make up the tokens in the following program
segment. Indicate corresponding token and pattern
void swap(int i, int j)
{
int t;
t=i;
i=j;
j=t;
}
5. Differentiate between pass and phase of a compiler.
6. Differentiate between Compiler and Interpreter.
7. Construct transition diagram for relational operators in C.
8. Give regular expression for unsigned numbers in C.


Assignment-Cum-Tutorial Questions

A. Questions testing the understanding / remembering level of students

I) Objective Questions

1. The output of a pre-processor is [ ]
a. absolute machine language program
b. relocatable machine language program
c. assembly language program
d. a high level language program

2. A compiler running on computers with small memory would normally be [ ]
a. a multi-pass compiler b. a single-pass compiler
c. a compiler with fewer phases d. none of these

3. A computer program that translates a program statement by statement
into machine language is called a ______________.

4. Front end of compiler does not include the phase [ ]
a. semantic analysis b. intermediate code generation
c. code optimization d. lexical analysis

5. Back end of compiler includes those phases that depend on [ ]
a. target machine b. source language
c. both a and b d. none of the above

6. Assembly language __________ [ ]
a. is usually the primary user interface
b. requires fixed format commands
c. is a mnemonic form of machine language
d. is quite different from the SCL interpreter

7. Grouping of characters into tokens is done by the ___________ phase of a
compiler.

8. _____________ is a sequence of characters in the source program that is
matched to some pattern for a token.

9. Match the following [ ]
LIST-1 LIST-2
A. Pre-processor 1) resolving external references
B. Assembler 2) loading the program
C. Loader 3) producing relocatable machine code
D. Linker 4) allowing the user to define shorthand for longer constructs

A B C D
a) 4 3 2 1
b) 3 4 1 2
c) 4 3 1 2
d) 4 2 3 1

10. r+ represents ______________.

II) Descriptive questions

1. Explain briefly the language-processing system.
2. Draw a block diagram of phases of a compiler and indicate the main
functions of each phase.
3. What is the role of Lexical analyzer in a compiler?
4. Define token, lexeme and pattern.
5. Explain in brief about Lexical errors.
6. Explain the reasons why lexical analysis is separated from syntax analysis.
7. Explain the implementation of a lexical analyzer using Lex tool.


B. Question testing the ability of students in applying the concepts.


I) Multiple Choice Questions:

1. Relocating loaders perform four functions in which order? [ ]
a. Allocation, linking, relocation, loading
b. Loading, linking, relocation, allocation
c. Allocation, loading, relocation, linking
d. none of the above

2. Which of the following phases of the compilation process is optional? [ ]
a. Lexical analysis phase b. Syntax analysis phase
c. Code optimization d. Code generation

3. Storage mapping is done by [ ]
a. Operating system b. Compiler c. Linker d. Loader

4. Which of the following is the name of the data structure in a compiler
that is responsible for managing information about variables and their
attributes? [ ]
a. Symbol table b. Attribute grammar c. Stack d. Syntax tree

5. The lexical analysis for a modern computer language such as Java needs the
power of which one of the following machine models in a necessary and
sufficient sense? [ ]
a. Finite state automata b. Deterministic pushdown automata
c. Non-deterministic pushdown automata d. Turing machine

6. How many tokens are there in the following code? [ ]
int max(i,j)
int i,j;
{
return i>j?i:j;
}
a. 23 b. 20 c. 25 d. 19

7. Find the number of tokens in the following statement [ ]
printf("i=%d,&i=%x",i,&i);
a. 19 b. 10 c. 22 d. 20

8. The regular expression for the identifier is given by [ ]
a. letter(letter | digit)* b. digit(digit | letter)*
c. (letter | digit)* d. All of the above

9. Which of the following are aspects of high-level languages? [ ]
a. ease of understanding b. naturalness
c. portability d. All of the above

10. In a compiler, keywords of a language are recognized during [ ]
a. parsing of the program b. the code generation
c. the lexical analysis of the program d. dataflow analysis

II)Problems

1. Write the output at all phases of a compiler for the following statement
x=a+b*c
2. Construct the transition diagram for identifiers in C.
3. Construct syntax tree for the expression a=b*-c+b*-c.
4. Identify the lexemes that make up the tokens in the following program
segment. Indicate corresponding token and pattern
void swap(int i, int j)
{
int t;
t=i;
i=j;
j=t;
}
5. Differentiate between pass and phase of a compiler.
6. Differentiate between Compiler and Interpreter.
7. Construct transition diagram for relational operators in C.
8. Give regular expression for unsigned numbers in C.
