System Programming Notes


Q1. Short Note on Linker.

Ans. A linker is system software that combines two or more object modules to form an executable program.
A longer program is usually divided into smaller subprograms called modules, and these modules must
be combined before the program can be executed. The linker performs this combining step.

Characteristics:- The main characteristics are as follows-


1. A linker is also called a link editor or binder.
2. For a large program, programmers generally prefer to break the code into smaller modules, as this simplifies
the programming task.
3. Once the source code of all the modules has been converted into object code, the modules need to be
put together, which is done by the linker.
4. The compiled and linked program is called executable code or an executable program.

Explanation:- High-level languages come with built-in
header files and libraries. These libraries are
predefined and contain basic functions that programs
commonly need. The linker connects each call to such a
function with its definition in the library. If the
linker cannot find a definition for a function, it
reports an error and no executable is produced.
The compiler usually invokes the linker automatically as
the last step in building a program.
Besides the built-in libraries, the linker also links
user-defined functions from user-defined modules and libraries.

Fig:- Diagram of Linker

 Function:- The following are the functions of Linker as tasks-


o Searches the program to find the library routines it uses, e.g. printf(), sqrt(), strcat() and others.
o Determines the memory locations that the code from each module will occupy and relocates its instructions by adjusting
absolute references. [Relocation modifies the object program so that it can be loaded at an address different
from the location originally specified.]
o Combines two or more separate object programs and supplies the information needed to allow references between
them.
Example of Linking Concept:- In C, source code (.c) is compiled into object code (.obj). After this
the linker comes into play: it resolves all the references in the .obj file by linking them to their respective code and
resources. In short, the linker performs the final step that converts the .obj file into an executable, machine-readable file (.exe).
e.g.:
#include<stdio.h>
int main()
{
printf("ashwin");
return 0;
}

Here the compiler first looks for a declaration of printf(), finds it in the header named by the #include directive (stdio.h), and creates the .obj file successfully.
The .obj file contains a symbol table listing all the references that remain to be
resolved. The linker resolves them by supplying the respective code or resource, so the code referred to by "printf" can run once the
linker has successfully created the .exe file.

 Types:- There are mainly two types of Linker as following-


1. Static Linking.
2. Dynamic Linking.
Now we'll discuss the types one by one.

1. Static Linking:- In static linking, all the modules required to complete a program are physically
placed together to generate an executable program file. The file can then be "loaded" at any subsequent time
to run the program.
Static linking is the result of the linker copying all library routines used in the program into the executable
image. This may require more disk space and memory than dynamic linking, but it is often faster to start and more portable, since it does
not require the presence of the library on the system where it is run.

Advantages:- Static linking lets you create self-contained, independent
programs. In other words, the executable program consists of a single part (the .EXE file) that you need to keep track of.
Disadvantages:-
o You cannot change the behavior of executable files without relinking them.
o Externally called programs cannot be shared, so duplicate copies of
them must be loaded in memory if more than one calling program needs to access
them.
2. Dynamic Linking:- In dynamic linking, the actual task of linking is performed just before the program runs,
with the individual modules already in memory. This approach has several advantages over static
linking and provides a powerful way to extend the function of programs. MS Windows
makes extensive use of DLLs (Dynamic-Link Libraries).

Dynamic linking is accomplished by placing the name of a sharable library in the executable image.
Actual linking with the library routines does not occur until the image is run, when both the executable and the library are
placed in memory. An advantage of dynamic linking is that multiple programs can share a single copy of the library.
Advantages:-
o Often-used libraries (for example the standard system libraries) need to be stored in
only one location, not duplicated in every single binary.
o If an error in a library function is corrected by replacing the library, all programs using
it dynamically will benefit from the correction after restarting them. Programs that
included this function by static linking would have to be re-linked first.
Disadvantages:-
o An incompatible updated DLL will break executables that depended on the behavior
of the previous DLL. On the Windows platform this problem is known as "DLL Hell".
o A program, together with the libraries it uses, might be certified (e.g. as to correctness,
documentation requirements, or performance) as a package, but not if components
can be replaced.

Q2. Short Note on Loader.


Ans. A loader is a system program that loads the machine code of a program into system memory.
Loading a program involves reading the contents of the executable file into memory. Once loading is complete, the
operating system starts the program by passing control to the loaded code.

 Characteristics:- There are few characteristics of Loader as follows-


o It is a SYSTEM PROGRAM that brings an executable file residing on disk into memory and starts it running.
o A loader is the part of an Operating System that is responsible for loading programs.
o It is one of the essential stages in the process of starting a program.
o Loading a program involves reading the contents of executable file into memory.
o All operating systems that support program loading have loaders. In many operating systems the loader is
permanently resident in memory.

Functions:- The loader performs the following functions:
1) Allocation
2) Linking
3) Relocation
4) Loading

Allocation:
• Allocates the space in memory where the object
program will be loaded for execution.
• The space is allocated by
calculating the size of the program. This activity is
called allocation.
• In an absolute loader, allocation is done by the
programmer, so it is the programmer's duty to ensure that programs do not overlap.
• In a relocatable loader, allocation is done by the loader, so the assembler must supply the size of the program to the loader.
Linking:
• It links two or more object codes and provides the information needed to allow references between them.
• It resolves the symbolic references (code/data) between the object modules by assigning all the user subroutine and library
subroutine addresses. This activity is called linking.
• In absolute loader linking is done by the programmer as the programmer is aware about the runtime address of the symbols.
• In relocatable loader, linking is done by the loader and hence the assembler must supply to the loader, the locations at which
the loading is to be done.
Relocation:
• It modifies the object program by changing certain instructions so that it can be loaded at an address different from the location
originally specified.
• Some locations in the program are address dependent. Such address constants must be adjusted according to the
allocated space. This activity, done by the loader, is called relocation.
• In an absolute loader, relocation is done by the assembler, as the assembler is aware of the starting address of the program.
• In a relocatable loader, relocation is done by the loader, so the assembler must supply to the loader the locations at which
relocation is to be done.
Loading:
• It brings the object program into the memory for execution.
• Finally it places all the machine instructions and data of corresponding programs and subroutines into the memory. Thus
program now becomes ready for execution, this activity is called loading.
• In both the loaders (absolute, relocatable) Loading is done by the loader and hence the assembler must supply to the loader
the object program.

 Types:- There are several types of Loader


as follows-

1. COMPILE-AND-GO LOADER:- A compile-and-go loader is a
scheme in which the assembler itself places the
assembled instructions directly into the designated memory locations
for execution.
o The instructions are read line by line; the machine code for each is
obtained and put directly into main memory at some known
address.
o After the assembly process completes, the starting
address of the program is assigned to the location counter.
o There is no stop between the compilation, link editing,
loading, and execution of the program.
o It is also called an assemble-and-go or a load-and-go system.
ADVANTAGES:-
 They are simple and easier to implement.
 No additional routines are required to load the compiled code into the memory.
DISADVANTAGES:-
 There is wastage in memory space due to the presence of the assembler.
 There is no production of .obj file, the source code is directly converted to executable form.
Hence even though there is no modification in the source program it needs to be assembled
and executed each time.

2.GENERAL LOADER SCHEME:- In the “Compile-and-Go” scheme the assembler stays in main memory while the
assembled program runs, which wastes memory.
o To overcome this, a new program is added to the system: a loader.
o Generally the loader is smaller than the assembler.
o The loader accepts the assembled machine instructions, data and other information present in the object format and
places the machine instructions and data in core in an executable form.
o Reassembly is no longer necessary to run the program at a later date.
ADVANTAGE:- In this scheme the source program translators produce compatible object program deck formats,
so it is possible to write subroutines in several different languages: the object decks processed by the loader will
all be in the same “language”, that is, machine language.
3.ABSOLUTE LOADERS:- In this scheme the assembler outputs the machine language translation of the source program in
almost the same form as in the “Compile and go” scheme, except that the output is punched on cards instead of being placed directly in
memory.
The loader in turn simply accepts the machine language text and places it into core at the location prescribed by the
assembler.
DISADVANTAGES:-
o The programmer must specify to the assembler the address where the program is to be loaded.
o If there are multiple subroutines, the programmer must remember the address of each.
4.RELOCATING LOADERS:- Relocating loaders were introduced to avoid reassembling all subroutines when a single subroutine is changed and to
perform the tasks of allocation and linking for the programmer.
Q3. Difference between Linker and Loader.
Ans. The linker and the loader are utility programs that play a major role in the execution of a program. The
source code of a program passes through the compiler, assembler, linker and loader, in that order, before
execution. The linker takes the object code generated by the assembler and
combines it to generate the executable module. The loader then loads this executable
module into main memory for execution. Let us discuss the difference between linker and loader with the
help of a comparison chart.
Comparison Chart:-

Basic
  LINKER: It generates the executable module of a source program.
  LOADER: It loads the executable module into main memory.

Input
  LINKER: It takes as input the object code generated by an assembler.
  LOADER: It takes the executable module generated by a linker.

Function
  LINKER: It combines all the object modules of a source code to generate an executable module.
  LOADER: It allocates addresses to an executable module in main memory for execution.

Type/Approach
  LINKER: Linkage editor, dynamic linker.
  LOADER: Absolute loading, relocatable loading and dynamic run-time loading.

Q4.Difference between One Pass and Two Pass assembler.

Ans. The difference between one pass and two pass assemblers is essentially in the name. A one pass assembler passes over
the source file exactly once, collecting the labels, resolving forward references and doing the actual assembly in that same pass.
A two pass assembler makes two passes over the source file (the second pass can be over a file generated in the first pass).
Further differences are as follows-

ONE PASS ASSEMBLER vs TWO PASS ASSEMBLER

Scanning
  One pass: Scans the entire source file only once.
  Two pass: Requires two passes to scan the source file. The first pass is responsible for label definitions and
  enters them in the symbol table. The second pass translates the instructions and generates machine code.

Work done
  One pass: Generally
    • Deals with syntax.
    • Constructs the symbol table.
    • Creates the label list.
    • Identifies the code segment, data segment, stack segment etc.
  Two pass: Along with pass one, pass two is also required, which
    • Generates the actual opcode.
    • Computes the actual address of every label.
    • Assigns code addresses for debugging information.
    • Translates operand names to the appropriate register or memory code.
    • Translates immediate values to binary strings (1's and 0's).

Forward references
  One pass: Cannot resolve forward references of data symbols.
  Two pass: Can resolve forward references of data symbols.

Object program
  One pass: No object program is written, hence no loader is required.
  Two pass: A loader is required as object code is generated.

Speed
  One pass: Tends to be faster than a two pass assembler.
  Two pass: Requires rescanning, hence slower than a one pass assembler.

Symbol addresses
  One pass: Only creates tables with all symbols; no address of symbols is calculated.
  Two pass: Address of symbols can be calculated.
Q5. Short note on Symbol table.

Ans. An essential function of a compiler is to record the identifiers used in the source program and collect information about
the various attributes of each identifier.
A symbol table is a data structure containing a record for each identifier, with fields for the attributes of the identifier.
The data structure allows us to find the record for each identifier
quickly and to store or retrieve data from that record quickly. When the lexical analyzer detects an identifier in the source program,
the identifier is entered into the symbol table.

Characteristics:- The main characteristics of a symbol table are as follows-


1. A symbol table is a data structure used by the compiler to hold information about source-program constructs.
2. It stores information about the occurrence of various entities such as variable names, function names, objects,
classes, interfaces, etc.
3. It is used by both the analysis and the synthesis parts of a compiler.

Information Provided By Symbol Table:- A symbol table answers the following
questions:-
o Given an identifier, which name is it?
o What information is to be associated with a name?
o How do we access this information?

Use of Symbol Table:- A symbol table is used as follows-


o Symbol table information is used by the analysis and synthesis phases
o To verify that used identifiers have been defined (declared)
o To verify that expressions and assignments are semantically correct – type
checking
o To generate intermediate or target code

Format:- A symbol table is simply a table which can be either linear or a hash table. It maintains an entry for
each name in the following format:

<symbol name, type, attribute>


Example:- For example, if a symbol table has to store information about the following variable declaration:

static int interest;


then it should store the entry such as:
<interest, int, static>
The attribute clause contains the entries related to the name.
Implementation ways:- A symbol table can be implemented in one of the following ways:
 Linear (sorted or unsorted) list
 Binary Search Tree
 Hash table
Among all, symbol tables are mostly implemented as hash tables.

Operations:- The operations that are performed in Symbol Table as follows-


 insert() :- Add symbol to symbol table
The insert() function takes the symbol and its attributes as arguments and stores the information in the
symbol table.
For example: int a;
should be processed by the compiler as:
insert(a, int);
 lookup():- Find symbol in the symbol table (and get its attributes)
lookup() operation is used to search a name in the symbol table to determine:
o if the symbol exists in the table.
o if it is declared before it is being used.
o if the name is used in the scope.
o if the symbol is initialized.
o if the symbol declared multiple times.
The format of lookup() function varies according to the programming language. The basic format should match the following:

lookup(symbol)
This method returns 0 (zero) if the symbol does not exist in the symbol table. If the symbol exists in the symbol table, it returns
its attributes stored in the table.
Q6. Short note on Editor.
Ans. A text editor is a tool that allows a user to create and revise documents in a computer.
 Though this task can be carried out in other modes, the word text editor commonly refers to the tool that does this
interactively.
 Earlier computer documents used to be primarily plain text documents, but nowadays due to improved input-output
mechanisms and file formats, a document frequently contains pictures along with texts whose appearance (script,
size, color and style) can be varied within the document.
 Apart from producing output of such wide variety, text editors today provide many advanced features of
interactiveness and output.
Types of Text Editors:- Depending on how editing is performed, and the type of output that can be generated,
editors can be broadly classified as -
1. Line Editors - During original creation lines of text are recognized and delimited by end-of-line markers, and during
subsequent revision, the line must be explicitly specified by line number or by some pattern context. eg. edlin editor in
early MS-DOS systems.
2. Stream Editors - The idea here is similar to line editor, but the entire text is treated as a single stream of
characters. Hence the location for revision cannot be specified using line numbers. Locations for revision are either
specified by explicit positioning or by using pattern context. eg. sed in Unix/Linux.
Line editors and stream editors are suitable for text-only documents.
3. Screen Editors - These allow the document to be viewed and operated upon as a two dimensional plane, of which
a portion may be displayed at a time. Any portion may be specified for display and location for revision can be
specified anywhere within the displayed portion. eg. vi, emacs, etc.
4. Word Processors - Provides additional features to basic screen editors. Usually support non-textual contents and
choice of fonts, style, etc.
5. Structure Editors - These are editors for specific types of documents, so that the editor recognizes the
structure/syntax of the document being prepared and helps in maintaining that structure/syntax.
Editor Structure:- Most text editors have a structure similar to that shown in the following figure.
 Command language Processor accepts command, uses semantic routines – performs functions such as editing and
viewing. The semantic routines involve traveling, editing,
viewing and display functions.

 Editing operations are specified explicitly by the user and


display operations are specified implicitly by the editor.
Traveling and viewing operations may be invoked either
explicitly by the user or implicitly by the editing operations.
 In editing a document, the start of the area to be edited is
determined by the current editing pointer maintained by the
editing component. Editing component is a collection of
modules dealing with editing tasks. Current editing pointer can
be set or reset due to next paragraph, next screen, cut
paragraph, paste paragraph etc.
 When editing command is issued, editing component invokes
the editing filter – generates a new editing buffer – contains
part of the document to be edited from current editing pointer.
 Filtering and editing may be interleaved, with no explicit
editor buffer being created.
 In viewing a document, the start of the area to be viewed is determined by the current viewing pointer maintained by
the viewing component.
 Viewing component is a collection of modules responsible for determining the next view.
 Current viewing pointer can be set or reset as a result of previous editing operation.
 When display needs to be updated, viewing component invokes the viewing filter – generates a new viewing buffer –
contains part of the document to be viewed from current viewing pointer.
 In case of line editors – viewing buffer may contain the current line, Screen editors - viewing buffer contains a
rectangular cutout of the quarter plane of the text.
 Viewing buffer is then passed to the display component of the editor, which produces a display by mapping the buffer
to a rectangular subset of the screen – called a window.
 The editing and viewing buffers may be identical or may be completely disjoint. Identical – user edits the text directly
on the screen.
 Disjoint – Find and Replace (For example, there are 150 lines of text, user is in 100th line, decides to change all
occurrences of ‘text editor’ with ‘editor’).
 The editing and viewing buffers can also be partially overlap, or one may be completely contained in the other.
 Windows typically cover entire screen or a rectangular portion of it. May show different portions of the same file or
portions of different file.
 Inter-file editing operations are possible.
 The components of the editor deal with a user document on two levels: In main memory and in the disk file system.
 Loading an entire document into main memory may be infeasible – only part is loaded – demand paging is used –
uses editor paging routines.
 Documents may not be stored sequentially as a string of characters.
 Uses separate editor data structure that allows addition, deletion, and modification with a minimum of I/O and
character movement.
Q7. Short note on Interpreter.

Ans. An interpreter is a program that translates the statements of a program into machine code and executes them. It translates only one
statement of the program at a time.
Characteristics:- The main characteristics of an interpreter are-
1. It is a type of translator.
2. An interpreter translates high-level instructions into an intermediate form, which it then executes.
3. It translates one line of the program into binary code at a time.
4. It generally occupies less memory space than a compiler.
5. The interpreter reads each statement of code and then converts or executes it directly.

Working of Interpreter:- An interpreter works as follows-


o An instruction is fetched from the original source code.
o The interpreter checks the single instruction for errors. If an error is found, translation and execution cease.
o Otherwise, the instruction is translated into binary code.
o The binary coded instruction is executed.
o The fetch and execute process repeats for the entire program.

Explanation:- An interpreter reads one statement of the program, translates it and executes it. Then it reads the next
statement, translates it and executes it. It proceeds in this way until all the statements are translated
and executed. A compiler, on the other hand, goes through the entire program and translates all of it into
machine code before execution. A compiled program typically runs faster than an interpreted one, by a factor often quoted as 5 to 25.

Example:- Computer programs are either compiled or interpreted. Languages like Assembly Language, C, C++, Fortran and Pascal
are almost always translated into machine code before execution. Languages like BASIC, VBScript and JavaScript are usually interpreted.
BASIC and LISP in particular were designed to be executed by an interpreter.

Advantages :-
 easy to learn and use
 minimum programming knowledge or experience
 allows complex tasks to be performed in relatively few steps
 allows simple creation and editing in a variety of text editors
 allows the addition of dynamic and interactive activities to web pages
 edit and running of code is fast.
Disadvantages:-
 usually run quite slowly
 limited access to low level and speed optimization code.
 limited commands to run detailed operations on graphics.

Q8. What is MACRO? Discuss its Expansion in detail with the suitable example?
Ans. A macro name is an abbreviation, which stands for some related lines of code.
Purpose:- Macros are useful for the following purposes:
o To simplify and reduce the amount of repetitive coding
o To reduce errors caused by repetitive coding
o To make an assembly program more readable.

A macro consists of name, set of formal parameters and body of code. The use of macro name
with set of actual parameters is replaced by some code generated by its body. This is called macro expansion.

Macros allow a programmer to define pseudo operations, typically operations that are generally desirable, are not
implemented as part of the processor instruction, and can be implemented as a sequence of instructions. Each use
of a macro generates new program instructions, the macro has the effect of automating writing of the program.

Macros can be defined and used in many programming languages, such as C and C++.
Example macro in C programming. Macros are commonly used in C to define small snippets of code. If the macro
has parameters, they are substituted into the macro body during expansion; thus, a C macro can mimic a C
function. The usual reason for doing this is to avoid the overhead of a function call in simple cases, where the code
is lightweight enough that function call overhead has a significant impact on performance.

For instance,
#define max(a, b) ((a) > (b) ? (a) : (b))
defines the macro max, taking two arguments a and b. This macro may be called like any C function, using
identical syntax. Therefore, after preprocessing,
z = max(x, y);
becomes
z = ((x) > (y) ? (x) : (y));

While this use of macros is very important for C, for instance to define generic data types
or debugging tools, it may lead to a number of pitfalls, such as an argument being evaluated more than once.
C macros are capable of mimicking functions, creating new syntax within some limitations, as well as expanding
into arbitrary text (although the C compiler will require that text to be valid C source code, or else comments), but
they have some limitations as a programming construct. Macros which mimic functions, for instance, can be called
like real functions, but a macro cannot be passed to another function using a function pointer, since the macro itself
has no address.
In programming languages such as C or assembly language, a macro is a name that defines a set
of commands that are substituted for the macro name wherever the name appears in a program (a process called
macro expansion) when the program is compiled or assembled. Macros are similar to functions in that they can
take arguments and in that they are calls to lengthier sets of instructions. Unlike functions, macros are replaced by
the actual commands they represent when the program is prepared for execution, whereas function instructions are copied
into a program only once.
Macro Expansion:- A macro call leads to macro expansion. During macro expansion, the macro statement is
replaced by a sequence of assembly statements.
In the accompanying figure, a macro call, INITZ, is shown in the middle.
Every macro definition begins with the MACRO keyword and ends
with ENDM (end macro). Whenever a macro is called, its entire code is substituted into the program at the point where it is
called. The resulting code is shown on the rightmost side of the figure. Macro calls in high level
programming languages work the same way.

(C programming)
#define max(a,b) ((a)>(b)?(a):(b))
int main(void) {
int x, y, z;
x=4; y=6;
z = max(x, y);
return 0; }

The above program is written using C programming statements. It defines the macro max, taking two
arguments a and b. This macro may be called like any C function, using identical syntax. Therefore, after
preprocessing,

z = max(x, y); becomes z = ((x)>(y)?(x):(y));

After macro expansion, the whole code would appear like this (the #define line itself is consumed by the preprocessor).
int main(void)
{ int x, y, z;
x=4; y=6; z = ((x)>(y)?(x):(y));
return 0; }

Q9. Describe Macro Processor.

Ans. A macro processor is a program that lets you define code that is reused many times, give it a specific
macro name, and reuse the code just by writing the macro name.
Generally it does not come as a separate program but is bundled with either
an assembler or a compiler.
Note : Please don't confuse Macro Processor with Micro Processor. A microprocessor is a hardware
device and a completely different area of study.

There are three main steps in using a macro


1.Define the macro name
2.Give its definition
3.Use the macro name anywhere within the program to use its definition (this step is called a macro call)
The most important point is that macro processing is done before compilation: it replaces the macro calls with the
corresponding code and deletes the macro definitions. If you are on an open source OS, this can easily be tested. For
others, this is how it happens. This is my helloworld program

#include<stdio.h>
#define hell printf("\nHello world\n");printf("I am a good boy\n\n");

int main()
{
hell;
}

The gcc compiler has a lot of options, and if you run gcc helloworld.c -E , it prints the code as it stands after
preprocessing, before it is passed to the compiler. The output includes a lot of other things along with the code, but since our main
discussion is about macro processing, let's see what the macro processed code looks like

# 916 "/usr/include/stdio.h" 3 4

# 2 "helloworld.c" 2

int main()
{
printf("\nHello world\n");printf("I am a good boy\n\n");;
}

You can clearly see that macro processing is already done, replacing hell with the corresponding macro
definition. This is passed on to the compiler, which does the remaining job. This is the output of the compiled program

Hello world
I am a good boy

There are many more uses for macros. Don't confuse functions with macros even though both are used for a similar
purpose. If anyone asks you the difference, it is the one shown above: a function gets compiled,
but a macro gets expanded by the macro processor before compilation.

One important use of these macros is for booting. During boot time, the system uses a lot of macros to get itself
started, since they need not be compiled to be used; macro processing alone is enough, which is a small task.

Q10. What is Bootstrap Loader?

Ans. When a computer is first turned on or restarted, a special type of absolute loader, called a bootstrap loader, is executed.
The bootstrap loader loads the first program to be run by the computer, usually an operating system.
 Characteristics:- A few characteristics of the bootstrap loader are as follows-
o It is a type of loader.
o It is a very small and simple loader in present-day computer architectures.
o The bootstrap loader is normally stored in ROM.
o Being in ROM, it cannot be erased by ordinary programs.

 Example (SIC bootstrap loader):- The way an OS is loaded at boot time shows the work of the bootstrap loader:
o The bootstrap loader itself begins at address 0.
o It loads the OS starting at address 0x80.
o There is no header record or control information; the object code is loaded into consecutive bytes of memory.

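
The loading step can be simulated in C. This is only a sketch under stated assumptions: memory is an ordinary byte array, and the object code arrives as a string of uppercase hex digits, two digits per byte (the input format is an assumption of this sketch; the notes above only say the bytes are consecutive). The names bootstrap_load and LOAD_ADDR are hypothetical.

```c
#include <stddef.h>

#define LOAD_ADDR 0x80u   /* the OS is placed starting at this address */

/* Converts one uppercase hex digit to its value, or returns -1 otherwise. */
static int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Packs pairs of hex digits from `input` into `memory`, starting at
   LOAD_ADDR. No header or control records are expected. Returns the
   number of bytes loaded. */
static size_t bootstrap_load(const char *input, unsigned char *memory,
                             size_t mem_size) {
    size_t addr = LOAD_ADDR;
    int high = -1;                       /* pending high nibble, if any */
    for (const char *p = input; *p != '\0'; p++) {
        int v = hex_value(*p);
        if (v < 0) continue;             /* skip newlines and separators */
        if (high < 0) { high = v; continue; }
        if (addr >= mem_size) break;     /* simulated memory is full */
        memory[addr++] = (unsigned char)((high << 4) | v);
        high = -1;
    }
    return addr - LOAD_ADDR;
}
```

A real bootstrap loader would finish by jumping to address 0x80 to start the program it has just loaded; the simulation stops after filling memory.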
Q11. What is BNF?

Ans. In computer science, Backus–Naur form or Backus normal form (BNF) is a notation technique for context-free grammars,
often used to describe the syntax of languages used in computing, such as computer programming languages, document
formats, instruction sets and communication protocols.
In other words BNF is a standard notation for expressing syntax as a set of grammar rules.

 Characteristics:- There are few characteristics of BNF as follows-


o It is a formal, mathematical way to describe a language.
o BNF was developed by John Backus and Peter Naur.
o It was first used to describe ALGOL 60.
o BNF is a metalanguage, that is, a language used to describe another language, such as a
programming language.
o It is precise and unambiguous.

 Components of BNF:- BNF grammar consists of 4 parts,


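o The set of tokens (terminal symbols)
o The set of non-terminal symbols
o The start symbol
o The set of productions

 Syntax:- Each production of a BNF grammar has the following form (standard BNF writes ::= in place of =)
LHS = RHS
Non-terminal = sequence of terminals and/or non-terminals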

 Example:-
A := B*(A+C)

Solution -->
Assign -> id := expr
A := expr
A := id*expr
A := B*(expr)
A := B*(id + expr)
A := B*(A + expr)
A := B*(A + id)
A := B*(A + C)
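
The derivation above does not state its grammar. It is consistent with the following rules, which are an assumption of this sketch: assign -> id := expr, expr -> id + expr | id * expr | ( expr ) | id, and id -> A | B | C. Under that assumption, a small recursive-descent recognizer in C has one function per non-terminal (all function names are hypothetical):

```c
#include <stdbool.h>

static const char *cursor;   /* next unread character of the input */

static void skip_spaces(void) {
    while (*cursor == ' ') cursor++;
}

/* id -> A | B | C */
static bool parse_id(void) {
    skip_spaces();
    if (*cursor == 'A' || *cursor == 'B' || *cursor == 'C') {
        cursor++;
        return true;
    }
    return false;
}

/* expr -> id + expr | id * expr | ( expr ) | id */
static bool parse_expr(void) {
    skip_spaces();
    if (*cursor == '(') {
        cursor++;
        if (!parse_expr()) return false;
        skip_spaces();
        if (*cursor != ')') return false;
        cursor++;
        return true;
    }
    if (!parse_id()) return false;
    skip_spaces();
    if (*cursor == '+' || *cursor == '*') {
        cursor++;
        return parse_expr();
    }
    return true;
}

/* assign -> id := expr ; returns true only if the whole input matches. */
static bool parse_assign(const char *input) {
    cursor = input;
    if (!parse_id()) return false;
    skip_spaces();
    if (cursor[0] != ':' || cursor[1] != '=') return false;
    cursor += 2;
    if (!parse_expr()) return false;
    skip_spaces();
    return *cursor == '\0';
}
```

Calling parse_assign("A := B*(A+C)") accepts the string by following the same sequence of rule choices as the derivation, while an incomplete input such as "A := B*(A+" is rejected.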

Q12. What is Ambiguity in Grammar?

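Ans. A grammar G is called ambiguous when G has more than one derivation tree (parse tree) for some string.
Equivalently, some string generated by the grammar has more than one leftmost derivation, or more than one rightmost derivation.

 Problem:- Check whether the grammar G with production rules −
X → X+X | X*X | X | a
is ambiguous or not.

 Solution:- Let's find the derivation trees for the string "a+a*a". It has two leftmost derivations.
Derivation 1 − X → X+X → a+X → a+X*X → a+a*X → a+a*a
Parse tree 1 − the root expands to X+X; the left X yields a and the right X expands to X*X, so the string is grouped as a+(a*a).
Derivation 2 − X → X*X → X+X*X → a+X*X → a+a*X → a+a*a
Parse tree 2 − the root expands to X*X; the left X expands to X+X and the right X yields a, so the string is grouped as (a+a)*a.
Since the same string has two different parse trees, G is ambiguous.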
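
The ambiguity can also be checked mechanically by counting parse trees. The sketch below tries every operator as the top-level split of X → X+X or X → X*X and multiplies the counts for the two sides. It leaves out the unit rule X → X, which would otherwise add infinitely many trees; that simplification and the function names are assumptions of this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Counts parse trees for s[lo..hi) under X -> X+X | X*X | a. */
static unsigned long count_trees(const char *s, size_t lo, size_t hi) {
    if (hi - lo == 1) return s[lo] == 'a' ? 1 : 0;
    unsigned long total = 0;
    /* Try each interior operator as the root of the tree. */
    for (size_t k = lo + 1; k + 1 < hi; k++) {
        if (s[k] == '+' || s[k] == '*')
            total += count_trees(s, lo, k) * count_trees(s, k + 1, hi);
    }
    return total;
}

/* Convenience wrapper for a whole string. */
static unsigned long count_parse_trees(const char *s) {
    return count_trees(s, 0, strlen(s));
}
```

For "a+a*a" the count is 2, matching the two trees found by hand; any count above 1 for some string means the grammar is ambiguous.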

Q13. Difference between Syntax and Semantics.

Ans.
Syntax
It is the grammar of the language: the rules for constructing well-formed sentences. If a program
contains syntax errors, it will not pass compilation. For example:
int multiply_numbers = a b *; // syntax error in C
Semantics
It is the meaning of the sentences of the language. If a program contains only semantic errors, it may
pass compilation, but it does not do what it was meant to do. For example:

 int multiply_numbers = a / b; // semantic error: divides instead of multiplying

 If an expression mixes int and double operands, what type should the result have? Questions like this are semantic questions.
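
A short C illustration of the int/double question: both functions below are syntactically valid, but they mean different things. The function names are hypothetical.

```c
/* Both operands are int, so the division is done in int and truncates
   (7 / 2 gives 3); only the result is then converted to double. */
static double int_division(int a, int b) {
    return a / b;
}

/* Casting one operand to double makes the other operand convert to
   double as well, so 7 / 2 gives 3.5. */
static double real_division(int a, int b) {
    return (double)a / b;
}
```

The compiler accepts both; which one is correct depends on what the programmer meant, which is exactly a semantic matter.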

 Comparison Chart between Semantics and Syntax:

Subject: Definition
   Semantics: Semantics is a term derived from the Greek word sema, meaning sign. It is an important field of
   theoretical linguistics and is all about studying the meaning of linguistic expressions.
   Syntax: Syntax is a term derived from Ancient Greek σύνταξις "arrangement", from σύν syn, "together", and
   τάξις táxis, "an ordering". It is the study of how words are combined to form grammatical sentences.

Subject: Related to
   Semantics: Meanings of words and sentences
   Syntax: The structure of words

Subject: Rules
   Semantics: Describe the relationship between symbols and the things they mean or refer to
   Syntax: Describe the correct word order and inflectional structure in sentences

Subject: Main aspect
   Semantics: Relation between form and meaning
   Syntax: Word order

Subject: Approach towards meaning of a sentence
   Semantics: Individual's own interpretation on the basis of previous knowledge
   Syntax: Linguistically and grammatically correct

Q14. Discuss Phases of Compiler Design.

Ans. The structure of compiler consists of two parts:

Analysis part:-
• The analysis part breaks the source program into constituent pieces and imposes a grammatical structure on them. It then
uses this structure to create an intermediate representation of the source program.
• It is also termed as front end of compiler.
• Information about the source program is collected and stored in a data structure called symbol table.
Synthesis part:-
• Synthesis part takes the intermediate representation as input and transforms it to the target program.
• It is also termed as back end of compiler.

The design of compiler can be decomposed into several phases, each of which converts one form of source program into
another.
The different phases of compiler are as follows:
1. Lexical analysis
2. Syntax analysis
3. Semantic analysis
4. Intermediate code generation
5. Code optimization
6. Code generation

All of the aforementioned phases involve the following tasks:

• Symbol table management.


• Error handling.

Lexical Analysis
• Lexical analysis is the first phase of a compiler; it is also called scanning.
• The source program is scanned as a stream of characters, and those characters are grouped into sequences called
lexemes. For each lexeme, the lexical analyzer produces a token as output.
• Token: A token represents a lexical unit, such as a keyword, operator or identifier, whose lexemes match a given pattern.
• Lexeme: A lexeme is an instance of a token, i.e. the group of characters in the source program that forms the token.
• Pattern: A pattern describes the rule that the lexemes of a token follow. It is the structure that must be matched by strings.
• When an identifier token is generated, the corresponding entry is made in the symbol table.
Input: stream of characters
Output: Token
Token Template: <token-name, attribute-value>
(e.g.) c=a+b*5;
Lexemes and tokens

Lexemes Tokens
c identifier
= assignment symbol
a identifier
+ + (addition symbol)
b identifier
* * (multiplication symbol)
5 5 (number)

Hence, <id, 1> <=> <id, 2> <+> <id, 3> <*> <5>
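
The grouping of characters into lexemes can be sketched as a small scanner in C. It handles only what the example statement needs (identifiers, integers and the symbols = + * ;). The Token layout and all names are assumptions of this sketch, not a fixed design.

```c
#include <ctype.h>
#include <stddef.h>

typedef enum {
    TOK_ID, TOK_NUM, TOK_ASSIGN, TOK_PLUS, TOK_STAR, TOK_SEMI, TOK_ERROR
} TokenKind;

typedef struct {
    TokenKind kind;
    char lexeme[16];   /* the matched characters, NUL-terminated */
} Token;

/* Splits `src` into tokens, writing at most `max` of them to `out`.
   Returns the number of tokens produced. */
static size_t tokenize(const char *src, Token *out, size_t max) {
    size_t n = 0;
    const char *p = src;
    while (*p != '\0' && n < max) {
        if (isspace((unsigned char)*p)) { p++; continue; }
        Token *t = &out[n++];
        size_t len = 0;
        if (isalpha((unsigned char)*p)) {          /* pattern: letter (letter|digit)* */
            t->kind = TOK_ID;
            while (isalnum((unsigned char)*p)) {
                if (len < sizeof t->lexeme - 1) t->lexeme[len++] = *p;
                p++;
            }
        } else if (isdigit((unsigned char)*p)) {   /* pattern: digit+ */
            t->kind = TOK_NUM;
            while (isdigit((unsigned char)*p)) {
                if (len < sizeof t->lexeme - 1) t->lexeme[len++] = *p;
                p++;
            }
        } else {                                   /* single-character symbols */
            switch (*p) {
            case '=': t->kind = TOK_ASSIGN; break;
            case '+': t->kind = TOK_PLUS;   break;
            case '*': t->kind = TOK_STAR;   break;
            case ';': t->kind = TOK_SEMI;   break;
            default:  t->kind = TOK_ERROR;  break; /* lexical error */
            }
            t->lexeme[len++] = *p++;
        }
        t->lexeme[len] = '\0';
    }
    return n;
}
```

For the input c=a+b*5; the scanner produces eight tokens: the seven lexemes listed in the table plus the closing semicolon.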

Syntax Analysis
• Syntax analysis is the second phase of a compiler; it is also called parsing.
• The parser converts the tokens produced by the lexical analyzer into a tree-like
representation called a parse tree.
• A parse tree describes the syntactic structure of the input.
• A syntax tree is a compressed representation of the parse tree in which the operators
appear as interior nodes and the operands of an operator are the children of the
node for that operator.
Input: Tokens
Output: Syntax tree
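
A syntax tree for the running example can be built with a small recursive-descent parser. The sketch below assumes single-character operands, no spaces, and the usual precedence of * over +; the Node type, the fixed-size node pool and all function names are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

typedef struct Node {
    char label;                  /* operator character, or the operand itself */
    struct Node *left, *right;   /* both NULL for a leaf (operand) */
} Node;

static Node pool[64];            /* simple fixed pool instead of malloc */
static size_t used;
static const char *cur;          /* next unread input character */

static Node *make(char label, Node *l, Node *r) {
    if (used == 64) return NULL;
    Node *n = &pool[used++];
    n->label = label; n->left = l; n->right = r;
    return n;
}

/* factor -> identifier | number (one character each) */
static Node *factor(void) {
    if (isalnum((unsigned char)*cur)) return make(*cur++, NULL, NULL);
    return NULL;
}

/* term -> factor { * factor } */
static Node *term(void) {
    Node *left = factor();
    while (left && *cur == '*') {
        cur++;
        Node *right = factor();
        left = right ? make('*', left, right) : NULL;
    }
    return left;
}

/* expr -> term { + term } */
static Node *expr(void) {
    Node *left = term();
    while (left && *cur == '+') {
        cur++;
        Node *right = term();
        left = right ? make('+', left, right) : NULL;
    }
    return left;
}

/* Parses "id=expr" with an optional trailing ';' and returns the root
   '=' node, or NULL on a syntax error. */
static Node *parse_assignment(const char *src) {
    used = 0;
    cur = src;
    if (!isalpha((unsigned char)*cur)) return NULL;
    Node *target = make(*cur++, NULL, NULL);
    if (*cur != '=') return NULL;
    cur++;
    Node *value = expr();
    if (*cur == ';') cur++;
    if (!value || *cur != '\0') return NULL;
    return make('=', target, value);
}
```

For c=a+b*5; the root is =, its left child is c, and its right child is a + node whose right child is the * node over b and 5, so the operators sit at interior nodes and the operands at the leaves.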

Semantic Analysis
• Semantic analysis is the third phase of compiler.
• It checks the source program for semantic consistency.
• Type information is gathered and stored in symbol table or in syntax tree.
• Performs type checking.

Intermediate Code Generation


• Intermediate code generation produces intermediate representations for the source
program which are of the following forms:
o Postfix notation
o Three address code
o Syntax tree
Most commonly used form is the three address code.
t1 = inttofloat (5)
t2 = id3 * t1
t3 = id2 + t2
id1 = t3
Properties of intermediate code

• It should be easy to produce.


• It should be easy to translate into target program.
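
Two of the forms listed above are closely related: three-address code can be produced from postfix notation with a stack. The sketch below assumes single-character operands and only the operators + and *, and it does not insert type conversions such as inttofloat; the function name postfix_to_tac is hypothetical.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Translates a postfix expression into three-address code written to
   `out`. Returns the number of temporaries used, or -1 on bad input. */
static int postfix_to_tac(const char *postfix, char *out, size_t out_size) {
    char stack[32][8];           /* operand names: one character or "tN" */
    int top = 0, temps = 0;
    size_t len = 0;
    if (out_size == 0) return -1;
    out[0] = '\0';
    for (const char *p = postfix; *p != '\0'; p++) {
        if (isalnum((unsigned char)*p)) {
            if (top == 32) return -1;
            stack[top][0] = *p;
            stack[top][1] = '\0';
            top++;
        } else if (*p == '+' || *p == '*') {
            if (top < 2) return -1;
            char name[8];
            snprintf(name, sizeof name, "t%d", ++temps);
            /* Each operator yields one instruction with a fresh temporary. */
            int w = snprintf(out + len, out_size - len, "%s = %s %c %s\n",
                             name, stack[top - 2], *p, stack[top - 1]);
            if (w < 0 || (size_t)w >= out_size - len) return -1;
            len += (size_t)w;
            top -= 2;
            strcpy(stack[top++], name);
        } else {
            return -1;
        }
    }
    return top == 1 ? temps : -1;
}
```

For the postfix string ab5*+ (the right-hand side a + b * 5), this produces t1 = b * 5 followed by t2 = a + t1, the same shape as the listing above without the conversion step.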
Code Optimization
• Code optimization phase gets the intermediate code as input and produces optimized intermediate code as output.
• It results in faster running machine code.
• One way to do this is to reduce the number of instructions in the intermediate code.
• This phase reduces the redundant code and attempts to improve the intermediate code so that faster-running machine code
will result.
• During the code optimization, the result of the program is not affected.
• To improve the generated code, optimization involves
o Detection and removal of dead code (unreachable code).
o Calculation of constants in expressions and terms.
o Collapsing of repeated expressions into a temporary variable.
o Loop unrolling.
o Moving loop-invariant code outside the loop.
o Removal of unwanted temporary variables.
t1 = id3 * 5.0
id1 = id2 + t1
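
One of the optimizations listed above, calculation of constants, can be sketched on an expression tree: whenever both operands of an operator are constants, the node is replaced by the computed value. The Expr type and the function name are assumptions of this sketch.

```c
#include <stddef.h>

typedef enum { EXPR_CONST, EXPR_VAR, EXPR_ADD, EXPR_MUL } ExprKind;

typedef struct Expr {
    ExprKind kind;
    double value;                /* used when kind == EXPR_CONST */
    char name;                   /* used when kind == EXPR_VAR */
    struct Expr *left, *right;   /* used by EXPR_ADD and EXPR_MUL */
} Expr;

/* Folds constant subexpressions in place, bottom-up. The value the
   expression computes is unchanged; only the tree gets smaller. */
static void fold_constants(Expr *e) {
    if (e == NULL || e->kind == EXPR_CONST || e->kind == EXPR_VAR) return;
    fold_constants(e->left);
    fold_constants(e->right);
    if (e->left && e->right &&
        e->left->kind == EXPR_CONST && e->right->kind == EXPR_CONST) {
        double l = e->left->value, r = e->right->value;
        e->value = (e->kind == EXPR_ADD) ? l + r : l * r;
        e->kind = EXPR_CONST;
        e->left = e->right = NULL;
    }
}
```

Given the tree for x + (2 * 3), the multiplication node becomes the constant 6 while the addition stays, because x is not known at compile time.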

Code Generation
• Code generation is the final phase of a compiler.
• It gets input from code optimization phase and produces the target code or object code as result.
• Intermediate instructions are translated into a sequence of machine instructions that perform the same task.
• The code generation involves
o Allocation of register and memory.
o Generation of correct references.
o Generation of correct data types.
o Generation of missing code.
LDF R2, id3
MULF R2, # 5.0
LDF R1, id2
ADDF R1, R2
STF id1, R1
Symbol Table Management

• Symbol table is used to store all the information about identifiers used in the program.
• It is a data structure containing a record for each identifier, with fields for the attributes of the identifier.
• It allows finding the record for each identifier quickly and to store or retrieve data from that record.
• Whenever an identifier is detected in any of the phases, it is stored in the symbol table.
Example
int a, b; float c; char z;

Symbol name    Type     Address
a              int      1000
b              int      1002
c              float    1004
z              char     1008

Example
extern double test (double x);
double sample (int count)
{
    double sum = 0.0;
    for (int i = 1; i <= count; i++)
        sum += test((double) i);
    return sum;
}

Symbol name    Type                Scope
test           function, double    extern
x              double              function parameter
sample         function, double    global
count          int                 function parameter
sum            double              block local
i              int                 for-loop statement
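
A symbol table with insert and lookup operations can be sketched in C as follows. The record layout (name, type, address) follows the first example table; the fixed capacity, the linear search and all names are simplifying assumptions. A real compiler would typically use a hash table so that lookups stay fast.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SYMBOLS 64

typedef struct {
    char name[32];
    char type[16];
    unsigned address;
} Symbol;

typedef struct {
    Symbol entries[MAX_SYMBOLS];
    size_t count;
} SymbolTable;

/* Returns the record for `name`, or NULL if it has not been declared. */
static Symbol *symtab_lookup(SymbolTable *t, const char *name) {
    for (size_t i = 0; i < t->count; i++)
        if (strcmp(t->entries[i].name, name) == 0) return &t->entries[i];
    return NULL;
}

/* Adds a new identifier. Returns NULL if the table is full, a string is
   too long, or the name is already declared (multiple declaration). */
static Symbol *symtab_insert(SymbolTable *t, const char *name,
                             const char *type, unsigned address) {
    if (t->count == MAX_SYMBOLS || symtab_lookup(t, name) != NULL) return NULL;
    if (strlen(name) >= sizeof t->entries[0].name ||
        strlen(type) >= sizeof t->entries[0].type) return NULL;
    Symbol *s = &t->entries[t->count++];
    strcpy(s->name, name);
    strcpy(s->type, type);
    s->address = address;
    return s;
}
```

Inserting a, b, c and z with the types and addresses from the first table and then looking up c returns the record with type float and address 1004; inserting a a second time fails, which is how a multiple-declaration error would be detected.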

Error Handling
• Each phase can encounter errors. After detecting an error, a phase must handle
the error so that compilation can proceed.
• In lexical analysis, errors occur while separating the input into tokens.
• In syntax analysis, errors occur during construction of the syntax tree.
• In semantic analysis, errors may occur in the following cases:
(i) When the compiler detects constructs that have the right syntactic structure but no
meaning.
(ii) During type conversion.
• In code optimization, an error occurs if the optimization changes the result of the program.
• In code generation, errors are reported when, for example, required code is missing.
The figure illustrates the translation of source code through each phase, considering
the statement
c = a + b * 5.

Error Encountered in Different Phases


As noted above, each phase must deal with the errors it detects so that compilation can proceed.
A program may have the following kinds of errors at various stages:
Lexical Errors
These include an incorrect or misspelled name of an identifier, i.e. identifiers typed
incorrectly.
Syntactical Errors
These include a missing semicolon or unbalanced parentheses. Syntactic errors are
handled by the syntax analyzer (parser).
When an error is detected, it must be handled by parser to enable the parsing of
the rest of the input. In general, errors may be expected at various stages of
compilation but most of the errors are syntactic errors and hence the parser
should be able to detect and report those errors in the program.
The goals of error handler in parser are:
• Report the presence of errors clearly and accurately.
• Recover from each error quickly enough to detect subsequent errors.
• Add minimal overhead to the processing of correct programs.
There are four common error-recovery strategies that can be implemented in the parser to deal with errors in the code.
o Panic mode.
o Statement level.
o Error productions.
o Global correction.
Semantical Errors
These errors are a result of incompatible value assignment. The semantic errors that the semantic analyzer is expected to
recognize are:
• Type mismatch.
• Undeclared variable.
• Reserved identifier misuse.
• Multiple declaration of variable in a scope.
• Accessing an out of scope variable.
• Actual and formal parameter mismatch.
Logical errors
These errors come from faulty program logic, such as unreachable code or an infinite loop.
Q15. What are the data structures used by the macro processor?

Q16.Define Macro and Macro Expanding.
