System Programming Notes
Ans. A linker is system software that combines two or more object modules to form an executable program.
Usually a longer program is divided into smaller subprograms called modules, and these modules must
be combined before the program can be executed. This combining is done by the linker.
For example, when a C program calls "printf()", the compiler first searches for the declaration of the function, finds it in the header named by "#include", and creates an object (.obj) file successfully.
The (.obj) file contains a symbol table that lists all the references still to be
resolved. The linker resolves them by supplying the corresponding code or resource; here the code referred to by "printf" is linked into the (.exe) file
and runs when the program is executed.
1. Static Linking:- In static linking all the modules that are required to complete a program are physically
placed together to generate an executable program file. The file can then be "loaded" at any subsequent time
to run the program.
Static linking is the result of the linker copying all library routines used in the program into the executable
image. This may require more disk space and memory than dynamic linking, but the program is generally faster to start and more portable, since it does
not require the presence of the library on the system where it is run.
Advantages:- The advantage of static linking is that you can create self-contained, independent
programs. In other words, the executable program consists of one part (the .EXE file) that you need to keep track of.
Disadvantages:-
o You cannot change the behavior of executable files without relinking them.
o Externally called programs cannot be shared, so duplicate copies of those
programs must be loaded in memory if more than one calling program needs to access
them.
2. Dynamic Linking:- In dynamic linking the actual task of linking is performed just prior to running the program,
with the individual modules already in memory. This approach has several advantages over static
linking. It provides a powerful and high-performance way to extend the function of programs. MS Windows
makes extensive use of DLLs (Dynamic-Link Libraries).
Dynamic linking is accomplished by placing the name of a sharable library in the executable image.
Actual linking with the library routines does not occur until the image is run, when both the executable and the library are
placed in memory. An advantage of dynamic linking is that multiple programs can share a single copy of the library.
Advantages:-
o Often-used libraries (for example the standard system libraries) need to be stored in
only one location, not duplicated in every single binary.
o If an error in a library function is corrected by replacing the library, all programs using
it dynamically will benefit from the correction after restarting them. Programs that
included this function by static linking would have to be re-linked first.
Disadvantages:-
o Known on the Windows platform as "DLL Hell", an incompatible updated DLL will
break executables that depended on the behavior of the previous DLL.
o A program, together with the libraries it uses, might be certified (e.g. as to correctness,
documentation requirements, or performance) as a package, but not if components
can be replaced.
Allocation:
• Allocates the space in the memory where the object
program would be loaded for Execution.
• It allocates the space for program in the memory, by
calculating the size of the program. This activity is
called allocation.
• In an absolute loader, allocation is done by the
programmer, so it is the duty of the programmer to ensure that the programs do not overlap.
• In a relocatable loader, allocation is done by the loader, so the assembler must supply the loader with the size of the program.
Linking:
• It links two or more object codes and provides the information needed to allow references between them.
• It resolves the symbolic references (code/data) between the object modules by assigning all the user subroutine and library
subroutine addresses. This activity is called linking.
• In an absolute loader, linking is done by the programmer, as the programmer is aware of the runtime addresses of the symbols.
• In a relocatable loader, linking is done by the loader, so the assembler must supply the loader with the locations at which
linking is to be done.
Relocation:
• It modifies the object program by changing certain instructions so that it can be loaded at an address different from the location
originally specified.
• Some locations in the program are address dependent. Such address constants must be adjusted according to the
allocated space; this activity, done by the loader, is called relocation.
• In absolute Loader relocation is done by the assembler as the assembler is aware of the starting address of the program.
• In relocatable loader, relocation is done by the loader and hence assembler must supply to the loader the location at which
relocation is to be done.
Loading:
• It brings the object program into the memory for execution.
• Finally it places all the machine instructions and data of corresponding programs and subroutines into the memory. Thus
program now becomes ready for execution, this activity is called loading.
• In both the loaders (absolute, relocatable) Loading is done by the loader and hence the assembler must supply to the loader
the object program.
2.GENERAL LOADER SCHEME:- In the “Compile-and-Go” scheme the instructions and data are placed in memory as they are assembled,
and the assembler itself stays in main memory, which wastes memory.
o To overcome this, a new program is added to the system: a loader.
o Generally the loader is smaller than the assembler.
o The loader accepts the assembled machine instructions, data and other information present in the object format and
places the machine instructions and data in core in an executable form.
o Reassembly is no longer necessary to run the program at a later date.
ADVANTAGE:- In this scheme the source program translators produce compatible object program deck formats
and it is possible to write subroutines in several different languages since the object decks to be processed by the loader will
all be in the same “language” that is in “machine language”.
3.ABSOLUTE LOADERS:- In this scheme the assembler outputs the machine language translation of the source program in
almost the same form as in the “Compile-and-Go” scheme, except that the data is punched on cards instead of being placed directly in
memory.
The loader in turn simply accepts the machine language text and places it into core at the location prescribed by the
assembler.
DISADVANTAGES:-
The programmer must specify to the assembler the address where the program is to be loaded.
If there are multiple subroutines , the programmer must remember the address of each.
4.RELOCATING LOADERS:- Relocating loaders were introduced to avoid reassembling all subroutines when a single subroutine is changed, and to
perform the tasks of allocation and linking for the programmer.
Q3. Difference between Linker and Loader.
Ans. The linker and the loader are utility programs that play a major role in the execution of a program. The
source code of a program passes through the compiler, assembler, linker and loader, in that order, before
execution. The linker takes the object code generated by the assembler and
combines it to generate the executable module. The loader then loads this executable
module into main memory for execution. The differences between the linker and the loader are summarized in the
comparison chart below.
Comparison Chart:-
BASIS FOR COMPARISON | LINKER | LOADER
Basic | It generates the executable module of a source program. | It loads the executable module to the main memory.
Input | It takes as input the object code generated by an assembler. | It takes the executable module generated by a linker.
Function | It combines all the object modules of a source code to generate an executable module. | It allocates the addresses to an executable module in main memory for execution.
Type/Approach | Linkage Editor, Dynamic linker. | Absolute loading, Relocatable loading and Dynamic Run-time loading.
Ans. The difference between one pass and two pass assemblers is basically in the name. A one pass assembler passes over
the source file exactly once, collecting the labels, resolving forward references and doing the actual assembly in that same pass.
A two pass assembler makes two passes over the source file (the second pass can be over a file generated in the first pass):
the first pass collects the labels and the second pass does the actual assembly.
Ans. An essential function of a compiler is to record the identifiers used in the source program and collect information about
the various attributes of each identifier.
A symbol table is a data structure containing a record for each identifier, with fields for the attributes of the identifier.
The data structure allows us to find the record for each identifier
quickly and to store or retrieve data from that record quickly. When the lexical analyzer detects an identifier in the source program,
the identifier is entered into the symbol table.
Information Provided By Symbol Table:- A symbol table is designed to answer the following
questions:-
o Given an identifier, which name is it?
o What information is to be associated with a name?
o How do we access this information?
Format:- A symbol table is simply a table which can be either linear or a hash table. It maintains an entry for
each name in the following format:
lookup(symbol)
This method returns 0 (zero) if the symbol does not exist in the symbol table. If the symbol exists in the symbol table, it returns
its attributes stored in the table.
Q6. Short note on Editor.
Ans. A text editor is a tool that allows a user to create and revise documents in a computer.
Though this task can be carried out in other modes, the word text editor commonly refers to the tool that does this
interactively.
Earlier computer documents used to be primarily plain text documents, but nowadays due to improved input-output
mechanisms and file formats, a document frequently contains pictures along with texts whose appearance (script,
size, color and style) can be varied within the document.
Apart from producing output of such wide variety, text editors today provide many advanced features of
interactiveness and output.
Types of Text Editors:- Depending on how editing is performed, and the type of output that can be generated,
editors can be broadly classified as -
1. Line Editors - During original creation lines of text are recognized and delimited by end-of-line markers, and during
subsequent revision, the line must be explicitly specified by line number or by some pattern context. eg. edlin editor in
early MS-DOS systems.
2. Stream Editors - The idea here is similar to line editor, but the entire text is treated as a single stream of
characters. Hence the location for revision cannot be specified using line numbers. Locations for revision are either
specified by explicit positioning or by using pattern context. eg. sed in Unix/Linux.
Line editors and stream editors are suitable for text-only documents.
3. Screen Editors - These allow the document to be viewed and operated upon as a two dimensional plane, of which
a portion may be displayed at a time. Any portion may be specified for display and location for revision can be
specified anywhere within the displayed portion. eg. vi, emacs, etc.
4. Word Processors - Provides additional features to basic screen editors. Usually support non-textual contents and
choice of fonts, style, etc.
5. Structure Editors - These are editors for specific types of documents, so that the editor recognizes the
structure/syntax of the document being prepared and helps in maintaining that structure/syntax.
Editor Structure:- Most text editors have a structure similar to that shown in the following figure.
The command language processor accepts commands and invokes the semantic routines, which perform functions such as editing and
viewing. The semantic routines cover the traveling, editing,
viewing and display functions.
Ans. An interpreter is a program which translates the statements of a program and executes them. It handles only one
statement of the program at a time.
Characteristics:- An interpreter has the following characteristics-
1. It is a type of translator.
2. An interpreter translates high-level instructions into an intermediate form, which it then executes.
3. It translates one line of the program at a time.
4. It occupies less memory space than a compiler.
5. The interpreter reads each statement of code and then converts or executes it directly.
Explanation:- An interpreter reads one statement of the program, translates it and executes it. Then it reads the next
statement of the program, translates it and executes it. It proceeds in this way until all the statements have been translated
and executed. A compiler, on the other hand, goes through the entire program and translates the whole of it into
machine code. A compiled program usually runs considerably faster than the same program under an interpreter; a factor of 5 to 25 is often quoted.
Example:- Computer programs are either compiled or interpreted. Languages like C, C++, Fortran and Pascal
are almost always compiled into machine code (assembly language is translated by an assembler). Languages like BASIC, VBScript and JavaScript were usually interpreted.
BASIC and LISP in particular were designed to be executed by an interpreter.
Advantages :-
easy to learn and use
minimum programming knowledge or experience
allows complex tasks to be performed in relatively few steps
allows simple creation and editing in a variety of text editors
allows the addition of dynamic and interactive activities to web pages
edit and running of code is fast.
Disadvantages:-
usually run quite slowly
limited access to low level and speed optimization code.
limited commands to run detailed operations on graphics.
Q8. What is MACRO? Discuss its Expansion in detail with the suitable example?
Ans. A macro name is an abbreviation, which stands for some related lines of code.
Purpose:- Macros are useful for the following purposes:
o To simplify and reduce the amount of repetitive coding
o To reduce errors caused by repetitive coding
o To make an assembly program more readable.
A macro consists of name, set of formal parameters and body of code. The use of macro name
with set of actual parameters is replaced by some code generated by its body. This is called macro expansion.
Macros allow a programmer to define pseudo operations, typically operations that are generally desirable, are not
implemented as part of the processor instruction, and can be implemented as a sequence of instructions. Each use
of a macro generates new program instructions, the macro has the effect of automating writing of the program.
Macros can be defined and used in many programming languages, such as C and C++.
Example macro in C programming. Macros are commonly used in C to define small snippets of code. If the macro
has parameters, they are substituted into the macro body during expansion; thus, a C macro can mimic a C
function. The usual reason for doing this is to avoid the overhead of a function call in simple cases, where the code
is lightweight enough that function call overhead has a significant impact on performance.
For instance,
#define max(a, b) a>b? a: b
defines the macro max, taking two arguments a and b. (There must be no space between the macro name and the opening parenthesis; otherwise the parameter list is treated as part of the macro body.) This macro may be called like any C function, using
identical syntax. Therefore, after preprocessing,
z = max(x, y);
becomes z = x>y? x: y;
While this use of macros is very important for C, for instance to define type-safe generic data types
or debugging tools, it is also error-prone and may lead to a number of pitfalls, such as unexpected operator precedence or arguments being evaluated more than once.
C macros are capable of mimicking functions, creating new syntax within some limitations, as well as expanding
into arbitrary text (although the C compiler will require that text to be valid C source code, or else comments), but
they have some limitations as a programming construct. Macros which mimic functions, for instance, can be called
like real functions, but a macro cannot be passed to another function using a function pointer, since the macro itself
has no address.
In programming languages such as C or assembly language, a macro is a name that defines a set
of commands that are substituted for the macro name wherever the name appears in a program (a process called
macro expansion) when the program is compiled or assembled. Macros are similar to functions in that they can
take arguments and in that they are calls to lengthier sets of instructions. Unlike functions, macros are replaced by
the actual commands they represent when the program is prepared for execution, so the commands are repeated at every use, whereas function instructions are copied
into a program only once.
Macro Expansion:- A macro call leads to macro expansion. During macro expansion, the macro call statement is
replaced by a sequence of assembly statements.
In the above program a macro call, INITZ, is shown in the middle of the figure.
Every macro definition begins with the MACRO keyword and ends
with ENDM (end macro). Whenever a macro is called, its entire body is substituted into the program at the point of the
call; this happens when the program is assembled, not while it executes. The result of the expansion is shown on the rightmost side of the figure.
Macro calling in high-level programming languages:
(C programming)
#define max(a,b) a>b?a:b
int main(void) {
int x, y, z;
x = 4; y = 6;
z = max(x, y);
return 0; }
The above program is written in C. It defines the macro max, taking two
arguments a and b, and calls it with the same syntax as a C function. After
preprocessing, the statement z = max(x, y);
becomes z = x>y?x:y;
After macro expansion, the whole code would appear like this (the #define line itself is consumed by the preprocessor):
int main(void)
{ int x, y, z;
x = 4; y = 6; z = x>y?x:y;
return 0; }
Ans. A macro processor is a program that lets you define code that is reused many times by giving it a specific
macro name, and then reuse the code by writing only the macro name.
Generally it does not come as a separate program but is bundled with either an
assembler or a compiler.
Note : Please don't confuse a macro processor with a microprocessor. A microprocessor is a hardware
device and belongs to a completely different area of study.
#include<stdio.h>
#define hell printf("\nHello world\n");printf("I am a good boy\n\n");
int main()
{
hell;
}
The gcc compiler has a lot of options. If you run gcc helloworld.c -E, it prints the code as it looks after
preprocessing, before it is passed to the compiler proper. The output includes a lot of things along with our code, but since our main
discussion is about macro processing, let's see what the macro-processed code looks like:
# 916 "/usr/include/stdio.h" 3 4
# 2 "helloworld.c" 2
int main()
{
printf("\nHello world\n");printf("I am a good boy\n\n");;
}
You can clearly see that macro processing is already done, replacing hell with the corresponding macro
definition. This is passed on to the compiler, which does the remaining job. This is the output of the compiled program:
Hello world
I am a good boy
There are many more uses for macros. Don't confuse functions with macros, even though both are used for the same
purpose. The difference, as shown above, is that a function gets compiled,
whereas a macro gets expanded by the macro processor before compilation.
Macros are also widely used in low-level code such as boot code, where they are expanded during preprocessing and so
add no function-call overhead at run time.
Ans. When a computer is first turned on or restarted, a special type of absolute loader, called a bootstrap loader, is executed.
The bootstrap loader loads the first program to be run by the computer, usually an operating system.
Characteristics:- A bootstrap loader has the following characteristics-
o It is a type of loader.
o It is a very small and simple loader in present computer architectures.
o The bootstrap loader is normally placed in ROM.
o Because it resides in ROM, it cannot be erased.
Ans. In computer science, Backus–Naur form or Backus normal form (BNF) is a notation technique for context-free grammars,
often used to describe the syntax of languages used in computing, such as computer programming languages, document
formats, instruction sets and communication protocols.
In other words BNF is a standard notation for expressing syntax as a set of grammar rules.
Example:-
A := B*(A+C)
Solution -->
Assign -> id := expr
A := expr
A := id*expr
A := B*(expr)
A := B*(id + expr)
A := B*(A + expr)
A := B*(A + id)
A := B*(A + C)
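Ans. A grammar G is called ambiguous when G has more than one derivation tree for some string.
Equivalently, there exist multiple rightmost or leftmost derivations for some string generated from that grammar.
Solution :- Let’s find the derivation trees for the string "a+a*a", taking the grammar to be X → X+X | X*X | a (the productions used in the derivations below). The string has two leftmost derivations.
Derivation 1 − X → X+X → a+X → a+X*X → a+a*X → a+a*a
Parse tree 1 − the root is +, with a as its left subtree and a*a as its right subtree.
Derivation 2 − X → X*X → X+X*X → a+X*X → a+a*X → a+a*a
Parse tree 2 − the root is *, with a+a as its left subtree and a as its right subtree.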
Ans.
Syntax
It represents the grammar which tells you the rules for constructing well-formed sentences of the language. If a program
contains syntactic errors, it will not pass compilation. For example:
int multiply_numbers = a b *; // syntax error in C
Semantics
It represents the meaning of the sentences of the language. If a program contains only semantic errors, it may
pass compilation but does not do what it was meant to do. For example:
Comparison of Semantics and Syntax:-
Definition
Semantics: Semantics is a term derived from the Ancient Greek word sema, meaning "sign". Semantics is another important field related to theoretical linguistics. It is all about studying the meaning of linguistic expressions.
Syntax: Syntax is a term derived from Ancient Greek σύνταξις "arrangement", from σύν syn, "together," and τάξις táxis, "an ordering". Syntax is the study of how words are combined in order to form grammatical sentences.
Rules
Semantics: Describe the relationship between symbols and the things they mean or refer to.
Syntax: Describe the correct word order and inflectional structure in sentences.
Approach towards meaning of a sentence
Semantics: An individual’s own interpretation on the basis of previous knowledge.
Syntax: Linguistically and grammatically correct.
Analysis part:-
• The analysis part breaks the source program into constituent pieces and imposes a grammatical structure on them. It then
uses this structure to create an intermediate representation of the source program.
• It is also termed as front end of compiler.
• Information about the source program is collected and stored in a data structure called symbol table.
Synthesis part:-
• Synthesis part takes the intermediate representation as input and transforms it to the target program.
• It is also termed as back end of compiler.
The design of compiler can be decomposed into several phases, each of which converts one form of source program into
another.
The different phases of compiler are as follows:
1. Lexical analysis
2. Syntax analysis
3. Semantic analysis
4. Intermediate code generation
5. Code optimization
6. Code generation
Lexical Analysis
• Lexical analysis is the first phase of the compiler; it is also
termed scanning.
• The source program is scanned as a stream of characters,
and those characters are grouped into sequences called
lexemes, for each of which a token is produced as output.
• Token: A token is a sequence of characters that represents a
lexical unit and matches a pattern, such as keywords,
operators, identifiers, etc.
• Lexeme: A lexeme is an instance of a token, i.e., the group of characters forming the token.
• Pattern: A pattern describes the rule that the lexemes of a token follow. It is the structure that must be matched by strings.
• Once a token is generated, the corresponding entry is made in the symbol table.
Input: stream of characters
Output: Token
Token Template: <token-name, attribute-value>
(eg.) c=a+b*5;
Lexemes and tokens
Lexemes   Tokens
c         identifier
=         assignment symbol
a         identifier
+         + (addition symbol)
b         identifier
*         * (multiplication symbol)
5         5 (number)
Syntax Analysis
• Syntax analysis is the second phase of the compiler; it is also called parsing.
• Parser converts the tokens produced by lexical analyzer into a tree like
representation called parse tree.
Semantic Analysis
• Semantic analysis is the third phase of compiler.
• It checks for semantic consistency.
• Type information is gathered and stored in symbol table or in syntax tree.
• Performs type checking.
Code Generation
• Code generation is the final phase of a compiler.
• It gets input from code optimization phase and produces the target code or object code as result.
• Intermediate instructions are translated into a sequence of machine instructions that perform the same task.
• The code generation involves
o Allocation of register and memory.
o Generation of correct references.
o Generation of correct data types.
o Generation of missing code.
LDF R2, id3
MULF R2, # 5.0
LDF R1, id2
ADDF R1, R2
STF id1, R1
Symbol Table Management
• Symbol table is used to store all the information about identifiers used in the program.
• It is a data structure containing a record for each identifier, with fields for the attributes of the identifier.
• It allows finding the record for each identifier quickly and to store or retrieve data from that record.
• Whenever an identifier is detected in any of the phases, it is stored in the symbol table.
Example
int a, b; float c; char z;
Symbol name   Type    Address
a             int     1000
b             int     1002
c             float   1004
z             char    1008
Example
extern double test(double x);
double sample(int count)
{
    double sum = 0.0;
    for (int i = 1; i <= count; i++)
        sum += test((double) i);
    return sum;
}
Error Handling
• Each phase can encounter errors. After detecting an error, a phase must handle
the error so that compilation can proceed.
• In lexical analysis, errors occur in separation of tokens.
• In syntax analysis, errors occur during construction of syntax tree.
• In semantic analysis, errors may occur at the following cases:
(i) When the compiler detects constructs that have right syntactic structure but no
meaning
(ii) During type conversion.
• In code optimization, errors occur when the result is affected by the optimization.
• In code generation, errors are reported when code is missing, etc.
The figure illustrates the translation of source code through each phase, considering
the statement
c = a + b * 5.