
Unit 1

What is System Programming?

System Programming consists of a variety of programs that support the operation of a computer. This software makes it possible for the user to focus on an application or other problem to be solved. System programs (e.g. compilers, loaders, macro processors, and operating systems) were developed to make the computer better adapted to the needs of its users.

Components of System Programming

The components of system programming are:

1. Assembler
2. Loader
3. Compiler
4. Macros
5. Formal System

Assembler:-

• An assembler is a program that translates programs written in assembly language into machine language – code and instructions that can be executed by a computer.
• An assembler enables software and application developers to access, operate and manage a computer's hardware architecture and components.
• An assembler is sometimes referred to as the compiler of assembly language. It also provides the services of an interpreter.

Loader:-

• A loader is a program used by an operating system to load programs from secondary storage into main memory so that they can be executed.

Compiler:-

A compiler is a computer program (or a set of programs) that transforms source code written in a programming language (the source language) into another computer language (the target language), the latter often having a binary form known as object code.

Macro:-

• A macro is a single-line abbreviation for a group of statements.
• A macro processor is a program that substitutes (and specializes) macro definitions for macro calls.

Formal System:-

• A formal system consists of a language over some alphabet of symbols together with (axioms and) inference rules that distinguish some of the strings in the language as theorems.
• A formal system has the following components:

o A finite alphabet of symbols.
o A syntax that defines which strings of symbols are in the language of our formal system.
o A decidable set of axioms and a finite set of rules from which the set of theorems of the system is generated. The rules must take a finite number of steps to apply.

----***-----
General Machine Structure

All the conventional modern computers are based upon the concept of stored program
computer, the model that was proposed by John von Neumann.

The components of a general machine are as follows:

1. Instruction interpreter: a group of electronic circuits that performs the intent of the instruction fetched from memory.
2. Location counter: the LC, also called the program counter (PC) or instruction counter (IC), is a hardware memory device that holds the location of the current instruction being executed.
3. Instruction register: the IR holds a copy of the instruction currently being executed, i.e. the content of the memory location addressed by the LC.
4. Working registers: memory devices that serve as a “scratch pad” for the instruction interpreter.
5. General registers: used by programmers as storage locations and for special functions.
6. Memory address register (MAR): contains the address of the memory location that is to be read from or written to.
7. Memory buffer register (MBR): contains a copy of the content of the memory location whose address is stored in the MAR. The primary interface between the memory and the CPU is the memory buffer register.
8. Memory controller: a hardware device whose job is to transfer the content of the MBR to the core memory location whose address is stored in the MAR.
9. I/O channels: may be thought of as separate computers that interpret special instructions for inputting and outputting information from the memory.
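
To make the interplay of these components concrete, the following is a minimal Python sketch of one fetch-decode-execute cycle. The tiny memory contents, the instruction encoding and the LOAD/ADD/HALT operations are invented purely for illustration; they are not the instruction set of any real machine.

# Minimal sketch of the fetch-decode-execute cycle using the components above.
# The instruction encoding (opcode, operand address) is invented for illustration.

memory = {0: ("LOAD", 100), 1: ("ADD", 101), 2: ("HALT", None),
          100: 5, 101: 7}                      # core memory: instructions and data

LC = 0          # location counter: address of the current instruction
ACC = 0         # a working/general register used as an accumulator
running = True

while running:
    MAR = LC                 # memory address register holds the address to fetch
    MBR = memory[MAR]        # memory buffer register receives the memory content
    IR = MBR                 # instruction register holds the fetched instruction
    LC += 1                  # advance to the next instruction

    opcode, operand = IR     # the instruction interpreter decodes the instruction
    if opcode == "LOAD":
        MAR, MBR = operand, memory[operand]
        ACC = MBR
    elif opcode == "ADD":
        MAR, MBR = operand, memory[operand]
        ACC += MBR
    elif opcode == "HALT":
        running = False

print(ACC)  # 12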

----***-----

Assembly Language

• An assembler is a program that takes computer instructions and converts them into a pattern of bits that the computer processor can use to perform its basic operations.
• The assembler is responsible for translating the assembly language program into machine code. When the source program language is essentially a symbolic representation of a numerical machine language, the translator is called an assembler and the source language is called an assembly language.

Basic function of Assembler

• Translate mnemonic opcodes to machine language.
• Convert symbolic operands to their machine addresses.
• Build machine instructions in the proper format.
• Convert data constants into their machine representation.
• Error checking is provided.
• Changes can be quickly and easily incorporated with a reassembly.
• Variables are represented by symbolic names, not as memory locations.
• Assembly language statements are written one per line. A machine code program thus consists of a sequence of assembly language statements, where each statement contains a mnemonic.

Advantages

• Reduced errors
• Faster translation times
• Changes can be made more easily and quickly.
• Addresses are symbolic, not absolute.
• Easier to remember

Disadvantages

• Assembly languages are unique to specific types of computers.
• Programs are not portable to other computers.
• Many instructions are required to achieve small tasks.
• The programmer requires knowledge of the processor architecture and instruction set.

Translation phase of Assembler

The six steps that should be followed by the designer

1. Specify the problem


2. Specify data structure
3. Define format of data structure
4. Specify algorithm
5. Look for modularity
6. Repeat 1 through 5 on modules

Functions / Purpose of Assembler


An assembler must do the following

1. Generate instructions
   a. Evaluate the mnemonic in the operation field to produce the machine code.
   b. Evaluate the subfields – find the value of each symbol, process literals, and assign addresses.
2. Process pseudo-ops
   a. Pass 1 (define symbols and literals)
      i. Determine length of machine instructions (MOTGET)
      ii. Keep track of the location counter (LC)
      iii. Remember values of symbols until Pass 2 (STSTO)
      iv. Process some pseudo-ops (POTGET1)
      v. Remember literals (LITSTO)
   b. Pass 2 (generate object program)
      i. Look up values of symbols (STGET)
      ii. Generate instructions (MOTGET2)
      iii. Generate data (for DS, DC and literals)
      iv. Process pseudo-ops (POTGET2)

Design data structure for assembler design in Pass-1 and Pass-2 with flow chart

Pass -1

1. Input source program.
2. A location counter (LC) used to keep track of each instruction's location.
3. A table, the machine-operation table (MOT), that indicates the symbolic mnemonic for each instruction and its length (two, four, or six bytes).
4. A table, the pseudo-operation table (POT), that indicates the symbolic mnemonic and the action to be taken for each pseudo-op in Pass 1.
5. A table, the literal table (LT), that is used to store each literal encountered and its corresponding assigned location.
6. A table, the symbol table (ST), that is used to store each label and its corresponding value.
7. A copy of the input to be used later by Pass 2. This may be stored on a secondary storage device.
Pass-2

1. Copy of the source program input to Pass 1.
2. Location counter (LC).
3. A table, the MOT, that indicates for each instruction:
   a. Symbolic mnemonic
   b. Length
   c. Binary machine op-code
   d. Format (RR, RS, RX, SI, SS)
4. A table, the POT, that indicates for each pseudo-op the symbolic mnemonic and the action to be taken in Pass 2.
5. The ST, prepared by Pass 1, containing each label and its corresponding value.
6. A table, the base table (BT), that indicates which registers are currently specified as base registers by USING pseudo-ops and what the specified contents of these registers are.
7. A workspace, INSR, that is used to hold each instruction as its various parts are being assembled together.
8. A workspace, PRINT LINE, used to produce a printed listing.
9. A workspace, PUNCH CARD, used prior to actual outputting to convert the assembled instructions into the format needed by the loader.
10. An output deck of assembled instructions in the format needed by the loader.
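
As a rough illustration of how these databases cooperate, the following Python sketch models the MOT, POT, ST and LT as simple dictionaries and runs a miniature Pass 1 over an invented source deck. The mnemonics, instruction lengths and source lines are assumptions made for the example, not the actual IBM 360 tables, and the real routines (MOTGET, STSTO, LITSTO, ...) are not modelled.

# Simplified sketch of the Pass 1 databases (MOT, POT, ST, LT).
# Mnemonics, instruction lengths and the sample source are illustrative only.

MOT = {"L": 4, "A": 4, "ST": 4, "BR": 2}      # mnemonic -> instruction length in bytes
POT = {"START", "USING", "DC", "DS", "END"}   # pseudo-ops handled specially
ST, LT = {}, {}                               # symbol table, literal table
LC = 0                                        # location counter

source = [
    ("PROG", "START", "0"),
    (None,   "L",     "3,=F'10'"),
    (None,   "A",     "3,FOUR"),
    ("FOUR", "DC",    "F'4'"),
    (None,   "END",   None),
]

for label, opcode, operand in source:
    if label and opcode != "START":
        ST[label] = LC                        # remember value of symbol (as STSTO would)
    if opcode in MOT:
        if operand and "=" in operand:        # remember literal (as LITSTO would)
            LT.setdefault(operand.split(",")[1], None)
        LC += MOT[opcode]                     # update LC by instruction length
    elif opcode == "DC":
        LC += 4                               # a full-word constant
    # other pseudo-ops (START, END, ...) are processed by their own routines

print(ST)   # {'FOUR': 8}
print(LT)   # {"=F'10'": None}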
Format of Database
1. Machine Operation Table (MOT)

Figure: machine-op table for Pass 1 and Pass 2. The op code is the key and its value is the binary op-code equivalent, which is stored for use in generating machine code. The instruction length is stored for use in updating the location counter, and the instruction format is used in forming the machine-language equivalent.

2. Pseudo Operation Table (POT)

The table will actually contain the physical address. Figure: POT for Pass 1; each pseudo-op is listed with an associated pointer to the assembler routine for processing that pseudo-op.

3. Symbol Table (ST)


The symbol table is usually hash-organized. It contains all relevant information about the symbols defined and used in the source program, including information about forward references. For each symbol a hash address is generated; if two symbols generate the same address, the conflict is resolved using collision-handling techniques.

The relative-location indicator tells the assembler whether the value of the symbol is absolute or relative to the base of the program.

4. Base Table (BT)

The base table is used by the assembler to generate the proper base-register references in machine instructions and to compute the correct offsets. When generating an address, the assembler references the base table to choose a base register that contains a value close to the symbolic reference. The address is then assembled as an offset from that base register.
Difference Between

Difference between compiler and interpreter

1. Compiler: Scans the entire program first and then translates it into machine code. Interpreter: Translates the program line by line.
2. Compiler: Converts the entire program to machine code; execution takes place only after all syntax errors have been removed. Interpreter: Each time the program is executed, every line is checked for syntax errors and then converted to equivalent machine code.
3. Compiler: Execution time is less. Interpreter: Execution time is more.
4. Compiler: Machine code can be saved and reused; the source code and compiler are no longer needed. Interpreter: Machine code cannot be saved; the interpreter is always required for translation.
5. Compiler: Since the source code is not required, tampering with the source code is not possible. Interpreter: The source code can be easily modified, and hence there is no security of programs.
6. Compiler: Slow for debugging. Interpreter: Fast for debugging.

Difference between process and Programs

1. Process: A process is an instance of a program in execution. Program: A program is a set of instructions written to carry out a particular task.
2. Process: A process is a dynamic concept. Program: A program is a static concept.
3. Process: A process is termed an ‘active entity’, since it is stored in main memory and disappears if the machine is power cycled; several processes may be associated with the same program. Program: A program is an executable file residing on the disk (secondary storage) in a directory.
4. Program: It is read into primary memory and executed by the kernel.
5. Process: A process is the actual execution of those instructions. Program: A computer program is a passive collection of instructions.
Difference between multiprogramming and multiprocessing

1. Multiprocessing: Multiprocessing refers to the processing of multiple processes at the same time by multiple CPUs. Multiprogramming: Multiprogramming keeps several programs in main memory at the same time and executes them concurrently utilizing a single CPU.
2. Multiprocessing: It utilizes multiple CPUs. Multiprogramming: It utilizes a single CPU.
3. Multiprocessing: It permits parallel processing. Multiprogramming: Context switching takes place.
4. Multiprocessing: Less time is taken to process the jobs. Multiprogramming: More time is taken to process the jobs.
5. Multiprocessing: It facilitates much more efficient utilization of the devices of the computer system. Multiprogramming: Less efficient than multiprocessing.
6. Multiprocessing: Usually more expensive. Multiprogramming: Such systems are less expensive.

Difference between Open Subroutine and Closed Subroutine

1. Open subroutine: An open subroutine is one whose code is inserted into the main program. Closed subroutine: A closed subroutine can be stored outside the main routine, and control transfers to the subroutine.
2. Open subroutine: If an open subroutine were called four times, its code would appear in four different places in the calling program. Closed subroutine: A closed subroutine performs a transfer of control and a transfer of data.
3. Open subroutine: Arguments are passed in the registers that are given as arguments to the subroutine. Closed subroutine: Arguments may be placed in registers or on the stack.
4. Open subroutine: Open subroutines are very efficient, with no wasted instructions. Closed subroutine: A closed subroutine allows you to debug the code once and then ensure that all future instantiations of the code will be correct.
5. Open subroutine: Open subroutines are very flexible and can be as general as the program wishes to make them. Closed subroutine: Any register that the subroutine uses must first be saved and then restored after the subroutine completes execution.

Difference between Pure Procedure and Impure Procedure

1. Pure procedure: A pure procedure does not modify itself. Impure procedure: Procedures that modify themselves are called impure procedures.
2. Pure procedure: It can be shared by multiple processors. Impure procedure: Other programs find them difficult to read; moreover, they cannot be shared by multiple processors.
3. Pure procedure: Pure procedures are readily reusable. Impure procedure: Impure procedures are not readily reusable.
4. Pure procedure: The instructions are the same each time the program is used. Impure procedure: Each processor executing an impure procedure modifies its contents.
5. Pure procedure: Writing such procedures is good programming practice. Impure procedure: Writing such procedures is poor programming practice.

Difference between user viewpoint and System viewpoint

Operating System is designed both by taking user view and system view into
consideration.

1. The goal of the Operating System is to maximize the work and minimize the effort of the user.
2. Most systems are designed to be operated by a single user; however, in some systems multiple users can share resources and memory. In these cases the Operating System is designed to share the available resources and the CPU among multiple users efficiently.
3. The Operating System must be designed by taking both usability and efficient resource utilization into view.
4. In embedded systems (automated systems) the user view is not present.
5. The Operating System gives the user the impression that the processor is dealing only with the current task, while in the background the processor is dealing with several processes.

System View

1. From the system point of view, the Operating System is the program most closely involved with the hardware.
2. The Operating System is a resource allocator: it allocates memory and other resources among various processes and controls the sharing of resources among programs.
3. It prevents improper usage and errors, and handles deadlock conditions.
4. It is a program that runs all the time in the system, in the form of the kernel.
5. It controls application programs that are not part of the kernel.

-----***-----

General Approaches to New Machines


In order to get to know a new machine, we have a number of questions in mind. These questions can be categorized as follows:

• Memory: basic unit, size and addressing scheme.
• Registers: number of registers, their size, functions and the interrelation of the registers.
• Data: types of data and their storage schemes.
• Instructions: classes of instructions, allowable operations and their storage schemes.
• Special features: additional features such as interrupts and protection.

The lists of opcodes used in the program are as follows

• USING is a pseudo-op that indicates to the assembler which general purpose register to use as a base and what its contents will be. As we do not have any specific general register acting as the base register, it becomes necessary to indicate a base register for the program. Because the addresses are relative, knowledge of the base and offset allows the program to be easily located and executed.

• BALR is a machine opcode that loads a register with the address of the next instruction and branches to the address in the second field. Since the second operand is 0, control passes to the next instruction.

• START is a pseudo-op that tells the assembler where the beginning of the program is and allows the user to give a name to the program.

• END is a pseudo-op that tells the assembler that the last card of the program has been reached.

• BR 14, the last machine opcode, is used to branch to the location whose address is in general purpose register 14. By convention, calling programs leave their return address in register 14.

Literals
The same program is repeated using literals, which are mechanisms whereby the assembler creates data areas for the programmer, containing the constants he requests.
=F'10', =F'49' and =F'4' are literals; they result in the creation of data areas containing 10, 49 and 4, and the replacement of each literal operand with the address of the data it describes. L 3, =F'10' is translated by the assembler to point to a full word that contains a 10. Generally the assembler keeps track of literals with the help of a literal table. This table contains all the constants that have been requested through the use of literals. The pseudo-op LTORG places the literals at an earlier location. This is required because the program may be using 10,000 data items, making it difficult for the offset of the load instruction to reach a literal placed at the end of the program.

Data Format of IBM 360/370

The 360 may store several different types of data, as depicted in the figure. The groups of bits stored in memory are interpreted by the 360 processor in several ways. The different interpretations are shown in the figure.

----***-----

Instruction Format

The instructions in the 360 can be arithmetic, logical, control or transfer, and special interrupt instructions. The format of 360 instructions is as in the figure above. There are five types of instructions that differ in the type of operands they use. A register operand refers to data stored in one of the 16 general purpose registers (32 bits each). Registers, being high-speed circuits, provide faster access to data than data held in core.

E.g. "Add register 3, 4" causes the contents of register 4 to be added to the contents of register 3 and the result stored back in register 3. The instruction is represented as given in the diagram. This is called the RR format. A total of two bytes are required to represent an RR instruction: 8 bits for the opcode and 4 bits for each register (8+4+4 = 16 bits = 2 bytes).

The address of the i-th storage operand is computed from the instruction in the following manner:
Address = C(Bi) + C(Xi) + Di   (RX format)
or      = C(Bi) + Di           (RS, SI, SS formats)
where C(Bi) and C(Xi) denote the contents of the base and index registers.
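
The following Python sketch shows the same computation; the register contents and displacements are made-up values used only to illustrate the two formulas.

# Sketch of effective-address computation for the 360 instruction formats,
# using C(reg) to mean "contents of register reg". Register values are invented.

registers = {0: 0, 3: 0x2000, 5: 0x0040}

def c(reg):
    """Contents of a register; register 0 contributes zero in addressing."""
    return 0 if reg == 0 else registers[reg]

def effective_address(base, displacement, index=None):
    # RX format:          address = C(Bi) + C(Xi) + Di
    # RS, SI, SS formats: address = C(Bi) + Di
    addr = c(base) + displacement
    if index is not None:
        addr += c(index)
    return addr

print(hex(effective_address(base=3, displacement=0x10, index=5)))  # RX: 0x2050
print(hex(effective_address(base=3, displacement=0x10)))           # RS/SI/SS: 0x2010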

Unit 2
Definition

Macros are single-line abbreviations for a certain group of instructions. Once the
macro is defined, these groups of instructions can be used anywhere in a program.

It is sometimes necessary for an assembly language programmer to repeat some block of code several times in the course of a program. To employ a macro, the programmer defines a single "instruction" to represent the block of code. The macro proves useful because, instead of writing the entire block again and again, you can simply write the macro that you have already defined. An assembly language macro is an instruction that represents several other machine language instructions at once.

The macro facility permits you to attach a name to a sequence that occurs several times in a program; you can then use this name wherever that sequence is needed. All you need to do is attach a name to the sequence with the help of a macro instruction definition. The following structure shows how to define a macro in a program:
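
In outline, the structure being referred to is of the following form (the macro name and the instructions in the body are placeholders for whatever the programmer chooses):

MACRO           start of the macro definition (MACRO pseudo-op)
[macro name]    name line identifying the macro instruction
  ...           sequence of instructions being abbreviated (the macro body)
MEND            end of the macro definition (MEND pseudo-op)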

This structure describes the macro definition: the first line of the definition is the MACRO pseudo-op. The following line is the name line, which identifies the macro instruction name. The lines following the macro name contain the sequence of instructions being abbreviated; these instructions make up the body of the macro. The last statement in the macro definition is the MEND pseudo-op, which denotes the end of the macro definition and terminates the definition of the macro instruction.

MACRO EXPANSION

Once a macro has been created, the interpreter or compiler automatically replaces the pattern described in the macro wherever it is encountered. In compiled languages, macro expansion always happens at compile time. The tool that performs the macro expansion is known as a macro expander. Once a macro is defined, the macro name can be used instead of writing the entire instruction sequence again and again.
Since you need not write the sequence repeatedly, the overhead associated with macros is very small. This can be explained with the help of the following example.

The macro processor replaces each macro call with the following lines:
A 1, DATA
A 2, DATA
A 3, DATA

The process of such a replacement is known as expanding the macro. The macro definition itself does not appear in the expanded source code, because the macro processor saves the definition of the macro. An occurrence of the macro name in the source program is referred to as a macro call. When the macro is called in the program, the sequence of instructions corresponding to that macro name replaces the call in the expanded source.
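
A minimal Python sketch of this replacement step is given below; the macro name INCR and the source lines are illustrative, following the example above.

# Minimal sketch of macro expansion: each occurrence of a macro name in the
# source is replaced by the saved definition. Names here are illustrative.

macro_definitions = {
    "INCR": ["A 1, DATA", "A 2, DATA", "A 3, DATA"],
}

source = ["L 1, DATA", "INCR", "ST 1, DATA", "INCR"]

expanded = []
for line in source:
    if line in macro_definitions:                   # a macro call
        expanded.extend(macro_definitions[line])    # substitute the saved body
    else:
        expanded.append(line)                       # ordinary statement: copy through

for line in expanded:
    print(line)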

NESTED MACRO CALLS


Nested macro calls refer to macro calls within macros. A macro may also be used within other macro definitions. When a macro call occurs that itself contains another macro call, the macro processor generates the nested macro's definition as text and places it on the input stack. The definition of the macro is then scanned and the macro processor compiles it. It is important to note that it is the macro call that is nested, not the macro definition. If you nest the macro definition, the macro processor compiles the same macro repeatedly whenever that section of the outer macro is executed. The following example illustrates nested macro calls:

You can easily notice from this example that the definition of the macro ‘SUBST’ contains three separate calls to a previously defined macro ‘SUB1’. The definition of the macro SUB1 has shortened the length of the definition of the macro ‘SUBST’. Although this technique makes the program easier to understand, it is at the same time considered an inefficient technique, since it uses several macros that result in macro expansions on multiple levels. It is clear from the example that a macro call, SUBST, in the source is expanded in the expanded source (Level 1) with the help of SUB1, which is further expanded in the expanded source (Level 2).

FEATURES OF MACRO FACILITY

The features of the macro facility are as follows:

• Macro instruction arguments


• Conditional macro expansion
• Macro instructions defining macros

1. Macro Instruction Arguments

The macro facility presented so far inserts a block of instructions in place of each macro call. This facility is not flexible, in the sense that you cannot vary the code generated for a specific macro call. An important extension of this facility consists of providing arguments, or parameters, in macro calls. Consider the following program.
In this example, the instruction sequences are very similar but not identical. The first sequence performs an operation on the operand DATA1, the second sequence performs the operation on the operand DATA2, and the third sequence performs it on DATA3. They can be considered to perform the same operation with a variable parameter, or argument. This parameter is known as a macro instruction argument, or dummy argument.
Notice that in this program a dummy argument is specified on the macro name line and is distinguished by inserting an ampersand (&) symbol at the beginning of its name. There is no limitation on the number of arguments supplied in a macro call. The important thing to understand about macro instruction arguments is that each argument must correspond to a definition, or dummy argument, on the macro name line of the macro definition. The supplied arguments are substituted for the respective dummy arguments in the macro definition whenever a macro call is processed.
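
The following Python sketch illustrates the substitution of a supplied argument for a dummy argument; the macro INCR, its dummy argument &ARG and the operands DATA1/DATA2 are assumptions made for the example.

# Sketch of macro-call argument substitution: dummy arguments (&...) on the
# macro name line are replaced by the actual arguments of each call.

definition = {
    "name": "INCR",
    "dummy_args": ["&ARG"],
    "body": ["A 1, &ARG", "A 2, &ARG", "A 3, &ARG"],
}

def expand(call_args):
    # Pair each dummy argument with the argument supplied in the call.
    mapping = dict(zip(definition["dummy_args"], call_args))
    out = []
    for line in definition["body"]:
        for dummy, actual in mapping.items():
            line = line.replace(dummy, actual)
        out.append(line)
    return out

print(expand(["DATA1"]))   # ['A 1, DATA1', 'A 2, DATA1', 'A 3, DATA1']
print(expand(["DATA2"]))   # ['A 1, DATA2', 'A 2, DATA2', 'A 3, DATA2']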

2. Conditional Macro Expansion

Conditional macro expansion permits conditional selection and reordering of the sequence of instructions generated by a macro expansion. It is responsible for selecting the instructions that appear in the expansion of a macro call, based on conditions specified in the program. Branches and tests in macro instructions permit the use of macros that can select which instructions to assemble. This facility for selective assembly is considered one of the most powerful programming tools for system software. The use of conditional macro expansion can be explained with the help of an example.

Consider the following set of instructions:

LOOP 1 A 1, DATA1
A 2, DATA2
A 3, DATA3
:
:
LOOP 2 A 1, DATA3
A 2, DATA2
:
:
DATA1 DC F’5’
DATA2 DC F’10’
DATA3 DC F’15’

In this example the operands, labels and number of instructions generated are different in each sequence. Rewriting the set of instructions as a single macro might look like the definition described below.

Labels starting with a period (.), such as .FINI, are macro labels; they do not appear in the output of the macro processor. The statement AIF (&COUNT EQ 1).FINI directs the macro processor to skip to the statement labeled .FINI if the parameter corresponding to &COUNT is one; otherwise the macro processor continues with the statement that follows the AIF pseudo-op. AIF performs an arithmetic test and, since it is a conditional branch pseudo-op, it branches only if the tested condition is true. Another pseudo-op used in this program is AGO, an unconditional branch pseudo-op that works like a GOTO statement: its operand is a macro label in the macro instruction definition, and processing continues sequentially from the statement bearing that label. These statements are directives to the macro processor and do not appear in the macro expansions.
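
A simplified Python sketch of how a macro processor might act on AIF-style directives is given below. The macro body, the crude parsing of the AIF condition and the single supported EQ test are all simplifications assumed purely for illustration.

# Sketch of conditional macro expansion: lines starting with AIF are directives
# to the macro processor (they never appear in the output), and labels starting
# with "." are macro labels. The body below is illustrative.

body = [
    " A 1, &ARG1",
    " AIF (&COUNT EQ 1).FINI",   # branch to .FINI if &COUNT is 1
    " A 2, &ARG2",
    ".FINI MEND",
]

def expand(args):                # args maps dummy arguments to actual values
    # index of each macro label in the body
    labels = {line.split()[0]: i for i, line in enumerate(body)
              if line.startswith(".")}
    out, i = [], 0
    while i < len(body):
        line = body[i]
        stripped = line.strip()
        if stripped.startswith("AIF"):
            # crude test of the form (&X EQ value).LABEL
            dummy, _, value = stripped[stripped.index("(") + 1:stripped.index(")")].split()
            target = stripped[stripped.index(")") + 1:]
            i = labels[target] if args[dummy] == value else i + 1
            continue
        if stripped.endswith("MEND"):
            break
        for dummy, actual in args.items():
            line = line.replace(dummy, actual)
        out.append(line)
        i += 1
    return out

print(expand({"&COUNT": "1", "&ARG1": "DATA1", "&ARG2": "DATA2"}))
# [' A 1, DATA1']
print(expand({"&COUNT": "2", "&ARG1": "DATA1", "&ARG2": "DATA2"}))
# [' A 1, DATA1', ' A 2, DATA2']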

3. Macro Instructions Defining Macros

A single macro instruction can also simplify the process of defining a group of similar macros. The key point when using macro instructions that define macros is that the inner macro is not defined until the outer macro has been called once. Consider a macro instruction INSTRUCT within which another macro &ADD is also defined.

This is explained in the following macro instruction.

In this code, first the macro INSTRUCT is defined, and then within INSTRUCT a new macro &ADD is defined. Macro definitions within macros are also known as “macro definitions within macro definitions”.

DESIGN OF A MACRO PRE-PROCESSOR

A macro pre-processor effectively constitutes a separate language processor with its own language. A macro pre-processor is not really a macro processor, but is better regarded as a macro translator. This approach simplifies the design and implementation of the macro pre-processor. Moreover, the approach can also accommodate macro features such as macro calls within macros and recursive macros. The macro pre-processor recognises the macro definitions, including those provided within macros; macro calls are not considered at this stage because the macro pre-processor does not perform any macro expansion while recognising definitions.

The macro preprocessor generally works in two modes: passive and active. The passive mode
looks for the macro definitions in the input and copies macro definitions found in the input to the
output. By default, the macro pre-processor works in the passive mode. The macro pre-processor
switches over to the active mode whenever it finds a macro definition in the input. In this mode,
the macro preprocessor is responsible for storing the macro definitions in the internal data
structures. When the macro definition is completed and the macros get translated, then the macro
pre-processor switches back to the passive mode.

Four basic tasks that are required while specifying the problem in the macro pre-processor are as
follows:

1. Recognising macro definitions: A macro pre-processor must recognise macro definitions, which are identified by the MACRO and MEND pseudo-ops. Macro definitions can be easily recognised, but the task is complicated when macro definitions appear within macros. In such situations the macro pre-processor must recognise the nesting and correctly match the last MEND with the first MACRO.

2. Saving the definitions: The pre-processor must save the macro instruction definitions, which will later be needed for expanding macro calls.
3. Recognising macro calls: The pre-processor must recognise macro calls along with the macro definitions. Macro calls appear as operation mnemonics in a program.

4. Replacing macro calls with their definitions: The pre-processor must expand macro calls and substitute arguments whenever a macro call is encountered. The pre-processor must substitute the call's arguments into the macro definition.

Implementation of Two-Pass Algorithm

The two-pass algorithm for the macro pre-processor processes the input data in two passes. In the first pass the algorithm handles the definitions of macros, and in the second pass it handles the macro calls. The two passes are described in detail below:

1. First Pass

The first pass processes macro definitions by checking each operation code. In the first pass, each line of a macro definition is saved in a table called the Macro Definition Table (MDT). Another table maintained in the first pass is the Macro Name Table (MNT). The first pass also uses other databases such as the Macro Name Table Counter (MNTC) and the Macro Definition Table Counter (MDTC). The databases used by the first pass are:

1. The input macro source deck.
2. The output macro source deck copy that will be used by Pass 2.
3. The Macro Definition Table (MDT), used to store the body of the macro definitions. The MDT contains text lines; every line of each macro definition, except the MACRO line, is stored in this table. For example, consider the code described in the macro expansion section: the macro definition of INC is stored in the MDT. Table 2.1 shows the MDT entry for the INC macro.

4. The Macro Name Table (MNT), used to store the names of defined macros. Each MNT entry consists of a character string (the macro name) and a pointer (an index) to the entry in the MDT that corresponds to the beginning of the macro definition. Table 2.2 shows the MNT entry for the INCR macro.
5. The Macro Definition Table Counter (MDTC), which indicates the next available entry in the MDT.
6. The Macro Name Table Counter (MNTC), which indicates the next available entry in the MNT.
7. The Argument List Array (ALA), used to substitute index markers for dummy arguments before a macro definition is stored. The ALA is used during both passes of the macro pre-processor. During Pass 1, dummy arguments in the macro definition are replaced with positional indicators when the macro definition is stored; these positional indicators refer to the corresponding entries of the argument list during macro expansion. This is done in order to simplify the later argument replacement during macro expansion. The i-th dummy argument on the macro name card is represented in the body of the macro by the index marker #i, where # is a symbol reserved for the use of the macro pre-processor.
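
The following Python sketch outlines Pass 1 for a single, non-nested definition: it stores the body in the MDT, records the name and MDT index in the MNT, and replaces dummy arguments with index markers via an argument list array. The input deck and the INCR macro are illustrative assumptions.

# Sketch of Pass 1 of a macro pre-processor: save each definition in the MDT,
# record its name and starting MDT index in the MNT, and replace dummy
# arguments with index markers (#1, #2, ...) via the argument list array.

MDT, MNT = [], {}          # macro definition table, macro name table
MDTC = 0                   # next available MDT entry (MNT entries point here)

deck = [
    "MACRO",
    "INCR &ARG",
    "A 1, &ARG",
    "A 2, &ARG",
    "MEND",
    "INCR DATA1",          # ordinary text for Pass 2 (a macro call)
]

output = []                # copy of the source passed on to Pass 2
lines = iter(deck)
for card in lines:
    if card.strip() == "MACRO":
        name_line = next(lines)
        name, *dummies = name_line.split()
        ALA = {d: "#%d" % (i + 1) for i, d in enumerate(dummies)}
        MNT[name] = MDTC                       # remember where the body starts
        for body_line in lines:
            for dummy, marker in ALA.items():  # substitute index markers
                body_line = body_line.replace(dummy, marker)
            MDT.append(body_line)
            MDTC += 1
            if body_line.strip() == "MEND":
                break
    else:
        output.append(card)                    # copied for use by Pass 2

print(MNT)     # {'INCR': 0}
print(MDT)     # ['A 1, #1', 'A 2, #1', 'MEND']
print(output)  # ['INCR DATA1']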

2. Second Pass

The second pass of the two-pass algorithm examines each operation mnemonic and replaces macro names with the corresponding macro definitions. The databases used by the second pass are:
1. The copy of the input macro source deck.
2. The output expanded source deck, which is used as input to the assembler.
3. The MDT created by Pass 1.
4. The MNT created by Pass 1.
5. The Macro Definition Table Pointer (MDTP), which indicates the next line of text to be used during macro expansion.
6. The ALA, used to substitute the macro call's arguments for the index markers in the stored macro definition.

3. Two-Pass Algorithm
In the two-pass macro pre-processor you have two algorithms to implement, one for the first pass and one for the second pass. Both algorithms examine the available input data line by line. The two algorithms that implement the two-pass macro pre-processor are:

• Pass 1 Macro Definition


• Pass 2 Macro Calls and Expansion

Pass 1 - Macro Definition

The Pass 1 algorithm examines each line of the input data for the MACRO pseudo opcode. The following steps are performed during Pass 1:

1. Initialize MDTC and MNTC to one.
2. Read the next line of input data.
3. If this line contains the MACRO pseudo opcode then
A. Read the next input line (the macro name line).
B. Enter the name of the macro and the current value of MDTC into the MNT.
C. Increase MNTC by one.
D. Prepare the argument list array for the macro found.
E. Enter the macro name line into the MDT and increase MDTC by one.
F. Read the next line of the input data.
G. Substitute index notations for the dummy arguments in the line.
H. Enter the line into the MDT and increase MDTC by one.
I. If the MEND pseudo opcode is encountered, go to step 2 to read the next line of input.
J. Else go back to step F to continue reading the macro definition.
4. If the MACRO pseudo opcode is not encountered in the input line then
A. Write a copy of the input line to the output.
B. If the END pseudo opcode is found, go to Pass 2.
C. Otherwise go to step 2 to read the next line of input.
Pass 2 - Macro Calls and Expansion
The Pass 2 algorithm examines the operation code of every input line to check whether it exists in the MNT. The following steps are performed during Pass 2:

1. Read the next line of the input data received from Pass 1.
2. Examine the operation code of the line for a matching entry in the MNT.
3. If the name of a macro is found then
A. Set a pointer, the Macro Definition Table Pointer (MDTP), to the MDT index recorded in the MNT entry.
B. Prepare the argument list array containing the arguments of the macro call.
C. Increase the value of MDTP by one.
D. Read the next line from the MDT.
E. Substitute the arguments of the macro call for the index markers in the line and write the result to the expanded source.
F. If the MEND pseudo opcode is found, go to step 1 to read the next line of input.
G. Else go back to step C to continue the expansion.
4. If a macro name is not found, write the line to the expanded source file.
5. If the END pseudo opcode is encountered, feed the expanded source file to the assembler for processing.
6. Else go to step 1 to read the next line of input.
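
A matching Python sketch of Pass 2 is given below; it reuses the MNT and MDT produced by the Pass 1 sketch above, and the source lines are again illustrative.

# Sketch of Pass 2: for each line whose operation code is found in the MNT,
# set the MDT pointer (MDTP), build the argument list array from the call,
# and copy MDT lines (with arguments substituted) until MEND is reached.

MNT = {"INCR": 0}
MDT = ["A 1, #1", "A 2, #1", "MEND"]

source = ["L 1, DATA1", "INCR DATA1", "ST 1, DATA1"]

expanded = []
for line in source:
    fields = line.split()
    if fields and fields[0] in MNT:
        MDTP = MNT[fields[0]]                    # point into the MDT
        ALA = {"#%d" % (i + 1): arg              # index marker -> call argument
               for i, arg in enumerate(fields[1:])}
        while True:
            mdt_line = MDT[MDTP]
            MDTP += 1
            if mdt_line.strip() == "MEND":
                break
            for marker, arg in ALA.items():
                mdt_line = mdt_line.replace(marker, arg)
            expanded.append(mdt_line)
    else:
        expanded.append(line)

print(expanded)
# ['L 1, DATA1', 'A 1, DATA1', 'A 2, DATA1', 'ST 1, DATA1']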
Implementation of Single-Pass Algorithm
The single-pass algorithm allows you to define a macro within a macro but does not support macro calls within a macro. The single-pass algorithm uses two additional databases: an MDI indicator and a Macro Definition Level Counter (MDLC). Their usage is as follows:

• MDI indicator: Allows you to keep track of macro calls and macro definitions. During expansion of a macro call the MDI indicator has the value ON, and it retains the value OFF otherwise. If the MDI indicator is ON, input lines are read from the MDT until the MEND pseudo opcode is encountered. When MDI is OFF, input is read from the data source instead of the MDT.
• MDLC indicator: The MDLC ensures that the macro definition is stored in the MDT. The MDLC is a counter that keeps track of the number of MACRO and MEND pseudo opcodes found.
The single-pass algorithm combines the two passes defined above into a single pass over the input. The following steps are performed during the single-pass algorithm:

1. Initialize MDTC and MNTC to one and MDLC to zero.
2. Set MDI to OFF.
3. Perform a read operation.
4. Examine the MNT for a match with the operation code.
5. If a macro name is found then
A. Set MDI to ON.
B. Prepare the argument list array from the arguments of the call.
C. Perform a read operation.
6. Else examine whether the MACRO pseudo opcode is encountered. If the MACRO pseudo opcode is found then
A. Enter the name of the macro and the current value of MDTC into the MNT at entry number MNTC.
B. Increment MNTC by one.
C. Prepare the argument list array of dummy arguments.
D. Enter the macro name card into the MDT.
E. Increment MDTC by one.
F. Increment MDLC by one.
G. Perform a read operation.
H. Substitute index notations for the dummy arguments in the line.
I. Enter the input line into the MDT.
J. Increment MDTC by one.
K. If the MACRO pseudo opcode is found, increment MDLC by one and perform a read operation.
L. Else check for the MEND pseudo opcode; if it is not found, perform a read operation.
M. If the MEND pseudo opcode is found, decrement MDLC by one.
N. If MDLC is equal to zero, go to step 2; otherwise perform a read operation.
7. If neither a macro name nor the MACRO pseudo opcode is found, write the line into the expanded source card file.

If the END pseudo opcode is found, feed the expanded source file to the assembler for processing; otherwise continue reading input from step 2.
Unit 3
INTRODUCTION

Earlier, programmers used loaders that could take program routines stored on tapes and combine and relocate them into one program. Later these loaders evolved into linkage editors used for linking program routines, but program memory remained expensive and computers were slow. As linking technology progressed and computers became faster and disks larger, program linking became easier. For efficient use of memory space and speed of execution, you need to use linkers and loaders.

Loaders and linkers give a schematic flow of the steps that you need to follow while creating a program. The following are the steps that you need to perform when you write a program in a language:

1. Translation of the program, performed by a processor called a translator.
2. Linking of the program with other programs needed for execution, performed by a separate processor known as a linker.
3. Relocation of the program to execute from the memory location allocated to it, performed by a processor called a loader.
4. Loading of the program in memory for its execution, performed by a loader.

LOADERS

A loader is a program that performs the functions of a linker program and then immediately schedules the resulting executable program for execution. In other words, a loader accepts object programs, prepares them for execution by the computer and then initiates the execution. It is not necessary for the loader to save a program as an executable file.

The functions performed by a loader are as follows:

1. Allocation: allocates space in memory for the program.
2. Linking: resolves symbolic references between the different objects.
3. Relocation: adjusts all address-dependent locations, such as address constants, to correspond to the allocated space.
4. Loading: places the instructions and data into memory.

Functions of Loader:

The loader is responsible for the activities such as allocation, linking, relocation and loading

1. It allocates space for the program in memory by calculating the size of the program. This activity is called allocation.
2. It resolves the symbolic references (code/data) between the object modules by assigning all the user subroutine and library subroutine addresses. This activity is called linking.
3. There are some address-dependent locations in the program; such address constants must be adjusted according to the allocated space. This activity, done by the loader, is called relocation.
4. Finally, it places all the machine instructions and data of the corresponding programs and subroutines into memory. The program thus becomes ready for execution; this activity is called loading.

1. Compile and Go Loader

The compile-and-go loader is also known as “assemble-and-go”. It is necessary to introduce the term “segment” to understand the different loader schemes. A segment is a unit of information, such as a program or data, that is treated as an entity and corresponds to a single source or object deck. The figure shows the compile-and-go loader.

The compile and go loader executes the assembler program in one part of memory and
places the assembled machine instructions and data directly into their assigned memory
locations. Once the assembly is completed, the assembler transfers the control to the starting
instruction of the program.

Advantages
1. This scheme is easy to implement.
2. The assembler simply places the code into core, and the loader, which consists of one instruction, transfers control to the starting instruction of the newly assembled program.

Disadvantages

1. In this scheme, a portion of memory is wasted. This is mainly because the core occupied
by the assembler is not available to the object program.
2. It is essential to assemble the user’s program deck every time it is executed.
3. It is quite difficult to handle multiple segments, if the source programs are in different
languages. This disadvantage makes it difficult to produce orderly modular programs.

2. General Loader Scheme.

The concept of loaders can be well understood if one knows the general loader scheme. In the general loader scheme the instructions and data are produced as output, as they are assembled; this strategy avoids the problem of wasting core on the assembler. When the code is required to be executed, the saved output is loaded into memory, and the assembled program occupies the same area in core that it was assigned earlier. The output that contains the coded form of the instructions is called the object deck. The object deck is used as intermediate data, so adding a program to the system does not require retranslating it. The loader accepts the assembled machine instructions, data and other information present in the object deck, and places the machine instructions and data in core in an executable form. More memory can be made available to the user, since in this scheme the loader is assumed to be smaller than the assembler. The figure shows the general loader scheme.

Advantages:
• The program need not be retranslated each time it is run. This is because when the source program is first translated, an object program is generated; if the program is not modified, the loader can use this object program to convert it to executable form.
• There is no wastage of memory, because the assembler is not placed in memory; instead the loader occupies some portion of memory, and since the loader is smaller than the assembler, more memory is available to the user.
• It is possible to write the source as multiple programs in multiple languages, because the source programs are always converted to object programs first, and the loader accepts these object modules and converts them to executable form.

3. Absolute Loader

An absolute loader is the simplest type of loader scheme that fits the general model of loaders. The assembler produces output in the same way as in the compile-and-go loader, i.e. the machine language translation of the source program. The difference lies in the form of the data: in the absolute loader scheme the output is punched on cards, so an object deck is used as intermediate data. The loader in turn simply accepts the machine language text and places it at the location prescribed by the assembler. With this scheme much more core is available to the user, because the assembler is not in memory at load time.

In the figure, the MAIN program is assigned locations 1000-2470 and the SQRT subroutine is assigned locations 4000-4770. If modifications to MAIN increase its length to more than 3000 bytes (as can be noticed from figure 4.4), the end of MAIN, i.e. 1000+3000=4000, overlaps the start of SQRT at 4000. It then becomes necessary to assign a new location to SQRT. This is done by changing its START pseudo-op card and reassembling it, and it is then also necessary to modify all other subroutines that refer to the address of SQRT.

Advantages

1. Absolute loaders are simple to implement.
2. This scheme allows multiple programs, or source programs written in different languages. If there are multiple programs written in different languages, the respective language translator converts each to object code, and a common object file can be prepared with all the address resolution done.
3. The task of the loader becomes simpler, as it simply obeys the instructions regarding where to place the object code in main memory.
4. The process of execution is efficient.

Disadvantages.

1. The programmer must specify to the assembler the address in core where the program is to be loaded.
2. The programmer needs to remember the address of each subroutine if there are multiple subroutines in the program.
3. Additionally, each absolute address must be used by the programmer explicitly in the other subroutines so that subroutine linkage can be maintained.

4. Relocating Loaders

Relocating loaders were introduced in order to avoid possible reassembly of all subroutines when a single subroutine is changed, and to perform the tasks of allocation and linking for the programmer. An example of a relocating loader is the Binary Symbolic Subroutine (BSS) loader. Although the BSS loader allows only one common data segment, it allows several procedure segments. The assembler in this type of loader assembles each procedure segment independently and passes to the loader the text and the information needed for relocation and intersegment references.

In this scheme the assembler produces, as output, text for each source program. The output text is prefixed by a transfer vector, which consists of addresses containing the names of the subroutines referenced by the source program. The assembler also provides the loader with additional information, such as the length of the entire program and the length of the transfer vector portion. Once this information is provided, the text and the transfer vector are loaded into core. The loader then loads each subroutine identified in the transfer vector and places a transfer instruction to the corresponding subroutine in each entry of the transfer vector.

The output of the relocating assembler is the object program and information about all the programs it references. Additionally, it provides relocation information for the locations that need to be changed if the program is to be loaded at an arbitrary place in core, i.e. at locations that depend on the core allocation. The BSS loader scheme is mostly used in computers with a fixed-length direct-address instruction format. Consider an example in which the 360 RX instruction format is as follows:

In this format, A2 is the 16-bit absolute address of the operand, which is the direct-address instruction format. It is necessary to relocate the address portion of every such instruction. As a result, computers with a direct-address instruction format have a more severe relocation problem than computers with 360-type base registers. The problem is solved using relocation bits: the relocation bits are included in the object deck, and the assembler associates a bit with each instruction or address field. The address field of an instruction must be relocated if the associated bit is one; otherwise the field is not relocated.
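
The following Python sketch illustrates the idea of relocation bits; the load address, opcodes and address fields are invented values used only for the example.

# Sketch of relocation in a BSS-style loader: each instruction carries a
# relocation bit; if the bit is 1, the address field of that instruction is
# adjusted by the load address of the program.

load_address = 0x4000      # where the loader actually places the segment

# (relocation_bit, opcode, address_field) -- assembled relative to address 0
object_text = [
    (1, "L",  0x0010),     # references a location inside the segment: relocate
    (0, "A",  0x0004),     # an absolute quantity: leave alone
    (1, "ST", 0x0014),
]

loaded = []
for bit, opcode, address in object_text:
    if bit == 1:
        address += load_address        # relocate the address field
    loaded.append((opcode, address))

print([(op, hex(addr)) for op, addr in loaded])
# [('L', '0x4010'), ('A', '0x4'), ('ST', '0x4014')]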

5. Direct-Linking Loaders

A direct-linking loader is a general relocating loader and is the most popular loading scheme presently used. This scheme has the advantage that it allows the programmer to use multiple procedure segments and multiple data segments. In addition, the programmer is free to reference data or instructions that are contained in other segments. The direct-linking loader provides flexible intersegment referencing and accessing ability. The assembler provides the following information to the loader along with each procedure or data segment:

• The length of the segment.
• A list of all the symbols in the segment that may be referred to by other segments, and their relative locations in the segment.
• Information regarding the address constants: their locations in the segment and a description of how to revise their values.
• The machine code translation of the source program and the relative addresses assigned.

LINKAGE EDITOR

A linkage editor combines separately translated object modules and supplies the information needed to allow references between them. A linkage editor is also known as a linker. To allow linking in a program, you need to perform:

• Program relocation
• Program linking

1. Program relocation
Program relocation is the process of modifying the addresses used in the instructions of a program so that the program can execute correctly from the memory area allocated to it. Instructions are fetched from memory addresses and followed sequentially to execute a program. The relocation of a program is performed by a linker, and for performing relocation you need to calculate the relocation_factor that is applied to the translation-time address in every address-sensitive instruction. Let the translated and linked origins of a program P be t_origin and l_origin respectively; the relocation_factor of P is then l_origin - t_origin, and each address-sensitive instruction is relocated by adding this factor to its translation-time address.
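
A small Python sketch of this calculation, with invented origins and address-sensitive locations, is given below.

# Sketch of program relocation by a linker: every address-sensitive field is
# adjusted by relocation_factor = l_origin - t_origin. Addresses are illustrative.

t_origin = 0x0000          # translated origin (address assumed by the translator)
l_origin = 0x5000          # linked origin (address actually allocated)
relocation_factor = l_origin - t_origin

# translation-time addresses of the address-sensitive fields in program P
address_sensitive = [0x0010, 0x0024, 0x0038]

linked_addresses = [addr + relocation_factor for addr in address_sensitive]
print([hex(a) for a in linked_addresses])   # ['0x5010', '0x5024', '0x5038']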

2. Program Linking

Linking in a program is the process of binding an external reference to the correct link-time address. You need to perform linking to resolve external references, which is required for the execution of a program. All the external references in a program are maintained in a table called the name table (NTAB), which contains the symbolic name of each external reference or object module. The information specified in NTAB is derived from LINKTAB entries having type=PD. The program linking algorithm uses these tables.
DYNAMIC LINKING

Sophisticated operating systems, such as Windows, allow executable object modules to be linked to a program while the program is running. This is known as dynamic linking. The operating system contains a linker that determines which functions are not present in the program; the linker searches the specified libraries for the missing functions and extracts the object modules containing them from the libraries. The libraries are constructed in a way that allows them to work with dynamic linkers. Such libraries are known as dynamic link libraries (DLLs). Technically, dynamic linking is not like static linking, which is done at build time. DLLs contain functions or routines that are loaded and executed when needed by a program (a small sketch is given after the list below). The advantages of DLLs are:

• Code sharing: With dynamic linking, programs can share identical code instead of each keeping an individual copy of the same library. Sharing allows executable functions and routines to be shared by many application programs. For example, the object linking and embedding (OLE) functions of OLE2.DLL can be invoked to allow the execution of those functions or routines in any program.
• Automatic updating: Whenever you install a new version of a dynamic link library, the older version is automatically overridden. When you run a program, the updated version of the dynamic link library is automatically picked up.

• Securing: Splitting the program you create into several linkage units makes it harder for
crackers to read an executable file.
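
As a rough illustration of run-time linking, the following Python snippet uses the ctypes module to load a shared library and call one of its routines by name. The library name libm.so.6 assumes a typical Linux system; on Windows the corresponding object would be a .dll loaded in the same way.

# Sketch of dynamic linking from a running program: the shared library is
# located and loaded at run time, and a routine is looked up by name.

from ctypes import CDLL, c_double

libm = CDLL("libm.so.6")         # ask the dynamic linker to load the library
libm.cos.restype = c_double      # describe the routine's signature
libm.cos.argtypes = [c_double]

print(libm.cos(0.0))             # 1.0 -- the call goes to code loaded at run time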

Unit 4

Common Object File Format (COFF)

This chapter describes the Common Object File Format (COFF). COFF is the format
of the output file produced by the assembler and the link editor.

The following are some key features of COFF:

• applications can add system-dependent information to the object file without causing access utilities to become obsolete
• space is provided for symbolic information used by debuggers and other applications
• programmers can modify the way the object file is constructed by providing directives at compile time

The object file supports user-defined sections and contains extensive information for symbolic software testing. An object file contains a file header, optional header information, section headers, data corresponding to each section, relocation information, line numbers, a symbol table, and a string table.
