
SS VVFGC New

The document discusses the evolution of system software components. It describes how assemblers were developed to allow users to write programs in assembly language instead of machine code. Loaders were then created to manage program loading and preparation for execution. Further, macros provided abbreviations to reduce repetitive code. Compilers and interpreters were later introduced to compile high-level languages like FORTRAN and COBOL. Operating systems were also created to control hardware and allow interaction between users and applications. The major components of a programming system are assemblers, compilers, interpreters, loaders, linkers, and the operating system.

Uploaded by

Arthi
Copyright
© © All Rights Reserved

Chapter-1,Background System Software

Background
This chapter is an introduction to the design and implementation of system
software. System software consists of a variety of programs that support the
operation of a computer.
Software is a collection of data and instructions for controlling, integrating and
managing the hardware components of a computer and for performing specific tasks.
There are two types of software:
• System software
• Application software

System software is defined as the instructions given to perform the computer's own
tasks, i.e. the basic operations of the computer. Equivalently, system software is a
set of programs that manage the resources of a computer.

General Machine Structure


The structure of the CPU for a typical von Neumann machine is shown below.

The structure consists of:

Usha Kamala B T,VVFGC, Tumakuru



1. Instruction Interpreter
2. Location Counter
3. Instruction Register
4. Working Registers
5. General Registers

The Instruction Interpreter is hardware, basically a group of circuits, that
performs the operation specified by the instructions fetched from memory.
The Location Counter, also called the Program Counter or Instruction Counter,
points to the current instruction being executed.
The Instruction Register holds a copy of the instruction that is currently being executed.
The working registers are often called "scratch pads" because they are
used to store temporary values while a calculation is in progress.
The general registers are used by the programmer as storage locations and for
special functions. The IBM 360 has 16 general-purpose registers.

The CPU interfaces with memory through the MAR and the MBR.

MAR (Memory Address Register): contains the address of the memory location to be
read from or stored into.
MBR (Memory Buffer Register): contains a copy of the contents of the memory location
specified by the MAR (after a read), or the data to be stored there (before a write).
The memory controller transfers data between the MBR and the memory location
specified by the MAR.
The role of the I/O channels is to input information into memory or output it from memory.
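
The interaction between the MAR, the MBR and memory can be illustrated with a minimal Python sketch. The class name `Memory` and its methods are hypothetical names chosen for illustration; real hardware does this with circuits, not method calls.

```python
class Memory:
    """Toy main memory that is accessed only through the MAR and the MBR."""

    def __init__(self, size):
        self.cells = [0] * size
        self.mar = 0  # Memory Address Register: which location to access
        self.mbr = 0  # Memory Buffer Register: the data read or to be written

    def read(self, address):
        self.mar = address                # CPU places the address in the MAR
        self.mbr = self.cells[self.mar]   # controller copies the cell into the MBR
        return self.mbr

    def write(self, address, value):
        self.mar = address                # CPU places the address in the MAR
        self.mbr = value                  # and the data in the MBR
        self.cells[self.mar] = self.mbr   # controller copies the MBR into the cell


memory = Memory(16)
memory.write(5, 42)
value = memory.read(5)
print(value, memory.mar, memory.mbr)
```

Note that the MBR ends up holding data (42), while the MAR holds the address (5).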

MEMORY

Memory is a device in which information is stored and from which it is retrieved.
Information is always stored in memory as 0's and 1's, called bits (binary digits).
The memory unit is the amount of data that can be stored in the storage unit. This storage
capacity is expressed in terms of bytes.
The following list explains the main memory storage units:
Bit (Binary Digit): A binary digit is a logical 0 or 1 representing a passive or an active
state of a component in an electric circuit.
Nibble: A group of 4 bits is called a nibble.
Byte: A group of 8 bits is called a byte. A byte is the smallest unit that can represent a
data item or a character.
Word: A computer word, like a byte, is a group of a fixed number of bits processed as a
unit. The number varies from computer to computer but is fixed for each computer. The length of
a computer word is called the word size or word length. It may be as small as 8 bits or
as long as 96 bits. A computer stores information in the form of computer words.
Kilobyte (KB): 1 KB = 1024 bytes (about 10^3 bytes)
Megabyte (MB): 1 MB = 1024 KB (about 10^6 bytes)
Gigabyte (GB): 1 GB = 1024 MB (about 10^9 bytes)
Terabyte (TB): 1 TB = 1024 GB (about 10^12 bytes)
Petabyte (PB): 1 PB = 1024 TB (about 10^15 bytes)

INSTRUCTION
An instruction specifies a single operation of the processor. Instructions may be of the following
types:
• Arithmetic instructions
• Logical instructions
• Control or transfer instructions
• Interrupt instructions
Instructions may have different formats depending on the types of their operands. The different
operand types are:
• register operands
• storage operands
• immediate operands
The different instruction formats are:
I. RR (Register to Register) format
II. RX (Register and Indexed storage) format
III. RS (Register and Storage) format
IV. SI (Storage and Immediate) format
V. SS (Storage to Storage) format

i. RR instruction: An RR instruction denotes a register-to-register operation, i.e. both
operands are registers. The length of an RR instruction is 2 bytes (16 bits). The
general format of an RR instruction is: opcode (8 bits), R1 (4 bits), R2 (4 bits).

Example: ADD 3, 4 (both operands, 3 and 4, are registers)

ii. RX instruction: An RX instruction denotes a register and indexed storage
operation. The length of an RX instruction is 4 bytes (32 bits). The indexed storage
operand refers to data stored in core memory. The address of the storage
operand is calculated as follows:
Address = value of an offset or displacement + contents of a base register + contents
of an index register
= D2 + C(B2) + C(X2)
The general format of an RX instruction is: opcode (8 bits), R1 (4 bits), X2 (4 bits),
B2 (4 bits), D2 (12 bits).

Example: ADD 3,16(0,5)

Here D2 = 16, X2 = 0 and B2 = 5. Assume base register 5 contains the number 1000.
An index field of 0 means that no index register is used, so it contributes 0.
The address of the storage operand = C(B2) + C(X2) + D2
= C(5) + 0 + 16 = 1000 + 0 + 16 = 1016
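
The address arithmetic used by the RX, RS and SI formats can be checked with a small Python sketch. The function name `effective_address` and the list `registers` are hypothetical; the sketch assumes the IBM 360 convention that a register field of 0 means "no register" and contributes 0.

```python
def effective_address(registers, d2, b2, x2=0):
    """Return displacement + C(base) + C(index) for a storage operand."""
    # A register field of 0 is assumed to mean "no register is used".
    base = registers[b2] if b2 != 0 else 0
    index = registers[x2] if x2 != 0 else 0
    return d2 + base + index


registers = [0] * 16      # 16 general-purpose registers
registers[5] = 1000       # base register 5 contains 1000, as in the text

rx_address = effective_address(registers, d2=16, b2=5, x2=0)  # operand 16(0,5)
si_address = effective_address(registers, d2=4, b2=5)         # operand 4(5)
print(rx_address, si_address)
```

The two results, 1016 and 1004, match the worked examples for the operands 16(0,5) and 4(5).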

iii. RS instruction: An RS instruction denotes a register and storage operation. The
length of an RS instruction is 4 bytes (32 bits). The general format of an RS
instruction is: opcode (8 bits), R1 (4 bits), R3 (4 bits), B2 (4 bits), D2 (12 bits).

Address = value of an offset or displacement + contents of a base register
= D2 + C(B2)

iv. SI instruction: An SI instruction denotes a storage and immediate operand
operation. An immediate operand is a single byte of data stored as part of the
instruction. The length of an SI instruction is 4 bytes (32 bits). The general
format of an SI instruction is: opcode (8 bits), I2 (8 bits), B1 (4 bits), D1 (12 bits).

Address = value of an offset or displacement + contents of a base register
= D1 + C(B1)
Example: MOV I2, 4(5)

Here I2 is the immediate operand, D1 = 4 and B1 = 5. Assuming again that base register 5
contains 1000:
Address of storage operand = C(B1) + D1
= C(5) + 4
= 1000 + 4
= 1004
v. SS instruction: An SS instruction denotes a storage-to-storage operation. The
length of an SS instruction is 6 bytes (48 bits). The general format is: opcode (8 bits),
L (8 bits), B1 (4 bits), D1 (12 bits), B2 (4 bits), D2 (12 bits).

Example: MVC 32(79,5),300(5)

Evolution of the Components of a Programming System

i. Assembler: In the early days, programmers wrote programs using 0's and 1's
and found it difficult to write programs in machine language. To overcome this
difficulty, assembly language was developed. It is a low-level programming
language that allows a user to write programs using letters and symbols
(mnemonics), which are more easily remembered. An assembler is a system
program that converts programs written in assembly language into machine
language, which can be executed by a computer.

ii. Loaders: A loader is the system program responsible for loading
programs and libraries into memory and preparing them for execution. The
assembler itself could place the object program directly in memory and transfer
control to it, after which the machine language program is executed. But this has
two disadvantages:
i. Wasted memory: the assembler itself occupies space in memory during execution.
ii. Wasted translation time: the program must be retranslated for each execution.
To avoid this, a new piece of system software called the loader was introduced.
If a program is very large, it is subdivided into smaller routines called subroutines.
The loader must perform the following four functions:
i. Allocation: allocate space in memory for the programs
ii. Linking: resolve symbolic references between object decks
iii. Relocation: adjust all address-dependent locations
iv. Loading: physically place the machine instructions and data into memory
Based on how these functions are carried out, loaders are divided into different types:
a) Compile and go b) Absolute loader c) Relocating loader d) Direct linking
e) Dynamic loading and linking.

iii. Macros: Sometimes programmers need to repeat identical parts of their
program. A macro is a facility that permits the programmer to define an
abbreviation for a part of the program and to use the abbreviation in the program.
The macro processor treats the identical parts defined by the abbreviation as a macro
definition and substitutes this definition for all occurrences of the abbreviation
(macro calls) in the program. In addition, macros are also used to specialize
operating systems.

iv. Compiler: As users started concentrating on problems in areas such as
science, business and statistics, high-level languages were developed to
express certain problems more easily. COBOL, FORTRAN, PASCAL, ALGOL, C,
etc. are high-level languages, which are processed by compilers and interpreters. A
compiler is a software program that converts a high-level language into machine
language, which can be executed by a computer.

v. Formal system: A formal system is an uninterpreted symbolic system whose
syntax is precisely defined. A formal system consists of an alphabet, a set of words
called axioms and a finite set of relations called rules of inference. Examples of
formal systems are set theory, Boolean algebra and Post systems.

vi. Operating system: An OS is a program that controls the execution of
application programs and acts as an interface between the user and the computer
hardware. The main functions of an operating system are: job sequencing, job
scheduling, input/output programming, secondary storage management, user
interface and error handling.

Components of a Programming System


System software is divided into the following major parts:

• Assembler
• Compilers
• Interpreters
• Loaders
• Linkers
• Macros
• Operating system

Assembler: An assembler is system software that converts programs written in
assembly language into machine language, which can be executed by a computer.

Source code written in assembly language -> ASSEMBLER -> Object code in machine-level language

Compiler: A compiler is system software that converts programs written in a high-
level language into machine language, which can be executed by a computer.

Source code written in high-level language -> COMPILER -> Object code in machine-level language

Interpreters: An interpreter is a program that converts a program written in a high-
level language into machine code line by line, i.e. one statement at a time.

Source code written in high-level language -> INTERPRETER -> Object code in machine-level language

Loaders: A loader is the system program responsible for loading programs
and libraries into memory and preparing them for execution.

Linkers: A linker is the system program that takes the object code generated
by the assembler or compiler and combines it to generate the executable
module.

Macros: A macro processor is a program that substitutes and specializes macro
definitions for macro calls.

Operating system: An operating system is the system program that acts as an
interface between the software and the computer hardware.

Formal systems: A formal system is an uninterpreted calculus. It consists of an
alphabet, a set of words called axioms, and a finite set of relations called rules of
inference. Examples of formal systems are set theory, Boolean algebra, Backus
normal form, etc. Formal systems are important in the design and complexity
studies of programming languages, and also in syntax-directed compilation.

Machine Level Language (MLL)

Machine level language is the lowest and most elementary level of programming
language. The computer can understand only strings of binary digits (bits), 0 and
1. A programming language written this way is called machine level language. Since a
computer is capable of recognizing electric signals, it understands machine
language. The symbol zero (0) stands for the absence of an electric pulse and the
symbol one (1) stands for the presence of an electric pulse.

The main advantages of MLL are:
i. Fast execution
ii. Efficient use of the computer
iii. No need for a translator
iv. Saves memory
The disadvantages of MLL are:
i. All memory addresses have to be remembered
ii. All operation codes have to be remembered
iii. The programmer must know the architecture of the system
iv. Machine code is difficult to read and write.

Assembly Level Language (ALL)

Assembly level language was developed to overcome the inconveniences of
machine language. It is a low-level but very important language in which
operation codes and operands are given in the form of alphanumeric symbols
instead of 0's and 1's. These alphanumeric symbols are known as mnemonic
codes and consist of at most five letters, e.g. ADD for
addition, SUB for subtraction, START, LABEL, etc. Because of this feature,
assembly language is also known as a "symbolic programming language".
The main advantages of ALL are:
i. It is mnemonic
ii. Assembly language is easier to understand
iii. It is easy to locate and correct errors
The disadvantages of ALL are:
i. It is also machine dependent/specific
ii. Since it is machine dependent, the programmer also needs to
understand the hardware
Pseudo-op: a pseudo-op is an assembly language instruction that specifies an operation of
the assembler itself. The different pseudo-ops are: USING, START, END, EQU, DC, DS,
DROP, LTORG.
Machine-op: a machine-op represents a machine instruction to the assembler. For example,
BALR is a machine-op that loads the proper value into the base register at execution time. The
different machine-ops include BALR, BR and BCT.
High Level Language (HLL)

A high-level language (HLL) is a programming language that enables a programmer to
write programs that are more or less independent of a particular type of
computer. Such languages are considered high-level because they are closer to
human languages and further from machine languages. Examples are C, C++,
Java, FORTRAN, etc.

Advantages of HLL: the main advantage of high-level languages over low-level
languages is that they are easier to read, write and maintain.

Ultimately, programs written in a high-level language must be translated
into machine language by a compiler or interpreter.

Difference between compiler and interpreter

The differences between a compiler and an interpreter are given below:

Compiler | Interpreter
Scans the entire program and translates it as a whole into machine code. | Translates the program one statement at a time.
Takes a large amount of time to analyse the source code, but the overall execution time is comparatively faster. | Takes less time to analyse the source code, but the overall execution time is slower.
Generates intermediate object code which further requires linking, hence requires more memory. | No intermediate object code is generated, hence it is memory efficient.
Compilation is done before execution. | Translation and execution take place simultaneously.
Displays all errors together after compilation. | Displays the errors of each line one by one.
C, C++, C#, Scala and TypeScript are typically compiled. | Shell scripts, PHP, Perl, Python and Ruby are typically interpreted; Java is compiled to bytecode, which the JVM then interprets.

Chapter-2 Assembler System Software
Chapter-2
Assembler
2.1 Introduction
An assembler is a system program that converts programs written in assembly language into
machine language, which can be executed by a computer.

Assembly-level language -> Assembler -> Machine-level language

Functions of assembler
• Takes an assembly language program as input and produces the corresponding machine instructions
• Converts each symbolic instruction into its machine instruction
• Decides the proper instruction format
• Converts data constants to their internal machine representations
• Writes the object program and the assembly listing
Types of Assembler
There are two types of assemblers, based on how many passes through the source are needed
(how many times the assembler reads the source) to produce the object file.
One-pass assemblers go through the source code once. Here the source program is translated
instruction by instruction. When an instruction references a label that is not yet defined, the
assembler leaves space for its address; when it later finds the declaration of the label, it fills
in the address by back patching.
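
Back patching in a one-pass assembler can be sketched in Python as follows. The toy source format (`LABEL: OPCODE OPERAND`, one word per statement) and the function name `assemble_one_pass` are assumptions made for illustration, not the format of any real assembler.

```python
def assemble_one_pass(lines):
    """Translate a toy program in one pass, back patching forward references."""
    symbols = {}   # label -> address
    pending = {}   # undefined label -> indexes of code entries waiting for it
    code = []      # each entry is [opcode, operand_address]

    for line in lines:
        text = line.strip()
        if ":" in text:
            label, text = text.split(":", 1)
            label = label.strip()
            symbols[label] = len(code)
            # The label is now declared: patch every earlier reference to it.
            for index in pending.pop(label, []):
                code[index][1] = symbols[label]
        parts = text.split()
        if not parts:
            continue
        opcode = parts[0]
        operand = parts[1] if len(parts) > 1 else None
        if operand is None:
            code.append([opcode, None])
        elif operand in symbols:
            code.append([opcode, symbols[operand]])
        else:
            # Forward reference: leave the address empty and remember the spot.
            pending.setdefault(operand, []).append(len(code))
            code.append([opcode, None])

    if pending:
        raise ValueError(f"undefined labels: {sorted(pending)}")
    return code


program = [
    "START: LOAD X",
    "JUMP DONE",
    "ADD X",
    "DONE: HALT",
    "X: DATA",
]
object_code = assemble_one_pass(program)
print(object_code)
```

Here `X` and `DONE` are both referenced before they are declared; their addresses (4 and 3) are filled in only when the declarations are reached.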

Multi-pass (typically two-pass) assemblers create a table with all symbols and their values in the
first pass, then use the table in later passes to generate code.

Forward referencing: Using an identifier before its declaration is called a forward reference,
i.e. a variable or label is referenced before it is declared. Forward references can be handled
by either a one-pass or a two-pass scheme.
In the one-pass scheme, the source program is translated instruction by instruction.
The assembler leaves address space for a label when it is referenced and, when it finds the
declaration of the label, fills in the address by back patching.
The two-pass scheme consists of two passes.
During the first pass, the symbol table, op-code table and label table are maintained.
• The op-code table stores each instruction's size and address.
• The label table stores each label and its address. When a label is first encountered, its name is
stored in the label table; when the label's declaration is found, its location is also stored in
the label table.
During the second pass, translation from source language to machine language takes place. Instruction
addresses and label addresses are taken from the tables and used in place of the names.
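
A minimal two-pass sketch in Python shows why forward references cause no difficulty when the label table is complete before translation starts. It uses an assumed toy source format (`LABEL: OPCODE OPERAND`, one word per statement); the function names are hypothetical.

```python
def split_statement(line):
    """Split a toy statement into (label or None, [opcode, operand...])."""
    text = line.strip()
    label = None
    if ":" in text:
        label, text = text.split(":", 1)
        label = label.strip()
    return label, text.split()


def pass_one(lines):
    """First pass: record the address of every label."""
    symbols = {}
    location = 0
    for line in lines:
        label, parts = split_statement(line)
        if label:
            symbols[label] = location
        if parts:
            location += 1   # every statement occupies one word in this toy
    return symbols


def pass_two(lines, symbols):
    """Second pass: replace label names with the addresses found in pass one."""
    code = []
    for line in lines:
        _, parts = split_statement(line)
        if not parts:
            continue
        operand = symbols[parts[1]] if len(parts) > 1 else None
        code.append((parts[0], operand))
    return code


program = ["START: LOAD X", "JUMP DONE", "ADD X", "DONE: HALT", "X: DATA"]
symbols = pass_one(program)
object_code = pass_two(program, symbols)
print(symbols)
print(object_code)
```

The cost of this simplicity is that the source has to be read twice.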

PASS: To build the object file, the assembler processes the assembly file one line at a
time. It translates what it can and keeps track of what it does not yet know. Processing the
entire source file once, from beginning to end, is called a pass.

2.2 Assembler Directive


Assembler directives are instructions that direct the assembler to do something. Some of the
assembler directives are listed below:
• EQU: Assigns a value to a symbol (same as =)
• DC: Defines a constant
• DS: Defines an amount of free space. This is used for allocating variable space.
• IF <expression>: Assembles code if the expression evaluates to TRUE.
• IFNOT <expression>: Assembles code if the expression evaluates to FALSE.
• ELSE: Assembles code if the preceding condition was not satisfied.
• ENDIF: Ends conditional evaluation.
• RESET: Sets the reset start address.
• LTORG: Begins the literal pool
• ORG: Sets the current origin to a new value
• PRINT: Sets some options for the assembly listing
• SPACE: Provides for line spacing in the assembler listing
• START: Defines the start of the first control section in a program
• END: Marks the end of the assembler module or control section
• TITLE: Provides a title at the top of each page of the assembler listing
• USING: Indicates the base registers to use in addressing

2.3 General Design Procedure: The designer should follow the five steps below.
i. Specify the problem statement
ii. Specify the data structures (databases)
iii. Define the format of the data structures
iv. Specify the algorithm
v. Look for modularity

2.3.1 Specify the problem statement:

The assembler must (i) generate instructions and (ii) process pseudo-ops. This sequence of
operations is divided into two passes:
Pass 1 - purpose: Define the symbols and literals
a. Determine the length of machine instructions (MOTGET1)
b. Keep track of the Location Counter (LC)
c. Remember the values of symbols until pass 2 (STSTO)
d. Process some pseudo-ops (POTGET1)
e. Remember the literals (LITSTO)
These steps are summarized in the diagram below.


Fig. Overview of Pass-1 Assembler


Pass 2 - purpose: Generate the object program
a. Look up the values of symbols (STGET)
b. Generate the instructions (MOTGET2)
c. Generate the data
d. Process pseudo-ops (POTGET2)
These steps are summarized in the following diagram.


Fig. Overview of Pass-2 Assembler

2.3.2 Specify the data structure


Pass 1: databases
i. Input source program
ii. Location Counter (LC): keeps track of the location of each instruction
iii. Machine Operation Table (MOT): stores the symbolic mnemonic of each
instruction and its length
iv. Pseudo Operation Table (POT): stores the symbolic mnemonic of each pseudo-op and the
action to be taken for it in pass 1
v. Symbol Table (ST): stores each symbol/label used in the program and its
corresponding value
vi. Literal Table (LT): stores each literal/constant used in the program and its
assigned location
vii. A copy of the input, to be used later by pass 2

Pass 2: databases
i. Copy of the source program input to pass 1
ii. LC: keeps track of the location of each instruction
iii. MOT: stores, for each instruction, the mnemonic, its length, binary opcode and
instruction format
iv. POT: stores each pseudo-op and the action to be taken for it in pass 2
v. ST: prepared by pass 1; contains each label and its value
vi. BT (Base Table): records which registers are currently specified as base registers by USING
pseudo-ops and the specified contents of these registers
vii. INST: a workspace used to hold each instruction while its various parts (e.g. binary
opcode, register fields, length fields) are being assembled
viii. Print line: a workspace used to produce the printed listing
ix. Punch card: a workspace used for converting the assembled instructions into the
format needed by the loader.

Fig. Databases used in Pass 1 and Pass 2 Assembler


2.3.3 Format of the data structure
The third step in the design procedure is to specify the format and content of each of the databases.
i. Machine Operation Table (MOT): this table contains the mnemonic name, length, binary opcode and
format of each instruction. Each MOT entry is 6 bytes long. The contents of this table are not
filled in or altered during the assembly process. Pass 1 and pass 2 use the same MOT.
The diagram below shows the format of the MOT.


ii. Pseudo Operation Table (POT): this table contains each pseudo-op and an associated pointer to
the routine that processes it. Each POT entry is 8 bytes long. The contents of this table
are not filled in or altered during the assembly process. Pass 1 and pass 2 use two
separate POTs. The diagram below shows the format of the POT.

iii. Symbol Table (ST) and Literal Table (LT): these tables contain, for each entry, its name, its
value, its length and a relative location indicator. Each ST entry is 14 bytes long.
The length field indicates the length (in bytes) of the instruction or data item to which the
symbol is attached. The relative location indicator tells the assembler whether the value of the
symbol is absolute or relative to the base of the program. The diagram below shows the format of
the Symbol Table and Literal Table.

iv. Base Table (BT): the base table records, for each register, whether it is currently available
as a base register and the relative address designated as its contents. Each BT entry is
4 bytes long.

2.3.4 Specify the algorithm


PASS1 – Assembler: Define Symbols and Literals: The purpose of pass 1 is to
assign a location to each instruction and data-defining pseudo-op, and thereby define values for
the symbols appearing in the label field of the source program.
Algorithm:
1. Initially set the Location Counter to zero, i.e. the first location in the program.
2. Next, read a source statement.
3. Examine the operation code field to determine its type:
a) If it is a pseudo-op:
i. If it is DC or DS, it affects both the LC and the definition of symbols in pass 1. Determine
the number of bytes of storage required, adjust the LC to the proper alignment before
defining the symbol, then update the LC by that length and save it.
ii. If it is EQU, define the symbol in the label field. This requires
evaluating the expression in the operand field.
iii. If it is USING or DROP, the assembler need only save the statement for pass 2; it does not
alter the location counter.
iv. If it is END, pass 1 is terminated.
b) If it is a machine opcode, determine the length of the instruction, then scan the operand
field for the presence of a literal. If a new literal is found, enter it into the LT.
Then, if a symbol is present in the label field, save it in the Symbol Table along with the
current value of the LC.
c) Update the LC by the length of the instruction and save a copy of the source statement for
pass 2.
4. Repeat these steps until the END pseudo-op is encountered.
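
The pass 1 steps can be sketched in Python as follows. This is a simplified illustration, not the full algorithm: statements are assumed to be pre-split `(label, op, operand)` tuples, only fullword (`F`) DC/DS operands are handled, literals are collected but not assigned locations, and the instruction lengths in `MOT_LENGTH` cover just a few IBM 360 mnemonics. All function and variable names are hypothetical.

```python
MOT_LENGTH = {"L": 4, "A": 4, "ST": 4, "AR": 2, "BALR": 2, "BR": 2}


def storage_size(operand):
    """Bytes needed by a DC/DS operand; only fullword (F) operands are handled."""
    if "F" not in operand:
        raise ValueError(f"unsupported DC/DS operand: {operand}")
    count_text = operand.split("F", 1)[0]          # "1F" -> "1", "F'5'" -> ""
    return 4 * (int(count_text) if count_text else 1)


def pass_one(statements):
    lc = 0                                         # step 1: LC starts at zero
    symbols, literals, copy_for_pass2 = {}, [], []
    for label, op, operand in statements:          # step 2: read a statement
        copy_for_pass2.append((label, op, operand))  # step 3c: keep a copy
        if op == "END":                            # step 3a-iv
            break
        if op == "EQU":                            # step 3a-ii
            symbols[label] = lc if operand == "*" else int(operand)
            continue
        if op == "START":
            lc, length = int(operand), 0
        elif op in ("USING", "DROP"):              # step 3a-iii: LC unchanged
            length = 0
        elif op in ("DC", "DS"):                   # step 3a-i
            lc = (lc + 3) // 4 * 4                 # align to a fullword first
            length = storage_size(operand)
        elif op in MOT_LENGTH:                     # step 3b
            length = MOT_LENGTH[op]
            last_field = operand.split(",")[-1]
            if last_field.startswith("=") and last_field not in literals:
                literals.append(last_field)
        else:
            raise ValueError(f"unknown operation: {op}")
        if label:
            symbols[label] = lc                    # define the symbol
        lc += length                               # step 3c: update the LC
    return symbols, literals, lc, copy_for_pass2


source = [
    ("JOHN", "START", "0"),
    (None, "USING", "*,15"),
    (None, "L", "1,FIVE"),
    (None, "A", "1,=F'7'"),
    (None, "ST", "1,TEMP"),
    ("FIVE", "DC", "F'5'"),
    ("TEMP", "DS", "1F"),
    (None, "END", None),
]
symbols, literals, final_lc, saved = pass_one(source)
print(symbols, literals, final_lc)
```

The three RX instructions occupy locations 0 to 11, so `FIVE` is defined at 12 and `TEMP` at 16.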

PASS2 – Assembler: Generate Code: The purpose of pass 2 is to generate the machine code
and structure the generated code into the appropriate format for the loader.
Algorithm
1. First, initialize the location counter to zero.
2. Read a statement from the copy of the source file saved by pass 1.
3. Process the opcode field:
a. If it is a machine opcode, look it up in the MOT to find the length, binary opcode and
format of the instruction. Then process the instruction according to
its format (RR, RX, RS, SI, SS).
b. If it is a pseudo-op:
i. If it is a USING or DROP pseudo-op, it requires additional
processing in pass 2 (the base table is updated).
ii. If it is a DC pseudo-op, convert the constant, output it and update the LC.
iii. If it is a DS pseudo-op, update the LC value.
iv. If it is a START or EQU pseudo-op, just print the card in the listing.
v. If it is an END pseudo-op, the end of the source program has been reached and the
assembly terminates.
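
The core of pass 2, turning a symbolic RX operand into a base register and displacement, can be sketched in Python. This is an illustrative simplification: only three RX instructions are handled, DC/DS accept only fullwords, the symbol table is passed in as if pass 1 had already produced it, and the names `pass_two` and `choose_base` are hypothetical. The binary opcodes 58, 5A and 50 (hex) are the IBM 360 opcodes for L, A and ST.

```python
MOT = {"L": (0x58, 4, "RX"), "A": (0x5A, 4, "RX"), "ST": (0x50, 4, "RX")}


def choose_base(base_table, address):
    """Pick a base register whose contents give a displacement in 0..4095."""
    candidates = [
        (address - contents, register)
        for register, contents in base_table.items()
        if 0 <= address - contents < 4096
    ]
    if not candidates:
        raise ValueError("no base register covers the address")
    displacement, register = min(candidates)
    return register, displacement


def pass_two(statements, symbols):
    lc = 0
    base_table = {}   # register -> contents promised by USING
    output = []       # (location, bytes) pairs for the loader
    for label, op, operand in statements:
        if op == "END":
            break
        if op == "START":
            lc = int(operand)
        elif op == "EQU":
            pass                                   # nothing to generate
        elif op == "USING":
            value, register = operand.split(",")
            base_table[int(register)] = lc if value == "*" else symbols[value]
        elif op == "DROP":
            base_table.pop(int(operand), None)
        elif op == "DS":
            lc = (lc + 3) // 4 * 4 + 4 * int(operand.rstrip("F") or 1)
        elif op == "DC":
            lc = (lc + 3) // 4 * 4
            constant = int(operand.split("'")[1])  # F'4' -> 4
            output.append((lc, constant.to_bytes(4, "big", signed=True)))
            lc += 4
        elif op in MOT:
            opcode, length, _format = MOT[op]
            r1, symbol = operand.split(",")
            base, displacement = choose_base(base_table, symbols[symbol])
            # RX layout: opcode(8) R1(4) X2(4) B2(4) D2(12); X2 is left as 0.
            word = (opcode << 24) | (int(r1) << 20) | (base << 12) | displacement
            output.append((lc, word.to_bytes(4, "big")))
            lc += length
        else:
            raise ValueError(f"unknown operation: {op}")
    return output


source = [
    ("JOHN", "START", "0"),
    (None, "USING", "*,15"),
    (None, "L", "1,FIVE"),
    (None, "A", "1,FOUR"),
    (None, "ST", "1,TEMP"),
    ("FOUR", "DC", "F'4'"),
    ("FIVE", "DC", "F'5'"),
    ("TEMP", "DS", "1F"),
    (None, "END", None),
]
symbols = {"JOHN": 0, "FOUR": 12, "FIVE": 16, "TEMP": 20}  # as left by pass 1
object_code = pass_two(source, symbols)
for location, data in object_code:
    print(location, data.hex())
```

With register 15 declared as the base register holding 0, `L 1,FIVE` is assembled as 5810F010 (hex): opcode 58, R1 = 1, X2 = 0, B2 = 15, D2 = 16.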


Fig. Detailed Pass1 Assembler Flowchart


Fig. Detailed Pass2 Assembler Flowchart

2.3.5 Look for Modularity

Chapter3:Macros System Software
Unit-3
Macro Language and Macro Processor
3.1 Macro instruction
Macro instructions are single-line abbreviations for groups of instructions. Advantages: the
frequent use of macros can reduce programmer-induced errors, and macros facilitate standardization.

Fig. Block diagram of a Macro Processor


Syntax:
MACRO           start of definition
[macro name]    macro name
------------
------------    sequence of instructions
MEND            end of macro definition
• The macro definition starts with the MACRO pseudo-opcode, which indicates the beginning of the
macro definition.
• The line following the MACRO pseudo-opcode gives the name of the macro.
• After that comes the sequence of instructions being abbreviated.
• The macro definition is terminated by the MEND pseudo-opcode.
For example, consider the following program:
:
:
A 1, DATA adds the contents of DATA to register 1
A 2, DATA adds the contents of DATA to register 2
A 3, DATA adds the contents of DATA to register 3
:
:
A 1, DATA adds the contents of DATA to register 1
A 2, DATA adds the contents of DATA to register 2
A 3, DATA adds the contents of DATA to register 3
:
:
In the above program, the sequence
A 1, DATA
A 2, DATA
A 3, DATA
occurs twice. A macro facility lets us attach a name to this sequence and use this name in
its place. The macro is defined as follows:
MACRO
INCR
A 1, DATA
A 2, DATA
A 3, DATA
MEND
where
• MACRO: beginning of the macro definition
• INCR: name of the macro
• MEND: end of the macro definition
• Between INCR and MEND we write the sequence of instructions that the macro
abbreviates

Macro call
An occurrence of the macro name in the source program is called a macro call. In other words,
once the macro has been defined, the use of the macro name in place of the sequence of
instructions is called a macro call or macro instruction.
For example, using the name INCR assigned to the repeated sequence, the program becomes:
:
:
INCR
:
:
INCR
:
:

Macro Expansion
When there is a macro call, the macro processor substitutes the macro definition in place of the
macro call; this is called macro expansion. In the expanded source code, MACRO, MEND and the
name of the macro do not appear. All the remaining lines appear.
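
Macro definition, call and expansion for the INCR example can be sketched in a few lines of Python. This is a minimal illustration for macros without arguments; the function name `expand_macros` and the extra statement `ST 1, TEMP` in the sample source are assumptions added for the example.

```python
def expand_macros(lines):
    """Collect MACRO...MEND definitions and expand every macro call."""
    definitions = {}   # macro name -> list of body lines
    output = []
    i = 0
    while i < len(lines):
        line = lines[i].strip()
        words = line.split()
        if line == "MACRO":
            name = lines[i + 1].split()[0]      # the line after MACRO names it
            i += 2
            body = []
            while lines[i].strip() != "MEND":   # copy the body up to MEND
                body.append(lines[i].strip())
                i += 1
            definitions[name] = body            # MACRO/MEND/name are not emitted
        elif words and words[0] in definitions:
            output.extend(definitions[words[0]])  # macro call: substitute body
        else:
            output.append(line)                 # ordinary statement
        i += 1
    return output


source = [
    "MACRO",
    "INCR",
    "A 1, DATA",
    "A 2, DATA",
    "A 3, DATA",
    "MEND",
    "INCR",
    "ST 1, TEMP",
    "INCR",
]
expanded = expand_macros(source)
print("\n".join(expanded))
```

Each of the two INCR calls is replaced by the three `A` instructions, and none of MACRO, MEND or INCR appears in the output.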


3.2 Features of Macro facility


The important features of macro are
1. Macro instruction arguments
2. Conditional macro expansion
3. Macro calls within macros
4. Macro instruction defining macro

3.2.1. Macro instruction arguments


This feature allows arguments (parameters) to be passed in macro calls, which makes
macros more flexible.
:
:
A 1, DATA1
A 2, DATA1
A 3, DATA1
:
:
A 1, DATA2
A 2, DATA2
A 3, DATA2

In the above program the two sequences of instructions are similar but not identical. The first
sequence performs its operations using DATA1 as the operand, and the second uses DATA2,
so the same operations are performed with different operands. Such an operand can be
made a parameter of the macro, known as a macro instruction argument or dummy
argument. It is specified on the macro name line, with an '&' as its first character.
The preceding program could be written as:


Source              Expanded source code

MACRO
INCR &arg
A 1, &arg
A 2, &arg
A 3, &arg
MEND
:                   :
INCR DATA1          A 1, DATA1
                    A 2, DATA1
                    A 3, DATA1
:                   :
INCR DATA2          A 1, DATA2
                    A 2, DATA2
                    A 3, DATA2
:                   :

It is possible to supply more than one argument in a macro call. The arguments are separated
by commas.

Source                        Expanded source code

MACRO
INCR &arg1, &arg2, &arg3
A 1, &arg1
A 2, &arg2
A 3, &arg3
MEND
:                             :
INCR DATA1, DATA2, DATA3      A 1, DATA1
                              A 2, DATA2
                              A 3, DATA3
:                             :
INCR DATA3, DATA2, DATA1      A 1, DATA3
                              A 2, DATA2
                              A 3, DATA1
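
Positional argument substitution can be sketched in Python as follows. The functions `define_macro` and `expand_call` are hypothetical names; the sketch assumes arguments are matched to dummy arguments purely by position.

```python
import re


def define_macro(prototype, body):
    """Split a prototype such as 'INCR &arg1, &arg2' into name and dummy arguments."""
    name, _, parameters = prototype.partition(" ")
    formals = [p.strip() for p in parameters.split(",") if p.strip()]
    return name, formals, body


def expand_call(call, macro):
    """Expand one macro call by substituting actual for dummy arguments."""
    name, formals, body = macro
    call_name, _, arguments = call.partition(" ")
    actuals = [a.strip() for a in arguments.split(",") if a.strip()]
    if call_name != name or len(actuals) != len(formals):
        raise ValueError(f"call does not match macro {name}: {call}")
    mapping = dict(zip(formals, actuals))   # first actual -> first dummy, etc.
    # Replace every whole &name token that is a dummy argument of this macro.
    return [
        re.sub(r"&\w+", lambda m: mapping.get(m.group(0), m.group(0)), line)
        for line in body
    ]


incr = define_macro(
    "INCR &arg1, &arg2, &arg3",
    ["A 1, &arg1", "A 2, &arg2", "A 3, &arg3"],
)
first = expand_call("INCR DATA1, DATA2, DATA3", incr)
second = expand_call("INCR DATA3, DATA2, DATA1", incr)
print(first)
print(second)
```

Matching whole `&name` tokens with a regular expression avoids accidentally rewriting part of a longer name, for example `&arg1` when a dummy argument `&arg` also exists.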
3.2.2 Conditional Macro Expansion
Conditional macro expansion allows the sequence of statements produced by a macro expansion to
be changed depending on conditions evaluated during expansion. Two macro pseudo-opcodes are
provided for this purpose: AIF and AGO.
(i) AIF: the conditional branch pseudo-opcode.
The general format is
AIF (<expression>) .<label name>
If the expression (condition) is true, control transfers to the statement carrying the label. If it is
false, the statement following the AIF is processed next. Labels used in these statements start
with a period (.) followed by the label name. AIF statements and these labels do not appear in the
expanded code.

(ii) AGO: the unconditional branch pseudo-opcode, also called the go-to statement.
The general format is
AGO .<sequence label>
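The effect of AIF and AGO during expansion can be sketched in Python. This is a simplified illustration under several assumptions: the macro VARY and its body are hypothetical, body lines are held as (label, opcode, operand) tuples, and only the relations EQ and NE are handled.

```python
def expand_conditional(body, args):
    """Expand a macro body containing AIF and AGO statements.

    body: list of (label, opcode, operand) tuples; args: dummy name -> actual value.
    """
    labels = {label: i for i, (label, _, _) in enumerate(body) if label.startswith(".")}
    names = sorted(args, key=len, reverse=True)   # longest first avoids partial matches
    expanded, pc = [], 0
    while pc < len(body):
        _, opcode, operand = body[pc]
        for name in names:
            operand = operand.replace(name, args[name])
        if opcode == "MEND":
            break
        if opcode == "AGO":                       # unconditional branch
            pc = labels[operand]
            continue
        if opcode == "AIF":                       # conditional branch
            condition, target = operand.rsplit(")", 1)
            left, relation, right = condition.lstrip("(").split()
            if relation not in ("EQ", "NE"):
                raise ValueError("only EQ and NE are supported in this sketch")
            taken = (left == right) if relation == "EQ" else (left != right)
            pc = labels[target.strip()] if taken else pc + 1
            continue
        expanded.append(f"{opcode} {operand}")    # ordinary line: goes to the output
        pc += 1
    return expanded


# Hypothetical macro: emits one, two or three A instructions depending on &COUNT.
VARY = [
    ("", "A", "1,&ARG1"),
    ("", "AIF", "(&COUNT EQ 1).FINI"),
    ("", "A", "2,&ARG2"),
    ("", "AIF", "(&COUNT EQ 2).FINI"),
    ("", "A", "3,&ARG3"),
    (".FINI", "MEND", ""),
]

actuals = {"&COUNT": "2", "&ARG1": "DATA1", "&ARG2": "DATA2", "&ARG3": "DATA3"}
print(expand_conditional(VARY, actuals))
```

With &COUNT equal to 2 the second AIF branches to .FINI, so only the first two A instructions are generated; the AIF lines and the .FINI label never reach the expanded code.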

3.2.3 Macro calls within macros


A macro call may appear inside the definition of another macro; this is called a macro call within
a macro. Such calls can be nested to several levels. For example, the definition of a macro ADDS
may contain three separate calls to a previously defined macro ADD1. Such use of macros results
in macro expansion on multiple levels.
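Multi-level expansion can be sketched with a small recursive function in Python. The names ADDS and ADD1 come from the text, but the body given to ADD1 here (load, add one, store) is an assumption made purely for illustration, as is the dictionary used to hold the definitions.

```python
MACROS = {
    # name: (dummy arguments, body lines); the ADD1 body is assumed for illustration
    "ADD1": (["&arg"], ["L 1, &arg", "A 1, =F'1'", "ST 1, &arg"]),
    "ADDS": (["&arg1", "&arg2", "&arg3"], ["ADD1 &arg1", "ADD1 &arg2", "ADD1 &arg3"]),
}


def expand(line, depth=0):
    """Expand one source line, re-expanding any macro calls found in the result."""
    if depth > 20:
        raise RecursionError("macro calls nested too deeply")
    parts = line.split(None, 1)
    if not parts or parts[0] not in MACROS:
        return [line]                               # ordinary statement
    params, body = MACROS[parts[0]]
    actuals = [a.strip() for a in parts[1].split(",")] if len(parts) > 1 else []
    pairs = sorted(zip(params, actuals), key=lambda p: len(p[0]), reverse=True)
    result = []
    for body_line in body:
        for dummy, actual in pairs:
            body_line = body_line.replace(dummy, actual)
        result.extend(expand(body_line, depth + 1))  # inner calls expand at the next level
    return result


for text in expand("ADDS DATA1, DATA2, DATA3"):
    print(text)
```

The single call to ADDS first becomes three ADD1 calls (level one), and each of those becomes three machine instructions (level two), giving nine lines in all.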


3.2.4 Macro Instruction Defining Macros


A macro definition may appear inside another macro definition; this is called a macro definition
within a macro. It is important to realize that the inner macro is not defined until the outer macro
has been called. Consider a macro instruction DEFINE which, when called with a subroutine
name, defines a macro with the same name as the subroutine.
The user calls this macro with the statement
DEFINE COS
which defines a new macro named COS. The user can subsequently call the COS macro as follows:
COS AR

3.3 Implementation of Macros


The macro processor takes as input an assembly language program that contains macro
definitions and macro calls. It transforms this into another program in which the macro definitions
have been removed and every macro call has been replaced by the corresponding macro body. The
output of the macro processor is an assembly language program containing no macros.

3.3.1 Statement of problem
A macro processor performs four basic tasks:
i. Recognize macro definitions
ii. Save the definitions
iii. Recognize calls
iv. Expand calls and substitute arguments

i. Recognize macro definitions: Macro definitions are identified by the MACRO and MEND
pseudo-opcodes. When macro definitions are nested, the macro processor must recognize the
nesting and correctly match the last (outer) MEND with the first MACRO.
ii. Save the definitions: The macro processor stores the macro instruction definitions, which it
will need for expanding macro calls.
iii. Recognize calls: It must recognize macro calls, which appear as operation mnemonics (i.e.,
the macro name).
iv. Expand calls and substitute arguments: The macro processor substitutes the corresponding
arguments from the macro call for the dummy arguments of the macro definition. The resulting
assembly language text is then substituted for the macro call.

3.3.2 Types of macro processors


Basically there are two types of macro processors, they are
i. One pass macro processor (Single pass processor)
ii. Two pass macro processor (Restricted facility)

3.4 Implementation of restricted facility


The following assumptions are made in implementing the two-pass macro processor:
• The macro processor is functionally independent of the assembler.
• The output of the macro processor is given as input to the assembler.
• Macro calls or definitions within macro definitions are not allowed, because of the
complications they introduce.
3.4.1 Functions of pass1
i. Examine every operation code
ii. Save all macro definitions in the MDT
iii. Save a copy of the input on secondary storage for pass2
iv. Create the MNT
Functions of pass2
i. Examine every operation mnemonic
ii. Replace each macro name with the appropriate statements from the macro definition
3.4.2 Specification of Database
The following databases are used by the two passes of macro processor:
Pass 1 database:
i. The input macro source deck

ii. Copy for use by pass2
iii. Macro definition table (MDT) used to store the body of the macro definitions.
iv. The macro name table (MNT), used to store the names of defined macros.
v. The macro definition table counter (MDTC), used to indicate next available entry in the
MDT
vi. The macro name table counter (MNTC), used to indicate next available entry in the MNT
vii. The argument list array (ALA), used for storing dummy arguments.

Pass2 database:
i. The copy of the input macro source deck
ii. The output expanded source deck to be used as input to the assembler
iii. Macro definition table(MDT), created by pass1
iv. The macro name table(MNT), created by pass1
v. The macro definition table pointer (MDTP), used to indicate the next line of text to be used
during macro expansion.
vi. The argument list array (ALA),used to substitute macro call arguments for the index
markers in the stored macro definition.

3.4.3 Specification of Database formats


The only databases with nontrivial format are the Macro Definition Table (MDT), Macro Name
Table (MNT) and Argument List Array (ALA).

Argument List Array (ALA): The ALA holds the details of the macro arguments. It is used in
both pass1 and pass2, but its role is reversed between the two passes.
In pass1, when macro definitions are stored, the dummy arguments in the definition are replaced
by index markers (#n), numbered by position in the argument list:
&LOOP1   INCR   &arg1, &arg2, &arg3
#0       A      1, #1
         A      2, #2
         A      3, #3
         MEND
In pass2, the arguments of the macro call are substituted for the index markers stored in the macro
definition.
For example, consider the macro call
LOOP1    INCR   DATA1, DATA2, DATA3
The macro call expander would prepare the following ALA:
Index    Argument
0        LOOP1
1        DATA1
2        DATA2
3        DATA3

Macro definition table (MDT): The MDT is used to store the body of the macro definitions. Each
MDT entry is 80 bytes long. Every line of a macro definition, except the MACRO line, is stored in
the MDT.

Macro name table (MNT): The MNT is used to store the names of the defined macros. Each MNT
entry consists of a macro name (8 bytes) and an index into the MDT (4 bytes), so each MNT entry
is 12 bytes long.
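The two opposite roles of the ALA can be sketched in Python. This is a minimal illustration, not the layout of the real tables: the function names are assumptions, and the 80-byte card format of the MDT is ignored.

```python
import re


def store_with_markers(name_line, body):
    """Pass 1 use of the ALA: replace dummy arguments by index markers (#n)."""
    tokens = name_line.replace(",", " ").split()
    dummies = [t for t in tokens if t.startswith("&")]
    # A label argument, if present, comes first and receives marker #0.
    first = 0 if tokens[0].startswith("&") else 1
    ala = {dummy: f"#{i}" for i, dummy in enumerate(dummies, first)}
    return [re.sub(r"&\w+", lambda m: ala.get(m.group(0), m.group(0)), line)
            for line in body]


def expand_with_arguments(stored, call_line, macro_name):
    """Pass 2 use of the ALA: replace index markers by the arguments of the call."""
    tokens = call_line.replace(",", " ").split()
    pos = tokens.index(macro_name)
    ala = [tokens[0] if pos == 1 else ""] + tokens[pos + 1:]   # entry 0 holds the label
    return [re.sub(r"#(\d+)", lambda m: ala[int(m.group(1))], line)
            for line in stored]


stored = store_with_markers("&LOOP1 INCR &arg1, &arg2, &arg3",
                            ["&LOOP1 A 1, &arg1", "A 2, &arg2", "A 3, &arg3"])
print(stored)
print(expand_with_arguments(stored, "LOOP1 INCR DATA1, DATA2, DATA3", "INCR"))
```

In pass 1 the ALA maps names to positions, and in pass 2 it maps positions to values, which is why the stored definition needs no knowledge of the original dummy argument names.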

Two pass macro processor algorithm


Algorithm of Pass1
Step 1: Initialize the macro definition table counter (MDTC) and the macro name table counter
(MNTC) to 1.
Step 2: Read one line from the source program.
Step 3.a: Check whether it is a MACRO pseudo-op. If it is, a macro definition has been
encountered, and the entire definition that follows the MACRO pseudo-op must be stored in the
macro definition table. To do so:
i) Read the next line from the source program (the macro name line)
ii) In the macro name table (MNT), enter the macro name and the current value of the macro
definition table counter (MDTC) in entry number MNTC
iii) Increment MNTC
iv) Prepare the argument list array (ALA)
v) Enter the macro name line in the macro definition table (MDT)
vi) Increment MDTC
vii) Read the next line from the source program
viii) Substitute index notation for the dummy arguments
ix) Enter the line into the MDT, with index markers in place of the arguments
x) Increment MDTC
xi) Check whether a MEND pseudo-op has been encountered
a) If it is a MEND pseudo-op, the definition is complete; go to step 2.
b) If it is not a MEND pseudo-op, go to step 3.a.vii.
Step 3.b: If it is not a MACRO pseudo-op:
i) Write a copy of the source card
ii) Check whether an END pseudo-op has been encountered
a) If it is an END pseudo-op, the end of the program has been reached and all the macro
definitions have been processed. Go to pass2 to process the macro calls.
b) If it is not an END pseudo-op, go to step 2.
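The steps of pass 1 can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the MNT is a dictionary and the MDT a list (so the counters MNTC and MDTC are implicit and indices start at 0 rather than 1), source lines are plain strings, and a MEND carrying a label is not recognised.

```python
import re


def pass1(source):
    """Return (mnt, mdt, copy): name table, definition table, input without definitions."""
    mnt, mdt, copy = {}, [], []
    lines = iter(source)
    for line in lines:                                # step 2: read one line
        if line.split()[:1] != ["MACRO"]:
            copy.append(line)                         # step 3.b.i: write copy of source card
            if line.split()[:1] == ["END"]:
                break                                 # step 3.b.ii.a: go to pass 2
            continue
        name_line = next(lines)                       # step 3.a.i: macro name line
        tokens = name_line.replace(",", " ").split()
        has_label = tokens[0].startswith("&")
        name = tokens[1] if has_label else tokens[0]
        dummies = [t for t in tokens if t.startswith("&")]
        ala = {d: f"#{i}" for i, d in enumerate(dummies, 0 if has_label else 1)}
        mnt[name] = len(mdt)                          # MNT entry: name and MDT index
        mdt.append(name_line)                         # name line goes into the MDT
        for body_line in lines:                       # steps vii to xi
            marked = re.sub(r"&\w+", lambda m: ala.get(m.group(0), m.group(0)), body_line)
            mdt.append(marked)
            if body_line.split()[:1] == ["MEND"]:
                break
    return mnt, mdt, copy


SOURCE = [
    "MACRO",
    "INCR &arg1, &arg2, &arg3",
    "A 1, &arg1",
    "A 2, &arg2",
    "A 3, &arg3",
    "MEND",
    "INCR DATA1, DATA2, DATA3",
    "END",
]
mnt, mdt, copy = pass1(SOURCE)
print(mnt)
print(mdt)
print(copy)
```

Note that the macro call INCR DATA1, DATA2, DATA3 is only copied in this pass; it is left for pass 2 to expand.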

flowchart of Pass1


Flowchart pass-2

Algorithm of Pass2

Step 1: Read the next line from the copy of the source program written by pass1.
Step 2.a: Search the macro name table (MNT) for a match with the operation code, i.e. check
whether a macro call has been encountered.
If a macro name is found:
i. Set the macro definition table pointer (MDTP) to the corresponding macro definition stored
in the MDT, by assigning the MDT index field of the MNT entry to MDTP
ii. Prepare the argument list array (ALA)
iii. Increment MDTP
iv. Get the line at MDTP from the MDT
v. Substitute the arguments from the macro call
vi. Check whether a MEND pseudo-op has been encountered
a) If it is a MEND pseudo-op, the macro expansion is complete; go to step 1 to continue
scanning the input file.
b) If it is not a MEND pseudo-op, write the expanded source card and go to step 2.a.iii.
Step 2.b: If a macro name is not found, write the line to the expanded source card file, then
check whether an END pseudo-op has been encountered.
i. If it is an END pseudo-op, pass the expanded source file to the assembler for further
processing.
ii. If it is not an END pseudo-op, go to step 1.
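The steps of pass 2 can be sketched in Python in the same spirit. The table contents below are assumed to be what a pass 1 would have produced for the INCR macro; as before, the MNT is a dictionary, the MDT is a 0-based list, and the names are illustrative only.

```python
import re


def pass2(copy, mnt, mdt):
    """Expand every macro call in the copy of the source written by pass 1."""
    expanded = []
    for line in copy:                                  # step 1: read next line
        tokens = line.replace(",", " ").split()
        # The operation code is the first token, or the second if a label is present.
        name = next((t for t in tokens[:2] if t in mnt), None)
        if name is None:                               # step 2.b: not a macro call
            expanded.append(line)
            if tokens[:1] == ["END"]:
                break
            continue
        pos = tokens.index(name)
        ala = [tokens[0] if pos == 1 else ""] + tokens[pos + 1:]   # step ii: prepare ALA
        mdtp = mnt[name] + 1                           # steps i and iii: skip the name line
        while mdt[mdtp].split()[-1:] != ["MEND"]:      # step vi: stop at MEND
            text = re.sub(r"#(\d+)", lambda m: ala[int(m.group(1))], mdt[mdtp])
            expanded.append(text)                      # write expanded source card
            mdtp += 1
    return expanded


# Tables as a pass 1 would leave them for the INCR macro (assumed contents).
MNT = {"INCR": 0}
MDT = ["INCR &arg1, &arg2, &arg3", "A 1, #1", "A 2, #2", "A 3, #3", "MEND"]

for text in pass2(["INCR DATA1, DATA2, DATA3", "END"], MNT, MDT):
    print(text)
```

The output contains only ordinary assembly statements, which is the form in which the expanded source deck is handed to the assembler.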

Example: implementation of the macro processor


MACRO
DEFINE   &SUB
MACRO
&SUB     &Y
CNOP     0,4
BAL      1,*+8
DC       A(&Y)
L        15,=V(&SUB)
BALR     14,15
MEND
MEND

MDT: macro definition table

Index    Card
1        DEFINE   &SUB
2        MACRO
3        #1       &Y
4        CNOP     0,4
5        BAL      1,*+8
6        DC       A(&Y)
7        L        15,=V(#1)
8        BALR     14,15
9        MEND
10       MEND

MNT: macro name table

Index    Name     MDT index
1        DEFINE   1

3.5 Single -Pass Macro Processor
A one-pass macro processor supports macro definitions within macros and reduces all macro
processing to a single pass. It also supports macro calls within macros (nested macros), so the
use of multiple macro processors can be avoided.
An inner macro is defined only after the outer one has been called. To provide for any use of the
inner macro with a two-pass design, the macro definition pass and the macro call pass would both
have to be repeated; a one-pass macro processor reduces this to a single pass.
A one-pass macro processor that alternates between macro definition and macro expansion is able
to handle a "macro in macro".
However, because of the one-pass structure, the definition of a macro must appear in the source
program before any statement that invokes that macro. This restriction is reasonable and does not
create any real inconvenience.
The data structures required to design a one-pass macro processor are:
• DEFTAB (Definition Table)
• NAMTAB (Name Table)
• ARGTAB (Argument Table)
DEFTAB (Definition Table): Stores the macro definition, including the macro prototype and the
macro body. Comment lines are omitted. References to the macro instruction parameters are
converted to a positional notation for efficiency in substituting arguments.
NAMTAB (Name Table): Stores the macro names and serves as an index to DEFTAB. Each entry
holds pointers to the beginning and the end of the macro definition in DEFTAB.
ARGTAB (Argument Table): Stores the arguments of a macro call according to their positions in
the argument list. As the macro is expanded, the arguments from ARGTAB are substituted for the
corresponding parameters in the macro body.
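A one-pass processor built on these three tables can be sketched in Python. This is a simplified illustration and not the GETLINE/PROCESSLINE structure itself: a stack of pending lines stands in for the EXPANDING flag, the notation ?n is an assumed positional notation, labels on macro calls are not handled, and only the macro body (not the prototype) is kept in DEFTAB.

```python
import re


def one_pass(source):
    """Expand macros in a single pass; every definition must precede its calls."""
    deftab, namtab, output = [], {}, []
    pending = list(reversed(source))              # stack of lines still to be processed

    while pending:
        line = pending.pop()
        tokens = line.replace(",", " ").split()
        if tokens[:1] == ["MACRO"]:               # DEFINE: enter the macro in DEFTAB
            prototype = pending.pop().replace(",", " ").split()
            name, params = prototype[0], prototype[1:]
            start, level = len(deftab), 1
            while level > 0:                      # level counts nested MACRO/MEND pairs
                body = pending.pop()
                op = body.split()[:1]
                level += (op == ["MACRO"]) - (op == ["MEND"])
                if level > 0:
                    for n, p in enumerate(params, 1):   # positional notation ?n
                        body = re.sub(re.escape(p) + r"\b", f"?{n}", body)
                    deftab.append(body)
            namtab[name] = (start, len(deftab))   # pointers to beginning and end
        elif tokens and tokens[0] in namtab:      # EXPAND: substitute from ARGTAB
            argtab = tokens[1:]
            start, end = namtab[tokens[0]]
            body = [re.sub(r"\?(\d+)", lambda m: argtab[int(m.group(1)) - 1], text)
                    for text in deftab[start:end]]
            pending.extend(reversed(body))        # expanded lines are processed next
        else:
            output.append(line)
    return output


SOURCE = [
    "MACRO", "DEFINE &SUB",
    "MACRO", "&SUB &Y",
    "CNOP 0,4", "BAL 1,*+8", "DC A(&Y)", "L 15,=V(&SUB)", "BALR 14,15",
    "MEND", "MEND",
    "DEFINE COS",
    "COS AR",
    "END",
]
for text in one_pass(SOURCE):
    print(text)
```

Because expanded lines are pushed back and processed like input, the call DEFINE COS produces a new MACRO ... MEND block that is then entered in DEFTAB, which is how the inner macro COS becomes defined only after the outer macro has been called.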

Comparison between One pass and Two pass Macro processor


Single pass
• every macro must be defined before it is called
• one-pass processor can alternate between macro definition and macro expansion
• nested macro definitions may be allowed but nested calls are not
Two pass algorithm
• Pass1: Recognize macro definitions
• Pass2: Recognize macro calls
• nested macro definitions are not allowed


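The figure below shows the different data structures described and their
relationship.

[Figure: structure of the one-pass macro processor. The main procedure MACRO PROCESSOR calls
GETLINE and PROCESSLINE with EXPANDING = FALSE. DEFINE calls GETLINE. EXPAND sets
EXPANDING = TRUE and calls GETLINE and PROCESSLINE. GETLINE reads from DEFTAB when
EXPANDING is TRUE and from the input when EXPANDING is FALSE.]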

3.6 Implementation of macro processor within assembler


The macro processor can be implemented within pass1 of the assembler, i.e. the assembler itself
expands the macros before translating the program to machine language. Implementing the macro
processor within pass1 eliminates the overhead of intermediate files, and integrating the macro
processor with the assembler improves efficiency by combining similar functions.
Advantages
• Many functions do not have to be implemented twice (e.g. reading a card, testing for statement
type).
• There is less overhead during processing.
• Functions are combined, and it is not necessary to create intermediate files as output from
the macro processor and input to the assembler.
• More flexibility is available to the programmer.

Disadvantages
• The combined program may become too large to fit into the core of some machines.
• The combined program becomes more complex.

NOTE:
A macro processor for assembly language can be implemented in several ways:
1. Independent two pass processor.
2. Independent one pass processor.
3. Processor incorporated into pass 1 of a standard two pass assembler.

Chapter-4 Loaders System Software

Loaders

Introduction
In a computer operating system, a loader is a component that locates
a given program (which can be an application or, in some cases, part of
the operating system itself) in secondary storage (such as a hard disk), loads it
into main storage (in a personal computer, random access memory) and gives
that program control of the computer (allows it to execute its instructions).

Loading is one of the essential stages in the process of starting a program.
In other words, a loader is a program that places programs into memory and
prepares them for execution.
Fig: Schematic flow of program execution (source program → translator → object
module → linker → binary program → loader → binary program in memory, which
reads data and produces results)


Definition of Loader
A loader is a system program that takes object code as input,
prepares it for execution and loads the executable code into memory.
The loader is thus responsible for initiating the execution of the program.

Functions of Loader
The loader is responsible for four activities: allocation, linking,
relocation and loading.
1) Allocation: It allocates space for the program in
memory, by calculating the size of the program.
2) Linking: It resolves the symbolic references (to code or data) between
object modules by assigning addresses to all user subroutines and library
subroutines.
3) Relocation: Some locations in the program are address dependent.
Such address constants must be adjusted according to the
allocated space.
4) Loading: Finally, it places the machine instructions and data of the
programs and subroutines into memory. The
program is then ready for execution.

Fig: Loading scheme (source program → assembler → object program → loader →
object program in memory, ready for execution)
LOADING SCHEMES
Based on the functions performed by the loader, there are various types
of loaders and loading schemes:
1) Compile and Go loader or Assemble and Go loader.
2) General loader scheme.
3) Absolute loader.
4) Direct linking loader.
5) Relocating loader.
6) Dynamic linking.
7) Dynamic loader.

1) "Compile and go" loader


• In this type of loader, the source program is read line by line.
• The machine code for each instruction is obtained and placed directly in main
memory at some known address. That is, the assembler runs in
one part of memory and the assembled machine instructions and
data are put directly into their assigned locations. After the assembly
process is complete, the starting address of the program is assigned to the
location counter.
• A typical example is WATFOR-77, a FORTRAN compiler that
uses this "compile, load and go" scheme. This loading scheme is also
called "Assemble and Go".

Fig: Compile and go loading scheme (source program → compile-and-go
translator → program loaded in memory)

Advantages
• This scheme is simple to implement, because the assembler is placed in
one part of memory and simply places the assembled machine
instructions directly into memory.
Disadvantages
• Some portion of memory is occupied by the assembler, which is
wasted as far as the running program is concerned. As this scheme combines
assembler and loader activities, the combined program
occupies a large block of memory.
• No .obj file is produced; the source code is converted directly
to executable form. Hence, even if the source has not been
modified, it must be assembled again each time it is
executed, which is time consuming.
• It cannot handle multiple source programs or programs
written in different languages, because the assembler
translates only one source language into the target language.
• It is very difficult for a programmer to write an orderly, modular
program, and such a program is difficult to maintain,
because the "compile and go" loader cannot handle multiple modules.
• The execution time is longer in this scheme, because the
program is assembled every time before it is executed.
2) General Loader Scheme

Fig: General loader scheme (source program 1 and source program 2 are each
translated into an object program; the loader places the object programs in
memory, ready for execution)


In this loader scheme, the source program is converted to an object
program by some translator (assembler). The loader accepts these object
modules and puts the machine instructions and data in executable form at
their assigned memory locations. The loader occupies some portion of main memory.
Advantages
• The program need not be retranslated each time it is run.
• Memory is not wasted, because the assembler is not resident in
memory at execution time.
• It is possible to write a program as multiple source programs, in
multiple languages.
3) Absolute Loader:

• An absolute loader is a kind of loader in which the object files are already
bound to absolute addresses; the loader accepts these files and places them at
the specified locations in memory. This type of loader is called absolute
because it needs no relocation information; the addresses are
obtained from the programmer or the assembler.
• The starting address of every module is known to the programmer and
is stored in the object file. The task of the loader
is therefore very simple: it places the executable form of the machine
instructions at the locations mentioned in the object file.
• In this scheme, the programmer or assembler must have knowledge
of memory management. The resolution of external references and the linking of
different subroutines are issues that must be handled by the
programmer.
• The programmer must take care of two things:
o First, the starting address of each module to
be used must be specified. If a module is modified, its
length may change, which changes the
starting address of the modules immediately following it. It is then the
programmer's duty to make the necessary changes to the starting
addresses of those modules.
o Second, when branching from one segment to another,
the absolute starting address of the target module must be
known to the programmer so that it can be
specified in the corresponding JMP instruction.

Fig: Absolute loader scheme


Some important points about the absolute loader scheme:
The absolute loader is simple to implement. In this scheme:
1. Allocation is done by either the programmer or the assembler.
2. Linking is done by the programmer or the assembler.
3. Relocation is done by the assembler.
4. Loading is done by the loader.

As the name suggests, no relocation information is needed by the loader; if
relocation is required at all, it is done by either the programmer or the
assembler.
DESIGN OF AN ABSOLUTE LOADER
To design an absolute loader, two types of cards are needed:
1. Text cards
2. Transfer cards

Text cards
• This type of card is used to store instructions and data.
• The capacity of a card is 80 bytes.
• It must convey the machine instructions that the assembler has created,
along with their assigned core locations.
The format of a text card is as follows:

Card column    Content
1              Card type = 0 (indicates a text card)
2              Count of the number of bytes of information
3-5            Address at which the information is to be loaded
6-7            Empty
8-72           Instructions and data to be loaded
73-80          Card sequence number

Transfer cards
These cards must convey the entry point of the program, which is where
the loader is to transfer control when all instructions have been loaded.

Card column    Content
1              Card type = 1 (indicates a transfer card)
2              Count = 0
3-5            Address of the entry point
6-72           Empty
73-80          Card sequence number

Advantages
1. It is simple to implement.
2. This scheme allows multiple programs, or source programs
written in different languages. If there are multiple programs written in
different languages, the respective language translator converts each one to
object code, and a common object file can be prepared with all
addresses resolved.
3. The task of the loader becomes simpler, as it simply obeys the instructions
regarding where to place the object code in main memory.
4. The process of execution is efficient.

Disadvantages
1. The programmer must adjust all inter-segment addresses in this scheme, so the
programmer must understand memory management and know the addresses of the
programs.
2. If any modification is done in one segment, the starting addresses of the
following segments also change.
3. If there are multiple segments, the programmer must
remember the addresses of all subroutines.

Fig: Absolute loader flowchart. Initialize; read a card; set CURLOC to the location
in characters 3-5; test the card type. For a text card, set LNG to the count in
character 2, move LNG bytes of text from characters 8-72 to location CURLOC, and
read the next card. For a transfer card, transfer to location CURLOC.


Algorithm for Absolute Loader:

Statement 1: Start.
Statement 2: Read the header record (the first record, or first line).
Statement 3: Obtain the program length.
Statement 4: Check whether the record is a text card or a transfer card.
If it is a text card, store the data and instructions.
Otherwise, transfer control to the entry point.
Statement 5: If the code is in character form, convert it to its internal
representation.
Statement 6: Read the next record of the object program.
Statement 7: End.
Or
Input: Object codes and the starting addresses of the program segments.
Output: Executable code for the corresponding source program, placed in
main memory.
Method:
Begin
For each program segment
do
Begin
Read the first line of the object module to obtain the memory
location. The starting address, say S, in the corresponding
object module is the memory location where the executable code is to
be placed.
Hence,
Memory location = S
Line counter = 1 (as it is the first line)
While (not end of file)
do
Begin
1. Read next line
2. Write line into location S
3. S = S + 1
4. Line counter = Line counter + 1
End
End
End
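The card-driven absolute loader can be sketched in Python. This is an illustration under stated assumptions: each card is modelled as an 80-byte string with the type in column 1, the count in column 2 as a binary byte, a 3-byte big-endian address in columns 3-5 and data from column 8 onward; the helper make_card and the bytearray used as memory are hypothetical.

```python
def make_card(card_type, address, data=b"", seq=0):
    """Build an 80-byte card: type, count, 3-byte address, data, sequence number."""
    if len(data) > 65:
        raise ValueError("columns 8-72 hold at most 65 bytes")
    card = bytearray(80)
    card[0] = card_type                               # column 1: 0 = text, 1 = transfer
    card[1] = len(data)                               # column 2: byte count
    card[2:5] = address.to_bytes(3, "big")            # columns 3-5: address
    card[7:7 + len(data)] = data                      # columns 8-72: instructions and data
    card[72:80] = f"{seq:08d}".encode("ascii")        # columns 73-80: sequence number
    return bytes(card)


def absolute_load(cards, memory):
    """Load text cards into memory and return the entry point from the transfer card."""
    for card in cards:
        curloc = int.from_bytes(card[2:5], "big")     # set CURLOC from columns 3-5
        if card[0] == 0:                              # text card
            lng = card[1]                             # set LNG from column 2
            memory[curloc:curloc + lng] = card[7:7 + lng]
        elif card[0] == 1:                            # transfer card
            return curloc                             # control would be transferred here
    raise ValueError("object deck has no transfer card")


memory = bytearray(256)
deck = [
    make_card(0, 16, b"\x01\x02\x03", seq=1),
    make_card(0, 32, b"\xaa", seq=2),
    make_card(1, 16, seq=3),
]
entry = absolute_load(deck, memory)
print(entry, memory[16:19].hex(), hex(memory[32]))
```

The loader performs no allocation, relocation or linking: every address it uses is read directly from the cards, which is what makes the scheme absolute.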

SUB-ROUTINE LINKAGES
• A main program may transfer control to a subprogram, and that
subprogram may in turn transfer to another program.
• The assembler does not know the addresses of such symbolic
references, so it would report an error.
To handle this situation, the assembler provides two pseudo-opcodes:
1. EXTRN
2. ENTRY
With these, the assembler informs the loader that these symbols may be referred to
by other programs.
EXTRN
• The EXTRN pseudo-opcode is used to maintain references
between two or more subroutines.
• The assembler pseudo-opcode EXTRN, followed by a list of symbols,
indicates that these symbols are defined in other programs but referenced in the
present program.
ENTRY
• The assembler pseudo-opcode ENTRY, followed by a list of symbols,
indicates that these symbols are defined in the present program and referenced in
other programs.
• The ENTRY pseudo-opcode is optional; it is used to define the entry
locations of subroutines.
Program Unit P
The ENTRY statement lists the public definitions of a program
unit (TOTAL). The EXTRN statement lists the symbols to which external
references are made in the program unit (MAX, ALPHA).


        START   500
        ENTRY   TOTAL
        EXTRN   MAX, ALPHA
        READ    N
LOOP
        MOVER   AREG, ALPHA
        BC      ANY, MAX
        .
        .
        .
        BC      LT, LOOP
        STOP
TOTAL   DS      1
        DS      1
        END

Program Unit Q

        START   200
        ENTRY   ALPHA
        .
ALPHA   DS      25
        END

Program unit P contains an external reference to the symbol ALPHA,
which is a public definition in program unit Q.
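How ENTRY and EXTRN lists let external references be resolved can be sketched in Python. The start addresses 500 and 200 come from the listings above, but the offsets of TOTAL and ALPHA within their units are assumed values chosen only for illustration, as is the dictionary layout.

```python
def resolve_externals(units):
    """Match each EXTRN symbol against the ENTRY (public) definitions of all units."""
    public = {}
    for unit in units:
        for symbol, offset in unit["entry"].items():
            public[symbol] = unit["start"] + offset   # address of the public definition
    resolved, unresolved = {}, []
    for unit in units:
        for symbol in unit["extrn"]:
            if symbol in public:
                resolved[(unit["name"], symbol)] = public[symbol]
            else:
                unresolved.append((unit["name"], symbol))
    return resolved, unresolved


# Offsets 40 and 31 are assumptions; only the START addresses appear in the text.
UNITS = [
    {"name": "P", "start": 500, "entry": {"TOTAL": 40}, "extrn": ["MAX", "ALPHA"]},
    {"name": "Q", "start": 200, "entry": {"ALPHA": 31}, "extrn": []},
]

resolved, unresolved = resolve_externals(UNITS)
print(resolved)      # ALPHA used in P resolves to Q's definition
print(unresolved)    # MAX is not defined by any unit given here
```

ALPHA is resolved because Q declares it in ENTRY, while MAX stays unresolved until a unit that defines it is added, which is the kind of error a linker reports.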
RELOCATING LOADERS: BSS [BINARY SYMBOLIC SUBROUTINE]
To avoid the reassembly that the absolute loader requires,
another type of loader, called the relocating loader, was introduced. The BSS [Binary
Symbolic Subroutine] loader is one example of a relocating loader. The following
are the features of relocating loaders:
• The BSS loader allows many procedure segments but only one data
segment.
• The assembler assembles each procedure segment independently
and then passes to the loader the text and the information about relocation and
inter-segment references.
• The output of a relocating assembler using the BSS scheme is the
object program and information about all the other programs it references. For
each source program, the assembler outputs a text prefixed by a transfer vector,
which consists of addresses containing the names of the subroutines referenced by
the source program.
• The assembler also provides the loader with additional
information: the length of the entire program and the length of the transfer
vector.

The BSS scheme uses the RX type instruction format, shown below:

Field            OP    R1    X2    A2
Length (bits)    8     4     4     16     (32 bits in total)

Fig: Relocating loader (figure labels: data segment, vector, procedure segment)

OP: Opcode
R1: Register 1
X2: Index register
A2: Address of operand
Relocation Bit & its Use
• Since the address portion of an instruction may need to be relocated,
the assembler associates a bit, called the relocation bit, with each instruction
or address field.
• If the relocation bit is 1, the corresponding address field must be relocated;
if it is 0, the field is not relocated.
• The relocation bits solve the problem of relocation, the
transfer vector solves the problem of linking, and the
program length information solves the problem of
allocation.
ADVANTAGES:
1. Reassembling is not necessary.
2. All the functions of the loader (allocation, relocation, linking and
loading) are performed by the BSS loader itself.
DISADVANTAGES:
1. The transfer vector increases the size of the object program in
memory.
2. There is no facility for accessing a common data segment.
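The use of relocation bits can be sketched in Python. This is a minimal illustration: the address fields are modelled as a plain list of integers with one relocation bit each, and the load address 1000 is an arbitrary assumed value.

```python
def relocate(address_fields, relocation_bits, load_address):
    """Add the load address to every field whose relocation bit is 1."""
    if len(address_fields) != len(relocation_bits):
        raise ValueError("each address field needs exactly one relocation bit")
    return [field + load_address if bit else field
            for field, bit in zip(address_fields, relocation_bits)]


# Fields assembled relative to 0; the third one is an absolute value (bit = 0).
fields = [0, 12, 5, 40]
bits = [1, 1, 0, 1]
print(relocate(fields, bits, 1000))
```

Only the fields flagged with 1 move with the program; the field with bit 0 (for example a constant or a register number) keeps its assembled value.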

DIRECT LINKING LOADER


The direct linking loader is a general relocating loader and is the most popular loading
scheme presently used.
The main difference between the two is that a relocating loader supports multiple
procedure segments but only one data segment, whereas a direct linking loader
supports multiple procedure segments and also multiple data segments.
The loader does not have direct access to the source code, so the
assembler must give the following information to the loader:
• The length of the object code segment.
• The list of all symbols that are not defined in the current
segment but are used in it.
• The list of all symbols that are defined in the current segment
but may be referred to by other segments.

The symbols that are not defined in the current segment but are
used in it are stored in a data structure called the
USE table. The symbols that are defined in the current segment
and may be referred to by other segments are stored in a data structure
called the DEFINITION table.

Fig: Process of linking a program (the output of the translator is given to the
linker; the output of the linker is given to the loader, which loads it into memory)

DIRECT LINKING LOADER CARDS


In the design of a direct linking loader, the assembler produces four types of
cards in the object deck:
1. ESD - External symbol dictionary.
2. TXT card.
3. RLD - Relocation and linking directory.
4. END card.

1. ESD - External symbol dictionary

ESD cards contain information about all symbols that are defined in this program
but may be referenced elsewhere, and all symbols that are referenced in this
program but defined elsewhere. ESD entries are classified into three types,
identified by mnemonics:
1. SD: Segment Definition.
2. LD: Local Definition (symbols named in an ENTRY pseudo-op).
3. ER: External Reference (symbols named in an EXTRN pseudo-op).
Each ESD entry contains:
1. Reference number.
2. Symbol name.
3. Type and ID.
4. Relative location.
5. Length.

2. TXT card
It contains the actual text, i.e. the instructions and data that have already been
translated.

3. RLD - Relocation and linking directory

• This card contains information about the locations in the program whose
contents depend on the address at which the program is placed.
• Each RLD entry contains:
1. Reference number.
2. Symbol.
3. Flag.
4. Length.
5. Relative location.

4. END card
• It indicates the end of the object program.
• Each of the above four cards is 80 bytes long.
DESIGN OF DIRECT LINKING LOADER

Fig: Designing of direct linking loader



Source card    Relative    Sample program (source deck)
reference      address

1              0           PG1       START
2                                    ENTRY   PG1ENT1, PG1ENT2
3                                    EXTRN   PG2ENT1, PG2
4              20          PG1ENT1   _____
5              30          PG1ENT2   _____
6              40                    DC      A(PG1ENT1)
7              44                    DC      A(PG1ENT2+15)
8              48                    DC      A(PG1ENT2-PG1ENT1-3)
9              52                    DC      A(PG2)
10             56                    DC      A(PG2ENT1+PG2-PG1ENT1+4)
11                                   END

12             0           PG2       START
13                                   ENTRY   PG2ENT1
14                                   EXTRN   PG1ENT1, PG1ENT2
15             16          PG2ENT1   _____
16             24                    DC      A(PG1ENT1)
17             28                    DC      A(PG1ENT2+15)
18             32                    DC      A(PG1ENT2-PG1ENT1-3)
19                                   END

ESD Cards
The ESD cards contain the information necessary to build the
external symbol dictionary, or symbol table. In the above source code the
external symbols are PG1, PG1ENT1, PG1ENT2, PG2 and PG2ENT1.

Format of ESD Card for PG1:

Source card    Name       Type    ID    Relative    Length
reference                               address
1              PG1        SD      01    0           60
2              PG1ENT1    LD      -     20          -
2              PG1ENT2    LD      -     30          -
3              PG2        ER      02    -           -
3              PG2ENT1    ER      03    -           -
Fig: Format of ESD Card
Explanation
PG1 is the segment definition, i.e. the header of program 1, so its type is SD.
PG1ENT1 and PG1ENT2 are local definitions of program 1 (named in ENTRY), so
their type is LD. PG2 and PG2ENT1 are named in the EXTRN pseudo-op, so their
type is ER. Each SD and ER symbol is assigned a unique ID, which the RLD cards
use to refer to it.
TXT card for PG1:
The format of the card is:

Source card    Relative    Content    Comments
reference      address
6              40-43       20
7              44-47       45         = 30 + 15
8              48-51       7          = 30 - 20 - 3
9              52-55       0          Unknown to PG1
10             56-59       -16        = 0 + 0 - 20 + 4

6 = A(PG1ENT1) = 20
7 = A(PG1ENT2+15) = 30 + 15 = 45
8 = A(PG1ENT2-PG1ENT1-3) = 30 - 20 - 3 = 7
9 = A(PG2) = 0 (PG2 is unknown to PG1)
10 = A(PG2ENT1+PG2-PG1ENT1+4) = 0 + 0 - 20 + 4 = -16
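The arithmetic behind the TXT contents can be checked with a short Python sketch. It is only an illustration: the function name adcon_value is an assumption, and the expressions are limited to sums and differences of symbols and integers, as in the sample program.

```python
import re


def adcon_value(expression, local_symbols):
    """Evaluate an address constant; symbols unknown to this segment count as 0."""
    total = 0
    for sign, term in re.findall(r"([+-]?)(\w+)", expression):
        value = int(term) if term.isdigit() else local_symbols.get(term, 0)
        total += -value if sign == "-" else value
    return total


# Relative addresses of the symbols defined inside PG1.
PG1_SYMBOLS = {"PG1": 0, "PG1ENT1": 20, "PG1ENT2": 30}

for expression in ["PG1ENT1", "PG1ENT2+15", "PG1ENT2-PG1ENT1-3",
                   "PG2", "PG2ENT1+PG2-PG1ENT1+4"]:
    print(expression, "=", adcon_value(expression, PG1_SYMBOLS))
```

The external symbols PG2 and PG2ENT1 contribute 0 at assembly time; their real addresses are added later by the loader, using the RLD cards.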

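RLD Card Format for PG1

Source card    ESD ID    Length     Flag      Relative
reference                [bytes]    + or -    address
6              01        4          +         40
7              01        4          +         44
9              02        4          +         52
10             03        4          +         56
10             02        4          +         56
10             01        4          -         56
Fig: Format of RLD card
Here ID 01 denotes PG1, ID 02 denotes PG2 and ID 03 denotes PG2ENT1. Cards 6 and 7
refer to local symbols, so they are relocated relative to the segment PG1 (ID 01).
Card 8 needs no RLD entry, because the difference of two symbols in the same
segment does not depend on the load address.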

ESD Card for program 2 [PG2]

Source card    Name       Type    ID    Relative    Length
reference                               address
12             PG2        SD      01    0           36
13             PG2ENT1    LD      -     16          -
14             PG1ENT1    ER      02    -           -
14             PG1ENT2    ER      03    -           -

Text Card for PG2

Source card   Relative
reference     address    Content
16            24-27      0
17            28-31      15
18            32-35      -3

16 = A(PG1ENT1)           = 0
17 = A(PG1ENT2+15)        = 0 + 15 = 15
18 = A(PG1ENT2-PG1ENT1-3) = 0 - 0 - 3 = -3

RLD Card for PG2

Source card             Length     Flag      Relative
reference     ESD ID    [bytes]    + or -    address
16            02        4          +         24
17            03        4          +         28
18            03        4          +         32
18            02        4          -         32

Here ESD ID 02 refers to PG1ENT1 and ID 03 to PG1ENT2, the two external
references of PG2.

SPECIFICATION OF DATA STRUCTURE IN DIRECT LINKING LOADER

Pass 1 databases:
1. Input object decks.
2. The initial program load address [IPLA]: supplied by the
programmer or the operating system, it specifies the address at which
to load the first segment.
3. Program load address counter [PLA]: It is used to keep track of each
segment's assigned location.
4. Global external symbol table [GEST]: It is used to store each external
symbol and its corresponding assigned core address.
5. A copy of the input, to be used later by pass 2.
6. A printed listing (the load map) that specifies each external symbol and its
assigned value.

Pass 2 databases:
1. The copy of the object program prepared by pass 1, which is the input to pass 2.
2. The initial program load address [IPLA].
3. The program load address counter [PLA].
4. The global external symbol table [GEST].
5. The local external symbol array [LESA]: It is used to establish a
correspondence between the ESD ID numbers used on ESD and RLD
cards and the absolute address values of the corresponding external
symbols.

FORMAT OF DATA BASES

Object decks:

ESD Card format:

Source card                             Relative
reference     Name      Type    ID      address    Length

Type    Hexadecimal code
SD      01
LD      02
ER      03

TXT Card format:

Source card   Relative
reference     address    Content

RLD Card format:

Source card             Length     Flag      Relative
reference     ESD ID    [bytes]    + or -    address

Note: The length of each card is 80 bytes.


GLOBAL EXTERNAL SYMBOL TABLE [GEST]:

It is used to store each external symbol and its corresponding assigned
core address.

External symbol           Assigned core address
[8 bytes, character]      [4 bytes, decimal]
"PG1bbbbb"                104
"PG1ENT1b"                124

(b denotes a blank; names are padded to 8 characters.)

LOCAL EXTERNAL SYMBOL ARRAY [LESA]

i. The external symbols are used for relocation and linking.
ii. An RLD card identifies the symbol it needs by means of an ID number
rather than by the symbol's name.
iii. The ID number must match an SD or ER entry on an ESD card.

The LESA is indexed by ID number; each entry holds:

Assigned core address of
corresponding symbol [4 bytes]
104
124
134
....
....
Looking a value up by ID instead of by name saves space and also increases
the processing speed.


Fig: Different databases used by the loader. Pass 1 reads the input object
deck and writes a copy of the object deck for pass 2; pass 2 places the
program in memory. Pass 1 uses the IPLA, the PLA and the GEST and prints the
load map; pass 2 uses the IPLA, the PLA, the GEST and the LESA.

PASS 1: Algorithm

1. The program load address [PLA] is set to the initial program load
address [IPLA].
2. Read an object card.
3. Write a copy of the card for pass 2.
4. The card can be any one of the following types:
• Text card or RLD card: no processing is required during pass 1;
read the next card.
• ESD card: it is processed according to the type of the external symbol.
5. SD: the length field LENGTH from the card is temporarily saved
in the variable SLENGTH.
• The value, VALUE, to assign to this symbol is set to the current value of the
PLA.
• The symbol and its assigned value are then stored in the GEST.
• If the symbol already exists in the GEST, this is an error.
• The symbol and its value are printed as part of the load map.
6. LD: the value to be assigned is set to the current PLA plus the relative
address [ADDR] given on the ESD card. The symbol and its value are stored
in the GEST in the same way as for an SD.
7. ER symbols do not require any processing during pass 1.
8. When an END card is encountered, the program load address is
incremented by the length of the segment saved in SLENGTH.
9. When the EOF card is read, pass 1 is complete and control transfers to pass 2.
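
The pass 1 steps can be sketched as follows. This is a minimal illustration in Python, not the loader's actual implementation: the dictionary layout of a card, the helper `esd` and the choice of IPLA = 104 (taken to agree with the GEST example, where PG1 is at 104) are assumptions. Writing the copy for pass 2 and printing the load map are left out.

```python
def pass1(cards, ipla):
    """Build the GEST from a list of object cards (dicts, assumed layout)."""
    pla = ipla        # program load address counter
    slength = 0       # length of the segment currently being read
    gest = {}         # global external symbol table: name -> core address
    for card in cards:
        kind = card["card"]
        if kind == "ESD":
            if card["type"] == "SD":
                slength = card["length"]
                value = pla
            elif card["type"] == "LD":
                value = pla + card["addr"]
            else:
                continue              # ER: no processing in pass 1
            if card["name"] in gest:
                raise ValueError("duplicate symbol " + card["name"])
            gest[card["name"]] = value
        elif kind == "END":
            pla += slength            # next segment starts after this one
        # TXT and RLD cards need no processing in pass 1
    return gest

def esd(name, typ, addr=0, length=0):
    """Hypothetical helper that builds one ESD card."""
    return {"card": "ESD", "name": name, "type": typ,
            "addr": addr, "length": length}

deck = [
    esd("PG1", "SD", 0, 60), esd("PG1ENT1", "LD", 20), esd("PG1ENT2", "LD", 30),
    esd("PG2", "ER"), esd("PG2ENT1", "ER"), {"card": "END"},
    esd("PG2", "SD", 0, 36), esd("PG2ENT1", "LD", 16),
    esd("PG1ENT1", "ER"), esd("PG1ENT2", "ER"), {"card": "END"},
]

gest = pass1(deck, ipla=104)
print(gest)
```

With these assumptions PG1 lands at 104 and PG1ENT1 at 124, and PG2 starts at 164 because PG1 is 60 bytes long.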
PASS 2: Algorithm
At the beginning of pass 2 the program load address is initialized as in pass 1
and the execution start address [EXADDR] is set to IPLA. If an address is
specified on an END card, that address is used as the execution start
address; otherwise execution begins at the first segment.
In pass 2 the cards are read one by one and processed according to their type:
1. ESD card
• SD type: the length of the segment is temporarily saved in the variable
SLENGTH. LESA[ID] is set to the current value of the PLA.
• LD type: does not require any processing during pass 2.
• ER type: the GEST is searched for a match with the ER symbol. If it is
not found, this is an error. If it is found, its value is
extracted and the corresponding LESA entry is set.
2. Text card: The text is copied from the card to the appropriate
relocated memory location [PLA+ADDR].
3. RLD card: The value to be used for relocation and linking is extracted
from the LESA as specified by the ID field. Depending on the flag, the
value is either added to or subtracted from the address constant.
4. END card: If an execution start address is specified on the END card, it
is saved in the variable EXADDR. The PLA is incremented by the length of the
segment saved in SLENGTH, becoming the PLA for the next segment
[PLA = PLA + SLENGTH].
5. EOF card: the loader transfers control to the loaded program at the
address specified by the contents of the execution address variable
[EXADDR].
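
The pass 2 steps can be sketched in the same style. Again this is an illustrative Python sketch under stated assumptions: the card layout and the helpers `esd`, `txt` and `rld` are hypothetical, memory is modelled as a dictionary from absolute address to a 4-byte value, IPLA is taken as 104, and the ESD IDs are assumed to be 01 for the segment itself and 02, 03 for its external references. The GEST is the one pass 1 would produce for that IPLA.

```python
def pass2(cards, gest, ipla):
    """Relocate and link the segments; returns (memory, exaddr)."""
    pla, exaddr, slength = ipla, ipla, 0
    lesa = {}         # local external symbol array: ESD ID -> absolute address
    memory = {}       # absolute address -> value of the 4-byte word there
    for card in cards:
        kind = card["card"]
        if kind == "ESD":
            if card["type"] == "SD":
                slength = card["length"]
                lesa[card["id"]] = pla
            elif card["type"] == "ER":
                if card["name"] not in gest:
                    raise KeyError("undefined symbol " + card["name"])
                lesa[card["id"]] = gest[card["name"]]
            # LD cards need no processing in pass 2
        elif kind == "TXT":
            memory[pla + card["addr"]] = card["content"]
        elif kind == "RLD":
            value = lesa[card["id"]]
            addr = pla + card["addr"]
            delta = value if card["flag"] == "+" else -value
            memory[addr] = memory.get(addr, 0) + delta
        elif kind == "END":
            if "start" in card:               # assumed: relative start address
                exaddr = pla + card["start"]
            pla += slength
            lesa = {}                         # IDs are local to one segment
    return memory, exaddr

def esd(name, typ, esd_id=None, length=0):
    return {"card": "ESD", "name": name, "type": typ, "id": esd_id, "length": length}

def txt(addr, content):
    return {"card": "TXT", "addr": addr, "content": content}

def rld(esd_id, flag, addr):
    return {"card": "RLD", "id": esd_id, "flag": flag, "addr": addr}

deck = [
    esd("PG1", "SD", 1, 60), esd("PG2", "ER", 2), esd("PG2ENT1", "ER", 3),
    txt(40, 20), txt(44, 45), txt(48, 7), txt(52, 0), txt(56, -16),
    rld(1, "+", 40), rld(1, "+", 44), rld(2, "+", 52),
    rld(3, "+", 56), rld(2, "+", 56), rld(1, "-", 56), {"card": "END"},
    esd("PG2", "SD", 1, 36), esd("PG1ENT1", "ER", 2), esd("PG1ENT2", "ER", 3),
    txt(24, 0), txt(28, 15), txt(32, -3),
    rld(2, "+", 24), rld(3, "+", 28), rld(3, "+", 32), rld(2, "-", 32),
    {"card": "END"},
]
gest = {"PG1": 104, "PG1ENT1": 124, "PG1ENT2": 134, "PG2": 164, "PG2ENT1": 180}
memory, exaddr = pass2(deck, gest, ipla=104)
print(memory, exaddr)
```

For instance, the word for card 10 of PG1 ends up holding -16 + 180 + 164 - 104 = 224, which equals PG2ENT1 + PG2 - PG1ENT1 + 4 computed from the absolute addresses.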
PASS 1: FLOWCHART


PASS 2: FLOWCHART


OTHER LOADING SCHEMES

Binders:
In order to avoid the disadvantages of the direct linking loader, another
loader scheme divides the loading process into 2
separate programs:
I. A binder
II. A module loader
I. Binder: A binder is a program that performs the same functions as the
direct linking loader in binding subroutines together, but it outputs the
text as a file or card deck rather than placing the relocated and linked
text directly into memory. The output file is in a format ready to be
loaded and is called a load module.
II. Module loader: A module loader is a program that merely loads the
load module into memory. The binder performs the functions of
allocation, relocation and linking. The module loader performs the
function of loading.
There are 2 major classes of binders:
• Core image builder
• Linkage editor
(I) Core image builder
A specific memory allocation for the program is performed at the time
that the subroutines are bound together. The result is called a core image
module and the corresponding binder is called a core image builder.
Advantages:
• Simple to implement.
• Fast to execute.
Disadvantages:
• Difficult to allocate and load the program.
(II) Linkage editor
The linkage editor keeps track of the relocation information so that
the resulting load module can be relocated further. In that case the
module loader must perform additional allocation and relocation as well
as loading, but it does not have to deal with the problem of linking.

• A linking loader performs all linking and relocation operations and
loads the linked program directly into memory for execution.
• A linkage editor produces a linked version of the program (load
module or executable image), which is written to a file or library for
later execution.
Advantages:
• More flexible allocation and loading scheme.
Disadvantages:
• The implementation is complex.

DYNAMIC LOADING
In all the loader schemes so far we have assumed that all of the subroutines
needed are loaded into memory at the same time.
If the total amount of memory required by all these subroutines exceeds
the amount available, as is common for large programs, they cannot all be
loaded into memory at the same time. There are several
hardware techniques to solve this problem, such as paging and segmentation.
Usually the subroutines of a program are needed at different times, so a dynamic
loading scheme is used.
"The mechanism of loading a required routine or part of the program
into main memory only when it is called by the program is
called Dynamic Loading." It reduces the memory the program needs at any one time.
Example
Pass 1 and Pass 2 of an assembler are mutually exclusive: they are never
needed in memory at the same time. By explicitly recognizing which
subroutines call which other subroutines, it is
possible to produce an overlay structure that identifies the mutually
exclusive subroutines.

A(20K)
B(20K)
C(30K)
D(10K)
E(20K)
Fig 1: Subroutine calls between the procedures
The figure above illustrates a program consisting of five subprograms [A to E]
that together require 100K bytes of memory.
The arrows indicate that
• subprogram A calls only B, D and E
• subprogram B calls C and E
• subprogram D calls only E and
• subprograms C and E do not call any other subprograms or
routines.
[ Note that procedures B and D are never in use at the same time. ]

Overlay loading scheme


"The process of transferring a block of program code or other data into
internal memory, replacing what is already stored" is called overlay.
When a program is larger than the biggest memory partition available,
overlays can be used to run it.
An overlay is thus a technique for running a program that is bigger than the
physical memory by keeping in memory only those instructions and data that are
needed at any given time. The program is divided into modules in such a way
that not all modules need to be in memory at the same time.
An overlay structure identifies mutually exclusive subroutines by
explicitly recognizing which subroutines call which other subroutines. The
example (Fig 1) shows the interdependencies between the procedures. If
we load only those procedures that are actually to be used at any
particular time, the amount of memory needed is equal to the longest path of
the overlay structure, which here is 70K (A, B and C). The portion of the
loader that actually intercepts the procedure calls and loads the necessary
procedure is called the overlay supervisor or flipper.

A(20K)

B(20K) D(10K)

C(30K) E(20K)
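
The 70K figure can be verified with a small sketch. It is an illustration in Python; the names `size`, `calls` and `longest_path` are assumptions, and the sizes and call relationships are those of Fig 1.

```python
# Sizes in K bytes and the call relationships from Fig 1.
size = {"A": 20, "B": 20, "C": 30, "D": 10, "E": 20}
calls = {"A": ["B", "D", "E"], "B": ["C", "E"], "D": ["E"], "C": [], "E": []}

def longest_path(proc):
    """Memory needed while `proc` and its largest chain of callees are resident."""
    return size[proc] + max((longest_path(c) for c in calls[proc]), default=0)

total = sum(size.values())      # everything loaded at once
needed = longest_path("A")      # only one call chain loaded at a time

print(total, needed)
```

The largest chain is A, B, C (20 + 20 + 30), so the overlay scheme needs 70K instead of the 100K required to load all five procedures together.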

DYNAMIC LINKING

A major disadvantage of all the previous loading schemes is that if a
subroutine is referenced but never executed, the loader still incurs
the overhead of linking it. In dynamic linking the
loading and linking of external references are postponed until execution
time. The loader loads only the main program. If the main program
executes a transfer instruction to an external address or refers to an
external variable, the loader is called. Only then is the segment
containing the external reference loaded.

"The mechanism of linking code into a form that is loadable by programs at
run time is called Dynamic linking."

Advantages
1. No overhead is incurred unless the procedure to be called or
referenced is actually used.
2. The system can be dynamically reconfigured.
3. Saves memory.
Disadvantages:
1. More complex, because most of the binding process is postponed until
program execution time.

Some of one mark questions

1. What is a binder?
A program that performs allocation, relocation and linking is called a binder.
2. What is an overlay structure?
The interdependency of the segments can be specified by a tree-like
structure called a static overlay structure.
3. What is dynamic loading?
Dynamic loading is the process in which a routine or shared library is
attached to the address space of the process during execution.
4. What is a direct linking loader?
A direct linking loader is a general relocating loader that allows the
programmer to use multiple procedure segments and multiple data segments.
5. What is a relocating loader?
A relocating loader loads the program anywhere in memory, altering
the various addresses as required to ensure correct referencing.



Chapter-5 Compiler System Programming

COMPILER

A compiler is a special program that processes statements written in a
particular programming language and turns them into machine language or
"code" that a computer's processor uses.

TYPES OF COMPILER
The various types of compiler are:
• Cross compiler
• One-pass or multi-pass compiler
• Source-to-source compiler
• Stage compiler
• Just-in-time compiler

1. Cross Compiler
A cross compiler is a compiler that runs on one platform and creates
executable code for another platform. It separates the build environment from
the target environment and is useful in a number of ways, such as compiling
for embedded systems and compiling an operating system for a new platform
for the first time.
Uses of cross compilers
• Embedded computers.
• Compiling for multiple machines.
• Use of virtual machines.

2. One-pass or Multi-pass Compiler
A compiler performs many operations and consumes a lot of memory space, so
compilers are split into smaller programs (passes) that each perform part of
the analysis and translation. A one-pass compiler translates the source
program in a single pass over it, while a multi-pass compiler processes the
program several times. The ability to compile in one pass is beneficial
because it simplifies the job of writing a compiler, and one-pass compilers
are generally faster than multi-pass compilers.

3. Source-to-source Compiler
A source-to-source compiler is a compiler that takes a high level language as
its input and gives an output that is also written in a high level language.

4. Stage Compiler
A stage compiler compiles to the assembly language of a theoretical
(abstract) machine. For example, some Prolog implementations compile to the
Warren Abstract Machine (WAM).

5. Just-in-time Compiler
A just-in-time compiler allows applications to be delivered in byte code,
which is compiled to machine code at run time. For example, Smalltalk and
Java systems use a just-in-time compiler.
FUNCTIONS OF COMPILER

The compiler must perform the following 4 tasks:

• Recognize certain strings as basic elements or tokens, i.e., variables,
operators, keywords etc.
• Recognize combinations of elements as syntactic units and interpret
their meaning.
• Allocate storage and assign locations for all variables in the program.
• Generate the appropriate object code.
General model of compiler:

There are 7 distinct logical problems (phases):

1. Lexical analysis.
2. Syntax analysis.
3. Interpretation phase.
4. Machine independent optimization.
5. Storage assignment.
6. Code generation.
7. Assembly and output.

1. Lexical analysis: Recognition of basic elements or tokens and creation of
the uniform symbol table.

2. Syntax analysis: Recognition of basic syntactic constructs through
the reduction table.

3. Interpretation phase: Definition of the exact meaning, and creation
of the matrix and tables by the respective routines [action routines].

4. Machine independent optimization: Creation of a more optimal matrix
[removes the duplicate entries in the matrix].

5. Storage assignment: It makes entries in the matrix that allow code
generation to create code that allocates dynamic storage, and that allow the
assembly phase to reserve the proper amount of STATIC storage.

6. Code generation: A macro processor is used to produce more optimal
assembly code.

7. Assembly and output: It resolves symbolic addresses and generates the
machine language.

Phases 1 to 4 are machine independent and language dependent, because these
phases determine the syntax and meaning of each statement in the
source program. Hence they depend on the language and are independent of
the machine.

Phases 5 to 7 are machine dependent and language independent, because these
phases allocate memory for variables and literals and generate the assembly
code, which depends on the machine and is independent of the language.
The databases used by the compiler are:

Source code: The program written by the user.

Uniform symbol table: It consists of the tokens or basic elements as they
appear in the program. It is created by the lexical analysis phase and given
as input to the syntax analysis and interpretation phases.

Terminal table: A permanent table that lists all the keywords and special
symbols of the language.

Identifier table: It contains all the variables in the program and temporary
storage [e.g. M1, M2, M3 ... M7], together with the information needed to
reference or allocate storage for them. This table is created by lexical
analysis.

Literal table: It contains all the constants in the program.

Reductions: A permanent table of decision rules, in the form of patterns that
are matched against the uniform symbol table to discover syntactic structure.

Matrix: The intermediate form of the program, created by
the action routines. It is optimized and then used for code generation.

Code productions: A permanent table of definitions. There is one entry
defining the code for each matrix operator.

Assembly code: The assembly language version of the program, which is
created by the code generation phase and is input to the assembly phase.

Relocatable object code: The final output of the assembly phase, ready to be
used as input to the loader.

PHASES OF COMPILER

A compiler is broken into several logical phases, each of which handles one
part of the translation of the source code into the target code. To
understand the phases, consider the following example:

WCM: procedure (Rate, Start, Finish);
Declare (Cost, Rate, Start, Finish) fixed binary (31) static;
Cost = Rate*(Start-Finish) + 2*Rate*(Start-Finish-100);
Return (Cost);
End;


Lexical phase
The lexical phase performs the following three tasks:
1. Recognize the basic elements, or tokens, present in the source code.
2. Build a literal table and an identifier table.
3. Build a uniform symbol table.
Databases:
The lexical phase involves the manipulation of 5 databases:
1. Source program
2. Terminal table
3. Literal table
4. Identifier table
5. Uniform symbol table

1. Source program: The original form of the program created by the user.

2. Terminal table: It is a permanent database. It consists of 3 fields:

Symbol    Indicator    Precedence

Symbol: operators, keywords and separators such as ( , ; :

Indicator: the values are YES or NO
Yes => operators, separators
No => keywords

Precedence: used in later phases

Example

Index   Symbol      Indicator   Precedence
1       :           Yes
2       ;           Yes
3       (           Yes
4       )           Yes
5       ,           Yes
6       *           Yes
7       Declare     No
8       Procedure   No
9       +           Yes
10      -           Yes
11      *           Yes
12      Rate        No
13      Start       No
14      Finish      No
Fig: Terminal Table

Literal Table:

It describes all literal constants used in the source program. It consists
of six fields. Other information and address are filled in by later phases.

Literal   Base   Scale   Precision   Other information   Address

Example

Literal   Base      Scale   Precision   Other information        Address
31        Decimal   Fixed   2           Filled by later phases   Filled by later phases
2         Decimal   Fixed   1
100       Decimal   Fixed   3


Identifier Table
It describes all identifiers used in the source program. It consists of three fields:

Name    Data attribute    Address

Data attribute and address are filled in by later phases.

Name      Data attribute             Address
WCM       Filled by later phases     Filled by later phases
RATE
START
FINISH
COST

Uniform Symbol Table

The uniform symbol table represents the program as a string of tokens rather
than individual characters. There is one uniform symbol for every token in the
program.

It consists of 2 fields: Table and Index.

Table   Index   Token

IDN     1       WCM
TRM     1       :
TRM     8       Procedure
TRM     3       (
IDN     2       Rate
TRM     5       ,
IDN     3       Start
TRM     5       ,
IDN     4       Finish
TRM     4       )
TRM     2       ;

Fig: Uniform symbol table

Algorithm

Step 1: The first task of the lexical analysis algorithm is to parse the input
character string into tokens.
Step 2: The second step is to make the appropriate entries in the tables.

Implementation
The input string is separated into tokens by break characters. Break characters
are denoted by the contents of a special field in the terminal table.

Lexical analysis recognizes 3 types of tokens:

• Terminal symbols [TRM]
• Identifiers [IDN]
• Literals [LIT]

Segregating Tokens
i. If the symbol is found in the terminal table, then create a uniform symbol
of type TRM.

ii. Else if the symbol is an identifier, then enter it in the identifier table
(if it is not already there) and create a uniform symbol of type IDN.

iii. Else enter it in the literal table and create a uniform symbol of type LIT.

End if
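
The segregation of tokens can be sketched as follows. This is a minimal Python illustration, not the compiler's actual lexical phase: the function name `lexical_phase`, the regular expression used to find break characters, and the reduced terminal table (only the entries needed for the first statement of the example, with the indices used in the uniform symbol table above) are assumptions.

```python
import re

# Reduced terminal table: symbol -> index (assumed subset of the full table).
TERMINALS = {":": 1, ";": 2, "(": 3, ")": 4, ",": 5, "procedure": 8}

def lexical_phase(source):
    """Return (uniform symbol table, identifier table, literal table)."""
    identifiers = []   # identifier table, names only
    literals = []      # literal table, values only
    uniform = []       # uniform symbol table as (table, index) pairs
    # Words and numbers stay whole; any other non-blank character is a token.
    for token in re.findall(r"[A-Za-z_]\w*|\d+|\S", source):
        key = token.lower()
        if key in TERMINALS:
            uniform.append(("TRM", TERMINALS[key]))
        elif token[0].isalpha() or token[0] == "_":
            if token not in identifiers:
                identifiers.append(token)
            uniform.append(("IDN", identifiers.index(token) + 1))
        elif token.isdigit():
            if token not in literals:
                literals.append(token)
            uniform.append(("LIT", literals.index(token) + 1))
        else:
            raise ValueError("symbol not in terminal table: " + token)
    return uniform, identifiers, literals

uniform, identifiers, literals = lexical_phase("WCM: procedure (Rate, Start, Finish);")
for entry in uniform:
    print(entry)
```

For this statement the sketch yields the same sequence of table and index pairs as the uniform symbol table shown earlier, with WCM, Rate, Start and Finish entered in the identifier table in that order.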

Syntax Phase:
The functions of the syntax phase are:

• To recognize the major constructs of the language.
• To call the appropriate action routines that will generate the intermediate
form or matrix for the constructs.

DATABASES
1. Uniform symbol table: The uniform symbols are the source of input to the
stack, which is used by the syntax and interpretation phases.

Table class    Index

2. Stack: The stack is the collection of uniform symbols currently being
worked on. The stack is organized as LIFO (last in, first out).

3. Reduction table: The syntax rules of the source language are contained in
the reduction table. The syntax analysis phase is an interpreter driven by the
reductions.
The general form of a reduction or rule is:

Label : old top of stack / action routines / new top of stack / next reduction

The following conventions are used:

1. Label: optional.
2. Old top of stack: to be compared to the top of the stack.
3. Action routines: to be called if the old top of stack matches the top of the stack.
4. New top of stack: changes to be made to the top of the stack after the action
routines are executed.
5. Next reduction: the reduction to be interpreted next.

ALGORITHM

Step 1: Reductions are tested consecutively for a match between the old top of
stack field and the actual top of the stack, until a match is found.
Step 2: When a match is found, the action routines specified in the action field
are executed in order from left to right.
Step 3: When control returns to the syntax analyser, it modifies the top of the
stack to agree with the new top of stack field.
Step 4: Step 1 is repeated, starting with the reduction specified in the next
reduction field.

Interpretation Phase
It is a collection of routines that are called when a construct is recognized.

Databases
1. Uniform symbol table: The table created by the lexical phase. The uniform
symbols are the source of input to the stack, which is used by the
syntax and interpretation phases.

Table class    Index

2. Stack: The stack is the collection of uniform symbols currently being
worked on. The stack is organized as LIFO.

3. Identifier table: It is initialized by the lexical analysis phase to
describe all identifiers used in the source program.

Name  Base  Scale  Precision  Storage  Array  Structure  Literal  Block  Other  Address

4. Matrix: It allows the first four phases of the compiler to be machine
independent. It also allows machine independent optimization of the program
to occur before the code generation phase.
To make it easy to add or delete entries, each entry has a special reserved
field called chaining.

5. Temporary storage table: It temporarily stores the data type, precision or
source statement.

Algorithm:

There is no single algorithm, because the interpretation phase is a collection
of individual action routines that are called by the syntax analysis phase.

Optimization Phase
Optimization performed by a compiler is of 2 types:

Machine dependent optimization: It is related to the machine instructions
that get generated, so it is incorporated into the code generation phase.

Machine independent optimization: It is not related to the machine
instructions. It is used to increase the efficiency of the code and to reduce
its size.

Databases:

Matrix: It allows the first four phases of the compiler to be machine
independent. It also allows machine independent optimization of the program
to occur before the code generation phase. To make it easy to add or delete
entries, each entry has a special reserved field called chaining.

Identifier table: It is initialized by the lexical analysis phase to describe
all identifiers used in the source program.

Literal table: It describes all literal constants used in the source program.
It consists of 6 fields. Other information and address are filled in by later
phases.

Algorithm

1. Place the matrix in a form in which common subexpressions can be recognized.
2. Recognize two subexpressions as being equivalent.
3. Eliminate one of them.
4. Alter the rest of the matrix to reflect the elimination of this entry.

Optimization (Machine Independent & Machine Dependent)

Optimization (Machine Independent)

The machine independent optimization steps are:

1. Elimination of common subexpressions.
2. Compile time computation of operations both of whose operands are
constants.
3. Use of the properties of Boolean expressions to minimize their computation.
4. Movement of computations involving nonvarying operands out of loops.

Elimination of Common subexpressions

Example 1:

SUM = (A*B) + (A*B)

The following tables show the matrix with the common subexpression and the
matrix after elimination.

Matrix with common subexpression

Line No   Operator   Operand 1   Operand 2
1         *          A           B
2         *          A           B
3         +          M1          M2
4         =          SUM         M3

Matrix after eliminating the common subexpression (entry 2 is deleted and
the reference to M2 is replaced by M1)

Line No   Operator   Operand 1   Operand 2
1         *          A           B
2
3         +          M1          M1
4         =          SUM         M3

Example 2:
A = B*(C - D)
E = (C - D)*5
Matrix before elimination:

Line No   Operator   Operand 1   Operand 2
M1        -          C           D
M2        *          B           M1
M3        =          A           M2
M4        -          C           D
M5        *          M4          5
M6        =          E           M5

Matrix after elimination (entry M4 is deleted and M5 refers to M1 instead):

Line No   Operator   Operand 1   Operand 2
M1        -          C           D
M2        *          B           M1
M3        =          A           M2
M4
M5        *          M1          5
M6        =          E           M5
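
The elimination algorithm can be sketched on a matrix held as a list of (operator, operand 1, operand 2) entries. This Python sketch is an illustration under simplifying assumptions: the function name and the data layout are hypothetical, an eliminated entry is represented by `None`, and the sketch assumes that no operand is reassigned between the two occurrences of a subexpression and does not treat A*B and B*A as equal.

```python
def eliminate_common_subexpressions(matrix):
    """Entry i of `matrix` is referred to by later entries as 'M<i>'."""
    seen = {}      # (operator, operand1, operand2) -> name of first such entry
    alias = {}     # name of an eliminated entry -> name of the surviving entry
    result = []
    for i, (op, a, b) in enumerate(matrix, start=1):
        # Redirect references to entries that were already eliminated.
        a, b = alias.get(a, a), alias.get(b, b)
        key = (op, a, b)
        if op != "=" and key in seen:
            alias["M%d" % i] = seen[key]
            result.append(None)          # this entry is deleted
        else:
            if op != "=":
                seen[key] = "M%d" % i
            result.append(key)
    return result

example1 = [("*", "A", "B"), ("*", "A", "B"), ("+", "M1", "M2"), ("=", "SUM", "M3")]
example2 = [("-", "C", "D"), ("*", "B", "M1"), ("=", "A", "M2"),
            ("-", "C", "D"), ("*", "M4", 5), ("=", "E", "M5")]

after1 = eliminate_common_subexpressions(example1)
after2 = eliminate_common_subexpressions(example2)
print(after1)
print(after2)
```

In Example 2 the fourth entry duplicates the first, so it is dropped and the fifth entry is rewritten to use M1.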

Compile time computation of operations both of whose operands are
constants

Example: A = 5*4/2*B

Before optimization

Line No   Operator   Operand 1   Operand 2
M1        *          5           4
M2        /          M1          2
M3        *          M2          B
M4        =          A           M3

After optimization (5*4/2 is computed at compile time as 10)

Line No   Operator   Operand 1   Operand 2
M1
M2
M3        *          10          B
M4        =          A           M3
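
Compile time computation can be sketched on the same matrix representation. This is an illustrative Python sketch: the function name `fold_constants`, the use of `None` for a deleted entry and the use of integer division for "/" are assumptions.

```python
import operator

# Operators that can be evaluated at compile time (integer arithmetic assumed).
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.floordiv}

def fold_constants(matrix):
    """Entry i of `matrix` is referred to by later entries as 'M<i>'."""
    value = {}     # name of a folded entry -> its compile time value
    result = []
    for i, (op, a, b) in enumerate(matrix, start=1):
        # Replace references to folded entries by their constant values.
        a, b = value.get(a, a), value.get(b, b)
        foldable = op in OPS and isinstance(a, int) and isinstance(b, int)
        if foldable and not (op == "/" and b == 0):
            value["M%d" % i] = OPS[op](a, b)
            result.append(None)          # entry is no longer needed
        else:
            result.append((op, a, b))
    return result

before = [("*", 5, 4), ("/", "M1", 2), ("*", "M2", "B"), ("=", "A", "M3")]
after = fold_constants(before)
print(after)
```

Entries M1 and M2 disappear and M3 becomes a multiplication of the constant 10 by B, as in the table above.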


Boolean Expression Optimization

The properties of Boolean expressions can be used to shorten their
computation. For example, in

If a OR b AND c

if a is true the whole condition is true, so b AND c need not be evaluated.
Likewise, when the values of a, b and c are known at compile time, the
expression need not be stored: the values are substituted and the result is
computed as either true or false.

Move invariant computations outside the loop

If a computation within a loop depends only on variables that do not change
within the loop, the computation may be moved outside the loop.

Optimization (Machine Dependent)

Machine dependent optimization is needed in order to save memory and improve
execution speed.

Example 1:

Consider A = B + C
The following depicts the code for the above statement.

Matrix        Original code       Better code
+ B C         1) L  1,B           1) L  1,B
              2) A  1,C           2) A  1,C
              3) ST 1,M1          3) ST 1,A
= A M1        4) L  1,M1
              5) ST 1,A

Example 2:
The matrix for SUM = (A*B) + (A*B) after elimination of the common
subexpression:

1   *   A     B
3   +   M1    M1
4   =   SUM   M3

Code Generation Phase

The purpose of code generation is to produce the appropriate code. In
this phase the matrix is the input database.
Databases
• Matrix: It allows the first four phases of the compiler to be machine
independent. It also allows machine independent optimization of the
program to occur before the code generation phase. To make it easy to add or
delete entries, each entry has a special reserved field called chaining.

• Identifier table: It is initialized by the lexical analysis phase to
describe all identifiers used in the source program.

• Literal table: It describes all literal constants used in the source
program. It consists of 6 fields. Other information and address are filled
in by later phases.

• Code productions: A permanent database defining the code for every possible
matrix operator.

Assembly Phase
If the majority of the work has been done by the code generation phase, the
assembly phase does the following:
• It resolves label references in the object program.
• It formats the object deck.
• It formats the appropriate information for the loader.

If the code generation phase generated symbolic machine instructions and
labels, the assembly phase does the following:
• It resolves label references.
• It calculates addresses.
• It generates binary machine instructions.
• It generates storage.
• It converts literals.
Databases
• Identifier table: It is initialized by the lexical analysis phase to
describe all identifiers used in the source program.
• Literal table: It describes all literal constants used in the source
program. It consists of 6 fields. Other information and address are
filled in by later phases.
• Object code: The output of code generation.
Algorithm

1. The assembly phase scans the object code, resolving all label references
and producing the TXT cards.
2. Then it scans the identifier table to create the ESD cards.
3. The RLD cards are created using the object code, the ESD cards and the
identifier table.
• A TXT card contains the actual assembled program.
• An ESD card contains information about the symbols that are defined in this
program but referenced elsewhere, and the symbols that are referenced in this
program but defined elsewhere.
• An RLD card contains information about the location of each constant that
must be changed due to relocation.

PASSES OF A COMPILER


The above diagram depicts a flowchart of a compiler.

Pass 1:
It corresponds to the lexical analysis phase of a compiler. It scans the source
program and creates the identifier, literal and uniform symbol tables.
Pass 2:
It corresponds to the syntax and interpretation phases. Pass 2 scans the uniform
symbol table and produces the matrix.
Pass 3 through Pass N-3 (up to Pass 4):
They correspond to the optimization phase.

Pass N-2 (Pass 5):
It corresponds to the storage assignment phase.

Pass N-1 (Pass 6):
It corresponds to code generation. It scans the matrix.

Pass N (Pass 7):
It corresponds to the assembly and output phase. It resolves symbolic addresses
and creates the information for the loader.

Difference between BALR and USING

BALR                                        USING
It is a machine-op.                         It is a pseudo-op.
BALR is an instruction to the computer      USING indicates to the assembler
to load a register with the next            which general register to use as the
address and branch to the address           base register and what is in the
in the second field.                        base register.
Loads the base register.                    Informs the assembler what is in the
                                            base register.
Sets the register to the next address.      Only provides information to the
                                            assembler.