
System Programming Part1&2

Uploaded by

elias ferhan
Copyright
© © All Rights Reserved

System Programming part

Unit one: Introduction


System programming is the practice of writing programs that let computer hardware interact with
programmers and users. It is hardware dependent: system programs are tied to a particular
platform, so writing an efficient one requires knowledge of the underlying hardware. System
programs are typically written in assembly language or in languages that allow low-level access
to the machine, such as C and C++. Common kinds of system programs are as follows:

o Operating System
o Compilers
o Assemblers
o I/O routines
o Interpreters
o Scheduler
o Loaders and Linkers

System Software
System software refers to the low-level software that manages and controls a computer’s hardware
and provides basic services to higher-level software. System software refers to the collection of
programs and software components that enable a computer or computing device to function
properly. It acts as an intermediary between the user and the computer hardware, allowing the user
to interact with the hardware and use various applications and programs.

Examples of system software include:

• Operating systems (OS): Windows, Linux, macOS, etc.


• Device drivers: software that enables the communication between hardware and OS.
• Firmware: pre-installed low-level software that controls a device’s basic functions.
• Utility software: tools for system maintenance and optimization.
• Boot loaders: software that initializes the OS during startup.

There are two main types of software: system software and application software. System
software includes the programs dedicated to managing the computer itself, such as the
operating system, file management utilities, and the disk operating system (DOS). System software
provides a platform for other software. Examples include operating systems, antivirus software,
disk formatting software, and computer language translators. These are commonly supplied by
computer manufacturers. System software is often written in low-level languages so that it can
interact with the hardware at a very basic level, and it serves as the interface between the
hardware and the end users.

Operating systems are the most important type of system software, as they provide the
foundational framework for all other software and applications to run on the computer. They
manage computer resources, such as memory and processing power, and provide a user interface
for users to interact with the system.

Device drivers are another important type of system software, as they allow the operating system
to communicate with hardware devices such as printers, scanners, and graphics cards.

Utility programs provide additional functionality to the operating system, such as disk
defragmentation, virus scanning, and file compression.

A compiler performs most or all of the following operations during compilation: preprocessing,
lexical analysis, parsing, semantic analysis (syntax-directed translation), conversion of input
programs to an intermediate representation, code optimization, and code generation. Examples of
compilers include gcc (C compiler), g++ (C++ compiler), and javac (Java compiler).

Interpreter: An interpreter is a computer program that directly executes instructions written in
a programming or scripting language. Interpreters do not require the program to be compiled into
a machine language program beforehand. An interpreter translates high-level instructions into an
intermediate form, which is then executed.
An interpreted program starts running quickly because there is no separate compilation stage in
which machine instructions are generated, although it usually runs more slowly than compiled
code. The interpreter translates and executes the program statement by statement until the first
error is met, and then stops. Hence debugging is easy. Examples of interpreted languages include
Ruby, Python, and PHP.

Assembler: An assembler is a program that converts assembly language into machine code. It
takes the basic commands and operations and converts them into binary code specific to a type of
processor.
Like compilers, assemblers produce executable code. However, assemblers are simpler, since they
only convert low-level code (assembly language) to machine code. Since each assembly language is
designed for a specific processor, assembling a program is largely a one-to-one mapping from
assembly instructions to machine instructions. Compilers, on the other hand, must convert generic
high-level source code into machine code for a specific processor.

Why use System Software?


Here are some reasons why system software is necessary:
1. Hardware Communication: System software serves as an interface between the hardware and
software components of a computer, enabling them to communicate and work together.
2. Resource Management: System software manages computer resources such as memory, CPU
usage, and storage, optimizing their utilization and ensuring that the system operates
efficiently.
3. Security: System software provides security measures such as firewalls, antivirus software,
and encryption, protecting the system and its data from malware, viruses, and other security
threats.
4. User Interface: System software provides a user interface that allows users to interact with the
computer or computing device and perform various tasks.
5. Application Support: System software supports the installation and running of applications and
software on the system.
6. Customization: System software allows for customization of the system settings and
configuration, giving users greater control over their computing environment.
The most important features of system software include:
1. Closeness to the system
2. Fast speed
3. Difficult to manipulate
4. Written in a low-level language
5. Difficult to design

Simplified Instructional Computer (SIC)
SIC is a hypothetical computer designed to include the hardware features most often found on
real machines while avoiding unusual or irrelevant complexity. The simplified instructional
computer has two versions:

o SIC standard model
o SIC/XE (XE stands for "extra equipment" or "extra expensive")

1. SIC machine Architecture/components


The simplified instructional computer has the following components:

Memory

The memory in a simplified instructional computer is organized as a sequence of 8-bit bytes (1
byte = 8 bits). Any 3 consecutive bytes form a word (1 word = 24 bits), so SIC is a 24-bit
machine. All addresses are byte addresses: a word is addressed by the location of its
lowest-numbered byte, and addressing starts at byte 0. The memory contains 2^15 (32,768) bytes
in total.

Registers

The simplified instructional computer contains 5 registers. Every register has an associated
number, known as the register number. Each register is 3 bytes (24 bits) long, which is also the
size of an integer.

There is no hardware stack in SIC; the return address of a subroutine call is stored in the
linkage register. This makes recursive programs very difficult to write in SIC. If subroutine
calls are nested more than one level deep, the programmer must save the return addresses in
memory.

Mnemonic              Number  Special use
A (Accumulator)       0       Used to perform arithmetic operations.
X (Index Register)    1       Used for addressing.
L (Linkage Register)  2       On a subroutine call, saves the return address.
PC (Program Counter)  8       Stores the address of the next instruction to be executed.
SW (Status Word)      9       Contains a variety of information, such as the condition code (CC).

The status word register contains seven fields, which are described as follows (the bit
positions are given in brackets):

Mode: The mode bit indicates supervisor mode (value = 1) or user mode (value = 0). It occupies
1 bit. [0]

State: The state bit indicates whether the process is idle (value = 1) or running (value = 0).
It occupies 1 bit. [1]

Id: The Id field holds the process id (PID). It occupies 4 bits. [2-5]

CC: The CC field holds the condition code, which records the result of a comparison or of a
device test (for example, whether a device is ready or not). It occupies 2 bits. [6-7]

Mask: The mask field holds the interrupt mask. It occupies 4 bits. [8-11]

X: These bits are unused. They occupy 4 bits. [12-15]

ICode: The ICode field holds the interrupt code. It occupies the remaining 8 bits. [16-23]
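
The field layout above can be illustrated with a small Python sketch that splits a 24-bit status word into its fields. This is only an illustration, not part of SIC itself: the names `SW_FIELDS` and `decode_sw` are hypothetical, and the code assumes that bit 0 is the leftmost (most significant) bit of the word, as the bracketed bit ranges suggest.

```python
# Bit ranges taken from the field list above.
# Assumption: bit 0 is the leftmost (most significant) bit of the 24-bit word.
SW_FIELDS = {
    "mode": (0, 0),
    "state": (1, 1),
    "id": (2, 5),
    "cc": (6, 7),
    "mask": (8, 11),
    "unused": (12, 15),
    "icode": (16, 23),
}


def decode_sw(sw):
    """Split a 24-bit status word into its named fields."""
    fields = {}
    for name, (first, last) in SW_FIELDS.items():
        width = last - first + 1
        shift = 23 - last  # distance of the field's last bit from the right edge
        fields[name] = (sw >> shift) & ((1 << width) - 1)
    return fields


# Example word: mode=1, state=0, id=0011, cc=10, mask=0000, unused=0000, icode=00000101
example = decode_sw(0b1_0_0011_10_0000_0000_00000101)
print(example)
```

Keeping the layout in a table rather than in hard-coded shifts makes it easy to compare the code against the bit ranges listed in the text.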

Data format
o Integers are represented as 24-bit binary numbers.
o Characters are represented by their 8-bit ASCII codes.
o There is no hardware support for floating-point numbers in SIC, but there is in SIC/XE.
o Negative numbers are represented in 2's complement. That means -N ⇔ 2^n - N, where n is the
number of bits. For example: if n = 4, then -1 ⇔ 2^4 - 1 = 15 = (1111)_2.
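
As a rough illustration of the 2's complement rule, the following Python sketch converts between signed integers and their n-bit patterns. The function names are hypothetical, and the default width of 24 bits is chosen to match the SIC word size.

```python
WORD_BITS = 24  # SIC word size


def to_twos_complement(value, bits=WORD_BITS):
    """Return the bit pattern (as a non-negative int) that represents value."""
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not low <= value <= high:
        raise ValueError(f"{value} does not fit in {bits} bits")
    # For negative values this computes 2**bits - N, as in the rule above.
    return value & ((1 << bits) - 1)


def from_twos_complement(pattern, bits=WORD_BITS):
    """Interpret an n-bit pattern as a signed integer."""
    if pattern & (1 << (bits - 1)):  # sign bit set, so the value is negative
        return pattern - (1 << bits)
    return pattern


print(format(to_twos_complement(-1, bits=4), "04b"))  # the 4-bit example from the text
print(format(to_twos_complement(-1), "06X"))          # the same value as a 24-bit word
```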

Instruction Format

All instructions in a simplified instructional computer use the same 24-bit format. Because the
memory size is 2^15 bytes, a 15-bit address field is enough to reach any byte:

opcode (8 bits) | x (1 bit) | address (15 bits)

Here, the x bit selects the indexed addressing mode.

Addressing Mode:

The SIC can only support 2 modes, which are described as follows:

o Indexed
o Direct

If x = 0, the instruction uses direct addressing. If x = 1, it uses indexed addressing, as
shown below:

Mode     Indication  Target address calculation
Direct   x = 0       TA = address
Indexed  x = 1       TA = address + (X)

Here, parentheses denote the contents of a register, so (X) is the contents of the index register.
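
The target address rule can be sketched in Python as follows. The sketch assumes the standard SIC instruction layout (8-bit opcode, 1-bit x flag, 15-bit address); the function names and the example words are hypothetical.

```python
def decode_instruction(word):
    """Split a 24-bit SIC instruction into (opcode, x, address)."""
    opcode = (word >> 16) & 0xFF   # top 8 bits
    x = (word >> 15) & 0x1         # index flag
    address = word & 0x7FFF        # low 15 bits
    return opcode, x, address


def target_address(word, x_register):
    """Apply the rule TA = address (direct) or TA = address + (X) (indexed)."""
    _, x, address = decode_instruction(word)
    return address + x_register if x else address


direct = 0x001000   # opcode 00, x = 0, address 1000 (hex)
indexed = 0x009000  # opcode 00, x = 1, address 1000 (hex)
print(hex(target_address(direct, x_register=3)))
print(hex(target_address(indexed, x_register=3)))
```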

Instruction Set:

The instruction set in SIC is described as follows:

Arithmetic Instruction:

SIC arithmetic instructions operate on register A and a word in memory, and the result is left
in register A. The arithmetic instructions are ADD, SUB, MUL, and DIV. For example:

1. ADD ALPHA ⇔ (A) ← (A) + (ALPHA)

Examples of arithmetic

LDA ALPHA    Load the value of ALPHA into register A.
ADD INCR     Add the value of INCR to register A.
LDA GAMMA    Load the value of GAMMA into register A.
SUB ONE      Subtract the value of ONE (1) from register A.
STA DELTA    Store the contents of register A in DELTA.
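
To see how register A changes from one instruction to the next, the sequence above can be traced with a very small Python simulation. This is a sketch only: memory is modelled as a dictionary keyed by symbol name, the initial values of ALPHA, INCR and GAMMA are made up for the illustration, and `run` is a hypothetical helper.

```python
def run(program, memory):
    """Execute a list of (mnemonic, operand) pairs using only register A."""
    a = 0
    for mnemonic, operand in program:
        if mnemonic == "LDA":
            a = memory[operand]                    # (A) <- (m)
        elif mnemonic == "ADD":
            a = (a + memory[operand]) & 0xFFFFFF   # (A) <- (A) + (m), kept to 24 bits
        elif mnemonic == "SUB":
            a = (a - memory[operand]) & 0xFFFFFF   # (A) <- (A) - (m), kept to 24 bits
        elif mnemonic == "STA":
            memory[operand] = a                    # (m) <- (A)
        else:
            raise ValueError(f"unsupported mnemonic: {mnemonic}")
    return a


# Assumed initial values, chosen only for this illustration.
memory = {"ALPHA": 5, "INCR": 2, "GAMMA": 10, "ONE": 1, "DELTA": 0}
program = [
    ("LDA", "ALPHA"),
    ("ADD", "INCR"),
    ("LDA", "GAMMA"),
    ("SUB", "ONE"),
    ("STA", "DELTA"),
]
run(program, memory)
print(memory["DELTA"])
```

Note that the second LDA overwrites the sum computed by the first two instructions, so only GAMMA - ONE ends up in DELTA.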

Load and Store Instruction:

These instructions move data from memory to a register or from a register to memory. The load
and store instructions include LDA, LDX, STA, and STX. For example:

1. LDA ALPHA ⇔ (A) ← (ALPHA)
2. STA ALPHA ⇔ (ALPHA) ← (A)

Comparison Instruction:

It compares the contents of register A with a word in memory and saves the result in the
condition code (CC) field of SW. The comparison instruction is COMP. For example:

1. COMP ALPHA ⇔ CC ← (<, =, >) of (A) ? (ALPHA)

Subroutine Linkage Instruction:

These instructions are used to call and return from subroutines. The subroutine linkage
instructions are JSUB and RSUB. JSUB jumps to the subroutine and places the return address in
register L; RSUB returns by jumping to the address contained in register L.

Conditional Jump Instruction:

These instructions test the condition code (CC) set by a preceding instruction, such as COMP,
which compares the contents of the accumulator with a word in memory, and jump if the condition
holds. The conditional jump instructions are JLT, JGT, and JEQ. For example, JEQ jumps to the
target address only if CC is set to =.

Input and Output

Each device is assigned a unique 8-bit code. Data is transferred one byte at a time to or from
the rightmost 8 bits of register A. There are three input and output instructions, which are
described as follows:

Test Device (TD): It tests whether the device is ready to send or receive a byte of data and
reports the result in the condition code (CC) of the status word. If CC is <, the device is
ready. If CC is =, the device is busy.

Read Data (RD): It reads a byte from the device into the rightmost byte of register A.

Write Data (WD): It writes the rightmost byte of register A to the device specified by the
operand.

Example of I/O for SIC

INLOOP  TD    INDEV    Test the input device.
        JEQ   INLOOP   Loop until the device is ready.
        RD    INDEV    Read a single byte into register A.
        STCH  DATA     Store the byte that was read.
        .
        .
OUTLP   TD    OUTDEV   Test the output device.
        JEQ   OUTLP    Loop until the device is ready.
        LDCH  DATA     Load the data byte into register A.
        WD    OUTDEV   Write one byte to the output device.

Here are some applications of SIC:

1. Computer Architecture education: The SIC is an excellent tool for teaching computer
architecture and organization, as it provides a simplified model of a computer system. By
studying the SIC’s architecture, students can learn about the basic components of a computer
system, such as the CPU, memory, and I/O devices.
2. Assembly language programming education: The SIC’s instruction set is simple and easy
to understand, making it a useful tool for teaching assembly language programming. Students
can write and execute assembly language programs on the SIC, learning about the various
instructions, addressing modes, and program flow control.

3. Compiler development: The SIC can be used as a platform for developing compilers for high-
level programming languages. Compiler developers can use the SIC’s instruction set and
memory organization as a reference for generating assembly language code from high-level
code.
4. Operating system development: The SIC’s simple architecture can be used as a basis for
teaching operating system development. Students can learn about the basic features of an
operating system, such as process management, memory management, and I/O management,
by implementing them on the SIC.
5. Emulation and simulation: The SIC can be used for emulation and simulation purposes,
allowing software developers to test their programs on a simulated computer system before
deploying them on real hardware.

UNIT II. ASSEMBLERS
An assembler is software that converts assembly language code into machine code. It takes basic
computer commands and converts them into binary code that the computer's processor can use to
perform its basic operations. These commands are written in assembler language, also called
assembly language. An assembler can be thought of as the compiler of assembly language: a compiler
converts a high-level language to machine language, and an assembler does the same task for
assembly language. More precisely, an assembler is a program that converts instructions written in
low-level assembly code into relocatable machine code and generates additional information for the
loader.

What is an Assembly Language?

An assembly language is a low-level language. It gives instructions to the processor for different
tasks, and it is specific to a particular processor. Machine language consists only of 0s and 1s,
so it is difficult to write a program in it. Assembly language is close to machine language but is
simpler to read and write.
Assembly language code can be generated by a compiler or written directly by a programmer.
Programmers mostly use high-level languages, but when more specific control is required, assembly
language is used. Each instruction is identified by an opcode, which specifies the operation to
perform. The symbolic representation of an opcode (machine-level instruction) is called a
mnemonic. Programmers use mnemonics to remember the operations in assembly language.
For example, ADD A, B
Here, ADD is the mnemonic that tells the processor to perform an addition, and A and B are the
operands. SUB, MUL, and DIV are other mnemonics.
Table 2.1 lists some assembly language opcode mnemonics, and Table 2.2 shows a small assembly
language program with a description of its opcode and operand components.

Mnemonic (instruction)   Meaning/use
INP (Input)              Inputs a value, then stores the value in the accumulator
OUT (Output)             Outputs the accumulator contents
STA (Store)              Transfers a number from the accumulator to RAM
LDA (Load)               Transfers a number from RAM to the accumulator
ADD (Add)                Adds the contents at a RAM address to the accumulator contents
SUB (Subtract)           Subtracts the contents at a RAM address from the accumulator contents
BRA (Branch)             Jumps to the given RAM address (used, for example, when looping)
HLT (Halt/Stop/End)      Stops the processor
DAT (Data definition)    Variable definition

Table 2.1. Assembly language opcode mnemonics and instructions

Original assembly language  Opcode  Operand  Description
INP                         INP              Input value and store in the accumulator
STA 1C                      STA     1C       Store the number at memory address 1C
INP                         INP              Input value and store in the accumulator
ADD 1C                      ADD     1C       Add this number to the number stored at memory address 1C
OUT                         OUT              Output the result
HLT                         HLT              Stop the program

Table 2.2. A small assembly language program with a description of the opcode and operand

A Simple SIC Assembler
Basic assembler functions

➢ Convert mnemonic operation codes to machine language equivalents

o E.g. LDA➔00 , STL➔14

➢ Convert symbolic operands to machine addresses

o E.g. STA TABLE, TABLE➔100

➢ Build machine instructions in the proper format

➢ Convert data constants to internal machine representations

o E.g. EOF ➔454F46

➢ Write the object program and assembly listing files
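
The conversion of data constants to their internal representation can be sketched in Python. The helper names are hypothetical; the sketch assumes ASCII character constants and 24-bit integer words, as in SIC.

```python
def char_constant_to_hex(text):
    """Convert a character constant to object code, one byte (two hex digits) per character."""
    return text.encode("ascii").hex().upper()


def word_constant_to_hex(value):
    """Convert an integer constant to a 24-bit word (six hex digits, 2's complement)."""
    return format(value & 0xFFFFFF, "06X")


print(char_constant_to_hex("EOF"))  # E = 45, O = 4F, F = 46
print(word_constant_to_hex(3))
print(word_constant_to_hex(-1))
```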

Assembler directives
Assembler directives are the instructions used by the assembler at the time of assembling a source
program. More specifically, we can say, assembler directives are the commands or instructions
that control the operation of the assembler. Assembler directives are the instructions provided to
the assembler, not the processor as the processor has nothing to do with these instructions. These
instructions are also known as pseudo-instructions or pseudo-opcode.
So, assembler directives:
➢ mark the beginning and end of a program provided to the assembler,
➢ provide storage locations for data,
➢ give values to variables,
➢ define the start and end of different segments, procedures or macros etc. of a program.
The first two directives below are used by 8086 assemblers; the remaining ones are used by the
SIC assembler:

➢ DB: Define Byte: allocates and initializes single or multiple data bytes.
➢ DW: Define Word: allocates and initializes single or multiple data words (16-bit).
➢ START: specifies the name and starting address of the program.
➢ END: marks the end of the source program and (optionally) specifies the first executable
instruction in the program.
➢ WORD: generates a one-word integer constant.
➢ RESB: reserves the indicated number of bytes for a data area.

Assembler algorithm and data structures


The simple assembler uses two major internal data structures:

Operation code table (OPTAB): It stores the mnemonic operation codes and their corresponding
numeric values. OPTAB must contain (at least) the mnemonic operation code and its machine language
equivalent. In more complex assemblers, this table also contains information about instruction
format and length.

➢ During Pass 1, OPTAB is used to look up and validate operation codes in the source
program.
➢ In Pass 2, it is used to translate the operation codes to machine language.

OPTAB is usually organized as a hash table, with mnemonic operation code as the key. In most
cases, OPTAB is a static table – that is, entries are not normally added to or deleted from it.

Symbol table (SYMTAB): It stores the symbols (labels) defined by the programmer and their
corresponding numeric values. SYMTAB includes the name and value (address) for each label in the
source program, together with flags to indicate error conditions (e.g., a symbol defined in two
different places).

➢ During Pass 1, labels are entered into SYMTAB as they are encountered in the source
program, along with their assigned addresses (from LOCCTR).
➢ During Pass 2, symbols used as operands are looked up in SYMTAB to obtain the
addresses to be inserted in the assembled instruction.

SYMTAB is usually organized as a hash table for efficiency of insertion and retrieval.

Location counter (LOCCTR): It is a variable that holds the address at which the current
instruction will be stored, and it is used to help in the assignment of addresses. LOCCTR is
initialized to the address given in the START statement and, after each source statement is
processed, is advanced by the length of the assembled instruction or data area. Whenever a label
in the source program is read, the current value of LOCCTR gives the address to be associated with
that label.

Certain information (such as location counter values and error flags for statements) can or
should be communicated between the two passes. For this reason, Pass 1 usually writes an
intermediate file that contains each source statement together with its assigned address, error
indicators, etc. This file is used as the input to Pass 2.
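
The interplay of OPTAB, SYMTAB, LOCCTR and the intermediate file can be sketched in Python. This is a minimal illustration under several assumptions, not a full SIC assembler: the source is given as (label, opcode, operand) tuples, only a few opcodes and directives are supported, indexed addressing is ignored, the intermediate file is modelled as an in-memory list, and the sample program and function names are hypothetical.

```python
# A small subset of the SIC operation code table (mnemonic -> opcode).
OPTAB = {"LDA": 0x00, "STA": 0x0C, "ADD": 0x18, "SUB": 0x1C, "RSUB": 0x4C}


def pass1(source):
    """Assign addresses to all statements and build SYMTAB."""
    symtab, intermediate, locctr = {}, [], 0
    for label, opcode, operand in source:
        if opcode == "START":
            locctr = int(operand, 16)          # starting address is given in hex
            intermediate.append((locctr, label, opcode, operand))
            continue
        if opcode == "END":
            intermediate.append((locctr, label, opcode, operand))
            break
        if label:
            if label in symtab:
                raise ValueError(f"duplicate symbol: {label}")
            symtab[label] = locctr             # label gets the current LOCCTR value
        intermediate.append((locctr, label, opcode, operand))
        if opcode in OPTAB or opcode == "WORD":
            locctr += 3                        # every SIC instruction or word is 3 bytes
        elif opcode == "RESW":
            locctr += 3 * int(operand)
        elif opcode == "RESB":
            locctr += int(operand)
        else:
            raise ValueError(f"invalid operation code: {opcode}")
    return symtab, intermediate


def pass2(symtab, intermediate):
    """Translate opcodes and look up operand addresses (direct addressing only)."""
    object_code = []
    for address, _label, opcode, operand in intermediate:
        if opcode in OPTAB:
            target = symtab[operand] if operand else 0   # KeyError means undefined symbol
            object_code.append((address, f"{OPTAB[opcode]:02X}{target:04X}"))
        elif opcode == "WORD":
            object_code.append((address, f"{int(operand) & 0xFFFFFF:06X}"))
    return object_code


source = [
    ("SUM", "START", "1000"),
    ("FIRST", "LDA", "ALPHA"),
    (None, "ADD", "INCR"),
    (None, "STA", "BETA"),
    (None, "RSUB", None),
    ("ALPHA", "WORD", "5"),
    ("INCR", "WORD", "2"),
    ("BETA", "RESW", "1"),
    (None, "END", "FIRST"),
]
symtab, intermediate = pass1(source)
object_code = pass2(symtab, intermediate)
for address, code in object_code:
    print(f"{address:04X}  {code}")
```

Two passes are needed because an instruction such as LDA ALPHA refers to a label that is defined later in the program; Pass 1 collects all label addresses so that Pass 2 can resolve such forward references.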

Machine Dependent Assembler Features

Instruction formats and addressing modes

Assembly Instruction Format:

Each assembler has its own syntactic rules, such as requiring upper case or lower case, or
requiring colons after label definitions, but here we discuss the features that assemblers have in
common. The assembly text is usually divided into fields, separated by spaces and tabs. The format
of a typical line of an assembly language program is:

Label : Mnemonic Operand1, Operand2 ; Comment

The first field, which is optional, is the label field, used to specify symbolic labels. A label is an
identifier that is assigned to the address of the first byte of the instruction in which it appears. As
mentioned earlier, the presence of a label is optional, but if present, the label provides a symbolic
name that can be used in branch instructions to branch to the instruction.

The second field is the mnemonic, which is compulsory. All instructions must contain a mnemonic.
The third and following fields are operands. The presence of the operands depends on the
instruction. Some instructions have no operands, some have one, and some have two. If there are
two operands, they are separated by a comma.

The last field is the comment field. It begins with a delimiter such as a semicolon and continues
to the end of the line. Comments are for the reader's benefit: they tell us what the program is
trying to accomplish. The format line shown above is that of a typical 8086 assembly language
instruction.

Processor Registers
Processor operations mostly involve processing data. This data can be stored in memory and
accessed from there. However, reading data from and storing data into memory slows down the
processor, as it involves sending the data request across the control bus to the memory storage
unit and getting the data back through the same channel.

To speed up the processor operations, the processor includes some internal memory storage
locations, called registers. The registers store data elements for processing without having to
access the memory. A limited number of registers are built into the processor chip.

There are ten 32-bit and six 16-bit processor registers in IA-32 architecture. The registers are
grouped into three categories −

• General registers,
• Control registers, and
• Segment registers.

The general registers are further divided into the following groups −

• Data registers,
• Pointer registers, and
• Index registers.

Data Registers

Four 32-bit data registers are used for arithmetic, logical, and other operations. These 32-bit
registers can be used in three ways −

• As complete 32-bit data registers: EAX, EBX, ECX, EDX.


• Lower halves of the 32-bit registers can be used as four 16-bit data registers: AX, BX, CX
and DX.
• Lower and higher halves of the above-mentioned four 16-bit registers can be used as eight
8-bit data registers: AH, AL, BH, BL, CH, CL, DH, and DL.
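
The relationship between the 32-bit, 16-bit and 8-bit views of a data register can be illustrated with bit masks in Python. The helper `split_register` and the sample value are hypothetical; the same masks apply to EBX, ECX and EDX.

```python
def split_register(eax):
    """Return the 32-bit, 16-bit and 8-bit views of one data register."""
    eax &= 0xFFFFFFFF
    ax = eax & 0xFFFF          # lower half of EAX
    return {
        "EAX": eax,
        "AX": ax,
        "AH": (ax >> 8) & 0xFF,  # higher half of AX
        "AL": ax & 0xFF,         # lower half of AX
    }


views = split_register(0x12345678)
for name, value in views.items():
    print(name, hex(value))
```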

Some of these data registers have specific use in arithmetical operations.

AX is the primary accumulator; it is used in input/output and most arithmetic instructions. For
example, in a multiplication operation, one operand is stored in the EAX, AX, or AL register,
according to the size of the operand.

BX is known as the base register, as it can hold a base address in indexed addressing.

CX is known as the count register, as the ECX and CX registers store the loop count in iterative
operations.

DX is known as the data register. It is also used in input/output operations, and it is used
together with the AX register for multiply and divide operations involving large values.

Pointer Registers

The pointer registers are 32-bit EIP, ESP, and EBP registers and corresponding 16-bit right
portions IP, SP, and BP. There are three categories of pointer registers −

• Instruction Pointer (IP) − The 16-bit IP register stores the offset address of the next
instruction to be executed. IP in association with the CS register (as CS:IP) gives the
complete address of that instruction in the code segment.
• Stack Pointer (SP) − The 16-bit SP register provides the offset value within the program
stack. SP in association with the SS register (SS:SP) refers to the current position of data or
address within the program stack.
• Base Pointer (BP) − The 16-bit BP register mainly helps in referencing the parameter
variables passed to a subroutine. The address in the SS register is combined with the offset in
BP to get the location of the parameter. BP can also be combined with DI and SI as a base
register for special addressing.

Index Registers

The 32-bit index registers, ESI and EDI, and their 16-bit rightmost portions, SI and DI, are used
for indexed addressing and sometimes in addition and subtraction. There are two index
registers −

• Source Index (SI) − It is used as the source index for string operations.
• Destination Index (DI) − It is used as the destination index for string operations.

Segment Registers

Segments are specific areas defined in a program for containing data, code and stack. There are
three main segments −

• Code Segment − It contains all the instructions to be executed. A 16-bit Code Segment
register or CS register stores the starting address of the code segment.
• Data Segment − It contains data, constants and work areas. A 16-bit Data Segment register
or DS register stores the starting address of the data segment.
• Stack Segment − It contains data and return addresses of procedures or subroutines. It is
implemented as a 'stack' data structure. The Stack Segment register or SS register stores the
starting address of the stack.

Assembly - Addressing Modes


The term addressing mode refers to the way in which the operand of an instruction is specified.
The addressing mode specifies a rule for interpreting or modifying the address field of the
instruction before the operand is actually referenced. Addressing modes are the different ways by
which the CPU can access data or operands; they determine how a specific memory address is
computed. To copy data between memory and registers, the MOV instruction is used. The syntax of
the MOV instruction is:
MOV Destination, Source

It copies the data of the 2nd operand (source) into the 1st operand (destination). To access
memory, segment registers are used along with general-purpose registers. The MOV instruction may
have one of the following five forms −

MOV register, register


MOV register, immediate
MOV memory, immediate
MOV register, memory
MOV memory, register

Please note that −

• Both the operands in MOV operation should be of same size


• The value of source operand remains unchanged
There are seven addressing modes in 8086 processor. Now, we will discuss all of them in detail
with example assembly instructions.

1. Register addressing mode
This mode involves the use of registers, which hold the operands. It is very fast compared to the
other modes because the CPU doesn't need to access memory and can perform the operation directly
on registers.
For example:
MOV AX, BX
MOV AL, BL
The first instruction copies the 16-bit contents of BX to AX, and the second copies the 8-bit
contents of BL to AL. An instruction such as MOV AX, BL is not allowed, because both operands must
be of the same size.

2. Immediate Addressing Mode

In this mode, there are two operands: one is a register (or memory location) and the other is a
constant value. The constant is part of the instruction itself and immediately follows the opcode.
For example:
• The instruction MOV AX, 30H copies the hexadecimal value 30H to register AX.
• The instruction MOV BX, 255 copies the decimal value 255 to register BX.
You cannot use the immediate addressing mode to load an immediate value into a segment register.
To move a value into a segment register, first load that value into a general-purpose register and
then copy it into the segment register.

3. Direct Addressing Mode

It loads data from memory into a register or stores data from a register into memory. The
instruction consists of a register and an offset address. To compute the physical address, shift
the DS register left by 4 bits (one hexadecimal digit) and add the offset address to it.
MOV CX, [481]
The hexadecimal value of 481 is 1E1. Assume DS = 2162H; then the logical address is 2162:01E1.
Shifting DS left by one hexadecimal digit gives 21620H, so the physical address is
21620H + 1E1H = 21801H. Hence, after execution of the MOV instruction, the contents of memory
location 21801H are loaded into the register CX. Conversely, the instruction MOV [481], CX stores
the contents of the CX register to memory location 21801H.
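
The segment:offset calculation can be checked with a short Python sketch. The function name is hypothetical; the code assumes the 8086 real-mode rule, in which the 16-bit segment value is shifted left by 4 bits and the result is limited to a 20-bit address.

```python
def physical_address(segment, offset):
    """Return the 20-bit physical address for a segment:offset pair."""
    # Shifting left by 4 bits is the same as appending one hexadecimal zero.
    return ((segment << 4) + offset) & 0xFFFFF


# DS = 2162H and offset 481 (decimal) = 1E1H, as in the MOV CX, [481] example.
print(hex(physical_address(0x2162, 481)))
```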

4. Register Indirect Addressing Mode

The register indirect addressing mode uses an offset address that resides in one of three
registers: BX, SI, or DI. The physical address is the sum of the offset address and the DS value
shifted left by one hexadecimal digit (4 bits). For example:

MOV AL, [SI]

This instruction calculates the physical address by shifting DS left by one hexadecimal digit and
adding the offset address residing in SI. The brackets around SI indicate that SI contains the
offset address of the memory location whose data is to be accessed. If the brackets are absent,
the operand is the SI register itself rather than a memory location, so the brackets are necessary.
5. Based Relative Addressing Mode
This addressing mode uses a base register, either BX or BP, and a displacement value to calculate
the physical address.
Physical address = Segment register (shifted left by one hexadecimal digit) + Effective address
The effective address is the sum of the base register and the displacement value. The default
segments for BX and BP are DS and SS, respectively. For example:

MOV [BX+5], DX
In this example, the effective address is BX + 5 and the physical address is DS (shifted left) +
BX + 5. On execution, the instruction copies the value of DX to the memory location at that
physical address.
6. Indexed Relative Addressing Mode
This addressing mode is the same as the based relative addressing mode, except that it uses the
DI and SI registers instead of the BX and BP registers.

For example:
Given that DS = 704H, SI = 2B2H, DI = 145H (all values, including the displacements below, are
hexadecimal)
MOV [DI]+12, AL
On execution, this instruction copies the contents of AL to memory address 7197H
(7040H + 145H + 12H).
MOV BX, [SI]+10
This instruction loads the contents of memory address 7302H (7040H + 2B2H + 10H) into register
BX.
7. Based Indexed Addressing Mode

The based indexed addressing mode is actually a combination of based relative addressing mode
and indexed relative addressing mode. It uses one base register (BX, BP) and one index register
(SI, DI). For example:

MOV AX, [BX+SI+20]


The above instruction can also be written as:

MOV AX, [SI+BX+20]


Or
MOV AX, [SI][BX]+20
In this case, the physical address is DS (shifted left) + SI + BX + 20. If we replace BX with BP,
the physical address becomes SS (shifted left) + SI + BP + 20.
Now consider these two instructions:
MOV AX, [BX][BP]+20
MOV AX, [SI][DI]+20
Both instructions are illegal because this mode requires exactly one base register and one index
register: the first uses two base registers and the second uses two index registers.
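The legality rule and the choice of default segment can be expressed as a small check. The following Python sketch is illustrative only; the function name default_segment and the error message are assumptions.

```python
BASE_REGISTERS = {"BX", "BP"}
INDEX_REGISTERS = {"SI", "DI"}


def default_segment(registers):
    """Return the default segment register for a based indexed operand.

    Raises ValueError unless exactly one base and one index register are given.
    """
    regs = [r.upper() for r in registers]
    bases = [r for r in regs if r in BASE_REGISTERS]
    indexes = [r for r in regs if r in INDEX_REGISTERS]
    if len(regs) != 2 or len(bases) != 1 or len(indexes) != 1:
        raise ValueError(f"illegal register combination: {regs}")
    # BP addresses the stack segment by default; BX addresses the data segment.
    return "SS" if bases[0] == "BP" else "DS"


print(default_segment(["BX", "SI"]))  # DS
print(default_segment(["SI", "BP"]))  # SS
for bad in (["BX", "BP"], ["SI", "DI"]):
    try:
        default_segment(bad)
    except ValueError as err:
        print(err)
```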
Program Relocation
In reality, the assembler does not know the actual location where the program will be loaded.
However, the assembler can identify for the loader those parts of the object program that need
modification. An object program that contains the information necessary to perform this kind of
modification is called a relocatable program.
The solution to the relocation problem:
1) When the assembler generates the object code for the JSUB instruction, it inserts the address
of RDREC relative to the start of the program. (This is the reason we initialized the location
counter to 0 for the assembly.)
2) The assembler also produces a command for the loader, instructing it to add the beginning
address of the program to the address field in the JSUB instruction at load time.

Note that the length field of a modification record is stored in half-bytes (rather than bytes)
because the address field to be modified may not occupy an integral number of bytes. For example,
the address field in the +JSUB instruction occupies 20 bits. The starting location field of a
modification record is the location of the byte containing the leftmost bits of the address field
to be modified. If this address field occupies an odd number of half-bytes, it is assumed to begin
in the middle of the first byte at the starting location.
Example: the modification record for the +JSUB instruction would be “M00000705”. This record
specifies that the beginning address of the program is to be added to a field that begins at address
000007 (relative to the start of the program) and is 5 half-bytes in length. Thus, in the assembled
instruction 4B101036, the first 12 bits (4B1) will remain unchanged. The program load address
will be added to the last 20 bits (01036) to produce the correct operand address.
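What the loader does with such a record can be sketched in Python. The function name apply_modification, the zero padding that stands in for the code before the instruction, and the load address 5000 (hexadecimal) are assumptions made for illustration; the record layout (M, six hex digits of starting address, two hex digits of length in half-bytes) is read off the example above.

```python
def apply_modification(image: bytearray, record: str, load_address: int) -> None:
    """Apply one modification record (e.g. 'M00000705') to a program image in place."""
    if not record.startswith("M") or len(record) != 9:
        raise ValueError(f"not a modification record: {record!r}")
    start = int(record[1:7], 16)        # byte holding the leftmost bits of the field
    half_bytes = int(record[7:9], 16)   # field length in half-bytes
    n_bytes = (half_bytes + 1) // 2     # bytes that contain the field
    mask = (1 << (4 * half_bytes)) - 1  # selects the address field itself

    word = int.from_bytes(image[start:start + n_bytes], "big")
    untouched = word & ~mask            # leading half-byte when the length is odd
    relocated = ((word & mask) + load_address) & mask
    image[start:start + n_bytes] = (untouched | relocated).to_bytes(n_bytes, "big")


# +JSUB assembled as 4B101036; its address field starts in the byte at 000007,
# so the instruction itself starts at relative address 000006.
image = bytearray(6) + bytearray.fromhex("4B101036")
apply_modification(image, "M00000705", load_address=0x5000)  # hypothetical load address
print(image[6:10].hex().upper())  # 4B106036
```

The first 12 bits (4B1) are preserved, and only the 20-bit address field changes from 01036 to 06036.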

Machine independent assembler features


Literals
It is often convenient for the programmer to be able to write the value of a constant operand as a
part of the instruction that uses it. Such an operand is called a literal. A literal is identified
with the prefix =, which is followed by a specification of the literal value.
Example: in the statement
45 001A ENDFIL LDA =C'EOF' 032010
the literal =C'EOF' specifies a 3-byte operand with the value 'EOF'.
It is important to understand the difference between a literal and immediate operand.
1. With immediate addressing, the operand value is assembled as part of the machine
instruction.
2. With a literal, the assembler generates the specified value as a constant at some other
memory location. The address of this generated constant is used as target address for the
machine instruction.
All of the literal operands used in a program are gathered together into one or more literal pools.
Normally literals are placed into a pool at the end of the program. In some cases, it is desirable to
place literals into a pool at some other location in the object program. To allow this, we introduce
the assembler directive LTORG.
1. When the assembler encounters a LTORG statement, it creates a literal pool that contains
all of the literal operands used since the previous LTORG (or the beginning of the
program).

2. This literal pool is placed in the object program at the location where the LTORG directive
was encountered
3. Of course, literals placed in a pool by LTORG will not be repeated in the pool at the end
of the program.
Most assemblers recognize duplicate literals – that is, the same literal used in more than one place
in the program – and store only one copy of the specified data value. For example, the literal
=X’05’ is used in our program on lines 215 and 230.

The basic data structure the assembler uses to handle literal operands is the literal table (LITTAB). For
each literal used, this table contains the literal name, the operand value and length, and the address
assigned to the operand when it is placed in a literal pool. LITTAB is often organized as a hash
table, using the literal name or value as the key. During pass 1, the assembler searches LITTAB
for the specified literal name (or value). If the literal is already present in the table, no action is
needed. If it is not present, the literal is added to LITTAB (leaving the address unassigned). During
pass 2, the operand address for use in generating object code is obtained by searching LITTAB for
each literal operand encountered.
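The LITTAB behaviour described above (enter a literal on first use, ignore duplicates, assign addresses when a pool is placed) can be sketched as follows. The class and method names are assumptions, a Python dict stands in for the hash table, and the starting LOCCTR value is hypothetical.

```python
class LiteralTable:
    """Toy LITTAB: literal name -> [value bytes, assigned address or None]."""

    def __init__(self):
        self.entries = {}  # dict keeps insertion order, so pools follow source order

    @staticmethod
    def _value(name: str) -> bytes:
        kind, text = name[1], name[3:-1]  # "=C'EOF'" -> 'C', "EOF"
        if kind == "C":
            return text.encode("ascii")
        if kind == "X":
            return bytes.fromhex(text)
        raise ValueError(f"unsupported literal: {name}")

    def add(self, name: str) -> None:
        """Pass 1: record a literal on first use; duplicates need no action."""
        if name not in self.entries:
            self.entries[name] = [self._value(name), None]

    def place_pool(self, locctr: int) -> int:
        """LTORG or END: give unplaced literals addresses, return the new LOCCTR."""
        for entry in self.entries.values():
            if entry[1] is None:
                entry[1] = locctr
                locctr += len(entry[0])
        return locctr

    def address(self, name: str) -> int:
        """Pass 2: look up the operand address for a literal."""
        return self.entries[name][1]


littab = LiteralTable()
for literal in ("=C'EOF'", "=X'05'", "=X'05'"):
    littab.add(literal)
end = littab.place_pool(0x002D)  # hypothetical LOCCTR at the LTORG
print(len(littab.entries), hex(littab.address("=X'05'")), hex(end))  # 2 0x30 0x31
```

Because place_pool skips entries that already have an address, literals placed by an earlier LTORG are not repeated in a later pool.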
Symbol-Defining Statements
The user-defined symbols in assembler language programs appear as labels on instructions or data
areas. The value of such a label is the address assigned to the statement on which it appears. Most
assemblers provide an assembler directive that allows the programmer to define symbols and
specify their value. The assembler directive generally used is EQU. The general form:
symbol EQU value
This statement defines the given symbol (enters it into SYMTAB) and assigns to it the value
specified.
One common use of EQU is to establish symbolic names that can be used for improved readability
in place of numeric values. For example, the statement
+LDT #4096
can be replaced by
MAXLEN EQU 4096
+LDT #MAXLEN
When the assembler encounters the EQU statement, it enters MAXLEN into SYMTAB with the value 4096.
Another common use of EQU is in defining mnemonic names for registers.

For example:
A EQU 0
X EQU 1
L EQU 2
These statements cause the symbols A, X, L, ... to be entered into SYMTAB with their
corresponding values 0, 1, 2, ...
Another common assembler directive is ORG. Its form is
ORG value
where value is a constant or an expression involving constants and previously defined symbols.
When this statement is encountered during assembly of a program, the assembler resets its location
counter (LOCCTR) to the specified value. Since the values of symbols are taken from LOCCTR,
the ORG statement will affect the values of all labels defined until the next ORG.
Expressions
Most assemblers allow the use of expressions. Each such expression must be evaluated by the
assembler to produce a single operand address or value. Expressions are classified as either
absolute expressions or relative expressions. Relative: means relative to the beginning of the
program. Labels on instructions and data areas, and references to the location counter value, are
relative terms. Absolute: means independent of program location. A constant is an absolute term.

Note: A symbol whose value is given by EQU (or some similar assembler directive) may be either
an absolute term or a relative term depending on the expression used to define its value.

If relative terms occur in pairs and the terms in each such pair have opposite signs, then the
resulting expressions are absolute expressions. None of the relative terms may enter into a
multiplication or division operation. A relative expression is one in which all of the relative terms
except one can be paired as described above; the remaining unpaired relative term must have a
positive sign.
Example: in the statement
107 MAXLEN EQU BUFEND-BUFFER
both BUFEND and BUFFER are relative terms, each representing an address within the program.
However, the expression represents an absolute value: the difference between the two addresses.
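The pairing rule amounts to counting signed relative terms, which the following Python sketch does. It handles only + and - (multiplication and division are left out), and the function name classify and the test expressions other than BUFEND-BUFFER are assumptions for illustration.

```python
import re


def classify(expression: str, relative_symbols: set) -> str:
    """Classify a +/- expression as 'absolute', 'relative', or 'illegal'."""
    net = 0  # (relative terms with +) minus (relative terms with -)
    for sign, term in re.findall(r"([+-]?)\s*(\w+)", expression):
        if term in relative_symbols:
            net += -1 if sign == "-" else 1
    if net == 0:
        return "absolute"   # every relative term is paired with one of opposite sign
    if net == 1:
        return "relative"   # exactly one unpaired relative term, with positive sign
    return "illegal"


relative = {"BUFFER", "BUFEND"}  # labels, hence relative terms
print(classify("BUFEND-BUFFER", relative))  # absolute
print(classify("BUFFER+4095", relative))    # relative
print(classify("BUFEND+BUFFER", relative))  # illegal
print(classify("4096-BUFFER", relative))    # illegal
```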

One pass assembler and Multi pass assemblers

An assembler is a program that converts instructions written in low-level assembly code into
relocatable machine code and generates accompanying information for the loader. It generates
instructions by evaluating the mnemonics (symbols) in the operation field and finding the values
of symbols and literals to produce machine code. If the assembler does all this work in one
scan, it is called a single-pass assembler: it performs the whole conversion of assembly code
to machine code in one go. If it needs multiple scans, it is called a multiple-pass assembler.
A multiple-pass assembler first processes the assembly code and records the values it finds in
tables such as the symbol table and literal table. In the second step, it generates the machine
code using these tables. Here the assembler divides these tasks into two passes:
• Pass-1:
1. Define symbols and literals and remember them in symbol table and literal table
respectively.
2. Keep track of location counter
3. Process pseudo-operations
• Pass-2:
1. Generate object code by converting symbolic op-code into respective numeric op-code
2. Generate data for literals and look for values of symbols
First, we take a small assembly language program to understand what happens in each pass.
Assembly language statement format:

[Label] [Opcode] [operand]

Example: M ADD R1, ='3'


where, M - Label; ADD - symbolic opcode;
R1 - symbolic register operand; (='3') - Literal
Assembly Program:
Label   Op-code   Operand      LC value (location counter)
JOHN    START     200
        MOVER     R1, ='3'     200
        MOVEM     R1, X        201
L1      MOVER     R2, ='2'     202
        LTORG                  203
X       DS        1            204
        END                    205
Let's look at how this program works:

1. START: This directive specifies that the program starts at location 200; the label on START
provides the name of the program (JOHN).
2. MOVER: It moves the value of the literal (='3') into register operand R1.
3. MOVEM: It moves the content of the register into memory operand X.
4. MOVER: It again moves the value of a literal (='2') into register operand R2; its label is
L1.
5. LTORG: It assigns an address to literals (the current LC value).
6. DS (Data Space): It reserves a data space of 1 for symbol X.
7. END: It marks the end of the program.
Working of Pass-1: Define Symbol and literal table with their addresses.
Note: Literal address is specified by LTORG or END.

Symbols Address
X 204
L1 202

Literals Address
='3' 203
='2' 205

Now tables generated by pass 1 along with their LC value will go to pass-2 of assembler
for further processing of pseudo-opcodes and machine op-codes.
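The location-counter bookkeeping of Pass 1 can be sketched in Python. This sketch covers only the LC column and the symbol table shown above; assigning addresses to literals is deliberately left out. It assumes, following the LC column in the listing, that every statement after START (including the LTORG and END lines) accounts for one word and that DS n reserves n words. The name pass_one and the tuple layout are illustrative.

```python
PROGRAM = [
    # (label, opcode, operand)
    ("JOHN", "START", "200"),
    (None, "MOVER", "R1, ='3'"),
    (None, "MOVEM", "R1, X"),
    ("L1", "MOVER", "R2, ='2'"),
    (None, "LTORG", ""),
    ("X", "DS", "1"),
    (None, "END", ""),
]


def pass_one(program):
    """Return (symbol table, list of (LC, opcode, operand)) for the toy program."""
    symtab, listing = {}, []
    lc = 0
    for label, opcode, operand in program:
        if opcode == "START":
            lc = int(operand)          # sets the initial LC; the label names the program
            continue
        if label:
            symtab[label] = lc         # a label takes the current LC as its address
        listing.append((lc, opcode, operand))
        lc += int(operand) if opcode == "DS" else 1  # DS n reserves n words
    return symtab, listing


symtab, listing = pass_one(PROGRAM)
print(symtab)  # {'L1': 202, 'X': 204}
```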

Working of Pass-2:
Pass-2 of the assembler generates machine code by converting symbolic machine opcodes into
their respective bit configurations (machine-understandable form). It looks up machine opcodes
in the MOT (machine op-code table), which holds each symbolic code with its length and bit
configuration. It also processes pseudo-ops, which are held in the POT (pseudo-op table).
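The table lookups of Pass 2 can be sketched as follows. The numeric opcodes in MOT and the register codes are hypothetical values chosen for illustration (the notes do not give them); SYMTAB and LITTAB are copied from the Pass-1 tables above, and the function name encode is an assumption.

```python
MOT = {"MOVER": 0x04, "MOVEM": 0x05}     # hypothetical numeric opcodes
REGISTERS = {"R1": 1, "R2": 2}            # hypothetical register codes
SYMTAB = {"X": 204, "L1": 202}            # from the Pass-1 symbol table
LITTAB = {"='3'": 203, "='2'": 205}       # from the Pass-1 literal table


def encode(opcode: str, register: str, operand: str) -> tuple:
    """Translate one machine instruction into (opcode, register code, address)."""
    table = LITTAB if operand.startswith("=") else SYMTAB
    return MOT[opcode], REGISTERS[register], table[operand]


source = [
    ("MOVER", "R1", "='3'"),
    ("MOVEM", "R1", "X"),
    ("MOVER", "R2", "='2'"),
]
for statement in source:
    print(encode(*statement))
# (4, 1, 203)
# (5, 1, 204)
# (4, 2, 205)
```

Each symbolic field is replaced by a number taken from a table: the opcode from MOT, the register from the register codes, and the operand address from SYMTAB or LITTAB.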
