Block-4 Microprocessor and Advanced Architectures
Computer Organisation
Indira Gandhi
National Open University
School of Computer and
Information Sciences
Block
4
Microprocessor and Advanced
Architectures
UNIT 13
Microprocessor Architecture
UNIT 14
Introduction to Assembly Language Programming
UNIT 15
Assembly Language Programming
UNIT 16
Advanced Architectures
FACULTY OF THE SCHOOL
Prof P. V. Suresh, Director Prof. V. V. Subrahmanyam
Dr Akshay Kumar Mr Mangala Prasad Mishra
Dr Sudhansh Sharma
PRINT PRODUCTION
August, 2021
© Indira Gandhi National Open University, 2021
All rights reserved. No part of this work may be reproduced in any form, by mimeograph or any other
means, without permission in writing from the Indira Gandhi National Open University.
Further information on the Indira Gandhi National Open University courses may be obtained from the
University’s office at Maidan Garhi, New Delhi-110068.
Printed at :
BLOCK 4 INTRODUCTION
In the previous blocks, we discussed computer organisation, number systems, memory and input/output
organisation, the instruction set of a computer, addressing modes, micro-operations and the control unit of a computer
system.
This block presents a microprocessor as an example of computer architecture. We will discuss the
microprocessor architecture and its programming. Our main emphasis will be on the Intel 8086/8088 microprocessor.
Newer microprocessors may use the concepts covered here for the 8086 microprocessor.
This block is divided into four units. We start with an introduction to microprocessors, with special emphasis on the 8086
microprocessor. Unit 13 also gives a brief introduction to the instruction set and the addressing modes of the 8086
microprocessor. Taking this as the base, Unit 14 introduces Assembly Language Programming. In that
unit, we also give a brief account of the various tools required to develop and execute an Assembly Language Program.
Unit 15 takes up a detailed study of Assembly Language and its programming techniques, along with several examples.
Unit 16 presents a brief discussion on some of the advanced architectures.
This block gives you only some details about the 8086 microprocessor and assembly language programming. For complete
details on this Intel series of microprocessors, you may refer to the further readings. You may also study the
advanced architectures from the further readings given in Block 1.
UNIT 13 MICROPROCESSOR ARCHITECTURE
Structure
13.0 Introduction
13.1 Objectives
13.2 Structure of 8086 CPU
13.2.1 Bus Interface Unit
13.2.2 Execution Unit
13.2.3 Register Set
13.3 Instruction Set of 8086
13.3.1 Data Transfer Instructions
13.3.2 Arithmetic Instructions
13.3.3 Bit Manipulation Instructions
13.3.4 Program Execution Transfer Instructions
13.3.5 String Instructions
13.3.6 Processor Control Instructions
13.4 Addressing Modes
13.4.1 Register Addressing Mode
13.4.2 Immediate Addressing Mode
13.4.3 Direct Addressing Mode
13.4.4 Indirect Addressing Mode
13.5 Summary
13.6 Solutions/Answers
13.0 INTRODUCTION
In the previous three blocks of this course, you have gone through the concepts of data
representation, logic circuits, memory and I/O organisation, instruction set
architecture, micro-operations, control unit etc. of a computer system. The processor
of a general purpose computer has an instruction set, which uses a set of
addressing modes. The control unit of a processor uses a set of registers and an arithmetic
logic unit to process these instructions. This unit presents the details of a microprocessor
in the context of all the above concepts. We have selected a simple microprocessor,
the 8086, for the discussion. Although the processor technology is old, the concepts
remain valid for the higher end Intel processors, which are commonly referred to as the x86
family. This block does not attempt to make you an expert assembly programmer;
however, you will be able to write reasonably good assembly programs. This unit
discusses the 8086 microprocessor in some detail. It introduces you to the
block diagram of the components of the 8086 microprocessor. This is followed by a discussion
on the register organisation of this processor. Some useful instructions and
addressing modes of this processor are also discussed in this unit. Please note that the
concepts discussed in this unit may be useful in writing good assembly programs.
13.1 OBJECTIVES
After going through this unit, you should be able to:
explain the role of the various components of the 8086 microprocessor;
illustrate the use of segmentation in the 8086 microprocessor;
use some of the important instructions of the 8086 microprocessor; and
illustrate the use of the different types of addressing modes of the 8086 microprocessor.
13.2 STRUCTURE OF 8086 MICROPROCESSOR
A microprocessor contains one or more processing units on a single chip. Today's
processors contain multiple processing cores in a single chip and are therefore called
multi-core processors. A computer system consists of a microprocessor, memory unit
and input/output interfaces, internal and external connection structures, such as buses,
and several input/output devices. The bus size of a processor is a very important
design parameter. For example, the address bus of a processor generally
determines the maximum size of the physical main memory. The data bus determines the size of
the data that can be transferred between the memory and the processor registers in one transfer.
The size of the address bus of the 8086 microprocessor is 20 bits and that of the data bus is 16 bits.
Thus, the 8086 microprocessor can address 2^20 bytes = 1 MB of base memory. Of this memory,
about 640 KB was part of base RAM and the remaining was used as ROM.
The 8086 microprocessor was designed as a complex instruction set computer with the
basic objectives of supporting more instructions, more addressing modes and more
throughput. Present day multi-core processors are far more powerful than the 8086
microprocessor, but the objective of this block is to introduce some of the basic features of
a microprocessor and assembly language programming. For this basic discussion, this
processor is a good example.
A microprocessor executes a sequence of machine instructions, which can be
represented as the following notional program:
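As a rough illustration only, such a fetch-execute cycle might be sketched in Python as shown below. The three-instruction machine (LOAD, ADD, HALT) and the function name run are assumptions made for this sketch; they are not part of the 8086 instruction set.

```python
# Toy fetch-execute loop. The three-instruction "machine" is hypothetical
# and is used only to show the cycle; it is not the 8086 instruction set.
def run(program):
    """Execute (opcode, operand) pairs until HALT; return the accumulator."""
    acc = 0  # a single accumulator register
    ip = 0   # instruction pointer: index of the next instruction
    while True:
        opcode, operand = program[ip]  # fetch the instruction
        ip += 1                        # point to the next instruction
        if opcode == "LOAD":           # decode and execute
            acc = operand
        elif opcode == "ADD":
            acc += operand
        elif opcode == "HALT":
            return acc
        else:
            raise ValueError(f"unknown opcode: {opcode}")


result = run([("LOAD", 5), ("ADD", 7), ("HALT", None)])
print(result)  # 12
```

The loop fetches one instruction, advances the instruction pointer and then executes the instruction, which is the order in which a processor normally works.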
The two units of the 8086 CPU, the Bus Interface Unit (BIU) and the Execution Unit (EU), can
function independently; therefore, they can work as a two-stage
instruction pipeline. The components of these two units, as shown in Figure 13.1, are
explained in the following sections.
This unit then reads or writes the information from the physical address as
computed above.
o If an instruction is fetched, it is stored in the instruction stream queue.
o Data is fetched into a general purpose register.
o In case of writing, the data value of a selected register is written into a
desired memory location or I/O port.
An instruction queue is useful only if more than one instruction is fetched
at a time, which may be used for instruction pipelining involving the stages of
instruction fetch and instruction execution.
The BIU of this microprocessor has four segment registers, namely CS: Code
Segment register, DS: Data Segment register, SS: Stack Segment register, and ES:
Extra Segment register. All of these registers are 16 bits long.
Why segmentation? Segmentation divides the 1 MB memory of the computer associated with this
microprocessor into logical, possibly overlapping, segments of up to 64 KB each. A program can have
several code, data and stack segments. However, a maximum of four segments, one
of each type, are available for accessing data and instructions at a specific
time, as there are only four segment registers. Thus, a program can consist of logical
segments of code, data, stack etc., and the address of a data byte stored in the memory
consists of a Segment Register (16 bits): Offset Register (16 bits) pair.
How is this segmented addressing better than fetching a two-word address? In the segmented
scheme, an address included in an instruction consists of only the 16 bit offset;
thus, a segment can have a maximum size of 2^16 bytes = 64 KB only. In addition, as
the size of a segment register is 16 bits, there can be 2^16 = 65536
segments. Please note that these segments will be overlapping, as the size of base
memory for this processor is only 1 MB. Figure 13.2 shows the memory organisation
of the 8086 microprocessor. A segment register is loaded with the starting address of the current
segment, and the offset is used to locate data within that segment. Thus, an instruction
needs to store a 16 bit address only. The address adder computes the 20 bit physical
address from the Segment Register (16 bits): Offset Register (16 bits) pair. Please note that
all the content in Figure 13.2 is in hexadecimal notation.
Figure 13.2 shows two hypothetical segments (just for illustration) in the 1 MB
memory using hexadecimal notation. Assume that the data of the first segment is 30 bytes long;
thus, it can be accommodated in a segment of size 32 bytes. Please note that the
segment start address for this segment is 0000h and the offsets of its locations are 0000h
to 001Fh. Therefore, the second segment can start from the physical memory address
00020h. The second segment is assumed to be of 64 KB, starting from physical
memory address 00020h. Therefore, it has the segment starting address 0002h and offset values
from 0000h to FFFFh. An interesting fact about the memory of the 8086 processor is that,
although every single byte has an address, two bytes are transferred through the data bus
in a single memory access. For example, access to an even memory offset 0000h will
transfer the bytes at offsets 0000h and 0001h. However, in case you try to access an odd
memory offset like 0013h, the bytes at offsets 0012h and 0013h would be transferred to the
processor.
Now, the question is: given the segment starting address of 16 bits and the segment offset
of 16 bits, how will you compute the physical address? The designers of the 8086 used an
address adder to compute the physical address. The addition is performed as follows:
Given: Segment Address 0002h; and offset, say, 0001h
Physical address is computed by shifting the segment address to the left by one
hexadecimal digit (appending 0 as the lowest hexadecimal digit) and adding the
offset to the result.
The Segment Address (hexadecimal)                        0 0 0 2
Shift left and add zero in least significant digit    0 0 0 2 0
Add the offset                                           0 0 0 1
Resulting 20 bit physical address                     0 0 0 2 1
Given: Segment Address 0002h; and offset, say, FFFFh. The physical address will be
computed as:
The Segment Address (hexadecimal)                        0 0 0 2
Shift left and add zero in least significant digit    0 0 0 2 0
Add the offset                                           F F F F
Resulting 20 bit physical address                     1 0 0 1 F
Please note that F+2 is 15+2=17, so the sum digit is 1 and the carry is 1. Also,
when you add the carry 1 to F, it is 16, so the sum digit is 0 and the carry is 1.
(b) Stack Segment (SS) register and Stack Pointer (SP) register, which points to the
top of the stack in the stack segment, to compute the address of the top of the
stack. The following example explains their use.
An assumed Stack Segment (SS) Address (hexadecimal)     F 1 1 D
Shift left and add zero in least significant digit    F 1 1 D 0
Assume that SP contains an offset 0110h                 0 1 1 0
Resulting 20 bit physical address                     F 1 2 E 0
(c) Data Segment (DS) register and offset to compute the address of the data to be
fetched. The following example explains their use.
An assumed Data Segment (DS) Address (hexadecimal)      A 5 8 3
Shift left and add zero in least significant digit    A 5 8 3 0
Assume that data offset is A021h                        A 0 2 1
Resulting 20 bit physical address                     A F 8 5 1
(d) Extra Segment (ES) register and offset to compute the address of the extra data
segment (in case two data segments are used at the same time).
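The work of the address adder in the examples above can be modelled with a short Python sketch. The function name physical_address is an assumption of this sketch; the arithmetic (shift the segment left by one hexadecimal digit, i.e. 4 bits, and add the offset) is as described in the text.

```python
def physical_address(segment, offset):
    """Return the 20-bit physical address for a 16-bit segment:offset pair."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    # Shifting left by 4 bits appends one hexadecimal 0 digit to the segment.
    # The mask keeps 20 bits, modelling the 20-bit address bus.
    return ((segment << 4) + offset) & 0xFFFFF


# The four segment:offset pairs worked out in the text.
for seg, off in [(0x0002, 0x0001), (0x0002, 0xFFFF),
                 (0xF11D, 0x0110), (0xA583, 0xA021)]:
    print(f"{seg:04X}:{off:04X} -> {physical_address(seg, off):05X}")
```

The sketch also shows why segments overlap: the pairs 0000h:0020h and 0002h:0000h both give the physical address 00020h.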
Control Circuitry for Instruction Decode and operand specification and ALU
The 8086 processor uses a micro-programmed control unit, which decodes the
instruction and executes it as per the micro-program stored in the control memory.
The control unit is also responsible for generating the control timing sequences. ALU
performs the operation on the data as instructed by the control unit.
Registers
The 8086 has several kinds of registers, which include general purpose registers, special purpose
registers and a flags register. The next section explains the role of the different
registers of the 8086 microprocessor.
13.3.3 Register Set
The 8086 registers fall into five categories. The following table
explains the role of these registers, giving for each register its name, size and
special purpose, if any.
Segment Registers
   CS (16 bits): For storing the base address of the code segment.
   DS (16 bits): For storing the base address of the data segment.
   SS (16 bits): For storing the base address of the stack segment.
   ES (16 bits): For storing the base address of the extra data segment.
General Purpose Registers: These can be used for any computation; in addition, they are
used for specific purposes as stated in this table.
   AX (16 bits; it consists of two byte registers: AH, which stores the higher byte, and
   AL, which stores the lower byte): It is also called the accumulator register. It can
   store the results of addition or subtraction operations; for some instructions, like
   multiplication and division, it stores one of the operands.
   BX (16 bits; it consists of BH and BL): It is also called the base register. It
   stores the base location of a memory array.
   CX (16 bits; it consists of CH and CL): It is also called the counter register.
   It can be used for keeping count in looping instructions.
   DX (16 bits; it consists of DH and DL): This register can be used for I/O
   operations.
Pointer and Index Registers: These registers can also be used as general purpose registers.
   BP (16 bits): Base Pointer register, used in the stack segment.
   SI (16 bits): Source Index register, used in the data segment.
   DI (16 bits): Destination Index register, used in the extra data segment.
Special Register
   SP (16 bits): Stack Pointer register, points to the top of the stack.
Flags Register
   It is a 16 bit register, of which nine bits are used as flags. Each flag is 1 bit long,
   and the status flags are set by the last ALU operation. Some of the important flags are
   the Carry Flag (CF), Parity Flag (PF), Auxiliary Flag (AF), Zero Flag (ZF), Sign Flag (SF),
   Overflow Flag (OF), Interrupt Enable Flag (IF) and other control flags.
The following are some of the important functional groups of the 8086 instructions.
The maximum size of this stack segment is 0100h bytes, having offsets 0000h to
00FFh. The stack segment register value is not shown.
In the 8086 microprocessor, the stack grows from higher offset to lower offset.
The stack would be empty if SP contains 0100h. The stack is full when SP is
0000h.
The PUSH instruction first decrements the stack pointer by 2
(as the stack is a word stack and the offset is the address of a byte), i.e. SP=SP-2,
and then the word operand of the PUSH instruction is put in the stack
locations pointed to by the SP.
The POP instruction moves the content at the stack location pointed to by the SP into the
destination register specified by the instruction. This is followed by
incrementing the stack pointer register value by 2, i.e. SP=SP+2.
PUSHF and POPF instructions: The PUSHF instruction is used for pushing the
current flags register on to the stack, while POPF pops the content at the top of the
stack to the flags register.
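The PUSH and POP behaviour described above can be sketched in Python as follows. The class name Stack8086 and its fields are assumptions of this sketch; the segment size of 0100h bytes is taken from the example above, and the low byte of each word is stored at the lower offset, as is done by the 8086.

```python
class Stack8086:
    """Word stack over a 0100h-byte stack segment (offsets 0000h to 00FFh)."""
    SIZE = 0x0100

    def __init__(self):
        self.memory = bytearray(self.SIZE)  # the bytes of the stack segment
        self.sp = self.SIZE                 # SP = 0100h means the stack is empty

    def push(self, word):
        if self.sp == 0:
            raise OverflowError("stack is full (SP = 0000h)")
        self.sp -= 2                                   # SP = SP - 2
        self.memory[self.sp] = word & 0xFF             # low byte at the lower offset
        self.memory[self.sp + 1] = (word >> 8) & 0xFF  # high byte follows

    def pop(self):
        if self.sp == self.SIZE:
            raise IndexError("stack is empty (SP = 0100h)")
        word = self.memory[self.sp] | (self.memory[self.sp + 1] << 8)
        self.sp += 2                                   # SP = SP + 2
        return word


stack = Stack8086()
stack.push(0x1234)
stack.push(0xABCD)
print(f"{stack.sp:04X}")     # 00FC
print(f"{stack.pop():04X}")  # ABCD
print(f"{stack.pop():04X}")  # 1234
```

Two pushes move SP from 0100h down to 00FCh, and the words are popped in the reverse order of pushing.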
Other data transfer instructions
There are a number of other data transfer instructions. These instructions and their
purposes are given in the following table:
MNEMONIC and DESCRIPTION
XCHG destination, source
   Exchanges the bytes or words of source and destination. At least
   one operand should be a register operand.
XLAT
   This is a complex instruction, which translates a byte in the AL
   register using a lookup table. This instruction uses the AL register
   as the operand. An example of this instruction is given in Unit 15.
LEA register, source
   This instruction loads the 16 bit effective address of the
   source operand into the specified register operand. This
   instruction is useful for array index manipulation.
IN accumulator, port address
   This instruction is used to transfer a byte or word from a
   specified input port to the accumulator register. The instruction can
   use the DX register as an implied operand for the port address. The port
   address can also be an immediate operand.
OUT port address, accumulator
   This instruction can be used to transfer a byte or word, which is
   in the accumulator register, to the specified output port address of an
   output device, such as a monitor or printer.
LDS/LES
   These instructions are used for loading the data segment/extra data
   segment register respectively, along with one specified register. Details
   on these instructions are beyond the scope of this unit.
LAHF/SAHF
   LAHF loads the low byte of the flags register into the AH register,
   while SAHF stores the value of the AH register into the low byte of the flags
   register.
Example:
Consider that the AL register has the ASCII digit '7' and BL contains the ASCII digit '6'. You want to
add these two values to get the answer 13 in decimal. One way would be to
convert these operands into binary, perform the addition and convert the result
back to the desired format. The other way is to use AAA as follows:
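The effect of adding the two ASCII digits and then applying AAA can be modelled in Python, as a sketch only. The function name add_then_aaa is an assumption; the adjustment rule follows the usual description of AAA (add 6 to AL and 1 to AH when the low nibble of AL exceeds 9 or the auxiliary flag is set, then clear the upper nibble of AL).

```python
def add_then_aaa(al, bl, ah=0):
    """Model ADD AL, BL followed by AAA; return (AH, AL, CF)."""
    # ADD AL, BL: 8-bit sum; AF is the carry out of the low nibble.
    af = ((al & 0x0F) + (bl & 0x0F)) > 0x0F
    al = (al + bl) & 0xFF
    # AAA: adjust when the low nibble is not a valid decimal digit, or AF is set.
    if (al & 0x0F) > 9 or af:
        al = (al + 6) & 0xFF
        ah = (ah + 1) & 0xFF
        cf = 1
    else:
        cf = 0
    al &= 0x0F  # AAA always clears the upper nibble of AL
    return ah, al, cf


ah, al, cf = add_then_aaa(ord("7"), ord("6"))
print(ah, al, cf)  # 1 3 1
```

For '7' (37h) and '6' (36h), the binary sum is 6Dh; after the adjustment, AH holds 1 and AL holds 3, i.e. the unpacked decimal digits of 13.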
MUL, DIV and IDIV instructions: MUL source, DIV source and IDIV source
MUL and DIV are the unsigned multiplication and unsigned division
instructions respectively. IDIV is the signed division instruction. The source can be a
memory or register operand, which contains either byte data or word data. For
MUL, the other operand is assumed to be in the AL register (if data is of byte type) or
the AX register (if data is of word type). The result of the MUL instruction is stored in the AX
register (if data is of byte type) or the DX and AX pair (if data is of word type). Thus,
symbolically the MUL instruction can be represented as:
AX ← AL × source (if source is 8 bit data)
DX:AX ← AX × source (if source is 16 bit data)
If the upper half of the result (AH for byte operands, DX for word operands) is 0, then the carry and
overflow flags are set to 0; otherwise they are set to 1. In case a byte is to be multiplied with a word operand,
you must first convert the byte operand to a word operand using instructions like
CBW, given later in the unit.
For DIV and IDIV, the dividend is assumed to be in the AX register (if the source is of byte type)
or in the DX and AX pair (if the source is of word type). For byte operands, AH stores the
remainder and AL stores the quotient of the division; for word operands, DX stores the
remainder and AX stores the quotient. Thus, symbolically the DIV instruction can be
represented as:
AH (Remainder), AL (Quotient) ← AX / source (if source is 8 bit data)
DX (Remainder), AX (Quotient) ← DX:AX / source (if source is 16 bit data)
In the division operation, a 0 value in the source operand will result in a run time error.
Example:
Assume that the AX register contains 0011h (AH = 00h, AL = 11h) and the BL register contains 02h.
Each of the following instructions, executed separately on these values, gives the following result:
MUL BL ; Result 11h × 02h = 22h; AH = 00h and AL = 22h
DIV BL ; Result 11h / 02h = Remainder in AH = 01h and
       ; Quotient in AL = 08h
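The byte forms of MUL and DIV used in this example can be checked with a small Python model. The function names mul_byte and div_byte are assumptions of this sketch, and only unsigned byte operands are modelled.

```python
def mul_byte(al, source):
    """MUL source (byte): AX = AL * source. Returns (AH, AL, CF)."""
    ax = (al & 0xFF) * (source & 0xFF)
    ah, al = ax >> 8, ax & 0xFF
    cf = 0 if ah == 0 else 1  # CF and OF are cleared when the upper half is zero
    return ah, al, cf


def div_byte(ax, source):
    """DIV source (byte): divide AX by source. Returns (AH, AL) = (remainder, quotient)."""
    source &= 0xFF
    if source == 0:
        raise ZeroDivisionError("divide error: source is 0")
    quotient, remainder = divmod(ax & 0xFFFF, source)
    if quotient > 0xFF:
        raise OverflowError("divide error: quotient does not fit in AL")
    return remainder, quotient


print(mul_byte(0x11, 0x02))    # (0, 34, 0): AH = 00h, AL = 22h, CF = 0
print(div_byte(0x0011, 0x02))  # (1, 8): AH = 01h, AL = 08h
```

Note that the model also raises an error when the quotient is too large for AL, which the processor reports in the same way as a division by zero.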
CMP instruction: CMP destination, source
This instruction is used for comparing two operands. It subtracts the source from the
destination operand (both byte or both word) only to set the flags; no operand value is
changed. Both the source and destination operands cannot be memory
operands at the same time. This operation may set the carry flag, zero flag, sign flag etc.
The following example shows how the flags may be set by this instruction,
assuming small unsigned values in AX and CX.
Example:
Instruction    Flags if AX = CX     Flags if AX > CX     Flags if AX < CX
CMP AX, CX     CF=0; ZF=1; SF=0     CF=0; ZF=0; SF=0     CF=1; ZF=0; SF=1
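The flag settings in the table can be reproduced with the following Python sketch. The function name cmp_flags is an assumption; it models only the CF, ZF and SF flags for 16 bit operands.

```python
def cmp_flags(dest, src):
    """Flags after CMP dest, src on 16-bit operands; the operands are unchanged."""
    dest &= 0xFFFF
    src &= 0xFFFF
    result = (dest - src) & 0xFFFF  # 16-bit difference, discarded by CMP
    return {
        "CF": int(dest < src),      # borrow: unsigned dest is smaller than src
        "ZF": int(result == 0),     # the operands are equal
        "SF": (result >> 15) & 1,   # most significant bit of the difference
    }


print(cmp_flags(5, 5))  # {'CF': 0, 'ZF': 1, 'SF': 0}
print(cmp_flags(7, 5))  # {'CF': 0, 'ZF': 0, 'SF': 0}
print(cmp_flags(5, 7))  # {'CF': 1, 'ZF': 0, 'SF': 1}
```

The sign flag simply copies the top bit of the difference, so for large unsigned values it need not follow the table; for example, comparing 9000h with 0001h gives CF=0 but SF=1.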
Other arithmetic instructions
Some of the other instructions are given below:
SUB destination, source
   This instruction subtracts source from destination. The carry flag in
   subtraction acts as a borrow flag.
SBB destination, source
   Subtracts with previous borrow, if any.
NEG source
   Creates the 2's complement of the source number.
AAS, DAS
   Work in a similar manner as AAA and DAA, except that they operate
   after a subtraction operation.
AAM, AAD
   Work in a similar manner as AAA, except that the operation is
   multiplication and division respectively.
CBW, CWD
   These instructions convert a byte to a word or a word to a double word
   respectively. The value of the sign bit is filled in the upper byte or word,
   as the case may be. For CBW the operand is in the AL register and the resulting
   word is in the AX register; whereas for CWD the operand is in the AX
   register and the double word is in the DX, AX pair.
In the logical shift right, all the bits of the byte are shifted towards the right. The most
significant bit gets the value 0 and the least significant bit is pushed to the carry flag.
In the arithmetic shift right, all the bits are shifted towards the right. The most
significant bit, which is the sign bit, retains its value, and the least significant bit is
pushed to the carry flag.
In the rotate left, all the bits are shifted towards the left. The most significant bit
is shifted to CF as well as rotated to the least significant bit.
ROR is rotate right; the direction of shift is towards the right.
                               CF   AL Register Value
Initial Value                  0    1 0 0 0 0 0 1 1
After execution of ROR AL, 1   1    1 1 0 0 0 0 0 1
In the rotate right, all the bits are shifted towards the right. The least significant
bit is shifted to CF as well as rotated to the most significant bit, as shown above.
RCL is rotate left with carry; the direction of shift is towards the left.
                               CF   AL Register Value
Initial Value                  0    1 0 0 0 0 0 1 1
After execution of RCL AL, 1   1    0 0 0 0 0 1 1 0
In the rotate left with carry, all the bits are shifted towards the left. The most
significant bit is shifted to CF, and the previous value of CF is rotated to the least
significant bit.
RCR is rotate right with carry; the direction of shift is towards the right.
                               CF   AL Register Value
Initial Value                  0    1 0 0 0 0 0 1 1
After execution of RCR AL, 1   1    0 1 0 0 0 0 0 1
In the rotate right with carry, all the bits are shifted towards the right. The least
significant bit is rotated to CF, and the previous value of CF is shifted to the
most significant bit.
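The four rotate operations can be compared side by side with a small Python model for a single-bit rotation of a byte. The function names rol8, ror8, rcl8 and rcr8 are assumptions of this sketch; each takes the byte value and the current carry flag and returns the new (CF, value) pair.

```python
def rol8(value, cf):
    """Rotate left: the MSB goes to CF and also to the LSB (old CF is ignored)."""
    msb = (value >> 7) & 1
    return msb, ((value << 1) & 0xFF) | msb


def ror8(value, cf):
    """Rotate right: the LSB goes to CF and also to the MSB (old CF is ignored)."""
    lsb = value & 1
    return lsb, (value >> 1) | (lsb << 7)


def rcl8(value, cf):
    """Rotate left through carry: the MSB goes to CF, the old CF goes to the LSB."""
    msb = (value >> 7) & 1
    return msb, ((value << 1) & 0xFF) | cf


def rcr8(value, cf):
    """Rotate right through carry: the LSB goes to CF, the old CF goes to the MSB."""
    lsb = value & 1
    return lsb, (value >> 1) | (cf << 7)


# AL = 1000 0011 (83h) with CF = 0, as in the tables above.
for name, op in [("ROL", rol8), ("ROR", ror8), ("RCL", rcl8), ("RCR", rcr8)]:
    cf, al = op(0x83, 0)
    print(f"{name}: CF={cf} AL={al:08b}")
```

For the initial value used in the tables, ROR gives 1100 0001, RCL gives 0000 0110 and RCR gives 0100 0001, each with CF = 1.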
Call and return instructions are used for calling a procedure; once the
execution of the procedure is over, the RET instruction brings the control to the next
instruction after the CALL instruction. In the 8086 microprocessor there are two types of
calls, viz. NEAR call and FAR call. The NEAR call is within the same segment, whereas
the FAR call is to a different segment. A call instruction has the following basic format:
CALL <address of procedure>
Now, the question is how to recognise whether it is a NEAR or FAR procedure call. This is
resolved by the assembler from the declaration of the procedure, which is created as a
NEAR or FAR procedure. An example explaining this is discussed in Unit 15 of this
Block. A call to the procedure can be made using the CALL instruction. For example,
if the name of a procedure in a separate code segment is procedure1, then the
following call instruction will be used:
CALL procedure1 ;
This instruction will cause the execution of the following sequence of operations:
1. If it is a FAR procedure, then the present CS and IP are saved as the return
address on the top of the stack; otherwise, only IP is stored on the stack.
SP=SP-2; SS[SP] ← CS; // This step will not be required in NEAR procedure
SP=SP-2; SS[SP] ← IP;
2. The CS will be loaded with the code segment address of procedure1 and IP
will be loaded with the offset of procedure1.
CS = CS of procedure1; // This step will not be required in NEAR procedure
IP = Offset of first instruction of procedure1;
3. The next instruction, as per the CS:IP value updated in step 2, will be executed
next.
A procedure ends in a return instruction (RET). It causes the called procedure to
return to the calling program. The following sequence of actions is performed by the
RET instruction.
1. Perform the following actions. As IP was pushed last, it is popped first:
IP ← SS[SP]; SP=SP+2;
CS ← SS[SP]; SP=SP+2; // NOT performed in NEAR procedure
2. The next instruction, as per the CS:IP value updated in step 1, will be executed
next.
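The CALL and RET sequences can be traced with the following Python sketch. The class name Cpu, its fields and the use of a dictionary for the stack are assumptions made for illustration; a FAR call pushes CS and then IP, so the return pops IP first and then CS.

```python
class Cpu:
    """Tiny model of CS, IP, SP and the stack segment for CALL and RET."""

    def __init__(self, cs, ip, sp):
        self.cs, self.ip, self.sp = cs, ip, sp
        self.stack = {}  # maps a stack offset to the 16-bit word stored there

    def _push(self, word):
        self.sp = (self.sp - 2) & 0xFFFF  # SP = SP - 2
        self.stack[self.sp] = word

    def _pop(self):
        word = self.stack.pop(self.sp)
        self.sp = (self.sp + 2) & 0xFFFF  # SP = SP + 2
        return word

    def call(self, target_ip, target_cs=None):
        """NEAR call when target_cs is None, FAR call otherwise.

        self.ip is assumed to already point to the instruction after CALL.
        """
        if target_cs is not None:
            self._push(self.cs)           # FAR only: save CS
            self.cs = target_cs
        self._push(self.ip)               # save the return offset
        self.ip = target_ip

    def ret(self, far=False):
        self.ip = self._pop()             # IP was pushed last, so it is popped first
        if far:
            self.cs = self._pop()         # FAR only: restore CS


cpu = Cpu(cs=0x1000, ip=0x0105, sp=0x0100)
cpu.call(0x0020, target_cs=0x2000)
print(f"{cpu.cs:04X}:{cpu.ip:04X} SP={cpu.sp:04X}")  # 2000:0020 SP=00FC
cpu.ret(far=True)
print(f"{cpu.cs:04X}:{cpu.ip:04X} SP={cpu.sp:04X}")  # 1000:0105 SP=0100
```

A NEAR call uses only two bytes of stack, whereas a FAR call uses four, which is why the type of RET must match the type of the procedure.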
Jump instructions:
The 8086 microprocessor has unconditional and conditional jump
instructions. The unconditional jump can be to a NEAR or FAR label. It requires only
one operand, which is the address, specified using a label, of the next instruction to
be executed. The format of this instruction is given below:
JMP Label
Loop instructions:
A loop instruction (LOOP label) uses the CX register as a counter register. The label in
the loop instruction should be within a displacement of -128 to +127 bytes. Prior to a loop instruction, the
looping count value should be moved to the CX register. The loop instruction decrements
the CX register and checks if the CX register has a zero value. If CX is not zero, then the loop
instruction takes the program back to the instruction which is specified by the label.
In case CX is zero, the loop is terminated, i.e., the next
instruction after the loop instruction is executed in sequence. The 8086 microprocessor
has a number of loop instructions, which differ in the condition of loop
termination. The following table lists some of these instructions, which may be used in
later Units. There are many other such instructions for looping; a discussion on them
is beyond the scope of this unit.
Example: Let us assume that you have a byte array of 40h bytes. Write an assembly program
segment that checks if each of these elements has the value 0F0h.
Solution: Please note the two conditions: the first condition is that each element
should be equal to 0F0h, and the second condition is that the loop is to be executed at most 40h times.
Thus, the LOOPE instruction would be used, but prior to that you need to set different
registers. The program segment for looping is shown below:
; Assume that the name of the array is BYTECOST
      MOV BX, OFFSET BYTECOST  ; BX points to the first element of the
                               ; byte array BYTECOST.
      DEC BX                   ; BX now points to one byte before the
                               ; BYTECOST array, because the loop below
                               ; increments BX before each comparison.
      MOV CX, 40h              ; Initialise the loop counter to the size of the array
L1:   INC BX                   ; Move to the next element in the array.
      CMP BYTE PTR [BX], 0F0h  ; Compare the array element to 0F0h
      LOOPE L1                 ; Decrement CX; loop if the present array element
                               ; is equal to 0F0h (ZF = 1) and CX is not zero.
It may be noted that the LOOPE instruction will automatically decrement the value of the
counter CX register.
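The control flow of the program segment above can be mirrored in Python to see when the loop ends. The function name all_equal and the returned pair are assumptions of this sketch, which also assumes a non-empty array.

```python
def all_equal(array, value=0xF0):
    """Mirror the CMP/LOOPE loop; return (all elements matched, final CX)."""
    cx = len(array)  # MOV CX, 40h (the loop count)
    bx = -1          # index one before the first element (after DEC BX)
    while True:
        bx += 1                        # L1: INC BX
        zf = int(array[bx] == value)   # CMP BYTE PTR [BX], 0F0h
        cx -= 1                        # LOOPE first decrements CX ...
        if not (cx != 0 and zf == 1):  # ... and loops only if CX != 0 and ZF = 1
            break
    return zf == 1, cx


data = bytearray([0xF0] * 0x40)
print(all_equal(data))  # (True, 0)
data[3] = 0x00
print(all_equal(data))  # (False, 60)
```

After the loop, ZF tells whether the last compared element matched; if a mismatch is found, the remaining count in CX shows how many elements were not examined.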
In addition to the program execution control transfer, there are string instructions
which are useful for string matching. Such instructions were specially designed for
8086 microprocessor, so that it can perform faster string comparisons. Some of these
instructions are discussed in the next section.
The MOVS/MOVSB instruction moves data from one byte string to another byte string. This string
operator uses several registers implicitly. The source string is assumed to be in the data
segment, indexed by the SI register, whereas the destination string is assumed to be in the extra
data segment, indexed by the DI register. CX is used as the counter register when a repeat prefix
(REP) is used. With the direction flag cleared, the transfer of
one byte of data from the source string to the destination automatically results in an increment of
the SI and DI registers, and with the REP prefix, a decrement of the CX register.
Example: Assume that both the data segment and extra data segment registers contain the
segment address 00FFh and a byte string of length 0100h starting at offset 0400h is
to be copied to offset 0600h. Write the program segment to show this transfer.
; Assuming that the data segment and extra data segment registers are already initialised.
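The register updates performed during such a repeated byte transfer can be modelled in Python, as a sketch under stated assumptions. The function name rep_movsb and the use of a bytearray as the 1 MB physical memory are assumptions; the segment and offset values are those of the example above.

```python
def rep_movsb(memory, ds, si, es, di, cx, df=0):
    """Model REP MOVSB on a bytearray of physical memory; return (SI, DI, CX)."""
    step = -1 if df else 1  # DF = 0: increment SI and DI; DF = 1: decrement
    while cx != 0:
        src = ((ds << 4) + si) & 0xFFFFF
        dst = ((es << 4) + di) & 0xFFFFF
        memory[dst] = memory[src]  # copy one byte from DS:SI to ES:DI
        si = (si + step) & 0xFFFF
        di = (di + step) & 0xFFFF
        cx -= 1                    # the REP prefix decrements CX
    return si, di, cx


memory = bytearray(1 << 20)     # 1 MB of physical memory
start = (0x00FF << 4) + 0x0400  # physical address of the source string
memory[start:start + 0x0100] = bytes(range(256))
si, di, cx = rep_movsb(memory, ds=0x00FF, si=0x0400, es=0x00FF, di=0x0600, cx=0x0100)
print(f"{si:04X} {di:04X} {cx:04X}")  # 0500 0700 0000
```

At the end of the transfer, SI and DI point just past the two strings and CX is zero.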
Instruction and Purpose
CMPS/CMPSB/CMPSW
   This instruction compares two byte or word strings; the use of CX, SI
   and DI remains the same as in MOVS. It is recommended to use the
   REPE prefix in this case.
SCAS/SCASB/SCASW
   This instruction compares a string with a value in the AL or AX
   register for a byte or word string respectively. The string to be
   scanned is assumed to be in the extra data segment. This instruction
   uses the CX and DI registers, when a REP prefix is used.
LODS/LODSB/LODSW
   This instruction is used to load a byte or word of a string pointed to
   by the SI register into the AL or AX register respectively.
STOS/STOSB/STOSW
   This instruction is used to store a byte or word from the AL or AX
   register respectively into a location pointed to by the DI register.
Instruction and Purpose
STC
   This instruction sets the carry flag.
CLC
   This instruction clears the carry flag.
CMC
   This instruction complements or inverts the state of the carry flag.
STD
   This instruction sets the direction flag (DF=1), so that SI and DI are
   decremented automatically by string instructions.
CLD
   This instruction clears the direction flag (DF=0), so that SI and DI are
   incremented automatically by string instructions.
There are many other processor control instructions. You may refer to the further readings
to know more about these instructions.
The BX register contains the offset of a location in the Data Segment, whereas the BP register
contains an offset in the Stack Segment. The index registers SI and DI also
contain offsets: in string instructions, SI is used with the Data Segment and DI with the
Extra Data Segment, whereas in other instructions both are used with the Data Segment by default.
These registers can be combined to create several indirect addressing modes. These
are:
Register indirect: In this addressing mode, the register contains the address of the data.
In general, the type of register, as stated above, determines the segment in which the
data is to be accessed. Examples of this mode are:
MOV AL, [DI] ; Move the byte at the memory location DS:DI to AL.
MOV AL, [BX] ; Move the byte at the memory location DS:BX to AL.
Based indirect: In this addressing mode, a base register and a displacement are added
to compute the offset of the address of the data in the related segment. An example of this mode
is:
MOV AL, [BX+2] ; Move the byte at the memory location DS:BX+2 to AL.
Indexed indirect: In this addressing mode, an index register and a displacement are
added to compute the offset of the address of the data in the related segment. An example of this
mode is:
MOV AL, [DI+2] ; Move the byte at the memory location DS:DI+2 to AL.
There are two more such indirect addressing modes, viz. Based Indexed and Based
Indexed with displacement; however, they are rarely used and are not explained in this
Unit.
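The way the offset and the physical address are formed in these indirect modes can be sketched in Python. The function name operand_address, the table DEFAULT_SEGMENT and the register values are assumptions of this sketch; the default segments shown (DS for BX, SI and DI; SS for BP) are those used by the 8086 for instructions other than string instructions.

```python
# Default segment register for each base or index register (non-string instructions).
DEFAULT_SEGMENT = {"BX": "DS", "SI": "DS", "DI": "DS", "BP": "SS"}


def operand_address(regs, base, displacement=0):
    """Physical address of the operand [base + displacement]."""
    offset = (regs[base] + displacement) & 0xFFFF  # 16-bit effective address
    segment = regs[DEFAULT_SEGMENT[base]]
    return ((segment << 4) + offset) & 0xFFFFF


# Hypothetical register contents, chosen only for illustration.
regs = {"DS": 0x0211, "SS": 0x42AA, "BX": 0x0100, "DI": 0x0010, "BP": 0x0123}
print(f"{operand_address(regs, 'BX'):05X}")     # MOV AL, [BX]   -> 02210
print(f"{operand_address(regs, 'BX', 2):05X}")  # MOV AL, [BX+2] -> 02212
print(f"{operand_address(regs, 'DI', 2):05X}")  # MOV AL, [DI+2] -> 02122
print(f"{operand_address(regs, 'BP'):05X}")     # MOV AL, [BP]   -> 42BC3
```

The displacement is added to the register content before the segment is applied, so the effective address always stays within the 64 KB segment.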
13.6 SUMMARY
In this unit, you have gone through the basic architecture of the 8086 microprocessor.
This architecture was a creative design and used many interesting concepts related to
enhancing the speed of instruction processing. The first of these is the use of
segment registers to reduce the 20 bit physical address to a 16 bit offset address, thus reducing
the size of instructions that use direct addressing; the second is faster string processing by
using two separate segments to speed up string operations such as matching; the third is the use
of pipelining by designing two sections in the CPU; the fourth is the use of an instruction queue for
pre-fetching instructions, and so on. The 8086 assembly language forms the basis of the Intel
instruction sets of advanced processors and may help you appreciate the assembly
language of those processors.
Some of the key features of this processor include:
It has a 20 bit address bus; therefore, the base memory is 1 MB.
It has a 16 bit data bus; thus, it can fetch two bytes simultaneously.
It has four segment registers that, along with other pointer registers, convert 16
bit offsets to 20 bit physical addresses.
It has a large number of instructions of different types, which allows writing of
powerful assembly programs.
Please refer to the further readings for more details on 8086 assembly language
programming.
13.7 SOLUTIONS/ANSWERS
(b)
DS (in hexadecimal) 0 2 1 1
Shift left by one Hexadecimal digit 0 2 1 1 0
BX (in hexadecimal) 0 1 0 0
Physical address (Hexadecimal) 0 2 2 1 0
(c)
SS (in hexadecimal) 4 2 A A
Shift left by one Hexadecimal digit 4 2 A A 0
SP (in hexadecimal) 0 1 2 3
Physical address (Hexadecimal) 4 2 B C 3
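The two calculations above follow one rule: shift the segment value left by one hexadecimal digit (4 bits) and add the offset. A minimal Python sketch of this rule is given below. The function name physical_address is only illustrative; it is not part of the 8086 instruction set or of this unit.

```python
def physical_address(segment: int, offset: int) -> int:
    """Return the 20-bit physical address for a 16-bit segment:offset pair."""
    # Shifting left by 4 bits is the same as appending one hexadecimal 0.
    # The mask keeps the result within the 20 address lines of the 8086.
    return ((segment << 4) + offset) & 0xFFFFF


# (b) DS = 0211h, BX = 0100h
print(f"{physical_address(0x0211, 0x0100):05X}")  # 02210
# (c) SS = 42AAh, SP = 0123h
print(f"{physical_address(0x42AA, 0x0123):05X}")  # 42BC3
```

The printed values agree with the physical addresses worked out by hand in (b) and (c).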
3. The flag register is used to store all the flag bits, which are generated as a result of
the last instruction. Some of these flags are the sign flag, carry flag, overflow flag etc.
The flag register cannot be used as a general purpose register.
2. The shift instructions of the 8086 microprocessor compare as follows. SHL is the shift
left instruction and is identical to the arithmetic shift left instruction (SAL).
However, SHR and SAR differ in the bit that enters the leftmost position: SHR inserts
a 0, whereas SAR repeats the sign bit. The rotate instructions ROL and ROR just rotate
the word/byte, whereas RCL and RCR rotate it through the carry flag. (Please refer to
section 13.4.3).
3. Perform a compare (CMP) instruction on the operands (please make sure that both the
operands are not memory operands). If it sets the zero flag, then both the operands are
the same; otherwise they are different.
1. The CALL statement calls a subroutine, i.e. the next instruction to be executed by the
processor should be the first instruction of the subroutine. Since, on completion of
the subroutine, the next instruction of the calling program is to be executed, the
return address is stored by the CALL instruction. The RET instruction just brings the
control back to the next instruction after the CALL instruction in the calling
program.
2. There are primarily two types of jump instructions: unconditional jump and
conditional jumps. The unconditional jump instruction causes a compulsory jump
to the specified label. There are a number of conditional jump instructions, where a
jump is taken if the related condition is fulfilled; else the next instruction in sequence
is executed.
3. The LOOP instruction decrements the CX register in each iteration and checks the
value of CX. In case it is not zero, control goes back to the label from where the loop
started. However, if the CX register is zero, the next instruction in sequence is executed.
4. String instructions in the 8086 microprocessor are specially designed for efficient
execution of string operations. For example, to match two strings, one string is put
in the data segment and the other in the extra segment, with DS:SI pointing to the first
string and ES:DI pointing to the second string. The string length is put in the CX
register. The string compare instruction, when used with the REPE prefix, compares one
byte of each string, increments SI and DI, and decrements CX; this is repeated as long
as the bytes match and CX is not zero. Thus, you do not need to write a lengthy program
for string matching that includes all the operations given above.
5. Immediate Operand
MOV AL, (9+7)*2 ; move 32 to AL register.
Register Addressing
MOV AL, DL ; move DL to AL register.
Direct Addressing
MOV AL, X ; move content of byte location X to AL register.
Register Indirect Addressing
MOV AH, [BX] ; move content of location, whose address is
; DS:BX, to AH register.
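The string matching described in answer 4 can be modelled in a few lines. The following Python sketch imitates a byte compare repeated with the REPE prefix, assuming that the direction flag is clear (so SI and DI are incremented) and that both pointers start at offset 0. The function name repe_cmpsb and its return tuple are illustrative choices, not 8086 syntax.

```python
def repe_cmpsb(first: bytes, second: bytes, cx: int):
    """Model REPE CMPSB: compare bytes while they are equal and CX != 0.

    Returns (si, di, cx, zf) as they would stand after the instruction.
    """
    si = di = 0
    zf = True  # assumption: nothing is compared when CX is 0 on entry
    while cx != 0:
        zf = first[si] == second[di]  # the compare sets the zero flag
        si += 1                       # SI and DI move to the next byte
        di += 1
        cx -= 1                       # CX counts the bytes still to compare
        if not zf:                    # REPE stops at the first mismatch
            break
    return si, di, cx, zf


print(repe_cmpsb(b"HELLO", b"HELLO", 5))  # (5, 5, 0, True)
print(repe_cmpsb(b"HELLO", b"HELPO", 5))  # (4, 4, 1, False)
```

A zero flag of True at the end means that all the compared bytes matched; otherwise SI and DI point just past the first byte that differed.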
UNIT 14 INTRODUCTION TO ASSEMBLY LANGUAGE PROGRAMMING
Structure
14.0 Introduction
14.1 Objectives
14.2 The Need and Use of the Assembly Language
14.3 Assembly Program Execution
14.4 An Assembly Program and its Components
14.4.1 The Program Annotation
14.4.2 Directives
14.5 Input Output in Assembly Program
14.5.1 Interrupts
14.5.2 DOS Function Calls (Using INT 21H)
14.6 The Types of Assembly Programs
14.6.1 COM Programs
14.6.2 EXE Programs
14.7 How to Write Good Assembly Programs
14.8 Summary
14.9 Solutions/Answers
14.10 Further Readings
14.0 INTRODUCTION
In the previous unit, you have gone through the basic concepts of the 8086
microprocessor, which included the 8086 structure, segmentation, register set,
instructions and addressing modes. This unit presents a basic framework for writing
assembly language programs for the 8086 microprocessor. In this unit, you will learn
about the importance, basic components and development tools of assembly language
programming. The Input/Output to an assembly language program is a complex
process. This unit discusses the Input/Output to an assembly program by using interrupts.
This unit also discusses different kinds of assembly programs, viz. COM
programs and EXE programs. Finally, the unit presents an example assembly program.
An assembly program consists of assembler directives and instructions of 8086
microprocessor. This program is assembled using an assembler program. Several such
assembler programs exist, which use different assembler directives. We have used the
assembler directives, as used in Microsoft Assembler (MASM). However, these
directives may be different for different assemblers. Therefore, before running an
assembly program you must consult the reference manuals of the assembler you are
using and change directives accordingly.
14.1 OBJECTIVES
After going through this unit, you should be able to:
14.2 THE NEED AND USE OF THE ASSEMBLY LANGUAGE
The computer instructions are a sequence of 0’s and 1’s. These sequences consist of
the instruction operation code, addressing modes and operand addresses. The instructions
of programs written in the machine language are directly decoded by the processing
unit. However, you may have to face the following problems if you program using
machine language:
Machine language depends on the machine instruction set and is difficult for most
people to write in 0-1 form.
Debugging or correcting a machine language program is difficult.
Deciphering the machine code is very difficult. Thus, the program logic of programs
written in machine language will be difficult to understand.
Assembly language programs are at least 30% denser than the same programs written
in a high-level language. The reason for this is that compilers produce a long list of
machine code for every high-level statement, whereas each assembly language statement
produces the code for a single instruction. Further, complex instructions of a computer,
like the highly optimized string instructions of the 8086, can be used while writing
an assembly program, making the program faster. On the other hand, unlike high-level
languages, assembly language is machine dependent. Each microprocessor has its own
set of instructions. Thus, assembly programs are not portable.
Assembly language has very few restrictions or rules; nearly everything is left to the
discretion of the programmer. This gives lots of freedom to programmers to write
good logic of a program
about 5-10% machine dependent assembly code. In addition, many
telecommunications applications use assembly routines to enhance the efficiency.
Step 2: The link step involves converting the .OBJ module to an .EXE machine code
module. The linker completes any addresses left open by the assembler and
combines separately assembled programs into one executable module. In
addition, it also initializes the .EXE module with special instructions to
facilitate the subsequent loading of the .EXE program into the computer
memory for execution.
Step 3: The last step is to load the program for execution. Because the loader knows
where the program is going to be loaded in the memory, it is able to resolve
all the remaining incomplete addresses in the header. The loader drops the
header and creates a program segment prefix just before the program is loaded
in memory.
[Figure: Program development steps. Editor: create an assembler source program file (.ASM). Files shown: Prog.asm, Prog.obj, Prog.exe]
Tools required for assembly language programming: Following are some of the
basic tools needed to create an assembly program. A modern-day assembler may contain
several of these tools.
Editor: The editor is a program that allows the user to enter, modify, and store a
group of instructions or text under a file name. The editor program creates an ASCII
file. Common editor programs are NOTEPAD in Windows, the vi editor in UNIX, etc.
An editor program may be part of the assembler itself. You should use the proper syntax
of the assembly instructions to create an 8086 assembly program.
The assembler generates three files when your program gets successfully assembled
with no errors. These three files are the object file, the list file and the cross reference
file. The object file contains the binary code for each instruction in the program. The
errors that are detected by the assembler are called the symbol errors. For example, in
the following statement the mnemonic MOVE is compared by the assembler to all the
mnemonics of the mnemonic set. It fails to get a match, following which it assumes
MOVE to be an identifier and looks for its entry in the symbol table. It does not find it
there either and therefore gives an error “undeclared identifier”.
The list file is optional and contains the source code, the binary equivalent of each
instruction, and the offsets of the symbols in the program. This file is purely for
documentation purposes. Some of the historical assemblers available on PCs are
MASM, TURBO assembler etc.
Linker: For better modularity, programs are broken into several subroutines. It is
even better to design common routines, like reading a hexadecimal number, writing a
hexadecimal number etc., which could be used by a lot of other programs. These
common routines can be put into files and assembled separately. After each file has
been successfully assembled, they can be linked together to form a large file, which
constitutes your complete program. The file containing the common routines can be
linked to your other programs also. The program that links your program is called the
linker.
The linker produces a linked file, which contains the binary code for all component
modules. The linker also produces a link map, which contains the address information
about the linked files. The linker, however, does not assign absolute or physical
addresses to your program. It only assigns continuous relative addresses to all the
linked modules, starting from zero. Since these programs use just relative
addresses, they can be loaded at any physical memory address. Thus, these programs
are called relocatable programs.
Loader: The basic purpose of the loader program is to convert the logical or relative
addresses assigned by the linker to absolute or physical memory addresses. This task is
performed while the loader loads the linked program into the physical memory for
execution. The linked program is brought from the secondary memory, like a disk, to
the computer memory for execution. The file name extension of the files for loading is
.EXE or .COM, which after loading can be executed by the CPU.
Debugger: The debugger is a program that allows the user to test and debug the
object file. The user can employ this program to perform the following functions:
Line Numbers  Offset  Source Code
0001                  DATA SEGMENT
0002          0000    MESSAGE DB "Assembly Language Programming$"
0003                  DATA ENDS
0004                  STACK SEGMENT
0005                  STACK 0400H
0006                  STACK ENDS
0007                  CODE SEGMENT
0008                  ASSUME CS: CODE, DS: DATA, SS: STACK
0009                  ; Offset  Machine Code  Assembly Instructions
0010          0000    B8XXXX    MOV AX, DATA
0011          0003    8ED8      MOV DS, AX
0012          0005    BAXXXX    MOV DX, OFFSET MESSAGE
0013          0008    B409      MOV AH, 09H
0014          000A    CD21      INT 21H
0015          000C    B8004C    MOV AX, 4C00H
0016          000F    CD21      INT 21H
0017                  CODE ENDS
0018                  END
The assembler assigns line numbers to the statements in the source file
sequentially. If the assembler issues an error message, the message will contain a
reference to one of these line numbers.
The second column from the left contains offsets. Each offset indicates the
address of an instruction or a datum as an offset from the base of its logical
segment, e.g., the statement at line 0010 produces machine language at offset
0000H of the CODE SEGMENT and the statement at line number 0002 produces
machine language at offset 0000H of the DATA SEGMENT.
The third column in the annotation displays the machine language produced by
each instruction in the program.
Missing offset: The XXXX in the machine language for the instruction at line 0010 is
there because the assembler does not know the DATA segment location that will be
determined at loading time. The loader must supply that value.
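The offset column of the annotation can be reproduced by adding up the lengths of the machine code of the preceding instructions. The Python sketch below does this for the code segment of the listing; the list name listing and the function offsets are illustrative only, and XXXX simply stands for two bytes whose value is not yet known at assembly time.

```python
# (assembly instruction, machine code as hexadecimal text) from the listing.
listing = [
    ("MOV AX, DATA", "B8XXXX"),
    ("MOV DS, AX", "8ED8"),
    ("MOV DX, OFFSET MESSAGE", "BAXXXX"),
    ("MOV AH, 09H", "B409"),
    ("INT 21H", "CD21"),
    ("MOV AX, 4C00H", "B8004C"),
    ("INT 21H", "CD21"),
]


def offsets(entries):
    """Return (offset, instruction) pairs for consecutive instructions."""
    result = []
    offset = 0
    for text, code in entries:
        result.append((offset, text))
        offset += len(code) // 2  # two hexadecimal characters make one byte
    return result


for off, text in offsets(listing):
    print(f"{off:04X}  {text}")
```

The printed offsets (0000, 0003, 0005, 0008, 000A, 000C, 000F) are the same as those shown in the second column of the annotation.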
Keyword: A keyword is the part of a statement that defines the nature of that statement.
If the statement is a directive, then the keyword will be the title of that directive; if the
statement is a data-allocation statement, the keyword will be a data definition type.
Some examples of keywords are: SEGMENT (directive), MOV (instruction) etc.
An identifier can use alphabets, digits or special characters. It always starts with an
alphabet.
Parameters: A parameter extends and refines the meaning that the assembler
attributes to the keyword in a statement. The number of parameters is dependent on
the statement.
14.4.2 Directives
Assembly languages support a number of directive statements. Directives enable you
to control the way in which a source program assembles and lists. Directives act only
when the assembly is in progress and generate no machine-executable code. Let us
discuss some common directives.
1. HEX: The HEX directive facilitates the coding of hexadecimal values in the
body of the program. This statement directs the assembler to treat related tokens
in the source file as numeric constants in hexadecimal notation.
3. END DIRECTIVE: There are three different END directives. These are:
(i) ENDS directive: This directive marks the completion of a segment. Thus,
every segment used by you must have an ENDS directive.
(ii) ENDP directive: As stated in point 2, it is used to mark the end of a procedure.
(iii) END directive: It marks the end of the entire program. Any statement after
this directive is ignored by the assembler.
CODE SEGMENT
The code segment contains the code of the program, which may include
procedures and sometimes other segments too. Linker marks the code segment in
a program in a header. This header is used by the operating system when it
invokes the loader to load an executable file of the program into memory. The
loader reads this header for setting the CS register. A physical memory address is
represented as CS: xxxx, where xxxx represents the offset in the code segment.
In general, the first instruction of the code segment is assumed to be the first
instruction to be executed and is therefore put at an offset of 0000H. The instruction
pointer (IP) register is used to mark the offset of an instruction in the code segment.
The CS:IP pair is thus used to specify the physical address of an instruction in a
program that is being executed.
STACK SEGMENT
The 8086 microprocessor supports a word stack. The stack segment parameters
tell the assembler to alert the linker that this segment statement defines the
program stack area.
A program must have a stack area because the computer is continuously carrying
on several background operations that are completely transparent, even to an
assembly language programmer. For example, a real time clock issues an
interrupt every 55 milliseconds. Every 55 ms the CPU is interrupted. The CPU
records the state of its registers and then goes about updating the system clock.
When it finishes servicing the system clock, it has to restore the registers and go
back to doing whatever it was doing before the occurrence of the interrupt. All such
information gets recorded in the stack. Please note that if you have not specified the
stack segment, it is automatically created. Why is the stack segment essential?
Suppose your program is being executed by the CPU and a clock pulse needs service.
If the system had no stack, the CPU would not be able to return to your program
after servicing the clock pulse.
DATA SEGMENT
It contains the data allocation statements for a program. This segment is very
useful as it shows the data organization.
Directive   Description           Number of Bytes
DB          Define byte           1
DW          Define word           2
DD          Define double word    4
DQ          Define quad word      8
DT          Define 10 bytes       10
The DUP directive is used to duplicate the basic data definition ‘n’ number of
times:
ARRAY DB 10 DUP (0)
In the above statement, ARRAY is the name of the data item, which is of byte
type (DB). This array contains 10 duplicate zero values, i.e. 10 zero values.
The EQU directive is used to assign a name to a constant:
CONST EQU 20
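The amount of memory reserved by a data definition with DUP is the size of the basic item multiplied by the repeat count. The following Python sketch illustrates this, assuming the item sizes given for DB, DW, DD, DQ and DT and the little-endian byte order of the 8086. The names SIZES and dup are illustrative helpers, not assembler directives.

```python
# Bytes reserved per item by each data definition directive.
SIZES = {"DB": 1, "DW": 2, "DD": 4, "DQ": 8, "DT": 10}


def dup(count: int, value: int, directive: str = "DB") -> bytes:
    """Return the bytes reserved by: <directive> <count> DUP (<value>)."""
    # The 8086 stores the low-order byte of a multi-byte item first.
    item = value.to_bytes(SIZES[directive], "little")
    return item * count


ARRAY = dup(10, 0)   # ARRAY DB 10 DUP (0)
CONST = 20           # CONST EQU 20 only names a constant; no memory is reserved

print(len(ARRAY))                 # 10
print(len(dup(CONST, 0, "DW")))   # 40
```

Here the constant named by EQU is used as a repeat count, which is one common reason for naming constants in a data segment.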
ASC DB 'EXAMPLE' ; Array of ASCII values is stored in variable ASC.
(b) DUP directive is used to indicate if the same memory location is used by two
different variable names.
(d) The maximum number of active segments at a time in 8086 can be four.
(e) ASSUME directive specifies the physical address for the data values of
instruction.
14.5.1 Interrupts
An interrupt causes interruption of an ongoing program. Some of the common
interrupts are caused by devices like keyboard, printer, monitor, an error condition,
etc.
Hardware interrupts are generated by a device that requests for some service. A
software interrupt causes a call to the operating system. It usually is the input-output
routine.
In 8086, software interrupts can be used for input-output of data. A software interrupt
is initiated using the following statement:
INT number
In 8086, this interrupt instruction is processed using an interrupt vector table (IVT).
The IVT is located in the first 1K bytes of memory, and has a total of 256 entries,
each of 4 bytes. The entry stores the address of the operating system subroutine that is
used to process the interrupt. This address may be different for different machines.
Figure 14.3 shows the processing of an interrupt.
[Figure 14.3: Processing of an interrupt, showing the IVT entry for INT 10H, the address of its servicing routine, and the return to the calling program]
Step 2: The CPU locates the interrupt servicing routine (ISR) whose address is stored
at the IVT entry of the interrupt. For example, in the figure above the ISR of INT
10H is stored at the address (CS:IP) F000h:F065h.
Step 3: The CPU loads the CS register and the IP register with this new address from
the IVT, and starts executing the instruction at that address.
Step 4: IRET (interrupt return) causes the program to resume execution of the next
instruction of the program, which was being executed prior to interrupt
servicing.
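Since each IVT entry is 4 bytes long, the entry of interrupt n starts at memory address 4 x n. The Python sketch below models the lookup described in the steps above. It relies on the 8086 convention that an entry holds the 16-bit offset (IP) first and the 16-bit segment (CS) next; the names ivt_entry_address and read_vector and the simulated memory are illustrative assumptions.

```python
import struct

ENTRY_SIZE = 4      # 2 bytes of offset (IP) followed by 2 bytes of segment (CS)
ENTRY_COUNT = 256   # 256 entries x 4 bytes = first 1K bytes of memory


def ivt_entry_address(int_number: int) -> int:
    """Return the memory address of the IVT entry of the given interrupt."""
    if not 0 <= int_number < ENTRY_COUNT:
        raise ValueError("interrupt number must be between 0 and 255")
    return int_number * ENTRY_SIZE


def read_vector(memory, int_number: int):
    """Return the (CS, IP) pair stored in the IVT entry of the interrupt."""
    ip, cs = struct.unpack_from("<HH", memory, ivt_entry_address(int_number))
    return cs, ip


# Simulated first 1K bytes of memory with one entry filled in for INT 10H.
memory = bytearray(ENTRY_SIZE * ENTRY_COUNT)
struct.pack_into("<HH", memory, ivt_entry_address(0x10), 0xF065, 0xF000)

cs, ip = read_vector(memory, 0x10)
print(f"INT 10H entry at {ivt_entry_address(0x10):05X}h holds {cs:04X}h:{ip:04X}h")
```

Changing the ISR of an interrupt then only requires writing a new CS:IP pair into its entry; programs that issue the INT instruction need not change.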
The advantage of this type of call is that it appears static to a programmer but flexible
to a system design engineer. For example, INT 00H is a special system level vector
that points to the “recovery from division by zero” subroutine. If a new designer
wants to move this subroutine to a different location in memory, the designer only
adjusts the IVT entry of interrupt 00H to point to the new location. Thus, from the
system programmer's point of view, it is relatively easy to change the vectors under
program control.
One of the commonly used Interrupts for Input /Output is called DOS function call.
Let us discuss more about it in the next subsection:
DOS Function Call    Purpose and Example

AH = 01H
This function call is used for reading a single character from the keyboard and
displaying it on the monitor. The input value is put in the AL register. For example,
to read a character into a memory location X, you may use the following code fragment:
MOV AH, 01H ; load AH register with the function value 01h
INT 21H     ; call the interrupt to read a character in AL
MOV X, AL   ; store the read character in memory location X
AH = 02H
This function prints an 8-bit data (normally ASCII), which is stored in the DL
register, on the screen. For example, to print the character ‘?’ on the monitor, you may
write the following code fragment:
MOV AH, 02H ; load AH register with the function value 02h
MOV DL, '?' ; move the character to be displayed into DL
INT 21H     ; call the interrupt to display the character in DL
AH = 08H
This function is also used to input a single character into the AL register, except
that the character does not get displayed on the monitor. For example, to read a
character into a memory location X, you may use the following code fragment:
MOV AH, 08H ; load AH register with the function value 08h
INT 21H     ; call the interrupt to read a character in AL
MOV X, AL   ; store the read character in memory location X
AH = 09H
This function outputs a string whose offset is stored in the DX register. The string
is terminated by a $ character. You can use this function to print the newline
character, tab character etc. For example, to print the string “Hello World”, you may
use the following code fragment:
DATA SEGMENT
STRING DB 'HELLO WORLD', '$'
DATA ENDS
CODE SEGMENT
…
MOV AX, DATA          ; put the segment address of the data segment in AX
MOV DS, AX            ; initialize data segment register using AX
MOV AH, 09H           ; load AH with the function value 09h
MOV DX, OFFSET STRING ; store the offset of STRING in DX
INT 21H               ; call interrupt 21H to display the STRING
…
AH = 0AH
This function inputs a string of up to 255 characters. The string is stored in a
buffer. For example, the following data and code fragments will input a string into a
buffer of 50 bytes. First, you need to define these parameters in the data segment, as
given below:
DATA SEGMENT
BUFF DB 50
DB ?
DB 50 DUP(0)
DATA ENDS
The name of the data segment, as given above, is DATA. It consists of a total of 52
byte locations, starting at BUFF. The first location of BUFF stores the decimal value 50,
which is the maximum size of the string that can be stored in this buffer. The second
location, marked with ‘?’, will be used to store the actual size of the string, once it is
read into the buffer. The remaining 50 bytes are at present initialized to 0. These bytes
will contain the string once it is read. The code segment, which will perform the string
read operation, is given below:
CODE SEGMENT
…
MOV AH, 0AH          ; move 0AH to AH register
MOV DX, OFFSET BUFF  ; DX contains offset of BUFF
INT 21H              ; call interrupt 21h
…
CODE ENDS
For the given code (complete the other necessary directives and statements) and data
segment, if you input the value “Parv”, then it will be stored in BUFF as given below.
DOS also stores the carriage return (0DH) that ends the input after the last character;
it is not included in the count stored in the second location.
50 4 P a r v 0DH 0 … 0
AH = 4CH
This function call returns control back to the operating system.
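The buffer layout used by function 0AH in the table above can be modelled as follows. This Python sketch is only an illustration of the layout, not of how DOS is implemented; the function name buffered_input is hypothetical. The sketch also stores the carriage return (0DH) that DOS places after the typed characters but does not count in the second byte, which is why at most one character less than the declared maximum is accepted.

```python
def buffered_input(max_len: int, typed: str) -> bytes:
    """Model the buffer contents after INT 21H, function 0AH."""
    if not 1 <= max_len <= 255:
        raise ValueError("the maximum length must fit in one byte")
    # One of the max_len bytes is needed for the terminating carriage return.
    text = typed.encode("ascii")[: max_len - 1]
    body = text + b"\r"
    body += bytes(max_len - len(body))  # the unused bytes stay 0
    # Byte 0: maximum size, byte 1: number of characters actually read.
    return bytes([max_len, len(text)]) + body


buff = buffered_input(50, "Parv")
print(len(buff), buff[0], buff[1], buff[2:6], hex(buff[6]))
```

For the input “Parv” the model gives a 52-byte buffer whose first two bytes are 50 and 4, followed by the four characters and the carriage return.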
value of DL, resulting in 4*10+8. This output is stored as binary value in BL
register.
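The arithmetic 4*10+8 mentioned above is the usual way of turning two ASCII digits into one binary value. A minimal Python sketch of this conversion is given below, assuming that the characters '4' and '8' were the two digits read; the function name is illustrative and the register names in the comments indicate only one possible assignment.

```python
def two_ascii_digits_to_binary(tens_char: str, units_char: str) -> int:
    """Convert two ASCII digit characters to the number they represent."""
    tens = ord(tens_char) - ord("0")    # remove the ASCII bias, e.g. '4' -> 4
    units = ord(units_char) - ord("0")  # e.g. '8' -> 8
    return tens * 10 + units            # e.g. 4*10 + 8, kept in a register such as BL


value = two_ascii_digits_to_binary("4", "8")
print(value, format(value, "08b"))  # 48 00110000
```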
DATA SEGMENT
ST1 DB "Output a string$"
DATA ENDS
CODE SEGMENT
:
MOV DX, OFFSET ST1
MOV AH, 09H
INT 21H
:
CODE ENDS
DATA SEGMENT
MESSAGE DB "The input is$"
DATA ENDS
CODE SEGMENT
MOV AX, DATA            ; Move data segment address to AX
MOV DS, AX              ; Initialize DS register
MOV AH, 08H             ; Set function for character read
INT 21H                 ; Read character in AL
MOV BL, AL              ; Move input to BL
MOV AH, 09H             ; Function to display strings
MOV DX, OFFSET MESSAGE  ; Move offset of string to DX
INT 21H                 ; Display string named MESSAGE
MOV AH, 02H             ; Function to display character
MOV DL, BL              ; Move character to DL
INT 21H                 ; Display character
MOV AX, 4C00H           ; Move 4CH to AH (DOS function call) and 00H to AL
INT 21H                 ; Exit to DOS
CODE ENDS
END
Q1: List the interrupts that can be used to input one character.
Q2: What is the output of the following code segment? Assume that the BL register
contains the binary value 0000 0010.
CODE SEGMENT
:
ADD BL, '0'
MOV DL, BL
MOV AH, 02H
INT 21H
:
CODE ENDS
Q3: Name the interrupt used to exit to the operating system.
The program initializes the data segment to CSEG, which is also the code and stack
segment. The two operands stored in memory locations NUM1 and NUM2 are
added and the result is stored in the location RES. In order to store the carry bit, first a
rotate with carry instruction is executed, followed by masking out the upper 7 bits.
This causes only the carry bit to remain in the AL register. This carry bit is then
moved to the location CARRY. Finally, the program exits to DOS using Interrupt
21H.
The COM programs are stored on a disk with an extension .com. A COM program
uses less disk space in comparison to an equivalent EXE program. At run-time, a
COM program places the stack automatically at the end of the segment, so it uses at
least one complete segment.
The load module of EXE program consists of several segments of length up to 64K. It
may be noted that in 8086 microprocessor a maximum of four segments may be active
at any time. These segments can be of variable sizes, with the maximum size being
64K.
In the subsequent Units, you will be learning to write only EXE programs, because:
EXE programs are better suited for debugging.
Assembled EXE programs can be easily linked to subroutines of high-level
languages.
EXE programs are easy to relocate in the memory, as they do not contain any
ORG statement. It may be noted that an ORG statement forces a program to be
loaded from a specific memory address.
To fully use a multitasking operating system, programs must be able to share
computer memory and resources. An EXE program is easily able to do this.
An example of an EXE program, which is equivalent to the COM program given in the
previous section, is given below:
DATA SEGMENT
NUM1 DB 15h   ; The first operand
NUM2 DB 20h   ; The second operand
RES DB ?      ; Stores the sum
CARRY DB ?    ; Stores the carry bit
DATA ENDS
CODE SEGMENT
ASSUME CS:CODE, DS:DATA
START: MOV AX, DATA    ; Move the segment address to AX
MOV DS, AX             ; Initialize data segment register using AX
MOV AL, NUM1           ; Transfer first operand to AL
ADD AL, NUM2           ; Add second operand to AL
MOV RES, AL            ; Store the result in AL to location RES
RCL AL, 01             ; Rotate AL by 1 bit to get carry into LSB
AND AL, 00000001B      ; Mask out all bits except the LSB
MOV CARRY, AL          ; Store the carry bit into location CARRY
MOV AX, 4C00h          ; Function 4CH in AH, return code 00h in AL
INT 21h                ; Exit to DOS
CODE ENDS
END START
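The way the above program isolates the carry bit can be traced with a small model. The Python sketch below imitates the ADD, RCL and AND steps on 8-bit values; the helper names add8, rcl8 and add_with_carry_byte are illustrative and the sketch models only the registers and flag involved.

```python
def add8(a: int, b: int):
    """Model ADD on 8-bit operands: return (8-bit result, carry flag)."""
    total = a + b
    return total & 0xFF, total >> 8


def rcl8(value: int, carry: int):
    """Model RCL by 1 bit: rotate an 8-bit value left through the carry flag."""
    new_carry = (value >> 7) & 1              # bit 7 moves into the carry flag
    return ((value << 1) | carry) & 0xFF, new_carry


def add_with_carry_byte(num1: int, num2: int):
    al, cf = add8(num1, num2)    # MOV AL, NUM1 / ADD AL, NUM2
    res = al                     # MOV RES, AL (MOV does not change the flags)
    al, cf = rcl8(al, cf)        # RCL AL, 01 brings the old carry into the LSB
    carry = al & 0b00000001      # AND AL, 00000001B keeps only the LSB
    return res, carry


print(add_with_carry_byte(0x15, 0x20))  # (53, 0): 35h with no carry
print(add_with_carry_byte(0xF0, 0x20))  # (16, 1): 10h with a carry
```

The second call uses operands that are not in the program above; they are chosen only to show a case in which the addition produces a carry.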
1. Write an algorithm for your program closer to assembly language. For example,
the algorithm for the preceding program would be:
Assuming that both the numbers, NUM1 and NUM2, are in the memory.
o Put first number from memory to AL
o Add second number from memory to AL
o Store the result in some memory location
o Position carry bit in the least significant bit (LSB) of a byte
o Mask off upper seven bits
o Store the result in the CARRY location.
3. Study the instruction set carefully: Study the available set of instructions, their
format and their limitations. For example, a limitation of the move instruction
is that it cannot move an immediate operand to a segment register. Thus, the
segment address is first moved to a register, say AX, which is then used to
initialize the segment register.
4. You can exit to DOS by using interrupt routine 21h, function 4Ch.
Therefore, 4CH is placed in the AH register, followed by the INT 21H instruction.
This will result in exit to DOS.
5. It is a nice practice to first code your program on paper, and use comments
liberally. This makes programming easier, and also helps you understand your
program later. Please note that the number of comments does not affect the size
of your program.
6. You may assemble your program using an assembler, which helps you in
removing the syntax errors. It also helps in creating an .exe file for execution.
(iii) INT instruction in effect calls a subroutine, which is identified by a
number.
(iv) Interrupt vector table IVT stores the interrupt handling programs.
(vii) String input and output can be achieved using INT 21H with
function number 09h and 0Ah respectively.
(viii) To perform final exit to DOS we must use function 4CH with
the INT 21H.
(xvi) EXE program contains a header module, which is used by DOS for
calculating segment addresses.
(xviii) EXE programs are more easily relocatable than COM programs.
14.8 SUMMARY
This unit introduces you to some of the basic concepts of 8086 programming, especially
input/output. The 8086 microprocessor uses an interrupt vector table (IVT) that points
to the addresses of the interrupt servicing programs. One of the most important
interrupts is interrupt 21H, which is used for input/output and several other
functions. An IVT provides a flexible design environment, as you can change the
interrupt service program without much effort. This unit discusses some of the
important functions of INT 21H. This unit also differentiates between the COM and
EXE programs that are used with the 8086 microprocessor.
14.9 SOLUTIONS/ ANSWERS
Check Your Progress 1
1. (a) It helps in better understanding of computer architecture and
machine language.
(b) Results in smaller machine level code, thus result in efficient execution of
programs.
(c) Flexibility of use as very few restrictions exist.
3. (a) False
(b) False
(c) True
(d) True
(e) False
(f) True
14.10 FURTHER READINGS
1. Yu-Cheng Liu, Glenn A. Gibson, “Microcomputer Systems: The 8086/8088
Family”, 2nd Edition, PHI.
2. Peter Abel, “IBM PC Assembly Language and Programming”, 5th Edition, PHI.
3. Douglas V. Hall, “Microprocessors and Interfacing”, 2nd Edition, Tata
McGraw-Hill Edition.
4. Richard Tropper, “Assembly Programming 8086”, Tata McGraw-Hill Edition.
5. M. Rafiquzzaman, “Microprocessors, Theory and Applications: Intel and
Motorola”, PHI.
UNIT 15 ASSEMBLY LANGUAGE PROGRAMMING
Structure
15.0 Introduction
15.1 Objectives
15.2 Simple Assembly Programs
15.2.1 Data Transfer
15.2.2 Simple Arithmetic Application
15.2.3 Application Using Shift Operations
15.2.4 Larger of the Two Numbers
15.3 Programming With Loops and Comparisons
15.3.1 Simple Program Loops
15.3.2 Find the Largest and the Smallest Array Values
15.3.3 Character Coded Data
15.3.4 Code Conversion
15.4 Programming for Arithmetic and String Operations
15.4.1 String Processing
15.4.2 Some More Arithmetic Problems
15.5 Modular Programming
15.5.1 The Stack
15.5.2 FAR and NEAR Procedures
15.5.3 Parameter Passing
15.5.4 External Procedure
15.6 Summary
15.7 Solutions/ Answers
15.0 INTRODUCTION
After discussing the directives, program development tools and Input/Output
in assembly language programming, let us discuss more about assembly language
programs. In this unit, we will first discuss simple assembly programs, which
perform simple tasks such as data transfer, arithmetic operations, and shift
operations. A key example would be to find the larger of two numbers. Thereafter,
you will go through more complex programs showing how loops and various
comparisons are used to implement tasks like code conversion, coding characters,
finding the largest value in an array etc. This unit also discusses more complex
arithmetic and string operations and modular programming. You must refer to further
readings for more details on these programming concepts.
15.1 OBJECTIVES
After going through this unit, you should be able to:
write assembly programs with simple arithmetic, logical and shift operations;
implement loops;
use comparisons for implementing various comparison functions;
write simple assembly programs for code conversion;
write simple assembly programs for implementing arrays;
explain the use of stack in parameter passing; and
use modular programming in assembly language.
15.2 SIMPLE ASSEMBLY PROGRAMS
In this unit, first simple assembly programs are discussed and later more complex
programs are written. In this section several simple assembly programs are explained.
Directives/Statement            Discussion
DATA SEGMENT                    The data segment stores a variable
VAL DW 4321H                    VAL, which stores a data word.
DATA ENDS
CODE SEGMENT                    Use of assume directive to correlate
ASSUME CS:CODE, DS:DATA         segment registers with segment names
MAINP: MOV AX, DATA             and explicitly initialize data segment
MOV DS, AX                      register.
MOV AX, 8765H                   Move 8765H to AX register.
XCHG AH, AL                     Result: AX=6587H
MOV BX, 8765H                   Move 8765H to BX register.
XCHG BX, VAL                    Result: BX=4321H and VAL=8765H
MOV AX, 4C00H                   Return to operating system using
INT 21H                         Interrupt 21h with function 4C.
CODE ENDS
END MAINP
DATA    SEGMENT
VAL1    DB 25h                     ; Define two variables VAL1 and VAL2
VAL2    DB 65h                     ; consisting of byte values.
DATA    ENDS
CODE    SEGMENT
        ASSUME CS:CODE,DS:DATA     ; ASSUME correlates segment registers with segment names.
        MOV AX, DATA               ; Explicitly initialize the data segment
        MOV DS, AX                 ; register.
        MOV AL, VAL1               ; Load the variable VAL1 into AL
        XCHG VAL2, AL              ; Exchange AL with variable VAL2
        MOV VAL1, AL               ; Now, move AL to variable VAL1
        MOV AX, 4C00H              ; Return to operating system using
        INT 21h                    ; interrupt 21h with function 4Ch.
CODE    ENDS
        END
In Program 2, why have we not used the instruction XCHG VAL1, VAL2 directly? To
answer this question, you should look into the constraints on the operands of data
transfer instructions such as MOV and XCHG: both operands of an instruction cannot
be memory locations, so one of the two values must first be brought into a register.
The program therefore works as follows:
The statement MOV AL, VAL1 copies the value of VAL1, that is 25h, to the AL register.
The instruction XCHG VAL2, AL exchanges the value of the AL register (25h) with
VAL2 (65h). Thus, after the execution of this instruction AL will contain 65h and
VAL2 will contain 25h. VAL1 at this time will still contain 25h.
Finally, the instruction MOV VAL1, AL will put the value 65h into VAL1.
Discussion: The program should have two memory variables stored in memory
locations FIRST and SECOND and a third location for storing the mean value. An ADD
instruction cannot add two memory locations directly, so you are required to move a
single value into AL first and then add the second value to it. In addition, on adding
two byte values, there is a possibility of a carry bit. Assuming that the problem
deals with two unsigned binary numbers, the question is how to put the carry bit into
the AH register such that AX (AH:AL) reflects the sum of the byte values. This is done
using the ADC instruction. The ADC AH,00h instruction will add the immediate number
00h to the contents of the carry flag and the contents of the AH register. The result
will be left in the AH register. Since we had cleared AH to all zeros before the
ADC (MOV does not change the flags), we really are adding 00h + 00h + CF. The result
of all this is that the carry flag bit is put in the AH register, as desired. Finally,
to get the mean value, you can divide the sum given in AX by 2. After the division,
the 8-bit quotient will be left in the AL register, which can then be copied into the
memory location named MEAN.
DATA    SEGMENT
FIRST   DB 90h                     ; Three variables
SECOND  DB 78h
MEAN    DB ?
DATA    ENDS
CODE    SEGMENT
        ASSUME CS:CODE, DS:DATA
START:  MOV AX, DATA               ; Initialize data segment
        MOV DS, AX
        MOV AL, FIRST              ; Get FIRST number in AL
        ADD AL, SECOND             ; Add SECOND number to AL
        MOV AH, 00h                ; Clear AH register
        ADC AH, 00h                ; Put carry in AH
        MOV BL, 02h                ; Load divisor (2) in BL register
        DIV BL                     ; Divide AX by BL. Quotient in AL and remainder in AH
        MOV MEAN, AL               ; Copy the result to memory
        MOV AX, 4C00H              ; and return to operating system.
        INT 21H
CODE    ENDS
        END START
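The arithmetic of this program can be checked with a small model. The following Python sketch is only an illustration under assumptions (the function name mean_of_two_bytes is hypothetical and not part of the unit); it mimics the 8-bit ADD, the ADC into AH and the DIV by 2.

```python
def mean_of_two_bytes(first, second):
    """Model of ADD AL,SECOND / ADC AH,00h / DIV BL for two unsigned bytes."""
    total = (first & 0xFF) + (second & 0xFF)
    al = total & 0xFF            # ADD keeps only the low 8 bits in AL
    cf = total >> 8              # carry flag is 1 if the sum needs a 9th bit
    ah = (0x00 + 0x00 + cf) & 0xFF   # MOV AH,00h followed by ADC AH,00h
    ax = (ah << 8) | al          # AX = AH:AL now holds the full sum
    quotient, remainder = divmod(ax, 2)  # DIV BL with BL = 2
    return quotient, remainder   # quotient -> AL, remainder -> AH


if __name__ == "__main__":
    q, r = mean_of_two_bytes(0x90, 0x78)
    print(f"MEAN = {q:02X}h, remainder = {r}")
```

With FIRST = 90h and SECOND = 78h the sum is 0108h, which does not fit in AL alone; this is why the carry has to be moved into AH before dividing.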
Move the most significant BCD digit to the upper four bit positions of its byte by
using a rotate instruction (for example, 03h becomes 30h).
Pack the bits of the rotated BCD digit with the least significant BCD digit bits
(for example, 30h OR 09h = 39h) to create a packed two-digit BCD number in a byte.
DATA      SEGMENT
HighDigit DB '3'                   ; The data segment stores the ASCII
LowDigit  DB '9'                   ; value of the two digits; the assumed
PackedBCD DB ?                     ; digits are '3' and '9'.
DATA      ENDS
CODE      SEGMENT
          ASSUME CS:CODE, DS:DATA
START:    MOV AX, DATA             ; Initialize data segment register
          MOV DS, AX
          MOV BL, HighDigit        ; Move the higher digit (3) to BL
          MOV AL, LowDigit         ; Move the lower digit (9) to AL
          AND BL, 0Fh              ; Mask upper 4 bits of BL
          AND AL, 0Fh              ; Mask upper 4 bits of AL
          MOV CL, 04h              ; Move the rotate count to CL
          ROL BL, CL               ; Rotate BL register using CL
          OR AL, BL                ; OR to get the packed BCD in AL
          MOV PackedBCD, AL        ; Store the result in PackedBCD
          MOV AX, 4C00H            ; Return to operating system
          INT 21H
CODE      ENDS
          END START
Discussion on Program 4:
8086 does not have a dedicated instruction to swap the upper and lower four bits of a
byte, therefore you need to rotate the byte by four bit positions, with the count
held in CL. Of the two left-rotate instructions, ROL and RCL, we have chosen ROL, as
it rotates the byte left by one or more positions within the byte itself. RCL, on the
other hand, moves the MSB into the carry flag and brings the original carry flag into
the LSB position, which is not what we want. The rest of the program proceeds as per
the algorithm.
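The masking, rotating and ORing steps of Program 4 can be traced with the following Python sketch. It is a model under assumptions, not 8086 code; the helper names rol8 and pack_bcd are hypothetical.

```python
def rol8(value, count):
    """Rotate an 8-bit value left by count positions (like ROL reg8, CL)."""
    value &= 0xFF
    count %= 8
    return ((value << count) | (value >> (8 - count))) & 0xFF


def pack_bcd(high_ascii, low_ascii):
    """Pack two ASCII digits into one packed BCD byte, as Program 4 does."""
    bl = ord(high_ascii) & 0x0F   # AND BL, 0Fh : keep only the digit value
    al = ord(low_ascii) & 0x0F    # AND AL, 0Fh
    bl = rol8(bl, 4)              # ROL BL, CL with CL = 4 : 03h -> 30h
    return al | bl                # OR AL, BL : 09h OR 30h -> 39h


if __name__ == "__main__":
    print(f"{pack_bcd('3', '9'):02X}h")
```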
Program 5: Write a program using 8086 assembly language that adds two binary
numbers (assume the numbers are of byte type) stored in consecutive memory
locations. The result of the addition and the carry, if any, are also stored in
memory locations.
DATA    SEGMENT
NUM1    DB 25h                     ; First number contains 25h
NUM2    DB 80h                     ; Second number contains 80h
RES     DB ?                       ; Will store sum of the two numbers
CARY    DB ?                       ; Will store carry bit, if any
DATA    ENDS
CODE    SEGMENT
        ASSUME CS:CODE, DS:DATA
START:  MOV AX, DATA               ; Initialize data segment register
        MOV DS, AX
        MOV AL, NUM1               ; Load the first number in AL,
        ADD AL, NUM2               ; add the second number to AL and
        MOV RES, AL                ; store the result into RES.
        RCL AL, 01                 ; Rotate AL with carry, to bring carry bit to the LSB
        AND AL, 00000001B          ; Mask out all bits except the least significant bit
        MOV CARY, AL               ; Store the carry bit to CARY
        MOV AH, 4CH                ; and return to the operating system.
        INT 21H
CODE    ENDS
        END START
Discussion:
RCL instruction brings the carry into the least significant bit position of the AL
register. The AND instruction is used for masking higher order bits of AL.
The following examples show how the flags are set when the numbers are compared.
Example 1:
MOV BL, 02h ;Move 02h to BL
CMP BL, 10h ; Compare BL with 10h. Sets carry flag = 1
As the value of BL is less than 10h, the carry flag would be set as borrow
would be needed to subtract 10h from BL.
Example 2:
MOV AX, 0F0F0h ; Same value is moved to AX
MOV DX, 0F0F0h ; and DX
CMP AX, DX ; On comparison, it sets Zero flag = 1
The zero flag is set as both the operands are same.
Example 3:
MOV BX,200H
CMP BX, 0 ; Zero and Carry flags = 0
The destination register (BX) contains a value greater than the source (0), so
both the zero and the carry flags are cleared.
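The three examples can be verified with a small model of CMP, which performs the subtraction destination minus source, discards the result and keeps only the flags. The Python function cmp_flags below is a hypothetical helper written for illustration; it models only the zero and carry flags for unsigned operands.

```python
def cmp_flags(dest, src, bits=8):
    """Return the zero and carry flags that CMP dest, src would produce."""
    mask = (1 << bits) - 1
    dest &= mask
    src &= mask
    result = (dest - src) & mask          # the result itself is discarded by CMP
    return {
        "ZF": int(result == 0),           # set when both operands are equal
        "CF": int(dest < src),            # set when a borrow is needed
    }


if __name__ == "__main__":
    print(cmp_flags(0x02, 0x10))          # Example 1
    print(cmp_flags(0xF0F0, 0xF0F0, 16))  # Example 2
    print(cmp_flags(0x0200, 0x0000, 16))  # Example 3
```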
In the following section we will discuss an example that uses the flags set by CMP
instruction.
1. In a MOV instruction, the immediate operand value for 8-bit destination cannot
exceed F0h.
3. In the example given in section 15.2.2 you can replace the instruction DIV BL
with a shift instruction.
4. A single instruction cannot swap the upper and lower four bits of a byte
register.
5. An unpacked BCD number requires 8 bits of storage; however, two
unpacked BCD numbers can be packed in a single byte register.
In the example given above, the control of the program will transfer directly to the
label THERE if the value stored in the AX register is equal to that of the register
BX. The same example can be rewritten in the following manner, using different jumps.
Example 5:
CMP AX, BX ; compare instruction: sets flags
JNE FIX ; if not equal do addition
JMP THERE ; if equal skip next instruction
FIX: ADD AX, 02 ; add 02 to AX
…
THERE: MOV CL, 07
The assembly code given above is not efficient, but it shows that there are many ways
in which a conditional jump can be implemented. You should select the most
suitable way based on your program requirements.
Example 6:
CMP DX, 00 ; checks if DX is zero.
JE Label1 ; if yes, jump to Label1 i.e., if ZF=1
…
Label1:other instructions ; control comes here if DX=0
Example 7:
LOOPING
Program 6: Write a program using 8086 assembly language that computes the new
prices from a series of prices stored in the memory. You may assume a constant
inflation factor that is added to each old price value. Also assume that all the
prices are given in the BCD form.
Discussion:
Input: A list of prices stored in the memory and a constant inflation factor
Output: The new prices
Process:
Repeat the following steps
Read a price (in BCD) from the input array
Add inflation factor
Adjust result to correct BCD
Put result back in the same array
Until all prices are converted to new price
ARRAYS   SEGMENT                        ; The data segment is named ARRAYS
PRICES   DB 25h, 35h, 45h, 65h, 75h     ; and consists of a list of 5 PRICES.
ARRAYS   ENDS
CODE     SEGMENT
         ASSUME CS:CODE, DS:ARRAYS      ; Initialize data segment. Please
START:   MOV AX, ARRAYS                 ; note the use of the name ARRAYS.
         MOV DS, AX
         LEA BX, PRICES                 ; Move address of variable PRICES to BX
         MOV CX, 0005h                  ; and move 5 to CX, as PRICES has 5 values.
DO_NEXT: MOV AL, [BX]                   ; Load the current value from the array to AL
         ADD AL, 0Ah                    ; and add constant 0Ah, the assumed inflation factor.
         DAA                            ; Since input is BCD, DAA adjusts the addition,
         MOV [BX], AL                   ; and the result is stored in PRICES again.
         INC BX                         ; BX is incremented to point to the next value
         DEC CX                         ; and counter CX is decremented.
         JNZ DO_NEXT                    ; If the decrement does not result in zero, jump to DO_NEXT;
         MOV AH, 4CH                    ; else all the values of PRICES have been processed,
         INT 21H                        ; so the program exits to the operating system.
CODE     ENDS
         END START
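The effect of ADD followed by DAA on each price can be traced with the following Python sketch. It is a simplified model under assumptions: the names add8, daa and update_prices are hypothetical, and only the AL result and the carry flag of DAA are modelled.

```python
def add8(a, b):
    """8-bit ADD: returns (AL, carry flag, auxiliary carry flag)."""
    total = (a & 0xFF) + (b & 0xFF)
    af = int((a & 0x0F) + (b & 0x0F) > 0x0F)   # carry out of the low nibble
    return total & 0xFF, total >> 8, af


def daa(al, cf, af):
    """Decimal adjust AL after addition of two packed BCD bytes."""
    old_al = al
    if (al & 0x0F) > 9 or af:       # low digit is not a valid BCD digit
        al = (al + 0x06) & 0xFF
    if old_al > 0x99 or cf:         # high digit overflowed past 9
        al = (al + 0x60) & 0xFF
        cf = 1
    else:
        cf = 0
    return al, cf


def update_prices(prices, factor=0x0A):
    """Model of the DO_NEXT loop: add the factor and adjust each price."""
    new_prices = []
    for price in prices:
        al, cf, af = add8(price, factor)
        al, cf = daa(al, cf, af)
        new_prices.append(al)
    return new_prices


if __name__ == "__main__":
    print([f"{p:02X}h" for p in update_prices([0x25, 0x35, 0x45, 0x65, 0x75])])
```

Note that the constant 0Ah is a binary ten; after the DAA adjustment each BCD price is increased by decimal 10 (for example, 25h becomes 35h).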
Discussion:
Please note the use of the instruction LEA BX,PRICES. It will load the BX register
with the offset of the array PRICES in the data segment named ARRAYS. [BX] is an
indirection through BX and points to the value stored at that element of the array
named PRICES. BX is incremented to point to the next element of the array. The CX
register acts as a loop counter and is decremented by one to keep a check on the
bounds of the array. Once the CX register becomes zero, DEC CX sets the zero flag
to 1. The JNZ instruction tests the zero flag: it jumps back to DO_NEXT while the
zero flag is 0, and the loop terminates when the zero flag is 1 because JNZ then
does not loop back.
The same program can be written using the LOOP instruction, in such case, DEC CX
and JNZ DO_NEXT instructions are replaced by LOOP DO_NEXT instruction.
LOOP decrements the value of CX and jumps to the given label, only if CX is not
equal to zero. The LOOP instruction is demonstrated with the help of following
program:
Program 7: Write a program using 8086 assembly language that prints the letters
A to Z on the screen. This program is written using the LOOP instruction.
CODE    SEGMENT
        ASSUME CS:CODE
MAINP:  MOV CX, 1AH                ; 1AH=26 (number of letters to be displayed)
        MOV DL, 41H                ; 41H is the hexadecimal equivalent of ASCII 'A'.
NEXTC:  MOV AH, 02H                ; Function 02H of interrupt 21h is used to display
        INT 21H                    ; the character stored in DL.
        INC DL                     ; Increment DL to the next letter value
        LOOP NEXTC                 ; LOOP decrements CX by 1 and checks if CX=0;
                                   ; if not, it loops to label NEXTC to print the
                                   ; remaining characters.
        MOV AX, 4C00H              ; Once all the characters are printed, the program
        INT 21H                    ; returns to the operating system.
CODE    ENDS
        END MAINP
Let us now discuss a slightly more complex program in the next section.
Program 8: Write a program using 8086 assembly language to find the largest and
smallest numbers in an array.
Discussion: Initialize the SMALL and the LARGE variables as the first number in
the array. They are then compared with the other array values one by one. If the
value happens to be smaller than the assumed smallest number or larger than the
assumed largest value, the SMALL or the LARGE variable is replaced by this new value
respectively. Let us use register DI to point to the current array value and the
LOOP instruction for looping.
DATA    SEGMENT                    ; Data segment includes a total of 6 signed
ARRAY   DW -1, 2000,               ; values. You need to find the largest and the
           -4000, 32767, 500, 0    ; smallest among these values. The largest
LARGE   DW ?                       ; and smallest values will be stored in
SMALL   DW ?                       ; variables LARGE and SMALL respectively.
DATA    ENDS
CODE    SEGMENT
        ASSUME CS:CODE, DS:DATA
START:  MOV AX, DATA               ; Initialize the data segment register using AX
        MOV DS, AX
        MOV DI, OFFSET ARRAY       ; Move offset of ARRAY of data segment to DI
        MOV AX, [DI]               ; and move the array element pointed to by DI to AX.
        MOV DX, AX                 ; DX and BX are used to store the largest and smallest
        MOV BX, AX                 ; respectively; the first value is moved to both.
        MOV CX, 6                  ; Since the size of array is 6, move 6 to CX.
Point to Note: Since the data is word type that is equal to 2 bytes and memory
organisation is byte wise, to point to next array value DI is incremented by 2.
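The comparison logic described in the discussion of Program 8 can be sketched in Python as follows. This is an illustrative model under assumptions (the helper names to_signed16 and largest_and_smallest are hypothetical); it treats each 16-bit word as a signed value, which is what signed conditional jumps such as JG and JL would do in the assembly version.

```python
def to_signed16(word):
    """Interpret a 16-bit word as a two's complement signed number."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


def largest_and_smallest(words):
    """Scan an array of 16-bit words and return (largest, smallest)."""
    large = small = to_signed16(words[0])   # start with the first element
    for word in words[1:]:                  # the pointer moves 2 bytes per word
        value = to_signed16(word)
        if value > large:
            large = value
        if value < small:
            small = value
    return large, small


if __name__ == "__main__":
    # The array of Program 8 as it is stored in memory (16-bit patterns).
    array = [v & 0xFFFF for v in (-1, 2000, -4000, 32767, 500, 0)]
    print(largest_and_smallest(array))
```

If the same words were compared as unsigned numbers, the pattern FFFFh (that is, -1) would wrongly be reported as the largest value.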
As each digit is input, you would store its ASCII code in a memory byte. After the
first number has been input, the number would be stored as follows:
The number is stored as:
31 32 33 34    hexadecimal values stored in memory
1  2  3  4     equivalent ASCII digits
Each of these numbers will be input as equivalent ASCII digits. Either the digit
string needs to be converted to a 16-bit binary value that can be used for
computation, or the ASCII digits themselves can be added, followed by an
instruction that adjusts the sum.
Another important data format is packed decimal numbers (packed BCD). A packed
BCD contains two decimal digits per byte. Packed BCD format has the following
advantages:
The BCD numbers allow accurate calculations for almost any number of
significant digits.
Conversion of packed BCD numbers to ASCII (and vice versa) is relatively fast.
An implicit decimal point may be used for keeping track of its position in a
separate variable.
The instructions DAA (decimal adjust after addition) and DAS (decimal adjust after
subtraction) are used for adjusting the result of an addition or subtraction operation
on packed decimal numbers. However, no such instruction exists for multiplication and
division. For multiplication and division, the numbers must first be unpacked,
then multiplied or divided, and finally packed again.
Let us discuss the process of conversion of ASCII digits to equivalent binary number,
which can be used for computation.
DATA    SEGMENT
ASCII   DB 39h                     ; ASCII variable contains an ASCII digit.
DATA    ENDS
CODE    SEGMENT
        ASSUME CS:CODE, DS:DATA
START:  MOV AX, DATA               ; Initialize data segment using AX
        MOV DS, AX
        MOV AL, ASCII              ; Get the ASCII digit in AL register and
        CMP AL, 30h                ; compare it with 30h. If it is less than 30h, it is
        JB ERROR                   ; not a valid digit, so go to label ERROR.
        CMP AL, 3Ah                ; If it is below 3Ah, it is one of the digits 0 to 9,
        JB NUMBER                  ; so go to label NUMBER.
ATOF:   CMP AL, 41h                ; You will be here if ASCII is greater than or
        JB ERROR                   ; equal to 3Ah. If it is below 41h, go to ERROR.
        CMP AL, 46h                ; Next, check if the ASCII digit is above 46h;
        JA ERROR                   ; if yes, go to label ERROR.
        SUB AL, 37h                ; Otherwise convert ASCII 'A' to 'F' to the hex digit
        JMP CONVERTED              ; by subtracting 37h and jump to CONVERTED.
NUMBER: SUB AL, 30h                ; Convert ASCII '0' to '9' by subtracting 30h.
        JMP CONVERTED
ERROR:  MOV AL, 0FFh               ; An error is indicated by moving FFh to AL.
CONVERTED: MOV AH, 4Ch             ; AL contains the converted hex digit (or FFh); only AH
        INT 21h                    ; is loaded so that AL is preserved, and the program
CODE    ENDS                       ; returns to the operating system.
        END START
Discussion:
The above program demonstrates the conversion of a single ASCII character to the
equivalent hexadecimal digit represented by that ASCII character. The program
can be extended to take more ASCII values and convert them into a 16-bit
binary number.
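The range checks of the conversion, and the suggested extension to a 16-bit value, can be sketched in Python as follows. The function names ascii_to_hex_digit and ascii_hex_to_word are hypothetical and the code is only a model of the comparisons, not the unit's own program.

```python
def ascii_to_hex_digit(code):
    """Convert one ASCII code ('0'-'9', 'A'-'F') to its value; 0xFF on error."""
    if code < 0x30:                    # below '0' : JB ERROR
        return 0xFF
    if code < 0x3A:                    # '0'..'9' : subtract 30h
        return code - 0x30
    if code < 0x41 or code > 0x46:     # between '9' and 'A', or above 'F'
        return 0xFF
    return code - 0x37                 # 'A'..'F' : subtract 37h


def ascii_hex_to_word(text):
    """Extension: convert up to four ASCII hex digits to a 16-bit value."""
    value = 0
    for ch in text:
        digit = ascii_to_hex_digit(ord(ch))
        if digit == 0xFF:
            raise ValueError(f"invalid hex digit: {ch!r}")
        value = ((value << 4) | digit) & 0xFFFF   # shift in one nibble at a time
    return value


if __name__ == "__main__":
    print(ascii_to_hex_digit(0x39), hex(ascii_hex_to_word("1A2F")))
```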
15.4 PROGRAMMING FOR ARITHMETIC AND STRING OPERATIONS
Let us discuss some more advanced features of assembly language programming in
this section. Some of these features give assembly an edge over high-level
language (HLL) programming as far as efficiency is concerned. One such feature
is the set of string processing instructions. The object code generated after
compiling an HLL program containing string operations is much longer than the same
program written in assembly language. The following section discusses a program for
string processing:
DATA     SEGMENT
PASSWORD DB 'FAILSAFE'                  ; The source string
DESTSTR  DB 'FEELSAFE'                  ; The destination string
MESSAGE  DB 'String are equal$'         ; The message to be displayed if
DATA     ENDS                           ; strings are the same
CODE     SEGMENT
         ASSUME CS:CODE,DS:DATA,ES:DATA ; The string matching requires two segments for
         MOV AX, DATA                   ; data, viz. data segment for source string and
         MOV DS, AX                     ; extra data segment for destination string.
         MOV ES, AX                     ; Thus, DS and ES are initialized using AX.
Discussion:
In the program given above, the instruction CMPSB compares the two strings pointed to
by SI in the data segment and by the DI register in the extra data segment. The
strings are compared byte by byte and then the pointers SI and DI are incremented to
the next byte. Please note that the last letter B in the instruction indicates a byte.
If it is W, that is, if the instruction is CMPSW, then the comparison is done word by
word and SI and DI are each incremented by 2, that is, to the next word. The REPE
prefix in front of the instruction tells the 8086 to decrement the CX register by one
and continue to execute the CMPSB instruction as long as the compared elements are
equal (zero flag = 1), until the counter (CX) becomes zero. Thus, the code
size is substantially reduced when string instructions are used.
In the same way, you can write efficient programs for moving one string to another
using MOVS, and for scanning a string for a character using SCAS.
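The repeat behaviour of REPE CMPSB can be modelled with the Python sketch below. It assumes the direction flag is clear (forward comparison) and that the count is at least 1; the function name repe_cmpsb is hypothetical.

```python
def repe_cmpsb(source, dest, count):
    """Model REPE CMPSB: returns (zero flag, remaining CX, final SI offset)."""
    si = di = 0
    cx = count
    zf = 1                      # assumption: treated as equal if nothing is compared
    while cx != 0:
        zf = int(source[si] == dest[di])   # CMPSB compares one byte pair
        si += 1                            # both pointers move to the next byte
        di += 1
        cx -= 1                            # REPE decrements CX after each compare
        if zf == 0:                        # REPE stops at the first mismatch
            break
    return zf, cx, si


if __name__ == "__main__":
    print(repe_cmpsb(b"FAILSAFE", b"FEELSAFE", 8))
    print(repe_cmpsb(b"FAILSAFE", b"FAILSAFE", 8))
```

After the instruction finishes, a program would test the zero flag (for example with JE or JNE) to decide whether the strings matched.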
Let us write a program to add two 5-byte numbers stored in arrays. For example,
two numbers in hex can be:
Carry in       0   0   0   1   0
              20  11  01  10  FF
              FF  40  30  20  10
Carry out  1  1F  51  31  31  0F
Let us also assume that the numbers are stored with the least significant byte first
and put in memory in two arrays. The result is stored in the third array SUM. The
SUM also contains the carry out information, and thus would be 1 byte longer than
the number arrays.
Program 11: Write a program in 8086 assembly language to add two five-byte
numbers using arrays.
Algorithm:
Input: two arrays of 5 bytes each.
Output: an array of sum of size 6 bytes
Process:
Repeat the following steps till all the elements of array (5) are added
Load the byte of first array in AL
Add the corresponding byte of second array in AL with carry
Store the result in a memory array
Increment to next bytes
Rotate carry into LSB of accumulator
Mask all but LSB of accumulator
Store carry result in memory
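The algorithm above can be checked with the following Python sketch, which adds the two arrays byte by byte with carry, starting from the least significant byte. The function name add_multibyte is hypothetical; the sketch models the data flow only.

```python
def add_multibyte(num1, num2):
    """Add two equal-length little-endian byte arrays; result is one byte longer."""
    carry = 0                          # CLC : start with carry cleared
    result = []
    for a, b in zip(num1, num2):
        total = a + b + carry          # ADC : add with the carry of the previous byte
        result.append(total & 0xFF)    # store the low 8 bits in SUM
        carry = total >> 8             # carry out becomes carry in of the next byte
    result.append(carry)               # the final carry is stored as the sixth byte
    return result


if __name__ == "__main__":
    num1 = [0xFF, 0x10, 0x01, 0x11, 0x20]   # 20 11 01 10 FF, least significant byte first
    num2 = [0x10, 0x20, 0x30, 0x40, 0xFF]   # FF 40 30 20 10
    print([f"{b:02X}" for b in add_multibyte(num1, num2)])
```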
DATA    SEGMENT
NUM1    DB 0FFh,10h,01h,11h,20h    ; Two arrays of size 5 each; SUM will
NUM2    DB 10h,20h,30h,40h,0FFh    ; store the addition and the overall carry out
SUM     DB 6 DUP(0)                ; bit.
DATA    ENDS
CODE    SEGMENT
        ASSUME CS:CODE, DS:DATA
START:  MOV AX, DATA               ; Initialize segment register
        MOV DS, AX
        MOV SI, 00                 ; SI is used as the index register into the arrays,
        MOV CX, 05h                ; therefore it is initialized to 0. CX is initialized
        CLC                        ; to the size of arrays and CLC clears the carry bit.
THOU    EQU 3E8h                   ; Constant THOU is equal to 1000, i.e. 3E8h
DATA    SEGMENT
BCD     DW 4567h
HEX     DW ?
DATA    ENDS
CODE    SEGMENT
        ASSUME CS:CODE, DS:DATA
START:  MOV AX, DATA               ; Initialize data segment register.
        MOV DS, AX
        MOV AX, BCD                ; AX = 4567h
        MOV BX, AX                 ; BX = AX = 4567h
        MOV AL, AH                 ; AL = AH = 45h
        MOV BH, BL                 ; BH = BL = 67h
        MOV CL, 04                 ; CL = 4, as 4-bit rotation will be used
        ROR AH, CL                 ; AH = 54h due to 4-bit rotation
        ROR BH, CL                 ; BH = 76h due to 4-bit rotation
        AND AX, 0F0FH              ; AX = 5445h AND 0F0Fh = 0405h
        AND BX, 0F0FH              ; BX = 7667h AND 0F0Fh = 0607h
        MOV CX, AX                 ; AX is moved to CX so that AX can be used for
                                   ; other operations. CX = AX = 0405h
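The listing above unpacks the four BCD digits into separate registers. One way to picture the overall conversion, assuming the unpacked digits are then weighted by 1000, 100, 10 and 1 and summed, is the following Python sketch. The function name packed_bcd_word_to_binary is hypothetical and the weighting step is an assumption about the remaining part of the program.

```python
def packed_bcd_word_to_binary(bcd):
    """Convert a 4-digit packed BCD word (e.g. 4567h) to its binary value."""
    # Unpack the four nibbles, most significant digit first.
    digits = [(bcd >> shift) & 0x0F for shift in (12, 8, 4, 0)]
    if any(d > 9 for d in digits):
        raise ValueError("not a valid packed BCD word")
    thousands, hundreds, tens, units = digits
    # Weight each digit by its decimal place value; 1000 is 3E8h (THOU).
    return thousands * 0x3E8 + hundreds * 100 + tens * 10 + units


if __name__ == "__main__":
    print(f"{packed_bcd_word_to_binary(0x4567):04X}h")
```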
partly in higher level language necessarily involves at least one module for each
language.
3. Modular programming allows for the creation, maintenance and reuse of a
library of commonly used modules.
4. Modules are easy to comprehend.
5. Different modules can be assigned to different programmers.
6. Debugging and testing can be done in a more orderly fashion.
7. Documentation can be easily understood.
8. Modifications may be localized to a module.
[Figure: a Main Module and its sub-modules, including Module D and Module E]
You can divide a program into subroutines or procedures. You need to CALL the
procedures whenever needed. A subroutine call instruction transfers the control to
the subroutine instructions and the return statement brings the control back to the
calling program.
In the 8086 microprocessor a stack is created in the stack segment. The SS register
stores the base of the stack segment and the SP register stores the position of the
top of the stack. A value is pushed onto the top of the stack or taken out (popped)
from the top of the stack. The stack of 8086 is a word stack. In order to use the
stack, first the stack segment register is initialized, as given below:
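STACK_SEG SEGMENT STACK
        DW 100 DUP(0)                 ; Stack of 100 words, initialized to zero
TOS     LABEL WORD                    ; Label for the initial top of the stack
STACK_SEG ENDS
CODE    SEGMENT
        ASSUME CS:CODE,SS:STACK_SEG   ; Just like a data segment, the SS register is
        MOV AX, STACK_SEG             ; initialized to the base of the stack segment.
        MOV SS, AX
        LEA SP, TOS                   ; The SP register is loaded with the maximum
        …                             ; offset of the stack, represented by label TOS.
CODE    ENDS
        END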
The directive STACK_SEG SEGMENT STACK declares the logical segment for the
stack segment. DW 100 DUP(0) sets the actual size of the stack to 100 words. All
locations of this stack are initialized to zero. The label TOS defines the initial top
of this empty stack. Please note that the stack in 8086 is a WORD stack. The stack
grows from a higher offset to a lower offset. The top position of the stack uses an
indirect addressing mechanism through a special register called the stack pointer
(SP). SP is initially made to point to the label TOS. SP is automatically decremented
when an item is put on the stack (called the PUSH operation) and incremented when an
item is retrieved from the stack (called the POP operation). SP points to the address
of the last element pushed onto the stack. The following table explains the PUSH and
POP instructions of the 8086 microprocessor.
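The way SP moves during PUSH and POP can be modelled with the short Python class below. It is a sketch under assumptions: the class name WordStack is hypothetical, the stack is 100 words as in the declaration described above, and offsets are relative to the stack segment base.

```python
class WordStack:
    """Model of an 8086 word stack that grows towards lower offsets."""

    def __init__(self, size_words=100):
        self.mem = bytearray(size_words * 2)   # DW 100 DUP(0)
        self.sp = size_words * 2               # SP starts at TOS, just past the last word

    def push(self, word):
        if self.sp < 2:
            raise OverflowError("stack full: SP has reached offset 0000h")
        self.sp -= 2                           # SP is decremented by 2 first
        self.mem[self.sp] = word & 0xFF        # low byte at the lower address
        self.mem[self.sp + 1] = (word >> 8) & 0xFF

    def pop(self):
        if self.sp > len(self.mem) - 2:
            raise IndexError("stack empty")
        word = self.mem[self.sp] | (self.mem[self.sp + 1] << 8)
        self.sp += 2                           # SP is incremented by 2 after reading
        return word


if __name__ == "__main__":
    stack = WordStack()
    stack.push(0x1234)
    print(hex(stack.sp), hex(stack.pop()), hex(stack.sp))
```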
The NEAR procedure call is also known as an intra-segment call, as the called
procedure is in the same segment from which the call has been made. Thus, only IP is
stored as the return address on the top of the stack. The IP can be stored on the
stack as:
Initial stack top (SP)       .
                             IP HIGH
SP after NEAR call           IP LOW
                             .
                             .
Stack segment base (SS)      (Low address)
Please note that the growth of the stack is towards the stack segment base. So, the
stack becomes full at offset 0000h. Also, for a push operation you decrement SP by 2,
as the stack is a word stack (word size in 8086 = 16 bits) while memory is byte
organized.
A FAR procedure call, also known as an intersegment call, is a call made to a separate
code segment. Thus, the control will be transferred outside the current segment.
Therefore, both CS and IP need to be stored as the return address.
When the 8086 executes the FAR call, it first stores the contents of the code segment
register followed by the contents of IP onto the stack. A RET from a NEAR
procedure pops two bytes into IP. The RET from a FAR procedure pops four
bytes from the stack: first IP and then CS.
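The difference between the return addresses saved by NEAR and FAR calls can be illustrated with the following Python sketch. The functions call and ret are hypothetical models that use a Python list as the word stack; the segment and offset values in the usage lines are arbitrary examples.

```python
def call(stack, cs, ip, far):
    """Save the return address: IP only for NEAR, CS then IP for FAR."""
    if far:
        stack.append(cs)     # FAR call pushes CS first
    stack.append(ip)         # both kinds of call push IP last


def ret(stack, cs, far):
    """Restore the return address and return the resulting (CS, IP)."""
    ip = stack.pop()         # IP is popped first (two bytes)
    if far:
        cs = stack.pop()     # a FAR return pops CS as well (four bytes in all)
    return cs, ip


if __name__ == "__main__":
    stack = []
    call(stack, cs=0x3000, ip=0x0050, far=False)   # NEAR call: one word saved
    call(stack, cs=0x3000, ip=0x5055, far=True)    # FAR call: two words saved
    print(len(stack))
    print(ret(stack, cs=0x4000, far=True))         # returns to the saved CS:IP
    print(ret(stack, cs=0x3000, far=False))        # CS is unchanged for NEAR
```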
Procedure is defined within the source code by placing a directive of the form:
<Procedure name> PROC <Attribute>
The <procedure name> is the identifier used for calling the procedure and the
<attribute> is either NEAR or FAR. A procedure can be defined in:
1. The same code segment as the statement that calls it.
2. A code segment that is different from the segment containing the statement that
calls it, but in the same source module as the calling statement.
3. A different source module and segment from the calling statement.
In the first case the <attribute> code NEAR should be used as the procedure and code
are in the same segment. For the latter two cases the <attribute> must be FAR.
DATA_SEG  SEGMENT
BCD       DB 25h                      ; Storage for BCD value
BIN       DB ?                        ; Storage for binary value
DATA_SEG  ENDS
STACK_SEG SEGMENT STACK
          DW 100 DUP(0)               ; Stack of 100 words
TOP_STACK LABEL WORD                  ; Label for stack top
STACK_SEG ENDS
CODE_SEG  SEGMENT
          ASSUME CS:CODE_SEG, DS:DATA_SEG, SS:STACK_SEG
START:    MOV AX, DATA_SEG            ; Initialize data segment
          MOV DS, AX
          MOV AX, STACK_SEG           ; Initialize stack segment
          MOV SS, AX
          MOV SP, OFFSET TOP_STACK    ; Initialize stack pointer
          MOV AL, BCD                 ; Move BCD value into AL and
          PUSH AX                     ; push it onto the word stack as the parameter,
          CALL BCD_BINARY             ; then call the procedure.
          POP AX                      ; The procedure leaves the binary value in the
          MOV BIN, AL                 ; parameter word, which is popped into AX; AL is
          MOV AH, 4CH                 ; stored in BIN and the program returns to the
          INT 21H                     ; operating system.
; PROCEDURE : BCD_BINARY
BCD_BINARY PROC NEAR                  ; The procedure converts the BCD value passed on
; Store the registers                 ; the stack to a binary value. But first, the
          PUSHF                       ; flags register and all the registers used by
          PUSH AX                     ; the procedure are pushed onto the stack.
          PUSH BX
          PUSH CX
          PUSH BP
          MOV BP, SP                  ; Next, the value of the stack top is moved to BP.
          MOV AX, [BP+12]             ; The stack location [BP+12] contains the BCD value: AL=25h
          MOV BL, AL                  ; BL=AL=25h
          AND BL, 0Fh                 ; BL= 25h AND 0Fh = 05h
          AND AL, 0F0H                ; AL= 25h AND F0h = 20h
          MOV CL, 04                  ; CL=04
          ROR AL, CL                  ; AL= 02h
          MOV BH, 0Ah                 ; BH=0Ah (or 10)
          MUL BH                      ; AX= 02×10 = 0014h
          ADD AL, BL                  ; AL=14h+05h=19h
          MOV [BP+12], AX             ; Move this binary value to stack
; Restore flags and registers
          POP BP                      ; Restore all the registers to their original
          POP CX                      ; content; the restore is in reverse order of
          POP BX                      ; storage.
          POP AX
          POPF
          RET                         ; Return to the calling program
BCD_BINARY ENDP                       ; End of procedure
CODE_SEG  ENDS                        ; End of code segment
          END START                   ; End of the file
Discussion:
The parameter is pushed on the stack before the procedure call. The procedure call
causes the current instruction pointer to be pushed onto the stack. In the procedure,
the flags, AX, BX, CX and BP registers are also pushed, in that order.
The instruction MOV BP, SP transfers the contents of SP to the BP register. Now
BP is used to access any location in the stack, by adding an appropriate offset to it.
For example, the MOV AX, [BP + 12] instruction transfers the word beginning at the
12th byte from the top of the stack to the AX register. It does not change the
contents of the BP register or the top of the stack. Since BP contains the SP value,
which is 0082h, BP+12 would be 0082h + 000Ch = 008Eh. This address contains the value
of AX, which was pushed prior to the call to the procedure. Please recall that this
pushed value was the BCD value, which is to be converted (this is the parameter
value). Thus, this instruction copies the BCD parameter value at offset 008Eh into
the AX register in the procedure. This instruction is not equivalent to a POP
instruction, as it does not remove the value from the stack.
Stacks are useful for writing procedures for multi-user system programs or recursive
procedures. It is a good practice to make a stack diagram while using procedure
calls through stacks. This helps in reducing errors in programming.
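The stack offsets quoted in the discussion can be reproduced with the Python sketch below. It assumes that SP was 0090h before the parameter was pushed, which is the value implied by the offsets 0082h and 008Eh used in the text; the function names stack_frame and bcd_byte_to_binary are hypothetical.

```python
def stack_frame(initial_sp=0x0090):
    """Return (BP, layout) after the parameter push, the CALL and the five pushes."""
    pushes = ["parameter (AX)", "return IP", "flags", "AX", "BX", "CX", "BP"]
    sp = initial_sp
    layout = {}
    for name in pushes:
        sp -= 2                 # every push moves SP down by one word
        layout[name] = sp
    bp = sp                     # MOV BP, SP
    return bp, layout


def bcd_byte_to_binary(bcd):
    """Model of the procedure body: tens digit * 10 + units digit."""
    low = bcd & 0x0F            # AND BL, 0Fh
    high = (bcd & 0xF0) >> 4    # AND AL, 0F0h then ROR AL, 4
    return (high * 0x0A + low) & 0xFF   # MUL BH then ADD AL, BL


if __name__ == "__main__":
    bp, layout = stack_frame()
    for name, offset in layout.items():
        print(f"{offset:04X}h  {name}")
    print(f"BP = {bp:04X}h, parameter at BP+12 = {bp + 12:04X}h")
    print(f"{bcd_byte_to_binary(0x25):02X}h")
```

Six words (the return IP, the flags, AX, BX, CX and BP) lie between the top of the stack and the parameter, which is why the parameter is found at BP + 12.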
Use of Identifiers
a) Access to External Identifiers: An external identifier is one that is referred to
in one module but defined in another. You can declare an identifier to be external
by including it in an EXTRN directive in the modules in which it is to be referred
to. This tells the assembler to leave the address of the variable unresolved. The
linker looks for the address of this variable in the module where it is defined to
be PUBLIC.
b) Public Identifiers: A public identifier is one that is defined within one module
of a program but potentially accessible by all of the other modules in that
program. You can declare an identifier to be public by including it in a
PUBLIC directive in the module in which it is defined.
The external procedure is named SMART_DIV. First, the calling program for
this external procedure is given below:
DATA_SEG  SEGMENT WORD PUBLIC         ; Public data segment
DIVIDEND  DW 2345h,89ABh              ; 32-bit dividend
DIVISOR   DW 5678h                    ; 16-bit divisor
MESSAGE   DB 'INVALID','$'            ; Message in case division is invalid
DATA_SEG  ENDS
STACK_SEG SEGMENT STACK
          DW 100 DUP(0)               ; Stack segment of 100 words
TOP_STACK LABEL WORD                  ; Label to stack top
STACK_SEG ENDS
Discussion on the calling program:
The linker appends all the segments having the same name and PUBLIC directive
with segment name into one segment. Their contents are pulled together into
consecutive memory locations. The statement to be noted is PUBLIC DIVISOR. It
tells the assembler and the linker that this variable can be legally accessed by other
assembly modules. The statement EXTRN SMART_DIV:FAR tells the assembler that
this module will access a label or a procedure of type FAR in some assembly module.
Please also note that the EXTRN definition is enclosed within the PROCEDURES
SEGMENT PUBLIC and PROCEDURES ENDS, to tell the assembler and linker that
the procedure SMART_DIV is located within the segment PROCEDURES and all
such PROCEDURES segments need to be combined in one. Please also note that in
case the procedure SMART_DIV encounters an error, such as division by zero, it sets
carry flag, which is checked in the calling program to put the results in the memory or
display an error message.
Let us now define the PROCEDURE module:
Directives Discussion
Statement
Input:
Dividend is 2 words input. The low word is input in AX and high word is input
in DX register
The divisor is input in CX register.
Output:
The Quotient is returned in DX:AX pair and remainder is returned in CX
register. In case, divisor is zero, then Carry Flag is set to indicate that division is
incorrect.
DATA_SEG  SEGMENT PUBLIC              ; This declaration informs the assembler that
          EXTRN DIVISOR:WORD          ; DIVISOR is a word variable and is external to
DATA_SEG  ENDS                        ; this procedure. It also indicates that DIVISOR
                                      ; can be found in the public segment DATA_SEG.
Discussion:
The procedure accesses the data item named DIVISOR, which is defined in the calling
program; therefore the statement EXTRN DIVISOR:WORD is necessary for
informing the assembler that this data name is found in some other module. The data
type is defined to be of word type. Please note that DIVISOR is enclosed in the
same segment name as that of the calling program, that is DATA_SEG, and the procedure
SMART_DIV is in a PUBLIC PROCEDURES segment.
(b) A FAR call uses one word in the stack for storing the return address.
(c) While making a call to a procedure, the nature of procedure that is NEAR
or FAR must be specified.
(e)A segment if declared PUBLIC informs the linker to append all the
segments with same name into one.
15.6 SUMMARY
This Unit presents some programs written in 8086 assembly language. The programs
cover elementary arithmetic problems, code conversion problems, the use of arrays and
jump statements in assembly, and the use of near and far procedures, highlighting the
use of the stack in procedure calls. Some of the important points presented in this
unit are:
2. Assuming that each array element is a word variable and is stored in the data
segment.
       MOV CX, COUNT        ; put the number of elements of the array in
                            ; CX register
       MOV AX, 0000h        ; zero SI and AX
       MOV SI, AX
AGAIN: ADD AX, ARRAY[SI]    ; add the elements of array in AX repeatedly
                            ; (another way of handling array)
       ADD SI, 2            ; select the next element of the array
       LOOP AGAIN           ; add all the elements of the array. It will
                            ; terminate when CX becomes zero.
       MOV TOTAL, AX        ; store the results in TOTAL.
3. Yes, because the conversion efforts are less.
4. You may use two nested loop instructions in assembly also. However, as both the
loop instructions use CX, therefore every time before we are entering inner loop
you must push CX of outer loop in the stack and reinitialize CX to the inner loop
requirements.
2. Direction flag if clear will cause REPE statement to perform in forward direction,
i.e. comparison would be from first element to last.
2.
SP . .
SP 00 00
50 50
30
00
50
55
Low
address
Original after (a) after (b)
(c) The return for FIRST can occur only after return of SECOND. Therefore, the
stack will be back in Original state.
UNIT 16 ADVANCED ARCHITECTURES
Structure
16.0 Introduction
16.1 Objectives
16.2 Need of Advanced Architectures and Parallel Processing
16.3 Parallelism in Uni-Processor Systems
16.3.1 Arithmetic Pipeline
16.3.2 Instruction Pipeline
16.4 Parallelism through Hardware and Software
16.4.1 Vector Processing
16.4.2 Array Processing
16.5 Multiprocessors
16.5.1 Characteristics of Multiprocessors
16.5.2 Interconnection Structures
16.5.3 Inter-processor Arbitration
16.6 Inter-Processor Communication and Synchronization
16.7 Cache Coherence
16.8 Multi-core Processors
16.9 Summary
16.10 Solutions/Answers
16.0 INTRODUCTION
The previous Units of this course discussed the basic architecture of a
computer system, including the assembly language. Architecture design is the first
step in the life cycle of a processor. It is a crucial step, as the performance of a
processor depends largely on the design chosen. For instance, if you choose a
non-pipelined architecture for your processor, you are compromising its performance
for simplicity. On the other hand, if you choose an architecture involving multiple cores,
you can increase your processor performance considerably, but at the same time its
complexity increases drastically. So, choosing an architecture that suits the application
is a must. For that, you should understand the various types of architectures
and their implementations in detail.
In this Unit, we will discuss some advanced architectures and design methodologies
used to achieve higher performance of a computer system.
16.1 OBJECTIVES
After going through this Unit, you will be able to:
16.2 NEED OF ADVANCED ARCHITECTURES AND PARALLEL PROCESSING
The fast-paced IT industry demands high-end architectural designs to support its
high-performance applications. To meet these needs, computer architectures are
constantly going through numerous changes. One such change is parallel processing,
which is used to achieve higher speeds.
Parallel processing includes a set of techniques that can be used to increase the
processing speed and reduce the latency, that is, the delay in obtaining a result, of a
computational system. These techniques enable simultaneous processing of data as
opposed to the conventional sequential processing. Due to parallel execution of
instructions, there is a significant increase in throughput. Throughput is a measure
of computation and is defined as the number of instructions that can be executed
by a given processor in a given interval of time. An increase in throughput implies an
increase in the speed of operation. It must be noted that the purpose of parallel
processing is to increase the performance of a system. The techniques used to
introduce parallel processing in a system vary widely in methodology and in the
resources used. However, the aim of all these techniques is to process data concurrently.
For example, in a processor, one instruction can be fetched from memory while another
instruction is executed by the ALU in the same clock cycle. Another example is
a system using two processors for parallel execution of instructions. In the former
example, the processor uses pipelining to speed up the execution of
instructions. We will discuss pipelining in detail in the following sections. In the latter
example, two processors are used to achieve parallel processing. Here, the performance
of the system increases at the expense of increased complexity and cost.
16.3 PARALLELISM IN UNI-PROCESSOR SYSTEMS
Parallel execution of several programs can be done on a single processor system. Such
parallelism can be implemented using hardware as well as software. In this unit,
we will focus only on the hardware-related parallelism in a uni-processor system.
As far as hardware-based parallelism is concerned, several techniques are used on a
uni-processor system to increase the throughput of instruction execution. Some of
these techniques are:
o Several processors contain multiple units to perform various arithmetic
functions, such as the multiple adder circuits designed in Block 1. These
units speed up the execution of various functions that can be done in parallel.
o Using memory organizations, such as interleaved memory, to speed up
the data access operations. In addition, the processor operations can be
overlapped with memory operations, as is done, for example, in the 8086 microprocessor.
o Use of pipelining in the processor, which is explained in detail in this
section.
The concept of pipelining is one of the major aspects of parallelism in uni-processors.
Let us first answer the question “What is Pipelining?”.
Pipelining is a technique in which we divide the whole task into subtasks and execute
them concurrently. Each subtask is processed in a different segment of the processor.
These segments are interconnected in such a way that the result of one
segment is passed to the next segment. The output is obtained after the data has been
processed through all the segments. The characteristic feature of pipelining is that
several processes can be running in different segments at the same time. A typical
example of pipelining is an assembly line in industries, such as automobile
manufacturing, which fabricates an automobile in a step-by-step manner in different
segments. In the following sub-sections, the concepts of arithmetic and instruction
pipelines are explained.
Figure 16.1: An arithmetic pipeline organization with intermediate registers for addition
or subtraction of two floating point numbers.
Fetch: In the fetch stage, the instruction is fetched from instruction memory and
stored in Instruction Register (IR).
Decode: In this stage the instruction is decoded for information such as the operation
to be performed, the source operand address, the destination operand address, etc. The
source operand and destination operand may be stored in temporary registers for further
processing.
Execute: In this stage the ALU performs the operation specified by the instruction on
the operands in the temporary registers.
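The three stages described above can be sketched in a few lines of Python. This is only an illustrative toy: the tuple layout of an instruction, the register names R1 to R4 and the two opcodes are assumptions made for the example and do not correspond to the encoding of any real processor.

```python
import operator

# Hypothetical ALU operations supported by this toy machine.
ALU_OPS = {"ADD": operator.add, "SUB": operator.sub}


def fetch(instruction_memory, pc):
    """Fetch stage: copy the instruction at address pc into the IR."""
    return instruction_memory[pc]


def decode(ir, registers):
    """Decode stage: extract the opcode, read the source operands into
    temporary values and note the destination register."""
    opcode, src1, src2, dest = ir
    return opcode, registers[src1], registers[src2], dest


def execute(opcode, temp1, temp2):
    """Execute stage: the ALU applies the operation to the temporaries."""
    return ALU_OPS[opcode](temp1, temp2)


def run(instruction_memory, registers):
    """Pass every instruction through the three stages, one at a time
    (no overlap between instructions, i.e. a non-pipelined execution)."""
    for pc in range(len(instruction_memory)):
        ir = fetch(instruction_memory, pc)
        opcode, temp1, temp2, dest = decode(ir, registers)
        registers[dest] = execute(opcode, temp1, temp2)
    return registers


# (opcode, source 1, source 2, destination): an assumed layout.
program = [("ADD", "R1", "R2", "R3"), ("SUB", "R3", "R1", "R4")]
final_registers = run(program, {"R1": 5, "R2": 7, "R3": 0, "R4": 0})
print(final_registers)
```

Here each instruction finishes all three stages before the next one starts; a pipelined processor overlaps these stages for consecutive instructions.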
In the pipelined processor described above, when the first instruction is in the decode
stage, the second instruction enters the fetch stage. When the first instruction enters
the execute stage, the second instruction enters the decode stage and the third
instruction enters the fetch stage. The process continues until all the instructions have
passed through the execute stage. Assuming that each stage takes one clock cycle,
this processor will take (k+n-1) clock cycles to execute n instructions in a k-stage
pipeline. In the present case, k = 3, as we have used a three-stage instruction pipeline.
Thus, using this instruction pipeline, n instructions would, under ideal conditions, be
executed in n+2 clock cycles.
If you are executing 100 instructions, then this processor will take 102 clock cycles to
execute all of them. A non-pipelined processor would have taken n×k = 300 clock
cycles to complete the same task.
We could see that there is a speedup ratio
$$S = \frac{(n \times k)\,t}{(k + n - 1)\,t}$$
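The cycle counts and the speedup ratio above can be checked with a short calculation. The sketch below assumes, as the text does, that every stage takes exactly one clock cycle, and it takes t to be the duration of one clock cycle (an assumption), so that t cancels out of the ratio. The function names are illustrative.

```python
def pipelined_cycles(n, k):
    """Clock cycles needed for n instructions in an ideal k-stage pipeline."""
    return k + n - 1


def non_pipelined_cycles(n, k):
    """Clock cycles needed when each instruction uses all k stages alone."""
    return n * k


def speedup(n, k):
    """Speedup ratio S; the cycle time t cancels out of the fraction."""
    return non_pipelined_cycles(n, k) / pipelined_cycles(n, k)


print(pipelined_cycles(100, 3))      # 102
print(non_pipelined_cycles(100, 3))  # 300
print(round(speedup(100, 3), 2))     # 2.94
```

For a very large number of instructions the ratio approaches k, the number of stages, which is the upper limit of the speedup in this ideal model.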
Clock cycle     1   2   3   4   5   6   7
Instruction 1   FE  DE  EX
Instruction 2       FE  DE  EX
Instruction 3           FE  DE  EX
Instruction 4               FE  DE  EX
Instruction 5                   FE  DE  EX
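The staggered pattern of the diagram above can be generated programmatically. The following sketch assumes an ideal pipeline in which instruction i enters the first stage in clock cycle i; idle slots are shown as "--". The function name and the stage labels list are illustrative.

```python
STAGES = ["FE", "DE", "EX"]


def space_time_diagram(n, stages=STAGES):
    """Return one row per instruction; column c holds the stage that the
    instruction occupies in clock cycle c + 1, or "--" if it is idle."""
    k = len(stages)
    total_cycles = k + n - 1
    rows = []
    for i in range(n):
        row = ["--"] * total_cycles
        # Instruction i (0-based) occupies stage s in cycle i + s.
        for s, name in enumerate(stages):
            row[i + s] = name
        rows.append(row)
    return rows


diagram = space_time_diagram(5)
for number, row in enumerate(diagram, start=1):
    print(f"Instruction {number}: " + " ".join(row))
```

Reading a single column of the result shows which instructions are active in that clock cycle; in cycle 3, for example, three different instructions occupy the EX, DE and FE stages simultaneously.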
Although the computed speedup due to pipelining looks very promising,
pipelined execution suffers from several problems. The first problem is due to the
limited resources of the processing unit. For example, the system bus is one of
the resources required by several units. The second problem is related to data
dependencies among consecutive instructions; for example, an instruction may
produce a result that is required by the immediately following instruction. This may
sometimes delay the execution of that next instruction. Finally, in
general, the decision to take a jump in a conditional jump instruction is known only
when the EX phase of that instruction is performed. This may result in emptying the
pipeline. For example, consider that instruction 1 is a conditional jump instruction, which
causes a jump to instruction 4 in case the condition is TRUE. The pipeline will execute
as shown in the following diagram, assuming that the condition is evaluated to be
TRUE.
Clock cycle                      1   2   3   4   5   6   7
Conditional Jump Instruction 1   FE  DE  EX
Instruction 2                        FE  DE  -
Instruction 3                            FE  -   -
Instruction 4                                FE  DE  EX
Instruction 5                                    FE  DE  EX
Please note that instruction 2 and instruction 3 are not to be executed, yet they
will still be fetched. There are a number of methods, such as branch prediction, which
can be used to handle such problems. A detailed discussion of these problems is beyond
the scope of this Unit.
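The cost of the taken jump in the diagram above can be estimated with a small model. The sketch below is a simplification under stated assumptions: instruction i is fetched in clock cycle i, there is a single forward jump in the program, and its outcome is known only at the end of its last stage, so the target is fetched in the following cycle. The function name and parameters are illustrative.

```python
def simulate_taken_jump(program_length, jump_at, jump_target, k=3):
    """Model one taken forward jump in a k-stage pipeline.

    Returns (flushed_instructions, total_cycles).
    """
    jump_fetch_cycle = jump_at  # assumption: instruction i is fetched in cycle i
    # Instructions fetched while the jump is still in the pipeline are discarded.
    flushed = [i for i in range(jump_at + 1, jump_at + k) if i <= program_length]
    # The target is fetched in the cycle after the jump completes its last stage.
    target_fetch_cycle = jump_fetch_cycle + k
    remaining = program_length - jump_target + 1  # target up to last instruction
    last_fetch_cycle = target_fetch_cycle + remaining - 1
    total_cycles = last_fetch_cycle + (k - 1)
    return flushed, total_cycles


flushed, total_cycles = simulate_taken_jump(program_length=5, jump_at=1, jump_target=4)
print(flushed, total_cycles)  # [2, 3] 7
```

In this model the three useful instructions (1, 4 and 5) need 7 clock cycles, whereas three instructions without a jump would need k+n-1 = 5; the difference of k-1 = 2 cycles is the penalty for emptying the pipeline.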
iv) Modern day processors rely mostly on sequential data processing to improve
their performance
16.4 PARALLELISM THROUGH HARDWARE AND SOFTWARE
As we know, parallelism is the concept of processing multiple tasks concurrently. If
this parallelism is achieved through hardware, it is called hardware parallelism.
Similarly, if parallelism is achieved through software, it is called software
parallelism.
Hardware Parallelism:
For example, using multiple adders can speed up a system that initially
worked with a single adder. However, the extra adders add to the cost of
the system. Depending on the application, you should choose
between cost and performance.
Software Parallelism:
In vector processing, vectors are subtracted in a single step. This procedure saves (n-1)
clock cycles, which makes a huge difference when processing large arrays of data.
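The saving of (n-1) clock cycles can be illustrated by counting steps. The sketch below is only a model: the scalar version counts one step per element, while the vector version counts a single step for the whole operation. Python itself still loops internally in both functions; the step counter stands in for the clock cycles of an idealized vector unit and is an assumption of this example.

```python
def scalar_subtract(a, b):
    """Element-by-element subtraction: one step per pair of elements."""
    result, steps = [], 0
    for x, y in zip(a, b):
        result.append(x - y)
        steps += 1
    return result, steps


def vector_subtract(a, b):
    """Model of a vector unit: all pairs are subtracted in one step."""
    return [x - y for x, y in zip(a, b)], 1


a = [10, 20, 30, 40]
b = [1, 2, 3, 4]
scalar_result, scalar_steps = scalar_subtract(a, b)
vector_result, vector_steps = vector_subtract(a, b)
print(scalar_result == vector_result)      # True
print(scalar_steps - vector_steps)         # 3, that is n - 1 for n = 4
```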
Vector instructions and scalar instructions are specified in a different way. For
example, a typical scalar instruction having three operands can be defined as follows:
Scalar instruction format:
Opcode | Address of source 1 | Address of source 2 | Address of destination
A typical vector instruction, in contrast, must specify the base addresses of
the operands and the length of the vector, which gives the number of elements in
those vectors. A typical vector instruction can be represented as follows:
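As a rough illustration of the difference between the two formats, the sketch below models both kinds of instruction as Python data classes and executes them on a small simulated memory. The field names, the opcodes "SUB" and "VSUB" and the ordering of the vector fields are hypothetical; they only follow the description in the text (base addresses of the operands plus a vector length) and are not the format of any particular machine.

```python
from dataclasses import dataclass


@dataclass
class ScalarInstruction:
    opcode: str
    src1: int   # address of source 1
    src2: int   # address of source 2
    dest: int   # address of destination


@dataclass
class VectorInstruction:
    opcode: str
    base_src1: int  # base address of source vector 1
    base_src2: int  # base address of source vector 2
    base_dest: int  # base address of destination vector
    length: int     # number of elements in each vector


def execute_scalar(instr, memory):
    """Subtract one pair of memory operands."""
    if instr.opcode != "SUB":
        raise ValueError("only SUB is modelled in this sketch")
    memory[instr.dest] = memory[instr.src1] - memory[instr.src2]


def execute_vector(instr, memory):
    """Subtract `length` consecutive pairs starting at the base addresses."""
    if instr.opcode != "VSUB":
        raise ValueError("only VSUB is modelled in this sketch")
    for offset in range(instr.length):
        memory[instr.base_dest + offset] = (
            memory[instr.base_src1 + offset] - memory[instr.base_src2 + offset]
        )


memory = [0] * 16
memory[0:4] = [10, 20, 30, 40]   # source vector 1 at addresses 0..3
memory[4:8] = [1, 2, 3, 4]       # source vector 2 at addresses 4..7
execute_vector(VectorInstruction("VSUB", 0, 4, 8, 4), memory)
execute_scalar(ScalarInstruction("SUB", 0, 4, 12), memory)
print(memory[8:12], memory[12])
```

One vector instruction here does the work of four scalar instructions, because the length field tells the processor how many consecutive elements to process.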
ii) SIMD array processors contain a single processing unit and multiple control
2) What are the components present in Processor Unit in an SIMD Array Processor?
A multiprocessor is a tightly coupled system with more than one CPU sharing the same
memory and input-output resources. Some of the major characteristics of a
multiprocessor are discussed below:
I/O
These components should communicate with each other for proper execution of
programs. They communicate among themselves through interconnect paths; the paths
that connect these modules or components are called
interconnection structures.
Let us discuss various interconnection structures in detail.
Multi-port memory Organization
The multiport memory organization is illustrated in Figure 16.5. The multiple
processors share the memory modules (MM1, MM2, MM3, MM4) through bus
interconnections, which serve as the path for communication
among them. In this interconnection structure every CPU can communicate with every
memory module. The advantage of this kind of connection is that, if every CPU
is trying to access a different memory module, all of them are allowed access in parallel.
However, there can be a conflict if two processors try to access the same memory
module at the same time. A detailed discussion on this topic is beyond the scope of
this Unit.
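The parallel access and the conflict case described above can be sketched as follows. The conflict rule used here, a fixed priority in which the lower-numbered CPU wins, is an assumption made for illustration; the text does not state how conflicts are resolved.

```python
def arbitrate(requests):
    """Decide which CPUs may access their memory module in this cycle.

    requests maps a CPU number to the name of the module it wants.
    Assumption: on a conflict the lower-numbered CPU has priority.
    Returns (granted, waiting).
    """
    granted, waiting = {}, []
    busy_modules = set()
    for cpu in sorted(requests):
        module = requests[cpu]
        if module in busy_modules:
            waiting.append(cpu)      # module already taken in this cycle
        else:
            busy_modules.add(module)
            granted[cpu] = module
    return granted, waiting


# Every CPU wants a different module: all accesses proceed in parallel.
print(arbitrate({1: "MM1", 2: "MM2", 3: "MM3", 4: "MM4"}))
# CPU 1 and CPU 2 both want MM1: CPU 2 has to wait.
print(arbitrate({1: "MM1", 2: "MM1", 3: "MM3"}))
```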
In the above figure, the serial arbitration procedure is illustrated. The arbiters share
a common bus busy line, which is synchronized to maintain synchronous
communication. The input (l) is transmitted serially through the arbiters when the
bus busy line enables it. The transmission of the input through the arbiters may
depend on the occurrence of a clock edge on the bus busy line.
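A common form of serial arbitration is the daisy chain, in which a priority signal enters the first arbiter and is passed along only by arbiters that are not requesting the bus. The sketch below models that generic scheme. It is an assumption that the figure follows this scheme, and the signal names priority_in and priority_out are introduced here for illustration only.

```python
def daisy_chain(requests, bus_busy=False):
    """Model one round of daisy-chain (serial) bus arbitration.

    requests is a list of booleans ordered from the highest-priority
    arbiter to the lowest. Returns (winner_index, priority_out_signals);
    winner_index is None when nobody is granted the bus.
    """
    priority_in = 1          # the first arbiter always receives a 1
    winner = None
    priority_out_signals = []
    for index, wants_bus in enumerate(requests):
        if priority_in == 1 and wants_bus:
            winner = index
        # An arbiter passes the 1 onward only if it received it and
        # is not itself requesting the bus.
        priority_out = 1 if (priority_in == 1 and not wants_bus) else 0
        priority_out_signals.append(priority_out)
        priority_in = priority_out
    if bus_busy:
        winner = None        # the bus cannot be taken while it is in use
    return winner, priority_out_signals


print(daisy_chain([False, True, True, False]))   # (1, [1, 0, 0, 0])
```

Arbiter 1 wins because it is the first requester in the chain; arbiter 2 also requested the bus but never received the priority signal.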
o TCP/IP socket communication (named, dynamic; loopback interface or
network interface)
o D-Bus, an IPC mechanism offering one-to-many broadcast and subscription
facilities between processes
o Shared memory
Between Processes
The above figure illustrates inter-processor communication in two possible ways.
In the first, two processes A and B try to pass messages to the
message queue. The algorithm used decides which process
gets to pass its message first.
The second figure shows the two processes A and B trying to use the common
memory; here the algorithm designed decides how the
processes share the memory.
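Both styles of communication can be tried out in a few lines. The sketch below uses Python threads as stand-ins for the processes A and B, which is a simplification; the queue from the standard library plays the role of the message queue, and a dictionary protected by a lock plays the role of the shared memory. The names sender, incrementer and shared are illustrative.

```python
import queue
import threading

# --- Message passing: A and B put messages on a common queue ---
message_queue = queue.Queue()


def sender(name, count):
    for i in range(count):
        message_queue.put(f"{name}-{i}")


# --- Shared memory: two workers update the same counter ---
shared = {"counter": 0}
lock = threading.Lock()


def incrementer(times):
    for _ in range(times):
        with lock:               # only one worker may update at a time
            shared["counter"] += 1


threads = [
    threading.Thread(target=sender, args=("A", 3)),
    threading.Thread(target=sender, args=("B", 3)),
    threading.Thread(target=incrementer, args=(1000,)),
    threading.Thread(target=incrementer, args=(1000,)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

received = []
while not message_queue.empty():
    received.append(message_queue.get())

print(sorted(received))
print(shared["counter"])
```

The order in which messages from A and B arrive in the queue depends on scheduling, which is why the example sorts them before printing; the lock ensures that no update to the shared counter is lost.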
A deadlock condition occurs when two processes each need multiple shared resources
at the same time in order to proceed, and each holds a resource that the other is
waiting for.
To understand this condition, let us consider a scenario in which Thread X is
expecting data from Thread Y for further processing and Thread Y is expecting data
from Thread X for further processing. Neither thread can proceed. This scenario is
called a deadlock, and no further processing can be done by the two threads.
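The circular wait between Thread X and Thread Y can be detected by following the chain of "is waiting for" relations and checking whether it leads back to the starting thread. The sketch below is a minimal illustration of that idea; the dictionary representation and the function name are assumptions of this example, not a description of how any particular operating system detects deadlocks.

```python
def has_deadlock(waits_for):
    """Return True if some thread is (indirectly) waiting for itself.

    waits_for maps each thread name to the thread it is waiting on,
    or to None if it is not waiting for anyone.
    """
    for start in waits_for:
        seen = set()
        current = start
        # Follow the chain of waits until it ends or repeats.
        while current is not None and current not in seen:
            seen.add(current)
            current = waits_for.get(current)
        if current == start:
            return True          # the chain came back to where it began
    return False


print(has_deadlock({"X": "Y", "Y": "X"}))    # True: X and Y wait on each other
print(has_deadlock({"X": "Y", "Y": None}))   # False: Y can proceed, then X
```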
The operating system provides synchronization and communication between processes
sharing resources and prevents them from facing potential inter-process
communication problems.
[Figure 16.9: two client processors connected to a shared memory]
In Figure 16.9, there are two client processors, each having its own local cache
memory; the caches are initially coherent. Next, one of the client processors updates its
cache, which in turn changes the data in the shared memory. The other client
processor should also update its cache memory to reflect the change made by the first
client processor. This is how cache coherence works. However, what happens if the
other processor is not able to update its cache? In this case, cache coherence is not
present. The update made by one processor is reflected in its own
cache and in the shared memory, but the other client processor has no clue about it.
There will be a data conflict between these client processors. To overcome such conflicts
and synchronize data between multiple caches, cache coherence protocols are used.
A detailed discussion of these protocols is beyond the scope of this Unit.
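The stale-data problem described above can be reproduced with a small model of two caches in front of a shared memory. The sketch below is a deliberate simplification: it assumes write-through caches and a simple invalidate-on-write rule, which is only one possible way of keeping caches coherent and is not a protocol named in the text. The class and function names are illustrative.

```python
class SharedMemory:
    def __init__(self):
        self.data = {}


class Cache:
    def __init__(self, memory, coherent):
        self.memory = memory
        self.coherent = coherent   # whether writes invalidate peer copies
        self.lines = {}            # locally cached address -> value
        self.peers = []

    def read(self, address):
        if address not in self.lines:            # miss: load from memory
            self.lines[address] = self.memory.data[address]
        return self.lines[address]

    def write(self, address, value):
        self.lines[address] = value
        self.memory.data[address] = value        # write-through (assumed)
        if self.coherent:
            for peer in self.peers:
                peer.lines.pop(address, None)    # invalidate stale copies


def demo(coherent):
    """Return what the second processor reads after the first one writes."""
    memory = SharedMemory()
    memory.data[0x10] = 1
    first, second = Cache(memory, coherent), Cache(memory, coherent)
    first.peers, second.peers = [second], [first]
    first.read(0x10)
    second.read(0x10)          # both caches now hold the value 1
    first.write(0x10, 2)       # first processor updates the data
    return second.read(0x10)


print(demo(coherent=True))    # 2: the stale copy was invalidated
print(demo(coherent=False))   # 1: the second cache still holds old data
```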
Figure 16.10: Multi-core Processor block diagram
16.9 SUMMARY
This Unit provides a basic introduction to the concepts of pipelining and
multiprocessors. The two pipelining techniques discussed in this Unit are the arithmetic
pipeline and the instruction pipeline. Various parallelism techniques, such as vector and
array processing, are also discussed in this unit, and further topics like multiprocessors
are demonstrated with diagrams and explained in brief.
The information given on various topics, such as the arithmetic and
instruction pipelines, is introductory and can be supplemented with additional reading.
In fact, a course in any area of computing must be supplemented by further reading to
keep your knowledge up to date, as the computer world is changing by leaps and
bounds. In addition to further readings, the student is advised to study several Indian
journals on computers to enhance her/his knowledge.
16.10 SOLUTIONS/ANSWERS
1.
i) False
ii) True
iii) False
iv) False
v) True
3.
o In high-speed Computers
o For doing floating point arithmetic
o For processing data of scientific problems
1.
i) True
ii) False
iii) False
iv) True
o ALU
o Floating Point Unit
o Registers
3. Vector processing is used in following fields:
o Weather forecasting
o Artificial Intelligence (AI)
o Image Processing
o Healthcare and diagnosis
3. These days, due to ULSI technologies, most processors are multi-core
processors. The latest Intel processors, AMD processors, processors of mobile devices,
etc. all have multiple cores.