LectureNotes 11

assembly

Uploaded by

amirentezari.ac
Copyright © All Rights Reserved

LESSON NO. 11
MULTIPLICATION IN ASSEMBLY LANGUAGE
In the multiplication algorithm discussed above we revised the way we multiplied
numbers in lower classes, and gave an example of that method on binary numbers. We
make a simple modification to the traditional algorithm before we formulate it in
assembly language.
In the traditional algorithm we calculate all intermediate answers and then sum them
to get the final answer. If we instead add every intermediate answer to an accumulated
result as soon as it is calculated, the result will be the same in the end, but we do
not have to remember a lot of intermediate answers during the whole multiplication.
The multiplication with the new algorithm is shown below.

      1101    = 13      Accumulated Result
    x 0101    = 5
    -------             0 (Initial Value)
      1101    = 13      0 + 13 = 13
     0000x    = 0       13 + 0 = 13
    1101xx    = 52      13 + 52 = 65
   0000xxx    = 0       65 + 0 = 65 (Answer)

We now identify the steps of our algorithm. First we set the result to zero. Then we
check the rightmost bit of the multiplier. If it is one we add the multiplicand to the
result, and if it is zero we perform no addition. We left shift the multiplicand before
the next bit of the multiplier is tested. The left shifting of the multiplicand is
performed regardless of the value of the multiplier’s rightmost bit, just like the
crosses in traditional multiplication are always placed to mark the ones, tens,
hundreds, etc. places. Then we check the next bit and if it is one we add the shifted
value of the multiplicand to the result. We repeat this for as many digits as there are
in the multiplier, 4 in our example. Formulating the steps of the algorithm we get:
• Shift the multiplier to the right.
• If CF=1 add the multiplicand to the result.
• Shift the multiplicand to the left.
• Repeat the algorithm 4 times.
For an 8bit multiplication the algorithm will be repeated 8 times and for a sixteen bit
multiplication it will be repeated 16 times. The repeat count is always the size of the
multiplier in bits.
The algorithm uses the fact that shifting right forces the rightmost bit to drop into
the carry flag. If we test the carry flag using JC we are effectively testing the
rightmost bit of the multiplier. Another shift will cause the next bit to drop in the
next iteration and so on. So our task of checking bits one by one is accomplished using
the shift operation. There are many other methods to do this bit testing as well,
however we exemplify only one of them in this example.
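As one illustration of another bit testing method, the fragment below uses the TEST instruction, which the text has not introduced. It is only a sketch: the choice of al for the running result and the label skipadd are assumptions made for this fragment, not part of the lecture's program.

```nasm
; Sketch: test the rightmost multiplier bit without using the carry flag.
; Assumes dl = multiplier, bl = multiplicand, al = running result.
        test    dl, 1           ; AND dl with 1 and set flags, dl is unchanged
        jz      skipadd         ; ZF=1 means the rightmost bit is zero
        add     al, bl          ; bit was one, so accumulate
skipadd:                        ; hypothetical label
        shr     dl, 1           ; still shift so the next bit moves into place
```

The shift is still needed here, but only to bring the next bit into position. The decision itself is made on the zero flag instead of the carry flag.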
In the first iteration there is no shifting just like there is no cross in traditional
multiplication in the first pass. Therefore we placed the left shifting of the
multiplicand after the addition step. However the right shifting of the multiplier must
come before the addition, as the execution of the addition step depends upon its result.
We introduce an assembly language program to perform this 4bit multiplication. The
algorithm is extensible to more bits but there are a few complications, which are left to
be discussed later. For now we do a 4bit multiplication to keep the algorithm simple.

Example 4.1
01  ; 4bit multiplication algorithm
02  [org 0x100]
03                jmp  start
04
05  multiplicand: db   13                 ; 4bit multiplicand (8bit space)
06  multiplier:   db   5                  ; 4bit multiplier
07  result:       db   0                  ; 8bit result
08
09  start:        mov  cl, 4              ; initialize bit count to four
10                mov  bl, [multiplicand] ; load multiplicand in bl
11                mov  dl, [multiplier]   ; load multiplier in dl
12
13  checkbit:     shr  dl, 1              ; move right most bit in carry
14                jnc  skip               ; skip addition if bit is zero
15
16                add  [result], bl       ; accumulate result
17
18  skip:         shl  bl, 1              ; shift multiplicand left
19                dec  cl                 ; decrement bit count
20                jnz  checkbit           ; repeat if bits left
21
22                mov  ax, 0x4c00         ; terminate program
23                int  0x21

04-06   The numbers to be multiplied are constants for now. The multiplication is
        four bit so the answer is stored in 8bit space.

07      If the operands were 8bit the answer would be 16bit and if the operands
        were 16bit the answer would be 32bit. Since eight bits fit in a byte we
        have used 4bit multiplication as our first example.

14-16   Since adding zero changes nothing we skip the addition step if the
        rightmost bit of the multiplier is zero. If the jump is not taken the
        shifted value of the multiplicand is added to the result.

18      The multiplicand is left shifted in every iteration regardless of the
        multiplier bit.

19      DEC is a new instruction but its operation should be immediately
        understandable with the knowledge gained till now. It simply subtracts
        one from its single operand.

20      The JNZ instruction causes the algorithm to repeat as long as any bits of
        the multiplier are left.

Inside the debugger observe the working of the SHR and SHL instructions. The SHR
instruction is effectively dividing its operand by two and the remainder is stored in the
carry flag from where we test it. The SHL instruction is multiplying its operand by two
so that it is added at one place more towards the left in the result.

1.1. EXTENDED OPERATIONS


We performed a 4bit multiplication to explain the algorithm, however the real
advantage of the computer shows when we ask it to multiply large numbers, numbers
whose multiplication takes us real time. If we have an 8bit number we can do the
multiplication in word registers, but are we limited to word operations? What if we want
to multiply 32bit or even larger numbers? We are certainly not limited. Assembly
language only provides us the basic building blocks. Whether we build a plaza out of
these blocks, or a building, or a classic piece of architecture depends only upon our
imagination. With our logic we can extend these algorithms as much as we want.
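The remark about doing an 8bit multiplication in word registers can be sketched as follows. This is an assumed extension of Example 4.1, not a program from the lecture: the operand values 200 and 150 are chosen only for illustration, and the multiplicand is given 16bit space so that its eight left shifts cannot lose a bit.

```nasm
; 8bit multiplication sketch (assumed extension of Example 4.1)
[org 0x100]
                jmp  start

multiplicand:   dw   200                ; 8bit multiplicand (16bit space)
multiplier:     db   150                ; 8bit multiplier
result:         dw   0                  ; 16bit result

start:          mov  cl, 8              ; eight multiplier bits to test
                mov  bx, [multiplicand] ; load multiplicand in bx
                mov  dl, [multiplier]   ; load multiplier in dl

checkbit:       shr  dl, 1              ; rightmost multiplier bit into carry
                jnc  skip               ; skip addition if bit is zero

                add  [result], bx       ; accumulate shifted multiplicand

skip:           shl  bx, 1              ; shift multiplicand left
                dec  cl                 ; decrement bit count
                jnz  checkbit           ; repeat if bits left

                mov  ax, 0x4c00         ; terminate program
                int  0x21
```

With these operands the word at result should hold 30000 (0x7530) when the loop ends. The only changes from the 4bit program are the bit count and the use of the word register bx in place of bl.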
Our next example will be multiplication of 16bit numbers to produce a 32bit answer.
However for a 32bit answer we need a way to shift a 32bit number and a way to add
32bit numbers. We cannot depend on 16bit shifting as we have 16 significant bits in
our multiplicand and shifting any bit towards the left may drop a valuable bit causing a
totally wrong result. A valuable bit means any bit that is one. Dropping a zero bit
doesn’t make any difference. So we place the 16bit number in 32bit space with the upper
16 bits zeroed so that the sixteen shift operations don’t cause any valuable bit to drop.
Even though the numbers are 16bit we need 32bit operations to multiply correctly.
To clarify this necessity, we take the example of the number 40000 or 9C40 in
hexadecimal. In binary it is represented as 1001110001000000. To multiply by two we
shift it one place to the left. The answer we get is 0011100010000000 and the leftmost
one is dropped into the carry flag. The answer should be the 17bit number 0x13880 but it
is 0x3880, which is 14464 in decimal instead of the expected 80000. We should be
careful of this situation whenever shifting is used.
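The lost bit can be observed in the debugger with a few instructions. The sketch below is a minimal program written for this purpose, and the label done is a name chosen here.

```nasm
; Sketch: a 16bit left shift of 40000 loses its most significant one
[org 0x100]
        mov     ax, 40000           ; ax = 0x9C40
        shl     ax, 1               ; ax = 0x3880 (14464), CF = 1
        jnc     done                ; not taken: a one was shifted out
        ; execution reaches here because the 17th bit did not fit in ax
done:   mov     ax, 0x4c00          ; terminate program
        int     0x21
```

The carry flag is the only trace of the dropped bit, and the next instruction that changes the flags will overwrite it.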

Extended Shifting
Using our basic shifting and rotation instructions we can effectively shift a 32bit
number in memory word by word. We cannot shift the whole number at once since our
architecture is limited to word operations. The algorithm we use consists of just two
instructions and we name it extended shifting.
num1: dd 40000
shl word [num1], 1
rcl word [num1+2], 1
The DD directive reserves a 32bit space in memory, however the value we placed
there will fit in 16bits. So we can safely shift the number left 16 times. The least
significant word is accessible at num1 and the most significant word is accessible at
num1+2.
The two instructions are carefully crafted such that the first one shifts the lower word
towards the left and the most significant bit of that word is dropped in carry. With the
next instruction we push that dropped bit into the least significant bit of the next word
effectively joining the two 16bit words. The final carry after the second instruction will
be the most significant bit of the higher word, which for this number will always be
zero.
The following illustration will clarify the concept. The pipe on the right contains the
lower half and the pipe on the left contains the upper half. The first instruction forced a
zero from the right into the lower half and the left most bit is saved in carry, and from
there it is pushed into the upper half and the upper half is shifted as well.

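[Illustration: in Step 1 the lower half is shifted left, a zero enters from the right
and the leftmost bit goes to the carry flag C. In Step 2 the upper half is shifted left
and the bit held in C enters from the right.]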
For shifting right the exact opposite is done. However care must be taken to shift the
upper half right first and then rotate the lower half right through carry, because the
bit leaving the upper half must pass through the carry flag into the lower half. The
instructions to do this are:
num1: dd 40000
shr word [num1+2], 1
rcr word [num1], 1
The same logic works here. The shift places the least significant bit of the upper half
in the carry flag and the rotate pushes it into the lower half from the left. For a
signed shift we would have used the shift arithmetic right instruction instead of the
shift logical right instruction.
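A signed version of the extended right shift might look like the sketch below. The label num2 and the value -40000 are assumptions chosen for illustration.

```nasm
; Sketch: extended arithmetic right shift of a signed 32bit number
[org 0x100]
        jmp     start

num2:   dd -40000                   ; hypothetical signed value, 0xFFFF63C0

start:  sar     word [num2+2], 1    ; upper half first, the sign bit is kept
        rcr     word [num2], 1      ; lower half takes the bit left in carry
                                    ; num2 is now 0xFFFFB1E0, which is -20000
        mov     ax, 0x4c00          ; terminate program
        int     0x21
```

Only the upper half needs SAR, since the sign bit lives there. The lower half is still rotated through carry exactly as in the unsigned case.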
The extension we have done is not limited to 32bits. We can shift a number of any
size, say 1024 bits. The second instruction is repeated once for every additional word
to achieve the desired effect. Using two simple instructions we have increased the
capability of the operation to effectively an unlimited number of bits. The actual limit
is the available memory, as even the segment limit can be catered for with a little
thought.
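As a sketch of this extension, the program below shifts a 64bit number left by one place. The label big and the layout as four words, lowest word first, are assumptions made for this example.

```nasm
; Sketch: extended left shift of a 64bit number held as four words
[org 0x100]
        jmp     start

big:    dw 0x9C40, 0, 0, 0          ; hypothetical 64bit number 40000, lowest word first

start:  shl     word [big], 1       ; lowest word, its top bit drops into carry
        rcl     word [big+2], 1     ; each RCL takes the carry in from the right
        rcl     word [big+4], 1     ; and passes its own top bit on
        rcl     word [big+6], 1     ; highest word, the final carry is the bit lost
                                    ; big is now 0x3880, 0x0001, 0, 0 (80000)
        mov     ax, 0x4c00          ; terminate program
        int     0x21
```

Every word after the first uses the same RCL instruction, so a longer number only needs more repetitions of it.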

Extended Addition and Subtraction
We also needed 32bit addition for multiplication of 16bit numbers. The idea of
extension is the same here. However we need to introduce a new instruction at this
place. The instruction is ADC or “add with carry.” Normal addition has two operands and
the second operand is added to the first operand. However ADC has three operands. The
third, implied operand is the carry flag. The ADC instruction is provided specifically
for extending the capability of ADD. Numbers of any size can be added using a proper
combination of ADD and ADC. All basic building blocks are provided for the assembly
language programmer, and the programmer can extend their capabilities as much as
needed by using these fine instructions in appropriate combinations.
To further clarify the operation of ADC, consider the instruction “ADC AX, BX.”
Normal addition would have just added BX to AX, however ADC first adds the carry flag
to AX and then adds BX to AX. Therefore the last carry is also included in the result.
The algorithm should be apparent by now. The lower halves of the two numbers to be
added are first added with a normal addition. For the upper halves a normal addition
would lose track of a possible carry from the lower halves and the answer would be
wrong. If a carry was generated it should go to the upper half. Therefore the upper
halves are added with an addition with carry instruction.
Since both operands cannot be in memory, ax is used to read the lower and upper halves
of the source one by one. The destination is directly updated. The set of instructions
is given below.
dest: dd 40000
src: dd 80000
mov ax, [src]
add word [dest], ax
mov ax, [src+2]
adc word [dest+2], ax
To extend it further, more additions with carry will be used. However the carry from
the last addition has nowhere to go, as there will always be a size limit on where the
results and the numbers are stored. This carry remains in the carry flag, where it can
be tested for a possible overflow.
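A 64bit addition built this way could be sketched as below. The operand values are chosen so that a carry travels through two words, and the label overflow is a hypothetical place holder for whatever a real program would do on overflow.

```nasm
; Sketch: 64bit addition with one ADD and three ADCs
[org 0x100]
        jmp     start

dest:   dw 0xFFFF, 0xFFFF, 0x0001, 0    ; 0x00000001FFFFFFFF, lowest word first
src:    dw 0x0001, 0, 0, 0              ; 1

start:  mov     ax, [src]
        add     [dest], ax          ; lowest words, plain ADD
        mov     ax, [src+2]
        adc     [dest+2], ax        ; MOV leaves the carry flag untouched
        mov     ax, [src+4]
        adc     [dest+4], ax
        mov     ax, [src+6]
        adc     [dest+6], ax        ; highest words
        jc      overflow            ; a final carry means the sum needs a 65th bit

        ; dest now holds 0x0000000200000000
overflow:                           ; hypothetical label, nothing is reported here
        mov     ax, 0x4c00          ; terminate program
        int     0x21
```

The chain works because MOV does not modify any flags, so the carry produced by one addition is still intact when the next ADC executes.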
For subtraction the same logic is used, and just like addition with carry there is
an instruction to subtract with borrow called SBB. Borrow in the name means the carry
flag and is used just for clarity. We can say that the carry flag holds the carry for
addition instructions and the borrow for subtraction instructions. In a 16bit operation
the carry is generated at the 17th bit and the borrow is also taken from the 17th bit.
Moreover no single instruction needs borrow and carry in their independent meanings at
the same time. Therefore it is logical to use the same flag for both tasks.
We extend subtraction with a very similar algorithm. The lower halves must be
subtracted normally while the upper halves must be subtracted with a subtract with
borrow instruction so that if the lower halves needed a borrow, a one is subtracted from
the upper halves. The algorithm is as under.
dest: dd 40000
src: dd 80000
mov ax, [src]
sub word [dest], ax
mov ax, [src+2]
sbb word [dest+2], ax
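With extended shifting and extended addition available, the 16bit multiplication mentioned earlier can be put together. The program below is a sketch that combines the pieces shown in this section with the loop of Example 4.1. The operand values 1300 and 500 are assumptions chosen for illustration.

```nasm
; 16bit multiplication sketch using extended shift and extended add
[org 0x100]
                jmp  start

multiplicand:   dd   1300               ; 16bit multiplicand (32bit space)
multiplier:     dw   500                ; 16bit multiplier
result:         dd   0                  ; 32bit result

start:          mov  cl, 16             ; sixteen multiplier bits to test
                mov  dx, [multiplier]   ; load multiplier in dx

checkbit:       shr  dx, 1              ; rightmost multiplier bit into carry
                jnc  skip               ; skip addition if bit is zero

                mov  ax, [multiplicand]
                add  [result], ax       ; add lower halves
                mov  ax, [multiplicand+2]
                adc  [result+2], ax     ; add upper halves plus carry

skip:           shl  word [multiplicand], 1     ; extended left shift,
                rcl  word [multiplicand+2], 1   ; lower word first
                dec  cl                 ; decrement bit count
                jnz  checkbit           ; repeat if bits left

                mov  ax, 0x4c00         ; terminate program
                int  0x21
```

With these operands the double word at result should hold 650000 (0x0009EB10) when the loop ends. The multiplicand is shifted in memory because it no longer fits in one register, and DEC is placed after the shift pair so that JNZ tests the bit count and not the result of the rotation.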
