LectureNotes 11
MULTIPLICATION IN ASSEMBLY LANGUAGE
In the multiplication algorithm discussed above we reviewed the way we multiplied
numbers in lower classes, and gave an example of that method on binary numbers. We
make a simple modification to the traditional algorithm before we proceed to formulate
it in assembly language.
In the traditional algorithm we calculate all intermediate answers and then sum them
to get the final answer. If we instead add each intermediate answer to a running result
as soon as it is calculated, the result will be the same in the end, except that we do not
have to remember a lot of intermediate answers during the whole multiplication. The
multiplication with the new algorithm is shown below.
We try to identify the steps of our algorithm. First we set the result to zero. Then we
check the right most bit of the multiplier. If it is one we add the multiplicand to the
result, and if it is zero we perform no addition. We left shift the multiplicand before the
next bit of the multiplier is tested. The left shifting of the multiplicand is performed
regardless of the value of the multiplier’s right most bit, just like the crosses in
traditional multiplication are always placed to mark the ones, tens, hundreds, etc.
places. Then we check the next bit and if it is one we add the shifted value of the
multiplicand to the result. We repeat for as many digits as there are in the multiplier,
4 in our example. Formulating the steps of the algorithm we get:
• Shift the multiplier to the right.
• If CF=1 add the multiplicand to the result.
• Shift the multiplicand to the left.
• Repeat the algorithm 4 times.
For an 8bit multiplication the algorithm will be repeated 8 times and for a sixteen bit
multiplication it will be repeated 16 times, that is, once for every bit of the multiplier.
The algorithm uses the fact that shifting right forces the right most bit to drop into the
carry flag. If we test the carry flag using JC or JNC we are effectively testing the right
most bit of the multiplier. Another shift will cause the next bit to drop in the next
iteration and so on. So our task of checking bits one by one is satisfied using the shift
operation. There are many other methods to do this bit testing as well, however we
exemplify only one of the methods in this example.
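As one illustration of another way to test the bit, the sketch below uses the TEST instruction to check bit 0 of the multiplier without changing it. This is only a variation on the same steps under our own assumptions: the labels, the register choices and the sample values 13 and 5 are picked for this sketch, and a 4bit multiplier is assumed.

```nasm
; 4bit multiplication, testing the multiplier bit with TEST
; instead of dropping it into the carry flag (illustrative sketch)
[org 0x100]
              jmp  start

multiplicand: db   13                   ; 4bit multiplicand (8bit space)
multiplier:   db   5                    ; 4bit multiplier
result:       db   0                    ; 8bit result

start:        mov  cl, 4                ; four multiplier bits to examine
              mov  bl, [multiplicand]   ; load multiplicand in bl
              mov  dl, [multiplier]     ; load multiplier in dl

checkbit:     test dl, 1                ; AND dl with 1, set flags, dl unchanged
              jz   skip                 ; right most bit is zero, no addition

              add  [result], bl         ; accumulate result

skip:         shl  bl, 1                ; shift multiplicand left
              shr  dl, 1                ; bring next multiplier bit to position 0
              dec  cl                   ; decrement bit count
              jnz  checkbit             ; repeat if bits left

              mov  ax, 0x4c00           ; terminate program
              int  0x21
```

Because TEST does the checking here, the right shift of the multiplier no longer has to come before the addition; it only has to happen before the next bit is tested. With the sample values the result byte ends up holding 65.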
In the first iteration there is no shifting just like there is no cross in traditional
multiplication in the first pass. Therefore we placed the left shifting of the multiplicand
after the addition step. However the right shifting of multiplier must be before the
addition as the addition step’s execution depends upon its result.
We introduce an assembly language program to perform this 4bit multiplication. The
algorithm is extensible to more bits but there are a few complications, which are left to
be discussed later. For now we do a 4bit multiplication to keep the algorithm simple.
Example 4.1
; 4bit multiplication algorithm
[org 0x100]
              jmp  start

multiplicand: db   13                   ; 4bit multiplicand (8bit space)
multiplier:   db   5                    ; 4bit multiplier
result:       db   0                    ; 8bit result

start:        mov  cl, 4                ; initialize bit count to four
              mov  bl, [multiplicand]   ; load multiplicand in bl
              mov  dl, [multiplier]     ; load multiplier in dl

checkbit:     shr  dl, 1                ; move right most bit in carry
              jnc  skip                 ; skip addition if bit is zero

              add  [result], bl         ; accumulate result

skip:         shl  bl, 1                ; shift multiplicand left
              dec  cl                   ; decrement bit count
              jnz  checkbit             ; repeat if bits left

              mov  ax, 0x4c00           ; terminate program
              int  0x21
Inside the debugger observe the working of the SHR and SHL instructions. The SHR
instruction is effectively dividing its operand by two and the remainder is stored in the
carry flag from where we test it. The SHL instruction is multiplying its operand by two
so that it is added at one place more towards the left in the result.
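To make the debugger session easier to follow, the trace below lists what we expect to see in each pass of Example 4.1 for the values 13 and 5 used there. It is worked out by hand from the listing and written as comments, so it is a reading aid rather than part of the program.

```nasm
; Trace of Example 4.1 with multiplicand = 13 and multiplier = 5
; (each row shows one pass through the loop)
;
; pass   shr dl, 1       CF   add [result], bl   result   shl bl, 1
;  1     0101 -> 0010    1    0 + 13             13       13 -> 26
;  2     0010 -> 0001    0    skipped            13       26 -> 52
;  3     0001 -> 0000    1    13 + 52            65       52 -> 104
;  4     0000 -> 0000    0    skipped            65       104 -> 208
;
; final result = 65 = 0x41 = 13 * 5
```

The additions happen exactly in the passes where a one dropped into the carry flag, and the multiplicand doubles in every pass regardless.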
Extended Shifting
Using our basic shifting and rotation instructions we can effectively shift a 32bit
number in memory word by word. We cannot shift the whole number at once since our
architecture is limited to word operations. The algorithm we use consists of just two
instructions and we name it extended shifting.
num1: dd 40000
shl word [num1], 1
rcl word [num1+2], 1
The DD directive reserves a 32bit space in memory, however the value we placed
there will fit in 16bits. So we can safely shift the number left 16 times. The least
significant word is accessible at num1 and the most significant word is accessible at
num1+2.
The two instructions are carefully crafted such that the first one shifts the lower word
towards the left and the most significant bit of that word is dropped in carry. With the
next instruction we push that dropped bit into the least significant bit of the next word
effectively joining the two 16bit words. The final carry after the second instruction will
be the most significant bit of the higher word, which for this number will always be
zero.
The following illustration will clarify the concept. The pipe on the right contains the
lower half and the pipe on the left contains the upper half. The first instruction forced a
zero from the right into the lower half and the left most bit is saved in carry, and from
there it is pushed into the upper half and the upper half is shifted as well.
Step 1 (lower half, SHL):   C <- 1 0 1 1 0 1 0 0 <- 0
Step 2 (upper half, RCL):   C <- 1 0 1 1 0 1 0 0 <- C
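As a worked instance of the two instructions, the sketch below shifts the value 40000 left once and notes in the comments what each word and the carry flag should hold. The program wrapper around the two instructions is our own addition so that the fragment can be assembled and stepped through in the debugger.

```nasm
; extended left shift of a 32bit number by one bit (illustrative sketch)
[org 0x100]
              jmp  start

num1:         dd   40000                ; 0x00009C40: low word 0x9C40, high word 0x0000

start:        shl  word [num1], 1       ; low word 0x9C40 -> 0x3880, CF = 1
              rcl  word [num1+2], 1     ; high word 0x0000 -> 0x0001, CF = 0
                                        ; num1 is now 0x00013880 = 80000

              mov  ax, 0x4c00           ; terminate program
              int  0x21
```

The one that fell out of the lower word reappears as the lowest bit of the upper word, which is exactly the doubling from 40000 to 80000.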
For shifting right the exact opposite is done. However care must be taken to shift right
the upper half first and then rotate through carry right the lower half, since the bit
dropped from the upper half must enter the lower half from the left. The instructions
to do this are:
num1: dd 40000
shr word [num1+2], 1
rcr word [num1], 1
The same logic works here. The shift places the least significant bit of the upper half
in the carry flag and the rotate pushes it into the lower half from the left. For a signed
shift we would have used the shift arithmetic right instruction (SAR) instead of the shift
logical right instruction on the upper half; the lower half is still rotated with RCR.
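A minimal sketch of the signed case follows, assuming a sample value of -40000 chosen by us for illustration. Only the instruction applied to the upper half changes; the comments give the values we expect after each step.

```nasm
; extended signed right shift of a 32bit number by one bit (illustrative sketch)
[org 0x100]
              jmp  start

num1:         dd   -40000               ; 0xFFFF63C0 in two's complement

start:        sar  word [num1+2], 1     ; high word 0xFFFF -> 0xFFFF (sign kept), CF = 1
              rcr  word [num1], 1       ; low word 0x63C0 -> 0xB1E0, CF = 0
                                        ; num1 is now 0xFFFFB1E0 = -20000

              mov  ax, 0x4c00           ; terminate program
              int  0x21
```

SAR copies the sign bit back into the most significant position, so the number stays negative and is halved as expected.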
The extension we have done is not limited to 32bits. We can shift a number of any
size, say 1024 bits. The second instruction is repeated once for every additional word,
moving towards the most significant word for a left shift, and we achieve the desired
effect. Using two simple instructions we have increased the capability of the operation
to effectively an unlimited number of bits. The actual limit is the available memory, as
even the segment limit can be catered for with a little thought.
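One possible way to write the 1024 bit left shift as a loop is sketched below. Instead of one SHL followed by repeated RCLs, it clears the carry flag once and uses RCL for every word, which has the same effect; this substitution, the label names and the sample value are our own assumptions. The loop relies on the fact that INC and LOOP leave the carry flag unchanged.

```nasm
; shift a 1024bit number left by one bit, word by word (illustrative sketch)
[org 0x100]
              jmp  start

bignum:       times 64 dw 0             ; 64 words = 1024 bits, least significant word first

start:        mov  word [bignum], 0x8000 ; sample value: only bit 15 is set
              mov  si, bignum           ; si points at the least significant word
              mov  cx, 64               ; number of words to process
              clc                       ; CF = 0 so the first rcl behaves like shl

nextword:     rcl  word [si], 1         ; old CF enters bit 0, bit 15 goes to CF
              inc  si                   ; inc does not change CF
              inc  si                   ; si now points at the next word
              loop nextword             ; loop does not change any flag

                                        ; the set bit has moved to bit 0 of the second word
              mov  ax, 0x4c00           ; terminate program
              int  0x21
```

Using ADD to advance the pointer would destroy the carry between words, which is why two INC instructions are used here.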
Extended Addition and Subtraction
We also need 32bit addition for the multiplication of 16bit numbers. The idea of
extension is the same here. However we need to introduce a new instruction at this
place. The instruction is ADC or “add with carry.” Normal addition has two operands
and the second operand is added to the first operand. ADC, however, has three
operands. The third, implied operand is the carry flag. The ADC instruction is provided
specifically for extending the capability of ADD. Numbers of any size can be added
using a proper combination of ADD and ADC. All basic building blocks are provided for
the assembly language programmer, and the programmer can extend their capabilities
as much as needed by using these instructions in appropriate combinations.
Further clarifying the operation of ADC, consider an instruction “ADC AX, BX.”
Normal addition would have just added BX to AX, however ADC first adds the carry flag
to AX and then adds BX to AX. Therefore the last carry is also included in the result.
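The effect of the implied third operand can be observed with a small sketch like the one below. The values 5 and 3 are arbitrary, and STC and CLC are used only to put the carry flag in a known state before each ADC.

```nasm
; effect of the carry flag on adc (illustrative sketch)
[org 0x100]
              mov  ax, 5
              mov  bx, 3
              stc                       ; CF = 1
              adc  ax, bx               ; ax = 5 + 3 + 1 = 9

              mov  ax, 5
              clc                       ; CF = 0
              adc  ax, bx               ; ax = 5 + 3 + 0 = 8, same as add ax, bx

              mov  ax, 0x4c00           ; terminate program
              int  0x21
```

With the carry flag clear, ADC behaves exactly like ADD; the two differ only when a previous instruction left a carry behind.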
The algorithm should be apparent by now. The lower halves of the two numbers to be
added are first added with a normal addition. For the upper halves a normal addition
would lose track of a possible carry from the lower halves and the answer would be
wrong. If a carry was generated it should go to the upper half. Therefore the upper
halves are added with an addition with carry instruction.
Since both operands of an instruction cannot be in memory, ax is used to read the
lower and upper halves of the source one by one. The destination is directly updated in
memory. The set of instructions is as follows.
dest: dd 40000
src: dd 80000
mov ax, [src]
add word [dest], ax
mov ax, [src+2]
adc word [dest+2], ax
To extend it further, more add with carry instructions will be used, one for every
additional word. However the carry from the last addition will not be stored anywhere,
as there will always be a size limit where the results and the numbers are stored. This
carry will remain in the carry flag, where it can be tested for a possible overflow.
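The addition of 40000 and 80000 can be followed word by word in the sketch below, which wraps the same four instructions in a complete program and adds a test of the final carry. The label overflow is hypothetical, and in this sketch both paths simply terminate.

```nasm
; 32bit addition of 40000 and 80000 using add and adc (illustrative sketch)
[org 0x100]
              jmp  start

dest:         dd   40000                ; 0x00009C40
src:          dd   80000                ; 0x00013880

start:        mov  ax, [src]            ; ax = 0x3880
              add  word [dest], ax      ; 0x9C40 + 0x3880 = 0xD4C0, CF = 0
              mov  ax, [src+2]          ; ax = 0x0001 (mov does not change CF)
              adc  word [dest+2], ax    ; 0x0000 + 0x0001 + CF = 0x0001, CF = 0
              jc   overflow             ; CF = 1 here would mean the sum needs a 33rd bit

                                        ; dest is now 0x0001D4C0 = 120000
overflow:     mov  ax, 0x4c00           ; terminate program
              int  0x21
```

For these particular values the lower halves do not produce a carry, so ADC adds a zero. The MOV between ADD and ADC is safe because MOV leaves the flags untouched.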
For subtraction the same logic will be used and just like addition with carry there is
an instruction to subtract with borrow called SBB. Borrow in the name means the carry
flag and is used just for clarity. We can say that the carry flag holds the carry for
addition instructions and the borrow for subtraction instructions. For 16bit operands
the carry is generated at the 17th bit and the borrow is also taken from the 17th bit.
Moreover there is no single instruction that needs borrow and carry in their
independent meanings at the same time. Therefore it is logical to use the same flag for
both tasks.
We extend subtraction with a very similar algorithm. The lower halves must be
subtracted normally while the upper halves must be subtracted with a subtract with
borrow instruction so that if the lower halves needed a borrow, a one is subtracted from
the upper halves. The algorithm is as under.
dest: dd 40000
src: dd 80000
mov ax, [src]
sub word [dest], ax
mov ax, [src+2]
sbb word [dest+2], ax
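Working the same values through by hand gives the sketch below, where the comments show what we expect each word and the carry flag to hold. The program wrapper is our own addition for use in the debugger.

```nasm
; 32bit subtraction 40000 - 80000 using sub and sbb (illustrative sketch)
[org 0x100]
              jmp  start

dest:         dd   40000                ; 0x00009C40
src:          dd   80000                ; 0x00013880

start:        mov  ax, [src]            ; ax = 0x3880
              sub  word [dest], ax      ; 0x9C40 - 0x3880 = 0x63C0, no borrow, CF = 0
              mov  ax, [src+2]          ; ax = 0x0001 (mov does not change CF)
              sbb  word [dest+2], ax    ; 0x0000 - 0x0001 - CF = 0xFFFF, CF = 1
                                        ; dest is now 0xFFFF63C0 = -40000 in two's complement

              mov  ax, 0x4c00           ; terminate program
              int  0x21
```

The final borrow left in the carry flag tells us that, read as unsigned numbers, the source was larger than the destination.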