8086 Assembly Programming
What is in a Computer?
The field of Computer Architecture is about the
fundamental structure of computer systems
What are the components?
How are they interconnected?
How fast does the system operate?
What is the power consumption?
How much does it all cost?
What architecture leads to the “best” trade-offs?
The conceptual model for computer architecture that
has been in effect since 1945 is the von Neumann
architecture
Instructions?
Whenever somebody builds a CPU they first define
what instructions the CPU will know how to decode
and execute
This is called the Instruction Set Architecture (ISA)
The ISA for a Pentium is different from the ISA for a
PowerPC for instance
The ISA is described in a (lengthy) documentation
that describes everything that one can do with the
CPU
Every instruction lasts some number of clock cycles
Instructions
Instructions are encoded in binary machine code
E.g.: 01000110101101 may mean “perform an addition of two
registers and store the results in another register”
The CPU is built using gates (or, and, etc.) which
themselves use transistors
These gates implement instruction decoding
Based on the bits of the instruction code, several signals are
sent to different electronic components, which in turn perform
useful tasks
Typically, an instruction consists of two parts
The opcode: what the instruction computes
The operands: the input to the computation
opcode operands
0 1 0 0 0 1 1 0 1 0 1 1 0 1
Assembly language
It’s really difficult for humans to read/remember
binary instruction encodings
We will see that typically one would use hexadecimal
encoding, but still
Therefore it is typical to use a set of mnemonics,
which form the assembly language
It is often said that the CPU understands assembly
language
This is not technically true, as the CPU understands
machine code, which we, as humans, choose to
represent using assembly language
An assembler transforms assembly code into
machine code
Assembly Language
It used to be that all computer programmers did all
day was to write assembly code
This was difficult for many reasons
Difficult to read
Very difficult to debug
Different from one computer to another!
The use of assembly language for all programming
prevented the (sustainable) development of large
software projects involving many programmers
This is the main motivation for the development of
high-level languages
FORTRAN, Cobol, C, etc.
Why Assembly?
It's difficult
Error prone
Hard to debug
Takes a lot of time to
develop
Why Assembly?
However:
Hand-tuned assembly can be fast, sometimes faster than
the code a compiler produces.
Assembly is a lot closer to machine level than
any other language because the commands of
assembly language map (nearly) 1-1 to machine
instructions.
Hand-written assembly code can be a lot smaller than
the code a compiler produces.
In assembly, we can do a lot of things that we
can't do directly in a higher-level language, such as
manipulating processor flags.
High-level Languages
The first successful high-level language was FORTRAN
Developed by IBM in 1954 to run on its 704 series
Used for scientific computing
The introduction of FORTRAN led people to believe that there would
never be bugs again because it made programming so easy!
But high-level languages led to larger and more complex software
systems, hence leading to bugs
Another early programming language was COBOL
Developed in 1960, strongly supported by DoD
Used for business applications
In the early 60s IBM had a simple marketing strategy
On the IBM 7090 you used FORTRAN to do science
On the IBM 7080 you used COBOL to do business
Many high-level languages have been developed since then, and
they are what most programmers use
Fascinating history
High-Level Languages
Having high-level languages is good, but CPUs do not
understand them
Therefore, there needs to be a translation from a high-level
language to machine code
There are two ways to run a high-level language on a CPU
that only understands machine code:
Interpretation: An interpreter is a program that reads in high-
level code and simulates a computer that understands high-
level code
Compilation: A compiler is a program that reads in high-level
code and produces equivalent machine code, which can then
be executed on the CPU at a later time
Some languages are interpreted, some are compiled, some
are both or hybrid
The Big (Simplified) Picture
[Figure: High-level code is translated by the COMPILER into Assembly code, which the ASSEMBLER translates into Machine code. The Machine code runs on the CPU (program counter, registers, ALU, Control Unit).]
The Big (Simplified) Picture
[Figure: Assembly code is either produced by the COMPILER from High-level code or Hand-written; the ASSEMBLER translates it into Machine code, which runs on the CPU (program counter, registers, ALU, Control Unit).]
What we do in this class:
[Figure: the compilation diagram (High-level code, COMPILER, Hand-written Assembly code, ASSEMBLER, Machine code, CPU).]
Performance : Bubble Sort
Example
Processors Prior to 8086
4004 (1971): the first microprocessor made by Intel
Corporation. It allowed computing power to be
put into small devices such as calculators.
[Figure: 8086/8088 internal organization. Execution Unit (EU): ALU, flag register, EU control. Bus Interface Unit (BIU): instruction queue, bus control, external bus.]
Organization of 8088/8086
Intel 8088 facts
20-bit address bus allows accessing
1 M memory locations
16-bit internal data bus and 8-bit
external data bus. Thus, it needs
two read (or write) operations to
read (or write) a 16-bit datum
Byte addressable and byte-swapping
Word: 5A2F
Memory locations:
18001 5A High byte of word
18000 2F Low byte of word
[Figure: 8088 signal classification: VDD (5V), GND, CLK, 20-bit address, 8-bit data, control signals to 8088, control signals from 8088]
The 8086 Registers
To write assembly code for an ISA you must know
the name of registers
Because registers are places in which you put data to
perform computation and in which you find the result of the
computation (think of them as variables for now)
The registers are identified by binary numbers, but
assembly languages give them “easy-to-remember” names
The 8086 offered 16-bit registers
Four general purpose 16-bit registers
AX
BX
CX
DX
General purpose registers
AX, BX, CX, and DX can be assigned any value
you want.
AX (accumulator register). Most arithmetic
operations are done with AX.
BX (base register). Used for array operations:
it typically holds the base address (offset) of
the data being accessed.
CX (counter register). Used for counting
purposes, e.g., loop counters.
DX (data register). Used for storing data values.
The 8086 Registers
AX = AH:AL, BX = BH:BL, CX = CH:CL, DX = DH:DL
(each 16-bit register is made of a high byte and a low byte)
source array
DI (destination index) is always pointed to
current stack.
IP (instruction pointer) denotes the
[Figure: the 16-bit registers of the 8086 (SI, DI, BP, SP, IP, FLAGS, CS, DS, SS, ES), shown alongside the ALU and the Control Unit]
Addresses in Memory
We mentioned several registers that are used for
holding addresses of memory locations
Segments:
CS, DS, SS, ES
Pointers:
SI, DI: indices (typically used for pointers)
SP: Stack pointer
BP: (Stack) Base pointer
A program constantly references all three
regions of its address space: code, data, and stack
Therefore, the program constantly references
bytes in three different segments
For now let’s assume that each region is fully
contained in a single segment, which is in fact
not always the case
CS: points to the beginning of the code
segment
DS: points to the beginning of the data
segment
SS: points to the beginning of the stack
segment
Address Space
In the 8086 processor, a program is limited to referencing an
address space of size 1MB, that is 2^20 bytes
Therefore, addresses are 20 bits long!
A d-bit address can reference 2^d different “things”
Example:
2-bit addresses
00, 01, 10, 11
4 “things”
3-bit addresses
000, 001, 010, 011, 100, 101, 110, 111
8 “things”
In our case, these things are “bytes”
One cannot address anything smaller than a byte
Therefore, a 20-bit address makes it possible to address 2^20
individual bytes, or 1MB
Address Space
[Figure: a 20-bit address split into a 4-bit selector and a 16-bit offset. We have 1MB of memory. The 4-bit selector (0000 to 1111) picks one of 16 segments of 64K each.]
Memory Segmentation
A segment is a 64KB block of memory starting from any 16-byte
boundary
For example: 00000, 00010, 00020, 20000, 8CE90, and E0840 are all valid
segment addresses
The requirement of starting from 16-byte boundary is due to the 4-bit
left shifting
DS Data Segment
SS Stack Segment
ES Extra Segment
Memory Address Calculation
Examples
CS 3 4 8 A 0 SS 5 0 0 0 0
IP + 4 2 1 4 SP + F F E 0
Instruction address 3 8 A B 4 Stack address 5 F F E 0
DS 1 2 3 4 0
DI + 0 0 2 2
Data address 1 2 3 6 2
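The three examples above all follow the same rule: shift the 16-bit segment value left by 4 bits (multiply by 10H) and add the 16-bit offset. The sketch below models that rule in Python; the function name `physical_address` is an illustrative assumption, and the wrap to 20 bits reflects the 20-bit address bus described earlier.

```python
def physical_address(segment, offset):
    """Return the 20-bit physical address for a 16-bit segment:offset pair."""
    # Shifting left by 4 bits is the same as multiplying by 16 (10H).
    return ((segment << 4) + offset) & 0xFFFFF


# The three worked examples from the slide.
print(f"CS:IP -> {physical_address(0x348A, 0x4214):05X}")
print(f"SS:SP -> {physical_address(0x5000, 0xFFE0):05X}")
print(f"DS:DI -> {physical_address(0x1234, 0x0022):05X}")
```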
Fetching Instructions
Where to fetch the next instruction?
CS 1 2 3 4 0
IP + 0 0 1 2
Instruction address 1 2 3 5 2 (8088 memory location 12352 holds MOV AL, 0)
Update IP
After an instruction is fetched, register IP is updated as follows: IP = IP + length of the fetched instruction
For example: the length of MOV AL, 0 is 2 bytes. After fetching this instruction,
the IP is updated to 0014
Accessing Data Memory
There are a number of methods to generate the memory address when
accessing data memory. These methods are referred to as
Addressing Modes
Examples:
— Direct addressing: MOV AL, [0300H]
DS 1 2 3 4 0 (assume DS=1234H)
0 3 0 0
Memory address 1 2 6 4 0
DS 1 2 3 4 0 (assume DS=1234H)
0 3 1 0 (assume SI=0310H)
Memory address 1 2 6 5 0
In-class Exercise
MOV AL, BL AH AL
BH BL
In immediate and register addressing modes, the processor does not access memory.
Thus, the execution of such instructions is fast.
Immediate Addressing Mode
table1[0] = 56
Direct Addressing Example
AH AL 17000
12 34 12 17001H
34 17000H
Direct Addressing Mode
0A000H 12
DS: 0 8 0 0 _
+ SI: 200 0
memory
0A0 0 0
Example 2: assume SS = 0800H, BP=2000H, DL = 7
MOV [BP], DL
Register Indirect Addressing
Using indirect addressing mode, we can
process arrays using loops
Example: Summing array elements
Load the starting address (i.e., offset) of the
array into BX
Loop for each element in the array
Get the value using the offset in BX
Use indirect addressing
Add the value to the running total
Update the offset in BX to point to the next element
of the array
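The loop described above can be modeled in a few lines of Python. This is only a sketch: the dictionary `memory`, the offset `ARRAY_OFFSET`, and the sample values are assumptions made up for illustration, and elements are taken to be one byte each so the offset advances by 1.

```python
# Hypothetical byte-addressed data segment: offset -> byte value.
memory = {}

ARRAY_OFFSET = 0x0200          # assumed offset of the array in the data segment
SAMPLE_VALUES = [10, 20, 30, 40]  # assumed sample data, one byte per element
for i, value in enumerate(SAMPLE_VALUES):
    memory[ARRAY_OFFSET + i] = value


def sum_array(mem, start_offset, count):
    """Sum `count` byte elements starting at `start_offset`."""
    bx = start_offset          # load the starting offset of the array into BX
    total = 0
    for _ in range(count):     # loop for each element in the array
        total += mem[bx]       # get the value through the offset in BX, like [BX]
        bx += 1                # update BX to point to the next element
    return total


print(sum_array(memory, ARRAY_OFFSET, len(SAMPLE_VALUES)))
```

For word-sized elements the same loop would advance the offset by 2 instead of 1.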
Register Indirect Addressing
Loading offset value into a register
DS * 10H + BX + Displacement = Memory address
SS * 10H + BP + Displacement = Memory address
DS: 0 1 0 0 _ 01605H C0
+ BX: 0 6 0 0 01604H B0
+ Disp.: 0 0 0 4
01604 memory
MOV [BP-7], CH
Indexed Addressing
The operand field of the instruction contains an index register (SI or DI)
and an 8-bit (or 16-bit) constant (displacement)
For Example: MOV [DI-8], BL
Calculate memory address
DS * 10H + {SI or DI} + Displacement = Memory address
Example: assume DS = 0200H, DI=0030H BL = 17H
MOV [DI-8], BL
BH BL
DS: 0 2 0 0 _ 17
+ DI: 003 0 17 02028H
- Disp.: 0 0 0 8
02 028 memory
Based Indexed Addressing
The operand field of the instruction contains a base register (BX or BP)
and an index register
For Example: MOV [BP] [SI], AH
or MOV [BP+SI], AH
DS * 10H + BX + {SI or DI} = Memory address
SS * 10H + BP + {SI or DI} = Memory address
SS: 2 0 0 0 _ 24800H 07
+ BP: 4 0 0 0
+ SI.: 080 0
24800 memory
MOV [BX+DI], CH
Based Indexed with Displacement Addressing
The operand field of the instruction contains a base register (BX or BP),
an index register, and a displacement
DS * 10H + BX + {SI or DI} + Disp. = Memory address
SS * 10H + BP + {SI or DI} + Disp. = Memory address
DS: 0 3 0 0 _
+ BX: 1 0 0 0 06090H 20
+ DI.: 0010
+ Disp. 2 0 8 0
memory
06090
MOV [BP+SI+0010H], CH
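All of the addressing modes above compute the memory address the same way: segment * 10H plus the sum of whatever base register, index register, and displacement the instruction names. The Python sketch below is one way to check the worked examples; the function name `memory_address` and its keyword arguments are illustrative assumptions, and the 16-bit wrap of the offset is assumed from the 16-bit registers.

```python
def memory_address(segment, base=0, index=0, disp=0):
    """Segment * 10H + (base + index + displacement), as a 20-bit address."""
    offset = (base + index + disp) & 0xFFFF   # offsets are 16-bit quantities
    return ((segment << 4) + offset) & 0xFFFFF


# Direct addressing: MOV AL, [0300H] with DS = 1234H
print(f"{memory_address(0x1234, disp=0x0300):05X}")
# Indexed addressing: MOV [DI-8], BL with DS = 0200H, DI = 0030H
print(f"{memory_address(0x0200, index=0x0030, disp=-8):05X}")
# Based indexed addressing: MOV [BP+SI], AH with SS = 2000H, BP = 4000H, SI = 0800H
print(f"{memory_address(0x2000, base=0x4000, index=0x0800):05X}")
```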
Summary of Addressing Modes
Assembler converts a variable name into a
constant offset (called also a displacement)
quad1 DQ 1234567812345678h
val1 DT 1000000000123456789Ah
Little Endian Order
All data types larger than a byte store their individual
bytes in reverse order. The least significant byte occurs
at the first (lowest) memory address.
Example:
val1 DD 12345678h
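To see the byte order concretely, the sketch below lists the bytes of a value from the lowest memory address upward, using Python's built-in `int.to_bytes`. The helper name `little_endian_bytes` is an assumption made for illustration.

```python
def little_endian_bytes(value, size):
    """Return the bytes of `value` as hex strings, lowest address first."""
    return [f"{b:02X}" for b in value.to_bytes(size, "little")]


# val1 DD 12345678h: the least significant byte (78) sits at the lowest address.
print(little_endian_bytes(0x12345678, 4))
# A 16-bit word such as 5A2F is stored low byte first as well.
print(little_endian_bytes(0x5A2F, 2))
```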
EQU Directive
Define a symbol as either an integer or
text expression.
Cannot be redefined
PI EQU <3.1416>
pressKey EQU <"Press any key to continue...",0>
.data
prompt DB pressKey
Moving Around Values
If you need to do calculations or run instructions
involving variables, you first have to load the variable
values into registers.
The syntax of the mov instruction is mov a, b, which
means assign b to a
[Figure: values move between main memory (Var1, Var2) and registers (Reg 1, Reg 2)]
mov ax, [var2]
mov [var1],ax
Caveats in MOVs
You CANNOT use mov [var1], [var2].
In other words, the mov instruction cannot transfer
values between two variables (memory locations) directly. So, how can
we get around this? Use a register.
Suppose both var1 and var2 are word
variables. We can use any word register (AX,
BX, CX, DX, and so on) to do the transfer.
Suppose we use AX.
Thus, mov [var1], [var2] must be transformed into:
mov ax, [var2]
mov [var1],ax
Moving Around Values example
jmp start
our_var dw 10
start:
mov bx, [our_var]
mov cx, bx
mov [our_var], cx
mov ax, 4c00h
int 21h
end
The square brackets [ ] are to distinguish the variable from its address.
Moving Around Values cont.
When we deal with byte variables (i.e. db), we need to use
byte registers (e.g. AL, AH, BL, BH, and so on).
AX, BX, CX, DX, and so on are word registers.
You can use double-word registers, which are available in 80386
processors or better (use p386n instead of p286n to enable
double-word registers).
The double-word registers include EAX, EBX, ECX, EDX, and
so on.
We can assign constants to variables with the mov instruction.
However, this will work only with 80286 or better processors:
mov [word ptr our_var], 1
Notice that the word ptr modifier must be used when you assign
constants to variables. Since our_var is a word variable, we
need to use the word ptr modifier. Likewise, a byte variable uses
the byte ptr modifier and a double-word variable uses dword ptr.
Moving Around Values
example
AX <= 0502h
Moving Around Values cont.
al
ax
bl
Size Reduction
Of course, when doing a size reduction, one loses
information
So the “conversion” may not work
Example:
mov ax, 000A2h ; ax = 162 decimal
mov bl, al ; bl = 162 decimal
Decimal 162 is encodable on 8 bits
Example:
mov ax, 00101h ; ax = 257 decimal
mov bl, al ; bl = 1 decimal
Decimal 257 is not encodable on 8 bits
Size Reduction and Sign
Consider a 2-byte quantity: FFF4
If we interpret this quantity as unsigned it is decimal 65,524
Remember that the computer does not know whether the
content of registers/memory corresponds to signed or unsigned
quantities
Once again it’s the responsibility of the programmer to do the
right thing
In this case size reduction “does not work”, meaning that
reduction to a 1-byte quantity will not be interpreted as
decimal 65,524, but instead as decimal 244 (F4h)
If instead FFF4 is a signed quantity (using 2’s complement),
then it corresponds to -000C (000B + 1), that is to decimal -12
In this case, size reduction works!
Size Reduction and Sign
This does not mean that size reduction always
works for signed quantities
For instance, consider FF32h, which is a negative
number equal to -00CEh, that is, decimal -206
A size reduction into a 1-byte quantity leads to 32h,
which is decimal +50!
Note that -206 is not encodable on 1 byte
The range of signed 1-byte quantities is between decimal
-128 and decimal +127
So, size reduction may work or not work for signed
or unsigned quantities!
Two Rules to Remember
For unsigned numbers: size reduction works if all removed
bits are 0
0 0 0 0 0 0 0 0 X X X X X X X X
X X X X X X X X
For signed numbers: size reduction works if all removed bits
are all 0’s or all removed bits are all 1’s, AND if the highest bit
not removed is equal to the removed bits
This highest remaining bit is the new sign bit, and
thus must be the same as the original sign bit
1 1 1 1 1 1 1 1 1 X X X X X X X
1 X X X X X X X
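The two rules can be written down as small predicates and checked against the examples from the previous slides. This is a sketch for the 2-byte to 1-byte case only; the function names are assumptions chosen for illustration.

```python
def unsigned_reduction_ok(value16):
    """Unsigned rule: all 8 removed (upper) bits must be 0."""
    return (value16 >> 8) & 0xFF == 0


def signed_reduction_ok(value16):
    """Signed rule: removed bits are all 0s or all 1s AND match the new sign bit."""
    removed = (value16 >> 8) & 0xFF     # the 8 bits that get dropped
    new_sign = (value16 >> 7) & 1       # highest bit that is kept
    return (removed == 0x00 and new_sign == 0) or (removed == 0xFF and new_sign == 1)


print(unsigned_reduction_ok(0x00A2))  # 162 fits in an unsigned byte
print(unsigned_reduction_ok(0x0101))  # 257 does not
print(signed_reduction_ok(0xFFF4))    # -12 fits in a signed byte
print(signed_reduction_ok(0xFF32))    # -206 does not
```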
Size Increase
Size increase for unsigned quantities is
simple: just add 0s to the left of it
Size increase for signed quantities requires
sign extension: the sign bit must be
extended, that is, replicated
Consider the signed 1-byte number 5A. This
is a positive number (decimal 90), and so its
2-byte version would be 005A
Consider the signed 1-byte number 8A. This
is a negative number (decimal -118), and so
its 2-byte version would be FF8A
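The two kinds of size increase can be sketched as follows for the 1-byte to 2-byte case. The helper names `zero_extend`, `sign_extend`, and `as_signed` are assumptions introduced here; they only model the bit patterns and are not instructions of the 8086.

```python
def zero_extend(byte):
    """Unsigned size increase: add 0s to the left."""
    return byte & 0xFF


def sign_extend(byte):
    """Signed size increase: replicate the sign bit into the new upper byte."""
    byte &= 0xFF
    return (byte | 0xFF00) if (byte & 0x80) else byte


def as_signed(value, bits):
    """Interpret a bit pattern of the given width as a 2's complement number."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


print(f"{sign_extend(0x5A):04X}", as_signed(sign_extend(0x5A), 16))
print(f"{sign_extend(0x8A):04X}", as_signed(sign_extend(0x8A), 16))
```

Sign extension keeps the signed value unchanged, which is what makes it the right operation for signed quantities.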
Unsigned size increase
Say we want to size increase an unsigned 1-
byte number to be a 2-byte unsigned number
This can be done in a few easy steps, for
instance:
Put the 1-byte number into al
Set all bits of ah to 0
Access the number as ax
Example
mov al, 0EDh
mov ah, 0
mov ..., ax
Unsigned size increase
How about increasing the size of a 2-byte quantity to 4 bytes?
This cannot be done in the same manner because there is no
way to access the 16 highest bits of register eax separately!
AX
AH AL = EAX
00E3
+ F74F
= F832 Carry bit is not set
In-Class Exercise
mov al, 0E1h
mov bl, 0A2h
add al, bl
movzx eax, al
call print_int
movsx eax, bl
call print_int
After the two mov instructions: AL = E1, BL = A2
After add al, bl: E1 + A2 = 183, so AL = 83 and BL = A2
After movzx eax, al: EAX = 00 00 00 83, BL = A2
After movsx eax, bl: EAX = FF FF FF A2, BL = A2
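The exercise can be traced step by step in Python. The sketch assumes that print_int prints EAX as a signed 32-bit integer; the function name `exercise` and the local variable names are made up for illustration.

```python
def exercise():
    """Trace the exercise and return the two values passed to print_int."""
    al, bl = 0xE1, 0xA2
    total = al + bl                 # E1h + A2h = 183h
    al = total & 0xFF               # only the low byte (83h) fits in AL

    def as_signed32(value):
        # Interpret a 32-bit pattern as a 2's complement number.
        return value - (1 << 32) if value & 0x80000000 else value

    eax = al                        # movzx eax, al: upper 24 bits are 0
    first = as_signed32(eax)        # first print_int

    # movsx eax, bl: replicate BL's sign bit into the upper 24 bits
    eax = (bl | 0xFFFFFF00) if (bl & 0x80) else bl
    second = as_signed32(eax)       # second print_int
    return first, second


print(exercise())
```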
One should use the imul instruction instead (but unfortunately imul
doesn’t work on 1-byte quantities):
movsx ax, al ; sign extension
imul ax, 16 ; result in ax
In-Class Exercise
1 0 1 1 0 0 AND 1 1 0 1 1 0 = 1 0 0 1 0 0
1 1 0 0 0 1 OR 0 1 1 0 1 1 = 1 1 1 0 1 1
1 1 0 0 0 1 XOR 0 1 1 0 1 1 = 1 0 1 0 1 0
NOT 1 1 0 0 0 1 = 0 0 1 1 1 0
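The four answers can be checked with Python's bitwise operators, which behave like the AND, OR, XOR, and NOT instructions on the bits that are kept. The 6-bit `MASK` and the helper `bits` are assumptions used only to print results at the exercise's width.

```python
MASK = 0b111111  # the exercise uses 6-bit values


def bits(value):
    """Format a value as a 6-bit binary string."""
    return format(value & MASK, "06b")


print(bits(0b101100 & 0b110110))  # AND
print(bits(0b110001 | 0b011011))  # OR
print(bits(0b110001 ^ 0b011011))  # XOR
print(bits(~0b110001))            # NOT (masked back to 6 bits)
```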
Boolean Bitwise Instructions
or underflows)
SF: sign flag (set to 1 if the result is negative)
cmp a,b           ZF   CF   OF   SF
unsigned   a=b    1    0
           a<b    0    1
           a>b    0    0
signed     a=b    1         0    0
           a<b    0         v    !v
           a>b    0         v    v
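The table can be verified by modeling cmp as a subtraction whose result is thrown away and whose flags are kept. The sketch below is a model under that assumption, not the processor's implementation; `cmp_flags` and its `bits` parameter are names introduced for illustration.

```python
def cmp_flags(a, b, bits=32):
    """Model `cmp a, b`: compute a - b and return the resulting flags."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "ZF": int(result == 0),
        "CF": int(a < b),                                   # unsigned borrow
        "OF": int(bool((a ^ b) & (a ^ result) & sign)),     # signed overflow
        "SF": int(bool(result & sign)),                     # sign of the result
    }


print(cmp_flags(5, 5))
print(cmp_flags(3, 5))
print(cmp_flags(-128, 1, bits=8))  # signed a < b with overflow: OF != SF
```

For signed operands the relation in the table follows from this model: a >= b exactly when OF equals SF.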
Branch Instructions
...
add eax, ebx
jmp here
sub al, bl ; this instruction will never be executed!
movsx ax, al ; this instruction will never be executed!
here:
call print_int
...
The JMP Instruction
The ability to jump to a label in the assembly code is convenient
In machine code there is no such thing as a label: only addresses
So one would constantly have to compute addresses by hand
e.g., “jump to the instruction +4319 bytes from here in the source code”
e.g., “jump to the instruction -18 bytes from here in the source code”
This is what programmers way back when used to do by hand, using
signed displacements in bytes
The displacements are added to the EIP register (program counter)
There are three versions of the JMP instruction in machine code:
Short jump: Can only jump to an instruction that is within 128 bytes in
memory of the jump instruction (1-byte displacement)
Near jump: 4-byte displacement (any location in the code segment)
Far jump: very rare jump to another code segment
We won’t use this at all
The JMP Instruction
A short jump:
jmp label
or jmp short label
A near jump:
jmp near label
Why do we even have this?
Remember that instructions are encoded in binary
To jump one needs to encode the number of bytes to add/subtract to the
program counter
If this number is large, we need many bits to encode it
If this number is small, we want to use few bits so that our program
takes less space in memory
i.e., the encoding of a short jmp instruction takes fewer bits than the
encoding of a near jmp instruction (3 bytes less)
In a code that has 100,000 near jumps, if you can replace 50% of them
by short jumps, you save ~150KB (in the size of the executable)
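The saving quoted above is simple arithmetic, sketched below. The instruction sizes are assumptions consistent with the slide's "3 bytes less": a short jmp encoded as opcode EBh plus a 1-byte displacement, and a near jmp as opcode E9h plus a 4-byte displacement. The function names are made up for illustration.

```python
SHORT_JMP_BYTES = 2  # assumed: opcode EBh + 1-byte signed displacement
NEAR_JMP_BYTES = 5   # assumed: opcode E9h + 4-byte signed displacement


def fits_short(displacement):
    """A 1-byte signed displacement covers -128..127."""
    return -128 <= displacement <= 127


def bytes_saved(near_jumps, fraction_replaced):
    """Bytes saved by turning a fraction of near jumps into short jumps."""
    replaced = int(near_jumps * fraction_replaced)
    return replaced * (NEAR_JMP_BYTES - SHORT_JMP_BYTES)


print(fits_short(-18), fits_short(4319))
print(bytes_saved(100_000, 0.5))  # in bytes
```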
Conditional Branches
JZ branches if ZF is set
JNZ branches if ZF is unset
JO branches if OF is set
JNO branches if OF is unset
JS branches if SF is set
JNS branches if SF is unset
JC branches if CF is set
JNC branches if CF is unset
JP branches if PF is set
JNP branches if PF is unset
Example
Consider the following C-like code
if (EAX == 0)
EBX = 1;
else
EBX = 2;
Here it is in x86 assembler
cmp eax, 0 ; do the comparison
jz thenblock ; if = 0, then goto thenblock
mov ebx, 2 ; else clause
jmp next ; jump over the then clause
thenblock:
mov ebx, 1 ; then clause
next:
Could use jnz and be the other way around
Another Example
Say we have the following C code (let us assume that EAX is
signed)
if (EAX >= 5)
EBX = 1;
else
EAX = 2;
This is much less straightforward
Let’s go back to our table for signed numbers
signed
a<b 0 v !v if (OF = SF) then a >= b
a>b 0 v v
Another Example
a>=b if (OF = SF)
Skeleton program
cmp eax, 5 ; comparison
thenblock:
mov ebx, 1 ; “then” block
jmp end
elseblock:
mov ebx, 2 ; “else” block
end:
Another Example
a>=b if (OF = SF)
Program:
cmp eax, 5 ; do the comparison
jo oset ; if OF = 1 goto oset
js elseblock ; (OF=0) and (SF = 1) goto elseblock
jmp thenblock ; (OF=0) and (SF=0) goto thenblock
oset:
jns elseblock ; (OF=1) and (SF = 0) goto elseblock
jmp thenblock ; (OF=1) and (SF=1) goto thenblock
thenblock:
mov ebx, 1
jmp end
elseblock:
mov ebx, 2
end:
Let’s check that it works
Another Example
cmp eax, 5 ; do the comparison
jo oset ; if OF = 1 goto oset
js elseblock ; (OF=0) and (SF = 1) goto elseblock
jmp thenblock ; (OF=0) and (SF=0) goto thenblock
oset:
jns elseblock ; (OF=1) and (SF = 0) goto elseblock
jmp thenblock ; (OF=1) and (SF=1) goto thenblock
thenblock:
mov ebx, 1
jmp end
elseblock:
mov ebx, 2
end:
The jmp thenblock placed just before the thenblock label is an unneeded
instruction: we can just “fall through”
A bit too hard?
One can play tricks by putting the else block
before the then block
The previous two examples are really
awkward, and it’s very easy to introduce bugs
Consequently, x86 assembly provides other
branch instructions to make our life much
easier :)
Let’s look at these instructions
More branches
cmp x, y
signed unsigned
Instruction branches if Instruction branches if
JE x=y JE x=y
JNE x != y JNE x != y
JL, JNGE x<y JB, JNAE x<y
JLE, JNG x <= y JBE, JNA x <= y
JG, JNLE x>y JA, JNBE x>y
JGE, JNL x >= y JAE, JNB x >= y
Redoing our Example
if (EAX >= 5)
EBX = 1;
else
EAX = 2;
cmp eax, 5
jge thenblock
mov eax, 2
jmp end
thenblock:
mov ebx, 1
end:
Translating high-level structures
SKIP:
INC BP ; Increment array base_pointer
DEC CL ; Decrement counter
JNZ BACK ; if not 0 go to back
END START
Computing Prime Numbers
The book has an example of an assembly
program that computes prime numbers
Let’s look at it in detail
Principle:
Try possible prime numbers in increasing order
starting at 5
Skip even numbers
Test whether the possible prime number (the
“guess”) is divisible by any number other than 1
and itself
If yes, then it’s not a prime, otherwise, it is
Computing Primes in C
#include <stdio.h>

int main(void) {
    unsigned int guess;
    unsigned int factor;
    unsigned int limit;
    printf("Find primes up to: ");
    scanf("%u", &limit);
    printf("2\n3\n"); // prints the first 2 obvious primes
    guess = 5; // we start the guess at 5
    while (guess <= limit) {
        factor = 3; // look for a possible factor
        // we only look at factors < sqrt(guess)
        while (factor * factor < guess && guess % factor != 0)
            factor += 2;
        if (guess % factor != 0) // we never found a factor
            printf("%u\n", guess);
        guess += 2; // skip even numbers
    }
    return 0;
}
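The same algorithm can be tried out quickly in Python before translating it to assembly. The sketch below mirrors the C loops; the function name `primes_up_to` is an assumption, and unlike the C program it returns a list and leaves out 2 and 3 when the limit is smaller than they are.

```python
def primes_up_to(limit):
    """Return the primes <= limit using the guess/factor loops from the C code."""
    primes = [p for p in (2, 3) if p <= limit]  # the first 2 obvious primes
    guess = 5                                   # we start the guess at 5
    while guess <= limit:
        factor = 3                              # look for a possible factor
        # we only look at factors < sqrt(guess)
        while factor * factor < guess and guess % factor != 0:
            factor += 2
        if guess % factor != 0:                 # we never found a factor
            primes.append(guess)
        guess += 2                              # skip even numbers
    return primes


print(primes_up_to(30))
```

Note that perfect squares such as 9 and 25 leave the inner loop because factor*factor is no longer smaller than guess, and are then rejected by the final remainder test.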
Computing Primes in Assembly
How the parts of the C program map onto the assembly program:
The declarations of guess, factor, and limit: bss segment
The prompt message passed to printf: data segment (message)
The printf/scanf calls and guess = 5: easy text segment
The while loops and the if test: more difficult text segment
while_limit:
mov eax, [Guess]
cmp eax, [Limit] ; compare Guess and Limit
jnbe end_while_limit ; If !(Guess <= Limit) Goto end_while_limit
jmp while_limit
end_while_limit:
popa ; clean up
mov eax, 0 ; clean up
leave ; clean up
ret ; clean up
Computing Primes in Assembly
mov ebx, 3 ; ebx is factor
while_factor:
mov eax, ebx ; eax = factor
mul eax ; edx:eax = factor * factor
cmp edx, 0 ; compare edx and 0 (if edx != 0, then we’re too big)
jne end_while_factor ; factor too big (ZF = 0 here)
cmp eax, [Guess] ; compare factor*factor and guess
jnb end_while_factor ; if !< goto end_while_factor (factor too big)
mov edx, 0 ; edx = 0 (don’t forget to initialize edx)
mov eax, [Guess] ; eax = [Guess]
div ebx ; divide edx:eax by factor
cmp edx, 0 ; compare the remainder with 0
je end_while_factor ; if == 0 goto end_while_factor
add ebx, 2 ; factor += 2
jmp while_factor ; loop back
end_while_factor:
je endif ; ZF = 1: a factor was found, or factor*factor == guess
mov eax, [Guess] ; print guess
call print_int ; print guess
call print_nl ; print guess
endif:
add dword [Guess], 2 ; guess += 2
We don’t choose eax for factor because eax is used by a lot of
functions/routines
Stacks
Why Stack?
There are several reasons why we need
stacks:
To save register values if we ran out of
registers.
To pass parameters to subroutines
To make space for local variables in
subroutines
To preserve original register values if we
change them in a subroutine
To fetch processor flag status
Stack operations
The stack is a last in, first out (LIFO) structure
Stack operations are mainly done with two
instructions: push and pop.
The push instruction pushes a value
onto the stack, while pop pops it off.
The syntax is like this:
push dword 1 ; ESP = 00000FFCh
push dword 2 ; ESP = 00000FF8h
push dword 3 ; ESP = 00000FF4h
pop eax ; EAX = 3
pop ebx ; EBX = 2
pop ecx ; ECX = 1
Stack contents after the three pushes (increasing addresses going up, each dword stored little endian):
00000FFFh 0
00000FFEh 0
00000FFDh 0
00000FFCh 1
00000FFBh 0
00000FFAh 0
00000FF9h 0
00000FF8h 2
00000FF7h 0
00000FF6h 0
00000FF5h 0
00000FF4h 3
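The push/pop sequence above can be replayed with a small model of the stack. This is a sketch only: the class name `Stack32` is invented for illustration, and the starting value ESP = 1000h is an assumption inferred from the first push leaving ESP at 00000FFCh.

```python
class Stack32:
    """Toy model of a downward-growing stack of 32-bit values."""

    def __init__(self, esp=0x1000):
        self.esp = esp          # assumed initial stack pointer
        self.memory = {}        # address -> byte

    def push_dword(self, value):
        self.esp -= 4           # the stack grows toward lower addresses
        data = (value & 0xFFFFFFFF).to_bytes(4, "little")
        for i, byte in enumerate(data):
            self.memory[self.esp + i] = byte   # low byte at the lowest address

    def pop_dword(self):
        data = bytes(self.memory[self.esp + i] for i in range(4))
        self.esp += 4
        return int.from_bytes(data, "little")


stack = Stack32()
for value in (1, 2, 3):
    stack.push_dword(value)
    print(f"ESP = {stack.esp:08X}h")
print(stack.pop_dword(), stack.pop_dword(), stack.pop_dword())
```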
The ESP Register
func:
...
ret ; return
Nested Calls
The use of the stack enables nested calls
Return addresses are popped in the reverse order in which
they were pushed (Last-In-First-Out)
Warning: one must be extremely careful to pop
everything that’s pushed on the stack inside a
function
Example of erroneous use of the stack:
func:
mov eax, 12 ;
push eax ; put eax on the stack
ret ; pop eax and interpret
; it as a return address!!
Activation Records
The stack is useful to store and retrieve return
addresses, transparently managed via the CALL and
RET instructions
But it’s much more useful than this
In general, when calling a function, one puts all kinds of
useful information on the stack
When the function returns, this information is popped off
the stack and the function’s caller can safely resume
execution
The set of “useful information” is typically called an
activation record (or a “stack frame”)
One very important component of an activation record is
the parameters passed to the function
Another is the return address, as we’ve already seen
Subprogram Conventions
Typically, one uses a consistent calling convention, so that there is a
generic way to call a subprogram
Of course compilers use calling conventions
The compiler, when generating assembly code, must follow a standard
process to generate assembly corresponding to function calls and
returns
Some languages specify which calling convention should be used
What we describe in all that follows is mostly the convention used
by the C language
i.e., C compilers should use this convention when generating assembly
code from C code
A Simple Activation Record
To call a function you have to follow the following steps:
Push the parameters onto the stack
Execute the CALL instruction, which pushes the return address
onto the stack
func:
push ebp ; save original EBP
mov ebp, esp ; set EBP = ESP
func:
push ebp ; save original EBP
mov ebp, esp ; set EBP = ESP
push ebx ; save EBX
push ecx ; save ECX
func:
push ebp ; save original EBP
mov ebp, esp ; set EBP = ESP
pusha ; save all (including new EBP)
Note:
BX is nicknamed the 'base register',
SI the 'source index' and
DI the 'destination index'.
XCHG
The XCHG instruction is used to swap the contents of its two operands
Interrupt
Essentials
Introduction to Interrupt
Interrupts can be seen as a number of
functions.
These functions make programming
much easier: instead of writing code to
print a character, you can simply call the
interrupt and it will do everything for you.
There are also interrupt functions that
work with disk drives and other hardware.
We call such functions software
interrupts.
Interrupts can also be triggered by
hardware; these are called hardware
interrupts. For now we are interested
in software interrupts only.
Introduction to Interrupt
To make a software interrupt there is an INT instruction, which has a
very simple syntax:
INT value
Where value can be a number between 0 and 255 (or 0 to 0FFh),
Buffer
Output: A Better Version
One way to cope with the “$” issue is to
output characters one by one using
a loop.
The loop terminates if the character
being read is 0.
ASCII code 0 is the null character (NUL),
which is commonly used to terminate
strings.
Interrupt 21h, service 06h is used to print
one character on screen
[bx] means bx is treated as a pointer instead
of a value
Input one Character
Number to String
The output routines we discussed so far
are intended only for outputting strings.
How can we output numbers?
We have to convert the numbers to
string first.
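One common way to do the conversion is to divide repeatedly by 10 and turn each remainder into an ASCII digit. The Python sketch below shows the idea for unsigned numbers; the function name `number_to_string` is an assumption, and this is not necessarily the routine the slides go on to present.

```python
def number_to_string(n):
    """Convert a non-negative integer to its decimal string, digit by digit."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, remainder = divmod(n, 10)               # like DIV by 10
        digits.append(chr(ord("0") + remainder))   # remainder -> ASCII digit
    # The digits come out least significant first, so reverse them.
    return "".join(reversed(digits))


print(number_to_string(1234))
```

In assembly the reversal is often handled by pushing each digit on the stack and popping them back in the right order.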
Screen features
Setting the cursor