80x86 Format
! 8086 instructions are represented as binary numbers
  Instructions require between 1 and 6 bytes
Note that some architectures have fixed length instructions (particularly RISC architectures)
Byte  Contents
1     Opcode byte: bits 7-2 = opcode, bit 1 = d, bit 0 = w
2     Addressing mode byte: bits 7-6 = mod, bits 5-3 = reg, bits 2-0 = r/m
3     [optional] low disp, addr, or data
4     [optional] high disp, addr, or data
5     [optional] low data
6     [optional] high data
! This is the general instruction format used by the majority of 2-operand instructions
There are over a dozen variations of this format
! Note that bytes 1 and 2 are divided up into 6 fields:
  opcode
  d      direction (or s = sign extension)
  w      word/byte
  mod    mode
  reg    register
  r/m    register/memory
! An instruction may also be preceded by one or more prefix bytes: repeat, segment override, or lock
In 32-bit machines we also have an address size override prefix and an operand size override prefix
! Some instructions are one-byte instructions and lack the addressing mode byte
! Note the order of bytes in an assembled instruction:
  [Prefix] Opcode [Addr Mode] [Low Disp] [High Disp] [Low data] [High data]
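The six fields of bytes 1 and 2 can be pulled apart with shifts and masks. The following is a minimal sketch in Python; the function name `split_fields` and the dictionary layout are illustrative assumptions, not part of any assembler or of the original slides.

```python
def split_fields(opcode_byte, modrm_byte):
    """Split bytes 1 and 2 of the general format into their six fields."""
    return {
        "opcode": opcode_byte >> 2,        # upper six bits of byte 1
        "d": (opcode_byte >> 1) & 1,       # direction (or s) bit
        "w": opcode_byte & 1,              # word/byte bit
        "mod": modrm_byte >> 6,            # bits 7-6 of byte 2
        "reg": (modrm_byte >> 3) & 0b111,  # bits 5-3 of byte 2
        "rm": modrm_byte & 0b111,          # bits 2-0 of byte 2
    }


# 8B 07 is MOV AX,[BX] in 16-bit code
fields = split_fields(0x8B, 0x07)
```

Here `fields` holds opcode 100010, d = 1, w = 1, mod = 00, reg = 000 and r/m = 111.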
Prefix Bytes
! Repetition
  REP, REPE, REPZ    F3H
  REPNE, REPNZ       F2H
  Note that REP and REPE are not distinct
  The machine (microcode) interpretation of the REP/REPE code depends on the instruction currently being executed
! Segment override
  CS  2EH
  DS  3EH
  ES  26H
  SS  36H
! Lock F0H
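As a rough illustration of how a disassembler might peel prefix bytes off the front of an instruction, here is a small Python sketch built from the byte values listed above. The names `PREFIXES` and `split_prefixes` are hypothetical, and the sketch makes no attempt to reject illegal prefix combinations.

```python
# Prefix byte values taken from the lists above
PREFIXES = {
    0xF3: "REP/REPE/REPZ",
    0xF2: "REPNE/REPNZ",
    0x2E: "CS:",
    0x3E: "DS:",
    0x26: "ES:",
    0x36: "SS:",
    0xF0: "LOCK",
}


def split_prefixes(code):
    """Return (names of leading prefix bytes, remaining instruction bytes)."""
    names = []
    i = 0
    while i < len(code) and code[i] in PREFIXES:
        names.append(PREFIXES[code[i]])
        i += 1
    return names, bytes(code[i:])


# 2E 8F 06 00 12 is POP into CS:1200H
names, rest = split_prefixes(bytes([0x2E, 0x8F, 0x06, 0x00, 0x12]))
```

After the call, `names` contains the CS override and `rest` starts at the opcode byte 8F.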
! opcode field specifies the operation performed (mov, xchg, etc.)
! d (direction) field specifies the direction of data movement:
  d=1  data moves from operand specified by R/M field to operand specified by REG field
  d=0  data moves from operand specified by REG field to operand specified by R/M field
! d position is replaced by "c" bit in Shift and Rotate instructions
  Indicates whether CL is used for the shift count
! w (word/byte) specifies operand size
  w=1  data is word
  w=0  data is byte
! Our primary focus is 16-bit instruction encoding so we will not discuss 32-bit encoding beyond this topic
  We only have one bit (the w bit) for operand size so only two operand sizes can be directly specified
  16-bit machines: w=0 data is 8 bits; w=1 data is 16 bits
  32-bit machines: w=0 data is 8 bits; w=1 data is 32 bits
! Operand and address size override prefixes are used to specify 32-bit registers in 16-bit code and 16-bit registers in 32-bit code
  66h = operand size override
  67h = address size override
! Interpretation of an instruction depends on whether it is executed in a 16-bit code segment or a 32-bit code segment
  Instruction      16-bit code    32-bit code
  mov ax,[bx]      8B 07          67 66 8B 07
  mov eax,[bx]     66 8B 07       67 8B 07
  mov ax,[ebx]     67 8B 03       66 8B 03
  mov eax,[ebx]    67 66 8B 03    8B 03
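The pattern in the table can be stated as a rule: a prefix is needed whenever the operand or address size differs from the default size of the code segment. The Python sketch below reproduces the prefix bytes in the table under that rule; `size_prefixes` is an illustrative name, and the sketch only applies to word or doubleword operands (w = 1).

```python
def size_prefixes(code_bits, operand_bits, address_bits):
    """Prefix bytes for a 16/32-bit operand and address in a 16/32-bit code segment."""
    prefix = []
    if address_bits != code_bits:
        prefix.append(0x67)  # address size override
    if operand_bits != code_bits:
        prefix.append(0x66)  # operand size override
    return bytes(prefix)


# mov eax,[ebx] assembled in a 16-bit code segment: both overrides needed
in_16_bit_code = size_prefixes(16, 32, 32) + bytes([0x8B, 0x03])
# the same instruction in a 32-bit code segment: no prefixes
in_32_bit_code = size_prefixes(32, 32, 32) + bytes([0x8B, 0x03])
```

The order 67 then 66 is chosen only to match the table.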
! The addressing mode byte contains three fields
  Mod  Bits 6-7 (mode; determines how the R/M field is interpreted)
  Reg  Bits 3-5 (register) or SREG (segment register)
  R/M  Bits 0-2 (register/memory)
! Specifies details about operands
! MOD
  00  Use R/M Table 1 for R/M operand
  01  Use R/M Table 2 with 8-bit displacement
  10  Use R/M Table 2 with 16-bit displacement
  11  Two register instruction; use REG table
! REG
  REG  w=0  w=1      REG  w=0  w=1
  000  AL   AX       100  AH   SP
  001  CL   CX       101  CH   BP
  010  DL   DX       110  DH   SI
  011  BL   BX       111  BH   DI
! SREG
  000 ES   001 CS   010 SS   011 DS
! R/M Table 1 (Mod = 00)
  000 [BX+SI]   010 [BP+SI]   100 [SI]   110 direct address
  001 [BX+DI]   011 [BP+DI]   101 [DI]   111 [BX]
! R/M Table 2 (Mod = 01 or 10) Add DISP to register specified:
  000 [BX+SI]   010 [BP+SI]   100 [SI]   110 [BP]
  001 [BX+DI]   011 [BP+DI]   101 [DI]   111 [BX]
! In general the addressing mode byte is not present if the instruction has no operands
! For one-operand instructions the R/M field indicates where the operand is to be found
! For two-operand instructions (except those with an immediate operand) one operand is a register determined by the REG (SREG) field and the other may be register or memory and is determined by the R/M field
  Direction bit has meaning only in two-operand instructions
  Indicates whether "destination" is specified by REG or by R/M
  Note that this allows many instructions to be encoded in two different ways
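To make the MOD, REG and R/M lookups concrete, here is a small Python sketch that turns an addressing mode byte into operand text for 16-bit code. The function `describe_modrm` and its output format are assumptions made for illustration; the caller is assumed to pass the displacement (or direct address) already decoded as an integer.

```python
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]
RM_BASE = ["BX+SI", "BX+DI", "BP+SI", "BP+DI", "SI", "DI", "BP", "BX"]


def describe_modrm(modrm, w, disp=0):
    """Return (REG operand, R/M operand) as text for a 16-bit addressing mode byte."""
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    regs = REG16 if w else REG8
    if mod == 0b11:                     # two-register form: R/M uses the REG table
        rm_text = regs[rm]
    elif mod == 0b00 and rm == 0b110:   # direct address, no base register
        rm_text = "[%04XH]" % disp
    elif mod == 0b00:                   # register indirect, no displacement
        rm_text = "[%s]" % RM_BASE[rm]
    else:                               # mod 01 or 10: base plus signed displacement
        rm_text = "[%s%+d]" % (RM_BASE[rm], disp)
    return regs[reg], rm_text


operands = describe_modrm(0x4F, w=0, disp=0x10)
```

Which of the two returned operands is the destination depends on the d bit of the opcode byte, which this sketch deliberately leaves to the caller.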
Addressing Mode 00
! Note that the 110 case (direct addressing) requires that the instruction be followed by two address bytes
  There are then two possibilities:
  1  Opcode | Addressing Mode
  2  Opcode | Addressing Mode | Offset-Low | Offset-High
Addressing Mode 01
! All instructions have the form:
  Opcode | Addressing Mode | Displacement
! Examples
  MOV AX,[BP+2]
  MOV DX,[BX+DI+4]
  MOV [BX-4],AX
Addressing Mode 10
! All instructions have the form:
  Opcode | Addressing Mode | Disp-Low | Disp-High
Addressing Mode 11
! Both operands are registers (R/M is looked up in the REG table), so no displacement bytes follow:
  Opcode | Addressing Mode
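The four addressing modes determine how many displacement or address bytes follow the addressing mode byte. A minimal Python sketch of that rule for 16-bit code (the name `displacement_bytes` is an assumption for illustration):

```python
def displacement_bytes(modrm):
    """Number of displacement/address bytes that follow the addressing mode byte."""
    mod, rm = modrm >> 6, modrm & 7
    if mod == 0b01:
        return 1                      # 8-bit displacement
    if mod == 0b10:
        return 2                      # 16-bit displacement
    if mod == 0b00 and rm == 0b110:
        return 2                      # direct address: offset-low, offset-high
    return 0                          # mod 00 (other R/M values) or mod 11


counts = [displacement_bytes(b) for b in (0x07, 0x06, 0x4F, 0x8F, 0xC0)]
```

Immediate data bytes, if any, come after these and are not counted here.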
Encoding Examples
! POP reg/mem has the structure 1000 1111 mod xxx r/m (the examples below use xxx = 000)
! Note that w = 1 always for POP (cannot pop bytes)
! To POP into AX:
  MOD = 11 (use REG table), R/M = 000
  Encoding = 8F C0
! To POP into BP:
  MOD = 11, R/M = 101
  Encoding = 8F C5
! To POP into memory location DS:1200H
  MOD = 00, R/M = 110
  Encoding = 8F 06 00 12
! To POP into memory location CS:1200H
  MOD = 00, R/M = 110
  Encoding = 2E 8F 06 00 12 (2E is the CS segment override prefix)
! POP REG also has a one-byte opcode with the structure: 01011 REG
  So POP AX = 01011000 = 58H
     POP BX = 01011011 = 5BH
! Note that there are two legal encodings of POP REG
  Shorter form exists because POPs are so common
  Most assemblers will use the shorter form
! Note that neither form of POP REG follows the general rules outlined above; registers are coded into the opcode byte
! Note also that even though POP CS is illegal, DEBUG will correctly assemble it as 0F, but will not unassemble it
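The POP encodings can be checked mechanically. The Python sketch below builds the two-byte form (8F with MOD = 11), the one-byte form (01011 REG) and the direct-memory form; all three function names are hypothetical helpers, and the REG field of the two-byte form is fixed at 000.

```python
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]


def pop_reg_long(name):
    """Two-byte form: 8F followed by MOD=11, REG=000, R/M=register number."""
    return bytes([0x8F, 0xC0 | REG16.index(name)])


def pop_reg_short(name):
    """One-byte form: 01011 followed by the 3-bit register number."""
    return bytes([0x58 | REG16.index(name)])


def pop_direct(offset, segment_prefix=None):
    """POP into a direct memory address: MOD=00, R/M=110, then offset low/high."""
    code = bytes([0x8F, 0x06, offset & 0xFF, (offset >> 8) & 0xFF])
    if segment_prefix is not None:
        code = bytes([segment_prefix]) + code
    return code


long_bp = pop_reg_long("BP")
short_bx = pop_reg_short("BX")
pop_cs_1200 = pop_direct(0x1200, segment_prefix=0x2E)
```

With R/M = 101 for BP, the two-byte form comes out as 8F C5, and POP BX in the short form is 5B.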
Examples (Cont'd)
! MOV instruction has seven possible formats. We will not discuss them all.
MOV reg/mem,reg/mem
! MOV AX,BX
  - w = 1 because we are dealing with words
  - MOD = 11 because it is register-register
  - if d = 0 then REG = source (BX) and R/M = dest (AX): 1000 1001 1101 1000 (89 D8)
  - if d = 1 then REG = dest (AX) and R/M = source (BX): 1000 1011 1100 0011 (8B C3)
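Because the d bit only swaps the roles of REG and R/M, every 16-bit register-to-register MOV has two valid encodings. The Python sketch below generates both; `mov_reg_reg` is an illustrative helper name, not an assembler API.

```python
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]


def mov_reg_reg(dest, src):
    """Return both encodings of MOV dest,src for 16-bit registers (w=1, MOD=11)."""
    d_idx, s_idx = REG16.index(dest), REG16.index(src)
    # d=0: opcode 89, REG = source, R/M = destination
    form_d0 = bytes([0x89, 0xC0 | (s_idx << 3) | d_idx])
    # d=1: opcode 8B, REG = destination, R/M = source
    form_d1 = bytes([0x8B, 0xC0 | (d_idx << 3) | s_idx])
    return form_d0, form_d1


form_d0, form_d1 = mov_reg_reg("AX", "BX")
```

For MOV AX,BX this gives 89 D8 with d = 0 and 8B C3 with d = 1; either byte pair performs the same move.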
! MOV [BX+10h],CL
  - w = 0 because we are dealing with a byte
  - d = 0 because the destination is the memory operand [BX+10h], which must be encoded in the R/M field (R/M Table 2); CL, the source, goes in REG
    therefore first byte is (1000 1000) = 88H
  - since 10H can be encoded as an 8-bit displacement, we can use MOD=01, REG=001 and R/M=111 = 0100 1111 = 4FH
    and the last byte is 10H
  result: 88 4F 10
  Note: MOV [BX+10H],CX = 89 4F 10
! Can also encode MOV [BX+10h],CL with a 16-bit displacement (MOD = 10), although there is no reason to do so:
88 8F 10 00
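The choice between MOD = 01 and MOD = 10 can be sketched as follows. This Python helper is a hypothetical illustration that handles only the [BX+disp] destination used in the example; the `force_disp16` flag is an assumption added to show the longer, equally legal encoding.

```python
import struct

REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]


def mov_bx_disp_reg(disp, reg, force_disp16=False):
    """Encode MOV [BX+disp],reg with d=0; disp must fit in a signed 16-bit value."""
    if reg in REG8:
        w, idx = 0, REG8.index(reg)
    else:
        w, idx = 1, REG16.index(reg)
    opcode = 0x88 | w                   # 1000 100w, d = 0
    rm = 0b111                          # [BX] in R/M Table 2
    if -128 <= disp <= 127 and not force_disp16:
        # MOD = 01: one signed displacement byte
        return bytes([opcode, 0x40 | (idx << 3) | rm]) + struct.pack("<b", disp)
    # MOD = 10: two displacement bytes, low byte first
    return bytes([opcode, 0x80 | (idx << 3) | rm]) + struct.pack("<h", disp)


short_form = mov_bx_disp_reg(0x10, "CL")
long_form = mov_bx_disp_reg(0x10, "CL", force_disp16=True)
```

The short form is 88 4F 10 and the forced 16-bit form is 88 8F 10 00, one byte longer for the same effect.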
MOV reg/mem,imm
! Format: 1100 011w mod000r/m [disp-low] [disp-high] data, where the displacement bytes are optional depending on the value of MOD
! MOV BYTE PTR [100H],10H
  - w = 0 because we have a byte operand
  - MOD = 00 (R/M Table 1), R/M = 110 (direct address)
  - bytes 3 and 4 are the address; byte 5 is the immediate data
  C6 06 00 01 10
MOV accumulator,mem
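! Note special form for accumulator: 1010 00dw addr-low addr-high
  Many other instructions have a short form for the AX register
! The same move (MOV AX,[0100H], short form A1 00 01) could also be assembled with the general format as:
  1000 1011 0000 0110 0000 0000 0000 0001
  8B 06 00 01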
! Immediate mode instructions have only one register or memory operand; the other is encoded in the instruction itself
  The REG field is used as an opcode extension
  The addressing mode byte has to be examined to determine which operation is specified
  add imm to reg/mem   1000 00sw  mod000r/m
  or imm to reg/mem    1000 00sw  mod001r/m
! In many instructions with immediate operands the d bit is interpreted as the s bit
When the s bit is set it means that the single byte operand should be sign-extended to 16 bits
! Example: ADD DX,3 (add imm to reg16, 1000 00sw mod000r/m)
  mod = 11 (use REG table), r/m = 010 = DX
  With s bit set we have    1000 0011 11 000 010, encoding = 83 C2 03
  With s bit clear we have  1000 0001 11 000 010, encoding = 81 C2 03 00
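The effect of the s bit on instruction length can be sketched in Python. The helper `add_imm_reg16` and its `sign_extend` flag are illustrative assumptions; the sketch covers only ADD with a 16-bit register destination (MOD = 11, REG field = 000).

```python
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]


def add_imm_reg16(reg, imm, sign_extend=True):
    """Encode ADD reg16,imm using 1000 00sw mod000r/m with w = 1."""
    modrm = 0xC0 | (0b000 << 3) | REG16.index(reg)   # MOD=11, opcode extension 000
    if sign_extend and -128 <= imm <= 127:
        # s = 1: one immediate byte, sign-extended to 16 bits by the CPU
        return bytes([0x83, modrm, imm & 0xFF])
    # s = 0: full 16-bit immediate, low byte first
    return bytes([0x81, modrm, imm & 0xFF, (imm >> 8) & 0xFF])


with_s = add_imm_reg16("DX", 3)
without_s = add_imm_reg16("DX", 3, sign_extend=False)
```

Setting the s bit saves one byte whenever the immediate fits in a signed byte: 83 C2 03 versus 81 C2 03 00.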
! Eric Isaacson claims that the A86 assembler has a unique "footprint" that allows him to detect whether or not a machine language program has been assembled with A86
Selected Instruction Formats

Instruction                   Opcode     Addr. Mode   Following bytes / notes
ADC reg/mem with reg          000100dw   modregr/m    [addr]
ADC immed to reg/mem          100000sw   mod010r/m    [addr] data
ADD reg/mem with reg          000000dw   modregr/m    [addr]
ADD immed to accumulator      0000010w                data
ADD immed to reg/mem          100000sw   mod000r/m    [addr] data
OR reg/mem with reg           000010dw   modregr/m    [addr]
OR immed to reg/mem           100000sw   mod001r/m    [addr] data
OR immed to accumulator       0000110w                data
INC reg16                     01000reg
INC reg/mem                   1111111w   mod000r/m    [addr]
MOV reg/mem to/from reg       100010dw   modregr/m    [addr]
MOV reg/mem to segreg         10001110   modsegr/m    [addr] (seg = segreg)
MOV immed to reg/mem          1100011w   mod000r/m    [addr] data
MOV immed to reg              1011wreg                data
MOV direct mem to/from acc    101000dw                addr
XCHG reg/mem with reg         1000011w   modregr/m    [addr]
XCHG reg16 with accum.        10010reg
CMP reg/mem with reg          001110dw   modregr/m    [addr]
CMP immed to accumulator      0011110w                data
CMP immed to reg/mem          100000sw   mod111r/m    [addr] data
POP reg                       01011reg
POP segreg                    00reg111
POP reg/mem                   10001111   modxxxr/m    [addr] (xxx = don't care)
RCL reg/mem,CL/immediate      110100cw   mod010r/m    [addr] (if c=0 shift = 1, if c=1 shift = CL)
RCR reg/mem,CL/immediate      110100cw   mod011r/m    [addr] (if c=0 shift = 1, if c=1 shift = CL)
STOS                          1010101w
CMPS                          1010011w
MUL reg/mem                   1111011w   mod100r/m    [addr]