x86 Instruction Encoding

x86 Instruction Encoding (12.3)

Lecture By: Atiya Jokhio


Outline
•Introduction
•Instruction Format
•Single Byte Instructions
•Move Immediate to Register
•Register Mode Instructions
•Processor Operand-Size Prefix
•Memory Mode Instructions
Introduction
•The Intel 8086 processor was the first in a line of processors using a
Complex Instruction Set Computer (CISC) design.
• Its instruction set includes a wide variety of memory-addressing, shifting,
arithmetic, data movement, and logical operations.
•To encode an instruction means to convert an assembly language
instruction and its operands into machine code.
•To decode an instruction means to convert a machine code instruction into
assembly language.
•We will begin with the 8086/8088 processor as an illustrative example.
• Later, we will show some of the changes made when Intel introduced
32-bit processors.
Instruction Format

•The fields of an instruction are stored in order of increasing address, so the
prefix byte, when present, is located at the instruction’s starting address.
•Every instruction has an opcode, but the remaining fields are
optional.
•Most instructions are 2 or 3 bytes long.
•The instruction prefix overrides default operand sizes. These prefix bytes are
not the opcode expansion prefix discussed earlier; they are special bytes that
modify the behavior of an existing instruction.
•The opcode (operation code) identifies a specific variant of an instruction.
• E.g. the ADD instruction has nine different opcodes, depending on the parameter
types used.
•The Mod R/M field identifies the addressing mode and operands. The
notation “R/M” stands for register/memory.
•The scale-index-base (SIB) byte is used to calculate offsets of array elements.
•The address displacement field holds an operand’s offset, or it can be
added to base and index registers in addressing modes such as base-displacement
or base-index-displacement.
•The immediate data field holds constant operands.
•The six-bit opcode identifies the operation. The same opcode is used for
both 8- and 16-bit operations.
•The size of the operands is given by the W bit: W = 0 means 8-bit data
and W = 1 means 16-bit (or 32-bit) data.
•Bit number one, marked D, specifies the direction of the data transfer.
When the instruction is not in register mode (MOD is not 11):
• If D = 0 then the destination operand is a memory location,
e.g. add [ebx], al
• If D = 1 then the destination operand is a register,
e.g. add al, [ebx]
•In general, the D bit specifies whether the register in the REG field is a source or
destination operand: D = 0 means source and D = 1 means destination.
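The opcode byte layout described above (six opcode bits, then D, then W) can be taken apart with a few shifts and masks. This is a minimal sketch; the function name `split_opcode` is hypothetical and only covers opcodes that follow the `oooooodw` pattern.

```python
def split_opcode(byte):
    """Split an opcode byte of the form oooooodw into (opcode6, D, W)."""
    opcode6 = byte >> 2      # upper six bits: the operation
    d = (byte >> 1) & 1      # bit 1: 1 = REG field is the destination
    w = byte & 1             # bit 0: 0 = 8-bit data, 1 = 16/32-bit data
    return opcode6, d, w


# 8A = 1000 10 1 0: the MOV opcode bits 100010, D = 1, W = 0
opcode6, d, w = split_opcode(0x8A)
print(f"{opcode6:06b} D={d} W={w}")
```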
•The MOD R/M byte (byte 2 above) specifies instruction operands and
their addressing mode.
•The R/M field, combined with MOD, specifies either
• the second operand in a two-operand instruction, or
• the only operand in a single-operand instruction like NOT or NEG.
• These two fields tell the CPU whether there is a memory operand and, if
so, how to calculate its offset address.
•The MOD field specifies the x86 addressing mode:
•MOD = 11 means register mode.
•MOD = 00 means memory mode with no displacement.
• (Except when R/M = 110, in which case a 16-bit displacement follows.)
•MOD = 01 means memory mode with an 8-bit displacement following (D8).
•MOD = 10 means memory mode with a 16-bit displacement following
(D16); with 32-bit addressing the displacement is 32 bits.

MOV [ESI+01234Fh], BX ; MOD = 10
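The MOD rules above translate directly into a small decoder. The sketch below assumes 8086-style 16-bit addressing, as in the list of MOD values; `split_modrm` and `displacement_size_16` are hypothetical names chosen for illustration.

```python
def split_modrm(byte):
    """Split a Mod R/M byte into its (mod, reg, rm) fields."""
    mod = byte >> 6              # bits 6-7
    reg = (byte >> 3) & 0b111    # bits 3-5
    rm = byte & 0b111            # bits 0-2
    return mod, reg, rm


def displacement_size_16(mod, rm):
    """Number of displacement bytes that follow Mod R/M (16-bit addressing)."""
    if mod == 0b11:              # register mode: no memory operand
        return 0
    if mod == 0b00:              # no displacement, except direct addressing
        return 2 if rm == 0b110 else 0
    return 1 if mod == 0b01 else 2   # D8 or D16


mod, reg, rm = split_modrm(0x54)         # 01 010 100
print(mod, reg, rm, displacement_size_16(mod, rm))
```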


•The REG field specifies the source or destination register. Its 3-bit code
selects an 8-bit register when W = 0 and a 16-bit (or 32-bit) register when W = 1.
•For certain (often single-operand or immediate-operand) instructions, the
REG field may contain an opcode extension rather than the register bits.
•The R/M field specifies the operand in such cases.
•Depending on the instruction, this can be either the source or the
destination operand.
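The REG codes can be kept in two lookup tables, one for W = 0 and one for W = 1. This is a minimal sketch assuming the standard 8086 register numbering (AL/AX = 000 through BH/DI = 111); `reg_name` is a hypothetical helper, and the 32-bit names (EAX, ECX, ...) would use the same codes as their 16-bit counterparts.

```python
# Index in each list is the 3-bit REG code.
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]    # W = 0
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]   # W = 1


def reg_name(code, w):
    """Return the register selected by a 3-bit REG code and the W bit."""
    return (REG16 if w else REG8)[code]


print(reg_name(0b010, 0), reg_name(0b010, 1))
```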
Segment Override
00 ES
01 CS
10 SS
11 DS
(1 bit) Direction. 1 = Register is Destination, 0 = Register is
source.
MOV 1000 10DW
MOV VAR32, EDX     ; 1000 10 0 1 = 89
MOV BL, DL         ; 1000 10 0 0 = 88
MOV DL, var8       ; 1000 10 1 0 = 8A
MOV BX, DX         ; 1000 10 0 1 = 89
ADD 00|01|02|03 (0000 00DW)
ADD WORD PTR [1009CFEh], CX    ; 0000 00 0 1 = 01
ADD DL, BL                     ; 0000 00 0 0 = 00
ADD CL, BYTE PTR [0900C123h]   ; 0000 00 1 0 = 02
ADD DX, var16                  ; 0000 00 1 1 = 03
SUB 28|29|2A|2B (0010 10DW)
SUB var32, EDX     ; 0010 10 0 1 = 29
SUB DX, CX         ; 0010 10 0 1 = 29
SUB EDX, var32     ; 0010 10 1 1 = 2B
SUB BL, DL         ; 0010 10 0 0 = 28
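The opcode bytes in the MOV, ADD, and SUB examples above all come from OR-ing the D and W bits into a base value. A minimal sketch, where `opcode_byte` and the `BASE` table are hypothetical names and each base is the opcode with D = 0 and W = 0:

```python
BASE = {"MOV": 0x88, "ADD": 0x00, "SUB": 0x28}   # six opcode bits, D = W = 0


def opcode_byte(mnemonic, reg_is_dest, wide):
    """Combine the six opcode bits with the D bit (bit 1) and W bit (bit 0)."""
    return BASE[mnemonic] | (int(reg_is_dest) << 1) | int(wide)


# MOV DL, var8: the register is the destination (D = 1), 8-bit data (W = 0)
print(f"{opcode_byte('MOV', True, False):02X}")
```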
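MOV 1000 10DW
MOV [ESI+32h], DX
1000 10 0 1 | 01 010 100 <- 32
= 66 89 54 32h (66 is the operand-size prefix selecting a 16-bit operand)

MOV [ESI+32h], EDX
1000 10 0 1 | 01 010 100 <- 32
= 89 54 32h

Note: the R/M codes in these examples follow the 16-bit (8086) table, where
100 means [SI]. Under true 32-bit addressing, [ESI] is R/M = 110 and R/M = 100
signals that a SIB byte follows.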
SUB = 0010 10DW
SUB DL, [ESI+1Ch]
SUB [ESI+0FFEEh], BX

SUB [EBX+ESI+70F79h], EDX
0010 10 0 1 | 10 010 000 <- 79 0F 07 00
= 29 90 790F0700
XCHG 1000 011W
XCHG EDX, DWORD PTR [EDI+709C89h]
1000 01 1 1 | 10 010 101 <- 89 9C 70 00
= 87 95 899C7000

XCHG BX, DX
1000 01 1 1 | 11 010 011
= 87 D3 (preceded by the 66 operand-size prefix in 32-bit mode)
MOD and R/M Fields
Register Encoding
Register-Mode Instructions
•In instructions using register operands, the Mod R/M byte contains a 3-bit
identifier for each register operand.
•Bits 6 to 7 are the mod field, which identifies the addressing mode.
•Bits 3 to 5 are the reg field, which identifies the source operand (when D = 0).
•Bits 0 to 2 are the r/m field, which identifies the destination operand (when D = 0).
e.g.
ADD CX, AX
= 00000001 11 000 001
= 01 C1h
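The ADD CX, AX example can be reproduced by packing MOD = 11, the source register code, and the destination register code into one byte. This is a sketch under the D = 0 convention used above; `encode_reg_reg` is a hypothetical helper that takes the opcode with D = W = 0 and sets W from the register size.

```python
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]


def encode_reg_reg(base_opcode, dst, src):
    """Encode 'op dst, src' in register mode with D = 0 (REG holds the source)."""
    table = REG16 if dst in REG16 else REG8
    w = 1 if table is REG16 else 0
    # MOD = 11 (register mode), REG = source, R/M = destination
    modrm = (0b11 << 6) | (table.index(src) << 3) | table.index(dst)
    return bytes([base_opcode | w, modrm])


print(encode_reg_reg(0x00, "CX", "AX").hex(" ").upper())   # ADD CX, AX
```

Mixing an 8-bit and a 16-bit register raises `ValueError`, since both operands must come from the same table.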
Memory Mode Instructions
•Intel assembly language has a wide variety of memory addressing
modes, causing the encoding of the Mod R/M byte to be fairly complex.
•Exactly 256 different combinations of operands can be specified by the
Mod R/M byte.
•The two bits in the Mod column indicate groups of
addressing modes.
• Mod 00, for example, has eight possible R/M values (000 to 111 binary) that
identify operand types listed in the Effective Address column.

•Encode:
1. MOV AX, [SI]
2. MOV [SI], AL
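One way to work the two exercises above is a small encoder for the 16-bit memory modes. The sketch assumes the standard 8086 R/M table ([BX+SI] = 000 through [BX] = 111) and picks the shortest MOD that fits the displacement; `encode_mem` and `RM16` are hypothetical names, and direct addressing is left out.

```python
# 8086 R/M codes for memory operands (MOD = 00, 01 or 10)
RM16 = {"BX+SI": 0, "BX+DI": 1, "BP+SI": 2, "BP+DI": 3,
        "SI": 4, "DI": 5, "BP": 6, "BX": 7}


def encode_mem(opcode, reg, base, disp=0):
    """Encode opcode + Mod R/M (+ displacement) for a [base+disp] operand."""
    rm = RM16[base]
    if disp == 0 and base != "BP":      # MOD 00 with R/M 110 means direct address
        mod, tail = 0b00, b""
    elif -128 <= disp <= 127:           # fits a sign-extended byte (D8)
        mod, tail = 0b01, (disp & 0xFF).to_bytes(1, "little")
    else:                               # 16-bit displacement (D16), low byte first
        mod, tail = 0b10, (disp & 0xFFFF).to_bytes(2, "little")
    return bytes([opcode, (mod << 6) | (reg << 3) | rm]) + tail


print(encode_mem(0x8B, 0, "SI").hex(" ").upper())   # MOV AX, [SI]: D = 1, W = 1
print(encode_mem(0x88, 0, "SI").hex(" ").upper())   # MOV [SI], AL: D = 0, W = 0
```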
Single-Byte Instructions
•The simplest type of instruction is one with either no operand or an
implied operand. Such instructions require only the opcode field, the
value of which is predetermined by the processor’s instruction set.

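• Some frequently used instructions, such as register increments (e.g. INC DX), have
their own single-byte opcodes, so they are optimized for code size and execution speed.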
Single Immediate Data
•When the only operand of an instruction is immediate data, the
machine language code is the opcode followed by that immediate data.
•E.g. RET 8 is C2 08 00, where C2 is the opcode and 0008 is the
immediate data (appended in little-endian order).
•E.g.
PUSH 17097Ch
68 <- 7C 09 17 00
= 68 7C091700     Opcode byte: 68   Data bytes: 7C091700
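Appending an immediate in little-endian order is a one-line operation in most languages. A minimal sketch, where `with_immediate` is a hypothetical helper, C2 is the RET imm16 opcode from the example above, and 68 is used as the PUSH imm32 opcode:

```python
def with_immediate(opcode, value, size):
    """Return the opcode byte followed by 'value' as 'size' little-endian bytes."""
    return bytes([opcode]) + value.to_bytes(size, "little")


print(with_immediate(0xC2, 8, 2).hex(" ").upper())          # RET 8
print(with_immediate(0x68, 0x17097C, 4).hex(" ").upper())   # PUSH 17097Ch
```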
Single Operand (in a Register) Instructions
•The machine language instruction can be obtained by adding the
register number to the opcode byte.
•Example: PUSH CX. The machine instruction is 51. The encoding steps
are as follows:
1. The opcode for PUSH with a 16-bit register operand is 50.
2. The register number for CX is 1, so add 1 to 50, producing opcode
51.
PUSH BX
50 + 03 = 53
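The "opcode plus register number" rule works in both directions, which makes these instructions easy to decode as well. A minimal sketch with hypothetical helper names, assuming the standard 16-bit register numbering:

```python
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]


def push_reg16(name):
    """Encode PUSH reg16 as the single byte 50 + register number."""
    return 0x50 + REG16.index(name)


def decode_push(byte):
    """Decode a byte in the range 50..57 back to its PUSH instruction."""
    if not 0x50 <= byte <= 0x57:
        raise ValueError("not a PUSH reg16 opcode")
    return "PUSH " + REG16[byte - 0x50]


print(f"{push_reg16('CX'):02X}", decode_push(0x53))
```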
Register, Immediate Instructions
•Immediate operands (constants) are appended to instructions in little-endian
order (lowest byte first).
•The encoding format of a MOV instruction that moves an immediate word
into a register is
B8 + rw dw
•where the opcode byte value is B8 + rw, indicating that a register number (0
through 7) is added to B8, and
•dw is the immediate word operand, low byte first.
•For an 8-bit register the format is B0 + rb ib, with a one-byte immediate.

MOV DL, 17h (reg, imm)
B0 + 02 <- 17
= B2 17
MOV reg, imm
•Example: MOV BX, 1. The machine instruction is BB 01 00
(hexadecimal). Here’s how it is encoded:
1. The opcode for moving an immediate value to a 16-bit register is B8.
2. The register number for BX is 3, so 3 is added to B8, producing opcode BB.
3. The immediate operand (0001) is appended to the instruction in
little-endian order (01, 00).

B8 + 03 <- 01 00
= 66 BB 0100 (with the 66 operand-size prefix in 32-bit mode)
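ADD reg16, imm16: opcode 81 (ext 000)
•Unlike MOV, the register number is not added to the opcode. The register goes
in the R/M field with MOD = 11, and the REG field holds the opcode extension.

ADD BX, 7F0Fh
81 | 11 000 011 <- 0F 7F
= 81 C3 0F7F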
•Example: MOV BX, 1234h The machine instruction is BB 34 12. The
encoding steps are as follows:
1. The opcode for moving an immediate value to a 16-bit register is B8.
2. The register number for BX is 3, so add 3 to B8, producing opcode
BB.
3. The immediate operand bytes are 34 12.
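The three encoding steps can be collected into one function. This is a sketch assuming the B8 + rw dw format for 16-bit registers and the analogous B0 + rb ib format for 8-bit registers; `mov_reg_imm` is a hypothetical name.

```python
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]


def mov_reg_imm(reg, value):
    """Encode MOV reg, imm: (B8 + rw) dw for 16-bit, (B0 + rb) ib for 8-bit."""
    if reg in REG16:
        return bytes([0xB8 + REG16.index(reg)]) + value.to_bytes(2, "little")
    return bytes([0xB0 + REG8.index(reg)]) + value.to_bytes(1, "little")


print(mov_reg_imm("BX", 0x1234).hex(" ").upper())   # MOV BX, 1234h
```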
Single Operand (in Memory) Instructions

The REG bits of the Mod R/M byte hold an opcode extension in these instructions.


MOD|REG|R/M
INC var32
FF | 00 000 110
= FF 06h, followed by the address of var32

INC WORD PTR [EDI]
FF | 00 000 101
= FF 05h
MOV mem, imm: opcode C7 (ext 000)
SUB mem, imm: opcode 81 (ext 101)
MOV [EBX], 5879790Ch
C7 | 00 000 111 <- 0C 79 79 58
= C7 07 0C797958

MOV [EBP+ESI], 1234h
C7 | 00 000 010 <- 34 12
= 66 C7 02 3412

SUB [EBX+EDI+76980989h], 1C7C1920h
81 | 10 101 001 <- 89 09 98 76 <- 20 19 7C 1C
= 81 A9 89099876 20197C1Ch
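When REG carries an opcode extension, the Mod R/M byte is built the same way as before, with the extension taking the place of a register code. A minimal sketch; `modrm` and the `EXT` table are hypothetical names, and the table only lists the opcode/extension pairs used in these slides.

```python
# (opcode byte, 3-bit extension placed in the REG field)
EXT = {"INC mem": (0xFF, 0b000),
       "MOV mem, imm": (0xC7, 0b000),
       "SUB mem, imm": (0x81, 0b101)}


def modrm(mod, reg_or_ext, rm):
    """Pack MOD (2 bits), REG or opcode extension (3 bits), and R/M (3 bits)."""
    return (mod << 6) | (reg_or_ext << 3) | rm


opcode, ext = EXT["SUB mem, imm"]
# MOD = 10 and R/M = 001, as in the SUB example above
print(f"{opcode:02X} {modrm(0b10, ext, 0b001):02X}")
```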
Memory, Immediate Instructions

The REG bits of the Mod R/M byte hold an opcode extension in these instructions.


Summary (Formats)
✓CBW
✓RET 8
✓PUSH BX
✓SUB CX, 15
✓ADD CL, CH
✓SUB VAR, BX
✓SUB BX, [101Fh]
✓XOR [DI+07h], DX
✓OR [SI+347Ch], BH
✓POP mem16
✓INC mem8
✓DEC WORD PTR [100Fh]
✓DEC WORD PTR [BX+02h]
✓DEC WORD PTR [DI+767Fh]
✓SUB VAR, 15
✓SUB [SI+1Fh], 15
✓SUB [SI+1F1Fh], 15
