Chapter_Addressing Modes_Instruction Encoding - Chapter
Addressing Modes,
Instruction Encoding,
The mnemonic-opcode, the operand(s), and the remark are separated from
each other by one or more spaces and/or tabs. The following are examples
of assembly language instructions:
You may notice in the above examples that operand(s) are also specified
in symbolic forms and that two operands are separated from each other with
a comma. This chapter first discusses how the operands of assembly
language instructions may be specified. It then describes how they are
encoded in machine language instructions, and how to allocate storage and
define constant data in an assembly language program.
Register Addressing
An instruction is said to use register addressing mode if it specifies an
operation in which data is fetched from or moved into a register. The register
is specified in the instruction by its symbolic name, for example AX, AH,
BX, CL, CS, or SS. The following table illustrates instructions with
operand(s) in register addressing mode.
Instructions    Remarks
MOV AX, CX      two register operands: the first is register AX, and the second is CX
ADD AL, 5       the first operand is a register: register AL
IMUL BX         one register operand: register BX
Immediate Addressing
For certain instructions, the data item to be processed is specified as part of
the instruction. These instructions are said to use immediate addressing, and
the data item is referred to as immediate data. Only one operand of a
two-operand instruction may be immediate data. Immediate data may be specified
in each of the following number systems:
• binary (followed by the suffix "B" or "b"), for example 1101B, -1101b,
or 110B.
• octal (followed by the suffix "O", "o", "Q", or "q"), for example 24o,
24q, or -37q.
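The suffix rules for immediate data can be modelled in a few lines. The following Python sketch is an illustration only and is not part of any assembler or of Debug. The function name parse_immediate is hypothetical; the "h" (hexadecimal) and "d" (decimal) suffixes are taken from examples used elsewhere in this chapter (5Fh, 25d), and a literal without a suffix is assumed to be decimal.

```python
# Radix selected by the one-letter suffix of the literal.
RADIX_BY_SUFFIX = {"b": 2, "o": 8, "q": 8, "d": 10, "h": 16}


def parse_immediate(text):
    """Convert an assembler-style numeric literal to a Python int."""
    literal = text.strip()
    sign = 1
    if literal.startswith("-"):
        sign = -1
        literal = literal[1:]
    suffix = literal[-1].lower()
    if suffix in RADIX_BY_SUFFIX:
        radix, digits = RADIX_BY_SUFFIX[suffix], literal[:-1]
    else:
        # No suffix: assume decimal.
        radix, digits = 10, literal
    return sign * int(digits, radix)


for sample in ["1101B", "-1101b", "24o", "-37q", "5Fh", "25d", "12"]:
    print(sample, "=", parse_immediate(sample))
```

The sketch does not check that a hexadecimal literal starts with a digit, which real assemblers usually require.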
Instructions        Remarks
MOV AX, 5Fh         two operands: the second is the immediate data 5F (in hexadecimal)
ADD AL, -101101b    two operands: the second is the immediate data -101101 (in binary)
RET 4               one immediate data operand: 4 (in decimal)
The five remaining addressing modes are used for instructions that have
a memory location operand. The Intel 8086 processor has two types of
instructions with a memory location operand: some instructions specify
operations in which data is fetched from or moved into a memory location,
and others specify the transfer of control to another instruction. The
remaining addressing modes are used to specify the offset or the
“selector:offset” address of a memory location operand.
Direct Addressing
An instruction is said to use direct addressing mode if the offset of a
memory location is specified in the instruction as an operand. The offset is
specified in the same way as immediate data, but in square brackets, for
example [DS:20F5h], [DS:10110010b], [DS:25d], or [DS:12]. You
precede an offset with a segment register in order to specify the segment in
which the memory location is located. The default segment is the data
segment for most instructions. The following table illustrates instructions
with a memory location operand in direct addressing mode.
Instructions                Remarks
MOV AX, [DS:1Ah]            two operands: the first is register AX, and the second is a
                            word memory location at offset 001Ah in the data segment.
ADD [DS:1Ah], AL            two operands: the second is register AL, and the first is a
                            byte memory location at offset 001Ah in the data segment.
MOV BYTE PTR [DS:1Ah], 5    two operands: the first is a byte memory location at offset
                            001Ah in the data segment, and the second is byte
                            immediate data.
IMUL WORD PTR [DS:1Ah]      one operand: a word memory location at offset 001Ah in
                            the data segment.
Instructions            Remarks
MOV AX, [BX]            two operands: the first is register AX, and the second is a word
                        memory location in the data segment at the offset in register BX.
MOV BYTE PTR [DI], 5    two operands: the first is a byte memory location in the data
                        segment at the offset in register DI, and the second is byte
                        immediate data.
IMUL WORD PTR [SI]      one operand: a word memory location in the data segment at the
                        offset in register SI.
int * pt;
int total = 0;
Instructions                Remarks
MOV AX, [SI + 6]            the second operand is a word memory location in the data
                            segment. Its offset is computed by adding 6 to the contents
                            of register SI.
MOV BYTE PTR [DI - 3], 5    the first operand is a byte memory location in the data
                            segment. Its offset is computed by adding -3 to the contents
                            of register DI.
IMUL WORD PTR [SI] + 4      one operand: a word memory location in the data segment.
                            Its offset is computed by adding 4 to the contents of register
                            SI.
Note that when a data item is in the data segment, its offset can be
specified using either the base relative addressing or the direct indexed
addressing mode. However, we find base relative addressing more
appropriate for the implementation of pointer arithmetic found in high-level
programming languages such as C/C++. For example, in the following
C/C++ code segment, the variable pt could be implemented using either
register BP or register BX, depending on whether the memory location is in
the stack or not; and the expression pt + 2 implemented as either [BP + 2] or
[BX + 2].
Instructions                Remarks
MOV AX, [BP + 6]            the second operand is a word memory location in the stack
                            segment. Its offset is computed by adding 6 to the contents
                            of register BP.
ADD [BX - 9], AL            the first operand is a byte memory location in the data
                            segment. Its offset is computed by adding -9 to the
                            contents of register BX.
MOV BYTE PTR [BX - 3], 5    the first operand is a byte memory location in the data
                            segment. Its offset is computed by adding -3 to the contents
                            of register BX.
IMUL WORD PTR [BP] + 4      one operand: a word memory location in the stack segment.
                            Its offset is computed by adding 4 to the contents of register
                            BP.
i) Without displacement:
[Reg1][Reg2] or [Reg1] + [Reg2] or [Reg1+Reg2]
Where Reg1 is either register BX or register BP, and Reg2 is either register
SI or register DI. Register BP is used instead of register BX only if the
memory location is in the stack.
Instructions                Remarks
MOV AX, [BP][SI] + 6        the second operand is a word memory location in the
                            stack segment. Its offset is computed by adding 6 to the
                            sum of the contents of registers BP and SI.
ADD [BX + DI - 9], AL       the first operand is a byte memory location in the data
                            segment. Its offset is computed by adding -9 to the sum
                            of the contents of registers BX and DI.
MOV BYTE PTR [BX][SI], 5    the first operand is a byte memory location in the data
                            segment. Its offset is the sum of the contents of
                            registers BX and SI.
IMUL WORD PTR [BP + DI]     one operand: a word memory location in the stack
                            segment. Its offset is the sum of the contents of registers
                            BP and DI.
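The offset computations described in the remarks of the preceding tables all follow one pattern: add the contents of an optional base register, an optional index register, and an optional displacement, then pick the default segment. The Python sketch below illustrates that pattern. The function name effective_offset and the register values are hypothetical and chosen only for illustration; the wrap-around to 16 bits reflects the fact that 8086 offsets are 16-bit quantities.

```python
def effective_offset(regs, base=None, index=None, disp=0):
    """Return (default_segment, 16-bit offset) for a memory operand."""
    total = disp
    if base is not None:
        total += regs[base]
    if index is not None:
        total += regs[index]
    # Offsets are 16-bit values, so the sum wraps around at 64 KB.
    offset = total & 0xFFFF
    # BP as the base register selects the stack segment; otherwise the data segment.
    segment = "SS" if base == "BP" else "DS"
    return segment, offset


# Illustrative register contents (not taken from the chapter).
regs = {"BX": 0x0100, "BP": 0x0200, "SI": 0x0010, "DI": 0x0020}

examples = {
    "[1Ah]": effective_offset(regs, disp=0x1A),
    "[SI + 6]": effective_offset(regs, index="SI", disp=6),
    "[BP + 6]": effective_offset(regs, base="BP", disp=6),
    "[BX + DI - 9]": effective_offset(regs, base="BX", index="DI", disp=-9),
    "[BP][SI] + 6": effective_offset(regs, base="BP", index="SI", disp=6),
}
for operand, (segment, offset) in examples.items():
    print(f"{operand:15} -> {segment}:{offset:04X}")
```

The sketch ignores explicit segment overrides such as [DS:1Ah]; it only models the default segment.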
Exercise 4.1
What is the addressing mode of each operand of the following instructions:
MOV [2B5h], CX
MOV 4[DI], BX
MOV [BP][SI], DX
Instructions                 Remarks
MOV [DS:200h], WORD PTR 5    two operands: a word in memory at offset 0200h in the
                             data segment, and word immediate data 0005h.
ADD BYTE PTR [BX], 20        two operands: a byte in memory in the data segment at the
                             offset in register BX, and byte immediate data 14h.
MOV [DS:200h], CX            two operands: a word in memory at offset 0200h in the
                             data segment, and a word in register CX. The WORD PTR
                             operator is not necessary.
MOV BL, 7                    two operands: a byte in register BL and byte immediate
                             data. The BYTE PTR operator is not necessary.
IMUL WORD PTR [SI]           one operand: a word memory location in the data
                             segment at the offset in register SI.
Exercise 4.2
Indicate by T (true) or F (false) whether the size of the following operands must be explicitly
specified using the PTR operator:
Solutions
Exercise 4.3
For some instructions the opcode byte also contains information about a
particular register operand: this register is said to be implied by the opcode
and is not specified in the operand field. The opcodes of a subset of the Intel
8086 processor’s instructions are provided in Appendix 2.
‘mod r/m’ byte: this byte consists of three fields: the mod field (bits
7 and 6), the register field (bits 5, 4, and 3), and the r/m (register/memory)
field (bits 2, 1, and 0), as follows:
  7   6  |  5   4   3  |  2   1   0
   mod   |  register   |     r/m
• the mod and the r/m fields are used to let the CPU know whether or not
there is a memory location operand, and how to get its offset when that is
the case.
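Splitting a ‘mod r/m’ byte into its three fields is a matter of shifting and masking. The following Python sketch shows one way to do it; the function name split_modrm and the sample byte values are hypothetical and are not taken from the exercises.

```python
def split_modrm(byte):
    """Split a 'mod r/m' byte into its (mod, register, r/m) fields."""
    mod = (byte >> 6) & 0b11   # bits 7 and 6
    reg = (byte >> 3) & 0b111  # bits 5, 4 and 3
    rm = byte & 0b111          # bits 2, 1 and 0
    return mod, reg, rm


for value in (0xD8, 0x46):
    mod, reg, rm = split_modrm(value)
    print(f"{value:02X}h -> mod={mod:02b} register={reg:03b} r/m={rm:03b}")
```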
Encoding Registers
Registers are encoded in a machine language instruction using a 3-bit code.
However, the code used to encode a base or an index register depends on
whether or not this register is used in register mode (that means used to hold
data) or in one of the indirect addressing modes. Table 4.1 provides the
codes of all registers (used in register mode), and Table 4.2 provides the
codes of base and index registers when they are used in indirect addressing
modes.
Note that the same code is assigned to two or three registers; the CPU
distinguishes registers with the same code by the context in which they
are used. For example, in some contexts the code 100 refers to register
SP, but in others it refers to register AH. This means that registers SP and AH
cannot both be used in instructions with the same opcode.
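To make the idea of context concrete, the Python sketch below looks up a register from its code and from the kind of register the instruction expects. The code assignments are the standard 8086 ones and are assumed to match Table 4.1; the function name register_name and the kind labels are hypothetical.

```python
# Standard 8086 register codes, indexed by the code value.
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]
SEGREG = ["ES", "CS", "SS", "DS"]

TABLES = {"byte": REG8, "word": REG16, "segment": SEGREG}


def register_name(code, kind):
    """Return the register selected by `code` in the given context."""
    return TABLES[kind][code]


# The same code names a different register in each context.
print(register_name(0b100, "word"))     # SP
print(register_name(0b100, "byte"))     # AH
print(register_name(0b010, "segment"))  # SS
```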
Codes registers
000 [BX + SI]
001 [BX + DI]
010 [BP + SI]
011 [BP + DI]
100 [SI]
101 [DI]
110 [BP]
111 [BX]
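The codes in this table can be combined with the mod field to describe the operand selected by a ‘mod r/m’ byte. The Python sketch below assumes the standard 8086 meaning of the mod field, which is not spelled out in the text above: 00 means no displacement (except that r/m = 110 then means a direct 16-bit offset), 01 means an 8-bit displacement, 10 means a 16-bit displacement, and 11 means a register operand. The function name describe_operand is hypothetical, and the sample bytes are not taken from the exercises.

```python
# r/m codes 000..111 when the operand is in memory.
RM_EXPR = ["BX + SI", "BX + DI", "BP + SI", "BP + DI", "SI", "DI", "BP", "BX"]


def describe_operand(modrm):
    """Describe the operand selected by the mod and r/m fields."""
    mod = (modrm >> 6) & 0b11
    rm = modrm & 0b111
    if mod == 0b11:
        return "register operand (no memory operand)"
    if mod == 0b00 and rm == 0b110:
        # Special case: direct addressing with a 16-bit offset.
        return "[disp16]"
    if mod == 0b00:
        return f"[{RM_EXPR[rm]}]"
    if mod == 0b01:
        return f"[{RM_EXPR[rm]} + disp8]"
    return f"[{RM_EXPR[rm]} + disp16]"


for value in (0x07, 0x46, 0x06, 0x82, 0xC0):
    print(f"{value:02X}h: {describe_operand(value)}")
```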
Exercise 4.4
Following the first two examples, use the mod and the r/m fields of the following ‘mod r/m’ bytes
to indicate whether or not there is a memory location operand. Also specify how to compute the
offset of a memory location operand (note: you do not have to compute the offset) if there is one.
1E
18
C2
17
91
CWD 99h
CBW 98h
RETN C3h
The short form is used when the register operand is a 16-bit register,
whereas the general form is used in any other situation. The short forms are
shown in Figure 4.1, and the more general form is shown in Figure 4.2.
Figure 4.1 Short form of an instruction with a single 16-bit register operand
Two formats are used for the short form: one for segment registers, and another one for
other 16-bit registers.
  7   6   5   4   3  |  2   1   0
        opcode       | register code
Note that the machine language instruction that corresponds to a register may be obtained by
adding to the opcode byte (in which bits 0, 1, and 2 are set to zero) the code of that register.
Using the information in Appendix 2, we have the following machine language instructions:
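As a hedged illustration of the short form for 16-bit general registers, the Python sketch below adds the register code to an opcode byte whose bits 0, 1, and 2 are zero. The opcode bases (40h for INC, 48h for DEC, 50h for PUSH, 58h for POP) and the register codes are the standard 8086 values and are assumed to agree with Appendix 2 and Table 4.1; the function name encode_short_form is hypothetical.

```python
# Standard 8086 codes of the 16-bit general registers.
REG16_CODE = {"AX": 0, "CX": 1, "DX": 2, "BX": 3,
              "SP": 4, "BP": 5, "SI": 6, "DI": 7}

# Opcode bytes with bits 0, 1 and 2 set to zero.
SHORT_FORM_BASE = {"INC": 0x40, "DEC": 0x48, "PUSH": 0x50, "POP": 0x58}


def encode_short_form(mnemonic, register):
    """Return the one-byte machine instruction for `mnemonic register`."""
    return SHORT_FORM_BASE[mnemonic] | REG16_CODE[register]


for mnemonic, register in [("INC", "BX"), ("PUSH", "AX"),
                           ("POP", "DI"), ("DEC", "SI")]:
    print(f"{mnemonic} {register}: {encode_short_form(mnemonic, register):02X}h")
```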
  7   6  |  5   4   3    |  2   1   0
  opcode | register code |   opcode
The 2-bit opcode in bit positions 6 and 7, and the 3-bit opcode in bit positions 0, 1, and 2
are provided in Appendix 2. Using this information, we have the following machine
language instructions:
Exercise 4.5
Provide the machine language instruction that corresponds to each of the following assembly
language instructions:
Exercise 4.6
Provide the machine language instruction that corresponds to each of the following assembly
language instructions:
Exercise 4.7
Provide the machine language instruction that corresponds to each of the following assembly
language instructions:
Note that two encoding patterns are available for some instructions with immediate data:
one in which the immediate data is represented as a byte and another in which it is
represented as a word. Therefore, if the immediate data can be represented as a byte, you
may use either encoding pattern.
Exercise 4.8
Provide the machine language instruction that corresponds to each of the following assembly
language instructions:
e. MOV BYTE PTR [SI], 1Ah
k. MOV WORD PTR [15Bh], 1A1Ch
There are two types of transfer of control instructions: those that only
affect the contents of register IP, and those that affect the contents of both
the code segment register CS and register IP.
<mnemonic-opcode> <offset>
Note that the low and the high bytes of the offset and the segment selector are
swapped in the instruction.
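The byte swapping mentioned in this note is simply little-endian storage: the low byte of each 16-bit value comes first. The Python sketch below illustrates it for a direct intersegment jump, assuming the standard 8086 opcode EAh followed by the offset and then the segment selector. The function name encode_far_jmp and the sample address are hypothetical and are not taken from the exercises.

```python
import struct


def encode_far_jmp(selector, offset):
    """Encode JMP selector:offset as opcode, offset, selector (little-endian)."""
    # "<" selects little-endian; B is one byte, H is a 16-bit word.
    return struct.pack("<BHH", 0xEA, offset, selector)


code = encode_far_jmp(0x2000, 0x0100)
print(code.hex(" ").upper())  # EA 00 01 00 20
```

Note that the offset 0100h appears as 00 01 and the selector 2000h as 00 20.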
Exercise 4.9
Provide the machine language instruction that corresponds to each of the following assembly
language instructions:
a. 016B JB 17Eh
016D
021C
0233
d. JMP 1B3Eh:4Ah
Dx <list-of-values>
A value for the DW, DD, DQ, or DT form is either a positive or negative
integer value (specified either in decimal, hexadecimal, binary, or octal), or
a floating-point value.
A value for a basic form other than DB is stored in memory from the low
byte to the high byte. Some examples are illustrated in Figure 4.5.
For each of the following examples, assume that the directive is specified at offset 0200h.
1. DB 4 Offset: 0200
Contents: 04
Notice in example 6 of Figure 4.5 that a value for the DW form may also
be specified as two characters in single or double quotes. These characters
are represented in ASCII code and stored in memory from the low byte to the
high byte.
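The storage order described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the assembler's actual algorithm: the function name define_bytes is hypothetical, only integer and character values are handled (no floating-point values), and the DT form is left out.

```python
# Size in bytes of one value for each basic form handled here.
SIZES = {"DB": 1, "DW": 2, "DD": 4, "DQ": 8}


def define_bytes(directive, value):
    """Return the bytes stored in memory for one value of a define directive."""
    size = SIZES[directive]
    if isinstance(value, str):
        if directive == "DB":
            # A DB string is stored character by character, in order.
            return value.encode("ascii")
        # For wider forms the characters make up one number whose first
        # character is the high byte, so they appear reversed in memory.
        number = int.from_bytes(value.encode("ascii"), "big")
    else:
        number = value
    # Mask to the operand size so negative values become two's complement.
    number &= (1 << (8 * size)) - 1
    return number.to_bytes(size, "little")


samples = [("DB", "Hi"), ("DW", 0x1234), ("DW", -2),
           ("DW", "AB"), ("DD", 0x12345678)]
for directive, value in samples:
    print(directive, repr(value), "->", define_bytes(directive, value).hex(" ").upper())
```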
Exercise 4.10
Assuming that each of the following define directives is specified at offset 0200h, show the
contents of the memory locations after the assembly of these directives.
Contents:
Contents:
Contents:
Contents:
Contents:
Contents:
If you only want to reserve a memory location for future use in a program,
you may do so by specifying the question mark (?) as a value. The assembler
will initialize the corresponding memory location with zeroes. We therefore
have the following equivalent define directives:
DW ?, 25 is the same as DW 0, 25
However, Debug does not allow the specification of the question mark.
Two or more define directives specified one after another cause the
assembler or Debug to allocate consecutive memory locations as illustrated
in Figure 4.6. Also, with Dx being either DB, DW, DD, DQ, or DT,
Dx L1
Dx L2
...
Dx Ln
Assume that the first define directive is specified at offset 0200, and that the second is
specified immediately after the first, and the third is specified immediately after the
second.
1. DB 4 Offset: 0200
Contents: 04
DUP Operator
The DUP operator is used to repeat the definition of a list of values. Its
syntax is as follows:
Dx <count> DUP( <list-of-values> )
where Dx is either DB, DW, DD, DQ, or DT, and count is the number of
repetitions. For example, the following definitions are equivalent:
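The effect of DUP, of the question mark, and of consecutive define directives can be sketched with a small Python model of the assembler's location counter. The names dup and assemble are hypothetical, None stands for the question mark (stored as zero, as described earlier), and the directives in the usage example are made up for illustration.

```python
# Size in bytes of one value for each basic form handled here.
SIZES = {"DB": 1, "DW": 2, "DD": 4, "DQ": 8}


def dup(count, values):
    """Model `count DUP(values)`: the value list repeated `count` times."""
    return list(values) * count


def assemble(directives, start=0x0200):
    """Return a list of (offset, bytes), one entry per define directive."""
    offset = start
    placed = []
    for name, values in directives:
        size = SIZES[name]
        data = bytearray()
        for value in values:
            number = 0 if value is None else value  # '?' is stored as zero
            number &= (1 << (8 * size)) - 1
            data += number.to_bytes(size, "little")
        placed.append((offset, bytes(data)))
        # The next directive starts right after this one.
        offset += len(data)
    return placed


program = [
    ("DB", [4]),
    ("DW", dup(2, [1, None])),  # same as DW 1, ?, 1, ?
    ("DB", [0xFF]),
]
for offset, data in assemble(program):
    print(f"Offset: {offset:04X}  Contents: {data.hex(' ').upper()}")
```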
Exercise 4.11
Assuming that the first define directive is specified at offset 0200, and that each subsequent
directive is specified after the previous one, show the contents of the memory locations after the
processing of the following directives by the assembler or Debug.
Contents:
Contents:
3. DW 3 DUP( 0 ) Offset:
Contents:
Contents:
5. DW 34Fh Offset:
Contents:
6. DW -2Fh, 5 Offset:
Contents:
Chapter 4: Exercises
DS: 1000; SS: 2000; BX: 0A00; BP: 0B00; DI: 0050; SI: 0070
3. For each of the following ‘mod r/m’ bytes, indicate whether or not there is an operand in memory.
If there is an operand in memory, also indicate how to compute the offset of that memory
location (note: you do not have to compute the offset).
4. Using the information in Appendix 2, specify the machine language instruction that corresponds
to each of the following assembly language instructions:
0122 015C
5. Assuming that each of the following define directives is specified at offset 0200, show the
contents of the memory locations after they have been processed by the assembler.
Instructions                 Remarks
ADD BX, 25                   First operand is a register; second operand is immediate.
MOV CX, DX                   First and second operands are registers.
SUB CX, [10Ah]               First operand is a register; second operand is in a memory
                             location with offset in direct addressing mode.
MOV WORD PTR [BX], 102Fh     First operand is in a memory location with offset in
                             register indirect addressing mode; second operand is
                             immediate.
IMUL WORD PTR 5[DI]          One memory location operand with offset in direct
                             indexed addressing mode.
MOV AX, 10[BP]               First operand is a register; second operand is in a memory
                             location with offset in base relative addressing mode.
ADD [BX + SI], DX            First operand is in a memory location with offset in base
                             indexed addressing mode; second operand is a register.
MOV 54[BX][DI], CX           First operand is in a memory location with offset in base
                             indexed addressing mode; second operand is a register.
2.
4.
5.
DB ‘Dr. Mark’         44 72 2E 20 4D 61 72 6B
DB ‘Math 234’, 27     4D 61 74 68 20 32 33 34 1B
DW 136, -71           88 00 B9 FF
DW 8Ah, ‘CK’          8A 00 4B 43
DW 4 DUP ( ? )        00 00 00 00 00 00 00 00
DD 1A2B3C4Dh          4D 3C 2B 1A
DD 62B4h              B4 62 00 00
DQ 1B35Eh             5E B3 01 00 00 00 00 00