
1 CPE 413 Overview of x86 Architecture-1

Assembly language

CPE 413 ASSEMBLY LANGUAGE PROGRAMMING

Prof. Christopher U. Ngene email: [email protected] 1


What is in a Computer?
• The field of Computer Architecture is about the
fundamental structure of computer systems
• What are the components?
• How are they interconnected?
• How fast does the system operate?
• What is the power consumption?
• How much does it all cost?
• What architecture leads to the “best” trade-offs?
• The conceptual model for computer architecture that has
been in use since 1945 is the von Neumann architecture



Instructions?
• Whenever somebody builds a CPU they first define
what instructions the CPU will know how to decode
and execute

• This is called the Instruction Set Architecture (ISA)

• The ISA for a Pentium is different from the ISA for a
PowerPC, for instance

• The ISA is described in (lengthy) documentation that
describes everything that one can do with the CPU

• Every instruction lasts some number of clock cycles



Instructions…
• Instructions are encoded in binary machine code
• E.g.: 01000110101101 may mean “perform an addition of two
registers and store the results in another register”
• The CPU is built using gates (or, and, etc.) which themselves
use transistors
• These gates implement instruction decoding
• Based on the bits of the instruction code, several signals are sent to
different electronic components, which in turn perform useful tasks
• Typically, an instruction consists of two parts
• The opcode: what the instruction computes
• The operands: the input to the computation

[Figure: the example bit pattern 0 1 0 0 0 1 1 0 1 0 1 1 0 1 divided into an opcode field followed by an operands field]
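The opcode/operand split can be sketched in a few lines of Python. The field layout below (a 5-bit opcode followed by three 3-bit register numbers) is an assumption made up for illustration; the slide does not say where the opcode ends, and this is not the encoding of any real CPU.

```python
# Hypothetical 14-bit instruction format (an assumption, not a real ISA):
#   bits 13..9  opcode (5 bits)
#   bits  8..6  destination register
#   bits  5..3  first source register
#   bits  2..0  second source register
def decode(word):
    """Split a 14-bit instruction word into (opcode, dest, src1, src2)."""
    opcode = (word >> 9) & 0b11111
    dest = (word >> 6) & 0b111
    src1 = (word >> 3) & 0b111
    src2 = word & 0b111
    return opcode, dest, src1, src2

# The bit pattern used on the slide
print(decode(0b01000110101101))
```

In hardware the same field extraction is done by wiring groups of instruction bits to the decoding gates rather than by shifting and masking.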
Assembly Language
• It’s really difficult for humans to read/remember binary
instruction encodings
• We will see that typically one would use hexadecimal encoding,
but still difficult.
• Therefore it is typical to use a set of mnemonics, which
form the assembly language
• It is often said that the CPU understands assembly language
• This is not technically true, as the CPU understands machine code,
which we, as humans, choose to represent using assembly
language
• An assembler transforms assembly code into machine
code



Assembly Language…
• It used to be that all computer programmers did all day
was to write assembly code
• This was difficult for many reasons
• Difficult to read
• Very difficult to debug
• Different from one computer to another!
• The use of assembly language for all programming
prevented the (sustainable) development of large
software projects involving many programmers
• This is the main motivation for the development of high-
level languages
• FORTRAN, Cobol, C, etc.



Assembly Language…
• Low-level language
• Each instruction performs a much lower-level task
compared to a high-level language instruction
• One-to-one correspondence between assembly
language and machine language instructions
• For most assembly language instructions, there is a
machine language equivalent
• Assembler translates assembly language instructions to
machine language instructions
• Directly influenced by the instruction set and
architecture of the processor (CPU)



Assembly Language…
• Some example assembly language
instructions:
inc result
mov class_size, 45
and mask1, 128
add marks, 10
• Some points to note:
• Assembly language instructions are cryptic
• Mnemonics are used for operations
– inc for increment, mov for move (i.e., copy)
• Assembly language instructions are low level
– Cannot write instructions such as mov marks, value (both operands are in memory)



Assembly Language…
• Some simple high-level language instructions can be
expressed by a single assembly instruction
Assembly Language          C
inc  result                result++;
mov  class_size, 45        class_size = 45;
and  mask1, 128            mask1 &= 128;
add  marks, 10             marks += 10;



Assembly Language…
• Most high-level language instructions need more than
one assembly instruction

C                          Assembly Language
size = value;              mov AX, value
                           mov size, AX
sum += x + y + z;          mov AX, sum
                           add AX, x
                           add AX, y
                           add AX, z
                           mov sum, AX
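The five-instruction sequence for sum += x + y + z can be traced step by step. The sketch below is only a model: memory is a Python dictionary and AX is a local variable, both assumptions made for illustration.

```python
MASK16 = 0xFFFF  # AX is a 16-bit register, so results wrap at 16 bits

def add_three_to_sum(memory):
    """Mimic: mov AX,sum / add AX,x / add AX,y / add AX,z / mov sum,AX."""
    ax = memory["sum"]                      # mov AX, sum
    for name in ("x", "y", "z"):
        ax = (ax + memory[name]) & MASK16   # add AX, <name>
    memory["sum"] = ax                      # mov sum, AX
    return memory

mem = add_three_to_sum({"sum": 100, "x": 1, "y": 2, "z": 3})
print(mem["sum"])
```

Every value passes through AX because the instructions used here take at most one memory operand.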



Assembly Language…
• Readability of assembly language instructions is
much better than the machine language instructions
• Machine language instructions are a sequence of 1s and
0s



The Pros and Cons of ALP
• Cons
• It's difficult
• Error prone
• Hard to debug
• Takes a lot of time to develop
• Portability issues

• Pros
• Assembly can be fast (time efficiency). Carefully hand-tuned assembly can
run faster than the code a compiler generates, although modern optimizing
compilers often come close.
• Assembly is closer to the machine level than any other language because
assembly language instructions map 1-1 to machine
instructions.
• Assembly code can be smaller (space efficiency) than the code a compiler
generates.
• In assembly, we can do things that we cannot do directly in most higher-level
languages, such as manipulating processor flags.



Typical Applications
• Applications that need one of the three
advantages of assembly language
• Time-efficiency
• Time-convenience
• Good to have but not required for functional correctness
• Graphics
• Time-critical
• Necessary to satisfy functionality
• Real-time applications
• Aircraft navigational systems
• Process control systems
• Robot control software
• Missile control software
Typical Applications…
• Accessibility to system hardware
• System software typically requires direct control of
the system hardware devices
• Assemblers, linkers, compilers
• Network interfaces, device drivers
• Video games

• Space-efficiency
• Not a big plus point for most applications
• Code compactness is important in some cases
• Portable and hand-held device software
• Spacecraft control software



High-Level Languages
• Having high-level languages is good, but CPUs do not
understand them
• Therefore, there needs to be a translation from a high-level
language to machine code
• There are two ways to run a high-level language on a CPU that
only understands machine code:
• Interpretation: An interpreter is a program that reads in high-
level code and simulates a computer that understands high-
level code
• Compilation: A compiler is a program that reads in high-level
code and produces equivalent machine code, which can then
be executed on the CPU at a later time
• Some languages are interpreted, some are compiled, some
are both or hybrid
The Simplified Picture

[Figure: high-level code (a C fragment) is translated by the COMPILER into assembly code (a MIPS fragment), which the ASSEMBLER translates into machine code. The machine code runs on the CPU, shown with its program counter, registers, control unit, and ALU.]


The Simplified Picture

[Figure: the same picture with a second input path. Hand-written assembly code, like compiler-generated assembly code, goes through the ASSEMBLER to become machine code, which runs on the CPU (program counter, registers, control unit, ALU).]



The 80x86 Architecture
• To learn assembly programming we need to pick a
processor family with a given ISA (Instruction Set
Architecture)

• We will pick the Intel 80x86 ISA (x86 for short) – CISC
machines
• The most common today in existing computers
• For instance in my laptop

• You have already studied MIPS, which is an example of a
RISC architecture



The 80x86 Architecture…

• 8086 in 1978
• 20-bit address bus (1MB)
• 16-bit data bus
• No floating-point coprocessor

• 80286 in 1982
• 24-bit address bus (16MB)
• 16-bit data bus
• No floating-point coprocessor
• Memory protection capabilities

• 80386 in 1985
• First 32-bit processor
• 32-bit address bus (4GB)
• 32-bit data bus
• No floating-point coprocessor

• 80486 in 1989
• Improved version of 386
• 32-bit address bus (4GB)
• 32-bit data bus
• Includes floating-point coprocessor and internal cache



The 80x86 Architecture…
• Pentium in 1993
• Enhanced version of 486
• 32-bit address bus (4GB)
• 64-bit data bus
• All internal registers are 32-bit wide
• Still a 32-bit processor
• All instructions operate on at most 32-bit operands
• Several versions
• Some versions are stripped down for the under-$1000
PC market
• Some enhanced with MMX technology



The 8086 Registers
• To write assembly code for an ISA you must know the
names of the registers
• Because registers are places in which you put data to perform
computation and in which you find the result of the computation
(think of them as variables for now)
• The registers are identified by binary numbers, but assembly
languages give them “easy-to-remember” names
• The 8086 offered 16-bit registers
• Four general purpose 16-bit registers
• AX
• BX
• CX
• DX



General-Purpose Registers
• AX, BX, CX, and DX: They can be assigned any
value you want.
• AX (accumulator register). Most arithmetic operations are
done with AX.

• BX (base register). Used for array operations. BX usually
holds a base address and works with index registers such as SI
and DI to address memory.

• CX (counter register). Used as a counter, for example in loops.

• DX (data register). Used for storing data values.



The 8086 Registers…
AX BX CX DX

AH AL BH BL CH CL DH DL

• Each of the 16-bit registers consists of 8 “low bits”
and 8 “high bits”
• Low: least significant
• High: most significant
• The ISA makes it possible to refer to the low or high bits
individually
• AH, AL
• BH, BL
• CH, CL
• DH, DL



The 8086 Registers…
AX BX CX DX

AH AL BH BL CH CL DH DL

• The xH and xL registers can be used as 1-byte
registers to store 1-byte quantities

• Important: both are “tied” to the 16-bit register
• Changing the value of AX will change the values of AH
and AL
• Changing the value of AH or AL will change the value of
AX
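The way AH and AL are "tied" to AX can be modeled with a small class. This is only a sketch: the class name Reg16 and its attribute names x, h and l are made up for illustration.

```python
class Reg16:
    """Model of one 16-bit register (e.g. AX) with its high and low bytes."""

    def __init__(self, value=0):
        self.x = value & 0xFFFF          # the full 16-bit value (AX)

    @property
    def h(self):                         # high byte (AH): bits 8..15
        return self.x >> 8

    @h.setter
    def h(self, value):
        self.x = ((value & 0xFF) << 8) | (self.x & 0x00FF)

    @property
    def l(self):                         # low byte (AL): bits 0..7
        return self.x & 0xFF

    @l.setter
    def l(self, value):
        self.x = (self.x & 0xFF00) | (value & 0xFF)

ax = Reg16(0x1234)
print(hex(ax.h), hex(ax.l))   # the two halves of AX
ax.l = 0xFF                   # like mov AL, 0FFh
print(hex(ax.x))              # AX changed as well
```

There is only one stored value; the halves are views onto it, which is why a write to either half shows up in the whole register.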



Index Registers
• SI and DI: Usually used to process arrays or
strings:
• SI (source index) conventionally points to the
source array

• DI (destination index) conventionally points to
the destination array.



Segment Registers
• CS, DS, ES, and SS:
• CS (code segment register). Points to the segment of the
running program. We may NOT modify CS directly.

• DS (data segment register). Points to the segment of the
data used by the running program. You can point it
anywhere you want as long as the segment contains the desired data.

• ES (extra segment register). Usually used with DI for
pointer operations. The pairs DS:SI and ES:DI are commonly
used for string operations.

• SS (stack segment register). Points to the stack segment.



Pointer Registers
• BP, SP, and IP:

• BP (base pointer) is used to address the stack space reserved for
local variables.

• SP (stack pointer) points to the top of the current stack.

• IP (instruction pointer) holds the offset of the next instruction
of the running program. It is always coupled with CS, so the pair
CS:IP is a pointer to the next instruction to execute.
• You can NOT modify IP directly; it changes only as a side effect
of instructions such as jumps, calls, and returns
• Typically not handled directly when writing assembly code



Extended Registers
• The 80386 processor introduced extended registers.

• Most of the registers, except the segment registers, are
extended to 32 bits.

• So, we have extended registers EAX, EBX, ECX, and so on.

• AX is only the low 16 bits (bits 0 to 15) of EAX.

• There is NO special direct access to the upper 16 bits (bits
16 to 31) of an extended register.
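The relationship between EAX and AX can be sketched with plain integer arithmetic. The function names below are made up for illustration. Since the upper half has no name of its own, the usual way to reach it is to shift the full 32-bit value.

```python
def write_ax(eax, value):
    """Writing AX replaces bits 0..15 of EAX and leaves bits 16..31 alone."""
    return (eax & 0xFFFF0000) | (value & 0xFFFF)

def upper16(eax):
    """Bits 16..31 have no register name, so shift them down to read them."""
    return (eax >> 16) & 0xFFFF

eax = 0x12345678
print(hex(write_ax(eax, 0xABCD)))  # low half replaced, high half kept
print(hex(upper16(eax)))           # the high half, obtained by shifting
```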



16-bit Flags Register
• The 16-bit FLAGS register
• FLAGS is a 16-bit register that contains the processor status.
• It holds values that programmers may need to access, for example to
detect whether the last arithmetic operation produced a zero result or
an overflow.
• Intel doesn't provide direct access to it; rather, it is accessed via
the stack (via POPF and PUSHF).
• You can access each flag by using a bitwise AND operation,
since each status is mostly represented by just 1 bit.
• Information is stored in individual bits of the FLAGS register
• Whenever an instruction is executed and produces a result, it may
modify some bit(s) of the FLAGS register
• Example: Z (or ZF) denotes one bit of the FLAGS register, which is set
to 1 if the previously executed instruction produced 0, or 0 otherwise
• We’ll see many uses of the FLAGS register



Flag Register…
• C carry flag: set to 1 whenever the last arithmetic
operation, such as an addition or subtraction, produced a carry or
borrow; otherwise 0.
• P parity flag: set to 1 if the low byte of the result of the last
operation contains an even number of 1 bits.
• A auxiliary carry flag: set when there is a carry or borrow between
the low and high nibbles; used in Binary Coded Decimal (BCD) operations.
• Z zero flag: used to detect whether the last operation
produced a zero result.
• S sign flag: used to detect whether the last operation
produced a negative result. It is set to 1 if the highest bit (bit 7
in bytes or bit 15 in words) of the result is 1.



Flag Register…
• T trap flag: used by debuggers to turn on the step-by-step
feature.
• I interrupt flag: enables or disables interrupts. If the bit is
set (= 1), interrupts are enabled; otherwise they are
disabled. The default is on.
• D direction flag: sets the direction of string
operations. If the bit is set, all string operations
are done backward; otherwise, forward. The default
is forward (= 0).
• O overflow flag: used to detect whether the result of the last
arithmetic operation overflowed. If
the bit is set, there has been an overflow.
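To make the status flags concrete, the sketch below computes the six arithmetic flags for an 8-bit addition. It is a model written for illustration, not processor code; the function name add8_flags is made up, and only addition is covered.

```python
def add8_flags(a, b):
    """Add two 8-bit values; return (result, flags) for the six status flags."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    flags = {
        "CF": int(total > 0xFF),                         # carry out of bit 7
        "PF": int(bin(result).count("1") % 2 == 0),      # even number of 1 bits
        "AF": int((a & 0xF) + (b & 0xF) > 0xF),          # carry out of bit 3
        "ZF": int(result == 0),                          # result is zero
        "SF": result >> 7,                               # copy of the highest bit
        "OF": int((a ^ result) & (b ^ result) & 0x80 != 0),  # signed overflow
    }
    return result, flags

print(add8_flags(0xFF, 0x01))  # unsigned wrap-around: carry, zero result
print(add8_flags(0x7F, 0x01))  # 127 + 1 as signed bytes: overflow
```

The two calls show why carry and overflow are separate flags: the first addition is wrong only when read as unsigned, the second only when read as signed.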



The 8086 Registers
All registers are 16 bits wide:
AX = AH AL
BX = BH BL
CX = CH CL
DX = DH DL
SI, DI
BP, SP
IP
FLAGS
CS, DS, SS, ES

[Figure: the 8086 register set drawn alongside the Control Unit and the ALU]
The 80386 Registers
The 80386 was the first 32-bit processor of the x86 family.

32-bit registers (16-bit part in parentheses):
EAX (AX = AH AL)
EBX (BX = BH BL)
ECX (CX = CH CL)
EDX (DX = DH DL)
ESI (SI)
EDI (DI)
EBP (BP)
ESP (SP)
EFLAGS (FLAGS)
EIP (IP)

16-bit segment registers:
CS, DS, ES, FS, GS, SS
The 80386 Registers…
• EAX, EBX, ECX and EDX are called data or general
purpose registers.
• (E is for extended, as they are 32-bit extensions of their 16-bit
counterparts AX, BX, CX and DX in the 16-bit ISA).
• The register EAX is also known as the accumulator
because it is used as the destination in many arithmetic
operations.
• Some instructions generate more efficient code if
they reference the EAX register rather than other
registers.
• Bits in a register are conventionally numbered from
right to left, beginning with 0 at the least significant (rightmost) bit.
80386 Registers…
• Although they are called general purpose registers, only the ones marked
with a * should be used in Windows programming
• Windows uses ESI, EDI, ESP and EBP internally and it doesn't expect the values in those
registers to change.
• This doesn't mean that you cannot use those four registers; you can. Just be sure to
restore them before passing control back to Windows.

Note!! Win32 programs (80386 processors and above) use a flat memory model, so you do not work with segments directly
80386 Registers…
• There are six 16-bit segment registers. They define segments in
memory: CS, DS, ES, FS, GS, and SS

• Lastly, there are two 32-bit registers that don't fit into any category:
EFLAGS and EIP



80386 EFLAGS

CF    Carry Flag               IOPL  I/O Privilege Level
PF    Parity Flag              NT    Nested Task
AF    Auxiliary Carry Flag     RF    Resume Flag
ZF    Zero Flag                VM    Virtual-8086 Mode
SF    Sign Flag                AC    Alignment Check
TF    Trap Flag                VIF   Virtual Interrupt Flag
IF    Interrupt Enable Flag    VIP   Virtual Interrupt Pending
DF    Direction Flag           ID    Identification Flag
OF    Overflow Flag
Memory

• Memory can be viewed as an ordered sequence of bytes

• Each byte of memory has an address
• Memory address is essentially the sequence number of the byte
• Such memories are called byte addressable
• The number of address lines determines the memory
address space of a processor
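The link between address lines and address space is a power of two: n lines can carry 2^n distinct addresses. A minimal sketch, using the bus widths quoted earlier for the 8086, 80286 and 80386 (the function name is made up for illustration):

```python
def address_space_bytes(address_lines):
    """Each address line carries one bit, so n lines give 2**n addresses."""
    return 2 ** address_lines

for cpu, lines in (("8086", 20), ("80286", 24), ("80386", 32)):
    size = address_space_bytes(lines)
    print(cpu, lines, "lines:", size, "bytes =", size // 2**20, "MB")
```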



Memory…
• Two basic memory operations
• Read operation (read from memory)
• Write operation (write into memory)
• Access time
• Time needed to retrieve data at addressed location
• Cycle time
• Minimum time between successive operations



Memory - Read Cycle
• Steps in a typical read cycle
1. Place the address of the location to be read on the address
bus
2. Activate the memory read control signal on the control bus
3. Wait for the memory to retrieve the data from the addressed
memory location
4. Read the data from the data bus
5. Drop the memory read control signal to terminate the read
cycle
• A simple Pentium memory read cycle takes 3 clocks
• Steps 1&2 and 4&5 are done in one clock cycle each
• For slower memories, wait cycles will have to be inserted



Memory – Write Cycle
• Steps in a typical write cycle
1. Place the address of the location to be written on the address
bus
2. Place the data to be written on the data bus
3. Activate the memory write control signal on the control bus
4. Wait for the memory to store the data at the addressed
location
5. Drop the memory write control signal to terminate the write
cycle

• A simple Pentium memory write cycle takes 3 clocks
• Steps 1&3 and 4&5 are done in one clock cycle each
• For slower memories, wait cycles will have to be inserted



Memory Architecture: 80386 and Higher Processors
• Two modes
• Real mode
• Uses 16-bit addresses
• Supports segmented memory architecture
• Provides backward compatibility
• To run 8086 programs

• Protected mode
• Native mode of the 80386 and higher
• Uses 32-bit addresses
• Supports segmentation and paging
• Paging is useful to implement virtual memory



Real Mode Memory Architecture
• Real mode memory architecture
• Pentium operates like a faster 8086 processor
• The 8086 has
• 20 address lines (can address 1 MB)
• All registers are 16-bit wide
• Memory is organized as segments of (up to) 64 KB
• Due to 16-bit registers (2^16 = 64 K)
• Two components are required to specify a location
• Segment base address (segment start address)
• Offset within a segment
• Both are 16-bit numbers



Protected Mode Architecture
• In protected mode, the Pentium supports
• Logical addresses that consist of
• 16-bit segment selector (CS, SS, DS, ES, FS, GS)
• 32-bit offset (EIP, ESP, EBP, ESI, EDI, EAX, EBX, ECX, EDX)
• More sophisticated segmentation
• Segmentation can be made invisible (flat model)
• Paging for virtual memory
• Paging can be turned off



Addresses in Memory
• In the 8086 processor, a program is limited to referencing an address
space of size 1MB, that is, 2^20 bytes
• A 20-bit processor uses 20-bit addresses and thus can access 2^20 B = 1MB of
physical memory.
• Depending on the machine, a processor can access one or more
bytes from memory at a time.
• The number of bytes accessed simultaneously from main memory is
called the word length of the machine.
• Generally, all machines are byte-addressable, i.e., every byte stored in
memory has a unique address.
• However, the word length of a machine is typically some integral multiple
of a byte.
• Therefore, the address of a word must be the address of one of its
constituent bytes.
• In this regard, one of the following methods of addressing (also
known as byte ordering) may be used.
Addresses in Memory…
• Big Endian – the most significant byte is stored at the lowest memory address (i.e. Big
Byte first). MIPS, Apple, Sun SPARC are some of the machines in this class.
• Little Endian – the least significant byte is stored at the lowest memory address (i.e. Little
Byte first). Intel’s machines use little endian.
• Consider, for example, storing 0xA2B1C3D4 in main memory. The two byte
orderings are illustrated in Fig. 1-2.
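The two byte orderings for 0xA2B1C3D4 can also be listed with Python's built-in int.to_bytes. This is a small sketch; the helper name byte_layout is made up for illustration, and position 0 in each list stands for the lowest memory address.

```python
def byte_layout(value, order):
    """Return the 4 bytes of value as hex strings, lowest address first."""
    return [f"{b:02X}" for b in value.to_bytes(4, order)]

print("big endian:   ", byte_layout(0xA2B1C3D4, "big"))
print("little endian:", byte_layout(0xA2B1C3D4, "little"))
```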



Data alignment requirements for byte
addressable memories
• Data alignment requirements for byte
addressable memories
• 1-byte data
• Always aligned
• 2-byte data
• Aligned if the data is stored at an even address (i.e., at
an address that is a multiple of 2)
• 4-byte data
• Aligned if the data is stored at an address that is a
multiple of 4
• 8-byte data
• Aligned if the data is stored at an address that is a
multiple of 8
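All of the alignment rules above reduce to one test: the address must be a multiple of the data size. A minimal sketch (the function name is made up for illustration):

```python
def is_aligned(address, size):
    """A size-byte item is aligned if its address is a multiple of size."""
    return address % size == 0

print(is_aligned(0x1003, 1))  # 1-byte data is always aligned
print(is_aligned(0x1002, 2))  # even address: aligned for 2-byte data
print(is_aligned(0x1002, 4))  # not a multiple of 4: unaligned for 4-byte data
print(is_aligned(0x1008, 8))  # multiple of 8: aligned for 8-byte data
```

Because the sizes are powers of two, the same check is often written as a bit test: (address & (size - 1)) == 0.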
Addressing Space
• One says that a running program has a 1MB address space

• And the program needs to use 20-bit addresses to
reference memory content
• Instructions, data, etc.

• Problem: registers are 16 bits long! How can they hold a
20-bit address in the 8086 processor?

• The solution: split addresses in two pieces:
• The selector
• The offset



Simple Selector and Offset
• Let us assume that we have an address space of size
2^4 = 16 bytes
• Yes, that would not be a useful computer

• Addresses are 4 bits long

• Let’s assume we have a 2-bit selector and a 2-bit offset
• As if our computer had only 2-bit registers

• We take such small numbers because it’s difficult to draw
pictures with 2^20 bytes!



Selector and Offset Example

[Figure: a 4-bit address split into a 2-bit selector and a 2-bit offset, next to the 16 bytes of memory at addresses 0000 to 1111. With the selector fixed at 01, the address has the form 01xx.]

For a fixed value of the selector there are 2^2 = 4
addressable bytes of memory.
The set of these bytes is
called a memory segment.
Selector and Offset Example

[Figure: the same 16 bytes of memory (addresses 0000 to 1111), with both the selector and the offset left free (address xxxx).]

We have 16 bytes of memory
We have 4-byte segments
We have 4 segments
Selector and Offset
• The way in which one addresses the memory content is
then pretty straightforward
• First, set the bits of the selector to “pick” a segment
• Second, set the bits of the offset to address a byte within
the segment
• This all makes sense because a program typically
addresses bytes that are next to each other, that is within
the same segment
• So, the selector bits stay the same for a long time, while
the offset bits change often
• Of course, this isn’t true for tiny 4-byte segments as in our
example…
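The toy machine with 4-bit addresses, a 2-bit selector and a 2-bit offset is small enough to enumerate completely. The sketch below is only an illustration; the function names are made up.

```python
OFFSET_BITS = 2  # the toy machine: 2-bit selector, 2-bit offset

def split_address(address):
    """Split a 4-bit address into (selector, offset)."""
    selector = address >> OFFSET_BITS
    offset = address & ((1 << OFFSET_BITS) - 1)
    return selector, offset

def segment_bytes(selector):
    """List every address in the segment picked by this selector."""
    base = selector << OFFSET_BITS
    return [base + offset for offset in range(1 << OFFSET_BITS)]

print(split_address(0b0110))                          # selector 01, offset 10
print([f"{a:04b}" for a in segment_bytes(0b01)])      # the 01xx segment
```

Walking through consecutive addresses changes the offset on every step but the selector only once every four steps, which is the locality argument made above.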



Selector and Offset: For 20-bit Addresses

[Figure: a 20-bit address split into a 4-bit selector and a 16-bit offset]

• On the 8086 the offset is 16 bits long
• And therefore the selector is 4 bits
• We have 2^4 = 16 different segments
• Each segment is 2^16 bytes = 64KB
• For a total of 1MB of memory, which is what the 8086
used



Selector and Offset: For 20-bit Addresses…

[Figure: a 20-bit address split into a 4-bit selector and a 16-bit offset, next to 1MB of memory divided into 16 segments whose addresses begin with 0000… through 1111…]

We have 1MB of memory
We have 64KB segments
We have 16 segments



The 8086 Selector Scheme
• So far we’ve talked about the selector as a 4-bit quantity, for
simplicity
• This leads to 16 non-overlapping segments
• The designers of the 8086 wanted more flexibility
• E.g., if you know that you need only an 8K segment, why use
64K for it? Just have the “next” segment start 8K after the
previous segment
• We’ll see why segments are needed in a little bit
• So, for the 8086, the selector is NOT a 4-bit field, but rather the
address of the beginning of the segment
• But now we’re back to our initial problem: Addresses are 20-bit,
how are we to store an address in a 16-bit register???



The 8086 Selector Scheme…
• What the designers of the 8086 did is pretty simple
• Enforce that the beginning address of a segment can
only be a multiple of 16
• Therefore, its representation in binary always has its
four lowest bits set to 0
• Or, in hexadecimal, its last digit is always 0
• So the address of a beginning of a segment is a 20-
bit hex quantity that looks like: XXXX0
• Since we know the last digit is always 0, no need to
store it
• Therefore, we need to store only 4 hex digits which, lo and
behold, fits in a 16-bit register!
The 8086 Selector Scheme…
• So now we have two 16-bit quantities
• The 16-bit selector
• The 16-bit offset
• The selector must be stored in one of the “segment” registers
• CS, DS, SS, ES
• The offset is typically stored in one of the “index” registers
• SI, DI
• But could be stored in a general purpose register
• Address computation is straightforward
• Given a 16-bit selector and a 16-bit offset, the 20-bit address is
computed as follows
• Multiply the selector by 16
• This simply transforms XXXX into XXXX0, thanks to the beauty of hexadecimal
• Add the offset
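The address computation just described is a shift and an add. A minimal sketch follows; the function name is made up for illustration, and the final mask models the 8086 having only 20 address lines.

```python
def physical_address(selector, offset):
    """8086 real-mode address: selector * 16 + offset, kept to 20 bits."""
    return ((selector << 4) + offset) & 0xFFFFF  # << 4 turns XXXX into XXXX0

# Selector 10DE with offset 2FFE
print(hex(physical_address(0x10DE, 0x2FFE)))
```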



Exercise
• Consider the byte at address 13DDE within a 64K
segment defined by selector value 10DE. What is its
offset?

• 13DDE = 10DE * 16 + offset (the 16 is decimal)
• 13DDE = 10DE0 + offset
• offset = 13DDE - 10DE0
• offset = 2FFE (a 16-bit quantity)



CODE, DATA and STACK Segments

• Although we’ll discuss these at length later, let’s
just accept for now that the address space has
three regions: code, data, and stack
• A program constantly references all three
regions
• Therefore, the program constantly references
bytes in three different segments
• For now let’s assume that each region is fully
contained in a single segment, which is in fact
not always the case
• CS: points to the beginning of the code
segment
• DS: points to the beginning of the data segment
• SS: points to the beginning of the stack
segment

[Figure: the address space divided into code, data, and stack regions]
The Problem with Segments
• It is well-known that programming with segmented
architectures is really a pain
• Don’t panic, for our purposes we won’t really suffer from
this (too much)
• You constantly have to make sure segment registers are set
up correctly
• What happens if you have data/code that’s more than 64K?
• You must then switch back and forth between selector
values, which can be really awkward
• Something that can cause complexity also is that two
different (selector, offset) pairs can reference the same
address
• Example: (a,b) and (a-1, b+16)
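The aliasing in the (a,b) and (a-1, b+16) example can be checked by brute force. The sketch below lists every (selector, offset) pair that reaches a given address, using the address from the earlier exercise. The function name is made up for illustration, and wrap-around past 1MB is ignored.

```python
def aliases(address):
    """All (selector, offset) pairs with selector * 16 + offset == address."""
    pairs = []
    for selector in range(0x10000):          # every 16-bit selector value
        offset = address - selector * 16
        if 0 <= offset <= 0xFFFF:            # offset must fit in 16 bits
            pairs.append((selector, offset))
    return pairs

pairs = aliases(0x13DDE)
print(len(pairs))                      # how many pairs name this one byte
print((0x10DE, 0x2FFE) in pairs)       # the pair from the exercise
print((0x10DD, 0x300E) in pairs)       # (a-1, b+16) names the same byte
```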
Note!! Win32 programs (80386 processors and above) use a flat memory model, so you do not work with segments directly
