1 CPE 413 Overview of x86 Architecture-1
[Figure: a machine instruction is a string of bits, e.g. 0 1 0 0 0 1 1 0 1 0 1 1 0 1, divided into an opcode field followed by an operands field.]
Prof. Christopher U. Ngene email: [email protected] 4
Assembly Language
• It’s really difficult for humans to read or remember binary instruction encodings
• We will see that one would typically use hexadecimal encoding, but that is still difficult
• Therefore it is typical to use a set of mnemonics, which form the assembly language
• It is often said that the CPU understands assembly language
• This is not technically true: the CPU understands machine code, which we, as humans, choose to represent using assembly language
• An assembler transforms assembly code into machine code
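To make the assembler's job concrete, here is a minimal sketch of a toy assembler in Python. It only knows four operand-free x86 instructions, whose one-byte opcodes are standard; the table `OPCODES` and the function `assemble` are hypothetical names chosen for this illustration and do not come from any real assembler.

```python
# Toy assembler: maps a few operand-free x86 mnemonics to their one-byte opcodes.
# OPCODES and assemble() are illustrative names, not part of a real tool.
OPCODES = {
    "nop": 0x90,   # no operation
    "ret": 0xC3,   # near return
    "hlt": 0xF4,   # halt the processor
    "int3": 0xCC,  # breakpoint
}


def assemble(source: str) -> bytes:
    """Translate one mnemonic per line into machine code bytes."""
    machine_code = bytearray()
    for line in source.splitlines():
        mnemonic = line.split(";")[0].strip().lower()  # drop comments and spaces
        if not mnemonic:
            continue  # skip blank or comment-only lines
        machine_code.append(OPCODES[mnemonic])
    return bytes(machine_code)


program = """
nop        ; do nothing
nop
ret        ; return to the caller
"""
print(assemble(program).hex(" "))  # 90 90 c3
```

A real assembler also has to encode operands, resolve labels and choose between several encodings of the same mnemonic, which this sketch deliberately leaves out.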
Assembly Language
• Pros
• Assembly can be very fast (time efficiency): carefully hand-tuned assembly can outperform the code a compiler generates, although modern optimizing compilers often come close.
• Assembly is closer to the machine level than any other language because assembly instructions map essentially 1-1 to machine instructions.
• Assembly code can be a lot smaller (space efficiency) than compiler-generated code.
• In assembly, we can do many things that we can't do in a higher-level language, such as manipulating processor flags directly.
• Space efficiency
• Not a big plus point for most applications
• Code compactness is important in some cases
• Portable and hand-held device software
• Spacecraft control software
[Figure: from high-level code to machine code. The COMPILER translates high-level code into assembly code, the ASSEMBLER translates assembly code into machine code, and the CPU (program counter, registers, control unit, ALU) executes the machine code.]
High-level code (excerpt):
    char *tmpfilename;
    int num_schedulers=0;
    int num_request_submitters=0;
    int i,j;
    if (!(f = fopen(filename,"r"))) {
        xbt_assert1(0,"Cannot open file %s",filename);
    }
    while(fgets(buffer,256,f)) {
        if (!strncmp(buffer,"SCHEDULER",9))
            num_schedulers++;
        if (!strncmp(buffer,"REQUESTSUBMITTER",16))
            num_request_submitters++;
    }
    fclose(f);
    tmpfilename = strdup("/tmp/jobsimulator_
Assembly code (excerpt):
    sll $t3, $t1, 2
    add $t3, $s0, $t3
    sll $t4, $t0, 2
    add $t4, $s0, $t4
    lw $t5, 0($t3)
    lw $t6, 0($t4)
    slt $t2, $t5, $t6
    beq $t2, $zero, endif
    add $t0, $t1, $zero
Machine code (excerpt):
    010000101010110110
    101010101111010101
    101001010101010001
    101010101010100101
    111100001010101001
• We will pick the Intel 80x86 ISA (x86 for short), a CISC architecture
• It is the most common ISA in existing computers today
• For instance, it is the one in my laptop
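[Figure: the 16-bit x86 register set. AX (AH, AL), BX (BH, BL), CX (CH, CL) and DX (DH, DL); SI, DI, BP and SP; IP and FLAGS; and the segment registers CS, DS, SS and ES. Every register is 16 bits wide. The CPU also contains the control unit and the ALU.]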
The 80386 Registers
The 80386 was the first 32-bit processor of the x86 family.
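Each 32-bit register extends a 16-bit register; AX, BX, CX and DX are further split into a high byte and a low byte:
32 bits    16 bits    8 bits
EAX        AX         AH, AL
EBX        BX         BH, BL
ECX        CX         CH, CL
EDX        DX         DH, DL
ESI        SI
EDI        DI
EBP        BP
ESP        SP
EFLAGS     FLAGS
EIP        IP
Segment registers (16 bits each): CS, DS, ES, FS, GS, SS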
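The relationship between EAX, AX, AH and AL can be sketched with ordinary bit masks. The Python below is only a model of the register layout, not how the hardware is programmed; `sub_registers` and `write_al` are hypothetical helper names introduced for this illustration.

```python
def sub_registers(eax: int) -> dict:
    """Return the views of a 32-bit value that x86 names EAX, AX, AH and AL."""
    eax &= 0xFFFFFFFF          # keep only 32 bits
    ax = eax & 0xFFFF          # AX is the low 16 bits of EAX
    return {
        "EAX": eax,
        "AX": ax,
        "AH": (ax >> 8) & 0xFF,  # AH is the high byte of AX
        "AL": ax & 0xFF,         # AL is the low byte of AX
    }


def write_al(eax: int, value: int) -> int:
    """Model a write to AL: only the lowest byte of EAX changes."""
    return (eax & 0xFFFFFF00) | (value & 0xFF)


for name, value in sub_registers(0x12345678).items():
    print(f"{name} = {value:#x}")
```

For the input 0x12345678 the views are AX = 0x5678, AH = 0x56 and AL = 0x78; the same masks apply to EBX, ECX and EDX.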
The 80386 Registers…
• EAX, EBX, ECX and EDX are called data or general purpose registers.
• (E is for extended, as they are 32-bit extensions of their 16-bit counterparts AX, BX, CX and DX in the 16-bit ISA.)
• The register EAX is also known as the accumulator because it is used as the destination in many arithmetic operations.
• Some instructions generate more efficient code if they reference the EAX register rather than other registers.
• Bits in a register are conventionally numbered from right to left, beginning with 0, so bit 0 is the least significant bit and bit 31 is the most significant bit of a 32-bit register.
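The right-to-left numbering convention means that bit i has weight 2^i. A short Python sketch, with a hypothetical helper `bit` and an arbitrary example value, shows how a given bit position is read:

```python
def bit(value: int, position: int) -> int:
    """Return bit `position` of `value`, where position 0 is the rightmost bit."""
    return (value >> position) & 1


eax = (1 << 31) | 0b101  # bits 31, 2 and 0 set; an arbitrary example value
print(bit(eax, 0), bit(eax, 1), bit(eax, 2), bit(eax, 31))  # 1 0 1 1
```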
80386 Registers..
• Although they are called general purpose registers, only the ones marked with a * should be used in Windows programming
• Windows uses ESI, EDI, ESP and EBP internally and it doesn't expect the values in those registers to change.
• This doesn't mean that you cannot use those four registers; you can. Just be sure to restore their original values before passing control back to Windows.
Note: Win32 (80386 processors and above) uses a flat memory model, so your program does not have to deal with segments.
80386 Registers…
• There are six 16-bit segment registers: CS, DS, ES, FS, GS and SS. They define segments in memory.
• Lastly, there are two 32-bit registers that don't fit into any category: EFLAGS and EIP.
• Protected mode
• Native mode of the 80386 and higher
• Uses 32-bit addresses
• Supports segmentation and paging
• Paging is useful to implement virtual memory
• A processor that uses 20-bit addresses can access 2^20 bytes = 1 MB of physical memory.
• Depending on the machine, a processor can access one or more bytes from memory at a time.
• The number of bytes accessed simultaneously from main memory is called the word length of the machine.
• Generally, all machines are byte-addressable, i.e. every byte stored in memory has a unique address.
• However, the word length of a machine is typically some integral multiple of a byte.
• Therefore, the address of a word must be the address of one of its constituent bytes.
• In this regard, one of the following methods of addressing (also known as byte ordering) may be used.
Addresses in Memory…
• Big Endian: the most significant (higher) byte is stored at the lowest memory address (i.e. big byte first). MIPS, Apple and Sun SPARC machines are some examples in this class.
• Little Endian: the least significant (lower) byte is stored at the lowest memory address (i.e. little byte first). Intel’s machines use little endian.
• Consider, for example, storing 0xA2B1C3D4 in main memory. The two byte orderings are illustrated in Fig. 1-2.
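The two layouts of 0xA2B1C3D4 can be checked with Python's built-in `int.to_bytes`. This is a sketch that treats index 0 of each byte string as the lowest memory address, which is an assumption of the illustration rather than something a Python object guarantees about real memory.

```python
value = 0xA2B1C3D4

# Index 0 of each result plays the role of the lowest memory address.
big = value.to_bytes(4, byteorder="big")
little = value.to_bytes(4, byteorder="little")

for offset in range(4):
    print(f"address +{offset}: big endian {big[offset]:02X}   "
          f"little endian {little[offset]:02X}")
```

In big endian the bytes appear in the order A2 B1 C3 D4, and in little endian in the order D4 C3 B1 A2. Reading the bytes back with the wrong byte order yields a different number, which is why data exchanged between machines needs an agreed ordering.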
[Figure: 16 bytes of memory at addresses 0000 to 1111. Each 4-bit address is split into a 2-bit selector and a 2-bit offset; the address pattern 0 1 x x covers the bytes 0100 to 0111.]
For a fixed value of the selector there are 2^2 = 4 addressable bytes of memory. The set of these bytes is called a memory segment.
Selector and Offset Example
[Figure: the same 16 bytes of memory at addresses 0000 to 1111, with every address x x x x split into selector bits and offset bits.]
• We have 16 bytes of memory
• We have 4-byte segments
• We have 4 segments
Selector and Offset
• The way in which one addresses the memory content is
then pretty straightforward
• First, set the bits of the selector to “pick” a segment
• Second, set the bits of the offset to address a byte within
the segment
• This all makes sense because a program typically
addresses bytes that are next to each other, that is within
the same segment
• So, the selector bits stay the same for a long time, while
the offset bits change often
• Of course, this isn’t true for tiny 4-byte segments as in our
example…
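The two-step addressing described above can be sketched in Python for the toy layout with 4-bit addresses, a 2-bit selector and a 2-bit offset. The names `split_address` and `join_address` are hypothetical helpers for this illustration.

```python
OFFSET_BITS = 2  # toy layout: 4-bit address = 2-bit selector + 2-bit offset


def split_address(address: int, offset_bits: int = OFFSET_BITS) -> tuple:
    """Split an address into (selector, offset)."""
    selector = address >> offset_bits               # high bits pick the segment
    offset = address & ((1 << offset_bits) - 1)     # low bits pick the byte
    return selector, offset


def join_address(selector: int, offset: int, offset_bits: int = OFFSET_BITS) -> int:
    """Rebuild the address from a (selector, offset) pair."""
    return (selector << offset_bits) | offset


# Walking through neighbouring bytes changes only the offset bits.
for address in range(0b0100, 0b1000):
    selector, offset = split_address(address)
    print(f"{address:04b} -> selector {selector:02b}, offset {offset:02b}")
```

All four addresses in the loop share the selector 01, which illustrates why the selector can stay fixed while a program works within one segment.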
[Figure: 1 MB of memory. Each 20-bit address is split into a 4-bit selector and a 16-bit offset; the segments begin at the address prefixes 0000…, 0001…, up to 1111….]
• We have 1 MB of memory
• We have 64 KB segments
• We have 16 segments
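The sizes in this layout follow directly from the bit widths. A quick arithmetic check in Python, using the 4-bit selector and 16-bit offset from the figure (the constant names are chosen for this illustration):

```python
SELECTOR_BITS = 4   # selector width from the figure
OFFSET_BITS = 16    # offset width from the figure

segment_count = 2 ** SELECTOR_BITS          # number of distinct selector values
segment_size = 2 ** OFFSET_BITS             # bytes addressable by one offset
total_bytes = segment_count * segment_size  # size of the whole address space

print(f"{segment_count} segments of {segment_size // 1024} KB each, "
      f"{total_bytes // (1024 * 1024)} MB in total")
```

The total matches the 2^20 bytes reachable with a 20-bit address, since 4 + 16 = 20.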
[Figure: the address space of a program, divided into code, data and stack regions]
• A program constantly references all three regions
• Therefore, the program constantly references bytes in three different segments
• For now let’s assume that each region is fully contained in a single segment, which is in fact not always the case
• CS: points to the beginning of the code segment
• DS: points to the beginning of the data segment
• SS: points to the beginning of the stack segment
The Problem with Segments
• It is well known that programming with segmented architectures is really a pain
• Don’t panic, for our purposes we won’t really suffer from this (too much)
• You constantly have to make sure segment registers are set up correctly
• What happens if you have data/code that’s more than 64 KB?
• You must then switch back and forth between selector values, which can be really awkward
• Another source of complexity is that two different (selector, offset) pairs can reference the same address
• Example: (a,b) and (a-1, b+16)
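The (a,b) and (a-1, b+16) example can be verified numerically. The sketch below assumes the 8086 real-mode rule, which is not spelled out on these slides: the 16-bit selector is multiplied by 16 and added to the 16-bit offset, and the result wraps at 20 bits. The function name `physical_address` and the sample values are hypothetical.

```python
def physical_address(selector: int, offset: int) -> int:
    """8086 real-mode address (assumed rule): selector * 16 + offset, kept to 20 bits."""
    return ((selector << 4) + offset) & 0xFFFFF


a, b = 0x1234, 0x0010  # arbitrary sample values

first = physical_address(a, b)
second = physical_address(a - 1, b + 16)
print(f"({a:#06x}, {b:#06x})          -> {first:#07x}")
print(f"({a - 1:#06x}, {b + 16:#06x}) -> {second:#07x}")
```

Lowering the selector by 1 moves the segment start back by 16 bytes, and raising the offset by 16 compensates exactly, so both pairs name the same byte.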
Note: Win32 (80386 processors and above) uses a flat memory model, so your program does not have to deal with segments.