
Ics312 x86 PDF

The 80x86 architecture originated in the late 1970s with the Intel 8088 and 8086 processors. It has seen little fundamental change since then due to backward compatibility requirements, though it has been improved through incremental additions. The architecture uses 16-bit registers like AX, BX, CX, and DX that can each be divided into 8-bit high and low parts. Other registers include instruction pointer, segment, index, and flag registers. Memory is addressed through segment registers combined with index registers.


The x86 Architecture

ICS312 - Spring 2009
Machine-Level and Systems Programming
Henri Casanova ([email protected])

The 80x86 Architecture

- To learn assembly programming we need to pick a processor family with a given ISA (Instruction Set Architecture)
- We will pick the Intel 80x86 ISA (x86 for short)
  - The most common today in existing computers
  - For instance in my laptop
- We could have picked others
  - Old ones: Sparc, VAX
  - Current ones: PowerPC, Itanium, MIPS
  - In ICS431 you’ll (likely) be exposed to MIPS
- Some courses in some curricula subject students to two or even more ISAs, but in this course we’ll just focus on one, in more depth

X86 History

- In the late 70s Intel creates the 8088 and 8086 processors
  - 16-bit registers
  - 1 MB of memory, divided into 64KB segments
- In 1982: the 80286
  - Added some instructions
  - 16 MB of memory, divided into 64KB segments
- In 1985: the 80386
  - 32-bit registers
  - 4 GB of memory, divided into 4GB segments
- 1989: 486; 1993: Pentium; 1995: P6
  - Only incremental changes to the architecture
- 1997: MMX extensions
  - New instructions to speed up graphics (for instance)
- 1999: Pentium III
  - SSE extension (new cache instructions, new floating point operations)
- 2001: SSE2 extension

X86 History

- It’s quite amazing that this architecture has witnessed so little (fundamental) change since the 8086
  - Backward compatibility
  - Imposed early as _the_ ISA (Intel was among the first companies to produce a 16-bit microprocessor)
- Some argue that it’s an ugly ISA
  - Due to it being a set of add-ons rather than a modern re-design
- But it’s easy to implement in hardware, and Intel was successful in making faster and faster x86 processors for decades
- Intel’s new ISA is IA64 (used in Itanium processors), which is radically different
  - And closer to the ISA you’ll see in ICS431

The 8086 Registers

- To write assembly code for an ISA you must know the names of the registers
  - Because registers are places in which you put data to perform computation and in which you find the result of the computation (think of them as variables for now)
  - The registers are identified by binary numbers, but assembly languages give them “easy-to-remember” names
- The 8086 offered 16-bit registers
- Four general purpose 16-bit registers
  - AX
  - BX
  - CX
  - DX

The 8086 Registers

AX       BX       CX       DX
AH AL    BH BL    CH CL    DH DL

- Each of the 16-bit registers consists of 8 “low bits” and 8 “high bits”
  - Low: least significant
  - High: most significant
- The ISA makes it possible to refer to the low or high bits individually
  - AH, AL
  - BH, BL
  - CH, CL
  - DH, DL
The 8086 Registers

AX       BX       CX       DX
AH AL    BH BL    CH CL    DH DL

- The xH and xL registers can be used as 1-byte registers to store 1-byte quantities
- Important: both are “tied” to the 16-bit register
  - Changing the value of AX will change the values of AH and AL
  - Changing the value of AH or AL will change the value of AX

The 8086 Registers

- Two 16-bit index registers:
  - SI
  - DI
- These are basically general-purpose registers
- But by convention they are often used as “pointers”, i.e., they contain addresses
- And they cannot be decomposed into High and Low 1-byte registers
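
The way AH and AL are "tied" to AX can be illustrated with a small model. This is only a sketch in Python, not assembly: the class name `Reg16` and its attribute names `x`, `high`, and `low` are made up for this illustration and stand in for AX, AH, and AL.

```python
class Reg16:
    """Hypothetical model of one 16-bit general purpose register such as AX."""

    def __init__(self, value=0):
        self.x = value & 0xFFFF  # the full 16-bit register (plays the role of AX)

    @property
    def high(self):
        # The 8 most significant bits (plays the role of AH)
        return (self.x >> 8) & 0xFF

    @high.setter
    def high(self, byte):
        # Writing the high byte replaces only the upper 8 bits of the register
        self.x = ((byte & 0xFF) << 8) | (self.x & 0x00FF)

    @property
    def low(self):
        # The 8 least significant bits (plays the role of AL)
        return self.x & 0xFF

    @low.setter
    def low(self, byte):
        # Writing the low byte replaces only the lower 8 bits of the register
        self.x = (self.x & 0xFF00) | (byte & 0xFF)


ax = Reg16(0x1234)
print(hex(ax.high), hex(ax.low))  # 0x12 0x34
ax.low = 0xFF                     # changing "AL" ...
print(hex(ax.x))                  # ... changes "AX": 0x12ff
```

There is a single 16-bit value underneath, and the two 1-byte names are just views of its halves. SI and DI would have no `high` and `low` views in such a model.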

The 8086 Registers

- Two 16-bit special registers:
  - BP: Base Pointer
  - SP: Stack Pointer
  - We’ll discuss these at length later
- Four 16-bit segment registers:
  - CS: Code Segment
  - DS: Data Segment
  - SS: Stack Segment
  - ES: Extra Segment
  - We’ll discuss these later as well

The 8086 Registers

- The 16-bit Instruction Pointer (IP) register:
  - Points to the next instruction to execute
  - Typically not handled directly when writing assembly code
- The 16-bit FLAGS register
  - Information is stored in individual bits of the FLAGS register
  - Whenever an instruction is executed and produces a result, it may modify some bit(s) of the FLAGS register
  - Example: Z (or ZF) denotes one bit of the FLAGS register, which is set to 1 if the previously executed instruction produced 0, or 0 otherwise
  - We’ll see many uses of the FLAGS register

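The ZF example above can be mimicked in a few lines. This is a simplified sketch in Python rather than real x86 behavior: the helper name `sub16` is an assumption made for this illustration, and it models only the zero flag, ignoring every other bit of FLAGS.

```python
def sub16(a, b):
    """Subtract b from a as 16-bit values and return (result, zf)."""
    result = (a - b) & 0xFFFF     # keep only the low 16 bits, as a 16-bit register would
    zf = 1 if result == 0 else 0  # ZF is 1 exactly when the result is zero
    return result, zf


print(sub16(7, 7))  # (0, 1): the result is 0, so ZF is set
print(sub16(7, 5))  # (2, 0): the result is not 0, so ZF is cleared
```
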
The 8086 Registers

[Figure: the 8086 registers, each 16 bits wide, connected to the ALU and the Control Unit]
AH AL = AX
BH BL = BX
CH CL = CX
DH DL = DX
SI
DI
BP
SP
IP
FLAGS
CS
DS
SS
ES

Addresses in Memory

- We mentioned several registers that are used for holding addresses of memory locations
- Segments:
  - CS, DS, SS, ES
- Pointers:
  - SI, DI: indices (typically used for pointers)
  - SP: Stack pointer
  - BP: (Stack) Base pointer
- Let’s look at the structure of the address space
Address Space

- In the 8086 processor, a program is limited to referencing an address space of size 1MB, that is 2^20 bytes
- Therefore, addresses are 20 bits long!
- A d-bit long address makes it possible to reference 2^d different “things”
- Example:
  - 2-bit addresses
    - 00, 01, 10, 11
    - 4 “things”
  - 3-bit addresses
    - 000, 001, 010, 011, 100, 101, 110, 111
    - 8 “things”
- In our case, these things are “bytes”
  - One cannot address anything smaller than a byte
- Therefore, a 20-bit address makes it possible to address 2^20 individual bytes, or 1MB

Address Space

- One says that a running program has a 1MB address space
- And the program needs to use 20-bit addresses to reference memory content
  - Instructions, data, etc.
- Problem: registers are 16 bits long! How can they hold a 20-bit address???
- The solution: split addresses in two pieces:
  - The selector
  - The offset
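
The counting argument, and the reason a 16-bit register is too small, can be checked directly. This is a small sketch in Python, and the helper names `address_count` and `all_addresses` are made up for this illustration.

```python
def address_count(bits):
    """Number of distinct addresses that can be written with the given number of bits."""
    return 2 ** bits


def all_addresses(bits):
    """Every address of the given width, written in binary."""
    return [format(n, "0{}b".format(bits)) for n in range(address_count(bits))]


print(all_addresses(2))                 # ['00', '01', '10', '11']
print(address_count(3))                 # 8
print(address_count(20))                # 1048576 bytes, i.e., 1MB
print(address_count(20) - 1 <= 0xFFFF)  # False: the largest 20-bit address does not fit in 16 bits
```
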

Simple Selector and Offset

- Let us assume that we have an address space of size 2^4 = 16 bytes
  - Yes, that would not be a useful computer
- Addresses are 4 bits long
- Let’s assume we have a 2-bit selector and a 2-bit offset
  - As if our computer had only 2-bit registers
- We take such small numbers because it’s difficult to draw pictures with 2^20 bytes!

Selector and Offset Example

[Figure: a 4-bit address made of a 2-bit selector and a 2-bit offset, shown as 0 1 x x, next to 16 bytes of memory at addresses 0000 through 1111]

For a fixed value of the selector there are 2^2 = 4 addressable bytes of memory. The set of these bytes is called a memory segment.

Selector and Offset Example

[Figure: a 4-bit address made of a 2-bit selector and a 2-bit offset, shown as x x x x, next to 16 bytes of memory at addresses 0000 through 1111]

We have 16 bytes of memory
We have 4-byte segments
We have 4 segments

Selector and Offset

- The way in which one addresses the memory content is then pretty straightforward
- First, set the bits of the selector to “pick” a segment
- Second, set the bits of the offset to address a byte within the segment
- This all makes sense because a program typically addresses bytes that are next to each other, that is within the same segment
- So, the selector bits stay the same for a long time, while the offset bits change often
  - Of course, this isn’t true for tiny 4-byte segments as in our example…
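
The toy 4-bit example can be written out as code. This is a minimal sketch in Python and covers only the simple scheme in which the selector is the top bits of the address. The names `OFFSET_BITS`, `split`, and `join` are assumptions made for this illustration.

```python
OFFSET_BITS = 2  # toy example: 2-bit selector followed by a 2-bit offset


def split(address, offset_bits=OFFSET_BITS):
    """Split an address into (selector, offset)."""
    selector = address >> offset_bits            # the high bits pick the segment
    offset = address & ((1 << offset_bits) - 1)  # the low bits pick the byte within it
    return selector, offset


def join(selector, offset, offset_bits=OFFSET_BITS):
    """Rebuild the address from a selector and an offset."""
    return (selector << offset_bits) | offset


print(split(0b0110))  # (1, 2): segment 01, byte 10 within that segment
# With the selector fixed to 01, the offset reaches the 4 bytes of that segment:
print([format(join(0b01, off), "04b") for off in range(4)])
# ['0100', '0101', '0110', '0111']
```
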
For 20-bit Addresses

[Figure: a 20-bit address made of a 4-bit selector and a 16-bit offset]

- On the 8086 the offset is 16 bits long
  - And therefore the selector is 4-bit
- We have 2^4 = 16 different segments
- Each segment is 2^16 bytes = 64KB
- For a total of 1MB of memory, which is what the 8086 used

For 20-bit Addresses

[Figure: a 20-bit address made of a 4-bit selector and a 16-bit offset, next to 1MB of memory divided into 16 segments whose addresses begin with 0000… through 1111…]

We have 1MB of memory
We have 64KB segments
We have 16 segments

The 8086 Selector Scheme

- So far we’ve talked about the selector as a 4-bit quantity, for simplicity
- This leads to 16 non-overlapping segments
- The designers of the 8086 wanted more flexibility
- E.g., if you know that you need only an 8K segment, why use 64K for it? Just have the “next” segment start 8K after the previous segment
  - We’ll see why segments are needed in a little bit
- So, for the 8086, the selector is NOT a 4-bit field, but rather the address of the beginning of the segment
- But now we’re back to our initial problem: Addresses are 20-bit, how are we to store an address in a 16-bit register???

The 8086 Selector Scheme

- What the designers of the 8086 did is pretty simple
- Enforce that the beginning address of a segment can only be a multiple of 16
- Therefore, its representation in binary always has its four lowest bits set to 0
- Or, in hexadecimal, its last digit is always 0
- So the address of a beginning of a segment is a 20-bit hex quantity that looks like: XXXX0
- Since we know the last digit is always 0, no need to store it
- Therefore, we need to store only 4 hex digits
- Which, lo and behold, fits in a 16-bit register!

The 8086 Selector Scheme

- So now we have two 16-bit quantities
  - The 16-bit selector
  - The 16-bit offset
- The selector must be stored in one of the “segment” registers
  - CS, DS, SS, ES
- The offset is typically stored in one of the “index” registers
  - SI, DI
  - But could be stored in a general purpose register
- Address computation is straightforward
- Given a 16-bit selector and a 16-bit offset, the 20-bit address is computed as follows
  - Multiply the selector by 16
    - This simply transforms XXXX into XXXX0, thanks to the beauty of hexadecimal
  - Add the offset
  - And voila

In-class Exercise

- Consider the byte at address 13DDE within a 64K segment defined by selector value 10DE. What is its offset?


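The two-step address computation (multiply the selector by 16, then add the offset) can be sketched as a one-line function. This is an illustration in Python, not assembly: the function name `physical_address` and the sample values are made up here, and the sketch does not model what the hardware does when the sum needs more than 20 bits.

```python
def physical_address(selector, offset):
    """Address built from a 16-bit selector and a 16-bit offset."""
    if not 0 <= selector <= 0xFFFF or not 0 <= offset <= 0xFFFF:
        raise ValueError("selector and offset must each fit in 16 bits")
    # Multiplying by 16 turns hex XXXX into XXXX0; then the offset is added.
    return selector * 16 + offset


# Sample values chosen for illustration only:
print(format(physical_address(0x1234, 0x0010), "05X"))  # 12350
```

Multiplying by 16 is the same as shifting left by 4 bits, which is why it appends a single 0 digit in hexadecimal.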
In-class Exercise

- Consider the byte at address 13DDE within a 64K segment defined by selector value 10DE. What is its offset?
- 13DDE = 10DE * 16 (decimal) + offset
- offset = 13DDE - 10DE0
- offset = 2FFE (a 16-bit quantity)

Address Computation Example

- Consider the whole 1MB address space
- Say that we want a 64K segment whose end is 8K from the end of the address space
- The address at the end of the address space is FFFFF
- 8K in binary is 10000000000000, that is 02000 in 20-bit hex
- So the address right after the end of the segment is FFFFF - 02000 + 1 = FDFFF + 1 = FE000
- The length of the segment is 64K
- 64K in binary is 10000000000000000, that is 10000 in 20-bit hex
- So the address at the beginning of the segment is FE000 - 10000 = EE000
- So the value to store in a segment register is EE00
- To reference the 43rd byte in the segment, one must store 002A (= 42 in decimal) in an index register
- The address of that byte is: EE000 + 002A = EE02A
- The address of the last byte in the segment is: EE000 + 0FFFF = FDFFF
  - Which is right before FE000, the beginning of the last 8K of the address space

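The arithmetic of the exercise and of the example can be replayed mechanically. This is a sketch in Python, and the names `ADDRESS_SPACE`, `KB`, and `selector_for` are assumptions made for this illustration.

```python
ADDRESS_SPACE = 1 << 20  # 1MB, i.e., addresses 00000 through FFFFF
KB = 1024


def selector_for(start):
    """Selector for a segment that begins at address `start`."""
    if start % 16 != 0:
        raise ValueError("a segment must begin at a multiple of 16")
    return start >> 4  # drop the last hex digit, which is always 0


segment_end = ADDRESS_SPACE - 8 * KB   # first address after the segment
segment_start = segment_end - 64 * KB  # the segment is 64K long
selector = selector_for(segment_start)
offset_43rd = 43 - 1                   # the 43rd byte sits at offset 42

print(format(segment_end, "05X"))                  # FE000
print(format(segment_start, "05X"))                # EE000
print(format(selector, "04X"))                     # EE00
print(format(selector * 16 + offset_43rd, "05X"))  # EE02A
print(format(selector * 16 + 0xFFFF, "05X"))       # FDFFF

# The in-class exercise: offset of address 13DDE under selector 10DE
print(format(0x13DDE - 0x10DE * 16, "04X"))        # 2FFE
```
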
Code, Data, Stack

[Figure: the address space with three regions: code, data, stack]

- Although we’ll discuss these at length later, let’s just accept for now that the address space has three regions
- A program constantly references all three regions
- Therefore, the program constantly references bytes in three different segments
  - For now let’s assume that each region is fully contained in a single segment, which is in fact not always the case
- CS: points to the beginning of the code segment
- DS: points to the beginning of the data segment
- SS: points to the beginning of the stack segment

The trouble with segments

- It is well-known that programming with segmented architectures is really a pain
  - Don’t panic, for our purposes we won’t really suffer from this (too much)
- You constantly have to make sure segment registers are set up correctly
- What happens if you have data/code that’s more than 64K?
  - You must then switch back and forth between selector values, which can be really awkward
- Something that can cause complexity also is that two different (selector, offset) pairs can reference the same address
  - Example: (a,b) and (a-1, b+16)
- There is an interesting on-line article on the topic: http://world.std.com/~swmcd/steven/rants/pc.html
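
The aliasing between (a, b) and (a-1, b+16) is easy to confirm numerically. This is a small sketch in Python. The names `physical_address` and `aliases` are assumptions made for this illustration, and the sample values for a and b are borrowed from the in-class exercise.

```python
def physical_address(selector, offset):
    """Real-mode style address: selector times 16, plus offset."""
    return selector * 16 + offset


a, b = 0x10DE, 0x2FFE
# Lowering the selector by 1 moves the segment start back by 16 bytes,
# so raising the offset by 16 lands on the same byte.
print(physical_address(a, b) == physical_address(a - 1, b + 16))  # True


def aliases(address):
    """All (selector, offset) pairs of 16-bit values that reach `address`."""
    pairs = []
    for selector in range(0x10000):
        offset = address - selector * 16
        if 0 <= offset <= 0xFFFF:
            pairs.append((selector, offset))
    return pairs


print(len(aliases(0x13DDE)))  # 4096 different pairs name this one byte
```
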

How come it ever survived?

- If your code and your data are <64K, segments are great
- Otherwise, they are a pain
- Unfortunately, one often has more than 64K of data
- Given the horror of segmented programming, one may wonder how come it stuck?
- From the linked article: “Under normal circumstances, a design so twisted and flawed as the 8086 would have simply been ignored by the market and faded away.”
- But in 1980, Intel was lucky that IBM picked it for the PC!
- And we’ve been stuck with it ever since
- Luckily the segment issue isn’t as terrible with 32-bit architectures
  - Segments are 4GB, and thus typically “big enough”
- Not to criticize IBM or anything, but they are also the reason why we got stuck with FORTRAN for so many years :)
- Probably the fate of giant companies: whenever they make a bad choice it can have large repercussions

“Real Mode”

- The addressing scheme we just described is called real mode
- Called “real” because the addresses being computed are physical addresses
  - That is, they reference physical locations in the memory chips
- This can cause problems if we have multiple programs running at the same time
  - Which is something all modern computers do
- First problem: a program may inadvertently modify the memory that belongs to another program
  - This could be very dangerous
- Second problem: the total memory used by all running programs must be no bigger than the total physical memory
  - This is limiting and may be difficult to enforce
“Protected Mode”

- With the 286, Intel introduced protected mode, which supports virtual memory
- This is a complicated topic that you will see in your ICS412 and ICS431
  - I may give a short Virtual Memory lecture if time allows
- With virtual memory, the addresses computed and issued by the CPU are not physical addresses
- Although the CPU thinks they are, instead they are virtual addresses
- Virtual addresses are translated into physical addresses by the hardware/OS
- Benefits
  - Memory protection between different programs
  - Use of the disk as extended memory
- All modern computers use protected mode

32-bit Protected Mode

- With the 80386 Intel introduced a processor with 32-bit registers
- Addresses are 32 bits long
  - Segments are 4GB
- All modern operating systems running on 32-bit Intel processors use paged 32-bit protected mode
- Let’s have a look at the 32-bit registers

The 80386 32-bit registers

- The general purpose registers were extended to 32-bit
  - EAX, EBX, ECX, EDX
  - For backward compatibility, AX, BX, CX, and DX refer to the 16 low bits of EAX, EBX, ECX, and EDX
  - AH and AL are as before
  - There is no way to access the high 16 bits of EAX separately
- Similarly, other registers are extended
  - ESI, EDI, EBP, ESP, EIP, EFLAGS
  - For backward compatibility, the previous names are used to refer to the low 16 bits
- The segment registers stay the same
- Two new segment registers: FS and GS

The 80386 Registers

[Figure: the 80386 registers; each 32-bit register contains the corresponding 16-bit register as its low 16 bits]
AX (AH AL) = EAX
BX (BH BL) = EBX
CX (CH CL) = ECX
DX (DH DL) = EDX
SI = ESI
DI = EDI
BP = EBP
SP = ESP
FLAGS = EFLAGS
IP = EIP
16-bit segment registers: CS, DS, ES, FS, GS, SS
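
How the old names map onto a 32-bit register can be expressed with bit masks. This is a sketch in Python, not assembly, and the helper names `views` and `write_ax` are made up for this illustration.

```python
def views(eax):
    """Named sub-registers of a 32-bit EAX value."""
    eax &= 0xFFFFFFFF
    ax = eax & 0xFFFF  # AX is the low 16 bits of EAX
    return {"EAX": eax, "AX": ax, "AH": (ax >> 8) & 0xFF, "AL": ax & 0xFF}


def write_ax(eax, value):
    """Writing AX replaces the low 16 bits and leaves the high 16 bits alone."""
    return (eax & 0xFFFF0000) | (value & 0xFFFF)


print({name: hex(val) for name, val in views(0xDEADBEEF).items()})
# {'EAX': '0xdeadbeef', 'AX': '0xbeef', 'AH': '0xbe', 'AL': '0xef'}
print(hex(write_ax(0xDEADBEEF, 0x1234)))  # 0xdead1234
```

No entry is provided for the high 16 bits, which mirrors the fact that the ISA gives them no name of their own.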

Segment registers

- In 32-bit protected mode, segment registers are still 16-bit
- This may seem surprising, but in fact segmenting in the 386 is very different from segmenting in the 8086
- Each segment register points to an address
- At that address is a small data structure that describes the corresponding segment
  - Begin, end
- So when the CPU issues an address in a segment, the address computation is just a bit more complicated than in the 8086
  - Go look up the data structure
  - Find the beginning of the segment
  - Add the offset to it

Conclusion

- From now on we’ll keep referring to the register names, so make sure you absolutely know them
- We’re ready to move on to writing assembly code for the x86 architecture in 32-bit protected mode
- The registers are, in some sense, the variables that we can use
In-class Quiz

- Quiz #2 will be on the last two sets of slides
  - Numbers and Computers
  - The x86 Architecture
- Note that there will be questions on the quiz like:
  - Converting numbers from one base to another
  - Computing 2’s complements
  - Arithmetic in different bases
- The quiz will be next...
