Lecture 5

Uploaded by

jam khan
Copyright
© © All Rights Reserved

Machine-Level Representation of Programs
A Historical Perspective
 The Intel processor line, referred to as x86, has followed a long
evolutionary development.
 It started with one of the first single-chip 16-bit microprocessors,
 where many compromises had to be made due to the limited capabilities
of integrated circuit technology at the time.
 Since then, it has grown to take advantage of technology improvements
 as well as to satisfy the demands for higher performance and for
supporting more advanced operating systems.
A Historical Perspective
 The list that follows shows some models of Intel processors and some of
their key features.
 We use the number of transistors required to implement the processors
as an indication of how they have evolved in complexity.
 In this table, “K” denotes 1,000 (10^3), “M” denotes 1,000,000 (10^6), and
“G” denotes 1,000,000,000 (10^9).
8086 (1978, 29 K transistors)
 One of the first single-chip, 16-bit microprocessors.
 The 8088, a variant of the 8086 with an 8-bit external bus,
 formed the heart of the original IBM personal computers.
 IBM contracted with then-tiny Microsoft to develop the MS-DOS
operating system.
 The original models came with 32,768 bytes of memory and two floppy
drives (no hard drive).
8086 (1978, 29 K transistors)
 In 1980, Intel introduced the 8087 floating-point coprocessor (45K
transistors)
 to operate alongside an 8086 or 8088 processor,
 executing the floating-point instructions.
 The 8087 established the floating-point model for the x86 line, often
referred to as “x87.”
80286 (1982, 134 K transistors)
 Added more (and now obsolete) addressing modes.
 Formed the basis of the IBM PC-AT personal computer,
 The original platform for MS Windows.
i386 (1985, 275 K transistors)
 Expanded the architecture to 32 bits.
 Added the flat addressing model used by Linux and recent versions of the
Windows operating system.
 This was the first machine in the series that could fully support a Unix
operating system.
i486 (1989, 1.2 M transistors)
 Improved performance and integrated the floating-point unit onto the
processor chip,
 but did not significantly change the instruction set.
Pentium (1993, 3.1 M transistors)
 Improved performance
 but only added minor extensions to the instruction set.
PentiumPro (1995, 5.5 M transistors)
 Introduced a radically new processor design,
 internally known as the P6 microarchitecture.
 Added a class of “conditional move” instructions to the instruction set.
Pentium/MMX (1997, 4.5 M transistors)
 Added a new class of instructions to the Pentium processor for
manipulating vectors of integers.
Pentium II (1997, 7 M transistors)
Pentium III (1999, 8.2 M transistors)
 Introduced a class of instructions for manipulating vectors of integer or
floating-point data.
Pentium 4 (2000, 42 M transistors)
 Extended the vector instructions of the Pentium III, adding new data types
(including double-precision floating point),
 along with 144 new instructions for these formats.
Pentium 4E (2004, 125 M transistors)
 Added hyperthreading,
 a method to run two programs simultaneously on a single processor.
Core 2 (2006, 291 M transistors)
 First multi-core Intel microprocessor,
 where multiple processors are implemented on a single chip.
 Did not support hyperthreading.
Core i7, Nehalem (2008, 781 M transistors)
 Incorporated both hyperthreading and multi-core,
 with the initial version supporting two executing programs on each core
and up to four cores on each chip.
Core i7, Sandy Bridge (2011, 1.17 G transistors)
Core i7, Haswell (2013, 1.4 G transistors)
 Over the years, several companies have produced processors
 that are compatible with Intel processors,
 capable of running the exact same machine-level programs.
 Chief among these is Advanced Micro Devices (AMD).
Program Encodings
 Suppose we write a C program as two files p1.c and p2.c.
 We can then compile this code using a Unix command line:
linux> gcc -Og -o p p1.c p2.c
 The command gcc indicates the gcc C compiler.
 The command-line option -Og instructs the compiler
 to apply a level of optimization that yields machine code that follows the
overall structure of the original C code.
Program Encodings
 In practice, higher levels of optimization (e.g., specified with the option
-O1 or -O2) are considered a better choice in terms of the resulting program
performance.
 The gcc command invokes an entire sequence of programs to turn the
source code into executable code.
 First, the C preprocessor expands the source code to include any files
 specified with #include commands and to expand any macros, specified
with #define declarations.
Program Encodings
 Second, the compiler generates assembly code versions of the two source
files having names p1.s and p2.s.
 Next, the assembler converts the assembly code into binary object-code
files p1.o and p2.o.
 Finally, the linker merges these two object-code files along with code
implementing library functions (e.g., printf)
 and generates the final executable code file p (as specified by the
command-line directive -o p).
Machine-Level Code
 Computer systems employ several different forms of abstraction,
 hiding details of an implementation through the use of a simpler abstract
model.
 Two of these are especially important for machine-level programming.
 First, the format and behavior of a machine-level program is defined by
the instruction set architecture, or ISA.
 Most ISAs, including x86-64, describe the behavior of a program as if each
instruction is executed in sequence,
 with one instruction completing before the next one begins.
Machine-Level Code
 Second, the memory addresses used by a machine-level program are
virtual addresses,
 providing a memory model that appears to be a very large byte array.
 The actual implementation of the memory system involves a combination
of multiple hardware memories and operating system software.
 The machine code for x86-64 differs greatly from the original C code.
 Parts of the processor state are visible that normally are hidden from the
C programmer:
Machine-Level Code
 The program counter (commonly referred to as the PC, and called %rip in
x86-64) indicates the address in memory of the next instruction to be executed.
 The integer register file contains 16 named locations storing 64-bit values.
 These registers can hold addresses (corresponding to C pointers) or
integer data.
Machine-Level Code
 The condition code registers hold status information about the most
recently executed arithmetic or logical instruction.
 These are used to implement conditional changes in the control or data
flow, such as is required to implement if and while statements.
 A set of vector registers can each hold one or more integer or floating-
point values.
 Aggregate data types in C such as arrays and structures are represented in
machine code as contiguous collections of bytes.
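The contiguous layout of aggregates can be checked from C itself. The following is a minimal sketch, not part of the lecture: the structure record and both helper functions are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure used only for illustration. */
struct record {
    long id;
    long values[3];
};

/* Byte distance between element i of a long array and its start.
   Because arrays are contiguous, this is exactly i * sizeof(long). */
size_t long_array_offset(const long *base, size_t i) {
    return (size_t)((const char *)&base[i] - (const char *)base);
}

/* Byte offset of values[i] inside struct record, computed from the
   offset of the array field plus the element stride. */
size_t record_value_offset(size_t i) {
    assert(i < 3);  /* values has three elements */
    return offsetof(struct record, values) + i * sizeof(long);
}
```

Calling long_array_offset(a, 3) on a long array a should give 3 * sizeof(long), which is the same arithmetic the machine code uses when it indexes the array.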
Code Examples
 Suppose we write a C code file mstore.c containing the following function
definition:
long mult2(long, long);

void multstore(long x, long y, long *dest) {
    long t = mult2(x, y);
    *dest = t;
}
Code Examples
 To see the assembly code generated by the C compiler,
 we can use the -S option on the command line:
linux> gcc -Og -S mstore.c
 This will cause gcc to run the compiler, generating an assembly file
mstore.s, and go no further.
 Normally it would then invoke the assembler to generate an object-code
file.
Code Examples
 The assembly-code file contains various declarations, including the
following set of lines:
multstore:
    pushq %rbx
    movq %rdx, %rbx
    call mult2
    movq %rax, (%rbx)
    popq %rbx
    ret
Code Examples
 Each indented line in the code corresponds to a single machine
instruction.
 For example, the pushq instruction indicates that the contents of register
%rbx should be pushed onto the program stack.
 All information about local variable names or data types has been stripped
away.
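To link and run multstore, mult2 needs a definition, which the lecture only declares. The sketch below assumes that mult2 simply multiplies its arguments; that body and the helper call_multstore are assumptions made for illustration.

```c
#include <assert.h>

/* Assumed definition: the lecture declares mult2 but never defines it. */
long mult2(long a, long b) {
    return a * b;
}

/* multstore exactly as in mstore.c. */
void multstore(long x, long y, long *dest) {
    long t = mult2(x, y);
    *dest = t;
}

/* Hypothetical helper: call multstore and check that it wrote the
   result through the dest pointer. */
long call_multstore(long x, long y) {
    long result = 0;
    multstore(x, y, &result);
    assert(result == mult2(x, y));
    return result;
}
```

With this definition of mult2, call_multstore(3, 4) returns 12. In the assembly version the local variable t has no name at all: the value only ever lives in %rax before being stored through %rbx.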
Code Examples
 If we use the -c command-line option, gcc will both compile and assemble
the code:
linux> gcc -Og -c mstore.c
 This will generate an object-code file mstore.o
 that is in binary format and hence cannot be viewed directly.
 To inspect the contents of machine-code files,
 a class of programs known as disassemblers can be invaluable.
Code Examples
 These programs generate a format similar to assembly code from the
machine code.
 With Linux systems, the program objdump (for “object dump”) can serve
this role given the -d command-line flag:
linux> objdump -d mstore.o
Notes on Formatting
 The assembly code generated by gcc is difficult for a human to read.
 For example, suppose we give the command to generate the file mstore.s.
linux> gcc -Og -S mstore.c
 The full content of the file is as follows:
Code Examples
 All of the lines beginning with ‘.’ are directives to guide the assembler and
linker.
 We can generally ignore these.
 For our example, an annotated version would appear as follows:

long mult2(long, long);

void multstore(long x, long y, long *dest)
{
    long t = mult2(x, y);
    *dest = t;
}
ATT versus Intel assembly-code formats
 We show assembly code in ATT format (named after AT&T, the company
that operated Bell Laboratories for many years),
 the default format for gcc, objdump, and the other tools we will consider.
 Other programming tools, including those from Microsoft
 as well as the documentation from Intel, show assembly code in Intel
format.
 The two formats differ in a number of ways.
ATT versus Intel assembly-code formats
 As an example, gcc can generate code in Intel format for the same
function using the following command line:
linux> gcc -Og -S -masm=intel mstore.c
multstore:
    push rbx
    mov rbx, rdx
    call mult2
    mov QWORD PTR [rbx], rax
    pop rbx
    ret
ATT versus Intel assembly-code formats
 We see that the Intel and ATT formats differ in the following ways:
 The Intel code omits the size designation suffixes.
 We see instruction push and mov instead of pushq and movq.
 The Intel code omits the ‘%’ character in front of register names, using rbx
instead of %rbx.
 The Intel code has a different way of describing locations in memory
 for example, QWORD PTR [rbx] rather than (%rbx).
 Instructions with multiple operands list them in the reverse order.
Data Formats
 Due to its origins as a 16-bit architecture that expanded into a 32-bit one,
 Intel uses the term “word” to refer to a 16-bit data type.
 Based on this, they refer to 32-bit quantities as “double words” and 64-bit
quantities as “quad words.”
 Figure 3.1 shows the x86-64 representations used for the primitive data
types of C.
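Since the figure itself is not reproduced here, the sizes can be checked with a short C sketch. The rows below are an assumption about what Figure 3.1 contains, based on the usual x86-64 Linux data model; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* One row per C type: Intel's name for the size, the gcc assembly
   suffix, the size in bytes expected on x86-64 Linux (an assumption),
   and the size the compiler actually reports. */
struct data_format {
    const char *c_type;
    const char *intel_name;
    char suffix;
    size_t expected_bytes;
    size_t actual_bytes;
};

static const struct data_format formats[] = {
    {"char",   "byte",             'b', 1, sizeof(char)},
    {"short",  "word",             'w', 2, sizeof(short)},
    {"int",    "double word",      'l', 4, sizeof(int)},
    {"long",   "quad word",        'q', 8, sizeof(long)},
    {"char *", "quad word",        'q', 8, sizeof(char *)},
    {"float",  "single precision", 's', 4, sizeof(float)},
    {"double", "double precision", 'l', 8, sizeof(double)},
};

/* Number of rows in the table. */
size_t format_count(void) {
    return sizeof(formats) / sizeof(formats[0]);
}

/* Returns 1 if every type has the size expected on x86-64 Linux. */
int sizes_match_x86_64(void) {
    for (size_t i = 0; i < format_count(); i++) {
        if (formats[i].actual_bytes != formats[i].expected_bytes)
            return 0;
    }
    return 1;
}

/* Expected size in bytes for a given row of the table. */
size_t expected_size(size_t row) {
    assert(row < format_count());
    return formats[row].expected_bytes;
}
```

On a platform with a different data model (for example, one where long is 4 bytes), sizes_match_x86_64 would return 0, which is a quick way to see that these sizes belong to the platform rather than to the C language.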
Data Formats
 Most assembly-code instructions generated by gcc have a single-character
suffix denoting the size of the operand.
 For example, the data movement instruction has four variants:
 movb (move byte), movw (move word),
 movl (move double word), and movq (move quad word).
 The suffix ‘l’ is used for double words,
 since 32-bit quantities are considered to be “long words.”
Accessing Information
 An x86-64 central processing unit (CPU) contains a set of 16 general-
purpose registers storing 64-bit values.
 These registers are used to store integer data as well as pointers.
 Figure 3.2 diagrams the 16 registers.
 Their names all begin with %r
Accessing Information
 The original 8086 had eight 16-bit registers.
 With the extension to IA32, these registers were expanded to 32-bit
registers, labeled %eax through %ebp.
 In the extension to x86-64, the original eight registers were expanded to
64 bits, labeled %rax through %rbp.
 In addition, eight new registers were added.
Accessing Information
 Instructions can operate on data of different sizes stored in the low-order
bytes of the 16 registers.
 Byte-level operations can access the least significant byte,
 16-bit operations can access the least significant 2 bytes,
 32-bit operations can access the least significant 4 bytes,
 and 64-bit operations can access entire registers.
Operand Specifiers
 Most instructions have one or more operands
 specifying the source values to use in performing an operation and the
destination location into which to place the result.
 x86-64 supports a number of operand forms (see Figure 3.3).
 Source values can be given as constants or read from registers or
memory.
 Results can be stored in either registers or memory.
Operand Specifiers
 The different operand possibilities can be classified into three types.
 The first type, immediate, is for constant values.
 In ATT format assembly code, these are written with a ‘$’ followed by an
integer using standard C notation, for example, $-577 or $0x1F.
 The second type, register, denotes the contents of a register,
 one of the sixteen 8-, 4-, 2-, or 1-byte low-order portions of the registers
for operands having 64, 32, 16, or 8 bits, respectively.
Operand Specifiers
 In Figure 3.3, we use the notation ra to denote an arbitrary register a
and indicate its value with the reference R[ra],
 viewing the set of registers as an array R indexed by register
identifiers.
Operand Specifiers
 The third type of operand is a memory reference,
 in which we access some memory location according to a computed
address, often called the effective address.
 Since we view the memory as a large array of bytes,
 we use the notation M[Addr] to denote a reference to the value stored in
memory starting at address Addr; the number of bytes accessed is set by the
operand size.
Operand Specifiers
 As Figure 3.3 shows, there are many different addressing modes allowing
different forms of memory references.
 The most general form is shown at the bottom of the table with syntax
Imm(rb,ri,s).
 Such a reference has four components: an immediate offset Imm, a base
register rb, an index register ri, and a scale factor s,
 where s must be 1, 2, 4, or 8.
 Both the base and index must be 64-bit registers.
 The effective address is computed as Imm + R[rb] + R[ri] · s.
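The effective-address formula for the general form Imm(rb,ri,s) can be sketched in C as follows. This is an illustration only: the function name is hypothetical, and the register contents R[rb] and R[ri] are passed in directly as plain integers.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address for the general form Imm(rb,ri,s):
   Imm + R[rb] + R[ri] * s.
   base and index stand for the register contents R[rb] and R[ri]. */
uint64_t effective_address(int64_t imm, uint64_t base, uint64_t index,
                           unsigned scale) {
    /* The scale factor must be 1, 2, 4, or 8. */
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    /* Unsigned arithmetic wraps modulo 2^64, as address arithmetic does. */
    return (uint64_t)imm + base + index * scale;
}
```

For instance, with R[rb] = 0x100 and R[ri] = 0x3, the operand 9(rb,ri,4) gives 9 + 0x100 + 0x3 · 4 = 0x115. The simpler addressing forms are special cases with some components left at zero, so (rb) corresponds to effective_address(0, base, 0, 1).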
Data Movement Instructions
 Among the most heavily used instructions are those that copy data from
one location to another.
 Figure 3.4 lists the simplest form of data movement instructions, the mov
class.
 These instructions copy data from a source location to a destination
location, without any transformation.
 The class consists of four instructions: movb, movw, movl, and movq.
Data Movement Instructions
 All four of these instructions have similar effects;
 they differ primarily in that they operate on data of different sizes: 1, 2, 4,
and 8 bytes, respectively.
 The source operand designates a value that is immediate, stored in a
register, or stored in memory.
 The destination operand designates a location that is either a register or a
memory address.
 x86-64 imposes the restriction that a move instruction cannot have both
operands refer to memory locations.
Data Movement Instructions
 Copying a value from one memory location to another requires two
instructions:
 the first to load the source value into a register, and the second to write
this register value to the destination.
 Register operands for these instructions can be the labeled portions of
any of the 16 registers,
 where the size of the register must match the size designated by the last
character of the instruction (‘b’, ‘w’, ‘l’, or ‘q’).
Data Movement Instructions
 For most cases, the mov instructions will only update the specific register
bytes or memory locations indicated by the destination operand.
 The only exception is that when movl has a register as the destination,
 it will also set the high-order 4 bytes of the register to 0.
 This exception arises from the convention, adopted in x86-64,
 that any instruction that generates a 32-bit value for a register also sets
the high-order portion of the register to 0.
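The effect of each mov variant on a 64-bit destination register can be modeled in C. The sketch below is a simulation, not how the hardware is implemented; mov_to_reg is a hypothetical name, and the register is represented as a plain 64-bit integer.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a mov into a register whose old contents are reg.
   size is the operand size in bytes: 1 (movb), 2 (movw), 4 (movl),
   or 8 (movq). Returns the new register contents. */
uint64_t mov_to_reg(uint64_t reg, uint64_t src, unsigned size) {
    assert(size == 1 || size == 2 || size == 4 || size == 8);
    switch (size) {
    case 1:  /* movb: only the low-order byte changes */
        return (reg & ~UINT64_C(0xFF)) | (src & UINT64_C(0xFF));
    case 2:  /* movw: only the low-order 2 bytes change */
        return (reg & ~UINT64_C(0xFFFF)) | (src & UINT64_C(0xFFFF));
    case 4:  /* movl: low 4 bytes written, high 4 bytes set to 0 */
        return src & UINT64_C(0xFFFFFFFF);
    default: /* movq: all 8 bytes replaced */
        return src;
    }
}
```

Starting from a register holding 0x0011223344556677, a byte move of 0xFF leaves 0x00112233445566FF, while a 4-byte move of 0xFFFFFFFF leaves 0x00000000FFFFFFFF: the upper half is cleared rather than preserved.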
Data Movement Instructions
 The following mov instruction examples show the five possible
combinations of source and destination types.
 Recall that the source operand comes first and the destination second.
Data Movement Instructions
 A movabsq instruction documented in Figure 3.4 is for dealing with 64-bit
immediate data.
 The regular movq instruction can only have immediate source operands
that can be represented as 32-bit two’s-complement numbers.
 This value is then sign extended to produce the 64-bit value for the
destination.
 The movabsq instruction can have an arbitrary 64-bit immediate value as
its source operand and can only have a register as a destination.
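Whether a constant can be used as a movq immediate, or needs movabsq, comes down to a range check. A minimal sketch with hypothetical function names:

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if v is representable as a 32-bit two's-complement number,
   so that it can be the immediate source of a regular movq. */
int fits_movq_immediate(int64_t v) {
    return v >= INT32_MIN && v <= INT32_MAX;
}

/* Model of movq with an immediate source: the value is held as a
   32-bit immediate and sign extended to 64 bits for the destination. */
int64_t movq_immediate(int64_t v) {
    assert(fits_movq_immediate(v));  /* otherwise movabsq is required */
    int32_t imm32 = (int32_t)v;      /* the 32-bit encoded immediate */
    return (int64_t)imm32;           /* sign extension to 64 bits */
}
```

A value such as -17 fits and comes back unchanged after sign extension, whereas 0x0011223344556677 or even 0x80000000 (which is positive as a 64-bit value but would read as negative in 32 bits) fails the check and would need movabsq.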
Data Movement Instructions
 Figures 3.5 and 3.6 document two classes of data movement instructions
for use when copying a smaller source value to a larger destination.
 All of these instructions copy data from a source, which can be either a
register or stored in memory, to a register destination.
 Instructions in the movz class fill out the remaining bytes of the
destination with zeros,
 while those in the movs class fill them out by sign extension,
 replicating copies of the most significant bit of the source operand.
Data Movement Instructions
 Figure 3.6 also documents the cltq instruction.
 This instruction has no operands:
 it always uses register %eax as its source and %rax as the destination for the
sign-extended result.
 It therefore has the exact same effect as the instruction movslq %eax, %rax,
but it has a more compact encoding.
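Zero extension, sign extension, and cltq can be modeled with a few lines of C. This is a sketch with hypothetical function names; the source and destination are plain 64-bit integers rather than real registers.

```c
#include <assert.h>
#include <stdint.h>

/* movz-style extension: keep the low `bytes` bytes of src and fill
   the remaining bytes of the 64-bit result with zeros. */
uint64_t zero_extend(uint64_t src, unsigned bytes) {
    assert(bytes == 1 || bytes == 2 || bytes == 4);
    uint64_t mask = (UINT64_C(1) << (bytes * 8)) - 1;
    return src & mask;
}

/* movs-style extension: keep the low `bytes` bytes of src and fill
   the remaining bytes with copies of the source's most significant bit. */
uint64_t sign_extend(uint64_t src, unsigned bytes) {
    assert(bytes == 1 || bytes == 2 || bytes == 4);
    unsigned bits = bytes * 8;
    uint64_t mask = (UINT64_C(1) << bits) - 1;   /* low-order bits */
    uint64_t sign = UINT64_C(1) << (bits - 1);   /* sign bit of the source */
    uint64_t low = src & mask;
    return (low & sign) ? (low | ~mask) : low;
}

/* cltq: sign extend %eax (the low 4 bytes of %rax) into all of %rax. */
uint64_t cltq(uint64_t rax) {
    return sign_extend(rax, 4);
}
```

For the byte 0xAA, zero extension gives 0x00000000000000AA and sign extension gives 0xFFFFFFFFFFFFFFAA, because the most significant bit of 0xAA is 1.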
Pushing and Popping Stack Data
 The final two data movement operations are used to push data onto and
pop data from the program stack.
 The stack plays a vital role in the handling of procedure calls.
 A stack is a data structure where values can be added or deleted, but only
according to a “last-in, first-out” discipline.
 We add data to a stack via a push operation and remove it via a pop
operation,
 with the property that the value popped will always be the value that was
most recently pushed and is still on the stack.
Pushing and Popping Stack Data
 A stack can be implemented as an array,
 where we always insert and remove elements from one end of the array.
 This end is called the top of the stack.
 With x86-64, the program stack is stored in some region of memory.
Pushing and Popping Stack Data
 Pushing a quad word value onto the stack involves
 first decrementing the stack pointer by 8 and then writing the value at
the new top-of-stack address.
 Therefore, the behavior of the instruction pushq %rbp is equivalent to
that of the pair of instructions
    subq $8,%rsp          Decrement stack pointer
    movq %rbp,(%rsp)      Store %rbp on stack
Pushing and Popping Stack Data
 Popping a quad word involves reading from the top-of-stack location
 and then incrementing the stack pointer by 8.
 Therefore, the instruction popq %rax is equivalent to the following pair of
instructions:
    movq (%rsp),%rax      Read %rax from stack
    addq $8,%rsp          Increment stack pointer
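The two-instruction equivalents of pushq and popq can be simulated in C. The sketch below is an illustration under stated assumptions: the struct machine, its 64-byte memory region, and all function names are hypothetical, and %rsp is kept as an offset into the simulated region rather than a real address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64  /* size of the simulated stack region (arbitrary) */

/* Simulated machine state: a byte-addressed memory region and %rsp,
   stored as an offset into that region. */
struct machine {
    uint8_t memory[STACK_BYTES];
    uint64_t rsp;
};

/* Start with an empty stack: %rsp points just past the region's end. */
void machine_init(struct machine *m) {
    memset(m->memory, 0, sizeof m->memory);
    m->rsp = STACK_BYTES;
}

/* pushq src: subq $8,%rsp followed by movq src,(%rsp). */
void pushq(struct machine *m, uint64_t src) {
    assert(m->rsp >= 8);                    /* room for one quad word */
    m->rsp -= 8;
    memcpy(&m->memory[m->rsp], &src, 8);
}

/* popq dest: movq (%rsp),dest followed by addq $8,%rsp.
   Returns the value that would be placed in dest. */
uint64_t popq(struct machine *m) {
    assert(m->rsp + 8 <= STACK_BYTES);      /* stack is not empty */
    uint64_t dest;
    memcpy(&dest, &m->memory[m->rsp], 8);
    m->rsp += 8;
    return dest;
}

/* movq offset(%rsp),dest: read a quad word at an offset from the
   stack pointer without changing %rsp. */
uint64_t read_stack(const struct machine *m, uint64_t offset) {
    assert(m->rsp + offset + 8 <= STACK_BYTES);
    uint64_t dest;
    memcpy(&dest, &m->memory[m->rsp + offset], 8);
    return dest;
}
```

After pushing 0x123 and then 0x456, popq returns 0x456 first, matching the last-in, first-out discipline, and read_stack(&m, 8) before any pop returns 0x123, the second quad word from the top.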
Pushing and Popping Stack Data
 For example, assuming the topmost element of the stack is a quad word,
 the instruction movq 8(%rsp),%rdx
 will copy the second quad word from the stack to register %rdx.
