Lecture 5
Programs
A Historical Perspective
The Intel processor line, referred to as x86, has followed a long
evolutionary development.
It started with one of the first single-chip 16-bit microprocessors,
where many compromises had to be made due to the limited capabilities
of integrated circuit technology at the time.
Since then, it has grown to take advantage of technology improvements
as well as to satisfy the demands for higher performance and for
supporting more advanced operating systems.
The list that follows shows some models of Intel processors and some of
their key features.
We use the number of transistors required to implement the processors
as an indication of how they have evolved in complexity.
In this list, “K” denotes 1,000 (10^3), “M” denotes 1,000,000 (10^6), and
“G” denotes 1,000,000,000 (10^9).
8086 (1978, 29 K transistors)
One of the first single-chip, 16-bit microprocessors.
The 8088, a variant of the 8086 with an 8-bit external bus,
formed the heart of the original IBM personal computers.
IBM contracted with then-tiny Microsoft to develop the MS-DOS
operating system.
The original models came with 32,768 bytes of memory and two floppy
drives (no hard drive).
In 1980, Intel introduced the 8087 floating-point coprocessor (45K
transistors)
to operate alongside an 8086 or 8088 processor,
executing the floating-point instructions.
The 8087 established the floating-point model for the x86 line, often
referred to as “x87.”
80286 (1982, 134 K transistors)
Added more (and now obsolete) addressing modes.
Formed the basis of the IBM PC-AT personal computer,
the original platform for MS Windows.
i386 (1985, 275 K transistors)
Expanded the architecture to 32 bits.
Added the flat addressing model used by Linux and recent versions of the
Windows operating system.
This was the first machine in the series that could fully support a Unix
operating system.
i486 (1989, 1.2 M transistors)
Improved performance and integrated the floating-point unit onto the
processor chip,
but did not significantly change the instruction set.
Pentium (1993, 3.1 M transistors)
Improved performance
but only added minor extensions to the instruction set.
PentiumPro (1995, 5.5 M transistors)
Introduced a radically new processor design,
internally known as the P6 microarchitecture.
Added a class of “conditional move” instructions to the instruction set.
Pentium/MMX (1997, 4.5 M transistors)
Added a new class of instructions to the Pentium processor for
manipulating vectors of integers.
Pentium II (1997, 7 M transistors)
Pentium III (1999, 8.2 M transistors)
Introduced a class of instructions for manipulating vectors of integer or
floating-point data.
Pentium 4 (2000, 42 M transistors)
Added new data types (including double-precision floating point),
along with 144 new instructions for these formats.
Pentium 4E (2004, 125 M transistors)
Added hyperthreading,
a method to run two programs simultaneously on a single processor.
Core 2 (2006, 291 M transistors)
First multi-core Intel microprocessor,
where multiple processors are implemented on a single chip.
Did not support hyperthreading.
Core i7, Nehalem (2008, 781 M transistors)
Incorporated both hyperthreading and multi-core,
with the initial version supporting two executing programs on each core
and up to four cores on each chip.
Core i7, Sandy Bridge (2011, 1.17 G transistors)
Core i7, Haswell (2013, 1.4 G transistors)
Over the years, several companies have produced processors
that are compatible with Intel processors,
capable of running the exact same machine-level programs.
Chief among these is Advanced Micro Devices (AMD).
Program Encodings
Suppose we write a C program as two files p1.c and p2.c.
We can then compile this code using a Unix command line:
linux> gcc -Og -o p p1.c p2.c
The command gcc indicates the gcc C compiler.
The command-line option -Og instructs the compiler
to apply a level of optimization that yields machine code that follows the
overall structure of the original C code.
In practice, higher levels of optimization (e.g., specified with the option
-O1 or -O2) are generally a better choice in terms of the performance of
the resulting program.
The gcc command invokes an entire sequence of programs to turn the
source code into executable code.
First, the C preprocessor expands the source code to include any files
specified with #include commands and to expand any macros, specified
with #define declarations.
Second, the compiler generates assembly code versions of the two source
files having names p1.s and p2.s.
Next, the assembler converts the assembly code into binary object-code
files p1.o and p2.o.
Finally, the linker merges these two object-code files along with code
implementing library functions (e.g., printf)
and generates the final executable code file p (as specified by the
command-line directive -o p).
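As a small illustration of the preprocessing step described above, the sketch below uses two hypothetical macros, SQUARE and BITS_PER_LONG, which are not part of p1.c or p2.c. After preprocessing, neither macro name remains in the code that the compiler proper sees; running gcc with the -E option stops after this step and prints the expanded source.

```c
#include <limits.h>   /* the header's text is pasted in here; it defines CHAR_BIT */

/* Hypothetical macros, for illustration only. */
#define SQUARE(x) ((x) * (x))
#define BITS_PER_LONG (sizeof(long) * CHAR_BIT)

/* After preprocessing, the return statement reads:
       return ((side) * (side));                      */
long area_of_square(long side) {
    return SQUARE(side);
}

/* After preprocessing, BITS_PER_LONG is replaced by its definition,
   and CHAR_BIT in turn by the value from <limits.h>. */
unsigned long bits_in_long(void) {
    return BITS_PER_LONG;
}
```

The parentheses around x in SQUARE matter because expansion is purely textual: without them, SQUARE(a + 1) would expand to a + 1 * a + 1.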
Machine-Level Code
Computer systems employ several different forms of abstraction,
hiding details of an implementation through the use of a simpler abstract
model.
Two of these are especially important for machine-level programming.
First, the format and behavior of a machine-level program is defined by
the instruction set architecture, or ISA.
Most ISAs, including x86-64, describe the behavior of a program as if each
instruction is executed in sequence,
with one instruction completing before the next one begins.
Second, the memory addresses used by a machine-level program are
virtual addresses,
providing a memory model that appears to be a very large byte array.
The actual implementation of the memory system involves a combination
of multiple hardware memories and operating system software.
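The byte-array view of memory can be observed from C itself. The following is a minimal sketch, with hypothetical helper names bytes_of_long and long_from_bytes, that reads the individual bytes of a long through an unsigned char pointer and then rebuilds the value from them.

```c
#include <stddef.h>
#include <string.h>

/* Copy the bytes that make up a long into buf, lowest address first.
   Any object may be inspected through an unsigned char pointer. */
void bytes_of_long(long value, unsigned char buf[sizeof(long)]) {
    const unsigned char *p = (const unsigned char *)&value;
    for (size_t i = 0; i < sizeof(long); i++) {
        buf[i] = p[i];   /* p + i is the virtual address of byte i */
    }
}

/* Rebuild the long from its bytes. */
long long_from_bytes(const unsigned char buf[sizeof(long)]) {
    long value;
    memcpy(&value, buf, sizeof value);
    return value;
}
```

The order of the bytes in buf depends on the machine. On x86-64, which is little-endian, the least significant byte is stored at the lowest address.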
The machine code for x86-64 differs greatly from the original C code.
Parts of the processor state are visible that normally are hidden from the
C programmer:
The program counter (commonly referred to as the PC, and called %rip in
x86-64)
indicates the address in memory of the next instruction to be executed.
The integer register file contains 16 named locations storing 64-bit values.
These registers can hold addresses (corresponding to C pointers) or
integer data.
The condition code registers hold status information about the most
recently executed arithmetic or logical instruction.
These are used to implement conditional changes in the control or data
flow, such as is required to implement if and while statements.
A set of vector registers can each hold one or more integer or
floating-point values.
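To connect the condition codes described above with C control flow, the sketch below shows one if statement and one while loop. The function names absdiff and count_ones are hypothetical, and the comments describe what a compiler typically emits rather than guaranteed output.

```c
/* Absolute difference. The test x < y typically becomes a compare
   instruction that sets the condition codes, followed by a conditional
   jump (or conditional move) that reads them. */
long absdiff(long x, long y) {
    if (x < y) {
        return y - x;
    }
    return x - y;
}

/* Count the 1 bits in v. The loop test v != 0 is evaluated before each
   iteration, again through the condition codes and a conditional jump. */
int count_ones(unsigned long v) {
    int n = 0;
    while (v != 0) {
        n += (int)(v & 1UL);
        v >>= 1;
    }
    return n;
}
```

Compiling such a file with gcc -Og -S and reading the resulting .s file is one way to see which comparison and jump instructions were actually chosen.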
Aggregate data types in C such as arrays and structures are represented in
machine code as contiguous collections of bytes.
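The contiguous layout of aggregates can be checked with address arithmetic. This is a minimal sketch; struct point and the helper names element_stride and offset_of_y are hypothetical and serve only as an illustration.

```c
#include <stddef.h>

/* Hypothetical record type, used only for illustration. */
struct point {
    long x;
    long y;
};

/* Distance in bytes between element i and element i + 1 of an array.
   For a long array this is sizeof(long): elements sit back to back. */
size_t element_stride(const long *a, size_t i) {
    return (size_t)((const char *)&a[i + 1] - (const char *)&a[i]);
}

/* Byte offset of field y from the start of a struct point. */
size_t offset_of_y(void) {
    return offsetof(struct point, y);
}
```

At the machine level, a[i] is simply the bytes found at the array's starting address plus i times the element size, and a structure field is the bytes found at a fixed offset from the start of the structure.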
Code Examples
Suppose we write a C code file mstore.c containing the following function
definition:
long mult2(long, long);
void multstore(long x, long y, long *dest) {
    long t = mult2(x, y);
    *dest = t;
}
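The lecture only declares mult2, so mstore.c cannot be linked into a running program on its own. The sketch below supplies a hypothetical definition (plain multiplication, which is an assumption) together with a small hypothetical driver, run_multstore, so that the effect of multstore can be observed.

```c
/* Hypothetical definition of mult2; the lecture only declares it. */
long mult2(long a, long b) {
    return a * b;
}

/* Same function as in mstore.c. */
void multstore(long x, long y, long *dest) {
    long t = mult2(x, y);
    *dest = t;
}

/* Hypothetical driver: returns the value multstore writes through dest. */
long run_multstore(long x, long y) {
    long d = 0;
    multstore(x, y, &d);
    return d;
}
```

With this definition, run_multstore(2, 3) stores 6 into d through the pointer and returns it.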
To see the assembly code generated by the C compiler,
we can use the -S option on the command line:
linux> gcc -Og -S mstore.c
This will cause gcc to run the compiler, generating an assembly file
mstore.s, and go no further.
Normally it would then invoke the assembler to generate an object-code
file.
The assembly-code file contains various declarations, including the
following set of lines:
multstore:
    pushq %rbx
    movq %rdx, %rbx
    call mult2
    movq %rax, (%rbx)
    popq %rbx
    ret
Each indented line in the code corresponds to a single machine
instruction.
For example, the pushq instruction indicates that the contents of register
%rbx should be pushed onto the program stack.
All information about local variable names or data types has been stripped
away.
If we use the -c command-line option, gcc will both compile and assemble
the code:
linux> gcc -Og -c mstore.c
This will generate an object-code file mstore.o
that is in binary format and hence cannot be viewed directly.
To inspect the contents of machine-code files,
a class of programs known as disassemblers can be invaluable.
These programs generate a format similar to assembly code from the
machine code.
With Linux systems, the program objdump (for “object dump”) can serve
this role given the -d command-line flag:
linux> objdump -d mstore.o
Notes on Formatting
The assembly code generated by gcc is difficult for a human to read.
For example, suppose we give the command to generate the file mstore.s:
linux> gcc -Og -S mstore.c
The full content of the file is as follows:
All of the lines beginning with ‘.’ are directives to guide the assembler and
linker.
We can generally ignore these.
For our example, an annotated version would appear as follows: