Chapter-4 Microprocessors
Microprocessor History
A microprocessor -- also known as a CPU or central processing unit -- is a complete
computation engine that is fabricated on a single chip. The first microprocessor was the
Intel 4004, introduced in 1971. The 4004 was not very powerful -- all it could do was add
and subtract, and it could only do that 4 bits at a time. But it was amazing that everything
was on one chip. Prior to the 4004, engineers built computers either from collections of
chips or from discrete components (transistors wired one at a time). The 4004 powered
one of the first portable electronic calculators.
The first microprocessor to make it into a home computer was the Intel 8080, a complete
8-bit computer on one chip, introduced in 1974. The first microprocessor to make a real
splash in the market was the Intel 8088, introduced in 1979 and incorporated into the
IBM PC (which first appeared in 1981). If you are familiar with the PC market and
its history, you know that the PC market moved from the 8088 to the 80286 to the 80386
to the 80486 to the Pentium to the Pentium II to the Pentium III to the Pentium 4. All of
these microprocessors are made by Intel and all of them are improvements on the basic
design of the 8088. The Pentium 4 can execute any piece of code that ran on the original
8088, but it does it about 5,000 times faster!
A chip is also called an integrated circuit. Generally it is a small, thin piece of silicon
onto which the transistors making up the microprocessor have been etched. A chip might
be as large as an inch on a side and can contain tens of millions of transistors. Simpler
processors might consist of a few thousand transistors etched onto a chip just a few
millimeters square.
The following table helps you to understand the differences between the different
processors that Intel has introduced over the years.
Name         Date  Transistors  Microns  Clock speed  Data width           MIPS
8080         1974  6,000        6        2 MHz        8 bits               0.64
8088         1979  29,000       3        5 MHz        16 bits, 8-bit bus   0.33
80286        1982  134,000      1.5      6 MHz        16 bits              1
80386        1985  275,000      1.5      16 MHz       32 bits              5
80486        1989  1,200,000    1        25 MHz       32 bits              20
Pentium      1993  3,100,000    0.8      60 MHz       32 bits, 64-bit bus  100
Pentium II   1997  7,500,000    0.35     233 MHz      32 bits, 64-bit bus  ~300
Pentium III  1999  9,500,000    0.25     450 MHz      32 bits, 64-bit bus  ~510
Pentium 4    2000  42,000,000   0.18     1.5 GHz      32 bits, 64-bit bus  ~1,700
• The date is the year that the processor was first introduced. Many processors are
re-introduced at higher clock speeds for many years after the original release date.
• Transistors are the number of transistors on the chip. You can see that the
number of transistors on a single chip has risen steadily over the years.
• Microns is the width, in microns, of the smallest wire on the chip. For
comparison, a human hair is 100 microns thick. As the feature size on the chip
goes down, the number of transistors rises.
• Clock speed is the maximum rate that the chip can be clocked at. Clock speed
will make more sense in the next section.
• Data Width is the width of the ALU. An 8-bit ALU can
add/subtract/multiply/etc. two 8-bit numbers, while a 32-bit ALU can manipulate
32-bit numbers. An 8-bit ALU would have to execute four instructions to add two
32-bit numbers, while a 32-bit ALU can do it in one instruction. In many cases,
the external data bus is the same width as the ALU, but not always. The 8088 had
a 16-bit ALU and an 8-bit bus, while the modern Pentiums fetch data 64 bits at a
time for their 32-bit ALUs.
• MIPS stands for "millions of instructions per second" and is a rough measure of
the performance of a CPU.
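The Data Width point can be illustrated with a minimal sketch. It assumes a machine that can only add 8 bits at a time and keeps a one-bit carry between steps; the function name add32_with_8bit_alu is hypothetical and does not model any particular Intel chip.

```python
def add32_with_8bit_alu(a, b):
    """Add two 32-bit values using only 8-bit additions plus a carry flag."""
    result = 0
    carry = 0
    steps = 0
    for byte_index in range(4):          # least significant byte first
        shift = 8 * byte_index
        a_byte = (a >> shift) & 0xFF
        b_byte = (b >> shift) & 0xFF
        total = a_byte + b_byte + carry  # one 8-bit add-with-carry step
        result |= (total & 0xFF) << shift
        carry = total >> 8               # carry out feeds the next byte
        steps += 1
    return result, steps                 # a final carry out is discarded


total, steps = add32_with_8bit_alu(0x0001FFFF, 0x00000001)
print(hex(total), steps)
```

The loop runs four times, once per byte, which corresponds to the four instructions mentioned above; a 32-bit ALU would do the same work in a single step.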
From this table you can see that, in general, there is a relationship between clock
speed and MIPS. The maximum clock speed is a function of the manufacturing
process and delays within the chip. There is also a relationship between the number of
transistors and MIPS. For example, the 8088 clocked at 5 MHz but only executed at
0.33 MIPS (about one instruction per 15 clock cycles). Modern processors can often
execute at a rate of two instructions per clock cycle. That improvement is directly
related to the number of transistors on the chip and will make more sense in the next
section.
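The "one instruction per 15 clock cycles" figure follows from dividing the clock speed by the MIPS rating. The sketch below redoes that arithmetic with values taken from the table; the function names are illustrative only, and MIPS is itself a rough measure, so the results are approximate.

```python
def cycles_per_instruction(clock_mhz, mips):
    """Average clock cycles spent per instruction."""
    return clock_mhz / mips


def instructions_per_cycle(clock_mhz, mips):
    """Average instructions completed per clock cycle."""
    return mips / clock_mhz


# 8088: 5 MHz and 0.33 MIPS (values from the table above)
print(round(cycles_per_instruction(5, 0.33), 1))

# Pentium 4: 1.5 GHz (1,500 MHz) and roughly 1,700 MIPS (values from the table above)
print(round(instructions_per_cycle(1500, 1700), 2))
```

For the 8088 this gives about 15 cycles per instruction. For the Pentium 4 row it gives slightly more than one instruction per cycle.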
Inside a Microprocessor
To understand how a microprocessor works, it is helpful to look inside and learn about
the logic used to create one. In the process you can also learn about assembly language --
the native language of a microprocessor -- and many of the things that engineers can do
to boost the speed of a processor.
There may be very sophisticated things that a microprocessor does, but those are its three
basic activities.
Microprocessor Instructions
Even the incredibly simple microprocessor shown in the previous example will have a
fairly large set of instructions that it can perform. The collection of instructions is
implemented as bit patterns, each one of which has a different meaning when loaded into
the instruction register. Humans are not particularly good at remembering bit patterns, so
a set of short words are defined to represent the different bit patterns. This collection of
words is called the assembly language of the processor. An assembler can translate the
words into their bit patterns very easily, and then the output of the assembler is placed in
memory for the microprocessor to execute.
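As a rough sketch of what an assembler does, the following translates a few short words into bit patterns. The instruction names and opcode values here are invented for illustration; they are not the instruction set of the 8088 or of any real processor.

```python
# Hypothetical instruction set: each mnemonic maps to a made-up opcode.
OPCODES = {
    "LOADA": 0b0001,  # load a value into register A
    "LOADB": 0b0010,  # load a value into register B
    "ADD":   0b0011,  # add A and B
    "STOP":  0b1111,  # halt
}


def assemble(source):
    """Translate lines such as 'LOADA 12' into a list of byte values."""
    program = []
    for line in source.splitlines():
        parts = line.split()
        if not parts:
            continue                       # skip blank lines
        mnemonic = parts[0].upper()
        program.append(OPCODES[mnemonic])  # the word becomes its bit pattern
        if len(parts) > 1:
            program.append(int(parts[1]) & 0xFF)  # optional one-byte operand
    return program


machine_code = assemble("LOADA 12\nLOADB 30\nADD\nSTOP")
print([format(byte, "08b") for byte in machine_code])
```

The resulting list of bytes is what would be placed in memory for the processor to fetch and execute.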
Microprocessor Performance
The number of transistors available has a huge effect on the performance of a processor.
As seen earlier, a typical instruction in a processor like an 8088 took 15 clock cycles to
execute. Because of the design of the multiplier, it took approximately 80 cycles just to
do one 16-bit multiplication on the 8088. With more transistors, much more powerful
multipliers capable of single-cycle speeds become possible.
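One way to see why multiplication can cost many cycles is the classic shift-and-add method, sketched below. This is an assumption made for illustration: the text does not say how the 8088's multiplier was built, and the sketch counts loop iterations, not real clock cycles.

```python
def shift_add_multiply(a, b, bits=16):
    """Multiply two unsigned integers one bit of b at a time."""
    product = 0
    iterations = 0
    for i in range(bits):
        if (b >> i) & 1:        # examine bit i of the multiplier
            product += a << i   # add the shifted multiplicand
        iterations += 1         # one pass per bit, whether or not we add
    return product, iterations


print(shift_add_multiply(300, 200))
```

A 16-bit multiply needs 16 passes through the loop, and each pass involves a test, a shift, and possibly an add. A hardware multiplier built from many more transistors can handle all of the bits in parallel.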
Many modern processors have multiple instruction decoders, each with its own pipeline.
This allows for multiple instruction streams, which means that more than one instruction
can complete during each clock cycle. This technique can be quite complex to
implement, so it takes lots of transistors.
Intel Microprocessors
8086/8088 (1978/1979)
The 29,000-transistor 8086 marked the first 16-bit microprocessor - that is, there are 16
data bits available from the CPU itself. This immediately offered twice the data
throughput of earlier 8-bit CPUs. Each of the 24 registers in the 8086/8088 is expanded to
16 bits, rather than just 8. Twenty address lines allow direct access to 1,048,576 bytes
(1MB) of external system memory. Although 1MB of RAM is considered almost
negligible today, IC designers at the time never suspected that more than 1MB would
ever be needed. Both the 8086 and 8088 (as well as all subsequent Intel CPUs) can
address 64KB of I/O space (as opposed to RAM space). The 8086 was available in four
clock speeds: 5MHz, 6MHz, 8MHz, and 10MHz. At 5MHz, 8MHz, and 10MHz, the 8086 could
process 0.33, 0.66, and 0.75 MIPS (Millions of Instructions Per Second), respectively.
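The 1MB limit follows directly from the number of address lines: each line carries one bit, so n lines can select 2 to the power n distinct byte addresses. A short sketch of that arithmetic (the function name is illustrative):

```python
def addressable_bytes(address_lines):
    """Number of distinct byte addresses selectable with the given lines."""
    return 2 ** address_lines


for lines in (20, 24, 32):
    size = addressable_bytes(lines)
    print(lines, "address lines ->", size, "bytes =", size // 2**20, "MB")
```

Twenty lines give 1,048,576 bytes (1MB). The 24-line and 32-line cases give 16MB and 4GB (4,096MB), the limits quoted in this chapter for the 80286 and the 80386DX.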
80186 (1980)
The 16-bit 80186 built on the x86 foundation to offer additional features, such as an
internal clock generator, system controller, interrupt controller, Direct Memory Access
(DMA) controller, and timer/counter circuitry right on the CPU itself. No Intel CPU
before or since has offered so much integration in a single CPU. The 80186 was also first
to abandon 5MHz clock speeds in favor of 8MHz, 10MHz, and 12.5MHz. Aside from
these advances, however, the 80186 remained similar to the 8086/8088 with 24 registers
and 20 address lines to access up to 1MB of RAM. The 80186 was used as a CPU in
embedded applications and never saw service in personal computers. The limitations of
the early x86 architecture in the PC demanded a much faster CPU capable of accessing
far more than 1MB of RAM.
80286 (1982)
The 24-register, 134,000-transistor 80286 CPU (first used in the IBM PC/AT and
compatibles) offered some substantial advantages over older CPUs. Design advances
allow the i286 to operate at 1.2 MIPS, 1.5 MIPS, and 2.66 MIPS (for 8, 10, and
12.5MHz, respectively). The i286 also breaks the 1MB RAM barrier by offering 24
address lines, instead of 20, which allow it to directly address 16MB of RAM. In addition
to 16MB of directly accessible RAM, the i286 can handle up to 1GB (gigabyte) of
virtual memory, which allows blocks of program code and data to be swapped between
the i286’s real memory (up to 16MB) and a secondary (or “virtual”) storage location,
such as a hard disk. To maintain backward compatibility with the 8086/8088, which can
only address 1MB of RAM, the i286 can operate in a real mode. One of the great failings
of the i286 is that it can switch from real-mode to protected-mode, but it cannot switch
back to real-mode without a warm reboot of the system. The i286 uses a stand-alone math
co-processor, the 80287.
80386 (1985–1990)
The next major microprocessor released by Intel was the 275,000-transistor, 32-register,
80386DX CPU in 1985. With a full 32-bit data bus, data throughput is immediately
double that of the 80286. The 16, 20, 25, and 33MHz versions allow data throughput up
to 50MB/s and processing power up to 11.4 MIPS at 33MHz. A full 32-bit address bus
allows direct access to an unprecedented 4GB of RAM in addition to a staggering 64TB
(terabytes) of virtual memory capacity. The i386 was the first Intel CPU to enhance
processing through the use of instruction pipelining, which allows the CPU to start
working on a new instruction while waiting for the current instruction to finish. A new
operating mode (called the virtual real-mode) enables the CPU to run several real-mode
sessions simultaneously under operating systems such as Windows.
Intel took a small step backward in 1988 to produce the 80386SX CPU. The i386SX uses
24 address lines for 16MB of addressable RAM and an external data bus of 16 bits,
instead of full 32 bits from the DX. Correspondingly, the processing power for the
i386SX is only 3.6 MIPS at 33MHz. In spite of these compromises, this offered a
significantly less-expensive CPU, which helped to propagate the i386 family into desktop
and portable computers. Aside from changes to the address and bus width, the i386SX
architecture is virtually unchanged from that of the i386DX. By 1990, Intel integrated the
i386 into an 855,000-transistor, low-power version, called the 80386SL. The i386SL
incorporated an ISA-compatible chip set along with power-management circuitry that
optimized the i386 for use in mobile computers. The i386SL resembled the i386SX
version in its 24 address lines and 16-bit external data bus.
Each member of the i386 family uses stand-alone math co-processors (80387DX,
80387SX, and 80387SL, respectively). All versions of the 80386 can switch between
real-mode and protected-mode, as needed, so they will run the same software as (and are
backwardly compatible with) the 80286 and the 8086/8088.
80486 (1989–1994)
The consistent push for higher speed and performance resulted in the development of
Intel’s 1.2 million-transistor, 29-register, 32-bit microprocessor, called the 80486DX, in
1989. The i486DX provides full 32-bit addressing for access to 4GB of physical RAM
and up to 64TB (terabytes) of virtual memory. The i486DX offers twice the performance
of the i386DX with 26.9 MIPS at 33MHz. Two initial versions (25 and 33MHz) were
available. As with the i386 family, the i486 series uses pipelining to improve instruction
execution, but the i486 series also adds 8KB of cache memory right on the IC. Cache
saves memory access time by predicting the next instructions that will be needed by the
CPU and loading them into the cache memory before the CPU actually needs them. If the
needed instruction is indeed in cache, the CPU can access the information from cache
without wasting time waiting for memory access. Another improvement of the i486DX is
the inclusion of a floating-point unit (an MCP) in the CPU itself, rather than requiring a
separate co-processor IC. This is not true of all members of the i486 family, however. A
third departure for the i486DX is that it is offered in 5- and 3-V versions. The 3-V
version is intended for laptop, notebook, and other low-power mobile computing
applications. Finally, the i486DX is upgradeable. Up to 1989/1990, personal computers
were limited by their CPU - when the CPU became obsolete, so did the computer (more
specifically the motherboard).
Pentium Pro processors could handle four pipelines simultaneously and, as a result, were
capable of performing the equivalent of three simultaneous processes.
In addition to a 16K L1 cache, the Pentium Pro processor also had an onboard L2 cache
of 256K, 512K, or 1MB. This new onboard L2 cache ran at the same speed as the
processor and, thus, provided an excellent boost in processing efficiency.
Celerons, the architecture owed little to the Pentium Pro design, and was new from the
ground up. Notable with the introduction of the Pentium 4 was the very fast 400 MHz
FSB; it was actually a 100 MHz quad-pumped bus, but the theoretical bandwidth was 4x
that of a normal bus, and so it was considered to run at 400 MHz. The fastest competition
was running at 266 MHz (133 MHz double-pumped).
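The "400 MHz" and "266 MHz" figures are effective rates: the base clock multiplied by the number of transfers per clock. The sketch below redoes that arithmetic. The peak-bandwidth step assumes the 64-bit data bus listed for the Pentium 4 in the table earlier in this chapter, and both function names are illustrative.

```python
def effective_rate_mhz(base_clock_mhz, transfers_per_clock):
    """Effective transfer rate of a multi-pumped bus, in MHz."""
    return base_clock_mhz * transfers_per_clock


def peak_bandwidth_mb_per_s(effective_mhz, bus_width_bits):
    """Theoretical peak bandwidth, in millions of bytes per second."""
    return effective_mhz * bus_width_bits // 8


print(effective_rate_mhz(100, 4))        # quad-pumped 100 MHz bus
print(effective_rate_mhz(133, 2))        # double-pumped 133 MHz bus
print(peak_bandwidth_mb_per_s(400, 64))  # assumes a 64-bit data bus
```

These are theoretical peaks; sustained throughput in a real system would be lower.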
To the surprise of most industry observers, the Pentium 4 did not improve on the old P6
design in either of the normal two key performance measures: integer processing speed or
floating-point performance. Instead, it sacrificed per-cycle performance in order to gain
two things: very high clock speeds, and SSE performance. As is traditional with Intel's
flagship chips, the Pentium 4 also comes in a low-end Celeron version (often referred to
as Celeron 4) and a high-end Xeon version intended for SMP configurations.
The design goal of the Pentium 4 was to easily scale to fast clock speeds because
consumers were beginning to purchase computers based on GHz ratings. Intel used a
deep instruction pipeline to implement this goal, which reduced the amount of real work
that the Pentium 4 could do per clock cycle compared to other CPUs like the Pentium III
and Athlon.
The Pentium 4 design was initially expected to scale to 10 GHz, but it had unsolvable
thermal problems at 4 GHz. Intel abandoned development of the Pentium 4 in mid-2005
to focus on the cooler running Pentium M, which was repositioned for the desktop
computer and small server markets. In retrospect, the Pentium III core was
technologically superior to Pentium 4. Only the system bus of the Pentium 4 is still used
in current system designs.
AMD Processors
Since the days of the Intel 486, Advanced Micro Devices (AMD) has released CPUs to
compete with Intel products. In more recent days, AMD has arguably caught up and, in
some cases, surpassed Intel in the capability of its products, a feat no one (outside of AMD)
would have thought possible even a year or two ago. Let's check out AMD's A+ offerings.
AMD K5 Processors
The AMD K5 processor was released by AMD in 1995 to compete with the Pentium
processor. The K5 processor had a 296-pin PGA and used Socket 7 to connect to the
motherboard. This processor was available in 75, 90, 100, and 116 MHz and required you
to use an active heat sink.
The AMD K5 processor had a 64-bit-wide data bus, and its 32-bit address bus allowed it
to address up to 4GB of RAM. This processor used 3.52 volts DC and only
supported 8K of L1 cache. Aside from markings on the chip, the K5 looked identical to a
Pentium.
Cyrix Processors
Cyrix has produced competitive CPUs since the days of the 486
(the predecessor to the Pentium), but concentrates on the lower
end of the CPU spectrum. Cyrix had a worthy, pin-compatible
alternative to the Pentium, called the 6x86 line, that ranged in
P-Rating (that is, effective clock speed) from 166 MHz to 233
MHz. The chipset giant VIA Technologies bought Cyrix a few
years ago and has continued to release lower-end CPUs under
the Cyrix label, such as the Cyrix M-II and VIA C3. Both CPUs
are pin-compatible with Intel CPUs. The M-II plugs into later
Socket 7 motherboards and the C3 plugs into Socket 370 motherboards.