Computer Architecture Lecture1
• Control unit: Controls the operation of the CPU and hence the
computer.
• Arithmetic and logic unit (ALU): Performs the computer’s data
processing functions.
• Registers: Provides storage internal to the CPU.
• CPU interconnection: Some mechanism that provides for
communication among the control unit, ALU, and registers.
Why study Computer Organization and Architecture
The second generation also saw the appearance of the Digital Equipment
Corporation (DEC).
DEC was founded in 1957 and, in that year, delivered its first computer.
This computer and this company began the minicomputer phenomenon that
would become so prominent in the third generation.
The Third Generation: Integrated Circuits
The transistor is an example of a discrete component. Throughout the 1950s and
early 1960s, electronic equipment was composed largely of discrete
components: transistors, resistors, capacitors, and so on. Discrete
components were manufactured separately and packaged in their own
containers.
These facts of life were beginning to create problems in the computer
industry. Early second-generation computers contained about 10,000
transistors, making the manufacture of newer, more powerful
machines increasingly difficult.
In 1958 came the achievement that revolutionized electronics and
started the era of microelectronics: the invention of the integrated
circuit. It is the integrated circuit that defines the third generation
of computers.
The basic elements of a digital computer, as we know, must perform
storage, movement, processing, and control functions.
Only two fundamental types of components are required: gates and
memory cells. A gate is a device that implements a simple Boolean or
logical function, such as IF A AND B ARE TRUE THEN C IS TRUE (AND
gate). Such devices are called gates because they control data flow in
much the same way that canal gates control the flow of water. The
memory cell is a device that can store one bit of data.
By interconnecting large numbers of these fundamental devices, we can
construct a computer. We can relate this to our four basic functions as
follows:
• Data storage: Provided by memory cells.
• Data processing: Provided by gates.
• Data movement: The paths among components are used to move data
from memory to memory and from memory through gates to memory.
• Control: The paths among components can carry control signals. For
example, a gate will have one or two data inputs plus a control signal input
that activates the gate. When the control signal is ON, the gate performs its
function on the data inputs and produces a data output.
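The gate-plus-control-signal behavior above can be sketched in Python as a toy model (an informal illustration, not real hardware; the names `and_gate` and `MemoryCell` are invented for this example):

```python
def and_gate(a, b, control):
    """AND gate with a control-signal input: it only performs its
    function on the data inputs when the control signal is ON."""
    if not control:
        return False  # gate inactive: no output produced
    return a and b    # IF A AND B ARE TRUE THEN C IS TRUE

class MemoryCell:
    """A device that can store one bit of data."""
    def __init__(self):
        self.bit = False
    def write(self, value):
        self.bit = value   # data storage
    def read(self):
        return self.bit

# Data movement + control: route a gate's output into a memory cell.
cell = MemoryCell()
cell.write(and_gate(True, True, control=True))
print(cell.read())  # True
```

Interconnecting many such gates and cells, with paths for data and control signals, is exactly the construction described above.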
Later Generations
Beyond the third generation there is less general agreement on
defining generations of computers.
With the introduction of large-scale integration (LSI), more than 1000
components can be placed on a single integrated circuit chip. Very-
large-scale integration (VLSI) achieved more than 10,000 components
per chip, while current ultra-large-scale integration (ULSI) chips can
contain more than one billion components.
With the rapid pace of technology, the high rate of introduction of new
products, and the importance of software and communications as well
as hardware, the classification by generation becomes less clear and
less meaningful. It could be said that the commercial application of
new developments resulted in a major change in the early 1970s and
that the results of these changes are still being worked out.
SEMICONDUCTOR MEMORY The first application of integrated circuit
technology to computers was the construction of the processor (the
control unit and the arithmetic and logic unit).
In the 1950s and 1960s, most computer memory was constructed from tiny
rings of ferromagnetic material. These rings were strung up on grids of
fine wires suspended on small screens inside the computer.
Then, in 1970, the first relatively capacious semiconductor memory was
produced. This chip, about the size of a single core, could hold 256 bits of
memory.
In 1974, a seminal event occurred: the price per bit of semiconductor
memory dropped below the price per bit of core memory.
Developments in memory technology, together with developments in
processor technology, changed the nature of computers in less than a
decade.
DESIGNING FOR PERFORMANCE
Year by year, the cost of computer systems continues to drop
dramatically, while the performance and capacity of those systems
continue to rise equally dramatically.
Today’s laptops have the computing power of an IBM mainframe from
10 or 15 years ago. Thus, we have virtually “free” computer power.
For example, desktop applications that require the great power of
today’s microprocessor-based systems include:
• Image processing
• Speech recognition
• Videoconferencing
• Multimedia authoring
• Voice and video annotation of files
• Simulation modeling
Microprocessor Speed
Chipmakers can unleash a new generation of chips every three years,
each with four times as many transistors. In microprocessors, the
addition of new circuits, and the speed boost that comes from reducing
the distances between them, has improved performance four- or fivefold
every three years or so since Intel launched its x86 family in 1978.
While the chipmakers have been busy learning how to fabricate chips of
greater and greater density, the processor designers must come up
with ever more elaborate techniques for feeding the monster. Among
the techniques built into contemporary processors are the following:
• Pipelining
• Branch prediction
• Data flow analysis
• Speculative execution
• Pipelining: With pipelining, a processor can simultaneously work on
multiple instructions. The processor overlaps operations by moving
data or instructions into a conceptual pipe with all stages of the pipe
processing simultaneously. For example, while one instruction is
being executed, the computer is decoding the next instruction.
• Branch prediction: The processor looks ahead in the instruction code
fetched from memory and predicts which branches, or groups of
instructions, are likely to be processed next. If the processor guesses
right most of the time, it can prefetch the correct instructions and
buffer them so that the processor is kept busy.
• Data flow analysis: The processor analyzes which instructions are
dependent on each other’s results, or data, to create an optimized
schedule of instructions. In fact, instructions are scheduled to be executed
when ready, independent of the original program order. This prevents
unnecessary delay.
• Speculative execution: Using branch prediction and data flow analysis,
some processors speculatively execute instructions ahead of their actual
appearance in the program execution, holding the results in temporary
locations. This enables the processor to keep its execution engines as busy
as possible by executing instructions that are likely to be needed.
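The overlap described under pipelining can be made concrete with a small timing sketch (idealized: a fixed number of stages, no stalls; the function names are invented for this illustration):

```python
def cycles_sequential(n_instructions, n_stages):
    # Without pipelining, each instruction passes through every stage
    # before the next instruction starts.
    return n_instructions * n_stages

def cycles_pipelined(n_instructions, n_stages):
    # With pipelining, the first instruction fills the pipe (n_stages
    # cycles); every later instruction completes one cycle after the
    # previous one.
    return n_stages + (n_instructions - 1)

print(cycles_sequential(100, 5))  # 500
print(cycles_pipelined(100, 5))   # 104
```

For 100 instructions through a 5-stage pipe, overlapping the stages cuts the cycle count from 500 to 104, which is why pipelining is so effective.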
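One classic way a processor can "guess right most of the time" on branches is a 2-bit saturating counter; the text does not say which scheme real processors use, so this is only an illustrative sketch:

```python
class TwoBitPredictor:
    """2-bit saturating counter: states 0-1 predict not-taken,
    states 2-3 predict taken; each actual outcome nudges the counter."""
    def __init__(self):
        self.state = 0

    def predict(self):
        return self.state >= 2    # True means "predict taken"

    def update(self, taken):
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)

# A loop-like branch: mostly taken, occasionally not taken.
predictor = TwoBitPredictor()
correct = 0
outcomes = [True, True, True, False, True, True]
for taken in outcomes:
    correct += (predictor.predict() == taken)
    predictor.update(taken)
# Once warmed up, the predictor keeps predicting "taken" and a single
# not-taken outcome does not flip it, so later iterations are correct.
```

The two-bit hysteresis is the point: one mispredicted loop exit does not destroy the prediction for the next loop iteration.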
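Data flow analysis can likewise be sketched as dependency-driven scheduling: an instruction issues as soon as the registers it reads are ready, independent of program order (a hypothetical three-instruction example with made-up register names):

```python
# Program order is i1, i2, i3 -- but i3 depends on nothing.
instructions = {
    "i1": {"writes": "r1", "reads": []},       # r1 = load A
    "i2": {"writes": "r2", "reads": ["r1"]},   # r2 = r1 + 1 (needs i1)
    "i3": {"writes": "r3", "reads": []},       # r3 = load B (independent)
}

pending = dict(instructions)
ready_regs = set()
cycles = []
while pending:
    # Issue every pending instruction whose input registers are ready.
    issued = [name for name, ins in pending.items()
              if all(r in ready_regs for r in ins["reads"])]
    cycles.append(issued)
    for name in issued:
        ready_regs.add(pending[name]["writes"])
        del pending[name]

print(cycles)  # [['i1', 'i3'], ['i2']] -- i3 issues ahead of i2
```

Because i3 depends on no earlier result, the scheduler issues it alongside i1, ahead of i2's position in the original program order; that is exactly the "prevents unnecessary delay" described above.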
Performance Balance
Performance balance: adjusting the organization and architecture to
compensate for mismatches among the capabilities of the various
components.
Nowhere is the problem created by such mismatches more critical
than in the interface between processor and main memory. While
processor speed has grown rapidly, the speed with which data can be
transferred between main memory and the processor has lagged badly.
The interface between processor and main memory is the most crucial
pathway in the entire computer because it is responsible for carrying a
constant flow of program instructions and data between memory chips
and the processor. If memory or the pathway fails to keep pace with
the processor’s insistent demands, the processor stalls in a wait state,
and valuable processing time is lost.
(Solutions to the problem)
A system architect can attack this problem in a number of ways, all of
which are reflected in contemporary computer designs. Consider the
following examples:
• Increase the number of bits that are retrieved at one time by making DRAMs
“wider” rather than “deeper,” and by using wide bus data paths.
• Change the DRAM interface to make it more efficient by including a cache or other
buffering scheme on the DRAM chip.
• Reduce the frequency of memory access by incorporating increasingly complex and
efficient cache structures between the processor and main memory.
• Increase the interconnect bandwidth between processors and memory by using
higher-speed buses and a hierarchy of buses to buffer and structure data flow.
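The third bullet (cache structures between processor and main memory) can be illustrated with a tiny direct-mapped cache model; the sizes and the access pattern are invented for this example:

```python
CACHE_LINES = 4
cache = [None] * CACHE_LINES   # each line remembers the address it holds
hits = misses = 0

# Repeatedly touching the same few addresses (good locality).
accesses = [0, 1, 2, 0, 1, 2, 0, 1, 2]
for addr in accesses:
    line = addr % CACHE_LINES  # direct-mapped placement
    if cache[line] == addr:
        hits += 1              # fast: served from the cache
    else:
        misses += 1            # slow: go to main memory, then fill the line
        cache[line] = addr

print(hits, misses)  # 6 3 -- only the first touch of each address goes to memory
```

Six of the nine accesses are satisfied by the cache, so the processor waits on main memory only three times; larger and smarter caches push this hit rate higher still.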
Improvements in Chip Organization and Architecture
• Increase the size and speed of caches that are interposed between the
processor and main memory. In particular, by dedicating a portion of
the processor chip itself to the cache, cache access times drop
significantly.
• Make changes to the processor organization and architecture that
increase the effective speed of instruction execution.
Beginning in the late 1980s, and continuing for about 15 years, two main
strategies were used to increase performance beyond what could be
achieved simply by increasing clock speed. First, there was an increase in
cache capacity.
There are now typically two levels of cache between the processor and main
memory. As chip density has increased, more of the cache memory has been
incorporated on the chip, enabling faster cache access.
Cache Levels
• Level 1: usually built onto the microprocessor chip itself; for
example, an Intel microprocessor may come with 32 KB of L1 cache.
• Level 2: a separate chip (or expansion card) that can be accessed more
quickly than the larger main memory. A popular L2 cache size is 1 MB.
The Evolution of the INTEL x86 Architecture
The current x86 offerings represent the results of decades of design effort on
complex instruction set computers (CISCs).
It is worthwhile to list some of the highlights of the evolution of the Intel
product line:
• 8080: The world’s first general-purpose microprocessor. This was an 8-bit
machine, with an 8-bit data path to memory. The 8080 was used in the first
personal computer, the Altair.
• 8086: A far more powerful, 16-bit machine. In addition to a wider data path
and larger registers, the 8086 sported an instruction cache, or queue, that
prefetches a few instructions before they are executed. A variant of this
processor, the 8088, was used in IBM’s first personal computer, securing
the success of Intel. The 8086 is the first appearance of the x86
architecture.
• 80286: This extension of the 8086 enabled addressing a 16-MByte
memory instead of just 1 MByte.
• 80386: Intel’s first 32-bit machine, and a major overhaul of the
product. With a 32-bit architecture, the 80386 rivaled the complexity
and power of minicomputers and mainframes introduced just a few
years earlier. This was the first Intel processor to support multitasking,
meaning it could run multiple programs at the same time.
• 80486: The 80486 introduced the use of much more sophisticated
and powerful cache technology and sophisticated instruction
pipelining. The 80486 also offered a built-in math coprocessor,
offloading complex math operations from the main CPU.
• Pentium: With the Pentium, Intel introduced the use of superscalar
techniques, which allow multiple instructions to execute in parallel.
• Pentium Pro: The Pentium Pro continued the move into superscalar
organization begun with the Pentium, with aggressive use of register
renaming, branch prediction, data flow analysis, and speculative execution.
• Pentium II: The Pentium II incorporated Intel MMX technology, which is
designed specifically to process video, audio, and graphics data efficiently.
• Pentium III: The Pentium III incorporates additional floating-point
instructions to support 3D graphics software.
• Pentium 4: The Pentium 4 includes additional floating-point and other
enhancements for multimedia.
• Core: This is the first Intel x86 microprocessor with a dual core,
referring to the implementation of two processors on a single chip.
• Core 2: The Core 2 extends the architecture to 64 bits. The Core 2
Quad provides four processors on a single chip. More recent Core
offerings have up to 10 processors per chip.
Embedded Systems
The term embedded system refers to the use of electronics and
software within a product, as opposed to a general-purpose computer,
such as a laptop or desktop system.
ARM Evolution
ARM is a family of RISC-based microprocessors and microcontrollers
designed by ARM Inc., Cambridge, England. The company doesn’t make
processors but instead designs microprocessor and multicore
architectures and licenses them to manufacturers.
ARM chips are high speed processors that are known for their small die
size and low power requirements.
ARM processors are used in a wide range of embedded devices, including
games and phones, as well as a large variety of consumer products.
ARM chips are the processors in Apple’s popular iPod and iPhone
devices.
In the early 1980s, Acorn was awarded a contract by the British
Broadcasting Corporation (BBC) to develop a new microcomputer
architecture for the BBC Computer Literacy Project. The success of this
contract enabled Acorn to go on to develop the first commercial RISC
processor, the Acorn RISC Machine (ARM).
The first version, ARM1, became operational in 1985 and was used for
internal research and development as well as being used as a
coprocessor in the BBC machine. Also in 1985, Acorn released the
ARM2, which had greater functionality and speed within the same
physical space. Further improvements were achieved with the release
in 1989 of the ARM3.
According to the ARM Web site arm.com, ARM processors are
designed to meet the needs of three system categories: