Notes FT HA
The definitions consider two fundamental streams: the instruction stream and the data
stream. You can think of an instruction stream as a control unit that processes branch
instructions, and a data stream as an ALU with storage (e.g. registers). All of these
parallel architectures can be simulated by a PRAM, which acts like a
synchronous MIMD. The classification does not tell us how memory and processors
are organized. There are no machines widely accepted to be MISD.
SISD machine
SISD refers to conventional computers, even those employing pipelining and similar
techniques.
SIMD machine
For applications with lots of data parallelism, the most cost-effective platforms are
SIMD machines. In these machines, a single control unit broadcasts (micro-)
instructions to many processing elements (PEs, each of which is a set of functional
units with local storage) in parallel. The best-known SIMD computer is the
Connection Machine from Thinking Machines. The CM-2 model has 64k PEs, and
even though each PE is only four bits wide, the machine could outperform many big
Crays on some specially programmed problems.
If you imagine a pipeline in which fetching operands is separate from and follows
instruction decoding, then a PE is the part of a CPU that implements all the stages
after instruction decoding, while a control unit is the part of a CPU that implements all
the stages up to instruction decoding. An SIMD computer connects each control unit
not to one PE, but to many PEs. An application is data parallel if it wants to do the
same computation on lots of pieces of data, which typically come from different
squares in a grid. Examples include image processing, weather forecasting, and
computational fluid dynamics (e.g. simulating airflow around a car or inside a jet
engine). SIMD machines cannot use commodity microprocessors, one reason being
that it would be very difficult to modify these to broadcast their control signals to a
multitude of processing elements. The companies that design SIMD machines have all
designed their own processing elements and control units. The processing elements
are usually slower than ordinary microprocessors, but they are also much smaller,
which makes it possible to put several on a single chip.
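The data-parallel pattern described above can be sketched in ordinary code: the same arithmetic is applied to every cell of a grid, which is exactly the computation an SIMD machine would broadcast to its PEs in lockstep. This is an illustrative sketch only; the `blur` function and the tiny 3x3 grid are our own invention, not taken from any SIMD toolkit.

```python
def blur(grid):
    """Replace each interior cell with the average of itself and its four
    neighbours -- the same instruction applied to every data element,
    the kind of work an SIMD control unit broadcasts to all PEs at once."""
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]          # copy so edges stay unchanged
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            out[r][c] = (grid[r][c] + grid[r - 1][c] + grid[r + 1][c] +
                         grid[r][c - 1] + grid[r][c + 1]) / 5
    return out

image = [[0, 0, 0],
         [0, 5, 0],
         [0, 0, 0]]
print(blur(image)[1][1])  # 1.0 -- (5+0+0+0+0)/5
```

On a sequential machine the two loops visit cells one at a time; on an SIMD machine each PE would hold one cell and all PEs would execute the same add-and-divide simultaneously.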
Since the CPUs are nonstandard, SIMD machines need their own compilers and other
system software. The costs of designing the CPU and this system software add
significantly to the up-front investment required for the machine. Due to the multi-
million dollar price tags of SIMD
machines, this investment has to be recovered from a relatively small number of
customers, so each customer's share of the development cost is quite high.
MIMD machine
Most multiprocessors on the market today are (shared memory) MIMD machines.
They are built out of standard processors and standard memory chips, interconnected
by a fast bus (memory is interleaved). If the processor's control unit can send different
instructions to each ALU in parallel, then the architecture is MIMD. By this
definition, a superscalar architecture is also MIMD: there are multiple execution units,
so that multiple instructions can be issued in parallel.
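The MIMD idea, multiple independent instruction streams operating on shared memory, can be sketched with threads: each thread stands in for a processor running its own program, and a shared dictionary stands in for the shared memory. The function names and data here are hypothetical, purely for illustration.

```python
import threading

# Shared memory: both "processors" read and write this structure.
results = {}

def summer(data):            # one instruction stream...
    results["sum"] = sum(data)

def maxer(data):             # ...a completely different instruction stream
    results["max"] = max(data)

# Each thread executes different instructions on different data -- MIMD.
t1 = threading.Thread(target=summer, args=([1, 2, 3],))
t2 = threading.Thread(target=maxer, args=([7, 4, 9],))
t1.start(); t2.start()
t1.join(); t2.join()
print(results["sum"], results["max"])  # 6 9
```

Contrast this with the SIMD sketch earlier: there, every element underwent the same operation; here, the two streams execute unrelated code and merely share one memory.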
The use of standard components is important because it keeps down the costs of the
company designing the multiprocessor; the development cost of the standard
components is spread out over a much larger number of customers. In theory, the
interconnection network can be something other than a bus. However, for cache
coherence, you need an interconnection network in which each processor sees the
traffic between every other processor and memory, and all such interconnection
networks are either buses or have components which are equivalent to buses. Low-end
and midrange multiprocessors use buses; some high-end multiprocessors use multiple
bus systems, or crossbars with broadcast as well as point-to-point capability.
THE HARVARD ARCHITECTURE
The Von Neumann architecture uses the same memory to store instructions as well as
data and has a single set of buses for transfers between memory and processor. This
leads to a slow-down in processing because the control unit has to first
fetch the instruction and then go to memory again to get the data associated with
the instruction (two memory cycles: the first for the instruction, the second for the data).
The Harvard architecture introduces two different memory units, one for instructions and
the other for data, called instruction memory and data memory respectively, and uses
two separate sets of buses for the separate memories (an instruction bus and a data bus).
When the control unit has to perform an instruction, it fetches the instruction from
instruction memory over the instruction bus and reads the data from data memory over
the data bus. This speeds up overall processing because instruction and
data are received by the processor at the same time (no need to wait for the data to
arrive).
The Harvard architecture also gives another freedom: the instruction and data memories
can have different address spaces and word lengths. For example, the PIC 16 microcontroller,
based upon the Harvard architecture, uses a 14-bit-wide instruction memory and an 8-bit-wide
data memory.
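A toy simulation makes the split concrete: instructions live in one memory with 14-bit words, data in a separate memory with 8-bit words, loosely echoing the PIC 16 example above. The opcode encoding (2-bit opcode in the top bits, 8-bit address in the low bits) is our own invention for illustration, not the real PIC instruction set.

```python
# Separate memories with different word widths -- the Harvard idea.
instr_mem = [0b00_0000_0000_0000,   # LOAD  data_mem[0] -> acc
             0b01_0000_0000_0001,   # ADD   data_mem[1] -> acc
             0b10_0000_0000_0010]   # STORE acc -> data_mem[2]
data_mem = [7, 5, 0]                # 8-bit data words

acc = 0
for word in instr_mem:              # fetch from the instruction memory...
    opcode = word >> 12             # top 2 bits of the 14-bit word
    addr = word & 0xFF              # low 8 bits address the data memory
    if opcode == 0b00:              # LOAD: operand comes from data memory,
        acc = data_mem[addr]        # over a separate (conceptual) data bus
    elif opcode == 0b01:            # ADD, wrapped to 8 bits
        acc = (acc + data_mem[addr]) & 0xFF
    elif opcode == 0b10:            # STORE back to data memory
        data_mem[addr] = acc

print(data_mem[2])  # 12 -- i.e. 7 + 5
```

In real hardware the instruction fetch and the data access happen simultaneously on the two buses; the sequential loop here only models the separation of the two address spaces and word widths.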