Yogesh Chauhan (12-1-5-004)
Write A Case Study On:
i) TI ASC:
The Advanced Scientific Computer, or ASC, was a supercomputer architecture
designed by Texas Instruments (TI) between 1966 and 1973. Key to the ASC's
design was a single high-speed shared memory, which was accessed by a number
of processors and channel controllers, in a fashion similar to Seymour Cray's
groundbreaking CDC 6600. Whereas the 6600 featured ten smaller computers
feeding a single math unit (ALU), in the ASC this was simplified into a single 8-core
processor feeding the ALU. The ALU itself comprised up to four cores and was
one of the first designs to include dedicated vector processing instructions,
with the ability to send the same instruction to all four cores at once.
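This idea of issuing one instruction across several execution pipes survives in today's SIMD extensions. A minimal software sketch of the concept follows; the function name and four-lane framing are illustrative, not the ASC's actual instruction set:

```python
# Sketch of the vector idea: one operation ("instruction") applied to
# all four lanes in lockstep, as the ASC's four cores would execute it.
# Names here are illustrative, not the ASC's real ISA.

def vector_add(a, b):
    """Apply the same add operation across four lanes at once."""
    assert len(a) == len(b) == 4  # four cores, four lanes
    return [x + y for x, y in zip(a, b)]

print(vector_add([1, 2, 3, 4], [10, 20, 30, 40]))  # one add, four results
```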
Memory was accessed solely under the control of the memory control unit, or MCU.
The MCU was a two-way, 256-bit/channel parallel network that could support up to
eight independent processors, with a ninth channel for accessing "main memory"
(or "extended memory" as they referred to it). The MCU also acted as a cache
controller, offering high speed access on the eight processor ports to a
semiconductor-based memory, and handling all communications to the 24-bit
address space in main memory. The MCU was designed to operate asynchronously,
allowing it to work at a variety of speeds and scale across a number of performance
points. For instance, main memory could be constructed out of slower but less
expensive core memory, although this was not used in practice. At the fastest, it
could sustain transfer rates of 80 million 32-bit words per second per port, for a
total transfer capacity of 640M-words/sec. This was well beyond the capabilities of
even the fastest memories of the era.
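The aggregate figure follows directly from the per-port rate; a quick check of the arithmetic, using the word size and rates stated above:

```python
# Per the text: 80 million 32-bit words/second on each of 8 processor ports.
words_per_sec_per_port = 80_000_000
ports = 8

total_words = words_per_sec_per_port * ports
print(total_words)       # 640 million words/sec, the figure quoted above
print(total_words * 4)   # equivalent bytes/sec (one 32-bit word = 4 bytes)
```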
The main ALU/CPU was extremely advanced for its era. The design included four
basic cores that could be combined to handle vector instructions. Each core
included a complete instruction pipeline system that could keep up to twelve scalar
instructions in-flight at the same time, allowing up to 36 instructions in total across
the entire CPU. From one to four vector results could be produced every 60ns, the
basic cycle time (about 16 MHz), depending on the number of execution units
provided. Implementations of this sort of parallel/pipelined instruction system did
not appear on modern commodity processors until the late 1990s, and vector
instructions (now known as SIMD) until a few years later.
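The cycle-time and clock-rate figures above are mutually consistent; a quick check, assuming the fully configured four-pipe case:

```python
# Figures from the text: 60 ns basic cycle, up to four vector results
# per cycle in the largest configuration.
cycle_ns = 60
clock_hz = 1 / (cycle_ns * 1e-9)            # about 16.7 MHz
results_per_cycle = 4                        # four execution units assumed

peak_results_per_sec = clock_hz * results_per_cycle
print(round(clock_hz / 1e6, 1))              # ~16.7 (MHz)
print(round(peak_results_per_sec / 1e6, 1))  # ~66.7 million vector results/sec
```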
The processor included 48 32-bit registers, a huge number for the time, although
they were not general purpose as they are in modern designs. Sixteen were used for
addresses, another sixteen for math, eight for index offsets and another eight for
vector instructions. Registers were accessed externally using a RISC-like load/store
system, with instructions to load anything from 4 bits to 64 bits (two
registers) at a time.
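A load/store design of this kind separates memory access from arithmetic: operands must be explicitly loaded into registers before use, and results explicitly stored back. A toy model of the partitioned register file described above (the bank names and method signatures are illustrative, not the ASC's actual instruction mnemonics):

```python
# Toy model of the ASC's partitioned register file: 16 address, 16
# arithmetic, 8 index, and 8 vector registers (48 total), accessed via
# explicit load/store rather than memory operands.

class RegisterFile:
    def __init__(self):
        self.regs = {"addr": [0] * 16, "arith": [0] * 16,
                     "index": [0] * 8, "vector": [0] * 8}

    def load(self, bank, n, memory, address):
        """LOAD: copy a memory word into a register."""
        self.regs[bank][n] = memory[address]

    def store(self, bank, n, memory, address):
        """STORE: copy a register back to memory."""
        memory[address] = self.regs[bank][n]

mem = {100: 42}
rf = RegisterFile()
rf.load("arith", 0, mem, 100)                   # bring operand into a register
rf.regs["arith"][1] = rf.regs["arith"][0] + 1   # arithmetic touches registers only
rf.store("arith", 1, mem, 101)                  # write the result back
print(mem[101])  # 43
```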
Most vector machines tended to be memory-limited, that is, they could process data
faster than they could get it from memory. This remains a major problem on modern
SIMD designs as well, which is why considerable effort has been put into increasing
memory throughput in modern computer designs (although largely unsuccessfully).
In the ASC this was improved somewhat with a lookahead unit that predicted
upcoming memory accesses and loaded them into the ALU registers invisibly,
using a memory interface in the CPU known as the memory buffer unit (MBU).
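The lookahead idea, fetching operands before the ALU asks for them, is easy to sketch in software. The following is a simplified model of such a buffer for a sequential vector stream; the MBU's real mechanism was more elaborate, and the class and parameters here are only illustrative:

```python
from collections import deque

class PrefetchBuffer:
    """Illustrative lookahead: assumes upcoming accesses are sequential
    and fetches them ahead of demand, as the MBU did for vector streams."""

    def __init__(self, memory, depth=4):
        self.memory = memory
        self.depth = depth        # how far ahead to fetch (assumed value)
        self.buffer = deque()
        self.next_addr = 0

    def _fill(self):
        while len(self.buffer) < self.depth and self.next_addr < len(self.memory):
            self.buffer.append(self.memory[self.next_addr])  # fetch ahead
            self.next_addr += 1

    def read(self):
        self._fill()
        return self.buffer.popleft()  # the ALU finds data already on hand

mem = [10, 20, 30, 40, 50]
pf = PrefetchBuffer(mem)
print([pf.read() for _ in range(5)])  # streams the vector in order
```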
ii) STAR-100:
The STAR-100 was a vector supercomputer designed, manufactured, and marketed
by Control Data Corporation (CDC). It was one of the first machines to use a vector
processor to improve performance on appropriate scientific applications.
The name STAR was a construct of the words STrings and ARrays. The 100 came
from 100 million floating point operations per second (MFLOPS), the speed at which
the machine was designed to operate. The computer was announced early in the
1970s and was expected to be several times faster than the CDC 7600,
which was then the world's fastest supercomputer with a peak performance of 36
MFLOPS. On August 17, 1971, CDC announced that General Motors had placed the
first commercial order for a STAR-100.
The main memory had a capacity of 65,536 superwords (SWORDs), which are
512-bit words.[1] The main memory was 32-way interleaved to pipeline memory
accesses. It was constructed from core memory with an access time of 1.28 µs. The
main memory was accessed via a 512-bit bus, controlled by the storage access
controller (SAC), which handled requests from the stream unit. The stream unit
accesses the main memory through the SAC via three 128-bit data buses, two for
reads, and one for writes. Additionally, there is a 128-bit data bus for instruction
fetch, I/O, and control vector access. The stream unit serves as the control unit,
fetching and decoding instructions, initiating memory accesses on behalf of the
pipelined functional units, and controlling instruction execution, among other tasks.
It also contains two read buffers and one write buffer for streaming data to the
execution units.[1]
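The 32-way interleaving mentioned above spreads consecutive addresses across 32 banks so that accesses can overlap: a sequential stream returns to any given bank only once every 32 words, hiding the slow core-memory cycle behind the other banks' work. A minimal sketch of the address-to-bank mapping; the bank count comes from the text, while the simple low-order (modulo) mapping is an assumption for illustration:

```python
BANKS = 32  # per the text; low-order interleaving assumed for illustration

def bank_of(address):
    """Consecutive addresses land in consecutive banks, so a sequential
    vector stream revisits each bank only once every 32 accesses."""
    return address % BANKS

print([bank_of(a) for a in range(4)])  # [0, 1, 2, 3]
print(bank_of(0) == bank_of(32))       # True: same bank again after 32 words
```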
The STAR-100 has two pipelines where arithmetic is performed. The first pipeline
contains a floating point adder and multiplier, whereas the second pipeline is
multifunctional, capable of executing all scalar instructions. It also contains a
floating point adder, multiplier, and divider. Both pipelines are 64-bit for floating
point operations and are controlled by microcode. The STAR-100 can split its floating
point pipelines into four 32-bit pipelines, doubling the peak performance of the
system to 100 MFLOPS at the expense of half the precision.[1]
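The doubling follows because each 64-bit pipe becomes two 32-bit pipes: the cycle time is unchanged, but twice as many (half-width) results emerge per cycle. The arithmetic, using the figures in the text:

```python
# Peak rates per the text: the 64-bit floating point pipelines peak at
# 50 MFLOPS; splitting each into two 32-bit pipelines doubles the count.
full_precision_peak = 50_000_000   # FLOPS, 64-bit operation
split_factor = 2                   # each 64-bit pipe becomes two 32-bit pipes

half_precision_peak = full_precision_peak * split_factor
print(half_precision_peak)  # 100 MFLOPS, the machine's nameplate figure
```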
The STAR-100 uses I/O processors to offload I/O from the CPU. Each I/O processor is
a 16-bit minicomputer with its own main memory of 65,536 words of 16 bits each,
which is implemented with core memory. The I/O processors all share a 128-bit data
bus to the SAC.