Processors
Agenda
Modern processor technology
Instruction set architectures (CISC vs RISC)
Typical processors: superscalar, VLIW, superpipelined and vector
Advanced Processor Technology
Design Space of Processors
Instruction-Set Architectures
CISC Scalar Processors
RISC Scalar Processors
CISC: Conventional processors such as the Intel i486, M68040, VAX/8600, IBM 390, etc. fall into this family. Typical clock rate is about 33 to 50 MHz; with microprogrammed control, typical CPI is about 1 to 20.
RISC: Today's RISC processors such as the Intel i860, SPARC, MIPS R3000, IBM RS/6000, etc. have a faster clock rate of about 20 to 120 MHz and, with hardwired control, a typical CPI of about 1 to 2.
Superscalar: A special subclass of RISC processor architecture that allows multiple instructions to be issued during each cycle, taking CPI to a lower value with a clock rate similar to that of RISC.
VLIW: Very Long Instruction Word architecture uses even more functional units than superscalar, so its CPI is lower still, but because of its long (microprogrammed) instructions its clock rate is slow.
Superpipelined: These processors have a higher clock rate of about 100 to 500 MHz; however, CPI is also high unless multiple functional units are used, as in the case of vector supercomputers.
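The trade-off between clock rate and CPI in the design space can be made concrete with the standard relation MIPS rate = clock rate (MHz) / CPI. The following is a minimal Python sketch that applies it to the CISC and RISC ranges quoted above. The names `mips_rate` and `mips_bounds`, and the pairing of the slowest clock with the highest CPI (and vice versa), are illustrative assumptions, not measurements of any real machine.

```python
def mips_rate(clock_mhz: float, cpi: float) -> float:
    """Millions of instructions per second for a clock in MHz and an average CPI."""
    if cpi <= 0:
        raise ValueError("CPI must be positive")
    return clock_mhz / cpi


# Ranges copied from the design-space summary (clock in MHz, CPI as low/high).
design_space = {
    "CISC": {"clock_mhz": (33, 50), "cpi": (1, 20)},
    "RISC": {"clock_mhz": (20, 120), "cpi": (1, 2)},
}


def mips_bounds(entry):
    """Return (worst, best) MIPS: slowest clock with highest CPI, fastest clock with lowest CPI."""
    lo_clock, hi_clock = entry["clock_mhz"]
    lo_cpi, hi_cpi = entry["cpi"]
    return mips_rate(lo_clock, hi_cpi), mips_rate(hi_clock, lo_cpi)


for name, entry in design_space.items():
    worst, best = mips_bounds(entry)
    print(f"{name}: {worst:.2f} to {best:.2f} MIPS")
```

The wide CISC spread comes almost entirely from the CPI range, which is why lowering CPI is the common goal of the RISC, superscalar and VLIW designs.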
Instruction Pipeline
Typical instruction execution involves four phases: fetch, decode, execute and writeback.
These phases are often executed by an instruction pipeline.
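The benefit of overlapping the four phases can be sketched as a cycle count. The Python sketch below assumes every phase takes exactly one clock cycle and that no stalls occur; under those assumptions a k-stage pipeline finishes n instructions in k + n - 1 cycles instead of n × k. The function names are hypothetical.

```python
STAGES = ["fetch", "decode", "execute", "writeback"]


def unpipelined_cycles(n_instructions, n_stages=len(STAGES)):
    """Each instruction runs all stages before the next one starts."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions, n_stages=len(STAGES)):
    """Ideal pipeline: fill once, then complete one instruction per cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + n_instructions - 1


def timing_diagram(n_instructions):
    """Return one text row per instruction showing its stage in each cycle."""
    total = pipelined_cycles(n_instructions)
    rows = []
    for i in range(n_instructions):
        cells = []
        for cycle in range(total):
            stage = cycle - i  # instruction i enters the pipeline at cycle i
            if 0 <= stage < len(STAGES):
                cells.append(STAGES[stage][0].upper())
            else:
                cells.append(".")
        rows.append(f"I{i + 1}: " + " ".join(cells))
    return rows


for row in timing_diagram(3):
    print(row)
print(unpipelined_cycles(100), "cycles unpipelined vs", pipelined_cycles(100), "pipelined")
```

In the diagram, F, D, E and W mark the four phases; each row is shifted one cycle to the right of the previous one.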
Both an integer unit and a floating point unit may be present in the same CPU.
Ideally, the performance of a scalar processor should equal that of an instruction pipeline with one instruction fed per clock cycle.
In practice, the processor is underpipelined because of data dependencies, resource conflicts, branch penalties, etc.
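The gap between the ideal and the practical case can be expressed as an effective CPI: the ideal value of one cycle per instruction plus the average stall cycles contributed by each cause. The Python sketch below is illustrative only; the frequencies and penalties in `hypothetical_stalls` are made-up numbers, not data for any processor.

```python
def effective_cpi(base_cpi, stall_sources):
    """Add average stall cycles per instruction to the ideal CPI.

    stall_sources maps a cause to (events per instruction, penalty in cycles).
    """
    stall_cycles = sum(freq * penalty for freq, penalty in stall_sources.values())
    return base_cpi + stall_cycles


# Hypothetical figures chosen only to illustrate the calculation.
hypothetical_stalls = {
    "data dependencies": (0.20, 1),
    "resource conflicts": (0.05, 2),
    "branch penalties": (0.15, 3),
}

cpi = effective_cpi(1.0, hypothetical_stalls)
print(f"Effective CPI: {cpi:.2f}")
```

With no stall sources the function returns the ideal CPI unchanged, which corresponds to one instruction completing per clock cycle.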
Example 1
Typical CISC architecture with microprogrammed control
Instruction set contains 300 instructions with 20 different addressing modes
CPU consists of two functional units for execution of floating point and integer instructions
Unified cache holds both instructions and data
16 GPRs in the instruction unit
Instruction pipeline has six stages
Example 2
Processor implements over 100 instructions using 16 GPRs
Separate 4 KB caches for data and instructions, with an MMU in each of the separate memory units
General characteristics of CISC scalar processors
Large number of instructions
More options in the addressing modes
General characteristics of RISC scalar processors
All use 32-bit instructions
Instruction set consists of fewer than 100 instructions
High clock rate
Low CPI
Example 1
SPARC stands for Scalable Processor ARChitecture
Scalability is due to the use of a number of register windows (described under Window Registers below)
Floating point unit (FPU) is implemented on a separate chip
Window Registers
SPARC runs each procedure with a set of thirty-two 32-bit registers
Each register window is divided into three sections: Ins, Locals and Outs
Locals are addressable only by the procedure that owns them, while Ins and Outs are shared between calling and called procedures
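The sharing of Ins and Outs can be sketched as an address mapping. The Python model below is a simplification under stated assumptions: it uses the SPARC register numbering (r0 to r7 globals, r8 to r15 Outs, r16 to r23 Locals, r24 to r31 Ins) and the convention that a call decrements the current window pointer, but the window count `N_WINDOWS` and the physical layout of the register file are hypothetical choices made for illustration.

```python
N_WINDOWS = 8                    # hypothetical; the real count depends on the implementation
WINDOWED_REGS = 16 * N_WINDOWS   # each window adds 8 Locals plus 8 registers shared with a neighbour


def physical_register(cwp, r):
    """Map logical register r (0..31) of window cwp to a (bank, index) pair."""
    if not 0 <= r <= 31:
        raise ValueError("register number must be in 0..31")
    if r < 8:
        return ("global", r)     # globals are visible to every procedure
    # Offsets within a window: Outs 0..7, Locals 8..15, Ins 16..23.
    offset = r - 8
    return ("windowed", (cwp * 16 + offset) % WINDOWED_REGS)


def call(cwp):
    """Procedure call: move to the next window (the pointer decrements)."""
    return (cwp - 1) % N_WINDOWS


caller = 3
callee = call(caller)
# The caller's first Out (r8) and the callee's first In (r24) are the same physical register.
print(physical_register(caller, 8), physical_register(callee, 24))
# Locals (r16) of the two procedures map to different physical registers.
print(physical_register(caller, 16), physical_register(callee, 16))
```

Because the caller's Outs are the callee's Ins, parameters are passed without copying registers to memory on each call.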
Example 2
64-bit RISC processor on a single chip
It executes 82 instructions, all of them in a single clock cycle
There are nine functional units connected by multiple data paths
There are two floating point units, namely the multiplier unit and the adder unit, both of which can execute concurrently
Superscalar processors:
Multiple instruction pipelines are used
Multiple instructions are issued per cycle
Multiple results are generated per cycle
Superscalar Processors
Designed to exploit instruction-level parallelism in user programs
Amount of parallelism depends on the type of code being executed
On average, only around 2 instructions can be executed in parallel at the instruction level
There is therefore little benefit in a processor that can issue more than about 3 instructions per cycle
Thus, the instruction-issue degree of superscalar processors has been limited to between 2 and 5
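Why a wider issue degree stops paying off can be sketched with a toy in-order issue model. The Python code below is an illustration under simplifying assumptions: every instruction has unit latency, an instruction cannot issue in the same cycle as an earlier instruction that writes one of its sources, and write-after-write and write-after-read hazards are ignored. The function `cycles_to_issue` and the six-instruction `program` are hypothetical.

```python
def cycles_to_issue(program, width):
    """Count cycles needed to issue a program in order, up to `width` instructions per cycle.

    program is a list of (dest, sources) pairs.
    """
    if width < 1:
        raise ValueError("issue width must be at least 1")
    cycles = 0
    i = 0
    while i < len(program):
        written = set()  # destinations produced earlier in this cycle
        issued = 0
        while i < len(program) and issued < width:
            dest, sources = program[i]
            if written & set(sources):
                break    # true data dependence: wait for the next cycle
            written.add(dest)
            issued += 1
            i += 1
        cycles += 1
    return cycles


# Hypothetical code fragment with limited instruction-level parallelism.
program = [
    ("r1", ("a", "b")),
    ("r2", ("c", "d")),
    ("r3", ("r1", "r2")),
    ("r4", ("e", "f")),
    ("r5", ("r3", "r4")),
    ("r6", ("g", "h")),
]

for width in (1, 2, 4):
    print(f"issue width {width}: {cycles_to_issue(program, width)} cycles")
```

For this fragment, doubling the issue width from 2 to 4 gives no further reduction in cycles, because the data dependencies, not the hardware, set the limit.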
Example 1
A typical superscalar architecture
Multiple instruction pipelines are used, and the instruction cache supplies multiple instructions per fetch
Multiple functional units are built into the integer unit and the floating point unit
Multiple data buses run through the functional units, and in theory, all such units can be run simultaneously
Example 2
A superscalar architecture by IBM
Three functional units, namely the branch processor, the fixed point processor and the floating point processor, all of which can operate in parallel
Branch processor can facilitate execution of up to five instructions per cycle
A number of buses of varying width are provided to support high instruction and data bandwidths
2. Horizontal microcoding
Different fields of the long instruction word carry opcodes to be dispatched to multiple functional units
Programs written in conventional short opcodes are converted into VLIW format by compilers
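The conversion of short opcodes into long instruction words can be sketched as a packing step. The Python code below is a deliberately simple model, not how any real VLIW compiler works: it packs operations greedily in program order, starts a new word when the required field is taken or when an operation depends on a result produced in the same word, and fills unused fields with `nop`. The word format `SLOTS` and the operation list are hypothetical.

```python
SLOTS = ("int", "int", "fp", "mem")  # hypothetical word format: one field per functional unit


def pack_vliw(ops):
    """Pack (unit, dest, sources, text) operations into long instruction words."""
    words = []
    i = 0
    while i < len(ops):
        word = ["nop"] * len(SLOTS)
        free = list(range(len(SLOTS)))   # indices of fields still unused in this word
        written = set()                  # destinations produced within this word
        while i < len(ops):
            unit, dest, sources, text = ops[i]
            if unit not in SLOTS:
                raise ValueError(f"no field for unit {unit!r}")
            slot = next((s for s in free if SLOTS[s] == unit), None)
            if slot is None or written & set(sources):
                break                    # field taken or data dependence: start a new word
            word[slot] = text
            free.remove(slot)
            written.add(dest)
            i += 1
        words.append(word)
    return words


ops = [
    ("mem", "r1", ("a",), "load r1, a"),
    ("mem", "r2", ("b",), "load r2, b"),
    ("int", "r3", ("r1", "r2"), "add r3, r1, r2"),
    ("fp", "f1", ("f0",), "fmul f1, f0, f0"),
]

for word in pack_vliw(ops):
    print(" | ".join(word))
```

A real compiler would also reorder independent operations (here the `fmul` could move into an earlier word) to leave fewer fields empty.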
Vector Processors
A vector processor is a coprocessor designed to perform vector computations
Vector computations involve instructions with large arrays of operands
The same operation is performed over an array of operands
Vector Instructions
Register-based instructions
Vi represents a vector register of length n
si represents a scalar register
Memory-based instructions
M(1:n) represents a memory array of length n
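The notation above can be illustrated with a few instruction shapes. The specific instruction forms are not listed in the text, so the Python sketch below assumes some commonly used ones: V1 o V2 -> V3, s1 o V1 -> V2, a reduction o V1 -> s1, and a memory-based M1(1:n) o M2(1:n) -> M(1:n), where o stands for any operator. Plain lists stand in for vector registers and memory, and all function names are hypothetical.

```python
import operator
from functools import reduce


def vector_vector(op, v1, v2):
    """Register-based form V1 o V2 -> V3: element-wise operation on two vector registers."""
    if len(v1) != len(v2):
        raise ValueError("vector registers must have the same length")
    return [op(a, b) for a, b in zip(v1, v2)]


def scalar_vector(op, s1, v1):
    """Register-based form s1 o V1 -> V2: combine a scalar register with every element."""
    return [op(s1, a) for a in v1]


def vector_reduce(op, v1):
    """Register-based form o V1 -> s1: reduce a vector register to one scalar."""
    return reduce(op, v1)


def memory_to_memory(op, memory, src1, src2, dst, n):
    """Memory-based form M1(1:n) o M2(1:n) -> M(1:n): operands and result stay in memory."""
    for k in range(n):
        memory[dst + k] = op(memory[src1 + k], memory[src2 + k])


print(vector_vector(operator.add, [1, 2, 3], [10, 20, 30]))
print(scalar_vector(operator.mul, 2, [1, 2, 3]))
print(vector_reduce(operator.add, [1, 2, 3]))

memory = [1, 2, 3, 4, 5, 6, 0, 0, 0]
memory_to_memory(operator.add, memory, src1=0, src2=3, dst=6, n=3)
print(memory)
```

In each case a single call stands for one vector instruction: the operation is named once and applied across all n elements.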
Vector Pipelines
Scalar pipeline
Each execute stage operates upon a scalar operand
Vector pipeline
Symbolic Processors
Applications in the areas of pattern recognition, expert systems, artificial intelligence, cognitive science, machine learning, etc.
Symbolic processors differ from numeric processors in terms of data and knowledge representations, primitive operations, algorithmic behavior, memory, I/O and communication
Characteristics
Example
Symbolic Lisp Processor
Multiple processing units are provided which can work in parallel
Operands are fetched from the scratch pad or the stack
Processor executes most of the instructions in a single machine cycle