
VLSI Systems and Architecture: An Overview

This document provides an overview of VLSI systems and architecture. It discusses the course objectives which are to familiarize students with architectural techniques used to implement complex logic functions in VLSI chips. It covers key aspects of VLSI chip design including functional, structural, and physical design. It also discusses abstraction levels, architectural design process, and examples of architectural techniques like pipelining and using multi-functional processing elements to improve performance.

Uploaded by

KrunalKapadiya1

VLSI SYSTEMS AND ARCHITECTURE

an overview

Prof S Gurunarayanan
BITS, Pilani

BITS Pilani, Pilani Campus


Course Objectives:

• To familiarize the student with the various architectural techniques used to implement complex logic functions as VLSI chips and to achieve design objectives such as high performance, low cost, high throughput, low power, or a combination thereof.

• To cover the architectural techniques and design methods used for designing programmable processors (CISC, RISC and ASIP processors).


Recap: Three aspects of VLSI chip design

• Functional: What function does the chip perform?

• Structural: What is the compositional plan for the chip? What parts is it composed of, and how are those parts interconnected? (the schematic view)

• Physical: What is the layout of the various physical layers that go into manufacturing the chip?

• Each of the three aspects can be addressed at several different levels of detail (also called a hierarchy of abstraction levels).

(This is analogous to viewing Google Maps: you may view a larger area in less detail or a smaller area in greater detail.)

• These abstraction levels present to the designer only a relevant and manageable amount of design detail at a time, so that the designer can work comfortably without being overwhelmed by the full details of the design.

• VLSI chip design starts with the specification of the function to be performed by the chip.

The function can be specified in a natural language (such as English). Such a specification can run into hundreds of pages and leaves scope for semantic ambiguity and inconsistency.

or

It can be specified using a formal language (e.g. C, VHDL, Verilog, SystemC, SystemVerilog), which leaves no scope for semantic ambiguity.


Together with the functional specification of a VLSI chip, there are also expectations in terms of performance (the speed at which the chip performs its function), power consumption, and cost (chip size / gate count).

There are also expectations (or requirements) in terms of functional agility (the ability to accommodate changes in chip functionality), the time permissible for completing the design, and design cost.


Architecture design is the very first step in the series of steps that constitute the full VLSI design cycle.

In this step, a high-level (block-level) compositional scheme for the chip, also called a block-level schematic, is conceived in terms of blocks of simpler functionality and their interconnections, so as to realize the specified complex functionality of the VLSI chip.


Many architectural designs (often also called architectural schemes) can be proposed for a given chip functionality.

They all provide the specified chip functionality; however, their performance (the speed at which the function is performed), power consumption, chip cost and functional agility can differ drastically, making some architectures more favorable than others from the market / user's perspective.

• Architecting is learnt from examples of existing architectures and from a set of key architecting techniques that have been developed over the years.

• Architectural design provides the framework within which the more detailed design work (at lower design abstraction levels, e.g. gate level, transistor level and layout level) proceeds.

• The performance, power and cost of the chip are largely influenced by the architectural choice. They can be tweaked to a certain extent by choices made at the more detailed design levels (lower abstraction levels).


                   1982 Intel 80286            2001 Intel Pentium 4
Clock rate         12.5 MHz                    1500 MHz (120X)
Peak performance   2 MIPS                      4500 MIPS (2250X)
Latency            320 ns                      15 ns (20X)
Transistors        134,000                     42,000,000
Die area           47 mm^2                     217 mm^2
Data bus           16-bit                      64-bit
Pins               68                          423
Organization       Microcode interpreter,      3-way superscalar, dynamic translation to RISC,
                   separate FPU chip           superpipelined (22 stages), out-of-order execution
Caches             None                        On-chip 8 KB data cache, 96 KB instruction trace cache,
                                               256 KB L2 cache


Chip architecture refers to the process of putting
various functional components together in
specific relationships to achieve a particular goal:

a signal processor,
an inexpensive “system on a chip”,
a high-performance pipelined processor



Ex: Behavioral Model of an Adder

Z = A + B + C + D

Architectural choices

To realize a digital hardware solution for a given behavioral description, a suitable architectural plan must be created.


Combinational Architecture

[Figure: three adders connected in a chain, computing Z = ((A + B) + C) + D]

Architectural Plan - I

Function delay = 3 * Tadd
Sustained throughput = one result every 3 * Tadd
Energy / function = 3 * Eadd
Gate complexity = 3 * Gadd
Functional flexibility = none
Functional expandability = none


[Figure: two adders compute A + B and C + D in parallel; a third adder sums their outputs]

Architectural Plan - II: Leveraging parallelism inherent in the function

Function delay = 2 * Tadd
Throughput = one result every 2 * Tadd
Energy / function = 3 * Eadd
Gate complexity = 3 * Gadd
Functional flexibility = none
Functional expandability = none
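The comparison between Plan I and Plan II can be sketched in a few lines of Python. This is only an illustration: the nested-tuple encoding of the adder networks and the function names are assumptions made here, and delay, energy and gate count are expressed in units of one adder (Tadd = Eadd = Gadd = 1).

```python
def adder_levels(expr):
    """Adders on the longest input-to-output path (the critical path)."""
    if isinstance(expr, str):          # a primary input adds no delay
        return 0
    left, right = expr
    return 1 + max(adder_levels(left), adder_levels(right))


def adder_count(expr):
    """Total number of adders in the network."""
    if isinstance(expr, str):
        return 0
    left, right = expr
    return 1 + adder_count(left) + adder_count(right)


def plan_metrics(expr, t_add=1.0, e_add=1.0, g_add=1.0):
    """Delay, energy per function and gate complexity of a combinational plan."""
    n = adder_count(expr)
    return {
        "delay": adder_levels(expr) * t_add,
        "energy": n * e_add,
        "gates": n * g_add,
    }


# Each pair (x, y) stands for one adder computing x + y (hypothetical encoding).
PLAN_I = ((("A", "B"), "C"), "D")      # chain of three adders
PLAN_II = (("A", "B"), ("C", "D"))     # balanced tree of three adders

for name, plan in (("Plan I", PLAN_I), ("Plan II", PLAN_II)):
    print(name, plan_metrics(plan))
```

Both plans use three adders, so energy and gate complexity are equal; only the number of adder levels on the critical path differs.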


- Different combinational architectures are possible (with different delays)

- Combinational architectures use many processing elements (PEs) in the architecture

- They do not need any storage elements

- Individual PEs are used only during a small fraction of the time needed for processing


Pipelined Architectures

A pipelined architecture is obtained by modifying a combinational architecture: a storage element is inserted in every interconnection path between two processing elements.

The inserted storage elements are called pipeline registers, and the resulting architecture is called a pipelined architecture.

Throughput increases.

Architecture 3:
Pipelining: segmenting a long combinational logic delay path by inserting registers to increase throughput

[Figure: two adders compute a + b and c + d; each sum is captured in a D flip-flop (Dff) clocked by Clk; a third adder sums the two registered values to produce f]

Function delay = Tadd + Tsetup + Tclktoq + Tadd
Throughput = one result every (Tadd + Tsetup)
Energy / function = 2 * Eadd + 2 * Eff + Eadd
Gate complexity = 3 * Gadd + 2 * Gff
Functional flexibility = none
Functional expandability = none
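To get a feel for the numbers, the timing formulas of Architecture 3 can be evaluated next to those of Plan II. The sketch below takes the formulas exactly as stated on the slides; the function names and the timing values (in nanoseconds) are hypothetical and chosen only for illustration.

```python
def pipelined_timing(t_add, t_setup, t_clk_to_q):
    """Latency and result interval of Architecture 3, per the slide's formulas."""
    latency = t_add + t_setup + t_clk_to_q + t_add
    interval = t_add + t_setup          # time between successive results
    return latency, interval


def combinational_tree_timing(t_add):
    """Latency and result interval of Plan II (two adder levels, no registers)."""
    return 2 * t_add, 2 * t_add


# Hypothetical timing values in nanoseconds.
T_ADD, T_SETUP, T_CLK_TO_Q = 10, 1, 1

pipe_latency, pipe_interval = pipelined_timing(T_ADD, T_SETUP, T_CLK_TO_Q)
comb_latency, comb_interval = combinational_tree_timing(T_ADD)

print("pipelined:     latency", pipe_latency, "ns, one result every", pipe_interval, "ns")
print("combinational: latency", comb_latency, "ns, one result every", comb_interval, "ns")
```

With these assumed values the pipelined version has a slightly longer latency but delivers results nearly twice as often, which is the trade-off the slide describes.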


Architecture 4:

Introducing functional flexibility in an architecture through the use of multi-functional processing elements

[Figure: three ALUs arranged as in Plan II, with inputs a, b, c, d, output f, and 4-bit control inputs C1(3:0), C2(3:0) and C3(3:0), one per ALU]

Function delay = 2 * Talu
Throughput = one result every 2 * Talu
Energy / function = 3 * Ealu
Gate complexity = 3 * Galu
Functional flexibility = full (static)
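A small behavioral model may help show what "static" functional flexibility means: the structure is fixed, but each ALU's function is selected by its control input. In this sketch the 4-bit control encoding and the assignment of C1, C2 and C3 to particular ALUs are assumptions; the slides do not define them.

```python
import operator

# Hypothetical 4-bit control codes; the slides do not specify the encoding.
ALU_OPS = {
    0b0000: operator.add,
    0b0001: operator.sub,
    0b0010: operator.and_,
    0b0011: operator.or_,
    0b0100: operator.xor,
}


def alu(control, x, y):
    """Multi-functional processing element: the control code selects the operation."""
    return ALU_OPS[control](x, y)


def architecture_4(a, b, c, d, c1, c2, c3):
    """Two first-level ALUs feed a third ALU (assumed wiring: C1 and C2 first level, C3 output)."""
    return alu(c3, alu(c1, a, b), alu(c2, c, d))


ADD, SUB = 0b0000, 0b0001

# Same hardware, two different functions, chosen only by the control inputs.
print(architecture_4(1, 2, 3, 4, ADD, ADD, ADD))   # (a + b) + (c + d)
print(architecture_4(5, 3, 4, 1, SUB, SUB, ADD))   # (a - b) + (c - d)
```

The structure still fixes the shape of the computation (two operations followed by a third), which is why the flexibility is described as static.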


SEQUENTIAL ARCHITECTURE

The actions of many different instances of a language operator in a behavioral description can be realized by a single processing element (PE) in the architecture, used in different time slots.

This reduces the number of PEs.

It needs storage elements that can save the result of a PE in one sequential step for future use.

The PE is then relinquished to carry out other operations on different inputs.

A single multi-function ALU can be selected for use as the PE.
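The idea of reusing one PE over several time slots can be sketched for the running example Z = A + B + C + D. This is a behavioral illustration only; the function name and the use of a single accumulator register as the storage element are assumptions made here.

```python
def sequential_sum(inputs):
    """Sum the inputs with one adder PE and one register, one addition per time step."""
    accumulator = inputs[0]      # storage element holding the intermediate result
    steps = 0
    for value in inputs[1:]:
        accumulator = accumulator + value   # the single PE is reused in this time slot
        steps += 1
    return accumulator, steps


z, time_steps = sequential_sum([1, 2, 3, 4])
print("Z =", z, "computed in", time_steps, "sequential steps with one adder")
```

Compared with the combinational plans, the adder count drops from three to one, at the cost of a storage element and three sequential steps.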


VLSI Architectures
• Universal architecture with full functional flexibility and functional expandability (requires no change to implement any function):

[Figure: block diagram with an ALU, a data memory (Data Mem), a control memory (Control Mem) and a control sequencer]

A universal architecture can sequentially realize any function without requiring any change in any part of the hardware, because the control is also flexible (programmable through the contents of the control memory, i.e. the sequence of control words stored in the control memory that are sequentially fetched and applied by the control sequencer).
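The role of the control memory can be illustrated with a toy model in which the "hardware" (one ALU, one data memory, one sequencer loop) never changes and only the stored control words differ. The control-word format (operation, two source locations, destination) and all names below are hypothetical, chosen for this sketch.

```python
import operator

# Operations offered by the single ALU (an assumed, minimal set).
ALU_FUNCTIONS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def run(control_memory, data_memory):
    """Control sequencer: fetch each control word in order and apply it to the ALU."""
    memory = dict(data_memory)          # work on a copy of the data memory
    for op, src1, src2, dst in control_memory:
        memory[dst] = ALU_FUNCTIONS[op](memory[src1], memory[src2])
    return memory


# Control memory contents for Z = A + B + C + D.
SUM_PROGRAM = [
    ("add", "A", "B", "T"),
    ("add", "T", "C", "T"),
    ("add", "T", "D", "Z"),
]

# Different control memory contents, same hardware: Z = A * B - C.
OTHER_PROGRAM = [
    ("mul", "A", "B", "T"),
    ("sub", "T", "C", "Z"),
]

data = {"A": 1, "B": 2, "C": 3, "D": 4}
print(run(SUM_PROGRAM, data)["Z"])
print(run(OTHER_PROGRAM, data)["Z"])
```

Changing the function requires only new control memory contents, which is the sense in which the architecture is functionally flexible and expandable.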
CISC Architecture
• Larger instructions with variable formats (16-64 bits/instruction)
• More addressing modes (12-24)
• Few registers
• Mostly microcoded, with control memory

RISC Architecture
• Load-store architecture
• Fewer addressing modes
• Fixed-length instructions
• More registers
• Designed for pipeline efficiency
• Hardwired control unit


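\[
\text{Processor Performance} = \frac{\text{Time}}{\text{Program}}
= \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Time}}{\text{Cycle}}
\]

The three factors are the code size (instructions per program), the CPI (cycles per instruction) and the cycle time (time per cycle).

Architecture       -->  Implementation      -->  Realization
Compiler Designer       Processor Designer       Chip Designer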


Fastest instructions have all their operands in CPU registers and can be executed by the CPU in a single clock cycle.

Slowest instructions involve multiple memory accesses and multiple register-to-register operations.

Execution of a benchmark program Q on a CPU takes T seconds and involves the execution of N machine instructions.

T: program execution time

IPS: number of instructions executed per second

T = N / IPS

CPI is the average number of cycles per instruction needed to execute Q. With f the clock frequency in MHz:

CPI = (f x 10^6) / IPS

T = (N x CPI) / (f x 10^6)

Performance is determined by
Software: N --> T
Architecture: CPI --> T
Hardware: f --> T

CISC aims to reduce N at the expense of CPI.

RISC aims to reduce CPI at the expense of N.

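The execution-time formula lends itself to a quick numerical experiment. The sketch below assumes f is given in MHz, as the factor 10^6 suggests; the instruction counts and CPI values for the two machines are invented purely to illustrate the CISC versus RISC trade-off and are not measurements.

```python
def instructions_per_second(cpi, f_mhz):
    """IPS = (f x 10^6) / CPI, rearranged from the CPI definition."""
    return f_mhz * 1e6 / cpi


def execution_time(n_instructions, cpi, f_mhz):
    """T = (N x CPI) / (f x 10^6), in seconds."""
    return n_instructions * cpi / (f_mhz * 1e6)


# Hypothetical machines running the same benchmark at the same clock frequency.
F_MHZ = 100
cisc_time = execution_time(n_instructions=1_000_000, cpi=4.0, f_mhz=F_MHZ)
risc_time = execution_time(n_instructions=1_500_000, cpi=1.2, f_mhz=F_MHZ)

print(f"CISC-like: T = {cisc_time * 1e3:.1f} ms")
print(f"RISC-like: T = {risc_time * 1e3:.1f} ms")
```

In this made-up case the RISC-like machine executes more instructions but still finishes sooner because its CPI is much lower; with different numbers the outcome could go the other way.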
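Amdahl's Law
Example: Executing a program on n independent processors

\[
\text{Fraction}_{\text{enhanced}} = \text{parallelizable part of program}, \qquad \text{Speedup}_{\text{enhanced}} = n
\]

\[
\text{ExTime}_{\text{new}} = \text{ExTime}_{\text{old}} \times \left[ (1 - \text{Fraction}_{\text{enhanced}}) + \frac{\text{Fraction}_{\text{enhanced}}}{n} \right]
\]

\[
\text{Speedup}_{\text{overall}} = \frac{\text{ExTime}_{\text{old}}}{\text{ExTime}_{\text{new}}}
= \frac{1}{(1 - \text{Fraction}_{\text{enhanced}}) + \dfrac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}}}
\]

\[
\lim_{n \to \infty} \text{Speedup}_{\text{overall}} = \frac{1}{1 - \text{Fraction}_{\text{enhanced}}}
\]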


Example: Improving part of a processor (e.g., multiplier, floating-point unit)

\[
\text{Fraction}_{\text{enhanced}} = \text{part of program to be enhanced}
\]

\[
\text{Speedup}_{\text{overall}} = \frac{1}{(1 - \text{Fraction}_{\text{enhanced}}) + \dfrac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}}}
\]
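Amdahl's Law is easy to explore numerically. The following sketch implements the overall-speedup formula and its limit; the function names and the example fraction of 0.9 are assumptions chosen for illustration.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when a fraction of the execution time is sped up."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


def amdahl_limit(fraction_enhanced):
    """Upper bound on overall speedup as the enhancement grows without limit."""
    return 1.0 / (1.0 - fraction_enhanced)


# Hypothetical program in which 90% of the execution time is parallelizable.
FRACTION = 0.9
for n in (2, 4, 16, 256):
    print(f"n = {n:3d}: overall speedup = {amdahl_speedup(FRACTION, n):.2f}")
print(f"limit as n grows: {amdahl_limit(FRACTION):.2f}")
```

The same function covers both examples on the slides: for n processors pass n as the enhanced speedup, and for an improved unit such as a multiplier pass that unit's own speedup.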


Pipelining

Z = F(X, Y) = SqRoot(X^2 + Y^2)

[Figure: unpipelined datapath with inputs X and Y, squaring units ( )^2, an adder and a Square Root unit producing Z; 4 consecutive operations]

If each step takes 1T, then one calculation takes 3T and four take 12T.

[Figure: three-stage pipeline (Stage 1, Stage 2, Stage 3) computing X^2 + Y^2 followed by SqRoot, with inputs X and Y and output Z]

Assuming ideally that each stage takes 1T:
What will be the latency (time to produce the first result)?
What will be the throughput (pipeline rate in the steady state)?
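One way to reason about these two questions is to count time units for a stream of operations. The sketch below assumes an ideal pipeline (every stage takes exactly 1T, no stalls, negligible register overhead); the function names are chosen here for illustration.

```python
def unpipelined_time(num_ops, num_steps, t=1):
    """Each operation runs all steps to completion before the next one starts."""
    return num_ops * num_steps * t


def pipelined_time(num_ops, num_stages, t=1):
    """Fill the pipeline once, then one result completes every stage time."""
    return (num_stages + num_ops - 1) * t


STAGES = 3   # the three stages of the SqRoot(X^2 + Y^2) example

print("unpipelined, 4 operations:", unpipelined_time(4, STAGES), "T")
print("pipelined, first result:  ", pipelined_time(1, STAGES), "T")
print("pipelined, 4 operations:  ", pipelined_time(4, STAGES), "T")
```

Under these idealized assumptions the latency stays at three stage times, while in the steady state one result is produced per stage time, so the benefit grows with the number of consecutive operations.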


Learning outcomes

• Understand the design of a CISC instruction set and its implementation as a microprocessor chip through the creation of an optimal datapath and a microprogrammed / hardwired control unit using the flow-chart method.
• Understand how to design a Reduced Instruction Set Computer (RISC) architecture that implements a streamlined instruction set on a pipelined execution unit to achieve single-cycle execution.
• Understand techniques for implementing instruction-level parallelism to improve the throughput of processors.
• Understand cache memory design and performance metrics.
• Understand the design of Application Specific Instruction Set Processors (ASIPs), with emphasis on how high performance, low power and functional flexibility can be simultaneously addressed through them.


Evaluation
– Mid Term Test (EC2) 30%
– Assignments/Labs/quiz (EC1) 30%
– Comprehensive Exam (EC3) 40%
