Computer Hardware Design
EECS 4340
Columbia University
Course Description
• Practicum on hardware design
• “A college course, often in a specialized field of study, that is designed to give
students supervised practical application of a previously or concurrently studied
theory.”
Computer Architecture Lab Columbia University
NVIDIA (7.1 billion transistors)
A Hardware Design Engineer Must…
• understand what drives the field…
• Moore’s law
• … convert copious raw transistors into products …
• design principles
• … that function correctly …
• validation, testing techniques
• … and maximize profit.
• Understand time-to-market and choose best perf/time
• Benefits:
• Increased functionality in the same area
• more devices on a chip => more complex functions can be implemented
• or same functionality in a smaller area footprint
• Smaller chip => more dies per wafer => more profit per wafer
• Further, smaller devices are faster
• And, consume less energy to operate!
Parameters:
• Feature Size: 0.7x => Area = 0.5x
• Capacitance(C): 0.6x
• Supply voltage(Vdd): 0.9x
• Power (C × Vdd² × F): 0.6 × 0.9² ≈ 0.5x, times the frequency scaling F
• Power density (power / area) should remain constant: 0.5 × F / 0.5 = F
• => Frequency: 1.0x
Adapted from:
Scaling with design constraints: predicting the future of big chips (Rajamani)
The Exascale Challenge (Borkar)
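The scaling arithmetic above can be checked with a short sketch. The factors are the ones listed on the slide; the function name `scale` and the dictionary keys are illustrative choices, not part of the cited material.

```python
# Per-generation scaling factors as listed on the slide.
FEATURE = 0.7   # linear feature size
CAP = 0.6       # capacitance C
VDD = 0.9       # supply voltage Vdd


def scale(freq: float) -> dict:
    """Relative area, dynamic power and power density for one generation.

    `freq` is the frequency scaling factor (1.0 means unchanged).
    """
    area = FEATURE ** 2             # 0.49, rounded to 0.5x on the slide
    power = CAP * VDD ** 2 * freq   # dynamic power is proportional to C * V^2 * F
    return {"area": area, "power": power, "power_density": power / area}


if __name__ == "__main__":
    # Compare holding frequency constant with an arbitrary 1.4x increase.
    for f in (1.0, 1.4):
        print(f, scale(f))
```

With the frequency held at 1.0x the power density stays close to 1.0; any frequency increase raises it by the same factor, which is why the slide pins frequency at 1.0x.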
[Figure: wafer diameter in inches by year. Data points (year, inches): 1959, 0.5; 1962, 1; 1964, 1.5; 1968, 2; 1975, 3; 1983, 4; 1986, 5; 1989, 6; 1995, 8; 2002, 12; 2014, 17.]
[Figure: Fab Cost ($M) and Linewidth (um, nm).]
[Figure: hardware design flow. Stages shown: Specification, Architecture Design, Microarchitecture, RTL Design and Entry, Logic Synthesis, Circuit Design, Layout, Fabrication, Testing, Validation and Verification.]
• At least four:
• Instruction to perform addition
• Instruction to load inputs
• Instruction to store results
OVERVIEW
ADD
This instruction performs addition of two 32-bit values stored in the
register file. It takes two operands, s1 and s2, and stores the result
in the destination register. If any of the additions results in an
overflow, the value is stored in the overflow register at the byte
location that caused the overflow.
The instruction opcode is 01.
Instruction Format
Example:
01 S1 S2 D1
[1] [1] [1] [1]
[0][0][0][0] 2 2 2 2
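One possible reading of the ADD specification, sketched in Python. The specification does not say exactly how the four bytes interact or what is written to the overflow register, so this sketch assumes four independent 8-bit lanes and one overflow flag per byte; `add_bytes` is a hypothetical helper name.

```python
def add_bytes(s1: list[int], s2: list[int]) -> tuple[list[int], list[int]]:
    """Add two 32-bit operands given as four 8-bit lanes.

    Returns (overflow_flags, result_bytes).
    Assumption: lanes do not carry into each other, and the overflow
    register holds a 0/1 flag at each byte position.
    """
    if len(s1) != 4 or len(s2) != 4:
        raise ValueError("each operand must have exactly four bytes")
    overflow, result = [], []
    for a, b in zip(s1, s2):
        if not (0 <= a <= 0xFF and 0 <= b <= 0xFF):
            raise ValueError("each lane must fit in 8 bits")
        total = a + b
        result.append(total & 0xFF)                # keep the low 8 bits
        overflow.append(1 if total > 0xFF else 0)  # flag the overflowing lane
    return overflow, result


if __name__ == "__main__":
    # The example from the specification: all lanes hold 1.
    print(add_bytes([1, 1, 1, 1], [1, 1, 1, 1]))
```

Under these assumptions the specification's example (all lanes equal to 1) gives a result of 2 in every lane with no overflow flags set. A different reading, such as a single 32-bit add with carries between bytes, would need a different model, which is one reason the specification needs to be precise.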
MEMORY
The processor can address 64 addressable memory locations. Each
memory location holds 8 bits.
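[Figure: memory map. Address boundaries marked at 0, 7, 15, 23 and 63; regions labeled, in order from address 0: BOOT/BIOS, Keyboard Input, Display Output.]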
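[Figure: system block diagram. Blocks: CLK GEN, RESET, PWR SUPPLY, KBD CONTROL, DISPLAY CONTROL, EEPROM, PROC. Signals: CLK, mem_addr_o (5), mem_data_i (8), mem_data_o (8).]
[Figure: PROC block diagram. Blocks: BIOS, DEC, REGFILE, Memory, ADD, BR, MEM.]
[Figure: ADDER. Signals: s1_i, s2_i, d_o; labeled widths 8 and 9.]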
• Logic Synthesis
• Process of converting from a relatively
abstract HDL model of the desired
behavior to a structural model that can
be realized in hardware.
• Three choices
• Automatic synthesis (this class)
• Semi-custom design
• Full-custom design
[Figure: flip-flop from NAND. Area 1.12, Delay 1.39, Power 1.07.]
Full mask targets: 1. Full-custom, 2. Semi-custom, 3. Std-cell (ASIC)
• Complete customization of all mask layers.
• Reserved for high-performance, high-volume designs (microprocessors, analog circuits)
• Design libraries can be:
  • Obtained from external vendor
  • Full-custom (each team builds one)
  • Semi-custom (all in-house teams share)
• Design cost: Highest
• Manufacturing cost: Highest
• Best: at all levels, from fabrication and circuit to high-level design

Metal mask targets: Metal programmable logic (Structured ASIC). Example: Atmel CAP
• Wafers with prefabricated array of gates (“sea of universal gates”) and memory/processors that can be customized by connecting wires in layers.
• Fabs can pre-stock wafers: ~3 weeks turnaround time.
• Design cost: Reasonable
• Manufacturing cost: Medium
• Innovations restricted to functionality (e.g., new USB)

No masks: Field programmable logic (FPGA). Example: Xilinx, Altera etc.
• “Sea” of lookup tables implement functions
• Low startup costs, much cheaper and slower than standard cell designs; for 100K units FPGAs are better.
• No fabrication costs! Design cost is same as custom mask options
• Usually slower, larger than above two options

No masks: Soft IP. Example: http://www.ip-extreme.com/corestore/
• Provide encrypted intellectual property that can be used by other companies. Initial part and royalty.
• Almost like software, need EDA tools
• New functionality, better, faster etc.
• Testing cost
• Cost = test time × test cost per hour
• Test time = 1-2 minutes, test cost per hour = hundreds of dollars per hour
Assumptions                            Calculation
Die area                 140 mm²
Wafer diameter           200 mm        Die per wafer     186
Defect density           0.5/cm²
Process complexity       4
Wafer yield              95%           Die yield         50%
Processed wafer cost     $3000         Die cost          $33
Base package cost        $10
Cost per pin             $0.01
Number of pins           500           Package cost      $15
Test time                30 s
Test cost per hour       $400/hour     Test cost         $3
Test yield               95%           Processor cost    $54
Assumptions                            Calculation
Die area                 310 mm²
Wafer diameter           200 mm        Die per wafer     76
Defect density           0.5/cm²
Process complexity       4
Wafer yield              95%           Die yield         25%
Processed wafer cost     $3000         Die cost          $158
Base package cost        $15
Cost per pin             $0.01
Number of pins           1000          Package cost      $25
Test time                45 s
Test cost per hour       $400/hour     Test cost         $5
Test yield               95%           Processor cost    $198
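The two tables can be approximated with a short cost model. The slides list only the inputs and results, so the formulas here are an assumption: the commonly used textbook model in which dies per wafer is wafer area over die area minus an edge-loss term, die yield is wafer yield × (1 + defect density × die area / complexity)^(-complexity), and the final cost divides die, package and test cost by the test yield. All function and parameter names are illustrative.

```python
import math


def dies_per_wafer(wafer_diameter_mm: float, die_area_mm2: float) -> int:
    """Whole dies per wafer: area ratio minus dies lost along the edge."""
    radius = wafer_diameter_mm / 2
    gross = math.pi * radius ** 2 / die_area_mm2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return math.floor(gross - edge_loss)


def die_yield(wafer_yield: float, defects_per_cm2: float,
              die_area_mm2: float, complexity: float) -> float:
    """Fraction of good dies (assumed textbook yield model)."""
    area_cm2 = die_area_mm2 / 100  # 1 cm^2 = 100 mm^2
    return wafer_yield * (1 + defects_per_cm2 * area_cm2 / complexity) ** (-complexity)


def processor_cost(die_area_mm2, wafer_diameter_mm, defects_per_cm2, complexity,
                   wafer_yield, wafer_cost, base_package_cost, cost_per_pin,
                   pins, test_time_s, test_cost_per_hour, test_yield) -> dict:
    """Cost breakdown in dollars for one packaged, tested processor."""
    n = dies_per_wafer(wafer_diameter_mm, die_area_mm2)
    y = die_yield(wafer_yield, defects_per_cm2, die_area_mm2, complexity)
    die = wafer_cost / (n * y)
    package = base_package_cost + cost_per_pin * pins
    test = test_cost_per_hour * test_time_s / 3600  # test time x hourly rate
    total = (die + package + test) / test_yield
    return {"dies": n, "die_yield": y, "die": die,
            "package": package, "test": test, "total": total}


# Inputs copied from the first and second tables above.
first = processor_cost(140, 200, 0.5, 4, 0.95, 3000, 10, 0.01, 500, 30, 400, 0.95)
second = processor_cost(310, 200, 0.5, 4, 0.95, 3000, 15, 0.01, 1000, 45, 400, 0.95)

if __name__ == "__main__":
    print(first)
    print(second)
```

Under these assumptions the model gives the same die counts as the tables (186 and 76) and die yields near 50% and 25%. The dollar figures land within a few dollars of the tables, which appear to round intermediate values before combining them.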
                       Commodity   Server
Die Cost               64%         84%
Package and Assembly   29%         13%
Test                   7%          3%
Observations:
Know when to optimize for area, and remember each design decision affects cost!