Hardware I
ARCHITECTURE
Dr. Valon Raca
Notes
- Slides adapted and based on sources from University of Pennsylvania (Joseph
Devietti, Benedict Brown, C.J. Taylor, Milo Martin & Amir Roth)
This Unit: Digital Logic & Hardware Description
Readings
→ Digital logic
→ P&H, Appendix B
→ Manufacturing
→ P&H, Section 1.7
Motivation: Implementing a Datapath
[Figure: block diagram with fetch, control, and datapath]
→ Datapath: performs computation (registers, ALUs, etc.)
→ ISA specific: can implement every insn (single-cycle: in one pass!)
→ Control: determines which computation is performed
→ Routes data through datapath (which regs, which ALU op)
→ Fetch: get insn, translate opcode into control
→ Fetch → Decode → Execute “cycle”
Two Types of Components
→ Purely combinational: stateless computation
→ ALUs, muxes, control
→ Arbitrary Boolean functions
→ Combinational+sequential: storage
→ PC, insn/data memories, register file
→ Internally contain some combinational components
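The two categories can be sketched in Verilog, the HDL introduced later in these notes. This is a minimal illustration under assumptions: the module names (comb_mux, reg16) and the Clk and We port names are invented here, and the behavioral assign/always style differs from the gate-level style the slides use.

```verilog
// Purely combinational: Out depends only on the current inputs.
module comb_mux(S, A, B, Out);
  input S, A, B;
  output Out;
  assign Out = S ? B : A;
endmodule

// Sequential (storage): a 16-bit register with a write enable.
// Out changes only on a rising clock edge, and only when We is 1.
module reg16(Clk, We, In, Out);
  input Clk, We;
  input [15:0] In;
  output [15:0] Out;
  reg [15:0] Out;
  always @(posedge Clk)
    if (We) Out <= In;
endmodule
```

The mux has no memory of earlier inputs, while the register holds its value between writes, which is the distinction the list above draws.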
Example LC4 Datapath
LC4 Datapath
[Figure: LC4 datapath. Labeled blocks: 16-bit PC with a +1 incrementer, two memories (2^16 by 16 bit; ports addr, we, in, out), register file (r1sel, r2sel, wsel, we, wdata, r1data, r2data), ALU, and a 3-bit NZP register (we) with branch logic. Signal labels include insn[2:0], insn[8:6], insn[11:9], and the constant 3’b111; most data paths are 16 bits wide.]
Transistors & Fabrication
[Figure: Intel Pentium M wafer]
Semiconductor Technology
[Figure: MOSFET cross-sections showing gate, insulator, source, drain, channel, and substrate]
→ Basic technology element: MOSFET
→ Solid-state component acts like electrical switch
→ MOS: metal-oxide-semiconductor
→ Conductor, insulator, semi-conductor
→ FET: field-effect transistor
→ Channel conducts source→drain only when voltage applied to gate
→ Channel length: characteristic parameter (short → fast)
→ Aka “feature size” or “technology”
→ Currently: 0.007 micron (μm), i.e., 7 nanometers (nm)
→ Continued miniaturization (scaling) known as “Moore’s Law”
→ Won’t last forever, physical limits approaching (or are they?)
Transistors and Wires
[Figure: transistors and wires. © IBM; from slides © Krste Asanović, MIT]
Complementary MOS (CMOS)
→ Voltages as values
→ Power (VDD) = “1”, Ground = “0”
→ N-transistors
→ Conduct when gate voltage is 1
→ Good at passing 0s
→ P-transistors
→ Conduct when gate voltage is 0
→ Good at passing 1s
→ CMOS
→ Complementary n-/p- networks form boolean logic (i.e., gates)
→ And some non-gate elements too (important example: RAMs)
Basic CMOS Logic Gate
→ Inverter: NOT gate
→ One p-transistor, one n-transistor
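→ Basic operation
→ Input = 0: p-transistor conducts, n-transistor is off, so the output is pulled up to 1
→ Input = 1: n-transistor conducts, p-transistor is off, so the output is pulled down to 0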
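As a rough illustration of the inverter, Verilog’s built-in switch-level primitives can model the two transistors directly. This is a simulation-oriented sketch, not a layout: the names cmos_inverter, Vdd, and Gnd are made up for this example, and FPGA synthesis tools typically do not accept switch primitives.

```verilog
module cmos_inverter(In, Out);
  input In;
  output Out;
  supply1 Vdd;   // power, logic 1
  supply0 Gnd;   // ground, logic 0

  // Terminal order for pmos/nmos: (output, data input, control).
  pmos (Out, Vdd, In);  // conducts when In = 0, passes the 1
  nmos (Out, Gnd, In);  // conducts when In = 1, passes the 0
endmodule
```

Exactly one of the two transistors conducts for each input value, so the output is always driven to either power or ground.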
Another CMOS Gate Example
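→ What is this? Look at truth table (A, B → output)
→ 0, 0 → 1
→ 0, 1 → 1
→ 1, 0 → 1
→ 1, 1 → 0
→ Output is 0 only when both inputs are 1: a NAND gate
[Figure: CMOS transistor network with inputs A, B and one output]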
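The truth table above matches a CMOS NAND, which can be sketched with the same switch-level primitives: two p-transistors in parallel pull the output up, two n-transistors in series pull it down. The module and net names (cmos_nand, Vdd, Gnd, Mid) are assumptions for this sketch, and the code is meant for simulation only.

```verilog
module cmos_nand(A, B, Out);
  input A, B;
  output Out;
  supply1 Vdd;
  supply0 Gnd;
  wire Mid;      // node between the two series n-transistors

  // Pull-up network: parallel p-transistors, Out = 1 if A = 0 or B = 0.
  pmos (Out, Vdd, A);
  pmos (Out, Vdd, B);

  // Pull-down network: series n-transistors, Out = 0 only if A = 1 and B = 1.
  nmos (Out, Mid, A);
  nmos (Mid, Gnd, B);
endmodule
```

The pull-up and pull-down networks are complements of each other, so for every input combination exactly one of them drives the output.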
Digital Building Blocks: Logic Gates
→ Logic gates: implement Boolean functions
→ Basic gates: NOT, NAND, NOR
→ Underlying CMOS transistors are naturally inverting ( = NOT)
NOT (Inverter): A → A’
NAND: A, B → (AB)’
NOR: A, B → (A+B)’
Boolean Functions and Truth Tables
→ Any Boolean function can be represented as a truth table
→ Truth table: point-wise input → output mapping
→ Function is disjunction of all rows in which “Out” is 1
A,B,C → Out
0,0,0 → 0
0,0,1 → 0
0,1,0 → 0
0,1,1 → 0
1,0,0 → 0
1,0,1 → 1
1,1,0 → 1
1,1,1 → 1
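The “disjunction of rows” reading of this table can be written gate by gate: one AND per row whose Out is 1, then one OR over those rows. The module and wire names (sop3, row5, row6, row7) are hypothetical; the row numbers are the binary value of A,B,C.

```verilog
module sop3(A, B, C, Out);
  input A, B, C;
  output Out;
  wire B_, C_, row5, row6, row7;

  not (B_, B);
  not (C_, C);
  and (row5, A, B_, C);   // row 1,0,1
  and (row6, A, B, C_);   // row 1,1,0
  and (row7, A, B, C);    // row 1,1,1
  or  (Out, row5, row6, row7);
endmodule
```

This is the unsimplified sum-of-products form, Out = AB’C + ABC’ + ABC.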
PLA Example
→ PLA (programmable logic array) with 3 inputs, 2 outputs, and 4 product terms
→ Out0 = AB’C + ABC’ + ABC
[Figure: PLA schematic with inputs A, B, C and outputs Out0, Out1, distinguishing permanent connections from programmable connections (some left unconnected)]
Boolean Algebra
→ Boolean Algebra: rules for rewriting Boolean functions
→ Useful for simplifying Boolean functions
→ Simplifying = reducing gate count, reducing gate “levels”
→ Rules: similar to logic (0/1 = F/T)
→ Identity: A1 = A, A+0 = A
→ 0/1: A0 = 0, A+1 = 1
→ Inverses: (A’)’ = A
→ Idempotency: AA = A, A+A = A
→ Tautology: AA’ = 0, A+A’ = 1
→ Commutativity: AB = BA, A+B = B+A
→ Associativity: A(BC) = (AB)C, A+(B+C) = (A+B)+C
→ Distributivity: A(B+C) = AB+AC, A+(BC) = (A+B)(A+C)
→ DeMorgan’s: (AB)’ = A’+B’, (A+B)’ = A’B’
Logic Minimization
→ Logic minimization
→ Iterative application of rules to reduce function to simplest form
→ Design tools do this automatically
Out = AB’C + ABC’ + ABC
Out = A(B’C + BC’ + BC) // distributivity
Out = A(B’C + (BC’ + BC)) // associativity
Out = A(B’C + B(C’+C)) // distributivity (on B)
Out = A(B’C + B1) // tautology
Out = A(B’C + B) // identity
Out = A((B’+B)(C+B)) // distributivity (on +B)
Out = A(1(B+C)) // tautology
Out = A(B+C) // identity
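One way to sanity-check a minimization like this is to compare the original and minimized expressions on every input combination. The testbench below is a sketch with invented names (minimize_check, original, minimized); it is simulation-only code and is not something the slides prescribe.

```verilog
module minimize_check;
  reg A, B, C;
  wire original  = (A & ~B & C) | (A & B & ~C) | (A & B & C);
  wire minimized = A & (B | C);
  integer i, mismatches;

  initial begin
    mismatches = 0;
    // Walk through all 8 combinations of A, B, C.
    for (i = 0; i < 8; i = i + 1) begin
      {A, B, C} = i[2:0];
      #1;  // let the continuous assignments settle
      if (original !== minimized) mismatches = mismatches + 1;
    end
    if (mismatches == 0) $display("equivalent on all 8 inputs");
    else $display("%0d mismatches", mismatches);
    $finish;
  end
endmodule
```

Because the two expressions are equal, a simulator should report that they agree on all 8 inputs. An exhaustive check like this is only practical for small input counts.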
Non-Arbitrary Boolean Functions
→ PLAs implement Boolean functions point-wise
→ E.g., represent f(X) = X+5 as [0→5, 1→6, 2→7, 3→8, …]
→ Mainly useful for “arbitrary” functions, no compact representation
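The contrast between a point-wise and a compact representation can be sketched for the f(X) = X+5 example. Both modules below are assumptions for illustration (names add5_pointwise and add5_compact, a 2-bit X covering only the four listed points, a 4-bit result).

```verilog
// Point-wise: one table entry per input value, as a PLA would store it.
module add5_pointwise(X, Y);
  input [1:0] X;
  output [3:0] Y;
  reg [3:0] Y;
  always @(X)
    case (X)
      2'd0: Y = 4'd5;
      2'd1: Y = 4'd6;
      2'd2: Y = 4'd7;
      2'd3: Y = 4'd8;
    endcase
endmodule

// Compact: the same function described as an addition.
module add5_compact(X, Y);
  input [1:0] X;
  output [3:0] Y;
  assign Y = X + 4'd5;
endmodule
```

The table doubles in size with every extra input bit, while the adder description stays the same apart from the port widths.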
Multiplexer (Mux)
→ Multiplexer (mux): selects output from N inputs
→ Example: 1-bit 4-to-1 mux
→ Not shown: N-bit 4-to-1 mux = N 1-bit 4-to-1 muxes + 1 decoder
[Figure: 1-bit 4-to-1 mux with inputs A, B, C, D and output O, drawn both with a binary select S and with a 1-hot select S]
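A 1-bit 4-to-1 mux with a binary select can be sketched as a decoder that turns S into a 1-hot signal, followed by AND gates and an OR. The module names (decoder2to4, mux4to1), the internal wire names, and the choice that S = 0 selects A are assumptions of this sketch.

```verilog
// Decoder: binary select S to 1-hot H (exactly one bit of H is 1).
module decoder2to4(S, H);
  input [1:0] S;
  output [3:0] H;
  wire S0_, S1_;
  not (S0_, S[0]);
  not (S1_, S[1]);
  and (H[0], S1_, S0_);
  and (H[1], S1_, S[0]);
  and (H[2], S[1], S0_);
  and (H[3], S[1], S[0]);
endmodule

// 1-bit 4-to-1 mux: S = 0, 1, 2, 3 selects A, B, C, D.
module mux4to1(S, A, B, C, D, O);
  input [1:0] S;
  input A, B, C, D;
  output O;
  wire [3:0] H;
  wire selA, selB, selC, selD;

  decoder2to4 dec (S, H);
  and (selA, A, H[0]);
  and (selB, B, H[1]);
  and (selC, C, H[2]);
  and (selD, D, H[3]);
  or  (O, selA, selB, selC, selD);
endmodule
```

For an N-bit mux, the AND/OR part is repeated once per bit while a single decoder is shared.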
Adder
→ Adder: adds/subtracts two binary integers in two’s complement format
→ Half adder: adds two 1-bit “integers”, no carry-in
→ Full adder: adds three 1-bit “integers”, includes carry-in
→ Ripple-carry adder: N chained full adders add 2 N-bit integers
→ To subtract: invert every bit of the B input and set the bit-0 carry-in to 1 (A − B = A + B’ + 1)
Full Adder
→ What is the logic for a full adder?
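→ Look at truth table
CI A B → CO S
0 0 0 → 0 0
0 0 1 → 0 1
0 1 0 → 0 1
0 1 1 → 1 0
1 0 0 → 0 1
1 0 1 → 1 0
1 1 0 → 1 0
1 1 1 → 1 1
[Figure: full adder (FA) block with inputs A, B, CI and outputs S, CO]
[Figure: N-bit ripple-carry adder/subtractor built from chained full adders (A0, B0 → S0 through AN-1, BN-1 → SN-1), with a +/- control input]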
→ More later when we cover arithmetic
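The full-adder truth table and the ripple-carry structure can be sketched in Verilog as follows. The equations S = A xor B xor CI and CO = AB + CI(A xor B) are one standard realization of the table; the module names (full_adder, addsub4), the 4-bit width, and the Sub control input are assumptions of this example.

```verilog
module full_adder(A, B, CI, S, CO);
  input A, B, CI;
  output S, CO;
  wire AxB, AB, CIAxB;
  xor (AxB, A, B);
  xor (S, AxB, CI);
  and (AB, A, B);
  and (CIAxB, CI, AxB);
  or  (CO, AB, CIAxB);
endmodule

// 4-bit ripple-carry adder/subtractor: Sub = 0 adds, Sub = 1 subtracts.
module addsub4(Sub, A, B, S, CO);
  input Sub;
  input [3:0] A, B;
  output [3:0] S;
  output CO;
  wire [3:0] Bx;   // B, inverted bit by bit when subtracting
  wire [4:0] C;    // carry chain; C[0] is the bit-0 carry-in

  assign Bx = B ^ {4{Sub}};
  assign C[0] = Sub;

  full_adder fa0 (A[0], Bx[0], C[0], S[0], C[1]);
  full_adder fa1 (A[1], Bx[1], C[1], S[1], C[2]);
  full_adder fa2 (A[2], Bx[2], C[2], S[2], C[3]);
  full_adder fa3 (A[3], Bx[3], C[3], S[3], C[4]);

  assign CO = C[4];
endmodule
```

For example, with A = 7, B = 5, and Sub = 1 the chain computes 7 + 10 + 1 = 18, so S = 2 and the carry-out is 1.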
FPGAs
Alternative to Fabrication: FPGA
→ We’ll use FPGAs (Field Programmable Gate Arrays)
→ Also called Programmable Logic Devices (PLDs)
→ Uses
→ Hardware prototyping (what “we” are doing)
→ Low-volume special-purpose hardware
→ Network processing; FPGAs in the AWS and Azure clouds
FPGA
→ A Field Programmable Gate Array contains a collection of configurable logic elements and
a programmable interconnect that can be set up to perform the desired logical operations.
[Figure: FPGA with configurable logic elements joined by programmable interconnect]
Configurable Logic Blocks
→ Each of the configurable logic blocks (or logic cells) contains some lookup tables and one or
more flip-flops.
→ By setting the entries in the lookup tables (LUTs) these units can be programmed to
implement arbitrary logical functions on their inputs.
→ http://en.wikipedia.org/wiki/Field-programmable_gate_array
→ ZedBoard has 85K logic cells
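A lookup table can be modeled as a small memory indexed by its inputs: programming the LUT means choosing the stored bits. The sketch below assumes a 3-input LUT with an 8-bit Config port; the module and port names are invented, and real FPGA logic cells differ in LUT size and surrounding circuitry.

```verilog
module lut3(Config, In, Out);
  input [7:0] Config;  // one stored output bit per input combination
  input [2:0] In;
  output Out;
  assign Out = Config[In];  // the inputs select which stored bit appears
endmodule
```

With In = {A, B, C} and Config = 8'b11100000, only entries 5, 6, and 7 are 1, so this LUT computes Out = A(B+C). Any other 3-input function needs only a different Config value.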
Configuring FPGAs
→ By configuring the CLBs and the interconnect the FPGA can be ‘programmed’ to
implement the desired operation.
[Figure: configurable logic blocks (CLBs) configured as AND, XOR, and NAND gates, joined by programmable interconnect]
Hardware Design Methods
Hardware Design Methodologies
→ Fabricating a chip requires a detailed layout
→ All transistors & wires
→ How does a hardware designer describe such a design?
→ (Bad) Option #1: draw all the masks “by hand”
→ All 1 billion transistors? Umm…
→ Option #2: use computer-aided design (CAD) tools to help
→ Layout done by engineers with CAD tools or automatically
→ Design levels – uses abstraction
→ Transistor-level design – designer specifies transistors (not layout)
→ Gate-level design – designer specifies gates, wires (not transistors)
→ Higher-level design – designer uses higher-level building blocks
→ Adders, memories, etc.
→ Or logic in terms of AND/OR/NOT, which tools translate into gates
Describing Hardware
→ Two general options
→ Schematics
→ Pictures of gates & wires
→ Hardware description languages
→ Use textual descriptions to specify hardware
Schematics
[Figure: gate-level schematic with inputs S, A, B and output O]
→ Draw pictures
→ Use a schematic entry program to draw wires, logic blocks, gates
→ Support hierarchical design (arbitrary nesting)
+ Good match for hardware which is inherently spatial
– Time consuming, “non-scalable” (large designs are unreadable)
→ Rarely used in practice (“real-world” designs are too big)
Hardware Description Languages (HDLs)
→ Write “code” to describe hardware
→ HDL vs. SDL
→ Specify wires, gates, modules (also hierarchical)
+ Easier to create, edit, modify, scales well
– Misleading “sequential” representation: must still “think” spatially (gets easier
with practice)
module mux2to1(S, A, B, Out);
  input S, A, B;
  output Out;
  wire S_, AnS_, BnS;

  not (S_, S);          // S_ = S’
  and (AnS_, A, S_);    // A passes when S = 0
  and (BnS, B, S);      // B passes when S = 1
  or (Out, AnS_, BnS);  // Out = AS’ + BS
endmodule
(Hierarchical) HDL Example
→ Build up more complex modules using simpler modules
→ Example: 4-bit wide mux from four 1-bit muxes
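[Figure: 4-bit 2-to-1 mux with 4-bit inputs A and B, 1-bit select S, and 4-bit output Out]
module mux2to1_4(S, A, B, Out);
  input [3:0] A;
  input [3:0] B;
  input S;
  output [3:0] Out;

  mux2to1 mux0 (S, A[0], B[0], Out[0]);  // one 1-bit mux per bit, shared S
  mux2to1 mux1 (S, A[1], B[1], Out[1]);
  mux2to1 mux2 (S, A[2], B[2], Out[2]);
  mux2to1 mux3 (S, A[3], B[3], Out[3]);
endmodule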
Verilog HDL
→ Verilog: HDL we will be using
→ Syntactically similar to C (by design)
± Ease of syntax hides fact that this isn’t C (or any software lang)
→ We will use a few lectures to learn Verilog
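module mux2to1_4(S, A, B, Out);
  input [3:0] A;        // these aren’t variables
  input [3:0] B;
  input S;
  output [3:0] Out;

  mux2to1 mux0 (S, A[0], B[0], Out[0]);  // these aren’t function calls
  mux2to1 mux1 (S, A[1], B[1], Out[1]);
  mux2to1 mux2 (S, A[2], B[2], Out[2]);
  mux2to1 mux3 (S, A[3], B[3], Out[3]);
endmodule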
[Figure: tool flow from HDL source code, through synthesis, to a netlist (wires, gates, FFs), then through implementation (place & route) to a bitstream]
Side note: High-Level Synthesis
→ Translate “C to gates”
→ Write hardware at a higher level of abstraction than conventional HDLs
→ Greater programmer productivity
→ Need to write stylized C that will synthesize well
→ Tools are still slow