
Hardware I

Uploaded by

Bleron Morina

COMPUTER ARCHITECTURE
HARDWARE I
Dr. Valon Raca
Notes
- Slides adapted and based on sources from University of Pennsylvania (Joseph
Devietti, Benedict Brown, C.J. Taylor, Milo Martin & Amir Roth)

This Unit: Digital Logic & Hdw Description

[Figure: system stack: App, App, App / System software / Mem, CPU, I/O]
→ Transistors & fabrication
→ Digital logic basics
→ Focus on useful components

Readings
→ Digital logic
→ P&H, Appendix B

→ Manufacturing
→ P&H, Section 1.7

Motivation: Implementing a Datapath
[Figure: block diagram with fetch, PC, insn memory, register file, data memory, and control; the datapath is marked]
→ Datapath: performs computation (registers, ALUs, etc.)
→ ISA specific: can implement every insn (single-cycle: in one pass!)
→ Control: determines which computation is performed
→ Routes data through datapath (which regs, which ALU op)
→ Fetch: get insn, translate opcode into control
→ Fetch → Decode → Execute “cycle”
Two Types of Components
[Figure: block diagram with fetch, PC, insn memory, register file, data memory, and control; the datapath is marked]
→ Purely combinational: stateless computation
→ ALUs, muxes, control
→ Arbitrary Boolean functions
→ Combinational+sequential: storage
→ PC, insn/data memories, register file
→ Internally contain some combinational components
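The two component types can be contrasted in a few lines of Verilog. This is a minimal sketch, not part of the original slides: the module name `comb_vs_seq` and all port names are hypothetical, and the 16-bit width is only chosen to match the LC4 datapath.

```verilog
// Hypothetical example: one combinational block and one sequential block.
module comb_vs_seq(
  input         clk,
  input         we,     // write enable for the register
  input  [15:0] a,
  input  [15:0] b,
  output [15:0] sum,    // combinational output
  output reg [15:0] q   // sequential (stored) output
);
  // Combinational: stateless, sum depends only on the current a and b.
  assign sum = a + b;

  // Sequential: q changes only on a rising clock edge when we is 1,
  // and holds its value otherwise.
  always @(posedge clk)
    if (we) q <= sum;
endmodule
```

The register is fed by combinational logic here, which mirrors the slide's point that storage elements are surrounded by (and internally contain) combinational components.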
Example LC4 Datapath

LC4 Datapath
[Figure: LC4 datapath: 16-bit PC with +1 incrementer; insn memory (2^16 by 16 bit); register file with r1sel, r2sel, wsel, we, wdata, r1data, r2data; register selects taken from insn[11:9], insn[8:6], insn[2:0], and the constant 3’b111; ALU; data memory (2^16 by 16 bit) with addr, we, in, out; 3-bit NZP register with we, fed by n/z/p; branch logic]
Transistors & Fabrication

[Figure: Intel Pentium M wafer]

Semiconductor Technology
[Figure: two MOSFET cross-sections, each showing gate, insulator, source, drain, and channel on a substrate]
→ Basic technology element: MOSFET
→ Solid-state component acts like electrical switch
→ MOS: metal-oxide-semiconductor
→ Conductor, insulator, semi-conductor
→ FET: field-effect transistor
→ Channel conducts source→drain only when voltage applied to gate
→ Channel length: characteristic parameter (short → fast)
→ Aka “feature size” or “technology”
→ Currently: 0.007 micron (μm) = 7 nanometers (nm)
→ Continued miniaturization (scaling) is what drives “Moore’s Law” (transistor counts doubling roughly every two years)
→ Won’t last forever, physical limits approaching (or are they?)
Transistors and Wires

©IBM
From slides © Krste Asanović, MIT

Complementary MOS (CMOS)
→ Voltages as values
→ Power (VDD) = “1”, Ground = “0”
→ Two kinds of MOSFETs
→ N-transistors
→ Conduct when gate voltage is 1
→ Good at passing 0s
→ P-transistors
→ Conduct when gate voltage is 0
→ Good at passing 1s
→ CMOS
→ Complementary n-/p- networks form boolean logic (i.e., gates)
→ And some non-gate elements too (important example: RAMs)
[Figure: CMOS inverter: p-transistor between power (1) and the output (“node”), n-transistor between the output and ground (0); the input drives both gates]
Basic CMOS Logic Gate
→ Inverter: NOT gate
→ One p-transistor, one n-transistor
→ Basic operation
→ Input = 0
→ P-transistor closed (conducting), n-transistor open (not conducting)
→ Power charges output (1)
→ Input = 1
→ P-transistor open, n-transistor closed
→ Output discharges to ground (0)
[Figure: the inverter drawn twice: input 0 gives output 1, input 1 gives output 0]
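The inverter's operation can be sketched with Verilog's built-in switch-level primitives `pmos` and `nmos`, whose terminals are (output, input, control). The module name `cmos_inverter` is hypothetical; switch-level models like this are meant for simulation and are generally not accepted by FPGA synthesis tools.

```verilog
// Hypothetical switch-level model of the CMOS inverter.
module cmos_inverter(input in, output out);
  supply1 vdd;   // power, logic 1
  supply0 gnd;   // ground, logic 0

  // P-transistor: conducts when its gate (in) is 0, charging out to 1.
  pmos p1 (out, vdd, in);

  // N-transistor: conducts when its gate (in) is 1, discharging out to 0.
  nmos n1 (out, gnd, in);
endmodule
```

For any defined input exactly one of the two transistors conducts, so the output is always driven and power is never shorted to ground.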
Another CMOS Gate Example
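→ What is this? Look at truth table
→ 0, 0 → 1
→ 0, 1 → 1
→ 1, 0 → 1
→ 1, 1 → 0
→ Result: NAND (NOT AND)
→ NAND is “universal”
→ What function is this?
[Figure: two CMOS gates, each with inputs A and B and one output: the gate described by the truth table above, and the gate that the last question asks about]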
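The NAND truth table follows from the standard CMOS NAND topology: two p-transistors in parallel pull the output up, two n-transistors in series pull it down. A minimal switch-level sketch follows; the module and wire names are hypothetical, and the model is intended for simulation only.

```verilog
// Hypothetical switch-level model of a 2-input CMOS NAND gate.
module cmos_nand(input a, input b, output out);
  supply1 vdd;
  supply0 gnd;
  wire mid;  // node between the two series n-transistors

  // Pull-up network: parallel p-transistors.
  // out is charged to 1 if a == 0 OR b == 0.
  pmos p1 (out, vdd, a);
  pmos p2 (out, vdd, b);

  // Pull-down network: series n-transistors.
  // out is discharged to 0 only if a == 1 AND b == 1.
  nmos n1 (out, mid, a);
  nmos n2 (mid, gnd, b);
endmodule
```

The pull-up and pull-down networks are complementary: for every input combination exactly one of them connects the output to a supply.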
Digital Building Blocks: Logic Gates
→ Logic gates: implement Boolean functions
→ Basic gates: NOT, NAND, NOR
→ Underlying CMOS transistors are naturally inverting ( = NOT)
→ NOT (inverter): out = A’
→ NAND: out = (AB)’
→ NOR: out = (A+B)’
→ NAND, NOR are “Boolean complete”
→ BUF: out = A
→ AND: out = AB
→ OR: out = A+B
→ AND3: out = ABC
→ ANDNOT: out = AB’
→ XOR: out = AB’+A’B (A^B)
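One way to see why NAND is Boolean complete is to build NOT, AND, and OR out of NAND gates alone. The sketch below uses Verilog's built-in `nand` primitive; the module name and port names are hypothetical.

```verilog
// Hypothetical module: NOT, AND, and OR built only from 2-input NANDs.
module gates_from_nand(
  input  A,
  input  B,
  output NotA,
  output AandB,
  output AorB
);
  wire notB, nandAB;

  nand (NotA,   A, A);            // A'  = (AA)'
  nand (notB,   B, B);            // B'  = (BB)'
  nand (nandAB, A, B);            // (AB)'
  nand (AandB,  nandAB, nandAB);  // AB  = ((AB)')'
  nand (AorB,   NotA, notB);      // A+B = (A'B')'  by DeMorgan
endmodule
```

Since any Boolean function can be written with NOT, AND, and OR, it can also be written with NAND only; the same argument works for NOR.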
Digital Logic Review

Boolean Functions and Truth Tables
→ Any Boolean function can be represented as a truth table
→ Truth table: point-wise input → output mapping
→ Function is disjunction of all rows in which “Out” is 1
A,B,C → Out
0,0,0 → 0
0,0,1 → 0
0,1,0 → 0
0,1,1 → 0
1,0,0 → 0
1,0,1 → 1
1,1,0 → 1
1,1,1 → 1

→ Example above: Out = AB’C + ABC’ + ABC


Truth Tables and PLAs
→ Implement Boolean function by implementing its truth table
→ Takes two levels of logic
→ Assumes inputs and inverses of inputs are available
(usually are)
→ First level: ANDs (product terms)
→ Second level: ORs (sums of product terms)

→ PLA (programmable logic array)


→ Flexible circuit for doing this

PLA Example
→ PLA with 3 inputs, 2 outputs, and 4 product terms
→ Out0 = AB’C + ABC’ + ABC
[Figure: PLA with inputs A, B, C and outputs Out0 and Out1; legend: permanent connections, programmable connections (unconnected)]
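The two-level structure of Out0 can be written gate by gate. This is a sketch of the AND plane and OR plane for Out0 only (Out1 is not specified on the slide); the module name `pla_out0` and the wire names are hypothetical.

```verilog
// Hypothetical two-level (AND then OR) implementation of
// Out0 = AB'C + ABC' + ABC.
module pla_out0(input A, input B, input C, output Out0);
  wire nB, nC;       // inverses of the inputs that the product terms need
  wire p0, p1, p2;   // product terms (first level)

  not (nB, B);
  not (nC, C);

  // First level: ANDs, one per truth-table row whose output is 1.
  and (p0, A, nB, C);   // AB'C
  and (p1, A, B, nC);   // ABC'
  and (p2, A, B, C);    // ABC

  // Second level: OR of the product terms.
  or (Out0, p0, p1, p2);
endmodule
```

Three of the PLA's four product terms are used; in a real PLA the same product terms could also be shared with Out1.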
Boolean Algebra
→ Boolean Algebra: rules for rewriting Boolean functions
→ Useful for simplifying Boolean functions
→ Simplifying = reducing gate count, reducing gate “levels”
→ Rules: similar to logic (0/1 = F/T)
→ Identity: A1 = A, A+0 = A
→ 0/1: A0 = 0, A+1 = 1
→ Inverses: (A’)’ = A
→ Idempotency: AA = A, A+A = A
→ Tautology: AA’ = 0, A+A’ = 1
→ Commutativity: AB = BA, A+B = B+A
→ Associativity: A(BC) = (AB)C, A+(B+C) = (A+B)+C
→ Distributivity: A(B+C) = AB+AC, A+(BC) = (A+B)(A+C)
→ DeMorgan’s: (AB)’ = A’+B’, (A+B)’ = A’B’
Logic Minimization
→ Logic minimization
→ Iterative application of rules to reduce function to simplest form
→ Design tools do this automatically
Out = AB’C + ABC’ + ABC
Out = A(B’C + BC’ + BC) // distributivity
Out = A(B’C + (BC’ + BC)) // associativity
Out = A(B’C + B(C’+C)) // distributivity (on B)
Out = A(B’C + B1) // tautology
Out = A(B’C + B) // identity
Out = A((B’+B)(C+B)) // distributivity (on +B)
Out = A(1(B+C)) // tautology (and commutativity)
Out = A(B+C) // identity
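A minimization like this can be sanity-checked by simulating both forms over all eight input combinations. The testbench below is a sketch using simulation-only constructs; the module name `minimization_check` and the signal names are hypothetical.

```verilog
// Hypothetical testbench: compare the original sum-of-products
// against the minimized form for every input combination.
module minimization_check;
  reg A, B, C;
  integer i;

  wire sop = (A & ~B & C) | (A & B & ~C) | (A & B & C);  // AB'C + ABC' + ABC
  wire min = A & (B | C);                                 // A(B+C)

  initial begin
    for (i = 0; i < 8; i = i + 1) begin
      {A, B, C} = i[2:0];   // enumerate 000 .. 111
      #1;                   // let the combinational logic settle
      if (sop !== min)
        $display("MISMATCH at A=%b B=%b C=%b", A, B, C);
    end
    $finish;
  end
endmodule
```

The minimized form needs one AND and one OR gate, compared with three 3-input ANDs and one 3-input OR (plus inverters) for the original.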
Non-Arbitrary Boolean Functions
→ PLAs implement Boolean functions point-wise
→ E.g., represent f(X) = X+5 as [0→5, 1→6, 2→7, 3→8, …]
→ Mainly useful for “arbitrary” functions, no compact representation

→ Many useful Boolean functions are not arbitrary


→ Have a compact implementation
→ Examples
→ Multiplexer
→ Adder
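The difference between a point-wise and a compact implementation can be sketched for f(X) = X+5. The 2-bit input width and both module names are assumptions made for illustration.

```verilog
// Hypothetical point-wise version: one table entry per input value,
// as a PLA would implement it.
module plus5_table(input [1:0] X, output reg [3:0] Y);
  always @(*) begin
    case (X)
      2'd0: Y = 4'd5;
      2'd1: Y = 4'd6;
      2'd2: Y = 4'd7;
      2'd3: Y = 4'd8;
      default: Y = 4'bx;  // only reached if X contains unknown bits in simulation
    endcase
  end
endmodule

// Hypothetical compact version: the same function as an adder.
module plus5_adder(input [1:0] X, output [3:0] Y);
  assign Y = X + 4'd5;
endmodule
```

The table doubles in size with every extra input bit, while the adder grows by roughly one full adder per bit.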

Multiplexer (Mux)
→ Multiplexer (mux): selects output from N inputs
→ Example: 1-bit 4-to-1 mux
→ Not shown: N-bit 4-to-1 mux = N 1-bit 4-to-1 muxes + 1 decoder
[Figure: 1-bit 4-to-1 mux with inputs A, B, C, D and output O; select S shown both in binary and in decoded 1-hot form]
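A 1-bit 4-to-1 mux can be sketched as a decoder that turns the binary select into a 1-hot select, followed by AND-OR selection logic. The module name, the `hot` wire, and the choice that S = 0 selects A are assumptions, not taken from the slide.

```verilog
// Hypothetical 1-bit 4-to-1 mux: binary select -> 1-hot -> AND/OR.
module mux4to1(
  input  [1:0] S,        // binary select (assumed: 0 selects A, 3 selects D)
  input        A, B, C, D,
  output       O
);
  wire [3:0] hot;        // 1-hot version of S: exactly one bit is 1

  // Decoder
  assign hot[0] = ~S[1] & ~S[0];
  assign hot[1] = ~S[1] &  S[0];
  assign hot[2] =  S[1] & ~S[0];
  assign hot[3] =  S[1] &  S[0];

  // Each input is gated by its 1-hot bit; only one AND term can be non-zero.
  assign O = (hot[0] & A) | (hot[1] & B) | (hot[2] & C) | (hot[3] & D);
endmodule
```

An N-bit mux would reuse one decoder and replicate only the AND-OR stage per bit, which is the "N 1-bit muxes + 1 decoder" structure mentioned above.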
Adder
→ Adder: adds/subtracts two binary integers in two’s complement format
→ Half adder: adds two 1-bit “integers”, no carry-in
→ Full adder: adds three 1-bit “integers”, includes carry-in
→ Ripple-carry adder: N chained full adders add two N-bit integers
→ To subtract: invert every bit of the B input, set bit 0 carry-in to 1 (A – B = A + B’ + 1 in two’s complement)

Full Adder
→ What is the logic for a full adder?
→ Look at truth table
CI A B → CO S
0 0 0 → 0 0
0 0 1 → 0 1
0 1 0 → 0 1
0 1 1 → 1 0
1 0 0 → 0 1
1 0 1 → 1 0
1 1 0 → 1 0
1 1 1 → 1 1
[Figure: full adder (FA) block symbol and circuit, with inputs A, B, CI and outputs S, CO]
→ With C standing for the carry-in CI:
→ S = C’A’B + C’AB’ + CA’B’ + CAB = C ^ A ^ B
→ CO = C’AB + CA’B + CAB’ + CAB = CA + CB + AB
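The two simplified equations translate directly into Verilog. A minimal sketch, with the hypothetical module name `full_adder`:

```verilog
// Hypothetical 1-bit full adder following the simplified equations.
module full_adder(input A, input B, input CI, output S, output CO);
  // Sum is 1 when an odd number of the three inputs are 1.
  assign S  = CI ^ A ^ B;

  // Carry-out is 1 when at least two of the three inputs are 1.
  assign CO = (CI & A) | (CI & B) | (A & B);
endmodule
```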
N-bit Adder/Subtracter

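[Figure: N-bit adder/subtracter built from a chain of full adders (FA): bit i takes Ai and Bi and produces Si, for i = 0 … N-1; a +/– control input selects 0 or 1 as the bit-0 carry-in; shown next to its block symbol with inputs A, B, +/– and output S]
→ More later when we cover arithmetic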
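A ripple-carry adder/subtracter can be sketched by chaining full adders and using one control bit both to invert B and to set the bit-0 carry-in. The names `addsub`, `Sub`, and `carry`, and the default width N = 4, are assumptions for illustration.

```verilog
// Hypothetical 1-bit full adder (repeated here so the example is self-contained).
module full_adder(input A, input B, input CI, output S, output CO);
  assign S  = CI ^ A ^ B;
  assign CO = (CI & A) | (CI & B) | (A & B);
endmodule

// Hypothetical N-bit ripple-carry adder/subtracter.
// Sub = 0: S = A + B.   Sub = 1: S = A + ~B + 1 = A - B (two's complement).
module addsub #(parameter N = 4) (
  input  [N-1:0] A,
  input  [N-1:0] B,
  input          Sub,
  output [N-1:0] S
);
  wire [N:0] carry;
  assign carry[0] = Sub;   // bit-0 carry-in is 1 when subtracting

  genvar i;
  generate
    for (i = 0; i < N; i = i + 1) begin : stage
      // XOR with Sub inverts B[i] when subtracting, passes it through otherwise.
      full_adder fa (
        .A (A[i]),
        .B (B[i] ^ Sub),
        .CI(carry[i]),
        .S (S[i]),
        .CO(carry[i+1])
      );
    end
  endgenerate
endmodule
```

Each stage must wait for the carry of the stage before it, so the delay of this design grows linearly with N.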
FPGAs

Alternative to Fabrication: FPGA
→ We’ll use FPGAs (Field Programmable Gate Array)
→ Also called Programmable Logic Devices (PLDs)

→ An FPGA is a special type of programmable chip


→ Conceptually, contains a grid of gates
→ The wiring connecting them can be reconfigured electrically
→ Using more transistors as switches
→ Once configured, the FPGA can emulate any digital logic design
→ Tool converts gate-level design to configuration

→ Uses
→ Hardware prototyping (what “we” are doing)
→ Low-volume special-purpose hardware
→ Network processing. FPGAs in AWS, Azure Clouds
FPGA
→ A Field Programmable Gate Array contains a collection of configurable logic elements and
a programmable interconnect that can be set up to perform the desired logical operations.

[Figure: FPGA fabric: configurable logic blocks (CLBs) joined by programmable interconnect]

Configurable Logic Blocks
→ Each of the configurable logic blocks (or logic cells) contains some lookup tables and one or
more flip-flops.
→ By setting the entries in the lookup tables (LUTs) these units can be programmed to
implement arbitrary logical functions on their inputs.
→ http://en.wikipedia.org/wiki/Field-programmable_gate_array
→ ZedBoard has 85K logic cells
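A logic cell can be modeled as a lookup table followed by an optional flip-flop. The sketch below is a simplified behavioral model, not the structure of any particular FPGA: the 4-input LUT size, the module name `lut4_cell`, and the ports `init` and `use_ff` are all assumptions. In a real device the table contents and the flip-flop bypass are set by the configuration bitstream rather than by ports.

```verilog
// Hypothetical model of one logic cell: 4-input LUT + optional flip-flop.
module lut4_cell(
  input         clk,
  input  [3:0]  in,      // the four LUT inputs
  input  [15:0] init,    // LUT contents: one output bit per input combination
  input         use_ff,  // 1: registered output, 0: combinational output
  output        out
);
  // The inputs act as an address into the 16-entry truth table.
  wire lut_out = init[in];

  reg q;
  always @(posedge clk)
    q <= lut_out;

  assign out = use_ff ? q : lut_out;
endmodule
```

For example, `init = 16'h8000` makes the LUT a 4-input AND (only address 1111 holds a 1), and `init = 16'h6996` makes it a 4-input XOR.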

Configuring FPGAs
→ By configuring the CLBs and the interconnect the FPGA can be ‘programmed’ to
implement the desired operation.
[Figure: the same FPGA fabric with CLBs configured as AND, XOR, and NAND gates and wired together through the programmable interconnect]

Hardware Design Methods

Hardware Design Methodologies
→ Fabricating a chip requires a detailed layout
→ All transistors & wires
→ How does a hardware designer describe such design?
→ (Bad) Option #1: draw all the masks “by hand”
→ All 1 billion transistors? Umm…
→ Option #2: use computer-aided design (CAD) tools to help
→ Layout done by engineers with CAD tools or automatically
→ Design levels – uses abstraction
→ Transistor-level design – designer specifies transistors (not layout)
→ Gate-level design – designer specifies gates, wires (not transistors)
→ Higher-level design – designer uses higher-level building blocks
→ Adders, memories, etc.
→ Or logic in terms of and/or/not, and tools translate into gates
Describing Hardware
→ Two general options
→ Schematics
→ Pictures of gates & wires
→ Hardware description languages
→ Use textual descriptions to specify hardware

→ Translation process called “synthesis”
→ Textual description → gates → full layout
→ Tries to minimize the delay and/or number of gates
→ Much like process of compilation of software
→ Much slower!

Schematics

[Figure: gate-level schematic with inputs S, A, B and output O]

→ Draw pictures
→ Use a schematic entry program to draw wires, logic blocks, gates
→ Support hierarchical design (arbitrary nesting)
+ Good match for hardware which is inherently spatial
– Time consuming, “non-scalable” (large designs are unreadable)
→ Rarely used in practice (“real-world” designs are too big)
Hardware Description Languages (HDLs)
→ Write “code” to describe hardware
→ HDL vs. SDL (software description language, e.g., Java, C)
→ Specify wires, gates, modules (also hierarchical)
+ Easier to create, edit, modify, scales well
– Misleading “sequential” representation: must still “think” spatially (gets easier
with practice)
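[Figure: gate-level 2-to-1 mux with inputs S, A, B and output Out]
module mux2to1(S, A, B, Out);
  input S, A, B;
  output Out;
  wire S_, AnS_, BnS;

  not (S_, S);         // S_ = S'
  and (AnS_, A, S_);   // A passes when S = 0
  and (BnS, B, S);     // B passes when S = 1
  or (Out, AnS_, BnS); // Out = AS' + BS
endmodule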
(Hierarchical) HDL Example
→ Build up more complex modules using simpler modules
→ Example: 4-bit wide mux from four 1-bit muxes
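[Figure: 4-bit wide 2-to-1 mux with 4-bit inputs A and B, 1-bit select S, and 4-bit output Out]
module mux2to1_4(S, A, B, Out);
  input [3:0] A;
  input [3:0] B;
  input S;
  output [3:0] Out;

  // One 1-bit mux per bit position; all four share the select S
  mux2to1 mux0 (S, A[0], B[0], Out[0]);
  mux2to1 mux1 (S, A[1], B[1], Out[1]);
  mux2to1 mux2 (S, A[2], B[2], Out[2]);
  mux2to1 mux3 (S, A[3], B[3], Out[3]);
endmodule
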
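The four hand-written instances can also be produced by a generate loop, which makes the width a parameter. This is a sketch in Verilog-2001 style rather than something from the slides: the module name `mux2to1_n`, the parameter `N`, and the block label `bit_slice` are hypothetical, and the 1-bit `mux2to1` is repeated so that the example stands alone.

```verilog
// 1-bit 2-to-1 mux, same structure as the slide's mux2to1.
module mux2to1(S, A, B, Out);
  input S, A, B;
  output Out;
  wire S_, AnS_, BnS;
  not (S_, S);
  and (AnS_, A, S_);
  and (BnS, B, S);
  or (Out, AnS_, BnS);
endmodule

// Hypothetical N-bit wide mux built from N copies of mux2to1.
module mux2to1_n #(parameter N = 4) (
  input          S,
  input  [N-1:0] A,
  input  [N-1:0] B,
  output [N-1:0] Out
);
  genvar i;
  generate
    // The loop is unrolled at elaboration time: it describes N
    // separate mux instances, not a loop that runs in hardware.
    for (i = 0; i < N; i = i + 1) begin : bit_slice
      mux2to1 m (S, A[i], B[i], Out[i]);
    end
  endgenerate
endmodule
```

With the default N = 4 this describes the same hardware as `mux2to1_4`; instantiating it as `mux2to1_n #(.N(16))` would give a 16-bit mux.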
Verilog HDL
→ Verilog: HDL we will be using
→ Syntactically similar to C (by design)
± Ease of syntax hides fact that this isn’t C (or any software lang)
→ We will use a few lectures to learn Verilog
module mux2to1_4(S, A, B, Out);
  input [3:0] A;       // These aren't variables
  input [3:0] B;
  input S;
  output [3:0] Out;

  mux2to1 mux0 (S, A[0], B[0], Out[0]);   // These aren't function calls
  mux2to1 mux1 (S, A[1], B[1], Out[1]);
  mux2to1 mux2 (S, A[2], B[2], Out[2]);
  mux2to1 mux3 (S, A[3], B[3], Out[3]);
endmodule
HDLs are not “SDLs”
→ SDL == Software Description Language (e.g., Java, C)
→ Similar in some (intentional) ways …
→ Syntax
→ Named entities, constants, scoping, etc.
→ Tool chain: synthesis tool analogous to compiler
→ Multiple levels of representation
→ “Optimization”
→ Multiple targets (portability)
→ “Software” engineering
→ Modular structure and parameterization
→ Libraries and code repositories
→ … but different in many others
→ One of the most difficult conceptual leaps of this course
Hardware is not Software
→ Just two different beasts (or two parts of the same beast)
→ Things that make sense in hardware don’t in software, and vice versa
→ One of the main themes of this course
→ Software is sequential
→ Hardware is inherently parallel and “always on”
→ Have to work to get hardware to not do things in parallel
→ Software atoms are purely functional (“digital”)
→ Hardware atoms have quantitative (“analog”) properties too
→ Including correctness properties!
→ Software mostly about quality (“functionality”)
→ Hardware mostly about quantity: performance, area, power, etc.
→ One reason that HDLs are not SDLs
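A small example of "inherently parallel": two registers that exchange values on every clock edge. In sequential software a swap needs a temporary variable; here both non-blocking assignments read the old values and update together. The module and port names are hypothetical.

```verilog
// Hypothetical example: swapping two registers in one clock cycle.
module swap_regs(
  input            clk,
  input            load,        // 1: load new values, 0: swap
  input      [7:0] a_in,
  input      [7:0] b_in,
  output reg [7:0] a,
  output reg [7:0] b
);
  always @(posedge clk) begin
    if (load) begin
      a <= a_in;
      b <= b_in;
    end else begin
      // Both right-hand sides are sampled before either register updates,
      // so the order of these two lines does not matter.
      a <= b;
      b <= a;
    end
  end
endmodule
```

Reading these two lines as sequential statements would suggest that both registers end up holding the old value of b, which is not what the hardware does.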


HDL: Behavioral Constructs
→ HDLs have low-level structural constructs
→ Specify hardware structures directly
→ Transistors, gates (and, not) and wires, hierarchy via modules
→ Also have mid-level behavioral constructs
→ Specify operations, not hardware to perform them
→ Low-to-medium-level: &, ~, +, *
→ Also higher-level behavioral constructs
→ High-level: if-then-else, for loops
→ Some of these are synthesizable (some are not)
→ Tools try to guess what you want, often with highly inefficient results
– Higher-level → more difficult to know what it will synthesize to!
→ HDLs are both high- and low-level languages in one!
→ And the boundary is not clear!
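The same 4-bit 2-to-1 mux can be described at the two behavioral levels named above. Both modules below are sketches with hypothetical names; a synthesis tool would typically map either of them to the same mux hardware, but that mapping is the tool's choice, not something the code states.

```verilog
// Hypothetical mid-level behavioral version: an operator, no explicit gates.
module mux2to1_4_op(input S, input [3:0] A, input [3:0] B, output [3:0] Out);
  assign Out = S ? B : A;
endmodule

// Hypothetical higher-level behavioral version: if-then-else.
module mux2to1_4_if(input S, input [3:0] A, input [3:0] B, output reg [3:0] Out);
  // @(*) re-evaluates the block whenever S, A, or B changes.
  // Out is assigned on every path, so this stays purely combinational.
  always @(*) begin
    if (S)
      Out = B;
    else
      Out = A;
  end
endmodule
```

Leaving out the `else` branch would make the tool infer storage (a latch) to hold the old value of Out, which is one example of how hard it can be to predict what higher-level code synthesizes to.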
HDL: Simulation
→ Another use of HDL: simulating & testing a hardware design
→ Cheaper & faster turnaround (no need to fabricate)
→ More visibility into design (“debugger” interface)

→ HDLs have features just for simulation


→ Higher level data types: integers, FP-numbers, timestamps
→ Routines for I/O: error messages, file operations
→ Obviously, these cannot be synthesized into circuits

→ Also another reason for HDL/SDL confusion


→ HDLs have “SDL” features for simulation
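A testbench shows the simulation-only features in use. The sketch below exercises the 1-bit `mux2to1` (repeated here so the example stands alone); the testbench name `mux2to1_tb` and its variables are hypothetical. The `initial` block, the `#1` delay, the `integer` variables, `$display`, and `$finish` exist only for simulation and do not describe hardware.

```verilog
// Design under test: the gate-level mux2to1 from the earlier slide.
module mux2to1(S, A, B, Out);
  input S, A, B;
  output Out;
  wire S_, AnS_, BnS;
  not (S_, S);
  and (AnS_, A, S_);
  and (BnS, B, S);
  or (Out, AnS_, BnS);
endmodule

// Hypothetical testbench: has no ports and is never synthesized.
module mux2to1_tb;
  reg S, A, B;
  wire Out;
  integer i, errors;

  mux2to1 dut (S, A, B, Out);

  initial begin
    errors = 0;
    for (i = 0; i < 8; i = i + 1) begin
      {S, A, B} = i[2:0];          // try every input combination
      #1;                          // advance simulated time so Out settles
      if (Out !== (S ? B : A)) begin
        $display("ERROR: S=%b A=%b B=%b Out=%b", S, A, B, Out);
        errors = errors + 1;
      end
    end
    $display("%0d error(s)", errors);
    $finish;
  end
endmodule
```

The check compares the structural design against a one-line behavioral reference, a common pattern when the reference is easier to get right than the implementation.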
FPGA “Design Flow”

HDL source code → synthesis → netlist (wires, gates, FFs) → implementation (place & route) → bitstream

→ Hardware compilers are generally much slower than their software counterparts
→ Solving hard problems: many more choices, optimizing for area, power, picosecond-level timing

Side note: High-Level Synthesis
→ Translate “C to gates”
→ write hardware at a higher level of abstraction than conventional HDLs
→ greater programmer productivity
→ need to write stylized C that will synthesize well
→ tools are still slow
