CSE 820 Graduate Computer Architecture: Dr. Enbody
CSE 820
Graduate Computer Architecture
Richard Enbody
Dr. Enbody
1/10/11
Objectives
In this course students will study advanced
concepts in computer architecture. The
emphasis is on modern processor design,
and will include some multicore design. More
than half the time will be spent with material
related to the textbook; the remainder will be
material not in the text. Research papers will
be assigned to be read and analyzed.
Prerequisites
An undergraduate computer architecture
course, such as CSE 420, is assumed.
Grading
30% Homework
30% Midterm Exam (Tuesday, March 1 in class)
35% Final Exam (Monday, May 2, 7:45 - 9:45 AM)
5% Classroom Participation
Course grade:
93% and above is a 4.0;
85% - 92% is a 3.5;
80% - 84% is a 3.0, etc.
Schedule
First half: text
Midterm
Second half: finish text
then cool architecture stuff
Final
In-between: readings, writings
Cool Stuff?
Possibilities
Virtualization support
IBM Cell processor
Multi-cores
Newest Intel and AMD chips
Google architecture
Power, Thermal, Skew issues
Asynchronous
Graphics processing
Homework
Most are brief overviews
of assigned reading,
e.g. one page.
Why?
Intel's response to GPGPU-based
supercomputers running CUDA
It is all about FLOPS per watt
Algorithms
A benchmark production planning model solved using linear
programming would have taken 82 years to solve in 1988.
Fifteen years later (2003) it could be solved in roughly 1 minute,
an improvement by a factor of roughly 43 million.
A factor of roughly 1,000 was due to increased processor speed;
a factor of roughly 43,000 was due to improvements in algorithms!
Professor Martin Grötschel
Konrad-Zuse-Zentrum für Informationstechnik Berlin.
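The arithmetic behind these figures can be checked with a short sketch. The 82-year, 1-minute, 1,000 and 43,000 figures come from the slide; the 365.25-day year and all variable names are assumptions made for illustration.

```python
# Sanity check of the quoted improvement factors.
MINUTES_PER_YEAR = 365.25 * 24 * 60   # assumption: average year length

old_minutes = 82 * MINUTES_PER_YEAR   # 1988: roughly 82 years
new_minutes = 1.0                     # 2003: roughly 1 minute

total_speedup = old_minutes / new_minutes
hardware_factor = 1_000               # due to processor speed
algorithm_factor = 43_000             # due to better algorithms

print(f"total speedup        ~ {total_speedup:,.0f}")
print(f"hardware x algorithm = {hardware_factor * algorithm_factor:,}")
```

The two printed values agree to within rounding, which is the point of the slide: the algorithmic factor is about 43 times larger than the hardware factor.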
Outline
Computer Science at a Crossroads
Computer Arch. vs. Instruction Set Arch.
What Computer Architecture brings to table
[Figure: processor performance, 1978 to 2006, on a log scale from 1 to 10000. Regions are labeled "technology driven" and "architectural and organizational driven". SPECFP increased faster.]
VAX: 25%/year, 1978 to 1986
RISC + x86: 52%/year, 1986 to 2002
??%/year after 2002
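To get a feel for what these annual rates mean, the sketch below compounds each rate over its stated period. The rates and year ranges are from the slide; the function name `compound_growth` is a hypothetical helper, and the result is only the cumulative factor implied by a constant annual rate, not measured data.

```python
def compound_growth(rate_per_year: float, years: int) -> float:
    """Cumulative improvement factor for a constant annual growth rate."""
    return (1.0 + rate_per_year) ** years

# 25%/year from 1978 to 1986 (8 years)
vax_era = compound_growth(0.25, 1986 - 1978)

# 52%/year from 1986 to 2002 (16 years)
risc_era = compound_growth(0.52, 2002 - 1986)

print(f"1978-1986 at 25%/year: about {vax_era:.1f}x")
print(f"1986-2002 at 52%/year: about {risc_era:.0f}x")
```

Doubling the period and roughly doubling the rate turns a single-digit factor into a factor of several hundred, which is why the curve bends so sharply on a log plot.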
[Figure: the instruction set as the layer above the hardware; register file through r31, plus PC, lo, hi]
Programmable storage
2^32 x bytes
31 x 32-bit GPRs (R0=0)
32 x 32-bit FP regs (paired DP)
HI, LO, PC
Data types ?
Format ?
Addressing Modes?
Arithmetic/logical
Add, AddU, Sub, SubU, And, Or, Xor, Nor, SLT, SLTU,
AddI, AddIU, SLTI, SLTIU, AndI, OrI, XorI, LUI
SLL, SRL, SRA, SLLV, SRLV, SRAV
Memory Access
LB, LBU, LH, LHU, LW, LWL, LWR
SB, SH, SW, SWL, SWR
Control
Patterson:
ISA vs. Computer Architecture
Old definition of computer architecture = instruction set design
Other aspects of computer design called implementation
Insinuates implementation is uninteresting or less challenging
Patterson's view is computer architecture >> ISA
Architect's job is much more than instruction set design;
technical hurdles today are more challenging
than those in instruction set design
Since instruction set design is not where the action is, some conclude
computer architecture (using the old definition) is not where the action is
Disagree with that conclusion
Agree that ISA is not where the action is
Computer Architecture is
Design and Analysis
Design
Analysis
Outline
Computer Science at a Crossroads
Computer Arch. vs. Instruction Set Arch.
What Computer Architecture brings to table
Technology Trends
[Figure: pipelined instruction execution; instructions in program order pass through the Ifetch, Reg, ALU, DMem, Reg stages]
Limits to pipelining
Hazards prevent the next instruction
from executing during its designated clock cycle
Structural hazards:
Attempt to use the same hardware to do two different things at once
Data hazards:
An instruction depends on the result of a prior instruction still in the pipeline
Control hazards:
Caused by the delay between fetching instructions and
deciding on changes in control flow (branches and jumps).
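As an illustration of the data-hazard case, here is a minimal sketch that scans a short instruction sequence for read-after-write dependences. Everything in it is an assumption for illustration: the `Instr` record, the `raw_hazards` function, the register names, and the `window` of 2 (which models a five-stage pipeline without forwarding, where only the next two instructions can read a stale register). It is not how real hardware detects hazards.

```python
from typing import List, NamedTuple, Optional, Tuple

class Instr(NamedTuple):
    text: str                 # assembly text, for display only
    dest: Optional[str]       # register written, or None
    srcs: Tuple[str, ...]     # registers read

def raw_hazards(program: List[Instr], window: int = 2) -> List[Tuple[int, int, str]]:
    """Return (producer, consumer, register) triples for read-after-write
    dependences where the consumer follows within `window` instructions."""
    hazards = []
    for i, producer in enumerate(program):
        if producer.dest is None:
            continue
        for j in range(i + 1, min(i + 1 + window, len(program))):
            consumer = program[j]
            if producer.dest in consumer.srcs:
                hazards.append((i, j, producer.dest))
            if consumer.dest == producer.dest:
                break  # register overwritten; later reads depend on the newer value
    return hazards

program = [
    Instr("add r1, r2, r3", "r1", ("r2", "r3")),
    Instr("sub r4, r1, r5", "r4", ("r1", "r5")),
    Instr("and r6, r1, r7", "r6", ("r1", "r7")),
    Instr("or  r8, r1, r9", "r8", ("r1", "r9")),
]

found = raw_hazards(program)
for i, j, reg in found:
    print(f"'{program[j].text}' needs {reg} from '{program[i].text}'")
```

With the assumed window of 2, the `sub` and `and` instructions are flagged and the `or` is not, because by then the `add` has left the pipeline.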
Memory hierarchy (upper levels are faster; lower levels are larger)

Level             Capacity           Access time               Cost
CPU Registers     100s Bytes         300-500 ps (0.3-0.5 ns)
L1 and L2 Cache   10s-100s K Bytes   ~1 ns - ~10 ns            $1000s/GByte
Main Memory       G Bytes            80 ns - 200 ns            ~$100/GByte
Disk              10s T Bytes        10 ms (10,000,000 ns)     ~$1/GByte
Tape              infinite           sec-min                   ~$1/GByte

Transfers between levels

Between                 Unit              Managed by       Size
Registers - L1 Cache    Instr. Operands   prog./compiler   1-8 bytes
L1 Cache - L2 Cache     Blocks            cache cntl       32-64 bytes
L2 Cache - Memory       Blocks            cache cntl       64-128 bytes
Memory - Disk           Pages             OS               4K-8K bytes
Disk - Tape             Files             user/operator    Mbytes
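4) Amdahl's Law

$$\text{ExTime}_{new} = \text{ExTime}_{old} \times \left[ (1 - \text{Fraction}_{enhanced}) + \frac{\text{Fraction}_{enhanced}}{\text{Speedup}_{enhanced}} \right]$$

$$\text{Speedup}_{overall} = \frac{\text{ExTime}_{old}}{\text{ExTime}_{new}} = \frac{1}{(1 - \text{Fraction}_{enhanced}) + \dfrac{\text{Fraction}_{enhanced}}{\text{Speedup}_{enhanced}}}$$

As $\text{Speedup}_{enhanced}$ grows without bound, the overall speedup approaches

$$\frac{1}{(1 - \text{Fraction}_{enhanced})}$$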
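Example: Fraction enhanced = 0.4, Speedup enhanced = 10

$$\text{Speedup}_{overall} = \frac{1}{(1 - 0.4) + \dfrac{0.4}{10}} = \frac{1}{0.64} = 1.56$$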
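The Amdahl's Law calculation above can be reproduced with a small sketch. The function names `amdahl_speedup` and `amdahl_limit` are hypothetical; the 0.4 and 10 inputs are the numbers from the slide.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when `fraction_enhanced` of the original execution
    time is sped up by a factor of `speedup_enhanced`."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0.0:
        raise ValueError("speedup_enhanced must be positive")
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)

def amdahl_limit(fraction_enhanced: float) -> float:
    """Upper bound on overall speedup as speedup_enhanced grows without bound."""
    if fraction_enhanced >= 1.0:
        return float("inf")
    return 1.0 / (1.0 - fraction_enhanced)

overall = amdahl_speedup(0.4, 10)
print(f"overall speedup: {overall:.2f}")
print(f"best possible:   {amdahl_limit(0.4):.2f}")
```

Even an unbounded speedup of the enhanced 40% leaves the overall speedup below 1.67, so the 10x enhancement already captures most of what is available.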
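$$\text{CPU time} = \frac{\text{Seconds}}{\text{Program}} = \text{Inst Count} \times \text{CPI} \times \text{Cycle time}$$

[Table: effect of Program, Compiler, Inst. Set, Organization, and Technology on Inst Count, CPI, and Clock Rate]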
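The relationship between instruction count, CPI, and cycle time can be sketched as a one-line function. The name `cpu_time` and the workload numbers below are hypothetical, chosen only to show how the three factors multiply.

```python
def cpu_time(inst_count: float, cpi: float, cycle_time_s: float) -> float:
    """Seconds per program = instructions x cycles per instruction x seconds per cycle."""
    return inst_count * cpi * cycle_time_s

# Hypothetical workload: 1 billion instructions, CPI of 2.0, 1 ns cycle (1 GHz clock)
baseline = cpu_time(1e9, 2.0, 1e-9)

# Halving CPI at the same instruction count and clock halves the run time
improved = cpu_time(1e9, 1.0, 1e-9)

print(f"baseline: {baseline:.2f} s, improved: {improved:.2f} s")
```

Because the factors multiply, an improvement in one can be cancelled by a regression in another, for example a lower instruction count bought with a higher CPI.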
And in conclusion
Computer Architecture >> instruction sets
Computer Architecture skill sets are different
5 Quantitative principles of design
Quantitative approach to design
Solid interfaces that really work
Technology tracking and anticipation