FALL 2009 Computer Sciences Department University of Wisconsin-Madison Ph.D. Qualifying Examination
FALL 2009 Computer Sciences Department University of Wisconsin-Madison Ph.D. Qualifying Examination
Computer Architecture
Qualifying Examination
Monday, September 21, 2009
GENERAL INSTRUCTIONS:
SPECIFIC INSTRUCTIONS:
Answer all of the following SIX questions. The questions are quite specific. If, however,
some confusion should arise, be sure to state all your assumptions explicitly.
The Exam Committee tries to proofread the exam as carefully as possible. Nevertheless,
the exam sometimes contains misprints and ambiguities. If you are convinced a
problem has been stated incorrectly, mention this to the proctor. If necessary, the
proctor can contact a representative of the area to resolve problems during the first hour
of the exam. In any case, you should indicate your interpretation of the problem in your
written answer. Your interpretation should be such that the problem is non-trivial.
1. Logic design
Design (show detailed logic diagrams) for an 8-bit serial multiplier, whose inputs are
two 8-bit unsigned numbers, and the output is one 16-bit unsigned number. The
multiplier must also output a "done" signal that denotes when the multiplication is
complete.
Your multiplier design must be clocked and there must be no additional logic in the
stage in which the adder is being used. Make reasonable assumptions about how much
logic to put in other stages. Justify any assumptions.
2. Branch Prediction
d) Overall branch prediction accuracy is not the only issue that an architect must
consider when deciding which scheme to implement. Discuss some of the other factors
that must be considered before deciding upon a branch prediction scheme.
3. Technology scaling
Over several generations since the dawn of the integrated circuit transistors have kept
getting smaller.
a) As a result of smaller transistors, what have been the benefits from a low-level device
perspective?
b) How have these benefits of the smaller devices been exploited by architectures?
c) Is this trend of smaller devices and their associated benefits (as observed historically)
expected to change in the future. If so, how?
For the past decade or so, most high-performance processors have used large, highly-
associative, non-blocking caches with wide datapaths, and aggressive hardware and
software directed prefetching to achieve good single-threaded memory-system
performance. As power and energy become more important design constraints, future
memory systems are likely to reflect these additional constraints.
a) Considering only a basic single-level cache for a uniprocessor, discuss how the basic
cache parameters (i.e., capacity, associativity, and block size) will tend to change when
optimizing for both energy-efficiency and performance, rather than performance alone.
How does workload choice affect these trends?
For example, LogTM is an explicit TM, while speculative lock elision (SLE) is an implicit
TM.
(b) Compare and contrast explicit versus implicit TM on programmer use issues.
(c) Do you expect one, both, or neither of these techniques to flourish? Why?
A design team was tasked with designing a processor core for a microprocessor
intended for a peta-scale system to be deployed in about 2011. The main use for this
petascale system would be the solution of large-scale scientific computing problems.
The team came up with a “large” out-of-order (OOO) processing core which, amongst
other attributes, had the following three microarchitectural features: (i) capability to
dispatch 6 instruction instructions per cycle, (ii) about a hundred instructions in flight,
(iii) 4-way simultaneous multithreading.
(a) Argue why a processor core design with the above three features is a good choice for
use in a circa 2011 peta-scale system.
(b) Argue why a processor core design with the above three features is a poor choice for
use in a circa 2011 peta-scale system.