Performance of A Computer

CPU time is an important performance metric for computer architects, but it does not capture the entire time a program spends in the system. The total time includes:
1. CPU Time: time the CPU spends actually executing the program's instructions, comprising both user and system CPU time.
2. I/O Time: time spent waiting for I/O operations such as disk or network accesses to complete.
3. Context Switch Time: time spent switching between programs due to time-sharing or multi-tasking.
4. Memory Access Time: time spent waiting for data and instructions to be fetched from memory.
The total execution time of a program is the sum of these components.


Unit 1a

Faculty of Engineering, D.E.I


Autumn 2016

EEM 513 Computer Architecture


1

Introduction
Where to place this course
The Computational Stack, Hardware/Software Interface
Administrivia

The HW/SW Interface


Application software

a[i] = b[i] + c;
Compiler

Systems software
(OS, compiler)

lw   $15, 0($2)
add  $16, $15, $14
add  $17, $15, $13
lw   $18, 0($12)
lw   $19, 0($17)
add  $20, $18, $19
sw   $20, 0($16)
Assembler

Hardware

000000101100000
110100000100010

Where we place this course

Cprog/DS: the C language as a computation model
  A high-level view of how to model a computing system to solve problems

This course (the microarchitect's view):
  How does a C/asm program end up executing as digital logic? What happens in-between?
  How is a computer designed to satisfy specific requirements using digital blocks and wires?
  How to design a computer that meets system design goals; these choices critically affect both the SW programmer and the HW designer

BE/DigSys: digital logic as a computation model
  The HW designer's view of how a computer system works
4

The Computational Stack

Algorithm
Implementation (PL)
Operating System
ISA
Microarchitecture
Logic
Circuits
Electrons

The ISA is the interface between hardware and software: the contract that the hardware promises to fulfil.

Computer Architecture: the attributes of a system as seen by the programmer, i.e. the conceptual structure and functional behavior of the system. This is different from the logic design and physical implementation (Computer Organization).

Key Computer Architecture Goal: design the hardware/software interface to create a (computing) system that meets functional, performance and possibly other system goals.
5

Moore's Law

Moore, "Cramming more components onto integrated circuits", Electronics Magazine, 1965.
Component counts double every other year
6

Computer Architecture Today

Paradigm shift over the last decade (to multi-core and beyond)
Multi- and many-core systems leading the way
Many other issues:
  Rising need for energy-efficient devices, especially in fast-growing segments like smartphones, notebooks etc.
  Design and scaling issues of multi-core systems
  Memory, Programmability, Power walls (DHA 1)

Administrivia
Periodic class assignments (keep a hardcopy handy)
Surprise quizzes (best n-1 of n, n ~ 2/3)
HAs (daily)
CAs (weekly, based on the week's coverage)
CTs (two mid-sems and an end-sem)
Attendance
Textbook: Computer Organization and Design: The Hardware/Software Interface, Hennessy & Patterson, 2nd/3rd Ed., MKP
8

Introduction
Elements of Computing Systems
Von Neumann vs Harvard Model
ISA and Microarchitecture

Key Elements of a Computing System

Processing
Control
(sequencing)

Memory
(program
and data)

I/O

Datapath

10

The Von Neumann Model/Architecture


AKA stored program computer (instructions in memory).
Key properties:
1. Stored program
Instructions stored in memory
Unified memory: instructions and data stored together
How we interpret the contents of memory depends on the control signals issued

2. Sequential instruction processing
One instruction processed at a time
(fetch → decode → fetch operands → execute → writeback)
Program Counter (PC) / Instruction Pointer identifies the current/next instruction
Advances sequentially except in the case of conditional/jump instructions

All major instruction set architectures today are based on this model
x86, ARM, MIPS, SPARC, Alpha

11

Von Neumann Model

(Datapath elements: PC, IR, Register File, MAR, MDR)

12

Harvard Architecture
Separate storage and datapath for instructions and data
Originated with the Harvard Mark I
CPU can read an instruction and access data memory simultaneously
Can thus be faster, since instruction fetches and data accesses do not contend for a single memory pathway, as in the von Neumann model
Distinct code and data address spaces
Applications in DSPs and microcontrollers like PIC
Modern high-performance CPU chip designs incorporate aspects of both Harvard and von Neumann architectures
  Split I/D caches

13

Computer Architecture Today

Paradigm shift over the last decade (to multi-core and beyond)
Multi- and many-core systems leading the way
Many other issues:
  Energy constraints: single thread/processor systems?
  Rising need for energy-efficient devices in fast-growing segments like smartphones, notebooks etc.
  Design issues in multi-core systems
Technology scaling issues:
  How small can a transistor be? Reliability
  Memory wall/bottleneck
  Reliability wall/issues
  Programmability wall
14

ISA vs. Microarchitecture

ISA: specifies how the programmer sees instructions to be executed
  The programmer sees a sequential flow of execution
Microarchitecture: how the underlying implementation actually executes instructions
  The microarchitecture may execute instructions in any order, as long as it obeys the semantics specified by the ISA when making instruction results visible to software
Microarchitecture-level execution models vary across implementations:
  Pipelined instruction execution: Intel 80486 uarch
  Multiple instructions at a time: Intel Pentium uarch
  Out-of-order execution: Intel Pentium Pro uarch
  Separate instruction and data caches

15

ISA vs. Microarchitecture

Various microarchitectures exist for the same ISA
  Add instruction (ISA) vs. adder implementation (microarchitecture)
  Several possible implementations: ripple carry vs. carry lookahead vs. carry save
  The x86 ISA has many implementations: 286, 386, 486, Pentium, Pentium Pro, Pentium 4, Core, ...

Q. Which of the two changes faster: ISA or microarchitecture?
  Few ISAs (x86, ARM, SPARC, MIPS, Alpha) but many microarchitectures
  Why?

16

What comprises the ISA?


Instructions
Opcodes, Data Types
Instruction Types and Formats
Registers, Addressing Modes

Memory
Address space, Addressability, Alignment
Virtual memory management

Function Calls, Interrupt/Exception Handling


Access Control, Device Priorities
I/O mapping: memory-mapped vs. I/O mapped
Task/thread Management

17

What comprises the Micro-architecture?


Micro-architecture: Implementation of the ISA under specified
design constraints and goals
Pipelining
In-order versus out-of-order instruction execution
Memory access scheduling policy
Speculative execution
Superscalar/VLIW
Clock gating
Caching: number of levels, cache size, associativity, replacement policy

18

DHA 2: ISA or Micro-architecture? Why?

a. Opcode of the MUL instruction
b. Number of bytes per word
c. Number of ports to the register file
d. Number of general-purpose registers
e. Number of cycles to execute the MUL instruction
f. Number of pipeline stages
g. Buffers in the cache
19

Tradeoffs in ISA and MicroArch


Design Considerations like
Cost
Performance
Energy/Battery life
Reliability and Correctness
Time to market

20

Computer Architecture

Architecture: the art and science of designing and constructing buildings

Computer Architecture: the art and science of selecting, designing and interconnecting hardware components and designing the hardware/software interface to create a computing system that meets the required system design goals
21

Recap

22

Performance
Response time & throughput; Factors that affect performance
The Performance Equation
Some Examples

23

Performance and Cost

Which of the following aircraft has the best performance?

Airplane       Passengers   Range (mi)   Speed (mph)
Boeing 737     101          630          598
Boeing 747     470          4150         610
Concorde       132          4000         1350
Douglas DC-8   146          8720         544

Concorde is the fastest
DC-8 has the longest range
747 can carry the most passengers

24

Performance and Cost

Which of the following aircraft has the best performance?

Airplane       Passengers   Range (mi)   Speed (mph)
Boeing 737     101          630          598
Boeing 747     470          4150         610
Concorde       132          4000         1350
Douglas DC-8   146          8720         544

Clearly, we must first define what we mean by performance

25

Defining Performance

Why study hardware performance?
  Hardware performance is often the key to the effectiveness of the entire software system

Not an easy task:
  Different performance metrics are appropriate for different types of applications
  Different aspects of a computing system can assume significance in determining overall performance
26

Defining Performance

What is important to whom?

Individual computer user/programmer:
  Minimize response time for the program
    = completion time − start time
  Also called elapsed time or wall-clock time

Data center manager (what are data centers?):
  Maximize throughput = number of jobs completed in a given time
  Measured as #jobs/unit time

27

Q. Defining Performance

Does each of the following improve


(a) response time or (b) throughput? Or both?
Replacing the existing CPU with a faster CPU
Adding more processor cores in the same
system, for performing separate tasks

28

Q. Defining Performance

Does each of the following improve


(a) response time or (b) throughput? Or both?
Replacing the existing CPU with a faster CPU

Both (a) and (b)

Adding more CPUs/cores

No single task gets done faster, but (b) improves

29

Response Time vs. Throughput

How do we relate the two? Is throughput = 1/(average response time)?

E.g. a simple uniprocessor system, processes executed sequentially:
  Assume 5 processes, each needing 2 minutes of CPU time
  Either one process starts every 2 minutes, OR processes start every 10 seconds
  What's the throughput in each case?

Is throughput = 1/(average response time)?
  Yes, BUT only if there is NO overlap of jobs
  Otherwise, throughput > 1/(average response time)
30
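The two scenarios above can be sanity-checked with a small simulation sketch (an illustration of mine, assuming first-come-first-served scheduling on one CPU; the function name is hypothetical):

```python
# Sketch: 5 jobs, each needing 120 s of CPU time, under FCFS on one CPU.

def fcfs(arrivals, service=120.0):
    """Return (throughput, average response time) for FCFS scheduling."""
    t = 0.0
    responses = []
    for a in arrivals:
        t = max(t, a) + service      # job runs once the CPU is free
        responses.append(t - a)      # response = completion - start
    throughput = len(arrivals) / t   # jobs finished per second
    return throughput, sum(responses) / len(responses)

# Case 1: one job starts every 2 minutes -> no overlap
tp1, rt1 = fcfs([0, 120, 240, 360, 480])
# Case 2: jobs start every 10 seconds -> they queue up (overlap)
tp2, rt2 = fcfs([0, 10, 20, 30, 40])

print(tp1, 1 / rt1)   # equal: throughput == 1/(avg response time)
print(tp2, 1 / rt2)   # not equal: throughput > 1/(avg response time)
```

With no overlap the two quantities coincide; once jobs overlap, queueing inflates response times while throughput stays the same, so throughput exceeds 1/(average response time).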

Defining Performance

Depending on the type of application, we


might need to optimize
Throughput
Response time
Some combination of both

31

Performance according to the Computer Architect

From the perspective of a computer designer:
  CPU time = time spent actually running a program

Intuitively, faster should be better, so:
  Performance = 1 / (Execution time)

Clearly, CPU time is a major performance metric
32

CPU Execution Time

Time spent by the CPU in running a program:
  System CPU time
  User CPU time

(Try) the Unix/Linux time command:
  Gives user and system CPU time
  Also gives elapsed time

CPU time is a major performance metric; specifically, we'll view CPU performance in terms of user CPU time
33

Performance Comparison

Machine A is n times faster than machine B iff
  perf(A)/perf(B) = time(B)/time(A) = n
Machine A is x% faster than machine B iff
  perf(A)/perf(B) = time(B)/time(A) = 1 + x/100

Example: for some program run on machines A and B, time(A) = 10 s, time(B) = 15 s
  15/10 = 1.5 ⇒ A is 1.5 times faster than B
  15/10 = 1.5 ⇒ A is 50% faster than B
34

CPU Performance Factors

A computer is equipped with a clock that runs at a fixed rate and determines when events take place in the hardware
Discrete time intervals are called clock cycles/ticks

Example:
  A 1 GHz processor runs 10^9 cycles/sec, 1 cycle every 1 ns
  A 2.5 GHz Core i7 runs 2.5×10^9 cycles/sec, so one clock tick every 0.4 ns
The duration of a clock tick is a.k.a. the clock cycle time; 2.5 GHz is the clock rate

35
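The reciprocal relationship above can be written out as a trivial sketch (my own illustration, not from the slides):

```python
# Cycle time is the reciprocal of clock rate (values from the slide).
def cycle_time_ns(clock_rate_hz):
    # 1 second = 1e9 ns, so cycle time in ns = 1e9 / rate in Hz
    return 1e9 / clock_rate_hz

print(cycle_time_ns(1e9))    # 1 GHz processor -> 1.0 ns per cycle
print(cycle_time_ns(2.5e9))  # 2.5 GHz Core i7 -> 0.4 ns per cycle
```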

CPU Performance Factors

CPU time can be related to the number of CPU clock cycles and the clock cycle time:
  CPU time = CPU clock cycles for program × clock cycle time
Alternatively:
  CPU time = CPU clock cycles for program / clock rate

36

Clearly, the hardware designer can improve performance by reducing either the length of the clock cycle or the number of clock cycles required for a program
Trade-off: many techniques that decrease the number of clock cycles also increase the clock cycle time

37

Example 1 (Ch4, HP3E)


Our favourite program runs in 10 seconds on
computer A, which has a 2 GHz clock. We are
trying to help a computer designer build a
machine that runs this program in 6 seconds. The
designer has determined that a substantial
increase in the clock rate is possible but this
increase will affect the rest of the CPU design,
causing machine B to require 1.2 times as many
clock cycles as machine A for this program.
What clock rate should we tell the designer to
target?

38
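Example 1 works out as follows (a sketch of the arithmetic only; the variable names are mine):

```python
# Machine A: 10 s at 2 GHz. Machine B needs 1.2x the cycles, in 6 s.
cycles_a = 10 * 2e9           # total clock cycles on A
cycles_b = 1.2 * cycles_a     # cycles required on B
clock_rate_b = cycles_b / 6   # rate needed to finish in 6 s
print(clock_rate_b / 1e9)     # -> 4.0: tell the designer to target 4 GHz
```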

This equation does not include any reference to the number of instructions needed for the program. However, the execution time must depend on the number of instructions in the program. A simple way to think about execution time is that it equals the number of instructions executed multiplied by the average time per instruction.

39

CPU Performance Equation

Example 2
Suppose we have two implementations of the
same instruction set architecture. Machine A has
a 1ns clock cycle and average CPI (clock cycles
per instruction) of 2.0 for some program and
machine B has a clock cycle time of 2 ns and a
CPI of 1.2 for the same program. Which machine
is faster for this program, and by how much?

41
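Example 2 can be checked directly from CPU time per instruction = CPI × cycle time (a sketch; names are mine):

```python
# Time per instruction for the same program on the two machines.
time_a = 2.0 * 1.0  # CPI 2.0 x 1 ns cycle = 2.0 ns/instruction
time_b = 1.2 * 2.0  # CPI 1.2 x 2 ns cycle = 2.4 ns/instruction
print(time_b / time_a)  # -> 1.2: machine A is 1.2 times faster
```

Both machines execute the same instruction count (same program, same ISA), so the per-instruction times compare directly.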

The performance equation (CPU time = Instruction count × CPI × Clock cycle time) separates the three key factors that affect performance.
  It can be used to compare two different implementations
  It can be used to evaluate a design alternative if we know its impact on these three parameters
How can we determine the values of these factors in the performance equation?

42

CPU Performance Equation

It is also possible to obtain the number of clock cycles by looking at the different types of instructions and their individual clock cycle counts:

  CPU clock cycles = Σ_i (CPI_i × c_i)

where c_i is the number of instructions of class i executed

43

Example 3
A compiler designer is trying to decide between two code sequences for a particular machine. The hardware designers have supplied the following facts:

Instruction class   CPI for this instruction class
A                   1
B                   2
C                   3

For a particular HLL statement, the compiler designer is considering two code sequences that require the following instruction counts:

                Instruction counts for instruction class
Code sequence   A   B   C
1               2   1   2
2               4   1   1

Which code sequence executes the most instructions? Which will be faster? What is the CPI for each sequence?
44
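A sketch of the Example 3 computation. Note that the counts for sequence 2 (4 A, 1 B, 1 C) are an assumption taken from the textbook version of this example; they did not survive extraction here:

```python
# CPI per instruction class, and the two candidate code sequences.
cpi = {"A": 1, "B": 2, "C": 3}
seq1 = {"A": 2, "B": 1, "C": 2}
seq2 = {"A": 4, "B": 1, "C": 1}   # assumed from the textbook example

def stats(counts):
    ic = sum(counts.values())                             # instruction count
    cycles = sum(cpi[k] * n for k, n in counts.items())   # total clock cycles
    return ic, cycles, cycles / ic                        # ..and average CPI

print(stats(seq1))  # (5, 10, 2.0)
print(stats(seq2))  # (6, 9, 1.5): more instructions, yet fewer cycles
```

Sequence 2 executes more instructions but needs fewer total cycles, so (at the same clock rate) it is faster despite the higher instruction count.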

Example 3
Considering only one factor (instruction count, in
this case) to assess performance can mislead.
When comparing two computers, we must look at
all three components, which combine to form
execution time.
If some of the factors are identical, like the clock rate
in the previous example, performance can be
determined by comparing all the non-identical
factors.
45

Three key factors affecting performance

Instruction count
  Determined by algorithm, compiler, ISA
  Measured using software that simulates the ISA, or hardware counters present in many systems

Clock cycle time
  Determined by technology, organization, efficient circuit design
  Usually published in processor documentation

Average cycles per instruction (CPI)
  Determined by ISA, CPU organization, instruction mix
  Measured by detailed simulations of the implementation

46

Measuring Performance

The Performance Equation helps us use


these key factors of CPU performance for:
Comparing different implementations
Evaluating design alternatives if we know the
impact on these three parameters

47

Summary

We looked at some fundamental ways of quantifying and comparing the performance of computing systems
Moving on: other performance issues, benchmarking

Readings: HP3E Ch.1 (1.1-1.3, 1.5); Ch.2 (4.1, 4.2)

48

Performance
Evaluating Performance
49

Performance & Benchmarks

Measuring the performance of a computer system is not easy.
What we need: a simple yet representative metric that captures the capabilities of the computer system for our usage
  a metric that takes into account the kind of applications, or app-mix, we run on our system:
    Scientific
    Word processing
    etc.

Performance & Benchmarks


If we plan to run scientific applications
involving lots of floating-point calculations,
there is no point in knowing how many
integer calculations a given machine can
perform
Similarly, if the app-mix mostly does character manipulation, we don't care how many integer/FP calculations per second the system can do.

Performance & Benchmarks


So it is important to take into account
the expected mix of applications the
workload and derive metrics that make
sense for the target user group.
Workload: suite of representative
programs which can be executed to
measure the time.
If this suite of applications represents the
target user application mix reasonably, then
we can compare the performance of different
systems by comparing execution times

Performance & Benchmarks


Obviously, if machine X executes the
workload in 300 seconds and machine Y
takes 330 seconds, we say that machine X
is better for this workload.
Note, however, that if we change the
workload, it is quite possible that machine Y
performs better than machine X for the new
workload.
Takeaway: workload is important in
comparing the performance of different
machines.

Performance & Benchmarks


Standard bodies have tried to define a set
of benchmark programs that approximate
the intended real-world applications.
Benchmarks can be real programs taken from
sample applications or they can be synthetic.

In synthetic benchmarks, artificial


programs are created to exercise the
system in a specific way. Examples:
Whetstone and Dhrystone benchmarks.

Performance Metrics
Computer system performance can be
measured by several performance
metrics.
The metrics we use depend on the
purpose as well as the component of the
system in which we are interested.
For example, to benchmark a networking device, we'd use network bandwidth, which tells us the number of bits the component can transmit per second.

Performance Metrics
MIPS stands for millions of instructions per second.
  A simple metric, but practically useless for expressing the performance of a system (why?)
  Instructions vary widely among processors: complex instructions take more clock cycles than simple instructions, so a machine's instruction rate on complex instructions will be lower than on simple ones.
  The MIPS metric does not capture the actual work done by these instructions.

Performance Metrics
MIPS is perhaps useful in comparing
various versions of processors derived
from the same instruction set.
MFLOPS: popular metric often used in
the scientific computing area.
Millions of floating-point operations per
second.

A better metric than MIPS, as it captures the number of operations performed rather than instructions executed.

Synthetic Benchmarks
Programs specifically written for performance
testing.
Whetstone benchmark, named after the
Whetstone Algol compiler (Algol, later Fortran)
was developed in the mid-1970s to
measure floating-point performance
Dhrystone benchmark (Ada, later C)
developed in 1984 to measure integer
performance.

Synthetic Benchmarks
Both Whetstone and Dhrystone
benchmarks are small programs
Drawbacks of synthetic benchmarks:
  No user would run them as applications: they don't do anything useful
  Not real programs, so they do not reflect real program behavior
  They encouraged compilers to over-optimize for the benchmark, distorting performance results

Real Benchmarks
SPEC: System Performance Evaluation Cooperative
SPEC CPU2006
  Benchmark for measuring processor, memory, and compiler performance
  12 integer + 17 floating-point applications, written in three or four different PLs
  Integer programs: compilers, compression, chess, CAD placement & routing programs etc.
  Floating point: FEM, CFD simulations, ANN, 3D graphics, image-processing programs etc.
  Performance of a REF machine is given (Sun SPARC?).

Real Benchmarks
Others
SPECmail, SPECweb, SPECjvm
etc.

Means of Performance
Matter of interest: a single summarizing metric to get an idea of performance
  Less information, but preferred by marketers and users
Even once an appropriate workload has been identified and the performance metric selected, if we conduct several experiments we need a way to get a single value for the metric.
  There are many ways of obtaining such a metric.
Suppose we run two programs to evaluate a system, with individual execution times of 100 seconds (Program 1) and 80 seconds (Program 2).

Means of Performance
Arithmetic mean = 90 seconds
The implicit assumption in our arithmetic
mean calculation
Both programs are equally likely in the target
workload.
What if they are not?

Weighted Arithmetic Mean


Example: P2 appears three times more often than
P1
Weighted Mean = (3 x 80 + 1 x 100)/4 = 85 seconds

AM is a special case of the weighted


arithmetic mean with equal weights
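The weighted-mean arithmetic above, as a short sketch (my own illustration):

```python
# Weighted arithmetic mean: P2 runs three times as often as P1.
times = {"P1": 100.0, "P2": 80.0}    # seconds
weights = {"P1": 1, "P2": 3}         # relative frequency in the workload
wam = sum(weights[p] * times[p] for p in times) / sum(weights.values())
print(wam)   # -> 85.0 seconds
```

With equal weights the same expression reduces to the plain arithmetic mean (90 seconds).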

Comparing relative performance

Simplest approach: use execution time
  Perf(A) / Perf(B) = ET(B) / ET(A)

                      Computer A   Computer B
Program 1 (seconds)   1            10
Program 2 (seconds)   1000         100
Total time (seconds)  1001         110

Comparing relative performance

Usually we normalize to a reference machine when comparing A and B: A/REF, B/REF
  The AM is inconsistent, as it depends on the REF machine
  How about the GM? Example follows.

Means of Performance
GM has the following property:
  GM(X_i) / GM(Y_i) = GM(X_i / Y_i)
Advantage: independent of the running times of individual programs and of the REF machine
Example:

     Time on A   Time on B
P1   1           10
P2   1000        100
AM   500.5       55
GM   31.6        31.6

Means of Performance
The AM values tell us that B is about nine times faster than A, but the GM suggests that both machines perform the same. (Why?)
  Because the GM tracks the performance ratio, not execution time (that's its key drawback)
  Since Program 1 runs 10 times faster on A and Program 2 runs 10 times faster on B, by using the GM we erroneously conclude that the average performance of the two machines is the same.
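The AM/GM behavior in this example can be reproduced in a few lines (a sketch; `prod` is Python's `math.prod`):

```python
# AM vs GM for the running example (times in seconds).
from math import prod

a = [1.0, 1000.0]   # Program 1, Program 2 times on machine A
b = [10.0, 100.0]   # times on machine B

def am(xs):
    return sum(xs) / len(xs)

def gm(xs):
    return prod(xs) ** (1 / len(xs))

print(am(a), am(b))  # 500.5 vs 55.0 -> B is far faster by mean time
print(gm(a), gm(b))  # both ~31.6    -> GM says the machines are equal
```

The GM is blind here because the two 10x ratios point in opposite directions and cancel, regardless of how much absolute time each program takes.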

Summary: Means of Performance

Execution time is the best measure of performance; other metrics have limitations/drawbacks.
Any measure that summarizes performance should reflect execution time: the weighted AM does this; the GM does not.
The GM is good for summarizing throughput ratios: SPEC benchmarks use the GM for this purpose.

Next Class
Wind up Unit 1: Amdahls law

Review: Performance

CPU time [sec/program] = CPU cycles for program [cycles/program] × clock cycle time [sec/cycle]

CPU time [sec/program] = CPU cycles for program [cycles/program] / Clock rate [cycles/sec]

CPU cycles for program [cycles/program] = Instruction count [instructions/program] × CPI [cycles/instruction]

CPU time [sec/program] = Instruction count [instructions/program] × CPI [cycles/instruction] / Clock rate [cycles/sec]

CPU performance = 1 / CPU time
70

Review: MIPS

MIPS = Instruction count / (CPU time × 10^6) = Clock rate / (CPI × 10^6)

Problems:
  Machines with different instruction sets?
  Programs with different instruction mixes?
  Uncorrelated with performance
A marketing metric: "Meaningless Indicator of Processor Speed"

71

Review: MFLOPS

MFLOP/s = Number of FP operations / (CPU time × 10^6)

Popular in the supercomputing community
  Often not where time is spent
  Not all FP operations are equal
  Can magnify performance differences
  A better algorithm (e.g., with better data reuse) can run faster even with a higher FLOP count

72

Amdahl's Law
A motivating example
A program runs in 100 seconds on a
computer, with multiply operations
responsible for 80 seconds of this time. By
how much do I have to improve the speed of
multiplication if I want my program to run
five times faster?

73
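Working the motivating example through (a sketch of the arithmetic; the answer is that no finite multiply speedup achieves 5x):

```python
# 100 s total, 80 s of multiplies; target: run 5x faster.
total, mult = 100.0, 80.0
other = total - mult           # 20 s is untouched by a faster multiplier
target = total / 5             # 20 s overall
mult_budget = target - other   # time left over for all the multiplies
print(mult_budget)             # -> 0.0: multiplies would need infinite speedup
```

Since the non-multiply 20 seconds alone already equals the 20-second target, the multiplies would have to take zero time: the required speedup is unbounded.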

Amdahl's Law
"Validity of the single processor approach to achieving large scale computing capabilities", G. M. Amdahl, AFIPS Conference Proceedings, pp. 483-485, April 1967

Historical context: Amdahl was demonstrating the continued validity of the single-processor approach and the weaknesses of the multiple-processor approach:
  "A fairly obvious conclusion which can be drawn at this point is that the effort expended on achieving high parallel performance rates is wasted unless it is accompanied by achievements in sequential processing rates of very nearly the same magnitude."

Nevertheless, Amdahl's Law has widespread applicability in many situations

74

Amdahl's Law

Average execution rate (performance):

  R_avg = 1 / Σ_i (F_i / R_i),   where Σ_i F_i = 1

F_i is the fraction of results generated at rate R_i
(Note: F_i is not the fraction of time spent working at this rate)

The amount of parallel speedup in a given problem is limited by the sequential portion of the problem.

75

Example of Amdahl's Law

30% of results are generated at the rate of 1 MFLOPS, 20% at 10 MFLOPS, 50% at 100 MFLOPS.
What is the average performance? What is the bottleneck, i.e. the rate that consumes the most time?

R_avg = 1 / (0.3/1 + 0.2/10 + 0.5/100) = 100 / (30 + 2 + 0.5) = 100/32.5 = 3.08 MFLOPS

Fraction of time spent at each rate:
  30/32.5 = 92.3%  (the 1 MFLOPS rate: the bottleneck)
  2/32.5  = 6.2%
  0.5/32.5 = 1.5%

76
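The same computation as code (a sketch mirroring the numbers above):

```python
# Average rate as a weighted harmonic-style mean of the per-class rates.
fractions = [0.3, 0.2, 0.5]    # fraction of results at each rate
rates = [1.0, 10.0, 100.0]     # MFLOPS
r_avg = 1 / sum(f / r for f, r in zip(fractions, rates))
time_frac = [(f / r) * r_avg for f, r in zip(fractions, rates)]
print(round(r_avg, 2))                    # -> 3.08 MFLOPS
print([round(t, 3) for t in time_frac])   # ~[0.923, 0.062, 0.015]
```

The slowest rate handles only 30% of the results yet consumes over 92% of the time, which is exactly why it is the bottleneck.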

Amdahl's Law

Exec_time_new = Exec_time_old × [(1 − F_enhanced) + F_enhanced / Speedup_enhanced]

Speedup_overall = Exec_time_old / Exec_time_new = 1 / [(1 − F_enhanced) + F_enhanced / Speedup_enhanced]
77
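The formula translates directly to code (a sketch; the function name is mine):

```python
# Amdahl's Law: overall speedup when a fraction f of execution time
# is enhanced by a factor s.
def speedup_overall(f, s):
    return 1 / ((1 - f) + f / s)

print(speedup_overall(0.5, 2))   # enhance half the time 2x  -> ~1.33 overall
print(speedup_overall(0.2, 10))  # enhance 20% of time 10x   -> ~1.22 overall
```

Note how even a 10x enhancement yields little overall gain when it applies to only a small fraction of the time.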

Implications of Amdahl's Law

The performance improvement provided by a feature is limited by how often that feature is used
  The bottleneck is the most promising target for improvements: make the common case fast
  Infrequent events, even if they consume a lot of time, will make little difference to overall performance
Typical use: change only one parameter of the system, and compute the effect of this change
  The same program, with the same input data, should run on the machine in both cases

78

Making the Common Case Fast: Examples

All instructions require an instruction fetch; only a fraction require a data fetch/store
  ⇒ Optimize instruction access over data access

Programs exhibit locality
  Spatial locality: items with addresses near one another tend to be referenced close together in time
  Temporal locality: recently accessed items are likely to be accessed in the near future

Access to small memories is faster
  ⇒ Provide a storage hierarchy such that the most frequent accesses are to the smallest (closest) memories:
     Regs → Cache → Memory → Disk/Tape

79

Make the Common Case Fast (2)

What is the common case?
  The rate at which the system spends most of its time: the bottleneck
Exactly what does this statement mean?
  Make the common case faster, rather than making some other case faster
  Make the common case faster by a certain amount, rather than making some other case faster by the same amount

80

Example
Which change is more effective on a certain machine: speeding up
10-fold the floating point square root operation only, which takes
up 20% of execution time, or speeding up 2-fold all floating point
operations, which take up 50% of total execution time?
(Assume that the cost of accomplishing either change is the same,
and the two changes are mutually exclusive.)

81
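Applying Amdahl's Law to the two alternatives (a sketch of the arithmetic):

```python
# Overall speedup when fraction f of time is enhanced by factor s.
def speedup_overall(f, s):
    return 1 / ((1 - f) + f / s)

sqrt_10x = speedup_overall(0.20, 10)   # FP sqrt only: ~1.22 overall
all_fp_2x = speedup_overall(0.50, 2)   # all FP ops:   ~1.33 overall
print(sqrt_10x < all_fp_2x)            # the all-FP change wins
```

The smaller 2x enhancement wins because it covers a much larger fraction of the execution time.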

(HA03)
Suppose we have made the following measurements:
Frequency of floating point operations = 25%
Average CPI of floating point operations = 4.0
Average CPI of other instructions = 1.33
Frequency of FPSQR = 2% (FPSQR = instruction for FP Square Root)
CPI of FPSQR = 20
Given two design alternatives, the first being to reduce the
FPSQR to 2 and the second being to reduce the average CPI of
all FP operations to 2, which one should we opt for?

82
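One way to attack HA03 is to compare the overall CPI of each alternative (a sketch, assuming the clock rate and instruction count are unaffected by either design, so the lower CPI wins):

```python
# Overall CPI = sum over instruction classes of (frequency x class CPI).
base     = 0.25 * 4.0 + 0.75 * 1.33   # original design: ~2.0
alt_sqrt = base - 0.02 * (20 - 2)     # FPSQR CPI 20 -> 2 saves 0.02*18
alt_fp   = 0.75 * 1.33 + 0.25 * 2.0   # all FP CPI 4 -> 2
print(base, alt_sqrt, alt_fp)         # ~1.9975, ~1.6375, ~1.4975
```

Reducing the CPI of all FP operations gives the lower overall CPI, so it is the better choice under these assumptions.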

Readings
References: HP3E Ch.1 (1.1-1.3, 1.5); Ch.2 (4.1-4.3, 4.5-4.6)

Readings:
  "Real Stuff: Two SPEC Benchmarks and the Performance of Recent Intel Processors", HP3Ed, pages 259-266.
  Gordon E. Moore, "Cramming more components onto ICs", Electronics, April 19, 1965

AA 1: Questions 4.1, 4.2, 4.3, 4.6, 4.8, 4.9, 4.10, 4.11, 4.12, 4.14, 4.15, 4.16, 4.45, 4.46, 4.51. HP3Ed, pages 272-277.
Submission deadline: Monday, 25th July. Turn in: AA Notebook.
