
Ex 1

This document discusses topics from the textbook "Computer Architecture – A Quantitative Approach (4th Ed)" including sections on performance trends and reading assignments. It also provides exercises related to optimizing processor performance for web serving and graphics applications. Specific questions calculate speedup from parallelization and enhanced hardware for different applications and systems.

Uploaded by

samanafatima
Copyright
© Attribution Non-Commercial (BY-NC)

Topics Discussed

Book: Computer Architecture – A Quantitative Approach (4th Ed)


Sections: 1.1, 1.2, 1.3, 1.4 (Performance trends), 1.9
1.11 – Reading Assignment

History and Evaluation of Computers (not in this book) -- Chapter 2 of Computer
Architecture and Organization (6th Edition)

EXERCISES:
1. Suppose that we want to enhance the processor used for Web serving. The new
processor is 10 times faster on computation in the Web serving application than the
original processor. Assuming that the original processor is busy with computation
40% of the time and is waiting for I/O 60% of the time, what is the overall speedup
gained by incorporating the enhancement?
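One way to check an answer to exercise 1 is a small Amdahl's law helper. This is only a sketch: the function name `amdahl_speedup` and its parameter names are illustrative and do not come from the textbook. It assumes the 40% computation fraction is the only part that benefits from the enhancement.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when only a fraction of execution time is sped up."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0.0:
        raise ValueError("speedup_enhanced must be positive")
    # The unenhanced part keeps its time; the enhanced part shrinks.
    new_time = (1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced
    return 1.0 / new_time


# Exercise 1: computation is 40% of the time and becomes 10 times faster.
web_speedup = amdahl_speedup(0.40, 10.0)
print(f"Overall speedup: {web_speedup:.4f}")  # about 1.56
```

The I/O wait (60% of the time) is untouched, which is why a tenfold faster processor gives much less than a tenfold overall gain.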

2. A common transformation required in graphics processors is square root.
Implementations of floating-point (FP) square root vary significantly in performance,
especially among processors designed for graphics. Suppose FP square root (FPSQR)
is responsible for 20% of the execution time of a critical graphics benchmark. One
proposal is to enhance the FPSQR hardware and speed up this operation by a factor of
10. The other alternative is just to try to make all FP instructions in the graphics
processor run faster by a factor of 1.6; FP instructions are responsible for half of the
execution time for the application. The design team believes that they can make all
FP instructions run 1.6 times faster with the same effort as required for the fast square
root. Compare these two design alternatives.
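The comparison in exercise 2 can be sketched by applying Amdahl's law to each alternative. The helper name `amdahl_speedup` and the variable names are illustrative assumptions; the fractions and factors are the ones stated in the exercise.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when only a fraction of execution time is sped up."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


# Alternative 1: FPSQR is 20% of execution time and becomes 10 times faster.
speedup_fpsqr = amdahl_speedup(0.20, 10.0)

# Alternative 2: all FP instructions are 50% of execution time and become 1.6 times faster.
speedup_fp = amdahl_speedup(0.50, 1.6)

print(f"FPSQR enhancement: {speedup_fpsqr:.4f}")  # about 1.22
print(f"All-FP enhancement: {speedup_fp:.4f}")    # about 1.23
better = "all FP" if speedup_fp > speedup_fpsqr else "FPSQR"
print(f"Slightly better alternative: {better}")
```

Under these numbers the modest improvement to all FP instructions edges out the large improvement to square root alone, because it applies to a larger share of the execution time.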

3. Calculate the reliability improvement.

The improvement/speedup of fraction is 4150.

4. Your company has just bought a new dual Pentium processor, and you have been
tasked with optimizing your software for this processor. You will run two
applications on this dual Pentium, but the resource requirements are not equal. The
first application needs 80% of the resources, and the other only 20% of the resources.
a. Given that 40% of the first application is parallelizable, how much speedup would
you achieve with that application if run in isolation?
b. Given that 99% of the second application is parallelizable, how much speedup
would this application observe if run in isolation?
c. Given that 40% of the first application is parallelizable, how much overall system
speedup would you observe if you parallelized it?
d. Given that 99% of the second application is parallelizable, how much overall
system speedup would you get?
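The four parts of exercise 4 can be sketched as below. Two assumptions are made here that the exercise does not spell out: the parallelizable portion runs exactly twice as fast on the dual processor, and the 80%/20% resource shares are treated as each application's share of total system time. The function and variable names are illustrative.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when only a fraction of execution time is sped up."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


PROCESSORS = 2  # dual processor: assume the parallel part speeds up by 2

# Parts a and b: each application considered in isolation.
speedup_a = amdahl_speedup(0.40, PROCESSORS)
speedup_b = amdahl_speedup(0.99, PROCESSORS)

# Parts c and d: only the parallel part of one application is enhanced,
# so the enhanced fraction of the whole system is share * parallel fraction.
speedup_c = amdahl_speedup(0.80 * 0.40, PROCESSORS)
speedup_d = amdahl_speedup(0.20 * 0.99, PROCESSORS)

for label, value in [("a", speedup_a), ("b", speedup_b),
                     ("c", speedup_c), ("d", speedup_d)]:
    print(f"4{label}: {value:.3f}")
```

With these assumptions the values come out near 1.25, 1.98, 1.19, and 1.11, which shows that a highly parallel application contributes little to system speedup when it uses only a small share of the resources.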

5. Program-I runs in 10 sec on machine A, which has a 400 MHz clock rate. We are trying
to design a machine B with a faster clock rate so as to reduce the total execution time
to 6 seconds. The increase in clock rate will affect the rest of the CPU design, causing
B to require 1.2 times as many clock cycles as machine A. Determine the
clock rate of machine B.
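Exercise 5 follows from the relation execution time = clock cycles / clock rate. The sketch below works through it with the figures from the exercise; the variable names are illustrative.

```python
# Machine A: 10 seconds at 400 MHz.
time_a_sec = 10.0
clock_a_hz = 400e6

# Clock cycles used by the program on A: cycles = time * clock rate.
cycles_a = time_a_sec * clock_a_hz

# Machine B needs 1.2 times as many cycles and must finish in 6 seconds.
cycles_b = 1.2 * cycles_a
time_b_sec = 6.0

# Rearranging time = cycles / clock rate gives the required clock rate.
clock_b_hz = cycles_b / time_b_sec
print(f"Required clock rate for B: {clock_b_hz / 1e6:.0f} MHz")  # about 800 MHz
```

So B must run at roughly twice A's clock rate to get a 10/6 reduction in execution time, since part of the gain is lost to the extra cycles.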

6. Calculate the performance of the machine in terms of MIPS (millions of instructions per
second), given the following statistics.

Instruction Type             Instruction Count   CPI
Integer Arithmetic           45000               1
Data Transfer                32000               2
Floating Point Instruction   15000               2
Control Transfer             8000                2

7. Compare the performance of the following three machines in terms of MIPS (millions of
instructions per second), given the following specifications:

                                            A        B        C        IC
Clock rate                                  200MHz   230MHz   300MHz
Avg. CPI of CPU dependent instructions      3        2        2        30000
Avg. CPI of memory dependent instructions   8        8        12       20000
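For exercises 6 and 7, MIPS can be computed as clock rate / (average CPI × 10^6), where the average CPI is weighted by instruction count. The sketch below assumes that the IC column in exercise 7 gives the instruction count of each category and is the same for all three machines. Exercise 6 states no clock rate, so only its average CPI is computed. The function and variable names are illustrative.

```python
def average_cpi(mix):
    """Weighted average CPI for a list of (instruction_count, cpi) pairs."""
    total_instructions = sum(count for count, _ in mix)
    total_cycles = sum(count * cpi for count, cpi in mix)
    return total_cycles / total_instructions


def mips(clock_hz, mix):
    """MIPS rating = clock rate / (average CPI * 10^6)."""
    return clock_hz / (average_cpi(mix) * 1e6)


# Exercise 6: instruction mix from the table (no clock rate is given).
exercise6_mix = [(45000, 1), (32000, 2), (15000, 2), (8000, 2)]
exercise6_cpi = average_cpi(exercise6_mix)
print(f"Exercise 6 average CPI: {exercise6_cpi:.2f}")  # 1.55

# Exercise 7: (clock rate, [(IC, CPI) for CPU- and memory-dependent instructions]).
machines = {
    "A": (200e6, [(30000, 3), (20000, 8)]),
    "B": (230e6, [(30000, 2), (20000, 8)]),
    "C": (300e6, [(30000, 2), (20000, 12)]),
}
ratings = {name: mips(clock, mix) for name, (clock, mix) in machines.items()}
for name, rating in ratings.items():
    print(f"Machine {name}: {rating:.2f} MIPS")
```

Under these assumptions machine C has the highest clock rate but not the highest MIPS rating, because its memory dependent instructions cost more cycles.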
