Assignment 1
Assume you have 5 work loads of dirty clothes, which you need to wash, dry, and iron. A
washing machine, tumble dryer, or ironer can complete one work load of clothes per hour.
a) If you have exactly one washing machine, one tumble dryer, and one ironer, how long
does it take you to do all the work if you do all the work sequentially (Only work on one work
load at a time. Do not use your washing machine, tumble dryer, and ironer at the same
time)?
Answer:
Time required for a single work load = 1 + 1 + 1 = 3 hours (wash, dry, iron).
For 5 work loads = 5 * 3 = 15 hours.
b) Assume you have some friends who want to help you clean your clothes. Each of your
friends has a washing machine, a tumble dryer, and an ironer at home. Thus, you can
parallelize the work among your friends using the data decomposition approach. How do you
do it? How long would it take until the work is done, if everybody starts immediately?
Assume that you have as many helpers as you need for your solution.
Answer:
Assume 4 friends help me, so together we are 5 people.
After dividing the work, each person has a single work load and washes, dries, and irons it on their own machines.
Total time required = 1 + 1 + 1 = 3 hours.
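The data decomposition estimate can be checked with a short sketch. The function name `decomposed_time` and its parameters are illustrative assumptions, not part of the assignment. It also assumes that each person processes whole work loads one after another on their own machines, as in part a).

```python
import math


def decomposed_time(loads: int, workers: int, stages: int = 3,
                    hours_per_stage: int = 1) -> int:
    """Hours needed when whole work loads are split across workers.

    Assumption: every worker handles its loads strictly one at a time
    (wash, then dry, then iron), without overlapping machines.
    """
    # The worker with the most loads determines the finish time.
    loads_per_worker = math.ceil(loads / workers)
    return loads_per_worker * stages * hours_per_stage


print(decomposed_time(loads=5, workers=1))  # working alone: 15
print(decomposed_time(loads=5, workers=5))  # me plus 4 friends: 3
```

With fewer helpers the busiest person sets the pace. For example, 2 people would need 3 loads * 3 hours = 9 hours under the same assumption.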
c) In principle, you could use your washing machine, the tumble dryer, and the ironer at the
same time. How long would it take you to finish your work, if you have to work alone but
utilize the parallelism of your machines?
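Answer:
7 hours. The three machines form a pipeline: the first work load is finished after 3 hours, and each of the remaining 4 work loads finishes one hour after the previous one, so the total is 3 + 4 = 7 hours.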
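The pipelined schedule can be simulated in a few lines. This is a minimal sketch, assuming each machine handles one work load at a time and a load moves to the next machine as soon as that machine is free. The name `pipelined_time` is hypothetical.

```python
def pipelined_time(loads: int, stages: int = 3, hours_per_stage: int = 1) -> int:
    """Finish time when all machines may run at once on different loads."""
    # finish[s] is the hour at which machine s completed its latest load.
    finish = [0] * stages
    for _ in range(loads):
        ready = 0  # hour at which the current load leaves the previous machine
        for s in range(stages):
            # A stage starts once the load has arrived and the machine is free.
            start = max(ready, finish[s])
            finish[s] = start + hours_per_stage
            ready = finish[s]
    return finish[-1]


print(pipelined_time(loads=5))  # 7
```

For equal stage times the result matches the closed form stages + loads - 1, here 3 + 5 - 1 = 7.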
Task 2
Assume an application where the execution of floating-point instructions on a certain
processor P consumes 60% of the total runtime. Let’s assume further that 25% of the
floating-point time is spent in square root calculations.
a) Based on some initial simulations, the design team of the next-generation processor P2
believes that it could either improve the performance of all floating-point instructions equally
by a factor of 1.5 or alternatively speed up the square root operation by a factor of 8. From
which design alternative would the application mentioned before benefit more?
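Answer:
Floating-point instructions take 60% of the runtime.
Square root calculations take 25% of 60% = 15% of the runtime.
By Amdahl's law, speeding up all floating-point instructions by a factor of 1.5 gives an overall speedup of 1 / (0.4 + 0.6/1.5) = 1.25.
Speeding up only the square root by a factor of 8 gives 1 / (0.85 + 0.15/8) ≈ 1.15.
The application benefits more from the faster floating-point instructions.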
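The two design alternatives can be compared numerically with Amdahl's law. The sketch below assumes the stated runtime fractions hold exactly. The function name `amdahl_speedup` and the variable names are illustrative, not taken from the assignment.

```python
def amdahl_speedup(fraction: float, factor: float) -> float:
    """Overall speedup when `fraction` of the runtime is sped up by `factor`."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)


fp_fraction = 0.60                  # floating-point share of the runtime
sqrt_fraction = fp_fraction * 0.25  # square root share of the runtime (15%)

all_fp = amdahl_speedup(fp_fraction, 1.5)    # every FP instruction 1.5x faster
sqrt_only = amdahl_speedup(sqrt_fraction, 8)  # only the square root 8x faster

print(f"all floating-point x1.5: {all_fp:.3f}")
print(f"square root x8:          {sqrt_only:.3f}")
```

The larger of the two values identifies the alternative from which the application benefits more.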
b) Instead of waiting for the next processor generation, the developers of the application
decide to parallelize the code. What speedup can be achieved on a 16-CPU system if 90%
of the code can be perfectly parallelized? What fraction of the code has to be parallelized to
get a speedup of 10?
Answer:
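One way to work this out is Amdahl's law, speedup = 1 / ((1 - p) + p/n), where p is the parallel fraction and n the number of CPUs. With p = 0.9 and n = 16 this gives 1 / (0.1 + 0.9/16) = 6.4. Solving the same formula for p with a target speedup of 10 on 16 CPUs gives p = (1 - 1/10) / (1 - 1/16) = 0.96, so about 96% of the code would have to be parallelized. The sketch below reproduces both numbers. It assumes perfect parallelization with no overhead, and the function names are hypothetical.

```python
def amdahl_speedup(parallel_fraction: float, cpus: int) -> float:
    """Speedup on `cpus` processors if `parallel_fraction` scales perfectly."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cpus)


def required_fraction(target_speedup: float, cpus: int) -> float:
    """Parallel fraction needed to reach `target_speedup` on `cpus` processors.

    Derived by solving 1 / ((1 - p) + p / n) = S for p:
        p = (1 - 1/S) / (1 - 1/n)
    """
    return (1.0 - 1.0 / target_speedup) / (1.0 - 1.0 / cpus)


print(f"speedup with 90% parallel on 16 CPUs: {amdahl_speedup(0.90, 16):.2f}")
print(f"fraction needed for speedup 10:       {required_fraction(10, 16):.2f}")
```

Under these assumptions, a speedup of 10 is only reachable at all because 10 is below the CPU count of 16.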