Exercise 1

The document presents exercises related to parallel computing, specifically focusing on Amdahl's law and speedup calculations for programs running on multiple cores. It includes a scenario where 1% of a program is not parallelizable and asks for the parallel speedup when run on 61 cores. Additionally, it discusses the impact of broadcast operations on speedup, comparing two different implementations of broadcast overhead.

Uploaded by

lamya.gandhi4
Copyright
© All Rights Reserved

Exercise 1

Assume 1% of the runtime of a program is not parallelizable. This program
is run on 61 cores of an Intel Xeon Phi. Under the assumption that the
program runs at the same speed on all of those cores, and there are no
additional overheads, what is the parallel speedup?

Solution
Amdahl’s law assumes that a program consists of a serial part and a
parallelizable part. The fraction of the program which is serial can be
denoted as B, so the parallel fraction becomes 1 − B. If there is no
additional overhead due to parallelization, the speedup on n cores can
therefore be expressed as

S(n) = 1 / (B + (1 − B) / n)

With B = 0.01 and n = 61, this gives S(61) = 1 / (0.01 + 0.99 / 61) = 38.125.
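
As a quick numerical check, here is a minimal Python sketch of Amdahl's law in its usual overhead-free form. The function name `amdahl_speedup` and the variable `speedup_61` are illustrative and do not come from the exercise sheet.

```python
def amdahl_speedup(serial_fraction: float, cores: int) -> float:
    """Speedup under Amdahl's law, assuming no parallelization overhead."""
    parallel_fraction = 1.0 - serial_fraction
    # Serial part runs at full cost; parallel part is divided across the cores.
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Exercise 1: 1% serial runtime, 61 cores.
speedup_61 = amdahl_speedup(0.01, 61)
print(f"Speedup on 61 cores: {speedup_61:.3f}")
```

Even with 61 cores, the 1% serial part keeps the speedup well below 61, and no core count can push it past 1 / 0.01 = 100.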

Exercise 2
Assume that the program invokes a broadcast operation. This broadcast
adds overhead, depending on the number of cores n involved. There are two
broadcast implementations available. One adds a parallel overhead of
0.0001 n, the other one 0.0005 log(n). For which number of cores do you
get the highest speedup with each implementation?
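
One way to explore this question is a brute-force search over core counts. The sketch below rests on several assumptions not stated in the exercise: the serial fraction is still the 1% from Exercise 1, the overhead is simply added to the denominator of Amdahl's law, `log` is the natural logarithm, and the search stops at a hypothetical upper limit of 10,000 cores. All function and variable names are illustrative.

```python
import math

SERIAL_FRACTION = 0.01  # assumption: same 1% serial part as in Exercise 1


def speedup_with_overhead(cores, overhead):
    """Amdahl speedup with an additive, core-dependent overhead term."""
    parallel_fraction = 1.0 - SERIAL_FRACTION
    return 1.0 / (SERIAL_FRACTION + parallel_fraction / cores + overhead(cores))


def linear_overhead(n):
    return 0.0001 * n


def log_overhead(n):
    # Assumption: natural logarithm; the exercise does not state the base.
    return 0.0005 * math.log(n)


def best_core_count(overhead, max_cores=10_000):
    """Return the core count in 1..max_cores with the highest speedup."""
    return max(range(1, max_cores + 1),
               key=lambda n: speedup_with_overhead(n, overhead))


best_linear = best_core_count(linear_overhead)
best_log = best_core_count(log_overhead)
print("linear overhead:", best_linear, speedup_with_overhead(best_linear, linear_overhead))
print("log overhead:   ", best_log, speedup_with_overhead(best_log, log_overhead))
```

Under these assumptions, setting the derivative of the denominator to zero gives n = sqrt(0.99 / 0.0001) ≈ 99.5 for the linear overhead, so 99 and 100 cores tie in exact arithmetic at a speedup of about 33.4. For the logarithmic overhead it gives n = 0.99 / 0.0005 = 1980, with a speedup of about 70. A different logarithm base would shift the second optimum.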
Exercise 3

Exercise 4

Exercise 5

Exercise 6
