Exercises 6
For each of the three systems, state whether it executes programs sequentially or
concurrently and whether the hardware executes in parallel or not. Motivate your answer.
2. Draw a 2 by 2 matrix in which each element represents one of SISD, SIMD, MISD, and MIMD.
3. Assume that you are performing a speedup performance measurement for an image
processing algorithm. Someone else has created a very efficient sequential implementation
I_s and you have implemented a new parallel version I_p of the algorithm. The sequential
implementation has been executed on a benchmark B on a computer C1 with two cores,
each running at 4 GHz. You have the source code for both I_s and I_p and access to a
machine with 8 cores. Each core is running at 2.2 GHz with hardware multi-threading, which
makes the OS believe that there are 16 cores in the machine.
Explain how you would perform a fair speedup evaluation of your parallel implementation.
4. Assume we have a program where 10% of the execution time is purely sequential and that
the rest of the execution time can be improved by parallelization. For the part of the code
that can be parallelized, each core gives only 80% improvement. For instance, 5 cores
give 5 × 80% = 4 times improvement.
(a) Create a speedup chart, showing speedup on the Y-axis and the number of cores on
the X-axis. Show the graph for 1 to 200 cores, for instance by plotting at intervals
of 25 cores.
(b) What is the maximal speedup that can be achieved regardless of how many cores we
add?
(c) What is it called if we increase the problem size linearly with the number of
cores? What kind of scaling was used in problem (a)? Why would either of these
scaling approaches make sense?
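The model in exercise 4 can be explored numerically. The following is a minimal sketch
in Python under one possible reading of the exercise: the sequential 10% of the execution
time is unchanged, and the remaining 90% is divided by 0.8 times the number of cores.
The function name `speedup` and its parameters are hypothetical, and other readings
(for instance, treating the single-core case as no improvement at all) are equally
defensible.

```python
def speedup(cores, seq_fraction=0.1, efficiency=0.8):
    """Speedup over the original run, assuming the parallel part runs
    (efficiency * cores) times faster and the sequential part is unchanged."""
    parallel_fraction = 1.0 - seq_fraction
    new_time = seq_fraction + parallel_fraction / (efficiency * cores)
    return 1.0 / new_time


# Sample points: 1 core, then every 25 cores up to 200.
core_counts = [1] + list(range(25, 201, 25))
for n in core_counts:
    print(f"{n:4d} cores -> speedup {speedup(n):.2f}")
```

The printed pairs can be passed to any plotting tool to draw the chart. Under this
reading a single core yields a value below 1, which is a consequence of applying the
80% factor to one core as well.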
Concurrent Programming and Semaphores
6. In this task, you should consider a multi-threaded producer-consumer problem. There are
two threads, a producer thread and a consumer thread. The producer thread is writing data
into a first-in-first-out (FIFO) buffer and the consumer thread is reading from the buffer.
The FIFO buffer can hold between 0 and n elements.
The task is to create both a thread-safe consumer function and a thread-safe producer
function with proper synchronization. If the buffer is empty, the consumer needs to wait
until there is an available item in the buffer. If the buffer is full (holds n elements),
the producer has to wait until there is space available, before it writes any data items
into the buffer. Solve the problem by using semaphores. You may write the solution as
pseudocode, as long as you clearly explain the program semantics.
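One common way to structure the synchronization described in exercise 6 uses two
counting semaphores plus a binary semaphore as a mutex. The following is a minimal
sketch in Python, not a definitive solution. The names `BoundedBuffer`, `free_slots`,
`used_slots`, `mutex`, and `run_demo` are hypothetical and chosen for illustration only.

```python
import threading
from collections import deque


class BoundedBuffer:
    """FIFO buffer holding at most n items, guarded by three semaphores."""

    def __init__(self, n):
        self.items = deque()
        self.free_slots = threading.Semaphore(n)  # counts empty places
        self.used_slots = threading.Semaphore(0)  # counts stored items
        self.mutex = threading.Semaphore(1)       # protects self.items

    def produce(self, item):
        self.free_slots.acquire()   # wait while the buffer is full
        with self.mutex:
            self.items.append(item)
        self.used_slots.release()   # signal: one more item available

    def consume(self):
        self.used_slots.acquire()   # wait while the buffer is empty
        with self.mutex:
            item = self.items.popleft()
        self.free_slots.release()   # signal: one more free place
        return item


def run_demo(count=100, n=4):
    """Send count integers through a buffer of size n; return what arrived."""
    buf = BoundedBuffer(n)
    received = []

    def producer():
        for i in range(count):
            buf.produce(i)

    def consumer():
        for _ in range(count):
            received.append(buf.consume())

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

Calling `run_demo(100, 4)` should return the integers 0 to 99 in order, since a single
producer and a single consumer preserve FIFO order. Acquiring the counting semaphore
before the mutex matters: taking the mutex first and then blocking on a full or empty
buffer would prevent the other thread from ever making progress.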
Data-Level Parallelism