Exercises
These exercises were originally developed for the Cyberinfrastructure Tutor, the web-based training
site for High Performance Computing and Cyberinfrastructure topics:
http://ci-tutor.ncsa.illinois.edu
which is hosted by the National Center for Supercomputing Applications (NCSA). The exercises
and solutions are adapted to show the use of different debugging tools and compiler options besides
using traditional debuggers.
Please note that a different compiler or debugger, or even a different version of the same compiler
or debugger, may behave somewhat differently from what is shown here.
$ ./variablePrinting
array1 = [ 2 3 4 5 6 7 8 9 10 11 ]
The result for the array del certainly does not look correct.
a. What are the expected values for array del?
To identify this error we recompile our code using the debugging option '-g' and analyze it using
GDB. In this example we illustrate how to debug a code by using a debugger to print out the values
of selected variables during the execution of a program.
One possibility that could account for the values of all elements of del being zero is a coding error
in our function squareArray(). If, for some reason, this function fails to square the elements of
array1 and, instead, is simply returning an array, array2, whose elements are identical to those of
array1 then
del[ k ] = array2[ k ] - array1[ k ] = array1[ k ] - array1[ k ] = 0
for all k.
Therefore, we might first want to examine the values of the elements of the array array2 to see if
array2[k] = array1[k] for all k. We can do this in either of two ways. The first way is to set a
breakpoint on the line of our code where we call the function squareArray(), then print out the value
of array2[indx] as we step through all nelem iterations of the for() loop within the body of this function.
int squareArray(const int nelem_in_array, int *array)
{
int indx;
The second way is to set a breakpoint immediately after the call to squareArray() and print out the
elements of array2 returned by the call to squareArray(). This example will illustrate both ways of
examining the values array2[k].
After recompiling our code we run it using GDB.
$ gcc -g -o variablePrinting variablePrinting.c
The tui option shows the source code in the upper half of the terminal, which makes it easy to see
where execution currently is, which line will be executed next, and where breakpoints are set.
If we want to step into and through our function squareArray(), we need to set our initial breakpoint
on line 37 and then rerun our code.
(gdb) b 37
Breakpoint 1 at 0x8048469: file variablePrinting.c, line 37.
(gdb) run
Starting program: variablePrinting
array1 = [ 2 3 4 5 6 7 8 9 10 11 12 13 ]
We can then step into this function by issuing the 'step' (or 's') command.
(gdb) s
squareArray (nelem_in_array=12, array=0x80498d8) at variablePrinting.c:68
68 for (indx = 0; indx < nelem_in_array; indx++)
(gdb) s
70 array[indx] *= array[indx];
At this point we begin printing out the values of the elements of array on both the left- and
right-hand sides of the expression on line 70. We also track our progress through the for() loop by
printing out the value of the variable indx during each iteration of the loop. We can print the value
of each of these variables by issuing the 'print' (or 'p') command followed by the name of the
variable whose value we want to print; e.g.,
(gdb) p indx
(gdb) p array[indx]
The command 'p v' prints out the value of the variable v in the same format as a printf() statement
in C, that is
printf("%d\n", v)
if v is an integer variable or
printf("%f\n", v)
if v is a double.
In this example, we are only printing out the values of two variables. However, as the number of
variables becomes more than just a few, this method of printing a sequence of variable values will
begin to require a lot of typing. Fortunately, GDB provides alternatives for printing out a series of
variable values without having to type multiple print commands each time you want to examine
these values. One method involves issuing the GDB 'display' (or 'disp') command.
Let's display the loop variable and the array value:
(gdb) disp indx
1: indx = 0
(gdb) disp array[indx]
2: array[indx] = 2
Another option is to set a breakpoint inside a longer loop and let the program run to it, so that you
see the values only at that point, once for every iteration.
Examining the values of array[indx] we see that each element of array2 returned by squareArray()
is equal to the square of the corresponding element in array1. Consequently, we know that our
coding error is not located anywhere within the function squareArray(). Therefore, we need to
continue debugging our program until we locate the coding error in our program.
Complete the debugging of this code and identify the coding error in the program.
(BONUS e. How can you print the whole array array1 in one go?)
Exercise 2: arrays
The default range of indices differs among the major programming languages. For example, in
Fortran 77 the default index range is i=1,...,N (where N is the array size); in C/C++ the default
range is i=0,...,N-1; and in Fortran 90 the allowed index range can be any sequence of integers with
any spacing. So if you routinely use several programming languages, it is easy to confuse the
indexing rules.
This program is supposed to fill an integer array of N elements according to an algorithm based on the
mod operator (% in C, mod() in Fortran) and should add the values of all of the array elements with
even indices and output the result. It should also do the same with odd-indexed elements. The
program is written in two versions, C and Fortran; just choose one.
In this exercise we use the Intel compiler. After compiling and executing the code, we get the
following result:
salomon$ module load intel
salomon$ icc -o array array.c
salomon$ ./array
oddsum=5
evensum=224683
salomon$
As you can see, the code does not generate compiler or runtime errors. However, something is
wrong. By looking at the body of the first loop it looks like the array elements should be relatively
small integers. So the value of oddsum is probably alright. But why is the value of evensum so
large? The value of evensum is computed in our summation loop, so we suspect that the error occurs
there. Therefore, we will concentrate our debugging effort on that section of code.
There are five basic steps we will take to identify the error along with a final step of exiting the
debugger:
1. Recompile using the '-g' option
2. Run the debugger (you can use gdb-ia, TotalView or DDT, whichever you prefer)
3. Identify the breakpoint
4. Execute the code up to the breakpoint
5. Step through the summation loop
The first step shouldn't be a problem anymore. If you use GDB, use the command gdb-ia, which is
built by Intel to match its compiler. Let's put a breakpoint at the beginning of the summation loop.
a. What's the command to do that? How do we run the program up to the breakpoint?
Before analyzing the summation loop, it is a good idea to check some of the array elements to see
if they have the correct values.
b. Use the 'print' debugger command to show the values of the elements tock[2] and tock[7]. Are the
values what you expected?
c. How do we display the value of evensum to make sure that it did get initialized to zero and how
it gets updated?
Step through the summation loop until you see the problematic values.
d. What is the cause of the high (and random) evensum and how can you solve this?
e. How can you print the whole array tock in one go?
As you can see from the output, the value of the sum is way too large given the numbers input. We
leave it up to you to figure out why.
Hint: to run an application in gdb that reads the file input.txt from standard input, use
$ gdb ./readsum
(gdb) start < input.txt
There are two possible and common reasons why the ifort compiler gives this error message:
1. the program is trying to read more data than there is in the file
2. the file "sectionsCT.txt" does not even exist.
a. Which one is the problem here?
Fortran opens files for reading and writing by default and creates the file if it doesn't exist.
However, it's good practice to open files read-only when you don't intend to write to them. This
prevents confusing error messages and accidentally overwritten files. So input files should be opened
only when they exist, and output files should be opened only if they don't already exist.
b. What is the optional argument to only open existing files? What is the optional argument to open
the file readonly?
Edit the source code and run 'make clean; make'. Now, when we run the program, the runtime
environment fails with the helpful message that the file does not exist.
Correct the file name in the source code, recompile and rerun the program. The Fortran runtime
environment runs into more errors when reading the input file.
c. Compare the input that is read in the program to what you expect it should be reading (either
through print statements or a debugger). When printing strings, it is often difficult to see if there
are extra spaces at the beginning or end. How can you make these visible?
d. Edit the format to correctly parse the input-file.
Rerun the program and check that the output in the output file is as expected.
e. Is it correct? Please fix the rest of the program.
The four columns contain the endpoint number, the independent variable x, the dependent variable
cos(x), and the integral of the function from the left endpoint to the current point. The good news is
that the final value of the integral is exactly zero, as expected. The bad news is that most of the
other values in the output file are clearly incorrect.
a. Use your favourite debugger to find out why the value of x does not advance through the loop.
b. Could the GNU compiler be of help to locate these issues for you?
The default version without MPI support will however report a large number of false errors in the
MPI library, such as:
Salomon contains two Valgrind versions with MPI included to check for these errors. Load one of
the following modules before compiling:
Valgrind/3.11.0-intel-2015b for Intel MPI
Valgrind/3.11.0-foss-2015b for OpenMPI
Try different sizes of the array up to at least 10,000 elements. How does the behaviour of the
program change? Why do you think that is?
b) If you like, attach TotalView to the running job and try to see what's the problem.
c) Fix the program so that it also works for larger arrays.
An OpenMP directive was then added to parallelize the main loop as follows:
!$omp parallel do private(xi,txi,xmom)
do j = 1, nyd
xi = cxi*y(j)
txi = tanh(xi)
xmom = 1.0 - txi**2
u(j) = cu*xmom
v(j) = cv*(2.0*xi*xmom - txi)
enddo
This code was compiled and run with two threads resulting in the following end of the output file:
1.105E+01 4.8600 4.364E-09 -1.373E-05 2.403E-04 -7.563E-01
1.153E+01 5.0672 2.883E-09 -1.374E-05 1.588E-04 -7.569E-01
1.202E+01 5.2827 1.872E-09 -1.375E-05 1.031E-04 -7.573E-01
edge momentum factor = 0.000E+00
The velocity profile (table of numbers) is identical between the serial and parallel cases, but the
edge momentum factor differs. Upon examining the entire output files, you would see that the
whole file is identical between the two cases with the exception of the edge momentum factor.
Introduction
In MPI programming, a common reason for a program to hang is a faulty assumption about the
number or configuration of the MPI processes spawned by the parallel program. This is especially
common when using MPI's blocking point-to-point communication routines (e.g., MPI_SEND() and
MPI_RECV()).
Objectives
In this lesson, you will learn how to diagnose programming errors involving incorrect process
spawning assumptions and debug them using a parallel debugger.
The C program, ring.c, constructs a ring topology out of its MPI processes and has each process
send data to its neighbor on one side and receive data from its neighbor on the other side. However,
as currently written, this program only works if it has four MPI processes. Find a way to make it
work with an arbitrary number of processes. (Hint: there is more than one way to do this.)
Note the lack of response from MPI processes 0 through 3. Worse, this behavior is independent of
the number of MPI processes used.