Common Concurrency Problems
Researchers have spent a great deal of time and effort looking into con-
currency bugs over many years. Much of the early work focused on
deadlock, a topic which we’ve touched on in the past chapters but will
now dive into deeply [C+71]. More recent work focuses on studying
other types of common concurrency bugs (i.e., non-deadlock bugs). In
this chapter, we take a brief look at some example concurrency problems
found in real code bases, to better understand what problems to look out
for. And thus our problem: what are the most common concurrency bugs, and how should we prevent or handle them?
Atomicity-Violation Bugs
The first type of problem encountered is referred to as an atomicity vi-
olation. Here is a simple example, found in MySQL. Before reading the
explanation, try figuring out what the bug is. Do it!
Thread 1::
if (thd->proc_info) {
    ...
    fputs(thd->proc_info, ...);
    ...
}

Thread 2::
thd->proc_info = NULL;
In the example, two different threads access the field proc_info in
the structure thd. The first thread checks if the value is non-NULL and
then prints its value; the second thread sets it to NULL. Clearly, if the
first thread performs the check but then is interrupted before the call to
fputs, the second thread could run in-between, thus setting the pointer
to NULL; when the first thread resumes, it will crash, as a NULL pointer
will be dereferenced by fputs.
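The fix for this type of bug is usually to add locking around the shared accesses, so the NULL check and the use of the pointer execute atomically. Here is a minimal sketch using a pthread mutex; the structure and lock names are illustrative, not MySQL's actual fix:

```c
#include <pthread.h>
#include <stdio.h>

// Illustrative structure standing in for MySQL's thd.
typedef struct { const char *proc_info; } thd_t;

pthread_mutex_t proc_info_lock = PTHREAD_MUTEX_INITIALIZER;

// Thread 1's code: the NULL check and the fputs() now happen
// atomically, under the lock.
void print_proc_info(thd_t *thd) {
    pthread_mutex_lock(&proc_info_lock);
    if (thd->proc_info)
        fputs(thd->proc_info, stdout);
    pthread_mutex_unlock(&proc_info_lock);
}

// Thread 2's code: the write is protected by the same lock, so it
// can no longer slip in between the check and the use.
void clear_proc_info(thd_t *thd) {
    pthread_mutex_lock(&proc_info_lock);
    thd->proc_info = NULL;
    pthread_mutex_unlock(&proc_info_lock);
}
```

With both accesses under one lock, the interleaving that caused the crash is no longer possible.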
Order-Violation Bugs
Another common type of non-deadlock bug found by Lu et al. is known
as an order violation. Here is another simple example; once again, see if
you can figure out why the code below has a bug in it.
Thread 1::
void init() {
    ...
    mThread = PR_CreateThread(mMain, ...);
    ...
}

Thread 2::
void mMain(...) {
    ...
    mState = mThread->State;
    ...
}
As you can see, the code in Thread 2 seems to assume that the variable
mThread has already been initialized; however, if Thread 2 runs
immediately once it is created, the value of mThread will not yet be set
when it is accessed within mMain(), and the program will likely crash
with a NULL-pointer dereference (this assumes that the value of mThread
is initially NULL; if not, even stranger things could happen as
arbitrary memory locations are read through the dereference in Thread 2).
The more formal definition of an order violation is this: “The desired
order between two (groups of) memory accesses is flipped (i.e., A should
always be executed before B, but the order is not enforced during execu-
tion).” [L+08]
The fix to this type of bug is generally to enforce ordering. As we
discussed in detail previously, using condition variables is an easy and
robust way to add this style of synchronization into modern code bases.
In the example above, we could thus rewrite the code as follows:
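One sketch of that rewrite uses a pthread mutex and condition variable plus a state variable recording that initialization has happened; the names here (mtLock, mtCond, mtInit) are illustrative:

```c
#include <pthread.h>

pthread_mutex_t mtLock = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t  mtCond = PTHREAD_COND_INITIALIZER;
int mtInit = 0;

// Thread 1: after creating the thread, record that initialization
// is done and wake any waiter.
void init(void) {
    // mThread = PR_CreateThread(mMain, ...); (as before)
    pthread_mutex_lock(&mtLock);
    mtInit = 1;
    pthread_cond_signal(&mtCond);
    pthread_mutex_unlock(&mtLock);
}

// Thread 2: wait until initialization is done before touching mThread.
void mMain(void) {
    pthread_mutex_lock(&mtLock);
    while (mtInit == 0)
        pthread_cond_wait(&mtCond, &mtLock);
    pthread_mutex_unlock(&mtLock);
    // mState = mThread->State; (now safe)
}
```

Note the while loop around the wait: the state variable mtInit, not the signal itself, records that initialization has occurred, so the ordering holds even if Thread 2 starts waiting after Thread 1 has already signaled.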
32.3 Deadlock Bugs

Beyond the non-deadlock bugs above, the classic problem of deadlock
arises when threads acquire locks in conflicting orders. Consider this
simple example:

Thread 1:    Thread 2:
lock(L1);    lock(L2);
lock(L2);    lock(L1);
Note that if this code runs, deadlock does not necessarily occur; rather,
it may occur, if, for example, Thread 1 grabs lock L1 and then a context
switch occurs to Thread 2. At that point, Thread 2 grabs L2, and tries to
acquire L1. Thus we have a deadlock, as each thread is waiting for the
other and neither can run. See Figure 32.1 for details; the presence of a
cycle in the graph is indicative of the deadlock.
The figure should make clear the problem. How should programmers
write code so as to handle deadlock in some way?
[Figure 32.1: The Deadlock Dependency Graph. Thread 1 holds Lock L1 and wants Lock L2; Thread 2 holds Lock L2 and wants Lock L1. The cycle indicates the deadlock.]
Prevention
Circular Wait
Probably the most practical prevention technique (and certainly one that
is used frequently) is to write your locking code such that you never in-
duce a circular wait. The way to do that is to provide a total ordering on
lock acquisition. For example, if there are only two locks in the system (L1
and L2), we can prevent deadlock by always acquiring L1 before L2. Such
strict ordering ensures that no cyclical wait arises; hence, no deadlock.
As you can imagine, this approach requires careful design of global
locking strategies and must be done with great care. Further, it is just a
convention, and a sloppy programmer can easily ignore the locking pro-
tocol and potentially cause deadlock. Finally, it requires a deep under-
standing of the code base, and how various routines are called; just one
mistake could result in the wrong ordering of lock acquisition, and hence
deadlock.
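When a routine must grab two locks whose identities vary from caller to caller, one common trick to impose a consistent total order is to acquire them in order of lock address. This is a sketch of that idea, not a standard library routine:

```c
#include <pthread.h>

// Acquire two mutexes in a globally consistent order (by address),
// so every caller obeys the same total ordering and no circular
// wait can form, regardless of argument order.
void lock_both(pthread_mutex_t *m1, pthread_mutex_t *m2) {
    if (m1 < m2) {
        pthread_mutex_lock(m1);
        pthread_mutex_lock(m2);
    } else {
        pthread_mutex_lock(m2);
        pthread_mutex_lock(m1);
    }
}

void unlock_both(pthread_mutex_t *m1, pthread_mutex_t *m2) {
    pthread_mutex_unlock(m1);
    pthread_mutex_unlock(m2);
}
```

With this helper, lock_both(&a, &b) and lock_both(&b, &a) acquire the locks in the same order, so two such callers can never deadlock with each other.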
Hold-and-wait
The hold-and-wait requirement for deadlock can be avoided by acquiring
all locks at once, atomically. In practice, this could be achieved as follows:
lock(prevention);
lock(L1);
lock(L2);
...
unlock(prevention);
No Preemption
Because we generally view locks as held until unlock is called, multiple
lock acquisition often gets us into trouble because when waiting for one
lock we are holding another. Many thread libraries provide a more flexi-
ble set of interfaces to help avoid this situation. Specifically, a trylock()
routine will grab the lock (if it is available) or return -1 indicating that the
lock is held right now and that you should try again later if you want to
grab that lock.
Such an interface could be used as follows to build a deadlock-free,
ordering-robust lock acquisition protocol:
top:
    lock(L1);
    if (trylock(L2) == -1) {
        unlock(L1);
        goto top;
    }
Note that another thread could follow the same protocol but grab the
locks in the other order (L2 then L1) and the program would still be dead-
lock free. One new problem does arise, however: livelock. It is possible
(though perhaps unlikely) that two threads could both be repeatedly at-
tempting this sequence and repeatedly failing to acquire both locks. In
this case, both threads are running through this code sequence over and
over again (and thus it is not a deadlock), but progress is not being made,
hence the name livelock. There are solutions to the livelock problem, too:
for example, one could add a random delay before looping back and try-
ing the entire thing over again, thus decreasing the odds of repeated in-
terference among competing threads.
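Such a retry loop with a randomized delay might be sketched as follows, using the real pthread_mutex_trylock() (which returns 0 on success and a non-zero error code when the lock is held, rather than the -1 of the generic trylock() above):

```c
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

pthread_mutex_t L1 = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t L2 = PTHREAD_MUTEX_INITIALIZER;

// Grab both locks; on failure to get L2, release L1, sleep for a
// random short while to reduce repeated interference with a
// competing thread, and retry the whole sequence.
void lock_both_backoff(void) {
    for (;;) {
        pthread_mutex_lock(&L1);
        if (pthread_mutex_trylock(&L2) == 0)
            return;                 // got both locks
        pthread_mutex_unlock(&L1);  // back off: hold nothing...
        usleep(rand() % 1000);      // ...and wait a random delay
    }
}
```

The random delay makes it unlikely that two competing threads keep failing in lockstep, which is what turns this pattern from livelock-prone into merely occasionally-slow.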
One final point about this solution: it skirts around the hard parts of
using a trylock approach. The first problem that would likely exist again
arises due to encapsulation: if one of these locks is buried in some routine
that is getting called, the jump back to the beginning becomes more com-
plex to implement. If the code had acquired some resources (other than
L1) along the way, it must make sure to carefully release them as well;
for example, if after acquiring L1, the code had allocated some memory,
it would have to release that memory upon failure to acquire L2, before
jumping back to the top to try the entire sequence again. However, in
limited circumstances (e.g., the Java vector method above), this type of
approach could work well.
Mutual Exclusion
The final prevention technique would be to avoid the need for mutual
exclusion at all. In general, we know this is difficult, because the code we
wish to run does indeed have critical sections. So what can we do?
Herlihy had the idea that one could design various data structures to
be wait-free [H91]. The idea here is simple: using powerful hardware in-
structions, you can build data structures in a manner that does not require
explicit locking.
As a simple example, let us assume we have a compare-and-swap in-
struction, which as you may recall is an atomic instruction provided by
the hardware that does the following:
int CompareAndSwap(int *address, int expected, int new) {
    if (*address == expected) {
        *address = new;
        return 1; // success
    }
    return 0; // failure
}
Instead of acquiring a lock, doing the update, and then releasing it, we
have instead built an approach that repeatedly tries to update the value to
the new amount and uses the compare-and-swap to do so. In this manner,
no lock is acquired, and no deadlock can arise (though livelock is still a
possibility).
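For example, an atomic increment built on compare-and-swap repeatedly snapshots the value and tries to install the updated amount, retrying whenever another thread changed the value in between. Here CompareAndSwap is the non-atomic stand-in shown above; real code would use a hardware atomic such as GCC's __sync_bool_compare_and_swap:

```c
// Stand-in for the atomic hardware instruction shown earlier.
int CompareAndSwap(int *address, int expected, int new) {
    if (*address == expected) {
        *address = new;
        return 1; // success
    }
    return 0; // failure
}

// Repeatedly snapshot the value and try to install old + amount;
// the loop retries if the value changed between snapshot and swap.
void AtomicIncrement(int *value, int amount) {
    int old;
    do {
        old = *value;
    } while (CompareAndSwap(value, old, old + amount) == 0);
}
```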
Let us consider a slightly more complex example: list insertion. Here
is code that inserts at the head of a list:
void insert(int value) {
    node_t *n = malloc(sizeof(node_t));
    assert(n != NULL);
    n->value = value;
    n->next = head;
    head = n;
}
As written, this code has a race if called by multiple threads at once.
To make the insertion safe without locks, one can set the new node's
next pointer to the current head and then use compare-and-swap to
install the node as the new head. The swap fails if some other thread
has installed a new head in the meantime, in which case this thread
simply retries with the updated head.
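That retry loop might be sketched as follows; the pointer-valued CompareAndSwapPtr is again a single-threaded stand-in for the real hardware instruction:

```c
#include <stdlib.h>
#include <assert.h>

typedef struct node_t {
    int value;
    struct node_t *next;
} node_t;

node_t *head = NULL;

// Stand-in for an atomic compare-and-swap on a pointer.
int CompareAndSwapPtr(node_t **address, node_t *expected, node_t *new_node) {
    if (*address == expected) {
        *address = new_node;
        return 1; // success
    }
    return 0; // failure
}

// Lock-free head insertion: snapshot the current head into n->next,
// then try to swing head to n; retry if another thread won the race.
void insert(int value) {
    node_t *n = malloc(sizeof(node_t));
    assert(n != NULL);
    n->value = value;
    do {
        n->next = head;
    } while (CompareAndSwapPtr(&head, n->next, n) == 0);
}
```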
Of course, building a useful list requires more than just a list insert,
and not surprisingly building a list that you can insert into, delete from,
and perform lookups on in a wait-free manner is non-trivial. Read the
rich literature on wait-free synchronization if you find this interesting.
Deadlock Avoidance via Scheduling

Instead of deadlock prevention, in some scenarios deadlock avoidance is
preferable. Avoidance requires global knowledge of which locks the
various threads might grab during their execution; the scheduler then
runs the threads in a way that guarantees no deadlock can occur. For
example, assume we have two processors and four threads, where T1 and
T2 each grab locks L1 and L2 at some point, T3 grabs just L2, and T4
grabs no locks at all. We can show these lock acquisition demands of
the threads in tabular form:
      T1   T2   T3   T4
L1    yes  yes  no   no
L2    yes  yes  yes  no
A smart scheduler could thus compute that as long as T1 and T2 are not
run at the same time, no deadlock could ever arise. Here is one such
schedule:

CPU 1: T3 T4
CPU 2: T1 T2
Note that it is OK for (T3 and T1) or (T3 and T2) to overlap. Even
though T3 grabs lock L2, it can never cause a deadlock by running con-
currently with other threads because it only grabs one lock.
Let’s look at one more example. In this one, there is more contention
for the same resources (again, locks L1 and L2), as indicated by the fol-
lowing contention table:
      T1   T2   T3   T4
L1    yes  yes  yes  no
L2    yes  yes  yes  no
In particular, threads T1, T2, and T3 all need to grab both locks L1 and
L2 at some point during their execution. Here is a possible schedule that
guarantees that no deadlock could ever occur:
CPU 1: T4
CPU 2: T1 T2 T3
32.4 Summary
In this chapter, we have studied the types of bugs that occur in
concurrent programs. The first type, non-deadlock bugs, is surprisingly
common and often easier to fix. It includes atomicity violations, in
which a sequence of instructions that should have been executed together
was not, and order violations, in which the needed order between two
threads was not enforced.
We have also briefly discussed deadlock: why it occurs, and what can
be done about it. The problem is as old as concurrency itself, and many
hundreds of papers have been written about the topic. The best solu-
tion in practice is to be careful, develop a lock acquisition total order,
and thus prevent deadlock from occurring in the first place. Wait-free
approaches also have promise, as some wait-free data structures are now
finding their way into commonly-used libraries and critical systems,
including Linux. However, their lack of generality and the complexity
of developing a new wait-free data structure will likely limit the
overall utility of this approach. Perhaps the best solution is to
develop new concurrent programming models: in systems such as
MapReduce (from Google)
[GD02], programmers can describe certain types of parallel computations
without any locks whatsoever. Locks are problematic by their very na-
ture; perhaps we should seek to avoid using them unless we truly must.