13-Conc Bugs
Concurrency Bugs
Nima Honarmand
(Based on slides by Prof. Andrea Arpaci-Dusseau)
Fall 2017 :: CSE 306
“The blackout's primary cause was a bug in the alarm system... The
lack of an alarm left operators unaware of the need to re-distribute
power after overloaded transmission lines hit unpruned foliage,
triggering a "race condition" in the energy management system…
What would have been a manageable local blackout cascaded into
massive widespread distress on the electric grid.”
Source: en.wikipedia.org/wiki/Northeast_blackout_of_2003
• What’s wrong?
• How to fix?
• Use a lock
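The code for this slide is not part of the extracted text, so the following is only an illustrative sketch of the kind of bug a lock fixes: two threads doing an unprotected read-modify-write on shared state. The names `counter`, `counter_lock`, `increment_worker`, and `run_locked_counter` are hypothetical and do not come from the slides.

```c
#include <assert.h>
#include <pthread.h>

#define ITERATIONS 100000

static long counter = 0;   /* shared state (hypothetical example) */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Each worker performs a read-modify-write on the shared counter.
 * Without the lock, two threads can read the same old value and
 * one of the two updates is lost. */
static void *increment_worker(void *arg) {
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(&counter_lock);
        counter++;                      /* critical section */
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Runs two workers to completion and returns the final counter value. */
long run_locked_counter(void) {
    pthread_t t1, t2;
    int rc;

    counter = 0;
    rc = pthread_create(&t1, NULL, increment_worker, NULL);
    assert(rc == 0);
    rc = pthread_create(&t2, NULL, increment_worker, NULL);
    assert(rc == 0);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return counter;
}
```

Calling `run_locked_counter()` from a `main` (built with `-pthread`) should return `2 * ITERATIONS`. If the lock and unlock calls are removed, the result is typically smaller and varies from run to run.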
• What’s wrong?
• How to fix?
• Use a condition variable
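The code for this slide is also missing from the extracted text. As a hedged illustration of an ordering bug and its condition-variable fix, the sketch below has one thread that must not read a value until another thread has set it. All names (`initialized`, `shared_value`, `initializer`, `consumer`, `run_ordered`) are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t init_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t init_done_cv = PTHREAD_COND_INITIALIZER;
static int initialized = 0;   /* predicate, guarded by init_lock */
static int shared_value = 0;  /* written by the initializer thread */

/* Produces the value, then announces that it is ready. */
static void *initializer(void *arg) {
    (void)arg;
    pthread_mutex_lock(&init_lock);
    shared_value = 42;
    initialized = 1;
    pthread_cond_signal(&init_done_cv);
    pthread_mutex_unlock(&init_lock);
    return NULL;
}

/* Waits until the value is ready before reading it. The while loop
 * re-checks the predicate, which also covers spurious wakeups. */
static void *consumer(void *arg) {
    int *out = arg;
    pthread_mutex_lock(&init_lock);
    while (!initialized)
        pthread_cond_wait(&init_done_cv, &init_lock);
    *out = shared_value;
    pthread_mutex_unlock(&init_lock);
    return NULL;
}

/* Starts the consumer first on purpose; returns the value it observed. */
int run_ordered(void) {
    pthread_t c, i;
    int seen = -1;
    int rc;

    initialized = 0;
    shared_value = 0;
    rc = pthread_create(&c, NULL, consumer, &seen);
    assert(rc == 0);
    rc = pthread_create(&i, NULL, initializer, NULL);
    assert(rc == 0);
    pthread_join(c, NULL);
    pthread_join(i, NULL);
    return seen;
}
```

Whichever thread runs first, the consumer reads `shared_value` only after `initialized` is set, so `run_ordered()` returns 42. Without the flag and the wait, the consumer could read the value before it is written.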
• Problems?
Locking Granularity
• Coarse-grain locking
• Have one (or a few) locks that protect all (or big chunks) of shared
state
• Example: early Linux’s BKL (Big Kernel Lock)
• One big lock protecting all kernel data
• Only one processor could execute kernel code at any point in time; others
would have to wait
• Significant contention over big locks → hurts performance
• Fine-grain locking
• Have many small locks, each protecting one (or a few) objects
• Reduces contention → better performance
• Increases deadlock risk
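A minimal sketch of fine-grain locking, assuming a hypothetical table of counters with one lock per bucket (none of these names come from the slides). A coarse-grain version would instead use a single mutex around every access to `table`.

```c
#include <assert.h>
#include <pthread.h>

#define NBUCKETS 4
#define OPS_PER_THREAD 50000

/* Fine-grain: every bucket carries its own lock. */
struct bucket {
    pthread_mutex_t lock;
    long count;
};

static struct bucket table[NBUCKETS];

/* Each worker walks over the buckets, locking only the one it updates.
 * Threads touching different buckets do not contend with each other. */
static void *bucket_worker(void *arg) {
    int id = *(int *)arg;
    for (int i = 0; i < OPS_PER_THREAD; i++) {
        struct bucket *b = &table[(id + i) % NBUCKETS];
        pthread_mutex_lock(&b->lock);
        b->count++;
        pthread_mutex_unlock(&b->lock);
    }
    return NULL;
}

/* Runs one worker per bucket and returns the sum of all bucket counts. */
long run_fine_grained(void) {
    pthread_t threads[NBUCKETS];
    int ids[NBUCKETS];
    long total = 0;
    int rc;

    for (int i = 0; i < NBUCKETS; i++) {
        rc = pthread_mutex_init(&table[i].lock, NULL);
        assert(rc == 0);
        table[i].count = 0;
    }
    for (int i = 0; i < NBUCKETS; i++) {
        ids[i] = i;
        rc = pthread_create(&threads[i], NULL, bucket_worker, &ids[i]);
        assert(rc == 0);
    }
    for (int i = 0; i < NBUCKETS; i++)
        pthread_join(threads[i], NULL);
    for (int i = 0; i < NBUCKETS; i++) {
        total += table[i].count;   /* safe: all workers have exited */
        pthread_mutex_destroy(&table[i].lock);
    }
    return total;
}
```

This sketch never holds more than one bucket lock at a time, so it cannot deadlock. The deadlock risk of fine-grain locking appears once an operation needs two or more of these locks at once.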
Deadlock Bugs
• Deadlock: No progress can be made because two or
more threads are each waiting for another to take
some action, and thus none ever does
Deadlock Theory
• Deadlocks can only occur when all
four conditions are true:
1) Mutual exclusion
2) Hold-and-wait
3) Circular wait
4) No preemption
[Figure: diagram labeled A, B, C, D with STOP signs]
• Eliminate deadlock by eliminating any one condition
1) Mutual Exclusion
• Definition: “Threads claim exclusive control of
resources that they require (e.g., thread grabs a lock)”
2) Hold-and-Wait
• Definition: “Threads hold resources allocated to them
(e.g., locks they have already acquired) while waiting
for additional resources (e.g., locks they wish to
acquire).”
• Strategy: release currently held resources when waiting
for new ones
Example with trylock
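top:
    pthread_mutex_lock(A);              /* take the first lock (may block) */
    if (pthread_mutex_trylock(B) != 0)  /* B is busy: do not wait while holding A */
    {
        pthread_mutex_unlock(A);        /* release A so other threads can proceed */
        goto top;                       /* retry from the beginning */
    }
    …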
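The trylock pattern above can be sketched as a complete program fragment. This is one possible arrangement under stated assumptions, not the course's reference code: `lock_both`, `backoff_worker`, `run_trylock_demo`, and the `sched_yield()` call between retries are all additions for illustration. The two workers deliberately request the locks in opposite orders.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

#define ROUNDS 10000

static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;
static long rounds_done = 0;   /* updated only while both locks are held */

struct lock_order {
    pthread_mutex_t *first;
    pthread_mutex_t *second;
};

/* Acquire both locks without ever blocking while holding one of them. */
static void lock_both(pthread_mutex_t *first, pthread_mutex_t *second) {
    for (;;) {
        pthread_mutex_lock(first);
        if (pthread_mutex_trylock(second) == 0)
            return;                     /* now holding both locks */
        pthread_mutex_unlock(first);    /* back off: breaks hold-and-wait */
        sched_yield();                  /* assumption: yield to make livelock less likely */
    }
}

static void *backoff_worker(void *arg) {
    struct lock_order *order = arg;
    for (int i = 0; i < ROUNDS; i++) {
        lock_both(order->first, order->second);
        rounds_done++;
        pthread_mutex_unlock(order->second);
        pthread_mutex_unlock(order->first);
    }
    return NULL;
}

/* Two workers take the locks in opposite orders; returns total rounds. */
long run_trylock_demo(void) {
    struct lock_order ab = { &lock_a, &lock_b };
    struct lock_order ba = { &lock_b, &lock_a };
    pthread_t t1, t2;
    int rc;

    rounds_done = 0;
    rc = pthread_create(&t1, NULL, backoff_worker, &ab);
    assert(rc == 0);
    rc = pthread_create(&t2, NULL, backoff_worker, &ba);
    assert(rc == 0);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return rounds_done;
}
```

With plain `pthread_mutex_lock` on both locks, the opposite orders could deadlock. With the back-off, a thread never waits while holding a lock, at the cost of possible repeated retries (livelock) under heavy contention.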
3) Circular Wait
• Definition: “There exists a circular chain of threads such
that each thread holds a resource (e.g., lock) being
requested by next thread in the chain.”
Simple Example
Opposite acquisition order (can deadlock):
Thread 1        Thread 2
lock(&A);       lock(&B);
lock(&B);       lock(&A);
Same acquisition order (no circular wait):
Thread 1        Thread 2
lock(&A);       lock(&A);
lock(&B);       lock(&B);
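One common way to enforce a single acquisition order, shown here as a sketch rather than as the method used in the slides, is to always lock the mutex with the lower address first. The helpers `lock_pair` and `unlock_pair` and the driver `run_ordered_pairs` are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define ROUNDS 10000

static pthread_mutex_t A = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t B = PTHREAD_MUTEX_INITIALIZER;
static long pair_rounds = 0;   /* updated only while both locks are held */

struct mutex_pair {
    pthread_mutex_t *m1;
    pthread_mutex_t *m2;
};

/* Lock two distinct mutexes in a global order (lower address first),
 * regardless of the order in which the caller names them. */
static void lock_pair(pthread_mutex_t *m1, pthread_mutex_t *m2) {
    assert(m1 != m2);
    if ((uintptr_t)m1 < (uintptr_t)m2) {
        pthread_mutex_lock(m1);
        pthread_mutex_lock(m2);
    } else {
        pthread_mutex_lock(m2);
        pthread_mutex_lock(m1);
    }
}

static void unlock_pair(pthread_mutex_t *m1, pthread_mutex_t *m2) {
    pthread_mutex_unlock(m1);
    pthread_mutex_unlock(m2);
}

static void *pair_worker(void *arg) {
    struct mutex_pair *p = arg;
    for (int i = 0; i < ROUNDS; i++) {
        lock_pair(p->m1, p->m2);
        pair_rounds++;
        unlock_pair(p->m1, p->m2);
    }
    return NULL;
}

/* Thread 1 names (A, B), thread 2 names (B, A); both lock in the same order. */
long run_ordered_pairs(void) {
    struct mutex_pair ab = { &A, &B };
    struct mutex_pair ba = { &B, &A };
    pthread_t t1, t2;
    int rc;

    pair_rounds = 0;
    rc = pthread_create(&t1, NULL, pair_worker, &ab);
    assert(rc == 0);
    rc = pthread_create(&t2, NULL, pair_worker, &ba);
    assert(rc == 0);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return pair_rounds;
}
```

Because every thread acquires the two locks in the same global order, no circular chain of waiting threads can form, even though the callers pass the locks in opposite orders.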
Other Complications
• Sometimes can’t know all virtual addresses in
advance
[Figure labels: Coarse-Grained Locking; Complexity]
Locking in Kernel
• Everything discussed so far about locking applies equally
to kernel and user code
• Spinlocks
• Blocking locks
• Granularity
• Deadlock
• Etc.
Solution
• How can we solve this problem?