
Fall 2017 :: CSE 306

Concurrency Bugs
Nima Honarmand
(Based on slides by Prof. Andrea Arpaci-Dusseau)

Concurrency Bugs are Serious

The Therac-25 incident (1980s)

“The accidents occurred when the high-power electron beam
was activated instead of the intended low power beam, and
without the beam spreader plate rotated into place. Previous
models had hardware interlocks in place to prevent this, but
Therac-25 had removed them, depending instead on software
interlocks for safety. The software interlock could fail due to a
race condition.”

“…in three cases, the injured patients later died.”

Source: en.wikipedia.org/wiki/Therac-25

Concurrency Bugs are Serious (2)

Northeast blackout of 2003

“The Northeast blackout of 2003 was a widespread power outage
that occurred throughout parts of the Northeastern and Midwestern
United States and the Canadian province of Ontario on Thursday,
August 14, 2003, just after 4:10 p.m. EDT.”

“The blackout's primary cause was a bug in the alarm system... The
lack of an alarm left operators unaware of the need to re-distribute
power after overloaded transmission lines hit unpruned foliage,
triggering a "race condition" in the energy management system…
What would have been a manageable local blackout cascaded into
massive widespread distress on the electric grid.”

Source: en.wikipedia.org/wiki/Northeast_blackout_of_2003

Concurrency Study from 2008

For four major projects, search for concurrency bugs
among more than 500K bug reports. Analyze a small sample to
identify common types of concurrency bugs.
Source: Lu et al., “Learning from Mistakes: A Comprehensive Study
on Real World Concurrency Bug Characteristics”

Atomicity Violation Bugs

“The desired serializability among multiple memory
accesses is violated (i.e. a code region is intended to be
atomic, but the atomicity is not enforced during
execution)”
MySQL Example
Thread 1:
    if (thd->proc_info) {
        …
        fputs(thd->proc_info, …);
        …
    }

Thread 2:
    thd->proc_info = NULL;

• What’s wrong?
• How to fix?
• Use a lock
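
One way the lock fix could look, as a minimal sketch: the check and the use of proc_info sit in a single critical section, and the writer takes the same lock. The thd_t struct, the proc_lock field, and both function names are hypothetical stand-ins for illustration, not MySQL's actual code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical stand-in for the thread descriptor in the example. */
typedef struct {
    pthread_mutex_t proc_lock;   /* assumed lock protecting proc_info */
    const char *proc_info;
} thd_t;

/* Thread 1: check and use proc_info inside one critical section.
 * Returns 1 if something was written, 0 if proc_info was NULL. */
int log_proc_info(thd_t *thd, FILE *out)
{
    int written = 0;
    assert(thd != NULL && out != NULL);
    pthread_mutex_lock(&thd->proc_lock);
    if (thd->proc_info) {
        /* proc_info cannot become NULL here: the writer needs proc_lock. */
        fputs(thd->proc_info, out);
        written = 1;
    }
    pthread_mutex_unlock(&thd->proc_lock);
    return written;
}

/* Thread 2: clear proc_info under the same lock. */
void clear_proc_info(thd_t *thd)
{
    assert(thd != NULL);
    pthread_mutex_lock(&thd->proc_lock);
    thd->proc_info = NULL;
    pthread_mutex_unlock(&thd->proc_lock);
}
```

The lock only helps if every access to proc_info goes through it; a single unprotected write elsewhere would reintroduce the bug.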

Ordering Violation Bugs

“The desired order between two (groups of) memory
accesses is flipped (i.e., A should always be executed
before B , but the order is not enforced during
execution)”
Mozilla Example
Thread 1:
    void init() {
        …
        mThread = PR_CreateThread(mMain, …);
        …
    }

Thread 2:
    void mMain(…) {
        …
        mState = mThread->State;
        …
    }

• What’s wrong?

• How to fix?
• Use a condition variable

Ordering Violation Bugs (2)

Thread 1:
    void init() {
        …
        mThread = PR_CreateThread(mMain, …);
        mutex_lock(&mtLock);
        mtInit = 1;
        cond_signal(&mtCond);
        mutex_unlock(&mtLock);
        …
    }

Thread 2:
    void mMain(…) {
        …
        mutex_lock(&mtLock);
        while (mtInit == 0)
            cond_wait(&mtCond, &mtLock);
        mutex_unlock(&mtLock);
        mState = mThread->State;
        …
    }

• Why are we using a new flag (mtInit) instead of mThread itself?
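
The same pattern can be written against POSIX threads. This is a sketch under assumptions: pthread_create stands in for PR_CreateThread, and mSawSelf is a hypothetical variable added only so that the child's read of mThread can be observed.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t mtLock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  mtCond = PTHREAD_COND_INITIALIZER;
static int mtInit = 0;       /* becomes 1 once mThread is valid */
static pthread_t mThread;    /* written by init(), read by mMain() */
static int mSawSelf = 0;     /* hypothetical: records what the child observed */

static void *mMain(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&mtLock);
    while (mtInit == 0)      /* loop also guards against spurious wakeups */
        pthread_cond_wait(&mtCond, &mtLock);
    /* Safe to read: init() stored mThread before setting mtInit. */
    mSawSelf = pthread_equal(mThread, pthread_self()) != 0;
    pthread_mutex_unlock(&mtLock);
    return NULL;
}

/* Creates the child, then announces that mThread is ready.
 * Returns 0 on success, or the pthread_create error code. */
int init(void)
{
    int rc;
    assert(mtInit == 0);     /* this sketch supports a single call */
    rc = pthread_create(&mThread, NULL, mMain, NULL);
    if (rc != 0)
        return rc;
    pthread_mutex_lock(&mtLock);
    mtInit = 1;
    pthread_cond_signal(&mtCond);
    pthread_mutex_unlock(&mtLock);
    return 0;
}
```

If the child runs before init() sets the flag, it sleeps on mtCond; if it runs afterwards, the while test fails immediately and it never waits.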

Fixing Concurrency Bugs: Easy?

• If all we had to do was add locks and condition variables,
concurrent programming would be quite simple

• Problems?

1) Adding too many locks increases the danger of deadlocks

2) How about having just a few big locks then?

• Causes performance problems because it reduces
concurrency

Locking Granularity
• Coarse-grain locking
• Have one (or a few) locks that protect all (or big chunks) of shared
state
• Example: early Linux’s BKL (Big Kernel Lock)
• One big lock protecting all kernel data
• Only one processor could execute kernel code at any point in time; others
would have to wait
• Significant contention over big locks → hurts performance

• Fine-grain locking
• Have many small locks, each protecting one (or a few) objects
• Reduces contention → better performance
• Increases deadlock risk

Deadlock Bugs
• Deadlock: No progress can be made because two or
more threads are each waiting for another to take
some action, and thus none ever does

• Could arise when we need to coordinate access to
more than one shared resource
• Means we need to grab and hold multiple locks
simultaneously

Deadlock Theory
• Deadlocks can only occur when all
four conditions are true:
1) Mutual exclusion
2) Hold-and-wait
3) Circular wait
4) No preemption

[Figure: four STOP signs with labels A, B, C, D]

• Eliminate deadlock by eliminating
any one condition


1) Mutual Exclusion
• Definition: “Threads claim exclusive control of
resources that they require (e.g., thread grabs a lock)”

• Strategy: eliminate locks

• Try to use atomic instructions instead

Concurrent Counter Example

Code with locks:
    void add(int *val, int amt)
    {
        mutex_lock(&m);
        *val += amt;
        mutex_unlock(&m);
    }

Code with Compare-and-Swap (CAS):
    void add(int *val, int amt)
    {
        int old;    /* declared outside the loop so the while test can see it */
        do {
            old = *val;
        } while (!CAS(val, old, old + amt));
    }

Example: Lock-Free Linked List Insert

Code with locks:
    void insert(int val)
    {
        node_t *n = malloc(sizeof(*n));
        n->val = val;
        mutex_lock(&m);
        n->next = head;
        head = n;
        mutex_unlock(&m);
    }

Code with Compare-and-Swap (CAS):
    void insert(int val)
    {
        node_t *n = malloc(sizeof(*n));
        n->val = val;
        do {
            n->next = head;
        } while (!CAS(&head, n->next, n));
    }
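
A sketch of the same lock-free insert with C11 atomics. The node_t layout and the int return value are assumptions made for illustration; the slide does not define them.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Assumed node layout; the slides do not show it. */
typedef struct node {
    int val;
    struct node *next;
} node_t;

static _Atomic(node_t *) head = NULL;

/* Pushes val at the front of the list.
 * Returns 0 on success, -1 if malloc fails. */
int insert(int val)
{
    node_t *n = malloc(sizeof(*n));
    if (n == NULL)
        return -1;
    n->val = val;
    n->next = atomic_load(&head);
    /* On failure, n->next is refreshed with the current head,
     * so the retry links the node in front of the newest element. */
    while (!atomic_compare_exchange_weak(&head, &n->next, n))
        ;
    return 0;
}
```

This covers insertion only. Supporting concurrent removal as well is considerably harder, because a node can be freed and reused between the load and the CAS (the ABA problem).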

2) Hold-and-Wait
• Definition: “Threads hold resources allocated to them
(e.g., locks they have already acquired) while waiting
for additional resources (e.g., locks they wish to
acquire).”
• Strategy: release currently held resources when waiting
for new ones
Example with trylock
top:
    pthread_mutex_lock(A);
    if (pthread_mutex_trylock(B) != 0)
    {
        pthread_mutex_unlock(A);
        goto top;
    }
    …

Problem w/ This Strategy

• Potential for Livelock: no process makes forward
progress, but the state of involved processes
constantly changes
• Can happen if all processes release resources and
then try to re-acquire, fail, and keep doing this
• Classic solution: back-off techniques
• Random back-off: wait for a random amount of time
before retrying
• Exponential back-off: wait for an exponentially increasing
amount of time before retrying
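
The two back-off ideas can be combined with the trylock loop. A minimal sketch, assuming a hypothetical helper lock_both, an arbitrary ceiling of 1024 microseconds, and rand() as the randomness source for brevity:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <time.h>

/* Acquire both locks without hold-and-wait; on failure, release a,
 * sleep for a random time, and double the back-off ceiling. */
void lock_both(pthread_mutex_t *a, pthread_mutex_t *b)
{
    unsigned max_us = 1;    /* current back-off ceiling, in microseconds */
    assert(a != NULL && b != NULL && a != b);
    for (;;) {
        pthread_mutex_lock(a);
        if (pthread_mutex_trylock(b) == 0)
            return;                     /* holding both a and b */
        pthread_mutex_unlock(a);        /* never wait while holding a */

        /* Random back-off below the ceiling. rand() is not guaranteed
         * to be thread-safe; a per-thread generator would be a better
         * choice in real code. */
        struct timespec ts = {
            .tv_sec = 0,
            .tv_nsec = (long)(rand() % max_us + 1) * 1000L
        };
        nanosleep(&ts, NULL);
        if (max_us < 1024)              /* exponential growth, capped */
            max_us *= 2;
    }
}
```

The randomness is what breaks the symmetry: two threads that collide once are unlikely to pick the same delay and collide again.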

3) Circular Wait
• Definition: “There exists a circular chain of threads such
that each thread holds a resource (e.g., lock) being
requested by next thread in the chain.”

• Usually the easiest deadlock requirement to attack

• Strategy: impose a well-documented order of acquiring locks
• Decide which locks should be acquired before others
• If A before B, never acquire A if B is already held!
• Document this, and write code accordingly

• Works well if system has distinct layers


Simple Example
Thread 1        Thread 2
lock(&A);       lock(&B);
lock(&B);       lock(&A);

How would you fix this code?

Thread 1        Thread 2
lock(&A);       lock(&A);
lock(&B);       lock(&B);

Example: mm/filemap.c lock ordering

/*
* Lock ordering:
* ->i_mmap_lock (vmtruncate)
* ->private_lock (__free_pte->__set_page_dirty_buffers)
* ->swap_lock (exclusive_swap_page, others)
* ->mapping->tree_lock
* ->i_mutex
* ->i_mmap_lock (truncate->unmap_mapping_range)
* ->mmap_sem
* ->i_mmap_lock
* ->page_table_lock or pte_lock (various, mainly in memory.c)
* ->mapping->tree_lock (arch-dependent flush_dcache_mmap_lock)
* ->mmap_sem
* ->lock_page (access_process_vm)
* ->mmap_sem
* ->i_mutex (msync)
* ->i_mutex
* ->i_alloc_sem (various)
* ->inode_lock
* ->sb_lock (fs/fs-writeback.c)
* ->mapping->tree_lock (__sync_single_inode)
* ->i_mmap_lock
* ->anon_vma.lock (vma_adjust)
* ->anon_vma.lock
* ->page_table_lock or pte_lock (anon_vma_prepare and various)
* ->page_table_lock or pte_lock
* ->swap_lock (try_to_unmap_one)
* ->private_lock (try_to_unmap_one)
* ->tree_lock (try_to_unmap_one)
* ->zone.lru_lock (follow_page->mark_page_accessed)
. . .


Encapsulation Makes Ordering Difficult

• Encapsulation, and emphasis on code modularity, make
things difficult
• Can’t control the order in which locks are acquired when
calling a function in another module

• What could go wrong in this code?

set_t *intersect(set_t *s1, set_t *s2)
{
    set_t *rv = malloc(sizeof(*rv));
    mutex_lock(&s1->lock);
    mutex_lock(&s2->lock);
    for (int i = 0; i < s1->len; i++) {
        if (set_contains(s2, s1->items[i]))
            set_add(rv, s1->items[i]);
    }
    mutex_unlock(&s2->lock);
    mutex_unlock(&s1->lock);
    return rv;
}

Deadlock possible if one thread calls intersect(s1, s2)
and another thread calls intersect(s2, s1)

One Possible Solution

• Acquire the locks in the order of their virtual
addresses when possible
set_t *intersect(set_t *s1, set_t *s2) {
    set_t *rv = malloc(sizeof(*rv));
    /* uintptr_t is wide enough to hold a pointer; uint may not be */
    if ((uintptr_t)&s1->lock < (uintptr_t)&s2->lock) {
        mutex_lock(&s1->lock);
        mutex_lock(&s2->lock);
    } else {
        mutex_lock(&s2->lock);
        mutex_lock(&s1->lock);
    }
    for (int i = 0; i < s1->len; i++) {
        if (set_contains(s2, s1->items[i]))
            set_add(rv, s1->items[i]);
    }
    mutex_unlock(&s2->lock);
    mutex_unlock(&s1->lock);
    return rv;
}

You may also want to change the order of unlock()s to be
the reverse of lock()s.
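
The address comparison can be factored into a helper so that callers cannot get the order wrong. A sketch assuming pthread mutexes; lock_pair and unlock_pair are hypothetical names, and the equal-pointer case (for example intersect(s, s)) is handled by locking once.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Lock two mutexes in ascending address order.
 * If both pointers name the same mutex, lock it only once. */
void lock_pair(pthread_mutex_t *m1, pthread_mutex_t *m2)
{
    assert(m1 != NULL && m2 != NULL);
    if (m1 == m2) {
        pthread_mutex_lock(m1);
    } else if ((uintptr_t)m1 < (uintptr_t)m2) {
        pthread_mutex_lock(m1);
        pthread_mutex_lock(m2);
    } else {
        pthread_mutex_lock(m2);
        pthread_mutex_lock(m1);
    }
}

/* Release what lock_pair acquired. Unlock order does not affect
 * deadlock freedom, so no address comparison is needed here. */
void unlock_pair(pthread_mutex_t *m1, pthread_mutex_t *m2)
{
    assert(m1 != NULL && m2 != NULL);
    pthread_mutex_unlock(m1);
    if (m1 != m2)
        pthread_mutex_unlock(m2);
}
```

With this helper, intersect(s1, s2) and intersect(s2, s1) both end up taking the lower-addressed lock first, so the circular wait cannot form.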

Other Complications
• Sometimes can’t know all virtual addresses in
advance

• Example: when traversing a linked list where each
object has a separate lock

Linux Example: fs/dcache.c

void d_prune_aliases(struct inode *inode) {
    struct dentry *dentry;
    struct hlist_node *p;
restart:
    /* Make sure inode lock is acquired before dentry locks */
    spin_lock(&inode->i_lock);
    hlist_for_each_entry(dentry, p, &inode->i_dentry, d_alias) {
        spin_lock(&dentry->d_lock);
        if (!dentry->d_count) {
            __dget_dlock(dentry);
            __d_drop(dentry);
            spin_unlock(&dentry->d_lock);
            spin_unlock(&inode->i_lock);
            dput(dentry);
            /* When a list element is removed, have to restart from
             * beginning because order of items has changed. */
            goto restart;
        }
        spin_unlock(&dentry->d_lock);
    }
    spin_unlock(&inode->i_lock);
}

4) Deadlock Detection and Recovery

• Database systems use many, many locks
• Very difficult to always avoid deadlocks in general in
such a system

• Last-resort strategy: detect deadlocks, and recover

• Detection usually involves looking out for locks that are
held for too long
• Recovery usually requires a restart of the database app

• An example of breaking the “No preemption” condition
• By restarting, we are forcibly releasing the resource

Summary: Current Reality

[Figure: plot of Performance against Complexity, showing Coarse-Grained Locking and Fine-Grained Locking]

Unsavory trade-off between synchronization
complexity and performance

Locking in Kernel
• All locking stuff we discussed so far applies equally
to kernel and user code
• Spinlocks
• Blocking locks
• Granularity
• Deadlock
• Etc.

• However, there is one form of concurrency that’s
(almost) only found in kernel, remember?
• Yes, interrupts!

Locks and Interrupts

• Suppose you are in the disk driver (say, serving a read()
syscall) and holding a disk-related lock

• Say, a disk interrupt happens, and you need to grab the
same lock in the interrupt service routine (ISR)

• What would happen?

• Yes, deadlock
• Can’t finish the ISR without grabbing the lock
• Can’t return to driver code (to release the lock) without finishing ISR

• Can you identify the multiple resources that are involved in
the deadlock?
1) Lock
2) CPU

Solution
• How can we solve this problem?

• Two-part solution:

1) Only use spinlocks in ISRs: never call, directly or
indirectly, a routine that would use a blocking lock
2) When acquiring a spinlock in kernel, disable interrupts
on the current processor

• Why just on this processor? Is it okay to get an
interrupt on other processors?

• This is why xv6 kernel spinlocks disable interrupts
