Foundations of The C++ Concurrency Memory Model: John Mellor-Crummey and Karthik Murthy
• Problem
— C and C++ were specified as single-threaded languages, with threads supplied by libraries
– the language definitions themselves are unaware of threads
— compilers are thread-unaware
– optimize programs as if they ran on a single thread
– problem: may perform optimizations that are valid for single-threaded
programs but violate the intended meaning of multithreaded programs
— prior informal specifications don’t precisely define
– data races
– semantics of a program without data races
A Familiar Example
Register Promotion
Trylock and Ordering
Why Undefined Data Race Semantics?
Possible Memory Models
• Sequential consistency
— intuitive, but restricts optimizations
• Relaxed memory models
— allow hardware optimizations
— specified at a low level: makes it hard for programmers to
reason about correctness
— can limit compiler optimizations
– e.g., at least one relaxed model disallows global analysis or RRE
Why Data Race Free Models?
Definitions - I
• Memory location
— each scalar value occupies a separate memory location
– except bitfields inside the same innermost struct or class
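The definition above can be illustrated with a minimal C++ sketch. The `Record` type and the `update_separate_members` function are hypothetical names introduced only for this example. It assumes a C++11 (or later) compiler and shows that two threads may write distinct scalar members without a data race, while bit-fields in the same struct share one memory location.

```cpp
#include <cassert>
#include <thread>

// Hypothetical record type used only for illustration.
struct Record {
    char a;           // a scalar: its own memory location
    char b;           // a different memory location from a
    unsigned f1 : 4;  // f1 and f2 are bit-fields in the same struct,
    unsigned f2 : 4;  // so they share a single memory location
};

// Two threads write distinct scalar members concurrently. Because a and b
// are separate memory locations, the writes do not conflict and there is
// no data race. Writing f1 and f2 from different threads without
// synchronization would be a data race, since they share a location.
Record update_separate_members() {
    Record r{};
    std::thread t1([&r] { r.a = 'x'; });
    std::thread t2([&r] { r.b = 'y'; });
    t1.join();
    t2.join();
    return r;
}
```

Calling `update_separate_members()` should always return a record with `a == 'x'` and `b == 'y'`, because a compiler following this definition may not implement the write to `a` as a read-modify-write that also rewrites `b`.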
Definitions - II
• Thread Execution
— set of memory actions
— partial order corresponding to the sequenced-before ordering
– sequenced-before applies to memory operations by the same thread
Definitions - III
C++ Memory Model
Legal Reorderings
Legal Lock Optimizations
C++ Memory Model Solution for Trylock
Sequential Consistency vs. Write Atomicity
• The independent-reads-of-independent-writes (IRIW) example is not
guaranteed to be sequentially consistent if writes don’t execute atomically
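A minimal sketch of the independent-reads-of-independent-writes pattern follows, assuming C++11 `std::atomic` operations with their default sequentially consistent ordering. The function name `iriw_readers_disagree` and the variable names are hypothetical. Two writers store to different variables and two readers read them in opposite orders. Under sequential consistency the readers cannot disagree about the order of the two writes.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Runs one IRIW trial. Returns true if the two readers observed the two
// independent writes in opposite orders, which sequential consistency
// forbids.
bool iriw_readers_disagree() {
    std::atomic<int> x{0}, y{0};
    int r1 = 0, r2 = 0, r3 = 0, r4 = 0;

    std::thread w1([&] { x.store(1); });  // default: memory_order_seq_cst
    std::thread w2([&] { y.store(1); });
    std::thread rd1([&] { r1 = x.load(); r2 = y.load(); });
    std::thread rd2([&] { r3 = y.load(); r4 = x.load(); });

    w1.join();
    w2.join();
    rd1.join();
    rd2.join();

    // Reader 1 saw x written but not y; reader 2 saw y written but not x.
    return r1 == 1 && r2 == 0 && r3 == 1 && r4 == 0;
}
```

With the default ordering this function should never return true. On hardware where writes are not atomic (that is, where a write can become visible to different processors at different times), the implementation has to insert enough fencing to rule that outcome out, which is the cost this slide is concerned with.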
C++ Atomics
Some Problems
Implications for Current Processors
Problematic Examples
Case for Low-Level Atomics
Low-Level Atomics
• Why?
— enable expert programmers to maximize performance
• What?
— can explicitly parameterize an operation on an atomic
variable with its memory ordering constraints
– e.g., x.load(memory_order_relaxed)
– allows the load to be reordered with other memory operations
– a relaxed load is never an acquire operation, hence does not
contribute to the synchronizes-with ordering
— for read-modify-write operations, programmer can specify
whether an operation acts as an acquire, release, neither, or
both
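The ordering parameters described above can be sketched as follows, assuming the C++11 `<atomic>` interface. `SpinLock`, `Totals`, and `count_with_low_level_atomics` are hypothetical names for illustration, not part of any library. The sketch uses a relaxed read-modify-write for a counter that needs atomicity but no ordering, and an acquire read-modify-write paired with a release store to build a simple lock.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical spinlock: the exchange in lock() is a read-modify-write
// that acts as an acquire; the store in unlock() acts as a release.
class SpinLock {
    std::atomic<bool> locked{false};
public:
    void lock() {
        while (locked.exchange(true, std::memory_order_acquire)) {
            std::this_thread::yield();  // back off while another thread holds it
        }
    }
    void unlock() { locked.store(false, std::memory_order_release); }
};

struct Totals {
    long relaxed_hits;  // counted with relaxed atomic increments
    long locked_hits;   // plain variable counted under the spinlock
};

Totals count_with_low_level_atomics(int threads, int iters) {
    std::atomic<long> relaxed_hits{0};
    long locked_hits = 0;  // non-atomic, protected by the lock
    SpinLock lock;

    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iters; ++i) {
                // Neither acquire nor release: atomic, but imposes no
                // ordering on surrounding memory operations.
                relaxed_hits.fetch_add(1, std::memory_order_relaxed);

                lock.lock();
                ++locked_hits;
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();

    return {relaxed_hits.load(std::memory_order_relaxed), locked_hits};
}
```

For example, `count_with_low_level_atomics(4, 10000)` should report 40000 for both totals. The relaxed counter is correct because each increment is atomic, and the plain counter is correct because the release in `unlock()` synchronizes with the acquire in the next successful `lock()`.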
Additional Definitions
• Happens-before (HB)
— if a is sequenced before b, then a happens before b
— if a synchronizes with b, then a happens before b
— if a happens before b and b happens before c, then a
happens before c
• Type 2 data race
— two conflicting data accesses to the same memory location
that are unordered by happens-before
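The happens-before rules above can be traced through a small message-passing sketch, assuming C++11 atomics. The function name `publish_and_consume` and the labels (a) through (d) are introduced here for illustration only. A plain write is sequenced before a release store, the release store synchronizes with the acquire load that reads its value, and that load is sequenced before a plain read. By transitivity the write happens before the read, so the two accesses to `payload` do not form a data race.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int publish_and_consume() {
    int payload = 0;                 // ordinary, non-atomic data
    std::atomic<bool> ready{false};  // synchronization variable
    int seen = -1;

    std::thread producer([&] {
        payload = 42;                                  // (a)
        ready.store(true, std::memory_order_release);  // (b): a sequenced before b
    });

    std::thread consumer([&] {
        // (c): the load that reads true synchronizes with store (b)
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;                                // (d): c sequenced before d
    });

    producer.join();
    consumer.join();
    return seen;  // (a) happens before (d), so this is 42
}
```

If both operations on `ready` used `memory_order_relaxed` instead, (b) would not synchronize with (c), the accesses to `payload` at (a) and (d) would be unordered by happens-before, and the program would contain a data race.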
C++ Memory Model for Low Level Atomics
Java/C++ Comparison
Summary