JMM
Java Memory Model
Łukasz Koniecki
24/10/2016
About me
Java
Universe
Spring
MyFaces
JSF
Play
Spark
GWT
Vaadin
Tapestry
Wicket
Spring MVC
Struts
Grails
REST API
JPA
GC
JVM
JAVA EE
Tomcat
Spark
Goal
• Get familiar with the JMM,
• How does a processor work?
• Recall how the Java compiler and the JVM work,
• JIT in action,
• Explain what a data race and a correctly synchronized program are,
• Talk about synchronization and atomicity,
• Based on examples...
• Next-gen JMM...
§17.4 Memory Model
John von Neumann
Wikipedia: http://bit.ly/2cMU0GB
Von Neumann Architecture
Dummy program
public class Example {
    int i, j;

    public void myDummyMethod() {
        i += 1;
        j += 1;
        i += 1;
        ...
    }
}
Program execution (RAM and cache connected by the system bus)
The Java Memory Model for Practitioners: http://bit.ly/2cMXklJ
• Step 1: RAM i = 0, j = 0; cache empty,
• Step 2: RAM i = 0, j = 0; cache i = 0 (i loaded into the cache),
• Step 3: RAM i = 0, j = 0; cache i = 1 (i incremented in the cache),
• Step 4: RAM i = 1, j = 0; cache empty (i written back to RAM),
• Step 5: RAM i = 1, j = 0; cache j = 0 (j loaded into the cache),
• Step 6: RAM i = 1, j = 0; cache j = 1 (j incremented in the cache),
• Step 7: RAM i = 1, j = 1; cache j = 1 (j written back to RAM),
• Step 8: RAM i = 1, j = 1; cache empty.
Sequentially consistent execution
PC World: http://bit.ly/2cE9f7q
Haswell-E processor
Our world in data: http://bit.ly/1NLxNcH
Moore’s Law
2006
Processor technology
• ...
• 22 nm – 2012
• 14 nm – 2014
• 10 nm – 2017
• 7 nm – ~2019
• 5 nm – ~2021
Wikipedia: http://bit.ly/2cMWoNg
Processor vs. Memory Performance
How L1 and L2 CPU caches work, and why they’re an essential part of modern chips: http://bit.ly/2cpHu1x
Wikipedia: http://bit.ly/2cm33me
Cache hierarchy in a modern processor
Important latency numbers
Cache latency: Core i7 / Xeon 5500 Series, data source latency (approximate)
Data source                                        Cycles     Time
local L1 cache hit                                 ~4         2.1 - 1.2 ns
local L2 cache hit                                 ~10        5.3 - 3.0 ns
local L3 cache hit, line unshared                  ~40        21.4 - 12.0 ns
local L3 cache hit, shared line in another core    ~65        34.8 - 19.5 ns
local L3 cache hit, modified in another core       ~75        40.2 - 22.5 ns
remote L3 cache (Ref: Fig.1 [Pg. 5])               ~100-300   160.7 - 30.0 ns
local DRAM                                                    ~60 ns
remote DRAM                                                   ~100 ns
Performance Analysis Guide for Intel® Core™ i7 Processor and Intel® Xeon™ 5500 processors: http://intel.ly/2cV1ZFZ
Weak vs. Strong hardware Memory Models
Weak vs. Strong Memory Models: http://bit.ly/2cC4avk
x86/x64 processor memory model
R-R R-W
W-R W-W
Intel® 64 and IA-32 Architectures Software Developer’s Manual: http://intel.ly/2csMyB2
Processor P can read B before its write to A is seen by all processors
(a processor can move its own reads in front of its own writes)
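The W-R case above is often probed with a "store buffering" litmus test. Below is a minimal sketch in plain Java, assuming hypothetical names (StoreBufferLitmus, run) that are not part of the slides. Each thread writes one variable and then reads the other; the outcome (r1, r2) = (0, 0) is the one that needs a read to overtake an earlier write. Starting fresh threads per iteration gives very little overlap, so that outcome may never show up with this coarse harness; a tool such as jcstress is better suited for real measurements.

```java
// Sketch only: class, field and method names are hypothetical.
final class StoreBufferLitmus {
    static int a, b;    // shared plain fields, deliberately racy
    static int r1, r2;  // values observed by thread 1 and thread 2

    // Runs the litmus test and returns outcome counts indexed by r1 * 2 + r2.
    static int[] run(int iterations) {
        int[] counts = new int[4];
        for (int n = 0; n < iterations; n++) {
            a = 0;
            b = 0;
            Thread t1 = new Thread(() -> { a = 1; r1 = b; }); // write a, then read b
            Thread t2 = new Thread(() -> { b = 1; r2 = a; }); // write b, then read a
            t1.start();
            t2.start();
            try {
                t1.join(); // join() makes r1 and r2 safely readable here
                t2.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            counts[r1 * 2 + r2]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] c = run(2_000);
        System.out.println("(r1, r2) = (0, 0): " + c[0]); // needs W-R reordering
        System.out.println("(r1, r2) = (0, 1): " + c[1]);
        System.out.println("(r1, r2) = (1, 0): " + c[2]);
        System.out.println("(r1, r2) = (1, 1): " + c[3]);
    }
}
```

The counts depend on hardware, JIT and timing, so no particular distribution should be expected.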
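How does the Java compiler work?
[Diagram: javac turns source code into bytecode; inside the JVM, the class loader and the bytecode verifier process the bytecode, and the JIT compiles it into native code that runs on the OS]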
JIT
• Profile-guided,
• Speculatively optimizing,
• Backup strategies,
• Optimizes code for us,
• We don’t have to care so much about cache-wise operations
Tiered compilation
[Chart: throughput over time. Phases: startup (interpreted), C1 (sampling), C2 (full speed); on deoptimize, bail to interpreter]
Interpreter
• extremely slow,
• not profiling
C1
• client,
• fast but dumb,
• does the profiling,
• e.g.: branches, type checks,
C2
• server,
• slow but clever,
• aggressively optimizing,
• based on profile,
• e.g.: loop optimizations (unswitching, unrolling), implicit null checking
Why do we need a JMM?
• Different platform memory models (none of them matches the JMM!),
• Many JVM implementations,
• People don’t know how to program concurrently,
• Programmers: write reliable multithreaded code,
• Compiler writers: implement optimizations that are legal according to the JLS,
• Compiler: produce fast and optimal native code,
JMM
• Actions: reads and writes of variables, locks and unlocks of monitors, starting and joining threads,
• Happens-before partial order,
• For a thread executing action B to be guaranteed to see the results of action A (in any thread), there must be a happens-before relationship between A and B,
• Otherwise the JVM is free to reorder them,
Happens-before orderings
• Unlock of a monitor / lock of that monitor,
• Write to a volatile variable / read of that variable,
• Call to start() / any action in the started thread,
• All actions in a thread / any other thread successfully returns from
join() on that thread,
• Setting default values for variables, setting value to a final field in the
constructor / constructor finish,
• Write to an Atomic variable / read from that variable,
• Many java.util.concurrent methods,
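Two of the orderings listed above, start() and join(), can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical class HappensBeforeDemo that is not part of the slides. Both fields are plain (non-volatile), yet the result is guaranteed because start() and join() create the happens-before edges.

```java
// Sketch only: HappensBeforeDemo is a hypothetical name.
final class HappensBeforeDemo {
    static int data;      // plain field, written before start()
    static int observed;  // plain field, written by the worker thread

    static int run() {
        data = 42;                                             // write in the parent thread
        Thread worker = new Thread(() -> observed = data + 1); // reads data inside the worker
        worker.start();        // start() happens-before every action in the worker
        try {
            worker.join();     // every action in the worker happens-before join() returns
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return observed;       // guaranteed to be 43
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 43
    }
}
```

Without join(), the read of observed would race with the worker's write and could return 0.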
JMM
• A promise for programmers: sequential consistency must be sacrificed to allow
optimizations, but it will still hold for data race free programs. This is the data
race free (DRF) guarantee.
• A promise for security: even for programs with data races, values should not
appear “out of thin air”, preventing unintended information leakage.
• A promise for compilers: common hardware and software optimizations should
be allowed as far as possible without violating the first two requirements.
Java Memory Model Examples: Good, Bad and Ugly: http://bit.ly/2cZfF1I
Example
@NotThreadSafe
class DataRace {
    int a, b;
    int x, y;

    void thread1() {
        y = a;
        b = 1;
    }

    void thread2() {
        x = b;
        a = 2;
    }
}
y == 2, x == 1 ???
How can this happen?
• The processor can reorder instructions (out-of-order execution, HT),
• Lazy synchronization between caches and main memory,
• The compiler can reorder statements (or keep values in registers),
• Aggressive optimizations in the JIT,
Example: how reordering leads to y == 2, x == 1
• Program order: Thread 1: y = a; b = 1; Thread 2: x = b; a = 2;
• Thread 1 reordered: Thread 1: b = 1; y = a; Thread 2: x = b; a = 2;
• Thread 2 reordered as well: Thread 1: b = 1; y = a; Thread 2: a = 2; x = b;
• One interleaving over time: b = 1; (Thread 1) a = 2; (Thread 2) x = b; (Thread 2) y = a; (Thread 1)
y == 2, x == 1
Example of x86/x64 test results
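Test using jcstress
// Package names as in the jcstress version used at the time (an assumption);
// newer releases appear to rename IntResult2 to II_Result.
import org.openjdk.jcstress.annotations.*;
import org.openjdk.jcstress.infra.results.IntResult2;
import static org.openjdk.jcstress.annotations.Expect.ACCEPTABLE;

@JCStressTest
@Description("Data race")
@Outcome(id = {"0, 0", "0, 1", "2, 0"}, expect = ACCEPTABLE,
         desc = "Trivial under sequential consistency")
@Outcome(id = {"2, 1"}, expect = ACCEPTABLE, desc = "Racy read of x")
@State
public class DataRace {
    int a, b;
    int x, y;

    @Actor
    void thread1(IntResult2 r) {
        y = a;
        b = 1;
        r.r1 = y; // first number in the outcome id: y
    }

    @Actor
    void thread2(IntResult2 r) {
        x = b;
        a = 2;
        r.r2 = x; // second number in the outcome id: x
    }
}
jcstress: http://bit.ly/2daSL5Q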
Example of x86/x64 test results
R-R R-W
W-R W-W
Test results interpretation
y==0, x==0
y==0, x==1
y==2, x==0
time
.
.
.
y = a;
b = 1;
.
.
.
x = b;
a = 2;
Visibility between threads
@ThreadSafe
public class DataRace {
    int a, b;
    int x, y;

    void thread1() {
        synchronized (this) {
            y = a;
            b = 1;
        }
    }

    void thread2() {
        synchronized (this) {
            x = b;
            a = 2;
        }
    }
}
Visibility between threads
time: Thread 1 runs first, then Thread 2 (Th2 starts after Th1)
Thread 1 (program order): <enter this> y = a; b = 1; <exit this>
Thread 2 (program order): <enter this> x = b; a = 2; <exit this>
Synchronization order: Thread 1's <exit this> comes before Thread 2's <enter this>
Happens-before order: every operation that happens before an unlock (release) is visible to an operation that happens after a later lock (acquire)
Possible results:
y==0, x == 1
y==2, x == 0
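The two possible results can be checked with a small harness. This is a minimal sketch, assuming hypothetical names (SynchronizedVisibility, runOnce, isExpected) that are not in the slides. It runs the two synchronized methods in separate threads and reports (y, x); whichever thread takes the monitor first, the other one sees all of its writes.

```java
// Sketch only: class and helper names are hypothetical.
final class SynchronizedVisibility {
    int a, b;
    int x, y;

    void thread1() {
        synchronized (this) { y = a; b = 1; }
    }

    void thread2() {
        synchronized (this) { x = b; a = 2; }
    }

    // Runs both methods concurrently once and returns {y, x}.
    static int[] runOnce() {
        SynchronizedVisibility s = new SynchronizedVisibility();
        Thread t1 = new Thread(s::thread1);
        Thread t2 = new Thread(s::thread2);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] { s.y, s.x }; // safe to read after join()
    }

    // True for the two outcomes the slide lists: (y, x) = (0, 1) or (2, 0).
    static boolean isExpected(int[] yx) {
        return (yx[0] == 0 && yx[1] == 1) || (yx[0] == 2 && yx[1] == 0);
    }

    public static void main(String[] args) {
        for (int n = 0; n < 1_000; n++) {
            int[] yx = runOnce();
            if (!isExpected(yx)) {
                throw new AssertionError("unexpected outcome y=" + yx[0] + ", x=" + yx[1]);
            }
        }
        System.out.println("only (0, 1) and (2, 0) observed");
    }
}
```

Which of the two outcomes appears in a given run depends on which thread acquires the monitor first.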
Synchronization
High level
• java.util.concurrent
Low level
• synchronized() blocks and methods,
• java.util.concurrent.locks
Low level primitives
• volatile variables
• java.util.concurrent.atomic
Volatile
@NotThreadSafe
public class Looper {
    static boolean done;

    public static void main(String[] args) throws InterruptedException {
        new Thread(new Runnable() {
            @Override
            public void run() {
                int count = 0;
                while (!done) { // done is not volatile: the read may be hoisted out of the loop
                    count++;
                }
                System.out.println("Ending this task");
            }
        }).start();
        Thread.sleep(1000);
        System.out.println("Waiting done");
        done = true; // not guaranteed to become visible to the looping thread
    }
}
Volatile
@ThreadSafe
public class Looper {
    static volatile boolean done;

    public static void main(String[] args) throws InterruptedException {
        new Thread(new Runnable() {
            @Override
            public void run() {
                int count = 0;
                while (!done) { // volatile read: sees the write made by main
                    count++;
                }
                System.out.println("Ending this task");
            }
        }).start();
        Thread.sleep(1000);
        System.out.println("Waiting done");
        done = true; // volatile write
    }
}
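time: Thread 1 writes, Thread 2 reads
Thread 1 (program order): ... done = true;
Thread 2 (program order): while (!done) ...
Synchronization order: the volatile write done = true comes before the volatile read in while (!done)
Happens-before order: the write in Thread 1 happens-before the read in Thread 2 that sees it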
More about volatile
• Volatile reads are very cheap (no locking, unlike synchronized),
• A volatile increment is not atomic (!!!),
• Elements of a volatile array or collection are not volatile (e.g. volatile int[]),
• Consider using java.util.concurrent
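The second point is easy to demonstrate. Below is a minimal sketch, assuming hypothetical names (VolatileIncrement, run) that are not in the slides. Several threads increment a volatile int and an AtomicInteger the same number of times; the atomic counter always reaches the exact total, while the volatile counter may lose updates because ++ is a read, an add and a write.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch only: class and method names are hypothetical.
final class VolatileIncrement {
    static volatile int volatileCounter;
    static final AtomicInteger atomicCounter = new AtomicInteger();

    // Returns {final volatile counter, final atomic counter}.
    static int[] run(int threads, int incrementsPerThread) {
        volatileCounter = 0;
        atomicCounter.set(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int n = 0; n < incrementsPerThread; n++) {
                    volatileCounter++;               // three steps: updates can be lost
                    atomicCounter.incrementAndGet(); // one atomic read-modify-write
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] { volatileCounter, atomicCounter.get() };
    }

    public static void main(String[] args) {
        int[] r = run(4, 100_000);
        System.out.println("volatile: " + r[0] + " (at most 400000)");
        System.out.println("atomic:   " + r[1]);
    }
}
```

How many updates the volatile counter loses, if any, varies from run to run.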
What operations in Java are atomic?
• Reads/writes of variables of primitive types (except long and double, which may be split into two 32-bit halves: non-atomic treatment, JLS §17.7),
• Reads/writes of volatile variables of primitive types (including long and double),
• All reads/writes of references are always atomic (http://bit.ly/2c8kn8i),
• All operations on java.util.concurrent.atomic types,
Examples
Be careful what you’re doing...
Double-checked locking
@ThreadSafe
public class DoubleCheckedLocking {
    private volatile Helper helper = null; // volatile is what makes this idiom correct

    public Helper getHelper() {
        if (helper == null) {             // first check, without locking
            synchronized (this) {
                if (helper == null) {     // second check, under the lock
                    helper = new Helper();
                }
            }
        }
        return helper;
    }
}
The "Double-Checked Locking is Broken" Declaration: http://bit.ly/2cIDBnA
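To see the idiom in use, here is a minimal self-contained sketch. The names DclDemo and allThreadsSeeSameInstance are hypothetical, and Helper is an empty stand-in because the slides do not define it. This variant also copies the volatile field into a local variable so the fast path performs a single volatile read; that is a common refinement, not something the slide shows.

```java
// Sketch only: DclDemo and its helper method are hypothetical names.
final class DclDemo {
    static final class Helper { } // empty stand-in for the slide's Helper

    private volatile Helper helper;

    Helper getHelper() {
        Helper local = helper;            // single volatile read on the fast path
        if (local == null) {
            synchronized (this) {
                local = helper;           // re-check under the lock
                if (local == null) {
                    local = new Helper();
                    helper = local;       // volatile write publishes the object
                }
            }
        }
        return local;
    }

    // Calls getHelper() from several threads and checks they all got one instance.
    static boolean allThreadsSeeSameInstance(int threads) {
        DclDemo demo = new DclDemo();
        Helper[] seen = new Helper[threads];
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int index = t;
            workers[t] = new Thread(() -> seen[index] = demo.getHelper());
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        for (Helper h : seen) {
            if (h == null || h != seen[0]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(allThreadsSeeSameInstance(8)); // prints true
    }
}
```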
Final
@NotThreadSafe
class UnsafePublication {
    private int a; // not final: no safe-publication guarantee
    private static UnsafePublication instance;

    private UnsafePublication() {
        a = 1;
    }

    void thread1() throws InterruptedException {
        instance = new UnsafePublication();
    }

    void thread2() {
        if (instance != null) {
            System.out.println(instance.a);
        }
    }
}
What state can thread 2 see?
null, 0, 1
Final
@ThreadSafe
class SafePublication {
    private final int a; // final: a == 1 is visible to any thread that sees the instance
    private static SafePublication instance;

    private SafePublication() {
        a = 1;
    }

    void thread1() throws InterruptedException {
        instance = new SafePublication();
    }

    void thread2() {
        if (instance != null) {
            System.out.println(instance.a);
        }
    }
}
Next-JMM
• JEP 188,
• Improve formalization,
• JVM coverage,
• Extend scope,
• Testing support,
• Tool support,
• Enh: atomic r/w for long and double,
To sum up...
• Concurrent programming isn’t easy,
• Design your code for concurrency (make it right before you
make it fast),
• Do not code against the implementation. Code against the
specification,
• Use high level synchronization wherever possible,
• Watch out for useless synchronization,
• Use Thread Safe Immutable objects,
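As an illustration of the last point, a minimal sketch of an immutable value class follows. ImmutablePoint and withX are hypothetical names, not from the slides. All fields are final and there are no setters, so instances can be shared between threads without synchronization; a "change" creates a new object.

```java
// Sketch only: ImmutablePoint is a hypothetical example class.
final class ImmutablePoint {
    private final int x; // final fields: safely visible once the constructor finishes
    private final int y;

    ImmutablePoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int x() { return x; }

    int y() { return y; }

    // Returns a modified copy instead of mutating this instance.
    ImmutablePoint withX(int newX) {
        return new ImmutablePoint(newX, y);
    }

    public static void main(String[] args) {
        ImmutablePoint p = new ImmutablePoint(1, 2);
        ImmutablePoint q = p.withX(5);
        System.out.println(p.x() + ", " + p.y()); // prints 1, 2
        System.out.println(q.x() + ", " + q.y()); // prints 5, 2
    }
}
```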
Further reading
• Aleksey Shipilëv: One Stop Page (http://bit.ly/2cqBt4x),
• Rafael Winterhalter: The Java Memory Model for Practitioners (http://bit.ly/2cMXklJ),
• Brian Goetz: Java Concurrency in Practice (http://amzn.to/2cloe76)
Thank you!

Java Memory Model

  • 1. JMM Java Memory Model Łukasz Koniecki 24/10/2016
  • 3. Goal • Familiarize with the JMM, • How processor works? • Recall how Java compiler and JVM work, • JIT in action, • Explain what is a data race and a correctly synchronized program, • Talk about synchronization and atomicity, • Based on examples... • Next-gen JMM...
  • 7. Dummy program public class Example { int i, j; public void myDummyMethod() { i+=1; j+=1; i+=1; ... } }
  • 8. RAM i = 0 j = 0 Cache Program execution System Bus public class Example { int i, j; public void myDummyMethod() { i+=1; j+=1; i+=1; ... } } The Java Memory Model for Practitioners: http://bit.ly/2cMXklJ
  • 9. RAM i = 0 j = 0 Cache Program execution System Bus i = 0 public class Example { int i, j; public void myDummyMethod() { i+=1; j+=1; i+=1; ... } }
  • 10. RAM i = 0 j = 0 Cache Program execution System Bus i = 1 public class Example { int i, j; public void myDummyMethod() { i+=1; j+=1; i+=1; ... } }
  • 11. RAM i = 0 j = 0 Cache Program execution System Bus i = 1 public class Example { int i, j; public void myDummyMethod() { i+=1; j+=1; i+=1; ... } }
  • 12. RAM i = 1 j = 0 Cache Program execution System Bus public class Example { int i, j; public void myDummyMethod() { i+=1; j+=1; i+=1; ... } }
  • 13. RAM i = 1 j = 0 Cache Program execution System Bus j = 0 public class Example { int i, j; public void myDummyMethod() { i+=1; j+=1; i+=1; ... } }
  • 14. RAM i = 1 j = 0 Cache Program execution System Bus j = 1 public class Example { int i, j; public void myDummyMethod() { i+=1; j+=1; i+=1; ... } }
  • 15. RAM i = 1 j = 1 Cache Program execution System Bus j = 1 public class Example { int i, j; public void myDummyMethod() { i+=1; j+=1; i+=1; ... } }
  • 16. RAM i = 1 j = 1 Cache Program execution System Bus Sequentialy consistent execution public class Example { int i, j; public void myDummyMethod() { i+=1; j+=1; i+=1; ... } }
  • 18. Our world in data: http://bit.ly/1NLxNcH Moore’s Law
  • 19. Moore’s Law Our world in data: http://bit.ly/1NLxNcH 2006
  • 20. Processor technology • ... • 22 nm – 2012 • 14 nm – 2014 • 10 nm – 2017 • 7 nm – ~2019 • 5 nm – ~2021 Wikipedia: http://bit.ly/2cMWoNg
  • 21. Processor vs. Memory Performance How L1 and L2 CPU caches work, and why they’re an essential part of modern chips: http://bit.ly/2cpHu1x
  • 25. Core i7 Xeon 5500 Series Data Source Latency (approximate) local L1 CACHE hit, ~4 cycles ( 2.1 - 1.2 ns ) local L2 CACHE hit, ~10 cycles ( 5.3 - 3.0 ns ) local L3 CACHE hit, line unshared ~40 cycles ( 21.4 - 12.0 ns ) local L3 CACHE hit, shared line in another core ~65 cycles ( 34.8 - 19.5 ns ) local L3 CACHE hit, modified in another core ~75 cycles ( 40.2 - 22.5 ns ) remote L3 CACHE (Ref: Fig.1 [Pg. 5]) ~100-300 cycles ( 160.7 - 30.0 ns ) local DRAM ~60 ns remote DRAM ~100 ns Performance Analysis Guide for Intel® Core™ i7 Processor and Intel® Xeon™ 5500 processors: http://intel.ly/2cV1ZFZ Cache latency
  • 26. Weak vs. Strong hardware Memory Models Weak vs. Strong Memory Models: http://bit.ly/2cC4avk
  • 27. x86/x64 processor memory model R-R R-W W-R W-W Intel® 64 and IA-32 Architectures Software Developer’s Manual: http://intel.ly/2csMyB2 Processor P can read B before it’s write to A is seen by all processors (processor can move its own reads in front of its own writes)
  • 28. x86/x64 processor memory model R-R R-W W-R W-W Intel® 64 and IA-32 Architectures Software Developer’s Manual: http://intel.ly/2csMyB2 Processor P can read B before it’s write to A is seen by all processors (processor can move its own reads in front of its own writes)
  • 29. How Java compiler works? javac Source code Byte code Bytecode verifier Class loader JIT JVM OS Native code Byte code
  • 30. JIT •Profile guided, •Speculatively optimizing, •Backup strategies, •Optimizes code for us, •We don’t have to care so much about cache-wise operations
  • 32. Tiered compilation (interpreter) time throuput startup interpreted C1 C2 sampling full speed deoptimize bail to interpreter Interpreter • extremly slow, • not profiling
  • 33. Tiered compilation (C1 compiler) time throuput startup interpreted C1 C2 sampling full speed deoptimize bail to interpreter C1 • client, • fast but dummy, • does the profiling, • e.g: branches, typechecks,
  • 34. Tiered compilation (C2 compiler) time throuput startup interpreted C1 C2 sampling full speed deoptimize bail to interpreter C2 • server, • slow but clever, • aggresively optimizing, • based on profile, • e.g.: loop optimizations (unswitching, unrolling), Implicit Null Checking
  • 35. Why do we need a JMM? • Different platform memory models (none of them match the JMM!!!) • Many JVM implementations, • People don’t know how to program concurrently, • Programmers: write reliable and multithreaded code, • Compiler writers: implement optimization which will be a legal, optimization according to the JLS • Compiler: produce fast and optimal native code,
  • 36. JMM • Action: read and write to variable, lock and unlock of monitor, starting and joining with thread, • Happens-before partial order, • Thread executing action B can see the results of action A (any thread), there must be a happens-before relationship between A and B, • Otherwise JVM is free to reorder,
  • 37. Happens-before orderings • Unlock of a monitor / lock of that monitor, • Write to a volatile variable / read of that variable, • Call to start() / any action in the started thread, • All actions in a thread / any other thread successfully returns from join() on that thread, • Setting default values for variables, setting value to a final field in the constructor / constructor finish, • Write to an Atomic variable / read from that variable, • Many java.util.concurrent methods,
  • 38. JMM • A promise for programmers: sequential consistency must be sacrificed to allow optimizations, but it will still hold for data race free programs. This is the data race free (DRF) guarantee. • A promise for security: even for programs with data races, values should not appear “out of thin air”, preventing unintended information leakage. • A promise for compilers: common hardware and software optimizations should be allowed as far as possible without violating the first two requirements. Java Memory Model Examples: Good, Bad and Ugly: http://bit.ly/2cZfF1I
  • 39. Example @NotThreadSafe class DataRace { int a, b; int x, y; void thread1() { y = a; b = 1; } void thread2() { x = b; a = 2; } } y == 2, x == 1 ???
  • 40. How can this happen? • Processor can reorder statements (out-of-order execution, HT) • Lazy synchronization between caches and main memory, • Compiler can reorder statements (or keep values is registers), • Aggressive optimizations in JIT,
  • 41. Example @NotThreadSafe class DataRace { int a, b; int x, y; void thread1() { y = a; b = 1; } void thread2() { x = b; a = 2; } } time Thread 1 Thread 2 y = a; b = 1; x = b; a = 2;
  • 42. Example @NotThreadSafe class DataRace { int a, b; int x, y; void thread1() { y = a; b = 1; } void thread2() { x = b; a = 2; } } time Thread 1 Thread 2 b = 1; y = a; x = b; a = 2;
  • 43. Example @NotThreadSafe class DataRace { int a, b; int x, y; void thread1() { y = a; b = 1; } void thread2() { x = b; a = 2; } } time Thread 1 Thread 2 b = 1; y = a; a = 2; x = b;
  • 44. Example @NotThreadSafe class DataRace { int a, b; int x, y; void thread1() { y = a; b = 1; } void thread2() { x = b; a = 2; } } time Thread 1 Thread 2 b = 1; a = 2; x = b; y = a; y == 2, x == 1
  • 45. Example of x86/x64 test results
  • 46. Test using jstress @JCStressTest @Description("Data race") @Outcome(id = {"0, 0", "0, 1", "2, 0"}, expect = ACCEPTABLE, desc = "Trivial under sequential consistency") @Outcome(id = {"2, 1"}, expect = ACCEPTABLE, desc = "Racy read of x") @State public class DataRace { int a, b; int x, y; @Actor void thread1(IntResult2 r) { y = a; b = 1; r.r1 = y; } @Actor void thread2(IntResult2 r) { x = b; a = 2; r.r2 = x; } } jcstress: http://bit.ly/2daSL5Q
  • 47. Example of x86/x64 test results R-R R-W W-R W-W
  • 48. Test results interpretation y==0, x==0 y==0, x==1 y==2, x==0 time . . . y = a; b = 1; . . . x = b; a = 2;
  • 51. Visibility between threads @ThreadSafe public class DataRace { int a, b; int x, y; void thread1() { synchronized (this) { y = a; b = 1; } } void thread2() { synchronized (this) { x = b; a = 2; } } }
  • 52. Visibility between threads @ThreadSafe public class DataRace { int a, b; int x, y; void thread1() { synchronized (this) { y = a; b = 1; } } void thread2() { synchronized (this) { x = b; a = 2; } } } time Thread 1 Thread 2 (Th2 starts after Th1) . . . <enter this> y = a; b = 1; <exit this> <enter this> x = b; a = 2; <exit this> . . . Program order within each thread, plus the synchronization order between <exit this> and the later <enter this>, gives the happens-before order. Every operation that happens before an unlock (release) is visible to an operation that happens after a later lock (acquire) of the same monitor. Possible results: y == 0, x == 1 or y == 2, x == 0
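The two possible results on the slide above can be checked with a small plain-Java harness. This is a minimal sketch, not part of the original deck: the class name `SynchronizedOutcomes` and the helpers `runOnce` and `observe` are illustrative names, and the run count is arbitrary.

```java
import java.util.Set;
import java.util.TreeSet;

class SynchronizedOutcomes {
    int a, b;
    int x, y;

    void thread1() {
        synchronized (this) { y = a; b = 1; }
    }

    void thread2() {
        synchronized (this) { x = b; a = 2; }
    }

    /** Runs both methods on fresh state in two threads and returns "y, x". */
    static String runOnce() {
        SynchronizedOutcomes s = new SynchronizedOutcomes();
        Thread t1 = new Thread(s::thread1);
        Thread t2 = new Thread(s::thread2);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // join() orders the workers' writes before these reads,
        // so the plain fields can be read safely here.
        return s.y + ", " + s.x;
    }

    /** Collects the distinct outcomes seen over the given number of runs. */
    static Set<String> observe(int runs) {
        Set<String> seen = new TreeSet<>();
        for (int i = 0; i < runs; i++) {
            seen.add(runOnce());
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(observe(10_000));
    }
}
```

Which of the two outcomes shows up in a given run depends on scheduling, but because both bodies lock the same monitor, "0, 0" and "2, 1" should never be observed.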
  • 53. Synchronization High level • java.util.concurrent Low level • synchronized() blocks and methods, • java.util.concurrent.locks Low level primitives • volatile variables • java.util.concurrent.atomic
  • 54. Volatile @ThreadUnsafe public class Looper { static boolean done; public static void main(String[] args) throws InterruptedException { new Thread(new Runnable() { @Override public void run() { int count = 0; while (!done) { count++; } System.out.println("Ending this task"); } }).start(); Thread.sleep(1000); System.out.println("Waiting done"); done = true; } }
  • 55. Volatile @ThreadSafe public class Looper { volatile static boolean done; public static void main(String[] args) throws InterruptedException { new Thread(new Runnable() { @Override public void run() { int count = 0; while (!done) { count++; } System.out.println("Ending this task"); } }).start(); Thread.sleep(1000); System.out.println("Waiting done"); done = true; } } Program order Program order synchronization order Thread 1 time Thread 2 . . . done = true; while (!done) . . . happens-before order
  • 56. More about volatile • Volatile reads are very cheap (no locking, unlike synchronized), • A volatile increment is not atomic (!!!), • Elements of a volatile array are not volatile (e.g. volatile int[]), • Consider using java.util.concurrent
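The point about volatile increments can be illustrated by counting from several threads at once. A minimal sketch, assuming an illustrative class `Counters` and helper `run` (neither appears in the slides); it compares a `volatile int` with `java.util.concurrent.atomic.AtomicInteger`.

```java
import java.util.concurrent.atomic.AtomicInteger;

class Counters {
    static volatile int volatileCount;  // visible to all threads, but ++ is read-modify-write
    static final AtomicInteger atomicCount = new AtomicInteger();

    /** Increments both counters from several threads; returns {volatile result, atomic result}. */
    static int[] run(int threads, int perThread) {
        volatileCount = 0;
        atomicCount.set(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    volatileCount++;                // read, add, write: concurrent updates can be lost
                    atomicCount.incrementAndGet();  // one atomic read-modify-write
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new int[] { volatileCount, atomicCount.get() };
    }

    public static void main(String[] args) {
        int[] r = run(4, 100_000);
        System.out.println("volatile: " + r[0] + ", atomic: " + r[1]);
    }
}
```

The atomic counter always ends at `threads * perThread`; the volatile counter can end anywhere up to that value, and how many updates are lost varies from run to run.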
  • 57. What operations in Java are atomic? • Reads and writes of variables of primitive types, except non-volatile long and double, which may be treated as two separate 32-bit writes (JLS §17.7), • Reads and writes of volatile variables of primitive type (including long and double), • Reads and writes of references are always atomic (http://bit.ly/2c8kn8i), • Individual operations on java.util.concurrent.atomic types.
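Atomic reads and writes do not make a compound "check then update" step atomic; the `java.util.concurrent.atomic` types cover that case. The sketch below tracks a running maximum with `AtomicLong.accumulateAndGet`. The class `MaxTracker` and its `demo` helper are hypothetical names chosen for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

class MaxTracker {
    private final AtomicLong max = new AtomicLong(Long.MIN_VALUE);

    /** Atomically replaces the stored value with max(stored, sample). */
    void record(long sample) {
        max.accumulateAndGet(sample, Math::max);
    }

    long current() {
        return max.get();
    }

    /** Each thread records a disjoint range of samples; returns the overall maximum. */
    static long demo(int threads, int samplesPerThread) {
        MaxTracker tracker = new MaxTracker();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final long base = (long) t * samplesPerThread;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < samplesPerThread; i++) {
                    tracker.record(base + i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return tracker.current();
    }

    public static void main(String[] args) {
        System.out.println(demo(4, 1000));
    }
}
```

With a plain `if (sample > max) max = sample;` on a volatile long, two threads could both pass the check and the smaller value could overwrite the larger one; the atomic update retries internally until it applies cleanly.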
  • 58. Examples Be careful what you’re doing...
  • 59. Double-checked locking @ThreadSafe public class DoubleCheckedLocking { private volatile Helper helper = null; public Helper getHelper() { if (helper == null) { synchronized (this) { if (helper == null) helper = new Helper(); } } return helper; } } The "Double-Checked Locking is Broken" Declaration: http://bit.ly/2cIDBnA
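When the lazily created object is a static singleton rather than a per-instance field as on the slide, the initialization-on-demand holder idiom is a commonly used alternative to double-checked locking. It is a different technique from the one shown above and relies on the JVM's class-initialization guarantees instead of `volatile`. A minimal sketch; `HolderSingleton` and its nested `Helper` are stand-in names for illustration.

```java
class HolderSingleton {
    /** Hypothetical stand-in for the Helper class used on the slide. */
    static class Helper { }

    /** Not initialized until getHelper() first touches Holder.INSTANCE. */
    private static class Holder {
        static final Helper INSTANCE = new Helper();
    }

    static Helper getHelper() {
        // Class initialization is performed once and is thread-safe,
        // so no explicit locking or volatile field is needed here.
        return Holder.INSTANCE;
    }

    public static void main(String[] args) {
        System.out.println(getHelper() == getHelper());
    }
}
```

The idiom does not apply when each enclosing object needs its own lazily created helper; in that case the volatile-based version on the slide remains the relevant pattern.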
  • 60. Final @ThreadUnsafe class UnsafePublication { private int a; private static UnsafePublication instance; private UnsafePublication() { a = 1; } void thread1() throws InterruptedException { instance = new UnsafePublication(); } void thread2() { if (instance != null) { System.out.println(instance.a); } } } What state can thread 2 see??? null, 0, 1
  • 61. Final @ThreadSafe class SafePublication { private final int a; private static SafePublication instance; private SafePublication() { a = 1; } void thread1() throws InterruptedException { instance = new SafePublication(); } void thread2() { if (instance != null) { System.out.println(instance.a); } } }
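The same idea extends to small immutable value classes: every field is final, the constructor does not let `this` escape, and "modification" returns a new object. A minimal sketch under those assumptions; `ImmutablePoint` and `withX` are illustrative names, not from the slides.

```java
final class ImmutablePoint {
    private final int x;
    private final int y;

    ImmutablePoint(int x, int y) {
        // Only field assignments here: `this` is not handed to other code
        // before the constructor finishes, which the final-field guarantee requires.
        this.x = x;
        this.y = y;
    }

    int x() { return x; }

    int y() { return y; }

    /** Returns a new point instead of mutating this one. */
    ImmutablePoint withX(int newX) {
        return new ImmutablePoint(newX, y);
    }

    public static void main(String[] args) {
        ImmutablePoint p = new ImmutablePoint(1, 2);
        ImmutablePoint q = p.withX(5);
        System.out.println(p.x() + ", " + q.x());
    }
}
```

A thread that obtains a reference to such an object, even through a racy field like `instance` above, sees the values assigned in the constructor; whether it sees the reference at all is still subject to the race.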
  • 62. Next-JMM • JEP 188, • Improve formalization, • JVM coverage, • Extend scope, • Testing support, • Tool support, • Enh: atomic r/w for long and double,
  • 63. To sum up... • Concurrent programming isn’t easy, • Design your code for concurrency (make it right before you make it fast), • Do not code against the implementation. Code against the specification, • Use high level synchronization wherever possible, • Watch out for useless synchronization, • Use Thread Safe Immutable objects,
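As one illustration of "use high level synchronization", the sketch below counts words across lines with an `ExecutorService` and a `ConcurrentHashMap`, with no explicit locks or volatile fields. The class `WordCount`, its `count` method and the 10-second timeout are assumptions made for this example only.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class WordCount {
    /** Counts whitespace-separated words, processing each line as a separate task. */
    static Map<String, Integer> count(List<String> lines, int threads) {
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            for (String line : lines) {
                pool.execute(() -> {
                    for (String word : line.split("\\s+")) {
                        if (!word.isEmpty()) {
                            // merge() performs the read-modify-write atomically per key
                            counts.merge(word, 1, Integer::sum);
                        }
                    }
                });
            }
        } finally {
            pool.shutdown();  // stop accepting tasks; already submitted ones still run
        }
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count(List.of("a b a", "b a", "c"), 3));
    }
}
```

The equivalent code with a plain `HashMap` and `get`/`put` would need external locking around every update; here the library classes carry the synchronization.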
  • 64. Further reading • Aleksey Shipilëv: One Stop Page (http://bit.ly/2cqBt4x), • Rafael Winterhalter: The Java Memory Model for Practitioners (http://bit.ly/2cMXklJ), • Brian Goetz: Java Concurrency in Practice (http://amzn.to/2cloe76)