Leveraging Concurrent Collections to
Simplify Application Design
José Paumard
PHD, JAVA CHAMPION, JAVA ROCK STAR
@JosePaumard https://github.com/JosePaumard
Agenda
Concurrent collections and maps
Collections: Queue, BlockingQueue
Map: ConcurrentMap
And implementations
Concurrent Interfaces
Implementing the Producer / Consumer at the
API level vs the application level
For that, we need new API, new Collections
Two branches: Collection and Map
Collection hierarchy, by JDK version:
- JDK 2: Collection, List, Set, SortedSet
- JDK 5: Queue, BlockingQueue
- JDK 6: NavigableSet, Deque, BlockingDeque
- JDK 7: TransferQueue
Map hierarchy, by JDK version:
- JDK 2: Map, SortedMap
- JDK 5: ConcurrentMap
- JDK 6: NavigableMap, ConcurrentNavigableMap
Concurrent interfaces, that define contracts in
concurrent environments
And implementations that follow these contracts
But concurrency is complex!
Dealing with 10 threads is not the same as 10k
threads…
So we need different implementations
Concurrent Lists
About Vectors and Stacks
There are thread-safe structures: Vector and Stack
They are legacy structures, very poorly implemented
They should not be used!
Copy on Write
Exists for list and set
No locking for read operations
Write operations create a new structure
The new structure then replaces the previous one
[Figure: add(e) copies the internal array tab into a new array that holds e; the write is synchronized]
Copy on Write
The thread that already has a reference on the previous array will not see the modification
The new threads will see the modification
Copy on Write
Two structures:
- CopyOnWriteArrayList
- CopyOnWriteArraySet
Copy on Write Structures
Work well when there are many reads and very, very few
writes
Example: application initialization
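The visibility rule described above can be sketched with an iterator, which holds a reference to the array that existed when it was created. This is a minimal illustration, not taken from the course; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical demo class: shows that a reader keeps seeing the array it started with.
class CopyOnWriteSketch {

    // The iterator is created before the write, so it walks the previous array.
    static int elementsSeenByOldIterator() {
        List<String> list = new CopyOnWriteArrayList<>(Arrays.asList("a", "b"));
        Iterator<String> oldIterator = list.iterator(); // references the current array
        list.add("c");                                  // copies the array, then swaps it in
        int seen = 0;
        while (oldIterator.hasNext()) {
            oldIterator.next();
            seen++;
        }
        return seen;
    }

    // The iteration starts after the write, so it walks the new array.
    static int elementsSeenByNewIterator() {
        List<String> list = new CopyOnWriteArrayList<>(Arrays.asList("a", "b"));
        list.add("c");
        int seen = 0;
        for (String ignored : list) {
            seen++;
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(elementsSeenByOldIterator()); // prints 2
        System.out.println(elementsSeenByNewIterator()); // prints 3
    }
}
```

Every write pays for a full copy of the array, which is why these structures fit read-mostly data.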
Queues and Stacks
Queue and Deque: interfaces
ArrayBlockingQueue: a bounded blocking queue
built on an array
ConcurrentLinkedQueue: an unbounded, non-blocking
queue
How Does a Queue Work?
Two kinds of queues: FIFO (queue) and LIFO (stack)
In the JDK we have the following
- Queue: queue
- Deque: both a queue and a stack
There is no “pure” stack (the Stack class does not count)
[Figure: a queue holding c, b, a; the producer adds at the tail, the consumer takes from the head]
We Are in a Concurrent World
So we can have as many producers and consumers as we need
Each of them in its own thread
A thread does not know how many elements are in the queue…
Two Questions
1) What happens if the queue / stack is full and we need to
add an element to it?
2) What happens if the queue / stack is empty and we need
to get an element from it?
Adding an Element to a Queue That Is Full
k j i h g f e d c b a
boolean add(E e);   // fail: IllegalStateException
boolean offer(E e); // fail: return false
If the Queue Is a BlockingQueue
k j i h g f e d c b a
boolean add(E e);   // fail: IllegalStateException
boolean offer(E e); // fail: return false
boolean offer(E e, long timeout, TimeUnit unit); // fail: return false once the timeout has elapsed
void put(E e);      // blocks until a cell becomes available
Adding Elements at the Tail of a Queue
Two behaviors:
- Failing with an exception
- Failing and returning false
And for blocking queue:
- Blocking until the queue can accept the element
We Also Have Deque And BlockingDeque
Deque can accept elements at the head of a queue:
- addFirst(), offerFirst()
And for the BlockingDeque
- putFirst()
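The insertion behaviors listed above can be observed on a queue of capacity 1. This is a small sketch with hypothetical class and method names; put() and putFirst() are not called on the full structures because they would block forever in this single-threaded example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingDeque;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class: what add() and offer() do when the queue is full.
class FullQueueSketch {

    static String describeFullQueue() {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1);
        queue.add("a"); // the queue is now full

        String addOutcome;
        try {
            queue.add("b");
            addOutcome = "added";
        } catch (IllegalStateException e) {
            addOutcome = "exception"; // add() fails with an exception
        }

        boolean offered = queue.offer("b"); // fails immediately and returns false

        boolean offeredWithTimeout;
        try {
            // waits up to 10 ms for a free cell, then gives up and returns false
            offeredWithTimeout = queue.offer("b", 10, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return addOutcome + "," + offered + "," + offeredWithTimeout;
    }

    static String describeFullDeque() {
        BlockingDeque<String> deque = new LinkedBlockingDeque<>(1);
        deque.addFirst("a");                     // the deque is now full
        boolean offered = deque.offerFirst("b"); // returns false, "a" stays at the head
        return offered + "," + deque.peekFirst();
    }

    public static void main(String[] args) {
        System.out.println(describeFullQueue()); // prints exception,false,false
        System.out.println(describeFullDeque()); // prints false,a
    }
}
```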
Other Methods
Queues also have get and peek operations
Queue:
- Returns null: poll() and peek()
- Exception: remove() and element()
BlockingQueue:
- blocks: take()
Other Methods
Deques have the same operations at the other end
Deque:
- Returns null: pollLast() and peekLast()
- Exception: removeLast() and getLast()
BlockingDeque:
- blocks: takeLast()
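The retrieval side can be sketched the same way on empty structures. The class and method names are hypothetical; take() and takeLast() are left out because they would block forever here, and the timed poll() stands in for them.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.NoSuchElementException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class: what the retrieval methods do when nothing is available.
class EmptyQueueSketch {

    static String describeEmptyQueue() {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(4);
        String polled = queue.poll(); // special value: null
        String peeked = queue.peek(); // special value: null

        String removeOutcome;
        try {
            queue.remove();
            removeOutcome = "removed";
        } catch (NoSuchElementException e) {
            removeOutcome = "exception";
        }

        String elementOutcome;
        try {
            queue.element();
            elementOutcome = "found";
        } catch (NoSuchElementException e) {
            elementOutcome = "exception";
        }

        String timed;
        try {
            timed = queue.poll(10, TimeUnit.MILLISECONDS); // waits 10 ms, then returns null
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return polled + "," + peeked + "," + removeOutcome + "," + elementOutcome + "," + timed;
    }

    static String describeEmptyDeque() {
        Deque<String> deque = new ArrayDeque<>();
        String polled = deque.pollLast(); // special value: null
        String getOutcome;
        try {
            deque.getLast();
            getOutcome = "found";
        } catch (NoSuchElementException e) {
            getOutcome = "exception";
        }
        return polled + "," + getOutcome;
    }

    public static void main(String[] args) {
        System.out.println(describeEmptyQueue()); // prints null,null,exception,exception,null
        System.out.println(describeEmptyDeque()); // prints null,exception
    }
}
```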
Queue and BlockingQueue
Four different types of queues: they may be blocking or not,
may offer access from both sides or not
Different types of failure: special value, exception, blocking
That makes the API quite complex, with a lot of methods
Concurrent Maps
One interface:
ConcurrentMap: redefining the JavaDoc
Two implementations:
ConcurrentHashMap: JDK 7 & JDK 8
ConcurrentSkipListMap: JDK 6, no synchronization
Atomic Operations
ConcurrentMap defines atomic operations:
- putIfAbsent(key, value)
- remove(key, value)
- replace(key, value)
- replace(key, existingValue, newValue)
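The four atomic operations above can be sketched in sequence on a small map. The class name and the sample keys are hypothetical; each call checks and updates the map in a single atomic step.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical demo class: return values of the ConcurrentMap atomic operations.
class AtomicOperationsSketch {

    static String run() {
        ConcurrentMap<String, Integer> map = new ConcurrentHashMap<>();

        Integer previous = map.putIfAbsent("one", 1);   // null: the key was absent, 1 is stored
        Integer existing = map.putIfAbsent("one", 100); // 1: the key is present, nothing changes

        boolean replaced = map.replace("one", 1, 2);    // true: the current value was 1
        boolean notReplaced = map.replace("one", 1, 3); // false: the current value is now 2

        Integer old = map.replace("one", 10);           // 2: replaces whatever value is mapped

        boolean notRemoved = map.remove("one", 2);      // false: the current value is 10
        boolean removed = map.remove("one", 10);        // true: key and value both match

        return previous + "," + existing + "," + replaced + "," + notReplaced + ","
                + old + "," + notRemoved + "," + removed + "," + map.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints null,1,true,false,2,false,true,true
    }
}
```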
ConcurrentMap Implementations
ConcurrentMap implementations are:
- Thread-safe maps
- Efficient up to a certain number of threads
- A number of efficient, parallel special operations
How Does a HashMap Work?
A hashmap is built on an array
1) Compute a hashcode from the key
2) Decide which cell will hold the key / value pair
How Does a HashMap Work?
A hashmap is built on an array
key1 value1
key2 value2
Each cell is called a “bucket”
Understanding the Problem
Adding a key / value pair to a map is a multi-step problem
1) Compute the hashcode of the key
2) Check if the bucket is there or not
3) Check if the key is there or not
4) Update the map
In a concurrent map these steps must not be interrupted by another thread
Understanding the Problem
The only way to guard an array-based structure is to lock the array
Synchronizing the put would work, but…
It would be very inefficient to block all the map:
- we should allow several threads on different buckets
- we should allow concurrent reads
Synchronizing All the Map
Synchronizing on the array itself
Pros: it works!
Cons: one write blocks everything
Synchronizing All the Map
Synchronizing on parts of the array
Pros: it works!
It allows for a certain level of parallelism
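Synchronizing on parts of the array can be sketched as a map split into independently locked stripes. This is a deliberately simplified, hypothetical class and not the JDK implementation: each stripe is a plain HashMap guarded by its own lock, and reads take the lock too, which a real concurrent map avoids.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of lock striping: threads working on different stripes do not block each other.
class StripedMapSketch<K, V> {

    private final Object[] locks;
    private final Map<K, V>[] stripes;

    @SuppressWarnings("unchecked")
    StripedMapSketch(int stripeCount) {
        locks = new Object[stripeCount];
        stripes = (Map<K, V>[]) new Map[stripeCount];
        for (int i = 0; i < stripeCount; i++) {
            locks[i] = new Object();
            stripes[i] = new HashMap<>();
        }
    }

    // Picks the stripe from the hashcode of the key (sign bit cleared to keep the index positive).
    private int stripeFor(Object key) {
        return (key.hashCode() & 0x7fffffff) % stripes.length;
    }

    V put(K key, V value) {
        int i = stripeFor(key);
        synchronized (locks[i]) { // only this stripe is blocked
            return stripes[i].put(key, value);
        }
    }

    V get(K key) {
        int i = stripeFor(key);
        synchronized (locks[i]) {
            return stripes[i].get(key);
        }
    }

    int size() {
        int total = 0;
        for (int i = 0; i < stripes.length; i++) {
            synchronized (locks[i]) {
                total += stripes[i].size();
            }
        }
        return total;
    }

    // Four threads insert 1,000 distinct keys each; returns the final size.
    static int concurrentPutsDemo() {
        StripedMapSketch<Integer, Integer> map = new StripedMapSketch<>(16);
        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            final int base = t * 1_000;
            threads[t] = new Thread(() -> {
                for (int i = 0; i < 1_000; i++) {
                    map.put(base + i, i);
                }
            });
            threads[t].start();
        }
        try {
            for (Thread thread : threads) {
                thread.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(concurrentPutsDemo()); // prints 4000
    }
}
```

The number of stripes caps how many writers can proceed at once, which is the role the concurrency level plays in the JDK 7 design.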
ConcurrentHashMap From JDK 7
Built on a set of synchronized segments
Number of segments = concurrency level (16 - 64k)
This sets the number of threads that can update this map concurrently
The number of key / value pairs has to be (much) greater
than the concurrency level
ConcurrentHashMap JDK 8
The implementation changed completely
Serialization: compatible with JDK 7 in both ways
Tailored to handle heavy concurrency and
millions of key / value pairs
Parallel methods implemented
ConcurrentHashMap<Long, String> map = ...; // JDK 8
String result =
    map.search(10_000,
        (key, value) ->
            value.startsWith("a") ? "a" : null
    );
ConcurrentHashMap: Parallel Search
The first parameter is the parallelism threshold
The second is the operation to be applied
Also searchKeys(), searchValues(), searchEntries()
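A self-contained version of the parallel search could look like the following sketch. The class name, the sample data and the threshold of 1 are assumptions chosen for illustration; the search stops as soon as the function returns a non-null value, and returns null when no pair matches.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class for ConcurrentHashMap.search().
class ParallelSearchSketch {

    private static ConcurrentHashMap<Long, String> sampleMap() {
        ConcurrentHashMap<Long, String> map = new ConcurrentHashMap<>();
        map.put(1L, "one");
        map.put(2L, "two");
        map.put(3L, "three");
        return map;
    }

    // Only "three" is longer than 3 characters, so it is the only possible result.
    static String findValueLongerThanThree() {
        // A threshold of 1 asks for as much parallelism as possible.
        return sampleMap().search(1, (key, value) -> value.length() > 3 ? value : null);
    }

    // No value starts with "z": the search returns null.
    static String findValueStartingWithZ() {
        return sampleMap().search(1, (key, value) -> value.startsWith("z") ? value : null);
    }

    public static void main(String[] args) {
        System.out.println(findValueLongerThanThree()); // prints three
        System.out.println(findValueStartingWithZ());   // prints null
    }
}
```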
ConcurrentHashMap<Long, List<String>> map = ...; // JDK 8
Integer result =
    map.reduce(10_000,
        (key, value) -> value.size(),
        (value1, value2) -> Integer.max(value1, value2)
    );
ConcurrentHashMap: Parallel Map / Reduce
The first bifunction maps to the element to be reduced
The second bifunction reduces two elements together
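The map / reduce call can be sketched end to end as follows. The class name and the sample lists are hypothetical; the result type is Integer because the first bifunction maps each pair to a list size.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class for ConcurrentHashMap.reduce().
class ParallelReduceSketch {

    // Maps each key / value pair to the size of its list, then keeps the largest size.
    static int longestListSize() {
        ConcurrentHashMap<Long, List<String>> map = new ConcurrentHashMap<>();
        map.put(1L, Arrays.asList("a"));
        map.put(2L, Arrays.asList("a", "b", "c"));
        map.put(3L, Arrays.asList("a", "b"));

        Integer max = map.reduce(10_000,
                (key, value) -> value.size(),
                (size1, size2) -> Integer.max(size1, size2));
        return max; // reduce() returns null on an empty map; this map is not empty
    }

    public static void main(String[] args) {
        System.out.println(longestListSize()); // prints 3
    }
}
```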
ConcurrentHashMap<Long, List<String>> map = ...; // JDK 8
map.forEach(10_000,
    (key, value) -> value.removeIf(s -> s.length() > 20)
);
ConcurrentHashMap: Parallel for Each
The biconsumer is applied to all the key / value pairs of the map
Also forEachKey(), forEachValue(), forEachEntry()
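A runnable version of the parallel forEach might look like this sketch. The class name and the data are hypothetical; forEach() returns nothing, so the effect is checked by counting the strings left in the lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class for ConcurrentHashMap.forEach() with a parallelism threshold.
class ParallelForEachSketch {

    // Removes the strings longer than 20 characters from every list, then counts what is left.
    static int remainingAfterCleanup() {
        ConcurrentHashMap<Long, List<String>> map = new ConcurrentHashMap<>();
        map.put(1L, new ArrayList<>(Arrays.asList(
                "short", "a string that is longer than twenty characters")));
        map.put(2L, new ArrayList<>(Arrays.asList("tiny")));

        // The biconsumer is applied once to each key / value pair.
        map.forEach(10_000, (key, value) -> value.removeIf(s -> s.length() > 20));

        int remaining = 0;
        for (List<String> list : map.values()) {
            remaining += list.size();
        }
        return remaining;
    }

    public static void main(String[] args) {
        System.out.println(remainingAfterCleanup()); // prints 2
    }
}
```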
Set<String> set = ConcurrentHashMap.<String>newKeySet(); // JDK 8
ConcurrentHashMap to Create Concurrent Sets
This concurrent hash map can also be used as a concurrent set
No parallel operations available
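A concurrent set created this way can be shared between threads without extra locking. In this sketch, with hypothetical names, four threads add the same 1,000 integers and the set keeps each of them once.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class for ConcurrentHashMap.newKeySet().
class ConcurrentSetSketch {

    static int addFromSeveralThreads() {
        Set<Integer> set = ConcurrentHashMap.newKeySet();
        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < 1_000; i++) {
                    set.add(i); // every thread adds the same values
                }
            });
            threads[t].start();
        }
        try {
            for (Thread thread : threads) {
                thread.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return set.size(); // all writers are done, so the size is stable
    }

    public static void main(String[] args) {
        System.out.println(addFromSeveralThreads()); // prints 1000
    }
}
```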
ConcurrentHashMap From JDK 8
A fully concurrent map
Tailored to handle millions of key / value pairs
With built-in parallel operations
Can be used to create concurrent sets
Concurrent Skip Lists
Another concurrent map (JDK 6)
A skip list is a smart structure used to create
linked lists
Relies on atomic reference operations, no
synchronization
That can be used to create maps and sets
Skip Lists
It starts with a classical linked list
Problem: it takes time to reach element N
Complexity is O(N)
Solution: create a fast access list
We can even create more than one
It assumes that the elements are sorted
The access time is now in O(log(N))
[Figure: a linked list a1 a2 a3 a4 a5 a6 a7 a8 from head to tail, with fast access lists layered on top]
Skip Lists
Let us see how it works
1 2 3 4 5 6 7 8
Let us create a first layer of fast access…
1 3 5 7
1 2 3 4 5 6 7 8
… and a second one
1 5
1 3 5 7
1 2 3 4 5 6 7 8
Now, suppose we need to locate 4
4 is between 1 and 5, so we go down one layer on 1
4 is between 3 and 5, so we go down one layer on 3
Then we can reach 4
The access time is now O(log(N))
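The walk to element 4 can be sketched with a tiny hand-built skip list. This is a single-threaded illustration with hypothetical names, using the three fixed layers from the example; a real skip list usually picks the layers of each element at random and, in the concurrent case, updates its links atomically.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: searching a hand-built, three-layer skip list.
class SkipListSearchSketch {

    static final class Node {
        final int value;
        Node right; // next node in the same layer
        Node down;  // same value, one layer below
        Node(int value) { this.value = value; }
    }

    // Builds one sorted layer and links each node down to the node with the same value below.
    static Node buildLayer(int[] values, Node lowerHead) {
        Node head = null;
        Node previous = null;
        Node lower = lowerHead;
        for (int v : values) {
            Node node = new Node(v);
            while (lower != null && lower.value != v) {
                lower = lower.right;
            }
            node.down = lower;
            if (head == null) {
                head = node;
            } else {
                previous.right = node;
            }
            previous = node;
        }
        return head;
    }

    // The layers from the example: 1 5 / 1 3 5 7 / 1 2 3 4 5 6 7 8.
    static Node slideExample() {
        Node bottom = buildLayer(new int[] {1, 2, 3, 4, 5, 6, 7, 8}, null);
        Node middle = buildLayer(new int[] {1, 3, 5, 7}, bottom);
        return buildLayer(new int[] {1, 5}, middle);
    }

    // Returns the values visited, layer by layer, while looking for target.
    static List<Integer> pathTo(Node topHead, int target) {
        List<Integer> visited = new ArrayList<>();
        Node current = topHead;
        while (current != null) {
            visited.add(current.value);
            // Move right while the next value does not overshoot the target.
            while (current.right != null && current.right.value <= target) {
                current = current.right;
                visited.add(current.value);
            }
            if (current.value == target) {
                return visited;
            }
            current = current.down; // go down one layer
        }
        return visited; // target absent: the path ends on the closest smaller value
    }

    public static void main(String[] args) {
        System.out.println(pathTo(slideExample(), 4)); // prints [1, 1, 3, 3, 4]
    }
}
```

The path visits 1 on the top layer, 1 then 3 on the middle layer, and 3 then 4 on the bottom layer: five nodes, where a plain scan of the bottom list would visit only four. With eight elements the layers bring no gain yet; the O(log(N)) benefit shows on longer lists.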
A skip list is used to implement a map
The keys are sorted
The skip list structure ensures fast access to any
key
A skip list is not an array-based structure
And there are other ways than locking to guard it
The ConcurrentSkipListMap uses this structure
All the references are updated with atomic
compare-and-set operations, as AtomicReference does
So it is a thread-safe map with
no synchronization
There is also a ConcurrentSkipListSet using the
same structure
Tailored for high concurrency!
Concurrent Skip Lists
Used for maps and sets
Thread safety with no locking (synchronization)
Usable when concurrency is high
As usual: some methods should not be used (size())
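Because the keys are kept sorted, a ConcurrentSkipListMap also offers navigation methods. The sketch below uses hypothetical names and data, and avoids size(), which has to traverse the structure and may be inaccurate while other threads are writing.

```java
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical demo class: sorted access on a ConcurrentSkipListMap.
class SkipListMapSketch {

    static String run() {
        ConcurrentNavigableMap<Integer, String> map = new ConcurrentSkipListMap<>();
        map.put(5, "five");
        map.put(1, "one");
        map.put(3, "three");

        Integer first = map.firstKey();                   // smallest key: 1
        Integer ceiling = map.ceilingKey(2);              // smallest key >= 2: 3
        Integer lastBelowFive = map.headMap(5).lastKey(); // largest key < 5: 3
        return first + "," + ceiling + "," + lastBelowFive;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 1,3,3
    }
}
```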
Demo
Let us see some code!
Let us implement a Consumer / Producer using an ArrayBlockingQueue
And see the ConcurrentHashMap from JDK 8 in action
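The demo code itself is not part of this text. As a stand-in, here is a minimal producer / consumer sketch built on an ArrayBlockingQueue. Everything in it is an assumption made for illustration: the class name, the queue capacity of 10, and the "poison pill" value used to tell the consumers to stop.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: producers and consumers exchanging integers through a bounded queue.
class ProducerConsumerSketch {

    static final int POISON = -1; // assumed stop signal, never produced as a regular item

    // Returns the number of items consumed once every thread has finished.
    static int run(int producerCount, int consumerCount, int itemsPerProducer) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(10);
        AtomicInteger consumed = new AtomicInteger();

        Thread[] producers = new Thread[producerCount];
        for (int p = 0; p < producerCount; p++) {
            producers[p] = new Thread(() -> {
                try {
                    for (int n = 1; n <= itemsPerProducer; n++) {
                        queue.put(n); // blocks while the queue is full
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        Thread[] consumers = new Thread[consumerCount];
        for (int c = 0; c < consumerCount; c++) {
            consumers[c] = new Thread(() -> {
                try {
                    while (true) {
                        int item = queue.take(); // blocks while the queue is empty
                        if (item == POISON) {
                            return;
                        }
                        consumed.incrementAndGet();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        for (Thread consumer : consumers) consumer.start();
        for (Thread producer : producers) producer.start();

        try {
            for (Thread producer : producers) producer.join();
            for (int c = 0; c < consumerCount; c++) {
                queue.put(POISON); // one stop signal per consumer
            }
            for (Thread consumer : consumers) consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return consumed.get();
    }

    public static void main(String[] args) {
        System.out.println(run(2, 2, 100)); // prints 200
    }
}
```

No thread checks whether the queue is full or empty: put() and take() carry the waiting, so the thread safety is delegated to the API.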
Wrapup
What did we see?
A producer / consumer implementation using concurrent queues
How to use the ConcurrentHashMap to look for information in a ~170k data set
Module Wrapup
What did we learn?
We saw what the JDK has to offer as concurrent collections and maps
They can be used to solve concurrent problems while delegating thread safety to the API
We focused on blocking queues and concurrent maps
Module Wrapup
Which structure for which case?
If you have very few writes, use copy on write structures
If you have low concurrency, you can rely on synchronization
In high concurrency, skip lists are usable with many objects, or ConcurrentHashMap
High concurrency with few objects is always problematic
Course Wrapup
Be careful when designing concurrent code:
1) be sure to have a good idea of what your problem is
2) concurrent programming is different from parallel processing
3) try to delegate to the API as much as you can
4) know the concurrent collections well, as they solve many problems
Thank you!
@JosePaumard
https://github.com/JosePaumard