Lecture 6
MIMD
Example
A shared memory system with two cores and two caches
y0 is privately owned by Core 0
y1 and z1 are privately owned by Core 1
Cache coherence
x = 2; /* shared variable */
y0 eventually ends up as 2
y1 eventually ends up as 6
z1 = ???
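A minimal runnable sketch of this scenario, simulated sequentially in one thread. The per-core statements are assumptions consistent with the values above (y1 = 6 suggests Core 1 computes y1 = 3 * x while x is 2), and the later write x = 7 is hypothetical, added only to make z1 ambiguous:

    #include <stdio.h>

    int x = 2;   /* shared variable */

    int main(void) {
        int y0, y1, z1;

        /* Core 0 */ y0 = x;       /* y0 eventually ends up as 2 */
        /* Core 1 */ y1 = 3 * x;   /* y1 eventually ends up as 6 */
        /* Core 0 */ x = 7;        /* hypothetical later update  */
        /* Core 1 */ z1 = 4 * x;   /* 28 if Core 1 sees the new x;
                                      8 if it reads a stale cached x = 2 */

        printf("y0=%d y1=%d z1=%d\n", y0, y1, z1);
        return 0;
    }

On real hardware without coherent caches, Core 1 might still hold x = 2 in its cache when it computes z1; that ambiguity is exactly what cache coherence addresses.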
Problem with Write-Through Policy
With write-through, every update goes straight to main memory, so memory stays current, but copies of the same line held in other cores' caches are not updated and become stale.
Problem with Write-Back Policy
With write-back, an updated value sits only in the writing core's cache until the line is evicted, so main memory and the other caches can both hold stale values.
Cache coherence
Programmers have no control over caches or over when they get updated.
Copies of the data stored in the shared memory must match the copies stored in the local caches. This requirement is referred to as cache coherence.
The copies of a shared variable are coherent if they are all equal.
Cache coherence is important both to guarantee correct program execution and to ensure high system performance.
Cache Coherence Protocols
A cache coherence protocol must be used to
ensure that the contents of the cache
memories are consistent with the contents of
the shared memory.
Snooping Cache Coherence
The cores share a bus, and any signal transmitted on the bus can be "seen" by all the caches attached to it, so each cache controller can snoop for updates to lines it holds.
Snooping works with both write-through and write-back policies.
It requires a broadcast every time a shared variable is updated.
In large networks, broadcasts are expensive.
Snooping cache coherence isn't scalable: as the system grows, the broadcasts cause performance to degrade.
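A toy write-invalidate sketch of that broadcast cost (the names and the one-line-per-cache model are illustrative, not a real protocol): every write goes on the bus, and every other cache must snoop it.

    #include <stdbool.h>
    #include <stdio.h>

    #define NCORES 4

    /* One cached line per core; 'tag' says which memory line it holds. */
    typedef struct {
        int  tag;
        bool valid;
    } CacheLine;

    static CacheLine cache[NCORES];

    /* A write is broadcast on the bus: every other cache must snoop it
       and invalidate its copy of the line. This per-write broadcast is
       the cost that limits scalability. */
    static void snoop_write(int writer, int tag) {
        for (int c = 0; c < NCORES; c++)
            if (c != writer && cache[c].valid && cache[c].tag == tag)
                cache[c].valid = false;   /* invalidate the stale copy */
    }

    int main(void) {
        for (int c = 0; c < NCORES; c++)
            cache[c] = (CacheLine){ .tag = 42, .valid = true };

        snoop_write(0, 42);               /* core 0 writes line 42 */
        for (int c = 0; c < NCORES; c++)
            printf("core %d: valid=%d\n", c, (int) cache[c].valid);
        return 0;
    }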
Directory-Based Cache Coherence
Uses a data structure called a directory that stores the status of each cache line.
The local caches have cache controllers that coordinate updates to the copies of shared variables they store.
A central controller is responsible for cache coherence across the system.
Additional storage is required for the directory.
When a cached variable is updated, only the cores storing that variable need to be contacted.
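An illustrative sketch of a directory entry and of contacting only the sharers (the presence-bitmask layout and all names are assumptions, not any specific hardware design):

    #include <stdint.h>
    #include <stdio.h>

    #define NCORES 8

    typedef enum { UNCACHED, SHARED, MODIFIED } LineState;

    /* One directory entry per cache line: its state plus a presence
       bitmask recording which cores currently hold a copy. */
    typedef struct {
        LineState state;
        uint8_t   sharers;   /* bit i set => core i caches this line */
    } DirEntry;

    /* On a write, invalidations go only to cores whose presence bit is
       set, instead of being broadcast to all NCORES cores. */
    static void dir_write(DirEntry *e, int writer) {
        for (int c = 0; c < NCORES; c++)
            if (c != writer && (e->sharers & (1u << c)))
                printf("invalidation sent to core %d\n", c);
        e->sharers = (uint8_t)(1u << writer);  /* writer is now sole holder */
        e->state   = MODIFIED;
    }

    int main(void) {
        DirEntry line = { SHARED, 0x05 };  /* cores 0 and 2 hold copies */
        dir_write(&line, 0);               /* only core 2 is contacted  */
        return 0;
    }

The per-line DirEntry is the additional storage mentioned above; the payoff is that writes no longer require a system-wide broadcast.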
False Sharing
CPU caches are implemented in hardware, so
they operate on cache lines, not individual
variables.
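A minimal Pthreads sketch of false sharing, assuming a typical 64-byte cache line: the two counters are logically independent, yet because they almost certainly sit in the same line, each thread's writes keep invalidating that line in the other thread's cache.

    #include <pthread.h>
    #include <stdio.h>

    #define REPS 100000000L

    /* Two logically independent counters that almost certainly share a
       cache line, so the line "ping-pongs" between the two cores. */
    static struct { long a; long b; } counters;

    static void *bump_a(void *arg) {
        (void) arg;
        for (long i = 0; i < REPS; i++) counters.a++;
        return NULL;
    }

    static void *bump_b(void *arg) {
        (void) arg;
        for (long i = 0; i < REPS; i++) counters.b++;
        return NULL;
    }

    int main(void) {
        pthread_t t0, t1;
        pthread_create(&t0, NULL, bump_a, NULL);
        pthread_create(&t1, NULL, bump_b, NULL);
        pthread_join(t0, NULL);
        pthread_join(t1, NULL);
        printf("a=%ld b=%ld\n", counters.a, counters.b);
        return 0;
    }

Giving each counter its own cache line (e.g., padding or _Alignas(64)) eliminates the ping-ponging; the result is unchanged, only the running time improves.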
Parallel software
Hardware and compilers can keep up the pace needed; the burden of exploiting the parallelism is on software.
From now on…
In shared memory programs:
◼ Start a single process and fork threads.
◼ Threads carry out tasks.
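A minimal Pthreads sketch of this pattern (the task body and thread count are illustrative): the process starts as a single thread, forks worker threads that carry out tasks, and joins them when they are done.

    #include <pthread.h>
    #include <stdio.h>

    #define NTHREADS 4

    /* Each forked thread carries out a task; here the task just
       reports its rank. */
    static void *task(void *arg) {
        long rank = (long) arg;
        printf("thread %ld carrying out its task\n", rank);
        return NULL;
    }

    int main(void) {
        pthread_t threads[NTHREADS];

        /* Start as a single process, then fork the threads... */
        for (long t = 0; t < NTHREADS; t++)
            pthread_create(&threads[t], NULL, task, (void *) t);

        /* ...and join them once their tasks are done. */
        for (long t = 0; t < NTHREADS; t++)
            pthread_join(threads[t], NULL);
        return 0;
    }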
SPMD – single program, multiple data
A single executable behaves like several different programs through conditional branches, typically on the thread (or process) rank.
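A small Pthreads sketch of SPMD, assuming a vector addition split across threads: every thread executes the same vec_add function, and the rank alone decides which slice of the data each thread handles.

    #include <pthread.h>
    #include <stdio.h>

    #define NTHREADS 4
    #define N 16

    static double x[N], y[N], sum[N];

    /* SPMD: all threads run this same function; the rank selects
       which slice of the arrays each thread works on. */
    static void *vec_add(void *arg) {
        long rank  = (long) arg;
        long chunk = N / NTHREADS;
        for (long i = rank * chunk; i < (rank + 1) * chunk; i++)
            sum[i] = x[i] + y[i];
        return NULL;
    }

    int main(void) {
        for (long i = 0; i < N; i++) { x[i] = i; y[i] = 2 * i; }

        pthread_t threads[NTHREADS];
        for (long t = 0; t < NTHREADS; t++)
            pthread_create(&threads[t], NULL, vec_add, (void *) t);
        for (long t = 0; t < NTHREADS; t++)
            pthread_join(threads[t], NULL);

        for (long i = 0; i < N; i++)
            printf("%.0f ", sum[i]);
        printf("\n");
        return 0;
    }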
Shared Memory
Dynamic threads
◼ A master thread waits for work, forks new threads, and the threads terminate when they are done.
◼ Efficient use of resources, but thread creation and termination are time consuming.
Static threads
◼ A pool of threads is created and allocated work; the threads do not terminate until cleanup (see the sketch below).
◼ Better performance, but a potential waste of system resources.
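A minimal Pthreads sketch of the static-thread approach (the shared counter standing in for a work queue is an assumption for illustration): the pool is created once, each worker repeatedly claims tasks, and the threads only terminate at cleanup when the work runs out.

    #include <pthread.h>
    #include <stdio.h>

    #define NTHREADS 4
    #define NTASKS   8

    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static int next_task = 0;   /* shared "queue": task ids 0..NTASKS-1 */

    /* Each pool thread repeatedly claims a task; it only terminates
       at cleanup, when the work runs out. */
    static void *worker(void *arg) {
        long rank = (long) arg;
        for (;;) {
            pthread_mutex_lock(&lock);
            int task = (next_task < NTASKS) ? next_task++ : -1;
            pthread_mutex_unlock(&lock);
            if (task < 0) return NULL;       /* no work left: terminate */
            printf("thread %ld runs task %d\n", rank, task);
        }
    }

    int main(void) {
        pthread_t pool[NTHREADS];
        for (long t = 0; t < NTHREADS; t++)
            pthread_create(&pool[t], NULL, worker, (void *) t);
        for (long t = 0; t < NTHREADS; t++)
            pthread_join(pool[t], NULL);     /* cleanup */
        return 0;
    }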