Technische Universität München
Chip Multicore Processors
Tutorial 9
S. Wallentowitz
Institute for Integrated Systems Theresienstr. 90 Building N1 www.lis.ei.tum.de
Preparation for next week
Read the article:
Milo M. K. Martin, Mark D. Hill, Daniel J. Sorin: "Why On-Chip Cache Coherence Is Here to Stay", Communications of the ACM, July 2012
9.1: Directory-based Cache Coherency
In this tutorial you will learn about directory-based cache coherency protocols. Consider the sketched system: a central directory is located near the memory, and each cache supports directory-based cache coherency. Communication takes place over a network, which is abstracted here. The directory implements the MSI protocol and stores, for each cache line, a bit vector of all caches that share it.
[Figure: the caches C0, C1, and C2 are connected through an interconnect to the directory (Dir) and the memory (Mem).]
a)
In the following you find the initial state of the caches. Determine the directory entries from the local cache entries.
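Caches (four lines each; every valid line shows state, address, data):

        Line 0             Line 1             Line 2             Line 3
C0      I                  S 408 54 | 04      M 410 20 | 01      I
C1      I                  M 428 00 | 20      I                  S 418 0a | 00
C2      S 420 01 | 02      S 408 54 | 04      M 430 00 | 00      I

Directory (entries for the addresses 400, 408, 410, 418, 420, 428, 430, 438): to be determined

Memory:

400: 00 | 00    408: 54 | 04    410: 03 | 00    418: 00 | 00
420: 01 | 02    428: 0c | d0    430: 00 | ff    438: 00 | 00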
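The directory entries follow mechanically from the cache contents: an address is uncached (U) if no cache holds a valid copy, modified (M) if one cache holds it in state M, and shared (S) otherwise. The following is a minimal Python sketch of that derivation. It assumes the cache states and addresses as read from the figure, hexadecimal addresses spaced eight bytes apart, and a bit vector ordered C0 C1 C2; the names `CACHES` and `build_directory` are illustrative only.

```python
# Initial cache contents as (state, line address) per cache line.
# Assumption: addresses are hexadecimal; None marks an invalid line.
CACHES = {
    "C0": [("I", None), ("S", 0x408), ("M", 0x410), ("I", None)],
    "C1": [("I", None), ("M", 0x428), ("I", None), ("S", 0x418)],
    "C2": [("S", 0x420), ("S", 0x408), ("M", 0x430), ("I", None)],
}


def build_directory(caches, addresses):
    """Derive (state, bit vector) per address; bit order C0 C1 C2 is assumed."""
    names = sorted(caches)
    directory = {}
    for addr in addresses:
        # State of this line in every cache that holds a valid copy.
        holders = {
            name: state
            for name, lines in caches.items()
            for state, tag in lines
            if tag == addr and state != "I"
        }
        if not holders:
            directory[addr] = ("U", "---")  # uncached
            continue
        bits = "".join("1" if name in holders else "0" for name in names)
        state = "M" if "M" in holders.values() else "S"
        directory[addr] = (state, bits)
    return directory


DIRECTORY = build_directory(CACHES, range(0x400, 0x440, 8))
for addr, (state, bits) in DIRECTORY.items():
    print(f"{addr:x}: {state} {bits}")
```

The printed entries can be compared with a directory filled in by hand.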
b)
Explicit messages are exchanged between the caches and the directory. The following table lists all of these messages. Develop an MSI state diagram for the coherency protocol. Besides processor events, external messages also lead to state transitions. Messages are sent as actions during state transitions.
Type                Source          Destination     Content
Read Miss (RM)      Local Cache     Directory       ID, Address
Write Miss (WM)     Local Cache     Directory       ID, Address
Write (M)           Local Cache     Directory       ID, Address
Invalidate (IN)     Directory       Remote Cache    Address
Fetch (FE)          Directory       Remote Cache    Address
Fetch/Inv. (FEI)    Directory       Remote Cache    Address
Data (DA)           Directory       Local Cache     Data
Data Reply (DV)     Remote Cache    Directory       Data
Write Back (WB)     Cache           Directory       Address, Data
ACK                 General         General         General
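The message table can be captured as a small data structure, which makes the required payload of each message explicit. The following Python sketch is only an illustration: the names `MsgType`, `ROUTES`, and `Message` are hypothetical, and the "General" content of ACK is modelled as "no required field", which is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class MsgType(Enum):
    RM = "Read Miss"
    WM = "Write Miss"
    M = "Write"
    IN = "Invalidate"
    FE = "Fetch"
    FEI = "Fetch/Inv."
    DA = "Data"
    DV = "Data Reply"
    WB = "Write Back"
    ACK = "ACK"


# Source, destination, and required payload fields per message type.
ROUTES = {
    MsgType.RM: ("Local Cache", "Directory", ("cache_id", "address")),
    MsgType.WM: ("Local Cache", "Directory", ("cache_id", "address")),
    MsgType.M: ("Local Cache", "Directory", ("cache_id", "address")),
    MsgType.IN: ("Directory", "Remote Cache", ("address",)),
    MsgType.FE: ("Directory", "Remote Cache", ("address",)),
    MsgType.FEI: ("Directory", "Remote Cache", ("address",)),
    MsgType.DA: ("Directory", "Local Cache", ("data",)),
    MsgType.DV: ("Remote Cache", "Directory", ("data",)),
    MsgType.WB: ("Cache", "Directory", ("address", "data")),
    MsgType.ACK: ("General", "General", ()),  # assumption: no mandatory field
}


@dataclass(frozen=True)
class Message:
    type: MsgType
    cache_id: Optional[int] = None
    address: Optional[int] = None
    data: Optional[bytes] = None

    def __post_init__(self):
        # Reject messages that lack a field the table lists for their type.
        for field in ROUTES[self.type][2]:
            if getattr(self, field) is None:
                raise ValueError(f"{self.type.name} requires {field}")


# Example: cache 1 reports a read miss on address 0x400.
read_miss = Message(MsgType.RM, cache_id=1, address=0x400)
```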
MSI Protocol, Cache side
[State diagram with the states Invalid, Shared, and Modified]
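A cache-side controller for the messages of part b) can be written as a transition table. The following Python sketch shows one plausible solution, not the reference solution of this tutorial. The event names `PrRd`, `PrWr`, and `Evict` are assumptions, and waiting for the DA reply after a miss is abstracted away.

```python
# (current state, event) -> (next state, message sent or None).
# Events are processor accesses ("PrRd", "PrWr"), a replacement ("Evict"),
# or messages received from the directory ("IN", "FE", "FEI").
CACHE_FSM = {
    ("I", "PrRd"): ("S", "RM"),   # read miss: request the line
    ("I", "PrWr"): ("M", "WM"),   # write miss: request an exclusive copy
    ("S", "PrRd"): ("S", None),   # read hit
    ("S", "PrWr"): ("M", "M"),    # write to a shared line: announce the write
    ("S", "IN"): ("I", "ACK"),    # another cache writes the line
    ("M", "PrRd"): ("M", None),   # read hit
    ("M", "PrWr"): ("M", None),   # write hit
    ("M", "FE"): ("S", "DV"),     # another cache reads: supply data, keep a copy
    ("M", "FEI"): ("I", "DV"),    # another cache writes: supply data, drop copy
    ("M", "Evict"): ("I", "WB"),  # replacement of a dirty line
}


def cache_step(state, event):
    """Return (next_state, sent_message) for one event; raises KeyError if unlisted."""
    return CACHE_FSM[(state, event)]


# Example: one line goes through a read miss, a write, a remote read,
# and finally an invalidation.
state = "I"
trace = []
for event in ["PrRd", "PrWr", "FE", "IN"]:
    state, sent = cache_step(state, event)
    trace.append((event, state, sent))
print(trace)
```

Keeping the protocol in a table rather than in nested conditionals makes it easy to check that every arrow of a hand-drawn state diagram has a matching entry.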
MSI Protocol, Directory side
[State diagram with the states Uncached, Shared, and Modified]
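The directory side can be sketched in the same spirit. The Python class below is a simplified illustration under stated assumptions, not the reference solution: the class name `Directory` and its methods are hypothetical, the DV and ACK replies from remote caches are not modelled as separate steps, and the bit vector is printed in the order C0 C1 C2.

```python
class Directory:
    """Directory side of MSI: per line a state (U, S, M) and a set of sharers."""

    def __init__(self, num_caches):
        self.num_caches = num_caches
        self.entries = {}  # line address -> (state, set of cache ids)

    def handle(self, msg, cache_id, addr):
        """Process a request from cache_id; return sent messages as (type, target)."""
        state, sharers = self.entries.get(addr, ("U", set()))
        sent = []
        if msg == "RM":
            if state == "M":
                (owner,) = sharers
                sent.append(("FE", owner))  # owner is expected to answer with DV
            sent.append(("DA", cache_id))
            state, sharers = "S", sharers | {cache_id}
        elif msg in ("WM", "M"):
            # Remove all other copies: fetch a modified one, invalidate shared ones.
            kind = "FEI" if state == "M" else "IN"
            sent += [(kind, c) for c in sorted(sharers - {cache_id})]
            if msg == "WM":
                sent.append(("DA", cache_id))  # a write miss also needs the data
            state, sharers = "M", {cache_id}
        elif msg == "WB":
            state, sharers = "U", set()
        else:
            raise ValueError(f"unexpected message {msg}")
        self.entries[addr] = (state, sharers)
        return sent

    def bitvector(self, addr):
        """Sharers as a bit string in the order C0 C1 C2, or '---' if uncached."""
        state, sharers = self.entries.get(addr, ("U", set()))
        if state == "U":
            return "---"
        return "".join("1" if c in sharers else "0" for c in range(self.num_caches))


# Example: line 0x410 is modified in C0, then C1 reads it.
d = Directory(3)
d.entries[0x410] = ("M", {0})
read_msgs = d.handle("RM", 1, 0x410)

# Example: line 0x420 is shared by C2, then C0 writes it.
d.entries[0x420] = ("S", {2})
write_msgs = d.handle("WM", 0, 0x420)
print(read_msgs, d.bitvector(0x410), write_msgs, d.bitvector(0x420))
```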
c)
Complete the message sequence charts of the coherency protocol. Start each of the sequences separately from the initial state. Also update the states of the local caches and the directory after each access.
c.i)
(P1) read 400

[Message sequence chart between C0, C1, C2, and Dir]

        Line 0    Line 1    Line 2    Line 3
C0      I         S 408     M 410     I
C1      I         M 428     I         S 418
C2      S 420     S 408     M 430     I

Directory (state and sharer bit vector in the order C0 C1 C2):

400: U ---    408: S 101    410: M 100    418: S 010
420: S 001    428: M 010    430: M 001    438: U ---
c.i)
(P2) read 438

[Message sequence chart between C0, C1, C2, and Dir]

        Line 0    Line 1    Line 2    Line 3
C0      I         S 408     M 410     I
C1      S 400     M 428     I         S 418
C2      S 420     S 408     M 430     I

Directory (state and sharer bit vector in the order C0 C1 C2):

400: S 010    408: S 101    410: M 100    418: S 010
420: S 001    428: M 010    430: M 001    438: U ---
c.ii)
(P1) read 410

[Message sequence chart between C0, C1, C2, and Dir]

        Line 0    Line 1    Line 2    Line 3
C0      I         S 408     M 410     I
C1      I         M 428     I         S 418
C2      S 420     S 408     M 430     I

Directory (state and sharer bit vector in the order C0 C1 C2):

400: U ---    408: S 101    410: M 100    418: S 010
420: S 001    428: M 010    430: M 001    438: U ---
c.ii)
(P2) read 410

[Message sequence chart between C0, C1, C2, and Dir]

        Line 0    Line 1    Line 2    Line 3
C0      I         S 408     S 410     I
C1      I         M 428     S 410     S 418
C2      S 420     S 408     M 430     I

Directory (state and sharer bit vector in the order C0 C1 C2):

400: U ---    408: S 101    410: S 110    418: S 010
420: S 001    428: M 010    430: M 001    438: U ---
c.ii)
(P0) read 430

[Message sequence chart between C0, C1, C2, and Dir]

        Line 0    Line 1    Line 2    Line 3
C0      I         S 408     S 410     I
C1      I         M 428     S 410     S 418
C2      S 420     S 408     S 410     I

Directory (state and sharer bit vector in the order C0 C1 C2):

400: U ---    408: S 101    410: S 111    418: S 010
420: S 001    428: M 010    430: U ---    438: U ---
c.iii)
(P0) write 420, 42

[Message sequence chart between C0, C1, C2, and Dir]

        Line 0    Line 1    Line 2    Line 3
C0      I         S 408     M 410     I
C1      I         M 428     I         S 418
C2      S 420     S 408     M 430     I

Directory (state and sharer bit vector in the order C0 C1 C2):

400: U ---    408: S 101    410: M 100    418: S 010
420: S 001    428: M 010    430: M 001    438: U ---
c.iii)
(P2) read 424

[Message sequence chart between C0, C1, C2, and Dir]

        Line 0    Line 1    Line 2    Line 3
C0      M 420     S 408     M 410     I
C1      I         M 428     I         S 418
C2      I         S 408     M 430     I

Directory (state and sharer bit vector in the order C0 C1 C2):

400: U ---    408: S 101    410: M 100    418: S 010
420: M 100    428: M 010    430: M 001    438: U ---
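Address 424 does not appear as a line address of its own; it falls inside the line that starts at 420. The following Python sketch shows the address arithmetic, assuming hexadecimal addresses, eight-byte cache lines, and direct-mapped caches with four lines. These parameters are inferred from the address spacing and cache contents in the figures and are not stated explicitly; the function names are illustrative.

```python
LINE_SIZE = 8   # bytes per cache line (assumption, from the address spacing)
NUM_LINES = 4   # lines per cache (from the figures)


def line_address(addr):
    """Start address of the cache line that contains addr."""
    return addr - addr % LINE_SIZE


def line_index(addr):
    """Cache line that addr maps to in a direct-mapped cache (assumption)."""
    return (addr // LINE_SIZE) % NUM_LINES


def line_offset(addr):
    """Byte offset of addr within its cache line."""
    return addr % LINE_SIZE


for addr in (0x420, 0x424, 0x438):
    print(f"{addr:x}: line {line_address(addr):x}, "
          f"index {line_index(addr)}, offset {line_offset(addr)}")
```

Under these assumptions, a read of 424 is handled by the directory entry for 420 and competes for cache line 0.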
c.iii)
(P2) write 424, 23

[Message sequence chart between C0, C1, C2, and Dir]

        Line 0    Line 1    Line 2    Line 3
C0      S 420     S 408     M 410     I
C1      I         M 428     I         S 418
C2      S 420     S 408     M 430     I

Directory (state and sharer bit vector in the order C0 C1 C2):

400: U ---    408: S 101    410: M 100    418: S 010
420: S 101    428: M 010    430: M 001    438: U ---
c.iv)
(P1) read 410 (self study)

        Line 0    Line 1    Line 2    Line 3
C0      I         S 408     M 410     I
C1      I         M 428     I         S 418
C2      S 420     S 408     M 430     I

Directory (state and sharer bit vector in the order C0 C1 C2):

400: U ---    408: S 101    410: M 100    418: S 010
420: S 001    428: M 010    430: M 001    438: U ---
What do you think a home directory might be? How can this technique improve the performance of directory-based cache coherency?
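As a starting point for the discussion: in a distributed directory, each memory line is commonly assigned to one node (its home) that holds the directory entry for that line, so requests for different lines are handled by different nodes instead of one central directory. The Python sketch below shows one simple assignment, interleaving consecutive lines across the nodes. The function name `home_node`, the eight-byte line size, and the interleaving scheme itself are assumptions for illustration, not a scheme prescribed by this tutorial.

```python
LINE_SIZE = 8  # bytes per cache line (assumption)


def home_node(addr, num_nodes):
    """Node that holds the directory entry for addr (line-interleaved mapping)."""
    return (addr // LINE_SIZE) % num_nodes


# Consecutive lines are spread over the three nodes in round-robin order.
homes = {addr: home_node(addr, 3) for addr in range(0x400, 0x440, LINE_SIZE)}
for addr, node in homes.items():
    print(f"{addr:x} -> node {node}")
```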