Chip Multicore Processors: Tutorial 9
S. Wallentowitz
Read the article:
Milo M. K. Martin, Mark D. Hill, Daniel J. Sorin: "Why On-Chip Cache Coherence Is Here to Stay", Communications of the ACM, July 2012
[Figure: caches C0, C1 and C2 connected through the interconnect to the directory (Dir) and the memory (Mem)]
a)
The initial state of the caches is given below. Determine the directory entries from the local cache entries.
C0
Line  State  Block  Data
0     I
1     S      408    54 | 04
2     M      410    20 | 01
3     I

C1
Line  State  Block  Data
0     I
1     M      428    00 | 20
2     I
3     S      418    0a | 00

C2
Line  State  Block  Data
0     S      420    01 | 02
1     S      408    54 | 04
2     M      430    00 | 00
3     I

Memory
Block  Data
...
400    00 | 00
408    54 | 04
410    03 | 00
418    00 | 00
420    01 | 02
428    0c | d0
430    00 | ff
438    00 | 00
...

Directory
(entries to be determined)
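The derivation asked for in a) can be checked mechanically. The following is a minimal sketch, not a reference solution: the dictionary encoding of the caches is hypothetical, the block addresses are taken as hexadecimal, and the sharer bits are assumed to be written in the order C0, C1, C2. A block held by no cache is Uncached, a block held in state M by one cache is Modified, and any other cached block is Shared.

```python
# Cache contents as read from the figure: line index -> (state, block address).
# This encoding is an assumption made for the sketch; addresses are hexadecimal.
CACHES = {
    "C0": {1: ("S", 0x408), 2: ("M", 0x410)},
    "C1": {1: ("M", 0x428), 3: ("S", 0x418)},
    "C2": {0: ("S", 0x420), 1: ("S", 0x408), 2: ("M", 0x430)},
}
BLOCKS = range(0x400, 0x440, 8)  # the eight blocks 400 ... 438


def build_directory(caches, blocks):
    """Derive (state, sharer bits) per block from the valid cache lines."""
    names = sorted(caches)  # assumed bit order: C0, C1, C2
    directory = {}
    for block in blocks:
        holders = {
            name: state
            for name in names
            for state, addr in caches[name].values()
            if addr == block
        }
        if not holders:
            directory[block] = ("U", "---")
            continue
        state = "M" if "M" in holders.values() else "S"
        bits = "".join("1" if name in holders else "0" for name in names)
        directory[block] = (state, bits)
    return directory


directory = build_directory(CACHES, BLOCKS)
for block, (state, bits) in directory.items():
    print(f"{block:x}  {state}  {bits}")
```

Invalid lines are simply left out of the dictionaries, so they never contribute a sharer bit.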
b)
The caches and the directory exchange explicit messages. The following table lists all of these messages. Develop an MSI state diagram for the coherency protocol. Besides processor events, external messages also trigger state transitions. Messages are sent as actions during state transitions.
Cache states: Invalid, Shared, Modified
Directory states: Uncached, Shared, Modified
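A state diagram of this kind can also be written down as a transition table. The sketch below covers only the cache side and is one possible outcome, not the expected solution. The message names (ReadMiss, WriteMiss, Invalidate, Fetch, FetchInvalidate, WriteBack) are assumptions borrowed from common textbook directory protocols, as are the processor event names PrRd and PrWr; replace them with the names from the message table.

```python
# (current state, event) -> (next state, message sent as action, or None).
# All event and message names are hypothetical placeholders.
CACHE_FSM = {
    # processor events
    ("I", "PrRd"): ("S", "ReadMiss"),
    ("I", "PrWr"): ("M", "WriteMiss"),
    ("S", "PrRd"): ("S", None),
    ("S", "PrWr"): ("M", "WriteMiss"),
    ("M", "PrRd"): ("M", None),
    ("M", "PrWr"): ("M", None),
    # external messages from the directory
    ("S", "Invalidate"): ("I", None),
    ("M", "Fetch"): ("S", "WriteBack"),
    ("M", "FetchInvalidate"): ("I", "WriteBack"),
}


def step(state, event):
    """Return (next_state, action) for one transition of a cache line."""
    return CACHE_FSM[(state, event)]


# Example: a read miss, a remote write, then a local write.
state = "I"
for event in ["PrRd", "Invalidate", "PrWr"]:
    state, action = step(state, event)
    print(event, "->", state, action)
```

Replacement of a line (eviction) is left out to keep the table short; a full diagram would add it as a further event.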
c)
Complete the message sequence charts of the coherency protocol. Each sequence starts from the initial state, independently of the other sequences. Also update the states of the local caches and the directory after each access.
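For each access it helps to work out first which block and which cache line are affected. The sketch below assumes direct-mapped caches with four lines and 8-byte blocks, and hexadecimal addresses; these parameters are inferred from the figures and are not stated explicitly in the exercise.

```python
BLOCK_SIZE = 8  # bytes per block (assumption: two 4-byte words per line)
NUM_LINES = 4   # lines per cache (assumption: direct mapped)


def block_address(addr):
    """Address of the block that contains addr."""
    return addr - addr % BLOCK_SIZE


def line_index(addr):
    """Cache line that the block containing addr maps to."""
    return (addr // BLOCK_SIZE) % NUM_LINES


for addr in (0x400, 0x410, 0x424, 0x430, 0x438):
    print(f"{addr:x}: block {block_address(addr):x}, line {line_index(addr)}")
```

Under these assumptions an access to 424 touches block 420, and blocks 410 and 430 compete for the same line.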
c.i) (P1) read 400

Message sequence chart lifelines: C0, C1, C2, Dir

State before this access:

Line  C0     C1     C2
0     I      I      S 420
1     S 408  M 428  S 408
2     M 410  I      M 430
3     I      S 418  I

Block  Directory state  Sharers
400    U                ---
408    S                101
410    M                100
418    S                010
420    S                001
428    M                010
430    M                001
438    U                ---
c.i) (P2) read 438

Message sequence chart lifelines: C0, C1, C2, Dir

State before this access:

Line  C0     C1     C2
0     I      S 400  S 420
1     S 408  M 428  S 408
2     M 410  I      M 430
3     I      S 418  I

Block  Directory state  Sharers
400    S                010
408    S                101
410    M                100
418    S                010
420    S                001
428    M                010
430    M                001
438    U                ---
c.ii) (P1) read 410

Message sequence chart lifelines: C0, C1, C2, Dir

State before this access:

Line  C0     C1     C2
0     I      I      S 420
1     S 408  M 428  S 408
2     M 410  I      M 430
3     I      S 418  I

Block  Directory state  Sharers
400    U                ---
408    S                101
410    M                100
418    S                010
420    S                001
428    M                010
430    M                001
438    U                ---
c.ii) (P2) read 410

Message sequence chart lifelines: C0, C1, C2, Dir

State before this access:

Line  C0     C1     C2
0     I      I      S 420
1     S 408  M 428  S 408
2     S 410  S 410  M 430
3     I      S 418  I

Block  Directory state  Sharers
400    U                ---
408    S                101
410    S                110
418    S                010
420    S                001
428    M                010
430    M                001
438    U                ---
c.ii) (P0) read 430

Message sequence chart lifelines: C0, C1, C2, Dir

State before this access:

Line  C0     C1     C2
0     I      I      S 420
1     S 408  M 428  S 408
2     S 410  S 410  S 410
3     I      S 418  I

Block  Directory state  Sharers
400    U                ---
408    S                101
410    S                111
418    S                010
420    S                001
428    M                010
430    U                ---
438    U                ---
c.iii) (P0) write 420, 42

Message sequence chart lifelines: C0, C1, C2, Dir

State before this access:

Line  C0     C1     C2
0     I      I      S 420
1     S 408  M 428  S 408
2     M 410  I      M 430
3     I      S 418  I

Block  Directory state  Sharers
400    U                ---
408    S                101
410    M                100
418    S                010
420    S                001
428    M                010
430    M                001
438    U                ---
c.iii) (P2) read 424

Message sequence chart lifelines: C0, C1, C2, Dir

State before this access:

Line  C0     C1     C2
0     M 420  I      I
1     S 408  M 428  S 408
2     M 410  I      M 430
3     I      S 418  I

Block  Directory state  Sharers
400    U                ---
408    S                101
410    M                100
418    S                010
420    M                100
428    M                010
430    M                001
438    U                ---
c.iii) (P2) write 424, 23

Message sequence chart lifelines: C0, C1, C2, Dir

State before this access:

Line  C0     C1     C2
0     S 420  I      S 420
1     S 408  M 428  S 408
2     M 410  I      M 430
3     I      S 418  I

Block  Directory state  Sharers
400    U                ---
408    S                101
410    M                100
418    S                010
420    S                101
428    M                010
430    M                001
438    U                ---
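The state updates in sequences c.i) to c.iii) can be cross-checked with a small simulator. This is a minimal sketch under stated assumptions, not the protocol from the message table: caches are direct mapped with four lines and 8-byte blocks, addresses are hexadecimal, sharer bits are ordered C0, C1, C2, evicting any line (including a Shared one) updates the directory, and data values and individual messages are not modelled. All function names are hypothetical.

```python
BLOCK, LINES = 8, 4
NAMES = ["C0", "C1", "C2"]  # assumed order of the sharer bits


def initial_state():
    """Caches as line -> (state, block); directory as block -> (state, sharers)."""
    caches = {
        "C0": {1: ("S", 0x408), 2: ("M", 0x410)},
        "C1": {1: ("M", 0x428), 3: ("S", 0x418)},
        "C2": {0: ("S", 0x420), 1: ("S", 0x408), 2: ("M", 0x430)},
    }
    directory = {}  # a block without an entry is Uncached
    for name, lines in caches.items():
        for state, block in lines.values():
            directory.setdefault(block, (state, set()))[1].add(name)
    return caches, directory


def evict(caches, directory, name, index):
    """Drop the line at index; the directory forgets this sharer."""
    if index in caches[name]:
        _, block = caches[name].pop(index)
        sharers = directory[block][1]
        sharers.discard(name)
        if not sharers:
            del directory[block]  # last copy gone (written back if Modified)


def access(caches, directory, name, addr, write=False):
    """Perform one read or write by the processor attached to cache `name`."""
    block = addr - addr % BLOCK
    index = (addr // BLOCK) % LINES
    line = caches[name].get(index)
    hit = line is not None and line[1] == block
    if hit and (line[0] == "M" or not write):
        return  # hit with sufficient permission, no messages needed
    if not hit:
        evict(caches, directory, name, index)
    state, sharers = directory.get(block, ("U", set()))
    for other in sharers - {name}:
        if write:
            del caches[other][index]             # invalidate the remote copy
        elif state == "M":
            caches[other][index] = ("S", block)  # owner writes back, keeps S
    new_state = "M" if write else "S"
    directory[block] = (new_state, {name} if write else sharers | {name})
    caches[name][index] = (new_state, block)


def entry(directory, block):
    """Directory entry of a block as (state, sharer bits)."""
    state, sharers = directory.get(block, ("U", set()))
    if not sharers:
        return state, "---"
    return state, "".join("1" if n in sharers else "0" for n in NAMES)


# Replay sequence c.iii), starting from the initial state.
caches, directory = initial_state()
access(caches, directory, "C0", 0x420, write=True)   # (P0) write 420
after_first = entry(directory, 0x420)
access(caches, directory, "C2", 0x424)               # (P2) read 424
after_second = entry(directory, 0x420)
access(caches, directory, "C2", 0x424, write=True)   # (P2) write 424
after_third = entry(directory, 0x420)
print(after_first, after_second, after_third)
```

Calling `initial_state()` again before each sequence mirrors the instruction that every sequence starts from the initial state.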
c.iv) (P1) read 410 (self study)

State before this access:

Line  C0     C1     C2
0     I      I      S 420
1     S 408  M 428  S 408
2     M 410  I      M 430
3     I      S 418  I

Block  Directory state  Sharers
400    U                ---
408    S                101
410    M                100
418    S                010
420    S                001
428    M                010
430    M                001
438    U                ---
What do you think a home directory might be? How can this technique improve the performance of directory-based cache coherency?
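As a starting point for the discussion, one common reading of the term is that the directory is split across several nodes and every block has a fixed "home" node that holds its directory entry. The sketch below illustrates this with a simple interleaving of blocks over nodes; the number of nodes, the block size and the mapping function are assumptions made for illustration, not part of the exercise.

```python
BLOCK_SIZE = 8  # bytes per block (assumption)
NUM_NODES = 4   # number of directory slices (hypothetical)


def home_node(addr):
    """Node whose directory slice is responsible for the block of addr."""
    return (addr // BLOCK_SIZE) % NUM_NODES


# Consecutive blocks are served by different directory slices.
for addr in (0x400, 0x408, 0x410, 0x418, 0x420):
    print(f"block {addr:x} -> home node {home_node(addr)}")
```

With such a mapping, requests for different blocks go to different nodes, so no single directory has to handle all coherence traffic.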