Chip Multicore Processors: Tutorial 9
Copyright
© Attribution Non-Commercial (BY-NC)

Technische Universität München

Chip Multicore Processors


Tutorial 9

S. Wallentowitz

Institute for Integrated Systems, Theresienstr. 90, Building N1, www.lis.ei.tum.de

Preparation for next week

Read the article:

Milo M. K. Martin, Mark D. Hill, Daniel J. Sorin: "Why on-chip coherency is here to stay", Communications of the ACM, July 2012


9.1: Directory-based Cache Coherency


In this tutorial you will learn about directory-based cache coherency protocols. Consider the system sketched below: a central directory is located near the memory, and each cache supports directory-based cache coherency. Communication takes place over a network (abstracted here). The directory implements the MSI protocol and stores, for each cache line, a bit vector of all caches that share it.

[Figure: the caches C0, C1 and C2 are connected through the interconnect to the directory (Dir) and the memory (Mem).]
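The per-line bit vector kept by the directory can be pictured with a small Python sketch. It is an illustration only and not part of the tutorial material: the class name `DirectoryEntry`, its fields, and the convention of writing the bit for C0 leftmost are assumptions.

```python
from dataclasses import dataclass

NUM_CACHES = 3  # C0, C1, C2 as in the sketched system


@dataclass
class DirectoryEntry:
    """One directory entry: MSI state plus sharer bit vector (hypothetical layout)."""
    state: str = "U"   # "U" (uncached), "S" (shared) or "M" (modified)
    sharers: int = 0   # bit i is set if cache Ci holds the line

    def add_sharer(self, cache_id: int) -> None:
        self.sharers |= 1 << cache_id

    def remove_sharer(self, cache_id: int) -> None:
        self.sharers &= ~(1 << cache_id)

    def bitvector(self) -> str:
        # Assumed notation: C0 is written leftmost, so "101" means C0 and C2.
        return "".join(str((self.sharers >> i) & 1) for i in range(NUM_CACHES))


entry = DirectoryEntry(state="S")
entry.add_sharer(0)
entry.add_sharer(2)
print(entry.bitvector())  # prints 101
```

A bit vector needs one bit per cache and line, which is cheap for three caches but grows linearly with the number of caches.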


a)

The following shows the initial state of the caches. Determine the directory entries from the local cache entries.
C0
Line  State  Address  Data
0     I
1     S      408      54 | 04
2     M      410      20 | 01
3     I

C1
Line  State  Address  Data
0     I
1     M      428      00 | 20
2     I
3     S      418      0a | 00

C2
Line  State  Address  Data
0     S      420      01 | 02
1     S      408      54 | 04
2     M      430      00 | 00
3     I

Directory
...

Memory
Address  Data
...      ...
400      00 | 00
408      54 | 04
410      03 | 00
418      00 | 00
420      01 | 02
428      0c | d0
430      00 | ff
438      00 | 00
...      ...
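Deriving the directory from the local entries amounts to collecting, for every address, which caches hold a valid copy and in which state. The following Python sketch shows one way to do this. The valid lines are transcribed from the figure above; the function names and the C0-leftmost bit order are assumptions made for this illustration.

```python
# Valid lines per cache as (state, address); invalid lines are simply left out.
caches = {
    0: [("S", 0x408), ("M", 0x410)],
    1: [("M", 0x428), ("S", 0x418)],
    2: [("S", 0x420), ("S", 0x408), ("M", 0x430)],
}


def build_directory(caches):
    """Return {address: (state, sharer bitmask)}; bit i stands for cache Ci."""
    directory = {}
    for cache_id, lines in caches.items():
        for state, address in lines:
            old_state, sharers = directory.get(address, ("U", 0))
            # A modified line must not have any other holder.
            if sharers and "M" in (old_state, state):
                raise ValueError(f"incoherent copies of line {address:x}")
            directory[address] = (state, sharers | (1 << cache_id))
    return directory


def bitvector(sharers, num_caches=3):
    # Assumed notation: C0 leftmost.
    return "".join(str((sharers >> i) & 1) for i in range(num_caches))


directory = build_directory(caches)
for address in sorted(directory):
    state, sharers = directory[address]
    print(f"{address:x}  {state}  {bitvector(sharers)}")
```

Addresses that appear in no cache (400 and 438 here) get no entry in this sketch, which corresponds to the Uncached state.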


b)

Explicit messages are exchanged between the caches and the directory; the following table lists all of them. Develop an MSI state diagram for the coherency protocol. Besides processor events, external messages also trigger state transitions. Messages are sent as actions during state transitions.

Type              Source        Destination   Content
Read Miss (RM)    Local Cache   Directory     ID, Address
Write Miss (WM)   Local Cache   Directory     ID, Address
Write (M)         Local Cache   Directory     ID, Address
Invalidate (IN)   Directory     Remote Cache  Address
Fetch (FE)        Directory     Remote Cache  Address
Fetch/Inv. (FEI)  Directory     Remote Cache  Address
Data (DA)         Directory     Local Cache   Data
Data Reply (DV)   Remote Cache  Directory     Data
Write Back (WB)   Cache         Directory     Address, Data
ACK               General       General       General
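For experimenting with the protocol, the message table can be written down as data types. This is a minimal Python sketch; the names `MsgType` and `Message` and the field names are assumptions, and only the abbreviations and contents come from the table.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class MsgType(Enum):
    RM = "Read Miss"
    WM = "Write Miss"
    M = "Write"
    IN = "Invalidate"
    FE = "Fetch"
    FEI = "Fetch/Inv."
    DA = "Data"
    DV = "Data Reply"
    WB = "Write Back"
    ACK = "ACK"


@dataclass(frozen=True)
class Message:
    type: MsgType
    src: str                             # e.g. "C1" or "Dir"
    dst: str
    address: Optional[int] = None        # carried by RM, WM, M, IN, FE, FEI, WB
    requester_id: Optional[int] = None   # the "ID" field of RM, WM, M
    data: Optional[bytes] = None         # carried by DA, DV, WB


# A read miss of cache C1 on address 400 (hexadecimal), sent to the directory.
read_miss = Message(MsgType.RM, src="C1", dst="Dir", address=0x400, requester_id=1)
print(read_miss.type.name, read_miss.type.value)  # prints RM Read Miss
```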


MSI Protocol, Cache side

[State diagram to be completed. States: Invalid, Shared, Modified]
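One possible set of cache-side transitions can be written as a lookup table. The Python sketch below is a simplified illustration, not the official solution: the event names are made up, every transition is treated as atomic (the wait for the DA reply is not modelled), and acknowledging an invalidation with ACK is an assumption.

```python
# (state, event) -> (next state, message sent as action, or None)
# Events are processor accesses ("cpu_read", "cpu_write", "evict") or
# incoming messages from the table ("IN", "FE", "FEI").
CACHE_TRANSITIONS = {
    ("I", "cpu_read"):  ("S", "RM"),   # read miss, line arrives with DA
    ("I", "cpu_write"): ("M", "WM"),   # write miss, line arrives with DA
    ("S", "cpu_read"):  ("S", None),   # read hit
    ("S", "cpu_write"): ("M", "M"),    # write hit on a shared line: Write (M)
    ("S", "IN"):        ("I", "ACK"),  # assumed: invalidation is acknowledged
    ("M", "cpu_read"):  ("M", None),   # read hit
    ("M", "cpu_write"): ("M", None),   # write hit
    ("M", "FE"):        ("S", "DV"),   # hand data to the directory, keep a copy
    ("M", "FEI"):       ("I", "DV"),   # hand data to the directory, drop the copy
    ("M", "evict"):     ("I", "WB"),   # replacement of a modified line
}


def cache_step(state, event):
    """Return (next_state, message) for one event on one cache line."""
    try:
        return CACHE_TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition for {event!r} in state {state!r}") from None


state = "I"
for event in ("cpu_read", "cpu_write", "FE", "IN"):
    state, message = cache_step(state, event)
    print(event, "->", state, message)
```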


MSI Protocol, Directory side

[State diagram to be completed. States: Uncached, Shared, Modified]
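The directory side can be sketched in the same spirit as a function that handles one request for one line. This Python sketch is one possible solution under simplifying assumptions: the function name and its signature are made up, requests are handled atomically, the DV reply of the owner is only mentioned in comments, and ACK messages are left out.

```python
def directory_step(state, sharers, msg, requester):
    """Handle one request for a single line.

    state is "U", "S" or "M"; sharers is the set of cache ids holding the line.
    Returns (new_state, new_sharers, sent), where sent lists (type, destination).
    """
    sent = []
    if msg == "RM":
        if state == "M":
            (owner,) = sharers           # exactly one owner in Modified
            sent.append(("FE", owner))   # owner answers with DV and keeps a shared copy
        sent.append(("DA", requester))
        return "S", sharers | {requester}, sent
    if msg == "WM":
        if state == "M":
            (owner,) = sharers
            sent.append(("FEI", owner))  # owner answers with DV and invalidates
        else:
            sent += [("IN", c) for c in sorted(sharers - {requester})]
        sent.append(("DA", requester))
        return "M", {requester}, sent
    if msg == "M":                       # write hit on a line the requester shares
        sent += [("IN", c) for c in sorted(sharers - {requester})]
        return "M", {requester}, sent
    if msg == "WB":                      # owner evicts its modified line
        return "U", set(), sent
    raise ValueError(f"unexpected message {msg!r}")


# Read miss of C1 on a line that C0 holds in Modified.
print(directory_step("M", {0}, "RM", 1))
```

The example call yields the Shared state with C0 and C1 as sharers, after a Fetch to C0 and a Data message to C1.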


c)

Complete the message sequence charts of the coherency protocol. Each sequence starts separately from the initial state. Also enter the resulting states of the local caches and the directory.


c.i)
(P1) read 400

Message sequence chart (to be completed), participants: C0, C1, C2, Dir

Cache  Line 0  Line 1  Line 2  Line 3
C0     I       S 408   M 410   I
C1     I       M 428   I       S 418
C2     S 420   S 408   M 430   I

Directory
Address  State  Sharers (C0 C1 C2)
...
400      U      ---
408      S      101
410      M      100
418      S      010
420      S      001
428      M      010
430      M      001
438      U      ---
...


c.i)
(P2) read 438

Message sequence chart (to be completed), participants: C0, C1, C2, Dir

Cache  Line 0  Line 1  Line 2  Line 3
C0     I       S 408   M 410   I
C1     S 400   M 428   I       S 418
C2     S 420   S 408   M 430   I

Directory
Address  State  Sharers (C0 C1 C2)
...
400      S      010
408      S      101
410      M      100
418      S      010
420      S      001
428      M      010
430      M      001
438      U      ---
...


c.ii)
(P1) read 410

Message sequence chart (to be completed), participants: C0, C1, C2, Dir

Cache  Line 0  Line 1  Line 2  Line 3
C0     I       S 408   M 410   I
C1     I       M 428   I       S 418
C2     S 420   S 408   M 430   I

Directory
Address  State  Sharers (C0 C1 C2)
...
400      U      ---
408      S      101
410      M      100
418      S      010
420      S      001
428      M      010
430      M      001
438      U      ---
...


c.ii)
(P2) read 410

Message sequence chart (to be completed), participants: C0, C1, C2, Dir

Cache  Line 0  Line 1  Line 2  Line 3
C0     I       S 408   S 410   I
C1     I       M 428   S 410   S 418
C2     S 420   S 408   M 430   I

Directory
Address  State  Sharers (C0 C1 C2)
...
400      U      ---
408      S      101
410      S      110
418      S      010
420      S      001
428      M      010
430      M      001
438      U      ---
...


c.ii)
(P0) read 430

Message sequence chart (to be completed), participants: C0, C1, C2, Dir

Cache  Line 0  Line 1  Line 2  Line 3
C0     I       S 408   S 410   I
C1     I       M 428   S 410   S 418
C2     S 420   S 408   S 410   I

Directory
Address  State  Sharers (C0 C1 C2)
...
400      U      ---
408      S      101
410      S      111
418      S      010
420      S      001
428      M      010
430      U      ---
438      U      ---
...


c.iii)
(P0) write 420, 42

Message sequence chart (to be completed), participants: C0, C1, C2, Dir

Cache  Line 0  Line 1  Line 2  Line 3
C0     I       S 408   M 410   I
C1     I       M 428   I       S 418
C2     S 420   S 408   M 430   I

Directory
Address  State  Sharers (C0 C1 C2)
...
400      U      ---
408      S      101
410      M      100
418      S      010
420      S      001
428      M      010
430      M      001
438      U      ---
...


c.iii)
(P2) read 424

Message sequence chart (to be completed), participants: C0, C1, C2, Dir

Cache  Line 0  Line 1  Line 2  Line 3
C0     M 420   S 408   M 410   I
C1     I       M 428   I       S 418
C2     I       S 408   M 430   I

Directory
Address  State  Sharers (C0 C1 C2)
...
400      U      ---
408      S      101
410      M      100
418      S      010
420      M      100
428      M      010
430      M      001
438      U      ---
...
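The access to 424 does not name the start of a cache line, so it first has to be mapped to the line that contains it. The Python sketch below assumes 8-byte lines and a direct-mapped cache with four lines. Both values are inferred from the addresses and line positions in the figures and are not stated explicitly in the tutorial; the function names are made up.

```python
LINE_SIZE = 8   # bytes per cache line (assumed)
NUM_LINES = 4   # lines per cache, numbered 0 to 3 (assumed direct-mapped)


def line_address(address):
    """Start address of the cache line that contains the given byte address."""
    return address & ~(LINE_SIZE - 1)


def line_index(address):
    """Cache line (0 to 3) that the address maps to in a direct-mapped cache."""
    return (address // LINE_SIZE) % NUM_LINES


# Addresses in the tutorial are hexadecimal.
print(hex(line_address(0x424)), line_index(0x424))  # prints 0x420 0
```

Under these assumptions, an access to 424 concerns the line 420, which occupies line 0 of a cache.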


c.iii)
(P2) write 424, 23

Message sequence chart (to be completed), participants: C0, C1, C2, Dir

Cache  Line 0  Line 1  Line 2  Line 3
C0     S 420   S 408   M 410   I
C1     I       M 428   I       S 418
C2     S 420   S 408   M 430   I

Directory
Address  State  Sharers (C0 C1 C2)
...
400      U      ---
408      S      101
410      M      100
418      S      010
420      S      101
428      M      010
430      M      001
438      U      ---
...


c.iv)
(P1) read 410

Self Study

Cache  Line 0  Line 1  Line 2  Line 3
C0     I       S 408   M 410   I
C1     I       M 428   I       S 418
C2     S 420   S 408   M 430   I

Directory
Address  State  Sharers (C0 C1 C2)
...
400      U      ---
408      S      101
410      M      100
418      S      010
420      S      001
428      M      010
430      M      001
438      U      ---
...


What do you think a home directory might be? How can this technique improve the performance of directory-based cache coherency?
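One common reading of "home directory" is a distributed directory in which every memory line has a fixed home node that keeps its directory entry, so requests are spread over several directories instead of queuing at a central one. The mapping from address to home node can be sketched in Python as follows. The node count, the line size, and the simple interleaving function are assumptions for illustration and are not taken from the tutorial.

```python
LINE_SIZE = 8   # bytes per cache line (assumed)
NUM_NODES = 4   # number of nodes that each hold a slice of the directory (assumed)


def home_node(address):
    """Node whose directory slice is responsible for the line of this address."""
    return (address // LINE_SIZE) % NUM_NODES


# Consecutive lines are interleaved across the nodes.
for address in (0x400, 0x408, 0x410, 0x418, 0x420):
    print(hex(address), "-> node", home_node(address))
```

With line-granular interleaving, neighbouring lines land on different nodes, so a burst of misses does not hit a single directory.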
