Problem 3 Solution

The document presents a problem set for EE282: Computer Systems Architecture at Stanford University, focusing on a snooping-based coherence protocol for a multicore SMT multiprocessor. It includes exercises that require students to analyze cache states and memory changes based on specific CPU operations. Each exercise details the initial conditions and expected outcomes after various read and write operations, illustrating the coherence states of cache blocks and memory updates.


EE282: Computer Systems Architecture Problem Set

Stanford University Spring 2025 SOLUTION

Problem 3: Snooping-Based Coherence Protocol (12 points)


Adapted from H&P Case Study 1 and exercise 5.1, page 446.
A multicore SMT multiprocessor is illustrated in Figure 3. Only the cache contents are shown,
and we ignore Core 2. Each core has a single, private cache with coherence maintained using the
snooping coherence protocol (MSI protocol) shown in Figure 2. Each private cache is direct-mapped,
with four cache blocks of 8 bytes each. To simplify the diagram, the data column for
each cache block in Figure 3 shows only 2 bytes of data (assume the upper 6 bytes are zero). For further
simplification, the full cache block addresses are shown in the address fields of the caches, where
the tag would normally be stored (we ignore the tag fields here). The coherence states are denoted M for
Modified, S for Shared, and I for Invalid.
For more information on the snooping-based coherence protocol, see H&P Sec. 5.2,
pages 383–387: An Example Protocol.
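
The direct-mapped placement described above can be sketched as follows. This is a minimal illustration, assuming the block index is simply the block number modulo the number of blocks; the name `block_index` and the constants are hypothetical and not part of the problem set.

```python
BLOCK_SIZE = 8   # bytes per cache block (from the problem statement)
NUM_BLOCKS = 4   # blocks per direct-mapped private cache (from the problem statement)

def block_index(address: int) -> int:
    """Return the cache block (0..3) that a byte address maps to."""
    return (address // BLOCK_SIZE) % NUM_BLOCKS

for addr in (0xAC08, 0xAC10, 0xAC20, 0xAC30):
    print(f"{addr:04X} -> B{block_index(addr)}")
# AC08 -> B1
# AC10 -> B2
# AC20 -> B0
# AC30 -> B2
```

Note that AC10 and AC30 map to the same block, so they cannot both reside in one cache at the same time.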

Figure 2: (H&P Fig. 5.5, page 384) The cache coherence mechanism receives requests from both
the core’s processor and the shared bus and responds to these based on the type of request,
whether it hits or misses in the local cache, and the state of the local cache block specified in
the request.


Figure 3: (Adapted from H&P Fig. 5.37) Multicore (point-to-point) multiprocessor.

Exercise 1 (12 points)


For each question in this exercise, assume the caches and memory initially have
the contents shown in Figure 3. Each question specifies one CPU operation of the form:
C<core id>: R, <address> for reads, and
C<core id>: W, <address> → <value written> for writes.
For example:
C3: R, AC10 means Core 3 reads address AC10.
C0: W, AC18 → 0018 means Core 0 writes value 0018 to address AC18. (We will use addresses
and data values in hexadecimal, and may drop the 0x prefix for simplicity.)
Assume each read or write operates on one cache block at a time. Show the resulting state
(i.e., coherency state, address, and data) of the private caches and memory after the
actions given below, along with a brief explanation of how you derived the resulting state,
such as the sequence of events that occurs. Show only the cache blocks or memory locations that
experience a state change; for example:
C0.B0: (S, AC20, 0001) indicates that block 0 in Core 0 is now in the Shared coherency state
(S), holds address AC20 from memory, and has data contents 0001. You don’t need to fill in the
address or data for the I state.
Furthermore, represent any changes to the memory state as: M: <address> → <value>.
Different questions do not depend on one another: assume the actions in each question are applied
to the initial cache and memory states.
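
The operation notation above is regular enough to be read mechanically. The sketch below is one possible way to parse it; the `Op` type and the `parse_op` helper are hypothetical names introduced only for illustration, and accepting `->` as a substitute for the arrow is an added convenience.

```python
import re
from typing import NamedTuple, Optional

class Op(NamedTuple):
    core: int
    kind: str             # "R" for a read, "W" for a write
    address: int
    value: Optional[int]  # None for reads

# C<core id>: R, <address>   or   C<core id>: W, <address> → <value written>
_OP = re.compile(
    r"C(\d+):\s*([RW]),\s*([0-9A-Fa-f]+)(?:\s*(?:→|->)\s*([0-9A-Fa-f]+))?"
)

def parse_op(text: str) -> Op:
    """Parse one operation; addresses and values are hexadecimal without 0x."""
    m = _OP.fullmatch(text.strip())
    if m is None:
        raise ValueError(f"not a valid operation: {text!r}")
    core, kind, addr, value = m.groups()
    if (kind == "W") != (value is not None):
        raise ValueError("writes need a value; reads must not have one")
    return Op(int(core), kind, int(addr, 16),
              None if value is None else int(value, 16))

print(parse_op("C3: R, AC10"))
print(parse_op("C0: W, AC18 → 0018"))
```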
a) (3 points) C0: R, AC20

Solution Core 0 has a read miss and places a read miss on the bus. On receiving the message,
Core 3 sends a copy of the cache block to Core 0. Core 0 writes the data into its cache and
sets AC20’s state to Shared.
C0.B0: (S, AC20, 0020)
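
The read-miss handling in part (a) can be sketched as below. This is a minimal illustration, not a reference model: the dictionary layout and the `read_miss` helper are hypothetical, and the starting state (Core 3 holding AC20 in Shared with data 0020, Core 0's block 0 modeled as Invalid, memory holding the same clean value) is inferred from the solution text rather than copied from Figure 3.

```python
def read_miss(caches, memory, reader, index, addr):
    """Service a read miss: fetch the block and install it as Shared."""
    data = memory.get(addr)
    for core, blocks in caches.items():
        other = blocks[index]
        if core == reader or other["state"] == "I" or other["addr"] != addr:
            continue
        if other["state"] == "M":
            # A Modified owner writes the block back and downgrades to Shared.
            memory[addr] = other["data"]
            other["state"] = "S"
        data = other["data"]  # a cache holding the block supplies it
    caches[reader][index] = {"state": "S", "addr": addr, "data": data}
    return caches[reader][index]

# Assumed starting state for part (a); only block 0 of Cores 0 and 3 is modeled.
caches = {
    0: {0: {"state": "I", "addr": None, "data": None}},
    3: {0: {"state": "S", "addr": 0xAC20, "data": 0x0020}},
}
memory = {0xAC20: 0x0020}

b = read_miss(caches, memory, reader=0, index=0, addr=0xAC20)
print(f"C0.B0: ({b['state']}, {b['addr']:04X}, {b['data']:04X})")
# C0.B0: (S, AC20, 0020)
```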

b) (3 points) C0: R, AC30

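Solution Core 0 has a read miss that is also a conflict miss: AC30 maps to block 2, which
currently holds AC10 in the Modified state. Core 0 writes the old block back and places a read
miss on the bus. On receiving the bus message, memory is updated. Core 0 then loads AC30 and
sets its state to Shared.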
C0.B2: (S, AC30, 0030)
M: AC10 → 0030
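
The replacement in part (b) can be sketched as a write-back followed by a fill. The helper name `read_miss_with_eviction` and the dictionary layout are hypothetical; the starting state of C0.B2 (M, AC10, 0030) and the memory value for AC30 are inferred from the solution text, and the stale memory value for AC10 is deliberately left out because it is only given in Figure 3.

```python
def read_miss_with_eviction(block, memory, new_addr):
    """Replace `block` with `new_addr`, writing the victim back if it is Modified.

    Returns the list of (address, value) memory updates caused by the write-back.
    """
    updates = []
    if block["state"] == "M":
        # A dirty victim must reach memory before the block is reused.
        memory[block["addr"]] = block["data"]
        updates.append((block["addr"], block["data"]))
    block.update(state="S", addr=new_addr, data=memory[new_addr])
    return updates

# Assumed starting state for part (b).
c0_b2 = {"state": "M", "addr": 0xAC10, "data": 0x0030}
memory = {0xAC30: 0x0030}  # stale AC10 entry omitted; see Figure 3

updates = read_miss_with_eviction(c0_b2, memory, new_addr=0xAC30)
print(f"C0.B2: ({c0_b2['state']}, {c0_b2['addr']:04X}, {c0_b2['data']:04X})")
for addr, value in updates:
    print(f"M: {addr:04X} -> {value:04X}")
# C0.B2: (S, AC30, 0030)
# M: AC10 -> 0030
```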

c) (3 points) C3: W, AC08 → 0080

Solution Core 3 has a write hit on a Shared block, so it places an invalidate on the bus and
changes the block’s state to Modified. On receiving the bus message, Core 0 invalidates its copy of AC08.
C3.B1: (M, AC08, 0080)
C0.B1: (I, ..., ...)
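
The upgrade in part (c) can be sketched as follows. The `write_hit_shared` helper and the data layout are hypothetical; both copies of AC08 are assumed to start in Shared, as the solution implies, and their previous data is omitted because this transition does not use it.

```python
def write_hit_shared(caches, writer, index, value):
    """Upgrade the writer's Shared block to Modified and invalidate other copies."""
    block = caches[writer][index]
    if block["state"] != "S":
        raise ValueError("this sketch only covers a write hit on a Shared block")
    for core, blocks in caches.items():
        other = blocks[index]
        if core != writer and other["state"] == "S" and other["addr"] == block["addr"]:
            other["state"] = "I"  # snooped invalidate
    block["state"] = "M"
    block["data"] = value

# Assumed starting state for part (c); only block 1 of Cores 0 and 3 is modeled.
caches = {
    0: {1: {"state": "S", "addr": 0xAC08, "data": None}},
    3: {1: {"state": "S", "addr": 0xAC08, "data": None}},
}
write_hit_shared(caches, writer=3, index=1, value=0x0080)

b = caches[3][1]
print(f"C3.B1: ({b['state']}, {b['addr']:04X}, {b['data']:04X})")
print(f"C0.B1: ({caches[0][1]['state']}, ..., ...)")
# C3.B1: (M, AC08, 0080)
# C0.B1: (I, ..., ...)
```

No memory update appears because neither copy was Modified before the write.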

d) (3 points) C1: W, AC10 → 0042

Solution Core 1 has a write miss and places a write miss on the bus. On receiving the bus
message, Core 0 places cache block 2 on the bus, writes the block back to memory, and changes
its state from Modified to Invalid. Core 1 then receives the data, writes 0042, and changes block 2 to Modified.
C0.B2: (I, ..., ...)
C1.B2: (M, AC10, 0042)
M: AC10 → 0030
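
The ownership transfer in part (d) can be sketched as below. The `write_miss` helper and the data layout are hypothetical. The starting state (C0.B2 Modified with AC10 and data 0030, C1.B2 not Modified and modeled here as Invalid) is inferred from the solution text, and the sketch does not handle a dirty victim in the writer's own cache.

```python
def write_miss(caches, memory, writer, index, addr, value):
    """Service a write miss; returns the (address, value) memory updates."""
    updates = []
    for core, blocks in caches.items():
        other = blocks[index]
        if core == writer or other["addr"] != addr:
            continue
        if other["state"] == "M":
            # The previous owner writes its dirty copy back to memory.
            memory[addr] = other["data"]
            updates.append((addr, other["data"]))
        other["state"] = "I"
    caches[writer][index] = {"state": "M", "addr": addr, "data": value}
    return updates

# Assumed starting state for part (d); only block 2 of Cores 0 and 1 is modeled.
caches = {
    0: {2: {"state": "M", "addr": 0xAC10, "data": 0x0030}},
    1: {2: {"state": "I", "addr": None, "data": None}},
}
memory = {}  # stale AC10 entry omitted; see Figure 3

updates = write_miss(caches, memory, writer=1, index=2, addr=0xAC10, value=0x0042)
b = caches[1][2]
print(f"C0.B2: ({caches[0][2]['state']}, ..., ...)")
print(f"C1.B2: ({b['state']}, {b['addr']:04X}, {b['data']:04X})")
for addr, value in updates:
    print(f"M: {addr:04X} -> {value:04X}")
# C0.B2: (I, ..., ...)
# C1.B2: (M, AC10, 0042)
# M: AC10 -> 0030
```

Memory ends up with 0030 rather than 0042 because the new value stays in Core 1's Modified block until that block is itself written back.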
