Technische Universität München

Chip Multicore Processors

Tutorial 8

S. Wallentowitz

Institute for Integrated Systems Theresienstr. 90 Building N1 www.lis.ei.tum.de

Task 8.1: Performance of Snooping-based Cache Coherency

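Initial state (line index: state, address, data; addresses and data in hex)

Cache P0
0: I
1: S   408   54 | 04
2: M   410   20 | 01
3: I

Cache P1
0: I
1: M   428   00 | 20
2: I
3: S   418   0a | 00

Cache P2
0: S   420   01 | 02
1: S   408   54 | 04
2: M   430   00 | 00
3: I

Memory
...
400: 00 | 00
408: 54 | 04
410: 03 | 00
418: 00 | 00
420: 01 | 02
428: 0c | d0
430: 00 | ff
438: 00 | 00
...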
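The tags in the caches are consistent with a direct-mapped cache of four lines of 8 bytes each. That geometry is inferred from the addresses in the figure and is not stated in the slides, so the constants below are assumptions. A minimal sketch of the address-to-line mapping:

```python
# Assumed geometry (inferred from the figure, not given in the slides):
LINE_BYTES = 8   # bytes per cache line
NUM_LINES = 4    # lines per direct-mapped cache


def line_base(addr: int) -> int:
    """Address of the first byte of the cache line containing addr."""
    return addr - addr % LINE_BYTES


def line_index(addr: int) -> int:
    """Index of the cache line that addr maps to in a direct-mapped cache."""
    return (addr // LINE_BYTES) % NUM_LINES


def same_line(a: int, b: int) -> bool:
    """True if both addresses fall into the same memory block."""
    return line_base(a) == line_base(b)
```

Under these assumptions, 0x410 and 0x430 both map to line 2, which is why reading 430 replaces a cached copy of 410, and 0x420 and 0x424 lie in the same line.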

a) sequence 1

Accesses
1: (P1) read 410
2: (P2) read 410
3: (P0) read 430

Bus and cache actions
1: P1 read miss; P0 write back
2: P2 write back and load
3: P0 replace

Total for steps 1 to 3: 200 cycles

State changes
P0 line 2: M -> S (410), then replaced by 430 in S, data 00 | 00
P1 line 2: I -> S, 410, data 20 | 01
P2 line 2: M (430) replaced by 410 in S, data 20 | 01
Memory 410: 03 | 00 -> 20 | 01
Memory 430: 00 | ff -> 00 | 00

a) sequence 2

Accesses
1: (P0) write 420, 42
2: (P2) read 424
3: (P2) write 424, 23

Bus and cache actions
1: P0 write miss; P2 snoops the write miss (WM)
2: P2 read miss; P0 write back
3: P2 invalidate; P0 invalidate

Total for steps 1 to 3: 124 cycles

State changes
P0 line 0: I -> M -> S -> I, 420, data 01 | 42
P2 line 0: S -> I -> S -> M, 420, data 01 | 02 -> 01 | 42 -> 23 | 42
Memory 420: 01 | 02 -> 01 | 42

a) sequence 3

Accesses
1: (P0) write 420, 42
2: (P2) read 424
3: (P2) write 424, 23

Self Study

Starting point: the initial cache and memory state of Task 8.1.

8.1 b)

To reduce external memory accesses, an owner state (O) is added to the cache coherency protocol. On a write, all other copies of the cache line are invalidated (write-invalidate). On a read access by another cache, the current owner supplies the data instead of the memory. Sketch the modified state diagram of the resulting MOSI protocol.

Example Coherency Protocol (MSI)

[State diagram; only the labels are preserved]

States: Invalid, Shared, Modified
All actions refer to cache lines of a write-back cache.
Edge labels have the form: Event (Cache action)

Processor triggered
  CPU Read Miss (Place read miss on bus)
  CPU Write Miss (Place write miss on bus)
  Read Hit (at Shared)
  Hit (at Modified)

Bus triggered
  Invalidate / Write Miss
  Read Miss
  Write Miss (Write Back)
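The MSI protocol can also be written as a transition table. The sketch below assumes the usual textbook write-invalidate MSI transitions, because the edges of the diagram are not recoverable from the slide. The event names (`cpu_read`, `bus_read_miss`, ...) and the helper `access` are hypothetical names chosen for illustration, and only a single cache line is modelled.

```python
# Assumed textbook MSI transitions: (state, event) -> (next state, action).
MSI = {
    ("I", "cpu_read"):       ("S", "place read miss on bus"),
    ("I", "cpu_write"):      ("M", "place write miss on bus"),
    ("S", "cpu_read"):       ("S", None),   # read hit
    ("S", "cpu_write"):      ("M", "place invalidate on bus"),
    ("S", "bus_read_miss"):  ("S", None),
    ("S", "bus_write_miss"): ("I", None),
    ("S", "bus_invalidate"): ("I", None),
    ("M", "cpu_read"):       ("M", None),   # hit
    ("M", "cpu_write"):      ("M", None),   # hit
    ("M", "bus_read_miss"):  ("S", "write back"),
    ("M", "bus_write_miss"): ("I", "write back"),
}

# Bus event that the other caches snoop for each request.
BUS_EVENT = {
    "place read miss on bus": "bus_read_miss",
    "place write miss on bus": "bus_write_miss",
    "place invalidate on bus": "bus_invalidate",
}


def access(states, cpu, op):
    """Apply one CPU access ('read' or 'write') to a single memory block.

    states maps cache name -> state letter for that block and is updated
    in place. Returns the list of (cache, action) pairs that occurred.
    """
    log = []
    states[cpu], action = MSI[(states[cpu], "cpu_" + op)]
    if action:
        log.append((cpu, action))
    bus_event = BUS_EVENT.get(action)
    if bus_event:
        for other in states:
            key = (states[other], bus_event)
            if other != cpu and key in MSI:   # caches in I ignore the bus
                states[other], reaction = MSI[key]
                if reaction:
                    log.append((other, reaction))
    return log
```

Replaying sequence 2 for the block at 420 (P0 starts in I, P2 in S) with `access(s, "P0", "write")`, `access(s, "P2", "read")`, `access(s, "P2", "write")` takes P0 through I, M, S, I and P2 through S, I, S, M, which matches the state changes shown in the solution.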

MOSI

[State diagram; only the labels are preserved]

States: Invalid, Shared, Modified, Owner
Edge labels have the form: Event (Cache action)

  Invalidate / Write Miss
  Read Hit
  Write Miss (Write Back)
  Read Miss (Read Miss)
  Read Miss
  Write (Invalidate), appears twice
  Write Miss (Write miss)
  Eviction (Write Back)
  Read Miss (Provide Data), appears twice
  Read/Write Hit (at Modified)
  Read Hit (at Owner)
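The MOSI variant can be sketched as a transition table in the same way. The table below is an assumption that is consistent with the edge labels of the diagram and with the solution sequences: a Modified or Owner line provides data on a snooped read miss and stays or becomes Owner, and an evicted Owner line is written back. The event names and the helper `access` are hypothetical, and only one memory block is modelled.

```python
# Assumed MOSI transitions: (state, event) -> (next state, action).
MOSI = {
    ("I", "cpu_read"):       ("S", "place read miss on bus"),
    ("I", "cpu_write"):      ("M", "place write miss on bus"),
    ("S", "cpu_read"):       ("S", None),
    ("S", "cpu_write"):      ("M", "place invalidate on bus"),
    ("S", "cpu_evict"):      ("I", None),
    ("S", "bus_read_miss"):  ("S", None),
    ("S", "bus_write_miss"): ("I", None),   # snoop only, does not provide data
    ("S", "bus_invalidate"): ("I", None),
    ("M", "cpu_read"):       ("M", None),
    ("M", "cpu_write"):      ("M", None),
    ("M", "cpu_evict"):      ("I", "write back"),
    ("M", "bus_read_miss"):  ("O", "provide data"),
    ("M", "bus_write_miss"): ("I", "write back"),
    ("O", "cpu_read"):       ("O", None),
    ("O", "cpu_write"):      ("M", "place invalidate on bus"),
    ("O", "cpu_evict"):      ("I", "write back"),
    ("O", "bus_read_miss"):  ("O", "provide data"),
    ("O", "bus_write_miss"): ("I", "write back"),   # assumption
    ("O", "bus_invalidate"): ("I", None),
}

BUS_EVENT = {
    "place read miss on bus": "bus_read_miss",
    "place write miss on bus": "bus_write_miss",
    "place invalidate on bus": "bus_invalidate",
}


def access(states, cpu, op):
    """Apply 'read', 'write' or 'evict' by cpu to one memory block.

    states maps cache name -> state letter and is updated in place.
    Returns the list of (cache, action) pairs that occurred.
    """
    log = []
    states[cpu], action = MOSI[(states[cpu], "cpu_" + op)]
    if action:
        log.append((cpu, action))
    bus_event = BUS_EVENT.get(action)   # None for a plain write back
    if bus_event:
        for other in states:
            key = (states[other], bus_event)
            if other != cpu and key in MOSI:   # caches in I ignore the bus
                states[other], reaction = MOSI[key]
                if reaction:
                    log.append((other, reaction))
    return log
```

For the block at 410 in sequence 1 (P0 starts in M), two reads by P1 and P2 are both served by P0 with "provide data", and the later replacement in P0 corresponds to `access(s, "P0", "evict")`, which writes the block back once.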

c) sequence 1

Accesses
1: (P1) read 410
2: (P2) read 410
3: (P0) read 430

Bus and cache actions
1: P1 read miss; P0 provide data
2: P2 write back and load; P0 provide data
3: P0 replace

Total for steps 1 to 3: 152 cycles

State changes
P0 line 2: M -> O (410), then replaced: I -> S, 430, data 00 | 00
P1 line 2: I -> S, 410, data 20 | 01
P2 line 2: M (430) replaced by 410 in S, data 20 | 01
Memory 410: 03 | 00 -> 20 | 01
Memory 430: 00 | ff -> 00 | 00

c) sequence 2

Accesses
1: (P0) write 420, 42
2: (P2) read 424
3: (P2) write 424, 23

Bus and cache actions
1: P0 write miss; P2 snoops the write miss (WM) and does not provide data
2: P2 read miss; P0 provide data
3: P2 invalidate; P0 invalidate

Total for steps 1 to 3: 60 cycles

State changes
P0 line 0: I -> M -> O -> I, 420, data 01 | 42
P2 line 0: S -> I -> S -> M, 420, data 01 | 02 -> 01 | 42 -> 23 | 42
Memory 420: 01 | 02 (unchanged)
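The effect of the owner state can be read off the totals in the solution: 200 and 124 cycles with MSI against 152 and 60 cycles with MOSI. The short sketch below only does that arithmetic. The individual per-step costs are not used, and the dictionary layout and function name are illustrative choices.

```python
# Totals taken from the solution slides (cycles for steps 1 to 3).
TOTAL_CYCLES = {
    "sequence 1": {"MSI": 200, "MOSI": 152},
    "sequence 2": {"MSI": 124, "MOSI": 60},
}


def saving(sequence: str) -> tuple[int, float]:
    """Return (cycles saved by MOSI, saving as a fraction of the MSI total)."""
    msi = TOTAL_CYCLES[sequence]["MSI"]
    mosi = TOTAL_CYCLES[sequence]["MOSI"]
    return msi - mosi, (msi - mosi) / msi


for name in TOTAL_CYCLES:
    saved, fraction = saving(name)
    print(f"{name}: {saved} cycles saved ({fraction:.1%})")
```

MOSI saves 48 cycles (24%) in sequence 1 and 64 cycles (about 52%) in sequence 2.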

c) sequence 3

Accesses
1: (P0) write 420, 42
2: (P2) read 424
3: (P2) write 424, 23

Self Study: Will be online

Starting point: the initial cache and memory state of Task 8.1.

8.2

Read the article "Memory Performance and Cache Coherency Effects on an Intel Nehalem Multiprocessor System", Daniel Molka et al., PACT 2009. Briefly describe the investigated architecture. What does the term ccNUMA describe? How does the information in the L3 cache relate to the other cache levels, and how precise is it? Briefly describe the executed benchmarks and the central findings of the article.

ccNUMA: cache-coherent non-uniform memory access

L3: inclusive last-level cache; the core valid bits are imprecise
0: the core definitely does not hold a copy
1: the core may hold a copy

Benchmarks: latency, local and global bandwidth
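The meaning of the core valid bits can be illustrated with a small sketch: a cleared bit rules a core out, a set bit only means the core has to be checked. The function name, the bit layout (bit i for core i) and the default of four cores are assumptions for illustration, not details taken from the article.

```python
def cores_to_snoop(core_valid_bits: int, requester: int, num_cores: int = 4) -> list[int]:
    """Cores whose private caches must be checked for a copy of a line.

    Bit i of core_valid_bits is the core valid bit of core i for this L3 line:
    0 means core i definitely holds no copy, so it is skipped;
    1 means core i may hold a copy, so it has to be snooped.
    """
    return [
        core
        for core in range(num_cores)
        if core != requester and (core_valid_bits >> core) & 1
    ]
```

For example, `cores_to_snoop(0b0000, requester=0)` returns an empty list, so the L3 can answer directly, while `cores_to_snoop(0b0110, requester=1)` returns `[2]` even though core 2 may have silently dropped its copy in the meantime.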
