Lab3 Suppl

Uploaded by

info.bptrades

Cache Lab Implementation and Blocking

Outline
 Memory organization
 Caching
 Different types of locality
 Cache organization
 Cache lab
 Cache Structure
 getopt/fscanf/Malloc
 Page Replacement
 LRU algorithm
 FIFO algorithm

Memory Hierarchy
Levels near the top are smaller, faster, and costlier per byte; levels near the bottom are larger, slower, and cheaper per byte.
 L0: Registers. CPU registers hold words retrieved from L1 cache
 L1: L1 cache (SRAM). L1 cache holds cache lines retrieved from L2 cache
 L2: L2 cache (SRAM). L2 cache holds cache lines retrieved from main memory
 L3: Main memory (DRAM). Main memory holds disk blocks retrieved from local disks
 L4: Local secondary storage (local disks). Local disks hold files retrieved from disks on remote network servers
 L5: Remote secondary storage (tapes, distributed file systems, Web servers)

SRAM vs DRAM tradeoff
 SRAM (cache)
 Faster (L1 cache: 1 CPU cycle)
 Smaller (Kilobytes (L1) or Megabytes (L2))
 More expensive and “energy-hungry”
 DRAM (main memory)
 Relatively slower (hundreds of CPU cycles)
 Larger (Gigabytes)
 Cheaper

Locality
 Temporal locality
 Recently referenced items are likely to be referenced again in the near future
 After accessing address X in memory, save the bytes in cache for future access

 Spatial locality
 Items with nearby addresses tend to be referenced close together in time
 After accessing address X, save the block of memory around X in cache for future access
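
The two kinds of locality can be seen in a plain array sum. The sketch below is illustrative only: the array dimensions and function names are assumptions, not part of the lab. Both functions compute the same result, but the row-major loop walks adjacent addresses (spatial locality) while the running total is reused on every iteration (temporal locality).

```c
#include <stddef.h>

#define ROWS 256   /* assumed dimensions, chosen only for illustration */
#define COLS 256

/* Row-major traversal: consecutive iterations touch adjacent addresses,
 * so every byte of a fetched cache block gets used. */
long sum_row_major(int a[ROWS][COLS]) {
    long sum = 0;   /* reused on every iteration: temporal locality */
    for (size_t i = 0; i < ROWS; i++)
        for (size_t j = 0; j < COLS; j++)
            sum += a[i][j];
    return sum;
}

/* Column-major traversal: consecutive iterations are COLS * sizeof(int)
 * bytes apart, so each access may fall in a different cache block. */
long sum_col_major(int a[ROWS][COLS]) {
    long sum = 0;
    for (size_t j = 0; j < COLS; j++)
        for (size_t i = 0; i < ROWS; i++)
            sum += a[i][j];
    return sum;
}
```

On most machines the second function tends to miss far more often for large arrays, even though it performs the same arithmetic.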

Memory Address
 Addresses are 64 bits wide on the shark machines, for example

 Block offset: b bits (the low-order bits)

 Set index: s bits (the next s bits)
 Tag bits: address size - b - s (the remaining high-order bits)
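
As a rough sketch of how these fields could be pulled out of an address with shifts and masks: the struct and function names below are hypothetical, and the code assumes 64-bit addresses with s + b less than 64.

```c
#include <stdint.h>

/* Hypothetical helper type: the three fields of a 64-bit address. */
struct addr_parts {
    uint64_t tag;
    uint64_t set_index;
    uint64_t block_offset;
};

/* Split addr given s set-index bits and b block-offset bits.
 * Assumes s + b < 64 so that no shift reaches the full word width. */
struct addr_parts split_address(uint64_t addr, unsigned s, unsigned b) {
    struct addr_parts p;
    p.block_offset = addr & ((UINT64_C(1) << b) - 1);       /* low b bits */
    p.set_index = (addr >> b) & ((UINT64_C(1) << s) - 1);   /* next s bits */
    p.tag = addr >> (s + b);                                /* everything else */
    return p;
}
```

For example, with s = 4 and b = 4 the address 0x12345 splits into tag 0x123, set index 4, and block offset 5.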

Cache
 A cache is a set of S = 2^s cache sets

 A cache set is a set of E cache lines

 E is called associativity
 If E=1, it is called “direct-mapped”

 Each cache line stores a block

 Each block has B = 2^b bytes

 Total capacity = S * B * E bytes
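
The capacity formula can be checked with a few lines of C. This is only a sketch with a hypothetical function name; it counts data bytes and ignores the valid bits and tags.

```c
#include <stdint.h>

/* Data capacity in bytes of a cache with 2^s sets, E lines per set,
 * and 2^b bytes per block. Valid bits and tags are not counted. */
uint64_t cache_capacity(unsigned s, unsigned E, unsigned b) {
    uint64_t S = UINT64_C(1) << s;   /* number of sets */
    uint64_t B = UINT64_C(1) << b;   /* bytes per block */
    return S * B * E;
}
```

For instance, s = 5, E = 1, b = 5 gives 32 * 32 * 1 = 1024 bytes, and s = 6, E = 8, b = 6 gives 64 * 64 * 8 = 32768 bytes.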

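Visual Cache Terminology
[Figure: a cache of S = 2^s sets with E lines per set. Each line holds a valid bit (v), a tag, and a cache block of B = 2^b data bytes numbered 0 to B-1. The address of a word splits into t tag bits, s set-index bits, and b block-offset bits; the block offset gives the position in the block where the data begins.]
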
General Cache Concepts
 Cache: smaller, faster, more expensive memory that caches a subset of the blocks (in the figure, blocks 8, 9, 14, and 3)
 Data is copied between the two levels in block-sized transfer units (in the figure, blocks 4 and 10)
 Memory: larger, slower, cheaper memory viewed as partitioned into “blocks” (in the figure, blocks 0 through 15)

General Cache Concepts: Miss
 Request: 12. Data in block b is needed (here b = 12)
 Block b is not in cache (the cache holds blocks 8, 9, 14, and 3): Miss!
 Block b is fetched from memory
 Block b is stored in cache
• Placement policy: determines where b goes
• Replacement policy: determines which block gets evicted (victim)

General Caching Concepts: Types of Cache Misses
 Cold (compulsory) miss
 The first access to a block has to be a miss

 Conflict miss
 Conflict misses occur when the level k cache is large enough, but multiple data objects all map to the same level k block
 E.g., if block i must be placed in slot (i mod 4), referencing blocks 0, 8, 0, 8, 0, 8, ... would miss every time

 Capacity miss
 Occurs when the set of active cache blocks (working set) is larger than the cache
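
The conflict-miss example can be reproduced with a tiny model. The sketch below assumes a direct-mapped cache of 4 one-block slots in which block i may only live in slot i mod 4; the slot count and function name are illustrative assumptions.

```c
#include <stddef.h>

#define NUM_SLOTS 4   /* assumed: direct-mapped cache with 4 one-block slots */

/* Count misses for a sequence of block numbers when block i may only
 * be stored in slot i % NUM_SLOTS. */
size_t count_direct_mapped_misses(const unsigned *blocks, size_t n) {
    unsigned slot_block[NUM_SLOTS];
    int slot_valid[NUM_SLOTS] = {0};
    size_t misses = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned slot = blocks[i] % NUM_SLOTS;
        if (!slot_valid[slot] || slot_block[slot] != blocks[i]) {
            misses++;                      /* cold or conflict miss */
            slot_valid[slot] = 1;
            slot_block[slot] = blocks[i];  /* the old occupant is evicted */
        }
    }
    return misses;
}
```

Blocks 0 and 8 both map to slot 0, so the sequence 0, 8, 0, 8, 0, 8 misses six times even though three of the four slots stay empty.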

Cache Simulator
 A cache simulator is NOT a cache!
 Memory contents are NOT stored
 Block offsets are NOT used: the low b bits of each address don’t matter
 Simply count hits, misses, and evictions

 Your cache simulator needs to work for different values of s, b, and E, given at run time.

 Use the LRU (Least Recently Used) replacement policy

 Evict the least recently used block from the cache to make room for the next block.
 Ways to track recency: queues? Time stamps?
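
One way to approach the time-stamp option is sketched below. It is a minimal illustration, not the lab's reference solution: every struct, field, and function name is hypothetical, and it assumes s + b is less than 64. Each access bumps a global clock, a hit refreshes the line's time stamp, and a miss replaces the line with the smallest time stamp in the set.

```c
#include <stdint.h>
#include <stdlib.h>

struct line { int valid; uint64_t tag; uint64_t last_used; };

struct sim {
    unsigned s, E, b;
    struct line *lines;   /* 2^s * E lines, stored set by set */
    uint64_t clock;       /* incremented on every access */
    unsigned long hits, misses, evictions;
};

/* Returns 0 on success, -1 if the allocation fails. */
int sim_init(struct sim *c, unsigned s, unsigned E, unsigned b) {
    c->s = s; c->E = E; c->b = b;
    c->clock = 0;
    c->hits = c->misses = c->evictions = 0;
    c->lines = calloc(((size_t)1 << s) * E, sizeof *c->lines);
    return c->lines ? 0 : -1;
}

void sim_access(struct sim *c, uint64_t addr) {
    uint64_t set = (addr >> c->b) & ((UINT64_C(1) << c->s) - 1);
    uint64_t tag = addr >> (c->s + c->b);   /* block offset bits are dropped */
    struct line *set_lines = c->lines + set * c->E;
    struct line *victim = &set_lines[0];

    c->clock++;
    for (unsigned i = 0; i < c->E; i++) {
        if (set_lines[i].valid && set_lines[i].tag == tag) {
            c->hits++;
            set_lines[i].last_used = c->clock;
            return;
        }
        /* Empty lines keep last_used == 0 and the clock starts at 1,
         * so an empty line is always preferred over a valid one. */
        if (set_lines[i].last_used < victim->last_used)
            victim = &set_lines[i];
    }
    c->misses++;
    if (victim->valid)
        c->evictions++;
    victim->valid = 1;
    victim->tag = tag;
    victim->last_used = c->clock;
}

void sim_free(struct sim *c) {
    free(c->lines);
    c->lines = NULL;
}
```

A queue-based design would avoid the scan for the smallest time stamp at the cost of more bookkeeping on every hit.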

Cache structure
 A cache is just a 2D array of cache lines:
 struct cache_line cache[S][E];
 S = 2^s is the number of sets
 E is the associativity

 Each cache_line has:

 Valid bit
 Tag
 LRU counter (only if you are not using a queue)
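
Because S and E are only known at run time, the 2D array usually has to live on the heap. The following is one possible layout, offered as a sketch: the field types and the cache_create/cache_destroy names are assumptions, not requirements of the lab.

```c
#include <stdint.h>
#include <stdlib.h>

struct cache_line {
    int valid;
    uint64_t tag;
    uint64_t lru_counter;   /* only needed for counter-based LRU */
};

/* Allocate S rows of E zeroed lines so that cache[set][line] works.
 * Returns NULL if any allocation fails. */
struct cache_line **cache_create(size_t S, size_t E) {
    struct cache_line **cache = malloc(S * sizeof *cache);
    if (cache == NULL)
        return NULL;
    for (size_t i = 0; i < S; i++) {
        cache[i] = calloc(E, sizeof **cache);
        if (cache[i] == NULL) {          /* undo the rows allocated so far */
            while (i > 0)
                free(cache[--i]);
            free(cache);
            return NULL;
        }
    }
    return cache;
}

/* Free every row, then the array of row pointers. */
void cache_destroy(struct cache_line **cache, size_t S) {
    if (cache == NULL)
        return;
    for (size_t i = 0; i < S; i++)
        free(cache[i]);
    free(cache);
}
```

calloc zeroes each row, so every line starts out with its valid bit clear. A single flat allocation of S * E lines indexed as set * E + line would work equally well.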

getopt
 getopt() automates parsing of options on the Unix command line
 Typically called in a loop to retrieve arguments
 Its return value is stored in a local variable
 When getopt() returns -1, there are no more options

 To use getopt, your program must include the header file

#include <unistd.h>

 If you are not running on the shark machines and the function declaration is missing, you will also need

#include <getopt.h>
 Better advice: run on the shark machines!

getopt
 A switch statement is used on the local variable holding the return value from getopt()
 Each command-line option can be handled in its own case
 “optarg” is an important variable: it points to the value of the option argument

 Think about how to handle invalid inputs

 For more information,

 look at man 3 getopt
 http://www.gnu.org/software/libc/manual/html_node/Getopt.html

getopt Example
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char** argv){
    int opt, x = 0, y = 0;
    /* loop over the command-line options */
    while(-1 != (opt = getopt(argc, argv, "x:y:"))){
        /* determine which option is being processed */
        switch(opt) {
            case 'x':
                x = atoi(optarg);
                break;
            case 'y':
                y = atoi(optarg);
                break;
            default:
                printf("wrong argument\n");
                break;
        }
    }
    printf("x = %d, y = %d\n", x, y);
    return 0;
}
 Suppose the program executable was called “foo”. Then we would call “./foo -x 1 -y 3” to pass the value 1 to variable x and 3 to y.

fscanf
 The fscanf() function is just like scanf() except that it can specify a stream to read from (scanf always reads from stdin)
 parameters:
 A stream pointer
 A format string with information on how to parse the file
 The rest are pointers to variables to store the parsed data
 You typically want to use this function in a loop. It returns the number of items it successfully matched, which is fewer than requested if the data doesn’t match the format string, and EOF (a negative value, typically -1) if it hits end of file before matching anything
 For more information,
 man fscanf
 http://crasseux.com/books/ctutorial/fscanf.html
 fscanf will be useful in reading lines from the trace files.
 L 10,1
 M 20,1
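
A loop over trace records is safest when it compares fscanf's return value against the number of fields it expects. The sketch below uses a hypothetical function name and assumes records of the form shown above (operation letter, hexadecimal address, decimal size); it stops at end of file or at the first line that does not match.

```c
#include <stdio.h>

/* Count well-formed trace records such as " M 20,1" in an open stream.
 * Stops at end of file or at the first record that does not match. */
int count_trace_records(FILE *fp) {
    char op;
    unsigned long address;
    int size;
    int count = 0;
    /* The leading space in the format skips whitespace, including the
     * newline left over from the previous record. */
    while (fscanf(fp, " %c %lx,%d", &op, &address, &size) == 3)
        count++;
    return count;
}
```

Testing for `== 3` rather than `> 0` prevents a partially matched line from being treated as a valid record.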

fscanf example
#include <stdio.h>

int main(void)
{
    FILE *pFile = fopen("tracefile.txt", "r"); // open file for reading
    if (pFile == NULL) {
        perror("tracefile.txt"); // report why the open failed
        return 1;
    }

    char identifier;
    unsigned long address; // wide enough for a 64-bit address
    int size;
    // Reading lines like " M 20,1" or "L 19,3"
    while (fscanf(pFile, " %c %lx,%d", &identifier, &address, &size) == 3)
    {
        // Do stuff
    }

    fclose(pFile); // remember to close file when done
    return 0;
}

Malloc/free
 Use malloc to allocate memory on the heap

 Always free what you malloc, otherwise you may get a memory leak
 some_pointer_you_malloced = malloc(sizeof(int));
 free(some_pointer_you_malloced);

 Don’t free memory you didn’t allocate

Page Replacement Algorithms
 When the cache is full, a cached block must be replaced with the new data
 The new data is being referenced right now
 To make space for the new data, choose a previously cached block as the victim
 Ideally, the victim is the block least likely to be used again

 Algorithms
 First-In-First-Out (FIFO) Algorithm
 Optimal Algorithm
 Least Recently Used (LRU) Algorithm
 Implementation of the algorithms in the cachelab
 LRU should be implemented by default
 FIFO and optimal algorithms could be implemented for extra credit

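First-In-First-Out (FIFO) Algorithm
 Reference string: 7,0,1,2,0,3,0,4,2,3,0,3,0,3,2,1,2,0,1,7,0,1
 Cache: 3 lines, where any block may occupy any line, so the replacement policy alone picks the victim
 Note that the number of sets must be 2^s in your actual cachelab.
 Page: data block
 FIFO evicts the block that has been in the cache the longest, regardless of how recently it was used

 A total of 15 cache misses occurred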
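
The miss count can be reproduced with a short simulation. This sketch assumes three interchangeable cache lines, so that any block can occupy any line; the function name is hypothetical. It keeps a circular index that always points at the line filled longest ago.

```c
#include <stddef.h>

#define NUM_LINES 3   /* assumed: three interchangeable cache lines */

/* Count misses for a reference string under FIFO replacement. */
size_t count_fifo_misses(const int *refs, size_t n) {
    int lines[NUM_LINES];
    size_t used = 0;     /* number of lines filled so far */
    size_t oldest = 0;   /* index of the line that was filled first */
    size_t misses = 0;
    for (size_t i = 0; i < n; i++) {
        int hit = 0;
        for (size_t j = 0; j < used; j++)
            if (lines[j] == refs[i]) { hit = 1; break; }
        if (hit)
            continue;    /* a hit does not change the FIFO order */
        misses++;
        if (used < NUM_LINES) {
            lines[used++] = refs[i];             /* cold miss: fill a free line */
        } else {
            lines[oldest] = refs[i];             /* evict the oldest block */
            oldest = (oldest + 1) % NUM_LINES;
        }
    }
    return misses;
}
```

Run on the reference string above, this model reports 15 misses.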

Optimal Algorithm
 Replace the page (data block) that will not be used for the longest period of time
 Used as a yardstick for measuring how well your algorithm performs

 A total of 9 cache misses occurred

 Issue: how do you know which block that is?
 Can’t read the future
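
Real hardware cannot look ahead, but a simulator that has the whole reference string in hand can. The sketch below, with hypothetical names and an assumed three interchangeable lines, evicts the block whose next use lies farthest in the future (or never comes).

```c
#include <stddef.h>

#define NUM_LINES 3   /* assumed: three interchangeable cache lines */

/* Position of the next use of block at or after index from, or n if
 * the block is never referenced again. */
static size_t next_use(const int *refs, size_t n, size_t from, int block) {
    for (size_t k = from; k < n; k++)
        if (refs[k] == block)
            return k;
    return n;
}

/* Count misses under the optimal (farthest-future-use) policy. */
size_t count_opt_misses(const int *refs, size_t n) {
    int lines[NUM_LINES];
    size_t used = 0, misses = 0;
    for (size_t i = 0; i < n; i++) {
        size_t j;
        for (j = 0; j < used; j++)
            if (lines[j] == refs[i])
                break;
        if (j < used)
            continue;                          /* hit */
        misses++;
        if (used < NUM_LINES) {
            lines[used++] = refs[i];           /* cold miss: fill a free line */
            continue;
        }
        size_t victim = 0, farthest = 0;
        for (j = 0; j < NUM_LINES; j++) {
            size_t when = next_use(refs, n, i + 1, lines[j]);
            if (when > farthest) { farthest = when; victim = j; }
        }
        lines[victim] = refs[i];
    }
    return misses;
}
```

On the reference string from the FIFO slide this reports 9 misses, which gives a lower bound to compare other policies against.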

Least Recently Used (LRU) Algorithm
 Use past knowledge rather than future
 Replace the page that has gone unused for the longest time
 Associate the time of last use with each page

 12 misses: better than FIFO but worse than OPT

 Generally a good algorithm and frequently used
 But how to implement it?

LRU Implementation
 Counter implementation
 Every page entry has a counter; every time the page is referenced through this entry, copy the clock into the counter
 When a page needs to be replaced, look at the counters to find the smallest value
 A search through the table is needed

 Stack implementation
 Keep a stack of page numbers in doubly linked form:
 Page referenced:
 move it to the top
 requires up to 6 pointers to be changed
 But each update is more expensive
 No search is needed for replacement
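
The stack approach can be sketched with a small doubly linked list. Everything here is illustrative: the names are hypothetical, the capacity is fixed at an assumed 3 lines, and a hit is found by a linear walk for simplicity (a real design would reach the node directly). The point is that the victim is always the bottom node, so replacement needs no search.

```c
#include <stddef.h>

#define NUM_LINES 3   /* assumed capacity, matching the 3-line example */

struct node { int page; struct node *prev, *next; };

struct lru_stack {
    struct node nodes[NUM_LINES];
    struct node *top;      /* most recently used */
    struct node *bottom;   /* least recently used: the next victim */
    size_t used;
};

static void unlink_node(struct lru_stack *s, struct node *n) {
    if (n->prev) n->prev->next = n->next; else s->top = n->next;
    if (n->next) n->next->prev = n->prev; else s->bottom = n->prev;
}

static void push_top(struct lru_stack *s, struct node *n) {
    n->prev = NULL;
    n->next = s->top;
    if (s->top) s->top->prev = n; else s->bottom = n;
    s->top = n;
}

/* Reference one page; returns 1 on a miss and 0 on a hit. */
int lru_reference(struct lru_stack *s, int page) {
    struct node *n;
    for (n = s->top; n != NULL; n = n->next) {
        if (n->page == page) {        /* hit: move the node to the top */
            unlink_node(s, n);
            push_top(s, n);
            return 0;
        }
    }
    if (s->used < NUM_LINES) {
        n = &s->nodes[s->used++];     /* cold miss: take an unused node */
    } else {
        n = s->bottom;                /* victim found without searching */
        unlink_node(s, n);
    }
    n->page = page;
    push_top(s, n);
    return 1;
}

/* Count misses for a whole reference string. */
size_t count_lru_misses(const int *refs, size_t n) {
    struct lru_stack s = { .top = NULL, .bottom = NULL, .used = 0 };
    size_t misses = 0;
    for (size_t i = 0; i < n; i++)
        misses += (size_t)lru_reference(&s, refs[i]);
    return misses;
}
```

On the reference string 7,0,1,2,0,3,0,4,2,3,0,3,0,3,2,1,2,0,1,7,0,1 this model reports 12 misses.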

LRU Implementation Example with Stack
