Real-Time Systems Presentation
IESE5
01/2017
INTRODUCTION
TILED ARCHITECTURES
EVALUATION SETUP
EXPERIMENTAL RESULTS
CONCLUSION
INTRODUCTION
Tiled CMPs
Tiled chip multiprocessors
Cache
A fast and small memory unit
Problem
With more than 16 cores in 65 nm and smaller processes
32 processors
Processor element (PE)
Cache (L1)
Remote cache access controller (RAC)
Data Placement and Remote Cache Accesses
Virtual address:
index the local L1 cache
perform a local TLB lookup to obtain the physical address
perform a local MAP lookup to obtain the identity of the home
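The three lookups above can be sketched in a few lines of Python. This is only an illustrative model, not the hardware described on the slide: the `Tile` class, the 4 KiB `PAGE_SIZE`, and the choice to key the MAP table by virtual page number are all assumptions made for the example.

```python
PAGE_SIZE = 4096  # assumed page size, not given on the slide


class Tile:
    """Hypothetical model of one tile with a local TLB and a local MAP table."""

    def __init__(self, tile_id):
        self.tile_id = tile_id
        self.tlb = {}       # virtual page number -> physical page number
        self.map_table = {} # virtual page number -> home tile id (assumed keying)

    def translate(self, vaddr):
        """Return (physical address, home tile) for a virtual address."""
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        ppn = self.tlb[vpn]         # local TLB lookup
        home = self.map_table[vpn]  # local MAP lookup
        return ppn * PAGE_SIZE + offset, home

    def access(self, vaddr):
        """Classify an access as served by the local L1 or by a remote cache."""
        paddr, home = self.translate(vaddr)
        kind = "local" if home == self.tile_id else "remote"
        return kind, home, paddr
```

In this sketch an access is "remote" whenever the MAP entry names another tile, which is the case the remote cache access controller (RAC) would handle.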
TILED ARCHITECTURES
Read-only data sharing
Read
The first processor to touch a page
Obtains a mapping while the OS marks the page as read-only
Other processors touch the page
Allowed to create a local mapping
Write
The first processor to write to a page
The OS intercepts the write and marks that processor as the owner
Subsequent reads by other processors with existing mappings
Allowed
Subsequent writes/reads without a local mapping
The OS generates an entry pointing to the owner node
The OS does not need to keep track of which processors are sharing the page.
Most of the state transitions occur only at the OS level, and the hardware state machine is fairly simple.
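As a rough illustration of the read and write transitions above, the following Python sketch models the OS-level bookkeeping. It is a toy under stated assumptions, not the mechanism's actual implementation: the `ToyOS` class and its method names are hypothetical, and the rule that a write to an already owned page is always redirected to the owner is an assumption, since the slide does not cover that case.

```python
class ToyOS:
    """Toy model of the page states on the slide (hypothetical names)."""

    def __init__(self):
        self.owner = {}         # page -> owner node, set on the first write
        self.read_only = set()  # pages touched by reads and not yet written
        # Stands in for each processor's local MAP entries. There is no
        # per-page list of sharers, matching the slide's claim.
        self.maps = {}          # (cpu, page) -> node that serves the access

    def read(self, cpu, page):
        key = (cpu, page)
        if key in self.maps:
            # Existing mapping: the read is allowed without OS involvement.
            return self.maps[key]
        if page in self.owner:
            # No local mapping and the page has an owner:
            # the OS generates an entry pointing to the owner node.
            self.maps[key] = self.owner[page]
        else:
            # Untouched or read-only page: create a local mapping.
            self.read_only.add(page)
            self.maps[key] = cpu
        return self.maps[key]

    def write(self, cpu, page):
        if page not in self.owner:
            # First write: the OS intercepts it and records the owner.
            self.owner[page] = cpu
            self.read_only.discard(page)
        # Assumption: every write is served by the owner node.
        self.maps[(cpu, page)] = self.owner[page]
        return self.maps[(cpu, page)]
```

For example, if processors 0 and 1 read a page and processor 2 then writes it, processor 1 keeps its local mapping while a later read from processor 3 is pointed at node 2.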
EVALUATION SETUP
Performance analysis: SPLASH-2 and ALPBench benchmarks.
The benchmarks were compiled with gcc 3.4.4 and glibc 2.3.5 for PowerPC.
Liberty Simulation Environment (LSE) simulator
EXPERIMENTAL RESULTS
We start by comparing the overall performance of our architecture against the hardware distributed directory system.
Speedup results
CONCLUSION
Proposal of a novel cost-effective software/hardware mechanism to support memory
New mechanism