
The Changing Memory Hierarchy


07/03/2017 The changing memory hierarchy

One of the main ways to increase system performance is minimising how far down the memory hierarchy
one has to go to manipulate data. It's not just system level programmers that need to be aware of these issues,
as most systems have a time/cost requirement, be it how fast your web application responds, or how many
racks you need in your data center.

In this era of multiple CPUs per system this can be further complicated for programmers due to memory
contention between each CPU. Also, virtualization introduces further complications. Consider the following
diagram which shows the memory hierarchy currently in a 4 socket by 4 core system, which Ulrich Drepper
mentions is going to be a common system in his excellent paper on computer memory.

[Update Sep 2010: Note the organisation of cache levels in multicore CPUs can vary quite a bit]

http://www.pixelbeat.org/docs/memory_hierarchy/ 1/3

[Update Oct 2010: hwloc is a handy tool for automatically generating diagrams like these]

Up until lately we've just had incremental improvements to the performance (not size) of RAM and
mechanical hard disks, and CPU performance has diverged from them a lot. So changes to the memory
hierarchy would both speed systems up a lot, and simplify software running on the CPU. It's these exciting
changes that are happening now and in the next few years that I'm focusing on here.

[Update Oct 2015: As stated above, the divergence in speed between main memory and CPUs implies much
more performance for efficient use of the CPU caches. This is demonstrated in profiling hardware events,
where adjusting the memory size and access pattern reduces the access depth in the memory hierarchy, thus
greatly increasing performance. Now often it's not possible or practical to adjust all memory accesses, and so
Intel, as of the Broadwell microarchitecture (Sep 2014), has made CAT (Cache Allocation Technology)
available (in certain XEON processors to start), which allows one to dynamically partition the shared cache,
to limit what part of the cache can be written to by a core. In this way, restricting VMs/containers/apps/...
to a core will restrict them to evicting only part of the shared cache across cores, resulting in more efficient
utilization of the system. This is well explained in Dan Luu's summary of CAT advantages. Partitioning
functionality like this will also improve security isolation, and protect against side channel attacks. In future,
dynamic cache allocation will probably become available on most CPUs and across more cache levels.]
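
To make the eviction argument concrete, here is a minimal sketch in Python. It is a toy model only, not how CAT is implemented: real hardware partitions a set-associative cache by ways and only constrains where a core may fill lines, whereas this sketch assumes a fully associative LRU cache and models a partition as two separate smaller caches. The names LRUCache and victim_hit_rate, and all the sizes, are hypothetical and chosen for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Toy fully associative LRU cache that tracks lines by address."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()

    def access(self, addr):
        """Return True on a hit; on a miss, insert addr, evicting the LRU line if full."""
        if addr in self.lines:
            self.lines.move_to_end(addr)  # mark as most recently used
            return True
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)  # evict least recently used
        self.lines[addr] = None
        return False


def victim_hit_rate(victim_cache, noisy_cache, rounds=100, working_set=4, burst=16):
    """Interleave a small looping working set with a streaming neighbour.

    Passing the same cache object twice models an unpartitioned shared cache;
    passing two objects models a partitioned one (an assumption of this sketch).
    """
    hits = total = 0
    stream_addr = 0
    for _ in range(rounds):
        for a in range(working_set):
            hits += victim_cache.access(("victim", a))
            total += 1
        for _ in range(burst):
            noisy_cache.access(("noisy", stream_addr))  # never reused
            stream_addr += 1
    return hits / total


# Unpartitioned: both workloads share one 16-line cache.
shared = LRUCache(16)
shared_rate = victim_hit_rate(shared, shared)

# Partitioned: each workload may only evict lines in its own 8-line half.
victim_part, noisy_part = LRUCache(8), LRUCache(8)
partitioned_rate = victim_hit_rate(victim_part, noisy_part)

print(f"shared: {shared_rate:.2f}  partitioned: {partitioned_rate:.2f}")
```

In this toy setup the streaming neighbour flushes the small working set out of the shared cache on every round, while the partitioned run only misses on the first pass, even though the victim has half the capacity available to it.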

[Update Sep 2015: Note cache coherence is a big limitation to the number of cores possible, and a new
"tardis" cache coherence model promises to remove the linear increase in cache accounting memory per
core. It works by tagging the operations with a counter to order reads/writes, thus allowing cores to operate
on older data if that suffices. Generation counters are useful for relative ordering rather than trying to
synchronize with the universe with timestamps or something. I proposed on lkml (and still stand by) a similar
mechanism for relative ordering of files within a file system. Distributed cores/file systems can use higher
level methods for coherence, but within the "system" counters have an advantage.]
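
The general idea of ordering by generation counter rather than wall clock time can be sketched as follows. This is not the tardis protocol itself, just a minimal illustration of relative ordering under simple assumptions: a single shared store, a single counter, and hypothetical names (GenerationStore, CachingReader, min_generation) invented for the example.

```python
import itertools


class GenerationStore:
    """Toy key/value store that stamps every write with a generation number."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._data = {}  # key -> (generation, value)

    def write(self, key, value):
        """Store value and return the generation assigned to this write."""
        gen = next(self._counter)
        self._data[key] = (gen, value)
        return gen

    def read(self, key):
        """Return (generation, value) for key."""
        return self._data[key]


class CachingReader:
    """Reader that keeps using an older cached copy when that suffices."""

    def __init__(self, store):
        self.store = store
        self.cache = {}  # key -> (generation, value)
        self.refetches = 0

    def read(self, key, min_generation=0):
        """Return the cached value unless it is older than min_generation."""
        entry = self.cache.get(key)
        if entry is None or entry[0] < min_generation:
            entry = self.store.read(key)
            self.cache[key] = entry
            self.refetches += 1
        return entry[1]


store = GenerationStore()
reader = CachingReader(store)

g1 = store.write("x", "old")
first = reader.read("x")                       # fetched from the store
g2 = store.write("x", "new")
stale_ok = reader.read("x")                    # older data suffices, no refetch
fresh = reader.read("x", min_generation=g2)    # must observe the second write
```

The reader never compares clocks; it only asks whether its copy is at least as new as the generation it needs, which is the property that makes counters attractive within a single system.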

Solid State Disks
Consider for example how SSDs affect processing of a large file on a multicore system. Because random
seeks are of no extra cost on SSDs compared to mechanical disks, it's sensible for multiple cores to process
separate portions of a file directly. With mechanical disks each core would just be fighting over the
mechanical disk head, and slow down a lot compared to just a single core processing the file. In other words,
data partitioning to take advantage of multiple cores is much more complicated for mechanical disks than for
SSDs, requiring more complex logic and arrays of disks to achieve parallelization. Note for certain
operations like sorting, one has to take RAM size into account, so the cores should process chunks of the file
in parallel where each chunk is ((ram size / num cpus) - a bit). For other operations like searching for example,
RAM size is not a factor, and one can just split the file into chunks of (file size / num cpus). [Update Dec
2012: Given the widening disparity between traditional disks and SSDs, they're separating out to distinct
layers in the memory hierarchy. To take advantage of this, hybrid drives are becoming available, as is
software to transparently combine separate drives, like SRT or Linux solutions like bcache.] [Update Jan
2016: ACM Queue discussion on faster nonvolatile storage "it is rare that the performance assumptions that
we make about an underlying hardware component change by 1,000x".]
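
The two chunk size rules above can be sketched as below. This is a minimal illustration, not a complete parallel sort or search: the function names and the headroom parameter (standing in for "a bit") are hypothetical, sizes are in bytes, and a real tool would also need to move each boundary to the nearest record or line boundary.

```python
def chunk_ranges(file_size, num_chunks):
    """Split [0, file_size) into num_chunks contiguous (start, end) byte ranges."""
    base, extra = divmod(file_size, num_chunks)
    ranges = []
    start = 0
    for i in range(num_chunks):
        # Spread the remainder over the first `extra` chunks.
        end = start + base + (1 if i < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges


def search_chunks(file_size, num_cpus):
    """Searching: RAM size is not a factor, so use one chunk per CPU."""
    return chunk_ranges(file_size, num_cpus)


def sort_chunk_size(ram_size, num_cpus, headroom):
    """Sorting: each chunk must fit in one CPU's share of RAM, minus some headroom."""
    return max(1, ram_size // num_cpus - headroom)


def sort_chunks(file_size, ram_size, num_cpus, headroom):
    """Cover the file with chunks of at most sort_chunk_size bytes."""
    size = sort_chunk_size(ram_size, num_cpus, headroom)
    return [(start, min(start + size, file_size))
            for start in range(0, file_size, size)]


if __name__ == "__main__":
    print(search_chunks(10, 4))
    print(sort_chunks(100, ram_size=64, num_cpus=4, headroom=6))
```

For sorting, the file is usually larger than RAM, so there are more chunks than CPUs and each core works through several of them in turn before a final merge.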

2 Transistor DRAM
2T DRAM, currently being developed by Intel, has the potential to enhance caches in CPUs at least. You can
see in the diagram above that the level 2 cache can be used both to speed access to the relatively slow RAM
and to speed up communication between cores in a single processor. When this memory wall is lowered it again
gives the opportunity to use different algorithms, especially on multicore systems. Tian Tian of Intel has
written a good article on how shared caches enhance a multicore system and how programmers can take
further advantage of them. There is also another good ACM article on optimizing application performance in
the presence of caches, and this excellent presentation on lock free algorithms taking consideration of the
current memory hierarchy. [Update Dec 2008: I noticed an IEEE reference to a Sandia National Laboratories
simulation, which showed that for many applications, the memory wall with current architectures causes
performance to decline with greater than 8 processors, so it looks like technology like 2T DRAM will be
required in the near future.]

MRAM and Memristors
These technologies have the potential to be the biggest game changers. They're essentially very fast
nonvolatile memory, and so will affect both current RAM and flash technologies.

MRAM has been in development for a while, but while being about twice as fast as current RAM
technologies, it's much more expensive. However researchers in Germany have recently figured out how to
make it 10 times faster again!

Memristors have recently been created by HP labs and again they have the potential to be a fast, dense,
cheap, nonvolatile memory. The memristor was first theorized in 1971 by Leon Chua, being a fourth
fundamental circuit element, having properties that cannot be achieved by any combination of the other three
elements (resistor, inductor, capacitor). [Update Sep 2010: Memristors will be available by 2014 apparently.]
[Update Nov 2011: You can apparently make homemade memristors :)] [Update Jun 2014: Informative
memristor info and roadmap from HP] Interesting times...

[Update Jul 2015: 3D XPoint was announced by Intel/Micron to be available in 2016. Mostly marketing for
now, but as a transistorless nonvolatile technology, it has the potential to be another level in the hierarchy
under DRAM at first, eventually replacing it altogether.]

Aug 19 2008
