Memory Barriers: a Hardware View for Software Hackers

Paul E. McKenney
Linux Technology Center
IBM Beaverton
paulmck@linux.vnet.ibm.com

July 23, 2010

So what possessed CPU designers to cause them to inflict memory barriers on poor unsuspecting SMP software designers?

In short, because reordering memory references allows much better performance, and so memory barriers are needed to force ordering in things like synchronization primitives whose correct operation depends on ordered memory references.

Getting a more detailed answer to this question requires a good understanding of how CPU caches work, and especially what is required to make caches really work well. The following sections:

1. present the structure of a cache,
2. describe how cache-coherency protocols ensure that CPUs agree on the value of each location in memory, and, finally,
3. outline how store buffers and invalidate queues help caches and cache-coherency protocols achieve high performance.

We will see that memory barriers are a necessary evil that is required to enable good performance and scalability, an evil that stems from the fact that CPUs are orders of magnitude faster than are both the interconnects between them and the memory they are attempting to access.

1 Cache Structure

Modern CPUs are much faster than are modern memory systems. A 2006 CPU might be capable of executing ten instructions per nanosecond, but will require many tens of nanoseconds to fetch a data item from main memory. This disparity in speed, more than two orders of magnitude, has resulted in the multi-megabyte caches found on modern CPUs. These caches are associated with the CPUs as shown in Figure 1, and can typically be accessed in a few cycles.[1]

[Figure: CPU 0 and CPU 1, each with its own cache, connected through an interconnect to memory]
Figure 1: Modern Computer System Cache Structure

[1] It is standard practice to use multiple levels of cache, with a small level-one cache close to the CPU with single-cycle access time, and a larger level-two cache with a longer access time, perhaps roughly ten clock cycles. Higher-performance CPUs often have three or even four levels of cache.

Data flows among the CPUs' caches and memory in fixed-length blocks called "cache lines", which are normally a power of two in size, ranging from 16 to 256 bytes. When a given data item is first accessed by a given CPU, it will be absent from that CPU's cache,