The Essence of Caching
Greg Luck, Founder and CTO Ehcache, Terracotta
JavaOne 2011 Session 24241
Tuesday, 4 October 2011

The Problem
[Diagram labels: Application (x4), Data Store, Average Response Time > 100 ms]
Speed? Cost? Scalability?

Types of Scaling
[Diagram labels: Scale UP, Commodity Server, Application, Scale OUT]

Capacity Planning - Peak Load
[Book page shown on slide; running header: WORKLOADS AND SOFTWARE INFRASTRUCTURE, p. 25]
[Figure overlay label: Google's Web Searches (1 Datacenter)]
FIGURE 2.2: Example of daily traffic fluctuation for a search service in one datacenter; x-axis is a 24-h period and the y-axis is traffic measured in queries per second.

[Book excerpt shown on slide:]
known to provide good-quality similarity scores. Here we consider one such type of analysis, called co-citation. The underlying idea is to count every article that cites articles A and B as a vote for the similarity between A and B. After that is done for all articles and appropriately normalized, we obtain a numerical score for the (co-citation) similarity between all pairs of articles and create a data structure that for each article returns an ordered list (by co-citation score) of similar articles. This data structure is periodically updated, and each update then becomes part of the serving state for the online service.

The computation starts with a citation graph that creates a mapping from each article identifier to a set of articles cited by it. The input data are divided into hundreds of files of approximately the same size (e.g., this can be done by taking a fingerprint of the article identifier, dividing it by the number of input files, and using the remainder as the file ID) to enable efficient parallel execution. We use a sequence of MapReduce runs to take a citation graph and produce co-citation similarity score vector for all articles. In the first Map phase, we take each citation list (A1, A2, A3, ..., An) and generate all pairs of documents in the citation list, feeding them to the Reduce phase, which counts all occurrences of each pair. This first step results in a structure that associates all pairs of co-cited documents with a co-citation count. Note that this becomes much less than a quadratic

The Elephant Curve

Desirable Properties of a Solution

A Performance
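
The book excerpt quoted on the earlier slide describes two mechanical steps that may be easier to follow in code: assigning each article to an input file by taking a fingerprint modulo the number of files, and a first Map/Reduce pass that counts co-cited pairs. The following is a minimal single-process sketch in Python, not the implementation the excerpt refers to. The function names, the use of SHA-256 as the "fingerprint", and the tiny citation graph are all assumptions made for illustration, and the normalization step mentioned in the excerpt is left out.

```python
import hashlib
from collections import Counter
from itertools import combinations


def shard_id(article_id: str, num_files: int) -> int:
    """Pick an input file for an article: fingerprint modulo file count.

    SHA-256 is an assumed stand-in for the unspecified fingerprint function.
    """
    digest = hashlib.sha256(article_id.encode("utf-8")).digest()
    fingerprint = int.from_bytes(digest[:8], "big")
    return fingerprint % num_files


def map_pairs(citation_list):
    """Map phase: emit every unordered pair of distinct cited documents."""
    # Sorting gives each pair one canonical order, so (A, B) and (B, A)
    # are counted under the same key.
    for pair in combinations(sorted(set(citation_list)), 2):
        yield pair


def reduce_counts(pairs):
    """Reduce phase: count how many times each pair was emitted."""
    return Counter(pairs)


def co_citation_counts(citation_graph):
    """Run the map and reduce steps over a whole citation graph."""
    all_pairs = (
        pair
        for cited in citation_graph.values()
        for pair in map_pairs(cited)
    )
    return reduce_counts(all_pairs)


# Hypothetical citation graph: citing article -> articles it cites.
citation_graph = {
    "P1": ["A", "B", "C"],
    "P2": ["A", "B"],
    "P3": ["B", "C"],
}

counts = co_citation_counts(citation_graph)
for pair, count in sorted(counts.items()):
    print(pair, count)
```

In this toy graph, A and B are cited together by P1 and P2, so their raw co-citation count is 2. A real MapReduce job would run the map step per input file in parallel and group pairs by key before reducing; the generator pipeline here only imitates that data flow in one process.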