Data Science in TalkingData
Speaker: Zhang Xiatian (張夏天), Chief Data Scientist, TalkingData

Data in TalkingData
China's largest independent mobile data platform
- Established in 2011
- Headquarters in Beijing
- Three rounds of VC financing
- 650 mln+ monthly active unique devices
- 100,000+ apps with SDK integrated
- 30 mln daily mobile ad clicks: China's largest mobile ad tracking platform
- 200 mln+ monthly device panel on app install & usage

Challenges in TalkingData
Big Data
- Volume
- Velocity
- Variety
- Variability
- Veracity
- Unreadable data
Various Applications
- Finance
- Retail
- Real estate
- …

Data Science in TalkingData
Learning on Big Data
- Fregata
- Myna
- Event data mining
Improve Efficiency of Data Science
- SmartDataLab
- AutoModel
Applications
- Lookalike
- Recommender system
- Demographic cognition
- Churn alert
- Context awareness
- Indoor positioning
- ……
Open
- Business partners
- Academic partners
- Education
- ……

Learning on Big Data
Fregata (Open Source)
- Large-scale machine learning library on Spark
Myna (Open Source)
- Context-awareness framework for Android
Event Data Mining
- Event data management solution
- Event data & unreadable data mining

The Road To High Performance ML Algorithms: Fregata's Approach
Remove Hyperparameters
- Greedy step averaging optimization method
Low Cost Parallelization Method
- Model averaging method
- Convergence with only one scan of the whole data
Compress Model Sizes
- Expand model capacity on a single node by a factor of 1,000

Greedy Step Averaging
https://arxiv.org/abs/1611.03608

Convergence of GSA

GSA vs SGD

GSA vs Adadelta

GSA vs SCSG

Parallelization
Gradient Averaging
- High cost on training stage
- $w_t = w_{t-1} - \frac{\eta}{n}\sum_{i=1}^{n}\nabla L_i(w_{t-1})$
Model Averaging
- Suitable for Spark
- $w_t = \frac{1}{n}\sum_{i=1}^{n} w_{t-1,i}$
Score Averaging
- High cost on scoring stage
- $\hat{y} = \frac{1}{m}\sum_{j=1}^{m}\hat{y}_j$

Convergence of Model Averaging
The model averaging method can approach the optimal model for linear problems with a very large amount of training data.

Fregata vs. MLLib: Logistic Regression

Fregata vs. MLLib: Softmax on MNIST

Model Compression
Discretize parameter values by K-Means
- Typically, discretize parameter values to 128 buckets.
- Then we can use 7 bits to encode a bucket, and build a mapping index to the discretized parameter values.
Compress the resulting model bitmap by Roaring Bitmaps

Model Compression: Accuracy
Compressed Model (128 buckets) | Original Model
Dat
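
The K-Means discretization described on the Model Compression slide can be illustrated with a minimal sketch. This is not Fregata's code: the function names (`kmeans_1d`, `nearest`, `quantize`, `pack7`, `unpack7`) are hypothetical, the clustering is a plain one-dimensional Lloyd's iteration chosen for illustration, and the Roaring Bitmaps step is left out. The sketch only shows why 128 buckets fit in 7 bits per parameter and how a codebook maps bucket codes back to parameter values.

```python
import random
from bisect import bisect_left


def nearest(centroids, v):
    """Index of the centroid closest to v; centroids must be sorted."""
    i = bisect_left(centroids, v)
    if i == 0:
        return 0
    if i == len(centroids):
        return len(centroids) - 1
    return i if centroids[i] - v < v - centroids[i - 1] else i - 1


def kmeans_1d(values, k, iters=10, seed=0):
    """Plain Lloyd's algorithm on scalars; returns k sorted centroids."""
    rng = random.Random(seed)
    centroids = sorted(rng.sample(values, k))
    for _ in range(iters):
        sums = [0.0] * k
        counts = [0] * k
        for v in values:
            j = nearest(centroids, v)
            sums[j] += v
            counts[j] += 1
        # An empty bucket keeps its previous centroid.
        centroids = sorted(
            sums[j] / counts[j] if counts[j] else centroids[j]
            for j in range(k)
        )
    return centroids


def quantize(weights, k=128):
    """Return (codebook, codes): k centroids and one bucket code per weight."""
    codebook = kmeans_1d(weights, k)
    codes = [nearest(codebook, w) for w in weights]
    return codebook, codes


def pack7(codes):
    """Pack bucket codes (each in 0..127) into bytes, 7 bits per code."""
    acc = 0
    for c in codes:
        acc = (acc << 7) | c
    return acc.to_bytes((7 * len(codes) + 7) // 8, "big")


def unpack7(data, n):
    """Inverse of pack7 for n codes."""
    acc = int.from_bytes(data, "big")
    return [(acc >> (7 * (n - 1 - i))) & 0x7F for i in range(n)]


if __name__ == "__main__":
    rng = random.Random(42)
    weights = [rng.gauss(0.0, 1.0) for _ in range(2000)]  # stand-in model parameters
    codebook, codes = quantize(weights)
    packed = pack7(codes)
    print(len(packed), "bytes packed vs", 8 * len(weights), "bytes as float64")
```

A parameter is restored as `codebook[code]`, so the stored model consists of the 128-entry codebook plus the packed 7-bit codes.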