Institute of Computing Technology, Chinese Academy of Sciences
Doctoral Dissertation

Research on Data Mining in the Scientific Data Grid

Qiang Tong (Computer Architecture)
Directed by Baoping Yan
Degree applied for: Doctorate
2006-06-01

With the emergence and development of grid computing, it has become possible to share data and collaborate at large scale across organizations and regions. In scientific research, problems are becoming more and more complex, which has given rise to a brand-new model of scientific collaboration and to large science projects, i.e., the informatization of scientific research (e-Science). In order to share resources and products, and also to collaborate on large-scale modern scientific research, it is necessary to establish an allied virtual research group via the Internet based on grid computing. By using data mining technologies, this paper aims to improve the service level of the Scientific Data Grid and the Scientific Database, based on their existing large-scale data storage and powerful computing capabilities. The main research contents and contributions are listed as follows.

(1) Based on detailed analyses of the data mining properties of the Scientific Data Grid, a scientific data mining system is proposed. The system consists of three main components: the Scientific Data Mining Architecture (SDMA), the Scientific Data Mining Toolkit (SDMK), and the Scientific Data Mining Service (SDMS). SDMA describes the multi-dimensional model architecture of data mining applications; SDMK provides a large number of data preprocessing and data mining algorithms; SDMS presents a data mining scheme that addresses the problems arising in the grid environment in the form of a grid service. Compared with traditional data mining systems, the proposed system has many excellent properties and is more suitable for the environment of the Scientific Data Grid and the Scientific Database. It has already been applied in some real database applications. Besides simple query and search functions, the proposed system can also perform more advanced functions such as data statistics, data analysis, and knowledge discovery. As a result, the service level of the database is improved.

(2) Clustering in data mining is a discovery process which groups a set of data such that the intra-cluster similarity is maximized and