2012 IEEE 28th International Conference on Data Engineering

Integrating Frequent Pattern Mining from Multiple Data Domains for Classification

Dhaval Patel, Wynne Hsu, Mong Li Lee
National University of Singapore, Singapore
{dhaval,whsu,leeml}@comp.nus.edu.sg

Abstract—Many frequent pattern mining algorithms have been developed for categorical, numerical, time series, or interval data. However, little attention has been given to integrating these algorithms so as to mine frequent patterns involving multiple data domains for classification. In this paper, we introduce the notion of a heterogenous pattern that captures the associations among different data domains. We propose a unified framework for mining multiple domains and design an iterative algorithm called HTMiner. HTMiner discovers essential heterogenous patterns for classification and performs instance elimination. This instance elimination step reduces the problem size progressively by removing training instances which are correctly covered by the discovered essential heterogenous pattern. Experiments on two real world datasets show that HTMiner is efficient and can significantly improve the classification accuracy.

I. INTRODUCTION

Many database applications involve records with attributes from different data domains. For example, in a clinical ap- [...]

[...] to mine useful frequent patterns for classification [10], [24], [14], [5], [6].

A straightforward method to discover heterogenous patterns is to apply different frequent pattern mining algorithms for the different data domains, followed by an exhaustive combination of the discovered patterns. A quick calculation reveals that this approach is computationally infeasible. A small dataset with 10 categorical attributes, 20 numerical attributes, 10 events, and 10 days of 10 time series data would result in the generation of 2^10 frequent itemsets [22], 2^20 frequent intervals [9], 10^10 frequent temporal patterns [14], and 10^10 time motif patterns [15]. The combination of these patterns is of the order 2^58, and only a subset of these patterns is useful for classification. Clearly, we need a more integrated approach to discover useful patterns from different data domains for effective classification.

Early works on heterogenous patterns are limited to mining from at most two different kinds of data [17