Model Order Selection for Boolean Matrix Factorization

Pauli Miettinen
Max Planck Institute for Informatics
Saarbrücken, Germany
pmiettin@mpi-inf.mpg.de

Jilles Vreeken
Dept. of Mathematics and Computer Science
Universiteit Antwerpen, Belgium
jilles.vreeken@ua.ac.be

ABSTRACT

Matrix factorizations, where a given data matrix is approximated by a product of two or more factor matrices, are powerful data mining tools. Among other tasks, matrix factorizations are often used to separate global structure from noise. This, however, requires solving the 'model order selection problem' of determining where fine-grained structure stops and noise starts, i.e., what the proper size of the factor matrices is.

Boolean matrix factorization (BMF), where data, factors, and matrix product are Boolean, has received increased attention from the data mining community in recent years. The technique has desirable properties, such as high interpretability and natural sparsity. But so far no method for selecting the correct model order for BMF has been available. In this paper we propose to use the Minimum Description Length (MDL) principle for this task. Besides solving the problem, this well-founded approach has numerous benefits, e.g., it is automatic, does not require [...]

[...] called a low-dimensional representation of the data, and is usually obtained using some form of matrix factorization.

In matrix factorizations the input data (represented as a matrix) is decomposed into two (or more) factor matrices. Usually the aim is to have low-dimensional factor matrices whose product approximates the original matrix well. By imposing different constraints, one obtains different factorizations. Perhaps the two best-known factorizations are Singular Value Decomposition (SVD), closely related to Principal Component Analysis (PCA), and Non-negative Matrix Factorization (NMF). SVD and PCA restrict the factor matrices to be orthogonal, while NMF requires the data and the factor matrices to be non-negative.

When the input data is Boolean (that is, contains only 0s and 1s, as is typical with supermarket basket data), one can apply Boolean Matrix Factorization (BMF). Similarly to NMF, it restricts the factor matrices for added interpretability and sparsity. In BMF, the factor matrices are required to be Boolean, i.e., contain only 0s and 1s.
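To make the Boolean setting concrete, the following minimal Python sketch shows a Boolean matrix product, in which 1 + 1 = 1, and how the approximation changes with the number of factors. The function names `boolean_product` and `reconstruction_error` and the toy matrices are illustrative assumptions, not taken from the paper. Counting disagreeing cells is likewise only one plausible error measure, since this excerpt does not define one.

```python
import numpy as np

def boolean_product(B, C):
    """Boolean matrix product: entry (i, j) is OR over l of (B[i, l] AND C[l, j])."""
    B = np.asarray(B, dtype=bool)
    C = np.asarray(C, dtype=bool)
    # The integer product counts the l for which both entries are 1;
    # the Boolean product is 1 wherever that count is positive.
    return (B.astype(int) @ C.astype(int)) > 0

def reconstruction_error(A, B, C):
    """Number of cells where A and the Boolean product of B and C disagree (assumed measure)."""
    A = np.asarray(A, dtype=bool)
    return int(np.sum(A ^ boolean_product(B, C)))

# Hypothetical toy factors: 3 rows, k = 2 factors, 3 columns.
B = np.array([[1, 0],
              [1, 1],
              [0, 1]])
C = np.array([[1, 1, 0],
              [0, 1, 1]])

# Data generated exactly by the two factors.
A = boolean_product(B, C)

# The ordinary product gives 2 where both factors cover a cell;
# the Boolean product gives 1 there.
ordinary = B @ C

err_k2 = reconstruction_error(A, B, C)                # both factors
err_k1 = reconstruction_error(A, B[:, :1], C[:1, :])  # first factor only
print(err_k2, err_k1)
```

Here the model order k is the number of columns of `B` (equivalently, rows of `C`). In this toy case two factors reproduce `A` exactly, while keeping only the first factor leaves three cells wrong. The model order selection problem asks for the k at which extra factors stop capturing structure and start fitting noise.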