Computer Engineering and Applications, 2012, 48(1)

Doctoral Forum

Empirical study of sentiment classification for Chinese microblog based on machine learning

LIU Zhiming, LIU Lu

School of Economics and Management, Beihang University, Beijing 100191, China

LIU Zhiming, LIU Lu. Empirical study of sentiment classification for Chinese microblog based on machine learning. Computer Engineering and Applications, 2012, 48(1): 1-4.

Abstract: With the development of microblogs, it has become more convenient to comment on the Web. To date there have been very few studies on sentiment classification for Chinese microblogs. This paper therefore uses three machine learning algorithms, three feature selection methods, and three feature weighting methods to study sentiment classification for Chinese microblogs. The experimental results indicate that SVM performs best among the three machine learning algorithms, that IG is a better feature selection method than the others, and that TF-IDF is the weighting best suited to sentiment classification of Chinese microblogs. Combining the three factors, the conclusion can be drawn that the combination of SVM, IG, and TF-IDF performs best. For the movie domain, it is found that sentiment classification depends on the review style.

Keywords: microblog; sentiment classification; machine learning; feature selection; term weight

Abstract (translated from the Chinese): An empirical study of microblog sentiment classification was carried out using three machine learning algorithms, three feature selection algorithms, and three term weighting methods. The experimental results show that, depending on the term weighting method used, the support vector machine (SVM) and the Naïve Bayes classifier each have their own advantages, while information gain (IG) performs clearly better than the other feature selection methods. Taking all three factors into account, the combination of SVM, IG, and TF-IDF (Term Frequency-Inverse Document Frequency) as the term weight gives the best sentiment classification results on microblogs. For the movie domain, the generality of classification models across microblog reviews and ordinary reviews was compared, and the experimental results show that sentiment classification performance depends on the style of the reviews.

DOI: 10.3778/j.issn.1002-8331.2012.01.001

Article ID: 1002-8331(