Research and Implementation of a Chinese Word Segmentation Algorithm Based on Bidirectional Matching
Shijiazhuang University of Economics, Undergraduate Graduation Thesis

Abstract

Chinese word segmentation is the foundation of Chinese information processing tasks such as information extraction, information retrieval, machine translation, text classification, automatic summarization, speech recognition, text-to-speech conversion, and natural language understanding. Although it has been studied for many years, Chinese word segmentation remains one of the bottlenecks of Chinese information processing. This thesis first analyzes, summarizes, and categorizes existing segmentation algorithms, and discusses two major problems that Chinese word segmentation has long struggled to solve well: ambiguity recognition and unregistered (out-of-vocabulary) words. Then, working from a dictionary, it combines forward maximum matching with reverse maximum matching to obtain a bidirectional matching segmentation algorithm, and it uses a dictionary mechanism proposed by the author (the sub-dictionary mechanism) to implement a Chinese word segmentation system based on the bidirectional matching algorithm.

Keywords: Chinese word segmentation; bidirectional matching; sub-dictionary mechanism

ABSTRACT

Chinese word segmentation is the basis of information extraction, information retrieval, machine translation, text categorization, automatic summarization, speech recognition, text-to-speech, natural language understanding, and other areas of Chinese information processing. Although Chinese word segmentation has been studied for many years, it is still one of the bottlenecks of Chinese information processing. First, this paper analyzes and summarizes the existing segmentation algorithms and discusses two major problems that Chinese word segmentation has not yet solved well: ambiguity recognition and unregistered words. Then, on the basis of a dictionary, forward maximum matching and reverse maximum matching are combined to form a two-way matching word segmentation algorithm, and a dictionary mechanism proposed by the author (the sub-dictionary mechanism) is used to implement a Chinese word segmentation system based on the two-way matching algorithm.

Keywords: Chinese word segmentation; two-way matching; sub-dictionary mechanism

Contents

Abstract
ABSTRACT
1 Introduction
1.1 Research background, purpose, and significance
1.2 Current state of Chinese word segmentation
1.3 Main innovations of this thesis
1.4 Project tasks and thesis structure
2 Overview of Chinese word segmentation