2024-03-29T08:48:32Z
https://meral.edu.mm/oai
oai:meral.edu.mm:recid/4302
2021-12-13T08:06:27Z
1582963302567:1597824322519
user-ucsy
Myanmar Word Stemming and POS Tagging using Rule Based Approach
Minn, Kyaw Htet
The Myanmar language, the official language of the Republic of the Union of Myanmar, is spoken by more than 33 million people, who use it for verbal and written communication. With the rapid growth of digital content in the Myanmar language, applications such as machine learning, translation and information retrieval have become popular, and effective Natural Language Processing (NLP) studies are required to support them. NLP for the Myanmar language still faces major challenges. Segmentation, stemming and Part-of-Speech (POS) tagging are pre-processing steps in text mining applications and a very common requirement of natural language processing functions; they are very important in most information retrieval systems. The main objectives of this thesis are to study the morphology of Myanmar words, to implement n-gram based word segmentation, and to propose grammatical stemming rules and POS tagging rules for the Myanmar language. The thesis proposes word segmentation, stemming and POS tagging based on an n-gram method and a rule-based stemming method that can cope with the challenges of Myanmar NLP tasks. The system generates not only the segmented words but also the stemmed words with their POS tags by removing prefixes, infixes and suffixes. It achieves 82% accuracy. The data were collected from several online sources, and the system is implemented in Python.
2019-03
http://hdl.handle.net/20.500.12678/0000004302
https://meral.edu.mm/records/4302