Building Large Scale Text Corpus for Joint Word Segmentation and Part-of-Speech Tagging of Myanmar Language
http://hdl.handle.net/20.500.12678/0000004576
Field | Value |
---|---|
Publication type | Article |
Upload type | Publication |
Title | Building Large Scale Text Corpus for Joint Word Segmentation and Part-of-Speech Tagging of Myanmar Language |
Language | en_US |
Publication date | 2020-02-28 |
Authors | Dim Lam, Cing; Soe, Khin Mar |
Description | In Natural Language Processing (NLP), word segmentation and Part-of-Speech (POS) tagging are fundamental tasks. POS information is also needed in the preprocessing stage of NLP applications such as machine translation (MT) and information retrieval (IR). Currently, many research efforts develop word segmentation and POS tagging separately, using different methods to achieve high performance and accuracy. Although numerous models exist for other languages, little work has been done for the Myanmar language. This paper describes the building of a Myanmar corpus for joint word segmentation and POS tagging of the Myanmar language. The corpus contains 51,207 sentences and 839,161 words and is annotated with 12 tags. To evaluate the corpus, an HMM model is trained on different data sizes and tested with closed and open tests. The experiments yield 94% accuracy, which indicates that the built corpus is suitable for this task. |
Keywords | Natural Language Processing, POS, HMM, Corpus |
Identifier | 978-981-14-4787-7 |
Journal articles | Proceedings of the 10th International Workshop on Computer Science and Engineering |