Font Script Identification Based on N-gram Text Categorization
http://hdl.handle.net/20.500.12678/0000004966
Field | Value |
---|---|
Publication type | Article |
Upload type | Publication |
Title | Font Script Identification Based on N-gram Text Categorization |
Language | en |
Publication date | 2010-12-16 |
Authors | Than, Kyaw Myo; Htay, Hla Hla |
Description | In this paper, we propose a method for identifying font scripts of the Myanmar language. Because no nationwide standardized encoding scheme exists for Myanmar font scripts, knowledge written in the Myanmar language is scattered across internet pages. A font script identifier is essential for merging that scattered knowledge for NLP applications such as text categorization, information retrieval, and text summarization. Our proposed method uses N-gram-based text categorization. A piece of text for each of the 11 font scripts is taken for training. TF-IDF (Term Frequency-Inverse Document Frequency) weights of character N-grams for each font script are computed and stored as a profile for that particular font script. When a new text document is given for testing, its TF-IDF weights are computed and cosine similarity is measured between the test profile and the trained profiles. The font script with the highest similarity score is taken as the result. 100% accuracy is obtained when testing 11 different font scripts with the TF-IDF approach. Therefore, this method works well for Myanmar font script identification. |
Keywords | Font, Font Script, Language Identification, Font Script Identification, N-gram, Text Categorization, TF-IDF Weights |
Identifier | http://onlineresource.ucsy.edu.mm/handle/123456789/865 |
Journal articles | Fifth Local Conference on Parallel and Soft Computing |
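
The pipeline described in the abstract (character N-gram TF-IDF profiles per font script, compared by cosine similarity) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not state the N-gram size, the exact TF-IDF variant, or how unseen N-grams are handled, so the bigram default, the smoothed IDF formula, and all function names below (`char_ngrams`, `build_profiles`, `cosine`, `identify`) are assumptions chosen for illustration. The ASCII strings are placeholders standing in for text in different Myanmar font encodings.

```python
import math
from collections import Counter


def char_ngrams(text, n=2):
    """Return the overlapping character n-grams of text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_profiles(training_texts, n=2):
    """Build one TF-IDF profile per label from a {label: text} mapping."""
    counts = {label: Counter(char_ngrams(t, n))
              for label, t in training_texts.items()}
    num_docs = len(counts)
    doc_freq = Counter()
    for c in counts.values():
        doc_freq.update(c.keys())  # each label counts once per n-gram
    # Assumption: smoothed IDF, so n-grams shared by every profile
    # still receive a non-zero weight.
    idf = {g: math.log((1 + num_docs) / (1 + df)) + 1.0
           for g, df in doc_freq.items()}
    profiles = {}
    for label, c in counts.items():
        total = sum(c.values())
        profiles[label] = {g: (k / total) * idf[g] for g, k in c.items()}
    return profiles, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(g, 0.0) for g, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify(text, profiles, idf, n=2):
    """Return the label whose profile is most similar to text."""
    c = Counter(char_ngrams(text, n))
    total = sum(c.values())
    # Assumption: n-grams never seen in training are ignored.
    vec = {g: (k / total) * idf[g] for g, k in c.items() if g in idf}
    return max(profiles, key=lambda label: cosine(vec, profiles[label]))


# Placeholder training data; real use would supply one sample per font script.
training = {
    "script_a": "abababab abab",
    "script_b": "xyzxyz xyzxyz",
}
profiles, idf = build_profiles(training)
print(identify("abab", profiles, idf))
```

With real data, `training` would hold one sample text per font script (11 in the paper), and `identify` would be called on each test document. Results on such toy strings say nothing about the accuracy reported in the abstract.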