MERAL Myanmar Education Research and Learning Portal

  1. University of Computer Studies, Yangon
  2. Conferences

Font Script Identification Based on N-gram Text Categorization

http://hdl.handle.net/20.500.12678/0000004966
File psc2010paper (136).pdf (152 Kb)
Publication type
Article
Upload type
Publication
Title Font Script Identification Based on N-gram Text Categorization
Language en
Publication date 2010-12-16
Authors
Than, Kyaw Myo
Htay, Hla Hla
Description
In this paper, we propose a method for identifying the font scripts of the Myanmar language. Because no nationwide standardized encoding scheme exists for Myanmar font scripts, knowledge written in the Myanmar language is scattered across internet pages. A font script identifier is essential for merging that scattered knowledge for NLP applications such as text categorization, information retrieval, and text summarization. Our proposed method uses N-gram-based text categorization. A piece of text for each of 11 font scripts is taken for training. TF-IDF (Term Frequency-Inverse Document Frequency) weights of character N-grams for each font script are computed and stored as a profile for that particular font script. When a new text document is given for testing, its TF-IDF weights are computed and cosine similarity is measured between the test profile and the trained profiles. The font script with the highest similarity score is taken as the result. 100% accuracy is obtained in testing on 11 different font scripts using the TF-IDF approach. Therefore, this method works well for Myanmar font script identification.
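The pipeline described in the abstract (character N-gram profiles weighted by TF-IDF, then cosine similarity against each trained profile) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the bigram size `n=2`, the smoothed IDF formula, and the toy training strings labelled `script_a` and `script_b` are all assumptions made for the example, since the record does not give these details.

```python
import math
from collections import Counter


def char_ngrams(text, n=2):
    """Return the list of overlapping character n-grams in text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_profiles(training_texts, n=2):
    """Build one TF-IDF profile per label from a dict {label: sample text}.

    Assumption: a smoothed IDF, log((1 + N) / (1 + df)) + 1, is used so that
    n-grams shared by every label keep a small nonzero weight.
    """
    tf = {label: Counter(char_ngrams(text, n))
          for label, text in training_texts.items()}
    num_docs = len(tf)
    df = Counter()
    for counts in tf.values():
        df.update(counts.keys())  # each label counts once per n-gram
    idf = {g: math.log((1 + num_docs) / (1 + d)) + 1.0 for g, d in df.items()}
    profiles = {label: {g: c * idf[g] for g, c in counts.items()}
                for label, counts in tf.items()}
    return profiles, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(g, 0.0) for g, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify(text, profiles, idf, n=2):
    """Return the label whose trained profile is most similar to text."""
    counts = Counter(char_ngrams(text, n))
    # N-grams never seen in training have no IDF and are ignored.
    vec = {g: c * idf[g] for g, c in counts.items() if g in idf}
    return max(profiles, key=lambda label: cosine(vec, profiles[label]))


# Hypothetical toy data standing in for samples of two font scripts.
training = {
    "script_a": "abababab abab",
    "script_b": "xyxyxy xyxy",
}
profiles, idf = build_profiles(training)
print(identify("abab", profiles, idf))
```

In a real setting each training sample would be text encoded in one font script, and the N-gram size would be tuned on held-out data; the record does not state which value the paper used.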
Keywords
Font, Font Script, Language Identification, Font Script Identification, N-gram, Text Categorization, TF-IDF Weights
Identifier http://onlineresource.ucsy.edu.mm/handle/123456789/865
Journal articles
Fifth Local Conference on Parallel and Soft Computing