Item

Modified K-Means for Document Clustering System

http://hdl.handle.net/20.500.12678/0000003116
Name / File
Modified K-Means for Document Clustering System-2016.pdf (311 Kb)
Publication type
Conference paper
Upload type
Publication
Title
Title Modified K-Means for Document Clustering System
Language en
Publication date 2016-10-01
Deposited date 2019-07-04
Authors
Tin Thu Zar Win
Moe Moe Aye
Description
In today's era of the World Wide Web, there is a tremendous proliferation in the amount of digitized text documents. Because the web holds a huge collection of documents, the documents need to be grouped into clusters. Document clustering plays an important role in navigating and organizing documents effectively. K-Means is the most commonly used document clustering algorithm because it is easy to implement and is the most efficient in terms of execution time. Its major problem is that it is quite sensitive to the selection of the initial cluster centroids. The algorithm chooses the initial cluster centers arbitrarily, so it does not always produce good clustering results. If the initial centroids are poorly chosen, data points with the same similarity scores may fall into different clusters instead of the same cluster. To overcome this problem, this paper proposes a modified K-Means approach that improves the quality of clustering. Unlike traditional K-Means clustering, the proposed method can generate the most compact and stable clustering results because it starts from maximum-distance initial centroid points instead of random initial centroid points. Moreover, similar data points are clustered based on the maximum probability distribution of the data points. Therefore, the proposed method is more effective and converges to more accurate clusters than the original K-Means clustering method. Experimental results are reported as F-measure on the standard 20 Newsgroups dataset.
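The abstract does not spell out how the maximum-distance initial centroids are chosen. As an illustration only, the sketch below uses a generic farthest-first (maximin) heuristic followed by ordinary Lloyd iterations. The function names, the use of Euclidean distance, and the choice of the first seed are all assumptions made for this example, not the authors' procedure, and the probability-based assignment step mentioned in the abstract is not modeled here.

```python
import math

def farthest_first_centroids(points, k):
    """Pick k initial centroids with a farthest-first (maximin) rule.

    Assumption: a generic heuristic, not the paper's exact method.
    """
    dim = len(points[0])
    mean = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    # Deterministic start (an assumption): the point farthest from the overall mean.
    first = max(range(len(points)), key=lambda i: math.dist(points[i], mean))
    chosen = [first]
    # nearest[i] is the distance from point i to its closest chosen centroid.
    nearest = [math.dist(p, points[first]) for p in points]
    while len(chosen) < k:
        # Next centroid: the point farthest from every centroid chosen so far.
        nxt = max(range(len(points)), key=lambda i: nearest[i])
        chosen.append(nxt)
        nearest = [min(n, math.dist(p, points[nxt]))
                   for n, p in zip(nearest, points)]
    return [list(points[i]) for i in chosen]

def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)),
                key=lambda j: math.dist(p, centroids[j]))
            for p in points]

def kmeans(points, centroids, max_iter=100):
    """Standard Lloyd iterations starting from the given centroids."""
    centroids = [list(c) for c in centroids]
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                updated.append(old)  # leave an empty cluster's centroid in place
        if updated == centroids:
            break
        centroids = updated
    return assign(points, centroids), centroids

# Toy 2-D points standing in for document vectors (hypothetical data).
points = [[0, 0], [0, 1], [10, 10], [10, 11], [20, 0], [20, 1]]
labels, centers = kmeans(points, farthest_first_centroids(points, 3))
print(labels, centers)
```

Because the seeds are chosen deterministically, repeated runs on the same data give the same clusters, which is one way to read the "stable" results the abstract refers to. For real documents the points would typically be term-weight vectors, and a cosine-based distance could replace `math.dist`; both are assumptions outside the abstract.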
Keywords
Document clustering, F-measure, Initial centroid, K-Means
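The record names F-measure as the evaluation metric but does not say which variant the paper uses. A minimal sketch of one widely used clustering F-measure follows, assuming the class-weighted form in which each true class is matched to its best-scoring cluster. The function name and the toy labels are hypothetical.

```python
from collections import Counter

def clustering_f_measure(true_classes, cluster_labels):
    """Class-weighted F-measure: sum over classes of (n_i / n) * max_j F(i, j).

    Assumption: this is one common definition, not necessarily the paper's.
    """
    n = len(true_classes)
    class_sizes = Counter(true_classes)
    cluster_sizes = Counter(cluster_labels)
    overlap = Counter(zip(true_classes, cluster_labels))
    total = 0.0
    for cls, n_i in class_sizes.items():
        best = 0.0
        for clu, n_j in cluster_sizes.items():
            n_ij = overlap[(cls, clu)]
            if n_ij == 0:
                continue
            precision = n_ij / n_j  # share of the cluster that belongs to the class
            recall = n_ij / n_i     # share of the class captured by the cluster
            best = max(best, 2 * precision * recall / (precision + recall))
        total += (n_i / n) * best
    return total

# Hypothetical labels: four documents, two true classes, two clusters.
print(clustering_f_measure(["a", "a", "b", "b"], [1, 1, 0, 0]))
```

The score ranges from 0 to 1, and a clustering that reproduces the true classes exactly scores 1 regardless of how the cluster ids are numbered.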
Identifier 10.5281/zenodo.3268423