Modified K-Means for Document Clustering System
http://hdl.handle.net/20.500.12678/0000003116
Name / File | Size | Access |
---|---|---|
Modified K-Means for Document Clustering System-2016.pdf | 311 Kb | Open access, available 2019-07-04 |
Field | Value |
---|---|
Publication type | Conference paper |
Upload type | Publication |
Title | Modified K-Means for Document Clustering System |
Language | en |
Publication date | 2016-10-01 |
Authors | Tin Thu Zar Win, Moe Moe Aye |
Description | In today's era of the World Wide Web, there is a tremendous proliferation in the amount of digitized text documents. As there is a huge collection of documents on the web, there is a need to group the set of documents into clusters. Document clustering plays an important role in effectively navigating and organizing the documents. The K-Means clustering algorithm is the most commonly used document clustering algorithm because it can be easily implemented and is the most efficient one in terms of execution time. The major problem with this algorithm is that it is quite sensitive to the selection of initial cluster centroids. The algorithm chooses the initial cluster centers arbitrarily, so it does not always promise good clustering results. If the initial centroids are incorrectly determined, the remaining data points with the same similarity scores may fall into different clusters instead of the same cluster. To overcome this problem, a modified K-Means approach is proposed in this paper to improve the quality of clustering. Unlike traditional K-Means clustering, the proposed K-Means method can generate the most compact and stable clustering results based on maximum-distance initial centroid points instead of random initial centroid points. Moreover, similar data points are clustered based on the maximum probability distribution of data points. Therefore, the proposed method is more effective and converges to more accurate clusters than the original K-Means clustering method. In this paper, experimental results are presented in F-measure using the 20-News Group standard dataset. |
Keywords | Document clustering, F-measure, Initial centroid, K-Means |
Identifier | 10.5281/zenodo.3268423 |
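
The description above contrasts random initial centroids with maximum-distance initial centroids. The record does not give the paper's exact procedure, so the following is only a minimal sketch of one common way to realize that idea: farthest-first seeding followed by standard K-Means iterations. The function names, the use of Euclidean distance on plain numeric tuples, and the seeded random choice of the first centroid are all assumptions made for illustration; the probability-based assignment step mentioned in the description is not specified in this record and is not implemented here.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def farthest_first_centroids(points, k, seed=0):
    """Assumed seeding rule (not taken from the paper): pick the first
    centroid at random, then repeatedly pick the point whose distance
    to its nearest already-chosen centroid is largest."""
    rng = random.Random(seed)
    centroids = [points[rng.randrange(len(points))]]
    while len(centroids) < k:
        next_point = max(
            points,
            key=lambda p: min(euclidean(p, c) for c in centroids),
        )
        centroids.append(next_point)
    return centroids


def kmeans(points, k, iterations=100, seed=0):
    """Standard K-Means loop started from farthest-first centroids."""
    centroids = farthest_first_centroids(points, k, seed)
    labels = None
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: euclidean(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


if __name__ == "__main__":
    # Toy 2-D points standing in for document vectors.
    docs = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    labels, centroids = kmeans(docs, k=2)
    print(labels, centroids)
```

For real documents the points would typically be term-weight vectors (for example TF-IDF) and the distance might be cosine-based; both choices are left open here because the record does not state them.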