{"created":"2020-08-30T13:56:20.888250+00:00","id":3116,"links":{},"metadata":{"_buckets":{"deposit":"033d64ba-46ea-44db-a6f9-1ecffc6ed024"},"_deposit":{"id":"3116","owners":[],"pid":{"revision_id":0,"type":"recid","value":"3116"},"status":"published"},"_oai":{"id":"oai:meral.edu.mm:recid/3116","sets":["1582963413512:1596119372420"]},"communities":["ytu"],"item_1583103067471":{"attribute_name":"Title","attribute_value_mlt":[{"subitem_1551255647225":"Modified K-Means for Document Clustering System","subitem_1551255648112":"en"}]},"item_1583103085720":{"attribute_name":"Description","attribute_value_mlt":[{"interim":"
In today’s era of the World Wide Web, there is a
\ntremendous proliferation in the amount of digitized text
\ndocuments. As there is a huge collection of documents on the web,
\nthere is a need to group the documents into clusters.
\nDocument clustering plays an important role in effectively
\nnavigating and organizing the documents. The K-Means clustering
\nalgorithm is the most commonly used document clustering algorithm
\nbecause it is easy to implement and is the most efficient one
\nin terms of execution time. The major problem with this
\nalgorithm is that it is quite sensitive to the selection of initial cluster
\ncentroids. The algorithm chooses the initial cluster centers arbitrarily,
\nso it does not always produce good clustering results. If the initial
\ncentroids are poorly chosen, data points
\nwith the same similarity scores may fall into different clusters
\ninstead of the same cluster. To overcome this problem, a modified
\nK-Means approach is proposed in this paper to improve the quality of
\nclustering. Unlike the traditional K-Means
\nclustering, the proposed K-Means method can generate more
\ncompact and stable clustering results by using maximum-distance
\ninitial centroid points instead of random initial centroid points.
\nMoreover, similar data points are clustered based on the
\nmaximum probability distribution of the data points. Therefore, the
\nproposed method is more effective and converges to more accurate
\nclusters than the original K-Means clustering method. In this paper,
\nexperimental results are reported in terms of F-measure on the
\nstandard 20 Newsgroups dataset.