Performance Analysis of Parallel Clustering on Spark Computing Platform

Nway Yu Aung; Aye Chan Mon; Swe Zin Hlaing

MERAL Myanmar Education Research and Learning Portal




http://hdl.handle.net/20.500.12678/0000006296


Name / File	License
Performance Analysis of Parallel Clustering on Spark Computing Platform.pdf (1.2 Mb)	© 2018 ICAIT

Publication type
		Conference paper
Upload type
		Publication
Title
	Title	Performance Analysis of Parallel Clustering on Spark Computing Platform
	Language	en
Publication date		2018-11-02
Authors
		Nway Yu Aung
		Aye Chan Mon
		Swe Zin Hlaing
Description
		In the area of information and technology, data is generated from a plethora of sources such as social media, the Internet of Things, multimedia, and sensor networks. Clustering is an essential data mining tool for analyzing this valuable information. Clustering algorithms are generally classified as hierarchical or partitioning algorithms. This paper is concerned with partitioning algorithms, of which there are two kinds: mean-based and medoid-based. The paper focuses on medoid-based algorithms because medoids are less influenced by outliers and other extreme values than means. However, one of the main issues with partitioning algorithms is that they cannot handle large volumes of data well, which leads to poor cluster quality and higher execution time. The objective of the research is to solve these two issues. To improve clustering quality, this paper applies a swarm intelligence optimization algorithm to the partitioning clustering algorithm. It then expects to reduce the execution time for clustering large volumes of data by using the Spark framework.
Keywords
		Clustering, Partitioning algorithm, Bat algorithm, Apache Spark
Conference papers
		ICAIT-2018
		1-2 November, 2018
		2nd International Conference on Advanced Information Technologies
		Yangon, Myanmar
		Data Mining
		https://www.uit.edu.mm/icait-2018/
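
The Description above states that medoids are less influenced by outliers than means. The following is a minimal illustrative sketch of that claim only, not the paper's method: the data values and the helper name `medoid` are hypothetical, and the sketch uses plain Python on one-dimensional data rather than Spark or the Bat algorithm.

```python
from statistics import mean


def medoid(points):
    """Return the data point with the smallest total absolute distance to all points.

    Hypothetical helper for illustration; unlike a mean, a medoid is always
    one of the actual data points.
    """
    return min(points, key=lambda c: sum(abs(c - p) for p in points))


# Hypothetical 1-D sample: four nearby values and one extreme outlier (100.0).
values = [1.0, 2.0, 3.0, 4.0, 100.0]

center_mean = mean(values)      # pulled toward the outlier
center_medoid = medoid(values)  # stays within the bulk of the data

print("mean:", center_mean)
print("medoid:", center_medoid)
```

In this sample the single extreme value drags the mean far away from the four clustered points, while the medoid remains one of those points, which is the robustness property the Description cites as the reason for choosing a medoid-based algorithm.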