MERAL Myanmar Education Research and Learning Portal
{"_buckets": {"deposit": "c7b551af-527c-4c59-bb8a-2ae6c7987787"}, "_deposit": {"created_by": 45, "id": "6387", "owner": "45", "owners": [45], "owners_ext": {"displayname": "", "username": ""}, "pid": {"revision_id": 0, "type": "recid", "value": "6387"}, "status": "published"}, "_oai": {"id": "oai:meral.edu.mm:recid/6387", "sets": ["user-uit"]}, "communities": ["uit"], "item_1583103067471": {"attribute_name": "Title", "attribute_value_mlt": [{"subitem_1551255647225": "Resource-based Data Placement Strategy for Hadoop Distributed File System", "subitem_1551255648112": "en"}]}, "item_1583103085720": {"attribute_name": "Description", "attribute_value_mlt": [{"interim": "Big-Data is a term for data sets that are so large or\ncomplex that traditional data processing tools are\ninadequate to process or manage them. Apache Hadoop\nis an open-source software framework for distributed\nstorage and distributed processing of very large data\nsets on computer clusters built from commodity\nhardware. The default Hadoop data placement strategy\nworks well in homogeneous cluster. But it performs\npoorly in heterogeneous clusters because of the\nheterogeneity (in terms of processing, memory,\nthroughput, I/O, etc.) of the nodes capabilities. It may\ncause load imbalance and reduce Hadoop performance.\nTherefore, Hadoop Distributed File System (HDFS) has\nto rely on load balancing utility to balance data\ndistribution. The utility consumes the cost of extra\nsystem resources and running time. As a result, data can\nbe placed evenly across the Hadoop cluster. But it may\ncause the overhead of transferring unprocessed data\nfrom slow nodes to fast nodes because each node has\ndifferent computing capacity in heterogeneous Hadoop\ncluster. In order to solve these problems, a data/replica\nplacement algorithm based on storage utilization and\ncomputing capacity of each data node in heterogeneous\nHadoop Cluster is proposed. 
The proposed policy can\nbalance the workload as well as reduce overhead of\ndata transmission between different computing nodes."}]}, "item_1583103108160": {"attribute_name": "Keywords", "attribute_value_mlt": [{"interim": "HDFS"}, {"interim": "Data Placement Policy"}, {"interim": "Load Balancing"}]}, "item_1583103120197": {"attribute_name": "Files", "attribute_type": "file", "attribute_value_mlt": [{"accessrole": "open_access", "date": [{"dateType": "Available", "dateValue": "2020-11-20"}], "displaytype": "preview", "download_preview_message": "", "file_order": 0, "filename": "Resource-based Data Placement Strategy for Hadoop Distributed File System.pdf", "filesize": [{"value": "342 Kb"}], "format": "application/pdf", "future_date_message": "", "is_thumbnail": false, "licensetype": "license_0", "mimetype": "application/pdf", "size": 342000.0, "url": {"url": "https://meral.edu.mm/record/6387/files/Resource-based Data Placement Strategy for Hadoop Distributed File System.pdf"}, "version_id": "9375e58b-a21c-4b91-a738-1cd3fb29678e"}]}, "item_1583103147082": {"attribute_name": "Conference papers", "attribute_value_mlt": [{"subitem_acronym": "ICAIT-2017", "subitem_c_date": "1-2 November, 2017", "subitem_conference_title": "1st International Conference on Advanced Information Technologies", "subitem_place": "Yangon, Myanmar", "subitem_session": "Workshop Session", "subitem_website": "https://www.uit.edu.mm/icait-2017/"}]}, "item_1583105942107": {"attribute_name": "Authors", "attribute_value_mlt": [{"subitem_authors": [{"subitem_authors_fullname": "Nang Kham Soe"}, {"subitem_authors_fullname": "Tin Tin Yee"}, {"subitem_authors_fullname": "Ei Chaw Htoon"}]}]}, "item_1583108359239": {"attribute_name": "Upload type", "attribute_value_mlt": [{"interim": "Publication"}]}, "item_1583108428133": {"attribute_name": "Publication type", "attribute_value_mlt": [{"interim": "Conference paper"}]}, "item_1583159729339": {"attribute_name": "Publication date", "attribute_value": "2017-11-02"}, "item_title": "Resource-based Data Placement Strategy for Hadoop Distributed File System", "item_type_id": "21", "owner": "45", "path": ["1605779935331"], "permalink_uri": "http://hdl.handle.net/20.500.12678/0000006387", "pubdate": {"attribute_name": "Deposited date", "attribute_value": "2020-11-20"}, "publish_date": "2020-11-20", "publish_status": "0", "recid": "6387", "relation": {}, "relation_version_is_last": true, "title": ["Resource-based Data Placement Strategy for Hadoop Distributed File System"], "weko_shared_id": -1}
Resource-based Data Placement Strategy for Hadoop Distributed File System
http://hdl.handle.net/20.500.12678/0000006387
| Name / File | License |
|---|---|
| Resource-based Data Placement Strategy for Hadoop Distributed File System.pdf (PDF, 342 KB) | open access |
| Field | Value |
|---|---|
| Publication type | Conference paper |
| Upload type | Publication |
| Title | Resource-based Data Placement Strategy for Hadoop Distributed File System |
| Language | en |
| Publication date | 2017-11-02 |
| Authors | Nang Kham Soe; Tin Tin Yee; Ei Chaw Htoon |
| Description | Big Data is a term for data sets so large or complex that traditional data processing tools are inadequate to process or manage them. Apache Hadoop is an open-source software framework for distributed storage and distributed processing of very large data sets on computer clusters built from commodity hardware. The default Hadoop data placement strategy works well in a homogeneous cluster, but it performs poorly in heterogeneous clusters because node capabilities differ (in processing power, memory, throughput, I/O, etc.). This can cause load imbalance and reduce Hadoop performance. The Hadoop Distributed File System (HDFS) therefore has to rely on a load-balancing utility to even out the data distribution, and that utility costs extra system resources and running time. As a result, data can be placed evenly across the Hadoop cluster, but doing so may incur the overhead of transferring unprocessed data from slow nodes to fast nodes, because each node in a heterogeneous Hadoop cluster has a different computing capacity. To solve these problems, a data/replica placement algorithm based on the storage utilization and computing capacity of each data node in a heterogeneous Hadoop cluster is proposed. The proposed policy can balance the workload as well as reduce the overhead of data transmission between different computing nodes. (A rough illustrative sketch of this idea follows below.) |
| Keywords | HDFS; Data Placement Policy; Load Balancing |
| Conference | ICAIT-2017: 1st International Conference on Advanced Information Technologies, Workshop Session, 1-2 November 2017, Yangon, Myanmar, https://www.uit.edu.mm/icait-2017/ |
| Deposited date | 2020-11-20 |
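The abstract describes the policy only at a high level; the paper itself is the authoritative source. As a rough illustration of the general technique, below is a minimal, hypothetical Python sketch of capacity-aware replica placement: each DataNode is scored from an assumed computing-capacity rating and its current storage utilization, and replicas of a new block go to the highest-scoring nodes. All names (`Node`, `choose_targets`, the weights) are illustrative assumptions, not the authors' implementation. The stock alternative the abstract mentions is HDFS's balancer utility (e.g. `hdfs balancer -threshold 10`), which rebalances storage after the fact rather than placing data by capacity up front.

```python
from dataclasses import dataclass

# Hypothetical sketch of capacity-aware replica placement.
# Not the paper's code; names, weights, and thresholds are assumptions.

@dataclass
class Node:
    name: str
    compute_capacity: float   # relative computing capacity in [0, 1], e.g. a benchmark score
    used_bytes: int           # storage currently used
    total_bytes: int          # total storage capacity

    @property
    def storage_utilization(self) -> float:
        return self.used_bytes / self.total_bytes

def placement_score(node: Node, w_compute: float = 0.5, w_storage: float = 0.5) -> float:
    """Prefer nodes with high computing capacity and low storage utilization."""
    return w_compute * node.compute_capacity + w_storage * (1.0 - node.storage_utilization)

def choose_targets(nodes: list[Node], replicas: int = 3) -> list[Node]:
    """Pick the `replicas` best-scoring DataNodes for a new block."""
    eligible = [n for n in nodes if n.storage_utilization < 0.95]  # skip nearly full nodes
    return sorted(eligible, key=placement_score, reverse=True)[:replicas]

if __name__ == "__main__":
    cluster = [
        Node("fast-node", compute_capacity=1.0, used_bytes=400, total_bytes=1000),
        Node("mid-node",  compute_capacity=0.6, used_bytes=200, total_bytes=1000),
        Node("slow-node", compute_capacity=0.3, used_bytes=100, total_bytes=1000),
    ]
    for target in choose_targets(cluster, replicas=2):
        print(target.name)  # fast-node, then mid-node
```

Under this kind of weighting, faster nodes accumulate proportionally more blocks, so map tasks can run mostly on local data, which is what reduces the slow-to-fast transfers of unprocessed data that the abstract identifies as the overhead of even placement.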