MERAL Myanmar Education Research and Learning Portal
Item
{"_buckets": {"deposit": "7873561d-6fd5-4ed2-8050-bdf42149a3d8"}, "_deposit": {"id": "4209", "owners": [], "pid": {"revision_id": 0, "type": "recid", "value": "4209"}, "status": "published"}, "_oai": {"id": "oai:meral.edu.mm:recid/4209", "sets": ["user-ucsy"]}, "communities": ["ucsy"], "item_1583103067471": {"attribute_name": "Title", "attribute_value_mlt": [{"subitem_1551255647225": "An Efficient Approach for Web Data Extraction", "subitem_1551255648112": "en"}]}, "item_1583103085720": {"attribute_name": "Description", "attribute_value_mlt": [{"interim": "Unlike conventional data or text, most Web pages contain clutter. They usually include noise data such as navigation panels, copyright and privacy notices, and advertisements. This noise can seriously harm Web mining, because miners extract the whole document rather than the informative content and retrieve non-relevant results. Eliminating these noise patterns is therefore of great importance. In this paper, we propose an effective technique to detect and remove various noise patterns from Web documents to enhance Web mining. Our system first builds a DOM tree structure for an incoming Web page and then splits it into subtrees to detect noise data. We also apply a back-propagation neural network algorithm to classify noise patterns, data patterns, and mixture patterns in the current Web page. The classification result of the neural network is used to eliminate the noise patterns. The proposed technique is evaluated on several commercial Web sites and news Web sites to show the performance and improvement of our approach."}]}, "item_1583103108160": {"attribute_name": "Keywords", "attribute_value": []}, "item_1583103120197": {"attribute_name": "Files", "attribute_type": "file", "attribute_value_mlt": [{"accessrole": "open_access", "date": [{"dateType": "Available", "dateValue": "2019-08-06"}], "displaytype": "preview", "download_preview_message": "", "file_order": 0, "filename": "59011.pdf", "filesize": [{"value": "117 Kb"}], "format": "application/pdf", "future_date_message": "", "is_thumbnail": false, "licensetype": "license_free", "mimetype": "application/pdf", "size": 117000.0, "url": {"url": "https://meral.edu.mm/record/4209/files/59011.pdf"}, "version_id": "de81954e-46de-4aff-89a4-33d08665496b"}]}, "item_1583103131163": {"attribute_name": "Journal articles", "attribute_value_mlt": [{"subitem_issue": "", "subitem_journal_title": "Fourth Local Conference on Parallel and Soft Computing", "subitem_pages": "", "subitem_volume": ""}]}, "item_1583103147082": {"attribute_name": "Conference papers", "attribute_value_mlt": [{"subitem_acronym": "", "subitem_c_date": "", "subitem_conference_title": "", "subitem_part": "", "subitem_place": "", "subitem_session": "", "subitem_website": ""}]}, "item_1583103211336": {"attribute_name": "Books/reports/chapters", "attribute_value_mlt": [{"subitem_book_title": "", "subitem_isbn": "", "subitem_pages": "", "subitem_place": "", "subitem_publisher": ""}]}, "item_1583103233624": {"attribute_name": "Thesis/dissertations", "attribute_value_mlt": [{"subitem_awarding_university": "", "subitem_supervisor(s)": [{"subitem_supervisor": ""}]}]}, "item_1583105942107": {"attribute_name": "Authors", "attribute_value_mlt": [{"subitem_authors": [{"subitem_authors_fullname": "Htwe, Thanda"}, {"subitem_authors_fullname": "Kham, Nang Saing Moon"}]}]}, "item_1583108359239": {"attribute_name": "Upload type", "attribute_value_mlt":
[{"interim": "Publication"}]}, "item_1583108428133": {"attribute_name": "Publication type", "attribute_value_mlt": [{"interim": "Article"}]}, "item_1583159729339": {"attribute_name": "Publication date", "attribute_value": "2009-12-30"}, "item_1583159847033": {"attribute_name": "Identifier", "attribute_value": "http://onlineresource.ucsy.edu.mm/handle/123456789/1902"}, "item_title": "An Efficient Approach for Web Data Extraction", "item_type_id": "21", "owner": "1", "path": ["1597824273898"], "permalink_uri": "http://hdl.handle.net/20.500.12678/0000004209", "pubdate": {"attribute_name": "Deposited date", "attribute_value": "2019-08-06"}, "publish_date": "2019-08-06", "publish_status": "0", "recid": "4209", "relation": {}, "relation_version_is_last": true, "title": ["An Efficient Approach for Web Data Extraction"], "weko_shared_id": -1}
An Efficient Approach for Web Data Extraction
http://hdl.handle.net/20.500.12678/0000004209