MERAL Myanmar Education Research and Learning Portal

  1. University of Computer Studies, Yangon
  2. Conferences

Extracting Information Content from Web Pages Using Block Clustering Method

http://hdl.handle.net/20.500.12678/0000004522
File 10103.pdf (463 Kb)
Publication type
Article
Upload type
Publication
Title
Title Extracting Information Content from Web Pages Using Block Clustering Method
Language en_US
Publication date 2012-02-28
Authors
Hlaing, Nwe Nwe
Nyunt, Thi Thi Soe
Description
The World Wide Web is the main “all kinds of information” repository and has so far been very successful in disseminating information to humans. As web sites become more complicated, the construction of web information extraction systems becomes more difficult and time-consuming. Therefore, we need to mine the main content of a web page in order to extract information from such pages. In this paper, we study the problem of automatically extracting web information (unsupervised IE) without any learning examples or other similar human input. First, web pages are segmented into several raw chunks. Then the noisy blocks are removed based on product features. Data region identification is based on the observation that data records in a web document are similar in appearance, so a block clustering method is proposed based on this observation. This approach requires no human intervention, and experimental results have shown its accuracy to be promising.
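The pipeline outlined in the description (segment the page into blocks, drop noisy blocks, cluster blocks that look alike) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record does not give the paper's segmentation rule, its product-feature noise filter, or its similarity measure. The sketch assumes well-formed HTML, treats every element at a fixed DOM depth as a block, uses "block has no text" as a stand-in noise test, and compares blocks by the similarity of their tag sequences. All names (`BlockSegmenter`, `cluster_blocks`, `extract_data_region`, `SAMPLE_HTML`) and the 0.8 threshold are hypothetical.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser

# Tags with no closing tag; they must not change the tracked depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class BlockSegmenter(HTMLParser):
    """Split a page into raw blocks: one block per element at `block_depth`."""

    def __init__(self, block_depth=2):
        super().__init__()
        self.block_depth = block_depth
        self.depth = 0
        self.blocks = []     # each block: {"tags": [...], "text": [...]}
        self.current = None  # block being filled, if any

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            if self.current is not None:
                self.current["tags"].append(tag)
            return
        self.depth += 1
        if self.depth == self.block_depth:
            self.current = {"tags": [], "text": []}
        if self.current is not None:
            self.current["tags"].append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth == self.block_depth and self.current is not None:
            self.blocks.append(self.current)
            self.current = None
        self.depth -= 1

    def handle_data(self, data):
        if self.current is not None and data.strip():
            self.current["text"].append(data.strip())


def cluster_blocks(blocks, threshold=0.8):
    """Greedy clustering: join the first cluster whose representative
    (its first member) has a sufficiently similar tag sequence."""
    clusters = []
    for block in blocks:
        for cluster in clusters:
            ratio = SequenceMatcher(None, cluster[0]["tags"], block["tags"]).ratio()
            if ratio >= threshold:
                cluster.append(block)
                break
        else:
            clusters.append([block])
    return clusters


def extract_data_region(html, block_depth=2, threshold=0.8):
    """Return the text of each block in the largest cluster."""
    segmenter = BlockSegmenter(block_depth)
    segmenter.feed(html)
    segmenter.close()
    # Stand-in noise filter (an assumption): drop blocks with no text.
    blocks = [b for b in segmenter.blocks if b["text"]]
    clusters = cluster_blocks(blocks, threshold)
    largest = max(clusters, key=len, default=[])
    return [" ".join(b["text"]) for b in largest]


SAMPLE_HTML = """
<div>
  <div><a>Home</a></div>
  <div></div>
  <div><h3>Phone A</h3><span>$100</span></div>
  <div><h3>Phone B</h3><span>$120</span></div>
  <div><h3>Phone C</h3><span>$90</span></div>
  <div><p>Copyright</p></div>
</div>
"""

if __name__ == "__main__":
    for record in extract_data_region(SAMPLE_HTML):
        print(record)
```

On the sample page, the three product blocks share the same tag sequence and end up in one cluster, while the navigation and footer blocks each form their own. Taking the largest cluster as the data region is a simplification chosen for this sketch, not a claim about how the paper selects it.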
Keywords
Information Extraction (IE), Wrapper, Document Object Model (DOM)
Identifier http://onlineresource.ucsy.edu.mm/handle/123456789/2447
Journal articles
Tenth International Conference On Computer Applications (ICCA 2012)
Versions

Ver.1 2020-09-01 15:01:24.750856