MERAL Myanmar Education Research and Learning Portal
Item
Analysis of Historical Census Household data with Similarity Threshold Method
http://hdl.handle.net/20.500.12678/0000006266
Name / File | License |
---|---|
Analysis of Historical Census Household data with Similarity Threshold Method.pdf (1.4 Mb) | © 2017 ICAIT |
Field | Value |
---|---|
Publication type | Conference paper |
Upload type | Publication |
Title | Analysis of Historical Census Household data with Similarity Threshold Method |
Language | en |
Publication date | 2017-11-02 |
Authors | Khin Su Mon Myint; Thet Thet Zin; Kyaw May Oo |
Description | Historical census data contain valuable information about the families in a country, including information about ancestors. These data can be used to reconstruct important parts of a specific period and to trace how households and families change over time. Linking census data is challenging because of poor data quality and household changes over time. Over the decades, a household may split into multiple households when members marry or move to another household. This paper introduces an approach for cleaning, standardizing, and linking historical census data across time. The proposed approach first detects households, then cleans the records and unifies them into a standard format. After cleaning, approximate string similarity measures are used to link individual records, and a similarity threshold method then classifies record pairs as matched or unmatched. The experimental results show an optimal threshold value that is efficient for household linkage. |
Keywords | historical census data, data cleaning, data matching, record linkage, household linkage, pair-wise linkage |
Conference acronym | ICAIT-2017 |
Conference date | 1-2 November, 2017 |
Conference title | 1st International Conference on Advanced Information Technologies |
Place | Yangon, Myanmar |
Session | Data Science |
Website | https://www.uit.edu.mm/icait-2017/ |
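
The description mentions linking individual records with approximate string similarity measures and then separating matched from unmatched pairs with a similarity threshold. A minimal sketch of that general idea follows. It is not the authors' implementation: the abstract does not name the similarity measure, the fields compared, or the threshold, so the use of Python's `difflib.SequenceMatcher`, the field names `first_name` and `surname`, the sample records, and the threshold of 0.8 are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def normalize(value: str) -> str:
    """Lowercase, trim, and collapse whitespace (a stand-in for the cleaning step)."""
    return " ".join(value.lower().split())


def similarity(a: str, b: str) -> float:
    """Approximate string similarity in [0, 1], here difflib's ratio (an assumption)."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def record_similarity(rec_a: dict, rec_b: dict, fields: list) -> float:
    """Average the per-field similarities of two records."""
    scores = [similarity(rec_a[f], rec_b[f]) for f in fields]
    return sum(scores) / len(scores)


def classify_pairs(old: list, new: list, fields: list, threshold: float):
    """Compare every record pair and split the pairs by the similarity threshold."""
    matched, unmatched = [], []
    for a in old:
        for b in new:
            score = record_similarity(a, b, fields)
            target = matched if score >= threshold else unmatched
            target.append((a["id"], b["id"], score))
    return matched, unmatched


# Hypothetical records from two census years (not taken from the paper).
census_a = [
    {"id": "a1", "first_name": "John", "surname": "Smith"},
    {"id": "a2", "first_name": "Mary", "surname": "Ashworth"},
]
census_b = [
    {"id": "b1", "first_name": "Jon", "surname": "Smyth"},
    {"id": "b2", "first_name": "Mary ", "surname": "ASHWORTH"},
]

matched, unmatched = classify_pairs(
    census_a, census_b, ["first_name", "surname"], threshold=0.8
)
for old_id, new_id, score in matched:
    print(f"{old_id} <-> {new_id}: {score:.2f}")
```

Raising the threshold reduces false matches at the cost of missing records with spelling variations, while lowering it does the opposite, which is why a threshold would normally be tuned on data with known links.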