Responsible data sharing: Identifying and remedying possible re-identification of human participants

peter.suber's bookmarks 2023-09-26


Abstract:  Open data collected from humans creates a tension between scholarly values of transparency and sharing on the one hand, and privacy and security on the other. A common solution is to make datasets anonymous by removing personally identifying information before sharing. However, ostensibly anonymized datasets may be at risk of re-identification if they include demographic information. In the present article, we (a) review current privacy standards; (b) describe computer science data protection frameworks and their adaptability to the social sciences; (c) provide practical guidance for assessing and addressing re-identification risk; (d) introduce two open-source algorithms – MinBlur and MinBlurLite – to increase privacy while maintaining the integrity of open data; and (e) highlight aspects of ethical data sharing that require further attention. Technical innovations can support competing values so that science can be as open as possible to promote transparency and sharing, and as closed as necessary to maintain privacy and security.
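As a rough illustration of what assessing re-identification risk from demographic information can look like, the sketch below counts how many participants share each combination of demographic values. This is the idea behind k-anonymity, a widely used data protection framework from computer science; the abstract does not say which frameworks the article covers, so the choice is an assumption. The code is not MinBlur or MinBlurLite, and the function names, column names, and toy records are all hypothetical.

```python
from collections import Counter


def _group_keys(records, quasi_identifiers):
    # Build one tuple of quasi-identifier values per record.
    return [tuple(r[q] for q in quasi_identifiers) for r in records]


def smallest_group_size(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the
    same quasi-identifier values (the dataset's k). Returns 0 if empty."""
    if not records:
        return 0
    counts = Counter(_group_keys(records, quasi_identifiers))
    return min(counts.values())


def at_risk_records(records, quasi_identifiers, k=2):
    """Return records whose quasi-identifier combination is shared by
    fewer than k participants."""
    keys = _group_keys(records, quasi_identifiers)
    counts = Counter(keys)
    return [r for r, key in zip(records, keys) if counts[key] < k]


# Hypothetical toy data, not taken from the article.
participants = [
    {"age": 34, "gender": "F", "zip": "02138"},
    {"age": 34, "gender": "F", "zip": "02138"},
    {"age": 61, "gender": "M", "zip": "02139"},
]

demographics = ["age", "gender", "zip"]
print(smallest_group_size(participants, demographics))
print(at_risk_records(participants, demographics, k=2))
```

In this toy dataset the third participant is the only person with that combination of age, gender, and zip code, so that record would be flagged as unique on its demographics even though no name is attached.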



From feeds: Open Access Tracking Project (OATP) » peter.suber's bookmarks

Tags: oa.privacy oa.floss oa.ssh

Date tagged: 09/26/2023, 08:59

Date published: 09/26/2023, 04:59