Analyzing the licenses of all 11,000+ GBIF registered datasets - Peter Desmet

abernard102@gmail.com 2014-01-09

Summary:

"In my previous post, I highlighted the legal issues showing 13,297 American bullfrog records downloaded from GBIF on a map. 96% of those records had no or a non-standard data license, making data use legally cumbersome. But how much of this applies to all 417+ million occurrence records in GBIF? How challenging is GBIF's 2014 mission to provide a machine readable, standard license for all datasets? Fellow Datafable member Bart Aelterman and I tried to figure out. We used the GBIF registry API to obtain the metadata for all 11,000+ GBIF registered datasets and in particular the rights field, which is where data publishers can provide the license under which the dataset is published. We then created a unique list of all licenses used, which we annotated with parameters such as use allowed and attribution required. This information was joined back with the dataset information to get an idea of the distribution of certain types of licenses over all datasets and occurrence records. We also documented the guidelines we used for annotating these licenses. In total we analyzed 11,974 datasets, representing 415,927,654 occurrences. The first thing we noticed is that only 10% of those datasets (26% of the occurrences) have a license. This is problematic (see further), but it had the welcome side effect that we 'only' had to annotate 432 different licenses. All code and data for this project are available on GitHub. #openresearch #ftw ... Our analysis of the licenses of all 11,000+ GBIF registered datasets shows a bleak picture. Very few GBIF registered datasets can be easily and legally used, let alone without restrictions. This is mainly due to data being published with no or a non-standard license. Fixing this is crucial, and GBIF's 2014 mission to provide a machine readable, standard license to all datasets is a step in the good direction. We hope our analysis (which can be run again) and guidelines already help ..."
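
The workflow described in the summary (collect each dataset's rights text, build a unique list of licenses, annotate them, then join the annotations back to the datasets) could be sketched roughly as follows. This is not the authors' code, which the post says is on GitHub. The sample records, the field names (`key`, `rights`, `occurrences`), the annotation values and all function names are hypothetical stand-ins for illustration, and no GBIF API call is made.

```python
# Hypothetical sample of dataset metadata; real records would come from the GBIF registry.
datasets = [
    {"key": "a", "rights": "CC0", "occurrences": 100},
    {"key": "b", "rights": "", "occurrences": 300},
    {"key": "c", "rights": "Free for non-commercial use", "occurrences": 50},
    {"key": "d", "rights": " CC0 ", "occurrences": 50},
]

# Hypothetical manual annotations, one entry per unique license text.
annotations = {
    "CC0": {"use_allowed": True, "attribution_required": False},
    "Free for non-commercial use": {"use_allowed": True, "attribution_required": True},
}


def normalize(rights):
    """Trim whitespace so trivially different rights strings collapse to one license."""
    return (rights or "").strip()


def unique_licenses(records):
    """Return the sorted list of distinct, non-empty license texts."""
    return sorted({normalize(r["rights"]) for r in records} - {""})


def join_annotations(records, notes):
    """Attach the annotation for each record's license (None if unlicensed or unannotated)."""
    return [{**r, "annotation": notes.get(normalize(r["rights"]))} for r in records]


def licensed_share(records):
    """Percentage of datasets, and of occurrences, that carry any license text."""
    licensed = [r for r in records if normalize(r["rights"])]
    total_occurrences = sum(r["occurrences"] for r in records)
    return {
        "datasets_pct": 100 * len(licensed) / len(records),
        "occurrences_pct": 100 * sum(r["occurrences"] for r in licensed) / total_occurrences,
    }


if __name__ == "__main__":
    print(unique_licenses(datasets))
    print(licensed_share(datasets))
```

Reporting the share twice, once per dataset and once per occurrence, mirrors the post's distinction between the 10% of datasets and the 26% of occurrences that have a license.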

Link:

http://peterdesmet.com/posts/analyzing-gbif-data-licenses.html

From feeds:

Open Access Tracking Project (OATP) » abernard102@gmail.com

Tags:

oa.new oa.comment oa.gbif oa.biodiversity oa.data oa.copyright oa.licensing oa.libre

Date tagged:

01/09/2014, 08:24

Date published:

01/09/2014, 03:24