Community repositories: the best way to share the data underlying your research | Research Data at Springer Nature

peter.suber's bookmarks 2022-07-19

Summary:

"The advantages of using community repositories

Advantage 1: Community familiarity. Experienced researchers of a given community know which repositories frequently hold the types of data generated and analysed in their field. If they wish to find data, a community repository is the first place to check. Datasets in these repositories also frequently link to the research article(s) that have utilised the data (here’s an example on the PANGAEA repository). Most frequently, the data link to the first article that the data underlie, but some repositories allow for subsequent articles that made use of the data to also be referenced.

It is also worth bearing in mind that experienced researchers expect to see the data that underlie the paper deposited/shared in an appropriate repository. Seeing data for which a community repository exists shared in a generalist repository has two negative effects: i) inconvenience, as generalist repositories do not ensure the data are deposited in a standardised, community-approved form; ii) scepticism, as it appears the researchers who wrote the article either weren't familiar enough with the field to know where the data should be deposited or the data are not of sufficient quality to appear in a community repository.

Advantage 2: Metadata. As community repositories ensure comprehensive metadata is associated with the data they hold, it is easy to search one of these repositories for specific types of data. As an example, in NCBI's Sequence Read Archive (DNA & RNA data) you can search by specialist fields such as organism taxonomic classification, read length or sequencing platform. As another example, in PANGAEA (earth & environmental science data) you can search by geographic location, topic, funder and project.

Advantage 3: Interoperability. Data in the same repository tend to be interoperable, as the repositories require data to be in the correct format before they can be deposited. Therefore, users know what to expect and will not need to dedicate time to deciphering and standardising data formats.

Advantage 4: Machine-readability. The above features facilitate machine interaction with data: software and algorithms can be set up to find, access and analyse large amounts of data. This can greatly reduce the time needed for research projects, and enable projects of far greater scope than would otherwise be possible. Much importance has been placed upon this machine-accessibility of data, and it is a very active area of development.

Advantage 5: Time-saving. You do not need to worry about finding, re-familiarising yourself with and sharing your data if/when it is requested, as community repositories offer a straightforward method by which your data can be located, contextualised and accessed in future. Yes, preparing the data and metadata for sharing takes additional time up front, but it can save a lot of time later. In fact, the better your data and the more popular your research, the more time can be saved...."
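
As an illustration of Advantages 2 and 4 (not part of the quoted post), here is a minimal sketch of how a script might build a structured search against NCBI's Sequence Read Archive through the public E-utilities ESearch endpoint. The function name `build_sra_search_url` is hypothetical, and the `[Organism]` and `[Platform]` field tags are assumptions that should be checked against the SRA search documentation. The sketch only constructs the URL and sends no request.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (no request is sent in this sketch).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_sra_search_url(organism, platform=None, max_results=20):
    """Build an ESearch URL that queries the Sequence Read Archive.

    Hypothetical helper: the [Organism] and [Platform] field tags are
    assumptions and should be verified before relying on the results.
    """
    # Each metadata field becomes one tagged search term.
    terms = [f'"{organism}"[Organism]']
    if platform:
        terms.append(f'"{platform}"[Platform]')

    params = {
        "db": "sra",                    # search the SRA database
        "term": " AND ".join(terms),    # combine the field-restricted terms
        "retmax": max_results,          # cap the number of returned IDs
        "retmode": "json",              # ask for a machine-readable response
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_sra_search_url("Homo sapiens", platform="ILLUMINA"))
```

Because the repository exposes the same metadata fields for every record, a query like this can be generated and repeated by software rather than assembled by hand, which is the kind of machine interaction the post describes.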

Link:

https://researchdata.springernature.com/posts/community-repositories-the-best-way-to-share-the-data-underlying-your-research

Updated:

07/19/2022, 08:03

From feeds:

Open Access Tracking Project (OATP) » peter.suber's bookmarks

Tags:

oa.new oa.data oa.repositories oa.repositories.data oa.recommendations oa.repositories.disciplinary oa.benefits

Date tagged:

07/19/2022, 12:03

Date published:

03/14/2022, 08:03