Navigating the Data Landscape: An Open Source Workflow | TexLibris

peter.suber's bookmarks 2025-02-12

Summary:

"While a few proprietary solutions are beginning to emerge that purport to be able to track institutional research data outputs (e.g., Web of Science), these products have notable shortcomings, including significant cost, difficulty assessing thoroughness of retrieval, and limited number of retrievals. In order to create a more sustainable and transparent solution, the Research Data Services team has developed a Python-based workflow that uses a number of publicly accessible APIs for data repositories and DOI registries. The code for running this workflow has been publicly shared through the UT Libraries GitHub at https://github.com/utlibraries/research-data-discovery so that others can also utilize this open approach to gathering information about research data outputs from user-defined institutions; the code will continue to be maintained and expanded to improve coverage and accuracy. To date, the workflow has identified more than 3,000 dataset publications by UT Austin researchers across nearly 70 different platforms, ranging from generalist repositories that accept any form of data like Dryad, figshare, and Zenodo to highly specialized repositories like the Digital Rocks Portal (for visualizing porous microstructures), DesignSafe (for natural hazards), and PhysioNet (for physiological signal data)...."
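
As a rough illustration of the approach the summary describes (asking a public DOI registry for datasets tied to an institution), here is a minimal sketch. It is not the UT Libraries code from the repository linked above. It assumes the DataCite REST API's `/dois` endpoint with an affiliation-name query, and the function names `build_query_url`, `summarize_records`, and `fetch_datasets` are hypothetical. Affiliation metadata is recorded inconsistently, so a plain name match like this is likely to miss records.

```python
import json
import urllib.parse
import urllib.request

# Public DataCite REST API endpoint for DOI records.
DATACITE_DOIS = "https://api.datacite.org/dois"


def build_query_url(affiliation, page_size=25):
    """Build a DataCite query URL for datasets whose creators list an affiliation.

    Assumption: matching on the creator affiliation name is good enough for a
    first pass; a real workflow would need name variants and other registries.
    """
    params = {
        "query": f'creators.affiliation.name:"{affiliation}"',
        "resource-type-id": "dataset",
        "page[size]": str(page_size),
    }
    return DATACITE_DOIS + "?" + urllib.parse.urlencode(params)


def summarize_records(payload):
    """Reduce a DataCite JSON response to DOI, title, and publishing repository."""
    rows = []
    for item in payload.get("data", []):
        attrs = item.get("attributes", {})
        titles = attrs.get("titles") or [{}]
        publisher = attrs.get("publisher")
        # The publisher may arrive as a plain string or as an object with a name.
        if isinstance(publisher, dict):
            publisher = publisher.get("name")
        rows.append(
            {
                "doi": attrs.get("doi"),
                "title": titles[0].get("title"),
                "repository": publisher,
            }
        )
    return rows


def fetch_datasets(affiliation, page_size=25):
    """Fetch one page of results over the network and summarize it."""
    url = build_query_url(affiliation, page_size)
    with urllib.request.urlopen(url, timeout=30) as resp:
        return summarize_records(json.load(resp))
```

Calling `fetch_datasets("University of Texas at Austin")` would return a single page of results; counting datasets per `repository` value is one way to get a per-platform tally like the one the summary reports.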

Link:

https://texlibris.lib.utexas.edu/2025/02/navigating-the-data-landscape-an-open-source-workflow/

From feeds:

Open Access Tracking Project (OATP) » peter.suber's bookmarks

Tags:

oa.new oa.data oa.workflows oa.floss oa.repositories oa.repositories.data

Date tagged:

02/12/2025, 15:15

Date published:

02/12/2025, 10:15