Obstacles to Dataset Citation Using Bibliographic Management Software | Data Science Journal
peter.suber's bookmarks 2025-05-21
Abstract: Governmental, funder, and scholarly publisher mandates for FAIR and open data are pushing researchers to archive data in repositories with persistent identifiers and to link to those datasets in journal articles. Data citations enable transparency in research, and they support credit and impact metrics for data reuse. However, numerous barriers to adoption remain. One is that the bibliographic reference management software researchers commonly use to ease the referencing process may not yet be equipped to handle datasets. This paper examines how ready commonly used reference management software is to support researchers in importing bibliographic metadata for datasets and generating references that comply with leading practices for data citation. Using seven major reference managers and datasets sampled across 14 Earth, space, and environmental sciences repositories, we identify and analyze common errors in reference-manager-facilitated metadata capture, storage, and citation export. We use quantitative content analysis to compare repository-provided recommended citations, reference manager results, and DataCite metadata records. We find that a majority of frequently used reference managers do not adequately support data citation, which obstructs uptake of data citation by researchers and thereby limits the growth of credit and incentives for data sharing and reuse. The range and scale of issues uncovered are broadly extensible and relevant to data citation across disciplines. We present actionable recommendations for reference manager, data repository, scholarly publisher, and researcher stakeholders to increase the ease, efficiency, and accuracy of data citation facilitated by bibliographic management software.