The Role of Data Papers in Scientific Data Dissemination and Reuse | Scientific Data
peter.suber's bookmarks 2026-06-15
Summary:
Abstract: Data papers have emerged as a distinctive publication format in the open science era, yet their actual role in fostering scientific data reuse remains uncertain. Drawing on extensive citation-context datasets and large language models, this study investigates the specific contributions and citation purposes of data papers, complemented by a comparative analysis across diverse disciplinary domains. The findings show that while data papers significantly enhance dataset visibility and scholarly recognition, most citations emphasize descriptive or contextual information rather than facilitating direct computational, comparative, or synthetic reuse. Moreover, despite persistent advocacy for a “data-driven” research paradigm, the contribution of data papers to catalyzing novel scientific insights appears to be modest and often indirect. Notable disciplinary differences reveal that data-intensive fields, such as Earth and Life Sciences, embed datasets directly into research workflows, whereas more conceptual domains primarily treat data papers as methodological guides or contextual references. Additionally, publishing attributes, ranging from journal practices to national data infrastructures, emerge as critical determinants of reuse patterns, with dissemination effectiveness shaped more by publication management and the thematic priorities of editors than by output volume alone.