Research data management in institutional repositories: an architectural approach using data lakehouses | Emerald Insight

peter.suber's bookmarks 2024-12-23

Summary:

Abstract: Purpose

This paper aims to address the pressing challenges in research data management within institutional repositories, focusing on the escalating volume, heterogeneity and multi-source nature of research data. The aim is to enhance the data services provided by institutional repositories and modernise their role in the research ecosystem.

Design/methodology/approach

The authors analyse the evolution of data management architectures through literature review, emphasising the advantages of data lakehouses. Using the design science research methodology, the authors develop an end-to-end data lakehouse architecture tailored to the needs of institutional repositories. This design is refined through interviews with data management professionals, institutional repository administrators and researchers.

Findings

The authors present a comprehensive framework for data lakehouse architecture, comprising five fundamental layers: data collection, data storage, data processing, data management and data services. Each layer articulates the implementation steps, delineates the dependencies between them and identifies potential obstacles with corresponding mitigation strategies.
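
As a rough illustration of the layered structure described above, the five layers can be modelled as an ordered set of components with explicit dependencies. This is only a sketch: the layer names come from the abstract, but the assumption that each layer depends solely on the one before it, and the names `Layer`, `build_layers` and `implementation_order`, are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass

# Layer names as listed in the abstract.
LAYERS = [
    "data collection",
    "data storage",
    "data processing",
    "data management",
    "data services",
]


@dataclass(frozen=True)
class Layer:
    name: str
    depends_on: tuple = ()  # names of layers that must be in place first


def build_layers(names):
    """Chain the layers so each depends on the previous one (an assumption)."""
    layers = []
    for i, name in enumerate(names):
        deps = (names[i - 1],) if i > 0 else ()
        layers.append(Layer(name, deps))
    return layers


def implementation_order(layers):
    """Return layer names in an order that satisfies every dependency."""
    done, order = set(), []
    pending = list(layers)
    while pending:
        ready = [l for l in pending if all(d in done for d in l.depends_on)]
        if not ready:
            raise ValueError("circular or missing dependency")
        for layer in ready:
            order.append(layer.name)
            done.add(layer.name)
        pending = [l for l in pending if l.name not in done]
    return order


if __name__ == "__main__":
    for step, name in enumerate(implementation_order(build_layers(LAYERS)), 1):
        print(step, name)
```

Keeping the dependencies as data rather than hard-coding the order means a different dependency structure, should the full paper describe one, could be substituted without changing the ordering logic.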

Practical implications

The proposed data lakehouse architecture provides a practical and scalable solution for institutional repositories to manage research data. It offers a range of benefits, including enhanced data management capabilities, expanded data services, improved researcher experience and a modernised institutional repository ecosystem. The paper also identifies and addresses potential implementation obstacles and provides valuable guidance for institutions embarking on the adoption of this architecture. The implementation in a university library showcases how the architecture enhances data sharing among researchers and empowers institutional repository administrators with comprehensive oversight and control of the university’s research data landscape.

Originality/value

This paper enriches the theoretical knowledge and provides a comprehensive research framework and paradigm for scholars in research data management. It details a pioneering application of the data lakehouse architecture in an academic setting, highlighting its practical benefits and adaptability to meet the specific needs of institutional repositories.

Link:

https://www.emerald.com/insight/content/doi/10.1108/dlp-02-2024-0022/full/html

From feeds:

Open Access Tracking Project (OATP) » peter.suber's bookmarks

Tags:

oa.new oa.paywalled oa.repositories oa.ir oa.green oa.data oa.rdm

Date tagged:

12/23/2024, 13:01

Date published:

12/23/2024, 08:01