OpenCommon: an Open Science platform and gateway for intersciences communities - Archive ouverte HAL
Hanna_S's bookmarks 2024-11-26
Summary:
Abstract "The 2nd National Plan for Open Science (PNSO 2021-2024) aims to generalize open science practices, to share and open research data, and to promote other research outputs such as source code and workflows. Sharing and reuse concern not only the scientists who produce and (re)use data but also support staff, especially those in charge of data stewardship. To contribute to these objectives, in particular to the structuring, sharing and opening of research data and to the change in practices needed to make Open Science the default, we propose the OpenCommon platform, which offers an intersciences research community unified access to, and intermediation with, the Open Science digital ecosystem. Faced with the multitude of platforms, internet services and protocols for digital research resources, and the multitude of descriptive metadata formats and even of data formats themselves, the end user of data is puzzled at the very least, all the more so when they have to cross-reference data of different kinds and from different disciplines. We focused on three scenarios: data search, FAIRification and data deposit. These scenarios rely on identifying and integrating services and vocabularies of interest. The services may be, for example, reference registries, data repositories or data portals. FAIR vocabularies are used to describe data management and the users' scientific domains. A tool is proposed to help data communities describe their outputs in accordance with the FAIR principles. The generic OpenCommon platform makes it possible to create data communities in which data producers and users collaborate with support staff to facilitate access to, and production of, data catalogues in a disciplinary, multidisciplinary, interdisciplinary or intersciences context, whatever the nature of the data, in order to break down disciplinary and technical data silos. The platform manages the descriptive metadata of data communities. 
The data itself is hosted by partner data repositories. OpenCommon gives the user a single entry point regardless of how these metadata are distributed. We present the platform and its uses. The core of the platform is based on open data cataloguing standards such as DCAT (Data Catalog Vocabulary) and its profiles for European scientific data. It builds on web technologies and on W3C recommendations and standards, in particular those of the web of data, and is implemented on a graph database queried with SPARQL. User assistance and the integration of the FAIR principles are achieved through the adoption of controlled vocabularies. The platform’s data communities support user workflows for the collaborative production of data catalogues at the level of a team, unit or organization. This work in progress builds on lessons learned from projects such as ANR Semantic4FAIR (2019-2022), on the semantization of Météo-France data, and ENVIA, on the use of AI for environmental data analysis. The scientific and technical aspects of exploiting interdisciplinary scientific data were studied in the DataNoos project (a STAE foundation project, 2018-2022, then an MSHS-T platform, 2022-2024), which produced a first research prototype. During the ANR SO-DRIIHM project (2019-2024), we continued development to turn this prototype into the Open Science gateway of the labex DRIIHM, a federation of 13 international Observatoires Hommes-Milieux (OHM). The needs of the labex DRIIHM scientific community were studied through a co-design process. This community, intersciences by nature, studies human impacts on the environment. In ecology and environmental science, the main target repositories are INEE’s Data.Indores, PNDB, GBIF, or Data Terra and the BRGM’s EasyData platform. In the social sciences and humanities, the Nakala repository (Huma-Num) has been integrated. 
Development continues with the France Relance project SO-DRIIHM-FR (2022-2024). The platform has become a free, open project that welcomes new contributions and contexts of use. We share our feedback on building the OpenCommon platform from the Open Science digital ecosystem: reference registries (DOI, ORCID, ROR, IdRef, IdHAL), the services and APIs of e-infrastructures, Dataverse, Recherche Data Gouv, and European Union services such as data.europa, the official portal for European open data. FAIR vocabularies are at the heart of the platform: they provide semantic enrichment and support an information-system architecture ("urbanization") built on a shared conceptualization of Open Science, such as INIST's Open Science Thesaurus (TSO), or on knowledge representations of scientific domains that assist data entry and search. We also position the platform with respect to data commons such as the European Open Science Cloud (EOSC) and propose a way to operationalize the FAIRification of community data, to facilitate their reuse at a time when Artificial Intelligence is about to dethrone HPC!"
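
Illustration (not part of the abstract): the abstract says the platform manages DCAT-based descriptive metadata while the data itself stays in partner repositories. The following is a minimal sketch of what such a catalogue record could look like as JSON-LD, written with the Python standard library only. The `dcat:` and `dct:` terms come from the real W3C DCAT and DCMI Terms vocabularies; every identifier, URL, title and helper name (`make_dataset`, `make_catalog`, `find_by_keyword`) is hypothetical, and the keyword lookup is a plain-Python stand-in for a query, not OpenCommon's actual SPARQL implementation or data model.

```python
import json

# Real vocabulary namespaces (W3C DCAT and DCMI Terms).
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
}


def make_dataset(dataset_id, title, keywords, access_url):
    """Describe one dataset; the data stays at access_url (a partner repository)."""
    return {
        "@id": dataset_id,
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dcat:keyword": sorted(keywords),
        "dcat:distribution": [
            {"@type": "dcat:Distribution", "dcat:accessURL": {"@id": access_url}}
        ],
    }


def make_catalog(catalog_id, title, datasets):
    """Group dataset descriptions into one catalogue for a data community."""
    return {
        "@context": CONTEXT,
        "@id": catalog_id,
        "@type": "dcat:Catalog",
        "dct:title": title,
        "dcat:dataset": list(datasets),
    }


def find_by_keyword(catalog, keyword):
    """Return titles of datasets tagged with keyword (stand-in for a real query)."""
    return [
        d["dct:title"]
        for d in catalog["dcat:dataset"]
        if keyword in d["dcat:keyword"]
    ]


# Hypothetical identifiers and URLs, for illustration only.
catalog = make_catalog(
    "https://example.org/community/demo/catalog",
    "Demo community catalogue",
    [
        make_dataset(
            "https://example.org/dataset/1",
            "River water quality survey",
            {"water", "ecology"},
            "https://repository.example.org/records/1",
        ),
        make_dataset(
            "https://example.org/dataset/2",
            "Interviews with local residents",
            {"sociology", "water"},
            "https://repository.example.org/records/2",
        ),
    ],
)

print(json.dumps(catalog, indent=2))
print(find_by_keyword(catalog, "ecology"))
```

The catalogue holds only metadata: each `dcat:Distribution` points through `dcat:accessURL` to the repository that hosts the data, which mirrors the split between metadata and hosted data described in the abstract. A single keyword shared across disciplines (here "water") is enough to bring an ecology dataset and a social-science dataset into the same result.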