A protocol for adding knowledge to Wikidata, a case report | bioRxiv

peter.suber's bookmarks 2020-04-14

Summary:

Abstract: Pandemics, even more than other scientific questions, require swift integration of knowledge and identifiers. In a setting where there is a large number of loosely connected projects and initiatives, we need a common ground, also known as a “commons”. Wikidata, a public knowledge graph aligned with Wikipedia, is such a commons, but Wikidata may not always have the right schema for the urgent questions. In this paper, we address this problem by showing how a data schema required for the integration can be modelled with entity schemas represented by shape expressions. As a telling example, we describe the process of aligning resources on the genomics of the SARS-CoV-2 virus and related viruses, as well as how shape expressions can be defined for Wikidata, helping others who study the SARS-CoV-2 pandemic. How this model can be used to make data between various resources interoperable is demonstrated by integrating data from NCBI Taxonomy, NCBI Genes, UniProt, and WikiPathways. Based on that model, a set of automated applications, or bots, was written for regular updates of these sources in Wikidata and added to a platform for automatically running these updates. Although this workflow was developed and applied in the context of the SARS-CoV-2 pandemic, it was also applied to other human coronaviruses (MERS, SARS, Human coronavirus NL63, Human coronavirus 229E, Human coronavirus HKU1, Human coronavirus OC43) to demonstrate its broader applicability.

Link:

https://www.biorxiv.org/content/10.1101/2020.04.05.026336v1

From feeds:

Open Access Tracking Project (OATP) » peter.suber's bookmarks

Tags:

oa.new oa.wikidata oa.case oa.medicine oa.ontologies oa.speed oa.interoperability

Date tagged:

04/14/2020, 09:55

Date published:

04/14/2020, 05:55