Assisting in semantic enrichment of scholarly resources by connecting neonion and Wikidata
Abstract: Explicit semantic enrichments make digital scholarly publications potentially easier to find, navigate, organize and understand. But whereas the generation of explicit semantic information is common in fields like biomedical research, comparable approaches are rare in the humanities. Apart from a lack of authoritative structured knowledge bases formalizing the respective conceptualizations and terminologies, many experts from specialized fields of research seem reluctant to employ the technologies and methods that are currently available for generating structured knowledge representations. However, human involvement is indispensable in organizing and applying the domain-specific knowledge representations needed to contextualize structured semantic data extracted from textual and scholarly resources. Over the past decade, various efforts have been made towards openly accessible online knowledge graphs containing collaboratively edited, structured and cross-linked data. Such public knowledge bases might be suitable as a starting point for defining formalized domain knowledge representations with which the subjects and findings of a research domain can be described. Extensive re-use of the widely adopted shared conceptualizations from a large collaborative knowledge base could benefit processes of semantic enrichment in more than one way, especially those involving domain experts with less technical backgrounds. In this work, we discuss ways of enabling domain experts to semantically enrich their research resources by generating semantic annotations in text documents using the scholarly reading and annotation software neonion. We introduce features to the web-based software that improve various aspects of the semantic annotation process by connecting it to the collaboratively edited public knowledge base Wikidata.
Furthermore, we argue that the re-use of external structured knowledge from Wikidata both fuels an enhanced workflow for assisted, subject-matter-sensitive semantic annotation and allows the knowledge base to benefit in return from the structured data generated within neonion. Our prototype implementation extracts schematic terminological information from the Wikidata objects linked by local annotations and feeds it into the new recommender system, where candidate descriptors for vocabulary amendment are determined, most notably by the association rule mining recommender engine Snoopy. This paper is a follow-up to the bachelor’s thesis “Assisting in semantic development of knowledge domains by recommending terminology”, submitted by Jakob Höper under the supervision of Prof. Dr. Claudia Müller-Birn. It elaborates in further detail on aspects of the presented implementation that were not exhaustively covered in said thesis.
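To make the recommendation step described above more concrete, the following is a minimal sketch of association rule mining over the property sets of Wikidata items. It is not the Snoopy engine and not the neonion implementation. The function names `mine_rules` and `recommend`, the thresholds, and the four sample items are assumptions made purely for illustration. Only the property identifiers (P31 "instance of", P569 "date of birth", P106 "occupation", P19 "place of birth") are real Wikidata properties.

```python
from collections import Counter
from itertools import combinations


def mine_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Mine single-item rules x -> y as (x, y, support, confidence) tuples."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair yields up to two directed rules.
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return rules


def recommend(used, rules):
    """Rank properties missing from the vocabulary by their best rule confidence."""
    scores = {}
    for x, y, _support, confidence in rules:
        if x in used and y not in used:
            scores[y] = max(scores.get(y, 0.0), confidence)
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(scores, key=lambda p: (-scores[p], p))


# Hypothetical sample: the properties used on four Wikidata items
# that local annotations link to (invented for illustration).
transactions = [
    {"P31", "P569", "P106"},
    {"P31", "P569", "P19"},
    {"P31", "P569", "P106", "P19"},
    {"P31", "P106"},
]

rules = mine_rules(transactions)
print(recommend({"P31"}, rules))  # ['P106', 'P569']
```

Here a vocabulary that so far only uses P31 is offered P106 and P569 as candidate descriptors, because both frequently co-occur with P31 in the sample. A production recommender would typically mine larger itemsets and draw on far more data than this sketch does.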