Open Citations Corpus Import Process | Open Citations and Related Work

abernard102@gmail.com 2013-03-07

Summary:

"As part of the Open Citations project, we have been asked to review and improve the process of importing data into the Open Citations Corpus, taking the scripts from the initial project as our starting point. The current import procedure evolved from several disconnected processes and requires running multiple command line scripts and transforming the data into different intermediate formats. As a consequence, it is not very efficient and we will be looking to improve on the speed and reliability of the import procedure. Moreover, there are two distinct procedures depending on the source of the data (arXiv or PubMed Central); we are hoping to unify the common parts of these procedures into a single process which can be simplified and normalised to improve code re-use and comprehensibility ..."

Link:

http://opencitations.wordpress.com/2013/03/06/open-citations-corpus-import-process/

From feeds:

Open Access Tracking Project (OATP) » abernard102@gmail.com

Tags:

oa.new oa.data oa.comment oa.harvesting oa.arxiv oa.metadata oa.tools oa.pmc oa.oai-pmh oa.bibserver oa.open_citations

Date tagged:

03/07/2013, 13:05

Date published:

03/07/2013, 08:05