DPLA Launches Open-Source Spark OAI Harvester (Digital Public Library of America blog)

peter.suber's bookmarks 2017-08-17

Summary:

"The DPLA is launching an open-source tool for fast, large-scale data harvests from OAI repositories. The tool uses a Spark distributed processing engine to speed up and scale up the harvesting operation, and to perform complex analysis of the harvested data. It is helping us improve our internal workflows and provide better service to our hubs. The Spark OAI Harvester is freely available and we hope that others working with interoperable cultural heritage or science data will find uses for it in their own projects."

Link:

https://dp.la/info/2017/08/16/dpla-launches-open-source-spark-oai-harvester/

From feeds:

Open Access Tracking Project (OATP) » peter.suber's bookmarks
Open Access Tracking Project (OATP) » lterrat's bookmarks

Tags:

oa.harvesting oa.metadata oa.repositories oa.standards

Date tagged:

08/17/2017, 15:30

Date published:

08/17/2017, 05:49