Open Access Media Importer: Apology, frontend & usage

abernard102@gmail.com 2012-08-20

Summary:

In a prior blog post the Open Access Media Importer was introduced with the following explanation: “Open Access scientific literature contains, almost by definition, content suitable – both in substance and licensing – for Wikimedia Commons. However, currently, there seems to be no automated, easy way to identify such files, convert them into appropriate formats and import them into Commons.” The current blog post provides a brief update on the development of the Open Access Media Importer. “More than a month ago, I promised to blog about my quest to build the Open Access Media Importer for Wikimedia Commons... That being said, the rest of this article deals... more with technical issues: It describes the design of the Open Access Importer frontend. To access all elements of the envisioned scraper / transcoder / upload toolchain in a uniform way is important – nobody likes to use unusable software. After some deliberation, I chose to closely model it on the apt-get utility of the Debian GNU/Linux distribution, coming up with three wrapper scripts named oa-get, oa-cache and oa-put. oa-get takes care of everything regarding downloads, acquiring metadata and media. With the simple invocation oa-get download-metadata, it downloads index files from PubMed Central, skipping already acquired files and displaying a progress bar. Its less-complex sister invocation, oa-get download-media, could be imagined as somewhat analogous to wget -i... oa-cache is the complementary tool for any activity that does not need network connectivity. It is able to find suitable supplementary materials, writing their URLs and possible metadata to a CSV file, and it writes an additional file to identify articles having no or non-audiovisual supplementary materials. Known-useless files can thus be skipped on subsequent runs; since many articles do not contain any usable media, this speeds up processing tremendously.
oa-put's purpose will be upload activities for Wikimedia Commons. Unlike the other tools, it currently cannot do anything. Like the others, it will be usable both manually and in shell scripts in a consistent manner. Stay tuned: The next post will outline how you can write your own plugin for the Open Access Media Importer, extending the functionality of oa-get download-metadata and oa-cache find-media beyond accessing PubMed Central. If you are impatient, in the meantime you should follow the project on GitHub.”
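
The apt-get-style frontend described in the summary (one wrapper script per concern, each taking an action word as its first argument) could be sketched roughly as follows. This is a minimal illustration and not the project's actual code: only the names oa-get, oa-cache, download-metadata, download-media and find-media come from the post, while the TOOLS table, build_parser and run are hypothetical helpers introduced here.

```python
import argparse

# Hypothetical action table. The tool and action names are taken from the
# post; the mapping itself is an assumption made for this sketch.
TOOLS = {
    "oa-get": ["download-metadata", "download-media"],
    "oa-cache": ["find-media"],
}


def build_parser(tool):
    """Return a parser that accepts `<tool> <action>`, in the style of apt-get."""
    parser = argparse.ArgumentParser(prog=tool)
    parser.add_argument("action", choices=TOOLS[tool])
    return parser


def run(tool, argv):
    """Parse argv and return the (tool, action) pair that would be dispatched."""
    args = build_parser(tool).parse_args(argv)
    return tool, args.action


if __name__ == "__main__":
    # Stands in for the command line `oa-get download-metadata`.
    print(run("oa-get", ["download-metadata"]))
```

Because every tool shares the same `<tool> <action>` shape, an unknown action is rejected by the parser in the same way for all wrappers, which is one plausible reading of the "consistent manner" the post aims for in both manual and scripted use.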

Link:

http://wir.okfn.org/2012/03/10/open-access-media-importer-apology-frontend-usage/

Updated:

08/16/2012, 06:08

From feeds:

Open Access Tracking Project (OATP) » abernard102@gmail.com

Tags:

oa.new oa.pubmed oa.licensing oa.comment oa.metadata oa.tools oa.wikimedia oa.floss oa.github oa.libre

Authors:

abernard

Date tagged:

08/20/2012, 18:58

Date published:

03/10/2012, 23:27