Release ‘open’ data from their PDF prisons using tabulizer | R-bloggers

peter.suber's bookmarks 2017-04-20

Summary:

"As a political scientist who regularly encounters so-called 'open data' in PDFs, this problem is particularly irritating. PDFs may have 'portable' in their name, making them display consistently on various platforms, but that portability means any information contained in a PDF is irritatingly difficult to extract computationally."

Link:

https://www.r-bloggers.com/release-open-data-from-their-pdf-prisons-using-tabulizer/

From feeds:

Open Access Tracking Project (OATP) » peter.suber's bookmarks
Open Access Tracking Project (OATP) » lkfitz's bookmarks

Tags:

oa.new oa.data oa.comment oa.obstacles oa.tools oa.extraction oa.code4oa oa.pdf oa.formats

Date tagged:

04/20/2017, 21:59

Date published:

04/20/2017, 05:20