Release ‘open’ data from their PDF prisons using tabulizer | R-bloggers

peter.suber's bookmarks 2017-04-20


"As a political scientist who regularly encounters so-called 'open data' in PDFs, this problem is particularly irritating. PDFs may have 'portable' in their name, making them display consistently on various platforms, but that portability means any information contained in a PDF is irritatingly difficult to extract computationally."


From feeds:

Open Access Tracking Project (OATP) » peter.suber's bookmarks
Open Access Tracking Project (OATP) » lkfitz's bookmarks

Tags: oa.formats oa.comment oa.obstacles oa.extraction oa.code4oa oa.pdf


04/20/2017, 05:20