rOpenSci | fulltext v1: text-mining scholarly works | Scott Chamberlain

ab1630's bookmarks 2018-01-18

Summary:

"The problem: Text-mining - the art of answering questions by extracting patterns, data, etc. out of the published literature - is not easy. It’s made incredibly difficult because of publishers. It is a fact that the vast majority of publicly funded research across the globe is published in paywall journals. That is, taxpayers pay twice for research: once for the grant to fund the work, then again to be able to read it.....

...fulltext is a package to help R users address the above problems, and get published literature from the web in its many forms, and across all publishers.

the fulltext package: fulltext tries to make the following use cases as easy as possible:

  • Search for articles
  • Fetch abstracts
  • Fetch full text articles
  • Get links for full text articles (xml, pdf)
  • Extract text from articles
  • Collect sections of articles that you actually need (e.g., titles)
  • Download supplementary materials

fulltext organizes functions around the above use cases, then provides flexibility to query many data sources within that use case (i.e., function). For example, fulltext::ft_search searches for articles - you can choose among one or more of many data sources to search, passing options to each source as needed...."
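
The design described in the summary (one function per use case, with a choice of data sources and per-source options) can be sketched roughly as follows. This is an illustration of the pattern only, written in Python rather than R. It is not the fulltext package's code or API, and every name in it (search, SOURCES, source_a, source_b, the toy corpora) is hypothetical.

```python
from typing import Callable, Dict, List, Optional, Sequence

# Hypothetical in-memory "data sources". A real tool would call
# publisher or indexer web APIs here instead of filtering a list.
def search_source_a(query: str, limit: int = 10) -> List[str]:
    corpus = ["ecology of lakes", "urban ecology", "protein folding"]
    return [title for title in corpus if query in title][:limit]

def search_source_b(query: str, limit: int = 10) -> List[str]:
    corpus = ["ecology and climate", "graph theory"]
    return [title for title in corpus if query in title][:limit]

# Registry mapping a source name to its search function (assumed names).
SOURCES: Dict[str, Callable[..., List[str]]] = {
    "source_a": search_source_a,
    "source_b": search_source_b,
}

def search(
    query: str,
    sources: Sequence[str] = ("source_a",),
    options: Optional[Dict[str, dict]] = None,
) -> Dict[str, List[str]]:
    """One entry point for the 'search' use case.

    The caller picks one or more sources and may pass a separate
    dict of keyword options for each source.
    """
    options = options or {}
    results: Dict[str, List[str]] = {}
    for name in sources:
        if name not in SOURCES:
            raise ValueError(f"unknown source: {name}")
        # Forward only the options meant for this particular source.
        results[name] = SOURCES[name](query, **options.get(name, {}))
    return results

if __name__ == "__main__":
    hits = search(
        "ecology",
        sources=["source_a", "source_b"],
        options={"source_a": {"limit": 1}},
    )
    for source_name, titles in hits.items():
        print(source_name, titles)
```

Keeping results grouped by source name lets the caller see which source returned what, at the cost of leaving any merging or de-duplication to a later step.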

Link:

https://ropensci.org/technotes/2018/01/17/fulltext-v1/

From feeds:

Open Access Tracking Project (OATP) » ab1630's bookmarks

Tags:

oa.new oa.mining oa.tools oa.interoperability oa.data oa.search oa.discoverability oa.metadata oa.formats

Date tagged:

01/18/2018, 13:32

Date published:

01/18/2018, 12:57