The NIH Open Citation Collection: A public access, broad coverage resource

peter.suber's bookmarks 2019-10-14

Summary:

Abstract: Citation data have remained hidden behind proprietary, restrictive licensing agreements, which raises barriers to entry for analysts wishing to use the data, increases the expense of performing large-scale analyses, and reduces the robustness and reproducibility of the conclusions. For the past several years, the National Institutes of Health (NIH) Office of Portfolio Analysis (OPA) has been aggregating and enhancing citation data that can be shared publicly. Here, we describe the NIH Open Citation Collection (NIH-OCC), a public access database for biomedical research that is made freely available to the community. This dataset, which has been carefully generated from unrestricted data sources such as MedLine, PubMed Central (PMC), and CrossRef, now underlies the citation statistics delivered in the NIH iCite analytic platform. We have also included data from a machine learning pipeline that identifies, extracts, resolves, and disambiguates references from full-text articles available on the internet. Open citation links are available to the public in a major update of iCite (https://icite.od.nih.gov).

Link:

https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3000385

From feeds:

Open Access Tracking Project (OATP) » peter.suber's bookmarks

Tags:

oa.new oa.nih oa.usa oa.citations oa.metadata oa.nih-occ oa.medicine oa.biomedicine oa.biology oa.ai oa.unpaywall

Date tagged:

10/14/2019, 10:38

Date published:

10/14/2019, 06:43