The Right to Read Is the Right to Mine
abernard102@gmail.com 2012-06-11
Summary:
“Introduction... Researchers can find and read papers online, rather than having to manually track down print copies. Machines (computers) can index the papers and extract the details (titles, keywords etc.) in order to alert scientists to relevant material. In addition, computers can extract factual data and meaning by “mining” the content, opening up the possibility that machines could be used to make connections (and even scientific discoveries) that might otherwise remain invisible to researchers.
However, it is not generally possible today for computers to mine the content in papers due to constraints imposed by publishers. While Open Access (OA) is improving the ability for researchers to read papers (by removing access barriers), still only around 20% of scholarly papers are OA. The remainder are locked behind paywalls. As per the vast majority of subscription contracts, subscribers may read paywalled papers, but they may not mine them.
Content mining is the way that modern technology locates digital information. Because digitized scientific information comes from hundreds of thousands of different sources in today’s globally connected scientific community,[2] and because current data sets can be measured in terabytes,[1] it is often no longer possible to simply read a scholarly summary in order to make scientifically significant use of such information.[3] A researcher must be able to copy information, recombine it with other data and otherwise “re-use” it so as to produce truly helpful results. Not only is it a deductive tool to analyze research data, it is how search engines operate to allow discovery of content. To prevent mining is therefore to force scientists into blind alleys and silos where only limited knowledge is accessible. Science does not progress if it cannot incorporate the most recent findings and move forward from there...
Definition... ‘Open Content Mining’ means the unrestricted right of subscribers to extract, process and republish content manually or by machine in whatever form (text, diagrams, images, data, audio, video, etc.) without prior specific permissions and subject only to community norms of responsible behaviour in the electronic age.
Principle 1: Right of Legitimate Accessors to Mine... We assert that there is no legal, ethical or moral reason to refuse to allow legitimate accessors of research content (OA or otherwise) to use machines to analyse the published output of the research community.
Researchers expect to access and process the full content of the research literature with their computer programs and should be able to use their machines as they use their eyes. The right to read is the right to mine.
Principle 2: Lightweight Processing Terms and Conditions... Mining by legitimate subscribers should not be prohibited by contractual or other legal barriers. Publishers should add clarifying language in subscription agreements that content is available for information mining by download or by remote access. Where access is through researcher-provided tools, no further cost should be required. Users and providers should encourage machine processing.
Principle 3: Use... Researchers can and will publish facts and excerpts which they discover by reading and processing documents. They expect to disseminate and aggregate statistical results as facts and context text as fair use excerpts, openly and with no restrictions other than attribution. Publisher efforts to claim rights in the results of mining further retard the advancement of science by making those results less available to the research community. Such claims should be prohibited. Facts don’t belong to anyone...
Strategies... We plan to assert the above rights by:
[1] Educating researchers and librarians about the potential of content mining and the current impediments to doing so, including alerting librarians to the need not to cede any of the above rights when signing contracts with publishers
[2] Compiling a list of publishers and indicating what rights they currently permit, in order to highlight the gap between the rights here being asserted and what is currently possible
[3] Urging governments and funders to promote and aid the enjoyment of the above rights”