chem-bla-ics: Elsevier's new text mining initiative is a step sideways

abernard102@gmail.com 2014-02-16

Summary:

Elsevier's new ideas on text mining are getting a lot of attention now. Sadly, they get it wrong, again. On the bright side, all other publishers, which are expected to follow this year, can learn from this mistake. Because if done right, the publishers can even help science forward, despite currently crippling progress. That sounds harsh, and surely they have done a lot of good for science. In fact, we would not be where we are now without the publishers. But things have changed. With the internet anyone can be a publisher. We see this with blogs, we see this with Lulu.com. And, contrary to what some misinformed people think, this is independent of peer review. Publishers were important because they provided a channel to disseminate knowledge. But paper publishing is no longer the most efficient way. In fact, in terms of value, paper has been overtaken for some years now. And we need more added value. The issue is not the shipping of knowledge, but keeping up with it. And there too, publishing is inefficient: human language is nice for sharing ideas and concepts, but it fails at disseminating raw facts: measured data. Anyone who has tried creating a data set to find patterns knows this: extracting the information is a lot of effort, mostly caused by the broken paper publishing model. This is most apparent in the research domains where data repositories exist, but sadly these cover only a small minority of data types. Now, text mining seems in that sense the wrong question: why try to recover knowledge that should have gone into repositories in the first place? I agree. However, we cannot just throw away all the knowledge kept in these papers, and certainly not as long as people keep insisting on seeing only papers as scientific success. We are seeing this improve, but only very slowly. Things that were apparent to me as a student 20 years ago are the things that scholars are still struggling with today. Depressing indeed, but it does help you grow a good sense of patience.
And now, Elsevier wants to make a step forward, wants to be leading in science dissemination again. And they come up with an intermediate solution between actual knowledge dissemination and profit: they come up with a license model, increasing their monopoly on knowledge and trying to lure the scientist into a non-commercial license. From a money-making perspective this is what society expects from them. For someone who likes to see societal problems solved, this is disappointing. They had a great opportunity to lead the field. Now, is it all bad? Not at all. It's a step, but not the step I would have liked to see. It will be a success, because the CC-BY-NC data that will come out of it will be part of the web of knowledge. No one will care about the NC part, except all those SMEs in Europe that work on products to help society, which will find it much harder to collaborate with other companies because they cannot share the knowledge they created from analyzing the literature (does Elsevier want a monopoly in this analysis?). Nor will many in the academic community complain. Surely, those who have worried about this will. But the scholars at universities do not care about NC licenses. After all, universities are not commercial. Asking a student to pay 30 thousand euros for a year is surely not commercial. That is the consensus. But I note that this consensus has not been tried in court, and I am looking forward to the day it will happen. Elsevier will likely not challenge this, and will silently accept this situation. Just like Microsoft never made a big deal out of people copying office versions of their operating system for use at home: you do not bite the hand that feeds you (too hard). You rather go after others, like Academia.edu. It will not be scholars Elsevier will enforce the NC on, and it will not be large companies either: if any, it will be the SMEs. Support them, and do not agree with the license. Well, it was a nice opportunity for Elsevier.
I only see my choice to sign The Cost of Knowledge reaffirmed ... The choice of the NC clause is totally useless in any context of dissemination. I call for Elsevier to at least add this option, if they are serious about improving: text mining is provided to subscribers, via a decent API, adhering to:
  1. Facts extracted from literature are licensed CCZero and attribution is paid (facts are copyright free in most parts of the world)
  2. Output can contain 'snippets' of the original text under international 'fair use' concepts, and licensed as CC-BY ...

Link:

http://chem-bla-ics.blogspot.com/2014/02/elseviers-new-text-mining-initiative-is.html

From feeds:

Open Access Tracking Project (OATP) » abernard102@gmail.com

Tags:

oa.new oa.comment oa.academia.edu oa.versions oa.copyright oa.licensing oa.takedowns oa.elsevier oa.publishers oa.business_models oa.policies oa.mining oa.advocacy oa.petitions oa.cost_of_knowledge oa.libre

Date tagged:

02/16/2014, 09:42

Date published:

02/16/2014, 04:42