Mining for insight - TEXT AND DATA MINING - Research Information

abernard102@gmail.com 2013-08-16

Summary:

"Text and data mining is a hot topic. It has been extensively discussed in copyright and open-access discussions and has been mentioned in many recent policies in these areas. But is there a fundamental disconnect between what researchers want to do and what information providers think they need?
Part of the challenge comes down to defining text and data mining (TDM). At one extreme it’s a large-scale, deep search to generate specialised datasets. For example, Shreejoy Tripathy, a PhD candidate in the Neural Computation Center for the Neural Basis of Cognition at Carnegie Mellon University, USA, said of his research, ‘I use full-text literature text mining to extract information about the electrical properties of different neuron types in the brain. I then analyse the resulting dataset to better understand the electrical diversity of neurons throughout the brain. Because this data is useful to other researchers who can use it for purposes different from my intended use, I also provide the extracted information (but not the publications themselves) back to the field at www.neuroelectro.org.’
TDM can also be much simpler – an extension of search – as Cameron Neylon, director of advocacy at Public Library of Science (PLOS), explained: ‘Researchers want better awareness of the latest research. This is not really delivered by current search tools. For example, I look a lot at methods, and it’s very common for these to be left out of abstracts. If we could do TDM on papers it would be fairly trivial to build tools to search methods sections.’
According to many publishers, the amount of TDM going on is still small. Wim van der Stelt, executive vice president of corporate strategy at Springer, said that the company has not had very many requests so far, and that most of these have been from pharmaceutical research. ‘There is a company policy in the works but, so far, the amount has been so low we’ve handled them case by case,’ he said.
Nature Publishing Group (NPG) has a similar story: ‘In general, the number of requests has been pretty low. They are generally one-offs, so are dealt with on a case-by-case basis, although there is more interest now,’ said Jessica Rutt, rights and licensing manager at NPG. Alicia Wise, director of universal access for Elsevier, says that her company has a dedicated help-desk for people who want to do TDM. However, she said that the company has had fewer than 100 requests and ‘a lot don’t seem to actually do it once they are set up’. She remarked that it is ‘still early days, but some do TDM a lot. We want to support the full spectrum, from power-users to those who just do a bit.’
Canada-based researcher Heather Piwowar, however, believes that the lack of requests does not fully represent the demand for TDM in research. ‘How many help-desk requests is often used as a guide to how little interest there is, but it’s not a full picture,’ she explained. ‘As a grad student I knew I wanted to do TDM but I never asked. And a lot of people wish Google Scholar had an API. I think these people are actually wishing to do TDM, but don’t know it yet.’
In addition, Neylon suggested that some low-level TDM goes on below the radar. ‘Text and data miners at universities often have to hide their location to avoid auto cut-offs of traditional publishers. This makes them harder to track. It’s difficult to draw the line between what’s text mining and what’s for researchers’ own use, for example, putting large volumes of papers into Mendeley or Zotero,’ he explained ..."

Link:

http://www.researchinformation.info/features/feature.php?feature_id=429

From feeds:

Open Access Tracking Project (OATP) » abernard102@gmail.com

Tags:

oa.new oa.data oa.publishers oa.policies oa.mining oa.comment oa.standards oa.formats

Date tagged:

08/16/2013, 15:48

Date published:

08/16/2013, 11:48