talking text mining with Elsevier

abernard102@gmail.com 2012-08-20

Summary:

“I had a phone call on Friday with my university librarian and six (!) Elsevier employees. We discussed Elsevier’s text mining policies and whether my needs for text mining access could be better facilitated... This meeting is thanks to the wonders of twitter and participation+proactive engagement there by Alicia Wise (aka @wisealic)... I was participating in a twitter conversation about the PubMed Open Access Subset, and a) observed how few Elsevier articles are in it, and b) suggested that Elsevier make its back issues available for text mining for the progress of science... [The tweet from Alicia Wise] ‘hi there - I am rather perplexed by this comment as all #Elsevier content - incl subscription content - can be text mined...’ I responded with surprise because that wasn’t my understanding. We traded a few tweets about Terms of Use and how it is currently unclear from the Elsevier website which Terms apply to the reuse of article content, and she generously volunteered to follow up with me (and separately, Alf Eaton)... True to her word, Alicia got back to me promptly and facilitated a phone call that included: Alicia Wise – Director, Universal Access; David Tempest – Deputy Director, Universal Access; Chris Shillum – Vice President Product Management, Platform and Content; Allan Lu – Director, Product Management, ScienceDirect; Ale de Vries – Director, Platform Integration; Kortney Boak – Account Manager, Canada; Aleteia Greenwood – Head Librarian Science & Engineering, UBC Library; Heather Piwowar – Department of Zoology, UBC... Before the call I sent the participants a summary of my text mining projects because Alicia had indicated that Elsevier facilitates text mining on a project-by-project basis... Here’s the email I sent (overviews of these projects deleted below but included in link)...
[From the email] ‘My hope is threefold: [1] to inform our decisions on ways I may text-mine Elsevier-controlled content [2] to provide additional case studies for you to understand all the ways researchers may want to use the literature [3] to highlight for you the frustration that many scholars feel about accessing and USING the scientific literature to advance science. I’m very happy to be having these conversations, but also very aware I’m only having them now because I was lucky on twitter. Many other scholars would also like to have them but don’t know how... My research area is studying patterns in research data sharing and use. Project 1: Tracking datasets from public repositories into the published literature... I’d like to programmatically query Elsevier full text for 1000 accession number strings. For each query string I’d like to export the search result information (DOIs or IDs), analyze it, and make it available as open supplementary information... Project 2: Classifying citations to identify those made in the context of dataset reuse... I’d like programmatic access to the full text of Elsevier papers that I know to have cited my dataset cohort, so that I can automatically extract relevant citation context. I’d like to make this information publicly available to citizen scientists and run text analysis algorithms on it... Project 3: Providing evidence of data use to data creators... I’d like ongoing programmatic access to the full text of Elsevier papers to query for Research Object identifiers, so that we can display links to the search results in total-impact, aggregate them in reports, and release them openly...’ We had a respectful and productive conversation. I recapped my projects, Elsevier told me about their standard text mining contract clause, and we discussed next steps...
We decided that: [1] I could get text mining access for the purpose of my first project immediately, through Elsevier’s APIs [2] others on the call would work toward text mining access for UBC as a whole soon, and sooner than the next contract renewal (2014 or 2015). No money was discussed, leading me to assume that there would be no charge. [3] two of my text mining use cases require reuse rights that are outside the standard Elsevier agreement. We will continue working together to see what we can do. Alicia mentioned the citizen science project as a particularly interesting use case... Follow-up [1] Ale de Vries sent me an email on the weekend with API keys, and followed up on Monday with helpful tips on how to use them for my specific use cases. Very helpful. [2] I asked for the text of the standard reuse agreement. It was sent to me but I was asked not to share it publicly because ‘it is a legal element’. [3] David Tempest is now taking the lead in place of Alicia Wise in moving forward with the partnership with UBC [4] David will be meeting with the Elsevier lawyer, Jan Bij de Weg, on Wednesday morning to check into licensing questions [5] someone (I’m not sure who, I need to check) will take the next step on adding text mining agreements into UBC’s ScienceDirect contract (UBC does not sign its own SD license; it is signed by the national consortium, CRKN). [6] I sent more details on my two use cases that are not clearly within

Link:

http://researchremix.wordpress.com/2012/03/05/talking-text-mining-with-elsevier/

Updated:

08/16/2012, 06:08

From feeds:

Open Access Tracking Project (OATP) » abernard102@gmail.com

Tags:

oa.new oa.data oa.business_models oa.publishers oa.licensing oa.mining oa.comment oa.repositories oa.advocacy oa.elsevier oa.copyright oa.libraries oa.cc oa.google oa.crowd oa.uk oa.impact oa.librarians oa.reports oa.citations oa.apis oa.hargreaves oa.libre

Authors:

abernard

Date tagged:

08/20/2012, 14:40

Date published:

03/06/2012, 15:07