Data Citation and Sharing: What’s in it for me? | Impact of Social Sciences

abernard102@gmail.com 2014-01-09

Summary:

"'I’m all for the free sharing of information, provided it’s them sharing their information with us.' – Archchancellor Ridcully, Unseen University, Ankh-Morpork (Unseen Academicals p. 166). Substitute the word 'data' (or 'code' or 'methodology' or 'workflows' or…) for 'information' in the above quote and you’ve got a sentiment that a lot of researchers share, though maybe not in quite such a blunt way. You’d find it very hard to argue that data sharing isn’t difficult, time consuming, expensive, and generally not part of scientific practice. Conversely, you’d find it even harder to argue that data shouldn’t be shared. Let’s get the reasons for sharing out of the way first. [1] Science is all about reproducibility – if someone else can’t reproduce your results, then your conclusions are invalid, and therefore the science doesn’t work. For a lot of scientific domains, reproducing results means using the original data collected, which means having access to it in the first place, which means sharing. [2] Data sharing cuts down on academic fraud. It’s hard work fabricating datasets (I know this from personal experience, having spent most of my PhD trying to simulate synthetic rain fields that looked anything like the real ones…), and having other people using your data means that they’re more likely to notice if something seems a bit wrong (which is also useful for error corrections). [3] Data sharing saves time and money. If a dataset already exists to test your hypothesis, why spend the effort and the money to collect an entirely new one? [4] Data sharing improves the transparency of the research process. If the data’s available to anyone who wants it, then you can’t be accused of hiding evidence about a controversial topic (like climate change). All good things, right? So, the question then becomes: why don’t researchers share their data as a matter of course? 
There are lots of reasons, which have been collated by lots of other researchers, ranging from fear of getting scooped (it happened to me!), to worry that others will find errors in the dataset or use it to misrepresent a key finding, to a simple and understandable desire to squeeze all possible research benefit out of the data before making it public. In a time of increasing competition and decreasing science budgets, hoarding data might make the difference between getting a grant and not, and therefore building a career as a scientist, or not. Still, more and more research funders (including the UK Research Councils and the USA’s National Science Foundation) are taking an interest in how the data produced by their funding is being managed, and have issued policies about this, including making statements about providing the resources needed for researchers to properly manage and share their data. Compliance, however, is patchy. Some academic fields (like my own, Earth Sciences) are very well supplied with long-running and well-established national and international data centres, which serve their communities by offering services in data curation and archiving, and metadata creation and management. Other researchers aren’t nearly as lucky, and face the situation of having to share their data via FTP sites or departmental webpages, if at all. Yet as well as being good for science, data sharing appears to be good for the scientist too. Piwowar et al. (2007) showed that, for a sample of 85 cancer microarray clinical trial publications, 48% of trials with publicly available microarray data received 85% of the aggregate citations. A similar study (Piwowar and Vision, 2013) showed that of 10,555 studies that created gene expression microarray data, studies that made data available in a public repository received 9% more citations than similar studies for which the data was not made available. 
It remains to be seen whether this holds true for other research fields, as not that many research domains have an established norm of citing data, though anecdote seems to suggest it might be ..."

Link:

http://blogs.lse.ac.uk/impactofsocialsciences/2014/01/07/data-citation-and-sharing-whats-in-it-for-me/

From feeds:

Open Access Tracking Project (OATP) » abernard102@gmail.com

Tags:

oa.new oa.comment oa.data oa.benefits oa.economic_impact oa.reproducibility oa.impact oa.citations oa.quality oa.credibility oa.open_science oa.funders oa.mandates oa.policies

Date tagged:

01/09/2014, 08:18

Date published:

01/09/2014, 03:18