Data: why openness and sharing are important | Discussions – F1000 Research
Connotea Imports 2013-03-15
Summary:
"Researchers are coming under increasing pressure to share the detailed results of their research, namely the datasets themselves, with other researchers. The pressure may come in the form of data management plans now being requested by many funders, or from requests for the data from an institutional data repository, or from journals that are increasingly encouraging data articles or even requesting that the data behind more traditional articles be made available on request.
At F1000Research, we think it is essential that the data supporting the results and, ultimately, the article conclusions be made publicly available with the article, thus enabling reproducibility and even reuse in some cases.
Our mandatory policy on data submission (obviously not including those datasets where data protection could be an issue) has been tested by a couple of recent submissions where the authors had understandable initial concerns about such open sharing of their detailed results. One such author, Vathsala Mohan from AgResearch Ltd in New Zealand, initially wrote to us saying that: 'providing the raw data is a little difficult as those data are very important and valuable and will form a basis for other papers from my research'.
I am sure that, to many researchers, this is a familiar scenario and concern. It initially made us wonder whether our policy was a viable approach. We explored other options, such as a limited-time embargo before publishing data. We also discussed the options with many on our Advisory Panel who, to our surprise, unanimously told us that we should be bold and stick to our original plans; and of course they are right. It really makes no sense that a reader has to take the authors’ word for it that they really did generate the data behind their graphs, or that they analysed the data correctly and without (deliberate or unintentional) bias.
One of the strongest arguments for publishing your data as early as possible is to establish priority. This means you can truly show, with a formal data citation, that you did the work before anyone else. Such an approach could certainly have prevented many a Nobel Laureate dispute!
Publishing was, in fact, what Mohan and colleagues ultimately chose to do. Given the volume of data behind their original research article, we decided that the work would be best represented as two articles: one focussing purely on the data and protocol information, and the other focussing on the analysis and conclusions. The articles are independently citable while still being tightly linked. One of the advantages of separate publication is that it gives authors the opportunity to provide proper credit on the data paper to those who generated the data, who may not always be the same individuals who conducted the analyses and wrote up the conclusions ..."