Large and linked in scientific publishing: the launch of big data journal GigaScience (2012-07-13)


“BGI, the world's largest genomics institute, and BioMed Central, a leader in scientific data sharing, aim to revolutionize science publishing with the launch of GigaScience, a new open access, open data journal with a scope that embraces all life science research that generates 'big data'. This launch is a major first step towards the open access publication of complete, reproducible accounts of all parts of data-intensive scientific research projects.

Together, GigaScience and its integrated database GigaDB provide scientific analyses, full dataset hosting, and access to the software tools used to conduct these analyses, along with publication of more traditional scientific articles describing the studies. Having all of these together finally allows readers not only to glean the scientific conclusions in the papers, but also to test them directly using the underlying data and analysis tools. In this way, GigaScience offers a way to help overcome the growing problem of the lack of reproducibility of research.

GigaScience publications also include Digital Object Identifiers (DOIs) for all datasets in the journal database, GigaDB. This helps make datasets more permanent, as well as fully trackable, discoverable, linkable, and citable, which traditionally has only been possible for journal articles. Citation enables scientists who generate these enormous datasets and share them with the community to gain more appropriate credit for their contributions to research.

Laurie Goodman, Editor-in-Chief, says, ‘The full use of large-scale data has sadly lagged far behind our ability to produce it. The leaders of BGI realized they had the ability, given their vast computational resources, to create an innovative new journal format: one where enormous datasets could be fully hosted and directly linked to their original scientific studies.
By including analysis tools in a data platform, as well as the planned addition of cloud technology later this year, GigaScience can serve as a means to put such data into the hands of researchers who do not have the vast computational resources required for optimal data use...’

Exemplifying GigaScience and GigaDB's innovative approach to publishing in the launch edition is a research article from Stephan Beck's group at University College London, UK. This article focuses on ways to conduct whole-genome analyses of DNA methylation, an important mechanism that regulates gene expression. The article contains all of the supporting data and software tools needed to recreate the experiments, a total of 84 GB, freely available for download and reuse from GigaDB. Using BGI's data storage capacity, GigaScience is able to host these and other files, which are far larger than any other journal is able to publish.

GigaDB furthermore supports open data by giving up all copyright in published datasets through its use of the Creative Commons CC0 public domain dedication waiver. This allows anyone to access and reuse published data without restrictions. This is part of a forward-thinking, technology-driven approach to science publishing.

As Publisher Iain Hrynaszkiewicz says, ‘We traditionally have only had access to limited amounts of scientific knowledge... which means we do not reap the full benefits of research. Through GigaScience's open access, open data journal and database we are entering a new era of publishing ... BioMed Central are delighted to be leading this revolution in open data and science communication with GigaScience and BGI, which we hope will ultimately help make scientific research faster and more reliable.’

As well as this innovative, big-data-driven publication format, the journal also provides reviews and commentaries that address the many hurdles that still need to be surmounted to improve future big-data handling.
BioMed Central and the GigaScience editors will be marking the journal's launch at the ISMB conference, 15-17 July 2012.”



