Sharing Data from Large-scale Biological Research Projects: A System of Tripartite Responsibility

"The Wellcome Trust sponsored a meeting on 14–15 January 2003 to discuss how, at this point in the development of the field of genomics, pre-publication data release can promote the best interests of science and help to maximize the public benefit to be gained from research. About 40 people attended the meeting, among them large-scale sequence producers, sequence users including computational biologists, representatives of the major nucleotide sequence databases, journal editors, and scientists interested in other large-scale data sets. The discussion took as a given that published data are available in their entirety for any use by any investigator, and focused on issues involved in making data broadly available prior to publication. The meeting concluded that pre-publication release of sequence data by the International Human Genome Sequencing Consortium, and other sequence producers, has been of tremendous benefit to the scientific research community in general. While not all were in a position to make commitments for their funding agencies, the meeting attendees were in broad agreement that, to encourage the continuation of such benefits, the sequence producers, sequence users and the funding agencies recognize and implement a system based on ‘tripartite responsibility’. Specifically, • The meeting attendees enthusiastically reaffirmed the 1996 Bermuda Principles, which expressly called for rapid release to the public international DNA sequence databases (GenBank, EMBL, and DDBJ) of sequence assemblies of 2kb or greater by large-scale sequencing efforts and recommended that that agreement be extended to apply to all sequence data, including both the raw traces submitted to the Trace Repositories at NCBI and Ensembl and whole genome shotgun assemblies. 
• The attendees recommended that the principle of rapid pre-publication release should apply to other types of data from other large-scale production centers specifically established as ‘community resource projects’.
• The attendees recognized that pre-publication data release might conflict with a fundamental scientific incentive – publishing the first analysis of one's own data. The attendees noted that it would not be possible to absolutely guarantee this incentive without applying restrictions that would undermine the rationale for rapid, unrestricted release of data from community resources. Nonetheless, it is essential that excellent scientists continue to be attracted to these projects. To encourage this, the scientific community should understand that pre-publication data release needs active communitywide support if it is to continue to receive widespread support from the producers. The contributions and interests of the large-scale data producers should be recognized and respected by the users of the data, and the ability of the production centres to analyse and publish their own data should be supported by their funding agencies...."



