PLOS Biology: Lost Branches on the Tree of Life
Use the link to access the full text article from PLoS Biology. The introduction reads as follows: "Given that reproducibility is a pillar of scientific research, the preservation of scientific knowledge (underlying data) is of paramount importance. The standard of reproducibility can be evaluated based on criteria of methodological rigor and legitimacy, which is sometimes used to distinguish 'hard' from 'soft' sciences. In phylogenetics, a discipline that routinely uses DNA sequences to build trees reflecting organismal relationships, the scale of data collection and the complexity of analytical software have both increased dramatically during the past decade. Consequently, the ability to navigate publications and reproduce analyses is more challenging than ever. When DNA sequencing was initially employed in systematics during the late 1980s, there was some reluctance to deposit nucleotide sequences in open repositories such as GenBank. This ultimately changed when high-impact journals (e.g., Proceedings of the National Academy of Sciences, Nature, Science) began requiring GenBank submission as a prerequisite for publication; now virtually every evolutionary biology journal observes this requirement. Until recently, uploading sequences to GenBank (or EMBL) was generally considered sufficient to ensure reproducibility of phylogenetic studies using DNA sequence data. Increasingly, however, the systematics community is realizing that archiving raw DNA sequences is not adequate, and that the underlying alignments of DNA sequences as well as the resulting phylogenetic trees are pivotal for reproducibility, comparative purposes, meta-analyses, and ultimately synthesis. Indeed, there has been a growing clamor for journals to adopt and enforce more rigorous data archiving practices across diverse disciplines.
As a result, about 35 evolutionary journals have adopted policies to encourage or require authors to upload alignments, phylogenetic trees, and other files requisite for study reproducibility to TreeBASE (http://treebase.org/) and/or other public repositories such as Dryad (http://datadryad.org). Unfortunately, enforcement of such data deposition policies is generally lax, and most journals in systematics and evolution still do not require DNA sequence alignment or tree deposition. As a result, the alignments and trees underlying most published papers in systematics/phylogenetics and evolutionary biology remain inaccessible to the scientific community at large."