A toolkit for data transparency takes shape
ab1630's bookmarks 2018-08-22
Summary:
"...users often find themselves struggling to replicate other scientists’ research — and even their own. As part of his postdoc, Casey Greene, a bioinformatician at the University of Pennsylvania in Philadelphia, ran a computational analysis of gene-expression data. He documented every aspect of his work, except the specific version of a key database. Four years later, those findings were irreproducible, at least in the fine details. “That was troubling,” he says....
Reproducibility advocates are converging around a tool set to minimize these problems. The list includes version control, scripting, computational notebooks and containerization — tools that allow researchers to document their data, the steps they follow to manipulate it, and the computing environment in which they work (see ‘Getting reproducible’)...."