DataUp—Data Curation for the Long Tail of Science - Microsoft Research Connections Blog - MSDN Blogs

abernard102@gmail.com 2012-10-05

Summary:

"The long tail: sure, it’s a well-known concept in business and marketing, but there’s a very important “hidden” long tail in the sciences, too. So, what is this hidden long tail of science? It consists of the millions of datasets that are not stored in a databank and therefore are not available for use by other scientists. Every day, researchers throughout the world are observing, calculating, and compiling data, recording it all on their local machines within their labs—often not even as a shared resource to their institutions. Regrettably, much of this data never gets deposited in larger web-accessible data repositories where it could be reused by other investigators around the globe... Enter DataUp, an open-source tool that helps us document, manage, and archive our tabular data. The DataUp project was born out of this need for seamless integration of data management into the researchers’ current workflows. The University of California Curation Center (UC3) at the California Digital Library (CDL), with sponsorship from Microsoft Research and the Gordon and Betty Moore Foundation (GBMF), focused on creating a tool that could be used by researchers in the environmental sciences. They recognized that this field epitomizes the problems of data management and curation; in particular, the storage of data locally without data description (metadata)—such as where it was collected, by whom, and when—that would make it more usable by others. By conducting surveys at ecological and environmental science events, CDL found that the majority of these scientists use spreadsheets to collect and organize their data, so rather than make them learn a new program, UC3 recognized a need for a tool that works with a program most scientists already know: Microsoft Excel. We decided that there needed to be two versions of the tool: an open-source add-in (extension) for Microsoft Excel, and an open-source web application.
To achieve the project goals of facilitating data management, sharing, and archiving, both the add-in and the web application accomplish four main tasks: [1] Perform a best-practices check to ensure good data organization; [2] Guide users through creation of metadata for their Excel file; [3] Help users obtain a unique identifier for their dataset; [4] Connect users to a major repository, where their data can be deposited and shared with others..."
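
To make task [1] concrete, here is a minimal sketch of what a best-practices check on tabular data might look like. It is not DataUp's code, and the summary does not say which rules DataUp applies: the three checks below (unnamed columns, duplicate column names, blank or ragged rows) are assumptions picked for illustration, the function name check_table is hypothetical, and the input is a CSV export rather than an Excel workbook.

```python
import csv
import io


def check_table(csv_text):
    """Return a list of warnings about the layout of a CSV table.

    Hypothetical helper: the rules here are illustrative assumptions,
    not the checks that DataUp itself performs.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["table is empty"]

    warnings = []
    header = [name.strip() for name in rows[0]]

    # Every column should carry a name.
    for position, name in enumerate(header, start=1):
        if not name:
            warnings.append(f"column {position} has no header")

    # Column names should be unique.
    seen = set()
    for name in header:
        if name and name in seen:
            warnings.append(f"duplicate header: {name}")
        seen.add(name)

    # Data rows should be non-blank and as wide as the header.
    for number, row in enumerate(rows[1:], start=2):
        if not any(cell.strip() for cell in row):
            warnings.append(f"row {number} is blank")
        elif len(row) != len(header):
            warnings.append(
                f"row {number} has {len(row)} cells, expected {len(header)}"
            )
    return warnings


sample = "site,date,site\nA,2012-10-02,1\n\nB,2012-10-03\n"
for warning in check_table(sample):
    print(warning)
```

Returning a list of warnings instead of raising on the first problem lets a researcher see every issue in one pass, which seems closer in spirit to a guided check than a hard validation failure would be.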

Link:

http://blogs.msdn.com/b/msr_er/archive/2012/10/02/dataup-data-curation-for-the-long-tail-of-science.aspx

From feeds:

Open Access Tracking Project (OATP) » abernard102@gmail.com

Tags:

oa.new oa.data oa.comment oa.best_practices oa.metadata oa.preservation oa.funders oa.floss oa.repositories.data oa.curation oa.uc3 oa.microsoft oa.dataone oa.dataup oa.moore_foundation oa.rdm oa.repositories oa.uc.cdl

Date tagged:

10/05/2012, 13:55

Date published:

10/05/2012, 09:55