Funding Strategies for Data-Intensive Science
peter.suber's bookmarks 2025-04-25
Summary:
"In the early 1990s, NASA was ready to fund a project that would generate 15 TB of images of the night sky. This was a massive amount of data for the time. It was the beginning of the first era of big data in science, where scientists and funders could invest in big projects to collect large data sets for many people to use.
But actually using that data required more than just the telescopes that NASA could fund. The data needed to be stored, curated, retrieved, and analyzed. The Alfred P. Sloan Foundation, a private philanthropy, not only contributed to the telescopes but also made successive grants for the open data release, community, infrastructure, and management that were needed to turn the massive dataset of images into something transformative for astronomers.
This kind of data-intensive science (DIS) advanced not just astronomy, but also computer science and core software tools for scientific database management and data analysis. Microsoft computer scientist Jim Gray, who led the database software work of the Sloan Digital Sky Survey, was a strong advocate for it. Gray recognized that DIS is a fundamental evolution of the scientific method: scientific innovation, as a process, is radically altered by massive-scale compute and large, complex datasets...."