Data Commons Can Save Open AI - The New Stack
peter.suber's bookmarks 2025-06-26
Summary:
"In recent months, a steady development stream of new models, versions and derivatives has become the norm. It’s not an exaggeration to say that there’s a boom in open source AI models.
Unfortunately, against this backdrop, we see a problem in the foreground: the data landscape has stalled. There has been much less progress in public or open training datasets, even if everyone agrees that data is the key resource needed to build better AI systems...."