Releasing The Public Interest Corpus Principles and Goals – Authors Alliance
peter.suber's bookmarks 2025-12-04
Summary:
"Today, we are pleased to release The Public Interest Corpus Principles and Goals. This release builds on the recap of our final planning workshop and anticipates release of our final deliverable later this month....
The Public Interest Corpus works with a growing coalition of stakeholders to develop a service that advances the library community’s ability to support the responsible use of their collections for AI research and development and computational research more generally. The initial focus of the service is on a corpus development, discovery, and access solution for books data (digitized and/or born digital text with metadata) at scale....
The Public Interest Corpus is inspired by open corpus development efforts from organizations like Wikipedia and PLEIAS. The Public Interest Corpus is also encouraged by efforts like the Institutional Data Initiative and European Books Data Commons. The Public Interest Corpus builds on these efforts by working to provide access to in-copyright as well as public domain books data on terms that are legal and ethical...."