Anthropic destroyed millions of print books to build its AI models

Ars Technica 2025-06-25

On Monday, court documents revealed that AI company Anthropic spent millions of dollars physically scanning print books to build Claude, an AI assistant similar to ChatGPT. In the process, the company cut millions of print books from their bindings, scanned them into digital files, and threw away the originals solely for the purpose of training AI. The details are buried in a copyright ruling whose broader fair use implications we reported yesterday.

The 32-page legal decision tells the story of how, in February 2024, the company hired Tom Turvey, the former head of partnerships for the Google Books book-scanning project, and tasked him with obtaining "all the books in the world." The strategic hire appears to have been designed to replicate Google's legally successful book digitization approach—the same scanning operation that survived copyright challenges and established key fair use precedents.

While destructive scanning is a common practice among smaller-scale operations, Anthropic's approach was unusual for its massive scale. For Anthropic, the faster speed and lower cost of the destructive process appear to have trumped any need to preserve the physical books themselves.
