The U.S. Copyright Office’s Draft Report on AI Training Errs on Fair Use

Deeplinks 2025-05-16

Summary:

Within the next decade, generative AI could join computers and electricity as one of the most transformational technologies in history, with all of the promise and peril that implies. Governments’ responses to GenAI—including new legal precedents—need to thoughtfully address real-world harms without destroying the public benefits GenAI can offer. Unfortunately, the U.S. Copyright Office’s rushed draft report on AI training misses the mark.

The Report Bungles Fair Use

Released amidst a set of controversial job terminations, the Copyright Office’s report covers a wide range of issues with varying degrees of nuance. But on the core legal question—whether using copyrighted works to train GenAI is a fair use—it stumbles badly. The report misapplies long-settled fair use principles and ultimately puts a thumb on the scale in favor of copyright owners at the expense of creativity and innovation.

To work effectively, today’s GenAI systems need to be trained on very large collections of human-created works—probably millions of them. At this scale, locating copyright holders and getting their permission is daunting for even the biggest and wealthiest AI companies, and impossible for smaller competitors. If training makes fair use of copyrighted works, however, then no permission is needed.

Right now, courts are considering dozens of lawsuits that raise the question of fair use for GenAI training. Federal District Judge Vince Chhabria is poised to rule on this question, after hearing oral arguments in Kadrey v. Meta Platforms. The Third Circuit Court of Appeals is expected to consider a similar fair use issue in Thomson Reuters v. Ross Intelligence. Courts are well-equipped to resolve this pivotal issue by applying existing law to specific uses and AI technologies.

Courts Should Reject the Copyright Office’s Fair Use Analysis

The report’s fair use discussion contains some fundamental errors that place a thumb on the scale in favor of rightsholders. Though the report is non-binding, it could influence courts, including in cases like Kadrey, where plaintiffs have already filed a copy of the report and urged the court to defer to its analysis.

However, courts need to accept the Copyright Office’s draft conclusions only if they are persuasive. They are not.

The Office’s fair use analysis is not one the courts should follow. It repeatedly conflates the use of works for training models—a necessary step in the process of building a GenAI model—with the use of the model to create substantially similar works. It also misapplies basic fair use principles and embraces a novel theory of market harm that has never been endorsed by any court.

The first problem is the Copyright Office’s transformative use analysis. Highly transformative uses—those that serve a different purpose than that of the original work—are very likely to be fair. Courts routinely hold that using copyrighted works to build new software and technology—including search engines, video games, and mobile apps—is a highly transformative use because it serves a new and distinct purpose. Here, the original works were created for various purposes and using them to train large language models is surely very different.

The report attempts to sidestep that conclusion by repeatedly ignoring the actual use in question (training) and focusing instead on how the model may be ultimately used. If the model is ultimately used primarily to create a class of works that are similar to the original works on which i

Link:

https://www.eff.org/deeplinks/2025/05/us-copyright-offices-draft-report-ai-training-errs-fair-use

Authors:

Tori Noble, Mitch Stoltz, Corynne McSherry

Date published:

05/16/2025, 00:53