Copyright and AI: the Cases and the Consequences
Deeplinks 2025-02-19
Summary:
The launch of ChatGPT and other deep learning tools quickly led to a flurry of lawsuits against model developers. Legal theories vary, but most are rooted in copyright: plaintiffs argue that use of their works to train the models was infringement; developers counter that their training is fair use. Meanwhile, developers are making as many licensing deals as possible to stave off future litigation, and it’s a sound bet that the existing litigation is an elaborate scramble for leverage in settlement negotiations.
These cases can end one of three ways: rightsholders win, everybody settles, or developers win. As we’ve noted before, we think the developers have the better argument. But that’s not the only reason they should win these cases: while creators have a legitimate gripe, expanding copyright won’t protect jobs from automation. A win for rightsholders or even a settlement could also lead to significant harm, especially if it undermines fair use protections for research uses or artistic protections for creators. In this post and a follow-up, we’ll explain why.
State of Play
First, we need some context, so here’s the state of play:
DMCA Claims
Multiple courts have dismissed claims under Section 1202(b) of the Digital Millennium Copyright Act, stemming from allegations that developers removed or altered attribution information during the training process. In Raw Story Media v. OpenAI, Inc., the Southern District of New York dismissed these claims because the plaintiffs had not “plausibly alleged” that training ChatGPT on their works had actually harmed them, and there was no “substantial risk” that ChatGPT would output their news articles. Because ChatGPT was trained on a “massive amount of information from innumerable sources on almost any given subject…the likelihood that ChatGPT would output plagiarized content from one of Plaintiffs’ articles seems remote.” Courts granted motions to dismiss similar DMCA claims in Andersen v. Stability AI Ltd., The Intercept Media, Inc. v. OpenAI, Inc., Kadrey v. Meta Platforms, Inc., and Tremblay v. OpenAI.
Another such case, Doe v. GitHub, Inc., will soon be argued in the Ninth Circuit.
Copyright Infringement Claims
Rightsholders also assert ordinary copyright infringement, and the initial holdings are a mixed bag. In Kadrey v. Meta Platforms, Inc., for example, the court dismissed “nonsensical” claims that Meta’s LLaMA models are themselves infringing derivative works. In Andersen v. Stability AI Ltd., however, the court held that copyright claims based on the assumption that the plaintiffs’ works were included in a training data set could go forward, where the use of plaintiffs’ names as prompts generated images that were “similar to plaintiffs’ artistic works.” The court also held that the plaintiffs plausibly alleged that the model was designed to “promote infringement” for similar reasons.
It’s early in the case—the court was merely deciding whether the plaintiffs had alleged enough to justify further proceedings—but it’s a dangerous precedent. Crucially, copyright protection extends only to the actual expression of the author—the underlying facts and ideas in a creative work are not themselves protected. That means that, while a model cannot output an identical or near-identical copy of a training image without running afoul of copyright, it is not infringement for a model to generate images that merely resemble an artist’s work in style, subject matter, or ideas, because those elements are not protected expression.
Link:
https://www.eff.org/deeplinks/2025/02/copyright-and-ai-cases-and-consequences