Protecting AI Usage and Training Rights: Three Ideas for Congress

ARL Policy Notes 2024-04-19

Last Updated on April 19, 2024, 12:40 pm ET

Image: graphic of a human brain made of computer circuitry (image by Steve Johnson on Unsplash)

This week I had the opportunity to participate as a panelist in the Public Knowledge (PK) briefing for congressional staffers, called “The Big Picture on AI & IP: Research, Openness, and Competition Concerns.” The briefing focused on why fair use, noncommercial academic and research purposes, and competition concerns are essential considerations when thinking about AI training. In conversation with PK’s policy counsel Nick Garcia and fellow panelists Rachael Samberg (University of California (UC), Berkeley) and Kit Walsh (Electronic Frontier Foundation), the following three ideas emerged for how Congress can protect fair use for training AI.

Take a cautious approach to legislative proposals to regulate AI, and evaluate whether they would restrict research, scholarship, and education.

Congress does not need to enact new legislation for researchers to continue to train AI. Under current US copyright law, researchers can rely on fair use and other rights that Congress granted through exceptions and limitations in the US Copyright Act. The Library Copyright Alliance (LCA) principles for copyright and AI hold that the US Copyright Act, as applied and interpreted by the Copyright Office and the courts, is fully capable of addressing issues at the intersection of copyright and AI without amendment, and that under current law researchers can rely on fair use to train AI.

We ask Congress to be cautious of legislative proposals that would narrow settled rights that are essential for research and education. For instance, Samberg noted that Congress and the Copyright Office understand the importance of facilitating scholarly access to and usage of scholarly publications, databases, data, literary works, and more, as they implement the fair use exception without any statutory or regulatory exclusions or opt-outs. But legislation that provided opt-outs or imposed a licensing scheme in the context of AI would undermine the purpose of fair use, harm scholarship and criticism, and curtail scientific advances, defeating the very purpose of the Copyright Act.

Consider whether legislation might be necessary to clarify that fair use and other congressionally granted rights cannot be eroded by license agreement.

During the briefing, Samberg described how UC Berkeley negotiates clauses in its database license agreements that preserve fair use. But some vendors of digital scholarly and cultural heritage works have tried to insert clauses that restrict AI usage and training rights and take away scholarly activities that would otherwise be allowed by fair use, such as text and data mining. As Samberg, Walsh, and I illustrated, this issue, sometimes called contractual override, can present a big problem for researchers.

Congress may consider whether legislation could protect the rights that it granted to libraries and the scholarly functions that we support. For instance, the 2002 Digital Choice and Freedom Act, introduced by Rep. Zoe Lofgren (D-CA), would have created a new section of the US Copyright Act asserting that license terms that restrict or limit any of the congressionally granted rights are not enforceable under any state statute. Congress could hold hearings with witnesses to debate whether a legislative solution is necessary to protect fair use for AI training in the face of contracts that would prohibit the use of licensed works with artificial intelligence tools, and how that legislation might be structured.

Encourage the US Copyright Office to hold a roundtable or study to collect updated evidence on how licenses restrict scholarly use of AI.

In 2001, libraries raised similar contract preemption and licensing issues, pointing out that the rights Congress granted in the US Copyright Act are undermined by licenses or contracts for digital information. The Copyright Office captured these concerns in its 2001 DMCA Section 104 Report, which studied the effects of electronic commerce on sections of the US Copyright Act. In the report, the Copyright Office wrote that there was “no convincing evidence of present-day problems.” In 2016, the US Department of Commerce Internet Policy Task Force issued the White Paper on Remixes, First Sale, and Statutory Damages, with the following conclusion: “If over time it becomes apparent that libraries have been unable to appropriately serve their patrons due to overly restrictive terms imposed by publishers, further action may be advisable (such as convening library and publisher stakeholders to develop voluntary best practices, or amending the Copyright Act).”

A contemporary Copyright Office study would update the collective understanding of the US government, libraries, publishers, and the public about the current market-based barriers that libraries face to accessing and preserving works. After all, these are not problems with copyright law, but with the marketplace for works in digital formats. The 2001 DMCA report suggested that congressional intervention might be necessary if the marketplace does not respond to library concerns, which of course it has not. An updated Copyright Office study could also consider whether legislative or regulatory steps are warranted to address contract terms that conflict with US copyright law by prohibiting scholarly activities like the use and training of AI on licensed works.

The post Protecting AI Usage and Training Rights: Three Ideas for Congress appeared first on Association of Research Libraries.