Professor Apologizes For Using Fake AI-Generated Citations In Defense of Minnesota’s Unconstitutional Deepfake Law

Techdirt. 2024-12-02

Just weeks after a court swiftly struck down California’s unconstitutional “deepfake” law, a similar challenge is underway in Minnesota (with the same plaintiff and the same lawyers) — and the state’s defense is off to an inauspicious start, with one of its own expert witnesses submitting a declaration that cites non-existent research, “hallucinated” by the very same LLM tools the law seeks to demonize.

Minnesota’s law, unlike California’s, doesn’t even include any exceptions for satire or parody. It does require the impersonation to be accomplished through technological means, so it wouldn’t cover a purely human impersonation, but still. Indeed, Minnesota’s law is so broadly worded, and so short on detail, that it’s astounding anyone thought this was a good idea or even remotely constitutional.

The law appears to violate the First Amendment by restricting speech based on its content, without meeting the high “strict scrutiny” bar required for such content-based restrictions. It’s also likely unconstitutionally vague and overbroad.

In responding to the lawsuit, Minnesota hired Stanford Professor Jeff Hancock as an expert witness to defend the law. In particular, he was asked to explain how AI is influencing misinformation on social media, as part of the state’s likely effort to show that there’s a “compelling government interest” here (part of the strict scrutiny test).

Here’s where I note that I know and like Professor Hancock, and have appreciated his insights and research regarding social media for years. He has a long history of doing research that has helped debunk moral panics.

I was a bit surprised to see him agree to defend this law, which seems quite clearly a First Amendment violation.

But I was even more shocked a couple of weeks ago when Eugene Volokh noted that Hancock’s declaration appeared to include “hallucinated,” non-existent citations to two pieces of research. There is, of course, some irony in a declaration about misinformation and AI including… misinformation generated by AI.

I had emailed Hancock asking for a comment, which he promised was coming soon. Last Wednesday, the day before Thanksgiving, he filed a new declaration in support of amending the original declaration, with an explanation of what happened. His explanation is understandable, though I would argue not acceptable.

This wasn’t a case like the infamous lawyer who used ChatGPT and had no idea what he was doing. According to Hancock, the likely mistake had to do with his workflow, which combined direct writing in Word with the use of both Google Scholar and GPT-4o to augment his research and writing.

He claims that he wrote out some lists of things he wished to cover and wrote “[cite]” at one point. I’ve seen this sort of thing in many (often legal) draft documents, where people write out a notation that they can search for to go back later and add in citations for the claim they are making.

According to Hancock, he then likely asked GPT-4o to draft a paragraph based on the list, which included the “[cite]” notation, not realizing that the LLM would then make up a citation for that claim. And because he would normally just search for “[cite]” to fill in missing citations, he never caught the problem: the paragraph generated by the LLM had “erased” the notation and replaced it with a fake citation.
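
To make the failure mode concrete, here’s a minimal sketch (entirely my own illustration, not Hancock’s actual tooling) of why a “search for the placeholder” check can’t catch this: once the LLM has swallowed the “[cite]” marker and substituted a fabricated reference, the check comes back clean.

```python
# Hypothetical illustration of the verification gap described above.
# If an LLM rewrites a draft paragraph and replaces the "[cite]" placeholder
# with a fabricated reference, a simple search for the placeholder finds
# nothing left to fix -- even though the citation that replaced it is fake.

draft_before = "Deepfakes exploit cognitive biases toward believing video [cite]."
draft_after = (
    "Deepfakes exploit cognitive biases toward believing video "
    "(Doe & Roe, 2023)."  # made-up reference substituted by the LLM
)

def unresolved_placeholders(text: str) -> int:
    """Count '[cite]' placeholders still awaiting a real citation."""
    return text.count("[cite]")

print(unresolved_placeholders(draft_before))  # 1 -> author knows a cite is still needed
print(unresolved_placeholders(draft_after))   # 0 -> looks "done"; the hallucination slips through
```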

Still, the big mistake here was in asking the LLM to “draft a short paragraph.” I can’t see any good reason for the LLM to have been used in that way here:

The citation errors here occurred in the drafting phase, and as such, I explain my process in granular detail here. The drafting phase involved two parts – the substance and the citations. As to the substance, I began by outlining the main sections of the declaration in MS Word. I then outlined the key substantive points for each section, also in MS Word. I continued to engage Google Scholar and GPT-4o

The two citation errors, popularly referred to as “hallucinations,” likely occurred in my use of GPT-4o, which is web-based and widely used by academics and students as a research and drafting tool. “Hallucinated citations” are references to articles that do not exist. In the drafting phase I sometimes cut and pasted the bullet points I had written into MS Word (based on my research for the declaration from the prior search and analysis phases) into GPT-4o. I thereby created prompts for GPT-4o to assist with my drafting process. Specifically for these two paragraphs, I cannot remember exactly what I wrote but as I want to try to recall to the best of my abilities, I would have written something like this as a prompt for GPT-4o: (a) for paragraph 19: “draft a short paragraph based on the following points: -deepfake videos are more likely to be believed, -they draw on multiple senses, – public figures depicted as doing/saying things they did not would exploit cognitive biases to believe video [cite]”; and (b) for paragraph 21: “draft a short paragraph based on the following points: -new technology can create realistic reproductions of human appearance and behavior, -recent study shows that people have difficulty determining real or fake even after deepfake is revealed, -deepfakes are especially problematic on social media [cite].”

When I inserted the bullet points pertaining to paragraphs 19 and 21 into GPT-4o I also included the word “[cite]” as a placeholder to remind to myself to go back and add the academic citation. As I explained earlier, both of the now corrected cites were articles that I was very familiar with – one of which I wrote myself. I did not mean for GPT-4o to insert a citation, but in the cut and paste from MS Word to GPT-4o, GPT-4o must have interpreted my note to myself as a command. The response from GPT-4o, then, was to generate a citation, which is where I believe the hallucinated citations came from. This only happened in these two instances and nowhere else in my declaration.

When GPT-4o provided me these answers, I cut and pasted them from the online tool into my MS Word declaration. I then edited my declaration extensively as to its substance, and where I had notes to myself in both instances to add the citation, GPT-4o had put them in for me incorrectly and deleted the “[cite]” placeholder I had included to remind myself to go back and include the right citation. Without the “[cite]” placeholders, I overlooked the two hallucinated citations and did not remember to include the correct ones. This was the error on my part, and as I stated earlier, I am sorry for my oversight in both instances here and for the additional work it has taken to explain and correct this.

I find this explanation believable, in that it seems likely to be an accurate portrayal of what happened. However, I also find it completely unacceptable from someone submitting a declaration.

I’ve talked before about how I use LLMs at Techdirt, and it’s incredibly important to me that they not do any of the writing, for this exact reason. I let them review stuff, challenge my writing, and suggest improvements, but I carefully review each suggestion and then make any edits by hand to avoid exactly these kinds of situations.

And that’s just for a random blog post. For a legal filing like this, it seems a fairly egregious mistake to have used AI in this manner, especially when someone like Hancock knows the research so thoroughly. As he admits in the new declaration, he actually has real citations to match the original claims, one of which is to his own research:

The correct citation is to Hancock & Bailenson (2021) for paragraph 19, which is cited above in paragraph 17. I co-authored this article, and it lays out why the visual medium is so dominant in human perception and why communication research indicates that misleading audiovisual information may be more likely to be trusted than verbal messages (Hancock & Bailenson, 2021, p. 150).

But, of course, that is all the more reason why he should have quickly caught this error before submitting the declaration.

I feel bad for him, as I’m sure he feels awful about this. But that’s still little excuse for letting this kind of thing slip through, and also for supporting a law that seems pretty obviously unconstitutional.

If there’s any “silver lining” here, it’s that this incident demonstrates how even experts can be fooled by LLM-generated misinformation when they don’t exercise proper care and skepticism. However, that’s a concerning lesson about the risks of carelessness, rather than a point in favor of Minnesota’s misguided law.