What is interesting writing and can LLMs create it?

Statistical Modeling, Causal Inference, and Social Science 2025-02-25

This is Jessica. Last week I wrote about how statistical methods can sometimes evoke “moral dismay” when you consider how they build in expectations of a simpler, less dynamic world than we would probably want to live in. But another kind of moral dismay that I have been thinking about lately is in reaction to questions about whether LLMs can produce writing that is truly creative or interesting.

For example, last week I was talking to Ari Holtzman, who thinks LLMs are capable of creating interesting writing, about an idea I had for a writing project, and he suggested using an LLM to generate it. And my response was, Oh, like have it paraphrase my writing? He said I was the third person he had encountered that day who seemed unable to believe that an LLM could produce something genuinely interesting. So I started thinking about why I would be skeptical, and whether it was just a superficial bias or one where I could come up with good reasons upon reflection. I use LLMs fairly often these days, so why do I begrudge them the ability to create interesting writing?

What it means for a piece of writing to be creative or interesting is not obvious, just like what intelligence should mean is not obvious, so we need to find some way of defining that. It’s also ambiguous what it means for an LLM to produce writing in the first place. Here I’m going to assume that an LLM producing writing involves a human providing the prompt and maybe doing some kind of curation or light editing, but nothing that goes too far beyond giving an outline of high level goals.

One definition of interesting is surprising, but this is not specific enough. I would expect that plenty of strings with zero probability under a large corpus of the kind we train LLMs on would not be interesting, so statistical unlikeliness alone is not sufficient. On the other hand, the infinite monkey theorem says that a monkey randomly hitting keys on a typewriter for infinite time will almost surely type the complete works of William Shakespeare, but that doesn’t help us define interesting writing.
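To make the point about statistical unlikeliness concrete, here is a minimal sketch that scores strings by their average surprisal (negative log probability, in bits per character). Everything in it is an illustrative assumption rather than how an LLM actually scores text: the "model" is just a character-frequency table with add-one smoothing, the tiny corpus is the opening of the Stein excerpt quoted later in this post, and the names fit_unigram and mean_surprisal are hypothetical. The keyboard mash comes out as far more surprising than the plain sentence, yet nobody would call it more interesting.

```python
import math
from collections import Counter

# Assumed toy alphabet: lowercase letters plus space.
ALPHABET = "abcdefghijklmnopqrstuvwxyz "


def fit_unigram(corpus: str) -> dict[str, float]:
    """Character frequencies with add-one smoothing (a toy stand-in for a language model)."""
    counts = Counter(ch for ch in corpus.lower() if ch in ALPHABET)
    total = sum(counts.values()) + len(ALPHABET)
    return {ch: (counts[ch] + 1) / total for ch in ALPHABET}


def mean_surprisal(text: str, model: dict[str, float]) -> float:
    """Average surprisal in bits per character; characters outside the alphabet are skipped."""
    chars = [ch for ch in text.lower() if ch in model]
    return sum(-math.log2(model[ch]) for ch in chars) / len(chars)


# Illustrative corpus only: the first lines of the Stein excerpt.
CORPUS = "it is not a range of a mountain of average of a range of a average mountain"

model = fit_unigram(CORPUS)
plain = mean_surprisal("a mountain is a range", model)
mash = mean_surprisal("zqxjkzqxjk", model)

print(f"plain sentence: {plain:.2f} bits per character")
print(f"keyboard mash:  {mash:.2f} bits per character")
```

A real LLM assigns probabilities to tokens in context rather than to isolated characters, but the same gap would show up there: the least likely strings are mostly noise, which is why surprise on its own cannot define interesting.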

Among strings that at least roughly follow grammatical conventions, how do we decide which ones are interesting? Is statistically unlikely within this set sufficient? I think we’d still be missing something like logic or common sense or causal mechanism that good writing usually conforms to – i.e., we want surprising but with some underlying logic that we recognize as being of this world. The plot or ideas or impressions conveyed by the writing have to logically follow in some way. Previously I have said I associate good poetry with defying expectations in a way that is confusing but opens up room for some unexpected sense of familiarity or recognition.

In great fiction or poetry this often feels like some kind of truth has been conveyed. Or at least some carefully crafted statement. For example, surprising in contemporary art often means the work functions as a kind of statement that is carefully crafted against some background set of assumptions, often to refer to the absence of something. Writers like Bukowski and Ginsberg were interesting in part because they refused to abide by some convention.

But do we always need meaning for writing to be interesting? It’s worth returning to this excerpt from Gertrude Stein I shared before:

It is not a range of a mountain. Of average of a range of a average mountain Nor can they of which of which of arrange To have been not which they which Can add a mountain to this. Upper an add it then maintain That if they were busy so to speak Add it to and It not only why they could not add ask Or when just when more each other There is no each other as they like They add why then emerge an add in It is of absolutely no importance how often they add it.

This is interesting writing to me, but Stein’s goal was purportedly simple: to put words together so as to sound like English. That seems like something an LLM could do too. But is that really all she was doing? I tried prompting Claude with this excerpt, asking it to create something in the same genre. Here are the first few stanzas, which make clear that it makes too much sense to evoke what Stein was evoking:

Not that it matters in the direction Of a circular happening a circular Happening which circles around itself And not around them or around it. When it happens it happens to happen In a way that is neither here nor circular Nor not circular but with edges that soften Into something that cannot be edged into.

Meh. I told it it was too coherent to approximate what Stein was doing and got this revision:

A table is a table is a table is not When they come they come they come around For which of them for which of them for which The table stands the table stands. Not as a they would have could have The having of a table the table having Neither once and twice and neither twice At once at as at table once.

Closer, but now it has a kind of one-trick-pony vibe, with all the repetition.

What about intention? Can writing be interesting without intention? If I gave you a piece of text and you thought it was written by a human, would you find it more interesting than the same piece of text described as coming from an LLM? Why exactly would knowing it is produced by an LLM imply that it will not be interesting? Is it because no person found it interesting enough to work through its full creation themself? Or could the presence of a human curator instill enough trust for a reader to not worry that they are wasting their time?

This makes me think of asking LLMs to generate writing that combines things in an interesting way. Does that count as the LLM producing interesting writing? E.g., I could ask it to work a discussion of Virginia Woolf’s goals in writing The Waves into my research paper on uncertainty quantification. Maybe this could be interesting, but would it be primarily because of the juxtaposition? If that came from the human prompter, it would seem they deserve the credit for making it interesting. If it didn’t, then the lack of intention might lead to a lack of trust for some readers.

This raises the question of what we expect about the source of interesting writing. I came across this article from the Nature archives (published in 1924), in which the author apparently interviewed some writers about where they felt the writing came from. The whole thing is fascinating, for example:

I have reported the results of the only registrations of verse ever made and analysed. They show that verse is a current of speech energy of varying amount so adjusted as to come in more or less regular waves … The next problem is that of the source of this current of energy … The replies of many living poets to a question-paper agree without exception with all the statements I have been able to find from the poets of the past: they have no ideas whatever on the subject. They all assert, moreover, that the verse-form comes to them ready-made along with the poetic content. This poetic content, they all declare, is forced upon them by some inner power. The source of verse, therefore, must be sought in the source of poetry. The poets all agree that the source of their poetry lies entirely outside of their consciousness. In psychological terms it is a product of the unconscious. In physiological terms it is a product of bodily processes that result in producing poetical feeling and ideas in consciousness. In either sense it is a biological phenomenon of which the ultimate explanation must be in biological terms.

Elsewhere the author attributes it to inner conflict between the “It” (unconscious) and the “Self” (which wishes to adapt to the environment) of the writer. Do we have a built-in bias against LLM-generated writing because it’s not biological, or doesn’t reflect some kind of existential struggle? This all seems kind of crazy, but as we wrote about in this article on generative AI and aesthetic judgment, we often bring quasi-religious expectations to what we perceive as art, where we are looking both for the work to reflect something about the conditions that created it and some eternal human essence. Art historian Donald Preziosi has written about the relationship between art and the Eucharist, which was exceptional in that it was both a sign and the thing itself. We sometimes expect a similar miracle from aesthetic objects. So maybe it’s not that farfetched that we are setting up impossible standards when we look for something interesting in LLM writing.