The contrapositive of “Politics and the English Language.” One reason writing is hard:

Statistical Modeling, Causal Inference, and Social Science 2024-03-25

In his classic essay, “Politics and the English Language,” the political journalist George Orwell drew a connection between cloudy writing and cloudy content.

The basic idea was: if you don’t know what you’re saying, or if you’re trying to say something you don’t really want to say, then one strategy is to write unclearly. Conversely, consistently cloudy writing can be an indication that the writer ultimately doesn’t want to be understood.

In Orwell’s words:

[The English language] becomes ugly and inaccurate because our thoughts are foolish, but the slovenliness of our language makes it easier for us to have foolish thoughts.

He continues:

In our time, political speech and writing are largely the defence of the indefensible. Things like the continuance of British rule in India, the Russian purges and deportations, the dropping of the atom bombs on Japan, can indeed be defended, but only by arguments which are too brutal for most people to face, and which do not square with the professed aims of the political parties. Thus political language has to consist largely of euphemism, question-begging and sheer cloudy vagueness.

A few years ago I posted on this topic, drawing an analogy to cloudy writing in science. To be sure, much of the bad writing in science comes from researchers who have never learned to write clearly. Writing is hard!

But it’s not just that. A key problem with a lot of the bad science that we see featured in PNAS, Ted, NPR, Gladwell, Freakonomics, etc., is that the authors are trying to use statistical analysis and storytelling to do something they can’t do with their science, which is to draw near-certain conclusions from noisy data that can’t support strong conclusions. This leads to tortured constructions such as this from a medical journal:

The pair‐wise results (using paired‐samples t‐test as well as in the mixed model regression adjusted for age, gender and baseline BMI‐SDS) showed significant decrease in BMI‐SDS in the parents–child group both after 3 and 24 months, which indicate that this group of children improved their BMI status (were less overweight/obese) and that this intervention was indeed effective.

However, as we wrote in the results and the discussion, the between group differences in the change in BMI‐SDS were not significant, indicating that there was no difference in change in our outcome in either of the interventions. We discussed, in length, the lack of between‐group difference in the discussion section. We assume that the main reason for the non‐significant difference in the change in BMI‐SDS between the intervention groups (parents–child and parents only) as compared to the control group can be explained by the fact that the control group had also a marginal positive effect on BMI‐SDS . . .

Obv not as bad as political journalists in the 1930s defending Stalin’s purges or whatever; the point is that the author is in the awkward position of trying to use the ambiguities of language to say something while not quite saying it. Which leads to unclear and barely readable writing, not just by accident.

The writing and the statistics have to be cloudy, because if they were clear, the emptiness of the conclusions would be apparent.
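
To make the statistical issue in the quoted passage concrete, here is a minimal sketch in Python. The numbers are hypothetical, invented for illustration and not taken from the paper quoted above, and the calculation assumes two independent groups with approximately normal estimates. It shows how a treatment group can show a "significant" change while the comparison that actually matters, treatment versus control, does not.

```python
import math

# HYPOTHETICAL summary statistics, invented for illustration only.
# They are NOT taken from the study quoted above.
treat_change, treat_se = -0.20, 0.08      # mean change and standard error, treatment group
control_change, control_se = -0.10, 0.08  # mean change and standard error, control group

def z_score(estimate, std_error):
    """Ratio of an estimate to its standard error."""
    return estimate / std_error

# Within-group comparisons: each group's change compared to zero.
z_treat = z_score(treat_change, treat_se)
z_control = z_score(control_change, control_se)

# Between-group comparison: difference in changes.
# Assuming independent groups, the variances of the two estimates add.
diff = treat_change - control_change
diff_se = math.sqrt(treat_se**2 + control_se**2)
z_diff = z_score(diff, diff_se)

print(f"treatment change: z = {z_treat:.2f}")
print(f"control change:   z = {z_control:.2f}")
print(f"difference:       z = {z_diff:.2f}")
```

With these made-up inputs, the treatment group's change clears the conventional threshold of about 2 standard errors, while the difference between the groups is less than 1 standard error from zero. Stated that plainly, there is not much left to say about the intervention being "indeed effective," which is where the tortured language comes in.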

The problem

Orwell’s statement, when transposed to writing a technical paper, is that if you attempt to cover the gaps in your reasoning with words, this will typically yield bad writing. Indeed, if you’re covering the gaps in your reasoning with words, you’ll either have bad writing or dishonest writing, or both. In some important way, it’s a good thing that this sort of writing is so hard to follow; otherwise it could be really misleading.

Now let’s flip it around.

Often you will find yourself trying to write an article, and it will be very difficult to write it clearly. You’ll go around and around, and whatever you do, your written output will feel like the worst of both worlds: a jargon-filled mess, while at the same time being sloppy and imprecise. Try to make it more readable and it becomes even sloppier and harder to follow at a technical level; try to make it accurate and precise, and it reads like a complicated, uninterpretable set of directions.

You’re stuck. You’re in a bad place. And any direction you take makes the writing worse in some important way.

What’s going on?

It could be this: You’re trying to write something you don’t fully understand, you’re trying to bridge a gap between what you want to say and what is actually justified by your data and analysis . . . and the result is “Orwellian,” in the sense that you’re desperately using words to try to paper over this yawning chasm in your reasoning.

The solution

One way out of this trap is to follow what we could call Orwell’s Contrapositive.

It goes like this: Step back. Pause in whatever writing you’re doing. Pull out a new sheet of paper (or an empty document on the computer) and write, as directly as you can, in two columns. Column 1 is what you want to be able to say (the method is effective, the treatment saves lives, whatever); Column 2 is what is supported by your evidence (the method works better than a particular alternative in a particular setting, fewer people died in the treatment group than in the control group after adjusting for this and that, whatever).

At that point, do the work to pull Column 2 to Column 1, or make concessions to reality to shift Column 1 toward Column 2. Do what it takes to get them to line up.

Once they line up, you’ve left the bad zone in which you’re trying to say more than you can honestly say. And the writing should then go much more smoothly.

That’s the contrapositive: if bad writing is a sign of someone trying to say the indefensible, then you can make your writing better by not trying to say the indefensible, either by expanding what is legitimately defensible or restricting what you’re trying to say.

Remember the folk theorem of statistical computing: When you have computational problems, often there’s a problem with your model. Orwell’s Contrapositive is a sort of literary analogue of that.

One reason writing is hard

To put it another way: One reason writing is hard is that we use writing to cover the gaps in our reasoning. This is not always a bad thing! On the way to the destination of covering these gaps is the important step of revealing these gaps. We write to understand. Writing has an internal logic that can protect us from (some) errors and gaps—if we let it, by reacting to the warning sign that the writing is unclear.