The connection between the psychological concept of “generic language” and the problem of overgeneralization from research studies
Statistical Modeling, Causal Inference, and Social Science 2023-10-03
A couple years ago I suggested: A quick fix in science communication: Switch from the present to the past tense.
Here’s an example. A paper was published, “Māori and Pacific people in New Zealand have a higher risk of hospitalisation for COVID-19,” and I recommended they change “have” to “had” in that title. More generally, I wrote,
There’s a common pattern in science writing to use the present tense to imply that you’ve discovered a universal truth. For example, “Beautiful parents have more daughters” or “Women are more likely to wear red or pink at peak fertility.” OK, those particular papers had other problems, but my point here is that at best these represented findings about some point in time and some place in the past.
Using the past tense in the titles of scientific reports won’t solve all our problems or even most of our problems or even many of our problems, but maybe it will be a useful start, in reminding authors as well as readers of the scope of their findings.
Recently it was brought to my attention that research has been conducted on this topic.
The relevant paper is “Generic language in scientific communication,” by Jasmine DeJesus et al., published in 2019. They write:
Scientific communication poses a challenge: To clearly highlight key conclusions and implications while fully acknowledging the limitations of the evidence. Although these goals are in principle compatible, the goal of conveying complex and variable data may compete with reporting results in a digestible form . . . For example, generic language (e.g., “Introverts and extraverts require different learning environments”) may mislead by implying general, timeless conclusions while glossing over exceptions and variability. Using generic language is especially problematic if authors overgeneralize from small or unrepresentative samples . . . In an analysis of 1,149 psychology articles, 89% described results using generics . . . Online workers and undergraduate students judged findings expressed with generic language more important than findings expressed with nongeneric language.
It’s good to see this coming out in the psychology literature, given that just a few years ago a prominent psychology professor expressed annoyance when I raised concerns about representativeness in a published study.
Also relevant is our post from a few years ago, “Correlation does not even imply correlation,” which addressed the challenges of drawing general conclusions from nonrepresentative samples in the presence of selection bias.