This one might possibly be interesting.

Statistical Modeling, Causal Inference, and Social Science 2024-10-26

Bert Gunter points to this news article by Jeffrey Brainard that reports:

Careful scientists know to acknowledge uncertainty in the findings and conclusions of their papers. But in one leading journal, the frequency of hedging words such as “might” and “probably” has fallen by about 40% over the past 2 decades, a study finds. . . .

The new analysis, one of the largest of its kind, examined more than 2600 research articles published from 1997 to 2021 in Science, which the team chose because it publishes articles from multiple disciplines. . . . The team searched the papers for about 50 terms such as “could,” “appear to,” “approximately,” and “seem.” The frequency of these hedging words dropped from 115.8 instances per 10,000 words in 1997 to 67.42 per 10,000 words in 2021. . . .

Although journal editors and reviewers should look out for exaggerated claims, they shouldn’t bear all the responsibility, Wheeler cautions. “It’s also up to universities and research institutes to value the quality over quantity of researchers’ outputs,” she says, “which would allow more time for academics to reflect and produce meaningful work instead of churning out as many publishable papers as possible.”

Wei adds a hedge of her own, however: The new study doesn’t show what caused the observed decline of hedging language. The pressure to publish that academics face to gain tenure, promotion, and professional recognition may play a role, but there could be other factors as well. The nature of the connection, she says, deserves further study.

I was struck by the inclusion of the word “probably” in that list, because, to me as a statistician, “probably” has a technical meaning and should not be used as a vague qualifier! When I’m writing something with coauthors and see terms such as “probably” or “most of the time,” I’ll swap in a fuzzier term such as “typically” or “we would not be surprised to see” or something like that. This is similar to how we will use terms such as “haphazard” rather than “random” when referring to non-random sampling.
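
To make the quoted measurement concrete, here is a rough sketch of how a rate of "instances per 10,000 words" might be computed, along with the arithmetic behind the reported drop. This is not the study's actual procedure, which the article does not describe in detail. The term list below is only the handful of words named in the article (the study used about 50), the function names `hedge_rate` and `relative_drop` are made up for illustration, and the tokenization and exact whole-word matching are my assumptions.

```python
import re

# Illustrative subset only: the study searched for about 50 terms,
# and these are just the ones named in the news article.
HEDGES = ["might", "probably", "could", "appear to", "approximately", "seem"]


def hedge_rate(text, terms=HEDGES, per=10_000):
    """Hedging-term occurrences per `per` words (hypothetical helper).

    Assumes a crude tokenization into lowercase alphabetic words and
    exact whole-word matching, so inflected forms such as "seems" are
    not counted.
    """
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    # Rejoin with single spaces so multiword terms like "appear to" can match.
    normalized = " ".join(words)
    hits = sum(
        len(re.findall(r"\b" + re.escape(term) + r"\b", normalized))
        for term in terms
    )
    return per * hits / len(words)


def relative_drop(before, after):
    """Fractional decline from `before` to `after`."""
    return (before - after) / before


# The two rates reported in the article, for 1997 and 2021.
print(round(relative_drop(115.8, 67.42), 3))  # 0.418
```

The reported rates work out to a decline of roughly 42%, consistent with the article's "about 40%." Note that a count like this treats "probably" as just another hedge, which is exactly the conflation I'm grumbling about above.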