“On or about December 2010 the behavioral sciences changed” and Eggers on “The case against Hypothesis 1”

Statistical Modeling, Causal Inference, and Social Science 2025-03-02

“On or about December 1910 human character changed.” — Virginia Woolf (1924).

Woolf’s quote about modernism in the arts rings true, in part because we continue to see relatively sudden changes in intellectual life, arising not merely from technology (email and texting replacing letters and phone calls, streaming replacing record sales, etc.) and shifts in power relations (for example, from the decline of labor unions and the end of communism) but also from ways of thinking that are not exactly new but that take root in a way they had not before. Around 1910, the literary and artistic world seemed ready for Ezra Pound, Pablo Picasso, Igor Stravinsky, Gertrude Stein, and the like to shatter old ways of thinking, and (in a much lesser way) the behavioral sciences were upended just about exactly 100 years later by what is now known as the “replication crisis.”

For several decades, leading behavioral scientists have offered strong criticisms of the common practice of null hypothesis significance testing as producing spurious findings without strong theoretical or empirical support. But only in the past decade has this manifested as a full-scale replication crisis. We consider some possible reasons why, on or about December 2010, the behavioral sciences changed.

The above is taken from an article I wrote with Simine Vazire a few years ago, “Why did it take so many decades for the behavioral sciences to develop a sense of crisis around methodology and replication?” We had some good discussion on the blog at the time.

I have nothing new to add on the topic right now; I’m just re-posting because the “What changed in 2010?” question arose in a recent talk by the political scientist Andy Eggers:

The case against Hypothesis 1

Making and testing predictions about unseen data is extremely common in social science research. Until recently, most prediction-making and -testing was mere window dressing because the predictions were made after the results were known. Now that empirical researchers commonly pre-register their hypotheses in pre-analysis plans (PAPs), the practice of making and testing predictions deserves new scrutiny. I [Eggers] highlight the “Hypothesis 1 Puzzle”, which is that (all else equal) we expect to learn less from studies with more predictable results, and yet researchers take pains to make their outcomes seem more predictable. I consider several possible resolutions to the puzzle, including adherence to a caricature of the “scientific method” and conflation of theory testing with statistical hypothesis testing. In many cases, tests of “Hypothesis 1” are best seen as severely underpowered tests of poorly specified predictive models.
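To make the two halves of the puzzle concrete, here’s a minimal sketch in Python (my framing, not Eggers’; all numbers are illustrative assumptions). One standard way to formalize “we learn less from predictable results” is Shannon surprisal: an outcome anticipated with prior probability p carries -log2(p) bits of information, so confirming a near-certain Hypothesis 1 tells us almost nothing. The second part simulates the power of a one-sided “the effect is positive” test when the true effect is small relative to its standard error, illustrating what a severely underpowered test looks like:

```python
import numpy as np

# Surprisal: bits of information gained when a predicted result comes true.
# A result we were already 95% sure of carries very little information.
for p in [0.5, 0.8, 0.95]:
    print(f"prior P(result) = {p:.2f} -> surprisal = {-np.log2(p):.2f} bits")

# Power simulation: a one-sided test of "Hypothesis 1: the effect is positive,"
# with an assumed true effect of 0.1 standard deviations and n = 50.
rng = np.random.default_rng(0)
n_sims, n, true_effect, sd = 10_000, 50, 0.1, 1.0
z_crit = 1.645  # critical value for a one-sided 5% test
samples = rng.normal(true_effect, sd, size=(n_sims, n))
z = samples.mean(axis=1) / (samples.std(axis=1, ddof=1) / np.sqrt(n))
print(f"power = {np.mean(z > z_crit):.2f}")  # roughly 0.17, far below 0.80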
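```

Under these assumed numbers, the simulated power comes out around 0.17, so most runs of such a study fail to reject even though the directional prediction is true, and the rare significant results will systematically overstate the effect. That, as I read it, is the sense in which tests of “Hypothesis 1” can be severely underpowered tests of poorly specified predictive models.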