“Exposing omitted moderators: Explaining why effect sizes differ in the social sciences”

Statistical Modeling, Causal Inference, and Social Science 2025-04-03

Antonia Krefeld-Schwalb, Eli Rosen Sugerman, and Eric Johnson write:

Policymakers increasingly rely on behavioral science in response to global challenges, such as climate change or global health crises. But applications of behavioral science face an important problem: Interventions often exert substantially different effects across contexts and individuals. We examine this heterogeneity for different paradigms that underlie many behavioral interventions. We study the paradigms in a series of five preregistered studies across one in-person and 10 online panels, with over 11,000 respondents in total. We find substantial heterogeneity across settings and paradigms, apply techniques for modeling the heterogeneity, and introduce a framework that measures typically omitted moderators.

I like this. It reminds me of our piranha paper but directly informed by empirical data.

The focus on treatment interactions—equivalently, variation in effect size—makes sense to me. It’s something I’ve been thinking about for a long time, for example, this from 2005, this from 2014, this from 2015, this from 2023, and our recent paper on causal quartets for visualizing varying treatment effects. Also this paper from 2004 that I’m still chewing on.

But I haven’t thought so much about modeling (rather than just describing) variation in effects. So this paper by Krefeld-Schwalb et al. seems like an important step forward.

Magnitude and direction

Also I was struck by this statement from the paper:

Moderators are associated with effect sizes through two paths—affecting manipulation intensity and interacting with the effect of the manipulation.
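
Just to make sure I follow, here’s a toy simulation of those two paths (nothing from the paper; the variable names and all the numbers are made up): a moderator m can change how big a dose of the manipulation people effectively receive, and it can also change how much a given dose moves the outcome.

```python
# Toy simulation of the two paths by which a moderator m can be associated
# with effect size. All numbers are invented for illustration; this is not
# the model from Krefeld-Schwalb et al.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 10_000
z = rng.integers(0, 2, n)   # randomized assignment to the manipulation
m = rng.normal(0, 1, n)     # moderator (say, attentiveness of the panel)

# Path 1: m affects manipulation intensity -- treated respondents with
# higher m effectively get a bigger "dose" of the manipulation.
dose = z * (1.0 + 0.5 * m)

# Path 2: m interacts with the effect of the manipulation -- a unit of
# dose moves the outcome more for higher-m respondents.
y = 0.2 + (0.3 + 0.2 * m) * dose + rng.normal(0, 1, n)

# From the analyst's side, both paths show up the same way: as a z:m
# interaction in the intention-to-treat regression.
fit = smf.ols("y ~ z * m", data=pd.DataFrame(dict(y=y, z=z, m=m))).fit()
print(fit.params)  # the z:m coefficient mixes the two paths together
```

The point of the toy example is that, if you don’t measure the intensity of the manipulation separately, the two paths get folded into a single interaction coefficient, which I take to be part of why the authors want to measure those typically omitted moderators directly.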

This reminds me of something regarding education research—really, policy research in general—that I’ve been saying a lot recently but haven’t written down. The idea is that education is like a vector with a magnitude and a direction. The magnitude is how hard students work on their own or in small groups—those are the two scenarios where most of the learning gets done—and the direction is what they learn.

As teachers, we have two jobs. Job #1 is to motivate students to learn, that is, to increase the magnitude. Job #2 is to teach correct and useful things, that is, to get the direction right. My books are a mix of #1 and #2. To help with the magnitude, we try to structure the material to be clear, to smooth the path to learning, and to give students lots of handholds: stories, examples, math, code, explanations, homeworks, all sorts of things. To help with the direction, we work hard to include useful material and to remove or to argue against ideas we think are counterproductive.

For example, when we were writing BDA back in the early 1990s, a big idea in Bayesian inference was decision theory of Bayes estimators, and another idea was Bayesian null hypothesis testing. We put in very little on those topics in the book, and most of what we did put in was to explain why we weren’t putting in more. From the other direction, we pivoted the book around three chapters on hierarchical modeling, model checking, and the relation between design and analysis: to us, these were important concepts that students might otherwise not see.

As you can see just from the above paragraph, it’s a lot easier for me to think about direction than magnitude, which makes sense because I think I’m a much better statistician than a teacher, and indeed my teaching is best done not one-on-one but rather in this broadcasty way by exploring ideas through writing.

To get back to education research: I think most education interventions, at least the ideas that get tested in controlled trials, are focused on improving magnitude. The idea is that the subject-matter experts are supposed to get the direction right, and the education researchers work on the magnitude.

But this has implications for education research! What I’m calling “the magnitude,” which is motivation for students to work hard, figure things out, and learn on their own or with peers, is in large part an interaction between the teacher and student. That’s right, interactions again!
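
To see why this matters, here’s a made-up numerical example (again, nothing from any real study; the match variable and all the numbers are invented): if the effect of an intervention depends on how well teacher and student fit together, the average effect can look negligible even while the effect is large for most individuals.

```python
# Toy illustration: an average treatment effect near zero can coexist with
# large effects for most individuals, if the effect depends on a
# teacher-student match variable. All numbers invented for illustration.
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
match = rng.uniform(-1, 1, n)   # hypothetical teacher-student match score
effect = 0.05 + 0.8 * match    # individual treatment effects

print(f"average effect: {effect.mean():.3f}")  # ~0.05, looks negligible
print(f"sd of effects:  {effect.std():.3f}")   # ~0.46, huge variation
print(f"share with |effect| > 0.3: {(np.abs(effect) > 0.3).mean():.2f}")  # ~0.62
```

Which is the causal-quartets point again: one average can hide many different patterns of individual effects.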

And not just education research. So many social interventions are ultimately about motivation.

This idea—that the most important part of a treatment is in its interaction with the people being treated—is in direct conflict with the dominant way of thinking about causal inference, what I call the black-box or push-a-button, take-a-pill model of science. Something’s gotta give, and maybe this new paper by Krefeld-Schwalb et al. will take us a little bit in the right direction.

P.S. I could do without the trolley example—I’d be happy to never again hear about that fat guy (described in this article as “a large man wearing a backpack,” which I guess is the politically correct way to say “fat guy” now), but, hey, it’s their paper, they can use whatever examples they want!