Last post on the estimated effects of Mississippi school reforms

Statistical Modeling, Causal Inference, and Social Science 2026-01-05

For background:

How much of “Mississippi’s education miracle” is an artifact of selection bias?

When the numbers don’t look right, check them! (Mississippi education update)

More on school reform, this time New Orleans

And now one more, from Noah Spencer, who writes:

I did have a good back-and-forth with Wainer et al., but remain unconvinced by their main critique.

– I [Spencer] address the authors’ main critique – that truncation due to retention mechanically explains the observed effects – in Section 7.2 of my paper. Basically, students who are retained in grade 3 do not just stay there forever. The typical student is retained for one year and then proceeds to grade 4, where they can write the NAEP. Given the timing of the policy, no NAEP-taking cohort would have been artificially missing a mass of weaker students. From the paper:

    “One hypothesis is that the NAEP test score gains are a mechanical consequence of weaker 3rd-grade students not making it to fourth grade to write the NAEP test. Given the timing of the retention policy however, this purely mechanical explanation does not make sense. The first cohort eligible for retention under the LBPA was the 2014-2015 grade 3 cohort. Thus, the 2014-2015 grade 4 NAEP test-takers were not exposed to the new retention policy. It is true that the 2016-2017 grade 4 NAEP test-taking cohort would not have included students who were retained in grade 3 after the 2015-2016 school year (who would have been in grade 4 in 2016-2017 absent the LBPA). However, the 2016-2017 test-taking cohort would have included students who were retained in grade 3 after the 2014-2015 school year (assuming they were not retained again in 2015-2016). Thus, the mass of weaker students taking the NAEP would not be eliminated due to the LBPA, but rather replaced by a mass of previously under-achieving students who had been retained and had now passed the necessary grade 3 reading assessment. Similar logic follows for the 2018-2019 test-taking cohort.”

– Minor note: Being retained multiple times in grade 3 is rare in Mississippi.
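The timing logic above reduces to a single inequality. Here is a minimal illustrative check (the policy dates come from the quoted passage; the helper function is mine, not Spencer's):

```python
# The first grade 3 cohort subject to LBPA retention finished grade 3 in
# spring 2015 (the 2014-2015 school year). A grade 4 NAEP cohort could have
# been affected by retention only if it was in grade 3 in 2014-2015 or later.
first_retention_g3 = 2015  # spring of the 2014-2015 school year

def retention_exposed(grade4_spring_year):
    """Was this grade 4 NAEP cohort in grade 3 under the retention policy?"""
    return grade4_spring_year - 1 >= first_retention_g3

for y in (2015, 2017, 2019):
    print(y, retention_exposed(y))
# 2015 False  (the 2014-2015 grade 4 cohort predates the policy)
# 2017 True
# 2019 True
```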

– I also test in my paper whether the LBPA changed the composition of NAEP-takers beyond the above truncation concern (see Table B3). I do not find statistically significant effects on the percent of NAEP takers who: are White, are male, are English language learners, have a disability, or have a computer at home.

– The question of whether retention was the key mechanism through which the LBPA’s effects manifest is a good one. Are the average test score gains across Mississippi driven by the scores of retained students? The 2014-2015 treatment effect cannot be due to LBPA-induced retention, as Mississippi’s 2014-2015 grade 4 cohort was not exposed to the retention aspect of the policy (which first applied to the 2014-2015 grade 3 cohort). The 2018-2019 treatment effect is unlikely to be substantially influenced by LBPA-induced retention given that the 2016-2017 third-grade retention rate (3.8%) was so similar to the pre-LBPA retention rate (3.3% in 2013-2014). You would have to assume incredible gains in test scores due to retention for such a small segment of students to influence a state’s average so greatly. The 2016-2017 treatment effect is the most likely to be affected by retention given that 8.1% of third-graders in 2014-2015 were retained. In Appendix C, I conduct a decomposition exercise and estimate that only about 22% of the 2016-2017 treatment effect is due to the retention aspect of the LBPA – though I should note that this decomposition exercise does require some strong assumptions.
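The "incredible gains" point admits a quick back-of-envelope check. A minimal sketch (this is not Spencer's Appendix C decomposition; the retention rates are the ones he cites, and the 0.23 SD figure is his headline Mississippi estimate mentioned later in the post, used here as a stand-in for the 2018-2019 effect):

```python
# If the extra students retained under the LBPA were the sole drivers of the
# statewide effect, how much would each one's score have to rise?
# share_of_cohort * per_student_gain = statewide_effect
effect_sd = 0.23             # headline statewide effect, SD units
retention_2016 = 0.038       # 2016-2017 grade 3 retention rate
pre_lbpa_retention = 0.033   # 2013-2014 grade 3 retention rate

extra_share = retention_2016 - pre_lbpa_retention  # ~0.5% of the cohort
implied_gain = effect_sd / extra_share

print(f"extra retained share: {extra_share:.3f}")          # 0.005
print(f"implied per-student gain: {implied_gain:.0f} SD")  # 46 SD, implausible
```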

– With respect to longer-term effects, I show in Appendix B.1 of my paper that effects persist until at least grade 7 on higher-stakes, state-level tests. There is some fadeout, but this is not unusual among educational interventions. I did not analyze effects on grade 8 NAEP reading scores in my paper partially because there was only one pre-COVID grade 8 cohort who was exposed to the LBPA and partially because I wanted to use grade 8 test scores as covariates. For what it’s worth, though, I have run the analysis quickly and find positive effects for grade 8 NAEP reading test-takers (including the 2022 and 2024 cohorts), though I would be hesitant to take much from post-COVID results because there was so much else changing at the time.

– Carefully evaluating effects on longer-term outcomes like high school completion rates, ACT scores, and post-secondary entrance rates is an important topic for future research. Mississippi’s gains on grade 4-8 assessments certainly do not guarantee longer-term effects and, again, it would not be unusual for short/medium-term effects to fade out.

– The claim that “The 2024 NAEP fourth grade mathematics scores rank the state at a tie at 50th!” is incorrect: Mississippi ranked 16th. The state is also ranked 35th in 8th grade math, not 50th. I believe the authors have corrected this in an updated version of their article.

– “He improvised by using some prior years’ data as the control group, and instead of random assignment he used various bits of covariate information to equate this year’s students with the previous years…” – This was not what I did (nor what the synthetic difference-in-differences method does). I generated a control group based on a weighted average of states with similarly evolving test scores pre-treatment.
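The control-group construction Spencer describes can be sketched in a few lines. This is a stylized illustration of the weighting idea only, not the full synthetic difference-in-differences estimator (which also uses time weights and an intercept shift); the data and "states" here are simulated:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
T_pre = 8  # pre-treatment periods

# Toy pre-treatment score trajectories for 5 control states (random walks),
# and a "treated state" that happens to track a mix of states 0 and 3.
controls = rng.normal(0, 1, (5, T_pre)).cumsum(axis=1)
treated = 0.5 * controls[0] + 0.5 * controls[3] + rng.normal(0, 0.05, T_pre)

def pre_fit(w):
    """Squared pre-treatment gap between treated state and weighted controls."""
    return np.sum((treated - w @ controls) ** 2)

# Nonnegative weights summing to 1: a weighted average of control states
# chosen to match the treated state's pre-treatment trajectory.
n = controls.shape[0]
res = minimize(
    pre_fit,
    x0=np.full(n, 1 / n),
    bounds=[(0, 1)] * n,
    constraints={"type": "eq", "fun": lambda w: w.sum() - 1},
)
w = res.x
print(np.round(w, 2))  # weight concentrates on states that track the treated unit
```

The weights are then held fixed, and the post-treatment gap between the treated state and its weighted comparison serves as the effect estimate.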

– Mississippi’s results are not entirely unique. Westall and Cummings (2023) assess early literacy policies across the country and find 0.14 SD effects for kids exposed from K-3 in the average “comprehensive policy” state. My 0.23 SD estimated effect for Mississippi is not wholly inconsistent with their national results.