Common misreading of data with time lags

Numbers Rule Your World 2022-01-11

It's sad to see that data analysis has not progressed at all since the early days of the pandemic. Recall this post from the end of July 2020; it's Groundhog Day all over again. Back then, they declared the pandemic was improving because deaths weren't rising as quickly as cases. They were wrong. The recent articles about the "mildness" of Omicron repeat the same flawed analysis.

Is Omicron mild? It will take a few weeks - after Omicron becomes the dominant strain in a given locale - to be reasonably sure. The early analysis being circulated - which takes the current deaths divided by the current cases - is misleading.

***

Typically, one doesn't die the day after one tests positive. There is a time lag of weeks, maybe even more than a month.

Because Omicron is a lot more infectious, the denominator is expanding very rapidly while the numerator is not yet responding to the recent spike in cases. We have seen this picture before; eventually, deaths jump up also.

The media went through the same routine when Delta first showed up in the statistics. The following chart shows how Delta entered the country in May and, by late July, became the dominant SARS-CoV-2 strain. What was the media saying in early August?

[Chart: rise and fall of variants, USA]

"The delta variant-driven summer COVID-19 surge in the United States has so far proved much less deadly than previous waves, thanks in large part to vaccinations." That was the first sentence of this article at Yahoo. If you looked at the cases and deaths data at the time of the article, you found that cases were rising rapidly while deaths remained low, and therefore, it appeared as if Delta would be sniffles.

Then, by the end of August, the media dropped what it said in early August, and suddenly told readers how horrible this Delta strain was. Yahoo's headline became "America's delta-driven surge of COVID-19 has entered a deadlier phase."

No, a deadly phase did not arrive. It was the time lag. About four weeks later, some of the people who got infected in the July surge died. Because of the time lag, we shouldn't divide current deaths by current cases. The current deaths mostly come from infections that occurred some weeks earlier. In the meantime, the current cases are exploding but few of these new infections are causing immediate deaths. So the denominator gets inflated with cases that have almost zero chance of contributing to current deaths.
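To make the arithmetic concrete, here is a minimal sketch using made-up numbers, not real COVID-19 data. It assumes a fixed 28-day lag between a positive test and death, a constant 1% fatality rate, and cases growing 5% a day during the surge; all three values, and the function names, are assumptions chosen purely for illustration. The fatality rate never changes in this toy world, yet dividing current deaths by current cases makes the variant look far milder.

```python
# Toy simulation with synthetic numbers (NOT real data).
LAG_DAYS = 28         # assumed lag from positive test to death
TRUE_FATALITY = 0.01  # assumed constant: 1% of cases die, LAG_DAYS later
GROWTH = 1.05         # assumed 5% daily growth in cases during the surge


def simulate_cases(flat_days, surge_days, baseline=1000.0):
    """Daily cases: a flat baseline followed by an exponential surge."""
    cases = [baseline] * flat_days
    for _ in range(surge_days):
        cases.append(cases[-1] * GROWTH)
    return cases


def simulate_deaths(cases):
    """Deaths on day t are a fixed share of the cases on day t - LAG_DAYS."""
    return [
        TRUE_FATALITY * cases[t - LAG_DAYS] if t >= LAG_DAYS else 0.0
        for t in range(len(cases))
    ]


def naive_ratio(cases, deaths, t):
    """The flawed analysis: current deaths divided by current cases."""
    return deaths[t] / cases[t]


def lagged_ratio(cases, deaths, t):
    """Current deaths divided by the cases that could have produced them."""
    return deaths[t] / cases[t - LAG_DAYS]


cases = simulate_cases(flat_days=60, surge_days=28)
deaths = simulate_deaths(cases)
today = len(cases) - 1  # four weeks into the surge

print(f"naive ratio:  {naive_ratio(cases, deaths, today):.4f}")
print(f"lagged ratio: {lagged_ratio(cases, deaths, today):.4f}")
```

Under these assumptions, the lagged ratio recovers the 1% that was built in, while the naive ratio comes out at roughly a quarter of that. Nothing about the virus changed; the denominator simply ran ahead of the numerator. In real data the lag is a distribution rather than a single number, so a fixed 28-day shift is only a rough correction.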

In the following chart, you can see the time lag between cases and deaths during the Delta surge.

[Chart: cases, deaths and death rate, USA]

Let's say we don't have a crystal ball. We are at the start of August, and this is what we see:

[Chart: cases, deaths and death rate, USA, as seen at the start of August]

Is this new variant that is causing the surge in cases less severe? Analyzing the blue area will lead to the wrong conclusion because the red line (deaths) has yet to react to the surge in the black line (cases).

So, the media have pitched this misleading story over and over again, with each surge. When will they stop?

***

Omicron may well be milder, but that cannot be proven by a flawed analysis. The fact that a flawed analysis yields a desirable answer does not rescue the analysis. Even if we eventually prove that Omicron is milder through a proper analysis, that result still cannot salvage a bad methodology. Otherwise, all we are doing is massaging the data to support a wished-for conclusion.