Friday, April 3, 2020

The Models Are Great, It's the Data That's Bad

On Monday of this week I asked the question, “Are the Models Correct?” Later that day Drs. Fauci and Birx finally showed us the much-touted models they’ve been using. They boldly proclaimed that 100,000 to 240,000 people would die in the best-case scenario! Fauci always hedges, proclaiming that the models are only as good as the assumptions we put into them, but this statement presupposes that the models are accurate and are only limited by the data and assumptions. More accurate would be to say that they are no better than the assumptions we put into them, because it’s quite possible to have perfect data and a bad model. Today Birx was asked if the models were accurate and she danced around the question, saying that they’re really good but the data is uncertain.

Fauci loves to talk about the gold standard of double-blind, randomized, controlled tests as the way to know if treatments like hydroxychloroquine/azithromycin really work, in spite of multiple trials all over the world showing very positive clinical results. They’re just anecdotal, he says. I wonder how much of the general public understands that a controlled test requires giving some people a placebo, i.e., not treating them for a disease that is a serious killer if you are sick enough to get into an ICU. He’s basically saying he’s willing to let some people die in order to have his beautiful large-scale gold standard test, because that’s the only way we’ll ever know if it works. You can make your own decision as to the morality of that. I know what I think. But I would love to ask him if he’s done large-scale, randomized, double-blind, controlled tests of these models, because I know the answer is no. They are, in his own words, anecdotal at best.

So what are the really good models predicting? Here’s the model from yesterday, just three days after that Monday briefing:

The prediction is 93,531 total deaths. That’s less than the lower limit of Monday’s model (100,000). Maybe the models are perfect and the data is bad, or maybe the data is fine and the models are bad. It doesn’t make much difference because they don’t predict reality, whatever the reason. In just three days the model is now predicting slightly more than half the number of deaths. Really, in three days we got so much new data that the model is now completely different? And these are being used to shut down the economy, imprison the country and take away our fundamental freedoms like the freedom of religion and assembly. Oh, by the way, this model assumes we will continue the extreme lockdown through the end of May! By then, those of us who didn’t get killed by the virus may wish we had, since there won’t be a country left, at least not one that we will recognize.

Last week I looked at 10 days of data and simply fit a power curve through it with a very high correlation coefficient. Extended out to today, it projected 7,000 deaths in the U.S. Here’s what it looks like today, where we see that the curve indeed goes through 7,000 on day 35:

I pointed out that the curve shows a slowing death rate: deaths took three days to double from 2,000 to 4,000, and they are now taking about 3-1/2 days to double (from 4,000 to 8,000).
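For readers who want to try this kind of fit themselves, here is a minimal sketch. It assumes "power curve" means deaths = a * day^b, fitted as a straight line through the logarithms of both columns (the way a spreadsheet power trendline works). The function names and the demo numbers are hypothetical: the demo data is generated from a made-up power law and is not the real death count. The sketch also shows why a power curve implies a slowing death rate: starting from day t, the time to double is t * (2^(1/b) - 1), which grows as t grows.

```python
import math

def fit_power_curve(days, deaths):
    """Least-squares fit of deaths = a * day**b, done as a line in log-log space.

    Returns (a, b, r), where r is the correlation coefficient of the logs.
    """
    xs = [math.log(d) for d in days]
    ys = [math.log(y) for y in deaths]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                        # slope in log-log space = exponent
    a = math.exp(mean_y - b * mean_x)    # intercept in log-log space = log(a)
    r = sxy / math.sqrt(sxx * syy)
    return a, b, r

def doubling_time(day, b):
    """Days needed for a * day**b to double, starting from `day`.

    Solving a * (day + T)**b = 2 * a * day**b gives T = day * (2**(1/b) - 1).
    """
    return day * (2 ** (1 / b) - 1)

# Synthetic placeholder data from a made-up power law (NOT real death counts).
demo_days = list(range(20, 30))
demo_deaths = [0.001 * d ** 4.5 for d in demo_days]

a, b, r = fit_power_curve(demo_days, demo_deaths)
print(f"a = {a:.6f}, b = {b:.3f}, r = {r:.6f}")
print(f"doubling time at day 25: {doubling_time(25, b):.2f} days")
print(f"doubling time at day 35: {doubling_time(35, b):.2f} days")
```

To use it on real numbers, replace the two demo lists with the day numbers and cumulative deaths for the days being fitted. The exponent b is the only thing the doubling-time formula needs.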

I also created three cumulative normal distributions with means approximately at the point when the really good models were predicting the peak death rate and designed to fit the existing data as closely as possible. Here’s how they look today with an extra week’s data:

Looking more closely we see that we are on track for the red or blue curves:

Note that these curves are not models at all. They are just three normal distributions designed to fit the data as of one week ago. Any number of curves could be drawn. I also looked at the epi curves from the CDC and suburban Cook County, Illinois, and I pointed out that these appeared to be starting to peak around March 16-18. Today it is very clear that this is the case:

Since the number of people getting infected began to peak around March 18 and the average number of days to death for those who are going to die is between 2 and 3 weeks, we ought to see the death rate start to peak in the middle of next week, which is one week earlier than the really good models are predicting. This is day 40 in my graphs. The green curve peaks at 41 days, the blue peaks at 44 and the red peaks at 49 days. Unless there’s another major outbreak that pushes the epi curve up again, it looks increasingly unlikely that the red curve is accurate (the one that is the lower limit of Monday’s really good model). I think the blue curve is much more likely, and I suspect it might even end up between the blue and green curves given the very positive clinical results of hydroxychloroquine/azithromycin. And this game changer in the middle of the outbreak may shift the shape of the curve so that the 25,000 deaths of the green curve is still a possibility, I think. Note that the middle of next week coincides with the lower limit of the current really good model and that would be 40,000 deaths. Anyone care to bet that the model will soon be changed to half the number of deaths again? Maybe we should plot the changing predictions of the model output like we do the various aspects of the virus. Maybe we’ll see an exponential decrease in predicted deaths.
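The curve shapes and the date arithmetic above can be sketched in a few lines. This is a rough illustration, not the spreadsheet behind the graphs. It assumes each curve is a total death count multiplied by a cumulative normal distribution, that "peaks at 41 days" refers to the daily death rate (which for a cumulative normal is highest at the mean), that the 25,000 is the green curve's final total, and that day 35 is April 3, 2020. The `spread` value (the standard deviation, in days) is a hypothetical placeholder, since the text does not give one, and all function names are made up for the example.

```python
import math
from datetime import date, timedelta

def cumulative_deaths(day, total, peak_day, spread):
    """Scaled cumulative normal: total * Phi((day - peak_day) / spread)."""
    z = (day - peak_day) / (spread * math.sqrt(2))
    return total * 0.5 * (1 + math.erf(z))

def daily_deaths(day, total, peak_day, spread):
    """Slope of the curve above: a normal density, highest at peak_day."""
    z = (day - peak_day) / spread
    return total * math.exp(-0.5 * z * z) / (spread * math.sqrt(2 * math.pi))

def day_to_date(day, ref_day=35, ref_date=date(2020, 4, 3)):
    """Convert a graph day number to a calendar date (day 35 = April 3, 2020)."""
    return ref_date + timedelta(days=day - ref_day)

# Green curve: peak day and total are taken from the text; spread is hypothetical.
GREEN = dict(total=25_000, peak_day=41, spread=8.0)

# Infections peaking around March 18, plus 2 to 3 weeks to death.
infection_peak = date(2020, 3, 18)
earliest = infection_peak + timedelta(weeks=2)
latest = infection_peak + timedelta(weeks=3)

print("expected death-rate peak window:", earliest, "to", latest)
print("day 40 falls on:", day_to_date(40))
print("green curve peak falls on:", day_to_date(GREEN["peak_day"]))
print("green curve cumulative deaths at its peak day:",
      cumulative_deaths(GREEN["peak_day"], **GREEN))
```

By construction a cumulative normal has reached exactly half of its total on the peak day, so the choice of total and peak day fixes where the curve should be at the peak regardless of the spread.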

Are the models correct? No one will ever know, because they will always be able to say it was the fault of the data. As time progresses the models get “more refined”. In other words, these models will perfectly fit the data after we have it. Hindsight is 20/20, they say. I call it predicting the past.

©Richard Wright, April 3, 2020