Rules of Thumb for Evaluating Statistical Models
The coronavirus pandemic has flung abstruse questions of scientific modeling to the forefront of public consciousness. How do you model a complex phenomenon, such as the spread of a highly contagious disease, when you don't know its infectiousness, its morbidity rate, or how it interacts with other health conditions? How do you integrate models of the biological characteristics and effects of a virus with models of human society (e.g., the availability of medical ventilators and medical staff, or the role of mass transit and mass recreation in spreading the disease)?
Most difficult of all, how do you model policy responses to the virus, which are themselves dependent on how politicians and the public react to announcements about earlier models of the virus's spread? The predicted British death toll from the model of Imperial College epidemiologist Neil Ferguson and his team fell rapidly from 500,000 to fewer than 20,000, not because of changed epidemiological assumptions, but because the British government had imposed a lockdown, likely as an alarmed response to his earlier model. And what do you do when the negative consequences might include millions of deaths on the one hand and another Great Depression on the other, both if you're particularly unlucky, and you must make the decision right now, with nothing but an ensemble of shifting and uncertain models?
We should have sympathy for every elected leader in the world. Each of them has to make hundreds of vitally important decisions, and they are almost guaranteed to choose at least one policy during this crisis that people will look back on and say, "That was a mistake!" We must critique policy choices vigorously and continuously throughout this pandemic, but with a spirit of charity. It's unlikely that any of us would make perfect policy if we were in the Oval Office or 10 Downing Street.
We should also approach scientific modelers with a spirit of charity. We should critique their work wherever appropriate, particularly when modelers call for millions of people to suffer destitution and restrictions on their liberty because today's model supports that strategy. But the modelers are trying to do their best for their fellow human beings. And we should sympathize with the fears of the doctors whose experiences provide crucial input for the models, because those fears are informed by direct experience with the effects of the coronavirus.
But how should we judge coronavirus models?—and, more broadly, the gusher of articles about the coronavirus and its effects? How should we work to improve the models? I think these are some useful rules of thumb to keep in mind:
- Reserve judgment on claims that the rate of coronavirus deaths correlates with X. Too many observational studies in recent years have produced false positives simply by testing X against everything else in the world and discovering fluke correlations by sheer chance. If you read the words correlation or association near coronavirus, pause and take a close look at the methodology behind the claim (see the small simulation sketched after this list).
- We must work to make the maximum amount of data fully accessible to the public as quickly as possible. We don't yet have the most basic information, such as how many people have actually been infected. We need massive numbers of coronavirus tests to get better data for our models.
- Rather than choosing which model to believe, we must think in terms of updating an ensemble of coronavirus models as quickly and efficiently as possible. Much highly technical debate among statisticians concerns how we should update our models; it is war to the knife between Bayesians and Frequentists. I believe, following Karl Popper, that we should work continually to attempt to falsify models, and to provide falsifiable models. This points toward the insights of error statistics philosophy, which focuses at a theoretical level on the need to craft usefully falsifiable models, as a way to frame our examination and updating of models (a toy sketch of ensemble updating follows this list).
- We should be aware that no model will provide us perfect knowledge. A great many scientific disciplines cannot provide us perfect knowledge; we must work instead to maximize the amount of knowledge we have. Policymakers must ultimately make their decisions, knowing that no model provides scientific certainty. The public should judge policymakers, knowing that they had to make their decisions with limited information.
- We should try to incorporate intelligent policy responses to existing data into our models, so that the models themselves can inform effective government policy. There is a strong case that epidemiological statistics, as a discipline, does not model such responses adequately, and has therefore recommended unnecessarily large government responses to the coronavirus pandemic. Although we should be charitable to epidemiologists, we should also examine epidemiology for disciplinary blinders, and not simply defer to epidemiologists' claims of expertise.
- We must try not to fall prey to groupthink. Whether we think the lockdown needs to be extended indefinitely, or ended immediately, we need to listen carefully to people arguing the opposite point of view. This is particularly difficult when the stakes are so immediate and so high. Yet it is precisely for this reason that we should listen to a wide variety of points of view—and make sure the corresponding data are publicly accessible, so as to allow for informed debate among contrary opinions. The stricture to avoid groupthink applies to the experts informing government policy—but our first duty is to rid ourselves of groupthink, not to point out gleefully the groupthink in our neighbor’s eye.
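To make the first rule of thumb concrete, here is a minimal simulation in Python of how fluke correlations emerge when one outcome is tested against many unrelated variables. Everything in it (the region count, the candidate variables, the 0.05 threshold) is an invented illustration, not data from any real coronavirus study.

```python
# Simulate testing a fake "death rate" against many unrelated noise variables.
# With enough tests and no correction, some will look "significant" by chance.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

n_regions = 50        # hypothetical regions
n_candidates = 200    # candidate explanatory variables, all pure noise

death_rate = rng.normal(size=n_regions)                   # fake outcome
candidates = rng.normal(size=(n_candidates, n_regions))   # fake predictors

false_hits = 0
for x in candidates:
    r, p = stats.pearsonr(x, death_rate)
    if p < 0.05:      # naive threshold, no correction for multiple comparisons
        false_hits += 1

print(f"{false_hits} of {n_candidates} unrelated variables "
      f"'correlate' with the outcome at p < 0.05")
# Expect roughly 5% of the tests (about 10 here) to pass by chance alone.
```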
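And to illustrate the third rule of thumb, here is a toy sketch of updating weights over an ensemble of models as new data arrive, in a simple Bayesian-model-averaging spirit. The candidate fatality rates, the weekly counts, and the uniform prior are all made-up assumptions for the sake of the example; the point is only the mechanics of reallocating belief across models rather than picking a single winner.

```python
# Update weights over an ensemble of candidate models as data arrive.
# Each "model" here is just a fixed hypothesized infection fatality rate (IFR).
import numpy as np
from scipy import stats

ifr_models = np.array([0.001, 0.005, 0.01, 0.02])        # hypothetical candidate IFRs
weights = np.full(len(ifr_models), 1 / len(ifr_models))  # uniform prior over models

# Hypothetical batches of new data: (deaths, resolved cases) observed each week.
batches = [(3, 400), (11, 1500), (28, 3000)]

for deaths, cases in batches:
    likelihood = stats.binom.pmf(deaths, cases, ifr_models)  # fit of each model to the new batch
    weights = weights * likelihood                           # Bayes' rule, unnormalized
    weights = weights / weights.sum()                        # renormalize to a probability distribution
    print(dict(zip(ifr_models, np.round(weights, 3))))
# The ensemble never collapses to certainty; it reallocates belief as evidence accumulates.
```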
As many observers have noted, the debate about coronavirus models has been subsumed into the debate about climate change models. Skepticism about over-ready reliance on climate change models has been useful for framing analysis of coronavirus models. However, this connection is unhelpful to the extent that it freezes and distorts responses to the coronavirus along the fault lines of the climate change debate. This is unfortunate, because it is perfectly possible to be a climate change skeptic and a coronavirus alarmist, or a climate change alarmist and a coronavirus skeptic. We should allow our knowledge of climate science modeling controversies to inform our approach to coronavirus modeling controversies, but not to determine it.
Soon we will debate anew the regulations that govern how models should inform government policy, with a great deal of coronavirus grist for the mill. This debate should expand to include education policy. We cannot evaluate models if we do not understand statistics, and statistics education is sorely lacking in America, among the general public, policymakers, and even scientific professionals. We must improve our statistics education drastically. Every state should require its high school graduates to take a course in statistical literacy. All universities should add introductory statistics to their general education requirements. Professional schools should increase their statistics requirements as well, not only medical schools but also law schools. We need better knowledge of statistics and modeling at every level of our educational system.
Citizens (whose opinions influence policymakers) and policymakers must be able to evaluate models intelligently. The coronavirus has made this a matter of life and death.