April 21, 2021

Knowing what to leave out

The epidemic modelling group at Te Pūnaha Matatini (who work a few floors above me) won the Prime Minister’s Science Prize for their work on modelling the Covid epidemic in New Zealand. There have been some descriptions of their models in the media, but not so much of what mathematical modelling actually involves.

A good mathematical model captures some aspect of the way the real process works, but leaves out enough of the detail that it’s feasible to study and learn about the model more easily. The limits to detail might be the data available, or computer time, or mathematical complexity, or just not understanding part of the way the process works. Weather models, for example, have improved over the years by using more powerful computers and more detailed input data, enabling them to take into account more features of the real weather system and more details of Earth’s actual geography.

The simplest epidemic models are the SIR and SEIR families. These generate the familiar epidemic curves that we’ve all seen so often: exponential on the way up, then falling away more slowly. They are also responsible for the reproduction number “R”, the average number of people each case infects. The simple models have no randomness in them, and they know nothing about the New Zealand population except its size. There’s a rate at which cases come into contact with new people, and a rate at which contacts lead to new infections, and that’s all the model knows. These models are described by simple differential equations; they can be projected into the future very easily, and the unknown rates can be estimated from data. If you want a quick estimate of how many people are likely to be in hospital at the epidemic peak, and how soon, you can run this model and gaze in horror at the output. In fact, many of the properties of the epidemic curve can be worked out just by straightforward maths, without requiring sophisticated computer simulation. The SEIR models, however, are completely unable to model Covid elimination — they represent the epidemic by continuously varying quantities, not whole numbers with uncertainty. If you put a lockdown on and then take it off, the SEIR model will always think there’s some tiny fraction of a case lurking somewhere to act as a seed for a new wave. In fact, there’s a notorious example of a mathematical model for rabies elimination in the UK that predicted a new rabies wave from a modelled remnant of 10⁻¹⁸ infected foxes — a billion billionth of a fox, or one ‘attofox’.
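To make the attofox problem concrete, here’s a minimal sketch of a deterministic SIR model in Python. This is emphatically not the Te Pūnaha Matatini model: the transmission and recovery rates, the starting conditions, and the lockdown timing are all invented purely for illustration.

```python
# A minimal deterministic SIR sketch; all rates and timings here are invented
# for illustration, and this is not the Te Pūnaha Matatini model.

def run_sir(days, beta, gamma=0.1, s=0.999, i=0.001, r=0.0, dt=0.1):
    """Euler-step dS/dt = -beta*S*I, dI/dt = beta*S*I - gamma*I, dR/dt = gamma*I."""
    for _ in range(int(days / dt)):
        new_infections = beta * s * i * dt
        new_recoveries = gamma * i * dt
        s, i, r = s - new_infections, i + new_infections - new_recoveries, r + new_recoveries
    return s, i, r

# A 'lockdown' pushes transmission below the recovery rate, so infections shrink...
s, i, r = run_sir(days=100, beta=0.05)
print(f"Infected fraction after 100 days of lockdown: {i:.1e}")  # tiny, but never exactly zero

# ...and the remnant fraction of a case reseeds a full epidemic once restrictions lift.
s, i, r = run_sir(days=200, beta=0.3, s=s, i=i, r=r)
print(f"Fraction ever infected 200 days after reopening: {r:.2f}")
```

The infected compartment after the lockdown is minuscule but never exactly zero, which is precisely why a model that counts whole people is needed before you can talk about elimination.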

The next step is models that treat people not precisely as individuals but at least as whole units, and acknowledge the randomness in the number of new infections for each existing case.  These models let you estimate how confident you are about elimination, since it’s not feasible to do enough community testing to prove elimination that way.   After elimination, these models also let you estimate how big a border incursion is likely to be by the time it’s detected, and how this depends on testing strategy, on vaccination, and on properties of new viral variants.  As a price, the models take more computer time and require more information — not just the average number of people infected by each case, but the way this number varies.
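As a rough illustration of this kind of model (again, not the actual Te Pūnaha Matatini code), here’s a toy branching-process simulation. Each case infects a negative-binomial number of new cases, so spread is lumpy the way superspreading makes it; the reproduction number R, the dispersion k, and the outbreak-size cutoff are all made up.

```python
# A toy branching-process sketch: whole people, with randomness in how many each infects.
# The reproduction number R, dispersion k and cutoff below are invented for illustration.
import numpy as np

rng = np.random.default_rng(1)

def dies_out(R=1.5, k=0.5, seed_cases=1, max_cases=500):
    """Simulate one introduced case; return True if the outbreak goes extinct
    before ever reaching max_cases."""
    current, total = seed_cases, seed_cases
    while current > 0 and total < max_cases:
        # Negative binomial with n=k, p=k/(k+R) has mean R and variance R + R**2/k,
        # so a few cases infect many people and most infect nobody.
        offspring = rng.negative_binomial(k, k / (k + R), size=current)
        current = int(offspring.sum())
        total += current
    return current == 0

simulations = 10_000
extinctions = sum(dies_out() for _ in range(simulations))
print(f"Fraction of single-case introductions that fizzle out: {extinctions / simulations:.2f}")
```

With R above one an outbreak can still die out by chance, and with R below one you can ask how long an undetected chain is likely to smoulder; neither question even makes sense in the deterministic model.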

None of the models so far capture anything about how people in different parts of New Zealand are different. In some areas, people travel further to work or school, or for leisure. In some areas people live in large households; in others, small households. In some areas a lot of people work at the border; in others, very few do. Decisions about local vs regional lockdowns need a model that knows how many people travel outside their local area, and to where. A model with this sort of information can also inform vaccination policy: vaccinating border workers will prevent them getting sick, but what will it do to the range of plausible outbreaks in the future? Models with this level of detail require a huge amount of data on the whole country, and serious computing resources; getting them designed and programmed correctly is also a major software effort. The model has an entire imaginary New Zealand population wandering around inside the computer; you’re all individuals!
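A genuinely individual-based model is well beyond a blog post, but a toy two-region version gives the flavour of where the travel information enters. Everything here (the regions, population sizes, travel matrix, and rates) is invented for illustration; the real models work at the level of individuals, households, and workplaces rather than two homogeneous blobs.

```python
# A toy two-region sketch of where a travel matrix enters the model.
# Region names, populations, travel fractions and rates are all invented.
import numpy as np

rng = np.random.default_rng(2)

regions     = ["Tāmaki Makaurau", "Rest of NZ"]     # hypothetical two-region split
pop         = np.array([1_700_000, 3_400_000])
infected    = np.array([5, 0])                      # small outbreak seeded in one region
susceptible = pop - infected

# travel[i, j]: fraction of region i's contacts that happen in region j (made up)
travel = np.array([[0.98, 0.02],
                   [0.01, 0.99]])

beta, recovery = 0.25, 0.10                         # made-up daily rates

for day in range(60):
    # Each region's infection risk mixes local and out-of-region prevalence
    # through the travel matrix.
    risk = beta * ((infected / pop) @ travel)
    new_cases = np.minimum(rng.poisson(risk * susceptible), susceptible)
    recoveries = rng.binomial(infected, recovery)
    susceptible = susceptible - new_cases
    infected = infected + new_cases - recoveries

for name, count in zip(regions, infected):
    print(f"{name}: about {count} active cases after 60 days")
```

Changing the travel matrix is how you would ask, in this toy, what a regional boundary does; the real models ask the same kind of question with actual travel data and a far richer population.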

A mathematical modelling effort on this scale involves working on the problem from both ends: what is the simplest model that will inform the policy question, and what is the most detailed model you have the time and resources and expertise to implement? Usually, it also involves a more organised approach to funding and job security and so on, but this was an emergency. As the Education Act points out, one reason we have universities is as a repository of knowledge and expertise; when we need the expertise, we tend to need it right now.


Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient.

Comments

  • Steve Curtis

    Travel demand modelling uses quite specific information about where and how many people there are, where they go to work, and how (mostly car).
    They are of course looking for different things: mostly the morning travel peak, but for epidemic purposes ‘inter-region’ travel would be useful to know.

    4 years ago