Counting cases
There have been some fairly large fluctuations in the reported number of cases of COVID-19, the disease caused by the new coronavirus, in China, as the authorities change how they define cases. That’s not as dodgy as it might sound.
We can divide the population into two groups according to how they feel: do they have symptoms consistent with COVID-19 infection or not? We can divide them into three groups according to viral testing: positive, negative, not tested yet. Outside the outbreak area we could also divide people according to whether they had a plausible exposure or not, but at the centre of the outbreak it makes sense to assume basically anyone could have been exposed. We end up with six groups.
- The no-symptoms, negative test group clearly shouldn’t be counted as cases.
- The symptoms, positive test group clearly are cases.
Then it gets harder:
- The symptoms, no-test group will be mixed. Many of them will have COVID-19 infection, but others will just have some other influenza-like illness. The likelihood that they are cases will vary according to exactly what symptoms they have. Most of these people are being tested for the virus, but testing for a new virus is relatively slow and takes expertise, and the testing labs are backed up. The subset of people with lower respiratory tract infection confirmed by chest imaging (x-ray) were recently added to the official case count, but only if they are in Hubei province, China.
- The symptoms, negative test group are probably not cases of COVID-19.
- The no-symptoms, positive test group are probably cases, but since few asymptomatic people are being tested, they will be a small and unrepresentative subset of the asymptomatic cases. I have one source that says these were recently subtracted from the count.
- The no-symptoms, no-test group includes nearly everyone, including most of the asymptomatic (or mildly symptomatic) cases.
Who you want to count depends on what you want to do with the data.
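To make that concrete, here is a minimal Python sketch of the six groups above and of how different counting rules pick out different groups. The three case definitions and their names are illustrative assumptions, not official definitions; in particular, the last one is broader than the Hubei rule described above, which only adds untested people whose lower respiratory tract infection is confirmed by chest imaging. No counts are included, because the sketch is only about which groups a definition covers.

```python
from itertools import product

SYMPTOMS = ("symptoms", "no symptoms")
TESTS = ("positive", "negative", "not tested")

# Every combination of symptom status and test status: 2 x 3 = 6 groups.
GROUPS = list(product(SYMPTOMS, TESTS))

# Hypothetical case definitions (assumptions for illustration only).
# Each is a rule taking (symptom status, test status) and returning True/False.
CASE_DEFINITIONS = {
    "any positive test": lambda s, t: t == "positive",
    "symptoms and positive test": lambda s, t: s == "symptoms" and t == "positive",
    "positive test, or symptoms while untested": lambda s, t: (
        t == "positive" or (s == "symptoms" and t == "not tested")
    ),
}


def groups_counted(definition):
    """Return the groups that the named case definition counts as cases."""
    rule = CASE_DEFINITIONS[definition]
    return [group for group in GROUPS if rule(*group)]


if __name__ == "__main__":
    for name in CASE_DEFINITIONS:
        print(name, "->", groups_counted(name))
```

Switching from one rule to another changes which groups are included, so the reported total can jump even if nothing about the underlying epidemic has changed.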
Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient.