Estimating Overall Covid-19 Infection Rate of New Zealand
Introduction
I am curious about this: what is the overall covid-19 infection rate of New Zealand (NZ)? By the overall infection rate, I mean $$ r = \frac{T}{N}, $$ where $T$ is the total number of cases from the first NZ case up to now (i.e. 14 August 2022), and $N$ is the total NZ resident population. So this is an estimation problem, and for a solution we need data: “data, data. I cannot make bricks without clay” (who said that?). In this note, I will first use sample data to estimate $r$, and then use official statistics to make a second estimate.
Using sample data
As I just said, I do have sample data: 9 out of 16 people got covid. Note that this is not a random sample, but it is real data (I will not tell you the source of the data, to protect privacy).
Here I take the classic Bayesian model (see [1]), i.e. $$ \hbox{Prior:}\ r\sim \hbox{Uniform}(0, 1); $$
$$ \hbox{Likelihood:}\ Y|r \sim \hbox{binomial}(n, r), $$ where $Y$ is a random variable that counts the covid cases in a sample and $n$ is the sample size ($n=16$ for my sample). Under this model the posterior distribution of $r$ is $\hbox{Beta}(Y+1,\ n-Y+1)$, and the Bayesian estimator (the posterior mean) is $$ \hat{r} = \frac{Y+1}{n+2}. $$ Plugging $Y=9$ and $n=16$ into the above estimator, we have the estimate $$ \hat{r}=\frac{9+1}{16+2}=\frac{5}{9}\approx 56\%. $$ So we estimate that the overall covid-19 infection rate of NZ is about $56\%$. Is this a good estimate? I make the following remarks:
- The sample size is only 16 and this is not a random sample, so we cannot claim that the estimate is very accurate.
- The paper [2] uses the same model to estimate the case fatality rate.
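To put a number on the first remark, here is a minimal Python sketch of the estimate above. It uses the standard fact that a Uniform(0, 1) prior combined with a binomial likelihood gives a Beta$(Y+1,\ n-Y+1)$ posterior, whose mean is $(Y+1)/(n+2)$. The function name `posterior_summary` is a hypothetical helper made up for illustration.

```python
from fractions import Fraction
from math import sqrt


def posterior_summary(y: int, n: int):
    """Posterior mean and standard deviation of r under a Uniform(0, 1) prior.

    With Y | r ~ binomial(n, r), the posterior of r is Beta(y + 1, n - y + 1).
    """
    a, b = y + 1, n - y + 1                             # Beta posterior parameters
    mean = Fraction(a, a + b)                           # equals (y + 1) / (n + 2)
    var = Fraction(a * b, (a + b) ** 2 * (a + b + 1))   # variance of Beta(a, b)
    return mean, sqrt(var)


# My sample: 9 cases out of 16 people.
mean, sd = posterior_summary(y=9, n=16)
print(f"posterior mean = {mean} (about {float(mean):.1%})")
print(f"posterior sd   = {sd:.1%}")
```

For $Y=9$ and $n=16$ the posterior standard deviation comes out at roughly 11 percentage points, which is one way to see that a sample of 16 pins down $r$ only loosely (and this says nothing about the non-random sampling).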
Using official statistics
Surely I’m not the only person who has an interest in the overall infection rate $r$. Let’s go back to first principles: an estimate of the rate $r$ is $$ \hat{r} = \frac{\hat{T}}{\hat{N}}. $$ That is, if we can get good estimates of the NZ total cases and resident population, then the job is done. Now I ask our friend Google. This friend does not let me down:
- According to the Ministry of Health (MoH) web site, prior to 11:59pm on 12 August 2022, $$ \hat{T} = 1{,}684{,}946 \approx 1.68\ \hbox{million}. $$
- According to the Stats NZ web site, on 31 March 2022, $$ \hat{N} = 5{,}127{,}100\approx 5.13\ \hbox{million}. $$
Thus, $$ \hat{r}=\frac{1.68}{5.13}\approx 33\%. $$ Is this a good estimate? Probably, but hold on! Although we may have good confidence in Stats NZ’s estimate $\hat{N} \approx 5.13\ \hbox{million}$, common sense tells us that 1.68 million almost certainly underestimates the true total number of NZ cases. Many people got covid in NZ but, for various reasons, did not report it to the MoH. Here I make a confession: I was one of those people. My daughter and wife got covid and reported it to the MoH, but I did not. Shame on me! Let’s adjust the estimate as
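$$ \hat{T}_{adj}=\frac{1.68}{\hat{\rho}}, $$
where $\hat{\rho}$ is an estimate of the covid reporting rate, i.e. the fraction of all true cases that were reported to the MoH. A new estimate of the rate $r$ is $$ \hat{r}_{new}= \frac{\hat{T}_{adj}}{\hat{N}} = \frac{1.68/\hat{\rho}}{5.13}. $$ Now we can make a table, trying out a few values of $\hat{\rho}$: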
Reporting rate $\hat{\rho}$ | Infection rate $\hat{r}_{new}$ |
---|---|
95% | 34% |
90% | 36% |
85% | 39% |
80% | 41% |
75% | 44% |
70% | 47% |
65% | 50% |
60% | 55% |
55% | 60% |
50% | 65% |
So now we have a number of estimates of the overall infection rate, as shown in the above table. Which one is the truth? My honest answer is, “I don’t know.” Based on my judgement, it is likely that the true $r$ is somewhere around $47\%$, because for some reason (as indicated by the word “judgement”) I think the true covid reporting rate is probably about $70\%$, i.e. among all the people who got covid, $30\%$ did not report to the MoH.
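The table above can be reproduced with a few lines of Python. This is only a sketch: it uses the rounded figures 1.68 and 5.13 (in millions) from the text, and `adjusted_rate` is a hypothetical helper name introduced for illustration, not anything from the MoH or Stats NZ.

```python
T_HAT = 1.68  # reported total cases, in millions (MoH figure quoted above, rounded)
N_HAT = 5.13  # resident population, in millions (Stats NZ figure quoted above, rounded)


def adjusted_rate(rho: float, t_hat: float = T_HAT, n_hat: float = N_HAT) -> float:
    """Infection rate after scaling reported cases by an assumed reporting rate rho."""
    if not 0 < rho <= 1:
        raise ValueError("reporting rate must be in (0, 1]")
    return (t_hat / rho) / n_hat


# Reporting rates from 95% down to 50% in steps of 5 percentage points.
rows = [(pct, round(100 * adjusted_rate(pct / 100))) for pct in range(95, 45, -5)]

print("Reporting_rate | Infection_rate_r")
for pct, rate in rows:
    print(f"{pct}% | {rate}%")
```

Calling `adjusted_rate(1.0)` recovers the unadjusted estimate of about $33\%$, and any other guess at the reporting rate can be tried the same way.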
Conclusions
Based on the official statistics and my analysis, the overall covid-19 infection rate of NZ is definitely higher than $30\%$, and it could be as high as $47\%$, if not higher.
References
[1] Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A. and Rubin, D. B. (2014). Bayesian Data Analysis, 3rd ed. Chapman and Hall/CRC.
[2] Gao, X. and Dong, Q. (2020). A primer on Bayesian estimation of prevalence of COVID-19 patient outcomes. JAMIA Open, 3(4), pp. 628–631. URL: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7750711/