Mavrik Hard: May 2020

SIRD simulations

Fig.1: Flattening the curve [1]

Carrying on from the last blog post, this post is about SIRD simulations. First, one attempts to do what has already been done before. Second, examine the parameters used by different authors. Finally, do some simulations, to look at the ‘D’ compartment: what values of mortality do SIRD models predict?

Fig.2: replicating Kumar [1]

Table: for N = 1x10⁶

R₀	Infected proportion (Kumar et al)	I_max	T_max (days)
1.862	0.129	135,000	39
1.455	0.055	57,300	65
1.258	0.023	23,200	101
1.2	0.015	15,000	124

Menon et al [2] have a similar plot, showing the effect of a lockdown, but there is much less change in the percentage of infected cases even after a 5-7 month lockdown than that shown by Kumar et al [1]:

Fig.3: from Menon et al [2]

However, the question of comparison arises again. The results of his SIR model are:

Fig.4: SIR simulation by Menon [2]

Numerically, Menon states that the maximum number infected is about 430 at about 4.77 days, and that only about 14 individuals out of a total of N = 1000 were uninfected at the end of 14 days. The parameters used were a = 2.18, b = 0.5 with only one infected individual at the start i.e. I(0) = 1 and S(0) = 999.

The results I get are a bit problematic because Euler’s method assumes differentiability and that’s a problem with such a small number of individuals. The closest I got to Menon’s results (by tweaking the interval) was I_max = 420 infected at 5.5 days and the uninfected number of individuals was about 10. Not very satisfactory. Note: Menon mentions using difference equations.

The general shape of the curves were similar (a small consolation):

Fig.5: replicating Menon’s SIR simulation [2]

Apart from these results from the SIR model, what is of more interest is the SIRD model [3]. Adams considers the 1918 flu, and assumes N = 1x10⁶, a = 1.9 per week, b = 1.4 per week and c = 0.065 per week. The epidemic he plots over a period of 45 weeks.

The results obtained by me are within 2% of those of Adams:

	Adams	Mine
No. of deaths	19,200	18,755
I_max	29,800	29,152
R_max	435,000	403,970
T_max (weeks)	27.6	27.6

The shape of the plots is very close as you would expect from the table, so I have not used Adam’s plots. First the I-D plot vs time (in weeks):

Fig.6: from Adams [3]

And the S-R plot (with time in weeks):

Fig7: from Adams [3]

The question is what values of a, b & c should be used? First look at the table below:

	a (d^-1)	b (d^-1)	c (d^-1)	R₀
Adams (SIRD)	0.271	0.20	0.0093	1.295
Anastassopoulou (SIRD)	0.191 0.319	0.064 (~1/15) 0.16	0.01 0.0005	2.581 2
Caccavo (modified SIRD)	0.553 0.256	0.076 0.017	0.009 0.024	6.505
Ferguson (agent-based & SEIR)	-	-	-	2, 2.2, 2.4, 2.6
Karin (SEIR)	0.95; 0.5 & 0.7	0.25; 0.163	-	3.8; 2.0; 2.8
Van Wees (SEIR)	1.75	0.5		3.5
Walker (SEIR)	0.478 (1/(2.09)) ??	0.2 (1/5)	0.0476 (1/21)	3.0
Volz (SEIR)		(1/6.5)		2.9
Bertozzi (SIR)	0.26 – 0.41	0.10, 0.12	-	2.1-4.6
Guttal (10 comp. model)	0.42	0.1428	0.0136	~2.7 1.8 with social distancing
Lin (SEIR + 2c)	0.5944	0.2	-	2.8
Alvarez (SIR)	0.2	0.0555 (1/18)		3.6
Chinazzi	(1/6 – 1/11)			2.57
Li	1/7.5			2.2

Many of these are SIR models, or SEIR, or agent-based. In some cases, more than one set of values has been given. Note that the values given by Adams are for the 1918 flu, whereas all the rest are for covid-19.

The recovery period has been taken by most as the mean incubation of 5.2 days, but the value of b has been calculated in many cases by assuming a total hospitalization period of about 12.4 days. The reason for this difference is that the incubation period is to be used in the SEIR models for the E Þ I, transition, while in the SIR or SIRD model, the S Þ I transition should take the average hospitalization period into account. Thus, the value of b should be ~ 0.08. This point is also made by Atkeson [14], except that he uses the value of 18 days instead of 12.4 days (citing a paper from China, by Wang [15]).

Note, however, Lin et al [16] have used b = 0.2, citing another paper by Nishiura [17] that states that the serial interval is 5 days (i.e. the time between generations). Chinazzi [18] quotes a range of 6-11 days for the generation time, and a R₀of 2.57, corresponding to a doubling time T_dof 4.2 days.

Whereas, Li [19] estimated 7.5 days for the serial interval, with R₀= 2.2 and a mean incubation period of 5.2 days.

The value of ‘a’ should be in the range 0.2-0.55. Alvarez [20] uses a value of a = 0.2.

According to Du et al [20], the value of R₀ is estimated for an epidemic in its initial stages from its growth rate r, the mean generation time m and the standard deviation s of its generation time, assuming normally distributed generation times:

R₀ = exp(mr – {s²r²/2})

Caccavo [5] has quoted two sets of values, the first set for China, and the second for Italy. However, he introduces a modification that the parameter c is time-dependent, decreasing exponentially with a time constant of t_c = 21.6 days:

c = c₀ exp(-t/t_c)

Actually, to fit the Chinese data he also added similar time constants for the parameters a and b. He claims that his plots fit the data correctly, and he gets the number of deaths in Italy right.

Amidst all this uncertainty, Park et al [22] attempted to reconcile differing estimates of the basic reproductive number R₀ by using a ‘multi-level Bayesian framework’. They argue that R₀ depends on 3 parameters: the exponential growth rate r (in the initial phases), the mean generation interval G_m and the dispersion k in the generation interval. After their statistical analysis of six independent studies, they concluded with a median value for the basic reproductive number of R₀ = 2.9, with a range of 2.1 to 4.5 with 95% confidence.

However, when all is said and done, Biswas [23] argues that we should ‘keep it retrospective’. He accepts that policy-makers need guidance based on evidence, but is skeptical about the possibility of prediction, considering the varying models and the uncertainties in parameters. This view is shared by Adam [24]: “much information about how SARS-COV-2 spreads is still unknown and must be estimated or assumed and that limits the precision of forecasts.” Adam quotes a paper by the group of John Edmunds [25] that assessed the performance of models in a 2014-15 Ebola outbreak in Sierra Leone, and found that it was possible to reliably predict the course of the epidemic up to 2 weeks, but no more than that. In addition, the difficulty of determining the value of R₀ has to do with the fact that many of the early values are small numbers (to determine the growth rate r) and are thus subject to statistical fluctuations. These can be tamed to some extent by using moving averages over many days, but it is more of a problem for deaths than for infected cases [26], since the numbers are smaller.

What of the mortality rate c? It may be as low as 0.0005, or as high as 0.05, but a mean value of 0.01 may be taken. With all this uncertainty, this range of parameters seems par for the course. Mortality estimates for covid-19 have varied between 0.6% and 1.4% of those who get infected [27]. Maybe one can get some number within this range by plugging in values of a, b and c in the SIRD model.

Let us first fix b as 1/7.5 = 0.133 based on the generation time given by Li [19]. Chinazzi [18] quotes a range of 6-11 days. Anastassopoulou et al [4] give two values, but one value is b = 0.16. Let us use the two values of c given by [4] 0.01 and 0.0005. The value of a can be taken as a variable corresponding to the range of R₀ of 2.0 to 3.5 which covers most of the values in the above table.

First, assume b = 0.064 and c = 0.0005 (all for N = 1x10⁶):

R₀	S_¥	R_¥	D_¥	I_max
2	199,552	794,252	6,205	155,714
2.5	103,320	889,836	6,952	238,070
3	55,563	937,125	7,321	307,460
3.5	30,558	961,935	7,515	366,393
5	5,259	987,039	7,711	490,750
10	8	992,249	7,752	677,644

Next, assume b = 0.064 and c = 0.0025:

R₀	S_¥	R_¥	D_¥	I_max
2	199,438	770,475	30,097	155,839
2.5	103,090	863,200	33,710	237,990
3	55,438	909,061	35,510	307,469
3.5	30,449	933,110	36,449	365,312
5	5,207	957,404	37,398	494,829
10	7	962,408	37,594	704,384

And then, assume b = 0.064 and c = 0.01:

R₀	S_¥	R_¥	D_¥	I_max
2	199,009	692,758	108,243	155,921
2.5	102,600	776,138	121,271	238,553
3	54,969	817,331	127,708	308,742
3.5	30,041	838,890	131,077	367,844
5	5,013	860,537	134,459	496,402
10	5	864,868	135,135	677,479

A look at the above tables shows that c does not affect S_¥ or I_max much, but the effect is stronger on R_¥ and greatest on D_¥. Increasing R₀ decreases S_¥ and increases R_¥ as expected, while also increasing D_¥ and increasing I_max.

A further examination of at the above tables would suggest that, as R₀ increases (beyond 3.5), R_¥and D_¥ saturate. That is, beyond a point, increasing transmissibility does not result in increasing fatalities. However, I_max continues to increase, but does not show consistent variation.

The standard plot of S, I and R vs time (in days) is:

Fig.8:

While the plot that shows the dead (a smaller number) along with those infected is:

Fig.9:

First, assume b = 0.133 and c = 0.0005 (all for N = 1x10⁶):

R₀	S_¥	R_¥	D_¥	I_max
2	195,547	801,449	3,013	158,112
2.5	98,633	898,000	3,376	241,569
3	51,176	945,280	3,554	315,024
3.5	26,759	969,606	3,645	369,667
5	3,537	992,740	3,732	498,996
10	0	996,264	3,745	680.429

Next, assume b = 0.133 and c = 0.0025:

R₀	S_¥	R_¥	D_¥	I_max
2	195,430	789,735	14,845	158,350
2.5	98,497	884,880	16,633	241,201
3	51,046	931,455	17,508	313,489
3.5	26,647	955,404	17,959	375,871
5	3,490	978,134	18,386	510,052
10	0	981,560	18,450	710,875

And then, assume b = 0.133 and c = 0.01:

R₀	S_¥	R_¥	D_¥	I_max
2	194,986	748,728	56,295	158,071
2.5	97,987	838,944	63,078	241,818
3	50,558	883,056	66,395	316,274
3.5	26,227	905,686	68,097	376,332
5	3,313	926,997	69,699	481,408
10	0	930,079	69,930	619,992

The SIR model prediction is from Bastin [28]:

R₀ = - [ln(1-x)]/x

where x = 1 – (S_¥/N).

You would expect the SIRD model to deviate from the SIR model prediction and it does:

R₀	X	1 – (S_¥/N)	1 – (S_¥/N)	1 – (S_¥/N)	1 – (S_¥/N)
2	0.800	0.800	0.804	0.800	0.805
2.5	0.890	0.897	0.901	0.8974	0.902
3	0.9403	0.9444	0.9488	0.9450	0.9494
3.5	0.9659	0.9694	0.9732	0.9700	0.9738
5	0.9930	0.9947	.9965	0.9950	0.9967
		b = 0.064	b = 0.133	b = 0.064	b = 0.133
		c = 0.0005		c = 0.01

But the deviation is not all that great, except that it goes up as R₀ increases, and also as b and c increase, with the largest deviation being for the highest value of all three parameters.

The SIR formula for I_max is [28]:

I_max = N[ 1- {1+ ln(R₀)}/R₀]

For N = 1x10⁶:

R₀	I_max
2	153,426
2.5	233,484
3	300,462
3.5	356,353
5	478,112
10	669,741

From R_¥ and D_¥ in the above tables we can estimate the percentage of deaths among those who were infected:

C	b = 0.064	b = 0.133
0.0005	0.78%	0.37%
0.0025	3.76%	1.84%
0.01	13.51%	6.99%

This percentage does not seem to depend much on the value of R₀ (in the range 2-5), but it clearly depends on the value of b and c. The value of b = 0.064 predicts a high percentage, since recovery takes a long time, 15.6 days (i.e. it is less effective as a mechanism). The column with b = 0.133 looks somewhat better, but clearly a lot depends on the assumed value of the parameter c. If c is low, mortality is an ineffective mechanism. However, is a value of 0.0025 (corresponding to 400 days) reasonable? With this set of parameters, it is the only way to get about 1.8%. And the lower end value of 0.0005 (2,000 days) looks even more questionable.

However, let us do a reality check. The population of Italy is about 60.4 million. According to [29], the toll in Italy is 228,006 cases and 32,406 deaths on 23^rd May 2020. The paper of Anastassopoulou et al [4] considers two sets of parameters: (i) a = 0.191, b = 0.064 and c = 0.01, and (ii) a = 0.319, b = 0.16 and c = 0.0005. Case (i) corresponds to R₀ = 2.58, while case (ii) is for R₀ = 1.98. We prefer case (ii), since the recovery period of 6.25 days seems more reasonable than the recovery time of 15.6 days. Even for case (ii), the number of deaths (per million) is 2,500. Multiply by 60, and this gives 150,000 deaths – which is about 5X the actual value to date. Note, however, that Anastassopoulou et al [4] have fitted their data to Wuhan in China, not Italy.

Anastassopoulou et al [4] also consider values of a, b and c taking into account the confidence intervals in their determination. According to them, a is estimated to a high level of accuracy (0.319 ±0.001 and 0.191 ±0.001) but they take a ±10% variation in the parameters b and c. This would be fine, except for the fact that the two sets of parameters (i) and (ii) differ to such a great extent. The authors explain that the second data set has been calculated assuming that the data from China underestimated the number of infected cases by 20X and the number of recovered cases by 40X, but was accurate in estimating the number of deaths.

Of course, there are problems with these arguments: assuming the same value of a, b and c throughout does not take into account the efforts of lockdown and social distancing. A more sophisticated analysis would be needed for that, with time-varying parameters, as was done by Caccavo [5]. Further, as pointed out earlier, this analysis disregards spatial inhomogeneity: North Italy was much more seriously affected than South Italy. It also does not take into account the higher mortality of old people and the age-structure of the population. So it is possible that the second set of values chosen [4] is actually reasonable – if you also factor in time- and space-variations. However, models can get very complicated; INDSCI-SIM of Guttal et al [30] has a 9 compartment model, while Menon et al [2] have a 10-compartment model. And complicated models are ‘data-hungry’. Even with the internet, getting hold of all the data … easier said than done.

Fig.10: from Menon [2]

Worryingly, Menon et al [2] have a plot showing that 10% of the population in India will be dead within a year – despite the lockdowns. This percentage is significantly larger than even the highest value in the table above (13.5%). They have a number of policy prescriptions for eradicating the virus (apart from a vaccine), but they do not discuss the effect of these prescriptions on mortality. And following the complexities of a 10-compartment model is beyond me…

However, bottom-line is that if one uses Caccavo’s parameters for Italy [5] one winds up with 3.2% of the population dead. While this is still high, it is a lot better than Menon’s value of 10%! However, Caccavo’s value of b (0.017; 58 days) [5] does not seem to be particularly reasonable. If one were to use b = 0.133, as discussed above, (while keeping a, c₀ and t_c the same, 0.256, 0.024 & 21.6 respectively) the fraction of people that drop dead, drops down to 0.17%! Which would be an outcome that would be significantly better.

However, as Biswas {23] says, better ‘keep it retrospective’, a view that would be endorsed by Adam [24] and Edmunds [25]: post-diction is easier than prediction.

References:

1 1. Kaushalendra Kumar, Wahengbam Bigyananda Meitei and Abhishek Singh “Projecting the future trajectory of COVID-19 infections in India using the susceptible-infected-recovered (SIR) model” IIPS Analytical Series on Covid 19: Paper 7

2 2. Ashish Menon, Nithin K Rajendran, Anish Chandrachud, Girish Setlur “Modelling and simulation of COVID-19 propagation in a large population with specific reference to India,” medRxiv preprint doi: https://doi.org/10.1101/2020.04.30.20086306; this version posted May 5, 2020

3 3. Peter Adams,” Theory and Practice in Science” https://people.smp.uq.edu.au/PeterAdams/SCIE1000/scie1000_notes_part2.pdf

4 4. C.Anastassopoulou, L.Russo, A.Tsakris and C.Siettos 31^st March 2020 PlosOne https://doi.org/10.1371/journal.pone.0230405

5 5. D.Caccavo medRxiv 17Apr2020 https://doi.org/10.1101/2020.03.19.20039388

6 6. N.M.Ferguson et al Nature Vol 442|27 July 2006|doi:10.1038/nature04795

7 7. O.Karin et al, “Adaptive cyclic exit strategies from lockdown to suppress

COVID-19 and allow economic activity,”medRxiv doi: https://doi.org/10.1101/2020.04.04.20053579

8 8. J.-D.Van Wees et al, “Forecasting hospitalization and ICU rates of the COVID-19 outbreak: an efficient SEIR model,” Bull World Health Organ. E-pub: 30 March 2020. doi: http://dx.doi.org/10.2471/BLT.20.256743

9 9. P.G.T.Walker et al, “Report 12: The Global Impact of COVID-19 and Strategies for Mitigation and Suppression ,”26^th Mar.2020 DOI: https://doi.org/10.25561/77735

1 10. E.Volz et al, “Report 5: Phylogenetic analysis of SARS-CoV-2 ,” 15^th Feb.2020 DOI: https://doi.org/10.25561/77169

1 11. A.L.Bertozzi et al, “The challenges of modeling and forecasting the spread of COVID-19,” arXiv 9Apr 2020 2004.04741v1

1 12. V.Guttal et al “INDSCI-SIM: a state-level epidemiological model for India,” 18^th Apr.2020

113. A.Biswas, “A prediction model for covid-19”, 17^th March 2020 https://www.thehindu.com/opinion/op-ed/a-prediction-model-for-covid-19/article31092695.ece

1 14. A.Atkeson “What will be the economic impact of Covid-19 in the U.S.? Rough estimates of disease scenarios,“NBER Working Paper 26867 March 2020

1 15. H.Wang et al, “Phase-adjusted estimation of number of coronavirus disease 2019 cases in Wuhan, China,” Cell Discovery 6 (2020) 10; https://doi.org/10.1038/s41421-020-0148-0

1 16. Q.Lin et al , “ A conceptual model for the outbreak of coronavirus disease (Covid-19) in Wuhan, China, with individual reaction and governmental action”, Int.J.Inf.Dis. 93 (2020) 211

1 17. H.Nishiura et al, “Serial interval of novel coronavirus (covid-19) infections,” Int.J.Inf.Dis. 93 (2020) 284

18. M.Chinazzi et al, “The effect of travel restrictions on the spraed of the 2019 novel coronoavirus (COVID-19) outbreak,” Science 368 (2020) 394 24^th Apr.2020

1 19. Q.Li et al, “Early transmission dynamics in Wuham China, of novel corononavirus-infected pneumonia,” New England J.Med. 382 (2020) 26^th Mar 2020 1199 DOI: 10.1056/NEJMoa2001316

2 20. F.E.Alvarez, “A simple planning problem for covid-19 lockdown,” NBER Working Paper 26981 April 2020

2 21. Z.Du et al ,” The serial interval of covid-19 from publicly confirmed cases,” medRxiv 20^th Mar 2020 https://doi.org/10.1101/2020.02.19.20025452

2 22. S.W.Park et al ,”Reconciling early outbreak estimates of basic reproductive number and its uncertainty,” medRxiv 28^th Feb.2020 doi: https://doi.org/10.1101/2020.01.30.20019877

2 23. A.Biswas, “Keep it retrospective,” Hindu, 21^st May 2020.

2 24. D.Adam, ”Modelling the pandemic,” Nature 580 (16^th April 2020) 316

2 25. S.Funk et al ,” Assessing the performance of real-time epidemic forecasts,” PLOS Computational Biol. https://doi.org/10.1371/journal.pcbi.1006785

2 26. S.Ramesh & M.Basu, https://theprint.in/science/r0-data-shows-indias-coronavirus-infection-rate-has-slowed-gives-lockdown-a-thumbs-up/399734/

27. Prof.Devi Sridhar https://www.theguardian.com/commentisfree/2020/apr/08/how-will-the-coronavirus-crisis-end-lockdown-pandemic

2 28. S.Bastin, “Lectures on mathematical modelling of biological systems,” 22^nd Aug.2018, GBIO 2060

https://perso.uclouvain.be/georges.bastin/lectures-bio.pd

2 29. https://www.worldometers.info/coronavirus/country/italy/

3 30. V.Guttal et al a) medRxiv 1^st May 2020 doi: https://doi.org/10.1101/2020.04.26.20080648

and b) INDSCI-SIM: a State-level epidemiological model for India 18^th April 2020 www.indscicov.in

Mavrik Hard

Tuesday, May 26, 2020

SIRD Simulations

1 1. Kaushalendra Kumar, Wahengbam Bigyananda Meitei and Abhishek Singh “Projecting the future trajectory of COVID-19 infections in India using the susceptible-infected-recovered (SIR) model” IIPS Analytical Series on Covid 19: Paper 7

Blog Archive

About Me