Tuesday, May 26, 2020

SIRD Simulations


SIRD simulations

Fig.1: Flattening the curve [1]



Carrying on from the last blog post, this post is about SIRD simulations. First, one attempts to do what has already been done before. Second, examine the parameters used by different authors. Finally, do some simulations, to look at the ‘D’ compartment: what values of mortality do SIRD models predict?


Fig.2: replicating Kumar [1]


Table: for N = 1x106

R0
Infected proportion
(Kumar et al)
Imax
Tmax (days)
1.862
0.129
135,000
39
1.455
0.055
57,300
65
1.258
0.023
23,200
101
1.2
0.015
15,000
124

Menon et al [2] have a similar plot, showing the effect of a lockdown, but there is much less change in the percentage of infected cases even after a 5-7 month lockdown than that shown by Kumar et al [1]:

Fig.3: from Menon et al [2]

However, the question of comparison arises again. The results of his SIR model are:

Fig.4: SIR simulation by Menon [2]

Numerically, Menon states that the maximum number infected is about 430 at about 4.77 days, and that only about 14 individuals out of a total of N = 1000 were uninfected at the end of 14 days. The parameters used were a = 2.18, b = 0.5 with only one infected individual at the start i.e. I(0) = 1 and S(0) = 999.

The results I get are a bit problematic because Euler’s method assumes differentiability and that’s a problem with such a small number of individuals. The closest I got to Menon’s results (by tweaking the interval) was Imax = 420 infected at 5.5 days and the uninfected number of individuals was about 10. Not very satisfactory. Note: Menon mentions using difference equations.

The general shape of the curves were similar (a small consolation):

Fig.5: replicating Menon’s SIR simulation [2]

Apart from these results from the SIR model, what is of more interest is the SIRD model [3]. Adams considers the 1918 flu, and assumes N = 1x106, a = 1.9 per week, b = 1.4 per week and c = 0.065 per week. The epidemic he plots over a period of 45 weeks.

The results obtained by me are within 2% of those of Adams:


Adams
Mine
No. of deaths
19,200
18,755
Imax
29,800
29,152
Rmax
435,000
403,970
Tmax (weeks)
27.6
27.6

The shape of the plots is very close as you would expect from the table, so I have not used Adam’s plots. First the I-D plot vs time (in weeks):

Fig.6: from Adams [3]

And the S-R plot (with time in weeks):

Fig7: from Adams [3]

The question is what values of a, b & c should be used? First look at the table below:


a (d-1)
b (d-1)
c (d-1)
R0
Adams (SIRD)
0.271
0.20
0.0093
1.295
Anastassopoulou
(SIRD)
0.191
0.319
0.064 (~1/15)
0.16
0.01
0.0005
2.581
2
Caccavo (modified SIRD)
0.553
0.256
0.076
0.017
0.009
0.024
6.505
Ferguson
(agent-based & SEIR)
-
-
-
2, 2.2, 2.4, 2.6
Karin (SEIR)
0.95; 0.5 & 0.7
0.25; 0.163
-
3.8; 2.0;  2.8
Van Wees (SEIR)
1.75
0.5

3.5
Walker (SEIR)
0.478 (1/(2.09)) ??
0.2 (1/5)
0.0476 (1/21)
3.0
Volz (SEIR)

(1/6.5)

2.9
Bertozzi (SIR)
0.26 – 0.41
0.10, 0.12
-
2.1-4.6
Guttal (10 comp. model)
0.42
0.1428
0.0136
~2.7
1.8 with social distancing
Lin (SEIR + 2c)
0.5944
0.2
-
2.8
Alvarez (SIR)
0.2
0.0555 (1/18)

3.6
Chinazzi
(1/6 – 1/11)


2.57
Li
1/7.5


2.2

Many of these are SIR models, or SEIR, or agent-based. In some cases, more than one set of values has been given. Note that the values given by Adams are for the 1918 flu, whereas all the rest are for covid-19.

The recovery period has been taken by most as the mean incubation of 5.2 days, but the value of b has been calculated in many cases by assuming a total hospitalization period of about 12.4 days. The reason for this difference is that the incubation period is to be used in the SEIR models for the E Þ I, transition, while in the SIR or SIRD model, the S Þ I transition should take the average hospitalization period into account. Thus, the value of b should be ~ 0.08. This point is also made by Atkeson [14], except that he uses the value of 18 days instead of 12.4 days (citing a paper from China, by Wang [15]).

Note, however, Lin et al [16] have used b = 0.2, citing another paper by Nishiura [17] that states that the serial interval is 5 days (i.e. the time between generations). Chinazzi [18] quotes a range of 6-11 days for the generation time, and a R0 of 2.57, corresponding to a doubling time Td of 4.2 days. 

Whereas, Li [19] estimated 7.5 days for the serial interval, with R0 = 2.2 and a mean incubation period of 5.2 days.

The value of ‘a’ should be in the range 0.2-0.55. Alvarez [20] uses a value of a = 0.2.

According to Du et al [20], the value of R0 is estimated for an epidemic in its initial stages from its growth rate r, the mean generation time m and the standard deviation s of its generation time, assuming normally distributed generation times:

R0 = exp(mr – {s2r2/2})

Caccavo [5] has quoted two sets of values, the first set for China, and the second for Italy. However, he introduces a modification that the parameter c is time-dependent, decreasing exponentially with a time constant of tc  = 21.6 days:

c = c0 exp(-t/tc)

Actually, to fit the Chinese data he also added similar time constants for the parameters a and b. He claims that his plots fit the data correctly, and he gets the number of deaths in Italy right.

Amidst all this uncertainty, Park et al [22] attempted to reconcile differing estimates of the basic reproductive number R0 by using a ‘multi-level Bayesian framework’. They argue that R0 depends on 3 parameters: the exponential growth rate r (in the initial phases), the mean generation interval Gm and the dispersion k in the generation interval. After their statistical analysis of six independent studies, they concluded with a median value for the basic reproductive number of R0 = 2.9, with a range of 2.1 to 4.5 with 95% confidence.

However, when all is said and done, Biswas [23] argues that we should ‘keep it retrospective’. He accepts that policy-makers need guidance based on evidence, but is skeptical about the possibility of prediction, considering the varying models and the uncertainties in parameters. This view is shared by Adam [24]: “much information about how SARS-COV-2 spreads is still unknown and must be estimated or assumed and that limits the precision of forecasts.” Adam quotes a paper by the group of John Edmunds [25] that assessed the performance of models in a 2014-15 Ebola outbreak in Sierra Leone, and found that it was possible to reliably predict the course of the epidemic up to 2 weeks, but no more than that. In addition, the difficulty of determining the value of R0 has to do with the fact that many of the early values are small numbers (to determine the growth rate r) and are thus subject to statistical fluctuations. These can be tamed to some extent by using moving averages over many days, but it is more of a problem for deaths than for infected cases [26], since the numbers are smaller.

What of the mortality rate c? It may be as low as 0.0005, or as high as 0.05, but a mean value of 0.01 may be taken. With all this uncertainty, this range of parameters seems par for the course. Mortality estimates for covid-19 have varied between 0.6% and 1.4% of those who get infected [27]. Maybe one can get some number within this range by plugging in values of a, b and c in the SIRD model.

Let us first fix b as 1/7.5 = 0.133 based on the generation time given by Li [19]. Chinazzi [18] quotes a range of 6-11 days. Anastassopoulou et al [4] give two values, but one value is b = 0.16. Let us use the two values of c given by [4] 0.01 and 0.0005. The value of a can be taken as a variable corresponding to the range of R0 of 2.0 to 3.5 which covers most of the values in the above table.

First, assume b = 0.064 and c = 0.0005 (all for N = 1x106):

R0
S¥
R¥
D¥
Imax
2
199,552
794,252
6,205
155,714
2.5
103,320
889,836
6,952
238,070
3
55,563
937,125
7,321
307,460
3.5
30,558
961,935
7,515
366,393
5
5,259
987,039
7,711
490,750
10
8
992,249
7,752
677,644

Next, assume b = 0.064 and c = 0.0025:

R0
S¥
R¥
D¥
Imax
2
199,438
770,475
30,097
155,839
2.5
103,090
863,200
33,710
237,990
3
55,438
909,061
35,510
307,469
3.5
30,449
933,110
36,449
365,312
5
5,207
957,404
37,398
494,829
10
7
962,408
37,594
704,384

And then, assume b = 0.064 and c = 0.01:

R0
S¥
R¥
D¥
Imax
2
199,009
692,758
108,243
155,921
2.5
102,600
776,138
121,271
238,553
3
54,969
817,331
127,708
308,742
3.5
30,041
838,890
131,077
367,844
5
5,013
860,537
134,459
496,402
10
5
864,868
135,135
677,479

A look at the above tables shows that c does not affect S¥ or Imax much, but the effect is stronger on R¥ and greatest on D¥. Increasing R0 decreases S¥ and increases R¥ as expected, while also increasing D¥ and increasing Imax.

A further examination of at the above tables would suggest that, as R0 increases (beyond 3.5), R¥ and D¥ saturate. That is, beyond a point, increasing transmissibility does not result in increasing fatalities. However, Imax continues to increase, but does not show consistent variation.

The standard plot of S, I and R vs time (in days) is:
Fig.8:

While the plot that shows the dead (a smaller number) along with those infected is:

Fig.9:

First, assume b = 0.133 and c = 0.0005 (all for N = 1x106):

R0
S¥
R¥
D¥
Imax
2
195,547
801,449
3,013
158,112
2.5
98,633
898,000
3,376
241,569
3
51,176
945,280
3,554
315,024
3.5
26,759
969,606
3,645
369,667
5
3,537
992,740
3,732
498,996
10
0
996,264
3,745
680.429

Next, assume b = 0.133 and c = 0.0025:

R0
S¥
R¥
D¥
Imax
2
195,430
789,735
14,845
158,350
2.5
98,497
884,880
16,633
241,201
3
51,046
931,455
17,508
313,489
3.5
26,647
955,404
17,959
375,871
5
3,490
978,134
18,386
510,052
10
0
981,560
18,450
710,875

And then, assume b = 0.133 and c = 0.01:

R0
S¥
R¥
D¥
Imax
2
194,986
748,728
56,295
158,071
2.5
97,987
838,944
63,078
241,818
3
50,558
883,056
66,395
316,274
3.5
26,227
905,686
68,097
376,332
5
3,313
926,997
69,699
481,408
10
0
930,079
69,930
619,992

The SIR model prediction is from Bastin [28]:
R0 = - [ln(1-x)]/x
where x = 1 – (S¥/N).
You would expect the SIRD model to deviate from the SIR model prediction and it does:

R0
X
 1 – (S¥/N)
1 – (S¥/N)
1 – (S¥/N)
1 – (S¥/N)
2
0.800
0.800
0.804
0.800
0.805
2.5
0.890
0.897
0.901
0.8974
0.902
3
0.9403
0.9444
0.9488
0.9450
0.9494
3.5
0.9659
0.9694
0.9732
0.9700
0.9738
5
0.9930
0.9947
.9965
0.9950
0.9967


b = 0.064
 b = 0.133
b = 0.064
 b = 0.133


c = 0.0005
 c  = 0.01

But the deviation is not all that great, except that it goes up as R0 increases, and also as b and c increase, with the largest deviation being for the highest value of all three parameters.

The SIR formula for Imax is [28]:

Imax = N[ 1- {1+ ln(R0)}/R0]

For N = 1x106:

R0
 Imax
2
153,426
2.5
233,484
3
300,462
3.5
356,353
5
478,112
10
669,741

From R¥ and D¥ in the above tables we can estimate the percentage of deaths among those who were infected:

C
b = 0.064   
b = 0.133  
0.0005
0.78%
0.37%
0.0025
3.76%
1.84%
0.01
13.51%
6.99%

This percentage does not seem to depend much on the value of R0 (in the range 2-5), but it clearly depends on the value of b and c.  The value of b = 0.064 predicts a high percentage, since recovery takes a long time, 15.6 days (i.e. it is less effective as a mechanism). The column with b = 0.133 looks somewhat better, but clearly a lot depends on the assumed value of the parameter c. If c is low, mortality is an ineffective mechanism. However, is a value of 0.0025 (corresponding to 400 days) reasonable? With this set of parameters, it is the only way to get about 1.8%. And the lower end value of 0.0005 (2,000 days) looks even more questionable.

However, let us do a reality check. The population of Italy is about 60.4 million. According to [29], the toll in Italy is 228,006 cases and 32,406 deaths on 23rd May 2020. The paper of Anastassopoulou et al [4] considers two sets of parameters: (i) a = 0.191, b = 0.064 and c = 0.01, and (ii) a = 0.319, b = 0.16 and c = 0.0005. Case (i) corresponds to R0 = 2.58, while case (ii) is for R0 = 1.98. We prefer case (ii), since the recovery period of 6.25 days seems more reasonable than the recovery time of 15.6 days. Even for case (ii), the number of deaths (per million) is 2,500. Multiply by 60, and this gives 150,000 deaths – which is about 5X the actual value to date. Note, however, that Anastassopoulou et al [4] have fitted their data to Wuhan in China, not Italy.  

Anastassopoulou et al [4] also consider values of a, b and c taking into account the confidence intervals in their determination. According to them, a is estimated to a high level of accuracy (0.319 ±0.001 and 0.191 ±0.001) but they take a ±10% variation in the parameters b and c. This would be fine, except for the fact that the two sets of parameters (i) and (ii) differ to such a great extent. The authors explain that the second data set has been calculated assuming that the data from China underestimated the number of infected cases by 20X and the number of recovered cases by 40X, but was accurate in estimating the number of deaths.

Of course, there are problems with these arguments: assuming the same value of a, b and c throughout does not take into account the efforts of lockdown and social distancing. A more sophisticated analysis would be needed for that, with time-varying parameters, as was done by Caccavo [5]. Further, as pointed out earlier, this analysis disregards spatial inhomogeneity: North Italy was much more seriously affected than South Italy. It also does not take into account the higher mortality of old people and the age-structure of the population.  So it is possible that the second set of values chosen [4] is actually reasonable – if you also factor in time- and space-variations. However, models can get very complicated; INDSCI-SIM of Guttal et al [30] has a 9 compartment model, while Menon et al [2] have a 10-compartment model. And complicated models are ‘data-hungry’. Even with the internet, getting hold of all the data … easier said than done.

 Fig.10: from Menon [2]

Worryingly, Menon et al [2] have a plot showing that 10% of the population in India will be dead within a year – despite the lockdowns. This percentage is significantly larger than even the highest value in the table above (13.5%). They have a number of policy prescriptions for eradicating the virus (apart from a vaccine), but they do not discuss the effect of these prescriptions on mortality. And following the complexities of a 10-compartment model is beyond me…

However, bottom-line is that if one uses Caccavo’s parameters for Italy [5] one winds up with 3.2% of the population dead. While this is still high, it is a lot better than Menon’s value of 10%! However, Caccavo’s value of b (0.017; 58 days) [5] does not seem to be particularly reasonable. If one were to use b = 0.133, as discussed above, (while keeping a, c0 and tc the same, 0.256, 0.024 & 21.6 respectively) the fraction of people that drop dead, drops down to 0.17%! Which would be an outcome that would be significantly better.

However, as Biswas {23] says, better ‘keep it retrospective’, a view that would be endorsed by Adam [24] and Edmunds [25]: post-diction is easier than prediction.

References:

 1.     Kaushalendra Kumar, Wahengbam Bigyananda Meitei and Abhishek Singh “Projecting the future trajectory of COVID-19 infections in India using the susceptible-infected-recovered (SIR) model” IIPS Analytical Series on Covid 19: Paper 7

 2.      Ashish Menon, Nithin K Rajendran, Anish Chandrachud, Girish Setlur “Modelling and simulation of COVID-19 propagation in a large population with specific reference to India,” medRxiv preprint doi: https://doi.org/10.1101/2020.04.30.20086306; this version posted May 5, 2020
3  3.     Peter Adams,” Theory and Practice in Science”  https://people.smp.uq.edu.au/PeterAdams/SCIE1000/scie1000_notes_part2.pdf
4 4.   C.Anastassopoulou, L.Russo, A.Tsakris and C.Siettos 31st March 2020 PlosOne  https://doi.org/10.1371/journal.pone.0230405
5 5.   D.Caccavo medRxiv 17Apr2020  https://doi.org/10.1101/2020.03.19.20039388
6 6.   N.M.Ferguson et al  Nature Vol 442|27 July 2006|doi:10.1038/nature04795
7 7.   O.Karin et al, “Adaptive cyclic exit strategies from lockdown to suppress
COVID-19 and allow economic activity,”medRxiv doi: https://doi.org/10.1101/2020.04.04.20053579
8 8.    J.-D.Van Wees et al, “Forecasting hospitalization and ICU rates of the COVID-19 outbreak: an efficient SEIR model,” Bull World Health Organ. E-pub: 30 March 2020. doi: http://dx.doi.org/10.2471/BLT.20.256743
9 9.   P.G.T.Walker et al, “Report 12: The Global Impact of COVID-19 and Strategies for Mitigation and Suppression  ,”26th Mar.2020 DOI: https://doi.org/10.25561/77735
1 10.   E.Volz et al, “Report 5: Phylogenetic analysis of SARS-CoV-2  ,”  15th Feb.2020  DOI: https://doi.org/10.25561/77169
1 11.   A.L.Bertozzi et al,  “The challenges of modeling and forecasting the spread of COVID-19,” arXiv 9Apr 2020 2004.04741v1
1 12.   V.Guttal et al  “INDSCI-SIM: a state-level epidemiological model for India,” 18th Apr.2020
 113.   A.Biswas, “A prediction model for covid-19”, 17th March 2020 https://www.thehindu.com/opinion/op-ed/a-prediction-model-for-covid-19/article31092695.ece
1  14.   A.Atkeson “What will be the economic impact of Covid-19 in the U.S.? Rough estimates  of disease scenarios,“NBER Working Paper 26867 March 2020    
1  15.   H.Wang et al, “Phase-adjusted estimation of number of coronavirus disease 2019 cases in Wuhan, China,”  Cell Discovery 6 (2020) 10;  https://doi.org/10.1038/s41421-020-0148-0
1  16.   Q.Lin et al , “ A conceptual model for the outbreak of coronavirus disease (Covid-19) in Wuhan, China, with individual reaction and governmental action”, Int.J.Inf.Dis. 93 (2020) 211
1  17.   H.Nishiura et al, “Serial interval of novel coronavirus (covid-19) infections,” Int.J.Inf.Dis. 93 (2020) 284
    18.   M.Chinazzi et al, “The effect of travel restrictions on the spraed of the 2019 novel coronoavirus (COVID-19) outbreak,” Science 368 (2020) 394 24th Apr.2020
1  19.   Q.Li et al, “Early transmission dynamics in Wuham China, of novel corononavirus-infected pneumonia,” New England J.Med. 382 (2020) 26th Mar 2020 1199 DOI: 10.1056/NEJMoa2001316
2  20.   F.E.Alvarez, “A simple planning problem for covid-19 lockdown,” NBER Working Paper 26981 April 2020
2  21.   Z.Du et al ,” The serial interval of covid-19 from publicly confirmed cases,” medRxiv 20th Mar 2020 https://doi.org/10.1101/2020.02.19.20025452
2  22.   S.W.Park et al ,”Reconciling early outbreak estimates of basic reproductive number and its uncertainty,” medRxiv 28th Feb.2020 doi: https://doi.org/10.1101/2020.01.30.20019877
2  23.   A.Biswas, “Keep it retrospective,” Hindu, 21st May 2020.
2  24.   D.Adam, ”Modelling the pandemic,”  Nature 580 (16th April 2020) 316
2  25.   S.Funk et al ,” Assessing the performance of real-time epidemic forecasts,” PLOS Computational  Biol. https://doi.org/10.1371/journal.pcbi.1006785

2  28.   S.Bastin, “Lectures on mathematical modelling of biological systems,” 22nd Aug.2018, GBIO 2060
https://perso.uclouvain.be/georges.bastin/lectures-bio.pd
3 30.   V.Guttal et al a) medRxiv 1st May 2020 doi: https://doi.org/10.1101/2020.04.26.20080648
and b) INDSCI-SIM: a State-level epidemiological model for India 18th April 2020 www.indscicov.in