Mavrik Hard: Reactions to R0, the SIR model and herd immunity

Reactions to R₀, the SIR model and herd immunity

Nowadays, anybody with internet access who has read a couple of papers fancies himself/herself to be an epidemiologist. As a physicist, by training, I also want to jump in with my reactions to what little I have read, to highlight the small fraction that (I think) I have understood.

Time Variations: exponential, logistic, SIR and SIRD models:

In the initial phases the growth is exponential. In the later phases, the growth slows down and can be fitted to the logistic curve, as done by Rhett Allain [1,2], because there is only a finite population that the virus can infect, while the exponential would keep going forever. Ranjan [2] points out that the exponential curve predicts the initial build-up, but fails to predict the eventual flattening out. The logistic curve:

I (t) = K/[ 1 + A exp(-rt)]

Where A = (K/I₀) -1, reduces to the exponential for small values of t:

I (t) = I₀ exp(rt).

But the logistic curve, once fitted to the data, fails to fit the initial time variation.

However, most epidemiologists follow the basic SIR model. Proposed by Kermack & McKendrick [3], the model considers three categories of people: those susceptible (S) to infection, those infected (I) and those who have recovered (R). It is cast in terms of 3 coupled differential equations:

dS/dt = - a SI

dI/dt = aSI – b I

dR/dt = bI

where N is the population size, S is the number of susceptible people, I is the number of infected people, R is the number of recovered people, (such that N = S + I +R) , a is the infection rate (in units of 1/day), b is the recovery rate (1/day) and c is the mortality rate (1/day). This is expressed schematically in terms of transitions between compartments:

The basic SIR model was modified to take the probability of death, after getting infected, into account, through the mortality rate c (1 /day), becoming the SIRD model by adding another differential equation and adding one term to the dI/dt equation:

dS/dt = - a SI

dI/dt = aSI – (b +c) I

dR/dt = bI

dD/dt = cI

such that N = S + I +R + D.

This is expressed in compartments as:

Basic Reproduction Number (BRN), R₀:

But what of R₀? According to Aronson et al [4], the parameter R₀ only characterizes the transmissibility, or the average reproduction number, of the virus at the very beginning of the epidemic in a homogeneous population (when there is zero immunity in the population to the virus, so the whole population is susceptible).

The importance of R₀ is that if it is less than unity, the virus cannot spread (a physicist would call it ‘sub-critical’); but if it is more than one, the virus multiples, exponentially (in physicist-speak: avalanche multiplication).

As the number of infected people changes, the transmissibility decreases, and it is given by an effective reproduction rate R_e. This is related to R₀ by the formula:

R_e = R₀ (1 – P_i),

where P_i is the proportion of the population that is now immune to the virus – which keeps increasing with time. But, to back up a bit, R₀ is affected by [3]:

(i) The size of the population N, and the fraction of the population that is susceptible at the beginning P_s(0) (Above Aronson said all are susceptible?)

(ii) The infectiousness of the organism

(iii) The recovery rate b (in the SIR model, recovery includes death in addition to remission).

Both R₀ and R_e increases as N increases, and as P_s(0) increases, and both decrease as the recovery rate b increases. The effective reproduction rate R_e is also affected by the behavior of people e.g. physical distancing will decrease it.

According to Jones [5] and Delamater et al [6], it must be emphasized that R₀, the basic reproduction number (BRN), is dimensionless; it is often misleadingly characterized as a ‘rate’, which could only happen if it had units of (time)^-1:

R₀ = tc_avd

Where d is the duration of infectiousness (in days), c_av is the average rate of contact between susceptible and infected people (number/day) and t is the transmissibility of the infection, or the probability that the contact will result in infection.

Jones [5] also differentiates between R₀ and R_i. The first generation of an epidemic is all the secondary infections that result from infectious contact with the index case, or generation zero. R_i refers to the BRN of the i^th generation, and R₀ refers to the number of infections generated by the index case (Gen zero). This is similar to the parameter R_ementioned above [4]. The time between generations is called the serial interval, or the average time between successive infections (1/b), or the number of infectious days [7]. The problem with accurately estimating R₀ is that, initially, numbers are small and subject to statistical fluctuations. The estimation of R₀ is generally done by modeling and is tricky [5].

Similarly, Fernandez-Villaverde and Jones [8] express the parameter R₀ as a product of two variables in an intuitive way: it is the product of the number of infectious days (1/b) and the number of (lengthy) contacts per day (a) (or 1/a is the time between contacts):

R₀ = (1/b)(a).

Where a = tc_av.

In the SIR model: R₀ = a/b,

while in the SIRD model: R₀ = a/(b + c).

If there is no mortality, it reduces to the SIR formula. The number of infectious days is characteristic of the virus, and not controllable. But people can increase the time between contacts by social distancing.

Ranjan [2] uses the formula:

R₀ = (a/b) [1 – (I₀/N)]

But initially I₀ << N.

Similar to the above formula: R_e = R₀ (1 – P_i) is another formula (they are the same if we neglect the recovered population) used by Caccavo in his SIRD model [9]:

R_e = R₀ (S/N)

Initially, S and N are almost the same, but as the number of susceptible people decreases R_e declines. The million dollar question is: when does it hit one? That is the point in time at which the number of infections peaks, and beyond that the epidemic dies out as R_e decreases further.

Herd Immunity:

The concept of herd immunity [4] was introduced in the context of vaccination [10,11]. If a population is immunized above a threshold H then the remainder of the population is given indirect protection, because the infection/virus encounters too may immunized individuals and is unable to propagate in the population. The herd immunity threshold is related to R₀ by:

H = 1 – (1/R₀)

For the case of covid-19, R₀ is somewhere between 2.5 and 3.5, so the safe value of H is about 71% of the population, and about 29% of the population, though susceptible (never having been infected), will remain protected from infection.

What is interesting is a different equation derived by Bastin [11] and by Jones [5]. This relates R₀ to the maximum fraction x_im of the population that ever gets infected, just before the epidemic collapses to zero [11]:

R₀ = - [ln(1 –x_im)]/x_im

Jones [5] expresses it equivalently in terms of the final susceptible fraction s_¥:

ln (s_¥) = R₀ (1 - s_¥)

Bastin also gives an equation for the maximum value of infected individuals, I_max:

I_max = N [1 – {{1 + ln(R₀)}/R₀}]

Bastin [11] even provides an I-S plot with an epidemic trajectory:

The maximum number of infections occurs at the value of S = b/a (or when R_e = 1).

Compare this with the herd immunity threshold: H = 1 – (b/a), and with the formula for the maximum number of s_¥.

For simplicity, assume R₀ = a/b = 3. The peak I_max occurs when (S/N) = 1/3.

The herd immunity threshold is H = 2/3.

But what is the fraction s_¥? it takes the value 0.06. That is, the fraction of susceptible individuals is not 1/3 as expected from the herd immunity formula but lower.

Here it is necessary to clarify: Bastin derives the formula for x_im in the context of a model that is different from the SIR model mentioned above. Without going into details of his derivation, he introduces another compartment called ‘exposed’, so it is now an SEIR model. A person who is ‘exposed’ has been exposed to the virus, but is not infectious yet. The SEIR model takes into account the processes of births, deaths and vaccinations.

The question then really is: does herd immunity apply when there is no vaccination? Can the infection itself be considered as equivalent to vaccination, since the infected person is now immune to further infection (never mind whether it was due to a live virus or an inactivated virus)?

The problem within the SIR model – which does not consider vaccination – is that the infections continue beyond the herd immunity threshold, until s has fallen to s_¥ (see the Figure), at which point the number of infections drops to zero.

One might try to define the herd immunity threshold as the point at which herd immunity just starts up, without really giving all of the susceptible individuals immunity immediately - just an increasing level of immunity to more and more individuals as s approaches s_¥. This is a consistent way of interpreting the equations – but that is not what many prominent epidemiologists have claimed in public. The advocates of herd immunity vocally state that all we have to do is get 70% of the population infected, and the remaining 30% will be protected. The advocates of this approach, initially in the U.K. and even today in Sweden [4], argue that we need only isolate the elderly and the sick (with co-morbidities), who have higher chances of dying than the young and healthy majority (but see below!) According to Aronson et al [4]:

“The problem with leaving people to catch the infection spontaneously, leading to herd immunity, would increase the death rate. For example, on 10 April, the number of confirmed cases in Sweden was 9685 with 870 deaths (9.0%), compared with Norway with 6219 confirmed cases and 108 deaths (1.7%) and Denmark with 5830 confirmed cases and 237 deaths (4.4%).”

However, Aronson et al [4] also argue:

“For example, if R₀ = 2, immunization needs to be achieved in 50% of the population. However, if R₀ = 5 the proportion rises steeply, to 80%. Beyond that the rise is less steep; an increase in R₀ to 10 (for measles) increases the need for immunization to 90%.

When other children become immune the infected child who encounters 10 children will not be able to infect them all; the number infected will depend on R_e. When immunity is 90% or more the chances that the child will meet enough unimmunized children to pass on the disease falls to near zero, and the population is protected.” (emphasis added by me).

The point being made here is that the number to be infected is higher: 94%, not 70% for R₀ = 3. The lower and upper limits of R₀ are also tabulated:

R₀	(1/R₀)	s_¥	s_¥ (from SIR program)
2.5	0.400	0.11	0.103
3	0.333	0.06	0.056
3.5	0.286	0.035	0.0306
5	0.200	0.007	0.00500
7	0.143	0.0008	0.00047

But look at the last 2 rows: for a high R₀of 5 only a very small fraction (0.7 %) is ‘protected’. And for even higher values, that fraction becomes even smaller (0.05%).

So the question about the SIR vs the SEIR model has real consequences.

The results of the SIR simulation (using Euler’s method [11]) for R₀ = 3 are similar to Bastin’s plot above. The arrow on the S-axis indicates S_¥ = 0.056. The arrow indicating the maximum has its position at S/N = 0.33 = 1/R₀ as expected, and the value of the maximum is I_max = 3.07 x 10⁵, close to the number expected (3.00 x 10⁵) from the equation for I_max given above (by Bastin).

I just found an article by Prof. Gautam Menon [13] which states that there are ‘sound methodological reasons’ to prefer the SEIR model to the SIR model. Among others, the covid-19 has asymptomatic patients (the ‘E’ in SEIR) which SIR does not capture. More importantly, Menon states that the SIR model: “predicts the numbers of people with COVID-19 in India will decline immediately after a lockdown is imposed. In contrast, an SEIR model would have predicted that the case load would continue to increase before beginning to drop”.

There is plenty of criticism of the compartment models. For example, Delamater et al [6] points out that R₀ will fluctuate if human-human or human-vector interactions vary with time and space. These compartment models need to be modified to take into account the fact that populations may be homogeneous or heterogeneous (e.g. different social classes or age groups), that interactions are dynamic – either deterministic or stochastic (with various possible distribution functions) – and pathogen pathways may be complex involving intermediate hosts. In other words, some of the basic simplifying assumptions made in early models have had to be modified to take complex realities into account. Prof.Menon raises these concerns as well [13].

Similar criticisms have been leveled by Huppert and Katriel [14]:

a) the assumption of “well-mixed populations” disregards the facts of geographical and social proximity in enabling or preventing contacts

b) individuals are different, some are more prone to infection, others may be more effective in spreading infection (super-spreaders).

c) the duration of infection is assumed to be exponentially distributed, but an individual need not become infectious immediately after contracting an infection, and recovery from infection may depend upon how much time has passed since getting infected.

d) populations are assumed to be large, so they can be treated deterministically by differential equations; while small populations (a village or a school) may be dominated by stochastic effects.

Menon [13] points out that there are other approaches than the compartment model: a) an agent-based model b) Monte Carlo models c) purely statistical or machine learning models d) network models.

Interestingly, Roda et al [16] claim to have fitted the post-lockdown data in Wuhan to both SIR and SEIR models, and say that the former is preferable, because the latter has too many parameters that need to be estimated, and so the simpler SIR model is better.

Regalado [17] says that for an R₀ of 3, 66% of the population has to be immune before the effect (of herd immunity) ‘kicks in’, according to ‘the simplest model’. Regalado is being careful, because herd immunity, even at 66%, is not a free pass…

As a parting shot, note the following paragraph in a UK newspaper [18]:

“About 60 per cent is the sort of figure you need to get herd immunity.” The coronavirus won’t stop at 60 per cent, though, according to the British scientists. Models promoted by Dr. Vallance and Britain’s chief medical adviser, Chris Whitty, indicate that 80 per cent will ultimately become infected. The trick is to limit that 80 per cent to those who can be safely immunized, since infection for the others could mean death.

This is not the dominant narrative: see the characterization by Aronson et al above. Are epidemiologists just dumbing down the narrative for a public that they believe is incapable of understanding the details? But Chris Whitty did explain the details...

To sum up: getting to 80% is harder than getting to 60%, since there are more deaths along the way. The idea of herd immunity is not as attractive as it seems to be at first sight.

References:

1. Rhett Allain Wired 24^th March 2020 https://www.wired.com/story/the-promising-math-behind-flattening-the-curve/

2. Rajesh Ranjan,”Predictions for covid-19 outbreak in India using epidemiological models,” 3^rd April 2020 preprint: https://www.researchgate.net/publication/340314461

3. W.O.Kermack and A.G.McKendrick Proc.Roy.Soc. (Lond) A115 (1927) 700

4. J.K.Aronson et al “’When will it be over?’: An introduction to viral reproduction

numbers, R₀ and R_e” www.cebm.net/oxford-covid-19/

5. J.H.Jones “Notes on R₀” 1^st May 2007

https://web.stanford.edu/~jhj1/teachingdocs/Jones-on-R0.pdf

6. P.L.Delamater et al, “Complexity of the Basic Reproduction Number, R₀” Emerging Infectious Diseases 15 (2019) 1 www.cdc.gov/eid

7. Max Fisher, “R0, the messy metric,” https://www.nytimes.com/2020/04/23/world/europe/coronavirus-R0-explainer.html

8. J.Fernandez-Villaverde and C.L.Jones, “ Estimating and simulating a SIRD model of Covid-19 for many countries, states and cities,” 28^th April 2020 Ver 0.6

9. D.Caccavo, “Chinese and Italian outbreaks can be correctly described by a modified SIRD model,”

17^th Apr.2020 medRxiv preprint doi: https://doi.org/10.1101/2020.03.19.20039388

10. P.E.M.Fine, “Herd immunity: history, theory, practice”, Epidemiologic Rev. 15 (1993) 265

11. G.Bastin, “Lectures on mathematical modeling of biological systems,” 22^nd Aug.2018 (GBIO 2060)

12. Peter Adams,” Theory and Practice in Science” https://people.smp.uq.edu.au/PeterAdams/SCIE1000/scie1000_notes_part2.pdf

13. Gautam I.Menon https://science.thewire.in/the-sciences/covid-19-pandemic-infectious-disease-transmission-sir-seir-icmr-indiasim-agent-based-modelling/

14. A.Huppert and G.Katriel,”Mathematical modeling and prediction in infectious disease

epidemiology, “ Clinical Microbiology and Infection (Nov.2013) DOI: 10.1111/1469-0691.12308

15. Gautam I.Menon https://science.thewire.in/the-sciences/coronavirus-pandemic-infectious-disease-transmission-modelling-kermack-mckendrick-theory-seir-model/

16. W.C.Roda et al, “Why is it difficult to accurately predict the covid-19 epidemic?” Infectious Diseases Modeling 5 (2010) 271

17. A.Regalado, “What is herd immunity and can it stop the coronavirus?” 17^th Mar 2020 https://www.technologyreview.com/2020/03/17/905244/what-is-herd-immunity-and-can-it-stop-the-coronavirus/

18. Lawrence Solomon 16^th March 2020, “Britain’s approach to coronavirus: will herd immunity work?” https://www.theglobeandmail.com/opinion/article-britains-novel-approach-to-coronavirus-will-herd-immunity-work/

Mavrik Hard

Monday, May 11, 2020

Reactions to R0, the SIR model and herd immunity

No comments:

Blog Archive

About Me