How the SIR Model Works

Section 1: The main purpose of the SIR model is to understand:

1.       How many days it takes for the epidemic to peak.

2.       After how many days from the start of infection can we expect the number of new cases to be less than five (5).

3.       How to estimate the hospitalization rate and the number of beds and ventilators required.

4.       Important notes:

a.       A limitation of this model is its assumption that a person who has been infected cannot be infected again (lifelong immunity). Where reinfection does occur, the SIR model will not accurately reflect the actual number of cases.

b.      The SIR model also cannot estimate the effects of quarantine (for example, a state lockdown).

c.       The SIR model cannot tell how much outcomes would improve if a vaccine is developed and introduced.

d.      The SIR model assumes a well-mixed, homogeneous population, i.e. everyone is of the same age and has the same immunity. The birth and death rates are also assumed constant, so any change in the population mix is due only to the ongoing epidemic.

e.       A further complication is that the model is a continuous-time process, whereas the data are generally collected on a daily, weekly, or monthly basis.

Section 2: To use the SIR model, epidemiologists estimate the following:

1)      The starting date of the infection in a county, state, or country.
2)      Susceptible (S) - The population susceptible to the disease (in the case of COVID-19, we take the entire population to be susceptible, as the disease is novel and there is no prior understanding of it).
a.       We can express the susceptible population as a percentage and set it to 100%.
3)      Infected (I) - The number of people infected on the starting date (in the case of COVID-19 we don’t know how many people were infected on the first day, so we can use a value of 1 or 10).

a.       We can also express the number infected on the starting date as a proportion of the population, using a very small number such as 0.0001 or 0.001.

4)      Recovered (R) - The number of patients recovered at the start of the infection.

a.       We can take this value to be zero for COVID-19 at the start of the infection.
5)      The main parameters of the infection are Beta (the effective contact rate) and Gamma (the inverse of the mean recovery time).
6)      Using a differential equation solver, we initialize the values of S, I, and R, supply the values of Beta and Gamma, and ask the solver to return S(t), I(t), and R(t) for the requested number of time steps.
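For reference, the system that such a solver integrates is usually written as below. This is the standard textbook form of the SIR equations, not necessarily the exact formulation used in any particular code. N (the total population, S + I + R) is a symbol introduced here for illustration; when S, I, and R are given as proportions, N = 1.

```latex
\begin{aligned}
\frac{dS}{dt} &= -\beta \frac{S I}{N} \\
\frac{dI}{dt} &= \beta \frac{S I}{N} - \gamma I \\
\frac{dR}{dt} &= \gamma I
\end{aligned}
```

The three right-hand sides sum to zero, so the total S + I + R stays constant over the simulation.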

Section 3: To calculate the SIR model outputs, the person running the model needs to know the following:

1)      The starting date (t=0) on which the epidemic started in the county, state, or country.

2)      The number of days we want to run the simulation.

3)      The value of Susceptible (S) at time t = 0. If we don’t know it, we can use the entire population.

4)      The value of Infected (I) at time t = 0. If we don’t know it, we can use 1, as we need at least one infected person.

5)      The value of Recovered (R) at time t = 0. If we don’t know it, we can use 0, as the initial number of recovered can be assumed to be zero.

6)      The key parameters of the SIR model are the constants Beta and Gamma:

a.       Recovery Rate - Gamma = 1 / (mean number of days it takes to recover)

                   i.      As per the CDC: mean number of days is 14, Gamma = 1/14 (some experts argue that 20 days is the time needed for virus shedding)

                   ii.      As per John's R code: mean number of days is 7, Gamma = 1/7

b.      Infection Rate - Beta (effective contact rate) is the challenging parameter to calculate. Because the epidemic has not ended, Beta cannot be estimated directly.

                   i.      An approximate value of Beta can be calculated as follows:

1.       Beta = Gamma + G,

2.       where G = 2^(1/Td) - 1,

a.       and Td is the doubling time: the number of days it takes for the number of cases to double, measured from the start of infection (t=0).

3.       The American Hospital Association (AHA) initially projected a doubling time (Td) of between 7 and 10 days. The doubling time applies to the number of infections, not the number of confirmed cases. This distinction may explain the discrepancy between the AHA's doubling time estimates and the observed doubling time of confirmed cases (currently 2 to 4 days).

4.       Beta and Gamma are related through the basic reproduction number, R0 = Beta/Gamma. The value of R0 for COVID-19 is suggested to be between 1.4 and 3.28.

7)      Using the values from points 2, 3, 4, 5, 6a, and 6b, we define the SIR model and solve it with any differential equation solver package.

a.       John has used the R package deSolve.

8)      The output of the model will be the values of S(t), I(t) and R(t) for each day of the simulation.

9)      The output values from point 8 are then used as described in Section 4.
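The approximation for Beta in point 6b can be sketched in a few lines of Python. The function names are hypothetical, and the inputs (a 7-day doubling time and a 14-day mean recovery time) are picked from the ranges quoted above purely for illustration. The resulting Beta is a per-day rate; it applies directly when S is a proportion, and needs dividing by the population when S is a head count.

```python
def growth_rate(doubling_time_days):
    """Daily growth rate G = 2^(1/Td) - 1 for a doubling time Td in days."""
    return 2.0 ** (1.0 / doubling_time_days) - 1.0


def beta_from_doubling_time(doubling_time_days, mean_recovery_days):
    """Return (Beta, Gamma) using Beta = Gamma + G and Gamma = 1 / recovery time."""
    gamma = 1.0 / mean_recovery_days
    beta = gamma + growth_rate(doubling_time_days)
    return beta, gamma


# Illustrative inputs only: Td = 7 days, mean recovery time = 14 days.
beta, gamma = beta_from_doubling_time(7.0, 14.0)
r0 = beta / gamma  # basic reproduction number R0 = Beta / Gamma

print(f"Gamma = {gamma:.4f}, Beta = {beta:.4f}, R0 = {r0:.2f}")
```

With these inputs R0 comes out at about 2.46, inside the 1.4 to 3.28 range quoted above. A shorter doubling time or a longer recovery time pushes R0 higher.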

Section 4: Output of the SIR model:

1)      We can use the output values of SIR i.e. S(t), I(t) and R(t) as a time series matrix.

2)      The number of output rows will be the number of days we wanted to run the simulation.

a.       John’s R code uses 200 as the default number.

b.      Each row will contain the output values of S(t), I(t) and R(t)

3)      We can plot the values from each row as a time series, starting from the first day of infection.

a.       Using the graph we can answer points 1, 2 and 3 from Section 1:

                   i.      How many days it takes for the epidemic to peak.

                   ii.      After how many days from the start of infection can we expect the number of new cases to be less than five (5).

                   iii.      How to estimate the hospitalization rate and the number of beds and ventilators required.
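The whole pipeline, from initial values to the three answers above, can be sketched as follows. This is a minimal illustration, not John's deSolve code: it uses a hand-written fixed-step Runge-Kutta (RK4) integrator so that it runs with the Python standard library alone. The population size, the hospitalization and ventilator rates, and all function names are hypothetical placeholders. "New cases" is interpreted here as the daily drop in S, and the search for fewer than five new cases starts after the peak, since the count is also below five in the first days of the outbreak.

```python
def sir_derivatives(s, i, beta, gamma, n):
    """Right-hand side of the SIR equations, with S, I, R as head counts."""
    new_infections = beta * s * i / n
    recoveries = gamma * i
    return -new_infections, new_infections - recoveries, recoveries


def rk4_step(state, dt, beta, gamma, n):
    """Advance (S, I, R) by one classical fourth-order Runge-Kutta step."""
    def f(y):
        return sir_derivatives(y[0], y[1], beta, gamma, n)

    def shifted(y, k, h):
        return tuple(a + h * b for a, b in zip(y, k))

    k1 = f(state)
    k2 = f(shifted(state, k1, dt / 2))
    k3 = f(shifted(state, k2, dt / 2))
    k4 = f(shifted(state, k3, dt))
    return tuple(
        y + dt / 6 * (a + 2 * b + 2 * c + d)
        for y, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate_sir(s0, i0, r0, beta, gamma, days, steps_per_day=10):
    """Return one (day, S, I, R) row per day, for days 0 .. days - 1."""
    n = s0 + i0 + r0
    dt = 1.0 / steps_per_day
    state = (float(s0), float(i0), float(r0))
    rows = [(0,) + state]
    for day in range(1, days):
        for _ in range(steps_per_day):
            state = rk4_step(state, dt, beta, gamma, n)
        rows.append((day,) + state)
    return rows


def peak_day(rows):
    """Day on which the number of infected, I(t), is largest."""
    return max(rows, key=lambda row: row[2])[0]


def first_day_new_cases_below(rows, threshold=5.0):
    """First day after the peak with fewer than `threshold` new infections."""
    peak = peak_day(rows)
    for previous, current in zip(rows, rows[1:]):
        new_cases = previous[1] - current[1]  # daily drop in S
        if current[0] > peak and new_cases < threshold:
            return current[0]
    return None  # threshold not reached within the simulated window


# Hypothetical inputs: population of 1,000,000, one initial infection,
# Gamma = 1/14, and Beta from a 7-day doubling time (Section 3, point 6b).
population = 1_000_000
gamma = 1 / 14
beta = gamma + 2 ** (1 / 7) - 1
# 400 days rather than the 200-day default, so the tail of the epidemic is covered.
rows = simulate_sir(population - 1, 1, 0, beta, gamma, days=400)

peak = peak_day(rows)
below_five = first_day_new_cases_below(rows)

# Placeholder rates, not sourced figures: share of active infections
# needing a hospital bed and a ventilator respectively.
hospitalization_rate = 0.05
ventilator_rate = 0.01
peak_infected = rows[peak][2]

print(f"Epidemic peaks on day {peak} with about {peak_infected:,.0f} infected")
print(f"New cases fall below 5 per day on day {below_five}")
print(f"Beds needed at peak: about {peak_infected * hospitalization_rate:,.0f}")
print(f"Ventilators needed at peak: about {peak_infected * ventilator_rate:,.0f}")
```

Each row of `rows` corresponds to one row of the time series matrix described in point 2, so the same list can be passed to a plotting library. Replacing the placeholder rates with locally observed figures changes only the last two lines of output.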