I am trying to fit a Susceptible-Infectious-Recovered (SIR) ODE model to infection data in python using scipy.optimize, but the infection curve of the ODE model leads to a delayed rise in infections and an overestimated peak height (see the orange line below).
The data comes from an agent-based simulation with homogeneous mixing and three states (S, I, and R). In theory, this should be in accordance with the assumptions of ODE models and the model should be able to fit perfectly. I am not sure whether my fitting method needs improvement or whether the agent-based simulation creates a shape that cannot be reproduced with a simple ODE model.
Below is the data and my attempt.
import numpy as np
import matplotlib.pyplot as plt
from scipy import integrate, optimize
inf_data = [0.005,0.007,0.009,0.012,0.015,0.019,0.024,0.03,0.036,0.044,0.053,0.063,0.074,0.086,0.098,0.112,0.126,0.141,0.157,0.172,0.188,0.203,0.218,0.232,0.247,0.259,0.272,0.282,0.292,0.302,0.308,0.315,0.318,0.321,0.322,0.321,0.319,0.317,0.312,0.307,0.301,0.294,0.286,0.279,0.27,0.262,0.253,0.244,0.236,0.226,0.217,0.207,0.197,0.188,0.18,0.171,0.163,0.155,0.147,0.14,0.132,0.125,0.119,0.112,0.106,0.1,0.095,0.089,0.085,0.08,0.076,0.071,0.067,0.064,0.06,0.056,0.053,0.05,0.047,0.044,0.041,0.039,0.037,0.034,0.032,0.03,0.028,0.026,0.025,0.023,0.022,0.021,0.019,0.018,0.017,0.016,0.015,0.014,0.013,0.012,0.011] #percentage of infected
time_points = np.arange(len(inf_data))
def sir_model(y, x, beta, gamma):
S, I, R = y
dS = -beta * S * I / N
dI = beta * S * I / N - gamma * I
dR = gamma * I
return dS, dI, dR
def fit_odeint(time_points, beta, gamma):
return integrate.odeint(sir_model, (S0, I0, R0), time_points, args=(beta, gamma))[:,1]
# Initial values
N = 1.0
I0 = inf_data[0]
S0 = N - I0
R0 = 0.0
popt, pcov = optimize.curve_fit(fit_odeint, time_points, inf_data)
fitted = fit_odeint(time_points, *popt)
plt.plot(time_points, inf_data, 'o')
plt.plot(time_points, fitted)
plt.show()

I don't think that you have enough "play" in your model equations to fit that data. As an exercise in curve-fitting (if not in infection behaviour) you could add another parameter p in your source term for dI/dt (and sink for dS/dt), replacing I by I**p. This slightly changes the standard form of the SIR equation model, but you could regard it as equivalent to making beta and gamma weak functions of I.
Note that R is redundant in your set of equations (since S+I+R=N) - so I've removed it below.
This gives
Note that the power p is a little less than 1: you need your initial infections to rise a little faster.