Fitting a periodic graph in python: parameters for scipy.interpolate.splrep, equation of curve?

1.5k views Asked by At

[Original Question]

I need an equation for curve that increases indefinitely as time elapsed increases based on data below. How to get that?

[Updates on the question]

I need to specify proper parameters for scipy.interpolate.splrep. Can someone helpout?

Also, Is there a way to get an equation from coeffs of b spline?

[Alternate question]

How to get a fit using decomposition of signal in Fourier series?

This plot seems to be a combination of linear plot, a periodic function pf1 which increases four time, a larger periodic function which causes pf1 to happen again and again indefinitely. The difficulty of the plot is the reason which is why this question has been asked.

enter image description here

Data:

Time elapsed in sec.    TX + RX Packets
(0,0)
(10,2422)
(20,2902)
(30,2945)
(40,3059)
(50,3097)
(60,4332)
(70,4622)
(80,4708)
(90,4808)
(100,4841)
(110,6081)
(120,6333)
(130,6461)
(140,6561)
(150,6585)
(160,7673)
(170,8091)
(180,8210)
(190,8291)
(200,8338)
(210,8357)
(220,8357)
(230,8414)
(240,8414)
(250,8414)
(260,8414)
(270,8414)
(280,8414)
(290,8471)
(300,8471)
(310,8471)
(320,8471)
(330,8471)
(340,8471)
(350,8471)
(360,8471)
(370,8471)
(380,8471)
(390,8471)
(400,8471)
(410,8471)
(420,8528)
(430,8528)
(440,8528)
(450,8528)
(460,8528)
(470,8528)
(480,8528)
(490,8528)
(500,8528)
(510,9858)
(520,10029)
(530,10129)
(540,10224)
(550,10267)
(560,11440)
(570,11773)
(580,11868)
(590,11968)
(600,12039)
(610,13141)

My Code:

import numpy as np
import matplotlib.pyplot as plt

points = np.array(
    [(0,0), (10,2422), (20,2902), (30,2945), (40,3059), (50,3097), (60,4332), (70,4622), (80,4708), (90,4808), (100,4841), (110,6081), (120,6333), (130,6461), (140,6561), (150,6585), (160,7673), (170,8091), (180,8210), (190,8291), (200,8338), (210,8357), (220,8357), (230,8414), (240,8414), (250,8414), (260,8414), (270,8414), (280,8414), (290,8471), (300,8471), (310,8471), (320,8471), (330,8471), (340,8471), (350,8471), (360,8471), (370,8471), (380,8471), (390,8471), (400,8471), (410,8471), (420,8528), (430,8528), (440,8528), (450,8528), (460,8528), (470,8528), (480,8528), (490,8528), (500,8528), (510,9858), (520,10029), (530,10129), (540,10224), (550,10267), (560,11440), (570,11773), (580,11868), (590,11968), (600,12039), (610,13141)]
    )
# get x and y vectors
x = points[:,0]
y = points[:,1]

# calculate polynomial
z = np.polyfit(x, y, 3)
print z
f = np.poly1d(z)

# calculate new x's and y's
x_new = np.linspace(x[0], x[-1], 50)
y_new = f(x_new)

plt.plot(x,y,'o', x_new, y_new)
plt.xlim([x[0]-1, x[-1] + 1 ])
plt.show()

My Output:

enter image description here

My Code 2:

import numpy as N
from scipy.interpolate import splprep, splev

x = N.array([0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 580, 590, 600, 610])
y = N.array([0, 2422, 2902, 2945, 3059, 3097, 4332, 4622, 4708, 4808, 4841, 6081, 6333, 6461, 6561, 6585, 7673, 8091, 8210, 8291, 8338, 8357, 8357, 8414, 8414, 8414, 8414, 8414, 8414, 8471, 8471, 8471, 8471, 8471, 8471, 8471, 8471, 8471, 8471, 8471, 8471, 8471, 8528, 8528, 8528, 8528, 8528, 8528, 8528, 8528, 8528, 9858, 10029, 10129, 10224, 10267, 11440, 11773, 11868, 11968, 12039, 13141])

# spline parameters
s=1.0 # smoothness parameter
k=3 # spline order
nest=-1 # estimate of number of knots needed (-1 = maximal)

# find the knot points
tckp,u = splprep([x,y],s=s,k=k,nest=nest,quiet=True,per=1)

# evaluate spline, including interpolated points
xnew,ynew = splev(N.linspace(0,1,400),tckp)

import pylab as P
data,=P.plot(x,y,'bo-',label='data')
fit,=P.plot(xnew,ynew,'r-',label='fit')
P.legend()
P.xlabel('x')
P.ylabel('y')

P.show()

My Output 2: enter image description here

1

There are 1 answers

2
Moritz On

It looks like you have a reaction kinetic:

#%%
import numpy as np
from scipy.integrate import odeint
from scipy import optimize
from matplotlib import pyplot as plt
#%%
data = []
with open('data.txt', 'r') as f:
    for line in f:
        data.append(line.strip(' \n ()').split(','))
data = np.array(data,dtype=float)
data = data[0:-1].T

#%%
slope = np.diff(data[1])
index = np.where(slope>1000)
index = np.append(index, len(data[0]) -1 )
plt.plot(data[0],data[1],'.')
plt.plot(data[0,index],data[1,index],'ro')
plt.plot(data[0,1:],np.diff(data[1]))

enter image description here

From here in I assume that the reaction starts at every marked spot (red). I am sure the code could be written much cleaner, but it is a first quick and dirty hack. You could use scipy curvefit or similar to fit the raction constant k

#%%
time = data[0,index]

def model(y,t,k):
    dydt = np.zeros(2)
    dydt[0] = -k*y[0]
    dydt[1] = k*y[0]

    return dydt


def res(k):
    y_hat = []
    t_hat = []
    for i in xrange(len(index) -1):
        '''
        I assume that at every timepoint the reaction is initiated by
        adding y[i + 1] - y[i] new datapackages. Over time they are 
        converted to sent packages. All packages which do not react,
        do not take part in the next cycle.
        '''
        y0 = [data[1, index[i+1]] - data[1, index[i]], 0]
        t0 = data[0, index[i]:index[i+1]]
        y_int,info = odeint(model, y0, t0, args=(k,), full_output = 1 )
        # I am not very happy about the construct below, but could
        # not find a better solution.
        y_hat.append(list(y_int[:,1]))
        t_hat.append(list(t0))
    return y_hat,t_hat

k = 2e-1
y,t = res(k)
''' It may be possible to play with y0[1] in the model in order 
to avoid the construct below. But since I started every reaction at y[1]=0
I have to add the last y value from the last step. This is a bit of a hack,
since data[1, index[i]] is not necessarily the corresponding solution. But hey, It seems to work.
'''
y_hat = [subitem + data[1, index[i]] for i,item in enumerate(y) for subitem in item]
t_hat = [subitem for item in t for subitem in item]
y_hat = np.array(y_hat,dtype=float)
t_hat = np.array(t_hat,dtype=float)


#%%
plt.plot(data[0],data[1],'.')
plt.plot(data[0,index],data[1,index],'ro')
plt.plot(t_hat,y_hat,'go')

enter image description here

Another approach could be (maybe physically more correct) to add a CDF of a gaussian peak at each timepoint.