EM-Algorithm with Pykalman

2.9k views Asked by At

I am trying to implement a simpel application of the Kalman Filter using Pykalman, but I am getting an error on the estimation step of the EM-Algorithm that comes with the Pykalman package.

It is a simple linear regression with time-varying coefficient, based on simulated data. The code below simulates the data and starts the kalman filter, but when I try to estimate the parameters based on the observations, using kf.em(Data), it returns the error: ValueError: object arrays are not supported.

Am I doing something wrong with pykalman?

Model and full code below. The error occurs on the last line of the code.

Model (small images)

Description of the problem

State-Space representation of the model

Full Code

import pandas as pd
import scipy as sp
import numpy as np

import matplotlib.pyplot as plt
import pylab as pl
from pykalman import KalmanFilter

# generates the data
Data = pd.DataFrame(columns=['NoiseAR','NoiseReg', 'x', 'beta', 'y'], index=range(1000))
Data['NoiseAR'] = np.random.normal(loc=0.0, scale=1.0, size=1000)
Data['NoiseReg'] = np.random.normal(loc=0.0, scale=1.0, size=1000)

for i in range(1000):
    if i==0:
        Data.loc[i, 'x'] = Data.loc[i, 'NoiseAR']
    else:
        Data.loc[i, 'x'] = 0.95*Data.loc[i-1, 'x'] + Data.loc[i, 'NoiseAR']

for i in range(1000):
    Data.loc[i, 'beta'] = np.sin(np.radians(i))

Data['y'] = Data['x']*Data['beta'] + Data['NoiseReg']

# set up the kalman filter
F = [1.]
H = Data['x'].values.reshape(1000,1,1)
Q = [2.]
R = [2.]

init_state_mean = [0.]
init_state_cov = [2.]

kf = KalmanFilter(
    transition_matrices=F, 
    observation_matrices=H, 
    transition_covariance=Q, 
    observation_covariance=R, 
    initial_state_mean=init_state_mean, 
    initial_state_covariance=init_state_cov, 
    em_vars=['transition_covariance', 'observation_covariance', 'initial_state_mean', 'initial_state_covariance']
)

# estimate the parameters from em_vars using the EM algorithm
kf = kf.em(Data['y'].values)
1

There are 1 answers

0
Gustavo Amarante On BEST ANSWER

I figured it out! Data['y'].values is a numpy array with dtype=object. All I had to do is to change the type of the array to float using .astype(float). This has to be done with everything that goes into the kalman filter object of pykalman, so I also had to change the type of the H matrix.

Hope this helps somebody in the future!

Here is what the final working code looks like:

import pandas as pd
import scipy as sp
import numpy as np
import matplotlib.pyplot as plt
import pylab as pl
from pykalman import KalmanFilter

Data = pd.DataFrame(columns=['NoiseAR','NoiseReg', 'x', 'beta', 'y'], index=range(1000))

Data['NoiseAR'] = np.random.normal(loc=0.0, scale=1.0, size=1000)
Data['NoiseReg'] = np.random.normal(loc=0.0, scale=1.0, size=1000)
plt.plot(Data[['NoiseAR','NoiseReg']])
plt.show()

for i in range(1000):
    if i == 0:
        Data.loc[i, 'x'] = Data.loc[i, 'NoiseAR']
    else:
        Data.loc[i, 'x'] = 0.95 * Data.loc[i - 1, 'x'] + Data.loc[i, 'NoiseAR']

plt.plot(Data['x'])
plt.show()

for i in range(1000):
    Data.loc[i, 'beta'] = np.sin(np.radians(i))

plt.plot(Data['beta'])
plt.show()

Data['y'] = Data['x']*Data['beta'] + Data['NoiseReg']

plt.plot(Data[['x', 'y']])
plt.show()

F = [1.]
H = Data['x'].values.reshape(1000,1,1).astype(float)
Q = [2.]
R = [2.]

init_state_mean = [0.]
init_state_cov = [2.]

kf = KalmanFilter(
    transition_matrices=F,
    observation_matrices=H,
    transition_covariance=Q,
    observation_covariance=R,
    initial_state_mean=init_state_mean,
    initial_state_covariance=init_state_cov,
    em_vars=['transition_covariance', 'observation_covariance', 'initial_state_mean', 'initial_state_covariance']
)

kf = kf.em(Data['y'].values.astype(float))

filtered_state_estimates = kf.filter(Data['y'].values.astype(float))[0]
smoothed_state_estimates = kf.smooth(Data['y'].values.astype(float))[0]

pl.figure(figsize=(10, 6))
lines_true = pl.plot(Data['beta'].values, linestyle='-', color='b')
lines_filt = pl.plot(filtered_state_estimates, linestyle='--', color='g')
lines_smooth = pl.plot(smoothed_state_estimates, linestyle='-.', color='r')
pl.legend(
    (lines_true[0], lines_filt[0], lines_smooth[0]),
    ('true', 'filtered', 'smoothed')
)
pl.xlabel('time')
pl.ylabel('state')

pl.show()