Error in Threading SARIMAX model


I am using the threading library for the first time in order to speed up the training of my SARIMAX models, but the code keeps failing with the following error:

Bad direction in the line search; refresh the lbfgs memory and restart the iteration.
This problem is unconstrained.
This problem is unconstrained.
This problem is unconstrained.

Following is my code:

import numpy as np
import pandas as pd
from statsmodels.tsa.arima_model import ARIMA
import statsmodels.tsa.api as smt
from threading import Thread

def process_id(ndata):
    # Hold out the last 7 observations as the test set.
    train = ndata[0:-7]
    test = ndata[len(train):]
    try:
        model = smt.SARIMAX(train.asfreq(freq='1d'), exog=None, order=(0, 1, 1), seasonal_order=(0, 1, 1, 7)).fit()
        pred = model.get_forecast(len(test))
        fcst = pred.predicted_mean
        fcst.index = test.index
        # Mean absolute percentage error over the forecast horizon.
        mape = np.mean(np.abs(test.values - fcst.values) / test.values) * 100
        print(mape)
    except Exception:
        # A failed fit is recorded as a MAPE of 0.
        mape = 0
    return mape

def process_range(ndata, store=None):
   if store is None:
      store = {}
   for id in ndata:
      store[id] = process_id(ndata[id])
   return store


def threaded_process_range(nthreads, ndata):
    store = {}
    threads = []
    tk = ndata.columns
    # Integer chunk size, rounded up so that no column is left out.
    step = -(-len(tk) // nthreads)
    # create the threads, one per chunk of columns
    for i in range(nthreads):
        dk = tk[i * step:(i + 1) * step]
        t = Thread(target=process_range, args=(ndata[dk], store))
        threads.append(t)
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store

outdata = threaded_process_range(4,ndata)
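One way to split a set of column names into per-thread chunks and collect the results is sketched below with the standard library's `concurrent.futures`. The helper names `chunk` and `run_in_threads` and the stand-in worker are hypothetical and used only for illustration; in the code above the worker would be something like `lambda col: process_id(ndata[col])`.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk(items, nchunks):
    """Split items into nchunks contiguous pieces whose sizes differ by at most one."""
    items = list(items)
    size, extra = divmod(len(items), nchunks)
    pieces, start = [], 0
    for i in range(nchunks):
        # The first `extra` pieces each take one additional item.
        end = start + size + (1 if i < extra else 0)
        pieces.append(items[start:end])
        start = end
    return pieces


def run_in_threads(worker, keys, nthreads):
    """Apply worker to every key, one chunk of keys per task; return {key: result}."""
    def process_chunk(part):
        return {key: worker(key) for key in part}

    store = {}
    with ThreadPoolExecutor(max_workers=nthreads) as pool:
        for partial in pool.map(process_chunk, chunk(keys, nthreads)):
            store.update(partial)  # merged in the calling thread only
    return store


# Stand-in worker (hypothetical): returns the length of each key.
result = run_in_threads(lambda key: len(key), ["AAA", "BB", "C", "DDDD", "EE"], 2)
print(result)
```

Using `divmod` keeps every slice bound an integer and spreads any leftover items over the first chunks. Whether threads actually shorten model fitting depends on how much of the work releases the GIL, which this sketch does not address.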

A few things I would like to mention:

  • The data is a daily stock time series in a DataFrame
  • Threading works for the ARIMA model
  • The SARIMAX model works when run in a plain for loop

Any insights would be highly appreciated, thanks!


There is 1 answer

Rohan Kumar On

I got the same error with lbfgs. I'm not sure why lbfgs fails during its gradient evaluations, so I tried changing the optimizer. You can try this too, by passing any of the following to the method argument of fit:

  • 'newton' for Newton-Raphson
  • 'nm' for Nelder-Mead
  • 'bfgs' for Broyden-Fletcher-Goldfarb-Shanno (BFGS)
  • 'lbfgs' for limited-memory BFGS with optional box constraints
  • 'powell' for modified Powell's method
  • 'cg' for conjugate gradient
  • 'ncg' for Newton-conjugate gradient
  • 'basinhopping' for global basin-hopping solver

Change the fit call in your code to:

model = smt.SARIMAX(train.asfreq(freq='1d'), exog=None, order=(0, 1, 1), seasonal_order=(0, 1, 1, 7)).fit(method='cg')
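If no single optimizer is reliable across all series, one option is to try several in turn. The sketch below assumes a hypothetical `fit_fn(method)` callable that wraps the `.fit(method=...)` call and raises an exception when the fit fails. The helper name `fit_with_fallback` and the order of methods are illustrative and not part of statsmodels.

```python
def fit_with_fallback(fit_fn, methods=("lbfgs", "cg", "powell", "nm")):
    """Call fit_fn(method) for each method in order; return (method, result) of the first success."""
    errors = {}
    for method in methods:
        try:
            return method, fit_fn(method)
        except Exception as exc:  # assumption: a failed fit raises an exception
            errors[method] = exc
    raise RuntimeError(f"all optimizers failed: {errors}")


# Fake fit function (hypothetical) that fails for 'lbfgs' only.
def fake_fit(method):
    if method == "lbfgs":
        raise ValueError("bad direction in the line search")
    return f"fitted with {method}"


used, fitted = fit_with_fallback(fake_fit)
print(used, fitted)
```

With the model from the question, `fit_fn` could be something like `lambda m: smt.SARIMAX(train.asfreq(freq='1d'), order=(0, 1, 1), seasonal_order=(0, 1, 1, 7)).fit(method=m)`. Note that an optimizer may print a warning without raising, in which case the fallback would not trigger.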

It's an old question, but I'm answering it in case someone faces the same problem in the future.