Python: TypeError: '>' not supported between instances of 'numpy.ndarray' and 'int'

Code description: I am trying to calculate various rolling metrics on financial time series data. I am using a looped approach as I would like to simulate data coming in from an API.

My original code was using a simple itertuples loop which passed values to NumPy arrays for the rolling calculations. However, I would like to speed up the calculations with Numba. As such, I need to iterate through the data using NumPy within a function.

I am getting the following error when trying to iterate through the NumPy array:


PyDev console: starting.
Python 3.7.9 (default, Aug 31 2020, 17:10:11) [MSC v.1916 64 bit (AMD64)] on win32
runfile('F:/Python/Directories/Directed Reading/Data Handling 0.1 JIT.py', wdir='F:/Python/Directories/Directed Reading')
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "C:\Program Files\JetBrains\PyCharm 2019.3.3\plugins\python\helpers\pydev\_pydev_bundle\pydev_umd.py", line 197, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
  File "C:\Program Files\JetBrains\PyCharm 2019.3.3\plugins\python\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "F:/Python/Directories/Directed Reading/Data Handling 0.1 JIT.py", line 92, in <module>
    Results = RunBacktest(Gran, Period, ZMinMax, upperbound, lowerbound, UpperExit, LowerExit, Data)
  File "F:/Python/Directories/Directed Reading/Data Handling 0.1 JIT.py", line 60, in RunBacktest
    if bid > 0:
TypeError: '>' not supported between instances of 'numpy.ndarray' and 'int'

The code is as follows:

import time

import numpy as np
import pandas as pd

Data = "F:/Market Data/2020.3.15 FXAUDCAD-TICK-NoSession.h5"
df = pd.read_hdf(Data)
df = df.set_index(pd.DatetimeIndex(df['DateTime']))
df = df.drop(columns=['DateTime'])
df = df.resample(Gran).mean()

t0 = time.time()
Array = df['Bid'].to_numpy()

def RunBacktest(Gran, Period, ZMinMax, upperbound, lowerbound, UpperExit, LowerExit, Array):

    # Arrays for storing "data feed"
    live_dtime_arr = np.array([])
    live_arr = np.array([])
    live_ma = np.array([])
    live_s_dev = np.array([])
    live_z_score = np.array([])
    live_buy_sig = np.array([])
    live_sell_sig = np.array([])

    count = 0
    sell_count = 0
    buy_count = 0

    # Loop through rows
    for i in np.nditer(Array):

        count += 1
        bid = i      #< this line is throwing the error

        #I did this to filter Nan data points
        if bid > 0:
            if count > Period:
                ma = live_arr[-Period:].mean()
                s_dev = live_arr[-Period:].std()
                z_score = (bid - ma) / s_dev
            else:
                ma = np.nan
                s_dev = np.nan
                z_score = np.nan

            if z_score > upperbound:
                sell_sig = bid
                sell_count += 1
            elif z_score < lowerbound:
                buy_sig = bid
                buy_count += 1
            else:
                signal_filter = 0
                sell_sig = np.nan
                buy_sig = np.nan

            live_arr = np.append(live_arr, [bid], axis=0)
            live_ma = np.append(live_ma, [bid], axis=0)
            live_s_dev = np.append(live_s_dev, [s_dev], axis=0)
            live_z_score = np.append(live_z_score, [z_score], axis=0)
            live_buy_sig = np.append(live_buy_sig, [buy_sig], axis=0)
            live_sell_sig = np.append(live_sell_sig, [sell_sig], axis=0)


        return live_arr

Results = RunBacktest(Gran, Period, ZMinMax, upperbound, lowerbound, UpperExit, LowerExit, Data)
print(Results)

Sample Data: (From df)

Note: There are some NaN values in the 'Bid' column of the pandas DataFrame.

DateTime                 Bid                                              
2006-01-03 00:01:07.588  0.85208       
2006-01-03 00:01:08.654  0.85213       
2006-01-03 00:01:08.859  0.85212       
2006-01-03 00:01:11.472  0.85215       
2006-01-03 00:01:12.002  0.85218  
...                          ...  
2020-03-15 23:59:57.150  0.85178  
2020-03-15 23:59:57.300  0.85179  
2020-03-15 23:59:58.233  0.85179  
2020-03-15 23:59:58.366  0.85178  
2020-03-15 23:59:58.595  0.85179

When I run the loop outside of the function, the printed values appear as expected.
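One possible explanation for the difference, offered as a guess rather than a confirmed diagnosis: the call to RunBacktest passes Data (the file path string) as the last argument instead of Array, whereas the loop outside the function reads the global Array directly. The minimal sketch below uses hypothetical stand-ins (path, bids, first_item) to show that np.nditer over a string yields a zero-dimensional text array, and that comparing it with an int raises a TypeError, while a numeric array iterates and filters as expected.

```python
import numpy as np

# Hypothetical stand-ins for the variables in the question.
path = "F:/Market Data/example.h5"            # plays the role of Data (a str)
bids = np.array([0.85208, np.nan, 0.85213])   # plays the role of Array


def first_item(values):
    """Return the first element that np.nditer yields for `values`."""
    for item in np.nditer(values):
        return item


# A str is wrapped in a 0-d array with a text dtype (kind "U").
item = first_item(path)
print(type(item).__name__, item.dtype.kind)

# Comparing that text array with an int is not supported.
try:
    item > 0
    compare_failed = False
except TypeError as exc:
    compare_failed = True
    print("TypeError:", exc)

# With the numeric array, each element compares fine; NaN is skipped explicitly.
valid = [float(b) for b in np.nditer(bids) if not np.isnan(b)]
print(valid)
```

If this is indeed the cause, passing Array rather than Data in the RunBacktest call should make the comparison work inside the function as well.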

I am new to programming and would really appreciate some advice/help. Thanks!

I'm using Python 3.7.9 and NumPy 1.19.1
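Separately from the error, the posted loop returns from inside the for block (so it would stop after the first row) and grows its arrays with np.append on every tick. The sketch below is one way the rolling z-score could be restructured, not the original method: the function name rolling_z_scores and the sample values are hypothetical, NaN bids are dropped up front, and the output array is preallocated. A plain indexed loop over preallocated arrays like this is generally the kind of code Numba handles well, although that is not tested here.

```python
import numpy as np


def rolling_z_scores(bids, period):
    """Z-score of each valid bid against the previous `period` valid bids."""
    valid = bids[~np.isnan(bids)]        # drop NaN ticks before looping
    z = np.full(valid.shape, np.nan)     # preallocate instead of np.append
    for i in range(period, valid.size):
        window = valid[i - period:i]     # the `period` bids before bid i
        s_dev = window.std()
        if s_dev > 0:                    # avoid dividing by zero on flat windows
            z[i] = (valid[i] - window.mean()) / s_dev
    return valid, z                      # return sits outside the loop


# Hypothetical sample data with one missing tick.
bids = np.array([1.0, 2.0, np.nan, 3.0, 4.0, 5.0])
valid, z = rolling_z_scores(bids, period=3)
print(valid)
print(z)
```

The first `period` entries of z stay NaN because there is not yet a full window of earlier bids to compare against.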
