OSError: could not get source code during Pandas rolling apply with an iterative function

I am encountering an issue related to the Pandas rolling function in Python.

Specifically, I am implementing a rolling apply with a custom function, and I'm facing an OSError: could not get source code error.

This is my DataFrame

gvkey  PERMNO  date        Xt        Ve          sigma_E       rf
1004   54594   2011-09-30  178.9760  674484.87   0.000000      0.000003
1004   54594   2011-10-03  178.9760  613793.373  30345.750000  0.000003

This is the function I defined, with an iterative (recursive) function nested inside it:

def cal_sigma_A(x):
    ite_sig = x.sigma_E.iloc[-1]
    ite_Va = x.Ve

    def iterative_sigma(x, sig = ite_sig, Va = ite_Va):
    
        d1 = (np.log(Va/x.Xt) + (x.rf + 0.5*(sig**2)))/sig
        d2 = d1 - sig
        d1 = norm.cdf(d1)
        d2 = norm.cdf(d2)
        new_Va = (x.Ve + x.Xt*np.exp(-x.rf)*d2)/d1
        new_sigma = np.std(new_Va)
        if new_sigma - sig < 10*(-4):
            return new_sigma
        else:
            return iterative_sigma(x,new_sigma,new_Va)
    return iterative_sigma(x,ite_sig,ite_Va)

merged.set_index('date').groupby(['gvkey','PERMNO'],group_keys=False).rolling('365D').apply(lambda x:cal_sigma_A(x))

And this is the error:

Unexpected exception formatting exception. Falling back to standard exception
Traceback (most recent call last):
  File "d:\anaconda\Lib\site-packages\IPython\core\interactiveshell.py", line 3526, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "C:\Users\28137\AppData\Local\Temp\ipykernel_11788\4089525893.py", line 1, in <module>
    merged.groupby(['gvkey','PERMNO'],group_keys=False).rolling('365D').apply(lambda x:cal_sigma_A(x))
  File "d:\anaconda\Lib\site-packages\pandas\core\window\rolling.py", line 1913, in apply
    return super().apply(
           ^^^^^^^^^^^^^^
  File "d:\anaconda\Lib\site-packages\pandas\core\window\rolling.py", line 1390, in apply
    return self._apply(
           ^^^^^^^^^^^^
  File "d:\anaconda\Lib\site-packages\pandas\core\window\rolling.py", line 709, in _apply
    result = super()._apply(
             ^^^^^^^^^^^^^^^
  File "d:\anaconda\Lib\site-packages\pandas\core\window\rolling.py", line 615, in _apply
    return self._apply_blockwise(homogeneous_func, name, numeric_only)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "d:\anaconda\Lib\site-packages\pandas\core\window\rolling.py", line 490, in _apply_blockwise
    res = homogeneous_func(arr)
          ^^^^^^^^^^^^^^^^^^^^^
  File "d:\anaconda\Lib\site-packages\pandas\core\window\rolling.py", line 610, in homogeneous_func
    result = calc(values)
             ^^^^^^^^^^^^
  File "d:\anaconda\Lib\site-packages\pandas\core\window\rolling.py", line 607, in calc
    return func(x, start, end, min_periods, *numba_args)
...
                  ^^^^^^^^^^^^^^^^^^
  File "d:\anaconda\Lib\inspect.py", line 1081, in findsource
    raise OSError('could not get source code')
OSError: could not get source code

Additionally, if I only want to execute this function at the end of each month, what's the best way to do it?

I'm a newbie, so I have no idea how to debug.

There is 1 answer

Answer by Suraj Shourie

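Reduced to its essentials, your problem is how to apply a rolling function across multiple columns. rolling().apply() passes your function one column's window at a time, so the function never sees sigma_E, Xt, rf and Ve together. There are a few related problems on SO, one of them is here.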
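A minimal sketch of why multi-column access fails inside rolling().apply(), assuming a toy frame with made-up values (the names probe, needs_two_columns, calls and underlying_error are hypothetical and only for illustration). It records what pandas hands to the function, then catches the exception directly so that the underlying error is visible even when the notebook's traceback formatter fails:

```python
import pandas as pd

# Toy data; the values are invented for this illustration.
df = pd.DataFrame({"Ve": [1.0, 2.0, 3.0], "sigma_E": [0.1, 0.2, 0.3]})

calls = []

def probe(window):
    # Record the type, dimensionality, and whether another column is reachable.
    calls.append((type(window).__name__, window.ndim, hasattr(window, "sigma_E")))
    return window.sum()

# The function is called separately for each column's window.
sums = df.rolling(2).apply(probe)

def needs_two_columns(window):
    # Mimics the question's function: it expects several columns at once.
    return window.Ve.iloc[-1] * window.sigma_E.iloc[-1]

underlying_error = None
try:
    df.rolling(2).apply(needs_two_columns)
except AttributeError as exc:
    # Catching the exception ourselves bypasses the traceback formatter.
    underlying_error = exc

print(calls)
print(repr(underlying_error))
```

Every recorded call receives a one-dimensional Series holding a single column's window, which is why attribute access such as x.sigma_E cannot work there.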

You can use rolling_apply from the numpy_ext package (you'll have to install it). With a few tweaks to your original code, the code below works. Be careful with the recursion limit, though: if your function keeps calling itself without returning, Python will eventually raise a RecursionError.

# !pip install numpy_ext
import pandas as pd
from scipy.stats import norm
from numpy_ext import rolling_apply as rolling_apply_ext
import numpy as np
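def func1(ite_Va, sigma_E, Xt, rf, Ve):
    # Each argument is the current window's slice of one column.
    ite_sig = sigma_E.iloc[-1]  # last sigma_E value in the window

    def iterative_sigma(xt, rf, Ve, sig=ite_sig, Va=ite_Va):
        d1 = (np.log(Va/xt) + (rf + 0.5*(sig**2)))/sig
        d2 = d1 - sig
        d1 = norm.cdf(d1)
        d2 = norm.cdf(d2)
        new_Va = (Ve + xt*np.exp(-rf)*d2)/d1
        new_sigma = np.std(new_Va)
        # The recursion is disabled here, so a single pass is returned.
        # if new_sigma - sig < 10**(-4):
        return new_sigma
        # else:
        #     return iterative_sigma(xt,rf,Ve,new_sigma,new_Va)

    return iterative_sigma(Xt,rf,Ve,ite_sig,ite_Va)

window = 2  # number of rows per window
# df is the DataFrame from the question (named merged there).
# The first df['Ve'] seeds ite_Va; the last one is passed as Ve.
rolling_apply_ext(func1, window, df['Ve'], df['sigma_E'], df['Xt'], df['rf'], df['Ve'])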
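
The recursion can also be written as a loop with an iteration cap, which sidesteps the recursion limit, and it can be evaluated only at month ends, as the question asks. The sketch below is one hedged way to do that with plain pandas. The names iterate_sigma, sigma_at_month_ends, tol and max_iter are hypothetical. It assumes that "month end" means the last observed date of each calendar month, that the convergence test was meant to be an absolute difference below 10**(-4), and that the window is the trailing 365 days. The formula is copied from the post as-is and is not checked for financial correctness.

```python
import numpy as np
import pandas as pd
from scipy.stats import norm


def iterate_sigma(Ve, Xt, rf, sigma0, tol=1e-4, max_iter=100):
    """Loop version of the post's recursion; NaN if it cannot start or converge."""
    Ve, Xt, rf = (np.asarray(a, dtype=float) for a in (Ve, Xt, rf))
    sig, Va = float(sigma0), Ve.copy()
    for _ in range(max_iter):
        if not sig > 0:  # zero or NaN volatility would divide by zero below
            return np.nan
        d1 = (np.log(Va / Xt) + rf + 0.5 * sig**2) / sig
        d2 = d1 - sig
        new_Va = (Ve + Xt * np.exp(-rf) * norm.cdf(d2)) / norm.cdf(d1)
        new_sig = float(np.std(new_Va))
        if abs(new_sig - sig) < tol:
            return new_sig
        sig, Va = new_sig, new_Va
    return np.nan


def sigma_at_month_ends(group, window="365D"):
    """Run iterate_sigma on the trailing window ending at each month's last date."""
    group = group.sort_index()
    dates = group.index.to_series()
    # Last observed date per calendar month (an assumption about "month end").
    month_ends = dates.groupby([group.index.year, group.index.month]).max()
    out = {}
    for end in month_ends:
        in_window = (group.index > end - pd.Timedelta(window)) & (group.index <= end)
        win = group[in_window]
        out[end] = iterate_sigma(win["Ve"], win["Xt"], win["rf"], win["sigma_E"].iloc[-1])
    return pd.Series(out, name="sigma_A")


# The two rows shown in the question.
merged = pd.DataFrame({
    "gvkey": [1004, 1004],
    "PERMNO": [54594, 54594],
    "date": pd.to_datetime(["2011-09-30", "2011-10-03"]),
    "Xt": [178.9760, 178.9760],
    "Ve": [674484.87, 613793.373],
    "sigma_E": [0.0, 30345.75],
    "rf": [0.000003, 0.000003],
})

# One result Series per (gvkey, PERMNO) pair, indexed by month-end date.
results = {
    key: sigma_at_month_ends(g.set_index("date"))
    for key, g in merged.groupby(["gvkey", "PERMNO"])
}
print(results)
```

Looping over month-end dates evaluates the function only where a result is wanted, instead of once per row, and each window keeps all four columns together. A window whose last sigma_E is zero yields NaN here rather than a division by zero.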