I am encountering an issue with the pandas `rolling` function in Python.
Specifically, I am implementing a rolling apply with a custom function, and I get an `OSError: could not get source code` error.
This is my DataFrame:
gvkey | PERMNO | date | Xt | Ve | sigma_E | rf |
---|---|---|---|---|---|---|
1004 | 54594 | 2011-09-30 | 178.9760 | 674484.87 | 0.000000 | 0.000003 |
1004 | 54594 | 2011-10-03 | 178.9760 | 613793.373 | 30345.750000 | 0.000003 |
This is the function I defined, with an iterative (recursive) helper function nested inside it:
def cal_sigma_A(x):
    ite_sig = x.sigma_E.iloc[-1]
    ite_Va = x.Ve
    def iterative_sigma(x, sig = ite_sig, Va = ite_Va):
        d1 = (np.log(Va/x.Xt) + (x.rf + 0.5*(sig**2)))/sig
        d2 = d1 - sig
        d1 = norm.cdf(d1)
        d2 = norm.cdf(d2)
        new_Va = (x.Ve + x.Xt*np.exp(-x.rf)*d2)/d1
        new_sigma = np.std(new_Va)
        if new_sigma - sig < 10*(-4):
            return new_sigma
        else:
            return iterative_sigma(x,new_sigma,new_Va)
    return iterative_sigma(x,ite_sig,ite_Va)

merged.set_index('date').groupby(['gvkey','PERMNO'],group_keys=False).rolling('365D').apply(lambda x:cal_sigma_A(x))
And this is the error:
Unexpected exception formatting exception. Falling back to standard exception
Traceback (most recent call last):
File "d:\anaconda\Lib\site-packages\IPython\core\interactiveshell.py", line 3526, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "C:\Users\28137\AppData\Local\Temp\ipykernel_11788\4089525893.py", line 1, in <module>
merged.groupby(['gvkey','PERMNO'],group_keys=False).rolling('365D').apply(lambda x:cal_sigma_A(x))
File "d:\anaconda\Lib\site-packages\pandas\core\window\rolling.py", line 1913, in apply
return super().apply(
^^^^^^^^^^^^^^
File "d:\anaconda\Lib\site-packages\pandas\core\window\rolling.py", line 1390, in apply
return self._apply(
^^^^^^^^^^^^
File "d:\anaconda\Lib\site-packages\pandas\core\window\rolling.py", line 709, in _apply
result = super()._apply(
^^^^^^^^^^^^^^^
File "d:\anaconda\Lib\site-packages\pandas\core\window\rolling.py", line 615, in _apply
return self._apply_blockwise(homogeneous_func, name, numeric_only)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "d:\anaconda\Lib\site-packages\pandas\core\window\rolling.py", line 490, in _apply_blockwise
res = homogeneous_func(arr)
^^^^^^^^^^^^^^^^^^^^^
File "d:\anaconda\Lib\site-packages\pandas\core\window\rolling.py", line 610, in homogeneous_func
result = calc(values)
^^^^^^^^^^^^
File "d:\anaconda\Lib\site-packages\pandas\core\window\rolling.py", line 607, in calc
return func(x, start, end, min_periods, *numba_args)
...
^^^^^^^^^^^^^^^^^^
File "d:\anaconda\Lib\inspect.py", line 1081, in findsource
raise OSError('could not get source code')
OSError: could not get source code
Additionally, if I only want to execute this function at the end of each month, what's the best way to do it?
I'm a newbie, so I have no idea how to debug.
If you reduce your problem, it becomes how to apply rolling on multiple columns. There are a few related problems on SO, one of them is here.
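To make that reduction concrete, here is a minimal sketch showing what `Rolling.apply` actually hands to the custom function. The two-column frame and the `probe` function are made up for illustration. With the default `raw=False`, each call receives the window of a single column as a `Series`, so an attribute lookup such as `x.sigma_E` inside the function cannot work; that is likely the underlying failure that IPython then had trouble formatting.

```python
import pandas as pd

# Hypothetical toy data, not the asker's DataFrame.
df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, 30.0]})

seen_types = []

def probe(window):
    # Record what kind of object pandas passes in for each window.
    seen_types.append(type(window).__name__)
    return window.sum()

# The function is applied column by column, never to the whole frame.
out = df.rolling(2).apply(probe)

print(set(seen_types))
print(out)
```

Each column is processed independently, so the function never sees `a` and `b` together in one call.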
You can use `rolling_apply` from the `numpy_ext` package (you'll have to install it). Tweaking your original code a little, the code below works (though you have to be careful with the recursion limit if your function keeps calling itself without returning).
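For comparison, here is a pandas-only sketch of the same idea; it is not the `numpy_ext` version. The helper `rolling_frame_apply` and its `month_end_only` flag are hypothetical names introduced here. It assumes the frame has a sorted `DatetimeIndex`, slices the full DataFrame for each trailing window so the function sees all columns at once, and can evaluate only at the last observation of each calendar month, which addresses the month-end part of the question.

```python
import pandas as pd

def rolling_frame_apply(df, window, func, month_end_only=False):
    """Call func on the DataFrame slice covering each trailing time window.

    Assumes df has a sorted DatetimeIndex. Returns a Series indexed by
    the dates at which func was evaluated.
    """
    dates = df.index
    if month_end_only:
        # Keep only the last available observation of each calendar month.
        months = dates.to_period("M")
        dates = dates[~months.duplicated(keep="last")]
    offset = pd.Timedelta(window)
    out = {}
    for t in dates:
        # Same half-open window as rolling('365D'): (t - window, t].
        win = df.loc[(df.index > t - offset) & (df.index <= t)]
        out[t] = func(win)
    return pd.Series(out)

# Toy data for illustration only.
df = pd.DataFrame(
    {"Ve": [1.0, 2.0, 3.0, 4.0], "Xt": [10.0, 10.0, 10.0, 10.0]},
    index=pd.to_datetime(["2011-09-29", "2011-09-30", "2011-10-03", "2011-10-31"]),
)

# The function receives every column of the window, so it can combine them.
ratio = rolling_frame_apply(
    df, "365D", lambda w: (w["Ve"] / w["Xt"]).mean(), month_end_only=True
)
print(ratio)
```

For the original data, the helper could be called once per `(gvkey, PERMNO)` group after `set_index('date')`. The explicit Python loop is slow on large frames, so this is only a starting point. Separately, note that `10*(-4)` in the original function evaluates to `-40`; a tolerance such as `10**(-4)` was probably intended, and with the current condition the recursion may never terminate before hitting the recursion limit.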