I'm running into problems when writing a dataframe to HDF5 in small chunks via pd.HDFStore('mystore.h5', mode='a').append(my_frame, chunk). The chunks all have the same columns and types (they come from the same dataframe). It works for many chunks, then fails partway through with:
ValueError: cannot match existing table structure for [Net_Bal_Amt,Loan_Current_Rate] on appending data
I printed the dataframe chunks that caused the failure. The one thing they have in common is that every value in a specific column is None (they are null in the source). I'm not sure how to correct this. The values should stay None, NaN, or null, as long as they remain empty. Thanks.
Traceback (most recent call last):
File "[...]\Anaconda3\lib\site-packages\pandas\io\pytables.py", line 3381, in create_axes
b, b_items = by_items.pop(items)
KeyError: ('Net_Bal_Amt', 'Loan_Current_Rate')
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "[...]\crd_test.py", line 8, in <module>
credit.CRD.hdf_install(overwrite=True, tablenames=['loans_uscrd', 'loans_uscrd_a'])
File "[...]\credit_base.py", line 62, in hdf_install
cls._hdf_creation(map_)
File "[...]\credit_base.py", line 80, in _hdf_creation
cls._hdf_processing(v, chunk)
File "[...]\credit_base.py", line 88, in _hdf_processing
cls.crd.append(frame, chunk)
File "[...]\Anaconda3\lib\site-packages\pandas\io\pytables.py", line 903, in append
**kwargs)
File "[...]\lib\site-packages\pandas\io\pytables.py", line 1259, in _write_to_group
s.write(obj=value, append=append, complib=complib, **kwargs)
File "[...]\lib\site-packages\pandas\io\pytables.py", line 3751, in write
**kwargs)
File "[...]\Anaconda3\lib\site-packages\pandas\io\pytables.py", line 3388, in create_axes
item in items))
ValueError: cannot match existing table structure for [Net_Bal_Amt,Loan_Current_Rate] on appending data
dtypes of the existing table:
pd.read_hdf(r'[...]\crd_test.h5','loans').dtypes
Out[4]:
Customer_Id object
As_of_Date datetime64[ns]
Net_Bal_Amt float64
Loan_Current_Rate float64
dtype: object
Versions: PyTables 3.1.1, pandas 0.15.2, Python 3.4
dtypes of the chunk being appended when it crashes:
Customer_Id object
As_of_Date datetime64[ns]
Net_Bal_Amt float64
Loan_Current_Rate object
dtype: object
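
Comparing the two dtype listings, Loan_Current_Rate is float64 in the stored table but object in the failing chunk, which fits the all-None observation. One possible workaround, offered as a minimal sketch rather than a confirmed fix, is to cast each chunk to the stored dtypes before appending. The helper name align_chunk, the STORED_DTYPES mapping, and the sample chunk are hypothetical; only the column names and dtypes come from the output above.

```python
import pandas as pd

# Dtypes reported for the table already in the store (taken from the post).
STORED_DTYPES = {"Net_Bal_Amt": "float64", "Loan_Current_Rate": "float64"}


def align_chunk(chunk, dtypes=STORED_DTYPES):
    """Return a copy of chunk with the listed columns cast to the given dtypes."""
    out = chunk.copy()
    for col, dtype in dtypes.items():
        # Casting an all-None object column to float64 turns None into NaN.
        out[col] = out[col].astype(dtype)
    return out


# Hypothetical chunk in which every Loan_Current_Rate value is missing.
bad_chunk = pd.DataFrame({
    "Customer_Id": ["a", "b"],
    "As_of_Date": pd.to_datetime(["2015-01-31", "2015-01-31"]),
    "Net_Bal_Amt": [100.0, 250.5],
    "Loan_Current_Rate": [None, None],
})

fixed_chunk = align_chunk(bad_chunk)
print(bad_chunk.dtypes["Loan_Current_Rate"], "->", fixed_chunk.dtypes["Loan_Current_Rate"])
```

The aligned chunk would then be passed to the store's append call in place of the original one. The missing values stay empty (as NaN), which matches the requirement that they remain null.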