I have a record array with a fixed-size 2×2 item and 10 rows, so the column is 10×2×2. I would like to assign a constant to the whole column. A NumPy array broadcasts a scalar value correctly, but this does not work in h5py.
import numpy as np
import h5py
dt=np.dtype([('a',('f4',(2,2)))])
# h5py array
h5a=h5py.File('/tmp/t1.h5','w')['/'].require_dataset('test',dtype=dt,shape=(10,))
# numpy for comparison
npa=np.zeros((10,),dtype=dt)
h5a['a']=np.nan
# ValueError: changing the dtype of a 0d array is only supported if the itemsize is unchanged
npa['a']=np.nan
# numpy: broadcasts, OK
In fact, I can't find a way to assign the column even without broadcasting:
h5a['a']=np.full((10,2,2),np.nan)
# ValueError: When changing to a larger dtype, its size must be a divisor of the total size in bytes of the last axis of the array
Not even a single row:
h5a['a',0]=np.full((2,2),np.nan)
# ValueError: When changing to a larger dtype, its size must be a divisor of the total size in bytes of the last axis of the array
What is the problem here?
We can set a like-sized NumPy array with the same dtype, and assign it to the dataset. We can also make a single-element array with the correct dtype, and assign that.
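The approach described above could look roughly like the following. This is a minimal sketch, not the original answer's code: the in-memory file (core driver with backing_store=False), the placeholder file name sketch.h5, and the variable names d, arr, one and result are assumptions made for illustration.

```python
import numpy as np
import h5py

dt = np.dtype([('a', ('f4', (2, 2)))])

# In-memory file so nothing is written to disk; 'sketch.h5' is a placeholder name.
with h5py.File('sketch.h5', 'w', driver='core', backing_store=False) as f:
    d = f.require_dataset('test', dtype=dt, shape=(10,))

    # Let NumPy do the broadcasting inside a like-sized structured array...
    arr = np.zeros((10,), dtype=dt)
    arr['a'] = np.nan
    # ...then write the whole structured array to the dataset.
    d[:] = arr

    # A single-element structured array, written to a one-row slice.
    one = np.zeros((1,), dtype=dt)
    one['a'] = 1.0
    d[0:1] = one

    # Read everything back as a NumPy structured array.
    result = d[:]
```

In both writes the value is a structured array whose dtype matches the dataset, so h5py does not have to reinterpret a plain float array as the compound type.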
With h5py we can index with record and field in a single index, whereas an ndarray requires a double index, as with d[0]['a']. h5py tries to imitate ndarray indexing, but is not exactly the same. We just have to accept that.
Edit:
The [118] assignment can also be
The dt here has just one field, but I think this should work with multiple fields. The key is that the value has to be a structured array that matches the d field specification.
I just noticed in the docs that they are trying to move away from the d[1,'a'] indexing, instead using d[1]['a']. But for assignment that doesn't seem to work: no error, just no action. I think d[1] or d['a'] is a copy, the equivalent of advanced indexing for arrays. For a structured array those are views.
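For comparison, here is a small NumPy-only sketch of the view behaviour on a structured array. It shows only the ndarray side; the h5py side (where the indexed result appears to be a fresh copy) is not exercised here, and the names npa and col are just illustrative.

```python
import numpy as np

dt = np.dtype([('a', ('f4', (2, 2)))])
npa = np.zeros((10,), dtype=dt)

# Indexing a record and then a field reaches into the original array,
# so this chained assignment is visible in npa.
npa[1]['a'] = 7.0

# Selecting a field gives a view as well; writing through it changes npa.
col = npa['a']
col[0] = 3.0

shares = np.shares_memory(col, npa)
```

If the intermediate object were a copy, as seems to be the case for an h5py dataset, the same chained assignment would modify only the temporary and leave the stored data untouched.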