View object array under different dtype

I would like to view an object array with a dtype that encapsulates entire rows:

data = np.array([['a', '1'], ['a', 'z'], ['b', 'a']], dtype=object)
dt = np.dtype([('x', object), ('y', object)])
data.view(dt)

I get an error:

TypeError: Cannot change data-type for object array.

I have tried the following workarounds:

dt2 = np.dtype([('x', object, 2)])
data.view(dt2)
data.view(np.uint8).view(dt)
data.view(np.void).view(dt)

All cases result in the same error. Is there some way to view an object array with a different dtype?

I have also tried a more general approach (this is for reference, since it's functionally identical to what's shown above):

dt = np.dtype(','.join(data.dtype.char * data.shape[1]))
dt2 = np.dtype([('x', data.dtype, data.shape[1])])
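To make the intent of these two dtypes concrete, here is a small sketch that only inspects them. It assumes the same data array as above; row_bytes is a name introduced here for illustration. Both dtypes are meant to span exactly one row of data, which is why a view with either of them should, in principle, be possible.

```python
import numpy as np

data = np.array([['a', '1'], ['a', 'z'], ['b', 'a']], dtype=object)

# One object field per column; NumPy names the fields f0, f1, ... automatically.
dt = np.dtype(','.join(data.dtype.char * data.shape[1]))

# A single field 'x' that holds the whole row as a sub-array.
dt2 = np.dtype([('x', data.dtype, data.shape[1])])

# Size in bytes of one row of the original array (illustrative helper name).
row_bytes = data.shape[1] * data.dtype.itemsize

print(dt.names, dt.itemsize == row_bytes)
print(dt2['x'].shape, dt2.itemsize == row_bytes)
```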

There is 1 answer

Answered by Mad Physicist

It seems that you can force the new dtype onto the data using np.array, asking it to avoid a copy when the array is C-contiguous:

view = np.array(data, dtype=dt, copy=not data.flags['C_CONTIGUOUS'])

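This is a quick and dirty approach, but it does not produce a view: the data gets copied, and dt2 does not get applied correctly, because each element is expanded into a full record instead of each row being reinterpreted. Note also that in NumPy 2.0 and later, copy=False raises a ValueError when a copy cannot be avoided, rather than copying silently: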

>>> print(view.base)
None
>>> np.array(data, dtype=dt2, copy=not data.flags['C_CONTIGUOUS'])
array([[(['a', 'a'],), (['1', '1'],)],
       [(['a', 'a'],), (['z', 'z'],)],
       [(['b', 'b'],), (['a', 'a'],)]], dtype=[('x', 'O', (2,))])

For a more correct approach (in some circumstances), you can use the raw np.ndarray constructor:

real_view = np.ndarray(data.shape[:1], dtype=dt2, buffer=data)

This makes a true view of the data:

>>> real_view
array([(['a', '1'],), (['a', 'z'],), (['b', 'a'],)], dtype=[('x', 'O', (2,))])
>>> real_view.base is data
True

Note that this only works when the data has C-contiguous rows, since the np.ndarray constructor reinterprets the existing memory buffer directly.
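The contiguity requirement can be handled explicitly. The following is a minimal sketch rather than a definitive implementation: rows_as_records is a hypothetical helper name, and the copying fallback is an assumption about a reasonable alternative, not something from the approach above. The fallback only supports dtypes with one field per column (such as dt), not the sub-array form dt2.

```python
import numpy as np

def rows_as_records(data, dt):
    """Return a 1-D structured array with one record per row of `data`.

    Hypothetical helper: a true view when `data` is C-contiguous,
    otherwise a field-by-field copy.
    """
    if data.flags['C_CONTIGUOUS']:
        # Reinterpret the existing buffer with the raw constructor: no copy.
        return np.ndarray(data.shape[:1], dtype=dt, buffer=data)
    # Fallback (assumption): copy each column into the matching field.
    # Requires exactly one field per column of `data`.
    out = np.empty(data.shape[0], dtype=dt)
    for i, name in enumerate(dt.names):
        out[name] = data[:, i]
    return out

data = np.array([['a', '1'], ['a', 'z'], ['b', 'a']], dtype=object)
dt = np.dtype([('x', object), ('y', object)])

view = rows_as_records(data, dt)                       # shares memory with data
copy = rows_as_records(np.asfortranarray(data), dt)    # independent copy
```

Writes through view are visible in data, while copy is independent of its input, so callers that need write-through semantics should check data.flags['C_CONTIGUOUS'] themselves.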