Appropriate formatting of NumPy dtypes for arrays within Structured Arrays

Question

320 views Asked by SomeoneElse At 06 January 2025 at 04:41

I am trying to create a NumPy structured array, but I can't figure out the correct way to specify the field names and field types for arrays nested within the array. I keep getting the error "setting an array element with a sequence", yet I can convert the same list into an unstructured array without a problem, so the problem appears to be in how the dtypes of the sub-arrays are specified.

Code

import numpy as np
import scipy.stats as sts

#Number of People
numOfP=5
#Array of people's ids
ids=np.array(range(0,numOfP),dtype='int64')
#People object
temp=[]
peoType=np.dtype({
    'names':
    ['id','value','ability','helpNeeded','helpOut','helpIn'],
    'formats':
    ['int64','float64','float32','float32','object','object'],
    'aligned':True
})
#Populate people with attributes
for id in ids:
    temp.append([
        #0 - id
        id,
        #1 - people's value
        sts.lognorm.rvs(.5)*100000,
        #2 - people's ability
        (1/(sts.lognorm.rvs(.99)+1)),
        #3 - help needed
        ((sts.lognorm.rvs(.99))*100),
        #4 - people helped
#This is where the problem is, if I get rid of these arrays, and the associated dtypes, there are no errors
        np.zeros(numOfP),
        #5 - people who helped you
        np.zeros(numOfP)
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    ])
peoType
temp
#doing np.array(temp), without the dtype works
temp=np.asarray(temp)      #doesn't change anything
temp
peo=np.array(temp,peoType) #where things break

dtype

{'names': ['id', 'value', 'ability', 'helpNeeded', 'helpOut', 'helpIn'],
 'formats': ['int64', 'float64', 'float32', 'float32', 'object', 'object'],
 'aligned': True}

Error message

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
e:\xampp\htdocs\Math2Code\cooperate.py in 
     52     ])
     53 peoType
---> 54 peo=np.array(temp,peoType)

ValueError: setting an array element with a sequence.

Contents of temp List

[[0,
  86381.14170220899,
  0.12974876676966007,
  49.537761763004056,
  array([0., 0., 0., 0., 0.]),
  array([0., 0., 0., 0., 0.])],
 [1,
  95532.94886721167,
  0.3886984384013719,
  49.9244719570076,
  array([0., 0., 0., 0., 0.]),
  array([0., 0., 0., 0., 0.])],
 [2,
  53932.09250542036,
  0.6518993291826463,
  92.72979425242384,
  array([0., 0., 0., 0., 0.]),
  array([0., 0., 0., 0., 0.])],
 [3,
  161978.14156816195,
  0.49130827569636754,
  56.44742176255372,
  array([0., 0., 0., 0., 0.]),
  array([0., 0., 0., 0., 0.])],
 [4,
  38679.21128565417,
  0.6979042712239539,
  132.35562828412765,
  array([0., 0., 0., 0., 0.]),
  array([0., 0., 0., 0., 0.])]]

Contents of temp after conversion to an unstructured array

array([[0, 119297.86954924025, 0.38806815548557444, 487.4877681755314,
        array([0., 0., 0., 0., 0.]), array([0., 0., 0., 0., 0.])],
       [1, 75215.69897153028, 0.5387632600167043, 83.27487024641633,
        array([0., 0., 0., 0., 0.]), array([0., 0., 0., 0., 0.])],
       [2, 88986.345811315, 0.2533847055636237, 48.52795408229029,
        array([0., 0., 0., 0., 0.]), array([0., 0., 0., 0., 0.])],
       [3, 80539.81607335186, 0.27683829962996226, 226.25682883690638,
        array([0., 0., 0., 0., 0.]), array([0., 0., 0., 0., 0.])],
       [4, 40429.11615682778, 0.5748035151329913, 226.69671215072958,
        array([0., 0., 0., 0., 0.]), array([0., 0., 0., 0., 0.])]],
      dtype=object)

Output of the peoType np.dtype variable when used to create a 2-element np.zeros array:

Input

np.zeros(2, peoType)

Output

array([(0, 0., 0., 0., 0, 0), (0, 0., 0., 0., 0, 0)],
      dtype={'names':['id','value','ability','helpNeeded','helpOut','helpIn'], 'formats':['<i8','<f8','<f4','<f4','O','O'], 'offsets':[0,8,16,20,24,32], 'itemsize':40, 'aligned':True})

Why are the rows wrapped in tuples?

Original Q&A

There are 2 answers

**NaN** · Answer 1 · 2020-09-10T02:20:09+00:00

This is too big for a comment, but it demonstrates that the input must be a tuple to produce the structured array. If vals is a list, you will get an error. The sample below uses one of your inputs.

vals = (2,
  53932.09250542036,
  0.6518993291826463,
  92.72979425242384,
  np.array([0., 0., 0., 0., 0.]),
  np.array([0., 0., 0., 0., 0.]))

dt={'names':['id','value','ability','helpNeeded','helpOut','helpIn'], 'formats':['<i8','<f8','<f4','<f4','O','O']}

a = np.asarray(vals, dtype=dt)

a
array((2,  53932.09,  0.65,  92.73, array([ 0.00,  0.00,  0.00,  0.00,  0.00]), array([ 0.00,  0.00,  0.00,  0.00,  0.00])),
      dtype=[('id', '<i8'), ('value', '<f8'), ('ability', '<f4'), ('helpNeeded', '<f4'), ('helpOut', 'O'), ('helpIn', 'O')])
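
The list-versus-tuple difference can be checked directly. The following is a minimal sketch, not the answerer's code: the names `as_tuple`, `as_list`, and `list_failed` are made up for illustration, and the dtype is shortened to three fields. The exact exception type and message may differ between NumPy versions, so the sketch catches both `ValueError` and `TypeError`.

```python
import numpy as np

# Shortened dtype (assumption: three fields are enough to show the effect).
dt = np.dtype([('id', 'i8'), ('value', 'f8'), ('helpOut', 'O')])

as_tuple = (2, 53932.09250542036, np.zeros(5))
as_list = list(as_tuple)

# A tuple is read as ONE record, so the result is a 0-d structured array.
rec = np.array(as_tuple, dtype=dt)

# A list is read as a sequence of separate elements, which does not fit
# the record layout, so NumPy raises instead of building a record.
list_failed = False
try:
    np.array(as_list, dtype=dt)
except (ValueError, TypeError):
    list_failed = True

print(rec.shape, list_failed)
```

Wrapping the tuple in a list, as in `np.array([as_tuple], dtype=dt)`, should give a 1d array holding a single record.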

**hpaulj** · Answer 2 · 2020-09-10T17:31:25+00:00

Your compound dtype:

In [33]: peoType=np.dtype({
    ...:     'names':
    ...:     ['id','value','ability','helpNeeded','helpOut','helpIn'],
    ...:     'formats':
    ...:     ['int64','float64','float32','float32','object','object'],
    ...:     'aligned':True
    ...: })

A sample structured array with that dtype:

In [34]: arr = np.zeros(2, peoType)
In [35]: arr
Out[35]: 
array([(0, 0., 0., 0., 0, 0), (0, 0., 0., 0., 0, 0)],
      dtype={'names':['id','value','ability','helpNeeded','helpOut','helpIn'], 'formats':['<i8','<f8','<f4','<f4','O','O'], 'offsets':[0,8,16,20,24,32], 'itemsize':40, 'aligned':True})
In [36]: arr['id']
Out[36]: array([0, 0])
In [37]: arr['helpOut']
Out[37]: array([0, 0], dtype=object)

The parentheses () mark individual records. This is a 1d array of records, not a 2d array of rows and columns, and the notation tries to make that clear. Operations like reshape and broadcasting don't cross that record boundary.
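
As a rough illustration of that record boundary, the sketch below uses a smaller two-field dtype (an assumption made to keep the example short). Reshaping rearranges whole records, while the fields stay together inside each record.

```python
import numpy as np

# Two-field dtype chosen only for illustration.
dt = np.dtype([('id', 'i8'), ('value', 'f8')])
arr = np.zeros(3, dtype=dt)

# The array is 1d: three records, not a 3x2 table.
print(arr.shape)

# Fields are accessed by name; each field is itself a length-3 array.
arr['id'] = [10, 11, 12]
print(arr['id'])

# A single element is one record (a numpy.void scalar).
first = arr[0]
print(type(first))

# Reshape moves records around; it never splits a record into its fields.
col = arr.reshape(3, 1)
print(col.shape, col.dtype.names)
```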

Make your temp list:

In [39]: array = np.array
In [40]: temp=[[0,
    ...:   86381.14170220899,
    ...:   0.12974876676966007,
    ...:   49.537761763004056,
    ...:   array([0., 0., 0., 0., 0.]),
    ...:   array([0., 0., 0., 0., 0.])],
    ...:  [1,
    ...:   95532.94886721167,
    ...:   0.3886984384013719,
    ...:   49.9244719570076,
    ...:   array([0., 0., 0., 0., 0.]),
    ...:   array([0., 0., 0., 0., 0.])],
    ...:  [2,
    ...:   53932.09250542036,
    ...:   0.6518993291826463,
    ...:   92.72979425242384,
    ...:   array([0., 0., 0., 0., 0.]),
    ...:   array([0., 0., 0., 0., 0.])],
    ...:  [3,
    ...:   161978.14156816195,
    ...:   0.49130827569636754,
    ...:   56.44742176255372,
    ...:   array([0., 0., 0., 0., 0.]),
    ...:   array([0., 0., 0., 0., 0.])],
    ...:  [4,
    ...:   38679.21128565417,
    ...:   0.6979042712239539,
    ...:   132.35562828412765,
    ...:   array([0., 0., 0., 0., 0.]),
    ...:   array([0., 0., 0., 0., 0.])]]

Make a structured array from the list, first converting it into a list of tuples, as structured arrays require:

In [42]: arr = np.array([tuple(row) for row in temp], peoType)
In [43]: arr
Out[43]: 
array([(0,  86381.14170221, 0.12974876,  49.53776 , array([0., 0., 0., 0., 0.]), array([0., 0., 0., 0., 0.])),
       (1,  95532.94886721, 0.38869843,  49.924473, array([0., 0., 0., 0., 0.]), array([0., 0., 0., 0., 0.])),
       (2,  53932.09250542, 0.65189934,  92.7298  , array([0., 0., 0., 0., 0.]), array([0., 0., 0., 0., 0.])),
       (3, 161978.14156816, 0.49130827,  56.447422, array([0., 0., 0., 0., 0.]), array([0., 0., 0., 0., 0.])),
       (4,  38679.21128565, 0.6979043 , 132.35562 , array([0., 0., 0., 0., 0.]), array([0., 0., 0., 0., 0.]))],
      dtype={'names':['id','value','ability','helpNeeded','helpOut','helpIn'], 'formats':['<i8','<f8','<f4','<f4','O','O'], 'offsets':[0,8,16,20,24,32], 'itemsize':40, 'aligned':True})
In [44]: arr['helpOut']
Out[44]: 
array([array([0., 0., 0., 0., 0.]), array([0., 0., 0., 0., 0.]),
       array([0., 0., 0., 0., 0.]), array([0., 0., 0., 0., 0.]),
       array([0., 0., 0., 0., 0.])], dtype=object)

The object dtype field is a 1d array of objects - arrays.
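
Because each element of the object field is an independent array, the field as a whole has no second dimension. If the inner arrays happen to share a length, they can be combined into a regular 2d array. This is a sketch under that assumption, with a shortened dtype and illustrative names (`rows`, `field`, `stacked`):

```python
import numpy as np

# Shortened dtype with one object field (assumption for brevity).
dt = np.dtype([('id', 'i8'), ('helpOut', 'O')])
rows = [(i, np.zeros(5)) for i in range(3)]
arr = np.array(rows, dtype=dt)

field = arr['helpOut']
print(field.shape, field.dtype)   # 1d array of Python objects

# Each element is a separate ndarray, so it is indexed in two steps.
field[0][2] = 1.0

# np.stack copies the inner arrays into one numeric 2d array;
# this only works when all inner arrays have the same shape.
stacked = np.stack(list(field))
print(stacked.shape, stacked.dtype)
```

Note that `stacked` is a copy, so later changes to it are not reflected in the structured array.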

If all those object fields contained the same size arrays, we could replace them with multi-item fields:

In [50]: dt=np.dtype({
    ...:     'names':
    ...:     ['id','value','ability','helpNeeded','helpOut','helpIn'],
    ...:     'formats':
    ...:     ['int64','float64','float32','float32','5float','5float'],
    ...:     'aligned':True
    ...: })
In [51]: arr = np.array([tuple(row) for row in temp], dt)
In [52]: arr
Out[52]: 
array([(0,  86381.14170221, 0.12974876,  49.53776 , [0., 0., 0., 0., 0.], [0., 0., 0., 0., 0.]),
       (1,  95532.94886721, 0.38869843,  49.924473, [0., 0., 0., 0., 0.], [0., 0., 0., 0., 0.]),
       (2,  53932.09250542, 0.65189934,  92.7298  , [0., 0., 0., 0., 0.], [0., 0., 0., 0., 0.]),
       (3, 161978.14156816, 0.49130827,  56.447422, [0., 0., 0., 0., 0.], [0., 0., 0., 0., 0.]),
       (4,  38679.21128565, 0.6979043 , 132.35562 , [0., 0., 0., 0., 0.], [0., 0., 0., 0., 0.])],
      dtype={'names':['id','value','ability','helpNeeded','helpOut','helpIn'], 'formats':['<i8','<f8','<f4','<f4',('<f8', (5,)),('<f8', (5,))], 'offsets':[0,8,16,20,24,64], 'itemsize':104, 'aligned':True})
In [53]: arr['helpOut']
Out[53]: 
array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])

Now that field produces a 2d array.
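
With fixed-size fields the array can also be preallocated and filled in place, without building a list of tuples first. The sketch below spells the subarray fields as `(name, type, shape)` triples, which is an alternative to the '5float' string; the variable names and the "who helped whom" interpretation of the indices are assumptions for illustration.

```python
import numpy as np

num_people = 5

# Subarray fields written as (name, type, shape) instead of '5float'.
dt = np.dtype([
    ('id', 'i8'),
    ('value', 'f8'),
    ('helpOut', 'f8', (num_people,)),
    ('helpIn', 'f8', (num_people,)),
])

# Preallocate all records, then fill the fields by name.
people = np.zeros(num_people, dtype=dt)
people['id'] = np.arange(num_people)

# The field is a (num_people, num_people) view, so 2d indexing works.
people['helpOut'][0, 3] = 1.0   # hypothetical: person 0 helped person 3
people['helpIn'][3, 0] = 1.0    # hypothetical: person 3 was helped by person 0

# Ordinary array operations apply to the field as a whole.
total_out = people['helpOut'].sum(axis=1)
print(people['helpOut'].shape, total_out)
```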
