Numpy masked_array sum

Question

3.3k views Asked by orange At 30 December 2024 at 22:24

I would expect the result of a summation for a fully masked array to be zero, but instead "masked" is returned. How can I get the function to return zero?

>>> a = np.asarray([1, 2, 3, 4])
>>> b = np.ma.masked_array(a, mask=~(a > 2))
>>> b
masked_array(data = [-- -- 3 4],
             mask = [ True  True False False],
       fill_value = 999999)

>>> b.sum()
7
>>> b = np.ma.masked_array(a, mask=~(a > 5))
>>> b
masked_array(data = [-- -- -- --],
             mask = [ True  True  True  True],
       fill_value = 999999)

>>> b.sum()
masked
>>> np.ma.sum(b)
masked
>>>

Here's another unexpected thing:

>>> b.sum() + 3
masked

There is 1 answer

**hpaulj** · Accepted Answer · 2016-12-14T00:32:40+00:00

In your last case (b1 here is your fully masked b):

In [197]: bs=b1.sum()
In [198]: bs.data
Out[198]: array(0.0)
In [199]: bs.mask
Out[199]: array(True, dtype=bool)
In [200]: repr(bs)
Out[200]: 'masked'
In [201]: str(bs)
Out[201]: '--'

If I specify keepdims, I get a different array:

In [208]: bs=b1.sum(keepdims=True)
In [209]: bs
Out[209]: 
masked_array(data = [--],
             mask = [ True],
       fill_value = 999999)
In [210]: bs.data
Out[210]: array([0])
In [211]: bs.mask
Out[211]: array([ True], dtype=bool)

Here's the relevant part of the sum code:

def sum(self, axis=None, dtype=None, out=None, keepdims=np._NoValue):
    kwargs = {} if keepdims is np._NoValue else {'keepdims': keepdims}

    _mask = self._mask
    newmask = _check_mask_axis(_mask, axis, **kwargs)
    # No explicit output
    if out is None:
        result = self.filled(0).sum(axis, dtype=dtype, **kwargs)
        rndim = getattr(result, 'ndim', 0)
        if rndim:
            result = result.view(type(self))
            result.__setmask__(newmask)
        elif newmask:
            result = masked
        return result
    ....

It's the

 newmask = np.ma.core._check_mask_axis(b1.mask, axis=None)
 ...
 elif newmask: result = masked

lines that produce the masked value in your case. newmask is True when all values are masked, and False if some are not. The choice to return np.ma.masked is deliberate.
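
The all-masked test can be approximated with public functions. This is a minimal sketch, assuming that the private helper _check_mask_axis reduces the mask with all(); np.ma.getmaskarray is a stand-in used for illustration, not the code path that sum actually takes.

```python
import numpy as np

a = np.asarray([1, 2, 3, 4])
partly_masked = np.ma.masked_array(a, mask=~(a > 2))
fully_masked = np.ma.masked_array(a, mask=~(a > 5))

# getmaskarray always returns a full boolean array, even if no mask was set.
# Reducing it with all() tells us whether every element is masked.
partly_all_masked = bool(np.ma.getmaskarray(partly_masked).all())
fully_all_masked = bool(np.ma.getmaskarray(fully_masked).all())

print(partly_all_masked)
print(fully_all_masked)
```

Only the second array should report that everything is masked, which is the situation in which sum returns masked instead of a number.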

The core of the calculation is:

In [218]: b1.filled(0).sum()
Out[218]: 0

The rest of the code decides whether to return a scalar or a masked array.
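
If a plain number is wanted even when everything is masked, one option is to do that core calculation directly. The sketch below reuses the arrays from the question; filling with 0 is a choice that suits sums specifically and would not be appropriate for, say, a product or a mean.

```python
import numpy as np

a = np.asarray([1, 2, 3, 4])
partly_masked = np.ma.masked_array(a, mask=~(a > 2))
fully_masked = np.ma.masked_array(a, mask=~(a > 5))

# filled(0) returns an ordinary ndarray with masked entries replaced by 0,
# so the sum is an ordinary NumPy scalar and never np.ma.masked.
partial_total = partly_masked.filled(0).sum()
empty_total = fully_masked.filled(0).sum()

print(partial_total)
print(empty_total)
```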

============

And for your addition:

In [232]: np.ma.masked+3
Out[232]: masked

It looks like np.ma.masked is a special constant that propagates itself through calculations, somewhat like np.nan.
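
Because the masked result propagates, another way to get zero is to check for it after summing. This is a rough sketch; sum_or_zero is a hypothetical helper name, not part of NumPy, and it only handles the axis=None case discussed here.

```python
import numpy as np

def sum_or_zero(arr):
    """Sum a masked array, returning 0 when every element is masked.

    Hypothetical helper for illustration; not a NumPy function.
    """
    total = arr.sum()
    # A full reduction of a fully masked array yields np.ma.masked.
    return 0 if np.ma.is_masked(total) else total

a = np.asarray([1, 2, 3, 4])
partly_masked = np.ma.masked_array(a, mask=~(a > 2))
fully_masked = np.ma.masked_array(a, mask=~(a > 5))

# The raw sum stays masked even after further arithmetic.
still_masked = np.ma.is_masked(fully_masked.sum() + 3)

print(still_masked)
print(sum_or_zero(partly_masked))
print(sum_or_zero(fully_masked) + 3)
```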
