Where 'absent' can mean either nan
or np.masked
, whichever is easiest to implement this with.
For instance:
>>> from numpy import nan
>>> do_it([1, nan, nan, 2, nan, 3, nan, nan, 4, 3, nan, 2, nan])
array([1, 1, 1, 2, 2, 3, 3, 3, 4, 3, 3, 2, 2])
# each nan is replaced with the first non-nan value before it
>>> do_it([nan, nan, 2, nan])
array([nan, nan, 2, 2])
# don't care too much about the outcome here, but this seems sensible
I can see how you'd do this with a for loop:
def do_it(a):
res = []
last_val = nan
for item in a:
if not np.isnan(item):
last_val = item
res.append(last_val)
return np.asarray(res)
Is there a faster way to vectorize it?
Working from @Benjamin's deleted solution, everything is great if you work with indices
Comparing solution runtime:
So both this and @user2357112's solution blow the solution in the question out of the water, but this has the slight edge over @user2357112 when there are high numbers of
nan
s