I have an array ab of shape (2, 12):
import numpy as np
ab = np.array([[0, 3, 6, 3, np.nan, 3, 7, 3, 5, 4, 3, np.nan],
               [5, 9, np.nan, 3, 7, 5, 3, 6, 4, np.nan, np.nan, np.nan]])
I am trying to get the longest run of consecutive columns in which both rows are non-NaN. For the example above, the output should be:
[[3. 7. 3. 5.]
 [5. 3. 6. 4.]]
I used the solution proposed for a similar question, "Find longest subsequence without NaN values in set of series", after converting my array into a DataFrame:
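import pandas as pd
df = pd.DataFrame(ab.T)
# Index labels of the rows of df (columns of ab) that contain no NaN
seq = np.array(df.dropna(how='any').index)
# Split wherever the labels stop being consecutive, then keep the longest piece
longest_seq = max(np.split(seq, np.where(np.diff(seq)!=1)[0]+1), key=len)
print(df.iloc[longest_seq])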
     0    1
5  3.0  5.0
6  7.0  3.0
7  3.0  6.0
8  5.0  4.0
However, is it possible to find a solution using numpy only?
Thanks
I am not sure your code handles the case where the length of such sequences differs from one row to the other. Instead, I would proceed row-by-row:
I am not too familiar with numpy, so I took your question as an exercise. There are probably many ways to improve that code.
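For what it is worth, here is a minimal NumPy-only sketch covering both readings of the question: the longest run of columns where both rows are valid (what the question asks for), and the longest valid run of each row taken separately (the row-by-row reading). The helper name longest_true_run is made up for illustration and is not a NumPy function; on ties it keeps the first longest run, which is an assumption about the desired behaviour.

```python
import numpy as np

def longest_true_run(mask):
    """Return (start, stop) of the longest run of True in a 1-D boolean mask.

    Hypothetical helper, not part of NumPy. stop is exclusive, so the run
    is mask[start:stop]. Returns (0, 0) if the mask has no True at all.
    """
    # Pad with False on both sides so every run has a detectable start and end.
    padded = np.concatenate(([False], mask, [False]))
    # After diff, +1 marks the first index of a run, -1 the index just past it.
    edges = np.diff(padded.astype(np.int8))
    starts = np.flatnonzero(edges == 1)
    stops = np.flatnonzero(edges == -1)
    if starts.size == 0:
        return 0, 0
    best = np.argmax(stops - starts)  # argmax keeps the first run on ties
    return int(starts[best]), int(stops[best])

ab = np.array([[0, 3, 6, 3, np.nan, 3, 7, 3, 5, 4, 3, np.nan],
               [5, 9, np.nan, 3, 7, 5, 3, 6, 4, np.nan, np.nan, np.nan]])

# Reading 1: columns where neither row is NaN.
valid = ~np.isnan(ab).any(axis=0)
start, stop = longest_true_run(valid)
joint = ab[:, start:stop]
print(joint)

# Reading 2: each row on its own; the results may differ in length.
per_row = [row[slice(*longest_true_run(~np.isnan(row)))] for row in ab]
print(per_row)
```

The first print should give the 2x4 block from the question. The per-row results are returned as a list rather than a 2-D array because the runs need not have the same length.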