I am doing SPC analysis using numpy/pandas.
Part of this is checking data series against the Nelson rules and the Western Electric rules.
For instance (rule 2 from the Nelson rules): Check if nine (or more) points in a row are on the same side of the mean.
Now I could simply implement checking a rule like this by iterating over the array.
- But before I do that: does numpy/pandas have a way to do this without iteration?
- In any case: What is the "numpy-ic" way to implement a check like the one described above?
As I mentioned in a comment, you may want to try using some stride tricks.
First, let's make an array of the signs of your anomalies (the deviations from the mean). We can store it as `np.int8` to save some space.

Now for the strides. If you want to consider `N` consecutive points, you'll use a strided view of that sign array. That gives us a `(x.size, N)` rolling array: the first row corresponds to `x[0:N]`, the second to `x[1:N+1]`, and so on. Of course, the last `N-1` rows will be meaningless, so from now on we'll use only the first `x.size-N+1` rows.

Let's sum along the rows.
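The steps so far can be sketched roughly as follows. This is an illustration, not the original snippet: the function name `rolling_sign_sums` and the variable names are hypothetical, and the view is built directly with `x.size-N+1` rows (rather than building `x.size` rows and trimming) so that the meaningless trailing rows are never exposed.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided


def rolling_sign_sums(x, N):
    """Sum the deviation signs over every window of N consecutive points.

    Hypothetical helper name; a sketch of the approach described above.
    """
    x = np.asarray(x, dtype=float)
    if not 1 <= N <= x.size:
        raise ValueError("N must be between 1 and x.size")

    # +1 above the mean, -1 below, 0 exactly on it; int8 keeps it compact.
    signs = np.sign(x - x.mean()).astype(np.int8)

    # Rolling view: row i is signs[i:i+N]. Both axes advance by one element.
    step = signs.strides[0]
    windows = as_strided(
        signs,
        shape=(signs.size - N + 1, N),
        strides=(step, step),
        writeable=False,  # a view with overlapping memory should stay read-only
    )

    # NumPy promotes int8 to the platform integer when summing,
    # so the totals cannot overflow.
    return windows.sum(axis=1)


print(rolling_sign_sums([1, 2, 3, 10, 11, 12, 1, 2], 3))
```

With this input the mean is 5.25, so the signs are three `-1`, three `+1`, then two `-1`, and the sums come out as `[-3, -1, 1, 3, 1, -1]`.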
That gives us an array of size `(x.size-N+1)` of values between `-N` and `+N`: we just have to find where the absolute values are `N`.

`indices` is the array of the indices `i` of your array `x` for which the values `x[i:i+N]` are on the same side of the mean.

Example with `x=np.random.rand(10)` and `N=3`:
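A minimal sketch of that example, under some assumptions: the helper name `same_side_runs` is hypothetical, `np.flatnonzero` is one possible way to extract the indices, and `sliding_window_view` (available in NumPy 1.20 and later) is used here as a bounds-checked stand-in for the hand-built strides. Because `x` is random, no particular output is shown for it.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def same_side_runs(x, N):
    """Return start indices i where x[i:i+N] all lie on one side of the mean.

    Hypothetical helper name; a sketch, not the original answer's code.
    """
    x = np.asarray(x, dtype=float)
    signs = np.sign(x - x.mean()).astype(np.int8)
    # Row i of the view is signs[i:i+N]; the view has x.size-N+1 rows.
    sums = sliding_window_view(signs, N).sum(axis=1)
    # A window is entirely on one side only if its signs sum to +N or -N.
    return np.flatnonzero(np.abs(sums) == N)


x = np.random.rand(10)  # random data, so the result differs on every run
N = 3
indices = same_side_runs(x, N)

for i in indices:
    side = "above" if x[i] > x.mean() else "below"
    print(f"x[{i}:{i + N}] is entirely {side} the mean")
```

For Nelson rule 2 you would call this with `N=9`. Points lying exactly on the mean get a sign of 0, so a window containing one never counts as a run.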