I am doing SPC analysis using numpy/pandas.
Part of this is checking data series against the Nelson rules and the Western Electric rules.
For instance (rule 2 from the Nelson rules): Check if nine (or more) points in a row are on the same side of the mean.
Now I could simply implement checking a rule like this by iterating over the array.
- But before I do that, I'm checking here on SO if numpy/pandas has a way to do this without iteration?
- In any case: What is the "numpy-ic" way to implement a check like the one described above?
As I mentioned in a comment, you may want to try using some stride tricks.
First, let's make an array of the signs of your anomalies (the deviations from the mean). We can store it as `np.int8` to save some space.

Now for the strides. If you want to consider `N` consecutive points, you'll use a strided view of that sign array. That gives us a `(x.size, N)` rolling array: the first row covers `x[0:N]`, the second `x[1:N+1]`... Of course, the last `N-1` rows run past the end of the data and will be meaningless, so from now on we'll use only the first `x.size-N+1` rows.

Let's sum along the rows.
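The steps up to this point can be sketched as follows. This is a minimal illustration, not necessarily the code the answer originally used: the sample values in `x`, the choice `N = 3`, and the variable names are assumptions made here. One deliberate difference is that the sketch builds only the `x.size-N+1` valid rows directly, instead of building `x.size` rows and trimming them afterwards, so the view never reads past the end of the buffer.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

N = 3  # window length (illustrative value)
x = np.array([0.9, 0.8, 0.7, 0.1, 0.6, 0.2, 0.3, 0.1, 0.9, 0.4])  # made-up data

# Sign of each deviation from the mean: +1 above, -1 below, 0 exactly on it.
signs = np.sign(x - x.mean()).astype(np.int8)

# Rolling view: row i aliases signs[i:i+N]; no data is copied.
# Only the valid rows are requested, and the view is read-only for safety.
strided = as_strided(
    signs,
    shape=(signs.size - N + 1, N),
    strides=(signs.itemsize, signs.itemsize),
    writeable=False,
)

# One value per window, between -N and +N.
consecutives = strided.sum(axis=-1)
```

Because `as_strided` does no bounds checking, a wrong `shape` or `strides` silently reads unrelated memory, which is why the view above is kept read-only and limited to the valid rows.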
That gives us an array of size `(x.size-N+1)` of values between `-N` and `+N`: we just have to find where the absolute values are `N`.

`indices` is the array of the indices `i` of your array `x` for which the values `x[i:i+N]` are on the same side of the mean...

Example with `x=np.random.rand(10)` and `N=3`
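A minimal end-to-end sketch of the approach, under stated assumptions: the helper name `same_side_runs`, its input checks, and the fixed seed are hypothetical additions for illustration, not part of the answer. Points that sit exactly on the mean count for neither side here. The result is cross-checked against a plain loop rather than against a quoted output.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided


def same_side_runs(x, N):
    """Start indices i where x[i:i+N] lies entirely on one side of the mean.

    Hypothetical helper; expects a 1-D sequence of numbers.
    """
    x = np.asarray(x, dtype=float)
    if N < 1 or x.size < N:
        return np.array([], dtype=np.intp)
    signs = np.sign(x - x.mean()).astype(np.int8)
    # Read-only rolling view restricted to the valid windows.
    windows = as_strided(
        signs,
        shape=(signs.size - N + 1, N),
        strides=(signs.itemsize, signs.itemsize),
        writeable=False,
    )
    # A window sums to +N or -N only if all N signs agree.
    (indices,) = np.nonzero(np.abs(windows.sum(axis=-1)) == N)
    return indices


np.random.seed(0)  # fixed seed only so that repeated runs match
x = np.random.rand(10)
N = 3
indices = same_side_runs(x, N)

# Cross-check against a straightforward loop over the windows.
mean = x.mean()
expected = [
    i for i in range(x.size - N + 1)
    if np.all(x[i:i + N] > mean) or np.all(x[i:i + N] < mean)
]
assert indices.tolist() == expected
print(indices)
```

For Nelson rule 2 the same call would be `same_side_runs(x, 9)`; a non-empty result means at least one run of nine points on one side of the mean.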