NaN appearance at seemingly random times when applying Butterworth filter to zero-centred signal

934 views Asked by At

I have coded a band-pass Butterworth filter in Python 3.9.7 using scipy.signal.butter and scipy.signal.filtfilt and have been iterating through different critical frequency pairs for lower and upper values for the pass-band.

This filter has then been applied to a zero-centred signal of floats. It may be relevant that the signal is likely heavily affected by aliased components, since the sensor used can only capture between 0 and 1000 Hz signals, and a low-pass filter above this was not used under measurement.

time, data = readCleanedFileToLists(filename, rootpath) # Imports from a csv file
time = np.asarray(time, dtype='float64') # Change imported list to array
data = np.asarray(data, dtype='float64') # Change imported list to array
print(f"data = {data} of length {len(data)}")


upper = 990 # Set fixed upper frequency value
first = 1
second = 998
lower = np.linspace(first, second, second - first + 1) # Determine lower sweep range


for i in range(len(lower)):
    b, a, = scipy.signal.butter(N = 8, Wn = [lower[i], upper], fs = fs, btype = 'bandpass', analog = False)
    filteredSignal = scipy.signal.filtfilt(b, a, data)
    print(f"Lower: {lower[i]}, Upper: {upper}, Difference: {upper - lower[i]}, Sum: {sum(filteredSignal)}")

which prints in this form:

...
Lower: 590.0, Upper: 990, Difference: 400.0, Sum: 0.34275260253820755
Lower: 591.0, Upper: 990, Difference: 399.0, Sum: 0.3391258786111905
Lower: 592.0, Upper: 990, Difference: 398.0, Sum: 0.33183353502685375
Lower: 593.0, Upper: 990, Difference: 397.0, Sum: 0.33242304387705246
Lower: 594.0, Upper: 990, Difference: 396.0, Sum: 0.3335822652375922
...
Lower: 711.0, Upper: 990, Difference: 279.0, Sum: nan
Lower: 712.0, Upper: 990, Difference: 278.0, Sum: 0.1721640435475024
Lower: 713.0, Upper: 990, Difference: 277.0, Sum: nan
Lower: 714.0, Upper: 990, Difference: 276.0, Sum: nan
Lower: 715.0, Upper: 990, Difference: 275.0, Sum: 0.194133809472057
Lower: 716.0, Upper: 990, Difference: 274.0, Sum: nan
Lower: 717.0, Upper: 990, Difference: 273.0, Sum: 0.19170132060027684
Lower: 718.0, Upper: 990, Difference: 272.0, Sum: nan
Lower: 719.0, Upper: 990, Difference: 271.0, Sum: nan
Lower: 720.0, Upper: 990, Difference: 270.0, Sum: nan
...

from lower = 1 to lower = 990.

From lower = 1 to lower = 5, and from approximately lower = 700, the code returns "NaN" as the sum of the output "filteredSignal", showing the presence of at least one "NaN" value in filteredSignal.

NaN appearance is sporadic between lower = 650 and lower = 750, but after lower = 750 all are NaN.

These sporadic NaN results also occur when looking at frequency ranges [1, 101], [900, 999], [5, 100].

Is anyone aware of what is causing this rather spread appearance of NaN values?

My goal is to see if there is any information to be gleaned from this signal, even given the presence of aliased higher frequency components.

1

There are 1 answers

4
Rory Yorke On

You'll probably get more general help on signal-processing on https://dsp.stackexchange.com/ (e.g., for your question on aliasing).

It looks like you're trying to examine the spectrum of your signal. I suggest using either a plain DFT (numpy.fft.rfft) or periodogram (scipy.signal.welch) for that, instead of bandpass filtering.

You sum the output of the band-pass filter, but summing is basically finding the a scaled mean, or the DC or zeroth frequency component, so I don't think this will give you anything useful.

As for the NaNs: this is most probably due to the filter form you've used. You should usually use filters arranged as "second-order sections", but Scipy defaults to ba form; see scipy docs. I couldn't find a simple explanation of why this SOSs are better, but you could, e.g., consider Wilkinson's polynomial, which shows how it's numerically stable to go from roots to polynomial coefficients, but not the other way around.

Exactly whether or not you'll have a problem, and how quickly it will happen (see below) depends on the filter, which is why sometimes you get NaNs, and sometimes not.

Below is example code showing this problem for a particular case.

In the resulting figure, notice how the ba form output is unstable. This is the bottom row of subplots: you can see instability in the ba response from early on (bottom left), growing to 1e33 by 2.5s (bottom centre), and NaN by 25s. The NaNs are probably the result of the difference of two Infs, and the Infs in turn come from the unstable filter response. The filtfilt response is all NaN (see text output at end): the forward run of the filter resulted in a NaN, so the backward run of the filter, starting with this state, gives all NaN output.

The SOS form has no problems with this case (top row of subplots), and produces the expected output. sosfiltfilt is also OK, though it has transient issues near the final time, and produces small values there (this is not shown below).

Finally, you have some cases where the band-pass lower bound is higher than the upper bound. This should probably raise an exception, but doesn't in the version of Scipy I have (1.7.3); I've opened an issue for this.

import numpy as np
from scipy import signal
import matplotlib.pyplot as plt

# all frequencies in hertz
fs = 2000 # sample freq
fex = 225 # example freq
fother = 150 # frequency to remove

flo = 224 # lower critical
fhi = 226 # upper critical

t = np.arange(100_000) / fs # [s]

u = np.sin(2 * np.pi * fex * t) + np.sin(2 * np.pi * fother * t)

b, a = signal.butter(N=8,
                     Wn=[flo, fhi],
                     fs=fs,
                     btype='bandpass')

sos = signal.butter(N=8,
                    Wn=[flo, fhi],
                    fs=fs,
                    btype='bandpass',
                    output='sos')

bay = signal.lfilter(b, a, u)
sosy = signal.sosfilt(sos, u)

ff_bay = signal.filtfilt(b, a, u)
ff_sosy = signal.sosfiltfilt(sos, u)

n5 = int(np.ceil(5 * fs / fex))
i0 = len(t)//200

fig, axs = plt.subplots(2,3, sharex='col', figsize=[10,8])

# plot different time intervals to show filter response
# at different times 
idx0s = [1000, 5000, 50_000]

for i, idx0 in enumerate(idx0s):
    rng = slice(idx0,idx0+n5)
    axs[0,i].plot(t[rng], u[rng], label='input')
    axs[0,i].plot(t[rng], sosy[rng], label='sos', lw=5, alpha=0.5)
    axs[0,i].plot(t[rng], ff_sosy[rng], label='ff_sos')
    axs[1,i].plot(t[rng], bay[rng], label='ba')
    axs[1,i].set_xlabel('time')


axs[0,0].set_ylabel('input & sos')
axs[1,0].set_ylabel('ba')

axs[0,0].legend()
fig.tight_layout()

print(f'{np.any(np.isnan(bay))=}')
print(f'{np.any(np.isnan(sosy))=}')
print(f'{np.any(np.isnan(ff_sosy))=}')
print(f'{np.all(np.isnan(ff_bay))=}')

plt.show()

which gives output:

np.any(np.isnan(bay))=True
np.any(np.isnan(sosy))=False
np.any(np.isnan(ff_sosy))=False
np.all(np.isnan(ff_bay))=True

and figure

enter image description here