Fast Fourier filtering threshold / forex tick data denoising

I have a big dataset of currency-pair tick data and I want to denoise it using a Fourier transform. The problem is that I want to find the filtering threshold automatically, because this will be part of a larger system that constantly takes in new data and preprocesses it.

Thanks for your time :)

This is my code. It is easy to find the threshold by plotting the data and trying different values, but that is not an option here.


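import numpy as np

# market_data is assumed to be a pandas DataFrame of tick data with a "<BID>" column
market_data = market_data["<BID>"]
market_data = market_data.ffill().bfill()  # fill gaps forward, then backward
market_data = market_data - market_data.mean()  # remove the mean (DC component)

fourier = np.fft.fft(market_data)
amplitude_filter = fourier.copy()

threshold = 8000  # hand-picked; this is the value I want to find automatically
amplitude_filter[np.abs(amplitude_filter) < threshold] = 0  # part where I apply the threshold

# the input is real, so the imaginary part of the result should be negligible
amplitude_filtered_back = np.fft.ifft(amplitude_filter).real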

[Figure: example with threshold 8000]

[Figure: example with frequency filter from chat]

amplitude_filter = amplitude_filter[-80:]  # keeps only the last 80 FFT coefficients

[Figure: original data]

But I still need to decide which indices or what threshold to choose. I need this result automatically.

There is 1 answer

Lourenço Monteiro Rodrigues

If band-pass filtering is definitely something you are not considering, I would approach it by asking: "How much variation do I want to retain?"

This means that, after computing the FFT, you can discard the weakest frequency components that, together, contribute less than X% of the total spectral amplitude (square the amplitudes first if you want the share of power instead).

You can do that with the following:

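import numpy as np

fft = np.fft.fft(market_data)  # spectrum of the mean-removed signal

amplitudes = np.abs(fft)
amp_sorted_index = np.argsort(amplitudes)  # save correspondence between original and sorted elements
sorted_amplitudes = amplitudes[amp_sorted_index]  # amplitudes in ascending order

# cumulative share of the total amplitude, starting from the weakest component
contributions = np.cumsum(sorted_amplitudes) / np.sum(sorted_amplitudes)
threshold = 0.1  # drop the weakest 10%, i.e. keep the 90% that contributes most

to_delete = amp_sorted_index[contributions < threshold]

fft[to_delete] = 0

filtered_data = np.fft.ifft(fft).real  # the input was real, so keep only the real part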
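
For a pipeline that has to run unattended, the snippet above could be wrapped in a function. The sketch below is one possible variant, not the only way to do it: the name `energy_threshold_filter` and the parameter `keep_fraction` are made up for illustration, it ranks components by power (squared amplitude) rather than amplitude, and it uses `np.fft.rfft` / `np.fft.irfft` so the output is real by construction. The shares are computed on the one-sided spectrum, which is a slight approximation of the two-sided power.

```python
import numpy as np

def energy_threshold_filter(signal, keep_fraction=0.9):
    """Zero the weakest rFFT components whose combined share of the
    one-sided spectral power is below 1 - keep_fraction.

    Hypothetical helper; assumes `signal` is a 1-D, real-valued,
    gap-free array (fill NaNs and remove the mean beforehand).
    """
    signal = np.asarray(signal, dtype=float)
    spectrum = np.fft.rfft(signal)
    power = np.abs(spectrum) ** 2
    total = power.sum()
    if total == 0:
        # constant-zero input: nothing to filter
        return signal.copy()

    order = np.argsort(power)                    # weakest component first
    cumulative = np.cumsum(power[order]) / total  # running share of power
    to_delete = order[cumulative < (1.0 - keep_fraction)]

    spectrum[to_delete] = 0
    # pass n explicitly so odd-length inputs keep their length
    return np.fft.irfft(spectrum, n=signal.size)
```

With this shape, the only tuning knob is `keep_fraction`, which can stay fixed across incoming batches instead of re-picking an absolute amplitude such as 8000 for each one. Whether a fixed share suits tick data is something to validate on real samples.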

But again, this kind of filtering is unusual. Defining a cut-off frequency and using a filter that implements it would have a direct physical interpretation, as @lastchance commented.
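
For comparison, a cut-off-frequency filter can be sketched in a few lines with NumPy alone. This is a minimal brick-wall low-pass under stated assumptions: the function name `lowpass_fft` and the parameters `cutoff_hz` and `sample_rate_hz` are illustrative, and the data is assumed to be uniformly sampled, which raw tick data usually is not, so it would need resampling to a regular grid first.

```python
import numpy as np

def lowpass_fft(signal, cutoff_hz, sample_rate_hz):
    """Zero every frequency component above `cutoff_hz`.

    Hypothetical helper; assumes `signal` is 1-D, real-valued and
    sampled at a constant rate of `sample_rate_hz`.
    """
    signal = np.asarray(signal, dtype=float)
    spectrum = np.fft.rfft(signal)
    # frequency (in Hz) of each rFFT bin
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / sample_rate_hz)
    spectrum[freqs > cutoff_hz] = 0
    return np.fft.irfft(spectrum, n=signal.size)

# Usage: a 5 Hz tone plus a 200 Hz tone, sampled at 1 kHz for one second
t = np.arange(1000) / 1000.0
slow = np.sin(2 * np.pi * 5 * t)
fast = 0.5 * np.sin(2 * np.pi * 200 * t)
smoothed = lowpass_fft(slow + fast, cutoff_hz=50, sample_rate_hz=1000)
```

A hard cut like this can introduce ringing near sharp price moves. A smoother design, for example a Butterworth filter from `scipy.signal`, is a common alternative if that becomes a problem.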