I would like to do the following:
- Segment the audio file (divide it into frames) - to avoid information loss, the frames should overlap.
- In each frame, apply a window function (Hann, Hamming, Blackman, etc.) - to minimize discontinuities at the beginning and end.
I managed to read the audio file into a NumPy array:
    import struct
    import wave

    import numpy as np


    def wave_open(path, normalize=True, rm_constant=False):
        # normalize and rm_constant are accepted but not used yet
        path = wave.open(path, 'rb')
        frames_n = path.getnframes()
        channels = path.getnchannels()
        sample_rate = path.getframerate()
        duration = frames_n / float(sample_rate)
        read_frames = path.readframes(frames_n)
        path.close()
        # one signed 16-bit value per sample (assumes 16-bit PCM)
        data = struct.unpack("%dh" % (channels * frames_n), read_frames)
        if channels == 1:
            data = np.array(data, dtype=np.int16)
            return data
        else:
            print("More channels are not supported")
And then I applied a window to the whole signal and took its FFT:
    N = 11145
    win = np.hanning(N)
    windowed_signal = np.fft.rfft(win * data)
But I don't know how to split my signal into frames (segments) before applying the window. Please help me :)
Here is a solution using librosa.
Output:
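
For comparison, the same framing and windowing can be sketched with NumPy alone. This is only a minimal illustration, not the librosa code: `frame_signal`, `frame_length`, and `hop_length` are hypothetical names, the values 1024 and 512 are assumed, and a synthetic tone stands in for the array returned by `wave_open`. Samples at the end that do not fill a whole frame are simply dropped.

```python
import numpy as np


def frame_signal(signal, frame_length, hop_length):
    """Split a 1-D signal into overlapping frames, one frame per row.

    Hypothetical helper: trailing samples that do not fill a frame are dropped.
    """
    signal = np.asarray(signal)
    if len(signal) < frame_length:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(signal) - frame_length) // hop_length
    # Row i selects samples i*hop_length ... i*hop_length + frame_length - 1.
    starts = hop_length * np.arange(n_frames)[:, None]
    offsets = np.arange(frame_length)[None, :]
    return signal[starts + offsets]


# Stand-in for the int16 array from wave_open(): 1 s of a 440 Hz tone.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
data = (0.5 * 32767 * np.sin(2 * np.pi * 440 * t)).astype(np.int16)

frame_length = 1024              # samples per frame (assumed value)
hop_length = frame_length // 2   # 50 % overlap between neighbouring frames

frames = frame_signal(data, frame_length, hop_length)
window = np.hamming(frame_length)
windowed_frames = frames * window               # window broadcasts over every row
spectra = np.fft.rfft(windowed_frames, axis=1)  # one spectrum per frame

print(frames.shape, spectra.shape)
```

A smaller `hop_length` gives more overlap and more frames; swapping `np.hamming` for `np.hanning` or `np.blackman` changes only the window line.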