Ok, there are a bunch of questions on this here, and plenty of reading material on google, yet I somehow am not able to figure this out. I want to get the fundamental frequency of a segment of speech. The basic steps are supposed to be:
- take the FFT of a windowed signal
- convert the FFT from rectangular to polar coordinates (so you can get the magnitude)
- discard the phase information
- take the square, then the natural log of each bin of the magnitude
- take another FFT (or some sources say take the inverse fft?)
Here is how I have implemented this in AS3:
var signal:Vector.<Number> = my1024PointSignal; // an audio signal 1024 samples long
var imx:Vector.<Number> = new Vector.<Number>(signal.length); // 1024 point vector to hold imaginary part of fft
hammingWindow(signal); // window it
zeroFill(imx); // fill imx with zeros
FFT(signal, imx); // convert signal into real and imaginary components of fft
toPolar(signal, imx); // convert fft to polar coordinates
// square each bin, and take the log of each bin, discard phase
for (var i:int = 0, l:int = signal.length; i < l; i++) {
signal[i] = Math.log(Math.pow(signal[i], 2));
imx[i] = 0;
}
FFT(signal, imx); // or maybe inverseFFT(signal, imx), i don't know
Now when I do this and end by taking the FFT, when I plot it the bins appear to be in reverse order? I also am seeing a larger peak at the second harmonic than at the fundamental. When I do this and take the inverse FFT, I get an audio signal that looks reflected around N/2, and again the peaks seem to be reversed. The whole thing is also quite noisy. What am I doing wrong?
For Cepstrum i always have used to this steps:
The equation for cepstrum:
Maybe are you are seeing reflected because you not get the step four(4)
The difference is the scale representation, if do you end using FFT you need extract just the real information, for both below equations you will get the same shape:
Plot example from cepstrum:
For
IFFT(log(abs(FFT(s))))
:For
real(FFT(log(abs(FFT(s)))))
:This is a cepstrum example from a 4096 points sine in 440hz sampled at 44100hz