Couple questions regarding FFT and Pitch Estimation


I need a couple of clarifications about FFT and pitch estimation in general.

1.) I read that the larger the block size for your FFT, the better its accuracy, although I know that there is also a downside to this. Is this really true? I've been experimenting, and whenever I use a block size of 16384 as opposed to 8192 or 4096, I get worse results. Can someone clarify this for me?

2.) Initially, I believed that getting the pitch from the FFT was simply a matter of finding the bin with the highest intensity. However, after posting and reading some questions here, I think that there may be more to this. Can someone suggest how to get a good pitch estimate from the FFT?

3.) Although I already have a good idea, can someone explain in simple terms what the autocorrelation algorithm does? (My idea is that it's basically a compare-and-contrast algorithm, and the candidate with the lowest difference is the chosen one.)

Thanks a lot!


There is 1 answer

Simon Richter On
  1. The downside is processing time, memory consumption, and delay. If you want a real-time display, having to wait for an entire frame to fill up before processing can begin may take unacceptably long.
  2. Yes, there is more: specifically, phase. If you look only at the real part of each bin, the strongest component could also show up as the largest negative value (180 degree shift), as zero (90 degree shift), or as anything in between. You probably want to do the transform using complex numbers and look for the bin with the largest absolute value (magnitude).
  3. The algorithm looks for periodic elements in the signal by testing how "similar" the signal is to time-shifted versions of itself. The output is a mapping from time offset to "similarity"; you can then look for the highest value.
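
To make points 2 and 3 concrete, here is a minimal sketch in plain Python. It is only an illustration under some assumptions, not a tuned pitch detector: the function names and the `min_lag`/`max_lag` search range are made up for this example, and the transform is a naive O(N^2) DFT written out with `cmath` so the snippet has no dependencies (in practice you would call an FFT routine from a library, which gives the same bins much faster).

```python
import cmath
import math


def dft_magnitudes(samples):
    """Return |X[k]| for bins k = 0 .. N/2 using a naive O(N^2) DFT.

    Taking abs() of the complex bin removes the dependence on phase.
    """
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = 0j
        for t, x in enumerate(samples):
            acc += x * cmath.exp(-2j * cmath.pi * k * t / n)
        mags.append(abs(acc))
    return mags


def pitch_from_spectrum(samples, sample_rate):
    """Frequency of the bin with the largest magnitude (DC bin skipped).

    Bins are spaced sample_rate / N apart, so a larger block gives
    finer spacing but needs more samples before it can be processed.
    """
    mags = dft_magnitudes(samples)
    peak_bin = max(range(1, len(mags)), key=lambda k: mags[k])
    return peak_bin * sample_rate / len(samples)


def autocorrelation(samples, max_lag):
    """Similarity of the signal to itself shifted by lag = 0 .. max_lag."""
    n = len(samples)
    return [
        sum(samples[t] * samples[t + lag] for t in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def pitch_from_autocorrelation(samples, sample_rate, min_lag, max_lag):
    """Frequency whose period is the lag with the highest similarity.

    min_lag must be > 0, because lag 0 always matches perfectly.
    """
    acf = autocorrelation(samples, max_lag)
    best_lag = max(range(min_lag, max_lag + 1), key=lambda lag: acf[lag])
    return sample_rate / best_lag


if __name__ == "__main__":
    fs = 8000  # assumed sample rate in Hz
    # 250 Hz test tone with an arbitrary phase offset of 1 radian.
    tone = [math.sin(2 * math.pi * 250 * t / fs + 1.0) for t in range(512)]
    print(pitch_from_spectrum(tone, fs))
    print(pitch_from_autocorrelation(tone, fs, min_lag=8, max_lag=48))
```

For this clean test tone, which falls exactly on a bin centre and has a period of exactly 32 samples, both estimators should report 250 Hz regardless of the phase offset. Note that the autocorrelation version picks the highest similarity rather than the lowest difference; real recordings with harmonics and noise generally need more care than either function provides.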