iOS and FFT (vDSP_fft_zrip): frequencies below approx. 100 Hz are cut off - why?


I'm using the Novocaine framework in order to perform pitch detection. Everything works well - except for the graphical FFT representation, where I'm running into trouble.

When I use an audio file for input, FFT works fine - frequencies below 100 Hz are shown perfectly.

However, when I use the mic as input, frequencies below 100 Hz are not shown at all - I can't figure out why. See the screenshot below:

The higher frequencies (for example 1000 Hz) are correctly displayed!

Here's the source code for the FFT procedure:

- (NSMutableArray*)performFFT: (float*) data  withFrames: (int) numSamples {
    // 1. init
    float bufferSize = numSamples;
    uint32_t maxFrames = numSamples;
    displayData = (float*)malloc(maxFrames*sizeof(float));
    bzero(displayData, maxFrames*sizeof(float));
    int log2n = log2f(maxFrames);
    int n = 1 << log2n;
    assert(n == maxFrames);
    int nOver2 = maxFrames/2;
    
    
    A.realp = (float*)malloc(nOver2 * sizeof(float));
    A.imagp = (float*)malloc(nOver2 * sizeof(float));
    fftSetup = vDSP_create_fftsetup(log2n, FFT_RADIX2);
    
    // 2. calculate
    bufferSize = numSamples;
    vDSP_Length ln = log2n;
    vDSP_ctoz((COMPLEX*)data, 2, &A, 1, numSamples/2);
    
    //fft
    vDSP_fft_zrip(fftSetup, &A, 1, ln, FFT_FORWARD);
    
    // Absolute square (equivalent to mag^2)
    vDSP_zvmags(&A, 1, A.realp, 1, numSamples/2);
    // zero the imaginary part so it is filtered out in the following loop
    bzero(A.imagp, (numSamples/2) * sizeof(float));
    
    //convert complex split to real
    vDSP_ztoc(&A, 1, (COMPLEX*)displayData, 2, numSamples/2);
    
    // Normalize
    float scale = 1.f/displayData[0];
    vDSP_vsmul(displayData, 1, &scale, displayData, 1, numSamples);
    
    
    //scale fft
    Float32 mFFTNormFactor = 1.0/(2*numSamples);
    vDSP_vsmul(A.realp, 1, &mFFTNormFactor, A.realp, 1, numSamples/2);
    vDSP_vsmul(A.imagp, 1, &mFFTNormFactor, A.imagp, 1, numSamples/2);

...
}
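
For reference, the mapping between a displayData index and its frequency should simply be bin * sampleRate / numSamples. A minimal sketch (the helper names and the 1024-sample example are mine, just for illustration):

#include <math.h>

// With sampleRate = 11025 Hz and numSamples = 1024 (example values), the bin
// spacing is about 10.8 Hz, so 100 Hz should land around bin 9.
static float binToFrequency(int bin, int numSamples, float sampleRate) {
    return bin * sampleRate / numSamples;
}

static int frequencyToBin(float frequency, int numSamples, float sampleRate) {
    return (int)roundf(frequency * numSamples / sampleRate);
}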

For the subsequent graphical display, I use displayData.

As I'm interested in the lower frequencies (there's a separate pitch detection algorithm which works fine), I've reduced the sample rate to 11,025 Hz (instead of 44,100 Hz).
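
(For completeness: one way to request such a rate is via AVAudioSession - this is only a sketch, and may not be exactly how my setup or Novocaine negotiates the hardware rate:)

#import <AVFoundation/AVFoundation.h>

NSError *error = nil;
// Ask the audio session for an 11,025 Hz hardware rate; the system may still
// deliver a different rate, so the actual rate should be checked afterwards.
if (![[AVAudioSession sharedInstance] setPreferredSampleRate:11025.0 error:&error]) {
    NSLog(@"Could not set preferred sample rate: %@", error);
}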

Within Novocaine, I make the following calls:

1) Input

[audioManager setInputBlock:^(float *data, UInt32 numFrames, UInt32 numChannels) {

    // frequency analysis
    vDSP_rmsqv(data, 1, &magnitude, numFrames*numChannels);
    self->ringBuffer->AddNewInterleavedFloatData(data, numFrames, numChannels);
    ...
}];

2) Output:

[audioManager setOutputBlock:^(float *data, UInt32 numFrames, UInt32 numChannels) {
    self->ringBuffer->FetchInterleavedData(data, numFrames, numChannels);
    ...
    [fft performFFT:data withFrames:numFrames];
    ...
}];

Does anyone have an idea?


There are 2 answers

marko

This is because the built-in microphone is a condenser mic and has a high-pass filter to remove the DC bias on the signal.

It is possible to disable the HPF. The DC bias isn't much of an issue, as it will simply turn up in bin #0 of the FFT.

[[AVAudioSession sharedInstance] setMode: AVAudioSessionModeMeasurement error:NULL];
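
In context that would look roughly like this (a sketch - the PlayAndRecord category and the error handling are illustrative additions, not something taken from the question's code):

#import <AVFoundation/AVFoundation.h>

// Measurement mode gives the flattest available input path and disables the
// low-frequency roll-off on the built-in mic.
NSError *error = nil;
AVAudioSession *session = [AVAudioSession sharedInstance];
[session setCategory:AVAudioSessionCategoryPlayAndRecord error:&error];
if (![session setMode:AVAudioSessionModeMeasurement error:&error]) {
    NSLog(@"Could not enable measurement mode: %@", error);
}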

Also be aware that windowing affects the accuracy of the low-frequency bins of the FFT. You need at least one full cycle (2*PI radians) of a given frequency within the window for it to be resolved.
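
If you are not windowing already, a minimal vDSP sketch (applied to data before the vDSP_ctoz call in your performFFT:withFrames:; the window buffer is a name I'm introducing here, not something from your code):

#import <Accelerate/Accelerate.h>

// Multiply the frame by a Hann window in place before the FFT. Without a
// window, spectral leakage smears energy between neighbouring bins, which
// hurts the already coarse low-frequency end the most.
float *window = (float *)malloc(numSamples * sizeof(float));
vDSP_hann_window(window, numSamples, vDSP_HANN_NORM);
vDSP_vmul(data, 1, window, 1, data, 1, numSamples);
free(window);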

hotpaw2

Pitch is different from spectral frequency. Low-pitched sounds are often composed mostly of harmonic and overtone energy and don't have much energy at the fundamental frequency, so an FFT won't find much there either. Autocorrelation is somewhat better at finding the pitch of these kinds of harmonic-rich sounds.
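
The idea in a minimal sketch (plain C, a naive version just for illustration; the function name is mine and not from any library):

// Pick the lag with the strongest autocorrelation inside a plausible pitch
// range; the pitch estimate is then sampleRate / bestLag. minLag and maxLag
// bound the search, e.g. sampleRate/1000 .. sampleRate/50 for 50-1000 Hz.
// A return value of 0 means no usable peak was found.
static int estimatePitchLag(const float *x, int n, int minLag, int maxLag) {
    int bestLag = 0;
    float bestCorr = 0.0f;
    for (int lag = minLag; lag <= maxLag && lag < n; lag++) {
        float corr = 0.0f;
        for (int i = 0; i + lag < n; i++) {
            corr += x[i] * x[i + lag];
        }
        if (corr > bestCorr) {
            bestCorr = corr;
            bestLag = lag;
        }
    }
    return bestLag;
}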

This is in addition to the iPhone microphone not being as sensitive to low frequency energy as it is to mid and higher audio frequencies (there are audio response graphs online).

Also note that FFT buffers need to be proportionally longer to have equivalent percentage resolution (equal-tempered pitch in cents) for low-frequency content. An FFT far, far longer than any single buffer delivered by an iOS Audio Unit is usually required for low-frequency resolution.
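
To put rough numbers on that (a sketch, using the 11,025 Hz rate from the question and an assumed 1024-point FFT):

#include <math.h>

// FFT bin spacing is sampleRate / fftLength. At 11025 Hz with a 1024-point
// FFT the spacing is about 10.8 Hz, which near 100 Hz corresponds to roughly
// 1200 * log2(110.8/100), i.e. about 177 cents - far coarser than a semitone.
// Getting ~10-cent spacing near 100 Hz needs a bin width of about 0.58 Hz,
// i.e. an FFT of roughly 11025 / 0.58, about 19000 samples (~1.7 seconds),
// much longer than any single Audio Unit buffer.
static float binSpacingInCents(float frequency, float sampleRate, int fftLength) {
    float binWidth = sampleRate / fftLength;
    return 1200.0f * log2f((frequency + binWidth) / frequency);
}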