What are the components of the Mel MFCC?

Question

2.5k views Asked by Joe At 08 December 2020 at 20:40

In looking at the output of this line of code:

mfccs = librosa.feature.mfcc(y=librosa_audio, sr=librosa_sample_rate, n_mfcc=40)
print("MFCC Shape = ", mfccs.shape)

I get a response of MFCC Shape = (40, 1876). What do these two numbers represent? I looked at the librosa website but still could not work out what these two values are.

Any insights will be greatly appreciated!

There is 1 answer

**Jon Nordby** · Accepted Answer · 2020-12-08T23:26:33+00:00

The first dimension (40) is the number of MFCC coefficients, and the second dimension (1876) is the number of time frames. The number of coefficients is set by n_mfcc, and the number of time frames is approximately the length of the audio (in samples) divided by hop_length. With librosa's defaults (hop_length=512 and centered, padded frames) it is 1 + floor(n_samples / hop_length).
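As a rough check of that arithmetic, here is a minimal sketch. It assumes librosa's default hop_length of 512 and centered framing. The helper name expected_mfcc_shape and the clip length of 960,000 samples are hypothetical, picked only because they reproduce the shape in the question; the actual length of the asker's audio is not given.

```python
def expected_mfcc_shape(n_samples, n_mfcc=40, hop_length=512):
    """Predict the (n_mfcc, n_frames) shape, assuming centered (padded) framing."""
    # With centered framing there is one frame per hop, plus one for the start.
    n_frames = 1 + n_samples // hop_length
    return (n_mfcc, n_frames)


# Hypothetical clip: 960,000 samples, about 43.5 s at a 22,050 Hz sample rate.
print(expected_mfcc_shape(960_000))  # (40, 1876)
```

Changing hop_length changes only the second number, while n_mfcc changes only the first.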

To understand the meaning of the MFCCs themselves, you should understand the steps it takes to compute them:

1. Spectrograms, using the Short-Time Fourier Transform (STFT).
2. The Mel spectrogram, from applying Mel-scale filterbanks to the STFT.
3. Mel Frequency Cepstral Coefficients, from applying the discrete cosine transform (DCT) to the log-scaled Mel spectrogram.
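The three steps can be sketched with NumPy alone. This is a simplified illustration, not librosa's implementation: the function names (hz_to_mel, mel_to_hz, mel_filterbank, simple_mfcc) are hypothetical, the Mel formula is the common HTK-style one, the filters and the DCT are left unnormalized, and the window is a plain symmetric Hann. The values will therefore differ from librosa.feature.mfcc, but the output has the same (coefficients, frames) layout.

```python
import numpy as np


def hz_to_mel(f):
    # HTK-style mel formula (an assumption; other variants exist).
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters evenly spaced on the mel scale: (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    hz_points = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb


def simple_mfcc(y, sr, n_mfcc=13, n_fft=2048, hop_length=512, n_mels=40):
    # Step 1: STFT power spectrogram, with reflect padding so frames are centered.
    y = np.pad(y, n_fft // 2, mode="reflect")
    n_frames = 1 + (len(y) - n_fft) // hop_length
    window = np.hanning(n_fft)
    frames = np.stack(
        [y[t * hop_length : t * hop_length + n_fft] * window for t in range(n_frames)],
        axis=1,
    )
    power = np.abs(np.fft.rfft(frames, axis=0)) ** 2  # (n_fft // 2 + 1, n_frames)

    # Step 2: mel spectrogram, then log scaling (decibels).
    mel = mel_filterbank(sr, n_fft, n_mels) @ power  # (n_mels, n_frames)
    log_mel = 10.0 * np.log10(np.maximum(mel, 1e-10))

    # Step 3: unnormalized DCT-II along the mel axis, keeping the first n_mfcc rows.
    k = np.arange(n_mfcc)[:, None]
    n = np.arange(n_mels)[None, :]
    dct_basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_mels))
    return dct_basis @ log_mel  # (n_mfcc, n_frames)


# One second of a 440 Hz tone as a stand-in for real audio.
sr = 22050
y = np.sin(2 * np.pi * 440.0 * np.arange(sr) / sr)
mfccs = simple_mfcc(y, sr)
print(mfccs.shape)  # (13, 44)
```

Each column of the result describes one time frame, and each row is one cepstral coefficient tracked over time.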

A good written explainer is Haytham Fayek's "Speech Processing for Machine Learning: Filter banks, Mel-Frequency Cepstral Coefficients (MFCCs) and What's In-Between", and a good video explainer is The Sound of AI's "Mel-Frequency Cepstral Coefficients Explained Easily".
