I have an issue with the Google Speech Recognition API. Calls made with the examples provided in the documentation succeed.
However, my own input is in MP3 format (8 kHz). I suspect that the conversion to FLAC, which I did with an online tool, may be the issue.
Here is the body of my call:
{
  "config": {
    "encoding": "FLAC",
    "sampleRateHertz": 8000,
    "languageCode": "en-US",
    "enableWordTimeOffsets": false
  },
  "audio": {
    "uri": "gs://speech-demo/phone3.flac"
  }
}
I get a Bad Request error. If I use the provided example (FLAC, 16 kHz), I get a full transcript.
Any idea what I am doing wrong and, if it is the conversion, how I should be converting?
8000 is defined for en-IN.
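Since the question suspects the online conversion, one way to narrow it down is to check what the converted file itself declares. The sketch below is not part of the Speech API; it is a small, hypothetical helper (`read_flac_streaminfo` is my own name) that reads the STREAMINFO block at the start of a FLAC file and reports the sample rate, channel count and bit depth. As far as I know, the API rejects a request when `sampleRateHertz` disagrees with the rate in the FLAC header, and by default it expects mono audio, so a converter that silently resampled or produced stereo would be a plausible cause of the Bad Request.

```python
def read_flac_streaminfo(data: bytes) -> dict:
    """Parse basic audio properties from the first bytes of a FLAC file.

    Layout assumed from the FLAC format: a 4-byte 'fLaC' marker, a 4-byte
    metadata block header, then the 34-byte STREAMINFO body.
    """
    if len(data) < 26:
        raise ValueError("need at least the first 26 bytes of the file")
    if data[:4] != b"fLaC":
        raise ValueError("not a FLAC stream (missing 'fLaC' marker)")
    # Low 7 bits of the block header's first byte give the block type;
    # type 0 is STREAMINFO and must come first.
    if data[4] & 0x7F != 0:
        raise ValueError("first metadata block is not STREAMINFO")

    # Bytes 18..25 pack four fields into 64 bits, most significant first:
    # 20 bits sample rate, 3 bits (channels - 1),
    # 5 bits (bits per sample - 1), 36 bits total samples.
    packed = int.from_bytes(data[18:26], "big")
    return {
        "sample_rate_hertz": packed >> 44,
        "channels": ((packed >> 41) & 0x7) + 1,
        "bits_per_sample": ((packed >> 36) & 0x1F) + 1,
        "total_samples": packed & ((1 << 36) - 1),
    }
```

To use it, download the converted file, read its first 42 bytes in binary mode (for example `open("phone3.flac", "rb").read(42)`, where the local path is an assumption) and pass them to the function. If the result does not show 8000 Hz and a single channel, the request body and the file disagree, and re-converting to mono at the rate stated in the config (or changing the config to match the file) would be the first thing to try.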