I've played with the webkitSpeechRecognition service, which transcribes speech into written words (https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_API). In its current state it's a decent toy, but not accurate enough to be useful. It is, however, good at detecting pauses and at getting at least a couple of words right, which gives a vague idea of what the user said.
What I would find useful is to be able to capture the raw audio as well. That way I can show it alongside the transcribed text, so that the user can manually replay the sentences that didn't get transcribed correctly.
Unfortunately, I don't see it exposed anywhere in the API. Is there a way to accomplish this? If not, is there an alternative solution that's not too much of a hack and/or CPU drain, such as capturing the audio via Navigator.getUserMedia()? If so, would I then have to rewrite the logic for pause detection and splitting myself?
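
To make that last question concrete, this is roughly the kind of pause detection and splitting I imagine I'd have to hand-roll if I captured the audio myself. It is only a naive sketch: the function name splitOnPauses, the Segment type, and the frame length, threshold, and minimum pause values are all my own assumptions, and it works on plain mono PCM samples rather than on anything the speech API exposes.

```typescript
// Hypothetical helper (not part of any browser API): split mono PCM samples
// into speech segments separated by pauses, using a naive RMS energy threshold.
interface Segment {
  start: number; // segment start, in seconds
  end: number;   // segment end, in seconds
}

function splitOnPauses(
  samples: Float32Array,
  sampleRate: number,
  frameMs = 20,      // analysis frame length (assumed value)
  threshold = 0.02,  // RMS below this counts as silence (assumed value)
  minPauseMs = 300   // silence at least this long ends a segment (assumed value)
): Segment[] {
  const frameLen = Math.max(1, Math.floor((sampleRate * frameMs) / 1000));
  const minPauseFrames = Math.ceil(minPauseMs / frameMs);
  const frameCount = Math.floor(samples.length / frameLen);
  const segments: Segment[] = [];
  let segStart = -1; // frame index where the open segment began, -1 if none
  let lastLoud = -1; // index of the most recent frame above the threshold

  // Close the open segment right after the last loud frame.
  const close = (): void => {
    segments.push({
      start: (segStart * frameLen) / sampleRate,
      end: ((lastLoud + 1) * frameLen) / sampleRate,
    });
    segStart = -1;
  };

  for (let f = 0; f < frameCount; f++) {
    // Root-mean-square energy of this frame.
    let sum = 0;
    for (let i = f * frameLen; i < (f + 1) * frameLen; i++) {
      sum += samples[i] * samples[i];
    }
    const rms = Math.sqrt(sum / frameLen);

    if (rms >= threshold) {
      if (segStart < 0) segStart = f;
      lastLoud = f;
    } else if (segStart >= 0 && f - lastLoud >= minPauseFrames) {
      close(); // the pause has lasted long enough
    }
  }
  if (segStart >= 0) close(); // audio ended mid-segment
  return segments;
}
```

Each returned segment could then be matched up with a transcribed sentence and used as the range to replay. A fixed threshold like this would probably misfire with background noise, which is part of why I'd rather not reimplement it.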