In examples for the Web Speech API, a grammar is always specified. For example, in MDN's colour change example, the grammar is:
```
#JSGF V1.0;
grammar colors;
public <color> = aqua | azure | beige | bisque | black | blue | brown | chocolate | coral | crimson | cyan | fuchsia | ghostwhite | gold | goldenrod | gray | green | indigo | ivory | khaki | lavender | lime | linen | magenta | maroon | moccasin | navy | olive | orange | orchid | peru | pink | plum | purple | red | salmon | sienna | silver | snow | tan | teal | thistle | tomato | turquoise | violet | white | yellow ;
```
However, when actually using the API (on Chrome 54.0.2840.71), the `onresult` handler:
- Sometimes returns strings that do not fit the supplied grammar
- Does not provide the parse tree that describes the speech
What then does the grammar actually do? How can I get either of these behaviours (restricting to the grammar and seeing the parse tree)?
I know this is an old question, but I've been going through a few similar ones, as this is something I've been trying to figure out myself recently, and I have a solution. The grammar doesn't seem to work, at least not reliably or as expected.
As a workaround, I've written a function that goes some way toward solving the issue. Supply it with the `event.results` from the `SpeechRecognition.onresult` callback, and make sure `maxAlternatives` is set to something like 10. Also supply a list of phrases. It will return the first transcript it finds containing one of the phrases; otherwise it just returns the transcript with the highest confidence. There are probably ways of improving on this solution for long transcripts etc., but it works for my situation with short, command-like phrases. Hopefully it helps someone else out too.
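The approach described above can be sketched as follows. This is a minimal sketch; the function name `pickTranscript` and the exact matching rules (case-insensitive substring match) are my own assumptions, not part of the Web Speech API:

```javascript
// Pick a transcript from SpeechRecognition results, preferring alternatives
// that contain one of the expected phrases. `results` is the event.results
// list from the onresult callback (with recognition.maxAlternatives set to
// something like 10); `phrases` is an array of expected command phrases.
function pickTranscript(results, phrases) {
  const expected = phrases.map(p => p.toLowerCase());
  let best = null; // highest-confidence alternative, kept as a fallback
  for (let i = 0; i < results.length; i++) {
    const result = results[i];
    for (let j = 0; j < result.length; j++) {
      const alt = result[j];
      // Return the first alternative whose transcript contains a phrase.
      if (expected.some(p => alt.transcript.toLowerCase().includes(p))) {
        return alt.transcript;
      }
      if (best === null || alt.confidence > best.confidence) {
        best = alt;
      }
    }
  }
  return best ? best.transcript : '';
}
```

Index-based loops are used because `SpeechRecognitionResultList` and `SpeechRecognitionResult` expose `length` and indexed access, so the same function also works on plain nested arrays when testing outside the browser.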