I have an application in which the user speaks a word and is given a percentage accuracy for it, i.e. how clearly the engine recognized the word.
This all works fine, but my dilemma is which words to add to the dictionary that I give to the recognition engine.
If I add every word starting with "p" for the target word "pen", then words like "pendant", "pent", etc. are all added to the dictionary. In that case the recognized word is never "pen".
Instead I always get other words such as "pendant".
But if I add only a limited set of words to the dictionary, such as "pe" and "pen", then for the same recorded file the recognized word is always "pen".
So the result clearly depends on the words we put in the dictionary.
I have explained this to my client. But they want users to be able to speak a wrong word for a given input word as well; in that case they do not want an accuracy score for the expected word, and they also want to get the text that was actually recognized.
I have done what I could about this issue, but what my client wants seems out of this world.
Code:

    public OdllSpeechProcessor(string culture, string speechContent , string filePath)
    {
        try
        {
            int counter = 0;
            string line;
            cultureInfo = new CultureInfo(culture);
            recognitionEngine = new SpeechRecognitionEngine(cultureInfo);
            words = new Choices();
            gb = new GrammarBuilder();
            gb.Culture = cultureInfo;
            rndAccuracy = new Random();
            System.IO.StreamReader file = new System.IO.StreamReader(filePath);
            while ((line = file.ReadLine()) != null)
            {
                if (line != "")
                {
                    for (int i = 0; i < srcContent.Length; i++)
                    {
                        if (line.StartsWith(subsetWords, true, cultureInfo))
                        {
                            if (count >= line.Length)
                            {
                                words.Add(line);
                                counter++;
                            }
                        }
                    }
                }
            }
            file.Close();
            // Adding words to the grammar builder.
            gb.Append(words);
            // Create the actual Grammar instance, with the words from the source audio.
            g = new Grammar(gb);
            // Load the created grammar onto the speech recognition engine.
            recognitionEngine.LoadGrammarAsync(g);

Does anyone here have a solution for this? Any help would be appreciated.
Thanks
You're using a command grammar (i.e., a set of choices). With a command grammar, the engine tries its best to find a match, which can easily result in false positives (as you've seen). You might want to investigate a dictation grammar, particularly the pronunciation grammar, as I've outlined in my answer to this question. Note that the solution I outlined uses some interfaces that aren't available in C# (or at least aren't exposed via System.Speech.Recognition).
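
As a rough illustration of the dictation approach in managed code, here is a minimal sketch. It assumes the plain DictationGrammar from System.Speech.Recognition (not the pronunciation grammar mentioned above), a recorded WAV file, and an "en-US" recognizer installed on the machine. The class name DictationCheck, the helper IsExpectedWord, and the 0.5 confidence threshold are hypothetical choices for the example, not part of the original code. The idea is to let the engine report whatever it actually heard, and only then compare that text with the expected word.

```csharp
using System;
using System.Globalization;
using System.Speech.Recognition;

public static class DictationCheck
{
    // Hypothetical helper: true only when the engine's text equals the
    // expected word (ignoring case) and the confidence reaches the threshold.
    public static bool IsExpectedWord(string expected, string recognizedText,
                                      float confidence, float minConfidence)
    {
        if (string.IsNullOrWhiteSpace(recognizedText) || confidence < minConfidence)
        {
            return false;
        }
        return string.Equals(expected.Trim(), recognizedText.Trim(),
                             StringComparison.OrdinalIgnoreCase);
    }

    // Usage (assumed): DictationCheck.exe <path-to-wav> <expected-word>
    public static void Main(string[] args)
    {
        string wavPath = args[0];
        string expected = args[1];

        using (var engine = new SpeechRecognitionEngine(new CultureInfo("en-US")))
        {
            // Free-form dictation instead of a fixed list of Choices.
            engine.LoadGrammar(new DictationGrammar());
            engine.SetInputToWaveFile(wavPath);

            // Recognize() returns null when nothing could be recognized.
            RecognitionResult result = engine.Recognize();
            if (result == null)
            {
                Console.WriteLine("Nothing recognized.");
                return;
            }

            bool match = IsExpectedWord(expected, result.Text, result.Confidence, 0.5f);
            Console.WriteLine("Recognized: " + result.Text);
            Console.WriteLine("Confidence: " + result.Confidence);
            Console.WriteLine(match ? "Matches expected word." : "Does not match expected word.");
        }
    }
}
```

With this shape, a wrongly spoken word shows up as the recognized text rather than being forced onto the nearest entry in a word list. Dictation tends to be less accurate on isolated single words, so the threshold would need tuning against real recordings.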