How can I detect a typo, but only for a specific phrase? Another way to put it: how can I detect a typo for a certain regex?
To be clear, I do not want a generic typo finder or a generic spell checker; I have already found multiple resources on both.
How would I write a typo checker for a relatively constant value, say:
Super Secret 13-12345
It should always say "Super Secret NN-NNNNN" (N means any digit 0-9).
It would flag the following as "typos":
- Ssuper Secret 13-12345
- Super Secret 1312345
- Sper Scret 13-123456
- Spuer Secret 13-12345
- Super Secret
- 13-12345
It would NOT flag the following as "typos":
- Super Secret 13-12345
- Any other random words
- Superman flies over the jungle
I am most worried about extra characters leaking in, transposed characters, or numbers not following the NN-NNNNN format.
I feel like this is an answerable question, but I may just not be searching Google or SO with the correct words.
I am writing it in .NET, but could obviously port anything.
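This isn't a good place for a regex: you would need a regex that detects every possible type of typo. Instead, you should be looking at the Levenshtein distance, which is the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. It would work something like: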
Once you have it implemented, play with the threshold in step 4 to match the desired behaviour.
Edit: "Invalid character" can either mean any character other than those in "Superct0123456789-", or it can mean any non-alphanumeric other than "-". The end result should be the same.
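A minimal sketch of the Levenshtein approach, in Python for brevity (it should port to .NET without much trouble). Everything here is an assumption made for illustration: the names `TEMPLATE`, `VALID`, `levenshtein`, and `classify`, the choice to normalise every digit to "0" so that any NN-NNNNN number compares equal, the default threshold of 3, and the fact that it classifies one candidate string at a time rather than scanning a larger text.

```python
import re

# The expected phrase with every digit normalised to "0".
TEMPLATE = "Super Secret 00-00000"
# Exact form: "Super Secret NN-NNNNN".
VALID = re.compile(r"Super Secret [0-9]{2}-[0-9]{5}")


def levenshtein(a: str, b: str) -> int:
    """Return the edit distance (insert, delete, substitute) between a and b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or keep) the character
            ))
        prev = curr
    return prev[-1]


def classify(candidate: str, threshold: int = 3) -> str:
    """Return "ok", "typo", or "unrelated" for a single candidate string."""
    if VALID.fullmatch(candidate):
        return "ok"
    # Normalise digits so the specific number does not affect the distance.
    normalised = re.sub(r"[0-9]", "0", candidate)
    if levenshtein(normalised, TEMPLATE) <= threshold:
        return "typo"
    return "unrelated"


if __name__ == "__main__":
    for text in [
        "Super Secret 13-12345",
        "Ssuper Secret 13-12345",
        "Super Secret 1312345",
        "Spuer Secret 13-12345",
        "Sper Scret 13-123456",
        "Superman flies over the jungle",
    ]:
        print(f"{text!r}: {classify(text)}")
```

With a threshold of 3 this flags the extra-character, missing-hyphen, and transposition examples from the question while leaving unrelated sentences alone. Note that plain Levenshtein counts a transposition such as "Spuer" as two edits. Truncated inputs such as "Super Secret" on its own are 9 edits away from the template, so catching them means raising the threshold (at the cost of more false positives) or adding a separate check.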