When you misspell a word in Google ("appples", for example), it comes up with the now-familiar "Did you mean: apples" suggestion.
Excluding Google's ability to guess your intentions based on relevance of search results, how can I develop a list of words that sound the same?
The words don't have to be English, and they don't have to exist. So, for example, if I give the input "hole", I would get back a list including words like "whole", "hola", "whore", "role", "molar", etc.
I am guessing there might be something online that can develop this list, but I couldn't find anything. If there is not a site and if it can be done using Perl, is there a CPAN module that can help me do this?
You can start by learning about the module Text::Soundex. It implements Soundex, a simple algorithm that maps words to 4-character codes. I got Soundex out of Sedgewick (originally from Knuth) long ago, used it to generate longer (untruncated) keys, and suggested lists of corrections for zero- and one-letter substitutions. I applied this to large databases of census and postal data.
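To make the idea concrete, here is a minimal sketch of a classic Soundex variant, written in Python only so the logic is easy to follow. It is not the Text::Soundex source, and Soundex variants differ in details such as how "h" and "w" are treated, so the module's output may differ on some words. The names soundex, sound_alikes, and the length parameter are made up for this illustration.

```python
# Letter groups for a simple Soundex variant. Group 0 holds the letters
# that carry no code (vowels plus h, w, y in this sketch).
GROUPS = ["aehiouwy", "bfpv", "cgjkqsxz", "dt", "l", "mn", "r"]
CODES = {ch: str(digit) for digit, letters in enumerate(GROUPS) for ch in letters}


def soundex(word, length=4):
    """Return a Soundex-style key; length=None gives the untruncated key."""
    letters = [c for c in word.lower() if c in CODES]
    if not letters:
        return ""
    digits = [CODES[c] for c in letters]
    # Collapse runs of identical codes, so "ppp" counts once.
    collapsed = [digits[0]]
    for d in digits[1:]:
        if d != collapsed[-1]:
            collapsed.append(d)
    # Keep the first letter itself, then the remaining non-zero codes.
    key = letters[0].upper() + "".join(d for d in collapsed[1:] if d != "0")
    if length is None:
        return key
    # Pad with zeros or truncate to the requested length.
    return (key + "0" * length)[:length]


def sound_alikes(word, vocabulary):
    """Return the words in vocabulary whose key matches that of word."""
    target = soundex(word)
    return [w for w in vocabulary if soundex(w) == target]


print(soundex("appples"), soundex("apples"))
print(sound_alikes("hole", ["whole", "hola", "role", "molar", "holy"]))
```

With this sketch, "appples" and "apples" share a key, and "hole" matches "hola" and "holy". Because Soundex keeps the first letter, "whole" and "role" get different keys from "hole", so catching those would need something on top, such as the one-letter substitutions mentioned above.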