What's a Good Package for Phonetic Representation for Various Human Languages?


I'm currently working on a project where being able to generate phonetic representations of words in various languages would be really helpful. I know Aspell does this pretty well, but I don't think there's an easy way to get at its phonetic representations, so I ask: is there some other good package for getting the phonetic representation of a word, given the word and the language/dialect/accent/whatever it's coming from?

This doesn't need to be in any particular language, but if it were Perl, that would be best.

I've already tried Soundex, Metaphone, DoubleMetaphone, and everything else in Text::Phonetic, and none of that stuff was very good – definitely nowhere near as good as the stuff in Aspell.

3

There are 3 answers

2
JRFerguson On

The first thing that springs to mind is Soundex. Of course, there is a Perl module for it, Text::Soundex, too. While this is designed to generate a Soundex "key" from input, it might be useful for mapping different spelling variants to a common key.
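To show what "mapping different variants to a common key" means, here is a minimal sketch of classic American Soundex in Python. It is my own illustration, not the code of the Perl module, and the function name `soundex` is just a name I picked. It also only handles ASCII letters, so it says nothing about non-English languages.

```python
# Minimal sketch of classic American Soundex (illustrative, ASCII letters only).
_GROUPS = {"1": "bfpv", "2": "cgjkqsxz", "3": "dt", "4": "l", "5": "mn", "6": "r"}
_CODES = {ch: digit for digit, letters in _GROUPS.items() for ch in letters}


def soundex(word):
    """Return the 4-character Soundex key for word, or "" if it has no letters."""
    letters = [c for c in word.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    key = letters[0].upper()
    prev = _CODES.get(letters[0], "")  # the first letter's code also blocks repeats
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters that share a code
        code = _CODES.get(ch, "")  # vowels and y get no code and reset prev
        if code and code != prev:
            key += code
        prev = code
    return (key + "000")[:4]  # pad with zeros, truncate to 4 characters


for name in ("Robert", "Rupert", "Smith", "Smyth"):
    print(name, soundex(name))
```

"Robert" and "Rupert" both come out as R163, and "Smith" and "Smyth" both as S530, which is the kind of grouping I mean. Note that the key throws away most of the pronunciation, so it is a matching aid rather than a real phonetic transcription.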

1
Bill Ruppert On

There is a package, Text::Aspell, on CPAN. It might be useful.

2
AlfredoVR On

If you are trying to make a Google-style suggestion/correction system, it's not based on just phonetics or AI, but on a massive amount of user input. When a user makes a search and doesn't click on any link, but corrects the input and searches again, it gives Google a lot more data about "correct" writing than a phonetics test or dictionary matching would. The main problem is in human language itself: people don't speak or write in a deterministic way, let alone in multiple languages. Of course, I might be wrong, but if you need a library that lets you do this:

getLanguage(string);

I want to see that working, really.
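To make the user-input idea above a bit more concrete, here is a rough sketch in Python of counting "searched, didn't click, retyped, then clicked" pairs. Everything in it is an assumption for illustration: the session format (a list of `(query, clicked)` tuples in time order) and the names `mine_corrections` and `suggest` are made up, and I have no idea what Google's real pipeline looks like.

```python
from collections import Counter, defaultdict


def mine_corrections(sessions):
    """Count rewrites where an unclicked query is followed by a clicked one.

    sessions: iterable of lists of (query, clicked) tuples in time order
    (a hypothetical log format, assumed for this sketch).
    """
    counts = defaultdict(Counter)
    for events in sessions:
        # Walk consecutive pairs of searches within one session.
        for (query, clicked), (next_query, next_clicked) in zip(events, events[1:]):
            if not clicked and next_clicked and query != next_query:
                counts[query][next_query] += 1
    return counts


def suggest(counts, query):
    """Return the most frequent observed rewrite of query, or None."""
    rewrites = counts.get(query)
    if not rewrites:
        return None
    return rewrites.most_common(1)[0][0]


sessions = [
    [("fonetic", False), ("phonetic", True)],
    [("fonetic", False), ("phonetic", True)],
    [("fonetic", False), ("fanatic", True)],
    [("phonetic", True)],
]
counts = mine_corrections(sessions)
print(suggest(counts, "fonetic"))
```

The point is that nothing here knows anything about phonetics or about the language being typed. It only works if you have a lot of sessions to count.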