Data quality with Ruby


I'm looking for a library that can match two words despite misspellings. For instance, the gem should treat the following comparisons as true (this is just an illustration; the standard String class does not need to be extended):

'Start' == 'Strat'
'woodpecker' == 'Wodpekcer'

Are there any Ruby gems for this kind of data quality checking?

There are 2 answers

devanand On BEST ANSWER

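Do you know about Levenshtein distance? It counts the single-character insertions, deletions and substitutions needed to turn one string into another.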

https://github.com/anjlab/rubyfish is just one gem you can install for this.
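
To get a feel for what such a gem computes, here is a minimal pure-Ruby sketch of plain Levenshtein distance. It is not rubyfish's code or API; the method name `levenshtein` and the choice to ignore case are assumptions made for this illustration.

```ruby
# Hypothetical helper, not part of any gem: classic dynamic-programming
# Levenshtein distance (insert, delete, substitute), ignoring case.
def levenshtein(a, b)
  a = a.downcase
  b = b.downcase

  # prev[j] holds the distance between the processed prefix of a and b[0...j].
  prev = (0..b.length).to_a

  a.each_char.with_index(1) do |ca, i|
    curr = [i] # distance from a[0...i] to the empty string
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [
        prev[j] + 1,        # delete ca
        curr[j - 1] + 1,    # insert cb
        prev[j - 1] + cost  # substitute (or keep when equal)
      ].min
    end
    prev = curr
  end

  prev.last
end

puts levenshtein('Start', 'Strat')          # prints 2
puts levenshtein('woodpecker', 'Wodpekcer') # prints 3
```

Note that a swapped pair of letters costs 2 here, because plain Levenshtein has no transposition operation. You would then pick a threshold that suits your data to decide when two words count as "the same".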

kostja On

As you stated that you are looking for libraries/gems, here are some gems implementing string distance and fuzzy matching:

These libraries do not extend core classes, so you cannot compare the strings using the == operator, but you can calculate their similarity and find similar strings.
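
As a rough sketch of what "calculate their similarity and find similar strings" can look like, the following pure-Ruby example scores two strings between 0.0 and 1.0 and picks the closest candidate from a list. It does not use any of the gems; the names `osa_distance`, `similarity` and `closest_match` are made up for this illustration, and the distance used is the optimal string alignment variant of Damerau-Levenshtein, which counts a swap of two adjacent letters as one edit.

```ruby
# Hypothetical helpers, not taken from any gem.

# Optimal string alignment distance: insert, delete, substitute,
# plus transposition of two adjacent characters. Case is ignored.
def osa_distance(a, b)
  a = a.downcase.chars
  b = b.downcase.chars

  # d[i][j] = distance between the first i chars of a and the first j chars of b.
  d = Array.new(a.size + 1) do |i|
    Array.new(b.size + 1) { |j| i.zero? ? j : (j.zero? ? i : 0) }
  end

  (1..a.size).each do |i|
    (1..b.size).each do |j|
      cost = a[i - 1] == b[j - 1] ? 0 : 1
      d[i][j] = [d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost].min

      # Adjacent transposition, e.g. "ar" <-> "ra".
      if i > 1 && j > 1 && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1]
        d[i][j] = [d[i][j], d[i - 2][j - 2] + 1].min
      end
    end
  end

  d[a.size][b.size]
end

# Normalised score: 1.0 means identical, 0.0 means nothing in common.
def similarity(a, b)
  longest = [a.length, b.length].max
  return 1.0 if longest.zero?

  1.0 - osa_distance(a, b).to_f / longest
end

# Returns the candidate most similar to the given word.
def closest_match(word, candidates)
  candidates.max_by { |candidate| similarity(word, candidate) }
end

puts osa_distance('Start', 'Strat')                    # prints 1
puts closest_match('Strat', %w[Start Stop woodpecker]) # prints Start
```

Instead of `==`, you would compare the score against a cut-off of your choosing, for example `similarity(a, b) >= 0.8`; the right cut-off depends on your data.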

For Soundex, Metaphone and similar, you can use the wonderful text gem. Using phonetic algorithms may be a bit more involved, as they work better or worse depending on the language. What works perfectly for English might not work for other languages.
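
To illustrate the phonetic approach, here is a simplified pure-Ruby sketch of American Soundex. It is not the text gem's implementation; the method name `soundex` and the constant `SOUNDEX_CODES` are assumptions for this example, and it only handles the letters a-z, which is exactly the English bias mentioned above.

```ruby
# Hypothetical sketch of American Soundex, not the text gem's code.
SOUNDEX_CODES = {
  'b' => '1', 'f' => '1', 'p' => '1', 'v' => '1',
  'c' => '2', 'g' => '2', 'j' => '2', 'k' => '2',
  'q' => '2', 's' => '2', 'x' => '2', 'z' => '2',
  'd' => '3', 't' => '3',
  'l' => '4',
  'm' => '5', 'n' => '5',
  'r' => '6'
}.freeze

def soundex(word)
  letters = word.downcase.gsub(/[^a-z]/, '').chars
  return '' if letters.empty?

  first = letters.first
  code = first.upcase          # the first letter is kept as-is
  last = SOUNDEX_CODES[first]  # its digit still counts for adjacency

  letters.drop(1).each do |ch|
    next if ch == 'h' || ch == 'w' # h and w do not separate equal codes

    digit = SOUNDEX_CODES[ch]
    code << digit if digit && digit != last
    last = digit # vowels reset this to nil, so a repeated code counts again
    break if code.length == 4
  end

  code.ljust(4, '0') # pad short codes with zeros
end

puts soundex('Start')                              # prints S363
puts soundex('Strat')                              # prints S363
puts soundex('woodpecker') == soundex('Wodpekcer') # prints true
```

Two words are then treated as a match when their codes are equal, which catches both examples from the question but will also match unrelated words that merely sound alike.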