I have a list of city names, some of which are misspelled:
['bercelona', 'emstrdam', 'Praga']
And a list of all possible city names, spelled correctly:
['New York', 'Amsterdam', 'Barcelona', 'Berlin', 'Prague']
I'm looking for an algorithm that finds, for each name in the first list, the closest match in the second list, and returns the first list with the names spelled correctly. So it should return the following list:
['Barcelona', 'Amsterdam', 'Prague']
You can use the built-in Ratcliff/Obershelp algorithm (Python's difflib module):
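A minimal sketch, assuming the standard-library function difflib.get_close_matches with a cutoff of 0.7. The helper name correct_names and the choice to keep a name unchanged when nothing matches are illustrative assumptions:

```python
from difflib import get_close_matches

misspelled = ['bercelona', 'emstrdam', 'Praga']
cities = ['New York', 'Amsterdam', 'Barcelona', 'Berlin', 'Prague']


def correct_names(names, candidates, cutoff=0.7):
    """Replace each name with its closest candidate (hypothetical helper).

    A name is left unchanged when no candidate reaches the cutoff.
    """
    corrected = []
    for name in names:
        # n=1 asks for only the single best match at or above the cutoff
        matches = get_close_matches(name, candidates, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else name)
    return corrected


print(correct_names(misspelled, cities))
# ['Barcelona', 'Amsterdam', 'Prague']
```

The comparison is case-sensitive, so lowercasing both sides before matching may help on noisier data.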
Here 0.7 is the similarity cutoff. You may need to run some tests on your own data to tune this value. The similarity score ranges from 0 to 1: 1 means the strings are identical, and 0 means they have nothing in common.