I have a column inside a pandas DataFrame containing a bunch of names:
NAME
-------
robert
robert
robrt
marie
ann
I'd like to merge similar names in order to correct typos and make the values uniform, resulting in:
NAME
-------
robert
robert
robert
marie
ann
I would like to use Levenshtein distance in order to search for similar records. Also, solutions using other metrics are much appreciated.
Thanks a lot in advance
All the examples I found on Stack Overflow compare multiple columns, so I have not been able to find a nice solution for a single column.
One possible approach is the following:
which will give you
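As a minimal sketch of single-column deduplication with Levenshtein distance, assuming that the most frequent spelling is the correct one and that a distance of at most 1 counts as "similar" (both are assumptions, not something stated in the question), something like the following could work. The helper names `levenshtein` and `normalize_names` and the `max_distance` parameter are hypothetical, and the distance is implemented in plain Python so that no extra package is needed:

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) via dynamic programming."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def normalize_names(names, max_distance=1):
    """Map each name to the most frequent spelling within max_distance edits.

    Assumption: frequent spellings are correct, rare ones are typos.
    """
    canonical = []  # spellings accepted as "correct" so far
    mapping = {}    # original spelling -> canonical spelling
    # Visit spellings from most to least frequent.
    for name, _ in Counter(names).most_common():
        match = next(
            (c for c in canonical if levenshtein(name, c) <= max_distance),
            None,
        )
        if match is None:
            canonical.append(name)
            mapping[name] = name
        else:
            mapping[name] = match
    return [mapping[n] for n in names]


names = ["robert", "robert", "robrt", "marie", "ann"]
cleaned = normalize_names(names)
print(cleaned)  # ['robert', 'robert', 'robert', 'marie', 'ann']
```

With a DataFrame, the same function could be applied as `df["NAME"] = normalize_names(df["NAME"].tolist())`. The threshold needs care: with `max_distance=1`, genuinely different short names such as "ann" and "anna" would also be merged, so a length-relative threshold may be a better fit for real data.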