Levenstein distance, multiple paths

865 views Asked by At

Edit: TL;DR version: how to get all possible backtraces for Damerau–Levenshtein distance between two words? I'm using https://en.wikipedia.org/wiki/Wagner%E2%80%93Fischer_algorithm in order to compute distance, and trivial backtrace algorithm (illustrated below) in order to reconstruct corrections list.

More details below:

Just got stuck with optimal string alignment (sort of Damerau–Levenshtein distance) while trying to get a complete set of possible alignments.

Goal is to align 2 strings for further comparison in auto-suggestions algorithm. Particularly, I'd like to ignore insertions past the end of 1st word.

The problem that in some cases multiple "optimal" alignments is possible, e.g.

align("goto", "go to home")

1) go to
   go to home

2) go t   o
   go to home

Unfortunately, mine implementation finds second variant only, while I need both or 1st one.

I've tried to perform some kind of A* or BFS path finding, but it looks like cost computation matrix is "tuned" for (2) variant only. There is screenshot below where I can find red path, but it looks like there is no green path: screenshot

However, someone made a web demo which implements exactly what I want: web demo

What I'm missing here?

Perhaps my implementation is too long to post it here, so there is a link to github: https://github.com/victor-istomin/incrementalSpellCheck/blob/f_improvement/spellCheck.hpp

Distance implementation is located in optimalStringAlignementDistance() and optimalStringAlignmentBacktrace() methods.

0

There are 0 answers