I'm trying to find the longest common substring (LCS) across multiple sequences.
There are a number of modules on CPAN that implement the LCS algorithm for 2 sequences, such as Algorithm::Diff and String::LCSS_XS, but I'm having a hard time in extending them to work with more than 2 sequences because the LCS across multiple sequences is not necessarily a LCS between any two of them.
Of note, despite its name, the Algorith::MLCS doesn't actually return the LCS but all the common elements (also non-consecutive) of a number of arrays. My impression is that it is broken by design, but I might be wrong.
Algorithm::Diff and Algorith::MLCS solve the longest common subsequence problem, not the longest common substring one.
Is there an obvious way to extend the n=2 algorithms or do I have to implement my version? If yes, how?
Thanks.
This can be solved pretty easily using the Tree::Suffix module.
Example: