Perl: Find all matched substrings of two strings


Is there a function in Perl that finds every common substring (maximal by length) of string1 and string2?

I can find every occurrence of a known substring in a string using m/substring/g.
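For reference, here is a minimal sketch of that m//g approach. The sample string and the variable names are made up for illustration; note that m//g reports non-overlapping matches only.

```perl
use strict;
use warnings;

my $string    = 'kn;i=n.n;kn;';   # hypothetical sample data
my $substring = 'n;';             # the known substring to look for

my @positions;
# \Q...\E quotes regex metacharacters so the substring is matched literally
while ($string =~ /\Q$substring\E/g) {
    push @positions, $-[0];       # $-[0] is the start offset of the current match
}
print "found at offsets: @positions\n";   # prints "found at offsets: 1 7 10"
```

This only works when the substring is already known, which is why it does not solve the question on its own.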

To find all common substrings, I would have to shift the starting position in string1 and compare the strings symbol by symbol. How can I do this in Perl, or is there an easier way (a ready-made function)?

Thank you in advance.

my $string1 = '... (i==i)kn;i=n.n;k(i(i,"%i",&i);i ...';
my $string2 = '... k;kn;i=n.n;k;k(i(i,"%i",&i);k ...';
my @answer = ( ..., 'kn;i=n.n;', 'k(i(i,"%i",&i);', ... );

Best answer, by ysth:

Your example seems to show returning two different lengths of substring, with the shorter one first, so I'm not sure what "maximal by length" means. But this may help:

use strict;
use warnings;
use Tree::Suffix;
use Data::Dumper;

my $string1 = '(i==i)kn;i=n.n;k(i(i,"%i",&i);i';
my $string2 = 'k;kn;i=n.n;k;k(i(i,"%i",&i);k';
my $tree = Tree::Suffix->new($string1, $string2);
my @answer;
my $min_length = 1;
my $max_length = 0; # 0 initially means no limit
# a plain while loop, because "last" does not work inside do { ... } while
while (1) {
    # longest common substrings within the current length limits
    my @by_length = $tree->lcs($min_length, $max_length);
    last unless @by_length;
    # don't include any substrings that are substrings of substrings already found
    for my $new_substring (@by_length) {
        push @answer, $new_substring
            unless grep { index($_, $new_substring) >= 0 } @answer;
    }
    # on the next pass, only look for strictly shorter substrings
    $max_length = length($by_length[0]) - 1;
    last if $max_length < $min_length;
}
print Dumper \@answer;

output:

$VAR1 = [
      ';k(i(i,"%i",&i);',
      'kn;i=n.n;k'
    ];

Tree::Suffix was kind of a pain to install: I had to delete the bundled inc/Devel/CheckLib.pm because it had errors, install Devel::CheckLib separately, and download and install the libstree library.
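If installing Tree::Suffix is not an option, a pure-Perl fallback is possible. The following is only a sketch of a different technique (a dynamic-programming table of common-suffix lengths), not what Tree::Suffix does internally. The helper name maximal_common_substrings and the minimum length of 5 are assumptions made for this example. It runs in time proportional to length(string1) * length(string2), so it is only reasonable for short strings.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of any module): returns the common substrings
# of $s1 and $s2 that cannot be extended to the left or right, longest first.
sub maximal_common_substrings {
    my ($s1, $s2, $min_length) = @_;
    $min_length = 1 unless defined $min_length;

    my @chars1 = split //, $s1;
    my @chars2 = split //, $s2;
    my ($n, $m) = (scalar @chars1, scalar @chars2);

    my %found;
    # $prev[$j] holds the length of the longest common suffix of
    # the first $i-1 characters of $s1 and the first $j characters of $s2
    my @prev = (0) x ($m + 1);
    for my $i (1 .. $n) {
        my @curr = (0) x ($m + 1);
        for my $j (1 .. $m) {
            next unless $chars1[$i - 1] eq $chars2[$j - 1];
            my $len = $curr[$j] = $prev[$j - 1] + 1;
            # the match can grow to the right only if both strings
            # continue and their next characters are equal
            my $extendable = $i < $n && $j < $m && $chars1[$i] eq $chars2[$j];
            $found{ substr($s1, $i - $len, $len) } = 1
                if !$extendable && $len >= $min_length;
        }
        @prev = @curr;
    }

    # longest first; drop anything contained in a longer result,
    # mirroring the filter used with Tree::Suffix above
    my @result;
    for my $candidate (sort { length($b) <=> length($a) || $a cmp $b } keys %found) {
        push @result, $candidate
            unless grep { index($_, $candidate) >= 0 } @result;
    }
    return @result;
}

my @common = maximal_common_substrings(
    '(i==i)kn;i=n.n;k(i(i,"%i",&i);i',
    'k;kn;i=n.n;k;k(i(i,"%i",&i);k',
    5,   # assumed minimum length, chosen only for this example
);
print "$_\n" for @common;
```

Raising or lowering the minimum length controls how many short, incidental matches (single characters and the like) are reported.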