Longest common subsequence (LCS) brute force algorithm

11.6k views Asked by At

I want to create a brute force algorithm to find the largest common subsequence between 2 strings, but I'm struggling to enumerate all possibilities in the form of a algorithm.

I don't want a dynamic programming answer as oddly enough I manage to figure this one out (You would think the brute force method would be easier). Please use pseudo code, as I prefer to understand it and write it up myself.

4

There are 4 answers

2
turingcomplete On BEST ANSWER

It's pretty much the same as DP minus the memoization part.

LCS(s1, s2, i, j):
    if(i == -1 || j == -1)
        return 0
    if(s1[i] == s2[j])
        return 1 + LCS(s1, s2, i-1, j-1)
    return max(LCS(s1, s2, i-1, j), LCS(s1, s2, i, j-1))

The idea is if we have two strings s1 and s2 where s1 ends at i and s2 ends at j, then the LCS is:

  • if either string is empty, then the longest common subsequence is 0.
  • If the last character (index i) of string 1 is the same as the last one in string 2 (index j), then the answer is 1 plus the LCS of s1 and s2 ending at i-1 and j-1, respectively. Because it's obvious that those two indices contribute to the LCS, so it's optimal to count them.
  • If the last characters don't match, then we try to remove one of the characters. So we try finding LCS between s1 (ending at i-1) and s2 (ending at j) and the LCS between s1 (ending at i) and s2 (ending at j-1), then take the max of both.

The time complexity is obviously exponential.

1
MooseBoys On

I like @turingcomplete's answer but it's not really brute-force since it doesn't actually enumerate all candidate solutions. For example, if the strings are "ABCDE" and "XBCDY", the recursive approach won't test for "ABC" versus "XBC" because the test for "A" versus "X" will have already failed. It's kind of a matter of opinion whether you want to count that as a unique candidate though. In fact, you could argue that "ABC" versus "ABCDY" is a valid candidate as well, and just immediately fails due to length difference. You could add separate LA and LB to the code below to fully enumerate those candidates though.

for L = min(A.length, B.length) to 1
{
    for iA = 0 to A.length - L - 1
    {
        for iB = 0 to B.length - L - 1
        {
            for i = 0 to L - 1
            {
                if(A[iA] != B[iB])
                {
                    match failed;
                }
            }
            if match didn't fail, then
            A[iA..iA+L] and B[iB..iB+L] are a maximal common substring
        }
     }
}
no common substring
0
Vivek Singh On

Replace s1 and s2 with your String

import java.util.*;
/* Online Java Compiler and Editor */
public class HelloWorld{
     public static void main(String []args){
        System.out.println("Hello, World!");
        String s1 = "GXXAYB";
        String s2 = "AGGTAB";
        String ans = "",res ="";
        int m = 0;
        for(int k=0;k<s1.length();k++) {
            m=0;
            for(int i=k;i<s1.length();i++) {
                for(int j=m;j<s2.length();j++) {
                    if(s1.charAt(i)==s2.charAt(j)) {
                        res = res+s2.charAt(j);
                        i=i+1;
                    
                    }
                }
            }
            if(res.length()>ans.length()) {
                ans=res;
            }
            res ="";
      }
      System.out.println(ans);
    }
}
0
Sriman S On

Here is a Java method which stores/lists out all the subsequences of given string in an ArrayList.

  • Find all the subsequences of given 2 strings

  • Find common ones between them

  • Longest one among them is the answer

  • We already know that each character may either
    1) appear
    or
    2) not appear
    in any subsequence.

  • So, we keep all the strings in the ArrayList untouched(case-2 in above).

  • Also, we add(to the ArrayList) strings which are results of concatenation
    of already existing strings in ArrayList with the next character of the
    string(case-1 above).

  • This covers(solves) both the above cases of our problem.

  • We do this until to all the letters in the string.

ArrayList<String> subseq(String s)
    {
        ArrayList<String> h = new ArrayList<String>();
        h.add("");
        int n = s.length();
        int l;
        for(int i=0;i<n;i++)
        {
            l = h.size();
            for(int j=0;j<l;j++)
                h.add( h.get(j) + s.charAt(i));
        }
        return h;
    }