Finding non-overlapping repeating sub-strings

989 views Asked by At

My problem is similar to finding the longest repeating non-overlapping sub-strings but in this manner. Example, my input array is 0,0,0,1,2,1,1,2,1,1,2,1,2,0,0

The longest repeating and non-overlapping sub-string in this input is 121 with the count as 3 Now, remove the occurrences of 121 from the input array and replace them with -1.

My input now becomes: 0,0,0,-1,-1,-1,2,0,0 Now again find the same sub-strings.

Note: -1 shall not be included in the answer. So, the next longest non-overlapping and repeating sub-string is 00 with count as 2.

Remove occurrences of 0,0 from the input which then becomes: 0,-1,-1,-1,2,-1 The only occurrences now are 0 and 2 which are of length 1. So, here we stop.

I am only able to think of brute force which will be O(n3). Tried thinking of suffix trees too but don't know how to include the non-overlapping condition in it.

My very naive implementation:

Because I want my substring to have count as atleast 2, it can have length maximum as N/2 (N is the size of input array)

for len = N/2 to 1:
    // find all sub-strings of length as "len" in the input sequence
    for i=0 to N-len+1
        // the sub-string is s[i... (i+len)]
        // find non-overlapping count of this sub-string in the input sequence using KMP
        int count = find_Count();
        // take the substring with maximum count, let it be "r"
    end
    modify the array to place replace non-overlapping occurrences of "r" with -1
end

As you can see above, the complexity is O(N3).

Any optimal solution, hints or explanation can be of some help.

Thanks

0

There are 0 answers