Question: Count the total number of pairs in a list of strings, so that for each pair, one string is the suffix of the other. For example, "door" is the suffix of "backdoor", "a" is the suffix of "cba".
def count_same_endings(words):
"""
Count the number of word pairs with the same ending.
Parameters:
words (list): List of words.
Returns:
int: The count of word pairs with the same ending.
"""
strings = sorted([word[::-1] for word in words])
count = 0
to_print = []
i = 0
while i < len(words):
j = 1
while i + j < len(words) and strings[i + j].startswith(strings[i]):
# to_print.append((strings[i+j][::-1], strings[i][::-1]))
count += 1
j += 1
i += 1
# print(to_print)
return count
words = ['cba', 'a', 'a', 'a', 'a', 'b', 'ba', 'ca']
result = count_same_endings(words)
print(result)
19
def count_end_slices(strings):
slice_count = {}
strings_count = {}
count = 0
# Iterate over each word in the list
for string in strings:
length = len(string)
# Check if this word is already the suffix of some previous words
if string in slice_count:
count += slice_count[string]
# Generate slices that include the end of the string
for start in range(length):
slice = string[start:]
# Count the occurrences of each slice
if slice in slice_count:
# Check if previous words are the suffix of this string
if slice in strings_count:
count += strings_count[slice] if slice != string else 0
slice_count[slice] += 1
else:
slice_count[slice] = 1
# Account for this word
if string in strings_count:
strings_count[string] += 1
else:
strings_count[string] = 1
return slice_count, count
``Comparison between 1st and 2nd
I have already created two algorithms which are both pretty brute force. I wonder if there is a more optimized way of solving this question.
We can join all strings with symbol "$" (it is less than all alphabetic symbols) into string
common
, then build suffix array overcommon
string, I've used implementation from here.Then sort source strings, and look for every string+"$" in suffix array. Sorted suffixes and sorted strings allow to walk through both arrays once.
When string is larger lexically than current suffix, we move to the next suffix.
When string is equal to current suffix (more exactly, to its prefix of corresponding length), we increment counter and move to the next suffix.
When string is lexically smaller than current suffix, we move to the next string.
At the end we subtract number of strings from counter (to remove coincidences of strings with themselves).