Filter Anagram in Array in Python


I'm trying to go through an array and delete the elements that aren't anagrams in Python. Here is the code I wrote. My logic seems fine, but I can't get it to work.

b = ['cat', 'dog', 'god', 'star', 'lap', 'act']
array=[]
t=0
for i in b:
    while t<len(b):
        if ''.join(sorted(i))==''.join(sorted(b[t])):
           array.append(i)
        t+=1
print array

There are 4 answers

Akshay On BEST ANSWER

Just some minor tweaks to your existing code should work.

b = ['cat', 'dog', 'god', 'star', 'lap', 'act']
array = []
t = 0
for i, value in enumerate(b):
    t = i+1
    while t<len(b):
        if ''.join(sorted(value))==''.join(sorted(b[t])):
            array.extend([value, b[t]])
        t+=1
print array
['cat', 'act', 'dog', 'god']
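
One caveat, offered as a sketch rather than as part of the answer above: because every matching pair is added with extend, a group of three or more anagrams (say 'cat', 'act', 'tca') would put the same word in the result more than once. A possible variant, written in Python 3 syntax with a hypothetical helper name filter_anagrams, records the matching indices and emits each word once. Note that it returns the words in their original list order, which differs from the pair-by-pair order printed above.

```python
def filter_anagrams(words):
    """Return the words that have at least one anagram elsewhere in the list."""
    keys = [''.join(sorted(w)) for w in words]  # sort each word's letters once
    matched = set()  # indices of words that belong to some anagram pair
    for i in range(len(words)):
        for t in range(i + 1, len(words)):
            if keys[i] == keys[t]:
                matched.update((i, t))
    # emit each matched word once, in its original position
    return [w for idx, w in enumerate(words) if idx in matched]


print(filter_anagrams(['cat', 'dog', 'god', 'star', 'lap', 'act']))
# ['cat', 'dog', 'god', 'act']
print(filter_anagrams(['cat', 'act', 'tca', 'dog']))
# ['cat', 'act', 'tca']
```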
Anand S Kumar On

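The first issue in your program is that you initialize t to 0 outside the for loop, so only the first element of b is compared against all the elements. After that first pass t is equal to len(b), so from the second iteration of the for loop onward the inner while loop never runs. A simple fix is to reset t at the start of each iteration. Because every word is trivially an anagram of itself, the comparison should also skip the word's own index, and stopping at the first match keeps a word from being appended twice -

for idx, i in enumerate(b):
    t = 0  # reset for every word so the inner loop runs each time
    while t<len(b):
        # skip the word itself; stop at the first match so it is appended once
        if t != idx and ''.join(sorted(i))==''.join(sorted(b[t])):
            array.append(i)
            break
        t+=1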
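
To make the effect of the misplaced initialization visible, here is a small trace in Python 3 syntax. It is only an illustration: the helper name inner_iterations and its reset_each_time flag are made up for this sketch, and it counts how many times the inner while loop runs for each word.

```python
b = ['cat', 'dog', 'god', 'star', 'lap', 'act']


def inner_iterations(reset_each_time):
    """Count inner-loop iterations for each word of the outer loop."""
    counts = []
    t = 0  # initialized once, as in the question
    for _ in b:
        if reset_each_time:
            t = 0  # the fix: start the inner scan from the beginning again
        n = 0
        while t < len(b):
            n += 1
            t += 1
        counts.append(n)
    return counts


print(inner_iterations(False))  # [6, 0, 0, 0, 0, 0]
print(inner_iterations(True))   # [6, 6, 6, 6, 6, 6]
```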

For finding anagrams you can also use a cheap numeric check: compute the sum of the ASCII values of each string's characters together with its length, and compare those values between words. Two anagrams always have the same sum and the same length. The reverse is not guaranteed, because different letters can add up to the same total, so treat a match as a candidate only and confirm it by comparing the sorted letters.
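
As a quick check of how far a sum-and-length comparison can be trusted, the sketch below (Python 3 syntax, with the hypothetical helper names same_sum_and_length and is_anagram) compares it with an exact test based on collections.Counter, which counts every letter. The pair 'ad' and 'bc' has equal length and equal character sum (97 + 100 and 98 + 99) without being anagrams.

```python
from collections import Counter


def same_sum_and_length(a, b):
    """Cheap check: equal length and equal sum of character codes."""
    return len(a) == len(b) and sum(map(ord, a)) == sum(map(ord, b))


def is_anagram(a, b):
    """Exact check: both strings contain the same letters the same number of times."""
    return Counter(a) == Counter(b)


print(same_sum_and_length('ad', 'bc'))  # True, although the words are not anagrams
print(is_anagram('ad', 'bc'))           # False
print(is_anagram('cat', 'act'))         # True
```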

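Example code for the sum-and-length method, with a final check on the sorted letters -

b = ['cat', 'dog', 'god', 'star', 'lap', 'act']
c = list(map(len, b))                                 # length of each word
d = list(map(lambda x: sum(ord(ch) for ch in x), b))  # sum of character codes of each word
arr = []
for i, s in enumerate(b):
    for j, s1 in enumerate(b):
        # sum and length are a cheap pre-filter; sorted() confirms a real anagram
        if i != j and d[i] == d[j] and c[i] == c[j] and sorted(s) == sorted(s1):
            if s not in arr:
                arr.append(s)
            if s1 not in arr:
                arr.append(s1)
print(arr)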
>> ['cat', 'act', 'dog', 'god']
Ajay On

An alternate approach

Using itertools groupby

In [18]: from itertools import groupby


In [19]: c=[list(g) for k,g in groupby(sorted(b,key=sorted),sorted)]

In [20]: c
Out[20]: [['cat', 'act'], ['lap'], ['star'], ['dog', 'god']]

In [21]: [x for _list in c if len(_list)>1 for x in _list]
Out[21]: ['cat', 'act', 'dog', 'god']

The key thing here is to use groupby from the itertools module, which groups consecutive items that share the same key.

Because groupby only merges adjacent items, the list has to be sorted in advance, so we pass it sorted(b,key=sorted). The trick here is that sorted accepts a key function and sorts by that function's output, so passing sorted itself as the key orders the words by their sorted letters. There's no need to define our own function or create a lambda.

groupby takes a key function which it uses to tell if items should be grouped together and again we can just pass it the built-in sorted function.

Source: Finding and grouping anagrams by Python
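
The session above can also be packaged as a self-contained script. This is a sketch in Python 3 syntax; the function name anagram_groups is made up for the example. The last line shows what happens when the pre-sorting step is skipped: groupby only merges neighbours, so 'cat' and 'act' end up in separate groups.

```python
from itertools import groupby


def anagram_groups(words):
    """Group words that are anagrams of each other."""
    # Sort by each word's sorted letters so that anagrams become neighbours.
    ordered = sorted(words, key=sorted)
    return [list(group) for _, group in groupby(ordered, key=sorted)]


b = ['cat', 'dog', 'god', 'star', 'lap', 'act']
groups = anagram_groups(b)
print(groups)
# [['cat', 'act'], ['lap'], ['star'], ['dog', 'god']]
print([word for group in groups if len(group) > 1 for word in group])
# ['cat', 'act', 'dog', 'god']

# Without sorting first, only adjacent anagrams are grouped.
print([list(group) for _, group in groupby(b, key=sorted)])
# [['cat'], ['dog', 'god'], ['star'], ['lap'], ['act']]
```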

Mazdak On

Your solution does not produce the right result, and nesting two loops is not efficient: you iterate over the list twice, apply ''.join(sorted()) to the elements on every pass, and compare each element with itself. Instead, you can use a dictionary that maps each word's sorted letters to the indices of the words sharing them, built in a single pass over enumerate(b):

>>> d={}
>>> for i,j in enumerate(b):
...   d.setdefault(''.join(sorted(j)),[]).append(i)
... 
>>> d
{'arst': [3], 'dgo': [1, 2], 'alp': [4], 'act': [0, 5]}

>>> [b[t] for k in d.values() if len(k)>1 for t in k]
['dog', 'god', 'cat', 'act']

And if you care about the order, you can use OrderedDict from the collections module:

>>> from collections import OrderedDict
>>> d=OrderedDict()
>>> for i,j in enumerate(b):
...   d.setdefault(''.join(sorted(j)),[]).append(i)
... 
>>> [b[t] for k in d.values() if len(k)>1 for t in k]
['cat', 'act', 'dog', 'god']
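
On Python 3.7 and later, a plain dict preserves insertion order, so the same single-pass idea should keep the first-seen order without OrderedDict. The sketch below assumes such an interpreter; the function name anagrams_in_first_seen_order is hypothetical, and it stores the words themselves instead of their indices.

```python
def anagrams_in_first_seen_order(words):
    """Keep only words that share their letters with another word in the list."""
    groups = {}
    for word in words:
        # The sorted letters act as a signature shared by all anagrams.
        groups.setdefault(''.join(sorted(word)), []).append(word)
    # Drop signatures that occur once; dict order follows first appearance.
    return [word for group in groups.values() if len(group) > 1 for word in group]


b = ['cat', 'dog', 'god', 'star', 'lap', 'act']
print(anagrams_in_first_seen_order(b))
# ['cat', 'act', 'dog', 'god']
```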