remove (graph) isomorphic instances from python list

47 views Asked by At

Given a function that returns True if two instances A and B of some class are the "same", is there an efficient way of to remove duplicates from a list?

My question is general, but the example I have in mind are networkx graphs and the function is networkx.is_isomorphic. Two graphs that have the same shape are said isomorphic, but the attributes of the two networkx.Graph instances can be different. From a list li, I want to get only graphs that have different shapes.

A solution I came up with is the following:

def check_isomorphisms(G, new_li):
    for H in new_li:
        if are_isomorphic(G, H):
            return True
    return False

def remove_isomorphisms(li):
    if li == []:
        return li
    new_li = [li[0]]
    for G in li:
        if check_isomorphisms(G, new_li) is True:
            continue
        else:
            new_li.append(G)

    return new_li

In the practical example above are_isomorphic = networkx.is_isomorphic, but it is a priori independent of the particular class of objects and the precise definition of isomorphic, as long as there is a function that returns either true or false given two instances.

It is a variation of this question, but in that case two instances are the same if there attributes are the same so hashing was easy, and there was not really a need for optimisation. I'm interested in cases where "the same" is more general.

The solution I came up with works, but scales terribly with the size of the list. For about a list with 1000 elements, it takes about a minute, while it takes half an hour with 4000 elements (working with graphs and graph isomorphisms), while on the other hand building the the original list with duplicates takes a few seconds. Is there a more efficient way of doing this?

0

There are 0 answers