BFS very slow in python


I was comparing the efficiency of breadth-first search (BFS) and the Floyd–Warshall algorithm for solving the shortest-path problem in Python. Although the complexity of Floyd–Warshall is much larger than that of BFS, BFS seems to be taking more time.

def BFS_ForNode(query_node, parents):
    result = {}
    queue = []
    queue.append((query_node, 0))
    while queue:
        node, dist = queue.pop(0)
        result[node] = dist
        if node in parents:
            for parent in parents[node]:
                queue_members = [x[0] for x in queue]
                if parent not in result and parent not in queue_members:
                    queue.append((parent, dist + 1))
    return result

def BFS(Graph, vertix_num):
    for i in range(vertix_num):
        BFS_ForNode(i, Graph)

def floydWarshall(graph, V):
    # Copy the adjacency matrix row by row so the input is not mutated
    dist = [list(row) for row in graph]
    for k in range(V):
        for i in range(V):
            for j in range(i):
                dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j])
                dist[j][i] = dist[i][j]
    return dist

Is there some data structure I am using for BFS that makes it very slow, and is there any way to optimize it?


There are 2 answers

Answer from BearAqua:

Assuming you start the search by calling BFS(), your BFS routine actually takes more than linear time. It treats every vertex in the graph as a source vertex and restarts a BFS for each one, which makes the runtime quadratic even if the body of the loop took constant time (it doesn't).

Additionally, building [x[0] for x in queue] and testing parent not in queue_members are each linear in the length of the queue in the worst case, making your worst-case runtime approximately equal to, or even greater than, Floyd–Warshall's.
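A minimal sketch of how those per-iteration scans can be eliminated (this rewrites the question's BFS_ForNode; the deque and the record-distance-on-enqueue trick are my additions, not the original code):

```python
from collections import deque

def bfs_for_node(query_node, parents):
    # Record a node's distance the moment it is enqueued, so a single
    # O(1) dict lookup replaces both the scan over `queue` and the
    # separate check against `result`.
    result = {query_node: 0}
    queue = deque([query_node])
    while queue:
        node = queue.popleft()  # O(1), unlike list.pop(0)
        for parent in parents.get(node, ()):
            if parent not in result:
                result[parent] = result[node] + 1
                queue.append(parent)
    return result
```

With these changes each vertex and edge is processed a constant number of times, so a single traversal is linear in the size of the reachable subgraph.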

Answer from Visel_chak:

I suppose this question is no longer relevant for the OP, but for future readers:

Original question:

Is there some data structure I am using for BFS that makes it very slow, and is there any way to optimize it?

From my point of view, the answer is yes. There are several performance issues in your example:

  1. You are using a list instead of a proper collection such as deque (from the collections module) to store your items. When you call pop(0) on a list, every remaining item has to shift one position to the left, which is an O(n) operation each time.

  2. Checking whether an item exists in a list is also O(n). If you use a set() instead, it takes only O(1) on average, because a set is hash-based (like dict), so membership tests run in constant time.
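Both points can be seen in a few lines (a toy illustration with made-up values, not the question's data):

```python
from collections import deque

queue = deque([1, 2, 3])
queue.append(4)          # append at the right end: O(1)
first = queue.popleft()  # pop at the left end: O(1); list.pop(0) is O(n)

seen = {2, 3, 4}
# Hash-based membership test: O(1) on average, versus an O(n) list scan.
found = 2 in seen
```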

If you're interested, from my perspective a good implementation of BFS is:

from collections import deque

def bfs(graph, start):
    """BFS which will collect levels/distances of each neighbour of start vertex and so on."""
    queue = deque([start])
    distances = {start: 0}
    while queue:
        vertex = queue.popleft()
        for neighbour in graph[vertex]:
            if neighbour not in distances:
                queue.append(neighbour)
                distances[neighbour] = distances[vertex] + 1
    return distances
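For example, on a small adjacency-list graph (the graph and node names here are hypothetical, not from the question; the bfs routine is repeated so the snippet runs on its own):

```python
from collections import deque

def bfs(graph, start):
    # Same routine as above, repeated for a self-contained example.
    queue = deque([start])
    distances = {start: 0}
    while queue:
        vertex = queue.popleft()
        for neighbour in graph[vertex]:
            if neighbour not in distances:
                queue.append(neighbour)
                distances[neighbour] = distances[vertex] + 1
    return distances

# A diamond-shaped graph: 'a' fans out to 'b' and 'c', which meet at 'd'.
graph = {
    'a': ['b', 'c'],
    'b': ['d'],
    'c': ['d'],
    'd': [],
}
print(bfs(graph, 'a'))  # {'a': 0, 'b': 1, 'c': 1, 'd': 2}
```

Note that 'd' is reached twice but enqueued only once, because the distances dict doubles as the visited set.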