find a path using all given edges python

2.4k views Asked by At

I have a list of edges. I need to decode a path from source node to sink node from them. There might be loops in my paths, but I should only use each of the edges once. In my list, I might also have the same edge for more than one time, which means in my path I should pass it more than once.

Lets say my edges list as following:

[(1, 16), (9, 3), (8, 9), (15, 8), (5, 1), (8, 15), (3, 5)]

so my path is:

8->15->8->9->3->5->1->16 equivalent to [8,15,8,9,3,5,1,16]

I know the sink node and the source node. (In above sample I knew that 8 is source and 16 is sink) here is another sample with more than one usage of the same edge:

[(1,2),(2,1),(2,3),(1,2)]

the path is:

1->2->1->2->3 equivalent to [1,2,1,2,3]

Basically it is type of topological sorting but, we don't have loops in topological sorting. I have the following code, but it does not use the nodes in the loops !

def find_all_paths(graph, start, end):
    path  = []
    paths = []
    queue = [(start, end, path)]
    while queue:
        start, end, path = queue.pop()
        print 'PATH', path

        path = path + [start]
        if start == end:
            paths.append(path)
        for node in set(graph[start]).difference(path):
            queue.append((node, end, path))
return paths
1

There are 1 answers

0
Chris H. On

Simply, you may need to do more than one pass over the edges to assemble a path using all the edges.

The included code operates on the following assumptions:

  • A solution exists. Namely all vertices belong to a single connected component of an underlying graph and
  • in_degree = out_degree for either all or all but 2 vertices. In the latter case one of the vertices has in_degree - out_degree = 1 and the other has in_degree - out_degree = -1.

Furthermore even with these conditions, there is not necessarily a unique solution to the problem of finding a path from source to sink utilizing all edges. This code only finds one solution and not all solutions. (An example where multiple solutions exist is a 'daisy' [(1,2),(2,1),(1,3),(3,1),(1,4),(4,1),(1,5),(5,1)] where the start and end are the same.)

The idea is to create a dictionary of all edges for the path indexed by the starting node for the edge and then remove edges from the dictionary as they are added to the path. Rather than trying to get all of the edges in the path in the first pass, we go over the dictionary multiple times until all of the edges are used. The first pass creates a path from source to sink. Subsequent passes add in loops.

Warning: There is almost no consistency checking or validation. If the start is not a valid source for the edges then the 'path' returned will be disconnected!

"""
This is a basic implementatin of Hierholzer's algorithm as applied to the case of a 
directed graph with perhaps multiple identical edges.
"""

import collections

def node_dict(edge_list):
    s_dict = collections.defaultdict(list)
    for edge in edge_list:
        s_dict[edge[0]].append(edge)
    return s_dict

def get_a_path(n_dict,start):
    """
    INPUT:  A dictionary whose keys are nodes 'a' and whose values are lists of 
    allowed directed edges (a,b) from 'a' to 'b', along with a start WHICH IS 
    ASSUMED TO BE IN THE DICTIONARY.

    OUTPUT:  An ordered list of initial nodes and an ordered list of edges 
    representing a path starting at start and ending when there are no other 
    allowed edges that can be traversed from the final node in the last edge.

    NOTE:  This function modifies the dictionary n_dict!
    """

    cur_edge = n_dict[start][0]
    n_dict[start].remove(cur_edge)

    trail = [cur_edge[0]]
    path = [cur_edge]
    cur_node = cur_edge[1]

    while len(n_dict[cur_node]) > 0:
        cur_edge = n_dict[cur_node][0]
        n_dict[cur_node].remove(cur_edge)
        trail.append(cur_edge[0])
        path.append(cur_edge)
        cur_node = cur_edge[1]

    return trail, path


def find_a_path_with_all_edges(edge_list,start):
    """
    INPUT:  A list of edges given by ordered pairs (a,b) and a starting node.

    OUTPUT:  A list of nodes and an associated list of edges representing a path 
    where each edge is represented once and if the input had a valid Eulerian 
    trail starting from start, then the lists give a valid path through all of 
    the edges.

    EXAMPLES:

        In [2]: find_a_path_with_all_edges([(1,2),(2,1),(2,3),(1,2)],1)
        Out[2]: ([1, 2, 1, 2, 3], [(1, 2), (2, 1), (1, 2), (2, 3)])

        In [3]: find_a_path_with_all_edges([(1, 16), (9, 3), (8, 9), (15, 8), (5, 1), (8, 15), (3, 5)],8)
        Out[3]: 
        ([8, 15, 8, 9, 3, 5, 1, 16],
         [(8, 15), (15, 8), (8, 9), (9, 3), (3, 5), (5, 1), (1, 16)])
    """

    s_dict = node_dict(edge_list)
    trail, path_check = get_a_path(s_dict,start)

    #Now add in edges that were missed in the first pass...
    while max([len(s_dict[x]) for x in s_dict]) > 0:
        #Note:  there may be a node in a loop we don't have on trail yet
        add_nodes = [x for x in trail if len(s_dict[x])>0]
        if len(add_nodes) > 0:
            skey = add_nodes[0]
        else:
            print "INVALID EDGE LIST!!!"
            break

        temp,ptemp = get_a_path(s_dict,skey)
        i = trail.index(skey)
        if i == 0:
            trail = temp + trail
            path_check = ptemp + path_check
        else:
            trail = trail[:i] + temp + trail[i:]
            path_check = path_check[:i] + ptemp + path_check[i:]

    #Add the final node to trail.
    trail.append(path_check[-1][1])
    return trail, path_check