Finding a closed path from list of start and end nodes


I have a list of edges (E) of a graph with nodes V = [1,2,3,4,5,6]:

E = [(1,2), (1,5), (2,3), (3,1), (5,6), (6,1)]

where each tuple (a,b) refers to the start & end node of the edge respectively.

If I know the edges form a closed path in graph G, can I recover the path?

Note that E is not the set of all edges of the graph. It's just a subset of them.

In this example, the path would be 1->2->3->1->5->6->1

A naive approach I can think of is to build a tree: start with a node, say 1, then look at all tuples that start with 1, here (1,2) and (1,5). That gives two branches, one through node 2 and one through node 5, and I continue the process on each branch until it ends at the starting node.

How can I code this efficiently in Python?


There are 2 answers

Answer by spefk (best answer)

The networkx package has a function that can generate the desired circuit for you in linear time...

Constructing an nx.MultiDiGraph() may be slower than the question asks for, and pulling in an external package for a single function may be excessive. If so, there is another way.

Plan: first find some route from start_node back to start_node, then splice in every loop that has not been visited yet.

from itertools import chain
from collections import defaultdict, deque
from typing import Tuple, List, Iterable, Iterator, DefaultDict, Deque


def retrieve_closed_path(arcs: List[Tuple[int, int]], start_node: int = 1) -> Iterator[int]:
    if not arcs:
        return iter([])

    # for each node `u`, keep a queue of the neighbours
    # that still have to be visited from `u`
    d: DefaultDict[int, Deque[int]] = defaultdict(deque)
    for u, v in arcs:
        # deque pop and append complexity is O(1)
        d[u].append(v)

    def _dfs(node) -> Iterator[int]:
        out: Iterator[int] = iter([])
        # loop until this node's queue is empty, which
        # guarantees that every queue is drained by the end
        while d[node]:
            # chain returns an iterator and helps to
            # avoid unnecessary memory reallocations
            out = chain([node], _dfs(d[node].pop()), out)
            # if we return in this loop from recursive call, then
            # `out` already carries some (node, ...) and we need
            # only to insert all other loops which start at `node`
        return out

    return chain(_dfs(start_node), [start_node])


def path_to_string(path: Iterable[int]) -> str:
    return '->'.join(str(x) for x in path)

Examples:

    E = [(1, 2), (2, 1)]
    p = retrieve_closed_path(E, 1)
    print(path_to_string(p))
    >> 1->2->1

    E = [(1, 2), (1, 5), (2, 3), (3, 1), (5, 6), (6, 1)]
    p = retrieve_closed_path(E, 1)
    print(path_to_string(p))

    >> 1->5->6->1->2->3->1

    E = [(1, 2), (2, 3), (3, 4), (4, 2), (2, 1)]
    p = retrieve_closed_path(E, 1)
    print(path_to_string(p))

    >> 1->2->3->4->2->1


    E = [(5, 1), (1, 5), (5, 2), (2, 5), (5, 1), (1, 4), (4, 5)]
    p = retrieve_closed_path(E, 1)
    print(path_to_string(p))
    >> 1->4->5->1->5->2->5->1
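
One caveat with the recursive _dfs above: its call depth can grow with the number of arcs, so a long edge list may hit Python's default recursion limit. Below is a minimal sketch of the same idea (Hierholzer's algorithm) with an explicit stack instead of recursion. The name closed_path_iterative is made up for this illustration, and the sketch assumes, like the code above, that the arcs really do form a closed path through start_node.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def closed_path_iterative(arcs: List[Tuple[int, int]], start_node: int = 1) -> List[int]:
    """Return a closed path that uses every arc once, without recursion."""
    if not arcs:
        return []

    # For each node, the outgoing neighbours that have not been used yet.
    unused: Dict[int, List[int]] = defaultdict(list)
    for u, v in arcs:
        unused[u].append(v)

    stack = [start_node]
    path: List[int] = []
    while stack:
        node = stack[-1]
        if unused[node]:
            # Follow any unused arc out of the current node.
            stack.append(unused[node].pop())
        else:
            # No unused arcs left here: the node is finished, emit it.
            path.append(stack.pop())

    # Nodes were emitted in reverse order of the circuit.
    path.reverse()
    return path


E = [(1, 2), (1, 5), (2, 3), (3, 1), (5, 6), (6, 1)]
print('->'.join(str(x) for x in closed_path_iterative(E, 1)))
# 1->5->6->1->2->3->1
```

It returns a list rather than an iterator, which costs one extra list of length len(arcs) + 1 but keeps the control flow easy to follow.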

Answer by BrokenBenchmark

You're looking for a directed Eulerian circuit in your (sub)graph. An Eulerian circuit is a closed trail that uses every edge exactly once.
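
If you are not sure that your edge list actually forms such a circuit, a quick sanity check is that every node must have as many incoming as outgoing edges. Here is a small sketch of that check; has_balanced_degrees is a name invented for this example, and it only tests the degree condition, not that the edges are all connected to each other.

```python
from collections import Counter
from typing import List, Tuple


def has_balanced_degrees(arcs: List[Tuple[int, int]]) -> bool:
    """True if every node has equal in-degree and out-degree."""
    out_degree = Counter(u for u, _ in arcs)  # arcs leaving each node
    in_degree = Counter(v for _, v in arcs)   # arcs entering each node
    return out_degree == in_degree


print(has_balanced_degrees([(1, 2), (1, 5), (2, 3), (3, 1), (5, 6), (6, 1)]))  # True
print(has_balanced_degrees([(1, 2), (2, 3)]))  # False
```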

The networkx package has a function that can generate the desired circuit for you in linear time:

import networkx as nx

edges = [(1,2), (1,5), (2,3), (3,1), (5,6), (6,1)]
G = nx.MultiDiGraph()
G.add_edges_from(edges)

# Prints [(1, 5), (5, 6), (6, 1), (1, 2), (2, 3), (3, 1)]
# which matches the desired output (as asked in the comments).
print([edge for edge in nx.algorithms.euler.eulerian_circuit(G)])

The documentation cites a 1973 paper, if you're interested in understanding how the algorithm works. You can also take a look at the source code here. Note that we're working with multigraphs here, since you can have multiple edges that have the same source and destination node. There are probably other implementations floating around on the Internet, but they may or may not work for multigraphs.
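
The function yields edges, while the question writes the path as a sequence of nodes. A minimal sketch of the conversion, using the edge list printed above as input (circuit_to_node_path is a hypothetical helper name, not part of networkx):

```python
from typing import List, Tuple


def circuit_to_node_path(circuit: List[Tuple[int, int]]) -> List[int]:
    """Turn consecutive edges (u, v) of a circuit into the visited node sequence."""
    if not circuit:
        return []
    # Take the start node of every edge, then close with the last edge's end node.
    return [u for u, _ in circuit] + [circuit[-1][1]]


circuit = [(1, 5), (5, 6), (6, 1), (1, 2), (2, 3), (3, 1)]
print('->'.join(str(n) for n in circuit_to_node_path(circuit)))
# 1->5->6->1->2->3->1
```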