Efficient reverse-factorization of a number given a list of divisors


Given a number n and a list of divisors A, how can I efficiently find all the combinations of divisors that, when multiplied together, yield the number?

e.g.

n = 12 
A = [2, 3, 4]

Output:

[[3, 2, 2],
 [2, 3, 2],
 [2, 2, 3],
 [4, 3],
 [3, 4]]

This is what I managed to do so far (code that I re-adapted from one of the many find-prime-factorization questions on stackoverflow):

def products(n, A):
    if n == 1:
        yield []
    for each_divisor in A:
        if n % each_divisor == 0:
            for new_product in products(n // each_divisor, A):
                yield new_product + [each_divisor]

This code works correctly, but it is very slow. If I try to memoize it (passing A as a tuple to the function to avoid the unhashable-type error), the code no longer produces the correct result.

Any suggestions on how to improve the efficiency of this code?

The memoized code I tried is the following:

class Memoize:
    def __init__(self, fun):
        self.fun = fun
        self.memo = {}

    def __call__(self, *args):
        if args not in self.memo:
            self.memo[args] = self.fun(*args)
        return self.memo[args]

@Memoize
def products(n, A): [as above]

When calling the function with the above defined parameters n, A:

>>> list(products(12, (2, 3, 4)))
[[3, 2, 2]]

Without memoization, the output of the same code is:

[[3, 2, 2], [2, 3, 2], [2, 2, 3], [4, 3], [3, 4]]

Note that other memoization decorators (e.g. @functools.lru_cache(maxsize=128) from the functools module) lead to the same problem.
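The cache is not the culprit: the recursive calls return generators, and a generator can be consumed only once, so every cache hit after the first returns an already-exhausted iterator and yields nothing. That is exactly why only the first complete path, [3, 2, 2], survives. A memoized version therefore has to cache materialized results instead of generator objects. A sketch of that idea (products_cached is a hypothetical name, not from the question):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def products_cached(n, A):
    # A must be hashable, e.g. a tuple of divisors.
    # Return a tuple of tuples rather than a generator: the cached
    # value can then be iterated any number of times.
    result = []
    if n == 1:
        result.append(())
    for d in A:
        if n % d == 0:
            for rest in products_cached(n // d, A):
                result.append(rest + (d,))
    return tuple(result)
```

With this change, products_cached(12, (2, 3, 4)) returns all five factorizations (as tuples), while repeated subproblems such as products_cached(1, ...) are still computed only once.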

1 Answer

Answered by Mad Physicist (accepted answer)

Rather than using memoization, you can split the problem into a recursive portion that finds all the unique combinations of divisors, and a portion that generates the permutations of each combination. That should cut down your search space considerably and permute only the options that will actually work.

To accomplish this, A should be sorted.

Part 1:

Do a DFS on the graph of possible factorizations. Prune redundant branches by selecting only orderings in which each factor is greater than or equal to its predecessor. For example:

                   12
                /  |  \
               /   |   \
              /    |    \
          2(x6)  3(x4)   4(x3)
         /  |      |  \
     2(x3) 3(x2)   3   4(x1)
    /  |
   2  3(x1)

In the original formatted answer, bold nodes mark the paths that lead to a successful factorization, and struck-through nodes mark redundant branches, where the remaining n after dividing by the factor is less than the factor. Nodes that don't show a remaining value in parentheses do not lead to a factorization at all. No branch is attempted for factors lower than the current one: once we try 3, 2 is never revisited, only 3 and 4, etc.

In code:

A.sort()

def products(n, A):
    def inner(n, A, L):
        for i in range(len(A)):
            factor = A[i]
            if n % factor: continue

            k = n // factor
            if k < factor:
                if k == 1:
                    yield L + [factor]
                elif n in A:
                    yield L + [n]
                break  # Following k guaranteed to be even smaller
                       # until k == 1, which elif shortcuts

            yield from inner(k, A[i:], L + [factor])

    yield from inner(n, A, [])

This is pretty fast. In your particular case, it inspects only 4 nodes instead of ~30. In fact, you can prove that it inspects the absolute minimum number of nodes possible. The only improvement you might get is by using iteration instead of recursion, and I doubt that will help much.
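For completeness, the recursion can be replaced with an explicit stack. A sketch of that variant (products_iter is a hypothetical name; it visits the same nodes, though it yields results in a different order):

```python
def products_iter(n, A):
    # Explicit-stack DFS equivalent of the recursive inner():
    # yields the same unique non-decreasing factorizations.
    A = sorted(set(A))
    stack = [(n, 0, [])]  # (remaining value, first divisor index, path so far)
    while stack:
        m, start, path = stack.pop()
        for i in range(start, len(A)):
            factor = A[i]
            if m % factor:
                continue
            k = m // factor
            if k < factor:
                if k == 1:
                    yield path + [factor]
                elif m in A:
                    yield path + [m]
                break  # every later factor leaves an even smaller remainder
            stack.append((k, i, path + [factor]))
```

For n = 12 and A = [2, 3, 4] this produces the two unique non-decreasing factorizations, [2, 2, 3] and [3, 4], just like the recursive version.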

Part 2:

Now, you just generate the permutations of each element of the result. Python provides the tools to do this directly in the standard library:

from itertools import chain, permutations

chain.from_iterable(map(permutations, products(n, A)))

You can put this into the last line of products as

yield from chain.from_iterable(map(permutations, inner(n, A, [])))

Running list(products(12, A)) shows a 20-30% improvement on my machine this way (5.2µs vs 4.0µs). Running with a more complicated example like list(products(2 * 3 * 4 * 5 * 5 * 7 * 11, [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 22])) shows an even more dramatic improvement: 7ms vs 42ms.
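Note that permuting introduces duplicates whenever a factorization contains repeated factors, which is what the next part addresses. A quick illustration:

```python
from itertools import permutations

# [2, 2, 3] has a repeated factor, so permuting it yields
# 3! = 6 tuples, of which only 3 are distinct.
perms = list(permutations([2, 2, 3]))
print(len(perms), len(set(perms)))
```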

Part 2b:

You can filter out duplicate permutations that occur because of duplicate factors using an approach similar to the one shown here (shameless plug). Adapting for the fact that we always deal with an initial list of sorted integers, it can be written something like this:

def perm_dedup(tup):
    maximum = (-1,) * len(tup)
    for perm in permutations(tup):
        if perm <= maximum: continue
        maximum = perm
        yield perm
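For example (repeating the definition so the snippet runs standalone), the six raw permutations of [2, 2, 3] collapse to three:

```python
from itertools import permutations

def perm_dedup(tup):
    # For sorted input, the first occurrence of each distinct permutation
    # appears in increasing lexicographic order, so a running maximum
    # filters out the repeats.
    maximum = (-1,) * len(tup)
    for perm in permutations(tup):
        if perm <= maximum: continue
        maximum = perm
        yield perm

print(list(perm_dedup([2, 2, 3])))
```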

Now you can use the following in the last line:

yield from chain.from_iterable(map(perm_dedup, inner(n, A, [])))

The timings still favor this complete approach very much: 5.2µs vs 4.9µs for the question and 6.5ms vs 42ms for the long example. In fact, if anything, avoiding duplicate permutations seems to reduce the timing even more.

TL;DR

A much more efficient implementation that only uses standard libraries and searches only for unique permutations of unique factorizations:

from itertools import chain, permutations

def perm_dedup(tup):
    maximum = (-1,) * len(tup)
    for perm in permutations(tup):
        if perm <= maximum: continue
        maximum = perm
        yield perm

def products(n, A):
    A = sorted(set(A))
    def inner(n, A, L):
        for i in range(len(A)):
            factor = A[i]
            if n % factor: continue

            k = n // factor
            if k < factor:
                if k == 1:
                    yield L + [factor]
                elif n in A:
                    yield L + [n]
                break  # Following k guaranteed to be even smaller
                       # until k == 1, which elif shortcuts

            yield from inner(k, A[i:], L + [factor])

    yield from chain.from_iterable(map(perm_dedup, inner(n, A, [])))
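As a sanity check (repeating the TL;DR definitions so the snippet runs standalone), the question's example reproduces all five orderings, yielded as tuples rather than lists:

```python
from itertools import chain, permutations

def perm_dedup(tup):
    maximum = (-1,) * len(tup)
    for perm in permutations(tup):
        if perm <= maximum: continue
        maximum = perm
        yield perm

def products(n, A):
    A = sorted(set(A))
    def inner(n, A, L):
        for i in range(len(A)):
            factor = A[i]
            if n % factor: continue
            k = n // factor
            if k < factor:
                if k == 1:
                    yield L + [factor]
                elif n in A:
                    yield L + [n]
                break
            yield from inner(k, A[i:], L + [factor])
    yield from chain.from_iterable(map(perm_dedup, inner(n, A, [])))

# The five orderings from the question, as tuples:
print(sorted(products(12, [2, 3, 4])))
```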