Given a number n
and a list of divisors A
, how can I efficiently find all the combinations of divisors that, when multiplied, yield to the number?
e.g.
n = 12
A = [2, 3, 4]
Output:
[[3, 2, 2],
[2, 3, 2],
[2, 2, 3],
[4, 3],
[3, 4]]
This is what I managed to do so far (code that I re-adapted from one of the many find-prime-factorization questions on stackoverflow):
def products(n, A):
if n == 1:
yield []
for each_divisor in A:
if n % each_divisor == 0:
for new_product in products(n // each_divisor, A):
yield new_product + [each_divisor]
This code seems to work properly but it's very slow, and if I try to use memoization (passing A
as a tuple to the function to avoid unhashable type error) the code doesn't provide the correct result.
Any suggestions on how to improve the efficiency of this code?
The memoized code I tried is the following:
class Memoize:
def __init__(self, fun):
self.fun = fun
self.memo = {}
def __call__(self, *args):
if args not in self.memo:
self.memo[args] = self.fun(*args)
return self.memo[args]
@Memoize
def products(n, A): [as above]
When calling the function with the above defined parameters n, A
:
>>> list(products(12, (2, 3, 4)))
[[3, 2, 2]]
Without memoization, the output of the same code is:
[[3, 2, 2], [2, 3, 2], [2, 2, 3], [4, 3], [3, 4]]
Note that other memoizazation functions (e.g. from the functools
package @functools.lru_cache(maxsize=128)
) lead to the same problem.
Rather than using memoization, you can split the problem into a recursive portion to find all the unique combinations, and a portion to find the combinations of each arrangement. That should cut down your search space considerably and only permute the options that will actually work.
To accomplish this,
A
should be sorted.Part 1:
Do a DFS on the graph of possible factorizations that are available. Truncate the search down redundant branches by only selecting orderings in which each factor is greater than or equal to its predecessor. For example:
Bold nodes are the paths that lead to a successful factorization.
Strucknodes are ones that lead to a redundant branch because the remainingn
after dividing by the factor is less than the factor. Nodes that don't show a remaining value in parentheses do not lead to a factorization at all. No branch is attempted for the factors lower than the current one: when we try 3, 2 is never revisited, only 3 and 4, etc.In code:
This is pretty fast. In your particular case, it only inspects 4 nodes instead of ~30. In fact, you can prove that it inspects the absolute minimum number of nodes possible. The only improvement you might get is by using iteration instead of recursion, and I doubt that will help much.
Part 2:
Now, you just generate a permutation of each element of the result. Python provides the tools to do this directly in the standard library:
You can put this into the last line of
products
asRunning
list(products(12, A))
shows a 20-30% improvement on my machine this way (5.2µs vs 4.0µs). Running with a more complicated example likelist(products(2 * 3 * 4 * 5 * 5 * 7 * 11, [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 22]))
shows an even more dramatic improvement: 7ms vs 42ms.Part 2b:
You can filter out duplicate permutations that occur because of duplicate factors using an approach similar to the one shown here (shameless plug). Adapting for the fact that we always deal with an initial list of sorted integers, it can be written something like this:
Now you can use the following in the last line:
The timings sill favor this complete approach very much: 5.2µs vs 4.9µs for the question and 6.5ms vs 42ms for the long example. In fact, if anything, avoiding duplicate permutations seems to reduce the timing even more.
TL;DR
A much more efficient implementation that only uses standard libraries and searches only for unique permutations of unique factorizations: