algorithm to find products of a set of primes, in order, greater than x

1.7k views Asked by At

Consider the finite set {2,3,5,...,n}. I am interested in primes but the question could apply to any set of numbers. I want to find all possible products of these numbers in ascending order, and in particular greater than or equal to some number x. Does anyone know a nice algorithm for this?

EDIT to clarify:

Each factor in the input set may be used any number of times. If the input were {2,3,5,7} the output would be {2,3,4,5,6,7,8,9,10,12,14,15,16,18,...}. The algorithm can stop as soon as it produces a result greater than or equal to some number x.

7

There are 7 answers

0
Will Ness On BEST ANSWER

A Haskell code, as seen in this answer,

hamm :: [Integer] -> [Integer]
hamm []     = []   
hamm (p:ps) = xs        -- e.g. hamm [2,3,5] 
        where xs = merge (hamm ps)               --   H({p} ∪ ps) = S,
                         (p : map (p*) xs)       -- S ⊇ {p} ∪ H(ps) ∪ { p*x | x ∊ S }

merge a@(x:xs) b@(y:ys) | x < y     = x : merge xs b 
                        | otherwise = y : merge a ys 
merge [] b = b
merge a [] = a

merge here doesn't try to eliminate multiples, because there won't be any -- but only in case you're using only the primes in the input:

~> take 20 $ hamm [2,3,5,7]
[2,3,4,5,6,7,8,9,10,12,14,15,16,18,20,21,24,25,27,28]

If not, you need to use union instead,

union a@(x:xs) b@(y:ys) | x < y     = x : union xs  b
                        | x > y     = y : union a  ys
                        | otherwise = x : union xs ys
union [] b = b
union a [] = a

Starting from (above) a given value efficiently might be an interesting challenge. A directly slice-generating code at the bottom of this answer could be taken as a starting point.

In general it is easy to skip along the ordered sequence until a value is passed over. In Haskell, it is done with a built-in dropWhile (< n),

~> take 10 $ dropWhile (< 100) $ hamm [2,3,5,7]
[100,105,108,112,120,125,126,128,135,140]
0
rossum On

Every integer greater than 1 is the product of a 'set of primes' because it is the product of its prime factors. It might be easier to start at your desired minimum number and strike out all numbers that have a prime factor not in your initial set. Continue the process until your result set is large enough. In effect you are doing a modified Sieve of Eratosthenes, removing all multiples of the primes not in your initial set.

0
n. m. could be an AI On

(Edit: made it produce all products in ascending order; let users filter them as they wish. This is a generalised Hamming numbers problem)

genHamming     :: Integral a => [a] -> [a]
genHamming  zs = hmng where
         hmng = 1 : foldr (||) []  [map (z*) hmng | z <- zs]
         []     ||  ys             =  ys
         xs     ||  []             =  xs
         (x:xs) || (y:ys)  | x==y  =  x : (xs || ys)
                           | x<y   =  x : (xs || (y:ys))
                           | y<x   =  y : (ys || (x:xs))

Example usage

 Prelude Hamming> take 10 $ dropWhile (< 1000) $ genHamming [2,3,5]
 [1000,1024,1080,1125,1152,1200,1215,1250,1280,1296]
 Prelude Hamming>
0
Dillon Davis On

There are two algorithms that come to mind for this. First, you could calculate all possible products between the numbers, and then sorts these. Although this seems to be the naive approach, you could speed this up by 'remembering' the last product, dividing out one number, and multiplying a different number in its place. This would greatly reduce the number of operations necessary, if with the correct ordering of your permutations (look into Gray Codes) you could minimize the total multiplications.

On the other hand, what you could do is compute the sets of all the original numbers, the set of the products of pairs (of two) original numbers, the set of products of 3... so on. Then you can sort each individual set (which shouldn't be hard to ensure they are nearly sorted anyways), and the mergesort the sets together into one sorted list of products. This would take more operations, but would result in a nearly sorted list at the end, which may take less time to construct overall.

The other algorithm would be to take the product of all the primes of interest, and call this P. Construct another list of all the original primes squared. Now, loop over all numbers up to P, and test if those are divisible by any of the values in the primes-squared array. If they are, toss them. If not, then add them to the output array. You can optimize this by only testing divisibility up to sqrt(i), where i is the iteration in your for loop. This is still likely slower than the above method though.

1
user448810 On

You probably also want to include 2^0 * 3^0 * 5^0 * 7^0 = 1 in your output.

The way to do this is with a priority queue. If k is in the sequence, so are 2k, 3k, 5k and 7k. Start your output with 1, then add 2, 3, 5, and 7 to the priority queue. Pop 2 from the top of the queue and add 2*2=4, 2*3=6, 2*5=10 and 2*7=14 to the queue; the queue at that point will contain 3, 4, 5, 6, 7, 10 and 14. Pop 3 from the top of the queue and add 3*2=6, 3*3=9, 3*5=15 and 3*7=21 to the queue. And so on.

You will discover that many elements are duplicated; for instance, we added 6 to the priority queue twice in the example above. You can either add duplicates, and check each time you pop the queue if the element is the same as the prior member of the sequence, or you can keep a separate list of items in the queue and refrain from adding duplicates in the first place.

I discuss a priority queue that contains only distinct elements at my blog.

2
Ralf Kleberhoff On

As every prime factor is allowed to appear many times, the sequence is infinite. So we can't generate all products and then sort them. We have to generate the sequence iteratively.

If a is a member of the sequence, then {2*a, 3*a, 5*a ... n*a} will also be members of the sequence, coming later.

So, the algorithm that comes to my mind is to have a (sorted, duplicate-free) buffer of the next sequence members. We extract and present the smallest value and insert all of its multiples into the buffer.

As it's difficult to predict the buffer content for your starting number x, this algorithm should start from the beginning and ignore the results until they reach the x threshold.

0
Tavin On

Because our application is written in python I came up with the following implementation which I wanted to share:

def powers(x):
    y = x
    while True:
        yield y
        y *= x


def products(factors):
    y0 = factors[0]
    if len(factors) == 1:
        yield from powers(y0)
    else:
        yield y0
        g1 = products(factors)
        y1 = y0 * next(g1)
        g2 = products(factors[1:])
        y2 = next(g2)
        while True:
            if y1 < y2:
                yield y1
                y1 = y0 * next(g1)
            else:
                yield y2
                y2 = next(g2)


if __name__ == "__main__":
    import itertools
    for n in itertools.islice(products([2, 3, 5, 7]), 10**6):
        print(n)

No doubt the use of recursion with generators could be improved upon, but in practice the performance is good enough for our application. Beyond that, I'm still interested in how to start at a given minimum value efficiently, as mentioned in Will Ness' answer. Thanks to all those who contributed.