I was playing around with generating Hamming numbers in Haskell, trying to improve on the obvious (pardon the naming of the functions)
mergeUniq :: Ord a => [a] -> [a] -> [a]
mergeUniq (x:xs) (y:ys) = case x `compare` y of
EQ -> x : mergeUniq xs ys
LT -> x : mergeUniq xs (y:ys)
GT -> y : mergeUniq (x:xs) ys
powers :: [Integer]
powers = 1 : expand 2 `mergeUniq` expand 3 `mergeUniq` expand 5
where
expand factor = (factor *) <$> powers
I noticed that I can avoid the (slower) arbitrary precision Integer if I represent the numbers as the triple of the 2-, 3- and 5-exponents like data Power = Power { k2 :: !Int, k3 :: !Int, k5 :: !Int }, where the number is understood to be 2k2 * 3k3 * 5k5. The comparison of two Powers then becomes
instance Ord Power where
p1 `compare` p2 = toComp (p1 `divP` gcdP) `compare` toComp (p2 `divP` gcdP)
where
divP p1 p2 = Power { k2 = k2 p1 - k2 p2, k3 = k3 p1 - k3 p2, k5 = k5 p1 - k5 p2 }
gcdP = Power { k2 = min (k2 p1) (k2 p2), k3 = min (k3 p1) (k3 p2), k5 = min (k5 p1) (k5 p2) }
toComp Power { .. } = fromIntegral k2 * log 2 + fromIntegral k3 * log 3 + fromIntegral k5 * log 5
So, very roughly speaking, to compare p₁ = 2i₁ * 3j₁ * 5k₁ and p₂ = 2i₂ * 3j₂ * 5k₂ we compare the logarithms of p₁ and p₂, which presumably fit Double. But actually we do even better: we first compute their GCD (by finding the mins of the corresponding exponents pairs — only Int arithmetic so far!), divide p₁ and p₂ by the GCD (by subtracting the mins from the corresponding exponents — also only Int arithmetic), and compare the logarithms of the results.
But, given that we go through Doubles, there will be loss of precision eventually. And this is the ground for my questions:
- When will the finite precision of
Doubles bite me? That is, how to estimate the order ofi, j, kfor which the results of comparisons of2i * 3j * 5kwith numbers with "similar" exponents will become unreliable? - How does the fact that we go through dividing by the GCD (which presumably lowers the exponents considerably for this task) modify the answer to the previous question?
I did an experiment, comparing the numbers produced this way with the numbers produced via going through arbitrary precision arithmetic, and all Hamming numbers up to the 1'000'000'000th match exactly (which took me about 15 minutes and 600 megs of RAM to verify). But that's obviously not a proof.
Empirically, it's above about 10 trillionths Hamming number, or higher.
Using your nice GCD trick won't help us here, because some neighboring Hamming numbers are bound to have no common factors between them.
update: trying it online on ideone and elsewhere, we get