I was playing around with generating Hamming numbers in Haskell, trying to improve on the obvious solution (pardon the naming of the functions):
mergeUniq :: Ord a => [a] -> [a] -> [a]
mergeUniq (x:xs) (y:ys) = case x `compare` y of
  EQ -> x : mergeUniq xs ys
  LT -> x : mergeUniq xs (y:ys)
  GT -> y : mergeUniq (x:xs) ys
powers :: [Integer]
powers = 1 : expand 2 `mergeUniq` expand 3 `mergeUniq` expand 5
  where
    expand factor = (factor *) <$> powers
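For reference, this yields the Hamming (5-smooth) numbers in increasing order; in GHCi:

ghci> take 20 powers
[1,2,3,4,5,6,8,9,10,12,15,16,18,20,24,25,27,30,32,36]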
I noticed that I can avoid the (slower) arbitrary-precision Integer if I represent each number as the triple of its 2-, 3- and 5-exponents,

data Power = Power { k2 :: !Int, k3 :: !Int, k5 :: !Int }

where the number is understood to be 2^k2 * 3^k3 * 5^k5. The comparison of two Powers then becomes
-- Ord requires an Eq instance as its superclass (one is sketched below);
-- the Power { .. } pattern needs {-# LANGUAGE RecordWildCards #-}.
instance Ord Power where
  p1 `compare` p2 = toComp (p1 `divP` gcdP) `compare` toComp (p2 `divP` gcdP)
    where
      divP pa pb = Power { k2 = k2 pa - k2 pb, k3 = k3 pa - k3 pb, k5 = k5 pa - k5 pb }
      gcdP = Power { k2 = min (k2 p1) (k2 p2), k3 = min (k3 p1) (k3 p2), k5 = min (k5 p1) (k5 p2) }
      toComp Power { .. } = fromIntegral k2 * log 2 + fromIntegral k3 * log 3 + fromIntegral k5 * log 5
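For completeness, here is a minimal sketch (mine, not from the original code; powersP and the step functions are hypothetical names) of an Eq instance and of the stream rebuilt over Power, where multiplying by 2, 3 or 5 is just bumping an exponent:

instance Eq Power where
  pa == pb = (k2 pa, k3 pa, k5 pa) == (k2 pb, k3 pb, k5 pb)

powersP :: [Power]
powersP = Power 0 0 0 : expand step2 `mergeUniq` expand step3 `mergeUniq` expand step5
  where
    expand step = step <$> powersP
    step2 p = p { k2 = k2 p + 1 }  -- multiply by 2
    step3 p = p { k3 = k3 p + 1 }  -- multiply by 3
    step5 p = p { k5 = k5 p + 1 }  -- multiply by 5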
So, very roughly speaking, to compare p₁ = 2^i₁ * 3^j₁ * 5^k₁ and p₂ = 2^i₂ * 3^j₂ * 5^k₂ we compare the logarithms of p₁ and p₂, which presumably fit in a Double. But actually we do even better: we first compute their GCD (by taking the min of each corresponding pair of exponents — only Int arithmetic so far!), divide p₁ and p₂ by that GCD (by subtracting the mins from the corresponding exponents — also only Int arithmetic), and compare the logarithms of the results.
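To make that concrete, a worked comparison (the numbers are mine, chosen for illustration):

-- p1 = 2^19 = 524288 and p2 = 2^4 * 3^8 * 5 = 524880 share a factor of 2^4,
-- so gcdP = Power 4 0 0 and we actually compare 2^15 = 32768 with 3^8 * 5 = 32805:
--   toComp (Power 15 0 0) = 15 * log 2          ≈ 10.397208
--   toComp (Power 0  8 1) = 8 * log 3 + log 5   ≈ 10.398336
-- so compare returns LT, matching 524288 < 524880.
example :: Ordering
example = Power 19 0 0 `compare` Power 4 8 1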
But, given that we go through Doubles, there will eventually be loss of precision. And this is the basis for my questions:
- When will the finite precision of Doubles bite me? That is, how can I estimate the order of i, j, k for which the results of comparisons of 2^i * 3^j * 5^k with numbers having "similar" exponents become unreliable?
- How does the fact that we go through dividing by the GCD (which presumably lowers the exponents considerably for this task) modify the answer to the previous question?
I did an experiment, comparing the numbers produced this way with those produced via arbitrary-precision Integer arithmetic, and all Hamming numbers up to the 1'000'000'000th match exactly (which took me about 15 minutes and 600 MB of RAM to verify). But that's obviously not a proof.
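A sketch of the kind of cross-check meant here (toInt and checkUpTo are illustrative names, not from the question; powersP is the sketch above):

-- Convert a Power back to the Integer it encodes.
toInt :: Power -> Integer
toInt Power { .. } = 2 ^ k2 * 3 ^ k3 * 5 ^ k5

-- True iff the first n Power-based Hamming numbers match the Integer-based ones.
checkUpTo :: Int -> Bool
checkUpTo n = take n (map toInt powersP) == take n powers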
Empirically, it's somewhere above about the 10-trillionth Hamming number, or even higher.
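A rough back-of-the-envelope (my own, not a proof) suggests why the breakdown is so far out: the count of Hamming numbers whose base-2 logarithm is below m is approximately m^3 / (6 * log₂3 * log₂5), so the average gap between consecutive log₂ values near the n-th number can be estimated by inverting that count:

-- Average gap between consecutive log2 values near the n-th Hamming number,
-- from inverting N(m) ≈ m^3 / (6 * logBase 2 3 * logBase 2 5).
avgLogGap :: Double -> Double
avgLogGap n = 1 / (3 * m * m / c)
  where
    c = 6 * logBase 2 3 * logBase 2 5
    m = (c * n) ** (1/3)
-- ghci> avgLogGap 1e13   -- ≈ 2.0e-9
-- An ulp of a Double near m ≈ 6e4 is 2^-37 ≈ 7.3e-12, so even at n = 10^13 the
-- *average* gap is still hundreds of ulps; it is the rare, unusually close pairs
-- (plus accumulated rounding in toComp) that eventually break comparisons.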
Using your nice GCD trick won't help us here, because some neighboring Hamming numbers are bound to have no common factors between them (for instance, 80 = 2^4 * 5 and 81 = 3^4 are adjacent in the sequence yet coprime).
Update: trying it online on ideone and elsewhere, we get