Why does this Hedgehog generator not shrink faster?

117 views Asked by At

I made a Hedgehog generator that generates arbitrary 256-bit values in the following way:

genWord256 :: Gen Word256
genWord256 = do
  bytes <- Gen.integral (Range.linear 0 31)
  let lo = 2 ^ (8 * bytes)
      hi = 2 ^ (8 * (bytes + 1))
  pred <$> Gen.integral (Range.constant lo hi)

Making the size parameter determine the number of bytes in the number, I think, makes sense to my application. However, evaluating how this generator shrinks, and applying ceiling . logBase 2 to this, my question is this:

Why does Hedgehog decide to emphasise on the vicinity of its initial outcome? Have I somehow misunderstood what is meant by "a range which is unaffected by the size parameter"? (Range.constant) I would have thought that any shrinkage here must have a smaller number of bits.

λ> Gen.print genWord256
=== Outcome ===
68126922926972638
=== Shrinks ===
112                 -- 7 bits
4035711763          -- 32 bits
106639875637011     -- 47 bits
281474976710655     -- 48 bits
34204198951841647   -- 55 bits
51165560939407143   -- 56 bits
59646241933189891   -- ...
67994412286444783   -- ...
... 50 shrinks omitted ...
68126922926972637   -- 56 bits
1

There are 1 answers

0
kirelagin On BEST ANSWER

The output that you showed makes perfect sense to me.

First and foremost, just to make sure that we are on the same page, what Gen.print shows is not a sequence of consequitive shrinks assuming failure, it is just the first level of the shrink tree.

So, in your example, it generated 68126922926972638, which is a 7-byte value. Assuming that this fails, it will try 112, a 1-byte value. This is as small as possible, given that it corresponds to the value 0 at your first generator. If this test fails too, it will move to the second level of the shrink tree, which we don’t see, but it would be reasonable to assume that it will focus on 1-byte values and try to shrink them towards lo.

However, if the test at 112 succeeds, it will move to the second value on your list, 4035711763, which is a 4-byte value. And note that 4 bytes correspond to 3 in your first generator, which is right in the middle between 0 and 6, because it is doing binary search on the size of your inputs. If the test succeeds again, it will continue moving closer to the original outcome, that is “emphasise on the vicinity of its initial outcome”, and that is what we see in your output.