Uniqueness and other restrictions for Arbitrary in QuickCheck

931 views Asked by At

I'm trying to write a modified Arbitrary instance for my data type, where (in my case) a subcomponent has a type [String]. I would ideally like to bring uniqueness in the instance itself, that way I don't need ==> headers / prerequisites for every test I write.

Here's my data type:

data Foo = Vars [String]

and the trivial Arbitrary instance:

instance Arbitrary Foo where
  arbitrary = Vars <$> (:[]) <$> choose ('A','z')

This instance is strange, I know. In the past, I've had difficulty when quickcheck combinatorically explodes, so I'd like to keep these values small. Another request - how can I make an instance where the generated strings are under 4 characters, for instance?

All of this, fundamentally requires (boolean) predicates to augment Arbitrary instances. Is this possible?

2

There are 2 answers

0
Petr On BEST ANSWER

Definitely you want the instance to produce only instances that match the intention of the data type. If you want all the variables to be distinct, the Arbitrary instance must reflect this. (Another question is if in this case it wouldn't make more sense to define Vars as a set, like newtype Vars = Set [String].)

I'd suggest to check for duplicates using Set or Hashtable, as nub has O(n^2) complexity, which might slow down your test considerably for larger inputs. For example:

import Control.Applicative
import Data.List (nub)
import qualified Data.Set as Set
import Test.QuickCheck

newtype Foo = Vars [String]

-- | Checks if a given list has no duplicates in _O(n log n)_.
hasNoDups :: (Ord a) => [a] -> Bool
hasNoDups = loop Set.empty
  where
    loop _ []       = True
    loop s (x:xs) | s' <- Set.insert x s, Set.size s' > Set.size s
                    = loop s' xs
                  | otherwise
                    = False

-- | Always worth to test if we wrote `hasNoDups` properly.
prop_hasNoDups :: [Int] -> Property
prop_hasNoDups xs = hasNoDups xs === (nub xs == xs)

Your instance then needs to create a list of list, and each list should be randomized. So instead of (: []), which creates just a singleton list (and just one level), you need to call listOf twice:

instance Arbitrary Foo where
  arbitrary = Vars <$> (listOf . listOf $ choose ('A','z'))
                       `suchThat` hasNoDups

Also notice that choose ('A', 'z') allows to use all characters between A and z, which includes many control characters. My guess is that you rather want something like

oneof [choose ('A','Z'), choose ('a','z')]

If you really want, you could also make hasNoDups O(n) using hash tables in the ST monad.


Concerning limiting the size: you could always have your own parametrized functions that produce different Gen Foo, but I'd say in most cases it's not necessary. Gen has it's own internal size parameter, which is increased throughout the tests (see this answer), so different sizes (as generated using listOf) of lists are covered.

But I'd suggest you to implement shrink, as this will give you much nicer counter-examples. For example, if we define (a wrong test) that tried to verify that no instance of Var contains 'a' in any of its variable:

prop_Foo_hasNoDups :: Foo -> Property
prop_Foo_hasNoDups (Vars xs) = all (notElem 'a') xs === True

we'll get ugly counter-examples such as

Vars ["RhdaJytDWKm","FHHhrqbI","JVPKGTqNCN","awa","DABsOGNRYz","Wshubp","Iab","pl"]

But adding

shrink (Vars xs) = map Vars $ shrink xs

to Arbitrary Foo makes the counter-example to be just

Vars ["a"]
2
Athan Clark On

suchThat :: Gen a -> (a -> Bool) -> Gen a is a way to embed Boolean predicates in a Gen. See the haddocks for more info.

Here's how you would make the instance unique:

instance Arbitrary Foo where
  arbitrary = Vars <$> (:[]) <$> (:[]) <$> choose ('A','z')
              `suchThat` isUnique
    where
      isUnique x = nub x == x