How to create data for Criterion benchmarks?


I am using criterion to benchmark my Haskell code. I'm doing some heavy computations for which I need random data. I've written my main benchmark file like this:

main :: IO ()
main = newStdGen >>= defaultMain . benchmarks

benchmarks :: RandomGen g => g -> [Benchmark]
benchmarks gen =
   [
     bgroup "Group"
     [
       bench "MyFun" $ nf benchFun (dataFun gen)
     ]
   ]

I keep the benchmarks and the data generators for them in different modules:

benchFun :: ([Double], [Double]) -> [Double]
benchFun (ls, sig) = fun ls sig

dataFun :: RandomGen g => g -> ([Double], [Double])
dataFun gen = (take 5 $ randoms gen, take 1024 $ randoms gen)

This works, but I have two concerns. First, is the time needed to generate random data included in the benchmark? I found a question that touches on that subject, but honestly I'm unable to apply it to my code. To check whether this happens I wrote an alternative version of my data generator enclosed in the IO monad. I moved the benchmarks list into main, called the generator, extracted the result with <-, and then passed it to the benchmarked function. I saw no difference in performance.

My second concern is related to generating random data. Right now the generator, once created, is never updated, so the same data is generated throughout a single run. This is not a major problem, but it would be nice to do it properly. Is there a neat way to generate different random data within each data* function? "Neat" means "without making the data functions acquire a StdGen within IO".

EDIT: As noted in the comment below, I don't really care about data randomness. What is important to me is that the time needed to generate the data is not included in the benchmark.

2

There are 2 answers

3
jberryman On BEST ANSWER

This works, but I have two concerns. First, is the time needed to generate random data included in the benchmark?

Yes, it would. The random values are generated lazily, so nothing is computed until something actually demands them.

To check whether this happens I wrote an alternative version of my data generator enclosed in the IO monad. I moved the benchmarks list into main, called the generator, extracted the result with <-, and then passed it to the benchmarked function. I saw no difference in performance.

This is expected (if I understand what you mean); the random values from randoms gen aren't going to be generated until they're needed (i.e. inside your benchmark loop).

Is there a neat way to generate different random data within each data* function? "Neat" means "without making the data functions acquire a StdGen within IO".

You either need to be in IO or create a StdGen from an integer seed that you supply, using mkStdGen.
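As a minimal sketch of the mkStdGen route, assuming the random package: the name dataFunSeeded and the two seed parameters are made up for illustration and are not part of the question's code. Using a separate seed for each list also avoids both lists starting with the same values.

```haskell
import System.Random (mkStdGen, randoms)

-- Hypothetical variant of dataFun that takes two integer seeds instead of
-- a generator, so it needs no IO and is reproducible between runs.
dataFunSeeded :: Int -> Int -> ([Double], [Double])
dataFunSeeded seedA seedB =
  ( take 5 (randoms (mkStdGen seedA))
  , take 1024 (randoms (mkStdGen seedB))
  )
```

Each data* function can then be given its own seeds, e.g. dataFunSeeded 1 2 in one place and dataFunSeeded 3 4 in another.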

Re. your main question of how you should get the pRNG stuff out of your benchmarks, you should be able to evaluate the random input fully before your defaultMain (benchmarks g) stuff, with evaluate and force like:

import Control.DeepSeq (force)
import Control.Exception (evaluate)

myBench g = do
    -- Fully evaluate the random input here, before any benchmark runs.
    randInputEvaled <- evaluate $ force $ dataFun g
    defaultMain [
        bench "MyFun" $ nf benchFun randInputEvaled
        ...

where force evaluates its argument to normal form, but only once the result of force is itself demanded, so on its own it is still lazy. To make the evaluation happen outside of bench, we use evaluate, which relies on monadic sequencing. If you wanted to avoid the imports, you could instead call seq on something that depends on every element, such as the sum of each of the lists in your tuple.

That kind of thing should work fine, unless you need to hold a huge amount of test data in memory.
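For reference, here is a self-contained sketch of the whole program with the input forced up front. The body of benchFun is a stand-in I made up, since the question does not show fun; everything else follows the question's code, assuming the criterion, deepseq and random packages.

```haskell
import Control.DeepSeq (force)
import Control.Exception (evaluate)
import Criterion.Main (bench, bgroup, defaultMain, nf)
import System.Random (RandomGen, newStdGen, randoms)

-- Stand-in for the real computation; the question does not show `fun`.
benchFun :: ([Double], [Double]) -> [Double]
benchFun (ls, sig) = map (* sum ls) sig

-- Data generator as written in the question.
dataFun :: RandomGen g => g -> ([Double], [Double])
dataFun gen = (take 5 $ randoms gen, take 1024 $ randoms gen)

main :: IO ()
main = do
  gen <- newStdGen
  -- Force the whole input to normal form here, before criterion starts timing.
  input <- evaluate (force (dataFun gen))
  defaultMain
    [ bgroup "Group"
        [ bench "MyFun" $ nf benchFun input ]
    ]
```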

EDIT: this method is also a good idea if you want to get your data from IO, like reading from the disk, and don't want that mixed into your benchmarks.
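If your criterion version exports env from Criterion.Main, it may be worth a look for this case: it takes an IO action, evaluates the result to normal form, and hands it to the benchmarks, so the setup is kept out of the measurements. A rough sketch, where setupData is a hypothetical name and the body of benchFun is again a made-up stand-in:

```haskell
import Criterion.Main (bench, bgroup, defaultMain, env, nf)
import System.Random (newStdGen, randoms)

-- Stand-in for the real computation; the question does not show `fun`.
benchFun :: ([Double], [Double]) -> [Double]
benchFun (ls, sig) = map (* sum ls) sig

-- Hypothetical setup action. Each call to newStdGen yields a fresh
-- generator, so the two lists are drawn from different streams.
setupData :: IO ([Double], [Double])
setupData = do
  g1 <- newStdGen
  g2 <- newStdGen
  pure (take 5 (randoms g1), take 1024 (randoms g2))

main :: IO ()
main = defaultMain
  [ env setupData $ \input ->
      bgroup "Group"
        [ bench "MyFun" $ nf benchFun input ]
  ]
```

The same shape should work if setupData reads its input from a file instead of generating it.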

0
MathematicalOrchid On

You could try reading the random data from a disk file instead. (In fact, if you're on some Unix-like OS, you could even use /dev/urandom.)

However, depending on how much random data you need, the I/O time might dwarf the computation time.

(E.g., if your benchmark reads random numbers and calculates their sum, it's going to be I/O-limited. If your benchmark reads a random number and does some huge calculation based on just that one number, the I/O adds hardly any overhead at all.)