Generate Array of Numbers that fit to a Probability Distribution in Ruby?


Say I have 100 records, and I want to mock out the created_at date so it fits on some curve. Is there a library to do that, or what formula could I use? I think this is along the same track:

Generate Random Numbers with Probabilistic Distribution

I don't know much about how they are classified in mathematics, but I'm looking at things like:

  • bell curve
  • logarithmic (typical biology/evolution) curve? ...

Just looking for some formulas in code so I can say this:

  • Given 100 records, a timespan of 1.week, and an interval of 12.hours
  • set created_at for each record such that it fits, roughly, to curve

Thanks so much!

Update

I found this forum post about Ruby algorithms, which led me to rsruby, an R/Ruby bridge, but that seems like too much.

Update 2

I wrote this little snippet trying out the gsl library, getting there...

Generate test data in Rails where created_at falls along a Statistical Distribution


There are 4 answers

2
zsalzbank On BEST ANSWER

You can generate UNIX timestamps, which are really just integers. First figure out when you want to start, for example now:

start = Time.now.to_i

Find out when the end of your interval should be (say 1 week later):

finish = (Time.now + 1.week).to_i  # 1.week comes from ActiveSupport (Rails)

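Then generate a random number between the two. Ruby's Random class uses the Mersenne Twister algorithm, whose output is close to uniform, so on its own this spreads the timestamps evenly across the interval rather than along a curve: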

r = Random.new.rand(start..finish)

Then convert that back to a date:

d = Time.at(r)
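The steps above give evenly spread timestamps. For the bell curve asked about in the question, here is a minimal sketch that swaps the uniform draw for a normal one using the Box-Muller transform. Box-Muller is not part of this answer; it is a substitute technique. The names `WEEK`, `standard_normal`, and `bell_curve_times` are made up for illustration, and the choice of `sigma = span / 6` (so the window covers roughly three standard deviations either side of the midpoint) is an assumption you may want to tune.

```ruby
# Sketch only: timestamps clustered around the middle of a time window.
# Uses plain Ruby; WEEK equals 1.week.to_i if ActiveSupport is loaded.
WEEK = 7 * 24 * 60 * 60 # seconds

# Returns one standard-normal sample (mean 0, standard deviation 1)
# via the Box-Muller transform of two uniform samples.
def standard_normal(rng)
  u1 = 1.0 - rng.rand # shift [0, 1) to (0, 1] so Math.log never sees 0
  u2 = rng.rand
  Math.sqrt(-2.0 * Math.log(u1)) * Math.cos(2.0 * Math::PI * u2)
end

# Returns `count` Time objects between `start` and `start + span`
# (both in seconds since the epoch), bell-shaped around the midpoint.
# Samples that land outside the window are redrawn.
def bell_curve_times(count, start, span, rng = Random.new)
  mean = start + span / 2.0
  sigma = span / 6.0 # assumption: window spans about +/- 3 sigma
  Array.new(count) do
    t = mean + sigma * standard_normal(rng)
    t = mean + sigma * standard_normal(rng) until t.between?(start, start + span)
    Time.at(t.round)
  end
end
```

Calling `bell_curve_times(100, Time.now.to_i, WEEK)` should return 100 times for the coming week, most of them near the middle of it. Each value could then be assigned to a record's created_at.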

This looks promising as well: http://rb-gsl.rubyforge.org/files/rdoc/randist_rdoc.html

And this too: http://rb-gsl.rubyforge.org/files/rdoc/rng_rdoc.html

0
Carlos Agarie On

Another option is the Distribution gem under SciRuby. You can generate normally distributed numbers with:

require 'distribution'

rng = Distribution::Normal.rng
random_numbers = Array.new(100) { rng.call }

There are RNGs for various other distributions as well.

0
Mitch Wheat On

From wiki:

There are a couple of methods to generate a random number based on a probability density function. These methods involve transforming a uniform random number in some way. Because of this, these methods work equally well in generating both pseudo-random and true random numbers.

One method, called the inversion method, involves integrating up to an area greater than or equal to the random number (which should be generated between 0 and 1 for proper distributions).
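As a rough illustration of the inversion method, the sketch below samples from an exponential distribution, chosen here only because its cumulative distribution function, F(x) = 1 - exp(-rate * x), can be inverted by hand. The method name `exponential_sample` and the `rate` parameter are hypothetical, not from any library.

```ruby
# Sketch of inverse-transform sampling for an exponential distribution.
# Solving u = 1 - exp(-rate * x) for x gives x = -ln(1 - u) / rate.
def exponential_sample(rate, rng = Random.new)
  u = rng.rand # uniform in [0, 1)
  -Math.log(1.0 - u) / rate
end

# Example: 1000 samples with rate 2.0 (theoretical mean 1 / rate = 0.5).
samples = Array.new(1000) { exponential_sample(2.0) }
```

For distributions whose CDF has no closed-form inverse, the same idea applies, but the inversion has to be done numerically.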

A second method, called the acceptance-rejection method, involves choosing an x and y value and testing whether the function of x is greater than the y value. If it is, the x value is accepted. Otherwise, the x value is rejected and the algorithm tries again.
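The acceptance-rejection method can be sketched in a few lines. This version assumes the density is passed as a block, is defined on a bounded range, and never exceeds `y_max`; the name `rejection_sample` and its parameters are illustrative only.

```ruby
# Sketch of acceptance-rejection sampling on the range [x_min, x_max].
# The block must return the density at x, and y_max must be an upper
# bound on that density over the whole range.
def rejection_sample(x_min, x_max, y_max, rng = Random.new)
  loop do
    x = x_min + (x_max - x_min) * rng.rand # candidate x value
    y = y_max * rng.rand                   # candidate y value
    return x if y < yield(x)               # accept when y falls under the curve
  end
end

# Example: a hump-shaped density 6x(1 - x) on [0, 1], whose maximum is 1.5.
value = rejection_sample(0.0, 1.0, 1.5) { |x| 6.0 * x * (1.0 - x) }
```

The tighter `y_max` is to the true peak of the density, the fewer candidates get rejected.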

The first method is the one used in the accepted answer to the SO question you linked: Generate Random Numbers with Probabilistic Distribution

1
jabbrwcky On

I recently came across croupier, a Ruby gem that aims to generate numbers according to a variety of statistical distributions.

I have yet to try it but it sounds quite promising.