How does calling srand more than once affect the quality of randomness?


This comment, which states:

srand(time(0)); I would put this line as the first line in main() instead of calling it multiple times (which will actually lead to less random numbers).

...and in which I've bolded the part I'm having an issue with... repeats the common advice to call srand only once in a program. Questions like srand() — why call only once? reiterate that, because time(0) returns the current time in seconds, multiple calls to srand within the same second will produce the same seed. A common workaround is to use milliseconds or nanoseconds instead.

However, I don't understand why this means that srand should or can only be called once, or how it leads to less random numbers.

cppreference:

Generally speaking, the pseudo-random number generator should only be seeded once, before any calls to rand(), at the start of the program. It should not be repeatedly seeded, or reseeded every time you wish to generate a new batch of pseudo-random numbers.

phoxis's answer to srand() — why call only once?:

Initializing once the initial state with the seed value will generate enough random numbers as you do not set the internal state with srand, thus making the numbers more probable to be random.

Perhaps they're simply using imprecise language, but none of the explanations seem to explain why calling srand multiple times is bad (aside from producing the same sequence of random numbers) or how it affects the "randomness" of the numbers. Can somebody clear this up for me?

5

There are 5 answers

3
vsoftco On

A pseudo-random number generator is an engine that produces numbers that look almost random. However, they are completely deterministic. In other words, given a seed x0, they are produced by repeated application of some injective function f to x0, so that f^m(x0) is quite different from f^{m-1}(x0) or f^{m+1}(x0), where f^m denotes f composed with itself m times. Put differently, f makes huge jumps, so each value is almost uncorrelated with the previous ones.

If you call srand(time(0)) multiple times in a second, you may get the same seed, as the clock is not as fast as you may imagine. The resulting sequence of random numbers will then be the same. This may be a (huge) problem, especially in cryptography applications. (In the latter case, people buy good number generators based on real-time physical processes such as temperature differences in atmospheric data or, more recently, on measuring quantum bits, e.g. superpositions of polarized photons, the latter being truly random as long as quantum mechanics is correct.)

There are also other serious issues with rand. One of them is that the distribution is biased. See e.g. http://eternallyconfuzzled.com/arts/jsw_art_rand.aspx for some discussion. I remember seeing something similar on SO, but I cannot find it now.

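If you plan to use it in crypto applications, just don't. For general-purpose use, prefer <random> and a serious engine like the Mersenne Twister, std::mt19937, seeded from std::random_device. Note that std::mt19937 is not cryptographically secure either, so actual cryptography needs a dedicated cryptographic generator.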

If you seed your random number generator twice using srand, and get different seeds, then you will get two sequences that will be quite different. This may be satisfactory for you. However, each sequence per se will not be a good random distribution due to the issues I mentioned above. On the other hand, if you reseed your RNG too often (for example, several times within the same second using time(0)), you will get the same seed, and THIS IS BAD, as you'll generate the same numbers over and over again.

PS: I've seen it claimed in the comments that pseudo-random numbers depend on a seed, and that this is bad. This is the definition of pseudo-random numbers, and it is not a bad thing, as it allows you to repeat numerical experiments with the same sequence. The idea is that each different seed should produce a sequence of (almost) random numbers, different from any previous sequence (technically, you shouldn't be able to distinguish them from a perfect random sequence).

0
aso On

Look at the source of srand() from this question: Rand Implementation

Also, example implementation from this thread:

static unsigned long int next = 1;

int rand(void) // RAND_MAX assumed to be 32767
{
    next = next * 1103515245 + 12345;
    return (unsigned int)(next/65536) % 32768;
}

void srand(unsigned int seed)
{
    next = seed;
}

As you can see, when you call srand(time(0)), the numbers you then get from rand() depend on the seed. The sequence repeats after some millions of numbers, and calling srand again switches to a different sequence. Either way, it must repeat after some number of cycles; only the order depends on the argument to srand. This is why C's rand isn't suitable for cryptography: you can predict the next number once you know the seed.

In a fast loop, calling srand on every iteration makes no sense: you will keep getting the same number until time() returns a different seed (one second is a very long time for a modern CPU).

There is no reason to call srand multiple times in a simple app. This kind of generator is weak by design, and if you want real random numbers you must use a different one (the best I know of is Blum Blum Shub).

For me, there are no "more" or "less" random numbers: the output always depends on the seed, and it repeats if you use the same seed. Using time is a good solution because it's easy to implement, but you should seed only once (at the beginning of main()), or only when you are sure that each srand(time(0)) call happens in a different second.

0
outlyer On

The seed determines which random numbers are generated, and in what order. For example, after srand(1), the first call to rand() will always return the same number, the second call will always return the same number, and so on.

In other words, if you re-seeded with the same seed before each rand() invocation, you'd generate the same random number every single time.

So successive seeding with time(0), during a single second, will mean all your random numbers after re-seeding are actually the same number.
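To make this concrete, here is a minimal sketch in C. The helper names (first_after_seed, draw_reseeding_each_time, draw_seeding_once) are made up for illustration; the only thing relied on is that srand with a given seed always restarts rand() at the same sequence.

```c
#include <stdlib.h>

/* Reseed with `seed`, then return the first value rand() produces. */
int first_after_seed(unsigned int seed)
{
    srand(seed);
    return rand();
}

/* Fill out[0..n-1], reseeding with the SAME seed before every draw.
   This mimics calling srand(time(0)) repeatedly within one second:
   every element ends up being the same number. */
void draw_reseeding_each_time(unsigned int seed, int *out, int n)
{
    for (int i = 0; i < n; ++i)
        out[i] = first_after_seed(seed);
}

/* Fill out[0..n-1] after seeding exactly once, so successive
   values walk through the generator's sequence. */
void draw_seeding_once(unsigned int seed, int *out, int n)
{
    srand(seed);
    for (int i = 0; i < n; ++i)
        out[i] = rand();
}
```

Calling draw_reseeding_each_time(42, buf, 5) fills buf with five copies of one value, whereas draw_seeding_once(42, buf, 5) gives the first five values of the sequence for seed 42 (and gives the same five again if you repeat the call with the same seed).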

0
Wintermute On

The numbers rand() returns are not actually random but "pseudo-random." What this means is that rand() generates a stream of numbers that look random for given values of "look" and "random" from an internal state that changes with each call.

As a rule, rand() is what is called a linear congruential generator, which means that it uses a mechanism roughly like this:

int state; // persistent state

int rand() {
  state = (a * state + b) % c;
  return state;
}

with carefully chosen constants a, b and c. c tends to be a power of two in practice because that makes it faster to calculate.

The "randomness" of this sequence depends in part on the persistence of the state. If the generator is constantly reseeded with predictable values, the return values of rand() become predictable in turn. How critical this is depends on the application, but it is not a purely academic consideration. Consider, for example, the case

a = 69069
b = 1
c = 2^32

which was used, for example, by old versions of glibc. Granted, I picked this example because the pattern is obvious, but the point holds in less obvious cases. Imagine this RNG were seeded with a sequence of incrementing numbers n, n+1, n+2 and so forth: the first value rand() returns after each reseeding would then be 69069 larger than the previous one (modulo 2^32). The pattern will be plainly visible. Starting with 0, we would get

1
69070
138139
207208
...

rising to a bit over 4 billion in steady increments. To make matters worse, some implementations actually returned the seed value on the first call to rand after a call to srand, in which case you'd just get your seeds back.
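The pattern can be reproduced with a small stand-alone sketch of that generator. This is not the libc rand(); lcg_seed, lcg_next and first_output_for_seed are hypothetical names, and the constants are the a = 69069, b = 1, c = 2^32 from the example, with the modulo coming from 32-bit unsigned wrap-around.

```c
#include <stdint.h>

/* Persistent generator state, as in the pseudocode above. */
static uint32_t lcg_state;

/* Equivalent of srand(): overwrite the state with the seed. */
void lcg_seed(uint32_t seed)
{
    lcg_state = seed;
}

/* Equivalent of rand(): state = (a * state + b) mod 2^32.
   The cast keeps the result reduced modulo 2^32. */
uint32_t lcg_next(void)
{
    lcg_state = (uint32_t)(UINT32_C(69069) * lcg_state + UINT32_C(1));
    return lcg_state;
}

/* Reseed, then take one value: what you get when the generator
   is reseeded before every single draw. */
uint32_t first_output_for_seed(uint32_t seed)
{
    lcg_seed(seed);
    return lcg_next();
}
```

Calling first_output_for_seed with 0, 1, 2, 3 returns 1, 69070, 138139, 207208, i.e. the evenly spaced values listed above, while seeding once and calling lcg_next() repeatedly does not show that spacing.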

0
Michael Krebs On

Most of the other answers are saying exactly what the question already stated: multiple calls to srand within the same second will produce the same seed. I believe the actual question is the same one that I had, which is: why would it be bad to call srand multiple times, even if it were with a different seed every time?

I can think of three reasons:

  1. People are not clear in their language and they actually mean srand should not be called multiple times with time() if you want different sequences of random numbers.

  2. It's cryptographically bad because every seed passed to srand is not itself a random number (well, it's probably not). Meaning, every srand is injecting a chance for someone to guess that seed and therefore predict your stream of pseudo-random numbers.

  3. It can mess up the distribution of pseudo-random numbers. @vsoftco's answer gave me a clue. If you call srand once, rand can be designed to give you a uniform distribution of pseudo-random numbers over its lifetime. If you call srand in the middle, however, you'll throw off that uniform distribution because it would "start over" with a new seed.

So, if you don't care about any of that, I would think it's okay to call srand more than once. In my case, I want to call it at the start of my program, but call it again after a fork(), because each child process apparently inherits a copy of the parent's generator state, and I want each child process to have its own sequence of pseudo-random numbers.
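A rough sketch of that fork() case, assuming a POSIX system. The names per_process_seed and fork_with_fresh_seed are hypothetical, and mixing the time with the process ID is just one simple way to get distinct seeds per child; it does nothing for cryptographic strength.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Combine the current time with the process ID. Multiplying by an
   odd constant spreads the PID over all bits of the seed, so two
   children forked in the same second still get different seeds
   (assuming a 32-bit unsigned int). */
unsigned int per_process_seed(time_t now, pid_t pid)
{
    return (unsigned int)now ^ ((unsigned int)pid * 2654435761u);
}

/* fork(), then reseed rand() in the child only. The parent keeps
   its existing sequence; the return value is fork()'s own. */
pid_t fork_with_fresh_seed(void)
{
    pid_t pid = fork();
    if (pid == 0)
        srand(per_process_seed(time(NULL), getpid()));
    return pid;
}
```

Without the reseed, every child would continue from the same generator state as the parent at the moment of the fork and so produce the same numbers.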


Going back to why it's cryptographically bad: a seed is easier to guess if it's something like time(), because a bad actor can try to guess the time at which the generator was seeded. That is why calling srand at the start of a program might be better: the program's start time could be harder for someone to guess than, say, the time a server request was initiated.

But I would surmise that even passing nanoseconds would be cryptographically dangerous if there's a chance the underlying clock doesn't have that kind of precision. Imagine, for example, that you call srand(get_time_in_ns()) and the underlying clock only returns time to the nearest millisecond.

Now, I'm no crypto expert in any way, but this leads me to wonder whether it would be safer to seed multiple srand calls from a different random source rather than from the current time. For example, can you call each srand with a number from Linux's /dev/random? (I imagine you might want to do that if you want a safer seed than the current time but still want to use rand() so you don't have the overhead of reading from the kernel every time.)
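For what it's worth, the mechanics of that idea might look like the sketch below. It assumes a Linux-like system and reads from /dev/urandom rather than /dev/random (my substitution, chosen because it does not block); seed_from_urandom is a hypothetical helper name. Note that this only makes the seed hard to guess: rand() itself stays a predictable generator once its outputs are observed.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read one unsigned int from the kernel's random device and pass it
   to srand(). Returns 0 on success, -1 if the device could not be
   opened or read (in which case the seed is left unchanged). */
int seed_from_urandom(void)
{
    unsigned int seed;
    FILE *f = fopen("/dev/urandom", "rb");
    if (f == NULL)
        return -1;

    size_t got = fread(&seed, sizeof seed, 1, f);
    fclose(f);
    if (got != 1)
        return -1;

    srand(seed);
    return 0;
}
```

A caller would check the return value and fall back to something like srand(time(0)) if it is -1.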