Random numbers

In a recent post I wrote about five of George Marsaglia’s random-number generators:

shr3 Three shifts, with a period of 2^32-1.
mwc Multiply With Carry, with a period of about 2^60.
kiss Keep It Simple Stupid, with a period of about 2^123
lfib4        Lagged Fibonacci, with a period of about 2^287
swb Subtract with borrow. Has a very long period: about 2^7578.

But how do they compare, speed-wise, to generators found in C, C++ and glibc? Here are five additional generators:

rand The standard C routine. Returns 32-bit signed, but always non-negative, random number. Period is undocumented; assume that period is at most 2^31. It is not guaranteed to be thread-safe, but the GNU libc version is. It uses a futex lock to protected its internal state.
rand_r The reentrant version of rand(). The state is a 32-bit unsigned integer, so the period can be at most 2^32.
lrand48 Defined by XOpen, now Open Group. Uses 48 bits of state. The return value is a 32-bit signed, but always non-negative, random number, just like for rand(). The period is not documented, but the state is 48 bits.
lrand48_r The reentrant version of lrand48.
mt19937 A Mersenne Twister generator, from the C++ standard library <random>. Its period is a whopping 2^19937.

For the purpose of the plot below, I have assumed that rand() and rand_r() have periods of 2^31, and that lrand48() and lrand48_r() have a periods of 2^48.

The Marsaglia posting compared different ways of instantiating the generator. As it turned out, there was not significant difference between them. The times used here are those measured using a file-static object.

It is not surprising that lrand48(), using 48-bit arithmetic, is slower than the traditional rand(). But why are the reentrant versions faster? This is strange, because of the overhead of passing state. The answer for rand() and rand_r() depends on the specifics of the glibc implementation. While the standard does not require rand() to be reentrant, in glibc its state is protected by a futex lock, adding time. This does not apply to lrand48() and lrand48_r() . They both call the internal routine __nrand48_r(), and nothing else, so exactly why lrand48_r() is consistently slightly faster than lrand48() is not obvious.

The generator mt19937, called a Mersenne Twister because 2^19927-1 is a Mersenne prime, is a lot more complicated, and probably much better, than any of the other. A good choice those who need a really godd RNG, and for everyone else who like the feeling of having a bit of a margin. It keeps a lot of state; an instance is exactly 5000 bytes is size with my setup.

Measuring execution time

It is strangely difficult to get consistent results when running the same test several times.On an multi-core CPU with no other load, the execution time should be virtually identical from one run to another at least when running for more than a few seconds. But that is not the case. The variation probably has to do with the fact that Linux, on which I run these test, places the stack and the heap at randomly addresses, that vary from run to run. This is a security feature, as it makes it more difficult to write malicious software.

To exemplify, here are all ten runs that the second plot above is based on.

Each group of ten bars represent ten timing runs for one generator. Within a group, the bars are ordered by decreasing execution time.

It is interesting to see how there are sometimes two levels, possibly depending on whether memory areas are close or distant.

In the averaged plots, the two shortest and the two longest have been dropped, and the remaining six values have been averaged. While the shortest values show how fast the generator actually is, or at least sometimes can be, the middle average shows what can be expected.

Download

There is no separate download for this post. Use the one from the previous post.

Final note

I am not an expert in statistics or random numbers, and can neither judge nor make recommendations when it comes to random-number generators. I have faith in George Marsaglia, the creator of the original diehard tests for RNGs. I also have faith in the the RNGs from GNU libc, because they get a lot of scrutiny. Finally I have faith in the C++ random library, and the well-researched mt19938 RNG.

You can reach me by email at “lars dash 7 at sdu dot se” or by telephone +46 705 189090

View source for the content of this page.