Random Numbers: Difference between revisions

Latest revision as of 01:19, 24 June 2026

Exploring the quirks of C's venerable rand() function — and why seeding with time(NULL) can produce "random" numbers that are anything but.

If you're looking for a proper, Python-centric treatment of random numbers, head over to Numerics/Random Numbers — that's the grown-up table. This page is about what happens when you reach for the C standard library's random facilities without thinking too hard.

The Program

Here's a minimal C++ program that generates five "random" numbers:

#include <stdlib.h>     // srand, rand
#include <time.h>       // time
#include <iostream>

using std::cout;
using std::endl;

int main()
{
    srand(time(NULL));  // seed with the current Unix timestamp
    for (int i = 0; i < 5; i++) {
        cout << rand() << endl;
    }
    return 0;
}

Yes, we're using the C headers (stdlib.h, time.h) instead of the C++ wrappers (cstdlib, ctime). For a quick-and-dirty demo this is fine — but in anything resembling production code you'd want <random> (we'll get to that).

Compilation

$ g++ myrand.cpp -o myrand

Output

Here's the program run four times, one second apart:

charles @ cronus [ 2016-06-21 - 15:00:12 - 41 ] /temp $ ./a.out
1575746672
805981500
1951708871
1705770619
2127589730

charles @ cronus [ 2016-06-21 - 15:00:13 - 42 ] /temp $ ./a.out
1575763479
1088456749
1426875297
543230630
1124215013

charles @ cronus [ 2016-06-21 - 15:00:14 - 43 ] /temp $ ./a.out
1575780286
1370931998
902041723
1528174288
120840296

charles @ cronus [ 2016-06-21 - 15:00:15 - 44 ] /temp $ ./a.out
1575797093
1653407247
377208149
365634299
1264949226

The First Number Is Suspiciously Predictable

Look at the first number from each run:

1575746672
1575763479
1575780286
1575797093

They all start with 15757... and they increase by roughly 16,800 each run. That's not a coincidence — it's literally the Unix timestamp marching forward one second at a time. In mid-2016, time(NULL) returns a value around 1.57 billion. Since we're feeding it directly into srand() as the seed, the first rand() output is heavily correlated with that seed value.

Here's the thing: rand() is a linear congruential generator (on glibc, anyway). Given similar seeds, the first few outputs will be similar. You're not getting randomness — you're getting a deterministic function of the current second. Anyone who knows approximately when you ran the program can guess your first "random" number with embarrassing precision.

I hope the people who design cryptosystems know about this! (They absolutely do — this is Crypto 101, and nobody's been using rand() for key material since approximately forever. But it's still a great cautionary tale.)

The One-Second Problem

time(NULL) only ticks once per second. If you run the program multiple times within the same second, every invocation gets the identical seed — and rand() is fully deterministic. Same seed, same sequence, every single time:

$ ./a.out && ./a.out && ./a.out && ./a.out && ./a.out
1584133365
27210049
2052760379
1418900798
1807295698

1584133365
27210049
2052760379
1418900798
1807295698

1584133365
27210049
2052760379
1418900798
1807295698

1584133365
27210049
2052760379
1418900798
1807295698

1584133365
27210049
2052760379
1418900798
1807295698

Five identical outputs because the shell launched them all in a fraction of a second. This is honestly kind of beautiful — a perfect demonstration of why timestamp seeding is a disaster for anything that actually needs unpredictability.

What You Should Actually Use

C++11 introduced the <random> header, and it's a massive upgrade over rand():

#include <random>
#include <iostream>

int main()
{
    std::random_device rd;                      // non-deterministic seed from /dev/urandom
    std::mt19937 gen(rd());                     // Mersenne Twister engine
    std::uniform_int_distribution<> dis(1, 6);  // fair dice roll, no modulo bias

    for (int i = 0; i < 5; i++) {
        std::cout << dis(gen) << std::endl;
    }
    return 0;
}

The big wins here:

std::random_device pulls entropy from the operating system (on Linux, it reads /dev/urandom). No more timestamp-as-seed nonsense.
Mersenne Twister (std::mt19937) is a well-studied PRNG with a period of 2¹⁹⁹³⁷−1 and good statistical properties. It passes most randomness test suites.
Distribution objects let you generate numbers in specific ranges without the modulo bias that rand() % N famously introduces.

rand() has been in the C standard library since 1989, quietly producing terrible random numbers in codebases everywhere. Studying its failure modes turns out to be a fantastic way to understand why randomness is genuinely hard — and why the language committee finally gave us proper tools in C++11.

@@ Line 1: / Line 1: @@
-Exploring some curious random number behavior in C++...
+Exploring the quirks of C's venerable <code>rand()</code> function — and why seeding with <code>time(NULL)</code> can produce "random" numbers that are anything but.
-=Simple Program=
+''If you're looking for a proper, Python-centric treatment of random numbers, head over to '''[[Numerics/Random Numbers]]''' — that's the grown-up table. This page is about what happens when you reach for the C standard library's random facilities without thinking too hard.''
-Put this code in a file called <code>myrand.cpp</code>
+== The Program ==
+Here's a minimal C++ program that generates five "random" numbers:
 <pre>
-#include <stdlib.h>     /* srand, rand */
+#include <stdlib.h>     // srand, rand
-#include <time.h>       /* time */
+#include <time.h>       // time
 #include <iostream>
@@ Line 13: / Line 15: @@
 using std::endl;
-int main ()
+int main()
 {
-  int mynum;
+    srand(time(NULL));  // seed with the current Unix timestamp
-  srand (time(NULL)); /* initialize random seed */
+    for (int i = 0; i < 5; i++) {
-  for( int i = 0; i < 5; i++) {
+        cout << rand() << endl;
-    cout << rand() << endl;
+    }
-  }
+    return 0;
-  return 0;
 }
 </pre>
-=Compilation=
+Yes, we're using the C headers (<code>stdlib.h</code>, <code>time.h</code>) instead of the C++ wrappers (<code>cstdlib</code>, <code>ctime</code>). For a quick-and-dirty demo this is fine — but in anything resembling production code you'd want <code>&lt;random&gt;</code> (we'll get to that).
-Compile with g++:
+== Compilation ==
 <pre>
-$ g++ myrand.cpp
+$ g++ myrand.cpp -o myrand
 </pre>
-=Output=
+== Output ==
-Run it:
-<pre>
-$ ./a.out
-</pre>
-Here is the output from my terminal when I run this program:
+Here's the program run four times, one second apart:
 <pre>
@@ Line 72: / Line 67: @@
 </pre>
-=Curious Behavior=
+== The First Number Is Suspiciously Predictable ==
-Notice how the first random number that is generated is always starting at around 15757XXXXX. Seeding the random number generator with the current time seems to generate very similar (read: NOT random) numbers for the first random number.
+Look at the first number from each run:
-I hope the people who design cryptosystems know about this!
+* 1575746672
+* 1575763479
+* 1575780286
+* 1575797093
-=More Curious Behavior=
+They all start with <code>15757...</code> and they increase by roughly 16,800 each run. That's not a coincidence — it's literally the Unix timestamp marching forward one second at a time. In mid-2016, <code>time(NULL)</code> returns a value around 1.57 billion. Since we're feeding it directly into <code>srand()</code> as the seed, the first <code>rand()</code> output is heavily correlated with that seed value.
-<code>time(NULL)</code> only generates a unique integer once per second.
+Here's the thing: <code>rand()</code> is a linear congruential generator (on glibc, anyway). Given similar seeds, the first few outputs will be similar. You're not getting randomness — you're getting a deterministic function of the current second. Anyone who knows approximately when you ran the program can guess your first "random" number with embarrassing precision.
-That means, if you run the program repeatedly with NO delays in-between, <code>time(NULL)</code> will always generate the same random number if it is still the same second, and so you will get the exact same numbers if you run the program over and over in a 1-second interval. For example:
+I hope the people who design cryptosystems know about this! (They absolutely do — this is Crypto 101, and nobody's been using <code>rand()</code> for key material since approximately forever. But it's still a great cautionary tale.)
+== The One-Second Problem ==
+<code>time(NULL)</code> only ticks once per second. If you run the program multiple times within the same second, every invocation gets the identical seed — and <code>rand()</code> is fully deterministic. Same seed, same sequence, every single time:
 <pre>
-charles @ cronus [ 2016-06-21 - 15:08:22 - 69 ] /temp $ ./a.out && ./a.out && ./a.out && ./a.out && ./a.out
+$ ./a.out && ./a.out && ./a.out && ./a.out && ./a.out
 1584133365
 27210049
@@ Line 117: / Line 119: @@
 </pre>
+Five identical outputs because the shell launched them all in a fraction of a second. This is honestly kind of beautiful — a perfect demonstration of why timestamp seeding is a disaster for anything that actually needs unpredictability.
+== What You Should Actually Use ==
+C++11 introduced the <code>&lt;random&gt;</code> header, and it's a massive upgrade over <code>rand()</code>:
+<pre>
+#include <random>
+#include <iostream>
+int main()
+{
+    std::random_device rd;                      // non-deterministic seed from /dev/urandom
+    std::mt19937 gen(rd());                     // Mersenne Twister engine
+    std::uniform_int_distribution<> dis(1, 6);  // fair dice roll, no modulo bias
+    for (int i = 0; i < 5; i++) {
+        std::cout << dis(gen) << std::endl;
+    }
+    return 0;
+}
+</pre>
+The big wins here:
+* '''<code>std::random_device</code>''' pulls entropy from the operating system (on Linux, it reads <code>/dev/urandom</code>). No more timestamp-as-seed nonsense.
+* '''Mersenne Twister''' (<code>std::mt19937</code>) is a well-studied PRNG with a period of 2<sup>19937</sup>−1 and good statistical properties. It passes most randomness test suites.
+* '''Distribution objects''' let you generate numbers in specific ranges without the modulo bias that <code>rand() % N</code> famously introduces.
+<code>rand()</code> has been in the C standard library since 1989, quietly producing terrible random numbers in codebases everywhere. Studying its failure modes turns out to be a fantastic way to understand why randomness is genuinely hard — and why the language committee finally gave us proper tools in C++11.
+== See Also ==
+* [[Numerics/Random Numbers]] — the Python way, with NumPy and all the good stuff
+* [https://en.cppreference.com/w/cpp/numeric/random cppreference: Random number generation]
 [[Category:C++]]
 [[Category:Programming]]