True random numbers are not easy to generate. That needs something external that is true random like certain things at the atomic level like radio-activity.
When we talk software then we usually mean pseudo random numbers, which is numbers that are fully deterministic but have characteristics similar to true random numbers. And due to that then in most cases they are just as good as true random numbers. And they are way easier to generate.
Unless otherwise stated then all the following examples are about uniform distributed random numbers - that is random numbers where each number has the same probability.
Throughout the article I will abbreviate rand number generator as RNG.
Examples will be shown using various programming languages like Java, C# and C. But the examples should be so simple that it is possible to understand the code even without knowing the specific language.
Most programming languages/runtime comes with a builtin RNG.
Examples of usage:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define N 10
#define K 100
int main()
{
int i, rani;
double ranx;
/* initialize random generator */
srand(time(NULL));
for(i = 0; i < N; i++)
{
/* generate random integer in the range from 0 (inclusive) to K-1 (inclusive) */
rani = rand() % K;
printf("%d\n", rani);
}
for(i = 0; i < N; i++)
{
/* generate random double in the range 0.0 (inclusive) to 1.0 (exclusive) */
ranx = rand() / (double)(RAND_MAX + 1);
printf("%f\n", ranx);
}
return 0;
}
import java.util.Random;
public class Rng {
public final static int N = 10;
public final static int K = 100;
public static void main(String[] args) {
// initialize random generator
Random rng = new Random();
for(int i = 0; i < N; i++) {
// generate random integer in the range from 0 (inclusive) to K-1 (inclusive)
int rani = rng.nextInt(K);
System.out.println(rani);
}
for(int i = 0; i < N; i++) {
// generate random double in the range 0.0 (inclusive) to 1.0 (exclusive)
double ranx = rng.nextDouble();
System.out.println(ranx);
}
}
}
using System;
public class Rng
{
public const int N = 10;
public const int K = 100;
public static void Main(string[] args)
{
// initialize random generator
Random rng = new Random();
for(int i = 0; i < N; i++)
{
// generate random integer in the range from 0 (inclusive) to K-1 (inclusive)
int rani = rng.Next(K);
Console.WriteLine(rani);
}
for(int i = 0; i < N; i++)
{
// generate random double in the range 0.0 (inclusive) to 1.0 (exclusive)
double ranx = rng.NextDouble();
Console.WriteLine(ranx);
}
}
}
Are the output from these builtin RNG's good meaning that the generated numbers have characteristics similar to true random numbers?
In most cases they are OK, not great, but not bad either.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define N 50000
#define K 10
int main()
{
int i, j, last, one[K], two[K][K], a[N];
srand(time(NULL));
for(i = 0; i < N; i++)
{
a[i] = rand() % K;
}
for(i = 0; i < K; i++) one[i] = 0;
for(i = 0; i < N; i++)
{
one[a[i]]++;
}
for(i = 0; i < K; i++) for(j = 0; j < K; j++) two[i][j] = 0;
last = a[0];
for(i = 1; i < N; i++)
{
two[last][a[i]]++;
last = a[i];
}
for(int i = 0; i < K; i++)
{
printf("%d\n", one[i]);
}
for(int i = 0; i < K; i++)
{
for(int j = 0; j < K; j++)
{
printf(" %d", two[i][j]);
}
printf("\n");
}
return 0;
}
import java.util.Random;
public class GoodRng {
public final static int N = 50000;
public final static int K = 10;
public static void main(String[] args) {
Random rng = new Random();
int[] one = new int[K];
int[][] two = new int[K][K];
int[] a = new int[N];
for(int i = 0; i < N; i++) {
a[i] = rng.nextInt(K);
}
for(int i = 0; i < N; i++) {
one[a[i]]++;
}
int last = a[0];
for(int i = 1; i < N; i++) {
two[last][a[i]]++;
last = a[i];
}
for(int i = 0; i < K; i++) {
System.out.println(one[i]);
}
for(int i = 0; i < K; i++) {
for(int j = 0; j < K; j++) {
System.out.print(" " + two[i][j]);
}
System.out.println();
}
}
}
using System;
public class Rng
{
public const int N = 50000;
public const int K = 10;
public static void Main(string[] args)
{
Random rng = new Random();
int[] one = new int[K];
int[,] two = new int[K,K];
int[] a = new int[N];
for(int i = 0; i < N; i++)
{
a[i] = rng.Next(K);
}
for(int i = 0; i < N; i++)
{
one[a[i]]++;
}
int last = a[0];
for(int i = 1; i < N; i++)
{
two[last,a[i]]++;
last = a[i];
}
for(int i = 0; i < K; i++)
{
Console.WriteLine(one[i]);
}
for(int i = 0; i < K; i++)
{
for(int j = 0; j < K; j++)
{
Console.Write(" " + two[i,j]);
}
Console.WriteLine();
}
}
}
Example output:
4945 5018 5085 4980 4956 4992 4846 5030 5045 5103 436 505 513 521 455 497 503 483 521 510 545 508 525 493 514 480 431 464 509 549 488 504 544 486 504 503 484 507 536 529 500 488 524 509 494 490 492 476 515 492 506 517 523 464 463 521 503 506 462 491 506 512 448 498 517 514 493 486 510 508 478 468 486 493 471 469 459 534 516 472 481 515 505 471 507 519 503 542 479 508 505 492 507 531 511 492 476 502 481 548 500 509 510 514 520 507 501 530 516 496
This output will be shown many times. It is a very simple test for randomness. It tests the distribution of the generated numbers and the distribution of the generated numbers given the previous number. Both should have an equal distribution.
Much better tests for randomness exists. Including the well-known Diehard test af Marsaglia.
Even though the builtin RNG may be OK, then it is very easy for the application code to use it incorrectly resulting in very non-random results.
Below are some classic mistakes.
Reinitializing the RNG for each number. Some beginners think that if initializing RNG once results in OK random numbers then initializing RNG for every number will produce even better random numbers. Not so.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define N 50000
#define K 10
int main()
{
int i, j, last, one[K], two[K][K], a[N];
for(i = 0; i < N; i++)
{
srand(time(NULL)); /* <---- initialize for every number */
a[i] = rand() % K;
}
for(i = 0; i < K; i++) one[i] = 0;
for(i = 0; i < N; i++)
{
one[a[i]]++;
}
for(i = 0; i < K; i++) for(j = 0; j < K; j++) two[i][j] = 0;
last = a[0];
for(i = 1; i < N; i++)
{
two[last][a[i]]++;
last = a[i];
}
for(int i = 0; i < K; i++)
{
printf("%d\n", one[i]);
}
for(int i = 0; i < K; i++)
{
for(int j = 0; j < K; j++)
{
printf(" %d", two[i][j]);
}
printf("\n");
}
return 0;
}
import java.util.Random;
// always bad code
// produces bad results in Java 1.0 - 1.4
// produces ok results in Java 1.5-
public class BadRng1 {
public final static int N = 50000;
public final static int K = 10;
public static void main(String[] args) {
int[] one = new int[K];
int[][] two = new int[K][K];
int[] a = new int[N];
for(int i = 0; i < N; i++) {
Random rng = new Random(); // <---- initialize for every number
a[i] = rng.nextInt(K);
}
for(int i = 0; i < N; i++) {
one[a[i]]++;
}
int last = a[0];
for(int i = 1; i < N; i++) {
two[last][a[i]]++;
last = a[i];
}
for(int i = 0; i < K; i++) {
System.out.println(one[i]);
}
for(int i = 0; i < K; i++) {
for(int j = 0; j < K; j++) {
System.out.print(" " + two[i][j]);
}
System.out.println();
}
}
}
using System;
public class BadRng1
{
public const int N = 50000;
public const int K = 10;
public static void Main(string[] args) {
int[] one = new int[K];
int[,] two = new int[K,K];
int[] a = new int[N];
for(int i = 0; i < N; i++)
{
Random rng = new Random(); // <---- initialize for every number
a[i] = rng.Next(K);
}
for(int i = 0; i < N; i++)
{
one[a[i]]++;
}
int last = a[0];
for(int i = 1; i < N; i++)
{
two[last,a[i]]++;
last = a[i];
}
for(int i = 0; i < K; i++)
{
Console.WriteLine(one[i]);
}
for(int i = 0; i < K; i++)
{
for(int j = 0; j < K; j++)
{
Console.Write(" " + two[i,j]);
}
Console.WriteLine();
}
}
}
Example output:
0 0 16282 170 33548 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 16281 0 1 0 0 0 0 0 0 0 0 169 1 0 0 0 0 0 0 0 1 0 33546 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
The numbers are not random at all. The problem is that initialization often uses time as basis for the random numbers. So all initializations within a given time interval will result in the same numbers being generated.
The right way is to just initialize once.
Often it is done as a class or instance field in OOP languages.
In Java:
public static final Random rng = new Random();
or:
public final Random rng = new Random();
Check documentation for RNG regarding thread safety when using in multi-threaded context.
Bad scaling. Some beginners think that if you have a RNG generating OK random numbers then scaling the numbers will always result in OK random numbers. Not so.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define N 50000
#define K 10
int main()
{
int i, j, last, one[K], two[K][K], a[N];
srand(time(NULL));
for(i = 0; i < N; i++)
{
a[i] = (rand() % 15) % K; /* scale random number 0..14 to 0..9 */
}
for(i = 0; i < K; i++) one[i] = 0;
for(i = 0; i < N; i++)
{
one[a[i]]++;
}
for(i = 0; i < K; i++) for(j = 0; j < K; j++) two[i][j] = 0;
last = a[0];
for(i = 1; i < N; i++)
{
two[last][a[i]]++;
last = a[i];
}
for(int i = 0; i < K; i++)
{
printf("%d\n", one[i]);
}
for(int i = 0; i < K; i++)
{
for(int j = 0; j < K; j++)
{
printf(" %d", two[i][j]);
}
printf("\n");
}
return 0;
}
import java.util.Random;
public class BadRng2 {
public final static int N = 50000;
public final static int K = 10;
public static void main(String[] args) {
Random rng = new Random();
int[] one = new int[K];
int[][] two = new int[K][K];
int[] a = new int[N];
for(int i = 0; i < N; i++) {
a[i] = rng.nextInt(15) % K; // scale random number 0..14 to 0..9
}
for(int i = 0; i < N; i++) {
one[a[i]]++;
}
int last = a[0];
for(int i = 1; i < N; i++) {
two[last][a[i]]++;
last = a[i];
}
for(int i = 0; i < K; i++) {
System.out.println(one[i]);
}
for(int i = 0; i < K; i++) {
for(int j = 0; j < K; j++) {
System.out.print(" " + two[i][j]);
}
System.out.println();
}
}
}
using System;
public class BadRng2
{
public const int N = 50000;
public const int K = 10;
public static void Main(string[] args)
{
Random rng = new Random();
int[] one = new int[K];
int[,] two = new int[K,K];
int[] a = new int[N];
for(int i = 0; i < N; i++)
{
a[i] = rng.Next(15) % K; // scale random number 0..14 to 0..9
}
for(int i = 0; i < N; i++)
{
one[a[i]]++;
}
int last = a[0];
for(int i = 1; i < N; i++)
{
two[last,a[i]]++;
last = a[i];
}
for(int i = 0; i < K; i++)
{
Console.WriteLine(one[i]);
}
for(int i = 0; i < K; i++)
{
for(int j = 0; j < K; j++)
{
Console.Write(" " + two[i,j]);
}
Console.WriteLine();
}
}
}
It is rarely as obvious as here. Usually the 15 and the K are in different code. But the problem still exists.
Example output:
6686 6678 6700 6610 6558 3274 3408 3415 3316 3355 913 878 911 864 888 470 449 426 457 430 833 918 877 911 920 450 421 493 429 426 908 873 890 918 882 414 488 443 427 457 882 909 868 851 853 420 454 452 453 467 894 884 925 861 888 430 431 439 412 394 453 414 456 423 415 202 226 204 227 254 462 470 444 461 419 210 237 253 220 232 493 461 422 434 449 228 252 226 217 233 408 425 480 447 399 211 236 237 235 238 440 446 427 440 445 239 213 242 239 224
The numbers are not random at all. The problem is that 10 is not a factor in 15 and that 10 is relative close to 15. This combination result in very poor distribution.
Note that the following modification does not solve the problem:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define N 50000
#define K 10
int main()
{
int i, j, last, one[K], two[K][K], a[N];
srand(time(NULL));
for(i = 0; i < N; i++)
{
a[i] = (int)(((rand() % 15) / 15.0) * K); /* scale random number 0..14 to 0..9 */
}
for(i = 0; i < K; i++) one[i] = 0;
for(i = 0; i < N; i++)
{
one[a[i]]++;
}
for(i = 0; i < K; i++) for(j = 0; j < K; j++) two[i][j] = 0;
last = a[0];
for(i = 1; i < N; i++)
{
two[last][a[i]]++;
last = a[i];
}
for(int i = 0; i < K; i++)
{
printf("%d\n", one[i]);
}
for(int i = 0; i < K; i++)
{
for(int j = 0; j < K; j++)
{
printf(" %d", two[i][j]);
}
printf("\n");
}
return 0;
}
import java.util.Random;
public class BadRng2NotFix {
public final static int N = 50000;
public final static int K = 10;
public static void main(String[] args) {
Random rng = new Random();
int[] one = new int[K];
int[][] two = new int[K][K];
int[] a = new int[N];
for(int i = 0; i < N; i++) {
a[i] = (int)(rng.nextInt(15) / 15.0 * K); // scale random number 0..14 to 0.0..1.0 to 0..9
}
for(int i = 0; i < N; i++) {
one[a[i]]++;
}
int last = a[0];
for(int i = 1; i < N; i++) {
two[last][a[i]]++;
last = a[i];
}
for(int i = 0; i < K; i++) {
System.out.println(one[i]);
}
for(int i = 0; i < K; i++) {
for(int j = 0; j < K; j++) {
System.out.print(" " + two[i][j]);
}
System.out.println();
}
}
}
using System;
public class BadRng2NotFix
{
public const int N = 50000;
public const int K = 10;
public static void Main(string[] args)
{
Random rng = new Random();
int[] one = new int[K];
int[,] two = new int[K,K];
int[] a = new int[N];
for(int i = 0; i < N; i++)
{
a[i] = (int)((rng.Next(15) / 15.0) * K); // scale random number 0..14 to 0.0..1.0 to to 0..9
}
for(int i = 0; i < N; i++)
{
one[a[i]]++;
}
int last = a[0];
for(int i = 1; i < N; i++)
{
two[last,a[i]]++;
last = a[i];
}
for(int i = 0; i < K; i++)
{
Console.WriteLine(one[i]);
}
for(int i = 0; i < K; i++)
{
for(int j = 0; j < K; j++)
{
Console.Write(" " + two[i,j]);
}
Console.WriteLine();
}
}
}
Example output:
6685 3212 6714 3323 6680 3280 6736 3367 6653 3350 910 421 910 465 877 389 884 455 925 449 428 188 418 225 410 207 452 236 453 195 856 413 943 449 917 446 889 431 909 461 415 227 433 228 458 232 457 227 420 226 877 435 920 427 898 433 902 433 899 456 437 222 394 200 484 235 427 229 438 214 877 443 895 433 916 440 898 462 902 469 455 206 428 207 439 258 461 219 465 229 948 433 893 460 866 423 928 461 801 440 482 224 480 229 415 217 438 214 440 211
It just moves the problem around a little bit.
The easy solution is to never scale down more than once.
If it is necessary to scale down more than once then it can be done by generating a new number every time you get a number >= (MAX_RAND / K) * K.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define N 50000
#define K 10
int main()
{
int i, j, tmp, last, one[K], two[K][K], a[N];
srand(time(NULL));
for(i = 0; i < N; i++)
{
do
{
tmp = rand() % 15;
}
while(tmp >= 10); /* retry until random number is in 0..9 (or M*10+9) */
a[i] = tmp % K; /* scale random number 0..9 (or M*10+9) to 0..9 */
}
for(i = 0; i < K; i++) one[i] = 0;
for(i = 0; i < N; i++)
{
one[a[i]]++;
}
for(i = 0; i < K; i++) for(j = 0; j < K; j++) two[i][j] = 0;
last = a[0];
for(i = 1; i < N; i++)
{
two[last][a[i]]++;
last = a[i];
}
for(int i = 0; i < K; i++)
{
printf("%d\n", one[i]);
}
for(int i = 0; i < K; i++)
{
for(int j = 0; j < K; j++)
{
printf(" %d", two[i][j]);
}
printf("\n");
}
return 0;
}
import java.util.Random;
public class BadRng2GoodFix {
public final static int N = 50000;
public final static int K = 10;
public static void main(String[] args) {
Random rng = new Random();
int[] one = new int[K];
int[][] two = new int[K][K];
int[] a = new int[N];
for(int i = 0; i < N; i++) {
a[i] = rng.nextInt(15) %K; // scale random number 0..14 to 0..9
}
for(int i = 0; i < N; i++) {
int tmp;
do {
tmp = rng.nextInt(15);
} while(tmp >= 10); // retry until random number is in 0..9 (or M*10+9)
a[i] = tmp % K; // scale random number 0..9 (or M*10+9) to 0..9
}
for(int i = 0; i < N; i++) {
one[a[i]]++;
}
int last = a[0];
for(int i = 1; i < N; i++) {
two[last][a[i]]++;
last = a[i];
}
for(int i = 0; i < K; i++) {
System.out.println(one[i]);
}
for(int i = 0; i < K; i++) {
for(int j = 0; j < K; j++) {
System.out.print(" " + two[i][j]);
}
System.out.println();
}
}
}
using System;
public class BadRng2GoodFix
{
public const int N = 50000;
public const int K = 10;
public static void Main(string[] args)
{
Random rng = new Random();
int[] one = new int[K];
int[,] two = new int[K,K];
int[] a = new int[N];
for(int i = 0; i < N; i++)
{
int tmp;
do
{
tmp = rng.Next(15);
}
while(tmp >= 10); // retry until random number is in 0..9 (or M*10+9)
a[i] = tmp % K; // scale random number 0..9 (or M*10+9) to 0..9
}
for(int i = 0; i < N; i++)
{
one[a[i]]++;
}
int last = a[0];
for(int i = 1; i < N; i++)
{
two[last,a[i]]++;
last = a[i];
}
for(int i = 0; i < K; i++)
{
Console.WriteLine(one[i]);
}
for(int i = 0; i < K; i++)
{
for(int j = 0; j < K; j++)
{
Console.Write(" " + two[i,j]);
}
Console.WriteLine();
}
}
}
Example output:
5060 5078 5143 4951 4992 5025 4952 4981 4909 4909 537 540 524 468 519 526 499 508 458 481 522 495 559 514 529 440 507 484 521 507 522 522 503 491 469 529 500 526 560 521 501 533 502 524 505 478 476 466 478 488 476 533 527 483 537 501 477 500 477 481 504 490 504 480 509 526 505 496 524 487 497 510 504 495 468 509 509 515 456 488 522 494 508 506 503 518 508 469 486 467 493 467 485 504 486 493 472 512 465 532 486 494 527 486 467 504 499 505 484 457
Bad combination of random numbers. Some beginners think that combining two RNG's generating OK random numbers will produce even better random numbers. Not so.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define N 50000
#define K 10
int main()
{
int i, j, last, one[K], two[K][K], a[N];
srand(time(NULL));
for(i = 0; i < N; i++)
{
a[i] = (rand() % K + rand() % K) / 2; /* average of two random numbers 0..9 */
}
for(i = 0; i < K; i++) one[i] = 0;
for(i = 0; i < N; i++)
{
one[a[i]]++;
}
for(i = 0; i < K; i++) for(j = 0; j < K; j++) two[i][j] = 0;
last = a[0];
for(i = 1; i < N; i++)
{
two[last][a[i]]++;
last = a[i];
}
for(int i = 0; i < K; i++)
{
printf("%d\n", one[i]);
}
for(int i = 0; i < K; i++)
{
for(int j = 0; j < K; j++)
{
printf(" %d", two[i][j]);
}
printf("\n");
}
return 0;
}
import java.util.Random;
public class BadRng3 {
public final static int N = 50000;
public final static int K = 10;
public static void main(String[] args) {
Random rng = new Random();
int[] one = new int[K];
int[][] two = new int[K][K];
int[] a = new int[N];
for(int i = 0; i < N; i++) {
a[i] = (rng.nextInt(K) + rng.nextInt(K)) / 2; // average of two random numbers 0..9
}
for(int i = 0; i < N; i++) {
one[a[i]]++;
}
int last = a[0];
for(int i = 1; i < N; i++) {
two[last][a[i]]++;
last = a[i];
}
for(int i = 0; i < K; i++) {
System.out.println(one[i]);
}
for(int i = 0; i < K; i++) {
for(int j = 0; j < K; j++) {
System.out.print(" " + two[i][j]);
}
System.out.println();
}
}
}
using System;
public class BadRng3
{
public const int N = 50000;
public const int K = 10;
public static void Main(string[] args)
{
Random rng = new Random();
int[] one = new int[K];
int[,] two = new int[K,K];
int[] a = new int[N];
for(int i = 0; i < N; i++)
{
a[i] = (rng.Next(K) + rng.Next(K)) / 2; // average of two random numbers 0..9
}
for(int i = 0; i < N; i++)
{
one[a[i]]++;
}
int last = a[0];
for(int i = 1; i < N; i++)
{
two[last,a[i]]++;
last = a[i];
}
for(int i = 0; i < K; i++)
{
Console.WriteLine(one[i]);
}
for(int i = 0; i < K; i++)
{
for(int j = 0; j < K; j++)
{
Console.Write(" " + two[i,j]);
}
Console.WriteLine();
}
}
}
Example output:
1532 3568 5473 7410 9506 8536 6476 4494 2499 506 35 116 178 225 310 253 208 124 67 16 101 256 386 552 645 612 468 343 170 35 173 407 614 814 1064 907 660 497 283 54 221 497 783 1042 1537 1266 975 642 365 81 292 664 1077 1404 1776 1684 1185 852 494 78 267 648 908 1298 1575 1463 1084 765 432 96 217 452 693 989 1202 1057 891 582 325 68 140 315 512 648 838 766 612 406 215 42 75 183 261 370 461 430 329 232 126 32 11 30 61 68 98 98 64 51 21 4
The numbers are not random at all.
It requires some good math understanding to combine RNG's. So for beginners the advice is simply: don't do it. Later in this article some combinations that actually work will be shown.
Use of low bits. When taking modulus of powers of two extra care must be taken. Taking modulus of powers of two is equivalent of taking a number of low bits. Some common RNG's generate numbers that are not very random in low bits.
Example in C:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define N 50000
#define K 4
static unsigned long int seed;
void mysrand(unsigned long int ss)
{
seed = ss;
return;
}
unsigned long int myrand()
{
/*
seed = (65539UL * seed) % 2147483648UL;
*/
unsigned long int help1 = 2147483648UL % 65539UL;
unsigned long int help2 = 2147483648UL / 65539UL;
long int tmp = 65539UL * (seed % help2) - help1 * (seed / help2);
if(tmp >= 0)
seed = tmp;
else
seed = tmp + 2147483648UL;
return seed;
}
int main()
{
int i, j, last, one[K], two[K][K], a[N];
mysrand(time(NULL));
for(i = 0; i < N; i++)
{
a[i] = myrand() % K;
}
for(i = 0; i < K; i++) one[i] = 0;
for(i = 0; i < N; i++)
{
one[a[i]]++;
}
for(i = 0; i < K; i++) for(j = 0; j < K; j++) two[i][j] = 0;
last = a[0];
for(i = 1; i < N; i++)
{
two[last][a[i]]++;
last = a[i];
}
for(i = 0; i < K; i++) printf("%d\n",one[i]);
for(i = 0; i < K; i++) {
for(j = 0; j < K; j++)
{
printf(" %d",two[i][j]);
}
printf("\n");
}
return 0;
}
Don't think so much about the algorithm. It is not good, but it was commonly used 40 years ago.
Output:
0 25000 0 25000 0 0 0 0 0 0 0 24999 0 0 0 0 0 25000 0 0
The numbers are not random at all.
A solution is to use high bits instead of low bits:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define N 50000
#define K 4
static unsigned long int seed;
void mysrand(unsigned long int ss)
{
seed = ss;
return;
}
unsigned long int myrand()
{
/*
seed = (65539UL * seed) % 2147483648UL;
*/
unsigned long int help1 = 2147483648UL % 65539UL;
unsigned long int help2 = 2147483648UL / 65539UL;
long int tmp = 65539UL * (seed % help2) - help1 * (seed / help2);
if(tmp >= 0)
seed = tmp;
else
seed = tmp + 2147483648UL;
return seed;
}
int main()
{
int i, j, last, one[K], two[K][K], a[N];
mysrand(time(NULL));
for(i = 0; i < N; i++)
{
a[i] = (myrand() / 2147483648.0) * K;
}
for(i = 0; i < K; i++) one[i] = 0;
for(i = 0; i < N; i++)
{
one[a[i]]++;
}
for(i = 0; i < K; i++) for(j = 0; j < K; j++) two[i][j] = 0;
last = a[0];
for(i = 1; i < N; i++)
{
two[last][a[i]]++;
last = a[i];
}
for(i = 0; i < K; i++) printf("%d\n",one[i]);
for(i = 0; i < K; i++)
{
for(j = 0; j < K; j++)
{
printf(" %d",two[i][j]);
}
printf("\n");
}
return 0;
}
Output:
12628 12373 12597 12402 3171 3198 3192 3067 3079 3067 3055 3172 3191 3119 3213 3074 3187 2989 3137 3088
Of course one can also use an algorithm without this flaw:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define N 50000
#define K 4
int main()
{
int i, j, last, one[K], two[K][K], a[N];
srand(time(NULL));
for(i = 0;i < N; i++)
{
a[i] = rand() % K;
}
for(i = 0; i < K; i++) one[i] = 0;
for(i = 0; i < N; i++)
{
one[a[i]]++;
}
for(i = 0; i < K; i++) for(j = 0; j < K; j++) two[i][j] = 0;
last = a[0];
for(i = 1; i < N; i++)
{
two[last][a[i]]++;
last = a[i];
}
for(i = 0; i < K; i++)
{
printf("%d\n",one[i]);
}
for(i = 0; i < K; i++)
{
for(j = 0; j < K; j++)
{
printf(" %d",two[i][j]);
}
printf("\n");
}
return 0;
}
Output:
12671 12293 12526 12510 3233 3098 3119 3221 3022 3061 3139 3070 3270 3067 3105 3084 3146 3066 3163 3135
But the first workaround is actually more safe unless one really understand the characteristics of the alternative algorithm being used.
So far we have focused on a good random distribution of generated numbers. That is always important.
But it may not be sufficient for the RNG to be good in a specific context.
Sometimes the numbers must also be unpredictable.
To illustrate the difference let us look at some usage of RNG for Poker.
First let us try and find the probability of a flush (including straight flush and royal straight flush) - so 5 cards of same color.
Java code:
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
public class Poker1 {
private static Random rng = new Random(System.currentTimeMillis());
private List<Integer> deck;
public Poker1() {
deck = new ArrayList<Integer>();
for(int i = 0; i < 52; i++) {
deck.add(i);
}
}
public int takeCard() {
return deck.remove(rng.nextInt(deck.size()));
}
public int[] takeHand() {
int[] res = new int[5];
for(int i = 0; i < 5; i++) {
res[i] = takeCard();
}
return res;
}
public static int getColour(int card) {
return (card / 13);
}
public static int getValue(int card) {
return (card % 13);
}
public static boolean isFlush(int[] hand) {
if((getColour(hand[0]) == getColour(hand[1])) &&
(getColour(hand[1]) == getColour(hand[2])) &&
(getColour(hand[2]) == getColour(hand[3])) &&
(getColour(hand[3]) == getColour(hand[4]))) {
return true;
} else {
return false;
}
}
private static final int REP = 1000000;
private static final int PLAYERS = 4;
public static void main(String[] args) {
int nflush = 0;
int[] hand = new int[5];
for(int i = 0; i < REP; i++) {
Poker1 game = new Poker1();
for(int j = 0; j < PLAYERS; j++) {
if(Poker1.isFlush(game.takeHand())) {
nflush++;
}
}
}
System.out.println(PLAYERS * REP / (double)nflush);
}
}
The program usually print a number in the 500 to 510 range, which is fine as the true probability is 1 out of 505.
The distribution of the generated numbers are fine.
So are we ready to create a poker game?
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
public class Poker2 {
private static Random rng = new Random(System.currentTimeMillis());
private List<Integer> deck;
public Poker2() {
deck = new ArrayList<Integer>();
for(int i = 0; i < 52; i++) {
deck.add(i);
}
}
public int takeCard() {
return deck.remove(rng.nextInt(deck.size()));
}
public int[] takeHand() {
int[] res = new int[5];
for(int i = 0; i < 5; i++) {
res[i] = takeCard();
}
return res;
}
public static int getColour(int card) {
return (card / 13);
}
public static int getValue(int card) {
return (card % 13);
}
private static final int PLAYERS = 4;
public static void main(String[] args) {
int[][] hand = new int[PLAYERS][5];
Poker2 game = new Poker2();
for(int i = 0; i < PLAYERS; i++) {
hand[i] = game.takeHand();
for(int j = 0; j < 5; j++) {
System.out.print(" " + hand[i][j]);
}
System.out.println();
}
}
}
Example output:
8 36 4 18 16 50 35 29 27 39 28 42 48 31 1 24 9 5 43 2
Those hands will pass all statistical tests for randomness.
But check this program:
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;
public class Poker3 {
private Random rng;
private List<Integer> deck;
public Poker3(long seed) {
rng = new Random(seed);
deck = new ArrayList<Integer>();
for(int i = 0; i < 52; i++) {
deck.add(i);
}
}
public int takeCard() {
return deck.remove(rng.nextInt(deck.size()));
}
public int[] takeHand() {
int[] res = new int[5];
for(int i = 0; i < 5; i++) {
res[i] = takeCard();
}
return res;
}
public static int getColour(int card) {
return (card / 13);
}
public static int getValue(int card) {
return (card % 13);
}
public static boolean isSameHand(int[] h1, int[] h2) {
int[] h1x = h1.clone();
int[] h2x = h2.clone();
Arrays.sort(h1x);
Arrays.sort(h2x);
if((h1x[0] == h2x[0]) &&
(h1x[1] == h2x[1]) &&
(h1x[2] == h2x[2]) &&
(h1x[3] == h2x[3]) &&
(h1x[4] == h2x[4])) {
return true;
} else {
return false;
}
}
private static final int PLAYERS = 4;
public static void main(String[] args) {
int[] myhand = { 8, 36, 4, 18, 16 };
int[][] hand;
long seed = System.currentTimeMillis();
while(true) {
hand = new int[PLAYERS][5];
Poker3 game = new Poker3(seed);
for(int i = 0; i < PLAYERS; i++) {
hand[i] = game.takeHand();
}
if(isSameHand(myhand, hand[0])) {
break;
}
seed--;
}
for(int i = 0; i < PLAYERS; i++) {
for(int j = 0; j < 5; j++) {
System.out.print(" " + hand[i][j]);
}
System.out.println();
}
}
}
Output:
8 36 4 18 16 50 35 29 27 39 28 42 48 31 1 24 9 5 43 2
Based on ones own hand and knowledge about the algorithm, then it only takes seconds to calculate the other hands.
Not smart!
The core of the problem is that even though the algorithm and the initialization is good enough to provide random numbers with good distribution, then the numbers are not unpredictable.
Good distribution and unpredictability are two very diffreent properties.
To generate unpredictable random numbers two things are needed:
There are several ways of getting an unpredictable seed.
Most large runtime libraries has socalled cryptographically secure random number generators that can generate unpredictable random numbers:
On Linux it is possible to read from /dev/random and /dev/urandom.
By far the most common family of algorithms for RNG is LCG (Linear Congruential Generator):
There is also a specialization with b=0 called MLCG (Multiplicative Linear Congruential Generator):
And a generalization with older input than i-1 called MRG (Multiple Recursive Generator):
When one neeed a random integer number 0 .. N-1, then one can just use xi if N=c.
But often one will want to use xi>>m to drop the m low bits as it is a known weakness in LCG's that low bits are less random than highbits. See example in previous section.
When one need a random decimal number 0.0 .. 1.0 where 0.0 is inclusive and 1.0 is exclusive, then one can use an integer random number and xi*1.0/N.
A key concept is cycle. All LCG's will start to repeat sequence of generated numbers after L numbers has been generated. L is called the cycle of the algorithm
The following applies:
An alternative way to look at the cycle concept is to imagine all possible numbers put in a circle. The algorithm determines the order of the numbers. The initializion seed determines where in the cycle one starts. But when started numbers are simply returned by moving one place forward in the circle.
It is obvious that for LCG the cycle can never be be larger than c.
It has been proven that if:
then the cycle is c.
Some of the big names in RNG theory are Knuth, Marsaglia and L'Ecuyer.
Before we go to specific algorithms we need to look at a little cute rewrite that is often necesarry in LCG to avoid integer overflow.
Demo in C:
#include <stdio.h>
static long int seed1;
void mysrand1(long int ss)
{
seed1 = ss;
return;
}
long int myrand1()
{
long long int tmp = seed1;
tmp = (3125LL * tmp) % 67108864LL;
seed1 = tmp;
return seed1;
}
static long int seed2;
void mysrand2(long int ss)
{
seed2 = ss;
return;
}
long int myrand2()
{
seed2 = (3125L * seed2) % 67108864L;
return seed2;
}
static long int seed3;
void mysrand3(long int ss)
{
seed3 = ss;
return;
}
long int myrand3()
{
long int help1 = 67108864L % 3125L;
long int help2 = 67108864L / 3125L;
long int tmp = 3125L * (seed3 % help2) - help1 * (seed3 / help2);
if(tmp >= 0)
seed3 = tmp;
else
seed3 = tmp + 67108864L;
return seed3;
}
int main()
{
int i;
mysrand1(1234567);
mysrand2(1234567);
mysrand3(1234567);
for(i = 0; i < 20; i++) {
printf("%ld %ld %ld\n", myrand1(), myrand2(), myrand3());
}
return 0;
}
Output:
32816627 -34292237 32816627 9615183 9615183 9615183 49784667 -17324197 49784667 18737623 18737623 18737623 36142467 -30966397 36142467 991263 991263 991263 10689131 -56419733 10689131 50428967 -16679897 50428967 18909203 -48199661 18909203 35459055 -31649809 35459055 12812411 -54296453 12812411 41901431 41901431 41901431 12578211 12578211 12578211 48223935 48223935 48223935 40397195 40397195 40397195 9461191 9461191 9461191 38321715 -28787149 38321715 33145999 33145999 33145999 32269723 32269723 32269723 45370647 45370647 45370647
We see that the rewrite solves the integer overflow issue similar to using a longer integer.
Let us see 4 of the known good algorithms in object oriented Java and C#:
RNG.java
public abstract class RNG {
public abstract int getInt();
public abstract int getMaxInt();
public double getDouble() {
return (getInt() / (double)getMaxInt());
}
}
LCG.java
public class LCG extends RNG {
private long a;
private long b;
private long c;
private long seed;
public LCG(int a, int b, int c, int seed) {
this.a = a;
this.b = b;
this.c = c;
this.seed = seed;
}
public int getInt() {
seed = (a * seed + b) % c;
return (int)seed;
}
public int getMaxInt() {
return (int)c;
}
}
MRG.java
public class MRG extends RNG {
private long[] a;
private long b;
private long c;
private long[] seed;
public MRG(int[] a, int b, int c, int[] seed) {
this.a = new long[a.length];
for(int i = 0; i < a.length; i++) this.a[i] = a[i];
this.b = b;
this.c = c;
this.seed = new long[seed.length];
for(int i = 0; i < seed.length; i++) this.seed[i] = seed[i];
}
public int getInt() {
long tmp = 0;
for(int i = 0; i < a.length; i++) {
tmp = tmp + a[i] * seed[i];
}
tmp = tmp + b;
tmp = tmp % c;
for(int i = 1; i < seed.length; i++) {
seed[i] = seed[i-1];
}
seed[0] = tmp;
return (int)seed[0];
}
public int getMaxInt() {
return (int)c;
}
}
MLCG.java
public class MLCG extends LCG {
public MLCG(int a, int c, int seed) {
super(a, 0, c, seed);
}
}
MinStd.java
// Minimal Standard algorithm
//
// Original proposed by Lewis, Goodman & Miller 1969, but popularized
// by Park & Miller CACM Oct 1988.
//
// Name indicate that one should never choose an algorithm worse than this.
//
// cycle = 2 147 483 647 = 2^31-1
public class MinStd extends MLCG {
public MinStd(int seed) {
super(16807, 2147483647, seed);
}
}
LEcuyer.java
// L'Ecuyer algorithm
//
// Ecuyer & Blouin & Coutre ACM Simulations 1993.
//
// cycle = approx. 1E46
public class LEcuyer extends MRG {
public LEcuyer(int[] seed) {
super(new int[] { 107374182, 0, 0, 0, 104480 }, 0, 2147483647, seed);
}
}
Combiner.java
public interface Combiner {
public int combine(int[] num);
public int getMaxInt();
}
Combined.java
public class Combined extends RNG {
private RNG[] g;
private Combiner cmbn;
public Combined(RNG[] g, Combiner cmbn) {
this.g = g;
this.cmbn = cmbn;
}
public int getInt() {
int[] num = new int[g.length];
for(int i = 0; i < num.length; i++) {
num[i] = g[i].getInt();
}
return cmbn.combine(num);
}
public int getMaxInt() {
return cmbn.getMaxInt();
}
}
Uniform32.java
// Uniform32 algorithm
//
// L'Ecuyer CACM Jun 1988.
//
// cycle = approx. 2.31E18
public class Uniform32 extends Combined {
private static class Uniform32Combiner implements Combiner {
public int combine(int[] num) {
long tmp = num[0] - num[1];
tmp = (tmp + 2147483562) % 2147483562;
return (int)tmp;
}
public int getMaxInt() {
return 2147483562;
}
}
public Uniform32(int seed1, int seed2) {
super(new RNG[] { new MLCG(40014, 2147483563, seed1), new MLCG(40692, 2147483399, seed2) },
new Uniform32Combiner());
}
}
CMRG.java
// CMRG algorithm
//
// L'Ecuyer Operations Research 1996.
//
// cycle = 2^205 = approc. 1E61
public class CMRG extends Combined {
private static class CMRGCombiner implements Combiner {
public int combine(int[] num) {
long tmp = num[0] - num[1];
tmp = (tmp + 2147483647) % 2147483647;
return (int)tmp;
}
public int getMaxInt() {
return 2147483647;
}
}
public CMRG(int[] seed1, int[] seed2) {
super(new RNG[] { new MRG(new int[] { 0, 63308, -183326 }, 0, 2147483647, seed1),
new MRG(new int[] { 86098, 0, -539608 }, 0, 2145483479, seed2) },
new CMRGCombiner());
}
}
Test.java
public class Test {
private static void test(RNG rng) {
for(int i = 0; i < 10; i++) {
System.out.println(rng.getInt());
}
for(int i = 0; i < 10; i++) {
System.out.println(rng.getDouble());
}
}
private static void testMinStd() {
test(new MinStd(1234567));
}
private static void testUniform32() {
test(new Uniform32(1234567, 1234567));
}
private static void testLEcuyer() {
test(new LEcuyer(new int[] { 1234567, 1234567, 1234567, 1234567, 1234567 }));
}
private static void testCMRG() {
test(new CMRG(new int[] { 1234567, 1234567, 1234567 }, new int[] { 1234567, 1234567, 1234567 }));
}
public static void main(String[] args) {
testMinStd();
testUniform32();
testLEcuyer();
testCMRG();
}
}
RNG.cs
public abstract class RNG
{
public abstract int GetInt();
public abstract int MaxInt
{
get;
}
public double GetDouble()
{
return (GetInt() / (double)MaxInt);
}
}
public class LCG : RNG
{
private long a;
private long b;
private long c;
private long seed;
public LCG(int a, int b, int c, int seed)
{
this.a = a;
this.b = b;
this.c = c;
this.seed = seed;
}
public override int GetInt()
{
seed = (a * seed + b) % c;
return (int)seed;
}
public override int MaxInt
{
get
{
return (int)c;
}
}
}
public class MRG : RNG
{
private long[] a;
private long b;
private long c;
private long[] seed;
public MRG(int[] a, int b, int c, int[] seed)
{
this.a = new long[a.Length];
for(int i = 0; i < a.Length; i++) this.a[i] = a[i];
this.b = b;
this.c = c;
this.seed = new long[seed.Length];
for(int i = 0; i < seed.Length; i++) this.seed[i] = seed[i];
}
public override int GetInt()
{
long tmp = 0;
for(int i = 0; i < a.Length; i++)
{
tmp = tmp + a[i] * seed[i];
}
tmp = tmp + b;
tmp = tmp % c;
for(int i = 1; i < seed.Length; i++)
{
seed[i] = seed[i-1];
}
seed[0] = tmp;
return (int)seed[0];
}
public override int MaxInt
{
get
{
return (int)c;
}
}
}
public class MLCG : LCG
{
public MLCG(int a, int c, int seed) : base(a, 0, c, seed)
{
}
}
// Minimal Standard algorithm
//
// Original proposed by Lewis, Goodman & Miller 1969, but popularized
// by Park & Miller CACM Oct 1988.
//
// Name indicate that one should never choose an algorithm worse than this.
//
// cycle = 2 147 483 647 = 2^31-1
public class MinStd : MLCG
{
public MinStd(int seed) : base(16807, 2147483647, seed)
{
}
}
// L'Ecuyer algorithm
//
// Ecuyer & Blouin & Coutre ACM Simulations 1993.
//
// cycle = approx. 1E46
public class LEcuyer : MRG
{
public LEcuyer(int[] seed) : base(new int[] { 107374182, 0, 0, 0, 104480 }, 0, 2147483647, seed)
{
}
}
public interface Combiner
{
int Combine(int[] num);
int MaxInt
{
get;
}
}
public class Combined : RNG
{
private RNG[] g;
private Combiner cmbn;
public Combined(RNG[] g, Combiner cmbn)
{
this.g = g;
this.cmbn = cmbn;
}
public override int GetInt()
{
int[] num = new int[g.Length];
for(int i = 0; i < num.Length; i++)
{
num[i] = g[i].GetInt();
}
return cmbn.Combine(num);
}
public override int MaxInt
{
get
{
return cmbn.MaxInt;
}
}
}
// Uniform32 algorithm
//
// L'Ecuyer CACM Jun 1988.
//
// cycle = approx. 2.31E18
internal class Uniform32Combiner : Combiner
{
public int Combine(int[] num)
{
long tmp = num[0] - num[1];
tmp = (tmp + 2147483562) % 2147483562;
return (int)tmp;
}
public int MaxInt
{
get
{
return 2147483562;
}
}
}
public class Uniform32 : Combined
{
public Uniform32(int seed1, int seed2) : base(new RNG[] { new MLCG(40014, 2147483563, seed1),
new MLCG(40692, 2147483399, seed2) },
new Uniform32Combiner())
{
}
}
// CMRG algorithm
//
// L'Ecuyer Operations Research 1996.
//
// cycle = 2^205 = approx. 1E61
internal class CMRGCombiner : Combiner
{
public int Combine(int[] num)
{
long tmp = num[0] - num[1];
tmp = (tmp + 2147483647) % 2147483647;
return (int)tmp;
}
public int MaxInt
{
get
{
return 2147483647;
}
}
}
public class CMRG : Combined {
public CMRG(int[] seed1, int[] seed2) : base(new RNG[] { new MRG(new int[] { 0, 63308, -183326 }, 0, 2147483647, seed1),
new MRG(new int[] { 86098, 0, -539608 }, 0, 2145483479, seed2) },
new CMRGCombiner())
{
}
}
Test.cs
using System;
public class Program
{
private static void Test(RNG rng)
{
for(int i = 0; i < 10; i++)
{
Console.WriteLine(rng.GetInt());
}
for(int i = 0; i < 10; i++)
{
Console.WriteLine(rng.GetDouble());
}
}
private static void TestMinStd()
{
Test(new MinStd(1234567));
}
private static void TestUniform32()
{
Test(new Uniform32(1234567, 1234567));
}
private static void TestLEcuyer()
{
Test(new LEcuyer(new int[] { 1234567, 1234567, 1234567, 1234567, 1234567 }));
}
private static void TestCMRG()
{
Test(new CMRG(new int[] { 1234567, 1234567, 1234567 }, new int[] { 1234567, 1234567, 1234567 }));
}
public static void Main(string[] args)
{
TestMinStd();
TestUniform32();
TestLEcuyer();
TestCMRG();
}
}
Everything so far has been about uniform distributed random numbers. But sometimes it is necessary to generate numbers with another distribution.
Luckily other distributions can usually be derived from uniform distribution.
For normal distributed random nunmbers one can use the socalled Box Mueller transformation:
Let us see implementations in Java and C#.
import java.util.Random;
public class Normal
{
public static final int N = 100000;
public static void test(double e, double s) {
Random rng = new Random();
int[] one1 = new int[4];
int[] one2 = new int[4];
for(int i = 0; i < N; i++) {
double u1 = rng.nextDouble();
double u2 = rng.nextDouble();
double h1 = Math.sqrt(-2 * Math.log(u1));
double h2 = 2 * Math.PI * u2;
double n1 = e + s * h1 * Math.sin(h2);
double n2 = e + s * h1 * Math.cos(h2);
if(n1 < e - 1.96 * s) {
one1[0]++;
} else if(n1 < e) {
one1[1]++;
} else if(n1 < e + 1.96 * s) {
one1[2]++;
} else {
one1[3]++;
}
if(n2 < e - 1.96 * s) {
one2[0]++;
} else if(n2 < e) {
one2[1]++;
} else if(n2 < e + 1.96 * s) {
one2[2]++;
} else {
one2[3]++;
}
}
for(int i = 0; i < 4; i++)
{
System.out.println(one1[i] + " " + one2[i]);
}
}
public static void main(String[] args) {
test(0, 1);
test(7, 3);
}
}
using System;
public class Normal
{
public const int N = 100000;
public static void Test(double e, double s)
{
Random rng = new Random();
int[] one1 = new int[4];
int[] one2 = new int[4];
for(int i = 0; i < N; i++)
{
double u1 = rng.NextDouble();
double u2 = rng.NextDouble();
double h1 = Math.Sqrt(-2 * Math.Log(u1));
double h2 = 2 * Math.PI * u2;
double n1 = e + s * h1 * Math.Sin(h2);
double n2 = e + s * h1 * Math.Cos(h2);
if(n1 < e - 1.96 * s)
{
one1[0]++;
}
else if(n1 < e)
{
one1[1]++;
}
else if(n1 < e + 1.96 * s)
{
one1[2]++;
}
else
{
one1[3]++;
}
if(n2 < e - 1.96 * s)
{
one2[0]++;
}
else if(n2 < e)
{
one2[1]++;
}
else if(n2 < e + 1.96 * s)
{
one2[2]++;
}
else
{
one2[3]++;
}
}
for(int i = 0; i < 4; i++)
{
Console.WriteLine(one1[i] + " " + one2[i]);
}
}
public static void Main(string[] args) {
Test(0, 1);
Test(7, 3);
}
}
Output:
2544 2471 47544 47787 47360 47319 2552 2423 2474 2532 47367 47736 47694 47248 2465 2484
which looks fine (the expected distribution is 2.5%, 47.5%, 47.5% and 2.5%.
One can expect a nice distribution but not a perfect distribution.
Let us say that we generate 10000 random numbers and group in groups that should have same size (either pick 1/4 of range for each group or pick low 2 bit for group).
The expected distribution for a good generator is:
2500 2500 2500 2500
But it is not likely to get exactly that distribution.
The probability can be calculated via formula for multinominal distribution:
which according to my hand calculator is approx. 1*10-6, so one will only get the expected distribution one out of a million tries.
If one only look at the distribution for one group then it is binominal distributed and for large N we can approximate with a normal distribution:
which gives a standard deviation:
and therefore a distribution where there is 95% probability that the value is in the range:
PHP comes standard with 2 RNG's:
rand() uses the rand() function from the C runtime library on the system (so whether it is a good or bad algorithm depends on the operating system and/or the C compiler).
mt_rand() uses the socalled Mersenne Twister algorithm that:
So which to use?
PHP docs are clear:
Many random number generators of older libcs have dubious or unknown characteristics and are slow. By default, PHP uses the libc random number generator with the rand() function. The mt_rand() function is a drop-in replacement for this. It uses a random number generator with known characteristics using the Mersenne Twister, which will produce random numbers four times faster than what the average libc rand() provides.
In my opinion it is the unknown characteristics and not the slow speed that is the main reason to avoid rand() and use mt_rand() instead.
Test of rand() on Windows XP with PHP 5.3.0:
<?php
srand(1233567);
$high1 = array(0, 0, 0, 0);
$high2 = array(array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0));
$low1 = array(0, 0, 0, 0);
$low2 = array(array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0));
$prev = rand();
for($i = 0; $i < 10000; $i++) {
$curr = rand();
$high1[$curr * 4 / getrandmax()]++;
$high2[$curr * 4 / getrandmax()][$prev* 4 / getrandmax()]++;
$low1[$curr % 4]++;
$low2[$curr % 4][$prev % 4]++;
$prev = $curr;
}
echo "High bits:\n";
echo "<table border>\n";
for($row = 0; $row < 4; $row++) {
echo "<tr>\n";
echo "<td>" . $high1[$row] . "</td>\n";
echo "</tr>\n";
}
echo "</table>\n";
echo "<table border>\n";
for($row = 0; $row < 4; $row++) {
echo "<tr>\n";
for($col = 0; $col < 4; $col++) {
echo "<td>" . $high2[$row][$col] . "</td>\n";
}
echo "</tr>\n";
}
echo "</table>\n";
echo "Low bits:\n";
echo "<table border>\n";
for($row = 0; $row < 4; $row++) {
echo "<tr>\n";
echo "<td>" . $low1[$row] . "</td>\n";
echo "</tr>\n";
}
echo "</table>\n";
echo "<table border>\n";
for($row = 0; $row < 4; $row++) {
echo "<tr>\n";
for($col = 0; $col < 4; $col++) {
echo "<td>" . $low2[$row][$col] . "</td>\n";
}
echo "</tr>\n";
}
echo "</table>\n";
?>
Result:
High bits: 2438 2480 2577 2504 607 605 612 614 605 608 645 622 627 641 659 650 599 625 662 617 Low bits: 2500 2500 2500 2500 0 0 0 2500 2500 0 0 0 0 2500 0 0 0 0 2500 0
Not good in low bits!
Test of mt_rand() on Windows XP with PHP 5.3.0:
<?php
mt_srand(1233567);
$high1 = array(0, 0, 0, 0);
$high2 = array(array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0));
$low1 = array(0, 0, 0, 0);
$low2 = array(array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0));
$prev = mt_rand();
for($i = 0; $i < 10000; $i++) {
$curr = mt_rand();
$high1[$curr * 4 / mt_getrandmax()]++;
$high2[$curr * 4 / mt_getrandmax()][$prev* 4 / mt_getrandmax()]++;
$low1[$curr % 4]++;
$low2[$curr % 4][$prev % 4]++;
$prev = $curr;
}
echo "High bits:\n";
echo "<table border>\n";
for($row = 0; $row < 4; $row++) {
echo "<tr>\n";
echo "<td>" . $high1[$row] . "</td>\n";
echo "</tr>\n";
}
echo "</table>\n";
echo "<table border>\n";
for($row = 0; $row < 4; $row++) {
echo "<tr>\n";
for($col = 0; $col < 4; $col++) {
echo "<td>" . $high2[$row][$col] . "</td>\n";
}
echo "</tr>\n";
}
echo "</table>\n";
echo "Low bits:\n";
echo "<table border>\n";
for($row = 0; $row < 4; $row++) {
echo "<tr>\n";
echo "<td>" . $low1[$row] . "</td>\n";
echo "</tr>\n";
}
echo "</table>\n";
echo "<table border>\n";
for($row = 0; $row < 4; $row++) {
echo "<tr>\n";
for($col = 0; $col < 4; $col++) {
echo "<td>" . $low2[$row][$col] . "</td>\n";
}
echo "</tr>\n";
}
echo "</table>\n";
?>
Result:
High bits: 2467 2522 2524 2487 594 603 648 622 641 647 617 617 610 646 619 649 623 625 640 599 Low bits: 2503 2515 2480 2502 634 630 622 617 634 627 638 616 629 616 609 626 606 641 612 643
Much better.
Conclusion always use mt_rand() and never rand().
VBS RND() function returns a floating point value 0.0 .. 1.0 (0.0 inclusive and 1.0 exclusive) and not an integer like most other base RNG's.
Often one need a random integer so a transformation is needed.
And then it is critical to know that RND() uses a 24 bit LCG.
This means that:
12 bits:
<%
Randomize(1234567)
high1 = array(0, 0, 0, 0)
high2 = array(array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0))
low1 = array(0, 0, 0, 0)
low2 = array(array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0))
prev = Fix(Rnd() * 4096)
For i = 1 To 10000
curr = Fix(Rnd() * 4096)
high1(curr * 4 \ 4096) = high1(curr * 4 \ 4096) + 1
high2(curr * 4 \ 4096)(prev * 4 \ 4096) = high2(curr * 4 \ 4096)(prev * 4 \ 4096) + 1
low1(curr Mod 4) = low1(curr Mod 4) + 1
low2(curr Mod 4)(prev Mod 4) = low2(curr Mod 4)(prev Mod 4) + 1
prev = curr
Next
Response.Write "High bits:" & vbCrLf
Response.Write "<table border>" & vbCrLf
For row = 0 To 3
Response.Write "<tr>" & vbCrLf
Response.Write "<td>" & high1(row) & "</td>" & vbCrLf
Response.Write "</tr>" & vbCrLf
Next
Response.Write "</table>" & vbCrLf
Response.Write "<table border>" & vbCrLf
For row = 0 To 3
Response.Write "<tr>" & vbCrLf
For col = 0 To 3
Response.Write "<td>" & high2(row)(col) & "</td>" & vbCrLf
Next
Response.Write "</tr>" & vbCrLf
Next
Response.Write "</table>" & vbCrLf
Response.Write "Low bits:" & vbCrLf
Response.Write "<table border>" & vbCrLf
For row = 0 To 3
Response.Write "<tr>" & vbCrLf
Response.Write "<td>" & low1(row) & "</td>" & vbCrLf
Response.Write "</tr>" & vbCrLf
Next
Response.Write "</table>" & vbCrLf
Response.Write "<table border>" & vbCrLf
For row = 0 To 3
Response.Write "<tr>" & vbCrLf
For col = 0 To 3
Response.Write "<td>" & low2(row)(col) & "</td>" & vbCrLf
Next
Response.Write "</tr>" & vbCrLf
Next
Response.Write "</table>" & vbCrLf
%>
Result:
High bits: 2528 2474 2509 2489 631 629 633 635 619 617 630 608 643 593 633 640 635 635 613 606 Low bits: 2525 2512 2450 2513 648 628 616 633 641 621 620 630 613 621 600 616 623 641 614 635
which looks fine.
16 bit:
<%
Randomize(1234567)
high1 = array(0, 0, 0, 0)
high2 = array(array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0))
low1 = array(0, 0, 0, 0)
low2 = array(array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0))
prev = Fix(Rnd() * 65536)
For i = 1 To 10000
curr = Fix(Rnd() * 65536)
high1(curr * 4 \ 65536) = high1(curr * 4 \ 65536) + 1
high2(curr * 4 \ 65536)(prev * 4 \ 65536) = high2(curr * 4 \ 65536)(prev * 4 \ 65536) + 1
low1(curr Mod 4) = low1(curr Mod 4) + 1
low2(curr Mod 4)(prev Mod 4) = low2(curr Mod 4)(prev Mod 4) + 1
prev = curr
Next
Response.Write "High bits:" & vbCrLf
Response.Write "<table border>" & vbCrLf
For row = 0 To 3
Response.Write "<tr>" & vbCrLf
Response.Write "<td>" & high1(row) & "</td>" & vbCrLf
Response.Write "</tr>" & vbCrLf
Next
Response.Write "</table>" & vbCrLf
Response.Write "<table border>" & vbCrLf
For row = 0 To 3
Response.Write "<tr>" & vbCrLf
For col = 0 To 3
Response.Write "<td>" & high2(row)(col) & "</td>" & vbCrLf
Next
Response.Write "</tr>" & vbCrLf
Next
Response.Write "</table>" & vbCrLf
Response.Write "Low bits:" & vbCrLf
Response.Write "<table border>" & vbCrLf
For row = 0 To 3
Response.Write "<tr>" & vbCrLf
Response.Write "<td>" & low1(row) & "</td>" & vbCrLf
Response.Write "</tr>" & vbCrLf
Next
Response.Write "</table>" & vbCrLf
Response.Write "<table border>" & vbCrLf
For row = 0 To 3
Response.Write "<tr>" & vbCrLf
For col = 0 To 3
Response.Write "<td>" & low2(row)(col) & "</td>" & vbCrLf
Next
Response.Write "</tr>" & vbCrLf
Next
Response.Write "</table>" & vbCrLf
%>
Result:
High bits: 2528 2474 2509 2489 631 629 633 635 619 617 630 608 643 593 633 640 635 635 613 606 Low bits: 2509 2485 2506 2500 837 194 644 834 828 822 194 641 649 826 834 197 195 643 834 828
and we see some deviations in low bits.
24 bit:
<%
Randomize(1234567)
high1 = array(0, 0, 0, 0)
high2 = array(array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0))
low1 = array(0, 0, 0, 0)
low2 = array(array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0))
prev = Fix(Rnd() * 16777216)
For i = 1 To 10000
curr = Fix(Rnd() * 16777216)
high1(curr * 4 \ 16777216) = high1(curr * 4 \ 16777216) + 1
high2(curr * 4 \ 16777216)(prev * 4 \ 16777216) = high2(curr * 4 \ 16777216)(prev * 4 \ 16777216) + 1
low1(curr Mod 4) = low1(curr Mod 4) + 1
low2(curr Mod 4)(prev Mod 4) = low2(curr Mod 4)(prev Mod 4) + 1
prev = curr
Next
Response.Write "High bits:" & vbCrLf
Response.Write "<table border>" & vbCrLf
For row = 0 To 3
Response.Write "<tr>" & vbCrLf
Response.Write "<td>" & high1(row) & "</td>" & vbCrLf
Response.Write "</tr>" & vbCrLf
Next
Response.Write "</table>" & vbCrLf
Response.Write "<table border>" & vbCrLf
For row = 0 To 3
Response.Write "<tr>" & vbCrLf
For col = 0 To 3
Response.Write "<td>" & high2(row)(col) & "</td>" & vbCrLf
Next
Response.Write "</tr>" & vbCrLf
Next
Response.Write "</table>" & vbCrLf
Response.Write "Low bits:" & vbCrLf
Response.Write "<table border>" & vbCrLf
For row = 0 To 3
Response.Write "<tr>" & vbCrLf
Response.Write "<td>" & low1(row) & "</td>" & vbCrLf
Response.Write "</tr>" & vbCrLf
Next
Response.Write "</table>" & vbCrLf
Response.Write "<table border>" & vbCrLf
For row = 0 To 3
Response.Write "<tr>" & vbCrLf
For col = 0 To 3
Response.Write "<td>" & low2(row)(col) & "</td>" & vbCrLf
Next
Response.Write "</tr>" & vbCrLf
Next
Response.Write "</table>" & vbCrLf
%>
Result:
High bits: 2528 2474 2509 2489 631 629 633 635 619 617 630 608 643 593 633 640 635 635 613 606 Low bits: 2500 2500 2500 2500 0 2500 0 0 0 0 2500 0 0 0 0 2500 2500 0 0 0
a typical LCG very bad low bits result.
Conclusion: VBS RND() is acceptable to scale to integer values 1-12 bit, but more bits require carefulness due to bad distribution of low bits. And never go above 24 bits.
ASP.NET uses .NET RNG's that are described in previous section and have good characteristica.
Java EE uses Java RNG's that are described in previous section and have good characteristica.
Version | Date | Description |
---|---|---|
1.0 | April 18th 2005, May 23rd 2005 and April 15th 2006 | Initial version (in Danish) published on Eksperten.dk in 3 separate articles |
1.2 | February 12th 2010 | Minor changes |
2.0 | August 1st 2016 | Translation to English and complete reformatting and publishing here |
2.1 | October 8th 2016 | Add content overview |
2.2 | March 7th 2021 | Add more examples and update some text |
See list of all articles here
Please send comments to Arne Vajhøj