Random numbers

Content:

  1. What is random numbers?
  2. Good usage of builtin RNG
  3. Bad usage of builtin RNG
    1. Reinitalize for each number
    2. Bad scaling
    3. Bad combination
    4. Low bits
    5. Summary
  4. Good random distribution versus unpredictability
  5. Some theory
  6. Some known good algorithms
  7. Non-uniform distribution
  8. What to expect from generated random numbers?
  9. PHP
  10. ASP
  11. ASP.NET
  12. Java EE

What is random numbers?

True random numbers are not easy to generate. That needs something external that is true random like certain things at the atomic level like radio-activity.

When we talk software then we usually mean pseudo random numbers, which is numbers that are fully deterministic but have characteristics similar to true random numbers. And due to that then in most cases they are just as good as true random numbers. And they are way easier to generate.

Unless otherwise stated then all the following examples are about uniform distributed random numbers - that is random numbers where each number has the same probability.

Throughout the article I will abbreviate rand number generator as RNG.

Examples will be shown using various programming languages like Java, C# and C. But the examples should be so simple that it is possible to understand the code even without knowing the specific language.

Good usage of builtin RNG:

Most programming languages/runtime comes with a builtin RNG.

Examples of usage:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 10
#define K 100

int main()
{
    int i, rani;
    double ranx;
    /* initialize random generator */
    srand(time(NULL));
    for(i = 0; i < N; i++)
    {
        /* generate random integer in the range from 0 (inclusive) to K-1 (inclusive) */
        rani = rand() % K;
        printf("%d\n", rani);
    }
    for(i = 0; i < N; i++)
    {
        /* generate random double in the range 0.0 (inclusive) to 1.0 (exclusive) */ 
        ranx = rand() / (double)(RAND_MAX + 1);
        printf("%f\n", ranx);
    }
    return 0;
}
import java.util.Random;

public class Rng {
    public final static int N = 10;
    public final static int K = 100;
    public static void main(String[] args) {
        // initialize random generator
        Random rng = new Random();
        for(int i = 0; i < N; i++) {
            // generate random integer in the range from 0 (inclusive) to K-1 (inclusive)
            int rani = rng.nextInt(K);
            System.out.println(rani);
        }
        for(int i = 0; i < N; i++) {
            // generate random double in the range 0.0 (inclusive) to 1.0 (exclusive) 
            double ranx = rng.nextDouble();
            System.out.println(ranx);
        }
    }
}
using System;

public class Rng
{
    public const int N = 10;
    public const int K = 100;
    public static void Main(string[] args)
    {
        // initialize random generator
        Random rng = new Random();
        for(int i = 0; i < N; i++)
        {
            // generate random integer in the range from 0 (inclusive) to K-1 (inclusive)
            int rani = rng.Next(K);
            Console.WriteLine(rani);
        }
        for(int i = 0; i < N; i++)
        {
            // generate random double in the range 0.0 (inclusive) to 1.0 (exclusive) 
            double ranx = rng.NextDouble();
            Console.WriteLine(ranx);
        }
    }
}

Are the output from these builtin RNG's good meaning that the generated numbers have characteristics similar to true random numbers?

In most cases they are OK, not great, but not bad either.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 50000
#define K 10

int main()
{
    int i, j, last, one[K], two[K][K], a[N];
    srand(time(NULL));
    for(i = 0; i < N; i++)
    {
        a[i] = rand() % K;
    }
    for(i = 0; i < K; i++) one[i] = 0;
    for(i = 0; i < N; i++)
    {
        one[a[i]]++;
    }
    for(i = 0; i < K; i++) for(j = 0; j < K; j++) two[i][j] = 0;
    last = a[0];
    for(i = 1; i < N; i++)
    {
        two[last][a[i]]++;
        last = a[i];
    }
    for(int i = 0; i < K; i++)
    {
        printf("%d\n", one[i]);
    }
    for(int i = 0; i < K; i++)
    {
        for(int j = 0; j < K; j++)
        {
            printf(" %d", two[i][j]);
        }
        printf("\n");
    }
    return 0;
}
import java.util.Random;

public class GoodRng {
    public final static int N = 50000;
    public final static int K = 10;
    public static void main(String[] args) {
        Random rng = new Random();
        int[] one = new int[K];
        int[][] two = new int[K][K];
        int[] a = new int[N];
        for(int i = 0; i < N; i++) {
            a[i] = rng.nextInt(K);
        }
        for(int i = 0; i < N; i++) {
            one[a[i]]++;
        }
        int last = a[0];
        for(int i = 1; i < N; i++) {
            two[last][a[i]]++;
            last = a[i];
        }
        for(int i = 0; i < K; i++) {
            System.out.println(one[i]);
        }
        for(int i = 0; i < K; i++) {
            for(int j = 0; j < K; j++) {
                System.out.print(" " + two[i][j]);
            }
            System.out.println();
        }
    }
}
using System;

public class Rng
{
    public const int N = 50000;
    public const int K = 10;
    public static void Main(string[] args)
    {
        Random rng = new Random();
        int[] one = new int[K];
        int[,] two = new int[K,K];
        int[] a = new int[N];
        for(int i = 0; i < N; i++)
        {
            a[i] = rng.Next(K);
        }
        for(int i = 0; i < N; i++)
        {
            one[a[i]]++;
        }
        int last = a[0];
        for(int i = 1; i < N; i++)
        {
            two[last,a[i]]++;
            last = a[i];
        }
        for(int i = 0; i < K; i++)
        {
            Console.WriteLine(one[i]);
        }
        for(int i = 0; i < K; i++)
        {
            for(int j = 0; j < K; j++)
            {
                Console.Write(" " + two[i,j]);
            }
            Console.WriteLine();
        }
    }
}

Example output:

4945
5018
5085
4980
4956
4992
4846
5030
5045
5103
 436 505 513 521 455 497 503 483 521 510
 545 508 525 493 514 480 431 464 509 549
 488 504 544 486 504 503 484 507 536 529
 500 488 524 509 494 490 492 476 515 492
 506 517 523 464 463 521 503 506 462 491
 506 512 448 498 517 514 493 486 510 508
 478 468 486 493 471 469 459 534 516 472
 481 515 505 471 507 519 503 542 479 508
 505 492 507 531 511 492 476 502 481 548
 500 509 510 514 520 507 501 530 516 496

This output will be shown many times. It is a very simple test for randomness. It tests the distribution of the generated numbers and the distribution of the generated numbers given the previous number. Both should have an equal distribution.

Much better tests for randomness exists. Including the well-known Diehard test af Marsaglia.

Bad usage of builtin RNG:

Even though the builtin RNG may be OK, then it is very easy for the application code to use it incorrectly resulting in very non-random results.

Below are some classic mistakes.

Reinitalize for each number

Reinitializing the RNG for each number. Some beginners think that if initializing RNG once results in OK random numbers then initializing RNG for every number will produce even better random numbers. Not so.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 50000
#define K 10

int main()
{
    int i, j, last, one[K], two[K][K], a[N];
    for(i = 0; i < N; i++)
    {
        srand(time(NULL)); /* <---- initialize for every number */
        a[i] = rand() % K;
    }
    for(i = 0; i < K; i++) one[i] = 0;
    for(i = 0; i < N; i++)
    {
        one[a[i]]++;
    }
    for(i = 0; i < K; i++) for(j = 0; j < K; j++) two[i][j] = 0;
    last = a[0];
    for(i = 1; i < N; i++)
    {
        two[last][a[i]]++;
        last = a[i];
    }
    for(int i = 0; i < K; i++)
    {
        printf("%d\n", one[i]);
    }
    for(int i = 0; i < K; i++)
    {
        for(int j = 0; j < K; j++)
        {
            printf(" %d", two[i][j]);
        }
        printf("\n");
    }
    return 0;
}
import java.util.Random;

// always bad code
// produces bad results in Java 1.0 - 1.4
// produces ok results in Java 1.5-
public class BadRng1 {
    public final static int N = 50000;
    public final static int K = 10;
    public static void main(String[] args) {
        int[] one = new int[K];
        int[][] two = new int[K][K];
        int[] a = new int[N];
        for(int i = 0; i < N; i++) {
            Random rng = new Random(); // <---- initialize for every number
            a[i] = rng.nextInt(K);
        }
        for(int i = 0; i < N; i++) {
            one[a[i]]++;
        }
        int last = a[0];
        for(int i = 1; i < N; i++) {
            two[last][a[i]]++;
            last = a[i];
        }
        for(int i = 0; i < K; i++) {
            System.out.println(one[i]);
        }
        for(int i = 0; i < K; i++) {
            for(int j = 0; j < K; j++) {
                System.out.print(" " + two[i][j]);
            }
            System.out.println();
        }
    }
}
using System;

public class BadRng1
{
    public const int N = 50000;
    public const int K = 10;
    public static void Main(string[] args) {
        int[] one = new int[K];
        int[,] two = new int[K,K];
        int[] a = new int[N];
        for(int i = 0; i < N; i++)
        {
            Random rng = new Random(); // <---- initialize for every number
            a[i] = rng.Next(K);
        }
        for(int i = 0; i < N; i++)
        {
            one[a[i]]++;
        }
        int last = a[0];
        for(int i = 1; i < N; i++)
        {
            two[last,a[i]]++;
            last = a[i];
        }
        for(int i = 0; i < K; i++)
        {
            Console.WriteLine(one[i]);
        }
        for(int i = 0; i < K; i++)
        {
            for(int j = 0; j < K; j++)
            {
                Console.Write(" " + two[i,j]);
            }
            Console.WriteLine();
        }
    }
}

Example output:

0
0
16282
170
33548
0
0
0
0
0
 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0
 0 0 16281 0 1 0 0 0 0 0
 0 0 0 169 1 0 0 0 0 0
 0 0 1 0 33546 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0

The numbers are not random at all. The problem is that initialization often uses time as basis for the random numbers. So all initializations within a given time interval will result in the same numbers being generated.

The right way is to just initialize once.

Often it is done as a class or instance field in OOP languages.

In Java:

public static final Random rng = new Random();

or:

public final Random rng = new Random();

Check documentation for RNG regarding thread safety when using in multi-threaded context.

Bad scaling

Bad scaling. Some beginners think that if you have a RNG generating OK random numbers then scaling the numbers will always result in OK random numbers. Not so.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 50000
#define K 10

int main()
{
    int i, j, last, one[K], two[K][K], a[N];
    srand(time(NULL));
    for(i = 0; i < N; i++)
    {
        a[i] = (rand() % 15) % K;  /* scale random number 0..14 to 0..9 */
    }
    for(i = 0; i < K; i++) one[i] = 0;
    for(i = 0; i < N; i++)
    {
        one[a[i]]++;
    }
    for(i = 0; i < K; i++) for(j = 0; j < K; j++) two[i][j] = 0;
    last = a[0];
    for(i = 1; i < N; i++)
    {
        two[last][a[i]]++;
        last = a[i];
    }
    for(int i = 0; i < K; i++)
    {
        printf("%d\n", one[i]);
    }
    for(int i = 0; i < K; i++)
    {
        for(int j = 0; j < K; j++)
        {
            printf(" %d", two[i][j]);
        }
        printf("\n");
    }
    return 0;
}
import java.util.Random;

public class BadRng2 {
    public final static int N = 50000;
    public final static int K = 10;
    public static void main(String[] args) {
        Random rng = new Random();
        int[] one = new int[K];
        int[][] two = new int[K][K];
        int[] a = new int[N];
        for(int i = 0; i < N; i++) {
            a[i] = rng.nextInt(15) % K; // scale random number 0..14 to 0..9
        }
        for(int i = 0; i < N; i++) {
            one[a[i]]++;
        }
        int last = a[0];
        for(int i = 1; i < N; i++) {
            two[last][a[i]]++;
            last = a[i];
        }
        for(int i = 0; i < K; i++) {
            System.out.println(one[i]);
        }
        for(int i = 0; i < K; i++) {
            for(int j = 0; j < K; j++) {
                System.out.print(" " + two[i][j]);
            }
            System.out.println();
        }
    }
}
using System;

public class BadRng2
{
    public const int N = 50000;
    public const int K = 10;
    public static void Main(string[] args)
    {
        Random rng = new Random();
        int[] one = new int[K];
        int[,] two = new int[K,K];
        int[] a = new int[N];
        for(int i = 0; i < N; i++)
        {
            a[i] = rng.Next(15) % K; // scale random number 0..14 to 0..9
        }
        for(int i = 0; i < N; i++)
        {
            one[a[i]]++;
        }
        int last = a[0];
        for(int i = 1; i < N; i++)
        {
            two[last,a[i]]++;
            last = a[i];
        }
        for(int i = 0; i < K; i++)
        {
            Console.WriteLine(one[i]);
        }
        for(int i = 0; i < K; i++)
        {
            for(int j = 0; j < K; j++)
            {
                Console.Write(" " + two[i,j]);
            }
            Console.WriteLine();
        }
    }
}

It is rarely as obvious as here. Usually the 15 and the K are in different code. But the problem still exists.

Example output:

6686
6678
6700
6610
6558
3274
3408
3415
3316
3355
 913 878 911 864 888 470 449 426 457 430
 833 918 877 911 920 450 421 493 429 426
 908 873 890 918 882 414 488 443 427 457
 882 909 868 851 853 420 454 452 453 467
 894 884 925 861 888 430 431 439 412 394
 453 414 456 423 415 202 226 204 227 254
 462 470 444 461 419 210 237 253 220 232
 493 461 422 434 449 228 252 226 217 233
 408 425 480 447 399 211 236 237 235 238
 440 446 427 440 445 239 213 242 239 224

The numbers are not random at all. The problem is that 10 is not a factor in 15 and that 10 is relative close to 15. This combination result in very poor distribution.

Note that the following modification does not solve the problem:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 50000
#define K 10

int main()
{
    int i, j, last, one[K], two[K][K], a[N];
    srand(time(NULL));
    for(i = 0; i < N; i++)
    {
        a[i] = (int)(((rand() % 15) / 15.0) * K);  /* scale random number 0..14 to 0..9 */
    }
    for(i = 0; i < K; i++) one[i] = 0;
    for(i = 0; i < N; i++)
    {
        one[a[i]]++;
    }
    for(i = 0; i < K; i++) for(j = 0; j < K; j++) two[i][j] = 0;
    last = a[0];
    for(i = 1; i < N; i++)
    {
        two[last][a[i]]++;
        last = a[i];
    }
    for(int i = 0; i < K; i++)
    {
        printf("%d\n", one[i]);
    }
    for(int i = 0; i < K; i++)
    {
        for(int j = 0; j < K; j++)
        {
            printf(" %d", two[i][j]);
        }
        printf("\n");
    }
    return 0;
}
import java.util.Random;

public class BadRng2NotFix {
    public final static int N = 50000;
    public final static int K = 10;
    public static void main(String[] args) {
        Random rng = new Random();
        int[] one = new int[K];
        int[][] two = new int[K][K];
        int[] a = new int[N];
        for(int i = 0; i < N; i++) {
            a[i] = (int)(rng.nextInt(15) / 15.0 * K); // scale random number 0..14 to 0.0..1.0 to 0..9
        }
        for(int i = 0; i < N; i++) {
            one[a[i]]++;
        }
        int last = a[0];
        for(int i = 1; i < N; i++) {
            two[last][a[i]]++;
            last = a[i];
        }
        for(int i = 0; i < K; i++) {
            System.out.println(one[i]);
        }
        for(int i = 0; i < K; i++) {
            for(int j = 0; j < K; j++) {
                System.out.print(" " + two[i][j]);
            }
            System.out.println();
        }
    }
}
using System;

public class BadRng2NotFix
{
    public const int N = 50000;
    public const int K = 10;
    public static void Main(string[] args)
    {
        Random rng = new Random();
        int[] one = new int[K];
        int[,] two = new int[K,K];
        int[] a = new int[N];
        for(int i = 0; i < N; i++)
        {
            a[i] = (int)((rng.Next(15) / 15.0) * K); // scale random number 0..14 to 0.0..1.0 to to 0..9
        }
        for(int i = 0; i < N; i++)
        {
            one[a[i]]++;
        }
        int last = a[0];
        for(int i = 1; i < N; i++)
        {
            two[last,a[i]]++;
            last = a[i];
        }
        for(int i = 0; i < K; i++)
        {
            Console.WriteLine(one[i]);
        }
        for(int i = 0; i < K; i++)
        {
            for(int j = 0; j < K; j++)
            {
                Console.Write(" " + two[i,j]);
            }
            Console.WriteLine();
        }
    }
}

Example output:

6685
3212
6714
3323
6680
3280
6736
3367
6653
3350
 910 421 910 465 877 389 884 455 925 449
 428 188 418 225 410 207 452 236 453 195
 856 413 943 449 917 446 889 431 909 461
 415 227 433 228 458 232 457 227 420 226
 877 435 920 427 898 433 902 433 899 456
 437 222 394 200 484 235 427 229 438 214
 877 443 895 433 916 440 898 462 902 469
 455 206 428 207 439 258 461 219 465 229
 948 433 893 460 866 423 928 461 801 440
 482 224 480 229 415 217 438 214 440 211

It just moves the problem around a little bit.

The easy solution is to never scale down more than once.

If it is necessary to scale down more than once then it can be done by generating a new number every time you get a number >= (MAX_RAND / K) * K.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 50000
#define K 10

int main()
{
    int i, j, tmp, last, one[K], two[K][K], a[N];
    srand(time(NULL));
    for(i = 0; i < N; i++)
    {
        do
        {
            tmp = rand() % 15;
        }
        while(tmp >= 10); /* retry until random number is in 0..9 (or M*10+9) */
        a[i] =  tmp % K; /* scale random number 0..9 (or M*10+9) to 0..9 */
    }
    for(i = 0; i < K; i++) one[i] = 0;
    for(i = 0; i < N; i++)
    {
        one[a[i]]++;
    }
    for(i = 0; i < K; i++) for(j = 0; j < K; j++) two[i][j] = 0;
    last = a[0];
    for(i = 1; i < N; i++)
    {
        two[last][a[i]]++;
        last = a[i];
    }
    for(int i = 0; i < K; i++)
    {
        printf("%d\n", one[i]);
    }
    for(int i = 0; i < K; i++)
    {
        for(int j = 0; j < K; j++)
        {
            printf(" %d", two[i][j]);
        }
        printf("\n");
    }
    return 0;
}
import java.util.Random;

public class BadRng2GoodFix {
    public final static int N = 50000;
    public final static int K = 10;
    public static void main(String[] args) {
        Random rng = new Random();
        int[] one = new int[K];
        int[][] two = new int[K][K];
        int[] a = new int[N];
        for(int i = 0; i < N; i++) {
            a[i] = rng.nextInt(15) %K; // scale random number 0..14 to 0..9
        }
        for(int i = 0; i < N; i++) {
            int tmp;
            do {
                tmp = rng.nextInt(15);
            } while(tmp >= 10); // retry until random number is in 0..9 (or M*10+9)
            a[i] =  tmp % K; // scale random number 0..9 (or M*10+9) to 0..9
        }
        for(int i = 0; i < N; i++) {
            one[a[i]]++;
        }
        int last = a[0];
        for(int i = 1; i < N; i++) {
            two[last][a[i]]++;
            last = a[i];
        }
        for(int i = 0; i < K; i++) {
            System.out.println(one[i]);
        }
        for(int i = 0; i < K; i++) {
            for(int j = 0; j < K; j++) {
                System.out.print(" " + two[i][j]);
            }
            System.out.println();
        }
    }
}
using System;

public class BadRng2GoodFix
{
    public const int N = 50000;
    public const int K = 10;
    public static void Main(string[] args)
    {
        Random rng = new Random();
        int[] one = new int[K];
        int[,] two = new int[K,K];
        int[] a = new int[N];
        for(int i = 0; i < N; i++)
        {
            int tmp;
            do
            {
                tmp = rng.Next(15);
            }
            while(tmp >= 10); // retry until random number is in 0..9 (or M*10+9)
            a[i] =  tmp % K; // scale random number 0..9 (or M*10+9) to 0..9
        }
        for(int i = 0; i < N; i++)
        {
            one[a[i]]++;
        }
        int last = a[0];
        for(int i = 1; i < N; i++)
        {
            two[last,a[i]]++;
            last = a[i];
        }
        for(int i = 0; i < K; i++)
        {
            Console.WriteLine(one[i]);
        }
        for(int i = 0; i < K; i++)
        {
            for(int j = 0; j < K; j++)
            {
                Console.Write(" " + two[i,j]);
            }
            Console.WriteLine();
        }
    }
}

Example output:

5060
5078
5143
4951
4992
5025
4952
4981
4909
4909
 537 540 524 468 519 526 499 508 458 481
 522 495 559 514 529 440 507 484 521 507
 522 522 503 491 469 529 500 526 560 521
 501 533 502 524 505 478 476 466 478 488
 476 533 527 483 537 501 477 500 477 481
 504 490 504 480 509 526 505 496 524 487
 497 510 504 495 468 509 509 515 456 488
 522 494 508 506 503 518 508 469 486 467
 493 467 485 504 486 493 472 512 465 532
 486 494 527 486 467 504 499 505 484 457

Bad combination

Bad combination of random numbers. Some beginners think that combining two RNG's generating OK random numbers will produce even better random numbers. Not so.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 50000
#define K 10

int main()
{
    int i, j, last, one[K], two[K][K], a[N];
    srand(time(NULL));
    for(i = 0; i < N; i++)
    {
        a[i] = (rand() % K + rand() % K) / 2; /* average of two random numbers 0..9 */
    }
    for(i = 0; i < K; i++) one[i] = 0;
    for(i = 0; i < N; i++)
    {
        one[a[i]]++;
    }
    for(i = 0; i < K; i++) for(j = 0; j < K; j++) two[i][j] = 0;
    last = a[0];
    for(i = 1; i < N; i++)
    {
        two[last][a[i]]++;
        last = a[i];
    }
    for(int i = 0; i < K; i++)
    {
        printf("%d\n", one[i]);
    }
    for(int i = 0; i < K; i++)
    {
        for(int j = 0; j < K; j++)
        {
            printf(" %d", two[i][j]);
        }
        printf("\n");
    }
    return 0;
}
import java.util.Random;

public class BadRng3 {
    public final static int N = 50000;
    public final static int K = 10;
    public static void main(String[] args) {
        Random rng = new Random();
        int[] one = new int[K];
        int[][] two = new int[K][K];
        int[] a = new int[N];
        for(int i = 0; i < N; i++) {
            a[i] = (rng.nextInt(K) + rng.nextInt(K)) / 2; // average of two random numbers 0..9
        }
        for(int i = 0; i < N; i++) {
            one[a[i]]++;
        }
        int last = a[0];
        for(int i = 1; i < N; i++) {
            two[last][a[i]]++;
            last = a[i];
        }
        for(int i = 0; i < K; i++) {
            System.out.println(one[i]);
        }
        for(int i = 0; i < K; i++) {
            for(int j = 0; j < K; j++) {
                System.out.print(" " + two[i][j]);
            }
            System.out.println();
        }
    }
}
using System;

public class BadRng3
{
    public const int N = 50000;
    public const int K = 10;
    public static void Main(string[] args)
    {
        Random rng = new Random();
        int[] one = new int[K];
        int[,] two = new int[K,K];
        int[] a = new int[N];
        for(int i = 0; i < N; i++)
        {
            a[i] = (rng.Next(K) + rng.Next(K)) / 2; // average of two random numbers 0..9
        }
        for(int i = 0; i < N; i++)
        {
            one[a[i]]++;
        }
        int last = a[0];
        for(int i = 1; i < N; i++)
        {
            two[last,a[i]]++;
            last = a[i];
        }
        for(int i = 0; i < K; i++)
        {
            Console.WriteLine(one[i]);
        }
        for(int i = 0; i < K; i++)
        {
            for(int j = 0; j < K; j++)
            {
                Console.Write(" " + two[i,j]);
            }
            Console.WriteLine();
        }
    }
}

Example output:

1532
3568
5473
7410
9506
8536
6476
4494
2499
506
 35 116 178 225 310 253 208 124 67 16
 101 256 386 552 645 612 468 343 170 35
 173 407 614 814 1064 907 660 497 283 54
 221 497 783 1042 1537 1266 975 642 365 81
 292 664 1077 1404 1776 1684 1185 852 494 78
 267 648 908 1298 1575 1463 1084 765 432 96
 217 452 693 989 1202 1057 891 582 325 68
 140 315 512 648 838 766 612 406 215 42
 75 183 261 370 461 430 329 232 126 32
 11 30 61 68 98 98 64 51 21 4

The numbers are not random at all.

It requires some good math understanding to combine RNG's. So for beginners the advice is simply: don't do it. Later in this article some combinations that actually work will be shown.

Low bits

Use of low bits. When taking modulus of powers of two extra care must be taken. Taking modulus of powers of two is equivalent of taking a number of low bits. Some common RNG's generate numbers that are not very random in low bits.

Example in C:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 50000
#define K 4

static unsigned long int seed;

void mysrand(unsigned long int ss)
{
    seed = ss;
    return;
}

unsigned long int myrand()
{
    /*
    seed = (65539UL * seed) % 2147483648UL;
    */
    unsigned long int help1 = 2147483648UL % 65539UL;
    unsigned long int help2 = 2147483648UL / 65539UL;
    long int tmp = 65539UL * (seed % help2) - help1 * (seed / help2);
    if(tmp >= 0)
        seed = tmp;
    else
        seed = tmp + 2147483648UL;
    return seed;
}

int main()
{
    int i, j, last, one[K], two[K][K], a[N];
    mysrand(time(NULL));
    for(i = 0; i < N; i++)
    {
        a[i] = myrand() % K;
    }
    for(i = 0; i < K; i++) one[i] = 0;
    for(i = 0; i < N; i++)
    {
        one[a[i]]++;
    }
    for(i = 0; i < K; i++) for(j = 0; j < K; j++) two[i][j] = 0;
    last = a[0];
    for(i = 1; i < N; i++)
    {
        two[last][a[i]]++;
        last = a[i];
    }
    for(i = 0; i < K; i++) printf("%d\n",one[i]);
    for(i = 0; i < K; i++) {
        for(j = 0; j < K; j++) 
        {
            printf(" %d",two[i][j]);
        }
        printf("\n");
    }
    return 0;
}

Don't think so much about the algorithm. It is not good, but it was commonly used 40 years ago.

Output:

0
25000
0
25000
 0 0 0 0
 0 0 0 24999
 0 0 0 0
 0 25000 0 0

The numbers are not random at all.

A solution is to use high bits instead of low bits:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 50000
#define K 4

static unsigned long int seed;

void mysrand(unsigned long int ss)
{
    seed = ss;
    return;
}

unsigned long int myrand()
{
    /*
    seed = (65539UL * seed) % 2147483648UL;
    */
    unsigned long int help1 = 2147483648UL % 65539UL;
    unsigned long int help2 = 2147483648UL / 65539UL;
    long int tmp = 65539UL * (seed % help2) - help1 * (seed / help2);
    if(tmp >= 0)
        seed = tmp;
    else
        seed = tmp + 2147483648UL;
    return seed;
}

int main()
{
    int i, j, last, one[K], two[K][K], a[N];
    mysrand(time(NULL));
    for(i = 0; i < N; i++)
    {
        a[i] = (myrand() / 2147483648.0) * K;
    }
    for(i = 0; i < K; i++) one[i] = 0;
    for(i = 0; i < N; i++)
    {
        one[a[i]]++;
    }
    for(i = 0; i < K; i++) for(j = 0; j < K; j++) two[i][j] = 0;
    last = a[0];
    for(i = 1; i < N; i++)
    {
        two[last][a[i]]++;
        last = a[i];
    }
    for(i = 0; i < K; i++) printf("%d\n",one[i]);
    for(i = 0; i < K; i++)
    {
        for(j = 0; j < K; j++)
        {
            printf(" %d",two[i][j]);
        }
        printf("\n");
    }
    return 0;
}

Output:

12628
12373
12597
12402
 3171 3198 3192 3067
 3079 3067 3055 3172
 3191 3119 3213 3074
 3187 2989 3137 3088

Of course one can also use an algorithm without this flaw:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 50000
#define K 4

int main()
{
    int i, j, last, one[K], two[K][K], a[N];
    srand(time(NULL));
    for(i = 0;i < N; i++)
    {
        a[i] = rand() % K;
    }
    for(i = 0; i < K; i++) one[i] = 0;
    for(i = 0; i < N; i++)
    {
        one[a[i]]++;
    }
    for(i = 0; i < K; i++) for(j = 0; j < K; j++) two[i][j] = 0;
    last = a[0];
    for(i = 1; i < N; i++)
    {
        two[last][a[i]]++;
        last = a[i];
    }
    for(i = 0; i < K; i++) 
    {
        printf("%d\n",one[i]);
    }
    for(i = 0; i < K; i++)
    {
        for(j = 0; j < K; j++)
        {
            printf(" %d",two[i][j]);
        }
        printf("\n");
    }
    return 0;
}

Output:

12671
12293
12526
12510
 3233 3098 3119 3221
 3022 3061 3139 3070
 3270 3067 3105 3084
 3146 3066 3163 3135

But the first workaround is actually more safe unless one really understand the characteristics of the alternative algorithm being used.

Summary:

  1. Pick a good algorithm. The best choice is to pick an algorithm tested and approved by mathematicians. The second best choice is to use the RNG builtin into ones programming language or runtime. The worst choice is to try and invent an algorithm yourself.
  2. Initialize the algorithm properly - once with a value that change between calls.
  3. Avoid to ruin a good algorithm with bad transformations. Limit the transformations as much as possible.

Good random distribution versus unpredictability:

So far we have focused on a good random distribution of generated numbers. That is always important.

But it may not be sufficient for the RNG to be good in a specific context.

Sometimes the numbers must also be unpredictable.

To illustrate the difference let us look at some usage of RNG for Poker.

First let us try and find the probability of a flush (including straight flush and royal straight flush) - so 5 cards of same color.

Java code:

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class Poker1 {
    private static Random rng = new Random(System.currentTimeMillis());
    private List<Integer> deck;
    public Poker1() {
        deck = new ArrayList<Integer>();
        for(int i = 0; i < 52; i++) {
            deck.add(i);
        }
    }
    public int takeCard() {
        return deck.remove(rng.nextInt(deck.size()));
    }
    public int[] takeHand() {
        int[] res = new int[5];
        for(int i = 0; i < 5; i++) {
            res[i] = takeCard();
        }
        return res;
    }
    public static int getColour(int card) {
        return (card / 13);
    }
    public static int getValue(int card) {
        return (card % 13);
    }
    public static boolean isFlush(int[] hand) {
        if((getColour(hand[0]) == getColour(hand[1])) &&
           (getColour(hand[1]) == getColour(hand[2])) &&
           (getColour(hand[2]) == getColour(hand[3])) &&
           (getColour(hand[3]) == getColour(hand[4]))) {
            return true;
        } else {
            return false;
        }
    }
    private static final int REP = 1000000;
    private static final int PLAYERS = 4;
    public static void main(String[] args) {
        int nflush = 0;
        int[] hand = new int[5];
        for(int i = 0; i < REP; i++) {
            Poker1 game = new Poker1();
            for(int j = 0; j < PLAYERS; j++) {
                if(Poker1.isFlush(game.takeHand())) {
                    nflush++;
                }
            }
        }
        System.out.println(PLAYERS * REP / (double)nflush);
    }
}

The program usually print a number in the 500 to 510 range, which is fine as the true probability is 1 out of 505.

The distribution of the generated numbers are fine.

So are we ready to create a poker game?

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class Poker2 {
    private static Random rng = new Random(System.currentTimeMillis());
    private List<Integer> deck;
    public Poker2() {
        deck = new ArrayList<Integer>();
        for(int i = 0; i < 52; i++) {
            deck.add(i);
        }
    }
    public int takeCard() {
        return deck.remove(rng.nextInt(deck.size()));
    }
    public int[] takeHand() {
        int[] res = new int[5];
        for(int i = 0; i < 5; i++) {
            res[i] = takeCard();
        }
        return res;
    }
    public static int getColour(int card) {
        return (card / 13);
    }
    public static int getValue(int card) {
        return (card % 13);
    }
    private static final int PLAYERS = 4;
    public static void main(String[] args) {
        int[][] hand = new int[PLAYERS][5];
        Poker2 game = new Poker2();
        for(int i = 0; i < PLAYERS; i++) {
            hand[i] = game.takeHand();
            for(int j = 0; j < 5; j++) {
                System.out.print(" " + hand[i][j]);
            }
            System.out.println();
        }
    }
}

Example output:

 8 36 4 18 16
 50 35 29 27 39
 28 42 48 31 1
 24 9 5 43 2

Those hands will pass all statistical tests for randomness.

But check this program:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

public class Poker3 {
    private Random rng;
    private List<Integer> deck;
    public Poker3(long seed) {
        rng = new Random(seed);
        deck = new ArrayList<Integer>();
        for(int i = 0; i < 52; i++) {
            deck.add(i);
        }
    }
    public int takeCard() {
        return deck.remove(rng.nextInt(deck.size()));
    }
    public int[] takeHand() {
        int[] res = new int[5];
        for(int i = 0; i < 5; i++) {
            res[i] = takeCard();
        }
        return res;
    }
    public static int getColour(int card) {
        return (card / 13);
    }
    public static int getValue(int card) {
        return (card % 13);
    }
    public static boolean isSameHand(int[] h1, int[] h2) {
        int[] h1x = h1.clone();
        int[] h2x = h2.clone();
        Arrays.sort(h1x);
        Arrays.sort(h2x);
        if((h1x[0] == h2x[0]) &&
           (h1x[1] == h2x[1]) &&
           (h1x[2] == h2x[2]) &&
           (h1x[3] == h2x[3]) &&
           (h1x[4] == h2x[4])) {
            return true;
        } else {
            return false;
        }
    }
    private static final int PLAYERS = 4;
    public static void main(String[] args) {
        int[] myhand = { 8, 36, 4, 18, 16 };
        int[][] hand;
        long seed = System.currentTimeMillis();
        while(true) {
            hand = new int[PLAYERS][5];
            Poker3 game = new Poker3(seed);
            for(int i = 0; i < PLAYERS; i++) {
                hand[i] = game.takeHand();
            }
            if(isSameHand(myhand, hand[0])) {
                break;
            }
            seed--;
        }
        for(int i = 0; i < PLAYERS; i++) {
            for(int j = 0; j < 5; j++) {
                System.out.print(" " + hand[i][j]);
            }
            System.out.println();
        }
    }
}

Output:

 8 36 4 18 16
 50 35 29 27 39
 28 42 48 31 1
 24 9 5 43 2

Based on ones own hand and knowledge about the algorithm, then it only takes seconds to calculate the other hands.

Not smart!

The core of the problem is that even though the algorithm and the initialization is good enough to provide random numbers with good distribution, then the numbers are not unpredictable.

Good distribution and unpredictability are two very diffreent properties.

To generate unpredictable random numbers two things are needed:

  • The seed must be unpredictable
  • The number of possible values of the seed must be large enough to prevent brute force attack (trying all possible values)

There are several ways of getting an unpredictable seed.

Most large runtime libraries has socalled cryptographically secure random number generators that can generate unpredictable random numbers:

  • Java has java.security.SecureRandom.
  • .NET has System.Security.Cryptography.RNGCryptoServiceProvider.
  • PHP has random_bytes() [builtin PHP 7+, user implementations available PHP 5]
  • Python has the secrets module

On Linux it is possible to read from /dev/random and /dev/urandom.

Some theory:

By far the most common family of algorithms for RNG is LCG (Linear Congruential Generator):

xi = (a*xi-1 + b) mod c

There is also a specialization with b=0 called MLCG (Multiplicative Linear Congruential Generator):

xi = (a*xi-1) mod c

And a generalization with older input than i-1 called MRG (Multiple Recursive Generator):

xi = (a1*xi-1 + a2*xi-2 + ... + an*xi-n + b) mod c

When one neeed a random integer number 0 .. N-1, then one can just use xi if N=c.

But often one will want to use xi>>m to drop the m low bits as it is a known weakness in LCG's that low bits are less random than highbits. See example in previous section.

When one need a random decimal number 0.0 .. 1.0 where 0.0 is inclusive and 1.0 is exclusive, then one can use an integer random number and xi*1.0/N.

A key concept is cycle. All LCG's will start to repeat sequence of generated numbers after L numbers has been generated. L is called the cycle of the algorithm

The following applies:

  • A reasonable large cycle is necessary for a RNG to be good
  • Larger cycle does not guarantee better algorithm

An alternative way to look at the cycle concept is to imagine all possible numbers put in a circle. The algorithm determines the order of the numbers. The initializion seed determines where in the cycle one starts. But when started numbers are simply returned by moving one place forward in the circle.

It is obvious that for LCG the cycle can never be be larger than c.

It has been proven that if:

  • c is a power of 2
  • a mod 4 is 1
  • b is odd

then the cycle is c.

Some of the big names in RNG theory are Knuth, Marsaglia and L'Ecuyer.

Before we go to specific algorithms we need to look at a little cute rewrite that is often necesarry in LCG to avoid integer overflow.

Demo in C:

#include <stdio.h>

static long int seed1;

void mysrand1(long int ss)
{
    seed1 = ss;
    return;
}

long int myrand1()
{
    long long int tmp = seed1;
    tmp = (3125LL * tmp) % 67108864LL;
    seed1 = tmp;
    return seed1;
}

static long int seed2;

void mysrand2(long int ss)
{
    seed2 = ss;
    return;
}

long int myrand2()
{
    seed2 = (3125L * seed2) % 67108864L;
    return seed2;
}

static long int seed3;

void mysrand3(long int ss)
{
    seed3 = ss;
    return;
}

long int myrand3()
{
    long int help1 = 67108864L % 3125L;
    long int help2 = 67108864L / 3125L;
    long int tmp = 3125L * (seed3 % help2) - help1 * (seed3 / help2);
    if(tmp >= 0)
        seed3 = tmp;
    else
        seed3 = tmp + 67108864L;
    return seed3;
}

int main()
{
    int i;
    mysrand1(1234567);
    mysrand2(1234567);
    mysrand3(1234567);
    for(i = 0; i < 20; i++) {
        printf("%ld %ld %ld\n", myrand1(), myrand2(), myrand3());
    }
    return 0;
}

Output:

32816627 -34292237 32816627
9615183 9615183 9615183
49784667 -17324197 49784667
18737623 18737623 18737623
36142467 -30966397 36142467
991263 991263 991263
10689131 -56419733 10689131
50428967 -16679897 50428967
18909203 -48199661 18909203
35459055 -31649809 35459055
12812411 -54296453 12812411
41901431 41901431 41901431
12578211 12578211 12578211
48223935 48223935 48223935
40397195 40397195 40397195
9461191 9461191 9461191
38321715 -28787149 38321715
33145999 33145999 33145999
32269723 32269723 32269723
45370647 45370647 45370647

We see that the rewrite solves the integer overflow issue similar to using a longer integer.

Some known good algorithms:

Let us see 4 of the known good algorithms in object oriented Java and C#:

RNG.java

public abstract class RNG {
    public abstract int getInt();
    public abstract int getMaxInt();
    public double getDouble() {
        return (getInt() / (double)getMaxInt());
    }
}

LCG.java

public class LCG extends RNG {
    private long a;
    private long b;
    private long c;
    private long seed;
    public LCG(int a, int b, int c, int seed) {
        this.a = a;
        this.b = b;
        this.c = c;
        this.seed = seed;
    }
    public int getInt() {
        seed = (a * seed + b) % c;
        return (int)seed;
    }
    public int getMaxInt() {
        return (int)c;
    }
}

MRG.java

public class MRG extends RNG {
    private long[] a;
    private long b;
    private long c;
    private long[] seed;
    public MRG(int[] a, int b, int c, int[] seed) {
        this.a = new long[a.length];
        for(int i = 0; i < a.length; i++) this.a[i] = a[i];
        this.b = b;
        this.c = c;
        this.seed = new long[seed.length];
        for(int i = 0; i < seed.length; i++) this.seed[i] = seed[i];
    }
    public int getInt() {
        long tmp = 0;
        for(int i = 0; i < a.length; i++) {
            tmp = tmp + a[i] * seed[i];
        }
        tmp = tmp + b;
        tmp = tmp % c;
        for(int i = 1; i < seed.length; i++) {
            seed[i] = seed[i-1];
        }
        seed[0] = tmp;
        return (int)seed[0];
    }
    public int getMaxInt() {
        return (int)c;
    }
}

MLCG.java

public class MLCG extends LCG {
    public MLCG(int a, int c, int seed) {
        super(a, 0, c, seed);
    }
}

MinStd.java

// Minimal Standard algorithm
//
// Original proposed by Lewis, Goodman & Miller 1969, but popularized
// by Park & Miller CACM Oct 1988.
//
// Name indicate that one should never choose an algorithm worse than this.
//
// cycle = 2 147 483 647 = 2^31-1

public class MinStd extends MLCG {
    public MinStd(int seed) {
        super(16807, 2147483647, seed);
    }
}

LEcuyer.java

// L'Ecuyer algorithm
//
// Ecuyer & Blouin & Coutre ACM Simulations 1993.
//
// cycle = approx. 1E46

public class LEcuyer extends MRG {
    public LEcuyer(int[] seed) {
        super(new int[] { 107374182, 0, 0, 0, 104480 }, 0, 2147483647, seed);
    }
}

Combiner.java

public interface Combiner {
    public int combine(int[] num);
    public int getMaxInt();
}

Combined.java

public class Combined extends RNG {
    private RNG[] g;
    private Combiner cmbn;
    public Combined(RNG[] g, Combiner cmbn) {
        this.g = g;
        this.cmbn = cmbn;
    }
    public int getInt() {
        int[] num = new int[g.length];
        for(int i = 0; i < num.length; i++) {
            num[i] = g[i].getInt();
        }
        return cmbn.combine(num);
    }
    public int getMaxInt() {
        return cmbn.getMaxInt();
    }
}

Uniform32.java

// Uniform32 algorithm
//
// L'Ecuyer CACM Jun 1988.
//
// cycle = approx. 2.31E18

public class Uniform32 extends Combined {
    private static class Uniform32Combiner implements Combiner {
        public int combine(int[] num) {
            long tmp = num[0] - num[1];
            tmp = (tmp + 2147483562) % 2147483562;
            return (int)tmp;
        }
        public int getMaxInt() {
            return 2147483562;
        }
    }
    public Uniform32(int seed1, int seed2) {
        super(new RNG[] { new MLCG(40014, 2147483563, seed1), new MLCG(40692, 2147483399, seed2) },
              new Uniform32Combiner());
    }
}

CMRG.java

// CMRG algorithm
//
// L'Ecuyer Operations Research 1996.
//
// cycle = 2^205 = approc. 1E61

public class CMRG extends Combined {
    private static class CMRGCombiner implements Combiner {
        public int combine(int[] num) {
            long tmp = num[0] - num[1];
            tmp = (tmp + 2147483647) % 2147483647;
            return (int)tmp;
        }
        public int getMaxInt() {
            return 2147483647;
        }
    }
    public CMRG(int[] seed1, int[] seed2) {
        super(new RNG[] { new MRG(new int[] { 0, 63308, -183326 }, 0, 2147483647, seed1),
                          new MRG(new int[] { 86098, 0, -539608 }, 0, 2145483479, seed2) },
              new CMRGCombiner());
    }
}

Test.java

public class Test {
    private static void test(RNG rng) {
        for(int i = 0; i < 10; i++) {
            System.out.println(rng.getInt());
        }
        for(int i = 0; i < 10; i++) {
            System.out.println(rng.getDouble());
        }
    }
    private static void testMinStd() {
        test(new MinStd(1234567));
    }
    private static void testUniform32() {
        test(new Uniform32(1234567, 1234567));
    }
    private static void testLEcuyer() {
        test(new LEcuyer(new int[] { 1234567, 1234567, 1234567, 1234567, 1234567 }));
    }
    private static void testCMRG() {
        test(new CMRG(new int[] { 1234567, 1234567, 1234567 }, new int[] { 1234567, 1234567, 1234567 }));
    }
    public static void main(String[] args) {
        testMinStd();
        testUniform32();
        testLEcuyer();
        testCMRG();
    }
}

RNG.cs

public abstract class RNG
{
    public abstract int GetInt();
    public abstract int MaxInt
    {
        get;
    }
    public double GetDouble()
    {
        return (GetInt() / (double)MaxInt);
    }
}

public class LCG : RNG
{
    private long a;
    private long b;
    private long c;
    private long seed;
    public LCG(int a, int b, int c, int seed)
    {
        this.a = a;
        this.b = b;
        this.c = c;
        this.seed = seed;
    }
    public override int GetInt()
    {
        seed = (a * seed + b) % c;
        return (int)seed;
    }
    public override int MaxInt
    {
        get
        {
            return (int)c;
        }
    }
}

public class MRG : RNG
{
    private long[] a;
    private long b;
    private long c;
    private long[] seed;
    public MRG(int[] a, int b, int c, int[] seed)
    {
        this.a = new long[a.Length];
        for(int i = 0; i < a.Length; i++) this.a[i] = a[i];
        this.b = b;
        this.c = c;
        this.seed = new long[seed.Length];
        for(int i = 0; i < seed.Length; i++) this.seed[i] = seed[i];
    }
    public override int GetInt()
    {
        long tmp = 0;
        for(int i = 0; i < a.Length; i++)
        {
            tmp = tmp + a[i] * seed[i];
        }
        tmp = tmp + b;
        tmp = tmp % c;
        for(int i = 1; i < seed.Length; i++)
        {
            seed[i] = seed[i-1];
        }
        seed[0] = tmp;
        return (int)seed[0];
    }
    public override int MaxInt
    {
        get
        {
            return (int)c;
        }
    }
}

public class MLCG : LCG
{
    public MLCG(int a, int c, int seed) : base(a, 0, c, seed)
    {
    }
}

// Minimal Standard algorithm
//
// Original proposed by Lewis, Goodman & Miller 1969, but popularized
// by Park & Miller CACM Oct 1988.
//
// Name indicate that one should never choose an algorithm worse than this.
//
// cycle = 2 147 483 647 = 2^31-1

public class MinStd : MLCG
{
    public MinStd(int seed) : base(16807, 2147483647, seed)
    {
    }
}

// L'Ecuyer algorithm
//
// Ecuyer & Blouin & Coutre ACM Simulations 1993.
//
// cycle = approx. 1E46

public class LEcuyer : MRG
{
    public LEcuyer(int[] seed) : base(new int[] { 107374182, 0, 0, 0, 104480 }, 0, 2147483647, seed)
    {
    }
}

public interface Combiner
{
    int Combine(int[] num);
    int MaxInt
    {
        get;
    }
}

public class Combined : RNG
{
    private RNG[] g;
    private Combiner cmbn;
    public Combined(RNG[] g, Combiner cmbn)
    {
        this.g = g;
        this.cmbn = cmbn;
    }
    public override int GetInt()
    {
        int[] num = new int[g.Length];
        for(int i = 0; i < num.Length; i++)
        {
            num[i] = g[i].GetInt();
        }
        return cmbn.Combine(num);
    }
    public override int MaxInt
    {
        get
        {
            return cmbn.MaxInt;
        }
    }
}

// Uniform32 algorithm
//
// L'Ecuyer CACM Jun 1988.
//
// cycle = approx. 2.31E18

internal class Uniform32Combiner : Combiner
{
    public int Combine(int[] num)
    {
        long tmp = num[0] - num[1];
        tmp = (tmp + 2147483562) % 2147483562;
        return (int)tmp;
    }
    public int MaxInt
    {
        get
        {
            return 2147483562;
        }
    }
}

public class Uniform32 : Combined
{
    public Uniform32(int seed1, int seed2) : base(new RNG[] { new MLCG(40014, 2147483563, seed1),
                                                              new MLCG(40692, 2147483399, seed2) },
                                                  new Uniform32Combiner())
    {
    }
}

// CMRG algorithm
//
// L'Ecuyer Operations Research 1996.
//
// cycle = 2^205 = approx. 1E61

internal class CMRGCombiner : Combiner
{
    public int Combine(int[] num)
    {
        long tmp = num[0] - num[1];
        tmp = (tmp + 2147483647) % 2147483647;
        return (int)tmp;
    }
    public int MaxInt
    {
        get
        {
            return 2147483647;
        }
    }
}

public class CMRG : Combined {
    public CMRG(int[] seed1, int[] seed2) : base(new RNG[] { new MRG(new int[] { 0, 63308, -183326 }, 0, 2147483647, seed1),
                                                             new MRG(new int[] { 86098, 0, -539608 }, 0, 2145483479, seed2) },
                                                 new CMRGCombiner())
    {
    }
}

Test.cs

using System;

public class Program
{
    private static void Test(RNG rng)
    {
        for(int i = 0; i < 10; i++)
        {
            Console.WriteLine(rng.GetInt());
        }
        for(int i = 0; i < 10; i++)
        {
            Console.WriteLine(rng.GetDouble());
        }
    }
    private static void TestMinStd()
    {
        Test(new MinStd(1234567));
    }
    private static void TestUniform32()
    {
        Test(new Uniform32(1234567, 1234567));
    }
    private static void TestLEcuyer()
    {
        Test(new LEcuyer(new int[] { 1234567, 1234567, 1234567, 1234567, 1234567 }));
    }
    private static void TestCMRG()
    {
        Test(new CMRG(new int[] { 1234567, 1234567, 1234567 }, new int[] { 1234567, 1234567, 1234567 }));
    }
    public static void Main(string[] args)
    {
        TestMinStd();
        TestUniform32();
        TestLEcuyer();
        TestCMRG();
    }
}

Non-uniform distribution:

Everything so far has been about uniform distributed random numbers. But sometimes it is necessary to generate numbers with another distribution.

Luckily other distributions can usually be derived from uniform distribution.

For normal distributed random nunmbers one can use the socalled Box Mueller transformation:

u1 og u2 are unform distributed 0.0 .. 1.0
h1 = sqrt(-2*log(u1)); h2 = 2*PI*u2
n1 = e + s*h1*sin(h2) n2 = e + s*h1*cos(h2)
n1 and n2 are normal distributed N(e,s).

Let us see implementations in Java and C#.

import java.util.Random;

public class Normal
{
    public static final int N = 100000;
    public static void test(double e, double s) {
        Random rng = new Random();
        int[] one1 = new int[4];
        int[] one2 = new int[4];
        for(int i = 0; i < N; i++) {
            double u1 = rng.nextDouble();
            double u2 = rng.nextDouble();
            double h1 = Math.sqrt(-2 * Math.log(u1));
            double h2 = 2 * Math.PI * u2;
            double n1 = e + s * h1 * Math.sin(h2);
            double n2 = e + s * h1 * Math.cos(h2);
            if(n1 < e - 1.96 * s) {
                one1[0]++;
            } else if(n1 < e) {
                one1[1]++;
            } else if(n1 < e + 1.96 * s) {
                one1[2]++;
            } else {
                one1[3]++;
            }
            if(n2 < e - 1.96 * s) {
                one2[0]++;
            } else if(n2 < e) {
                one2[1]++;
            } else if(n2 < e + 1.96 * s) {
                one2[2]++;
            } else {
                one2[3]++;
            }
        }
        for(int i = 0; i < 4; i++)
        {
            System.out.println(one1[i] + " " + one2[i]);
        }
    }
    public static void main(String[] args) {
        test(0, 1);
        test(7, 3);
    }
}
using System;

public class Normal
{
    public const int N = 100000;
    public static void Test(double e, double s) 
    {
        Random rng = new Random();
        int[] one1 = new int[4];
        int[] one2 = new int[4];
        for(int i = 0; i < N; i++)
        {
            double u1 = rng.NextDouble();
            double u2 = rng.NextDouble();
            double h1 = Math.Sqrt(-2 * Math.Log(u1));
            double h2 = 2 * Math.PI * u2;
            double n1 = e + s * h1 * Math.Sin(h2);
            double n2 = e + s * h1 * Math.Cos(h2);
            if(n1 < e - 1.96 * s) 
            {
                one1[0]++;
            }
            else if(n1 < e)
            {
                one1[1]++;
            }
            else if(n1 < e + 1.96 * s) 
            {
                one1[2]++;
            }
            else
            {
                one1[3]++;
            }
            if(n2 < e - 1.96 * s) 
            {
                one2[0]++;
            }
            else if(n2 < e)
            {
                one2[1]++;
            }
            else if(n2 < e + 1.96 * s) 
            {
                one2[2]++;
            }
            else
            {
                one2[3]++;
            }
        }
        for(int i = 0; i < 4; i++)
        {
            Console.WriteLine(one1[i] + " " + one2[i]);
        }
    }
    public static void Main(string[] args) {
        Test(0, 1);
        Test(7, 3);
    }
}

Output:

2544 2471
47544 47787
47360 47319
2552 2423
2474 2532
47367 47736
47694 47248
2465 2484

which looks fine (the expected distribution is 2.5%, 47.5%, 47.5% and 2.5%.

What to expect from generated random numbers?

One can expect a nice distribution but not a perfect distribution.

Let us say that we generate 10000 random numbers and group in groups that should have same size (either pick 1/4 of range for each group or pick low 2 bit for group).

The expected distribution for a good generator is:

2500
2500 
2500
2500

But it is not likely to get exactly that distribution.

The probability can be calculated via formula for multinominal distribution:

P(2500,2500,2500,2500) = 10000!/(2500!*2500!*2500!*2500!)*0.252500*0.252500*0.252500*0.252500

which according to my hand calculator is approx. 1*10-6, so one will only get the expected distribution one out of a million tries.

If one only look at the distribution for one group then it is binominal distributed and for large N we can approximate with a normal distribution:

BIN(0.25,2500,10000) = N(0.25*10000,10000*0.25*(1-0.25)) = N(2500,1875)

which gives a standard deviation:

s = SQRT(Var) = SQRT(1875) = 43.3

and therefore a distribution where there is 95% probability that the value is in the range:

E-1.96*s ... E+1.96*s = 2415 ... 2585

PHP:

PHP comes standard with 2 RNG's:

  • rand()
  • mt_rand()

rand() uses the rand() function from the C runtime library on the system (so whether it is a good or bad algorithm depends on the operating system and/or the C compiler).

mt_rand() uses the socalled Mersenne Twister algorithm that:

  • is very fast
  • has a very long cycle (219937-1)
  • has a good distribution

So which to use?

PHP docs are clear:

Many random number generators of older libcs have dubious or unknown characteristics and are slow. By default, PHP uses the libc random number generator with the rand() function. The mt_rand() function is a drop-in replacement for this. It uses a random number generator with known characteristics using the Mersenne Twister, which will produce random numbers four times faster than what the average libc rand() provides.

In my opinion it is the unknown characteristics and not the slow speed that is the main reason to avoid rand() and use mt_rand() instead.

Test of rand() on Windows XP with PHP 5.3.0:

<?php
srand(1233567);
$high1 = array(0, 0, 0, 0);
$high2 = array(array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0));
$low1 = array(0, 0, 0, 0);
$low2 = array(array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0));
$prev = rand();
for($i = 0; $i < 10000; $i++) {
    $curr = rand();
    $high1[$curr * 4 / getrandmax()]++;
    $high2[$curr * 4 / getrandmax()][$prev* 4 / getrandmax()]++;
    $low1[$curr % 4]++;
    $low2[$curr % 4][$prev % 4]++;
    $prev = $curr;
}
echo "High bits:\n";
echo "<table border>\n";
for($row = 0; $row < 4; $row++) {
    echo "<tr>\n";
    echo "<td>" . $high1[$row] . "</td>\n";
    echo "</tr>\n";
}
echo "</table>\n";
echo "<table border>\n";
for($row = 0; $row < 4; $row++) {
    echo "<tr>\n";
    for($col = 0; $col < 4; $col++) {
        echo "<td>" . $high2[$row][$col] . "</td>\n";
    }
    echo "</tr>\n";
}
echo "</table>\n";
echo "Low bits:\n";
echo "<table border>\n";
for($row = 0; $row < 4; $row++) {
    echo "<tr>\n";
    echo "<td>" . $low1[$row] . "</td>\n";
    echo "</tr>\n";
}
echo "</table>\n";
echo "<table border>\n";
for($row = 0; $row < 4; $row++) {
    echo "<tr>\n";
    for($col = 0; $col < 4; $col++) {
        echo "<td>" . $low2[$row][$col] . "</td>\n";
    }
    echo "</tr>\n";
}
echo "</table>\n";
?>

Result:

High bits:
2438 
2480 
2577 
2504 
607 605 612 614 
605 608 645 622 
627 641 659 650 
599 625 662 617 
Low bits:
2500 
2500 
2500 
2500 
0 0 0 2500 
2500 0 0 0 
0 2500 0 0 
0 0 2500 0 

Not good in low bits!

Test of mt_rand() on Windows XP with PHP 5.3.0:

<?php
mt_srand(1233567);
$high1 = array(0, 0, 0, 0);
$high2 = array(array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0));
$low1 = array(0, 0, 0, 0);
$low2 = array(array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0));
$prev = mt_rand();
for($i = 0; $i < 10000; $i++) {
    $curr = mt_rand();
    $high1[$curr * 4 / mt_getrandmax()]++;
    $high2[$curr * 4 / mt_getrandmax()][$prev* 4 / mt_getrandmax()]++;
    $low1[$curr % 4]++;
    $low2[$curr % 4][$prev % 4]++;
    $prev = $curr;
}
echo "High bits:\n";
echo "<table border>\n";
for($row = 0; $row < 4; $row++) {
    echo "<tr>\n";
    echo "<td>" . $high1[$row] . "</td>\n";
    echo "</tr>\n";
}
echo "</table>\n";
echo "<table border>\n";
for($row = 0; $row < 4; $row++) {
    echo "<tr>\n";
    for($col = 0; $col < 4; $col++) {
        echo "<td>" . $high2[$row][$col] . "</td>\n";
    }
    echo "</tr>\n";
}
echo "</table>\n";
echo "Low bits:\n";
echo "<table border>\n";
for($row = 0; $row < 4; $row++) {
    echo "<tr>\n";
    echo "<td>" . $low1[$row] . "</td>\n";
    echo "</tr>\n";
}
echo "</table>\n";
echo "<table border>\n";
for($row = 0; $row < 4; $row++) {
    echo "<tr>\n";
    for($col = 0; $col < 4; $col++) {
        echo "<td>" . $low2[$row][$col] . "</td>\n";
    }
    echo "</tr>\n";
}
echo "</table>\n";
?>

Result:

High bits:
2467 
2522 
2524 
2487 
594 603 648 622 
641 647 617 617 
610 646 619 649 
623 625 640 599 
Low bits:
2503 
2515 
2480 
2502 
634 630 622 617 
634 627 638 616 
629 616 609 626 
606 641 612 643 

Much better.

Conclusion always use mt_rand() and never rand().

ASP:

VBS RND() function returns a floating point value 0.0 .. 1.0 (0.0 inclusive and 1.0 exclusive) and not an integer like most other base RNG's.

Often one need a random integer so a transformation is needed.

And then it is critical to know that RND() uses a 24 bit LCG.

This means that:

  • scaling higher than 24 bit is very bad
  • scaling high can result in badly distributed low bits

12 bits:

<%
Randomize(1234567)
high1 = array(0, 0, 0, 0)
high2 = array(array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0))
low1 = array(0, 0, 0, 0)
low2 = array(array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0))
prev = Fix(Rnd() * 4096)
For i = 1 To 10000
   curr = Fix(Rnd() * 4096)
   high1(curr * 4 \ 4096) = high1(curr * 4 \ 4096) + 1 
   high2(curr * 4 \ 4096)(prev * 4 \ 4096) = high2(curr * 4 \ 4096)(prev * 4 \ 4096) + 1
   low1(curr Mod 4) = low1(curr Mod 4) + 1 
   low2(curr Mod 4)(prev Mod 4) = low2(curr Mod 4)(prev Mod 4) + 1
   prev = curr
Next
Response.Write "High bits:" & vbCrLf
Response.Write "<table border>" & vbCrLf
For row = 0 To 3
    Response.Write "<tr>" & vbCrLf
    Response.Write "<td>" & high1(row) & "</td>" & vbCrLf
    Response.Write "</tr>" & vbCrLf
Next
Response.Write "</table>" & vbCrLf
Response.Write "<table border>" & vbCrLf
For row = 0 To 3
    Response.Write "<tr>" & vbCrLf
    For col = 0 To 3
        Response.Write "<td>" & high2(row)(col) & "</td>" & vbCrLf
    Next
    Response.Write "</tr>" & vbCrLf
Next
Response.Write "</table>" & vbCrLf
Response.Write "Low bits:" & vbCrLf
Response.Write "<table border>" & vbCrLf
For row = 0 To 3
    Response.Write "<tr>" & vbCrLf
    Response.Write "<td>" & low1(row) & "</td>" & vbCrLf
    Response.Write "</tr>" & vbCrLf
Next
Response.Write "</table>" & vbCrLf
Response.Write "<table border>" & vbCrLf
For row = 0 To 3
    Response.Write "<tr>" & vbCrLf
    For col = 0 To 3
        Response.Write "<td>" & low2(row)(col) & "</td>" & vbCrLf
    Next
    Response.Write "</tr>" & vbCrLf
Next
Response.Write "</table>" & vbCrLf
%>

Result:

High bits:
2528 
2474 
2509 
2489 
631 629 633 635 
619 617 630 608 
643 593 633 640 
635 635 613 606 
Low bits:
2525 
2512 
2450 
2513 
648 628 616 633 
641 621 620 630 
613 621 600 616 
623 641 614 635 

which looks fine.

16 bit:

<%
Randomize(1234567)
high1 = array(0, 0, 0, 0)
high2 = array(array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0))
low1 = array(0, 0, 0, 0)
low2 = array(array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0))
prev = Fix(Rnd() * 65536)
For i = 1 To 10000
   curr = Fix(Rnd() * 65536)
   high1(curr * 4 \ 65536) = high1(curr * 4 \ 65536) + 1 
   high2(curr * 4 \ 65536)(prev * 4 \ 65536) = high2(curr * 4 \ 65536)(prev * 4 \ 65536) + 1
   low1(curr Mod 4) = low1(curr Mod 4) + 1 
   low2(curr Mod 4)(prev Mod 4) = low2(curr Mod 4)(prev Mod 4) + 1
   prev = curr
Next
Response.Write "High bits:" & vbCrLf
Response.Write "<table border>" & vbCrLf
For row = 0 To 3
    Response.Write "<tr>" & vbCrLf
    Response.Write "<td>" & high1(row) & "</td>" & vbCrLf
    Response.Write "</tr>" & vbCrLf
Next
Response.Write "</table>" & vbCrLf
Response.Write "<table border>" & vbCrLf
For row = 0 To 3
    Response.Write "<tr>" & vbCrLf
    For col = 0 To 3
        Response.Write "<td>" & high2(row)(col) & "</td>" & vbCrLf
    Next
    Response.Write "</tr>" & vbCrLf
Next
Response.Write "</table>" & vbCrLf
Response.Write "Low bits:" & vbCrLf
Response.Write "<table border>" & vbCrLf
For row = 0 To 3
    Response.Write "<tr>" & vbCrLf
    Response.Write "<td>" & low1(row) & "</td>" & vbCrLf
    Response.Write "</tr>" & vbCrLf
Next
Response.Write "</table>" & vbCrLf
Response.Write "<table border>" & vbCrLf
For row = 0 To 3
    Response.Write "<tr>" & vbCrLf
    For col = 0 To 3
        Response.Write "<td>" & low2(row)(col) & "</td>" & vbCrLf
    Next
    Response.Write "</tr>" & vbCrLf
Next
Response.Write "</table>" & vbCrLf
%>

Result:

High bits:
2528 
2474 
2509 
2489 
631 629 633 635 
619 617 630 608 
643 593 633 640 
635 635 613 606 
Low bits:
2509 
2485 
2506 
2500 
837 194 644 834 
828 822 194 641 
649 826 834 197 
195 643 834 828 

and we see some deviations in low bits.

24 bit:

<%
Randomize(1234567)
high1 = array(0, 0, 0, 0)
high2 = array(array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0))
low1 = array(0, 0, 0, 0)
low2 = array(array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0), array(0, 0, 0, 0))
prev = Fix(Rnd() * 16777216)
For i = 1 To 10000
   curr = Fix(Rnd() * 16777216)
   high1(curr * 4 \ 16777216) = high1(curr * 4 \ 16777216) + 1 
   high2(curr * 4 \ 16777216)(prev * 4 \ 16777216) = high2(curr * 4 \ 16777216)(prev * 4 \ 16777216) + 1
   low1(curr Mod 4) = low1(curr Mod 4) + 1 
   low2(curr Mod 4)(prev Mod 4) = low2(curr Mod 4)(prev Mod 4) + 1
   prev = curr
Next
Response.Write "High bits:" & vbCrLf
Response.Write "<table border>" & vbCrLf
For row = 0 To 3
    Response.Write "<tr>" & vbCrLf
    Response.Write "<td>" & high1(row) & "</td>" & vbCrLf
    Response.Write "</tr>" & vbCrLf
Next
Response.Write "</table>" & vbCrLf
Response.Write "<table border>" & vbCrLf
For row = 0 To 3
    Response.Write "<tr>" & vbCrLf
    For col = 0 To 3
        Response.Write "<td>" & high2(row)(col) & "</td>" & vbCrLf
    Next
    Response.Write "</tr>" & vbCrLf
Next
Response.Write "</table>" & vbCrLf
Response.Write "Low bits:" & vbCrLf
Response.Write "<table border>" & vbCrLf
For row = 0 To 3
    Response.Write "<tr>" & vbCrLf
    Response.Write "<td>" & low1(row) & "</td>" & vbCrLf
    Response.Write "</tr>" & vbCrLf
Next
Response.Write "</table>" & vbCrLf
Response.Write "<table border>" & vbCrLf
For row = 0 To 3
    Response.Write "<tr>" & vbCrLf
    For col = 0 To 3
        Response.Write "<td>" & low2(row)(col) & "</td>" & vbCrLf
    Next
    Response.Write "</tr>" & vbCrLf
Next
Response.Write "</table>" & vbCrLf
%>

Result:

High bits:
2528 
2474 
2509 
2489 
631 629 633 635 
619 617 630 608 
643 593 633 640 
635 635 613 606 
Low bits:
2500 
2500 
2500 
2500 
0 2500 0 0 
0 0 2500 0 
0 0 0 2500 
2500 0 0 0 

a typical LCG very bad low bits result.

Conclusion: VBS RND() is acceptable to scale to integer values 1-12 bit, but more bits require carefulness due to bad distribution of low bits. And never go above 24 bits.

ASP.NET

ASP.NET uses .NET RNG's that are described in previous section and have good characteristica.

Java EE

Java EE uses Java RNG's that are described in previous section and have good characteristica.

Article history:

Version Date Description
1.0 April 18th 2005, May 23rd 2005 and April 15th 2006 Initial version (in Danish) published on Eksperten.dk in 3 separate articles
1.2 February 12th 2010 Minor changes
2.0 August 1st 2016 Translation to English and complete reformatting and publishing here
2.1 October 8th 2016 Add content overview
2.2 March 7th 2021 Add more examples and update some text

Other articles:

See list of all articles here

Comments:

Please send comments to Arne Vajhøj