There are several informal definitions of randomness, usually based either on a lack of discernible patterns or on unpredictability. Although the output of 'practical' random number generators is widely used, such generators are nearly always more accurately termed 'pseudo-random number generators'. So far, _all_ computer-based PRNGs, i.e., those which execute some algorithm to produce their output, are not actually random at all. They may appear to lack a discernible pattern, they may pass assorted statistical tests probing for non-randomness (see Knuth, The Art of Computer Programming, Vol. 2, for details of many such tests), they may have very long repeat cycles in their output, and they may evoke a pleasantly warm and fuzzy feeling in their users, but they always have a pattern: the pattern given by the algorithm that generates them and by its starting 'state'. Nor are they unpredictable. Given the generator's original state and its algorithm, their output is totally predictable, and given even partial knowledge of that state, they are insecure, if not entirely predictable.
As John von Neumann put it, "Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin".
This article is about attempts to generate and use "random numbers" derived from apparently random physical phenomena. It is a fundamentally difficult problem, and in practice questions about randomness generally degenerate into 'can I use this not-really-random sequence in my application without getting too badly burned?'. That is, in truth, an even harder question.
'Unpredictable' random numbers were first investigated in the context of gambling, and many randomizing devices such as dice, shuffling playing cards, and roulette wheels, were first developed for use in gambling.
'Random' numbers are also used for serious purposes such as draft lotteries, where "fairness" is approximated by randomization, and in research where some modeling and statistical methods require them. They have other uses in physics (such as noise resonance studies), engineering, and operations research.
More recently developed uses of unpredictable random numbers include their uses in cryptography.
Uses of physically generated "random" numbers
The major industrial use of real random numbers is to make unpredictable data, including keys, for cryptography. Merely statistically random numbers are not sufficient, because the choice of a key (or other cryptographically required random value) must be maximally difficult for an attacker to predict. Accordingly, any published random sequence is a poor choice, as are such sequences as the digits of an irrational number such as the golden ratio φ, or even of transcendental numbers such as π or e. Since most keys are a few thousand bits long at most, slow random number generators serve well -- if they are actually random. This use of random generators is important; many informed observers believe every computer should have a way to generate true random numbers.
True random numbers are absolutely required for the only provably unbreakable encryption algorithm -- the one-time pad. Furthermore, those random sequences cannot be reused and must never become available to any attacker, which implies a continuously operable generator. See Venona for an example of what happens when these requirements are violated while using a one-time pad.
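The one-time pad itself is only a byte-by-byte XOR against the pad. A minimal sketch, with os.urandom standing in for a true-random pad source and the function names being this example's own:

```python
import os

def otp_encrypt(plaintext: bytes, pad: bytes) -> bytes:
    """XOR each plaintext byte with one pad byte. For real security the
    pad must be truly random, at least as long as the message, kept
    secret, and never reused."""
    if len(pad) < len(plaintext):
        raise ValueError("pad too short")
    return bytes(p ^ k for p, k in zip(plaintext, pad))

# Decryption is the identical XOR with the same pad.
otp_decrypt = otp_encrypt

message = b"ATTACK AT DAWN"
pad = os.urandom(len(message))      # stand-in for a true-random pad
ciphertext = otp_encrypt(message, pad)
assert otp_decrypt(ciphertext, pad) == message
```

Reusing the pad for a second message is exactly the failure the Venona example illustrates: XORing the two ciphertexts cancels the pad entirely.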
Cheap random number sequences (if any can be made available) are handy for use in permanently erasing files, and similar technical tasks.
It is important to understand (and it bears repeating) that, in cryptography, random bit streams must be not only unpredictable but also secret. Public or third-party sources of random values, and random values computed from publicly observable phenomena, are almost never cryptographically acceptable. They are often used anyway, but they permit attacks that should never be possible.
One early way of producing random numbers was a variation of the machines used to play keno or select lottery numbers: numbered ping-pong balls were mixed with blown air, and some mechanism withdrew balls from the mixing chamber. This method gives reasonable results, but the random numbers it generates are very expensive; it is inherently slow and unusable in most mechanized situations.
[early random number tables- to be written]
An important 20th-century work was the RAND Corporation million random digit table. It was produced in the 1950s by an electronic simulation of a roulette wheel attached to a computer, the results of which were then carefully filtered and tested before being used to generate the table. The RAND table was a significant breakthrough in delivering random numbers, because so large and carefully prepared a table had never before been available. It was useful for simulations and modeling, but, having been published, it was not useful for any credible cryptographic purpose.
Many modern random number generators attempt to use some form of quantum-mechanical noise, widely considered to be the gold standard for randomness (but not yet proved to be so).
In some simple designs, the resulting logic values are converted to an RS-232 level signal and sent directly to a computer's serial port. Software then sees this series of logic values as bursts of "line noise" characters on an I/O port.
More sophisticated systems may format the bit values before passing them to a computer. Because of problems with bias (see below), ordinary analog-to-digital converters are rarely useful.
The bit-stream from such systems is prone to be biased, with 1s or 0s predominating. One method of correction feeds the generated bit stream, filtered by a low-pass filter, back to adjust the bias of the generator. By the central limit theorem, the feedback loop will tend to be well-adjusted almost all the time. Ultra-high-speed random number generators use this method. Even then, the numbers generated are usually somewhat biased.
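The feedback idea can be sketched in simulation. Everything below -- the Gaussian noise model, the gain and filter constants -- is illustrative only, not taken from any particular hardware design:

```python
import random

def biased_comparator_stream(n_bits: int) -> list[int]:
    """Simulate a noisy source whose mean (0.3) sits away from the
    comparator threshold (initially 0.5), low-pass filter the output
    bits, and feed the average back to nudge the threshold until 1s
    and 0s balance. All constants are illustrative."""
    rng = random.Random(1234)       # simulated analog noise only
    threshold = 0.5
    avg = 0.5                       # low-pass (EMA) of recent output bits
    bits = []
    for _ in range(n_bits):
        sample = rng.gauss(0.3, 0.2)
        bit = 1 if sample > threshold else 0
        bits.append(bit)
        avg = 0.95 * avg + 0.05 * bit      # the low-pass filter
        threshold += 0.002 * (avg - 0.5)   # feedback: recenter threshold
    return bits
```

After the loop settles, the proportion of 1s hovers near one half even though the underlying source is strongly biased, which is all this trick can promise: it fixes the mean, not correlations.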
A higher-quality device might use two diodes and subtract the signal components common to both, which removes interference from outside electric and magnetic fields. This is recommended for gambling devices, to prevent cheating by exploiting bias in the 'random bit' stream.
The Intel RNG (supplied in some of their board-level chipsets for PC-type computers), uses most of the above tricks and adds another. The non-common-mode noise from two diodes controls a voltage-controlled oscillator, which clocks bits from a rapid oscillator (the new trick), which then go through a von Neumann type decorrelation step.
A completely unrelated method uses two uncoupled oscillators, counting events in one against the time-base of the other. This method has been used on PCs to build pure-software 'true-random' number generators. It requires a PC with two clock crystals, one for the real-time clock and another for the processor. The program loops, counting the time during which one of the bits of the real-time clock's counter is a 1. The least significant bit of the loop counter can be quite 'random'; depending on the implementation details (and on the current condition and operation of the hardware), it can even be random 'enough' for some uses.
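A rough pure-Python sketch of the idea, using a CPU-bound busy loop as one 'oscillator' and the system timer as the other (a stand-in for the two separate clock crystals; the output quality depends entirely on the hardware and should not be trusted without testing):

```python
import time

def oscillator_drift_bits(n_bits: int) -> list[int]:
    """Count busy-loop iterations completed during one ~1 ms tick of
    the system clock, and keep the least significant bit of the count.
    Drift and jitter between the CPU clock and the timer clock make
    that bit hard to predict -- but 'hard to predict' is unverified."""
    bits = []
    for _ in range(n_bits):
        count = 0
        deadline = time.perf_counter() + 0.001   # one timer 'tick'
        while time.perf_counter() < deadline:
            count += 1          # loop rate set by the CPU clock
        bits.append(count & 1)  # LSB of the loop counter
    return bits
```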
Even after all these measures have been taken, the bit-stream should still be assumed to contain bias and correlation.
John von Neumann invented a simple algorithm to fix simple bias and reduce correlation: it considers bits two at a time. When two successive bits are the same, the pair is discarded. A sequence of 1,0 becomes a 1, and a sequence of 0,1 becomes a 0. This eliminates simple bias and is easy to implement as a computer program or in digital logic. It reduces the available bit rate by a factor of four, on average. The technique works no matter how the bits were generated. It cannot assure randomness in its output, however.
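The extractor is only a few lines in any language; Python is used here for illustration:

```python
def von_neumann_extract(bits):
    """Examine the input two bits at a time: emit the first bit of each
    differing pair (1,0 -> 1 and 0,1 -> 0) and discard 0,0 and 1,1
    pairs. This removes simple bias, provided successive input bits
    are independent; it does nothing about correlation across pairs."""
    out = []
    for a, b in zip(bits[0::2], bits[1::2]):
        if a != b:
            out.append(a)
    return out
```

For example, the heavily biased input 1,0, 0,1, 1,1, 0,0 yields just 1,0: the two same-bit pairs are dropped.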
One of the more popular methods of improving a nearly random bit stream is to exclusive-or it with the output of a high-quality cryptographically secure pseudo-random number generator such as Blum Blum Shub. This can cheaply reduce both correlation and digit bias.
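A toy sketch of the idea: a Blum Blum Shub stream XORed against the raw bits. The modulus below uses deliberately tiny primes (383 and 503, both congruent to 3 mod 4), chosen only so the example runs instantly; real use requires a modulus of cryptographic size. Since x XOR k XOR k = x, the whitening step is its own inverse for a fixed key stream:

```python
def bbs_bits(n: int, seed: int = 3, m: int = 383 * 503):
    """Blum Blum Shub: iterate x -> x^2 mod m and emit the low bit.
    m must be a product of two primes congruent to 3 mod 4; the toy
    parameters here are hopelessly insecure, for illustration only."""
    x = (seed * seed) % m
    out = []
    for _ in range(n):
        x = (x * x) % m
        out.append(x & 1)
    return out

def whiten(raw_bits, prng_bits):
    """XOR a possibly biased raw stream with a CSPRNG stream; the
    result is at least as unpredictable as the better of the two."""
    return [a ^ b for a, b in zip(raw_bits, prng_bits)]
```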
Another method of "improving" a random sequence is to apply lossless data compression to it. A high-quality lossless compression algorithm must increase the entropy density (i.e., reduce the predictability) of the data, or no compression could occur; roughly speaking, the compressed output is 'more' random. But because some compression schemes, and some implementations of others, produce predictable output (e.g., in headers or in the code tables used for Huffman coding), using a compression algorithm to produce random bits should be very carefully thought out. Although mathematically robust in theory, many engineers find this approach inelegant and wasteful.
Some designs apply cryptographic hash functions such as MD5, SHA-1, or RIPEMD-160, or even a CRC function, to all or part of the bit stream, and then use the output as the random bit stream. Other designs use 'true random bits' as the key for a high-quality block cipher algorithm, taking the encrypted output as the random bit stream. Care must be taken in these cases to select an appropriate block mode, however.
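A minimal sketch of the hashing approach, here using SHA-256 over a raw noise pool with a counter appended to separate successive blocks (the counter scheme and function name are this example's own, not a standard):

```python
import hashlib

def hashed_random_bytes(raw_pool: bytes, n_blocks: int) -> bytes:
    """Hash counter-labelled views of a raw noise pool with SHA-256 and
    concatenate the digests. The hash hides residual bias and structure
    in the pool, but it cannot add entropy the pool does not contain:
    32 bytes of output are only as good as the pool behind them."""
    out = b""
    for i in range(n_blocks):
        out += hashlib.sha256(raw_pool + i.to_bytes(4, "big")).digest()
    return out
```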
When a high-speed source of lower-quality random digits is needed, the true random source is sometimes used to generate the seed for a pseudo-random number generator. The PRNG is run for a fixed number of digits while the hardware device generates a new seed.
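One way such a hybrid can be organized, with os.urandom standing in for the slow hardware source and Python's built-in Mersenne Twister as the fast (non-cryptographic) PRNG; the class name and reseed interval are illustrative:

```python
import os
import random

class ReseededPRNG:
    """Fast PRNG that is periodically reseeded from a slow 'true'
    source. os.urandom stands in for the hardware device here; the
    Mersenne Twister is used only as an example of a fast generator,
    not as a cryptographically secure one."""

    def __init__(self, reseed_interval: int = 1024):
        self.reseed_interval = reseed_interval
        self._reseed()

    def _reseed(self):
        # Pull 32 fresh bytes from the slow source and restart the PRNG.
        self.prng = random.Random(os.urandom(32))
        self.count = 0

    def random_byte(self) -> int:
        if self.count >= self.reseed_interval:
            self._reseed()
        self.count += 1
        return self.prng.getrandbits(8)
```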
to be written
Software engineers without true random number generators often try to approximate them by measuring physical events available to the software. The classic approach is to measure the time between keystrokes and take the least significant bit (or two or three) of the count as a random digit. The same approach has been applied to task scheduling, network hits, disk-head seek times, and other internal events.
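A sketch of this event-timing harvesting, with a dummy arithmetic loop standing in for keystrokes or disk seeks; as the surrounding text warns, the quality of such bits is unknown without statistical testing:

```python
import time

def timing_jitter_bits(n_bits: int, work: int = 5000) -> list[int]:
    """Take the least significant bit of a high-resolution timestamp
    sampled after each 'event'. The jitter comes from scheduling,
    caches, and interrupts; a real harvester would time genuinely
    external events such as keystrokes, not a fixed loop."""
    bits = []
    for _ in range(n_bits):
        x = 0
        for i in range(work):
            x += i * i          # stand-in for waiting on an event
        bits.append(time.perf_counter_ns() & 1)
    return bits
```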
The method is quite risky when it uses computer-controlled events because a clever, malicious programmer might be able to predict a cryptographic key by controlling the external events. Several gambling frauds have been uncovered which rely on manipulating (normally hidden) events internal to the operation of computers or networks.
to be written -- mention:
It is very easy to build devices that generate random numbers incorrectly. They also break silently, often producing decreasingly random numbers as they degrade. An example is the decaying radioactive source of the smoke alarms mentioned earlier: as the radioactive intensity decreases, the sensor must compensate, which is not easily accomplished. Failure modes in such devices are plentiful, and are neither easy nor quick nor cheap to detect.
Because they are quite fragile and fail silently, statistical tests on their output should be performed continuously. Many such devices build some of these tests into the software that reads the device. Others, of course, don't.
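A continuous health check can be as simple as a running monobit test in the spirit of the FIPS 140-2 power-up tests: reject any block whose count of 1s strays too far from half. The rejection threshold below is illustrative:

```python
def monobit_ok(bits, z_max: float = 3.29) -> bool:
    """Return False if the number of 1s in the block deviates from n/2
    by more than z_max standard deviations (for fair bits, the count of
    1s has standard deviation sqrt(n)/2). z_max = 3.29 gives roughly a
    1-in-1000 false-alarm rate per block; a failing block should raise
    an alarm, not be silently discarded."""
    n = len(bits)
    ones = sum(bits)
    z = abs(ones - n / 2) / (n / 4) ** 0.5
    return z <= z_max
```

A perfectly balanced block passes and a stuck-at-zero generator fails immediately; subtler failures (e.g. correlation with a balanced mean) need additional tests, such as runs tests.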
to be written
The Global Consciousness Project maintains TRNGs in many cities, and reported chi-square fluctuations in their outputs during the 9/11 attacks; those observations were not controlled for power fluctuations and the like. Many researchers in paranormal studies use random number generators in their test designs.
Early attempts to generate true random numbers
Physical phenomena used for random number generation
In nearly all cases, the acquired noise signal is amplified, filtered, and then run through a high-speed voltage comparator to produce a logic signal that alternates states at random intervals.

Cleaning up the bit-stream
Estimating the size of the entropy pool
Software implementation of random number generators from observed events
/dev/random
/dev/urandom
Problems
Checking the performance of hardware random number generators
Controversial random number phenomena
See also
External links
Random number services on the net
HotBits claims to provide private radioactively-produced random numbers. They admit to "stockpiling" them, so in principle they could be intercepted if their server were penetrated.
random.org claims to deliver private random numbers generated from atmospheric radio noise.

Manufacturers of random number generator devices