Friday, July 6, 2012

The shocking difficulty of producing truly random numbers (and the world's most boring book)

Need some random numbers? The RAND Corporation has you covered. They published a book called "A Million Random Digits with 100,000 Normal Deviates". This 1955 work makes for some boring reading indeed; it's exactly what it says on the cover, no more, no less. Just columns of digits, which were generated by an electronic roulette-wheel simulator hooked up to one of their computers. The digits are arranged in groups of five, ten groups to a line, with the rows numbered in the far-left column from 00000 to 19999.
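For the curious, here is a rough sketch of that page layout in Python. It is only an illustration: the function name `format_rows` is made up for this post, and the digits come from Python's `secrets` module rather than from the book itself. It also checks the row arithmetic: a million digits at 50 per line works out to 20,000 rows, hence labels 00000 through 19999.

```python
import secrets


def format_rows(digits, groups_per_line=10, group_size=5):
    """Lay out a string of digits as numbered rows of five-digit groups."""
    per_line = groups_per_line * group_size
    lines = []
    for row, start in enumerate(range(0, len(digits), per_line)):
        chunk = digits[start:start + per_line]
        groups = [chunk[i:i + group_size] for i in range(0, len(chunk), group_size)]
        # Row label is zero-padded to five places, like the book's left column.
        lines.append(f"{row:05d}  " + " ".join(groups))
    return lines


# Stand-in digits (NOT the book's): 150 digits gives three rows.
sample_digits = "".join(str(secrets.randbelow(10)) for _ in range(150))
sample_lines = format_rows(sample_digits)
for line in sample_lines:
    print(line)

# A million digits at 50 per row needs 20,000 rows, labelled 00000 to 19999.
rows_needed = 1_000_000 // (10 * 5)
last_label = f"{rows_needed - 1:05d}"
print(rows_needed, last_label)
```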

And if you want a copy, it's yours for - as of this writing - a used price of $33 or a new price starting at $76.67. And while you're at Amazon, don't miss the hilarious customer reviews, ranging from tongue-in-cheek drollery to one man who claims that the book was instrumental in him meeting his future wife.

Why would anyone want such a tome? Because TRUE random numbers are actually surprisingly hard to come by, especially back in 1955. Even modern machines, however, are hard-pressed to come up with true entropy. True, you could sit there rolling a handful of dice all day and writing down the results, but this is slow. The best most software can do is pseudorandom numbers: sequences computed by a deterministic formula that is seeded from things like the system clock, user input, or the measurement of system voltage.
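To see why "pseudo" matters, here is a small sketch using Python's standard `random` module. The helper name `first_digits` and the seed values are just picked for illustration. Start the generator from the same seed twice and you get exactly the same "random" digits twice, which is precisely the kind of predictability that causes trouble.

```python
import random


def first_digits(seed, count=10):
    """Return `count` decimal digits from a generator started at `seed`."""
    rng = random.Random(seed)  # deterministic generator with its own state
    return [rng.randrange(10) for _ in range(count)]


run_a = first_digits(seed=1955)
run_b = first_digits(seed=1955)  # same seed as run_a
run_c = first_digits(seed=2012)  # different seed

print(run_a)
print(run_b)
print(run_c)
print("same seed, same digits:", run_a == run_b)
```

Anyone who knows (or can guess) the seed can replay the whole sequence, so a seed taken from something as guessable as the clock is a weak point.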

Why are random numbers so important? They are used in both cryptography and gambling games, and in both cases it's a disaster if your number sequences become predictable. While pseudorandom numbers are fine for things like determining random events in a video game (such as which Tetris piece will fall next), they just don't cut the mustard for more mission-critical applications.


One such case is the TV game show Press Your Luck, in which pseudorandom number sequences were used to generate a pattern of squares on a game board, and the contestant had to press a button to stop the board on a square, hoping to win money. One determined contestant, Michael Larson, discovered that the patterns on the board were very easy to predict, and so when he got on the show, he had an unprecedented winning streak.


Casinos, with electronic slot machines, Keno, poker machines, and so on, have the same risk. So do lotteries. And so does any security application of cryptography: it doesn't matter how great my secret code is if someone can simply run the letters or digits through a frequency analysis and figure out that "36547" = "Washington". So the encoded information has to be combined with a bank of random numbers that is used once and then thrown away, which is the idea behind the one-time pad.
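Here is a minimal sketch of that "combine with random numbers, use once, throw away" idea, done as a byte-wise XOR. The function names `otp_encrypt` and `otp_decrypt` are my own, and the pad comes from Python's `secrets` module, which draws on the operating system's random source. That is an assumption made for convenience: a real one-time pad wants truly random pad material, kept secret, as long as the message, and never reused.

```python
import secrets


def otp_encrypt(message: bytes):
    """Combine `message` with a fresh random pad of the same length."""
    pad = secrets.token_bytes(len(message))  # assumption: OS randomness as the pad
    ciphertext = bytes(m ^ p for m, p in zip(message, pad))
    return ciphertext, pad


def otp_decrypt(ciphertext: bytes, pad: bytes) -> bytes:
    """Undo the XOR; only works with the exact pad used to encrypt."""
    if len(pad) != len(ciphertext):
        raise ValueError("pad must be exactly as long as the ciphertext")
    return bytes(c ^ p for c, p in zip(ciphertext, pad))


secret = b"Washington"
ciphertext, pad = otp_encrypt(secret)
recovered = otp_decrypt(ciphertext, pad)

print(ciphertext.hex())
print(recovered)
```

Because every byte gets its own random pad byte, the same letter does not keep turning into the same code, so frequency analysis has nothing to grab onto. Reuse the pad, though, and that protection is gone.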

There are a number of hardware random-number generators today, little frobs that plug into a USB port and use something like a radioactive-decay detector or an atmospheric-noise receiver to get true natural entropy as input.

For students with test data needs, websites like RANDOM.ORG have downloadable random numbers produced from atmospheric noise picked up by radio receivers.