Thinking about one-time pads reminded me to point this out.
It’s simple, really. A “random” number generator spits out a string of numbers or characters from some set, say c1, c2,…, cp, where each of these is a character out of p total. Example: c1 = ‘a’, c2 = ‘b’, etc.
Before the generator starts, using only the premises that the generator can produce only these p characters, we deduce the probability that the first character is ci (for any i in 1 to p) as 1/p.
Start the generator. The characters roll out one at a time. To be considered “random”, this condition must hold:
Pr( Ct = ci | C1, C2, …, Ct-1, generator premises) = 1/p for all t and all i,
where I mean by this notation that the probability the next character (Ct) will be one of the set is 1/p no matter how far along in the series we are (short of infinity) and no matter what character we consider. The “generator premises” are those which state only the characters c1 through cp are available, etc.
In other words, if there is any information in the series to date that allows us to calculate any probability other than 1/p for Ct, then the series is not “random.” Be clear that “allows us” means “logically possible” and not necessarily practical or in-practice possible. There may be some existence proof which says something like “Given this generator and an output series like this-and-such, the probability of Ct is x, which does not equal 1/p”. We may never see the this-and-such series, but still if the output series is possible, then the output is not “random.”
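To make that concrete, here is a toy sketch (my example, not the author’s): a “generator” that just cycles through its characters. Knowing the series to date, we can predict the next character with probability 1, not 1/p, so its output fails the condition above.

```python
# A deterministic cycling "generator" over p = 3 characters. If output
# were "random", the best we could do is Pr = 1/3 for the next character.
chars = ['a', 'b', 'c']
series = [chars[t % 3] for t in range(30)]   # 'a','b','c','a','b','c',...

def predict_next(history):
    """Given the series so far, the next character is certain under the cycling premise."""
    return chars[len(history) % 3]

hits = sum(predict_next(series[:t]) == series[t] for t in range(1, 30))
print(hits / 29)   # 1.0 -- we predict perfectly, so the series is not "random"
```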
Now don’t start going all frequentist on me and say, “Look here, Briggs, you fool. I’ve run the generator a finite number of times and the relative frequencies of the observed Ct don’t match 1/p.” I’d reply, “Get back to me when you reach an infinite number of times.”
Run the generator for just one character. The observed relative frequencies will be 0, 0, …, 1, …, 0, where it’s 0s for all ci except the 1 character which showed. What does this prove? Next to nothing. Probability just isn’t relative frequency, but relative frequency can match the probability. Probabilities predict relative frequencies. In that spirit, we know the series is not “random” if we can do a better job predicting the series than saying the probability of the next character is 1/p (for any character).
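You can watch this happen in a small simulation (an assumed setup, not the author’s code): after one draw the relative frequencies are all 0s and a single 1; only after many draws do they settle near the probability 1/p.

```python
import random
from collections import Counter

random.seed(1)               # fixed seed so the sketch is reproducible
chars = ['a', 'b', 'c']      # p = 3 equally likely characters

one = Counter(random.choices(chars, k=1))        # one draw
many = Counter(random.choices(chars, k=30000))   # many draws

print({c: one[c] for c in chars})                      # 0s and a single 1
print({c: round(many[c] / 30000, 2) for c in chars})   # each near 1/3
```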
But I see the idea of relative frequency is still alluring. Perhaps this is why. There is in math the idea of a “normal” number, unfortunately named because “normal” in probability means something else. A normal string or number is one in which the digits/characters repeat equally often, yea even unto infinity. Examples: 0.111222333444555666777888999000111222… and ‘abcabcabcabc…’ (we know the numbers are limited to digits 0-9, but here I limit the characters to a,b,c).
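Checking the equal-frequency property on a finite truncation of the ‘abcabcabc…’ example takes only a few lines (my illustration):

```python
from collections import Counter

# A finite slice of the repeating 'abcabcabc...' string:
s = 'abc' * 1000
freq = {ch: n / len(s) for ch, n in Counter(s).items()}
print(freq)   # each of a, b, c occurs exactly 1/3 of the time
```

Equal frequencies, yes; but as the next paragraph says, that is nowhere near enough for “randomness”.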
These normal numbers are in no sense “random”, because if you know where you are you know with certainty what the next digit or character is. Plus, there are technical subtleties: a number may be normal in one base (say 10 or 2) but not normal in another base. Stoneham’s constant is an example of a number which is normal in one base but not another. So “normal” does not imply random, but we have the sense that “random” implies “normal”, which it does.
Truly “random” numbers are probably (I don’t believe there is a proof) normal in any base. Another way to put it, in terms of information, is to say that the number cannot be compressed (in any base); that is, speaking loosely, that it takes just as many characters to compress the number as to display it. The above-linked article gives some hints on the “randomness” of π, which appears normal in many (all?) bases and which cannot be compressed. So where do the digits of π and other transcendental numbers come from? Only God knows.
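A rough sketch of the compressibility idea, using zlib as a stand-in for an ideal compressor (illustrative only, not a proof of anything): a patterned string squeezes down to almost nothing, while bytes with no exploitable pattern stay roughly as long as they started.

```python
import zlib
import random

random.seed(0)   # reproducible sketch
patterned = ('123456789' * 2000).encode()                  # 18000 bytes, pure repetition
patternless = bytes(random.randrange(256) for _ in range(18000))  # 18000 pseudo-random bytes

print(len(zlib.compress(patterned)))     # tiny: the repetition compresses away
print(len(zlib.compress(patternless)))   # roughly as long as the input itself
```

(zlib is of course a practical compressor, not the logical ideal; a series zlib cannot compress might still be compressible in principle.)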
Last point: many “random” number generators—where by now we see that “random” merely means unknown or unpredictable in some sense—are wholly predictable: they are fully deterministic. Start at a known output and the path can be predicted with certainty. These are called “pseudo-random” generators because the numbers only appear unpredictable.
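Here is a minimal example of such a generator, a classic linear congruential generator (the constants are the well-known Numerical Recipes ones; this is an illustration, not any particular library’s implementation). Same seed, same series, every time: given the premises, the path is certain.

```python
def lcg(seed, n, a=1664525, c=1013904223, m=2**32):
    """Return n values from a simple linear congruential generator:
    x_{t+1} = (a * x_t + c) mod m. Fully deterministic given the seed."""
    x = seed
    out = []
    for _ in range(n):
        x = (a * x + c) % m
        out.append(x)
    return out

# Start at a known seed and the whole path is predictable with certainty:
print(lcg(42, 5) == lcg(42, 5))   # True
```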
And what does that mean? Appear unpredictable? Well, it means not-random: we can prove that a set of premises exists which predicts the series perfectly. The difference with a truly “random” generator is that, for it, we can prove no such set of premises exists.