# The Probability of Monkeys Typing Shakespeare

How long would it take a monkey typing randomly to reproduce the complete works of William (great name, incidentally) Shakespeare?

Once we know that, we can answer how long it would take a barrelful. If, that is, we knew how many monkeys would fit in a standard barrel. From experiments conducted by your author, I can tell you the answer is eleven, but you have to press hard.

A typewritten work is composed, of course, of words, and in between those words are spaces and the occasional punctuation. Separating the words are headings, themselves comprised of words and numbers.

According to Bennett, Briggs (no relation), and Triola, Shakespeare penned 884,647 words, which isn’t as many as you would think. A standard newspaper-style column, of the kind you read at websites such as this, is 800 words. If Will wrote columns, 884,647 words would fill about 1,100 columns.

And if he wrote one column per day, then it would only take three years to have an *oeuvre*. After that, of course, comes retirement. Sounds like a government job, no?

Now, each, or nearly each, word Shakespeare wrote was accompanied by a space, this being a peculiarity of English. There are no spaces in Chinese and Japanese, for example. Each word consists of letters, there being in English 26 of them. To make less work for our monkeys, we’ll assume case insensitivity: capitals, lower case, all the same to us.

We could count all the letters in those 884,647 words, but that would take too long (I don’t have the files at hand). Instead we’ll use the average word length in English (in Shakespeare’s day): it was 5 letters. And let’s not forget punctuation, which is roughly 10% of his published work (scientifically estimated by glancing at Act IV of *The Tempest*).

This gives us

#characters = (#words x avg. word length + #spaces) x punctuation

or

#characters = (884,647 x 5 +884,647) x 1.1 = 5,838,670

which is close enough to 6 million for anybody. That’s 6 million *characters*, consisting of letters, numbers, the space, and punctuation.

There are 26 letters, 10 numbers, 1 space, and 8 (that I could see) punctuation marks—[‘ , : ; ! ? . –]. That’s 45 different characters.
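The arithmetic so far is easy to check by machine. What follows is a minimal sketch in Python, assuming the figures quoted above (884,647 words, 5 letters per word, one space per word, a 10% allowance for punctuation); the function and constant names are mine and purely illustrative.

```python
# Inputs taken from the text above; the names are illustrative only.
WORDS = 884_647            # word count attributed to Bennett, Briggs, and Triola
AVG_WORD_LENGTH = 5        # assumed average letters per word
PUNCTUATION_FACTOR = 1.1   # punctuation adds roughly 10%


def estimate_characters(words, avg_len, punct_factor):
    """Letters plus one space per word, scaled up for punctuation."""
    spaces = words  # assumption: exactly one space accompanies each word
    return (words * avg_len + spaces) * punct_factor


total = estimate_characters(WORDS, AVG_WORD_LENGTH, PUNCTUATION_FACTOR)

# 26 letters, 10 digits, 1 space, 8 punctuation marks
ALPHABET_SIZE = 26 + 10 + 1 + 8

print(f"estimated characters: {total:,.0f}")
print(f"distinct keys:        {ALPHABET_SIZE}")
```

Changing any one input (say, a different average word length) shows how little the final tally moves on the scale that matters here.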

Ready? Gather your monkeys and sit them in front of a keyboard. In order to reproduce Shakespeare, what has to happen *first*?

Well, the Bard’s earliest published work was *Venus and Adonis*, which begins “Even as the sun with purple-colour’d face…” In order to reproduce all his works, your monkey has to at least reproduce *Venus and Adonis*, and in order to reproduce that, your monkey has to at least reproduce the first word, which is “Even”.

And in order to reproduce *that*, your monkey has to first type an ‘e’. Suppose he has typed an ‘r’, which is not an ‘e’. And then suppose he typed “ven as the sun with purple-colour’d face…” and all the other letters, numbers, spaces, and punctuation in proper order.

Has your monkey reproduced the complete works of Shakespeare? No, sir, he has not. If you think of Shakespeare’s works as a key, then any deviation from that key won’t work in our lock (whatever that is). So our monkey has to have every i dotted, every t crossed, in the proper order.

Given our information (our evidence, our premises), what is the probability that your monkey types an ‘e’? There are 45 possibilities, so the chance is 1 in 45, or 0.022222.

Given our evidence *and* given the additional fact that your monkey *did* type an ‘e’ first, what is the chance he types the second character correctly? Right: it’s the same; 1 in 45. And so on for each character.

What are the chances of your monkey typing, *in order*, all the characters? It’s the probability of typing the first correctly, times the probability of typing the second correctly, and so on. This is

Prob = (1/45)^{6,000,000} ~ (2/100)^{6,000,000} = 2^{6,000,000} x 100^{-6,000,000}

where I have approximated 1/45 by 2/100, which is close enough. We can write that better using logs (base 10), because

2^{6,000,000} x 100^{-6,000,000} = 10^{6,000,000 x (log(2) - 2)} ~ 10^{-10,000,000}

which is an awfully small number (log(2) = 0.3, so log(2) - 2 = -1.7, and 6,000,000 x -1.7 = -10,200,000, which rounds to -10 million). This is a 1 divided by a 1 followed by 10 *million* zeros.

Keep in mind that a googol (no, the other one) is defined as 10^{100}, which is just plain large. Our number (rather, its inverse) is much, much (much!) bigger. (However, our number is smaller than a googolplex.)

The universe is roughly 14 billion years old, which is about 4.4 x 10^{17} seconds. It now becomes tricky. Do we only accept a monkey’s efforts that are of the correct length, which we then compare with the Bard? Or, more fairly, do we throw out the stream of characters that do not match the matching stream of Shakespeare? Do we, that is, let the monkey continuously start over until he gets it right?

Well, it just doesn’t matter. The number 10^{-10,000,000} is so mind-bogglingly small that it is never going to happen. Even if we let a barrelful of monkeys type 100 characters a second, they are never going to finish.
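A quick way to check the size of that exponent is to let the computer take the logarithm directly, without the 2/100 shortcut. The sketch below assumes the rounded 6 million characters, 45 equally likely keys, a barrel of 11 monkeys, and 100 characters a second *per monkey* (the text does not say whether that rate is per monkey or per barrel, so that part is my assumption). The function names are hypothetical. Everything is done in logs because the probability itself is far too small for a floating-point number.

```python
import math

N_CHARS = 6_000_000        # rounded character count from the text
N_KEYS = 45                # equally likely keys, as assumed in the text
UNIVERSE_SECONDS = 4.4e17  # age of the universe quoted in the text
MONKEYS_PER_BARREL = 11    # the author's experimental figure
CHARS_PER_SECOND = 100     # assumption: this rate applies to each monkey


def log10_prob_exact_match(n_chars, n_keys):
    """log10 of (1/n_keys)**n_chars, computed without underflow."""
    return -n_chars * math.log10(n_keys)


def log10_expected_matches(log10_p, n_keystrokes):
    """log10 of an upper bound on the expected number of complete matches.

    Every keystroke could be the start of a matching run, so the expected
    number of matches is at most n_keystrokes * p.
    """
    return math.log10(n_keystrokes) + log10_p


log10_p = log10_prob_exact_match(N_CHARS, N_KEYS)
keystrokes = MONKEYS_PER_BARREL * CHARS_PER_SECOND * UNIVERSE_SECONDS
log10_matches = log10_expected_matches(log10_p, keystrokes)

print(f"log10(probability of one exact match) = {log10_p:,.0f}")
print(f"log10(expected matches, upper bound)  = {log10_matches:,.0f}")
```

With these inputs the first exponent comes out at roughly -9.9 million, in the same ballpark as the rounded figure above. Letting the whole barrel type for the age of the universe adds only about 21 to it, which is why the restart rule makes no difference.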

And so we conclude what we already knew: randomness isn’t enough to make a Shakespeare; something more is needed.

I assumed that the order of Shakespeare’s works matters: that is, the monkey first has to type *Venus and Adonis* before moving on to a sonnet. If order doesn’t matter, then the chance the monkey reproduces everything increases, by about 100 times.

I also ignored all the other keys on a standard keyboard: including them drops the chance of duplication to even smaller levels.

**Update** Fixed thanks to Charles’s suggestion. Another way to think of it is that a space adds to the average word length by one.

Here is a relevant excerpt from The Wicked-Good Dictionary of Media and Communication we (Charles G. Waugh, Sandra Sibert, and Joseph Leff) will be publishing in the fall.

THE INFINITE MONKEY THEOREMS:

The Infinite Monkey Theorem: If an army of monkeys were strumming on typewriters, they might write all the books in the British Museum. (Based on a statement by Sir Arthur Eddington.)

The Infinite Monkey Theorem Attempted: I heard someone tried the monkeys-on-typewriter bit trying for the plays of W. Shakespeare, but all they got was the collected works of Francis Bacon. (Based on a statement by Bill Hirsh.)

Dyslexic Infinite Monkey Theorem: I heard that if you locked William Shakespeare in a room with a typewriter for long enough, he’d eventually write all the songs of the Monkees. (Based on a statement by an unknown source.)

The Infinite Monkey Theorem Refuted: We’ve all heard that a million monkeys at a million keyboards could produce the complete works of Shakespeare; now, thanks to the Internet, we know that is not true. (Based on a statement by Robert Wilensky.)

The Infinite Monkey Theorem Revised: If an infinite number of rednecks riding in an infinite number of pickup trucks fire an infinite number of shotgun rounds at an infinite number of highway signs, they will eventually produce all the world’s great literary works in Braille. (Based on a statement from http://www.jokecenter.com/jokes/Education/1563.htm.)

Briggs,

Did you skip your coffee this morning? Your math skills aren’t up to your usual standards.

I think you meant:

#characters = (#words x avg. word length + #spaces) x punctuation factor

including the spaces does not double the number of characters – it adds about one per word.

and

Prob = (1/45^(10,000,000)) ~ 2^(10,000,000) x 100^(-10,000,000)

that’s a big difference. Not enough to change your conclusion but still …

Typing?!

“Girls, we’ll go see the movie *Iron Man 2* after you finish typing your book reports due on Monday.”

“Let’s go to the movie first. Don’t worry, Mom, with Ctrl-C and Ctrl-V and internet, we can finish it within 5 minutes.”

But what if we bred (or trained, but breeding sounds like more fun) monkeys to prefer to type the more common characters? Would it then become possible to fit enough monkeys and typewriters into the universe to have a reasonable chance of finishing the works before THE END?

Fred went over it again; this time he said that punctuation AND capitalization don’t count.

http://www.fredoneverything.net/Evolution.shtml

Charles,

Damn your eyes!

Charles, MD,

I like the internet bit.

Darwinists would say that Shakespeare was in fact a slightly evolved monkey, but a monkey just the same, and so the probability is 1.

Jerry:

As I understand the latest research there is more than one author of the works ascribed to “Shakespeare”. On the other hand, it is hard to disagree with your fundamental biological assertion, given any reasonable definition of monkey.

Mogan,

You may be onto something: a genius like Shakespeare was able to write a couple of hundred pages of poetry in a year. A modern author is no more prolific than Shakespeare was. How many professional writers produce more than one book every 18 months?

A congressional committee tries to write a couple of thousand pages in a few weeks. No wonder it is garbage!

These days I am told it is ‘Kitman’s Law’, but I remember it as something else, but hey –

Pure Drivel drives away ordinary drivel. Gresham would be proud.

Bernie,

You believe the lies about Shakespeare being written by someone else? I am surprised.

Until the age of the internet it was simply too difficult to coordinate the work of your monkeys in order to test the math empirically. After all, how could you tell if monkey 7,432’s page fourteen of Hamlet legitimately comes before monkey 3,914,648’s page 13 of Hamlet, or if it actually came right before monkey 17’s page 9 of Midsummer Night’s Dream?

Thanks to the good folks at the IETF we now have rfc2795: The Infinite Monkey Protocol Suite (IMPS) to ensure efficient monkey-typing organization.

http://tools.ietf.org/html/rfc2795

I think the proposal of the Infinite monkeys idea to support evolution is badly put. After all, there is no definite outcome to evolution so saying, “There y’go chimps, produce Shakespeare’s works” is comparing apples and something else that is not an apple.

If we said, “There y’go chimps, produce something recognizable as a work of literature” then what would we be prepared to accept as a hit? How about the works of Shakespeare with all the e’s replaced by o’s? After all, some of evolution’s products are less than perfect. (No names, no pack drill).

Or the works of Edgar Allan Poe? In French? Backwards? Or Caesar’s “Gallic War” in Latin encrypted with the Caesar cypher? Or all the works of William McGonagall translated into every known language? (The chimps would have produced a handy reference card at the end showing how foreign scripts had been encoded into the Roman alphabet). Or into a thousand languages that don’t exist?

Once we hold any of these in our hands we’re going to be amazed. Of course, calculating the probability of a chimp product that will satisfy these wider criteria is harder, maybe impossible. You must first imagine every possible work of literature whether it presently exists or not in any language in every possible form. OK, impossible.

Isn’t it a bit like being stunned and amazed every week that someone wins the lottery?

Rich,

Excellent point.

The odds of winning the Mega Millions lottery are 1 in 175,711,536 (I copied this because I was too lazy to calculate; it doesn’t matter if it’s correct). Given this info, the probability that *you* win is 1 in 175,711,536, or p = 5.691146e-09.

But the probability that *there is at least one winner* is much higher, but it depends on how many people buy tickets. Let that number be n. Then the probability (given our evidence) of at least one winner is Q = 1 – (1 - p)^{n}.

If just you bought a ticket, Q = p. If 1,000 bought tickets, then Q = 5.69113e-06; 1 million buyers pushes Q to 0.005674982. Ten million makes it 0.055. The actual number of tickets bought is somewhere in that 1-10 million range.

Now, somebody doesn’t win the lottery every week (there are two drawings a week), but only every so often; it certainly isn’t a rarity, as you suggest.

Given our information, if 1 million buy tickets for each drawing, the probability of at least one winner in 1 week is 0.011. In five weeks, it is 0.055.

If 10 million buy tickets for each drawing, the probability of at least one winner in 1 week is 0.108. In five weeks, it is 0.43.
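These figures can be reproduced with a few lines of Python. This is only a minimal sketch, assuming every ticket is an independent random pick and that there are two drawings a week; the function name is made up for illustration.

```python
P_WIN = 1 / 175_711_536   # single-ticket chance quoted above
DRAWINGS_PER_WEEK = 2


def prob_at_least_one_winner(p, n_tickets, n_drawings=1):
    """Q = 1 - (1 - p)**n, treating all tickets and drawings as independent."""
    return 1 - (1 - p) ** (n_tickets * n_drawings)


for n in (1_000, 1_000_000, 10_000_000):
    one_drawing = prob_at_least_one_winner(P_WIN, n)
    one_week = prob_at_least_one_winner(P_WIN, n, DRAWINGS_PER_WEEK)
    five_weeks = prob_at_least_one_winner(P_WIN, n, 5 * DRAWINGS_PER_WEEK)
    print(
        f"{n:>10,} tickets: one drawing {one_drawing:.3g}, "
        f"one week {one_week:.3g}, five weeks {five_weeks:.3g}"
    )
```

If real buyers cluster on favourite numbers, the independence assumption fails and the chance of at least one winner would be somewhat lower than this sketch suggests.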

You get the idea: rare events like hitting the lottery aren’t that rare when they are given multiple chances of occurring.

Same with evolution.

“Isn’t it a bit like being stunned and amazed every week that someone wins the lottery?”

“Isn’t it” what? Are you talking about Shakespeare or the chances of a monkey typing it?

Yawn, or the chances of my breathing a said molecule given the number on the planet.

That’s silly, the universe is not a casino. Even the casino is not a casino!

When you dress up the argument falsely by implying that all have an equal chance, invoke infinity, you can make anything seem possible. Mathematicians do it all the time. We now are subjected to their meddling through the actions of politics with perverted science as proof for anything. What will happen when the public, who are not dumb but trusting, realise what has gone on.

The self proclaimed scientist or mathematicians have a duty of care to speak out when nonsense is dressed up as highly complex truth.

“Let every eye negotiate for itself and trust no agent.”

Joy, I’m sorry but I didn’t understand your comment.

But to answer your question, I’m talking about being amazed that the particular forms of life we see around us have arisen by chance when, for all we know, some form or other was inevitable. Or not even inevitable. As Briggs just pointed out, that there should be a lottery winner each week is not inevitable just quite likely.

If you don’t decide in advance what forms life will take then the chance that some form of life or other will appear may not be so unlikely even though the forms we actually do see are extraordinarily unlikely.

But we don’t actually have a clue.

Rich,

The difficulty with evolution-probability questions is that they are nearly always ill-posed. As we know, there is no such thing as unconditional probability. The questions “What’s the probability this certain life form develops?”, or even “What’s the probability of life self-organizing?” are ill posed. They cannot be answered.

Not without first specifying the conditions on which they are based; that is, their evidence or premises *must* be spelled out exactly.

Any question of abiogenesis is ill-posed unless you can tell me exactly what are *all* the possibilities. Do we consider only carbon-based life? And that only comprised of a certain set of amino acids? Or do we also allow silicon-based life forms? Energy-based? And what do those mean?

Actually, what we find (you won’t be surprised) is that most people talk nonsense on this subject.

That guy Fred is right: what is often used as evidence is possibility. And what is possibility but unquantified probability? Perfectly respectable to leave probabilities vague, but that does not excuse you from specifying the evidence, premises, conditions. If you can’t do that, you can’t talk about possibility.

Well quite.

It would be interesting to throw in some variables like nature. Monkeys are rewarded with food or the ability to mate for pressing the right keys, or they are harmed or have food withheld for pressing the wrong keys. Add in some growth factor: say, start out simple, the challenge being one letter, then one word. When they meet that challenge, their knowledge grows.

Eventually, you could have a single monkey write all the sonnets of Shakespeare. This monkey:

The monkeys might produce the works of Shakespeare, but backwards. An ordinary reader of English might spot it quickly on casual inspection. Or, perhaps they could type it out in hexadecimal code. Do these count? I agree that the probability of a random process producing something large and humanly recognizable is very low, but I think the puzzle needs better definitions. Perhaps thermodynamics and information theory could be the source of those definitions.

Rich,

A bit late sorry, but were you trying to say sweetly “Joy you’re not making sense”, never mind, it was clear as I could muster. I was joking about the casino.

There are always barrel loads of scientist types waiting to muddy the waters of clear understanding with less than honest motives. Sometimes there’s no motive at all but to play with their ‘tools’ and ‘models’; all the gear, no idea.

