William M. Briggs

Statistician to the Stars!

The Probability of Monkeys Typing Shakespeare

This article was inspired by reading Fredwin On Evolution, as wisely suggested by reader Bob Ludwick. At the bottom of Fred’s piece, there appears a rough calculation, which I expand here.

How long would it take a monkey typing randomly to reproduce the complete works of William (great name, incidentally) Shakespeare?

Once we know that, we can answer how long it would take a barrelful. If, that is, we knew how many monkeys would fit in a standard barrel. In experiments conducted by your author, I can tell you the answer is eleven, but you have to press hard.

A typewritten work is composed, of course, of words, and in between those words are spaces and the occasional punctuation. Separating the words are headings, themselves comprised of words and numbers.

According to Bennett, Briggs (no relation), and Triola, Shakespeare penned 884,647 words, which isn’t as many as you would think. A standard newspaper-style column, of the kind you read at websites such as this, is 800 words. If Will wrote columns, 884,647 words would fill about 1,100 columns.

And if he wrote one column per day, then it would only take three years to have an oeuvre. After that, of course, comes retirement. Sounds like a government job, no?

Now, each, or nearly each, word Shakespeare wrote was accompanied by a space, this being a peculiarity of English. There are no spaces in Chinese and Japanese, for example. Each word consists of letters, there being in English 26 of them: to make less work for our monkeys, we’ll assume case insensitivity: capitals, lower case, all the same to us.

We could count all the letters in those 884,647 words, but that would take too long (I don’t have the files at hand). Instead we’ll use the average word length in English (in Shakespeare’s day): it was 5 letters. And let’s not forget punctuation, which is roughly 10% of his published work (scientifically estimated by glancing at Act IV of The Tempest).

This gives us

    #characters = (#words x avg. word length + #spaces) x punctuation factor

or

    #characters = (884,647 x 5 + 884,647) x 1.1 = 5,838,670

which is close enough to 6 million for anybody. That’s 6 million characters, consisting of letters, numbers, the space, and punctuation.
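The arithmetic is easy to check. The snippet below is only a sketch of the estimate: the constants are the figures quoted in the text, and the function and variable names are invented for illustration.

```python
# Figures taken from the article; the names are illustrative only.
WORDS = 884_647            # Shakespeare's word count, as quoted
AVG_WORD_LENGTH = 5        # assumed average letters per word
PUNCTUATION_FACTOR = 1.1   # punctuation adds roughly 10%


def estimate_characters(words, avg_len, punct_factor):
    """Letters plus one space per word, scaled up for punctuation."""
    letters = words * avg_len
    spaces = words
    return (letters + spaces) * punct_factor


total = estimate_characters(WORDS, AVG_WORD_LENGTH, PUNCTUATION_FACTOR)
print(f"{total:,.0f}")  # prints 5,838,670
```

Treating each space as one extra letter per word gives the same total, since (5 + 1) x 884,647 x 1.1 is the same product.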

There are 26 letters, 10 numbers, 1 space, and 8 (that I could see) punctuation marks—[' , : ; ! ? . --]. That’s 45 different characters.

Ready? Gather your monkeys and sit them in front of a keyboard. In order to reproduce Shakespeare, what has to happen first?

Well, the Bard’s earliest published work was Venus and Adonis, which begins “Even as the sun with purple-colour’d face…” In order to reproduce all his works, your monkey has to at least reproduce Venus and Adonis, and in order to reproduce that, your monkey has to at least reproduce the first word, which is “Even”.

And in order to reproduce that, your monkey has to first type an ‘e’. Suppose he has typed an ‘r’, which is not an ‘e’. And then suppose he typed “ven as the sun with purple-colour’d face…” and all the other letters, numbers, spaces, and punctuation in proper order.

Has your monkey reproduced the complete works of Shakespeare? No, sir, he has not. If you think of Shakespeare’s works as a key, then any deviation from that key won’t work in our lock (whatever that is). So our monkey has to have every i dotted, every t crossed, in the proper order.

Given our information (our evidence, our premises), what is the probability that your monkey types an ‘e’? There are 45 possibilities, so the chance is 1 in 45, or 0.022222.

Given our evidence and given the additional fact that your monkey did type an ‘e’ first, what is the chance he types the second character correctly? Right: it’s the same; 1 in 45. And so on for each character.
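For a short stretch of text the chance can be written out exactly. The sketch below assumes, as the argument does, that every keystroke is an independent and equally likely pick from the 45 characters; `prefix_probability` is a made-up helper name.

```python
from fractions import Fraction

# 26 letters + 10 numbers + 1 space + 8 punctuation marks, as counted above.
ALPHABET_SIZE = 26 + 10 + 1 + 8


def prefix_probability(text, alphabet_size=ALPHABET_SIZE):
    """Chance that uniformly random keystrokes reproduce `text` exactly."""
    return Fraction(1, alphabet_size) ** len(text)


p_even = prefix_probability("even")
print(p_even)  # prints 1/4100625
```

So the first word alone is already a one-in-four-million shot, and every further character divides the chance by another 45.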

What are the chances of your monkey typing, in order, all the characters? It’s the probability of typing the first correctly, times the probability of typing the second correctly, and so on. This is

    Prob = (1/45)^(6,000,000) ~ 2^(6,000,000) x 100^(-6,000,000)

where I have approximated 1/45 by 2/100, which is close enough. We can write that better using logs (base 10), because

    2^(6,000,000) x 100^(-6,000,000) = 10^(6,000,000 x (log(2) - 2)) ~ 10^(-10,000,000)

which is an awfully small number (log(2) = 0.3, so the exponent is 6,000,000 x -1.7, or about -10 million). This is a 1 divided by a 1 followed by 10 million zeros.

Keep in mind that a googol (no, the other one) is defined as 10^100, which is just plain large. Our number (rather, its inverse) is much, much, much bigger. (However, its inverse is still smaller than a googolplex.)

The universe is roughly 14 billion years old, which is about 4.4 x 10^17 seconds. It now becomes tricky. Do we only accept a monkey’s efforts that are of the correct length, which we then compare with the Bard? Or, more fairly, do we throw out any stream of characters as soon as it fails to match Shakespeare? Do we, that is, let the monkey continuously start over until he gets it right?

Well, it just doesn’t matter. The number 10^(-10,000,000) is so mind-bogglingly small that it is never going to happen. Even if we let a barrelful of monkeys type 100 characters a second, they are never going to finish.
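A number this small underflows ordinary floating point, so any check has to be done in logs. The sketch below works from log10(45) directly instead of the 2/100 shortcut, so the exponent it reports will not match the rounded figure in the text exactly. The 100 characters a second is taken to be per monkey, which is an assumption, and counting every keystroke as the start of a fresh attempt is a deliberately generous upper bound.

```python
import math

ALPHABET_SIZE = 45             # characters in the article's alphabet
N_CHARS = 6_000_000            # rounded length of the complete works
UNIVERSE_AGE_SECONDS = 4.4e17  # age of the universe, as quoted
MONKEYS = 11                   # one barrelful, as quoted
CHARS_PER_SECOND = 100         # assumed here to be the rate per monkey

# log10 of the chance that one run of N_CHARS random keystrokes is a perfect copy.
log10_p = -N_CHARS * math.log10(ALPHABET_SIZE)

# Generous upper bound on attempts: pretend every keystroke starts a fresh one.
keystrokes = UNIVERSE_AGE_SECONDS * MONKEYS * CHARS_PER_SECOND
log10_attempts = math.log10(keystrokes)

# Expected successes are at most attempts x p, so add the logs.
log10_expected_successes = log10_attempts + log10_p

print(f"log10(probability)        ~ {log10_p:,.0f}")
print(f"log10(attempts available) ~ {log10_attempts:.1f}")
print(f"log10(expected successes) ~ {log10_expected_successes:,.0f}")
```

The attempts add about 21 to an exponent that is millions below zero, which is why the choice of start-over rule makes no practical difference.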

And so we conclude what we already knew: randomness isn’t enough to make a Shakespeare; something more is needed.

———————————————————————————

Notes

I assumed that the order of Shakespeare’s works matters: that is, the monkey first has to type Venus and Adonis before moving on to a sonnet. If order doesn’t matter, then the chance the monkey reproduces everything increases, by about 100 times.

I also ignored all the other keys on a standard keyboard: including them drops the chance of duplication to even smaller levels.
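To see how much the extra keys hurt, the same log calculation can be run for larger keyboards. Only the 45-character case comes from the text; the other two key counts are hypothetical examples chosen for illustration.

```python
import math

N_CHARS = 6_000_000   # rounded character count from the article

# Hypothetical key counts; only the first comes from the article.
KEYBOARDS = {
    "article's 45 characters": 45,
    "45 plus 26 capitals": 71,
    "95 printable ASCII characters": 95,
}


def log10_probability(n_chars, n_keys):
    """log10 of (1/n_keys)^n_chars for uniformly random keystrokes."""
    return -n_chars * math.log10(n_keys)


exponents = {name: log10_probability(N_CHARS, k) for name, k in KEYBOARDS.items()}
for name, exponent in exponents.items():
    print(f"{name}: about 10^({exponent:,.0f})")
```

Each added key lowers the exponent further, by 6 million times the increase in the log of the key count.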

If you find any errors in calculations or logic or whatever, please email them to obama@whitehouse.gov.

Update: Fixed thanks to Charles’s suggestion. Another way to think of it is that a space adds to the average word length by one.

30 Comments

  1. Dr. Charles G. Waugh

    20 May 2010 at 7:17 am

    Here is a relevant excerpt from The Wicked-Good Dictionary of Media and Communication
    we (Charles G. Waugh, Sandra Sibert, and Joseph Leff) will be publishing in the fall.

    THE INFINITE MONKEY THEOREMS:

    The Infinite Monkey Theorem: If an army of monkeys
    were strumming on typewriters, they might write all the
    books in the British Museum. (Based on a statement by
    Sir Arthur Eddington.)

    The Infinite Monkey Theorem Attempted: I heard
    someone tried the monkeys-on-typewriter bit trying for
    the plays of W. Shakespeare, but all they got was the
    collected works of Francis Bacon. (Based on a
    statement by Bill Hirsh.)

    Dyslexic Infinite Monkey Theorem: I heard that if you
    locked William Shakespeare in a room with a typewriter
    for long enough, he’d eventually write all the songs of the
    Monkees. (Based on a statement by an unknown
    source.)

    The Infinite Monkey Theorem Refuted: We’ve all
    heard that a million monkeys at a million keyboards could
    produce the complete works of Shakespeare; now,
    thanks to the Internet, we know that is not true. (Based
    on a statement by Robert Wilensky.)

    The Infinite Monkey Theorem Revised: If an infinite
    number of rednecks riding in an infinite number of pickup
    trucks fire an infinite number of shotgun rounds at an
    infinite number of highway signs, they will eventually
    produce all the world’s great literary works in Braille.
    (Based on a statement from http://www.jokecenter.com/
    jokes/Education/ 1563.htm.)

  2. Dear Mr Obama, You’re not doing it right! Make a call to the NBC people downstairs and ask for TRAINED MONKEYS!

  3. Ah, but it would take but one monkey if it graduated or made it look like it graduated from Harvard!

  4. Briggs,

    Did you skip your coffee this morning? Your math skills aren’t up to your usual standards.

    I think you meant:

    #characters = (#words x avg. word length + #spaces) x punctuation factor

    including the spaces does not double the number of characters – it adds about one per word.

    and

    Prob = (1/45^(10,000,000)) ~ 2^(10,000,000) x 100^(-10,000,000)

    that’s a big difference. Not enough to change your conclusion but still …

  5. Typing?!

    “Girls, we’ll go see the movie Iron Man 2 after you finish typing your book reports due on Monday.”

    “Let’s go to the movie first. Don’t worry, Mom, with Ctrl-C and Ctrl-V and the internet, we can finish it within 5 minutes.”

  6. But what if we bred (or trained, but breeding sounds like more fun) monkeys to prefer to type the more common characters? Would it then become possible to fit enough monkeys and typewriters into the universe to have a reasonable chance of finishing the works before THE END?

  7. Fred went over it again, this time he said that punctuation AND capitalization don’t count.

    http://www.fredoneverything.net/Evolution.shtml

  8. Briggs

    20 May 2010 at 10:01 am

    Charles,

    Damn your eyes!

    Charles, MD,

    I like the internet bit.

  9. There is a macaque at the door who wants to discuss the script to Hamlet.

  10. Darwinists would say that Shakespeare was in fact a slightly evolved monkey, but a monkey just the same, and so the probability is 1.

  11. Jerry:
    As I understand the latest research there is more than one author of the works ascribed to “Shakespeare”. On the other hand, it is hard to disagree with your fundamental biological assertion, given any reasonable definition of monkey.

  12. Dear Mr Briggs, my barrelful of monkeys typed the entire script to Zaphod Beeblebrox’s second porno movie that he made to finance his run for President of the Universe. What am I doing wrong?

  13. The odds on the monkeys are higher than Reid, Pelosi and Obama coming up with a comprehensible law.

  14. “The odds on the monkeys are higher than Reid, Pelosi and Obama coming up with a comprehensible law.”

    Which is strange, because the Patient Protection and Affordable Care Act is only 384,000 words long. I mean, that’s less than half the collected works of Shakespeare.

  15. Mogan,

    You may be onto something. A genius like Shakespeare was able to write a couple of hundred pages of poetry in a year. A modern author is no more prolific than Shakespeare was. How many professional writers produce more than one book every 18 months?

    A congressional committee tries to write a couple of thousand pages in a few weeks. No wonder it is garbage!

  16. These days I am told it is ‘Kitman’s Law’, but I remember it as something else, but hey –

    Pure Drivel drives away ordinary drivel. Gresham would be proud.

  17. The chances of Shakespeare penning rather than writing are about a shakespillion to one.

    Bernie,

    You believe the lies about Shakespeare being written by someone else? I am surprised.

  18. A monkey by any other name would still smell like a monkey.

  19. “If that is, we knew how many monkeys would fit in a standard barrel. In experiments conducted by your author, I can tell you the answer is eleven, but you have to press hard.”

    Is this an imperial barrel or a metric barrel?

  20. Until the age of the internet it was simply too difficult to coordinate the work of your monkeys in order to test the math empirically. After all, how could you tell if monkey 7,432’s page fourteen of Hamlet legitimately comes before monkey 3,914,648’s page 13 of Hamlet, or if it actually came right before monkey 17’s page 9 of Midsummer Night’s Dream?

    Thanks to the good folks at the IETF we now have rfc2795: The Infinite Monkey Protocol Suite (IMPS) to ensure efficient monkey-typing organization.

    http://tools.ietf.org/html/rfc2795

  21. I think the proposal of the Infinite monkeys idea to support evolution is badly put. After all, there is no definite outcome to evolution, so saying, “There y’go chimps, produce Shakespeare’s works” is comparing apples and something else that is not an apple.

    If we said, “There y’go chimps, produce something recognizable as a work of literature” then what would we be prepared to accept as a hit? How about the works of Shakespeare with all the e’s replaced by o’s? After all, some of evolution’s products are less than perfect. (No names, no pack drill).

    Or the works of Edgar Allan Poe? In French? Backwards? Or Caesar’s “Gallic War” in Latin encrypted with the Caesar cypher? Or all the works of William McGonagall translated into every known language? (The chimps would have produced a handy reference card at the end showing how foreign scripts had been encoded into the Roman alphabet). Or into a thousand languages that don’t exist?

    Once we hold any of these in our hands we’re going to be amazed. Of course, calculating the probability of a chimp product that will satisfy these wider criteria is harder, maybe impossible. You must first imagine every possible work of literature whether it presently exists or not in any language in every possible form. OK, impossible.

    Isn’t it a bit like being stunned and amazed every week that someone wins the lottery?

  22. Briggs

    21 May 2010 at 8:06 am

    Rich,

    Excellent point.

    The odds of winning the Mega Millions lottery are 1 in 175,711,536 (I copied this because I was too lazy to calculate; it doesn’t matter if it’s correct). Given this info, the probability that you win is 1 in 175,711,536, or p = 5.691146e-09.

    But the probability that there is at least one winner is much higher, though it depends on how many people buy tickets. Let that number be n. Then the probability (given our evidence) of at least one winner is Q = 1 - (1 - p)^n.

    If just you bought a ticket, Q = p. If 1,000 bought tickets, then Q = 5.69113e-06; 1 million buyers pushes Q to 0.005674982. Ten million makes it 0.055. The actual number of tickets bought is somewhere in that 1-10 million range.

    Now, somebody doesn’t win the lottery every week (there are two drawings a week), but only every so often; it certainly isn’t a rarity, as you suggest.

    Given our information, if 1 million buy tickets for each drawing, the probability of at least one winner in 1 week is 0.011. In five weeks, it is 0.055.

    If 10 million buy tickets for each drawing, the probability of at least one winner in 1 week is 0.108. In five weeks, it is 0.43.

    You get the idea: rare events like hitting the lottery aren’t that rare when they are given multiple chances of occurring.

    Same with evolution.
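The lottery figures in the comment above can be reproduced with a few lines. This is a sketch under the same assumptions the comment makes: every ticket is an independent chance at the quoted odds, and there are two drawings a week. The function name is invented for illustration.

```python
import math

P_WIN = 1 / 175_711_536   # single-ticket chance quoted in the comment


def prob_at_least_one_winner(n_tickets, n_drawings=1, p=P_WIN):
    """Q = 1 - (1 - p)^(n_tickets * n_drawings), assuming independent tickets."""
    trials = n_tickets * n_drawings
    # log1p and expm1 keep precision when p is tiny.
    return -math.expm1(trials * math.log1p(-p))


for tickets in (1_000_000, 10_000_000):
    one_drawing = prob_at_least_one_winner(tickets)
    one_week = prob_at_least_one_winner(tickets, n_drawings=2)
    five_weeks = prob_at_least_one_winner(tickets, n_drawings=10)
    print(f"{tickets:>10,} tickets: {one_drawing:.4f} per drawing, "
          f"{one_week:.3f} in one week, {five_weeks:.3f} in five weeks")
```

The probability climbs quickly with the number of drawings, which is the point being made about repeated chances.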

  23. “Isn’t it a bit like being stunned and amazed every week that someone wins the lottery?”
    “Isn’t it” what? Are you talking about Shakespeare or the chances of a monkey typing it?

    Yawn, or the chances of my breathing a said molecule given the number on the planet.
    That’s silly, the universe is not a casino. Even the casino is not a casino!

    When you dress up the argument falsely by implying that all have an equal chance, invoke infinity, you can make anything seem possible. Mathematicians do it all the time. We now are subjected to their meddling through the actions of politics with perverted science as proof for anything. What will happen when the public, who are not dumb but trusting, realise what has gone on.

    The self proclaimed scientist or mathematicians have a duty of care to speak out when nonsense is dressed up as highly complex truth.

    “Let every eye negotiate for itself and trust no agent.”

  24. Joy, I’m sorry but I didn’t understand your comment.

    But to answer your question, I’m talking about being amazed that the particular forms of life we see around us have arisen by chance when, for all we know, some form or other was inevitable. Or not even inevitable. As Briggs just pointed out, that there should be a lottery winner each week is not inevitable, just quite likely.

    If you don’t decide in advance what forms life will take then the chance that some form of life or other will appear may not be so unlikely even though the forms we actually do see are extraordinarily unlikely.

    But we don’t actually have a clue.

  25. Briggs

    21 May 2010 at 10:18 am

    Rich,

    The difficulty with evolution-probability questions is that they are nearly always ill-posed. As we know, there is no such thing as unconditional probability. The questions “What’s the probability this certain life form develops?”, or even “What’s the probability of life self-organizing?” are ill posed. They cannot be answered.

    Not without first specifying the conditions on which they are based; that is, their evidence or premises must be spelled out exactly.

    Any question of abiogenesis is ill-posed unless you can tell me exactly what are all the possibilities. Do we consider only carbon-based life? And that only comprised of a certain set of amino acids? Or do we also allow silicon-based life forms? Energy-based? And what do those mean?

    Actually, what we find—you won’t be surprised—is that most people talk nonsense on this subject.

    That guy Fred is right: what is often used as evidence is possibility. And what is possibility but unquantified probability? Perfectly respectable to leave probabilities vague, but that does not excuse you from specifying the evidence, premises, conditions. If you can’t do that, you can’t talk about possibility.

  26. Well quite.

  27. It would be interesting to throw in some variables like nature. Monkeys are rewarded with food or the ability to mate for pressing the right keys, or they are harmed or denied food for pressing the wrong keys. Add in some growth factor: say, start out simple, the challenge being one letter, then one word. When they meet that challenge, their knowledge grows.

    Eventually, you could have a single monkey write all the sonnets of Shakespeare. This monkey:
    http://dspace.dial.pipex.com/town/parade/abj76/PG/images/william_shakespeare.gif

  28. The monkeys might produce the works of Shakespeare, but backwards. An ordinary reader of English might spot it quickly on casual inspection. Or, perhaps they could type it out in hexadecimal code. Do these count? I agree that the probability of a random process producing something large and humanly recognizable is very low, but I think the puzzle needs better definitions. Perhaps thermodynamics and information theory could be the source of those definitions.

  29. Rich,
    A bit late sorry, but were you trying to say sweetly “Joy you’re not making sense”, never mind, it was clear as I could muster. I was joking about the casino.

    There are always barrel loads of scientist types waiting to muddy the waters of clear understanding with less than honest motives. Sometimes there’s no motive at all but to play with their ‘tools’ and ‘models’; all the gear, no idea.

    Not only Shakespeare, but I had to speak firmly to a lion-maned marmoset yesterday who took to my cashmere gassato cardy.

    I’m waiting for the sonnet, I miss him already.


© 2014 William M. Briggs
