More On Monkeys Typing Shakespeare & Why Chance Isn’t Enough

Way back in 2010, I did the Infinite Monkeys theorem, which says an infinite number of monkeys unintelligently whacking away at typewriters would eventually reproduce Shakespeare. But there aren’t an infinite number of monkeys, or an infinite number of anything, so it’s more fun to do the problem with a finite number.

In that, we learned “According to Bennett, Briggs (no relation), and Triola, Shakespeare penned 884,647 words”, which isn’t that many, in the scheme of things. I estimated that accounted for about 6 million characters (letters and punctuation, from a set of 45 possibilities). This gives the probability a single monkey, hitting a keyboard 6 million times, reproduces Shakespeare at about 2 x 10^{-6,000,000}. That is, 2 divided by 1 followed by 6 million zeros.

A single monkey therefore likely won’t do it. But neither will any finite number available to us. Too many characters, not enough time.

Well, now the problem is official, because the New York Times recently wrote an article about a new paper which finds more or less the same as I did. The paper is “A numerical evaluation of the Finite Monkeys Theorem” by Stephen Woodcock and Jay Falleta in Franklin Open (thanks to JC for the tip).

I’m delighted to report the authors have a sense of humor (how rare!) and open their paper with the Bard’s own words: “Alas, poor ape, how thou sweat’st!”. They also have occasion to quote from (the original) Planet of the Apes, which is always a wise move. They come to the probability 10^{-7,448,357}, which is smaller than mine, but they’re more careful on the precise number of characters, and their probability is for one monkey working constantly over his entire life.

Then it struck me that words are not characters. Monkeys think in images, phantasms, which is to say, something like whole words. What if we allow them to have a go using whole words and not characters? And we can also account for the order of Shakespeare’s works, which we don’t care about. That is, if a monkey whacked out Hamlet first and Henry VI last, we wouldn’t care. We also want to give our monkeys a break and have them go at it only during working hours.

Again, there are 884,647 words, and these are from a “dictionary”, meaning first the list of words in Shakespeare’s head, including any neologisms. One source says there are 28,829 unique words in that set.

Maybe we can get monkeys to manipulate the words like ideograms in a Chinese-room-type experiment. That way we can ignore punctuation and other niceties. After all, if monkeys can type on a pre-set selection of characters, they can also type by hitting pictures of words. Be a big keyboard though. But if we’re having monkeys trying to reproduce “the pearl of English literature”, an expense like that is no bother.

It’s simpler now. Chance of getting the first word right, given the new information, is 1/28,829, and so the chance of getting the entire corpus right (and in the right order) on a single “try” of hitting the ideographic keyboard 884,647 times is

$$\left(\frac{1}{28829}\right)^{884647} \approx 10^{-3945375}.$$
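The exponent is easy to check if you work in logarithms, since the probability itself underflows any floating-point number. A minimal Python sketch, using only the two figures quoted above (the function name `log10_prob` is mine, purely for illustration):

```python
import math

def log10_prob(vocab_size: int, n_words: int) -> float:
    """log10 of (1/vocab_size)**n_words: one perfect try, every word right."""
    return -n_words * math.log10(vocab_size)

# Figures quoted in the post: 28,829 unique words, 884,647 words in total.
exponent = log10_prob(28_829, 884_647)
print(f"chance of one perfect try is about 10^{exponent:.0f}")
```

Working with the exponent directly is the only practical choice here; `(1 / 28829) ** 884647` simply evaluates to `0.0`.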

Call it 10 to the minus 4 million to account for bad-banana induced mishits. At one key a second, that’s 10.23 days for a try, or 245.7 hours. But that’s having a monkey go at it without cease. Let’s ease up and see what happens.

A work day is, or used to be, 8 hours, with 5 work days a week. Each month has about 4.2 work weeks, and 8 hours a day gives 168 hours a month. So it turns out if we can push the monkeys to have 12 hour days, with weekends and nights off and 15 minute lunches, we can get one try a month out of each monkey. Pretty good! If our monkeys can’t do it, we can use H1-Bs and import some from India—I understand they have a surplus. American monkeys don’t want this kind of work anyway.
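The scheduling arithmetic can be sketched in a few lines of Python. The constant and function names are mine, and I am assuming the 15 minute lunch comes out of the 12 hour day rather than being added to it:

```python
KEYS_PER_TRY = 884_647        # one key per word, one key per second
WORK_DAYS_PER_WEEK = 5
WORK_WEEKS_PER_MONTH = 4.2    # the rough figure used in the post

hours_per_try = KEYS_PER_TRY / 3600       # seconds -> hours
days_per_try = hours_per_try / 24         # if the monkey never stops

def monthly_hours(hours_per_day: float, lunch_hours: float = 0.0) -> float:
    """Typing hours available in a month (assumes lunch is taken out of the day)."""
    return (hours_per_day - lunch_hours) * WORK_DAYS_PER_WEEK * WORK_WEEKS_PER_MONTH

print(f"one try: {hours_per_try:.1f} hours, or {days_per_try:.2f} days non-stop")
print(f"8-hour days:  {monthly_hours(8):.2f} hours a month")
print(f"12-hour days: {monthly_hours(12, 0.25):.2f} hours a month")
print("one try a month fits:", monthly_hours(12, 0.25) >= hours_per_try)
```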

One count puts the number of Indian monkeys at a million, with another big chunk of langurs. If we wanted langurs we’d use Harvard graduates, and we don’t want them. It’s monkeys or bust. With current “immigration” rates, most of the monkeys will be in Canada in about 10 years (hitching rides in luggage, etc.), and at least half will have crossed the southern border during that time. We can make up the other half million by plucking them from the caravans coming north from the Global South.

So we have a million monkeys to work with. A million monkeys can thus do 12 million tries a year. How many tries are necessary? Well, as many as it takes to reproduce Shakespeare. We’ve seen the probability of 1 try. What’s the chance of at least one “hit” in 12 million tries? That’s easy. From the binomial, it’s

$$1 - (1-p)^n$$

where p = 10^{-3945375} and n = 12 million. Don’t bother punching it into your calculator. Too big. But we can ask how big n has to be such that the probability of a match is greater than a half. First let q = 3945375. Then to good approximation (set the previous equation to be greater than 1/2, realize 1 - p = 10^a/10^q for some a very close to but smaller than q, take logs, notice $a-q<0$ with $q-a \approx 10^{-q}/\ln(10)$, and solve for n)

$$n> \ln(2)\,10^{3945375}.$$
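Here is a sketch of the same calculation in log space. It uses the standard small-p approximation -ln(1 - p) ≈ p, under which the break-even number of tries is ln(2)/p; whatever the leading constant, only the exponent matters at this scale. The function names are mine, and the direct check uses a stand-in p of one in a million, because the real p cannot be represented as a float:

```python
import math

def prob_at_least_one(p: float, n: int) -> float:
    """1 - (1 - p)**n, computed stably for small p."""
    return -math.expm1(n * math.log1p(-p))

def log10_tries_for_half(q: float) -> float:
    """log10 of the n where the hit probability reaches 1/2, for p = 10**-q.

    Assumes p is tiny, so that -ln(1 - p) is essentially p and n = ln(2) / p.
    """
    return q + math.log10(math.log(2))

# Direct check with a stand-in p that a float can hold.
p_small = 1e-6
n_small = math.ceil(math.log(2) / p_small)
print(n_small, prob_at_least_one(p_small, n_small))

# The monkeys: tries needed versus tries available in a year, both as log10.
print(log10_tries_for_half(3_945_375))
print(math.log10(12_000_000))
```

The last two lines make the gap plain: the exponent needed is in the millions, the exponent available is about 7.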

Big number. Still not going to get there. There aren’t enough monkeys in the world, or time enough.

Yet all this assumes that all of Shakespeare’s works must not only be reproduced, but reproduced in order. According to Wokepedia, our man wrote 39 plays and 154 sonnets and some other poems, all of which go into making up the 884,647 words.

Presumably somebody has the word count for each of these works. I don’t, so I’ll approximate, though it’ll be easy to see how to reproduce what comes next using the proper numbers. One source says sonnets run an average 150 words. Let that be. That gives 150 x 154 = 23,100 words for the sonnets, with 884,647 – 23,100 = 861,547 words left over for the plays. I know they’re not equal length, but let’s here suppose they are. That’s 22,091 words per play, on average.

We can use the same math as above to see that the chance of reproducing a (single) sonnet is

$$\left(\frac{1}{28829}\right)^{150} \approx 10^{-669}.$$

Do the same kind of thing for each play, and the probability of matching one in a try is

$$\left(\frac{1}{28829}\right)^{22091} \approx 10^{-98522}.$$
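For anyone who wants to redo this with proper word counts, the bookkeeping can be sketched as follows. The constants are the rough figures from the post (150 words per sonnet, equal-length plays), and the names are mine:

```python
import math

VOCAB = 28_829
TOTAL_WORDS = 884_647
N_SONNETS, WORDS_PER_SONNET = 154, 150   # 150 is the rough average quoted above
N_PLAYS = 39

sonnet_words = N_SONNETS * WORDS_PER_SONNET
words_per_play = round((TOTAL_WORDS - sonnet_words) / N_PLAYS)  # equal-length assumption

def log10_prob(n_words: int) -> float:
    """log10 of the chance of typing n_words specific words in a row."""
    return -n_words * math.log10(VOCAB)

print("words in sonnets:", sonnet_words)
print("words per play:  ", words_per_play)
print("one sonnet: 10^%d" % round(log10_prob(WORDS_PER_SONNET)))
print("one play:   10^%d" % round(log10_prob(words_per_play)))
```

Swapping in real per-work word counts only means calling `log10_prob` once per work and adding up the results.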

With me?

Since there are 154 sonnets, and we know the chance of getting one right in a try, and we don’t care about the order, the chance of getting all sonnets right, in any order, is

$$10^{-669\times 154}\times 154! \approx 10^{-102755},$$

where there are 154! ways to arrange the sonnets, each of which counts as a success. (154! = 154 x 153 x … x 1, and I used Stirling’s approximation to get a value of about 10^{271}.)

We can do the same for the plays, which gives

$$10^{-98522\times 39}\times 39! \approx 10^{-3842312}.$$

And, of course, we must get both sonnets and plays. There are two ways to arrange these, a factor of 2 that is lost in the rounding, which gives at last

$$10^{-3842312}\times 10^{-102755} \approx 10^{-3945067}.$$

Notice that this probability, which ignores the order, is larger than the original; i.e.

$$ 10^{-3945067} > 10^{-3945375},$$

where there is a roughly 10^{308} times bigger chance. Itself quite a number, but in the scheme of things, it doesn’t help much.
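A sketch for checking an order-free calculation of this kind. It takes the 154 sonnets and 39 plays counted earlier, uses the per-work exponents -669 and -98522, and assumes the works within each group are distinct and of equal length, so that each of the n! orderings is a separate way to succeed. `math.lgamma` gives the log-factorial without building the huge integer; the function names are mine. Rounding along the way can shift the last digit or two of the exponents.

```python
import math

def log10_factorial(n: int) -> float:
    """log10(n!) via the log-gamma function."""
    return math.lgamma(n + 1) / math.log(10)

def log10_any_order(log10_p_each: float, n_works: int) -> float:
    """log10 of n! * p**n: all n works reproduced, in any order.

    Assumes the works are distinct and equally long, so every ordering
    is a different, equally likely way of succeeding.
    """
    return n_works * log10_p_each + log10_factorial(n_works)

sonnets = log10_any_order(-669, 154)
plays = log10_any_order(-98_522, 39)

print(f"154! is about 10^{log10_factorial(154):.1f}")
print(f"39!  is about 10^{log10_factorial(39):.1f}")
print(f"all sonnets, any order: 10^{sonnets:.1f}")
print(f"all plays, any order:   10^{plays:.1f}")
print(f"both together:          10^{sonnets + plays:.1f}")
```

The factorials add a few hundred to an exponent that is millions deep, which is why dropping the ordering requirement helps so little.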

The Real Chance

Every calculation so far hinged on two numbers, the word count of Shakespeare’s oeuvre and the size of his dictionary. Which is to say, the limited set of words he chose to use, or coin. That is the most important number of all, because if we were to ask not about monkeys, but of men, we would not use the same dictionary, but a bigger one.

The list of words of the World Dictionary which Shakespeare drew from is much larger than the 29,000 words he eventually picked. Shakespeare himself added to that World Dictionary with his coinages. Which argues that the size of the WD is, in a sense, infinite. Or at least very large, and a whole lot larger than 29,000.

The obvious conclusion is that if millions of monkeys working gazillions of years cannot produce Shakespeare, the man himself could not produce Shakespeare picking words “randomly”. Intelligence is required. Both to write and comprehend. “Random” molecules bumping into one another isn’t enough to produce this.


2 Comments

  1. Brian (bulaoren)

    How many words could Joe Biden type, coherently?

  2. I appreciate that they had a sense of humor about it but there is no “Finite Monkeys Theorem”, there’s only an “Infinite Monkeys Theorem”. Maybe I’m taking it too seriously but they open their article (which came out in December) by stating that the “Infinite Monkeys Theorem” is misleading and so here’s a debunking of the “Finite Monkeys Theorem”. It just annoyed me because they wrote an article to set up and knock down a straw man, then used it to conclude that something only tangentially related was “misleading”.

    We all understand that it’s an infinite number of monkeys and an infinite span of time, so we don’t need smarmy statisticians pointing out that there aren’t enough monkeys and there isn’t enough time.

    That’s my curmudgeonly take on it.
