# What Is The Probability Of COVFEFE?

From a tweet from Taleb, who informs us the following question is part of the Indian Statistical Institute examination.

(5) Mr.Trump decides to post a random message on Facebook and he starts typing a random sequence of letters $\{U_k\}_{k \geq 1}$ such that they are chosen independently and uniformly from the 26 possible english alphabets. Find out the expected time of the first appearance of the word COVFEFE.

Now it is too good to check whether this is really used by the ISI, but I hope it is. It is too delicious. (Yes, it was Twitter, not Facebook.)

Regular readers will recall we had a Covfefe Sing-along after Trump’s masterly tweet.

The night Donald Trump took to Twitter

Elites had a terrible fit

Trump warned the world of covfefe

And Tweet streams were filled up with sh——Shaving cream.

Be nice and clean.

Shave everyday and

you’ll always look keen…

The ISI’s COVFEFE problem has much to recommend it, because it is chock full of the language of modern probability that is so confusing. (Even my title misleads! Nothing “has” a probability!)

Now I learned my math from physicists, who do things to equations that make mathematicians shudder, but which are moves that are at least an attempt to hew to reality. There isn’t anything wrong with mathematician math, but the temptation to the Deadly Sin of Reification can be overwhelming. And why all those curly brackets? They intimidate.

I still recall in a math course struggling with some higher-order proofs from Billingsley (a standard work on mathematical probability) when a Russian mathematician made everything snap into clarity when he told me X, the standard notation for a “random variable” which all the books said “had” a distribution, “was a function”, whereas as a physicist I always saw it as an observable or proposition. It can, of course, be both, but if you ever want to *apply* the math, it *is* a proposition.

So here is Trump typing. What does it mean—think like a physicist and not a mathematician—to “independently and uniformly” choose letters? *To choose* requires a method of choosing. Some thing or things are *causing* the characters to appear on the screen. What? Trump closing his eyes and smacking his hands into the keys? Maybe. But, if so, then we have no hope of identifying the causes of what appears. If we don’t know the causes, we can’t answer how long it will take. We can’t solve the problem.

Enter probability, which can’t answer the question, but can answer similar ones, like “Given certain assumptions, what are the chances it takes X seconds?”

Since all probability is conditional on the assumptions made, the assumptions matter. What are they?

Choosing letters “independently” is causal language. “Uniformly” insists the probability of every letter being typed is equal, a circular definition, since what we want to know is the probability. Say instead “There are 26 letters, one of which must be typed once per time unit t, where knowledge of the letters typed previously tells us nothing about letters to be typed.”

Since COVFEFE (we’re working with all caps via the information given) is 7 letters, we want to characterize the *uncertainty in* the total time it takes to type this sequence.

Do we have all we need? Not quite. Again, think like a physicist and not a mathematician. How long is Trump going to sit at the computer? (Or play with his Portable Thinking Suppression Device (PTSD)?) It can’t be forever. That means there should be a chance we never see COVFEFE. On the other hand, if we assume Trump types forever, then it is obvious that not only must COVFEFE appear, but it must appear an infinite number of times!

Indeed, if we allow the *mathematical* possibility of eternal typing, not only will COVFEFE appear in infinite plenitude, Trump will also type the entire works of Shakespeare, not just once, but also an infinite number of times. And the entire corpus of all works that can be typed in 26 letters *sans* spacing. Trump’s a genius!

Well that escalated quickly. That’s because The Limit is a bizarre place. Our intuition breaks down.

We still have to decide how fast Trump can type. Maybe two to five letters per second, but not faster than that. But that’s the physicist in me speaking. Keyboards and fingers can’t be engineered for infinitely fast typing. A mathematician might allow one character per infinitesimal time unit. If so, we have another infinity that has crept in. If one infinity was weird, try mixing two.

Point is, since probability needs assumptions, we need to make explicit all of them. The problem doesn’t do that. We have to bring our knowledge of English grammar to bear, which we always do, and which is part of the conditions. It will be no surprise people can come to different answers.

**Homework**: Assume finite time in which to type, and discrete positive real time to type each letter; assume also the simple characters proposition I gave and then calculate the probability of COVFEFE *at* t = 0, 1, 2, … n typing time units (notice this *adds* the assumption that letters come regularly with no variation, another mathematical, non-physical assumption). And then calculate the first appearance *by* t = 0, 1, 2, … n. Then calculate the expected value (is it even interesting?). After you have that, what happens if n goes to infinity? (Is that even interesting?) And can you *also* have the time unit decrease to the infinitesimal?

*Hint*. The probability of seeing COVFEFE and not seeing COVFEFE must sum to 1. If n = 1, the (conditional on all these assumptions) probability of COVFEFE is 0, and not-COVFEFE is 1. Same with n = 2, 3, 4, 5, and 6. What about n = 7? And so on?
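For those who would rather let a machine do the counting, here is a minimal sketch of the "first appearance *by* t" part of the homework. It assumes exactly what was stated above: 26 letters, one per time unit, each with conditional probability 1/26 whatever came before. The names `build_transitions` and `prob_seen_by` are mine, invented for illustration, and the bookkeeping is a standard pattern-matching automaton, not anything the ISI prescribes.

```python
from fractions import Fraction

WORD = "COVFEFE"
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def build_transitions(word, alphabet):
    """State k means the last k letters typed match word[:k].

    table[k][ch] is the state reached after typing ch from state k.
    """
    m = len(word)
    table = []
    for k in range(m):
        row = {}
        for ch in alphabet:
            candidate = word[:k] + ch
            # Longest suffix of candidate that is also a prefix of word.
            j = min(m, len(candidate))
            while j > 0 and candidate[-j:] != word[:j]:
                j -= 1
            row[ch] = j
        table.append(row)
    return table


def prob_seen_by(n, word=WORD, alphabet=ALPHABET):
    """Return [P(word has appeared by time t) for t = 0..n] as exact fractions."""
    m = len(word)
    table = build_transitions(word, alphabet)
    p = Fraction(1, len(alphabet))  # assumed: every letter equally likely
    state = [Fraction(0)] * (m + 1)
    state[0] = Fraction(1)          # nothing typed yet
    out = [state[m]]
    for _ in range(n):
        nxt = [Fraction(0)] * (m + 1)
        nxt[m] = state[m]           # once seen, always seen
        for k in range(m):
            if state[k] == 0:
                continue
            for ch in alphabet:
                nxt[table[k][ch]] += state[k] * p
        state = nxt
        out.append(state[m])
    return out


if __name__ == "__main__":
    for t, prob in enumerate(prob_seen_by(10)):
        print(t, prob, float(prob))
```

Under these assumptions the list is 0 through t = 6, equals 1/26^7 at t = 7, and creeps upward from there; the probability of not-COVFEFE by t is one minus each entry.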

Questions phrased like this seem to me indicative of a very deep rot in our academic institutions. What I really want to know, is what are these “26 possible english alphabets”? The one with Æ? Or maybe the one where lowercase s is ? ?

Hmm. WordPress didn’t seem to like the long s. Unicode U+017F 🙂

Surely Trump was *aiming* for “coverage”, so a proper calculation would be, given assumptions about how frequently he makes typos, the probability that he errs 5 times: typing “fefe” instead of “erag” (that’s 4 letters), plus hitting “Send” rather than continuing.

Given this, we could narrow it down to the lower probability of typing specifically “covfefe”, or just of typing any errors in that specific word that wouldn’t be automatically corrected by the phone (plus hitting “Send”). I can’t imagine “covfega” or whatever would be much less popular, though I think it’d be less funny.

Obviously, it’s better to give higher probabilities to mistakes that are closer to the intended letters in the QWERTY keyboard. But that’s getting overly serious about it.

Given the 26 letters of the English alphabet (all caps, no spaces), and assuming that the letters were truly typed randomly (with no fixed goal on the part of the typist), one per unit of time t (undefined, but real, positive, and capable of being produced by human thought and reflexes), then the probability of COVFEFE at time t is 0 for t < 7 and unknown for t > 6 (as each key stroke is random and independent of the previous history – our typist, unknown to us, could be holding down the ‘a’ key on repeat until the heat death of the universe) until COVFEFE is actually spelled out, which can only be checked by examination. After that point, the probability of COVFEFE appearing in the stream is 1, which is uninteresting. If the time represented by t is taken to the infinitesimal, then t = infinity, which theoretically assures COVFEFE, happens in no practical time, and is uninteresting.

However – we know that the actual probability of COVFEFE is 1 at t=7 by examination, because it already happened.

Yes, I know that’s not how the probability equations work. It’s funnier that way.

Expected time for the appearance of COVFEFE

26^7 delta_t = 8,031,810,176 delta_t

delta_t: typing time unit
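The 26^7 figure agrees with the standard overlap rule for pattern waiting times: with equally likely, independent letters and no stopping, the expected wait is the sum of 26^k over every length k at which the word's first k letters equal its last k letters. COVFEFE starts with C and ends with E, so only the full length k = 7 qualifies. A minimal sketch, with the function name `expected_wait` made up for illustration:

```python
def expected_wait(word, alphabet_size=26):
    """Expected number of letters typed until `word` first appears.

    Assumes independent, equally likely letters and unlimited typing.
    Sums alphabet_size**k over each k where the length-k prefix of
    `word` equals its length-k suffix (k = len(word) always counts).
    """
    m = len(word)
    return sum(
        alphabet_size ** k
        for k in range(1, m + 1)
        if word[:k] == word[-k:]
    )


if __name__ == "__main__":
    print(expected_wait("COVFEFE"))      # in units of delta_t
    print(expected_wait("ABRACADABRA"))  # a word that overlaps itself
```

A word that overlaps itself, such as ABRACADABRA, picks up extra terms (26^11 + 26^4 + 26), so its expected wait is longer than 26^11 alone.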