Statistics

What People Think Probability Words Mean

The graph above came from GitHub and was the result of a poll of folks on Reddit. “The raw data came from /r/samplesize responses to the following question: What [probability/number] would you assign to the phrase ‘[phrase]’?”

There are various ways of plotting the results; shown at the top is just one, a smoothed “density” estimate of the responses. Another plot (at the site) uses horizontal box plots.
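For readers who want to see what a smoothed "density" estimate amounts to, here is a minimal sketch of a Gaussian kernel density estimate. It is not the code behind the graph: the function name, the bandwidth of 5 percentage points, and the responses are all invented for illustration.

```python
import math

def gaussian_kde(samples, grid, bandwidth):
    """Gaussian kernel density estimate of `samples`, evaluated at each grid point."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        for x in grid
    ]

# Hypothetical responses (in percent) to a single phrase; NOT the Reddit poll data.
responses = [5, 8, 10, 10, 12, 15, 15, 20, 25, 30]

grid = list(range(0, 101))  # evaluate at 0%, 1%, ..., 100%
density = gaussian_kde(responses, grid, bandwidth=5.0)  # bandwidth is an assumption

# The grid point where the smoothed curve is highest.
peak = grid[density.index(max(density))]
print(f"smoothed curve peaks near {peak}%")
```

Each response contributes a small bell curve centered on itself, and the estimate is their average. A larger bandwidth gives a smoother, flatter curve, so the shape of any such picture depends partly on that choice.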

Let’s think about what exactly this graph shows, and what it does not. What it does not show is what probability is. Probability is not subjective, even though the answers people give to the question vary. That requires a clarification.

We are always interested in the probability of some proposition, and since all probability is conditional, the probability of each proposition varies depending on the premises assumed. The premises are not only those that are explicit, but those that are implicit, too. If the premises include words, which I believe is always, then the implicit premises are those that carry the definition of those words, and of grammar. These grammatical and definitional premises are always there, even if they are not formally written.

So we might ask a friend, “What are the chances your team wins?” and he’ll say, “Highly unlikely.”

Now, how he came to that judgement depends on many, many premises, most of which your friend will not be able to articulate. Depending on the situation, he’ll consider some things in depth or in brief. If it were possible to lay out all the premises upon which he relies, including all those implicit ones, then we would see we could deduce from them the “highly unlikely” answer.

The premises on which your friend relies are likely to shift in time, perhaps even rapidly. Ask him the same question a minute later and he may come to a different answer, because his premises have changed. But if we were to do the impossible and extract all of them, then we could deduce whatever his answer was.

This is not in practice possible. There are just too many premises for most questions of interest, many of which are only vaguely contemplated. Thus it will appear that probability is subjective.

The same is true when we ask people what number they would assign to words like “highly unlikely”, as the poll did: “What [probability/number] would you assign to the phrase ‘[phrase]’?”

Any number of premises, pieces of evidence, and thoughts will flit through a person’s mind, perhaps including certain concrete scenarios in which the person last used a term like “highly unlikely”. Most people will scarcely be able to articulate these, and thus the number that pops out will represent some kind of approximation.

What’s interesting is that we can interpret the picture of the range of answers for each phrase as a rough indicator of how people internally think of the definition and grammar of phrases. It isn’t completely that, because we cannot be certain no other premises were involved in each person’s answer. There had to be a lot of folks who said of phrases, “I don’t know. Ten percent.” Ask them again, or in the context of a concrete situation, and the number will likely change.

You can see, quelle surprise, that not all took the poll seriously, because it is difficult to defend the idea that the English phrase “highly likely” would lead to numbers around 15%. Of course, it could be that one or more persons think exactly that. Similar-seeming discrepancies exist for other phrases.

Interpretations of “unlikely” are far more variable than those of “about even.” You may think this makes sense (it does to me) because the words “about even” almost directly map to numbers hovering around 50%. “Unlikely,” by contrast, maps grammatically only to something under 50% but not impossible.
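One rough way to put a number on "more variable" is the interquartile range, which is the width of the box in a box plot. The sketch below uses made-up responses for two phrases, not the poll data, and the helper name `summarize` is hypothetical.

```python
import statistics

def summarize(responses):
    """Return the median and interquartile range (IQR) of a list of responses."""
    q1, median, q3 = statistics.quantiles(responses, n=4)  # three quartile cut points
    return {"median": median, "iqr": q3 - q1}

# Hypothetical responses in percent; invented for illustration only.
poll = {
    "about even": [45, 48, 50, 50, 50, 50, 50, 50, 52, 55],
    "unlikely": [5, 10, 15, 20, 20, 25, 30, 35, 40, 45],
}

summary = {phrase: summarize(values) for phrase, values in poll.items()}
for phrase, stats in summary.items():
    print(f"{phrase}: median {stats['median']}%, IQR {stats['iqr']} points")
```

A narrow range says people largely agree on what number the phrase implies. A wide one says the phrase pins down little beyond a general direction.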

How can you use a graph like this? One situation is if you have to make a bet, or counter bet, on the truth of some proposition. When you hear your opponent use one of these words to describe his assessment, you can use the chart to guess what number he might assign, and thus you might be able to gauge his commitment. But many of the phrases produce vague answers, so there will not be much profit in that strategy.

Besides that, as said, it is useful for investigating empirical grammar.

2 replies »

  1. Finally hit me–this topic is very relevant to lots of recent topics. The CIA has created an elaborate Words of Estimative Probability scheme.

    It is really just cover for their frequent and massive failures to predict events.

    They are always able to say that they were right, to some degree.

    It’s a big reason why Trump has no time for these mealy-mouthed bureaucrats.

    It’s at the core of the CIA’s Brennan meddling in Trump’s election. He masked all his warnings of Russian interference in WEP.

    Here’s the scheme:

    https://www.cia.gov/library/center-for-the-study-of-intelligence/csi-publications/books-and-monographs/sherman-kent-and-the-board-of-national-estimates-collected-essays/6words.html
