There Is No Such Thing As Intrinsic Probability

This is less fun than looking at the so-called principle of indifference, which you must read first, and I realize we’re wading into the depths and on a Friday, but I wanted to finish Draper’s paper for two reasons: a few people are interested, and if I don’t do it now, I never will.

Draper defines intrinsic probability for a hypothesis (proposition) as the “probability independent of the evidence we possess for or against it.” Since probability is epistemological and not ontological, and since all probability is conditional on stated premises, I cannot see how this definition makes any sense. There is no such thing as unconditional probability; therefore there can’t be “intrinsic” probability. The closet we could come is if probability were ontological, which it isn’t. Nevertheless, let’s examine his justification and see what comes of it.

Intrinsic probability “consists of three postulates: that intrinsic probability depends on modesty, that it depends on coherence, and that it does not depend on anything else.”

Modesty

“The degree of modesty of a hypothesis depends inversely on how much it asserts (that we do not know by rational intuition to be true). Other things being equal, hypotheses that are narrower in scope or less specific assert less and so are more modest than hypotheses that are broader in scope or more specific.”

One example of “modesty” he gives is “the hypothesis that either Hilary Clinton or Nancy Pelosi will be the 45th President of the United States is less specific and thus intrinsically more probable than the hypothesis that Joe Biden will be the 45th President.”

This is meaningless as it stands, unless it is accompanied by conditions/evidence/premises. It is the same as asking, “What is the probability that ‘A Schmenge will come out’?” There is no tying that proposition to anything; it has no probability, not unconditionally. No proposition does. You are left groundless if you are wholly ignorant of the meaning or context of the proposition. It is like asking what is the intrinsic probability of ‘雪是黑的’ (which I hope I have right)? Unless you read traditional Chinese, unless, that is, you accept premises such as the meaning of the symbols, you can’t answer. There is no intrinsic probability.

Now if I offer E = ‘There are 12 people in the room, two of whom are Schmenges, and one person must come out’, the probability of ‘A Schmenge comes out’ given E is 1/6. But if E = ‘There are two people in the room, both of whom are Schmenges, and one person must come out’, the probability of ‘A Schmenge comes out’ given this E is 1. Change the premises, change the probability.
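The arithmetic can be checked with a small sketch. The function name and its arguments are my own, for illustration only, and it reads E as giving no reason to favor any one person in the room over another, which is how the 1/6 above comes out.

```python
from fractions import Fraction


def prob_schmenge_comes_out(people_in_room: int, schmenges_in_room: int) -> Fraction:
    """Pr('A Schmenge comes out' | E).

    E states how many people are in the room, how many of them are
    Schmenges, and that exactly one person must come out.
    Assumption (not spelled out in E): nothing in E favors any one
    person, so each person gets equal weight.
    """
    return Fraction(schmenges_in_room, people_in_room)


# Same proposition, two different sets of premises.
given_first_e = prob_schmenge_comes_out(people_in_room=12, schmenges_in_room=2)
given_second_e = prob_schmenge_comes_out(people_in_room=2, schmenges_in_room=2)

print(given_first_e)   # 1/6
print(given_second_e)  # 1
```

The proposition never changes between the two calls; only the premises passed in do, and the probability follows them.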

I think Draper might have in mind evidence something like E = ‘There will be n candidates for the 45th President of the US, and these include Clinton, Pelosi and Biden, and only one candidate will win’. Given that, and given only that E, the probability of ‘Clinton or Pelosi wins’ is 2/n, which is higher than ‘Biden wins’, which has probability 1/n.
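A minimal sketch of that calculation, under the E just guessed at. The helper name and the two placeholder candidates are hypothetical, and the code assumes E gives no reason to favor any candidate.

```python
from fractions import Fraction


def prob_one_of_wins(named: set[str], candidates: list[str]) -> Fraction:
    """Pr('one of `named` wins' | E).

    E says only that the winner is exactly one of `candidates`.
    Assumption: E gives no reason to favor any candidate, so each
    of the n candidates gets weight 1/n.
    """
    n = len(candidates)
    hits = sum(1 for c in candidates if c in named)
    return Fraction(hits, n)


# n = 5 here; the last two names are placeholders, not real evidence.
field = ["Clinton", "Pelosi", "Biden", "Candidate 4", "Candidate 5"]

clinton_or_pelosi = prob_one_of_wins({"Clinton", "Pelosi"}, field)
biden = prob_one_of_wins({"Biden"}, field)

print(clinton_or_pelosi, biden)  # 2/5 1/5
```

Swap in evidence that favors one candidate and the equal weights, and with them the ordering, need no longer hold.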

With ‘A Schmenge comes out’, unless you knew, as all good people should know, Cabbage Rolls and Coffee, you had nothing to bring to the mental table. No evidence or premises sprang to your mind, thus the probability seemed incalculable. There was no intrinsic evidence for it, thus no intrinsic probability.

Premises surely do swim into view (it is impossible for most of us to keep from this) when assessing whether Joe “Are My Hair Plugs Too Tight?” Biden or Hillary “What’s A Benghazi?” Clinton will win the Democrat nomination, let alone the presidency. For myself, I judge Mrs Clinton to have a higher chance than Mr Biden, who in my opinion missed the turnoff to Sanity quite a few miles back. But maybe you have a different idea. If so, you will have a different probability for Draper’s propositions, but only because you have different evidence. No piece of evidence is more intrinsic than any other. But if we agree on the evidence, such as that I supposed Draper might have had in mind, then we have to agree on the probability.

Coherence

“To the extent that the various claims entailed by a hypothesis support each other (relative only to what we know by rational intuition), the hypothesis is more coherent.” His example, which will take a couple of readings:

Consider, for example, the hypothesis that all crows are black. This hypothesis is identical to the hypothesis that all non-Asian crows are black and all Asian crows are black. Now compare this hypothesis to a second hypothesis, namely, that all non-Asian crows are black and all Asian crows are white. The two hypotheses are equally modest, but not equally coherent. The first half of both hypotheses states that all non-Asian crows are black. This supports the second half of the first hypothesis, which states that Asian crows are also black, while it counts against the second half of the second hypothesis, which states that Asian crows are white. Thus, the hypothesis that non-Asian and Asian crows are all black is more coherent and thus intrinsically more probable than the equally modest hypothesis that all non-Asian crows are black and all Asian crows are white.

Proposition 1: P1 = ‘All non-Asian crows are black and all Asian crows are black’. Proposition 2: P2 = ‘All non-Asian crows are black and all Asian crows are white.’ Draper claims the probability of P1 is “intrinsically more probable” than P2.

Again, since both propositions are anchor-free there just is no probability. Suppose I invent E = ‘All the crows I have seen are black’, then P1 is more probable than P2 given this E, only because P2 allows crows of a color I haven’t yet seen. Or suppose E = ‘Animal species coloring is independent of continent’ then again P1 is more probable than P2 given this E. But then I might have E = ‘Animals are lighter colored in Asia than in other continents’, then P1 is less probable than P2 given this E.

If Draper thinks P2 less probable than P1 he must have some sort of “uniformity of animal color” evidence in mind. Maybe that evidence is even right, or close to right. But it’s still evidence even if it’s unstated, meaning there is no “intrinsic” probability of either proposition, only probabilities conditional on tacit premises. And indeed Draper closes this section with the words (meant as self-proving), “More generally, hypotheses that attribute objective uniformity to the world are, other things being equal, intrinsically more probable than hypotheses that postulate synchronic or diachronic variety.”

Nothing else needed

“[I]ntrinsic probability does not depend on anything else besides modesty and coherence.” His proof is “ask yourself: what else could the intrinsic probability of a hypothesis depend on besides how little the hypothesis says (its modesty) and how well what it says fits together (its coherence)?” Meh.

Extensions & Homework

There’s more to Draper’s paper, but that’s enough for us. I believe Draper recognized the weaknesses given above, which is why the paper was never published, so there’s no point going on and on. He does bring up the venerable grue (which I just realized we never did), and Richard Swinburne’s ideas of intrinsic probability (which I say are also wrong), and induction. But enough’s enough for today.

Now homework. Dissect these examples in the manner I did above, to show they do not have intrinsic probability, but that all have tacit unacknowledged premises. Hint about the arsenic example: how hard is it to exclude all you already know about arsenic before answering this?

1. Under modesty. “First, the hypothesis that all cats are curious is narrower in scope and so intrinsically more probable than the hypothesis that all animals are curious.”

2. Under coherence. “[T]he hypothesis that all arsenic is poisonous to human beings is intrinsically more likely to be true than the hypothesis that, while all observed arsenic is poisonous, all unobserved arsenic is nutritious.”

18 Thoughts

1. Briggs says:

Nick,

You and I appear to be the only ones.

2. DAV says:

Cabbage Rolls and Coffee

John Candy playing the licorice stick! And without moving his fingers! What are the chances of that?

3. Briggs says:

DAV,

That pales in comparison to hearing Linsk Minyk singing On the Road Again.

Or, to prove Cabbage Rolls and Coffee was not too far out in left field, listen to this song about Dampfnudeln. If my German is not too far out, at 22:54, the lady sings they’re good with coffee!

4. DAV says:

“First, the hypothesis that all cats are curious is narrower in scope and so intrinsically more probable than the hypothesis that all animals are curious.”

So would that apply equally well to bookends?
There must be properties that all cats share or they wouldn’t be cats. The same for all animals.

I’ve come to the conclusion that each of us has the intrinsic probability of 0.375 which is why Elvis is in each and everyone of us (except for MJF who is the anti-Elvis.). Why 0.375? Because 0.375 is more powerful than a 0.22 or even a paltry 0.177. Some of us may have 0.5 but that includes 0.375.

5. Sander van der Wal says:

Assume an intrinsic probability exists, and is a number p, such that 0 <= p <= 1. Then p(all cats are curious) > p(all animals are curious). We can also say that p(all dogs are curious) > p(all animals are curious), p(all ants are curious) > p(all animals are curious) and so on. But do we now know if p(all cats are curious) = p(all dogs are curious)? Or compare p(all cats are curious) > p(all mammals are curious) > p(all animals are curious), or p(all cats are curious) > p(all four-legged animals are curious) > p(all animals are curious).

The relation between mammals and four-legged animals should determine which p is bigger. Now, not all mammals are four-legged. Some, humans, apes, are using their front limbs as arms. And an arm is not a leg. An arm is a limb, but we are stating specifically that we are looking at legs, not limbs. And there are plenty of critters with four legs around too that aren’t mammals. Crocodiles come to mind, and tortoises. Lizards. Now, which of the two intermediate p’s is the more modest one?

But wait, there is more. We could assume there is a p(all meat-eating animals are curious) between the cats and all animals. Or a p(all animals with claws on their paws are more curious).

Afaics, there is no reasonable way of determining the comparative modesty of all these intermediate hypotheses, which makes modesty unusable as a criterion for determining intrinsic probability.

6. DAV says:

Briggs,
“If my German is not too far out, at 22:54, the lady sings they’re good with coffee!”

Well they are but what is the probability that alle Dampfnudeln sind gut mit allen Kaffees?

Reminds me of some beer joint I visited in Munich run by Dortmunder Union — nonstop lederhosen.

7. Ye Olde Statisician says:

The closet we could come is if probability…

Is the probability coming out of the closet?

8. Brandon Gates says:

Briggs,

I still haven’t finished the previous day’s homework. My college profs would cast knowing looks, 100% confidence and the wee-est non-zero pee imaginable. Here’s where I am on today’s assignment:

One example of “modesty” he gives is “the hypothesis that either Hilary Clinton or Nancy Pelosi will be the 45th President of the United States is less specific and thus intrinsically more probable than the hypothesis that Joe Biden will be the 45th President.”

This is meaningless as it stands, unless it is accompanied by conditions/evidence/premises.

And I intuitively agree, but there’s a wrongness feeling to my agreement. Here’s what does feel right:

A or B not C is more intrinsically probable than A not C. I fear I’ve made a total hash of this, but would like feedback.

9. JH says:

“I believe Draper recognized the weaknesses given above, which is why the paper was never published, so there’s no point going on and on.”

Ah, you are saying that peer-review works!

10. Paul W says:

“You and I appear to be the only ones.”

The probability of that statement is 0.00000000 based on real data. I loved all the SCTV characters. It was late night must see T.V. for me.

11. Andrew W says:

As someone who has played a lot with probability (a gamer) but not dealt with it at a philosophical level, let me thrash around a bit.

(1) Probability (typically) applies a true-false test to a sampling event, and asks the question “If I could theoretically run this event multiple times, what proportion of the time would the test condition be true?”.

Given this, it requires that:

(2) there must be some conception of what it would mean to “run this event multiple times”

and (3) there must be some form of model of an indeterminable process that would cause the test condition to occur or not occur.

“Indeterminable” is important. Given total information about the inputs and process, we could accurately predict the outcome and therefore the “probability” is either 1 or 0. In contrast, given no model we can make no sane prediction. So what we need is a model whose inputs are themselves subject to being evaluated probabilistically.

I therefore claim that (4) all probabilistic models must ultimately be based on the outcomes of existing sampling of multiple events that we are willing to treat as equivalent and independent. Sampling is the only way to get a “probabilistic” number without itself requiring a probabilistic model – it’s a raw input.

In other words, to sensibly assign a probability to the outcome of an election between Hillary Clinton, Nancy Pelosi and Joe Biden we must either have a history of such elections (multi-event sampling) that are sufficiently comparable or a model that is itself derived from a combination of existing samplings.

Given this, it seems philosophically obvious that there’s no such thing as intrinsic probability, unless you want to claim that any probabilistic statement can be tied to a single indiscriminate model.

12. Briggs says:

Andrew W,

This is close, but in the (proper) logical view of probability, we do away with “sampling spaces”, infinite repetitions (the only way to know a probability in frequentist theory; there, probabilities can never be asserted), fancy notation, and so on.

We can compute probabilities like Pr(George wears a hat | Half of all Martians wear hats and George is a Martian) for which “sampling” is impossible, etc.

Probability, like logic, is a measure between propositions. That being so, we need both “sides” of the equation. There can be no between without the evidence/premises on the right hand side.

13. Andrew W says:

At risk of displaying my ignorance, let me push back on some assumptions that I perceive hiding in your example.

(1) “sampling” – the formulation includes an implicit sampling of the entire population: “Half of all Martians wear hats”. Yes, it’s a hypothetical population, but it is nonetheless sampled (and, being hypothetical, with more precision than is usually possible in real life).

(2) assumptions of model. “George wears a hat”, “George is a Martian”. Implicit in this is a model in which any Martian is as likely to wear a hat as any other, and in which we have no information about George that would change this.

So, while I fully agree that probability is a measure between propositions, it seems to me that it is a measure between propositions with respect to an existing data set (or model thereof). That “with respect to” would bar any attempt to describe intrinsic probability.

Are we saying the same thing in different words, or am I failing at bridging the gap from layman to philosopher of statistics?

14. Jonathan D says:

Briggs,

Regarding example 1, I am happy to say that “all cats are curious” is intrinsically more probable than “all animals are curious”, without suggesting that there are any intrinsic probabilities for these propositions. By that I mean that whatever premises I choose to base my probability on, the first will be more (ok, I should have said ‘at least as’) probable than the second. Quite different to the presidential example, which is perhaps what Brandon is getting at in his comment.

As for coherence, well… we could look for premises to justify the relevant probabilistic claim in each case, but I’d rather observe that the whole notion of coherence seems a bit lacking in… coherence.

15. Briggs says:

Jonathan D,

You say: “whatever premises I choose to base my probability on, the first will be more…probable than the second.”

Exactly so. Premises you choose. Meaning the probabilities are conditional on evidence you have, meaning the probabilities are not intrinsic.

16. bernie1815 says:

Since curiosity killed the cat, either there are no cats or not all cats are curious. 😉
“All” is an immodest condition for any non-tautological premise.