Class 21: Probability Of Hypotheses (Bayes)

HOMEWORK: Assume that in the same box you observed D = 10 good and 3 bad widgets. What is Pr(A|DX)? READ JAYNES CHAPTER 4, THROUGH SECTION 4.3


Probability Of Hypotheses

Everything I did on the board in the video is done in full detail in Jaynes, so there is no point repeating that here. The links to his book are above. It is very simple math, and beautifully explained. You must read it.

I highlight here only one thing. Recall all (as in all) probability fits the schema Pr(A|X), where A is our proposition of interest and X the evidence we assume is true. As in assume is true. Whether it is true is completely beside the point. We want to judge the uncertainty of A with regard to X. That is logic, which is to say, probability.

“A”, as we have learned over and again, can be any kind of proposition. It can even be a—drumroll—hypothesis!

<<Audience gasps. Realization dawns. Slow applause crescendos into an ovation.>>

When it is given this glorified name, hypothesis, probability becomes Science. A hypothesis is no more, and no less, than a proposition.

Since Pr(A|X) is the uncertainty of A with respect to evidence X, we have just done—men, steel yourselves; ladies, set down those dishes—a hypothesis test.

Sort of. Because that’s not what is ordinarily called a “hypothesis test”. We’ll learn the official version later, when I recommend against it with all the strength I can muster. We have, though, “tested” A with respect to X.

Notice very very most very carefully that Pr(A|X) is not a decision that A is true or false. Unless, of course, the evidence X insists this is so. Official hypothesis testing makes decisions for you: it makes you say “Treat A as true (or false)”, which is one of its main weaknesses. In real life, whether A should be treated as true or false depends on the probability of it with respect to X, whether X is important to the decision maker, and whether Pr(A|X) is large or small enough to be important to that decision maker. Decisions and bets are not probability.

Now suppose you want to entertain not only X, but also evidence D. No problem for probability. We now want Pr(A|DX). Done! It really is that simple.

Of course, if X assumes some mathematics, sometimes Pr(A|DX) can be written in a nice way to facilitate computation (as in Jaynes), but that’s all that is happening. We don’t need Bayes. It is only a helpful tool, which is only helpful sometimes. What we really want is Pr(A|DX), and any way we can get it is good enough.
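
To make the Pr(A|DX) computation concrete, here is a minimal Python sketch of the kind of widget-box calculation Jaynes works through. Everything specific in it is an assumption made for illustration: the function name `posterior`, the two hypotheses A and B, their equal prior probabilities, and the bad-widget fractions 1/3 and 1/6 are hypothetical, and are not the homework's setup. It also assumes X says each draw is independent with a fixed chance of a bad widget. Bayes appears only because, under that X, it is a convenient way to get Pr(A|DX).

```python
from fractions import Fraction


def posterior(priors, bad_fraction, n_good, n_bad):
    """Return Pr(H | D X) for every hypothesis H named in `priors`.

    priors:       dict mapping hypothesis name -> Pr(H | X)
    bad_fraction: dict mapping hypothesis name -> fraction of bad widgets H asserts
    D:            n_good good and n_bad bad widgets observed, assumed to be
                  independent draws (an assumption built into X here)
    """
    weights = {}
    for name, prior in priors.items():
        f = bad_fraction[name]
        # Pr(D | H X), up to a binomial factor that is the same for every H
        # and so cancels when we normalize.
        likelihood = f ** n_bad * (1 - f) ** n_good
        weights[name] = prior * likelihood
    total = sum(weights.values())  # proportional to Pr(D | X)
    return {name: w / total for name, w in weights.items()}


# Hypothetical evidence X: two candidate boxes, equally probable beforehand.
priors = {"A": Fraction(1, 2), "B": Fraction(1, 2)}
bad_fraction = {"A": Fraction(1, 3), "B": Fraction(1, 6)}

# D = one bad widget observed, no good ones.
result = posterior(priors, bad_fraction, n_good=0, n_bad=1)
print(result["A"])  # 2/3
print(result["B"])  # 1/3
```

Change D (the counts) or X (the priors and fractions) and rerun: the numbers change because the evidence changed, not because A "has" some probability of its own.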

This should remove all the mysticism that statistical probability brings. “A” does not “have” a probability. There is no “true” probability of A. Nothing has a probability. Propositions only have probabilities with respect to assumed evidence. Change the evidence, change the probability. It is as simple as that.

I’ve said it dozens of times so far, and will repeat it dozens more, but this is all of probability. Once you have learned, and truly assimilated this lesson, all the rest is minor detail.

Like, for instance, what if we want to judge X itself? We want X to be true about the world, or as close to true as we can make it. That is, we want Pr(X|W) $\approx 1$ for evidence W about the world. All right. Then we can compute this Pr(A|DXW)! We’ll work out the niceties of this at a later class.

  1. McChuck

    Everything is provisional. Everything depends on everything else.
    The truth exists. Our knowledge of the truth is partial, limited, skewed, and quite often wrong.

  2. gareth

    Well, dBs, antennas – wot? You a radio ham?
    (I’m not but many of my friends are, etc…)
    Interesting concept – probability in dB. Maybe maps into human cognition more than a linear scale.

    & still following class, even with inadequate math(s)…

    BTW: I saw/maybe read somebody the other day making a point about that old James Bond thing “Once is a happenstance, twice a coincidence, the third time…” Then they pointed to the mental shift on the second aircraft striking the other tower. Then to the second assassination attempt (and the lack of shift).

    As we might say in Blighty: “Gor blimey – wot would Bayes ‘av said?”

