Class 39: Trust Your Equations! (Bayes)

Chant this until it becomes part of your soul: There is no such thing as unconditional probability. Once you have that, endless “problems” people have “discovered” about probability vanish. Today we examine the non-problem of old evidence.

Uncertainty & Probability Theory: The Logic of Science

Video

Links: YouTube * Twitter – X * Rumble * Bitchute * Class Page * Jaynes Book * Uncertainty

HOMEWORK: Given below; see end of lecture.

Lecture

Today, a brief lesson in trusting the proofs we have established. You remember we proved, from first principles and easy premises, the mathematical form of probability, including Bayes’s theorem. Unless you doubt the premises of that proof, which we don’t, you must believe its conclusions.

Which most do. But occasionally some think they have discovered errors in the equation. Not in the equation itself, I mean, but in its uses. We saw some of this a couple of weeks ago in the Paradox Lost lesson: being loose with infinity and forgetting tacit premises can lead one astray.

A similar thing is true of the so-called problem of old evidence. Let me quote in full a passage from a Colin Howson paper in the British Journal for the Philosophy of Science which defines this “problem”:

Clark Glymour [1980] presented subjective Bayesians with the following problem. Suppose a hypothesis h is proposed which turns out to explain some already well-known data e. Can h ever be confirmed by e? Intuitively, the answer is that it can. Examples abound, in fact, of hs apparently well-supported by such es. One very often mentioned is General Relativity with its allegedly strong support from the data on the anomalous precession of Mercury’s perihelion, data already fifty years old when the field equations of General Relativity were first obtained by Einstein.

The problem for subjective Bayesians is explaining and justifying the attribution of such support. For Bayes’s Theorem says that

$$P(h|e) = \frac{P(e|h)P(h)}{P(e)}$$

and the received wisdom is that for subjective Bayesians these probabilities are all relativized to the individual’s stock K of contemporary background information. This relativization has the following unpleasant consequence, however. If e is known at the time h is proposed, then e is in K and so P(e) = P(e|h) = 1, giving, from (1), P(h|e) = P(h); which means that e gives no support to h.

Long ago, and many times since, you heard me warn about notation. How it can help or hinder. Here it hinders, badly. I told you that it’s okay to write things like “P(e)” when you’re manipulating equations, trying to do proofs, and that kind of thing. No one wants to write out full notation when shorthand will do.

But, again as I have told you over and over and over some more and will never tire of reminding you, there is no such thing as “P(e)”. There is no such thing as unconditional probability. There is no such thing as unconditional logic. You must absolutely and without any exception have conditions. That is, you must understand “P(e)” as P(e|K), for some K. I use K because Howson does.
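Written out in full, the theorem quoted above looks like this, with every term carrying its conditions (this is only a restatement, nothing new):

$$P(h|eK) = \frac{P(e|hK)\,P(h|K)}{P(e|K)}$$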

Forgetting that K, even as he wrote it!, is what caused the misperception that there is a problem with Bayes’s theorem. This is almost trivially easy to see if you write everything out properly.

Howson says it, right there: “e is in K”. Okay, then we can write K = ek, where the lowercase k means the parts of K that are not the evidence. That means we start not with “P(h)”, because there is no such thing, but, as Howson himself says, with P(h|K). And that, as he did not see because he did not write it out, equals P(h|ek)! Since e now sits among the conditions, P(e|hek) = P(e|ek) = 1. That means, after considering e (and it makes zero difference here when e was observed), Bayes’s theorem becomes:

$$P(h|eK) = \frac{P(e|hek)\,P(h|ek)}{P(e|ek)} = \frac{1\cdot P(h|ek)}{1} = P(h|ek) \ne P(h|k)!$$
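A toy numerical check may help. The numbers below (0.5, 0.9, 0.3) are made up for illustration, not anything from Howson; the point is only that, relative to k alone, e raises the probability of h, while relative to K = ek conditioning on e again is a no-op:

```python
# Toy check of the old-evidence accounting; all numbers are illustrative.

p_h_given_k     = 0.5   # P(h|k): prior on h given background k only
p_e_given_hk    = 0.9   # P(e|hk): h makes e likely
p_e_given_nothk = 0.3   # P(e|~h,k): without h, e is less likely

# Total probability: P(e|k) = P(e|hk)P(h|k) + P(e|~hk)P(~h|k)
p_e_given_k = p_e_given_hk * p_h_given_k + p_e_given_nothk * (1 - p_h_given_k)

# Bayes's theorem, everything conditioned on k:
p_h_given_ek = p_e_given_hk * p_h_given_k / p_e_given_k

print(f"P(h|k)  = {p_h_given_k}")    # 0.5
print(f"P(h|ek) = {p_h_given_ek}")   # 0.75: e supports h, relative to k

# Absorb e into the background, K = ek. Now P(e|hK) = P(e|K) = 1, so
# Bayes's theorem returns P(h|eK) = P(h|K) = P(h|ek): no further change,
# because the support from e is already baked into K.
print(f"P(h|eK) = {p_h_given_ek}")   # 0.75 again
```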

There is no problem with old evidence. Not when you assume you already know the old evidence! It is just that simple. Indeed, the full notation makes it look like nothing. If I were to present the “problem” in these terms, with full notation, and insist there was a problem, you’d think I was crazy. If folks like Glymour and Howson had been careful with notation, this multi-decade hand-wringing over the “problem” would have been avoided.

There are acres of papers written on this problem, all with screwy solutions, none of which say “Uh, you forgot to do your conditioning.”

The Stanford Encyclopedia of Philosophy, hosted at plato.stanford.edu, has an entry on Bayesian Epistemology (updated in 2022!) that gives this example of another “problem”, called “memory loss”:

Example (Shangri-La). A traveler has reached a fork in the road to Shangri-La. The guardians will flip a fair coin to determine her path. If it comes up heads, she will travel the path by the Mountains and correctly remember that all along. If instead it comes up tails, she will travel by the Sea—with her memory altered upon reaching Shangri-La so that she will incorrectly remember having traveled the path by the Mountains. So, either way, once in Shangri-La the traveler will remember having traveled the path by the Mountains. The guardians explain this entire arrangement to the traveler, who believes those words with certainty. It turns out that the coin comes up heads. So the traveler travels the path by the Mountains and has credence 1 that she does. But once she reaches Shangri-La and recalls the guardians’ words, that credence suddenly drops from 1 down to 0.5.

The author goes on, at length, to say why this is a problem, and how it might be resolved, using all sorts of new technical jargon. None of which is needed if we could simply write out probabilities correctly.

The problem should have laid out the probabilities. We want to know the probability of M = “truly arrived by the mountain path”. We must not write “P(M)” without its conditions. If we do write the conditions, we see there is no problem. We have:

P(M | believes guardians & remembers H) = 1 and P(M | believes guardians & does not remember H) = 1/2.

The wording does not make clear whether one remembers the H or not. It is a sloppy problem, because it seems to have a built-in contradiction: one supposedly cannot remember the path, yet can remember the guardians’ words and the result of the coin flip.

It is simply no problem at all for probability. Obviously, writing out the full conditions shows there is no equality, and none is expected.
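For the curious, here is the bookkeeping done by brute enumeration. The outcome table and the helper function below are my own illustrative construction of the story as told, not anything from the Stanford entry:

```python
from fractions import Fraction

# The story: a fair coin; heads -> Mountains, memory correct;
# tails -> Sea, memory altered so she recalls the Mountains anyway.
outcomes = [
    # (probability, coin, true path, remembered path)
    (Fraction(1, 2), "H", "Mountains", "Mountains"),  # correct memory
    (Fraction(1, 2), "T", "Sea",       "Mountains"),  # altered memory
]

def prob(event, given):
    """P(event | given), computed by enumeration over the outcome table."""
    num = sum(p for (p, *o) in outcomes if event(*o) and given(*o))
    den = sum(p for (p, *o) in outcomes if given(*o))
    return num / den

M = lambda coin, path, mem: path == "Mountains"  # truly took the mountain path

# Conditioning on remembering the flip itself came up heads:
print(prob(M, given=lambda c, p, m: c == "H"))          # 1

# Conditioning only on the (unreliable) memory of the Mountains:
print(prob(M, given=lambda c, p, m: m == "Mountains"))  # 1/2
```

The two answers, 1 and 1/2, differ because the conditions differ, which is the whole point.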

I’ll leave this one, also from Stanford, as a homework question (no sense trying to cheat by looking at their site, because they don’t solve it the probability way, like we do):

Conditionalization in the standard version preserves certainties, which fails to accommodate cases of memory loss (Talbott 1991):

  • Example (Dinner). At 6:30 PM on March 15, 1989, Bill is certain that he is having spaghetti for dinner that night. But by March 15 of the next year, Bill has completely forgotten what he had for dinner one year ago.

