Read Part V
From his page 55 (as before, slightly edited for HTML/LaTeX):
Consider the case of a binary event where the two outcomes are success, S, or failure, F, and we suppose that we have an unknown probability of success $\theta$. Suppose that we believe every possible value of $\theta$ is equally likely, so that in that case, in advance of seeing the data, we have a probability density function for $\theta$ of the form $f(\theta) = 1$.
And $\theta$ lives on 0 to 1. “Suppose we consider now the probability that two independent trials will produce two successes. Given the value of $\theta$ this probability is $\theta^2$. Averaged over all possible values of $\theta$” this is 1/3 (the integral of $\theta^2$ from 0 to 1).
A simple argument of symmetry shows that the probability of two failures must likewise be 1/3 from which it follows that the probability of one success and one failure in any order must be 1/3 also and so that the probability of success followed by failure is 1/6 and of failure followed by success is also 1/6.
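The arithmetic in the quoted passage is easy to check exactly. What follows is only a sketch: the helper name `seq_prob` is mine, not Senn's, and it leans on the standard Beta integral, $\int_0^1 \theta^s (1-\theta)^f \, d\theta = s!\,f!/(s+f+1)!$, to average over the flat density.

```python
from fractions import Fraction
from math import factorial


def seq_prob(successes: int, failures: int) -> Fraction:
    """Probability of one specific ordered sequence of outcomes when theta
    has the flat density f(theta) = 1 on [0, 1].

    Exact value of the integral of theta^s * (1 - theta)^f over [0, 1],
    which equals s! * f! / (s + f + 1)!.
    """
    return Fraction(
        factorial(successes) * factorial(failures),
        factorial(successes + failures + 1),
    )


p_ss = seq_prob(2, 0)   # success, success
p_ff = seq_prob(0, 2)   # failure, failure
p_sf = seq_prob(1, 1)   # success then failure
p_fs = seq_prob(1, 1)   # failure then success
p_mixed = p_sf + p_fs   # one of each, in any order

print("P(SS) =", p_ss)
print("P(FF) =", p_ff)
print("P(SF) =", p_sf, " P(FS) =", p_fs)
print("P(one of each) =", p_mixed)
print("total =", p_ss + p_ff + p_mixed)
```

The four ordered sequences account for all the probability, which is all the symmetry argument needs.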
This is a contradiction or paradox and a glaring one which causes subjective Bayesians to cower (rightly). (I skip over the difficulties covered before with the idea of “independent trials”.) Where does the fault lie? Here:
Suppose that we believe every possible value of is equally likely…
What could that possibly mean? Nothing. Sure, it’s easy to write down a mathematical answer, but this does not make it a true or useful answer. First: how many numbers are there between 0 and 1? Uncountably many. It is impossible for any being short of God to assign a probability to each of these. Second: even if somebody could, because there are uncountably many answers, it is impossible that any should be the right one. Recall the probability of seeing any actual observation with any continuous (i.e. infinity-beholden) distribution is always 0, a daily absurdity to which we always shut our eyes.
We have jumped the infinity shark. Jaynes warned us about this (in his Chapter 15; though he didn’t always obey his own injunction). I think his caution goes unheeded because the calculus is so easy to demonstrate and to work with. What’s easier than integrating a constant?
As shown in the original series, we must begin with a real-world finite conception of each problem, and only after we’ve sorted out what is what can we take a limit, and only then for the sake of ease and approximation. We must not fall prey to the temptation of reifying infinity.
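To give a feel for "finite first, limit later", here is a toy sketch. It is not the solution from the original series. It simply assumes, for illustration, that the chance of success can take only the n + 1 values 0/n, 1/n, ..., n/n, each with equal probability, and then watches what happens to the probability of two successes as n grows. The function name `p_two_successes` and the grid itself are my assumptions.

```python
from fractions import Fraction


def p_two_successes(n: int) -> Fraction:
    """P(SS) when theta is restricted to the finite grid 0/n, 1/n, ..., n/n,
    with each of the n + 1 values given equal probability 1 / (n + 1)."""
    weight = Fraction(1, n + 1)
    return sum(weight * Fraction(i, n) ** 2 for i in range(n + 1))


# Every entry is an honest finite probability; 1/3 appears only as the
# value these approach when n is made large.
for n in (1, 2, 10, 100, 1000):
    exact = p_two_successes(n)
    print(n, exact, float(exact))
```

On this grid the sum works out to (2n + 1)/(6n), so the continuous answer of 1/3 is an approximation to the finite one, not the other way around.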
(If there is sufficient interest, I’ll show the solution for Senn’s example another day: it’s a simple extension of the problem in the original series.)
Jaynes himself should have followed his own advice in the derivation of a (two-dimensional) normal distribution. He began with a premise (something like this; I don’t have the book to hand): when measuring a star’s position, errors are possible in any direction. But he took “any direction” to mean a continuum of directions. This isn’t possible.
Suppose all we have to measure a star’s position (on a plane) is a compass which points only in the cardinal directions. Then our measured error can only be a finite number of possibilities. There would be nothing Gaussian about the probability distribution we use to quantify our uncertainty in this error. Right?
Next suppose we double the precision of our compass, so that it points in eight directions. Still nothing Gaussian. Finally suppose we set the precision to whatever is the precision of today’s finest instrument. This would still be finite and non-Gaussian. We have nothing, and will never have anything, which can measure to infinite precision in finite time. This goes for stars’ positions, salaries, ages, weights, and anything else you can think of. We’re always limited in our ability to see.
Acknowledging this “solves”—actually does away with—the long-standing problem of putting “flat priors” on (unobservable) parameters of distributions like the normal. These are called “improper” priors because they aren’t real probabilities, they’re only mathematical objects to which we assign an improper meaning. Since they aren’t real probabilities you’d guess people would abandon them. You’d guess wrong.
The other problem with infinite probabilities is measurement units: probabilities can change just by a change in unit, say from feet to centimeters, an absurdity if probability has a constant meaning. This problem also disappears when we remain this side of infinity.
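One way to read the units complaint, offered as a sketch of my own interpretation and not as a formal result: an improper flat "density" of 1 per unit hands an interval a weight equal to its length, so the same physical interval gets a different number in feet than in centimeters. A finite set of possible readings has no such trouble, since relabeling the readings leaves every probability alone. The ruler below (readings to the nearest inch from 0 to 10 feet) and the helper names are assumptions made up for the example.

```python
from fractions import Fraction

CM_PER_FOOT = Fraction(3048, 100)  # 1 ft = 30.48 cm exactly


def flat_weight(lo: Fraction, hi: Fraction) -> Fraction:
    """'Mass' that an improper flat density of 1 per unit gives to [lo, hi]."""
    return hi - lo


# The same physical interval, zero to one foot, in two unit systems.
weight_ft = flat_weight(Fraction(0), Fraction(1))
weight_cm = flat_weight(Fraction(0) * CM_PER_FOOT, Fraction(1) * CM_PER_FOOT)

# Finite alternative: a hypothetical ruler reading to the nearest inch,
# 0 to 10 feet, every reading equally probable.
readings_ft = [Fraction(i, 12) for i in range(121)]
readings_cm = [r * CM_PER_FOOT for r in readings_ft]


def prob_at_most(readings, limit) -> Fraction:
    """Probability that an equally weighted reading is <= limit."""
    hits = sum(1 for r in readings if r <= limit)
    return Fraction(hits, len(readings))


p_ft = prob_at_most(readings_ft, Fraction(1))
p_cm = prob_at_most(readings_cm, CM_PER_FOOT)

print("flat weight in feet:", weight_ft, " in cm:", float(weight_cm))
print("finite probability in feet:", p_ft, " in cm:", p_cm)
```

The flat weight moves with the unit; the finite probability does not.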
Anyway, time to stop. Logical probability Bayes always lands on its feet. Plenty of mistakes enter with subjective Bayes, it’s true, or even in LPB when people (wrongly) insist on quantifying the unquantifiable. There are many misunderstandings when toying with infinity.