All Of The Supposed Paradoxes About The Principle Of Indifference Are Bogus

Here is an example of a supposed paradox caused by using Keynes’s Principle of Indifference, which Stove and others (myself included) call the Statistical Syllogism (itself deduced from the symmetry of logical constants). This example is due to Van Fraassen, as quoted in Probability, Causes, and Propensities in Physics, edited by Mauricio Suarez, Chapter 1 (I have a preprint and don’t know the official page numbers).

Incidentally, I thoroughly investigate this in this year’s must-have Christmas present Uncertainty.

Consider a factory that produces cubes of length l up to 2 centimeters. What is the probability that the next cube produced has an edge ≤ 1 cm? A straightforward application of the principle of indifference yields probability = 1/2. But, we could have formulated the question in several different ways. For instance, what is the probability that the next cube has sides with an area ≤ 1 cm^2 ? The principle now yields the answer 1/4. And how about the probability that the next cube has volume ≤ 1 cm^3 ? The answer provided by the principle is now 1/8. These are all inconsistent with each other since they ascribe different probabilities to the occurrence of the very same event.

There is a hidden premise here about infinities that causes the error of supposing there is a contradiction. Let’s see how.

Now real cubes will be of a certain length, a length that will be a discrete and finite number. This applies at least to the way we can measure such cubes, if not to the physical properties of the cube in actuality. Even our finest measurement equipment and procedures can only discern observations to a certain discrete and finite level. This will always be so no matter how inventive the equipment becomes. Measuring to infinite precision would require infinite capacity, which is forever impossible in practice (for us).

Call the smallest unit at which we can measure u. In practice it might be that we cannot measure uniformly, such that on some things we can in places only measure to a x u (a times u), where a > 1, or whatever, but this does not change the analysis, as you shall see.

Any real cube will be c x u in length, where c is an integer greater than or equal to 1. Again, this restriction applies to our knowledge of real cubes as measured by our finest tools. You don’t yet have to believe it applies physically to the cube.

Given what we know so far, and recalling all probability is conditional on what we accept as known, the probability the factory produces cubes with lengths less than or equal to 1 cm (with 2 cm as the maximum) is either 1/2 or close to it. If the number of possible lengths, (2 cm)/u, is even, so that 1 cm is itself a possible length, then the probability is 1/2; if it is odd, the probability will be different. We can’t decide which until we make another assumption.

Does the manufacturing process produce cubes of all sizes? In units of u? Or at certain fixed lengths? Of course there will be leeway: we can probably measure to finer lengths than we can reliably and without fail manufacture cubes; plus not all cubes will be made to the same exact lengths, down to u.

However, it really doesn’t matter. If the machines can’t make cubes down to u, they can make them to some other level of precision, such as 100 x u, which we can then redefine as u’ = 100 x u, and then drop the prime mark.

Suppose u = 0.2 cm, then cubes can be 0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, or 2 cm in length. The probability of a cube having length less than or equal to 1 cm is 5/10 = 1/2. Suppose instead u = 2/13 cm, then cubes can be 2/13, 4/13, etc., up to 26/13 = 2 cm. The probability, given this new information, of lengths ≤ 1 cm is 6/13 < 1/2. The length of 6.5/13 is impossible; impossible to know, anyway, and perhaps impossible in reality.
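The counting in these two scenarios can be checked with a short sketch. The function name prob_length_at_most and the use of Python's Fraction type for exact arithmetic are illustrative choices, not part of the original argument; the sketch assumes every multiple of u up to 2 cm is an equally probable length.

```python
from fractions import Fraction


def prob_length_at_most(u, limit=Fraction(1), max_length=Fraction(2)):
    """Probability that a cube's edge is <= limit, assuming the possible
    edges are the equally probable multiples of u: u, 2u, ..., max_length."""
    n = max_length / u
    if n.denominator != 1:
        raise ValueError("max_length must be a whole multiple of u")
    # Enumerate every possible edge length on the grid of u.
    lengths = [k * u for k in range(1, int(n) + 1)]
    favourable = [x for x in lengths if x <= limit]
    return Fraction(len(favourable), len(lengths))


print(prob_length_at_most(Fraction(1, 5)))   # u = 0.2 cm  -> 1/2
print(prob_length_at_most(Fraction(2, 13)))  # u = 2/13 cm -> 6/13
```

Exact fractions are used rather than floats so that a length like 5 x 0.2 compares equal to 1 cm without rounding trouble.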

Now, given these two scenarios or premises, the probability of side surface areas of ≤ 1 cm^2 is again 5/10 = 1/2 and 6/13, respectively (do the math if you don’t see it). The probability of cube volumes of ≤ 1 cm^3 is the same. The only possible volumes in the latter case are 0.003641329, 0.029130633, 0.098315885, 0.233045061 cm^3 and so on. The values between, for example, 0.098315885 and 0.233045061 are at least impossible to know, and maybe impossible in actuality.
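For anyone who would rather not do the math by hand, here is a rough sketch. The helper name same_event_probabilities is hypothetical, and it assumes, as in the two scenarios, that the equally probable possibilities are the edge lengths on the grid of u. Each possible cube is identified by its edge, and its face area and volume are derived from that edge.

```python
from fractions import Fraction


def same_event_probabilities(u, max_length=Fraction(2)):
    """Return (P(edge <= 1 cm), P(face area <= 1 cm^2), P(volume <= 1 cm^3))
    when the equally probable cubes have edges u, 2u, ..., max_length."""
    n = int(max_length / u)
    lengths = [k * u for k in range(1, n + 1)]
    p_length = Fraction(sum(1 for x in lengths if x <= 1), n)
    p_area = Fraction(sum(1 for x in lengths if x ** 2 <= 1), n)
    p_volume = Fraction(sum(1 for x in lengths if x ** 3 <= 1), n)
    return p_length, p_area, p_volume


for u in (Fraction(1, 5), Fraction(2, 13)):
    print(u, same_event_probabilities(u))

# The first four possible volumes (in cm^3) when u = 2/13 cm.
volumes = [float((k * Fraction(2, 13)) ** 3) for k in range(1, 5)]
print([round(v, 9) for v in volumes])
```

The three probabilities agree because, on this grid, the same cubes satisfy all three conditions: an edge is at most 1 exactly when its square and its cube are at most 1.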

You can immediately see that Van Fraassen’s objection disappears when u = p/q, and p and q are integers. How did the objection, and subsequent paradox, ever appear in the first place? All paradoxes arise for the same reason: lurking implicit or unknown premises. The same is true here.

That tacit, and probably unthought of, premise is about infinity. Van Fraassen thought it was possible cubes can be infinitely subdivided in actuality, and not just potentially. If so, then the paradox holds.

But it isn’t true cubes can in actuality be subdivided forever. Of course, the u used above were crude, chosen for simplicity. Real cubes will be finer than 0.2 cm. So let u = p/q = 1/q (p = 1 results in no loss of generality), and let q climb. Let it grow! Let it be a googol (10^100) raised to a googol raised to a googol a googol times, and raise all that to a googol more googols. The result will be a very small u indeed! But one which is still finite and results in discrete units and no paradox. Cubes constructed in this way will still be in blocks of u, and there can be no paradox in the probabilities.

If this u is still “too large” for you, even though it is “practically infinite”, let q grow some more. Take the answer from before, raise it to a googol a googol times over, and do this every second for a billion years. This will give a q so large you can’t think of it (nobody can think of it!) and a u small but finitely graded. And no paradox.
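A small sketch can spot-check the claim for a few finite values of q. It covers only the case considered here, a grid that is even in length with u = 1/q cm; the function name and the particular values of q are illustrative assumptions, and nothing like a googol can be enumerated this way.

```python
from fractions import Fraction


def probabilities_on_length_grid(q):
    """Edges are k/q cm for k = 1, ..., 2q, all taken as equally probable.
    Returns [P(edge <= 1), P(face area <= 1), P(volume <= 1)].
    Integer arithmetic only: (k/q)^d <= 1 exactly when k^d <= q^d."""
    total = 2 * q
    counts = [
        sum(1 for k in range(1, total + 1) if k ** d <= q ** d)
        for d in (1, 2, 3)
    ]
    return [Fraction(c, total) for c in counts]


for q in (5, 10, 1_000, 100_000):
    p_length, p_area, p_volume = probabilities_on_length_grid(q)
    # On this grid the three answers coincide for every finite q tried.
    assert p_length == p_area == p_volume
    print(q, p_length)
```

With p = 1 the 1 cm mark always falls on the grid, so each run prints exactly 1/2 however large q is made.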

There are no real restrictions using finite and discrete numbers. The paradox only comes in supposing we carry the process to infinity. It is only at infinity that trouble emerges. At infinity, that strange and imponderable place. Before infinity, there is never a problem, and probability survives without difficulties.

So it must be something strange about infinity. Which is a very true statement.

Now real cubes I claim are also composed of finite and discrete blocks, and cannot in actuality be constructed to infinite precision. We already agree our knowledge is discrete and finite, but I say real cubes are like our knowledge and cannot be infinitely subdivided, except potentially. My proof (in brief) is that everything would then be infinite in this way, which is an explosive idea.

But ignore me, and say I’m wrong. It is still true that no paradox exists unless we are at infinity, a mighty strange land! And then I ask you: what actual evidence have you that things are infinitely decomposable? Your evidence is not (I say) based on any observation anybody has ever made, or could make. Your evidence can only be metaphysical, and then it bumps up against my proof.

No, infinity, though mathematicians toss it around with ease, is too heavy to carry in actuality. Nobody really understands what happens there. And since probability is epistemological, it is no wonder it breaks in just the same way our other thoughts of infinity fracture.

Categories: Philosophy, Statistics

15 replies »

  1. And that shows that our esteemed host is smarter than almost all physicists. They still get hung up on infinities when dealing with quantum mechanics. That’s because they keep ignoring the “quantum” part. (The universe appears to work on whole numbers, and detests fractions.)

    Natural law contains no infinities and bears no contradictions.

  2. I don’t think this gets rid of the paradox. The main point of the paradox is that you can get different answers using the principle of indifference when you privilege a particular dimension of measurement (length, area, or volume) over the other two. In your explanation you are privileging a discretization in 1D (length) which gives uneven discretizations (in terms of what is measurable) in areas and volumes. I can just as easily privilege measuring in 3D if I measure cube volumes by dropping them in water and seeing how many mL are displaced. My discretization of what volumes I can measure is even, but my discretization of what areas and lengths I can measure is uneven. The principle of indifference applied in my “privilege even discretization 3D volumes” example (which does not contain any infinities!) will be 1/8.

  3. Bill,

    No, but of course I like some of what constructivists are doing. I instead subscribe (surprise!) to the Aristotelian view of mathematics. Get Jim Franklin’s book on this.

    Infinity is real, and important. And much, much bigger than we think. Since probability is epistemological, and so is thinking about infinity, which no one truly understands, it’s no wonder that this is where weird things happen with our thoughts. We toss around infinities with ease in equations, but as we can see in this example, without thinking of the consequences.

    Jaynes also has a chapter about how the results differ depending on what road you take to infinity.

  4. ASG,

    No, it doesn’t matter which dimension of measurement you take. There is no inconsistency. Try some examples. It’s only at infinity where problems crop up.

  5. OK. Suppose I can measure volumes to an accuracy of 0.5 cm^3. Then the possible volumes I can measure are 0.5, 1.0, 1.5, 2.0, …, 7.0, 7.5, 8.0 cm^3. There are 16 possible volumes, and I’m going to apply the principle of indifference (privileging measuring volumes in 3D) and say these 16 possible volumes are all equally likely. So each has probability 1/16. Taking cube roots (I’ll round to 3 decimal places), these give side lengths of 0.794, 1.0, 1.145, 1.260, …, 1.913, 1.957, 2.000 cm. These are the 16 side lengths I can measure (really, compute from what I’ve measured) when I measure volume with an accuracy of 0.5 cm^3. How many are less than or equal to 1 cm? There are two, 0.794 and 1.0 cm. Probability of side length <= 1 cm? 2/16 = 1/8

  6. ASG,

    Yep. So next apply PoI to the side length, and say each of 0.7937005, 1.0000000, 1.1447142, …, 2.0000000 are possibilities, and you get a probability of 1/8 for the chance of equal to or less than 1. So it works.

    The first question you set implicitly yourself was “What are the chances of volumes ≤ 4 cm^3, given the info”, and the second question you set yourself was “What are the chances of sides ≤ 1 cm, given the info.” Naturally you get different answers.

  7. Somehow I think we might be talking past each other. My aim is to answer the question “what is the probability of side length <= 1 cm, given the info" using the principle of indifference (PoI). My point is that if your measurement capability is discretized evenly in length then you get the answer ~1/2, and if your measurement capability is discretized evenly in volume then you get the answer ~1/8. That's the "paradox". You seem to agree with the first claim in your article, and you seem to agree with the second claim based on your first paragraph in your post at 4:31 pm. And there are no infinities anywhere since everything is discretized. Do you disagree?

    I don't understand what you mean when you say, "the first question you set implicitly yourself was 'What are the chances of volumes <= 4 cm^3, given the info'". I haven't addressed that question at all. But if I did I would run into a similar paradox, just with different numbers (p=0.5 if we apply the PoI when privileging 3D, measuring volumes, and p~=0.794 if we apply the PoI when privileging 1D, measuring lengths).

  8. ASG,

    If you can measure volumes to 0.5 cm^3, and cubes can be any volume from 0.5 cm^3 to 8 cm^3, you get:

    (0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0) cm^3

    as possibilities.

    Now what do you want to know about this? The probability of 1 cm^3 or less? The probability of 5 cm^3 or less? What?

    These volumes imply lengths of

    (0.7937005 1.0000000 1.1447142 1.2599210 1.3572088 1.4422496 1.5182945 1.5874011 1.6509636 1.7099759 1.7651742 1.8171206 1.8662556 1.9129312 1.9574338 2.0000000) cm

    Only these lengths are possible to know, and actually we can’t even measure them, since all we can measure is volume to nearest 0.5 cm^3.

    So what question are you asking? What are chances of ≤ 1.5874011 cm, which is equivalent to 4 cm^3?

    In your first example, you implied the example of “What are chances of ≤ 4 cm^3 volume, given info.” In your second example you asked “What are chances of ≤ 1 cm side, given info.” These just are not the same question. They are not logically equivalent.

    If you like notation, we want

    Pr(Y_1 | X) and Pr(Y_2| X), where Y_1 ≠ Y_2. Of course the probabilities differ.

    You aren’t asking the same questions in the same way as the original “paradox.” In the original “paradox”, the length of 1 cm is logically equivalent to an area of 1 cm^2, and a volume of 1 cm^3. But that’s not what you’re doing in your attempted counterexample.

  9. Matt,
    How would his 2009 paper “Aristotelian realism” do?

    ASG,
    I don’t see a paradox at all. The problem is ill-posed to begin with, so you get to decide what you are or are not indifferent to. From that flows the pr(Y|M).

    If it’s a question about a real process, then you get data, otherwise I see it as a philosophical word game of no importance. Sort of on the order of “What would you rather have, a ham sandwich or true happiness?”

  10. I have said the question I am focusing on. It is (1a) “what is the probability of the side length being <= 1 cm?". For the paradox to get off the ground, we have to agree to the claim that this is logically equivalent to the question (1b) "what is the probability of the volume being <= 1 cm^3?". I am not sure if you agree with this claim or not, but if you do not, then I understand why you think there is no paradox, because you think there are multiple questions being asked which are logically inequivalent.

    If you do agree that the questions are logically equivalent, then I don't understand your "solution" which claims that the paradox is the result of infinities. As I have tried to outline, applying the PoI to question (1a) (with evenly discretized measurements of length), you get p ~= 1/2. And applying the PoI to question (1b) (with evenly discretized measurements of volume) you get p ~= 1/8. Maybe you disagree with how I am applying the PoI, but if so I don't understand in what way you disagree.

    Unfortunately I'm out of time so I'll have to leave it there. Best wishes.

  11. ASG,

    Your scenarios are not the same. If you have a chance, come back and write out the exact propositions you think you’re asking about, like “The side of less than or equal to Y_2 cm”, and “The volume of less than or equal to Y_1 cm^3”. Then I think you’ll see it.

    In the original “paradox”, 1 cm side implies logically a surface area of 1 cm^2, and also implies logically a volume of 1 cm^3. So the three questions VF asks seem to be the same, but they’re not because at infinity these three measures are not logically equivalent. They are logically equivalent at all measures less than infinity.

    Or maybe I’m not understanding what you’re saying. Always a distinct possibility.

    Update I think I may get what you’re saying! Many thanks. More after Christmas.

    Bill,

    Haven’t seen the paper. Borrowed the book from a library.

  12. The idea that not knowing much therefore implies a uniform distribution is a practical idea but not a sound one.

    Consider a prior on the parameter theta where theta ~ U(0,1). What about the distribution of theta^2, log[theta/(1-theta)], or 1/theta?

    It is true that I’d expect theta^500 to be closer to 0 than 1, but the general question still stands of how does literal ignorance on one scale translate into knowledge on another.

    Justin

  13. Justin,
    Yes!

    The idea that not knowing much therefore implies a uniform distribution is a practical idea but not a sound one.

    and therein lies the confusion. Perhaps Matt could post on the difficulties of applying practical reasoning to mathematical ideas, and the fun in placing a uniform prior on the real line….

  14. Dr. Briggs,
    Selecting a hypothetical cube production factory with self-selected constraints in maximum size and no constraints in minimum size, to expose a weakness in Keynes’s Principle of Indifference, fails to consider the reason for manufacturing the cubes.
    Certainly, cube production would not be a random affair, but rather must be constrained by both demand and GMP.

    Crucially, Keynes suggests assigning equal probability when there is no reason to select one outcome over another.

    The principle would hold if the factory produced five different-sized widgets in equal numbers, and I were to blindly guess which sized widget was currently in production.

    The utility of thought experiments is limited by the limiting factors.
