There Is No Such Thing As Unconditional Probability: Update

Such glory could be yours
I am awarding a Briggs Internet Prize—the coveted BIP—for anybody who can demonstrate even one instance of a probability which is unconditional on any evidence whatsoever.

The prize will not be awarded, because the task is impossible. Of course, there are no unconditional truths, either, since meta-logic encompasses both ordinary Aristotelian two-value logic and probability. But let that pass for this contest.

It is necessary to describe the deadly sin of reification, most recently seen in the post On The Probability God Exists. There some took to writing things like “P(G)” where G was taken as the proposition “God Exists.” Because there was no “given bar”, i.e. no “|”, the probability seemed unconditional and therefore in need of quantification. (“Priors”—I shudder to recall—were even mentioned.)

Well, any probability can be written in such a shorthand way as to imply lack of conditions, but the written equation is not alive, it is not a real thing. In particular it is statisticians (and mathematicians) who fall under the spell of their—let’s admit it—beautiful scratchings and come to see them as having a life apart from their own minds.

Notation is a great facilitator and allows easy manipulation, but just because a thing can be written, or derived, does not mean the thing has any bearing on reality. Applied mathematicians constantly point this out to their pure mathematician brothers, and it’s about time philosophers made this known to statisticians.

Now it’s true that in just about any introductory textbook on probability you will see the “equation” “P(H)”, meant to indicate “the probability of a Head in a ‘fair’ coin flip.” This is always assigned the value 1/2. Never mind (as in never mind) how that assignment comes about and the complications of the word “fair”. Instead look at the notation: it is written to suggest that, lo, here we have an unconditional probability.

We do not. The conditions are all there in the text and are what allow the quantification. In keeping with sanity, the equation should be written P(H|E) where E is the evidence (list of premises, observations, or other things taken for granted) from which we derive or assign the probability. The E is always (as in always) there.

And that is the point: the ever-present E. E may be simple, i.e. nothing more than our intuition or faith, or it may be complex, i.e. a compound proposition mixing premises with observations and inferences, but it always exists in every single probability ever.
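
A minimal sketch in Python of the ever-present E, assuming the simplest equal-weight reading of the premises. The function `probability` and the two evidence lists are illustrations invented here, not anything from a library, and real evidence is rarely this tidy.

```python
from fractions import Fraction


def probability(proposition, evidence):
    """Return P(proposition | evidence) under an equal-weight reading.

    `evidence` lists the outcomes the premises allow, exactly one of which
    will occur; `proposition` is a predicate on a single outcome.
    Both names are hypothetical, chosen for this illustration only.
    """
    if not evidence:
        raise ValueError("no evidence, no probability")
    favorable = sum(1 for outcome in evidence if proposition(outcome))
    return Fraction(favorable, len(evidence))


def is_head(side):
    return side == "H"


E_coin = ["H", "T"]        # premise: two sides, one H and one T
E_two_headed = ["H", "H"]  # premise: both sides are H

p_coin = probability(is_head, E_coin)
p_two_headed = probability(is_head, E_two_headed)
print(p_coin, p_two_headed)
```

Run as is, this prints `1/2 1`: same proposition, different E, different probability. Leave the evidence argument out and Python refuses to return a number at all, which is rather the point.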

The reason textbooks write things like P(H) as if it were unconditional is because they want to introduce “conditional” probability later, as if it were a different thing. Because why? Because then they get to show off new mathematical techniques to manipulate these new symbols. And this is so because statisticians are under the mistaken impression that probability is a branch of mathematics, which it surely is not. At least, not when the probability is used to quantify uncertainty in anything of interest to human beings.

In short, it does no good to write P(X) and hide the conditions outside the equation, as if the equation itself were the probability. The equation is just shorthand for the real thing.

Nevertheless, I am willing to be proved wrong. If you think you can demonstrate a probability, even a probability of 1, which is conditioned on nothing, then put it in the comments below and I will personally congratulate you, award you the BIP, and admit my error.

Rules: Don’t forget the very rigorous definition of nothing, which means, just as you might suspect, no thing. Anything is something and thus not nothing. Nothing is the lack of every thing. Just because you don’t write a thing (as in “P(G)”, “P(H)”, etc.) does not mean the thing isn’t there: it merely means the thing is not written. It is still there penned in invisible ink. Intuition, i.e. faith, is a thing and is therefore not nothing.

(Incidentally, you will quickly notice we are on solid ground if there appear comments carping about the rules instead of engaging the problem, or by the absence of those regulars who ordinarily argue such matters but who were somehow busy this week and couldn’t attend.)

Update: My above prediction turned out pretty well (the parenthetical one). But as a service to readers who were unaware that a problem existed, and who were therefore curiously anxious to deny it, I pulled some quotes.

I got these from typing “unconditional probability” in scholar.google.com. Try it yourself. It’s fun.

“Secondly, there must be at least as many distinct conditional probability values as there are distinct unconditional probability values – to any unconditional probability P(Z), there corresponds the conditional probability P(Z/K) which has the same value.” Probabilities of conditionals — Revisited.

“Thus the unconditional probability that he will still be in B in period t + i given that he starts at t is 3′. The unconditional probability that he will not be in B is (1 – 3′).” Occupational Choice under Uncertainty.

“The unconditional probability that at the last stage of an n + 1 stage search we will find a new species is equal to…” On Estimating the Probability of Discovering a New Species.

I got these by typing the same into books.google.com. This is funnerer (yes, funnerer).

“However, there are some considerations that seem to favor the primacy of conditional probability…” Yes, amen, there are. “…On the other hand, given an unconditional probability, there is always a corresponding conditional probability lurking in the background.” Yes, there is. Philosophy of Statistics

“There are indications in Cohen’s paper that he does not use the unconditional probability of an event A to represent the probability of A prior to the receipt of a body of evidence relevant to A, but rather that he means to indicate the probability of A conditioned on all conceivable relevant evidence about A…” Probability and Inference in the Law of Evidence: The Uses and Limits of…

“On this reading, therefore, the apparently unconditional probability P(A), that the second toss lands heads, is really the probability EP(A|K), of A conditional on some background evidence K, e.g. about how the coin is tossed.” Probability: A Philosophical Introduction (Hey, JH, this one sound familiar?)

And it goes on and on. Texts with familiarity or a basis in philosophy at least acknowledge the discussion. Those in math-stats, or that are Stats 101 books, do not.

Now the reason this subject is so important is that by disguising or not acknowledging the full conditions, and therefore reifying the equations, probability is made to seem like something other than a matter of logic, which it is. It is therefore of fundamental importance, given the areas which statistics touches.


39 Comments

  1. DAV,

    I may create a prize just for entering, in the mode of our public schools, where everybody is a winner.

  2. Mr. Briggs,

    The equation is just shorthand for the real thing.

    Agree. Who would disagree with this statement?

    Let me respond to your comment in the post http://wmbriggs.com/blog/?p=8614.

    The questions are what a prior is and what an appropriate notation should be.

    Mellor writes about the philosophical interpretation differences among “chance”, “credence” and “epistemic probability” in his introductory book. The key phrase in your first quote from the book is “posterior credence.” So, you might have missed the point there.

    For the second statement that you quoted, the section is meant to answer how an epistemic probability can be conditional on no evidence. This problem stems from the fact that we can always further reduce the conditionalization; sooner or later, it will be reduced to a so-called absolute prior.

    Yes, I have read Jaynes’ book, courtesy of the publisher.

    Let me quote Jaynes about P(A|X), a prior probability, (the book has been edited by other people, so I can’t be sure whether it is his original writing.)

    X denotes simply whatever additional information the robot has beyond what we have chosen to call “the data.”

    You have written the probability without conditioning on any premise because you really know what it is. So why assume other people know?

    Nope, I haven’t read the other two books. But I have read a couple of papers, e.g., “Why frequentists and Bayesians need each other” by J. Williamson.

    Too much to learn. Those books will have to wait. Have you read those books yourself?

  3. The probability of the statement “There Is No Such Thing As Unconditional Probability” is an unconditional probability, since there are no conditions that make the statement false, and it is equal to 1.

  4. JH,

    I’ll see if I can move your comments to the relevant post. They don’t match this post. But in brief: his definitions of “chance” etc. as separate things are wrong. Yes, I’ve read all of them—and all of all of them. Report back to me when you have, too.

    No entry?

    Rich,

    Oh no! Not even close. I provided 700+ words of conditioning evidence.

    China hand,

    So based on your observation and intuition, etc.? Not even close.

  5. Mr. Briggs,

    (Please also fix the blockquote tag.)

    Maybe you should read the book first before you conclude those definitions are wrong. I don’t think those are Mellor’s definitions. It’s an elementary textbook that introduces students into the area of philosophy of probability.

    The reason textbooks write things like P(H) as if it were unconditional is because they want to introduce “conditional” probability later, as if it were a different thing.

    Given the premise of throwing a coin, what is the probability of landing a head? You’ve claimed, by statistical syllogism, that the probability is 1/2. (It’s 1/2 by symmetry or principle of indifference, see page 331 of Jaynes, but let’s forget about this.) Let H be the event of landing heads.

    Do you mean that when statisticians or probabilists write P(H), they really don’t know the general information defining a problem? Or do you mean that they should always explicitly write down all the information defining the problem in the probability expression?

    Let E be the premise of tossing a coin. Is P(H|E) a conditional probability? One might argue it’s not! If it is, what is P(E) or P(E|H)?

    No, the textbooks in probability and statistics are probably not written to demonstrate philosophical points.

  6. The only thing I could think of would be P(I exist or I don’t exist) = 1. But even there, isn’t there some implicit evidence too?

  7. JH,

    I read the relevant section and noticed his definitions are lacking. This would be equivalent to picking up a math book and noticing an incorrect equation. Reading the rest of the book would not make the equation right.

    I quote myself, “Incidentally, you will quickly notice we are on solid ground if there appear comments carping about the rules instead of engaging the problem…” And now I answer you, even though you did not give an entry.

    To know what I said about “P(H)” please read the text in the post. I thought it clear. There is some E such that P(H|E) = 1/2. As I say and then you say, never mind here about the E. But you cannot write “P(E)” as it makes no sense.

    Here we agree: probability and statistics textbooks are not written to demonstrate philosophical points. Which is the shame of it, since they rush out to use the philosophical principles they do not flesh out on real problems. A great pity, that.

  8. Nate,

    Yep. You’re conditioning on knowledge that tautologies are always true (how you know this, etc.).

  9. Briggs:

    It seems to me that you’ve reached your conclusion through an eccentric use of the term “condition.”

    In statistical modeling, the Cartesian product of the values that are taken on by a model’s independent variables plays a role. Conventionally, a “condition” is a proposition that is formed by placing the elements of a proper subset of the Cartesian product in an inclusive disjunction. For example, suppose that the model has a single independent variable and that this variable takes on the values of “cloudy” and “not-cloudy.” Then “cloudy” is an example of a condition and “not-cloudy” is another example.

    By placing each of the elements of the above referenced Cartesian product in an inclusive disjunction, one forms a proposition that is similar to a condition but that is not a condition, for the values from which it is formed are not a proper subset of the Cartesian product. For example, by placing “cloudy” and “not-cloudy” in the inclusive disjunction “cloudy OR not-cloudy” one forms a proposition that is similar to the condition “cloudy” and to the condition “not-cloudy” but that is not a condition, for while the values from which this proposition is formed are a subset of the Cartesian product {cloudy, not-cloudy} this subset is not proper.

    Probabilities which are “unconditional” under the definition of “condition” just presented play a role, for example, in thermodynamics. Here, a system’s “macrostate” is formed by placement of each of the associated “accessible microstates” in an inclusive disjunction and the probabilities are of the various accessible microstates.

  10. Of course. It was a jeu d’esprit.

    Are we saying any more than that no statement means anything without a context?

  11. Briggs,

    Did you work as a carny in your youth? I ask because you like to run a rigged game to draw in the suckers. I can’t believe that anyone really disagrees with what you are saying. It is just a debate over sloppy notation where P(A) = P(A|E), which is done to avoid cluttering up the page with excess notation. Only prior conditions of specific interest need be stated.

  12. In this sense, there is no such thing as unconditional anything. The point is being applied selectively to probability in order to try to prove a point, while selectively and inconsistently not being applied to anything else, because that would prove all sorts of inconvenient points the author didn’t want proving.

    The equation “P(H|E) = 1/2” is likewise wrong, because the concept of “1/2” depends on the axioms of arithmetic, the definition of the rational numbers, the axioms of logic, and so on. So you can improve it by writing “P(H|E) = 1/2 | arithmetic, rationals, etc.” but even this now needs to be extended, because it is dependent on the concept of conditionality. What does it mean for a predicate to be conditional on a context of assumptions and definitions? The rules, definitions, and assumptions by which we can do this also need to be specified. Although now we have a difficulty in that the meaning of the notation we use to define it depends on that meaning, and our definition is circular.

    What it comes down to is that if we keep on demanding that we define our terms, even for concepts like the idea of ‘definition’, we get an infinite regress. We cannot build anything with no foundations. It is the same problem Descartes tried (and failed) to solve with his ‘cogito ergo sum’ argument. But it is at the same time a trite tactic to use, nowadays. The argument is a universal solvent, that dissolves everything, and you have to be quite disingenuous in making sure to apply it only to the specific things you want dissolved.

    As previously noted, absolute probabilities exist only in mathematical models, which are *asserted*, without evidence. The same applies to any mathematical model. I *assert* x is a real number satisfying a particular equation. I *assert* ABC is a triangle in a Euclidean plane. I require no evidence, I don’t have to say how I know. This is an imaginary world I’m building in my head, in which I am omnipotent and omniscient. I can assert that this world contains a coin and the probability of it coming up heads is 1/2. It is, because I say so.

    And yes, such a mathematical model is conditional on its axioms, but the same is true of all mathematics, and indeed all language. It’s a trivial objection, by its universality.

  13. Mr. Briggs,

    I don’t disagree that a well-defined probability problem has premises. However, is it necessary to write all evidence and verbal information in a probability expression? Yes, it might be in a philosophical textbook.

    I was giving you an example to illustrate why textbooks in statistics and probability don’t add all verbal information in probability expressions. I also pointed out the question of what a conditional probability is since you wrote

    …as if it were a different thing.

    BTW, I’d be interested in reading your review of the book In Defence of Objective Bayesianism by J. Williamson.

  14. Scotian,

    Come. Just look at all the instances of the phrase “unconditional probability” that are used. Do a search and be amazed! Step right up and see the wonders of “non-informative” probabilities. Gaze in wonder at rootless, floating numbers with no support whatsoever!

    Even a quick internet search will show phenomena Ripley wouldn’t believe. (Do Google and Scholar.Google; use the phrase in quotes.)

    But really, I take it as a pretty compliment that I explained myself so well that it seems anybody who disagrees with me must be made of straw. Why, even JH has come over to our side!

    Update (to save you effort; this pdf; class homepage):

    Prior or Unconditional Probability
    – It is the probability of an event prior to arrival of any evidence.

  15. Briggs,

    I bow to your superior knowledge – I guess there is one born every minute. We all have our pet peeves – mine are the misuse of the word heat and claiming that mirrors reverse left and right, and then there is…..

  16. Total shot in the dark here, but: How about P(A=A) = 1?

    A = A being the single necessary axiom for all operations of logic, the probability that it is true must be one, and must be unconditional, because without certainty that A = A no conditions can be evaluated.

  17. There is a (quite great) Brazilian (ergo, mostly unknown) philosopher called Mario Ferreira dos Santos who builds his philosophy from the apodictic truth “there is something”. He analyzes why the Cartesian “I exist” is not quite as undeniable as Descartes thought. That would be my candidate for a proposition with 100% probability.

  18. Stephen J.,

    Nope, and you proved it yourself when you called it an axiom, which is a truth conditioned on intuition/faith, but which is still a condition.

    Mariner,

    Same thing (I’m not disagreeing with the philosopher, understand, just that his axiom isn’t unconditional).

  19. Mr. Briggs,

    I looked up Wikipedia’s definition of “axiom” and nowhere does it suggest that an axiom is “a truth conditioned on intuition/faith”, as if it was the *accuracy* of one’s intuition or faith that determined whether the axiom was objectively true/correct or not. Moreover, the Axiom of Equality, “A = A” (or “x = x”) is defined as being *universally* (which to me seems as good as saying “unconditionally”) valid. I cannot help but feel that you are being a bit No True Scotsman-y here — if the necessary definitions that allow you to postulate a probability in the first place are defined as its “conditions”, then of course all probability is “conditional”, but if the definition of “conditions” means everything then it means nothing. As Nullius in Verba notes, the objection becomes trivial by its universality.

    However, let us grant that excluding the subset “intrinsically necessary definitional conditions” from the set of “conditions” (or to phrase the same thing another way, using “conditions” as a shorthand only for “all non-necessary external conditions not required by definition”) is an avoidable sloppiness of terminology. If we stipulate that most probability textbooks should properly add to their definition of “E = those conditions affecting P” the phrase “not already included in the definition of P”, what clarity would this add to the process? What errors would this allow probability students to avoid?

  20. Stephen J.,

    Let me put it this way: How do you know this proposition (this axiom) is true? Once you answer that you will see that you believe it is so based on (conditional on) your intuition (or faith).

    I too think this contest is trivial, but darned if many people don’t. I invite you, as I did above, to search the term (with quotes) “unconditional probability.”

  21. Mr. Briggs,

    It would seem to me that there is a vital difference between the statements “This axiom is true” and “*I believe* this axiom is true”. I’m not sure I see how the basis for my *belief* in an axiom’s truth constitutes a “condition” (i.e. a determining factor) of whether it actually *is* true or not. (Does this mean that I am inadvertently subscribing to the objectivist Bayesian stance rather than the subjectivist one?)

    In addition, following your instructions, I looked up “unconditional probability” and came across the Wikipedia article for *conditional* probability, which said: “P(A) is the probability of A before accounting for evidence E, and P(A|E) is the probability of A having accounted for evidence E.” It would seem to me that by definition this stipulates that E includes *only* relevant evidence not already posited by the definition of A, which would suggest in turn that the “unconditional probability” usage you complain about may be something of a straw man in practice. (The issues in the “Probability that God Exists” post seem to derive from an inadequately defined proposition A rather than a failure to account for separate evidence E, and from — forgive me — your own lumping of “premises” and “evidence” together into one data set; I was under the impression that “premises” are precisely what one posits *before* introducing “evidence” to support or disprove them.)

    That said, your complaints have sufficient fervour that I infer you have encountered many other such examples of statisticians reaching erroneous philosophical conclusions based on unwarranted assumptions of unconditionality. Could you tell us about some of these, and how being forced to account for E would have prevented them?

  22. Stephen J.

    Very well, prove the axiom is true and then say why your belief in its truth is different. This will be a useful exercise to show I am right.

    It makes no sense to say “P(A) is the probability of A before accounting for evidence E…” because there is no such thing as “P(A)”. It must always be “P(A|something).”

    Your search was not very far ranging. Plus, on this blog I have many times tried to show the consequences of improperly accounting for evidence in assigning probabilities. Look up the subjective probability vs. others thread, for example. That subjectivism and frequentism exist is reason enough to lament “unconditional” probability.

  23. “It must always be ‘P(A|something)’.”

    Granted; but the impression I had of E is that it is always used to mean “what has not already been accounted for in ‘something'”, and the impression I had of A is that it always includes the “something” you specify above. Are these impressions wrong? (You yourself state in your post on Swinburne’s P-Inductive and C-Inductive Arguments for God that saying “H” in the position of A above is accepted shorthand for “H is true,” which would seem to me to necessarily include, “All conditions required for H to be true apply.”)

    As for “proving the axiom true”, that is begging the question; an axiom is by definition that which one needs to assume to prove anything else true, and if it is something that *can* be “proven” true then by definition it isn’t an axiom. But if anything at all can be taken as self-evidently true by definition, it would seem to me that “A = A” would be it; at the very least, if that can’t be taken as read then no results can be reached — and if it’s not self-evidently true, then my belief that it is won’t make it so, or vice versa.

    I’m afraid the more I try to understand this issue, the more it seems to me like somebody complaining about incorrect spellings of a word by pointing out that assuming a “correct” spelling requires specifying which alphabet you’re using first — the point is not invalid, but it does not actually address the errors noted. But I will freely admit that P(I’m Misunderstanding The Issue | My Lack of Formal Training) = (very high). 🙂

  24. Mr. Briggs,

    Almost all statisticians, mathematicians and probabilists would agree that well-defined probability problems require premises.
    After all, authors of good textbooks need to write well-defined and solvable exercises for students. If all the probabilities, unconditional or marginal or conditional or joint, are not well-defined, none of the math in a textbook would make sense.

    Be happy that those authors are probably on your side!!!

    However, you probably want to think about why those authors use terms such as unconditional or marginal or conditional or joint probabilities. I am sure you could find answers. You might just find that some of your readers are correct.

    Oh, are there no unconditional truths? What is an unconditional truth? Is it a truth that holds for everyone at all times? Are there any unconditional truths? My guess is that philosophers have debated it for centuries and still couldn’t reach an answer.

  25. Mr. Briggs,

    Here we agree: probability and statistics textbooks are not written to demonstrate philosophical points. Which is the shame of it, since they rush out to use the philosophical principles they do not flesh out on real problems. A great pity, that.

    Do you remember that you predicted an 80% probability of a Romney win? It would be really great if you could show logically how you derived this probability via objective Bayesianism! Yes, use proper notation, write down all premises and evidence and information. It’d be a pity not to put whatever philosophical principles you mentioned into practice. Don’t you think?

  26. I agree with Terry Oldberg that you appear to be making a use of the word ‘condition’ that is unconventional in the context of mathematical probability theory (though perhaps not so unconventional in the non-technical context).

    In the non-technical use of ‘condition’ no one could disagree with your claim because it is almost a tautology. No statement is true or false, including a probability claim, without some assumed ‘conditions’ (if only those involved in the definitions of the terms and symbols in the statement of claim).

    But in probability theory a conditional probability refers to a specific type of condition, namely the restriction to a certain subset of the sample space. The choice of sample space and measure that one starts with is of course a ‘condition’ in the colloquial sense but not in the sense of conditional probability. And of course even in the technical sense every probability is conditional in the trivial sense that P(X)=P(X|U) where U is the entire sample space or “universe” of possible outcomes.
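
    Spelled out, assuming the usual ratio definition of conditional probability (not stated above) and that X is a subset of U with P(U) = 1:

    ```latex
    P(X \mid U) = \frac{P(X \cap U)}{P(U)} = \frac{P(X)}{1} = P(X)
    ```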

  27. You can look at a mathematical model from the inside or the outside. From the inside, the axioms are trivially and unconditionally true. The proof is concluded in one step, by simply stating the axiom. Looked at from the outside, the theorems are conditional on the axioms, on the model itself. You can embed axiom systems and their consequences into a meta-model, with a conditionality relationship linking them together. But this itself is a mathematical model, with axioms and assumptions that define what “conditionality” and “axiom” and “proof” mean in general, and there is therefore a second layer of conditionality, and so on.

    This has nothing specifically to do with probability. It’s a general feature of all mathematics and logic. (Stephen J’s spelling analogy is quite apposite.)

    Sensible people, when they’re working inside a mathematical model, don’t try to incorporate the meta-model stuff that strictly belongs outside it. You suspend disbelief and take the model for granted. The model is a container, like a box. You can’t put the box itself inside it. Mixing up levels of abstraction was how Bertrand Russell got into trouble.

  28. Briggs said:

    Same thing (I’m not disagreeing with the philosopher, understand, just that his axiom isn’t unconditional).

    Well, I don’t understand what “unconditional” means, then. If a proposition is true regardless of any conditions, isn’t it unconditional? “There is something” is such a proposition (it’s the most basic one; others can be derived from it, which will “inherit” its unconditionality).

  29. P(I will die).

    What conditional information do you need to assess:

    P(your estimate of P(I will die)) is correct?

    I say none.

  30. Geckko: I’ll give a shot at answering your question. It’s conditional on the idea that your lifespan as a living being is transient.

    Correct me if I’m wrong, but I believe you’re conditioning on the information that all living things in the past have died, which would obviously make the estimate of your 1st statement = 1. But that still is conditioned on your evidence that you 1) are like all other living things and 2) using simple logic, will suffer a similar fate. You may consider this to be an absolute truth, but there are still assumptions that are being made for your estimate.

    In fact, you concede in your second statement that there can be multiple estimates of the statement P(I will die). For example, philosophically you could go further and say that you need to define what “die” means. There are some folks that hope to be able to store their brain function in computers in the future, or replace themselves by robots. Would they have died? Then you would need to condition on what your definition of “I will die” actually means. (You could even go into a religious angle from there. But I will not.)

    If I misunderstood your question, and you were addressing the probability of (someone’s estimate of the probability statement of #1), then I think that’s a simple exercise: The probability estimate of the longer statement will necessarily be conditioned on the probability statement inside that statement, as I just addressed above.
