Direct And Inverse Probability: The Bayesian Way

Another Wikipedia picture that draws the eye but says little.
Class is almost over! Just three days (including today) left. I’m way behind reading comments.

All logic, of which probability is an example, begins with a fixed list of premises from which we attempt to write a conclusion which either follows from these premises or which is as true as can be under these same premises. This is direct probability.

Thus, given “All men are mortal and Socrates is a man” the conclusion which is true with respect to these is “Socrates is a mortal.” This has probability 1. Or given “Most Martians wear hats and George is a Martian” the conclusion which has the highest probability is “George wears a hat.” Assuming also as a premise that “most” means greater than half and up to all, its probability is greater than one half and up to one. Not to mince about, this is an interval and not a number.
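In symbols, and only as a sketch: write Pr(C | P) for the probability of conclusion C with respect to premises P. That notation is an assumption brought in here for illustration (it needs amsmath for \text); it is not used in the text above. The two examples then read:

```latex
\[
  \Pr(\text{Socrates is mortal} \mid \text{All men are mortal; Socrates is a man}) = 1
\]
\[
  \tfrac{1}{2} < \Pr(\text{George wears a hat} \mid \text{Most Martians wear hats; George is a Martian}) \le 1
\]
```

The second line is the interval: the premises fix the probability no more tightly than that.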

The key to direct probability: the premises are fixed, unflinching and unchangeable, and accepted as is by all. Apart from the exception mentioned next, disagreements over its use are rare, and mostly academic.

The other kind of direct probability is when the premises are just as fixed but the proposition at the end, the conclusion as it were, is variable. So we could ask, if desired, the probability of “Socrates has a happy marriage” with regard to the first premises, or even the second. Not much would come of it, of course, but the thing could be done. And is done. This is the origin of several popular fallacies.

Then there is inverse probability, which starts with the “conclusion”, the desired or desirable proposition, and seeks premises which make it true or probable. This can always be done. The proof is trivial. Suppose we want to know the truth of “Socrates has a happy marriage”. By supplying the premise “Socrates has a happy marriage” the conclusion has probability one.
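The trivial proof can be written in one line. As a sketch, let H stand for the proposition “Socrates has a happy marriage” (the symbol H and the Pr notation are assumptions made for illustration) and take H itself as the only premise:

```latex
\[
  \Pr(H \mid H) = 1
\]
```

Nothing about Socrates was used, so the same line holds for any consistent proposition put in place of H. That is why the maneuver is always available.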

This won’t be a satisfying maneuver for most, but it hasn’t stopped many from using it. For those in the habit of producing circular arguments the practice is to word the “premise” differently but equivalently and to pad it with extraneous words. For example, “Through my long years of reading history I, even I, have determined that that most well known philosopher was united in domestic bliss.” Therefore, etc.

More fallacies enter via inverse probability than by direct probability, and more arguments are started over it. Keep our example. Which are the best premises? Well, how much do we know of Xanthippe? If Plato is any guide, we know she had to be led away from Socrates’s prison wailing and weeping. Sounds like the behavior of a loving wife. But then we have the man’s own words, “By all means, marry. If you get a good wife, you’ll become happy; if you get a bad one, you’ll become a philosopher.”

Just which of these, and the many others we can consider, are the right premises? Scholars differ. But we must keep in mind that we can always find a set of premises which makes the conclusion/proposition as true or false as we like. The game then becomes arguing over the list.

Some of the premises which appear in the list will also make appearances in lists of other arguments, arguments which could be related to the proposition at hand. Knowledge, it has been said, is a tangled web. When the same premises are found all about the same village, as it were, it strengthens their support for inclusion. But this is because we are considering a meta-argument with conclusion “This premise should be included.”

People excel at discovering confirmatory premises. But they stink at admitting negative ones. Note admitting and not discovering. It’s often easy enough to find negative premises, but it’s painful to allow them into the list.

And then even if we have all the right premises, we’re not too good at logically tying them together, except in the narrowest of circumstances, like mathematicians tackling theorems or chemists chasing how much of a compound is enough.

An excellent example of inverse probability is a criminal trial. The proposition/conclusion is agreed to by all: “The defendant is guilty.” The hope is, people being what they are and admitting their soaring skills at confirmation bias, that because there will be those in favor of and against the proposition, all the relevant premises will be discovered.

Because of the vast complexity of these premises, we can only thank God that the probability the conclusion is true is not only not required to be quantified, but it is forbidden to be so.


  1. Is your last paragraph correct? In civil trials (at least in those countries drawing from the British legal tradition) the finding is made ‘on the balance of probabilities’, which on the face of it would suggest that the jury or judge determines that it is 51% or more likely that the event happened, or did not happen, and makes a determination on that basis.

    In criminal matters, how does ‘proof beyond a reasonable doubt’ relate to probability? ‘Reasonable doubt’ is something of a probabilistic formulation. Not so much the ‘reasonable’ part (in law it is shorthand for a doubt that a ‘reasonable’ man might entertain) but the doubt part. How much doubt must this hypothetical reasonable man entertain? Say, 5% like a science experiment?

  2. Stephen,

    It is correct, yes. You could, and in keeping with tradition probably should, write “He’s innocent”. But nobody believes that.

    The “balance of probabilities”, where it is invoked, would mean “greater than 50%”, which would include 50.00000000000000000000000000001% and so forth. Thank God this is never quantified, nor is “guilty beyond a reasonable doubt.”

    Probability is not always—indeed, is rarely—quantifiable.
