July 11, 2018 | 6 Comments

Is Presuming Innocence A Bayesian Prior?

Note An earlier version of this post was accidentally sent out in unedited form. My enemies caused me to hit the wrong button. Subscribers: apologies for the near duplicate email.

I don’t mean to pick on Deborah Mayo, but her site has lots of good probability teasers that don’t confuse the question with a lot of math. Nothing wrong with math, except that most of it is just plodding along. Solving math problems for fixed situations is not as much fun as solving philosophical ones, to me.

The conundrum today is provided by Larry Laudan, who wrote “Why Presuming Innocence is Not a Bayesian Prior” at Mayo’s site.

Let’s switch things up. I’ll say what I think is the right answer, then we’ll let Laudan have a go.

Judging a man guilty or innocent, or at least not guilty, is a decision, an act. It is not probability. Like all decisions it uses probability. The probability you form depends on the evidence you assume or believe. Probability is the deduction, not always quantified, from the set of assumed evidence of the proposition of interest. In this case, “He’s guilty.”

When jurors are empanelled they enter with minds full of chaos. Some might have already formed high probabilities of guilt of the defendant (“Just look at him!”), some will have formed low (“Just look at him!”), because all will have different evidence assumed. Yet most, we imagine, will accept the proposition “There’s more evidence about guilt that I haven’t yet heard.” Add that to what’s in their minds, perhaps after subtracting some beliefs, and some jurors might form a low probability.

Now no juror at this point is ever asked to form the decision from his probability to guilty or not guilty. Each could, though. People do. You do when you read of trials in the paper, for instance. There is nothing magical that turns the evidence at the final official decision into “real probability”. Decisions could be made at any time. It is only that the law states only one decision will count, and that’s the one directed by the judge.

Of course, what’s going on in a juror’s mind—and I speak from experience—is nearly constantly shifting. One moment you believe or accept this set of evidence, the next moment maybe something entirely different. You’re nearly always ready to judge based on the probability you’ve formed right now. “He was at the school? He’s Guilty!” Then you hear something new and you think Not guilty. The judge may tell you to ignore a piece of evidence, and maybe you can or maybe you can’t. Some jurors see a certain mannerism and interpret it in a certain way, some didn’t. And so on.

At trial’s end, every juror retires to their room with what they started with: minds full of augmented chaos—a directed chaos, though. The direction is honed by the discussion the jurors have. They will try to agree on two things: a set of evidence, which necessarily leads to a deduction of a (non-quantified, thank the Lord) probability (which won’t be precisely identical for each juror, because the set of evidence considered will never be precisely identical), and a decision based on the probability. Decisions are above probability. They account for thinking about being right and wrong, and what consequences flow from that. Each juror might come to a high probability of Guilty, but they might decide Not guilty because they think the law is stupid, or (think OJ) “racist”.

That’s the scheme. But we still haven’t accounted for the initial directive of “presuming innocence”. What happens with that?

You hear “You must presume the defendant innocent.” That can be taken as a judgement, i.e. a decision, or a command to clear the mind of evidence probative to the question of guilt. Or both. If it’s a decision, it’s nothing but a formality. Jurors don’t get a vote at the beginning of a trial anyway, so hearing they would have to vote Not guilty right now, if they were allowed to vote, isn’t much beyond legal theater.

But if it’s a command to clear the mind, or a command to at least implant the evidence “I don’t know all the evidence, but know more is on its way”, and to the extent each juror obeys this command, it is treated as a piece of evidence, and therefore forms part of each juror’s total evidence, which itself implies a (non-quantified) probability for each juror.

So the command is not a “prior” per se. A “prior” is a probability, and probability is the deduction from a set of evidence. That the command is used in forming a probability (of course very informally), does make it prior evidence, though. Prior to the trial itself.

That’s the answer. We’re done. With the reminder that Bayes itself is not what is important in probability. Bayes is just a helpful formula, which isn’t strictly needed. Our answer is the same as what we began with. Probability is deduced from the evidence assumed (at any point), and decisions are acts made with reference to the probability and other matters.

What does Laudan say?

He says the command is “an instruction about [the jurors’] probative attitudes”. I agree with that, in the sense just stated. But Laudan amplifies:

asking a juror to begin a trial believing that defendant did not commit a crime requires a doxastic act that is probably outside the jurors’ control. It would involve asking jurors to strongly believe an empirical assertion for which they have no evidence whatsoever.

That jurors have “no evidence whatsoever” is false, and not even close. I walked into my last trial with the thought, “The guy probably did it because he was arrested and is on trial.” That is positive evidence for Guilt. I had lots of other thought-evidence, as did each other juror. I’m sure some came in thinking Not guilty for any number of other reasons (evidence). The name of the crime itself, taken in context, is more evidence. Each juror could commit, as I said, his “doxastic act” (his decision), at any time. Only his decision doesn’t count until the end.

asking jurors to believe that defendant did not commit the crime seems a rather strange and gratuitous request to make since at no point in the trial will jurors be asked to make a judgment whether defendant is materially innocent. The key decision they must make at the end of the trial does not require a determination of factual innocence. On the contrary, jurors must make a probative judgment: has it been proved beyond a reasonable doubt that defendant committed the crime? If they believe that the proof standard has been satisfied, they issue a verdict of guilty. If not, they acquit him. It is crucial to grasp that an acquittal entails nothing about whether defendant committed the crime, [sic]

We have already seen how each juror forms his probability and then decision based on the evidence. That evidence can very well start with the evidence provided by the judge’s command. I don’t buy his “at no point” either. Many jurors take the vote of Not guilty to mean exactly “He didn’t do it!”—by which they mean they believe the defendant is innocent. Anybody who has served on a jury can verify this. Some jurors might say, of course, they’re not sure, not convinced. To insist that “an acquittal entails nothing about whether defendant committed the crime” is just false—except in a narrow, legal sense.

Laudan says “Legal jurisprudence itself makes clear that the presumption of innocence must be glossed in probatory terms.” That’s true, and I agree the judge’s statement is often taken as theater, part of the ritual of the trial. But it can, and in the manner I showed, be taken as evidence, too.

Now it seems Laudan is not a Bayesian (and neither am I):

Bayesians will of course be understandably appalled at the suggestion here that, as the jury comes to see and consider more and more evidence, they must continue assuming that defendant did not commit the crime until they make a quantum leap and suddenly decide that his guilt has been proven to a very high standard. This instruction makes sense if and only if we suppose that the court is not referring to belief in the likelihood of material innocence (which will presumably gradually decline with the accumulation of more and more inculpatory evidence) but rather to a belief that guilt has been proved.

As I see it, the presumption of innocence is nothing more than an instruction to jurors to avoid factoring into their calculations the fact that he is on trial because some people in the legal system believe him to be guilty. Such an instruction may be reasonable or not (after all, roughly 80% of those who go to trial are convicted and, given what we know about false conviction rates, that clearly means that the majority of defendants are guilty). But I’m quite prepared to have jurors urged to ignore what they know about conviction rates at trial and simply go into a trial acknowledging that, to date, they have seen no proof of defendant’s culpability.

I can’t say what Bayesians would be appalled by, though the ones I have known have strong stomachs. That Bayesians see an accumulation of evidence leading to a point seems to me to be exactly what Bayesians do think, though. How to think of the initial instruction (command), we have already seen.

I agree that the command is used “to avoid factoring into their calculations the fact that he is on trial because some people in the legal system believe him to be guilty.” That’s evidence (which he just said jurors didn’t have). Increasing the probability of guilty because the defendant is on trial is what many jurors do. Even Laudan does that! That’s why he quotes that “80%”. The command (sometimes) removes this evidence. (Laudan may be using evidence as true statements of reality; I do not and instead call it the premises the jury believes true; some lawyers have been known to lie.)

Laudan doesn’t say, but I’m guessing he’s a frequentist. Jury trials are perfect at showing frequentism fails as a definition of probability. In that theory, probabilities are defined by infinite sequences of positive (guilty) measurements embedded in infinite sequences of positive and negative (guilty and not guilty) measurements. (Large doesn’t count: has to be infinite.)

Tell me just what exact unique no-dispute no-possibility-of-other infinite sequence this real-life trial is embedded in. Black guy on trial for selling a certain quantity of cocaine within so many yards of a school. Guy, born and raised Christian in the States, dresses in Muslim garb. Good luck!

Sequence has to be exact unique no-dispute no-possibility-of-other otherwise you could come to different probabilities.

Incidentally, I was a juror on a trial with these circumstances. The black women on the jury were incensed to high degree and never forgave the defendant for wearing the garb.
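To make the reference-class point above concrete, here is a minimal sketch. The records are invented purely for illustration, the attribute names are hypothetical, and a finite list is of course not the infinite sequence the theory demands. It only shows that the relative frequency of Guilty moves as soon as you change which sequence the trial is said to belong to.

```python
from collections import namedtuple
from fractions import Fraction

# Hypothetical past cases, invented for illustration only.
Case = namedtuple("Case", ["near_school", "drug_charge", "convicted"])

cases = [
    Case(True, True, True),
    Case(True, True, True),
    Case(True, True, True),
    Case(True, False, False),
    Case(False, True, True),
    Case(False, True, False),
    Case(False, True, False),
    Case(False, False, False),
]

def relative_frequency(records, in_class):
    """Share of convictions among the records that fall in the chosen reference class."""
    chosen = [r for r in records if in_class(r)]
    return Fraction(sum(r.convicted for r in chosen), len(chosen))

# Four candidate reference classes for the very same trial.
reference_classes = {
    "all cases": lambda r: True,
    "drug charge": lambda r: r.drug_charge,
    "near a school": lambda r: r.near_school,
    "drug charge near a school": lambda r: r.drug_charge and r.near_school,
}

for name, rule in reference_classes.items():
    print(name, relative_frequency(cases, rule))
```

With these made-up records the four classes give 1/2, 2/3, 3/4 and 1. Nothing in the data says which class is the right one, which is the difficulty.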

July 10, 2018 | 12 Comments

Bayesian Theorists Were Little Better Than Cranks

I stole today’s title from David Papineau’s essay “Thomas Bayes and the crisis in science”, which many readers sent in.

When I was in grad school back in the early to mid 1990s, Bayes was just off its flush of becoming respectable, which occurred mostly in the 1980s. But then, as now, and as you’ve all heard me lament before, all statisticians must first be initiated into frequentism. As such, they find it difficult to overcome. The experience is not unlike trying to leave the religion of your youth. Sure, you can stop practicing it. But you can never stop feeling its influence.

This is why you still hear from self-styled Bayesians admonitions to develop Bayesian procedures with “good frequentist properties”, which is (a) begging a Simpson’s paradox-type situation, and (b) incoherent. If Bayes is right (about which sense more in a moment), then it’s always right and frequentism wrong, and vice versa. The two are not compatible philosophies of probability.

See Uncertainty: The Soul of Modeling, Probability & Statistics for more on all this, incidentally.

Anyway, Bayes has three interpretations. The subjective which says, and I do not jest, probability is a function of the indigestibility of your food. The probability of any proposition is how you feel about it. It is therefore an effeminate philosophy (do not confuse feminine with effeminate). The objective, which is frequentist in character, and which thinks probability is ontic. This is a mistake. And then the logical, which says probability is epistemic. This is the correct view (which is not really called “Bayesian” by anybody, though people use it that way). I’m not proving this here: I’m telling you. Read the book for arguments.

The importance of Bayes is not—as I have stressed hundreds of times, to little avail—is not in the formula. It is not strictly needed, not ever. It is nice, it is helpful. But that is it. What we always want is

     Pr(Y | X)

where Y is the proposition of interest and X is the totality—I’d shout this if I thought it would do any good—of evidence. This probability is not always quantifiable. Tough cookies. How we get to Pr(Y | X) is only of interest to technicians, and is where the formula might be of use. But it is always beside the point.

Which means all the ya-ya-ya about “updating beliefs” is beside the point. First, subjective probability is wrong, and second, the update is a technical matter. What always counts is the totality of evidence you accept. And the evidence you accept is not necessarily the same as I accept—or the same as anybody else accepts. Hence disputes. Probability is only a dull function of the evidence accepted.
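For the technicians, the formula in question can be sketched like so, assuming the totality of evidence X is split into background evidence B plus one new piece E. (B and E are labels introduced here for illustration; they appear nowhere in the notation above.)

```latex
\Pr(Y \mid X) = \Pr(Y \mid E, B)
  = \frac{\Pr(E \mid Y, B)\,\Pr(Y \mid B)}{\Pr(E \mid B)},
\qquad X = E \wedge B, \quad \Pr(E \mid B) > 0 .
```

The left-hand side is the same Pr(Y | X) as before. The right-hand side is only one route for computing it, and it works only when each piece can be quantified, which is not always so.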

The real revolution in Bayesian thought is that everything uncertain can be assigned a probability, though not always in number. There is nothing wrong with that sentiment, and everything right. But like I just said in other words, it is the evidence which counts. And only the evidence. The math connecting evidence to probability (the least interesting aspect) we can leave to geeks and nerds.

This is why we know statements like the following (from the article) are false in the strict sense:

Bayes’s reasoning works best when we can assign clear initial probabilities to the hypotheses we are interested in, as when our knowledge of the minting machine gives us initial probabilities for fair and biased coins.

No. What works best is assembling the evidence that comes closest to showing the cause of the proposition of interest Y. The wrong path has already been chosen, as we see by the next sentence: “But such well-defined ‘prior probabilities’ are not always available.”

We don’t need “prior probabilities” on the theory that some thing causes heart attacks. We need evidence that it does or doesn’t. Sometimes we start out ignorant. So what? We build evidence from that ignorance.
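For the minting-machine case quoted above, the arithmetic runs as follows. This is a minimal sketch under invented numbers: a machine assumed to mint 10% biased coins, each landing heads 80% of the time, and a hypothetical function name `posterior_biased`. None of these figures come from the article. The second call shows that accepting different evidence about the machine gives a different probability from the same flips.

```python
from fractions import Fraction

def posterior_biased(heads, tails, prior_biased, p_heads_biased,
                     p_heads_fair=Fraction(1, 2)):
    """Pr(coin is biased | observed flips, evidence about the machine)."""
    # Probability of the observed flips under each hypothesis.
    like_biased = p_heads_biased**heads * (1 - p_heads_biased)**tails
    like_fair = p_heads_fair**heads * (1 - p_heads_fair)**tails
    # Bayes's formula: weight each hypothesis by the minting evidence.
    numerator = like_biased * prior_biased
    return numerator / (numerator + like_fair * (1 - prior_biased))

# Assumed evidence: 10% of minted coins are biased, biased coins land heads 4/5.
five_heads = posterior_biased(5, 0, Fraction(1, 10), Fraction(4, 5))
print(five_heads, float(five_heads))

# Same flips, different accepted evidence about the machine (half are biased).
other_evidence = posterior_biased(5, 0, Fraction(1, 2), Fraction(4, 5))
print(other_evidence, float(other_evidence))
```

With no flips at all the function returns the machine figure unchanged, which is all a “prior” amounts to here: the probability deduced from the evidence held before the flips.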

Thinking Bayes is a panacea, or a universal formula, is why die-hard frequentists are still scared of leaving their incorrect theory of probability. No panacea exists. Subjectivism is silly. And they are right.

But it is a false dichotomy to insist on either subjective/objective Bayes or frequentism. There is a third way.

July 9, 2018 | 5 Comments

Diversity Is Our Weakness

Diversity is our weakness. Diversity is the process of giving special consideration to those who have favored demographic or biologic characteristics or who have non-procreative sexual desires.

Now you might think it insane to accord special preference to people because they express enthusiasm for and participation in non-procreative sexual practices. You might say that a civilization that takes “pride” in such extreme self-indulgences is effeminate and courting death. And you would be right.

But the promotion of unhealthy sexual desires is not what Diversity is about. Diversity is about power. Diversity is therefore never strictly about the subjects in which Diversity is pushed. The subjects are always incidental. Who controls the subjects is paramount.

Same Old Tune

A paradigmatic example. Sheffield University and the Centre for New Music will host a competition in classical music composition. To decide who wins, “A ‘two ticks’ policy will be in place for female composers, composers who identify as BME, transgender or non-binary, or having a disability, to automatically go through to the second stage of the selection process.”

“BME” stands for “black and minority ethnic”.

Quality in the music will be down-weighted. Up-weighted will be political measures.

One of two things will happen. A normal person not exhibiting the desired “victim” characteristics will win. This implies the quality of this person’s work will not only be better than victim competitors’ work, but that it soared far above them (and had to, to overcome the non-quality weighting). This person’s undesired victory will lead to a call for actual quotas from the embarrassed judges in the next competition.

Or a designated victim will win. This person’s work will probably be mediocre, since the quality of the victim’s work will receive less than full weight. The quality of the music in the competition will decline.

Who wins takes precedence over what wins. Diversity thus causes, on average, a decline in standards.

Chalk Another One Up

Quotas are never called quotas at the beginning of the Diversity process. For example, Cornell University was discovered never to have had a female president. So one was hired. But nobody admitted the new president was hired because she was a female.

This female president sadly died shortly after her arrival. So a new female was hired in her place.

Each person’s qualifications were not the sole judge of her merit. Their sex was included in the judging. Those deciding among the candidates did not confess these females were picked because they were female. So it’s unclear how much genuine merit counted: was there a better male rejected because he was male?

Anyway, the true belief that quotas are harsh measures and harmful to quality constrains people from acknowledging, at first, their use.

This shyness does not persist. Since Diversity, as in the music competition, always leads to a decline in quality, and thus in greater “inequities” in rewards, eventually actual calls for quotas-by-name are made.

Equality Kills

Find out how here.

It’s Official

The Diversity process always follows a common pattern. Here then are the official Steps of Diversity:

All right here, at this most clickable link.

July 8, 2018 | 1 Comment

Summary Against Modern Thought: Ultimate Happiness Man’s Knowledge of God

Previous post.

Three-word summary: Avoid bad books.

That Human Felicity Does Not Consist In the Knowledge of God Which is Generally Possessed by Most Men

1 It remains to investigate the kind of knowledge in which the ultimate felicity of an intellectual substance consists. For there is a common and confused knowledge of God which is found in practically all men; this is due either to the fact that it is self-evident that God exists, just as other principles of demonstration are—a view held by some people, as we said in Book One—or, what seems indeed to be true, that man can immediately reach some sort of knowledge of God by natural reason.

For, when men see that things in nature run according to a definite order, and that ordering does not occur without an orderer, they perceive in most cases that there is some orderer of the things that we see.

But who or what kind of being, or whether there is but one orderer of nature, is not yet grasped immediately in this general consideration, just as, when we see that a man is moved and performs other works, we perceive that there is present in him some cause of these operations which is not present in other things, and we call this cause the soul; yet we do not know at that point what the soul is, whether it is a body, or how it produces these operations which have been mentioned.

Notes That there must be an orderer is easily proved. However things work, at base, there has to be a creator of how things work. How things work could not have come about “randomly”, which is impossible, or from nothing, which is also impossible. There must therefore be an author (or, as our good saint says, at least one, the singularity of the one not yet proven).

2 Of course, it is not possible for this knowledge of God to suffice for felicity.

3 In fact, the operation of the man enjoying felicity must be without defect. But this knowledge admits of a mixture of many errors. Some people have believed that there is no other orderer of worldly things than the celestial bodies, and so they said that the celestial bodies are gods.

Other people pushed it farther, to the very elements and the things generated from them, thinking that motion and the natural functions which these elements have are not present in them as the effect of some other orderer, but that other things are ordered by them.

Still other people, believing that human acts are not subject to any ordering, other than human, have said that men who order others are gods. And so, this knowledge of God is not enough for felicity.

Notes The first “gods” fallacy is mostly dead. The second “physics” fallacy is alive and prospering. We are the generation privileged to see the true birth of the “man-as-god” fallacy, which we might also call the “final fallacy.”

4 Again, felicity is the end of human acts. But human acts are not ordered to the aforementioned knowledge, as to an end. Rather, it is found in all men, almost at once, from their beginning. So, felicity does not consist in this knowledge of God.

5 Besides, no man seems to be blameworthy because of the fact that he lacks felicity; in point of fact, those who lack it, but are tending toward it, are given praise.

But the fact that a person lacks the aforesaid knowledge of God makes him appear very blameworthy. Indeed, a man’s dullness is chiefly indicated by this: he fails to perceive such evident signs of God, just as a person is judged to be dull who, while observing a man, does not grasp the fact that he has a soul.

That is why it is said in the Psalms (13:1, 52:1): “The fool hath said in his heart: There is no God.” So, this is not the knowledge of God which suffices for felicity.

6 Moreover, the knowledge that one has of a thing, only in a general way and not according to something proper to it, is very imperfect, just like the knowledge one might have of a man when one knows simply that he is moved.

For this is the kind of knowledge whereby a thing is known only in potency, since proper attributes are potentially included within common ones. But felicity is a perfect operation, and man’s highest good ought to be based on what is actual and not simply on what is potential, for potency perfected by act has the essential character of the good. Therefore, the aforementioned knowledge is not enough for our felicity.