# Why Statistics Can’t Discover Cause, And Bad Priors

My mailbag is filling up. Today two questions from readers, both about statistics. Feel free to send yours in on any subject. Tomorrow is a doozy.

Question 1

From reader Michael H. comes this email:

Hi Dr. Briggs,

I’ve been thinking about statistics much more now that I am required to take an applied statistics class for my actuarial certification.

I’ve watched your video on the crisis of evidence a couple of times. My understanding is that statistics cannot determine that something is a cause, but it may be able to say something of how it is a cause, for example the magnitude of the effect it will have given that we know it is a cause. Is this the case?

Moreover, your argument (at least as concerns the banana example) does not seem to imply that statistics cannot in principle determine cause, but because of the preponderance of possible causes it cannot distinguish between these. Would it be possible to determine a cause were the possible causes limited sufficiently, or is this a problem in principle? If in principle, what is the principle?

Thank you.

Probability models only tell us how our uncertainty changes in some proposition given varying assumptions. Therefore, probability models are not causal or deterministic. An example of a deterministic, but not causal, model is y = a + b*x. This says the value of y is determined by the values of a, b, and x. It says nothing directly about cause. Knowledge of cause comes from understanding a thing’s powers, and its nature or essence. These are not matters of probability.

Statistics, or probability models, in principle cannot determine cause because they remain mute about powers, natures, and essences. Understanding of these comes from induction (in its various types). Probability models aren’t even, except in trivial cases, deterministic. Consider regression. There the equation describes the central parameter of a probability distribution, not the observable itself. That is, the central parameter is said to be a function of various explanatory variables. The central parameter says nothing about any cause, therefore any function of it is silent on cause.
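A minimal sketch of the distinction, for readers who prefer to see it in code. The normal distribution, the spread `sigma`, and the numbers for `a`, `b`, and `x` are all assumptions chosen for illustration; none of them come from the discussion above.

```python
from statistics import NormalDist


def deterministic_y(a: float, b: float, x: float) -> float:
    """Deterministic model: y is fixed once a, b, and x are given."""
    return a + b * x


def prob_y_exceeds(threshold: float, a: float, b: float, x: float, sigma: float) -> float:
    """Regression-style model: a + b*x is only the central parameter (here, the
    mean of an assumed normal distribution). The output is a probability about
    y, not a value of y, and not a statement about what causes y."""
    mu = a + b * x
    return 1.0 - NormalDist(mu=mu, sigma=sigma).cdf(threshold)


# Hypothetical values, for illustration only.
a, b, sigma = 1.0, 2.0, 1.5

print(deterministic_y(a, b, x=3.0))                    # 7.0
print(prob_y_exceeds(7.0, a, b, x=3.0, sigma=sigma))   # 0.5
```

The first function returns a value of y. The second returns only our uncertainty in a proposition about y, given the assumptions. Neither mentions powers, natures, or essences.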

There’s lots more to say about this. I have some details in the paper “The Crisis Of Evidence: Why Probability And Statistics Cannot Discover Cause”, and much more in my (forthcoming?) book, which proves all these things.

Question 2

Our second question—and here readers can help—comes from Miha.

My name is Miha [personal information removed]. I also teach “analytics” in our executive program and have done a number of lectures at the business school here as well…

Yesterday I listened to your outstanding podcast on frequentist and Bayesian statistics…twice. It is fantastic! The best summary of the major differences I have heard/read. I do have a question I hope you can help me with.

When speaking about subjective Bayesians you mentioned that you had seen – in writing – cases where they have made up wild probabilities (you were using the die example and that they might say the probability of getting a six is “95%”). I am curious if you have any such examples at hand. I would love to read a piece or two where this was done as I would like to understand the rationale behind such an “absurd” choice (if there is one). This request is simply out of curiosity.

I plan to listen to more of your podcasts during my next long bike ride. Definitely wish I had found your work before I started teaching analytics. Fantastic stuff!

Best,
Miha

I was an associate editor on an American Meteorological Society journal at one time and an author submitted a paper which purported to demonstrate how certain Bayesian methods worked. For one example, this author used a prior which, as they say in the lingo, was hugely informative. The example usually called for a “flat” prior, which I pointed out. The author responded to me that, as priors were subjective, he could use any he wished. This reasoning convinced the chief editor and the odd example was allowed. The paper was eventually published. Only the Lord knows how influential this was to the largely non-statistical readership.

Most professional statisticians wouldn’t make that kind of mistake. But then again, the nature of the source of parameters is rarely explored. Many “priors” aren’t priors at all, since they are “improper”, meaning they are not probabilities. And many so-called empirical Bayes analyses use priors that depend on the same kinds of dicey assumptions and data found in frequentist studies.

The die example I used proves (rather, strongly suggests) that subjective probability is not a correct interpretation of probability. Given “Just 2 out of 3 Martians wear a hat and George is a Martian”, the probability of “George wears a hat” is 2/3. But a subjectivist can say 0.01115%. How can you prove him wrong? Answer: you cannot, not empirically. So the empirical interpretation of probability is also wrong. You can prove the 2/3 is right, however, by use of the statistical syllogism which relies on the more fundamental idea of “symmetry of logical constants”, which, even though it uses the word, has nothing to do with any physical symmetry. I prove, as in prove, this in my book.
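The 2/3 can at least be illustrated by counting, though counting is not the proof mentioned above. The sketch below assumes a population of exactly three Martians (a simplification; the premise only states a proportion) and treats every logically possible arrangement alike, since the premises give no reason to favour one over another. The function name and its parameters are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def prob_george_wears_hat(n_martians: int = 3, n_hats: int = 2) -> Fraction:
    """Count the arrangements consistent with the premises in which George
    is one of the hat wearers."""
    martians = range(n_martians)
    # Every way the hats could be distributed, paired with every Martian
    # George could be. The premises do not distinguish between these cases.
    cases = [
        (hatted, george)
        for hatted in combinations(martians, n_hats)
        for george in martians
    ]
    favourable = sum(1 for hatted, george in cases if george in hatted)
    return Fraction(favourable, len(cases))


print(prob_george_wears_hat())  # 2/3
```

Nothing was observed to get this number. It follows from the premises alone, which is why the subjectivist's 0.01115% cannot be refuted by data but can be refuted by logic.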

My dear Dr Briggs:
If I may, let me take a swing at this question… it mirrors many of my intro stat students’ queries…
“Moreover, your argument (at least as concerns the banana example) does not seem to imply that statistics cannot in principle determine cause, but because of the preponderance of possible causes it cannot distinguish between these.”
The best statistical analysis can do is to point out that there exists some difference between what we expect to see, and what we actually see.
The reasons behind that difference, or dissonance, if you will, are beyond any statistical analysis…
Can the thrower influence what faces land up?
Why are the vast majority of those in jail men?
Why exactly is it that most often the server I deal with in a Chinese restaurant is Chinese?
These are questions beyond the statistical trade.
The best we can hope for is to determine that things are not the way we expected them to be…
whether this discrepancy is caused by unknown reasons (often, incorrectly ascribed to “randomness”), or because of some hypothesized causal relationship simply can’t be determined by statistical analysis alone.
It’s really that simple.
In frequentist analysis the Alternative hypothesis is never accepted…the Null is either rejected, or we fail to reject the Null.
People who ascribe causality on the basis of “wee p-scores” do so at their, and our, own risk.

2. Briggs says:

Thanks,

To take just one out of many terrific examples: “Why are the vast majority of those in jail men?” Notice how nobody (that I have heard of) does statistical tests to discover if there are more men than women as they do similar tests between the sexes for other activities?

That’s because everybody (excepting perhaps modern PC academics) knows the causes of the difference. No need for statistics! Of course, we can always have a go at predicting future populations based on some model.

You are too kind Dr Briggs.
The text of the question from a concerned reader that you posted bothered me however…If someone was actually enrolled in an actuarial sciences program (like the one at Temple University, for example) I would think that she would have been able to answer the rather naive question presented to you, no?
It’s scary that an actuary in training wouldn’t already know what the deal is….
But if you want to pull whatever hair you have growing where you want it to grow out by its roots…just look into the application of statistical analysis practiced by the legal community…that is truly scary….

4. No method mathematical or scientific can ‘discover cause’. This is called the Problem of Induction. So the claim is true, but trivially so. Scientific and mathematical methods can only produce EVIDENCE FOR a claim. The evidence will, of course, vary in quality, subject to the actual claim.

Dressing up trivial claims and making them seem clever seems harmless enough. Certainly less harmful than the public repeatedly confusing (typically dubious) quality evidence for deductive proof. On the other hand, Dr Briggs argues there is no such thing as the Problem of Induction, which happens to contradict his main argument. But since he simply asserts this, rather than explain it, I suspect he has not thought much about the contradiction. At this stage cue a haughty rejoinder by Dr Briggs suggesting he is too clever to bother to explain himself…

Will:
“Scientific and mathematical methods can only produce EVIDENCE FOR a claim. The evidence will, of course, vary in quality, subject to the actual claim.”
Look, the issue is what inferences can properly be made from, in frequentist analysis, the rejection of the Null hypothesis… my point is that all one can say is something seems “different” from that which we thought…no more, no less…the jump to concluding that a causal relationship exists at all between anything is only as good as the rationale behind that conclusion.

Yes of course. But repeating the mantra ‘correlation is not causation’ dressed up to sound deeper than it is, is not really saying anything except the bleeding obvious.

yeah, so is bleating about “the problem of induction”, no?
sometimes, the “problem of induction” is easy to resolve….
a gun is fired, the recoil from the projectile being thrown forward by the explosion of gunpowder is experienced by the person who pulled the trigger, and the results are seen…a fatal wound….
complaining that correlation doesn’t equate with causation is tiresome, but necessary.

I mean really, Will…post hoc ergo proctor hoc is a fallacy…except when it’s true.

9. James says:

From the post:

Understanding of these comes from induction (in its various types).

From Will:

On the other hand, Dr Briggs argues there is no such thing as the Problem of Induction, which happens to contradict his main argument.

Color me confused. I’ve been reading this site for maybe a year now, so if I missed something a long time ago, fine.

Briggs has also stated that the discussion on induction is in his book. I wouldn’t expect that to be published on the site in advance of that. Why would most people buy his book if it was all on the site?

It’s also odd to say that Briggs is beating a dead horse wrt induction and CinC when we consistently see example after example of those fundamental principles being ignored. The horse is still way too alive. Taleb wrote a whole book on the problem of induction (not being a turkey, as he said) and it still has its share of critics!

10. Smoking Frog says:

David Eisenstadt – You mean, “post hoc ergo propter hoc,” not “post hoc ergo proctor hoc.” That’s a pretty funny error.

You seem to be taking a very narrow view of statistics in order to make your case. The late G E P Box often noted examples of how statistical methods served as ‘catalysts for discovery and invention’ in industrial applications. Simple monitoring charts for processes can do the trick, as can designed experiments, to expose causes and clarify their effects.

12. Briggs says:

You have proven that I have taken the correct view. Think about it. “Monitoring charts”. What are these? Why did you choose to plot these particular things?

As far as experimental control goes, of course science can use these in the process of discovering cause. But because of underdetermination, they are never (as we know from QM) proofs of cause. Knowledge of cause, therefore, is something beyond the empirical observations. But this is true of much of our knowledge, like mathematical “axioms” and so forth.

I have much more detail in the book. A whole chapter.

“David Eisenstadt – You mean, “post hoc ergo propter hoc,” not “post hoc ergo proctor hoc.” That’s a pretty funny error.”

yeah…spelling errors are the point here frog.
😉
you have a problem with the gist of my comment?
I thought not.
Of course having a dialogue with some internet puppet named “smoking frog” is funny in its own way as well.

I look forward to your new book, which I expect will be a splendid one, well worth reading. But I would like to pester you a little more about cause discovery.

An industrial process monitoring chart typically takes some readily measured quantity, let us say the diameter of a nail head, and maintains a time series of observations of it, along with upper and lower limits reflecting historical variation. As and when a sufficient departure from these limits occurs, the idea is then to mount an investigation into possible causes of said departure. This combination of an ‘informative event’ and a ‘perceptive observer’, to use more of Box’s terminology, can lead to the cause being discovered. For example, a new employee failing to follow standard procedures somewhere upstream in the process. Mundane, perhaps, but cause nevertheless.
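The mechanics of such a chart can be sketched in a few lines. This is only an illustration under stated assumptions: the limits are set at the historical mean plus or minus `k` sample standard deviations with `k = 3`, and the nail-head diameters are invented numbers, not real measurements.

```python
from statistics import mean, stdev


def control_limits(history: list[float], k: float = 3.0) -> tuple[float, float]:
    """Lower and upper limits from historical variation.
    The choice of mean +/- k standard deviations is an assumption."""
    centre = mean(history)
    spread = stdev(history)
    return centre - k * spread, centre + k * spread


def flag_departures(observations: list[float], lower: float, upper: float) -> list[int]:
    """Indices of observations that fall outside the limits."""
    return [i for i, value in enumerate(observations) if value < lower or value > upper]


# Hypothetical nail-head diameters in millimetres.
history = [5.0, 5.1, 4.9, 5.0, 5.2, 4.8, 5.0, 5.1, 4.9, 5.0]
lower, upper = control_limits(history)

new_readings = [5.0, 5.1, 5.9, 4.9]
print(flag_departures(new_readings, lower, upper))  # [2]
```

The output is an index, nothing more. The chart says that the third reading departs from history; finding out why is left to whoever walks upstream and looks.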

We seem to agree about designed experiments? I agree that cause knowledge has to go beyond empirical observations. I think it requires theory, some notion of mechanism.

But here’s a thing. You claim my one comment ‘proves’ your viewpoint 🙁

15. MattS says:

“This combination of an ‘informative event’ and a ‘perceptive observer’, to use more of Box’s terminology, can lead to the cause being discovered.”

Only very indirectly in that it tells you that there is a cause to be discovered.

16. Bulldust says:

Have you guys seen Lewandowsky’s latest effort? Ignorance as an argument for greater urgency to act on climate change:

“Uncertainty about climate change can, counter-intuitively, produce actionable knowledge and thus should provide an impetus, rather than a hindrance, to addressing climate change, researchers from the University of Bristol’s Cabot Institute argue in a special issue of the Royal Society’s Philosophical Transactions A, published this week.”
http://www.bris.ac.uk/cabot/news/2015/uncertainty-action.html

I think they meant counter-logical. I would posit that we know almost nothing about alien abductions, and hence the call to action could not be more urgent.

They also say:
“…uncertainty provides an impetus to be concerned about climate change because greater uncertainty increases the risks associated with climate change.”

In a simplistic way I presumed that the variables affecting climate would delineate the extent of the risk, not our ignorance of their relationships. But I am a simple man … maybe someone can guide me towards greater wisdom on this issue …

17. Bumble says:

I think it is too simplistic to identify bayesianism with subjectivism about priors. Some bayesians (such as de Finetti) argue for using subjective priors, but most do not. Others (Howson and Urbach) insist that priors must be calibrated against known frequencies, and others (Jon Williamson and Edwin Jaynes) add a further criterion that priors should be constrained to be maximally uncertain given the known data, which is a way of capturing the intuition that the priors should be unbiased and not make assumptions that we have no evidence for.

No, you’re confusing yourself. We don’t need to even fire the weapon to know it will kill you. You can work out the maths, because we already have a good causal theory of physics. That’s not what the Problem of Induction is all about.