## William M. Briggs

### Statistician to the Stars!


My mailbag is filling up. Today two questions from readers, both about statistics. Feel free to send yours in on any subject. Tomorrow is a doozy.

Question 1

From reader Michael H. comes this email:

Hi Dr. Briggs,

I’ve been thinking about statistics much more now that I am required to take an applied statistics class for my actuarial certification.

I’ve watched your video on the crisis of evidence a couple of times. My understanding is that statistics cannot determine that something is a cause, but it may be able to say something of how it is a cause, for example the magnitude of the effect it will have given that we know it is a cause. Is this the case?

Moreover, your argument (at least as concerns the banana example) does not seem to imply that statistics cannot in principle determine cause, but because of the preponderance of possible causes it cannot distinguish between these. Would it be possible to determine a cause were the possible causes limited sufficiently, or is this a problem in principle? If in principle, what is the principle?

Thank you.

Probability models only tell us how our uncertainty changes in some proposition given varying assumptions. Therefore, probability models are not causal or deterministic. An example of a deterministic, but not causal, model is y = a + b*x. This says the value of y is determined by the values of a, b, and x. It says nothing directly about cause. Knowledge of cause comes from understanding a thing’s powers, and its nature or essence. These are not matters of probability.
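The point can be seen in miniature. Here is a toy sketch (my numbers for a and b are arbitrary): the equation determines y from x, but it just as readily determines x from y, so nothing in the equation itself singles out a cause.

```python
# Sketch: y = a + b*x is deterministic -- the same inputs always give
# the same output -- but the equation runs both ways, so it is silent
# about which variable, if either, is a cause.
def y_from_x(a, b, x):
    return a + b * x

def x_from_y(a, b, y):
    return (y - a) / b

a, b = 1.0, 2.0
assert y_from_x(a, b, 3.0) == 7.0   # y is "determined" by x...
assert x_from_y(a, b, 7.0) == 3.0   # ...and x is equally "determined" by y
```

Determination, in other words, is a property of the equation; cause is not.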

Statistics, or probability models, in principle cannot determine cause because they remain mute about powers, natures, and essences. Understanding of these comes from induction (in its various types). Probability models aren’t even, except in trivial cases, deterministic. Consider regression. There the equation is a function of the central parameter. That is, the central parameter is said to be a function of various explanatory variables. The central parameter says nothing about any cause, therefore any function of it is silent on cause.
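A toy regression makes this concrete (the values of b0, b1, and sigma below are made up for illustration): the model says our uncertainty in y, given x, is normal with central parameter mu = b0 + b1*x. Changing x changes probabilities and nothing else.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_y_exceeds(c, x, b0=1.0, b1=2.0, sigma=1.0):
    """Probability that y > c, given x, under a normal regression model."""
    mu = b0 + b1 * x   # the central parameter as a function of x
    return 1.0 - normal_cdf((c - mu) / sigma)

# Moving x from 1 to 3 shifts the central parameter from 3 to 7, and so
# shifts the probability of "y > 5" -- but says nothing about cause.
p1 = prob_y_exceeds(5.0, x=1.0)
p2 = prob_y_exceeds(5.0, x=3.0)
assert p1 < 0.5 < p2
```

The explanatory variable feeds the central parameter, the parameter feeds the probability, and cause appears nowhere in the chain.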

There’s lots more to say about this. I have some details in the paper “The Crisis Of Evidence: Why Probability And Statistics Cannot Discover Cause”, and much more in my (forthcoming?) book, which proves all these things.

Question 2

Our second question—and here readers can help—comes from Miha.

My name is Miha [personal information removed]. I also teach “analytics” in our executive program and have done a number of lectures at the business school here as well…

Yesterday I listened to your outstanding podcast on frequentist and Bayesian statistics…twice. It is fantastic! The best summary of the major differences I have heard/read. I do have a question I hope you can help me with.

When speaking about subjective Bayesians you mentioned that you had seen – in writing – cases where they have made up wild probabilities (you were using the die example and that they might say the probability of getting a six is “95%”). I am curious if you have any such examples at hand. I would love to read a piece or two where this was done as I would like to understand the rationale behind such an “absurd” choice (if there is one). This request is simply out of curiosity.

I plan to listen to more of your podcasts during my next long bike ride. Definitely wish I had found your work before I started teaching analytics. Fantastic stuff!

Best,
Miha

I was an associate editor on an American Meteorological Society journal at one time, and an author submitted a paper which purported to demonstrate how certain Bayesian methods worked. For one example, this author used a prior which, as they say in the lingo, was hugely informative. The example usually called for a “flat” prior, which I pointed out. The author responded that, as priors were subjective, he could use any he wished. This reasoning convinced the chief editor and the odd example was allowed. The paper was eventually published. Only the Lord knows how influential this was to the largely non-statistical readership.

Most professional statisticians wouldn’t make that kind of mistake. But then again, the nature of the source of parameters is rarely explored. Many “priors” aren’t priors at all, since they are “improper”, meaning they are not probabilities. And many so-called empirical Bayes analyses use priors that depend on the same kinds of dicey assumptions and data found in frequentist studies.
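To see why an “improper” prior is not a probability, take the usual flat prior on a real-valued parameter:

$$p(\theta) \propto 1, \quad \theta \in \mathbb{R}, \qquad \int_{-\infty}^{\infty} p(\theta)\, d\theta = \infty .$$

No normalizing constant can make that integral equal one, so whatever the flat prior is as a formal device, it is not a statement of probability.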

The die example I used proves—rather, strongly suggests—subjective probability is not a correct interpretation of probability. Given “Just 2 out of 3 Martians wear a hat and George is a Martian”, the probability of “George wears a hat” is 2/3. But a subjectivist can say 0.01115%. How can you prove him wrong? Answer: you cannot, not empirically. So the empirical interpretation of probability is also wrong. You can prove the 2/3 is right, however, by use of the statistical syllogism which relies on the more fundamental idea of “symmetry of logical constants”, which, even though it uses the word, has nothing to do with any physical symmetry. I prove—as in prove—this in my book.
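The statistical syllogism is a matter of counting, not of observation. A tiny enumeration sketch (the Martian names besides George are mine): list every way the evidence could be true and count the share in which George wears the hat.

```python
from itertools import combinations

# Evidence: "exactly 2 of 3 Martians wear a hat; George is a Martian".
# Enumerate all admissible "worlds" consistent with that evidence.
martians = ("George", "Marvin", "Zork")
worlds = [set(hatted) for hatted in combinations(martians, 2)]

# In 2 of the 3 admissible worlds, George is among the hat-wearers.
p = sum("George" in w for w in worlds) / len(worlds)
assert abs(p - 2 / 3) < 1e-12
```

Nothing empirical enters: the 2/3 follows from the evidence alone, by symmetry over which individuals satisfy it.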

Great news! The press and activists are touting a study which claims to have discovered the genes that “make” somebody gay. So get your enwombed baby’s DNA scanned and make that appointment with Planned Parenthood early. There’s sure to be a line out the door as parents rush to eliminate these fabulous “clumps of cells.”

Hey. Why not? Abortion is the law of the land, right? And the law of the land decides right and wrong, yes? And abortion, we are told by heretics, isn’t killing, right? (Right, Sylvain? JH? Hello?) So why not save your potential child a lot of trouble and put it out of its misery as early as possible. Right?

Never mind. I was only kidding. You’re saved from confronting this gut-wrenching progressive dilemma—sexual libertinism vs. unchecked bloodlust—because the study is almost certainly nonsense. Why? Wee p-values were the evidence for the claim.

Wee p-values are the smallest sins here. Worst is the claimed accuracy of a model and the absence of skill. What’s skill? Read on.

Here was one of the initial headlines: “The DNA test ‘that reveals if you’re gay’: Genetic code clue is 70% accurate, claim scientists.” This is silly on its face, because if a man is same-sex attracted he doesn’t stand in need of a chemical test to tell him so. But let it pass, because the idea was to discover biological drivers, which is to say causes, of homosexual desire (and not homosexuality: there is no such thing).

The headline was based on an abstract—a mini-paper of about one page, common in medicine—and a press release, unfortunately common in science. Anyway, some fellow named Tuck Ngun at UCLA did the study. According to the paper:

The study involved 37 pairs of twins in which one brother was homosexual and the other heterosexual, and 10 pairs in which both were homosexual.

Using a computer program called Fuzzy Forest they found that nine small regions of the genetic code played the key role on deciding whether someone is heterosexual or homosexual.

The research looked at a process called ‘methylation’ of the DNA — which has been compared to a switch on the DNA — making it have a stronger or weaker effect.

This process can be triggered by hormonal effects on the growing foetus in the womb.

Fuzzy Forest? An algorithm to do classification of something, here same-sex attraction or not, based on input variables, here the genetic markers. These markers weren’t DNA per se, but epigenetic markers, these methylation sites. Wikipedia, for once, does not let us down: it’s simpler to read their article than have me explain it. It’s not necessary to understand epigenetics to follow the statistics.

And did you notice? Thirty-seven pairs of identical twin brothers, in which one brother did not suffer same-sex attraction and one did. Know what that proves? It proves same-sex attraction can’t be entirely genetically caused. If it were, then you’d find all pairs of identical twins with the same attractions, given we accept everybody tells researchers the truth. This study thus proves that attraction is at least partly environmentally caused (this includes epigenetic changes, which we’d expect should be mostly the same for both twins in the womb). Incidentally, one environmental cause is choice. Skip it.

Then came the day Ngun gave his paper. His audience, we are told by The Atlantic, was not satisfied.

[Ngun] analysed 140,000 regions in the genomes of the twins and looked for methylation marks…He whittled these down to around 6,000 regions of interest, and then built a computer model…

The best model used just five of the methylation marks, and correctly classified the twins 67 percent of the time. “To our knowledge, this is the first example of a biomarker-based predictive model for sexual orientation,” Ngun wrote in his abstract.

Ngun separated the data into a training and a validation set, so the model was built on something like 20 or so pairs, and tested on the other 20 or so. He built “several models” and selected the one which gave the best classification accuracy, the one with just five methylation marks. Did Ngun correct for multiple testing? No, sir, he did not.

That means Ngun is claiming sexual desire is controlled largely, but not entirely, by these five methylation marks. Sound plausible to you? No. It didn’t to other researchers The Atlantic contacted either.

Guy named John Greally from Albert Einstein gets it. He said Ngun “could not resist trying to interpret [his] findings mechanistically”. Of course, nearly everybody who discovers a wee p-value interprets their findings mechanistically, which is to say, they believe statistical models have discovered causes.

Greally wrote (and The Atlantic also quoted): “It’s not personal about [Ngun] or his colleagues, but we can no longer allow poor epigenetics studies to be given credibility if this field is to survive. By ‘poor,’ I mean uninterpretable.”

To which we say Amen.

Skill

Here’s the real criticism. Go back to the quoted description: 37 pairs were discordant (one SSA twin, one not) and 10 pairs were both SSA. That makes 37 + 2 × 10 = 57 SSA men and 37 straight co-twins, 94 men in all. Suppose you invent the naive model “Say everybody is SSA”. What is that model’s accuracy here? It is right 57 times and wrong 37, for 57/94 ≈ 0.61, or 61% accurate, without glancing at a single methylation mark. Ngun’s fancy schmancy Fuzzy Forest managed 67%. Six points over the base rate, on fewer than a hundred men, after picking the best of “several models”. This assumes his training-validation split preserved the mix of pairs, naturally, but you get the idea.

If accuracy is your goal, the naive model gets you most of the way there for free, and the few extra points the fancy model claims could easily be an artifact of selecting the best of several models. The fancy model comes with a just-so story of methylations causing SSA, but what of it?
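Forecasters measure this with a skill score: the fraction of the naive reference model’s error that the fancy model removes. A sketch using the counts as quoted in the study description (37 discordant pairs, 10 concordant-SSA pairs, reported accuracy 67%):

```python
# Individuals implied by the quoted study counts.
ssa = 37 + 2 * 10        # 57 SSA men
non_ssa = 37             # their straight co-twins

acc_naive = ssa / (ssa + non_ssa)   # naive model: "call everybody SSA"
acc_model = 0.67                    # reported classification accuracy

# Skill score: share of the naive model's error the model eliminates.
skill = (acc_model - acc_naive) / (1.0 - acc_naive)
print(round(acc_naive, 3), round(skill, 3))   # → 0.606 0.162
```

A skill of 0.16, on a sample this small and after model selection, is nothing to write a press release about.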

This lack of skill is what I’m always going on about in climate models. Persistence beats the sophisticated models, so why not choose persistence? Models without skill compared to a natural reference model should not be used.

Update Greally discovered Ngun’s answers to some of his critics. From that, we can see wee p-values were the source of decision. But the most interesting attempt at rebuttal was to Greally’s criticism about interpreting results mechanistically. Here’s what Ngun said:

Let’s be real here: no one is going to pay attention unless you talk about implicated genes. It’s all about interpretability. We ultimately want to understand what’s going on in terms of the biology so of course we’re going to talk about any genes that seem related and are interesting.

“Being real” means, to him, juicing the press release with potentially misleading information, because, he implies, who’d be interested in the truth?

Ngun later goes on to answer The Atlantic (Greally didn’t link to this). Ngun said, “[My] approach is used widely in statistical/predictive modeling field. It is not an insidious issue or data manipulation…” This is true. And that’s the problem. The method, as regular readers know, stinks and is guaranteed to produce large quantities of false positives.

More confirmation of wee p-values: “The single test we did was to ask whether the final model we had built was performing better than random guessing. It seemed to be because its p-value was below the nearly universal statistical threshold of 0.05.”

A cup of coffee, about to lure another victim.

The headline blared “Enjoy coffee or a gin and tonic? You could be a psychopath: People with dark personalities prefer bitter foods and drinks“. And then the bullet points:

* Researchers studied the preferred food and drink choices of 1,000 people
* Participants then completed a series of personality questionnaires
* Study found a preference for bitter foods was linked to dark personalities
* People who liked coffee, radishes and tonic water were more likely to exhibit signs of Machiavellianism, psychopathy and narcissism

The peer-reviewed paper is “Individual differences in bitter taste preferences are associated with antisocial personality traits” in the journal Appetite (yes) by Christina Sagioglou and Tobias Greitemeyer. You might recall Sagioglou as authoress of “Bitter taste causes hostility” in Personality and Social Psychology Bulletin and of “Activating Christian religious concepts increases intolerance of ambiguity and judgment certainty” in the Journal of Experimental Social Psychology. Or you might not.

In her new work, Sagioglou and pal “investigated how bitter taste preferences might be associated with antisocial personality traits” in which a group of people “self-reported their taste preferences using two complementary preference measures and answered a number of personality questionnaires assessing Machiavellianism, psychopathy, narcissism, everyday sadism, trait aggression, and the Big Five factors of personality.”

Did she say everyday sadism? Yes, sir, she did.

Anyway, you know what followed. Pseudo-quantification of emotional states, emotional states said to be perfectly captured and understood by questionnaires, and wee p-values confirming this-causes-that. And so next time you grab a cup of joe you run the risk of snatching up the cleaver, too, and hacking the barista to pieces.

I hereby call for a two-year moratorium on all “research” and “science” which in any way uses questionnaires. This is on top of my forbidding all uses of hypothesis testing—to something like worldwide acclaim, at least on the Upper East Side. It’s far past the time to clean out the garbage accumulated by all these “studies.”

It’s so easy to make Science™! All you have to do is pay for some “instrument”, such as the one that assesses “Machiavellianism”, and then invent some new question, such as “How much do you like black coffee?”, on a scale of -17.2 to $e^{\pi}$, and then write a paper which shows how preference for espresso-ground beans causes Machiavellianism.

You can do this endlessly. And after you have done it, you’re not finished! You may well have “proved”, with hypothesis testing, that espresso-ground beans cause Machiavellianism, but then you realize you have said nothing about coarse-ground beans! And even if you have said something about coarse-ground beans, you have been silent on how coffee preference in general affects Machiavellianism in women and minorities.

And so on, as I say, endlessly.

It’s all crap, to coin a word. It’s stinkier than stinky tofu, rottener than Unitarian theology, flimsier than Bill Clinton’s excuses. And it’s pervasive. Question-based science is the very foundation of entire fields, like education, sociology, psychology. Disallow questionnaires and hundreds of journals would dry up.

On a continuous scale of -3.2 to 113-1/3, how would you rate the idea of quantifying the unquantifiable? Or how about something more scientific: on a scale of 1 to 8.3, in units of 1/r where r is a prime number, how flummoxed does the previous question make you? Do you think that a “flummoxedness” of 8 is twice as flummoxed as a “flummoxedness” of 4? And is a “flummoxedness” of 4 twice as flummoxed as a “flummoxedness” of 2? Have we captured all there is to know about “flummoxedness” in this “scientifically validated instrument”? It’s validated, incidentally, because I am a scientist and say it is.

Add to this the silliness of hypothesis testing and its pretended identification of cause, and you have what we have now. Newspapers and researchers telling us that liking gin and tonics is “linked” or “tied” to psychopathy. How depressing! (Measured on the patented Briggs Depression scale—only $495.12 per use—I scored a whopping -4, which was statistically significant, which is the way I knew I was depressed.)

The damage done to clear thinking by pretending batteries of questions adequately quantify emotional states can scarcely be overestimated. It’s far past the time to take these things seriously.

This may be proved in three ways. The first…

See the first post in this series for an explanation and guide of our tour of Summa Contra Gentiles. All posts are under the category SAMT.


Contingency is a tricky business. And not just in theology. Science must deal with questions of contingency, too. And some scientists answer the question wrongly, especially about human free will (which isn’t explicitly discussed here).

Chapter 85: That the divine will does not remove contingency from things, nor impose absolute necessity on them. (alternate translation)

[1] FROM what has been said we may gather that the divine will does not exclude contingency, nor impose absolute necessity on things…

[3] Moreover. God wills the good of the universe more especially than any particular good, according as the likeness of His goodness is more completely found therein. Now the completeness of the universe demands that some things should be contingent, else not all the degrees of being would be contained in the universe. Therefore God wills some things to be contingent.

[4] Again. The good of the universe consists in a certain order, as stated in 11 Metaph.[Aristotle] Now the order of the universe requires that certain causes be changeable; since bodies belong to the perfection of the universe and they move not unless they be moved. Now from a changeable cause contingent effects follow: since the effect cannot have more stable being than the cause. Hence we find that, though the remote cause be necessary, yet if the proximate cause be contingent, the effect is contingent. This is evidenced by what happens with the lower bodies: for they are contingent on account of the contingency of their proximate causes, although their remote causes, which are the heavenly movements, are necessary. Therefore God wills some things to happen contingently.

[5] Further. Necessity by supposition in a cause cannot argue absolute necessity in its effect. Now God wills something in the creature not of absolute necessity, but only of necessity by supposition, as we have proved.[4] Wherefore from the divine will we cannot argue absolute necessity in creatures. Now this alone excludes contingency, since even contingents that are indifferent to either of two alternatives become necessary by supposition: thus it is necessary that Socrates be moved if he runs. Therefore the divine will does not exclude contingency from the things willed.

[6] Hence it does not follow, if God wills a thing, that it happens of necessity, but that this conditional proposition is true and necessary, If God wills a thing, it will be: and yet the consequence is not necessary.

Notes How to separate secondary from primary causation? If God is responsible for creating—here-and-now, at each and every moment, and not (just) in some distant past—the entirety of the physical universe (all that is), and God is the primary cause of everything (remember Chapter 13?), the cause without which nothing else could start, how could there be contingency? Contingency means a thing that has changed or happened that didn’t have to.

Aquinas talks about plain-old necessity, by which he means absolute necessity, and suppositional, which is to say, hypothetical, necessity. It is suppositionally necessary that Socrates is moved but only if he walks, runs, or is pushed. Socrates moving is a conditional or local and not a universal truth. Things which are absolutely necessary are so no matter what, but things that are suppositionally necessary are only true if they are. Which sounds funny and is a longer way to state contingency.
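Paragraph [6] is the classical scope distinction between the necessity of the consequence and the necessity of the consequent. Writing $W$ for “God wills $E$” and $\Box$ for “necessarily”, Aquinas is asserting the first and denying the second:

$$\Box\,(W \rightarrow E) \quad \text{is true and necessary, but} \quad W \rightarrow \Box E \quad \text{does not follow.}$$

The conditional as a whole is necessary; the thing willed need not itself be necessary, which is exactly the room contingency needs.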

Here’s an analogy, and like all analogies it suffers from being absolutely wrong. Air hockey. You hit a plastic puck and there she goes in a contingent direction. But that puck is all the while being held up by a “primary” cause from a deeper level. Your whacking it and its hitting the walls etc. are secondary causes. But those secondary causes are non-starters without the “primary” cause of the puck being held in creation, i.e. in the air. No air blowing and the puck won’t move (unless you strike it hard, which is where the analogy breaks, unless you assume your strength is limited).

And so God, using his Word, is holding everything up like the air on the hockey table, and everything then acts by its powers, which is where contingency arises. God could certainly shuffle the puck along, and might even do so on special occasions (say, miracles), but it would seem He is content to let things act as they will. Hence science is possible. But science does itself a disservice when it fails to recognize the base or true science is God’s will. At some point science must cease and theology must begin.