I Also Declare The Bayesian vs. Frequentist Debate Over For Data Scientists

LSMFT! What's the probability Santa prefers Luckies?

I stole the title, adding the word “also”, from an article by Rafael Irizarry at Simply Stats (tweeted by Diego Kuonen).

First, brush clearing. Data scientists. Sounds like galloping bureaucratic title inflation has struck again, no? Skip it.

Irizarry says, “If there is something Roger, Jeff and I agree on is that this debate is not constructive. As Rob Kass suggests it’s time to move on to pragmatism.” (Roger Peng and Jeff Leek co-run the blog; Rob Kass is a named person in statistics. Top men all.)

Pragmatism is a failed philosophy; as such, it cannot be relied on for anything. It says “use whatever works”, which has a nice sound to it (unlike “data scientist”), until you realize you’ve merely pushed the problem back one level. What does “works” mean?

No, really. However you form an answer, it will be philosophical at base. So we cannot escape having to have a philosophy of probability after all. There has to be some definite definition of “works”, thus also of probability, else the results we provide have no meaning.


“Applied statisticians help answer questions with data. How should I design a roulette so my casino makes $? Does this fertilizer increase crop yield?…[skipping many good questions]… To do this we use a variety of techniques that have been successfully applied in the past and that we have mathematically shown to have desirable properties. Some of these tools are frequentist, some of them are Bayesian, some could be argued to be both, and some don’t even use probability. The Casino will do just fine with frequentist statistics, while the baseball team might want to apply a Bayesian approach to avoid overpaying for players that have simply been lucky.”

Suppose a frequentist provides an answer to a casino. How does the casino interpret it? They must interpret it somehow. That means having a philosophy of probability. Same thing with the baseball team. Now this philosophy can be flawed, as many are, but it can be flawed in such a way that not much harm is done. That’s why it seems frequentism does not produce much harm for casinos and why the same is true for Bayesian approaches in player pay scales.

It’s even why approaches which “don’t even use probability” might not cause much harm. Incidentally, I’m guessing by “don’t use probability” Irizarry means some mathematical algorithm that spits out answers to given inputs, a comment I based on his use of “mathematically…desirable properties”. But this is to mistake mathematics for or as probability. Probability is not math.

There exists a branch of mathematics called probability (really measure theory) which is treated like any other branch; theorems proved, papers written, etc. But it isn’t really probability. The math only becomes probability when it’s applied to questions. At that point an interpretation, i.e. a philosophy, is needed. And it’s just as well to get the right one.

Why is frequentism the wrong interpretation? Because to say we can’t know any probability until the trump of doom sounds—a point in time which is theoretically infinitely far away—is silly. Why is Bayes the wrong interpretation? Well, it isn’t; not completely. The subjective version is.

Frequency can and should inform probability. Given the evidence, or premises, “In this box are six green interocitors and four red ones. One interocitor will be pulled from the box” the probability of “A green interocitor will be pulled” is 6/10. Even though there are no such things as interocitors. Hence no real relative frequencies.
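The 6/10 can be written out as a small calculation. What follows is only a sketch: the `probability` helper and the `box` list are names made up for illustration, and it assumes the premises give no reason to favor any one interocitor over another, so each listed case counts equally. Nothing is sampled or observed; the answer is deduced from the premises alone.

```python
from fractions import Fraction

# Premises: the box holds six green and four red interocitors; one is pulled.
# (Hypothetical encoding: one list entry per interocitor named in the premises.)
box = ["green"] * 6 + ["red"] * 4


def probability(proposition, premises):
    """Share of the cases listed in the premises for which the proposition holds.

    Assumes every listed case counts equally, since the premises
    say nothing that distinguishes one interocitor from another.
    """
    favourable = sum(1 for item in premises if proposition(item))
    return Fraction(favourable, len(premises))


p_green = probability(lambda colour: colour == "green", box)
print(p_green)  # 3/5, i.e. 6/10
```

Change the premises (say, add a red interocitor to `box`) and the probability changes with them, which is the sense in which the number belongs to the argument and not to any physical box.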

Subjectivity is dangerous in probability. A subjective Bayesian could, relying on the theory, say, “I ate a bad burrito. The probability of pulling a green interocitor is 97.121151%”. How could you prove him wrong?

Answer: you cannot. Not if subjectivism is right. You cannot say his guess doesn’t “work”, because why? Because there are no interocitors. You can never do an “experiment.” Ah, but why would you want to? Experiments only work with observables, which are the backbone of science. But who said probability only had to be used in science? Well, many people do say it, at least by implication. That’s wrong, though.

The mistake is not only to improperly conflate mathematics with probability, but to confuse probability models with reality. We need be especially wary of the popular fallacy of assuming the parameters of probability models are reality (hence the endless consternation over “priors”). Although one should, as Irizarry insists, be flexible with the method one uses, we should always strive to get the right interpretation.

What’s the name of this correct way? Well, it doesn’t really have one. Logic, I suppose, à la Laplace, Keynes, Jaynes, Stove, etc. I’ve used this in the past, but have come to think it’s limiting. Maybe the best name is probability as argument.


  1. All,

    In case somebody objects to the interocitor-non-science example, don’t forget that in Logic we accept non-observables all the time. Our familiar Lewis Carroll example:

    “All cats are creatures understanding French,” said Alice’s father. “And some chickens are cats.”

    “Wait, I know!” said Alice, chirruping. “That means that some chickens are creatures understanding French.”

    “What you said is true, my dear,” said Alice’s father, his voice full of pride.

  2. Very good discussion. I think Prof. Joe Blitzstein proposes a nice adage that probability is essentially ‘the logic of uncertainty’.

    There’s plenty behind this statement. The maths are a language, whatever the domain. So the translation of some ‘thing’ using that language requires an interpretive philosophy. Unfortunately, various ‘philosophies’ exist, but are not terribly transparent in their grounding (I’m being nice here). This is a pervasive problem.

  3. In my experience, the Data Scientist is simply the guy that uses Excel (or maybe even SAS!) to generate models that tell the people in charge what they want to hear.

  4. “There has to be some definite definition of works, thus also of probability, else the results we provide have no meaning. ”

    Or perhaps the logic is the other way around?

    If it has no meaning you cannot have a probability and so there is no definition of what works.

    And random events have by definition, no meaning in themselves. So by themselves they do not have a meaningful probability.

    And when we have complex random events where the distribution changes over time, then not only does the event itself have no meaning, but also wider concepts such as the distribution are meaningless as they keep changing.

    And oh dear! Global temperature seems to be a form of 1/f noise whose variance increases the longer we observe, so the distribution is constantly changing and the central limit theorem does not appear to hold. With such a signal how can you have a meaningful probability function?

    For more see: http://scottishsceptic.co.uk/2014/12/10/introduction-to-1f-climate-noise/

  5. Hey now, don’t be mean to data scientists! They have families too, and pets, and tithe. It is a ridiculous title that I shy away from, but it basically means someone who knows enough math, stats, programming, and databases to be dangerous, and will market themselves. Despite there being a fair number of naughty data scientists (and scientists in general) who hunt for the wee p’s, there is also a movement against that within the ‘data science’ community. A real part of my job is asking other scientists – “so…um…did you test that (model,theory,etc…) on any out of sample data?”

  6. Hey now, don’t be mean to data scientists!

    On the bright side, he hasn’t picked on Data Engineers. Where would the Progressives and Liberals be without them?

  7. “Applied statisticians help answer questions with data.”
    Is that a truism? Are there theoretical statisticians who don’t help answer questions with data?

  8. Briggs,
    My problem with your approach (once I get over your habit of appropriating the language and notation of conditional probability for something that seems to me to be completely different), is the fact that in any practical context the evidence or premises that are available will depend on the speaker, and I don’t see how you distinguish that kind of difference from what others call subjectivity.

    So I am hoping that you intend this post as the first of a series in which you lead us through more of the details of your logical probability theory.

  9. I’m guessing by “don’t use probability” Irizarry means some mathematical algorithm that spits out answers to given inputs, a comment I based on his use of “mathematically…desirable properties”. But this is to mistake mathematics for or as probability. …

    While it is not clear what those techniques are, exploratory data analysis is an example of techniques that don’t involve probability. The mean, a useful summary statistic, has some desirable properties, e.g., the deviations of the data about the mean sum to zero, and it minimizes the sum of squared deviations. Both can be shown mathematically. Just a simple example. So, no mistaking mathematics for or as probability.

    Yes, subjectivity is dangerous, not just in probability. Two Chinese characters (I know you hate the Kennedys!) come to mind. They, in a way, fittingly characterize the potential of subjective Bayesian analysis. Your description of a subjective Bayesian is naive with a hint of straw man.

    When academics say an area/subject/debate is dead or over, it usually means that, for whatever reason, there is nothing new for them to add or to learn, which is implied by the tweet “statisticians develop techniques.”
    I am sure Prof. Irizarry can interpret probability and statistical results, Bayesian or frequentist, just fine.

  10. Alan Cooper,

    I’ve written loads already. Go to the Classic Posts page and navigate to either stats or probability philosophy.

    For starters, all probability is conditional. Etc.
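The two properties of the mean raised in comment 9 are easy to check numerically. This is a minimal sketch with made-up numbers (the `data` values and the `sum_sq` helper are hypothetical, chosen only for illustration); it spot-checks the claims on one data set and does not stand in for the mathematical proof.

```python
# Made-up data, for illustration only.
data = [2.0, 3.0, 5.0, 10.0]

mean = sum(data) / len(data)

# Property 1: deviations about the mean sum to zero.
deviation_sum = sum(x - mean for x in data)


def sum_sq(center):
    """Sum of squared deviations of the data about `center`."""
    return sum((x - center) ** 2 for x in data)


# Property 2: the mean gives a smaller sum of squared deviations
# than nearby alternative centers.
at_mean = sum_sq(mean)
nearby = [sum_sq(mean + shift) for shift in (-1.0, -0.1, 0.1, 1.0)]

print(mean, deviation_sum, at_mean, min(nearby))
```

No probability appears anywhere in the calculation, which is the commenter's point; whether the mean then answers any question of interest is where interpretation comes back in.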
