William M. Briggs

Statistician to the Stars!

Category: Statistics

The general theory, methods, and philosophy of the Science of Guessing What Is.

Top 10 Health Scares Of 2014 From The American Council On Science & Health

Ahhhh! It’s azodicarbonamide!

The American Council On Science & Health, longtime exposer of shoddy science, has released their 2014 list of the top scares. Below is a short summary and my comments.

10 Subway’s use of azodicarbonamide in their bread.

Some blogger named Vani Hari—and this is the good news: bloggers can be influential—a.k.a. The Food Babe, took against Subway, even though nobody was forcing her to eat there. She freaked out over the chemical (boo!) the chain uses in their bread.

“Even more ridiculous is Hari’s guidance that ‘if you can’t spell it or pronounce it, you probably shouldn’t eat it.’ It is unclear when spelling became a parameter for toxicological evaluation.” Relying on her own advice, Hari would starve in China. How do you spell “duck tongue” (yum!)?

Anyway, mice fed diets of azodicarbonamide in bulk developed cancer at higher rates than mice not so cruelly treated, which is all the evidence activists (and the ever-present “State of California”) need to conclude it must cause cancer in humans. It’s Science!

9 “Formaldehyde found in baby shampoo”.

Johnson & Johnson put Quaternium-15—which sounds like the MacGuffin from a science fiction movie—in shampoo, where it breaks down into formaldehyde and keeps the shampoo from going bad.

But what about the children!

All of us are exposed to formaldehyde on a regular basis. It exists in a wide variety of food, especially fruits and vegetables, as well as ubiquitous products such as plywood, insulation, carpeting, and cosmetics.

And even if we were able to accomplish the impossible—remove all external exposure to formaldehyde—we would still be exposed to it, since it is made in the human body.

Even children’s bodies.

8 Gluten! Gluten! Gluten! This post is certified 100% gluten free.

The dietary fads of rich Americans never fail to provide comedy.

7 Pesticides “linked to” Autism.

Got to love “linked to”. It has no definite meaning and given the low bar of “statistical significance”, almost anything can be “linked to” anything else. Prediction: once every child is routinely checked for autism, it will cease increasing. Think about it.

6 “Hydraulic fracturing (Fracking) pollutes drinking water”

It also causes bad breath, acne, an increase in UFO sightings, and a tendency to vote for Democrats. Increasing UFO sightings? Sure. The two series are statistically correlated. What more proof do you need?

The real concern is that one of the fracking fissures caused by a greedy oil company will spread to the earth’s core, cracking it. We could literally split apart!

If that fear gains any traction, I want credit for thinking it up.

5 “Liquid nicotine used in e-cigarettes poisoning children”.

But what about the children!

Well, you can’t blame anti-smoking zealots. There is no “second hand” smoke with e-cigs; no, nor any “third hand” smoke, neither. Activists are so habituated to banning that they can’t stop themselves. And, dammit, those smokers look to be enjoying themselves. Thus the e-cigs must be dangerous. Dangerous to whom? How about the children? Yes!

But what about the children!

4 “Cancer epidemic from medical scans”.

See, this is the exact reason metal leaf (thickness less than 0.2 mm) sheet-rolled aluminium was invented. It blocks all sorts of electromagnetic cancer-causing rays, even those rays which come from outer space! Indeed, it is particularly apt prophylactically for rays which impinge on one’s cranium at direct vertical angles. Medical experts recommend fashioning a snug-fitting cranial-protection garment to avoid scan-induced cancer.

3 “GMOs not safe for use in foods.”

Frankenfoods. Of course, everything we eat has been genetically modified by farmers over centuries of careful breeding and care of stock. So the only way to avoid ingesting any GMO food is to head to the hills and stalk wild caribou using only an organic knife (particles from non-organic knives might cause cancer). Best way to lure these outsize deer is to pretend to be blades of juicy grass. Wear lots of green and brown. Go in November.

2 “Thimerosal in Vaccines poses threat to public health, says RFK Jr.”

I think most of us know that after the words “says RFK Jr.” appear, there is no need to say anything more.

1 “Prenatal exposure to phthalates linked to lower IQs in children”.

Boy, most of us can’t pronounce or spell that word, so it must be bad (see 10). Rich American parents have every right to expect their children to be above average, IQ-wise. Why, if it weren’t for phthalates, azodicarbonamides, and fracking, it is near to certain we’d have a nation of geniuses.

Improper Language About Priors

A Christmas distribution of posteriors.

Suppose you decided (almost surely by some ad hoc rule) that the uncertainty in some thing (call it y) is best quantified by a normal distribution with central parameter θ and spread 1. Never mind how any of this comes about. What is the value of θ? Nobody knows.

Before we go further, the proper answer to that question almost always should be: why should I care? After all, our stated goal was to understand the uncertainty in y, not θ. Besides, θ can never be observed; but y can. How much effort should we spend on something which is beside the point?

If you answered “oodles”, you might consider statistics as a profession. If you thought “some” was right, stick around.

The way it works is that data are gathered (old y) and then used to say things, not about new y, but about θ. It turns out Bayes’s theorem requires an initial guess of the values of θ. The guess is called “a prior” (distribution): the language used to describe it is the main subject today.

Some insist that the prior express “total ignorance”. What can that mean? I have a proposition (call it Q) about which I tell you nothing (other than it’s a proposition!). What is the probability Q is true? Well, given your total ignorance, there is none. You can’t, consistent with the evidence, say to yourself anything like, “Q has got to be contingent, therefore the probability Q is true is greater than 0 and less than 1.” Who said Q had to be contingent? You are in a state of “total ignorance” about Q: no probability exists.

The same is not true, and cannot be true, of θ. Our evidence positively tells us that “θ is a central parameter for a normal distribution.” There is a load of rich information in that proposition. We know lots about “normals”; how they give 0 probability to any observable, how they give non-zero probability to any interval on the real line, that θ expresses the central point and must be finite, and so on. It is thus impossible—as in impossible—for us to claim ignorance.

This makes another oft-heard phrase “non-informative prior” odd. I believe it originated from nervous once-frequentist recent converts to Bayesian theory. Frequentists hated (and still hate) the idea that priors could influence the outcome of an analysis (themselves forgetting nearly the whole of frequentist theory is ad hoc) and fresh Bayesians were anxious to show that priors weren’t especially important. Indeed, it can even be proved that in the face of rich and abundant information, the importance of the prior fades to nothing.
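That fading can be illustrated with a toy example not in the post. A minimal sketch, assuming a standard conjugate Beta-Binomial setup with invented numbers: two wildly different priors give essentially the same posterior once the data pile up.

```python
# Sketch (hypothetical example): with abundant data, the prior's
# influence fades. Beta(a, b) prior + Binomial data gives a
# Beta(a + heads, b + tosses - heads) posterior by conjugacy.
def posterior_mean(a, b, heads, tosses):
    # mean of the Beta(a + heads, b + tosses - heads) posterior
    return (a + heads) / (a + b + tosses)

# Invented data: a process yielding 70% "heads" at growing sample sizes.
for n in (10, 1_000, 100_000):
    heads = int(0.7 * n)
    flat = posterior_mean(1, 1, heads, n)      # flat Beta(1, 1) prior
    strong = posterior_mean(50, 50, heads, n)  # strong Beta(50, 50) prior
    print(n, round(flat, 3), round(strong, 3), round(abs(flat - strong), 3))
```

At n = 10 the two priors disagree noticeably; by n = 100,000 the gap is negligible.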

Information, alas, isn’t always abundant, thus the prior can matter. And why shouldn’t it? More on that question in a moment. But because some think the prior should matter as little as possible, it is often suggested that the prior on θ should be “uniform”. That means that, just like the normal itself, the probability θ takes any value is zero, the probability of any interval is non-zero; it also means that all intervals of the same length have the same probability.

But this doesn’t work. Actually, that’s a gross understatement. It fails spectacularly. The uniform prior on θ is no longer a probability, proved easily by taking the integral of the density (which equals 1) over the real line, which turns out to be infinite. That kind of maneuver sends out what philosopher David Stove called “distress signals.” Those who want uniform priors are aware that they are injecting non-probability into a probability problem, but still want to retain “non-informativity”, so they call the result an “improper prior”. “Prior” makes it sound like it’s a probability, but “improper” acknowledges it isn’t. (Those who use improper priors justify them by saying that the resultant posteriors are often, but not always, “proper” probabilities. Interestingly, “improper” priors in standard regression give identical results, though of course interpreted differently, to classical frequentism.)
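The parenthetical can be sketched with the normal model already on the table. Under a flat “prior” on θ (not a probability: it integrates to infinity) and y distributed Normal(θ, 1), the posterior works out to Normal(ȳ, 1/n), which is a proper probability and matches the frequentist sampling answer, reinterpreted. The numbers below are invented for illustration.

```python
# Sketch: a flat improper "prior" on theta is not itself a probability,
# yet the posterior it yields here is proper. For y_i ~ Normal(theta, 1)
# with prior proportional to 1, posterior is Normal(ybar, 1/n).
import math

def flat_prior_posterior(ys):
    # returns (posterior mean, posterior standard deviation)
    n = len(ys)
    ybar = sum(ys) / n
    return ybar, 1 / math.sqrt(n)

# Invented measurements:
mean, sd = flat_prior_posterior([9.8, 10.1, 10.3, 9.9, 10.4])
print(mean, sd)  # posterior centered at the sample mean, sd = 1/sqrt(5)
```

The posterior mean is exactly the frequentist point estimate, and the posterior spread matches the estimate’s sampling spread: identical numbers, different interpretation.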

Why shouldn’t the prior be allowed to inform our uncertainty in θ (and eventually in y)? The only answer I can see is the one I already gave: residual frequentist guilt. It seems obvious that whatever definite, positive information we have about θ should be used, the results following naturally.

What definite information do we have? Well, some of that has been given. But all that ignores whatever evidence we have about the problem at hand. Why are we using normal distributions in the first place? If we’re using past y to inform about θ, that means we know something about the measurement process. Shouldn’t information like that be included? Yes.

Suppose the unit in which we’re measuring y is inches. Then suppose you have to communicate your findings to a colleague in France, a country which strangely prefers centimeters. Turns out that if you assumed, like the normal, θ was infinitely precise (i.e. continuous), the two answers—inches or centimeters—would give different probabilities to different intervals (suitably back-transformed). How can it be that merely changing units of measurement changes probabilities! Well, that’s a good question. It’s usually answered with a blizzard of mathematics (example), none of which allays the fears of Bayesian critics.

The problem is that we have ignored information. The yardstick we used is not infinitely precise, but has, like any measuring device anywhere, limitations. The best—as in best—that we can do is to measure y from some finite set. Suppose this is to the nearest 1/16 of an inch. That means we cannot differentiate between 0″ and anything less than 1/16″; it further means that we have some upper and lower limit. However we measure, the only possible results will fall into some finite set in any problem. Suppose this is 0″, 1/16″, 2/16″,…, 192/16″ (one foot; the exact units or set constituents do not matter, only that they exist does).

Well, 0″ = 0 cm, and 1/16″ = 0.15875 cm, and so on. Thus if the information was that any of the set were possible (in our next measurement of y), the probability of (say) 111/16″ is exactly the same as the probability of 17.6213 cm (we’ll always have to limit the number of digits in any number; thus 1/3 might in practice equal 0.333333 where the 3’s eventually end). And so on.

It turns out that if you take full account of the information, the units of measurement won’t matter! Notice also that the “prior” in this case was deduced from the available evidence; there was nothing ad hoc or “non-informative” about it at all (of course, other premises are possible leading to other deductions).
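The unit-invariance claim is easy to check mechanically. A minimal sketch, using the finite 1/16″ set above with a toy uniform assignment over it (the assignment is mine, for illustration): converting each element to centimeters is a one-to-one relabeling, so every reading keeps its probability.

```python
# Sketch: with a finite measurement set, relabeling units cannot change
# probabilities. The 193 possible readings map one-to-one between inches
# and centimeters, so each element keeps the same probability.
from fractions import Fraction

inches = [Fraction(k, 16) for k in range(193)]   # 0" to 12" by 1/16"
cm = [x * Fraction(254, 100) for x in inches]    # exact: 1" = 2.54 cm

# Toy assignment: one equal probability per element, however labeled.
p = Fraction(1, len(inches))
prob_inches = {x: p for x in inches}
prob_cm = {y: p for y in cm}

# The reading 111/16" and its centimeter relabeling agree exactly.
assert prob_inches[Fraction(111, 16)] == prob_cm[Fraction(111, 16) * Fraction(254, 100)]
print(float(Fraction(111, 16) * Fraction(254, 100)))  # 17.62125 cm
```

Exact rationals (no rounding) make the point cleanly: the probability attaches to the element of the set, not to the label we paste on it.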

But then, with this information, we’re not really dealing with normal distributions. No parameters either: there is no θ in this setup. Ah. Is that so bad? We’ve given up the mathematical convenience continuity brings, but our reward is accuracy—and we never wander away from probability. We can still quantify the uncertainty in future (not yet seen) values of y given the old observations and knowledge of the measurement process, albeit at the price of a more complicated formula (which seems more complicated than it really is, at least because fewer people have worked on problems like these).

And we don’t really have to give up on continuity as an approximation. Here’s how it should work. First solve the problem at hand—quantifying the uncertainty in new (unseen) values of y given old ones and all the other premises available. I mean, calculate that exact answer. It will have some mathematical form, part of which will be dependent on the size or nature of the measurement process. Then let the number of elements in our measurement set grow “large”, i.e. take that formula to the limit (as recommended by, inter alia, Jaynes). Useful approximations will result. It will even be true that in some cases, the old stand-by, continuous-from-the-start answers will be rediscovered.

Best of all, we’ll have no distracting talk of “priors” and (parameter) “posteriors”. And we wouldn’t have to pretend continuous distributions (like the normal) are probabilities.

I Also Declare The Bayesian vs. Frequentist Debate Over For Data Scientists

LSMFT! What’s the probability Santa prefers Luckies?

I stole the title, adding the word “also”, from an article by Rafael Irizarry at Simply Stats (tweeted by Diego Kuonen).

First, brush clearing. Data scientists. Sounds like galloping bureaucratic title inflation has struck again, no? Skip it.

Irizarry says, “If there is something Roger, Jeff and I agree on is that this debate is not constructive. As Rob Kass suggests it’s time to move on to pragmatism.” (Roger Peng and Jeff Leek co-run the blog; Rob Kass is a big name in statistics. Top men all.)

Pragmatism is a failed philosophy; as such, it cannot be relied on for anything. It says “use whatever works”, which has a nice sound to it (unlike “data scientist”), until you realize you’ve merely pushed the problem back one level. What does works mean?

No, really. However you form an answer will be philosophical at base. So we cannot escape having to have a philosophy of probability after all. There has to be some definite definition of works, thus also of probability, else the results we provide have no meaning.

Irizarry:

Applied statisticians help answer questions with data. How should I design a roulette so my casino makes $? Does this fertilizer increase crop yield?…[skipping many good questions]… To do this we use a variety of techniques that have been successfully applied in the past and that we have mathematically shown to have desirable properties. Some of these tools are frequentist, some of them are Bayesian, some could be argued to be both, and some don’t even use probability. The Casino will do just fine with frequentist statistics, while the baseball team might want to apply a Bayesian approach to avoid overpaying for players that have simply been lucky.

Suppose a frequentist provides an answer to a casino. How does the casino interpret it? They must interpret it somehow. That means having a philosophy of probability. Same thing with the baseball team. Now this philosophy can be flawed, as many are, but it can be flawed in such a way that not much harm is done. That’s why it seems frequentism does not produce much harm for casinos and why the same is true for Bayesian approaches in player pay scales.

It’s even why approaches which “don’t even use probability” might not cause much harm. Incidentally, I’m guessing that by “don’t use probability” Irizarry means some mathematical algorithm that spits out answers to given inputs, a guess I base on his use of “mathematically…desirable properties”. But this is to mistake mathematics for probability. Probability is not math.

There exists a branch of mathematics called probability (really measure theory) which is treated like any other branch; theorems proved, papers written, etc. But it isn’t really probability. The math only becomes probability when it is applied to questions. At that point an interpretation, i.e. a philosophy, is needed. And it’s just as well to get the right one.

Why is frequentism the wrong interpretation? Because to say we can’t know any probability until the trump of doom sounds—a point in time which is theoretically infinitely far away—is silly. Why is Bayes the wrong interpretation? Well, it isn’t; not completely. The subjective version is.

Frequency can and should inform probability. Given the evidence, or premises, “In this box are six green interocitors and four red ones. One interocitor will be pulled from the box” the probability of “A green interocitor will be pulled” is 6/10. Even though there are no such things as interocitors. Hence no real relative frequencies.

Subjectivity is dangerous in probability. A subjective Bayesian could, relying on the theory, say, “I ate a bad burrito. The probability of pulling a green interocitor is 97.121151%”. How could you prove him wrong?

Answer: you cannot. Not if subjectivism is right. You cannot say his guess doesn’t “work”, because why? Because there are no interocitors. You can never do an “experiment.” Ah, but why would you want to? Experiments only work with observables, which are the backbone of science. But who said probability only had to be used in science? Well, many people do say it, at least by implication. That’s wrong, though.

The mistake is not only to improperly conflate mathematics with probability, but to confuse probability models with reality. We need be especially wary of the popular fallacy of assuming the parameters of probability models are reality (hence the endless consternation over “priors”). Although one should, as Irizarry insists, be flexible with the method one uses, we should always strive to get the right interpretation.

What’s the name of this correct way? Well, it doesn’t really have one. Logic, I suppose, à la Laplace, Keynes, Jaynes, Stove, etc. I’ve used this in the past, but come to think it’s limiting. Maybe the best name is probability as argument.

Podcast: Peer Review, Bob & Ray Do Statistics, Academic Calls For Killing Of (Post-Birth) Babies

 


Show Notes

Wired’s PubPeer article. PubPeer.com itself. PubPeer’s discussion of “Macroscopic Observability of Spinorial Sign Changes under 2π Rotations”.

Bob and Ray can be found at, inter alia, the Internet Archive, which is also where you can find today’s snippet.

It’s getting to be that the worst person you can see, if you’re worried about your health, is a “doctor.” These fellows are now, and increasingly, killing people who come to them—especially in post-Christian Europe, in places like The Netherlands, Belgium, and Luxembourg. But here, too, in Oregon, Washington, and Vermont.

And if people like Udo Schuklenk have their way, it’s going to get worse.

The abstract of Schuklenk’s Journal of Thoracic and Cardiovascular Surgery article.

You can read about Heterotaxy Syndrome here.

The lecture from which Schuklenk’s clips were gathered.

Happy Birthday Frank Sinatra, and Merry Christmas.

Bonus! The podcast is also at YouTube (YouTube also says the video is blocked in Germany because of the 30 second Sinatra clip). I’ll work on restoring my iTunes feed. Maybe. Download the MP3.

Update It is this level of podcasting perfection to which your host aspires.


© 2014 William M. Briggs
