October 4, 2018 | 18 Comments

Insanity & Doom Update LVIII — Special Midweek Doom

There is more Doom than we can shake a holy-water-doused crucifix at. Maybe we should devote both Saturdays and Wednesdays to Doom.

Item How three scholars gulled academic journals to publish hoax papers on ‘grievance studies.’

Such hoaxes are unethical, and The Wall Street Journal doesn’t condone them.

Good grief, what effeminacy. But of course we all remember the effect of the Sokal hoax. That’s right: besides a few giggles, none at all.

Item The Senate Should Not Confirm Kavanaugh: Signed, 1,000+ Law Professors (and Counting)

The left is now screaming like sissies that Kavanaugh reacted angrily after being hit. This is more effeminacy. I pray you see that. Anyway, that “professors” are against him is, these days, a sign he must be a good man.

Item California, working hard, discovers a new way to destroy culture, industry.

“CA Gov. Jerry Brown signs bill requiring corporate boards to include women, saying despite flaws in measure “recent events in Washington DC– and beyond– make it crystal clear that many are not getting the message” on gender equality.”

It’s one small step from mandating the testosterone-deficient be given board memberships to mandating, say, trannies join the club.

Item Britain’s most senior female bishop says Church should stop calling God ‘he’ because it can put young people off religion (Thanks to reader Mavis Emberson for the tip.)

Britain’s most senior female bishop said the church should avoid referring to God only as ‘he’ after a survey found young Christians assume God is male…

The Bishop of Gloucester, the Rt Rev Rachel Treweek, the Church’s first female diocesan bishop, told The Telegraph: ‘I don’t want young girls or young boys to hear us constantly refer to God as he,’ adding that it was important to be ‘mindful of our language’.

It is not the first time Rt Rev Rachel Treweek has made these claims having said: ‘God is not to be seen as male. God is god,’ in the past.

In 2015 Rev Treweek refused the title of ‘right reverend father’. She has now claimed that non-Christians could feel a sense of alienation from the church if the image of God is painted as solely male, and public announcements are made in only male language to describe God.

Of course, after re-re-re-…-re-acknowledging God is neither man nor woman, a fact which nobody denies, we’ll have to toss out all references to God the Father, and forget the painful fact that Jesus was a non-female. The real question is: who could have foreseen that putting a woman in charge of religion would result in her calling for the purging of male metaphors?

Item Mormon blogger says men are ‘100% responsible for unwanted pregnancies’ in powerful Twitter thread (Thanks to reader Kunzipjn for the tip.)

A blogger from Oakland, California has become a viral sensation after sharing her views on abortion on Twitter, in which she argues that “all unwanted pregnancies are caused by the irresponsible ejaculations of men”…

Blair points out that biologically, men cannot impregnate women without experiencing an orgasm and therefore concludes that “getting a woman pregnant is a pleasurable act for men.”

Yes, we have long passed the point at which banalities and idiocies are passing for wisdom. But this is real bottom-of-the-barrel stupidity. A foolish woman discovers that sex can lead to pregnancy, is shocked, and conveys her shock to Twitter, whose readers also convey shock.

We began, as a species, with the full knowledge that sex was for procreation. We eventually came to the idea that sex was for pleasure and procreation an inconvenient side effect, one that could be cured by harmful drugs or by killing the side effect. But this foolish woman, and the explosion of myriad “orientations”, signal we are reaching the point where pregnancy is becoming a mystery. “I have no idea how I got pregnant,” said Thot One. “Did you have sex?” asked Thot Two. “Yeah,” replied Thot One. “But what’s that got to do with it?”

Item Pedophile’s Decapitated Corpse Found On Judge’s Doorstep After Bail Hearing In Aurora, Illinois

William Smith, 28, from Aurora, Illinois was discovered in the early hours of Tuesday morning, decapitated and slumped against the front door of the judge who had granted him bail in August.

Smith was arrested last month following allegations by his then girlfriend that he had raped her 8-year-old daughter.

After a police investigation in which Smith was found in possession of child pornography, he was arrested on two counts related to child pornography and one count of child molestation.

After being charged, Smith walked free from the court after the judge controversially ruled that he did not pose a threat to the local community, and he raised the $30,000 bail required to trigger his freedom…

Aurora police say they are currently “following leads” but have yet to make any arrests for the murder.

Note carefully that this was the perp’s corpse and not the judge’s. This is a case of the government not being seen to do justice. We wonder how this judge will rule on the next similar case.

Item In attempting to rebut Rusty Reno’s dismissal of his book, Jonah Goldberg manages to invoke Hitler. He did wait a few paragraphs, though.

Item Eight of Iran’s women’s football team ‘are men’

Eight of Iran’s women’s football team are actually men awaiting sex change operations, it has been claimed.

It is impossible to change your sex. And all the men are ugly.

October 3, 2018 | 1 Comment

How To Do Predictive Statistics: Part IX Stan — Logistic & Beta Regression

Review!

We’re doing logistic and beta regression this time. These aren’t far apart, because the observable for both lives between 0 and 1; for logistic it is 0 or 1; for beta, any fraction or ratio—but not probability—that is on (0,1) works. We don’t model probability; we use probability to model.

That brings up another blunt point. In these demonstrations I do not care much about the models themselves. I’m skipping over all the nice adjustments, tweaks, careful considerations, and other in-depth details about modeling. Most of that stuff is certainly of use, if tainted by the false belief, shared by both frequentists and Bayesians, that probability exists.

I am not here to teach you how to create the best model for this or that kind of observable. I am not here to teach you best coding practices. I am here to teach you the philosophy of uncertainty. Everything takes an economy class nonrefundable ticket seat to that. Because that’s true, it’s more than likely missed code shortcuts and model crudities will be readily apparent to experts in these areas. You’re welcome to put corrective tips in the comments.

On with the show!

Logistic

Let’s use, as we did before, the MASS package dataset birthwt.


library(MCMCpack)
library(rstanarm)

x=MASS::birthwt # we always store in x for downstream ease
x$race = as.factor(x$race)

fit.m =  MCMClogit(low ~ smoke + age + race + ptl + ht + ftv, data=x)
fit.s = stan_glm (low ~ smoke + age + race + ptl + ht + ftv, family=binomial, data=x)

Last time we didn’t put all those other measures in; this time we do. Notice that we specify the data in both methods: rstanarm needs this, and it’s good practice anyway.
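The loop below calls the MCMClogit.pred helper from that MCMClogit lesson. If it is not still in memory, here is a minimal sketch of one way to write it (my reconstruction; the version from the earlier lesson may differ in detail):

# A sketch only: average the inverse-logit of the linear predictor over the
# posterior coefficient draws for one scenario row
MCMClogit.pred = function(fit, scenario){
  # design row matching the fitted formula; subsetting a data frame keeps the factor levels
  X = model.matrix(~ smoke + age + race + ptl + ht + ftv, data = scenario)
  mean(plogis(fit %*% t(X)))  # fit is the matrix of posterior coefficient draws
}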



# This is a copy-paste from the MCMClogit lesson; only changing to p.m
p.m = NA
for(i in 1:nrow(x)){
  p.m[i] = MCMClogit.pred(fit.m,x[i,])
}

plot(x$age,p.m,col=x$race,ylab='Pr(low wt|old data,M)', pch=x$smoke+1,main='MCMC')
grid()
legend('topright',c('r1','r2','r3','s0','s1'), col=c(1,2,3,1,1), pch = c(3,3,3,1,2), bty='n')

Save that plot. Recall that the three races and two smoking states are given different colors and plotting characters. There is more to each scenario than just these measures, as the model statements show. But this is a start.



# This is new
p.s = posterior_predict(fit.s)
plot(x$age,colMeans(p.s),col=x$race,ylab='Pr(low wt|old data,M)', pch=x$smoke+1,main='Stan')
grid()
legend('topright',c('r1','r2','r3','s0','s1'), col=c(1,2,3,1,1), pch = c(3,3,3,1,2), bty='n')


Notice in the plot we have to do colMeans(p.s) to get the probability estimates—this is the tweak I mentioned last time. That’s because p.s contains nothing but 189 columns (the same as the original data length) of 0s and 1s. Remember these are predictions! We take the average of the predictions, at each scenario, to get the probability estimate.
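
If you want to see what posterior_predict hands back (exact sizes depend on your run):

dim(p.s)            # roughly 4000 draws by 189 scenarios (columns)
p.s[1:5, 1:5]       # raw 0/1 predictions
colMeans(p.s)[1:5]  # per-scenario probability estimates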

Stare and see the differences in the plots. We can’t put them all on one so easily here. While there are some differences, my guess is that no one would make any different decision based on them.

For homework, you can use rstanarm to check for measure relevancy and importance. Recalling, as you do so, that these are conditional concepts! As all probability is.

What’s that? You don’t remember how to do that? You did review, right? Sigh. Here’s one way of checking the measure ptl, the number of previous premature labours.



# The commented-out line we already ran; it's in memory
#fit.s = stan_glm (low ~ smoke + age + race + ptl + ht + ftv, family=binomial, data=x)
fit.s.2 = stan_glm (low ~ smoke + age + race +  ht + ftv, family=binomial, data=x)

p.s = posterior_predict(fit.s)
p.s.2 = posterior_predict(fit.s.2)

a.1 = colMeans(p.s)
a.2 = colMeans(p.s.2)

plot(a.1,a.2, xlab='Full model', ylab='Sans ptl', main='Pr(low wt|old data, Ms)')
  abline(0,1)

Obviously ptl is relevant in the face of all these other measures. Would it be if we excluded others? Or with new observations? You check. That’s an order: you check. Would you say, as a decision maker interested in predicting low birth weight, that the probabilities change enough that different decisions would be made using the different models? If so, then ptl is important, and should be kept in this model; i.e. in this form of the model with all the other measures in, too. If not, then it should be chucked.
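
Here, for instance, is one way to compare the two models at a brand new scenario rather than at the old data (the scenario values are invented for illustration):

y = data.frame(smoke=1, age=25, race=factor('2', levels=levels(x$race)),
               ptl=1, ht=0, ftv=0)                 # invented scenario
mean(posterior_predict(fit.s, newdata=y))          # Pr(low wt|new scenario, full model)
mean(posterior_predict(fit.s.2, newdata=y))        # Pr(low wt|new scenario, sans ptl)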

There is no longer any such thing as hypothesis testing, or model building without reference to the decisions to be made using the model. Beyond raw relevancy, which is trivial to see, it is impossible for model building to be independent of decision making.

Beta regression

We can’t do comparisons here, because only rstanarm has this kind of model. It’s for observables living on (0,1), things like ratios, fractions, and the like. The idea (which you can look up elsewhere) is that uncertainty in the observable y is characterized with a beta distribution. These have two parameters—normal distributions do, too. Unlike with normals, here we can model linear functions of both (suitably transformed) parameters. This is done with normals, too, in things like GARCH time series. But it’s not usual for regular regressions.

For beta regression, one or both parameters is transformed (by logging or identity, usually), and this is equated to a linear function of measures. The linear function does not have to be the same for both transformed parameters.

In the end, although there are all kinds of considerations about the kinds of transforms and linear functions, we are really interested in predictions of the observable, and its uncertainty. Meaning, I’m going to use the defaults on all these models, and will leave you to investigate how to make changes. Why? Because no matter what changes you make to the parameterizations, the predictions about observables remain the same. And that is really all that counts. We are predictivists!
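
To fix ideas, here is a toy simulation of that setup, with made-up coefficients, assuming the usual logit link for the mean and log link for the precision (which I believe are the defaults):

set.seed(42)
temp.sim = seq(200, 450, 10)
mu  = plogis(-6 + 0.02*temp.sim)   # mean of the observable, on (0,1); coefficients made up
phi = exp(1 + 0.005*temp.sim)      # precision; also made up
y.sim = rbeta(length(temp.sim), shape1 = mu*phi, shape2 = (1 - mu)*phi)
plot(temp.sim, y.sim)              # a fraction-valued observable, like yield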

We last time loaded the betareg package. We’re not going to use it except to steal some of its data (the rstanarm package doesn’t have any suitable).

library(betareg)
data('GasolineYield', package='betareg')
x = GasolineYield

attr(x$batch, "contrasts") <- NULL # odd line

?GasolineYield # to investigate the data

We're interested in yield: "proportion of crude oil converted to gasoline after distillation and fractionation" and its uncertainty relating to temperature and experimental batch. There are other measures, and you can play with these on your own.

The "odd line" removes the "contrasts" put there by the data collectors, and which are used to create contrasts in the parameters; say, checking whether the parameter for batch 2 was different than for batch 10, or whatever. We never care about these. If we want to know the difference in uncertainty in the observable for different batches, we just look. Be cautious in using canned examples, because these kinds of things hide. I didn't notice it at first and was flummoxed by some screwy results in some generated scenarios. People aren't coding these things with predictions in mind.


fit = stan_betareg(yield ~ batch + temp | temp, data=x)
p = predictive_interval(fit)


The first transformed parameter is a linear function of batch and temp. The second—everything to the right of "|"—is a linear function of temperature. This was added because, the example makers say, the lone parameter model wasn't adequate.

How do we check adequacy? Right: only one way. Checking the model's predictions against new observables never before used in any way. Do we have those in this case? No, we do not. So can we adequately check this model? No, we cannot. Then how can we know the model we design will work well in practice? We do not know.

And does everything just said apply to every model everywhere for all time? Even by those models released by experts who are loaded to their uvulas with grant money? Yes: yes, it does. Is vast, vast, inconceivable over-certainty produced when people just fit models and release notes on the fit as if these notes (parameter estimates etc.) are adequate for judging model goodness? Yes, it is. Then why do people do this?

Because it is easy and hardly anybody knows better.

With those true, sobering, and probably to-be-ignored important words, let's continue.


plot(x$temp,p[,2],type='n',ylim=c(0,.6),xlab='Temperature',ylab='Yield')
for(i in 1:nrow(p)){
   lines(c(x$temp[i],x$temp[i]),c(p[i,1],p[i,2]))
   text(x$temp[i],mean(c(p[i,1],p[i,2])), as.character(x$batch[i]))
}
grid()


This picture heads up today's post.

We went right for the (90%) predictive intervals, because these are easy to see, and plotted each batch at each temperature. It depends on the batch, but it looks like, as temperature increases, we have some confidence (and I do not mean this word in its frequentist sense) that yield increases.

Let's do our own scenario, batch 1 at increasing temperatures.


y = data.frame(batch="1",temp=seq(200,450,20))
p.y = predictive_interval(fit,newdata=y)
plot(y$temp,p.y[,2],type='n',ylim=c(0,.6),xlab='Temperature',ylab='Yield')
for(i in 1:nrow(p.y)){
   lines(c(y$temp[i],y$temp[i]),c(p.y[i,1],p.y[i,2]))
}


This is where I noticed the screwiness. With the contrasts I wasn't getting results that matched the original data, when, of course, if I make up a scenario that is identical to the original data, the predictions should be the same. This took me a good hour to track down, because I failed to even (at first) think about contrasts. Nobody bats a thousand.

Let's do our own contrast. Would a decision maker do anything different regarding batches 7 and 8?


y = data.frame(batch="7",temp=seq(200,450,20))
  p.y.7 = predictive_interval(fit,newdata=y)
y = data.frame(batch="8",temp=seq(200,450,20))
  p.y.8 = predictive_interval(fit,newdata=y)

plot(p.y.7[,1],p.y.8[,1],type='b')
  lines(p.y.7[,2],p.y.8[,2],type='b',col=2)
  abline(0,1)


This only checks the 90% interval. If the decision maker has different important points (say, yields greater than 0.4, or whatever), we'd use those. Different decision makers would do different things. A good model to one decision maker can be a lousy one to a second!
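
For instance, if yields greater than 0.4 were the decision point (a number I just made up), one could work from the raw predictive draws rather than the canned interval:

d.7 = posterior_predict(fit, newdata = data.frame(batch="7", temp=seq(200,450,20)))
d.8 = posterior_predict(fit, newdata = data.frame(batch="8", temp=seq(200,450,20)))
colMeans(d.7 > 0.4)   # Pr(yield > 0.4 | temp, batch 7, old data, M)
colMeans(d.8 > 0.4)   # same, batch 8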

Keep repeating these things to yourself.

Batch 7 gives slightly higher upper bounds on the yields. How much higher? Mixing code and output:


> p.y.7/p.y.8
         5%      95%
1  1.131608 1.083384
2  1.034032 1.067870
3  1.062169 1.054514
4  1.129055 1.101601
5  1.081880 1.034637
6  1.062632 1.061596
7  1.068189 1.065227
8  1.063752 1.048760
9  1.052784 1.048021
10 1.036181 1.033333
11 1.054342 1.028127
12 1.026944 1.042885
13 1.062791 1.037957


Say 3%-8% higher. Is that difference enough to make a difference? Not to me. But what the heck do I know about yields like this? Answer: not much. I am a statistician and am incompetent to answer the question—as is each statistician who attempts to answer it with, God help us, a p-value.

At this point I'd ask the client: keep these batches separate? If he says "Nah; I need 20% or more of a difference to make a difference", then relevel the batch measure:


levels(x$batch) = c('1','2','3','4','5','6','7-8','7-8','9','10')

Then rerun the model and check everything again.
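
The recheck might look something like this (fit.2 and the scenario are just illustrative names):

fit.2 = stan_betareg(yield ~ batch + temp | temp, data=x)
y = data.frame(batch="7-8", temp=seq(200,450,20))
predictive_interval(fit.2, newdata=y)   # compare against p.y.7 and p.y.8 above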

October 2, 2018 | 9 Comments

Poor Richard Carrier Goes On The Offensive

Six years ago I wrote a small, daily piece (Bayes Theorem Proves Jesus Existed And Didn’t Exist) about the use of probability in proving or disproving the existence of God. I’m against it.

I began with a quote by Keynes on Bayes’s Theorem:

No other formula in the alchemy of logic has exerted more astonishing powers. For it has established the existence of God from the premiss of total ignorance; and it has measured with numerical precision the probability the sun will rise to-morrow.

I next said:

Probability carries with it “a smack of astrology, of alchemy.” Comte, Keynes reminds us, regarded the application of the mathematical calculus of probability as “purement chimérique et, par conséquent, tout à fait vicieuse” (purely chimerical and, consequently, entirely vicious).

Now these are minds better than mine giving manful advice about over-relying on probability. Heed them. Indeed, as anybody who has regularly read this blog knows, probability is misused with shocking abandon. We all know how probability, mainly through statistical models, is used to “prove anything” — a phrase I trust is recognized for what it is, a figure of speech and not a complete logical treatise on probability misuse (although I’ve done that, too!).

I said:

…there is not one, but two books which argue that a fixed, firm number may be put on the proposition God Exists. The first by Stephen Unwin is called The Probability of God: A Simple Calculation That Proves the Ultimate Truth, in which he uses Bayes’s theorem to demonstrate, with probability one minus epsilon, (the Christian) God exists.

This is countered by Proving History: Bayes’s Theorem and the Quest for the Historical Jesus by the very concerned Richard Carrier (pictured above), who uses Bayes’s theorem to prove, with probability one minus epsilon, that the Christian God does not exist because Jesus himself never did.

There we have it: probability proving two diametrically opposite conclusions. Alchemy indeed.

It’s obvious the “equations” “probability one minus epsilon” are mild jokes, probabilistic figures of speech, and not meant as rigorous mathematical proofs that Unwin and Carrier came to these exact precise figures. It’s equally obvious we have two people using probability to argue both sides of a question.

It wasn’t obvious to Richard Carrier. He felt stung at the time of the post, which was later reprinted, and went then into a sort of minor frenzy. In one comment at the time he said:

…Bayes’ Theorem is simply the mathematical model for the arguments historians are already making. If they can’t make a probabilistic argument that Jesus existed, then they can’t claim to know Jesus probably existed. And then we’d all have to concede we don’t know Jesus probably existed. Sink the ship of arguing from probabilities, and all probability arguments go down with it. And with that, all human knowledge. Thus, you have to address what I actually argue, not pretend it’s some sort of advanced significance testing like in the sciences. It’s just an argument that something probably happened in history. And as such is as valid as any other argument that something probably happened in history. Unless no such arguments are valid!

By pointing out improper uses of probability, I have managed to sink All human knowledge! What powers I have! (I always knew I was special.)

Apparently Carrier doesn’t understand that Bayes’s theorem isn’t really needed, that any argument using it stands or falls based on its inputs. And that the inputs are the only important things worth discussing. Carrier’s inputs are on the order of the Bigfoot conspiracy theories. For Carrier, don’t forget, claims Jesus never existed. Not just that Jesus wasn’t God, but that the man himself did not exist.

Anyway, Carrier spun himself around in circles lo those many years ago. And I forgot about the post, which after all only made a small point.

Carrier didn’t forget. Evidently, the wound I caused festered and never healed. Carrier, we presume, retreated to some dark corner to cherish the injury, only to reemerge two weeks ago with an extraordinarily long piece—found by reader swordfishtrombone—in which he produces multiple points of evidence proving I’m a “liar”.

His title is “Why Christians Are Terrified of Probability Theory.”

Yeah, sure, Carrier. I’m quivering.

Here are the first two “lies” of which he says I’m guilty:

The title says Briggs is talking about examples of Bayes’ Theorem being used to prove “Jesus Existed (And That He Didn’t).” But he gives no example anywhere in his piece of Bayes’ Theorem ever being used to prove Jesus existed! Lie number one.

In fact, Briggs gives no actual example of Bayes’ Theorem being used to prove Jesus didn’t exist, either. He cites only my book Proving History. In which I never argue any conclusion about the historicity of Jesus. Much less mathematically. Lie number two.

Good grief!

Another:

Yet Briggs claims Unwin “uses Bayes’s theorem to demonstrate, with probability one minus epsilon, that the Christian God exists.” Lie number three.

Sigh.

Number four is my favorite:

Moreover, Unwin’s conclusion is that the probability of God’s existence based on his examination of the evidence is only 67%. Yet Briggs claims Unwin got the result of “probability one minus epsilon,” epsilon being a mathematician’s term for a very small number (in fact, usually infinitesimally small). In other words, Briggs lied. He said Unwin found the probability to be arbitrarily close to 100%. In fact, Unwin found it was far more ambiguously around 67%. Strange lie for Briggs to tell. But alas. Lie number four.

You can read the others, which are equally or more frivolous.

The real problem might be that Carrier was embarrassed by other articles I wrote showing his errors. And so he took his frustration out on a toss-away article by using lawyer-like insinuations about niggling details that nobody cares about.

How about this article? “Richard Carrier’s Argument To Show God’s Existence Unlikely Is Invalid And Unsound“.

Richard Carrier’s argument to show that God probably didn’t create the universe, and therefore He probably doesn’t exist, in Carrier’s “Neither Life nor the Universe Appears Intelligently Designed”, like many attempts to use probability in defense of atheism or theism, is invalid and unsound, and based on fundamental misunderstandings of who God is and of the proper role of probability.

Lower down:

Carrier introduces Bayes’s probability theorem, but only as a club to frighten his enemies and not as a legitimate tool to understand uncertainty. I must be right, he seems to insist, because look at these equations. Bayes’s theorem is a simple means to update the probability of a hypothesis when considering new information. If the information comes all at once, the theorem isn’t especially needed, because there is no updating to be done. Nowhere does Carrier actually need Bayes and, anyway, probabilistic arguments are never as convincing as definitive proof, which is what we seek when asking whether God exists.

Even lower down is a section on Carrier’s many probability errors.

He repeats many of these errors in his new Pity-Me-Richard-Carrier article:

Taking probability theory seriously, entails exposing assumptions to the light of day, that once exposed, destroy the Christian faith. The resulting cognitive dissonance is so powerful only two options are available to the believer: make shit up (like Unwin and Swinburne, they fabricate fantastical probabilities that have no plausible basis in logic or reality) or declare probability itself the enemy. Briggs picks option B. Meanwhile, all peer reviewed work on the question finds the opposite: that history is in fact Bayesian.

History is not Bayesian, Carrier. And my work has been peer reviewed, too, which makes it true and indisputable.

Your work, Carrier, has also been peer reviewed. And your peers say harsh things. Which is why Carrier also doesn’t like it when I link to atheist Tim O’Neill’s “History for Atheists” site, which has many articles proving—as in proving—Carrier’s many historical mistakes.

Bear with me for one last quote, as it involves Bayes again.

Two years ago Carrier brought out what he felt was going to be a game-changer in the fringe side-issue debate about whether a historical Jesus existed at all. His book, On the Historicity of Jesus: Why We Might Have Reason for Doubt (Sheffield-Phoenix, 2014), was the first peer-reviewed (well, kind of) monograph that argued against a historical Jesus in about a century and Carrier’s New Atheist fans expected it to have a shattering impact on the field. It didn’t. Apart from some detailed debunking of his dubious use of Bayes’ Theorem to try to assess historical claims, the book has gone unnoticed and basically sunk without trace. It has been cited by no-one and has so far attracted just one lonely academic review, which is actually a feeble puff piece by the fawning minion mentioned above. The book is a total clunker.

The “detailed debunking” is not mine, but Tim Hendrix’s “Richard Carrier’s ‘On the historicity of Jesus’ A Review From a Bayesian Perspective“. Fifty-seven pages of solid debunking.

“Tim Hendrix” is a pseudonym. “Hendrix” was evidently worried he or his family would be hounded by Carrier’s “fawning minions” (O’Neill’s phrase). And perhaps be called “a liar” by Carrier himself.

You needn’t have worried, “Tim”. I’ve had mosquito bites that hurt worse than Carrier’s insults.

A decade of the rosary has been said for you, Richard. Miracles do happen!

October 1, 2018 | 8 Comments

Cornell Professor Exposes His Wee P-values One Too Many Times: All P-values Are P-Hacking

The over-production of wee p-values led to the downfall of Cornell Professor Brian Wansink, who is being made to retire (with what we can guess is a comfortable “package”).

We met Wansink before, in the context of cheating with statistics. According to Vox:

His studies, cited more than 20,000 times, are about how our environment shapes how we think about food, and what we end up consuming. He’s one of the reasons Big Food companies started offering smaller snack packaging, in 100 calorie portions. He once led the USDA committee on dietary guidelines and influenced public policy. He helped Google and the US Army implement programs to encourage healthy eating.

Ah, the love of theory. Science did so well with the simple things, like explaining (in part) gravity at the largest scales; why can’t it do as well explaining small things, like what’s best to eat? Surely we can’t go by the wisdom of ages, since that’s anecdote and not blessed “randomized controlled” experiment.

Never mind all that.

Thirteen of Wansink’s studies have now been retracted, including the six pulled from JAMA Wednesday. Among them: studies suggesting people who grocery shop hungry buy more calories; that preordering lunch can help you choose healthier food; and that serving people out of large bowls encourages them to serve themselves larger portions…

There was also Wansink’s famous “bottomless bowls” study, which concluded that people will mindlessly guzzle down soup as long as their bowls are automatically refilled, and his “bad popcorn” study, which demonstrated that we’ll gobble up stale and unpalatable food when it’s presented to us in huge quantities.

Why these were even subjects of “research” is, I think, the more important question. But such is the grip of scientism that it probably won’t even strike you as odd that we laid aside the knowledge of gluttony in search of quantifying the unquantifiable.

I’m happy, however, to note that Vox sees parts of the problem:

Among the biggest problems in science that the Wansink debacle exemplifies is the “publish or perish” mentality.

To be more competitive for grants, scientists have to publish their research in respected scientific journals. For their work to be accepted by these journals, they need positive (i.e., statistically significant) results.

That puts pressure on labs like Wansink’s to do what’s known as p-hacking. The “p” stands for p-values, a measure of statistical significance. Typically, researchers hope their results yield a p-value of less than .05 — the cutoff beyond which they can call their results significant.

There is no non-fallacious use of p-values. But that doesn’t mean that they can’t lead to true results. Here is a faulty syllogism: “Michael Moore is a fine fellow. Fine fellows support toxic feminism. Therefore, 1 + 1 = 2.” Although it’s a joke, p-values are like this in form. The conclusion in this argument is true, but not for the reasons cited.

That’s the best case. The usual one is like this: “I need a novel finding to boost my career, and here is some data. Here is a wee p-value. Therefore, men really aren’t taller than women on average.” The conclusion is absurd and goes against common sense, but since it has been “scientifically proved”, and it’s anyway politically desirable, it is believed.

When I am emperor I will ban all p-values, and I will also disallow the publication of more than one paper biannually. (These are among my lighter measures. Wait until you hear what I intend to do with Resurrection deniers.)

I won’t go through Vox’s incorrect explanation of p-values. You can read all about them here, or soon in a paper that has been accepted (peer reviewed, and therefore flawless) “Everything Wrong With P-values Under One Roof.” Also see the articles here.

I want instead to disabuse Vox of their explanation of Bayesian statistics:

While p-values ask, “How rare are these numbers?” a Bayesian approach asks, “What’s the probability my hypothesis is the best explanation for the results we’ve found?”

Not really. What Bayesians ask instead is “What’s the posterior probability this parameter in my ad hoc model does not equal zero?” You can call a non-zero parameter in an ad hoc model a hypothesis if you like—it’s a free country—but a non-zero parameter is such a narrow small thing that it’s undeserving of such a noble title.

That’s the problem with (most) Bayesian statistics. It still hasn’t learned to speak of reality. For that, see this class.
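
To make the contrast concrete in code (my own toy illustration, reusing the birthwt logistic model from the Stan post above, not anything Vox ran):

library(rstanarm)
x = MASS::birthwt; x$race = as.factor(x$race)
fit = stan_glm(low ~ smoke + age + race + ptl + ht + ftv, family=binomial, data=x)

# The parameter question: how much posterior probability sits on the smoke
# coefficient being positive? (A continuous posterior puts zero on exactly zero.)
mean(as.matrix(fit)[, "smoke"] > 0)

# A question about reality: how much does the probability of low birth weight
# change between otherwise-identical non-smoking and smoking scenarios?
y = data.frame(smoke=c(0,1), age=25, race=factor('2', levels=levels(x$race)),
               ptl=0, ht=0, ftv=0)
colMeans(posterior_predict(fit, newdata=y))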

Thanks to Sheri, Al Perrella, and some anonymous readers for the tip.