Skip to content

Category: Statistics

The general theory, methods, and philosophy of the Science of Guessing What Is.

August 28, 2018 | 4 Comments

Chaos & Outrage In The Church — Guest Post by Ianto Watt

Chaos and outrage have again engulfed the Church. We must talk about this incredibly ugly and painful boil on the Body of Christ.

To lance a boil, you must insert the needle horizontal to the body, piercing through both sides of the mound. Then the needle must be yanked upward, ripping the entire head off. It’s the only way to expose and extricate the core. My dad, a farm boy, did it to me.  It was the only way to deal with it. The pain was excruciating. But the relief was immediate.

If the Church wants relief, we have to correctly identify the ailment. Diagnosis is the key to the cure. There are two possibilities: either moral rot or political nonsense. And the Pope seems to be going for the second one. He’s asking us the basic question of all flim-flam men since the beginning of time: who are you gonna believe, me or your lying eyes? In this case, Archbishop Carlos Viganos is supposedly our lying eyes. Read what he says.

The Church has been trying to treat the boil with any approach other than the needle. Avoidance of pain has been the foremost consideration. Consequently, things are getting different, but certainly not better.

I’d like to pass along a some wisdom I came upon after a recent trip to my favorite store, The Mountain Man, in Manitou Springs, Colorado. There I picked up a book I judged by its cover, Colorado Outdoor Living, by Ernest Wilkinson. (Take a look at the cover and see why.)

Wilkinson was 83 (I think) when he wrote this (in 2008), and he was still guiding camping groups, on foot for a week or more at a time, in the Rockies. His story is as American as it gets. At one time in his life, back in the ’50’s, he was a bear and wolf trapper for the Forestry department in Colorado. His job was to ride a circuit amongst the two dozen or so sheep allotments that held 1,000 head of sheep each. He would visit the two lone shepherds encamped in each allotment and take care of any predators that were making life difficult for these shepherds and their sheep. Most of these simple men he visited were Mexicans.

Wilkinson relates (page 45):

I am reminded of a story concerning language barriers told to me by one of the sheepmen. A teacher was having problems attempting to get a young Spanish youth to understand English usage of words and numbers. The teacher knew the youth was often in sheep camp with his Dad, so she decided to use the names of animals the boy was familiar with. She began an explanation. ‘Okay, you have a hundred head of sheep and one leaves. How many do you have left?’ The boy promptly replied ‘None’.

‘No, no, you don’t understand that only one sheep left.’ The youth looked at the teacher and exclaimed ‘You may understand English words and numbers, but you do not understand sheep. When one leaves, they all leave and you have none left’. The boy knew that if one sheep jumped off in a direction, all the rest would follow.

That simple wisdom, spoken by an immigrant youth to the ‘educated’ Anglo, I believe, is emblematic of where we are at today. The shepherds have fled. Or joined with the wolves in many instances. Where will the sheep go when they are scattered?

I once knew a man named Gary North. You may have heard of him, or at least the many books he has written. He made the remark, in one of those many books, that, (paraphrasing here) our Protestant/Christian/Western society was governed by Three Robes. The first robe was the Professor, who taught us up from down. The second was the Pastor, who taught us right from wrong. The third was the Judge, who ruled on our actions in light of the teaching we had received. I think it is needless to say that in America, the first two of these three have abandoned their robes, and the third is naked under his.

Now Gary’s as rabidly anti-Catholic as a civilized Protestant can get. On many other questions (except money, specifically usury) he and I share a lot of common ground. But those two issues of money and authority there is a Grand Canyon of separation between us. Gary doesn’t see the connection (and reflection) usury has to homosexuality. He’s against the sterility of homosexuality, but not the artificial fecundity of usury.

Gary and I had it out once over the question of who bestows the robes. The three bedrock institutions of the Protestant American Experiment that gave us the men who wore those Three Robes have devolved into utter filth: Harvard, Yale and Princeton. The three seminaries that became the three universities that became our three judges: our legislature, our executive and of course, our Supreme judiciary. And we have meekly followed them like the sheep we are. We can blame it all on them, right?

No. We didn’t get to where we are today by means of one act, or one person. We’ve all played a part. It’s instructive I think to see how we are all woven together in this tapestry. The problem isn’t in the warp and the woof of the fabric. Rather, it’s in the image the tapestry has produced. And the image is Machu Picchu, writ large. (It’s art is often vividly pornographic.)

The Protestant Enlightenment has burned its candle at both ends. In their last flickering light, we have to decide if the Incas and Aztecs (the first American Experiment) were artists or demons. It’s too late to decide if we are better than they were. Because we have now surpassed them in all their bloodthirsty sex-crazed ways. The last institution standing against this spiritual and carnal leprosy is now under a full-scale assault. From within.

The Revolution is not yet complete, Komrade. But only because the one institution, the Catholic Church, that played no part in setting up either the first or second American Experiment is now the last bulwark against the debauched god who goes by the name of Reason. There is hope. There is always hope, if we believe. After all, this same institution destroyed the first American Experiment. Let’s hope it can do it again, before it destroys all of us.

It’s true. It was the Confessional State of Spain, through her imperfect Conquistadors, that stopped the slavery-for-sacrifice of the Aztecs, and the slavery-for-perversion of the Incas. But more importantly, it was the Mother of the Church that freed the people at Guadalupe from their fear of the return of their former Satanic Overlords.

Fear has now returned. A fear that these demons have returned, to claim in North America what they lost in the South. The white man doesn’t know how to handle this fear. Why? Because he doesn’t know the name of the demon. As any good exorcist knows, unless you know the demon’s name, you have no power over him.

What is the name of the demon that grips the Catholic Church? The same one that grips all of America, now that The Three Robes of The Second American Experiment have abandoned us? Simple, Pilgrim. Its name is Sexual Impurity. The same demon that possessed the Incas and the Aztecs in The First American Experiment. And let’s get one thing clear: we’re all part of the problem, even if it is only because of our silence. Read the written testimony of Archbishop Carlo Vigano in his indictment of his fellow Bishops and Cardinals at the highest levels of the Vatican, including the Pope today and tell me he has not lanced the boil by naming this demon. But notice also his sorrow for not having spoken sooner. Mea culpa.

What was the door the demon entered in by? By the Faustian bargain we in the West made: let us physically kill our children, and we will let you spiritually kill the ones who survived their conception and birth. Contraception and abortion amongst the sheep, in exchange for Homosexuality amongst the shepherds.

What then will be the sign of the demon’s departure? That I cannot say, only because we do not know the particular fortuna each of us has consumed in our acceptance of this cursed bargain. Thus, there cannot be a single answer to this question. We each have willingly ingested something, in some fashion, knowingly or not. While we won’t know what physical object to look for upon the demon’s departure, there will be a clear sign that he has departed. We will start having babies again. And we will start to protect them from spiritual evil again. The world will get friendlier to babies. And more hostile to abortionists and homosexuals.

Or is the real problem is Clericalism? That’s exactly what the pushers of The French Revolution accused the Church of when she refused to follow the world in its lust. The revolutionaries said that the people (and thus, the State) gave too much deference to the clerics in their insistence that the sheep not eat of the poisoned plants that grew outside their pastures. Now, in a desperate attempt to again deflect attention from the true root of the problem, these Clerics at the top of this dung heap of homosexuality accuse the people of deferring too much to their poor clerics, who are then tempted beyond their spiritual strength to abuse their power of authority over their seductive sheep. Amazing, eh? We truly are part of the problem. Just not that part.

Let’s call all of this LGBT-ABC-XYZ crap by its real name of Sexual Impurity. And all of the actions taken in that name can be boiled down to one letter, instead of the Qwerty Keyboard that never ends. What is that letter? The letter is A. As in Auto-Sexual. Let’s be honest. The key word is Auto. Self. It’s all about ME! And whatever I do is meant to satisfy only me! That’s reality.

What are we to do? Do we wander off, like the sheep mentioned by the shepherd’s son? Or do we hold our position, awaiting a good shepherd, all the while surrounded by the pack of ravening wolves? If we flee, where are we to go? Who is to lead us on this retreat? What is our destination? Where is the safe sheepfold?

Here is the biggest danger, Komrade. For while we know that we are currently being devoured by a particular pack of pretty-boy wolves, that doesn’t mean that there aren’t other wolves. Wolves who will gladly help you escape the clutches of another wolf.

Yes, there are plenty of people who are willing to offer temporary refugee status to those fleeing from the current Pope and his henchmen. But this refuge will come at a cost. The cost of your faith. You will have to renounce your citizenship in your old country (Rome) but you will never gain rights in your new land. Because it’s not your land. Take a lesson from the Palestinians. You will never find peace. You can only become a wolf. It’s that, or wait to be eaten at a later date as you sit in the larder until your number is called. Be polite, and go without murmur. Remember, political politeness is the watchword of today’s society. If you don’t want to be devoured now by the Press Wolves, never verbalize the thought that Homosexuality is a perversion. Never mention the name of this demon.

So then, Pilgrim, what is going to happen, now that this indictment has been released upon the Homosexual Priesthood? I don’t see how this current Pope can stay. But maybe he can. At least for a while. An excruciating while. It looks like he’s going to try. Hoping his friends in the media will cover for him once again. But they are hungry for blood too, and his blood is as tasty as any. Will they resist?

The boil has been lanced, side through side. But it has not had the lance yanked upward to expose the core. The core, which must be dug out. Or the infection will worsen to gangrene.
And if this Pope and all of his Lavender Wolves are not turned out soon, then the real danger to the flock is this: that the faithful will be only too willing to follow another shepherd who will give them a sense of security, a sense of the traditions, sacraments and doctrine that they remember from their past, before all the sordidness of the current scandal became too much to bear. Too much, even for those who had a hand in the acquiescence to the terminal political politeness. That would be all of us.

The answer then is what it has always been. Stay put. Do as they say, but not as they do. Stay where you are. Stay and beware of other wolves, in Shepherds clothing. Wolves offering sanctuary for those distressed by filth. But wolves, nonetheless. Beware, my friends. Especially of Greeks, bearing gifts.

Mea culpa, mea culpa, mea maxima culpa!

August 24, 2018 | 3 Comments

How To Do Predictive Statistics: Part VII New (Free) Software Tobit Regression

Previous post in the series (or click above on Class). REVIEW!

Download the code: mcmc.pred.R, mcmc.pred.examples.R. If you downloaded before, download again. This is version 0.3! Only the example code changed since last time.

For an explanation of the theory behind all this, which is markedly different than classical views, get this book: Uncertainty.

Let’s return to the despised CGPA data. I know, I know, but it’s a perfect way to demonstrate tobit regression, which is exactly like ordinary regression except the Y can be censored above or below.

We saw with ordinary regression that there was substantial probability leakage in many scenarios because CGPA can only live on 0 to 4. We had large probabilities for CGPAs both greater than 4 and less than 0. Well, why not disallow it, censoring at these levels?

Start by reading the data back in:

x <- read.csv("")

As with multinomial we have to track the model formula, a limitation of the MCMCpack.

# cgpa can only exist between 0 and 4; the ordinary regression
# above exhibits strong probability leakage
form = formula('cgpa~hgpa*sat')
fit = MCMCtobit(form, below=0,above=4,data=x, mcmc=30000)

You can see the places for censoring, and also that I upped the MCMC samples a bit. Just to show how it's done.

Okay, now to scenarios? why? Surely you reviewed! Because we want as always:

     Pr (Y | new X, old X&Y, Model & Assumptions)

where here Y = "CGPA = c", X is the HGPA and SAT scores and the model, including its assumptions about priors and so forth, we already discussed.


I'm not showing the histogram here. You do it.

If you saw an error about truncnorm you will need to install that package with its dependencies, like this:

install.packages('truncnorm', dependencies = TRUE)

Why x[6,] and not x[1,] like we always do? No reason whatsoever. I think I aimed for 1 and hit 6, and was too lazy to change it. Anyway, I got for the quantile function

       5%       50%       95% 
0.2269157 0.8828441 1.6033544 

So there's a 90% chance a NEW person like x[6,] will have a CGPA between 0.23 and 1.6. Your results will differ slightly, as usual. If they differ a lot, increase that mcmc = 30000 in the model fit function.

The 6 person had a SAT of 756 and HGPA of 0.33. What if we wanted probabilities for a fixed scenario? As we've done before

w = x[6,]
w$sat = 1e6

Gives basically a 100% of CGPA = 4. Everybody who thinks this is the wrong answer, raise their your hands. One, two, ... Hmm. You all fail.

This is the right answer. Even before with probability leakage the model gave the right answer. Unless you have a typo in your code, models always give the right answers! That's because all models are conditional on only the information you provide, which includes information on the model structure. Here three is NO information about limits of SAT scores, only limits on CGPA. So we can specify whatever we like for SAT, even a million.

Outside of this model we know it's BS. But that makes our model assumptions wrong, not our model.

Now that that's understood finally and for all time, let's do our trick of looking at all the scenarios in the data.

# Has to be a loop! array, marray, larray, and etc. all convert
# the data.frame to a matrix or a unusable list! 
# R has no utility to step through a data.frame while mainting class
# all the old x
q = matrix(0,nrow(x),3)
for(i in 1:nrow(x)){

We discussed before why the loop is necessary. You reviewed, from the beginning, so I know you already know. So there was no reason for me to mention it here.

I went with a 80% predictive interval and not 90% as above. Why not? 90% is harsh. Most people for many decisions will be happy with 80% (conditional) certainty. Feel free to change to---AS YOU MUST---for your decisions.

Look at the answers:

for(i in 1:nrow(x)){

# or

for(i in 1:nrow(x)){

Same as we did for ordinary regression---which you remember, since you reviewed. So I won't show those here. You do them.

How about scenarios for probabilities of CGPA greater than 3? Same as before:

# all the old x
q = NA
g = 3
for(i in 1:nrow(x)){
  p = MCMCtobit.pred(fit,form,x[i,])
  q[i]= sum(p>=g)/length(p)

# Then

plot(x$sat,q,ylab='Pr(CGPA>g|old data,M)')

# Or

plot(x$hgpa,q,ylab='Pr(CGPA>g|old data,M)')

You can color the dots using hgpa, or put them into decision buckets, or whatever. You can redo the 3D plots, too.

We've reached the end of the pre-packaged MCMpack extensions! Next up, JAGS.

August 20, 2018 | 8 Comments

Does It Mean I’m Psychic?

Years ago I wrote and self-published for fun the book So, You Think You’re Psychic? You can download a free PDF of it on the Books page.

There’s nothing wrong with the book—except that now I would do all the probability tables differently. None of them in the book are wrong. But I wrote the explanations when I was still a p-value believer, as all fledgling statisticians were trained to be. And still are. That has to stop. But that’s a subject for another day.

I also wrote it back in my atheist days, when I was a mistaken believer in materialism. That means the discussions about physical mechanisms of purported psychic phenomenon are incomplete, and in part flawed.

Except for these weaknesses, the rest of the book stands up pretty well. It badly needs updating, of course, and if I find myself with unexpected free time, or lucrative financial incentive, I’ll do so.

For now, let me answer an email I recently received from a person with the wonderful name of Swapnil Kamble.

Hello sir,

I have read your book ‘so you think you are psychic?’. Its a great book. I am not very good at statistics. I had a question.

1) If try guessing numbers from a pool of 1 to 100, using (pseudo)RNG app on my mobile or pc, what is the probability of getting it right 3-4 times in a session of 100 guesses?

2) Does it mean am i psychic?

3) Also if i guess 99 out of 100 times, then am i psychic?

4) in context of guessing the numbers, at what point will i be called psychic, i mean what is the minimum probability that will prove that there is some extrasensory phenomenon involved?

Thank you

1) The probability of guessing a number from 1-100, when all you know is that the number will be 1-100, is 1%, or 0.01. The probability of getting k = 3-4 right in a session of n = 100 is had by a binomial calculation. For k = 3, it’s 0.061 and for k = 4 it’s 0.015. That means getting 3 or 4 right is 0.076.

That’s not so small. Especially if you consider you might repeat the session. The probability of getting 3 or 4 right if you repeat the session just once, for a total of 2 sessions, is 0.15. If you do 3 sessions, the probability is 0.21 that at least one of the 3 sessions you’ll get 3 or 4 right. By the time you repeat it just 10 times, the probability is 0.55, or 55% that at least in one session you’ll get 3 or 4 right. Better than a coin flip!

2) Now 10 sessions isn’t a lot if you consider more than one person around the globe has done them. If just 2 people did 10 sessions, the probability at least one of them sees at least one session with 3 or 4 right is 0.79. For 3 people it’s 0.91, for 4 it’s 0.96%, and for 5 it’s 0.98, or 98%! I certainly sold more than 5 copies of the book, so maybe at least this meany sessions were completed.

You can see it’s really easy for at least one “successful” psychic session to be reported using these criteria as a “success.” Even if people are just guessing—by which I mean not using any psychic powers.

In order to prove psychic ability using these criteria, you’re going to have to do a lot better. Guessing only 3 to 4 in 100 is indicative of very weak powers. What’s stopping you from guessing all 100? Or something in the high 90s?

One answer is that you’re a very weak psychic. Hey, not every ball player hits 400, so this is possible. Now if you can consistently hit 3-4 in every session, then you might be on to something. You must keep careful, careful track, not forgetting any sessions, or partial sessions, and you must not allow yourself any excuses about why a failed session (or partial session) “doesn’t really count.” Ball players don’t get those excuses, and neither do you.

3) The probability if guessing 99 out of 100 is about 1 times 10 to the negative 197. A very, very, exceptionally small number. So, yes, if you can in test conditions, under the watchful eye of people like myself, who can spot mistakes (people often fool themselves with sensory leakage), then I’d say you’d have psychic powers.

4) This is an excellent question. There is no excellent answer. The problem is that no session of the types you are attempting are ever considered in isolation. We have had long experience of people cheating, and amazing reports come under immediate suspicion. That’s why testing under controlled conditions are mandatory.

Then some paranormal powers don’t need probability at all. Like coming back from the dead. Or turning water into wine. Do these things and we’ll know you’ve got something.

August 16, 2018 | 4 Comments

How To Do Predictive Statistics: Part VI New (Free) Software Poisson Regression

Previous post in the series (or click above on Class). REVIEW!

Download the code: mcmc.pred.R, mcmc.pred.examples.R. If you downloaded before, download again. This is version 0.22! Only the example code changed since last time.

For an explanation of the theory behind all this, which is markedly different than classical views, get this book: Uncertainty.

We’re downloading data from another gent’s site, so you have to be on-line for today’s lesson.

x <- read.csv("")
x <- within(x, {
  prog <- factor(prog, levels=1:3, labels=c("General", "Academic", 
  id <- factor(id)

It's fictional data: "the outcome variable [num_awards] and indicates the number of awards earned by students at a high school in a year, math is a continuous predictor variable and represents students' scores on their math final exam, and prog is a categorical predictor variable with three levels indicating the type of program in which the students were enrolled."

Number of awards can be 0, 1, 2, ..., where that "ellipsis" means awards can go up to infinity, which is exactly the hope of Millennials. In reality, of course, awards cap out. How many are possible depends on the situation. About which here we now knowing, since it's all made up anyway.

Poisson distributions allow infinity. Thus they will always be approximations to reality, which does not have infinities. How much an approximation this is in any case, and how loose the approximation, again depends on the situation. Really, an ordered multinomial-type situation is what always pertains. Number of anything falls into progressive buckets from zero to some fixed number.

But here we shall pretend, as everybody does, that the approximation is better than reality. What's the Deadly Sin of Reification among friends?

We already know how to do this. Lucky for us, MCMCpack creators did the work for us as they did with ordinary regression, but not for multinomial regression.

fit = MCMCpoisson(num_awards ~ prog*math,data=x)

We now (and you reviewed so you already know this) move to finding probabilities of scenarios; i.e. of finding

     Pr (Y | new X, old X&Y, Model & Assumptions)

where here Y = "0 awards" or "1 award", etc., X is the program and math score, and the model, including its assumptions about priors and so forth, we already discussed.

As ever, we have built-in scenarios in the old data. Hence

# single new data
p = MCMCpoisson.pred(fit,x[1,])
plot(p,xlab='num_awards', ylab='Pr(num_awards|D,M)')

I won't here show the plot, but it gives a spike at 0, with estimated probability of about 0.87 , a probability of 0.12 for 1 award, about 0.01 for 2 awards, and close-enough-for-most-folks 0 for 3 and 4 awards. This is for an (as typing x[1,] in R shows) a person in a NEW person in a Vocational program and math score of 41.

It is for a NEW person because, as you remember from your reviewing, we do not need probability models to tell us what happened. We just look!

What if we wanted to compare the probability for a person with the same math score but in an Academic vocation? Like this, which is a real pain in the keister:

w = x[12,] # where I hunted for an x that had Academic
w$math = 41
p = MCMCpoisson.pred(fit,w)
plot(p,xlab='num_awards', ylab='Pr(num_awards|D,M)')

I get probability of 0.77 for 0 awards, 0.20 for 1, 0.03 for 2, and near-enough-0 for the rest.

But why can't we do this?

w = x[1,]
w$prog = 'Academic'
p = MCMCpoisson.pred(fit,w)

Because R strips the class of factor from w$prog and turns it into a character. Recall whatever we pass into the prediction functions has to be the same kind of data.frame that went in. The variable prog went in as a factor with 3 levels, and not a character. You might try this:

w = x[1,]
w$prog = 'Academic'
levels(w$prog) = levels(x$prog)

But that turns w$prog to General! I do not know why R strips the class from a factor variable when there is only one row upon substitutions, but it does. For instance, we can do this:

w = x[1:2,]
w$prog[1] = 'Academic'
w = w[-2,]

That works fine: summary(w) shows w$prog is still a three level factor.

So here is the solution:

w = x[1,]
w$prog[1] = 'Academic'
p = MCMCpoisson.pred(fit,w)

That [1] on the second line does the trick! Like I said, pain in the keister.

Okay, let's do our old trick of using all the old data as scenarios. First let's get the probabilities.

# more complicated, because y is open ended; a guess has to be made in 
# number of cols of q; if subscript out of bounds, increase size
q = matrix(0,nrow(x),16)
for(i in 1:nrow(x)){
  a = MCMCpoisson.pred(fit,x[i,])

We do not know the limit of the number of awards, but we have to store them. See that "16"? Try putting 8. It will break because some scenarios generate positive (defined as above machine limit) probability for number of awards larger than 7 (recall 0 awards takes a column in q). Of course, you could always put some huge number in place of 16 and never have a problem. Until you want to make pretty plots. As you see next.

for (i in levels(x$prog)){
  j = which(x$prog == i)
  plot(0:(dim(q)[2]-1),q[1,],xlab='num_awards', ylab='Pr(num_awards|D,M)', main=i,ylim=c(min(q),max(q)), xlim=c(0,8), type='l')
  for(k in 2:nrow(x)){

You can see we limit here the number of awards probabilities to stop at 8 (in the xlim). That obviously has to be changed depending on the application.

Every scenario results in a distribution, probabilities for number of awards from 0 to 8 (max shown). In the scenarios everybody has a program, which are only three, hence three plots. But everybody also has a math score, which obviously varies. That variability does not do too much to change the probability of awards for General or Vocational programs NEW people. But it must for Academic.

The next trick would be to make a plot of only Academic NEW people, where perhaps you color the distributions so that higher math scores are a brighter and brighter red (or whatever color scheme you like).

Math scores goes from 33 to 75. Another fun thing would be to put these scores into discrete decision buckets (which you recall from your review we must always do), and then make the same plots for each of these decision buckets. How? Use the cut function. Do maybe

x$math.cut = cut(x$math, breaks = c(0,45, 52, 59, 80))

which chunks it into the observed quartiles, Do this only if these make sensible decision points. Then you need only change the code above to reflect the new plot, e.g.

for (i in levels(x$math.cut)){
j = which(x$math.cut == i)

I did it and it worked fine, and showed something interesting. That's your homework!

Next time tobit regression.