
Category: Statistics

The general theory, methods, and philosophy of the Science of Guessing What Is.

September 5, 2018 | 7 Comments

That Paper About Hiring Chief Diversity Officers At Universities

The paper is “The Impact of Chief Diversity Officers on Diverse Faculty Hiring” by Steven W. Bradley, James R. Garven, Wilson W. Law, and James E. West. (Thanks to John Cook for the tip.) Here’s the abstract.

As the American college student population has become more diverse, the goal of hiring a more diverse faculty has received increased attention in higher education. A signal of institutional commitment to faculty diversity often includes the hiring of an executive level chief diversity officer (CDO). To examine the effects of a CDO in a broad panel data context, we combine unique data on the initial hiring of a CDO with publicly available faculty and administrator hiring data by race and ethnicity from 2001 to 2016 for four-year or higher U.S. universities categorized as Carnegie R1, R2, or M1 institutions with student populations of 4,000 or more. We are unable to find significant statistical evidence that preexisting growth in diversity for underrepresented racial/ethnic minority groups is affected by the hiring of an executive level diversity officer for new tenure and non-tenure track hires, faculty hired with tenure, or for university administrator hires.

Even accepting that, a finding which is confirmed by p-values and a model most complex, it cannot be said that hiring CDOs has had no influence. They have. At the very least, they have found new ways to spend millions upon millions of dollars, which contributes to tuition increases. They have created sensitivity training, which is more or less mandatory. They have created a climate (aha!) of suspicion. They enforce an official ideology. None of these are good things.

But did these CDOs manage to hire more non-white professors than would have been hired in their absence? It is a counterfactual question whose answer must be yes. We have all seen cases where this is individually true: cronies here and there are found positions. The question the authors of the paper ask must be seen therefore as broader: was the relative increase in non-whites “large”, where large is defined by some model?

Now they say college presidents and provosts largely agree that “Most academic departments at my institution place a high value on diversity in the hiring process.” Meaning race counts. “Yet the number of new PhDs who are members of an underrepresented minority group vary widely by academic discipline.” How could this be? Rather, how could this be if the additional implicit premise of Equality is true?

I’ll skip over the various theories which tout “congruency”, which are the small but superior outcomes seen by non-white students taught by non-white professors. I’m sure some of this is true: people like to be with people who are like themselves.

The authors looked at large universities, where most data on race was voluntary. The U.S. Department of Education only recently mandated tracking and reporting by race. I wonder if they also subscribe to the ideology that race does not exist, that it is only a social construct? Never mind.

We are talking cause and effect: did hiring a CDO cause diversity, or did diversity (in students or faculty) cause the hiring of CDOs? Or are both true (at different places)? The authors used a model to answer this. “We found that decisions regarding a CDO at [Carnegie classification] R1 institutions within 100 miles do not have as much explanatory power over a R1 institution as the decisions regarding a CDO of all R1 institutions (excluding self) nationwide.”

A big problem with hypothesis testing is the rejection of truth. I bet it was true that in at least one place increased diversity caused (in part) the hiring of CDOs, and that in at least one other place hiring CDOs caused (in part) diversity. But a hypothesis-testing model (Bayesian or frequentist) rejects the “both true” possibility, and says only one can be truly true. And this is because, as you have often heard, probability models cannot identify cause. Think about this in the context of pills: the same pill may cure one man but sicken another, yet the model will reject one of these while claiming the other is the truly true.

Now the paper is large and intricate, so much so that it would bore us to go too deeply into it. We’ll do just the supposed cause-and-effect model, and only in brief.

“51 percent of respondent provosts from doctoral-granting public universities responded in the affirmative to the following question: ‘Either because of the protests, or because of prior/subsequent commitments, does your college currently have a target for increasing the number or percentage of minority faculty members you employ by a certain date?'”

Well, we all know colleges cannot withstand student “outrage”. A model is not needed.

Model

“To better understand directions of causality, we implement a Granger Causality Test between the initial establishment of CDO and changes in student, faculty, and administrator hiring diversity, and growth in student applications for undergraduate admissions.”

Here is a picture of the model, where at university i “ΔU^f_it is the change in the proportion of underrepresented students from year t − 1 to year t”, etc.:

Good grief! After that monstrosity came the p-values, and then exited the Briggs.

If only cause were so easy! No. You have to do the brutal hard work of going to each university and exploring it in depth, asking the people who did the hiring why they did the hiring, before and after their CDO, and try to tease out, for each hire, how much the effect or lack of the effect the CDO had on hiring whites or non-whites, and hope they aren’t misremembering, lying, or confused about how to answer. And who tells the truth on race anymore, given its acid effect on discussion?

Cause is hard! As it is, the model can give correlations, which are none too high, which only proves you again have to look university by university.
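For readers who have never met one, here is a minimal sketch of what a Granger-style test does. It is not the authors' model: the data are simulated, there is only one lag, and every name (`x`, `y`, `d`, `restricted`, `full`) is invented for illustration. All the test asks is whether lagged x improves the prediction of y beyond what lagged y already gives.

```r
# Simulated data (assumption): x feeds into y with a one-step delay
set.seed(7)
n <- 200
x <- rnorm(n)
y <- numeric(n)
for (t in 2:n) y[t] <- 0.5 * y[t - 1] + 0.4 * x[t - 1] + rnorm(1)

# Line up each y with the previous period's y and x
d <- data.frame(y = y[-1], y_lag = y[-n], x_lag = x[-n])

restricted <- lm(y ~ y_lag, data = d)          # past y only
full       <- lm(y ~ y_lag + x_lag, data = d)  # past y plus past x

# F test comparing the two fits; a wee p-value is what gets called "Granger causes"
anova(restricted, full)
```

Notice the comparison is only about prediction. Nothing in it knows why x precedes y, which is the complaint above.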

Extras

There are some gems in the paper, though. Such as the tacit admission that “diverse candidates”, being in limited supply (they say), are sitting pretty. The term “diverse candidates” is also theirs. Golly.

We also learn that “Evidence points toward a congregation by field and subfield for underrepresented minorities.” Meaning there are few black new PhDs in economics and physics (2.6%), but there are many black education PhDs (27%). They say. See their Table 2 for a breakdown of American Indian, black, Hispanic, Asian, and white recent PhDs by field.

Blacks are 20% of “Area & Ethnic & Cultural & Gender Studies”, Hispanics 17.3%, Asians 9.5%, and whites 44.5%. Blacks are also high in “Public Administration” (25.2%). Blacks are lowest in “Foreign Languages and Literature” (0.8%) and “Geosciences & Atmospheric & Ocean Sciences” (1.2%).

September 4, 2018 | 14 Comments

Judea Pearl Is Wrong On AI Identifying Causality, But Right That AI Is Nothing But Curve Fitting

Yep

I’ve disagreed with Judea Pearl before on causality, and I do so again below; but first some areas of agreement. Some deep agreement at that.

Pearl has a new book out (which I have not read yet) The Book of Why, which was the subject of an interview he gave at Quanta Magazine.

Q: “People are excited about the possibilities for AI. You’re not?”

As much as I look into what’s being done with deep learning, I see they’re all stuck there on the level of associations. Curve fitting. That sounds like sacrilege, to say that all the impressive achievements of deep learning amount to just fitting a curve to data. From the point of view of the mathematical hierarchy, no matter how skillfully you manipulate the data and what you read into the data when you manipulate it, it’s still a curve-fitting exercise, albeit complex and nontrivial.

This is true: perfectly true, inescapably true. It is more than that: it is tough-cookies true.
Fitting curves is all computers can ever do. Pearl doesn’t accept that limitation, though, as we shall see.

Q: “When you share these ideas with people working in AI today, how do they react?”

AI is currently split. First, there are those who are intoxicated by the success of machine learning and deep learning and neural nets. They don’t understand what I’m talking about. They want to continue to fit curves. But when you talk to people who have done any work in AI outside statistical learning, they get it immediately. I have read several papers written in the past two months about the limitations of machine learning.

Don’t despair, Pearl, old fellow, I share your pain. I too have written many articles about the limitations of machine learning, AI, deep learning, et cetera.

Q: “Yet in your new book you describe yourself as an apostate in the AI community today. In what sense?”

In the sense that as soon as we developed tools that enabled machines to reason with uncertainty, I left the arena to pursue a more challenging task: reasoning with cause and effect. Many of my AI colleagues are still occupied with uncertainty. There are circles of research that continue to work on diagnosis without worrying about the causal aspects of the problem. All they want is to predict well and to diagnose well.

I can give you an example. All the machine-learning work that we see today is conducted in diagnostic mode — say, labeling objects as “cat” or “tiger.” They don’t care about intervention; they just want to recognize an object and to predict how it’s going to evolve in time.

I felt an apostate when I developed powerful tools for prediction and diagnosis knowing already that this is merely the tip of human intelligence. If we want machines to reason about interventions (“What if we ban cigarettes?”) and introspection (“What if I had finished high school?”), we must invoke causal models. Associations are not enough — and this is a mathematical fact, not opinion.

Associations, which are what statisticians would call correlations, are not enough, amen, but that’s more than just a mathematical fact. It is just plain true.

Q: “What are the prospects for having machines that share our intuition about cause and effect?”

We have to equip machines with a model of the environment. If a machine does not have a model of reality, you cannot expect the machine to behave intelligently in that reality. The first step, one that will take place in maybe 10 years, is that conceptual models of reality will be programmed by humans.

The next step will be that machines will postulate such models on their own and will verify and refine them based on empirical evidence. That is what happened to science; we started with a geocentric model, with circles and epicycles, and ended up with a heliocentric model with its ellipses.

Robots, too, will communicate with each other and will translate this hypothetical world, this wild world, of metaphorical models.

Now I do not know exactly what Pearl has in mind with his “model of the environment” and “model of reality”, since I haven’t yet read the book. But if it’s just a list of associations (however complex) which are labeled, by some man, as “cause” and “effect”, then it is equivalent to a paper dictionary. The book doesn’t know it’s speaking about a cause, it just prints what it was told by an entity that does know (where I use that word in its full philosophical sense). The computer can be programmed to move from these to identifying associations consonant with this dictionary, but this is nothing more than advanced curve fitting. The computer has not learned about cause and effect. The computer hasn’t learned anything. It is a mindless box, an electronic abacus incapable of knowing or learning.

This is why I disagree with Pearl again, when he later says “We’re going to have robots with free will, absolutely. We have to understand how to program them and what we gain out of it. For some reason, evolution has found this sensation of free will to be computationally desirable.” Evolution hasn’t found a thing, and you cannot have a feeling of free will without having free will: it is impossible. Robots, being mindless machines, can thus never have free will, because we will never figure a way to program minds into machines, and minds are needed for feelings. Why? For the very good reason that minds are not material.

Nope

Not coincidentally, in the theological teleological sense of the word, Ed Feser has a new speech on the immateriality of the mind which should be viewed (about 45 minutes). I will take that speech as read—and not as the quale red, which is a great joke. Meaning I’m not going to defend that concept here: Feser has already done it. I’m accepting it as a premise here.

The mind isn’t made of stuff. It is not the brain. Man, paraphrasing Feser, is one substance and the intellect is one power among the many other physical powers we possess. (This is not Descartes’s dualism.) The mind is not a computer. It is much more than that. Computers are nothing more than abacuses, and there’s nothing exciting about that.

Man is a rational creature, and his mind is not material. Rationality is (among other things) the capacity to grasp universals (which are also immaterial). Cause is a universal. Cause is understood in the mind. (How we learn has been answered, as I’m sure you know as you’ve been following along, in our recent review of chapters in Summa Contra Gentiles.) Causes exist, of course, and make things happen, but our knowledge of them is an extraction from data, as the extraction of any universal is. Cause doesn’t exist “in” data. We can see data, and we move from it to knowledge of cause. But no algorithm can do this, because algorithms never know anything, and in particular no algorithm engineered to work on any material thing, like a computer, like even a quantum computer, can know anything. (Yes, we make mistakes, but that does not mean we always do.)

This means we will never be able to build any machine that does more than curve fitting. We can teach a computer to watch as we toss people off of rooftops and watch them go splat, and then ask the computer what will happen to the next person tossed off the roof. If we have taught this computer well, it will say splat. But the computer does not know what it is talking about. It has fit a curve and that is that. It doesn’t know the difference between a person and a carrot: it doesn’t know what splat means: it doesn’t know anything.

We can program this mindless device to announce, “Whenever the correlation exceeds some level, as it is in the tossed-splat example, print on the screen a cause was discovered.” Computers are good at finding these kinds of correlations in data, and they work tirelessly. But a cause has not been discovered. Merely a correlation. Otherwise all the correlations listed at Spurious Correlations would be declared causes.
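A sketch of that mindless device in R. The function name `declare_cause`, the thresholds, and the simulated random walks are all assumptions made up for illustration. The pairs of series are generated independently, so by construction neither causes the other.

```r
# Hypothetical rule: announce "cause" whenever |correlation| exceeds a threshold
declare_cause <- function(x, y, threshold = 0.9) {
  r <- cor(x, y)
  if (abs(r) > threshold) "a cause was discovered" else "no cause"
}

set.seed(1)
n_pairs <- 200

# Each pair is two independent random walks of length 50
hits <- replicate(n_pairs, {
  a <- cumsum(rnorm(50))
  b <- cumsum(rnorm(50))
  declare_cause(a, b, threshold = 0.7) == "a cause was discovered"
})

# Fraction of unrelated pairs the rule labels a "cause"
mean(hits)
```

Whatever that fraction turns out to be on your machine, every "cause" the rule announces here is spurious, because the data were built that way.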

Saying computers can discover a universal like cause is equivalent in operation to hypothesis testing, though not necessarily with a p-value. If the criterion to say “cause” isn’t a p-value, it has to be something, some criterion that says, “Okay, before not-cause, now cause.” It doesn’t matter what it is, so if you think it’s not a p-value, swap out “p-value” below with what you have in mind (“in mind”—get it?). In the upcoming peer-reviewed paper (and therefore perfectly true and indisputable) “Everything Wrong With P-Values Under One Roof” (to be published in a Springer volume in January: watch for the announcement), I wrote:

In any given set of data, with some parameterized model, its p-values are assumed true, and thus the decisions based upon them sound. Theory insists on this. The decisions “work”, whether the p-value is wee or not wee.

Suppose a wee p-value. The null is rejected, and the “link” between the measure and the observable is taken as proved, or supported, or believable, or whatever it is “significance” means. We are then directed to act as if the hypothesis is true. Thus if it is shown that per capita cheese consumption and the number of people who died tangled in their bed sheets are “linked” via a wee p, we are to believe this. And we are to believe all of the links found at the humorous web site Spurious Correlations, \cite{vigen_2018}.

I should note that we can either accept that grief of loved ones strangulated in their beds drives increased cheese eating, or that cheese eating causes sheet strangulation. This is a joke, but also a valid criticism. The direction of causal link is not mandated by the p-value, which is odd. That means the direction comes from outside the hypothesis test itself. Direction is thus (always) a form of prior information…

I go on to say that direction is a kind of prior information forbidden in frequentist theory. But direction is also not something in the data. It is something we extract, as part of the cause.
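The point about direction is easy to check for the simplest case, a correlation test. In this sketch the variables `cheese` and `sheets` are simulated stand-ins (not the real series from the web site); swapping the order of the arguments leaves the p-value unchanged.

```r
# Simulated stand-ins (assumption): two correlated series of length 30
set.seed(2)
cheese <- rnorm(30)
sheets <- 0.5 * cheese + rnorm(30)

p_xy <- cor.test(cheese, sheets)$p.value  # "cheese, then sheets"
p_yx <- cor.test(sheets, cheese)$p.value  # "sheets, then cheese"

# The test cannot tell the two orderings apart
all.equal(p_xy, p_yx)
```

Any direction you attach to the "link" has to come from you, not from the test.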

I have no doubt that our algorithms will fit curves better, though where free will is involved, or the system is physically complex, we will bump up against impossibility sooner or later, as we do in quantum mechanical events. I have perfect certainty that no computer will ever have a mind, because having a mind requires the cooperation of God. Computers may well pass the Turing test, but this is no feat. Even bureaucrats can pass it.

August 28, 2018 | 4 Comments

Chaos & Outrage In The Church — Guest Post by Ianto Watt

Chaos and outrage have again engulfed the Church. We must talk about this incredibly ugly and painful boil on the Body of Christ.

To lance a boil, you must insert the needle horizontal to the body, piercing through both sides of the mound. Then the needle must be yanked upward, ripping the entire head off. It’s the only way to expose and extricate the core. My dad, a farm boy, did it to me.  It was the only way to deal with it. The pain was excruciating. But the relief was immediate.

If the Church wants relief, we have to correctly identify the ailment. Diagnosis is the key to the cure. There are two possibilities: either moral rot or political nonsense. And the Pope seems to be going for the second one. He’s asking us the basic question of all flim-flam men since the beginning of time: who are you gonna believe, me or your lying eyes? In this case, Archbishop Carlo Vigano is supposedly our lying eyes. Read what he says.

The Church has been trying to treat the boil with any approach other than the needle. Avoidance of pain has been the foremost consideration. Consequently, things are getting different, but certainly not better.

I’d like to pass along some wisdom I came upon after a recent trip to my favorite store, The Mountain Man, in Manitou Springs, Colorado. There I picked up a book I judged by its cover, Colorado Outdoor Living, by Ernest Wilkinson. (Take a look at the cover and see why.)

Wilkinson was 83 (I think) when he wrote this (in 2008), and he was still guiding camping groups, on foot for a week or more at a time, in the Rockies. His story is as American as it gets. At one time in his life, back in the ’50’s, he was a bear and wolf trapper for the Forestry department in Colorado. His job was to ride a circuit amongst the two dozen or so sheep allotments that held 1,000 head of sheep each. He would visit the two lone shepherds encamped in each allotment and take care of any predators that were making life difficult for these shepherds and their sheep. Most of these simple men he visited were Mexicans.

Wilkinson relates (page 45):

I am reminded of a story concerning language barriers told to me by one of the sheepmen. A teacher was having problems attempting to get a young Spanish youth to understand English usage of words and numbers. The teacher knew the youth was often in sheep camp with his Dad, so she decided to use the names of animals the boy was familiar with. She began an explanation. ‘Okay, you have a hundred head of sheep and one leaves. How many do you have left?’ The boy promptly replied ‘None’.

‘No, no, you don’t understand that only one sheep left.’ The youth looked at the teacher and exclaimed ‘You may understand English words and numbers, but you do not understand sheep. When one leaves, they all leave and you have none left’. The boy knew that if one sheep jumped off in a direction, all the rest would follow.

That simple wisdom, spoken by an immigrant youth to the ‘educated’ Anglo, I believe, is emblematic of where we are at today. The shepherds have fled. Or joined with the wolves in many instances. Where will the sheep go when they are scattered?

I once knew a man named Gary North. You may have heard of him, or at least the many books he has written. He made the remark, in one of those many books, that, (paraphrasing here) our Protestant/Christian/Western society was governed by Three Robes. The first robe was the Professor, who taught us up from down. The second was the Pastor, who taught us right from wrong. The third was the Judge, who ruled on our actions in light of the teaching we had received. I think it is needless to say that in America, the first two of these three have abandoned their robes, and the third is naked under his.

Now Gary’s as rabidly anti-Catholic as a civilized Protestant can get. On many other questions (except money, specifically usury) he and I share a lot of common ground. But on those two issues of money and authority there is a Grand Canyon of separation between us. Gary doesn’t see the connection (and reflection) usury has to homosexuality. He’s against the sterility of homosexuality, but not the artificial fecundity of usury.

Gary and I had it out once over the question of who bestows the robes. The three bedrock institutions of the Protestant American Experiment that gave us the men who wore those Three Robes have devolved into utter filth: Harvard, Yale and Princeton. The three seminaries that became the three universities that became our three judges: our legislature, our executive and of course, our Supreme judiciary. And we have meekly followed them like the sheep we are. We can blame it all on them, right?

No. We didn’t get to where we are today by means of one act, or one person. We’ve all played a part. It’s instructive I think to see how we are all woven together in this tapestry. The problem isn’t in the warp and the woof of the fabric. Rather, it’s in the image the tapestry has produced. And the image is Machu Picchu, writ large. (Its art is often vividly pornographic.)

The Protestant Enlightenment has burned its candle at both ends. In their last flickering light, we have to decide if the Incas and Aztecs (the first American Experiment) were artists or demons. It’s too late to decide if we are better than they were. Because we have now surpassed them in all their bloodthirsty sex-crazed ways. The last institution standing against this spiritual and carnal leprosy is now under a full-scale assault. From within.

The Revolution is not yet complete, Komrade. But only because the one institution, the Catholic Church, that played no part in setting up either the first or second American Experiment is now the last bulwark against the debauched god who goes by the name of Reason. There is hope. There is always hope, if we believe. After all, this same institution destroyed the first American Experiment. Let’s hope it can do it again, before it destroys all of us.

It’s true. It was the Confessional State of Spain, through her imperfect Conquistadors, that stopped the slavery-for-sacrifice of the Aztecs, and the slavery-for-perversion of the Incas. But more importantly, it was the Mother of the Church that freed the people at Guadalupe from their fear of the return of their former Satanic Overlords.

Fear has now returned. A fear that these demons have returned, to claim in North America what they lost in the South. The white man doesn’t know how to handle this fear. Why? Because he doesn’t know the name of the demon. As any good exorcist knows, unless you know the demon’s name, you have no power over him.

What is the name of the demon that grips the Catholic Church? The same one that grips all of America, now that The Three Robes of The Second American Experiment have abandoned us? Simple, Pilgrim. Its name is Sexual Impurity. The same demon that possessed the Incas and the Aztecs in The First American Experiment. And let’s get one thing clear: we’re all part of the problem, even if it is only because of our silence. Read the written testimony of Archbishop Carlo Vigano in his indictment of his fellow Bishops and Cardinals at the highest levels of the Vatican, including the Pope today and tell me he has not lanced the boil by naming this demon. But notice also his sorrow for not having spoken sooner. Mea culpa.

What was the door the demon entered in by? By the Faustian bargain we in the West made: let us physically kill our children, and we will let you spiritually kill the ones who survived their conception and birth. Contraception and abortion amongst the sheep, in exchange for Homosexuality amongst the shepherds.

What then will be the sign of the demon’s departure? That I cannot say, only because we do not know the particular fortuna each of us has consumed in our acceptance of this cursed bargain. Thus, there cannot be a single answer to this question. We each have willingly ingested something, in some fashion, knowingly or not. While we won’t know what physical object to look for upon the demon’s departure, there will be a clear sign that he has departed. We will start having babies again. And we will start to protect them from spiritual evil again. The world will get friendlier to babies. And more hostile to abortionists and homosexuals.

Or is the real problem Clericalism? That’s exactly what the pushers of The French Revolution accused the Church of when she refused to follow the world in its lust. The revolutionaries said that the people (and thus, the State) gave too much deference to the clerics in their insistence that the sheep not eat of the poisoned plants that grew outside their pastures. Now, in a desperate attempt to again deflect attention from the true root of the problem, these Clerics at the top of this dung heap of homosexuality accuse the people of deferring too much to their poor clerics, who are then tempted beyond their spiritual strength to abuse their power of authority over their seductive sheep. Amazing, eh? We truly are part of the problem. Just not that part.

Let’s call all of this LGBT-ABC-XYZ crap by its real name of Sexual Impurity. And all of the actions taken in that name can be boiled down to one letter, instead of the Qwerty Keyboard that never ends. What is that letter? The letter is A. As in Auto-Sexual. Let’s be honest. The key word is Auto. Self. It’s all about ME! And whatever I do is meant to satisfy only me! That’s reality.

What are we to do? Do we wander off, like the sheep mentioned by the shepherd’s son? Or do we hold our position, awaiting a good shepherd, all the while surrounded by the pack of ravening wolves? If we flee, where are we to go? Who is to lead us on this retreat? What is our destination? Where is the safe sheepfold?

Here is the biggest danger, Komrade. For while we know that we are currently being devoured by a particular pack of pretty-boy wolves, that doesn’t mean that there aren’t other wolves. Wolves who will gladly help you escape the clutches of another wolf.

Yes, there are plenty of people who are willing to offer temporary refugee status to those fleeing from the current Pope and his henchmen. But this refuge will come at a cost. The cost of your faith. You will have to renounce your citizenship in your old country (Rome) but you will never gain rights in your new land. Because it’s not your land. Take a lesson from the Palestinians. You will never find peace. You can only become a wolf. It’s that, or wait to be eaten at a later date as you sit in the larder until your number is called. Be polite, and go without murmur. Remember, political politeness is the watchword of today’s society. If you don’t want to be devoured now by the Press Wolves, never verbalize the thought that Homosexuality is a perversion. Never mention the name of this demon.

So then, Pilgrim, what is going to happen, now that this indictment has been released upon the Homosexual Priesthood? I don’t see how this current Pope can stay. But maybe he can. At least for a while. An excruciating while. It looks like he’s going to try. Hoping his friends in the media will cover for him once again. But they are hungry for blood too, and his blood is as tasty as any. Will they resist?

The boil has been lanced, side through side. But it has not had the lance yanked upward to expose the core. The core, which must be dug out. Or the infection will worsen to gangrene.
And if this Pope and all of his Lavender Wolves are not turned out soon, then the real danger to the flock is this: that the faithful will be only too willing to follow another shepherd who will give them a sense of security, a sense of the traditions, sacraments and doctrine that they remember from their past, before all the sordidness of the current scandal became too much to bear. Too much, even for those who had a hand in the acquiescence to the terminal political politeness. That would be all of us.

The answer then is what it has always been. Stay put. Do as they say, but not as they do. Stay where you are. Stay and beware of other wolves, in Shepherd’s clothing. Wolves offering sanctuary for those distressed by filth. But wolves, nonetheless. Beware, my friends. Especially of Greeks, bearing gifts.

Mea culpa, mea culpa, mea maxima culpa!

August 24, 2018 | 3 Comments

How To Do Predictive Statistics: Part VII New (Free) Software Tobit Regression

Previous post in the series (or click above on Class). REVIEW!

Download the code: mcmc.pred.R, mcmc.pred.examples.R. If you downloaded before, download again. This is version 0.3! Only the example code changed since last time.

For an explanation of the theory behind all this, which is markedly different than classical views, get this book: Uncertainty.

Let’s return to the despised CGPA data. I know, I know, but it’s a perfect way to demonstrate tobit regression, which is exactly like ordinary regression except the Y can be censored above or below.

We saw with ordinary regression that there was substantial probability leakage in many scenarios because CGPA can only live on 0 to 4. We had large probabilities for CGPAs both greater than 4 and less than 0. Well, why not disallow it, censoring at these levels?
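To see what leakage and censoring mean in numbers, here is a minimal sketch that needs no data download. The mean and standard deviation (`mu`, `sigma`) are made up and are not estimates from the CGPA data; the normal distribution stands in for an ordinary-regression predictive distribution.

```r
# Hypothetical normal predictive distribution for CGPA (made-up values)
mu    <- 3.6
sigma <- 0.5

leak_above <- 1 - pnorm(4, mu, sigma)  # probability given to CGPA > 4
leak_below <- pnorm(0, mu, sigma)      # probability given to CGPA < 0
c(above = leak_above, below = leak_below)

# Censoring: draws beyond a limit are moved onto that limit
set.seed(3)
draws    <- rnorm(10000, mu, sigma)
censored <- pmin(pmax(draws, 0), 4)

range(censored)      # now stays inside 0 to 4
mean(censored == 4)  # mass that piled up at the upper limit
```

The leaked probability does not vanish under censoring. It lands on the boundary, which is why a tobit prediction can put a visible lump at exactly 4.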

Start by reading the data back in:

x <- read.csv("http://wmbriggs.com/public/cgpa.csv")

As with multinomial we have to track the model formula, a limitation of the MCMCpack.


# cgpa can only exist between 0 and 4; the ordinary regression
# above exhibits strong probability leakage
form = formula('cgpa~hgpa*sat')
fit = MCMCtobit(form, below=0,above=4,data=x, mcmc=30000)

You can see the places for censoring, and also that I upped the MCMC samples a bit. Just to show how it's done.

Okay, now to scenarios. Why? Surely you reviewed! Because we want as always:

     Pr (Y | new X, old X&Y, Model & Assumptions)

where here Y = "CGPA = c", X is the HGPA and SAT scores, and the model, including its assumptions about priors and so forth, is the one we already discussed.


p=MCMCtobit.pred(fit,form,x[6,],0,4)
hist(p,300)
quantile(p,c(0.05,.5,.95))

I'm not showing the histogram here. You do it.

If you saw an error about truncnorm you will need to install that package with its dependencies, like this:

install.packages('truncnorm', dependencies = TRUE)

Why x[6,] and not x[1,] like we always do? No reason whatsoever. I think I aimed for 1 and hit 6, and was too lazy to change it. Anyway, I got for the quantile function


       5%       50%       95% 
0.2269157 0.8828441 1.6033544 


So there's a 90% chance a NEW person like x[6,] will have a CGPA between 0.23 and 1.6. Your results will differ slightly, as usual. If they differ a lot, increase that mcmc = 30000 in the model fit function.

Person 6 had a SAT of 756 and HGPA of 0.33. What if we wanted probabilities for a fixed scenario? As we've done before:


w = x[6,]
w$sat = 1e6
p=MCMCtobit.pred(fit,form,w,0,4)
hist(p,300)
quantile(p,c(0.05,.5,.95))

Gives basically a 100% chance of CGPA = 4. Everybody who thinks this is the wrong answer, raise your hands. One, two, ... Hmm. You all fail.

This is the right answer. Even before with probability leakage the model gave the right answer. Unless you have a typo in your code, models always give the right answers! That's because all models are conditional on only the information you provide, which includes information on the model structure. Here there is NO information about limits of SAT scores, only limits on CGPA. So we can specify whatever we like for SAT, even a million.
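Here is a toy version of why a million-point SAT gives CGPA = 4 with near certainty. The coefficients (`b0`, `b_sat`), the spread `sigma`, and the helper names `latent_mean` and `pred` are invented for illustration and are not taken from the fitted model. The only thing the toy knows is the censoring limits on CGPA.

```r
# Made-up linear model for the uncensored (latent) CGPA
b0    <- 0.5
b_sat <- 0.002
sigma <- 0.4
latent_mean <- function(sat) b0 + b_sat * sat

# Censored predictive draws: anything outside 0 to 4 is moved to the limit
pred <- function(sat, n = 5000) {
  pmin(pmax(rnorm(n, latent_mean(sat), sigma), 0), 4)
}

set.seed(4)
mean(pred(1e6) == 4)   # share of draws sitting exactly at 4
mean(pred(1000) == 4)  # a less absurd SAT, for comparison
```

The toy never heard of a limit on SAT, so it carries on extrapolating, and the censoring does the rest.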

Outside of this model we know it's BS. But that makes our model assumptions wrong, not our model.

Now that that's understood finally and for all time, let's do our trick of looking at all the scenarios in the data.


# Has to be a loop! apply, mapply, lapply, etc. all convert
# the data.frame to a matrix or an unusable list!
# R has no utility to step through a data.frame while maintaining class
# all the old x
q = matrix(0,nrow(x),3)
for(i in 1:nrow(x)){
  q[i,]=quantile(MCMCtobit.pred(fit,form,x[i,]),c(0.1,.5,.9))
}

We discussed before why the loop is necessary. You reviewed, from the beginning, so I know you already know. So there was no reason for me to mention it here.

I went with an 80% predictive interval and not 90% as above. Why not? 90% is harsh. Most people for many decisions will be happy with 80% (conditional) certainty. Feel free to change it---AS YOU MUST---for your decisions.
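If you want to switch levels without retyping quantiles, a small helper will do. This is a sketch only: `pred_interval` is a name I made up, and the vector `p` below is simulated to stand in for the output of MCMCtobit.pred, so it runs without the downloaded code.

```r
# Central predictive interval (lower, median, upper) at a chosen level
pred_interval <- function(p, level = 0.8) {
  alpha <- (1 - level) / 2
  quantile(p, c(alpha, 0.5, 1 - alpha))
}

# Simulated stand-in for predictive draws, censored to 0..4 (assumption)
set.seed(5)
p <- pmin(pmax(rnorm(5000, 2, 0.7), 0), 4)

pred_interval(p, 0.8)  # the 80% interval used in the loop above
pred_interval(p, 0.9)  # the wider 90% interval used earlier
```

The 90% interval always contains the 80% one, which is all "harsh" means here: you pay for more certainty with a wider range.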

Look at the answers:


plot(x$sat,q[,2],ylab='cgpa',ylim=c(min(q),max(q)))
for(i in 1:nrow(x)){
  lines(c(x$sat[i],x$sat[i]),c(q[i,1],q[i,3]),col=3)
}

# or

plot(x$hgpa,q[,2],ylab='cgpa',ylim=c(min(q),max(q)))
for(i in 1:nrow(x)){
  lines(c(x$hgpa[i],x$hgpa[i]),c(q[i,1],q[i,3]),col=3)
}

Same as we did for ordinary regression---which you remember, since you reviewed. So I won't show those here. You do them.

How about scenarios for probabilities of CGPA greater than 3? Same as before:


# all the old x
q = numeric(nrow(x))  # one probability per row of the old data
g = 3
for(i in 1:nrow(x)){
  p = MCMCtobit.pred(fit,form,x[i,])
  q[i]= mean(p>=g)  # share of predictive draws at or above g
}

# Then

plot(x$sat,q,ylab='Pr(CGPA>g|old data,M)')

# Or

plot(x$hgpa,q,ylab='Pr(CGPA>g|old data,M)')

You can color the dots using hgpa, or put them into decision buckets, or whatever. You can redo the 3D plots, too.
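One way to do the buckets and the coloring, as a sketch. The `hgpa` and `q` vectors here are simulated stand-ins so the example runs by itself, and the cut points (0.2 and 0.5) and bucket labels are arbitrary choices; substitute your own `x$hgpa`, your own `q`, and whatever thresholds your decision calls for.

```r
# Simulated stand-ins (assumption): hgpa values and made-up probabilities
set.seed(6)
hgpa <- runif(100, 0, 4)
q    <- plogis(-4 + 1.5 * hgpa)

# Decision buckets from arbitrary cut points
bucket <- cut(q,
              breaks = c(0, 0.2, 0.5, 1),
              labels = c("unlikely", "toss-up", "likely"),
              include.lowest = TRUE)
table(bucket)

# Color each dot by its bucket
plot(hgpa, q, col = as.integer(bucket), pch = 19,
     ylab = 'Pr(CGPA>g|old data,M)')
legend("topleft", legend = levels(bucket), col = 1:3, pch = 19)
```

The buckets are where the decision comes in: the model supplies probabilities, and you supply the cut points.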

We've reached the end of the pre-packaged MCMCpack extensions! Next up, JAGS.