
Category: Statistics

The general theory, methods, and philosophy of the Science of Guessing What Is.

September 12, 2018 | 7 Comments

Governments Demand Encryption Backdoors

The Blonde Bombshell warned us about the “Five Eyes”. It’s illegal for our most beneficent government, which loves us and only wants what is best for us, to spy on us. Of course, that depends on what is meant by “spying.” But never mind that for a moment.

Our most loving government still wants to know about many of us, and since they cannot spy legally, they ask a friend to do it for them. Or so the rumor goes. We ask, say, England to spy on our citizens, and we in turn spy on Australia’s, who spy on Canada’s, who spy on New Zealand’s, who spy on England. Or Australia spies on New Zealand, who…ah, never mind. It’s a merry skulduggery mixup!

The real headline is this: ‘Five Eyes’ governments call on tech giants to build encryption backdoors — or else.

A pact of five nation states dedicated to a global “collect it all” surveillance mission has issued a memo calling on their governments to demand tech companies build backdoor access to their users’ encrypted data — or face measures to force companies to comply.

The international pact — the US, UK, Canada, Australia and New Zealand, known as the so-called “Five Eyes” group of nations — quietly issued the memo last week demanding that providers “create customized solutions, tailored to their individual system architectures that are capable of meeting lawful access requirements.”

This kind of backdoor access would allow each government access to encrypted call and message data on their citizens. If the companies don’t voluntarily allow access, the nations threatened to push through new legislation that would compel their help.

In other words, you will be forced to let the government spy on you. And they will, as they always have, put that information only to good, moral, upright use. Because of love.

There’s only a couple more terms we need grasp. One is “end-to-end encryption — where the data is scrambled from one device to another — even the tech companies can’t read their users’ messages.”

Another is the difference between encryption, which we can take as the mathematical scrambling of a message, and encoding, which is the stating of a message in a different language.

Finally, “Security researchers and other critics of encryption backdoors have long said there’s no mathematical or workable way to create a ‘secure backdoor’ that isn’t also susceptible to attack by hackers, and widely derided any backdoor effort.”

This is true. Even if your device and your friend’s device employed unbreakable one-time-pad encryption, there must still come the point at which the device decodes and decrypts the message and displays it in plain text. Anything that can hack what’s on the screen can then read the message. You cannot make a device invulnerable to this kind of hacking (other kinds of hacks exist, too, such as keystroke reading, etc.). Backdoors into the encryption code are thus not strictly necessary: the government can mandate “screen readers” instead.
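To make the point concrete, here is a minimal sketch of a one-time pad in Python. The message and the function names are my own illustrations, not anything from the memo or from a real messaging app. It shows only that the receiving device necessarily ends up holding the plain text.

```python
import secrets

def otp_encrypt(plaintext: bytes, pad: bytes) -> bytes:
    """XOR each message byte with the matching pad byte."""
    if len(pad) != len(plaintext):
        raise ValueError("pad must be exactly as long as the message")
    return bytes(p ^ k for p, k in zip(plaintext, pad))

# XOR is its own inverse, so decryption is the very same operation.
otp_decrypt = otp_encrypt

message = b"Meet at the usual place"           # illustrative message
pad = secrets.token_bytes(len(message))         # random pad, used once, then destroyed

ciphertext = otp_encrypt(message, pad)          # what travels over the wire
recovered = otp_decrypt(ciphertext, pad)        # what the receiving device holds

# `recovered` is ordinary plain text in the device's memory, which is
# exactly what a screen reader or keystroke logger gets to see.
print(recovered.decode())
```

However strong the pad, the last line prints the message in the clear, and that is the spot a "screen reader" would watch.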

Encryption, then, only slows down a determined enemy. Still, slowing down an enemy is a valuable strategy.

Now suppose you are one of the very, very few individuals your government does not love, and you want to make life hard on those who would spy on you. What can you do? Encryption is good, but imperfect, as mentioned. You have to limit your “meta” data, which is liable to sigint spying. The NSA, for instance, collected where you made calls (or texts or emails, etc.), to whom you made calls, what time you made calls, the duration of those calls, on what devices (and who owned them) you made calls, the pattern of those calls and non-calls (your device is tracking you wherever you go), the pattern of calls of whom you called, and so on. All this together paints a rich picture of your message, even if the exact message remains hidden.

There are only three solutions. The best is the old fashioned way. Use non-electronic means to communicate with your friends. Written one-time pads cannot be spoofed or broken, and you only need worry about cameras recording you reading the message (which you can burn). Sigint is still possible—the cameras in public spaces see you coming and going—but it can be reduced.

Spoofing is still fun. Send random messages at odd times from strange locations that seem to be laden with content but which are nonsense. Be careful, too, who you’re sending them to, of course. Spoofing keeps ’em guessing.

Last, use layers of code (before encryption). Your device encodes your plain text into 0-1 bits, but that’s a trivial code. Remember the Navajo code talkers? That worked because only a few knew what the encoding meant. You want messages like this:

Hello, Lucky. Hello, Lucky. Report my signal. Report my signal. Over.

Hello, George Mike Walters. Strength three. Over.

Recon reports Indians on the warpath in your area. Over.

Ain’t no Indians around here. Over.

Do not take literally. Repeat. Do not take literally.

The vultures are circling the carcass. Repeat. The vultures are circling the carcass. Over.

I see a couple of gulls, but I don’t…

The pit bull is out of the cage. The crips are raiding the store.

Make yours a little more inventive. Mix real ones with spoofed ones. And NEVER, not ever, repeat a scheme. Even one repeat and they gotcha.
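The layering idea (code first, encryption second) can be sketched in a few lines of Python. The codebook here is hypothetical and deliberately tiny, and the XOR pad stands in for whatever encryption your device really uses. The point is only that whoever strips the encryption still holds a coded phrase, not the meaning.

```python
import secrets

# Hypothetical private codebook, agreed in person and never reused.
CODEBOOK = {
    "meeting is on": "the vultures are circling the carcass",
    "meeting is off": "the pit bull is out of the cage",
}
DECODEBOOK = {phrase: meaning for meaning, phrase in CODEBOOK.items()}

def xor_bytes(data: bytes, pad: bytes) -> bytes:
    """One-time-pad style XOR; the same call encrypts and decrypts."""
    if len(data) != len(pad):
        raise ValueError("pad must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, pad))

def send(meaning: str):
    """Layer 1: swap the meaning for a coded phrase. Layer 2: encrypt it."""
    coded = CODEBOOK[meaning].encode()
    pad = secrets.token_bytes(len(coded))
    return xor_bytes(coded, pad), pad

def receive(ciphertext: bytes, pad: bytes):
    """Undo the encryption, then look the phrase up in the private codebook."""
    coded = xor_bytes(ciphertext, pad).decode()
    return coded, DECODEBOOK[coded]

ciphertext, pad = send("meeting is on")
seen_on_screen, meaning = receive(ciphertext, pad)
print(seen_on_screen)   # all a snoop reading the screen would learn
```

Only somebody holding the codebook gets from the phrase on the screen to the meaning, and since schemes should never repeat, a real codebook would be thrown away after one use.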

September 6, 2018 | 4 Comments

Pew’s Materialistic Mix Up

Pew has a new survey result with title Americans are far more religious than adults in other wealthy nations. Key was the picture below—or at their site, bigger and better, and, like the old Police Squad! videos, in color!

Bottom axis is per capita GDP adjusted by some formula. So it’s an approximation of an approximation. Whatever. The vertical axis is percent of adults surveyed who pray daily. So it’s also a guess. Also whatever.

The solid black line is, as regular readers ought to know by now, fiction. It didn’t happen. It’s made up. It’s unreal. It’s a chimera. It’s the result of magical scientism. So ignore it.

Still, there is a vague sort of trend, such that countries with higher dollars (whatever they might mean here) have people who pray less. The question is: so?

Readers also know that comparing countries like homogeneous Norway, which has fewer people than Manhattan island on a Tuesday afternoon, with heterogeneous societies like the once United States, is silly in the extreme. It’s done all the time, though. Especially when the comparison is made to show up the grand old USA. As it is here.

If we want to compare similar countries, then a better, but still imperfect, counter is Norway versus Switzerland. (Find them both.) The conclusion then would be more prayer produces higher bucks. On your knees!

Of course, we could compare Norway with Senegal, which is also somewhat homogeneous, but three times the size of Norway. There, it looks like more prayer produces fewer bucks. But there wouldn’t be any other difference besides population size we might consider? Nah. Probably not.

Put your left hand so your left forefinger shoots vertical from the horizontal axis, and such that your hand blocks out everything to the left. Ignore the fiction line. Now what does the plot look like? Right. Not much. All the caveats that went into making the dollar figure, and the uncertainty in the survey about prayer become sharper and more important.
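The hand trick can be imitated on a computer. The sketch below uses invented numbers, emphatically not Pew's, arranged so that two blocks of countries differ from each other but show no trend inside either block. The names `poor`, `rich`, and `r_of` are my own.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Invented numbers, NOT Pew's: (GDP per capita in $1000s, % praying daily).
poor = [(1, 80), (2, 70), (3, 85), (4, 70), (5, 80)]
rich = [(40, 20), (45, 10), (50, 25), (55, 10), (60, 20)]

def r_of(countries):
    """Correlation between GDP and prayer within a group of countries."""
    gdp, prayer = zip(*countries)
    return pearson_r(gdp, prayer)

r_all = r_of(poor + rich)   # every country on the plot
r_rich = r_of(rich)         # the hand covers everything to the left
r_poor = r_of(poor)         # the hand covers everything to the right

print(f"all: {r_all:.2f}  rich only: {r_rich:.2f}  poor only: {r_poor:.2f}")
```

With all ten made-up countries the correlation is strongly negative; inside either block it is zero by construction. The "trend" lives entirely in the gap between the blocks, which is where all the unmeasured differences live, too.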

Look at China, bottom of prayerfulness, but not so hot with money, either. What’s going on? I thought the black line said less prayer equaled more money?

And isn’t money the most important thing there is?

Again I say to you, it is easier for a camel to go through the eye of a needle, than for a rich man to enter the kingdom of God.

Well, you can do the hand trick for various other blocks which make geographic, religious, or cultural sense, and you discover…that there is nothing much to be discovered. If anything, it shows how great the USA is. Lots of prayer, lots of success. Making it simultaneously easier and harder to get into the kingdom of God. Which force wins in the end I don’t know.

There’s another Pew plot. I won’t show it, so you’ll have to go there to see it. “Greater income inequality is tied to greater importance of religion.”

Lord only knows how they calculated, with anything even approaching precision, income “inequality”. The conceit that “inequality” is a bad thing, and that all bad things must be corrected, we can ignore.

No we can’t. It’s dumb. Houses where mom stays home to care for the family’s six kids are better, in general, than those where male and female (and identifying as such) both head off to the office yet put off having families until they can “afford” them.

Forgetting that distinction—or rather, not letting it bother you—we can do the same trick with the hands, blocking off countries which are similar in size, geography, culture. Again, not much can be learned that wasn’t already known.

Still, Pew manages to say:

One idea popular among modern sociologists for a number of decades held that America’s unregulated and open religious “market” — where different faiths compete freely for new members without government interference — has fostered fertile ground for religious growth.

More recently, some sociologists have argued that there is a link between relatively high levels of income inequality in the U.S. and continued high levels of religiosity. These researchers posit that less-well-off people in the U.S. and other countries with high levels of income inequality may be more likely to seek comfort in religious faith because they also are more likely to experience financial and other insecurities.

This is the exact opposite of the eye-of-the-needle metaphor. And dumb, because it’s not what is seen; or, rather, it is seen, but the well-to-do (in some quarters: not Hollywood, etc.) are religious, too.

September 5, 2018 | 7 Comments

That Paper About Hiring Chief Diversity Officers At Universities

The paper is “The Impact of Chief Diversity Officers on Diverse Faculty Hiring” by Steven W. Bradley, James R. Garven, Wilson W. Law, and James E. West. (Thanks to John Cook for the tip.) Here’s the abstract.

As the American college student population has become more diverse, the goal of hiring a more diverse faculty has received increased attention in higher education. A signal of institutional commitment to faculty diversity often includes the hiring of an executive level chief diversity officer (CDO). To examine the effects of a CDO in a broad panel data context, we combine unique data on the initial hiring of a CDO with publicly available faculty and administrator hiring data by race and ethnicity from 2001 to 2016 for four-year or higher U.S. universities categorized as Carnegie R1, R2, or M1 institutions with student populations of 4,000 or more. We are unable to find significant statistical evidence that preexisting growth in diversity for underrepresented racial/ethnic minority groups is affected by the hiring of an executive level diversity officer for new tenure and non-tenure track hires, faculty hired with tenure, or for university administrator hires.

Even accepting that, a finding which is confirmed by p-values and a model most complex, it cannot be said that hiring CDOs has had no influence. They have. At the very least, they have found new ways to spend millions upon millions of dollars, which contributes to tuition increases. They have created sensitivity training, which is more or less mandatory. They have created a climate (aha!) of suspicion. They enforce an official ideology. None of these are good things.

But did these CDOs manage to hire more non-white professors than would have been hired in their absence? It is a counterfactual question whose answer must be yes. We have all seen cases where this is individually true: cronies here and there are found positions. The question the authors of the paper ask must be seen therefore as broader: was the relative increase in non-whites “large”, where large is defined by some model?

Now they say college presidents and provosts largely agree that “Most academic departments at my institution place a high value on diversity in the hiring process.” Meaning race counts. “Yet the number of new PhDs who are members of an underrepresented minority group vary widely by academic discipline.” How could this be? Rather, how could this be if the additional implicit premise of Equality is true?

I’ll skip over the various theories which tout “congruency”, which are the small but superior outcomes seen by non-white students taught by non-white professors. I’m sure some of this is true: people like to be with people who are like themselves.

The authors looked at large universities, where most data on race was voluntary. The U.S. Department of Education only recently mandated tracking and reporting by race. I wonder if they also subscribe to the ideology that race does not exist, that it is only a social construct? Never mind.

We are talking cause and effect: did hiring a CDO cause diversity, or did diversity (in students or faculty) cause the hiring of CDOs? Or are both true (at different places)? The authors used a model to answer this. “We found that decisions regarding a CDO at [Carnegie classification] R1 institutions within 100 miles do not have as much explanatory power over a R1 institution as the decisions regarding a CDO of all R1 institutions (excluding self) nationwide.”

A big problem with hypothesis testing is the rejection of truth. I bet it was true that in at least one place increased diversity caused (in part) the hiring of CDOs, and that in at least one other place hiring CDOs caused (in part) diversity. But a hypothesis-testing model (Bayesian or frequentist) rejects the “both true” possibility, and says only one can be truly true. And this is because, as you have often heard, probability models cannot identify cause. Think about this in the context of pills: the same pill may cure one man but sicken another, yet the model will reject one of these while claiming the other is the truly true.
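One way to see the pill point is with a toy trial. The numbers below are invented for illustration: the pill changes every single patient, half for the better and half for the worse, yet the group averages show nothing at all.

```python
from statistics import mean

# Invented trial: the pill moves every patient's score, half up, half down.
baseline = [50] * 100
true_effect = [+5] * 50 + [-5] * 50           # cured vs. sickened
treated = [b + e for b, e in zip(baseline, true_effect)]

# What a comparison of group means gets to look at.
average_effect = mean(treated) - mean(baseline)

# What actually happened, patient by patient.
patients_affected = sum(e != 0 for e in true_effect)

print(f"average effect: {average_effect}")
print(f"patients the pill actually changed: {patients_affected} of {len(baseline)}")
```

Any procedure that looks only at the averages would report "no effect" here, though the pill was a cause in all one hundred cases. That is a sketch under loud assumptions, but it is the same trap as declaring a single winner between "CDOs caused diversity" and "diversity caused CDOs".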

Now the paper is large and intricate, so much so that it would bore us to go too deeply into it. We’ll do just the supposed cause-and-effect model, and only in brief.

“51 percent of respondent provosts from doctoral-granting public universities responded in the affirmative to the following question: ‘Either because of the protests, or because of prior/subsequent commitments, does your college currently have a target for increasing the number or percentage of minority faculty members you employ by a certain date?'”

Well, we all know colleges cannot withstand student “outrage”. A model is not needed.

Model

“To better understand directions of causality, we implement a Granger Causality Test between the initial establishment of CDO and changes in student, faculty, and administrator hiring diversity, and growth in student applications for undergraduate admissions.”

Here is a picture of the model, where at university i “ΔU^f_it is the change in the proportion of underrepresented students from year t − 1 to year t”, etc.:

Good grief! After that monstrosity came the p-values, and then exited the Briggs.

If only cause were so easy! No. You have to do the brutal hard work of going to each university and exploring it in depth, asking the people who did the hiring why they did the hiring, before and after their CDO, and try to tease out, for each hire, how much effect, or lack of effect, the CDO had on hiring whites or non-whites, and hope they aren’t misremembering, lying, or confused about how to answer. And who tells the truth on race anymore, given its acid effect on discussion?

Cause is hard! As it is, the model can give correlations, which are none too high, which only proves you again have to look university by university.
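For readers who have never met a Granger test, here is a bare-bones sketch of the generic textbook idea in Python. It is not the paper's specification, which is far more elaborate. The series are invented, and the helper names `ols_rss` and `granger_f` are my own. All the test asks is whether last period's x improves the prediction of this period's y beyond what y's own past already gives.

```python
import random

def ols_rss(rows, targets):
    """Residual sum of squares from least squares of targets on rows (plus intercept)."""
    X = [[1.0] + list(row) for row in rows]
    k = len(X[0])
    # Augmented normal equations: [X'X | X'y]
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * t for x, t in zip(X, targets))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return sum((t - sum(b * v for b, v in zip(beta, x))) ** 2
               for x, t in zip(X, targets))

def granger_f(x, y, lag=1):
    """F statistic: do lagged values of x improve the prediction of y beyond lagged y?"""
    times = range(lag, len(y))
    targets = [y[t] for t in times]
    own = [[y[t - l] for l in range(1, lag + 1)] for t in times]
    both = [own_row + [x[t - l] for l in range(1, lag + 1)]
            for own_row, t in zip(own, times)]
    rss_restricted = ols_rss(own, targets)    # y's own past only
    rss_full = ols_rss(both, targets)         # y's past plus x's past
    dof = len(targets) - (2 * lag + 1)
    return ((rss_restricted - rss_full) / lag) / (rss_full / dof)

# Invented series: y echoes last period's x plus noise, by construction.
random.seed(1)
x = [random.gauss(0, 1) for _ in range(200)]
y = [0.0] + [0.8 * x[t - 1] + random.gauss(0, 0.5) for t in range(1, 200)]

f_x_to_y = granger_f(x, y)
f_y_to_x = granger_f(y, x)
print(f"x -> y: F = {f_x_to_y:.1f}   y -> x: F = {f_y_to_x:.1f}")
```

The statistic is large in the direction I built into the fake data and small in the other. But notice what it measures: improved prediction from lagged values, which is to say a correlation with a time stamp on it. I knew the cause here only because I wrote it into the simulation.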

Extras

There are some gems in the paper, though. Such as the tacit admission that “diverse candidates”, being in limited supply (they say), are sitting pretty. The term “diverse candidates” is also theirs. Golly.

We also learn that “Evidence points toward a congregation by field and subfield for underrepresented minorities.” Meaning there are few new black PhDs in economics and physics (2.6%), but there are many new black PhDs in education (27%). They say. See their Table 2 for a breakdown of American Indian, black, Hispanic, Asian, and white recent PhDs by field.

Blacks are 20% of “Area & Ethnic & Cultural & Gender Studies”, Hispanics 17.3%, Asians 9.5%, and whites 44.5%. Blacks are also high in “Public Administration” (25.2%). Blacks are lowest in “Foreign Languages and Literature” (0.8%) and “Geosciences & Atmospheric & Ocean Sciences” (1.2%).

September 4, 2018 | 17 Comments

Judea Pearl Is Wrong On AI Identifying Causality, But Right That AI Is Nothing But Curve Fitting

Yep

I’ve disagreed with Judea Pearl before on causality, and I do so again below; but first some areas of agreement. Some deep agreement at that.

Pearl has a new book out (which I have not read yet) The Book of Why, which was the subject of an interview he gave at Quanta Magazine.

Q: “People are excited about the possibilities for AI. You’re not?”

As much as I look into what’s being done with deep learning, I see they’re all stuck there on the level of associations. Curve fitting. That sounds like sacrilege, to say that all the impressive achievements of deep learning amount to just fitting a curve to data. From the point of view of the mathematical hierarchy, no matter how skillfully you manipulate the data and what you read into the data when you manipulate it, it’s still a curve-fitting exercise, albeit complex and nontrivial.

This is true: perfectly true, inescapably true. It is more than that: it is tough-cookies true. Fitting curves is all computers can ever do. Pearl doesn’t accept that limitation, though, as we shall see.

Q: “When you share these ideas with people working in AI today, how do they react?”

AI is currently split. First, there are those who are intoxicated by the success of machine learning and deep learning and neural nets. They don’t understand what I’m talking about. They want to continue to fit curves. But when you talk to people who have done any work in AI outside statistical learning, they get it immediately. I have read several papers written in the past two months about the limitations of machine learning.

Don’t despair, Pearl, old fellow, I share your pain. I too have written many articles about the limitations of machine learning, AI, deep learning, et cetera.

Q: “Yet in your new book you describe yourself as an apostate in the AI community today. In what sense?”

In the sense that as soon as we developed tools that enabled machines to reason with uncertainty, I left the arena to pursue a more challenging task: reasoning with cause and effect. Many of my AI colleagues are still occupied with uncertainty. There are circles of research that continue to work on diagnosis without worrying about the causal aspects of the problem. All they want is to predict well and to diagnose well.

I can give you an example. All the machine-learning work that we see today is conducted in diagnostic mode — say, labeling objects as “cat” or “tiger.” They don’t care about intervention; they just want to recognize an object and to predict how it’s going to evolve in time.

I felt an apostate when I developed powerful tools for prediction and diagnosis knowing already that this is merely the tip of human intelligence. If we want machines to reason about interventions (“What if we ban cigarettes?”) and introspection (“What if I had finished high school?”), we must invoke causal models. Associations are not enough — and this is a mathematical fact, not opinion.

Associations, which are what statisticians would call correlations, are not enough, amen, but that’s more than just a mathematical fact. It is just plain true.

Q: “What are the prospects for having machines that share our intuition about cause and effect?”

We have to equip machines with a model of the environment. If a machine does not have a model of reality, you cannot expect the machine to behave intelligently in that reality. The first step, one that will take place in maybe 10 years, is that conceptual models of reality will be programmed by humans.

The next step will be that machines will postulate such models on their own and will verify and refine them based on empirical evidence. That is what happened to science; we started with a geocentric model, with circles and epicycles, and ended up with a heliocentric model with its ellipses.

Robots, too, will communicate with each other and will translate this hypothetical world, this wild world, of metaphorical models.

Now I do not know exactly what Pearl has in mind with his “model of the environment” and “model of reality”, since I haven’t yet read the book. But if it’s just a list of associations (however complex) which are labeled, by some man, as “cause” and “effect”, then it is equivalent to a paper dictionary. The book doesn’t know it’s speaking about a cause, it just prints what it was told by an entity that does know (where I use that word in its full philosophical sense). The computer can be programmed to move from these to identifying associations consonant with this dictionary, but this is nothing more than advanced curve fitting. The computer has not learned about cause and effect. The computer hasn’t learned anything. It is a mindless box, an electronic abacus incapable of knowing or learning.

This is why I disagree with Pearl again, when he later says “We’re going to have robots with free will, absolutely. We have to understand how to program them and what we gain out of it. For some reason, evolution has found this sensation of free will to be computationally desirable.” Evolution hasn’t found a thing, and you cannot have a feeling of free will without having free will: it is impossible. Robots, being mindless machines, can thus never have free will, because we will never figure a way to program minds into machines, and minds are needed for feelings. Why? For the very good reason that minds are not material.

Nope

Not coincidentally, in the theological teleological sense of the word, Ed Feser has a new speech on the immateriality of the mind which should be viewed (about 45 minutes). I will take that speech as read—and not as the quale red, which is a great joke. Meaning I’m not going to defend that concept here: Feser has already done it. I’m accepting it as a premise here.

The mind isn’t made of stuff. It is not the brain. Man, paraphrasing Feser, is one substance and the intellect is one power among the many other physical powers we possess. (This is not Descartes’s dualism.) The mind is not a computer. It is much more than that. Computers are nothing more than abacuses, and there’s nothing exciting about that.

Man is a rational creature, and his mind is not material. Rationality is (among other things) the capacity to grasp universals (which are also immaterial). Cause is a universal. Cause is understood in the mind. (How we learn has been answered, as I’m sure you know as you’ve been following along, in our recent review of chapters in Summa Contra Gentiles.) Causes exist, of course, and make things happen, but our knowledge of them is an extraction from data, as the extraction of any universal is. Cause doesn’t exist “in” data. We can see data, and we move from it to knowledge of cause. But no algorithm can do this, because algorithms never know anything, and in particular no algorithm engineered to work on any material thing, like a computer, like even a quantum computer, can know anything. (Yes, we make mistakes, but that does not mean we always do.)

This means we will never be able to build any machine that does more than curve fitting. We can teach a computer to watch as we toss people off of rooftops and watch them go splat, and then ask the computer what will happen to the next person tossed off the roof. If we have taught this computer well, it will say splat. But the computer does not know what it is talking about. It has fit a curve and that is that. It doesn’t know the difference between a person and a carrot: it doesn’t know what splat means: it doesn’t know anything.

We can program this mindless device to announce, “Whenever the correlation exceeds some level, as it does in the tossed-splat example, print on the screen a cause was discovered.” Computers are good at finding these kinds of correlations in data, and they work tirelessly. But a cause has not been discovered. Merely a correlation. Otherwise all the correlations listed at Spurious Correlations would be declared causes.
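That program takes about a dozen lines. In this sketch the threshold of 0.9 and the two series are invented by me; the series share nothing except that both drift upward.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def naive_cause_detector(xs, ys, threshold=0.9):
    """Announce a 'cause' whenever the correlation exceeds an arbitrary level."""
    r = pearson_r(xs, ys)
    return "a cause was discovered" if abs(r) > threshold else "no cause"

# Two invented, unrelated series that merely both trend upward.
series_a = [10, 12, 13, 15, 18, 19, 22, 24, 25, 28]
series_b = [3.1, 3.3, 3.2, 3.6, 3.9, 4.0, 4.4, 4.5, 4.9, 5.0]

print(naive_cause_detector(series_a, series_b))
```

The machine dutifully makes its announcement, and it would make the same announcement with the two series swapped. Swap the threshold for any other criterion you like and the operation stays the same: a number crossed a line.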

Saying computers can discover a universal like cause is equivalent in operation to hypothesis testing, though not necessarily with a p-value. If the criterion to say “cause” isn’t a p-value, it has to be something, some criterion that says, “Okay, before not-cause, now cause.” It doesn’t matter what it is, so if you think it’s not a p-value, swap out “p-value” below with what you have in mind (“in mind”—get it?). In the upcoming peer-reviewed paper (and therefore perfectly true and indisputable) “Everything Wrong With P-Values Under One Roof” (to be published in a Springer volume in January: watch for the announcement), I wrote:

In any given set of data, with some parameterized model, its p-value are assumed true, and thus the decisions based upon them sound. Theory insists on this. The decisions “work”, whether the p-value is wee or not wee.

Suppose a wee p-value. The null is rejected, and the “link” between the measure and the observable is taken as proved, or supported, or believable, or whatever it is “significance” means. We are then directed to act as if the hypothesis is true. Thus if it is shown that per capita cheese consumption and the number of people who died tangled in their bed sheets are “linked” via a wee p, we are to believe this. And we are to believe all of the links found at the humorous web site Spurious Correlations, \cite{vigen_2018}.

I should note that we can either accept that grief of loved ones strangulated in their beds drives increased cheese eating, or that cheese eating causes sheet strangulation. This is joke, but also a valid criticism. The direction of causal link is not mandated by the p-value, which is odd. That means the direction comes from outside the hypothesis test itself. Direction is thus (always) a form of prior information…

I go on to say that direction is a kind of prior information forbidden in frequentist theory. But direction is also not something in the data. It is something we extract, as part of the cause.

I have no doubt that our algorithms will fit curves better, though where free will is involved, or the system is physically complex, we will bump up against impossibility sooner or later, as we do in quantum mechanical events. I have perfect certainty that no computer will ever have a mind, because having a mind requires the cooperation of God. Computers may well pass the Turing test, but this is no feat. Even bureaucrats can pass it.