William M. Briggs

Statistician to the Stars!


Update On Sexuality Wars


What is this dog’s orientation?

Item: What if—don’t panic—there is no such thing as sexual orientation in any biological sense (save heterosexualism) where a person is born and “condemned” inescapably to lust after one fixed object (so to speak). Searches for “gay genes” or other biochemical markers have been in vain, therefore it’s rational to suppose none exist and that environment plays a large role. What if human sexuality isn’t as cut-and-dried as modern (and only modern) interpretation has it and that “orientation” is entirely man-made?

Celebrated denizen of the left Michel Foucault said:

In his Histoire de la Sexualité, Michel Foucault argues that homosexuality is a social construct, and one constructed terribly recently at that. “As defined by the ancient civil or canonical codes,” he writes, “sodomy was a category of forbidden acts; their perpetrator was nothing more than the juridical subject of them.” The late-nineteenth century saw this classical view displaced, however, when the sodomite was set up as the bearer of a distinct and pervasive psychological persuasion. “Homosexuality appears as one of the forms of sexuality,” Foucault writes, “when it was transposed from the practice of sodomy onto a kind of interior androgyny, a hermaphroditism of the soul.”

Item: The categories which define “orientation” are increasing, too. LGBTQIA, and more to come?

Item: The Very Reverend Gary Hall, chief of the Washington National Cathedral and member of one of the protesting Christian sects, recently said, “Homophobia is a sin. Heterosexism is a sin…Only when all our churches say that clearly and boldly and courageously will our LGBT youth be free to grow up in a culture that totally embraces them fully as they are.”

Of course, “homophobia” is a fluid word, but this is the first I’ve seen “Heterosexism” (which might in other words be called natural law) called a sin. Hall also said that people’s attitudes towards homosexuality are based on “a misreading of the Bible.”

Thus we have, at least with Hall, who surely has many imitators and who preaches to a grateful media, a complete reversal of classic theology.

Item: A now 11-year-old boy, who with the help of his two lesbian guardians, decided at age 8 that he was “really” a girl, has completed three years of chemical injections to make his male body more resemble a female one.

Psychiatrists “diagnosed” the boy with “gender identity disorder”. A modern disease, one only recently “discovered.”

Item: Argentina’s government has granted a 6-year-old boy an ID that corresponds to the boy’s claim that he is a girl. His proud and energetic mother even managed to have the government issue an amended “birth” certificate which claims the boy is a girl. This was fully legal. From the relevant law:

Gender identity is understood as the internal and individual way in which gender is perceived by persons, that can correspond or not to the gender assigned at birth, including the personal experience of the body. This can involve modifying bodily appearance or functions through pharmacological, surgical or other means, provided it is freely chosen. It also includes other expressions of gender such as dress, ways of speaking and gestures.

Interesting choice of words, “gender assigned at birth” and “freely chosen.” How “free” are the choices of pre-teens regarding sexuality? The modern presumption is “Perfectly so.” Does it follow puberty is a choice?

Item: A boy pretending to be a girl at Florence High School in Colorado was reportedly

harassing girls in the bathroom. When parents complained, school officials said the boy’s rights as a transgender trumped their daughters’ privacy rights.

As the controversy grew, some students were threatened with being kicked off athletic teams or charged with hate crimes if they continued to voice concerns.

This news arrives from an interested source, and I could not discover corroboration. The story mentions the Pacific Justice Institute, a traditionalist (the modifier, as we learned from above, is now needed) Christian organization which had involved itself in California’s new law to allow children to access whichever bathroom accorded with their “gender identity.”

So the story has some plausibility. But even if it’s false or exaggerated, it’s of interest to note its direction.

Item: California Governor Brown “signed a new law that will allow the state to recognize more than two legal parents for a child.”

One of the catalysts for the bill was a case in which one lesbian in a relationship was impregnated by a man, and later fought with her lesbian lover. One woman was jailed and the other went to the hospital, and the daughter wound up in foster care because the sperm donor did not have parental rights.

Conclusion? The first lesson is you are not who you are, but you are what you want to be. And not only that: others must acknowledge not who you are, but who you claim to be. If they do not, it is they who are troubled, not you.

The second lesson is that people have absolutely no sense of humor or proportion about these things.

Of course, the real trick is not to compile these stories, but to say where they are pointing. Readers should recall that not the whole world is acquiescing: Russia and large swaths of Africa hold to older ways.


Topological Data Analysis Next Great Hope

Round and round she goes…

Everybody who remembers how neural nets were going to save the world, raise your hands. Little higher. Make sure everybody sees.

Well, you were wrong, weren’t you. They’ve all but disappeared from cocktail party discussions. Turns out machine learning algorithms didn’t triumph, either.

Yet something has to come out on top. What will reign supreme? Topological data analysis, baby! Or so say the folks interviewed by Wired in their article “Scientific Data Has Become So Complex, We Have to Invent New Math to Deal With It.” It is the story of some guys who say we’re in the midst of the “big data equivalent of a Newtonian revolution, on par with the 17th century invention of calculus.”

But before we wax eloquent about our newest warrior against uncertainty, let’s cast our minds back to the 1990s, when we regularly came across items like this.

Neural nets are universal function approximators! Any function you can think of, and even those you can’t, can be tossed in the trash. Who needs ’em? Just think. Some function out there explains the data you have, and since this function is probably too complicated to discover mathematically, all we have to do is feed these brain-like creatures the data and they’ll figure out the function for you.

The more data you give them, the more they learn. Pictures of brains, pictures of synapses, pictures of naked interwoven dendrites! It was so sexy.

Well, as said, we know how that turned out. The cycle has since been repeated with other Holy Grail methods, though it has never reached the same peak as neural nets.

You have to hand it to the computer algorithms set. They have the best marketing team in science. Who wants to “estimate” the “parameter” of a non-linear regression when you can “input” data into a “thinking” machine? Why not embrace fuzzy logic, which is hip and cool, and eschew dull probability? Hey, all these things are equivalent, but nobody will notice.

Or maybe they will. Don’t forget to read the Machine Learning, Big Data, Deep Learning, Data Mining, Statistics, Decision & Risk Analysis, Probability, Fuzzy Logic FAQ.

Back to topological data analysis. The idea is to take enormous data sets and twist and turn them as you would donuts into coffee cups (let him who readeth understand) and store only the pattern and not the details (dimension reduction). I like this approach, and surely there will be plenty of neat and nifty tricks discovered (see the article for some fun ones).

It’s not a new idea. Remember “grand tours” of data? These were big about fifteen years ago. Cute graphics routines which let you pick off a few dimensions at a time and spin them round and round until you saw (if there was anything to see) how a “random” scatter of points collapsed to something predictable looking.

Slick stuff, and useful. Wired gives the example of the Netflix prize, where the idea was to find algorithms that made better preference guesses because “even an incremental improvement in the predictive algorithm results in a substantial boost to the company’s bottom line.” And, lo, some group won with an algorithm that did find an incremental improvement.

That’s our lesson: incremental. Human behavior is so complicated that it’s likely (I’d even say almost certain) that no Hari Seldon will ever exist. No human being, or machine created by one, is going to discover an equation or set of equations which predict behavior at finer than the grossest levels and for time spans greater than (let us call them) moments.

The boost was incremental. Meaning it was a tweak and significant uncertainty remained. That’s what the neural net folks never figured on. Even if we knew (100% certainty) what the weights were between “synapses”, it did not mean, and it was not true, that we knew with certainty the thing modeled.

Statisticians forget this, too. Equivalently, even if we knew (100% certainty) the values of the parameters in some model, it does not mean, and it is not true, that we know with certainty the thing modeled. This is why I argue endlessly for a return to focus on the things themselves we’re modeling, and away from parameters.
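The point about parameters can be put in a few lines of Python. This is only a sketch: the normal model and its numbers (a central parameter of 70 and a spread of 5) are made up for illustration and are not from any data discussed here. We pretend to know both parameters with 100% certainty and then ask about the observable itself.

```python
from statistics import NormalDist

# Hypothetical model: both parameters are assumed known with certainty.
model = NormalDist(mu=70.0, sigma=5.0)

# Chance the next observation lands within one degree of the central parameter.
p_within_one = model.cdf(71.0) - model.cdf(69.0)

# Interval that holds the observable with 95% probability, parameters known.
low, high = model.inv_cdf(0.025), model.inv_cdf(0.975)

print(round(p_within_one, 3))
print(round(low, 1), round(high, 1))
```

Perfect knowledge of the parameters still leaves the observable spread over a wide range, which is the reason to keep the focus on the thing modeled.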

That’s another reason to like machine “learning” and this new-ish idea of topological data analysis. The focus is on the right thing.


A reader sent me this article, but I can’t recall who and I have lost the original email. I apologize for this. I hate not giving credit.


Logical Probability Data Analysis, Measurement Error Example

Read the introduction to this first. If you don’t, you will be lost, lost, lost.

Logical probability answer to B

Means for two days.


The answer to B follows from A. The picture is of what the mean might have been given the assumptions used above. A 1/9 chance the mean was 69.5, 2/9 it was 70, 3/9 for 70.5, etc. No error bars? Well, no: none are needed. This picture is the complete answer.
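For those who like to check by brute force, here is a minimal sketch that enumerates every combination of errors for two days. It assumes each reading is off by -1, 0, or +1 degree with equal probability, and it uses made-up readings of 70 and 71 (hypothetical values, chosen only because they give the means listed above). The function name `mean_distribution` is mine.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def mean_distribution(observed, errors=(-1, 0, 1)):
    """Exact distribution of the true mean, by enumerating all error combinations."""
    counts = Counter()
    for combo in product(errors, repeat=len(observed)):
        # True value = reading minus its error; every combination is equally likely.
        true_sum = sum(obs - err for obs, err in zip(observed, combo))
        counts[Fraction(true_sum, len(observed))] += 1
    total = len(errors) ** len(observed)
    return {m: Fraction(c, total) for m, c in sorted(counts.items())}

two_day = mean_distribution([70, 71])  # hypothetical readings
for mean, prob in two_day.items():
    print(float(mean), prob)
```

The fractions are printed in lowest terms, so 3/9 shows up as 1/3. Nothing is estimated: the probabilities are counted.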

“Error bars” are classical and come from assuming some sort of parameterized probability model, like a normal (where the probability of seeing any observation is always zero), which we did not use and do not need here. No test statistics, no p-values, no parameters, no priors, no posteriors. Just probability.

Notice that the problem is entirely discrete? It’s not because we only averaged two days, but because the nature of the evidence is discrete (homework: find a non-discrete real-life example; I won’t wait). It would still be discrete no matter what finite number of days on which we took our mean. Our answer is exact given the assumptions; it has been deduced.

What about the month of Maxes, say 30 days? Still discrete. Each daily Max can take three values with equal probability (given our assumptions), and each of these can be combined with each other daily Max so that the mean is comprised of 3^30 ≈ 2×10^14 possibilities. Still discrete, but what a number!

Actually, it’s not as bad as that because not that many unique combinations can occur. It could be that every single time Max was measured, it was low by a degree, or every time high by a degree, or something in between, including the time every measurement was spot on. That makes only 61 possible values the mean of Max can take (every possibility from adding -30 to +30). Quite a reduction!
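The two counts are quick to check. A small sketch, again assuming three equally likely errors (-1, 0, +1) per day; the variable names are mine.

```python
n_days = 30

# Every day contributes one of three errors, so the combinations multiply.
n_combinations = 3 ** n_days

# The summed error can be any whole number from -30 to +30.
possible_sums = range(-n_days, n_days + 1)

print(n_combinations)      # 205891132094649, roughly 2 x 10^14
print(len(possible_sums))  # 61
```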

Start from the left: Assume every measurement was low by one degree, so that the true average is the sum of the measured temperatures plus 30, all divided by 30. There’s only one way out of the 3^30 ≈ 2×10^14 possibilities this can happen, so inverting that gives the probability. This is symmetric with every day being high by one degree, which has the same probability.

Next: the temp could have been low 29 times and right once. That can happen 30 different ways, with a probability of 30/3^30. This is also symmetric with 29 times high. Next: the temp could have been low 28 times and right twice, or low 29 times and high once, which is also symmetric (they all are).

You get the idea. All we need do is count the number of times each under or over could happen. A cute, eventually tedious, but not overwhelming combinatoric problem. Example: +/- 30 can happen just one way; +/- 29 can happen 30 ways (these counts are for each sign separately); +/- 28 can happen 465 ways; +/- 27 can happen 4930 ways; +/- 26 is 40,020 ways, and so on towards the peak at 0 (where the plus and minus errors balance). Summing all the different ways equals 3^30 (it must!). (Homework: what is the number of ways for +/- 0?)
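The counting can be handed to a computer. The sketch below is one way to do it, not necessarily how the figure was produced: it adds one day at a time and keeps a running tally of how many error combinations give each total. The function name `ways_by_total_error` is mine, and the errors are assumed to be -1, 0, or +1 each day.

```python
def ways_by_total_error(n_days, errors=(-1, 0, 1)):
    """Count the error combinations that produce each possible total error."""
    ways = {0: 1}  # before any days, only a total of 0 is possible, in one way
    for _ in range(n_days):
        updated = {}
        for total, count in ways.items():
            for err in errors:
                updated[total + err] = updated.get(total + err, 0) + count
        ways = updated
    return ways

ways = ways_by_total_error(30)
for k in (30, 29, 28, 27, 26):
    print(k, ways[k], ways[-k])  # the counts for +k and -k agree

print(sum(ways.values()) == 3 ** 30)
```

The same dictionary holds the homework answer, for those who would rather not count by hand.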

So I made up 30 days of Max temperatures somewhere around 70. Here’s the picture of what the mean can be, and the probabilities we deduced for these values given our assumptions.

Mean for 30 days.


My made-up Maxes were from 60 to 77. The computed average was 70.3666… The most likely value of the true mean is the same. The only values the mean could have been, given these conditions and data, are (rounded) 69.37, 69.4, …, 71.33, 71.37. This distribution is exact. There is an exact (to within roundoff in my calculations) probability of 0.087 for the mean to be 70.37. The others may be drawn from the figure.

For fun, I can report there is a 95.54% chance that the mean is in the set 70.0667, 70.1, …, 70.633 (I can give you all the numbers, but they are beside the point). There is no reason in the world to pick 95.54% except that it is close to the classical magical (magical classical?) value.
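Here is a sketch of how probabilities of this kind can be computed exactly. The made-up Maxes themselves are not listed, so the code assumes only that they were whole numbers summing to 2111 (which gives the average of 70.3666…), and that each day's error is -1, 0, or +1 with equal probability. The function name `error_sum_distribution` is mine.

```python
from fractions import Fraction

def error_sum_distribution(n_days, errors=(-1, 0, 1)):
    """Exact probability of each total error over n_days."""
    ways = {0: 1}
    for _ in range(n_days):
        updated = {}
        for total, count in ways.items():
            for err in errors:
                updated[total + err] = updated.get(total + err, 0) + count
        ways = updated
    denom = len(errors) ** n_days
    return {total: Fraction(count, denom) for total, count in ways.items()}

n_days = 30
observed_mean = Fraction(2111, n_days)  # assumed sum of readings; 70.3666...
dist = error_sum_distribution(n_days)

# Each total correction k shifts the true mean by k/30 from the computed average.
mean_dist = {observed_mean + Fraction(k, n_days): p for k, p in dist.items()}

p_mode = mean_dist[observed_mean]
lo, hi = Fraction(2102, n_days), Fraction(2119, n_days)  # 70.0667 and 70.633
coverage = sum(p for m, p in mean_dist.items() if lo <= m <= hi)

print(float(p_mode))
print(float(coverage))
print(len(mean_dist))  # number of values the mean can take
```

The set is summed with its endpoints included, and only the 61 discrete values appear in `mean_dist` at all.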

Did you notice the language? I did not say that there is a 95.54% chance that the mean is “between” 70.0667 and 70.633, because that is false. For one, those words leave out the endpoints, which are real possibilities. For another, only the discrete values in the set are possible. The mean might have been 70.0667 or 70.1, but it was impossible (given our etc.) that it could have been, say, 70.09, or any other value not in the discrete set.

The red line on the picture, which is cut off and which actually extends from 68.8 to 71.9, is the classical “95% confidence interval” on the parameter of a normal distribution model. Notice that this extends beyond the actual possibilities. The definition of the confidence interval means—ready?—nothing for any particular set of data (except the true mean lies in the interval or not), but even if you took the Bayesian view (same as the frequentist here for a flat prior) the interval still only speaks of a parameter. And even if you integrated the parameters out (let he who readeth understand), you’d still be left with an interval, which gives probabilities for impossible values (actually it gives probability 0 for every value!).

Don’t worry if the last paragraph made little sense. The point is this: the results we have are exact, and not the result of a parameterized probability model. Our results are deduced given the assumptions we used, and not calculated via some ad hoc model.

What happens if we change the assumptions? We change the results! Of course we do. All probability (all logic) is conditional. Change the conditions, change the conclusion.

What this answer isn’t

Because of measurement error, we were not certain of the mean, which is what we wanted to know. But we are certain of what the mean could have been, and its chances.

The results are not a prediction of future values of Max temperature. They are a prediction of what the mean of Max temperatures was during those 30 days, which we don’t know (again) because of measurement error.

The results are not statements about actual past temperatures, which we already knew, up to measurement error.

The results are also not what Kip originally asked for, but the answers to those questions are discovered in just the same way as these.

I’ll do the logical probability example closest to the binomial next.


Colonoscopies Set To Music

This was sent to me by my uncle who is a VIP nurse and who in his official capacity has peered down the blind alleys of hundreds of the rich and famous.

Naturally I’m sworn to secrecy about his clientele and I won’t make any cracks. But I can tell you my uncle has seen the Oh! on the faces of many you have heard of, especially if you are a fan of sports, politics, or music from a certain region of this once fine country of ours.

There is also a celebrity doctor that you will have heard of that I can’t tell you about. Not directly. Before he gained fame, I was in his shared office when I was a newly christened biostatistician (I was there to do work, not to have it done to me) and wondered about the long black tubes hanging on the wall. Being ever curious, I picked one up. Slippery, slimy things.

The nurse and different doctor I was with cautioned me an instant too late. I recall scrubbing up as vigorously as any surgeon.

I believe the not-yet-celebrity doctor took the same view of the tubes as I, because it wasn’t too long after this incident that he began his transformation from obscurity.

Anyway, my folks have seen the gentlemen in the video several times as they make their way through Florida on winter tour. Spread the word of this dark video!


© 2015 William M. Briggs
