William M. Briggs

Statistician to the Stars!


Taiwanese And Chinese Toilet Icons

I found these on a public toilet in Ping Shi, more famous for its lantern festival. It’s only a matter of time before tourists flock to see them.





I was reminded of those after a reader sent these next snapshots from a shopping mall in Shanghai. Not much chance of mistaking who goes in which door.





Who was it that said travel narrows the mind?


Tweet Hate Map: Awful, Really Awful Use Of Statistics

I hate this kind of thing


I give you “Dr” Monica Stephens from Humboldt State University and her widely dispersed Twitter “Hate Map.”

The thumbnail here doesn’t give the picture the full injustice it deserves, so you’ll have to click on the link to view the real map, which is interactive.

She had her students count “hate” tweets over eleven months, aggregate them at the county level, and then plot the results on a map. She called this counting an “algorithm sentiment analysis,” which is like calling the guy who collects pop cans from the trash an “aluminum reorientation environment engineer.”

First sin, and grounds for automatic disqualification: her definition of what counts as “hate.” She counted tweets which had “homophobic”, “racist”, or “disability” words in them (supposedly in context). Mouse over the headings on the map to see the brief list, or go to their FAQ to see all the words.

These included ni**er[1] and bitch, words which are not always used hatefully, at least if we understand hip hop, rap, and black popular culture. Why, take these words away and there would be no modern music!

Indian, they swear, is a hate word. So are monkey, gringo, cripple, and honky.

Actually, strike that. The list is not the list. Turns out the list was only a starting point, because, for example “honky/honkey/honkie was discarded, as most of the tweets were positive references towards honky-tonk music and not slurs”.

Bitch also had to go. Too many instances, you see, 55 million-plus, since each tweet had to be read by a student. What about other words? Here Stephens became whiny and evasive. She wanted to include more words, she really did!, but her “research funds, and thus the scope of this project, are extremely limited. It’s not like we have billions of dollars in funding lying around.”

I weep for her, I really do. But after reading all her explanations, it appears that she only used 10 words: dyke, fag, homo, queer; chink, gook, ni**er, wetback, spick; cripple.

Apparently none that were disparaging to whites, males, or females in general made the cut. That means, even if accurate, the “hate” map only pertains to animosity toward those with nonstandard sexual desires, blacks, East Asians, and Latinos. Oh, and the “differently abled”.

Her excuse for excluding most people? “If you are a disgruntled white male who feels that the persistence of hatred towards minority groups is a license to complain about how discrimination against you is being ignored, just stop.” Shut up, she explained.

Second, the ridiculous artifact caused by the mapping application she used and her sampling scheme. Look at those red blobs of hate! The entire Eastern USA is sliding into Hades.

Stephens did normalize the number of “hate” tweets by the number of total tweets in each county, but this did not help, as we’ll see.

Go to the map and zoom in to the maximum extent possible. Look at Chicago and Detroit, including their suburbs. Nary a hater to be found. Can this be? Surely New York, or at least New God-Help-Us Jersey? Nope. Dallas, L.A., Indianapolis? Uh-uh. Any city where lots of people live? Sorry.

Instead, navigate up to the darkest blob in northernmost lower Michigan, north of the town of Gaylord (where yours truly was bred). Looks like Cheboygan County, population 26,000.

Stephens only collected 150,000 tweets from the entire USA over eleven full months. There are 3,143 counties in the USA (Alaska and Hawaii are on the map; just zoom out to see; look how red everything now is!). Population isn’t evenly dispersed, but that’s an average of fewer than 48 “hate” tweets per county. Obviously counties with major cities will have had lots more “hate” (and normal) tweets than counties with small towns. This means small counties had to have many fewer than 48 “hate” tweets.
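For anyone who wants to check the arithmetic, here is a minimal sketch. It uses only the two figures quoted above; the variable names are mine and purely illustrative.

```python
# Figures quoted in the post; the names are illustrative.
TOTAL_HATE_TWEETS = 150_000   # "hate" tweets collected over eleven months
US_COUNTIES = 3_143           # counties in the USA

# Plain numerical average: total divided by the number of counties.
mean_per_county = TOTAL_HATE_TWEETS / US_COUNTIES

print(f"{mean_per_county:.1f} 'hate' tweets per county on average")
```

The result lands between 47 and 48, and that is spread over eleven months. Since the big cities must soak up most of the total, the typical small county gets far fewer.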

Now how many total tweets could have come from the mostly older not-too-tech-savvy folk in tiny Cheboygan County? A hundred? Two hundred? A thousand? Not too many. All it would take was one lunatic with a sour mouth making just one intemperate tweet (once in eleven months) and voilà! we have a center of hate.
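To see why normalizing by total tweets does not rescue the map, consider a rough sketch. The counts below are invented for illustration (they are not Stephens’s data), and `hate_rate` is a hypothetical helper, not anything from her project.

```python
def hate_rate(hate_tweets: int, total_tweets: int) -> float:
    """Share of a county's tweets that were flagged as "hate"."""
    return hate_tweets / total_tweets

# Hypothetical counts, invented purely for illustration.
small_county = hate_rate(hate_tweets=1, total_tweets=200)
big_city = hate_rate(hate_tweets=500, total_tweets=1_000_000)

print(f"small county rate: {small_county:.4f}")
print(f"big city rate:     {big_city:.4f}")
print(f"ratio:             {small_county / big_city:.0f}x")
```

Under these made-up numbers a single tweet gives the small county a rate roughly ten times the big city’s, even though the city produced five hundred times as many flagged tweets. Tiny denominators make for wild rates.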

There are many more problems, the chip on Stephens’s shoulder not the least, but forget it: this study is so awful that I want to weep.

Update: WordPress ate the first version of this. I have no idea what happened. It was up and then just disappeared. The original is not anywhere. Thank God for a reader who received an email version, so I was able to restore it.


[1] I avoid spelling out some words not because of squeamishness (I am, after all, military trained in language arts) but because of what the presence of these words would do in search engines. You behave yourself too.

Thanks to Roger McDermott for alerting us to this topic.


Tyranny Of The Mean

The mean can be nasty. This should come as news to no one. But since, the good Lord knows why, this topic made it into the New York Times, it has become news and so must be discussed.

One Stephanie Coontz, associated with something called the Council on Contemporary Families—and since “families” is modified by “contemporary” one immediately has the suspicion that “families” does not mean “families”, but never mind—trumpeted a report called “The Trouble with Averages” written by some guy who sees fit to end his name with “PhD.”

The sin warned against is using just one number to summarize uncertainty in a thing which can take more than one value. The one number is the numerical average, a quantity which is often asked to bear burdens far beyond its capability.

The average is often used to define what is “normal”, with the implication that deviations from it are “Abby Something”, to quote Igor. The more slavish the devotion to this concept, the more the world appears insane, because hitting the average becomes increasingly difficult.

This applies to people and things. You can say the normal temperature is X degrees, and as long as you define exactly how this was calculated, you’re on solid ground, but only an activist would fret at any departure from this number and suspect foul play.

It might be that the average man grieves (say) 8 months after the death of his wife (one of Coontz’s examples), but that doesn’t mean that a man who stops crying at 2 months is hard-hearted, nor that a man who wears sackcloth for two years is insane.

Using just the average to define “normal” in people is dangerously close to the fallacy of defining moral truths by vote. Come to think of it, isn’t that what the Diagnostic and Statistical Manual of Mental Disorders does? Plus, even “extremes” might not be “abnormal” in the sense of undesirable or harmful; it all depends on the behavior and our understanding of biology and morality.

Planning on the average for physical things can make sense, but only in the rare cases where the average is all that matters. Engineers don’t design bridges to withstand only average loads.

Unless the item of interest is fixed and unchanging, in which case the numerical mean is all that can occur, the idea of calculating an average is to assist in quantifying the uncertainty of the thing. If a thing varies, the mean will always be incomplete and reliance on it alone will lead to overconfidence.
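A small sketch makes the point. The two lists below are invented months-of-grieving figures (not data from Coontz or the report), chosen so that both have the same mean; the standard library’s `statistics` module does the rest.

```python
import statistics

# Invented samples, for illustration only: months of grieving.
steady = [8, 8, 8, 8, 8]      # everyone matches the average exactly
varied = [2, 4, 8, 10, 16]    # same average, very different people

mean_steady = statistics.mean(steady)
mean_varied = statistics.mean(varied)

# Population standard deviation: how far values sit from the mean.
spread_steady = statistics.pstdev(steady)
spread_varied = statistics.pstdev(varied)

print(f"means:   {mean_steady} vs {mean_varied}")
print(f"spreads: {spread_steady:.2f} vs {spread_varied:.2f}")
```

Both samples report the same mean, yet only the first is described by it. Anyone told just “the average is 8” would have no way of knowing which group he was dealing with.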

And don’t forget: probability doesn’t exist as a physical thing; it is instead the measure of uncertainty.

Anyway, not much of a post or a lesson today. Instead I’ll put the burden on you. What are some good examples where the mean, and only the mean, is an adequate summary?

Update: Coontz used the word “outliers”. There are no such things. There can be mismeasured data, i.e. incorrect data, say when you tried to measure air temperature but your thermometer fell into boiling water. Or there can be errors in recording the data: transpositions and so forth. But excluding mistakes, when the numbers you measured are the numbers you meant to measure, there are no outliers. There are only measurements which do not accord with your theory about the thing of interest.

Far too often I find people throwing out real data because it doesn’t fit their preconceptions, i.e. model. Nutty behavior.


Thanks to Andrew Kennett for pointing us to this topic.


Oh Good, We Have Consensus About Climate Change

Everybody who believes science should be conducted by vote, raise their hands.

See if you agree with me: agreeing with me isn’t proof that that which we agree about is true. Doesn’t mean that that which we agree about is wrong, neither, because this agreement is about something which is true. Rather, our consensus is not of much interest, except sociologically.

Consider other consensuses (consensi?). A century ago the intelligentsia thought it was a swell idea that people should have perfection forced upon them by making a Utopian omelette created by cracking a few tens of millions of skulls. The consensus among us civilians is that the intelligentsia was bat-guano crazy. The consensus among the intelligentsia now, on the other hand, hasn’t budged: government (meaning rule by themselves) knows best—about everything.

Medieval scientists agreed to a man that Ptolemy’s theory about the movement of celestial objects was true. They were rational to do so, because the thing worked. Well, mostly worked, or worked good enough for everyday purposes. Scientists now agree that a better theory has come along. This one works, too.

Modern scientists shook hands and were adamant the continents were fixed objects. Doctors laid their thumbs upon their noses when asked their opinion about hygiene (they were against it). And on and on.

The batting average for Consensus is like that of an aging player being sent down to the minors in July. We had such hopes for it early on, but it consistently failed under pressure, though we’re still willing to give it one more try.

The leftwards press delights in telling us there is a consensus among climate scientists. Why they should be so pleased that the sky is falling is a mystery, unless it’s another symptom of the bloodlust found in progressives. The Guardian—protecting British minds from the onslaught of reality since 1821—is giddy over the statistic that 97.1% (and not just 97.0%) of academic papers agree that “climate change is anthropogenic.”

Just what is this capital-C Consensus? My pal Gav Schmidt asked me to tell you (his words; “the update” is found on the page linked):

  1. The earth is getting warmer (0.6 ± 0.2 °C in the past century; 0.17 °C/decade over the last 30 years, revised from 0.1 (see update)) [ch 2]
  2. People are causing this [ch 12] (see update)
  3. If GHG emissions continue, the warming will continue and indeed accelerate [ch 9]
  4. (This will be a problem and we ought to do something about it)

The last one is in brackets because whilst many would agree, many others (who agree with 1-3) would not, at least without qualification. It’s probably not a part of the core consensus in the way 1-3 are. Most (all?) of us here on RealClimate are physical scientists — we can talk sensibly about past, present and future changes in climate, but potential impacts on ecosystems or human society are out of our field.

Since it hasn’t been hotting up recently at the rates quoted by Gav, “People are causing this” is ambiguous, and there is substantial uncertainty in the historical observations, the Consensus can only be of minor interest. Climate scientists have to agree on something and it is a good thing they’re trying to sort out how things work, but their forecasts haven’t been of sufficient precision to encourage the rest of us to pay too close attention. Not yet, anyway.

Gav rightly emphasizes the “we ought to do something about it” isn’t a major part of the Consensus, if it’s there at all. But the Guardian, representing the perpetually “outraged” crowd, think it is; indeed, they think the Consensus is nothing but that. One prominent fellow with a mind permanently and deeply scarred by youthful “experimentation” said the Consensus is “about activism” and “is about converting people.”

Politicians and journalists who couldn’t read a thermometer, even if you threatened to withhold their kickbacks and deny their bylines, are so eager to believe climate change is man-made because that makes it “a problem and we ought to do something about it.”

This explains the mysterious glee over every heatwave headline and the perverse delight politicians display when organizing hearings on the end of the world. They believe they are the “we.” Just as with the earlier consensus, they believe paradise is just around the corner if only, if only.


Thanks to Ye Olde Statistician and the many others who asked about this topic and provided references.


© 2015 William M. Briggs
