Statistics

Random topics

I use the word “random” in the sense that you did not know what topics I would select today. And I use the word “know” in its logical sense.

On Polling

From Instapundit comes this link to the WizBang blog on polls and polling.

Mr Wiz Bang seeks to reassure his readers that the picture for McCain vis-à-vis the polls is not as bleak as reported in the media.

All of the usual suspects are here. The polls are commissioned and designed by folks who have a definite stake in, and desire for, a particular outcome of the election. This of course does not prove that the polls are biased, but it should increase the probability you assign to their being so.

The ordering of the questions and the exact questions used are seldom revealed, but are of obvious importance. For example, in one poll Mr WB discovered that questions about McCain came right after people were solicited for their opinion on President Bush.

You never hear about non-response. For example, pollsters ask 100 people, or try to ask 100 people, but only 20 respond. Who? Why? Are these non-responses correlated with the outcome? Usually, and in fact especially in politics, they are.
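To see how much damage correlated non-response can do, here is a minimal sketch in Python. Every number in it is made up for illustration: I assume true support of 50%, and that supporters answer the pollster 30% of the time while everybody else answers 10% of the time. The function names are mine, not anything a pollster uses.

```python
def overall_response_rate(true_support, rate_supporters, rate_others):
    """Fraction of everyone contacted who actually answers."""
    return true_support * rate_supporters + (1.0 - true_support) * rate_others


def observed_support(true_support, rate_supporters, rate_others):
    """Share of respondents who support the candidate when the two
    groups answer at different rates."""
    responding_supporters = true_support * rate_supporters
    responding_others = (1.0 - true_support) * rate_others
    return responding_supporters / (responding_supporters + responding_others)


# Hypothetical inputs, chosen so that about 20 of every 100 people respond.
true_support = 0.50
rate = overall_response_rate(true_support, 0.30, 0.10)
share = observed_support(true_support, 0.30, 0.10)
print(f"response rate: {rate:.0%}, support among respondents: {share:.0%}")
```

Under these invented rates, a race that is really even looks lopsided among the people who bothered to answer, and nothing in the published poll would tell you so.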

One thing Mr WB doesn’t mention is lying. People lie like dogs on surveys and polls. Sometimes, the lying is evenly spread out on both sides of Yes and No, but sometimes not. In this election, I suspect the lying is not even.

One place I did not know about is the National Council on Public Polling, a body whose purpose, inter alia [1], is to provide ethical guidelines on polling. If you have ever found yourself caring about any poll, then you ought to read their “20 Questions A Journalist Should Ask About Poll Results.”

I won’t bore anybody with the technicalities, but “11. What is the sampling error for the poll results?” is based on classical statistics, and thus the typical “+/- 4 points error” you hear is wrong and should, as an extremely crude rule of thumb, be multiplied by 2. This fudge factor accounts for uncertainty in the true error, not the statistical formula error, which nobody ever really cares about. The true error is this: if a poll says 46% support M and, in the end, actual voting reveals 58% support for M, then the error is 46% – 58% = -12%.
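For those who like to see the arithmetic, here is a rough sketch in Python. It uses the textbook 95% margin for a proportion from a simple random sample, which is my assumption about what sits behind a reported “+/- 4 points”. The sample size of 600 is hypothetical, picked only because it gives about 4 points. The poll and vote figures are the 46% and 58% from the subtraction above.

```python
import math


def sampling_margin(p, n, z=1.96):
    """Classical 95% margin of error for a proportion p estimated
    from a simple random sample of size n (textbook formula)."""
    return z * math.sqrt(p * (1.0 - p) / n)


def true_error(poll_pct, actual_pct):
    """Poll figure minus what the voting actually revealed, in points."""
    return poll_pct - actual_pct


poll, actual = 46.0, 58.0   # percentages from the example above
n = 600                     # hypothetical sample size

margin = 100 * sampling_margin(poll / 100, n)   # the reported "+/-" figure
crude = 2 * margin                              # the crude rule of thumb
err = true_error(poll, actual)

print(f"formula margin: +/- {margin:.1f} points")
print(f"doubled margin: +/- {crude:.1f} points")
print(f"true error:     {err:+.1f} points")
```

In this made-up case the true error is larger than even the doubled margin, which is why the doubling is only a crude rule of thumb and not a guarantee.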

Their take on “18. What about exit polls?” also does not account for lying. I’ve told this story 100 times, but it bears repeating. John Kerry’s exit polls had him winning, in Manhattan, by about 10 to 1. The actual result was Kerry winning by about 5 to 2. Now, it’s true that Kerry still won the city, but the actual result wasn’t even close to that predicted by the poll. People who live in Manhattan are under a lot of pressure to voice support for Democrats.

Suicides and economic downturns

This idea comes from Dave Schultz, intrepid Chief Editor of Monthly Weather Review (I am one of the many Associate Editors there; Dave, unsolicited, was kind enough to put a link to my book on his page).

Dave pointed to this article from a local New York City paper. It’s a story of how “researchers” continually find surprising and suspicious correlations with economic data.

You might have heard this one in the news last week. A “researcher” named Pettijohn supposedly found that in lean economic times, chubbier models were featured in Playboy magazine. To which I can only say: isn’t tenure a wonderful thing?

Undoubtedly still drooling—I mean reeling—from that stunning finding, Pettijohn went on to discover “that in uncertain times, people tend to prefer songs that are longer, slower, with more meaningful themes.” Which I guess explains how Barry Manilow got to be popular (From Barry: “You get what you get when you go for it”).

As insightful as Pettijohn is, he doesn’t hold a box of tissues to Leo J. Shapiro, chief executive of SAGE, a Chicago-based consulting firm. Says Shapiro: “DURING a recession, laxatives go up, because people are under tremendous stress, and holding themselves back.”

Now that’s research. “Bob, this recession measures a solid—and I do mean solid—7.4 on the old sphinctometer.”

A guy named Ruhm says that suicides increase when dollars decrease. But the data he uses (they picture it) has already been massaged and filtered etc. and we all know what happens when you smooth time series and then use those smoothed series as inputs to other analyses, right?
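In case it is not obvious what happens, here is a small simulation in Python. It is a sketch, not a reconstruction of Ruhm's analysis: I generate pairs of independent white-noise series, which have nothing to do with one another by construction, and compare the typical size of their correlation before and after a plain moving average. The series length, window and number of trials are arbitrary choices of mine.

```python
import random


def moving_average(series, window):
    """Plain moving average; the output is shorter than the input."""
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]


def pearson(x, y):
    """Ordinary Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mean_abs_correlation(trials, length, window, seed=0):
    """Average |correlation| between independent noise series,
    computed on the raw series and on their smoothed versions."""
    rng = random.Random(seed)
    raw_total = smooth_total = 0.0
    for _ in range(trials):
        x = [rng.gauss(0, 1) for _ in range(length)]
        y = [rng.gauss(0, 1) for _ in range(length)]
        raw_total += abs(pearson(x, y))
        smooth_total += abs(pearson(moving_average(x, window),
                                    moving_average(y, window)))
    return raw_total / trials, smooth_total / trials


raw, smoothed = mean_abs_correlation(trials=200, length=100, window=10)
print(f"mean |r| raw: {raw:.2f}, after smoothing: {smoothed:.2f}")
```

The smoothed pairs show noticeably larger correlations than the raw ones even though no relationship exists, because smoothing leaves far fewer effectively independent points than the series length suggests.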

_____________________________________
[1] This phrase was a favorite of my intellectual grandfather, Allan Murphy. Murphy was huge in forecast verification and meteorological statistics, a love which he passed on to Dan Wilks (the mustache is real), who is half my father. Meaning: Murphy was Wilks’s advisor, and Wilks was, in part, mine.


8 replies »

  1. I expect the guy that wrote about the chubby shapes in Playboy had to massage the figures to get that result. All sorts of smoothing and shenanigans, the hockey puck! And I’ll bet it was thoroughly peer reviewed!
    There are qualitative and quantitative research types! Guess he was trying to work out which was best. In fact, I think I’ll look into this further, as jokes aside I think his finding is wrong or rather, if such a correlation exists then he misunderstands the mechanism. I can think of several. Sounds like one for the journal of spurious correlation 🙂

    http://www.jspurc.org/subm2.htm

    Smoothing time series, hockey sticks, laxatives, I couldn’t think of anything to say, so I didn’t.

  2. Briggs,
    “A poll says, 46% support M, and, in the end, actual voting reveals 74% support for M, then the error is 46% – 58% = -12%.”
    I don’t see where the 58% comes from.

    This is way off topic but from the manifesto of the Journal of Spurious Correlation:
    “The Journal will, as a matter of both principle and interest, welcome submissions from practitioners of all approaches to political and social science, be they positivist, postpositivist, formal, informal, quantitative, discursive, analytic, hermeneutic, critical, applied, inductive, deductive, ideographic, nomothetic, fuzzy, emancipatory, historical or otherwise”… (they missed pink and fluffy).
    “…it does not aim to remake political or social science in the image of the other sciences.”

  3. inter alia

    True meaning: “Let’s impress the nuns by replacing five syllables in English with five syllables from Latin!”

  4. Lucia,

    I tell you, if I had used any Latin at all, Sister Dorothy, my high school nemesis, would have been thrilled.

  5. Deja vu all over again. Every election cycle, the same chorus is heard. The polls are biased; they are all over the place; they don’t measure the real public sentiment. Unfortunately, and I wish this weren’t so, they have a pretty good track record in picking winners of the presidential contest, at least in aggregate. That ought to increase the probability that they are right this time too.

  6. There is a particular reason why the polling in this election may have even more of a Democrat bias. (Though I think Coulter must have been a bit selective with her poll figures).

    People who conduct polls tend to be young (quite often moonlighting college students). Most would be strong Obama supporters. If you are white, and say you will vote McCain, you may have the strong suspicion that the pollster will suspect you are voting on racial grounds. I suspect a significant proportion of McCain voters would say “Obama” for that reason.

    For obvious reasons that factor has not been there in previous presidential elections.
