Iowahawk Does Statistics, Properly!

Thanks to the many readers who sent in this tip.

The Iowahawk, a.k.a. David Burge, the beloved assassin of pomposity and pretension, has taken the often hysterical Paul “Global Warming Skeptics are Traitors” Krugman to task over education statistics.

It seems Krugman has taken the side of the cowardly politicians (all Democrats¹) in Wisconsin. You know, the ones who scurried away to Illinois (!) when they realized they would lose a vote. We mustn’t be too harsh on these politicians, for their actions were instinctual, motivated by the same survival impulse that drives cockroaches to sprint for cover when the light comes on.

Incidentally, just like those nasty bugs, self-serving politicians cannot be eradicated by force. Poison is useless. Stomp on one and two more instantly appear. The only solution is to cut off their food supply: do not vote for them.

Anyway, Krugman, using sources known only to himself, “proved” that education outcomes were better in unionized Wisconsin than they were in non-unionized Texas. Thus, we should accede to the demands of the Wisconsin activists who dictate that more money should be taken from the working people of that State and given to them.

The only thing Krugman did right was to take on a worthy target. Too bad everything he said was false or misleading. Burge found a hilarious admission from the Times’s ombudsman Daniel Okrent (click and read all Okrent has to say):

Op-Ed columnist Paul Krugman has the disturbing habit of shaping, slicing and selectively citing numbers in a fashion that pleases his acolytes but leaves him open to substantive assaults.

Not a bad euphemism for lying, that. One wonders how the lachrymose Krugman (“O! The planet!”) responded to his colleague’s disapprobation. Okrent says, “I didn’t give Krugman…the chance to respond before writing the last two paragraphs. I decided to impersonate an opinion columnist.”

In his first post, Iowahawk did what should be done: he found the raw, relevant numbers that best compared educational success for Texas and Wisconsin. The most obvious bit of detective work (well, obvious to Burge but not to Krugman) was to recognize that the racial makeup of the two states was different. Whites, Blacks, and Hispanics do not live in anywhere near the same proportion in these states, and neither do these groups score the same on standardized tests.

There are many more Whites in Wisconsin and Whites tend to score better on standardized tests than do members of the other groups. Thus, raw comparisons between the states will tend to show Wisconsin out front, which is misleading—a fancy word to say wrong. It was these wrong numbers Krugman used.

But if we use the proper numbers, broken down by race, as Burge did, we find that in each year, in nearly all subjects, in nearly every pertinent measure, Texas trumps Wisconsin. Using Krugman’s logic, we should thus fire every union teacher in Wisconsin and hire non-union ones in their place.

Wait a second! How can Wisconsin do better overall yet Texas win in every subcategory? Isn’t the overall measure just a sum of the subcategories? Texas should be the winner overall, shouldn’t it?

It was in his follow-up post that Burge made us most proud, offering an excellent definition of Simpson’s Paradox, and showing how that manifested itself in the education statistics. Simpson’s Paradox is often found in disparity or inequality studies. Indeed, it is found so often that it is practically criminal not to check for it. It is not just criminal, but is nigh treasonous. And that means Krugman is a traitor! A traitor, do you hear me! Ach! Sputter! Arr…..

Whew. Sorry about that. I don’t know what came over me. My only excuse is to say that I spent too much time reading the New York Times today.

Back to the point: Simpson’s Paradox is found when subcategories of different proportions are summed (read the material on the link for a full explanation). Since the racial makeup of the two states is so different, Simpson’s Paradox is a real danger in any raw comparison between them.
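
To see how the arithmetic works, here is a minimal sketch in Python. The two states, the two groups, the scores, and the student counts are all invented for illustration (they are not the Texas or Wisconsin figures). The only point is that a state can trail in every subgroup and still lead overall when its mix of groups is different.

```python
# Hypothetical data, invented for illustration only.
# Each entry is (average score, number of students) for one group in one state.
data = {
    "State A": {"group1": (250, 900), "group2": (220, 100)},
    "State B": {"group1": (255, 500), "group2": (225, 500)},
}


def group_score(state, group):
    """Average score of one group within one state."""
    score, _count = data[state][group]
    return score


def overall(state):
    """Overall average: each group's score weighted by its student count."""
    total_points = sum(score * count for score, count in data[state].values())
    total_students = sum(count for _score, count in data[state].values())
    return total_points / total_students


for state in data:
    print(state,
          "group1:", group_score(state, "group1"),
          "group2:", group_score(state, "group2"),
          "overall:", overall(state))
```

With these made-up numbers State B wins in both groups, yet State A comes out ahead overall (247 versus 240), because 90 percent of its students belong to the higher-scoring group.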

Burge also tells us the difference between the ACT and SAT, why that difference matters, and why simple state-to-state comparisons of these tests are difficult.

The only point at which Burge and I differ is his use of the term “statistical significance.” I say that it is evil, misleading, and just plain wrong to ever use. However, I thank Burge for using it, because it provides me the perfect segue for tomorrow’s column. Don’t miss it!


Update¹: I do not mean to imply that no or few Republican politicians behave like cockroaches; clearly, many do. I do mean to say that the actions of the Wisconsin and Indiana Democrat politicians are cowardly and bug-like. Their behavior is not akin to an outnumbered army wisely retreating so that it may fight again another day, for these politicians have already been vanquished and they know it. They are instead acting petulantly, like sore losers, cry babies, cockroaches. I meant only to speak of politicians and not citizens, and therefore apologize if any thought I was talking about them (unless you are a politician stealing towels from an Illinois Red Roof Inn, then I did mean you).


  1. Another enjoyable read…but I want to change the subject to:

    How does Simpson’s Paradox relate to something like Global Average Temperature (GAT)?

    This was what I was fumbling with in my [late-in-the-day] query to the earlier post. A GAT, given geographical variations, not to mention disproportionate changes by location, seems like an aggregate figure that can only mislead & conceal.

  2. Best example I ever came across of Simpson’s Paradox was in a basketball argument. One friend of mine was saying that Larry Bird was a better shooter than Reggie Miller, and cited his higher career shooting percentage as evidence. However, I pointed out to him that a much higher proportion of Reggie Miller’s shots were three pointers, and that in fact, Miller shot a higher percentage than Bird on both two pointers and on three pointers.

    My friend stated that this result was impossible. We had to get out the actual raw numbers before he was convinced.

  3. Using reason that would be familiar to Nobel Prize Winner Krugman but foreign to rational people, it is clear that Paul Krugman has been bad for his current employer, the ever greyer New York Times.

    “Paul Krugman joined The New York Times in 1999 as a columnist on the Op-Ed Page … ” according to the Times. On Wednesday, December 1, 1999, the quoted price of Times stock (ticker symbol NYT) climbed to $49.13 per share, a price seen briefly again in 2002, when it began a long slide to its current price of $10.11 per share. I attribute this decline to Paul Krugman.

    I wonder why anyone listens to economic recommendations from a company that has seen its profits drop from $310 million in 1999 to $18.3 million in 2009.

  4. Sadly the quality of education in Texas is not so good. Most of the high schools there exist to support High School Football, the national religion of Texas. It is still possible to ‘hold back’ a kid for a year so he can bulk up before hitting the HS football team.

  5. I know the point of the first couple paragraphs was to be cute, but that sort of rhetoric is rather silly. I’m a Democrat who leans libertarian. I don’t exactly like to be compared to a cockroach.

    I’m sure if we applied this same logic to half the crap the Republicans do, we’d end up with the same results.

    Matt, you know I love your blog, and I respect you a great deal, but that was just too much. It does you no credit to be comparing fellow Americans to cockroaches. Just because not all of us agree on all points doesn’t mean that we’re vermin.

  6. Ari,

    Now, now. You know I don’t mean you. And I apologize for offending you. I modified the text to be clearer.

  7. William,

    Thanks for today’s column; well done, and the link to Simpson’s Paradox was good reading.

    Just for laughs, check out the second paragraph:

    WASHINGTON — The U.S. military is too white and too male at the top and needs to change recruiting and promotion policies and lift its ban on women in combat, an independent report for Congress said Monday.

    Seventy-seven percent of senior officers in the active-duty military are white, while only 8 percent are black, 5 percent are Hispanic and 16 percent are women, the report by an independent panel said, quoting data from September 2008.


  8. Well, I’m mollified.

    Part of me has to wonder, though, if the behavior of our elected officials is not in large part our own fault. Seeing that we vote people in or out, it stands to reason that their behavior is influenced by ours (no #%@, Sherlock?). The Dems are only doing what they think their constituents want, and what they think will get them re-elected.

    The fact is, our system encourages this kind of behavior, because we have:

    1. First past the post elections at all levels
    2. Too many incentives for interest groups and pork

    Nearly no one in this country votes based on what they think is best for the country– they vote nearly entirely based on how big their tax return will be in April. That’s rational of course, and it’s not like it’s NEW, but it does sort of make the original intent of the Senate seem intelligent (an indirectly-elected body of people who would have the freedom to make policy outside of the will of the electorate at times.)

    As I watch more and more of these events unfold, I become increasingly Burkeian in my outlook, and wonder if perhaps we need more delegates. But to get delegates, I suspect we need to reduce voter influence.

    Clearly, this is something the Founders considered– The Federalist outlines this pretty well. It’s kind of interesting to see how states with more “direct” democracy are financial basket cases more often.

    Bring back the delegate, and then the cockroaches are gone.

  9. Sadly the quality of logic reflected by an occasional poster here is not so good. Most comments here exist to discuss the topic in the heading across the top of the post, which today is using statistics responsibly, more or less.

    In WI, for instance, it is still possible to ‘hold back’ a kid for a year so he can bulk up enough to help out around the family dairy, but that minutia has no bearing on the thread so I shall try to resist the temptation to post it. Oops.. Sorry.

  10. Many parents “hold back” their kids to give them the “advantage”, something pretty common among the higher strata of New York City. Although in the Midwest “holding back” has more to do with increasing the chances of one’s son being the bigger guy on the football team, not necessarily as a means to pump the kid’s GPA or the “maturity level” for the Harvard application.

  11. Speaking of bad education, it is evident that Simpson does not know how to add fractions. They must have a common denominator. For example, 1/3+2/5=11/15, not 3/8. How did Simpson ever pass in arithmetic? He obviously needed some unionized teachers to teach fractions.

  12. 49er,

    Ah, the eternal debate over tangents on blogs. How near Godwin’s Law it ever goes.

    I see nothing wrong with exploring other aspects of an article if they are pertinent or interesting. It’s not as if my musing on electoral systems creates a scarcity of space for anyone else to explore, in this case, Simpson’s Paradox.

    I like current events like these because they often allow for multi-disciplinary discussion. I am by no means a talented statistician– I’m mediocre at best. But I am pretty good at picking apart electoral systems and understanding how they affect politician behavior. I figure if I can’t contribute in one way, I might as well in another.

  13. There are MRI Lie Detectors.

    I propose a monthly review of politicians and columnists.

    “Have you told any lies this month?”

    “Please admit to the top 50 for this month.”

    “How much money have you stolen?”

    “How much money have you helped other people steal?”

    etc etc

  14. Ari

    I agree with you. And enjoy reading your comments. I don’t know what caused me to react so violently earlier today. Certainly it was nothing you posted.

    I guess in the instant case it was not so much of a tangent I thought I saw, but a chasm. However, on reflection the commenter may have simply meant to imply TX stats could have been bumped upward a tad due to the holding-back-of-athletically-gifted students issue, and I was so dense I missed it. My bad. Mea Culpa. Ignore me. [sigh]

  15. Wonderful column, as is Iowahawk’s. But wanna estimate the probability that 100% of a random sample of New York Times readers believes Krugman?

  16. note: Okrent hasn’t been the NYT’s ombudsman for almost 6 years and it should be made clear that the quote was not in reference to any specific example like this one (which was written years after the Okrent piece) but was just his impression of Krugman’s use of statistics in general.

  17. I’ve encountered Simpson’s paradox before but under a different name. Can anyone with a better memory than mine suggest what that name (or those names) might be?

  18. Mr. Ari, the Briggs rhetoric was not silly, it was pungent. And since you are a Democrat, how would you characterise the behaviour of your political bedfellows in Wisconsin in no more than two words?

  19. William

    It is because I know that you do not like, dare I say, hate “statistical significance” that I submit to you the following climate change paper:
    The paper is very short (3.5 pages) but the words “t statistics … significant … 95% level” appear already on page 2. I have serious physical reservations but am perplexed by the statistical part.
    So I wondered what a professional statistician would have to say about “Table Look Up Test” (amusing name) and “Scanning t tests”.
    I know that it is slightly off topic, but have you a clear statement to make after perusing this paper for a few minutes?
    You may ask why I don’t make a statement myself.
    Well, it’s because I don’t write as brilliantly as you do and I am not a statistician to Gods either.
    I am just a physicist to Men :)

  20. Just a different kind of filibuster. It’s pointless to attack those politicians’ characters. There is no difference between democrats and republicans. If you don’t want to be lumped together with G Beck, I suggest that you spend your energy answering some interesting statistical questions from readers (if you are capable). I am starting to wonder if you are a credible statistician.

  21. Chris D,

    I would say that there are few differences between the parties. It is ridiculous to say that there are no differences. The point of Iowahawk’s and my post was not statistical, but how certain statistics were relevant to politics. In short, the post was entirely about politics, and about how one group of people were willing to lie or misuse statistics to attain their ends.

    Thus I suggest you spend your energy making truly salient points, or asking interesting statistical questions. Have any?

  22. Saying that there is no difference or few differences between the parties ignores both historical and ideological differences between the parties. There are vast differences between Democrats and Republicans.

    What most people seem to really mean is “I dislike both parties because they aren’t going to do what I think they should be doing and hence I think they’re both equally vile,” rather than postulating that there is an identity for both parties’ positions on, say, States’ rights, or nuclear power. It is also a popular excuse for simply not taking the effort to actually research why there are differences, since by saying there are none, you remove any requirement to do such research. Even for areas where both parties agree on policies (postulate for the sake of the argument that such an area actually exists and is verifiable), there would be and are very different reasons for supporting the same policies. Politics, more often than not, is based on vastly differing world views that are based on subtle nuances of philosophical principles, and researching these is virtually a full-time job if you really want to understand why Democrats and Republicans, for instance, have significantly different approaches to the role of the Federal government based on differing readings of the Constitution on States’ rights. That both parties, at times, ignore or ride roughshod over their own positions in order to achieve a second or even third goal does not help the situation at all: it is hard for anyone at times to keep score.

    Neither party, of course, is immune to the immanent problem of corruption of its members and ideals; perhaps what is meant is that each party is as corrupt and/or incompetent as the other. But that doesn’t mean there is no difference or even few differences.

    If there were no differences or only few differences between the parties then there would be no particular ideological reason for choosing one or the other, and indeed significant reason for choosing neither (i.e. independent voters). Given the historical record of ca. 1/3rd claiming to be Democrats and ca. 1/3rd claiming to be Republican, with the 1/3rd independent voters then generally deciding the election results, the only logical conclusion is that there are differences between the parties. Otherwise, Democrats and Republicans could easily band together and have 2/3rds of the voters, easily enabling them to ignore the independent voter entirely. That they do not do so speaks well for a strong differentiation between the parties.

    That prospective voters do not recognize those differences, nor understand the fundamental differences between the two parties, indicates a failure of education, a failure of party work (to illuminate the reasons that someone should consider themselves a Democrat or Republican based on a rational choice, rather than family history or other social pressures) and a fundamental failure of rational and critical thought in many voters. I am sure that your readers can think of further reasons for this lacuna.

    I encounter Simpson’s Paradox often at work. Part and parcel of detailed economic analysis of industrial activity: depending on the weighting, Simpson’s Paradox pops up at the most annoying times, usually when you have a client on the phone wondering what sort of idiots we are because “things don’t add up”… as they never do using chain-weighted numbers.

  23. I have a question about Iowahawk’s analysis. In his followup post:

    he describes Simpson’s paradox…

    “Hitter B has a higher batting average against both righties and lefties, but Hitter A has a higher overall average by dint of facing a different mix of pitchers. Now comes the question: it’s the bottom of the 9th, two out, and you need a base hit. Who would you insert as a pinch hitter, A or B? The detailed data suggests Hitter B, irrespective of whether the pitcher was right- or left-handed. The overall average, in this context, is worse than meaningless – it leads you to exactly the wrong conclusion.”

    If this is legit analysis, it opens a can of worms, I think. Suppose that instead of handedness, we had pitchers with and without some “X Factor.” Suppose we didn’t even know what the X Factor is. Suppose we didn’t even know what proportion of pitchers in the league have the X Factor. But suppose some third party did know, and simply provided us with a breakdown of our two hitters’ averages against the two types of pitchers, and they formed a Simpson’s paradox. Would it still be “wrong” to put in hitter A? Is it better to put in B, because he is better at hitting against both X and not-X pitchers? Color me skeptical.
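
One way to poke at this last question is to work an example. The sketch below uses invented hit and at-bat counts (nothing here comes from Iowahawk’s post), and it assumes the split averages are trustworthy estimates of how each hitter does against each kind of pitcher. Under that assumption, the expected average against the next pitcher is a weighted mix of the two split averages, so the hitter who is better against both kinds stays ahead whatever the unknown proportion of X pitchers happens to be.

```python
from fractions import Fraction

# Hypothetical (hits, at_bats) records, invented for illustration only.
records = {
    "A": {"X": (20, 100), "not-X": (90, 300)},
    "B": {"X": (66, 300), "not-X": (32, 100)},
}


def average(hitter, kind):
    """Batting average of one hitter against one kind of pitcher."""
    hits, at_bats = records[hitter][kind]
    return Fraction(hits, at_bats)


def overall(hitter):
    """Batting average pooled over both kinds of pitcher."""
    hits = sum(h for h, _n in records[hitter].values())
    at_bats = sum(n for _h, n in records[hitter].values())
    return Fraction(hits, at_bats)


def expected(hitter, p_x):
    """Expected average if the next pitcher has the X Factor with probability p_x."""
    return p_x * average(hitter, "X") + (1 - p_x) * average(hitter, "not-X")


for hitter in records:
    print(hitter,
          "vs X:", float(average(hitter, "X")),
          "vs not-X:", float(average(hitter, "not-X")),
          "overall:", float(overall(hitter)))

# Try several possible proportions of X pitchers, since the true one is unknown.
for p_x in (Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(3, 4), Fraction(1)):
    print("p_x =", float(p_x),
          "A:", float(expected("A", p_x)),
          "B:", float(expected("B", p_x)))
```

The pooled averages favor A (.275 to .245) only because A faced mostly not-X pitchers. For every value of p_x tried, B’s expected average is the higher one. Whether real split averages deserve that much trust is a separate matter, and one the sketch does not settle.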