William M. Briggs

Statistician to the Stars!

The Final Polls & Predictions: What’s The Difference Between Polls And Models?

There are two ways to forecast the election: polls and models. Polls are easy: go out and ask who will vote for whom, tally the results, and print ‘em. As long as the poll sample “looks like” the eventual voters, the poll will be somewhat accurate.

Models are just the same as any statistical model. They take data as input and spit out probability predictions. Bayesian models, just because they are Bayesian (which is of course not just the superior statistical philosophy, but the correct one), are not necessarily better than any classical model. Bayesian models can stink just as badly as classical ones, or worse, because their users tend to be cockier (like Yours Truly).

Polls are inherently numerical; that is, a set of numbers is the result. 50% for Romney, 49% for Obama, as Gallup’s final has it, for example. Models are usually numerical, but needn’t be (mine is not). Polls are typically released to the nearest percentage. Modelers pretend that they have more insight and show us more digits. Nate Silver puts Mr Obama’s chances (as of today) at 91.6%. That’s point-six and not point-five.

Because our task will be to assess the goodness or badness of polls and models, let’s today document them. Corrections are welcome. I’ve tried to discover all the most relevant information but there are lacunae. If you can document these blanks, or suggest other sources that should be added, please do so in the comment boxes and I’ll put them in the main text. Be sure to include links.

Polls

There are others, but these are the biggies from Real Clear Politics. All are percentages. The margins of error have the screwy “confidence interval” interpretation, and are thus too small.

| Source | Obama Projected | Romney Projected | Democrat Sample | Republican Sample | Independent Sample | Margin of Error |
| --- | --- | --- | --- | --- | --- | --- |
| Pew; Oct 31-Nov 3 | 50 | 47 | 39 | 32 | 29 | 2 |
| NBC/WSJ; Nov. 5 | 48 | 47 | 43 | 41 | 15 | 2.6 |
| ABC; Oct. 31-Nov. 3 | 49 | 48 | 33 | 29 | 34 | 2.5 |
| CNN; Nov. 2-4 | 49 | 49 | 41 | 30 | 29 | 3 |
| Gallup-USA Today Swing States; Oct 22-28, 27-31 | 48 | 48 | ? | ? | ? | 3 |
| Rasmussen; Nov 1-4 | 48 | 49 | 39 | 37 | 24 | 2.5 |
| Gallup final; Oct 31-Nov 3 | 49 | 50 | ? | ? | ? | 2 |
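
For readers wondering where a margin like "2" or "3" comes from, here is a minimal sketch of the textbook calculation. It assumes simple random sampling, a 95% level, and a worst-case share of 50%. Real pollsters weight their samples, so their published margins need not match this formula. The function names are made up for illustration, and the sample sizes it prints are implied by the formula, not taken from the polls.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval (assumed level)


def margin_of_error(n, p=0.5, z=Z_95):
    """Half-width of the textbook interval for a share p from n respondents."""
    return z * math.sqrt(p * (1.0 - p) / n)


def implied_sample_size(moe, p=0.5, z=Z_95):
    """Sample size that would produce margin `moe` under the same assumptions."""
    return (z / moe) ** 2 * p * (1.0 - p)


if __name__ == "__main__":
    # Margins as reported in the table above, in percentage points.
    for moe_pct in (2.0, 2.5, 2.6, 3.0):
        n = implied_sample_size(moe_pct / 100.0)
        print(f"margin {moe_pct}% -> roughly {round(n)} respondents")
```

Note that this margin only accounts for sampling variability in a single candidate's share. It says nothing about whether the sample "looks like" the eventual voters.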


Models

I’ve only found a few prominent academics and journalistic celebrities who are well known enough to have published predictions. Not to over-work an exhausted phrase, but garbage in, garbage out; and this is true no matter how sophisticated the apparatus or well credentialed the pundit (which could include me!). The top two modelers were so confident that they did not include explicit probabilities.

| Source | Obama Probability (%) |
| --- | --- |
| Drew Linzer, Votamatic (R charts!); Nov 5 | ~100? |
| John Sides, The Monkey Cage; Nov 5 | ~100? |
| Wang; Princeton Election Consortium; Nov 5 | 99.8 |
| Darryl Holman, Horses Asses; Nov 5 | 98.8 |
| Nate Silver; Nov 5 | 91.6 |
| Simon Jackman; Pollster; Nov 5 | > 50 |
| Kenneth Bickers, Michael Berry, CU; article; Oct 4 | 23 |
| Briggs; Nov 5 | < 20 (less than twenty) |
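
One common way to grade probability forecasts once the result is known is with a proper scoring rule. The sketch below uses the Brier score and the log loss. Nobody in the table is known to use either rule, and the function names are hypothetical. Only the entries with an explicit number are scored, since "~100?", "> 50" and "< 20" are not precise enough to plug in. One election is a single data point, so these scores say little about a model's long-run skill.

```python
import math

# Entries from the table that came with an explicit probability for Obama.
forecasts = {
    "Wang": 0.998,
    "Holman": 0.988,
    "Silver": 0.916,
    "Bickers & Berry": 0.23,
}


def brier_score(prob_obama, obama_won):
    """Squared error between the forecast and the 0/1 outcome (lower is better)."""
    outcome = 1.0 if obama_won else 0.0
    return (prob_obama - outcome) ** 2


def log_loss(prob_obama, obama_won):
    """Negative log of the probability given to what happened (lower is better)."""
    p = prob_obama if obama_won else 1.0 - prob_obama
    # A forecast of exactly 0 for the event that occurred is penalized without bound.
    return math.inf if p <= 0.0 else -math.log(p)


def score_table(obama_won):
    """Map each source to its (Brier score, log loss) for the given outcome."""
    return {
        name: (brier_score(p, obama_won), log_loss(p, obama_won))
        for name, p in forecasts.items()
    }


if __name__ == "__main__":
    for won in (True, False):
        print("Obama wins" if won else "Obama loses")
        for name, (brier, loss) in score_table(won).items():
            print(f"  {name}: Brier {brier:.4f}, log loss {loss:.3f}")
```

The log loss shows why a forecast of exactly 100% is risky: if the other outcome occurs, the penalty is infinite.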


Notes: Most models take as input the poll data, but they also use other data, like measures of the state of the economy, the year, etc., etc., etc. “Monte Carlo” doesn’t mean spit. It’s just one of many techniques with which to perform numerical integration. See also this journal with folks playing around. The WSJ looks at historical poll accuracy.
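
To show that Monte Carlo is just numerical integration, here is a minimal sketch that computes the same probability two ways: by simulation, and exactly from the normal distribution function. The inputs (a lead of 3 points with a standard deviation of 2 points) are made-up numbers for illustration, not taken from any poll or model above, and the function names are hypothetical.

```python
import math
import random


def exact_prob_positive(mean, sd):
    """P(X > 0) for X ~ Normal(mean, sd), computed from the error function."""
    return 0.5 * (1.0 + math.erf(mean / (sd * math.sqrt(2.0))))


def mc_prob_positive(mean, sd, n_draws=200_000, seed=2012):
    """Monte Carlo estimate of the same integral: the fraction of draws above zero."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(n_draws) if rng.gauss(mean, sd) > 0.0)
    return hits / n_draws


if __name__ == "__main__":
    lead, spread = 3.0, 2.0  # hypothetical lead and uncertainty, in points
    print("exact      :", exact_prob_positive(lead, spread))
    print("Monte Carlo:", mc_prob_positive(lead, spread))
```

The two answers agree to within simulation error, and more draws narrow the gap. The simulation adds no insight the integral did not already contain. The answer is only as good as the assumed lead and spread.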

Update Clarified my prediction; less than 1 in 5 chance, but that’s as close to a number as I care to go.

Update 11:17 pm. So much for my model, which took too much account of poll error and supposed too much “Bradley effect.” Ah well. More: I had Obama down, and he was, by some 10 million votes from 2008. But I had Romney up by about 3 million, while he turned out to be down by about the same amount from McCain in 2008. Those Republicans who did vote were more “enthusiastic”, but what a dichotomy.

Update See this about the missing voters. “As of this writing, Barack Obama has received a bit more than 60 million votes. Mitt Romney has received 57 million votes. Although the gap between Republicans and Democrats has closed considerably since 2008, Romney is still running about 2.5 million votes behind John McCain; the gap has closed simply because Obama is running about 9 million votes behind his 2008 totals.” That difference was my blunder.

Update Well Real Climate folks! This is the space to discuss when to admit failure and abandon models which do not make skillful predictions. Like (sadly) my model. And climate models.

See also “How Presidential Polls Work” (very useful) and “Nate Silver’s Obama Prediction.”

23 Comments

  1. Does that last table exemplify the process of “wishcasting”?

  2. Briggs

    6 November 2012 at 10:25 am

    Rich,

    Let’s hope it is instead true of the leading entries of the table.

  3. I think this is the clearest example of wishful thinking bias.

    Regardless, good luck to you all.

  4. Briggs

    6 November 2012 at 10:37 am

    Luis,

    I agree. To put Obama’s chances at near 100% is clearly evidence of wishcasting.

  5. Here’s good video of Nate Silver. “A full and continuous drillage” indeed.

  6. Obama at 100% — at some point, as more facts become known, that figure will or won’t be precisely correct. In a sense, the uncertainty in predictions is indicative of the weakness in available data going into the prediction and/or one’s confidence in the precision and completeness of the available data (so, knowing this really, in the end, doesn’t help an outsider all that much)…

  7. Garbage in, garbage out is so out of date. We used to joke, garbage in, gospel out. People would believe those computer calculations even when they were nonsense.

  8. Dr. Ron Howard at Stanford uses multiple choice questions that are weighted by the student’s assessed probability of their answer being correct. Anyone who assesses the probability at 100% and gets the answer wrong fails the class. He has actually had to fail people for doing this. It is intended to teach people about over-confidence.

    BTW, I was digging around in the internals for some of the polls listed in RCP last night and found one for which 100% of adults contacted reported that they were registered voters and 99% made it past the Likely Voter screen. So I guess we can look forward to a 99% turnout!

  9. Gravis Marketing Poll of Ohio – Obama by 1% with 99% of registered voters turning out.

  10. Here’s an interesting blog piece claiming to develop a prediction without using poll data at all:

    http://www.washingtontimes.com/blog/robbins-report/2012/nov/4/why-obama-cant-win/

  11. The Richard Charnin blog has some interesting posts on the beautiful topic of election fraud, an art that has become so easy it has to be taken for granted, even though officially it doesn’t exist. Just adds another theatrical layer to the already comical election process. Also an extraordinary report by Victoria Collier in the November issue of Harper’s.

    Richard Charnin: 1988-2008 Unadjusted Exit Polls
    http://richardcharnin.wordpress.com/2011/11/13/1988-2008-unadjusted-state-exit-polls-statistical-reference/

    Richard Charnin: Final Forecast: The 2012 True Vote / Election Fraud Model
    http://richardcharnin.wordpress.com/2012/11/05/final-forecast-the-2012-true-vote-election-fraud-model/

    Victoria Collier: How to Rig an Election (Harper’s November)
    http://harpers.org/archive/2012/11/how-to-rig-an-election/?single=1

  12. My prediction: less than 1 in 3 chance, that’s as close to a number as I care to go.

    I have ESP powers.

  13. @Francisco,

    In my opinion, exit polls are fraught with worse problems than DRE voting machines (major problems, to be sure) and are never by themselves a valid basis to challenge the official election results. If Mr Charnin’s forecast of the true vote is based on exit poll data, it is invalid on its face.

    As to the Harper’s article: if they or anyone else has even the smallest particle of real evidence of actual fraud involving DRE voting machines, they should be filing suit in the federal courts. Writing articles alleging voter fraud without going to the courts is nothing more than blowing smoke.

    Note: I am a libertarian not a conservative or a Republican.

  14. Charnin is not talking about the accuracy of the exit polls, but rather about the impossible unidirectionality of the shifts, i.e. virtually every poll overturned by results is overturned in the same direction.

    Regarding the Harper’s article, Collier is an investigative journalist and her job is to dig up material and get it published. She is not a lawyer. Regarding hard proof, that’s the beauty of the new system: there ain’t any. It’s not like in the days when you could dig the ballot boxes from the bayou if necessary.

    The one thing remarkable in all this is that finally a major publication has dared cover something that the pundits know but won’t acknowledge, no matter how obvious it is. Even without any grounds for suspicion (and there are piles of evidence of rigging if you look) you would have to assume that if vote rigging can be done by this means, then it will most certainly be done. Ireland spent millions on new electronic voting machines a couple of years ago, only to sell them all as scrap this year.

    Not that it makes that much difference which of the two gets elected, but the system is truly ridiculous, amazing.

  15. Francisco,

    “Charnin is not talking about the accuracy of the exit polls, but rather about the impossible unidirectionality of the shifts, i.e. virtually every poll overturned by results is overturned in the same direction.”

    The unidirectionality means nothing by itself. It could be that this is the result of election tampering, but it is equally likely that this is the result of biased polling. Given the current state of the MSM, I would be more likely to believe the latter than the former.

    The Harper’s article demonstrates nothing new or extraordinary. The article itself mentions large scale vote rigging going all the way back to the early 1900s. At least one scheme mentioned lasted for decades before it came to light. Most of the historical schemes mentioned were in favor of the Democrats, so they are hardly in a position to stand on moral high ground on this issue.

    For that matter, speaking of more recent tech, I seriously doubt that DRE voting machines are truly any worse on any of the points of vulnerability the article mentions than the old lever machines.

  16. The real problem is not the technology per se. The real problem is that for decades the political parties themselves have had control of the vote counting process and also the people who are supposed to police the vote counting process.

    What kind of tech is used from pencil and paper to DRE machines is irrelevant as long as political operatives with direct stakes in the outcome are the ones counting the votes.

    Do note the fact that with the current allegations of vote counting fraud around the DRE machines favoring Republicans and the administration in the hands of a Democrat, neither the DOJ nor the FEC seems interested in this issue. I wonder why?

  17. The difference between “old school” rigging and e-rigging is a matter of scope and a matter of near-impossibility of detection. Collier says at the beginning of her article:

    [...]
    But as the twentieth century came to a close, a brave new world of election rigging emerged, on a scale that might have prompted Huey Long’s stunned admiration. Tracing the sea changes in our electoral process, we see that two major events have paved the way for this lethal form of election manipulation: the mass adoption of computerized voting technology, and the outsourcing of our elections to a handful of corporations that operate in the shadows, with little oversight or accountability.
    [...]
    Old-school ballot-box fraud at its most egregious was localized and limited in scope. But new electronic voting systems allow insiders to rig elections on a statewide or even national scale. And whereas once you could catch the guilty parties in the act, and even dredge the ballot boxes out of the bayou, the virtual vote count can be manipulated in total secrecy. By means of proprietary, corporate-owned software, just one programmer could steal hundreds, thousands, potentially even millions of votes with the stroke of a key. It’s the electoral equivalent of a drone strike.
    [...]

  18. Now this is really interesting:
    http://www.youtube.com/watch?v=8K_Rgwo0Ut8

    And here is the article with more detail. What is the explanation for this?
    http://www.ukprogressive.co.uk/breaking-retired-nsa-analyst-proves-gop-is-stealing-elections/article20598.html

  19. Ye Olde Statistician

    7 November 2012 at 12:42 am

    I guess now the Usual Subjects will ascribe the vote rigging to the recently announced winner. Or else vote-rigging will vanish down the memory hole.

  20. While suggestive, proof is far too strong a word for the actual contents of the article.

  21. Proof may be too strong a word in many situations.
    But what is the explanation for the fact that vote percentage appears consistently as a function of cumulative precinct size (total votes cast in a precinct) and only for certain candidates and only when a Central Tabulator machine is used? Choquette and Johnson summarized their findings in this paper (and links to more detailed analysis at the end of the paper).

    http://www.themoneyparty.org/main/wp-content/uploads/2012/10/2008_2012_ElectionsResultsAnomaliesAndAnalysis_V1.51.pdf
    […]
    - When candidate Mitt Romney is on the ballot he always gains votes through Vote Flipping. (Except in the case of Utah and Puerto Rico).
    - There is little to no vote gains in very small precincts. Vote Flipping appears to start between 5%-20% cumulative vote tally per precinct. We believe that this is a more efficient form of fraud because fewer precincts need to be affected and it further reduces the chance of detection.
    - The gain of votes increases linearly as a function of cumulative precinct size. This indicates a computer algorithm at play, rather than natural voter preference.
    - Candidates with very low vote percentages are unaffected. This could be to prevent negative vote tallies.
    - Possibly of very high importance to investigators, whenever a county does not make use of a “Central Tabulator” machine, there is no Vote Flipping and the plot traces on the chart “flat-line”.
    […]
    This should be of interest to all statisticians who love their craft regardless of their political inclinations.

  22. statistical models and physics models too


© 2014 William M. Briggs
