Statistics

Regression To The Mean (And Performance Curses) Simply Explained

I complete this foursome

I’ve just read about The Second Term Curse which supposedly besets (of course) second-term presidents.

There is also the infamous Sports Illustrated Curse, which is said to befall athletes soon after they appear on the cover of that magazine. Other examples abound.

Roughly, ceteris paribus, on average, all other things equal, these “curses” more or less work like this:

Everything that isn’t a stick in the mud, or the product of a bureaucracy (or a bureaucrat), or isn’t otherwise ossified exhibits, for a given behavior, a range. Batters coming to the plate have hot and cold streaks. Actors have scintillating and dull performances. Golfers hit over and under par. Presidents please then displease the citizenry.

Now most of the time performance, for this or that person, is middling. Professional golfers don’t shoot birdies constantly and consistently, nor do they hit bogeys: they hit par, which is why they call it par. For me, I don’t shoot two or three over par each time, nor do I hit nine or ten; my usual tally hovers around five or six over. I mean per hole.

But suppose I were invited to join this summer’s Internet Philosophers Open, held each July in beautiful downtown Gaylord, Michigan. Further suppose that I, strengthened by the love and support of my dear readers (and a sufficient dose of the water of life), shoot par and therefore win.

Instant celebrity would result. My picture and bio would appear on tens of blogs, I wouldn’t have to pick up the tab on the nineteenth, and I’d probably even get an interview request from the local paper. The Mayor would shake my hand. Discussions about t-shirts imprinted with my image would be had. I’d be the talk of the interwebs for hours.

This publicity would not go unnoticed and thus I’d surely be asked to participate in the Fall Bloggers Classic, which is held each October in Cleveland (weather permitting). Once there, it’s much more likely I’d “revert” to my average performance and finish +297 ( = 18 holes * 3 rounds * 5.5 over par per hole ).

Think of the headlines! “Shame and Ignominy on Full Display”, “Briggs Muffs It”, “Tournament Organizing Committee Under Investigation”, etc., etc. The psychic pain of my fall would be so intense I’d probably take to listening to NPR—and imagining that I enjoyed it.

Theories by the dozen would be propounded about why, after showing so much promise, I failed so badly. Some would place the blame on atmospheric conditions. Others would compare the quality of Polish sausages between the two locales. Many would pore over my writings between the two tournaments searching for clues about my mental state.

Some, none, or even all (in part) of these theories might be right (something caused me to crumble), but the smart money before the Fall Classic would have bet on a dismal performance, simply because that was the best evidence and the most likely outcome.

But if people don’t recognize this, and only remember the see-sawing of performances between the two tournaments, they might put the changes down to a curse.

This all works in reverse, too. If you witness an atypically dismal performance, chances are good the next will be better: the performer has “regressed” (in reverse) to their “mean.” Or if you see somebody displaying their everyday ability, that’s most likely how you’ll see them the next time.
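For readers who would rather see the effect than take it on faith, here is a minimal Python sketch. It assumes, purely for illustration, that every player has identical ability and that a round is a normal draw around 99 over par (18 holes at 5.5 over each) with a standard deviation of 10 strokes. The function name `simulate`, the spread, and the player count are all made up for the example; none of it comes from the tournaments above.

```python
import random
import statistics


def simulate(n_players=10_000, fraction=0.05, seed=1):
    """Give every player two independent rounds drawn from the same distribution.

    Returns the round-1 and round-2 averages for the players who were
    best in round 1, then the same two averages for the players who were worst.
    """
    rng = random.Random(seed)
    # Hypothetical scoring model: strokes over par for one 18-hole round.
    mean, sd = 99.0, 10.0
    first = [rng.gauss(mean, sd) for _ in range(n_players)]
    second = [rng.gauss(mean, sd) for _ in range(n_players)]

    # Rank players by their first round only (lowest score is best in golf).
    ranked = sorted(range(n_players), key=lambda i: first[i])
    k = int(n_players * fraction)
    best, worst = ranked[:k], ranked[-k:]

    def avg(scores, group):
        return statistics.fmean(scores[i] for i in group)

    return avg(first, best), avg(second, best), avg(first, worst), avg(second, worst)


best_1, best_2, worst_1, worst_2 = simulate()
print(f"Round-1 stars:  {best_1:6.1f} over par, then {best_2:6.1f}")
print(f"Round-1 flops:  {worst_1:6.1f} over par, then {worst_2:6.1f}")
```

Nobody in this toy world was cursed or blessed between rounds, yet the stars get worse and the flops get better. With identical ability the regression is complete; if players genuinely differed in skill, the extreme groups would move only part of the way back.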

Categories: Statistics

20 replies »

  1. The obvious answer is to be like the pictured comedy group who had seven members over time but never varied from their average of three.

    I’m surprised at you, Briggs. Here we have a post with cursing in it and I thought this was a family blog.

  2. Well, that was a truly average, unremarkable, run-of-the-mill post! 🙂

    Kidding aside, an interesting post. Yet it raises a question. What about the well-known phenomenon of ‘streaks’ or the ‘hot hand’ or ‘being in the zone’ in sports? Is that just undocumented perception? Seems there may be more to it than that. With streaks, we tend to see a performer do exceptionally well, not just once or twice and then revert to the norm, but over a period of several — perhaps many — performances. Indeed, if sports psychology has any merit, someone on a streak may be more likely to continue performing above their average on the next performance. Eventually, of course, the cruel hand of fate weighs in and the inevitable drag of the individual’s average capability brings things back to the average, but it seems there may be something to the concept of a temporary streak/zone.

    On the other hand, is there a real thing as a streak/zone, or is it all just random fluctuation around a norm — with some strings of above-average performances inevitably being longer than others — and what we think is a streak/zone is just sampling bias, due to the survivor effect (i.e., in this case the athlete on a random streak gets noticed and receives lots of temporary press)?

    Thoughts?

  3. In flipping a coin over and over, one may encounter “streaks” of six heads (wins) about 1.5 times in a hundred such sequences. Doesn’t mean the coin is “hot.” Cf. Fermi’s discussion with Gen. Groves over the percentage of “great” generals.

  4. Interesting that “reversion to the mean” suggests a drop from a superior level. Why is it never “ascension to the mean?” Do we notice failure more than success? Or are steroids the reason for improved performance and thus a rising mean? Maybe it’s those kids from Lake Wobegon…

  5. In reply to Eric Anderson and streaks. I remember reading about a professor teaching a statistics class. An early homework assignment was for students to write down a series of HT coin flips; they had the option of making up the results or flipping an actual coin and writing down the results. The professor was able to guess with 90%+ accuracy which students had actually flipped coins and which had just made up results. The professor did this by looking for a run of 7 or more heads or tails. There’s about an 80% chance that you’ll get a streak of 7 or more by flipping randomly 200 times, and just about a 0% chance of getting a run of 7 or more from someone making up strings of Heads and Tails.
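The "about 80%" figure in reply 5 can be checked without flipping anything. The sketch below assumes a fair coin and independent flips, and tracks the probability of each current run length flip by flip. The function name `prob_run_at_least` is invented for this example.

```python
def prob_run_at_least(n_flips: int, run_len: int) -> float:
    """Probability that n_flips fair flips contain a run of at least
    run_len identical outcomes (heads or tails)."""
    if n_flips < 1 or run_len < 2:
        raise ValueError("need n_flips >= 1 and run_len >= 2")

    # state[k] = probability that no long run has occurred yet and the
    # current run has length k (k = 1 .. run_len - 1). Index 0 is unused.
    state = [0.0] * run_len
    state[1] = 1.0  # after the first flip the run length is always 1

    for _ in range(n_flips - 1):
        new = [0.0] * run_len
        new[1] = 0.5 * sum(state)        # next flip differs: run restarts
        for k in range(1, run_len - 1):
            new[k + 1] = 0.5 * state[k]  # next flip matches: run grows
        # Mass that would reach run_len is dropped: a long run has happened.
        state = new

    return 1.0 - sum(state)


print(f"Run of 7+ in 200 flips: {prob_run_at_least(200, 7):.3f}")
print(f"Six heads in six flips: {0.5 ** 6:.4f}")
```

The second line is the figure behind reply 3: six heads in a row from six flips happens one time in 64, or roughly 1.5 times per hundred tries.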

  6. It’s probably those hidden variables. If we knew those all would be clear.

  7. Ye Olde Statistician and Alan McIntire:

    Yes, I was thinking of those examples too, thus the question.

    Sounds like you would argue the athlete on a hot streak is just random deviation around the norm and people only think it is something more because it gets noticed and reported (a kind of “survivor bias” in the reporting).

    But that’s really the question. I’m not quite sure that some streaks aren’t real, in the sense of being more substantive than a random fluctuation around a norm. In many sports (so at least we are told and some of us might even think we’ve experienced) things like attitude/confidence/intensity/whatever play a role in performance. So it would seem that a streak could be, at least temporarily, self-sustaining and perhaps not just a random string of positives based on a large enough sample size. Same goes for the proverbial slump.

    Of course, if there are enough hot streaks happening over a long time then we typically just revise upward our expectations of what the athlete’s “norm” is and confess that he was really better than we thought after all . . .

  8. All true, but with regard to human performance, one aspect not to be overlooked is improvement over time, which is one reason we amass time-series of such performances. The mean of a performer’s performances is not (necessarily) static; if he’s improving, the mean of his scores will march upward (or in golf, downward). That makes the use of “regression to the mean” a bit chancy in analyzing or explaining athletic performances, for example.

  9. JH,

    Impossible. There is no frequentism, as it is fallacious, though there are people calling themselves ‘frequentists’. All are Bayesian.

  10. The key to making improvements, the late Ellis Ott used to say, is to know when an apparent change is real or not. A “streak” of five or six could occur by chance even if the mean has not changed. The old Western Electric SQC Handbook used as signals worth investigating:
    1 sample beyond 3-sigma
    2 samples out of three beyond 2-sigma on the same side
    4 samples out of five beyond 1-sigma on the same side
    7 samples out of eight on the same side of the center line
    But these signals were simply intended to trigger an investigation for an assignable cause. They were to guard against the well-known human tendency to assign causes to random fluctuations.

    So a performance streak might be
    a) a “lucky” streak followed by reversion to the mean
    b) an actual change in the mean.
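The four signals quoted in reply 10 are mechanical enough to sketch in code. This is a rough illustration of the rules exactly as worded in that reply, not a reproduction of the handbook itself. The function name `western_electric_signals` is hypothetical, and the sketch assumes the center line and sigma are already known.

```python
def western_electric_signals(samples, center, sigma):
    """Return (index, rule) pairs wherever one of the four quoted rules fires.

    Each rule looks at the most recent `window` samples and asks whether at
    least `needed` of them lie beyond `limit` sigmas on the same side.
    """
    z = [(x - center) / sigma for x in samples]
    # (window size, count needed, limit in sigmas, label)
    rules = [
        (1, 1, 3.0, "1 beyond 3-sigma"),
        (3, 2, 2.0, "2 of 3 beyond 2-sigma"),
        (5, 4, 1.0, "4 of 5 beyond 1-sigma"),
        (8, 7, 0.0, "7 of 8 on one side"),
    ]
    signals = []
    for i in range(len(z)):
        for window, needed, limit, label in rules:
            if i + 1 < window:
                continue  # not enough history yet for this rule
            recent = z[i + 1 - window : i + 1]
            above = sum(v > limit for v in recent)
            below = sum(v < -limit for v in recent)
            if above >= needed or below >= needed:
                signals.append((i, label))
    return signals


# Made-up scores in sigma units, drifting upward toward the end.
scores = [0.2, -0.5, 0.1, 1.3, 1.1, 1.6, 1.2, 0.4, 3.4]
for index, rule in western_electric_signals(scores, center=0.0, sigma=1.0):
    print(index, rule)
```

A fired rule says only "go and look for an assignable cause," which is the point made in the reply. It does not by itself distinguish case (a) from case (b).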

  11. “Sounds like you would argue the athlete on a hot streak is just random deviation around the norm and people only think it is something more because it gets noticed and reported (a kind of “survivor bias” in the reporting).” Could be. As Ye Olde Statistician pointed out, you could compare the frequency of streaks to the number of players. In baseball there are about 15 non-pitchers per team and about 36 plate appearances per game over 162 games, which works out to an average of close to 400 plate appearances per non-pitcher. With 30 teams that works out to 450 players. Probably 1 of them will have a 1/450 career-high performance, etc.

    Playing pickup basketball games after school many years ago, I sometimes felt that I was in a hot streak. There could be a little to that, as basketball and baseball involve not only luck, but skill. If someone is tired after working hard or not getting enough sleep, has a sore back or knee, a slight fever or cold, they’re not going to perform as well as when they’re in 1A tiptop shape. There are different probability curves for different people in this regard too: what’s the probability you won’t be somewhat tired from a restless night, won’t have knee or back problems, won’t be fighting off infections, etc?

  12. Briggs,

    I shall not argue with your calling frequentism “fallacious” (this seems to be your favorite word), so instead, let me suggest that you read Gelman’s book for a Bayesian generalization/explanation of “regression to the mean.”

  13. JH,

    Gelman? Gelman? The guy who keeps trying to push “Bayesian” p-values on us? No, thank you, ma’am.

  14. Briggs, Ah, I see, since Gelman promotes Bayesian p-values (Bayesian p-values are based on posterior probabilities and thus are not p-values per se.), therefore his book is not worth reading.
    It’s a good book.

  16. I would say that there isn’t necessarily a contradiction in talking about a “hot streak” and “normal variation” when it comes to something like sports. As we often hear on this blog, “random” really means “we don’t know.”

    Of course, the observed performance is caused by something. So perhaps for some period, some of the variables (e.g., the aforementioned attitude/confidence/intensity/whatever) take on a higher level, causing increased performance. What led to that?

    Hmm…maybe it’s random! Ok, but that just means we don’t know. Not that there isn’t an underlying cause. And with something as complex as human performance, there is no end to speculation about causes. Thus, sports talk radio.
