Statistically Speaking: Why You Are Too Sure

How happy are you? On a scale between one and ten, of course. Nobody could be eleven-happy. Listen: I can’t measure your happiness unless I put a number to it. And I must needs measure. A seven you say? Very well, a seven.

Now let’s ask Ed over there. Hey, Ed, how happy are you? Six? That means Joe is one point happier than you. What’s he doing that you’re not?

Putting a number to happiness makes it science—doing statistics on these numbers is therefore scientific. Think big data, think computers, think sexy learning algorithms! No way we can go wrong by quantifying everything and then modeling it. Right?

Wrong. Hire me to come and speak to your group about why you believe more than you ought.

A helping hand

Categories: Statistics

33 replies »

  1. I hate these point scales. Marketing surveys are the worst – in my experience people invariably love to use a 10 point scale, because it makes them seem smart and scientific. Doctors use them too – and sometimes life or death rides on the fact that your pain is at least a 7 (so that way insurance will pay for the test that discovers that your appendix is about to burst).

  2. Nate: I agree on the pain scale. I simply refuse to answer the question–well, unless I thought my appendix were about to burst, though I have no appendix so I’m okay on that. Asking me what “number” on the pain scale my headache is is ridiculous. If it’s a “10”, you’ll know. I’ll be biting your head off. If you have to ask, the answer is already obvious. Though not to bean-counters, it seems.

  3. In failure modes and effects analysis, the modern dispensation asks that we rate the severity of the effects on a 1-10 scale. This is not too bad when the scale is carefully defined (10=catastrophic failure, no warning, etc.). But then we must rate the likelihood of occurrence, which is well-nigh impossible, and then the undetectability of the failure, both again on a 1-10 scale. Then we are to calculate the Risk Priority Number as the product of these three ratings. This is just wrong in so many ways. Is a severity of 8 really twice as bad as a severity of 4? What does it mean to multiply three numbers from three different scales? Is a severity of 10 of equal importance to a likelihood of 10? Is 10 degrees Fahrenheit the same amount as 10 centimeters? (And those are at least ratio scales!)

    The old MIL-STD 1629A did things differently. Criticality and likelihood were simply classified in four or five buckets of increasing severity or likelihood– I,II,III,IV and A,B,C,D,E –but no attempt was made to manipulate them as numbers. Further, the likelihood was defined not as a probability de novo, but as a relative frequency given the assumption that the failure mode has already occurred. In so many things, our ancestors were wiser than we.

  4. There isn’t much difference between classifying something in four or five buckets of increasing severity and doing the same with 10 buckets labeled 1-10. Both systems are ordinal. With numerically named buckets, though, it’s tempting to treat the numeric value of the names as a parameter for an equation — often an unwarranted assumption.

  5. I just love it when people purport to measure something by taking a poll. It’s like the EPA claiming that exposure to second hand smoke causes cancer when they didn’t measure any exposure. People were asked questions about their exposure and asked to rate their exposure on a scale.

    BTW, the EPA now has a scientific integrity officer, Dr. Francesca Grifo, who previously worked for The Union of Concerned Scientists. The Union of Concerned Scientists is well known for their scientific integrity so this appointment should just thrill you. I’m sure with this new Orwellian-sounding position we’ll see a great improvement in EPA integrity.

  6. Ray,

    There aren’t many ways outside of a poll to measure opinions. Surprisingly, averaging guesses of the number of pennies in a jar often comes close to the right answer — mostly because people are adept at things like guessing volumes through experience. They aren’t so adept when it comes to guessing exposure amounts — mainly because of lack of experience.

  7. Dav,

    “If happiness is a warm puppy the answer must be one. More than one warm puppy can be aggravating.”

    Utter nonsense. Happiness is a pile of warm puppies.

  8. YOS & Briggs,

    Yes, but in which things were they wiser and if we could figure that out wouldn’t that make us the wiser?

  9. MattS,

    I dunno. Spending the night with a pile of warm active firewood while on a camping trip in Hypothermia is true happiness. A pile of warm puppies will keep you up all night with their squirming and whatnot. Firewood doesn’t bite when you try to light it. You can cuddle up with one warm puppy but doing so with one piece of firewood won’t lead to much happiness.

  10. “but in which things were they wiser and if we could figure that out wouldn’t that make us the wiser?”

    Knowing a wise doctor won’t make you a wise doctor.

  11. Briggs,

    Couldn’t help but notice your lending a hand can be disarming. Perhaps certain redheads would agree.

  12. Four categories is much better than ten, because people have a better chance of categorizing. I recollect that five buckets is pretty much the tops for consistent sorting into categories. Beyond that, huyafulin? Heck, we sometimes had trouble securing consistency just in classifying “good”/“bad” on aluminum cans.

  13. Briggs,

    Barrel cuffs, of course. French cuffs always seem to get caught on something. There’s also the James Bond cuff. Some people I know go for the shiny interlocking cuffs but mostly wear them post-adventure.

  14. “You have to reread the statements DAV as the pronouns were plural and collective.”

    I have to read it more than once because the pronouns were plural?

  15. “Four categories is much better than ten…”

    Not as a general rule. Depends on what is intended to be accomplished. Sometimes having too few categories blurs distinctions. But, for love, hate and the like, you are probably right.

  16. This is in reference to your wise doctor comment DAV. You have to reread the comments that led to your saying this since you appear to have not understood them the first time.

  17. I guess I’ll have to do it the hard way. Let’s see:

    You asked a compound question which is really two questions which I’ve enumerated below (with some editing):

    1) … in which things were they wiser ?
    2) and if we could figure out [the things in which they were wiser] wouldn’t that make us the wiser?

    The answer to (2) is no or more precisely, not necessarily. Joe Namath was good at football (so I’ve heard). Just knowing that won’t make anyone else a good football player. Knowing what someone is/was wise about won’t (necessarily) transfer that wisdom to the observer.

    So what exactly makes you think shifting from “us” to “you” means I didn’t understand your questions? Still got that toe sucking fetish?

  18. Still having problems with politeness I see DAV. YOS, Briggs, and myself were speaking of the collective wisdom of the past versus the present and not the wisdom of any one individual. There may be past wisdom that has been lost and thus unknown to us but as I said if we could figure it out it would no longer be lost and “we” would be the wiser. There are still wise doctors and skillful football players, so that knowledge and skill has not been lost and the past is not wiser than the present in those areas. My comment was presented as a humorous tautology. Feel a little foolish now?

  19. There’s a difference between knowing what and knowing why and also a willingness to emulate — all of which has nothing to do with how many are involved. So, no.

  20. DAV,
    If this is now your position then your examples were poorly chosen. In any case you are still misreading my initial statement. This reminds me of another favorite of mine: There is no why, there is only how. That will require some thought as well.

    We might suspect that the ancients could do something that we can not, but this can not be confirmed until we can reproduce it and then they are not wiser. Notice that I am not denying the possibility of superior ancient wisdom only our ability to separate the wheat from the chaff without removing the superiority. Your wise doctor example also begs the question. How do you know that the doctor is wise without being wise yourself? This might be an interesting discussion if you could learn to control yourself.

  21. “We might suspect that the ancients could do something that we can not, but this can not be confirmed until we can reproduce it and then they are not wiser. Notice that I am not denying the possibility of superior ancient wisdom”

    I suspect from the wording that you are confusing knowledge with wisdom. Recall, the original context:

    “The old MIL-STD 1629A did things differently. Criticality and likelihood were simply classified in four or five buckets of increasing severity or likelihood– I,II,III,IV and A,B,C,D,E –but no attempt was made to manipulate them as numbers. Further, the likelihood was defined not as a probability de novo, but as a relative frequency given the assumption that the failure mode has already occurred. In so many things, our ancestors were wiser than we.”

    That is, while we know how to divvy such scales into ten increments and multiply numbers from an ordinal scale, our ancestors were wiser not to do so.
    As the joke runs: knowledge is when you know a tomato is a fruit, not a vegetable. Wisdom is not putting tomatoes in your fruit salad.

  22. “Wisdom is not putting tomatoes in your fruit salad.”

    I suspect that has more to do with why than how.

  23. YOS,
    The difference between knowledge and wisdom can be subtle indeed and I would not want to be put in the position of crafting a clear distinction between the two. It is not at all clear in your example “Further, the likelihood was defined not as a probability de novo, but as a relative frequency” if this be wisdom or mere knowledge or using different words to say the same thing. Of course your old example is not particularly ancient (1980?). You have very young ancestors. I suspect that your ranking of wisdom goes thusly, in increasing order: present day fools, honoured ancestors, and at the top YOS. 🙂

    DAV: Why is that?

  24. Scotian: That’s an awful lot of interpretation to lay on a passing quip. I suggest you read instructions on completing a modern FMEA and then compare to MIL-STD 1629A, especially as regards estimating the likelihood of failure modes.

  25. YOS,
    Yes of course, but to be fair my original comment was only meant to be a tautological joke but it got escalated by he whose name can not be mentioned.
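
The objection raised in comments 3 and 4, that multiplying ordinal ratings treats bucket names as if they were quantities, can be illustrated with a small sketch. The two failure modes and their ratings below are invented for illustration, as are the function names; nothing here comes from an FMEA standard. The point is only that renaming the buckets 1..10 as 11..20 keeps every ordering within each scale intact, yet it can reverse which failure mode gets the larger Risk Priority Number.

```python
from math import prod

def rpn(ratings):
    """Risk Priority Number: product of the three ordinal ratings."""
    return prod(ratings)

def relabel(ratings, shift=10):
    """Rename buckets 1..10 as 11..20; order within each scale is unchanged."""
    return tuple(r + shift for r in ratings)

# Hypothetical (severity, occurrence, undetectability) ratings.
mode_a = (8, 2, 2)
mode_b = (4, 3, 3)

original = {"A": rpn(mode_a), "B": rpn(mode_b)}
shifted = {"A": rpn(relabel(mode_a)), "B": rpn(relabel(mode_b))}

print(original)  # {'A': 32, 'B': 36}     -> B looks riskier
print(shifted)   # {'A': 2592, 'B': 2366} -> A looks riskier
```

If the ratings were true quantities, a harmless renaming of the buckets could not change the ranking. Since it does, the product depends on the labels rather than on anything measured, which is the case for leaving the buckets as I, II, III, IV and A through E.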
