Why probability isn’t relative frequency

(This is a modified excerpt from my forthcoming—he said hopefully—book, on the subject of why probability cannot be relative frequency. This is to be paired with the essay on why probability cannot be subjective. I particularly want to know if I have made this excruciatingly difficult subject understandable, and what parts don’t make sense to you.)

For frequentists, probability is defined to be the frequency with which an event happens in the limit of “experiments” where that event can happen; that is, given that you run a number of “experiments” that approach infinity, then the ratio of those experiments in which the event happens to the total number of experiments is defined to be the probability that the event will happen. This obviously cannot tell you what the probability is for your well-defined, possibly unique, event happening now, but can only give you probabilities in the limit, after an infinite amount of time has elapsed for all those experiments to take place. Frequentists obviously never speak about propositions of unique events, because in that theory there can be no unique events.

There is a confusion here that can be readily fixed. Some very simple math shows that if the probability of A is some number p, and you give A many chances to occur, the relative frequency with which A does occur will approach the number p as the number of chances grows to infinity. This fact, that the relative frequency approaches p, is what led people to the backward conclusion that probability is relative frequency.

The confusion was helped because people first got interested in frequentist probability by asking questions about gambling and biology. The man who initiated much of modern statistics, Ronald Aylmer Fisher, was also a biologist who asked questions like “Which breed of peas produces larger crops?” Both gambling and biological trials are situations where the relative frequencies of the events, like dice rolls or ratios of crop yields, very quickly approach the actual probabilities. For example, drawing a heart out of a standard poker deck has logical probability 1 in 4, and simple experiments show that the relative frequency of experiments quickly approaches this. Try it at home and see.
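If you would rather try it on a computer than at home, here is a minimal sketch in Python. It assumes each draw is made with replacement from a well-shuffled 52-card deck, which a pseudo-random generator only approximates; the function name `heart_frequency` and the fixed seed are illustrative choices, not anything from the book.

```python
import random


def heart_frequency(num_draws, seed=0):
    """Relative frequency of hearts in num_draws draws with replacement."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    suits = ["hearts", "diamonds", "clubs", "spades"]
    # A standard deck: 13 ranks in each of 4 suits.
    deck = [(rank, suit) for suit in suits for rank in range(1, 14)]
    hearts = sum(1 for _ in range(num_draws) if rng.choice(deck)[1] == "hearts")
    return hearts / num_draws


if __name__ == "__main__":
    for n in (10, 100, 1_000, 10_000, 100_000):
        print(n, heart_frequency(n))
```

Any finite run like this gives only an approximation that tends to sit near 1 in 4; it never reaches the limit that the frequentist definition requires.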

Since people were focused on gambling and biology, they did not realize that not all arguments that have a logical probability match a relative frequency. To see this, let’s examine some arguments in closer detail. This one is from Stove (1983; we’ll explore this argument again in Chapter 16).

Bob is a winged horse
———————–
Bob is a horse

(Screen note: this is to be read “Bob is a winged horse, therefore Bob is a horse”: stuff above the line is the evidence, stuff below is the conclusion.)

The conclusion given the premise has logical probability 1, but has no relative frequency because there are no experiments in which we can collect winged horses named Bob (and then count how many are horses). This example might appear contrived, but there are others in which the premise is not false and there does not, or cannot, exist any relative frequency of its conclusion being true; however, a discussion of these would take us further than we want to go in this book.

A prime difficulty of frequentism is that we have to imagine the experiments that pertain to an argument if we are to calculate its relative frequency. In any argument, there is a class of events that are to be called “successes” and a general class of events that are to be called “chances.” Think of the die roll: successes are sixes and chances are the rolls. While this might make sense in gambling, it fails spectacularly for arguments in general. Here is another example, again adapted from Stove.

(A)
Miss Piggy loved Kermit
—————————–
Kermit loved Miss Piggy

What are the classes of successes and chances? The success cannot be the unique event “Kermit loved Miss Piggy” because there can be no unique events in frequentism: all events must be part of a class. Likewise, the chances cannot be the unique evidence “Miss Piggy loved Kermit.” We must expand this argument to define just what the successes and chances are so that we can calculate the relative frequencies. It turns out that this is not easy to do. This argument has three different choices! The first:

(B)
Miss Piggy loved X
—————————–
X loved Miss Piggy

or,

(C)
Y loved Kermit
—————————–
Kermit loved Y

and finally,

(D)
Y loved X
—————————–
X loved Y

Evidence (from repeated viewings of The Muppet Show) suggests that the logical probability and relative frequency of (A) are both 0. Any definition of successes and chances based on this argument (so that we can actually compute a relative frequency) should match the logical probability and relative frequency of (A). Now, because of Miss Piggy’s devotion, the relative frequency of (B) seems to match that of (A), where we have substituted the variable X for Kermit, a perfectly acceptable way to define the reference classes. But we are just as free to substitute Y for Miss Piggy. However, the relative frequency of (C) is about 0.5 and does not, obviously, match that of (A) or (B). Finally, under the rules of relative frequency, we can substitute variables for both our protagonists and see that the frequency of (D) is nothing like the frequency of any of the other arguments. Which is the correct substitution to define the reference class? There is no answer.
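The dependence on the substitution can be made concrete with a toy tally. The sketch below uses an invented set of (lover, beloved) pairs: the extra characters and who loves whom are hypothetical, chosen only so the counts echo the rough figures above, and are not data from the show. The function name `reciprocation_frequency` is likewise just an illustrative choice.

```python
# Hypothetical "loved" relation, invented purely for illustration.
LOVES = {
    ("Miss Piggy", "Kermit"),  # not returned
    ("Fozzie", "Kermit"),
    ("Kermit", "Fozzie"),
    ("Gonzo", "Camilla"),
    ("Camilla", "Gonzo"),
}


def reciprocation_frequency(loves, lover=None, beloved=None):
    """Relative frequency with which love is returned within a reference class.

    Leaving lover or beloved as None turns that name into a variable, which
    widens the class of "chances". Returns None if the class is empty.
    """
    chances = [
        (a, b)
        for (a, b) in loves
        if (lover is None or a == lover) and (beloved is None or b == beloved)
    ]
    if not chances:
        return None
    successes = sum(1 for (a, b) in chances if (b, a) in loves)
    return successes / len(chances)


if __name__ == "__main__":
    print("(A)", reciprocation_frequency(LOVES, "Miss Piggy", "Kermit"))
    print("(B)", reciprocation_frequency(LOVES, lover="Miss Piggy"))
    print("(C)", reciprocation_frequency(LOVES, beloved="Kermit"))
    print("(D)", reciprocation_frequency(LOVES))
```

With this made-up data the four reference classes give 0, 0, 0.5 and 0.8. Nothing in the counting itself says which class is the right one for (A).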

It’s worse than it seems, too, even for the seemingly simple example of the die toss. What exactly is the chance class? Tossing this die? Any die? And how shall it be tossed? What will be the temperature, dew point, wind speed, gravitational field, how much spin, how high, how far, for what surface hardness, and on and on to an infinite progression of possibilities, none of them having any particular claim to being the right class over any other. The book by Cook (2002) examines this particular problem in detail. And Hajek (1996) gives examples of fifteen, count ’em, fifteen more reasons why frequentism fails, most of which are beyond what we can look at in this book.

These detailed explanations of frequentist peculiarities are to prepare you for some of the odd methods and the even odder interpretations of these methods that have arisen out of frequentist probability theory over the past ~100 years. We will meet these methods later in this book, and you will certainly meet them when reading results produced by other people. You will be well equipped, once you finish reading this book, to understand common claims made with classical statistics, and you will be able to understand its limitations.

——————————————-
1. While an incredibly bright man, Fisher showed that all of us are imperfect when he repeatedly touted a ridiculously dull idea: eugenics. He figured that you could breed the idiocy out of people by selectively culling the less desirable. Since Fisher also has strong claim on the title Father of Modern Genetics, many other intellectuals, all with advanced degrees and high education, at the time agreed with him about eugenics.

2. Stolen might be a more generous word, since I copy this example nearly word for word.

38 Comments

  1. Joy

    Briggs:
    I was with you right up until Miss Piggy and Kermit came on the scene.
    How is substituting ‘x’ or ‘y’ an expansion of an argument?
    Why is whether Kermit loves Miss Piggy relevant to whether Miss Piggy loves Kermit?
    Why can we not say that Miss Piggy loves Kermit or Miss Piggy does not love Kermit and say there’s two possible choices?
    I’ll get a cup of tea and read it a fourth time.
    Kermit is in Microsoft word’s spell check! I’ve just been corrected.
    Nit-pick:
    “Evidence (from repeated viewings of The Muppet Show) suggests that the logically probability and…” Is there a typo in ‘logically’?

  2. Joy

    (a) Miss Piggy loved Kermit therefore Kermit loved Miss Piggy
    (b) x loved Kermit therefore Kermit loved x
    (c) Miss Piggy loved y therefore y loved miss piggy
    (d) x loved y therefore y loved x

    These statements are the same. We are no further on with (d) than (a), we’ve just removed the felt animals.
    I hope that my hazy memory of the exact nature of their relationship has not affected my ability to understand this example! I recall much tension between the two and that there was often violence.
    The winged horse and the dice make sense but I’m struggling with this Muppet example.

  3. Luis Dias

    I’m not really into your argument when you prepare to give examples and then come up with the lame horse argument. Because you give a surreal example and then say that there are a lot of ireali examples similar to this one, but then why not give a real example in the first place?

    If the horse example was real, then I can’t also see how it undermines frequentism. p, as you say, is 1. But if you check all the winged horses, they are all horses, which means that if the amount of winged horses checked are all horses, itheni p will tend to be 1 in the infinite limit. How this shows how frequentism fails is not clear whatsoever.

    I do not agree with Joy, I see Kermit’s example more clear, but it is not that greatly explained. Why is the conclusion derived from C wrong? If you knew nothing more, it would be the best guess you could possibly make. As in the die case, if you knew exactly how all the physical variables interacted and were able to make the mathematical calculation sufficiently fast and be (very) precise on your hands, wouldn’t that greatly distort the frequentist statistics as well?

  4. Luis Dias

    Sorry for the typos there. When I say “die” I mean “dice”, “ireali” should be read “real”, etc.

  5. “Finally, under the rules of relative frequency, we can substitute variables for both our protagonists and see that the frequency of (D) is nothing like the frequency of any of the other arguments. Which is the correct substitution to define the reference class? There is no answer.”

    I think it’s more correct to say that “There is no answer, unless you better specify the question.”

    Do we really want to know the probability Kermit in particular, loves Miss Piggy specifically? Either Kermit loves Miss Piggy or he does not. Either Miss Piggy loves Kermit, or she does not. If we had a crystal ball and could look into Kermit’s and Miss Piggy’s minds, then the question is purely deterministic.

    The difficulty lies in trying to turn this into a question of probability. The fact that the question is asked this way means that, to answer it, we must turn it into a question of probability. But, since the person asking the question didn’t say, which question are we asking?

    The only way to estimate a probability meaningfully is to ask questions like: What is the probability that “some X” loves “some Y” given evidence “Z”? If “X” is some felt puppet character who flees “Y” whenever possible, distances himself from “Y” and behaves like he is suppressing the impulse to visibly cringe when embraced by “Y”, it seems the probability “X” does not love “Y” approaches 1. This seems to describe Kermit’s response to Miss Piggy and so we would conclude the probability Kermit loves Miss Piggy is near zero.

    On the other hand, maybe the person asking the question is using Kermit and Miss Piggy as metaphors for a larger class. Maybe they want to know “Person ‘X’, selected at random from the entire universe of people, is known to love ‘Y’; therefore person ‘Y’ loves ‘X’.”

    This is, of course, an entirely different question. So, the answer is entirely different! There are a zillion questions in between because we haven’t even pondered what we mean by “love”. Does person “Y” even know “X”? Or, is person Y some teen idol who is “loved” by many teenie-boppers?

    In some sense, the difficulty does not lie with probability or frequentism. It lies with a) imprecisely worded questions and b) trying to impose probability on deterministic questions through the mechanism of stating questions vaguely.

    Having said all that, I will state with confidence that

    Miss Piggy loves Kermit
    ——
    Kermit loves Miss Piggy

    Has precisely the same probability as

    Lucy loves Linus
    —-
    Linus loves Lucy.

    🙂

  6. Just a suggestion: perhaps the relative frequency vs. the probability of X winning the lottery might be a more intuitively understandable example.

    Bob bought a lottery ticket for the first and only time
    _ _ _ _ _ _ _

    Bob won the lottery

    In this case the probability (near zero) and relative frequency (1:1) are not the same.

  7. Briggs

    Mike, Luis,

    I can see that I have to do a better job explaining this difficult subject.

    The “Bob is a winged horse” argument is not as bad as it first looks. It is a counterfactual, in the sense that the premise is false. But logical probability can give a probability to the conclusion, whereas this is impossible using frequency theory.

    I talk more about counterfactuals in the last chapter, but I think I should say more here. Counterfactuals are common. For example, how can a credit card company know whether its credit-granting model works? It can see the people who default, yet who were predicted to not default. But it cannot see the people who did not default yet who were predicted to, simply because the company never issues these cards. Yet they still need this number to check model goodness.

    Another example, of the kind commonly used by historians: If Hitler did not invade Russia, Germany would have won WWII. Obviously, since Hitler did invade, the premise is false, yet we have great fun giving a probability to the conclusion.

    And again, frequentist theory fails utterly here.

    Mike, your example is OK, but I have found it difficult for people not to mentally swap out Bob with X, and so suppose they have proved a natural frequency. I have to remind them: what kind of lottery? How will the balls be chosen? Must they all weigh the same, exactly down to the molecule? How far above sea level will they be shaken, and on and on and on…

    Lucia,

    It is a tautology to say, about any event, that “It will happen or not”, or “It has happened or it hasn’t.” Yet we can still give a probability to the “it will happen” part. “We see a 6” in a die roll will happen or it won’t, but given “This is a die tossed once, and just one side is a 6” we can say the probability of the event is 1 in 6. Frequentists cannot say this without first carrying out the experiments to verify what the actual relative frequency is.

    Of course, if we had crystal balls we wouldn’t need probability at all. Consider, too, that we do not have, in our list of premises, any information about the conclusion (of the crystal ball kind).

    You have put your finger on the main problem with frequentism. Now, I say there is an answer to “Kermit loves…” questions and that, like the die example mentioned at the bottom or the lottery example just above, defining the question precisely is an impossibility under frequency.

    I do think I have explained these “Kermit loves” questions badly, so I’ll have another go. But I would like to challenge anybody who thinks—let’s use the lottery example from Mike—they can define the precise experiment which will prove probabilities are frequencies to have a go.

    I have already mentioned some items that must be kept in mind. How about Yolanda Vega’s sock color as she shakes the lottery balls? Should that be the same each time? If you say, “Don’t be an ass. Of course not!” you should realize that this logical argument is not allowed. After all, how can you prove the sock color doesn’t matter? Just claiming it is not sufficient. We have to test, right?

    Oh, I think you can see where we’re going. Logic—or information if you like—is inevitable.

  8. JH

    Here is how I understand the post. Consider the inference or verdict to be reached by a jury at a criminal trial. The verdict probably doesn’t involve any consideration (e.g., the relative frequency of a guilty verdict) of randomly selected past situations of similar crimes. It is usually based on the evidence presented at the trial (premises entail a conclusion). The probability represents the degree to which the premises logically support the conclusions.

  9. Logic—or information if you like—is inevitable.
    I agree with this. I could say more, but I suspect you will. I’d guess the probability that you will say something I agree with exceeds 90%. But, not being a statistician, my formal exposure to Bayes’ law came in an engineering course more than 20 years ago. So, I could have miscalculated. 😉

    On your dice rolling example: That is an event that supposedly has not yet occurred. Or, alternatively, whose outcome has not yet been revealed. So, that differentiates it from the “Kermit-Miss Piggy” issue. I believe their love/non-love relationship is eternally frozen in time, and one for which we appear to have loads of information.

    I would never try to prove probabilities are frequencies.

    I would say that sometimes, the idea that probability corresponds to relative frequency is useful. Other times, the idea doesn’t work so well.

    In some ways, this is no different from problems in other fields where we need to create a model for the thing we wish to understand, and then we study the model. (BTW: I don’t just mean computer models.)

    Sometimes, it’s easy to forget the model isn’t the thing itself. It’s sort of a cartoon.

  10. For Ms. Piggy and Kermit, we can define a chance to mean “each time they interact” and we can define a success to mean “shows love for.”

    I’m not going to get into a discussion about what “love” means or what “shows love for” means, but the statements “Ms Piggy loves Kermit” and “Kermit loves Ms. Piggy” need not be descriptions of unique events, but an event that happens each time they interact.

    If you use the statements “Ms Piggy loved Kermit” and “Kermit loved Ms. Piggy”, there is no need to assign probabilities. Everything has already happened, and the data already exists. You just need to count your chances and successes.

    In fact, you did just that when you said that by repeated viewing of the Muppet Show you can impute that the frequency of (A) is zero. What exactly were you measuring when watching the Muppet Show to come up with such a statement?

    I do not deny the underlying argument that you are making between probability and relative frequency. Bob the winged horse example works just fine as an example of a data set where you cannot count the set of chances. However I don’t think that the Kermit / Ms. Piggy example works as an example of a unique event and I’m betting that the problem is the imprecision of the language used, as lucia says.

    From a reader’s point of view, I also don’t understand how you arrived at relative frequencies for B and C. Can you clarify it in the text?

  11. Matt

    So, the gist of this frequentist thing seems to be that in order to get a probability, they need to observe some number of trials, since their definition of probability considers nothing else, while a non-frequentist might use any other information available to come up with a probability.

    A simple, if silly (well, not as silly as a Muppet), example might be whether a natural number is odd. Given my knowledge of the definition of an odd number, I know that the probability of any natural number n being odd is 0.5:

    n is a natural number
    —————————
    n is odd

    The experimenter would do some number of experiments, and likely come up with a number close to 0.5, depending on how many experiments he did (and how he did them).
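A small sketch of what such an experimenter might do, in Python. One assumption is forced on the experimenter immediately: there is no way to pick uniformly from all the natural numbers, so the sample has to come from 1 up to some chosen bound. The function names and the bound are hypothetical choices made for illustration.

```python
import random


def odd_fraction_first_n(n):
    """Exact fraction of odd numbers among 1, 2, ..., n."""
    return ((n + 1) // 2) / n


def odd_fraction_sampled(num_trials, upper, seed=0):
    """Fraction of odd numbers among num_trials draws from 1..upper."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    odd = sum(rng.randint(1, upper) % 2 for _ in range(num_trials))
    return odd / num_trials


if __name__ == "__main__":
    for n in (1, 3, 10, 11, 1_000_001):
        print(n, odd_fraction_first_n(n))
    print(odd_fraction_sampled(100_000, 1_000_000))
```

Counting the first n numbers gives exactly 0.5 when n is even and slightly more when n is odd, so even here the answer depends on how the experiment is set up.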

  12. I almost get it, but maybe not. Let’s take a real world one-of-a-kind event, say McCain wins the upcoming election.

    That event has some logical probability of happening, but we have no experiment we can repeat to test it. We could, however, look at past elections, opinion polls, and other evidence, make numerous assumptions, and calculate some value p based on that evidence.

    In fact, media types do that every day. They announce percentages and the “margin of error.” I don’t know exactly how they come up with those numbers, but I assume they use classical frequentist methods.

    How might a Bayesian, given the exact same evidence, calculate a probability, and would that result in a different value p and/or a different margin of error? Would those be “more logical” estimates?

    Or does this example miss the point of your discussion entirely?

  13. “They announce percentages and the ‘margin of error.’ I don’t know exactly how they come up with those numbers, but I assume they use classical frequentist methods.”

    As far as I can determine, the assumptions underlying the calculations are usually this:

    If we repeated this poll N times, doing it at exactly the same time, under exactly the same circumstances, but setting our auto-dialers to dial somewhat different numbers, we expect the answers to fall within ±x% of what we got. (Some confidence level is assumed or stated.)

    The “margin of error” may sometimes mean something different, but I think that’s the gist of what they do. So, the auto-dialer might dial me, or my neighbor, or my sister etc. You’ll get a different answer depending on who you dialed– but other than that, you expect to get the same percentage who answered “a”, “b” or “c” etc.

    The margin of error reported in political polls is trying to estimate the range of results one might get as a result of slight differences in the choice of phone numbers dialed by the pollster.
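For reference, the usual textbook version of this calculation can be sketched in a few lines of Python. It assumes a simple random sample and a 95% confidence level (z = 1.96), which is a common convention and not necessarily what any particular pollster does; the function name is an illustrative choice.

```python
import math


def margin_of_error(p_hat, n, z=1.96):
    """Approximate margin of error for a sample proportion.

    p_hat is the observed share answering a given way, n is the number of
    respondents, and z is the normal quantile for the chosen confidence
    level (1.96 for roughly 95%). Assumes simple random sampling.
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


if __name__ == "__main__":
    # A poll of 1,000 people split 50/50 gives about 0.031, i.e. roughly
    # plus or minus 3.1 percentage points.
    print(round(margin_of_error(0.5, 1_000), 3))
```

The number captures only the variation from who happened to be sampled; it says nothing about wording, non-response, or whether people do what they say.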

    Here’s the problem: there is no good way to calculate how well the answers to the questions are correlated to what people really think or what they will really do. There is no perfect way to correct for the fact that I, or others “like me”, might answer my phone, but my sister, husband or others “like them” never will.

    The only thing you can estimate based on the poll results and numbers is how repeatable you think the answers to a specific poll done a specific way will be.

    Unfortunately, the goal of the survey is often to learn something other than the outcome of the survey. When you ask 10,000 people whether they will vote for Obama or McCain in November, you probably want to know how voters who actually go to the polls will vote in November. But what you learn is how people who answered your survey answered the question you asked worded the way you asked it on the day you asked it. The estimated “margin of error” is an estimate about how different your results might be if you’d happened to bump into a slightly different batch of people!

    These estimates of the “margin of error” are useful relative to having no estimate at all. But it’s important to recognize that you generally can’t compute the uncertainty due to other factors. Since you can’t compute the true uncertainty in what you really want to know, that margin of error is not the true uncertainty relative to what you really want to know!

    Statistics are much cleaner and easier if someone is trying to calibrate a thermocouple or figure out if one batch of Guinness likely has a somewhat lower proof than the standard batch. (And even then, it’s possible to screw up. 🙂 ) But Briggs, like many people, seems to want to deal with the tough questions like “What is the probability Kermit loves Miss Piggy?”

  14. steven mosher

    1. Kermit loves Miss Piggy therefore Miss Piggy loves Kermit.

    Well, we all know that the vignettes of Kermit and Miss Piggy trade on the experience of unrequited love, and since we all know that true lovers tell each other “you are the only one for me” and “you complete me,” the probability of #1 being true is quite low, so rare that it appears to those who experience it as if it were a miracle. Just ask Aristophanes.

    Oops, this was supposed to be about statistics.

  15. Oh— I reread and want to mention. I don’t mean my last statement to sound snide. There are many tough statistical questions, and the example about Kermit and Miss Piggy is a good one.

  16. Briggs

    All,

    My heart soars like a hawk to read these comments. And people say that the internet is making us stupider!

    Lucia, Jinnah,

    It does not make a difference when an event happens. For example, five minutes ago I flipped a coin. What is the probability it is a head? The event already happened, it was already an H or T. Further, I know what it was. But you do not. Probability is a measure of ignorance or information. Your information, or evidence, is different than mine.

    In this way, the Kermit and dice problems are the same. The evidence does not tell us whether the conclusion is true, so we use probability.

    Jinnah,

    An event each time they meet does not work. Each week? Day? Minute, second? Microsecond? Besides, to be a frequentist problem you have to design an experiment that can be repeated indefinitely.

    You’re right about not being explicit enough in detailing (B) and (C). I haven’t done a very good job at this so I’ll work on it.

    Matt,

    Close. But the “some number” needs to go to infinity. Simply having a large number isn’t good enough. It must, by definition, go to infinity. In this sense, your example looks like a natural frequentist situation. This is kind of like a gambling problem of the kind that mistakenly led people to believe that probabilities are frequencies.

    Mike,

    Yep. Probability of McCain winning is a good one. A frequentist will immediately try to expand your argument

    All I know about this election
    ————————————-
    McCain wins

    To something like

    All I know about presidential elections
    ————————————-
    Republican wins

    Where I have substituted for some variables to make the problem more frequentist like. But this still doesn’t work because this number of experiments doesn’t go to infinity.

    Mike, Lucia,

    Those intervals are confidence intervals, the classical interpretation of which boggles the mind. What Lucia says is sort of right: but you have to repeat the survey…you guessed it. An infinite number of times.

    We’ll talk more about surveys another time.

  17. Matt

    Re: ‘some number’

    Obviously, it’s impossible to explicitly observe an infinite number of experiments. I sorta assumed that they’d pick some epsilon, and once the sequence of frequencies-to-probability calculations (calculated cumulatively) stayed within some epsilon, they’d assume that they’d come out with a pretty good guess at what the actual limit would be. Assuming that we’re talking about a formal limit here.
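The stopping rule described here can be sketched in Python as follows. The particular rule (stop once the running frequency has varied by less than epsilon over the last `window` trials) is one assumed reading of the comment, and the names `estimate_until_stable`, `epsilon`, `window` and `max_trials` are hypothetical.

```python
import random
from collections import deque


def estimate_until_stable(trial, epsilon=0.01, window=1_000, max_trials=1_000_000):
    """Run trial() until the running frequency settles, or max_trials is hit.

    "Settles" means the running relative frequency varied by less than
    epsilon over the most recent `window` trials.
    Returns (estimate, trials_used).
    """
    successes = 0
    recent = deque(maxlen=window)  # last `window` running frequencies
    for n in range(1, max_trials + 1):
        if trial():
            successes += 1
        recent.append(successes / n)
        if len(recent) == window and max(recent) - min(recent) < epsilon:
            return successes / n, n
    return successes / max_trials, max_trials


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    estimate, used = estimate_until_stable(lambda: rng.randint(1, 6) == 6)
    print(estimate, used)
```

Whatever epsilon and window are chosen, the rule stops after finitely many trials, so what it returns is a guess at the limit and not the limit itself.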

    So my question would be: How would a frequentist answer the odd natural number question?

  18. Bernie

    I am with Joy on the Ms Piggy example. Besides it is kind of dated!!

    Do frequentists try to prove the famous “all other things being equal” assumption by repeated experiments, while Bayesians assert that you cannot prove it for many types of events? Your sock example seems to assert that the lottery may be “fair” but other factors can influence the outcome and we have to systematically control for them to come up with the “actual” probability.

    Certainly I agree that an event can have different outcomes without the event being repeatable and the lack of repeatability does not preclude the attaching of probabilities to those outcomes.

  19. “Do frequentists try to prove the famous ‘all other things being equal’ assumption by repeated experiments,”

    Maybe I should let Briggs answer… but I think the answer is “no”. People rarely try to prove “all other things being equal” by repeated experiments.

    Some ideas are cartoons. Seriously– a long time ago, I read a journal article called something like “The role of cartoons in science”.

    Frequentism is based on the ‘cartoon’ idea that whatever samples you draw come from some infinite set of possible samples that you might have drawn. Then, the true probability of “x” is the fraction of samples in this mythical set that display “x”. You estimate the true probability by determining the fraction of ‘x’ from the samples you drew.

    For some sorts of problems, this idea is useful. It’s not a bad way to think about the problem if you are calibrating a thermocouple. You take 20 measurements. You can imagine that you might have taken 20 different ones. Or repeated it again and again and again. At least in your imagination, it is possible to do an infinite number of calibration experiments on the thermometer.

    But it gets weird for certain types of problems. Is there a set containing an infinite number of Kermit the Frog Miss Piggy pairs? Do some of the Kermits love some of their Miss Piggies? Do some of the Miss Piggies not love their Kermit?

    Does the idea of a set containing an infinite number of Kermit the Frog-Miss Piggy pairs, with some fraction “x” of Kermits loving Miss Piggy and some fraction 1-x not loving her, seem to have nothing to do with the question about the probability the one and only Kermit may or may not love the one and only Miss Piggy?

    It seems weird to me! So, clearly, the frequentism idea has a problem for the Kermit-Miss Piggy example. And the problem isn’t simply that we can’t calculate the probability. Under the frequentism ‘cartoon’, the probability is defined as a frequency, but the frequency definition doesn’t make sense in this circumstance.

    Needless to say, if the cartoon breaks down for some problems, then someone has to come up with a different cartoon to describe what we really mean by probability. We need the new cartoon to cover both what we mean by probability in the example of calibrating a thermocouple and in the “Kermit-Miss Piggy” example.

    FWIW– cartoons breaking down isn’t unique to statistics. Think of science questions like: “Does light consist of particles or waves?”

  20. I’m not sure you answered my question. Maybe it was a bad example. We can imagine a great many one-of-a-kind future events, and we can speculate on the probability of those events occurring based on “evidence.” Sometimes the evidence is strong, sometimes weak. Will the sun come up tomorrow? Lacking crystal balls we don’t really know, but strong evidence suggests it will.

    We can even premise counterfactuals, i.e. the sun is drawn by magical horses belonging to the gods or other earth-centric theories of pre-Copernicans, and still speculate about the logical probability.

    Sun orbits the Earth
    – – – – – – – – –
    Sun will come up tomorrow

    Is this beyond the logic of Fisher et al.? But not that of Bayesians?

    Infinite limit theory may seem illogical (or imponderable), but it worked for Isaac Newton. Calculus is based upon limit theory and calculus underpins all those distributions that Bayesians use for their priors. And what is a prior but a stab at the unknown?

    I accept Bayes’ conditional logic. I accept that our uncertainty about the future is larger than many realize or admit to. But I cringe (somewhat) at the anti-frequentist rhetoric when the exact same parametric statistical distributions are used by the “enlightened” Bayesians.

    I will have to read Hajek. I probably will. If I do, I might understand the issue better. At this point in time I cannot say for sure which of those three premises, if any, are counterfactual.

  21. MrCPhysics

    I’m with the reader above: I like everything in your essay except the fictional/surreal examples. I think you’d be much clearer if you used a real-world example. I don’t know why those would be beyond the scope of the book.

    Doesn’t the frequentist argument assume an idealization of the (imagined) experiment? By this I mean, for example, that dice probability is set assuming a “fair die”, which anyone can imagine, defined as one where symmetry is perfect, as is the landing area, and the throws are idealized as “random”. This makes a lot more sense with dice than it does with economic models, of course, which is, at least in part, your point (I think).

    If the main thrust of this section is that frequentists are too certain of their predictions, well, then, as they say, “You had me at ‘hello’”.

  22. Bernie

    Here is a real-world example: We just sent in a proposal for a piece of work. It seems reasonable and meaningful to ask, “What is the probability of us winning the work?” What information is needed to answer this question? As with many proposal processes, there is an initial round and then a request for presentations from a “shortlist” of proposers. “What is the probability of us winning the proposal, if we are asked to present?”
    If historically we win 50% of the proposals we are asked to present, how much does this help answer the question?

    Now I must get back to preparing for a presentation of our proposal.

  23. Briggs

    All,

    I agree with everybody that my Kermit example stinks. I also agree that I have explained it very badly, and so out it goes. I have caused more confusion than I have increased anybody’s understanding. There are more technical ways to present the argument (“instantiating argument schemas”, “what happens when substituting propositional vs. predicate variables” etc.), but they would be far out of place in an introductory book.

    The key arguments are: (1) probability is a logical measure of the relation between a precisely described set of premises and a conclusion; (2) if an experiment (situation with premises and logical probability of a conclusion) is given many chances to occur, the relative frequency of successes will eventually match the probability; (3) in frequentist theory, there can be no known probability for any unique event (Bernie’s example is good), nor any number of events that are finite in number (a dogmatic statement, but all truth is dogmatic); (4) sometimes, like in Matt’s example of odd natural numbers, the relative frequency matches (at the limit only, as always) the probability; (5) for most arguments, and all real-life problems, probability does not match relative frequency.

    And because of (1), (6) an exact number value for probability cannot always be given. Again, Bernie’s example is good for this.

    Mike, I’ve no problems with limits and infinitesimals in math. I am a good friend to the law of large numbers (the primary point of confusion for frequentists). Of course, you know my feelings toward the normal (not good), so I’m even with you when you suspect too many Bayesians use the same tools as the frequentists (don’t forget that people always learn to be frequentist first, and only then can they opt for Bayesianism; it’s nearly impossible to leave all the old baggage behind).

    I’m more with Jaynes who said (I’m paraphrasing) to set up the problem in discrete, physically-measurable form, and only after it is all laid out do you take limits. Jaynes never took his own advice with the normal! (Sorry if this doesn’t make sense to everybody, but Mike will get it. Mike, in a week or two, I’ll have an interesting paper on this with my friend Russ Zaretzki.)

    MrCPhysics, I have a more technical paper (on my resume page, near the top) showing why you do not, and even can not, have premises which include statements of “symmetry” and “fairness.” Gist is: these make the arguments circular.

  24. Excellent!!!! Will read it and maybe even post it at my online library!

    If I get around to it. Been schlepping vegetables at the farmers market. Loaded the pottery kiln today and firing tomorrow. So many projects, so little time. And to top it off, the future is largely unknown. We schlep and we fire anyhow, always hoping that the future will be okay, even though, taking it all in logically and rationally, it probably won’t be eventually.

  25. Bernie

    Mike:
    One really doesn’t know, does one?
    🙂

  26. Luis Dias

    I’m still not satisfied. I’m sorry, for I fervently believe that it is surely my fault. But, taking Bernie’s case, if I, being an outsider, am asked what the chances are that his presentation succeeds, knowing nothing more than his track record, I must say that he has a 50% chance to score, though I wish him better luck than that!

    The reason is straightforward. I can’t see why I am wrong in saying this. If I have more information, things get a little complicated. As explained in a previous post by Mr Briggs, there is a lot of information that is subjective. So, if I am unable to rigorously quantify it, I’m in trouble. But the track record history is not subjective, at least if I know only the success/failure rate.

    Ain’t that so?

    I’m utterly confused. Help me out here.

  27. Luis Dias

    The thing is, I’m probably confused at the why. Why “there can be no known probability for any unique event”? What does that mean? When I say, there’s a fifty-fifty chance of this occurring, based on x, why am I wrong? My mind wanders in this mess. Surely, there is almost nothing in this world that doesn’t have subjective influences on our statistical inquiries, but if we put those down with pen and paper, we can at least objectively look at them.

    For instance, when someone claims that the chance of Kermit loving Miss Piggy is zero, one is already making huge claims that don’t really matter, like:

    a) They truly exist and have feelings. Okay, we can say that those feelings are emulated, so where are they? In the minds of the writers? Hmmm.

    But taking for granted that they exist,

    b) Aren’t we taking the action/state of mind of “love” as an eternal condition, where perhaps it’s but a temporal one? Perhaps in most episodes, Kermit hates her, but there is one of them that he gets a little drunk and “decides” he loves her.

    In this case, it gets to be a frequentist case, doesn’t it? Even if it doesn’t reach infinity, one could say that the chance that in a certain episode Kermit loves Miss Piggy is n%. But the sentence is already changed!

    So when you ask for the probability that Kermit loves Miss Piggy, what are you truly asking? Does it take only one episode of mutual love to declare that he loves her? 3 episodes? All of them?

    (Of course, if in no episode does Kermit love her, then it’s zero)

    Do you understand my frustration?

  28. George Crews

    Hi Briggs,

    Are there not two kinds of probability depending on what we want probability to measure? A property or a meta-property?

    Spin a roulette wheel at a Vegas casino a bunch of times and record the number of red, black, and green results. The frequencies are a property of that roulette wheel. And it allows for one kind of probability to be measured. Our belief that the roulette wheel is fair or not is a meta-property — it is a probability measure not of the wheel, but of us. It may or may not be based on frequencies (or other measurements or observations about the roulette wheel). That’s another kind of probability.

    I will admit that I do not find the argument that if you sample a property an infinite number of times it becomes a meta-property to be convincing. But what is wrong with having two distinct kinds of probability? Too confusing?

    George

  29. Bernie

    Luis:
    Perhaps I can help. First I would be more than happy to settle for a 50% chance at winning a proposal – but thank you for your good wishes!!

    I think the fundamental issue is that we tend to make a priori assumptions that we are talking about an essentially homogeneous set of events, i.e., all proposals and proposal situations are the same. The key thing is to begin to build a model that compares the current event, i.e., proposal/proposal situation, to all other events and then decide the extent to which this event is the same as past events. As you begin to unpack each event it becomes clear that the a priori assumption of similarity is pretty optimistic and very much more optimistic than, say, the similarity in functioning of a roulette wheel or dice or a coin. (This is where my earlier poorly articulated notion of “all other things being equal” comes in.)

    On a practical level and as a rule of thumb, I tell young consultants that even when we have the winning proposal, they should still assume that we have a 50% chance that we will actually get to do the work. This is one reason why builders/plumbers/doctors seldom show up when they say they are going to. We all tend to overbook!! Plus I am a natural skeptic and cynic. 😉

  30. Joy

    Luis:
    I love your example of Kermit getting drunk and viewing Miss Piggy through beer glasses! It made me smile.
    I am surely not the one to help you out here but I’m going to try anyway.
    I think the problem with the example is that there are too many dimensions of irrelevant information which tangles the mind. Consider the example:
    “Miss Piggy loves Kermit therefore Kermit loves football” might have caused less confusion. It would have illustrated that the event of a felt animal loving a contact sport, a unique event, could draw on no useful or relevant information that could be of any use in determining the probability. I suppose that was the point of showing that the information in the premise was not useful in determining what was unique. We all got caught up in the humour and on the unrequited love.
    My own confusion was also that I thought the three examples were showing different reasons where it could be shown that probability is not relative frequency. Add this to the fact that I thought Kermit adored Miss Piggy and she was indifferent, and I was confused by the “0” probability. I had to ask others to make sure.

    In Bernie’s example, if one only knows the information about the 50% success rate then this can be said to be the probability, working on the premise that this is the only information available. However it is clear that this approach leaves much to be desired in terms of likely truth of actual probability. So in that case, is counting past event outcomes in terms of frequency a true gauge of all factors that make up a probability? And isn’t Bernie’s proposal, as the others, a unique event anyway?
    With Uncle Bernie in the mix, 50/50 is hardly fair!
    Now I may be way off here but I am seeing relative frequency as a modality within a wider discipline of statistics that does not in itself characterise the entity of probability or reveal its true measure.

  31. Briggs

    All,

    I am busy working on a whole new sub-sub section, re-writing a lot of it. I’ll post it when finished.

    Let me say, though, that it is false that because any (contingent) event will or won’t happen, it therefore has a probability of 1/2. To say “the event will or won’t happen” is a tautology, meaning it is always true, and gives no useful or probative information.

    This material only takes a small part of the overall book, incidentally. It reminds me that I have never really shown the difference between parameter-inference and so-called predictive-inference. I never posted those final chapters online.

    The book is almost done!

  32. Bernie

    But Matt, when we actually get the work the consultant feels extra special, and when we don’t they don’t feel so bad. But, of course, you are right. The reality is as complex as the issue of whether we would win the proposal: Buyers completing a sale even after they have said they are interested in buying – hence “closing” as a distinct phase in the sales process.

  33. Luis Dias

    Well, thanks Joy and Bernie for the answers. I’ll wait for Briggs’ update in the matter for further info…

  34. Bernie

    Let me add one more real world example. Suppose you are a property and casualty insurance company and a client, who installs and sells propane gas appliances and who you have insured for 30 years without a claim, has an event where an installer made a mistake and creates a major loss. What should you do? Drop the risk or reinsure? How would you justify your decision?

    This raises an interesting broader question: Are successful Actuaries Bayesian or Frequentists?

  35. Luis Dias

    Bernie, that’s a hell of a question! Given the sheer number of variables, ranging from the quantitative and objective (the amount of loss, the value of insurance, etc) to the subjective (if the company drops this guy, will it backfire on it because clients won’t want an insurance company that does that?), I’d say, “don’t know”. There are a lot of people that think they do. Perhaps a given technique is more profitable than another, but only considering a rather limited time span. It could also depend on the market context. There’s an area that abounds in such examples: the US health care insurance system.

  36. Bernie

    Luis:
    The answer needs to involve, among many other things, an assessment of how insureds respond to negative events. Does an event lead to increased vigilance? If it does, then all other things being equal, the correct decision is to keep insuring this risk. “All other things being equal” is a critical assumption. If, for example, the insured’s business had recently had a change in management, training practices, hiring practices for technicians, etc., then a new risk assessment would be in order.
    I think that the recent Allstate commercial that “forgives” the first event suggests that heightened vigilance is a significant favorable risk factor for previously “good” risks.

    Joy:
    That’s what I would have thought, though I do not know much about actuaries — does actuarial training emphasize a Bayesian perspective? I would imagine that it should in order to deal with insuring singular events.
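
A rough sketch of point (2) in comment 23, using the essay’s own example of drawing a heart from a standard deck (logical probability 1 in 4). Everything here is an assumption made for illustration: the function name `relative_frequency_of_hearts`, the seed, the draw counts, and the choice to draw with replacement using Python’s pseudo-random generator. It is not taken from the book or from any commenter. It only shows the relative frequency drifting toward the probability as the number of draws grows; it says nothing about which is defined in terms of which.

```python
import random


def relative_frequency_of_hearts(n_draws, seed=0):
    """Draw n_draws cards (with replacement) from a 52-card deck and
    return the fraction of draws that came up hearts."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    suits = ["hearts", "diamonds", "clubs", "spades"]
    # 13 ranks in each of 4 suits: 52 cards, 13 of them hearts.
    deck = [(rank, suit) for suit in suits for rank in range(1, 14)]
    hearts = sum(1 for _ in range(n_draws) if rng.choice(deck)[1] == "hearts")
    return hearts / n_draws


if __name__ == "__main__":
    # Hypothetical draw counts, chosen only to show the trend.
    for n in (1, 10, 100, 10_000, 100_000):
        print(n, relative_frequency_of_hearts(n))
```

Note that with a single draw the relative frequency can only be 0 or 1, never 1/4, which is one way of seeing the complaint in point (3) about unique and finite runs of events.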
