William M. Briggs

Statistician to the Stars!

Answering A Critic On Sampling Variability

Alfred ‘Dominant Strategy’ (ADS) is confused that “William Briggs is confused on sampling variability”.

I wrote an article highlighting misconceptions and mistakes people make when thinking about sampling variability, and ADS kindly answered (I couldn’t discover the gentleman’s full name). In the spirit of peer review done rightly (openly, and not as a blunt instrument to suppress unpopular or misunderstood views), here is my answer to his rebuttal.

ADS says I think sample variability “is due to the assumption that the population follows some underlying probability distribution. That is not the case.” Well, I agree, and I don’t see how ADS missed my agreement.

It isn’t the case, i.e. it is false, that anything “follows” a probability distribution. As I say in the article and in another article which I linked to (repeated here), to say “variables” “follow probability distributions” or are “distributed as” this or that probability distribution is a common mistake and an error in ascribing cause.

It is our knowledge of the value of certain propositions that is quantified by probability distributions. Epistemology not ontology.

ADS and I agree we’re discussing a “population” of a fixed size about which we want to characterize the uncertainty of some measurement on each member of the population. As is typical, ADS gives a blizzard of math in place of simple words and speaks of “random samples” with “non-zero probability” for collecting samples from the population.

ADS agrees with me that if we knew the values of the thing for each member of the population, we’d be done. I state simply we’d know the values and therefore don’t need probability models. ADS says the average of the thing (across the population) “is not a random variable!” Although I didn’t say it in the sampling piece, I often say that “random” only means unknown, and variable means can take more than one value. So to say a thing is not a random variable is to say we know its value, which is much simpler, no?

ADS is only concerned with taking the average of the measurement across the population, whereas I talked about ascertaining the values of the measure for each individual, so my view was broader. But except for the unnecessary (mystical) language about randomness, there is no real divergence thus far.

I said we start with probative evidence about the thing of interest and use it to deduce a probability (model) to characterize the uncertainty in the unmeasured values of each member of the population, which if you like (cartoon) math is written:

     [A] Pr( Measure takes these values in the 300+ million citizens | Probative Evidence),

which can be converted to the following if we’re only interested in the mean across the population of the measure:

     [A’] Pr( Mean value of the measure of the 300+ million citizens | Probative Evidence).

This puts ADS and me on the same ground. Now suppose we have taken a sample of measurements, which we can and should use to give us:

     [B’] Pr( Mean value of the measure of all citizens | Observations & Probative Evidence).

And we’re done, because [B’] can be expanded to accommodate all the measurements we have (on the right hand side). Of course, [B’] doesn’t tell us the exact value of the mean (of the thing), but gives us the probability it takes whatever values we supply (e.g. Pr( Mean = 17.32 | Observations & Probative Evidence) = 0.02, etc., etc.).
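A minimal sketch of what [B’] can look like in practice, under loudly stated assumptions that are not in the original argument: the measure is a yes/no (0 or 1) value, the probative evidence is only that each member takes one of those two values (encoded here as a uniform Beta(1, 1) starting point, which yields a beta-binomial probability for the unmeasured members), and the population is 1,000 rather than 300+ million. The function name and all numbers are hypothetical.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_comb(n, k):
    """Log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def population_mean_probabilities(N, n, k, a0=1.0, b0=1.0):
    """Return {mean: probability} for the mean of a 0/1 measure over N members.

    n members were measured and k of them had the value 1. The remaining
    N - n members are unmeasured; the number of 1s among them is given a
    beta-binomial probability (an assumed model, not the only possible one).
    """
    M = N - n                         # members still unmeasured
    a, b = a0 + k, b0 + (n - k)       # evidence updated by the observations
    probs = {}
    for j in range(M + 1):            # j = number of 1s among the unmeasured
        log_p = log_comb(M, j) + log_beta(a + j, b + M - j) - log_beta(a, b)
        probs[(k + j) / N] = exp(log_p)
    return probs


# Hypothetical numbers: 1,000 members, 50 measured, 20 of those equal to 1.
probs = population_mean_probabilities(N=1000, n=50, k=20)
in_range = sum(p for mean, p in probs.items() if 0.30 <= mean <= 0.50)
print(f"Pr(0.30 <= mean <= 0.50 | Observations & Evidence) = {in_range:.3f}")

# If every member is measured there is nothing left to be uncertain about.
census = population_mean_probabilities(N=10, n=10, k=4)
print(census)
```

The result is a probability for every value the population mean could take, which is all [B’] promises. In the census case the whole probability sits on the single observed mean, matching the point that once every value is known no probability model is needed.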

ADS goes the classical route and speaks of the sample mean being an estimate of the population mean, and that we can calculate the “variance” of the sample mean, variability which he calls sampling error. Of course, the classical interpretation of the “confidence interval” which uses this variance is itself a problem (see this or the Classic Posts page).
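For readers who want to see the classical quantity concretely, here is a small sketch. It assumes simple random sampling without replacement (ADS’s setup may be more general) and a tiny, fully known, hypothetical population of six values, so that every possible sample of size three can be listed. The variance of the sample mean across all equally likely samples is compared with the textbook finite-population formula (1 - n/N) S²/n.

```python
from itertools import combinations
from statistics import mean, pvariance, variance

# Hypothetical, fully known population (say, incomes in thousands).
population = [20.0, 35.0, 50.0, 65.0, 80.0, 110.0]
N, n = len(population), 3

# Every possible sample of size n, each equally likely under simple
# random sampling without replacement.
sample_means = [sum(s) / n for s in combinations(population, n)]

# Variance of the sample mean across all possible samples.
exact_var = pvariance(sample_means)

# Textbook formula: finite-population correction times S^2 / n,
# where S^2 uses the N - 1 denominator.
formula_var = (1 - n / N) * variance(population) / n

print(f"population mean        = {mean(population):.2f}")
print(f"mean of sample means   = {mean(sample_means):.2f}")
print(f"variance over samples  = {exact_var:.4f}")
print(f"formula (1-n/N) S^2/n  = {formula_var:.4f}")
```

The two variances agree, and the only thing that varies here is which members happen to be picked: the population values themselves are fixed.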

The problem is we don’t care about the sample mean and some interval. We want [B’]. If we had to guess what the population mean was based on [B’], we could, but that’s a decision (a prediction!); the best guess would depend on what penalties we’d pay for being wrong and so forth. If we don’t need to decide, we fall back on [B’], which contains everything we know about the uncertain quantity given the totality of our evidence.
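To illustrate how the best guess depends on the penalties, here is a sketch with a made-up probability table standing in for [B’] and two made-up loss functions. The numbers, the function names, and the restriction of guesses to the listed values are all assumptions for illustration only.

```python
def best_guess(dist, loss):
    """Return the listed value that minimizes expected loss under dist.

    dist maps each possible value of the mean to its probability;
    loss(guess, truth) is the penalty paid for guessing `guess`
    when the mean turns out to be `truth`.
    """
    def expected_loss(guess):
        return sum(p * loss(guess, truth) for truth, p in dist.items())

    return min(sorted(dist), key=expected_loss)


# Hypothetical stand-in for [B']: probabilities of the population mean.
dist = {15.0: 0.1, 16.0: 0.2, 17.0: 0.4, 18.0: 0.2, 19.0: 0.1}


def absolute_loss(guess, truth):
    """Same penalty per unit of error in either direction."""
    return abs(guess - truth)


def costly_underestimate(guess, truth):
    """Guessing too low costs four times as much as guessing too high."""
    return 4 * (truth - guess) if guess < truth else guess - truth


print(best_guess(dist, absolute_loss))         # symmetric penalty
print(best_guess(dist, costly_underestimate))  # lopsided penalty
```

With the symmetric penalty the guess is 17.0; when underestimates cost four times as much, the same probabilities lead to a guess of 18.0. The evidence did not change, only the decision did.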

ADS says “William Briggs is confused because he mixes sampling error with statistical inference.” Rather, ADS is confused about the goal of measuring a sample. But his is a common mistake; indeed, his view is taught as the correct way to do things.

7 Comments

  1. “This puts ADS and I on the same ground.” Your enemies are attacking your grammar now. It’s “…ADS and me…”.

  2. Briggs

    March 17, 2015 at 8:47 am

    Gary,

    Grrrr.

  3. Statistics can be used with great success to feed populations. It can also be used to great effect to starve them. There is a place in between where the golden mean dances in delight.

  4. “speaks of “random samples” with “non-zero probability” for collecting samples”
    I’m confused. What does that mean?

  5. Rather than cartoon math, I’d like to see a concrete example. I’m not a native English speaker, and to begin with I find the expression “probative evidence” very confusing. My dictionary says “probative” means “furnishing or affording evidence”, so “probative evidence” is “evidence furnishing evidence”? And in what way is “probative evidence” different from observations? Is evidence not, or should it not be, based on observations? Or put more simply, where does this probative evidence come from, if (ultimately) not from observations? And exactly how does one deduce a probability (model) from probative evidence, given a lack of observations?
    BTW, none of my textbooks ever stated that “the population follows some underlying probability distribution”, although they did introduce the concept of “sampling distributions”.

  6. SteveBrooklineMA

    March 18, 2015 at 3:58 pm

    I don’t really understand why ADS says Y is not a random variable but “hat bar Y” is. Y is a function that takes each person in the population to some value, e.g. income. “hat bar Y” is a function that takes each possible sample to a value, e.g. mean income of the sampled population. What makes one random and the other not? In the special case when n=1, is there really a difference between “hat bar Y” and Y?

  7. Johan,
    Merriam-Webster has a more helpful definition:

    Definition of PROBATIVE
    1: serving to test or try : exploratory
    2: serving to prove : substantiating

    Related to PROBATIVE
    Synonyms
    confirmational, confirmatory, confirming, corroborating, corroboratory, corroborative, probatory, substantiating, supporting, supportive, verifying, vindicating

    Antonyms
    confuting, disproving, refuting

© 2016 William M. Briggs
