Answering A Critic On Sampling Variability

Alfred ‘Dominant Strategy’ (ADS) is confused that “William Briggs is confused on sampling variability”.

I wrote an article highlighting misconceptions and mistakes people make when thinking about sampling variability, and ADS kindly answered (I couldn’t discover the gentleman’s full name). In the spirit of peer review done rightly (openly, and not as a blunt instrument to suppress unpopular or misunderstood views), here is my answer to his rebuttal.

ADS says I think sample variability “is due to the assumption that the population follows some underlying probability distribution. That is not the case.” Well, I agree, and I don’t see how ADS missed my agreement.

It isn’t the case, i.e. it is false, that anything “follows” a probability distribution. As I say in the article and in another article which I linked to (repeated here), to say “variables” “follow probability distributions” or are “distributed as” this or that probability distribution is a common mistake and an error in ascribing cause.

It is our knowledge of the value of certain propositions that is quantified by probability distributions. Epistemology not ontology.

ADS and I agree we’re discussing a “population” of a fixed size about which we want to characterize the uncertainty of some measurement on each member of the population. As is typical, ADS gives a blizzard of math in place of simple words and speaks of “random samples” with “non-zero probability” for collecting samples from the population.

ADS agrees with me that if we knew the values of the thing for each member of the population, we’d be done. I state simply we’d know the values and therefore don’t need probability models. ADS says the average of the thing (across the population) “is not a random variable!” Although I didn’t say it in the sampling piece, I often say that “random” only means unknown, and variable means can take more than one value. So to say a thing is not a random variable is to say we know its value, which is much simpler, no?

ADS is only concerned with taking the average of the measurement across the population, whereas I talked about ascertaining the values of the measure for each individual, so my view was broader. But except for the unnecessary (mystical) language about randomness, there is no real divergence thus far.

I said we start with probative evidence about the thing of interest and use it to deduce a probability (model) to characterize the uncertainty in the unmeasured values of each member of the population, which if you like (cartoon) math is written:

[A] Pr( Measure takes these values in the 300+ million citizens | Probative Evidence),

which can be converted to the following if we’re only interested in the mean across the population of the measure:

[A’] Pr( Mean value of the measure of the 300+ million citizens | Probative Evidence).

This puts ADS and me on the same ground. Now suppose we have taken a sample of measurements, which we can and should use to give us:

[B’] Pr( Mean value of the measure of all citizens | Observations & Probative Evidence).

And we’re done, because [B’] can be expanded to accommodate all the measurements we have (on the right hand side). Of course, [B’] doesn’t tell us the exact value of the mean (of the thing), but gives us the probability it takes whatever values we supply (e.g. Pr( Mean = 17.32 | Observations & Probative Evidence) = 0.02, etc., etc.).
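For readers who want something more concrete than cartoon math, here is a minimal sketch of what computing [B’] could look like in one deliberately simple case. Everything specific in it is an assumption made for illustration and is not taken from the article or from ADS: the measure is yes/no (scored 1 or 0), the population is tiny, the probative evidence is taken to say only that every possible total from 0 to N is equally probable before measuring, and the sample is treated as n members drawn without replacement with no connection between who was measured and their values. The function name `prob_population_mean` is hypothetical.

```python
from fractions import Fraction
from math import comb


def prob_population_mean(N, n, k):
    """Pr(population mean = m | sample, evidence) for a yes/no measure.

    Assumptions (illustrative only): N members, each scored 0 or 1; before
    any measurement every total from 0 to N is equally probable; n members
    were measured and k of them scored 1.
    Returns a dict mapping each possible mean to its exact probability.
    """
    unmeasured = N - n
    denom = comb(N + 1, n + 1)  # normalizing constant for the weights below
    dist = {}
    for j in range(unmeasured + 1):  # j = number of 1s among the unmeasured
        weight = comb(j + k, k) * comb(unmeasured - j + n - k, n - k)
        dist[Fraction(k + j, N)] = Fraction(weight, denom)
    return dist


# Ten members, four measured, one of the four scored 1.
b_prime = prob_population_mean(N=10, n=4, k=1)
for mean, prob in sorted(b_prime.items()):
    print(f"Pr(mean = {mean} | observations, evidence) = {prob}")
```

Under these assumptions, measuring every member (n = N) leaves a single possible mean with probability 1, which matches the point that once all the values are known no probability model is needed. Different probative evidence would lead to a different [B’].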

ADS goes the classical route and speaks of the sample mean being an estimate of the population mean, and that we can calculate the “variance” of the sample mean, variability which he calls sampling error. Of course, the classical interpretation of the “confidence interval” which uses this variance is itself a problem (see this or the Classic Posts page).

The problem is we don’t care about the sample mean and some interval. We want [B’]. If we had to guess what the population mean was based on [B’], we could, but that’s a decision (a prediction!); the best guess would depend on what penalties we’d pay for being wrong and so forth. If we don’t need to decide, we fall back on [B’], which contains everything we know about the uncertain quantity given the totality of our evidence.
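The dependence of the best guess on the penalties can be sketched in a few lines. The probabilities below are made-up numbers standing in for some [B’], and the two penalty functions are hypothetical; the only point is that the same [B’] can lead to different guesses once the cost of being wrong changes.

```python
from fractions import Fraction as F

# Hypothetical [B']: made-up probabilities for three possible population means.
b_prime = {F(1, 10): F(3, 5), F(3, 10): F(1, 5), F(9, 10): F(1, 5)}


def expected_loss(guess, dist, loss):
    """Average penalty for a guess, weighted by the probabilities in dist."""
    return sum(p * loss(guess, mean) for mean, p in dist.items())


def best_guess(dist, loss):
    """Guess with the smallest expected penalty.

    Simplification: only values that dist gives positive probability are
    considered as candidate guesses.
    """
    return min(sorted(dist), key=lambda g: expected_loss(g, dist, loss))


def symmetric(guess, mean):
    # Penalty is the size of the miss, in either direction.
    return abs(guess - mean)


def costly_underestimate(guess, mean):
    # Guessing too low is penalized five times as heavily as guessing too high.
    return 5 * (mean - guess) if guess < mean else guess - mean


print(best_guess(b_prime, symmetric))             # 1/10
print(best_guess(b_prime, costly_underestimate))  # 9/10
```

Nothing about [B’] changed between the two calls; only the penalties did.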

ADS says “William Briggs is confused because he mixes sampling error with statistical inference.” Rather, ADS is confused about the goal of measuring a sample. But his is a common mistake; indeed, his view is taught as the correct way to do things.

7 Comments

Gary

March 17, 2015, 8:31 am

“This puts ADS and I on the same ground.” Your enemies are attacking your grammar now. It’s “…ADS and me…”.

Briggs

March 17, 2015, 8:47 am

Gary,

Grrrr.

brad tittle

March 17, 2015, 11:22 am

Statistics can be used with great success to feed populations. It can also be used to great effect to starve them. There is a place in between where the golden mean dances in delight.

Ray

March 17, 2015, 3:19 pm

“speaks of “random samples” with “non-zero probability” for collecting samples ”
I’m confused. What does that mean?

Johan

March 17, 2015, 6:52 pm

Rather than cartoon math, I’d like to see a concrete example. I’m not a native English speaker, and to begin with I find the expression “probative evidence” very confusing. My dictionary says “probative” means “furnishing or affording evidence”, so “probative evidence” is “evidence furnishing evidence” ? And in what way is “probative evidence” different from observations ? Is evidence not or should it not be based on observations ? Or put more simply, where does this probative evidence come from, if (ultimately) not from observations? And exactly how does one deduce a probability (model) from probative evidence, given a lack of observations ?
BTW, none of my textbooks ever stated that “the population follows some underlying probability distribution”, although they did introduce the concept of “sampling distributions”.

SteveBrooklineMA

March 18, 2015, 3:58 pm

I don’t really understand why ADS says Y is not a random variable but “hat bar Y” is. Y is a function that takes each person in the population to some value, e.g. income. “hat bar Y” is a function that takes each possible sample to a value, e.g. mean income of the sampled population. What makes one random and the other not? In the special case when n=1, is there really a difference between “hat bar Y” and Y?

Bert Walker

March 19, 2015, 1:00 am

Johan,
Merriam-Webster has a more helpful definition:

Definition of PROBATIVE
1: serving to test or try : exploratory
2: serving to prove : substantiating

Examples of PROBATIVE

Related to PROBATIVE
Synonyms
confirmational, confirmatory, confirming, corroborating, corroboratory, corroborative, probatory, substantiating, supporting, supportive, verifying, vindicating

Antonyms
confuting, disproving, refuting
