From Ernst:
I was wondering if you could help me as a layman grasp a problem I’m wrestling with. Say I have a large tray and pour a gallon of white paint in it. Then I ‘toss’ a quart of black paint on top of it. The end result looks like a work of modern art. Assume the white paint represents patients who won’t have heart attacks and black those that will. Imagine now we are color blind to complete the analogy. What sampling logic can I use to ensure ‘randomization’ for a clinical trial using the spilled paint as an analogy?
First, randomization is of no interest, except in those cases where we worry about a person trying to fool himself, or others. This is why we have referees flip coins at sporting events, because it is presumed the results cannot be controlled. But we could just as easily have the referee decide without a coin. After all, we presume the referee is fair.
We fear doctors aren’t so fair, particularly when it comes to investigating their hot new treatment ideas. Like I’ve said a million times: all scientists agree confirmation bias exists, they just think it always happens to the other guy. Hence randomization, which removes causal control of evidence selection.
Randomization is asked for in classical statistics, especially frequentism, because it is believed probability is ontic, and thus the randomization is adding something to an observation. If you select a randomization systematically, and not randomly, the observation has been “blessed”, as it were, and somehow it counts less. I am in agreement with Judea Pearl on this subject: cause is primary, not randomness.
None of that is very systematically thought out, but then neither is much in the philosophical aspects of frequentism. We can blame this on the math, which is too easy and beautiful.
Second, how to sample your paint? Well, to what end? Since all the black paints will have heart attacks, presumably we can’t save them, but we might like to find them to sell them cushions they can carry around for when they eventually keel over.
By definition, blacks are all over the place with no pattern that we can predict—and we can’t predict because we can’t identify all the causes of their dispersion. We know one part of the cause, the tossing, but the other aspects are a mystery. So we begin by believing the blacks would be anywhere. We do have the idea that 1 in 4 people (if we consider paint comprised as individual dots) are blacks.
If you’re going to now sample, you have to have a sampling mechanism, by which I mean you have to cause people to come into your net. On the assumption blacks could be anywhere, then it does not matter what you do: just grab people as they walk by your door. One in four, on average—and here we could compute this distribution exactly (which distribution changes dynamically, since we know the population size)—will be black.
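One reading of that "distribution which changes dynamically" is the hypergeometric: draws without replacement from a population of known size. A minimal sketch under that assumption follows. The population of 1,000 dots and the sample of 20 are hypothetical numbers chosen only for illustration, and the black count uses the 1-in-4 figure from the text.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(k, population, blacks, n):
    """P(exactly k blacks in a sample of n drawn without replacement)."""
    if k < 0 or k > n or k > blacks or n - k > population - blacks:
        return Fraction(0)
    return Fraction(comb(blacks, k) * comb(population - blacks, n - k),
                    comb(population, n))

def prob_next_is_black(population, blacks, seen, seen_black):
    """Chance the next person grabbed is black, given those already sampled."""
    return Fraction(blacks - seen_black, population - seen)

# Hypothetical numbers, for illustration only.
POPULATION = 1000
BLACKS = 250   # the 1-in-4 figure used in the text
SAMPLE = 20

# Exact distribution of the number of blacks in the sample.
pmf = [hypergeom_pmf(k, POPULATION, BLACKS, SAMPLE) for k in range(SAMPLE + 1)]
expected_blacks = sum(k * p for k, p in enumerate(pmf))

# The chance for the next draw shifts as people are removed from the pool.
before_any = prob_next_is_black(POPULATION, BLACKS, seen=0, seen_black=0)
after_four_blacks = prob_next_is_black(POPULATION, BLACKS, seen=4, seen_black=4)
```

The second function is the "dynamic" part: every person taken out of the pool changes the chance for the next one, which is why the calculation needs the population size.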
But then there might be the idea that people are clustered together, in the sense that threads of blackness run through the connected blobs of white. Yet because we don’t know the structure, or the cause of the structure, of these isolated contiguous groups, it’s the same as believing the blacks could be anywhere.
Think of it this way (another example I use). If I tell you the evidence is “We have a 6-sided device that must be activated and upon each activation it can take one of 6 states only, labeled 1-6”, the probability of “6” is 1/6. But change the evidence to “We have a 6-sided device that must be activated and upon each activation it can take one of 6 states only, with some sides possibly more frequent, labeled 1-6”, and the probability of “6” is still 1/6, because we don’t have any information in the new evidence that allows us to change our probability.
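One way to formalize that claim, offered as a sketch and not as the argument of the post itself: suppose the device has some bias, but the evidence says nothing about which labels the bias favors. Averaging over every way the bias could be attached to the six labels gives 1/6 for any label, whatever the bias is. The bias vector below is hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def prob_of_label(bias, label_index):
    """Average chance of one label over every assignment of the bias to labels."""
    assignments = list(permutations(bias))
    total = sum(assignment[label_index] for assignment in assignments)
    return total / len(assignments)

# Hypothetical lopsided device: one side comes up half the time,
# but we are not told which side that is.
bias = [Fraction(1, 2), Fraction(1, 10), Fraction(1, 10),
        Fraction(1, 10), Fraction(1, 10), Fraction(1, 10)]

p_six = prob_of_label(bias, 5)   # index 5 is the label "6"
```

Swapping in any other bias vector that sums to 1 leaves the answer unchanged. It takes information tying the bias to particular labels to move the probability.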
We can’t push that analogy any farther, though. For the second set of evidence leads to a prediction question after the first point (first activation) is sampled. That’s not the same with the paint because we are not trying to predict geographically.
There is no reason to do any geographic sampling, because here we know some of that initial cause of blackness. By assumption, again, blacks could be anywhere. So we don’t need to be careful about setting up some kind of grid, or blocks, and sampling within these. Unless we do know, like in the picture heading the post. But this implies we know something of the cause of the placement of the blacks. If we do have that, then certainly we can use it.
Gist: if you can’t tell black from white by looking, then just grab whoever comes by, until you’ve collected enough evidence to be confident the predictions you make with your probability model will have skill.
Dip a brush in yellow paint. Shake it over the paint pan with the black and white sample. Examine every dot of yellow, to see if it has been lightened or darkened.
“We do have the idea that 1 in 4 people (if we consider paint comprised as individual dots) are blacks.” We start with 1 gallon. We add a quart. We now have 5 quarts. I think that you probably mean 1 in 5?
Paint is mixable. Bulk sampling methods apply.
Quote (Ed Schrock): “Trouble comes in bunches, so take your samples in bunches.”
One reason for randomizing trials is to ensure that the sequence of trials does not inadvertently match some extraneous factor. For example, a production supervisor wanted to test the effect of raising the ingredient ratio on the yield of a batch process. His initial impulse was to run a batch at the ‘high’ ratio and compare it to the previous batch [at the normal ratio]. I pointed out that, using the normal ratio, batch yields had fluctuated considerably in the past, so the H batch might be higher or lower for reasons having nothing to do with the ingredient ratio. So we used the average of four batches. Then he wanted to run NNNN followed by HHHH. But I said, what if some extraneous factor smites the process after the fourth batch? You’ll think it was changing the material ratio that caused it. He said, “What’s the chance of that happening?” I said, “I don’t know. If I did I would be less worried.” Since he also wanted to test a low ratio (if ratio affected yield, you would expect L to have the opposite effect to H), he proposed NHLNHLNHLNHL, thus getting a mean of four for each level.
Unless there were a natural cycle in the process. So we decided on four bunches, with one N, H, L in each bunch, and the order mixed up within each bunch.
In the actual test run, yields went in the toilet on the fourth and fifth runs, then gradually recovered. Since both the initial high yields and subsequent low yields included L, H, and N batches, the ingredient ratio was not a likely cause. Further investigation discovered the raw amyline from the second tank car had been contaminated with water due to poor filtration and this spoiled the reaction.
Nothing so gladdens the heart of production as the prospect of blaming a vendor for his problems. The vendor was producing the amyline as a byproduct of a petrochemical operation and had paid little attention to controlling it. Informed of the problem, however, he actually took action to prevent recurrence, and the intermittent drops in yield went away.
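The "four bunches, order mixed up within each" plan can be sketched in a few lines. This is only an illustration of the idea: the function name and the seed are hypothetical, and the order it prints is not the order used in the actual test.

```python
import random

def blocked_run_order(levels=("N", "H", "L"), blocks=4, seed=None):
    """Return a run order: each bunch holds every level once, in shuffled order."""
    rng = random.Random(seed)
    order = []
    for _ in range(blocks):
        block = list(levels)
        rng.shuffle(block)   # mix up the order inside this bunch only
        order.append(block)
    return order

plan = blocked_run_order(seed=1)   # seed fixed only so the plan is repeatable
for i, block in enumerate(plan, start=1):
    print(f"bunch {i}: {''.join(block)}")
```

Each level still gets four runs, but no level is tied to a fixed position in the sequence, so neither a one-time upset nor a regular cycle lines up with one ratio.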
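The worry about running NNNN followed by HHHH can be made concrete with made-up numbers. In this sketch the ratio has no effect at all, and a hypothetical disturbance knocks 20 points off every batch after the fourth. The baseline of 100, the size of the drop, and the mixed order are all assumptions for illustration, not data from the test described here.

```python
from statistics import mean

def level_means(order, effect, shift_after, shift):
    """Mean yield per level when a disturbance hits every run after `shift_after`."""
    yields = {}
    for run, level in enumerate(order, start=1):
        disturbance = shift if run > shift_after else 0
        yields.setdefault(level, []).append(100 + effect[level] + disturbance)
    return {level: mean(values) for level, values in yields.items()}

# Hypothetical: the ratio does nothing; the process drops 20 after run 4.
no_effect = {"N": 0, "H": 0}

sequential = level_means("NNNNHHHH", no_effect, shift_after=4, shift=-20)
mixed = level_means("NHHNHNNH", no_effect, shift_after=4, shift=-20)

spurious_gap = sequential["H"] - sequential["N"]   # the drop is blamed on H
mixed_gap = mixed["H"] - mixed["N"]                # the drop hits both levels alike
```

In the sequential order the whole drop lands on the H batches and looks like a ratio effect. In the mixed order both levels absorb it equally, so the comparison between them is undisturbed even though the overall yield fell.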