You go into the doc’s office. He’s not looking happy. He’s holding your chart.
“Mr Smith, I’m afraid I have some bad news for you.”
Give it to me straight, doc. I can take it.
“You have,” he says, pausing to look sadly down once more, hoping he isn’t seeing what he is seeing, “you have a nasty case of striated lustrations.”
Not that!
“I’m afraid so,” he answers. “The gomferted kind. The kind that can’t be taken care of with surgery. But it’s not as terrible as it sounds. There are drugs that can offer some hope.”
Like what?
“Well, there’s that old standby, aspirin.”
Anything else?
“Yes, there’s something new. Profital (Beezelbub™) from Luxurious Pharmaceuticals.”
Profital?
“You’ve seen the commercials. The one with the happy people walking in slow motion with big smiles on their faces? The one that says ‘Ask your doctor if Profital is right for you’?”
So? Is it right for me?
“That’s hard to say.”
I mean, is it better than aspirin?
“Better in what way?”
If I take it, there’s a better chance for a cure than aspirin, right?
“Well, I don’t know. What I can say is that Luxurious ran a scientific experiment comparing aspirin with Profital. They gave one group of folks aspirin and another group Profital. Then they measured improvement of the gomferted striated lustrations.”
And?
“Then they ran a sophisticated statistical analysis and got a P-value of 0.0412.”
A what-now?
“A P-value of 0.0412.”
What’s that supposed to mean? That Profital is better?
“Well, we can’t say that. But what we can say is that a P-value of 0.0412 is statistically significant.”
And this statistically significant means Profital is better?
“Not quite. It means that the result was not due to chance.”
So not due to chance means that Profital is superior to aspirin?
“It’s more complicated than that.”
In what way? Just what does “statistically significant” mean then?
“It means that in the analysis they did, a P-value less than 0.05 was found.”
Wait. So a P-value less than 0.05 means Profital is better than aspirin?
“Again, it’s more subtle than that.”
Well, explain it to me. Tell me exactly what this P-value is. I’ve heard of them before, when I’ve read those articles that always begin “Studies show.” Tell me what a P-value means in the context of this new drug. After all, I have to decide whether I should take it, right?
“Yes, you do.”
Okay, then what’s the precise meaning of this P-value here, for me today? And how can it help me decide whether I should take this new drug?
“All right. In this statistical analysis I told you about, they calculated something called a statistic. Technically it’s a mathematical function of the data they collected on patients, and some other things having to do with the specific type of analysis they did. With me so far?”
Do I need to see this math?
“Not really. It’s not necessary to understand.”
Carry on, then.
“What they do next is ask what would happen if they repeated the same experiment they just conducted. They imagine they did it exactly precisely the same way: same number of patients in both groups, same dosages, same days of the week the meds were given, same number of men and women, same patient weights, and so on. Precisely the same.”
I got it.
“Except that this repetition is randomly different.”
Randomly different?
“It means everything is exactly the same, but changed in ways no one can predict.”
So not exactly the same, then. Could be different in all kinds of ways nobody knows about, yes?
“I suppose so. But it’s not that important because this experiment is never done. It’s just imagined that it’s done.”
Okaaaay.
“And in this imaginary experiment, they imagine they do the same analysis. And in this imaginary repeated analysis, they calculate that statistic again. You follow?”
Not really, but keep going.
“What they do then is to imagine they do a third experiment, just like the second. And then a fourth, then a fifth and so on. In fact, they imagine an infinity of future experiments.”
That’s a lot.
“And the P-value is the probability that one of these imaginary statistics is larger, in absolute value, than the one they actually calculated in the real experiment. There. Does that help?”
No. What does that have to do with the chance the new drug is better?
“Not a damn thing.”
Then why don’t they just calculate the chance the new drug is better and tell me that? That’s what I really need to make an informed decision.
“Nobody knows. When you ask they always say ‘P-values have some uses’.”
Give me the aspirin.
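For readers who want to see the doctor’s recipe as a procedure, here is a minimal Python sketch. It is an illustration under stated assumptions, not the analysis any real trial ran: the improvement scores are made up, the statistic is simply the difference in group means, and the “imagined repetitions” are produced by shuffling the group labels, which builds in the assumption that the two drugs do not differ (the standard definition of a P-value carries that assumption too). The function names are hypothetical.

```python
import random
import statistics


def mean_difference(group_a, group_b):
    """The 'statistic': a function of the data (here, a difference in means)."""
    return statistics.mean(group_a) - statistics.mean(group_b)


def simulated_p_value(group_a, group_b, repetitions=10_000, seed=0):
    """Fraction of imagined experiments whose statistic is at least as large,
    in absolute value, as the one actually observed.

    Assumption: each imagined repetition is made by shuffling the pooled
    scores and re-splitting them, i.e. by pretending the drug label
    makes no difference to the outcome.
    """
    rng = random.Random(seed)
    observed = abs(mean_difference(group_a, group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    at_least_as_large = 0
    for _ in range(repetitions):
        rng.shuffle(pooled)  # one imagined repetition
        imagined = abs(mean_difference(pooled[:n_a], pooled[n_a:]))
        if imagined >= observed:
            at_least_as_large += 1
    # A finite number of repetitions stands in for the imagined infinity.
    return at_least_as_large / repetitions


# Made-up improvement scores, for illustration only.
aspirin = [2.1, 3.4, 1.8, 2.9, 3.0, 2.2, 2.7, 3.1]
profital = [3.2, 3.9, 2.8, 3.6, 4.1, 2.5, 3.3, 3.8]

p_value = simulated_p_value(profital, aspirin)
print(f"simulated P-value: {p_value:.4f}")
```

Notice what the number is: the fraction of imagined experiments whose statistic came out at least as large as the real one. Nowhere does the calculation produce the chance that Profital beats aspirin, which is the doctor’s point.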
About 50 years ago, computerized operational optimization began to gain a foothold in “Europe’s” economy and administration; then, 25 years later, all the failed “statistics” projects (academics claimed their “imaginary experiment” was more comfortable) were relabeled as skilled labor shortages; today (while the range and depth of practical trades and professions has thinned, time and again, in favor of degrees) it’s mini- and midi-jobs for retirees and all the other forced low-income earners.
I love the smell of roasted P values in the morning.
“They gave one group of folks aspirin and another group Profital.”
Wishful thinking
Not reality
The double-blind RCTs are run against a sugar pill, not a generic drug.
If there are not two RCT results better than a placebo, the data can be mined to get there, or the failed tests can be ignored as long as there were two “good tests” among all the RCTs.
Or look ONLY at the subset from age 25 to age 55, for whom the new drug works slightly better than a placebo.
Of course few people over age 55, or relatively unhealthy people, are used for RCTs because they are more prone to adverse side effects from large doses that would not bother younger, healthier test subjects. After FDA approval most people using the drug are likely to be OVER age 55, including some very unhealthy people.
If your newly approved drug
— Does not work better than aspirin
— Costs 1000x more than aspirin
— Has known short term adverse side effects worse than aspirin
— Has unknown long term adverse side effects
Then it barely matters what statistics you used
The FDA will approve your new drug anyway.
Then doctors will soon be using it off label, such as for people over age 55, or for children, who were never tested in the RCTs … or for off-label purposes. No problem unless it was ivermectin, a generic drug that used off label actually worked well to keep Covid patients out of hospitals.
Laughing out loud. I can see myself giving that same talk, though shorter and in different words. I’m also likely to tell that patient to use aspirin, provided aspirin isn’t too harmful for him. “Are you allergic to aspirin?” “No”, then take that. “Yes”, take the alternative but let me know if you have bad after effects.
Makes me consider whether I might as well quit taking all the prescription crap I’ve been assigned by lazy, disinterested, and uninterested doctors over the past couple of decades. Probably wouldn’t make a damned bit of difference to my health.
You’re kidding me, right?
You finally did it. You gave me an explanation of p values that I understand without math and can explain to someone else.
I suppose there is an important philosophical lesson here. Even though it must be intensely boring to you, the fact that you keep repeating the same lesson in different ways means that sometimes you succeed, even if it is only one person at a time.
Rob,
This is my goal. Thank you.
Matt, thank you for this tireless, continued attempt to get people to understand the uselessness of this thing called a wee P. Yet most continue to miss the point because they are loath to have something not be “scientific”. “But they work for casinos!” Sure, by accident mostly, it seems…