From our very own JMJ, who asked this in response to an announcement of my new book (read this first):
Briggs, how many people have you encountered who need these clarifications? I mean, just how many people misunderstand this stuff? You taught this stuff, so I guess you’ve seen how a lot of people see it. Is this a real intellectual problem out there among math students? Or are you seeing something that’s not there?
For instance, when most people say that something happened by chance, do you think they’re trying to say that they believe that a thing called chance actually made it happen? When someone says they see a trend in the numbers, do you think they’re trying to impart a belief that somehow the numbers themselves are creating a trend?
JMJ
Excellent questions, all. Everybody who uses a hypothesis test, whether by p-values or Bayes factors, needs these clarifications and misunderstands the purpose of standard tests; these misunderstandings are a real intellectual problem. They are not math problems (everybody’s math is accurate) but philosophical problems.
And some people, not all, really do believe that chance is causative. More people, not all, really do believe numbers, i.e. models, are creating trends. The remaining people who do not really believe, but use the old methods, fail to recognize that the methods they use logically imply chance is a cause and that the numbers/models are a cause. So in that sense, everybody gets it wrong.
Speaking only of observable models, what we want is this: Pr(Y | X, old data, assumptions). This is the probability of some observable Y given premises X (which may be multifaceted), old observations (which we might not have), and other assumptions (these usually include ad hoc statements like “I will use a normal”).
I say: to communicate models use Pr(Y | X, old data, assumptions)!
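A minimal sketch of this, for a yes/no observable, assuming exchangeable 0/1 observations and an ad hoc uniform prior (the data are invented for illustration):

```python
# A hypothetical sketch: Pr(Y | X, old data, assumptions) for a yes/no observable.
# Assumptions: exchangeable 0/1 observations, and a uniform "prior" as the ad hoc premise.
# Under these premises the predictive probability has a closed form,
# (successes + 1) / (n + 2): Laplace's rule of succession.

old_data = [1, 0, 1, 1, 0, 1, 1, 1]  # invented past observations of Y

successes = sum(old_data)
n = len(old_data)

# Pr(next Y = 1 | old data, uniform-prior assumption)
pr_next = (successes + 1) / (n + 2)
print(f"Pr(Y=1 | old data, assumptions) = {pr_next:.3f}")  # 0.700 here
```

The point is that the answer is a direct probability of the observable itself, given stated premises, with no test and no parameter left dangling.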
Not too exciting an answer, right? I don’t think it is, either. But, except for a trivial minority, it isn’t what anybody does.
Instead, people will calculate a p-value. This takes the assumptions, which include ad hoc statements about unobservable model parameters, plus functions of the old data, and then calculates the probability that these functions would exceed some value if the “experiment” which produced the old data were repeated infinitely, assuming the parameters equal some fixed value.
If the p-value is less than the magic number, people think that one of the Xs they picked was the cause of the data. Maybe not the cause of all the data, but of a “significant” portion of it. If the p-value is greater than the magic number, they say X was not a cause but “chance” or “randomness” was.
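A hedged sketch of what that machinery computes in practice (the data are invented; a standard two-sample t-test stands in for “the test”):

```python
# A sketch of what the p-value machinery actually computes (example data are invented).
from scipy import stats

group_a = [3.1, 2.9, 3.4, 3.2, 3.0, 3.3]  # invented measurements
group_b = [2.8, 2.7, 3.0, 2.9, 2.6, 2.8]

t_stat, p_value = stats.ttest_ind(group_a, group_b)

# p_value is Pr(|T| >= |t_stat| | normality, equal variances, means truly equal,
# infinite repetitions of the "experiment"). It is NOT Pr(hypothesis | data),
# and it says nothing directly about Pr(Y | X, old data, assumptions).
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```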
Some might not think they are saying these things, but they are in fact saying them by implication. To prove that is not difficult, but it takes more than 800 words, so I’ll leave it for the book. There are hints in these two papers: “The Crisis Of Evidence: Why Probability And Statistics Cannot Discover Cause” and “The Third Way Of Probability & Statistics: Beyond Testing and Estimation To Importance, Relevance, and Skill”.
It’s no better using Bayes factors, because these are also statements about (functions of) the parameters, though with only some of them at fixed values. The same fallacy about cause is there. The same forgetfulness about Pr(Y | X, old data, assumptions), which is all anybody ever wants to know but which the classical methods ignore (except in rare cases), is also there. This is why there are no “good” uses of p-values or Bayes factors. If you want the probability, go to the probability and skip the substitutes!
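For two simple point hypotheses the Bayes factor reduces to a likelihood ratio; a toy sketch, with invented data and ad hoc parameter values, shows it remains a statement about parameters rather than about any observable:

```python
# Toy Bayes factor for two point hypotheses about a normal mean (invented example).
import numpy as np
from scipy.stats import norm

data = np.array([0.8, 1.2, 0.9, 1.1, 1.0])  # invented observations; sigma assumed 1

# Likelihoods under H0: mu = 0 and H1: mu = 1 (both ad hoc parameter statements)
like_h0 = np.prod(norm.pdf(data, loc=0.0, scale=1.0))
like_h1 = np.prod(norm.pdf(data, loc=1.0, scale=1.0))

bf_10 = like_h1 / like_h0
# The Bayes factor compares fixed parameter values; it is not the predictive
# probability Pr(Y | X, old data, assumptions) of any observable.
print(f"BF(H1 vs H0) = {bf_10:.1f}")
```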
The Deadly Sin of Reification includes any kind of smoothing, or estimation of parameters, where the probability model is taken for reality. Since probability is not a cause, and neither is chance or randomness, the plotting of probability models over real data often (always?) leads people to think “what really happened” was the model and not the data. What really happened, or what’s really going on, are the data. If we knew the cause or causes of the data, we wouldn’t need the probability models. Why would we?
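A small illustration of the sin, assuming nothing beyond invented noise: smooth some data and notice which object invites the phrase “what really happened”:

```python
# A toy illustration of reification via smoothing (data invented for the example).
import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(size=120).cumsum()  # the data: what actually happened

window = 15
smooth = np.convolve(y, np.ones(window) / window, mode="valid")  # the model's output

# "smooth" is a creation of the model, not an observation. Treating the smooth
# curve (or the fitted parameters behind it) as "what really happened" is the
# reification warned about above; only y was ever observed.
```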
Vast over-certainty is produced here. Again, I prove it in the book. But here are some hints in old posts (one, two, three).
Understanding cause is our fundamental goal in science. With probability, we don’t need cause but can make predictions, which is the fundamental goal in engineering. So I don’t advocate dumping probability; I do say we should use it properly.
Yes, people believe this stuff. Personal injury lawyers make billions off it. Cigarettes do not cause cancer. They may contribute, but they cannot be the only cause. Yet nearly everyone believes that less than 10% of a group getting an illness means “causality”. People will rearrange statistics or change the subject to “smoking is bad” rather than admit smoking is not the only cause of cancer. That is complete faith in a belief. Saying otherwise gets one labeled a “science denier”. Once the requirement of 100% correlation for cause is dropped, virtually anything can be a “cause” based on a p-value that originally was meant to show probability. Yes, people believe in imaginary causes based on p-values and megastudies. Why? Personal injury lawyers love it because it makes them rich. Politicians love it because it allows them to pass laws. Environmentalists love it because they can outlaw whatever they don’t like. Yes, most people are believers. Numbers equal power.
Real example:
Several hundred college students indicated on a survey whether or not they studied with or tutored other students. When the responses were analyzed with respect to their overall grade point average for the semester, those who interacted academically with other students earned better grades. The p-value was wee.
But studying together doesn’t necessarily cause better grades. It could be that students who earn better grades are just more gregarious (or answer surveys differently). The best I can say is that the two things are associated, at least for these students. I might predict that there’s an effect, but I would need to design a more rigorous experiment to have more confidence in it.
Of course, we’re assuming good grades mean real learning and that the survey responses are uniform and accurate, but skip that for now. By observation, group study is thought to be beneficial, so the statistical analysis may be superfluous. However, since much time and resources are devoted to promoting it, some degree of certainty about it is worth knowing.
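In the spirit of Pr(Y | X, old data, assumptions), one hedged way to quantify the association (the GPA numbers here are invented, and the normal model is exactly the sort of ad hoc assumption flagged above):

```python
# A hedged sketch of the predictive question for this example. GPAs are invented;
# the normal model is an ad hoc premise, not a fact about the students.
import numpy as np
from scipy.stats import norm

gpa_group = np.array([3.4, 3.1, 3.6, 3.2, 3.5, 3.3])  # hypothetical "studied together"
gpa_solo = np.array([3.0, 2.9, 3.2, 3.1, 2.8, 3.0])   # hypothetical "studied alone"

mu_g, sd_g = gpa_group.mean(), gpa_group.std(ddof=1)
mu_s, sd_s = gpa_solo.mean(), gpa_solo.std(ddof=1)

# Pr(new group-study GPA > new solo GPA | old data, normal assumption):
# the difference of two independent normals is normal. (Plug-in estimates are
# used here; a fuller treatment would carry the parameter uncertainty through.)
pr = 1 - norm.cdf(0, loc=mu_g - mu_s, scale=np.sqrt(sd_g**2 + sd_s**2))
print(f"Pr(Y_group > Y_solo | data, assumptions) = {pr:.2f}")
# Note: this is a statement of association and prediction, not of cause.
```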
So questions for Briggs:
1. Is there anything more I can say about the results of the example that avoids ascribing to Chance and Reifying the data?
2. What level of certainty do I have that studying together actually helps?
3. What minimal conditions would I need to include in an experiment to be reasonably sure that advising students to study together is good advice?
Your enemies strike again.
“and neither it chance or randomness”
Practically, what I read here is an explanation of small-l liberal/small-c conservative thinking. For instance, when I plan an order for some soft drinks to sell at a restaurant, I would look at reports of past sales, contemporaneous situations (maybe there’s a parade that day, or road construction, or rain), and what I have on hand and how long it will last. I then apply liberal or conservative thinking. I think, “Hmm, I’ll get some more just to make sure I don’t run out,” or “Hmm, I think I’ll estimate on the low end; I don’t want to throw out product.”
Recently, someone I know was diagnosed with a terrible illness. The doctor estimated the worst outcome for her, giving her the old “you’ve got (x-much) long to live” spiel. She got a biopsy back just the other day that showed it was a much better situation. I can’t stand that stupid old “you’ve got (x-much) long to live” spiel. That doctor scared the living heck out of her. You’ve got to wonder how many “terminally ill” people have died from the sheer stress of being told they are “terminally ill.”
Students should always be reminded of their own subjectivity so as not to misuse statistics… as much. There’s a time and a place for conservative or liberal thinking, for created models and arbitrary or illusory algorithms, and everything else under the sun for that matter. You never know what may turn out to be useful, and it may not be what you thought at all. But you can’t call that science, just blithe subjectivity.
JMJ
Oh, and thank you for those great answers. Bayes comes in handy with computer animation, but then I don’t have to worry about my animated character suffering heart disease!
JMJ
Here’s another question for Briggs or anyone else interested or knowledgeable. (I have my own ideas, but want to see what others think.) Most of you are probably aware of the anthropic principle, that physical constants and laws are prescribed within narrow limits (“fine-tuned”) to enable carbon-based life to exist. (Given acronyms ranging from WAP, the Weak Anthropic Principle, to CRAP, the Completely Ridiculous Anthropic Principle, depending on interpretations.) Ok, now you see people making arguments about probability chains for constants and laws, coming up with really low probabilities, wee-wee p-values, and thus proving design. What is wrong with making those assumptions about probabilities for constants and laws (again, I have my own ideas, but I’d like to hear from others)?
Danke Schoen, Merci, Gracias.
It seems to me that’s just looking for proof from arbitrary numbers. The likelihood of life is a lot trickier a question than the likelihood of the defiance of known physical laws. You can test gravity, but without some kind of control-Earths, examining our own likelihood is hard to do. I suppose we’ll find out if we ever manage to explore far out into the stars.
JMJ
I suppose that you’d like to hear from anyone but me, Bob, but I can hardly resist the invitation to vent my pet hate of Materialism.
The execrable Julian Huxley, selling “Evolution”, gave the probability of a simple protein being formed by accident (some 1 in 10 to the thousandth power). “It’s impossible,” he said, “but it has happened, because here we are”! Scientific method?!
Modern nonscience is firmly entrenched as a rationalisation of the ideological presumptions of Materialism… anything that supports (or at least does not confront or contradict) Materialistic assumptions is called “science”; everything else is “religious bigotry”.
Mathemagicians of every shade are a peculiar version of Scribes that seem to imagine that a thing is true just because they imply it is… or that it is untrue because they imply it isn’t. Purveyors of nonsense are much more cunning these days as they are careful not to state anything that might be realistically challenged, confining themselves to vague insinuations and specious evasions that are designed to lead their prey to the appropriate “self-determined” conclusions.
I will contend that a natural probability of (essentially) nil is an indicator that the existence of the thing (or process) is not an accident with no cause or purpose.
W’m Briggs,
here you are again implying (by scorning “reification”) that a “thing” that can’t be empirically measured is, ipso facto, not a “thing”… a “no-thing”, perhaps, that’s just a conceptual amusement for Neanderthal pre-Materialists.
I have a visceral dislike for absurdities such as Materialism and Empiricism.
probability of a simple protein being formed by accident (some 1 in 10 to the thousandth power)
Likely achieved by faulty reasoning. P(A with B with C | whatever) is lower than P(A with B | …) if all three were to be joined at the same time, but P(an existing A-B being joined by C | …) is just the probability of C joining on its own (assuming these are equal).
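One way to formalize the point, by the chain rule (the notation is assumed, not the commenter’s):

```latex
% Chain rule for the joint probability of three joining events, given evidence E:
\Pr(A, B, C \mid E) = \Pr(A \mid E)\,\Pr(B \mid A, E)\,\Pr(C \mid A, B, E)
```

The all-at-once joint probability can be astronomically small while the final stepwise factor Pr(C | A, B, E), the chance of C joining once A and B already exist, is unremarkable; quoting only the joint number hides that distinction.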
absurdities such as Materialism and Empiricism
Though they are far better alternatives to drawing conclusions from assumptions where the only check for validity is that they seem reasonable.
Bob,
Nothing really wrong with being anthropic as long as it is made clear it is a potentially limited viewpoint. Too often we get statements like “Life can only exist under conditions like ours”. A bit shortsighted. There isn’t much value in things like the Drake equation except perhaps to set a lower bound to questions like “Are we alone?”. If the Drake equation has any validity at all, the answer is: most likely no.
If people say something “happened by chance” to me, I see it merely as an expression of not knowing, and possibly of uncontrollable conditions, depending on the context. No evidence to support my view. It’s not important. So it has never occurred to me to ask people whether they actually believe chance causes something to happen.
Probability is not a cause.
Statistics has no mercy.
Mathematics is not a sport.
A hammer doesn’t pound nails into wood.
.
.
.
Love is not just a word.
.
.
.
Don’t try to steer a river.
I think I am getting Chopra-esque… so spiritually and philosophically deep.
How does one take a probability model for reality? What is the cause of data? Whoever collects the data!? Right, if we knew the causes of the phenomena under investigation, we wouldn’t need probability models. The problem is that we often don’t know for certain.
BTW, one can plot a probability density or mass function with one argument for given parameter values. It’s hard, if not impossible, to plot a probability model over data, even in the simplistic setting of a simple linear regression model. Why? Because a probability model (function) usually involves unknowns and several arguments. What’s been plotted is usually the estimated probability model or the estimated systematic component of a model. A very important concept!
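A sketch of that distinction, with invented data: the curve one would draw over a histogram is the estimated model, its unknowns filled in with estimates:

```python
# Sketch: the curve drawn "over the data" is an *estimated* model, not the data.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=200)  # invented observations

mu_hat, sd_hat = data.mean(), data.std(ddof=1)   # estimates replacing the unknowns

xs = np.linspace(data.min(), data.max(), 100)
density_hat = norm.pdf(xs, loc=mu_hat, scale=sd_hat)
# Plotting density_hat over a histogram of `data` shows the estimated probability
# model, the systematic summary, and not the observations themselves.
```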