The Crisis Of Evidence: Why Probability And Statistics Cannot Discover Cause. New Paper

A PM2.5 Storm

Cancer of the albondigas is horrifyingly under-diagnosed. See your doctor today and ask him if Profitizol is right for you.

Today’s post, in a way, is at Arxiv: The Crisis Of Evidence: Why Probability And Statistics Cannot Discover Cause. Here’s the abstract (the official one has two typos, meaning my enemies are gaining in power and scope!):

Probability models are only useful at explaining the uncertainty of what we do not know, and should never be used to say what we already know. Probability and statistical models are useless at discerning cause. Classical statistical procedures, in both their frequentist and Bayesian implementations, falsely imply they can speak about cause. No hypothesis test, or Bayes factor, should ever be used again. Even assuming we know the cause or partial cause for some set of observations, reporting via relative risk exaggerates the certainty we have in the future, often by a lot. This over-certainty is made much worse when parametric and not predictive methods are used. Unfortunately, predictive methods are rarely used; and even when they are, cause must still be an assumption, meaning (again) certainty in our scientific pronouncements is too high.

I use PM2.5 (particulate matter 2.5 microns or smaller, i.e. dust) as a running example, since it is one of the EPA’s favorite things to regulate. I’ll be giving a version of this paper at this weekend’s Doctors for Disaster Preparedness conference in LA. I’ll concentrate more on the PM2.5 angle there, naturally, but I will and must hit the primary focus, which is that probability cannot discover cause.

“But, Briggs, isn’t all of statistics designed around discovering what causes what? Isn’t that what hypothesis tests and Bayes factors do?”

This is true: this is what people think statistics can do. And they are wrong. We bring knowledge of cause to data; we don’t get cause from data. Not directly. Understanding cause is something that is above or beyond any set of data. To understand that, you’ll have to read the paper, a mere 21 pages. If you think you have understood my argument from this post alone, you will be wrong.

An out-of-context-ish quotation (in the low- or no-PM2.5 group of 1,000, 5 people got cancer of the albondigas; in the “some” or high-PM2.5 group of 1,000, 15 did):

There is no indication in the data that high levels of PM2.5 cause cancer of the albondigas. If high levels did cause cancer, then why didn’t every one of the 1,000 folks in the high group develop it? Think about that question. If high PM2.5 really is a cause—and recall we’re supposing every individual in the high group had the same exposure—then it should have made each person sick. Unless it was prevented from doing so by some other thing or things. And that is the most we can believe. High PM2.5 cannot be a complete cause: it may be necessary, but it cannot be sufficient. And it needn’t be a cause at all. The data we have is perfectly consistent with some other thing or things, unmeasured by us, causing every case of cancer. And this is so even if all 1,000 individuals in the high group had cancer.

This is true for every hypothesis test; that is, every set of data. The proposed mechanism is either always an efficient cause, though it sometimes may be blocked or missing some “key” (other secondary causes or catalysts), or it is never a cause. There is no in-between. Always-or-never a cause is tautological, meaning there is no information added to the problem by saying the proposed mechanism might be a cause. From that we deduce a proposed cause, absent knowledge of essence (to be described in a moment), said or believed to be a cause based on some function of the data, is always a prejudice, conceit, or guess. Because our knowledge is only that the proposed cause might be always (albeit possibly sometimes blocked) or never an efficient cause, and because this knowledge is tautological, we cannot find a probability that the proposed cause is a cause.

Consider also that the cause of the cancer could not have been high PM2.5 in the low group, because, of course, the 5 people there who developed cancer were not exposed to high PM2.5 as a possible cause. Therefore, their cause or causes must have been different if high PM2.5 is a cause. But since we don’t know if high PM2.5 is a cause, we cannot know whether whatever caused the cancers in the low group didn’t also cause the cancers in the high group. Recall that there may have been as many as 20 different causes. Once again we have concluded that nothing in the plain observations is of any help in deciding what is or isn’t a cause.
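To make the numbers concrete, here is a minimal sketch (mine, not the paper’s) contrasting the observed relative risk with simple predictive probabilities for the 5-in-1,000 versus 15-in-1,000 example. The uniform-prior (Laplace) predictive rule is an illustrative assumption, not the paper’s exact method:

```python
# Illustrative sketch: relative risk vs. a simple predictive probability
# for the running example (5 of 1,000 in the low group, 15 of 1,000 in the high).
# The (k + 1) / (n + 2) rule below assumes a uniform prior; it is used here
# only to show the flavor of a predictive statement, not the paper's calculation.

low_cases, low_n = 5, 1000
high_cases, high_n = 15, 1000

# Observed relative risk: ratio of the two sample rates.
relative_risk = (high_cases / high_n) / (low_cases / low_n)

# Predictive probability that the *next* person in each group gets cancer.
pred_low = (low_cases + 1) / (low_n + 2)
pred_high = (high_cases + 1) / (high_n + 2)

print(f"Relative risk (observed): {relative_risk:.1f}")    # 3.0
print(f"Predictive Pr(cancer | high): {pred_high:.4f}")    # ~0.0160
print(f"Predictive Pr(cancer | low):  {pred_low:.4f}")     # ~0.0060
# Note: nothing in either number says PM2.5 caused any of the cancers.
```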

There are some papers by Jerrett mentioned in the body. See this article, and links therein, for more details.

Update: Typo-corrected version uploaded! Same link as above.

51 Comments

  1. Briggs–

    Being a relative newcomer to your site, I have become addicted to your posts and the responses from your commentators. This post of yours has me nodding at my screen because it hits so close to home.

    Taking your advice, I read your full paper. I particularly like the quotes and references to Hume, and your quote “… return to the old view [of causation], the Aristotelian view, the view most physical scientists actually hold, or used to, when they come to think of causality. Although this is a huge and difficult subject, we can boil it down to essence.”

    As an individual who has conducted clinical trials, and studied data, I am keenly aware of the need to return to first-cause principles, what I like to call the “atomic level” or essence of causation. I abhor indices. Indices are all over medicine. They tell nothing of cause but imply effect or outcome based on value, which is most often unit-less (dimensionless). That said, they imply cause based on a priori patterns of typical or measured behavior. Yet they are not foolproof. Some are based on probabilistic views of cause and effect.

    My attitude in the study of observations is this: review the base data, determine or identify the surrounding physical processes that may have led to the observations, and then develop the hypothesis on the basis of the physics (or chemistry). Test the hypothesis with a controlled experiment, and assess the validity from there. I do not use probability as causal because, frankly, it is physically meaningless.

    Human beings tend to see patterns. Patterns in data cause us to infer cause and effect. This can be useful, but it can also be dangerous, particularly when we incorrectly infer which is the cause and which is the effect, or when some other cause results in two variables being incorrectly associated as cause and effect of one another (they may be causal, or they may both be the effect of some other cause).

    So, bravo on this post.

  2. Briggs

    John Z,

    Thanks for those kind words.

  3. Briggs: My thoughts precisely when I hear mesothelioma is “caused” by asbestos. Virtually every person in the US over 50 has been exposed to asbestos. Yet there are only 3000 or so cases of mesothelioma diagnosed per year. How can asbestos be the cause? Answer: Because those who produced asbestos had lots and lots of money and that money can go to personal injury lawyers and “victims” of that evil asbestos industry. Causality is now a matter of the money available and how bad the “cause” can be made to look. If you hit the cancer lottery and get mesothelioma instead of say, pancreatic cancer, you won’t live very long in either case, but your relatives can get a pay day in the former case. Sorry, pancreatic cancer has no such fund, so you lose.

    John Z: Agreed that humans see patterns and then jump to cause (patterns may predict without actually identifying cause). This is very true in the idea of “cancer clusters” where a group of people living close to each other all develop cancer. The immediate thought is something environmental caused the cancer. Yet, while the pattern of a large number of cancers is noted, other more important patterns may be automatically shut out and the actual cause missed because we concentrated on the “likely” cause. It becomes tunnel vision. Cancer studies are very prone to this. Researchers seem to look only at the environment much of the time, while genetics and other possible contributors or causes are ignored.

  4. Gary

    Good paper, even if your enemies, like little foxes spoiling the grapes, speckle it with typos. Maybe you can hire a proof-reader?

    Section 4. The Essence and Power of Cause — this is a call for elevating theory, you know, which I thought you weren’t too keen on. But you quickly remedy this in Section 5 by requiring theory to be verified with observational evidence, so OK.

    The title is a bit curious. The crisis really is more with certainty than evidence, no?

  5. Sheri–

    Concur.

    Let me see if I can correctly recount this particular story highlighted by Edward Tufte. I suppose a “classic” example of determining root cause from patterns is that of the physician John Snow and his identification of the cause of the cholera outbreak in London back in the 1850s. He had identified the concentration of individuals who contracted cholera around one particular well. The removal of its handle resulted in the cessation of the outbreak (owing to the inability to continue to draw water from the well).

    Of particular note, the employees of a brewery, which sat only yards from the well, did not contract cholera, principally because they took their libation from the brewery (alcohol in the beer killed the bacteria), plus the brewery had its own well.

  6. Briggs

    Gary,

    “Maybe you can hire a proof-reader?” Using what for money?

  7. MattS

    “Using what for money?”

    Clam shells?

  8. James

    “Using what for money?”

    Hopefully the money I throw at you for your book. Is this paper a chapter in it? It should be!

  9. Briggs

    James,

    Yep. This steals from the book in a couple of spots.

  10. DAV

    Hmmm … from the paper’s title one might think you believe Judea Pearl is wrong.

    Seems to me, even if you come to think X causes Y by mere reason, how would you go about verifying it? One way is experimentation, but that is just seeing how Y is related to other variables (Pearl’s “do”) in addition to X — Pr(Y | X, do). I say one way, but it’s really the ONLY way. “Do” may already be available, in which case experimentation is unnecessary, but the idea is still the same: using the dependencies and independencies between variables — IOW statistics.

    Yes, statistics used incorrectly, particularly classical hypothesis testing, is worthless in determining causal relationships but (disregarding the philosophical position that NO TOOL can discover cause) it’s quite a jump to imply it’s generally worthless in determining cause.
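As an aside, a toy simulation can make the Pr(Y | X) versus Pr(Y | do(X)) distinction concrete. The confounded data-generating process below is invented purely for illustration; it is not a model from the paper or from Pearl:

```python
# Toy simulation of conditioning vs. intervening (Pearl's "do").
# A hidden common cause Z drives both X and Y, so Pr(Y=1 | X=1) looks large
# even though X has no causal effect on Y at all in this made-up model.
import random

random.seed(1)

def observe():
    z = random.random() < 0.5                    # hidden common cause
    x = random.random() < (0.9 if z else 0.1)    # X depends on Z
    y = random.random() < (0.9 if z else 0.1)    # Y depends on Z only, not X
    return x, y

def intervene(x_forced):
    z = random.random() < 0.5
    y = random.random() < (0.9 if z else 0.1)    # forcing X cannot change Y
    return x_forced, y

N = 100_000
obs = [observe() for _ in range(N)]
p_y_given_x = sum(y for x, y in obs if x) / sum(x for x, _ in obs)

do = [intervene(True) for _ in range(N)]
p_y_do_x = sum(y for _, y in do) / N

print(f"Pr(Y | X=1)     ~ {p_y_given_x:.2f}")   # ~0.82: association via Z
print(f"Pr(Y | do(X=1)) ~ {p_y_do_x:.2f}")      # ~0.50: no causal effect
```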

  11. Briggs

    DAV,

    Yes, I say Judea Pearl is wrong. I have more in the book answering his arguments.

    It is also false that “no tool” can discover cause. We discover causes all the time. We do use observations, of course, but not in the functional way implied by hypothesis tests and Bayes factors. The way we discover cause is by understanding essence and powers, as I said, and we do that via induction, which is an analogical term.

    Now to explain that requires some time—which I take in the book.

  12. DAV

    Briggs,

    I’d be interested in an overview of why you think Pearl is wrong. I don’t really want to wait for the book. “Understanding essence and powers” sounds like hippy talk, but then some hippies were quite good at discovering “causes”.

    As for “it is also false that ‘no tool’ can discover cause”: I agree. Pearl’s algorithm discovers causes in a mechanical way; however, I was referring to the philosophical position that it’s the operator and not the tool doing the discovery.

  13. John B()

    Wednesday, 27 July

    Using statistics, what is the cause?

    Aristotelian-wise, what is the cause?

  14. DAV

    A necessary condition for Wednesday, 27 July was (perhaps still is) Tuesday, 26 July.

  15. Briggs

    DAV,

    I’ll get to an induction post. Maybe next week? I’ll be away from the computer starting Friday for a few days. Meanwhile, the references on essence are a good place to start.

  16. DAV

    Course, that depends on the year. Not this one apparently,

  17. DAV

    Briggs,

    That would be great.

  18. Briggs

    DAV,

    Of course, I more than agree that a check of the assumption of cause must be made. That’s the predictive approach. Those final plots are predictions, and should be used to verify the assumption of PM2.5 as a possible cause. Like I say in the paper, they’re still not proof.

  19. John B()

    DAV said:
    A necessary condition for Wednesday, 27 July was (perhaps still is) Tuesday, 26 July. Course, that depends on the year. Not this one apparently,

    Unfortunately Tuesday was the 27th, as was Monday…

  20. Gary

    “Maybe you can hire a proof-reader?” Using what for money?

    OK, some of your friends here probably (p<.05) will do it for free. You have our emails and we can sign confidentiality agreements. More than one typo in a published paper looks sloppy and suggests the reasoning may be suspect.

  21. DAV

    John B()

    Indeed; in fact, every day of the week has been July 27th at one time or another.
    I’m moving rather slowly today and am not seeing your point.

  22. Briggs

    Gary,

    I submitted a new version to Arxiv correcting the typos. Should go live soon. Will update this page when it does. This version ought to flummox my enemies.

    Let me know if you want to proofread the book. Of course, I still have to finish it…

  23. John B()

    DAV
    I’m moving rather slowly today and am not seeing your point.

    A while ago, Briggs started putting a “status” message at the top of the post, to tell us regulars what he’s up to each day.
    Some days he’d forget to update it.
    Many days, he’d tell us what was going down and fail to update the “day” and “date”.
    Since Monday, he had updated the day of the week without updating the “number date” … You didn’t know what I was referring to, but he did … the status now reads properly: Wednesday, 29 July

    One of his status messages was “I’m beginning to regret this status message” and then he provided the wrong day/date with that message.

  24. Ken

    The ‘Crisis of Evidence’ paper is one of those kinds of papers that ought not be necessary… people ought to know better than to jump to conclusions about a correlation & impute that a cause-effect relationship exists. But as P.T. Barnum noted, there’s a sucker born every minute, so such reminders will, unfortunately, remain necessary.

    Given the less than rigorously attentive nature of people, it seems the ‘Crisis of Evidence’ paper starts out with basically the same fundamental flaw [of “jumping to conclusions,” to put an un-fine spin on it] that it purports to remedy where it states:

    “Suppose, too, it turns out 5 people in the low group developed cancer of the albondigas, and that 15 folks in the high group contracted the same dread disease. What is the probability that more people in the high group had cancer?

    “If you answered anything other than 1, you probably had advanced training in classical statistics. Given our observations, 1 is the right answer.”

    One cannot say the answer is one (“1”) because in doing so one is jumping to conclusions & making the same sort of error the paper is addressing:

    In conducting testing, sampling is crucial. The evidence presented only states that, within the sample, a very particular form of cancer was observed to subsequently develop. Nothing is stated about whether the persons comprising the sample were cancer-free prior to sampling, only that they lacked this one particular type of cancer.

    Absent info to the contrary one cannot assume that those in the lower cancer-of-the-albondigas group don’t have more of some other cancers than the higher-cancer-of-the-albondigas group. That is, one cannot assume the experimenters pre-screened broadly for cancer as opposed to focusing their pre-screen for the albondigas cancer.

    The concluding statement in the paper imputes that the 1,000 people who were exposed were cancer-free prior to exposure. The summary presented does not state this, so that conclusion cannot be certain.

    If the paper presented the key sentence as follows, all would be well:

    “Suppose we learned that 1,000 CANCER-FREE people were “exposed” to …”

    Did those conducting the study actually verify this (did they set up the experiment properly)? One would hope, but dumber things have occurred.

    Seems to me the paper should begin with a brief description of the experiment, to indicate that whatever data was obtained has some credibility, and THEN proceed to discuss how the analysis of the measures taken may be reasonably accepted & what to watch out for from there on. That can be done briefly, and in so doing can communicate any number of weaknesses & limitations overtly or by obvious implication.

    I believe this is a very substantial distinction to make; consider, from the following remarks elsewhere in the paper, how significant the data is:

    “The data we have is perfectly consistent with some other thing or things, unmeasured by us, causing every case of cancer.”

    “From that we deduce a proposed cause, absent knowledge of essence (to be described in a moment), said or believed to be a cause based on some function of the data, is always a prejudice, conceit, or guess. ”

    What is not addressed, and what remains a significant source of error/overreaching, is seeing things in the data that aren’t there — those imputed values that may seem obvious, but because of the experimental setup cannot be assumed. Prejudice, conceit or guesswork are bona-fide issues, but every bit as fundamental is understanding how much faith can be put in the data itself, derived as it is from the experimental setup.

    If nothing else, a brief statement to the effect that the analysis presented assumes, for the sake of discussion, that the data was credibly obtained from a reputable experimental setup would help. Simply stating that avoids a lot of discussion that would be distracting, but also conveys that some other very basic issues need to be understood, not assumed, before building an analysis… and from there justifying & undertaking some remedial course of action.

  25. Briggs

    Ken,

    “One cannot say the answer is one (‘1’) because in doing so one is jumping to conclusions & making the same sort of error the paper is addressing:”

    No. Given we have seen 15 in the high and 5 in the low, the probability that there were 15 in the high and 5 in the low is 1. And no other number.

  26. Great.

    I had albondigas last night. Now I find out I’m going to get cancer.

  27. Ramspace

    I, for one, would be heppy to halp out, and I do poof-reading professionally.

    Did that on purpose . . . honest!

  28. Oregon Guy: Don’t worry. The latency period is quite long. You’ve got time to take up sky diving and anything else on your bucket list.

    (A note on this whole causality thing: I found it interesting that if one gets tongue cancer and there are no known risk factors—like smoking, drinking, etc—and thus nothing to blame as the cause, it is treated far more aggressively than if there were risk factors. I never really considered that the lack of a cause could very seriously affect treatment for cancer.)

  29. James

    Briggs,

    I still really like the paper, but I do have one comment/question for you about an area of experimentation that you didn’t cover.

    In engineering (among other fields), experiments are done with control groups and structured changes in inputs. For example, the exact same wind tunnel and wind tunnel model will be tested, varying only the angle of attack. If I took repeated (noisy) measurements at each of two angles, would this be an occasion where hypothesis testing would be appropriate? I ask because I think a case might be made that, since we only changed 1 variable, we can be incredibly certain that any differences in the measurement groups are a function of measurement error and the angle.

    I would think that assigning causation to the angle would be acceptable. Although, I can see an argument being made that we already know from physical observation that different angles cause different forces. Perhaps we really aren’t asking “what caused the change”, but a more mundane “how much was the change?”.

    Short version: In experiments on systems with controlled inputs, is it acceptable to say things like “X is causally important?”.

    I think the key difference is that in things like the epidemiological fallacy, nothing is actually controlled by the experimenter. When things are well controlled by the experimenter, we are (are we?) able to ascribe causation, at least in the sense that we know that our one change might not be the only cause, but it was the ‘first cause’ in whatever happened in our experiment to lead to the measurement (and that’s usually what is of interest).

  30. Briggs

    James,

    “Perhaps we really aren’t asking ‘what caused the change’, but a more mundane ‘how much was the change?'”

    If you assume you have controlled all possible causes but one, which is varied, then it must be that the changes are caused by this variation. And then it becomes, just as you say, a question of how much change. That’s when verification comes in. You predict then verify, which of course engineers do naturally.

    What I’m asking is for scientists to do the same. Return to the old way of testing claims before asserting they are true.
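A minimal sketch of the “predict then verify” workflow being described, using made-up wind-tunnel-style numbers; the linear force-vs-angle relation, the noise level, and all figures are illustrative assumptions rather than anything from the discussion above:

```python
# Illustrative "predict then verify" loop for a controlled experiment:
# fit a simple model on a first batch of measurements, state predictions
# for the next batch, then score those predictions against the new data.
# All numbers and the linear force-vs-angle relation are invented for the sketch.
import random

random.seed(0)

def measure(angle_deg, n=10):
    # Pretend true response: force grows linearly with angle, plus noise.
    return [2.0 * angle_deg + random.gauss(0, 1.0) for _ in range(n)]

# Step 1: collect data at two angles and fit the simplest possible model.
a1, a2 = 5.0, 10.0
run1 = {a: measure(a) for a in (a1, a2)}
fit = {a: sum(v) / len(v) for a, v in run1.items()}   # per-angle means

# Step 2: state the predictions *before* the next run.
predictions = dict(fit)

# Step 3: verify against fresh measurements from a second run.
run2 = {a: measure(a) for a in (a1, a2)}
for a in (a1, a2):
    observed = sum(run2[a]) / len(run2[a])
    print(f"angle {a:>4}: predicted {predictions[a]:6.2f}, observed {observed:6.2f}")
```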

  31. Ray

    It’s even worse than you think. The EPA claims that exposure to second hand tobacco smoke causes lung cancer, but they didn’t measure any exposure. People were asked about their exposure. According to the EPA you can determine causality by taking a poll and doing a statistical analysis of the answers. They are the new haruspices.

  32. Katie

    Predict, then verify. Almost sounds like something President Reagan might have said.

  33. Bumble

    WMB: I liked your paper. Here are a few thoughts.

    1. I’m not convinced that failing to reject a hypothesis amounts to accepting it, or even that people commonly do this. I’m not defending the NHST method, just taking it as an example. If I obtained a high p value, I would not conclude that the null hypothesis was true, merely that it was consistent with the evidence. I would remain undecided as to whether it was true.

    2. Would I be right in thinking that your reference to “brutal hard work” means that you are only likely to accept evidence of causation in cases where the subject matter can be subjected to controlled testing? Preferably under laboratory conditions? This is highly optimistic: we typically cannot experiment with people in this way, and some such testing would be unethical. This requirement would rule out the possibility of making causal judgements in huge areas of social sciences and medicine. Are you hard-headed enough to say: that’s too bad?

    3. In a discussion of causation in the context of epidemiology, I would expect to see a reference to the Bradford Hill criteria.

    4. I would advise avoiding reference to climate change in articles about evidence, cause and probability. When many readers see a reference to that, a mist descends and rational thought makes an exit. It is preferable to use neutral and uncontroversial examples wherever possible.

  34. JH

    Mr. Briggs,
    The p-value in “Step 4 calculates: …” on page 3 is not correctly defined.

    Regarding the figures, binomial probability is not appropriate for modeling the probability of the number of people having cancer in a group of exposed (or non-exposed) people in a city. The following two conditions of a binomial experiment are not met – independence of people and the same probability of having cancer for all individuals.

    “Remember: the Lord Himself assured us the relative risk of 2 is correct.

    “In the exposed group, the chance at least one person gets cancer is one minus the chance nobody does, or 33%; and the chance at least one person gets it in the non-exposed group is thus 18%. The risk ratio is now 1.8, a big change from 2! But how can this be when God Himself told us the relative risk is 2? That’s because the ‘2’ is an abstract number.”

    How can this be? Can it be that you have calculated relative risks for different events? Note that the probability of getting cancer is not the same as the probability that at least one person gets it.
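For readers checking the arithmetic being discussed, the quoted 33%, 18%, and 1.8 figures can be reproduced if one assumes, say, a 1% per-person base risk, a per-person relative risk of exactly 2, and groups of 20; those particular inputs are my assumption, inferred only from the quoted outputs:

```python
# Sketch reproducing the quoted 33% / 18% / 1.8 figures.
# Assumed inputs (not stated in the quote): base risk 1%, relative risk 2,
# and 20 people per group; only the outputs 33%, 18%, and 1.8 appear above.
p_unexposed = 0.01
p_exposed = 2 * p_unexposed        # per-person relative risk of exactly 2
n = 20

at_least_one_exposed = 1 - (1 - p_exposed) ** n       # ~0.33
at_least_one_unexposed = 1 - (1 - p_unexposed) ** n   # ~0.18

print(f"Pr(at least one case | exposed):   {at_least_one_exposed:.2f}")
print(f"Pr(at least one case | unexposed): {at_least_one_unexposed:.2f}")
print(f"Ratio: {at_least_one_exposed / at_least_one_unexposed:.1f}")  # ~1.8
# The per-person relative risk is 2 by construction; the ratio of the
# "at least one case" probabilities is a different quantity, hence 1.8.
```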

    I would suggest that you cut the crap (Yes, crap) about frequentists and classical statistics. Too lazy to type the reasons for my suggestion.

  35. Briggs

    Bumble,

    True, some people still think that the cause is there even though the p-value is not wee. But notice that they are then not using the statistical procedure to decide what the cause was.

    Control is the way to go—when it’s possible. Can’t do that with climate models, for instance. Nor with wondering whether PM2.5 is a carcinogen in humans. Like you say: no experiments.

    3. Well, I suppose. But I don’t go very much into how to understand cause here, just some teasing hints. My goal was to prove that classical stats can’t get you there.

    You’re probably right. But this paper is sort of a speech, too. I refrain in the book from any mention of global warming, unless a reference slipped in that I can’t recall.

    JH!

    The test doesn’t matter. Pick your favorite and all the conclusions remain the same. Probability and statistics can’t discover cause, as I showed.

    How can it be? Well, binomials are easy to work with. I’m sure if you plugged the numbers in you’d get the same results. The larger point is not so much these calculations, of course, but that the model has to be used in the predictive sense. So I picked LA because CARB did. And applied the model not in some abstract sense of risk ratios, but in terms of how many sick people we can really expect, given we believe our model, etc.

    Classical statistics does not emphasize this procedure—you’ll never find it in any introductory book. It is what should be used as a replacement for hypothesis testing/Bayes factors, which should never be used again for the reasons I stated.

    Like I’ve shown many times, using p-values leads to formal fallacies. See the classic posts page.

  36. JH

    Mr. Briggs,

    I doubt any competent statistician would say that a probability model alone can discover or establish cause. If you are to point out some misuse of statistics by practitioners, you should make that clear. Your attempt to trash classical statistics (and frequentists) and p-values because somehow YOU THINK they imply “cause” won’t fly among statisticians.

    Are you saying that over-certainty is a formal fallacy? What is over-certainty? Do you mean over-confidence? Is it quantifiable? If yes, what is the objective yardstick for “being over”?

    YOU THINK you have shown that using p-values leads to formal fallacies. Well, D. Draper did show by simulations in a published paper that using p-values tends to result in more incorrect conclusions than a Bayesian method does. But we know simulation results depend heavily on the model used in the simulations. One can probably show otherwise by changing the models.

    Examples of the crap:

    Still, the error is really only a major problem for the classical (frequentist or Bayesian) statistician who uses the hypothesis test or Bayes factor. He will be incredibly over-certain in his pronouncements. But the predictive statistician does not suffer these faults.

    You can at least provide literature evidence for this claim. What does a predictive statistician do? Does s/he use parametric models? I bet s/he does suffer, because s/he still has to use the observed data to construct a model for predictions.

    If you answered anything other than 1, you probably had advanced training in classical statistics.

    This is equal to saying that a person who had advanced training in classical statistics probably doesn’t know the difference between descriptive statistics and statistical modelling, be it explanatory or predictive modelling. Silly.

    At any rate, big data has made the discipline of descriptive statistics quite interesting.

    Note also that predictive modelling and predictive methods, with the goal of making predictions, are more about cross-validation or predicting out-of-sample observations, not about waiting for new, yet-to-be-observed data to test model validity, which is impossible when a decision cannot wait. The statistical model involved in predictive modelling can be parametric or non-parametric.

    Big data has also made predictive modelling big nowadays.
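For concreteness, a bare-bones example of the out-of-sample/cross-validation idea mentioned above; the synthetic dataset, the model, and the scikit-learn calls are stand-ins chosen only to show the mechanics, not anyone’s recommended analysis:

```python
# Minimal cross-validation sketch: score a model on held-out folds rather
# than on the data it was fit to. Synthetic data; the particular model and
# library are illustrative choices only.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
model = LogisticRegression(max_iter=1000)

# 5-fold cross-validation: each fold is predicted by a model that never saw it.
scores = cross_val_score(model, X, y, cv=5)
print("out-of-sample accuracy per fold:", [round(float(s), 3) for s in scores])
print("mean:", round(float(scores.mean()), 3))
```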

    Classical statistics does not emphasize this procedure..

    What procedure?

  37. JH

    I also think that if you want to criticize the papers by Jerrett and others, you should explain in a concise manner what was done in their papers. For example, clearly state the conclusion in the paper here:
    “We assessed the association between air pollution and death using standard and multilevel Cox proportional hazards models.
    Conclusion: Taken together, the results from this investigation indicate consistent and robust effects of PM2.5, and other pollutants commonly found in the combustion-source mixture with PM2.5, on deaths from CVD and IHD. We also found significant associations between PM2.5 and all causes of death, although these findings were sensitive to model specification. In Los Angeles, where the monitoring network is capable of detecting intraurban variations in PM2.5, we observed large effects on death from all causes, CVD, IHD, and respiratory disease. These results were consistent with past ACS analyses and with findings from other national or international studies reviewed in this report. Our strongest results were from a land use regression estimate of NO2, which is generally thought to represent traffic sources, where significantly elevated effects were found on deaths from all causes, CVD, IHD, and lung cancer. We therefore conclude that combustion-source air pollution is significantly associated with premature death in this large cohort of Californians.”
    Then, discuss how the causal relationship is claimed in the paper!!!

  38. I was reading the title again and it seems fairly obvious why “probability” and “statistics” cannot discover cause, except of course by sheer luck or accident. Cause is certain. It’s concrete. The measles virus causes the measles. Remove the virus, no measles. Probability is not concrete. Smoking probably causes lung cancer, except not everyone who smokes gets lung cancer and not everyone with lung cancer smoked. Remove smoking and you do not remove lung cancer. (Note–The American Cancer society now says it’s not your fault you have lung cancer. Just so you don’t think I’m a conspiracy nut who thinks big tobacco buried the evidence.)

    Statistics are not specific to an individual event. You cannot use statistics to say whether I will or will not get measles. You can only say the group I’m in may have 10 cases and I might be one of them. If I was in the room with someone with measles and I have no immunity, you can say I will get the measles. (I know it’s not 100%, but it’s very close.)

    Cause is a concrete concept while probability and statistics are fluid. It just makes sense you can’t find cause with probability and statistics.

  39. Briggs

    JH,

    Do you have a specific argument showing where my argument was in error? Or are you just upset? You’re not still using (or teaching) hypothesis tests, are you? If so, what do you tell people these are?

    The Jerrett results I have gone into at length. I gave a link (bottom of the page above) and in my new paper I cite the paper in which I criticize his results.

  40. JH

    Mr. Briggs,

    No, I am not upset. Are you? As usual, you avoid answering my questions by diverting to other irrelevant concerns.

    I do think you make a few misleading, unsubstantiated claims. You make the claims, and you are to show they are correct. I ask questions for you to consider as to why you are or may be wrong. Just what a colleague would do in my line of work (including the free calculation corrections above). Yes, I could provide more technical materials to show why you are wrong, but hey, this is not my paper.

    Yes, I still use hypothesis testing, both frequentist and Bayesian methods, which are both useful and can be considered at the same time for decision making. People can easily find an introductory textbook or google for the materials if they are interested.

  41. Fred

    @Sheri, you raise a good point on the reciprocity (and lack thereof) of conditional probability versus that of correlation.

    This paper/discussion reminds me of Fischer Black’s “The Trouble with Econometric Models”. Looking forward to the book, Briggs.

    I enjoyed the paper on arxiv.org. Your statement: “Cause is under-determined by the data. The only way to solve this, or any problem, is to grasp the nature or essence and powers of the things involved” was a nice summary of the issue. It reminded me of David Deutsch’s position (in ‘The Beginning of Infinity’) that science is a quest for “good explanations.” A good explanation is hard to vary, according to Deutsch. As I see it, it’s hard to vary because it gets at the essence of the relevant phenomena. For example, the axis-tilt theory is a good explanation for the seasons because it is hard to vary the nature of the things involved. Instrumentalism, on the other hand, considers explanation unnecessary. What a contrast! All science is laced with philosophy, which should not go unexamined.

  43. Briggs

    JH,

    “I do think you make a few misleading, unsubstantiated claims.”

    Very well: which?

  44. JH

    Mr. Briggs,

    Here is another one. Gamble’s paper demonstrates that errors in variables can lead to biased results, and he is not shy about drawing attention to the biases throughout the paper. Well, it all depends on how you want to use the paper to further your agenda.

    (No test of significance, i.e., no p-value, is involved in the calculations of relative risk resulting in the statement of “150 to 300 times more toxic” quoted in your paper. Simply estimations based on the data. So no curse of statistical significance, but the curse of errors in variables.)

    I already offered you explanations for two examples of your misleading “crap” about classical statistics, which also target another claim of yours about “… parametric and not predictive…” and mention what predictive modeling is in the world of statistics. My explanations are evidently not appreciated; perhaps I am not clear. So I would rather go back to the papers I am studying this summer.

  45. Briggs

    JH,

    So no “misleading, unsubstantiated claims” after all. P-values, as I’ve said thousands of times, should not be used, therefore that I don’t use one in the discussion of evidence is hardly misleading or unsubstantiated. Anyway, that’s Gamble’s paper. Forget it exists, if you like.

    Have you any counter arguments to my proofs that probability and statistics can’t discover cause? And that hypothesis tests/Bayes factors should never be used again, and that statistics is only useful in its predictive sense? Or do you just like saying “crap”?

  46. JH

    Mr. Briggs,

    Have I said that probability and statistics can discover cause? No. Please read my previous comments.

    It’s not that you don’t use a p-value and therefore you mislead your readers. The point is that even though the calculations of relative risk in that particular case have nothing to do with statistical significance, you blame it on statistical significance anyway. A misleading impression about statistical significance, don’t you think? Therefore, crap it is.

    (Yes, I like saying the word “crap”, but I like “turd” even more. It’s a challenge for many Chinese to pronounce “r,” and I like challenges.)

    Not all statistical modelling aims to predict; some to explain!

    In YOUR predictive sense – good luck with telling the president and vice presidents of some business that they have to wait to make their decision about selling a certain division of the business because your predictions have not been validated by new data. And they just paid you $100 per variable or $1 per experimental unit, whichever is higher, and there are 200 variables collected online. (Inspired by an alumnus who came to visit and talked about his consulting projects yesterday.)

    I don’t see any value in saying or proving (if that is what you believe you have done) that probability and statistics can’t discover cause, but your readers sure do. Math can’t either.

    At any rate, how do you, by grasping the nature or essence and powers of the things involved, find the cause(s) of the fact that
    there are 1,446 living Grand-masters of chess. Only 33 of these aren’t men.
    What are the causes? What is the thing involved here?

    What is the essence of gravity you have in mind? How do you grasp the essence?

  47. testing

    Interesting paper. I would add that “causing cancer” is not well defined. The most popular definition I’d think would be the exposure that causes the mutation leading to a tumor cell. The same data regarding a PM2.5 cancer link could also be explained if the tumor cells already existed in a preclinical state, but the exposure stimulated them to grow and proliferate. Or such cells may be formed every day in everyone, but usually the immune system kills them or arrests their growth. If the exposure is an immunosuppressant, that would make it look like it caused cancer.

    Or we can reverse the causality and speculate that, for example, people who are getting cancer feel ill more often, so they drive more, leading to higher PM2.5 exposure in that area.

    Inspect the cancer literature and you will find little interest in making such distinctions. Instead, it is an endless tome of people using the method of rejecting a null hypothesis and then accepting the preferred explanation.* That is, quite frankly, pseudoscience. Future generations will have to redo everything where the evidence was assessed in this way.

    In fields where NHST is rampant, there is nothing stopping people from veering down the wrong path for decades or even generations. The scientist needs to make numerical predictions and then compare to new data. If that is not possible, they need to stick to collecting data until someone can. Speculation is great, as long as it is recognized as such.

    JH wrote: “good luck with telling the president and vice presidents of some businesses that they have to wait to make their decisions about selling certain division of the business because your predictions have not been validated by new data. And they just paid you $100 per variable or $1 per experiment unit, whichever is higher, and there are 200 variables collected online.”

    People should be free to waste their money on whatever, but get this out of taxpayer funded science.

    *As far as I can determine, this is the only possible use of the usual NHST method (when the null hypothesis is no difference and the research hypothesis is some kind of explanation for a difference or effect). You can find Student[1] and Neyman[2] making this error, I am sure if I look through Fisher’s applied stuff I will find him doing it as well. It would be interesting to check the work of both Pearsons too.

    [1] http://errorstatistics.com/2015/03/16/stephen-senn-the-pathetic-p-value-guest-post/#comment-120537

    [2] http://errorstatistics.com/2015/08/05/neyman-distinguishing-tests-of-statistical-hypotheses-and-tests-of-significance-might-have-been-a-lapse-of-someones-pen-2/#comment-129224
