The two key points of both are that (a) “statistically significant” correlations, even nonsensical ones, are free for the asking, and (b) people are always assuming statistical models are causal.
If people see a “significant” correlation that they can tell a story about, they accept it as real and causal. If their imagination is not fecund enough to tell a believable tale, the correlation is dismissed as spurious. All that business of wee p-values and technical talk of model innards is secondary and incidental, though necessary to get the tale past peer review.
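To see how cheap “significant” correlations are, here is a minimal sketch, not anybody's real analysis. It pairs up series of pure random noise and counts how many pairs clear the usual 0.05 bar. Everything in it is an assumption made for illustration: the 2000 pairs, the 30 points per series, and the names `pearson_r` and `count_significant`. The cutoff 0.361 is the standard two-sided 5% critical value of |r| for 30 points.

```python
import random

N_POINTS = 30    # observations per series (illustrative choice)
R_CRIT = 0.361   # two-sided 5% critical |r| for 30 points (28 degrees of freedom)


def pearson_r(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def count_significant(n_pairs=2000, seed=0):
    """Count pairs of unrelated noise series whose correlation is 'significant'."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_pairs):
        # Two series with no connection whatsoever.
        x = [rng.gauss(0, 1) for _ in range(N_POINTS)]
        y = [rng.gauss(0, 1) for _ in range(N_POINTS)]
        if abs(pearson_r(x, y)) > R_CRIT:
            hits += 1
    return hits


hits = count_significant()
print(f"{hits} of 2000 unrelated pairs are 'significant' at the 0.05 level")
```

Roughly one pair in twenty should pass, by construction. Anybody willing to look at enough pairs will find something to tell a story about.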
From reader Kent Clizbe comes this Washington Post story “Foreclosures may be driving the rise in suicides”, with the picture given at the top of this post. The correlation is clear: as the percentage of loans in foreclosure increases, so do suicides.
Fear first: the writer says the “foreclosure crisis” is a “public health crisis.” Never let a good crisis go to waste. Then she expands the correlation: “The Washington Post’s Dina elBoghdaddy wrote that it even appears foreclosures may raise the blood pressure of neighbors who simply live near these repossessed homes.” Maybe “For Sale” signs send out occult hypertension rays? Better look into that.
Here’s the meat:
Researchers Jason Houle at Dartmouth and Michael Light at Purdue looked at state-level suicide rates from the Centers for Disease Control and Prevention, alongside proprietary foreclosure data from RealtyTrac in all 50 states and the District of Columbia between 2005 and 2010. Their analysis compared the two datasets within each state across time, and across each state at fixed moments, controlling for variables captured in data from the American Community Survey (including demographics, unemployment and poverty rates, divorce rates and population density).
The telltale phrase is “controlling for”, which means we’re dealing with regression. This is a model of the central parameter of a normal distribution which quantifies our uncertainty in suicide rates (per state). That model had a bunch of modifiers of the central parameter, i.e. those things “controlled for”, plus the foreclosure rate (per state). That is, the uncertainty in the suicide rate itself is characterized with a normal distribution, and the parameter of that normal which describes the center is shifted rightwards in the presence of all those things “controlled for” and in the presence of higher foreclosure rates.
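What “controlling for” does mechanically can be sketched on made-up data. This is not the authors' model or data. The single control (a hypothetical unemployment rate), all the coefficients other than the 0.04 borrowed from the paper's reported estimate, and the function names are assumptions for illustration. The sketch uses the standard residual trick (the Frisch-Waugh-Lovell result): strip the control out of both the foreclosure rate and the suicide rate, then regress what is left of one on what is left of the other.

```python
import random


def simple_fit(x, y):
    """Least-squares (intercept, slope) of y on a single predictor x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def residuals(x, y):
    """What is left of y after removing its straight-line dependence on x."""
    a, b = simple_fit(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def controlled_slope(control, exposure, outcome):
    """Slope of outcome on exposure after 'controlling for' one covariate."""
    rx = residuals(control, exposure)
    ry = residuals(control, outcome)
    return simple_fit(rx, ry)[1]


def simulate(n, b_foreclosure=0.04, noise_sd=0.5, seed=1):
    """Invented state-years: the center of a normal shifts with both predictors."""
    rng = random.Random(seed)
    unemployment = [rng.gauss(6.0, 2.0) for _ in range(n)]
    # Foreclosures rise with unemployment, so the two predictors are tangled.
    foreclosure = [0.5 * u + rng.gauss(0.0, 1.0) for u in unemployment]
    suicide = [
        rng.gauss(10.0 + 0.3 * u + b_foreclosure * f, noise_sd)
        for u, f in zip(unemployment, foreclosure)
    ]
    return unemployment, foreclosure, suicide


u, f, s = simulate(5000)
print("slope ignoring the control: ", round(simple_fit(f, s)[1], 3))
print("slope after controlling for:", round(controlled_slope(u, f, s), 3))
```

The uncontrolled slope soaks up the effect of unemployment. The controlled one lands near the coefficient planted in the simulation. Note what the exercise cannot do: it recovers a number describing a shift in the center of a distribution of rates, and nothing about why any rate is what it is.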
The p-value which confirms this shift wasn’t as wee as usual. The authors admit “Net of other factors, an increase in the within-state total foreclosure rate was associated with a within-state increase in the crude suicide rates (b = 0.04; P < .1)”. Not 0.05, you see. Well, let’s hope that the authors were not too bereft over the lack of traditional weeness. And, indeed, they probably were not, as we shall soon see.
Now it is certainly plausible, and probably even true in some individual cases, that foreclosing on a home pushes people to commit that awful unpardonable sin. The model, however, says nothing about individuals, only about rates within states. The model is silent on the causes of each individual suicide.
Indeed, it is not known whether any of the folks in this database who committed suicide even had a home which was foreclosed.
That’s because the authors relied on the epidemiologist fallacy for their results. And since their p-value was not traditionally wee, they dug further into the data, doing sub-analyses, and discovered suicide rates were largest “among the middle-aged”, which is “46–64 years”. The p-value in this subset was comfortably wee.
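The gap between rates and individuals can be made concrete with a toy simulation. It is a sketch under loudly stated assumptions, not a reconstruction of the paper's data. A hypothetical state-level “distress” factor pushes up both the foreclosure rate and the rate of some bad event, while inside every state a person's foreclosure and the event are drawn independently. The probabilities are wildly inflated so a small simulation has enough events to count, and all names are invented.

```python
import random


def pearson_r(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate_states(n_states=50, n_people=2000, seed=0):
    """Return (state-level correlation of rates, mean within-state gap)."""
    rng = random.Random(seed)
    foreclosure_rates, event_rates, within_gaps = [], [], []
    for _ in range(n_states):
        distress = rng.random()             # hypothetical state-wide factor
        p_foreclosed = 0.05 + 0.25 * distress
        p_event = 0.05 + 0.25 * distress
        # Each person's two outcomes are drawn independently of each other.
        people = [
            (rng.random() < p_foreclosed, rng.random() < p_event)
            for _ in range(n_people)
        ]
        hit = [event for foreclosed, event in people if foreclosed]
        spared = [event for foreclosed, event in people if not foreclosed]
        foreclosure_rates.append(len(hit) / n_people)
        event_rates.append(sum(event for _, event in people) / n_people)
        if hit and spared:
            # Event rate among the foreclosed minus the rate among the rest.
            within_gaps.append(sum(hit) / len(hit) - sum(spared) / len(spared))
    state_r = pearson_r(foreclosure_rates, event_rates)
    mean_gap = sum(within_gaps) / len(within_gaps)
    return state_r, mean_gap


state_r, mean_gap = simulate_states()
print("correlation of state-level rates:", round(state_r, 3))
print("mean within-state gap, foreclosed vs. not:", round(mean_gap, 4))
```

The state-level rates march together while, within any state, the foreclosed are no more likely than their neighbors to suffer the event. A strong correlation between rates is thus compatible with no individual link at all, which is why rate data alone cannot say who did what to whom.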
The Post writer asked, “Why the significant effect for adults in between those groups, and for those age 46 to 64 in particular?” And the authors answer, plausibly, “Middle-aged adults have the highest proportion of homeowners relative to other age groups and have a higher risk of home foreclosure than do other age groups.”
On the other hand, which group of folks has the highest suicide rate, independent of all? According to the American Foundation for Suicide Prevention,
In 2010, the highest suicide rate (18.6) was among people 45 to 64 years old. The second highest rate (17.6) occurred in those 85 years and older. Younger groups have had consistently lower suicide rates than middle-aged and older adults. In 2010, adolescents and young adults aged 15 to 24 had a suicide rate of 10.5.
So middle-age is where all the action is anyway. It’s still possible, of course, that a foreclosure can be the last straw for somebody. But it does not appear likely that the increase in foreclosures is causing a public health “crisis.”