Remember our discussion of the Spurious Correlations website, nicely paired with the Regression Isn’t What You Think post?
The two key points of both are that (a) “statistically significant” correlations, even nonsensical ones, are free for the asking, and (b) people are always assuming statistical models are causal.
If people see a “significant” correlation that they can tell a story about, they accept it as real and causal. If their imagination is not fecund enough to tell a believable tale, the correlation is dismissed as spurious. All that business of wee p-values and technical talk of model innards is secondary and incidental, though necessary to get the tale past peer review.
From reader Kent Clizbe comes this Washington Post story “Foreclosures may be driving the rise in suicides”, with the picture given at the top of this post. The correlation is clear: as the percentage of loans in foreclosure increases, so do suicides.
The tale?
Fear first: the writer says the “foreclosure crisis” is a “public health crisis.” Never let a good crisis go to waste. Then she expands the correlation: “The Washington Post’s Dina elBoghdaddy wrote that it even appears foreclosures may raise the blood pressure of neighbors who simply live near these repossessed homes.” Maybe “For Sale” signs send out occult hypertension rays? Better look into that.
Here’s the meat:
Researchers Jason Houle at Dartmouth and Michael Light at Purdue looked at state-level suicide rates from the Centers for Disease Control and Prevention, alongside proprietary foreclosure data from RealtyTrac in all 50 states and the District of Columbia between 2005 and 2010. Their analysis compared the two datasets within each state across time, and across each state at fixed moments, controlling for variables captured in data from the American Community Survey (including demographics, unemployment and poverty rates, divorce rates and population density).
The telltale phrase is “controlling for”, which means we’re dealing with regression. This is a model of the central parameter of a normal distribution which quantifies our uncertainty in suicide rates (per state). That model had a bunch of modifiers of the central parameter, i.e. those things “controlled for”, plus the foreclosure rate (per state). That is, the uncertainty in the suicide rate itself is characterized with a normal distribution, and the parameter of that normal which describes its center is shifted rightwards in the presence of all those things “controlled for” and in the presence of higher foreclosure rates.
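For the curious, here is a minimal sketch of what “controlling for” amounts to mechanically. It is not the authors’ model (the paper is paywalled, and their within-state setup is surely more elaborate). The data are simulated, and the variable names, the sample size and the coefficients are all my assumptions, chosen only so the effect of adding a control is visible.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 306  # hypothetical: 51 "states" times 6 "years"

# Entirely simulated data. Unemployment drives both foreclosures and suicides;
# the built-in direct foreclosure coefficient is a small 0.04.
unemployment = rng.normal(6.0, 2.0, n)
foreclosure = 0.5 * unemployment + rng.normal(0.0, 1.0, n)
suicide_rate = (10.0 + 0.3 * unemployment + 0.04 * foreclosure
                + rng.normal(0.0, 1.0, n))

def ols(y, *columns):
    """Least-squares fit of y on an intercept plus the given columns."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Foreclosure as the only modifier of the central parameter.
b_alone = ols(suicide_rate, foreclosure)

# "Controlling for" unemployment: it is just one more column in the model.
b_controlled = ols(suicide_rate, foreclosure, unemployment)

print(f"foreclosure coefficient, alone:      {b_alone[1]:.3f}")
print(f"foreclosure coefficient, controlled: {b_controlled[1]:.3f}")
```

With the control left out, the foreclosure coefficient soaks up the influence of unemployment and comes out much larger than the 0.04 built into the simulation. With the control in, it falls back toward that value. Nothing more exotic than an extra column is going on, and the result still concerns rates, not people.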
The p-value which confirms this wasn’t as wee as usual. The authors admit “Net of other factors, an increase in the within-state total foreclosure rate was associated with a within-state increase in the crude suicide rates (b = 0.04; P < .1)”. Not 0.05, you see. Well, let’s hope that the authors were not too bereft over the lack of traditional weeness. And, indeed, they probably were not, as we shall soon see.
Now it is certainly plausible, and probably even true in some individual cases, that foreclosing a home pushes people to commit that awful unpardonable sin. The model, however, says nothing about individuals, only about rates within states. The model is silent on the causes of each individual suicide.
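A small simulation makes the gap between rates and individuals concrete. Everything in it is invented: a hypothetical shared driver per state, probabilities exaggerated far beyond real suicide rates so the pattern shows up in modest samples, and, by construction, no link whatsoever between a person’s own foreclosure and that person’s outcome.

```python
import numpy as np

rng = np.random.default_rng(2)
n_states, n_people = 51, 20_000

# Hypothetical shared driver (call it economic distress), one value per state.
distress = rng.uniform(0.0, 1.0, n_states)

# Made-up probabilities, wildly exaggerated so modest samples show the pattern.
p_foreclosed = 0.02 + 0.08 * distress
p_event = 0.01 + 0.04 * distress

# Within a state, the two outcomes are drawn independently for each person.
foreclosed = rng.random((n_states, n_people)) < p_foreclosed[:, None]
event = rng.random((n_states, n_people)) < p_event[:, None]

# State-level view: all that a rates-on-rates model ever sees.
state_corr = np.corrcoef(foreclosed.mean(axis=1), event.mean(axis=1))[0, 1]

# Individual-level view: event rate among the foreclosed minus everyone else.
rate_foreclosed = (event & foreclosed).sum(axis=1) / foreclosed.sum(axis=1)
rate_others = (event & ~foreclosed).sum(axis=1) / (~foreclosed).sum(axis=1)
mean_gap = float(np.mean(rate_foreclosed - rate_others))

print(f"correlation of state rates: {state_corr:.2f}")
print(f"average within-state gap:   {mean_gap:+.4f}")
```

The state rates move together strongly, yet inside each state the foreclosed are at no greater risk than their neighbors. A state-level model cannot tell this world apart from one in which foreclosure itself does the damage.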
Indeed, it is not known whether any of the folks in this database who committed suicide even had a home which was foreclosed.
That’s because the authors relied on the epidemiologist fallacy for their results. And since their p-value was not traditionally wee, they dug further into the data, doing sub-analyses, and discovered suicide rates were largest “among the middle-aged”, which is “46–64 years”. The p-value in this subset was comfortably wee.
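How easy is it to find a wee p-value by slicing? Here is a sketch, assuming six hypothetical subgroups and 51 “states” of pure noise, with no relationship anywhere. The number of subgroups is my assumption, not a count of what the authors tried, and the p-value uses the Fisher z approximation for a correlation in place of a full regression.

```python
import math
import numpy as np

def fisher_p_value(x, y):
    """Two-sided p-value for a correlation via the Fisher z approximation."""
    r = np.corrcoef(x, y)[0, 1]
    z = math.atanh(r) * math.sqrt(len(x) - 3)
    return math.erfc(abs(z) / math.sqrt(2))

def share_with_a_wee_p(n_trials=2000, n_states=51, n_subgroups=6, seed=1):
    """Fraction of all-noise 'studies' in which at least one subgroup has p < 0.05."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_trials):
        foreclosure = rng.normal(size=n_states)
        # Each subgroup's "suicide rate" is independent noise, unrelated to foreclosure.
        p_values = [
            fisher_p_value(foreclosure, rng.normal(size=n_states))
            for _ in range(n_subgroups)
        ]
        if min(p_values) < 0.05:
            hits += 1
    return hits / n_trials

share = share_with_a_wee_p()
print(f"share of noise-only studies with a 'significant' subgroup: {share:.2f}")
```

If the six tests were exactly independent at the 0.05 level, theory gives 1 - 0.95^6, or about 0.26. So roughly one noise-only study in four should cough up a publishable subgroup, and a story to explain it is rarely hard to find.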
The Post writer asked, “Why the significant effect for adults in between those groups, and for those age 46 to 64 in particular?” And the authors answer, plausibly, “Middle-aged adults have the highest proportion of homeowners relative to other age groups and have a higher risk of home foreclosure than do other age groups.”
On the other hand, which group of folks has the highest suicide rate, independent of all? According to the American Foundation for Suicide Prevention,
In 2010, the highest suicide rate (18.6) was among people 45 to 64 years old. The second highest rate (17.6) occurred in those 85 years and older. Younger groups have had consistently lower suicide rates than middle-aged and older adults. In 2010, adolescents and young adults aged 15 to 24 had a suicide rate of 10.5.
So middle-age is where all the action is anyway. It’s still possible, of course, that a foreclosure can be the last straw for somebody. But it does not appear likely that the increase in foreclosures is causing a public health “crisis.”
Operationally in an analysis, how does one “control for” the extraneous variables in the data set? Weighting factors? Reject cases where the extraneous variables are outliers? Make unwarranted assumptions and/or adjustments? I frequently see this term, yet never see it explained in the particular cases. What exactly is the mathematical manipulation being done?
Great analysis, great article.
Wonder how they adjusted for the over 85 crowd….that would have to be significant to “control for” the 17.6%.
Gary,
In this case, it only means “sticking them into the model for the central parameter.”
Here is another study that tells you how to reduce your risk of dying by 50%.
http://www.telegraph.co.uk/science/science-news/10811734/Why-avoiding-sunshine-could-kill-you.html
Briggs:
I agree that the study draws weak conclusions. And you said it better than I could:
Which is a problem. Am I correct in saying that had the study controlled for home foreclosure specifically that its conclusions would possibly have been less weak?
The study is paywalled, but the last sentence of the conclusion is, “Rising foreclosure rates may be partially responsible for the recent uptick in suicide among middle-aged adults.”
Does this not express a due amount of uncertainty? Pity I couldn’t read the discussion section of the paper, assuming it exists.
I wonder how much “good” literature exists on the causal factors of suicide? Conventional wisdom — more like common sense — holds that happy people don’t kill themselves on purpose.
… awful unpardonable sin …
I have a piece of paper with a headshot of Ben Franklin on it that says no one close to you has ever killed themselves. Be honest now.
awful unpardonable sin
Most certainly one which won’t be divulged in the confessional.
Which is one reason I laughed so much watching Things to do in Denver when You’re Dead.
Here’s the real kicker though: “unpardonable” is not the default official position.
Briggs, I was hoping for a general explanation. Maybe the topic of a future post?