William M. Briggs

Statistician to the Stars!

Autism Caused By Highways?

Epidemiology is nothing if not a productive field. All that is needed for success is a database (larger the better), a disease (any will do), and some minor facility with statistical software.

Our latest example is the Environmental Health Perspectives1 paper “Residential Proximity to Freeways and Autism in the CHARGE Study” by Volk et al.

The authors found a group of mothers who lived in California. They measured the distance these mothers lived to “freeways and major roadways” for the majority of their pregnancies. They also took note whether their children developed autism. They posited that living closer to freeways increased the risk of autism. They also measured mothers’ education, age, and smoking status, the kids’ race and whether the kids were preemies.

They purposely identified 304 kids with autism and 259 without from a database “frequency matched by sex, age, and broad geographic area.” Ideally, since this data was hand-picked, they should have had equal numbers in each group, and equal frequencies of boys in each group. But the autism group had 87% boys, while the normal group had 81%. In other words, by design (purposeful or accidental), they put more boys in the autism group than they put in the control group. They gave this difference a “Chi-square p-value” of 0.10. What does that number mean? Well, nothing (see the footnote2).

As to the models:

Specifically, we included child’s sex and ethnicity, maximum education level of the parents, maternal age, gestational age at birth, and maternal smoking during pregnancy.

This was a logistic regression model, which here assumes the log-odds of developing autism is a linear function of the attributes just mentioned plus the distance (in meters) from freeways or major roadways. No plots of the freeway or roadway data are shown that would indicate whether this is a good assumption.

Instead, the authors do a strange thing: they do not model the actual distance but chop up the distance into arbitrary bins. The first is living less than 309 m from a freeway (about three football fields’ distance). The next is living from 309 m to 647 m, then living from 647 m to 1419 m, and finally living greater than 1419 m. They did the same thing for major roadways: less than 42 m, 42-96 m, 96-209 m, and greater than 209 m.

Two separate models were run: one for freeways, the other for major roadways. Only the less-than-309 m group with respect to the greater-than-1419 m group reached (classical) statistical significance. None of the other groups did. Nor did any bin in the roadways model. The (exponentiated) parameter associated with the less-than-309 m freeway bin was 1.86. This is incorrectly said to be the odds ratio for those living within 309 m compared to those living farthest away. It isn’t: it’s the parameter. To get the real odds, we’d have to “integrate out” the parameter, which would make the real odds ratio, assuming all else true and good, less than 1.86.
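To make the scale of that number concrete, here is a minimal sketch (mine, not the authors' code) that converts the reported exponentiated parameter of 1.86 back to the log-odds scale and shows what multiplying the odds by 1.86 would do to a probability. The baseline probabilities are hypothetical, picked only for illustration, and the sketch does not attempt the "integrating out" step.

```python
import math

REPORTED_EXP_PARAMETER = 1.86  # exponentiated coefficient quoted from the paper


def coefficient_from_odds_ratio(odds_ratio):
    """Coefficient on the log-odds scale whose exponential is odds_ratio."""
    return math.log(odds_ratio)


def shifted_probability(baseline_probability, odds_ratio):
    """Probability obtained after multiplying the baseline odds by odds_ratio."""
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


if __name__ == "__main__":
    beta = coefficient_from_odds_ratio(REPORTED_EXP_PARAMETER)
    print(f"coefficient on the log-odds scale: {beta:.3f}")
    # ASSUMPTION: these baseline probabilities are made up for illustration.
    for p0 in (0.005, 0.05, 0.5):
        p1 = shifted_probability(p0, REPORTED_EXP_PARAMETER)
        print(f"baseline {p0:.3f} -> {p1:.3f} (ratio of probabilities {p1 / p0:.2f})")
```

Whatever baseline is assumed, the ratio of probabilities comes out below the ratio of odds, which is one reason an exponentiated parameter should not be read as "this many times as likely".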

Remember when we talked about how changing the start date in time series analysis can lead to opposite conclusions? It’s the same here: why 309 m and not, say, 308 m or 310 m? And the same for the other buckets. Different cuts will give different conclusions. Why not just leave distance in as a linear function? I mean, why chop it up at all?
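A small simulation can illustrate how much a near-versus-far odds ratio may wander as the cut point moves. This is only a sketch under stated assumptions: the data are synthetic, distances are drawn uniformly up to a hypothetical 3000 m, and case status is generated independently of distance, so any apparent "effect" is noise. The sample size (563) and the 1419 m far cut are borrowed from the paper; the function names and everything else are made up for illustration.

```python
import random


def simulate_subjects(n, case_fraction, max_distance_m, seed):
    """Synthetic (distance, is_case) pairs; case status ignores distance by construction."""
    rng = random.Random(seed)
    return [(rng.uniform(0.0, max_distance_m), rng.random() < case_fraction)
            for _ in range(n)]


def odds_ratio_near_vs_far(subjects, near_cut_m, far_cut_m):
    """Odds of being a case for distance < near_cut_m relative to distance > far_cut_m.

    Every cell starts at 0.5 (Haldane-Anscombe correction) so that an empty
    cell cannot cause a division by zero. Subjects between the cuts are ignored.
    """
    near_cases = near_controls = far_cases = far_controls = 0.5
    for distance, is_case in subjects:
        if distance < near_cut_m:
            if is_case:
                near_cases += 1
            else:
                near_controls += 1
        elif distance > far_cut_m:
            if is_case:
                far_cases += 1
            else:
                far_controls += 1
    return (near_cases / near_controls) / (far_cases / far_controls)


def sweep_near_cut(subjects, near_cuts_m, far_cut_m):
    """Odds ratio for each candidate near cut, holding the far cut fixed."""
    return {cut: odds_ratio_near_vs_far(subjects, cut, far_cut_m)
            for cut in near_cuts_m}


if __name__ == "__main__":
    # ASSUMPTION: uniform distances up to 3000 m and no true effect of distance.
    subjects = simulate_subjects(n=563, case_fraction=304 / 563,
                                 max_distance_m=3000.0, seed=1)
    for cut, ratio in sweep_near_cut(subjects, range(150, 500, 50), 1419.0).items():
        print(f"near cut {cut:4d} m -> odds ratio {ratio:.2f}")
```

Changing the seed or the list of cuts gives a different spread of odds ratios, all from data in which distance does nothing.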

Be generous and assume that these cuts are “real” and the “best”. Can we think of any other reason which might account for the results? Living within a football field or two of a freeway in Los Angeles is a good indicator of what? Great medical care? Wealth? Health insurance? You’ll notice the authors left out any measure of economic importance.

In other words for this study, the effect is small, it is only for one small suspiciously chosen subset of the population (10% by the authors’ reckoning), and the posited cause is most likely an artifact caused by mismeasure and conflation of unmeasured socioeconomic variables. In short, the article gives no more than a vague suspicion that freeways are autism inducers: it even says they usually are not. It also says roadways are not autism inducers.

What makes this study interesting, then, is how it was reported in the press. The Wall Street Journal, not usually given to flights of fancy, reporting on this paper (and others) led with the headline:

The Hidden Toll of Traffic Jams

Scientists Increasingly Link Vehicle Exhaust With Brain-Cell Damage, Higher Rates of Autism

Lots of reasons given for how exhaust might influence this or that biologic process, words like “Scientists believe”, a quote from the study author (“The evidence is growing that air pollution can affect the brain”), a quote or two from non-authors (“There is real cause for concern”).

CBS news, not content with the actual numbers, juiced them a little: “A new study shows that children in families who live near freeways are twice as likely to have autism as kids who live off the beaten path.”

No news source I could discover provided any analysis of how weak—and even nonexistent—the effects of this study were. Lesson for reporters: don’t trust scientists. We are no different than anybody else.

—————————————————————————————————

1Volume 119, number 6, June 2011, pp. 873–877. Thanks to Willie Soon for suggesting this topic.

2Ordinarily, in frequentist statistics, a chi-square test is used to test for “differences in proportions” in groups. Here, there were two groups with proportions 87% and 81%. Are these different? This is not a trick question, but it is also one which is not to be answered within classical theory. That is, the chi-square is not an answer to this obvious question. The test is not a test of difference in actual proportions, but something else. Okay?

Instead, the statistician asks, “Assuming the ‘true’ proportion of boys with autism and boys without autism is identical: if we sampled from these two groups indefinitely, what is the chance of seeing a certain mathematical function (the chi-square) of these two sampled proportions being larger than the chi-square we see for the actual data?” This is 10%.
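For the curious, the mechanics of that calculation can be sketched in a few lines. This is a rough illustration, not a reconstruction of the authors' analysis: the counts of boys are my own back-calculation from the rounded percentages (87% of 304 and 81% of 259), and I assume a plain Pearson chi-square with no continuity correction, so the result need not match the reported 0.10.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def chi_square_p_value_1df(statistic):
    """Upper-tail probability of a chi-square variable with one degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


if __name__ == "__main__":
    # ASSUMPTION: counts rebuilt from rounded percentages, not taken from the paper.
    autism_total, control_total = 304, 259
    autism_boys = round(0.87 * autism_total)
    control_boys = round(0.81 * control_total)
    statistic = chi_square_2x2(autism_boys, autism_total - autism_boys,
                               control_boys, control_total - control_boys)
    print(f"chi-square = {statistic:.2f}, p = {chi_square_p_value_1df(statistic):.3f}")
```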

And so? Well, again, nothing. The statistic has no bearing or meaning to this data. The database was built by hand with the intent of matching by frequency the sex of kids with autism. It failed in this; slightly, but it still failed. The authors could have, just as they picked the other data by hand, tossed out a few of the boys in the autism group or added a few more in the control group. There was nothing “random” in these selections, not even in the classical sense.

Now this dull subject is important because, as all prior evidence indicates, boys are vastly more likely to develop autism than girls. Why this is so, while interesting, is not relevant to this study. What is relevant is that this discrepancy, the actual difference in proportions in the sample, might account for the “significance” in the results even though the authors included sex in their models.

17 Comments

  1. Modern technology is wonderful. I think the reason they had to round to the nearest meter was, even with today’s sophisticated equipment, the question of whether your place of residence is determined by the location of the head or feet has never been fully established. Imagine though! Just a few decades ago, the best resolution in locating your residence was no better than 10 meters.

  2. What is autism anyway? That marvelous source, Wikipedia, defines it as “Autism is a disorder of neural development characterized by impaired social interaction and communication, and by restricted and repetitive behavior” and has this picture, presumably as an example: http://upload.wikimedia.org/wikipedia/commons/thumb/d/d1/Autism-stacking-cans_2nd_edit.jpg/230px-Autism-stacking-cans_2nd_edit.jpg

    But the child has escaped the leash lying on the floor thus this behavior is no longer restricted. Perhaps they mean the poor social interaction which is apparent in the photo. Certainly the artistic stacking is communication as art speaks to us all.

  3. It’s not just autism …

    Childhood Asthma Linked to Freeway Pollution
    For every 1.2 kilometers (about three-quarters of a mile) the students lived closer to the freeway, asthma risk increased by 89 percent. For example, students who lived 400 meters from the freeway had an 89 percent higher risk of asthma than students living 1,600 meters away from the freeway.
    http://www.usc.edu/uscnews/stories/11614.html

    Pollution linked to freeways …
    Air Pollution From Freeway Extends One And A Half Miles Away
    ScienceDaily (June 18, 2009) — Environmental health researchers from UCLA, the University of Southern California and the California Air Resources Board have found that during the hours before sunrise, freeway air pollution extends much further than previously thought.
    http://www.sciencedaily.com/releases/2009/06/090618172409.htm

    It is interesting that “reports of autism cases per 1,000 children” rose from three to more than five (http://en.wikipedia.org/wiki/Frequency_of_autism#Frequency) over the period studied. Both a confounding factor and a very low incidence; it implies that a population of between 50,000 and 100,000 children was studied.

  4. What a tragic waste of scientific(?) resources. They used California Freeway data. Total fail. In the first place no two California Freeways are alike. None. Material, design, development and user patterns are all distinctive. Surfaces change, year in and year out. Nothing is the same. Ergo, data of their use and locale can never be standardized, and just barely approximated. Accordingly, the authors started with a pre-failed premise.

    And then we get to “vehicle exhaust”. Where to begin?….. …there is nothing “standardized” about exhaust gases in the Golden State. Motorists here use everything from common carbon-based fuel sources through all sorts of animal and vegetable residue by-products, and up to “pixie dust” for internal combustion propulsion purposes. Nowhere else in the world can be found the variety of fuel sources used to circumnavigate Southern California freeways. Even IF a standard typical measurement could be developed, it could never be duplicated. Two days in a row. Or anywhere else in the known universe. So what earthly use could data derived from its use be?

  5. In addition to a database, disease, and statistical software, I would add: a cause. My dubious-science-sniffer caught this whopper: “There’s a common saying in Appalachia: what we do to the land, we do to the people. Recently, 21 peer-reviewed scientific studies have confirmed the truth of those words. Not only has mountaintop removal permanently destroyed more than 500 Appalachian mountains, but people living near the destruction are 50% more likely to die of cancer and 42% more likely to be born with birth defects compared with other people in Appalachia.” (http://ilovemountains.org/the-human-cost)

    I’ve lived near the rural Appalachians and can state that there does seem to be a high incidence of cancer in families who have lived there for generations. However, the mountaintops are intact. Locals like to blame the water for their ills, but I strongly suspect inbreeding as the culprit.

  6. Speaking of which, is it *ever* advantageous, from a statistical point of view, to bin data? It seems like binning destroys information, so would always be bad.

    I can see people wanting to bin data in an attempt to simplify reporting (“Adjacent To”, “Near To”, “Far From”). I can imagine that people *want* to use a particular methodology that requires categorical data. But is there ever any more-accurate-answers advantage to this?

  7. Briggs

    14 February 2012 at 7:25 pm

    Wayne,

    Sure, you can bin, but you must have external evidence which gives information about the bins: some kind of justification. Just like in the time series example, if you have logical reasons for the start dates/bins, then you can use them. Otherwise we are right to suspect that some “playing around” has been done.

  8. Wayne, whatever the method, it boils down to counts/distance. If no binning is done, you would be left with a count of one at every distance measurement. Rather cries out for a Poisson distribution, which may have been used. Still, a function of x makes a lousy sound bite so the results are going to be binned. That they used overly precise numbers is an indicator that they may not understand what those numbers mean (or something much worse).

  9. Given that they seem to be homing in on atmospheric constituents (gasses and particles from exhaust), I’m wondering if they also looked at the homes’ ventilation and air filtration systems. Surely that would be a major factor in the amount of time someone is exposed to the exhaust.

    Don’t get me wrong. I don’t doubt that breathing a bunch of exhaust isn’t the greatest thing for one’s health. Just wondering if this particular ailment can be definitively tied to it.

  10. well, let’s list the many problems with his junk research and i am merely restating the questions of the commentators or questioning and expanding on the discussions.

    1. what about indoor air??? well people spend more than 80 percent of their time inside–and so what is indoor air about–outside air monitors are worthless.

    2. the study is fatally flawed by the fact that air quality is back to ambient levels 300 meters from the roadway. read the studies of Zhu on roadway pollution.

    3. Any study, any, that attempts to analyze autism is futile–the diagnosis of autism is not accurate or reliable at all.

    4. Most important the studies by Zhu show that air quality is back to normal ambient levels at 300 meters–what are these clowns doing other than dredging for data.

    1. Zhu Y, Hinds WC, Seongheon K, et al. Study of ultrafine particles near a major highway with heavy-duty diesel traffic. Atmospheric Environment 2002; 36: 4325-35.
    2. Zhu Y, Hinds WC, Seongheon K, et al. Concentration and size distribution of ultrafine particles near a major highway. Journal of Air & Waste Management 2002; 52: 1032-42.
    3. Zhu Y, Kuhn T, Mayo P, et al. Comparison of daytime and nighttime concentration profiles and size distribution of ultrafine particles near a major highway. Environmental Science & Technology 2006; 40: 2531-36.

    consider the materials and tell me how and why i should believe anything these BS promoters say.

  11. Ah yes, another “price of tea in China” analysis.

    The great thing about statistics, especially regression, is that you can use anything as an “explanatory” variable. For instance, the researchers could have used distance to Disneyland, or the dot product of home phone numbers arranged in a matrix, or average foot length of the grandparents, or the price of tea in China.

    It’s amazing how many quantifiable phenomena correlate with “significant” R-squares. But as we all know, correlation is not causation. Or maybe we don’t all know that.

  12. Whoops, I forgot. According to the Luddite religion, which I swear fealty to, anything and everything technological causes horrible disfigurement, brain damage, and shortness of pants.

    Cell phones, car radios, high heels, bacon bits, rubber baby buggy bumpers, corn syrup, pet hamsters, voting machines, chewing gum, TV — they are all deadly according to the High Priests of Science (to whom I bow down in awe and reverence as a self-flagellated penitent and not some kind of smart-mouthed heretic).

    As a matter of fact, I can prove scientifically with astounding significance that all you blog commenters are going to die! So shut off the glowing screen and make amends and atonements, before it’s too late!

  13. @DAV: Ah yes. OK, so in this case, we have to bin something since we’re talking about the likelihood of getting a disease as compared to others in the population, and the bins are the comparable population.

    I was thinking of continuous situations where things get binned instead of doing a straightforward regression. For example, binning temperatures into 10-degree categories.

  14. Briggs

    15 February 2012 at 9:11 am

    Eric, John,

    Right. Many, even most, epidemiological studies suffer from the ecological fallacy. They measure some thing, like highway distance, and assume that this proxy without error or ambiguity is a perfect-pristine-absolute substitute for what is of actual interest, like inhalation of pollutants. They then take the proxy and “submit it to SAS” (or SPSS, or whatever) and tweak it here and there until it spits out wee p-values. At which point the thesis that the thing of actual interest causes the disease/malady/disparity is said to be true.

    This is a procedure which lets the government regulate anything it wishes, including (as we hear) lunches packed by mothers for their kids.

  15. Briggs,

    It is also the environmental fallacy. This is the attempt to find environmental causes for what is most likely due to genetic inheritance. There is strong evidence that disorders such as autism and Tourette’s are genetic in origin.

  16. Rather than sample the air quality in each neighbourhood, they make assumptions based on crude proxies like distances from roads – regardless of traffic levels – and see the truth they were seeking in the trends that result.

    Didn’t Asimov write about a society where the thought of a scientist actually going out and collecting new data was abhorrent?

  17. It’s called air pollution that blocks UVB that prevents vitamin D production in the skin…

    In other words, it’s probably likely that vitamin D deficiency is the root cause for autism. No different than recommending extra folic acid for Spina Bifida. Activated vitamin D is a steroid hormone that acts as a DNA repair and regulation hormone…

    We have been chronically deficient in vitamin D for a long time… no thanks to sun scare which was based on junk science.

    http://www.biochemj.org/bj/441/0061/bj4410061.htm

    https://www.google.com/search?rlz=1C1CHFX_enUS457US457&sourceid=chrome&ie=UTF-8&q=autism+vitamin+D


© 2014 William M. Briggs