Statistics

On That New Genes & Intelligence Paper

Niggling complaint number one. Big name journals are printing more telegraphic papers than ever before, leaving the meat to ill-edited, hard-to-navigate “supplementary” material. It’s a way to get in under the word count limit, but it makes work harder to follow. Holds here.

The peer-reviewed paper is “Genome-wide association meta-analysis of 78,308 individuals identifies new loci and genes influencing human intelligence” by Suzanne Sniekers and so many other authors I didn’t bother to count them, published in Nature Genetics.

Niggling complaint number two. First sentence of the Abstract: “Intelligence is associated with important economic and health-related life outcomes.” Academics are forced to write badly like that. Somebody feared readers wouldn’t grasp the importance of the material based solely on the title.

Anyway, onto the findings. This is a meta-analysis, and as I always and only half-jokingly say, a meta-analysis is a statistical study to “prove” what couldn’t be proved in individual studies. One must always view with deep suspicion any meta-analysis.

Short version. What Sniekers (and all the others) did was to gather data from other studies and cram it all together. The other studies took people “of European descent” (i.e. whites) and asked them different series of questions, and then took samples of their DNA. The questions asked differed from study to study. The questions were numerically scored and then, along with measures of the DNA, were filtered by various statistical manipulations.

Signals between SNPs (single-nucleotide polymorphisms) and the filtered quantified scores were then estimated. Most SNPs showed no signal between the quantified, filtered scores. But some did. P-values, the scourge of science, were the chosen signal metric.

According to the authors, “Our calculations show that the current results explain up to 4.8% of the variance in intelligence” as measured by the classical R^2. All will recognize “up to” as an advertising phrase, which means, we might guess, real-life values will be lower.

Now R^2 is a measure of parametric fit and is not a measure of predictive accuracy, which is the only true way to judge model goodness. Experience shows that one must divide the parametric R^2 by anywhere from 4 to 8 to get a good estimate of predictive variance explained. If that is so here, and there is no reason to suppose it is not, knowing a person’s SNP variant will allow us to make only the weakest of predictions about that person’s score on an intelligence test, as opposed to the predictions we would make not knowing this variant.
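Here is a quick sketch of that arithmetic, taking the authors’ “up to 4.8%” at face value and treating the divide-by-4-to-8 figure as the rough rule of thumb it is, not as anything computed in the paper. The function and variable names are mine, invented for illustration.

```python
# Rule-of-thumb arithmetic only; nothing here is taken from the paper's own calculations.
parametric_r2 = 0.048   # "up to 4.8%" of variance, as quoted from the authors
divisors = (4, 8)       # assumed rule-of-thumb range for parametric -> predictive R^2


def predictive_r2_range(r2, divisors):
    """Return (low, high) values obtained by dividing r2 by each divisor."""
    values = [r2 / d for d in divisors]
    return min(values), max(values)


low, high = predictive_r2_range(parametric_r2, divisors)
print(f"{low:.1%} to {high:.1%}")  # prints "0.6% to 1.2%"
```

On those assumptions, the predictive variance explained would land somewhere around half a percent to a bit over one percent.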

“But Briggs, everybody knows intelligence is at least partially heritable, which means it must be associated with at least some genes.”

Sure. And your point is?

You couldn’t have meant that, because some genes must be associated with some aspect(s) of intelligence, Sniekers et alia must therefore have found the right ones? No.

Here, for example, are the (partial) details from one of the studies that went into the stew. All emphasis mine, like on “outliers excluded”—which should crack you up.

Manchester and Newcastle Longitudinal Studies of Cognitive Ageing Cohorts

…Recruitment took place in Newcastle and Greater Manchester between 1983 and 1992. At the outset of the study, 6063 volunteers were available (1825 men and 4238 women), with a median age of 65 years (range 44 to 93 years). Over the period 1983 to 2003, two alternating batteries of cognitive tasks applied biennially were designed to measure fluid and crystallized aspects of intelligence. These included: the Alice Heim 4 (AH4) parts 1 and 2 tests of general intelligence, Mill Hill Vocabulary A and B Tests, the Cattell and Cattell Culture Fair intelligence tests, and the Wechsler Adult Intelligence Scale Vocabulary test…

…To represent crystallized intelligence (gc), we used the Mill Hill Vocabulary A and B Tests in the Manchester and Newcastle samples. For fluid-type intelligence (gf) in the Manchester and Newcastle samples empirical Bayes estimates for each individual were obtained from a random effects model fitted by maximum likelihood (ML) to the standardized age-regressed residuals obtained for each sex from the Alice Heim 4 test and the Cattell Culture Fair test scores. The phenotypes for gc were corrected for age and gender and the phenotypes for gf were corrected for age and derived separately for males and females. The standardized residuals were used for all subsequent analyses.

Participants had DNA extracted and were genotyped for 599,011 common single nucleotide polymorphisms (SNPs)…Individuals were excluded from this study based on unresolved gender discrepancy, relatedness…Each cohort was tested for population stratification and any outliers were excluded

Look at those manipulations fly! Recall the other three studies were different. There are lots of layers of uncertainty here, all banged together.

Here is a description of one of the wee p-value-implicated genes from the meta-analysis, of which there were about 40 out of the 12,104,294 SNPs tested. How many are spuriously identified? P-values won’t tell you. I picked this one only because I like to say “forkhead family”.

FOXO3

This gene belongs to the forkhead family of transcription factors which are characterized by a distinct forkhead domain. This gene likely functions as a trigger for apoptosis through expression of genes necessary for cell death. Translocation of this gene with the MLL gene is associated with secondary acute leukemia. Alternatively spliced transcript variants encoding the same protein have been observed.

Holy forkhead family! So a gene “associated” (probably identified via another wee p-value) with secondary acute leukemia is also implicated in “crystallized aspects of intelligence”. Hey, maybe, maybe.

Maybe is, or should be, the key word for this study. But listen to this concluding sentence from the authors who speak in terms of utmost certainty: “In conclusion, we conducted a meta-analysis GWAS and GWGAS for intelligence, including 13 cohorts and 78,308 individuals. We confirmed 3 loci and 12 genes, and identified 15 new genomic loci and 40 new genes for intelligence.”

Confirmed. Identified.

C’mon, gang. Give us at least a “possibly” or a “might.”


7 replies »

  1. I think it would be an error to deny any relationship between DNA and intellect. Still, genotype is not phenotype. Clouding the water with PC babble does not serve the interests of any “maligned, victim” group.

  2. Those researchers don’t know their Italian vocab, poorly educated, clearly.

    What could possibly be the ultimate goal of studies like these?

    ~“‘But Briggs, everybody knows intelligence is at least partially heritable, which means it must be associated with at least some genes.’
    …Sure. And your point is?”~

    It is faulty to think that character and nature of a person may be pinned to one or several genes like hereditary diseases.

    It’s like alchemy.

    Same for the quest for the gene for homosexuality. It’s way more complex than that even before you start looking to the genetics.
    (And I don’t doubt that in most cases genetics are a crucial factor).

    And that’s that, so there!

  3. How do you crystallize aspects of intelligence? I had a chemistry lab in college but must have been absent for this lesson.

  4. Seems like nitpicking. Or cherry-picking.

    The paper isn’t available except by purchase, but the key participants were interviewed and made the following remarks regarding this study:

    “These results are very exciting as they provide very robust ASSOCIATIONS with intelligence. The genes we detect are involved in …, and are specifically important in …. These findings for the first time provide clear CLUES towards the underlying biological mechanisms of intelligence,” says Danielle Posthuma, Principal Investigator of the study.

    OBSERVE: “associations”, “involved in”, and “clues towards the underlying biological mechanisms”: nothing conclusive about that.

    Posthuma also remarked: “Future studies will need to clarify the exact role of these genes in intelligence in order to obtain a more complete picture of how genetic differences lead to differences in intelligence.” “The current genetic results explain up to 5% of the total variance in intelligence.”

    DOES “up to 5% of the total variance in intelligence” seem like they’ve nailed it down? Really??

    The authors at Science Daily (where the interview is published) themselves hedge quite a bit, saying:

    “The study also showed that the genetic influences on intelligence are highly CORRELATED with genetic influences on [other things listed]”

    When I read Briggs say,

    “Confirmed. Identified. C’mon, gang. Give us at least a “possibly” or a “might.””

    I’m expecting to find the authors of a study making bold conclusive claims. But reading what they actually say one finds the exact opposite. Further, reading what the authors of the press article say one finds similar hedging: “correlated” is expressly used, with no indication of the ‘correlation equals causation’ error. Sure, the article leads one in that direction, but nothing there expressly says so, and crucial details are presented in a context that makes clear that what conclusions can be reached are VERY LIMITED.

    I haven’t read the paper, so maybe the paper is more assertive about causes vs effects, but I doubt it. There are some 20,000 genes in the human genome, with about 52 identified as associated with no more than 5% of the variation in human intelligence. There ain’t no way anyone can conclude from that that anything has been “identified” or “concluded” on the subject with any real certainty.

    When Briggs derisively says “confirmed” one is led to think that the researchers validated that those genes dictate intelligence; the reality is probably much more benign–the researchers merely identified some of the same genes being CORRELATED with intelligence that others before identified (recall, even the media got that distinction, reporting “correlated” not “confirmed”).

    When Briggs derisively says “identified” one is led to think those new genes are pinpointed as also directly affecting intelligence. But given the study’s lead investigator stated, clearly, the need for further studies to clarify the role they play [which includes the possibility that they play no role at all in intelligence], we learn that “identified” means “candidate” genes that drive intelligence — specifically up to 5% of the variation.

    Briggs makes it out that the study is asserting significant conclusions and presents that in a context that is very misleading, to defame. The reality is the researchers found correlates for no more than 5% of the variation in human intelligence; that seems to me like a pittance, but progress toward real understanding.

    More said by the study lead investigator is available at:
    https://www.sciencedaily.com/releases/2017/05/170523083324.htm

  5. Why do soft scientists put so much stock in such niggardly r² values? In mechanical engineering studies, we wanted values at least in the range of r² > 0.64 or > 0.81. That the data is noisy is no reason to take the values of 0.05 more seriously.

  6. Not that I’ve looked too terribly hard, but I’ve yet to see a study of X’s association with intelligence where the measure of intelligence was the same full spectrum IQ test administered to all the participants. It’s always, always some kind of proxy.

    I don’t want to be too critical here. The WISC-V is expensive and having it administered by well trained diagnosticians on any large number of people would be really expensive.

    A possible solution: Whenever the WISC publisher gets around to normalizing version VI of the test, have them get a DNA sample from all the kids they test.
