Knowledge that a variable makes non-uniform, non-unitary, and non-explanatory average differences in an outcome is defined by Madole & Harden (M&H) as “first-generation causal knowledge” (target article, sect. 1.1, para. 5). This is the kind of knowledge that can be inferred from average treatment effects (ATEs) in randomized controlled trials (RCTs). RCTs are designed to be counterfactual, producing results that can be compared for different values of an experimental condition. The different outcomes of such experiments may, however, depend on temporal, spatial, or environmental contexts in which the experiments are carried out, which may restrict the generality of the results.
M&H introduce “second-generation causal knowledge” (target article, sect. 1.1, para. 7), which derives from understanding the mechanisms that might explain why knowledge inferred from RCTs is not uniform, unitary, or explanatory. Examples of such mechanisms include the effect of the context in which the experiment was carried out, and the role of unintended bias in the choice of subjects (the lithium case in target article, sect. 2.4). As more mediators and confounders are recognized and a more complete causal chain is established, they hope that those closer to the end of a causal chain might be more uniform, unitary, and explanatory, and the reasons for the ATE better understood.
The juxtaposition of first- and second-generation knowledge is blurred by the introduction (target article, sect. 3.3) of genes as “shallow causes” (target article, sect. 3.5) of behavioral phenotypes relative to deep causes. It is not clear whether the authors believe deep causes to be first-generation causal knowledge. However, shallow causes, which are also non-uniform, non-unitary, and non-explanatory, seem to fall under the rubric of first-generation causal knowledge and depend on when, where, how, and on whom the phenotype is assessed. Confusion arises here because second-generation causes are said to provide “a clear sense of the mechanisms of change” by identifying “in what contexts and with whom” causality can be inferred. Do the authors aim for a more complete genetic causal chain, which we assume would involve second-generation causal knowledge, but intend to base it upon shallow causes, which appear to be first generation? A precise dichotomy of first- and second-generation causes, and where deep and shallow causes fall in such a dichotomy, would have been valuable.
M&H seem to rest the conceptual legitimacy of shallow causality on the fact that its limitations are shared with ATEs from RCTs, which do have the advantage of being counterfactually based. Shared limitations are hardly a strong reason to endorse shallow causality as an analytic paradigm.
Neither population genome-wide association studies (GWASs) nor classical heritability studies have a counterfactual basis, and neither should be construed as revealing anything about causality (Feldman & Lewontin, 1975; Lewontin, 1974; Shen & Feldman, 2020). Emphasizing genes as causes, M&H focus on within-family studies, namely comparison between siblings. Given their parents' genotypes, sibs' genotypes can be regarded as a counterfactual experiment only with respect to that family. From within-family GWASs of educational attainment (EA), they conclude that “genes cause EA.” “For behavior geneticists” they regard this as “undoubtedly a triumph” (target article, sect. 3.3, paras. 3–4).
But is it? The largest GWAS of EA (Okbay et al., 2022) included 3 million subjects and 53,000 sib pairs. The polygenic score (PGS) for the general sample explained 10–16% of the variance in EA. From the within-family (sib-pair) GWASs, the estimate was that about 31% of the variance explained by the PGS could be classified as “direct effects,” which are roughly equivalent to causal. Burt (2023) goes into great detail about the dangers of making general population inferences from within-family GWASs. Here, we note that Okbay et al. (2022) report on GWASs for EA from nearly 2,500 mate pairs and find strong evidence of assortative mating on phenotypes other than EA itself that are correlated with the PGS for EA. Geographic and environmental factors most likely contributed to this assortative mating. Within the general population, there are likely to be differences among families, which may reflect cryptic population stratification. Besides assortative mating, PGSs are affected by gene–environment interactions, gene–environment correlations, and environmental variance (Okbay et al., 2022, p. 440). As pointed out by Coop and Przeworski (2022), “the central challenge to identifying genetic causes of behavioral traits” is “the immense difficulty of disentangling population stratification from biological and social effects.” Thus, it is not legitimate to claim that within-family studies of EA lead to the conclusion “that genes caused these differences” (target article, sect. 3.3, para. 3). In fact, it is important to stress that PGSs “cannot be used to predict an individual's EA” (Okbay et al., 2022, p. 440).
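To put the two figures quoted above from Okbay et al. (2022) on a common scale, a rough back-of-the-envelope calculation may help. It assumes, purely for illustration, that the 31% can be read as a simple proportion of the PGS-explained variance; the symbols R²_PGS (variance in EA explained by the PGS) and f_direct (fraction of that variance classed as direct effects) are our own labels, not notation from the original study.

```latex
% Illustrative arithmetic only, taking the quoted figures at face value.
% R^2_{PGS}  : share of EA variance explained by the PGS (0.10 to 0.16)
% f_{direct} : share of that explained variance classed as direct effects (0.31)
\[
  f_{\mathrm{direct}} \times R^2_{\mathrm{PGS}}
  \;=\; 0.31 \times 0.10 \;=\; 0.031
  \qquad\text{to}\qquad
  0.31 \times 0.16 \;\approx\; 0.050
\]
```

On this reading, the direct effects captured by the PGS would correspond to roughly 3–5% of the variance in EA, which gives a sense of how small the within-family signal is.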
M&H do recognize the difficulty of extrapolation from inference of genetic causes based on within-family studies to claims about population GWASs. They state (target article, sect. 3.3, para. 10) that genes make “some distal difference in the level of attainment” or that “while genes cause EA, this is neither a singular nor a generic claim” (target article, sect. 3.3, para. 9). Their justification for the legitimacy of the concept of genes as a shallow cause of traits like EA seems to be that statistical inference of genetic causality shares the properties of being “local, probabilistic, and distal” (target article, sect. 3.3, para. 5) with ATEs. They conclude that “genetic effects conditional on the parental genotype are causal in the same sense as average treatment effects” (target article, sect. 4, para. 1). This is actually a statement about within-family GWASs, and the paper's conflation of causal inference from such studies with that from population-level GWASs could be dangerous and should have been avoided. There is no logical reason to believe that claims about causality based on within-family studies also apply to the general population, whether the causal paradigm is first or second generation, deep or shallow (Coop & Przeworski, 2022, p. 851).
M&H are familiar with the shortcomings of the inferential processes that culminate in claims that genes cause behavior. In section 3.3, para. 6, they state “genes might cause EA but they are certainly not the only cause of EA,” and in section 3.3, para. 8 “the probability that genes matter for EA varies depending on the environmental exposures.” Such statements seem to indicate a genuflection in the direction of Lewontin's (1974) demonstration that causality cannot be inferred from analysis of variance. A straightforward and explicit statement to this effect would have been preferable to introducing complicated definitions of different kinds or levels of causality.
Financial support
This work was supported in part by the Stanford Center for Computational, Evolutionary, and Human Genomics.
Competing interest
None.