Proponents of sociogenomics argue that polygenic scores (PGSs) should be incorporated into social science research. PGSs are derived from many common variants (up to thousands), each of very small effect size, that are associated with measures of a social behavior trait as determined by genome-wide association studies (GWASs; Pain et al., 2021). As Burt described, although PGSs are suggested to be measures of genetic influence or propensity for complex traits, several factors make it difficult or impossible to distinguish genetic and environmental effects on such traits. Thus, PGSs are unlikely to be strictly genetic predictors of the propensity to exhibit a trait. Moreover, PGSs do not typically identify alleles or variants responsible for a phenotype (Astle et al., 2016). This is unfortunate because identifying a deleterious mutation would permit its biological activity to be studied. The information obtained could provide the knowledge needed to repair or counteract the deleterious effects of the mutation. This limitation of PGSs exemplifies a broader issue: GWASs have identified thousands of strong associations with complex diseases and traits, but in very few instances has the actual risk variant been identified (Chorley et al., 2008) or have the associations been successfully translated into clinical use (Bomba, Walter, & Soranzo, 2017). Identifying causal common variants in GWASs has been difficult because they usually map to regulatory regions (Astle et al., 2016), where they influence gene expression, including processes involved in the execution of gene expression such as splicing (Lalonde et al., 2011).
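To make the construction concrete, a PGS is commonly computed as a weighted sum of an individual's effect-allele counts, with the weights taken from GWAS effect-size estimates. The following Python sketch illustrates only that arithmetic. The function name, the dosage coding (0, 1, or 2 copies of the effect allele), and the example weights are hypothetical and are not drawn from any of the cited studies.

```python
def polygenic_score(dosages, weights):
    """Return the weighted sum of effect-allele counts across variants.

    dosages: copies of the effect allele carried at each variant (0, 1, or 2).
    weights: per-allele effect-size estimates for the same variants, in the
             same order (hypothetical values here, standing in for GWAS output).
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical individual genotyped at four variants with small effects.
example_dosages = [0, 1, 2, 1]
example_weights = [0.5, -0.25, 0.125, 0.25]
example_score = polygenic_score(example_dosages, example_weights)
print(example_score)
```

Because the score is a sum over many small contributions, it summarizes an overall association with the trait without singling out any one variant as causal, which is the limitation described above.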
In contrast to common variants of small effect size, rare variants that have large effects on phenotypes have been identified. These variants are often associated with the protein-coding portion of a gene. As proteins are important for structural and physiological functions of cells, mutations that affect them can produce these large effects. An explanation for the rarity of these variants based on evolutionary theory proposes that the detrimental effect of disease on fitness results in selection against variants that promote disease (Gibson, 2012).
Rare variants are often identified by quantitative trait locus (QTL) analysis, which looks for correlations between variants and measures of continuous phenotypic traits (Bloom et al., 2019). The goal is to uncover the locations in the genome important for these traits. A variation of this analysis that has identified rare variants of large effect size compares individuals who display the trait of interest with individuals who do not, drawn from multiple generations of families or from isolated populations. Rare variants might be found at higher frequencies in isolated populations because of previous bottleneck events, genetic drift or adaptation, and selection (Moltke et al., 2014). This increases the power to detect associations between rare variants and phenotypes (Colonna et al., 2013). In studies that sample from families or isolated populations, variants that are closely linked to the mutation or causative allele are present at higher frequency in individuals who exhibit the trait than in individuals who do not. The locations of these variants indicate the chromosome region likely to contain the mutation. Positional cloning within this region can be used to identify the mutated gene, and comparison of this gene's DNA sequence in subjects with and without the trait can then identify the causative mutation. Even though this mutation might only be present in a family or isolated population, the ability to study how any mutation alters the brain to influence a complex behavioral trait would be a breakthrough.
One successful study using this strategy focused on Canadian families of Celtic descent with multiple relatives in up to three generations diagnosed with schizophrenia (Brzustowicz, Hodgkinson, Chow, Honer, & Bassett, 2000). A highly significant association between schizophrenia and a locus on chromosome 1q21–q22 was found. Additional variants within this region were then used to pinpoint the nitric oxide synthase 1 adaptor gene (Brzustowicz et al., 2004). This gene is overexpressed in the frontal cortex of people with schizophrenia, and it is involved in synaptic function and cortical neuron development, effects that could contribute to schizophrenia (Carrel et al., 2015; Hernandez et al., 2016).
In contrast, GWASs, and thus PGSs, do not typically detect QTLs or rare variants of large effect size because these variants are rare in the total population sampled by GWASs. The power to detect a variant of any effect size decreases with the frequency of the variant because fewer individuals in the sample carry a less-frequent variant (Zuk et al., 2014). Put another way, because GWASs calculate the average effects of alleles across thousands of individuals, they cannot capture heterogeneity of effect sizes at the family level (Gibson, 2012).
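The dependence of power on allele frequency can be illustrated with a standard textbook approximation, which is offered here as an assumption and is not taken from the cited papers. For a standardized trait under an additive model, a variant with effect-allele frequency p and per-allele effect beta explains roughly 2p(1 - p)beta^2 of the trait variance, and a one-degree-of-freedom association test in n unrelated individuals has a non-centrality parameter of about n times that quantity. The Python sketch below uses this approximation, assumes Hardy-Weinberg proportions, and adopts the conventional genome-wide significance threshold of 5e-8. All function names, the sample size, and the effect size are hypothetical.

```python
from statistics import NormalDist


def variance_explained(freq, beta):
    """Approximate trait variance explained by one additive variant."""
    return 2.0 * freq * (1.0 - freq) * beta ** 2


def expected_carriers(n, freq):
    """Expected number of individuals with at least one copy of the allele,
    assuming Hardy-Weinberg proportions."""
    return n * (1.0 - (1.0 - freq) ** 2)


def gwas_power(n, freq, beta, alpha=5e-8):
    """Approximate power of a two-sided 1-df association test.

    Uses the normal approximation: the test statistic is shifted by the
    square root of the non-centrality parameter n * variance_explained.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    shift = (n * variance_explained(freq, beta)) ** 0.5
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Hypothetical scenario: same sample size and per-allele effect,
# progressively rarer variants.
for freq in (0.30, 0.05, 0.01, 0.001):
    carriers = expected_carriers(100_000, freq)
    power = gwas_power(100_000, freq, 0.05)
    print(f"freq={freq:<6} carriers={carriers:>9.0f} power={power:.3g}")
```

Under these assumptions, holding the sample size and per-allele effect fixed, both the expected number of carriers and the approximate power fall as the variant becomes rarer. Studies of families or isolated populations sidestep this by sampling where the variant is enriched.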
Can approaches that detect rare variants be useful for sociogenomics? It could be argued that some measures of interest in sociogenomics, for example, level of educational attainment, could not be accounted for by one or a few rare variants. However, the contrast between what GWASs and PGSs identify best (common variants of small effect size) and what QTL and related approaches identify best (rare variants of large effect size) suggests that QTL and related approaches could have significant relevance for sociogenomics. As discussed above, by studying the right population it may be possible to identify associations of a complex behavioral trait with rare variants of large effect and ultimately identify one or more causative alleles. Social behaviors are complex and depend on multiple interacting neural systems, as illustrated in a recent review on neural encoding of social valence (Padilla-Coreano, Tye, & Zelikowsky, 2022). In that review, social attributes, social memory, social rank, and social isolation were proposed to influence valence assignment to social stimuli, which in turn influences social interactions. The separate neural circuits that control each of these influences were also described, with some overlap among them noted. Interestingly, the authors suggest that across psychiatric disorders, brain regions that contribute to encoding of valence and social functions exhibit abnormal activity during emotional processing (e.g., Laviolette, 2007). Thus, if a mutation disrupts one or more of the neural systems that influence valence assignment, this might lead to abnormal social interactions, and a search might identify causal variants, including rare ones of large effect size.
Financial support
This research received no special grant from any funding agency, commercial, or not-for-profit sectors.
Competing interest
None.