Introduction
Adult second language acquisition is demanding (Sakai, Reference Sakai2005) and individuals vary in terms of their learning efficacy (Qi, Han, Garel, Chen & Gabrieli, Reference Qi, Han, Garel, Chen and Gabrieli2015). Several studies have highlighted the importance of structural and functional brain connectivity for individual differences in learning capacity (Xiang, Dediu, Roberts, Oort, Norris & Hagoort, Reference Xiang, Dediu, Roberts, Oort, Norris and Hagoort2012; Sheppard, Wang & Wong, Reference Sheppard, Wang and Wong2012; Hosoda, Tanaka, Nariai, Honda & Hanakawa, Reference Hosoda, Tanaka, Nariai, Honda and Hanakawa2013; Lopez-Barroso, de Diego-Balaguer, Cunillera, Camara, Münte & Rodriguez-Fornells, Reference Lopez-Barroso, de Diego-Balaguer, Cunillera, Camara, Münte and Rodriguez-Fornells2011; Qi et al., Reference Qi, Han, Garel, Chen and Gabrieli2015 and Ripollés, Biel, Peñaloza, Kaufmann, Marco-Pallarés, Noesselt & Rodríguez-Fornells, Reference Ripollés, Biel, Peñaloza, Kaufmann, Marco-Pallarés, Noesselt and Rodríguez-Fornells2017). There is also increasing evidence that predispositions in brain structure (and function) play a part for music and speech learning (Zatorre, Reference Zatorre2013) as well as language acquisition (Qi et al., 2015) and more generally as a determinant for acquiring expertise (Ullén, Hambrick & Mosing, Reference Ullén, Hambrick and Mosing2016).
Functional and structural neuroimaging can help explain why some people learn better than others. Yang, Gates, Molenaar and Li (Reference Yang, Gates, Molenaar and Li2015) used functional Magnetic Resonance Imaging (fMRI) to study native English speakers as they were taught a new tonal vocabulary over the course of six weeks. Their results show that successful and less successful learners differed in brain activation prior to training, with greater neural activity during tone discrimination for successful learners, and more activity during pitch discrimination for less successful learners. Over time, successful learners also showed a more coherent and integrated functional network. Sheppard, Wang and Wong (Reference Sheppard, Wang and Wong2012) used graph theory to look at functional network characteristics (using fMRI) in relation to learning outcomes following an auditory pitch discrimination task. They found that successful learners showed reduced local efficiency but increased global efficiency in a core network of auditory language areas.
Other studies have combined measures of functional (using fMRI) and structural (using Diffusion Tensor Imaging: DTI) brain connectivity. López-Barroso and colleagues (Reference Lopez-Barroso, de Diego-Balaguer, Cunillera, Camara, Münte and Rodriguez-Fornells2011) showed that efficient word learners rely on fast and efficient communication between temporal and frontal areas in the left hemisphere, whilst more recent findings (Ripollés et al., Reference Ripollés, Biel, Peñaloza, Kaufmann, Marco-Pallarés, Noesselt and Rodríguez-Fornells2017) have linked brain connectivity in temporal pathways to correct identification of words and meanings. Qi et al. (Reference Qi, Han, Garel, Chen and Gabrieli2015) compared DTI measurements of known language tracts to the outcome of four weeks of Mandarin training and saw that language learning was correlated with DTI measures in the right hemisphere, but not in the left. These findings, taken together with work by Xiang et al. (Reference Xiang, Dediu, Roberts, Oort, Norris and Hagoort2012), show that individual differences in brain connectivity can help explain why some people learn a language more successfully than others.
Structural brain change (in the form of alterations in grey and white matter microstructure) can also occur as an effect of learning a new language (see Li, Legault and Litcofsky (Reference Li, Legault and Litcofsky2014) for review). We identified local increases in grey matter volume as an effect of learning a new language (Mårtensson, Eriksson, Bodammer, Lindgren, Johansson, Nyberg & Lövdén, Reference Mårtensson, Eriksson, Bodammer, Lindgren, Johansson, Nyberg and Lövdén2012). Intense language learning in a group of conscript interpreters at the Swedish Armed Forces Language School led to regional changes in cortical thickness as well as changes in right hippocampal volume. Importantly, these changes were related to learning outcomes, with larger increases in right hippocampal volume and the superior temporal gyrus in interpreters who became more proficient in their assigned language. White matter microstructure has also been known to change in response to language training. Schlegel, Rudelson and Tse (Reference Schlegel, Rudelson and Tse2012) collected monthly DTI scans of English speakers who underwent a 9-month intensive course in Modern Standard Chinese and saw that white matter networks reorganized progressively during that time period. Hosoda and colleagues (Reference Hosoda, Tanaka, Nariai, Honda and Hanakawa2013) investigated a larger cohort of participants (n = 137) using both grey matter and white matter measures. They found that grey matter structure was initially related to vocabulary competence but did not predict the outcome of a 16-week second-language training program. They did, however, see changes in both white matter microstructure and regional grey matter volume, primarily in the right hemisphere, as an effect of training.
Language studies can also lead to brain changes on the functional level (e.g., Paulesu, Vallar, Berlingeri, Signorini, Vitali, Burani, Perani & Fazio, Reference Paulesu, Vallar, Berlingeri, Signorini, Vitali, Burani, Perani and Fazio2009; Shtyrov, Reference Shtyrov2011; Shtyrov, Nikulin & Pulvermüller, Reference Shtyrov, Nikulin and Pulvermüller2010).
The highlighted studies point towards the importance of the language network as a determinant for later language performance. And even more importantly, they illustrate the ability for said network to change in response to demands. The present study follows a select group of multilinguals as their language networks are pushed at a high pace, allowing us to observe how an experienced brain responds to training. Studying this group can lend valuable insight into how learning affects other advanced groups that learn multiple languages over the course of their careers, such as interpreters or diplomats. Our earlier findings (Mårtensson et al., Reference Mårtensson, Eriksson, Bodammer, Lindgren, Johansson, Nyberg and Lövdén2012) were limited to grey matter structure. In light of the functional and structural findings presented above it is relevant to investigate whether a) white matter and grey matter microstructure will act as a predictor of learning outcomes and b) whether white matter microstructure will change in response to language training in a group of multilingual individuals who were pre-selected for their language ability prior to training.
Grey matter and white matter measurements from the same population as in Mårtensson et al. (Reference Mårtensson, Eriksson, Bodammer, Lindgren, Johansson, Nyberg and Lövdén2012) were used for this study. Individuals with strong language ability and knowledge of at least 2 foreign languages enrolled at the Swedish Armed Forces Language School and studied at a pace of 350-500 new words per week. Increases in right hippocampal volume and changes in cortical thickness of the inferior frontal gyrus (IFG), superior temporal gyrus (STG) and the middle frontal gyrus (MFG) were observed following 3 months of language learning. These areas hold key roles in sensorimotor aspects of language (Demonet, Reference Demonet2005; Hickok & Poeppel, Reference Hickok and Poeppel2007; Price, Reference Price2010). More specifically, the IFG is believed to be involved in the articulatory network (predominantly left hemisphere) whilst the STG is concerned with spectrotemporal analysis (bilaterally). Both belong to the same frontal network of areas regulating speech processing (Hickok & Poeppel, Reference Hickok and Poeppel2007). The hippocampus is also believed to be involved in rapid learning of new words (Davis & Gaskell, Reference Davis and Gaskell2009).
There are known connections (see Friederici (Reference Friederici2009), for review) between the IFG and the STG, such as the arcuate fasciculus and the superior longitudinal fasciculus (SLF). The arcuate fasciculus forms part of the SLF and has traditionally been claimed to connect Broca's and Wernicke's areas (Catani & Thiebaut de Schotten, Reference Catani and Thiebaut de Schotten2008), a statement that has been challenged as advancements in neuroimaging have taken place (Bernal & Altman, Reference Bernal and Altman2010). Another pathway that is believed to be related to language is the uncinate fasciculus. It connects the anterior temporal lobe with the medial and lateral orbitofrontal cortex (Catani & Thiebaut de Schotten, Reference Catani and Thiebaut de Schotten2008).
Grey matter consists primarily of neuronal cell bodies, dendrites, and axon collaterals along with glia (mainly astrocytes), synapses, and capillaries, whilst white matter is dominated by myelinated axons (Kassem, Lagopoulos, Stait-Gardner, Price, Chohan, Arnold, Hatton & Bennett, Reference Kassem, Lagopoulos, Stait-Gardner, Price, Chohan, Arnold, Hatton and Bennett2012; Zatorre, Fields & Johansen-Berg, Reference Zatorre, Fields and Johansen-Berg2012). White matter microstructure is often measured using diffusion tensor imaging (DTI). DTI is sensitive to hindrance of water diffusion that results from tissue boundaries and is often quantified using mean diffusivity (MD), fractional anisotropy (FA), radial diffusivity (RD) and axial diffusivity (AD). MD quantifies free diffusion of water within a voxel (Beaulieu, Reference Beaulieu2002) whilst FA measures the directionality of diffusion. FA and MD are calculated from AD and RD, which measure diffusion parallel to and perpendicular to axonal fibers, respectively (Sen & Basser, Reference Sen and Basser2005).
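The relationship between the four scalar measures can be illustrated with a small sketch. It assumes the standard definitions in terms of the three diffusion-tensor eigenvalues (l1 >= l2 >= l3), which the text does not spell out; the function name and the example eigenvalues are hypothetical.

```python
import math

def dti_scalars(l1, l2, l3):
    """Return (AD, RD, MD, FA) for one voxel from tensor eigenvalues.

    Assumes l1 >= l2 >= l3 >= 0, the usual convention (an assumption,
    not something stated in the text).
    """
    ad = l1                        # axial diffusivity: along the principal axis
    rd = (l2 + l3) / 2.0           # radial diffusivity: mean of the two minor axes
    md = (l1 + l2 + l3) / 3.0      # mean diffusivity: average over all three axes
    norm = math.sqrt(l1 ** 2 + l2 ** 2 + l3 ** 2)
    if norm == 0.0:
        return ad, rd, md, 0.0     # no diffusion at all: define FA as 0
    dev = math.sqrt((l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2)
    fa = math.sqrt(1.5) * dev / norm   # 0 = isotropic, 1 = fully directional
    return ad, rd, md, fa

# Hypothetical eigenvalues in units of 10^-3 mm^2/s
ad, rd, md, fa = dti_scalars(1.7, 0.4, 0.3)
print(f"AD={ad:.3f} RD={rd:.3f} MD={md:.3f} FA={fa:.3f}")
```

Under these definitions MD equals (AD + 2 x RD) / 3, and FA is unchanged when all three eigenvalues are scaled by the same factor, so a proportional rise in AD and RD raises MD without altering FA.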
Considering earlier studies from the field, and the findings by Mårtensson et al. (Reference Mårtensson, Eriksson, Bodammer, Lindgren, Johansson, Nyberg and Lövdén2012) specifically, we hypothesize that grey and white matter will be predictive of later learning outcomes as measured by language proficiency and that white matter microstructure will change in response to training. Left and right hippocampal volumes are expected to be predictive of later language proficiency, as are areas in the language network (Friederici, Reference Friederici2011; Friederici & Gierhan, Reference Friederici and Gierhan2013). Brain structure will be measured globally using Tract-Based Spatial Statistics (TBSS: white matter) and using FreeSurfer's cortical stream (grey matter). Changes are expected to be larger in, but not restricted to, the left hemisphere and with projections towards the right temporal lobe in light of the findings of Mårtensson et al. (Reference Mårtensson, Eriksson, Bodammer, Lindgren, Johansson, Nyberg and Lövdén2012).
Methods
Participants
Fourteen (6 female) right-handed and MRI-eligible volunteers were recruited from the Swedish Armed Forces Intelligence and Security Centre. Conscripts at the center were selected from all Swedish 18-year-old men and women who had voluntarily decided to undergo military training. Screening before entry was based on school achievements, study skills, emotional stability, and intelligence. Students are required to have top grades in at least 2 foreign languages (making all students at least tri-lingual) and are given a week to learn 350 non-words of Finnish origin. The purpose of this vocabulary test is to select recruits that can manage the very high pace of the academy. Out of roughly a hundred select applicants the academy chooses between 20 and 30 students in an average year (see Dahlquist, Reference Dahlquist2004 for reference). Eight of the interpreters studied Dari, four studied Arabic, and two studied Russian. No interpreter had prior knowledge of his or her assigned language.
Controls (n = 17, 10 female) were students of cognitive science or medicine at Umeå University. The control group was recruited to be comparable to interpreters in age, intelligence, years of education, and emotional stability. Controls were measured before and after their semester, which matched closely in time to the measurements of interpreters. The time between measurements was 3 months.
The groups were equivalent in age, years of education, intelligence and emotional stability. See Table 1.
1 Additional languages known on top of Swedish and English.
2 Measures of Proficiency and Struggle were only collected for the Interpreter population.
Behavioral measures
Raven's advanced progressive matrices
The 18 odd-numbered items from set II of this test (Raven, 2000) were administered to both groups at pretest. Participants had 10 minutes to complete the task, and the dependent variable was the number of correctly selected patterns.
Anxiety ratings
A Swedish translation of the STAI Y-2 (Spielberger, Gorsuch & Lushene, Reference Spielberger, Gorsuch and Lushene1970) was modified and presented to participants who rated anxiety (Filaire, Sagnol, Ferrand, Maso & Lac, Reference Filaire, Sagnol, Ferrand, Maso and Lac2001) they had experienced in the past month. The questionnaire was administered at pretest.
Proficiency
Our proficiency measure consisted of grades from the mid-year exam at the interpreter academy. This test was performed a few weeks after posttest. This exam is especially important to the interpreters because those who fail are forced to leave the academy (none of our participants had to leave, indicating that they studied hard). The exam itself consists of one written and one oral test. The written language test includes translating full sentences and texts and the oral test includes non-simultaneous interpreting. Both tests have been developed to measure the ability of actual language use in the demanding circumstances a military interpreter might find herself in, and as such they measure a broad spectrum of language abilities. For our proficiency measure we used the means of the oral and written exam, with a scale between 1 (least proficient) and 10 (most proficient). During the stay at the academy the interpreters underwent similar tests, with oral and written exams interleaved (one per week). The teachers had no insight into the findings from the study when grading the students. The tests were developed at the academy (which has been active since 1957) by language teachers, most of whom work part time as lecturers at Swedish universities.
Struggle
To investigate whether training-induced white matter changes were connected to increased knowledge and/or a large amount of effort, we asked the head teacher at the academy to subjectively rate the amount of effort needed to stay at the academy at post-test. The question (translated from Swedish) was: “Judge how large an effort was needed for each participant to achieve the goals of the interpreter academy and to be allowed to stay in the program” and was rated on a Likert scale of 1 (little effort) to 9 (large effort). No participant scored below 6 on this scale, which led to a restriction in range.
Known languages
All participants knew at least two languages (Swedish and English) with additional languages known by both interpreters and controls. In effect, all participants were at least trilingual, which is of relevance since learning and using a second language can affect white matter microstructure (Pliatsikas, Moschopoulou & Saddy, Reference Pliatsikas, Moschopoulou and Saddy2015). Statistically the groups did not differ in terms of number of languages known (see Table 1). However, it is difficult to rule out the possibility that there were differences between the groups, since the participants were simply asked to note down any additional languages they knew (aside from Swedish and English) and not their ability in each additional language.
MR Acquisition
Images were acquired at Umeå center for Functional Brain Imaging (UFBI) on a GE Discovery MR 750, 3 Tesla scanner with a 32-channel phased-array head coil. For Diffusion Weighted Images at pre- and post-test a Spin Echo refocused EPI sequence was used. Images had a slice thickness of 2 mm. Sequence parameters were: TR = 8000 ms, 64 slices with no gap, acquisition matrix 128 × 128 interpolated to 256 × 256 matrix with a FOV of 250 mm, TE = 84.4 ms, 4 repetitions of 24 independent directions, b = 1000 s/mm² and 4 b = 0 images, Dual Spin Echo switched on and ASSET acceleration factor 2.
For T1-weighted imaging a 3D fast spoiled gradient-echo (fSPGR) sequence was used at pretest and post-test (TE = 3.2 ms, TR = 8.1 ms, TI = 450 ms, flip angle = 12°, FOV = 172 × 250 × 250 mm³, matrix = 172 × 256 × 256, bandwidth per pixel = 122 Hz, no parallel imaging, no surface coil intensity correction, 3D correction for gradient non-linearities).
DTI Preprocessing
Diffusion-weighted images were analyzed using the FSL software package (http://www.fmrib.ox.ac.uk/fsl). The four subject-specific images were averaged to improve signal-to-noise ratio and the resulting output image was corrected for possible head movement. This was done using FLIRT from FSL (Jenkinson & Smith, Reference Jenkinson and Smith2001; Jenkinson, Bannister, Brady & Smith, Reference Jenkinson, Bannister, Brady and Smith2002) with the mean of the b = 0 images from each run being used as the reference, using 6 degrees of freedom followed by nearest-neighbor interpolation. Images from all participants were inspected for motion and no participant was excluded. A mean image was calculated from the b = 0 image of each run and used as a brain mask. The resulting data were then processed via dtifit and voxelwise statistical analysis was carried out using TBSS v1.2 (Smith, Jenkinson, Johansen-Berg, Rueckert, Nichols, Mackay, Watkins, Ciccarelli, Cader, Matthews & Behrens, Reference Smith, Jenkinson, Johansen-Berg, Rueckert, Nichols, Mackay, Watkins, Ciccarelli, Cader, Matthews and Behrens2006; Smith, Jenkinson, Woolrich, Beckmann, Behrens, Johansen-Berg, Bannister, De Luca, Drobnjak, Flitney & Niazy, Reference Smith, Jenkinson, Woolrich, Beckmann, Behrens, Johansen-Berg, Bannister, De Luca, Drobnjak, Flitney and Niazy2004): all FA images were aligned into 1 × 1 × 1 mm standard space with the FMRIB58_FA image as target; the most typical subject (the participant that corresponds most closely to the rest of the sample) was then used for group-wide alignment into standard space; the resulting image was fed into a tract skeleton generation program and a threshold set at 0.2 to exclude gray matter voxels or cerebrospinal fluid. The MD, RD and AD images made use of the non-linear transformation and projection vectors extracted during preprocessing of the FA images but were otherwise treated in the same way as above.
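As a rough orientation, the TBSS stage described above corresponds to a short sequence of FSL command-line scripts. The sketch below only builds and prints that sequence; it does not run FSL. The registration flags (-T, -S) are assumptions, since the text does not state which options were used, and the function name and defaults are hypothetical.

```python
def tbss_commands(fa_glob="*.nii.gz", skeleton_threshold=0.2,
                  non_fa=("MD", "RD", "AD")):
    """Build (but do not execute) a typical TBSS command sequence.

    The -T and -S flags are assumed for illustration; the study may have
    used other options (for example a most-representative-subject target).
    """
    cmds = [
        ["tbss_1_preproc", fa_glob],      # prepare FA images (glob is expanded by the shell)
        ["tbss_2_reg", "-T"],             # register to FMRIB58_FA (assumed flag)
        ["tbss_3_postreg", "-S"],         # mean FA and skeleton from the sample (assumed flag)
        ["tbss_4_prestats", str(skeleton_threshold)],  # threshold skeleton, project FA onto it
    ]
    # Apply the FA-derived warps and projection vectors to the other measures
    cmds += [["tbss_non_FA", measure] for measure in non_fa]
    return cmds

for cmd in tbss_commands():
    print(" ".join(cmd))
```

Keeping the commands as data rather than executing them makes it easy to review the exact threshold and measure list before anything is run on real images.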
Analysis of DTI data
Difference images
For each participant, a difference image was calculated from the skeletonized images (by subtracting the pretest image from the posttest image). This was done for FA, MD, RD and AD values respectively. Interpreters were then compared against controls on these difference images by means of voxelwise permutation-based inference (5000 permutations). Significant differences in this test thus reflect differences between the groups in the amount of changes between pretest and posttest (i.e., a group by time interaction). Threshold-Free Cluster Enhancement (TFCE) was used and only TFCE p-value images, fully corrected for multiple comparisons across space and at p < 0.05 were considered.
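The logic of the group comparison on difference images can be sketched for a single skeleton voxel. This is a simplified illustration, assuming one post-minus-pre value per participant and a two-sided test on the difference in group means; it does not implement TFCE or correction across voxels, and the function name and example values are hypothetical.

```python
import random

def permutation_p_value(group_a, group_b, n_perm=5000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    def mean_diff(values):
        a, b = values[:n_a], values[n_a:]
        return sum(a) / len(a) - sum(b) / len(b)

    observed = abs(mean_diff(pooled))      # computed before any shuffling
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                # randomly relabel participants
        if abs(mean_diff(pooled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)       # include the observed labelling itself

# Hypothetical post-minus-pre FA changes at one voxel
interpreters = [0.004, -0.002, 0.001, 0.003, -0.001, 0.000, 0.002]
controls = [0.001, 0.000, -0.003, 0.002, 0.001, -0.001, 0.003]
print(permutation_p_value(interpreters, controls))
```

Because the test is run on pre-to-post differences, a small p-value would indicate that the two groups changed by different amounts, which is the group by time interaction described above.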
Pretest differences
Due to the select nature of the interpreters as compared to the control population we compared FA, MD, RD and AD between groups at pretest using the criteria above.
Behavioral correlations to change
To measure whether difference images correlated with either Struggle or Proficiency in the interpreters a third analysis was carried out, using the difference images but with Struggle and Proficiency added as covariates (in two separate runs).
Predictive value of white matter microstructure
A fourth analysis was performed to evaluate whether FA, MD, RD or AD values at pretest were related to language proficiency at posttest. Within the group of interpreters, demeaned proficiency values were added as a covariate to the same type of analysis and then tested against a null distribution generated from 5000 permutations. The mean values from the resulting skeletonized MD, RD and AD images were then exported (using fslstats with the -M option) into SPSS version 26 (IBM Corp, 2019) for outlier analysis (using the SPSS boxplot feature, which flags values lying more than 1.5 times the interquartile range beyond the quartiles), and comparison to prior grey-matter findings from Mårtensson et al. (Reference Mårtensson, Eriksson, Bodammer, Lindgren, Johansson, Nyberg and Lövdén2012). A second run was performed to account for measures of intelligence.
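Two of the steps in this analysis, demeaning the covariate and flagging boxplot outliers, are simple enough to sketch directly. The sketch assumes the conventional 1.5 x IQR fences and uses Python's inclusive quartile method, which may differ slightly from the hinges SPSS computes; function names and example values are hypothetical.

```python
import statistics

def demean(values):
    """Subtract the sample mean so the covariate is centred on zero."""
    m = statistics.fmean(values)
    return [v - m for v in values]

def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR beyond the quartiles.

    Quartiles use the 'inclusive' method; SPSS boxplots use Tukey's
    hinges, so borderline cases may be classified differently.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical proficiency grades (scale 1 to 10) and mean MD values
proficiency = [6.5, 7.0, 8.5, 5.5, 9.0]
mean_md = [0.71, 0.73, 0.72, 0.74, 0.70, 0.95]
print(demean(proficiency))
print(iqr_outliers(mean_md))
```

Centring the covariate leaves its relation to the imaging measure unchanged while keeping the model intercept interpretable as the group mean.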
Pre-test analysis of FreeSurfer data from Mårtensson et al. (Reference Mårtensson, Eriksson, Bodammer, Lindgren, Johansson, Nyberg and Lövdén2012).
For completeness, and in light of earlier findings that grey matter was predictive in a similar population (Hosoda et al., Reference Hosoda, Tanaka, Nariai, Honda and Hanakawa2013), the published cortical and hippocampal data from the present cohort were revisited to investigate whether cortical thickness or hippocampal volume was predictive of later language proficiency. All scans were manually inspected for artifacts and no scan was rejected. The volumes were then analyzed using the FreeSurfer imaging analysis suite (http://surfer.nmr.mgh.harvard.edu/; version 4.5). FreeSurfer makes use of intensity and continuity information from MR volumes to reconstruct and measure cortical thickness and volumetric segmentation in a separate analysis stream to measure subcortical structures. The technical details of these procedures have been described in earlier publications (Dale, Fischl & Sereno, Reference Dale, Fischl and Sereno1999; Dale & Sereno, Reference Dale and Sereno1993; Desikan, Segonne, Fischl, Quinn, Dickerson, Blacker, Buckner, Dale, Maguire, Hyman, Albert & Killiany, Reference Desikan, Segonne, Fischl, Quinn, Dickerson, Blacker, Buckner, Dale, Maguire, Hyman, Albert and Killiany2006; Fischl & Dale, Reference Fischl and Dale2000; Fischl, Liu & Dale, Reference Fischl, Liu and Dale2001; Fischl, Salat, Busa, Albert, Dieterich, Haselgrove, van der Kouwe, Killiany, Kennedy, Klaveness, Montillo, Makris, Rosen & Dale, Reference Fischl, Salat, Busa, Albert, Dieterich, Haselgrove, van der Kouwe, Killiany, Kennedy, Klaveness, Montillo, Makris, Rosen and Dale2002; Fischl, Salat, van der Kouwe, Makris, Segonne, Quinn & Dale, Reference Fischl, Salat, Van Der Kouwe, Makris, Ségonne, Quinn and Dale2004; Fischl, van der Kouwe, Destrieux, Halgren, Segonne, Salat, Busa, Seidman, Goldstein, Kennedy, Caviness, Makris, Rosen & Dale, Reference Fischl, van der Kouwe, Destrieux, Halgren, Segonne, Salat, Busa, Seidman, Goldstein, Kennedy, Caviness, Makris, Rosen and Dale2004; Fischl, Sereno & Dale, Reference Fischl, Sereno and Dale1999; Fischl, Sereno, Tootell & Dale, Reference Fischl, Sereno, Tootell and Dale1999; Han, Jovicich, Salat, van der Kouwe, Quinn, Czanner, Busa, Pacheco, Albert, Killiany, Maguire, Rosas, Makris, Dale, Dickerson & Fischl, Reference Han, Jovicich, Salat, van der Kouwe, Quinn, Czanner, Busa, Pacheco, Albert, Killiany, Maguire, Rosas, Makris, Dale, Dickerson and Fischl2006; Jovicich, Czanner, Greve, Haley, van der Kouwe, Gollub, Kennedy, Schmitt, Brown, Macfall, Fischl & Dale, Reference Jovicich, Czanner, Greve, Haley, van der Kouwe, Gollub, Kennedy, Schmitt, Brown, Macfall, Fischl and Dale2006; Segonne, Dale, Busa, Glessner, Salat, Hahn & Fischl, Reference Ségonne, Dale, Busa, Glessner, Salat, Hahn and Fischl2004).
Vertex-wise general linear model analysis was performed with Proficiency as the dependent variable and cortical thickness as the independent variable. For subcortical grey matter structure, a linear regression was performed using Proficiency as the dependent variable and right and left hippocampal volume as independent variables. See Mårtensson et al. (Reference Mårtensson, Eriksson, Bodammer, Lindgren, Johansson, Nyberg and Lövdén2012) for further details, and the results below.
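The subcortical model described above, Proficiency regressed on left and right hippocampal volume, can be sketched as an ordinary least-squares fit with two predictors. This is an illustration under the assumption of a plain linear model with an intercept; the function name and the example numbers are hypothetical and do not reproduce the SPSS analysis.

```python
def fit_two_predictor_ols(y, x1, x2):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via the normal equations."""
    n = len(y)
    if not (n == len(x1) == len(x2)) or n < 3:
        raise ValueError("need three equally long sequences with n >= 3")
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("predictors are perfectly collinear")
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2

# Hypothetical volumes (cm^3) and proficiency grades (scale 1 to 10)
left = [3.9, 4.1, 4.3, 4.0, 4.4]
right = [4.2, 4.0, 4.5, 4.1, 4.6]
grade = [6.0, 7.5, 8.0, 6.5, 9.0]
print(fit_two_predictor_ols(grade, left, right))
```

Left and right hippocampal volumes tend to be strongly correlated, so the two slopes from such a model are best read together rather than individually.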
Results
Pretest differences in white matter microstructure (RD) between interpreters and controls but no change over time
No statistically significant pretest differences were found between controls and interpreters for FA, MD or AD. However, interpreters had relatively lower RD values bilaterally in several areas involved in the language network (Friederici, Reference Friederici2011; Friederici & Gierhan, Reference Friederici and Gierhan2013): forceps minor, inferior fronto-occipital fasciculus, superior longitudinal fasciculus, uncinate fasciculus, anterior thalamic radiation, inferior longitudinal fasciculus with a larger area visible in the left hemisphere, the corticospinal tract as well as the cingulum (see Figure 1A). Controls did not have lower RD values compared to interpreters in any area.
Calculated difference images (post – pre) showed no selective changes over time for FA, MD, RD or AD for interpreters relative to controls. Hence, no evidence for changes in white matter microstructure over time was found.
An additional analysis compared RD at posttest between both groups to see whether the initial difference between the populations was still present. There was no difference between the groups for RD at posttest (at p < 0.05) but a tendency (p > 0.1) for interpreters to have lower RD in the same networks at posttest as at pretest, again indicating that no real change between the groups occurred as an effect of training.
Within the group of interpreters neither language proficiency (Proficiency) nor a subjective measure of the effort needed to stay at the academy (Struggle) correlated with changes in white matter microstructure.
Grey matter did not predict later language proficiency
No predictive value of cortical thickness for later proficiency was found at p < .001, using a cluster-extent threshold of 100 vertices. In addition, no predictive value was found for left and right hippocampal volume in relation to final proficiency.
White matter microstructure predicts later language proficiency
Within the group of interpreters, higher language proficiency was related to higher values of MD, RD and AD (p < .05, fully corrected for multiple comparisons) at pretest. Notably, baseline MD was higher for more proficient interpreters in areas that partly overlap with the pretest differences between interpreters and controls shown in Figure 1: forceps minor, anterior thalamic radiation, inferior fronto-occipital fasciculus, corticospinal tract, hippocampal part of the cingulum and uncinate fasciculus bilaterally. Additionally, higher MD values were found in tracts stretching along the superior longitudinal fasciculus bilaterally as well as the inferior longitudinal fasciculus in the left hemisphere (see Figure 1B-D).
RD and AD showed similar trends where higher values were positively correlated with higher Proficiency. Effects for RD and AD were limited to the forceps minor bilaterally (RD) as well as parts of the inferior fronto-occipital fasciculus (RD, AD), anterior thalamic radiation (AD) and uncinate fasciculus in the left hemisphere (RD) or bilaterally (AD). No effect was found for FA. This is indicative of crossing fibers, which have been known to cause RD and AD to increase or decrease simultaneously in the same areas (Vos, Jones, Jeurissen, Viergever & Leemans, Reference Vos, Jones, Jeurissen, Viergever and Leemans2012). The effects in RD and AD were strongly correlated (r(12) = .787, p = .001).
The areas where MD, RD or AD correlated with language proficiency were compared to changes in grey matter from Mårtensson et al. (Reference Mårtensson, Eriksson, Bodammer, Lindgren, Johansson, Nyberg and Lövdén2012). No effects were found when controlling for multiple comparisons using Bonferroni correction.
To rule out that the correlations between DTI parameters and proficiency were driven by age differences within the group of interpreters, a bivariate correlation was performed between chronological age and the mean MD, RD and AD values from the respective skeletonized images. The significant areas from each analysis (MD, RD and AD) were then used as masks on the individual skeletonized images and the mean of all the voxels within the mask exported to SPSS for analysis. Neither MD (r = .125, p = .671), RD (r = .237, p = .414) nor AD (r = .185, p = .526) correlated with age.
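The bivariate correlations reported in this section follow the usual Pearson form, where r(12) denotes n - 2 = 12 degrees of freedom for the 14 interpreters. The sketch below is a minimal illustration of that calculation and of the associated t statistic; it does not reproduce the SPSS output, and the function names and example data are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length with n >= 3")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t value for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Hypothetical ages (years) and masked mean MD values for five participants
age = [19, 20, 19, 21, 20]
mean_md = [0.72, 0.71, 0.74, 0.73, 0.70]
r = pearson_r(age, mean_md)
print(r, t_statistic(r, len(age)))
```

The t value can be compared against a t distribution with n - 2 degrees of freedom to obtain the p-value that a statistics package reports alongside r.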
The effects were not related to general intelligence
Proficiency was used once more as a covariate whilst regressing out individual scores from Raven's progressive matrices. The resulting skeletonized image was visually indistinguishable from that of the original analysis.
Discussion
We found that interpreters and controls differed in white matter microstructure at baseline, and that roughly the same networks were predictive of later learning proficiency in the interpreters. We did not find support for our hypothesis that white matter microstructure would change over time as an effect of training, possibly because of the prior language expertise in the interpreter population as compared to other study populations (Schlegel et al., Reference Schlegel, Rudelson and Tse2012).
Of the areas differing between controls and interpreters at baseline, only the superior longitudinal fasciculus (Friederici, Reference Friederici2009) is strongly related to language, but it overlaps with the arcuate fasciculus that has been tied to language performance (López-Barroso et al., Reference Lopez-Barroso, de Diego-Balaguer, Cunillera, Camara, Münte and Rodriguez-Fornells2011, but not in Ripollés et al., Reference Ripollés, Biel, Peñaloza, Kaufmann, Marco-Pallarés, Noesselt and Rodríguez-Fornells2017) and has been seen to change in response to language learning (Hosoda et al., Reference Hosoda, Tanaka, Nariai, Honda and Hanakawa2013). However, the uncinate fasciculus is believed to be related to language (Catani & Mesulam, Reference Catani and Mesulam2008), was recently implicated as relevant for language learning (Ripollés et al., Reference Ripollés, Biel, Peñaloza, Kaufmann, Marco-Pallarés, Noesselt and Rodríguez-Fornells2017) and belongs to the limbic system, which includes the hippocampus (Catani & Thiebaut de Schotten, Reference Catani and Thiebaut de Schotten2008). The right hippocampus increased in volume in the same population as described here (Mårtensson et al., Reference Mårtensson, Eriksson, Bodammer, Lindgren, Johansson, Nyberg and Lövdén2012). The inferior longitudinal fasciculus connects the limbic system to visual areas (Fox, Iaria & Barton, Reference Fox, Iaria and Barton2008) and has been observed to change in response to (Hosoda et al., Reference Hosoda, Tanaka, Nariai, Honda and Hanakawa2013), and be relevant for (Ripollés et al., Reference Ripollés, Biel, Peñaloza, Kaufmann, Marco-Pallarés, Noesselt and Rodríguez-Fornells2017; Qi et al., 2015), language learning in the right hemisphere.
The inferior fronto-occipital fasciculus has been implicated in semantic processing (Duffau, Gatignol, Mandonnet, Peruzzi, Tzourio-Mazoyer & Capelle, Reference Duffau, Gatignol, Mandonnet, Peruzzi, Tzourio-Mazoyer and Capelle2005; Duffau, Gatignol, Moritz-Gasser & Mandonnet, Reference Duffau, Gatignol, Moritz-Gasser and Mandonnet2009; Mandonnet, Nouet, Gatignol, Capelle & Duffau, Reference Mandonnet, Nouet, Gatignol, Capelle and Duffau2007; Ripollés et al., Reference Ripollés, Biel, Peñaloza, Kaufmann, Marco-Pallarés, Noesselt and Rodríguez-Fornells2017), which is likely of relevance for interpreters with high demands on vocabulary acquisition.
Lower RD values have been connected to higher levels of myelin (Song, Sun, Ju, Lin, Cross & Neufeld, Reference Song, Sun, Ju, Lin, Cross and Neufeld2003; Song, Sun, Ramsbottom, Chang, Russell & Cross, Reference Song, Sun, Ramsbottom, Chang, Russell and Cross2002; Song, Yoshino, Le, Lin, Sun, Cross & Armstrong, Reference Song, Yoshino, Le, Lin, Sun, Cross and Armstrong2005). Thus, the present findings, where controls had relatively higher values as compared to interpreters, could be taken as a cautious indication of a higher degree of myelination in the group of interpreters in white-matter pathways connecting known language areas as well as areas that have been related to language learning outcomes (Schlegel et al., Reference Schlegel, Rudelson and Tse2012; Mårtensson et al., Reference Mårtensson, Eriksson, Bodammer, Lindgren, Johansson, Nyberg and Lövdén2012; Hosoda et al., Reference Hosoda, Tanaka, Nariai, Honda and Hanakawa2013; López-Barroso et al., Reference Lopez-Barroso, de Diego-Balaguer, Cunillera, Camara, Münte and Rodriguez-Fornells2011; Qi et al., Reference Qi, Han, Garel, Chen and Gabrieli2015 and Ripollés et al., Reference Ripollés, Biel, Peñaloza, Kaufmann, Marco-Pallarés, Noesselt and Rodríguez-Fornells2017). It should be noted, however, that these areas are far from language specific, and are used in a wide array of higher cognitive functions.
White matter microstructure (MD, RD and AD) before training was related to later language performance, which is in line with findings pointing towards the importance of brain structure at baseline in relation to learning outcomes in general (Zatorre, Reference Zatorre2013) and for language learning specifically (Schlegel et al., Reference Schlegel, Rudelson and Tse2012; Hosoda et al., Reference Hosoda, Tanaka, Nariai, Honda and Hanakawa2013; López-Barroso et al., Reference Lopez-Barroso, de Diego-Balaguer, Cunillera, Camara, Münte and Rodriguez-Fornells2011; Qi et al., Reference Qi, Han, Garel, Chen and Gabrieli2015 and Ripollés et al., Reference Ripollés, Biel, Peñaloza, Kaufmann, Marco-Pallarés, Noesselt and Rodríguez-Fornells2017). Intuitively, we would expect higher FA values to be correlated positively with performance, as observed when Qi and colleagues (2015) measured local white matter microstructure in twenty-one English native speakers who were taught Mandarin for four weeks. Instead, we saw a relation between MD and language proficiency in the interpreters, with relatively higher MD values foreshadowing higher performance. The areas overlap with regions where interpreters differed from controls in RD, but are more widespread.
Increased MD has been observed with reduced membrane density (Sen & Basser, Reference Sen and Basser2005), which should be compared to FA, which is known to increase with maturation (Zatorre et al., Reference Zatorre, Fields and Johansen-Berg2012). As such, our findings appear counterintuitive when compared to the supposed greater myelination of interpreters compared to controls in the same areas. Earlier training studies have shown both decrease (Taubert, Draganski, Anwander, Müller, Horstmann, Villringer & Ragert, Reference Taubert, Draganski, Anwander, Müller, Horstmann, Villringer and Ragert2010) and increase (Scholz, Klein, Behrens & Johansen-Berg, Reference Scholz, Klein, Behrens and Johansen-Berg2009) in FA values as an effect of training (Schlegel et al., Reference Schlegel, Rudelson and Tse2012; Hosoda et al., Reference Hosoda, Tanaka, Nariai, Honda and Hanakawa2013). Halwani, Loui, Rüber and Schlaug (Reference Halwani, Loui, Rüber and Schlaug2011) compared singers to instrumental musicians. Singers showed lower FA values in the arcuate fasciculus. Within the group of singers, however, the inverse was true; singers with more experience showed lower FA values. The authors of the study conclude that with more experience it is likely that microstructural complexity increases and FA values decrease.
Relatively larger values of MD, however, point towards less microstructural complexity and in the direction of large dominating fiber volumes. Since all our MD values are contained within skeletonized tracts where smaller fibers have been averaged out across participants, large MD values in themselves are not unexpected. Relatively smaller values, on the other hand, could point towards larger amounts of crossing fibers in less proficient interpreters. Crossing fibers have also been known to cause RD and AD to increase or decrease simultaneously in the same areas (Vos et al., Reference Vos, Jones, Jeurissen, Viergever and Leemans2012), which we observe in this study. This might also help explain the counterintuitive finding that RD was lower in interpreters as compared to controls, but higher in interpreters who became more proficient in the end. Perhaps there was more room for change in those interpreters, or perhaps there was higher microstructural complexity. Diffusion imaging measures the effects of myelin, cell membranes, and other small structures on diffusion within relatively large voxels (2 mm), which raises the issue of measuring microscopic anatomical factors at a macroscopic level of detail (Mori & Zhang, Reference Mori and Zhang2006). With the level of detail that we present in this study, we cannot delve further into the neurophysiological underpinnings of our DTI findings.
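To make the relation between the diffusion measures discussed above concrete, the following minimal sketch computes AD, RD, MD and FA from the three eigenvalues of a diffusion tensor, using the standard textbook definitions. The function name `dti_scalars` and the eigenvalues are hypothetical and chosen only for illustration; they are not data from this study, and the sketch is not the analysis pipeline used here.

```python
import math


def dti_scalars(l1, l2, l3):
    """Return AD, RD, MD and FA for tensor eigenvalues sorted l1 >= l2 >= l3."""
    ad = l1                        # axial diffusivity: largest eigenvalue
    rd = (l2 + l3) / 2.0           # radial diffusivity: mean of the two smaller ones
    md = (l1 + l2 + l3) / 3.0      # mean diffusivity: mean of all three
    num = (l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2
    den = l1 ** 2 + l2 ** 2 + l3 ** 2
    # Fractional anisotropy: normalised spread of the eigenvalues, in [0, 1].
    fa = math.sqrt(1.5 * num / den) if den > 0 else 0.0
    return {"AD": ad, "RD": rd, "MD": md, "FA": fa}


# Hypothetical eigenvalues (units of 10^-3 mm^2/s), for illustration only.
voxel_a = dti_scalars(1.7, 0.3, 0.3)   # one strongly dominant direction
voxel_b = dti_scalars(1.1, 0.9, 0.3)   # diffusion spread over two directions

for name, voxel in (("voxel_a", voxel_a), ("voxel_b", voxel_b)):
    print(name, {key: round(value, 3) for key, value in voxel.items()})
```

The two hypothetical voxels have the same MD but differ in AD, RD and FA, which illustrates why a single scalar measure cannot by itself identify the underlying fiber geometry.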
It should be noted that Mårtensson et al. (Reference Mårtensson, Eriksson, Bodammer, Lindgren, Johansson, Nyberg and Lövdén2012) found selective increases in grey matter volume in the left STG, IFG, MFG and the right hippocampus for interpreters. All of these areas are adjacent to the effects observed in mean diffusivity. This makes sense, since the arcuate fasciculus is believed to connect the STG (and MTG) with the IFG and MFG, along with the premotor cortex (Catani, Jones & Ffytche, Reference Catani, Jones and Ffytche2005; Catani, Allin, Husain, Pugliese, Mesulam, Murray & Jones, Reference Catani, Allin, Husain, Pugliese, Mesulam, Murray and Jones2007). The uncinate fasciculus, in turn, is believed to connect limbic areas such as the hippocampus to frontal areas of the cortex (Catani & Thiebaut de Schotten, Reference Catani and Thiebaut de Schotten2008) that correspond reasonably well with the areas observed in the current findings along with those of Schlegel and colleagues (Reference Schlegel, Rudelson and Tse2012). A notable difference from the grey matter findings is that the initial white matter differences between interpreters and controls are mostly bilateral, whilst the grey matter changes were mainly left hemispheric (Mårtensson et al., Reference Mårtensson, Eriksson, Bodammer, Lindgren, Johansson, Nyberg and Lövdén2012).
In contrast to some previous reports (Schlegel et al., Reference Schlegel, Rudelson and Tse2012; Hosoda et al., Reference Hosoda, Tanaka, Nariai, Honda and Hanakawa2013), no white matter changes were found following intensive language training. This may be due to the extensive prior experience with languages within the group of interpreters, when compared to the control group. Compared to the participants in Schlegel et al. (Reference Schlegel, Rudelson and Tse2012), the interpreters presumably had more extensive experience with foreign language learning. Inconsistencies could also be due to small sample size, since samples in the range reported here and earlier (Mårtensson et al., Reference Mårtensson, Eriksson, Bodammer, Lindgren, Johansson, Nyberg and Lövdén2012) demonstrate excessive variability when compared to larger samples (Munson & Hernandez, Reference Munson and Hernandez2019). Hosoda et al. (Reference Hosoda, Tanaka, Nariai, Honda and Hanakawa2013) used a considerably larger number of participants (n = 137), whilst Schlegel et al. (Reference Schlegel, Rudelson and Tse2012) measured each participant (n = 27) nine times over the course of nine months.
The differences between the methods used are also worth mentioning. TBSS measures changes to large tracts common to all participants, whilst Schlegel et al. (Reference Schlegel, Rudelson and Tse2012) measured tracts in predefined regions of interest by means of fiber-tracking as well as brain-wide analysis of white matter voxels. Training characteristics may have further contributed to the differences in findings, as the participants from Hosoda et al. (Reference Hosoda, Tanaka, Nariai, Honda and Hanakawa2013) knew at least two languages prior to training. Those participants exhibited training-related changes mainly in the right hemisphere, as opposed to the predominantly left hemispheric changes from Mårtensson et al. (Reference Mårtensson, Eriksson, Bodammer, Lindgren, Johansson, Nyberg and Lövdén2012), which can perhaps be taken as an indication that different demands have been placed on the two populations.
We found that white matter microstructure in language-related areas predicts later language proficiency, which is consistent, to some extent, with earlier findings (Schlegel et al., Reference Schlegel, Rudelson and Tse2012; Hosoda et al., Reference Hosoda, Tanaka, Nariai, Honda and Hanakawa2013; López-Barroso et al., Reference Lopez-Barroso, de Diego-Balaguer, Cunillera, Camara, Münte and Rodriguez-Fornells2011; Qi et al., Reference Qi, Han, Garel, Chen and Gabrieli2015 and Ripollés et al., Reference Ripollés, Biel, Peñaloza, Kaufmann, Marco-Pallarés, Noesselt and Rodríguez-Fornells2017). Our results also fall in line with earlier cross-sectional findings showing that faster language learners have greater density of white matter in areas related to auditory processing (Golestani, Molko, Dehaene, LeBihan & Pallier, Reference Golestani, Molko, Dehaene, LeBihan and Pallier2007) and that there are structural differences between early and late literates (Carreiras, Seghier, Baquero, Estévez, Lozano, Devlin & Price, Reference Carreiras, Seghier, Baquero, Estévez, Lozano, Devlin and Price2009) as well as between simultaneous interpreters and controls (Elmer, Hänggi, Meyer & Jäncke, Reference Elmer, Hänggi, Meyer and Jäncke2011).
These results are limited to a select and small group of individuals with knowledge of several foreign languages prior to training. However, they contribute to the increasing number of studies showing that measuring white matter microstructure in the language network can be a valuable tool in understanding future learning capacity. The findings also highlight the need for further study of how initial brain structure influences and interacts with learning outcomes. Starting values need to be taken into consideration when looking at why some individuals have an easier time learning a language than others.
Acknowledgements
The authors thank A-K. Larsson & H-O. Karlsson at UFBI as well as the staff and students at the interpreter academy of the Swedish Armed Forces Intelligence and Security Center.
This work was supported by the Sofja Kovalevskaja Award (to Martin Lövdén) from the Alexander von Humboldt foundation donated by the German Federal Ministry for Education and Research, the Swedish Research Council (421-2005-2018, 421-2010-1250), the Linnaeus environment Thinking in Time: Cognition, Communication and Learning, financed by the Swedish Research Council (349-2007-869), and grants from the Swedish Research Council and the Umeå School of Education to LN.