SECOND AND THIRD LANGUAGE LEARNERS’ SENSITIVITY TO JAPANESE PITCH ACCENT IS ADDITIVE: AN INFORMATION-BASED MODEL OF PITCH PERCEPTION

Seth Wiener; Seth Goss

doi:10.1017/S0272263119000068

Published online by Cambridge University Press: 25 March 2019

Seth Wiener*, Carnegie Mellon University
Seth Goss, Emory University
*Correspondence concerning this article should be addressed to Seth Wiener, Department of Modern Languages, Carnegie Mellon University, 160 Baker Hall, 5000 Forbes Avenue, Pittsburgh, PA 15213. E-mail: sethw1@cmu.edu

Abstract

This study examines second (L2) and third (L3) language learners’ pitch perception. We test the hypothesis that a listener’s discrimination of and sensitivity (d’) to Japanese pitch accent reflects how pitch cues inform all words a listener knows in an additive, nonselective manner rather than how pitch cues inform words in a selective, Japanese-only manner. Six groups of listeners performed a speeded ABX discrimination task in Japanese. Groups were defined by their L1, L2, and L3 experience with the target language’s pitch cues (Japanese), a language with less informative pitch cues (English), or a language with more informative pitch cues (Mandarin Chinese). Results indicate that sensitivity to pitch is better modeled as a function of pitch’s informativeness across all languages a listener speaks. These findings support cue-centric views of perception and transfer, demonstrate potential advantageous transfer of tonal-L1/L2 speakers, and highlight the cumulative role that pitch plays in language learning.

Type: Research Report
Information: Studies in Second Language Acquisition, Volume 41, Issue 4, September 2019, pp. 897–910

DOI: https://doi.org/10.1017/S0272263119000068
Open Practices: Open materials
Copyright: Copyright © Cambridge University Press 2019

INTRODUCTION

A speaker’s fundamental frequency (F0) conveys both nonlinguistic and linguistic information. This includes emotion and speaker idiosyncrasies, as well as language-specific pragmatic, grammatical, and lexical information (see Beckman, 1986; Gussenhoven, 2004). For learners of a second (L2) or third (L3) language, speech perception in the target language can involve two challenges related to pitch, the perceptual correlate of F0: (a) learners must discriminate new language-specific pitch patterns (So & Best, 2010, 2014); and (b) learners must weight pitch cues in accordance with their language-specific informativeness and functional load (Holt & Lotto, 2006; Surendran & Niyogi, 2006).

For example, spoken English demonstrates pitch rise and fall in intonation patterns that are not fixed to syllables or words (Cruttenden, 1997). These patterns typically convey pragmatic information at a postlexical level (Beckman & Pierrehumbert, 1986; Ladd, 2008). As a result, English pitch cues carry a relatively low functional load (Van Lancker, 1980) in that they are informative for the identification of less than 1% of English words (Shibata & Shibata, 1990).

In contrast, pitch variations in Mandarin Chinese (hereafter “Mandarin”) convey lexical information. To accurately perceive that a Mandarin speaker said “four” and not “death,” a listener must discriminate between the syllable si spoken with a falling pitch (“four”) and the same syllable spoken with a low-dipping pitch (“death”). The four discrete Mandarin pitch patterns or tones are informative for the identification of roughly 70% of Mandarin words (Shibata & Shibata, 1990), causing Mandarin tones to carry a functional load as high as that of Mandarin vowels (Surendran & Levow, 2004). Unsurprisingly, Mandarin tone discrimination and categorization are initially challenging for most nonnative listeners because pitch perception is affected by L1 phonetic, phonemic, and phonotactic properties (Hallé, Chang, & Best, 2004; Wong & Perrachione, 2007; Wu, Munro, & Wang, 2014). Yet, a listener’s sensitivity to pitch is adaptive; categories can be learned and cue weighting typically changes over time (e.g., Hao, 2012; Wang, Spence, Jongman, & Sereno, 1999).

The present study examines nonnative listeners’ perception of Japanese pitch accent in the context of L2/L3 classroom learning. Unlike previous studies that have examined participants’ perceptual learning of pitch through short lab-based training (e.g., Saito & Wu, 2014; Wayland & Guion, 2004; Wayland & Li, 2008), we test participants engaged in classroom learning of different L2/L3s that vary in how pitch cues inform word discrimination. This serves as a more ecologically valid sample of L2/L3 learners who have acquired new lexical representations informed by pitch cues. Using an information-theoretic approach (e.g., Chang, 2018; Wedel, Kaplan, & Jackson, 2013), we test and model listeners’ discrimination of and sensitivity to Japanese pitch accent. We use Japanese as our test language for two reasons.

First, Japanese allows us to draw on the well-documented L2 pitch accent literature (e.g., Shport, 2008, 2015, 2016), including key cross-linguistic studies summarized in the following text. This body of work provides clear predictions regarding how a listener’s L1 affects Japanese pitch accent perception, as well as how L2/L3-Japanese learning can improve perception of pitch accent.

Second, the informativeness of Japanese pitch cues represents an intermediary point between that of English (lowest) and that of Mandarin (highest). This allows us to test groups for which Japanese pitch accent represents either an information value increase (L1-English) or decrease (L1-Mandarin). Additionally, Japanese allows us to compare how learning more or less informative pitch cues as an L2/L3 affects pitch accent perception. We compare how L1-English + L2-Mandarin classroom learners differ in their Japanese pitch accent discrimination from L1-English + L2-Japanese classroom learners. We also compare how L1-Mandarin listeners with and without L3-Japanese classroom experience differ in their Japanese pitch accent discrimination. This serves as a test of whether a listener’s sensitivity to pitch increases, decreases, or remains unchanged given L3 input in which pitch cues are less informative than they are in the L1.

PERCEPTION OF JAPANESE PITCH ACCENT BY L1 AND L2 LISTENERS

Japanese pitch accent (in Tokyo-type accent regions) is produced with sequences of high (H) and low (L) pitch cues combined with an accent (*). These cues are minimally spread across two moras, the primary timing unit in spoken Japanese (Vance, 2008). For example, the two-mora hashi can carry three different lexical meanings as “chopsticks” (H*L: initial accented mora with a high pitch followed by a mora with a lower pitch), “bridge” (LH*: initial low pitch followed by an accented mora with a higher pitch), or “edge” (LH: initial low pitch followed by a higher pitch with neither mora accented). These accent categories are primarily cued by F0 rise/fall, though F0 peak, duration, and amplitude serve as secondary cues (Beckman, 1986; Hasegawa & Hata, 1992; Maniwa, 2002).

Native listeners make use of pitch cues for speech segmentation and the discrimination of slightly less than one fifth of Japanese words (Cutler & Otake, 1999; Goss & Tamaoka, 2015; Otake & Cutler, 1999; Shibata & Shibata, 1990). L2 learners of Japanese must therefore attend to pitch accent to become proficient listeners. In a series of dichotic listening studies, Wu, Tu, and Wang (2012) demonstrated that L1 linguistic experience affects L2 listeners’ perception of pitch accent. The researchers observed that L1-Mandarin listeners performed similarly to L1-Japanese listeners because L1-Mandarin listeners heavily weight F0 movement information at the lexical level. L1-Mandarin listeners may have also perceptually assimilated pitch accents to Mandarin tone categories (e.g., So & Best, 2010, 2014). For instance, the HLL pitch accent may be perceptually similar to Mandarin Tone 4 given the two categories’ relatively similar F0 decrease over time. Mandarin tone, however, is manifested on a syllable whereas Japanese pitch accent is manifested on at least two moras. It is therefore unclear whether L1-Mandarin listeners automatically assimilate pitch accent to Mandarin tone categories.

In contrast, the L1-English listeners tested in Wu et al. (2012) did not resemble the L1-Japanese or L1-Mandarin listeners because L1-English listeners tend to perceive pitch accent (and lexical tones) as nonlinguistic units (Burnham & Mattock, 2007; Leather, 1987; Stagray & Downs, 1993). Wu et al.’s findings corroborate previous L2 research that has shown L1-English listeners with no Japanese experience routinely discriminate pitch accent less accurately than L1-Japanese listeners (e.g., Goss, 2018; Shport, 2008, 2015, 2016).

In Wu, Kawase, and Wang’s (2017) follow-up dichotic listening study, the authors demonstrated that L2-Japanese experience improves L1-English listeners’ perception of Japanese pitch accent. After roughly one year of L2 classroom learning, L1-English + L2-Japanese listeners showed a shift toward more nativelike behavior. This finding supports previous lab-based L2-Japanese perceptual learning studies (e.g., Hirata, 1999; Minematsu et al., 2016). Taken together, Wu et al. (2012, 2017) established that while L1 background can constrain Japanese pitch accent perception, learners’ perception improves as a result of classroom L2 learning.

THE PRESENT STUDY

This study tests the hypothesis that a listener’s discrimination of and sensitivity to Japanese pitch accent reflects how F0 cues inform all known words rather than only Japanese-specific words. Because early stages of phonetic processing can involve activation of multiple languages in parallel through nonselective access (Dijkstra & Van Heuven, 2002; Kroll & Stewart, 1994; Spivey & Marian, 1999), including tonal and nontonal languages (Ortega-Llebaria, Nemoga, & Presson, 2017; Shook & Marian, 2016; Wang, Wang, & Malins, 2017; Wu, Cristino, Leek, & Thierry, 2013), a listener’s pitch perception may better reflect how F0 cues inform all lexical candidates than how F0 cues inform only candidates in the target language.

For example, an L1-English listener engaged in L2-Mandarin classroom learning will gradually encode new tonal minimal pairs. Compared to a monolingual L1-English listener, this L1-English + L2-Mandarin listener uses F0 information to identify a relatively large percentage of known words independent of the input. In theory, this predicts that L2-Mandarin listeners without Japanese experience should outperform proficiency-matched L1-English + L2-Japanese listeners in pitch accent discrimination because pitch cues inform a much larger proportion of Mandarin words than Japanese words (Shibata & Shibata, 1990). Moreover, this nonselective, information-based account predicts that discrimination accuracy should not be affected by Japanese-specific categories. L2-Mandarin learners should discriminate all Japanese pitch accent categories equally well despite unfamiliarity with particular F0 rise/fall patterns. Finally, this account predicts that pitch perception is additive or cumulative. As a learner acquires more words in which pitch is an informative cue, irrespective of the language, overall sensitivity should improve in a linear manner.

This information-based model is juxtaposed with a purely language-specific, experience-based model of L2/L3 pitch perception. Such a model predicts that although adult L2/L3-Japanese listeners can approach L1-Japanese listeners’ accuracy, L2/L3 learners should never outperform L1 listeners. Moreover, L1-English + L2-Japanese listeners should be more accurate at discriminating Japanese pitch accent than L1-English + L2-Mandarin listeners because the latter group has no experience with Japanese speech. If L1/L2-Mandarin listeners assimilate rising/falling pitch accents to Mandarin rising/falling tone categories (e.g., So & Best, 2010), the LHL pitch accent serves as a critical comparison because it does not directly map to a Mandarin tone category; only listeners exposed to Japanese speech should be familiar with the LHL pitch pattern.

To investigate learners’ pitch perception given their L1/L2/L3 experience, we test six groups of listeners in a speeded-ABX Japanese pitch accent discrimination task. We test L1-English and L1-Japanese baseline groups and compare their results to two new groups of L1-English listeners undergoing either L2-Japanese or L2-Mandarin classroom learning. This allows for an examination of whether L1-English listeners’ sensitivity to pitch is better predicted by experience with Japanese-specific pitch cues or experience with more informative pitch cues independent of the language. We additionally test two groups of L1-Mandarin (+L2-English) listeners. The first group allows for an examination of whether L1-Mandarin sensitivity to Japanese pitch accent approaches that of L1-Japanese listeners despite the former group having no Japanese experience. The second L1-Mandarin group allows for a test of how L3-Japanese input affects listeners’ perception. That is, do L1-Mandarin listeners undergoing L3-Japanese classroom learning demonstrate even greater sensitivity to pitch cues in an additive manner or does L3-Japanese learning dampen L1-Mandarin listeners’ pitch perception because pitch cues are less informative in Japanese than they are in Mandarin?

EXPERIMENT

PARTICIPANTS

Six groups of participants (N = 15 per group; see Table 1) took part in the experiment. All non-L1-English participants spoke English as an L2 and were matched for English proficiency based on self-reported abilities and Test of English as a Foreign Language (TOEFL) or International English Language Testing System (IELTS) scores. All L1-Mandarin participants were from central or northern China and self-reported only speaking standard Mandarin and no other tonal dialect. All L1-Japanese speakers were from the Tokyo-type accent regions of Japan. The L1-English groups engaged in L2-Mandarin and L2-Japanese classroom learning were controlled for their respective level of L2 proficiency based on self-reported abilities, length of L2 study, and placement in comparable L2 intermediate courses. No participant had studied abroad in China or Japan. All L3-Japanese learners were drawn from the same advanced Japanese class. All L1-English participants had 2 years or less of previous secondary school instruction in Spanish, French, Latin, or German; no participant was currently studying or exposed to an L2. An adaptive pitch test (Mandell, 2018) was used to control for pure pitch perception. All 90 participants were able to reliably differentiate two tones at 16 Hz or lower (range: 1.4–16 Hz). All participants were undergraduate or graduate students and were paid or given class credit for their participation.

TABLE 1. Participant information (group means)

Table 2 summarizes each group’s relative experience with Japanese and estimated informativeness of pitch. Experience with Japanese was conceptualized as a value from 1 (none) to 4 (native) with L1-Japanese listeners scoring the highest (4) and L1-Mandarin, L1-English, and L1-English + L2-Mandarin listeners with no Japanese experience scoring the lowest (1). L1-English + L2-Japanese listeners were given (2) to account for their intermediate proficiency level, while L1-Mandarin + L3-Japanese listeners were given (3) to account for their advanced proficiency level.

TABLE 2. Estimates of pitch informativeness and Japanese experience

* indicates that all participants in the group spoke English as an L2.

Pitch’s informativeness reflects Shibata and Shibata’s (1990; see also Tamaoka et al., 2014) calculated values. They computed informativeness by first identifying homophones in lexical corpora from the three languages and then determining which homophones differed by stress, pitch, or tone alone. Values were then averaged across word lengths, yielding estimates for each language: 0.5% for English, 14% for Japanese, and 71% for Mandarin. For intermediate L2-Mandarin and L2-Japanese learners, values were halved. For advanced L2-English and L3-Japanese learners, values were multiplied by three fourths. Thus, the information values roughly captured the same four-step learning continuum used for measuring Japanese experience. For example, the informativeness value for the L1-Mandarin + L2-English + L3-Japanese group was calculated as: 71 (L1) + .375 (L2) + 10.5 (L3) = 81.875.
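The weighting scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the group labels and the `informativeness` helper are hypothetical names, and treating the L1-Japanese and L1-Mandarin groups as advanced L2-English speakers is inferred from the participant description rather than read from Table 2.

```python
# Per-language estimates quoted in the text (percent of words informed by pitch).
BASE = {"English": 0.5, "Japanese": 14.0, "Mandarin": 71.0}

# Weights implied by the description: full value for an L1, one half for an
# intermediate learner, three fourths for an advanced learner.
WEIGHT = {"native": 1.0, "advanced": 0.75, "intermediate": 0.5}


def informativeness(profile):
    """Sum the weighted pitch informativeness over every language in a profile."""
    return sum(BASE[language] * WEIGHT[level] for language, level in profile)


# Hypothetical group labels; proficiency levels are assumptions based on the
# participant description, not values copied from Table 2.
groups = {
    "L1-English": [("English", "native")],
    "L1-English + L2-Japanese": [("English", "native"), ("Japanese", "intermediate")],
    "L1-English + L2-Mandarin": [("English", "native"), ("Mandarin", "intermediate")],
    "L1-Japanese": [("Japanese", "native"), ("English", "advanced")],
    "L1-Mandarin": [("Mandarin", "native"), ("English", "advanced")],
    "L1-Mandarin + L3-Japanese": [
        ("Mandarin", "native"),
        ("English", "advanced"),
        ("Japanese", "advanced"),
    ],
}

for name, profile in groups.items():
    print(f"{name}: {informativeness(profile):.3f}")
```

Because the values are summed across languages, the ordering of groups on this predictor differs from their ordering on Japanese experience, which is what lets the two regression models be compared.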

MATERIALS

Ten nonwords were created with a segmental structure of CVCVCV (see Online Supplementary Material for items and acoustic measurements). Nonwords were chosen to eliminate potential lexical frequency information that would unfairly bias L1/L2/L3 listeners. Each trimoraic stimulus was recorded by a female native speaker of Tokyo Japanese at 16 bits/44,100 Hz with three existent accent patterns: HLL, LHL, and LHH. For example, the nonword makana was produced with the accent patterns: MAkana (HLL), maKAna (LHL), and maKANA (LHH). All nonwords were produced in the carrier sentence ____to iimashita “(I) said ___” and extracted in Praat (Boersma & Weenink, 2018). Acoustic means for each category aligned with accent categories used in previous studies (e.g., Cutler & Otake, 1999; Shport, 2015, 2016), and were thus presented to listeners without phonetic manipulation. The ABX stimuli consisted of three consecutive trimoraic nonwords with a 250 ms interstimulus interval. The first two nonwords, A and B, only differed in pitch accent for a total of 120 unique stimuli/trials (12 ABX patterns per target × 10 targets).
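One way to arrive at 12 ABX patterns per target is to take every ordered pair of distinct accent patterns for A and B and let X match either A or B. The sketch below assumes that this is how the trials were built; the text states only the totals, and `abx_patterns` is a hypothetical helper name.

```python
from itertools import permutations

ACCENTS = ("HLL", "LHL", "LHH")


def abx_patterns(accents=ACCENTS):
    """Return all (A, B, X) accent triples where A != B and X matches A or B."""
    return [(a, b, x) for a, b in permutations(accents, 2) for x in (a, b)]


patterns = abx_patterns()
n_targets = 10  # number of nonwords reported in the text

print(len(patterns))              # patterns per target
print(len(patterns) * n_targets)  # total trials
```

Under this assumption the design is balanced: X matches A in half of the patterns and B in the other half, and each accent pattern serves as X equally often.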

PROCEDURE

Participants were tested individually in a quiet room with headphones. After completing a language background questionnaire and the Tonometric pitch perception test, participants were given oral and printed instructions in their L1 by a bilingual English-Mandarin or English-Japanese experimenter. Participants were told to indicate through a button press whether the final sound (X) was similar to the first (A) or second (B) sound as quickly and accurately as possible. If participants did not respond within 2.5 seconds from the offset of X, the next trial would proceed automatically. AB ordering was counterbalanced across all trials with a 1 second intertrial interval. To familiarize participants with the task, two practice trials were presented using a practice stimulus of similar moraic structure and accent pattern as the stimuli. All stimuli were presented using Superlab 5.

STATISTICAL ANALYSES

Figure 1 shows group violin plots, 95% confidence intervals (CIs; black box), and group means (white line within CI box). CIs for the six groups revealed that L1-English listeners were overall least accurate while L1-Mandarin + L3-Japanese listeners were overall most accurate. Participants’ overall time-out rate was less than 2%.

FIGURE 1. ABX discrimination accuracy by group. Black box indicates 95% confidence interval. White line within interval indicates group mean.

The ABX data were first analyzed in R (version 3.3) using the lme4 package (Bates, Mächler, Bolker, & Walker, 2015). A mixed-effects logistic regression model of the log odds of correct discrimination was built. The model contained subjects and items as random intercepts and a fixed-effect term for group with L1-Japanese as the reference level (treatment coded), allowing for five comparisons. The model [n = 10800, log-likelihood = –3879] is summarized in Table 3. Observed power as a 95% CI for the predictor group was calculated using the simr package (Green & MacLeod, 2016): [.88, 1].

TABLE 3. Fixed-effect terms in mixed-effects model of the likelihood of ABX discrimination accuracy

R code: glmer(accuracy ~ group + (1|subject) + (1|item), family = "binomial")

The model indicated that L1-Japanese listeners were significantly more likely than L1-English and L1-English + L2-Japanese listeners to accurately discriminate pitch accent. No difference was found between the L1-Japanese and L1-English + L2-Mandarin listeners. The L1-Mandarin listeners were marginally more likely than L1-Japanese listeners to accurately discriminate pitch accent. The L1-Mandarin + L3-Japanese listeners were significantly more likely than the L1-Japanese listeners to accurately discriminate pitch accent.

To test planned comparisons not present in the main model, two additional models were built with different L1 reference levels and the inclusion of Japanese pitch accent type (LHL as the reference level, treatment coded). R code: glmer(accuracy ~ group * accent + (1|subject) + (1|item), family = "binomial"). The model testing L1-English listeners as the reference level [n = 5400, log-likelihood = –2411] revealed that L1-English + L2-Japanese listeners were as likely as L1-English listeners to accurately discriminate pitch accent [β = 0.40, z = 1.74, p = .18]. In contrast, L1-English + L2-Mandarin listeners were significantly more likely than L1-English listeners to accurately discriminate pitch accent [β = 0.66, z = 2.80, p < .01]. A comparison between the two L2 groups calculated using least-square-means in the lsmeans package (Lenth, 2016) revealed that L1-English + L2-Japanese listeners did not differ from L1-English + L2-Mandarin listeners in their discrimination accuracy [β = 0.25, z = 1.07, p = .53]. Neither a main effect of accent type nor its interaction with group was significant at an alpha level of .05.
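Because the coefficients above are on the log-odds scale, exponentiating them gives odds ratios, which may be easier to interpret. The short sketch below does only that conversion; the dictionary labels are illustrative, and with treatment coding plus a group × accent interaction the group coefficients strictly describe the LHL reference level.

```python
import math

# Log-odds estimates reported above for the model with L1-English as reference.
estimates = {
    "L1-English + L2-Japanese vs. L1-English": 0.40,
    "L1-English + L2-Mandarin vs. L1-English": 0.66,
    "contrast between the two L2 groups": 0.25,
}


def odds_ratio(beta):
    """Convert a logistic-regression coefficient (log odds) to an odds ratio."""
    return math.exp(beta)


for label, beta in estimates.items():
    print(f"{label}: beta = {beta:.2f}, odds ratio = {odds_ratio(beta):.2f}")
```

On this reading, the L2-Mandarin learners' odds of a correct response were a little under twice those of the L1-English baseline group.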

The model testing L1-Mandarin listeners as the reference level [n = 3600, log-likelihood = –849] revealed that L1-Mandarin listeners did not differ from L1-Mandarin + L3-Japanese listeners in their discrimination accuracy [β = 0.33, z = 1.42, p = .15]. Neither a main effect of accent type nor its interaction with group was significant at an alpha level of .05.
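As a consistency check, the observation counts reported for the three models match the design if every participant contributes all 120 trials. The sketch below assumes exactly that; it would imply that timed-out trials were retained in the data (presumably scored as incorrect), which the text does not state explicitly.

```python
PARTICIPANTS_PER_GROUP = 15
TRIALS_PER_PARTICIPANT = 120


def n_observations(n_groups):
    """Number of trial-level observations if no trials are dropped."""
    return n_groups * PARTICIPANTS_PER_GROUP * TRIALS_PER_PARTICIPANT


models = [
    ("all six groups", 6),
    ("three L1-English groups", 3),
    ("two L1-Mandarin groups", 2),
]
for label, n_groups in models:
    print(f"{label}: n = {n_observations(n_groups)}")
```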

To model this overall pattern of responses, raw ABX data were converted to d-prime (d’), a measure of perceptual sensitivity that accounts for response bias (McNicol, 2005). The differencing model strategy (Hautus & Meng, 2002) was used to calculate d’; adjustments to d’ were not required because all participants performed above chance and no participant had a hit rate of 1. Participants’ d’ scores were analyzed in R by building two linear models. The models contained a fixed-effect term for either experience with Japanese or estimated informativeness of pitch. R code: lm(dprime ~ experience), lm(dprime ~ informativeness). Values for these variables were taken from Table 2 corresponding to either language experience or Shibata and Shibata’s (1990) calculations of informativeness.
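For readers unfamiliar with ABX sensitivity measures, the following sketch shows one simplified way to obtain d’ under a differencing strategy. It is not the authors' procedure: it assumes an unbiased observer and works from proportion correct rather than from hit and false-alarm rates, and it relies on the textbook expression p(c) = Φ(d’/√2)Φ(d’/√6) + Φ(−d’/√2)Φ(−d’/√6), which should be checked against Hautus and Meng (2002) before any real use. The function names are hypothetical.

```python
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal cumulative distribution function


def abx_pc_differencing(d):
    """Predicted proportion correct in ABX for an unbiased differencing observer."""
    a = d / 2 ** 0.5
    b = d / 6 ** 0.5
    return _PHI(a) * _PHI(b) + _PHI(-a) * _PHI(-b)


def abx_dprime(pc, lo=0.0, hi=10.0, tol=1e-9):
    """Invert abx_pc_differencing by bisection; pc must lie in [0.5, 1)."""
    if not 0.5 <= pc < 1.0:
        raise ValueError("proportion correct must be in [0.5, 1)")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if abx_pc_differencing(mid) < pc:
            lo = mid  # predicted accuracy too low, so d' must be larger
        else:
            hi = mid
    return (lo + hi) / 2


for pc in (0.6, 0.75, 0.9):
    print(f"proportion correct {pc:.2f} -> d' = {abx_dprime(pc):.2f}")
```

The bisection search is bounded at d’ = 10, so proportions extremely close to 1 would simply return a value near that ceiling.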

Figure 2 plots fits from each linear model (with standard error). Points represent individual d’ scores. The model testing informativeness better fit the data [β = 0.01, t = 7.06, p < .001; adjusted R² = .36; log-likelihood = –81; f² = .56] than the model testing experience with Japanese [β = 0.15, t = 2.38, p = .02; adjusted R² = .06; log-likelihood = –98; f² = .06].
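The reported effect sizes are consistent with Cohen's f² computed as R² / (1 − R²) from the adjusted R² values. The sketch below assumes that this is how they were obtained, which the text does not state; `cohens_f2` is a hypothetical helper name.

```python
def cohens_f2(r_squared):
    """Cohen's f-squared effect size for a model with the given R-squared."""
    return r_squared / (1.0 - r_squared)


# Adjusted R-squared values reported for the two linear models.
models = [("informativeness", 0.36), ("experience with Japanese", 0.06)]

for label, r2 in models:
    print(f"{label}: adjusted R2 = {r2:.2f}, f2 = {cohens_f2(r2):.2f}")
```

By Cohen's conventional benchmarks (0.02 small, 0.15 medium, 0.35 large), the informativeness model shows a large effect and the experience model a small one.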

FIGURE 2. Linear regression model fits (with SE).

RESULTS

We examined Japanese pitch accent perception in two groups of L1-English listeners undergoing either L2-Mandarin or L2-Japanese classroom learning. Our results indicate that even after more than one year of L2-Japanese classroom training, L1-English + L2-Japanese listeners’ pitch accent discrimination still resembled that of naïve L1-English listeners, and did not reach L1-Japanese abilities. In contrast, L1-English + L2-Mandarin listeners discriminated Japanese pitch accent as accurately as L1-Japanese listeners and more accurately than L1-English listeners.

We further examined two groups of L1-Mandarin (+ L2-English) listeners. L1-Mandarin listeners without Japanese experience discriminated pitch accent with nativelike abilities. L1-Mandarin + L3-Japanese listeners discriminated pitch accent significantly more accurately than L1-Japanese listeners, suggesting that sensitivity to pitch cues continues to increase for learners in an additive manner. This hypothesis was evaluated by building two linear regression models. The model testing sensitivity as a function of pitch’s informativeness accounted for relatively more of the observed variance and had a larger effect size than the model testing sensitivity as a function of experience with Japanese.

DISCUSSION

Our ABX discrimination results (Figure 1) confirmed previously reported effects of L1 affecting listeners’ Japanese pitch accent perception: L1-English listeners discriminated Japanese pitch accent less accurately than L1-Japanese listeners whereas L1-Mandarin listeners discriminated Japanese pitch accent marginally more accurately than L1-Japanese listeners (Goss, 2018; Shport, 2008, 2015, 2016; Wu et al., 2017).

Previously reported L2-Japanese perceptual learning effects, however, were not replicated. One year of classroom learning was insufficient for L1-English + L2-Japanese listeners to approach nativelike levels of pitch accent perception; learners still resembled L1-English listeners without Japanese experience (cf. Goss, 2018; Shibata & Hurtig, 2008). Unexpectedly, the L2-Mandarin learners unfamiliar with Japanese speech reached nativelike discrimination accuracy. One interpretation of this finding is that Mandarin tone training facilitated Japanese pitch accent discrimination. Future research may test this hypothesis by examining whether training on unrelated languages in which the target cue is more informative results in improved perception of the (less informative) cue in the L2/L3. As an example, Vietnamese or Cantonese tone training may facilitate L2 learners’ Mandarin tone discrimination given the more informative tonal systems of these languages as compared to Mandarin’s (see Yip, 2002).

This view of linguistic pitch learning also predicts that advanced L1-English + L2-Mandarin learners with a large enough lexicon of tonal minimal pairs (e.g., participants from Pelzl, Lau, Guo, & DeKeyser, 2018) may demonstrate a sensitivity to Japanese pitch accent that exceeds that of L1-Japanese listeners. Indeed, superior nonnative perception was observed in the present study: L1-Mandarin + L3-Japanese listeners discriminated Japanese pitch accent more accurately than L1-Japanese listeners. To our knowledge, this is the first reported evidence of “advantageous transfer” (Bohn & Best, 2012; Chang, 2018; Chang & Mishler, 2012) in which nonnative perception of pitch accent exceeded that of native Japanese listeners.

Taken together, our results motivate the claim that sensitivity to pitch accent reflects how pitch cues inform all words a listener knows in a nonselective, additive manner. Theoretically, this information-based model is in line with a cue-centric approach of perception and transfer (e.g., Chang, 2018). How pitch informs words in an L1 and its relative functional load drive how F0 cues are initially weighted and transferred to an L2 (e.g., Schaefer & Darcy, 2014; Tremblay, Broersma, & Coughlin, 2017). During L2 acquisition, new pitch patterns are learned and F0 cue weighting changes as more words informed by pitch are acquired (Holt & Lotto, 2006). Importantly, this attunement to F0 cues appears to be additive. Sensitivity to pitch cues continues to increase during L3 acquisition. This view is compatible with models of L3 acquisition that posit any prior language experience contributes to subsequent acquisition, for example, the Cumulative Enhancement Model (Flynn, Foley, & Vinnitskaya, 2004).

While this study serves as an initial investigation, we recognize limitations to our information-theoretic approach. Among them, we assume that L2 and L3 acquisition involve the same underlying mechanisms. Future work will need to quantify L2/L3 learners’ proficiency with more rigor and account for within-group learner variability (e.g., Chandrasekaran, Sampath, & Wong, 2010; Perrachione, Lee, Ha, & Wong, 2011). Our approach also assumes listeners activate representations in their L1 and L2 in parallel, including those representations informed primarily by pitch cues. While ample evidence indicates that L1-Mandarin listeners continue to heavily weight F0 contours during L2-English lexical processing (Ortega-Llebaria et al., 2017; Shook & Marian, 2016; Wang et al., 2017; Wu et al., 2013; see also Tatsuno & Sakai, 2005 for limited L1-Japanese evidence), it is an empirical question whether L2-Mandarin or L2-Japanese listeners change their weighting of English F0 cues as a result of undergoing L2 or L3 learning. Because L2 learning can impact the L1 even during early development (Bice & Kroll, 2015), L2/L3 pitch learning could theoretically modify how L1 pitch cues are weighted.

We acknowledge that the present results cannot fully falsify a phonotactic transfer account of prosodic categories or rule out any category-specific assimilation. This claim is difficult to adequately evaluate because F0 cues are manifested differently across the two languages: syllable (Mandarin) versus minimally two moras (Japanese). Evidence for tone category transfer in L2/L3 perception is inconsistent, with results varying given the participants, tasks, stimuli, and dependent measures (see Chang, Yao, & Huang, 2017; Hallé et al., 2004; Qin & Jongman, 2016; So & Best, 2010, 2014 among others). Model fits of our discrimination data suggest that all statistically significant increases were due to aggregate gains and were not driven by category-specific improvements. We note that the LHL pitch accent, which does not directly map to a Mandarin tone category, should have been L1-Mandarin and L2-Mandarin listeners’ least accurate category. This pattern was not found. Future studies can test whether the LHL pitch accent results in new category formation for L1/L2-Mandarin listeners (e.g., Flege, 1995), how additional unfamiliar pitch patterns affect L1/L2/L3-Japanese listeners’ responses, and to what degree frequency distributions of pitch accent patterns may contribute to these results.

In conclusion, an L2 or L3 learner’s sensitivity to pitch is additive and appears to reflect how pitch informs discrimination of words in all known languages rather than in the target language only. These findings are in line with cue-centric views of perception and transfer, demonstrate potential advantageous transfer for tonal-L1 listeners, and highlight the cumulative role that pitch plays in language learning.
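As a rough illustration of the contrast between the selective and additive accounts, the following Python sketch compares a Japanese-only predictor with a predictor summed over all known languages. The weights in `PITCH_INFO`, the function names, and the two listener profiles are hypothetical placeholders chosen only to respect the ordering assumed in this study (English pitch cues less informative than Japanese, Mandarin more informative); they are not the estimates reported in Table 2, and the sketch is not the statistical model fitted to the discrimination data.

```python
# Hypothetical pitch-informativeness weights (placeholders, NOT the Table 2 estimates).
# Only the ordering English < Japanese < Mandarin is taken from the article.
PITCH_INFO = {"English": 0.125, "Japanese": 0.25, "Mandarin": 1.0}


def selective_predictor(languages):
    """Selective account: only pitch informativeness in the target language counts."""
    return PITCH_INFO["Japanese"] if "Japanese" in languages else 0.0


def additive_predictor(languages):
    """Additive account: pitch informativeness is summed over every known language."""
    return sum(PITCH_INFO[lang] for lang in languages)


# Two illustrative listener profiles (hypothetical labels).
listeners = {
    "L1-English, L2-Japanese": ["English", "Japanese"],
    "L1-English, L2-Mandarin, L3-Japanese": ["English", "Mandarin", "Japanese"],
}

for label, langs in listeners.items():
    print(label, selective_predictor(langs), additive_predictor(langs))
```

Under these placeholder weights, the selective predictor assigns both profiles the same value, whereas the additive predictor assigns the listener with Mandarin experience a higher value. That difference is the kind of contrast the two accounts make about sensitivity to Japanese pitch accent.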

SUPPLEMENTARY MATERIAL

To view supplementary material for this article, please visit https://doi.org/10.1017/S0272263119000068

Footnotes

The authors thank Joy Maa and Zhe Gao for their help with the experiment and the anonymous reviewers for their incredibly valuable and insightful feedback on earlier versions of the manuscript.

The experiment in this article earned an Open Materials badge for transparent practices. The materials are available at https://osf.io/9328s/.

References

Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67, 1–48.

Beckman, M. E. (1986). Stress and non-stress accent. Dordrecht, The Netherlands: Foris Publications.

Beckman, M. E., & Pierrehumbert, J. B. (1986). Intonational structure in Japanese and English. Phonology, 3, 255–309.

Bice, K., & Kroll, J. F. (2015). Native language change during early stages of second language learning. NeuroReport, 26, 966–971.

Boersma, P., & Weenink, D. (2018). Praat: Doing phonetics by computer [Computer software]. Version 6.0.39. Retrieved from http://www.praat.org/.

Bohn, O. S., & Best, C. T. (2012). Native-language phonetic and phonological influences on perception of American English approximants by Danish and German listeners. Journal of Phonetics, 40, 109–128.

Burnham, D., & Mattock, K. (2007). The perception of tones and phones. In Bohn, O-S. & Munro, M. J. (Eds.), Language experience in second language speech learning: In honor of James Emil Flege (pp. 259–280). Amsterdam, The Netherlands: John Benjamins Publishing.

Chandrasekaran, B., Sampath, P. D., & Wong, P. C. (2010). Individual variability in cue-weighting and lexical tone learning. The Journal of the Acoustical Society of America, 128, 456–465.

Chang, C. B. (2018). Perceptual attention as the locus of transfer to nonnative speech perception. Journal of Phonetics, 68, 85–102.

Chang, C. B., & Mishler, A. (2012). Evidence for language transfer leading to a perceptual advantage for non-native listeners. The Journal of the Acoustical Society of America, 132, 2700–2710.

Chang, Y. H. S., Yao, Y., & Huang, B. H. (2017). Effects of linguistic experience on the perception of high-variability non-native tones. The Journal of the Acoustical Society of America, 141, EL120–126.

Cruttenden, A. (1997). Intonation. Cambridge: Cambridge University Press.

Cutler, A., & Otake, T. (1999). Pitch accent in spoken-word recognition in Japanese. The Journal of the Acoustical Society of America, 105, 1877–1888.

Dijkstra, T., & Van Heuven, W. J. (2002). The architecture of the bilingual word recognition system: From identification to decision. Bilingualism: Language and Cognition, 5, 175–197.

Flege, J. E. (1995). Second language speech learning: Theory, findings, and problems. In Strange, W. (Ed.), Speech perception and linguistic experience: Issues in cross language research (pp. 233–272). Baltimore, MD: York Press.

Flynn, S., Foley, C., & Vinnitskaya, I. (2004). The cumulative-enhancement model for language acquisition: Comparing adults’ and children’s patterns of development in first, second and third language acquisition of relative clauses. International Journal of Multilingualism, 1, 3–16.

Goss, S. (2018). A critical pedagogy of lexical accent in L2 Japanese: Insights into research and practice. Japanese Language and Literature, 52, 1–26.

Goss, S., & Tamaoka, K. (2015). Predicting lexical accent perception in native Japanese speakers: An investigation of acoustic pitch sensitivity and working memory. Japanese Psychological Research, 57, 143–154.

Green, P., & MacLeod, C. J. (2016). SIMR: An R package for power analysis of generalized linear mixed models by simulation. Methods in Ecology and Evolution, 7, 493–498.

Gussenhoven, C. (2004). The phonology of tone and intonation. Cambridge: Cambridge University Press.

Hallé, P. A., Chang, Y. C., & Best, C. T. (2004). Identification and discrimination of Mandarin Chinese tones by Mandarin Chinese vs. French listeners. Journal of Phonetics, 32, 395–421.

Hao, Y. C. (2012). Second language acquisition of Mandarin Chinese tones by tonal and non-tonal language speakers. Journal of Phonetics, 40, 269–279.

Hasegawa, Y., & Hata, K. (1992). Fundamental frequency as an acoustic cue to accent perception. Language and Speech, 35, 87–98.

Hautus, M. J., & Meng, X. (2002). Decision strategies in the ABX (matching-to-sample) psychophysical task. Perception and Psychophysics, 64, 89–106.

Hirata, Y. (1999). Acquisition of Japanese rhythm and pitch accent by English native speakers (Unpublished doctoral dissertation). University of Chicago.

Holt, L. L., & Lotto, A. J. (2006). Cue weighting in auditory categorization: Implications for first and second language acquisition. The Journal of the Acoustical Society of America, 119, 3059–3071.

Kroll, J. F., & Stewart, E. (1994). Category interference in translation and picture naming: Evidence for asymmetric connections between bilingual memory representations. Journal of Memory and Language, 33, 149–174.

Ladd, D. R. (2008). Intonational phonology. Cambridge: Cambridge University Press.

Leather, J. (1987). F0 pattern inference in the perceptual acquisition of second language tone. In James, A. & Leather, J. (Eds.), Sound patterns in second language acquisition (pp. 59–81). Dordrecht, The Netherlands: Foris Publications.

Lenth, R. V. (2016). Least-squares means: The R package lsmeans. Journal of Statistical Software, 69, 1–33.

Mandell, J. (2018). Tonometric [computer software]. Retrieved from https://tonometric.com.

Maniwa, K. (2002). Acoustic and perceptual evidence of complete neutralization of word-final tonal specification in Japanese. Kansas Working Papers in Linguistics, 26, 93–112.

McNicol, D. (2005). A primer of signal detection theory. Mahwah, NJ: Lawrence Erlbaum.

Minematsu, N., Hirano, H., Nakamura, N., & Oikawa, K. (2016). Improvement of naturalness of learners’ spoken Japanese by practicing with the web-based prosodic reading tutor, Suzuki-kun. Speech prosody 2016 (pp. 257–261). Boston, MA.

Ortega-Llebaria, M., Nemoga, M., & Presson, N. (2017). Long-term experience with a tonal language shapes the perception of intonation in English words: How Chinese-English bilinguals perceive “Rose?” vs. “Rose.” Bilingualism: Language and Cognition, 20, 367–383.

Otake, T., & Cutler, A. (1999). Perception of suprasegmental structure in a non-native dialect. Journal of Phonetics, 27, 229–253.

Pelzl, E., Lau, E., Guo, T., & DeKeyser, R. (2018). Advanced second language learners’ perception of lexical tone contrasts. Studies in Second Language Acquisition, https://doi.org/10.1017/S0272263117000444.

Perrachione, T. K., Lee, J., Ha, L. Y., & Wong, P. C. (2011). Learning a novel phonological contrast depends on interactions between individual differences and training paradigm design. The Journal of the Acoustical Society of America, 130, 461–472.

Qin, Z., & Jongman, A. (2016). Does second language experience modulate perception of tones in a third language? Language and Speech, 59, 318–338.

Saito, K., & Wu, X. (2014). Communicative focus on form and second language suprasegmental learning: Teaching Cantonese learners to perceive Mandarin tones. Studies in Second Language Acquisition, 36, 647–680.

Schaefer, V., & Darcy, I. (2014). Lexical function of pitch in the first language shapes cross-linguistic perception of Thai tones. Laboratory Phonology, 5, 489–522.

Shibata, T., & Hurtig, R. (2008). Prosody acquisition by Japanese learners. In Han, Z. (Ed.), Understanding second language process (pp. 176–204). Clevedon, UK: Multilingual Matters.

Shibata, T., & Shibata, R. (1990). Akusento wa doo’ongo o dono teido bembetsu shiuruka: Nihongo, eigo chugokugo no ba’ai [Is word-accent significant in differentiating homonyms in Japanese, English, and Chinese?]. Keiryoo Kokugogaku [Mathematical Linguistics], 17, 317–327.

Shook, A., & Marian, V. (2016). The influence of native-language tones on lexical access in the second language. The Journal of the Acoustical Society of America, 139, 3102–3109.

Shport, I. A. (2008). Acquisition of Japanese pitch accent by American learners. In Heinrich, P. & Sugita, Y. (Eds.), Japanese as a foreign language in the age of globalization (pp. 165–187). Munich, Germany: Iudicium Publishing.

Shport, I. A. (2015). Perception of acoustic cues to Tokyo Japanese pitch-accent contrasts in native Japanese and naïve English listeners. The Journal of the Acoustical Society of America, 138, 307–318.

Shport, I. A. (2016). Training English listeners to identify pitch-accent patterns in Tokyo Japanese. Studies in Second Language Acquisition, 38, 739–769.

So, C. K., & Best, C. T. (2010). Cross-language perception of non-native tonal contrasts: Effects of native phonological and phonetic influences. Language and Speech, 53, 273–293.

So, C. K., & Best, C. T. (2014). Phonetic influences on English and French listeners’ assimilation of Mandarin tones to native prosodic categories. Studies in Second Language Acquisition, 36, 195–221.

Spivey, M. J., & Marian, V. (1999). Cross talk between native and second languages: Partial activation of an irrelevant lexicon. Psychological Science, 10, 281–284.

Stagray, J. R., & Downs, D. (1993). Differential sensitivity for frequency among speakers of a tone and a nontone language. Journal of Chinese Linguistics, 21, 143–163.

Surendran, D., & Levow, G. A. (2004). The functional load of tone in Mandarin is as high as that of vowels. In Bel, B. & Marlien, I. (Eds.), Proceedings of the 2nd International Conference on Speech Prosody (pp. 99–102). Nara, Japan.

Surendran, D., & Niyogi, P. (2006). Quantifying the functional load of phonemic oppositions, distinctive features, and suprasegmentals. In Thomsen, O. N. (Ed.), Competing models of linguistic change: Evolution and beyond (pp. 43–58). Amsterdam, The Netherlands, and Philadelphia, PA: John Benjamins.

Tamaoka, K., Saito, N., Kiyama, S., Timmer, K., & Verdonschot, R. G. (2014). Is pitch accent necessary for comprehension by native Japanese speakers? An ERP investigation. Journal of Neurolinguistics, 27, 31–40.

Tatsuno, Y., & Sakai, K. L. (2005). Language-related activations in the left prefrontal regions are differentially modulated by age, proficiency, and task demands. Journal of Neuroscience, 25, 1637–1644.

Tremblay, A., Broersma, M., & Coughlin, C. E. (2017). The functional weight of a prosodic cue in the native language predicts the learning of speech segmentation in a second language. Bilingualism: Language and Cognition, 21, 1–13.

Van Lancker, D. (1980). Cerebral lateralization of pitch cues in the linguistic signal. Research on Language and Social Interaction, 13, 201–277.

Vance, T. (2008). The sounds of Japanese. New York, NY: Cambridge University Press.

Wang, X., Wang, J., & Malins, J. (2017). Do you hear “feather” when listening to “rain”? Lexical tone activation during unconscious translation: Evidence from Mandarin-English bilinguals. Cognition, 169, 15–24.

Wang, Y., Spence, M. M., Jongman, A., & Sereno, J. A. (1999). Training American listeners to perceive Mandarin tones. The Journal of the Acoustical Society of America, 106, 3649–3658.

Wayland, R. P., & Guion, S. G. (2004). Training English and Chinese listeners to perceive Thai tones: A preliminary report. Language Learning, 54, 681–712.

Wayland, R. P., & Li, B. (2008). Effects of two training procedures in cross-language perception of tones. Journal of Phonetics, 36, 250–267.

Wedel, A., Kaplan, A., & Jackson, S. (2013). High functional load inhibits phonological contrast loss: A corpus study. Cognition, 128, 179–186.

Wong, P. C., & Perrachione, T. K. (2007). Learning pitch patterns in lexical identification by native English-speaking adults. Applied Psycholinguistics, 28, 565–585.

Wu, X., Kawase, S., & Wang, Y. (2017). Effects of acoustic and linguistic experience on Japanese pitch accent processing. Bilingualism: Language and Cognition, 20, 931–946.

Wu, X., Munro, M. J., & Wang, Y. (2014). Tone assimilation by Mandarin and Thai listeners with and without L2 experience. Journal of Phonetics, 46, 86–100.

Wu, X., Tu, J. Y., & Wang, Y. (2012). Native and nonnative processing of Japanese pitch accent. Applied Psycholinguistics, 33, 623–641.

Wu, Y. J., Cristino, F., Leek, C., & Thierry, G. (2013). Non-selective lexical access in bilinguals is spontaneous and independent of input monitoring: Evidence from eye tracking. Cognition, 129, 418–425.

Yip, M. (2002). Tone. Cambridge: Cambridge University Press.

TABLE 1. Participant information (group means)

TABLE 2. Estimates of pitch informativeness and Japanese experience

FIGURE 1. ABX discrimination accuracy by group. Black box indicates 95% confidence interval. White line within interval indicates group mean.

TABLE 3. Fixed-effect terms in mixed-effects model of the likelihood of ABX discrimination accuracy

FIGURE 2. Linear regression model fits (with SE).

Wiener and Goss supplementary material

Table S1
