This paper argues that tone-driven epenthesis is possible in tonal languages, contrary to claims in the literature that it is unattested/impossible. In Wamey, an epenthetic [ə] is inserted to host a high tone in two contexts. The first is to host a tone which would otherwise be left floating due to a restriction on rising tones in closed syllables (i.e. /cv̀cⒽ/ maps to [cv̀cə́] due to a ban *[cv̌c]). The second is to host a tone which is introduced by word-level morphology but is restricted from associating across a stem boundary (i.e. an input /(cv̀cv̀)cⒽ/ maps to [(cv̀cv̀)cə́] not *[(cv̀cv́)c]). These patterns cannot be attributed to syllable phonotactics, which freely allow all consonants in the coda position. We assemble the evidence for tone-driven epenthesis, focusing on the distribution of final [ə] in lexical stem structure and [ə]-alternating suffixes that pattern as underlyingly consonant-final. A simple OT analysis derives [ə]-epenthesis, using common constraints (e.g. *Float, *Rise, OCP(H), Dep(µ)), together with constraints against associating tone across certain prosodic boundaries. In total, Wamey provides evidence for parallelism between tonal and intonational languages given that intonation-driven epenthesis is established in the literature. This parallelism is predicted under a model where both types of prosodic systems make use of the same phonological substance and autosegmental architecture, and have the same functional pressures to cultivate segmental environments best suited for realising pitch targets.
1. Introducing the issue: The interaction of tone and epenthesis
Pitch is present in the linguistic signal of all spoken languages. Roughly speaking, pitch can be exploited as lexical and grammatical tone in ‘tonal languages’, and as pitch accents and boundary tones in ‘intonational languages’. For both tonal and intonational languages, the basic units of pitch contrast are referred to as tones. Such tones may be pre-associated with a specific tone-bearing unit (TBU), such as a mora, or enter the derivation unassociated with any TBU, in which case they are referred to as floating tones.
As a null hypothesis, we expect that floating tones in tonal languages and intonational languages should not be ontologically distinct (i.e. comparable representations and behaviour should be identifiable in both). We refer to this premise as tone-intonation parallelism. To exemplify, consider first the tonal language Kalabari (Ijoid: Nigeria), which has a basic distinction between H and L plus downstep (i.e. ${^\downarrow \mbox{H}}$). To express the imperative, a floating tone sequence HL is appended to the right edge of the verb and co-occurs with its lexical tone (Harry 2004). For example, a low tone root such as /sɔ̀/ ‘cook’ has the imperative form [s] ‘cook!’ with a LHL pattern, where the floating tones dock with the lone TBU.
We can compare this to (American) English, an intonational language. Jeong & Condoravdi (2018) highlight the use of the so-called calling contour (Pike 1945) in certain imperatives (e.g. in the mnemonic imperative Don't forget to feed the cats!). Phonologically, this is analysed as a complex H*${^\downarrow \mbox{H-L}}$% intonation configuration, in essence a H* pitch accent on the (final) stressed syllable cats followed by a downstepped ${^\downarrow \mbox{H}}$. In both Kalabari and English, the tonal inventory can be deconstructed into basic tonemes (H, L, ${^\downarrow \mbox{H}}$), and floating tones are unassociated in the input, target specific positions, co-occur with no segmental exponents and systematically express complex meaning as part of the grammar (here, flavours of imperative).
What happens when there is no appropriate TBU in the targeted location for a floating tone to dock with, or when docking with this TBU would create a banned structure? One strategy in a number of intonational systems is to induce epenthesis of a TBU to host the floating tones, typically some default vowel (e.g. /ə/). Consider the case of intonation in Tashlhiyt (Berber: Morocco), a book-length treatment of which is provided by Roettger (2017). A floating H is found in several intonational contexts, where it serves in part to indicate yes-no and echo questions, as well as contrastive statements (see also Grice et al. 2015 for details). One area of focus involves cases where intonational floating tones – what Roettger refers to as the ‘tune’ – target a word which does not have an appropriate TBU to host the tone, referred to as the ‘text’. Such a mismatch is found in words composed entirely of voiceless consonants (e.g. /tfsχt/ ‘you cancelled’).
Roettger identifies three general strategies for tune–text alignment in such cases, as shown in Figure 1. The solid black line indicates pitch over voiced segments, and the dashed grey line, over voiceless segments. In the first strategy, the tune is simply deleted, and no pitch rise is seen. In the second, tone is anticipated on some TBU before the targeted portion (e.g. a vowel of a preceding word). The third option involves the insertion of a default vowel, /ə/, with which the tune associates. This may be at some point within the voiceless sequence (a) or at the end of the sequence (b).
In his discussion, Roettger (2017: 126) interprets schwa insertion as serving as ‘a landing site for a communicatively relevant tone’. Although this is not a categorical pattern (epenthesis sometimes occurs even without a tone), two observations are important. First, insertion is more often observed in sentence modalities with ‘complex tonal movements’, thus showing a correlation between tone and epenthesis. Second, vowel insertion is observed significantly more in phrase-final position than phrase-medially, the former more often being the target of the intonational tune. Taken together, the presence of /ə/ can be understood (in part) as driven by the needs of intonational floating tones.
We refer to patterns such as these as tone-driven epenthesis. Tashlhiyt is but one of the languages for which the intonational literature identifies tone-driven epenthesis. Roettger & Grice (2019) summarise a number of studies which show vowel insertion correlating solely or primarily with intonational properties of the clause, including Galician (Martínez-Gil 1997), Bari Italian (Grice et al. 2018) and Tunisian Arabic (Hellmuth 2021).
Having established this, we are now in a position to ask whether tone-driven epenthesis is possible in tonal languages, in which pitch is used for lexical/grammatical functions. There is no a priori reason to exclude this possibility based on production, perception or the architecture of the phonology module. To date, this question remains unanswered and is absent from all major surveys of tonal languages (Pike 1948; Fromkin 1978; Yip 2002; Hyman 2011b; Wee 2019, inter alia).Footnote 1 In fact, Wee (2019: 208) broadly summarises that while it is common for segments to affect tone, ‘[t]here is …little evidence of reciprocation and very little evidence of tone affecting segments’, which would include vowel epenthesis.Footnote 2
Equally, the epenthesis literature in general does not address tone-driven epenthesis (Broselow 1982; Itô 1989; Piggott 1995; Blumenfeld 2006; de Lacy 2006, 2007; Hall 2006, 2011; Baković 2007; Moore-Cantwell 2016, inter alia). Epenthesis is frequently cited as having three main patterns, as laid out in Broselow (1982). One is syllabically triggered epenthesis, where a vowel is inserted for syllable well-formedness (e.g. as a response to a ban on word-final codas: …C# → …Cə#). A second is segmentally triggered epenthesis, responding to a ban on certain configurations of segments (e.g. adjacent sibilants in English brush[ə]s; Moore-Cantwell 2016: 239). The third is minimality-triggered epenthesis (e.g. a response to a ban on monosyllabic words: #CV# → #ə.CV). These three functions are corroborated in de Lacy (2006: 287 ff.), whose survey of 105 cases of vowel epenthesis shows that all are triggered to ‘satisfy a general phonological requirement such as minimal word restrictions, metrical conditions, and segmental phonotactic restrictions’.
In the few epenthesis studies which address the possibility of tone-driven epenthesis in tonal languages, it is assumed to be either impossible or unattested. Brief mention is found in Blumenfeld (2006: 153 ff.), who pursues a maximally restrictive theory of epenthesis.Footnote 3 Blumenfeld (2006: 41) hypothesises a number of impossible interactions involving epenthesis, one being that ‘tone conditions cannot affect string structure’ and, therefore, tone ‘cannot force epenthesis/syncope’. Blumenfeld (2006: 5) concludes explicitly that epenthesis is ‘used exclusively as a response to pressures of syllable structure, sonority sequencing, syllable contact and word minimality but cannot be used to avoid violations of other metrical constraints’. He characterises this generalisation as ‘equivalent to saying that in all cases where epenthesis applies, the winner of the alternative grammar without epenthesis contains either a marked consonant cluster or is a subminimal word’ (Blumenfeld 2006: 158).
Gleim (2019) summarises this literature and concludes that tone-driven epenthesis in tonal languages is unattested (we return to Gleim and his analysis of Arapaho in §5). He leaves open the question of ‘how to restrict the grammar in such a way that it excludes tone-triggered epenthesis’ (Gleim 2019: 24), which, of course, presupposes that such epenthesis is impossible.
In the sections which follow, we illustrate that Wamey shows exactly such a pattern and thus fills an empirical gap in the literature. In §2, we lay out the relevant background on Wamey and its phonology. In §3, we present our evidence for tone-driven epenthesis, drawing on root and stem structure, as well as a class of [ə]-alternating suffixes. Here, we also show the preferability of a [ə]-epenthesis analysis to an alternative analysis involving [ə]-deletion. In §4, we provide an OT analysis in which we derive the epenthesis patterns and situate Wamey within a larger theoretical context. In §5, we discuss similar phenomena and other (potential) cases of tone-driven epenthesis in tonal languages, and speculate as to why it is so rare. §6 provides a conclusion.
2. The Wamey language
Wamey (wæ̀-mèỹ; ISO 639-3 code: cou) belongs to the Tenda subgroup within Niger-Congo.Footnote 4 It is spoken on both sides of the border between Senegal (Salémata Department) and Guinea (Koundara prefecture) by at least 18,000 speakers (Jenkins & Amdahl 2007). The core of Wamey is described in detail in Santos's (1996) comprehensive grammar (including a 4000-entry lexicon), from which all the data in this paper are taken. Hereafter, we refer to this grammar as S96. We present the relevant preliminaries on Wamey phonology before moving to the issue of tone-driven epenthesis.
2.1. Basic segmental phonology
Wamey has a rich set of segmental contrasts, as shown in Table 1.Footnote 5
Digraphs and trigraphs represent single segments (consonant clusters are highly restricted, which we turn to shortly). Wamey exhibits an extensive mutation system targeting root-initial consonants (Merrill 2018: 302 ff.). The details of the mutation system are irrelevant to the current discussion, but its effects are seen in several of the examples that we cite.
Of the seven contrastive vowels, Santos identifies /ə, æ/ as ‘weak’ for a number of reasons, such as their inability to be stressed (i.e. bear word-level ‘accent’) and positional restrictions (/ə/ is not allowed in word-initial position, and /æ/ is not allowed in final position) (S96: 40–41, 53, 56). However, all vowels including /ə, æ/ can appear as the sole root vowel (e.g. /lə̀n/ ‘snake’, /sǽt/ ‘blood’). Wamey has no underlying long vowels, and surface long vowels are restricted to contexts in which one vowel assimilates to another vowel in the same word across a morpheme boundary (S96: 44).
Wamey prohibits both consonant clusters and vowel clusters within a morpheme. All consonants may appear in the coda position at all prosodic levels (p-word, p-phrase, utterance), although type and token frequencies vary considerably (S96: 47). If a consonant cluster would arise across a morpheme boundary, Wamey shows three responses, depending on the environment. One response simply allows the marked consonant cluster. This happens in compounding, in full reduplication and at clitic boundaries generally, as in (1)–(3). The relevant portion is in bold. Bear in mind that complex consonants represented by digraphs or trigraphs are single segments (e.g. represents /ŋkw/).Footnote 6
(1)
(2)
(3)
The other two responses repair consonant clusters, either by deleting the second consonant (C2) or inserting [ə] between the consonants, which is the general epenthetic vowel. The more common phenomenon is consonant deletion, as illustrated in Table 2. It is always the consonant of the second morpheme which deletes.
Nearly all (consonant-initial) derivational suffixes (Table 2a) and inflectional suffixes (Table 2b) show this type of repair. For instance, the inflectional suffix /-xâw̃/ 3s.object in Table 2b is part of a series of /x/-initial suffixes, all of which show /x/-deletion when adjacent to a consonant. Further, definite clitics which begin with /ŋ w̃ ỹ/ (Table 2c) optionally delete this consonant when consonant-adjacent, a pattern which also holds for demonstratives. We return to these inflectional suffixes and clitics in §3, where we use them to diagnose the underlying representation of preceding morphemes.
The other cluster repair is much more limited and involves [ə]-insertion, as shown in Table 3. These include both derivational suffixes (Table 3a) and inflectional ones (Table 3b). One thing which sets these affixes apart is that three of the four consist of a single consonant only. Therefore, it seems to be the case that deletion is blocked if insufficient phonological substance of the morpheme would remain. The exception is durative /-lél̰/, which, diachronically, likely consists of a frozen, non-productive suffix /-l/ plus causative factive /-él̰/, leading to a bigram /-l-él̰/ whose first morpheme still retains its monoconsonantal behaviour (cf. Wolof -taan with the same meaning, also ultimately from two suffixes in combination). As it stands, however, we must classify /-lél̰/ as an exception. What is important to take from this discussion is that this syllabically driven vowel epenthesis is independent of the tone-driven [ə]-epenthesis we discuss in §3.Footnote 7
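As an illustrative sketch only (it is not part of the analysis in this paper), the choice between the two cluster repairs at suffix boundaries can be stated procedurally. The function name `repair_suffix`, the flag `force_insertion` and the vowel set `VOWELS` are hypothetical; the vowel set is an assumption inferred from forms cited in the text, and the single-consonant suffix in the usage note is schematic.

```python
import unicodedata

# Assumption: vowel inventory inferred from forms cited in the text.
VOWELS = set("aeiouəæ")


def is_vowel(segment):
    """True if the segment's base character (tone marks stripped) is a vowel."""
    return unicodedata.normalize("NFD", segment)[0] in VOWELS


def repair_suffix(stem_ends_in_consonant, suffix, force_insertion=False):
    """Return the surface segments of a suffix after cluster repair.

    suffix is a list of segments (a digraph counts as one segment).
    force_insertion marks a lexical exception such as durative /-lél̰/.
    """
    if not stem_ends_in_consonant or is_vowel(suffix[0]):
        return list(suffix)            # no C+C cluster arises: nothing to repair
    if len(suffix) == 1 or force_insertion:
        return ["ə"] + list(suffix)    # [ə]-insertion between the two consonants
    return list(suffix[1:])            # default repair: delete the suffix-initial C
```

Under these assumptions, `repair_suffix(True, ["x", "â", "w̃"])` drops the initial /x/, while a schematic monoconsonantal suffix such as `["l"]` receives a preceding [ə] instead, since deleting its only consonant would leave nothing of the morpheme.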
2.2. Basic tonal phonology
Wamey makes a basic distinction between H and L tone, with contours HL and LH permitted but subject to various restrictions. A minimal triple for a monosyllabic base is shown in (4).
(4)
A falling HL contour is generally restricted to CVC words (or larger). The only cases of HL on a CV word are reduced forms of interrogatives, [mô] ‘who’ (< /mógà/) and [nê] ‘where’ (< /négà/). A rising LH contour is not found on the surface in either CV or CVC words (a point we return to).
In multisyllabic words, contours are found predominantly in word-final position, as in (5). Here, we see a contrast between LH and HL contours.
(5)
The LH contour must be preceded by an H tone; [L.LH] is unattested. The most common realisation of LH is as a surface mid tone ((5c) would be [æ̀ŋómpē]). For completeness, other tone rules of Wamey are provided in (6); the Greek letter tau represents a TBU.
(6)
Finally, a crucial part of our analysis involves positing floating tones as part of the lexical representation of certain morphemes. For example, the marker of mode énonciatif, a multi-functional morph marking certain classes of non-negative clauses, is realised variably as a prefix /ǽ-/ or as a floating tonal morph Ⓗ (hereafter, a tone within a circle indicates a floating tone). An example is shown in (7), where we gloss it as infl.
(7)
In the second example, the floating tone docks with the initial TBU. This pushes the lexical L tone onto the following vowel, resulting in a rising tone [tókə̌nì] (a rare case where a contour is not domain-final).
3. Tone-driven [ə]-epenthesis
This section lays out the evidence for tone-driven epenthesis, from two areas of Wamey grammar. The first involves the complementary distribution of word-final [ə] with $\varnothing$, which is straightforwardly accounted for if [ə]-final roots are actually consonant-final underlyingly. The other involves alternations between [ə] and $\varnothing$ in what we term [ə]-alternating suffixes.
Before we begin, we emphasise that several of the analytic generalisations should be attributed to Santos's (1996) grammar itself. In discussing the behaviour of /ə/, she states overtly its role in providing ‘support’ for tone: ‘the vowel /ə/ supports a high tone after a low tone or a low-high tone after a high tone’, in which case ‘it is always realised even when it is in final word position’ (p. 43), even though forms with final /ə/ constitute ‘a base with a final consonant at the structural level’ (p. 189, fn. 3).Footnote 8 Her grammar has unfortunately gone unnoticed in the phonology literature, and we are happy to bring it to the fore. Where she provides analytic generalisations overtly, we state as much, but in most places we go beyond her original study in scope and theoretical modelling, and in other places we make reanalyses.
3.1. Evidence from root and stem structure
The first piece of evidence for tone-driven epenthesis in Wamey involves the phonological patterns of roots/stems. Recall from (4) that CVC roots surface as H, L or HL but not as LH. CVC patterns are actually in complementary distribution with CVCə roots with a final schwa (e.g. [ì-nkæ̀w̃ə́] ‘dance’ (n.), [æ̀-mbə̀ỹə́] ‘leper’, [à-l̰ə̀nkwə́] ‘imbecile’, [ì-còkə́] ‘to weld’, [ì-mènə́] ‘to fish’, [dòlə́] ‘today’, [tàmpə́] ‘enslaved’, among many others). All other tone patterns are unattested on CVCə roots. This complementarity is shown in Table 4. The initial hyphen indicates that these forms normally appear with noun class prefixes.
This complementarity is categorical across Santos's lexicon. We digitised this lexicon and coded all lexical roots and stems for segmental structure, syllabic structure and tone patterns ($n=3518$). We counted monomorphemic nouns, verbs, adjectives, adverbs, ideophones, temporals/locatives and numerals as lexical roots. If they were provided by Santos, we included proper nouns, such as names of villages and certain Wamey names. We also included a large number of multi-morphemic lexical stems which contain derivational morphology, a point we return to below. Both these and plain lexical roots constitute morphological stems, hereafter simply referred to as stems. We set aside any noun class prefixes (including the infinitive prefix /i-/), which are always outside the stem. We excluded grammatical morphemes such as conjunctions, connectives, demonstratives, interrogatives, pronouns, prepositions, quantifiers and relative markers from our database. We also excluded complex constructions such as set phrases/idioms, reduplicated words and compounds. A copy of this database is found in the supplementary materials. Note that adjectives marked with the adjectival suffix are in their own section of the database.
Let us first examine monosyllabic CV and CVC stems compared to bisyllabic CVCə stems which end in [ə]. Table 5 shows that while LH sequences are banned in CV and CVC stems, they represent the primary pattern found in CVCə stems.Footnote 9
Parallel patterns are found for longer lexical stems. Just as CVC and CVCə stems are in complementary distribution, so are CVCVC and CVCVCə. Table 6 shows the number of these longer stems with each tone pattern. As in Table 5, patterns with final L.H appear only on ə-final stems, and ə-final stems never host any other tone pattern.Footnote 10
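The kind of check behind the complementarity claim for CVC versus CVCə stems can be sketched as follows. This is a toy illustration under stated assumptions, not the procedure or the database used for the counts reported here: `SAMPLE` contains only a handful of forms cited in the text, the shape and tone codings are our own reading of those forms, and the names `melody`, `crosstab` and `complementary` are hypothetical.

```python
from collections import Counter

# (form, stem shape, syllable-by-syllable tones); a few forms cited in the text,
# coded by hand for this sketch (prenasalised "mp" is treated as one consonant).
SAMPLE = [
    ("lə̀n", "CVC", "L"),
    ("sǽt", "CVC", "H"),
    ("dòlə́", "CVCə", "L.H"),
    ("tàmpə́", "CVCə", "L.H"),
    ("còkə́", "CVCə", "L.H"),
    ("mènə́", "CVCə", "L.H"),
]


def melody(tones):
    """Collapse syllable boundaries: 'L.H' and a contour 'LH' share one melody."""
    return tones.replace(".", "")


def crosstab(records):
    """Count stems per (shape, melody) cell."""
    return Counter((shape, melody(tones)) for _form, shape, tones in records)


def complementary(records):
    """True if no tonal melody occurs with more than one stem shape."""
    shapes_by_melody = {}
    for _form, shape, tones in records:
        shapes_by_melody.setdefault(melody(tones), set()).add(shape)
    return all(len(shapes) == 1 for shapes in shapes_by_melody.values())
```

On this sample `complementary(SAMPLE)` holds; adding a hypothetical CVC record with a rising `"LH"` contour would make it fail, which is the configuration the text reports as unattested.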
This complementarity also holds for the small number of three-syllable (plus schwa) stems in our database ($n=12$) (i.e. CVCVCVC versus CVCVCVCə stems). All such CVCVCVCə stems end as expected with a low tone on the penult and a high tone on the final [ə] (e.g. the [L.L.L.H] stem [-nkə̀rə̀ƴàlə́] ‘sand’, the [L.H.L.H] stem [-gèkə́lèrə́] ‘snuffbox’, and the [H.H.L.H] stem [-ƴǽlǽnkònə́] ‘cram-cram grain’). In these [ə]-final stems, the [ə] cannot be attributed to any syllabically or segmentally triggered epenthesis. As stated above, all consonants are permitted in the coda position. For all stems with a final schwa, there are equivalent words which end in the consonant before that schwa. This is shown in Table 7 with plain stops (palatal /c/ in row Table 7a), pre-nasal stops (/nd/ in Table 7b), implosives (/ɓ/ in Table 7c), fricatives (/s/ in Table 7d) and sonorants (/l/ in Table 7e). All rows show that if the syllable ends in a low, high or falling tone, no [ə] is inserted.
A straightforward interpretation of this complementarity is that CVC and CVCə structures derive from a common underlying structure conditioned by tone. Logically, the common structure could be underlying /CVC/, in which case [CVCə] is derived via epenthesis, or /CVCə/ with [CVC] derived via deletion. There is no contrast in the language between /CVC/ and /CVCə/ stems, which compounds the difficulty in choosing between the two analyses (on this fundamental difficulty as it pertains to consonant epenthesis, see Morley 2015).
The simpler of the two involves positing less underlying structure, which is the epenthesis account advocated in this paper. The four basic tone patterns – H, L, HL and LH – as they appear on CVC stems are illustrated in (8). Inputs with H, L or HL sequences associate straightforwardly with the vowel. However, due to constraints on rising tones, a LH sequence cannot associate both tones with the TBU. To preserve all tones in the input, an epenthetic schwa is inserted to host what would otherwise be a floating H.
(8)
In what follows, we analyse the underlying structure of such words to be /cv̀cⒽ/ with a pre-linked low tone and an underlying floating high tone (circled), rather than as underlying /cv̌c/ with both tones pre-linked. Regardless of analysis, as stated, there is no contrast between these two representations in CVC stems.
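The logic of the epenthesis account for an input /cv̀cⒽ/ can be sketched as a minimal OT-style evaluation. This is a hedged illustration rather than the tableau of §4: the candidate set, the one-violation profiles and the ranking *Float, *Rise ≫ Dep(µ) are assumptions chosen to match the pattern described above, and the names `RANKING`, `CANDIDATES`, `profile` and `optimal` are hypothetical.

```python
# Candidate labels for the input /cv̀cⒽ/ (schematic forms, as in the text).
FLOATING = "cv̀c + Ⓗ"    # H stays unassociated
RISING = "cv̌c"          # H docks on the closed syllable, creating a rise
EPENTHESIS = "cv̀cə́"     # an epenthetic [ə] hosts the H

# Assumed ranking for this sketch: *Float, *Rise >> Dep(µ).
RANKING = ["*Float", "*Rise", "Dep(µ)"]

# Assumed violation counts; constraints not listed incur zero violations.
CANDIDATES = {
    FLOATING: {"*Float": 1},
    RISING: {"*Rise": 1},
    EPENTHESIS: {"Dep(µ)": 1},
}


def profile(violations, ranking):
    """Violation vector ordered from highest- to lowest-ranked constraint."""
    return tuple(violations.get(constraint, 0) for constraint in ranking)


def optimal(candidates, ranking):
    """Pick the candidate whose violation vector is lexicographically smallest."""
    return min(candidates, key=lambda cand: profile(candidates[cand], ranking))
```

With the assumed ranking, `optimal(CANDIDATES, RANKING)` returns the epenthetic candidate; promoting Dep(µ) above the two tonal constraints selects a non-epenthetic candidate instead, which is the kind of grammar the restrictive theories cited in §1 would require.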
We can refer to the alternative as tone-driven /ə/-retention, as shown in (9). Here, underlying /ə/ is deleted in word-final position unless deletion would result in a floating tone or a rising tone. Under this alternative (which we reject), there is no epenthesis; therefore, Wamey would not constitute a true counterexample to the aforementioned claims in the literature.
(9)
Therefore, we must adjudicate between two competing sets of underlying representations:
(10)
Before beginning to compare these two, we stress an important commonality: the occurrence of final [ə] on the surface is entirely determined by tonal factors and insensitive to segmental ones.Footnote 11
One piece of evidence in favour of epenthesis comes from the general shape of monomorphemic roots in the lexicon. In the alternative /ə/-retention analysis, there are no C-final roots. Therefore, we might expect /CVCV/ roots in general to be common, with a full range of final vowels and tonal patterns. However, this is not the case in Wamey. Of the unambiguously vowel-final /CVCV/ stems in the lexicon ($n=621$; cf. 1352 CVC stems), the vast majority end in /a/ (478 out of 621, or 77%). Most of these are decomposable into a CVC root plus a derivational -V suffix and are transparently related to a CVC base; some others contain frozen morphology. The anticausative/middle suffix /-á/ is particularly common and, like its equivalent in other Atlantic languages, is often used to form denominal verbs.
(11)
Further, those CVCV structures which cannot be morphologically segmented are overwhelmingly loanwords, mostly from Malinke. Many of these loanwords are additionally exceptional in that they lack an overt noun class prefix.
(12)
Another point concerns CV roots, which are very rare ($n=108$). This is summarised in Table 8. Unlike the CVCV roots, these are mostly native roots in which it is likely that an earlier root-final consonant was deleted. A striking gap can be seen here. While vowels are generally evenly distributed, there is a near-complete lack of CV roots ending in [ə].Footnote 12 We can compare this to CVC roots, where [ə] is the most frequent vowel.
The only exception is the bound root /-jə́/ ‘(grand)son of’, which can never appear on its own, as shown in (13). It might even be analysed as a prefix rather than a root.
(13)
Under the alternative /ə/-retention analysis, final /ə/ is deleted unless it is retained to host tone. In (13), for instance, the underlying final /ə/ would not be deleted, because this would result in a banned floating high tone. Therefore, under the alternative, it is unexplained why there is not a comparable number of /Cə/ roots, where the final /ə/ should always surface transparently to host a tone.
In contrast, under tone-driven [ə]-epenthesis these facts are straightforwardly unified via a constraint banning word-final [ə] at the prosodic stem-level. This prosodic constituent would equally apply to CV and CVCV stems regardless of tonal pattern.Footnote 13 Under this analysis, surface [ə] in [CVCə] patterns emerges only at a later stage where the (prosodic) word is evaluated. Underlyingly, roots in Wamey are canonically CVC, and deviations from this template come about through synchronic/diachronic morphological processes. In fact, the canonical root shape in this Atlantic linguistic area is CVC, as found, for example, in Wolof, Fula and, most notably, the two remaining Tenda languages Bassari and Bedik related to Wamey. These languages have CVC roots wherever Wamey has [CVCə].
To summarise, in surface CVCV roots, the final vowel is overwhelmingly [ə], which appears only in the appropriate tonal context. This is easily explained if the [ə] is inserted to allow for the realisation of the underlying tones. If, on the other hand, the [ə] is present underlyingly, there is no explanation for why other vowels are not also attested in this position proportionate to their occurrence in other positions. The ə-epenthesis analysis simply treats underlying CVCV roots as uncommon and non-canonical, which aligns with the historical and areal facts. The /ə/-retention analysis must treat CVCV roots as the norm but does not provide any independent reason why /ə/ is overwhelmingly preferred as the underlying root-final vowel. In fact, the evidence from CV roots shows that underlying /ə/ is specifically banned in this position unless it serves a tonal purpose.
3.2. Evidence from C deletion in enclitic determiners
If [cv̀cə́] forms are actually /cv̀cⒽ/ at the stem-level, as we advocate, then they should also pattern as consonant-final in morphophonological processes. This is indeed the case. Consider the following data involving definite marker enclitics which appear after all other elements in the noun phrase. The definite marker has the form /=(C)ǎ/ with an underlying rising tone. The identity of the consonant is dictated by noun class agreement, as demonstrated in (14).
(14)
For most noun class contexts, the consonant of the definite marker is fixed and obligatory. However, if the noun class prefix begins with /w/, /y/, /ỹ/ or a vowel, or if it is null, then the corresponding definite marker has two variants which occur in free variation: =Cǎ and =ǎ. The =Cǎ variant has the form [=ŋǎ], [=ỹǎ] or [=w̃ǎ], depending on the noun class. This free variation is demonstrated in (15).
(15)
This variation is possible only after a consonant. If the stem ends in a vowel, only the =Cǎ form is found. This is shown in Table 9 with vowel-final CV stems (Table 9a) and CVCV stems (Table 9b). Importantly, Santos is explicit that the same variation found in CVC stems is found for those CVC stems analysed with a floating Ⓗ (e.g. /-mbə̀lⒽ/ [mbə̀lə́] ‘milk’) despite the fact that the latter surface with a final schwa in context (Table 9c–d; S96: 209). The importance of such data is clear: these stems pattern as consonant-final, suggesting that surface forms with final [ə] are derived.
Table 10 shows the interaction of the three phonological processes present here: consonant deletion in the clitic, [ə]-epenthesis and assimilation of this vowel. The optional consonant deletion applies first in these simplified derivations, which accounts for why inputs such as /mbə̀lⒽ/ also condition deletion.
In this table, notice that an epenthetic vowel is still inserted after consonant deletion to host the otherwise floating tone. This then assimilates to the following vowel and is one of the few places in the language where a surface long vowel is seen. We return to this fact in §4.3.
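The ordering of the three processes can be sketched as a small step-by-step derivation. This is a simplified illustration under several assumptions: the clitic consonant used in the usage note is an arbitrary choice among the alternating ones, the outputs are what the three steps yield mechanically and have not been checked against Table 10, and the names `derive`, `base` and `VOWELS` are hypothetical.

```python
import unicodedata

ACUTE = "\u0301"             # combining mark for a high tone
VOWELS = set("aeiouəæ")      # assumption: inferred from forms cited in the text


def base(segment):
    """Base character of a segment, with tone marks stripped."""
    return unicodedata.normalize("NFD", segment)[0]


def derive(stem, clitic, floating_h, delete_c):
    """Derive stem + definite clitic; both arguments are lists of segments.

    clitic is [C, V]; delete_c models the optional deletion of C, which is
    only available after a consonant-final stem.
    """
    # Step 1: optional deletion of the clitic-initial consonant.
    clitic = list(clitic[1:] if delete_c else clitic)
    out = list(stem)
    if floating_h:
        # Step 2: an epenthetic [ə] is inserted to host the floating H.
        out.append("ə" + ACUTE)
        # Step 3: if a vowel now follows immediately, [ə] takes on its quality.
        if base(clitic[0]) in VOWELS:
            out[-1] = base(clitic[0]) + ACUTE
    return [unicodedata.normalize("NFC", seg) for seg in out + clitic]
```

For an input like /mbə̀lⒽ/ with a clitic of the shape `["ŋ", "ǎ"]`, the call with `delete_c=True` yields a sequence of two adjacent low vowels (a surface long vowel), while `delete_c=False` keeps the schwa before the clitic consonant. Because deletion is applied first, it sees a consonant-final stem in both cases.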
Consider now the competing analysis involving /ə/-deletion, as shown in Table 11. Here, we must adopt a far less phonologically natural operation, whereby certain intervocalic consonants delete but only if the first vowel is /ə/. Further, as discussed in §2.1, there are several independently motivated operations which repair CC clusters that arise from morphological concatenation. One of these is the deletion of the second consonant, as in the first step of the derivation in Table 10. In contrast, the derivation in Table 11 must propose a novel and idiosyncratic deletion process triggered only by a preceding /ə/.
In total, [cv̀cə́] stems pattern as consonant-final, not as vowel-final. This supports an underlying (segmental) representation as /CVC/, which entails that surface [ə] is inserted rather than deleted.
3.3. Evidence from [ə]-alternating suffixes
Further evidence for the epenthesis analysis comes from bound morphology, specifically from what we call [ə]-alternating suffixes, which display an alternation between [C]-final and [Cə]-final variants. The relevant suffixes are summarised in Table 12. For each pair, the variants are predictable based on the phonological context; therefore, they should not be considered suppletive allomorphs.
The suffixes in Table 12a alternate between shapes [C] and [Cə́], the latter of which bears a high tone. The first, [-k]$\sim$[-kə́], is roughly equivalent to third singular subject marking in perfective aspect, which we gloss as 3s.minimal following Santos's terminology (‘indice personnel minimal’). The second, [-n]$\sim$[-nə́], marks imperatives with plural addressees. The data in (16) are representative examples which illustrate the conditioning factor: the [C] form appears if the preceding vowel is high-toned, while the [Cə́] variant appears if the preceding vowel is low-toned (the preceding syllable is underlined, and the [ə]-alternating suffix is in bold).
(16)
The second group of affixes (Table 12b) is identical in its distribution. All of these additionally consist of a low vowel [æ] or [a], and some have an initial consonant (subject to deletion). The first two are adjectival suffixes [-ǽx]$\sim$[-æ̀xə́] (adj1) and [-ák]$\sim$[-àkə́] (adj2). Adj1 and adj2 are in complementary distribution, with adj2 appearing after [a]-final stems. Representative examples of these variant pairs are provided in (17). As with the first group of suffixes, the [v́c] variant occurs if the stem ends in a high tone, while the [v̀cə́] one appears if the stem ends in a low tone.
(17)
The patterns of both sets of suffixes corroborate the static distribution with lexical stems detailed in §3.1. If a word-final [ə] is present, it is always high-toned and appears after a low. There are no instances of final word shapes *[cv̀cə̀], *[cv́cə́] and *[cv́cə̀] involving these suffixes.
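The conditioning of the Table 12a suffixes can be restated procedurally. The following Python sketch is purely illustrative and is not part of the analysis: the function name `surface_variant` and the tone labels "H" and "L" are assumptions introduced here, and the additional vowel-tone alternation of the Table 12b suffixes is not modelled.

```python
# Illustrative sketch only: names and tone labels are assumptions, not from the source.
SCHWA_HIGH = "\u0259\u0301"  # [ə́]: schwa plus combining acute accent (high tone)


def surface_variant(suffix: str, preceding_tone: str) -> str:
    """Pick the surface shape of a [C]~[Cə́] suffix from the preceding vowel's tone."""
    if preceding_tone == "H":
        # After a high tone the bare consonantal variant appears.
        return suffix
    if preceding_tone == "L":
        # After a low tone a final high-toned [ə] appears.
        return suffix + SCHWA_HIGH
    raise ValueError(f"unknown tone label: {preceding_tone!r}")


for tone in ("H", "L"):
    print(tone, surface_variant("-k", tone), surface_variant("-n", tone))
```

The sketch only describes the surface distribution; it takes no stand on whether the final [ə] is inserted or deleted, which is the question addressed in the remainder of this section.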
Parallel to the argument involving definite enclitics above, we can probe the underlying representation of [ə]-alternating suffixes based on how they condition the affixes which follow them. As expected under our analysis, these suffixes pattern as underlyingly consonant-final. The relevant data involve the interaction with another class of suffixes, in which an initial /x/ is deleted. We call these /x/-alternating suffixes. An example is /-xâw̃/, which indexes third singular objects (3s.o). If it appears after a consonant, the initial /x/ deletes, as shown in (18a). In contrast, if it appears after a vowel, the /x/ surfaces, as in (18b).
(18)
Table 13 lists two classes of /x/-alternating suffixes consisting of a full set of object-indexing suffixes (Table 13a) and the past-tense suffix /-xôw̃/ (Table 13b). Note that all of these suffixes have final consonants (n or w̃) which surface only before a vowel; this alternation is irrelevant to the current discussion.
Let us now look at the interaction of [ə]-alternating suffixes with /x/-alternating suffixes. For consistency, we use only the suffix [-k]$\sim$[-kə́] 3s.minimal in these data, which we take to be representative of the [ə]-alternating class. Table 14 shows verb forms with /-k/ followed by each /x/-alternating suffix. In all cases, /-k/ conditions the deletion of /x/, suggesting that /-k/ is underlyingly consonant-final.
We can contrast [ə]-alternating suffixes such as /-k/ 3s.min with other suffixes of the same relative morphological class and position. For example, /-ə́rú/ indexes second-person singular subjects and also directly precedes the relevant /x/-alternating suffixes. In contrast to third-person /-k/, second-person /-ə́rú/ underlyingly ends in a vowel; therefore, it conditions the /x/-initial variant without deletion. This is shown in (19) where the /-ə́rú/ is underlined and the /x/-alternating suffix is in bold.
(19)
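The /x/-alternation that serves as our diagnostic can be sketched as a small procedure over underlying forms. This is a minimal illustration under stated assumptions: the helper names `final_segment` and `attach_x_suffix` are hypothetical, the vowel set is a rough approximation restricted to vowels appearing in the forms cited here, forms are written without hyphens, and neither tone nor the final-consonant alternation of these suffixes is modelled.

```python
import unicodedata

# Assumed (incomplete) vowel inventory, for illustration only.
VOWELS = set("aeiouæə")


def final_segment(form: str) -> str:
    """Return the last base character of a form, ignoring combining tone marks."""
    decomposed = unicodedata.normalize("NFD", form)
    base_chars = [ch for ch in decomposed if not unicodedata.combining(ch)]
    return base_chars[-1]


def attach_x_suffix(base: str, suffix: str) -> str:
    """Attach an /x/-initial suffix, deleting /x/ after a consonant-final base."""
    if not suffix.startswith("x"):
        raise ValueError("expected an /x/-initial suffix")
    if final_segment(base) in VOWELS:
        return base + suffix       # /x/ surfaces after a vowel
    return base + suffix[1:]       # /x/ deletes after a consonant


print(attach_x_suffix("k", "xâw̃"))    # consonant-final /-k/ conditions /x/-deletion
print(attach_x_suffix("ə́rú", "xâw̃"))  # vowel-final /-ə́rú/ preserves /x/
```

On this formulation, the behaviour of /-k/ follows only if its underlying form ends in a consonant, which is the point at issue.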
Given these distributions, we posit the underlying representations in (20) for the [ə]-alternating suffixes. All bear an underlying floating Ⓗ tone, even those which contain a TBU (such as /-æxⒽ/ in (20b)).Footnote 14
(20)
An alternative would posit underlying representations with a final /ə/ to which the high tone is pre-linked, parallel to the alternative representations for stem structure in the previous section. Alternative URs would be as in (21).
(21)
Under this alternative, we must account for the fact that these suffixes condition /x/-deletion on /x/-alternating suffixes. An anonymous reviewer suggests that underlying morpheme-final /ə/ might not pattern with other vowels due to /ə/'s status as ‘weak’ and more easily deleted when in marked positions. This alternative would be as in (22), where both /ə/ and /x/ are deleted.
(22)
Under the alternative, it is predicted that [x] should delete whenever it is adjacent to [ə]. However, there are several places where this is not the case. One such environment involves a small set of suffixes expressing imperfective aspect, one [-ɗ]$\sim$[-ɗə́] glossed as ipfv1 and another [-nd]$\sim$[-ndə́] glossed as ipfv2.Footnote 15 They constitute a type of [ə]-alternating suffix, where the [-C] variant is used following a high-toned stem (as in (23a) and (24a)), while the [-Cə́] variant is used with a low-toned stem (as in (23b) and (24b)).
(23)
(24)
However, their behaviour diverges from other [ə]-alternating suffixes in that they do not condition the deletion of /x/. Examples are in (25)–(26), where these imperfective suffixes are compared to /-k/ 3s.min. For clarity, the /x/-alternating suffix is in bold.
(25)
(26)
In preserving /x/, the imperfective suffixes pattern with the vowel-final suffixes, as in (19).
There are two ways to interpret the special behaviour of imperfective suffixes. The first is to assume that these suffixes are underlyingly /ɗ/ and /nd/, and trigger a segmentally driven [ə]-epenthesis rule to break up consonant clusters. Here, their representations would be identical to those of other [ə]-alternating suffixes (e.g. /-k/ 3s.min). Recall from §2.1 that segmentally driven [ə]-epenthesis is sensitive to morphological identity; therefore, the special behaviour of the imperfective suffixes would simply be another case of such morphological sensitivity.
Alternatively, we could attribute their behaviour to a difference in underlying structure. Here, the imperfective suffixes would underlyingly end in /ə/; therefore, they would not condition [x]-deletion, because they end in a vowel underlyingly. This analysis is sketched in (27).
(27)
These two analyses of imperfective suffixes both unequivocally tolerate patterns where [ə] does not condition deletion of a following /x/. Therefore, this undermines the position that /x/ deletes because it is adjacent to a weak vowel [ə], as opposed to our more phonologically natural interpretation that /x/ deletes when it is adjacent to a consonant.
3.4. Deriving the final [ə] in suffixes
To conclude this section, we briefly outline how the [C]-final and [Cə]-final variants are derived, shown with a series of input–output mappings. For reasons of space, we show only derivations with /-kⒽ/ 3s.min. Our discussion here is fully formalised in §4 within an OT framework.
Table 15 shows /-kⒽ/ with CVC and CVCV stems of four lexical tone patterns (H, LH, L, HL). As established, the variant [-k] appears after a stem H tone, and [-kə́] after L.Footnote 16
Let us first provide derivations of simple low- and high-toned stems. In (28), with a low-toned stem, a syllabically triggered epenthetic schwa is inserted between the consonant-final root and the consonant-initial suffix. Here and throughout, epenthetic material is in grey, and new association lines are dashed. As stated previously, Wamey generally disallows consonant clusters. With such low-toned stems, the low spreads from the root to the syllabically triggered epenthetic [ə]. This happens even though a floating high tone is available which could value this epenthetic vowel (LH sequences are perfectly acceptable, as in [yòmpə́-k] in Table 15).
(28)
In both (28) and (29), a word-final [ə] is inserted, to which the floating Ⓗ docks. A constraint against creating rising tones forbids docking the Ⓗ leftward (i.e. *[ròkə̌k] and *[nùfǐk]), and a constraint against tone deletion forbids simply deleting Ⓗ (i.e. *[ròkə̀k] and *[nùfìk]).
(29)
Next, consider the high-toned stems in (30) and (31). As above, a syllabically triggered epenthetic [ə] is inserted between consonants. The first high tone (H$_1$) spreads rightward to the epenthetic vowel.
(30)
(31)
The surface forms here are [lə́cə́k] and [yáryík], rather than *[lə́cə́kə́] and *[yáryíkə́]. This indicates that the second high tone (the floating Ⓗ) does not trigger epenthesis in this context. To account for this, we assume a simple rule of high-tone deletion: when a string of high tones, H$_1$ H$_2$, appears domain-finally, the second is deleted (a type of OCP dissimilation rule). Inserting [ə] here would not circumvent any OCP violation.
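The high-tone deletion rule just described can be sketched over a bare tonal tier. The Python below is a minimal illustration, not a formalisation: the function name `delete_final_high` is hypothetical, the tier is represented as a list of "H"/"L" labels whose last element is the suffix's floating high, and association lines and epenthesis are deliberately left out.

```python
from typing import List


def delete_final_high(tones: List[str]) -> List[str]:
    """Delete the second of two domain-final high tones (H1 H2 -> H1)."""
    if len(tones) >= 2 and tones[-2:] == ["H", "H"]:
        return tones[:-1]
    # No domain-final H H sequence: the tier is returned unchanged.
    return list(tones)


# Stem melodies H, LH, L, HL, each followed by the suffix's floating high.
for stem in (["H"], ["L", "H"], ["L"], ["H", "L"]):
    tier = stem + ["H"]
    print(stem, "->", delete_final_high(tier))
```

Only the H and LH melodies lose the suffixal high under this rule; with L and HL melodies the high survives and must find a host, which is where the final [ə] appears.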
Derivations for HL stems are in (32) and (33). These are derived in the same way as the L stems: spreading of the root tone, followed by epenthesis to host the floating Ⓗ tone due to a restriction on rising tones.
(32)
(33)
Likewise, derivations for LH stems are in (34) and (35), in which the two adjacent high tones constitute an OCP violation and the second is deleted. Note that in (34), the surface form is [yòmpə́k] rather than a conceivable alternative such as *[yòmpə̀kə́].
(34)
(35)
4. OT analysis
In this section, we develop an analysis in Optimality Theory (OT) to derive the insertion of epenthetic [ə] based on the interaction of segmental and tonal constraints. The OT analysis aims to show that a small set of familiar constraints can generate the attested Wamey patterns, showing that all the ingredients needed to generate tone-driven epenthesis are already present in the theory. We begin by deriving the patterns with simple stems, before moving on to derivations in three more complex contexts: data with nominal enclitics, [ə]-alternating suffixes and [x]-alternating suffixes. The complete set of constraints, plus their crucial orderings, are found in the supplementary materials.
4.1. Stems
The most basic pattern involves the isolation form of a stem which sponsors a floating Ⓗ, in (36). Here, the input is given as a pre-linked low tone, plus a floating high which triggers [ə]-epenthesis in the surface form. We adopt a two-step derivation involving stem-level, followed by word-level phonology.
(36)
While this is in the spirit of derivational OT models (e.g. Stratal OT; Kiparsky 2015, inter alia), we are not explicitly arguing for one model over another per se in this paper.Footnote 17
First, a key component is that forms such as /ƴòmpⒽ/ do not undergo tone-driven epenthesis in the stem-level phonology but only at word-level phonology. The simple constraint set in (37) can derive the correct input–output mappings at these two stages. These involve a faithfulness constraint MaxTone and three markedness constraints NoRise, NoFloat and NoEdgeSchwa.
(37)
The constraint NoEdgeSchwa restricts [ə] from appearing at either the right or left edge of any prosodic constituent (whether prosodic stem, prosodic word, prosodic phrase, etc.). Its abbreviated form carries a subscript Φ, which denotes a prosodic constituent in general. This is motivated by the fact that [ə] in word-initial position is banned regardless of tonal environment, as discussed in §2.1. The tableau in (38) shows stem-level phonology using these constraints, where *Ⓣ is crucially ranked below the others. In this tableau, the morpheme is placed within a prosodic stem, denoted with a subscript Σ (for a cross-linguistic overview of the prosodic stem, see Inkelas 2014; Downing & Kadenge 2020, inter alia). Epenthetic material is in grey, as throughout.
(38)
Candidates (38e) and (38f) violate the first constraint by placing [ə] at a prosodic constituent edge, (38d) has a rising tone on a closed syllable, and candidates (38b) and (38c) delete a tone. Fully faithful candidate (38a) is optimal, even though it still retains the floating tone.
It is at word-level phonology that [ə] is epenthesised, as shown in (39). Crucially, here the constraints NoEdgeSchwa and *Ⓣ are re-ranked.
(39)
Candidates (39b)–(39e) all violate constraints which are still highly ranked, namely *R and Max(T). Further, because *Ⓣ is re-ranked, candidate (39f) is suboptimal, and candidate (39a) is optimal despite its insertion of [ə]. This tableau illustrates the ease with which tone-driven epenthesis can be modelled in OT.
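The effect of re-ranking between the two levels can be illustrated with a small strict-domination evaluator. The sketch below rests on several assumptions: the candidate labels are schematic and do not correspond one-to-one to the candidates in (38) and (39), the violation profiles merely restate the prose descriptions above, and the total orders in `STEM_LEVEL` and `WORD_LEVEL` arbitrarily fix the relative ranking of constraints that the text leaves unranked.

```python
from typing import Dict, List

# Schematic candidates for an input of the shape /cv̀cⒽ/ (labels are assumptions).
CANDIDATES: Dict[str, Dict[str, int]] = {
    "faithful": {"NoFloat": 1},         # H is left floating
    "epenthesis": {"NoEdgeSchwa": 1},   # final [ə] inserted to host H
    "rise": {"NoRise": 1},              # H docks onto the closed syllable
    "deletion": {"MaxTone": 1},         # floating H is deleted
}

# Stem level: NoFloat is ranked below the other constraints.
STEM_LEVEL = ["NoEdgeSchwa", "NoRise", "MaxTone", "NoFloat"]
# Word level: NoFloat and NoEdgeSchwa are re-ranked.
WORD_LEVEL = ["NoRise", "MaxTone", "NoFloat", "NoEdgeSchwa"]


def evaluate(candidates: Dict[str, Dict[str, int]], ranking: List[str]) -> str:
    """Return the candidate whose violation vector is lexicographically smallest."""
    def profile(name: str) -> tuple:
        return tuple(candidates[name].get(constraint, 0) for constraint in ranking)
    return min(candidates, key=profile)


print("stem level:", evaluate(CANDIDATES, STEM_LEVEL))
print("word level:", evaluate(CANDIDATES, WORD_LEVEL))
```

Comparing violation vectors lexicographically is one standard way to implement strict domination; nothing in the sketch depends on the particular order chosen among the undominated constraints.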
In addition to inputs such as /ƴòmpⒽ/, we must also entertain underlyingly /ə/-final inputs; for example, a hypothetical (abstract) input /cv́cə́/. This is due to the standard principle in OT of Richness of the Base (Prince & Smolensky 2004: §9.3), which states that we cannot prohibit any input shapes. This is where our demarcation into stem-level vs. word-level phonology is crucial. The tableau in (41) shows that inputs such as /cv́cə́/ are mapped to [cv́c] outputs at stem-level phonology, deleting the final /ə/. This tableau involves three additional constraints, which are defined in (40).
(40)
(41)
The fully faithful candidate (41d) violates high-ranked NoEdgeSchwa, while candidates (41b) and (41c) both violate newly introduced constraints against consonant deletion and feature spreading. Candidate (41a) is optimal even though it violates Max(V). None of the candidates violate tonal constraints *R, Max(T) or *Ⓣ, which play no role in this tableau.
A two-step grammar accounts for the shape of stems and their mapping to words when no word-level suffixes are added. The fact that stem-level phonology eliminates any stem-final /ə/ accounts for the /Cə/ gap in CV roots, as shown in §3.1. Recall that the only exception was a root /jə́/ ‘(grand)son of’. However, its exceptional status can be straightforwardly derived from the fact that it is a bound root which never appears on its own, unlike other nouns. If we treat this as a type of lexical affix, we can assume that it does not go through stem-level phonology, and thus it is predicted that its final /ə/ is not ruled out.
4.2. [ə]-alternating suffixes
Next, let us examine more complex data involving [ə]-alternating suffixes. We will illustrate this with the subject agreement suffix /-kⒽ/ 3s.min, which alternates between forms [-k] and [-kə́] depending on its tonal environment. We must derive the three (word-level) input-output mappings exemplified in (42) (data repeated from Table 15).
(42)
We must add four new constraints to the word-level phonological grammar, in (43).
(43)
*ConsonantCluster and OCP(H) are standard markedness constraints. Align(,Φ) is an alignment constraint which requires that any epenthetic material (e.g. an epenthetic [ə]) be aligned to some prosodic constituent edge. This will help dictate the optimal position of epenthetic schwas below. Further, *Cross-stem-R(T-µ) states that a tone must not be associated with a TBU (the mora, µ) across the right edge of a prosodic stem boundary. This will dictate, in part, the optimal host for floating tones.Footnote 18
The tableau in (44) shows how adding the first three of these constraints derives the correct mapping with a low-toned stem (i.e. ròk + kⒽ → [ròkə̀-kə́]).
(44)
Candidate (44g) is ruled out by *CC because its consonant cluster is not repaired in the output, while candidate (44f) is ruled out because the floating tone of the suffix remains floating. Candidates (44c)–(44e) are ruled out by the more complex constraints. In candidate (44e), [ə] is inserted to break up the cluster but is placed outside the prosodic stem rather than inside it. Therefore, it does not align with the edge of any prosodic constituent it is contained within and is suboptimal. Further, in candidates (44c) and (44d) the floating tone associates with [ə] within a prosodic stem. By crossing a right-edge prosodic stem boundary, this violates *Cross-stem-R(T-µ). Finally, candidate (44b) deletes the floating H, which violates Max(T), leaving candidate (44a) [ròkə̀-kə́] as optimal even though it has inserted two epenthetic vowels at constituent edges (violating NoEdgeSchwa).Footnote 19
In short, (44) shows that if a floating tone cannot dock ‘backwards’ into the stem, then an epenthetic vowel is inserted to host it. This shows that there are two contexts where tone-driven epenthesis arises: to avoid a rising tone on a closed syllable and to avoid docking a word-level floating tone to an inner constituent, the prosodic stem.
Further, the tableaux in (45) and (46) show the input–output mappings with a high-toned stem (lə́c) and a stem with a floating tone (yòmpⒽ), respectively. Both of these stems end in a high tone and thus illustrate the role of OCP(H) in the grammar.
(45)
(46)
Considering these tableaux together, each input contains two adjacent high tones. High-ranked OCP(H) rules out all candidates that do not delete one of these tones.Footnote 20 Next, Al(,Φ) eliminates those candidates which insert an epenthetic [ə] outside the prosodic stem, and *Cross-stem-R(T-µ) eliminates those whereby a word-level floating tone docks inward into the prosodic stem. The remaining candidates all violate the lower-ranked Max(T) by deleting one of the input tones. The optimal candidates in each are those which violate NoEdgeSchwa the least, essentially the candidates which have the fewest epenthetic schwas. Here, the floating tone is deleted as part of a general pattern generated by the ranking OCP(H) $\gg$ Max(T); therefore, there is no reason to insert an epenthetic [ə] to host it.Footnote 21
4.3. Definite enclitics
Next, consider data involving the definite enclitic from §3.2. The input–output mappings are repeated in (47) with a definite enclitic /=ŋǎ/. Recall that output forms in this context have two forms, one with the initial consonant of the enclitic and one where it has been deleted.
(47)
(47b) shows that in both contexts, the floating tone of the stem triggers an epenthetic host. As we showed in §3.2, this variation is not found in underlyingly vowel-final roots (e.g. /i-ɓú=ŋǎ/ → [ì-ɓú=ŋǎ] ‘the baobab fruit’).
To account for this variation, we posit that definite enclitics have two prosodic parses: one where they form their own phonological word (ω) and one where they form a recursive word (Bennett 2018; Ito and Mester 2021, inter alia). This is exemplified in (48).
(48)
That consonant clusters are allowed in (48a) but not in (48b) is in line with other facts about when clusters are permitted (e.g. compounds, reduplication and with other clitics (see §2.1)).
To see how this works, consider the tableaux in (49) and (50), which generate the forms in (48). We add a constraint Max(C) to the grammar, prohibiting deletion of underlying consonants. For simplicity, we do not include the noun prefix in these tableaux.
(49)
(50)
Consonant clusters violate *CC if and only if they occur within a prosodic word. In (49) the two morphemes form separate words; therefore, none of the candidates violates *CC, even when the cluster is present. Therefore, the faithful candidate (49a) is optimal. With the recursive structure in (50), in contrast, the faithful candidate (50a) violates *CC and is eliminated. Next, candidates which epenthesise [ə] to break up the cluster violate the constraint against [ə] at a prosodic boundary (NoEdgeSchwa). In (50), the optimal candidate is (50b), which violates Max(C) to satisfy *CC; in (49), candidate (49b) is eliminated by Max(C) because the faithful candidate (49a) does not violate *CC.
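The dependence of the outcome on the prosodic parse can be sketched in the same way. In the Python below, the three candidate labels, the parse labels and the function names are all hypothetical; the violation assignments restate the prose above, and the order of *CC and NoEdgeSchwa relative to each other is an arbitrary assumption (the discussion only requires that both outrank Max(C)).

```python
from typing import Dict

RANKING = ["*CC", "NoEdgeSchwa", "Max(C)"]
CANDIDATES = ["faithful cluster", "schwa epenthesis", "consonant deletion"]


def violations(candidate: str, parse: str) -> Dict[str, int]:
    """Assign violations to a schematic candidate under a given prosodic parse."""
    if candidate == "faithful cluster":
        # *CC is violated only when the cluster falls within a single prosodic word.
        return {"*CC": 1} if parse == "recursive word" else {}
    if candidate == "schwa epenthesis":
        return {"NoEdgeSchwa": 1}
    if candidate == "consonant deletion":
        return {"Max(C)": 1}
    raise ValueError(f"unknown candidate: {candidate!r}")


def winner(parse: str) -> str:
    """Return the optimal candidate under strict domination for the given parse."""
    def profile(candidate: str) -> tuple:
        marks = violations(candidate, parse)
        return tuple(marks.get(constraint, 0) for constraint in RANKING)
    return min(CANDIDATES, key=profile)


for parse in ("separate words", "recursive word"):
    print(parse, "->", winner(parse))
```

A single ranking thus yields both attested variants, with the difference located entirely in the prosodic parse of the enclitic.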
Let us now move to the more complicated data with floating tones. The tableau in (52) shows such data, with the variant involving separate words. We add three more constraints to the word-level grammar, as defined in (51).
(51)
The constraints in (51a) and (51b) have a structure parallel to *Cross-stem-R(T-µ) in (43c) but refer to other phonological units and constituent edges.
(52)
Examining this tableau, the high-ranked constraints Max(T), *Ⓣ, and *R eliminate candidates (52c), (52f) and (52g), respectively. Further, candidates (52d) and (52e) violate the newly introduced constraints in (51a) and (51b). Candidate (52d) shows a mora associating with a vowel over a left-edge word boundary, and, similarly, (52e) shows a tone associating with a mora over this boundary. The remaining candidates in (52a) and (52b) each incur a violation by epenthesising a vowel to host the tone, and the more faithful (52a) wins.
Next, consider the variant in (54), which shows word recursion. Here, due to this recursion, we require sensitivity both to the left edges of prosodic words (above), as well as to the right edges. Therefore, we add the constraints in (53) to the grammar.
(53)
(54)
Here, because the morphemes are grouped into a single word, they are subject to *CC, which eliminates candidates (54d) and (54e). Next, candidate (54c) violates one of the constraints in (53) by associating a tone across the right edge of a word boundary. None of the candidates, however, violates the similar constraints involving the left edge, due to the recursive word structure. Of the remaining two candidates, candidate (54b) incurs a fatal violation and is therefore eliminated. The winning candidate, (54a), violates Dep(µ) by epenthesising a mora to host the floating tone, as well as a low-ranked constraint against associating this mora across a right-edge word boundary.
Taking this all together, the optimal output is one where the floating tone triggers mora epenthesis but not vowel epenthesis; the epenthesised mora is able to parasitically associate with another vowel. This results in one of the few long vowels found in Wamey, an output [mbə̀láǎ]. We return to tone-driven µ-epenthesis in our discussion in §5.1.
4.4. /x/-alternating suffixes
The final pattern to derive using our phonological grammar is the interaction of [ə]-alternating suffixes with the /x/-alternating suffixes introduced in §3.3. We illustrate their interaction using the data point in (55) (repeated from Table 14), showing the deletion of /x/.
(55)
Notice in this example that the floating Ⓗ does not condition mora insertion (the final vowel remains short). We can compare this to a minimal pair in (56) with the demonstrative enclitic /=ŋî/ ‘this’. This, too, bears a lexical HL tone and, like the definite enclitics, may optionally undergo /ŋ/-deletion when adjacent to a consonant. With this clitic, however, the final vowel surfaces as long […líì] (cf. short […kî] immediately above).
(56)
Our grammar captures this contrast as in (57), which shows the input–output mapping of the form in (55). No new constraints need to be added.Footnote 22
(57)
First, candidate (57g) is eliminated because it fails to delete one of the two high tones, which violates OCP(H). Next, candidates (57d)–(57f) each violate Al(,Φ) by epenthesising [ə] between the two consonants rather than at a prosodic constituent edge. The remaining candidates all equally violate Max(T) and one further constraint. Candidate (57c) violates *Cross-stem-R(T-µ) by associating the floating tone over a stem boundary, leaving candidates (57a) and (57b). The winner is (57a), which epenthesises less structure compared to (57b), which epenthesises an additional mora. Notice that because all of the morphemes are within a single non-recursive word, candidate (57a) does not violate the constraint prohibiting the association of a tone over a word boundary. This accounts for the fact that the floating tone does not require a mora to host it here, whereas it does require a mora with an enclitic in a recursive word (cf. (54)).
5. Discussion
As defined in §1, tone-intonation parallelism is the premise that tones (and in particular floating tones) in tonal languages and intonational languages should not be ontologically distinct (i.e. they should have comparable representations and behaviour). We explored this premise looking at tone-driven vowel epenthesis, which is attested in intonational systems but has been claimed to be impossible or unattested in tonal languages where tone is used for lexical, derivational and inflectional purposes. We presented Wamey as the best case to date for tone-driven epenthesis in a tonal language, filling an important empirical gap and supporting tone-intonation parallelism. In this section, we (§5.1) situate the Wamey data by discussing phenomena similar to tone-driven epenthesis, (§5.2) call attention to a small number of other cases where tone-driven epenthesis has been postulated (or, at the very least, considered), and (§5.3) speculate as to why tone-driven epenthesis is so rare.
5.1. Similar phenomena
Tone-driven vowel epenthesis is one of several phenomena which demonstrate linguistic systems cultivating segmental environments ‘better suited for realising meaningful f0 movements’ (Roettger & Grice 2019: 279). One such phenomenon has already been introduced – namely, tone-driven vowel retention – which was entertained (but rejected) as an alternative to epenthesis in Wamey in §3. Under this type, vowels which are otherwise expected to delete and/or reduce are retained if they bear tone. Roettger & Grice (2019) identify several such cases in intonational systems (e.g. Standard European Portuguese, Bulgarian, Greek, Ath-Sidhar Rifian Berber, Moroccan Colloquial Arabic, Bonaara Oromo and Tunica), and highlight parallel patterns in lexical accent/tone systems (e.g. Cheyenne, Acoma, Konso, Shanghainese and Japanese). To these we can add tone-driven vowel retention in tonal languages Baraïn (Lovestrand 2012), Sumi Naga (Teo 2009) and Arapaho (Cowell & Moss 2008; Gleim 2019). For example, in Arapaho, an epenthetic vowel surfaces to break up certain consonant clusters only if it additionally hosts a high tone (e.g. the floating high in (58)).
(58)
Gleim (2019) provides a lengthy presentation of the Arapaho facts as tone-driven vowel retention rather than tone-driven epenthesis, crucially showing that consonant mutation (e.g. w → b above) provides evidence for A → B → A Duke-of-York derivations (i.e. C-C → C-i-C → C-C).
In contrast to the rarity of tone-driven insertion of a vowel, there are several languages where a mora is inserted to realise tone, which then associates with some already present segmental root node. We have already seen this in Wamey in one environment; namely, across an internal word boundary with definite enclitics (§§3.2 and 4.3). Two other examples come from the tonal languages Kuria (Marlo et al. 2015: 256ff.) and Gokana (Hyman 1985: 24, 2011a: 74), among other examples in African languages.
5.2. Other cases of tone-driven vowel epenthesis in tonal languages
Outside of Wamey, we are aware of only a handful of tonal languages for which tone-driven vowel epenthesis has been posited, or at least entertained. These are Kejom (aka Babanki; Akumbu et al. 2020), Kifuliiru (van Otterloo 2011: 71–73), Hdi (Frajzyngier & Shay 2002) and Ghomala’ (aka Bandjoun/Banjun; Nissim 1981; Eichholzer 2010). Of these, the first two are morphologically quite restricted rather than phonologically general, and the case of Hdi has already been dismissed as a case of epenthesis (Gleim 2019: 4).
By far the most convincing case of these is Ghomala’. There are five main tone patterns on monosyllabic roots, as shown in Table 16. The pattern denoted by L$^0$ is a low pitch which does not fall to the bottom of a speaker's pitch range (i.e. level low), while L is a low pitch which does fall to the bottom (i.e. falling low).
In general, if a root ends in an obstruent (possible codas: /p k ʔ/), then that obstruent faithfully surfaces in final position. However, with LH roots and only LH roots, the root may variably be realised either with a rising tone, or with L on the first syllable and H on an epenthetic [ə]. This is very similar to Wamey in that (i) it involves the general unmarked vowel [ə], (ii) it is a solution that avoids a rising tone on a closed syllable (a common restriction; see Zhang 2013), and (iii) CVCV roots are otherwise not allowed. It is clear that in both languages the epenthetic [ə] cannot be attributed solely to a syllabically driven restriction on codas. In fact, Nissim (1981: 63, fn. 12) is explicit, stating that words with rising tone ending in obstruents are realised with an epenthetic final vowel. Active alternations also exist, which show that if the rising tone is eliminated (e.g. by regular tone rules), no epenthetic vowel can surface. We refer the reader to Nissim (1981) for details.
5.3. Rarity
If the patterns of Wamey (and Ghomala’) constitute genuine cases of tone-driven epenthesis in tonal languages, then such systems cannot be banned by some universal component of phonological architecture. At the same time, it is indisputable that tone-driven epenthesis is very rare. If we cannot point to a universal restriction on such systems, we are forced to find additional avenues to explain its particular rarity. We conclude this section by speculating why this rarity exists, focusing on two features of floating tone in tonal languages which make it different from its use in intonational systems: co-exponence with segmental material, and positional (un)restrictedness.
First, in tone languages, when floating tones realise a specific lexical, derivational or inflectional category, they are typically accompanied by segmental co-exponents. This was the case in Wamey, where the relevant floating tones we examined appear with other segmental material (e.g. roots such as /-mbə̀lⒽ/ ‘milk’ and affixes such as /-æxⒽ/ adj1 or /-kⒽ/ 3s.min). Such segmental co-exponents provide additional cues (indeed, the primary cues) for the intended meaning target. If the floating tone were simply to delete in these cases without triggering epenthesis, little information would be lost. For example, of the 177 Wamey stems of the (surface) shape [cv̀cə́] only 35 form minimal pairs with a [cv̀c] stem with which they would merge if their floating high were deleted (e.g. /i-ƴùr/ [ì-ƴùr] ‘drool’ vs. /i-ƴùrⒽ/ [ì-ƴùrə́] ‘a tuft of unshaven hair’). Of these 35, many would still remain distinct, due to different noun class prefixes (e.g. /æ-mbə̀l/ [æ̀-mbə̀l] ‘Guinea worm’ vs. /wæ-mbə̀lⒽ/ [wæ̀-mbə̀lə́] ‘milk’) or part-of-speech differences (e.g. /wæ-pèl̰/ [wæ̀-pèl̰] ‘sword peas’ vs. /i-pèl̰Ⓗ/ [ì-pèl̰ə́] ‘to keep food scraps’).
In contrast, floating tones in intonational systems typically do not occur with segmental co-exponents. The consequence is that the functional load of floating tone in tonal vs. intonational systems in expressing linguistic meaning is quite different. Losing the floating tone in intonational systems would be far more ‘costly’, and to avoid this cost, epenthesis may be employed. Relatedly, due to the tendency for floating tone to co-occur with segmental material in tonal systems but not in intonational systems, tone-driven epenthetic material in a tonal system would be more likely to be reinterpreted as part of the underlying representation. In other words, a surface form [ə́] would be more likely to be reinterpreted as /ə́/ in a tone language than in an intonational language.
Second, while less common than in intonational systems, it is certainly the case that floating tone may appear as the sole exponent of some meaning targets in tone languages. For example, both Ghomala’ and Gokana (previously mentioned) have an associative construction [N1 Ⓣ N2] used for possession and compounds, where the sole marking of association is a floating tone Ⓣ which does not co-occur with any segmental morphology. Even in such cases, however, there is a key difference. In intonational systems, floating tones are often positionally quite restricted (e.g. to stressed syllables or to prosodic domain edges, especially the right edge of large prosodic constituents such as the intonational phrase). This restricts the ‘window’ within which the floating tone can search to find a host. In contrast, floating tone association in tonal languages tends to be more flexible and need not specifically target a stressed syllable or domain edge.
For example, the other Mbam-Nkam languages of Cameroon all have [N1 Ⓣ N2] associative constructions cognate with those in Ghomala’. Hyman & Tadadjeu (1976) show that in these languages whether the floating tone in such constructions will be ‘grounded’ to the right or to the left will depend on a complex set of factors. These include attaching in the direction which (i) has the greatest tonal effect, (ii) creates the more natural tonal contour, and (iii) complies with syllable or other boundaries. This shows that there are more potential targets of the floating tone in tonal languages than there are in intonational systems, and as such epenthesis is less likely to be required to host the floating tone than it would be in an intonational context.
6. Conclusion
We have argued that theories of tone/epenthesis interaction must be amended to include tone-driven epenthesis. While it was previously claimed that tone-driven epenthesis is unattested/impossible (Blumenfeld 2006; Gleim 2019), we have argued that the Wamey language provides the best case to date which falsifies this position. We demonstrated that in Wamey an epenthetic [ə] is inserted to host a high tone in two contexts. The first was to host a tone which would otherwise be left floating due to a restriction on rising tones in closed syllables (i.e. /cv̀cⒽ/ maps to [cv̀cə́] due to a ban *[cv̌c]). The second was to host a tone which was introduced by word-level morphology but restricted from associating across a stem boundary (i.e. /(cv̀cv̀)cⒽ/ maps to [(cv̀cv̀)cə́]). These patterns cannot be attributed to syllable phonotactics, which freely allow all consonants in the coda position. We presented the evidence for tone-driven epenthesis focusing on the distribution of final [ə] in lexical stem structure and [ə]-alternating suffixes which pattern as underlyingly consonant-final. We showed that a simple OT analysis derives [ə]-epenthesis, utilising common markedness constraints (e.g. *Float, *Rise, OCP(H), Dep(µ), etc.) together with constraints against associating tone across certain prosodic boundaries. In total, Wamey provides evidence for parallelism between tonal and intonational languages, given that intonation-driven epenthesis is well established in the literature. This parallelism is predicted under a model where both types of prosodic systems make use of the same phonological substance and autosegmental architecture and have the same functional pressures to cultivate segmental environments best suited for realising pitch targets.
Competing interests
The authors declare that there are no competing interests regarding the publication of this paper.
Acknowledgements
For feedback, we thank Ryan Bennett, Karee Garvin, Larry Hyman, Laura McPherson, Charlie O'Hara, Mary Paster, Hannah Sande, Juliet Stanton, Sam Zukoff and the audience of AMP 2020. The primary source (Santos 1996) may not be easily accessible for many readers; please email us to obtain a copy. Finally, we are indebted to the handling editor and three anonymous reviewers.
Supplementary material
The lexical database and a list of all constraints and crucial rankings are provided in the online supplement to this article, which can be found at https://doi.org/10.1017/S0952675722000094.