Shilluk is a Western Nilotic language spoken in Southern Sudan. [Footnote 1] In the study of sound systems, the Western Nilotic languages are of particular interest on account of their rich systems of suprasegmental distinctions. For example, Dinka, another Western Nilotic language, has three levels of vowel length, a voice quality distinction (modal vs. breathy), and – depending on the dialect – three or four distinctive tone patterns (Andersen 1987, Remijsen & Manyang 2009). As we shall see, Shilluk presents a similarly complex system of suprasegmental distinctions.
The self-referent term for the Shilluk language is cl(ɔ) ‘mouth/language Shilluk’ – dhøg cøllø in the Shilluk orthography. [Footnote 2] According to Ethnologue (Gordon 2005), Shilluk has around 175,000 speakers. The Shilluk kingdom is located in Southern Sudan, in the area around the confluence of the Sobat River with the White Nile. Earlier primary studies on Shilluk include Westermann (1970), Gilley (1992, 2000, 2003), Miller & Gilley (2001, 2007), and Reid (2009). A version of this article annotated with embedded sound files for all the Shilluk examples is available on the journal website, as supplementary material to this Illustration.
Syllable structure and word structure [Footnote 3]
In order to understand the structure of syllables and words in Shilluk, we need to distinguish between content morphemes (stems) and function morphemes. Uninflected native stems are overwhelmingly monosyllabic. With few exceptions, these monosyllabic stems have the structure in (1). That is, stem syllables typically consist of an onset, a vowel (nucleus), and a coda. The length of the vowel will be discussed in the ‘Vowel length’ section below.
(1) C (Cj/w) V (V) (V) C
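The template in (1) can be read as a pattern over segment strings. The following is a minimal sketch rather than part of the original analysis. It assumes a simplified consonant set that leaves out the dental series, takes the vowel symbols from the ‘Vowel quality and ATR’ section, and assumes that the optional V slots repeat the same vowel quality. The names `STEM`, `strip_tone` and `fits_template` are hypothetical.

```python
import re
import unicodedata

# Simplified segment sets (assumption: dental consonants are left out here).
CONSONANTS = "ptckbdɟɡgmnɲŋlrwj"
VOWELS = "ɪɛaɔʊieʌou"

# C (j/w) V (V) (V) C, with the extra V slots repeating the first vowel.
STEM = re.compile(
    rf"^[{CONSONANTS}][jw]?([{VOWELS}])\1{{0,2}}[{CONSONANTS}]$"
)


def strip_tone(form: str) -> str:
    """Remove combining diacritics (e.g. tone marks) from a transcription."""
    decomposed = unicodedata.normalize("NFD", form)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def fits_template(form: str) -> bool:
    """Return True if the toneless form matches the stem template in (1)."""
    return STEM.match(strip_tone(form)) is not None


if __name__ == "__main__":
    for form in ["cam", "ŋɔl", "kɔɔl", "lʊʊʊɲ", "ɔɔt", "pìi"]:
        print(form, fits_template(form))
```

Under these assumptions, forms such as ɔɔt (no onset) and pìi (no coda) fall outside the pattern, in line with their status as exceptions to the template.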
Native noun stems that lack either the onset or the coda exist, but they are rare. An exceptional case of an onsetless noun is ɔɔt ‘house.s’. [Footnote 4] Nouns without coda include ɟɪ ‘people’, tʌʌ ‘berry.s’, and pìi ‘water’. Transitive verb stems never lack either onset or coda.
Complex onsets are constrained in the sense that the second consonant can only be a semivowel: /w/ or /j/ – e.g. djl ‘goat.s’, jaaŋ(ɔ) ‘sorghum stalk.s’, and lwl ‘open gourd.s’. When the onset is complex, and the initial consonant is a semivowel itself, then the sequence is invariably /jw/, as in jwé ‘defile.dvna’.
Shilluk words can be polysyllabic. We will first deal with morphologically complex polysyllabic words, and then consider monomorphemic polysyllabic words. The reason for this seemingly back-to-front approach will become clear shortly. We stated earlier that most stems are monosyllabic. These monosyllabic stems give rise to polysyllabic words through processes of derivation or inflection. For verbs and nouns alike, the most common prefixes are /a- ʊ-/, and the most common suffixes are /-Cɪ -ɪ -a (-ɔ)/. In addition, there are the weak forms of pronouns, which can be interpreted as agreement-marking suffixes. Three examples of the inflectional morphology of transitive verb stems are presented in (2). As seen in (2), the morpheme {cam} ‘to eat’ combines to form trisyllabic words through inflection for tense/evidentiality and agreement. Descriptive analyses of morphological marking for subject agreement and evidentiality can be found in Miller & Gilley (2001 and 2007, respectively).
(2)
The same segmental material recurs in nominal affixes. For example, {-} marks male gender in (3a). In (3b), {à-} is part of the marking of patient nominalisation, a derivation that applies to transitive verb stems.
(3)
This background on morphologically complex content words prepares us for the encounter with monomorphemic polysyllabic nouns. These predominantly involve a closed syllable preceded by /a/ or /ʊ/. For example, gīik ‘buffalo.s’ begins with /ʊ/, but this term is not specific to the sex of the animal, and there are plenty of animal names that do not have this initial vowel. In the absence of a synchronic morphological process, the initial syllable is to be interpreted as part of the stem. Additional cases are presented in (4). In general, the composition of polysyllabic nouns suggests that morphological derivation is involved, even when there is no synchronic evidence for this. [Footnote 5]
(4)
Transitive verb stems are invariably monosyllabic. In other words, when they appear in a polysyllabic word, they are always transparently inflected or derived, as in (2).
The suffix /-ɔ/ is realised very weakly. Like most other affixes, this inflectional marker is found both on verbs and on nouns. Cases of this suffix can be found in cl() ‘mouth/language Shilluk’, in (2), (3) and (4) and in further examples in this Illustration. It is often devoiced, and its duration is much shorter than that of other suffixes. Often the only indication of this suffix is a breathy release at the end of the word. Also, if the preceding consonant is a plosive, then the presence of the weak final /-ɔ/ can be inferred from intervocalic voicing of this plosive (see ‘Consonants’ section below).
Function words are relatively few, as many grammatical relations are expressed through inflection instead. Here open syllables are common, and the vowel is very often /ɪ/: k loc, kí ext, ɲí hab, kāa conjtns. Examples of these and other function words can be found in the narrative at the end of this article.
Consonants
Voiceless plosives, voiced plosives, and nasals are each found at five places of articulation. Apart from these, there are only four other consonants: /l/, /r/, and the semivowels /w/ and /j/. There are no fricative phonemes. This inventory of consonants is typical of Western Nilotic languages in general (see Storch 2005).
All of these consonantal phonemes can appear as the initial consonant in stem syllables. In this position, the phonetic realisation of the phonemes reflects their IPA transcription fairly closely. The only exception is /c/, which is often realised as an affricate [ ] or as a fricative [ɕç] – see Remijsen & Manyang (2009) on Dinka. The list in (5) presents examples of the consonants in stem-initial position.
(5)
The situation is different in stem-final position. Here the inventory of consonantal phonemes is more limited, and the phonemes display more allophonic variation. The restriction relates to the plosive series: there is no distinction between voiceless and voiced plosives in the stem-final position. In our phonological transcriptions, stem-final plosives are represented by means of the relevant voiceless IPA symbol – i.e. /p t c k/ – without intention to prejudge the phonological analysis. The phonetic realisation of these stem-final consonants varies greatly. In prepausal position, they are realised consistently as voiceless. But when the stem-final consonant is followed by a vowel within the same word or within the same phrase, its realisation varies both in voicing and in manner, and this variation is largely free. For example, stem-final /p/ can be realised [p b f w] in this context. This sporadic process of intervocalic weakening or lenition of plosives is found at the five places of articulation. It is illustrated in (6). The stem-final /t/ is realised [t] in prepausal position. But when the stem is followed by the weak final suffix /-ɔ/, the stem-final /t/ is realised as [d].
(6)
As for consonants other than the plosives, coda /l/ is sporadically realised as a voiceless fricative before a pause. This is illustrated by two repetitions of the sentence in (7), elicited from the same speaker one after the other.
(7)
Also, coda /r/ may be elided completely. This process of elision applies in a sporadic manner. The same phenomenon is also found in northern dialects of Dinka, such as Ageer and Ruweng, which are geographically adjacent to the Shilluk-speaking territory.
The weakening of plosives has a bearing on another phenomenon relating to the realisation of consonants in stem-final position: gemination (Gilley 1992). The stem-final consonant may be followed by a consonant-initial suffix. The onset consonant of the suffix invariably assimilates to the preceding stem coda, yielding a geminate. One such suffix is the iterative marker {Cɪ} (see Gilley 2003: 108–109), as in á-ŋl-l ‘past-cut-iter’. In the example of this form uttered by the second author, the geminate nature of the /l/ is evident from its long duration. However, the status of gemination across the speech community is unclear. Gilley (1992: 26–27) writes that a salient realisation of geminates is only found in slow and deliberate speech. [Footnote 6] Our own impressionistic observations concur that hypothesised geminates often appear indistinguishable from corresponding singleton consonants. To examine this issue, we collected a dataset of controlled speech, involving five singleton vs. geminate pairs. The stem-final consonant was varied over the five pairs: three had a sonorant (nasal or /l/), and two had a plosive. The preceding vowel is invariably short. An example pair is á-lŋ-à ‘past-drum-1s’ vs. á-lŋ-ŋ-á ‘past-drum-iter-1s’. [Footnote 7] As seen in this example, the vowel of the iterative suffix elides before the first-person singular agreement marker. The examples also show that there is a difference in tonal specification alongside the difference in quantity. These materials were elicited from eight native speakers of Shilluk, using English as a medium. [Footnote 8]
We discuss the results for sonorants (nasals and /l/) and plosives separately, because they are qualitatively different. We start with the results for sonorants, where weakening is not at issue. There is little difference in the descriptive statistics for geminate vs. singleton sonorants. That is, the mean values are similar – 78 ms for singleton consonants, and 82 ms for geminates – and the standard deviations around the mean largely overlap – 20 ms for singletons, 22 ms for geminates. In a within-subjects analysis of variance, this difference is not significant (F(1,6) = 2.9; p = .14). These results suggest that, even in controlled speech, the hypothetical geminates do not differ from corresponding singleton consonants in their duration. When the consonant at issue is a plosive, there is a substantial difference in duration: the mean is 56 ms for the singleton consonants, and 72 ms for the corresponding geminates. The distributions overlap partially – the standard deviation is 17 ms for singleton plosives, and 28 ms for geminates. This difference is almost significant in a within-subjects ANOVA (F(1,7) = 5.5; p = .052). In summary, there is no significant difference in duration between singletons and geminates, either for sonorants or for plosives. We speculate that the iterative inflection may be realised primarily through its tonal specification on the stem syllable. In the case of plosive consonants, we find a sizeable difference in the means of singletons vs. geminates, but there is considerable overlap between the distributions. Our results confirm the observations in Gilley (1992): gemination is not realised consistently.
One factor that may explain the difference in the mean duration values for geminates vs. singletons is lenition. As explained above, stem-final plosives weaken between two vowels within the same word or phrase. This process interacts with gemination, as does phonemic vowel length. This is illustrated in Table 1. The table shows two inflected forms of two verbs, each uttered by three speakers. As before, gemination marks the iterative inflection on transitive verbs. The stem-final consonant is the phoneme /k/, and it is followed by a vowel within the same word. The table gives phonetic transcriptions and sound examples of the consonantal gesture of the stem-final velar plosive phoneme. As in the example above, the initial consonant of this suffix assimilates to the preceding stem coda, and the vowel elides before the first-person singular agreement-marking suffix.
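To make the contrast between the sonorant and plosive results more concrete, the reported means and standard deviations can be turned into a standardised mean difference. This is an illustrative calculation added here, not a statistic reported in the study. It assumes equal group sizes when pooling the standard deviations, and it ignores the within-subjects design. The function name `standardised_difference` is hypothetical.

```python
import math


def standardised_difference(mean_a, sd_a, mean_b, sd_b):
    """Difference between two means in units of the pooled standard deviation.

    Assumes both groups have the same number of observations.
    """
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_b - mean_a) / pooled_sd


# Durations in ms, as reported in the text: singleton vs. geminate.
sonorants = standardised_difference(78, 20, 82, 22)
plosives = standardised_difference(56, 17, 72, 28)

if __name__ == "__main__":
    print(f"sonorants: {sonorants:.2f} pooled SDs")
    print(f"plosives:  {plosives:.2f} pooled SDs")
```

On these figures the sonorant means lie about a fifth of a pooled standard deviation apart, and the plosive means roughly seven tenths. This fits the description of a negligible difference for sonorants and a sizeable but overlapping one for plosives.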
The phonetic realisation of stem-final /k/ varies considerably, from [k] over [ɡ] to [ɰ]. Vowel length and gemination both affect the likelihood that /k/ weakens, and also the extent to which it does. The influence of vowel length is illustrated in the data from speaker 1. He realises the stem-final consonant as [k] in á-kk-à, but as [ɰ] in á-kɔɔk-á. The influence of gemination is displayed in the examples from speaker 3 in the table: plosive realisations are somewhat more frequent in geminates. These examples also illustrate that there is considerable variability within and between speakers in the extent to which weakening takes place. Speaker 1 weakens stem-final consonants the least, not just within the group of three speakers presented here, but within the material from the eight speakers from which these examples are drawn. Other stem-final plosives are similarly variable when they are followed by a vowel within the same phrase. In summary, weakening is more likely with greater phonological vowel length, and less likely when the consonant is a geminate. However, neither of these factors has a categorical influence, and weakening is found more often than not. The grammatical distinction that was used to investigate gemination – iterative – additionally involves a tonal specification. Thus, the inflection is still retrievable from the signal, even if gemination is not phonetically realised.
Vowel quality and ATR
The phonological inventory of Shilluk vowels includes ten phonemes. The system is organised in terms of five vowel qualities, crossed orthogonally with an Advanced Tongue Root (ATR) distinction. The –ATR set includes /ɪ ɛ a ɔ ʊ/; the +ATR vowels are /i e ʌ o u/ (cf. Gilley 1992). This system is illustrated in (8). Impressionistically, the +ATR vowels sound more closed, and they also appear to be somewhat breathier, as compared to their –ATR counterparts.
(8)
These examples also serve to illustrate the functions of the ATR distinction. On the one hand, ATR distinguishes unrelated lexical items. On the other hand, ATR also plays a role in inflection and derivation. Transitive verbs that have a –ATR vowel underlyingly, like {kɔɔl} in (8), change their vowel to +ATR in several inflections, among others the antipassive (Miller & Gilley 2001), and to mark spatial deixis – centrifugal vs. centripetal (Leoma Gilley, personal communication). An additional example of this is presented in (9).
(9)
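The pairing of –ATR and +ATR vowels described above can be written down as a simple mapping. The sketch below is only an illustration of the vowel correspondence: it does not model tone, vowel length, or any other exponent of the inflections concerned, and it expects transcriptions without tone diacritics. The names `MINUS_TO_PLUS_ATR` and `shift_to_plus_atr` are hypothetical.

```python
# The five -ATR vowels and their +ATR counterparts, as listed in the text.
MINUS_TO_PLUS_ATR = {"ɪ": "i", "ɛ": "e", "a": "ʌ", "ɔ": "o", "ʊ": "u"}
PLUS_TO_MINUS_ATR = {plus: minus for minus, plus in MINUS_TO_PLUS_ATR.items()}


def shift_to_plus_atr(form: str) -> str:
    """Replace every -ATR vowel in a toneless form by its +ATR counterpart."""
    return "".join(MINUS_TO_PLUS_ATR.get(segment, segment) for segment in form)


if __name__ == "__main__":
    inventory = set(MINUS_TO_PLUS_ATR) | set(PLUS_TO_MINUS_ATR)
    print(len(inventory), "vowel phonemes")
    print(shift_to_plus_atr("kɔɔl"))
```

Consonants pass through unchanged, so the function only captures the segmental side of the alternation.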
Descriptive statistics on acoustic vowel quality and voice quality are presented in Figure 1. These results are based on minimal-set data of the kind illustrated in (9), with ATR marking spatial deixis in inflected forms of the same verb. The vowel is always overlong. Twenty sets like the one in (9) were elicited – four lexical sets for each of the five pairs distinguished by ATR: /ɪ–i ɛ–e a–ʌ ɔ–o ʊ–u/, yielding 40 types. One or two realisations are included from each of nine speakers (seven male, two female). After manual checks of the measurements, 527 tokens were used in the acoustic analyses. For both analyses, the data were averaged over any repetitions of the same type, and then z-transformed per speaker.
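The two preprocessing steps mentioned above, averaging over repetitions of the same type and z-transforming per speaker, can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: it standardises against the mean and sample standard deviation of each speaker's type averages, and the input values are invented toy numbers rather than measurements from the study. The function name `average_and_z_transform` is hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def average_and_z_transform(tokens):
    """Average repetitions per (speaker, type), then z-score within speaker.

    tokens: iterable of (speaker, vowel_type, value) tuples.
    Returns a dict mapping (speaker, vowel_type) to a z-score.
    """
    repetitions = defaultdict(list)
    for speaker, vowel_type, value in tokens:
        repetitions[(speaker, vowel_type)].append(value)
    type_means = {key: mean(values) for key, values in repetitions.items()}

    per_speaker = defaultdict(list)
    for (speaker, _), value in type_means.items():
        per_speaker[speaker].append(value)
    # Assumption: sample standard deviation; each speaker needs >= 2 types.
    stats = {s: (mean(v), stdev(v)) for s, v in per_speaker.items()}

    return {
        (speaker, vowel_type): (value - stats[speaker][0]) / stats[speaker][1]
        for (speaker, vowel_type), value in type_means.items()
    }


# Invented toy values standing in for F1 measurements (Hz).
toy_tokens = [
    ("s1", "ɪ", 400), ("s1", "ɪ", 420), ("s1", "i", 300), ("s1", "a", 800),
    ("s2", "ɪ", 450), ("s2", "i", 350), ("s2", "a", 900),
]
z_scores = average_and_z_transform(toy_tokens)

if __name__ == "__main__":
    for key, z in sorted(z_scores.items()):
        print(key, round(z, 2))
```

Standardising within speaker removes differences in overall range between speakers, so that values from male and female speakers can be pooled in one plot.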
The left panel of Figure 1 shows that phonetic vowel quality – expressed in terms of F1 × F2 values – separates the ten vowel phonemes fairly well from one another: there is almost no overlap between the ellipses, which encircle one standard deviation (i.e. 68 percent) of the distribution around the mean of each vowel. The ATR distinction is marked consistently in terms of phonetic vowel height, reflected by F1: for any pair of vowels distinguished solely in terms of ATR, the mean F1 value of the +ATR vowel is at least 100 Hz lower than that of the corresponding –ATR vowel.
Across the world's languages, perceived voice quality is correlated with the distribution of energy across the frequency spectrum: breathy vowels have proportionally less high-frequency energy than modal or creaky vowels (Gordon & Ladefoged 2001). The measure of energy distribution that we are reporting here is spectral emphasis (Traunmüller & Eriksson 2000). [Footnote 9] This measure relates the amount of high-frequency energy – i.e. energy upward from 1.5 times the fundamental frequency (F0) – to the overall energy. As seen from the right panel of Figure 1, −ATR vowels have higher values for spectral emphasis than corresponding +ATR vowels. This indicates that the −ATR vowels have more of their energy above 1.5 times the fundamental frequency, as compared to corresponding +ATR vowels. This result is in line with the observation that, impressionistically, the +ATR vowels sound somewhat breathy.
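The idea behind such a measure can be sketched on an idealised harmonic spectrum. The code below is a simplified illustration and not the measurement procedure used for Figure 1: it assumes that the measure is expressed as the level of the whole signal minus the level of the part below 1.5 times F0, in decibels, and it works on a list of (frequency, amplitude) components rather than on a recorded waveform. The function name `spectral_emphasis_db` and the example spectra are hypothetical.

```python
import math


def spectral_emphasis_db(components, f0):
    """Overall level minus the level below 1.5 * F0, in dB.

    components: iterable of (frequency_hz, amplitude) pairs.
    """
    components = list(components)
    total_energy = sum(amp ** 2 for _, amp in components)
    low_energy = sum(amp ** 2 for freq, amp in components if freq < 1.5 * f0)
    if low_energy == 0:
        raise ValueError("no energy below 1.5 * F0")
    return 10 * math.log10(total_energy / low_energy)


# Invented harmonic spectra at F0 = 100 Hz, for illustration only.
shallow_rolloff = [(100, 1.0), (200, 0.8), (300, 0.6), (400, 0.4)]
steep_rolloff = [(100, 1.0), (200, 0.3), (300, 0.1), (400, 0.05)]

if __name__ == "__main__":
    print(round(spectral_emphasis_db(shallow_rolloff, 100), 2))
    print(round(spectral_emphasis_db(steep_rolloff, 100), 2))
```

A spectrum whose energy falls off steeply above the first harmonic, as expected for breathier phonation, yields the lower value of the two.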
In summary, ATR is marked both by phonetic vowel quality (F1, F2) and by energy distribution. But the extent to which these correlates distinguish levels of ATR is not the same. This can be seen from the variability around the mean in Figure 1. The variability is quantified in the same way for formant values (left) as for spectral emphasis (right): one standard deviation of the values, after z-transformation per speaker. On the right, we can see that the standard deviations for the spectral emphasis measurement overlap for all ATR pairs other than /u–ʊ/. In contrast, there is almost no overlap between the standard deviations of the formant values (Figure 1, left). This difference leads us to conclude that phonetic vowel quality is the primary correlate of ATR in Shilluk, and that phonetic voice quality constitutes a secondary correlate. In the perception of the first author, who does not speak Shilluk, the pairs of vowels that are easiest to confuse are the +ATR half-open vowels vs. their closed −ATR counterparts: so /e/ vs. /ɪ/, and /o/ vs. /ʊ/. These pairs are similar in vowel height, and, impressionistically, the relative centralisation of the half-open +ATR vowels is not salient. Voice quality, the secondary correlate of the ATR distinction, may help to disambiguate these vowels.
At the phonological level, there is no evidence of ATR harmony within constituents. This is illustrated in (10). The suffixes /-wn/ ‘1pex’ (–ATR vowel) and /-wùn/ ‘2p’ (+ATR vowel) are not affected by the ATR value of the vowel of the preceding stem, and, conversely, they do not themselves affect the ATR value of the stem vowel.
(10)
In summary, the main acoustic correlate of the ATR distinction is phonetic vowel quality (F1, F2); phonetic voice quality (energy distribution) serves as a secondary correlate. ATR is involved in lexical distinctions, and also plays a paradigmatic role in the morphophonology. To the best of our knowledge, it is not involved in syntagmatic phonological processes within prosodic domains.
Vowel length
Shilluk has three levels of vowel length: short, long, and overlong (Remijsen, Miller & Gilley 2010). A three-way vowel length distinction has also been postulated for Dinka, another Western Nilotic language (Andersen 1987, Remijsen & Gilley 2008). Illustration (11a–c) presents Shilluk examples involving nouns – these are semi-minimal sets, controlled except for tone. Minimal sets involving verbs appear in (11d–e). As seen from these sets, vowel length distinguishes both unrelated lexical items and also inflected forms of the same stem.
(11)
Tone [Footnote 10]
Shilluk has a rich inventory of tone, with at least seven distinctive tone patterns or tonemes. [Footnote 11] There are three level tonemes – Low (cc), Mid (cc), and High (cc). In addition, there are four contours – the Rise (cc), and three falling configurations: Fall (cc), High Fall (cc), and Late Fall (c). [Footnote 12] Just like vowel length, tone is involved both in lexical and in morphological distinctions. An example of its lexical function is presented in (12).
(12)
The involvement of tone in morphological paradigms is illustrated in (13). In this example, all seven of the tonemes are exponents of morphological marking on the stem {ŋɔl} ‘to cut’.
(13)
In the list in (13), the seven distinctive tone patterns are saliently different from one another, with one exception: ŋl vs. ŋ. These patterns are very similar in the context of a following Low tone. For this reason, sentence-final examples of the same inflected forms have been included as sound illustrations in this paragraph.
As seen in (13e, f), we postulate that the transitive and intransitive forms of {ŋɔl} differ in terms of their tonal specification – High Fall vs. Fall, respectively. This analysis deviates from Miller & Gilley (2001) and Gilley (2003), who hypothesise that transitive vs. intransitive forms of the same stem are distinguished by a paradigmatic stress feature. The transitive forms would be stressed; the corresponding intransitive would be unstressed. An evaluation of the competing hypotheses can be found in Remijsen, Miller & Gilley (2010).
The examples in (14) present a second minimal set for tone, now involving an overlong vowel. Whereas the set in (13) involves forms of a single verb stem, the set in (14) is composed of verb forms from two different transitive verb stems.
(14)
Figure 2 presents descriptive statistics on the realisation of the seven tonemes, both for the set with a short vowel in (13), and for the set with an overlong vowel in (14). Each of the fundamental frequency (F0) traces in these graphs represents the average across seven speakers.
The realisation of the tonemes on short and overlong vowels is very similar. Minor differences can be attributed to divergence in the tonal context. For example, the High toneme starts out from a lower F0 value in the /ŋɔl/ set (Figure 2, left) than in the /lʊʊʊɲ/ set (Figure 2, right). This is due to the fact that the former is not preceded by a High tone target (13c), whereas the latter is (14c). Also, the Mid toneme is positioned lower in the tonal space in the /ŋɔl/ set than in the /lʊʊʊɲ/ set. This is due to the fact that the Mid toneme on /ŋɔl/ is preceded by two High tone targets earlier in the sentence (13b), whereas there is only one High tone target before /lʊʊʊɲ/ in (14b). As a result, automatic downstep applies twice in the former case, but only once in the latter.
The data in (13), (14) and Figure 2 illustrate the fact that vowel length is not a factor in the tone system: the same distinctive tone patterns – including the contours – are found on syllables with a short vowel as on syllables with an overlong vowel. This leads us to infer that the tone-bearing unit in Shilluk is the syllable rather than the mora, even though syllables vary considerably in weight structure (similarly, see Remijsen & Ladd 2008 on Dinka).
In polysyllabic words, every syllable is specified for tone. However, the distribution of tonemes is constrained: non-stem syllables – i.e. both present-day affixes and affixes that have become lexicalised – can only have level tonemes: Low, Mid or High. This is illustrated in (15).
(15)
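The distributional constraint just described can be stated as a small well-formedness check. This is a minimal sketch under the assumptions that the inventory is limited to the seven tonemes named in this section and that a word is represented as a list of (is_stem, toneme) pairs, one per syllable. The names `LEVEL_TONEMES`, `CONTOUR_TONEMES` and `is_well_formed` are hypothetical.

```python
LEVEL_TONEMES = {"Low", "Mid", "High"}
CONTOUR_TONEMES = {"Rise", "Fall", "High Fall", "Late Fall"}
TONEMES = LEVEL_TONEMES | CONTOUR_TONEMES


def is_well_formed(word):
    """Check the tonal constraint on a word.

    word: list of (is_stem, toneme) pairs, one per syllable.
    Every syllable must carry a known toneme; contours are only
    allowed on stem syllables.
    """
    for is_stem, toneme in word:
        if toneme not in TONEMES:
            return False
        if not is_stem and toneme in CONTOUR_TONEMES:
            return False
    return True


if __name__ == "__main__":
    # Prefix with a level toneme, stem with a contour: allowed.
    print(is_well_formed([(False, "High"), (True, "Fall")]))
    # Prefix with a contour: ruled out by the constraint.
    print(is_well_formed([(False, "Rise"), (True, "Low")]))
```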
The fact that contour tones are restricted to stem syllables is in line with the hypothesis that the stem-internal morphology of Shilluk and other Western Nilotic languages has its origin in affixal morphology (Andersen 1990). Andersen presents comparative evidence suggesting that the quantity of these lost affixes has shifted to the stem, yielding the third level of vowel length. The limitation of contour tones to stem syllables falls out naturally from the same diachronic account, as the additional tone targets on stem syllables can be attributed to the tone patterns of lost affixes.
Yes/no-questions involve the addition of a Low boundary tone at the right edge of the utterance, and an increase in the F0 range of the tone on the last word in the utterance. When the vowel is short and the toneme on the word-final syllable is the Rise contour, the addition of the boundary tone leads to a highly compressed realisation of tone targets. This state of affairs is illustrated in (16).
(16)
Summary
The sound system of Shilluk is complex with respect to the suprasegmental distinctions, particularly in terms of vowel length and tone. There are three levels of vowel length, and at least seven tone patterns are distinctive on monosyllabic verbs. In contrast, the segmental inventory is limited, especially as far as consonants are concerned. There are no fricatives, and the realisation of plosives in stem-final position is highly variable.
Transcription of the recorded passage
‘The North Wind and the Sun’
Abbreviations
Acknowledgements
Most recordings in this paper present the voice of the second author, although a few come from other speakers. Eight additional speakers of Shilluk took part in the elicitation sessions that led to this paper: John Adwok Apar, Rhoda Oman Nyibil, Daniel Thabo Nyibong, Onyoti Adigo Nyikwec, Maria Bocay Onak, Nyikwec Pakwan, Peter Mojwok Yor, and Mary Nyikongo Yor. We gratefully acknowledge their effort. We are also grateful to Prof. Al-Amin Abu-Manga, of the Institute of African and Asian Studies at University of Khartoum, who extended the hospitality of his Institute during two data collection trips to Khartoum. Finally, we thank Tatiana Reid for numerous thought-provoking discussions on Shilluk, and Bob Ladd for feedback before submission. This research is funded by the Arts & Humanities Research Council (AHRC), through the research grant ‘Stress in Nilotic – a typological challenge’ and through the grant ‘Metre and melody in Dinka speech and song’. The latter grant is part of the AHRC's Beyond Text programme.