The Xiangxiang dialect of Chinese

Ting Zeng

doi:10.1017/S002510031800035X

Published online by Cambridge University Press: 08 August 2019

Affiliation: College of Foreign Languages, Hunan University
Email: tingzeng9@126.com

Type: Illustrations of the IPA
Information: Journal of the International Phonetic Association, Volume 50, Issue 2, August 2020, pp. 258–281

DOI: https://doi.org/10.1017/S002510031800035X
Copyright: © International Phonetic Association 2019

Xiangxiang ([ɕjaŋ⁴⁴ ɕjaŋ⁴⁴]) is a Chinese dialect spoken by a population of 885,552 in the urban area of Xiangxiang (CN-430381), a city located in the centre of Hunan Province, China (Jiang 2008: 6). It belongs to Xiang ([ɕjaŋ⁴⁴], ISO 639-3: [hsn]), which is one of the ten major dialect groups of Chinese (LAC 2012).^{Footnote 1} Xiang has two main subgroups, New Xiang and Old Xiang (Zhou & You 1985, Yuan 2001), and Xiangxiang is often cited as a representative dialect of Old Xiang (H. Bao 2001, Jiang 2008).^{Footnote 2} Rather than denoting different historical stages, ‘old’ and ‘new’ reflect more and less conservative varieties among contemporary Xiang dialect speakers. Some impressionistic descriptions of the sounds and tones of Xiangxiang dialect can be found in Chao & Wu (1974), S. Zeng (2001) and Jiang (2008).

The current transcriptions and generalizations, which are based on the data gathered for T. Zeng (2012), present a variety of the Xiangxiang dialect that is typical of the city of Xiangxiang. The recordings accompanying this Illustration are from the author, who is a fluent female speaker of the dialect variety. She was born and grew up in the city of Xiangxiang, residing there for 18 years until she moved to attend college. She was in her mid-thirties at the time of recording and still uses Xiangxiang in daily communication with her parents. Throughout this article, Chao tone numbers are used to transcribe the tones, whereby 1 denotes the lowest pitch and 5 the highest (Chao 1980).

Consonants

Non-syllabic consonants

Syllabic consonants

The discussions of consonants and generalizations presented in this section are based on findings from three experiments in T. Zeng (2012): (i) a palatographic and linguographic study of consonants produced by six speakers, three males and three females; (ii) an acoustic study of the voiced obstruents produced by ten speakers, five males and five females; and (iii) an airflow study of the nasals, produced by seven speakers, six males and one female. All were native speakers of Xiangxiang, and were aged between 45 and 55 years at the time of recording.

Xiangxiang has nine plosives and shows a three-way distinction among voiceless unaspirated /p t k/ (as in /pa³⁴/ ‘eight’, /ta³⁴/ ‘to reach’ and /ka⁴⁴/ ‘street’), voiceless aspirated /pʰ tʰ kʰ/ (as in /pʰa²⁵/ ‘to send’, /tʰa²⁴/ ‘tower’ and /kʰaʊ⁵⁵/ ‘to knock’) and voiced /b d g/ (as in /ba²⁴/ ‘card’, /daʊ²⁴/ ‘peach’ and /gi²⁴/ ‘to ride’). The palatograms and linguograms of /ta³⁴/ ‘to reach’ and /ka⁴⁴/ ‘street’ shown in Figure 1 illustrate the pronunciations of the two series of non-labial plosives.

Figure 1 Palatogram and linguogram of (a) /ta³⁴/ ‘to reach’, (b) /ka⁴⁴/ ‘street’.

/t tʰ d/ are consistently produced as apico-laminal and denti-alveolo-post-alveolar plosives. As exemplified by /ta³⁴/ ‘to reach’ in Figure 1a, they are pronounced with the tip and blade of the tongue touching an area on the roof of the mouth from the upper front teeth to the back part of the alveolar ridge.

The productions of /k kʰ g/ are also consistent among the six speakers. They appear as dorsal velar plosives, produced with the posterior part of the tongue dorsum making contact with the velum, as can be seen from /ka⁴⁴/ ‘street’ in Figure 1b.

Xiangxiang also has three series of sibilants, namely /ts tsʰ s dz/ (as in /ts⁴⁴/ ‘capital’, /tsʰ²⁵/ ‘thorn’, /sa³⁴/ ‘to kill’, /dza²⁴/ ‘firewood’), /tʃ tʃʰ dʒ ʃ/ (as in /tʃ³⁴/ ‘nephew’, /tʃʰ²⁴/ ‘ruler’, /dʒ²⁴/ ‘time’, /ʃ⁴⁴/ ‘poem’) and /tɕ tɕʰ dʑ ɕ/ (as in /tɕi³⁴/ ‘to amass’, /tɕʰi²⁴/ ‘seven’, /dʑi²⁴/ ‘neat’, /ɕi³⁴/ ‘to learn’). The palatograms and linguograms of the three voiceless affricates /ts tʃ tɕ/ and three voiceless fricatives /s ʃ ɕ/ in Figure 2 illustrate the pronunciations of the three series of sibilants typically used by the six speakers.

Figure 2 Palatogram and linguogram of (a) /ts⁴⁴/ ‘capital’, (b) /tʃ³⁴/ ‘nephew’, (c) /tɕi³⁴/ ‘to amass’, (d) /s⁴⁴/ ‘silk’, (e) /ʃ⁴⁴/ ‘poem’, (f) /ɕi³⁴/ ‘to learn’.

/ts tsʰ dz s/ are typically denti-alveolar sounds, with the narrowing of the vocal tract made by raising the tongue tip (/s/ in Figure 2d) or the blade (/ts/ in Figure 2a) toward an area on the roof of the mouth from the central incisors to the alveolar ridge.

/tʃ tʃʰ ʃ dʒ/ were often described as retroflexes such as [ʂ] in past studies (e.g. Jiang 2008). However, as /tʃ³⁴/ ‘nephew’ in Figure 2b and /ʃ⁴⁴/ ‘poem’ in Figure 2e illustrate, they are typically laminal post-alveolar sounds, made by raising the tongue blade toward the back of the alveolar ridge. This indicates that their articulation deviates from the two typical retroflex articulations discussed by Ladefoged & Maddieson (1996: 25–27, 158–163): one involves the tip of the tongue being curled up and backwards, as in the sub-apical palatal articulations that occur in Tamil and Toda; the other has the tip of the tongue curled slightly upward, making contact with the post-alveolar region, as in the apical post-alveolar articulations that occur in Hindi and Bzyb. Consequently, the set of laminal post-alveolar sibilants in Xiangxiang is transcribed as /tʃ tʃʰ ʃ dʒ/ (with the ‘laminal’ diacritic in narrow transcription) rather than with retroflex symbols.

/tɕ tɕʰ dʑ ɕ/ are also typically laminal post-alveolar, as /tɕi³⁴/ ‘to amass’ in Figure 2c and /ɕi³⁴/ ‘to learn’ in Figure 2f show. Accordingly, the ‘laminal’ diacritic was also employed in their narrow transcriptions.

Although both series are laminal post-alveolars, /tʃ tʃʰ dʒ ʃ/ and /tɕ tɕʰ dʑ ɕ/ are still distinguishable from each other by their different tongue shapes. Comparisons between Figures 2b and 2c and between Figures 2e and 2f show that /tɕ tɕʰ dʑ ɕ/ have a wider lateral contact both on the palate and on the tongue than /tʃ tʃʰ dʒ ʃ/, indicating that during the production of /tɕ tɕʰ dʑ ɕ/, both the blade and the body of the tongue are high in the mouth, forming a comparatively long and flat constriction.

/tɕ tɕʰ dʑ ɕ/ are in complementary distribution with /ts tsʰ dz s/ and /tʃ tʃʰ dʒ ʃ/. Specifically, /tɕ tɕʰ dʑ ɕ/ occur only before high front vowels/glides (/i y j ɥ/), and /tʃ tʃʰ dʒ ʃ/ only before the homorganic syllabic approximant //, two environments where /ts tsʰ dz s/ never occur. Thus, it would be reasonable to postulate /tɕ tɕʰ dʑ ɕ/ as allophones of either of the other two series. However, given the clearly different auditory quality of /tɕ tɕʰ dʑ ɕ/ from the other two series, they are still treated as phonemes in the present study, and are placed in a separate grid labeled ‘alveolo-palatal’ next to ‘post-alveolar’ (see also Lee & Zee 2003, Chen & Gussenhoven 2015, Q. Li, Chen & Xiong 2019, for similar treatments). Controversies over the phonemic status of alveolo-palatals have been found across Chinese dialects, mainly because they are in complementary distribution with one or more of the following three series of obstruents, namely [ts tsʰ s], [tʃ tʃʰ ʃ], and [k kʰ x]. In Beijing Mandarin, for instance, the alveolo-palatals are in complementary distribution with all three sets of consonants. However, they are still treated as phonemes in many studies, including W. Li (1999) and Lee & Zee (2003). Readers are referred to Y-H. Lin (2014: 401) for a more extensive discussion.

Note that articulatory variability is identifiable for the three series of sibilants. For instance, the /ts/ series is sometimes pronounced as alveolar or denti-alveolo-post-alveolar sounds. In a minority of cases, these sounds can also be articulated with a broad area covering the tip and blade of the tongue. Some speakers make /tʃ dʒ tɕ dʑ/ with a more extensive central contact involving two contiguous articulatory areas, namely the alveolar and post-alveolar. A full discussion of the articulatory variations is beyond the scope of the present paper; interested readers are referred to T. Zeng (2012) for further details.

It is further observed that /dz dʒ dʑ/ are realized as affricates or fricatives. Figure 3 shows the waveform and spectrogram of /dzu²²/ [zu²²] ‘to sit’ (panel (a)), /dzo²⁴/ [dzo²⁴] ‘tea’ (panel (b)) and /dzo²²/ [zo²²] ‘stubble’ (panel (c)) produced by two different female speakers (panel (a) represents one speaker and panels (b) and (c) represent the other speaker). The consonant in Figure 3a has no release burst and shows continuous voicing throughout the frication, which is typical of a fully-voiced fricative ([z]). Figure 3c shows a partially voiced fricative, with voicing beginning after the onset of the fricative. The consonant in Figure 3b is a partially-voiced affricate ([dz]): it has a clear release burst, and voicing is present during the closure period but does not continue into the frication part. Moreover, the variation ([dz]≈[z], [dʒ]≈[ʒ], [dʑ]≈[ʑ]), which is observed both within and between speakers, is found even among repetitions of the same test word by the same speaker. This indicates that [dz dʒ dʑ] and [z ʒ ʑ] do not contrast with each other in absolute initial position. In view of this, as well as the finding in T. Zeng (2012) that affricates are the predominant realization, /dz dʒ dʑ/ are postulated as underlying.

Figure 3 Waveform and spectrogram of (a) /dzu²²/ [zu²²] ‘to sit’, (b) /dzo²⁴/ [dzo²⁴] ‘tea’, (c) /dzo²²/ [zo²²] ‘stubble’.

Xiangxiang also has two non-sibilant fricative phonemes /x ɣ/ (as in /xa³⁴/ ‘blind’ and /ɣa²⁴/ ‘shoe’). They are phonetically dorsal velar fricatives, articulated with the tongue dorsum coming close to the velum.

Xiangxiang has three nasal phonemes /m n ŋ/. /m/ is phonetically a bilabial nasal [m], as in /ma³⁴/ [ma³⁴] ‘to bury’. /n/ has several phonetic realizations, as illustrated by the airflow traces of four example words in Figure 4: /nẽ²²/ [nẽ²²] ‘to practice’ (4a), /naɪ³⁴/ [nãɪ̃³⁴] ‘to come’ (4b), /naɪ²¹/ [laɪ²¹] ‘basket’ (4c) and /nu³⁴/ [l̃ũ³⁴] ‘to display’ (4d). The dotted double-ended arrows indicate the nasal airflow baseline; the solid double-ended arrows denote the duration of vowel nasalization.^{Footnote 3} As can be seen, if the following rhyme has a nasal coda or a nasal vowel, /n/ is realized as [n] (as in /nẽ²²/ [nẽ²²] ‘to practice’, 4a), which shows high nasal airflow and no oral airflow. If the following rhyme is purely oral, /n/ varies freely among [l] (as in /naɪ²¹/ [laɪ²¹] ‘basket’, 4c), nasalized [l̃] (as in /nu³⁴/ [l̃ũ³⁴] ‘to display’, 4d) and [n] (as in /naɪ³⁴/ [nãɪ̃³⁴] ‘to come’, 4b).

Figure 4 The airflow traces of (a) /nẽ²²/ [nẽ²²] ‘to practice’, (b) /naɪ³⁴/ [nãɪ̃³⁴] ‘to come’, (c) /naɪ²¹/ [laɪ²¹] ‘basket’ and (d) /nu³⁴/ [l̃ũ³⁴] ‘to display’. The three vertical lines in each panel denote consonant onset, vowel onset and syllable offset, in order of appearance.

Note that [l̃] is a distinct variant from [n] and [l]. Unlike [n] (4a, 4b), which exhibits plateau-like nasal airflow but no oral airflow throughout its duration, [l̃] (4d) shows both oral and nasal airflow, like [l] (4c). Nevertheless, [l̃] is still different from [l] in two significant ways. Firstly, the level of nasal airflow for [l̃] is higher than that for [l]. Secondly, the slight nasal airflow for [l] is unable to trigger any nasalization on the following vowel, while vowels that follow [l̃] are like those that follow [n], which are nasalized to varying degrees. In the example words in Figures 4a, 4b and 4d, the vowels are all fully nasalized, as evidenced by the observation that the levels of nasal airflow are all above the corresponding nasal airflow baselines. However, phonemically, the vowels in Figures 4b and 4d are still transcribed as oral vowels (/aɪ/ and /u/), because they are nasalized only in nasal environments. /ẽ/ of /nẽ/ (4a) occurs in either nasal or purely oral contexts, and is hence transcribed as a nasal vowel both phonetically and phonemically.

The articulatory data from six speakers show that /n/ is frequently realized as an apico-alveolar nasal or lateral, as illustrated in Figures 5a (/ni³⁴/ [ni³⁴] ‘to leave’) and 5b (/na³⁴/ [la³⁴] ‘candle’). Note that there are some articulatory variations for the allophone [n]. Specifically, the palatographic contact may extend forward into the dental area and backward into the post-alveolar region; the contact on the tongue may also involve the tongue blade, or, very rarely, even the anterior part of the tongue dorsum.

Figure 5 Palatogram and linguogram of (a) /ni³⁴/ [ni³⁴] ‘to leave’, (b) /na³⁴/ [la³⁴] ‘candle’.

Note that /n/ and /l/ were treated as independent phonemes in S. Zeng (2001) and Jiang (2008), mainly because they are historically contrastive sounds. Given that there is no synchronic evidence for a phonemic contrast between them, and that [n] appears in more contexts, [n] and [l] are analyzed here as allophones of the same phoneme /n/. A similar treatment is also found in Chao & Wu (1974).
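The conditioning of the /n/ allophones described above can be summarized in a small sketch. This is only an illustration of the stated generalization, not part of the original analysis: the function names, the list-of-segments representation of a rhyme and the use of Unicode normalization are assumptions made for the example.

```python
import unicodedata


def _nfc(segment):
    # Normalize so that precomposed and combining-tilde spellings compare equal.
    return unicodedata.normalize("NFC", segment)


# Nasal vowels and nasal codas as listed in this Illustration.
NASAL_VOWELS = {_nfc(v) for v in ("ẽ", "õ", "ã")}
NASAL_CODAS = {"n", "ŋ"}


def rhyme_is_nasal(rhyme):
    """rhyme: list of phonemic segments after the onset, e.g. ["a", "n"] or ["aɪ"]."""
    segments = [_nfc(s) for s in rhyme]
    return any(s in NASAL_VOWELS or s in NASAL_CODAS for s in segments)


def n_allophones(rhyme):
    """Return the set of reported surface realizations of initial /n/."""
    if rhyme_is_nasal(rhyme):
        return {"n"}                 # nasal vowel or nasal coda: only [n]
    return {"l", _nfc("l̃"), "n"}     # purely oral rhyme: free variation


print(n_allophones(["ẽ"]))   # rhyme with a nasal vowel
print(n_allophones(["aɪ"]))  # purely oral rhyme
```

Because the oral-rhyme variants are in free variation, the sketch returns a set of possible realizations rather than choosing one.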

/ŋ/ is phonetically realized as a dorso-velar nasal ([ŋ]) in all but one environment, produced with the posterior part of the tongue dorsum making complete contact with the velum, as illustrated by /ŋa³⁴/ [ŋa³⁴] ‘duck’ in Figure 6a. Preceding a high front vowel/glide, /ŋ/ is articulated with the posterior part of the tongue dorsum touching the posterior part of the palate and the velum (Figure 6b). This nasal was described in past studies published by Chinese linguists in China as an alveolo-palatal nasal [ȵ] (e.g. S. Zeng 2001), a symbol commonly used among Chinese dialectologists to transcribe the nasal that occurs in a similar environment across Chinese dialects. However, a comparison between the articulation of this nasal (Figure 6b) and that of the alveolo-palatal fricative /ɕ/ in Figure 2f above reveals that they have quite distinct places of articulation (palato-velar for Figure 6b vs. post-alveolar for /ɕ/ in Figure 2f) and locations of constriction (dorsal for Figure 6b vs. laminal for /ɕ/). This allophone of /ŋ/ is, therefore, not described as [ȵ].

Figure 6 Palatogram and linguogram of (a) /ŋ(a)/ in Xiangxiang, (b) /ŋ(jẽ)/ in Xiangxiang, (c) /ɲ(au)/ in Hakka (Zee & Lee 2008: 115), (d) /ɲ/ in Czech (Recasens 1990: 271), (e) /k(i)/ (Recasens 1990: 275).

Following a suggestion from one of the anonymous reviewers, this nasal is not described as a palatal nasal [ɲ], although the palate is involved in its production, but as a front velar nasal and an allophone of /ŋ/. Firstly, it is typologically uncommon for a language to have only a palatal nasal without the occurrence of a palatal stop and, in most cases, a palatal fricative. For example, apart from the palatal nasal, both the palatal stop and palatal fricative occur in languages such as Czech, Hungarian, Arrernte and Warlpiri (both Australian languages). This is also the case for some Chinese dialects such as Liuyang (a subdialect of the Gan dialect group) and Meixian (a subdialect of the Hakka dialect group). However, Xiangxiang has neither a palatal stop nor a palatal fricative.

Secondly, the articulation of this nasal in Xiangxiang is different from that of the typical /ɲ/ in other languages, such as Hakka Chinese (Figure 6c, Zee & Lee 2008: 115) and Czech (Figure 6d, Recasens 1990: 271), in at least two respects. One is a difference in the contact on the upper surface of the vocal tract. The /ɲ/ in Hakka Chinese and Czech is articulated with a complete and continuous contact from the post-alveolar to the prepalatal zones, a distinct and much fronted place of articulation compared with the palato-velar nasal in Xiangxiang (Figure 6b). The other is a difference in the lower articulator. Recasens (1990: 272) shows that /ɲ/ is primarily produced with the anterior part of the tongue dorsum, although ‘some involvement of the mediodorsum, and more rarely, the lamina and the postdorsum may exist’. As the lower panels of Figures 6b–d show, the /ɲ/ in Hakka and Czech is mainly predorsal, while the palato-velar nasal in Xiangxiang is postdorsal.

Recasens (1990) grouped consonants produced at the palatal zones into four classes, namely alveolopalatals, front palatals, mid palatals and back palatals. While the /ɲ/ in both Hakka Chinese and Czech is a typical front palatal, the nasal in Xiangxiang is more similar to back palatals, such as the front velar (i.e. a velar consonant before a front vowel, /k(i)/), as shown in Figure 6e. Also, given that it is in complementary distribution with /ŋ/ and occurs only before a high front vowel/glide, the most frequent environment where palatalization takes place, it is described as a palatalized velar nasal and an allophone of /ŋ/ (e.g. /ŋjẽ³⁴/ [ŋʲjẽ³⁴] ‘mud’).
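The allophonic statement for /ŋ/ can be written as a one-line rule. The following is a minimal sketch under the assumption that the conditioning environment is exactly the set of high front vowels and glides named in this Illustration (/i y j ɥ/); the function name and the single-segment look-ahead are illustrative choices, not part of the original description.

```python
# High front vowels and glides that condition the palatalized allophone.
HIGH_FRONT = {"i", "y", "j", "ɥ"}


def realize_velar_nasal(next_segment):
    """Return the surface form of onset /ŋ/ given the segment that follows it."""
    return "ŋʲ" if next_segment in HIGH_FRONT else "ŋ"


# /ŋjẽ/ 'mud' surfaces with the palatalized allophone; /ŋa/ 'duck' does not.
print(realize_velar_nasal("j"))
print(realize_velar_nasal("a"))
```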

Xiangxiang dialect also has three glides /j w ɥ/, which may occur syllable-initially (as in /ja³⁴/ ‘hot’, /wa³⁴/ ‘socks’ and /ɥa³⁴/ ‘moon’) or in combination with a preceding consonant (as in /kja³⁴/ ‘grid’, /kwa³⁴/ ‘to scrape’ and /ɕɥa³⁴/ ‘snow’). Syllabic nasals are restricted to four words: /²²/ ‘mother’, /²¹/ ‘you’, /ŋ̍²¹/ ‘I’ and /ŋ̍³⁴/ ‘yellow’. The syllabic approximant // occurs after /ts tsʰ s dz tʃ tʃʰ ʃ dʒ/ and becomes homorganic with the preceding consonant. When following /ts tsʰ s dz/, it is typically realized as a syllabic apical (or laminal) denti-alveolar approximant; when following /tʃ tʃʰ ʃ dʒ/, it is a syllabic laminal post-alveolar approximant. In previous studies published by Chinese linguists in China, these two syllabic allophones have often been transcribed as apical vowels [ɿ] and [ʅ], respectively. For instance, words such as /dz²⁴/ ‘porcelain’ and /dʒ²⁴/ ‘time’ were transcribed as [dzɿ²³] and [dʐʅ²³], respectively, in S. Zeng (2001).

The most frequently cited phonetic feature of Xiangxiang is that it has a distinction between voiced and voiceless obstruents, whereas most other Chinese dialects do not. Patterns are identifiable in the voicing of initial stops, fricatives and affricates (T. Zeng 2015).

The voiced stops fall into two groups. One has pre-voicing (voicing lead), that is, glottal vibration before the release of the stop, while the other does not. The two patterns account for 59% and 41% of all voiced stop tokens, respectively.

The three patterns for the voiced fricatives are labelled ‘complete voicing’, ‘incomplete voicing’ and ‘no voicing’. ‘Complete voicing’ refers to the presence of vocal cord vibration throughout the frication noise. ‘Incomplete voicing’ means that vocal cord vibration is present during only part of the fricative duration. ‘No voicing’ means the absence of vocal cord vibration during the whole duration of the fricative. As percentages of all voiced fricative tokens, the three patterns account for 10% (‘complete voicing’), 80% (‘incomplete voicing’) and 10% (‘no voicing’).

For voiced affricates, the first pattern, ‘fully voiced’, is defined as voicing throughout the oral closure (i.e. pre-voicing) and continuing into the following vowel. The second pattern is ‘incomplete voicing’, which means that voicing is observed for only part of the affricate duration. This pattern turns out to include only tokens that have vocal cord vibration during the period of fricative release but not during the oral closure. The third pattern is ‘no voicing’, indicating that there is no vocal cord vibration at all. As percentages of all voiced affricate tokens, the three patterns account for 21% (‘fully voiced’), 70% (‘incomplete voicing’) and 9% (‘no voicing’).

Two general observations are in order. Firstly, 59% of voiced stops, 21% of voiced affricates and 10% of voiced fricatives show continuous glottal vibration throughout their duration. Secondly, incomplete voicing during the frication noise is observed for 80% of voiced fricatives and 70% of voiced affricates. Overall, voicing is still found on 70% of the voiced obstruents to varying degrees, indicating that obstruent voicing is still a prominent feature of Xiangxiang.

Voicing is also a distinctive feature for obstruents in the Wu dialects of Chinese and other languages such as Spanish, Dutch and Polish. Unlike the Xiangxiang voiced stops, which are produced with either lead VOT or short-lag VOT, in Wu dialects such as Shanghai and Suzhou, the voiced categories are typically produced with short-lag VOT values, contributing to a complete overlap in the VOT distribution with the voiceless unaspirated categories at each place of articulation (Shi 1983, Chen & Gussenhoven 2015). Nevertheless, there are some similarities across these Chinese dialects. For instance, the voiced obstruents are fully voiced in non-initial position, and voiceless fricatives have longer duration and greater amplitude of the frication noise than their voiced counterparts (for more details please refer to T. Zeng 2012). In contrast with Chinese, prevoicing tends to be a more stable acoustic property in Spanish, Dutch and Polish. In both Spanish and Polish, there is essentially no overlap of VOT values between voiced and voiceless categories in absolute initial position: voiced stops have lead values, and voiceless stops have short-lag values (Williams 1977, Keating, Mikos & Ganong III 1981, Rosner et al. 2000). In Dutch, altogether 75% of the voiced stops are produced with prevoicing (van Alphen & Smits 2004), a much higher percentage than that in Xiangxiang (59%).

Vowels

Monophthongs

Symbols of oral monophthongs in open syllables on a conventional IPA vowel chart.

Vowel symbols on a conventional IPA vowel chart are shown in this section in three vowel diagrams. Acoustic plots for the vowels (Figures 7, 9–11) and diphthongs (Figures 12–14) are also shown. They are based on data from twenty speakers of the Xiangxiang dialect, including ten males and ten females, all aged between 45 and 55 years.

Figure 7 Vowel ellipses of the six oral vowels that occur in open syllables: /i⁴⁴/ ‘clothes’, /y⁴⁴/ ‘slit’, /ɯ⁴⁴/ ‘jet black’, /u⁴⁴/ ‘snail’, /o⁴⁴/ ‘to dig’ and /ka⁴⁴/ ‘street’.

Figure 8 The lip position of (a) /ɯ/, (b) /u/ and (c) /o/.

Figure 9 Vowel ellipses of the two vowels that also occur in closed syllables: /i/ in /tin⁴⁴/ [tɪn⁴⁴] ‘nail’, /a/ in /tan⁴⁴/ ‘needle’ and /taŋ⁴⁴/ ‘to serve’. The vowel ellipses of /i/ and /a/ in an open syllable are also shown, only to help determine the phonetic values of the vowels in closed syllables.

Figure 10 Vowel ellipses of the nasal vowels: /ẽ/ in /ẽ⁴⁴/ ‘smoke’, /õ/ in /kõ⁴⁴/ ‘rice cereal’, /ã/ in /tjã⁴⁴/ ‘light’ and /wã⁴⁴/ ‘bent’. The vowel ellipses of /i/ [ɪ], /a/ and /o/ are also shown, only to help determine the phonetic values of the nasal vowels.

Figure 11 Vowel ellipses of the nasal vowels (with the /ã/ tokens divided into two groups: /ã/ in /tjã⁴⁴/ ‘light’ vs. /ã/ in /wã⁴⁴/ ‘bent’). The vowel ellipses of /i/ [ɪ], /a/ and /o/ are also shown, only to help determine the phonetic values of the nasal vowels.

Figure 12 Vowel ellipses for the three diphthongs: /eɪ/ in /jeɪ⁴⁴/ ‘excellent’, /aɪ/ in /taɪ⁴⁴/ ‘slow-witted’ and /aʊ/ in /taʊ⁴⁴/ ‘knife’.

In general, Xiangxiang has a vowel system consisting of six oral vowels: /i y ɯ u a o/, three nasal vowels: /ẽ õ ã/, and three diphthongs: /eɪ aɪ aʊ/. For the six oral vowels, /y ɯ u o/ occur only in open syllables, while /i/ and /a/ occur in both open and closed syllables.

As shown in Figure 7, /i/ in an open syllable (as in /i⁴⁴/ ‘clothes’) occupies the most anterior and highest position in the acoustic space. By comparison, /y/ (as in /y⁴⁴/ ‘slit’) is lower and more central. /a/ (as in /ka⁴⁴/ ‘street’) is low and central.

Transcriptions of the three back vowels are mainly based on two observations. Firstly, as pointed out by one of the anonymous reviewers, the three back vowels /ɯ u o/ involve different lip gestures. This is evidenced by the data in Figure 8, which shows the lip position of the three vowels taken from a videotape of two speakers pronouncing isolated words, one of them being the consultant who provided the data for illustration for the current study. As can be seen, /ɯ/ (as in /ɯ⁴⁴/ ‘jet black’) shows fairly close approximation of the upper and lower lips, which is very similar to the Japanese /u/ (/ɯ/ in narrow transcription) (Ladefoged & Maddieson 1996: 291, 295; Ladefoged & Johnson 2011: 226–227). During the production of /u/ (as in /u⁴⁴/ ‘snail’), the lips are protruded. /o/ (as in /o⁴⁴/ ‘to dig’) also has lip protrusion, but to a much lesser degree than /u/. Secondly, both /u/ and /ɯ/ are high and back, and /o/ is mid-close and back.

Symbols of nasal monophthongs (left panel) and oral monophthongs in closed syllables (right panel) on a conventional IPA vowel chart.

The above vowel diagram (right panel) shows the two vowels that can also occur in closed syllables, namely /i/ and /a/. /i/ only occurs before /n/ (as in /tin⁴⁴/ ‘nail’), while /a/ occurs before either /n/ or /ŋ/ (as in /tan⁴⁴/ ‘needle’ and /taŋ⁴⁴/ ‘to serve’). As observed in Figure 9, /i/ is phonetically realized as [ɪ] in a closed syllable, which has higher F1 and lower F2 values than /i/ in an open syllable, whereas the vowel ellipse of /a/ in an open syllable overlaps extensively with that of /a/ in syllables checked by either /n/ or /ŋ/.

The Xiangxiang dialect also has three nasal vowels, shown in the left panel of the above vowel diagram: /ẽ/ (as in /ẽ⁴⁴/ ‘smoke’), /õ/ (as in /kõ⁴⁴/ ‘rice cereal’) and /ã/ (as in /tjã⁴⁴/ ‘light’ and /wã⁴⁴/ ‘bent’). They all occur in open syllables, and it is further obligatory for /ã/ to co-occur with a preceding glide /j/ or /w/.

As illustrated in Figure 10, /ẽ/ is lower than [ɪ] in /tin⁴⁴/ [tɪn⁴⁴] ‘nail’; the vowel ellipses of /o/ and /õ/ overlap considerably, but /õ/ shows a wider variation along the F1 dimension, especially for the female speakers. For both gender groups, the vowel ellipse of /ã/ covers that of /a/ completely, with /ã/ further showing a wider variation along both F1 and F2. The variation can be partly explained by the coarticulatory effect exerted by the preceding glide: Figure 11 separates the /ã/ tokens into two groups, one following /j/ and the other following /w/. It can be seen that the former (/tjã⁴⁴/ ‘light’) is generally more anterior and the latter (/wã⁴⁴/ ‘bent’) is more posterior than /a/ in open syllables (/ka⁴⁴/ ‘street’).

Diphthongs

Symbols of diphthongs on a conventional IPA vowel chart.

As seen in the above vowel diagram, Xiangxiang dialect has three diphthongs: /eɪ/ (as in /jeɪ⁴⁴/ ‘excellent’), /aɪ/ (as in /taɪ⁴⁴/ ‘slow-witted’) and /aʊ/ (as in /taʊ⁴⁴/ ‘knife’). They occur only in open syllables and /eɪ/ is obligatorily preceded by /j/.

Figure 12 plots the onset and offset values of the diphthongs only. Figure 13 also plots [ɪ] (/tin⁴⁴/ [tɪn⁴⁴] ‘nail’), /u/ and /a/, only to facilitate a more precise description of the diphthongs.^{Footnote 4} Whether diphthongs are single vowels or sequences of two vowels, as well as whether monophthongs, which are often taken to represent the canonical form of a vowel phoneme, act as targets for diphthongs, is beyond the scope of this paper, and readers are referred to Miret (1998: 27–31) for a more extensive discussion.

Figure 13 Vowel ellipses for the three diphthongs and three monophthongs (/i/ [ɪ], /u/, /a/).

As can be seen from Figure 13, /aɪ/ and /aʊ/ both start at a place near monophthong /a/ in the acoustic plane, and glide toward a place which can be described as the lax counterparts of monophthongs /i/ and /u/, respectively. The third diphthong is described as /eɪ/, for it starts near /e/, and moves to a position which overlaps considerably with [ɪ] in /tin⁴⁴/ [tɪn⁴⁴] ‘nail’ and the end-point of /aɪ/.

Note that, according to past studies (e.g. S. Zeng 2001, Jiang 2008), there are a greater number of diphthongs in Xiangxiang dialect, because all the glide–vowel combinations (such as /ja/, /wa/ and /ɥa/) are described as genuine diphthongs (such as /ia/, /ua/ and /ya/). However, given the following finding, a high vocoid followed by a vowel in Xiangxiang is transcribed as a GV sequence in the present study. Specifically, the glide element, as in /ja³⁴/ ‘hot’ (Figure 14a), has a very brief duration, and the change of vowel quality over the /ja/ sequence is much less gradual than over a diphthong, as in /taɪ⁴⁴/ ‘slow-witted’ (Figure 14b). This fits Ladefoged & Maddieson’s (1996: 322) definition of a ‘glide’ well, as it involves a quick movement from a high vowel position to that of the following vowel. Similar treatments can also be found in very recent studies on the sound system of Chinese dialects, such as Shanghai Chinese (Chen & Gussenhoven 2015) and Tianjin Mandarin (Q. Li et al. 2019).

Figure 14 Waveform and spectrogram of (a) /ja³⁴/ ‘hot’ and (b) /taɪ⁴⁴/ ‘slow-witted’.

Syllable structure

The syllable structure of Xiangxiang dialect is (C)(G)V(C). There is a clear asymmetry between the beginning and the end of a syllable: all Cs can occur at the beginning of a syllable, whereas only two of the three nasals can occur at the end of a syllable, namely /n/ and /ŋ/.
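The (C)(G)V(C) template and its coda restriction can be checked mechanically. The sketch below is an illustration only: it assumes the input is already split into phonemic segments, takes its inventories from the phonemes listed in this Illustration, and deliberately ignores syllabic consonants, tones and the consonant–glide–vowel co-occurrence restrictions. All names in it are hypothetical.

```python
# Segment inventories as listed in this Illustration (syllabic consonants omitted).
ONSETS = {
    "p", "pʰ", "b", "t", "tʰ", "d", "k", "kʰ", "g",
    "ts", "tsʰ", "dz", "s", "tʃ", "tʃʰ", "dʒ", "ʃ",
    "tɕ", "tɕʰ", "dʑ", "ɕ", "x", "ɣ", "m", "n", "ŋ",
}
GLIDES = {"j", "w", "ɥ"}
NUCLEI = {"i", "y", "ɯ", "u", "a", "o", "ẽ", "õ", "ã", "eɪ", "aɪ", "aʊ"}
CODAS = {"n", "ŋ"}  # only two of the three nasals may close a syllable


def fits_template(segments):
    """Return True if the segment list has the shape (C)(G)V(C)."""
    segs = list(segments)
    i = 0
    if i < len(segs) and segs[i] in ONSETS:   # optional onset consonant
        i += 1
    if i < len(segs) and segs[i] in GLIDES:   # optional glide
        i += 1
    if i >= len(segs) or segs[i] not in NUCLEI:
        return False                          # the vowel is obligatory
    i += 1
    if i < len(segs) and segs[i] in CODAS:    # optional nasal coda
        i += 1
    return i == len(segs)                     # nothing may be left over


print(fits_template(["k", "w", "a"]))  # /kwa/ 'to scrape'
print(fits_template(["t", "a", "ŋ"]))  # /taŋ/ 'to serve'
print(fits_template(["t", "a", "m"]))  # /m/ is not a possible coda
```

A fuller model would add the restrictions discussed in the rest of this section, for example that only /i/ and /a/ take a coda.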

Table 1 Co-occurrence restrictions on consonants with the high vowels /i y u/ and glides /j ɥ w/.

G is either /j/, /w/ or /ɥ/, which may occur syllable-initially or following C. There are some distributional restrictions concerning the glides /j ɥ w/. Firstly, as shown in Table 1, all three glides are banned after the post-alveolar consonants /tʃ tʃʰ dʒ ʃ/, which only allow a following homorganic syllabic approximant //; labial consonants reject a following /ɥ/ or /w/. Of the three glides, /ɥ/ has the narrowest distribution in CG combinations, being allowed only after an alveolo-palatal. Secondly, the distribution of the corresponding high vowels /i y u/ is also shown in Table 1. As can be seen, while /j/ has a similar distribution to its high vowel counterpart /i/, this is not the case for /ɥ y/ and /w u/. For instance, while a labial consonant is allowed to occur before /u/, it rejects a following /w/; alveolar non-sibilants /t tʰ d n/ can be followed by /y/ but not by /ɥ/.^{Footnote 5} Thirdly, as for GV co-occurrence restrictions, /j/ has a wider distribution than /w/ and /ɥ/. Within an open syllable, /j/ is only banned before the front vowels /i y ẽ/; /w/ is not allowed to occur before any of the back vowels /ɯ u o õ/ or the front rounded /y/; /ɥ/ is banned before all the vowels except /i/ and /ẽ/. Additionally, only /j/ and /w/ may occur before diphthongs: /j/ before /eɪ aʊ/ and /w/ before /aɪ/.

There are further phonotactic restrictions. First, as noted in the section on consonants, there are some restrictions concerning the three sets of sibilants: /t∫ t∫^h dʒ ∫/ only occur before the homorganic syllabic approximant //; /tɕ tɕ^h dʑ ɕ/ are only permitted before high front vowels/glides (/i y j Ч/); /ts ts^h dz s/ do not occur in the environments where /t∫ t∫^h dʒ ∫/ and /tɕ tɕ^h dʑ ɕ/ occur. Secondly, not all the velar obstruents are allowed before the high front vowels/glides. Specifically, /x/ does not occur before any of the high front vowels/glides (/i y j Ч/); /k k^h g/ are absent before the high front rounded vowel/glide (/y Ч/); /ɣ/ is only absent before /Ч/. Thirdly, among the three nasal vowels, /ã/ has the narrowest distribution: it can only occur in combination with a preceding glide /j/ or /w/. Fourthly, the three diphthongs /eI aI aυ/ occur only in open syllables. /eI/ is obligatorily preceded by /j/; /aI/ and /aυ/ occur with a non-glide consonant onset or in combination with a preceding glide (/w/ for /aI/ and /j/ for /aυ/). Finally, only /i/ and /a/ can occur in CVN syllables: /i/ is only allowed to appear before /n/; /a/ can occur before either /n/ or /ŋ/.
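The nucleus–coda restrictions just stated can be summarized as a small lookup. The following is only a minimal sketch for illustration and is not part of the original description: the names `ALLOWED_CODAS` and `coda_allowed` are hypothetical, and only the pairing of vowel and final nasal is modeled (onsets, glides and tones are ignored).

```python
# Hypothetical helper: encodes only the nucleus-coda restrictions stated in the text.
# /i/ may be followed by /n/ only; /a/ by /n/ or /ŋ/; no other vowel takes a coda.
ALLOWED_CODAS = {
    "i": {"n"},
    "a": {"n", "ŋ"},
}


def coda_allowed(vowel, coda=None):
    """Return True if `vowel` may be followed by `coda` (None means an open syllable)."""
    if coda is None:
        return True  # every vowel is treated as possible in an open syllable here
    return coda in ALLOWED_CODAS.get(vowel, set())


if __name__ == "__main__":
    for vowel, coda in [("a", "ŋ"), ("i", "n"), ("i", "ŋ"), ("u", "n"), ("a", "m")]:
        print(vowel, coda, coda_allowed(vowel, coda))
```

A lookup keyed by vowel keeps the asymmetry between /i/ and /a/ explicit; a fuller checker would also need the onset and glide restrictions summarized in Table 1.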

Tones

Citation forms

The citation tones and sandhi patterns described here are based on accompanying sound files produced by the consultant of the current study. For acoustic analysis of tones produced by several speakers, readers are referred to T. Zeng (2012).

After the fundamental frequencies (f0) of the citation and sandhi tones were obtained, they were normalized with the T-normalization method in (1).

(1)

$$T_i = 5 \times \frac{\lg x_i - \lg x_{\min}}{\lg x_{\max} - \lg x_{\min}}$$

In this formula, x_i is the f0 value to be normalized; x_min and x_max denote the minimum and maximum f0 values for a given speaker, respectively. This method has been used in many studies, such as Shi (1986: 78; 1990: 68; 1994: 12) and Shi & Wang (2006: 34), to reduce interspeaker f0 variation, as well as to normalize the f0 values so that they are interpretable in the five-point-scale pitch system. The five intervals (0–1, 1–2, 2–3, 3–4, and 4–5) in Figures 15 and 16 correspond to the five pitch levels of the tone range (1, 2, 3, 4 and 5, or low, half-low, medium, half-high and high), respectively (Chao 1980: 81).
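As a worked illustration of formula (1), the sketch below computes a T value and maps it onto the five pitch levels. It is a minimal sketch under stated assumptions, not the procedure used to produce Figures 15 and 16: the function names `t_value` and `chao_level` are hypothetical, the f0 figures in the usage lines are invented for illustration, and the assignment of exact interval boundaries to the lower level is an assumption.

```python
import math


def t_value(f0, f0_min, f0_max):
    """T-normalize one f0 value (Hz) against a speaker's f0 range, as in formula (1)."""
    if not (0 < f0_min < f0_max):
        raise ValueError("expected 0 < f0_min < f0_max")
    return 5 * (math.log10(f0) - math.log10(f0_min)) / (
        math.log10(f0_max) - math.log10(f0_min)
    )


def chao_level(t):
    """Map a T value in [0, 5] to a pitch level 1-5.

    Intervals 0-1, 1-2, ..., 4-5 correspond to levels 1, ..., 5.
    Assumption: a value exactly on a boundary is assigned to the lower level.
    """
    return min(5, max(1, math.ceil(t)))


if __name__ == "__main__":
    # Illustrative values only (not data from the study).
    speaker_min, speaker_max = 100.0, 400.0
    for f0 in (100.0, 200.0, 400.0):
        t = t_value(f0, speaker_min, speaker_max)
        print(f0, round(t, 2), chao_level(t))
```

Because the scale is logarithmic, an f0 value at the geometric midpoint of the speaker's range (200 Hz in the illustrative range above) falls at T = 2.5, that is, in the middle of level 3.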

Figure 15 Pitch curves of the seven citation tones interpretable in Chao’s five-point scale for the female speaker who provided the data for illustration.

Figure 16 Tone sandhi patterns of bisyllabic words from the speaker who provided the recording for illustration in this study.

Figure 17 Waveform and spectrogram of [ɕja³⁴ ɕja³³] ‘black’ (/ɕja³⁴/ ‘black’ + /ɕja³⁴/ ‘color’) from the speaker who provided the recording for illustration in this study.

Xiangxiang dialect has seven citation tones, as illustrated in Figure 15. Tone 1 is a high level tone which has a relatively stable pitch at level 4 throughout its duration, and is accordingly transcribed as 44. Tone 2 and Tone 3 both have a rising contour. Tone 2 starts at pitch level 2 and Tone 3 at 3. Although both tones end roughly near the dividing line between levels 3 and 4, they were transcribed as 24 and 34, respectively, because if the final pitch level were transcribed as 3, Tone 3 would be 33, a level rather than a rising tone. Tone 5 and Tone 6 are also rising tones, which start with a two-level difference in pitch (level 4 for Tone 5 and level 2 for Tone 6) and end at a comparable pitch level (5). They were accordingly annotated as 45 and 25, respectively. Although the four rising tones all tend to start with a dip, they were still treated as rising rather than concave tones, considering that (i) the initial fall is very brief and ranges within about one scale point and (ii) the initial fall for these tones is not observed for all speakers (see T. Zeng 2012 for the tones produced by three other speakers). In contrast, an initial fall is obligatory for the concave tone in Mandarin Chinese and Suzhou Chinese, as evidenced by the acoustic data provided in Lau (2002) and Shi & Wang (2006), indicating that the initial fall is intended by the speakers of these two dialects. It is also noticeable that Tone 2 is lower in f0 than Tone 3 for most of its duration, except that they end with rather comparable f0 at the middle of the f0 range. Similarly, Tone 6 has a lower f0 than Tone 5 throughout its duration except for the very final part, which constitutes the highest f0 level. It is further observed that these four rising tones co-occur with different onset consonants (Tone 2 with voiced and voiceless aspirated obstruent onsets vs. Tone 3 with voiceless unaspirated obstruent or sonorant onsets; Tone 5 with voiceless unaspirated obstruent and sonorant onsets vs. Tone 6 with voiceless aspirated obstruent onsets). At a more abstract level, the pitch differences between Tone 2 and Tone 3, as well as those between Tone 5 and Tone 6, which are evident at the beginning and middle part of the pitch contour, can be argued to be only partly attributable to the type of the onset consonants. Tone 4 is a low falling tone 21 and Tone 7 is a low level tone 22.

As shown in Table 2, there are some consonant–tone co-occurrence restrictions in Xiangxiang. Specifically, Tone 1, Tone 3, Tone 4 and Tone 5 co-occur with either voiceless obstruents or sonorants, Tone 2 with voiced and voiceless aspirated obstruents, and Tone 6 with voiceless aspirated obstruents only.

Table 2 Co-occurrence relationship between syllable onset and tones.

Sandhi patterns for bisyllabic words

Following the suggestion from one of the anonymous reviewers, the test words include one set of near-minimal-pair words (MP), and one set of high-frequency words (HF), as listed in Table 3 below. Similar results are obtained for both data sets, which show that the tones of the first syllable do not undergo sandhi rules and remain unchanged, whereas the tonal contours of the second syllable are reduced to just three level tones, depending on which citation tone it carries. If the citation tone of the second syllable is 45 or 25, it becomes a high-level tone 44; if the tone in question is 44, 24 or 34, it becomes a mid-level tone 33; if it is 21 or 22, it becomes a low-level tone 22. Please note that these sandhi patterns apply only to bisyllabic words and further studies are needed for longer words.
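The bisyllabic sandhi pattern described above amounts to a fixed mapping on the second syllable. The following minimal sketch restates it in code; the names `SECOND_SYLLABLE_SANDHI` and `apply_sandhi` are hypothetical, tones are written as Chao digit strings, and the sketch deliberately ignores the coarticulatory effects and longer words that the text leaves for future study.

```python
# Hypothetical restatement of the sandhi rule for bisyllabic words:
# the first syllable keeps its citation tone; the second is levelled.
SECOND_SYLLABLE_SANDHI = {
    "45": "44", "25": "44",              # Tones 5 and 6 -> high level
    "44": "33", "24": "33", "34": "33",  # Tones 1, 2 and 3 -> mid level
    "21": "22", "22": "22",              # Tones 4 and 7 -> low level
}


def apply_sandhi(first_tone, second_tone):
    """Return the surface tones of a bisyllabic word given its two citation tones."""
    if first_tone not in SECOND_SYLLABLE_SANDHI:
        raise ValueError(f"unknown citation tone: {first_tone}")
    if second_tone not in SECOND_SYLLABLE_SANDHI:
        raise ValueError(f"unknown citation tone: {second_tone}")
    return first_tone, SECOND_SYLLABLE_SANDHI[second_tone]


if __name__ == "__main__":
    # /ɕja34/ + /ɕja34/ surfaces as [ɕja34 ɕja33] (cf. Figure 17).
    print(apply_sandhi("34", "34"))
    print(apply_sandhi("44", "25"))
```

Keeping the rule as a single table makes it easy to see that seven citation tones collapse into three level tones in non-initial position.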

Table 3 Example words for tone sandhi. Two sets of words were provided, one including minimal pairs (denoted as MP), the other high-frequency words (HF).

To illustrate the sandhi patterns, the tone contours of 14 groups of words are shown in panels (a)–(n) in Figure 16 below. Panels (a)–(g) show the pitch curves of seven groups of bi-syllabic near–minimal-pair words, with the tone of the second syllable remaining unchanged within each group, while the tone of the first syllable is a variable ‘x’ from Tone 1 to Tone 7: x + T1 (panel (a)), x + T2 (panel (b)), x + T3 (panel (c)), x + T4 (panel (d)), x + T5 (panel (e)), x + T6 (panel (f)) and x + T7 (panel (g)). They are used to illustrate that the first syllables remain unchanged in the sandhi environment. Take panel (a), for example. It shows the pitch curves of seven bi-syllabic words T1 + T1, T2 + T1, T3 + T1, T4 + T1, T5 + T1, T6 + T1, T7 + T1. It can be seen that there are seven different pitch curves for the first syllables, resembling those of their respective counterparts when pronounced alone. This indicates that the first syllables do not undergo tonal changes in sandhi environments.

Panels (h)–(n) in Figure 16, on the other hand, show the pitch curves of the seven groups of high-frequency words. Unlike in panels (a)–(g), however, the tonal data in panels (h)–(n) are presented in such a way that the tone of the first syllable remains unchanged within each group and the tone of the second syllable is a variable ‘x’ from Tone 1 to Tone 7: T1 + x (panel (h)), T2 + x (panel (i)), T3 + x (panel (j)), T4 + x (panel (k)), T5 + x (panel (l)), T6 + x (panel (m)) and T7 + x (panel (n)). They are used to show that the tonal contours of the second syllables are reduced to three level tones 44, 33 and 22, depending on which citation tone they carry. For instance, panel (h) shows the pitch curves of seven bi-syllabic words T1 + T1, T1 + T2, T1 + T3, T1 + T4, T1 + T5, T1 + T6, T1 + T7. As can be seen, the tone values of the first syllables are 44, the same as the corresponding citation tone; the second syllables are grouped into three level tones. Specifically, Tone 5 (45) and Tone 6 (25) become 44; Tone 1 (44), Tone 2 (24) and Tone 3 (34) become 33; Tone 4 (21) and Tone 7 (22) become 22.

Note that some carryover coarticulatory effects are observable, especially in panel (b), which shows the pitch curves of seven bi-syllabic words T1 + T2, T2 + T2, T3 + T2, T4 + T2, T5 + T2, T6 + T2, T7 + T2. Broadly speaking, the pitch curves of the seven non-initial syllables lie within pitch level 3, but the tones following initial Tone 5 (45) and Tone 6 (25) are clearly higher than those following the other five tones. This suggests that other factors, such as tonal coarticulation, may also affect the pitch contour realization of the non-initial syllables. A systematic analysis of these factors is left for future studies.

It is argued that this three-register leveling is part of a neutralization process for the non-initial syllable, which tends to occupy a weak position when compared with the initial syllable. The non-initial syllable shares more phonetic features with a weak or neutralized syllable, as defined in Lin & Yan (1990) and Cao (2016), than does the initial syllable, which is in a strong position. As illustrated by the example word [ɕja³⁴ ɕja³³] ‘black’ (/ɕja³⁴/ ‘black’ + /ɕja³⁴/ ‘color’) in Figure 17, apart from tonal reduction or neutralization, the non-initial syllable often shows shorter duration, lower amplitude or reduced vowel quality compared with the initial syllable. At a more abstract level, the observed sandhi pattern in Xiangxiang dialect can be viewed as the result of contour deletion, based on the tonal models argued for in Z. Bao (1990, 1996), which regard tone as a binary-split structure consisting of a contour node and a register node. That is, after contour deletion, the two high rising tones (45, 25) become high level; the three tones with a mid register (44, 24, 34) become mid level; the two low tones (21, 22) become low level.

Transcription of recorded passage

Phonemic version

This passage is transcribed phonemically, using the symbols presented in the vowel and consonant charts. Tones are marked for each syllable, with sandhi tones given in (). [ǁ] marks the end of an utterance and [│] the end of an intonational phrase within an utterance.

jeI⁽²²⁾ i⁽³³⁾ t^hẽ⁴⁴ │ pja³⁴ xwan⁽³³⁾ kjã⁽³³⁾ t^ha²⁵ jaŋ⁽³³⁾ dzaI⁽²²⁾ nu⁽²²⁾ ni⁽²²⁾ tsõ⁴⁴ │ na²¹ ku⁽⁵⁵⁾ ti⁽³³⁾ pin²¹ dz⁽²²⁾ da⁽²²⁾ ║ tsõ⁴⁴ naI⁽³³⁾ tsõ⁴⁴ k^hi⁽⁵⁵⁾ │ jeI²¹ ku⁽⁵⁵⁾ jan⁽³³⁾ kɯ⁽⁵⁵⁾ ɕjan⁴⁴ │ ɕjan⁴⁴ ɣjaŋ²² t^hЧẽ⁽³³⁾ ni⁽²²⁾ gẽ⁽²²⁾ ɣaI²² ŋaυ⁽²²⁾ ts`⁽²²⁾║ t^ho⁽³³⁾ njã⁽³³⁾ dʑjeI⁽²²⁾ ɕjaŋ⁴⁴ njaŋ⁽³³⁾ xaυ²¹ │ kaŋ²¹ │ na²¹ ku⁽⁵⁵⁾ ɕjẽ⁴⁴ jaŋ⁽²²⁾ ku⁽²²⁾ ku⁽⁵⁵⁾ kɯ⁽⁵⁵⁾ ɕjan⁴⁴ ti⁽³³⁾ │ po⁽²²⁾ t^ho⁴⁴ ti⁽³³⁾ Naυ²¹ ts⁽²²⁾ t^hwa¹³ ɣo⁽²²⁾ naI⁽³³⁾ │ dʑjeI⁽²²⁾ swã⁽⁵⁵⁾ na²¹ ku⁽⁵⁵⁾ ti⁽³³⁾ pin³¹ dz’⁽²²⁾ da⁽²²⁾ ║ pja³⁴ xwan⁽³³⁾ dʑjeI⁽²²⁾ jan⁽²²⁾ tɕЧi⁴⁵ da⁽²²⁾ ka⁽³³⁾ kin⁴⁵ │ ɣo²² dʑin⁽³³⁾ ka⁽³³⁾ t^hy⁴⁴ ║ bɯ⁽²²⁾ kɯ⁽⁵⁵⁾ │ t^ho⁽³³⁾ t^hy⁴⁴ tja⁽³³⁾ Чa⁽³³⁾ ni⁴⁵ xaI⁽⁵⁵⁾ │ nu⁽²²⁾ ku⁽⁵⁵⁾ jan⁽³³⁾ dʑjeI⁽²²⁾ po⁽²²⁾ t^ho ⁽³³⁾ ti⁽³³⁾ Naυ²¹ ts`⁽²²⁾ │ k^hjã²⁵ tja⁽³³⁾ Чa⁽³³⁾ kin²¹║ tɕЧi⁴⁵ ɣaI⁽²²⁾ │ pja³⁴ xwan⁽³³⁾ maυ⁽²²⁾ tja⁽³³⁾ bjã²² xwa⁽³³⁾ │ t∫⁽³³⁾ xaυ⁽²²⁾ swã⁴⁵ ni⁽³³⁾ ║ kɯ⁴⁵ ni⁽²²⁾ i⁽³³⁾ ɣo²² tɕi⁽²²⁾ │ t^ha²⁵ jaŋ⁽³³⁾ t^hy⁽³³⁾ naI⁽³³⁾ i⁽³³⁾ so⁴⁵ │ nu⁽²²⁾ ku⁽⁵⁵⁾ jan⁽³³⁾ mo)²¹ ɣjaŋ⁽²²⁾ dʑjeI⁽²²⁾ po⁽²²⁾ to⁽³³⁾ Naυ²¹ ts⁽²²⁾ t^hwa²⁴ ɣo⁽²²⁾ naI⁽³³⁾ ni⁽²²⁾ ║su²¹ i⁽²²⁾ │ pja³⁴ xwan⁽³³⁾ bɯ⁽³³⁾ tja⁽³³⁾ bɯ⁽³³⁾ dan²⁴ jan⁽²²⁾ │ ɣa⁽³³⁾ dz⁽²²⁾ t^ha²⁴ jaŋ⁽³³⁾ pi⁽²²⁾ t^ho⁽³³⁾ ti⁽³³⁾ pin²¹ dz’⁽²²⁾ da⁽²²⁾ ║

Orthographic version

有一天，北风跟太阳在那里争哪个的本事大。争来争去，有个人过身，身上穿了件厚袄子。他们就商量好，讲，哪个先让这个过身的把他的袄子脱下来，就算哪个的本事大。北风就用最大咖劲，下勤咖吹。不过，他吹得越厉害，那个人就把他的袄子箝得越紧。最后，北风冇得办法，只好算哩。过哩一下唧，太阳出来一晒，那个人马上就把只袄子脱下来哩。所以，北风不得不承认，还是太阳比他的本事大。

Acknowledgements

I would like to thank the editors Amalia Arvaniti and Adrian Simpson, the two anonymous reviewers, our audio assistant André Radtke and copy-editor Ewa Jaworska for their constructive suggestions on earlier versions of this paper. I would also like to express special thanks to Qian Li and Dianfeng Hou for their insightful advice and encouragement during our discussions. I am also very grateful to Ian Maddieson, Chris Sinha and Russell Palmer for their invaluable contributions.

Supplementary material

To view supplementary material for this article, please visit https://doi.org/10.1017/S002510031800035X

Footnotes

1 Opinions differ as to the number of dialectal groups in China, which have been posited as numbering five (Wang 1936), seven (Yuan 2001, X. Li & Xiang 2009, Huang & Liao 2011: 4–7), nine (F. Li 1973) and ten (LAC 2012). This paper follows the proposal by LAC (2012) to divide Chinese into ten dialectal groups – Mandarin, Jin, Wu, Min, Hakka, Cantonese, Xiang, Gan, Hui, Ping and local dialects – primarily because this study is based on major findings (especially numerous new and increasingly influential findings in the last 20 years) that help to establish the basic facts regarding the major linguistic features of Chinese dialects, presenting a comprehensive and up-to-date account of the distribution and classification of Chinese dialects.

2 Apart from Xiangxiang, Shuangfeng is also frequently regarded as a representative variety of Old Xiang (e.g. Yuan 2001), due to the fact that the linguistic features of Old Xiang are evident in both dialects (H. Bao 2006). In fact, Shuangfeng has been regarded as a subdialect of Xiangxiang because (i) up until 1951, Shuangfeng City, where the Shuangfeng dialect is spoken, had been administratively subordinate to Xiangxiang City; and (ii) the phonetic systems of the two dialects resemble each other to a great extent (H. Bao & Chen 2005, H. Bao 2006).

3 Nasal airflow baselines are intrinsic nasal airflow levels that are normally observed on oral vowels in pure oral contexts. They are established by calculating the mean nasal airflow rates for oral vowels in (C)V syllables. The onset and offset of nasalization for an oral vowel in (C)VN syllables or for a nasal vowel were defined as the points at which nasal airflow rises above or falls below the baseline, respectively. The baseline was determined for each vowel and for each speaker.

4 This follows studies such as Ladefoged & Johnson (2011: 93), in which the phonetic values of the diphthongs are often described by comparing with those of the monophthong vowels, as indicated by statements such as ‘some American English speakers have a diphthong starting with a vowel very like [ɛ] in head’.

5 Note that although all are grouped under ‘alveolar’, /t t^h d/ are typically denti-alveolo-post-alveolar, /n/ is typically alveolar, and /ts ts^h dz s/ are typically denti-alveolar.

References

Bao, Houxing. 2001. Gaishu [An outline]. In Yongming Li (ed.), Hunan sheng fangyanzhi [Dialects in Hunan Province], 1–31. Changsha: Hunan Renmin Chubanshe.

Bao, Houxing. 2006. Xiangfangyan Gaiyao [An outline of the Xiang dialects]. Changsha: Hunan Shifan Daxue Chubanshe.

Bao, Houxing & Hui Chen. 2005. Xiangyu de fenqu [The subdivision of the Xiang dialects]. Fangyan [Dialects] 3, 261–270.

Bao, Zhiming. 1990. On the nature of tone. Ph.D. dissertation, MIT.

Bao, Zhiming. 1996. Tonal contour and register harmony in Chaozhou. Linguistic Inquiry 30, 485–493. doi:10.1162/002438999554165

Cao, Jianfen. 2016. Yuyan de Yunlü yu Yuyin de Bianhua [Prosody and phonetic variations]. Beijing: Chinese Academy of Social Sciences Press.

Chao, Yuanren. 1980. A system of “tone-letters”. Fangyan [Dialects] 2, 81–83.

Chao, Yuanren & Zongji Wu. 1974. Xiangxiang fangyan [Xiangxiang dialect]. In Shifeng Yang (ed.), Hunan fangyan diaocha baogao [A survey of the dialects in Hunan Province]. Journal of the National Institute of Historical Linguistics 66, 584–602.

Chen, Yiya & Carlos Gussenhoven. 2015. Shanghai Chinese. Journal of the International Phonetic Association 45(3), 321–337. doi:10.1017/S0025100315000043

Huang, Borong & Xudong Liao (eds.). 2011. Xiandai hanyu [Modern Chinese], vol. 1, 4–7. Beijing: China Higher Education Press.

Jiang, Junfeng. 2008. Xiangxiang fangyan yuyin yanjiu [A phonetic study of Xiangxiang dialect]. Ph.D. dissertation, Hunan Normal University.

Keating, Patricia A., Michael J. Mikos & William F. Ganong III. 1981. A cross-language study of range of voice onset time in the perception of initial stop voicing. The Journal of the Acoustical Society of America 70(5), 1261–1271. doi:10.1121/1.387139

LAC [Language atlas of China]. Institute of Linguistics of the Chinese Academy of Social Sciences, Institute of Ethnology and Anthropology of the Chinese Academy of Social Sciences & Language Information Sciences Research Centre of the City University of Hong Kong (eds.). 2012. Zhongguo yuyan dituji [Language atlas of China], 2nd edn. Beijing: Commercial Press.

Ladefoged, Peter & Keith Johnson. 2011. A course in phonetics, 6th edn. Boston, MA: Wadsworth, Cengage Learning.

Ladefoged, Peter & Ian Maddieson. 1996. The sounds of the world’s languages. Oxford: Blackwell.

Lau, Sze Lok. 2002. Tone and tone sandhi in Suzhou. M.Phil. thesis, City University of Hong Kong.

Lee, Wai-sum & Eric Zee. 2003. Standard Chinese (Beijing). Journal of the International Phonetic Association 33, 109–112. doi:10.1017/S0025100303001208

Li, Fang-kuei. 1973. Languages and dialects of China. Journal of Chinese Linguistics 1(1), 1–13.

Li, Qian, Yiya Chen & Ziyu Xiong. 2019. Tianjin Mandarin. Journal of the International Phonetic Association 48(1), 109–128. doi:10.1017/S0025100317000287

Li, Wenchao. 1999. A diachronically-motivated segmental phonology of Mandarin Chinese. New York: Peter Lang.

Li, Xiaofan & Mengbing Xiang. 2009. Hanyu fangyanxue jichu jiaocheng [A basic course of Chinese dialectology]. Beijing: Beijing University Press.

Lin, Maocan & Jingzhu Yan. 1990. Putonghua qingsheng yu qingzhongyin [The neutral tone and stress in Putonghua]. Yuyan Jiaoxue yu yanjiu 3, 88–104.

Lin, Yen-Hwei. 2014. Segmental phonology. In C.-T. James Huang, Y.-H. Audrey Li & Andrew Simpson (eds.), The handbook of Chinese linguistics, 400–421. Oxford: Wiley-Blackwell. doi:10.1002/9781118584552.ch15

Miret, Fernando Sánchez. 1998. Some reflections on the notion of diphthong. Papers and Studies in Contrastive Linguistics 34, 27–51.

Recasens, Daniel. 1990. The articulatory characteristics of palatal consonants. Journal of Phonetics 18, 267–280. doi:10.1016/S0095-4470(19)30393-6

Rosner, Burton S., Luis E. López-Bascuas, Jose E. García-Albea & Richard P. Fahey. 2000. Voice-onset times for Castilian Spanish initial stops. Journal of Phonetics 28, 217–224. doi:10.1006/jpho.2000.0113

Shi, Feng. 1983. Suzhouhua zhuoseyin de shengxue fenxi [A phonetic study of the voiced stops in Suzhou Chinese]. Yuyan Yanjiu 1, 49–83.

Shi, Feng. 1986. Tianjin fangyan shuangzizu shengdiao fenxi [An acoustic study of the tone sandhi for bisyllabic words in Tianjin Chinese]. Yuyan Yanjiu 1, 77–90.

Shi, Feng. 1990. Tianjin fangyan shuangzizu shengdiao fenxi [An acoustic analysis of the tone sandhi for the bisyllabic words in Tianjin Chinese]. In Feng Shi (ed.), Yuyinxue tanwei [Phonetic inquiry], 66–83. Beijing: Beijingdaxue Chubanshe.

Shi, Feng. 1994. Beijinghua de shengdiao geju [The tonal system of Beijing Mandarin]. In Feng Shi & Rongrong Liao (eds.), Yuyin conggao [Papers on phonetics], 10–19. Beijing: Beijing Yuyan Xueyuan Chubanshe.

Shi, Feng & Ping Wang. 2006. Beijinghua danziyin shengdiao de tongji fenxi [A statistical analysis of the citation tones in Beijing Mandarin]. Zhongguo yuwen 5, 33–40.

van Alphen, Petra M. & Roel Smits. 2004. Acoustical and perceptual analysis of the voicing distinction in Dutch initial plosives: The role of prevoicing. Journal of Phonetics 32(4), 455–491. doi:10.1016/j.wocn.2004.05.001

Wang, Li. 1936. Zhongguo yinyunxue xia [Chinese phonology II]. Shanghai: Shanghai Shudian.

Williams, Lee. 1977. The voicing contrast in Spanish. Journal of Phonetics 5, 169–184. doi:10.1016/S0095-4470(19)31127-1

Yuan, Jiahua. 2001. Hanyu fangyan gaiyao [An outline of Chinese dialects], 2nd edn. Beijing: Yuwen Chubanshe.

Zee, Eric & Wai-sum Lee. 2008. The articulatory characteristics of the palatals, palatalized velars and velars in Hakka Chinese. Presented at the 8th International Seminar on Speech Production, Strasbourg, France.

Zeng, Shaoda. 2001. Xiangxiang fangyan [Xiangxiang dialect]. In Yongming Li (ed.), Hunan sheng fangyanzhi [Dialects in Hunan Province], 211–275. Changsha: Hunan Renmin Chubanshe.

Zeng, Ting. 2012. A phonetic study of the sounds and tones in Xiangxiang Chinese. Ph.D. dissertation, City University of Hong Kong.

Zeng, Ting. 2015. Devoicing of historically voiced obstruents in Xiangxiang Chinese. Journal of Chinese Linguistics 43(2), 638–667.

Zhou, Zhenhe & Rujie You. 1985. Hunansheng fangyan quhua ji lishi beijing [The subdivision of the dialects in Hunan Province and the historical background]. Fangyan [Dialects] 4, 257–272.

Zeng supplementary material

Sound files zip. These audio files are licensed to the IPA by their authors and accompany the phonetic descriptions published in the Journal of the International Phonetic Association. The audio files may be downloaded for personal use but may not be incorporated in another product without the permission of Cambridge University Press
