Book contents
- Frontmatter
- Contents
- Foreword
- Preface
- 1 Introduction
- 2 Communication and language
- 3 The text-to-speech problem
- 4 Text segmentation and organisation
- 5 Text decoding: finding the words from the text
- 6 Prosody prediction from text
- 7 Phonetics and phonology
- 8 Pronunciation
- 9 Synthesis of prosody
- 10 Signals and filters
- 11 Acoustic models of speech production
- 12 Analysis of speech signals
- 13 Synthesis techniques based on vocal-tract models
- 14 Synthesis by concatenation and signal-processing modification
- 15 Hidden-Markov-model synthesis
- 16 Unit-selection synthesis
- 17 Further issues
- 18 Conclusion
- Appendix A Probability
- Appendix B Phone definitions
- References
- Index
6 - Prosody prediction from text
Published online by Cambridge University Press: 25 January 2011
Summary
Informally, we can describe prosody as the part of human communication that expresses emotion, emphasises words, reveals the speaker's attitude, breaks a sentence into phrases, governs sentence rhythm and controls the intonation, pitch or tune of the utterance. This chapter describes how to predict prosodic form from the text, while Chapter 9 goes on to describe how to synthesize the acoustics of prosodic expression from these form representations. In this chapter we first introduce the various manifestations of prosody in terms of phrasing, prominence and intonation. Next we describe how prosody is used in communication, and in particular explain why it has a much more direct effect on the final speech patterns than verbal communication does. Finally, we describe techniques for predicting what prosody should be generated from a text input.
Prosodic form
In our discussion of the verbal component of language, we saw that, while there were many difficulties in pinning down the exact nature of words and phonemes, broadly speaking words and phonemes were fairly easy to find, identify and demarcate. Furthermore, people can do this readily without much specialist linguistic training – given a simple sentence, most people can say which words were spoken, and with some guidance people have little difficulty in identifying the basic sounds in that sentence.
The situation is nowhere near as clear for prosody, and it may amaze newcomers to this topic to discover that there are no widely agreed description or representation systems for any aspect of prosody, be it emotion, intonation, phrasing or rhythm.
- Type: Chapter
- Information: Text-to-Speech Synthesis, pp. 111-145
- Publisher: Cambridge University Press
- Print publication year: 2009