8 - Pronunciation

Published online by Cambridge University Press: 25 January 2011

Paul Taylor
Affiliation:
University of Cambridge

Summary

We now turn to the problem of how to convert the discrete, linguistic, word-based representation generated by the text-analysis system into a continuous acoustic waveform. One of the primary difficulties in this task stems from the fact that the two representations are so different in nature. The linguistic description is discrete, the same for each speaker for a given accent, compact and minimal. By contrast, the acoustic waveform is continuous, is massively redundant, and varies considerably even between utterances with the same pronunciation from the same speaker. To help with the complexity of this transformation, we break the problem down into a number of components. The first of these components, pronunciation, is the subject of this chapter. While specifics vary, this can be thought of as a system that takes the word-based linguistic representation and generates a phonemic or phonetic description of what is to be spoken by the subsequent waveform-synthesis component. In generating this representation, we make use of a lexicon, to find the pronunciations of words we know and can store, and a grapheme-to-phoneme (G2P) algorithm, to guess the pronunciations of words we don't know or can't store. After doing this we may find that simply concatenating the pronunciations for the words in the lexicon is not enough; words interact in a number of ways and so a certain amount of post-lexical processing is required. Finally, there is considerable choice in terms of how exactly we should specify the pronunciations for words, hence rigorously defining a pronunciation representation is in itself a key topic.
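
The division of labour described in the summary (lexicon lookup for known words, a G2P fallback for unknown words, then post-lexical processing across word boundaries) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the chapter: the three-entry `LEXICON`, the ARPAbet-like phone symbols, the one-letter-to-one-phone table standing in for a real G2P algorithm, and the single post-lexical rule that changes "the" before a vowel-initial word.

```python
from typing import Dict, List

# Toy lexicon (hypothetical entries, ARPAbet-like symbols chosen for illustration).
LEXICON: Dict[str, List[str]] = {
    "the": ["dh", "ax"],
    "cat": ["k", "ae", "t"],
    "apple": ["ae", "p", "ax", "l"],
}

# Naive one-letter-to-one-phone table. This is only a stand-in for a real
# grapheme-to-phoneme algorithm, which would use context and training data.
LETTER_TO_PHONE: Dict[str, str] = {
    "a": "ae", "b": "b", "c": "k", "d": "d", "e": "eh", "f": "f", "g": "g",
    "h": "hh", "i": "ih", "j": "jh", "k": "k", "l": "l", "m": "m", "n": "n",
    "o": "aa", "p": "p", "q": "k", "r": "r", "s": "s", "t": "t", "u": "ah",
    "v": "v", "w": "w", "x": "k", "y": "y", "z": "z",
}

# Phones treated as vowels by the illustrative post-lexical rule below.
VOWELS = {"ae", "ax", "eh", "ih", "aa", "ah", "iy"}


def g2p(word: str) -> List[str]:
    """Guess a pronunciation letter by letter (assumed fallback, very crude)."""
    return [LETTER_TO_PHONE[ch] for ch in word.lower() if ch in LETTER_TO_PHONE]


def lookup(word: str) -> List[str]:
    """Use the lexicon when the word is known, otherwise fall back to G2P."""
    key = word.lower()
    if key in LEXICON:
        return list(LEXICON[key])
    return g2p(key)


def post_lexical(words: List[str], prons: List[List[str]]) -> List[List[str]]:
    """Apply one illustrative cross-word rule: 'the' becomes 'dh iy'
    when the following word starts with a vowel phone."""
    out = [list(p) for p in prons]
    for i in range(len(words) - 1):
        following = out[i + 1]
        if words[i].lower() == "the" and following and following[0] in VOWELS:
            out[i] = ["dh", "iy"]
    return out


def pronounce(words: List[str]) -> List[List[str]]:
    """Word sequence in, one phone list per word out."""
    prons = [lookup(w) for w in words]
    return post_lexical(words, prons)


if __name__ == "__main__":
    print(pronounce(["the", "cat"]))
    print(pronounce(["the", "apple"]))
    print(pronounce(["dog"]))  # not in the toy lexicon, so G2P is used
```

Keeping the three stages as separate functions mirrors the decomposition in the summary: each can be replaced independently, for example by swapping the letter table for a trained G2P model, without touching the lexicon or the post-lexical rules.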

Type: Chapter
Publisher: Cambridge University Press
Print publication year: 2009

  • Pronunciation
  • Paul Taylor, University of Cambridge
  • Book: Text-to-Speech Synthesis
  • Online publication: 25 January 2011
  • Chapter DOI: https://doi.org/10.1017/CBO9780511816338.010