Analysis of speech signals

Paul Taylor

doi:10.1017/CBO9780511816338.014

In this chapter we turn to speech analysis, which tackles the problem of deriving representations from recordings of real speech signals. This book is of course concerned with speech synthesis, and at first sight it may seem that the techniques for generating speech “bottom-up” described in Chapters 10 and 11 are sufficient for our purpose. As we shall see, however, many techniques in speech synthesis rely on an analysis phase, which captures key properties of real speech and then uses them to generate new speech signals. In addition, the techniques described here enable useful characterisation of real speech phenomena for purposes of visualisation or statistical analysis. Speech analysis, then, is the process of converting a speech signal into an alternative representation that better represents the information we are interested in. We need to perform analysis because waveforms do not usually give us this information directly.

Nearly all speech analysis is concerned with three key problems. First, we wish to remove the influence of phase. Second, we wish to perform source/filter separation, so that we can study the spectral envelope of sounds independently of the source with which they are spoken. Finally, we often wish to transform these spectral envelopes and source signals into other representations that are coded more efficiently, have certain robustness properties, or show more clearly the linguistic information we require.
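The first two problems above can be illustrated with a small sketch. The summary does not say which methods the chapter uses, so the choices here are assumptions made for illustration: phase is discarded by taking the magnitude of the discrete Fourier transform, and the spectral envelope is approximated by low-quefrency cepstral liftering, one common way to approximate source/filter separation. The function names, the cutoff `n_keep`, and the synthetic test frame are all hypothetical.

```python
import numpy as np


def magnitude_spectrum(frame):
    """Magnitude of the DFT of a real frame; the phase is simply discarded."""
    return np.abs(np.fft.rfft(frame))


def cepstral_envelope(frame, n_keep=30):
    """Smooth log-magnitude envelope via low-quefrency liftering.

    n_keep is an assumed cutoff: cepstral coefficients below it are kept
    (slowly varying envelope), the rest (fine harmonic detail contributed
    by the source) are zeroed.
    """
    n = len(frame)
    log_mag = np.log(np.abs(np.fft.fft(frame)) + 1e-10)  # small floor avoids log(0)
    cepstrum = np.fft.ifft(log_mag).real
    lifter = np.zeros(n)
    lifter[:n_keep] = 1.0
    if n_keep > 1:
        # The real cepstrum is symmetric, so keep the mirrored coefficients too.
        lifter[-(n_keep - 1):] = 1.0
    smoothed = np.fft.fft(cepstrum * lifter).real
    return smoothed[: n // 2 + 1]  # non-negative frequencies only


# Synthetic "voiced" frame (an assumption, not real speech):
# an impulse train passed through a single decaying resonance.
n = 512
source = np.zeros(n)
source[::80] = 1.0
t = np.arange(64)
impulse_response = 0.95 ** t * np.cos(2 * np.pi * 0.1 * t)
frame = np.convolve(source, impulse_response)[:n] * np.hanning(n)

mag = magnitude_spectrum(frame)   # phase-free representation
env = cepstral_envelope(frame)    # smooth log-magnitude envelope
```

A circular time shift of the frame changes only the phase of its DFT, so `magnitude_spectrum` gives the same result for the shifted frame. The cepstrum computed inside `cepstral_envelope` is also an example of the third problem, a further transformation of the spectrum into a more compact representation.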

12 - Analysis of speech signals
