10 - Advanced topics
Published online by Cambridge University Press: 05 June 2016
Summary
The preceding chapters have, as far as possible, attempted to isolate topics into well-defined areas such as speech recognition, speech processing, the human hearing system, the voice production system, big data and so on. This breakdown has allowed us to discuss the relevant factors, background research and application methodology in some depth, as well as to develop many MATLAB examples that are mostly self-contained demonstrations of the sub-topics themselves. However, some modern speech and audio related products and techniques span several disciplines, while others do not fit neatly into the subdivisions discussed earlier.
In this chapter we will build on the foundation of previous chapters, discussing and describing various advanced topics that combine many of the processing elements that we met earlier, including aspects of both speech and hearing, as well as progressing beyond hearing into the very new research domain of low-frequency ultrasound.
It is hoped that this chapter, while conveying some fascinating (and a few unusual) application examples, will inspire readers to apply the knowledge that they have gained so far in many more new and exciting ways.
Speech synthesis
Speech synthesis means creating artificial speech, which could be by mechanical, electrical or other means (although our favoured approach is using MATLAB of course). There is a long history of engineers who have attempted to synthesise speech, including the famous Austrian Wolfgang von Kempelen, who published a mechanical speech synthesiser in 1791. It should be noted that he also invented something called ‘The Turk’, a mechanical chess playing machine which apparently astounded both public and scientists alike for many years before it was revealed that a person, curled up inside, operated the mechanism. The much more sober Charles Wheatstone, one of the fathers of electrical engineering as well as a prolific inventor, built a synthesiser based on the work of von Kempelen in 1857, proving that the original speech synthesiser, at least, was not a hoax.
These early machines used mechanical arrangements of tubes and levers to recreate a model of the human vocal tract, with air generally being pumped through using bellows and a pitch source provided by a reed or similar device, much like that used in a clarinet or oboe.
Speech and Audio Processing: A MATLAB-based Approach, pp. 314-365. Publisher: Cambridge University Press. Print publication year: 2016.