Book contents
- Frontmatter
- Contents
- Preface
- Notation and abbreviations
- Part I General discussion
- 1 Introduction
- 2 Bayesian approach
- 3 Statistical models in speech and language processing
- Part II Approximate inference
- Appendix A Basic formulas
- Appendix B Vector and matrix formulas
- Appendix C Probabilistic distribution functions
- References
- Index
2 - Bayesian approach
from Part I - General discussion
Published online by Cambridge University Press: 05 August 2015
Summary
This chapter describes the general concepts and statistics of the Bayesian approach. The Bayesian approach covers wide areas of statistics (Bernardo & Smith 2009, Gelman, Carlin, Stern et al. 2013), pattern recognition (Fukunaga 1990), machine learning (Bishop 2006, Barber 2012), and applications of these approaches. In this chapter, we start the discussion from basic probability theory, and mainly describe the Bayesian approach following the machine learning style of constructing and refining statistical models from data. The role of the Bayesian approach in machine learning is very important, since it provides a systematic way to infer unobserved variables (e.g., classification category, model parameters, latent variables, model structure) given data. This chapter limits the discussion with the speech and language problems of the later chapters in mind, providing simple probabilistic rules and prior and posterior distributions in Section 2.1. That section also provides analytical solutions for the posterior distributions of simple models. Based on this introduction, Section 2.2 introduces a useful representation of the relationships between probabilistic variables in the Bayesian approach, called the graphical model. The graphical model representation gives us an intuitive view of statistical models even when they have complicated relationships between their variables. Section 2.3 explains the difference between the Bayesian and maximum likelihood (ML) approaches. The following chapters extend the general Bayesian approach described in this chapter to deal with statistical models in speech and language processing.
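As a quick orientation for the prior and posterior distributions treated in Section 2.1, the standard form of Bayes' theorem can be written, for model parameters Θ and observed data D (symbols chosen here only for illustration, not taken from the chapter), as
\[
p(\Theta \mid \mathcal{D}) = \frac{p(\mathcal{D} \mid \Theta)\, p(\Theta)}{p(\mathcal{D})},
\]
where p(Θ) is the prior distribution, p(D|Θ) is the likelihood, p(D) is the evidence, and p(Θ|D) is the posterior distribution.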
Bayesian probabilities
This section describes the basic Bayesian framework based on probability theory. Although some of the definitions, equations, and concepts are trivial, this section reviews the basics to help readers fully understand the Bayesian approach.
In the Bayesian approach, all the variables that are introduced when models are parameterized, such as model parameters and latent variables, are regarded as probabilistic variables. Thus, if a is a discrete variable, the Bayesian approach deals with a as a probabilistic variable and aims to obtain its distribution p(a).
Hereinafter, we assume that a is a discrete variable, and that expectations are computed by summation over a, for simplicity.
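For concreteness, the standard relations for a discrete probabilistic variable a implied by this convention (general probability formulas, not quoted from the chapter) are
\[
p(a) \ge 0, \qquad \sum_{a} p(a) = 1, \qquad \mathbb{E}[f(a)] = \sum_{a} f(a)\, p(a),
\]
together with the product and sum rules that underlie Bayes' theorem,
\[
p(a, b) = p(a \mid b)\, p(b), \qquad p(a) = \sum_{b} p(a, b).
\]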
Type: Chapter
Information: Bayesian Speech and Language Processing, pp. 13–52
Publisher: Cambridge University Press
Print publication year: 2015