Variational Bayes

Shinji Watanabe; Jen-Tzung Chien

doi:10.1017/CBO9781107295360.008

Variational Bayes (VB) was developed in the machine learning community in the 1990s (Attias 1999; Jordan, Ghahramani, Jaakkola et al. 1999) and has become a standard technique for approximate Bayesian inference in latent models, based on an EM-like algorithm. Chapter 4 also dealt with latent models, using the maximum a-posteriori (MAP) EM algorithm. However, the MAP approximation uses point estimates of the model parameters instead of estimating their distributions, which is far from the fully Bayesian manner of treating all the variables introduced in our problem as random variables. The asymptotic approximation in Chapter 6 approximates a complex posterior distribution by a single Gaussian distribution without latent variables, an assumption that does not hold for many of our applications. The evidence approximation in Chapter 5 also does not explicitly deal with latent models (although such a treatment can be obtained by combining it with MAP, VB, or MCMC). In contrast to the MAP, evidence, and asymptotic approximations, VB can efficiently approximate complicated integrals and expectations over model parameters, using the variational method within a specific family of distributions (the exponential family, as discussed in Section 2.1.3). The key idea of the variational technique is to find a lower bound of the marginal log likelihood, similar to the EM algorithm in Section 3.4, and to obtain the posterior distributions directly by the variational method. This chapter first explains the general framework of VB in Section 7.1 and then more specific pattern recognition problems in Section 7.2. It goes on to provide a VB version of the EM algorithm for statistical models and model selection in speech and language processing, including speech recognition in Sections 7.3 and 7.4 and speaker verification in Section 7.5. Sections 7.6 and 7.7 deal with latent topic models and their extensions, which try to capture long-range topic information from (spoken) documents, based on VB solutions.
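
The lower bound mentioned above can be written out as a short sketch. Here X denotes the observed data and Z all unobserved variables; the symbols q(Z) for an arbitrary distribution over Z and \mathcal{F}[q] for the bound are notational assumptions made for this illustration and may differ from the notation used in the chapter itself. Where Z contains discrete variables, the integral is understood as a sum. The inequality is Jensen's inequality applied to the concave logarithm.

```latex
\begin{align}
\log p(X) &= \log \int p(X, Z)\, dZ
           = \log \int q(Z)\, \frac{p(X, Z)}{q(Z)}\, dZ \\
          &\geq \int q(Z) \log \frac{p(X, Z)}{q(Z)}\, dZ
           \triangleq \mathcal{F}[q], \\
\log p(X) - \mathcal{F}[q] &= \mathrm{KL}\bigl(q(Z)\,\|\,p(Z \mid X)\bigr) \geq 0 .
\end{align}
```

Because the gap is a Kullback-Leibler divergence, maximizing \mathcal{F}[q] over a restricted family of q is equivalent to finding the member of that family closest to the true posterior p(Z | X) in this sense.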

Variational inference in general

This section starts by describing a general latent model with observation data X = {x_n | n = 1, …, N} and Z, the set of all variables introduced in our model, including latent variables, parameters, hyperparameters, and model structure. Later sections specify Z with more specific variables.
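
As a minimal numerical sketch of this setup, assume Z is a single discrete variable with three states and that the joint values p(X, Z = z) for one fixed X are known. The numbers, the function names `log_evidence` and `lower_bound`, and the use of NumPy are all hypothetical choices for illustration, not taken from the chapter. The code compares the marginal log likelihood log p(X) with the variational lower bound, the expectation under q of log p(X, Z) - log q(Z), for two choices of q.

```python
import numpy as np


def log_evidence(joint):
    """Marginal log likelihood log p(X) = log sum_z p(X, Z=z) for discrete Z."""
    return float(np.log(joint.sum()))


def lower_bound(joint, q):
    """Variational lower bound sum_z q(z) * (log p(X, z) - log q(z)).

    States with q(z) = 0 contribute nothing and are skipped to avoid log(0).
    """
    mask = q > 0
    return float(np.sum(q[mask] * (np.log(joint[mask]) - np.log(q[mask]))))


# Hypothetical joint values p(X, Z=z) for one fixed X and three latent states.
joint = np.array([0.10, 0.05, 0.02])

# Two candidate distributions over Z: a uniform guess and the exact posterior.
uniform = np.full(3, 1.0 / 3.0)
posterior = joint / joint.sum()

print("log p(X)          :", log_evidence(joint))
print("bound, uniform q  :", lower_bound(joint, uniform))
print("bound, posterior q:", lower_bound(joint, posterior))
```

The bound for the uniform q stays below log p(X), and it meets log p(X) (up to floating-point error) when q equals the exact posterior. In realistic models the exact posterior is intractable, which is why q is restricted to a tractable family and the bound is maximized within it.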

7 - Variational Bayes
