
Bibliography

Adam Prügel-Bennett, University of Southampton

Book: The Probability Companion for Engineering and Computer Science
Publisher: Cambridge University Press
Print publication year: 2020
Published online by Cambridge University Press: 03 January 2020
Chapter DOI: https://doi.org/10.1017/9781108635349.016


References

Abramowitz, M. and Stegun, I. A. Handbook of Mathematical Functions, volume 55 of Applied Mathematics Series. Dover, 9th edition, 1964. This is the standard collection of facts about mathematical functions. It includes a useful chapter on probabilities.
Achlioptas, D., Naor, A., and Peres, Y. Rigorous location of phase transitions in hard optimization problems. Nature, 435(9):759–764, 2005. Nice example of the use of the second-moment method applied to the maximum satisfiability problem.
Arjovsky, M., Chintala, S., and Bottou, L. Wasserstein GAN. arXiv:1701.07875 [cs, stat], January 2017. http://arxiv.org/abs/1701.07875. Paper that introduced Wasserstein generative adversarial networks (GANs).
Barber, D. Bayesian Reasoning and Machine Learning. Cambridge University Press, 2011. An accessible machine learning book.
Bernstein, S. On a modification of Chebyshev's inequality and of the error formula of Laplace. Annales scientifiques des Institutions mathématiques savantes de l'Ukraine, 1(4):38–49, 1924. Paper deriving Bernstein's inequality. A very early tail-bound result.
Bishop, C. Pattern Recognition and Machine Learning. Springer, 2nd edition, 2007. A very readable description of modern developments in pattern recognition and machine learning.
Blei, D. M., Ng, A. Y., and Jordan, M. I. Latent Dirichlet allocation. Journal of Machine Learning Research, 3:993–1022, 2003. Clear article introducing the now famous latent Dirichlet allocation (LDA) method.
Boucheron, S., Lugosi, G., and Massart, P. Concentration Inequalities. Oxford University Press, 2013. Comprehensive, state-of-the-art guide to obtaining tail inequalities. Gives good coverage of the basics, but rapidly becomes quite advanced.
Boyd, S. and Vandenberghe, L. Convex Optimization. Cambridge University Press, 2004. Highly respected text on convexity, although not a quick read.
Chernoff, H. A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. Annals of Mathematical Statistics, 23:493–507, 1952. Paper deriving the famous Chernoff tail bound.
Cover, T. M. and Thomas, J. A. Elements of Information Theory. Wiley, 2nd edition, 2006. A standard reference on all elements of information theory.
Cressie, N. and Wikle, C. K. Statistics for Spatio-Temporal Data. Wiley, 2011. Comprehensive although dense review of Bayesian methods for dealing with time series and spatial data.
Crisan, D. and Rozovskii, B., editors. The Oxford Handbook of Nonlinear Filtering. Oxford University Press, 2011. Collection of many articles and tutorials on using particle filters and other non-linear filtering techniques.
Dempster, A. P., Laird, N. M., and Rubin, D. B. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39(1):1–38, 1977. Paper proposing the now classic expectation-maximization (EM) algorithm.
Devroye, L. Non-Uniform Random Variate Generation. Springer, 1986. This provides a very extensive discussion of techniques for generating random deviates for different probability distributions. Alas, this has been out of print for many years, but it is available online at www.nrbook.com/devroye/.
Dobrushkin, V. A. Methods in Algorithmic Analysis. Chapman & Hall/CRC, 2010. Nicely written and comprehensive book on combinatorics and generating functions aimed at studying algorithms.
Duane, S., Kennedy, A. D., Pendleton, B. J., and Roweth, D. Hybrid Monte Carlo. Physics Letters B, 195:216–222, 1987. The first paper to introduce hybrid Monte Carlo.
Dubhashi, D. P. and Panconesi, A. Concentration of Measure for the Analysis of Randomized Algorithms. Cambridge University Press, 2009. Fairly comprehensive book covering concentration theorems and their applications to problems in the analysis of algorithms.
Duda, R. O., Hart, P. E., and Stork, D. G. Pattern Classification. Wiley, 2001. This is an update of Duda and Hart's original book, written in 1973, which served for many years as the standard reference on anything to do with pattern classification.
Durbin, R., Eddy, S., Krogh, A., and Mitchison, G. Biological Sequence Analysis. Cambridge University Press, 1st edition, 1998. This is an extremely readable introduction to hidden Markov models and their extension in the context of biological sequence analysis.
Einstein, A. Investigations on the Theory of the Brownian Movement. Dover, 1998. Five key papers by Einstein on Brownian motion in a collection edited by R. Fürth. These papers were written around 1905.
Feller, W. An Introduction to Probability Theory and Its Applications, volume 1. Wiley, 3rd edition, 1968a. This is the classic on probability theory.
Feller, W. An Introduction to Probability Theory and Its Applications, volume 2. Wiley, 2nd edition, 1968b.
Field, A., Miles, J., and Field, Z. Discovering Statistics Using R. SAGE, 2012. Very informal but comprehensive guide to using statistics in empirical research. Covers topics such as correlation, regression, analysis of variance (ANOVA), experimental design, and factor analysis. There is a companion book that covers the statistical package SPSS. The book assumes less mathematical sophistication than I've assumed and in consequence it is long.
Fischer, K. H. and Hertz, J. A. Spin Glasses. Cambridge University Press, 1991. Slightly more tutorial-level introduction to spin glasses and disordered systems.
Fishman, G. S. Monte Carlo: Concepts, Algorithms and Applications. Springer Series in Operations Research. Springer, 1996. A very comprehensive guide to practical ways of generating fast random deviates. Possibly showing its age, but nice to dip into.
Frieden, B. R. Science from Fisher Information. Cambridge University Press, 1998. An attempt to derive almost all the laws of physics from Fisher information.
Gardiner, C. Stochastic Methods. Springer, 4th edition, 2009. Established classic on stochastic processes, covering a lot of details. It is aimed at the practitioner.
Gelman, A., Carlin, J. B., Stern, H. S., and Rubin, D. B. Bayesian Data Analysis. Texts in Statistical Science. Chapman & Hall, 2003. One of the classics of Bayesian analysis.
Geman, S. and Geman, D. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6:721–741, 1984. A classic paper that introduced many of the ideas in using Markov Chain Monte Carlo (MCMC). It also provided the first proof of the convergence of simulated annealing.
Goodfellow, I. J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y. Generative adversarial networks. arXiv:1406.2661 [cs, stat], June 2014. http://arxiv.org/abs/1406.2661. The paper that introduced GANs. In the fast-moving world of deep learning most papers are put onto the open web service arXiv.
Graham, R., Knuth, D. E., and Patashnik, O. Concrete Mathematics. Addison-Wesley, 1989. This is Donald Knuth's toolbox and is a tour de force of solving problems in discrete mathematics. A joy to read, although prepare to feel intimidated.
Grimmett, G. and Stirzaker, D. Probability and Random Processes. Oxford University Press, 3rd edition, 2001a. This is a modern classic, which is packed with a lot of material. It ignores the measure-theoretic foundations. Covers many areas in more detail than this book, although it takes a more mathematical viewpoint, missing some applications altogether.
Grimmett, G. and Stirzaker, D. One Thousand Exercises in Probability. Oxford University Press, 2nd edition, 2001b. Covers numerous problems. Acts as a companion text to Probability and Random Processes by the same authors.
Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., and Courville, A. Improved training of Wasserstein GANs. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS'17, pages 5769–5779, USA, 2017. Curran Associates Inc. Also available as arXiv:1704.00028, http://arxiv.org/abs/1704.00028. Paper explaining how to train Wasserstein GANs properly.
Hoeffding, W. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58(301):13–30, 1963. Classic paper deriving the eponymous inequality, but also obtains some bounds for non-independent random variables.
Jaynes, E. T. The well-posed problem. Foundations of Physics, 3(4):477–493, 1973. Beautifully argued article showing how to use invariance principles to obtain an uninformative prior. This article is reproduced in the collection by Rosenkrantz (1982).
Jaynes, E. T. Probability Theory: The Logic of Science. Cambridge University Press, 1st edition, 2003. A posthumously published diatribe on Bayesian methods. Ed Jaynes spent most of his career fighting for Bayesian methods to be accepted. He won this battle, but it didn't stop him fighting. This contains many wonderful insights but is not the most balanced account.
Kelly, J. L. A new interpretation of information rate. Bell System Technical Journal, 35:917–926, 1956. How to gamble and win, if only the odds are on your side.
Kingma, D. P. and Welling, M. Auto-encoding variational Bayes. arXiv:1312.6114 [cs, stat], 2013. http://arxiv.org/abs/1312.6114. This is the classic paper that introduced variational auto-encoders (VAEs).
Knuth, D. E. The Art of Computer Programming: Fundamental Algorithms, volume 1. Addison-Wesley/Longman, 3rd edition, 1997a. Knuth's classic text on algorithms.
Knuth, D. E. The Art of Computer Programming: Seminumerical Algorithms, volume 2. Addison-Wesley/Longman, 3rd edition, 1997b.
Ziemba, W. T., MacLean, L. C., and Thorp, E. O., editors. The Kelly Capital Growth Investment Criterion: Theory and Practice. World Scientific Publishing, 2012. Collection of papers discussing the use of the Kelly criterion in investments, including some key papers.
Lee, P. M. Bayesian Statistics: An Introduction. Hodder Arnold, 3rd edition, 2003. One of a host of books on Bayesian statistics. This one is highly recommended by many readers.
Li, M. and Vitányi, P. An Introduction to Kolmogorov Complexity and Its Applications. Springer, 2nd edition, 1997. A standard text for Kolmogorov complexity. Covers a lot of material.
MacKay, D. J. C. Information Theory, Inference and Learning Algorithms. Cambridge University Press, 1st edition, 2003. An eclectic collection of ideas centred around information theory and machine learning. Very readable and full of many original insights.
Mézard, M., Parisi, G., and Virasoro, M. A. Spin-Glass Theory and Beyond, volume 9 of Lecture Notes in Physics. World Scientific, 1987. Collection of papers with introductory chapters authored by researchers who drove the current research.
Minka, T. P. Expectation propagation for approximate Bayesian inference. In Uncertainty in Artificial Intelligence 17, pages 362–369, 2001. The paper that introduced expectation propagation as an alternative variational approximation scheme.
Moler, C. and Van Loan, C. Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later. SIAM Review, 45(1):3–49, 2003. Very readable description of the numerical challenges posed by one important matrix operation.
Mosteller, F. Fifty Challenging Problems in Probability: With Solutions. Dover, 1988. What it says on the cover. An entertaining set of puzzles, but tends to stress probability as combinatorics.
Neal, R. M. Probabilistic inference using Markov Chain Monte Carlo methods. Technical Report CRG-TR-93-1, Department of Computer Science, University of Toronto, 1993. A description of MCMC by one of its greatest exponents. Very rich, detailed source.
Newman, M. E. J. Networks: An Introduction. Oxford University Press, 2010. Good comprehensive overview of the current work on networks.
Opper, M. and Saad, D. Advanced Mean Field Methods. Neural Information Processing Series. MIT Press, 2001. As the book title suggests, a series of articles covering advanced mean field methods and variational approximation methods.
Pearl, J. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, 1988. The book helped start the Bayesian revolution and introduced graphical models. Still a good read.
Press, W. H., Flannery, B. P., Teukolsky, S. A., and Vetterling, W. T. Numerical Recipes: The Art of Scientific Computing. Cambridge University Press, 3rd edition, 2007. The Numerical Recipes series provides both a huge number of practical algorithms and enjoyable, very informative descriptions of those algorithms. No one who uses computers for science should be without a copy.
Propp, J. G. and Wilson, D. B. Exact sampling with coupled Markov chains and applications to statistical mechanics. Random Structures and Algorithms, 9:223–252, 1996. The original paper introducing exact sampling in MCMC.
Rabiner, L. R. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77:257–286, 1989. Classic tutorial introduction to hidden Markov models.
Rasmussen, C. E. and Williams, C. K. I. Gaussian Processes for Machine Learning. MIT Press, 2006. Covers the subject mentioned in the title, by some of the initiators.
Raymond, J., Manoel, A., and Opper, M. Expectation propagation. arXiv:1409.6179, 2014. These are notes from a lecture by Manfred Opper who, together with Ole Winther, had independently, and around the same time as Thomas Minka, come up with an approximation scheme similar to expectation propagation. This scheme was based on old ideas from developing a mean field theory for spin-glass systems.
Rissanen, J. A universal prior for integers and estimation by minimum description length. Annals of Statistics, 11(2):416–431, 1983. One of the first papers on minimum description length by the idea's founding father.
Rosenkrantz, R. D., editor. E. T. Jaynes: Papers on Probability, Statistics and Statistical Physics, volume 158 of Synthese Library. D. Reidel, 1982. A collection of articles by Ed Jaynes. All beautifully argued. Gives a feel for the fight to make Bayesian inference respectable.
Stauffer, D. and Aharony, A. Introduction to Percolation Theory. CRC Press, 2nd edition, 1997. A nice introduction to percolation and its many applications.
Steele, J. M. The Cauchy–Schwarz Master Class: An Introduction to the Art of Mathematical Inequalities. Mathematical Association of America, 2004. Nice introduction to inequalities, although it treats inequalities in general rather than just those related to probabilities.
Swendsen, R. H. and Wang, J.-S. Nonuniversal critical dynamics in Monte Carlo simulations. Physical Review Letters, 58:86–88, 1987. First paper describing how to speed up MCMC for spin systems using cluster techniques.
van Kampen, N. G. Stochastic Processes in Physics and Chemistry. North Holland, [1981] 2001. Very clearly written book concentrating on the use of stochastic processes in the physical sciences by one of the leading proponents. This was influential amongst scientists and is still a good read.
van Rijsbergen, C. J. The Geometry of Information Retrieval. Cambridge University Press, 2004. Short, thought-provoking book advocating the use of probabilistic quantum mechanics as an appropriate model for information retrieval.
Watts, D. J. Six Degrees: The Science of a Connected Age. Vintage, 2003. Popular account of the structure found in networks.
Williams, D. Weighing the Odds. Cambridge University Press, 2001. This book concentrates on developing the reader's intuition of probability. It explores some of the darker recesses of probability where it requires sharp analysis to get consistent and correct results. I have skipped such examples, as my experience has never brought me into such dark corners. However, for the mathematically or philosophically minded reader these examples pose interesting problems to ponder. The book also covers classical statistics (from a mathematical point of view). The author is a strong advocate of using confidence intervals in preference to hypothesis testing. Also (and unusually) the book provides a chapter on probabilistic quantum mechanics. This is a subject that I haven't covered as I consider it to be rather specific to physics, although there have been attempts to use this formalism elsewhere – see, for example, van Rijsbergen (2004).
Wolff, U. Collective Monte Carlo updating for spin systems. Physical Review Letters, 62:361–364, 1989. Paper describing a fast cluster MCMC algorithm for spin systems, developing the work of Swendsen and Wang.
Ziman, J. M. Models of Disorder: The Theoretical Physics of Homogeneously Disordered Systems. Cambridge University Press, 1979. Classic book covering a wide variety of models of disordered systems. Missing more recent developments.
