Bayesian Inference in Python

Author: Jessica Lu Published: February 24, 2012 3 comments

Categories: analysis, python

Tags: bayesian, python, statistics

CosmoMC Bayesian Inference Package - sampling posterior probability distributions of cosmological parameters.

Observational astronomers don’t simply present images or spectra, we analyze the data and use it to support or contradict physical models. A key aspect of data analysis is understanding the certainty of claims that are made. Thus using statistics is a fundamental part of observational astronomy.

Statistical inference is one method of drawing conclusions, and establishing their certainty, given a set of observational data that is subject to random variation. There are two broad categories of statistics in wide-spread use in astronomy: Frequentist and Bayesian statistics. The frequentist sees probability as the long-run expected frequency of occurrence while the Bayesian view of probability is related to degree of belief; it is a measure of the plausibility of an event given incomplete knowledge. I won’t go into the pros and cons of each approach as there is plenty of fodder to be found on the internet. For beginners, check out this Astrobites post introducing Bayesian methods, or if want a to learn a little of both, try Practical Statistics for Astronomers by Wall & Jenkins. I will say that for complex data sets and complex models, bayesian inference is gaining traction in astrophysics (Loredo 1992, Trotta 2008).

Bayesian inference is quite simple in concept, but can seem formidable to put into practice the first time you try it (especially if the first time is a new and complicated problem). Here are two interesting packages for performing bayesian inference in python that eased my transition into bayesian inference:

pymc – Uses markov chain monte carlo techniques. There is also a useful set of examples using and extending pymc on the Healthy Algorithms blog. A useful introduction was presented at the SciPy 2011 conference and was recorded here.
multinest – Uses nested sampling techniques, which are superior for parameter spaces with strong and non-linear correlations as is usually the case for astronomical applications. The package is actually written in Fortran and has a C-wrapper that is callable from python. Here are two possible python wrappers used by lisasolve and a general purpose pymultinest. I have successfully used the second one, but if you are interested in running in parallel (with MPI), the first might be a better alternative. The hardest part about this package was getting it installed (fortran and C compilers have to play nice together).

In my own personal experience, I used pymc and, once I got passed the learning curve of understanding how they handle parameter and model descriptions, I found it to be elegant and easily extensible to other MCMC sampling techniques. However, I eventually moved to multinest due to the strongly degenerate and curved nature of the parameter space I needed to sample. Multinest was far superior in handling this situation (ran 5-10 times faster and provided a more accurate result on simulated data sets), even over hit-and-run MCMC algorithms supposedly designed for such problems.

Here are a few other packages I found with a simple google search, but I don’t have experience with these:

BIP – Bayes Melding Module
OpenBayes – Bayesian Nets
Inference Package – developed by astronomers
emcee – developed by astronomers (see astrobites summary)

What packages do you use or do you role your own? Any other thoughts on the pros and cons of the various packages?

3 comments… add one

Brian Weaver Feb 27, 2012 @ 9:16

Another piece of software that performs Bayesian Estimation via MCMC is WinBUGS and OpenBUGs (BUGS = Bayesian using Gibbs sampler). Both of these programs are easy to use and there are plenty of examples to help you along the way. OpenBUGS is essentially the newer form of WinBUGS (they have stopped development of WinBUGS and are now concentrating on OpenBUGS). I believe the biggest difference between the two (and I may be wrong) is WinBUGS is heavily dependent on the Gibbs sample while OpenBUGS is based on the Metropolis-Hastings algorithm. Fortunately, both are free!

If anyone is interested, I can provide some references to good books on Bayesian Statistics.

WinBUGS: http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/contents.shtml
OpenBUGS: http://www.openbugs.info/w/

Reply Link
- Jan Galkowski Oct 10, 2014 @ 16:44
  
  There is also Martyn Plummer’s JAGS which can run on its own, and has an R interface, with a Python interface in development: http://doingbayesiandataanalysis.blogspot.com/2014/09/doing-bayesian-data-analysis-in-python.html
  
Thomas Fitzgerald Jan 24, 2013 @ 13:32

Bayes++ (http://bayesclasses.sourceforge.net/Bayes++.html) is a strong Bayesian framework. Commonly used in Robot Navigation.

Reply Link