Getting Started¶
System Requirements¶
mc3
is compatible with Python 3.6+, and has been tested to work on Unix/Linux and
OS X machines, with the following software:
- Numpy >= 1.19.5
- Scipy >= 1.9.3
- Matplotlib >= 3.3.4
mc3
may work with previous versions of these software;
however, we do not guarantee nor provide support for that.
Install¶
To install mc3
run the following command from the terminal:
pip install mc3
Or if you prefer conda:
conda install -c conda-forge mc3
Alternatively (e.g., for developers), clone the repository to your local machine with the following terminal commands:
git clone https://github.com/pcubillos/mc3
cd mc3
pip install -e .
mc3
provides MCMC and nested-sampling posterior sampling,
optimization and other lower-level statistical and plotting
routines. See the full docs in the api or through the Python
interpreter:
import mc3
# Bayesian posterior sampling:
help(mc3.sample)
# Optimization:
help(mc3.fit)
# Assorted stats:
help(mc3.stats)
# Plotting utilities:
help(mc3.plots)
Example 1: Interactive Run¶
The following example shows a basic MCMC run from the Python interpreter, for a quadratic-polynomial fit to a noisy dataset:
import numpy as np
import mc3
def quad(p, x):
"""
Quadratic polynomial function.
Parameters
p: Polynomial constant, linear, and quadratic coefficients.
x: Array of dependent variables where to evaluate the polynomial.
Returns
y: Polinomial evaluated at x: y = p0 + p1*x + p2*x^2
"""
y = p[0] + p[1]*x + p[2]*x**2.0
return y
# For the sake of example, create a noisy synthetic dataset, in a real
# scenario you would get your dataset from your data analysis pipeline:
np.random.seed(3)
x = np.linspace(0, 10, 100)
p_true = [3.0, -2.4, 0.5]
y = quad(p_true, x)
uncert = np.sqrt(np.abs(y))
data = y + np.random.normal(0, uncert)
# Initial guess for fitting parameters:
params = np.array([10.0, -2.0, 0.1])
pstep = np.array([0.03, 0.03, 0.05])
# Run the MCMC:
func = quad
output = mc3.sample(
data, uncert, func, params, indparams=[x],
pstep=pstep, sampler='snooker', nsamples=1e5, burnin=1000, ncpu=7,
)
That’s it. The code returns a dictionary with the MCMC results.
Among these, you can find the posterior sample
(posterior
), the best-fitting values (bestp
),
the lower and upper boundaries of the 68%-credible region (CRlo
and CRhi
, with respect to bestp
), the standard deviation of
the marginal posteriors (stdp
), among other variables.
mc3
will also print out to screen a progress report every 10% of
the MCMC run, showing the time, number of times a parameter tried to
go beyond the boundaries, the current best-fitting values, and
lowest \(\chi^{2}\); for example:
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
Multi-core Markov-chain Monte Carlo (mc3).
Version 3.1.0.
Copyright (c) 2015-2023 Patricio Cubillos and collaborators.
mc3 is open-source software under the MIT license (see LICENSE).
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
Yippee Ki Yay Monte Carlo!
Start MCMC chains (Wed Mar 29 17:52:45 2023)
[: ] 10.0% completed (Wed Mar 29 17:52:46 2023)
Out-of-bound Trials:
[0 0 0]
Best Parameters: (chisq=112.6196)
[ 3.12005211 -2.51498098 0.50946457]
Gelman-Rubin statistics for free parameters:
[1.05031159 1.04920662 1.05254562]
...
[::::::::::] 100.0% completed (Wed Mar 29 17:52:51 2023)
Out-of-bound Trials:
[0 0 0]
Best Parameters: (chisq=112.5923)
[ 3.07675149 -2.50001621 0.50890678]
Gelman-Rubin statistics for free parameters:
[1.00024775 1.00029444 1.00023952]
All parameters converged to within 1% of unity.
MCMC Summary:
-------------
Number of evaluated samples: 100002
Number of parallel chains: 7
Average iterations per chain: 14286
Burned-in iterations per chain: 1000
Thinning factor: 1
MCMC sample size (thinned, burned): 93002
Acceptance rate: 28.36%
Parameter name best fit median 1sigma_low 1sigma_hi S/N
--------------- ----------- ----------------------------------- ---------
Param 1 3.0768e+00 3.0761e+00 -3.7968e-01 3.8946e-01 7.9
Param 2 -2.5000e+00 -2.4981e+00 -2.2876e-01 2.1325e-01 11.2
Param 3 5.0891e-01 5.0868e-01 -2.6467e-02 2.7415e-02 18.7
Best-parameter's chi-squared: 112.5923
Best-parameter's -2*log(posterior): 112.5923
Bayesian Information Criterion: 126.4078
Reduced chi-squared: 1.1607
Standard deviation of residuals: 3.00577
For a detailed summary with all parameter posterior statistics see
mc3_statistics.txt
Output sampler files:
mc3_statistics.txt
At the end of the MCMC run, mc3
displays a summary of the MCMC
sample, best-fitting parameters, credible-region boundaries, posterior
mean and standard deviation, among other statistics.
Additionally, the user has the option to generate several plots of the MCMC
sample: the best-fitting model and data curves, parameter traces, and
marginal and pair-wise posteriors (these plots can also be generated
automatically with the MCMC run by setting plots=True
).
The plots sub-package provides the plotting functions:
# And now, some post processing:
import mc3.plots as mp
import mc3.utils as mu
# Output dict contains the entire sample (posterior), need to remove burn-in:
posterior, zchain, zmask = mu.burn(output)
# Set parameter names:
pnames = ["constant", "linear", "quadratic"]
post = mp.Posterior(posterior, pnames)
# Plot pairwise posteriors:
fig_pairs = post.plot(savefile="quad_pairwise.png")
# Plot marginal posterior histograms:
fig_hist = post.plot_histogram(savefile="quad_hist.png")
# Plot trace plot:
mp.trace(posterior, zchain, pnames=pnames, savefile="quad_trace.png")
# Plot best-fitting model and binned data:
mp.modelfit(data, uncert, x, y, savefile="quad_bestfit.png")
Note
These plots can also be automatically generated along with the MCMC run (see Outputs).
Example 2: Shell Run¶
The following example shows a basic MCMC run from the terminal using a configuration file. First, create a Python file (’quadratic.py’) with the modeling function:
def quad(p, x):
y = p[0] + p[1]*x + p[2]*x**2.0
return y
Then, generate a data set and store into files, e.g., with the following Python script:
import numpy as np
import mc3
from quadratic import quad
# Create synthetic dataset:
x = np.linspace(0, 10, 1000) # Independent model variable
p0 = [3, -2.4, 0.5] # True-underlying model parameters
y = quad(p0, x) # Noiseless model
uncert = np.sqrt(np.abs(y)) # Data points uncertainty
error = np.random.normal(0, uncert) # Noise for the data
data = y + error # Noisy data set
# Store data set and other inputs:
mc3.utils.savebin([data, uncert], 'data.npz')
mc3.utils.savebin([x], 'indp.npz')
Now, create a configuration file with the mc3
setup (’MCMC.cfg’):
[MCMC]
data = data.npz
indparams = indp.npz
func = quad quadratic
params = 10.0 -2.0 0.1
pmin = -25.0 -10.0 -10.0
pmax = 30.0 10.0 10.0
pstep = 0.3 0.3 0.05
nsamples = 1e5
burnin = 1000
ncpu = 7
sampler = snooker
grtest = True
plots = True
savefile = output_demo.npz
Finally, call the mc3
entry point, providing the configuration file as
a command-line argument:
mc3 -c MCMC.cfg