General Usage

Introduction

The tests provided by MCMCTesting are frequentist hypothesis tests for testing the correctness of MCMC kernels. In particular, it compute the p-value for the null hypothesis that the MCMC kernel has the correct stationary distribution against the alternative hypothesis that it doesn't.

Currently, MCMCTesting provide three different tests originally proposed by Gandy and Scott[gandyandscott2021]:

  1. Simple Two-Sample Test
  2. Two-Sample Test with an Additional Gibbs Step
  3. Exact Rank Test

The two-sample tests are generally applicable. On the other hand, the exact rank test assumes that the MCMC kernel is reversible. Therefore, it can specifically be used to test reversibility.

Interface

The user needs to implement the following function specializations for the model and kernel subject to the test.

MCMCTesting.sample_jointFunction
sample_joint(rng, model)

Sample from the joint distribution of the prior and the predictive distribution of model.

Arguments

  • rng::Random.AbstractRNG: Random number generator.
  • model: Model subject to test.

Returns

  • θ: Model parameter sampled from the prior p(θ).
  • y: Data generated from conditionally on θ from p(y|θ)
source
MCMCTesting.markovchain_transitionFunction
markovchain_transition(rng, model, kernel, θ, y)

Perform a single Markov chain transition of kernel on the previous state θ targeting the posterior of model conditioned on y.

Arguments

  • rng::Random.AbstractRNG: Random number generator.
  • model: Model forming the posterior p(θ|y) conditioned on y.
  • θ: Previous state of the Markov chain.
  • y: Data to condition on.

Returns

  • θ′: Next state of the Markov chain.
source

Some tests might be require additional interfaces to be implemented. For an overview of how to implement these interfaces, refer to the tutorial.

The model and kernel are then passed to MCMCTesting through the following struct:

MCMCTesting.TestSubjectType
TestSubject(model, kernel)

Model and MCMC kernel obejct subject to test.

Arguments

  • model: Model subject to test.
  • kernel: MCMC kernel subject to test.
source

Simulating a P-Value with mcmctest

Each of the test internally run simulations and compute a single p-value through the following routine:

MCMCTesting.mcmctestFunction
mcmctest([rng,] test, subject; kwargs...)

Sample a p-value according to test for subject

Arguments

  • rng::Random.AbstractRNG: Random number generator. (Default: Random.default_rng().)
  • test::AbstractMCMCTest: Test strategy.
  • subject::TestSubject: MCMC algorithm and model subject to test.

Keyword Arguments

  • show_progress::Bool: Whether to show the progress bar. (Default: true.)
  • statistics: Function for computing test statistics from samples generated from the tests. (See section below for additional description.)
  • Check the documentation for the respective test strategy for additional keyword arugments.

Custom Test Statistics

The statistics used for the hypothesis tests can be modified by passing a custom funciton to statistics. The default statistics are the first and second moments computed as below.

statistics = params -> vcat(params, params.^2)

The cross-interaction can also be tested by adding an additional entry as below.

statistics = params -> vcat(params, params.^2, reshape(params*params',:))

But naturally, adding more statistics increase the computational cost of computing the tests.

Also, different tests may result in different statistics being computed through the same statistics function. For example, the two-sample test strategies generate both model parameters θ and data y. Therefore, params = vcat(θ, y). On the other hand, the exac rank test only generates model parameters θ. Therefore, params = θ. Naturally, statistics can also be used to select a subset of parameters used for testing. For example, for the two-sample test strategies, if we only want to use θ for the tests, where d = length(θ) > 0, one can do the following:

statistics = params -> θ[1:d]
source

Increasing Power with seqmcmctest

seqmcmctest (Algorithm 3[gandyandscott2021]) sequentially calls mcmctest to increase the power and ensure a low false rejection rate. Furthermore, the p-values from each component of the statistics are combined through multiple hypothesis adjustment.

MCMCTesting.seqmcmctestFunction
seqmcmctest([rng,] test, subject, false_rejection_rate, samplesize; kwargs...)

Sequential run multiple hypothesis tests to guarantee false_rejection_rate.

Arguments

  • rng::Random.AbstractRNG: Random number generator.
  • test::AbstractMCMCTest: Test strategy.
  • subject::TestSubject: MCMC algorithm and model subject to test.
  • false_rejection_rate::Real: Desired false rejection rate.
  • samplesize::Int: The number of p-values used at each test iteration.

Keyword Arguments

  • samplesize_increase: Factor of increase for the samplsize after the first test iteration turns out inconclusive. (Default: 2.0)
  • show_progress::Bool: Whether to show progress. (Default: true)
  • pvalue_adjustmeht::MultipleTesting.PValueAdjustment: P-value adjustment for multiple testing over the elements of the statistic. (Default: MultipleTesting.Bonferroni())

Additional keyword arguments are passed to internal calls to mcmctest.

Returns

  • test_result::Bool: true if the null-hypothesis (the MCMC algorithm has the correct stationary distribution) wasn't rejected, false otherwise.
source

References

  • gandyandscott2021Gandy, A., & Scott, J. (2020). Unit testing for MCMC and other Monte Carlo methods. arXiv preprint arXiv:2001.06465.