Bayesian Analysis

Quantum Speedups for Multiproposal MCMC
Lin CY, Chen KC, Lemey P, Suchard MA, Holbrook AJ and Hsieh MH
Multiproposal Markov chain Monte Carlo (MCMC) algorithms choose from multiple proposals to generate their next chain step in order to sample from challenging target distributions more efficiently. However, on classical machines, these algorithms require O(P) target evaluations for each Markov chain step when choosing from P proposals. Recent work demonstrates the possibility of quadratic quantum speedups for one such multiproposal MCMC algorithm. After generating P proposals, this quantum parallel MCMC (QPMCMC) algorithm requires only O(√P) target evaluations at each step, outperforming its classical counterpart. However, generating the P proposals using classical computers still requires O(P) time complexity, resulting in the overall complexity of QPMCMC remaining O(P). Here, we present a new, faster quantum multiproposal MCMC strategy, QPMCMC2. With a specially designed Tjelmeland distribution that generates proposals close to the input state, QPMCMC2 requires only O(1) target evaluations and O(log P) qubits when computing over a large number of proposals P. Unlike its slower predecessor, the QPMCMC2 Markov kernel (1) maintains detailed balance exactly and (2) is fully explicit for a large class of graphical models. We demonstrate this flexibility by applying QPMCMC2 to novel Ising-type models built on bacterial evolutionary networks and obtain significant speedups for Bayesian ancestral trait reconstruction for 248 observed Salmonella bacteria.
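To make the classical O(P) bottleneck concrete, the following is a minimal Python sketch of one anchor-based multiproposal step in the style of Tjelmeland: draw an anchor z from the current state, draw P proposals around z, then select among the current state and the proposals with Barker-style weights. This is one standard classical variant, not the paper's implementation, and all names and tuning values (multiproposal_step, step, n_props) are illustrative. The line computing logw is the O(P) sweep of target evaluations that the quantum algorithms attack.

```python
import numpy as np

def multiproposal_step(x, log_target, n_props=64, step=0.5, rng=None):
    """One Tjelmeland-style multiproposal MCMC step (classical, O(P) cost).

    With a symmetric Gaussian kernel, the anchor construction makes the
    current state and the P proposals exchangeable, so selecting the next
    state with probability proportional to the target preserves detailed
    balance.
    """
    rng = np.random.default_rng() if rng is None else rng
    z = x + step * rng.standard_normal(x.shape)            # auxiliary anchor
    props = z + step * rng.standard_normal((n_props, *x.shape))
    pool = np.vstack([x[None, :], props])                  # current + P proposals
    logw = np.array([log_target(p) for p in pool])         # O(P) target evaluations
    w = np.exp(logw - logw.max())
    j = rng.choice(len(pool), p=w / w.sum())               # Barker-style selection
    return pool[j]

# Example: sample a 2-d standard normal target.
log_target = lambda t: -0.5 * np.sum(t**2)
x, draws = np.zeros(2), []
for _ in range(2000):
    x = multiproposal_step(x, log_target)
    draws.append(x)
print(np.mean(draws, axis=0), np.var(np.array(draws), axis=0))
```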
Simulation-Based Calibration Checking for Bayesian Computation: The Choice of Test Quantities Shapes Sensitivity
Modrák M, Moon AH, Kim S, Bürkner P, Huurre N, Faltejsková K, Gelman A and Vehtari A
Simulation-based calibration checking (SBC) is a practical method to validate computationally-derived posterior distributions or their approximations. In this paper, we introduce a new variant of SBC to alleviate several known problems. Our variant allows the user to in principle detect any possible issue with the posterior, while previously reported implementations could never detect large classes of problems including when the posterior is equal to the prior. This is made possible by including additional data-dependent test quantities when running SBC. We argue and demonstrate that the joint likelihood of the data is an especially useful test quantity. Some other types of test quantities and their theoretical and practical benefits are also investigated. We provide theoretical analysis of SBC, thereby providing a more complete understanding of the underlying statistical mechanisms. We also bring attention to a relatively common mistake in the literature and clarify the difference between SBC and checks based on the data-averaged posterior. We support our recommendations with numerical case studies on a multivariate normal example and a case study in implementing an ordered simplex data type for use with Hamiltonian Monte Carlo. The SBC variant introduced in this paper is implemented in the SBC R package.
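A minimal sketch of the SBC loop with a data-dependent test quantity, here the joint log-likelihood, on a conjugate normal model whose exact posterior stands in for the inference algorithm being validated; all settings (n_sims, n_draws, the chi-squared uniformity check) are illustrative and not taken from the SBC R package.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_sims, n_obs, n_draws = 500, 10, 99

def test_quantity(theta, y):
    # data-dependent test quantity: the joint log-likelihood of y given theta
    return stats.norm.logpdf(y, loc=theta, scale=1.0).sum()

ranks = []
for _ in range(n_sims):
    theta0 = rng.normal(0.0, 1.0)                 # draw parameter from the prior
    y = rng.normal(theta0, 1.0, size=n_obs)       # simulate data given theta0
    # exact conjugate posterior stands in for the algorithm under validation
    sd_n = 1.0 / np.sqrt(1.0 + n_obs)
    mu_n = sd_n**2 * y.sum()
    theta_post = rng.normal(mu_n, sd_n, size=n_draws)
    q0 = test_quantity(theta0, y)
    ranks.append(sum(test_quantity(t, y) < q0 for t in theta_post))

# If the posterior is correct, the ranks are uniform on {0, ..., n_draws}.
counts, _ = np.histogram(ranks, bins=10, range=(-0.5, n_draws + 0.5))
print("bin counts:", counts, " uniformity p-value:", stats.chisquare(counts).pvalue)
```

Using the joint log-likelihood as the test quantity couples the parameter draw to the simulated data, which is what lets this check detect failure modes, such as a posterior that simply returns the prior, that parameter-only test quantities miss.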
Nonparametric Bayes Differential Analysis of Multigroup DNA Methylation Data
Gu C, Baladandayuthapani V and Guha S
DNA methylation datasets in cancer studies are comprised of measurements on a large number of genomic locations called cytosine-phosphate-guanine (CpG) sites with complex correlation structures. A fundamental goal of these studies is the development of statistical techniques that can identify disease genomic signatures across multiple patient groups defined by different experimental or biological conditions. We propose BayesDiff, a nonparametric Bayesian approach for differential analysis relying on a novel class of first order mixture models called the Sticky Pitman-Yor process or two-restaurant two-cuisine franchise (2R2CF). The BayesDiff methodology flexibly utilizes information from all CpG sites or biomarker probes, adaptively accommodates any serial dependence due to the widely varying inter-probe distances, and makes posterior inferences about the differential genomic signature of patient groups. Using simulation studies, we demonstrate the effectiveness of the BayesDiff procedure relative to existing statistical techniques for differential DNA methylation. The methodology is applied to analyze a gastrointestinal (GI) cancer dataset exhibiting serial correlation and complex interaction patterns. The results support and complement known aspects of DNA methylation and gene association in upper GI cancers.
Gridding and Parameter Expansion for Scalable Latent Gaussian Models of Spatial Multivariate Data
Peruzzi M, Banerjee S, Dunson DB and Finley AO
Scalable spatial GPs for massive datasets can be built via sparse Directed Acyclic Graphs (DAGs) where a small number of directed edges is sufficient to flexibly characterize spatial dependence. The DAG can be used to devise fast algorithms for posterior sampling of the latent process, but these may exhibit pathological behavior in estimating covariance parameters. In this article, we introduce gridding and parameter expansion methods to improve the practical performance of MCMC algorithms in terms of effective sample size per unit time (ESS/s). Gridding is a model-based strategy that reduces the number of expensive operations necessary during MCMC on irregularly spaced data. Parameter expansion reduces dependence in posterior samples in spatial regression for high resolution data. These two strategies lead to computational gains in the big data settings on which we focus. We consider popular constructions of univariate spatial processes based on Matérn covariance functions and multivariate coregionalization models for Gaussian outcomes in extensive analyses of synthetic datasets comparing with alternative methods. We demonstrate the effectiveness of our proposed methods in a forestry application using remotely sensed data from NASA's Goddard LiDAR, Hyper-Spectral, and Thermal imager (G-LiHT).
Exploiting Multivariate Network Meta-Analysis: A Calibrated Bayesian Composite Likelihood Inference
Wang Y, Lin L and Liu YL
Multivariate network meta-analysis has emerged as a powerful tool for evidence synthesis by incorporating multiple outcomes and treatments. Despite its advantages, this method comes with methodological challenges, such as the issue of unreported within-study correlations among treatments and outcomes, which can lead to biased estimates and misleading conclusions. In this paper, we propose a calibrated Bayesian composite likelihood approach to overcome this limitation. The proposed method eliminates the need for a fully specified likelihood function while allowing for the unavailability of within-study correlations among treatments and outcomes. Additionally, we developed a hybrid Gibbs sampler algorithm along with the Open-Faced Sandwich post-sampling adjustment to enable robust posterior inference. Through comprehensive simulation studies, we demonstrated that the proposed approach yields unbiased estimates while maintaining coverage probabilities close to the nominal levels. We applied the proposed method to two real-world network meta-analysis datasets: one comparing treatment procedures for root coverage and the other comparing treatments for anemia in patients with chronic kidney disease.
Causally Sound Priors for Binary Experiments
Irons NJ and Cinelli C
We introduce the BREASE framework for the Bayesian analysis of randomized controlled trials with binary treatment and outcome. Approaching the problem from a causal inference perspective, we propose parameterizing the likelihood in terms of the baseline risk, efficacy, and adverse side effects of the treatment, along with a flexible, yet intuitive and tractable jointly independent beta prior distribution on these parameters, which we show to be a generalization of the Dirichlet prior for the joint distribution of potential outcomes. Our approach has a number of desirable characteristics when compared to current mainstream alternatives: (i) it naturally induces prior dependence between expected outcomes in the treatment and control groups; (ii) as the baseline risk, efficacy and risk of adverse side effects are quantities commonly present in the clinicians' vocabulary, the hyperparameters of the prior are directly interpretable, thus facilitating the elicitation of prior knowledge and sensitivity analysis; and (iii) we provide analytical formulae for the marginal likelihood, Bayes factor, and other posterior quantities, as well as an exact posterior sampling algorithm and an accurate and fast data-augmented Gibbs sampler in cases where traditional MCMC fails. Empirical examples demonstrate the utility of our methods for estimation, hypothesis testing, and sensitivity analysis of treatment effects.
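A small prior-predictive sketch of point (i): assuming the decomposition in which the treated-arm risk satisfies p1 = p0(1 - e) + (1 - p0)s, with p0 the baseline risk, e the efficacy, and s the adverse side effect probability (our reading of the parameterization; the hyperparameter values below are invented for illustration), jointly independent beta priors on (p0, e, s) induce prior correlation between the control- and treated-arm risks.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 100_000

# Jointly independent beta priors on the three parameters
# (hyperparameter values are illustrative, not from the paper).
p0 = rng.beta(2, 8, m)   # baseline risk          P(Y(0) = 1)
e  = rng.beta(5, 5, m)   # efficacy               P(Y(1) = 0 | Y(0) = 1)
s  = rng.beta(1, 19, m)  # adverse side effects   P(Y(1) = 1 | Y(0) = 0)

# Implied risk in the treatment arm (assumed parameterization).
p1 = p0 * (1 - e) + (1 - p0) * s

# Independent priors on (p0, e, s) still induce prior dependence
# between the expected outcomes of the control and treatment arms.
print("corr(p0, p1) =", np.corrcoef(p0, p1)[0, 1])
```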
Functional Concurrent Regression Mixture Models Using Spiked Ewens-Pitman Attraction Priors
Liang M, Koslovsky MD, Hébert ET, Businelle MS and Vannucci M
Functional concurrent, or varying-coefficient, regression models are a class of functional data analysis methods in which functional covariates and outcomes are collected concurrently. Two active areas of research for this class of models are identifying influential functional covariates and clustering their relations across observations. In various applications, researchers have applied and developed methods to address these objectives separately. However, no approach currently performs both tasks simultaneously. In this paper, we propose a fully Bayesian functional concurrent regression mixture model that simultaneously performs functional variable selection and clustering for subject-specific trajectories. Our approach introduces a novel spiked Ewens-Pitman attraction prior that identifies and clusters subjects' trajectories marginally for each functional covariate while using similarities in subjects' auxiliary covariate patterns to inform clustering allocation. Using simulated data, we evaluate the clustering, variable selection, and parameter estimation performance of our approach and compare its performance with alternative spiked processes. We then apply our method to functional data collected in a novel, smartphone-based smoking cessation intervention study to investigate individual-level dynamic relations between smoking behaviors and potential risk factors.
Bag of DAGs: Inferring Directional Dependence in Spatiotemporal Processes
Jin B, Peruzzi M and Dunson D
We propose a class of nonstationary processes to characterize space- and time-varying directional associations in point-referenced data. We are motivated by spatiotemporal modeling of air pollutants in which local wind patterns are key determinants of the pollutant spread, but information regarding prevailing wind directions may be missing or unreliable. We propose to map a discrete set of wind directions to edges in a sparse directed acyclic graph (DAG), accounting for uncertainty in directional correlation patterns across a domain. The resulting Bag of DAGs processes (BAGs) lead to interpretable nonstationarity and scalability for large data due to sparsity of DAGs in the bag. We outline Bayesian hierarchical models using BAGs and illustrate inferential and performance gains of our methods compared to other state-of-the-art alternatives. We analyze fine particulate matter using high-resolution data from low-cost air quality sensors in California during the 2020 wildfire season. An R package is available on GitHub.
How Trustworthy Is Your Tree? Bayesian Phylogenetic Effective Sample Size Through the Lens of Monte Carlo Error
Magee A, Karcher M, Matsen FA and Minin VM
Bayesian inference is a popular and widely-used approach to infer phylogenies (evolutionary trees). However, despite decades of widespread application, it remains difficult to judge how well a given Bayesian Markov chain Monte Carlo (MCMC) run explores the space of phylogenetic trees. In this paper, we investigate the Monte Carlo error of phylogenies, focusing on high-dimensional summaries of the posterior distribution, including variability in estimated edge/branch (known in phylogenetics as "split") probabilities and tree probabilities, and variability in the estimated summary tree. Specifically, we ask if there is any measure of effective sample size (ESS) applicable to phylogenetic trees which is capable of capturing the Monte Carlo error of these three summary measures. We find that there are some ESS measures capable of capturing the error inherent in using MCMC samples to approximate the posterior distributions on phylogenies. We term these tree ESS measures, and identify a set of three which are useful in practice for assessing the Monte Carlo error. Lastly, we present visualization tools that can improve comparisons between multiple independent MCMC runs by accounting for the Monte Carlo error present in each chain. Our results indicate that common post-MCMC workflows are insufficient to capture the inherent Monte Carlo error of the tree, and highlight the need for both within-chain mixing and between-chain convergence assessments.
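The scalar machinery underlying such diagnostics can be sketched by treating the presence or absence of one split across posterior tree samples as a binary trace and computing an autocorrelation-based ESS; this toy uses Geyer's initial positive sequence truncation and is a building block, not one of the paper's tree ESS measures.

```python
import numpy as np

def ess(x):
    """Autocorrelation-based effective sample size of a scalar trace,
    using Geyer's initial positive sequence truncation."""
    x = np.asarray(x, float) - np.mean(x)
    n = len(x)
    acov = np.correlate(x, x, mode="full")[n - 1:] / n
    rho = acov / acov[0]
    tau, m = 0.0, 0
    while 2 * m + 1 < n:
        pair = rho[2 * m] + rho[2 * m + 1]   # sums of adjacent autocorrelations
        if pair < 0:                          # truncate at the first negative pair
            break
        tau += 2.0 * pair
        m += 1
    return n / max(tau - 1.0, 1.0)

# Toy trace: a sticky 0/1 chain mimicking one split's presence across
# 5000 posterior tree samples; its true autocorrelation time is ~19.
rng = np.random.default_rng(0)
z = [1]
for _ in range(4999):
    z.append(z[-1] if rng.uniform() < 0.95 else 1 - z[-1])
print("nominal n = 5000, split ESS ~", round(ess(z)))
```

The gap between the nominal sample size and the split ESS is exactly the kind of Monte Carlo error that naive post-MCMC summaries ignore.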
A General Bayesian Functional Spatial Partitioning Method for Multiple Region Discovery Applied to Prostate Cancer MRI
Masotti M, Zhang L, Metzger GJ and Koopmeiners JS
Current protocols to estimate the number, size, and location of cancerous lesions in the prostate using multiparametric magnetic resonance imaging (mpMRI) are highly dependent on reader experience and expertise. Automatic voxel-wise cancer classifiers do not directly provide estimates of number, location, and size of cancerous lesions that are clinically important. Existing spatial partitioning methods estimate linear or piecewise-linear boundaries separating regions of local stationarity in spatially registered data and are inadequate for the application of lesion detection. Frequentist segmentation and clustering methods often require pre-specification of the number of clusters and do not quantify uncertainty. Previously, we developed a novel Bayesian functional spatial partitioning method to estimate the boundary surrounding a single cancerous lesion using data derived from mpMRI. We propose a Bayesian functional spatial partitioning method for multiple lesion detection with an unknown number of lesions. Our method utilizes functional estimation to model the smooth boundary curves surrounding each cancerous lesion. In a Reversible Jump Markov Chain Monte Carlo (RJ-MCMC) framework, we develop novel jump steps to jointly estimate and quantify uncertainty in the number of lesions, their boundaries, and the spatial parameters in each lesion. Through simulation we show that our method is robust to the shape of the lesions, number of lesions, and region-specific spatial processes. We illustrate our method through the detection of prostate cancer lesions using MRI.
Bayesian Analysis of Exponential Random Graph Models Using Stochastic Gradient Markov Chain Monte Carlo
Zhang Q and Liang F
The exponential random graph model (ERGM) is a popular model for social networks, which is known to have an intractable likelihood function. Sampling from the posterior for such a model is a long-standing problem in statistical research. We analyze the performance of the stochastic gradient Langevin dynamics (SGLD) algorithm (also known as noisy Langevin Monte Carlo) in tackling this problem, where the stochastic gradient is calculated via running a short Markov chain (the so-called inner Markov chain in this paper) at each iteration. We show that if the model size grows with the network size slowly enough, then SGLD converges to the true posterior in 2-Wasserstein distance as the network size and iteration number become large regardless of the length of the inner Markov chain performed at each iteration. Our study provides a scalable algorithm for analyzing large-scale social networks with possibly high-dimensional ERGMs.
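A toy sketch of the outer SGLD loop and the inner Markov chain, assuming a two-statistic ERGM (edge and triangle counts) on a small simulated network; the prior, step size, and inner chain length are illustrative, and in practice the SGLD step size is typically decreased over iterations. The gradient uses the identity grad log p(y|theta) = s(y) - E_theta[s(Y)], with the expectation replaced by the end state of the inner chain.

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes = 20

def suff_stats(A):
    """ERGM sufficient statistics: edge count and triangle count."""
    return np.array([A.sum() / 2, np.trace(A @ A @ A) / 6])

def inner_chain(theta, A, n_toggles=200):
    """Short Metropolis chain of edge toggles; its end state gives the
    noisy estimate of E_theta[s(Y)] used in the stochastic gradient."""
    A = A.copy()
    for _ in range(n_toggles):
        i, j = rng.choice(n_nodes, 2, replace=False)
        s_old = suff_stats(A)
        A[i, j] = A[j, i] = 1 - A[i, j]            # toggle edge (i, j)
        if np.log(rng.uniform()) > theta @ (suff_stats(A) - s_old):
            A[i, j] = A[j, i] = 1 - A[i, j]        # reject: undo the toggle
    return A

# "Observed" network: an Erdos-Renyi draw standing in for real data.
U = rng.uniform(size=(n_nodes, n_nodes))
A_obs = np.triu((U < 0.15).astype(float), 1)
A_obs = A_obs + A_obs.T
s_obs = suff_stats(A_obs)

theta, A_sim, eps = np.zeros(2), A_obs.copy(), 1e-4
for _ in range(500):
    A_sim = inner_chain(theta, A_sim)                     # inner Markov chain
    grad = (s_obs - suff_stats(A_sim)) - theta / 100.0    # likelihood + N(0, 10^2) prior
    theta = theta + 0.5 * eps * grad + np.sqrt(eps) * rng.standard_normal(2)
print("final SGLD iterate:", theta)
```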
Easily Computed Marginal Likelihoods from Posterior Simulation Using the THAMES Estimator
Metodiev M, Perrot-Dockès M, Ouadah S, Irons NJ, Latouche P and Raftery AE
We propose an easily computed estimator of marginal likelihoods from posterior simulation output, via reciprocal importance sampling, combining earlier proposals of DiCiccio et al. (1997) and Robert and Wraith (2009). This involves only the unnormalized posterior densities from the sampled parameter values, and does not involve additional simulations beyond the main posterior simulation, or additional complicated calculations, provided that the parameter space is unconstrained. Even if this is not the case, the estimator is easily adjusted by a simple Monte Carlo approximation. It is unbiased for the reciprocal of the marginal likelihood, consistent, has finite variance, and is asymptotically normal. It involves one user-specified control parameter, and we derive an optimal way of specifying this. We illustrate it with several numerical examples.
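The reciprocal importance sampling idea behind such estimators can be sketched on a toy problem where the normalizing constant is known: a normalized uniform density over a sample-based ellipsoid is divided by the unnormalized posterior and averaged over posterior draws, giving an unbiased estimate of the reciprocal marginal likelihood. The half/half sample split and the value of the control parameter c below are illustrative; the paper derives an optimal way to choose c.

```python
import numpy as np
from scipy.special import gammaln

rng = np.random.default_rng(0)
d, n, Z_true = 3, 20_000, 7.3

# Toy setup: the posterior is N(0, I_d) with normalizing constant Z_true;
# we only get draws and the unnormalized log-posterior evaluated on them.
draws = rng.standard_normal((n, d))
log_ptilde = np.log(Z_true) - 0.5 * np.sum(draws**2, 1) - 0.5 * d * np.log(2 * np.pi)

# Split the draws: half to set the ellipsoid, half to evaluate the estimator.
fit, eva = draws[: n // 2], draws[n // 2:]
mu, S = fit.mean(0), np.cov(fit.T)
c = np.sqrt(d + 1.0)   # control parameter (illustrative value only)

# Uniform density over the ellipsoid (x - mu)' S^{-1} (x - mu) <= c^2.
Sinv = np.linalg.inv(S)
_, logdetS = np.linalg.slogdet(S)
log_vol = (d / 2) * np.log(np.pi) - gammaln(d / 2 + 1) + d * np.log(c) + 0.5 * logdetS
inside = np.einsum("ij,jk,ik->i", eva - mu, Sinv, eva - mu) <= c**2

# Reciprocal importance sampling: mean of h(theta)/p_tilde(theta) estimates 1/Z.
recip = np.mean(inside * np.exp(-log_vol - log_ptilde[n // 2:]))
print("Z estimate:", 1.0 / recip, "true:", Z_true)
```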
Fast Methods for Posterior Inference of Two-Group Normal-Normal Models
Greengard P, Hoskins J, Margossian CC, Gabry J, Gelman A and Vehtari A
We describe a class of algorithms for evaluating posterior moments of certain Bayesian linear regression models with a normal likelihood and a normal prior on the regression coefficients. The proposed methods can be used for hierarchical mixed effects models with partial pooling over one group of predictors, as well as random effects models with partial pooling over two groups of predictors. We demonstrate the performance of the methods on two applications, one involving U.S. opinion polls and one involving the modeling of COVID-19 outbreaks in Israel using survey data. The algorithms involve analytical marginalization of regression coefficients followed by numerical integration of the remaining low-dimensional density. The dominant cost of the algorithms is an eigendecomposition computed once for each value of the outside parameter of integration. Our approach drastically reduces run times compared to state-of-the-art Markov chain Monte Carlo (MCMC) algorithms. The latter, in addition to being computationally expensive, can also be difficult to tune when applied to hierarchical models.
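The marginalize-then-integrate structure can be sketched on the classic normal-normal hierarchy (a stand-in for, not the paper's algorithm, which handles regression coefficients and relies on an eigendecomposition): the group effects and the common mean are integrated out analytically, leaving a one-dimensional density handled by quadrature.

```python
import numpy as np

# Eight-schools-style data: group effect estimates and standard errors.
y = np.array([28., 8., -3., 7., -1., 1., 18., 12.])
s = np.array([15., 10., 16., 11., 9., 11., 10., 18.])

def log_marginal(tau):
    """log p(y | tau) with the group effects theta_j and the mean mu
    integrated out analytically (flat prior on mu): marginally
    y_j | mu, tau ~ N(mu, s_j^2 + tau^2)."""
    v = s**2 + tau**2
    w = 1.0 / v
    mu_hat = np.sum(w * y) / np.sum(w)
    return (-0.5 * np.sum(np.log(v))
            - 0.5 * np.sum(w * (y - mu_hat)**2)
            - 0.5 * np.log(np.sum(w)))   # term from integrating mu out

# Numerical integration over the one remaining dimension (flat prior on tau).
taus = np.linspace(1e-3, 40.0, 400)
dx = taus[1] - taus[0]
p = np.exp([log_marginal(t) - log_marginal(5.0) for t in taus])
p /= p.sum() * dx
print("posterior mean of tau:", np.sum(taus * p) * dx)
```

Conditional on each grid value of tau, the group-level posterior moments are available in closed form, so posterior moments of all parameters follow by averaging over the grid; no MCMC tuning is involved.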
Shrinkage with Shrunken Shoulders: Gibbs Sampling Shrinkage Model Posteriors with Guaranteed Convergence Rates
Nishimura A and Suchard MA
Use of continuous shrinkage priors - with a "spike" near zero and heavy-tails towards infinity - is an increasingly popular approach to induce sparsity in parameter estimates. When the parameters are only weakly identified by the likelihood, however, the posterior may end up with tails as heavy as the prior, jeopardizing robustness of inference. A natural solution is to "shrink the shoulders" of a shrinkage prior by lightening up its tails beyond a reasonable parameter range, yielding a regularized version of the prior. We develop a regularization approach which, unlike previous proposals, preserves computationally attractive structures of original shrinkage priors. We study theoretical properties of the Gibbs sampler on the resulting posterior distributions, with emphasis on convergence rates of the Pólya-Gamma Gibbs sampler for sparse logistic regression. Our analysis shows that the proposed regularization leads to geometric ergodicity under a broad range of global-local shrinkage priors. Essentially, the only requirement is for the prior π_local(·) on the local scale λ to satisfy π_local(0) < ∞. If π_local(·) further satisfies lim_{λ→0} π_local(λ)/λ^a < ∞ for some a > 0, as in the case of Bayesian bridge priors, we show the sampler to be uniformly ergodic.
Reproducible Model Selection Using Bagged Posteriors
Huggins JH and Miller JW
Bayesian model selection is premised on the assumption that the data are generated from one of the postulated models. However, in many applications, all of these models are incorrect (that is, there is misspecification). When the models are misspecified, two or more models can provide a nearly equally good fit to the data, in which case Bayesian model selection can be highly unstable, potentially leading to self-contradictory findings. To remedy this instability, we propose to use bagging on the posterior distribution ("BayesBag") - that is, to average the posterior model probabilities over many bootstrapped datasets. We provide theoretical results characterizing the asymptotic behavior of the posterior and the bagged posterior in the (misspecified) model selection setting. We empirically assess the BayesBag approach on synthetic and real-world data in (i) feature selection for linear regression and (ii) phylogenetic tree reconstruction. Our theory and experiments show that, when all models are misspecified, BayesBag (a) provides greater reproducibility and (b) places posterior mass on optimal models more reliably, compared to the usual Bayesian posterior; on the other hand, under correct specification, BayesBag is slightly more conservative than the usual posterior, in the sense that BayesBag posterior probabilities tend to be slightly farther from the extremes of zero and one. Overall, our results demonstrate that BayesBag provides an easy-to-use and widely applicable approach that improves upon Bayesian model selection by making it more stable and reproducible.
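The bagging step itself is easy to sketch; the toy below compares two nested normal-mean models with closed-form marginal likelihoods (all settings are invented for illustration, and the paper also discusses bootstrap datasets whose size differs from n).

```python
import numpy as np
from scipy.special import logsumexp

rng = np.random.default_rng(0)

def log_marglik(y, v):
    """Log marginal likelihood of y_i ~ N(mu, 1) with mu ~ N(0, v):
    marginally y ~ N(0, I + v 11'), which has a closed-form density."""
    n = len(y)
    quad = np.sum(y**2) - v * y.sum()**2 / (1 + n * v)
    return -0.5 * (n * np.log(2 * np.pi) + np.log(1 + n * v) + quad)

def model_probs(y):
    # M0: mu pinned near 0 (v -> 0); M1: mu free with a unit-variance prior
    lm = np.array([log_marglik(y, 1e-12), log_marglik(y, 1.0)])
    return np.exp(lm - logsumexp(lm))

y = rng.normal(0.15, 1.0, size=100)   # truth sits awkwardly between the models
print("standard posterior model probs:", model_probs(y))

# BayesBag: average the posterior model probabilities over bootstrap datasets.
B = 200
bagged = np.mean([model_probs(rng.choice(y, size=len(y), replace=True))
                  for _ in range(B)], axis=0)
print("bagged posterior model probs:  ", bagged)
```

Rerunning with a different seed illustrates the instability being addressed: the standard probabilities can swing toward either model, while the bagged probabilities stay away from the extremes.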
Generalized Geographically Weighted Regression Model within a Modularized Bayesian Framework
Liu Y and Goudie RJB
Geographically weighted regression (GWR) models handle geographical dependence through a spatially varying coefficient model and have been widely used in applied science, but their general Bayesian extension is unclear because it involves a weighted log-likelihood which does not imply a probability distribution on the data. We present a Bayesian GWR model and show that its essence is dealing with partial misspecification of the model. Current modularized Bayesian inference models accommodate partial misspecification from a single component of the model. We extend these models to handle partial misspecification in more than one component of the model, as required for our Bayesian GWR model. Information from the various spatial locations is manipulated via a geographically weighted kernel and the optimal manipulation is chosen according to a Kullback-Leibler (KL) divergence. We justify the model via an information risk minimization approach and show the consistency of the proposed estimator in terms of a geographically weighted KL divergence.
Bayesian Hierarchical Stacking: Some Models Are (Somewhere) Useful
Yao Y, Pirš G, Vehtari A and Gelman A
Stacking is a widely used model averaging technique that asymptotically yields optimal predictions among linear averages. We show that stacking is most effective when model predictive performance is heterogeneous in inputs, and we can further improve the stacked mixture with a hierarchical model. We generalize stacking to Bayesian hierarchical stacking. The model weights vary as a function of the data, are partially pooled, and are inferred using Bayesian inference. We further incorporate discrete and continuous inputs, other structured priors, and time series and longitudinal data. To verify the performance gain of the proposed method, we derive theoretical bounds and demonstrate the method on several applied problems.
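A point-estimate caricature of why input-varying weights help, assuming a matrix lpd of leave-one-out log predictive densities for two models whose performance is heterogeneous in an input x; the paper's hierarchical version instead treats the weight parameters as partially pooled unknowns with full Bayesian inference rather than optimizing them.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import softmax

rng = np.random.default_rng(0)

# Toy setup: model 0 predicts well for x < 0, model 1 for x >= 0.
n = 200
x = rng.uniform(-1, 1, n)
lpd = np.column_stack([-0.5 - 2.0 * (x > 0), -0.5 - 2.0 * (x <= 0)])

def neg_score(alpha, feats):
    # weights are a softmax of linear functions of the input features
    w = softmax(feats @ alpha.reshape(feats.shape[1], -1), axis=1)
    return -np.sum(np.log(np.sum(w * np.exp(lpd), axis=1)))

const = np.ones((n, 1))
feats = np.column_stack([const, x])            # intercept + input

flat = minimize(neg_score, np.zeros(2), args=(const,)).fun
hier = minimize(neg_score, np.zeros(4), args=(feats,)).fun
print("constant-weight stacking score:", -flat)
print("input-varying stacking score: ", -hier)
```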
Perfect Sampling of the Posterior in the Hierarchical Pitman-Yor Process
Bacallado S, Favaro S, Power S and Trippa L
The predictive probabilities of the hierarchical Pitman-Yor process are approximated through Monte Carlo algorithms that exploit the Chinese Restaurant Franchise (CRF) representation. However, in order to simulate the posterior distribution of the hierarchical Pitman-Yor process, a set of auxiliary variables representing the arrangement of customers in tables of the CRF must be sampled through Markov chain Monte Carlo. This paper develops a perfect sampler for these latent variables employing ideas from the Propp-Wilson algorithm and evaluates its average running time by extensive simulations. The simulations reveal a significant dependence of running time on the parameters of the model, which exhibits sharp transitions. The algorithm is compared to simpler Gibbs sampling procedures, as well as a procedure for unbiased Monte Carlo estimation proposed by Glynn and Rhee. We illustrate its use with an example in microbial genomics studies.
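The Propp-Wilson idea is easiest to see away from the CRF, on a toy monotone chain: coupled chains started from the extremal states are run from ever further in the past with shared random numbers, and if they coalesce by time 0, the common value is an exact draw from the stationary distribution. Everything below (the random-walk update, the state space size) is illustrative.

```python
import numpy as np

def update(x, u, K=10):
    """Monotone random-walk update on {0, ..., K} driven by common noise u."""
    step = 1 if u < 0.45 else -1
    return min(max(x + step, 0), K)

def cftp(K=10, seed=0):
    """Propp-Wilson coupling from the past: run coupled chains from the
    extremal states 0 and K from ever further back until they coalesce
    at time 0; the shared value is then an exact stationary draw."""
    rng = np.random.default_rng(seed)
    noise, T = [], 1
    while True:
        while len(noise) < T:          # extend further into the past,
            noise.append(rng.uniform())  # reusing noise for recent times
        lo, hi = 0, K
        for t in range(T - 1, -1, -1):   # from time -T up to time 0
            lo = update(lo, noise[t], K)
            hi = update(hi, noise[t], K)
        if lo == hi:
            return lo
        T *= 2

draws = [cftp(seed=s) for s in range(2000)]
print("stationary mean estimate:", np.mean(draws))
```

Reusing the recorded noise when restarting further back is the step that makes the output exact rather than approximate; the paper's contribution is constructing such a sampler for the CRF's table arrangement variables.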
Scalable Approximate Bayesian Computation for Growing Network Models via Extrapolated and Sampled Summaries
Raynal L, Chen S, Mira A and Onnela JP
Approximate Bayesian computation (ABC) is a simulation-based likelihood-free method applicable to both model selection and parameter estimation. ABC parameter estimation requires the ability to forward simulate datasets from a candidate model, but because the sizes of the observed and simulated datasets usually need to match, this can be computationally expensive. Additionally, since ABC inference is based on comparisons of summary statistics computed on the observed and simulated data, using computationally expensive summary statistics can lead to further losses in efficiency. ABC has recently been applied to the family of mechanistic network models, an area that has traditionally lacked tools for inference and model choice. Mechanistic models of network growth repeatedly add nodes to a network until it reaches the size of the observed network, which may be of the order of millions of nodes. With ABC, this process can quickly become computationally prohibitive due to the resource intensive nature of network simulations and evaluation of summary statistics. We propose two methodological developments to enable the use of ABC for inference in models for large growing networks. First, to save time needed for forward simulating model realizations, we propose a procedure to extrapolate (via both least squares and Gaussian processes) summary statistics from small to large networks. Second, to reduce computation time for evaluating summary statistics, we use sample-based rather than census-based summary statistics. We show that the ABC posterior obtained through this approach, which adds two additional layers of approximation to the standard ABC, is similar to a classic ABC posterior. Although we deal with growing network models, both extrapolated summaries and sampled summaries are expected to be relevant in other ABC settings where the data are generated incrementally.
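The extrapolation idea can be sketched with a least squares fit on the log scale, assuming a preferential attachment model and maximum degree as the summary (both stand-ins; the paper handles general growth models, several summaries, and Gaussian process extrapolation with uncertainty).

```python
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)

def summary(G):
    """Cheap candidate summary statistic: maximum degree of the network."""
    return max(d for _, d in G.degree())

# Simulate the mechanistic model only at small sizes...
small_sizes = np.array([200, 400, 800, 1600, 3200])
S = np.array([np.mean([summary(nx.barabasi_albert_graph(n, 2,
                                   seed=int(rng.integers(1e6))))
                       for _ in range(10)])
              for n in small_sizes])

# ...then extrapolate the summary to the observed (large) size via least
# squares on the log scale; the paper also uses Gaussian processes.
X = np.column_stack([np.ones(len(small_sizes)), np.log(small_sizes)])
coef, *_ = np.linalg.lstsq(X, np.log(S), rcond=None)
n_obs = 100_000
print("extrapolated max degree at n = 100k:",
      np.exp(coef[0] + coef[1] * np.log(n_obs)))
```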
Improving Multilevel Regression and Poststratification with Structured Priors
Gao Y, Kennedy L, Simpson D and Gelman A
A central theme in the field of survey statistics is estimating population-level quantities through data coming from potentially non-representative samples of the population. Multilevel regression and poststratification (MRP), a model-based approach, is gaining traction against the traditional weighted approach for survey estimates. MRP estimates are susceptible to bias if there is an underlying structure that the methodology does not capture. This work aims to provide a new framework for specifying structured prior distributions that lead to bias reduction in MRP estimates. We use simulation studies to explore the benefit of these prior distributions and demonstrate their efficacy on non-representative US survey data. We show that structured prior distributions offer absolute bias reduction and variance reduction for posterior MRP estimates in a large variety of data regimes.
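The poststratification step, and the role the multilevel model plays before it, can be sketched as follows; the empirical-Bayes shrinkage toward a grand mean is a stand-in for the structured priors the paper develops (which smooth across ordered categories such as age), and all numbers are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy poststratification frame: cells with population counts N_j, sampled
# non-representatively (sample sizes n_j do not track population shares).
N = np.array([4.0e6, 6.0e6, 7.0e6, 5.0e6, 3.0e6])       # population cells
n = np.array([40, 15, 10, 25, 300])                      # biased sample sizes
theta_true = np.array([0.35, 0.40, 0.50, 0.55, 0.60])    # cell-level truth
ybar = np.array([rng.binomial(k, p) / k for k, p in zip(n, theta_true)])

# Multilevel stand-in: shrink noisy cell means toward the grand mean.
grand = np.average(ybar, weights=n)
var_w = ybar * (1 - ybar) / n                            # within-cell noise
var_b = max(np.var(ybar) - var_w.mean(), 1e-4)           # between-cell signal
theta_hat = grand + (var_b / (var_b + var_w)) * (ybar - grand)

# Poststratify: weight model estimates by population, not sample, shares.
print("raw sample mean :", np.average(ybar, weights=n))
print("MRP-style mean  :", np.average(theta_hat, weights=N))
print("population truth:", np.average(theta_true, weights=N))
```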
Robust Adaptive Incorporation of Historical Control Data in a Randomized Trial of External Cooling to Treat Septic Shock
Murray TA, Thall PF, Schortgen F, Asfar P, Zohar S and Katsahian S
This paper proposes a randomized controlled clinical trial design to evaluate external cooling as a means to control fever and thereby reduce mortality in patients with septic shock. The trial will include concurrent external cooling and control arms while adaptively incorporating historical control arm data. Bayesian group sequential monitoring will be done using a posterior comparative test based on the 60-day survival distribution in each concurrent arm. Posterior inference will follow from a Bayesian discrete time survival model that facilitates adaptive incorporation of the historical control data through an innovative regression framework with a multivariate spike-and-slab prior distribution on the historical bias parameters. For each interim test, the amount of information borrowed from the historical control data will be determined adaptively in a manner that reflects the degree of agreement between historical and concurrent control arm data. Guidance is provided for selecting Bayesian posterior probability group-sequential monitoring boundaries. Simulation results elucidating how the proposed method borrows strength from the historical control data are reported. In the absence of historical control arm bias, the proposed design controls the type I error rate and provides substantially larger power than reasonable comparators, whereas in the presence of bias of varying magnitude, type I error rate inflation is curbed.