proximal sampler

R-bloggers 2025-04-28

[This article was first published on R – Xi'an's Og, and kindly contributed to R-bloggers].

At the Columbia workshop last week, Andre Wibisono presented work related to a recent arXival on the exponentially fast convergence of both unadjusted Langevin and proximal sampler algorithms under strong [definitely strong] log-concavity assumptions. The idea behind the proximal sampler is to target the demarginalised density

g(x,y) \propto \exp\{\log f(x) - ||x-y||^2/(2\eta)\},\quad\eta>0

by introducing an auxiliary Gaussian vector y, which preserves f(x) as the marginal distribution of the first component vector X. While the auxiliary Y is (obviously) conditionally Gaussian, simulating from the conditional of X is at least as challenging as simulating from f, unless η is chosen small enough to regularise log g(x,y) into a strongly log-concave function of x, in which case

\log g(x,y) \le \log g(x^\star,y) -\beta||x-x^\star||^2

when x*=x*(y) is the maximiser of log g(x,y) (for a given value y) and β>0 is the appropriate log-concavity constant. This inequality means that an accept-reject step can be implemented to simulate from the conditional of X given Y, but it requires both the factor β and the derivation of x*(y), hence a pretty good understanding and a rather high regularity of the actual target f(x). Besides, the regularisation term ||x-y||² means that y is approximately the previous value of the (sub)chain X, hence it creates a restoring force that slows down the exploration of the target.
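To make the accept-reject step above concrete, here is a minimal sketch on a toy one-dimensional target of my choosing, log f(x) = -x⁴/4 - x²/2, which is not the target of the arXival or of the experiment below. For this f, the second derivative of -log g(x,y) in x is at least 1+1/η, so the inequality holds with β=(1+1/η)/2 and the envelope is the Normal N(x*, 1/(2β)). The names logf, logg, dlogg, draw_x and proximal_sampler, as well as the value η=0.5, are assumptions made for the illustration.

```r
set.seed(1)

# Toy target (an assumption for illustration): log f(x) = -x^4/4 - x^2/2,
# so that -(log f)'' = 3x^2 + 1 >= 1 and f is strongly log-concave.
logf <- function(x) -x^4 / 4 - x^2 / 2

# log g(x, y), up to an additive constant, and its derivative in x
logg  <- function(x, y, eta) logf(x) - (x - y)^2 / (2 * eta)
dlogg <- function(x, y, eta) -x^3 - x - (x - y) / eta

# One draw from the conditional of X given Y = y by accept-reject.
# Here -(d^2/dx^2) log g >= 1 + 1/eta, hence beta = (1 + 1/eta) / 2.
draw_x <- function(y, eta) {
  beta <- (1 + 1 / eta) / 2
  # x*(y): unique root of the derivative, bracketed by +/- (|y| + 1)
  b <- abs(y) + 1
  xstar <- uniroot(function(x) dlogg(x, y, eta), c(-b, b), tol = 1e-10)$root
  repeat {
    # proposal from the Gaussian envelope N(x*, 1 / (2 * beta))
    x <- rnorm(1, xstar, sqrt(1 / (2 * beta)))
    # log acceptance probability, non-positive by the inequality
    logacc <- logg(x, y, eta) - logg(xstar, y, eta) + beta * (x - xstar)^2
    if (log(runif(1)) < logacc) return(x)
  }
}

# Proximal sampler: alternate Y | X (Gaussian) and X | Y (accept-reject)
proximal_sampler <- function(n, eta, x0 = 0) {
  x <- numeric(n)
  cur <- x0
  for (t in seq_len(n)) {
    y <- rnorm(1, cur, sqrt(eta))  # Y | X = x is N(x, eta)
    cur <- draw_x(y, eta)          # exact draw from X | Y = y
    x[t] <- cur
  }
  x
}

xs <- proximal_sampler(5000, eta = 0.5)
```

Both conditionals are simulated exactly here, so the X-subchain has f as its stationary distribution. The price is the one mentioned above: β and x*(y) are only available because the toy target was picked to make them easy to derive.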

Since the arXival does not contain numerical comparisons, I attempted one using the (2D) banana-shaped distribution,

# log-density (up to an additive constant) of the 2D banana-shaped target
target=function(x,sig,B,mu) -x[1]^2/2/sig-(x[2]+B*x[1]^2-mu)^2/2

with μ=σ=B=10, and comparing it with a vanilla random walk Metropolis with three potential scales, one chosen at random at each iteration. Since I did not want to check whether or not the target was log-concave (and derive the corresponding β), I used the Normal distribution centred at x*(y) as the proposal of a Metropolis step, again with several scales. The following is the representation of the samples (sienna for MCMC, navy blue for proximal with β=50, dark green for β=5), with a lesser rate of tail exploration for the proximal samplers. It is thus unclear to me whether the theoretical characterisations of the method translate into practical efficiency beyond the most regular cases.
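For readers who want to try a comparison of this kind, the following is a rough sketch and not the code behind the picture. It is one possible reading of the description above: a random walk Metropolis whose scale is drawn among three values at each iteration, and a proximal sampler where the X update is a single Metropolis step with a Normal N(x*(y), I/(2β)) proposal. The scales (0.1, 1, 10), the value η=1, the starting point, the chain length, and the names ltarget, rwm and prox_mh are all assumptions of mine.

```r
set.seed(42)

# log-target from the post, with mu = sig = B = 10
target <- function(x, sig, B, mu) {
  -x[1]^2 / 2 / sig - (x[2] + B * x[1]^2 - mu)^2 / 2
}
ltarget <- function(x) target(x, sig = 10, B = 10, mu = 10)

# Random walk Metropolis, scale drawn among `scales` at each iteration
rwm <- function(n, x0, scales) {
  out <- matrix(NA_real_, n, 2)
  cur <- x0
  for (t in seq_len(n)) {
    s <- scales[sample.int(length(scales), 1)]
    prop <- cur + s * rnorm(2)
    if (log(runif(1)) < ltarget(prop) - ltarget(cur)) cur <- prop
    out[t, ] <- cur
  }
  out
}

# Proximal sampler with a Metropolis step for X given Y = y
prox_mh <- function(n, x0, eta, beta) {
  out <- matrix(NA_real_, n, 2)
  cur <- x0
  for (t in seq_len(n)) {
    y <- cur + sqrt(eta) * rnorm(2)  # Y | X = x is N(x, eta * I)
    logg <- function(x) ltarget(x) - sum((x - y)^2) / (2 * eta)
    # numerical x*(y), started from y so that it depends on y only;
    # the Metropolis step stays valid even if this is a local maximum
    xstar <- optim(y, logg, control = list(fnscale = -1))$par
    # log of g(x, y) / q(x) up to a constant, with q = N(x*, I / (2 * beta))
    w <- function(x) logg(x) + beta * sum((x - xstar)^2)
    prop <- xstar + rnorm(2) / sqrt(2 * beta)
    if (log(runif(1)) < w(prop) - w(cur)) cur <- prop
    out[t, ] <- cur
  }
  out
}

mc <- rwm(2000, x0 = c(0, 10), scales = c(0.1, 1, 10))
px <- prox_mh(2000, x0 = c(0, 10), eta = 1, beta = 50)
```

Plotting the two matrices with plot() and points() gives a picture in the spirit of the one described above, although the outcome will depend on the choices of η, β and the random walk scales.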
