Sampling
A* sampling
CJ Maddison, 2014, NeurIPS
A practical generic sampling algorithm that searches for the maximum of a Gumbel process using A* search.
Monte Carlo
Monte Carlo methods (also called stochastic simulation) are approximation algorithms that rely on repeated random sampling.
Monte Carlo for Expectation Approximation
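Why Monte Carlo expectation approximation works: by the Law of Large Numbers, the sample mean \(\frac{1}{n}\sum_{i=1}^{n} f(x^{(i)})\) over independent samples \(x^{(i)}\) of \(X\) converges to \(\mathbb{E}[f(X)]\) as \(n\to\infty\).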
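A minimal sketch of Monte Carlo expectation approximation, using only Python's standard library. The helper name `mc_expectation`, the choice \(X\sim\mathrm{Uniform}(0,1)\), and the test function \(f(x)=x^2\) are illustrative assumptions, not part of the notes; for this choice the exact expectation is \(1/3\), so the estimate can be checked by eye.

```python
import random

def mc_expectation(f, sampler, n, seed=0):
    """Estimate E[f(X)] as the average of f over n independent draws of X.

    `sampler` is a hypothetical callback that takes a random.Random
    instance and returns one draw of X.
    """
    rng = random.Random(seed)
    return sum(f(sampler(rng)) for _ in range(n)) / n

# Illustrative choice: X ~ Uniform(0, 1) and f(x) = x**2, so E[f(X)] = 1/3.
estimate = mc_expectation(lambda x: x * x, lambda rng: rng.random(), n=100_000)
print(estimate)
```

Increasing `n` tightens the estimate; the error shrinks roughly like \(1/\sqrt{n}\).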
Monte Carlo for Integration Approximation
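To calculate the integral \(I=\int_{\Omega} f(\bm x)\,d\bm x\):
- Draw \(n\) samples uniformly at random from \(\Omega\), denoted \(\bm x^{(1)}, \bm x^{(2)},\dots, \bm x^{(n)}\).
- Calculate the volume \(V=\int_{\Omega} d\bm x\). The shape of \(\Omega\) should be simple; otherwise this integral is itself hard to calculate.
- Calculate the approximation \(I\approx\frac{V}{n}\sum_{i=1}^{n} f(\bm x^{(i)})\).
Why it works: the uniform density on \(\Omega\) is \(1/V\), so \(I=V\int_{\Omega} f(\bm x)\frac{1}{V}\,d\bm x=V\,\mathbb{E}[f(\bm x)]\), and the sample mean approximates this expectation.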
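The procedure of this section (draw uniform samples from \(\Omega\), compute the volume of \(\Omega\), then scale the sample mean of the integrand by that volume) can be sketched in one dimension as follows. The interval \(\Omega=[0,\pi]\), the integrand \(\sin x\), and the helper name `mc_integrate` are assumptions chosen for illustration; the exact integral is 2.

```python
import math
import random

def mc_integrate(f, a, b, n, seed=0):
    """Approximate the integral of f over [a, b] by uniform sampling."""
    rng = random.Random(seed)
    volume = b - a  # "volume" of the 1-D region Omega = [a, b]
    total = sum(f(rng.uniform(a, b)) for _ in range(n))
    return volume * total / n  # volume times the sample mean of f

# Illustrative check: the integral of sin(x) over [0, pi] is exactly 2.
estimate = mc_integrate(math.sin, 0.0, math.pi, n=100_000)
print(estimate)
```

In higher dimensions the same idea applies as long as \(\Omega\) is simple enough to sample uniformly and to measure, for example a box.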
Inverse Transform Sampling
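- Find the inverse \(x={\rm{cdf}}^{-1}(y)\) of the cumulative distribution function \(y={\rm{cdf}}(x)\).
- Draw a sample \(u\) from the uniform distribution on [0, 1] and compute the corresponding sample \(x={\rm{cdf}}^{-1}(u)\).
One way to look at this method: the range of the cdf is [0, 1]. Pick \(n\) points uniformly at random from that range and map each one back to the \(x\) at which the cdf attains that value (that is, the output of the inverse cdf). How many samples land in a region depends on how steep the cdf is there: the steeper the cdf, the more samples \(x\) fall in that region.
Pros:
- Sampling quality is good, because the cdf rises steeply over the intervals of \(x\) where \(p(x)\) peaks, so those intervals receive most of the samples.
Cons:
- For some distributions the inverse of the cdf is hard to derive.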
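As a small worked case of inverse transform sampling, consider the exponential distribution, which is an assumption chosen here because its cdf inverts in closed form: \({\rm{cdf}}(x)=1-e^{-\lambda x}\), so \({\rm{cdf}}^{-1}(u)=-\ln(1-u)/\lambda\). The function name `sample_exponential` is hypothetical.

```python
import math
import random

def sample_exponential(rate, n, seed=0):
    """Draw n Exponential(rate) samples by inverting the cdf.

    cdf(x) = 1 - exp(-rate * x)  =>  cdf^{-1}(u) = -log(1 - u) / rate
    """
    rng = random.Random(seed)
    # rng.random() lies in [0, 1), so 1 - u lies in (0, 1] and log is defined.
    return [-math.log(1.0 - rng.random()) / rate for _ in range(n)]

samples = sample_exponential(rate=2.0, n=100_000)
mean = sum(samples) / len(samples)
print(mean)  # should be close to the true mean 1 / rate = 0.5
```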
Importance Sampling
For an objective \(I=\int_A f(x)\,dx\), we introduce a random variable \(X\) with probability density function \(h(x)\) on \(A\), satisfying \(h(x)\ne 0\) for all \(x\in A\) and \(\int_A h(x)\,dx=1\), and then
\[ I=\int_A \frac{f(x)}{h(x)}h(x)\,dx=\mathbb{E}_{X\sim h}\left[\frac{f(X)}{h(X)}\right]. \]
Next, we approximate the expectation by Monte Carlo:
\[ I\approx\frac{1}{n}\sum_{i=1}^{n}\frac{f(x^{(i)})}{h(x^{(i)})}, \]
where the \(x^{(i)}\) are samples drawn from \(h\). We call \(h(x)\) the importance sampling function. A good importance sampling function \(h(x)\) should have the following properties:
- \(h(x)>0\)
- approximately proportional to \(|f(x)|\)
- easy to simulate
- easy to compute \(h(x)\)
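A minimal sketch of importance sampling for an integral \(\int_A f(x)\,dx\): draw samples from a density \(h\) and average the weights \(f(x)/h(x)\). The concrete choices are assumptions for illustration only: \(A=[0,\infty)\), \(f(x)=x^2e^{-x}\) (whose exact integral is 2), and \(h(x)=e^{-x}\), the Exponential(1) density, which is positive on \(A\), easy to simulate, and easy to evaluate. The helper name `importance_sampling` is hypothetical.

```python
import math
import random

def importance_sampling(f, h_pdf, h_sampler, n, seed=0):
    """Estimate the integral of f by averaging f(x) / h(x) over x ~ h."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = h_sampler(rng)          # draw from the importance distribution h
        total += f(x) / h_pdf(x)    # importance weight for this sample
    return total / n

# Illustrative target: integral of x^2 * exp(-x) over [0, inf), which equals 2.
f = lambda x: x * x * math.exp(-x)
h_pdf = lambda x: math.exp(-x)                 # Exponential(1) density
h_sampler = lambda rng: rng.expovariate(1.0)   # Exponential(1) sampler

estimate = importance_sampling(f, h_pdf, h_sampler, n=200_000)
print(estimate)
```

A density \(h\) closer in shape to \(|f|\) would make the weights \(f(x)/h(x)\) more nearly constant and reduce the variance of the estimate.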
Acceptance-Rejection Sampling (accept-reject)
(also called rejection sampling)
We can generate samples of a target random variable \(X\) with target distribution \(p(X)\) by instead sampling from an introduced distribution \(q(X)\) that satisfies \(\exists c>0: cq(x)\ge p(x),\forall x\) . The chosen distribution \(q(X)\) is called the proposal distribution.
Samples are obtained as follows:
- Draw a sample \(x_i\) from \(q(x)\) and \(u_i\) from \(\mathrm{Uniform}(0,1)\)
- Check the ratio \(\frac{p(x_i)}{cq(x_i)}\)
- Accept \(x_i\) if \(u_i \le \frac{p(x_i)}{cq(x_i)}\)
- Reject \(x_i\) otherwise.
Pros:
*
Cons:
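- May be inefficient because too many samples are rejected
Acceptance-rejection sampling is not commonly used, because a suitable proposal distribution and constant \(c\) are hard to find, which easily leads to a high rejection rate.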
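The accept-reject steps of this section can be sketched as follows. The target \(p(x)=6x(1-x)\) on \([0,1]\) (the Beta(2, 2) density), the proposal \(q=\mathrm{Uniform}(0,1)\), and \(c=1.5\) are assumptions chosen so that \(cq(x)\ge p(x)\) holds everywhere; the helper name `rejection_sample` is hypothetical. The function also reports the observed acceptance rate, which is about \(1/c\) when \(p\) and \(q\) are both normalized.

```python
import random

def rejection_sample(p, q_sampler, q_pdf, c, n, seed=0):
    """Draw n samples from density p using proposal q with c*q(x) >= p(x)."""
    rng = random.Random(seed)
    accepted = []
    proposed = 0
    while len(accepted) < n:
        x = q_sampler(rng)   # candidate from the proposal distribution
        u = rng.random()     # u ~ Uniform(0, 1)
        proposed += 1
        if u <= p(x) / (c * q_pdf(x)):
            accepted.append(x)   # accept; otherwise the candidate is dropped
    return accepted, n / proposed

# Illustrative target: Beta(2, 2) density, whose maximum is 1.5 at x = 0.5.
p = lambda x: 6.0 * x * (1.0 - x)
samples, acceptance_rate = rejection_sample(
    p, q_sampler=lambda rng: rng.random(), q_pdf=lambda x: 1.0, c=1.5, n=50_000
)
print(sum(samples) / len(samples), acceptance_rate)
```

With a poorly matched proposal, \(c\) must be large and the acceptance rate \(1/c\) drops accordingly, which is the inefficiency noted above.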
Adaptive Rejection Sampling
MCMC, Markov Chain Monte Carlo
Gibbs Sampling
Gibbs sampling is an MCMC method that is commonly used for sampling high-dimensional data.