堆优化模拟退火（List-Based Simulated Annealing|List-Based SA|LBSA|模拟退火）算法

图炸了的话请多刷新几次

堆优化模拟退火（List-Based Simulated Annealing）算法

前言

lsy 讲的随机化算法中提到了这个 LBSA（我将其命名为堆优化模拟退火），于是阅读了一下论文。实际上国内 OI 界对这个算法的使用非常少，使用频率也非常低。大家图一乐即可。

我们假定下面求解均是求最小值。

LBSA 较于 SA 的优点

在普通 SA 中，我们需要手动调参，而 LBSA 则是通过一个简单的温度堆，通过择优退火免去了调参。

温度反推公式

我们定义当前温度为 $t$，已知状态为 $x$，新状态为 $y$，答案计算函数为 $f$。根据模拟退火可以得到发生状态转移（修改最优解）的概率 $p$ 为（公式1）：

\[p=\begin{cases} 1 & \text{if}\ f(y)\le f(x) \\ \exp({\frac{-(f(y)-f(x))}{t}}) & \text{otherwise} \end{cases} \]

相反，如果我们知道发生状态转移的概率 $p$，那么我们就可以计算出相应的温度 $t$。相应的温度 $t$ 为（公式2）：

\[t=\frac{-(f(y)-f(x))}{\ln(p)} \]

生成初始温度堆

LBSA 通过生成一个存储了大量温度的温度堆。择优退火。下面是温度堆的生成过程：

graph TD a(温度堆生成开始) --> b[定义初始状态 $x$ 创建空的温度堆 $L$ 定义温度堆长度 $L$$max$ 定义初始发生状态转移的概率 $p$ $i=0$] --> c[创建新状态 $y$] --> d{"$f(y)＜f(x)$ （解更优）"} --NO--> f["计算温度 $t=(-(f(y)-f(x)))/\ln(p)$（公式2）&emsp;&emsp;&emsp;&emsp; 将温度 $t$ 插入温度堆 $L$ 中 $i++$"] --> g{"$i < L$$max$"} --Yes-->c d --Yes--> e["$x=y$（更新状态）"] --> f g --NO--> h[结束]

我们一般定义 $p=0.1$。

这个温度堆为大根堆，即温度越高，优先级越高。重复相同的程序，直到填满。

温度控制

对于第 $i$ 次模拟退火，我们会跑 $M$ 次。定义当前温度堆最大值为 $t_{max}$，已知状态与新状态的值差为 $d_i$，那么发生状态转移的概率 $p_i$ 为（公式3）：

\[p_i=e^{-d_i/t_{max}} \]

以上可以通过公式 1 得出。

根据 Metropolis 算法（Metropolis acceptance criterion），每次遇到一个较差的新状态，生成一个从 0 到 1 的随机小数 $r$。如果 $r$ 小于发生状态转移的概率 $p$，则将接受较差的新状态，同时通过以下公式算出新的温度 $t_i$（公式4）：

\[t_i=\frac{-d_i}{\ln(r_i)} \]

证明可参见公式 2 的证明。

更新列表

对于第 $i$ 次模拟退火，我们跑完 $M$ 次后，将最大值 $t_{max}$ 从堆里删去，插入上述 $t_i$ 的平均值，然后进行下一次模拟退火。

下图表对此进行了详细解释：

graph TD a(LBSA开始) --> b[生成温度堆 生成状态 $x$ $k=0$] --> c[从温度堆 $L$ 堆顶取出最大值 $T$$max$ $k++,t=0,c=0,m=0$] --> d[创建新状态 $y$ $m++$] --> e{"$f(y)＜f(x)$ （解更优）"} --No--> f["定义$d_i=−(f(y)-f(x))$ $p=exp(-d_i/t$$max$$)$（公式3） 生成从0到1的随机数 $r$"] --> g{"$r\le p$"} --Yes--> h["$t=t+(-d_i/\ln(r))$ &emsp;&emsp;&emsp;&emsp; c++"] --> i["$x=y$ &emsp;&emsp;&emsp;&emsp;"] --> j[$m\le M$] --No--> k{"$c==0?$ &emsp;&emsp;&emsp;&emsp;"} --No--> l["弹出温度堆 $L$ 堆顶 插入 $t/c$"] --> m{$k\le K$} --No--> n(LBSA结束) e --Yes--> i g --No--> j j --Yes--> d k --Yes--> m m --Yes--> c

实现

这是对于 [JSOI2004] 平衡点 / 吊打XXX 的一种实现：

#include <cstdio>
#include <cmath>
#include <cstdlib>
#include <ctime>
#include <climits>
#include <queue>
#define INIT_TEMP 1e3
#define INIT_ACC_PROB 0.1
#define TEMP_LIST_LENGTH 300
#define MAXIMUM_ITERATION_TIMES 1000
#define MARKOV_CHAIN_LENGTH 100
#define COLD 0.5

int numOfWeights;

struct node {
	int x, y, w;
} weight[1010];

struct solve {
	double x, y, energy;
} ans;

std::priority_queue<double> tempList;

double calcEnergy(solve pos) {
	double energy = 0.0;
	for(int i = 1; i <= numOfWeights; i++) {
		energy += sqrt(pow(pos.x - weight[i].x, 2) + pow(pos.y - weight[i].y, 2)) * weight[i].w;
	}
	return energy;
}
void createInitTemp() {
	for(int i = 1; i <= TEMP_LIST_LENGTH; i++) {
		solve newAns;

		newAns.x = ans.x + ((rand() % 100000) - 50000);
		newAns.y = ans.y + ((rand() % 100000) - 50000);
		newAns.energy = calcEnergy(newAns);

		if(newAns.energy < ans.energy) {
			ans = newAns;
		}

		double temp = (- abs(newAns.energy - ans.energy)) / log(INIT_ACC_PROB);

		tempList.push(temp);
	}
}
void runLBSA() {
	for(int i = 1; i <= MAXIMUM_ITERATION_TIMES; i++) {
		double temp = 0;
		double tmax = tempList.top();
		// printf("%lf\n", tmax);
		int updateTimes = 0;
		for(int j = 1; j <= MARKOV_CHAIN_LENGTH; j++) {
			solve newAns;

			newAns.x = ans.x + (rand() * 2 - RAND_MAX) * (tmax / INT_MAX * INIT_TEMP);
			newAns.y = ans.y + (rand() * 2 - RAND_MAX) * (tmax / INT_MAX * INIT_TEMP);
			newAns.energy = calcEnergy(newAns);

			if(newAns.energy < ans.energy) {
				ans = newAns;
			}
			else {
				double p = exp(- (newAns.energy - ans.energy) / tmax);
				double r = (rand() * 1.0) / (RAND_MAX);
				if(r < p) {
					temp = temp + (- (newAns.energy - ans.energy)) / log(r);
					updateTimes++;
				}
			}

			tmax *= COLD;
		}
		if(updateTimes) {
			tempList.pop();
			tempList.push(temp / updateTimes);
		}
	}
}
int main() {
	srand(time(0));

	scanf("%d", &numOfWeights);

	for(int i = 1; i <= numOfWeights; i++) {
		scanf("%d %d %d", &weight[i].x, &weight[i].y, &weight[i].w);
		ans.x += weight[i].x;
		ans.y += weight[i].y;
	}

	ans.x /= numOfWeights;
	ans.y /= numOfWeights;
	ans.energy = calcEnergy(ans);

	createInitTemp();

	runLBSA();

	printf("%.3lf %.3lf", ans.x, ans.y);
	return 0;
}

例题

THUWC 2024 四子棋（题外话：这题使用 UCT 可以拿到 445/500 的分数！）

posted @ 2023-08-09 15:15 ZnPdCo 阅读(96) 评论(0) 编辑收藏举报

刷新页面返回顶部

znpdco

堆优化模拟退火（List-Based Simulated Annealing|List-Based SA|LBSA|模拟退火）算法

堆优化模拟退火（List-Based Simulated Annealing）算法

前言

LBSA 较于 SA 的优点

温度反推公式

生成初始温度堆

温度控制

更新列表

实现

例题

公告

znpdco

堆优化模拟退火（List-Based Simulated Annealing|List-Based SA|LBSA|模拟退火） 算法

堆优化模拟退火（List-Based Simulated Annealing） 算法

前言

LBSA 较于 SA 的优点

温度反推公式

生成初始温度堆

温度控制

更新列表

实现

例题

公告

堆优化模拟退火（List-Based Simulated Annealing|List-Based SA|LBSA|模拟退火）算法

堆优化模拟退火（List-Based Simulated Annealing）算法