Adaptive restart strategies for conflict driven SAT solvers
Abstract
As the SAT competition has shown, frequent restarts improve the speed of SAT solvers tremendously, particularly on satisfiable industrial instances. This paper presents a novel adaptive technique that dynamically measures the agility of the search process, which in turn is used to control the restart frequency. Experiments demonstrate that this new dynamic restart strategy improves the speed of our SAT solver PicoSAT considerably on crafted instances and slightly on industrial instances.
1 Introduction
SAT solvers may benefit from restarts [3]. Particularly on satisfiable industrial examples, frequent restarts improved the performance of our SAT solver PicoSAT [1] tremendously. Even though PicoSAT is a winner of the SAT Competition 2007 in the category of satisfiable industrial instances, an analysis of PicoSAT’s performance on unsatisfiable instances in general, and on crafted instances in particular, reveals that frequent restarts can also be harmful.
In this short paper we address this issue and present a novel adaptive technique that measures the “agility” of the SAT solver as it traverses the search space, based on the rate of recently flipped assignments. The level of agility dynamically determines the restart frequency. Low agility enforces frequent restarts, high agility prohibits restarts. Our experiments demonstrate that this new dynamic restart strategy improves the speed of PicoSAT considerably on crafted instances and slightly on industrial instances.
As has been argued in [3], combinatorial search has heavy-tail behavior. Even if an instance is easy to satisfy (or to refute), the search may get stuck in a complex part of the search space. As a solution to this problem, the authors suggest using randomization, and in particular restarts. To restart means to stop the current search after a certain time has passed and to start over again.
Our focus is on industrial and crafted instances. For random benchmarks, randomized algorithms are more successful. There has been work on dynamic restart algorithms for randomized search, see for instance [4,6]. This work is not applicable to our setting. We want to improve the performance of conflict driven SAT solvers with learning, such as RSAT [9] and PicoSAT [1]. Additionally, these solvers always pick the last assignment for a decision variable. Enforcing these heuristics without learning makes restarts useless. Furthermore, statistics such as the number of satisfied clauses, which are crucial in adaptive restart scheduling for local search [4], are not available in the solvers we want to improve.
Techniques implemented in the SAT solver TiniSAT [5], inspired by [7] and further improved in RSAT [9] and PicoSAT [1], show that frequent restarts in combination with saving and reusing the previous phase can speed up SAT solvers on industrial instances tremendously, particularly on satisfiable ones. In this category PicoSAT was a clear winner of the SAT’07 Competition.
Besides fast low-level data structures [1], the major improvement in version 535 of PicoSAT, as submitted to the SAT’07 Competition, is an aggressive restart schedule in combination with saving and reusing phases of assigned variables:
The first restart occurs after 100 conflicts. Then this restart interval is increased by 10%, which means the next restart happens after another 110 conflicts, then after another 121 conflicts, etc. However, this sequence of longer and longer inner restart intervals is reset to its initial value of 100 conflicts once the end of an outer restart interval is reached. Then the outer restart interval is also increased by 10%. This results in “bursts” of restarts. The restart frequency within one burst slows down towards its end, and the length of a burst, the burst duration, slowly increases over time. More details can be found in [1].
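The schedule described above can be sketched as a small generator. This is only an illustration under assumptions, not PicoSAT's code: the initial outer interval is not stated here, so `outer_initial` is a hypothetical parameter, the outer interval is treated as a cap on the inner interval, and the 10% growth is rounded down to integers. The precise scheme is the one given in [1].

```python
from itertools import islice


def restart_intervals(initial=100, outer_initial=100, growth_percent=10):
    """Yield the number of conflicts to wait before each scheduled restart.

    Assumption: the outer interval acts as an upper bound on the inner
    interval; once the inner interval reaches it, the inner interval is
    reset and the outer interval grows by `growth_percent`.
    """
    inner, outer = initial, outer_initial
    while True:
        yield inner
        if inner >= outer:
            # End of a burst: enlarge the outer interval, reset the inner one.
            outer += outer * growth_percent // 100
            inner = initial
        else:
            # Within a burst: restarts become gradually less frequent.
            inner += inner * growth_percent // 100


# Example with a hypothetical initial outer interval of 121 conflicts.
first = list(islice(restart_intervals(outer_initial=121), 8))
```

Each burst in the resulting sequence starts again at 100 conflicts and lasts one step longer than the previous one, which is the slowly growing burst duration mentioned above.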
RSAT [9] follows TiniSAT [5] with respect to restarts. Both have a less aggressive restart strategy than PicoSAT. They also use the same kind of preprocessing [2] as MiniSAT. As a result, RSAT, TiniSAT and MiniSAT turned out to be faster than PicoSAT on unsatisfiable industrial instances. On unsatisfiable crafted instances the situation is even worse: PicoSAT, and in this case also RSAT, can solve far fewer benchmarks than MiniSAT.
After this analysis it seems a valid conjecture that frequent restarts may also be harmful, particularly on unsatisfiable crafted instances. The question then is how to measure the effectiveness of frequent restarts or, better, how to determine criteria for when to disable restarts.
2 Measuring Agility
In all our recent SAT solvers we monitor the average decision height and print it as a kind of progress report. The average decision height is calculated by summing up the decision levels at decision points and dividing the result by the number of decisions. If the average decision height is going up, we are “close” to a satisfying assignment. If the average decision height goes down, the solver will eventually resolve the empty clause, or at least some new unit clauses, and it is getting “closer” to a refutation. Intuitively, the solver is stuck if the average height is not changing much, and it may be a good idea to restart. On the other hand, restarts should not happen if the average decision height is changing fast.
Our first, failed attempt to dynamically control restarts was based on this observation. Restarts are disabled if the derivative of the average decision height becomes small. However, we were not able to get any positive results. In particular, it seems to be impossible to come up with good “magic constants”. The absolute values of the derivative of the average decision height vary considerably from instance to instance.
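For concreteness, the average decision height defined above amounts to a running mean of decision levels. The following is a minimal sketch; the class and method names are hypothetical and not taken from any solver.

```python
class DecisionHeight:
    """Running average of the decision level observed at each decision."""

    def __init__(self):
        self.level_sum = 0   # sum of decision levels at decision points
        self.decisions = 0   # number of decisions made so far

    def on_decision(self, level):
        # Called once per decision with the current decision level.
        self.level_sum += level
        self.decisions += 1

    def average(self):
        # Defined as 0.0 before the first decision to avoid division by zero.
        return self.level_sum / self.decisions if self.decisions else 0.0
```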
2.1 Flips
As pioneered by RSAT [9], PicoSAT, when assigning a decision variable, always picks the last phase, i.e. the direction, to which that variable was assigned. For instance, if a decision variable was assigned to true the last time it was assigned, then it is assigned to true again. If a variable is picked as decision variable and was not assigned before, then the phase is picked depending on the number of its positive and negative occurrences.
Therefore, whenever a variable becomes assigned to a certain value, in particular if the assignment is forced by some other decision, PicoSAT and RSAT have to remember this value. During backtracking the variable is unassigned again, but the old value is saved.
This apparatus makes it easy to determine when a new forced assignment to a variable flips the old value of the variable. Flipping the value of a variable means that it is assigned the opposite of the value it was assigned the last time.
Clearly, if the frequency of flips is small, then the SAT solver literally does not move much, using for instance Hamming distance in the Boolean space as metric. This may be a good time to restart. On the other hand, if many flips have occurred recently, then there is no point in restarting; it may even be counterproductive.
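Phase saving and flip detection as described in this section can be illustrated with a few lines of code. This is a sketch under assumptions: the names are hypothetical, the phase for never-assigned variables is passed in as `default` rather than computed from occurrence counts, and only forced assignments count as flips.

```python
class PhaseSaver:
    """Remembers the last value of each variable and detects flips."""

    def __init__(self):
        self.saved = {}   # variable -> value of its most recent assignment
        self.flips = 0    # number of forced assignments that flipped a value

    def assign(self, var, value, forced):
        """Record an assignment; return True if it is a forced flip."""
        previous = self.saved.get(var)
        # A first-time assignment has no old value and therefore cannot flip.
        is_flip = forced and previous is not None and previous != value
        self.saved[var] = value   # the value survives later backtracking
        if is_flip:
            self.flips += 1
        return is_flip

    def decision_phase(self, var, default):
        """Phase to use when `var` is picked as decision variable."""
        return self.saved.get(var, default)
```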
2.2 A Fresh Look at VSIDS
In order to obtain a robust metric for measuring agility, we follow a reformulation of the seminal work on VSIDS [8]. The basic idea of VSIDS is to concentrate on those variables that recently were involved in conflicts: a variable v is involved in a conflict if v is resolved in the conflict analysis to produce the learned clause or is contained in the learned clause.
Every variable has a counter, called the VSIDS score, which counts how often this variable was used in deriving a learned clause. This counter essentially sums up all these involvements. However, and this is the intriguing idea of VSIDS, it is much better to slowly forget past involvement. Variables with a higher VSIDS score are picked as decision variables, which increases the focus of the search. Explaining the effectiveness of VSIDS is out of the scope of this paper.
One way to implement this scheme is to multiply the VSIDS counters of all
This reformulation of VSIDS [8] has the benefit that it produces a rational number between 0 and 1, which can be interpreted as the percentage of the number of times a variable was involved in a conflict “recently”. Unfortunately, we do not have a more precise definition for “recently” at this moment.
In practice it is too costly to update the VSIDS or NVSIDS score of all variables at every conflict, in particular for industrial examples. In the original Chaff implementation, this overhead is avoided by accumulating and delaying punishment: variables are only punished after 256 conflicts have passed, by multiplying their score by 0.5. Meanwhile, involvements increment the score by 1.
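The delayed punishment can be sketched as follows. This is an illustration only, not Chaff's implementation: the function name and the list-based score table are hypothetical, and applying the halving right after the increments of every 256th conflict is an assumption about the ordering at the boundary.

```python
def chaff_style_scores(conflicts, num_vars, period=256, decay=0.5):
    """Compute scores with increments per conflict and periodic halving.

    `conflicts` is a sequence of collections of variable indices, one
    collection per conflict, listing the variables involved in it.
    """
    scores = [0.0] * num_vars
    for count, involved in enumerate(conflicts, start=1):
        for var in involved:
            scores[var] += 1.0          # cheap update, involved variables only
        if count % period == 0:
            # Accumulated punishment: touch all variables, but only rarely.
            scores = [s * decay for s in scores]
    return scores


# Tiny example with a period of 2 conflicts instead of 256.
example = chaff_style_scores([[0], [0, 1], [1]], num_vars=2, period=2)
```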
MiniSAT 1.13 has shown that it is also possible, and much more accurate, efficient and effective, to just update the scores of the variables involved in the conflict. The same scheme is used in PicoSAT, and in the following we explain this optimized score calculation and relate it to our NVSIDS.
In MiniSAT’s new exponential VSIDS scheme (EVSIDS) variables are not punished, but the EVSIDS score $S_n$ has to be interpreted as $S_n = s_n \cdot f^{-n} / (1 - f)$, where $s_n$ denotes the corresponding NVSIDS score.
As the equation shows, the EVSIDS score is linearly related to NVSIDS and thus can be used instead of NVSIDS to compare the activity of variables. Moreover, it can be kept up to date by just adding $f^{-k}$ to the score of those variables involved in the $k$-th conflict. The EVSIDS scores of the other variables, of which there are usually many more, do not have to be touched.
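The linear relation can be checked numerically for a single variable. The sketch below rests on an assumption: it takes the NVSIDS update to be s ← f·s + (1 − f)·δ, where δ is 1 if the variable is involved in the conflict and 0 otherwise, mirroring the agility update of Section 2.3. The function names and the value of f are hypothetical, and exact rational arithmetic is used so that the comparison is not blurred by rounding.

```python
from fractions import Fraction


def nvsids_score(involved, f):
    """NVSIDS score after all conflicts (assumed update: s = f*s + (1-f)*delta)."""
    s = Fraction(0)
    for delta in involved:
        s = f * s + (1 - f) * delta
    return s


def evsids_score(involved, f):
    """EVSIDS score: add f**(-k) whenever the variable is in the k-th conflict."""
    total = Fraction(0)
    for k, delta in enumerate(involved, start=1):
        if delta:
            total += f ** (-k)
    return total


f = Fraction(19, 20)                 # hypothetical decay factor
involved = [1, 0, 1, 1, 0, 0, 1]     # involvement of one variable per conflict
n = len(involved)
s_n = nvsids_score(involved, f)
S_n = evsids_score(involved, f)
relation_holds = S_n == s_n * f ** (-n) / (1 - f)
```

Since the factor $f^{-n}/(1-f)$ is the same for every variable after n conflicts, comparing EVSIDS scores orders the variables exactly as comparing NVSIDS scores would.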
2.3 Average Number of Recently Flipped Assignments (ANRFA)
To obtain a concrete metric for the agility a, we follow the same idea as in our NVSIDS reformulation of VSIDS. The global variable a is initialized to zero and intuitively measures the average number of recently flipped assignments.
Whenever a variable v is forced to be assigned, a is updated. First, a is multiplied by a factor g with 0 < g < 1. If the assignment is a flip, i.e. it assigns to v the opposite of the value of its previous assignment, then we increment a by 1 − g.
Assignments of decision variables and of variables not assigned before do not increment a. As discussed for NVSIDS, this enforces 0 ≤ a ≤ 1, if we start with a = 0:
Also note that we do not need an “exponential” reformulation, as EVSIDS is for VSIDS, because there is only one single global agility counter.
A value of g = 0.9999 = 1 − 1/10000 was effective in our experiments. Slightly different values did not change the result much (in contrast to f in VSIDS). Note that there are orders of magnitude more assignments than conflicts in a SAT run, and therefore g naturally has to be much closer to 1 than f.
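The update rule of this section fits into a few lines. The following is a minimal sketch with hypothetical names; it assumes that the caller invokes the update only for forced assignments and reports a flip only if the variable had a previous value.

```python
class Agility:
    """Exponentially decaying average of recently flipped assignments."""

    def __init__(self, g=0.9999):
        assert 0.0 < g < 1.0
        self.g = g
        self.a = 0.0   # starts at zero and stays within [0, 1]

    def on_forced_assignment(self, is_flip):
        # Every forced assignment first decays the agility ...
        self.a *= self.g
        # ... and only a flip adds the complementary weight 1 - g.
        if is_flip:
            self.a += 1.0 - self.g
        return self.a


# With g = 0.5 the effect of each step is easy to follow by hand.
demo = Agility(g=0.5)
trace = [demo.on_forced_assignment(flip) for flip in (True, True, False)]
```

If a ≤ 1 holds before an update, then g·a + (1 − g) ≤ g + (1 − g) = 1 holds afterwards, which is why the value cannot leave the interval [0, 1] when starting from a = 0.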
We logged a on industrial and crafted benchmarks on which the old version of PicoSAT performed much worse than competitors. It turned out that in those cases where we conjectured that restarts should be slowed down, the agility a varied between 15% and 40%. For many industrial benchmarks a was way below 20%. Therefore we picked 20% as the limit at which a scheduled inner restart is disabled. Outer restarts are only disabled if the agility reaches 25% or more. Slightly different values do not change the experimental results much.
The restart schedule controls the garbage collection limit for learned clauses, as in MiniSAT. Thus the restart schedule per se should not change. If a scheduled restart is disabled, i.e. skipped, the solver simply does not backtrack and continues at the same decision level.
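The resulting control decision can be summarized in a small predicate. This is a sketch with hypothetical names; treating an agility exactly at the limit as disabling the restart is an assumption for the inner limit, since only the outer limit is stated as “25% or more”.

```python
INNER_LIMIT = 0.20   # agility at which scheduled inner restarts are skipped
OUTER_LIMIT = 0.25   # agility at which scheduled outer restarts are skipped


def restart_allowed(agility, outer):
    """Return True if a scheduled restart should actually backtrack.

    The schedule itself advances either way; a skipped restart only means
    that the solver continues at its current decision level.
    """
    limit = OUTER_LIMIT if outer else INNER_LIMIT
    return agility < limit
```

For example, at an agility of 22% a scheduled inner restart is skipped while a scheduled outer restart still takes place.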
Table 1. Number of solved instances: “adaptive = no” is without dynamic restart control, “adaptive = yes” uses the ANRFA agility a to disable backtracking. The columns sat, unsat, and solved denote the number of solved satisfiable instances, the number of solved unsatisfiable instances, and the sum of these two numbers. The timeout is only 900 seconds, which matches the one used in the SAT Race’06 but is much less than the time limit in the SAT Competition’07. The three rows with AAS-RSAT show the number of solved instances for a modified version of RSAT, which is more similar to PicoSAT. The percentages “25%” and “30%” are the two values used as the limit on the ANRFA agility a. Above this limit AAS-RSAT does not backtrack if a restart is scheduled.
3 Experiments
We added the calculation of ANRFA and the adaptive restart strategy to PicoSAT and measured its effect on the SAT Race’06 instances and the SAT’07 Competition benchmarks, with a timeout of 900 seconds and a memory limit of 1.5 GB, on Linux PCs with 3 GHz Pentium IV processors. As Tab. 1 shows, we slightly improved on industrial examples. PicoSAT with the adaptive restart schedule can solve 36% more crafted instances. This is mainly due to the improvement on unsatisfiable instances, where 50% more instances are solved.
We also implemented the suggested adaptive technique in RSAT 2.0, the version submitted to the SAT’07 Competition. Beforehand, we changed the basic restart interval from 512 to 100, as in PicoSAT, and always enforced saving and reusing phases, to match PicoSAT more closely. This results in an “aggressive always saving” RSAT, called AAS-RSAT, with and without adaptive restart control. Using adaptive control for restarts in RSAT is not as impressive as for PicoSAT, but we did not spend much time optimizing magic constants either.
4 Conclusion and Future Work
We presented a new adaptive restart strategy, which slows down restarts if the agility of the SAT solver is high. The key insight is to apply the same filtering technique to the number of flipped assignments as in a new reformulation of VSIDS. For PicoSAT, considerable performance improvements have been achieved. In future work we want to apply similar ideas to dynamically control the number of garbage-collected clauses or the limit on the number of conflicts.
References
1. Biere, A.: PicoSAT essentials. Journal on Satisfiability, Boolean Modeling and Computation (submitted, 2008)
2. Eén, N., Biere, A.: Effective preprocessing in SAT through variable and clause elimination. In: Bacchus, F., Walsh, T. (eds.) SAT 2005. LNCS, vol. 3569, Springer, Heidelberg (2005)
3. Gomes, C., Selman, B., Kautz, H.: Boosting combinatorial search through randomization. In: Proc. AAAI 1998 (1998)
4. Hoos, H.: An adaptive noise mechanism for WalkSAT. In: Proc. AAAI 2002 (2002)
5. Huang, J.: The effect of restarts on the eff. of clause learning. In: Proc. IJCAI 2007 (2007)
6. Kautz, H., Horvitz, E., Ruan, Y., Selman, B., Gomes, C.: Dynamic restart policies. In: Proc. AAAI 2002 (2002)
7. Luby, M., Sinclair, A., Zuckerman, D.: Optimal speedup of Las Vegas algorithms. Information Processing Letters 47 (1993)
8. Moskewicz, M., Madigan, C., Zhao, Y., Zhang, L., Malik, S.: Chaff: Engineering an efficient SAT solver. In: Proc. DAC 2001 (2001)
9. Pipatsrisawat, K., Darwiche, A.: RSat 2.0: SAT solver description. Technical Report D–153, Automated Reasoning Group, Comp. Science Dept., UCLA (2007)