Literature study: Evaluating CDCL Variable Scoring Schemes

Authors: Armin Biere and Andreas Fröhlich    ------ Biere is a leading expert and heads the team behind solvers such as CaDiCaL, YalSAT, and Lingeling




The VSIDS (variable state independent decaying sum) decision heuristic, invented in the context of the CDCL (conflict-driven clause
learning) SAT solver Chaff, is considered crucial for achieving high efficiency of modern SAT solvers on application benchmarks.


This paper proposes ACIDS (average conflict-index decision score), a variant of VSIDS. The ACIDS heuristic is compared to the original implementation of VSIDS, its popular modern implementation EVSIDS (exponential VSIDS), the VMTF (variable move-to-front) scheme, and other related decision heuristics.

Summary: ACIDS is the new VSIDS variant proposed here; it is evaluated against original VSIDS, EVSIDS, VMTF, and related heuristics.


They all share the important principle of selecting as decisions those variables that recently participated in conflicts. The main
goal of the paper is to provide an empirical evaluation to serve as a
starting point for trying to understand the reason for the efficiency of these decision heuristics.


In our experiments, it turns out that EVSIDS, VMTF, and ACIDS behave very similarly if implemented carefully.


1 Introduction

VSIDS  ------  variable state independent decaying sum decision heuristic

EVSIDS  ------  exponential VSIDS

The EVSIDS heuristic allows fast
selection of decision variables and adds focus to the search, but also is able to pick
up long-term trends due to a “smoothing” component, as argued in [6].



On the practical side, there have been various attempts to improve on the
EVSIDS scheme. These include the variable move-to-front (VMTF) strategy
of the Siege SAT solver [8], the BerkMin strategy [9], which is focusing on
recently learned clauses, and the clause move-to-front (CMTF) strategies of
HaifaSAT [10] and PrecoSAT [11].



VMTF  ------  variable move-to-front strategy

CMTF  ------  clause move-to-front strategies

    used in two solvers, HaifaSAT [10] and PrecoSAT [11]


ACIDS  ------  average conflict-index decision score, the strategy proposed in this paper



2 Decision Heuristics




Modulo initialization, typically based on (one-sided)
Jeroslow-Wang’s heuristic [20], phase saving turns the decision heuristic into a
variable selection heuristic.



Accordingly, we focus on variable selection, which in
turn will be based on selecting a variable with the highest decision score.




The original VSIDS implementation in Chaff
worked as follows:

Variables are stored in an array used to search for a decision
variable. After learning a clause, the score of its variables is incremented. Further,
every 256th conflict, all variable scores are divided by 2, and the array is sorted
w.r.t. decreasing score.
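The array-based scheme described above can be sketched as follows. This is a minimal illustration rather than Chaff's actual code: the class name `ChaffVSIDS`, the `decay_interval` parameter, the use of integer halving, and the `assigned` set are assumptions made for the example.

```python
class ChaffVSIDS:
    """Array-based VSIDS in the style described above (illustrative sketch)."""

    def __init__(self, num_vars, decay_interval=256):
        self.score = [0] * num_vars          # one counter per variable
        self.order = list(range(num_vars))   # array that decide() searches
        self.decay_interval = decay_interval
        self.conflicts = 0

    def on_learned_clause(self, clause_vars):
        # Bump: increment the score of every variable in the learned clause.
        for v in clause_vars:
            self.score[v] += 1
        self.conflicts += 1
        # Periodically halve all scores (integer halving is an assumption)
        # and re-sort the array by decreasing score.
        if self.conflicts % self.decay_interval == 0:
            self.score = [s // 2 for s in self.score]
            self.order.sort(key=lambda v: -self.score[v])

    def decide(self, assigned):
        # Return the first unassigned variable in array order, or None.
        for v in self.order:
            if v not in assigned:
                return v
        return None
```

Note that the array is only re-sorted at the decay step, so between two decay steps the decision order lags behind the current scores.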



The process of updating scores of variables is also referred to as variable bumping [7].

Note, however, that in modern solvers and also in our experiments
we not only bump variables of the learned clause, but all seen variables occur-
ring in antecedents used to derive the learned clause through a regular input
resolution chain [25] from existing clauses.







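INC  ------  VSIDS restricted to incrementing scores, without smoothing (inc in the experiments)

SUM  ------  adds the conflict-index to the score instead of incrementing it (sum in the experiments)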

The decide procedure selects the next decision variable, by searching for the
first unassigned variable in the ordered array, starting at the lower end, i.e., the
variable with the highest score during sorting.



An essential optimization in Chaff
is to cache the position of the last found decision variable with maximum score
in the ordered array. This position is used as starting point for the next search.



 If a variable in the array with a position smaller than the cached maximum score
position becomes unassigned then the maximum score position is updated to
that position. During rescoring, similar updates might be necessary.


The first part of VSIDS, i.e., only incrementing scores, constitutes an approximation of dynamic DLIS. It counts occurrences of variables in clauses, ignoring whether a clause is satisfied or not, or even removed during learned clause deletions [3] (called clause database reduction in the following).

Note: the count also ignores whether a learned clause has since been deleted. Learned clause deletion is referred to below as "clause database reduction".

This restricted version of VSIDS without smoothing is denoted INC (or inc in the experiments).

As an alternative to using frequent rescoring, we propose that the smoothing
part of VSIDS can also be approximated by adding the conflict-index to the score
instead of just incrementing it.



The conflict-index is the total number of conflicts
that occurred so far. We call this scheme SUM (or sum in our experiments).
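To see why adding the conflict-index approximates smoothing, the two bump rules can be compared on a small example. The function names and the `replay` helper are hypothetical and serve only this illustration.

```python
def bump_inc(score, conflict_index):
    # INC: plain occurrence counting, no smoothing.
    return score + 1


def bump_sum(score, conflict_index):
    # SUM: add the number of conflicts seen so far, so late bumps weigh more.
    return score + conflict_index


def replay(bump, bump_times):
    """Score of a variable that was bumped at the given conflict indices."""
    score = 0
    for conflict_index in bump_times:
        score = bump(score, conflict_index)
    return score


old_var = [1, 2, 3]      # bumped three times, long ago
recent_var = [99, 100]   # bumped twice, recently

# INC prefers the variable with more bumps, regardless of when they happened.
print(replay(bump_inc, old_var), replay(bump_inc, recent_var))
# SUM prefers the recently bumped variable.
print(replay(bump_sum, old_var), replay(bump_sum, recent_var))
```

Under INC the old variable wins with 3 against 2, while under SUM the recent variable wins with 199 against 6.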













Variable selection heuristics can be seen as online sorting algorithms of variable scores.



This view suggests using online algorithms with efficient amortized complexity, such as move-to-front (MTF) [30].


A similar motivation was given in the master's thesis of Lawrence Ryan [8], which precedes MiniSAT [7] and
introduced the Siege SAT solver as well as the variable move-to-front (VMTF, or
vmtf in the experiments) strategy.

Summary: Ryan's thesis [8] predates MiniSAT [7]; it introduced both the Siege solver and VMTF.


As in Chaff, the restriction in Siege’s VMTF bumping scheme was to only move variables in the learned clause.


Actually, only a small subset of those variables, e.g., of size 8, was selected, according to [8].


The restriction in Siege to move only a small subset of variables might have
been partially motivated by the cost of moving many.


It is not uncommon that tens of thousands of variables occur in antecedents of a learned clause, and the learned clauses themselves
are also rather long for some instances.



In our experiments in Sect. 4, the default decision heuristic (evsids in Tab. 2) bumped on average 276 literals per learned
clause of average length 105 (on 275 considered instances).


Unfortunately, details on how even this restricted version of VMTF is implemented in Siege were not
provided. The source code is not available either. We give details for a fast
implementation of unrestricted VMTF in Sect. 3.



We describe how the VMTF scheme can be implemented efficiently, as well as
how these techniques can be lifted to implement a generic priority queue, which
(empirically) is efficient for all the considered scoring schemes.



This new implementation of a priority queue for variable selections combines ideas originally
implemented in Chaff [5] and JeruSAT [29], but adds additional optimizations and works with arbitrary precise floating-point scores, in contrast to an imprecise earlier version implemented in Lingeling [31].



Variable scores play a role while (a) bumping variables participating in deriving a learned clause, (b) deciding or searching for the next decision variable, (c) unassigning variables during backtracking, (d) rescoring variable scores either for explicit smoothing in VSIDS or due to protecting scores from overflow during bumping, and (e) comparing past decisions on the trail to maximize trail reuse [28].

Open question: could this also be carried over to SLS solvers?

First, we explain a fast implementation for VMTF, focusing on (a)-(c). Next, we address its extension to precise scoring schemes using floating-point
numbers, which in previous implementations followed the example set by MiniSAT to use a binary heap data structure. Last, we discuss (d) and (e).





3.1 Fast Queue for VMTF

According to Sect. 2, the score of a variable in VMTF is the conflict-index, i.e., the number of conflicts at the point a variable was last bumped.



With this score definition, VMTF can be simulated with a binary heap.



However, every bump then needs a logarithmic number of steps to “bubble-up” a bumped variable in the heap.



Instead, a queue, implemented as a doubly linked list which holds all variables, only requires two simple constant time operations for bumping:
dequeue the variable and enqueue it back at the end of the list, which we consider
as head. Even storing the score seems to be redundant.
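The two constant-time operations can be sketched with index-based links. This is a minimal sketch under assumptions: the class name `VMTFQueue`, the initial order (variable 0 at the tail), and the `order_from_head` helper are illustrative and not taken from the paper.

```python
class VMTFQueue:
    """Doubly linked list over variable indices; bumping moves to the head."""

    def __init__(self, num_vars):
        # prev/next hold neighbouring variable indices; None marks an end.
        self.prev = [v - 1 if v > 0 else None for v in range(num_vars)]
        self.next = [v + 1 if v + 1 < num_vars else None
                     for v in range(num_vars)]
        self.tail = 0 if num_vars else None
        self.head = num_vars - 1 if num_vars else None

    def _dequeue(self, v):
        p, n = self.prev[v], self.next[v]
        if p is None:
            self.tail = n
        else:
            self.next[p] = n
        if n is None:
            self.head = p
        else:
            self.prev[n] = p
        self.prev[v] = self.next[v] = None

    def _enqueue_at_head(self, v):
        self.prev[v], self.next[v] = self.head, None
        if self.head is None:
            self.tail = v
        else:
            self.next[self.head] = v
        self.head = v

    def bump(self, v):
        # Two constant-time steps; no score is stored anywhere.
        self._dequeue(v)
        self._enqueue_at_head(v)

    def order_from_head(self):
        out, v = [], self.head
        while v is not None:
            out.append(v)
            v = self.prev[v]
        return out
```

For example, after `q = VMTFQueue(4); q.bump(1)`, walking from the head yields `[1, 3, 2, 0]`.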




To find the next decision variable in the queue, we could start at the end (head) of the queue and traverse it backwards until an unassigned variable is found.


Unfortunately, this algorithm has quadratic accumulated complexity.



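For example, consider an instance with 10000 variables and a single clause containing all variables in default phase. Each decision then walks past all previously assigned variables, so the searches take on the order of 10000^2 / 2 steps in total.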


However, we can employ the same optimization as used in Chaff (see Sect. 2) and remember the variable up to which the last search
proceeded until finding an unassigned variable.



Since the solver will restart the next search at this variable, we call this reference next-search.



During backtracking, variables are unassigned and (as in Chaff) next-search
potentially has to be updated to such an unassigned variable if it sits further
down the queue closer to head than the next-search variable.



But in reverse order, i.e., while we prefer the variable with largest score at the end
of the queue, Chaff had the variable with largest score at the first array position.


In order to achieve this, we could use the scores of the variables for comparing queue position.


However, in VMTF, variables bumped at the same conflict all get the same score,
and thus simply using the score leads to violation of the following important
invariant: variables right of next-search (closer to head) are assigned.




To fix this problem, we globally count enqueue operations to the queue with an enqueue-counter and remember with each variable the value of the enqueue-counter at the point the variable was enqueued as enqueue-time.



Thus, the enqueue-time precisely captures the order of the elements in the queue and can
be used to precisely compare the relative positions of variables in the queue.
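The interplay of next-search and enqueue-times can be sketched as follows. This is one possible reading of the description, not the Lingeling code: the class name `VMTFDecider`, the method names, and the use of `-1` as an end marker are assumptions, and at least one variable is assumed.

```python
class VMTFDecider:
    """VMTF queue with next-search and enqueue-times (illustrative sketch)."""

    def __init__(self, num_vars):
        # Doubly linked list over variable indices; -1 means "no neighbour".
        # Variable num_vars - 1 starts at the head (assumes num_vars >= 1).
        self.prev = [v - 1 for v in range(num_vars)]
        self.next = [v + 1 for v in range(num_vars - 1)] + [-1]
        self.head = num_vars - 1
        self.assigned = [False] * num_vars
        self.enqueue_time = list(range(num_vars))  # position stamp per variable
        self.enqueue_counter = num_vars
        self.next_search = self.head

    def bump(self, v):
        if v != self.head:
            p, n = self.prev[v], self.next[v]  # n exists: v is not the head
            if p != -1:
                self.next[p] = n
            self.prev[n] = p
            self.prev[v], self.next[v] = self.head, -1
            self.next[self.head] = v
            self.head = v
        self.enqueue_time[v] = self.enqueue_counter
        self.enqueue_counter += 1
        if not self.assigned[v]:
            self.next_search = v   # an unassigned variable now sits at the head

    def decide(self):
        # Resume where the last search stopped instead of at the head.
        v = self.next_search
        while v != -1 and self.assigned[v]:
            v = self.prev[v]
        if v == -1:
            return None            # every variable is assigned
        self.next_search = v
        return v

    def assign(self, v):
        self.assigned[v] = True

    def unassign(self, v):
        self.assigned[v] = False
        # Enqueue-times, not scores, decide which variable is closer to head.
        if self.enqueue_time[v] > self.enqueue_time[self.next_search]:
            self.next_search = v
```

The comparison in `unassign` restores the invariant that every variable closer to the head than next-search is assigned, even when several variables were bumped at the same conflict.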


In the actual implementation, we use a 32-bit integer for the enqueue-counter, which occasionally, e.g., after billions of enqueue operations, requires reassigning enqueue-times to all queue elements in a linear scan of the queue.
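The linear rescan can be as simple as the sketch below. The function name `restamp` and the dictionary-based links are assumptions for the example; only the relative order of the stamps matters, so fresh consecutive values from tail to head are sufficient.

```python
def restamp(tail, next_link):
    """Reassign enqueue-times 0, 1, 2, ... while walking from tail to head.

    next_link maps a variable to its neighbour towards the head; the head
    itself has no entry. Returns the new stamps and the next counter value.
    """
    enqueue_time = {}
    counter = 0
    v = tail
    while v is not None:
        enqueue_time[v] = counter
        counter += 1
        v = next_link.get(v)
    return enqueue_time, counter


# Queue order from tail to head: c, a, b.
stamps, counter = restamp("c", {"c": "a", "a": "b"})
```

After the scan, the counter restarts just above the largest stamp in use.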



Note that, in a dedicated queue implementation for VMTF (like queue in our experiments),
the scores become redundant again, after adding enqueue-times.




3.2 Generic Queue for all Decision Heuristics

For other schemes, it is tempting to also just use a queue implemented as doubly
linked list as for VMTF, maintaining both scores and enqueue-times.



Every operation remains constant time except for bumping. We have to ensure that
the queue is sorted w.r.t. score.



However, only for VMTF, bumped variables are guaranteed to be enqueued at the end (head) of the queue, i.e., in constant time.


For other scoring schemes, a linear search is required to find the right position,
which risks an accumulated quadratic bumping effort.


To reduce enqueue time, we propose three optimizations and two modifications to the bumping order.


The first optimization is inspired by bucket sort and already gives acceptable bumping times for EVSIDS.



It is motivated by the following observation.

For EVSIDS, rescoring to avoid floating-point overflow of scores and score increment
occurs quite frequently, e.g., roughly every 2000 conflicts, as Tab. 2 suggests.


Thus, the exponents of variable scores represented as floating-point numbers
will tend to span the whole range of possible values.


So instead of a single queue, we keep a stack of queues, indexed by the exponent of the scores of variables.

Variables belong to the queue of the floating-point exponent of their score.


As the motivation on rescoring shows, this stack will soon grow to its maximum
size for EVSIDS, but for other scoring schemes (particularly for VMTF or INC)
it will only have very few elements or even just one.



Note that, since exponents can be negative, the actual index to access the stack
is obtained after adding the negation of the minimum negative exponent.
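The index computation can be illustrated with ordinary IEEE doubles instead of Lingeling's software floats. The bounds `MIN_EXPONENT` and `MAX_EXPONENT`, the clamping, and the treatment of a zero score are assumptions chosen for this sketch.

```python
import math

MIN_EXPONENT = -512   # assumed smallest exponent kept after truncation
MAX_EXPONENT = 511    # assumed largest exponent before rescoring kicks in


def queue_index(score):
    """Map a non-negative score to the index of its queue in the stack."""
    if score == 0.0:
        return 0                      # zero shares the lowest queue (assumption)
    _, exponent = math.frexp(score)   # score == m * 2**exponent, 0.5 <= m < 1
    exponent = max(MIN_EXPONENT, min(MAX_EXPONENT, exponent))
    return exponent - MIN_EXPONENT    # shift so the index is never negative


# Scores with the same binary exponent share a queue.
print(queue_index(1.0), queue_index(1.5), queue_index(3.0))
```

Because the exponent grows monotonically with the score, a higher queue index always means a higher score, so the stack of queues stays globally ordered.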


Furthermore, Lingeling uses its own implementation of floating-points, in order to
make execution of Lingeling deterministic across different hardware, compilers,
and compiler flags.

Summary: Lingeling implements its own floating-point numbers so that its runs are deterministic across hardware, compilers, and compiler flags.

These software floats have a 32 bit exponent, but we restrict
exponents to 10 bits including a sign bit, by proper rescoring of large scores and
truncation of small scores.


MiniSAT/Glucose use 10^100 as an upper score limit,
which is only a slightly smaller maximum limit than ours, 2^512 ≈ 10^154, but then
do not use any truncation for small scores, which means that the minimum score
exponent in MiniSAT is (roughly) −2^10.

Summary: MiniSAT/Glucose cap scores at 10^100, slightly below the 2^512 ≈ 10^154 limit used here, and never truncate small scores.

So Lingeling uses 9 bits for positive scores and 9 bits for negative scores, while MiniSAT uses slightly less than 9 bits for positive scores and (almost) full 10 bits for negative scores.



When searching for decisions as well as during backtracking, more specifically
during unassigning variables, we additionally have to maintain the highest exponent
of an unassigned variable.


This follows the same idea as for next-search in
a single queue and only adds constant time effort for all considered operations.



During conflict analysis, variables participating in resolutions to derive a learned clause are collected on a seen-variables stack, before they are bumped (or discarded if on-the-fly subsumption succeeds).


The analysis traverses the trail of assigned variables in reverse order. Thus, there is a similarity between
the order of variables on the seen-variables stack and the reverse order of assignments.



However, this is not guaranteed, particularly for variables with smaller
decision-level. The order of bumping these variables then follows this order too.



At a conflict, it can happen that thousands of variables with different score
are bumped and end up in almost random order w.r.t. score order on the
seen-variables stack (or worse, in reverse order) before they are bumped. For many of
these variables, even for EVSIDS, the new updated score might end up having the
same exponent and all those variables have to be enqueued to the same queue.
However, since their scores still differ, enqueueing them degrades to insertion sort.
There are instances where bumping leads to a time-out due to this effect.


A first modification to the order in which variables are bumped prevents
this problem. Before actually first dequeuing a bumped variable, then updating its
score, and finally enqueueing it back, we sort the seen-variables stack w.r.t. increasing
score. However, a similar problem occurs if all bumped variables have the same
score exponent, which also does not change during update. This is for instance
almost always the case for INC. The second modification prevents this corner
case by first dequeuing all variables on the seen-variables stack, and only then
updating their score and enqueueing them back in score order.
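The bumping order after both modifications can be sketched on a plain sorted list. This is a stand-in for illustration only: the paper uses linked queues, whereas here a Python list of `(score, variable)` pairs plus `bisect` plays that role, and the names `bump_all` and `new_score` are hypothetical. The list-based dequeue step is linear in the queue size, which the real data structure avoids.

```python
import bisect


def bump_all(queue, score, seen, new_score):
    """Bump all seen variables.

    queue: list of (score, var) pairs sorted ascending; head is the last item.
    score: dict mapping each variable to its current score.
    new_score: function computing the updated score from the old one.
    """
    # Second modification: dequeue every seen variable first ...
    seen_set = set(seen)
    queue[:] = [(s, v) for (s, v) in queue if v not in seen_set]
    # ... only then update the scores ...
    for v in seen:
        score[v] = new_score(score[v])
    # ... and enqueue in increasing score order (first modification), so each
    # variable lands at or near the current end of the queue.
    for v in sorted(seen, key=lambda v: score[v]):
        bisect.insort(queue, (score[v], v))


score = {0: 1.0, 1: 5.0, 2: 3.0, 3: 2.0}
queue = sorted((s, v) for v, s in score.items())
bump_all(queue, score, [2, 0], lambda s: s + 10.0)
print([v for _, v in queue])
```

Enqueueing in increasing score order means that each insertion position lies after the previous one, which is what keeps the linear search short in the linked-list version.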


While EVSIDS exponents of variable scores are more or less spread out, other schemes do not have this property, clearly not INC, but probably also SUM and
ACIDS to a smaller extent. For these schemes, score exponents might cluster
around some few values. Thus, our second optimization repeats the bucket
sort argument w.r.t. some fixed number of highest bits of the mantissa of a
variable score. For each queue (indexed by exponent), we add another cache-table
(indexed by highest bits of mantissa) of references pointing to the last
element in the queue with matching highest mantissa bits.

Summary: EVSIDS spreads score exponents widely, whereas INC, and probably SUM and ACIDS to a lesser degree, cluster them around a few values. Each exponent queue therefore gets a cache-table, indexed by the top mantissa bits, that points to the last queue element with those bits.

This ensures that
these variables referenced in the cache-table have the maximum score among
variables in this queue with the same highest bits of the mantissa of their score.
In our implementation, we use the highest 8 bits and thus a cache-table of size
256. This cache is only used for fast enqueue and can be ignored otherwise.

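Summary: each cache-table entry references the highest-scoring variable among those in its queue that share the same top mantissa bits. The implementation uses the top 8 bits, hence 256 entries, and consults the cache only for fast enqueueing.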
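How a score maps to one of the 256 cache-table slots can be illustrated with IEEE doubles. The function name `mantissa_slot` is hypothetical, the score is assumed to be positive, and Lingeling's own software floats may lay out the mantissa differently.

```python
import math


def mantissa_slot(score, bits=8):
    """Index of a positive score in a cache-table of size 2**bits."""
    m, _ = math.frexp(score)      # 0.5 <= m < 1 for score > 0
    fraction = 2.0 * m - 1.0      # drop the implicit leading one: 0 <= f < 1
    return int(fraction * (1 << bits))


# Within one exponent queue (here scores in [1, 2)), the slot grows with
# the score, splitting the queue into 256 consecutive sub-ranges.
print(mantissa_slot(1.0), mantissa_slot(1.5), mantissa_slot(1.99609375))
```

Scores from different exponent queues can share a slot number, for example 1.5 and 3.0, which is harmless because each queue has its own table.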

If bumping individual variables is done in the order of their scores, as suggested
by the first modification above, there is a high chance that consecutively
bumped variables end up in the same queue one after each other or at least close
to each other. Thus, as a third optimization, we propose to additionally cache
the last-enqueued variable for each (sub) queue consisting of variables with the
same highest mantissa bits.


In an enqueue operation, we first check whether the
corresponding cache-table entry of the second optimization points to a variable
with smaller (or equal) score. If this is the case, we enqueue right next to it. Otherwise,
we obtain the last-enqueued variable and start searching for the proper
enqueue position from there towards the end, i.e., towards larger scores.


This might fail if the score of the last-enqueued variable is larger or if the last-enqueue
reference is not valid, e.g., if the variable is already dequeued. We then search
backwards from the cache-table reference (towards smaller scores).




Altogether, these optimizations and modifications seem to avoid the most
severe worst-case corner cases. We track this by profiling relative and total decide
and particularly bump time per instance. The total times summed over all
instances are shown in Tab. 2. Further distribution plots are included in the
additional material, mentioned in the results in Sect. 4.



3.3 Rescore, Reuse-Trail and Complexity

For the original array based VSIDS implementation, rescoring requires sorting
variables. For a binary heap implementation, one would expect that the heap
does not change, since rescoring does not change the relative order of variables.
However, due to finite precision of scores, even when using floating-points, rescoring will make the score of some variables the same, even though they differed in
score before rescoring.




Moreover, scores of many variables will become zero after
a few rescores (particularly in EVSIDS). In this situation, the binary heap will
only remain unchanged after rescoring if the actual scores are the only means to
compare variables (and for instance the variable index is not used as a tie breaker
for comparing variables with the same score). The same argument applies to our
improved queue based implementation.


The reuse-trail optimization [28] is based on the following observation. After
a restart, it often happens that the same decisions are taken and the trail ends
up with the same assigned variables. Thus, the whole restart was useless. By
comparing scores of assigned previous decisions with the score of the next decision variable before restarting, this situation can be avoided.
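One way to read this comparison is sketched below: walk the previous decisions in trail order and keep every leading decision level whose decision variable still scores at least as high as the next candidate. The function name `reuse_level`, the tie handling, and the data layout are assumptions for the example; see [28] for the actual technique.

```python
def reuse_level(decision_vars, score, next_var):
    """Number of leading decision levels that a restart can keep.

    decision_vars: decision variables in trail order (level 1 first).
    score: dict mapping each variable to its current score.
    next_var: unassigned variable that decide would pick next.
    """
    keep = 0
    for v in decision_vars:
        if score[v] < score[next_var]:
            break          # from here on the restart would decide differently
        keep += 1
    return keep


score = {"a": 9.0, "b": 7.0, "c": 3.0, "n": 5.0}
print(reuse_level(["a", "b", "c"], score, "n"))
```

In this example the solver would backtrack to level 2 instead of level 0, since the decisions on `a` and `b` would be repeated anyway.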



With some effort, this technique can be lifted to our generic queue implementation. To simplify
the comparison in favor of a clean experiment, the results presented in Sect. 4
are without reuse-trail (except for sc14ayv, the old 2014 version of Lingeling).


While we do not have a precise complexity analysis for this new data structure, our empirical results show that it performs almost as well as a dedicated
binary heap for EVSIDS (heap) and as a dedicated simplified queue for VMTF (queue).



This makes our empirical comparison of decision heuristics more accurate since they all use the same implementation. This data structure should also
make it possible to experiment with new scoring schemes without the need to implement dedicated data structures.



It might also be possible to improve it further, while our binary heap implementation is close to being as fast and compact as possible.








