An Extensible SAT-solver
Cite this paper as:
Eén N., Sörensson N. (2004) An Extensible SAT-solver. In: Giunchiglia E., Tacchella A. (eds) Theory and Applications of Satisfiability Testing. SAT 2003. Lecture Notes in Computer Science, vol 2919. Springer, Berlin, Heidelberg. https://doi-org-s.era.lib.swjtu.edu.cn/10.1007/978-3-540-24605-3_37
Chalmers University of Technology, Sweden
Abstract
In this article, we present a small, complete, and efficient SAT-solver in the style of conflict-driven learning, as exemplified by Chaff. We aim to give sufficient details about implementation to enable the reader to construct his or her own solver in a very short time. This will allow users of SAT-solvers to make domain specific extensions or adaptations of current state-of-the-art SAT-techniques, to meet the needs of a particular application area. The presented solver is designed with this in mind, and includes among other things a mechanism for adding arbitrary boolean constraints. It also supports solving a series of related SAT-problems efficiently by an incremental SAT-interface.
1 Introduction
The use of SAT-solvers in various applications is on the march. As insight on how to efficiently encode problems into SAT is increasing, a growing number of problem domains are successfully being tackled by SAT-solvers. This is particularly true for the electronic design automation (EDA) industry [BC+99, Lar92]. The success is further magnified by current state-of-the-art solvers being adapted to meet the specific characteristics of these problem domains [AR+02, ES03].
Likewise, writing a solver from scratch often means spending much time rediscovering the intricate details of a correct and efficient solver. Thus, the principal goal of this article is to bridge the gap between existing descriptions of SAT-techniques and their actual implementation.
The presented code includes an incremental SAT-interface, which allows for a series of related problems to be solved with potentially huge efficiency gains [ES03]. We also generalize the expressiveness of the SAT-problem formulation by providing a mechanism for defining arbitrary constraints over boolean variables.
From the documentation in this paper we hope it is possible for you to implement a fresh SAT-solver in your favorite language, or to grab the C++ version of MINISAT from the net and start modifying it to include new and interesting ideas.
2 Application Programming Interface
For a standard SAT-problem, the interface is used in the following way: Variables are introduced by calling newVar(). From these variables, clauses are built and added by addClause(). Trivial conflicts, such as two unit clauses {x} and {¬x} being added, can be detected by addClause(), in which case it returns False. From this point on, the solver state is undefined and must not be used further. If no such trivial conflict is detected during the clause insertion phase, solve() is called with an empty list of assumptions. It returns False if the problem is unsatisfiable, and True if it is satisfiable, in which case the model can be read from the public vector “model”.
The simplifyDB() method can be used before calling solve() to simplify the set of problem constraints (often called the constraint database). In our implementation, simplifyDB() will first propagate all unit information, then remove all satisfied constraints. As for addClause(), the simplifier can sometimes detect a conflict, in which case False is returned and the solver state is, again, undefined and must not be used further.
If the solver returns satisfiable, new constraints can be added repeatedly to the existing database and solve() run again. However, more interesting sequences of SAT-problems can be solved by the use of unit assumptions. When passing a non-empty list of assumptions to solve(), the solver temporarily assumes the literals to be true. After finding a model or a contradiction, these assumptions are undone, and the solver is returned to a usable state, even when solve() returns False, which now should be interpreted as unsatisfiable under assumptions.
For this to work, calling simplifyDB() before solve() is no longer optional. It is the mechanism for detecting conflicts independent of the assumptions (referred to as a top-level conflict from now on), which put the solver in an undefined state. For an example of the use of unit assumptions, see [ES03].
An alternative interface would be for solve() to return one of three values: satisfiable, unsatisfiable, or unsatisfiable under assumptions. This is indeed a less error-prone interface as there is no longer a pre-condition on the use of solve(). The current interface, however, represents the smallest modification of a non-incremental SAT-solver.
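The calling sequence described in this section can be illustrated with a toy stand-in. The sketch below is not MINISAT: ToySolver is a hypothetical class that borrows the method names newVar(), addClause(), and solve() from the text but decides satisfiability by brute-force enumeration, and its conflict check only recognizes complementary unit clauses. Representing literals as signed integers (+v for a variable, -v for its negation) is an assumption made for the example.

```python
from itertools import product


class ToySolver:
    """Hypothetical stand-in that mimics the described interface by brute force."""

    def __init__(self):
        self.n_vars = 0
        self.clauses = []
        self.model = []              # model[v] is the value of variable v (index 0 unused)

    def newVar(self):
        self.n_vars += 1
        return self.n_vars

    def addClause(self, lits):
        lits = set(lits)
        # Trivial conflict: a unit clause that contradicts an earlier unit clause.
        if len(lits) == 1 and {-next(iter(lits))} in self.clauses:
            return False
        self.clauses.append(lits)
        return True

    def solve(self, assumptions=()):
        for bits in product([False, True], repeat=self.n_vars):
            def value(lit):
                return bits[abs(lit) - 1] == (lit > 0)
            assumed = all(value(a) for a in assumptions)
            satisfied = all(any(value(l) for l in c) for c in self.clauses)
            if assumed and satisfied:
                self.model = [None] + list(bits)
                return True
        return False


s = ToySolver()
x, y = s.newVar(), s.newVar()
ok = s.addClause([x, y]) and s.addClause([-x, y])
sat = ok and s.solve([])         # empty assumption list: a plain SAT call
unsat_under = s.solve([-y])      # the clauses force y, so assuming -y fails
still_usable = s.solve([])       # the assumption was only temporary
```

The last two calls show the point of the assumption mechanism: a False answer under assumptions leaves the solver usable, whereas a False from addClause() does not.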
3 Overview of the SAT-solver
This article will treat the popular style of SAT-solvers based on the DPLL algorithm [DLL62], backtracking by conflict analysis and clause recording (also referred to as learning) [MS96], and boolean constraint propagation (BCP) using watched literals [MZ01]. We will refer to this style of solver as a conflict-driven SAT-solver. The components of such a solver, and indeed a more general constraint solver, can be conceptually divided into three categories:
• Representation. Somehow the SAT-instance must be represented by internal data structures, as must any derived information.
• Inference. Brute force search is seldom good enough on its own. A solver also needs some mechanism for computing and propagating the direct implications of the current state of information.
• Search. Inference is almost always combined with search to make the solver complete. The search can be viewed as another way of deriving information.
A standard conflict-driven SAT-solver can represent clauses (with two literals or more) and assignments. Although the assignments can be viewed as unit-clauses, they are treated specially, and are best viewed as a separate type of information.
The only inference mechanism used by a standard solver is unit propagation. As soon as a clause becomes unit under the current assignment (all literals except one are false), the remaining unbound literal is asserted, possibly making more clauses unit. The process continues until no more information can be propagated.
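Unit propagation as just described can be sketched in a few lines. This is a naive version that rescans every clause until nothing changes; it is meant only to make the definition concrete and is not how MINISAT propagates. The function name and the signed-integer literal encoding are assumptions of the example.

```python
def unit_propagate(clauses, assignment):
    """Extend `assignment` (dict: var -> bool) until no clause is unit.

    Returns the first clause found with all literals false, or None.
    """
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            unbound, satisfied = [], False
            for lit in clause:
                val = assignment.get(abs(lit))
                if val is None:
                    unbound.append(lit)
                elif val == (lit > 0):
                    satisfied = True
                    break
            if satisfied:
                continue
            if not unbound:
                return clause                    # conflict: every literal is false
            if len(unbound) == 1:                # unit: assert the remaining literal
                lit = unbound[0]
                assignment[abs(lit)] = lit > 0
                changed = True
    return None


clauses = [[-1, 2], [-2, 3], [-3, -4]]
assignment = {1: True}
conflict = unit_propagate(clauses, assignment)
```

Starting from the single assignment of variable 1, each clause in turn becomes unit, so the call binds variables 2, 3, and 4 without reporting a conflict.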
The search procedure of a modern solver is the most complex part. Heuristically, variables are picked and assigned values (assumptions are made), until the propagation detects a conflict (all literals of a clause have become false). At that point, a so-called conflict clause is constructed and added to the SAT problem. Assumptions are then canceled by backtracking until the conflict clause becomes unit, at which point it is propagated and the search process continues.
MINISAT is extensible with arbitrary boolean constraints. This will affect the representation, which must be able to store these constraints; the inference, which must be able to derive unit information from these constraints; and the search, which must be able to analyze and generate conflict clauses from the constraints. The mechanism we suggest for managing general constraints is very lightweight, and by making the dependencies between the SAT-algorithm and the constraints implementation explicit, it adds to the clarity of the solver.
Propagation. The propagation procedure of MINISAT is largely inspired by that of CHAFF [MZ01]. For each literal, a list of constraints is kept. These are the constraints that may propagate unit information (variable assignments) if the literal becomes True. For clauses, no unit information can be propagated until all literals except one have become False. Two unbound literals p and q of the clause are therefore selected, and references to the clause are added to the lists of ¬p and ¬q respectively. The literals are said to be watched and the lists of constraints are referred to as watcher lists. As soon as a watched literal becomes False (that is, its negation becomes True), the constraint is invoked to see if information may be propagated, or to select new unbound literals to be watched.
An effect of using watches for clauses is that on backtracking, no adjustment to the watcher lists needs to be done. Backtracking is therefore cheap. However, for other constraint types, this is not necessarily a good approach. MINISAT therefore supports the optional use of undo lists for those constraints, storing what constraints need to be updated when backtracking unbinds a variable.
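The watcher scheme for clauses can be sketched as follows. This is an illustration under stated assumptions, not MINISAT's code: the class and method names are hypothetical, literals are signed integers, the two watched literals are kept in positions 0 and 1 of each clause, and clauses are assumed to be added while their first two literals are still unbound. A clause watching literal p is stored in the list of -p, so it is visited exactly when p becomes false.

```python
from collections import defaultdict, deque


class WatchedClauses:
    def __init__(self):
        self.value = {}                      # var -> bool
        self.watches = defaultdict(list)     # literal -> clauses visited when it becomes True
        self.queue = deque()                 # literals waiting to be propagated

    def lit_value(self, lit):
        v = self.value.get(abs(lit))
        return None if v is None else v == (lit > 0)

    def add_clause(self, lits):
        clause = list(lits)
        assert len(clause) >= 2
        self.watches[-clause[0]].append(clause)
        self.watches[-clause[1]].append(clause)

    def enqueue(self, lit):
        val = self.lit_value(lit)
        if val is not None:
            return val                       # already bound: fine if True, conflict if False
        self.value[abs(lit)] = lit > 0
        self.queue.append(lit)
        return True

    def propagate(self):
        """Return a conflicting clause, or None if propagation completes."""
        while self.queue:
            p = self.queue.popleft()         # p became True, so -p is now False
            pending = self.watches[p]
            self.watches[p] = []
            for i, clause in enumerate(pending):
                if not self._visit(clause, p):
                    self.watches[p].extend(pending[i + 1:])
                    self.queue.clear()
                    return clause
        return None

    def _visit(self, clause, p):
        if clause[0] == -p:                  # keep the false watched literal in position 1
            clause[0], clause[1] = clause[1], clause[0]
        if self.lit_value(clause[0]) is True:
            self.watches[p].append(clause)   # clause already satisfied: keep the watch
            return True
        for k in range(2, len(clause)):
            if self.lit_value(clause[k]) is not False:
                clause[1], clause[k] = clause[k], clause[1]
                self.watches[-clause[1]].append(clause)   # watch a new literal instead
                return True
        self.watches[p].append(clause)       # no replacement found: clause is unit or conflicting
        return self.enqueue(clause[0])


w = WatchedClauses()
w.add_clause([-1, 2, 3])
w.add_clause([-3, 4])
w.enqueue(1)
w.enqueue(-2)
conflict = w.propagate()
```

Nothing in this structure refers to the order of assignments, so undoing assignments on backtracking requires no change to the watcher lists.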
Learning. The learning procedure of MINISAT follows the ideas of Marques-Silva and Sakallah in [MS96]. The process starts when a constraint becomes conflicting (impossible to satisfy) under the current assignment. The conflicting constraint is then asked for a set of variable assignments that make it contradictory. For a clause, this would be all the literals of the clause (which are False under a conflict). Each of the variable assignments returned must be either an assumption of the search procedure, or the result of some propagation of a constraint. The propagating constraints are in turn asked for the set of variable assignments that made the propagation occur, continuing the analysis backwards. The procedure is repeated until some termination condition is met, resulting in a set of variable assignments that implies the conflict. A clause prohibiting that particular assignment is added to the clause database. This learnt (conflict) clause will always be implied by the original problem constraints.
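The backward analysis can be sketched with the simplest possible termination condition: keep expanding propagated assignments until only assumptions of the search remain. The text leaves the termination condition open, so this choice is an assumption of the example and not necessarily the one MINISAT makes. The function name and the `reason` mapping (true literal to the clause that propagated it, or None for a decision) are hypothetical.

```python
def analyze_to_decisions(conflict_clause, reason):
    """Trace a conflict back to the decisions that imply it.

    Returns a learnt clause: the disjunction of the negated decisions.
    """
    seen = set()
    decisions = set()
    # Every literal of the conflicting clause is false, so its negation is a true assignment.
    stack = [-lit for lit in conflict_clause]
    while stack:
        p = stack.pop()
        if p in seen:
            continue
        seen.add(p)
        if reason[p] is None:
            decisions.add(p)                 # an assumption made by the search
        else:
            # All other literals of the reason clause were false when p was propagated.
            stack.extend(-q for q in reason[p] if q != p)
    return sorted(-d for d in decisions)


# Decisions: 1, 6, 5.  Propagations: 2 from [-1, 2], 3 from [-2, -5, 3], 4 from [-3, 4].
reason = {1: None, 6: None, 5: None,
          2: [-1, 2], 3: [-2, -5, 3], 4: [-3, 4]}
learnt = analyze_to_decisions([-3, -4], reason)
```

Decision 6 plays no part in the conflict and therefore does not appear in the learnt clause, which prohibits assuming 1 and 5 together.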
Learnt clauses serve two purposes: they drive the backtracking and they speed up future conflicts by “caching” the reason for the conflict. Each clause will prevent only a constant number of inferences, but as the recorded clauses start to build on each other and participate in the unit propagation, the accumulated effect of learning can be massive. However, as the set of learnt clauses increases, propagation is slowed down. Therefore, the number of learnt clauses is periodically reduced, keeping only the clauses that seem useful by some heuristic.
Search. The search procedure of a conflict-driven SAT-solver is somewhat implicit. Although a recursive definition of the procedure might be more elegant, it is typically described (and implemented) iteratively. The procedure will start by selecting an unassigned variable x (called the decision variable) and assume a value for it, say True. The consequences of x=True will then be propagated, possibly resulting in more variable assignments. All variables assigned as a consequence of x are said to be from the same decision level, counting from 1 for the first assumption made and so forth. Assignments made before the first assumption (decision level 0) are called top-level.
All assignments will be stored on a stack in the order they were made; from now on referred to as the trail. The trail is divided into decision levels and is used to undo information during backtracking. The decision phase will continue until either all variables have been assigned, in which case we have a model, or a conflict has occurred. On conflicts, the learning procedure will be invoked and a conflict clause produced. The trail will be used to undo decisions, one level at a time, until precisely one of the literals of the learnt clause becomes unbound (they are all False at the point of conflict). By construction, the conflict clause cannot go directly from conflicting to a clause with two or more unbound literals. If the clause is unit for several decision levels, it is advantageous to choose the lowest level (referred to as backjumping or non-chronological backtracking [MS96]).
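A trail divided into decision levels, together with the choice of backjump level, might look like the sketch below. The names Trail, trail_lim, cancel_until, and backjump_level are assumptions of the example, as is the signed-integer literal encoding. The level computation assumes the learnt clause has exactly one literal at the highest decision level, so that it becomes unit as soon as that level is undone.

```python
class Trail:
    def __init__(self):
        self.trail = []          # assigned literals in chronological order
        self.trail_lim = []      # trail_lim[i] = trail length when level i + 1 began
        self.level = {}          # var -> decision level of its assignment

    def decision_level(self):
        return len(self.trail_lim)

    def assume(self, lit):
        """Open a new decision level and assign the decision literal."""
        self.trail_lim.append(len(self.trail))
        self.assign(lit)

    def assign(self, lit):
        """Record an assignment at the current decision level."""
        self.level[abs(lit)] = self.decision_level()
        self.trail.append(lit)

    def cancel_until(self, target_level):
        """Undo whole decision levels until `target_level` is the current one."""
        while self.decision_level() > target_level:
            start = self.trail_lim.pop()
            for lit in self.trail[start:]:
                del self.level[abs(lit)]
            del self.trail[start:]


def backjump_level(trail, learnt):
    """Lowest level at which `learnt` is unit: the second-highest level among its literals."""
    levels = sorted({trail.level[abs(l)] for l in learnt}, reverse=True)
    return levels[1] if len(levels) > 1 else 0


t = Trail()
t.assign(7)                      # top-level fact (decision level 0)
t.assume(1); t.assign(2)         # level 1
t.assume(6)                      # level 2
t.assume(5); t.assign(3); t.assign(4)   # level 3, where the conflict occurs
learnt = [-5, -1]
target = backjump_level(t, learnt)
t.cancel_until(target)
t.assign(-5)                     # the learnt clause is now unit and asserts -5
```

Level 2 is skipped entirely: the clause is already unit at level 1, which is the non-chronological step described above.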
Activity heuristics. One important technique introduced by CHAFF [MZ01] is a dynamic variable ordering based on activity (referred to as the VSIDS heuristic). The original heuristic imposes an order on literals, but borrowing from SATZOO, we make no distinction between p and ¬p in MINISAT.
Each variable has an activity attached to it. Every time a variable occurs in a recorded conflict clause, its activity is increased. We refer to this as bumping. After the conflict, the activities of all the variables in the system are multiplied by a constant less than 1, thus decaying the activity of variables over time.
Activity is also used for clauses. When a learnt clause takes part in the conflict analysis, its activity is bumped. Inactive clauses are periodically removed.
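Bumping and decaying can be written down directly. The sketch below is a literal reading of the description for variables; the bump amount of 1.0 and the decay factor of 0.95 are illustrative assumptions, not values taken from the text.

```python
def bump(activity, conflict_clause, amount=1.0):
    """Increase the activity of every variable occurring in the recorded clause."""
    for lit in conflict_clause:
        activity[abs(lit)] += amount


def decay(activity, factor=0.95):
    """Multiply all activities by a constant less than 1."""
    for var in activity:
        activity[var] *= factor


activity = {1: 0.0, 2: 0.0, 3: 0.0}
for clause in ([1, -2], [-2, 3], [-2]):      # three recorded conflict clauses
    bump(activity, clause)
    decay(activity)
most_active = max(activity, key=activity.get)
```

Variable 2 occurs in all three clauses and ends up most active, while variable 3 outranks variable 1 because its single bump is more recent.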
Constraint removal. The constraint database is divided into two parts: the problem constraints and the learnt clauses. The set of learnt clauses is periodically reduced to increase the performance of propagation. This may result in a larger search space, as learnt clauses are used to crop future branches of the search tree. The balance between the two forces is delicate, and there are SAT-instances for which a big learnt clause set is advantageous, and others where a small set is better. MINISAT’s default heuristic starts with a small set and gradually increases the size.
Problem constraints can also be removed if they are satisfied at the top-level. The API method simplifyDB() is responsible for this. The procedure is particularly important for incremental SAT-problems.
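Both kinds of removal can be sketched as small functions. The policy shown for learnt clauses (drop the less active half, but keep any clause in a caller-supplied `locked` set, for instance clauses currently serving as the reason for an assignment) is an assumption of the example; the text only says that clauses are kept according to some heuristic. Function names are hypothetical, and clauses are tuples of signed integers.

```python
def reduce_db(learnts, activity, locked):
    """Keep the more active half of the learnt clauses, plus any locked clause."""
    ranked = sorted(learnts, key=lambda c: activity[c])   # least active first
    half = len(ranked) // 2
    return [c for i, c in enumerate(ranked) if i >= half or c in locked]


def simplify_db(clauses, top_level):
    """Drop clauses satisfied by the top-level assignment (a set of true literals)."""
    return [c for c in clauses if not any(lit in top_level for lit in c)]


learnts = [(1, 2), (-1, 3), (2, -3), (-2, 4)]
activity = {(1, 2): 0.1, (-1, 3): 5.0, (2, -3): 0.2, (-2, 4): 3.0}
kept = reduce_db(learnts, activity, locked={(1, 2)})

remaining = simplify_db([(1, 2), (-1, 3), (2, -3)], top_level={1})
```

In the second call only the clause containing the true literal 1 is removed; the clause containing -1 is not satisfied by that assignment and stays.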
Top-level solver. The pseudo-code for the search procedure presented above suffices for a simple conflict-driven SAT-solver, but a solver strategy can improve the performance. A typical strategy applied by modern conflict-driven SAT-solvers is the use of restarts to avoid getting stuck in a futile part of the search tree. In MINISAT we also vary the number of learnt clauses kept at a given time. Furthermore, the solve() method of the API supports incremental assumptions, not handled by the above pseudo-code.
4 Implementation
The following conventions are used in the code.
Atomic types start with a lowercase letter and are passed by value. Composite types start with a capital letter and are passed by reference. Blocks are marked by indentation level. The bottom symbol ⊥ always means undefined; False is used to denote the boolean false. We will use, but not specify an implementation of, the following abstract data types: Vec⟨T⟩, an extensible vector of type T; lit, the type of literals, containing a special literal ⊥lit; lbool, for the lifted boolean domain containing elements True⊥, False⊥, and ⊥; Queue⟨T⟩, a queue of type T. We also use var as a type synonym for int (for implicit documentation) with the special constant ⊥var. The literal data type has an index() method which converts a literal to a “small” integer suitable for array indexing.
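One way to realize a literal type with an index() method is to pack the variable number and the sign into a single integer. The encoding 2·var + sign used below is an assumption of this sketch (the text only requires a small integer), as are the class name Lit and the use of Python's None for the undefined value ⊥ of lbool.

```python
class Lit:
    """A literal over a variable numbered from 0, stored as 2 * var + sign."""

    def __init__(self, var, negated=False):
        self.x = 2 * var + int(negated)

    def __neg__(self):
        return Lit(self.x >> 1, not (self.x & 1))

    def var(self):
        return self.x >> 1

    def sign(self):
        return bool(self.x & 1)      # True for a negated literal

    def index(self):
        return self.x                # dense: literals of n variables map to 0 .. 2n - 1


n_vars = 4
assigns = [None] * n_vars            # lbool per variable: True, False, or None for undefined
watches = [[] for _ in range(2 * n_vars)]   # one list per literal, addressed by index()

p = Lit(3)
watches[(-p).index()].append("some clause watching p")
```

Because index() is dense, per-literal data such as watcher lists can live in plain arrays with no hashing.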
4.1 The Solver State
A number of things need to be stored in the solver state. Figure 2 shows the complete set of member variables of the solver type of MINISAT. A number of trivial, one-line functions will be assumed to exist, such as nVars() for the number of variables etc. The interface of VarOrder is given in Figure 1, and is further explained in section 4.6. Note that the state does not contain a boolean “conflict” to remember if a top-level conflict has been reached. Instead we impose as an invariant that the solver must never be in a conflicting state.
4.2 Constraints
MINISAT can handle arbitrary constraints over boolean variables through the abstraction presented in Figure 3. Each constraint type needs to implement methods for constructing, removing, propagating and calculating reasons. In addition, methods for simplifying the constraint and updating the constraint on backtrack can be specified. The contracts of these methods are as follows:
Constructor. The constructor may only be called at the top-level. It must create and add the constraint to appropriate watcher lists after enqueuing any unit information derivable under the current top-level assignment. Should a conflict arise, this must be communicated to the caller.
Remove. The remove method supplants the destructor by receiving the solver state as a parameter. It should dispose the constraint and remove it from the watcher lists.
Propagate. The propagate method is called if the constraint is found in a watcher list during propagation of unit information p. The constraint is removed from the list and is required to insert itself into a new or the same watcher list. Any unit information derivable as a consequence of p should be enqueued. If successful, True is returned; if a conflict is detected, False is returned. The constraint may add itself to the undo list of var(p) if it needs to be updated when p becomes unbound.
Simplify. At the top-level, a constraint may be given the opportunity to simplify its representation (returns True) or state that the constraint is satisfied under the current assignment (returns False). A constraint must not be simplifiable to produce unit information or to be conflicting; in that case the propagation has not been correctly defined.
Undo. During backtracking, this method is called if the constraint added itself to the undo list of var(p) in propagate(). The current variable assignments are guaranteed to be identical to that of the moment before propagate() was called.
Calculate Reason. This method is given a literal p and an empty vector. The constraint is the reason for p being true, that is, during propagation, the current constraint enqueued p. The received vector is extended to include a set of assignments (represented as literals) implying p. The current variable assignments are guaranteed to be identical to that of the moment before the constraint propagated p.
The literal p is also allowed to be the special constant ⊥lit in which case the reason for the constraint being conflicting should be returned through the vector.
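For an ordinary clause the reason computation is short, and a sketch may help fix the contract. The function below is an illustration, not the paper's code: its name is hypothetical, literals are signed integers, and None stands in for ⊥lit. Following the description of learning, the reason for a propagated literal p is the set of true assignments that falsified every other literal of the clause, and for a conflicting clause it covers all of its literals.

```python
def clause_calc_reason(clause, p, out_reason):
    """Extend `out_reason` with true literals implying p.

    With p = None, the clause is conflicting and every literal contributes.
    """
    for q in clause:
        if q != p:
            out_reason.append(-q)    # q is false, so its negation is a true assignment


# The clause (-1 or -2 or 3) propagated 3 after 1 and 2 became true.
reason_for_3 = []
clause_calc_reason([-1, -2, 3], 3, reason_for_3)

# The clause (-3 or -4) is conflicting because 3 and 4 are both true.
reason_for_conflict = []
clause_calc_reason([-3, -4], None, reason_for_conflict)
```

Other constraint types would return whatever subset of their assigned variables forced the propagation, which is why the method belongs to the constraint rather than to the solver.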
4.6 Activity Heuristics and Constraint Removal
In the VarOrder data type of MINISAT, the list of variables is kept sorted on activity at all times. The search will always accurately choose the most active variable. The original suggestion for the VSIDS dynamic variable ordering was to sort periodically. MINISAT implements variable decay by bumping with larger and larger numbers. Only when the limit of what is representable by a floating point number is reached need activities be scaled down. Activity for conflict clauses is also maintained. The method for reducing the set of learnt clauses based on this activity, as well as the top-level simplification procedure can be found in Figure 9.
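Decay by bumping with ever larger numbers can be sketched as follows. Instead of multiplying every activity by the decay factor, the bump increment is divided by it, which preserves the relative order of activities at constant cost per conflict. The class name, the decay factor, and the rescaling limit are assumptions of the example; the small limit in the usage lines is chosen only to make a rescale happen quickly.

```python
class VarActivity:
    def __init__(self, n_vars, decay=0.95, limit=1e100):
        self.activity = [0.0] * (n_vars + 1)     # index 0 unused
        self.inc = 1.0                           # current bump amount
        self.decay = decay
        self.limit = limit

    def bump(self, var):
        self.activity[var] += self.inc
        if self.activity[var] > self.limit:
            self.rescale()

    def decay_all(self):
        self.inc /= self.decay                   # constant time: later bumps weigh more

    def rescale(self):
        """Scale everything down together so the ordering is unchanged."""
        self.activity = [a / self.limit for a in self.activity]
        self.inc /= self.limit

    def ranking(self):
        return sorted(range(1, len(self.activity)), key=lambda v: -self.activity[v])


va = VarActivity(3, decay=0.5, limit=100.0)
for i in range(8):
    va.bump(i % 3 + 1)       # bump variables 1, 2, 3, 1, 2, 3, 1, 2
    va.decay_all()
order = va.ranking()
```

Keeping the variables permanently sorted, as the text describes for VarOrder, would additionally need a priority structure that is updated on each bump; the sketch sorts on demand for brevity.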
4.7 Top-Level Solver
The method implementing MINISAT’s top-level strategy can be found in Figure 8. It is responsible for making the incremental assumptions and setting the root level. Furthermore, it completes the simple backtracking search with restarts, which are performed less and less frequently. After each restart, the number of allowed learnt clauses is increased.
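The restart policy amounts to two growing budgets. The sketch below computes such a schedule; every constant in it (a first budget of 100 conflicts, growth factors of 1.5 and 1.1, and an initial learnt-clause limit of one third of the problem constraints) is an illustrative assumption, and the function name is hypothetical.

```python
def restart_schedule(n_constraints, restarts, first_conflicts=100,
                     conflict_growth=1.5, learnt_growth=1.1):
    """Return (conflict budget, learnt-clause limit) for each successive search run."""
    nof_conflicts = float(first_conflicts)
    nof_learnts = n_constraints / 3
    schedule = []
    for _ in range(restarts):
        schedule.append((int(nof_conflicts), int(nof_learnts)))
        nof_conflicts *= conflict_growth     # restarts become less frequent
        nof_learnts *= learnt_growth         # more learnt clauses are allowed
    return schedule


schedule = restart_schedule(n_constraints=300, restarts=3)
conflict_budgets = [c for c, _ in schedule]   # [100, 150, 225]
learnt_limits = [l for _, l in schedule]
```

A top-level loop would run the search with each pair in turn and stop as soon as a run returns a definite answer instead of exhausting its conflict budget.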
5 Conclusions and Related Work
In this paper, we have provided a minimal reference implementation of a modern conflict-driven SAT-solver. We have tested MINISAT against ZCHAFF and BERKMIN 5.61 on 177 SAT-instances. These instances were used to tune SATZOO for the SAT 2003 Competition. As SATZOO solved more instances and series of problems, ranging over all three categories (industrial, handmade, and random), than any other solver in the competition, we feel that this is a representative test-set.
No extra tuning was done in MINISAT; it was just run once with the constants presented in the code. At a time-out of 10 minutes, MINISAT solved 158 instances, while ZCHAFF solved 147 instances and BERKMIN 157 instances.
References
[AR+02] Aloul, F., Ramani, A., Markov, I., Sakallah, K.: Generic ILP vs. Specialized 0-1 ILP: an Update. In: International Conference on Computer Aided Design, ICCAD (2002)
[BC+99] Biere, A., Cimatti, A., Clarke, E.M., Fujita, M., Zhu, Y.: Symbolic Model Checking using SAT procedures instead of BDDs. In: Proceedings of Design Automation Conference, DAC 1999 (1999)
[CS03] Claessen, K., Sörensson, N.: New Techniques that Improve MACE-style Finite Model Finding. In: CADE-19, Workshop W4. Model Computation – Principles, Algorithms, Applications (2003)
[DLL62] Davis, M., Logemann, G., Loveland, D.: A machine program for theorem proving. Communications of the ACM 5 (1962)
[ES03] Eén, N., Sörensson, N.: Temporal Induction by Incremental SAT Solving. In: Proc. of First International Workshop on Bounded Model Checking (2003)
[Lar92] Larrabee, T.: Test Pattern Generation Using Boolean Satisfiability. IEEE Transactions on Computer-Aided Design, vol 11(1) (1992)
[MS96] Marques-Silva, J.P., Sakallah, K.A.: GRASP – A New Search Algorithm for Satisfiability. In: ICCAD. IEEE Computer Society Press, Los Alamitos (1996)
[MZ01] Moskewicz, M.W., Madigan, C.F., Zhao, Y., Zhang, L., Malik, S.: Chaff: Engineering an Efficient SAT Solver. In: Proc. of the 38th Design Automation Conference (2001)
[ZM01] Zhang, L., Madigan, C.F., Moskewicz, M.W., Malik, S.: Efficient Conflict Driven Learning in Boolean Satisfiability Solver. In: Proc. of the International Conference on Computer Aided Design, ICCAD (2001)
[WKS01] Whittemore, J., Kim, J., Sakallah, K.: SATIRE: A New Incremental Satisfiability Engine. In: Proc. 38th Conf. on Design Automation. ACM Press, New York (2001)