Weaknesses of CDCL Solvers
Armin Biere
Johannes Kepler University
Linz, Austria

讲座的时间、地点:The Fields Institute, Toronto, Canada Tuesday, August 16, 2016


 

Abstract: 

 
Even though the basic CDCL scheme is already quite effective, a SAT solver requires careful implementation of many additional techniques to achieve state-of-the-art performance.
译文:即使基本的CDCL方案已经非常有效,一个SAT求解器仍然需要仔细地实现许多其他技术来实现最先进的性能
 
 
Probably most important are decision heuristics and their implementation, followed by data-structures for fast propagation, garbage collection schemes for reclaiming inactive learned clauses, then preprocessing techniques, and finally restart scheduling.
译文:可能最重要的是决策启发式及其实现,其次是用于快速传播的数据结构,用于回收非活动学习子句的垃圾收集方案,然后是预处理技术,最后是重新启动调度。
 
Our recent results revisiting decision heuristics as well as restart schemes on one hand simplified our understanding of what is essential for fast SAT solvers, but on the other hand revealed weaknesses both in SAT solving technology and how it is evaluated empirically.
译文:我们最近的结果重新审视决策启发式和重新启动方案,一方面简化了我们对什么是快速SAT解决者的必要的理解,但另一方面揭示了在SAT解决技术和如何评估它的经验的弱点
   

BIO (个人小传)

 

Since 2004 Prof. Armin Biere is a Full Professor for Computer Science at the Johannes Kepler University in Linz, Austria, and chairs the Institute for Formal Models and Verification.

译文:Armin Biere教授自2004年起担任奥地利林茨约翰内斯开普勒大学计算机科学的全职教授,并担任形式模型与验证研究所的主席。

 

Between 2000 and 2004 he held a position as Assistant Professor within the Department of Computer Science at ETH Zürich, Switzerland. In 1999 Biere was working for a start-up company in electronic design automation after one year as Post-Doc with Edmund Clarke at CMU, Pittsburgh, USA. In 1997 Biere received a Ph.D. in Computer Science from the University of Karlsruhe, Germany.

译文:2000年至2004年,他在瑞士苏黎世联邦理工学院计算机科学系担任助理教授。1999年,Biere在美国匹兹堡CMU的Edmund Clarke做了一年博士后之后,在一家新兴的电子设计自动化公司工作。1997年,比尔获得了德国卡尔斯鲁厄大学的计算机科学博士学位。

 

His primary research interests are applied formal methods, more specifically formal verification of hardware and software, using model checking, propositional and related techniques.

译文:他的主要研究兴趣是应用形式化方法,更具体的是硬件和软件的形式化验证,使用模型检查、命题和相关技术。

 

He is the author and co-author of more than 120 papers and served on the program committee of more than 110 international conferences and workshops.译文:他是120多篇论文的作者和合著者,并在110多个国际会议和研讨会的项目委员会任职。

His most influential work is his contribution to Bounded Model Checking. Decision procedures for SAT, QBF and SMT, developed by him or under his guidance rank at the top many international competitions and were awarded 57 medals including 32 gold medals.

译文:他最具影响力的工作是他对有限模型检验的贡献。他所开发或指导的SAT、QBF和SMT决策程序,在许多国际比赛中名列前茅,获奖牌57枚,其中金牌32枚。

  Besides organizing several workshops Armin Biere was co-chair of SAT'06, and FMCAD'09, was PC co-chair of HVC'12, and co-chair of CAV'14. He serves on the editorial boards of the Journal on Satisfiability, Boolean Modeling and Computation (JSAT), the Journal of Automated Reasoning (JAR), and the journal for Formal Methods in System Design (FMSD).译文:他在满足性、布尔建模和计算期刊(JSAT)、自动推理期刊(JAR)和系统设计中的形式化方法期刊(FMSD)的编委会任职。
  He is an editor of the Handbook of Satisfiability and initiated and organizes the Hardware Model Checking Competition (HWMCC). Since 2011 he serves as chair of the SAT Association and since 2012 on the steering committee of FMCAD. In 2006 Armin Biere co-founded NextOp Software Inc. which was acquired by Atrenta Inc. in 2012.译文:2011年以来,他担任SAT协会主席,2012年以来担任FMCAD指导委员会成员。2006年,阿明·比尔与人共同创立了NextOp Software Inc.,该公司于2012年被Atrenta Inc.收购。

 

slider——Evaluating CDCL Variable Scoring Schemes

1

(Exponential) Variable State Independent Decaying Sum (VSIDS)

empirically one of the most important features of state-of-the-art solvers

no formal argument “why it works”译文:没有正式的论证"为什么它有效"

“Trying to Understand the Power of VSIDS”

 

reconsider simpler alternatives ------particularly variable move to front schemes (VMTF)

shows that VMTF is as good as VSIDS (and explains boths)

   
2

decision heuristics consist of

    • variable selection: which variable to assign next?
    • phase selection: assign variable to which phase (true or false)?
 

phase saving [PipatsrisawatDarwiche’07]

    • select phase to which variable was assigned before
    • initialized by static one-side heuristics [JeroslowWang’90]
    • very effective and thus default in state-of-the-art solvers
 

we consider only variable selection as decision heuristic here

    • clause based heuristics less effective (BerkMin, CMTF)                     译文:基于子句的启发式效果较差
    • same applies to literal based heuristics (using literal scores)
 

variable selection and decision heuristic boils down to

    • compute and maintain heuristic scores for variables
    • select variable with highest score
 

how to compute scores

    • static or dynamic
    • bump variables: when to increase scores and by how much
    • rescore variables: when to decrease scores and by how much
    • state-of-the-art: VSIDS (from Chaff)

               more precisely the exponential variant (EVSIDS) of MiniSAT!

 

data structures for finding decision variables

    • eager or lazy update of “order”
    • state-of-the-art: priority queue of variables ordered by score (MiniSAT)
  data structure depends on how scores are computed and vice versa
   
3

zero order scheme = static scores

    • computed for instance once during preprocessing
    • still needs search for “best” unassinged variable
    • only total orders considered so far

 

first order schemes = dynamic but state less

    • for instance: score = pos occs × neg occs
    • independent of how search reached current branch / search node
    • might be quite expensive to compute / update (linear in CNF size)

second order schemes: variable score depends on history of search

    • first order + learning ⇒ second order
    • but can also be used to speed up search for “best” variable
    • goal is logarithmic or even constant algorithm for variable selection
   
  Variable Move To Front
4

Siege SAT solver [Ryan’04] used variable move to front (VMTF)

    • bumped variables moved to head of doubly linked list
    • search for unassigned variable starts at head
    • variable selection is an online sorting algorithm of scores
    • classic “move-to-front” strategy achieves good amortized complexity
 

original implementation severely restricted

    • only moved a subset of bumped variables
    • details about caching the search not described no source code published either
    • not exactly the same as VSIDS
 

as consequence VMTF not used in state-of-the-art solvers

译文:因此,VMTF不用于最先进的解决方案

   
  MiniSAT’s Exponential VSIDS (EVSIDS)
5

floating point scores

    • allows fine grained rescore at every conflict
    • consider multiplying by f = 0.9 every score at each conflict

 

actually, instead of updating scores of all variables (at every conflict)

    • only increase score of bumped variables by g i
    • with i the conflict-index, and g = 1/ f
    • non-bumped variables not touched

 

priority queue of variables ordered by score

    • implemented as binary heap with update (bump and bubble up)
    • lazy assigned variable removal
      • remove largest score variable from heap until unassigned one found
      • put unassigned variables not on the heap back (logarithmic complexity)

 

normalized VSIDS (NVSIDS) ∈ [0,1] as (theoretical) model [Biere’08] + video

   
6

Summary Variable Scoring Schemes

   
  Fast VMTF Implementation
 

fast simple implementation for caching searches(缓存搜索) in VMTF [Biere’15]

    • doubly linked list does not have positions as an ordered array
    • bump = move-to-front = dequeue then insertion at the head

 

time-stamp list entries with insertion time 译文:带有插入时间的时间戳列表条目

    • maintained invariant: all variables right of next-search are assigned
    • requires update to next-search while unassigning variables 译文:在不分配变量时,需要对next搜索进行更新

 

solved SAT competition 2014 application track instances (ordered by time)

 

solved SAT competition 2014 application track instances (ordered by time)

   
 

SAT'16 Competition Application Benchmarks (sorted by percentage run−time)

   
 

Conclusion Part I

 

surveyed and classified variable selection / scoring schemes

译文:调查和分类变量选择/评分方案

    • came up with a new one ACIDS (as well as SUM)
    • EVSIDS, VMTF, ACIDS comparable in performance
    • with a generic fast queue implementation 译文:使用一个通用的快速队列实现

 

VMTF was considered to be obsolete 译文:VMTF被认为是过时的

    • can be made effective (with less code than EVSIDS)
    • needs proper optimized implementation: time-stamping with insertion-time
    • VMTF might be easier to reason about in proof complexity 译文:在证明复杂性方面,VMTF可能更容易推理

 

threads to validity 线程的有效性

    • unclear whether VMTF only works in combination with Glucose restarts see also our POS’15 paper and talk in Part II of this talk
    • benchmark selection in recent SAT competitions highly controversial

 

Splatz: SAT solver only based on VMTF

        http://fmv.jku.at/splatz

   

Part II

 

Evaluating CDCL Restart Schemes [POS’15 paper with Andreas Frohlich] ¨

    • simplifies Glucose style restart schemes
    • evaluation shows clear benefits but also weaknesses
   
1 Post SAT Competition 2014 Analysis
 

Lingeling actually barely won

    • only for long time limit of 5000 seconds
    • for 900 seconds: no chance
 

two main reasons

    • selected benchmark biased towards decendants of Glucose / MiniSAT
    • but Glucose restarts are important for many (selected) benchmarks              译文:但是Glucose的重新启动对许多(选定的)基准是重要的
 

the POS’15 paper is about lessons learned while

    • porting the Glucose restart scheme to Lingeling
    • and simplifying by using exponential moving averages (EMA)

original longer slide set at http://fmv.jku.at/biere/talks/Biere-POS15-talk.pdf

   
 2

Buckets 

 

 application track instances clustered in buckets (by the organizers):

 

2d-strip-packing (4), argumentation (20), bio (11), crypto-aes (8), crypto-des (7), crypto-gos (9), crypto-md5 (21), crypto-sha (29) , crypto-vpmc (4), diagnosis (28), fpga-routing (1), hardware-bmc (4), hardware-bmc-ibm (18), hardware-cec (30), hardware-manolios (6), hardware-velev (27), planning (19), scheduling (30), scheduling-pesp (3), software-bit-verif (9), software-bmc (6), symbolic-simulation (1), termination (5)


 

in total 300 instances clustered in 23 buckets  译文:总共300个实例聚集在23个桶中

 

 

lingeling−sc2014 versus SWDiA5BY

   
   
   
 

 Restarts in CDCL

 
 1 Status run_CDCL_loop_with_restarts () {
 2     for (;;) {
 3         if (bcp ()) {
 4             if (restarting ()) restart ();
 5             else if (!decide ()) return SATISFIABLE;
 6         } else {
 7             conflicts++;
 8             if (!analyze ()) return UNSATISFIABLE;
 9         }
10     }
11 }                            

 

  • run BCP and conflict analysis (including learning) until completion
  • restart if restart policy implemented in restarting says so
  • usually based on a global conflicts counter
  • otherwise pick next decision (unless all are assigned)
   Restart Scheme Examples
   
 1 bool restarting () {
 2     return conflicts >= limit;
 3 }
 4 
 5 void static_uniform_restart () {
 6     restarts++;
 7     limit = conflicts + interval;
 8     backtrack (0);
 9 }
10 
11 void static_geometric_restart () {
12     limit = conflicts + interval * pow (1.5, ++restarts);
13     backtrack (0);
14 }
15 
16 void luby_restart () {
17     limit = conflicts + interval * luby (++restarts);
18     backtrack (0);
19 }
   
  Restart Scheme Classification 
 

static schemes

  • fixed schedule of restarts only based on conflicts counter
    • uniform intervals: wait a fixed number of conflicts after each restart
    • non-uniform restart intervals
      • number of performed restarts determines next interval (in terms of conflicts)
      • arithmetically or geometrically increasing actual interval
      • Luby scheme (also known as reluctant doubling)
      • inner-outer scheme
   

dynamic schemes

  • agility based restart blocking 译文:基于敏捷性的重启阻塞
  • local restarts (not discussed in the paper nor the talk)
  • reusing the trail implicitly also blocks restarts (even partially)译文:隐式地重用跟踪也会阻止重新启动(甚至部分地)
  • Glucose restart scheme (focus here)
   
   
   
 

Comparing Static but Non-Uniform Restart Schemes

 

   
   
   Conclusion Part II
   data and source: http://fmv.jku.at/evalrestart/evalrestart.7z
   
 

optimal restart interval varies with benchmark bucket

    • for miters fast restarts essential
    • for crypto benchmarks longer intervals necessary
    • disabling restarts completely is bad
    • Glucose restarts superior to Luby style
 

presented an EMA variant of the Glucose restart scheme

    • simpler model, simpler to implement
    • similar performance (slightly faster)
 

future work

  • how to improve blocking of restarts
  • restart intervals still not optimal: really need machine learning?
  • combined
    • SAT and Stock Market Analysis ------originally proposed title for the POS’15 paper
    • or SAT and Reinforcement Learning
 

Part III

  Weaknesses of CDCL for Equivalence Checking (miters) CDCL has a hard time to learn the right clauses fast restarts important for miters。
   
 

Miter (斜接)

 

 

 

hyper-binary resolve multiple binary clauses in “parallel”:

 

thus “in principle” hyper-binary resolution can simulate structural hashing

but we do not know how to implement it fast (without having the circuit and including equivalence literal substitution)

 

   
 

Software Architecture of our Boolector SMT Solver

 

 

   
 

Solving Miters with CDCL Requires Restarts

译文:用CDCL解决Miters需要重新启动

 

trivial miters of identical circuits

    • equivalence = bimplication (two implications aka binary clauses)
    • equivalences are reused recursively and thus order of learning important
    • best solved by “finding” and applying equivalences bottom-up
   
 

point of Yakau Novikov at our predecessor Dagstuhl meeting in 2015

    • assume first implication a → b is learned
    • w.l.o.g. a assigned before b
    • then a was assigned to 1 and b to 0
    • after learning first implication, b is flipped and both a and b are assigned to 1
    • in order to learn b → a we have to assign b to 1 and a to 0
    • thus without unassigning a we can not learn the second implication
   
 

so frequent restarts are useful here

    • triggered by Glucose restart style schedules (as discused in Part II)
    • actually phase saving is also counter productive here (tried to fix it without success)
    • still not perfect, dedicated preprocessing faster
   
 

Challenges Part III

 

how to do equivalence checking on the CNF level

    • (even) more efficient implementation of hyper-binary resolution
    • partial solution [HeuleBiere-LPAR’13]
      • blocked clause decomposition
      • SAT sweeping
   
 

another (unpublished) partial solution:

    • simple-probing in Lingeling
    • simulates structural hashing on the CNF level
    • eager equivalent literal substitution
    • fails after preprocessing (bounded variable elimination)
   
 

recover / use circuit structure

    • locality
    • direction (inputs)
    • functional dependencies (might get lost during preprocessing)
   
  Part IV
 

Arithmetic Reasoning Hard for CDCL 译文:CDCL的算术推理很困难

 

Commutativity of Bit-Vector Multiplication 译文:位向量乘法的交换性

 

 

Arithmetic Circuit Equivalence Checking译文:算术电路等价性检验

   
 

Challenge Part IV

 

Problem

    • Tseitin encoding of the miter of a multiplier with itself but with inputs swapped.译文:乘法器的斜接编码,它本身与输入交换。

Conjecture 猜想

    • Refuting the resulting CNF requires exponential resolution proofs (and thus CDCL too).译文:反驳由此产生的CNF需要指数分辨率的证明(因此CDCL也需要)

Research Question

    • Determine proof systems with polynomial proofs for this problem.译文:为这个问题确定多项式证明的证明系统
   

Overall Conclusion

 

decision heuristics

    • new empirical insights: VMTF simpler and as good as VSIDS
    • can we prove why these work?

 

 

restarts

    • simplified Glucose restart scheme, showed that it somehow works
    • clearly not where we want it to be (machine learning necessary?)
 

miters

    • CDCL implementations do not learn the right clauses
    • fast restarts partially fix it, can we prove that?
   
 

arithmetic reasoning  计算推理

    • need stronger proof systems? 译文:需要更强大的证明系统吗?
    • can we prove that?
   
posted on 2020-07-25 19:06  海阔凭鱼跃越  阅读(356)  评论(0编辑  收藏  举报