Weaknesses of CDCL Solvers
Armin Biere
Johannes Kepler University
Linz, Austria

The Fields Institute, Toronto, Canada Tuesday, August 16, 2016



Even though the basic CDCL scheme is already quite effective, a SAT solver requires careful implementation of many additional techniques to achieve state-of-the-art performance.
Probably most important are decision heuristics and their implementation, followed by data-structures for fast propagation, garbage collection schemes for reclaiming inactive learned clauses, then preprocessing techniques, and finally restart scheduling.
Our recent results revisiting decision heuristics as well as restart schemes on one hand simplified our understanding of what is essential for fast SAT solvers, but on the other hand revealed weaknesses both in SAT solving technology and how it is evaluated empirically.

BIO


Since 2004 Prof. Armin Biere has been a Full Professor of Computer Science at the Johannes Kepler University in Linz, Austria, where he chairs the Institute for Formal Models and Verification.

Between 2000 and 2004 he held a position as Assistant Professor within the Department of Computer Science at ETH Zürich, Switzerland. In 1999 Biere was working for a start-up company in electronic design automation after one year as Post-Doc with Edmund Clarke at CMU, Pittsburgh, USA. In 1997 Biere received a Ph.D. in Computer Science from the University of Karlsruhe, Germany.

His primary research interests are applied formal methods, more specifically formal verification of hardware and software, using model checking, propositional and related techniques.



His most influential work is his contribution to Bounded Model Checking. Decision procedures for SAT, QBF and SMT, developed by him or under his guidance, rank at the top of many international competitions and were awarded 57 medals including 32 gold medals.


Part I

Evaluating CDCL Variable Scoring Schemes


(Exponential) Variable State Independent Decaying Sum (VSIDS)

empirically one of the most important features of state-of-the-art solvers

no formal argument "why it works"

“Trying to Understand the Power of VSIDS”


reconsider simpler alternatives, particularly variable move to front schemes (VMTF)

shows that VMTF is as good as VSIDS (and explains both)


decision heuristics consist of

    • variable selection: which variable to assign next?
    • phase selection: assign variable to which phase (true or false)?

phase saving [PipatsrisawatDarwiche’07]

    • select phase to which variable was assigned before
    • initialized by static one-side heuristics [JeroslowWang’90]
    • very effective and thus default in state-of-the-art solvers

we consider only variable selection as decision heuristic here

    • clause based heuristics less effective (BerkMin, CMTF)
    • same applies to literal based heuristics (using literal scores)

variable selection and decision heuristic boils down to

    • compute and maintain heuristic scores for variables
    • select variable with highest score

how to compute scores

    • static or dynamic
    • bump variables: when to increase scores and by how much
    • rescore variables: when to decrease scores and by how much
    • state-of-the-art: VSIDS (from Chaff)

               more precisely the exponential variant (EVSIDS) of MiniSAT!


data structures for finding decision variables

    • eager or lazy update of “order”
    • state-of-the-art: priority queue of variables ordered by score (MiniSAT)
    • data structure depends on how scores are computed and vice versa

zero order scheme = static scores

    • computed for instance once during preprocessing
    • still needs search for “best” unassigned variable
    • only total orders considered so far


first order schemes = dynamic but stateless

    • for instance: score = pos occs × neg occs
    • independent of how search reached current branch / search node
    • might be quite expensive to compute / update (linear in CNF size)
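
A minimal sketch of such a first order score, assuming a hypothetical flat CNF representation (DIMACS-style literals, each clause terminated by 0). The function name and layout are illustrative only; a real solver would count occurrences over the clauses not yet satisfied in the current branch, which is exactly what makes the update linear in the CNF size.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: score[v] = (#positive occurrences of v) *
 * (#negative occurrences of v) over the given clauses.
 * lits: non-zero DIMACS literals, each clause terminated by 0.
 * score must have room for max_var + 1 entries. */
void occurrence_scores(const int *lits, size_t n, int max_var, long *score) {
    long *pos = calloc((size_t)max_var + 1, sizeof *pos);
    long *neg = calloc((size_t)max_var + 1, sizeof *neg);
    assert(pos && neg);
    for (size_t i = 0; i < n; i++) {        /* one pass over the whole CNF */
        int lit = lits[i];
        if (lit > 0) pos[lit]++;
        else if (lit < 0) neg[-lit]++;      /* 0 only terminates a clause */
    }
    for (int v = 1; v <= max_var; v++)
        score[v] = pos[v] * neg[v];
    free(pos);
    free(neg);
}
```

The score depends only on the formula passed in, not on how the search arrived there, which is what makes the scheme stateless.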

second order schemes: variable score depends on history of search

    • first order + learning ⇒ second order
    • but can also be used to speed up search for “best” variable
    • goal is logarithmic or even constant algorithm for variable selection

Variable Move To Front

Siege SAT solver [Ryan’04] used variable move to front (VMTF)

    • bumped variables moved to head of doubly linked list
    • search for unassigned variable starts at head
    • variable selection is an online sorting algorithm of scores
    • classic “move-to-front” strategy achieves good amortized complexity

original implementation severely restricted

    • only moved a subset of bumped variables
    • details about caching the search not described, no source code published either
    • not exactly the same as VSIDS

as a consequence VMTF is not used in state-of-the-art solvers


  MiniSAT’s Exponential VSIDS (EVSIDS)

floating point scores

    • allows fine grained rescore at every conflict
    • consider multiplying every score by f = 0.9 at each conflict


actually, instead of updating scores of all variables (at every conflict)

    • only increase score of bumped variables by g^i
    • with i the conflict-index, and g = 1/f
    • non-bumped variables not touched
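
The bump described above can be sketched as follows. The names `evsids_score`, `evsids_inc` and `MAX_VARS` are hypothetical, and the rescaling step that guards against floating point overflow is an assumption about how such schemes are usually implemented, not something stated on the slide.

```c
#include <assert.h>

#define MAX_VARS 1024                 /* hypothetical fixed capacity */
#define EVSIDS_DECAY 0.9              /* f from the slide */

double evsids_score[MAX_VARS];        /* one score per variable */
double evsids_inc = 1.0;              /* current increment g^i */

/* Bump: add g^i to the score of a variable involved in conflict i. */
void evsids_bump(int var) {
    assert(0 <= var && var < MAX_VARS);
    evsids_score[var] += evsids_inc;
    if (evsids_score[var] > 1e100) {  /* assumed overflow guard */
        for (int v = 0; v < MAX_VARS; v++)
            evsids_score[v] *= 1e-100; /* rescaling keeps the order intact */
        evsids_inc *= 1e-100;
    }
}

/* Once per conflict, after bumping: the next increment is g^(i+1). */
void evsids_next_conflict(void) {
    evsids_inc *= 1.0 / EVSIDS_DECAY;
}
```

Growing the increment by g = 1/f has the same effect on the relative order as shrinking every score by f, but only the bumped variables are touched.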


priority queue of variables ordered by score

    • implemented as binary heap with update (bump and bubble up)
    • lazy assigned variable removal
      • remove largest score variable from heap until unassigned one found
      • put unassigned variables not on the heap back (logarithmic complexity)


normalized VSIDS (NVSIDS) ∈ [0,1] as (theoretical) model [Biere’08]
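
The link between the normalized model and the exponential variant can be made explicit. As a sketch, assume the normalized update is $s_i = f \cdot s_{i-1} + (1-f)\,\delta_i$ with $s_0 = 0$, where $\delta_i \in \{0,1\}$ says whether the variable is bumped at conflict $i$ (this update rule is our reading of the NVSIDS model and is not spelled out on the slide), and let $S_n$ be the EVSIDS score after $n$ conflicts with $g = 1/f$:

```latex
\begin{align*}
s_n &= (1-f)\sum_{k=1}^{n} \delta_k\, f^{\,n-k} \;\in\; [0,\, 1-f^{\,n}] \subseteq [0,1] \\
S_n &= \sum_{k=1}^{n} \delta_k\, g^{k}
     = f^{-n}\sum_{k=1}^{n} \delta_k\, f^{\,n-k}
     = \frac{g^{n}}{1-f}\; s_n
\end{align*}
```

The factor $g^n/(1-f)$ is the same for every variable, so under this assumption both schemes order the variables identically after every conflict.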


Summary Variable Scoring Schemes

  Fast VMTF Implementation

fast simple implementation for caching searches in VMTF [Biere’15]

    • doubly linked list does not have positions as an ordered array
    • bump = move-to-front = dequeue then insertion at the head


time-stamp list entries with insertion time

    • maintained invariant: all variables right of next-search are assigned
    • requires update to next-search while unassigning variables
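
A minimal sketch of the time-stamped queue, under stated assumptions: all names (`vmtf_bump`, `next_search`, `NVARS`, ...) are hypothetical, variables are indexed from 0, and bumped variables are inserted at the front. With this orientation the invariant reads: every entry newer than `next_search` is assigned.

```c
#include <assert.h>
#include <stdbool.h>

#define NVARS 8                   /* hypothetical number of variables */
#define NONE  (-1)

int  newer[NVARS], older[NVARS];  /* doubly linked list links */
long stamp[NVARS];                /* insertion time of each entry */
bool assigned[NVARS];
int  front, back;                 /* front = most recently bumped */
int  next_search;                 /* cached start of the next search */
long vmtf_time;                   /* global insertion counter */

static void enqueue_front(int v) {
    newer[v] = NONE;
    older[v] = front;
    if (front != NONE) newer[front] = v; else back = v;
    front = v;
    stamp[v] = ++vmtf_time;       /* time-stamp with insertion time */
}

static void dequeue(int v) {
    if (newer[v] != NONE) older[newer[v]] = older[v]; else front = older[v];
    if (older[v] != NONE) newer[older[v]] = newer[v]; else back = newer[v];
}

void vmtf_init(void) {
    front = back = NONE;
    vmtf_time = 0;
    for (int v = 0; v < NVARS; v++) { assigned[v] = false; enqueue_front(v); }
    next_search = front;
}

/* Bump = move to front; an unassigned bumped variable becomes the cache. */
void vmtf_bump(int v) {
    assert(0 <= v && v < NVARS);
    if (v != front) { dequeue(v); enqueue_front(v); }
    if (!assigned[v]) next_search = v;
}

void vmtf_assign(int v) { assigned[v] = true; }

/* Unassigning may break the invariant; stamps tell whether v is newer. */
void vmtf_unassign(int v) {
    assigned[v] = false;
    if (stamp[v] > stamp[next_search]) next_search = v;
}

/* Walk from the cached position toward older entries; NONE if all assigned. */
int vmtf_decide(void) {
    int v = next_search;
    while (v != NONE && assigned[v]) v = older[v];
    if (v != NONE) next_search = v;
    return v;
}
```

The time stamps stand in for the positions a linked list lacks: comparing two stamps is enough to decide which of two entries is closer to the front.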


solved SAT competition 2014 application track instances (ordered by time)




SAT'16 Competition Application Benchmarks (sorted by percentage run−time)


Conclusion Part I


surveyed and classified variable selection / scoring schemes


    • came up with a new one ACIDS (as well as SUM)
    • EVSIDS, VMTF, ACIDS comparable in performance
    • with a generic fast queue implementation


VMTF was considered to be obsolete

    • can be made effective (with less code than EVSIDS)
    • needs proper optimized implementation: time-stamping with insertion-time
    • VMTF might be easier to reason about in proof complexity


threats to validity

    • unclear whether VMTF only works in combination with Glucose restarts (see also our POS’15 paper and Part II of this talk)
    • benchmark selection in recent SAT competitions highly controversial


Splatz: SAT solver only based on VMTF



Part II


Evaluating CDCL Restart Schemes [POS’15 paper with Andreas Fröhlich]

    • simplifies Glucose style restart schemes
    • evaluation shows clear benefits but also weaknesses

Post SAT Competition 2014 Analysis

Lingeling actually barely won

    • only for long time limit of 5000 seconds
    • for 900 seconds: no chance

two main reasons

    • selected benchmarks biased towards descendants of Glucose / MiniSAT
    • but Glucose restarts are important for many (selected) benchmarks

the POS’15 paper is about lessons learned while

    • porting the Glucose restart scheme to Lingeling
    • and simplifying by using exponential moving averages (EMA)

original longer slide set at http://fmv.jku.at/biere/talks/Biere-POS15-talk.pdf




 application track instances clustered in buckets (by the organizers):


2d-strip-packing (4), argumentation (20), bio (11), crypto-aes (8), crypto-des (7), crypto-gos (9), crypto-md5 (21), crypto-sha (29), crypto-vpmc (4), diagnosis (28), fpga-routing (1), hardware-bmc (4), hardware-bmc-ibm (18), hardware-cec (30), hardware-manolios (6), hardware-velev (27), planning (19), scheduling (30), scheduling-pesp (3), software-bit-verif (9), software-bmc (6), symbolic-simulation (1), termination (5)


in total 300 instances clustered in 23 buckets



lingeling−sc2014 versus SWDiA5BY


 Restarts in CDCL

Status run_CDCL_loop_with_restarts () {
    for (;;) {
        if (bcp ()) {                                  // propagation found no conflict
            if (restarting ()) restart ();
            else if (!decide ()) return SATISFIABLE;   // no unassigned variable left
        } else {
            conflicts++;                               // global conflicts counter
            if (!analyze ()) return UNSATISFIABLE;     // conflict cannot be resolved
        }
    }
}


  • run BCP and conflict analysis (including learning) until completion
  • restart if restart policy implemented in restarting says so
  • usually based on a global conflicts counter
  • otherwise pick next decision (unless all are assigned)
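
Restart Scheme Examples

#include <math.h>   // pow

// restart once the global conflicts counter reaches the current limit
bool restarting () {
    return conflicts >= limit;
}

// uniform: wait a fixed number of conflicts after each restart
void static_uniform_restart () {
    restarts++;
    limit = conflicts + interval;
    backtrack (0);
}

// geometric: interval grows by a factor of 1.5 with every restart
void static_geometric_restart () {
    limit = conflicts + interval * pow (1.5, ++restarts);
    backtrack (0);
}

// Luby: interval scaled by the Luby (reluctant doubling) sequence
void luby_restart () {
    limit = conflicts + interval * luby (++restarts);
    backtrack (0);
}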
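
The `luby` function used above is not shown in the slides. A minimal sketch of one way to compute it, assuming the sequence is indexed from 1 (1, 1, 2, 1, 1, 2, 4, 1, ...):

```c
#include <assert.h>
#include <stdint.h>

/* i-th element of the Luby sequence, for 1 <= i < 2^63.
 * If i == 2^k - 1 the value is 2^(k-1); otherwise the sequence
 * repeats its own prefix, so i is reduced by 2^(k-1) - 1. */
uint64_t luby(uint64_t i) {
    assert(i >= 1);
    for (;;) {
        uint64_t k = 1;
        while ((((uint64_t)1 << k) - 1) < i)
            k++;                                 /* smallest k with 2^k - 1 >= i */
        if ((((uint64_t)1 << k) - 1) == i)
            return (uint64_t)1 << (k - 1);
        i -= ((uint64_t)1 << (k - 1)) - 1;       /* jump into the repeated prefix */
    }
}
```

With `interval` as the base unit, `luby_restart` then produces restart intervals of 1, 1, 2, 1, 1, 2, 4, ... times `interval` conflicts.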

Restart Scheme Classification

static schemes

  • fixed schedule of restarts only based on conflicts counter
    • uniform intervals: wait a fixed number of conflicts after each restart
    • non-uniform restart intervals
      • number of performed restarts determines next interval (in terms of conflicts)
      • arithmetically or geometrically increasing actual interval
      • Luby scheme (also known as reluctant doubling)
      • inner-outer scheme

dynamic schemes

  • agility based restart blocking
  • local restarts (not discussed in the paper nor the talk)
  • reusing the trail implicitly also blocks restarts (even partially)
  • Glucose restart scheme (focus here)

Comparing Static but Non-Uniform Restart Schemes


   Conclusion Part II
   data and source: http://fmv.jku.at/evalrestart/evalrestart.7z

optimal restart interval varies with benchmark bucket

    • for miters fast restarts essential
    • for crypto benchmarks longer intervals necessary
    • disabling restarts completely is bad
    • Glucose restarts superior to Luby style

presented an EMA variant of the Glucose restart scheme

    • simpler model, simpler to implement
    • similar performance (slightly faster)
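
A hedged sketch of what an EMA based restart test can look like. It assumes, in the style of Glucose, that the signal is the glue (LBD) of each learned clause and that a restart is triggered when a fast moving average exceeds a slow one by some margin. The smoothing factors, the margin and the seeding with the first sample are illustrative assumptions, not values taken from the talk.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    double fast, slow;              /* short-term and long-term average glue */
    double alpha_fast, alpha_slow;  /* smoothing factors (assumed values) */
    double margin;                  /* restart if fast > margin * slow */
    bool seeded;
} EmaRestart;

void ema_init(EmaRestart *r) {
    r->fast = r->slow = 0.0;
    r->alpha_fast = 1.0 / 32;       /* illustrative */
    r->alpha_slow = 1.0 / 16384;    /* illustrative */
    r->margin = 1.1;                /* illustrative */
    r->seeded = false;
}

/* Call once per conflict with the glue of the newly learned clause. */
void ema_update(EmaRestart *r, double glue) {
    if (!r->seeded) {               /* simplification: avoid start-up bias */
        r->fast = r->slow = glue;
        r->seeded = true;
        return;
    }
    r->fast += r->alpha_fast * (glue - r->fast);
    r->slow += r->alpha_slow * (glue - r->slow);
}

/* Restart when recent clauses are clearly worse than the long-term average. */
bool ema_restarting(const EmaRestart *r) {
    return r->fast > r->margin * r->slow;
}
```

Two numbers per average replace the bounded queue of recent values that a windowed average would need, which is where the simpler implementation comes from.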

future work

  • how to improve blocking of restarts
  • restart intervals still not optimal: really need machine learning?
  • combined
    • SAT and Stock Market Analysis (originally proposed title for the POS’15 paper)
    • or SAT and Reinforcement Learning

Part III

Weaknesses of CDCL for Equivalence Checking (miters)

    • CDCL has a hard time learning the right clauses
    • fast restarts important for miters

Miter




hyper-binary resolve multiple binary clauses in “parallel”:
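
In its usual formulation (given here from the standard definition, with $l$, $l'$ and $l_1, \dots, l_n$ as arbitrary literals), hyper-binary resolution combines one clause with $n$ binary clauses in a single step:

```latex
\[
\frac{(l \lor l_1 \lor \dots \lor l_n) \qquad (\lnot l_1 \lor l') \quad \cdots \quad (\lnot l_n \lor l')}
     {(l \lor l')}
\]
```

All $n$ resolution steps are performed at once and only the binary resolvent is kept.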


thus “in principle” hyper-binary resolution can simulate structural hashing

but we do not know how to implement it fast (without having the circuit and including equivalence literal substitution)



Software Architecture of our Boolector SMT Solver




Solving Miters with CDCL Requires Restarts



trivial miters of identical circuits

    • equivalence = biimplication (two implications aka binary clauses)
    • equivalences are reused recursively and thus order of learning important
    • best solved by “finding” and applying equivalences bottom-up

point of Yakau Novikov at our predecessor Dagstuhl meeting in 2015

    • assume first implication a → b is learned
    • w.l.o.g. a assigned before b
    • then a was assigned to 1 and b to 0
    • after learning first implication, b is flipped and both a and b are assigned to 1
    • in order to learn b → a we have to assign b to 1 and a to 0
    • thus without unassigning a we cannot learn the second implication
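
Written as clauses, the argument above reads as follows (a restatement, with a and b as on the slide):

```latex
\begin{align*}
a \leftrightarrow b \;&\equiv\; (\lnot a \lor b) \land (a \lor \lnot b) \\
\text{learning } (\lnot a \lor b) &: \text{ conflict under } a = 1,\ b = 0 \\
\text{learning } (a \lor \lnot b) &: \text{ conflict under } a = 0,\ b = 1
\end{align*}
```

Once the first clause is learned, it forces b = 1 for as long as a = 1 stays on the trail, so the second assignment only becomes reachable after a has been unassigned.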

so frequent restarts are useful here

    • triggered by Glucose restart style schedules (as discussed in Part II)
    • actually phase saving is also counterproductive here (tried to fix it without success)
    • still not perfect, dedicated preprocessing faster

Challenges Part III


how to do equivalence checking on the CNF level

    • (even) more efficient implementation of hyper-binary resolution
    • partial solution [HeuleBiere-LPAR’13]
      • blocked clause decomposition
      • SAT sweeping

another (unpublished) partial solution:

    • simple-probing in Lingeling
    • simulates structural hashing on the CNF level
    • eager equivalent literal substitution
    • fails after preprocessing (bounded variable elimination)

recover / use circuit structure

    • locality
    • direction (inputs)
    • functional dependencies (might get lost during preprocessing)

Part IV

Arithmetic Reasoning Hard for CDCL


Commutativity of Bit-Vector Multiplication



Arithmetic Circuit Equivalence Checking


Challenge Part IV



    Tseitin encoding of the miter of a multiplier with itself but with inputs swapped.

Conjecture

    Refuting the resulting CNF requires exponentially large resolution proofs (and thus exponential time for CDCL too).

Research Question

    Determine proof systems with polynomial proofs for this problem.

Overall Conclusion


decision heuristics

    • new empirical insights: VMTF simpler and as good as VSIDS
    • can we prove why these work?




restart schemes

    • simplified Glucose restart scheme, showed that it somehow works
    • clearly not where we want it to be (machine learning necessary?)


equivalence checking (miters)

    • CDCL implementations do not learn the right clauses
    • fast restarts partially fix it, can we prove that?

arithmetic reasoning

    • need stronger proof systems?
    • can we prove that?
