Different research perspectives

1. Learned-clause generators

Gilles Audemard proposed to renew the vision of CDCL solvers: instead of seeing them as an improvement of a DPLL search, they are seen as clause producers.

Gilles Audemard, Laurent Simon:
On the Glucose SAT Solver. Int. J. Artif. Intell. Tools 27(1): 1840001:1-1840001:25 (2018)

@article{DBLP:journals/ijait/AudemardS18,
  author       = {Gilles Audemard and
                  Laurent Simon},
  title        = {On the Glucose {SAT} Solver},
  journal      = {Int. J. Artif. Intell. Tools},
  volume       = {27},
  number       = {1},
  pages        = {1840001:1--1840001:25},
  year         = {2018},
  url          = {https://doi.org/10.1142/S0218213018400018},
  doi          = {10.1142/S0218213018400018},
  timestamp    = {Tue, 12 May 2020 16:53:25 +0200},
  biburl       = {https://dblp.org/rec/journals/ijait/AudemardS18.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}

2. Reinforcement learning

	2019 Vitaly Kurin, Saad Godil, Shimon Whiteson, Bryan Catanzaro: Improving SAT Solver Heuristics with Graph Networks and Reinforcement Learning. CoRR abs/1909.11830 (2019)

	2008 Roberto Battiti, Paolo Campigotto: Reinforcement Learning and Reactive Search: an adaptive MAX-SAT solver. ECAI 2008: 909-910

	Exponential Recency Weighted Average Branching Heuristic for SAT Solvers: Inspired by the bandit framework and reinforcement learning, we learn to choose good variables to branch based on past experience. Our goal is to leverage the theory and practice of a rich sub-field of reinforcement learning to explain and design an effective branching heuristic for solving real-world problems. Note: the choice of branching decision variables embodies the idea of reinforcement learning.
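The exponential recency weighted average named in the title can be illustrated with a minimal sketch. The class name `ErwaScores`, the step size `alpha`, and the way the reward is supplied are assumptions made for illustration only; the published heuristic defines its own reward and decays the step size over time.

```python
from collections import defaultdict


class ErwaScores:
    """Per-variable score kept as an exponential recency weighted average."""

    def __init__(self, alpha=0.4):
        # alpha is the step size; 0.4 is an assumed value for this sketch.
        self.alpha = alpha
        self.q = defaultdict(float)  # variable -> estimated reward

    def update(self, var, reward):
        # Q <- (1 - alpha) * Q + alpha * reward: recent rewards weigh more.
        self.q[var] = (1.0 - self.alpha) * self.q[var] + self.alpha * reward

    def pick_branch_variable(self, unassigned):
        # Branch on the unassigned variable with the highest estimate.
        return max(unassigned, key=lambda v: self.q[v])
```

A solver would call `update` when a variable becomes unassigned, with a reward reflecting how often it took part in conflicts, and `pick_branch_variable` at each decision.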

	Improving SAT Solver Heuristics with Graph Networks and Reinforcement Learning

	Adaptive Restart and CEGAR-Based Solver for Inverting Cryptographic Hash Functions: The problem of inverting a hash function for fixed targets is reduced to Boolean satisfiability, and MapleCrypt is used to construct preimages for these targets. MapleCrypt has two key features, namely, a multi-armed bandit based adaptive restart (MABR) policy and a counterexample-guided abstraction refinement (CEGAR) technique. The MABR technique uses reinforcement learning to adaptively choose between different restart policies during the run of the solver.
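One way to picture a bandit choosing among restart policies is the sketch below. It uses the textbook UCB1 rule as a stand-in; the bandit rule and reward actually used by MABR may differ, and the class name `RestartBandit` and the policy names are hypothetical placeholders.

```python
import math


class RestartBandit:
    """Chooses a restart policy with UCB1 (swapped in for illustration)."""

    def __init__(self, policies):
        self.policies = list(policies)
        self.counts = {p: 0 for p in self.policies}
        self.mean_reward = {p: 0.0 for p in self.policies}
        self.total = 0  # number of rewards recorded so far

    def choose(self):
        # Try every policy once before comparing confidence bounds.
        for p in self.policies:
            if self.counts[p] == 0:
                return p

        def ucb(p):
            bonus = math.sqrt(2.0 * math.log(self.total) / self.counts[p])
            return self.mean_reward[p] + bonus

        return max(self.policies, key=ucb)

    def record(self, policy, reward):
        # Incremental update of the running mean reward of the policy.
        self.counts[policy] += 1
        self.total += 1
        n = self.counts[policy]
        self.mean_reward[policy] += (reward - self.mean_reward[policy]) / n
```

At each restart the solver would call `choose`, run with that policy until the next restart, and then call `record` with a reward measuring how productive the run was.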

3. The structure-detection perspective

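4. Prioritized propagation of special clauses: what principle does it embody? Reinforcement learning? Bridge variables? Simplification?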

Specific clause types (and their variants) that have been studied include \textit{Glue clauses}, \textit{Core clauses}, and \textit{Duplicate Learnt Clauses}. They are either judged to be of high quality or conjectured to carry important information, and each has been shown experimentally to play an important role in improving solving ability.
%Specific clauses (and their variants) known to have been studied include glue clauses, core clauses, and duplicate learnt clauses.

%Glue clauses, the smallest in clause size, received increased emphasis in the early literature and became a research hotspot.

%glue first
\medskip
\noindent\textbf{\textsf{Glue First.}} In conflict-driven clause learning (CDCL) SAT solving, a class of learned clauses known as glue clauses is highly valued and serves as the basis for various heuristics \cite{abs-1904-11106}. Experiments show that branching decisions on variables appearing in glue clauses, called glue variables, are more conflict-efficient than decisions on non-glue variables, and that the frequency of glue clauses among the learned clauses can serve as a reliable indicator of solving efficiency for some instances \cite{Chowdhury0Y19}.
%A class of learned clauses called glue clauses is highly valued and has become the basis for various heuristics.

%core first
\medskip
\noindent\textbf{\textsf{Core First.}}
Core-first unit propagation strategies were proposed in 2019 \cite{abs-1907-01192}. Core clauses, defined as learned clauses with literal block distance (LBD) less than or equal to 7, are organized so that they are visited first during BCP and therefore take priority in conflict generation.

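%This follows the same technical line as the strategy of retaining high-quality clauses for a long time without deleting them. In the literature, learned clauses are assigned to Core, Tier2, and Local according to their quality, and the lifetimes (retention periods) of the elements of each set are treated differently. The quality criterion is usually LBD or the number of literals in the clause. Clauses in Core are kept permanently; clauses in Tier2 continue to be evaluated to decide whether they move to one of the other two sets; at least half of the clauses in Local are deleted periodically.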
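The three-set organization of learned clauses and the core-first visiting order can be sketched as follows. This is a simplified illustration, not the data structure of any particular solver: the class name `TieredClauseDb`, the `tier2_lbd` threshold of 10, and the activity values are assumptions, while the core threshold of 7 is taken from the text. Re-evaluation and movement of Tier2 clauses is omitted.

```python
class TieredClauseDb:
    """Learned clauses split into core, tier2, and local sets by LBD."""

    def __init__(self, core_lbd=7, tier2_lbd=10):
        self.core_lbd = core_lbd    # threshold stated in the text
        self.tier2_lbd = tier2_lbd  # assumed threshold for this sketch
        self.core = []              # kept permanently
        self.tier2 = []             # kept while still judged useful
        self.local = []             # (activity, clause), pruned regularly

    def add(self, clause, lbd, activity=0.0):
        clause = tuple(clause)
        if lbd <= self.core_lbd:
            self.core.append(clause)
        elif lbd <= self.tier2_lbd:
            self.tier2.append(clause)
        else:
            self.local.append((activity, clause))

    def reduce_local(self):
        # Keep the more active half; at least half of local is deleted.
        self.local.sort(key=lambda item: item[0], reverse=True)
        keep = len(self.local) // 2
        del self.local[keep:]

    def propagation_order(self):
        # Core-first: core clauses are visited before all others.
        yield from self.core
        yield from self.tier2
        for _, clause in self.local:
            yield clause
```

In a real solver the ordering would be applied per watch list rather than over the whole database; the generator above only shows the priority between the sets.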

%复制子句
\medskip
\noindent\textbf{\textsf{Duplicate Learnt Clauses.}}
A CDCL solver may end up computing (exponentially) many conflict clauses. For the sake of BCP efficiency and memory savings, some learned clauses must be evaluated and deleted periodically so that the set of learned clauses stays at a controllable size. Duplicate learnt clauses are a special type of clause observed while this dynamic deletion policy is applied in learned-clause management \cite{Kochemazov0SK20}. Surprisingly, while solving most CNF instances, a solver may learn and remove the same clause multiple times. Detecting such duplicate clauses promptly and storing them indefinitely can benefit CDCL solver performance.
%Duplicate learnt clauses are a special clause type discovered through long-term observation of the dynamic deletion policy in learned-clause management.
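Detecting a clause that is learned again after deletion can be sketched with a counter keyed by the clause's literal set. The class name `DuplicateTracker` and the threshold `promote_after` are hypothetical; the cited work may filter candidates and choose thresholds differently.

```python
from collections import Counter


class DuplicateTracker:
    """Counts how often each clause is learned and flags repeats."""

    def __init__(self, promote_after=2):
        # Number of times a clause must be learned before it is kept
        # permanently; 2 is an assumed value for this sketch.
        self.promote_after = promote_after
        self.seen = Counter()
        self.permanent = set()

    def on_learn(self, clause):
        """Return True when the clause is newly promoted to permanent."""
        key = frozenset(clause)  # literal order does not matter
        self.seen[key] += 1
        if self.seen[key] >= self.promote_after and key not in self.permanent:
            self.permanent.add(key)
            return True
        return False
```

A solver would call `on_learn` after each conflict analysis and exempt promoted clauses from later database reductions.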

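%For the sake of BCP efficiency and memory savings, some learned clauses need to be evaluated and deleted periodically to keep the set of learned clauses at a relatively controllable size.

%While solving most instances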
The above three types of clauses have received increased emphasis and are regarded as important entry points for improving solver performance. Accordingly, the Horn clause (and the literals it contains) is also studied as an entry point, from the perspective of ordered linear reduction, in the next chapter.

5. Engine

Source: On the Glucose SAT Solver

When they were introduced fifteen years ago, CDCL SAT solvers (Conflict Driven Clause Learning) were presented as an extension of the DPLL algorithm with additional features such as clause learning, based on top of an efficient data structure (2 Watched Literals) for Unit Propagation detections, giving an efficient Boolean Constraint Propagation engine (BCP). Now, it is well admitted that they have to be seen as a mix of backtrack algorithms and resolution engines. Furthermore, it has been proved that CDCL SAT solvers are more powerful than DPLL ones, i.e., there exist formulas on which the proof (the proof can be seen as a special trace of the solver's run) can be polynomial (w.r.t. the number of variables) for CDCL solvers whereas it is necessarily exponential for DPLL ones. The opposite is not true.

I have organized these notes in the document: Structure Time Scale of the CDCL SAT Solver

The formal article title is: Instance Assignment Coverage Feature for Operation Control of SAT Solve

\subsection{Cognitive diversity of CDCL}
\label{sub:cognitive diversity of CDCL}

In research on improving solver performance, new ideas and methods keep emerging from different ways of understanding CDCL.
%In research on improving solver performance, new and innovative ideas and methods keep emerging, depending on the angle from which CDCL is understood.

G. Audemard first proposed the concepts of propagation sequence and implication sequence \cite{AudemardBHJS08}, which take into account the satisfied clauses that modern solvers usually ignore during BCP and extend the formal description of the entailment process. Following this line of thinking, the conflict implication graph was later extended (under the name Extended Implication Graph) for conflict analysis and learned-clause simplification \cite{Gelder11a}.
%G. Audemard first proposed the concepts of propagation sequence and implication sequence \cite{AudemardBHJS08}, taking into account the satisfied clauses that modern solvers usually ignore in BCP and extending the formal expression of the entailment process. Following this idea, they later extended the conflict implication graph (named Extended Implication Graph) for conflict analysis and learned-clause simplification \cite{Gelder11a}.

Dynamic adaptive handling has been widely used in CDCL solvers. During solving, learned-clause generation based on the conflict implication graph \cite{Gelder11a,0001LM21}, intelligent backtracking \cite{SilvaS96,AudemardBHJS08,NadelR18}, and adaptive restarts \cite{AudemardS12,Biere08,ZulkoskiMWRLCG18} are all direct manifestations of this technical idea.
%Dynamic adaptive handling has been widely applied in CDCL solvers. During solving, learned-clause generation based on the conflict implication graph \cite{Gelder11a,0001LM21}, intelligent backtracking \cite{SilvaS96,AudemardBHJS08,NadelR18}, and adaptive restarts \cite{AudemardS12,Biere08,ZulkoskiMWRLCG18} are all direct manifestations of this technical idea.

Literal Block Distance (LBD) is a criterion for evaluating the quality of learned clauses \cite{AudemardS09}. The LBD value of a learned clause involved in Boolean constraint propagation is continually re-evaluated, so that the minimum value observed over the entire search is retained. Depending on how the LBD value of an existing learned clause changes, the clause can be moved between different clause sets \cite{abs-2110-14187}.
%Literal block distance (LBD) is a criterion for evaluating the quality of learned clauses \cite{AudemardS09}. The LBD values of learned clauses involved in Boolean constraint propagation are continually re-evaluated, keeping the minimum LBD over the whole search. S. Jamali moves existing learned clauses between different clause sets according to the dynamic change of their LBD values \cite{abs-2110-14187}.
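The LBD of a clause is the number of distinct decision levels among its literals, which the following minimal sketch computes. The function names and the `level_of` mapping from variable to decision level are assumptions for illustration; some implementations additionally ignore level 0, which is not done here.

```python
def literal_block_distance(clause, level_of):
    """Number of distinct decision levels among the clause's literals."""
    return len({level_of[abs(lit)] for lit in clause})


def refresh_lbd(stored_lbd, clause, level_of):
    """Re-evaluate the LBD during propagation and keep the minimum seen."""
    return min(stored_lbd, literal_block_distance(clause, level_of))
```

For instance, a clause over variables assigned at levels 3, 3, 5, and 1 has LBD 3; if its stored value was 5, `refresh_lbd` lowers it to 3, which may qualify the clause for a higher-quality set.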

A. Goultiaeva demonstrated that CDCL can be reformulated as a local search algorithm that, through clause learning, is able to prove UNSAT \cite{GoultiaevaB12}. At the time, this novel perspective was considered to open up avenues for further research and algorithm design.
%A. Goultiaeva showed that CDCL can be reformulated as a local search algorithm that is able to prove UNSAT through clause learning. At the time, this novel perspective was considered to open the way for further research and algorithm design.
%A. Goultiaeva show that CDCL can be reformulated as a local search algorithm that through clause learning is able to prove UNSAT \cite{GoultiaevaB12}.

Inspired by the bandit framework and reinforcement learning, Liang et al. successively proposed the CHB and LRB branching heuristics, which choose variables involved in recent conflicts to branch on \cite{LiangGZZC15,LiangGPC16CHB,LiangGPC16}. The intuition is that assignments to these variables are likely to generate further conflicts, leading to useful learned clauses and thus pruning the search space.
Gilles Audemard proposed to renew the vision of CDCL solvers: instead of seeing them as an improvement of a DPLL search, they are seen as clause producers \cite{AudemardS18}.
%Inspired by the bandit framework and reinforcement learning, Liang et al. proposed the LRB and CHB branching heuristics, which choose variables involved in recent conflicts to branch on \cite{LiangGZZC15,LiangGPC16CHB,LiangGPC16}. The intuition is that assignments to these variables are likely to generate further conflicts, leading to useful learned clauses and thus pruning the search space. Gilles Audemard proposed to renew the vision of CDCL solvers, seeing them as clause producers rather than as an improvement of a DPLL search \cite{AudemardS18}.
Based on the observation that conflict generation during CDCL search alternates between conflict depression (CD) and conflict burst (CB) periods, Md. Solimul Chowdhury et al. proposed a random exploration strategy called expSAT \cite{ChowdhuryMY20}.
%Based on the observation that conflicts alternate between depression periods and burst periods, Md. Solimul Chowdhury et al. proposed a random exploration strategy called expSAT.

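The idea of exploiting hybrid advantages was adopted very early. Shallow combination concatenates different branching decision heuristics to take advantage of their respective strengths \cite{AudemardS18,XiaoLLMLL19}. The latest innovation in deep cooperation relaxes the CDCL framework: promising branches are extended to complete assignments and a local search solver is invoked to search for a nearby model, while the local search assignments and the conflict frequency of variables in local search are exploited in the phase selection and branching heuristics of CDCL \cite{CaiZ22}.
%The idea of hybrid advantage was adopted very early. Shallow combination concatenates different branching decision heuristics to exploit the strengths of different strategies. The latest innovation in deep cooperation relaxes the CDCL framework, exploits the local search assignments and the conflict frequency of variables in local search in the phase selection and branching heuristics of CDCL, and invokes a local search solver on promising branches to search for nearby models \cite{CaiZ22}.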
%First, we relax the CDCL framework by extending promising branches to complete assignments and calling a local search solver to search for a model nearby. More importantly, the local search assignments and the conflict frequency of variables in local search are exploited in the phase selection and branching heuristics of CDCL.

The perspectives above, including reinforcement learning, learned-clause production, and local exploration, provide an understanding of CDCL from multiple angles. Inspired by these insights, in the next section we propose a holistic exploration perspective on the CDCL process and define the associated concepts.
%The perspectives above, including reinforcement learning, learned-clause producers, and local exploration, provide an understanding of CDCL from multiple angles. Inspired by them, in the next section we propose a holistic exploration perspective on the CDCL process and define the related concepts.

posted on 2024-04-07 14:25 海阔凭鱼跃越