This topic involves a series of papers:

Paper 1: Locality and Hard SAT-Instances

Markström, Klas. ‘Locality and Hard SAT-Instances’. Journal on Satisfiability, Boolean Modeling and Computation, vol. 2, no. 1-4, pp. 221-227, 2006

 


 

Abstract

In this note we construct a family of SAT-instance based on Eulerian graphs which are aimed at being hard for resolution based SAT-solvers. We discuss some experiments made with instances of this type and how a solver can try to avoid at least some of the pitfalls presented by these instances. Finally we look at how the density of subformulae can influence the hardness of SAT instances.

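In particular, the note discusses experiments made with instances of this type, and how a solver can try to avoid at least some of the pitfalls these instances present.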

   
 

1. Introduction

DPLL method

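This method essentially performs a depth-first search combined with unit-clause propagation.

As a proof system, the method is polynomially equivalent to tree-like resolution. This is a statement about proof size, not a claim that the method runs in polynomial time.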
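The depth-first search with unit propagation described above can be sketched as follows. This is a minimal illustration, not the code of any particular solver: the names `simplify` and `dpll` are hypothetical, clauses are assumed to be lists of DIMACS-style integer literals, and the branching rule (first literal of the first clause) is chosen only for simplicity.

```python
from typing import Dict, List, Optional

Clause = List[int]            # DIMACS-style literals: 3 means x3, -3 means (not x3)
Assignment = Dict[int, bool]


def simplify(clauses: List[Clause], lit: int) -> Optional[List[Clause]]:
    """Set `lit` to true; return None if some clause becomes empty."""
    result = []
    for clause in clauses:
        if lit in clause:
            continue                          # clause is satisfied, drop it
        reduced = [l for l in clause if l != -lit]
        if not reduced:
            return None                       # conflict: empty clause
        result.append(reduced)
    return result


def dpll(clauses: List[Clause],
         assignment: Optional[Assignment] = None) -> Optional[Assignment]:
    """Return a (possibly partial) satisfying assignment, or None if unsatisfiable."""
    assignment = dict(assignment or {})
    if any(not c for c in clauses):
        return None                           # the input already has an empty clause
    # Unit propagation: assign every literal forced by a unit clause.
    while True:
        unit = next((c[0] for c in clauses if len(c) == 1), None)
        if unit is None:
            break
        assignment[abs(unit)] = unit > 0
        clauses = simplify(clauses, unit)
        if clauses is None:
            return None
    if not clauses:
        return assignment                     # every clause is satisfied
    # Depth-first branching: try the literal, then its negation.
    lit = clauses[0][0]
    for choice in (lit, -lit):
        reduced = simplify(clauses, choice)
        if reduced is not None:
            model = dpll(reduced, {**assignment, abs(choice): choice > 0})
            if model is not None:
                return model
    return None
```

Each failed branch of this search corresponds to a subtree of a tree-like resolution refutation, which is why the two are compared as proof systems.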

Tree-like resolution is a fairly weak proof system, and for a long time the performance of solvers was quite limited.

 

Tree-like resolution is weaker than regular resolution (and than many proper refinements of general resolution) and Davis-Putnam resolution.

 

The aim of this note is to construct a family of CNF formulae which are specially tuned to make use of the structure of current solvers to produce small hard examples.

Several instances from the constructed family were submitted as benchmarks for the 2005 SAT solver competition, none of which could be solved by the competing solvers.

In connection with this we will also give a short discussion of the importance of local dense subformulae for the performance of resolution based solvers.

Here we will also comment on some structural similarities between our family of formulae and formulae from the random k-SAT distribution.

   
 

2. The problem class

The graph coloring problem and its formalization
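As a rough illustration of the formalization, the sketch below builds the direct CNF encoding of k-colorability without any library. Every vertex gets at least one color, optionally at most one color, and adjacent vertices never share a color. The function names `coloring_clauses` and `satisfiable` and the variable numbering are assumptions made for this example, and the brute-force check is only meant for tiny graphs.

```python
from itertools import combinations, product
from typing import List, Sequence, Tuple


def coloring_clauses(n: int, edges: Sequence[Tuple[int, int]], k: int,
                     functional: bool = True) -> List[List[int]]:
    """CNF clauses stating that a graph on vertices 1..n is k-colorable."""
    def var(v: int, c: int) -> int:
        # DIMACS variable meaning "vertex v has color c" (v in 1..n, c in 1..k)
        return (v - 1) * k + c

    clauses = []
    for v in range(1, n + 1):
        # Each vertex gets at least one color.
        clauses.append([var(v, c) for c in range(1, k + 1)])
        if functional:
            # Each vertex gets at most one color.
            for c1, c2 in combinations(range(1, k + 1), 2):
                clauses.append([-var(v, c1), -var(v, c2)])
    for u, v in edges:
        # Adjacent vertices do not share a color.
        for c in range(1, k + 1):
            clauses.append([-var(u, c), -var(v, c)])
    return clauses


def satisfiable(clauses: List[List[int]], num_vars: int) -> bool:
    """Brute-force satisfiability test; exponential, for tiny formulas only."""
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False


triangle = [(1, 2), (2, 3), (1, 3)]
two_colorable = satisfiable(coloring_clauses(3, triangle, 2), 3 * 2)
three_colorable = satisfiable(coloring_clauses(3, triangle, 3), 3 * 3)
```

For the triangle, `two_colorable` is false and `three_colorable` is true, as expected for an odd cycle.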

 

The corresponding implementation that generates the CNF

coloring.py

#!/usr/bin/env python
# -*- coding:utf-8 -*-
"""Formulas that encode coloring related problems
"""

from cnfgen.formula.cnf import CNF
from cnfgen.graphs import Graph
from cnfgen.localtypes import non_negative_int

def GraphColoringFormula(G, colors, functional=True, formula_class=CNF):
    """Generates the clauses for colorability formula

    The formula encodes the fact that the graph :math:`G` has a coloring
    with color set ``colors``. This means that it is possible to
    assign one among the elements in ``colors`` to each vertex of
    the graph such that no two adjacent vertices get the same color.

    Parameters
    ----------
    G : cnfgen.Graph
        a simple undirected graph
    colors : non negative int
        the number of colors
    functional: bool
        forbid a vertex to be mapped to multiple colors

    Returns
    -------
    CNF
       the CNF encoding of the coloring problem on graph ``G``

    """
    non_negative_int(colors, 'colors')
    G = Graph.normalize(G, 'G')

    # Describe the formula
    description = "Graph {}-Colorability of {}".format(colors,G)
    F = formula_class(description=description)
    col = F.new_mapping(G.order(), colors,label='x_{{{0}{1}}}')

    # Color each vertex
    F.force_complete_mapping(col)
    if functional:
        F.force_functional_mapping(col)


    # This is a legal coloring
    for (v1, v2) in G.edges():
        for c in range(1,colors+1):
            F.add_clause([-col(v1, c), -col(v2, c)])

    return F


def EvenColoringFormula(G, formula_class=CNF):
    """Even coloring formula

    The formula is defined on a graph :math:`G` and claims that it is
    possible to split the edges of the graph in two parts, so that
    each vertex has an equal number of incident edges in each part.

    The formula is defined on graphs where all vertices have even
    degree. The formula is satisfiable only on those graphs with an
    even number of edges in each connected component [1]_.

    Arguments
    ---------
    G : cnfgen.Graph
       a simple undirected graph where all vertices have even degree

    Raises
    ------
    ValueError
       if the graph in input has a vertex with odd degree

    Returns
    -------
    CNF object

    References
    ----------
    .. [1] Locality and Hard SAT-instances, Klas Markstrom
       Journal on Satisfiability, Boolean Modeling and Computation 2 (2006) 221-228

    """
    G = Graph.normalize(G, 'G')

    description = "Even coloring formula on " + G.name
    F = formula_class(description=description)

    e = F.new_graph_edges(G)

    # Defined on both side
    for v in G.vertices():

        if G.degree(v) % 2 == 1:
            raise ValueError(
                "Markstrom's Even Coloring formulas requires all vertices to have even degree."
            )

        edge_vars = [e(u, v) for u,v in e.indices(v,None)]

        F.cardinality_eq(edge_vars, len(edge_vars) // 2)
    return F

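To make the even coloring constraint concrete, here is a library-free sketch of the same idea. It uses one Boolean variable per edge, and at every vertex exactly half of the incident edge variables must be true. The names `even_coloring_clauses` and `satisfiable` are hypothetical, the graph is assumed to be simple, and the naive subset-based cardinality encoding below is an assumption for illustration. It is not necessarily the encoding that cnfgen's `cardinality_eq` produces.

```python
from itertools import combinations, product
from typing import List, Sequence, Tuple


def even_coloring_clauses(edges: Sequence[Tuple[int, int]]):
    """Return (clauses, number_of_variables) for the even coloring constraint."""
    edges = [tuple(sorted(e)) for e in edges]
    var = {e: i for i, e in enumerate(edges, start=1)}   # one variable per edge
    incident = {}
    for e in edges:
        for v in e:
            incident.setdefault(v, []).append(var[e])

    clauses: List[List[int]] = []
    for v, lits in sorted(incident.items()):
        d = len(lits)
        if d % 2 == 1:
            raise ValueError("all vertices must have even degree")
        # "Exactly d/2 true": every subset of d/2 + 1 incident edges must
        # contain at least one true and at least one false variable.
        for subset in combinations(lits, d // 2 + 1):
            clauses.append(list(subset))
            clauses.append([-x for x in subset])
    return clauses, len(edges)


def satisfiable(clauses: List[List[int]], num_vars: int) -> bool:
    """Brute-force satisfiability test; exponential, for tiny formulas only."""
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False


triangle_cnf, triangle_vars = even_coloring_clauses([(1, 2), (2, 3), (3, 1)])
square_cnf, square_vars = even_coloring_clauses([(1, 2), (2, 3), (3, 4), (4, 1)])
```

The triangle has an odd number of edges, so its formula is unsatisfiable, while the 4-cycle formula is satisfied by alternating the two parts around the cycle. This matches the parity condition stated in the docstring above.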
 

cnf.py

#!/usr/bin/env python
# -*- coding:utf-8 -*-
"""Build and manipulate CNF formulas

The module contains facilities to generate cnf formulas, in order to
be printed in DIMACS, OPB or LaTeX formats. Such formulas are ready to
be fed to sat solvers.

The module implements the `CNF` object, which is the main entry point
to the `cnfgen` library.

Copyright (C) 2012-2022  Massimo Lauria <lauria.massimo@gmail.com>
https://github.com/MassimoLauria/cnfgen.git

"""
from cnfgen.formula.cnfio import CNFio
from cnfgen.formula.linear import CNFLinear
from cnfgen.formula.variables import VariablesManager


class CNF(VariablesManager, CNFio, CNFLinear):
    """Propositional formulas in conjunctive normal form.

    A CNF  formula is a  sequence of  clauses, which are  sequences of
    literals. Each literal is either a variable or its negation.

    Use ``add_clause`` to add new clauses to CNF. Clauses will be added
    multiple times in case of multiple insertion of the same clauses.

    For documentation purpose it is possible to have an additional
    comment header at the top of the formula, which will be
    *optionally* exported to LaTeX or dimacs.

    Implementation:  for efficiency reason clauses and variable can
    only be added, and not deleted. Furthermore order matters in
    the representation.

    Examples
    --------
    >>> c=CNF([[1, 2, -3], [-2, 4]])
    >>> print( c.to_dimacs(),end='')
    p cnf 4 2
    1 2 -3 0
    -2 4 0
    >>> c.add_clause([-3, 4, -5])
    >>> print( c.to_dimacs(),end='')
    p cnf 5 3
    1 2 -3 0
    -2 4 0
    -3 4 -5 0
    >>> print(c[1])
    [-2, 4]
    """

    def __init__(self, clauses=None, description=None):
        """Propositional formulas in conjunctive normal form.

        Parameters
        ----------
        clauses : ordered list of clauses
            a clause with k literals is a list containing k pairs, each
            representing a literal (see `add_clause`). First element
            is the polarity and the second is the variable, which must
            be an hashable object.

            E.g. (not x3) or x4 or (not x2) is encoded as
            [(False,"x3"),(True,"x4"),(False,"x2")]

        description: string, optional
            a description of the formula
        """
        CNFLinear.__init__(self,
                           clauses=clauses,
                           description=description)
        VariablesManager.__init__(self,self)
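The DIMACS output shown in the docstring example can be reproduced with a few lines of plain Python. This is only a sketch of the format, under the assumption that clauses are lists of nonzero integers. The function name `clauses_to_dimacs` is hypothetical and is not part of cnfgen.

```python
from typing import List


def clauses_to_dimacs(clauses: List[List[int]]) -> str:
    """Serialize integer clauses in DIMACS CNF format."""
    # The number of variables is taken to be the largest variable index used.
    num_vars = max((abs(l) for c in clauses for l in c), default=0)
    lines = ["p cnf {} {}".format(num_vars, len(clauses))]
    for clause in clauses:
        # Each clause is its literals followed by a terminating 0.
        lines.append(" ".join([str(l) for l in clause] + ["0"]))
    return "\n".join(lines) + "\n"


text = clauses_to_dimacs([[1, 2, -3], [-2, 4]])
```

Here `text` consists of the header line `p cnf 4 2` followed by the two clause lines `1 2 -3 0` and `-2 4 0`, matching the first docstring example.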

 

graphs.py

#!/usr/bin/env python
# -*- coding:utf-8 -*-
"""Utilities to manage graph formats and graph files in order to build
formulas that are graph based.
"""

import os
import io
import random
from io import StringIO
import copy
from bisect import bisect_right, bisect_left

import networkx

from cnfgen.localtypes import positive_int, non_negative_int

__all__ = [
    "readGraph", "writeGraph",
    "Graph", "DirectedGraph", "BipartiteGraph",
    "supported_graph_formats",
    "bipartite_random_left_regular", "bipartite_random_regular",
    "bipartite_random_m_edges", "bipartite_random",
    "dag_complete_binary_tree", "dag_pyramid", "dag_path"
]

#################################################################
#          Import third party code
#################################################################


class BipartiteEdgeList():
    """Edge list for bipartite graphs"""

    def __init__(self, B):
        self.B = B

    def __len__(self):
        return self.B.number_of_edges()

    def __contains__(self, t):
        return len(t) == 2 and self.B.has_edge(t[0], t[1])

    def __iter__(self):
        for u in range(1, self.B.left_order() + 1):
            yield from ((u, v) for v in self.B.right_neighbors(u))


class GraphEdgeList():
    """Edge list for simple graphs"""

    def __init__(self, G):
        self.G = G

    def __len__(self):
        return self.G.number_of_edges()

    def __contains__(self, t):
        return len(t) == 2 and self.G.has_edge(t[0], t[1])

    def __iter__(self):
        n = self.G.number_of_vertices()
        G = self.G
        for u in range(1, n):
            # Yield each edge once, as (u, v) with u < v
            pos = bisect_right(G.adjlist[u], u)
            while pos < len(G.adjlist[u]):
                v = G.adjlist[u][pos]
                yield (u, v)
                pos += 1


class DirectedEdgeList():
    """Edge list for directed graphs"""

    def __init__(self, D, sort_by_predecessors=True):
        self.D = D
        self.sort_by_pred = sort_by_predecessors

    def __len__(self):
        return self.D.number_of_edges()

    def __contains__(self, t):
        return len(t) == 2 and self.D.has_edge(t[0], t[1])

    def __iter__(self):
        n = self.D.number_of_vertices()
        if self.sort_by_pred:
            successors = self.D.succ
            for src in range(1, n+1):
                for dest in successors[src]:
                    yield (src, dest)
        else:
            predecessors = self.D.pred
            for dest in range(1, n+1):
                for src in predecessors[dest]:
                    yield (src, dest)


class BaseGraph():
    """Base class for graphs"""

    def is_dag(self):
        """Test whether the graph is directed acyclic

This is not a full test. It only checks that all directed edges (u,v)
have that u < v."""
        raise NotImplementedError

    def is_directed(self):
        "Test whether the graph is directed"
        raise NotImplementedError

    def is_multigraph(self):
        "Test whether the graph can have multi-edges"
        return False

    def is_bipartite(self):
        "Test whether the graph is a bipartite object"
        return False

    def order(self):
        return self.number_of_vertices()

    def vertices(self):
        return range(1, self.number_of_vertices()+1)

    def number_of_vertices(self):
        raise NotImplementedError

    def number_of_edges(self):
        raise NotImplementedError

    def has_edge(self, u, v):
        raise NotImplementedError

    def add_edge(self, u, v):
        raise NotImplementedError

    def add_edges_from(self, edges):
        for u, v in edges:
            self.add_edge(u, v)

    def edges(self):
        raise NotImplementedError

    def __len__(self):
        return self.number_of_vertices()

    def to_networkx(self):
        """Convert the graph TO a networkx object."""
        raise NotImplementedError

    @classmethod
    def from_networkx(cls, G):
        """Create a graph object from a networkx graph"""
        raise NotImplementedError

    @classmethod
    def normalize(cls, G):
        """Guarantees a cnfgen graph object"""
        raise NotImplementedError

    @classmethod
    def supported_file_formats(cls):
        """File formats supported for graph I/O"""
        raise NotImplementedError

    @classmethod
    def graph_type_name(cls):
        """Name of the graph type, used to label graphs in I/O"""
        raise NotImplementedError

    @classmethod
    def from_file(cls, fileorname, fileformat=None):
        """Load the graph from a file

        The file format is either indicated in the `fileformat` variable or,
        if that is `None`, inferred from the extension of the filename.

        Parameters
        -----------
        fileorname: str or file-like object
            the input file from which the graph is read. If it is a string
            then the graph is read from a file with that string as
            filename. Otherwise if the fileorname is a file object (or
            a text stream), the graph is read from there.

            Input files are assumed to be UTF-8 by default (for some
            formats it is actually ascii)

        fileformat: string, optional
            The file format that the parser should expect to receive.
            See also :py:func:`cnfgen.graph.supported_formats`. By default
            it tries to autodetect it from the file name extension (when applicable)."""

        # Reduce to the case of filestream
        if isinstance(fileorname, str):
            with open(fileorname, 'r', encoding='utf-8') as file_handle:
                return cls.from_file(file_handle, fileformat)

        # Discover and test file format
        fileformat = guess_fileformat(fileorname, fileformat)
        allowed = cls.supported_file_formats()
        typename = cls.graph_type_name()
        if fileformat not in allowed:
            raise ValueError(
                "Invalid file type."
                " For {} graphs we support {}".format(typename,
                                                      allowed))

        # Read file
        return readGraph(fileorname, typename, fileformat)


class Graph(BaseGraph):

    def is_dag(self):
        return False

    def is_directed(self):
        return False

    def __init__(self, n, name=None):
        non_negative_int(n, 'n')
        self.n = n
        self.m = 0
        self.adjlist = [[] for i in range(n+1)]
        self.edgeset = set()
        if name is None:
            self.name = "a simple graph with {} vertices".format(n)
        else:
            self.name = name

    def add_edge(self, u, v):
        if not (1 <= u <= self.n and 1 <= v <= self.n and u != v):
            raise ValueError(
                "u,v must be distinct, between 1 and the number of nodes")
        if (u, v) in self.edgeset:
            return
        u, v = min(u, v), max(u, v)
        pos = bisect_right(self.adjlist[u], v)
        self.adjlist[u].insert(pos, v)
        pos = bisect_right(self.adjlist[v], u)
        self.adjlist[v].insert(pos, u)
        self.m += 1
        self.edgeset.add((u, v))
        self.edgeset.add((v, u))

    def update_vertex_number(self, new_value):
        """Raises the number of vertices to `new_value`"""
        non_negative_int(new_value, 'new_value')
        for _ in range(self.n,new_value):
            self.adjlist.append([])
        self.n = max(self.n, new_value)

    def remove_edge(self,u,v):
        if not self.has_edge(u,v):
            return
        self.edgeset.remove((u,v))
        self.edgeset.remove((v,u))
        self.adjlist[u].remove(v)
        self.adjlist[v].remove(u)
        self.m -= 1

    def has_edge(self, u, v):
        return (u, v) in self.edgeset

    def vertices(self):
        return range(1, self.n+1)

    def edges(self):
        """Outputs all edges in the graph"""
        return GraphEdgeList(self)

    def number_of_vertices(self):
        return self.n

    def number_of_edges(self):
        return self.m

    def to_networkx(self):
        G = networkx.Graph()
        G.add_nodes_from(range(1, self.n+1))
        G.add_edges_from(self.edges())
        return G

    def neighbors(self, u):
        """Outputs the neighbors of vertex `u`

The sequence of neighbors is guaranteed to be sorted.
"""
        if not(1 <= u <= self.n):
            raise ValueError("vertex u not in the graph")
        yield from self.adjlist[u]

    def degree(self, u):
        if not(1 <= u <= self.n):
            raise ValueError("vertex u not in the graph")
        return len(self.adjlist[u])

    @classmethod
    def from_networkx(cls, G):
        if not isinstance(G, networkx.Graph):
            raise ValueError('G is expected to be of type networkx.Graph')
        G = normalize_networkx_labels(G)
        C = cls(G.order())
        C.add_edges_from(G.edges())
        try:
            C.name = G.name
        except AttributeError:
            C.name = '<unknown graph>'
        return C

    @classmethod
    def graph_type_name(cls):
        """Simple graphs are labeled as 'simple'"""
        return 'simple'

    @classmethod
    def supported_file_formats(cls):
        """File formats supported for simple graph I/O"""
        # Check that DOT is a supported format
        if has_dot_library():
            return ['kthlist', 'gml', 'dot', 'dimacs']
        else:
            return ['kthlist', 'gml', 'dimacs']

    @classmethod
    def null_graph(cls):
        return cls(0, 'the null graph')

    @classmethod
    def empty_graph(cls, n):
        return cls(n, 'the empty graph of order '+str(n))

    @classmethod
    def complete_graph(cls, n):
        G = cls(n, 'the complete graph of order '+str(n))
        for u in range(1, n):
            for v in range(u+1, n+1):
                G.add_edge(u, v)
        return G

    @classmethod
    def star_graph(cls, n):
        G = cls(n+1, 'the star graph with {} arms'.format(n))
        for u in range(1, n+1):
            G.add_edge(u, n+1)
        return G

    @classmethod
    def normalize(cls, G, varname=''):
        """Guarantees a cnfgen.graphs.Graph object

If the given graph `G` is a networkx.Graph object, this method
produces a CNFgen simple graph object, relabeling vertices so that
vertices are labeled as numbers from 1 to `n`, where `n` is the number
of vertices in `G`. If the vertices in the original graph have some
kind of order, the order is preserved.

If `G` is already a `cnfgen.graphs.Graph` object, nothing is done.

        Parameters
        ----------
        cls: a class

        G : networkx.Graph or cnfgen.Graph
            the graph to normalize/check
        varname: str
            the variable name, for error messages (default: 'G')
        """
        typemsg = "type of argument '{}' must be either networkx.Graph or cnfgen.Graph"
        conversionmsg = "cannot convert '{}' into a cnfgen.Graph object"
        if not isinstance(G, (Graph, networkx.Graph)):
            raise TypeError(typemsg.format(varname))
        if isinstance(G, Graph):
            return G
        try:
            G2 = cls.from_networkx(G)
            return G2
        except AttributeError:
            raise ValueError(conversionmsg.format(varname))


class DirectedGraph(BaseGraph):

    def is_dag(self):
        """Is the graph acyclic?

The vertices in the graph are assumed to be topologically sorted,
therefore this function just determines whether there are edges going
backward with respect to this order, which can be done in O(1) because
edges can be added and not removed."""
        return self.still_a_dag

    def is_directed(self):
        return True

    def __init__(self, n, name='a simple directed graph'):
        non_negative_int(n, 'n')
        self.n = n
        self.m = 0
        self.edgeset = set()
        self.still_a_dag = True
        self.pred = [[] for i in range(n+1)]
        self.succ = [[] for i in range(n+1)]
        if name is None:
            self.name = "a directed graph with {} vertices".format(n)
        else:
            self.name = name

    def add_edge(self, src, dest):
        if not (1 <= src <= self.n and 1 <= dest <= self.n):
            raise ValueError(
                "u,v must be distinct, between 1 and the number of nodes")
        if self.has_edge(src, dest):
            return
        if src >= dest:
            self.still_a_dag = False

        pos = bisect_right(self.pred[dest], src)
        self.pred[dest].insert(pos, src)

        pos = bisect_right(self.succ[src], dest)
        self.succ[src].insert(pos, dest)

        self.m += 1
        self.edgeset.add((src, dest))

    def has_edge(self, src, dest):
        """True if graph contains directed edge (src,dest)"""
        return (src, dest) in self.edgeset

    def vertices(self):
        return range(1, self.n+1)

    def edges(self):
        return DirectedEdgeList(self)

    def edges_ordered_by_successors(self):
        return DirectedEdgeList(self, sort_by_predecessors=False)

    def number_of_vertices(self):
        return self.n

    def number_of_edges(self):
        return self.m

    def to_networkx(self):
        G = networkx.DiGraph()
        G.add_nodes_from(range(1, self.n+1))
        G.add_edges_from(self.edges())
        return G

    def predecessors(self, u):
        """Outputs the predecessors of vertex `u`

The sequence of predecessors is guaranteed to be sorted."""
        if not(1 <= u <= self.n):
            raise ValueError("vertex u not in the graph")
        yield from self.pred[u]

    def successors(self, u):
        """Outputs the successors of vertex `u`

The sequence of successors is guaranteed to be sorted."""
        if not(1 <= u <= self.n):
            raise ValueError("vertex u not in the graph")
        yield from self.succ[u]

    def in_degree(self, u):
        if not(1 <= u <= self.n):
            raise ValueError("vertex u not in the graph")
        return len(self.pred[u])

    def out_degree(self, v):
        if not(1 <= v <= self.n):
            raise ValueError("vertex v not in the graph")
        return len(self.succ[v])

    @classmethod
    def from_networkx(cls, G):
        if not isinstance(G, networkx.DiGraph):
            raise ValueError('G is expected to be of type networkx.DiGraph')
        G = normalize_networkx_labels(G)
        C = cls(G.order())
        C.add_edges_from(G.edges())
        try:
            C.name = G.name
        except AttributeError:
            C.name = '<unknown graph>'
        return C

    @classmethod
    def graph_type_name(cls):
        """Directed graphs are labeled as 'digraph'"""
        return 'digraph'

    @classmethod
    def supported_file_formats(cls):
        """File formats supported for directed graph I/O"""
        if has_dot_library():
            return ['kthlist', 'gml', 'dot', 'dimacs']
        else:
            return ['kthlist', 'gml', 'dimacs']

    @classmethod
    def normalize(cls, G, varname='G'):
        """Guarantees a cnfgen.graphs.DirectedGraph object

If the given graph `G` is a networkx.DiGraph object, this method
produces a CNFgen directed graph object, relabeling vertices so that
vertices are labeled as numbers from 1 to `n`, where `n` is the number
of vertices in `G`. If the vertices in the original graph have some
kind of order, the order is preserved.

If all edges go from lower vertices to higher vertices, with respect
to the labeling, then the graph is considered a directed acyclic
graph DAG.

If `G` is already a `cnfgen.graphs.DirectedGraph` object, nothing is done.

        Parameters
        ----------
        cls: a class

        G : networkx.DiGraph or cnfgen.DirectedGraph
            the graph to normalize/check
        varname: str
            the variable name, for error messages (default: 'G')
        """
        typemsg = "type of argument '{}' must be either networkx.DiGraph or cnfgen.DirectedGraph"
        conversionmsg = "cannot convert '{}' into a cnfgen.DirectedGraph object"
        if not isinstance(G, (DirectedGraph, networkx.DiGraph)):
            raise TypeError(typemsg.format(varname))
        if isinstance(G, DirectedGraph):
            return G
        try:
            G2 = cls.from_networkx(G)
            return G2
        except AttributeError:
            raise ValueError(conversionmsg.format(varname))


class BaseBipartiteGraph(BaseGraph):
    """Base class for bipartite graphs"""

    def __init__(self, L, R, name=None):
        non_negative_int(L, 'L')
        non_negative_int(R, 'R')
        self.lorder = L
        self.rorder = R
        if name is None:
            self.name = 'a bipartite graph with ({},{}) vertices'.format(L, R)
        else:
            self.name = name

    def is_bipartite(self):
        return True

    def number_of_vertices(self):
        return self.lorder + self.rorder

    def edges(self):
        return BipartiteEdgeList(self)

    def left_order(self):
        return self.lorder

    def right_order(self):
        return self.rorder

    def left_degree(self, v):
        return len(self.left_neighbors(v))

    def right_degree(self, u):
        return len(self.right_neighbors(u))

    def left_neighbors(self, v):
        raise NotImplementedError

    def right_neighbors(self, u):
        raise NotImplementedError

    def parts(self):
        return range(1, self.lorder + 1), range(1, self.rorder + 1)

    def to_networkx(self):
        G = networkx.Graph()
        n, m = self.lorder, self.rorder
        G.add_nodes_from(range(1, n+1), bipartite=0)
        G.add_nodes_from(range(n+1, m+n+1), bipartite=1)
        G.add_edges_from((u, v+n) for (u, v) in self.edges())
        G.name = self.name
        return G


 598 class BipartiteGraph(BaseBipartiteGraph):
 599     def __init__(self, L, R, name=None):
 600         non_negative_int(L, 'L')
 601         non_negative_int(R, 'R')
 602         BaseBipartiteGraph.__init__(self, L, R, name)
 603         self.ladj = {}
 604         self.radj = {}
 605         self.edgeset = set()
 606 
 607     def has_edge(self, u, v):
 608         return (u, v) in self.edgeset
 609 
 610     def add_edge(self, u, v):
 611         """Add an edge to the graph.
 612 
 613         - multi-edges are not allowed
 614         - neighbors of a vertex are kept in numberic order
 615 
 616         Examples
 617         --------
 618         >>> G = BipartiteGraph(3,5)
 619         >>> G.add_edge(2,3)
 620         >>> G.add_edge(2,2)
 621         >>> G.add_edge(2,3)
 622         >>> G.right_neighbors(2)
 623         [2, 3]
 624         """
 625         if not (1 <= u <= self.lorder and 1 <= v <= self.rorder):
 626             raise ValueError("Invalid choice of vertices")
 627 
 628         if (u, v) in self.edgeset:
 629             return
 630 
 631         if u not in self.ladj:
 632             self.ladj[u] = []
 633         if v not in self.radj:
 634             self.radj[v] = []
 635 
 636         pv = bisect_right(self.ladj[u], v)
 637         pu = bisect_right(self.radj[v], u)
 638         self.ladj[u].insert(pv, v)
 639         self.radj[v].insert(pu, u)
 640         self.edgeset.add((u, v))
 641 
 642     def number_of_edges(self):
 643         return len(self.edgeset)
 644 
 645     def right_neighbors(self, u):
 646         """Outputs the neighbors of a left vertex `u`
 647 
 648 The sequence of neighbors is guaranteed to be sorted."""
 649         if not (1 <= u <= self.lorder):
 650             raise ValueError("Invalid choice of vertex")
 651         return self.ladj.get(u, [])[:]
 652 
 653     def left_neighbors(self, v):
 654         """Outputs the neighbors of right vertex `u`
 655 
 656 The sequence of neighbors is guaranteed to be sorted."""
 657         if not (1 <= v <= self.rorder):
 658             raise ValueError("Invalid choice of vertex")
 659         return self.radj.get(v, [])[:]
 660 
 661     @classmethod
 662     def from_networkx(cls, G):
 663         """Convert a :py:class:`networkx.Graph` into a :py:class:`cnfgen.graphs.BipartiteGraph`
 664 
 665         In order to convert a :py:class:`networkx.Graph` object `G`,
 666         it is necessary that all nodes in `G` have the property
 667         `bipartite` set to either `0` or `1`.
 668 
 669         If this is not the case, or if there are edges between the two
 670         parts, :py:class:`ValueError` is raised.
 671 
 672         Example
 673         -------
 674         >>> G = networkx.bipartite.complete_bipartite_graph(5,7)
 675         >>> B = BipartiteGraph.from_networkx(G)
 676         >>> print(B.order())
 677         12
 678         >>> print(B.left_order())
 679         5
 680         >>> print(B.has_edge(2,3))
 681         True
 682         """
 683         if not isinstance(G, networkx.Graph):
 684             raise ValueError('G is expected to be of type networkx.Graph')
 685         side = [[], []]
 686         index = [{}, {}]
 687         for u in G.nodes():
 688             try:
 689                 color = G.nodes[u]['bipartite']
 690                 assert color in ['0', 0, '1', 1]
 691             except (KeyError, AssertionError):
 692                 raise ValueError(
 693                     "Node {} lacks the 'bipartite' property set to 0 or 1".format(u))
 694             side[int(color)].append(u)
 695 
 696         B = cls(len(side[0]), len(side[1]))
 697         index[0] = {u: i for (i, u) in enumerate(side[0], start=1)}
 698         index[1] = {v: i for (i, v) in enumerate(side[1], start=1)}
 699         for u, v in G.edges():
 700             ucolor = 0 if (u in index[0]) else 1
 701             vcolor = 1 if (v in index[1]) else 0
 702 
 703             if ucolor == vcolor:
 704                 raise ValueError(
 705                     "Edge ({},{}) has both endpoints on the same side".format(u, v))
 706 
 707             iu, iv = index[ucolor][u], index[vcolor][v]
 708             if ucolor == 0:
 709                 B.add_edge(iu, iv)
 710             else:
 711                 B.add_edge(iv, iu)
 712         try:
 713             B.name = G.name
 714         except AttributeError:
 715             B.name = '<unknown graph>'
 716         return B
 717 
 718     @classmethod
 719     def graph_type_name(cls):
 720         """Bipartite graphs are labeled as 'bipartite'"""
 721         return 'bipartite'
 722 
 723     @classmethod
 724     def supported_file_formats(cls):
 725         """File formats supported for bipartite graph I/O"""
 726         if has_dot_library():
 727             return ['kthlist', 'gml', 'dot', 'matrix']
 728         else:
 729             return ['kthlist', 'gml', 'matrix']
 730 
 731     @classmethod
 732     def normalize(cls, G, varname='G'):
 733         """Guarantees a cnfgen.graphs.BipartiteGraph object
 734 
 735 If the given graph `G` is a networkx.Graph object with a bipartition,
 736 this method produces a CNFgen bipartite graph object, relabeling
 737 vertices so that vertices og each side are labeled as numbers from 1
 738 to `n` and 1 to `m` respectively, where `n` and `m` are the numbers of
 739 vertices in `G` on the left and right side, respectively. If the
 740 vertices in the original graph have some kind of order, the order
 741 is preserved.
 742 
 743 If `G` is already a `cnfgen.graphs.BipartiteGraph` object, nothing is done.
 744 
 745         """
 746         typemsg = "type of argument '{}' must be either networkx.Graph or cnfgen.BipartiteGraph"
 747         conversionmsg = "cannot convert '{}' to a bipartite graph: inconsistent 'bipartite' labeling"
 748         if not isinstance(G, (BipartiteGraph, networkx.Graph)):
 749             raise TypeError(typemsg.format(varname))
 750         if isinstance(G, BipartiteGraph):
 751             return G
 752         try:
 753             G2 = cls.from_networkx(G)
 754             return G2
 755         except AttributeError:
 756             raise ValueError(conversionmsg.format(varname))
 757 
 758 
 759 class CompleteBipartiteGraph(BipartiteGraph):
 760     def __init__(self, L, R):
 761         BipartiteGraph.__init__(self, L, R)
 762         self.name = 'Complete bipartite graph with ({},{}) vertices'.format(
 763             L, R)
 764 
 765     def has_edge(self, u, v):
 766         return (1 <= u <= self.lorder and 1 <= v <= self.rorder)
 767 
 768     def add_edge(self, u, v):
 769         pass
 770 
 771     def number_of_edges(self):
 772         return self.lorder * self.rorder
 773 
 774     def right_neighbors(self, u):
 775         return range(1, self.rorder + 1)
 776 
 777     def left_neighbors(self, v):
 778         return range(1, self.lorder + 1)
 779 
 780 
 781 def has_dot_library():
 782     """Test the presence of pydot
 783     """
 784     try:
 785         # newer version of networkx
 786         from networkx import nx_pydot
 787         import pydot
 788         del pydot
 789         return True
 790     except ImportError:
 791         pass
 792 
 793     return False
 794 
 795 
 796 #################################################################
 797 #          Graph reader/writer
 798 #################################################################
 799 
 800 
 801 def guess_fileformat(fileorname, fileformat=None):
 802     """Guess the file format for the file or filename """
 803     if fileformat is not None:
 804         return fileformat
 805 
 806     try:
 807         if isinstance(fileorname, str):
 808             name = fileorname
 809         else:
 810             name = fileorname.name
 811         return os.path.splitext(name)[-1][1:]
 812     except (AttributeError, ValueError, IndexError):
 813         raise ValueError(
 814             "Cannot guess a file format from arguments. Please specify the format manually.")
 815 
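
The extension-based guess used by `guess_fileformat` can be tried in isolation. This is a minimal sketch rather than CNFgen's own code: the helper name `extension_of` is hypothetical, and it relies only on `os.path.splitext` from the standard library.

```python
import io
import os


def extension_of(fileorname):
    """Return the extension (without the dot) of a filename or named stream.

    Hypothetical helper that mirrors the logic of guess_fileformat.
    """
    name = fileorname if isinstance(fileorname, str) else fileorname.name
    return os.path.splitext(name)[-1][1:]


print(extension_of("graphs/petersen.gml"))   # gml
print(extension_of("archive.tar.dot"))       # dot (only the last suffix counts)
print(repr(extension_of("noextension")))     # '' (nothing to guess from)

try:
    extension_of(io.StringIO("3\n"))         # an in-memory stream has no .name
except AttributeError:
    print("unnamed stream: the format must be given explicitly")
```

An empty extension is returned unchanged, so a caller still has to check the result against the list of supported formats.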
 816 
 817 def _process_graph_io_arguments(iofile, graph_type, file_format, multi_edges):
 818     """Test if the arguments for the graph I/O functions make sense"""
 819 
 820     # Check the file
 821     if not isinstance(iofile, io.TextIOBase) and \
 822        not isinstance(iofile, io.IOBase) and \
 823        not isinstance(iofile, StringIO):
 824         raise ValueError(
 825             "The IO stream \"{}\" does not correspond to a file".format(
 826                 iofile))
 827 
 828     # Check the graph type specification
 829     if graph_type not in ['dag', 'digraph', 'simple', 'bipartite']:
 830         raise ValueError("The graph type must be one of "
 831                          "'dag', 'digraph', 'simple', 'bipartite'")
 832 
 833     if multi_edges:
 834         raise NotImplementedError("Multi edges not supported yet")
 835 
 836     elif graph_type in ["dag", "digraph"]:
 837         grtype = DirectedGraph
 838     elif graph_type == "simple":
 839         grtype = Graph
 840     elif graph_type == "bipartite":
 841         grtype = BipartiteGraph
 842     else:
 843         raise RuntimeError(
 844             "Unknown graph type argument: {}".format(graph_type))
 845 
 846     # Check/discover file format specification
 847     if file_format == 'autodetect':
 848         try:
 849             extension = os.path.splitext(iofile.name)[-1][1:]
 850         except AttributeError:
 851             raise ValueError(
 852                 "Cannot guess a file format from an IO stream with no name. Please specify the format manually."
 853             )
 854         if extension not in grtype.supported_file_formats():
 855             raise ValueError("Cannot guess a file format for {} graphs from the extension of \"{}\". Please specify the format manually.".
 856                              format(graph_type, iofile.name))
 857         else:
 858             file_format = extension
 859 
 860     elif file_format not in grtype.supported_file_formats():
 861         raise ValueError(
 862             "For {} graphs we only support these formats: {}".format(
 863                 graph_type, grtype.supported_file_formats()))
 864 
 865     return (grtype, file_format)
 866 
 867 
 868 def normalize_networkx_labels(G):
 869     """Relabel all vertices as integers starting from 1"""
 870     # Normalize GML file. All nodes are integers starting from 1
 871     try:
 872         G = networkx.convert_node_labels_to_integers(
 873             G, first_label=1, ordering='sorted')
 874     except TypeError:
 875         # Ids cannot be sorted natively
 876         G = networkx.convert_node_labels_to_integers(
 877             G, first_label=1, ordering='default')
 878     return G
 879 
 880 
 881 def readGraph(input_file,
 882               graph_type,
 883               file_format='autodetect',
 884               multi_edges=False):
 885     """Read a Graph from file
 886 
 887     In the case of "bipartite" type, the graph obtained is of
 888     :py:class:`cnfgen.graphs.BipartiteGraph`.
 889 
 890     In the case of "simple" type, the graph obtained is of
 891     :py:class:`cnfgen.graphs.Graph`.
 892 
 893     In the case of "dag" or "digraph" type, the graph obtained is of
 894     :py:class:`cnfgen.graphs.DirectedGraph`.
 895 
 896     The supported file formats are enumerated by the respective class method
 897     ``supported_file_formats``
 898 
 899     In the case of "dag" type, the graph read in input must have
 900     increasing edges, in the sense that all edges must be such that
 901     the source has lower identifier than the sink. (I.e. the numeric
 902     identifiers of the vertices are a topological order for the
 903     graph)
 904 
 905     Parameters
 906     -----------
 907     input_file: str or file-like object
 908         the input file from which the graph is read. If it is a string
 909         then the graph is read from a file with that string as
 910         filename. Otherwise if the input_file is a file object (or
 911         a text stream), the graph is read from there.
 912 
 913         Input files are assumed to be UTF-8 by default.
 914 
 915     graph_type: string in {"simple","digraph","dag","bipartite"}
 916 
 917     file_format: string, optional
 918         The file format that the parser should expect to receive.
 919         See also the method py:method::``supported_file_formats``. By default
 920         it tries to autodetect it from the file name extension (when applicable).
 921 
 922     multi_edges: bool,optional
 923         are multiple edges allowed in the graph? By default this is not allowed.
 924 
 925     Returns
 926     -------
 927     a graph object
 928         one type among Graph, DirectedGraph, BipartiteGraph
 929 
 930     Raises
 931     ------
 932     ValueError
 933         raised when either ``input_file`` is neither a file object
 934         nor a string, or when ``graph_type`` and ``file_format`` are
 935         invalid choices.
 936 
 937     IOError
 938         it is impossible to read the ``input_file``
 939 
 940     See Also
 941     --------
 942     writeGraph, is_dag, has_bipartition
 943 
 944     """
 945     if multi_edges:
 946         raise NotImplementedError("Multi edges not supported yet")
 947 
 948     # file name instead of file object
 949     if isinstance(input_file, str):
 950         with open(input_file, 'r', encoding='utf-8') as file_handle:
 951             return readGraph(file_handle, graph_type, file_format, multi_edges)
 952 
 953     graph_class, file_format = _process_graph_io_arguments(input_file,
 954                                                            graph_type,
 955                                                            file_format,
 956                                                            multi_edges)
 957 
 958     if file_format == 'dot':
 959 
 960         # This is a workaround. In theory a broken dot file should
 961         # cause a pyparsing.ParseError but the dot_reader used by
 962         # networkx seems to mismanage that and to cause a TypeError
 963         #
 964         try:
 965             G = networkx.nx_pydot.read_dot(input_file)
 966             try:
 967                 # work around for a weird parse error in pydot, which
 968                 # adds an additional vertex '\\n' in the graph.
 969                 G.remove_node('\\n')
 970             except networkx.exception.NetworkXError:
 971                 pass
 972             G = graph_class.normalize(G)
 973         except TypeError:
 974             raise ValueError('Parse Error in dot file')
 975 
 976     elif file_format == 'gml':
 977 
 978         # Networkx's GML reader expects to read from ascii encoded
 979         # binary file. We could have sent the data to a temporary
 980         # binary buffer but for some reasons networkx's GML reader
 981         # function is poorly written and does not like such buffers.
 982         # It turns out we can pass the data as a list of
 983         # encoded ascii lines.
 984         #
 985         # The 'id' field in the vertices are supposed to be an integer
 986         # and will be used as identifiers for the vertices in Graph
 987         # object too.
 988         #
 989         try:
 990             G = networkx.read_gml((line.encode('ascii')
 991                                   for line in input_file), label='id')
 992             G = graph_class.normalize(G)
 993         except networkx.NetworkXError as errmsg:
 994             raise ValueError("[Parse error in GML input] {} ".format(errmsg))
 995         except UnicodeEncodeError as errmsg:
 996             raise ValueError(
 997                 "[Non-ascii chars in GML file] {} ".format(errmsg))
 998 
 999     elif file_format == 'kthlist' and graph_type == 'bipartite':
1000 
1001         G = _read_bipartite_kthlist(input_file)
1002 
1003     elif file_format == 'kthlist' and graph_type != 'bipartite':
1004 
1005         G = _read_nonbipartite_kthlist(input_file, graph_class)
1006 
1007     elif file_format == 'dimacs':
1008 
1009         G = _read_graph_dimacs_format(input_file, graph_class)
1010 
1011     elif file_format == 'matrix':
1012 
1013         G = _read_graph_matrix_format(input_file)
1014 
1015     else:
1016         raise RuntimeError(
1017             "[Internal error] Format {} not implemented".format(file_format))
1018 
1019     if graph_type == "dag" and not G.is_dag():
1020         raise ValueError(
1021             "[Input error] Graph must be explicitly acyclic (src->dest edges where src<dest)")
1022 
1023     return G
1024 
1025 
1026 def writeGraph(G, output_file, graph_type, file_format='autodetect'):
1027     """Write a graph to a file
1028 
1029     Parameters
1030     -----------
1031     G : BaseGraph
1032 
1033     output_file: str or file-like object
1034         the output file to which the graph is written. If it is a string
1035         then the graph is written to a file with that string as
1036         filename. Otherwise if ``output_file`` is a file object (or
1037         a text stream), the graph is written there.
1038 
1039         The file is written in UTF-8 by default.
1040 
1041     graph_type: string in {"simple","digraph","dag","bipartite"}
1042         see also :py:func:`cnfgen.graph.supported_formats`
1043 
1044     file_format: string, optional
1045         The file format in which the graph is written.
1046         See also :py:func:`cnfgen.graph.supported_formats`. By default
1047         it tries to autodetect it from the file name extension (when applicable).
1048 
1049     Returns
1050     -------
1051     None
1052 
1053     Raises
1054     ------
1055     ValueError
1056         raised when either ``output_file`` is neither a file object
1057         nor a string, or when ``graph_type`` and ``file_format`` are
1058         invalid choices.
1059 
1060     IOError
1061         it is impossible to write on the ``output_file``
1062 
1063     See Also
1064     --------
1065     readGraph
1066 
1067     """
1068     if not isinstance(G, BaseGraph):
1069         raise TypeError("G must be a cnfgen.graphs.BaseGraph")
1070 
1071     # file name instead of file object
1072     if isinstance(output_file, str):
1073         with open(output_file, 'w', encoding='utf-8') as file_handle:
1074             return writeGraph(G, file_handle, graph_type, file_format)
1075 
1076     _, file_format = _process_graph_io_arguments(output_file, graph_type,
1077                                                  file_format, False)
1078 
1079     if file_format == 'dot':
1080 
1081         G = G.to_networkx()
1082         networkx.nx_pydot.write_dot(G, output_file)
1083 
1084     elif file_format == 'gml':
1085 
1086         # Networkx's GML writer expects to write to an ascii encoded
1087         # binary file. Thus we need to let Networkx write to
1088         # a temporary binary ascii encoded buffer and then convert the
1089         # content before sending it to the output file.
1090         tempbuffer = io.BytesIO()
1091         G = G.to_networkx()
1092         networkx.write_gml(G, tempbuffer)
1093         print(tempbuffer.getvalue().decode('ascii'), file=output_file)
1094 
1095     elif file_format == 'kthlist' and graph_type != 'bipartite':
1096 
1097         _write_graph_kthlist_nonbipartite(G, output_file)
1098 
1099     elif file_format == 'kthlist' and graph_type == 'bipartite':
1100 
1101         _write_graph_kthlist_bipartite(G, output_file)
1102 
1103     elif file_format == 'dimacs':
1104 
1105         _write_graph_dimacs_format(G, output_file)
1106 
1107     elif file_format == 'matrix':
1108 
1109         _write_graph_matrix_format(G, output_file)
1110 
1111     else:
1112         raise RuntimeError(
1113             "[Internal error] Format {} not implemented".format(file_format))
1114 
1115 
1116 #
1117 # In-house parsers
1118 #
1119 def _kthlist_parse(inputfile):
1120     """Read a graph from file and produce its data.
1121 
1122     First yields (number of vertices, first comment line).
1123     Then generates a sequence of (vertex, adjacency list, lineno).
1124 
1125     Raises:
1126         ValueError if parsing fails for some reason
1127     """
1128     # vertex number
1129     size = -1
1130     name = ""
1131 
1132     for i, l in enumerate(inputfile.readlines(), start=1):
1133 
1134         # first non empty comment line is the graph name
1135         # must be before the graph size
1136         if l[0] == 'c':
1137             if size < 0 and len(name) == 0 and len(l[2:].strip()) != 0:
1138                 name += l[2:]
1139             continue
1140 
1141         # empty line
1142         if len(l.strip()) == 0:
1143             continue
1144 
1145         if ':' not in l:
1146             # vertex number spec
1147             if size >= 0:
1148                 raise ValueError(
1149                     "Line {} contains a second spec directive.".format(i))
1150             try:
1151                 size = int(l.strip())
1152                 if size < 0:
1153                     raise ValueError
1154             except ValueError:
1155                 raise ValueError(
1156                     "Non negative number expected at line {}.".format(i))
1157             yield (size, name)
1158             continue
1159 
1160         # Load edges from this line
1161         left, right = l.split(':')
1162         try:
1163             left = int(left.strip())
1164             right = [int(s) for s in right.split()]
1165         except ValueError:
1166             raise ValueError("Non integer vertex ID at line {}.".format(i))
1167         if len(right) < 1 or right[-1] != 0:
1168             raise ValueError("Line {} must end with 0.".format(i))
1169 
1170         if left < 1 or left > size:
1171             raise ValueError(
1172                 "Vertex ID out of range [1,{}] at line {}.".format(size, i))
1173 
1174         right.pop()
1175         if len([x for x in right if x < 1 or x > size]) > 0:
1176             raise ValueError(
1177                 "Vertex ID out of range [1,{}] at line {}.".format(size, i))
1178         yield left, right, i
1179 
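
To make the `kthlist` layout concrete, here is a small standalone sketch. It is not CNFgen's parser: the function name `parse_kthlist` and the sample graph are hypothetical, and only the rules visible in `_kthlist_parse` are assumed (comment lines start with `c`, one line holds the number of vertices, and every adjacency line has the form `v : n1 n2 ... 0`).

```python
import io

# Hypothetical sample: a directed graph on 4 vertices.
# Each line "v : p1 p2 ... 0" lists the predecessors of v.
SAMPLE = """c example graph
4
1 : 0
2 : 1 0
3 : 1 2 0
4 : 3 0
"""


def parse_kthlist(stream):
    """Return (size, {vertex: predecessors}) from a kthlist text stream."""
    size = None
    adjacency = {}
    for lineno, line in enumerate(stream, start=1):
        line = line.strip()
        if not line or line.startswith('c'):
            continue                      # skip blank and comment lines
        if ':' not in line:
            size = int(line)              # the vertex-count line
            continue
        left, right = line.split(':')
        numbers = [int(tok) for tok in right.split()]
        if not numbers or numbers[-1] != 0:
            raise ValueError("Line {} must end with 0.".format(lineno))
        adjacency[int(left)] = numbers[:-1]   # drop the terminating 0
    return size, adjacency


size, adjacency = parse_kthlist(io.StringIO(SAMPLE))
print(size)        # 4
print(adjacency)   # {1: [], 2: [1], 3: [1, 2], 4: [3]}
```

The sketch leaves out the range checks and the "second spec directive" check that the real parser performs.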
1180 
1181 def _read_bipartite_kthlist(inputfile):
1182     """Read a bipartite graph from file, in the KTH reverse adjacency lists format.
1183 
1184     Assumes the adjacency list is given in order.
1185     - vertices are listed in increasing order
1186     - if bipartite, only the adjacency list of the left side must be
1187       given, no list for a vertex of the right side is allowed.
1188 
1189     Parameters
1190     ----------
1191     inputfile : file object
1192         file handle of the input
1193 
1194     Raises
1195     ------
1196     ValueError
1197         Error parsing the file
1198 
1199     """
1200     # vertex number
1201     parser = _kthlist_parse(inputfile)
1202     size, name = next(parser)
1203     bipartition_ambiguous = [1, size]
1204     edges = {}
1205 
1206     previous = 0
1207     for left, right, lineno in parser:
1208 
1209         if left <= previous:
1210             raise ValueError(
1211                 "Vertex at line {} is smaller than the previous one.".format(lineno))
1212 
1213         # Check the bi-coloring on both sides
1214         if left > bipartition_ambiguous[1]:
1215             raise ValueError(
1216                 "Bipartition violation at line {}. Vertex {} cannot be on the left side."
1217                 .format(lineno, left))
1218         bipartition_ambiguous[0] = max(bipartition_ambiguous[0], left + 1)
1219         for v in right:
1220             if v < bipartition_ambiguous[0]:
1221                 raise ValueError(
1222                     "Bipartition violation. Invalid edge ({},{}) at line {}."
1223                     .format(left, v, lineno))
1224             bipartition_ambiguous[1] = min(bipartition_ambiguous[1], v - 1)
1225 
1226         previous = left
1227         edges[left] = right  # edges are added once the bipartition is fixed
1228 
1229     # fix the bipartition
1230     # unassigned vertices go to the right side
1231     L = bipartition_ambiguous[0]-1
1232     R = size - bipartition_ambiguous[0]+1
1233     G = BipartiteGraph(L, R, name)
1234 
1235     for u in edges:
1236         for v in edges[u]:
1237             G.add_edge(u, v - L)
1238 
1239     if size != G.number_of_vertices():
1240         raise ValueError("{} vertices expected. Got {} instead.".format(
1241             size, G.number_of_vertices()))
1242     return G
1243 
1244 
1245 def _read_nonbipartite_kthlist(inputfile, graph_class):
1246     """Read a graph from file, in the KTH reverse adjacency lists format.
1247 
1248     Only for simple and directed graph
1249 
1250     Assumes the adjacency list is given in order.
1251     - vertices are listed in increasing order
1252     - if directed graph the adjacency list specifies incoming neighbours
1253     - if DAG, the graph must be given in topological order source->sink
1254 
1255     Parameters
1256     ----------
1257     inputfile : file object
1258         file handle of the input
1259 
1260     graph_class: class
1261         either Graph or DirectedGraph
1262 
1263     Raises
1264     ------
1265     ValueError
1266         Error parsing the file
1267 
1268     """
1269     assert graph_class in [Graph, DirectedGraph]
1270 
1271     # vertex number
1272     parser = _kthlist_parse(inputfile)
1273     size, name = next(parser)
1274     G = graph_class(size, name)
1275 
1276     previous = 0
1277     for succ, predecessors, lineno in parser:
1278 
1279         if succ <= previous:
1280             raise ValueError(
1281                 "Vertex at line {} is smaller than the previous one.".format(lineno))
1282 
1283         # after vertices, add the edges
1284         for v in predecessors:
1285             G.add_edge(v, succ)
1286 
1287         previous = succ
1288 
1289     if size != G.order():
1290         raise ValueError("{} vertices expected. Got {} instead.".format(
1291             size, G.order()))
1292 
1293     return G
1294 
1295 
1296 def _read_graph_dimacs_format(inputfile, graph_class):
1297     """Read a simple graph from file, in the DIMACS edge format.
1298 
1299     Parameters
1300     ----------
1301     inputfile : file object
1302         file handle of the input
1303 
1304     graph_class: class object
1305         either Graph or DirectedGraph
1306     """
1307     assert graph_class in [Graph, DirectedGraph]
1308 
1309     G = None
1310     name = ''
1311     n = -1
1312     m = -1
1313     m_cnt = 0
1314 
1315     # scan the file line by line
1316     for i, l in enumerate(inputfile.readlines()):
1317 
1318         l = l.strip()
1319 
1320         # add the comment to the header
1321         if l.startswith('c'):
1322             name += l[2:]
1323             continue
1324 
1325         # parse spec line
1326         if l.startswith('p'):
1327             if G is not None:
1328                 raise ValueError(
1329                     "[Syntax error] " +
1330                     "Line {} contains a second spec line.".format(i+1))
1331             _, fmt, nstr, mstr = l.split()
1332             if fmt != 'edge':
1333                 raise ValueError("[Input error] " +
1334                                  "Dimacs \'edge\' format expected at line {}.".format(i+1))
1335             n = int(nstr)
1336             m = int(mstr)
1337             G = graph_class(n, name)
1338             continue
1339 
1340         # parse edge line
1341         if l.startswith('e'):
1342             if G is None:
1343                 raise ValueError("[Input error] " +
1344                                  "Edge before preamble at line {}.".format(i+1))
1345             m_cnt += 1
1346             try:
1347                 _, v, w = l.split()
1348                 G.add_edge(int(v), int(w))
1349             except ValueError:
1350                 raise ValueError("[Syntax error] " +
1351                                  "Line {} syntax error: edge must be 'e u v' where u, v are vertices".format(i+1))
1352 
1353     if m != m_cnt:
1354         raise ValueError("[Syntax error] " +
1355                          "{} edges were expected.".format(m))
1356 
1357     return G
1358 
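
For reference, a DIMACS `edge` file consists of optional `c` comment lines, one `p edge <n> <m>` preamble, and `m` lines of the form `e u v`. The sketch below parses such a file into a plain edge list. It is an illustration under those assumptions, not CNFgen's reader: `parse_dimacs_edges` and the sample triangle are hypothetical.

```python
import io

# Hypothetical sample: a triangle on vertices 1, 2, 3.
SAMPLE = """c a triangle
p edge 3 3
e 1 2
e 2 3
e 1 3
"""


def parse_dimacs_edges(stream):
    """Return (n, edges) from a DIMACS 'edge' format text stream."""
    n, m, edges = None, None, []
    for lineno, line in enumerate(stream, start=1):
        tokens = line.split()
        if not tokens or tokens[0] == 'c':
            continue                          # blank or comment line
        if tokens[0] == 'p':
            if n is not None:
                raise ValueError("Line {}: second spec line.".format(lineno))
            _, fmt, nstr, mstr = tokens
            if fmt != 'edge':
                raise ValueError(
                    "Line {}: 'edge' format expected.".format(lineno))
            n, m = int(nstr), int(mstr)
        elif tokens[0] == 'e':
            if n is None:
                raise ValueError(
                    "Line {}: edge before preamble.".format(lineno))
            _, v, w = tokens
            edges.append((int(v), int(w)))
    if m != len(edges):
        raise ValueError(
            "{} edges were expected, {} found.".format(m, len(edges)))
    return n, edges


n, edges = parse_dimacs_edges(io.StringIO(SAMPLE))
print(n, edges)   # 3 [(1, 2), (2, 3), (1, 3)]
```

Splitting each line into tokens before looking at the first one makes blank lines harmless, which is the main pitfall when indexing `l[0]` directly.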
1359 
1360 def _read_graph_matrix_format(inputfile):
1361     """Read a bipartite graph from file, in the adjacency matrix format.
1362 
1363     This is an example of an adjacency matrix for a bipartite graph
1364     with 9 vertices on one side and 15 on the other side.
1365 
1366     .. 9 15
1367        1 1 0 1 0 0 0 1 0 0 0 0 0 0 0
1368        0 1 1 0 1 0 0 0 1 0 0 0 0 0 0
1369        0 0 1 1 0 1 0 0 0 1 0 0 0 0 0
1370        0 0 0 1 1 0 1 0 0 0 1 0 0 0 0
1371        0 0 0 0 1 1 0 1 0 0 0 1 0 0 0
1372        0 0 0 0 0 1 1 0 1 0 0 0 1 0 0
1373        0 0 0 0 0 0 1 1 0 1 0 0 0 1 0
1374        0 0 0 0 0 0 0 1 1 0 1 0 0 0 1
1375        1 0 0 0 0 0 0 0 1 1 0 1 0 0 0
1376 
1377     Parameters
1378     ----------
1379     inputfile: file object
1380         the file containing the graph specification
1381 
1382     Returns
1383     -------
1384     G : BipartiteGraph
1385 
1386     """
1387     def scan_integer(inputfile):
1388 
1389         num_buffer = []
1390         line_cnt = 0
1391 
1392         while True:
1393             if len(num_buffer) == 0:
1394 
1395                 line = inputfile.readline()
1396 
1397                 if len(line) == 0:
1398                     return
1399 
1400                 line_cnt += 1
1401                 tokens = line.split()
1402 
1403                 if len(tokens) == 0 or tokens[0][0] == '#':
1404                     continue  # comment line
1405 
1406                 try:
1407                     num_buffer.extend((int(lit), line_cnt) for lit in tokens)
1408                 except ValueError:
1409                     raise ValueError("[Syntax error] " +
1410                                      "Line {} contains a non numeric entry.".
1411                                      format(line_cnt))
1412 
1413             yield num_buffer.pop(0)
1414 
1415     scanner = scan_integer(inputfile)
1416 
1417     try:
1418         n = next(scanner)[0]
1419         m = next(scanner)[0]
1420 
1421         G = BipartiteGraph(n, m)
1422         G.name = ''
1423 
1424         # read edges
1425         for i in range(1, n + 1):
1426             for j in range(1, m + 1):
1427 
1428                 (b, l) = next(scanner)
1429                 if b == 1:
1430                     G.add_edge(i, j)
1431                 elif b == 0:
1432                     pass
1433                 else:
1434                     raise ValueError(
1435                         "[Input error at line {}] Only 0 or 1 are allowed".
1436                         format(l))
1437     except StopIteration:
1438         raise ValueError("[Input error] Unexpected end of the matrix")
1439 
1440     # check that there is no more data
1441     try:
1442         (b, l) = next(scanner)
1443         raise ValueError(
1444             "[Input error at line {}] There are more than {}x{} entries".
1445             format(l, n, m))
1446     except StopIteration:
1447         pass
1448 
1449     return G
1450 
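
The matrix format is easiest to understand on a tiny input. The following sketch turns a matrix given as text into a list of (left, right) edges. It assumes the conventions visible in `_read_graph_matrix_format` (the first two numbers are the dimensions, lines whose first token starts with `#` are comments, entries are 0 or 1); the function `matrix_to_edges` and the sample are hypothetical.

```python
# Hypothetical sample: 2 left vertices, 3 right vertices.
SAMPLE = """# rows are left vertices, columns are right vertices
2 3
1 0 1
0 1 0
"""


def matrix_to_edges(text):
    """Return (n, m, edges) for a bipartite adjacency matrix given as text."""
    numbers = []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens or tokens[0].startswith('#'):
            continue                      # blank or comment line
        numbers.extend(int(tok) for tok in tokens)
    if len(numbers) < 2:
        raise ValueError("missing matrix dimensions")
    n, m, entries = numbers[0], numbers[1], numbers[2:]
    if len(entries) != n * m:
        raise ValueError(
            "expected {}x{} entries, got {}".format(n, m, len(entries)))
    if any(b not in (0, 1) for b in entries):
        raise ValueError("only 0 or 1 are allowed")
    # entry (i, j) of the matrix is stored at position i * m + j
    edges = [(i + 1, j + 1)
             for i in range(n) for j in range(m)
             if entries[i * m + j] == 1]
    return n, m, edges


print(matrix_to_edges(SAMPLE))   # (2, 3, [(1, 1), (1, 3), (2, 2)])
```

Unlike the generator-based scanner in the listing, this version reads all numbers first, so it cannot report the line of an offending entry.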
1451 
1452 #
1453 # In-house graph writers
1454 #
1455 def _write_graph_kthlist_nonbipartite(G, output_file):
1456     """Write a graph to a file, in the KTH reverse adjacency lists format.
1457 
1458     Parameters
1459     ----------
1460     G : Graph or DirectedGraph
1461         the graph to write on file
1462 
1463     output_file : file object
1464         file handle of the output
1465     """
1466     assert isinstance(G, (Graph, DirectedGraph))
1467 
1468     print("c {}".format(G.name), file=output_file)
1469     print("{}".format(G.order()), file=output_file)
1470 
1471     from io import StringIO
1472     output = StringIO()
1473 
1474     for v in G.vertices():
1475 
1476         if G.is_directed():
1477             nbors = G.predecessors(v)
1478         else:
1479             nbors = G.neighbors(v)
1480 
1481         output.write(str(v) + " :")
1482         output.write("".join([' '+str(i) for i in nbors]))
1483         output.write(" 0\n")
1484 
1485     print(output.getvalue(), file=output_file)
1486 
1487 
1488 def _write_graph_kthlist_bipartite(G, output_file):
1489     """Write a bipartite graph to a file,
1490        in the KTH reverse adjacency lists format.
1491 
1492     Parameters
1493     ----------
1494     G : BipartiteGraph
1495         the graph to write on file
1496 
1497     output_file : file object
1498         file handle of the output
1499     """
1500     assert isinstance(G, BipartiteGraph)
1501     print("c {}".format(G.name), file=output_file)
1502     print("{}".format(G.order()), file=output_file)
1503 
1504     from io import StringIO
1505     output = StringIO()
1506 
1507     U, _ = G.parts()
1508     offset = len(U)
1509 
1510     for u in U:
1511         output.write(str(u) + " :")
1512         output.write("".join([' '+str(v + offset)
1513                      for v in G.right_neighbors(u)]))
1514         output.write(" 0\n")
1515 
1516     print(output.getvalue(), file=output_file)
1517 
1518 
1519 def _write_graph_dimacs_format(G, output_file):
1520     """Write a graph to a file, in DIMACS format.
1521 
1522     Parameters
1523     ----------
1524     G : Graph or DirectedGraph
1525         the graph to write on file
1526 
1527     output_file : file object
1528         file handle of the output
1529     """
1530     assert isinstance(G, (Graph, DirectedGraph))
1531     print("c {}".format(G.name).strip(), file=output_file)
1532     n = G.number_of_vertices()
1533     m = G.number_of_edges()
1534     print("p edge {} {}".format(n, m), file=output_file)
1535 
1536     for v, w in G.edges():
1537         print("e {} {}".format(v, w), file=output_file)
1538 
1539 
1540 def _write_graph_matrix_format(G, output_file):
1541     """Write a graph to a file, in \"matrix\" format.
1542 
1543     Parameters
1544     ----------
1545     G : BipartiteGraph
1546         the graph to write in output
1547 
1548     output_file : file object
1549         file handle of the output
1550     """
1551     assert isinstance(G, BipartiteGraph)
1552     print("{} {}".format(G.left_order(), G.right_order()),
1553           file=output_file)
1554     L, R = G.parts()
1555     for u in L:
1556 
1557         adj_row = []
1558 
1559         for v in R:
1560             if G.has_edge(u, v):
1561                 adj_row.append("1")
1562             else:
1563                 adj_row.append("0")
1564 
1565         print(" ".join(adj_row), file=output_file)
1566 
1567 
1568 #
1569 # Bipartite graph generator
1570 # (we do not want to use networkx)
1571 #
1572 def bipartite_random_left_regular(l, r, d, seed=None):
1573     """Returns a random bipartite graph with constant left degree.
1574 
1575     Each vertex on the left side has `d` neighbors on the right side,
1576     picked uniformly at random without repetition.
1577 
1578     Each vertex in the graph has an attribute `bipartite` which is 0
1579     for the vertices on the left side and 1 for the vertices on the
1580     right side.
1581 
1582     Parameters
1583     ----------
1584     l : int
1585         vertices on the left side
1586     r : int
1587         vertices on the right side
1588     d : int
1589         degree on the left side.
1590     seed : hashable object
1591         seed the random generator
1592 
1593     Returns
1594     -------
1595     BipartiteGraph
1596 
1597     Raises
1598     ------
1599     ValueError
1600         unless ``l``, ``r`` and ``d`` are non negative.
1601 
1602     """
1603     import random
1604     if seed is not None:
1605         random.seed(seed)
1606 
1607     if l < 0 or r < 0 or d < 0:
1608         raise ValueError(
1609             "bipartite_random_left_regular(l,r,d) needs l,r,d >=0.")
1610 
1611     G = BipartiteGraph(l, r)
1612     G.name = "bipartite_random_left_regular({},{},{})".format(l, r, d)
1613     d = min(r, d)
1614 
1615     L, R = G.parts()
1616     for u in L:
1617         for v in sorted(random.sample(R, d)):
1618             G.add_edge(u, v)
1619 
1620     return G
1621 
1622 
1623 def bipartite_random_m_edges(L, R, m, seed=None):
1624     """Returns a random bipartite graph with m edges
1625 
1626     Build a random bipartite graph with :math:`L` left vertices,
1627     :math:`R` right vertices and :math:`m` edges sampled at random
1628     without repetition.
1629 
1630     Parameters
1631     ----------
1632     L : int
1633         vertices on the left side
1634     R : int
1635         vertices on the right side
1636     m : int
1637         number of edges.
1638     seed : hashable object
1639         seed the random generator
1640 
1641     Returns
1642     -------
1643     BipartiteGraph
1644 
1645     Raises
1646     ------
1647     ValueError
1648         unless ``L``, ``R`` and ``m`` are non negative.
1649 
1650     """
1651     import random
1652     if seed is not None:
1653         random.seed(seed)
1654 
1655     if L < 1 or R < 1 or m < 0 or m > L * R:
1656         raise ValueError(
1657             "bipartite_random_m_edges(L,R,m) needs L, R >= 1, 0<=m<=L*R")
1658     G = BipartiteGraph(L, R)
1659     G.name = "bipartite_random_m_edges({},{},{})".format(L, R, m)
1660 
1661     U, V = G.parts()
1662 
1663     if m > L * R // 3:
1664         # Sampling strategy (dense)
1665         E = [(u, v) for u in U for v in V]
1666         for u, v in random.sample(E, m):
1667             G.add_edge(u, v)
1668     else:
1669         # Sampling strategy (sparse)
1670         count = 0
1671         while count < m:
1672             u = random.randint(1, L)
1673             v = random.randint(1, R)
1674             if not G.has_edge(u, v):
1675                 G.add_edge(u, v)
1676                 count += 1
1677     assert G.number_of_edges() == m
1678     return G
1679 
1680 
1681 def bipartite_random(L, R, p, seed=None):
1682     """Returns a random bipartite graph with independent edges
1683 
1684     Build a random bipartite graph with :math:`L` left vertices,
1685     :math:`R` right vertices, where each edge is sampled independently
1686     with probability :math:`p`.
1687 
1688     Parameters
1689     ----------
1690     L : int
1691         vertices on the left side
1692     R : int
1693         vertices on the right side
1694     p : float
1695         probability to pick an edge
1696     seed : hashable object
1697         seed the random generator
1698 
1699     Returns
1700     -------
1701     BipartiteGraph
1702 
1703     Raises
1704     ------
1705     ValueError
1706         unless ``L``, ``R`` are non negative and 0<=``p``<=1.
1707     """
1708     import random
1709     if seed is not None:
1710         random.seed(seed)
1711 
1712     if L < 1 or R < 1 or p < 0 or p > 1:
1713         raise ValueError(
1714             "bipartite_random_graph(L,R,p) needs L, R >= 1, p in [0,1]")
1715     G = BipartiteGraph(L, R)
1716     G.name = "bipartite_random_graph({},{},{})".format(L, R, p)
1717 
1718     U, V = G.parts()
1719 
1720     for u in U:
1721         for v in V:
1722             if random.random() < p:
1723                 G.add_edge(u, v)
1724     return G
1725 
1726 
1727 def bipartite_shift(N, M, pattern=[]):
1728     """Returns a bipartite graph where edges are a fixed shifted sequence.
1729 
1730     The graph has :math:`N` vertices on the left (numbered from
1731     :math:`1` to :math:`N`), and :math:`M` vertices on the right
1732     (numbered from :math:`1` to :math:`M`),
1733 
1734     Each vertex :math:`v` on the left side has edges to vertices
1735     :math:`v+d_1`, :math:`v+d_2`, :math:`v+d_3`,... (with vertex
1736     indices on the right wrapping around over
1737     :math:`[1..M]`).
1738 
1739     Notice that this construction does not produce multiedges even if
1740     two offsets end up on the same right vertex.
1741 
1742     Parameters
1743     ----------
1744     N : int
1745         vertices on the left side
1746     M : int
1747         vertices on the right side
1748     pattern : list(int)
1749         pattern of neighbors
1750 
1751     Returns
1752     -------
1753     BipartiteGraph
1754 
1755     Raises
1756     ------
1757     ValueError
1758         if ``N`` or ``M`` is smaller than 1.
1759 
1760     """
1761     if N < 1 or M < 1:
1762         raise ValueError("bipartite_shift(N,M,pattern) needs N,M >= 1.")
1763 
1764     G = BipartiteGraph(N, M)
1765     G.name = "bipartite_shift_regular({},{},{})".format(N, M, pattern)
1766 
1767     L, R = G.parts()
1768     pattern = sorted(pattern)  # sorted copy: the caller's list and the default stay untouched
1769     for u in L:
1770         for offset in pattern:
1771             G.add_edge(u, 1 + (u - 1 + offset) % M)
1772 
1773     return G
1774 
1775 
1776 def bipartite_random_regular(l, r, d, seed=None):
1777     """Returns a random bipartite graph with constant degree on both sides.
1778 
1779     The graph is d-regular on the left side and regular on the right
1780     side, so it must be that d*l / r is an integer number.
1781 
1782     Parameters
1783     ----------
1784     l : int
1785        vertices on the left side
1786     r : int
1787        vertices on the right side
1788     d : int
1789        degree of vertices at the left side
1790     seed : hashable object
1791        seed of random generator
1792 
1793     Returns
1794     -------
1795     BipartiteGraph
1796 
1797     Raises
1798     ------
1799     ValueError
1800         if one among ``l``, ``r`` and ``d`` is negative or
1801         if ``r`` does not divide ``l*d``
1802 
1803     References
1804     ----------
1805     [1] http://...
1806 
1807     """
1808 
1809     import random
1810     if seed is not None:
1811         random.seed(seed)
1812 
1813     if l < 0 or r < 0 or d < 0:
1814         raise ValueError("bipartite_random_regular(l,r,d) needs l,r,d >=0.")
1815 
1816     if (l * d) % r != 0:
1817         raise ValueError(
1818             "bipartite_random_regular(l,r,d) needs r to divide l*d.")
1819 
1820     G = BipartiteGraph(l, r)
1821     G.name = "bipartite_random_regular({},{},{})".format(l, r, d)
1822 
1823     L, R = G.parts()
1824     A = list(L) * d
1825     B = list(R) * (l * d // r)
1826     assert len(B) == l * d
1827 
1828     for i in range(l * d):
1829         # Sample an edge, do not add it if it existed
1830         # We expect to sample at most d^2 edges
1831         for retries in range(3 * d * d):
1832             ea = random.randint(i, l * d - 1)
1833             eb = random.randint(i, l * d - 1)
1834             if not G.has_edge(A[ea], B[eb]):
1835                 G.add_edge(A[ea], B[eb])
1836                 A[i], A[ea] = A[ea], A[i]
1837                 B[i], B[eb] = B[eb], B[i]
1838                 break
1839         else:
1840             # Sampling takes too long, maybe no good edge exists
1841             failure = True
1842             for ea in range(i, l * d):
1843                 for eb in range(i, l * d):
1844                     if not G.has_edge(A[ea], B[eb]):
1845                         failure = False
1846                         break
1847                 if not failure:
1848                     break
1849             if failure:
1850                 return bipartite_random_regular(l, r, d)
1851 
1852     return G
1853 
1854 
1855 def dag_pyramid(height):
1856     """Generates the pyramid DAG
1857 
1858     Vertices are indexed from the bottom layer, starting from index 1
1859 
1860     Parameters
1861     ----------
1862     height : int
1863         the height of the pyramid graph (>=0)
1864 
1865     Returns
1866     -------
1867     cnfgen.graphs.DirectedGraph
1868 
1869     Raises
1870     ------
1871     ValueError
1872     """
1873     if height < 0:
1874         raise ValueError("The height of the pyramid must be >= 0")
1875 
1876     n = (height+1)*(height+2) // 2  # number of vertices
1877     D = DirectedGraph(n, 'Pyramid of height {}'.format(height))
1878 
1879     # edges
1880     leftsrc = 1
1881     dest = height+2
1882     for layer in range(1, height+1):
1883         for i in range(1, height-layer+2):
1884             D.add_edge(leftsrc, dest)
1885             D.add_edge(leftsrc+1, dest)
1886             leftsrc += 1
1887             dest += 1
1888         leftsrc += 1
1889 
1890     return D
1891 
1892 
1893 def dag_complete_binary_tree(height):
1894     """Generates the complete binary tree DAG
1895 
1896     Vertices are indexed from the bottom layer, starting from index 1
1897 
1898     Parameters
1899     ----------
1900     height : int
1901         the height of the tree
1902 
1903     Returns
1904     -------
1905     cnfgen.graphs.DirectedGraph
1906 
1907     Raises
1908     ------
1909     ValueError
1910 
1911     """
1912     if height < 0:
1913         raise ValueError("The height of the tree must be >= 0")
1914 
1915     # vertices plus 1
1916     N = 2 * (2**height)
1917     name = 'Complete binary tree of height {}'.format(height)
1918     D = DirectedGraph(N-1, name)
1919 
1920     # edges
1921     leftsrc = 1
1922     for dest in range(N // 2 + 1, N):
1923         D.add_edge(leftsrc, dest)
1924         D.add_edge(leftsrc+1, dest)
1925         leftsrc += 2
1926 
1927     return D
1928 
1929 
1930 def dag_path(length):
1931     """Generates a directed path DAG
1932 
1933     Vertices are indexed from 1..length+1
1934 
1935     Parameters
1936     ----------
1937     length : int
1938         the length of the path
1939 
1940     Returns
1941     -------
1942     cnfgen.graphs.DirectedGraph
1943 
1944     Raises
1945     ------
1946     ValueError
1947     """
1948     if length < 0:
1949         raise ValueError("The length of the path must be >= 0")
1950 
1951     name = 'Directed path of length {}'.format(length)
1952     D = DirectedGraph(length+1, name)
1953     # edges
1954     for i in range(1, length+1):
1955         D.add_edge(i, i + 1)
1956 
1957     return D
1958 
1959 
1960 def split_random_edges(G,k, seed=None):
1961     """Split k random edges of G
1962 
1963     If :math:`G` is a simple graph, it picks k random edges (and fails
1964     if there are not enough of them), and splits the edges in 2 adding
1965     a new vertex for each of them.
1966 
1967     Parameters
1968     ----------
1969     G : Graph
1970         a graph with at least :math:`k` edges
1971     k : int
1972        the number of edges to sample
1973     seed : hashable object
1974        seed of random generator
1975 
1976     Example
1977     -------
1978     >>> G = Graph(5)
1979     >>> G.add_edges_from([(1,4),(4,5),(2,4),(2,3)])
1980     >>> G.number_of_edges()
1981     4
1982     >>> split_random_edges(G,2)
1983     >>> G.number_of_edges()
1984     6
1985     >>> G.number_of_vertices()
1986     7
1987     """
1988     if seed is not None:
1989         random.seed(seed)
1990 
1991     if not isinstance(G,Graph):
1992         raise TypeError("Edge splitting is only implemented for simple graphs")
1993 
1994     non_negative_int(k,'k')
1995     if k > G.number_of_edges():
1996         raise ValueError("The graph does not have {} edges.".format(k))
1997 
1998     tosplit = random.sample(list(G.edges()),k)
1999     nv = G.number_of_vertices()
2000     G.update_vertex_number(nv+k)
2001     x = nv + 1
2002     for u,v in tosplit:
2003         G.remove_edge(u,v)
2004         G.add_edge(u,x)
2005         G.add_edge(x,v)
2006         x += 1
2007 
2008 
2009 def add_random_missing_edges(G, m, seed=None):
2010     """Add m random missing edges to G
2011 
2012     If :math:`G` is not complete and has at least :math:`m` missing
2013     edges, :math:`m` of them are sampled and added to the graph.
2014 
2015     Parameters
2016     ----------
2017     G : Graph
2018         a graph with at least :math:`m` missing edges
2019     m : int
2020        the number of missing edges to sample
2021     seed : hashable object
2022        seed of random generator
2023 
2024     Raises
2025     ------
2026     ValueError
2027         if :math:`G` doesn't have :math:`m` missing edges
2028     RuntimeError
2029         Sampling failure in the sparse case
2030 
2031     """
2032     if seed is not None:
2033         random.seed(seed)
2034 
2035     if m < 0:
2036         raise ValueError("You can only sample a non negative number of edges.")
2037 
2038     total_number_of_edges = None
2039 
2040     if G.is_bipartite():
2041 
2042         Left, Right = G.parts()
2043         total_number_of_edges = len(Left) * len(Right)
2044 
2045         def edge_sampler():
2046             u = random.sample(Left, 1)[0]
2047             v = random.sample(Right, 1)[0]
2048             return (u, v)
2049 
2050         def available_edges():
2051             return [(u, v) for u in Left for v in Right if not G.has_edge(u, v)]
2052 
2053     else:
2054 
2055         V = G.number_of_vertices()
2056         total_number_of_edges = V * (V - 1) // 2
2057 
2058         def edge_sampler():
2059             return random.sample(range(1, V+1), 2)
2060 
2061         def available_edges():
2062             result = []
2063             for u in range(1, V):
2064                 for v in range(u+1, V+1):
2065                     if not G.has_edge(u, v):
2066                         result.append((u, v))
2067             return result
2068 
2069     # How many edges we want in the end?
2070     goal = G.number_of_edges() + m
2071 
2072     if goal > total_number_of_edges:
2073         raise ValueError(
2074             "The graph does not have {} missing edges to sample.".format(m))
2075 
2076     # Sparse case: sample and retry
2077     for _ in range(10 * m):
2078 
2079         if G.number_of_edges() >= goal:
2080             break
2081 
2082         u, v = edge_sampler()
2083         if not G.has_edge(u, v):
2084             G.add_edge(u, v)
2085 
2086     if G.number_of_edges() < goal:
2087         # Very unlikely case: sampling process failed and the solution
2088         # is to use the sampling process tailored for denser graph, so
2089         # that a correct result is guaranteed. This requires
2090         # generating all available edges
2091         for u, v in random.sample(available_edges(),
2092                                   goal - G.number_of_edges()):
2093             G.add_edge(u, v)
2094 
2095 
2096 def supported_graph_formats():
2097     """File formats supported for graph I/O
2098 
2099     Given as a dictionary that maps graph types to the respective
2100     supported formats.
2101 
2102     E.g. 'dag' -> ['dimacs', 'kthlist']
2103     """
2104     return {'simple': Graph.supported_file_formats(),
2105             'digraph': DirectedGraph.supported_file_formats(),
2106             'dag': DirectedGraph.supported_file_formats(),
2107             'bipartite': BipartiteGraph.supported_file_formats()}
graphs.py

 

localtypes.py

#!/usr/bin/env python
# -*- coding:utf-8 -*-
"""Functions to check the arguments types
"""

import numbers


def positive_int(value, name):
    """Check that `value` is a positive integer"""
    msg = "argument '{}' must be a positive integer".format(name)
    if not isinstance(value, numbers.Integral):
        raise TypeError(msg)
    if value < 1:
        raise ValueError(msg)


def positive_int_seq(value, name):
    """Check that `value` is a sequence of positive integers"""
    msg = "argument '{}' must be a sequence of positive integers".format(name)
    try:
        for v in value:
            if not isinstance(v, numbers.Integral):
                raise TypeError('non numeric element in the sequence')
    except TypeError as te:
        # Also reached when `value` is not iterable at all
        raise TypeError(msg) from te

    for v in value:
        if v < 1:
            raise ValueError(msg)


def non_negative_int_seq(value, name):
    """Check that `value` is a sequence of non negative integers"""
    msg = "argument '{}' must be a sequence of non negative integers".format(name)
    try:
        for v in value:
            if not isinstance(v, numbers.Integral):
                raise TypeError('non numeric element in the sequence')
    except TypeError as te:
        # Also reached when `value` is not iterable at all
        raise TypeError(msg) from te

    for v in value:
        if v < 0:
            raise ValueError(msg)


def one_of_values(value, name, choices):
    '''Check that `value` is in a specific set'''
    msg = "argument '{}' must be one of [{}]".format(name,
                                                     choices)
    if value not in choices:
        raise ValueError(msg)


def any_int(value, name):
    """Check that `value` is an integer"""
    msg = "argument '{}' must have integer value".format(name)
    if not isinstance(value, numbers.Integral):
        raise TypeError(msg)


def non_negative_int(value, name):
    """Check that `value` is a non negative integer"""
    msg = "argument '{}' must be a non negative integer".format(name)
    if not isinstance(value, numbers.Integral):
        raise TypeError(msg)
    if value < 0:
        raise ValueError(msg)


def probability_value(value, name):
    """Check that `value` is a real between 0 and 1"""
    msg = "argument '{}' must be a real between 0 and 1".format(name)
    if not isinstance(value, numbers.Real):
        raise TypeError(msg)
    if value < 0 or value > 1:
        raise ValueError(msg)

 

2.1 Generating hard instances

Considering the outline of the methods used in current DPLL-based SAT solvers, we can now make an observation about graphs that should lead to hard instances EC(G). First, let us restrict our attention to graphs G obtained by picking an edge in a 4-regular graph on n vertices and subdividing it, i.e. introducing a new vertex as the midpoint of the old edge. For such a graph, EC(G) is unsatisfiable and has 2n + 1 variables and 10n + 2 clauses. This gives unsatisfiable formulae with a density close to 5 clauses per variable. For formulae with clauses of length 4, this is a density at which a formula from the random 4-SAT ensemble is expected to be satisfiable; see e.g. [1]. In this random model, a formula on a set of n variables is constructed by letting each possible clause of length 4 be part of the formula with probability p. Random 4-SAT formulae with expected density 5 are also expected to be easy to solve.
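
The counts above can be checked with a few lines of arithmetic. The following is only a sketch: the function names `ec_size` and `ec_density` are hypothetical, and the clause count 10n + 2 is taken directly from the text rather than derived from the encoding.

```python
from fractions import Fraction


def ec_size(n):
    """Return (variables, clauses) for EC(G), using the counts quoted in the text.

    G is assumed to be a 4-regular graph on n vertices with one edge
    subdivided. A 4-regular graph has 4n/2 = 2n edges and the subdivision
    adds one more, which matches the 2n + 1 variables (one per edge).
    """
    if n < 5:
        raise ValueError("a simple 4-regular graph needs at least 5 vertices")
    return 2 * n + 1, 10 * n + 2


def ec_density(n):
    """Clauses per variable, as an exact fraction."""
    variables, clauses = ec_size(n)
    return Fraction(clauses, variables)
```

Since (10n + 2)/(2n + 1) = 5 - 3/(2n + 1), the density approaches 5 from below as n grows.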

 

What can we say about the smallest number of variables involved in a nontrivial conflict, one that allows us to learn a new clause, in a formula of this kind? We can get a simple lower bound by observing that we must at least set all variables corresponding to the edges of some cycle C in G before a nontrivial conflict can arise. Getting more accurate lower bounds is harder, but this tells us that the girth of the graph, i.e. the length of its shortest cycle, can be used to control the size of conflict sets.
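
Because the girth is the quantity that controls this lower bound, it helps to be able to measure it on a candidate graph. Below is a minimal sketch using the standard breadth-first-search approach; the function name `girth` and the adjacency-dictionary input format are assumptions made for illustration and are not part of the paper or of graphs.py.

```python
from collections import deque


def girth(adj):
    """Length of the shortest cycle of a simple undirected graph.

    `adj` maps every vertex to an iterable of its neighbours.
    Returns None when the graph has no cycle.
    """
    best = None
    for root in adj:
        dist = {root: 0}
        parent = {root: None}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    # Tree edge of the BFS rooted at `root`
                    dist[v] = dist[u] + 1
                    parent[v] = u
                    queue.append(v)
                elif parent[u] != v:
                    # Non-tree edge: it closes a cycle of at most this length,
                    # and the bound is exact when `root` lies on a shortest cycle
                    cycle_len = dist[u] + dist[v] + 1
                    if best is None or cycle_len < best:
                        best = cycle_len
    return best
```

For example, a 5-cycle given as `{i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}` has girth 5. The running time is O(V·E), which is fine for graphs of the size discussed here.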

 

This immediately draws our attention to 4-regular graphs of high girth, where 4-regularity keeps the clause length low. If the girth of G is g, we will at first not be able to learn any nontrivial clauses of length less than (at least) g. If the SAT solver we are using has a cut-off length for the clauses it learns, i.e. it does not retain any conflict clauses with more than some number k of literals, then for k below this bound it will in fact be prevented from learning any new clauses at all, thus reducing it to a pure DPLL procedure.

 

If the solver uses restarts and removes long clauses when it performs a restart, it will in a similar way risk changing into a DPLL procedure with learning, but with an effective run time corresponding to the time between consecutive restarts.

 

With this in mind, it seems natural to use formulae of the form EC(G) based on 4-regular graphs, as described, as challenging unsatisfiable instances for SAT solvers. If the formulae are to be tuned for hardness, then using graphs with high girth, or at least few short cycles, also seems natural. If hard solvable instances are desired, one could apply the methods of [14] to EC(G). The paper [14] presents a construction which, starting from a hard unsatisfiable formula, builds a solvable formula that is hard for a broad class of solvers.
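
The graph construction described in this section (take a 4-regular graph and subdivide one edge) can be sketched as follows. This is an illustration only: the circulant graph is a choice made here because it is easy to write down, not the family used in the paper, and with offsets (1, 2) it has triangles, so it is not a high-girth example. The names `circulant_4_regular` and `subdivide` are hypothetical.

```python
def circulant_4_regular(n, a=1, b=2):
    """Edge set of the circulant graph on vertices 0..n-1 with offsets a and b.

    With 0 < a < b < n/2 every vertex has exactly four distinct neighbours,
    so the graph is 4-regular and has 2n edges.
    """
    if not (0 < a < b and 2 * b < n):
        raise ValueError("need 0 < a < b < n/2")
    edges = set()
    for v in range(n):
        for offset in (a, b):
            w = (v + offset) % n
            edges.add((min(v, w), max(v, w)))
    return edges


def subdivide(edges, edge, new_vertex):
    """Return a new edge set where `edge` is replaced by a path of length 2."""
    if edge not in edges:
        raise ValueError("edge {} is not in the graph".format(edge))
    u, v = edge
    result = set(edges)
    result.remove(edge)
    result.add((min(u, new_vertex), max(u, new_vertex)))
    result.add((min(v, new_vertex), max(v, new_vertex)))
    return result
```

For n = 9, `subdivide(circulant_4_regular(9), (0, 1), 9)` has 19 = 2n + 1 edges: the new vertex has degree 2 and every other vertex keeps degree 4.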

 

   
 

3. Experiments

4. Conclusions and further directions

Both our discussion in 2.1 and the experiments in the previous section clearly indicate the importance of small dense structures, such as triangles in the Eulerian graphs, for the hardness of an instance of the type considered here. However we would expect this to be true in greater generality. Let us make this a bit more precise.

Given a k-SAT formula F let us look at its clause-variable incidence graph, i.e., the graph with one vertex for each variable, one vertex for each clause, and an edge between a variable-vertex and a clause-vertex if the variable is present in the clause. Given a subset S of the variables we define F[S], the induced subformula of F on S, to be the set of clauses C in F such that C only contains variables present in S. The density of the formula F[S] is as usual the number of clauses in F[S] divided by |S|.
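
The definition of F[S] and its density translates directly into code. The sketch below assumes clauses are given as tuples of non-zero integers in the DIMACS style (literal `-3` is the negation of variable 3); that representation and the names `induced_subformula` and `density` are choices made here for illustration.

```python
from fractions import Fraction


def induced_subformula(clauses, S):
    """Clauses of F whose variables all belong to the variable set S."""
    S = set(S)
    return [c for c in clauses if all(abs(lit) in S for lit in c)]


def density(clauses, S):
    """Number of clauses of F[S] divided by |S|, as an exact fraction."""
    S = set(S)
    if not S:
        raise ValueError("S must contain at least one variable")
    return Fraction(len(induced_subformula(clauses, S)), len(S))
```

For F = [(1, -2, 3), (2, 3, -4), (-1, -3, 4), (1, 2, -3)] and S = {1, 2, 3}, only the first and last clauses lie inside S, so the density of F[S] is 2/3.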

If a formula F has many dense subformulae, a clause-learning DPLL solver has a good chance of learning new short conflict clauses from conflicts within a dense local part. However, for a given clause length k there exists a density r(k) such that a formula with clauses of length at most k and density less than r(k) is always satisfiable. So, if an unsatisfiable formula F lacks such local dense parts, a solver will initially have to assign values to many variables in order to find a conflict and learn a new clause. In this case the learnt clauses will also tend to be quite long.

If we look at random k-SAT instances of fixed clause-to-variable density, we will only expect a small number of short cycles in the clause-variable incidence graph, in fact a number that is asymptotically independent of the size, just as in the case of random regular graphs [3]. As a consequence, the density of F[S] will typically be very small, often less than 1, for variable sets S of small size. Likewise, our formulae EC(G) will lack small dense subformulae when the underlying graph G has high girth. If we instead use random regular graphs G, we know from [3] that there will typically exist only a small number of cycles in G and no small subgraphs denser than a cycle. In this case the formulae EC(G) will thereby inherit expected properties much like those of random k-SAT.

Thus some of the experimentally observed hardness of random unsatisfiable instances may well come from their lack of local dense subformulae. Here an experimental study of running times for unsolvable random instances versus the number of short cycles in their incidence graphs could be interesting. In this context, it is interesting to note that for satisfiable instances this lack of local dense parts is considered to be responsible for some of the success of randomized algorithms like the survey propagation method [4].

An interesting possibility would be to make more explicit use of the structure of the clause-variable incidence graph of an instance in resolution methods as well. This could e.g. be used to control the choice of resolution variable in the DPLL procedure.

In order to reduce the risk of the kind of trap that the EC(G) formulae represent for clause-learning SAT solvers, some simple modifications can be added. One addition which seems cost-effective would be to keep track of a running mean of the size of the clauses discovered during the learning process and to keep all clauses not much larger than the current mean. That way long clauses would still be kept for formulae in which no short clauses can be learnt without first learning long clauses.
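
One possible reading of this running-mean retention rule is sketched below. The paper does not say how much larger than the mean a clause may be, so the multiplicative `slack` factor, the class name `RunningMeanClauseFilter` and its methods are all assumptions made for illustration, not the behaviour of any existing solver.

```python
class RunningMeanClauseFilter:
    """Keep learnt clauses whose length is not much larger than the running mean."""

    def __init__(self, slack=1.5):
        self.slack = slack   # hypothetical tolerance: keep if len <= slack * mean
        self.count = 0       # number of learnt clauses observed so far
        self.mean = 0.0      # running mean of their lengths

    def observe(self, clause):
        """Update the running mean with a newly learnt clause."""
        self.count += 1
        self.mean += (len(clause) - self.mean) / self.count

    def keep(self, clause):
        """Decide whether a learnt clause should be retained."""
        if self.count == 0:
            return True      # nothing learnt yet, so there is no baseline
        return len(clause) <= self.slack * self.mean
```

If the first learnt clauses have lengths 10, 12 and 14, the running mean is 12, so a clause of length 15 is kept while one of length 20 is dropped. A fixed cut-off such as k = 8 would have discarded all of them, which is exactly the trap described above.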

However, the general problem faced here remains the fundamental limitation of solvers that do not use a proof system stronger than general resolution. While producing more optimised resolution-based solvers is undoubtedly a worthwhile undertaking, it is becoming more and more important to find efficient algorithms based on stronger proof systems.

 

References

[1] Dimitris Achlioptas and Yuval Peres. The threshold for random k-SAT is 2^k log 2 − O(k). J. Amer. Math. Soc., 17(4):947–973 (electronic), 2004.

[2] Paul Beame, Henry Kautz, and Ashish Sabharwal. Towards understanding and harnessing the potential of clause learning. J. Artificial Intelligence Res., 22:319–351 (electronic), 2004.

[3] Béla Bollobás. Random graphs, volume 73 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, second edition, 2001.

[4] A. Braunstein, M. Mézard, and R. Zecchina. Survey propagation: an algorithm for satisfiability. Random Structures Algorithms, 27(2):201–226, 2005.

[5] Martin Davis, George Logemann, and Donald Loveland. A machine program for theorem-proving. Comm. ACM, 5:394–397, 1962.

[6] Martin Davis and Hilary Putnam. A computing procedure for quantification theory. J. Assoc. Comput. Mach., 7:201–215, 1960.

[7] Niklas Eén and Niklas Sörensson. Webpage for Satzoo and MiniSat at Chalmers University. http://www.cs.chalmers.se/~een/Satzoo/.

[8] Niklas Eén and Niklas Sörensson. An extensible SAT-solver. In Lecture Notes in Computer Science, 2919, SAT 2003, pages 502–518, 2004.

[9] Eugene Goldberg and Yakov Novikov. Webpage for BerkMin. http://heigold.tripod.com/BerkMin.html.

[10] Edward A. Hirsch and Arist Kojevnikov. UnitWalk: a new SAT solver that uses local search guided by unit clause elimination. Ann. Math. Artif. Intell., 43(1-4):91–111, 2005. Theory and applications of satisfiability testing.

[11] Hans Kleine Büning and Theodor Lettmann. Propositional logic: deduction and algorithms, volume 48 of Cambridge Tracts in Theoretical Computer Science. Cambridge University Press, Cambridge, 1999. Translated from the 1994 German original by the authors.

[12] Anton Kotzig. Moves without forbidden transitions in a graph. Mat. Časopis Sloven. Akad. Vied, 18:76–80, 1968.

[13] Klas Markström. Web page of current author, see combinatorial data and programs. http://www.math.umu.se/~klasm.

[14] Michael Alekhnovich, Edward A. Hirsch, and Dmitry Itsykson. Exponential lower bounds for the running time of DPLL algorithms on satisfiable formulas. In International Colloquium on Automata, Languages and Programming, ICALP 2004, pages 84–96, 2004.

[15] Rémi Monasson, Riccardo Zecchina, Scott Kirkpatrick, Bart Selman, and Lidror Troyansky. Determining computational complexity from characteristic “phase transitions”. Nature, 400(6740):133–137, 1999.

[16] Gordon Royle. Gordon Royle's homepages for combinatorial data. http://www.cs.uwa.edu.au/~gordon/remote/cubics/index.html.

[17] Alasdair Urquhart. Hard examples for resolution. J. Assoc. Comput. Mach., 34(1):209–219, 1987.

[18] Douglas B. West. Introduction to graph theory. Prentice Hall Inc., Upper Saddle River, NJ, 1996.

[19] Lintao Zhang. Webpage for zChaff at Princeton University. http://www.princeton.edu/~chaff/zchaff.html.

[20] Lintao Zhang, Conor F. Madigan, Matthew W. Moskewicz, and Sharad Malik. Efficient conflict driven learning in boolean satisfiability solver. In International Conference on Computer Aided Design, ICCAD 2001, pages 279–285, 2001.

   
Posted on 2022-10-11 10:12 by 海阔凭鱼跃越