Metrics for Comparing Different Clusterings

  • Jaccard Index
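    The Jaccard index (also known as the Jaccard similarity coefficient) between partitions A and B is defined as the size of the intersection divided by the size of the union of the sample sets:

    $$J(A,B) = \frac{a}{a + b + c}$$

    where a, b, c and d are the entries in the mismatch matrix. Assuming the usual pair-counting convention, a is the number of sample pairs grouped together in both A and B, b the number grouped together in A only, c the number grouped together in B only, and d the number grouped together in neither; d does not enter the Jaccard index.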

     

  • Fowlkes and Mallows

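    Another method for comparing clusterings was proposed by Fowlkes and Mallows (1983) as an alternative to the Rand index. The Fowlkes and Mallows index can be defined as

    $$FM(A,B) = \frac{a}{\sqrt{(a + b)(a + c)}}$$

    where a is the number of sample pairs grouped together in both partitions, a + b the number grouped together in A, and a + c the number grouped together in B.
    The Wallace coefficient was in fact derived from this index (Wallace, 1983), and therefore it can be rewritten as

    $$FM(A,B) = \sqrt{W_{A \to B} \cdot W_{B \to A}}, \qquad W_{A \to B} = \frac{a}{a + b}, \quad W_{B \to A} = \frac{a}{a + c}$$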


  • Mirkin Metric
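    This coefficient takes the value zero for identical clusterings and positive values otherwise. It corresponds to the Hamming distance between the binary vector representations of the two partitions.
    It provides an alternative, rescaled form of the Rand index: in pair-counting terms it is commonly written as M(A,B) = 2(b + c) = n(n - 1)(1 - R(A,B)), where b and c count the sample pairs grouped together in only one of the two partitions, n is the number of samples and R is the Rand index. However, unlike Hubert and Arabie's adjusted Rand index (Hubert, 1985), it doesn't provide a correction for chance agreement. Meila (2005) also proposed a bounded version of this index: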
  • NMI (normalized mutual information)
     

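    $$I(A,B) = \frac{-2 \sum_{i=1}^{c_A} \sum_{j=1}^{c_B} C_{ij} \log\left(\frac{C_{ij} N}{C_{i\cdot} C_{\cdot j}}\right)}{\sum_{i=1}^{c_A} C_{i\cdot} \log\left(\frac{C_{i\cdot}}{N}\right) + \sum_{j=1}^{c_B} C_{\cdot j} \log\left(\frac{C_{\cdot j}}{N}\right)}$$

    where C is the confusion matrix whose element C_ij is the number of nodes in group i of A that also belong to group j of B, c_A (c_B) is the number of groups in the partition A (B), C_i· (C_·j) is the sum of elements of C in row i (column j),
    and N is the number of nodes.

    If A = B, then I(A,B) = 1; if A and B are completely different (statistically independent), then I(A,B) = 0.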
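
As a rough illustration of the NMI definition, here is a minimal Python sketch. It assumes the confusion-matrix form of NMI, in which C_ij counts the nodes shared by group i of A and group j of B; the function name `normalized_mutual_information` and the label-list input format are illustrative choices, not part of the original post. Returning 1 when both partitions consist of a single group is likewise an assumption made here to avoid a 0/0 division.

```python
from collections import Counter
from math import log


def normalized_mutual_information(labels_a, labels_b):
    """NMI between two partitions given as per-node group labels (a sketch)."""
    n = len(labels_a)
    if n != len(labels_b):
        raise ValueError("both partitions must label the same nodes")

    joint = Counter(zip(labels_a, labels_b))  # non-zero entries C_ij
    row = Counter(labels_a)                   # row sums C_i.
    col = Counter(labels_b)                   # column sums C_.j

    # Zero entries of C contribute nothing, so only non-zero C_ij are summed.
    numerator = -2.0 * sum(
        c_ij * log(c_ij * n / (row[i] * col[j]))
        for (i, j), c_ij in joint.items()
    )
    denominator = (
        sum(c * log(c / n) for c in row.values())
        + sum(c * log(c / n) for c in col.values())
    )
    if denominator == 0:
        # Assumption: both partitions have a single group, treated as identical.
        return 1.0
    return numerator / denominator


identical = normalized_mutual_information([0, 0, 1, 1], [0, 0, 1, 1])
independent = normalized_mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
print(identical, independent)
```

The two calls at the end exercise the limiting cases mentioned in the text: the same partition twice, and two partitions that carry no information about each other.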

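The pair-counting indices in this list (Jaccard, Fowlkes and Mallows, Mirkin) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a, b, c and d follow the usual convention (pairs grouped together in both partitions, in A only, in B only, in neither), the Mirkin metric is taken in its common form 2(b + c), and all function names are hypothetical. The quadratic loop over pairs is fine for small examples but not for large data sets.

```python
from itertools import combinations
from math import sqrt


def pair_counts(labels_a, labels_b):
    """Return (a, b, c, d): pairs together in both, A only, B only, neither."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same samples")
    a = b = c = d = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a and same_b:
            a += 1
        elif same_a:
            b += 1
        elif same_b:
            c += 1
        else:
            d += 1
    return a, b, c, d


def jaccard(a, b, c, d):
    # Undefined (division by zero) if no pair is grouped together anywhere.
    return a / (a + b + c)


def fowlkes_mallows(a, b, c, d):
    # Geometric mean of the two Wallace coefficients a/(a+b) and a/(a+c).
    return a / sqrt((a + b) * (a + c))


def mirkin(a, b, c, d):
    # Assumed form: twice the number of pairs on which the partitions disagree.
    return 2 * (b + c)


counts = pair_counts([0, 0, 0, 1, 1, 1], [0, 0, 1, 1, 2, 2])
print(counts, jaccard(*counts), fowlkes_mallows(*counts), mirkin(*counts))
```

For the six-sample example, the 15 pairs split into a = 2, b = 4, c = 1, d = 8, so the Jaccard index is 2/7 and the Mirkin metric is 10, which agrees with n(n - 1)(1 - R) for n = 6 and a Rand index of 10/15.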

(to be continued...)
posted @ 2011-12-02 13:12  Keosu