kmeans python

The Java version of KMEANS I wrote earlier was rather ugly, so here is a simple Python version to go with it.

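# kmeans
import math


def doKmeansCluster(data, cnum, itnum):
    # use the first cnum points as the initial centers
    c = [list(p) for p in data[:cnum]]
    groups = []
    for _ in range(itnum):
        groups = [[] for _ in range(len(c))]
        # assign every point to its nearest center
        for d in data:
            index = 0
            best = distance(d, c[0])
            for i in range(1, len(c)):
                dis = distance(d, c[i])
                if dis < best:
                    index, best = i, dis
            groups[index].append(d)

        new_c = []
        for k, g in enumerate(groups):
            if not g:
                # an empty group has no mean, so it keeps its previous center
                new_c.append(c[k])
                continue
            # transpose the group so that all values of the same dimension sit in one list
            trans = [[r[col] for r in g] for col in range(len(g[0]))]
            # new center = sum divided by count, per dimension
            new_c.append([float(sum(t)) / len(t) for t in trans])
        c = new_c
    # return the result instead of printing each group;
    # groups holds the assignment made in the last iteration
    return c, groups


def distance(a, b):
    # Euclidean distance between two points of the same dimension
    return math.sqrt(sum([math.pow(a[i] - b[i], 2) for i in range(len(a))]))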
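
The center update depends on transposing each group so that the values of one dimension end up in the same list. Below is a minimal sketch of that step on its own, using zip(*points) instead of the nested list comprehension. The helper name centroid is hypothetical and is not part of the code above, and it assumes a non-empty group.

```python
def centroid(points):
    # hypothetical helper: per-dimension mean of a non-empty list of points
    columns = zip(*points)  # transpose: one tuple per dimension
    return [float(sum(col)) / len(col) for col in columns]


group = [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]]
center = centroid(group)  # [3.0, 5.0]
```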

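I have only tested this briefly and did not take many constraints into account; if you are interested, feel free to rewrite it yourself.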
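
As one possible direction for such a rewrite, here is a rough sketch that handles a few of the constraints left out above: it validates the number of clusters, picks the initial centers at random, keeps the old center when a cluster ends up empty, and stops early once the centers no longer change. The names kmeans and euclid and the parameters max_iter and seed are my own assumptions for illustration, not part of the original code.

```python
import math
import random


def euclid(a, b):
    # Euclidean distance between two points of the same dimension
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(data, k, max_iter=100, seed=None):
    # hypothetical variant, not the original doKmeansCluster
    if not 0 < k <= len(data):
        raise ValueError("k must be between 1 and len(data)")
    rng = random.Random(seed)
    # assumption: k distinct input points, chosen at random, as initial centers
    centers = [list(p) for p in rng.sample(data, k)]
    labels = []
    for _ in range(max_iter):
        # assignment step: index of the nearest center for every point
        labels = [min(range(k), key=lambda j: euclid(p, centers[j])) for p in data]
        # update step: per-dimension mean of every cluster
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(data, labels) if lab == j]
            if members:
                new_centers.append([sum(col) / float(len(members)) for col in zip(*members)])
            else:
                new_centers.append(centers[j])  # empty cluster keeps its old center
        if new_centers == centers:
            break  # centers stopped moving, so the labels are final
        centers = new_centers
    return centers, labels


points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
centers, labels = kmeans(points, 2, seed=0)
```

If max_iter runs out before the centers settle, the returned labels refer to the centers from before the last update, so a larger max_iter may be needed on harder data.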

posted @ 2012-08-12 21:36  tadoo