LDA降维

feature_dict = {i:label for i,label in zip(range(4),('spepal length in cm','spepal width in cm','petal length in cm','petal width in cm'))}

import pandas as pd

df = pd.io.parsers.read_csv(filepath_or_buffer='链接(省略)',header=None,sep=',',)
df columns = [l for i,l in sorted(feature_dict.items())] + ['class label']
df.dropna(how="all",inplace=True)

df.tail()

映射:将y变成可衡量的数据格式

from
sklearn.preprocessing import LabelEncoder X = df[['spepal length in cm','spepal width in cm','petal length in cm','petal width in cm']].values y = df['class label'].values enc = LabelEncoder() label_encoder = enc.fit(y) y = label_encoder.transform(y) + 1

转化后的结果:
每类样例的均值:
import
numpy as np np.set_printoptions(precision=4) mean_vectors = [] for c1 in range(1,4): mean_vectors.append(np.mean(X[y==c1],axis=0) print('Mean vector class %s:%s\n'%(c1,mean_vectors[c1-1]))

 

posted @ 2017-11-23 22:57  天下一杰  阅读(269)  评论(0编辑  收藏  举报