LDA降维
feature_dict = {i:label for i,label in zip(range(4),('spepal length in cm','spepal width in cm','petal length in cm','petal width in cm'))} import pandas as pd df = pd.io.parsers.read_csv(filepath_or_buffer='链接(省略)',header=None,sep=',',) df columns = [l for i,l in sorted(feature_dict.items())] + ['class label'] df.dropna(how="all",inplace=True) df.tail()
映射:将y变成可衡量的数据格式
from sklearn.preprocessing import LabelEncoder X = df[['spepal length in cm','spepal width in cm','petal length in cm','petal width in cm']].values y = df['class label'].values enc = LabelEncoder() label_encoder = enc.fit(y) y = label_encoder.transform(y) + 1
转化后的结果:
![](https://images2018.cnblogs.com/blog/1257449/201711/1257449-20171123225125500-1760064495.png)
每类样例的均值:
import numpy as np np.set_printoptions(precision=4) mean_vectors = [] for c1 in range(1,4): mean_vectors.append(np.mean(X[y==c1],axis=0) print('Mean vector class %s:%s\n'%(c1,mean_vectors[c1-1]))