Feature Scaling and Feature Selection

Feature Scaling

x' = (x-min)/(max-min)

features = [125, 140, 185]
lo, hi = min(features), max(features)  # min and max must be computed from the data first
data = [float(x - lo) / float(hi - lo) for x in features]
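
The formula above divides by max - min, which is zero when every value of a feature is the same. Here is a minimal sketch of a reusable helper that guards against that case. The name min_max_scale and the choice to map a constant feature to 0.0 are assumptions made for illustration, not part of the original notes.

```python
def min_max_scale(values):
    """Rescale a list of numbers to [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant feature carries no information, so map it to 0.0
        # instead of dividing by zero.
        return [0.0 for _ in values]
    return [float(x - lo) / float(hi - lo) for x in values]


scaled = min_max_scale([125, 140, 185])
print(scaled)  # [0.0, 0.25, 1.0]
```

The middle value works out to (140 - 125) / (185 - 125) = 15 / 60 = 0.25.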

sklearn.preprocessing.MinMaxScaler

>>> from sklearn.preprocessing import MinMaxScaler
>>>
>>> data = [[-1, 2], [-0.5, 6], [0, 10], [1, 18]]
>>> scaler = MinMaxScaler()
>>> print(scaler.fit(data))
MinMaxScaler(copy=True, feature_range=(0, 1))
>>> print(scaler.data_max_)
[  1.  18.]
>>> print(scaler.transform(data))
[[ 0.    0.  ]
 [ 0.25  0.25]
 [ 0.5   0.5 ]
 [ 1.    1.  ]]
>>> print(scaler.transform([[2, 2]]))
[[ 1.5  0. ]]
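
To see what the scaler does in the session above, the same numbers can be reproduced by hand with NumPy. This is a rough sketch: the helper manual_transform and the variables col_min and col_max are names made up for illustration. It applies the formula column by column, using the minimum and maximum learned from the fitted data only. That is why a new point such as [2, 2] can land outside [0, 1].

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

data = np.array([[-1, 2], [-0.5, 6], [0, 10], [1, 18]], dtype=float)

# Column-wise minimum and maximum, taken from the data the scaler was fitted on.
col_min = data.min(axis=0)
col_max = data.max(axis=0)


def manual_transform(X):
    """Apply (x - min) / (max - min) per column with the fitted min and max."""
    return (np.asarray(X, dtype=float) - col_min) / (col_max - col_min)


scaler = MinMaxScaler().fit(data)

# The hand-written version should agree with scikit-learn on the fitted data.
print(np.allclose(manual_transform(data), scaler.transform(data)))

# A point outside the fitted range is not clipped: (2 - (-1)) / (1 - (-1)) = 1.5
new_point = manual_transform([[2, 2]])
print(new_point)
```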

Feature Selection

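Filter and wrapper methods
Filter methods are fast, but they ignore the learner's bias, so they sometimes miss the point. Wrapper methods are slower because they run the learner itself to evaluate features, but they are useful for the same reason.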
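
The contrast between the two approaches can be sketched with scikit-learn. The example below is one possible illustration and does not come from the original notes. The iris dataset, k=2, and LogisticRegression as the learner are all assumptions. SelectKBest with f_classif stands in for a filter (each feature is scored on its own, with no learner involved). RFE stands in for a wrapper-style method (the learner is refit repeatedly while the weakest feature is dropped).

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import RFE, SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Filter: rank features by a univariate ANOVA F-score, independent of any learner.
filter_selector = SelectKBest(score_func=f_classif, k=2).fit(X, y)
filter_mask = filter_selector.get_support()

# Wrapper-style: fit the learner, drop the weakest feature, and repeat.
learner = LogisticRegression(max_iter=1000)
wrapper_selector = RFE(learner, n_features_to_select=2).fit(X, y)
wrapper_mask = wrapper_selector.get_support()

print("filter keeps :", filter_mask)
print("wrapper keeps:", wrapper_mask)
```

The filter only computes one score per feature, while the wrapper has to train the model several times. This matches the speed trade-off described above.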
Strong relevance and weak relevance

posted @ 2017-12-07 09:54  james.yj