Normalization Methods: Z-score
Z-score
Definition
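The z-score measures how many standard deviations a data point lies from the mean. In words:
\(z = \frac{\text{data point} - \text{mean}}{\text{standard deviation}}\)
In standard mathematical notation:
\(z = \frac{x-\mu}{\sigma}\)
where \(x\) is the data point, \(\mu\) is the mean, and \(\sigma\) is the standard deviation.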
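Some important properties of the z-score:
- A positive z-score means the data point is above the mean;
- A negative z-score means the data point is below the mean;
- A z-score close to 0 means the data point is close to the mean;
- A z-score above 3 or below -3 means the data point is unusually far from the mean (for roughly normally distributed data, about 99.7% of values fall within 3 standard deviations), so it may be an outlier and may not be usable.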
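The properties above can be checked with a minimal sketch. It assumes NumPy, the population standard deviation (ddof=0), and a made-up sample; the helper names `zscores` and `flag_outliers` are hypothetical and not part of the original article.

```python
import numpy as np

def zscores(values):
    """Return (x - mean) / std for each value, using the population std (ddof=0)."""
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / values.std()

def flag_outliers(values, threshold=3.0):
    """Boolean mask of points whose |z| exceeds the threshold (3 by convention)."""
    return np.abs(zscores(values)) > threshold

# Hypothetical sample: twenty values close to 10 plus one far-away value
sample = np.array([10.0] * 10 + [10.2] * 5 + [9.8] * 5 + [25.0])

z = zscores(sample)           # sign shows which side of the mean each point is on
mask = flag_outliers(sample)  # True where |z| > 3
print(z.round(3))
print(mask)
```

In this sample only the last value is flagged. Note that in small samples a single extreme value also inflates the mean and standard deviation, so the |z| > 3 rule is a rough screen rather than a strict test.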
Z-score
Python implementation
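import numpy as np
from scipy import stats

def normalize(data):
    # Z-score each column (stats.zscore uses the population std, ddof=0, by default)
    data = np.array(data, dtype=float)  # copy so the caller's array is not modified
    for i in range(data.shape[1]):  # loop over every column instead of a fixed 3
        data[:, i] = stats.zscore(data[:, i])
    return data

data_ex = np.array([[-2.5022, 7.8546, 5.4552],
                    [-2.2184, 8.036 , 5.5997],
                    [-2.3919, 8.0438, 5.3814],
                    [-2.3578, 8.0125, 5.2548],
                    [-2.4651, 7.8921, 5.2071],
                    [-2.3001, 7.9735, 5.3466]])
normalized_data_ex = normalize(data_ex)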
The result:
normalized_data_ex
array([[-1.35407489, -1.58814724,  0.62667636],
       [ 1.61071709,  0.93563646,  1.74371667],
       [-0.20179668,  1.04415638,  0.05617411],
       [ 0.15443801,  0.60868543, -0.92249234],
       [-0.9664999 , -1.06641687, -1.2912316 ],
       [ 0.75721637,  0.06608585, -0.2128432 ]])
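As a cross-check on this result, the column-wise loop can probably be replaced by a single call with `axis=0`. The sketch below assumes SciPy and NumPy are installed and compares `stats.zscore` with the formula \(z = \frac{x-\mu}{\sigma}\) applied directly. The names `z_scipy`, `z_manual`, `col_means`, and `col_stds` are illustrative only.

```python
import numpy as np
from scipy import stats

data = np.array([[-2.5022, 7.8546, 5.4552],
                 [-2.2184, 8.036 , 5.5997],
                 [-2.3919, 8.0438, 5.3814],
                 [-2.3578, 8.0125, 5.2548],
                 [-2.4651, 7.8921, 5.2071],
                 [-2.3001, 7.9735, 5.3466]])

# Column-wise z-score in one call (axis=0 normalizes each column independently)
z_scipy = stats.zscore(data, axis=0)

# The same computation written out from the definition, with the population std
z_manual = (data - data.mean(axis=0)) / data.std(axis=0)

# After normalization each column should have mean ~0 and standard deviation ~1
col_means = z_scipy.mean(axis=0)
col_stds = z_scipy.std(axis=0)
print(np.allclose(z_scipy, z_manual))
print(col_means.round(6), col_stds.round(6))
```

If the sample standard deviation is wanted instead, `stats.zscore` accepts `ddof=1`; the values then differ slightly from the output shown above.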