Boolean Indexing

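Consider an array that stores data and an array of names that contains duplicates. The data consists of normally distributed random values generated with the randn function from numpy.random (np in the session below is the usual alias for the numpy module).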

In [15]: names = np.array(['Bob','Will','Joe','Bob','Will','Joe','Bob'])

In [16]: data = np.random.randn(7,4)

In [17]: names
Out[17]: 
array(['Bob', 'Will', 'Joe', 'Bob', 'Will', 'Joe', 'Bob'],
      dtype='|S4')

In [18]: data
Out[18]: 
array([[ 0.70228847,  2.15235924, -0.44546734,  0.11531414],
       [ 0.09055973,  0.34700575, -0.21555319,  0.61604966],
       [-1.82610739, -0.06912911, -0.02635386, -0.39026196],
       [-0.64379412, -0.0173949 ,  0.79323255,  1.51808104],
       [ 0.09152407,  3.40042424,  0.93578726,  0.02730237],
       [ 0.21693798,  1.29032108, -0.86582956,  0.09536743],
       [ 1.58762538,  0.22749992,  0.30686374, -0.74349097]])
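The session above appears to come from an older environment (dtype='|S4' is a byte-string dtype). As a rough sketch, the same setup on a current NumPy and Python 3 could look like the following. The use of np.random.default_rng and the seed value 0 are assumptions made here for reproducibility; they are not part of the original session, so the random values will differ from those shown above.

```python
import numpy as np

# Assumption: a seeded Generator, so the sketch is reproducible.
# The original session used np.random.randn(7, 4) without a seed.
rng = np.random.default_rng(0)

names = np.array(['Bob', 'Will', 'Joe', 'Bob', 'Will', 'Joe', 'Bob'])
data = rng.standard_normal((7, 4))  # 7 rows (one per name), 4 columns

print(names.shape)  # (7,)
print(data.shape)   # (7, 4)
print(names.dtype)  # a Unicode string dtype on Python 3
```

The only property the rest of the post relies on is that names has one entry per row of data.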

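Suppose each name corresponds to one row of the data array, and we want to select all rows that belong to the name 'Bob'. Like arithmetic operations, comparisons on arrays (such as ==) are vectorized, so comparing names with the string 'Bob' produces a boolean array, and that boolean array can then be used to index data.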

In [24]: names == 'Bob'
Out[24]: array([ True, False, False,  True, False, False,  True], dtype=bool)

In [26]: data[names == 'Bob']
Out[26]: 
array([[ 0.70228847,  2.15235924, -0.44546734,  0.11531414],
       [-0.64379412, -0.0173949 ,  0.79323255,  1.51808104],
       [ 1.58762538,  0.22749992,  0.30686374, -0.74349097]])
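Because the random values make it hard to see which rows were picked, here is a small sketch of the same selection on deterministic data. The np.arange-based data array and the variable names is_bob and bob_rows are assumptions introduced for illustration only.

```python
import numpy as np

names = np.array(['Bob', 'Will', 'Joe', 'Bob', 'Will', 'Joe', 'Bob'])
# Assumption: deterministic stand-in for the random data; row i holds 4*i .. 4*i+3.
data = np.arange(28).reshape(7, 4)

is_bob = names == 'Bob'   # elementwise comparison, one boolean per name
bob_rows = data[is_bob]   # keeps only the rows where the mask is True

print(is_bob.sum())            # 3 names equal 'Bob'
print(np.flatnonzero(is_bob))  # [0 3 6], the positions of those names
print(bob_rows)                # rows 0, 3 and 6 of data
```

Indexing with the mask gives the same rows as indexing with the integer positions [0, 3, 6].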

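The boolean array must have the same length as the axis it indexes. A boolean array can also be combined with slices or integers in the same indexing expression.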

In [27]: data[names == 'Bob',2:]
Out[27]: 
array([[-0.44546734,  0.11531414],
       [ 0.79323255,  1.51808104],
       [ 0.30686374, -0.74349097]])

In [28]: data[names == 'Bob',3]
Out[28]: array([ 0.11531414,  1.51808104, -0.74349097])
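The length requirement and the mixed indexing described above can be sketched as follows. The deterministic data array and the names short_mask and mismatch_error are assumptions for illustration. The IndexError on a mismatched mask is what recent NumPy versions raise; older releases may behave differently.

```python
import numpy as np

names = np.array(['Bob', 'Will', 'Joe', 'Bob', 'Will', 'Joe', 'Bob'])
data = np.arange(28).reshape(7, 4)  # assumed stand-in for the random data
mask = names == 'Bob'

cols_from_2 = data[mask, 2:]  # boolean rows + slice of columns, shape (3, 2)
col_3 = data[mask, 3]         # boolean rows + single column, shape (3,)
print(cols_from_2)
print(col_3)

# A mask whose length differs from the indexed axis is rejected.
short_mask = mask[:5]  # length 5, but axis 0 of data has length 7
try:
    data[short_mask]
    mismatch_error = None
except IndexError as exc:
    mismatch_error = exc
print(type(mismatch_error).__name__)
```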

To select everything except 'Bob', use the not-equal operator (!=).

In [29]: names != 'Bob'
Out[29]: array([False,  True,  True, False,  True,  True, False], dtype=bool)
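An alternative not shown in the original session is to negate an existing boolean array with the ~ operator, which is convenient when the condition is already stored in a variable. A minimal sketch, again with an assumed deterministic data array:

```python
import numpy as np

names = np.array(['Bob', 'Will', 'Joe', 'Bob', 'Will', 'Joe', 'Bob'])
data = np.arange(28).reshape(7, 4)  # assumed stand-in for the random data

cond = names == 'Bob'
not_bob_a = data[names != 'Bob']  # the != form from the text
not_bob_b = data[~cond]           # elementwise negation of the stored mask

print(not_bob_a)                             # rows 1, 2, 4 and 5
print(np.array_equal(not_bob_a, not_bob_b))  # True
```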

To select two of the three names, combine multiple boolean conditions with boolean arithmetic operators such as & (and) and | (or).

In [32]: mask = (names == 'Bob') | (names == 'Will')

In [33]: mask
Out[33]: array([ True,  True, False,  True,  True, False,  True], dtype=bool)

In [34]: data[mask]
Out[34]: 
array([[ 0.70228847,  2.15235924, -0.44546734,  0.11531414],
       [ 0.09055973,  0.34700575, -0.21555319,  0.61604966],
       [-0.64379412, -0.0173949 ,  0.79323255,  1.51808104],
       [ 0.09152407,  3.40042424,  0.93578726,  0.02730237],
       [ 1.58762538,  0.22749992,  0.30686374, -0.74349097]])
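Two details are easy to trip over when combining conditions: each comparison needs its own parentheses, and the Python keywords and/or do not work elementwise on arrays. The sketch below illustrates both. It also shows np.isin as one possible alternative for membership tests; that function is not used in the original post and is included here only as a suggestion.

```python
import numpy as np

names = np.array(['Bob', 'Will', 'Joe', 'Bob', 'Will', 'Joe', 'Bob'])

# | is elementwise "or", & is elementwise "and"; parenthesize each comparison.
mask_or = (names == 'Bob') | (names == 'Will')
mask_and = (names != 'Bob') & (names != 'Will')  # leaves only 'Joe'

# Alternative (not from the original post): membership test with np.isin.
mask_isin = np.isin(names, ['Bob', 'Will'])

# The keyword `or` asks for a single truth value of a whole array and fails.
try:
    bad = (names == 'Bob') or (names == 'Will')
    keyword_error = None
except ValueError as exc:
    keyword_error = exc

print(mask_or)
print(mask_and)
print(np.array_equal(mask_or, mask_isin))  # True
```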

Boolean arrays can also be used for assignment. For example, to set all negative values in data to 0:

In [35]: data[data < 0] = 0

In [36]: data
Out[36]: 
array([[ 0.70228847,  2.15235924,  0.        ,  0.11531414],
       [ 0.09055973,  0.34700575,  0.        ,  0.61604966],
       [ 0.        ,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.79323255,  1.51808104],
       [ 0.09152407,  3.40042424,  0.93578726,  0.02730237],
       [ 0.21693798,  1.29032108,  0.        ,  0.09536743],
       [ 1.58762538,  0.22749992,  0.30686374,  0.        ]])
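Note that the assignment above changes data in place. If the original values are still needed, one option is np.where, which builds a new array instead. This is a sketch on a small hand-written array; the sample values and the names clipped and original are assumptions chosen for illustration.

```python
import numpy as np

# Assumption: small hand-written sample instead of the random data.
data = np.array([[1.5, -2.0],
                 [-0.5, 3.0],
                 [0.0, -4.0]])
original = data.copy()

# Non-mutating variant: take 0 where data < 0, otherwise keep the value.
clipped = np.where(data < 0, 0, data)
print(np.array_equal(data, original))  # True, data is untouched so far

# In-place variant from the text: overwrite the negative entries.
data[data < 0] = 0
print(np.array_equal(data, clipped))   # True, both approaches agree
```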

 

posted @ 2017-11-22 15:41  薛乔毓