Study Notes 226: Handling NaN and Inf in Python NumPy

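I. Concepts

NaN: Not a Number

Inf: Infinity

These are easy to confuse with None. None is what Python itself uses to mark missing data, while NaN is what NumPy and pandas use. None is a special Python object with its own type (NoneType), whereas NaN is a special float value. These notes cover only the handling of NaN and Inf.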
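The difference between None, NaN, and Inf can be checked directly. The following is a minimal sketch; the variable names are only illustrative and are not part of the original notes.

```python
import numpy as np
import pandas as pd

# None is Python's own singleton for "no value"; NaN and Inf are ordinary floats.
none_type_name = type(None).__name__      # 'NoneType'
nan_is_float = isinstance(np.nan, float)  # True
inf_is_float = isinstance(np.inf, float)  # True

# NaN never compares equal to anything, including itself,
# so detect it with np.isnan rather than with ==.
nan_equals_itself = (np.nan == np.nan)    # False
nan_detected = bool(np.isnan(np.nan))     # True

# Inf carries a sign; np.isinf detects both +Inf and -Inf.
inf_detected = bool(np.isinf(-np.inf))    # True

# pandas converts None to NaN when it builds a float Series.
s = pd.Series([1.0, None, 3.0])
print(s.isna().tolist())
```

Because `np.nan == np.nan` is False, a test such as `df == np.nan` will not find any missing value, which is why the sections below rely on `np.isnan`.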

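II. Handling NaN and Inf (mainly NaN; the same methods can be adapted for Inf)

1. Finding the rows and columns that contain NaN or Inf

The key is to combine np.where with np.isnan.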

import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(24).reshape(4,6), index=list('abcd'), columns=list('xyzuvw'))
print(df)
Output:
    x   y   z   u   v   w
a   0   1   2   3   4   5
b   6   7   8   9  10  11
c  12  13  14  15  16  17
d  18  19  20  21  22  23
# Set the first column of df to NaN
df['x'] = np.nan
print(df)
Output:
    x   y   z   u   v   w
a NaN   1   2   3   4   5
b NaN   7   8   9  10  11
c NaN  13  14  15  16  17
d NaN  19  20  21  22  23
print(np.where(np.isnan(df)))
# The result is a tuple: the first array holds the row indices and the second holds the column indices.
Output:
(array([0, 1, 2, 3], dtype=int64), array([0, 0, 0, 0], dtype=int64))
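The same pattern can be adapted for Inf by swapping `np.isnan` for `np.isinf`, or for `~np.isfinite` to catch NaN and Inf together. This is a sketch on a made-up frame; `df_inf` and the other names are hypothetical and do not come from the original notes.

```python
import numpy as np
import pandas as pd

# Hypothetical example frame containing NaN, +Inf, and -Inf.
df_inf = pd.DataFrame(
    {"x": [1.0, np.inf, 3.0], "y": [np.nan, 5.0, -np.inf]},
    index=list("abc"),
)
values = df_inf.to_numpy()

# np.isinf flags +Inf and -Inf but not NaN.
inf_rows, inf_cols = np.where(np.isinf(values))

# ~np.isfinite flags NaN, +Inf, and -Inf together.
bad_rows, bad_cols = np.where(~np.isfinite(values))

# Map the positional indices back to row and column labels.
inf_labels = [
    (df_inf.index[r], df_inf.columns[c]) for r, c in zip(inf_rows, inf_cols)
]
print(inf_labels)
```

Working on `df_inf.to_numpy()` keeps the result purely positional; the final list comprehension is one way to translate positions back into labels.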

2. Processing the data

(1) Replacing values

The key is again isnan, which gives a boolean mask marking the positions of the NaN values.

# Assign through the boolean mask; do not rebind df itself
df[np.isnan(df)] = 2
print(df)
# The result is as follows
Output:
     x   y   z   u   v   w
a  2.0   1   2   3   4   5
b  2.0   7   8   9  10  11
c  2.0  13  14  15  16  17
d  2.0  19  20  21  22  23
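Both libraries also provide dedicated replacement helpers, which may be easier to read than mask assignment and which extend to Inf. The sketch below uses a made-up frame and arbitrary fill values (2, 1e6, -1e6); it assumes NumPy 1.17 or later, where `np.nan_to_num` accepts the `nan`, `posinf`, and `neginf` keywords.

```python
import numpy as np
import pandas as pd

# Hypothetical example data with NaN, +Inf, and -Inf.
df_demo = pd.DataFrame({"x": [np.nan, 1.0, np.inf], "y": [4.0, -np.inf, 6.0]})

# pandas: fillna replaces NaN only; Inf is left untouched.
filled = df_demo.fillna(2)

# To treat Inf like NaN, turn it into NaN first, then fill.
filled_all = df_demo.replace([np.inf, -np.inf], np.nan).fillna(2)

# NumPy: nan_to_num replaces NaN, +Inf, and -Inf in one call.
arr = np.array([np.nan, 1.0, np.inf, -np.inf])
cleaned = np.nan_to_num(arr, nan=2.0, posinf=1e6, neginf=-1e6)

print(filled_all)
print(cleaned)
```

Both helpers return new objects and leave `df_demo` and `arr` unchanged, unlike the in-place mask assignment shown above.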

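(2) Deleting the affected data

If the rows or columns that contain NaN do not matter for the overall analysis, consider removing them.

The main idea is to use isnan to mark where the NaN values are, use where to turn that mask into a tuple of index arrays, and then use delete to remove the corresponding rows.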

# Create test data
x = np.arange(24, dtype=float).reshape(4, 6)
x[2,3]=np.nan
x[0,4]=np.nan
print(x)
Output:
[[ 0.  1.  2.  3. nan  5.]
 [ 6.  7.  8.  9. 10. 11.]
 [12. 13. 14. nan 16. 17.]
 [18. 19. 20. 21. 22. 23.]]

# Delete the rows that contain NaN
x1 = np.delete(x, np.where(np.isnan(x))[0], axis=0)
print(x1)
Output:
[[ 6.  7.  8.  9. 10. 11.]
 [18. 19. 20. 21. 22. 23.]]
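An alternative to `np.where` plus `np.delete` is to reduce the mask with `any` and index with it directly, which also works for columns. This is a sketch of that variant, not the method from the referenced article; the names `nan_mask`, `rows_kept`, `cols_kept`, and `finite_rows` are illustrative.

```python
import numpy as np

# Same test data as above.
x = np.arange(24, dtype=float).reshape(4, 6)
x[2, 3] = np.nan
x[0, 4] = np.nan

nan_mask = np.isnan(x)

# Keep only the rows in which no element is NaN.
rows_kept = x[~nan_mask.any(axis=1)]

# Keep only the columns in which no element is NaN.
cols_kept = x[:, ~nan_mask.any(axis=0)]

# np.isfinite would drop rows containing NaN or Inf in one step.
finite_rows = x[np.isfinite(x).all(axis=1)]

print(rows_kept)
print(cols_kept)
```

For a DataFrame, `df.dropna(axis=0)` and `df.dropna(axis=1)` play the same role for NaN.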

Reference: https://zhuanlan.zhihu.com/p/38712765

posted @ 2021-08-02 11:48 何弈
