An introduction to using pandas.DataFrame.drop_duplicates

Reference: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.drop_duplicates.html

DataFrame.drop_duplicates(subset=None, keep='first', inplace=False, ignore_index=False)

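By default this method removes duplicate rows, comparing all columns. To deduplicate on specific columns only, pass them via the subset parameter.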
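The official demo further down does not exercise subset, so here is a minimal sketch of it. The data matches the docs example; the variable names by_brand and by_brand_style are only illustrative.

```python
import pandas as pd

# Same sample data as the official docs example
df = pd.DataFrame({
    "brand": ["Yum Yum", "Yum Yum", "Indomie", "Indomie", "Indomie"],
    "style": ["cup", "cup", "cup", "pack", "pack"],
    "rating": [4.0, 4.0, 3.5, 15.0, 5.0],
})

# Compare only the brand column: one row per brand survives
by_brand = df.drop_duplicates(subset=["brand"])

# Compare brand and style together; rating is ignored
by_brand_style = df.drop_duplicates(subset=["brand", "style"])

print(by_brand)
print(by_brand_style)
```

Note that rows 3 and 4 differ in rating, yet row 4 is still dropped in the second call because rating is not part of subset.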

keep : {'first', 'last', False}, default 'first'

Determines which duplicates (if any) to keep.
- first : Drop duplicates except for the first occurrence.
- last : Drop duplicates except for the last occurrence.
- False : Drop all duplicates.

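keep lets you choose which of the duplicated rows survives: 'first' keeps the first occurrence, 'last' keeps the last occurrence, and False drops every row that has a duplicate.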

I won't go into inplace here.
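For completeness, a small sketch of what inplace=True does, based on the documented return value (None when inplace=True). The tiny DataFrame here is made up for illustration.

```python
import pandas as pd

# Illustrative data, not from the official demo
df = pd.DataFrame({
    "brand": ["Yum Yum", "Yum Yum", "Indomie"],
    "rating": [4.0, 4.0, 3.5],
})

# With inplace=True the DataFrame itself is modified and nothing is returned
result = df.drop_duplicates(inplace=True)

print(result)  # None
print(df)      # df now holds only the deduplicated rows
```

A common mistake is writing df = df.drop_duplicates(inplace=True), which rebinds df to None.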

ignore_index : bool, default False

If True, the resulting axis will be labeled 0, 1, …, n - 1.

New in version 1.0.0.

This controls whether the index is renumbered from 0 after the duplicates are removed.
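One way to think about ignore_index=True is as a shortcut for deduplicating and then calling reset_index(drop=True). The sketch below checks that the two give the same result on the docs sample data; the names a and b are just placeholders.

```python
import pandas as pd

df = pd.DataFrame({
    "brand": ["Yum Yum", "Yum Yum", "Indomie", "Indomie", "Indomie"],
    "style": ["cup", "cup", "cup", "pack", "pack"],
    "rating": [4.0, 4.0, 3.5, 15.0, 5.0],
})

# Renumber the index as part of the call
a = df.drop_duplicates(ignore_index=True)

# Deduplicate first, then discard the old index labels
b = df.drop_duplicates().reset_index(drop=True)

print(a.equals(b))
```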

 

Here is the official demo:

In [8]: df                                                                                                                   
Out[8]: 
     brand style  rating
0  Yum Yum   cup     4.0
1  Yum Yum   cup     4.0
2  Indomie   cup     3.5
3  Indomie  pack    15.0
4  Indomie  pack     5.0

In [9]: df.drop_duplicates()                                                                                                 
Out[9]: 
     brand style  rating
0  Yum Yum   cup     4.0
2  Indomie   cup     3.5
3  Indomie  pack    15.0
4  Indomie  pack     5.0

In [10]: df.drop_duplicates(ignore_index=True)                                                                               
Out[10]: 
     brand style  rating
0  Yum Yum   cup     4.0
1  Indomie   cup     3.5
2  Indomie  pack    15.0
3  Indomie  pack     5.0

In [11]: df.drop_duplicates(keep='last')                                                                                     
Out[11]: 
     brand style  rating
1  Yum Yum   cup     4.0
2  Indomie   cup     3.5
3  Indomie  pack    15.0
4  Indomie  pack     5.0

In [12]: df.drop_duplicates(keep=False)                                                                                      
Out[12]: 
     brand style  rating
2  Indomie   cup     3.5
3  Indomie  pack    15.0
4  Indomie  pack     5.0
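The session above starts from an existing df without showing how it was built. A sketch that reconstructs it from the printed values and checks which index labels survive under each keep setting (the dictionary name kept is hypothetical):

```python
import pandas as pd

# Rebuilt from the values printed in Out[8]
df = pd.DataFrame({
    "brand": ["Yum Yum", "Yum Yum", "Indomie", "Indomie", "Indomie"],
    "style": ["cup", "cup", "cup", "pack", "pack"],
    "rating": [4.0, 4.0, 3.5, 15.0, 5.0],
})

# Collect the surviving index labels for each keep option
kept = {
    "first": list(df.drop_duplicates(keep="first").index),
    "last": list(df.drop_duplicates(keep="last").index),
    "none": list(df.drop_duplicates(keep=False).index),
}

for option, labels in kept.items():
    print(option, labels)
```

Only rows 0 and 1 are identical in every column, so they are the only rows affected by the choice of keep.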

  

 

posted @ 2021-02-02 12:01  就是想学习