Getting started with pandas: string filtering + groupby + sort

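1. First select the rows whose 'from' column contains a given string (here 'iphone 6s plus'), then group those rows and sort the result in descending order

Roughly equivalent to WHERE + GROUP BY + ORDER BY ... DESC in SQL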

df[df['from'].str.contains('iphone 6s plus')].groupby(['from','to'])['uid'].agg(uv='count').sort_values(by='uv',ascending=False)
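To make the one-liner above easier to follow, here is a minimal sketch that runs on a small made-up DataFrame. The column names come from the post; the sample rows are assumptions for illustration only. It uses named aggregation (`agg(uv='count')`), which should work on recent pandas versions, and passes `regex=False, na=False` to `str.contains` so the pattern is treated as a literal string and missing values are simply excluded.

```python
import pandas as pd

# Hypothetical sample data: column names follow the post, values are made up.
df = pd.DataFrame({
    'from': ['apple-iphone 6s plus', 'apple-iphone 6s plus',
             'apple-iphone 6s plus', 'oppo r9', None],
    'to':   ['apple-iphone 7', 'apple-iphone 7',
             'huawei p9', 'apple-iphone 7', 'apple-iphone 7'],
    'uid':  [1, 2, 3, 4, 5],
})

# WHERE: keep rows whose 'from' contains the substring.
# regex=False matches the text literally; na=False treats missing values as no match.
mask = df['from'].str.contains('iphone 6s plus', regex=False, na=False)

# GROUP BY from, to + COUNT(uid) AS uv, then ORDER BY uv DESC.
uv = (
    df[mask]
    .groupby(['from', 'to'])['uid']
    .agg(uv='count')
    .sort_values(by='uv', ascending=False)
)
print(uv)
```

Splitting the chain into a mask and a grouped result keeps each SQL-like step visible, which can help when debugging an unexpected count.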

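Filter, group, and sort, then take the top values within each group (a roundabout way to sort within groups; I don't know whether there is a better one)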

df[df['from'].str.contains('oppo r9')].groupby(['from','to'])['uid'].agg(uv='count').sort_values(by='uv',ascending=False)['uv'].groupby(level=0,group_keys=False).nlargest(5000).to_csv('/Users/cici/Documents/group_huanji.csv',encoding='utf-8')
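One possible alternative to the `nlargest` step above, offered as a sketch rather than a definitive answer: once the counts are sorted in descending order, `groupby(...).head(n)` keeps the first n rows of each group, which are then also the n largest. The sample data and the value n=2 are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: column names follow the post, values are made up.
df = pd.DataFrame({
    'from': ['oppo r9'] * 6 + ['oppo r9 plus'] * 3,
    'to':   ['a', 'a', 'a', 'b', 'b', 'c', 'a', 'a', 'b'],
    'uid':  range(9),
})

# Same filter + groupby + descending sort as in the post.
counts = (
    df[df['from'].str.contains('oppo r9', regex=False, na=False)]
    .groupby(['from', 'to'])['uid']
    .agg(uv='count')
    .sort_values(by='uv', ascending=False)
)

# Rows are already sorted by uv, so the first 2 rows per 'from' are the top 2.
top2 = counts.groupby(level='from').head(2)
print(top2)
```

Unlike `nlargest`, `head` keeps the rows in their existing order and does not regroup them by 'from', so add a final sort on the index if the output should be grouped visually.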

 

2. Output column C for the rows where column A and column B match given strings (an exact match in this example)

df[(df['from']=='苹果-iphone 6s') & (df['to']=='苹果-iphone 7')]['uid'].to_csv('/Users/cici/Documents/iphone6_ip7.csv',header=False,index=False)
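The two-condition filter above can be tried out on a small made-up DataFrame, as in the sketch below. The sample rows are assumptions; the column names and the two matched strings come from the post. Instead of writing to a file, it calls `to_csv` without a path, which returns the CSV text, so the result can be inspected directly.

```python
import pandas as pd

# Hypothetical sample data: column names follow the post, values are made up.
df = pd.DataFrame({
    'from': ['苹果-iphone 6s', '苹果-iphone 6s', '苹果-iphone 6s plus'],
    'to':   ['苹果-iphone 7', '华为-p9', '苹果-iphone 7'],
    'uid':  [101, 102, 103],
})

# Each comparison needs its own parentheses because & binds tighter than ==.
both = (df['from'] == '苹果-iphone 6s') & (df['to'] == '苹果-iphone 7')

# Select the 'uid' column for the matching rows in a single .loc call.
uids = df.loc[both, 'uid']

# With no path given, to_csv returns the CSV text instead of writing a file.
csv_text = uids.to_csv(header=False, index=False)
print(csv_text)
```

Note that `==` only matches the whole string, so '苹果-iphone 6s plus' is excluded here; use `str.contains` as in part 1 if a substring match is wanted.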

posted on 2017-03-28 15:06  fatcici