3-3 groupby operations
The data used in the Pandas chapters can be downloaded from: https://files.cnblogs.com/files/AI-robort/Titanic_Data-master.zip
In [1]:
import pandas as pd
df = pd.DataFrame({'key': ['A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'C'],
                   'data': [0, 5, 10, 5, 10, 15, 10, 15, 20]})
df
Out[1]:
In [3]:
for key in ['A', 'B', 'C']:
    print(key, df[df['key'] == key].sum())  # sum the rows belonging to each key value
In [4]:
df.groupby('key').sum()  # same grouping as the manual loop above, in one call
Out[4]:
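To check that the loop and groupby really agree, here is a minimal sketch using the same toy frame. It restricts both versions to the 'data' column, because the loop's .sum() runs on the whole filtered frame and so includes the string 'key' column as well. The names manual and grouped are introduced here for illustration only.

```python
import pandas as pd

# Same toy frame as in the cells above.
df = pd.DataFrame({'key': ['A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'C'],
                   'data': [0, 5, 10, 5, 10, 15, 10, 15, 20]})

# Manual version: filter the rows for each key, then sum only the numeric column.
manual = {key: int(df.loc[df['key'] == key, 'data'].sum())
          for key in ['A', 'B', 'C']}

# groupby version: split by key, sum each group, combine into one Series.
grouped = df.groupby('key')['data'].sum()

print(manual)
print(grouped.to_dict())
```

Both print statements should show the same key-to-sum mapping, which is the split-apply-combine idea behind groupby.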
In [7]:
import numpy as np
df.groupby('key').aggregate(np.mean)  # aggregate applies a reduction to each group, e.g. NumPy's sum or mean
Out[7]:
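aggregate (also available under the shorter name agg) can take a list of reductions, which gives one column per statistic. The sketch below passes the reductions as strings rather than NumPy functions; that is a choice made for this example, not something the original cell does.

```python
import pandas as pd

# Same toy frame as in the cells above.
df = pd.DataFrame({'key': ['A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'C'],
                   'data': [0, 5, 10, 5, 10, 15, 10, 15, 20]})

# One row per key, one column per requested statistic.
summary = df.groupby('key')['data'].agg(['sum', 'mean', 'min', 'max'])
print(summary)
```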
In [8]:
df1=pd.read_csv('./Titanic_Data-master/Titanic_Data-master/train.csv')
In [13]:
df1.groupby('Sex')['Age'].mean()  # mean age for each sex
Out[13]:
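If the Age column has missing values, the group mean skips them by default, so it is worth checking how many known ages each group actually has. The sketch below uses a tiny hypothetical frame (toy, with invented values) instead of train.csv, so that it runs without the download.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Titanic data; the values are invented.
toy = pd.DataFrame({'Sex': ['female', 'female', 'male', 'male'],
                    'Age': [20.0, np.nan, 30.0, 40.0]})

# mean() ignores NaN, so the female mean is based on a single known age.
mean_age = toy.groupby('Sex')['Age'].mean()

# count() reports the number of non-missing ages per group.
n_known = toy.groupby('Sex')['Age'].count()

print(mean_age)
print(n_known)
```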
In [14]:
df1.groupby('Sex')['Survived'].mean()  # survival rate for each sex (Survived is 0 or 1, so the mean is the fraction who survived)
Out[14]:
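The mean of a 0/1 column is the share of ones, which is why the mean of Survived can be read as a survival rate. A minimal sketch on a hypothetical six-row frame (toy, with invented values, not the real Titanic data) shows that the mean equals survivors divided by group size.

```python
import pandas as pd

# Hypothetical stand-in for train.csv; the values are invented for illustration.
toy = pd.DataFrame({'Sex': ['female', 'female', 'female', 'male', 'male', 'male'],
                    'Survived': [1, 1, 0, 0, 0, 1]})

by_sex = toy.groupby('Sex')['Survived']

# Mean of a 0/1 column ...
rate = by_sex.mean()

# ... equals the number of survivors divided by the number of passengers.
share = by_sex.sum() / by_sex.count()

print(rate)
print(share)
```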
This resource comes from https://www.cnblogs.com/AI-robort/, the cnblogs (博客园) blog of karina512.