3-6 merge operations
In [1]:
import pandas as pd
In [6]:
left = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                     'B': ['B0', 'B1', 'B2', 'B3'],
                     'key': ['K0', 'K1', 'K2', 'K3']})
right = pd.DataFrame({'C': ['C0', 'C1', 'C2', 'C3'],
                      'D': ['D0', 'D1', 'D2', 'D3'],
                      'key': ['K0', 'K1', 'K2', 'K3']})
In [7]:
left
Out[7]:
In [8]:
right
Out[8]:
merge: joining DataFrames
In [10]:
pd.merge(left, right)  # no `on` given: pandas joins on the column names shared by both frames, so 'key' appears only once
Out[10]:
In [12]:
pd.merge(left, right, on='key')  # join explicitly on the 'key' column
Out[12]:
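The two calls above should give the same result here, because 'key' is the only column name the two frames share. A minimal sketch to check that, where the names `implicit` and `explicit` are illustrative and not from the original notebook:

```python
import pandas as pd

left = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                     'B': ['B0', 'B1', 'B2', 'B3'],
                     'key': ['K0', 'K1', 'K2', 'K3']})
right = pd.DataFrame({'C': ['C0', 'C1', 'C2', 'C3'],
                      'D': ['D0', 'D1', 'D2', 'D3'],
                      'key': ['K0', 'K1', 'K2', 'K3']})

# Hypothetical names: `implicit` omits `on`, `explicit` names the key.
implicit = pd.merge(left, right)            # joins on all shared column names
explicit = pd.merge(left, right, on='key')  # same join, key stated explicitly

print(list(implicit.columns))      # the shared 'key' column appears once
print(implicit.equals(explicit))   # True when both calls agree
```

Stating `on` explicitly is usually the safer habit: if a second shared column is added later, the implicit form silently starts joining on both.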
In [13]:
left = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                     'B': ['B0', 'B1', 'B2', 'B3'],
                     'key1': ['K0', 'K1', 'K2', 'K3'],
                     'key2': ['K0', 'K1', 'K2', 'K3']})
right = pd.DataFrame({'C': ['C0', 'C1', 'C2', 'C3'],
                      'D': ['D0', 'D1', 'D2', 'D3'],
                      'key1': ['K0', 'K1', 'K2', 'K3'],
                      'key2': ['K0', 'K1', 'K2', 'K3']})
In [14]:
left
Out[14]:
In [16]:
right
Out[16]:
In [17]:
pd.merge(left, right)  # no `on` given: joins on both shared columns (key1 and key2), each shown only once
Out[17]:
In [18]:
pd.merge(left, right, on='key1')  # join on key1 only; key2 exists in both frames, so it is kept as key2_x (from left) and key2_y (from right)
Out[18]:
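The `_x` / `_y` suffixes are pandas defaults for overlapping non-key columns. As a small sketch, the `suffixes` parameter of `pd.merge` (not used in the original notebook) can give them clearer names; the suffix strings `_left` and `_right` and the variable names below are illustrative choices:

```python
import pandas as pd

left = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                     'B': ['B0', 'B1', 'B2', 'B3'],
                     'key1': ['K0', 'K1', 'K2', 'K3'],
                     'key2': ['K0', 'K1', 'K2', 'K3']})
right = pd.DataFrame({'C': ['C0', 'C1', 'C2', 'C3'],
                      'D': ['D0', 'D1', 'D2', 'D3'],
                      'key1': ['K0', 'K1', 'K2', 'K3'],
                      'key2': ['K0', 'K1', 'K2', 'K3']})

# Default behaviour: the overlapping non-key column is suffixed _x and _y.
default_merge = pd.merge(left, right, on='key1')
print([c for c in default_merge.columns if c.startswith('key2')])

# Assumed naming for illustration: label the copies by their source frame.
named_merge = pd.merge(left, right, on='key1', suffixes=('_left', '_right'))
print([c for c in named_merge.columns if c.startswith('key2')])
```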
In [19]:
pd.merge(left, right, on=['key1', 'key2'])  # join on both key1 and key2
Out[19]:
Make the key2 values differ between the two frames: change the last key2 value in right to K4
In [20]:
right = pd.DataFrame({'C': ['C0', 'C1', 'C2', 'C3'],
                      'D': ['D0', 'D1', 'D2', 'D3'],
                      'key1': ['K0', 'K1', 'K2', 'K3'],
                      'key2': ['K0', 'K1', 'K2', 'K4']})
In [21]:
pd.merge(left, right, on=['key1', 'key2'])  # join on key1 and key2; the default inner join drops the K3 row because its key2 values differ (K3 in left, K4 in right)
Out[21]:
In [22]:
pd.merge(left, right, on=['key1', 'key2'], how='outer')  # how='outer' keeps the union of key pairs; the default how='inner' keeps only the intersection
Out[22]:
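To make the intersection/union difference concrete, the sketch below compares row counts for the two joins on the modified frames. The variable names `inner` and `outer` are illustrative; columns with no match on the other side are filled with NaN.

```python
import pandas as pd

left = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                     'B': ['B0', 'B1', 'B2', 'B3'],
                     'key1': ['K0', 'K1', 'K2', 'K3'],
                     'key2': ['K0', 'K1', 'K2', 'K3']})
right = pd.DataFrame({'C': ['C0', 'C1', 'C2', 'C3'],
                      'D': ['D0', 'D1', 'D2', 'D3'],
                      'key1': ['K0', 'K1', 'K2', 'K3'],
                      'key2': ['K0', 'K1', 'K2', 'K4']})

# Inner join: only (key1, key2) pairs present in both frames survive.
inner = pd.merge(left, right, on=['key1', 'key2'])
# Outer join: every pair from either frame is kept.
outer = pd.merge(left, right, on=['key1', 'key2'], how='outer')

print(len(inner), len(outer))        # 3 rows vs 5 rows
print(outer[['A', 'C']].isna().sum())  # one unmatched row on each side
```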
In [23]:
pd.merge(left, right, on=['key1', 'key2'], how='outer', indicator=True)  # indicator=True adds a _merge column marking each row as 'left_only', 'right_only', or 'both'
Out[23]:
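The `_merge` column is handy for auditing a join. A minimal sketch, with illustrative variable names (`merged`, `counts`, `unmatched_left`), that counts each category and pulls out the rows found only in left:

```python
import pandas as pd

left = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                     'B': ['B0', 'B1', 'B2', 'B3'],
                     'key1': ['K0', 'K1', 'K2', 'K3'],
                     'key2': ['K0', 'K1', 'K2', 'K3']})
right = pd.DataFrame({'C': ['C0', 'C1', 'C2', 'C3'],
                      'D': ['D0', 'D1', 'D2', 'D3'],
                      'key1': ['K0', 'K1', 'K2', 'K3'],
                      'key2': ['K0', 'K1', 'K2', 'K4']})

merged = pd.merge(left, right, on=['key1', 'key2'],
                  how='outer', indicator=True)

# Count how many rows came from each side (cast to str for a plain dict).
counts = merged['_merge'].astype(str).value_counts().to_dict()
print(counts)

# Rows whose key pair exists only in `left`.
unmatched_left = merged[merged['_merge'] == 'left_only']
print(unmatched_left[['key1', 'key2', 'A', 'B']])
```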
In [24]:
pd.merge(left, right, on=['key1', 'key2'], how='left')  # how='left' keeps every row of left; how='right' does the same for right
Out[24]:
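For comparison, here is a short sketch of `how='left'` next to `how='right'` on the same frames. The names `left_join` and `right_join` are illustrative. Each result keeps all four rows of its base frame and fills the other side's columns with NaN where no key pair matches.

```python
import pandas as pd

left = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                     'B': ['B0', 'B1', 'B2', 'B3'],
                     'key1': ['K0', 'K1', 'K2', 'K3'],
                     'key2': ['K0', 'K1', 'K2', 'K3']})
right = pd.DataFrame({'C': ['C0', 'C1', 'C2', 'C3'],
                      'D': ['D0', 'D1', 'D2', 'D3'],
                      'key1': ['K0', 'K1', 'K2', 'K3'],
                      'key2': ['K0', 'K1', 'K2', 'K4']})

# Keep all rows of `left`; C and D are NaN where `right` has no match.
left_join = pd.merge(left, right, on=['key1', 'key2'], how='left')
# Keep all rows of `right`; A and B are NaN where `left` has no match.
right_join = pd.merge(left, right, on=['key1', 'key2'], how='right')

print(left_join['key2'].tolist())
print(right_join['key2'].tolist())
```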
Source: https://www.cnblogs.com/AI-robort/, by karina512 on cnblogs (博客园).