Data Wrangling

1. Dropping data along a given axis

import numpy as np
import pandas as pd

data = pd.DataFrame(np.arange(16).reshape(4, 4),
                    index=['Shenzhen', 'Guangzhou', 'Beijing', 'Shanghai'],
                    columns=['one', 'two', 'three', 'four'])
data
           one  two  three  four
Shenzhen     0    1      2     3
Guangzhou    4    5      6     7
Beijing      8    9     10    11
Shanghai    12   13     14    15

data.drop(['Shenzhen', 'Guangzhou'])
          one  two  three  four
Beijing     8    9     10    11
Shanghai   12   13     14    15

data.drop(['two'], axis=1)

This drops the second column ('two'). drop returns a new DataFrame and leaves data itself unchanged.
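
The two drop calls above can be checked side by side with a small script. This is a minimal sketch that reuses the data frame from this section; the variable names rows_dropped, cols_dropped and cols_dropped_kw are made up for illustration.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(
    np.arange(16).reshape(4, 4),
    index=["Shenzhen", "Guangzhou", "Beijing", "Shanghai"],
    columns=["one", "two", "three", "four"],
)

# Drop rows by label (axis=0 is the default).
rows_dropped = data.drop(["Shenzhen", "Guangzhou"])

# Drop a column: axis=1 and columns= are two spellings of the same request.
cols_dropped = data.drop(["two"], axis=1)
cols_dropped_kw = data.drop(columns=["two"])

print(rows_dropped)
print(cols_dropped)
print(data.shape)  # the original frame still has all rows and columns
```

Both spellings of the column drop give the same result, so which one to use is mostly a readability choice.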

2. Function mapping

  NumPy ufuncs can also be applied directly to pandas objects.

  For example: np.fabs(frame)
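
As a rough illustration of the ufunc call above: the post never shows how frame was built, so the small frame below is hypothetical sample data. The sketch only shows that the result is still a DataFrame with the same labels.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the original post does not show how frame was built.
frame = pd.DataFrame(
    [[1.5, -2.5], [-0.5, 0.25]],
    index=["Shenzhen", "Guangzhou"],
    columns=["one", "two"],
)

# np.fabs works element by element and keeps the row and column labels.
abs_frame = np.fabs(frame)

print(type(abs_frame).__name__)
print(abs_frame)
```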

  DataFrame.apply

    DataFrame.apply(func, axis=0, broadcast=None, raw=False, reduce=None, result_type=None, args=(), **kwds)

    Apply a function along an axis of the DataFrame.

  DataFrame.applymap

    DataFrame.applymap(func)

    Apply a function to every element of the DataFrame.

  Series.map

    Series.map(arg, na_action=None)

    Map values of Series using input correspondence (a dict, Series, or function).

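# Note: frame is not defined in the original post. The outputs below are
# consistent with a 4x3 DataFrame of random floats (city names as the index,
# columns 'one', 'two', 'three').

def f1(s):
    # Range of a Series: largest value minus smallest value
    return s.max() - s.min()

f = lambda x: x.max() - x.min()  # equivalent lambda form of f1
frame.apply(f1)  # applied column by column (axis=0)
one      3.168231
two      3.324250
three    2.111743
dtype: float64

f = lambda x: '%.2f' % x  # format each element with two decimal places
frame.applymap(f)
             one    two  three
Shenzhen    1.55  -2.59  -1.21
Guangzhou   0.42  -0.16   0.17
Shanghai   -1.62   0.73  -0.87
Beijing     0.33   0.00   0.90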
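
To make the three mapping tools easier to verify by hand, here is a small sketch on fixed numbers instead of random ones. The sample data and the names value_range, col_range, row_range, formatted and region are assumptions made for illustration. The element-wise step is written as apply plus Series.map, which should behave like applymap without depending on it.

```python
import pandas as pd

# Hypothetical sample data chosen so the results are easy to check by hand.
frame = pd.DataFrame(
    {"one": [1.0, 4.0, -2.0], "two": [0.5, 0.25, 3.0]},
    index=["Shenzhen", "Guangzhou", "Beijing"],
)

def value_range(s):
    """Return max minus min of a Series."""
    return s.max() - s.min()

col_range = frame.apply(value_range)          # one result per column
row_range = frame.apply(value_range, axis=1)  # one result per row

# Element-wise formatting: Series.map applied to each column in turn.
formatted = frame.apply(lambda col: col.map(lambda v: "%.2f" % v))

# Series.map also accepts a dict as the correspondence.
cities = pd.Series(frame.index, index=frame.index)
region = cities.map(
    {"Shenzhen": "south", "Guangzhou": "south", "Beijing": "north"}
)

print(col_range)
print(row_range)
print(formatted)
print(region)
```

With axis=0 (the default) the function receives one column at a time; with axis=1 it receives one row at a time.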

3. Sorting

  sort_index / sort_values
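
The post only names the two sorting methods, so here is a minimal sketch of how they differ. The sample frame and variable names are hypothetical.

```python
import pandas as pd

# Hypothetical sample data.
frame = pd.DataFrame(
    {"one": [2, 3, 1], "two": [9, 7, 8]},
    index=["Shenzhen", "Beijing", "Guangzhou"],
)

by_label = frame.sort_index()                      # rows ordered by index label
by_label_desc = frame.sort_index(ascending=False)  # reverse label order
by_column_label = frame.sort_index(axis=1, ascending=False)  # reorder the columns
by_value = frame.sort_values(by="one")             # rows ordered by column 'one'

print(by_label)
print(by_value)
```

sort_index looks only at the labels, while sort_values looks at the data in the column or columns given with by.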

4. Merging data

  pandas.merge

    DataFrame.merge(right, how='inner', on=None, left_on=None, right_on=None, left_index=False, right_index=False, sort=False, suffixes=('_x', '_y'), copy=True, indicator=False, validate=None)

    Merge DataFrame objects by performing a database-style join operation by columns or indexes.

    This works like a join between database tables: left, right, inner, and outer joins are supported.

    Example:

df1 = pd.DataFrame({'key1':['foo','bar','baz','foo'],'data1':list(np.arange(1,5))})
df2 = pd.DataFrame({'key2':['foo','bar','qux','bar'],'data2':list(np.arange(5,9))})

print(df1)
print(df2)

  key1  data1
0  foo      1
1  bar      2
2  baz      3
3  foo      4
  key2  data2
0  foo      5
1  bar      6
2  qux      7
3  bar      8

# how selects the join type: 'inner', 'left', 'right' or 'outer'
df1.merge(df2, left_on='key1', right_on='key2', how='right')
  key1  data1 key2  data2
0  foo    1.0  foo      5
1  foo    4.0  foo      5
2  bar    2.0  bar      6
3  bar    2.0  bar      8
4  NaN    NaN  qux      7
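
To compare the four join types on the same two tables, the sketch below counts the rows each one produces. It reuses df1 and df2 from the example above, with plain lists instead of np.arange. The names sizes and outer are made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({"key1": ["foo", "bar", "baz", "foo"], "data1": [1, 2, 3, 4]})
df2 = pd.DataFrame({"key2": ["foo", "bar", "qux", "bar"], "data2": [5, 6, 7, 8]})

# Row count produced by each join type.
sizes = {}
for how in ["inner", "left", "right", "outer"]:
    merged = df1.merge(df2, left_on="key1", right_on="key2", how=how)
    sizes[how] = len(merged)
print(sizes)

# indicator=True adds a _merge column that says where each row came from:
# 'both', 'left_only' or 'right_only'.
outer = df1.merge(df2, left_on="key1", right_on="key2", how="outer", indicator=True)
print(outer)
```

The inner join keeps only keys present on both sides ('foo' and 'bar'), the left and right joins each add the one unmatched key from their own side ('baz' or 'qux'), and the outer join adds both.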

  pandas.concat

  DataFrame.combine_first / Series.combine_first
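
Since these two are only named here, a short sketch may help. It assumes two small Series with overlapping labels; the data and the names stacked, side_by_side and patched are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: s1 has a missing value and lacks label 'd'.
s1 = pd.Series([1.0, np.nan, 3.0], index=["a", "b", "c"])
s2 = pd.Series([10.0, 20.0, 30.0, 40.0], index=["a", "b", "c", "d"])

# concat along axis 0 appends end to end, so index labels repeat.
stacked = pd.concat([s1, s2])

# concat along axis 1 aligns on the index and builds one column per input.
side_by_side = pd.concat([s1, s2], axis=1, keys=["s1", "s2"])

# combine_first keeps s1 where it has a value and falls back to s2 elsewhere.
patched = s1.combine_first(s2)

print(stacked)
print(side_by_side)
print(patched)
```

concat simply glues objects together along an axis, while combine_first patches holes in one object with values from another.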

5. Reshaping data

  DataFrame.stack / DataFrame.unstack
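
A minimal sketch of the round trip between the wide and long layouts, using a small hypothetical frame. The names stacked and wide are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data.
data = pd.DataFrame(
    np.arange(6).reshape(2, 3),
    index=["Shenzhen", "Guangzhou"],
    columns=["one", "two", "three"],
)

# stack: the column labels become the inner level of a row MultiIndex.
stacked = data.stack()

# unstack: the inner index level is pivoted back into columns.
wide = stacked.unstack()

print(stacked)
print(wide)
```

stack turns the 2x3 frame into a Series of six values indexed by (city, column) pairs, and unstack reverses that step.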

Posted on 2018-07-05 13:30 by 么么唧唧