General functions

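pandas.melt => unpivots the columns listed in value_vars into rows: the column name goes into the var_name column and the cell value into the value_name column (for example, with value_vars=['B'], every entry in the var_name column is B)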

pd.melt(df, id_vars=['A'], value_vars=['B'], var_name='myVarname', value_name='myValname')
   A myVarname  myValname
0  a         B          1
1  b         B          3
2  c         B          5
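The input frame is not shown above, so the sketch below assumes a small one (columns A, B and a hypothetical extra column C) and melts two value columns at once, which makes it easier to see how the var_name column is filled.

```python
import pandas as pd

# Assumed input: the original post does not show df, so these values are illustrative.
df = pd.DataFrame({"A": ["a", "b", "c"], "B": [1, 3, 5], "C": [2, 4, 6]})

# Melt two value columns: every (row, value column) pair becomes one output row.
long_df = pd.melt(
    df,
    id_vars=["A"],
    value_vars=["B", "C"],
    var_name="myVarname",
    value_name="myValname",
)
print(long_df)
```

With two value columns the result has twice as many rows as the input: first all rows for B, then all rows for C.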

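pandas.pivot => builds a cross-table: one column supplies the row index, one column supplies the new column labels, and one column supplies the cell values (duplicate index/column combinations are not allowed)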

    foo   bar  baz  zoo
0   one   A    1    x
1   one   B    2    y
2   one   C    3    z
3   two   A    4    q
4   two   B    5    w
5   two   C    6    t

df.pivot(index='foo', columns='bar', values='baz')

bar  A  B  C
foo
one  1  2  3
two  4  5  6
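A minimal runnable sketch of the pivot above, followed by what seems to happen when an index/column combination is duplicated. The duplicated row is added only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "foo": ["one", "one", "one", "two", "two", "two"],
    "bar": ["A", "B", "C", "A", "B", "C"],
    "baz": [1, 2, 3, 4, 5, 6],
    "zoo": ["x", "y", "z", "q", "w", "t"],
})

# Each (foo, bar) pair is unique, so pivot can place every baz value in one cell.
wide = df.pivot(index="foo", columns="bar", values="baz")
print(wide)

# Illustration only: repeat the first row so that (one, A) appears twice.
dup = pd.concat([df, df.iloc[[0]]], ignore_index=True)

error_message = None
try:
    dup.pivot(index="foo", columns="bar", values="baz")
except ValueError as exc:
    # pivot refuses to reshape when a cell would receive more than one value.
    error_message = str(exc)
print("pivot failed:", error_message)
```

When duplicates are expected, pivot_table with an aggregation function is the usual alternative.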

pandas.pivot_table => like pivot, but applies an aggregation function (aggfunc) to duplicate index/column combinations

                 
df
     A    B      C  D  E
0  foo  one  small  1  2
1  foo  one  large  2  4
2  foo  one  large  2  5
3  foo  two  small  3  5
4  foo  two  small  3  6
5  bar  one  large  4  6
6  bar  one  small  5  8
7  bar  two  small  6  9
8  bar  two  large  7  9

table = pd.pivot_table(df, values='D', index=['A', 'B'], columns=['C'], aggfunc=np.sum)

C        large  small
A   B
bar one    4.0    5.0
bar two    7.0    6.0
foo one    4.0    1.0
foo two    NaN    6.0
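The example above can be reproduced end to end roughly as follows. This sketch passes aggfunc as the string "sum" (equivalent here to np.sum and avoiding the NumPy import), and also shows fill_value as one possible way to replace the empty cell.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["foo"] * 5 + ["bar"] * 4,
    "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two"],
    "C": ["small", "large", "large", "small", "small",
          "large", "small", "small", "large"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
    "E": [2, 4, 5, 5, 6, 6, 8, 9, 9],
})

# Rows that share the same (A, B, C) combination are summed into one cell.
table = pd.pivot_table(df, values="D", index=["A", "B"], columns=["C"], aggfunc="sum")
print(table)

# Combinations that never occur are NaN by default; fill_value replaces them.
table_filled = pd.pivot_table(
    df, values="D", index=["A", "B"], columns=["C"], aggfunc="sum", fill_value=0
)
print(table_filled)
```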

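pandas.crosstab => computes a simple cross-tabulation of two (or more) factors

One or more fields form the rows and one or more fields form the columns; each cell holds the result for that row/column combination (by default, the count of occurrences)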
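A small sketch of the default counting behaviour, using made-up data with two row factors and one column factor (the frame and its values are assumptions for illustration).

```python
import pandas as pd

# Illustrative data, not taken from the original post.
df = pd.DataFrame({
    "A": ["foo", "foo", "foo", "bar", "bar", "bar"],
    "B": ["one", "one", "two", "one", "two", "two"],
    "C": ["small", "large", "small", "small", "large", "large"],
})

# Two factors (A, B) form the rows, one factor (C) forms the columns.
# With no values/aggfunc given, each cell is the number of matching rows.
counts = pd.crosstab(index=[df["A"], df["B"]], columns=df["C"])
print(counts)
```

Unlike pivot_table, combinations that never occur show up as 0 rather than NaN, since the cells are counts.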

pandas.merge => merges two DataFrames with a database-style join on columns or indexes
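As a rough illustration, the sketch below joins two small made-up frames on a shared column named key (the frames and column names are assumptions), once with the default inner join and once with an outer join.

```python
import pandas as pd

# Illustrative frames; "key", "x" and "y" are hypothetical column names.
left = pd.DataFrame({"key": ["a", "b", "c"], "x": [1, 2, 3]})
right = pd.DataFrame({"key": ["b", "c", "d"], "y": [20, 30, 40]})

# Default how="inner": only keys present in both frames are kept.
inner = pd.merge(left, right, on="key")
print(inner)

# how="outer": keys from either frame are kept, missing cells become NaN.
outer = pd.merge(left, right, on="key", how="outer")
print(outer)
```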

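pd.merge_ordered => merges two DataFrames and keeps the result ordered by key, with optional filling of missing values (intended for ordered data such as time series)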
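To show how this differs from a plain merge, here is a sketch with two made-up frames keyed on a hypothetical ordered column t. It relies on merge_ordered defaulting to an outer join and uses fill_method="ffill" to carry the last known value forward.

```python
import pandas as pd

# Illustrative frames; "t", "lv" and "rv" are hypothetical column names.
left = pd.DataFrame({"t": [1, 3, 5], "lv": [10, 30, 50]})
right = pd.DataFrame({"t": [2, 3, 4], "rv": [200, 300, 400]})

# Outer join ordered by t; gaps are forward-filled from the previous row.
merged = pd.merge_ordered(left, right, on="t", fill_method="ffill")
print(merged)
```

The first rv value stays missing because there is no earlier row to fill from.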

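pandas.get_dummies => converts categorical variables into dummy/indicator (one-hot) columns

One-dimensional input (a Series): each distinct value becomes one indicator column

Two-dimensional input (a DataFrame): each column name is paired with each of its distinct values to form the indicator column labels (for example A_a)

Each row is then scanned, and the indicator for every value that appears in it is set to 1 (all others stay 0)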

   A  B
0  a  c
1  b  a
2  c  b
3  a  a

pandas.get_dummies(pandas.DataFrame({"A":["a","b","c","a"],"B":["c","a","b","a"]}))

   A_a  A_b  A_c  B_a  B_b  B_c
0    1    0    0    0    0    1
1    0    1    0    1    0    0
2    0    0    1    0    1    0
3    1    0    0    1    0    0
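The procedure described above (build the labels, scan the rows, mark what appears) can be sketched by hand and compared with the built-in function. This is only an illustration of the idea, not how pandas implements it. Recent pandas versions return booleans by default, so dtype=int is passed to get 0/1 output.

```python
import pandas as pd

df = pd.DataFrame({"A": ["a", "b", "c", "a"], "B": ["c", "a", "b", "a"]})

# Step 1: pair each column name with each of its distinct values (sorted).
labels = [f"{col}_{val}" for col in df.columns for val in sorted(df[col].unique())]

# Step 2: start from all zeros, then walk every cell and mark its indicator.
manual = pd.DataFrame(0, index=df.index, columns=labels)
for row in df.index:
    for col in df.columns:
        manual.loc[row, f"{col}_{df.loc[row, col]}"] = 1

# Built-in equivalent; dtype=int requests 0/1 instead of True/False.
builtin = pd.get_dummies(df, dtype=int)

print(manual)
print(builtin)
```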

posted on 2022-08-25 18:02 谢Rain
