pandas crosstab

pandas.crosstab(index, columns, values=None, rownames=None, colnames=None, aggfunc=None, margins=False, dropna=True, normalize=False)

index : array-like, Series, or list of arrays/Series

Values to group by in the rows

columns : array-like, Series, or list of arrays/Series

Values to group by in the columns

values : array-like, optional

Array of values to aggregate according to the factors. Requires aggfunc to be specified.

aggfunc : function, optional

If specified, requires values to be specified as well.

rownames : sequence, default None

If passed, must match number of row arrays passed

colnames : sequence, default None

If passed, must match number of column arrays passed

margins : boolean, default False

Add row/column margins (subtotals)

dropna : boolean, default True

Do not include columns whose entries are all NaN

normalize : boolean, {'all', 'index', 'columns'}, or {0,1}, default False

Normalize by dividing all values by the sum of values.

If passed 'all' or True, will normalize over all values.

If passed 'index' will normalize over each row.

If passed 'columns' will normalize over each column.

If margins is True, will also normalize margin values.

New in version 0.18.1.
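
The values/aggfunc pair and the normalize option are not exercised by the examples in this post, so here is a minimal sketch of both. The data and the names region, product, sales, totals, row_shares and overall are made up for illustration and are not from the pandas documentation.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two grouping factors plus a numeric column to aggregate.
region = np.array(["north", "north", "south", "south", "south"], dtype=object)
product = np.array(["x", "y", "x", "x", "y"], dtype=object)
sales = np.array([10, 20, 30, 40, 50])

# values + aggfunc: sum `sales` inside each (region, product) cell
# instead of counting how often the pair occurs.
totals = pd.crosstab(region, product, values=sales, aggfunc="sum",
                     rownames=["region"], colnames=["product"])
print(totals)

# normalize="index": each row of the count table is divided by its row sum,
# so every row adds up to 1.
row_shares = pd.crosstab(region, product, normalize="index")
print(row_shares)

# normalize="all" (same as True): every cell is divided by the grand total.
overall = pd.crosstab(region, product, normalize="all")
print(overall)
```

Passing values without aggfunc (or the reverse) raises an error, which is why the two are always supplied together here.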

In [1]:

import numpy as np
import pandas as pd
a = np.array(["foo", "foo", "foo", "foo", "bar", "bar","bar", "bar", "foo", "foo", "foo"], dtype=object)
a

In [2]:

b = np.array(["one", "one", "one", "two", "one", "one", "one", "two", "two", "two", "one"], dtype=object)
b

In [3]:

pd.crosstab(a, b)

Out[3]:

col_0	one	two
row_0
bar	3	1
foo	4	3
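
The margins parameter can be tried on the same two arrays. This is a minimal sketch rather than part of the original session; the name with_totals is illustrative, and the arrays are repeated so the snippet runs on its own.

```python
import numpy as np
import pandas as pd

a = np.array(["foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar",
              "foo", "foo", "foo"], dtype=object)
b = np.array(["one", "one", "one", "two", "one", "one", "one", "two",
              "two", "two", "one"], dtype=object)

# margins=True appends a row and a column labelled "All" that hold
# the column totals and row totals of the count table.
with_totals = pd.crosstab(a, b, margins=True)
print(with_totals)

# The bottom-right cell is the grand total, i.e. the number of observations.
print(with_totals.loc["All", "All"])
```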

In [4]:

pd.crosstab(a, b, rownames=['a'], colnames=['b'])

Out[4]:

b	one	two
a
bar	3	1
foo	4	3

In [5]:

c = np.array(["dull", "dull", "shiny", "dull", "dull", "shiny","shiny", "dull", "shiny", "shiny", "shiny"],
               dtype=object)
c

In [6]:

import pandas as pd 
pd.crosstab(a, [b, c], rownames=['a'], colnames=['b', 'c'])

Out[6]:

b	one		two
c	dull	shiny	dull	shiny
a
bar	1	2	1	0
foo	2	2	1	2
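
Passing a list for columns produces a result with two-level column labels, and it may help to see how cells are read back out of it. The following is a small sketch under that assumption; the names table and one_block are illustrative, and the arrays are repeated so the snippet is self-contained.

```python
import numpy as np
import pandas as pd

a = np.array(["foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar",
              "foo", "foo", "foo"], dtype=object)
b = np.array(["one", "one", "one", "two", "one", "one", "one", "two",
              "two", "two", "one"], dtype=object)
c = np.array(["dull", "dull", "shiny", "dull", "dull", "shiny", "shiny",
              "dull", "shiny", "shiny", "shiny"], dtype=object)

table = pd.crosstab(a, [b, c], rownames=["a"], colnames=["b", "c"])

# Selecting on the outer column level returns the sub-table for b == "one",
# whose remaining columns are the values of c.
one_block = table["one"]
print(one_block)

# A single cell is addressed with a (b, c) tuple as the column key.
print(table.loc["foo", ("two", "shiny")])
```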

In [7]:

foo1 = pd.Categorical(['a', 'b'], categories=['a', 'b', 'c'])
bar1 = pd.Categorical(['d', 'e'], categories=['d', 'e', 'f'])
# 'c' and 'f' are declared categories but never occur in the data.
# With the default dropna=True they are left out of the result;
# dropna=False keeps categories with no data, as in the output below.
pd.crosstab(foo1, bar1, dropna=False)

Out[7]:

col_0	d	e	f
row_0
a	1	0	0
b	0	1	0
c	0	0	0
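
To see what dropna does for categorical inputs on a given installation, the two calls can be compared side by side. This is only a sketch: the names full and compact are illustrative, and the default result is printed rather than assumed because the handling of unused categories has differed between pandas versions.

```python
import pandas as pd

foo1 = pd.Categorical(["a", "b"], categories=["a", "b", "c"])
bar1 = pd.Categorical(["d", "e"], categories=["d", "e", "f"])

# dropna=False keeps every declared category, including the unused
# 'c' row and 'f' column, which are filled with zero counts.
full = pd.crosstab(foo1, bar1, dropna=False)
print(full.shape)

# Default call (dropna=True) for comparison; inspect the printed shape
# to see whether the unused categories were dropped.
compact = pd.crosstab(foo1, bar1)
print(compact.shape)
```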

posted @ 2019-07-07 16:15 wqbin
