Pandas: Indexing and Selecting Data
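Pandas currently supports three kinds of multi-axis indexing.
.loc is primarily label-based, but it also accepts a boolean array. When .loc cannot find the requested items, it raises a KeyError.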
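The two behaviours described above can be sketched as follows. This is a minimal illustration, not part of the original post: the seeded generator, the column label 'Z', and the names mask, positive_rows, positive_ac and missing_raises are assumptions made for the example.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('1/1/2000', periods=8)
# Seeded generator so the example is reproducible (an assumption; the post uses unseeded data).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((8, 4)), index=dates, columns=['A', 'B', 'C', 'D'])

# Boolean array with .loc: keep only the rows where column A is positive.
mask = df['A'] > 0
positive_rows = df.loc[mask]

# A boolean array on the rows can be combined with labels on the columns.
positive_ac = df.loc[mask, ['A', 'C']]

# 'Z' is a hypothetical label that is not in the columns, so .loc raises KeyError.
try:
    df.loc[:, 'Z']
    missing_raises = False
except KeyError:
    missing_raises = True

print(positive_rows)
print(positive_ac)
print(missing_raises)
```

Catching the KeyError explicitly is only one way to handle a missing label; checking membership with 'Z' in df.columns beforehand would work as well.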
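Basic indexing:

    import numpy as np
    import pandas as pd

    dates = pd.date_range('1/1/2000', periods=8)
    df = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=['A', 'B', 'C', 'D'])
    print(df)

    # Selecting a single column with [] returns a Series.
    s = df['A']
    # Look up one value in the Series by its index label.
    print(s[dates[5]])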
Indexing with .loc (selection by label)

    import numpy as np
    import pandas as pd

    dates = pd.date_range('1/1/2000', periods=8)
    df = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=['A', 'B', 'C', 'D'])
    print(df)
    print()

    # All rows, columns A and C only.
    print(df.loc[:, ['A', 'C']])
    # Rows whose labels fall between the two dates, all columns.
    print(df.loc['20000101':'20000104'])
Indexing with .iloc (selection by position)

    import numpy as np
    import pandas as pd

    dates = pd.date_range('20130101', periods=6)
    df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))

    # Fourth row (position 3), returned as a Series.
    print(df.iloc[3])
    # Rows at positions 1, 2 and 4; columns at positions 0 and 2.
    print(df.iloc[[1, 2, 4], [0, 2]])
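When slices are used, label-based and position-based selection treat the end of the range differently. The sketch below is an added illustration rather than part of the original post; it uses np.arange data instead of random values, and the names by_label and by_position are assumptions.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20130101', periods=6)
df = pd.DataFrame(np.arange(24).reshape(6, 4), index=dates, columns=list('ABCD'))

# Label slice with .loc: both endpoints are included.
by_label = df.loc['20130102':'20130104']

# Position slice with .iloc: the stop position is excluded, as in ordinary Python slicing.
by_position = df.iloc[1:3]

print(len(by_label))     # 3
print(len(by_position))  # 2
```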
Filtering with the .isin() method

    import numpy as np
    import pandas as pd

    dates = pd.date_range('20130101', periods=6)
    df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))

    # Work on a copy and add a column of string labels.
    df2 = df.copy()
    df2['E'] = ['one', 'one', 'two', 'three', 'four', 'three']
    print(df2)

    # Keep only the rows whose E value is 'two' or 'four'.
    print(df2[df2['E'].isin(['two', 'four'])])