pandas Series
pandas.Series(data=None, index=None, dtype=None, name=None, copy=False, fastpath=False)
First, the basic parameters:
data : array-like, dict, or scalar value; the values stored in the Series
index : array-like or Index (1d)
dtype : numpy.dtype or None
copy : boolean, default False
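When constructing a Series from array-like data together with an index, the two must have the same length; otherwise a ValueError is raised: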
>>> pd.Series(range(4),index=list("list"))
l    0
i    1
s    2
t    3
dtype: int32
>>> pd.Series(range(5),index=list("list"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "E:\Python3\lib\site-packages\pandas\core\series.py", line 245, in __init__
    data = SingleBlockManager(data, index, fastpath=True)
  File "E:\Python3\lib\site-packages\pandas\core\internals.py", line 4070, in __init__
    fastpath=True)
  File "E:\Python3\lib\site-packages\pandas\core\internals.py", line 2685, in make_block
    return klass(values, ndim=ndim, fastpath=fastpath, placement=placement)
  File "E:\Python3\lib\site-packages\pandas\core\internals.py", line 109, in __init__
    len(self.mgr_locs)))
ValueError: Wrong number of items passed 5, placement implies 4
>>> pd.Series(range(4),index=list("lists"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "E:\Python3\lib\site-packages\pandas\core\series.py", line 245, in __init__
    data = SingleBlockManager(data, index, fastpath=True)
  File "E:\Python3\lib\site-packages\pandas\core\internals.py", line 4070, in __init__
    fastpath=True)
  File "E:\Python3\lib\site-packages\pandas\core\internals.py", line 2685, in make_block
    return klass(values, ndim=ndim, fastpath=fastpath, placement=placement)
  File "E:\Python3\lib\site-packages\pandas\core\internals.py", line 109, in __init__
    len(self.mgr_locs)))
ValueError: Wrong number of items passed 4, placement implies 5
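The length rule above can be guarded against in code. The following is only a minimal sketch: build_series is a hypothetical helper name, not part of pandas, and the exact error message differs between pandas versions. It also shows that a scalar value is broadcast across the index, so the length check does not apply in that case.

```python
import pandas as pd


def build_series(values, labels):
    """Hypothetical helper: return a Series, or None if the lengths differ."""
    try:
        return pd.Series(values, index=labels)
    except ValueError as exc:
        # pandas rejects array-like data whose length differs from the index
        print(f"could not build Series: {exc}")
        return None


ok = build_series(range(4), list("list"))    # 4 values, 4 labels
bad = build_series(range(5), list("list"))   # 5 values, 4 labels -> None

# A scalar is repeated for every label, so any index length is accepted.
filled = pd.Series(0, index=list("list"))
```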
Create a Series. The name can be assigned afterwards or passed to the constructor:
>>> se = pd.Series(range(5))
>>> se.name = "values"
>>> se = pd.Series(range(5),name="values")
>>> se
0    0
1    1
2    2
3    3
4    4
Name: values, dtype: int32
# both approaches are equivalent
The index can be changed:
>>> se.index
RangeIndex(start=0, stop=5, step=1)
>>> se.index = list("abcde")
>>> se
a    0
b    1
c    2
d    3
e    4
Name: values, dtype: int32
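As a small sketch of the ways to relabel a Series: assigning to se.index replaces all labels in place, while Series.rename with a mapping relabels only selected entries and returns a new Series. The labels "first" and "last" and the variable mismatch_rejected are made up for illustration.

```python
import pandas as pd

se = pd.Series(range(5), name="values")

# Replace the whole index in place; the new index must have 5 entries.
se.index = list("abcde")

# Relabel selected entries without mutating the original Series.
relabeled = se.rename(index={"a": "first", "e": "last"})

# A replacement index of the wrong length is rejected with a ValueError.
try:
    se.index = list("abc")
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
```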
Give the index a name:
>>> se.index.name = "id"
>>> se
id
a    0
b    1
c    2
d    3
e    4
Name: values, dtype: int32
Convert to a DataFrame:
>>> se.to_frame()
    values
id
a        0
b        1
c        2
d        3
e        4
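The conversion can be sketched a little further. to_frame keeps the named index and uses the Series name as the column name, reset_index turns the index into an ordinary column, and to_frame(name=...) overrides the column name. The column name "count" is chosen only for illustration.

```python
import pandas as pd

se = pd.Series(range(5), index=list("abcde"), name="values")
se.index.name = "id"

df = se.to_frame()                    # index "id", single column "values"
flat = se.reset_index()               # "id" becomes a regular column
renamed = se.to_frame(name="count")   # same data, column called "count"
```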
Select a single element by label:
>>> se["b"]
1
>>> se.loc["b"]
1
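The string label cannot simply be swapped for a number, though. With string labels like these, an integer in the brackets is treated as a position rather than a label, so it only works while it is within bounds (recent pandas releases deprecate this fallback in favor of se.iloc):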
>>> se[1] # position within range
1
>>> se[5] # position out of range: raises an error
Traceback (most recent call last):
  File "E:\Python3\lib\site-packages\pandas\indexes\base.py", line 2169, in get_value
    tz=getattr(series.dtype, 'tz', None))
  File "pandas\index.pyx", line 98, in pandas.index.IndexEngine.get_value (pandas\index.c:3557)
  File "pandas\index.pyx", line 106, in pandas.index.IndexEngine.get_value (pandas\index.c:3240)
  File "pandas\index.pyx", line 154, in pandas.index.IndexEngine.get_loc (pandas\index.c:4279)
  File "pandas\src\hashtable_class_helper.pxi", line 732, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:13742)
  File "pandas\src\hashtable_class_helper.pxi", line 740, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:13696)
KeyError: 5

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "E:\Python3\lib\site-packages\pandas\core\series.py", line 603, in __getitem__
    result = self.index.get_value(self, key)
  File "E:\Python3\lib\site-packages\pandas\indexes\base.py", line 2175, in get_value
    return tslib.get_value_box(s, key)
  File "pandas\tslib.pyx", line 946, in pandas.tslib.get_value_box (pandas\tslib.c:19053)
  File "pandas\tslib.pyx", line 962, in pandas.tslib.get_value_box (pandas\tslib.c:18770)
IndexError: index out of bounds
>>> se[5] = "s" # also an error: out of bounds
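To avoid the ambiguity between labels and positions, the explicit accessors can be used instead of plain brackets. This is a minimal sketch, not the only way to do it: .loc selects by label, .iloc by position, and Series.get returns a default instead of raising when a label is absent. The label "f" and the variable names are made up for illustration.

```python
import pandas as pd

se = pd.Series(range(5), index=list("abcde"), name="values")

by_label = se.loc["b"]        # label-based lookup
by_position = se.iloc[1]      # position-based lookup
missing = se.get("z", None)   # absent label: returns the default, no exception

# An out-of-range position still raises IndexError with .iloc.
out_of_bounds = None
try:
    se.iloc[5]
except IndexError as exc:
    out_of_bounds = str(exc)

# Assigning to a new label through .loc appends an element.
se.loc["f"] = 5
```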