Pandas系列(一)-Series详解
一、初始Series
Series 是一个带有 名称 和索引的一维数组,既然是数组,肯定要说到的就是数组中的元素类型,在 Series 中包含的数据类型可以是整数、浮点、字符串、Python对象等。
pandas.Series(data=None, index=None, dtype=None, name=None, copy=False, fastpath=False)
-
创建第一个Series
import pandas as pd user_age = pd.Series(data=[18, 30, 25, 40]) user_age.index = ["Tom", "Bob", "Mary", "James"] #加索引 user_age.index.name = "name" #索引加名字 user_age.name="user_age_info" #series加名字 user_age Out[4]: name Tom 18 Bob 30 Mary 25 James 40 Name: user_age_info, dtype: int64
- 创建Series的方式
- 列表方式创建
pd.Series([],index=[])
- 字典方式创建
pd.Series({}
# 方式一 t = pd.Series([1,2,3,4,43],index=list('asdfg')) print(t) a 1 s 2 d 3 f 4 g 43 dtype: int64 #方式二 temp_dict = {'name':'xiaohong','age':30,'tel':10086} t2 = pd.Series(temp_dict) t2 Out[10]: name xiaohong age 30 tel 10086 dtype: object import string #字典推导式 a = {string.ascii_uppercase[i]:i for i in range(10)} print(a) print(pd.Series(a)) print(pd.Series(a,index=list(string.ascii_uppercase[5:15]))) {'A': 0, 'B': 1, 'C': 2, 'D': 3, 'E': 4, 'F': 5, 'G': 6, 'H': 7, 'I': 8, 'J': 9} A 0 B 1 C 2 D 3 E 4 F 5 G 6 H 7 I 8 J 9 dtype: int64 F 5.0 G 6.0 H 7.0 I 8.0 J 9.0 K NaN L NaN M NaN N NaN O NaN dtype: float64
- 手动指定数据类型
#手动指定类型 name = ["Tom", "Bob", "Mary", "James"] user_age = pd.Series(data=[18, 30, 25, 40], index=name, name="user_age_info", dtype=float) user_age Out[7]: Tom 18.0 Bob 30.0 Mary 25.0 James 40.0 Name: user_age_info, dtype: float64
二、Series的索引
- series索引有五种方式:索引、序号、逻辑值查找、切片,数组
name = pd.Index(["Tom", "Bob", "Mary", "James"], name="name") user_age = pd.Series(data=[18, 30, 25, 40], index=name, name="user_age_info") """按索引、序号,逻辑值查找,切片""" user_age['Tom'] # 索引 Out[14]: 18 user_age.get('Tom') # get方式 Out[15]: 18 user_age[0] # 序号 Out[16]: 18 user_age[:3] # 切片 Out[17]: name Tom 18 Bob 30 Mary 25 Name: user_age_info, dtype: int64 user_age[user_age>25] # 逻辑值查找 Out[18]: name Bob 30 James 40 Name: user_age_info, dtype: int64 user_age[[3,1]] # 取多个值 Out[19]: name James 40 Bob 30 Name: user_age_info, dtype: int64 user_age[::2] # 切片 Out[20]: name Tom 18 Mary 25 Name: user_age_info, dtype: int64
三、基本属性
user_age.shape Out[21]: (4,) user_age.index Out[22]: Index(['Tom', 'Bob', 'Mary', 'James'], dtype='object', name='name') user_age.values Out[23]: array([18, 30, 25, 40], dtype=int64) user_age.unique() Out[24]: array([18, 30, 25, 40], dtype=int64) user_age.nunique() Out[25]: 4 user_age.dropna() Out[26]: name Tom 18 Bob 30 Mary 25 James 40 Name: user_age_info, dtype: int64 user_age.isin(list(range(30))) Out[28]: name Tom True Bob False Mary True James False Name: user_age_info, dtype: bool user_age.sort_index() Out[29]: name Bob 30 James 40 Mary 25 Tom 18 Name: user_age_info, dtype: int64 user_age.sort_values() Out[30]: name Tom 18 Mary 25 Bob 30 James 40 Name: user_age_info, dtype: int64 user_age.sort_values(ascending=False) Out[31]: name James 40 Bob 30 Mary 25 Tom 18 Name: user_age_info, dtype: int64
- 数字类型
- 字符串类型
四、画图
Series.plot(kind='line', ax=None, figsize=None, use_index=True, title=None, grid=None, legend=False, style=None, logx=False, logy=False, loglog=False, xticks=None, yticks=None, xlim=None, ylim=None, rot=None, fontsize=None, colormap=None, table=False, yerr=None, xerr=None, label=None, secondary_y=False, **kwds)[source]
官方文档:http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.plot.html
- series转为dataframe
series.to_frame()