Lesson 4: Adding/Removing Columns and Index Operations

Lesson 4

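In this lesson we go back to basics. We will work with a small data set so that you can easily follow what I am trying to explain. We will add columns, delete columns, and slice the data in many different ways. Enjoy!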

In [1]:
# Import libraries
import pandas as pd
import sys

In [2]:

print('Python version ' + sys.version)
print('Pandas version: ' + pd.__version__)

Python version 3.5.1 |Anaconda custom (64-bit)| (default, Feb 16 2016, 09:49:46) [MSC v.1900 64 bit (AMD64)]
Pandas version: 0.20.1

In [3]:

# Our small data set
d = [0,1,2,3,4,5,6,7,8,9]

# Create dataframe
df = pd.DataFrame(d)
df

Out[3]:

	0
0	0
1	1
2	2
3	3
4	4
5	5
6	6
7	7
8	8
9	9

In [4]:

# Rename the column of df

df.columns = ['Rev'] 

df

Out[4]:

	Rev
0	0
1	1
2	2
3	3
4	4
5	5
6	6
7	7
8	8
9	9
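Assigning to df.columns replaces every column label at once. As a small alternative sketch, `DataFrame.rename` relabels a column by its current name and returns a new DataFrame; the variable name `renamed` is only illustrative and is not part of the original lesson.

```python
import pandas as pd

# Same small data set as in the lesson
df = pd.DataFrame([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

# The default column label is the integer 0; rename maps old labels
# to new ones and leaves the original DataFrame untouched
renamed = df.rename(columns={0: 'Rev'})

print(renamed.columns.tolist())
print(df.columns.tolist())
```

This form can be handy when a DataFrame has many columns and only one of them needs a new name.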

In [5]:

# Add a column
df['NewCol'] = 5
df

Out[5]:

	Rev	NewCol
0	0	5
1	1	5
2	2	5
3	3	5
4	4	5
5	5	5
6	6	5
7	7	5
8	8	5
9	9	5

In [6]:

# Modify the column
df['NewCol'] = df['NewCol'] + 1
df

Out[6]:

	Rev	NewCol
0	0	6
1	1	6
2	2	6
3	3	6
4	4	6
5	5	6
6	6	6
7	7	6
8	8	6
9	9	6
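In the two cells above, the scalar 5 is broadcast to every row and the `+ 1` is applied element-wise. The sketch below repeats those steps on a fresh DataFrame and adds one more column computed from `Rev`; the column name `Double` is a hypothetical example, not something from the lesson.

```python
import pandas as pd

df = pd.DataFrame({'Rev': list(range(10))})

# A scalar on the right-hand side is copied into every row
df['NewCol'] = 5

# Arithmetic on a column works element by element
df['NewCol'] = df['NewCol'] + 1

# Hypothetical extra column derived from an existing one
df['Double'] = df['Rev'] * 2

print(df.head(3))
```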

In [7]:

# Delete the column
del df['NewCol']
df

Out[7]:

	Rev
0	0
1	1
2	2
3	3
4	4
5	5
6	6
7	7
8	8
9	9
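`del` removes the column from df in place. If you would rather keep the original DataFrame intact, one possible sketch uses `DataFrame.drop` with `axis=1`, which returns a new DataFrame without the column; the name `dropped` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Rev': list(range(10)), 'NewCol': [6] * 10})

# drop returns a copy without the column; df itself is unchanged
dropped = df.drop('NewCol', axis=1)

print(dropped.columns.tolist())
print('NewCol' in df.columns)
```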

In [8]:

# Add a couple of columns
df['test'] = 3
df['col'] = df['Rev']
df

Out[8]:

	Rev	test	col
0	0	3	0
1	1	3	1
2	2	3	2
3	3	3	3
4	4	3	4
5	5	3	5
6	6	3	6
7	7	3	7
8	8	3	8
9	9	3	9

In [9]:

# If we want, we can even change the index labels

i = ['a','b','c','d','e','f','g','h','i','j'] 
df.index = i
df

Out[9]:

	Rev	test	col
a	0	3	0
b	1	3	1
c	2	3	2
d	3	3	3
e	4	3	4
f	5	3	5
g	6	3	6
h	7	3	7
i	8	3	8
j	9	3	9
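The list assigned to df.index has to contain exactly one label per row. The following sketch illustrates this: the ten-letter list is accepted, while a three-item list is rejected with a ValueError. The flag name `mismatch_rejected` is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Rev': list(range(10))})

# Ten rows, ten labels: accepted
df.index = list('abcdefghij')

# Ten rows, three labels: pandas raises ValueError (length mismatch)
try:
    df.index = ['x', 'y', 'z']
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True

print(mismatch_rejected)
```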

Now we can start selecting pieces of the DataFrame using loc.

In [10]:

df.loc['a']

Out[10]:

Rev     0
test    3
col     0
Name: a, dtype: int64

In [11]:

# df.loc[inclusive:inclusive]
df.loc['a':'d']

Out[11]:

	Rev	test	col
a	0	3	0
b	1	3	1
c	2	3	2
d	3	3	3

In [12]:

# df.iloc[inclusive:exclusive]
# Note: .iloc is strictly integer-position based [version 0.11.0 and later]

df.iloc[0:3]

Out[12]:

	Rev	test	col
a	0	3	0
b	1	3	1
c	2	3	2
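The difference between the two slices above is easy to miss: a label slice with loc includes both endpoints, while a position slice with iloc excludes the end position. A minimal side-by-side sketch, with the variable names `by_label` and `by_position` chosen only for illustration:

```python
import pandas as pd

df = pd.DataFrame({'Rev': list(range(10))}, index=list('abcdefghij'))

# Label-based slice: 'a' and 'd' are both included (4 rows)
by_label = df.loc['a':'d']

# Position-based slice: position 3 is excluded (3 rows)
by_position = df.iloc[0:3]

print(by_label.index.tolist())
print(by_position.index.tolist())
```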

We can also select by column name.

In [13]:

df['Rev']

Out[13]:

a    0
b    1
c    2
d    3
e    4
f    5
g    6
h    7
i    8
j    9
Name: Rev, dtype: int64

In [14]:

df[['Rev', 'test']]

Out[14]:

	Rev	test
a	0	3
b	1	3
c	2	3
d	3	3
e	4	3
f	5	3
g	6	3
h	7	3
i	8	3
j	9	3
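Note that the two selections above return different types: a single column name gives a Series, while a list of names gives a DataFrame, even if the list holds only one name. A short sketch to check this, with illustrative variable names `single` and `frame`:

```python
import pandas as pd

df = pd.DataFrame(
    {'Rev': list(range(10)), 'test': [3] * 10},
    index=list('abcdefghij'),
)

single = df['Rev']    # one name -> Series
frame = df[['Rev']]   # list of names -> DataFrame

print(type(single).__name__)
print(type(frame).__name__)
```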

In [15]:

# df.ix[rows,columns]
# Replacement for the deprecated ix indexer

#df.ix[0:3,'Rev'] 

df.loc[df.index[0:3],'Rev']

Out[15]:

a    0
b    1
c    2
Name: Rev, dtype: int64

In [16]:

# Replacement for the deprecated ix indexer

#df.ix[5:,'col'] 

df.loc[df.index[5:],'col']

Out[16]:

f    5
g    6
h    7
i    8
j    9
Name: col, dtype: int64

In [17]:

# Replacement for the deprecated ix indexer

#df.ix[:3,['col', 'test']] 

df.loc[df.index[:3],['col', 'test']]

Out[17]:

	col	test
a	0	3
b	1	3
c	2	3
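The cells above replace ix by converting row positions into labels (`df.index[:3]`) and then using loc. The opposite direction also works: convert the column names into positions and use iloc. The sketch below is one possible way to do that with `Index.get_indexer`; the names `via_loc`, `via_iloc`, and `col_positions` are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {'Rev': list(range(10)), 'test': [3] * 10, 'col': list(range(10))},
    index=list('abcdefghij'),
)

# Approach used in the lesson: positions -> labels, then loc
via_loc = df.loc[df.index[:3], ['col', 'test']]

# Alternative: column names -> integer positions, then iloc
col_positions = df.columns.get_indexer(['col', 'test'])
via_iloc = df.iloc[:3, col_positions]

print(via_loc.equals(via_iloc))
```

Either way, the point is to be explicit about whether each axis is addressed by label or by position, which is the ambiguity that ix left open.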

There are also convenient functions for selecting the top and bottom records of a DataFrame.

In [18]:

# Select top N number of records  
df.head(5)

Out[18]:

	Rev	test	col
a	0	3	0
b	1	3	1
c	2	3	2
d	3	3	3
e	4	3	4

In [19]:

# Select bottom N number of records  
df.tail(5)

Out[19]:

	Rev	test	col
f	5	3	5
g	6	3	6
h	7	3	7
i	8	3	8
j	9	3	9
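Both functions take the number of rows as an optional argument. As a brief sketch: calling them without an argument returns five rows, and a negative argument to head returns everything except the last n rows.

```python
import pandas as pd

df = pd.DataFrame({'Rev': list(range(10))}, index=list('abcdefghij'))

first_default = df.head()    # no argument: first 5 rows
last_three = df.tail(3)      # last 3 rows
all_but_two = df.head(-2)    # everything except the last 2 rows

print(first_default.index.tolist())
print(last_three.index.tolist())
print(all_but_two.index.tolist())
```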

This tutorial was recreated by 六尺巷人_cds

posted on 2018-05-22 10:30 by 六尺巷人
