Column processing: finding and handling invalid values

# The input is a list named legislators; its first two elements look like this:
# [['Bassett', 'Richard', '1745-04-02', 'M', 'sen', 'DE', 'Anti-Administration'], ['Bland', 'Theodorick', '1742-03-21', '', 'rep', 'VA', '']]
# 'Bassett', 'Richard' is the name
# '1745-04-02' is the birthdate
# 'M' is the gender

Here we process the gender column.

Extract the field on its own

# The gender is the fourth field (index 3) of every row.
genders_list = []
for row in legislators:
    genders_list.append(row[3])

Get all the distinct values of the field

# Converting to a set, so we get the unique values
unique_genders = set(genders_list)
# We can't index sets, so we need to convert back into a list first.
unique_genders_list = list(unique_genders)
print(unique_genders_list)
# Output: ['', 'M', 'F'] (the order may differ, since sets are unordered)

This shows that, besides the expected values M and F, the column also contains empty values.
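
Before deciding on a fix, it can help to know how many rows are affected. A minimal sketch using `collections.Counter`, assuming only the two sample rows shown at the top of this post stand in for the full `legislators` list (the names `valid_genders` and `invalid_counts` are introduced here for illustration):

```python
from collections import Counter

# Sample data: the first two rows shown at the top of this post.
legislators = [
    ['Bassett', 'Richard', '1745-04-02', 'M', 'sen', 'DE', 'Anti-Administration'],
    ['Bland', 'Theodorick', '1742-03-21', '', 'rep', 'VA', ''],
]

# Tally every value that appears in the gender column (index 3).
gender_counts = Counter(row[3] for row in legislators)

# Assumption: only 'M' and 'F' are legal; anything else counts as invalid.
valid_genders = {'M', 'F'}
invalid_counts = {
    value: count
    for value, count in gender_counts.items()
    if value not in valid_genders
}
print(invalid_counts)
```

On the full dataset, the same tally would show how much of the column a later fill step is going to overwrite.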

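Getting the distinct values with pandas (here on a DataFrame named recent_grads):

# recent_grads['Major'].value_counts() returns a Series; its index holds the distinct values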
majors = recent_grads['Major'].value_counts().index
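
For comparison, the same idea can be applied to the legislators data. This is only a sketch: the column names below are hypothetical (the post shows raw lists without headers), and the DataFrame is built from the two sample rows rather than the full file.

```python
import pandas as pd

# Hypothetical column names; the original data is shown without a header row.
columns = ['last_name', 'first_name', 'birthdate', 'gender', 'type', 'state', 'party']
rows = [
    ['Bassett', 'Richard', '1745-04-02', 'M', 'sen', 'DE', 'Anti-Administration'],
    ['Bland', 'Theodorick', '1742-03-21', '', 'rep', 'VA', ''],
]
df = pd.DataFrame(rows, columns=columns)

# value_counts() returns a Series indexed by the distinct values.
gender_values = df['gender'].value_counts().index.tolist()

# unique() returns the distinct values directly, in order of first appearance.
gender_unique = list(df['gender'].unique())
print(gender_values, gender_unique)
```

Note that the empty string is an ordinary value to pandas, not a missing one, so it shows up in both results.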

Handle the invalid values

# Replace every empty gender value with 'M' (modifies legislators in place).
for row in legislators:
    if row[3] == '':
        row[3] = 'M'
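
The loop above only catches the empty string. A slightly more general sketch, assuming a hypothetical helper named `fill_invalid` (not part of the original post), treats anything outside an allowed set as invalid and reports how many cells it changed:

```python
def fill_invalid(rows, col, valid, default):
    """Replace, in place, every rows[i][col] not found in `valid` with `default`.

    Returns the number of cells that were changed.
    """
    changed = 0
    for row in rows:
        if row[col] not in valid:
            row[col] = default
            changed += 1
    return changed


# Sample data: the first two rows shown at the top of this post.
legislators = [
    ['Bassett', 'Richard', '1745-04-02', 'M', 'sen', 'DE', 'Anti-Administration'],
    ['Bland', 'Theodorick', '1742-03-21', '', 'rep', 'VA', ''],
]

n_fixed = fill_invalid(legislators, col=3, valid={'M', 'F'}, default='M')
print(n_fixed, [row[3] for row in legislators])
```

Whether 'M' is a sensible default depends on the dataset; the helper simply makes that choice an explicit argument.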

posted on 2016-01-05 20:17 arsh
