Column processing: finding and handling invalid values

# Input is a list named legislators; its first two elements look like this:
# [['Bassett', 'Richard', '1745-04-02', 'M', 'sen', 'DE', 'Anti-Administration'], ['Bland', 'Theodorick', '1742-03-21', '', 'rep', 'VA', '']]
# 'Bassett', 'Richard' is the name (last name, then first name)
# '1745-04-02' is the birthdate
# 'M' is the gender

Here we process the gender column.

  • Extract the field on its own
genders_list = []
for row in legislators:
    genders_list.append(row[3])
  • Get all the distinct values of the field
# Converting to a set, so we get the unique values
unique_genders = set(genders_list)
# We can't index sets, so we need to convert back into a list first.
unique_genders_list = list(unique_genders)
print(unique_genders_list)
# Output: ['', 'M', 'F'] (sets are unordered, so the order may differ)
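
Knowing how often each value occurs, and not only which values occur, can help when deciding how to treat the invalid ones. A minimal sketch using `collections.Counter` from the standard library; `sample_rows` and `count_column_values` are hypothetical names introduced for illustration, and the two rows are the ones shown at the top of this post:

```python
from collections import Counter

# Hypothetical sample in the same layout as `legislators`.
sample_rows = [
    ['Bassett', 'Richard', '1745-04-02', 'M', 'sen', 'DE', 'Anti-Administration'],
    ['Bland', 'Theodorick', '1742-03-21', '', 'rep', 'VA', ''],
]

def count_column_values(rows, column_index):
    """Count how many times each value appears in one column."""
    return Counter(row[column_index] for row in rows)

gender_counts = count_column_values(sample_rows, 3)
print(gender_counts)
```

The keys of the resulting `Counter` are the same distinct values that `set(genders_list)` yields, with a count attached to each.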

 As the output shows, besides the normal values M and F, the column also contains empty strings.

The same lookup with pandas:

# recent_grads['Major'].value_counts() returns a Series; its index holds the distinct values
majors = recent_grads['Major'].value_counts().index
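
For comparison, here is roughly how the same idea could look on a gender column. This is a sketch under assumptions: the DataFrame `df` and its contents are made up for illustration and are not part of the original data.

```python
import pandas as pd

# Hypothetical sample data; only a gender column is modeled here.
df = pd.DataFrame({'gender': ['M', '', 'M', 'F', 'M', '']})

# value_counts() returns a Series: the index holds the distinct values,
# the values hold their counts, sorted by count in descending order.
gender_counts = df['gender'].value_counts()
distinct_genders = list(gender_counts.index)
print(distinct_genders)

# unique() returns the distinct values in order of first appearance.
print(df['gender'].unique())
```

Note that an empty string is counted like any other value, so it shows up in the index alongside M and F.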

 

  • Handle the invalid values
# Fill every empty gender value with 'M'
for row in legislators:
    if row[3] == '':
        row[3] = 'M'
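
The loop above could be wrapped in a small helper so that the column index and the fill value are not hard-coded. A minimal sketch; `fill_empty` and `sample_rows` are hypothetical names introduced here, and the choice of 'M' as the default simply mirrors the loop above:

```python
def fill_empty(rows, column_index, default):
    """Replace empty strings in one column in place; return how many were replaced."""
    replaced = 0
    for row in rows:
        if row[column_index] == '':
            row[column_index] = default
            replaced += 1
    return replaced

# Hypothetical sample in the same layout as `legislators`.
sample_rows = [
    ['Bassett', 'Richard', '1745-04-02', 'M', 'sen', 'DE', 'Anti-Administration'],
    ['Bland', 'Theodorick', '1742-03-21', '', 'rep', 'VA', ''],
]

replaced = fill_empty(sample_rows, 3, 'M')
print(replaced, [row[3] for row in sample_rows])
```

Returning the number of replacements makes it easy to check how much of the column was filled in.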

 

posted on 2016-01-05 20:17 by arsh
