Composite Data Type Exercises and an English Word-Frequency Counting Example (Part 1)

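1. Dictionary example: build a dictionary that maps student IDs to grades, then add, delete, modify, look up, and traverse its entries.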

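# Build a dictionary mapping student IDs to grades
d={'01':99,'02':100,'03':97,'04':80,'05':77,'06':100}
print(d,'\n')
# Add an entry
d['07']=66
print('Grades after adding:',d,'\n')
# Delete an entry
d.pop('04')
print('Grades after deleting:',d,'\n')
# Modify an entry
d['05']=80
print('Grades after modifying:',d,'\n')
# Look up an entry
print('Grade of student 02:',d['02'],'\n')
# Traverse
for i in d:
    print('{}\t{}'.format(i,d[i]))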
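The operations above raise KeyError when the key is absent (for example d.pop('04') run a second time). The following is a minimal sketch of more defensive variants, using only built-in dict methods; the name scores and its sample values are made up for illustration.

```python
# Hypothetical sample data, shaped like the grade dictionary above
scores = {'01': 99, '02': 100, '03': 97}

# Look up: get() returns the given default instead of raising KeyError
missing = scores.get('09', None)

# Delete: pop() with a default does not fail when the key is absent
removed = scores.pop('09', None)

# Modify only if the key is already present
if '02' in scores:
    scores['02'] = 95

# Traverse keys and values together with items()
for student_id, score in scores.items():
    print('{}\t{}'.format(student_id, score))
```

Whether a missing key should be an error or a silent default depends on the exercise; the direct d[key] form is fine when the key is known to exist.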

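2. Traversing lists, tuples, dictionaries, and sets.
Summarize the similarities and differences among lists, tuples, dictionaries, and sets.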

# List
li=list('123123123113')
print('Traversing the list:')
print(li)
for i in li:
    print(i)
# Tuple
tu=tuple('123123123113')
print('Traversing the tuple:')
print(tu)
for i in tu:
    print(i)
# Dictionary
d={'01':99,'02':100,'03':97,'04':80,'05':77,'06':100}
print('Traversing the dictionary:')
print(d)
for i in d:
    print(i,d[i])
# Set
s=set([1,2,3,1,2,3,1,2,3,1,1,3])
print('Traversing the set:')
print(s)
for i in s:
    print(i)

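List: an ordered sequence. Elements can be added and removed at any time, there is no fixed length limit, and the elements may be of different types.

Tuple: very similar to a list, but it cannot be modified once it has been initialized.

Dictionary: stores data as key-value pairs, where each key must be an immutable object.

Set: values cannot repeat, so traversal never yields duplicates, and the elements are unordered.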
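The differences summarized above can be checked directly. This is a small sketch with made-up values; it relies only on built-in behavior. Note that the dictionary rule is, more precisely, that keys must be hashable, which is why a tuple works as a key while a list does not.

```python
# List: mutable, so it can grow in place
li = list('1231')
li.append('9')

# Tuple: item assignment raises TypeError
tu = tuple('1231')
try:
    tu[0] = '9'
    tuple_is_mutable = True
except TypeError:
    tuple_is_mutable = False

# Dictionary: a tuple is accepted as a key, a list is rejected
d = {}
d[('a', 1)] = 'ok'
try:
    d[['a', 1]] = 'bad'
    list_key_allowed = True
except TypeError:
    list_key_allowed = False

# Set: duplicates collapse to a single element
s = set([1, 2, 3, 1, 2, 3])

print(li, tuple_is_mutable, list_key_allowed, s)
```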

 

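3. English word-frequency counting example

  1. The string to analyze

  2. Split it and extract the words

    1. Case: txt.lower()

    2. Separators: '.,:;?!-_'

    3. Word list

  3. Word-count dictionary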
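The outline above could be sketched as two small functions. The names split_words and count_words, and the sample sentence, are hypothetical and chosen only for illustration; the separator set is the one listed in step 2. Treating '-' as a separator also splits hyphenated words, which may or may not be what you want.

```python
def split_words(txt, separators='.,:;?!-_'):
    """Lowercase txt, turn every separator into a space, and split into words."""
    # Map each separator character to a space in a single pass
    table = str.maketrans(separators, ' ' * len(separators))
    # split() with no argument also discards empty strings
    return txt.lower().translate(table).split()


def count_words(words):
    """Build a dictionary mapping each word to its number of occurrences."""
    counts = {}
    for w in words:
        counts[w] = counts.get(w, 0) + 1
    return counts


sample = 'Hello,world!Hello again.'
counts = count_words(split_words(sample))
print(counts)
```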

 

text='''Tyler was born infected with HIV:his mother was also infected.From the very beginning of his life,he was dependent on medications to enable him to survive.When he was five,he had a tube surgically inserted in a vein in his chest.This tube was connected to a pump,which he carried in a small backpack on his back.Medications were hooked up to this pump and were continuously supplied through this tube to his bloodstream.At times,he also needed supplemented oxygen to support his breathing.'''

# Convert everything to lowercase
text=text.lower()
print('After converting to lowercase: '+text+'\n')

# Replace the other separator characters (,.?!:) with spaces
for ch in ',.?!:':
    text=text.replace(ch,' ')
print('After replacing separators with spaces: '+text+'\n')

# Split into individual words; split() with no argument drops empty strings
words=text.split()
print('Split result:',words,'\n')

# Count how many times each distinct word occurs
dic={}
for w in set(words):
    dic[w]=words.count(w)

# Sort the (word, count) pairs by count, highest first
items=list(dic.items())
items.sort(key=lambda x:x[1],reverse=True)
print(items,'\n')
print('Top 10 words by frequency:')
for i in range(10):
    word,count=items[i]
    print('{}\t{}'.format(word,count))
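The counting loop above calls count() once per distinct word, and the final loop assumes there are at least 10 distinct words. As an alternative sketch (not the approach used in this exercise), the standard library's collections.Counter does the counting in one pass, and most_common(n) returns at most n pairs. The helper name top_words and the sample sentence are made up for illustration.

```python
from collections import Counter


def top_words(words, n=10):
    """Return up to n (word, count) pairs, most frequent first."""
    return Counter(words).most_common(n)


# Hypothetical sample input with only five distinct words
sample_words = 'the cat and the dog and the bird'.split()
for word, count in top_words(sample_words, 3):
    print('{}\t{}'.format(word, count))
```

Because most_common(n) never returns more pairs than exist, asking for 10 on a short text does not raise IndexError.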

 

posted @ 2017-09-26 12:50  078刘凯敏