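Composite data type exercises and an English word-frequency counting example (part 1)
1. Dictionary example: build a dictionary that maps student IDs to scores, then add, delete, update, look up, and iterate over its entries.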
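# Build a dictionary mapping student IDs to scores
d = {'01': 99, '02': 100, '03': 97, '04': 80, '05': 77, '06': 100}
print(d, '\n')

# Add an entry
d['07'] = 66
print('Score dictionary after adding:', d, '\n')

# Delete an entry
d.pop('04')
print('Score dictionary after deleting:', d, '\n')

# Update an entry
d['05'] = 80
print('Score dictionary after updating:', d, '\n')

# Look up an entry
print('Score of student 02:', d['02'], '\n')

# Iterate over all entries
for i in d:
    print('{}\t{}'.format(i, d[i]))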
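Deleting with `d.pop(key)` and looking up with `d[key]` both raise `KeyError` when the student ID is not in the dictionary. The following is a minimal sketch of the same operations with fallbacks, using the built-in `dict.get` and the two-argument form of `dict.pop`. The IDs `'09'` and `'10'` and the shortened sample dictionary are made up for illustration.

```python
# Same kind of ID-to-score dictionary as in the exercise (shortened sample data)
d = {'01': 99, '02': 100, '03': 97}

# get() returns the given default instead of raising KeyError for a missing ID
missing_score = d.get('09', 0)

# pop() with a second argument returns that argument when the key is absent
removed = d.pop('10', None)

# items() yields (key, value) pairs, so no second lookup is needed while iterating
rows = ['{}\t{}'.format(sid, score) for sid, score in d.items()]
for row in rows:
    print(row)
```

Whether a silent default is preferable to an exception depends on the task; for a grade book, an explicit error on an unknown ID may well be the safer choice.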
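2. Iterating over lists, tuples, dictionaries, and sets.
Summarize the similarities and differences between lists, tuples, dictionaries, and sets.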
# List
li = list('123123123113')
print('Iterating over the list:')
print(li)
for i in li:
    print(i)

# Tuple
tu = tuple('123123123113')
print('Iterating over the tuple:')
print(tu)
for i in tu:
    print(i)

# Dictionary
d = {'01': 99, '02': 100, '03': 97, '04': 80, '05': 77, '06': 100}
print('Iterating over the dictionary:')
print(d)
for i in d:
    print(i, d[i])

# Set
s = set([1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 1, 3])
print('Iterating over the set:')
print(s)
for i in s:
    print(i)
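List: an ordered, mutable sequence. Elements can be added and removed at any time, there is no fixed length, and the elements may have different types.
Tuple: very similar to a list, but it cannot be modified once it has been created.
Dictionary: stores key-value pairs. Keys must be immutable (hashable) objects.
Set: holds no duplicate values, so iterating over it yields each value only once. It is unordered.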
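The differences summarized above can be checked directly. This is a small sketch with made-up sample values; the flag names `tuple_changed` and `list_as_key` are only there to record what happened.

```python
li = [1, 2, 2]
li.append(3)              # lists are mutable, so this succeeds

tu = (1, 2, 2)
try:
    tu[0] = 9             # tuples do not support item assignment
    tuple_changed = True
except TypeError:
    tuple_changed = False

d = {}
d[(1, 2)] = 'ok'          # a tuple of numbers is hashable, so it can be a key
try:
    d[[1, 2]] = 'bad'     # a list is mutable and therefore unhashable
    list_as_key = True
except TypeError:
    list_as_key = False

s = set(li)               # duplicates collapse into a single element
print(li, tuple_changed, list_as_key, sorted(s))
```

`sorted(s)` is used for printing because a set makes no promise about iteration order.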
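3. English word-frequency counting example
1. The string to analyze
2. Split the text and extract the words
    1. Case: txt.lower()
    2. Separators: '.,:;?!-_'
    3. Word list
3. Word-count dictionary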
text = '''Tyler was born infected with HIV:his mother was also infected.From the very beginning of his life,he was dependent on medications to enable him to survive.When he was five,he had a tube surgically inserted in a vein in his chest.This tube was connected to a pump,which he carried in a small backpack on his back.Medications were hooked up to this pump and were continuously supplied through this tube to his bloodstream.At times,he also needed supplemented oxygen to support his breathing.'''

# Convert everything to lowercase
text = text.lower()
print('After converting to lowercase: ' + text + '\n')

# Replace the separator characters with spaces
for sep in '.,:;?!-_':
    text = text.replace(sep, ' ')
print('After replacing separators with spaces: ' + text + '\n')

# Split into individual words; split() with no argument drops empty strings
words = text.split()
print('Split result:', words, '\n')

# Count how many times each distinct word occurs
dic = {}
for w in set(words):
    dic[w] = words.count(w)

# Sort the (word, count) pairs by count, highest first
items = list(dic.items())
items.sort(key=lambda x: x[1], reverse=True)
print(items, '\n')

print('Top 10 words by frequency:')
for i in range(min(10, len(items))):
    word, count = items[i]
    print('{}\t{}'.format(word, count))
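Calling `count()` once per distinct word rescans the whole word list each time. As an alternative sketch, not the approach used in this exercise, the standard library's `collections.Counter` tallies the words in a single pass, and `str.translate` replaces all separators in one call. The function name `top_words` is a hypothetical helper, and the sample sentence is a shortened stand-in rather than the full passage.

```python
from collections import Counter

def top_words(text, n=10, separators='.,:;?!-_'):
    """Return the n most common lowercase words in text as (word, count) pairs."""
    # Map every separator character to a space
    table = str.maketrans(separators, ' ' * len(separators))
    # split() with no argument also discards the empty strings left by repeated spaces
    words = text.lower().translate(table).split()
    return Counter(words).most_common(n)

sample = 'This tube was connected to a pump. This pump fed this tube.'
for word, count in top_words(sample, 3):
    print('{}\t{}'.format(word, count))
```

`most_common(n)` returns the pairs already sorted by count, so the separate sort step is not needed in this variant.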