Python: A Dictionary Exercise
Write a program to read through the mbox-short.txt and figure out who has sent the greatest number of mail messages. The program looks for 'From ' lines and takes the second word of those lines as the person who sent the mail. The program creates a Python dictionary that maps the sender's mail address to a count of the number of times they appear in the file. After the dictionary is produced, the program reads through the dictionary using a maximum loop to find the most prolific committer.

    name = input("Enter file:")
    if len(name) < 1:
        name = "mbox-short.txt"
    handle = open(name)

    counts = dict()
    persons = list()

    # Collect the sender (second word) of every 'From ' line
    for line in handle:
        line = line.rstrip()
        clist = line.split()
        # Guardian pattern: skip blank lines before touching clist[0]
        if line == '':
            continue
        # Compare strings with ==, not is
        if clist[0] == 'From':
            persons.append(clist[1])

    # Build the histogram: sender -> number of messages
    for person in persons:
        counts[person] = counts.get(person, 0) + 1

    # Maximum loop: find the sender with the highest count
    bigcount = None
    bigperson = None
    for p, c in counts.items():
        if bigcount is None or c > bigcount:
            bigcount = c
            bigperson = p

    print(bigperson, bigcount)

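A combined exercise on Python data structures:
- Each block of code does exactly one job, and the jobs are not tangled together:
- Read the file line by line
- Use .split() to break each string into a list; if the first element of the list is From, take list[1] (the second word) and append it to a new list, persons
- Use a dictionary as a histogram, where each (key, value) pair is (sender, count)
- Loop over the dictionary to find the maximum
With the work divided this clearly, the code is easier to read and easier to debug.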
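The four steps above can also be sketched as separate functions, one per job. This is only an illustration: the function names (`extract_senders`, `build_histogram`, `find_max`) and the sample lines are made up for this note, and the sketch works on a list of strings instead of a real file so it can be run without mbox-short.txt.

```python
def extract_senders(lines):
    """Return the second word of every line whose first word is 'From'."""
    senders = []
    for line in lines:
        words = line.split()
        # Guardian: skip blank lines and lines too short to hold an address
        if len(words) < 2:
            continue
        if words[0] == 'From':
            senders.append(words[1])
    return senders


def build_histogram(items):
    """Map each item to the number of times it appears."""
    counts = {}
    for item in items:
        counts[item] = counts.get(item, 0) + 1
    return counts


def find_max(counts):
    """Return the (key, count) pair with the largest count."""
    big_key, big_count = None, None
    for key, count in counts.items():
        if big_count is None or count > big_count:
            big_key, big_count = key, count
    return big_key, big_count


# Hypothetical sample data standing in for the lines of the file
sample = [
    "From alice@example.com Sat Jan  5 09:14:16 2008",
    "From: alice@example.com",
    "",
    "From bob@example.com Sat Jan  5 09:20:00 2008",
    "From alice@example.com Sat Jan  5 10:00:00 2008",
]

senders = extract_senders(sample)
counts = build_histogram(senders)
print(find_max(counts))  # ('alice@example.com', 2)
```

Because each function can be called on its own, a wrong result can be traced to one step by printing `senders` or `counts` in between.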
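Guardian pattern:
This guards against blank lines. Without it, clist[0] raises IndexError: list index out of range, because splitting a blank line gives an empty list, and an empty list has no element at index 0.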
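A minimal sketch of the failure and of the guard, assuming a line that contains only a newline. The helper name `first_word` is hypothetical and exists only for this demonstration.

```python
def first_word(line):
    """Return the first word of a line, or None if the line is blank."""
    words = line.rstrip().split()
    # Guardian: check that the list has an element before indexing it
    if len(words) < 1:
        return None
    return words[0]


blank = "\n"
print(blank.rstrip().split())  # []

# Without the guard, indexing the empty list fails
try:
    blank.rstrip().split()[0]
except IndexError as err:
    print("IndexError:", err)  # IndexError: list index out of range

# With the guard, the blank line is handled safely
print(first_word(blank))                     # None
print(first_word("From alice@example.com"))  # From
```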
There is a very good debugging walkthrough:
Python data structures - week 4 - Assignment Chapter 8 - Worked exercise (watch how the teaching assistant tracks down the bugs).