Character appearance counts in 《三国演义》 (Romance of the Three Kingdoms)

Library used: jieba

Character appearance counts in 《三国演义》, Part 1

# Character appearance counts in 《三国演义》, Part 1
import jieba

# Raw string so the backslash in the Windows path is taken literally.
with open(r'I:\三国演义.txt', 'r', encoding='utf-8') as f:
    txt = f.read()
words = jieba.lcut(txt)  # segment the text into a list of words
counts = {}
for word in words:
    if len(word) == 1:  # skip single-character tokens
        continue
    counts[word] = counts.get(word, 0) + 1
items = list(counts.items())
items.sort(key=lambda x: x[1], reverse=True)  # most frequent first
for word, count in items[:15]:  # top 15 entries
    print('{0:<10}{1:>5}'.format(word, count))
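The counting loop and sort above can also be written with collections.Counter from the standard library. The following is a minimal sketch, assuming the text has already been segmented into a list of words (for example by jieba.lcut); the function name top_words and the sample list are hypothetical and only for illustration.

```python
from collections import Counter


def top_words(words, n=15):
    """Return the n most frequent words that are longer than one character."""
    # Skip single-character tokens, as the loop above does.
    counts = Counter(w for w in words if len(w) > 1)
    # most_common returns (word, count) pairs sorted by count, descending.
    return counts.most_common(n)


# Hypothetical pre-segmented sample; in the script this would be jieba.lcut(txt).
sample = ['曹操', '曰', '孔明', '曹操', '玄德', '孔明', '曹操']
for word, count in top_words(sample, 3):
    print('{0:<10}{1:>5}'.format(word, count))
```

Counter.most_common also handles the case where fewer than n distinct words exist, so no index check is needed.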

Character appearance counts in 《三国演义》, Part 2

import jieba

# Frequent words that are not character names and should be dropped.
excludes = {'将军', '却说', '二人', '不可', '荆州', '不能', '如此', '商议',
            '如何', '主公', '军士', '左右', '军马', '次日', '引兵', '大喜'}
# Raw string so the backslash in the Windows path is taken literally.
with open(r'I:\三国演义.txt', 'r', encoding='utf-8') as f:
    txt = f.read()
words = jieba.lcut(txt)
counts = {}
for word in words:
    # Skip single-character tokens, then merge aliases into one name per character.
    if len(word) == 1:
        continue
    elif word == '诸葛亮' or word == '孔明曰':
        rword = '孔明'
    elif word == '关公' or word == '云长':
        rword = '关羽'
    elif word == '玄德' or word == '玄德曰':
        rword = '刘备'
    elif word == '孟德' or word == '丞相':
        rword = '曹操'
    else:
        rword = word
    # Count under the merged name; the original only counted unmatched words.
    counts[rword] = counts.get(rword, 0) + 1
for word in excludes:
    counts.pop(word, None)  # no KeyError if the word never occurred
items = list(counts.items())
items.sort(key=lambda x: x[1], reverse=True)  # most frequent first
for word, count in items[:15]:  # top 15 entries
    print('{0:<10}{1:>5}'.format(word, count))
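The alias handling above grows by one elif branch per character. One alternative is a lookup table that maps each variant to a single name. The following is a minimal sketch of that idea, not the author's method: ALIASES and count_characters are hypothetical names, the mappings are copied from the branches above, and the input is assumed to be an already segmented word list (for example from jieba.lcut).

```python
from collections import Counter

# Hypothetical alias table: each variant maps to one canonical name.
ALIASES = {
    '诸葛亮': '孔明', '孔明曰': '孔明',
    '关公': '关羽', '云长': '关羽',
    '玄德': '刘备', '玄德曰': '刘备',
    '孟德': '曹操', '丞相': '曹操',
}


def count_characters(words, aliases=ALIASES, excludes=frozenset()):
    """Count words of length > 1 after merging aliases and dropping excludes."""
    counts = Counter()
    for word in words:
        if len(word) == 1:
            continue  # skip single-character tokens
        name = aliases.get(word, word)  # fall back to the word itself
        if name not in excludes:
            counts[name] += 1
    return counts


# Hypothetical pre-segmented sample; in the script this would be jieba.lcut(txt).
sample = ['玄德', '曰', '玄德曰', '刘备', '丞相', '曹操', '将军']
result = count_characters(sample, excludes={'将军'})
for word, count in result.most_common(15):
    print('{0:<10}{1:>5}'.format(word, count))
```

Adding another alias then means adding one dictionary entry, and excluded words are filtered while counting instead of being deleted afterwards.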

posted @ 2023-02-20 10:51 摆烂小T
