[Bash] LeetCode 192. Word Frequency

★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★
➤WeChat official account: 山青咏芝 (shanqingyongzhi)
➤cnblogs address: 山青咏芝 (https://www.cnblogs.com/strengthen/)
➤GitHub: https://github.com/strengthen/LeetCode
➤Original post: https://www.cnblogs.com/strengthen/p/10180228.html
★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★

Write a bash script to calculate the frequency of each word in a text file words.txt.

For simplicity's sake, you may assume:

words.txt contains only lowercase characters and space ' ' characters.
Each word must consist of lowercase characters only.
Words are separated by one or more whitespace characters.

Example:

Assume that words.txt has the following content:

the day is sunny the the
the sunny is is

Your script should output the following, sorted by descending frequency:

the 4
is 3
sunny 2
day 1

Note:

Don't worry about handling ties; it is guaranteed that each word's frequency count is unique.
Could you write it in one line using Unix pipes?
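To try a candidate solution locally, the example above can be turned into a small check. This is only a sketch: the helper name `check_output` and the use of a scratch directory are assumptions made for illustration, not part of the problem.

```shell
#!/usr/bin/env bash
# Work in a scratch directory so an existing words.txt is not overwritten.
cd "$(mktemp -d)" || exit 1

# Recreate the sample input from the problem statement.
printf '%s\n' 'the day is sunny the the' 'the sunny is is' > words.txt

# Expected output, copied from the example.
expected='the 4
is 3
sunny 2
day 1'

# check_output (hypothetical helper) reads a solution's stdout on stdin
# and compares it with the expected output.
check_output() {
    local actual
    actual=$(cat)
    if [ "$actual" = "$expected" ]; then
        echo "PASS"
    else
        echo "FAIL"
    fi
}

# Feeding it the expected text directly prints PASS.
printf 'the 4\nis 3\nsunny 2\nday 1\n' | check_output
```

A candidate pipeline that reads words.txt can then be piped into `check_output` in the same way.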

4ms

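# Read from the file words.txt and output the word frequency list to stdout.
# tr puts one word per line; sort + uniq -c counts each distinct word;
# sort -nr orders the counts numerically, highest first; awk prints "word count".
cat words.txt | tr -s ' ' '\n' | sort | uniq -c | sort -nr | awk '{ print $2, $1 }'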
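It may help to see this kind of pipeline built up one stage at a time. The sketch below splits it into three functions; the names `split_words`, `count_words` and `word_freq` are made up for illustration, and the count sort uses `sort -nr` so that counts are compared as numbers rather than as text.

```shell
#!/usr/bin/env bash
# Work in a scratch directory so an existing words.txt is not overwritten.
cd "$(mktemp -d)" || exit 1
printf '%s\n' 'the day is sunny the the' 'the sunny is is' > words.txt

# Stage 1: turn every run of spaces into a single newline (one word per line).
split_words() { tr -s ' ' '\n' < words.txt; }

# Stage 2: sort so equal words are adjacent, then count each run of equal lines.
count_words() { split_words | sort | uniq -c; }

# Stage 3: order by count (numeric, descending) and print "word count".
word_freq() { count_words | sort -nr | awk '{ print $2, $1 }'; }

echo "--- one word per line (first 3 lines)"
split_words | head -n 3
echo "--- counts (column padding depends on the uniq implementation)"
count_words
echo "--- final result"
word_freq
```

Running each stage on its own shows why the first `sort` is needed: `uniq -c` only merges adjacent identical lines.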

8ms

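# Read from the file words.txt and output the word frequency list to stdout.
awk '{
    # Count every whitespace-separated field on the line.
    for (i = 1; i <= NF; ++i) ++s[$i];
} END {
    # Array iteration order is unspecified, so sort by the count column afterwards.
    for (i in s) print i, s[i];
}' words.txt | sort -nr -k 2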

16ms

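# Read from the file words.txt and output the word frequency list to stdout.

# Attempt 1: replace each run of spaces with a newline, drop empty lines, then count.
# Note: "\n" in the sed replacement is understood by GNU sed; POSIX leaves it unspecified.
sed 's/ \{1,\}/\n/g' words.txt | sed '/^$/d' | sort | uniq -c | sort -nr | awk '{print $2,$1}'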
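The sed-based splitter relies on `\n` in the replacement text, which GNU sed understands but POSIX does not specify, and it needs a second sed to drop empty lines. As a hedged alternative, the splitting step can be done with awk's field splitting, which skips leading and trailing blanks and produces nothing for blank lines. The function name `word_freq_awk_split` and the messy sample input are made up for this sketch.

```shell
#!/usr/bin/env bash
# Work in a scratch directory so an existing words.txt is not overwritten.
cd "$(mktemp -d)" || exit 1

# Sample input with leading spaces, repeated spaces, a trailing space and a
# blank line (invented for this sketch; not from the problem statement).
printf '%s\n' '  the day  is' '' 'is   the the ' > words.txt

# word_freq_awk_split is a hypothetical name for this variant.
word_freq_awk_split() {
    # awk splits each line into fields on runs of whitespace, so no empty
    # lines reach sort and no separate "delete empty lines" step is needed.
    awk '{ for (i = 1; i <= NF; i++) print $i }' words.txt |
        sort | uniq -c | sort -nr | awk '{ print $2, $1 }'
}

word_freq_awk_split
```

For this input the function prints `the 3`, `is 2` and `day 1`, one per line.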

posted @ 2018-12-26 17:06 为敢技术
