Using the wc command on Linux to count lines, words, characters, and the longest line length in a file
wc help
$ wc --help
Usage: wc [OPTION]... [FILE]...
  or:  wc [OPTION]... --files0-from=F
Print newline, word, and byte counts for each FILE, and a total line if
more than one FILE is specified.  A word is a non-zero-length sequence of
characters delimited by white space.

With no FILE, or when FILE is -, read standard input.

The options below may be used to select which counts are printed, always in
the following order: newline, word, character, byte, maximum line length.
  -c, --bytes            print the byte counts
  -m, --chars            print the character counts
  -l, --lines            print the newline counts
      --files0-from=F    read input from the files specified by
                           NUL-terminated names in file F;
                           If F is - then read names from standard input
  -L, --max-line-length  print the maximum display width
  -w, --words            print the word counts
      --help     display this help and exit
      --version  output version information and exit

GNU coreutils online help: <http://www.gnu.org/software/coreutils/>
Full documentation at: <http://www.gnu.org/software/coreutils/wc>
or available locally via: info '(coreutils) wc invocation'
Usage
Count lines
$ wc -l /usr/share/dict/american-english
99171 /usr/share/dict/american-english
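As the help text says, -l prints the newline count, which is not always the same as the number of lines a reader would see. The following is a minimal sketch with made-up input supplied through printf (the variable names are only illustrative); it compares the same two lines of text with and without a terminating newline on the last line.

```shell
# Two lines of text, but the last one has no trailing newline.
no_trailing=$(printf 'a\nb' | wc -l)

# The same text, this time with a newline after the last line.
with_trailing=$(printf 'a\nb\n' | wc -l)

echo "without trailing newline: $no_trailing"
echo "with trailing newline:    $with_trailing"
```

If a file's final line is not newline-terminated, wc -l reports one less than the visible line count.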
Count words
$ wc -w /usr/share/dict/american-english
99171 /usr/share/dict/american-english
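The word count equals the line count here because the dictionary file holds one word per line. For other input, the help text's definition applies: a word is any non-empty run of characters delimited by white space. A small sketch on hypothetical sample text (the text and the variable name are assumptions made for illustration):

```shell
# Hypothetical sample: a run of spaces, a tab, a hyphenated token, a newline.
# Whitespace of any kind and length separates words; punctuation does not.
words=$(printf 'hello   world\tfoo-bar\nbaz\n' | wc -w)

echo "$words"
```

Here "foo-bar" is a single word because the hyphen is not white space, so the sample contains four words.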
Count characters
$ wc -m /usr/share/dict/american-english
938587 /usr/share/dict/american-english
Count bytes
$ wc -c /usr/share/dict/american-english
938848 /usr/share/dict/american-english
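Note the difference between -c and -m: a multibyte character, such as a Chinese character encoded in GBK or UTF-8, counts as one under -m but as several under -c. In the test below, Ubuntu's default encoding is UTF-8, in which each Chinese character takes 3 bytes, so the string is 5 ASCII bytes plus 2 * 3 bytes = 11 bytes, but only 7 characters.
$ echo -n "123, 测试" | wc -c
11
$ echo -n "123, 测试" | wc -m
7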
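The character count from -m depends on the current locale, because wc needs to know the encoding to decode multibyte sequences. The sketch below writes the same string as explicit UTF-8 bytes so that the input does not depend on the terminal encoding. The helper name emit and the locale name C.UTF-8 are assumptions; if that locale is not installed, wc may fall back to byte-wise counting.

```shell
# "123, 测试" as explicit UTF-8 bytes (octal escapes):
# 5 ASCII bytes followed by two 3-byte Chinese characters.
emit() { printf '123, \346\265\213\350\257\225'; }

# Byte count: independent of the locale.
bytes=$(emit | wc -c)

# Character count in the C locale: every byte is treated as one character.
chars_c=$(emit | LC_ALL=C wc -m)

# Character count in a UTF-8 locale (assumes C.UTF-8 is available).
chars_utf8=$(emit | LC_ALL=C.UTF-8 wc -m)

echo "bytes=$bytes chars_c=$chars_c chars_utf8=$chars_utf8"
```

If -m and -c report the same number on text that is known to contain multibyte characters, the locale is the first thing to check.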
Find the length of the longest line
$ wc -L /usr/share/dict/american-english
23 /usr/share/dict/american-english
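Per the help text, -L reports the maximum display width, which can differ from the character count of the longest line. A minimal sketch, assuming GNU coreutils behaviour where a tab advances to the next multiple of 8 columns (the sample input and variable names are made up):

```shell
# "a<TAB>b" is 3 characters long...
chars=$(printf 'a\tb' | wc -m)

# ...but the tab pads "a" out to column 8, so "b" ends at column 9.
width=$(printf 'a\tb\n' | wc -L)

echo "characters=$chars display_width=$width"
```

For a plain ASCII word list without tabs, such as the dictionary file above, the two measures coincide.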
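To get only the number, without the file name, either of the following two methods works. From a resource-saving standpoint the first is recommended: only wc's one-line output passes through the pipe, whereas the second copies the entire file through cat.
$ wc -l /usr/share/dict/american-english | awk '{print $1}'
99171
$ cat /usr/share/dict/american-english | wc -l
99171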
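Another common way to get the bare number is input redirection: when wc reads standard input it has no file name to print, and no extra process is needed. A minimal sketch, assuming a throwaway three-line sample file created with mktemp (the file contents and variable names are only illustrative):

```shell
# Create a hypothetical three-line sample file for this sketch.
tmpfile=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmpfile"

# The shell opens the file and hands it to wc as standard input,
# so wc prints only the count.
count=$(wc -l < "$tmpfile")
echo "$count"

# Clean up the sample file.
rm -f "$tmpfile"
```

On some non-GNU implementations the number may be padded with leading spaces, so trimming may still be needed before a string comparison.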