Deduplicating Text in Linux

1. Syntax

uniq [OPTION]... [INPUT [OUTPUT]]

2. Options

       -c, --count
              prefix lines by the number of occurrences

       -d, --repeated
              only print duplicate lines

       -D, --all-repeated[=delimit-method]
              print all duplicate lines; delimit-method={none(default),prepend,separate}.
              Delimiting is done with blank lines

       -f, --skip-fields=N
              avoid comparing the first N fields

       -i, --ignore-case
              ignore differences in case when comparing

       -s, --skip-chars=N
              avoid comparing the first N characters

       -u, --unique
              only print unique lines

       -z, --zero-terminated
              end lines with 0 byte, not newline

       -w, --check-chars=N
              compare no more than N characters in lines

       --help
              display this help and exit

       --version
              output version information and exit
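A few of the options above are easier to grasp when run on a tiny input. The following is a minimal sketch only; the file names fruits.txt and events.txt and their contents are hypothetical, made up for this illustration.

```shell
# Hypothetical sample files, created only for this illustration.
printf '%s\n' apple Apple banana banana cherry > fruits.txt
printf '%s\n' '10:01 login' '10:02 login' '10:05 logout' > events.txt

# -d: print only lines that are repeated consecutively (here: banana)
uniq -d fruits.txt

# -u: print only lines that are not repeated consecutively
# (here: apple, Apple, cherry)
uniq -u fruits.txt

# -i: ignore case when comparing, so apple and Apple collapse into one line
uniq -i fruits.txt

# -f 1: skip the first field (the timestamp) when comparing,
# so the two login lines are treated as duplicates
uniq -f 1 events.txt
```

In every case uniq compares only neighbouring lines, so these options are usually combined with sort when duplicates may be scattered through the file.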

3. Examples

Contents of unique.txt (the file ends with a blank line):

hellopython
hellopython
python
bbs.pythontab.com
python
pythontab.com
python
hello.pythontab.com
hellopythontab
hellopythontab

(1) Run uniq unique.txt:

hellopython
python
bbs.pythontab.com
python
pythontab.com
python
hello.pythontab.com
hellopythontab

(2) Does the output above look wrong? Next, run uniq -c unique.txt to see how many times each line was counted:

2 hellopython
1 python
1 bbs.pythontab.com
1 python
1 pythontab.com
1 python
1 hello.pythontab.com
2 hellopythontab
1
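# This still looks wrong: uniq only compares adjacent lines when checking for duplicates, so the three python lines, which are not next to each other, are counted separately. The final count of 1 belongs to the blank line at the end of the file. #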

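(3) Sort first so that identical lines become adjacent, then count: sort unique.txt | uniq -c. The order of the lines depends on the locale's collation rules; byte order (as with LC_ALL=C) is shown here:

1
1 bbs.pythontab.com
1 hello.pythontab.com
2 hellopython
2 hellopythontab
3 python
1 pythontab.com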
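Once the input is sorted, the other options behave as expected across the whole file. The sketch below recreates the sample unique.txt from this article and shows a few common combinations; the final sort -rn step is an addition for illustration, not part of the original walkthrough.

```shell
# Recreate the sample file from this article, including its trailing blank line.
printf '%s\n' hellopython hellopython python bbs.pythontab.com python \
  pythontab.com python hello.pythontab.com hellopythontab hellopythontab '' \
  > unique.txt

# Lines that occur more than once anywhere in the file
# (hellopython, hellopythontab, python).
sort unique.txt | uniq -d

# Lines that occur exactly once (the blank line and the three domain names).
sort unique.txt | uniq -u

# Frequency ranking: sort numerically on the count, highest first.
# The first line of output is the count 3 followed by python.
sort unique.txt | uniq -c | sort -rn
```

If only the deduplicated lines are needed, without counts, sort -u unique.txt gives the same result as sort unique.txt | uniq.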

---------------------

EOF

Posted 2014-10-21 16:34 by 天天AC
