Bash: Common Command-Line Tools - the tr Command

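The tr command performs simple character replacement and deletion; its most commonly used options are -d and -s. It always works on individual characters, never on whole strings.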

Suppose the file text.txt contains the following text:

Read from the file words.txt 
and output the word frequency list
to stdout.

USE CASE 1:

Convert the entire text to uppercase

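$ cat text.txt | tr 'a-z' 'A-Z'
READ FROM THE FILE WORDS.TXT
AND OUTPUT THE WORD FREQUENCY LIST
TO STDOUT.

In this command, a-z is the set of characters to be replaced and A-Z is the set of replacement characters; the two sets correspond one to one (a maps to A, b maps to B, and so on). The sets are quoted and written without square brackets, because an unquoted [a-z] is a shell glob pattern and may be expanded to a matching file name before tr ever sees it. tr cannot replace strings, for example: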

$ cat text.txt | tr "the" "an"
Rnad from ann filn words.axa
and ouapua ann word frnqunncy lisa
ao sadoua.

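As you can see, "the" was not replaced with "an". Instead, tr once again replaced character by character (t-->a, h-->n, e-->n). SET2 is shorter than SET1 here, so tr extends it by repeating its last character, which is why e also becomes n. Keep this in mind.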
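When a whole string really has to be replaced, a different tool is needed. The sketch below uses sed purely for contrast; sed is not covered in this post, and the variable names are illustrative. It assumes GNU tr, which pads a short SET2 with its last character.

```shell
# One sample line, taken from the text above.
line='Read from the file words.txt'

# tr maps single characters: t->a, h->n, e->n.
by_tr=$(printf '%s\n' "$line" | tr 'the' 'an')

# sed replaces every occurrence of the string "the" with "an".
by_sed=$(printf '%s\n' "$line" | sed 's/the/an/g')

printf 'tr : %s\n' "$by_tr"
printf 'sed: %s\n' "$by_sed"
```

Note that s/the/an/g also matches "the" inside longer words such as "other", so a real script would probably need a stricter pattern.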

USE CASE 2:

Put each word of the text on its own line

$ cat text.txt | tr -s " " "\n"
Read
from
the
file
words.txt
and
output
the
word
frequency
list
to
stdout.

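The -s option is used here to squeeze repeats: once the specified characters have been replaced, any run of adjacent identical replacement characters is reduced to a single one. Without -s: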

$ cat text.txt | tr " " "\n"
Read
from
the
file
words.txt

and
output
the
word
frequency
list

to
stdout.

Extra blank lines appear wherever the input has a space directly before a line break. A more extreme example:

$ echo "AAaaaaabbBBbb" | tr -s 'ab' 'AB'
AB

As you can see, with -s each run of identical characters left after the replacement is reduced to a single character.
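The -s option can also be used on its own, with a single set and no replacement; in that case the man page says squeezing applies to SET1. A minimal sketch with made-up input strings:

```shell
# Collapse runs of spaces into a single space (no translation involved).
squeezed=$(printf 'a   b    c\n' | tr -s ' ')

# Collapse runs of newlines, which removes blank lines.
no_blank=$(printf 'x\n\n\ny\n' | tr -s '\n')

printf '%s\n' "$squeezed"
printf '%s\n' "$no_blank"
```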

Printing the PATH variable one directory per line:

$ echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
$ echo $PATH|tr -s ":" "\n"
/usr/local/sbin
/usr/local/bin
/usr/sbin
/usr/bin
/sbin
/bin
/usr/games
/usr/local/games
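The sample text talks about a word frequency list, and the one-word-per-line output from this use case is the usual first step towards one. The following is only a sketch: word_freq is a hypothetical helper name, and sort and uniq are standard tools that this post does not otherwise cover.

```shell
# The sample text from the top of the post.
text='Read from the file words.txt
and output the word frequency list
to stdout.'

# Hypothetical helper: read text on stdin, print "count word" lines.
word_freq() {
  tr -s ' ' '\n' |   # one word per line, empty lines squeezed away
    sort |           # bring identical words together
    uniq -c |        # count each run of identical lines
    sort -rn         # highest count first
}

freq=$(printf '%s\n' "$text" | word_freq)
printf '%s\n' "$freq"
```

The count is case-sensitive and keeps punctuation attached to words ("stdout." is counted with its period); folding case or deleting punctuation with tr first would be a natural extension.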

USE CASE 3:

Delete specified characters

$ date | tr -d 'A-Za-z'
  17 14:45:56  2015
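Two other common uses of -d, sketched with made-up input. The date string is a hypothetical sample, not the actual output of date on the author's machine. The second command combines -d with -c (complement), so everything except the listed characters is deleted.

```shell
# Remove carriage returns, e.g. from a DOS-style line ending.
unix_line=$(printf 'hello\r\n' | tr -d '\r')

# Keep only digits and the newline; delete every other character.
digits=$(printf 'Fri Apr 17 14:45:56 CST 2015\n' | tr -cd '0-9\n')

printf '%s\n' "$unix_line"
printf '%s\n' "$digits"
```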

Reference: http://blog.sina.com.cn/s/blog_58c3f7960100uttl.html

NAME
       tr - translate or delete characters

SYNOPSIS
       tr [OPTION]... SET1 [SET2]

DESCRIPTION
       Translate, squeeze, and/or delete characters from standard input, writing to standard output.

       -c, -C, --complement
              use the complement of SET1

       -d, --delete
              delete characters in SET1, do not translate

       -s, --squeeze-repeats
              replace each input sequence of a repeated character that is listed in SET1 with a single occurrence of that character

       -t, --truncate-set1
              first truncate SET1 to length of SET2

       --help display this help and exit

       --version
              output version information and exit

       SETs are specified as strings of characters.  Most represent themselves.  Interpreted sequences are:

       \NNN   character with octal value NNN (1 to 3 octal digits)

       \\     backslash

       \a     audible BEL

       \b     backspace

       \f     form feed

       \n     new line

       \r     return

       \t     horizontal tab

       \v     vertical tab

       CHAR1-CHAR2
              all characters from CHAR1 to CHAR2 in ascending order

       [CHAR*]
              in SET2, copies of CHAR until length of SET1

       [CHAR*REPEAT]
              REPEAT copies of CHAR, REPEAT octal if starting with 0

       [:alnum:]
              all letters and digits

       [:alpha:]
              all letters

       [:blank:]
              all horizontal whitespace

       [:cntrl:]
              all control characters

       [:digit:]
              all digits

       [:graph:]
              all printable characters, not including space

       [:lower:]
              all lower case letters

       [:print:]
              all printable characters, including space

       [:punct:]
              all punctuation characters

       [:space:]
              all horizontal or vertical whitespace

       [:upper:]
              all upper case letters

       [:xdigit:]
              all hexadecimal digits

       [=CHAR=]
              all characters which are equivalent to CHAR

       Translation occurs if -d is not given and both SET1 and SET2 appear. -t may be used only when translating.
       SET2 is extended to length of SET1 by repeating its last character as necessary. Excess characters of SET2
       are ignored. Only [:lower:] and [:upper:] are guaranteed to expand in ascending order; used in SET2 while
       translating, they may only be used in pairs to specify case conversion. -s uses SET1 if not translating nor
       deleting; else squeezing uses SET2 and occurs after translation or deletion.
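A few of the rules from the man page excerpt above can be tried out directly. This is a sketch assuming GNU coreutils tr (the -t option and the padding of a short SET2 are not guaranteed by other implementations); the input strings and variable names are made up.

```shell
# Case conversion with the [:lower:]/[:upper:] pair instead of a-z/A-Z.
upper=$(printf 'to stdout.\n' | tr '[:lower:]' '[:upper:]')

# SET2 shorter than SET1: SET2 is padded with its last character, so e -> n.
padded=$(printf 'the\n' | tr 'the' 'an')

# -t truncates SET1 to the length of SET2, so e is left untouched.
truncated=$(printf 'the\n' | tr -t 'the' 'an')

# [CHAR*] in SET2 repeats CHAR up to the length of SET1: every digit becomes x.
masked=$(printf 'tel 555-0100\n' | tr '[:digit:]' '[x*]')

printf '%s\n' "$upper" "$padded" "$truncated" "$masked"
```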

 

posted @ 2015-04-17 14:48  卖程序的小歪