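Bash: Common Command-Line Tools - The tr Command
The tr command performs simple character translation and deletion; its most commonly used options are -d and -s. It always operates on individual characters, never on strings.
Suppose we have the following text (saved as text.txt):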
Read from the file words.txt
and output the word frequency list
to stdout.
USE CASE 1:
Convert the whole text to uppercase
$ cat text.txt | tr '[a-z]' '[A-Z]'
READ FROM THE FILE WORDS.TXT
AND OUTPUT THE WORD FREQUENCY LIST
TO STDOUT.
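In this command, the first set (a-z) lists the characters to be replaced and the second set (A-Z) lists their replacements; the two sets correspond position by position (a maps to A, b maps to B, and so on). The square brackets are not needed for a range in POSIX tr; if you write them, quote them so the shell cannot expand them as a filename pattern. tr cannot replace strings. For example: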
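$ cat text.txt | tr "the" "an"
Rnad from ann filn words.axa
and ouapua ann word frnqunncy lisa
ao sadoua.
Notice that "the" was not replaced by "an". Instead, tr again substituted character by character (t-->a, h-->n, e-->n; because the second set is shorter than the first, its last character is repeated). Keep this in mind.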
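Because tr only maps single characters, replacing a whole word needs a different tool. The following is a minimal sketch that contrasts tr with sed; sed is not part of the original example, and the variable names (line, by_char, by_word) are purely illustrative. The tr result assumes GNU tr, which pads the shorter second set with its last character.

```shell
# One line of the sample text, used as input for both commands.
line="Read from the file words.txt"

# tr maps characters one by one: t->a, h->n, e->n.
by_char=$(printf '%s\n' "$line" | tr 'the' 'an')

# sed substitutes the substring "the" as a unit.
by_word=$(printf '%s\n' "$line" | sed 's/the/an/g')

printf '%s\n' "$by_char"
printf '%s\n' "$by_word"
```

With this input, only the sed version leaves the other words intact, which is usually what is wanted when replacing a word rather than a set of characters.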
USE CASE 2:
Extract the words from the text, one per line
$ cat text.txt | tr -s " " "\n"
Read
from
the
file
words.txt
and
output
the
word
frequency
list
to
stdout.
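The -s option squeezes repeats: after the specified characters have been translated, any run of adjacent identical characters from the second set is reduced to a single occurrence. Without -s: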
$ cat text.txt | tr " " "\n"
Read
from
the
file
words.txt
and
output
the
word
frequency
list
to
stdout.
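Without -s, every space becomes its own newline, so a run of spaces, or a space right before a line break, produces extra blank lines. A more extreme example: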
$ echo "AAaaaaabbBBbb" | tr -s '[ab]' '[AB]'
AB
With -s, once the translation is done, each run of repeated characters is reduced to a single character.
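The -s option can also be used with a single set, in which case tr does no translation and only squeezes repeats of the listed characters. A small sketch, assuming a made-up input string with runs of spaces (the variable names are illustrative):

```shell
# Hypothetical input containing runs of several spaces.
messy="a   b    c"

# With only one set and -s, tr squeezes repeats of the characters in that set.
squeezed=$(printf '%s\n' "$messy" | tr -s ' ')

printf '%s\n' "$squeezed"
```

This is a common way to normalize whitespace before passing a line on to tools that split on single spaces, such as cut.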
Inspecting the PATH variable:
$ echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
$ echo $PATH | tr -s ":" "\n"
/usr/local/sbin
/usr/local/bin
/usr/sbin
/usr/bin
/sbin
/bin
/usr/games
/usr/local/games
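The sample text talks about a word frequency list, and the one-word-per-line output of tr is a natural first step toward one. The pipeline below is only a sketch of one possible approach: sort, uniq and the variable names are additions for illustration, and the text is kept in a variable instead of text.txt so the example is self-contained.

```shell
# Sample text from the beginning of this post, held in a variable for the sketch.
text='Read from the file words.txt
and output the word frequency list
to stdout.'

# 1. tr turns spaces into newlines, squeezing repeats so no blank lines appear.
# 2. sort groups identical words; uniq -c prefixes each word with its count.
# 3. sort -rn lists the most frequent words first.
freq=$(printf '%s\n' "$text" | tr -s ' ' '\n' | sort | uniq -c | sort -rn)

printf '%s\n' "$freq"
```

Note that this treats "stdout." (with the period) and "words.txt" as words in their own right; stripping punctuation would need an extra tr -d step.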
USE CASE 3:
Delete specified characters
$ date | tr -d '[A-Za-z]'
17 14:45:56 2015
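Two other deletions come up often in practice. The sketch below uses made-up input strings and illustrative variable names: the first command removes carriage returns from a Windows-style line ending, and the second combines -d with -c (complement) to delete everything that is not a digit.

```shell
# Remove the carriage return from a hypothetical CRLF-terminated line.
clean=$(printf 'hello\r\n' | tr -d '\r')

# -c complements the set, so -cd deletes every character that is NOT a digit
# (including the trailing newline).
digits=$(printf 'Tel: 555-0100\n' | tr -cd '0-9')

printf '%s\n' "$clean"
printf '%s\n' "$digits"
```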
Reference: http://blog.sina.com.cn/s/blog_58c3f7960100uttl.html
NAME
       tr - translate or delete characters

SYNOPSIS
       tr [OPTION]... SET1 [SET2]

DESCRIPTION
       Translate, squeeze, and/or delete characters from standard input, writing to standard output.

       -c, -C, --complement
              use the complement of SET1

       -d, --delete
              delete characters in SET1, do not translate

       -s, --squeeze-repeats
              replace each input sequence of a repeated character that is listed in SET1 with a single occurrence of that character

       -t, --truncate-set1
              first truncate SET1 to length of SET2

       --help display this help and exit

       --version
              output version information and exit

       SETs are specified as strings of characters. Most represent themselves. Interpreted sequences are:

       \NNN   character with octal value NNN (1 to 3 octal digits)
       \\     backslash
       \a     audible BEL
       \b     backspace
       \f     form feed
       \n     new line
       \r     return
       \t     horizontal tab
       \v     vertical tab

       CHAR1-CHAR2
              all characters from CHAR1 to CHAR2 in ascending order
       [CHAR*]
              in SET2, copies of CHAR until length of SET1
       [CHAR*REPEAT]
              REPEAT copies of CHAR, REPEAT octal if starting with 0

       [:alnum:]   all letters and digits
       [:alpha:]   all letters
       [:blank:]   all horizontal whitespace
       [:cntrl:]   all control characters
       [:digit:]   all digits
       [:graph:]   all printable characters, not including space
       [:lower:]   all lower case letters
       [:print:]   all printable characters, including space
       [:punct:]   all punctuation characters
       [:space:]   all horizontal or vertical whitespace
       [:upper:]   all upper case letters
       [:xdigit:]  all hexadecimal digits
       [=CHAR=]    all characters which are equivalent to CHAR

       Translation occurs if -d is not given and both SET1 and SET2 appear. -t may be used only when translating. SET2 is extended to length of SET1 by repeating its last character as necessary. Excess characters of SET2 are ignored. Only [:lower:] and [:upper:] are guaranteed to expand in ascending order; used in SET2 while translating, they may only be used in pairs to specify case conversion.

       -s uses SET1 if not translating nor deleting; else squeezing uses SET2 and occurs after translation or deletion.
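The character classes and the [CHAR*] form listed in the manual can replace hand-written ranges. The following sketch uses made-up input strings and illustrative variable names; it shows the [:lower:]/[:upper:] pair for case conversion and [#*] in SET2 to map every digit to the same character.

```shell
# Case conversion with the paired classes, as the manual describes.
upper=$(printf 'to stdout.\n' | tr '[:lower:]' '[:upper:]')

# [#*] in SET2 repeats '#' until SET2 is as long as SET1,
# so every digit is translated to '#'.
masked=$(printf 'abc123\n' | tr '[:digit:]' '[#*]')

printf '%s\n' "$upper"
printf '%s\n' "$masked"
```

The classes are quoted so the shell passes them to tr unchanged instead of treating the brackets as a filename pattern.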