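Shell Basics 08: The sed Command-Line Editor (Part 1)
The sed editor is known as a stream editor, the opposite of an ordinary interactive text editor. In an interactive editor (e.g. vim), you use keyboard commands to insert, delete, or replace text interactively. A stream editor instead edits a data stream according to a set of rules supplied in advance, before the editor processes the data.
sed processes the data in a stream according to commands that are either entered on the command line or stored in a command text file. The sed editor does the following: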
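(1) Reads one line of data at a time from the input.
(2) Matches that data against the supplied editor commands.
(3) Modifies the data in the stream as the commands specify.
(4) Writes the new data to STDOUT.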
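Once the stream editor has matched all the commands against one line of data, it reads the next line and repeats the process. After it has processed every line in the stream, it terminates.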
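The cycle described above can be sketched in plain shell. This is only an illustration of the read, match, modify, print loop, not how sed is actually implemented; the function name mini_sed is made up for this example, and it hard-codes the single substitution s/dog/cat/.

```shell
# mini_sed: hypothetical helper that imitates the cycle of sed 's/dog/cat/'
mini_sed() {
    # (1) read one line at a time from the input
    while IFS= read -r line || [ -n "$line" ]; do
        # (2) match the line against the command's pattern
        case $line in
            *dog*)
                # (3) modify the data: replace the first "dog" with "cat"
                line="${line%%dog*}cat${line#*dog}"
                ;;
        esac
        # (4) write the resulting line to STDOUT
        printf '%s\n' "$line"
    done
}

printf 'I love cat.\nI love dog.\n' | mini_sed
```

Each line goes through the whole command list before the next line is read, which is why sed can work on input of any length.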
Format:
sed options script file
1. Defining editor commands on the command line
[Hermioner@localhost Documents]$ cat data1.txt
I love cat.
I love cat.
I love dog.
I love dog.
[Hermioner@localhost Documents]$ sed 's/dog/cat/' data1.txt
I love cat.
I love cat.
I love cat.
I love cat.
[Hermioner@localhost Documents]$ cat data1.txt
I love cat.
I love cat.
I love dog.
I love dog.

[Hermioner@localhost Documents]$ echo "This is a dog" | sed 's/dog/cat/'
This is a cat
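Note: the sed editor does not modify the data in the text file. It only sends the modified data to STDOUT; if you look at the original file, it still contains the original data. The s command (short for substitute) performs the replacement.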
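Since sed only writes to STDOUT, one way to keep the result is to redirect it to a new file. The sketch below works in a temporary directory; the file names are made up for the example.

```shell
# Work in a throwaway directory; the file names are made up for this example.
workdir=$(mktemp -d)
printf 'I love cat.\nI love dog.\n' > "$workdir/data1.txt"

# sed writes the edited stream to STDOUT; redirect it to keep the result.
sed 's/dog/cat/' "$workdir/data1.txt" > "$workdir/data1.new"

original=$(cat "$workdir/data1.txt")   # the input file is untouched
edited=$(cat "$workdir/data1.new")     # the new file holds the edited text
printf 'original:\n%s\nedited:\n%s\n' "$original" "$edited"

rm -rf "$workdir"
```

Redirecting to the same file that sed is reading would truncate it before sed reads it, so the output goes to a different name here.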
2. Using multiple editor commands on the command line
[Hermioner@localhost Documents]$ cat data1.txt
I love cat.
I love cat.
I love dog.
I love dog.
[Hermioner@localhost Documents]$ sed -e 's/cat/dog/; s/dog/cat/' data1.txt
I love cat.
I love cat.
I love cat.
I love cat.
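Note: to run several commands, pass them with the -e option and separate them with semicolons (a newline between commands also works). The commands are applied to each line in the order given.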
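The following sketch shows three ways of passing the same two commands, plus what happens when their order is swapped. The sample input is taken from data1.txt above; the variable names are only for illustration.

```shell
input='I love cat.
I love dog.'

# Three equivalent ways to pass the same two commands
a=$(printf '%s\n' "$input" | sed 's/cat/dog/; s/dog/cat/')
b=$(printf '%s\n' "$input" | sed -e 's/cat/dog/' -e 's/dog/cat/')
c=$(printf '%s\n' "$input" | sed 's/cat/dog/
s/dog/cat/')

# Commands run in order on each line, so swapping them changes the result
d=$(printf '%s\n' "$input" | sed 's/dog/cat/; s/cat/dog/')

printf '%s\n---\n%s\n' "$a" "$d"
```

In the first three variants every line ends up with "cat", because the second command undoes the first; in the swapped variant every line ends up with "dog".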
3. Reading editor commands from a file
[Hermioner@localhost Documents]$ cat data1.txt
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
[Hermioner@localhost Documents]$ cat script1.sed
s/brown/green/
s/fox/elephant/
s/dog/cat/
[Hermioner@localhost Documents]$ sed -f script1.sed data1.txt
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
The quick green elephant jumps over the lazy cat.
[Hermioner@localhost Documents]$
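When there are many sed commands to run, it is more convenient to put them in a separate file, one command per line, and name that file with the -f option.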
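As a small self-contained sketch, the script file can be created with a here-document and then passed to sed with -f. The temporary directory is an assumption of this example; the commands are the ones from script1.sed above.

```shell
workdir=$(mktemp -d)

# Write the commands to a script file, one per line
cat > "$workdir/script1.sed" <<'EOF'
s/brown/green/
s/fox/elephant/
s/dog/cat/
EOF

out=$(echo 'The quick brown fox jumps over the lazy dog.' |
      sed -f "$workdir/script1.sed")
printf '%s\n' "$out"

rm -rf "$workdir"
```

No semicolons are needed in the script file, because each command sits on its own line.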
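4. More substitution options
By default, the s command replaces only the first occurrence on each line. To replace occurrences at other positions in a line, use a substitution flag. The format is:
s/pattern/replacement/flags
Four substitution flags are available:
(1) a number: replace only that occurrence of the pattern on the line;
(2) g: replace every occurrence of the pattern;
(3) p: print the line in which a substitution was made;
(4) w file: write the result of the substitution to a file.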
[Hermioner@localhost Documents]$ cat data.txt
This is a test of the test script.
This is the second test of the test script.
[Hermioner@localhost Documents]$ sed 's/test/trail/' data.txt
This is a trail of the test script.
This is the second trail of the test script.
[Hermioner@localhost Documents]$ sed 's/test/trail/2' data.txt
This is a test of the trail script.
This is the second test of the trail script.
[Hermioner@localhost Documents]$ sed 's/test/trail/g' data.txt
This is a trail of the trail script.
This is the second trail of the trail script.
[Hermioner@localhost Documents]$ sed 's/test/trail/p' data.txt
This is a trail of the test script.
This is a trail of the test script.
This is the second trail of the test script.
This is the second trail of the test script.
[Hermioner@localhost Documents]$ sed -n 's/test/trail/p' data.txt
This is a trail of the test script.
This is the second trail of the test script.
[Hermioner@localhost Documents]$ sed 's/test/trail/w test.txt' data.txt
This is a trail of the test script.
This is the second trail of the test script.
[Hermioner@localhost Documents]$ cat test.txt
This is a trail of the test script.
This is the second trail of the test script.
[Hermioner@localhost Documents]$
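Note: the -n option suppresses sed's normal output, while the p flag prints the lines that were modified. Used together, they output only the lines changed by the substitution command.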
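Flags can be combined. The sketch below uses g and p together with -n, so that every match on a line is replaced and only the changed lines are printed. The sample input is invented for the example.

```shell
input='This is a test of the test script.
This line has nothing to replace.
Another test.'

# g replaces every match on a line; p prints lines where a substitution
# happened; -n suppresses the default output of all other lines.
out=$(printf '%s\n' "$input" | sed -n 's/test/trial/gp')
printf '%s\n' "$out"
```

The middle line is dropped from the output because no substitution took place on it.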
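The forward slash is the usual string delimiter, so a slash that appears in the pattern text has to be escaped with a backslash, which is tedious. For text that contains such awkward characters, you can use a different character such as ! as the delimiter instead, which is more convenient than escaping.
[Hermioner@localhost Documents]$ cat dat.txt
/a/b/c
[Hermioner@localhost Documents]$ sed 's!/a/b/c!/e/f/g!' dat.txt
/e/f/g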
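To compare the two styles side by side, the sketch below performs the same substitution once with escaped slashes and once with ! as the delimiter. The sample line only imitates the layout of a passwd entry and is not read from a real system file.

```shell
line='root:x:0:0:root:/root:/bin/bash'

# With / as the delimiter, every slash in the pattern must be escaped
escaped=$(echo "$line" | sed 's/\/bin\/bash/\/bin\/csh/')

# With ! as the delimiter, the paths can be written as-is
plain=$(echo "$line" | sed 's!/bin/bash!/bin/csh!')

printf '%s\n%s\n' "$escaped" "$plain"
```

Both commands produce the same line; the second is simply easier to read.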
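5. Using addresses: line addressing
By default, the commands you use in the sed editor apply to every line of the text data. To apply a command only to a specific line or group of lines, you must use line addressing. There are two forms of line addressing in sed: a numeric range of lines, and a text pattern that filters lines. Both forms use the same format to specify the address:
[address]command
or
address {
command1
command2
}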
(1) Numeric line addressing
[Hermioner@localhost Documents]$ cat data.txt
I like yellow cat.
I like yellow cat.
I like yellow cat.
I like yellow cat.
[Hermioner@localhost Documents]$ sed '3s/cat/dog/' data.txt
I like yellow cat.
I like yellow cat.
I like yellow dog.
I like yellow cat.
# applies only to line 3
[Hermioner@localhost Documents]$ sed '2,3s/cat/dog/' data.txt
I like yellow cat.
I like yellow dog.
I like yellow dog.
I like yellow cat.
# applies to the range of lines 2 through 3
[Hermioner@localhost Documents]$ sed '2,$s/cat/dog/' data.txt
I like yellow cat.
I like yellow dog.
I like yellow dog.
I like yellow dog.
# applies to every line from line 2 to the end
(2) Using text pattern filters
The pattern must be enclosed in forward slashes, i.e. /pattern/command. The command then runs only on lines that contain the pattern.
[Hermioner@localhost Documents]$ grep I data.txt
I like yellow cat.
I like yellow cat.
I like yellow cat.
I like yellow cat.
[Hermioner@localhost Documents]$ cat data.txt
I like yellow cat.
I like yellow cat.
I like yellow cat.
I like yellow cat.
[Hermioner@localhost Documents]$ sed '/I/s/cat/dog/' data.txt
I like yellow dog.
I like yellow dog.
I like yellow dog.
I like yellow dog.
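In the transcript above every line matches the pattern, so the filtering is hard to see. Here is a small sketch with input where only one line matches; the sample lines mirror the layout of data11.txt, which is used later in this article.

```shell
input='Zhang,San A
Li,Si B
Wang,San B'

# Only lines containing "Li" are passed to the s command
out=$(printf '%s\n' "$input" | sed '/Li/s/B/A/')
printf '%s\n' "$out"
```

The line for Wang also ends in B, but it is left alone because it does not match /Li/.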
(3) Grouping commands: several commands on the same line
[Hermioner@localhost Documents]$ cat data.txt
I like yellow cat.
I like yellow cat.
I like yellow cat.
I like yellow cat.
[Hermioner@localhost Documents]$ sed '2{
> s/like/love/
> s/cat/dog/
> }' data.txt
I like yellow cat.
I love yellow dog.
I like yellow cat.
I like yellow cat.
[Hermioner@localhost Documents]$
Using a range:
[Hermioner@localhost Documents]$ sed '2,${
> s/like/love/
> s/cat/dog/
> }' data.txt
I like yellow cat.
I love yellow dog.
I love yellow dog.
I love yellow dog.
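A command group can also hang off a text pattern address instead of a line number. A minimal sketch, with sample input and replacement text invented for the example:

```shell
input='Zhang,San A
Li,Si B
Wang,San B'

# Both substitutions run only on lines that contain "Wang"
out=$(printf '%s\n' "$input" | sed '/Wang/{
s/San/Wu/
s/B/A/
}')
printf '%s\n' "$out"
```

The first line also contains "San", but it is not touched because the whole group is limited to lines matching /Wang/.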
6. Deleting lines
The delete command, d, deletes every line that matches the given address. If no address is given, all lines are deleted. The address can be numeric or a text pattern:
[Hermioner@localhost Documents]$ cat data1.txt
this is 1 line.
this is 2 line.
this is 3 line.
this is 4 line.
[Hermioner@localhost Documents]$ sed '2d' data1.txt
this is 1 line.
this is 3 line.
this is 4 line.
[Hermioner@localhost Documents]$ sed '1,3d' data1.txt
this is 4 line.
[Hermioner@localhost Documents]$ sed 'd' data1.txt
[Hermioner@localhost Documents]$ sed '2,$d' data1.txt
this is 1 line.
[Hermioner@localhost Documents]$ sed '/is 2/d' data1.txt
this is 1 line.
this is 3 line.
this is 4 line.
[Hermioner@localhost Documents]$
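A common use of pattern-addressed d is stripping comment lines and blank lines from a file. The sketch below assumes the regular-expression anchors ^ (start of line) and $ (end of line), which this article has not covered; the input is invented for the example.

```shell
input='# a comment
this is 1 line.

this is 2 line.'

# /^#/ matches lines that start with #; /^$/ matches empty lines
out=$(printf '%s\n' "$input" | sed '/^#/d; /^$/d')
printf '%s\n' "$out"
```

As with every other command, the deletion only affects the output stream; the input itself is not changed.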
7. Inserting and appending text
The insert command (i) adds a new line before the specified line;
the append command (a) adds a new line after the specified line.
Use an address to say which line the text is added to (without one, the command applies to every line):
sed '[address]command\new line'
[Hermioner@localhost Documents]$ cat data1.txt
this is 1 line.
this is 2 line.
this is 3 line.
this is 4 line.
[Hermioner@localhost Documents]$ echo "a" | sed 'i\b'
b
a
[Hermioner@localhost Documents]$ echo "a" | sed 'a\b'
a
b
[Hermioner@localhost Documents]$ sed '3a\haha' data1.txt
this is 1 line.
this is 2 line.
this is 3 line.
haha
this is 4 line.
[Hermioner@localhost Documents]$ sed '$a\hello' data1.txt
this is 1 line.
this is 2 line.
this is 3 line.
this is 4 line.
hello
# appended after the last line
[Hermioner@localhost Documents]$ sed '1i\
> a\
> b' data1.txt
a
b
this is 1 line.
this is 2 line.
this is 3 line.
this is 4 line.
# when inserting several lines, end every line except the last with a backslash
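The one-line form used above (text directly after the backslash) is accepted by GNU sed. A more portable spelling puts the text on the line after i\ or a\. The sketch below uses that form to add a first and a last line in one pass; the words HEADER and FOOTER are placeholders invented for the example.

```shell
input='this is 1 line.
this is 2 line.'

# The inserted or appended text goes on the line after "i\" or "a\"
out=$(printf '%s\n' "$input" | sed '1i\
HEADER
$a\
FOOTER')
printf '%s\n' "$out"
```

The address 1 places HEADER before the first line and the address $ places FOOTER after the last one.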
8. Changing lines
The change command (c) replaces the entire content of a line. The line can be selected by number or by a text pattern.
[Hermioner@localhost Documents]$ cat data1.txt
this is 1 line.
this is 2 line.
this is 3 line.
this is 4 line.
[Hermioner@localhost Documents]$ sed '3c\hello' data1.txt
this is 1 line.
this is 2 line.
hello
this is 4 line.
# selected by line number
[Hermioner@localhost Documents]$ sed '/3 line/c\hi' data1.txt
this is 1 line.
this is 2 line.
hi
this is 4 line.
# selected by pattern
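One detail worth knowing: when c is given an address range, sed replaces the whole range with a single copy of the text instead of changing each line separately. A short sketch, with replacement text invented for the example:

```shell
input='this is 1 line.
this is 2 line.
this is 3 line.
this is 4 line.'

# Lines 2 and 3 are removed and the new text is printed once in their place
out=$(printf '%s\n' "$input" | sed '2,3c\
lines 2-3 replaced')
printf '%s\n' "$out"
```

If each line in the range should get its own replacement, a substitution such as s/.*/text/ on the same range is one alternative.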
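9. The transform command
The transform command (y) is the only sed editor command that operates on single characters. The format is:
[address]y/inchars/outchars/
The transform command maps the characters in inchars one-to-one onto the characters in outchars: the first character of inchars is converted to the first character of outchars, the second to the second, and so on. If inchars and outchars differ in length, sed reports an error. The conversion applies to every matching character on the line, not just the first one. For example:
[Hermioner@localhost Documents]$ cat data1.txt
this is 1 line.
this is 2 line.
this is 3 line.
this is 4 line.
[Hermioner@localhost Documents]$ sed 'y/123/789/' data1.txt
this is 7 line.
this is 8 line.
this is 9 line.
this is 4 line.

[Hermioner@localhost Documents]$ echo "This 2 is a test of 1 try" | sed 'y/123/456/'
This 5 is a test of 4 try
[Hermioner@localhost Documents]$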
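Two quick checks of the behavior described above: y used to upper-case a line, and the failure when the two character lists differ in length. The variable names are only for illustration.

```shell
# Map each lowercase letter to its uppercase counterpart
upper=$(echo 'this is 1 line.' |
        sed 'y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/')
printf '%s\n' "$upper"

# inchars has 3 characters and outchars has 2, so sed refuses the script
if echo 'abc' | sed 'y/abc/xy/' >/dev/null 2>&1; then
    mismatch=accepted
else
    mismatch=rejected
fi
printf 'length mismatch: %s\n' "$mismatch"
```

Characters that are not listed in inchars, such as the digit and the period here, pass through unchanged.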
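10. Other ways to print with sed (can be skipped for now)
The p flag was introduced earlier, together with the substitution command, to display the lines sed has modified. Three more commands can print information from the data stream:
1) the p command prints a text line;
2) the equals sign (=) command prints line numbers;
3) the l (lowercase L) command lists a line.
(1) Printing lines
[Hermioner@localhost Documents]$ cat data1.txt
this is 1 line.
this is 2 line.
this is 3 line.
this is 4 line.
[Hermioner@localhost Documents]$ sed -n '2,3p' data1.txt
this is 2 line.
this is 3 line.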
(2) Printing line numbers
[Hermioner@localhost Documents]$ cat data1.txt
this is 1 line.
this is 2 line.
this is 3 line.
this is 4 line.
[Hermioner@localhost Documents]$ sed '=' data1.txt
1
this is 1 line.
2
this is 2 line.
3
this is 3 line.
4
this is 4 line.
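The = command becomes more useful when combined with an address and -n. The sketch below prints the line number of a matching line together with its text, and then uses the $ address to count the lines in the input.

```shell
input='this is 1 line.
this is 2 line.
this is 3 line.
this is 4 line.'

# Print the line number and then the text of lines matching a pattern
match=$(printf '%s\n' "$input" | sed -n '/3 line/{
=
p
}')
printf '%s\n' "$match"

# With the $ address, = prints only the number of the last line,
# which equals the number of lines in the input
count=$(printf '%s\n' "$input" | sed -n '$=')
printf 'lines: %s\n' "$count"
```

The first command behaves a little like grep -n, except that the number and the text appear on separate lines.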
(3) Listing lines
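As a brief sketch of what listing means: the l command prints a line in an unambiguous form, showing nonprinting characters as escape sequences and marking the end of the line with a $. The input below, containing a tab character, is invented for the example.

```shell
# The input contains a real tab between "a" and "b"
out=$(printf 'a\tb\n' | sed -n 'l')
printf '%s\n' "$out"
```

With l, the tab is shown as the two characters \t and the line is followed by $, which helps when hunting for stray whitespace.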
11. Using files with sed
(1) Writing to a file
The w command writes lines to a file. The format is:
[address]w filename
filename can be a relative or an absolute path.
[Hermioner@localhost Documents]$ cat data1.txt
this is 1 line.
this is 2 line.
this is 3 line.
this is 4 line.
[Hermioner@localhost Documents]$ sed '1,2w test.txt' data1.txt
this is 1 line.
this is 2 line.
this is 3 line.
this is 4 line.
[Hermioner@localhost Documents]$ cat test.txt
this is 1 line.
this is 2 line.
# writes the first two lines to another file; to keep sed from also printing to STDOUT, add the -n option
[Hermioner@localhost Documents]$ cat data11.txt
Zhang,San A
Li,Si B
Wang,San B
[Hermioner@localhost Documents]$ sed -n '/B/w h.txt' data11.txt
[Hermioner@localhost Documents]$ cat h.txt
Li,Si B
Wang,San B
# selected by pattern
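Building on the pattern example, several w commands can split one input into several files in a single pass. In this sketch the output names a.txt and b.txt are made up, and each w command gets its own -e because the file name runs to the end of the command.

```shell
workdir=$(mktemp -d)
printf 'Zhang,San A\nLi,Si B\nWang,San B\n' > "$workdir/data11.txt"

# Lines ending in A go to a.txt, lines ending in B go to b.txt;
# -n keeps sed from also printing everything to STDOUT.
( cd "$workdir" && sed -n -e '/A$/w a.txt' -e '/B$/w b.txt' data11.txt )

group_a=$(cat "$workdir/a.txt")
group_b=$(cat "$workdir/b.txt")
printf 'a.txt:\n%s\nb.txt:\n%s\n' "$group_a" "$group_b"

rm -rf "$workdir"
```

The patterns use the $ anchor so that the letter is only matched at the end of the line, not inside a name.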
(2) Reading data from a file
The read command (r) inserts data from a separate file into the data stream. The format is:
[address]r filename
sed inserts the text from the file after the line at the specified address.
[Hermioner@localhost Documents]$ cat data11.txt
Zhang,San A
Li,Si B
Wang,San B
[Hermioner@localhost Documents]$ cat data1.txt
this is 1 line.
this is 2 line.
this is 3 line.
this is 4 line.
[Hermioner@localhost Documents]$ sed '3r data11.txt' data1.txt
this is 1 line.
this is 2 line.
this is 3 line.
Zhang,San A
Li,Si B
Wang,San B
this is 4 line.
[Hermioner@localhost Documents]$ cat data11.txt
Zhang,San A
Li,Si B
Wang,San B
[Hermioner@localhost Documents]$ cat data1.txt
this is 1 line.
this is 2 line.
this is 3 line.
this is 4 line.
[Hermioner@localhost Documents]$ sed '/2 line/r data11.txt' data1.txt
this is 1 line.
this is 2 line.
Zhang,San A
Li,Si B
Wang,San B
this is 3 line.
this is 4 line.
[Hermioner@localhost Documents]$
To add the text at the end of the data stream, just use the dollar sign ($) address.
[Hermioner@localhost Documents]$ cat data1.txt
this is 1 line.
this is 2 line.
this is 3 line.
this is 4 line.
[Hermioner@localhost Documents]$ cat data11.txt
Zhang,San A
Li,Si B
Wang,San B
[Hermioner@localhost Documents]$ sed '$r data11.txt' data1.txt
this is 1 line.
this is 2 line.
this is 3 line.
this is 4 line.
Zhang,San A
Li,Si B
Wang,San B
[Hermioner@localhost Documents]$
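The r command pairs well with d to fill a placeholder: read the file after the placeholder line, then delete the placeholder itself. The sketch below assumes a placeholder word LIST and two file names (names.txt and notice.txt) that are invented for the example.

```shell
workdir=$(mktemp -d)
printf 'Zhang,San A\nLi,Si B\n' > "$workdir/names.txt"
printf 'Attendees:\nLIST\nEnd of list.\n' > "$workdir/notice.txt"

# On the line containing LIST: queue names.txt for output, then delete
# the placeholder line so only the file contents appear in its place.
out=$(cd "$workdir" && sed '/LIST/{
r names.txt
d
}' notice.txt)
printf '%s\n' "$out"

rm -rf "$workdir"
```

The r command sits on its own line inside the braces because the file name extends to the end of the line.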
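Summary: that covers most of the basics of sed. As a stream editor, sed processes data quickly and automatically as it reads it. All you have to supply are the editing commands that tell it how to process the data.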
References:
Linux Command Line and Shell Scripting Bible, 3rd Edition, by Richard Blum and Christine Bresnahan; Chinese edition (Linux命令行与shell脚本编程大全(第3版)) translated by 门佳 and 武海峰.