sed (reposted)

http://bbs.chinaunix.net/forum.php?mod=viewthread&tid=336126

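I am a beginner. My translation is rough and the annotations are sketchy; much of it is simply my own understanding. I have only just started learning sed, so many of the notes are elementary and a little long-winded, and toward the end I was getting dizzy. There are several commands I could not explain clearly, so corrections are very welcome.
     Criticism is welcome too!
FILE SPACING:
# double space a file
sed G
### The G command is documented as "append hold space to pattern space":
### it appends a newline plus the contents of the hold space to the pattern
### space. The hold space starts out empty, so every line gains one empty
### line after it, which double-spaces the file.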
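### A minimal sketch of the effect, assuming a made-up two-line input (the
### sample text and the variable name "lines" are only for illustration):

```shell
# Feed two lines through sed G; each line gains an empty line after it.
printf 'a\nb\n' | sed G

# Count the output lines: 2 input lines become 4.
lines=$(printf 'a\nb\n' | sed G | wc -l | tr -d ' ')
echo "$lines"
```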
# double space a file which already has blank lines in it. Output file
# should contain no more than one blank line between lines of text.
sed '/^$/d;G'
### /^$/d first deletes every empty line; G then adds one blank line after
### each line that is left.
# triple space a file
sed 'G;G'
### Simply G twice.
# undo double-spacing (assumes even-numbered lines are always blank)
sed 'n;d'
### n is documented as "Read the next line of input into the pattern space"
### (with auto-print on, the current line is printed first). So n prints the
### current line and reads the next one, which d then deletes. To delete
### every third line, use sed 'n;n;d' (the commands are separated by
### semicolons, not commas). For every 100th line there is no need to write
### 99 n's: GNU sed accepts first~step addresses, e.g. gsed '0~100d'.
# insert a blank line above every line which matches "regex"
sed '/regex/{x;p;x;}'
### x: "Exchange the contents of the hold and pattern spaces."
### p: "Print the current pattern space."
### The hold space starts out empty, and the pattern space holds the line
### that matched /regex/.
### 1) x swaps them: the pattern space is now empty and the hold space
###    contains the matching line.
### 2) p prints the (empty) pattern space, which produces the blank line.
### 3) x swaps back; the matching line is then printed by the normal
###    auto-print at the end of the cycle.
### The result is a blank line inserted just above each matching line.
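### A small sketch of the three steps, assuming a made-up three-line input
### whose middle line contains the literal text "regex":

```shell
# The blank line appears directly above the matching line only.
out=$(printf 'a\nregex\nb\n' | sed '/regex/{x;p;x;}')
printf '%s\n' "$out"
# prints:
# a
# (empty line)
# regex
# b
```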
# insert a blank line below every line which matches "regex"
sed '/regex/G'
### Simple: on each matching line, G appends an empty line.
# insert a blank line above and below every line which matches "regex"
sed '/regex/{x;p;x;G;}'
### sed '/regex/G' and sed '/regex/{x;p;x;}' working together.
NUMBERING:
# number each line of a file (simple left alignment). Using a tab (see
# note on '\t' at end of file) instead of space will preserve margins.
sed = filename | sed 'N;s/\n/\t/'
### sed = prints the current line number. It prints the number on a line of
### its own before each input line, rather than at the start of the line.
### N: "Append the next line of input into the pattern space." In the second
### sed this joins each number line and the text line that follows it, with
### a newline between them.
### s/regexp/replacement/: "Attempt to match regexp against the pattern
### space. If successful, replace that portion matched with replacement."
### In other words, search and replace. \n is the newline and \t is a tab,
### so the embedded newline is replaced by a tab.
# number each line of a file (number on left, right-aligned)
sed = filename | sed 'N; s/^/     /; s/ *\(.\{6,\}\)\n/\1  /'
### The first part is as above. s/^/     / then puts five spaces in front of
### the number. The last substitution matches any leading spaces, captures
### at least six characters immediately before the newline as \1, and
### replaces the whole match with \1 followed by two spaces. Because " *" is
### greedy it swallows the surplus spaces, so the number ends up
### right-aligned in a six-character column.
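### A sketch of the right alignment, assuming a made-up ten-line input so
### that a one-digit and a two-digit line number can be compared (only the
### last two output lines are shown):

```shell
# Number ten lines, then keep the last two to compare "9" and "10".
out=$(printf '%s\n' a b c d e f g h i j |
  sed = | sed 'N; s/^/     /; s/ *\(.\{6,\}\)\n/\1  /' | tail -n 2)
printf '%s\n' "$out"
# prints:
#      9  i
#     10  j
```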
# number each line of file, but only print numbers if line is not blank
sed '/./=' filename | sed '/./N; s/\n/ /'
### sed '/./=' prints a line number before each non-empty line only. In the
### second sed, /./N appends the following line to each non-empty line (the
### number), and the newline between them is replaced with a space.
# count lines (emulates "wc -l")
sed -n '$='
### -n means "suppress automatic printing of pattern space". In '$=', $
### matches the last line and = prints the line number, as described above.
### So only the number of the last line is printed, which is "wc -l".
TEXT CONVERSION AND SUBSTITUTION:
# IN UNIX ENVIRONMENT: convert DOS newlines (CR/LF) to Unix format
sed 's/.$//' # assumes that all lines end with CR/LF
### On Unix a line ends at the LF, so a line from a DOS file still carries
### the CR as its last character. ".$" matches the last character of each
### line, so deleting it removes the CR. CR is the ASCII carriage return,
### LF the ASCII linefeed.
sed 's/^M$//' # in bash/tcsh, press Ctrl-V then Ctrl-M
### Plain search and replace. The "^M" must be typed as Ctrl-V followed by
### Ctrl-M (a literal CR); typing a caret and a capital M replaces nothing.
sed 's/\x0D$//' # gsed 3.02.80, but top script is easier
### \x0D is the hexadecimal escape for CR (ASCII 13), which GNU sed accepts
### in a regexp; the command deletes a CR at the end of each line.
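### A sketch that avoids both the Ctrl-V trick and the GNU-only \x0D escape,
### assuming the shell's printf understands '\r'. The variable names "cr",
### "out" and "left" are made up for illustration:

```shell
# Put a literal carriage return into a shell variable.
cr=$(printf '\r')

# Strip a trailing CR from every line of a made-up DOS-style input.
out=$(printf 'one\r\ntwo\r\n' | sed "s/$cr\$//")
printf '%s\n' "$out"

# Count the CR bytes that survive the conversion.
left=$(printf 'one\r\ntwo\r\n' | sed "s/$cr\$//" | tr -cd '\r' | wc -c | tr -d ' ')
echo "$left"
```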
# IN UNIX ENVIRONMENT: convert Unix newlines (LF) to DOS format
sed "s/$/`echo -e \\\r`/" # command line under ksh
sed 's/$'"/`echo \\\r`/" # command line under bash
sed "s/$/`echo \\\r`/" # command line under zsh
sed 's/$/\r/' # gsed 3.02.80
### All four append \r (the ASCII CR) to the end of every line; they differ
### only in how each shell produces the CR character.
# IN DOS ENVIRONMENT: convert Unix newlines (LF) to DOS format
sed "s/$//" # method 1
sed -n p # method 2
### Both look like no-ops: the first replaces the end of line with nothing,
### the second suppresses auto-print and then prints. Presumably they work
### because a DOS build of sed writes every output line with a CR/LF
### ending, so passing the text through unchanged is enough.
# IN DOS ENVIRONMENT: convert DOS newlines (CR/LF) to Unix format
# Cannot be done with DOS versions of sed. Use "tr" instead.
tr -d \r <infile >outfile # GNU tr version 1.22 or higher
# delete leading whitespace (spaces, tabs) from front of each line
# aligns all text flush left
sed 's/^[ \t]*//' # see note on '\t' at end of file
### Again a replacement with nothing. ^[ \t]* matches any run of spaces and
### tabs at the start of a line.
# delete trailing whitespace (spaces, tabs) from end of each line
sed 's/[ \t]*$//' # see note on '\t' at end of file
### The mirror image of the command above.
# delete BOTH leading and trailing whitespace from each line
sed 's/^[ \t]*//;s/[ \t]*$//'
### The two steps combined.
# insert 5 blank spaces at beginning of each line (make page offset)
sed 's/^/     /'
### Nothing to add.
# align all text flush right on a 79-column width
sed -e :a -e 's/^.\{1,78\}$/ &/;ta' # set at 78 plus 1 space
### This one looks complicated but is fun once understood. It introduces
### four new things: ":", "&", "-e" and "t".
### 1. ":" Label for b and t commands.
### 2. "&" in the replacement stands for the whole text that was matched.
### 3. "-e" add the script to the commands to be executed.
### 4. t label: If a s/// has done a successful substitution since the last
###    input line was read and since the last t or T command, then branch to
###    label; if label is omitted, branch to end of script.
###    ("Omitted" means no label is written after t, not that the
###    substitution failed.)
### The whole command is a loop: while the line is 1 to 78 characters long,
### one space is put in front of it and t jumps back to :a. Each line needs
### (79 - its length) substitutions, so on a large file or on very short
### lines this is slow.
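### The same loop is easier to watch on a narrow column. A sketch that
### right-aligns to 10 columns instead of 79 (the width and the sample
### lines are assumptions chosen only to keep the output short):

```shell
# Lines of 1..9 characters get one leading space per pass until they are
# 10 characters wide.
out=$(printf 'abc\nabcdefgh\n' | sed -e :a -e 's/^.\{1,9\}$/ &/;ta')
printf '%s\n' "$out"
# prints:
#        abc
#   abcdefgh
```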
# center all text in the middle of 79-column width. In method 1,
# spaces at the beginning of the line are significant, and trailing
# spaces are appended at the end of the line. In method 2, spaces at
# the beginning of the line are discarded in centering the line, and
# no trailing spaces appear at the end of lines.
sed -e :a -e 's/^.\{1,77\}$/ & /;ta' # method 1
sed -e :a -e 's/^.\{1,77\}$/ &/;ta' -e 's/\( *\)\1/\1/' # method 2
### Much like the previous command. Method 1 adds a space on both sides in
### each pass, so it needs about half as many passes. Method 2 first pads
### on the left only, then s/\( *\)\1/\1/ halves the run of leading spaces.
# substitute (find and replace) "foo" with "bar" on each line
sed 's/foo/bar/' # replaces only 1st instance in a line
sed 's/foo/bar/4' # replaces only 4th instance in a line
sed 's/foo/bar/g' # replaces ALL instances in a line
### These three are simple enough.
sed 's/\(.*\)foo\(.*foo\)/\1bar\2/' # replace the next-to-last case
### About s///: "The replacement may contain the special character & to
### refer to that portion of the pattern space which matched, and the
### special escapes \1 through \9 to refer to the corresponding matching
### sub-expressions in the regexp."
### So the regexp can be split into sub-expressions with \( and \), and the
### replacement can refer to them as \1 to \9, which makes the command much
### more flexible.
### Without the backslashes the regexp reads (.*)foo(.*foo), where .* means
### zero or more characters. Because the first .* is greedy, the final
### "foo" is taken by \2 and the "foo" in the middle is the next-to-last
### one; \1bar\2 replaces that one and leaves the last one unchanged.
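### A quick check of the greedy matching, assuming a made-up line with
### three occurrences of "foo":

```shell
# Only the middle (next-to-last) "foo" is replaced.
out=$(echo 'foo foo foo' | sed 's/\(.*\)foo\(.*foo\)/\1bar\2/')
echo "$out"
# prints: foo bar foo
```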
sed 's/\(.*\)foo/\1bar/' # replace only the last case
### Simpler than the previous one.
# substitute "foo" with "bar" ONLY for lines which contain "baz"
sed '/baz/s/foo/bar/g'
### /baz/ selects the lines; the rest does the substitution.
# substitute "foo" with "bar" EXCEPT for lines which contain "baz"
sed '/baz/!s/foo/bar/g'
### The opposite selection.
# change "scarlet" or "ruby" or "puce" to "red"
sed 's/scarlet/red/g;s/ruby/red/g;s/puce/red/g' # most seds
### Three substitutions in a row.
gsed 's/scarlet\|ruby\|puce/red/g' # GNU sed only
# reverse order of lines (emulates "tac")
# bug/feature in HHsed v1.5 causes blank lines to be deleted
sed '1!G;h;$!d' # method 1
###
### Start with 1!G. "!" applies the command to every line that is NOT
### selected by the address, and 1 selects the first line. G appends the
### contents of the hold space (a buffer in memory) to the pattern space.
### h copies the pattern space into the hold space.
### To see what sed '1!G' does on its own, suppose the file is
### $ cat test.txt
### 1
### 2
### 3
### 4
### Then the result of sed '1!G' test.txt is
### $ sed '1!G' test.txt
### 1
### 2
###
### 3
###
### 4
###
### $
### Every line except the first gets an empty line after it, because the
### hold space is empty by default. Now add h:
### $ sed '1!G;h' test.txt
### 1
### 2
### 1
### 3
### 2
### 1
### 4
### 3
### 2
### 1
### $
### The empty lines are gone. This is how I understand it:
### sed keeps the line being processed in a temporary buffer called the
### pattern space. When the script has finished with it, the pattern space
### is sent to the screen and then cleared for the next line.
###
### On line 1 the address matches, so "1!G" does nothing. "h" copies the
### pattern space (line 1) into the hold space, replacing the empty
### content, and the pattern space is printed: 1.
### On line 2 the address 1 does not match, so G runs: the hold space
### (line 1) is appended to the pattern space, giving 2 followed by 1. h
### copies that into the hold space and it is printed: 2, 1.
### On line 3 the hold space holds lines 2 and 1, which are appended after
### line 3: 3, 2, 1. That is copied to the hold space, and so on, which
### gives the output above.
### When I replaced the 1 in the command with 2, 3 or 4, the output was what
### this explanation predicts.
### Finally "$!d" deletes the pattern space on every line except the last,
### so only the final, fully reversed result is printed. That is the effect
### of tac.
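### The complete command on a small made-up input, as a sketch of the final
### result once "$!d" is added:

```shell
# Only the last cycle prints, and by then the pattern space holds every
# line in reverse order.
out=$(printf '1\n2\n3\n4\n' | sed '1!G;h;$!d')
printf '%s\n' "$out"
# prints:
# 4
# 3
# 2
# 1
```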
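sed -n '1!G;h;$p' # method 2
### Same idea: -n suppresses printing and $p prints only on the last line.
# reverse each character on the line (emulates "rev")
sed '/\n/!G;s/\(.\)\(.*\n\)/&\2\1/;//D;s/.//'
### This command really is something. I could not fully explain it at
### first, so this is my reading of it.
### '/\n/!G' checks whether the pattern space contains a newline; if it
### does not, G runs and appends one (the hold space is empty).
### 's/\(.\)\(.*\n\)/&\2\1/' keeps the matched text (&) and appends \2
### (everything from the second character through the newline) followed by
### \1 (the first character).
### //D reuses the last regexp. If it still matches, D deletes up to the
### first newline and, since the pattern space is not empty, restarts the
### script without reading new input.
### s/.// deletes the first character.
### Suppose the line is 123. The pattern space then changes like this:
### 123\n
### 123\n23\n1
### 23\n1
### 23\n3\n21
### 3\n21
### 3\n\n321
### \n321
### 321
### What puzzled me was why s/.// does not run on the first pass, since
### that would spoil the result. It is never reached while //D matches,
### because D restarts the script from the top.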

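Originally posted by "waker":
# reverse each character on the line (emulates "rev")
sed '/\n/!G;s/\(.\)\(.*\n\)/&\2\1/;//D;s/.//'
### Suppose the line is 123.
### The pattern space then changes as follows.
Run /\n/!G; to get
123\n
Then s/\(.\)\(.*\n\)/&\2\1/;
gives
123\n23\n1
Run //D
23\n1
Because it is the D command, the script starts again from the top.
The pattern space now contains \n,
so the G in /\n/!G; does not run.
s again...
23\n3\n21
D again
3\n21
Loop, G does not run.
s again...
3\n\n321
D again
\n321
Loop.
G does not run, s no longer matches, so //D does not fire either.
The final s/.// runs:
321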
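### The trace above can be checked directly. A sketch, assuming a sed in
### which "." also matches an embedded newline (as GNU sed does); the
### second sample word is made up:

```shell
# Reverse the characters of each line.
out1=$(echo 123 | sed '/\n/!G;s/\(.\)\(.*\n\)/&\2\1/;//D;s/.//')
out2=$(echo hello | sed '/\n/!G;s/\(.\)\(.*\n\)/&\2\1/;//D;s/.//')
echo "$out1"   # prints: 321
echo "$out2"   # prints: olleh
```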

# join pairs of lines side-by-side (like "paste")
sed '$!N;s/\n/ /'
### With GNU sed, sed 'N;s/\n/ /' gives the same result. The $! matters
### when the file has an odd number of lines: most seds exit without
### printing when N is run on the last line, so the last line would be
### lost. $!N skips N on the last line.
# if a line ends with a backslash, append the next line to it
sed -e :a -e '/\\$/N; s/\\\n//; ta'
### A loop: if the line ends with a backslash, N appends the next line; s
### removes the backslash and the newline; ta repeats while the
### substitution succeeds.
# if a line begins with an equal sign, append it to the previous line
# and replace the "=" with a single space
sed -e :a -e '$!N;s/\n=/ /;ta' -e 'P;D'
### Similar to the one above, with two new commands:
### P: "Print up to the first embedded newline of the current pattern
### space." It prints the first line of the pattern space.
### D: "Delete up to the first embedded newline in the pattern space. Start
### next cycle, but skip reading from the input if there is still data in
### the pattern space." It deletes the first line of the pattern space and
### starts a new cycle, without reading input if data is left.
# add commas to numeric strings, changing "1234567" to "1,234,567"
gsed ':a;s/\B[0-9]\{3\}\>/,&/;ta' # GNU sed
sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/;ta' # other seds
### \(.*[0-9]\) is zero or more characters (digits included) followed by
### one digit, and \([0-9]\{3\}\) is three digits. The substitution is
### repeated until it fails, that is, until no run of four or more
### consecutive digits is left.
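### A sketch of the portable version at work on the number from the
### comment; the intermediate value noted below is what the first pass
### leaves behind:

```shell
# Pass 1 turns 1234567 into 1234,567; pass 2 turns that into 1,234,567;
# pass 3 finds no digit followed by three more digits, so the loop ends.
out=$(echo 1234567 | sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/;ta')
echo "$out"
# prints: 1,234,567
```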
# add commas to numbers with decimal points and minus signs (GNU sed)
gsed ':a;s/\(^\|[^0-9.]\)\([0-9]\+\)\([0-9]\{3\}\)/\1\2,\3/g;ta'
### I have no gsed, so no annotation here.
# add a blank line every 5 lines (after lines 5, 10, 15, 20, etc.)
gsed '0~5G' # GNU sed only
sed 'n;n;n;n;G;' # other seds
### The same idea as 'n;d' earlier.
SELECTIVE PRINTING OF CERTAIN LINES:
# print first 10 lines of file (emulates behavior of "head")
sed 10q
# print first line of file (emulates "head -1")
sed q
### q: "Immediately quit the sed script without processing any more input,
### except that if auto-print is not disabled the current pattern space
### will be printed."
### So quitting at line 10 prints the first 10 lines, and quitting at line 1
### prints only the first line.
# print the last 10 lines of a file (emulates "tail")
sed -e :a -e '$q;N;11,$D;ba'
### b label: "Branch to label; if label is omitted, branch to end of
### script."
### D deletes the pattern space up to and including the first newline.
### N appends the next line of input to the pattern space.
###

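Originally posted by "waker":
Let me try to annotate this; I am not sure it is right.
Look first at just sed -e :a -e '$q;N;ba'
This loop keeps reading the next line until the end of input, so the whole text becomes one chain of lines separated by \n.
Now add 11,$D
sed -e :a -e '$q;N;11,$D;ba'
If the text has no more than 10 lines,
the pattern space keeps the whole text and prints it.
If the text has more than 10 lines,
then from line 11 on, each time the next line is added to the chain, the first \n-delimited record in the pattern space is deleted. The head of the chain is pushed out by the new tail, so the chain always keeps 10 links. When the loop ends, the chain holds the last 10 lines of the file, and they are printed.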
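### The sliding window is easier to follow with a smaller size. A sketch
### that keeps the last 3 lines instead of 10, so the address becomes 4,$
### (the window size and the five-line input are assumptions made for
### brevity):

```shell
# From line 4 on, every N is followed by a D that drops the oldest line,
# so the pattern space never holds more than 3 lines.
out=$(printf '1\n2\n3\n4\n5\n' | sed -e :a -e '$q;N;4,$D;ba')
printf '%s\n' "$out"
# prints:
# 3
# 4
# 5
```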

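# print the last 2 lines of a file (emulates "tail -2")
sed '$!N;$!D'
### I could not follow this at first, so here is a passage copied from the
### CU (ChinaUnix) digest:
### sed '$!N;$!D': for lines before the next-to-last line, N puts the
### following line into the pattern space and D then deletes the contents
### of the pattern space. At the next-to-last line, the last line is
### appended below it, and D is not executed on the last line, so the last
### two lines of the file are kept.
### Either that passage is a little vague or I misread it, but saying that
### D "deletes the contents of the pattern space" is confusing.
### My understanding: D is "Delete up to the first embedded newline in the
### pattern space", so it deletes everything up to the first newline, that
### is, the first line. The description continues with a sentence I think
### is important: "Start next cycle, but skip reading from the input if
### there is still data in the pattern space."
### How does it work in practice? Suppose the file is
### $ cat test.txt
###   1
###   2
###   3
###   4
###   5
### On line 1, $!N appends line 2 to the pattern space, then $!D deletes
### line 1 so that only line 2 is left. Because D starts the next cycle
### right away, the pattern space is not printed. Since data is left in the
### pattern space, no new line is read at the start of that cycle; $!N
### appends line 3 after line 2, and so on.
### When line 5 has been appended after line 4, $!D is not executed because
### this is the last line, so lines 4 and 5 are kept. The script ends and
### the pattern space is printed.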
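### The walk-through above, run on the same five-line file content (given
### here through printf instead of test.txt, purely as a sketch):

```shell
# Only the pattern space of the final cycle, lines 4 and 5, is printed.
out=$(printf '1\n2\n3\n4\n5\n' | sed '$!N;$!D')
printf '%s\n' "$out"
# prints:
# 4
# 5
```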
# print the last line of a file (emulates "tail -1")
sed '$!d' # method 1
sed -n '$p' # method 2
### At last one I understood right away: delete every line except the
### last, or print only the last line.
# print only lines which match regular expression (emulates "grep")
sed -n '/regexp/p' # method 1
sed '/regexp/!d' # method 2
### Once the -n option and the p and d commands are clear, so are these.
# print only lines which do NOT match regexp (emulates "grep -v")
sed -n '/regexp/!p' # method 1, corresponds to above
sed '/regexp/d' # method 2, simpler syntax
### The reverse of the above, as the comments say.
# print the line immediately before a regexp, but not the line
# containing the regexp
sed -n '/regexp/{g;1!p;};h'
### h at the end copies every line into the hold space, so when the line
### containing regexp is reached, the hold space still has the previous
### line. g fetches it and p prints it. 1!p skips the print when the match
### is on line 1 and there is no previous line.
# print the line immediately after a regexp, but not the line
# containing the regexp
sed -n '/regexp/{n;p;}'
### Similar to the above and simpler: n reads the next line, p prints it.
# print 1 line of context before and after regexp, with line number
# indicating where the regexp occurred (similar to "grep -A1 -B1")
sed -n -e '/regexp/{=;x;1!p;g;$!N;p;D;}' -e h
### This looks complicated but is not hard to explain.
### Suppose the file is
###$ cat test.txt
### 1 abc
### 2 cde
### 3 regexp
### 4 fgh
### 5 xyz
### On the lines before the regexp line the commands in braces are skipped
### and only h runs.
###   command         pattern space       hold space    output
###   (line 2) h      2 cde               2 cde
###   (line 3) =      3 regexp            2 cde         3
###   x               2 cde               3 regexp
###   1!p             2 cde               3 regexp      2 cde
###   g               3 regexp            3 regexp
###   $!N             3 regexp\n4 fgh     3 regexp
###   p               3 regexp\n4 fgh     3 regexp      3 regexp
###                                                     4 fgh
###   D               4 fgh               3 regexp
###   h               4 fgh               4 fgh
###
### The output column on the right shows the result.
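### The output column of the table can be reproduced with the same
### five-line sample (passed through printf here as a sketch, instead of
### reading test.txt):

```shell
# Line number of the match, the line before it, the match, the line after.
out=$(printf '1 abc\n2 cde\n3 regexp\n4 fgh\n5 xyz\n' |
  sed -n -e '/regexp/{=;x;1!p;g;$!N;p;D;}' -e h)
printf '%s\n' "$out"
# prints:
# 3
# 2 cde
# 3 regexp
# 4 fgh
```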
# grep for AAA and BBB and CCC (in any order)
sed '/AAA/!d; /BBB/!d; /CCC/!d'
# grep for AAA and BBB and CCC (in that order)
sed '/AAA.*BBB.*CCC/!d'
# grep for AAA or BBB or CCC (emulates "egrep")
sed -e '/AAA/b' -e '/BBB/b' -e '/CCC/b' -e d # most seds
gsed '/AAA\|BBB\|CCC/!d' # GNU sed only
### Nothing special in these three: they are plain searches.
# print paragraph if it contains AAA (blank lines separate paragraphs)
# HHsed v1.5 must insert a 'G;' after 'x;' in the next 3 scripts below
sed -e '/./{H;$!d;}' -e 'x;/AAA/!d;'
### The first part collects the whole paragraph in the hold space; the
### second part swaps it back at a blank line (or at the end of the file)
### and searches it.
# print paragraph if it contains AAA and BBB and CCC (in any order)
sed -e '/./{H;$!d;}' -e 'x;/AAA/!d;/BBB/!d;/CCC/!d'
### As above.
# print paragraph if it contains AAA or BBB or CCC
sed -e '/./{H;$!d;}' -e 'x;/AAA/b' -e '/BBB/b' -e '/CCC/b' -e d
gsed '/./{H;$!d;};x;/AAA\|BBB\|CCC/b;d' # GNU sed only
### As above.
# print only lines of 65 characters or longer
sed -n '/^.\{65\}/p'
### Nothing much to say: a regular expression at work.
# print only lines of less than 65 characters
sed -n '/^.\{65\}/!p' # method 1, corresponds to above
sed '/^.\{65\}/d' # method 2, simpler syntax
### Again straightforward.
# print section of file from regular expression to end of file
sed -n '/regexp/,$p'
### Note that "," selects a range of lines, here from the first line
### containing regexp to the last line.
# print section of file based on line numbers (lines 8-12, inclusive)
sed -n '8,12p' # method 1
sed '8,12!d' # method 2
# print line number 52
sed -n '52p' # method 1
sed '52!d' # method 2
sed '52q;d' # method 3, efficient on large files
### Just note that the third method is the most efficient, because q stops
### reading at line 52.
# beginning at line 3, print every 7th line
gsed -n '3~7p' # GNU sed only
sed -n '3,${p;n;n;n;n;n;n;}' # other seds
### Easy to follow by now.
# print section of file between two regular expressions (inclusive)
sed -n '/Iowa/,/Montana/p' # case sensitive
### Simple at this point.
SELECTIVE DELETION OF CERTAIN LINES:
# print all of file EXCEPT section between 2 regular expressions
sed '/Iowa/,/Montana/d'
### Just as simple as the one above.
# delete duplicate, consecutive lines from a file (emulates "uniq").
# First line in a set of duplicate lines is kept, rest are deleted.
sed '$!N; /^\(.*\)\n\1$/!P; D'
### Unless this is the last line, the next line is appended to the pattern
### space. The regexp /^\(.*\)\n\1$/ matches when the two lines in the
### pattern space are identical. If they are not, P prints the first line.
### Either way, D then deletes the first line and the cycle restarts with
### the second one.
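### A sketch on a made-up input with two runs of duplicates:

```shell
# Each run of identical consecutive lines collapses to one line.
out=$(printf 'a\na\nb\nb\nb\nc\n' | sed '$!N; /^\(.*\)\n\1$/!P; D')
printf '%s\n' "$out"
# prints:
# a
# b
# c
```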
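# delete duplicate, nonconsecutive lines from a file. Beware not to
# overflow the buffer size of the hold space, or else use GNU sed.
sed -n 'G; s/\n/&&/; /^\([ -~]*\n\).*\n\1/d; s/\n//; h; P'
### On my Linux system this fails with
### "sed: -e expression #1, char 34: Invalid range end".
### That is not a hold-space overflow. The message refers to the bracket
### range [ -~] (space through tilde, the printable ASCII characters),
### which is most likely rejected because of the locale's collation order;
### running the command with LC_ALL=C should avoid the error.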
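### A sketch of the command with the C locale forced, assuming that the
### locale is indeed what causes the range error and that "." matches an
### embedded newline (as in GNU sed). The input is made up:

```shell
# The hold space accumulates every line seen so far; a line that already
# appears there is deleted, so only first occurrences are printed.
out=$(printf 'a\nb\na\nc\nb\n' |
  LC_ALL=C sed -n 'G; s/\n/&&/; /^\([ -~]*\n\).*\n\1/d; s/\n//; h; P')
printf '%s\n' "$out"
# prints:
# a
# b
# c
```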
# delete the first 10 lines of a file
sed '1,10d'
# delete the last line of a file
sed '$d'
### Like the previous one: select by address, then delete.
# delete the last 2 lines of a file
sed 'N;$!P;$!D;$d'
### If you understood how sed '$!N;$!D' works, this one follows from it.
# delete the last 10 lines of a file
sed -e :a -e '$d;N;2,10ba' -e 'P;D' # method 1
sed -n -e :a -e '1,10!{P;N;D;};N;ba' # method 2
### Similar to printing the last 10 lines. In method 1 the loop first
### builds up a buffer of lines; after that, P;D prints and drops the
### oldest line each time a new one is read, and when the last line is
### reached $d discards the 10 lines still in the buffer.
# delete every 8th line
gsed '0~8d' # GNU sed only
sed 'n;n;n;n;n;n;n;d;' # other seds
### Nothing to add.
# delete ALL blank lines from a file (same as "grep '.' ")
sed '/^$/d' # method 1
sed '/./!d' # method 2
### Method 1 deletes lines with no content; method 2 keeps lines that have
### content.
# delete all CONSECUTIVE blank lines from file except the first; also
# deletes all blank lines from top and end of file (emulates "cat -s")
sed '/./,/^$/!d' # method 1, allows 0 blanks at top, 1 at EOF
sed '/^$/N;/\n$/D' # method 2, allows 1 blank at top, 0 at EOF
### Both select first and delete afterwards.
# delete all CONSECUTIVE blank lines from file except the first 2:
sed '/^$/N;/\n$/N;//D'
### Like the previous command, with one more step.
# delete all leading blank lines at top of file
sed '/./,$!d'
### Lines from the first one with any character through the last line are
### kept; everything else is deleted.
# delete all trailing blank lines at end of file
sed -e :a -e '/^\n*$/{$d;N;ba' -e '}' # works on all seds
sed -e :a -e '/^\n*$/N;/\n$/ba' # ditto, except for gsed 3.02*
### I could not work these out at first. As far as I can tell, /^\n*$/
### matches a pattern space that consists only of newlines, that is, a run
### of empty lines. In method 1, if such a run reaches the last line, $d
### deletes it; otherwise N appends the next line and ba loops. Once a
### non-empty line is appended the regexp no longer matches, and the
### collected lines are printed as usual.
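### A sketch of method 1, assuming a made-up input with one blank line in
### the middle and two at the end. The line count is used for the check
### because shell command substitution would hide trailing blank lines:

```shell
# The blank line between "a" and "b" survives; the two trailing ones do not.
printf 'a\n\nb\n\n\n' | sed -e :a -e '/^\n*$/{$d;N;ba' -e '}'

# Five input lines, three output lines.
lines=$(printf 'a\n\nb\n\n\n' | sed -e :a -e '/^\n*$/{$d;N;ba' -e '}' | wc -l | tr -d ' ')
echo "$lines"
```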
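# delete the last line of each paragraph
sed -n '/^$/{p;h;};/./{x;/./p;}'
### This assumes that paragraphs are separated by empty lines.
### On a non-empty line, the pattern space and the hold space are swapped,
### and if the pattern space is then non-empty (it holds the previous
### line) it is printed. Each line is therefore printed one step late. On
### an empty line, the empty line is printed and copied to the hold space,
### so the line still waiting there, the last line of the paragraph, is
### never printed.
### That is the end of the annotated part. The special applications below
### are not annotated.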
SPECIAL APPLICATIONS:
# remove nroff overstrikes (char, backspace) from man pages. The 'echo'
# command may need an -e switch if you use Unix System V or bash shell.
sed "s/.`echo \\\b`//g" # double quotes required for Unix environment
sed 's/.^H//g' # in bash/tcsh, press Ctrl-V and then Ctrl-H
sed 's/.\x08//g' # hex expression for sed v1.5
# get Usenet/e-mail message header
sed '/^$/q' # deletes everything after first blank line
# get Usenet/e-mail message body
sed '1,/^$/d' # deletes everything up to first blank line
# get Subject header, but remove initial "Subject: " portion
sed '/^Subject: */!d; s///;q'
# get return address header
sed '/^Reply-To:/q; /^From:/h; /./d;g;q'
# parse out the address proper. Pulls out the e-mail address by itself
# from the 1-line return address header (see preceding script)
sed 's/ *(.*)//; s/>.*//; s/.*[:<] *//'
# add a leading angle bracket and space to each line (quote a message)
sed 's/^/> /'
# delete leading angle bracket & space from each line (unquote a message)
sed 's/^> //'
# remove most HTML tags (accommodates multiple-line tags)
sed -e :a -e 's/<[^>]*>//g;/</N;//ba'
# extract multi-part uuencoded binaries, removing extraneous header
# info, so that only the uuencoded portion remains. Files passed to
# sed must be passed in the proper order. Version 1 can be entered
# from the command line; version 2 can be made into an executable
# Unix shell script. (Modified from a script by Rahul Dhesi.)
sed '/^end/,/^begin/d' file1 file2 ... fileX | uudecode # vers. 1
sed '/^end/,/^begin/d' "$@" | uudecode # vers. 2
# zip up each .TXT file individually, deleting the source file and
# setting the name of each .ZIP file to the basename of the .TXT file
# (under DOS: the "dir /b" switch returns bare filenames in all caps)
echo @echo off >zipup.bat
dir /b *.txt | sed "s/^\(.*\)\.TXT/pkzip -mo \1 \1.TXT/" >>zipup.bat

posted @ 2012-07-19 17:55 RuiWang
