grep and Regular Expressions: Supplementary Notes


Regular expressions

About square brackets

Square brackets in shell globs (wildcards)

[15:34root@www.zhanghehe.cn /tmp]# touch file{a,A,b,B,c,C,d,D}.txt
[15:34root@www.zhanghehe.cn /tmp]# ls
filea.txt  fileA.txt  fileb.txt  fileB.txt  filec.txt  fileC.txt  filed.txt  fileD.txt
[15:34root@www.zhanghehe.cn /tmp]# ls fiel[a-c].txt
ls: cannot access fiel[a-c].txt: No such file or directory
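# Note: in a glob, the range a-c inside square brackets may match more than lowercase a, b and c. In this session's locale it also matches uppercase A and B, because the range follows the locale's collation order (a A b B c); with LC_ALL=C it would match lowercase only. See below: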
[15:35root@www.zhanghehe.cn /tmp]# ls file[a-c].txt
filea.txt  fileA.txt  fileb.txt  fileB.txt  filec.txt
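# ls accepts globs, but a glob only expands to names that already exist. Commands that create things, such as touch or mkdir, cannot use it to generate names: with no match, the pattern is passed through literally, as shown below (brace expansion, as in the first touch command above, is the tool for generating names):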
[15:39root@www.zhanghehe.cn /data]# touch file[a-c].txt
[15:39root@www.zhanghehe.cn /data]# ls
file[a-c].txt
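
The behaviour above can be reproduced in a scratch directory. This is a minimal sketch, assuming the shell's default settings (no nullglob or failglob); the names scratch, matched and unmatched are made up for illustration.

```shell
# Work in a scratch directory so nothing else is touched.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Create the files by name (in bash, touch file{a,b,c}.txt does the same).
touch filea.txt fileb.txt filec.txt

# A glob expands only to names that already exist.
matched=$(echo file[a-c].txt)
echo "$matched"

# With no matching file, the pattern is handed to the command unchanged,
# which is why touch file[a-c].txt creates one oddly named file.
unmatched=$(echo nomatch[a-c].txt)
echo "$unmatched"
```

The second echo prints the pattern itself, which is exactly what touch received in the transcript above.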

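Square brackets in a regular expression are strict: a lowercase range matches lowercase only. Quote the pattern, otherwise the shell may expand it as a glob before grep ever sees it; the first, unquoted command below prints nothing for that reason.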

[15:40root@www.zhanghehe.cn /tmp]# ls
filea.txt  fileA.txt  fileb.txt  fileB.txt  filec.txt  fileC.txt  filed.txt  fileD.txt
[15:40root@www.zhanghehe.cn /tmp]# ls | grep file[a-c].txt
[15:40root@www.zhanghehe.cn /tmp]# ls | grep "file[a-c].txt"
filea.txt
fileb.txt
filec.txt
[15:40root@www.zhanghehe.cn /tmp]# ls | grep "file[A-C].txt"
fileA.txt
fileB.txt
fileC.txt
[15:41root@www.zhanghehe.cn /tmp]# ls | grep "file[A-Ca-c].txt"
filea.txt
fileA.txt
fileb.txt
fileB.txt
filec.txt
fileC.txt
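
A small sketch of the same ranges on inline data, so it does not depend on the contents of /tmp. It assumes a grep that honours LC_ALL=C, which pins the range to plain ASCII order; the variable names are illustrative. The dot is escaped here so that it matches a literal dot rather than any character.

```shell
names='filea.txt
fileA.txt
fileb.txt
fileB.txt'

# Lowercase range only: uppercase names are not selected.
lower=$(printf '%s\n' "$names" | LC_ALL=C grep 'file[a-c]\.txt')
echo "$lower"

# Two ranges in one bracket expression cover both cases.
both=$(printf '%s\n' "$names" | LC_ALL=C grep -c 'file[A-Ca-c]\.txt')
echo "$both"
```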

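About the ^ character

This character does not mean the same thing inside and outside square brackets. Inside, as in [^a-z], it negates the set: any character that is not a lowercase letter. Outside, it anchors the match to the start of the line, so ^[^a-z] matches a line whose first character is not a lowercase letter. For example: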

[16:06root@www.zhanghehe.cn /data]# ls
a.txt  A.txt  b.txt  B.txt  c.txt  C.txt  d.txt  D.txt
[16:06root@www.zhanghehe.cn /data]# ls | grep '^[^a-z].*'
A.txt
B.txt
C.txt
D.txt
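
The two roles of ^ can be seen side by side in a short sketch. The sample strings and variable names are invented for illustration, and LC_ALL=C is set so that a-z means ASCII lowercase.

```shell
sample='a.txt
A.txt
1abc'

# Outside brackets ^ anchors to the start of the line;
# inside brackets it negates the set.
starts_nonlower=$(printf '%s\n' "$sample" | LC_ALL=C grep '^[^a-z]')
echo "$starts_nonlower"

# Without the anchor, [^a-z] matches every non-lowercase character anywhere.
non_lower=$(printf 'aB1\n' | LC_ALL=C grep -o '[^a-z]' | paste -s -d' ' -)
echo "$non_lower"
```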

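About dot-star (.*)

At first I did not understand why, if a dot stands for any single character and an asterisk for any number of repetitions, the two together mean any character repeated any number of times. It gradually became clear: the dot matches one character at a time, but that character can be anything. Take the string abcd and match it against a single dot.

What is the result? Every character of abcd is matched, each one separately. Add the asterisk, and it is easy to see that dot-star means any characters, any number of times.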
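
A quick way to check this is grep -o, which prints each match on its own line. The sketch below is illustrative only; the variable names are assumptions.

```shell
# A single dot matches one character at a time: four separate matches.
singles=$(printf 'abcd\n' | grep -o '.' | paste -s -d' ' -)
echo "$singles"

# Dot-star matches the whole run in a single match.
whole=$(printf 'abcd\n' | grep -o '.*')
echo "$whole"
```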

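More on square brackets

When a dot appears inside square brackets, what does it mean? Is it a plain dot, or does it still stand for any single character?

The answer is that a dot inside square brackets really is just a dot, with no "any character" meaning. Only a dot outside square brackets stands for any single character.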
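
A minimal sketch of the difference, using two made-up sample lines:

```shell
sample='a.b
axb'

# Inside brackets the dot is literal: only the line with a real dot matches.
literal=$(printf '%s\n' "$sample" | grep 'a[.]b')
echo "$literal"

# Outside brackets the dot matches any character: both lines match.
anychar=$(printf '%s\n' "$sample" | grep -c 'a.b')
echo "$anychar"
```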

Greedy and lazy matching

# Matching beyond the shortest string that would satisfy the pattern is greedy matching, as shown below
[16:14root@www.zhanghehe.cn ~]# echo "googolgooe" | grep "go*"
googolgooe

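A string containing go is already satisfied by the first go, but * keeps extending each match as far as it can (goo, go, goo).

# \? limits the preceding character to zero or one occurrence, so each match is at most go. This is a bounded quantifier rather than true lazy matching; lazy quantifiers such as *? need PCRE (grep -P in GNU grep).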
[16:14root@www.zhanghehe.cn ~]# echo "googolgooe" | grep "go\?"
googolgooe
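
Because grep prints the whole matching line, both commands above produce the same output; grep -o shows what each pattern actually matched. The sketch below assumes GNU grep, and the lazy example runs only if that grep was built with PCRE support (-P). Variable names are illustrative.

```shell
# Greedy star: each match takes as many o's as are available.
greedy=$(printf 'googolgooe\n' | grep -o 'go*' | paste -s -d' ' -)
echo "$greedy"

# \? allows at most one o, so each match is capped at "go".
optional=$(printf 'googolgooe\n' | grep -o 'go\?' | paste -s -d' ' -)
echo "$optional"

# A genuinely lazy quantifier needs PCRE; skipped if -P is unavailable.
if printf 'x\n' | grep -qP 'x' 2>/dev/null; then
  printf 'googolgooe\n' | grep -oP 'go*?'
fi
```

With the lazy quantifier each match stops as early as it can, so it consists of the g alone.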

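About words

Strings joined by an underscore, or letters followed directly by digits, do not count as two words; they are still one word.

In other words, a word is made up of letters, digits and underscores.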

[16:57root@www.zhanghehe.cn /tmp]# cat -n test.txt
     1	hello zhanghe
     2	hello zhang_he
     3	hello zhang.he
     4	hello zhang-he
     5	hello zhang@he
     6	hello zhang he
     7	hello zhang,he
     8	hello zhang:he
     9	hello zhang123
[16:57root@www.zhanghehe.cn /tmp]# cat -n test.txt | grep '\bzhang\b'
     3	hello zhang.he
     4	hello zhang-he
     5	hello zhang@he
     6	hello zhang he
     7	hello zhang,he
     8	hello zhang:he
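
The same rule can be checked on inline data. This sketch assumes GNU grep, where \b is available as an extension; the -w option expresses the same whole-word idea. The sample lines are a subset of test.txt above.

```shell
sample='hello zhang_he
hello zhang-he
hello zhang123
hello zhang he'

# Underscore and digits are word characters, so zhang_he and zhang123
# are single words; a hyphen or a space ends the word.
boundary_hits=$(printf '%s\n' "$sample" | grep -c '\bzhang\b')
echo "$boundary_hits"

# -w asks for whole-word matches without writing the boundaries by hand.
word_hits=$(printf '%s\n' "$sample" | grep -cw 'zhang')
echo "$word_hits"
```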

cut and grep

grep can replace cut to some extent, because its -o option prints only the matched part of each line, which amounts to cutting it out.

# For example, extract the IP address (the dot is escaped so that it matches a literal dot)
ifconfig ens192 | grep netmask | grep "\([0-9]\{1,3\}\.\)\{3\}[0-9]\{1,3\}" -o | head -1
192.168.88.77
ifconfig ens192 | grep netmask | grep "([0-9]{1,3}\.){3}[0-9]{1,3}" -Eo | head -1
192.168.88.77

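Extracting the IP address deserves a word, because it is really quite simple. Each part of an IP address is at most 255, so at most three digits: a digit occurs at least once and at most three times. Treat the digits plus one dot as a unit that occurs three times; the last part has no dot, so write it out by hand. Note that the pattern only checks the shape of the address, not that each part is 255 or less.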
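
The pattern can be tried without a network interface by feeding it a sample line. The sketch below uses an invented line in the style of ifconfig output, plus a hypothetical helper, is_valid_ipv4, that adds the range check the pattern itself does not perform.

```shell
line='inet 192.168.88.77  netmask 255.255.255.0  broadcast 192.168.88.255'

# -E enables extended syntax, -o prints only the matches, head keeps the first.
ip=$(printf '%s\n' "$line" | grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}' | head -n 1)
echo "$ip"

# Hypothetical helper: check the shape, then check every part is <= 255.
is_valid_ipv4() {
  printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$' || return 1
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address on dots into $1..$4
  IFS=$old_ifs
  for octet in "$@"; do
    [ "$octet" -le 255 ] || return 1
  done
  return 0
}

is_valid_ipv4 "$ip" && echo "valid"
is_valid_ipv4 999.1.1.1 || echo "rejected"
```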

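More on grep

Some grep options parallel regular-expression features. Negation is one example: grep has -v to select the lines that do not match, and a regular expression has ^ inside square brackets to negate a character set.

-r is very practical: give it a directory and a keyword, and it searches every file under that directory, as shown below. In GNU grep, lowercase -r does not follow symbolic links found inside the directory (only those named on the command line), while uppercase -R follows all of them.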

[17:26root@www.zhanghehe.cn ~]# echo $zh
tdtd
[17:26root@www.zhanghehe.cn ~]# grep -r tdtd /etc/
/etc/profile.d/test.tsh:zh=tdtd
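
The difference between -r and -R can be observed in a throwaway directory tree. This sketch assumes GNU grep (other implementations treat the two options differently); the directory layout and file names are invented for illustration.

```shell
root=$(mktemp -d)
mkdir -p "$root/conf/sub" "$root/other"
echo 'zh=tdtd' > "$root/conf/sub/test.sh"
echo 'zh=tdtd' > "$root/other/linked.sh"

# A symlink inside the searched tree, pointing outside it.
ln -s "$root/other" "$root/conf/link"

# -l lists matching files; -r skips the symlink found during recursion.
r_hits=$(grep -r -l tdtd "$root/conf" | wc -l)
echo "$r_hits"

# -R follows the symlink and finds the second file as well.
R_hits=$(grep -R -l tdtd "$root/conf" | wc -l)
echo "$R_hits"
```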

-f reads the patterns from a file, one per line. This is also very useful, for example to find the lines that two files have in common:

[17:29root@www.zhanghehe.cn /data]# cat test1.txt 
a
b
c
d
e
[17:29root@www.zhanghehe.cn /data]# cat test2.txt 
a
3
b
b
k
c
4
t
d
8
uoi
334324

[17:30root@www.zhanghehe.cn /data]# grep -f test1.txt test2.txt 
a
b
b
c
d

# uniq -d prints only the duplicated lines (note that b is also repeated within test2.txt itself)
[17:42root@www.zhanghehe.cn /data]# cat test1.txt test2.txt | sort | uniq -d
a
b
c
d
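
One caveat: patterns read with -f are regular expressions matched anywhere in the line, so a pattern such as 3 would also select 334324. A hedged sketch of a stricter variant follows, using -F (fixed strings) and -x (whole-line match) on copies of the two files; the temporary directory and variable names are assumptions.

```shell
dir=$(mktemp -d)
printf '%s\n' a b c d e > "$dir/test1.txt"
printf '%s\n' a 3 b b k c 4 t d 8 uoi 334324 > "$dir/test2.txt"

# Whole-line, fixed-string comparison; sort -u collapses the repeated b.
common=$(grep -Fxf "$dir/test1.txt" "$dir/test2.txt" | sort -u | paste -s -d' ' -)
echo "$common"

# The pitfall: as a plain pattern, 3 matches inside 334324 ...
loose=$(printf '334324\n' | grep -c -e 3)
# ... but not when the whole line must equal the fixed string.
strict=$(printf '334324\n' | grep -cFx -e 3)
echo "$loose $strict"
```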

Summing the ages

[17:51root@www.zhanghehe.cn /data]# cat age.txt 
zhangsan=10
lisi=20
wanger=13
[17:52root@www.zhanghehe.cn /data]# cat age.txt  | grep -oE '[[:digit:]]{1,2}' 
10
20
13
[17:51root@www.zhanghehe.cn /data]# cat age.txt  | grep -oE '[[:digit:]]{1,2}' | tr '\n' '+' | grep '^[0-9].*'
10+20+13+
[17:51root@www.zhanghehe.cn /data]# cat age.txt  | grep -oE '[[:digit:]]{1,2}' | tr '\n' '+' | grep '^[0-9].*[0-9]'
10+20+13+
[17:51root@www.zhanghehe.cn /data]# cat age.txt  | grep -oE '[[:digit:]]{1,2}' | tr '\n' '+' | grep '^[0-9].*[0-9]' -o
10+20+13
[17:51root@www.zhanghehe.cn /data]# cat age.txt  | grep -oE '[[:digit:]]{1,2}' | tr '\n' '+' | grep '^[0-9].*[0-9]' -o | bc
43
[17:52root@www.zhanghehe.cn /data]# grep -oE '[[:digit:]]{1,2}' age.txt| tr '\n' '+' | grep '^[0-9].*[0-9]' -o | bc
43
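
The trailing + left by tr is what forces the extra grep in the pipeline above. As an alternative sketch that avoids building an expression string at all, the numbers can be added in a shell loop; the data is a copy of age.txt and the variable names are illustrative.

```shell
ages='zhangsan=10
lisi=20
wanger=13'

# Extract each run of digits and add it to a running total.
total=0
for n in $(printf '%s\n' "$ages" | grep -oE '[0-9]+'); do
  total=$((total + n))
done
echo "$total"
```

This needs neither tr nor bc, at the cost of a loop.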

Counting connections by state

[19:04root@www.zhanghehe.cn /data]# cat ss2.log | grep -v "State" | sort | cut -d' ' -f1 | uniq -c
    118 ESTAB
      1 FIN-WAIT-1
     11 LAST-ACK
[19:09root@www.zhanghehe.cn /data]# cat ss2.log | grep -v "State" | sort | grep '^[[:alnum:]-]\+\b' -o | uniq -c
    118 ESTAB
      1 FIN-WAIT-1
     11 LAST-ACK
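
The same count can be tried on a few inline lines. The sample below is a made-up stand-in for ss2.log (the real column layout may differ), and it sorts after cutting, so that uniq always sees identical state names next to each other.

```shell
# Hypothetical sample in the style of ss output.
log='State      Recv-Q Send-Q
ESTAB      0      0
LAST-ACK   0      1
ESTAB      0      0'

# Drop the header, keep the first field, group, count, trim the padding.
counts=$(printf '%s\n' "$log" | grep -v '^State' | cut -d' ' -f1 | sort | uniq -c | sed 's/^ *//')
echo "$counts"
```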

paste

paste joins the lines of files side by side, as shown below:

[18:01root@www.zhanghehe.cn /data]# cat test1.txt 
a
b
c
d
e
[18:01root@www.zhanghehe.cn /data]# cat test2.txt 
a
3
b
b
k
c
4
t
d
8
uoi
334324

[18:01root@www.zhanghehe.cn /data]# paste test1.txt test2.txt 
a	a
b	3
c	b
d	b
e	k
	c
	4
	t
	d
	8
	uoi
	334324
[18:01root@www.zhanghehe.cn /data]# paste test2.txt test1.txt 
a	a
3	b
b	c
b	d
k	e
c	
4	
t	
d	
8	
uoi	
334324

-d specifies the delimiter placed between the joined fields

[18:02root@www.zhanghehe.cn /data]# paste test2.txt test1.txt -d*
a*a
3*b
b*c
b*d
k*e
c*
4*
t*
d*
8*
uoi*
334324*
*
[18:03root@www.zhanghehe.cn /data]# paste test1.txt -d*
a
b
c
d
e
# -s joins all the lines of one file into a single line (serial mode), using the -d delimiter between them
[18:03root@www.zhanghehe.cn /data]# paste test1.txt -d* -s
a*b*c*d*e
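
The two modes can be compared on small throwaway files. A minimal sketch, with invented file names; the delimiters are chosen so that they need no quoting.

```shell
dir=$(mktemp -d)
printf '%s\n' a b c > "$dir/left.txt"
printf '%s\n' 1 2 3 > "$dir/right.txt"

# Default (parallel) mode: line N of each file side by side.
side=$(paste -d: "$dir/left.txt" "$dir/right.txt")
echo "$side"

# Serial mode: all lines of one file joined into a single line.
serial=$(paste -s -d, "$dir/left.txt")
echo "$serial"
```

A delimiter such as * is safer written as -d'*', so the shell cannot expand it as a glob.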

# Let's look again at the age-summing example
[18:07root@www.zhanghehe.cn /data]# cat age.txt 
zhangsan=10
lisi=20
wanger=13
[18:07root@www.zhanghehe.cn /data]# egrep '[0-9]{1,}' age.txt -o 
10
20
13
[18:08root@www.zhanghehe.cn /data]# egrep '[0-9]{1,}' age.txt -o | paste -s -d+
10+20+13
[18:08root@www.zhanghehe.cn /data]# egrep '[0-9]{1,}' age.txt -o | paste -s -d+ | bc
43
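
Since paste -s leaves no trailing delimiter, the joined string is already a valid arithmetic expression. As a small sketch, the shell can evaluate it directly where bc is not installed; the variable name is illustrative and the explicit - tells paste to read standard input.

```shell
# Join the numbers with + and keep the expression in a variable.
sum_expr=$(printf '%s\n' 10 20 13 | paste -s -d+ -)
echo "$sum_expr"

# The expanded text is evaluated as integer arithmetic.
sum=$(($sum_expr))
echo "$sum"
```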

posted @ 2022-05-01 21:08 张贺贺呀