awk examples
1. Matching a time range
[root@centos-1 ~]# cat c
10:01
10:02
10:03
10:05
10:06
10:07
12:01
11:01
[root@centos-1 ~]# awk '$1>="10:05" && $1<="11:10"' c
10:05
10:06
10:07
11:01
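This comparison-based match does not depend on the order of the lines: awk tests each line on its own, as a string comparison, so 11:01 is printed even though it comes after 12:01 in the file.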
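One caveat is worth keeping in mind: because the pattern compares strings, the times need a fixed, zero-padded HH:MM format. A minimal sketch, using made-up values that are not in file c:

```shell
#!/bin/bash
# Hypothetical values for illustration only; they are not part of file c.
# "9:30" is compared character by character, and "9" sorts after "1",
# so the unpadded time falls outside the 09:00-10:00 window.
unpadded=$(echo "9:30" | awk '$1>="09:00" && $1<="10:00"')

# With a leading zero the string order agrees with the time order.
padded=$(echo "09:30" | awk '$1>="09:00" && $1<="10:00"')

echo "unpadded match: [$unpadded]"
echo "padded match:   [$padded]"
```

The first command prints nothing, while the second prints the padded time.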
[root@centos-1 ~]# awk '/10:05/,/11:10/' c
10:05
10:06
10:07
12:01
11:01
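A range pattern, by contrast, does depend on line order, so take particular care here: it starts printing at the first line that matches /10:05/ and stops only at a line that matches /11:10/. No line matches /11:10/, so everything up to the end of the file is printed, including 12:01.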
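The difference between the two forms can be reproduced side by side. The sketch below feeds the same eight values as file c through a small helper function (the name sample is arbitrary) instead of reading the file:

```shell
#!/bin/bash
# Same sample data as file c above, produced by a helper function.
sample() { printf '%s\n' 10:01 10:02 10:03 10:05 10:06 10:07 12:01 11:01; }

# Comparison pattern: each line is tested on its own, so order is irrelevant.
by_compare=$(sample | awk '$1>="10:05" && $1<="11:10"')

# Range pattern: switches on at the first /10:05/ match and, because no
# line matches /11:10/, stays on until the end of the input.
by_range=$(sample | awk '/10:05/,/11:10/')

echo "$by_compare"
echo "---"
echo "$by_range"
```

If the end pattern of a range might never match, the comparison form is usually the safer choice.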
2. Using shell variables inside awk
File contents:
[root@centos-1 ~]# cat a
2017-04-19 10:26:56 EcChannel EC EcChannel-3 socketuser3
2017-04-19 10:27:03 EcChannel EC EcChannel-3 socketuser3
2017-04-19 10:31:18 EcChannel EC EcChannel-3 socketuser3
2017-04-19 10:31:31 EcChannel EC EcChannel-3 socketuser3
2017-04-19 10:32:42 EcChannel EC EcChannel-4 socketuser4
2017-04-19 10:33:03 EcChannel EC EcChannel-3 socketuser3
2017-04-19 10:33:04 EcChannel EC EcChannel-3 socketuser3
2017-04-19 10:33:16 EcChannel EC EcChannel-4 socketuser4
2017-04-19 10:33:37 EcChannel EC EcChannel-3 socketuser3
2017-04-19 10:34:32 EcChannel EC EcChannel-3 socketuser3
2017-04-19 10:35:00 EcChannel EC EcChannel-3 socketuser3
2017-04-19 10:35:07 EcChannel EC EcChannel-3 socketuser3
2017-04-19 15:04:39 EcChannel EC EcChannel-3 socketuser3
2017-04-19 15:04:39 EcChannel EC EcChannel-3 socketuser3
2017-04-19 18:10:39 EcChannel EC EcChannel-3 socketuser3
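Requirement:
Select the lines whose time falls within a window that starts at the current time and extends 10 hours before or 10 hours after it. The script below uses the window from now to 10 hours from now.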
#!/bin/bash
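# Current time and the time 10 hours from now, both as HH:MM:SS.
# Asking date for the format directly avoids parsing its locale-dependent default output.
A=$(date +%T)
B=$(date --date='+10 hour' +%T)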
awk '$2>="'$A'" && $2<="'$B'"' /root/a
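Explanation:
The awk program has to be wrapped in single quotes, so a shell variable is referenced as "'$A'". The inner single quotes close and reopen the awk program so that the shell can expand $A. The double quotes around them belong to the awk program: they turn the expanded value into an awk string constant, so $2 is compared against it as a string. Because $A is expanded outside any shell quoting, this form is only safe when the value contains no spaces or glob characters, which holds for a time in HH:MM:SS form.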
Result:
[root@centos-1 ~]# sh b
2017-04-19 15:04:39 EcChannel EC EcChannel-3 socketuser3
2017-04-19 15:04:39 EcChannel EC EcChannel-3 socketuser3
2017-04-19 18:10:39 EcChannel EC EcChannel-3 socketuser3
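The same filter can also be written with awk's -v option, which assigns shell values to awk variables and avoids the nested quoting. The sketch below additionally compares the date and the time together, so a window that crosses midnight still works. The bounds are fixed for reproducibility, the variable names lo and hi are arbitrary, and two of the three sample lines are invented for illustration:

```shell
#!/bin/bash
# Fixed bounds so the run is reproducible; a real script might use
#   lo=$(date '+%F %T'); hi=$(date --date='+10 hour' '+%F %T')
lo="2017-04-19 18:00:00"
hi="2017-04-20 04:00:00"

# Sample input: the first line is from file a, the other two are invented.
log_lines() {
  printf '%s\n' \
    '2017-04-19 18:10:39 EcChannel EC EcChannel-3 socketuser3' \
    '2017-04-20 03:59:59 EcChannel EC EcChannel-3 socketuser3' \
    '2017-04-20 10:00:00 EcChannel EC EcChannel-3 socketuser3'
}

# -v hands the shell values to awk as variables, so the program stays in
# one pair of single quotes. $1 " " $2 rebuilds "YYYY-MM-DD HH:MM:SS",
# which sorts chronologically when compared as a string.
matched=$(log_lines | awk -v lo="$lo" -v hi="$hi" \
  '($1 " " $2) >= lo && ($1 " " $2) <= hi')

echo "$matched"
```

Only the first two sample lines fall inside the window; the third is later than the upper bound.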
3. Finding the URL path with the longest response time in an nginx log
The log contents are as follows:
192.168.3.1 - - [08/Jun/2017:01:36:26 +0800] "GET / HTTP/1.1" 304 0 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" "-" 0.001
192.168.3.1 - - [08/Jun/2017:16:20:10 +0800] "GET /b.html HTTP/1.1" 304 0 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" "-" 0.212
192.168.3.1 - - [08/Jun/2017:16:20:16 +0800] "GET /b.html HTTP/1.1" 200 18 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" "-" 0.022
192.168.3.1 - - [08/Jun/2017:16:20:16 +0800] "GET /b.html HTTP/1.1" 200 18 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" "-" 0.12
192.168.3.1 - - [08/Jun/2017:16:20:16 +0800] "GET /b.html HTTP/1.1" 200 18 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" "-" 0.032
192.168.3.1 - - [08/Jun/2017:16:20:16 +0800] "GET /b.html HTTP/1.1" 200 18 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" "-" 0.021
For the log to record the response time of each request, define a custom log format and add the $request_time variable to it:
log_format abc '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for" $request_time';
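The filter command is as follows. The inner awk prints the largest value of the last field, and grep -F then searches for that value as a fixed string rather than as a regular expression:
[root@centos-1 logs]# grep -F "$(awk 'BEGIN{a=0}{if ($NF>a){a=$NF}}END{print a}' /usr/local/nginx/logs/access.log)" /usr/local/nginx/logs/access.log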
Result:
192.168.3.1 - - [08/Jun/2017:16:20:10 +0800] "GET /b.html HTTP/1.1" 304 0 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" "-" 0.212
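As an aside, the two passes over the log (awk inside grep) can be collapsed into a single awk pass that remembers the request path ($7) whenever a larger $request_time ($NF) is seen. This is only a sketch: the sample lines are taken from the log above with the user agent shortened, and the names log_lines, max and url are arbitrary.

```shell
#!/bin/bash
# Three lines from the log above, with the user agent shortened for brevity.
log_lines() {
  printf '%s\n' \
    '192.168.3.1 - - [08/Jun/2017:01:36:26 +0800] "GET / HTTP/1.1" 304 0 "-" "Mozilla/5.0" "-" 0.001' \
    '192.168.3.1 - - [08/Jun/2017:16:20:10 +0800] "GET /b.html HTTP/1.1" 304 0 "-" "Mozilla/5.0" "-" 0.212' \
    '192.168.3.1 - - [08/Jun/2017:16:20:16 +0800] "GET /b.html HTTP/1.1" 200 18 "-" "Mozilla/5.0" "-" 0.12'
}

# $NF+0 forces a numeric comparison; max starts out as 0 (uninitialized).
# Whenever a slower request is seen, keep its time and its path ($7).
slowest=$(log_lines | awk '$NF+0 > max { max = $NF+0; url = $7 } END { print url, max }')

echo "$slowest"
```

Against the real log, the log_lines pipeline would be replaced by the access.log path as the file argument to awk.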
Then extract the request method and URL path (fields 6 and 7) from the matching line:
[root@centos-1 logs]# grep -F "$(awk 'BEGIN{a=0}{if ($NF>a){a=$NF}}END{print a}' /usr/local/nginx/logs/access.log)" /usr/local/nginx/logs/access.log | awk '{print $6,$7}'
Result:
"GET /b.html