Learn one Linux command a day (1): awk

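awk: pattern scanning and processing language, used to process text and data.

awk is a programming language for processing text and data on Linux/Unix. Input can come from standard input (stdin), from one or more files, or from the output of another command. It can be used directly on the command line, but more often it is run as a script. awk has many built-in features, such as arrays and functions, which it shares with C; flexibility is its greatest strength.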
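To make the description above concrete, here is a minimal sketch that reads from standard input and uses an array. The three input lines and the variable name out are made up for illustration.

```shell
# Three made-up lines are piped to awk on standard input.
# An associative array counts how often each first field occurs.
out=$(printf 'a\nb\na\n' | awk '{ count[$1]++ } END { print "a=" count["a"], "b=" count["b"] }')
echo "$out"
```

The END block runs once after all input has been read, which is why the totals are printed only at the end.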

The linked article explains in detail how awk works; this post goes straight to usage examples:

(1) Print the first and second columns of a file

Sample file: log.txt

2 this is a test
3 Are you like awk
This's a test
10 There are orange,apple,mongo

Command:

$ awk '{print $1,$2}' log.txt
2 this
3 Are
This's a
10 There

(2) Formatted output

$ awk '{printf "%-8s %-10s\n",$1,$4}' log.txt
2        a
3        like
This's
10       orange,apple,mongo

(3) Split fields on ","

Sample file: log.txt (comma-separated version)

2,this is a test
3,Are you like awk
10,There are orange,apple,mongo

Command:

$ awk -F, '{print $1,$2}' log.txt
2 this is a test
3 Are you like awk
10 There are orange

(4) Set the field separator with the built-in variable FS

$ awk 'BEGIN{FS=","} {print $1,$2}' log.txt
2 this is a test
3 Are you like awk
10 There are orange

(5) awk -v: set a variable

$ awk -F ',' -va=1 '{print $1,$1+a}' log.txt
2 3
3 4
10 11
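A common use of -v is handing a shell variable to awk. The sketch below assumes a shell variable named threshold and three made-up input lines; neither is from the original. Adding 0 to n forces a numeric comparison.

```shell
# Hypothetical shell variable passed into awk as n.
threshold=2
# Print the first field of every line whose first field exceeds n.
out=$(printf '2 this\n3 Are\n10 There\n' | awk -v n="$threshold" '$1 > n + 0 { print $1 }')
echo "$out"
```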

(6) Run an awk script file

$ awk -f cal.awk log.txt
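The contents of cal.awk are not shown in the original. As a hypothetical stand-in, the sketch below writes a small script that sums the first column and then runs it with -f. The script body, the temporary directory, and the use of the four-line log.txt from example (1) are all assumptions.

```shell
# Work in a temporary directory so nothing existing is overwritten.
tmp=$(mktemp -d)

# Sample data: the space-separated log.txt from example (1).
cat > "$tmp/log.txt" <<'EOF'
2 this is a test
3 Are you like awk
This's a test
10 There are orange,apple,mongo
EOF

# Hypothetical cal.awk: add up column 1 and report the total at the end.
cat > "$tmp/cal.awk" <<'EOF'
# A non-numeric first field such as "This's" counts as 0.
{ sum += $1 }
END { print "total:", sum }
EOF

out=$(awk -f "$tmp/cal.awk" "$tmp/log.txt")
echo "$out"
```

Keeping the program in a file avoids shell quoting problems and allows comments and multi-line rules.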

(7) Select rows whose first column is greater than 2

Sample file:

2 this is a test
3 Are you like awk
10 There are orange,apple,mongo
1 haha

$ awk '$1>2' log.txt
3 Are you like awk
10 There are orange,apple,mongo

(8) Select rows whose first column equals 2

$ awk '$1==2 {print $1,$3}' log.txt
2 is

(9) Select rows whose first column is greater than 2 and whose second column equals "Are"

$ awk '$1>2 && $2=="Are" {print $1,$2,$3}' log.txt
3 Are you

(10) Built-in variables "FILENAME", "ARGC", "FNR", "FS", "NF", "NR", "OFS", "ORS", "RS"

By default FS and OFS are a single space and ORS and RS are a newline, so in the output below they appear as blank columns and blank lines.

$ awk 'BEGIN{printf "%4s %4s %4s %4s %4s %4s %4s %4s %4s\n","FILENAME","ARGC","FNR","FS","NF","NR","OFS","ORS","RS";printf "---------------------------------------------\n"} {printf "%4s %4s %4s %4s %4s %4s %4s %4s %4s\n",FILENAME,ARGC,FNR,FS,NF,NR,OFS,ORS,RS}' log.txt
FILENAME ARGC  FNR   FS   NF   NR  OFS  ORS   RS
---------------------------------------------
log.txt    2    1         5    1


log.txt    2    2         5    2


log.txt    2    3         4    3


log.txt    2    4         2    4
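With a single input file NR and FNR are always equal, so the table above cannot show how they differ. The sketch below uses two made-up files, one.txt and two.txt (assumptions, not from the original), to show that FNR restarts for each file while NR keeps counting.

```shell
tmp=$(mktemp -d)
printf 'x\ny\n' > "$tmp/one.txt"
printf 'z\n' > "$tmp/two.txt"

# Print the file name, the overall record number, the per-file record
# number, and the field count for every line of both files.
out=$(cd "$tmp" && awk '{ print FILENAME, NR, FNR, NF }' one.txt two.txt)
echo "$out"
```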

(11) Set the output field separator

$ awk '{print $1,$2,$5}' OFS="=" log.txt
2=this=test
3=Are=awk
10=There=
1=haha=
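OFS is only inserted where print has commas between its arguments, or when awk rebuilds the record. The following sketch illustrates both cases on a single sample line; the variable names are made up for illustration.

```shell
line='2 this is a test'

# Without commas, print concatenates its arguments and OFS is not used.
no_comma=$(printf '%s\n' "$line" | awk 'BEGIN { OFS = "=" } { print $1 $2 }')

# Assigning to a field makes awk rebuild $0 with OFS between all fields.
rebuilt=$(printf '%s\n' "$line" | awk 'BEGIN { OFS = "=" } { $1 = $1; print }')

echo "$no_comma"
echo "$rebuilt"
```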

(12) Match strings with a regular expression

Select the rows whose second column contains "th" and print the second and fourth columns:

$ awk '$2 ~ /th/ {print $2,$4}' log.txt
this a



(13) Ignore case (IGNORECASE is a gawk extension)

$ awk 'BEGIN{IGNORECASE=1} /this/' log.txt
2 this is a test
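In awk implementations other than GNU awk, IGNORECASE is just an ordinary variable and the match stays case-sensitive. A portable sketch is to lowercase the record with tolower() before matching. The example assumes the four-line log.txt from example (1), so the line starting with "This's" matches as well.

```shell
tmp=$(mktemp -d)
cat > "$tmp/log.txt" <<'EOF'
2 this is a test
3 Are you like awk
This's a test
10 There are orange,apple,mongo
EOF

# Lowercase the whole record, then match against a lowercase pattern.
out=$(awk 'tolower($0) ~ /this/' "$tmp/log.txt")
echo "$out"
```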

(14) Negate a pattern

$ awk '$2 !~ /th/ {print $2,$4}' log.txt
Are like
There orange,apple,mongo
haha

(15) Print "Hello, world!" with awk

$ awk 'BEGIN { print "Hello, world!" }'
Hello, world!

(16) Compute the total size of files

$ ls -l *.txt
-rw-rw-r-- 1 sftcwl sftcwl 66 Nov 25 15:06 a.txt
-rw-rw-r-- 1 sftcwl sftcwl 75 Nov 25 16:48 log.txt

$ ls -l *.txt | awk '{sum+=$5} END{print sum}'
141

(17) Find the lines in a file that are longer than 80 characters

$ awk 'length>80' log.txt

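(18) A range pattern consists of two patterns separated by a comma. It selects every record from the one that matches the first pattern through the next one that matches the second pattern, inclusive.

For example, print the records from "Raptors" through "Celtics":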

$ awk '/Raptors/,/Celtics/ {print $0}' teams.txt 
Raptors Toronto    58 24 0.707 
76ers Philadelphia 51 31 0.622
Celtics Boston     49 33 0.598

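(19) Range patterns can also use relational expressions. For example, print the records from the one whose fourth field equals 31 through the one whose fourth field equals 34: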

[root@localhost ~]# awk '$4 == 31 , $4 == 34 {print $0}' teams.txt 
76ers Philadelphia 51 31 0.622
Celtics Boston     49 33 0.598
Pacers Indiana     48 34 0.585
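teams.txt itself is not shown in the original. The sketch below rebuilds a file from only the four rows that appear in the outputs above (the real file may contain more rows, so this is an assumption) and reruns both range patterns, printing just the team name.

```shell
tmp=$(mktemp -d)
cat > "$tmp/teams.txt" <<'EOF'
Raptors Toronto    58 24 0.707
76ers Philadelphia 51 31 0.622
Celtics Boston     49 33 0.598
Pacers Indiana     48 34 0.585
EOF

# Range delimited by two regular expressions.
by_name=$(awk '/Raptors/,/Celtics/ { print $1 }' "$tmp/teams.txt")

# Range delimited by two relational expressions on the fourth field.
by_field=$(awk '$4 == 31, $4 == 34 { print $1 }' "$tmp/teams.txt")

echo "$by_name"
echo "$by_field"
```

If the second pattern never matches, the range stays open until the end of the input.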

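(20) Given a file a.txt (named a in the commands below) whose first column is a file name and whose second column is a version number, print the line with the highest version number for each file. (awk is required.)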

[root@w ~]# awk '{print}' a
file 100
dir 11
file 100
dir 11
file 102
dir 112
file 120
dir 119

Solution:
[root@w ~]# awk '{if(code[$1]<$2) code[$1]=$2}END{for (i in code) print i,code[i] }' a
file 120
dir 119
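The order in which for (i in code) visits array keys is unspecified, so the two result lines can come out in either order. The variant below is a sketch, not the original author's solution: it forces a numeric comparison with $2 + 0, handles the first occurrence of each name explicitly, and pipes the result through sort for a stable order. The array name max is made up for illustration.

```shell
tmp=$(mktemp -d)
cat > "$tmp/a" <<'EOF'
file 100
dir 11
file 100
dir 11
file 102
dir 112
file 120
dir 119
EOF

out=$(awk '
    # Keep the largest version seen so far for each name.
    !($1 in max) || $2 + 0 > max[$1] { max[$1] = $2 + 0 }
    END { for (name in max) print name, max[name] }
' "$tmp/a" | sort)
echo "$out"
```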

(21) Sum the numbers in each row

[root@chavinking mnt]# cat textfile
chavinking 1 2 3 4 5 6
nope 1 2 3 4 5 6
[root@chavinking mnt]# awk '{sum=0; for(i=2;i<=NF;i++) sum+=$i; print $1" "sum}' textfile
chavinking 21
nope 21

posted on 2021-11-25 17:32 by 1450811640
