Learn One Linux Command a Day (1): awk

awk: a pattern scanning and processing language for working with text and data.

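awk is a programming language for processing text and data on Linux/Unix. Input can come from standard input (stdin), one or more files, or the output of another command. It can be used directly on the command line, but it is more often run as a script. awk has many built-in features, such as arrays and functions, which it shares with C; flexibility is its greatest strength.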
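As a small illustration of the arrays and functions mentioned above, here is a sketch that counts how often each first field occurs. The function name `label` and the sample input are made up for this example.

```shell
# Count occurrences of the first field with an associative array,
# and format each result with a user-defined function.
# "label" and the input lines are hypothetical, chosen for illustration.
result=$(printf 'file 100\ndir 11\nfile 102\n' |
  awk '
    function label(name, n) { return name ": " n }
    { count[$1]++ }
    END { for (k in count) print label(k, count[k]) }
  ' | sort)   # for-in order is unspecified, so sort for stable output
echo "$result"
```

Piping through `sort` is a deliberate choice: awk does not guarantee the iteration order of `for (k in count)`.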

The linked article explains how AWK works in detail;

this post goes straight to usage examples:

(1) Print the first and second columns of a file

Sample file: log.txt

2 this is a test
3 Are you like awk
This's a test
10 There are orange,apple,mongo

Command:

$ awk '{print $1,$2}' log.txt
2 this
3 Are
This's a
10 There

 

(2) Formatted output

$ awk '{printf "%-8s %-10s\n",$1,$4}' log.txt
2        a
3        like
This's
10       orange,apple,mongo

 

(3) Use "," as the field separator

Sample file (log.txt, with a comma after the first field):

2,this is a test
3,Are you like awk
10,There are orange,apple,mongo

Command:

$ awk -F, '{print $1,$2}' log.txt
2 this is a test
3 Are you like awk
10 There are orange

 

(4) Set the field separator with the built-in variable FS

$ awk 'BEGIN{FS=","} {print $1,$2}' log.txt
2 this is a test
3 Are you like awk
10 There are orange

 

(5) awk -v: set a variable

$ awk -F ',' -v a=1 '{print $1,$1+a}' log.txt
2 3
3 4
10 11
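A common use of `-v` is passing a shell variable into the awk program. The sketch below assumes a shell variable named `threshold` (hypothetical) and reuses the comma-separated sample lines.

```shell
# Pass a shell variable into awk with -v and use it as a numeric threshold.
# "threshold" and "min" are illustrative names, not from the original post.
threshold=2
result=$(printf '2,this is a test\n3,Are you like awk\n10,There are orange,apple,mongo\n' |
  awk -F, -v min="$threshold" '$1 + 0 > min + 0 {print $1}')
echo "$result"
```

Adding `+ 0` forces a numeric comparison, so `10` is treated as greater than `2` rather than being compared as a string.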

 

(6) Run a program from an awk script file

$ awk -f cal.awk log.txt
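The contents of cal.awk are not shown in this post. As a purely hypothetical stand-in, the sketch below writes a small script that sums the first column and reports the total and the average, then runs it with `-f`.

```shell
# Hypothetical cal.awk: the original post does not show the real file.
dir=$(mktemp -d)
cat > "$dir/cal.awk" <<'EOF'
# Sum the numeric first column; report total and average at the end.
{ sum += $1; n++ }
END { if (n > 0) printf "total=%d avg=%.2f\n", sum, sum / n }
EOF

cat > "$dir/log.txt" <<'EOF'
2 this is a test
3 Are you like awk
10 There are orange,apple,mongo
1 haha
EOF

result=$(awk -f "$dir/cal.awk" "$dir/log.txt")
echo "$result"
rm -rf "$dir"
```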

 

(7) Select lines whose first column is greater than 2

Sample file:

2 this is a test
3 Are you like awk
10 There are orange,apple,mongo
1 haha

$ awk '$1>2' log.txt
3 Are you like awk
10 There are orange,apple,mongo

 

(8) Select lines whose first column equals 2

$ awk '$1==2 {print $1,$3}' log.txt
2 is

 

(9) Select lines whose first column is greater than 2 and whose second column equals 'Are'

$ awk '$1>2 && $2=="Are" {print $1,$2,$3}' log.txt
3 Are you

 

(10) Built-in variables: FILENAME, ARGC, FNR, FS, NF, NR, OFS, ORS, RS

$ awk 'BEGIN{printf "%4s %4s %4s %4s %4s %4s %4s %4s %4s\n","FILENAME","ARGC","FNR","FS","NF","NR","OFS","ORS","RS";printf "---------------------------------------------\n"} {printf "%4s %4s %4s %4s %4s %4s %4s %4s %4s\n",FILENAME,ARGC,FNR,FS,NF,NR,OFS,ORS,RS}' log.txt
FILENAME ARGC  FNR   FS   NF   NR  OFS  ORS   RS
---------------------------------------------
log.txt    2    1         5    1


log.txt    2    2         5    2


log.txt    2    3         4    3


log.txt    2    4         2    4
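The blank lines in the output above appear because ORS and RS both default to a newline, which printf emits literally. To see how NR and FNR differ, a run over two files helps; the file names and contents below are made up for this sketch.

```shell
# NR counts records across all input files; FNR restarts at 1 for each file.
# one.txt and two.txt are hypothetical sample files.
dir=$(mktemp -d)
printf 'a b\nc d e\n' > "$dir/one.txt"
printf 'x\n' > "$dir/two.txt"
result=$(cd "$dir" && awk '{print FILENAME, NR, FNR, NF}' one.txt two.txt)
echo "$result"
rm -rf "$dir"
```

Each output line shows the file name, the global record number, the per-file record number, and the number of fields in that record.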

 

 

(11) Specify the output field separator

$ awk '{print $1,$2,$5}' OFS="=" log.txt
2=this=test
3=Are=awk
10=There=
1=haha=
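One detail worth knowing: OFS is only applied when print joins comma-separated arguments or when the record is rebuilt. A minimal sketch, using a made-up input line, shows the difference; assigning `$1=$1` is a common idiom to force the rebuild.

```shell
# Printing $0 untouched keeps the original spacing; OFS is not applied.
plain=$(echo 'a b c' | awk 'BEGIN{OFS="-"} {print}')
# Assigning to any field makes awk rebuild $0 using OFS.
rebuilt=$(echo 'a b c' | awk 'BEGIN{OFS="-"} {$1=$1; print}')
echo "$plain"
echo "$rebuilt"
```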

 

(12) Match strings with a regular expression

Select the lines whose second column contains "th" and print the second and fourth columns:

$ awk '$2 ~ /th/ {print $2,$4}' log.txt
this a



(13) Ignore case (IGNORECASE is a gawk extension; other awk implementations do not honor it)

$ awk 'BEGIN{IGNORECASE=1} /this/' log.txt
2 this is a test
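Since IGNORECASE only works in gawk, a portable alternative is to lowercase the record before matching. This is a sketch with made-up input lines, not the approach used in the original command.

```shell
# Case-insensitive match that works in any POSIX awk:
# lowercase the whole record, then match a lowercase pattern.
result=$(printf '2 this is a test\n3 Are you like awk\n4 THIS is upper\n' |
  awk 'tolower($0) ~ /this/')
echo "$result"
```

The original record is printed unchanged, because `tolower` returns a copy and does not modify `$0`.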

(14) Negate the pattern

$ awk '$2 !~ /th/ {print $2,$4}' log.txt
Are like
There orange,apple,mongo
haha

 

(15) Print "Hello, world!" with awk

$ awk 'BEGIN { print "Hello, world!" }'
Hello, world!

 

(16) Sum file sizes

$ ls -l *.txt
-rw-rw-r-- 1 sftcwl sftcwl 66 Nov 25 15:06 a.txt
-rw-rw-r-- 1 sftcwl sftcwl 75 Nov 25 16:48 log.txt

$ ls -l *.txt | awk '{sum+=$5} END{print sum}'
141

 

(17) Find the lines in a file that are longer than 80 characters

$ awk 'length>80' log.txt
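No output is shown for this one, so here is a small sketch with generated input: one short line and one line of exactly 81 characters. It also prints the line number and length, which is an addition for illustration only.

```shell
# Build an 81-character line (zeros) so that exactly one line exceeds 80.
long=$(printf '%081d' 0)
result=$(printf 'short line\n%s\n' "$long" |
  awk 'length($0) > 80 {print NR ": " length($0)}')
echo "$result"
```

A bare `length` is shorthand for `length($0)`, so both forms test the whole record.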

 

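(18) A range pattern consists of two patterns separated by a comma. It matches every record from the one that matches the first pattern through the one that matches the second, inclusive.

For example, to show the records from "Raptors" through "Celtics":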

$ awk '/Raptors/,/Celtics/ {print $0}' teams.txt 
Raptors Toronto    58 24 0.707 
76ers Philadelphia 51 31 0.622
Celtics Boston     49 33 0.598

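(19) Range patterns can also use relational expressions. For example, to show the records from the one whose fourth field equals 31 through the one whose fourth field equals 34: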

[root@localhost ~]# awk '$4 == 31 , $4 == 34 {print $0}' teams.txt 
76ers Philadelphia 51 31 0.622
Celtics Boston     49 33 0.598
Pacers Indiana     48 34 0.585
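The full teams.txt is not shown in this post. The sketch below rebuilds a file from only the four rows that appear in the outputs above (the real file may contain more) and runs both kinds of range pattern against it.

```shell
# teams.txt here is an assumption: just the four rows visible in the post.
dir=$(mktemp -d)
cat > "$dir/teams.txt" <<'EOF'
Raptors Toronto 58 24 0.707
76ers Philadelphia 51 31 0.622
Celtics Boston 49 33 0.598
Pacers Indiana 48 34 0.585
EOF

# Range delimited by two regular expressions.
by_name=$(awk '/Raptors/,/Celtics/ {print $1}' "$dir/teams.txt")
# Range delimited by two relational expressions.
by_field=$(awk '$4 == 31, $4 == 34 {print $1}' "$dir/teams.txt")
echo "$by_name"
echo "$by_field"
rm -rf "$dir"
```

If the end pattern never matches, the range stays open until the end of the input.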

 

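(20) Given the file a.txt, whose first column is a file name and whose second column is a version number, print the line with the highest version number for each file (awk required).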

[root@w ~]# awk '{print}' a
file 100
dir 11
file 100
dir 11
file 102
dir 112
file 120
dir 119

Solution:
[root@w ~]# awk '{if(code[$1]<$2) code[$1]=$2}END{for (i in code) print i,code[i] }' a
file 120
dir 119
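A slightly more defensive variant, offered as a sketch rather than a replacement: it checks array membership explicitly and coerces the version to a number, so the comparison stays numeric regardless of how the field is typed. The array name `max` is illustrative.

```shell
# Track the highest version per name; "$2 + 0" forces numeric comparison.
result=$(printf 'file 100\ndir 11\nfile 100\ndir 11\nfile 102\ndir 112\nfile 120\ndir 119\n' |
  awk '!($1 in max) || $2 + 0 > max[$1] { max[$1] = $2 + 0 }
       END { for (name in max) print name, max[name] }' | sort)
echo "$result"
```

The output goes through `sort` because the order of `for (name in max)` is not defined by awk.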

 

(21) Sum the numbers on each line

[root@chavinking mnt]# cat textfile
chavinking 1 2 3 4 5 6
nope 1 2 3 4 5 6
[root@chavinking mnt]# cat textfile | awk '{sum=0; for(i=2;i<=NF;i++){sum=sum+$i} print $1" "sum}'
chavinking 21
nope 21
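When looping over fields it is easy to confuse `NF` (the number of fields) with `$NF` (the value of the last field). The sketch below uses a made-up line where the two differ, to show how a bound based on `$NF` gives the wrong sum.

```shell
# For "y 5 1": NF is 3, but $NF is 1, so "$NF+1" is 2.
line='y 5 1'
# Bound taken from the last field's value: stops after field 2 and misses the 1.
buggy=$(echo "$line" | awk '{for(i=1;i<=$NF+1;i++) sum+=$i; print sum}')
# Bound taken from the field count: covers fields 2 through NF.
fixed=$(echo "$line" | awk '{for(i=2;i<=NF;i++) sum+=$i; print sum}')
echo "buggy=$buggy fixed=$fixed"
```

On a line such as `chavinking 1 2 3 4 5 6`, the value of the last field plus one happens to equal the field count, which is why a `$NF`-based bound can appear to work.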

 

 
 

posted on 2021-11-25 17:32  1450811640