awk

1. Main structure of an awk program (from the awk manual):

The basic syntax of an awk program is Pattern { Actions }, so a typical awk program has the following form:

Pattern1 { Actions1 }

Pattern2 { Actions2 }

......

Pattern3 { Actions3 }
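As a minimal sketch of this structure, the program below combines a BEGIN rule, a regular-expression pattern, and an END rule. The sample input and the variable name count are made up for illustration.

```shell
# Three rules: BEGIN runs before any input is read, the /error/ rule runs
# for every line that matches, and END runs after the last line.
out=$(printf 'ok start\nerror disk full\nok done\nerror timeout\n' |
  awk '
    BEGIN   { count = 0 }
    /error/ { count++; print "matched: " $0 }
    END     { print "total: " count }
  ')
printf '%s\n' "$out"
```

A rule with no pattern runs for every line, and a pattern with no action prints the matching line.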

2. Using shell commands in an awk program

Method 1: awk output statement | "command accepted by the shell"

              echo "3 1 5" | awk '{ printf("%d\n%d\n%d\n", $1, $2, $3) | "sort -n" }'

Method 2: "command accepted by the shell" | awk input statement

                echo | awk '{ while (("ps" | getline line) > 0) print line; close("ps") }'

Method 3: use system()
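A small sketch of Method 3, assuming a POSIX sh is available. The command string and the variable name status are illustrative only.

```shell
# system() runs the command through the shell. The command writes straight
# to awk's standard output, and system() returns the command's exit status.
out=$(awk 'BEGIN {
  status = system("echo hello from the shell")
  print "exit status: " status
}')
printf '%s\n' "$out"
```

Unlike getline, system() does not capture the command's output into an awk variable; it only reports the exit status.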

Method 4: use print cmd | "/bin/bash"
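A sketch of Method 4. It pipes to /bin/sh rather than /bin/bash purely as a portability assumption; the same pattern works with /bin/bash where it is installed. The input lines are made up.

```shell
# Every printed line becomes a command that the shell reads on its stdin.
# close() waits for the shell to finish before awk exits.
out=$(printf 'alpha\nbeta\n' | awk '
  { print "echo got " $1 | "/bin/sh" }
  END { close("/bin/sh") }
')
printf '%s\n' "$out"
```

Because the input text is pasted into a shell command, this pattern is only safe when the input is trusted.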

3. Using shell variables inside awk (quote the shell variable when assigning it)

Method 1   sh_v="shellvariable"; echo | awk '{print awk_v}' awk_v="$sh_v"

Method 2   sh_v="shellvariable"; echo | awk -v awk_v="$sh_v" '{print awk_v}'

Gawk executes AWK programs in the following order. First, all variable assignments specified via the -v option are performed. Next, gawk compiles the program into an internal form. Then, gawk executes the code in the BEGIN block(s) (if any), and then proceeds to read each file named in the ARGV array. If there are no files named on the command line, gawk reads the standard input. If a filename on the command line has the form var=val it is treated as a variable assignment. The variable var will be assigned the value val. (This happens after any BEGIN block(s) have been run.) Command line variable assignment is most useful for dynamically assigning values to the variables AWK uses to control how input is broken into fields and records. It is also useful for controlling state if multiple passes are needed over a single data file.
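The ordering described above can be checked with a short sketch. The variable names early and late are hypothetical: early is set with -v, late is set as a var=val operand after the program text.

```shell
# -v assignments are done before BEGIN runs; var=val operands are
# processed only after BEGIN, just before the input is read.
out=$(echo | awk -v early="set" '
  BEGIN { print "BEGIN early=[" early "] late=[" late "]" }
        { print "main early=[" early "] late=[" late "]" }
' late="set")
printf '%s\n' "$out"
```

This is why Method 2 (-v) is the one to use when the value is needed inside a BEGIN block.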

4. AWK built-in variables

ARGC        The number of command line arguments.

ARGV        Array of command line arguments. The array is indexed from 0 to ARGC - 1.

NF          The number of fields in the current input record.

FS          The input field separator, a space by default.

SUBSEP      The character used to separate multiple subscripts in array elements, by default "\034".
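A brief sketch showing some of these variables in use. The operands one and two and the sample input are made up for illustration.

```shell
# ARGC/ARGV: options and the program text are not counted, and ARGV[0] is
# the command name. A BEGIN-only program never opens the operands as files.
args=$(awk 'BEGIN { for (i = 1; i < ARGC; i++) print i ": " ARGV[i] }' one two)
printf '%s\n' "$args"

# NF and FS: -F: sets FS to ":" before any input is read.
fields=$(printf 'a:b:c\nd:e\n' |
  awk -F: '{ print "FS=" FS, "NF=" NF, "last=" $NF }')
printf '%s\n' "$fields"
```

Since NF holds the field count, $NF always refers to the last field of the current record.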

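5. Arrays in AWK

Arrays in awk are associative arrays, so a subscript can be either a number or a string.

The subscripts of a multidimensional array in awk are joined together using the built-in variable SUBSEP. For example: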

              i = "A"; j = "B"; k = "C"
              x[i, j, k] = "hello, world\n"

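Internally this element is stored as x["A\034B\034C"] = "hello, world\n". In other words, the subscripts of a multidimensional array are joined with SUBSEP and stored as a single one-dimensional subscript.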
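A short sketch of how the joined subscript can be taken apart again with split(). The array name parts is an illustrative choice.

```shell
# The three subscripts are stored as one string joined by SUBSEP, so
# split() on SUBSEP recovers the individual parts.
out=$(awk 'BEGIN {
  i = "A"; j = "B"; k = "C"
  x[i, j, k] = "hello, world"
  for (key in x) {
    n = split(key, parts, SUBSEP)
    print n, parts[1], parts[2], parts[3], x[key]
  }
  # Building the joined string by hand addresses the same element.
  if (("A" SUBSEP "B" SUBSEP "C") in x) print "same key"
}')
printf '%s\n' "$out"
```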

Array operations: iterating over an array and testing whether an element exists:

The special operator in may be used in an if or for statement to see if an array has an index consisting of a particular value.

              for (var in array) statement
              if (val in array)  print array[val]
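A minimal sketch of both uses of in, with a made-up word list. The order in which for (var in array) visits the subscripts is unspecified, so the output is sorted here to make it stable.

```shell
# Count occurrences with an associative array, then iterate over it and
# test membership. "in" does not create the element it tests for.
out=$(printf 'apple\nbanana\napple\n' | awk '
  { seen[$1]++ }
  END {
    for (w in seen) print w, seen[w]
    if ("apple" in seen)     print "apple present"
    if (!("cherry" in seen)) print "cherry absent"
  }
' | LC_ALL=C sort)
printf '%s\n' "$out"
```

Testing with in is preferable to checking seen["cherry"] == "", because merely referencing an element creates it.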

Array operations: deleting an array or an array element:

  1. delete array                     # delete the entire array
  2. delete array[item]               # delete one array element (item)
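A small sketch of both forms of delete. The array name a and its keys are made up; the whole-array form is assumed to be supported, as it is in gawk and the other common awk implementations.

```shell
# Delete one element, check membership, then delete the whole array.
out=$(awk 'BEGIN {
  a["x"] = 1; a["y"] = 2
  delete a["x"]
  print (("x" in a) ? "x still there" : "x deleted")
  print (("y" in a) ? "y still there" : "y deleted")
  delete a
  n = 0
  for (k in a) n++
  print "elements left: " n
}')
printf '%s\n' "$out"
```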

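6. Special handling in AWK

   space       String concatenation. awk has no explicit concatenation operator: writing two strings next to each other, usually separated by a space, concatenates them.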
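A brief sketch of concatenation by juxtaposition, with made-up variable names. It also contrasts it with the comma in print, which inserts the output field separator instead.

```shell
out=$(awk 'BEGIN {
  first = "foo"; second = "bar"
  joined = first second          # juxtaposition concatenates
  spaced = first " " second      # an explicit " " is needed for a space
  print joined
  print spaced
  print first, second            # the comma inserts OFS (a space by default)
  print 1 " " 2 + 3              # + binds tighter than concatenation
}')
printf '%s\n' "$out"
```

Because concatenation has lower precedence than arithmetic, parenthesizing mixed expressions avoids surprises.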

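7. The field separator FS, set with the -F option, can be a regular expression

  For example, to use the string DELIMITER as the separator and extract part of a string: echo name1-DELIMITER-id1 | awk -F'-DELIMITER-' '{print $2}'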

The value is a single-character string or a multicharacter regular expression that matches the separations between fields in an input record.

https://www.gnu.org/software/gawk/manual/html_node/Reading-Files.html#Reading-Files
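As a sketch of a multicharacter regular expression used as the separator, the command below splits on a comma or semicolon followed by any number of spaces. The input string is made up for illustration.

```shell
# FS is the regex "[,;] *": one comma or semicolon plus optional spaces.
out=$(echo 'name1, id1;  id2,id3' |
  awk -F'[,;] *' '{ print NF; print $2; print $3 }')
printf '%s\n' "$out"
```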


posted on 2011-10-14 17:10  Tonystz