<p><strong><span style="font-size: 24px">1. What is awk</span></strong></p>
<p style="text-indent: 2em"><span style="font-size: 14px">awk is a powerful text-analysis tool. Compared with grep (searching) and sed (editing), awk is considerably more complex and also considerably more powerful. Although awk can do some of what grep and sed do, it is generally slower than either of them, so unless you are showing off, do not use awk for jobs that grep or sed already handle. <span style="font-size: 14px">On many Linux distributions awk is actually gawk (GNU awk), and gawk is free software. That is the key point.</span></span></p>
<p><strong><span style="font-size: 24px">2. awk syntax</span></strong></p>
<p><span style="font-size: 14px">Open the awk manual page with man and you will find:</span></p>
<pre class="brush:bash;toolbar:false">NAME
       gawk - pattern scanning and processing language

SYNOPSIS
       gawk [ POSIX or GNU style options ] -f program-file [ -- ] file ...
       gawk [ POSIX or GNU style options ] [ -- ] program-text file ...
       pgawk [ POSIX or GNU style options ] -f program-file [ -- ] file ...
       pgawk [ POSIX or GNU style options ] [ -- ] program-text file ...</pre>
<p><span style="font-size: 14px">The synopsis in the manual can be hard to read, so it is simplified here to the following form:</span></p>
<pre class="brush:bash;toolbar:false">awk [options] 'program' file file ...
awk [options] 'PATTERN{action}' file file ...</pre>
<p><span style="font-size: 14px">Common options</span></p>
<p><span style="font-size: 14px">-F fs: use fs as the input field separator</span></p>
<p><span style="font-size: 14px">--field-separator fs</span></p>
<pre class="brush:bash;toolbar:false">[leon@vm tmp]$ awk -F: '/^root/{print $1,$7}' /etc/passwd
root /bin/bash</pre>
<p><span style="font-size: 14px">-v var=val </span><span class="Apple-tab-span" style="font-size: 14px"> </span><span style="font-size: 14px">define a variable</span></p>
<p><span style="font-size: 14px">--assign var=val</span></p>
<pre class="brush:bash;toolbar:false">[leon@vm tmp]$ awk -v name=leon 'BEGIN{print name}'
leon</pre>
<p><span style="font-size: 14px">-f program-file<span class="Apple-tab-span" style="font-size: 14px"> </span>read the awk program from a file; the file contains the 'PATTERN{action}' rules</span></p>
<p><span style="font-size: 14px">--file program-file</span></p>
<pre class="brush:bash;toolbar:false">[leon@vm tmp]$ cat /tmp/awk.program
/^root/{print $0}
/^leon/{print 
$0}
[leon@vm tmp]$ awk -f /tmp/awk.program /etc/passwd
root:x:0:0:root:/root:/bin/bash
leon:x:500:500:vm1:/home/leon:/bin/bash</pre>
<p><strong><span style="font-size: 14px">2.1 awk output</span></strong></p>
<p><span style="font-size: 14px">print item1, item2, …</span></p>
<p><span style="font-size: 14px">Notes:</span></p>
<p><span style="font-size: 14px">(1) Items are separated by commas in the program; in the output they are joined with the output field separator.</span></p>
<pre class="brush:bash;toolbar:false">[leon@vm tmp]$ awk -F: '/\/bin\/bash/{print $1,$7}' /etc/passwd
root /bin/bash
leon /bin/bash
mysql /bin/bash</pre>
<p><span style="font-size: 14px">(2) Each item can be a string or a number, a field of the current record, a variable, or an awk expression; numbers are implicitly converted to strings before being printed.</span></p>
<p><span style="font-size: 14px">(3) If the items after print are omitted, it is equivalent to print $0; to output an empty line, use print "".</span></p>
<pre class="brush:bash;toolbar:false">[leon@vm tmp]$ awk -F: '/\/bin\/bash/{print }' /etc/passwd
root:x:0:0:root:/root:/bin/bash
leon:x:500:500:vm1:/home/leon:/bin/bash
mysql:x:495:493::/home/mysql:/bin/bash</pre>
<p></p>
<p><span style="font-size: 14px">printf format,item1,item2,…</span></p>
<p><span style="font-size: 14px">Notes:</span></p>
<p><span style="font-size: 14px">(1) A format must be given.</span></p>
<p><span style="font-size: 14px">(2) No newline is added automatically; if you need one, write \n yourself.</span></p>
<p><span style="font-size: 14px">(3) The format specifies how each of the following items is printed.</span></p>
<pre class="brush:bash;toolbar:false">Every format specifier starts with % followed by one character:
    %c: print a single character (a numeric argument is treated as a character code)
    %d, %i: decimal integer
    %e, %E: number in scientific notation
    %f: floating-point number
    %g, %G: number in scientific or floating-point notation, whichever is shorter
    %s: string
    %u: unsigned integer
    %%: a literal %
Modifiers:
    N: field width (N is a number)
    -: left-justify
    +: always show the sign of a number
    .N: precision

# awk -F: '{printf "%15s %-20s\n",$1,$7}' /etc/passwd</pre>
<p><strong><span style="font-size: 14px">2.2 awk variables</span></strong></p>
<p><strong><span style="font-size: 14px">2.2.1 Built-in variables</span></strong></p>
<p><span style="font-size: 14px"><span style="font-size: 14px">FS: Field Separator, the field separator used when reading input</span></span></p>
<pre class="brush:bash;toolbar:false">[leon@vm tmp]$ awk 'BEGIN {FS=":"}/\/bin\/bash/{print $1,$7}' /etc/passwd
root 
/bin/bash
leon /bin/bash
mysql /bin/bash</pre>
<p><span style="font-size: 14px">RS: Record Separator, the record (line) separator used when reading input</span></p>
<pre class="brush:bash;toolbar:false">[leon@vm tmp]$ awk 'BEGIN{RS=":"}{print}' /etc/passwd | head -n 10
root
x
0
0
root
/root
/bin/bash
bin
x
1</pre>
<p><span style="font-size: 14px">OFS: Output Field Separator, the field separator used when writing output</span></p>
<pre class="brush:bash;toolbar:false">[leon@vm tmp]$ awk -F: 'BEGIN{OFS="##"}/\/bin\/bash/{print $1,$7}' /etc/passwd
root##/bin/bash
leon##/bin/bash
mysql##/bin/bash</pre>
<p><span style="font-size: 14px">ORS: Output Record Separator, the record (line) separator used when writing output</span></p>
<pre class="brush:bash;toolbar:false">[leon@vm tmp]$ awk -F: 'BEGIN{ORS="##"}/\/bin\/bash/{print $1,$7}' /etc/passwd
root /bin/bash##leon /bin/bash##mysql /bin/bash##</pre>
<p><span style="font-size: 14px">NF: Number of Fields in the current record</span></p>
<pre class="brush:bash;toolbar:false">[leon@vm tmp]$ awk -F: 'END{print NF}' /etc/passwd
7</pre>
<p><span style="font-size: 14px">NR: Number of Records read so far; the count runs on across all input files</span></p>
<pre class="brush:bash;toolbar:false">[leon@vm tmp]$ sudo awk -F: '{print NR}' /etc/passwd /etc/shadow
1
2
...
74
75
76</pre>
<p><span style="font-size: 14px">FNR: record number within the current file; the count restarts for each file</span></p>
<pre class="brush:bash;toolbar:false">[leon@vm tmp]$ sudo awk -F: '{print FNR}' /etc/passwd /etc/shadow
1
2
3
... 
36
37
38
1
2
3
...</pre>
<p><span style="font-size: 14px">ARGV: an array holding the command-line arguments. For awk '{print $0}' 1.txt 2.txt, ARGV[0] holds the command name awk, ARGV[1] holds 1.txt and ARGV[2] holds 2.txt; options and the program text are not included.</span></p>
<pre class="brush:bash;toolbar:false">[leon@vm tmp]$ awk 'END{print ARGV[0]}' /etc/passwd
awk</pre>
<p><span style="font-size: 14px">ARGC: the number of elements in ARGV, that is, the number of command-line arguments including the command name</span></p>
<p><span style="font-size: 14px"></span><span style="font-size: 14px">FILENAME: the name of the file awk is currently processing</span></p>
<pre class="brush:bash;toolbar:false">[leon@vm tmp]$ awk 'END{print FILENAME}' /etc/passwd
/etc/passwd</pre>
<p><strong><span style="font-size: 14px">2.2.2 User-defined variables</span></strong><span style="font-size: 14px"> </span></p>
<pre class="brush:bash;toolbar:false"># awk -v name=leon 'BEGIN{print name}'
# awk 'BEGIN{name="leon";print name}'</pre>
<p><strong><span style="font-size: 14px">2.3 Redirecting awk output</span></strong></p>
<pre class="brush:bash;toolbar:false">print items > output-file
print items >> output-file
print items | command

# awk -F: '{print $1,$7 >"/tmp/userAndBash.txt"}' /etc/passwd</pre>
<p><strong>2.<span style="font-size: 14px">4</span><span class="Apple-tab-span" style="font-size: 14px"> </span><span style="font-size: 14px">awk operators</span></strong></p>
<p><span style="font-size: 14px">There is not much to say about most of them: arithmetic, assignment, comparison and logical operators are all supported.</span></p>
<p><span style="font-size: 14px">Ternary conditional expression</span></p>
<pre class="brush:bash;toolbar:false">selector?if-true-expression:if-false-expression

# awk -F: '{utype=($3>=500)?"common user":"admin or system user";print $1,"is",utype}' /etc/passwd</pre>
<p><strong>2.<span style="font-size: 14px">5</span><span class="Apple-tab-span" style="font-size: 14px"> </span><span style="font-size: 14px">Patterns (PATTERN)</span></strong></p>
<p><strong><span style="font-size: 14px">2.5.1 Regexp: written as /PATTERN/</span></strong></p>
<p><span style="font-size: 14px">Only lines matched by /PATTERN/ are processed.</span></p>
<pre class="brush:bash;toolbar:false">[leon@vm ~]$ awk -F: '/^root/{print $0}' /etc/passwd
root:x:0:0:root:/root:/bin/bash</pre>
<p><strong><span style="font-size: 
14px">2.5.2 Expression</span></strong></p>
<p><span style="font-size: 14px">The condition is met when the expression evaluates to a nonzero number or a non-empty string; </span><span style="font-size: 14px">only lines that meet the condition are processed.</span></p>
<pre class="brush:bash;toolbar:false">[leon@vm tmp]$ awk -F: '$3>=500{print $1,$3,$7}' /etc/passwd
leon 500 /bin/bash
nfsnobody 65534 /sbin/nologin</pre>
<p><strong><span style="font-size: 14px">2.5.3 Ranges</span></strong></p>
<p><span style="font-size: 14px">Written as startpattern,endpattern, much like address ranges in sed; </span><span style="font-size: 14px">only lines inside the range are processed. Bare line numbers are not addresses in awk; use conditions on NR instead, such as NR==2,NR==5.</span></p>
<pre class="brush:bash;toolbar:false">[leon@vm ~]$ awk -F: '/^root/,/^daemon/{print $1,$7}' /etc/passwd
root /bin/bash
bin /sbin/nologin
daemon /sbin/nologin</pre>
<p><strong><span style="font-size: 14px">2.5.4 BEGIN/END: special patterns</span></strong></p>
<p><span style="font-size: 14px">Executed exactly once: before the first input record is read (BEGIN) </span><span style="font-size: 14px">or after the last one has been processed (END).</span></p>
<pre class="brush:bash;toolbar:false">[leon@vm ~]$ sudo awk --re-interval -F: 'BEGIN{print "--these users have a passwd--"}/\$1/{print $1}END{print "----end---"}' /etc/shadow
--these users have a passwd--
root
leon
----end---</pre>
<p><strong style="font-size: 14px">2.5.5 Empty: the empty pattern</strong></p>
<p><span style="font-size: 14px">Matches every line; no example is given here.</span></p>
<p><strong><span style="font-size: 14px">2.6 Common actions</span></strong></p>
<p><span style="font-size: 14px"> (1) expressions</span></p>
<p><span style="font-size: 14px"> (2) control statements</span></p>
<p><span style="font-size: 14px"> (3) compound statements</span></p>
<p><span style="font-size: 14px"> (4) input statements</span></p>
<p><span style="font-size: 14px"> (5)<span class="Apple-tab-span" style="font-size: 14px"> </span>output statements</span></p>
<p><strong><span style="font-size: 14px">2.7 Control statements</span></strong></p>
<p><strong><span style="font-size: 14px">2.7.1 if-else</span></strong></p>
<p><span style="font-size: 14px"><span class="Apple-tab-span" style="font-size: 14px"> </span>Format: if (condition) {body} else 
{body}</span></p>
<pre class="brush:bash;toolbar:false">[leon@vm ~]$ awk -F: '{if ($3>=500) {print $1,"is a common user"} else {print $1, "is an admin or system user"}}' /etc/passwd
root is an admin or system user
bin is an admin or system user
...
[leon@vm ~]$ awk -F: '{if (NF>=3){print $0}}' /etc/inittab
id:3:initdefault:</pre>
<p><strong style="font-size: 14px">2.7.2 while</strong></p>
<p><span style="font-size: 14px"><span class="Apple-tab-span" style="font-size: 14px"> </span>Format: while (condition) {while body}</span></p>
<pre class="brush:bash;toolbar:false"># awk '{i=1; while (i<=NF){printf "%s ",$i;i+=2};print ""}' /etc/inittab
# awk '{i=1; while (i<=NF){if (length($i)>=6) {print $i}; i++}}' /etc/inittab

The length() function returns the length of a string.</pre>
<p><strong><span style="font-size: 14px">2.7.3 do-while loop</span></strong></p>
<p><span style="font-size: 14px"><span class="Apple-tab-span" style="font-size: 14px"> </span>Format: do {body} while (condition)</span></p>
<p>This works like the while examples above, except that the body of do always runs at least once; no example is given.</p>
<p><span style="font-size: 14px"><span class="Apple-tab-span" style="font-size: 14px"></span><strong>2.7.4 for loop</strong></span></p>
<p><span style="font-size: 14px"><span class="Apple-tab-span" style="font-size: 14px"> </span>Format: for (variable assignment; condition; iteration process) {</span><span style="font-size: 14px">body}</span></p>
<pre class="brush:bash;toolbar:false"># awk '{for (i=1;i<=NF;i+=2){printf "%s ",$i};print ""}' /etc/inittab
# awk '{for (i=1;i<=NF;i++){if (length($i)>=6) print $i}}' /etc/inittab</pre>
<p><span style="font-size: 14px">A for loop can also iterate over the elements of an array:</span></p>
<p><span style="font-size: 14px"><span class="Apple-tab-span" style="font-size: 14px"> </span>Syntax: for (i in array) {body}</span></p>
<pre class="brush:bash;toolbar:false">awk '{ip[$1]++}END{for (i in ip) {print i,ip[i]}}' /var/log/httpd/access_log</pre>
<p><strong><span style="font-size: 14px">2.7.5 switch statement</span></strong></p>
<p><span style="font-size: 14px"><span class="Apple-tab-span" style="font-size: 14px"> 
</span>Syntax (a gawk extension): switch (expression) {case VALUE or /REGEXP/: statement1; … </span><span style="font-size: 14px">default: statementN}</span></p>
<p><span style="font-size: 14px"><span class="Apple-tab-span" style="font-size: 14px"></span><strong>2.7.6 Loop control</strong></span></p>
<pre class="brush:bash;toolbar:false">break     # exit the current loop
continue  # go straight to the next iteration</pre>
<p><strong><span style="font-size: 14px">2.7.7 next</span></strong></p>
<p><span style="font-size: 14px"><span class="Apple-tab-span" style="font-size: 14px"> </span>Stops processing the current line early and moves on to the next line.</span></p>
<pre class="brush:bash;toolbar:false"># awk -F: '{if($3%2==0) next;print $1,$3}' /etc/passwd
# awk -F: '{if(NR%2==0) next; print NR,$1}' /etc/passwd</pre>
<p><strong><span style="font-size: 14px">2.8 Arrays</span></strong></p>
<p><span style="font-size: 14px">array[index-expression]</span></p>
<p><span style="font-size: 14px">index-expression: any string can be used as an index. If an element does not yet exist when it is referenced, awk creates it automatically and sets it to the empty string. To test whether an element exists, therefore, use the form index in array rather than referencing the element.</span></p>
<p><span style="font-size: 14px">To visit every element of an array, use the following special construct:</span></p>
<p><span style="font-size: 14px">for (var in array_name) {for body}</span></p>
<p><span style="font-size: 14px">Here var takes each index of the array in turn.</span></p>
<pre class="brush:bash;toolbar:false"># awk '{ip[$1]++}END{for (i in ip) {print i,ip[i]}}' /var/log/httpd/access_log</pre>
<p><span style="font-size: 14px">Deleting an array element:</span></p>
<p><span style="font-size: 14px"><span class="Apple-tab-span" style="font-size: 14px"> </span>delete array[index]</span></p>
<p><strong><span style="font-size: 14px">2.9 awk functions</span></strong></p>
<p><strong><span style="font-size: 14px">2.9.1 Built-in functions</span></strong></p>
<p><span style="font-size: 14px"> split(string,array[,fieldsep[,seps]])</span></p>
<p><span style="font-size: 14px"> Purpose: splits string into pieces using fieldsep as the separator and stores the pieces in the array named array; array subscripts start at 1.</span></p>
<pre class="brush:bash;toolbar:false">awk 'BEGIN {split("root:x:0:0",user,":");for (i in user) print user[i]}'</pre>
<p><span style="font-size: 
14px">split() has a return value: the number of elements produced by the split.</span></p>
<pre class="brush:bash;toolbar:false"># netstat -tn | awk '/^tcp/{lens=split($5,client,":");ip[client[lens-1]]++}END{for (i in ip) print i,ip[i]}'</pre>
<p><span style="font-size: 14px">length(string)</span></p>
<p><span style="font-size: 14px"><span class="Apple-tab-span" style="font-size: 14px"> </span>Purpose: returns the length of the given string</span></p>
<pre class="brush:bash;toolbar:false"># awk '{for (i=1;i<=NF;i++){if (length($i)>=6) print $i}}' /etc/inittab</pre>
<p><span style="font-size: 14px"><span class="Apple-tab-span" style="font-size: 14px"> </span>substr(string,start[,length])</span></p>
<p><span style="font-size: 14px"><span class="Apple-tab-span" style="font-size: 14px"> </span>Purpose: returns the substring of string that begins at position start (counting from 1) and is at most length characters long; if length is omitted, the rest of the string is returned.</span></p>
<p><strong><span style="font-size: 14px">2.9.2 User-defined functions</span></strong></p>
<p><span style="font-size: 14px"><span class="Apple-tab-span" style="font-size: 14px"> </span>function function_name(arg1,arg2,…) {body}</span></p>
<p><span style="font-size: 14px"><span class="Apple-tab-span" style="font-size: 14px"> </span>Calling a function: function_name(arg1,arg2,…)</span></p>
<pre class="brush:bash;toolbar:false">[leon@vm tmp]$ echo "3 2" | awk 'function compare(a,b){if(a>b) {return a} else {return b}};{print compare($1,$2)}'
3</pre>
<p><span style="font-size: 24px"><strong>3. Summary</strong></span></p>
<p style="text-indent: 2em">As one of the "three musketeers" of text processing (grep, sed, awk), awk is used all the time, so its syntax deserves particular care. For example, do not treat what BEGIN and END execute as part of an ordinary PATTERN{action} rule; they are separate top-level items of the program. The same goes for function definitions: they also belong at the top level of the program. Otherwise it is easy to slip and write '{function function_name(arg…) {body}}', which is a syntax error.</p>
<p></p>
Last modified: December 10, 2021, 10:53 AM © Reposting permitted with proper attribution
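<p>The point made in the summary about where BEGIN, END and function definitions belong can be sketched as follows. This is only a minimal illustration, not the author's own example: the function name max2 and the two sample input lines are assumptions made up for the demonstration.</p>

```shell
# Function definitions, BEGIN and END all sit at the top level of the
# program, side by side with the ordinary pattern{action} rules.
# max2 is a hypothetical helper name used only for this illustration.
result=$(printf '3 2\n7 9\n' | awk '
function max2(a, b) { return (a > b) ? a : b }   # top level, never inside an action
BEGIN { print "start" }                          # runs once, before any input is read
{ print max2($1, $2) }                           # ordinary rule, runs for every line
END   { print "lines:", NR }                     # runs once, after all input is read
')
printf '%s\n' "$result"
# prints:
# start
# 3
# 9
# lines: 2
```

<p>Moving the function line inside the braces of the per-line rule reproduces the syntax error described in the summary.</p>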