<p style="line-height:150%"><span><strong><span>What exactly is a regular expression?</span></strong></span></p>
<p style="text-indent:28px;line-height:150%"><span><strong><span>A regular expression</span></strong></span><span> (Regular Expression, usually abbreviated in code as regex, regexp, or RE; also rendered in Chinese as 正规表示法 or 常规表示法) is something you meet constantly: in many text editors and commands you need to search for, replace, allow, or reject text that fits some pattern. Regular expressions are the tool for describing those rules. In other words, a regular expression is code that records a text rule.</span></p>
<p style="text-indent: 32px;line-height: 150%;background: white"><span>(Excerpted from 《正则表达式之道》)</span></p>
<p style="text-indent: 32px;line-height: 150%;background: white"><span>A regular expression is made up of ordinary characters and metacharacters. Ordinary characters include upper- and lowercase letters and digits, while metacharacters carry special meanings, which are explained below.</span></p>
<p style="text-indent: 32px;line-height: 150%;background: white"><span>In the simplest case a regular expression looks just like an ordinary search string. For example, the regular expression "testing" contains no metacharacters; it matches strings such as "testing" and "testing123", but not "Testing".</span></p>
<p style="text-indent: 32px;line-height: 150%;background: white"><span>To really use regular expressions well, understanding the metacharacters correctly matters most. The table below lists the metacharacters with a short description of each. </span><span style="font-size: 12px;font-family: 宋体;line-height: 150%;color: red;background: yellow">Metacharacters are the tools, combining them gives the method, and ordinary characters are the target. There is one target but many methods.</span></p>
<p style="text-indent:21px;line-height:150%"><span>Regular expressions come in two flavors, basic (standard) and extended. The main difference lies in how some metacharacters are written and which ones are supported; there is no fundamental difference.</span></p>
<p style="line-height:150%"><span>
<img src="//cto.wang/usr/uploads/2016/07/20160703155835-21.png" title="1427467744689359.png" alt="blob.png" /></span></p>
<p style="line-height:150%"><span>An example of combining these metacharacters to match text</span></p>
<p style="line-height:150%"><span>Matching a valid IPv4 address</span></p>
<p style="line-height:150%"><span>1.0.0.1-239.255.255.255</span></p>
<p style="line-height:150%"><span>1. An IPv4 address is written as four dot-separated decimal groups. The . (dot) is fixed: exactly three of them split the string into four groups. Because . (dot) is a metacharacter in a regular expression, it has to be escaped to be matched literally; that is, "\." is what stands for a dot inside a regular expression.</span></p>
<p style="line-height:150%"><span>As the screenshot shows, this matches the "." characters.</span></p>
<p style="line-height:150%"><img src="//cto.wang/usr/uploads/2016/07/20160703155836-10.jpg" title="1427467247976884.jpg" alt="1.jpg" /></p>
<p style="line-height:150%"><span><span>2</span><span>. The four groups take values in the ranges</span><span> <strong>"1-239"."0-255"."0-255"."1-255"</strong></span></span></p>
<p style="line-height:150%"><span><span> </span><span>Start with the first group,</span><span> <strong>"1-239"</strong>. </span><span>It must match exactly one of the numbers</span><span> 1 … 239</span><span style="font-size: 12px;font-family:
宋体;line-height: 150%">, that is, 1 or 2 or 3 or … or 239. Writing the regular expression as </span><span>1|2|3…..|239</span><span> would certainly work, but it is exhausting, and anyone who wrote it that way would probably get the axe. Sigh! Instead, split </span><span>1|2|3|4|5|6|7|8|9|10..|99|100…|199|200..|239</span><span> by ones, tens, and hundreds.</span></span></p>
<p style="line-height:150%"><span><span>1|2|3|4|5|6|7|8|9</span><span> is simply </span><span>[1-9]</span><span>;</span></span></p>
<p style="line-height:150%"><span><span>10..|99</span><span> is simply </span><span>[1-9][0-9]</span><span>;</span></span></p>
<p style="line-height:150%"><span><span>100…|199</span><span> is simply </span><span>[1][0-9][0-9]</span><span>;</span></span></p>
<p style="line-height:150%"><span><span>200..|239</span><span> is simply </span><span>[2][0-3][0-9]</span></span></p>
<p style="line-height:150%"><span><span>Combined: </span><strong><span>[1-9]|[1-9][0-9]|1[0-9][0-9]|2[0-3][0-9] </span></strong></span></p>
<p style="line-height:150%"><span><span>To verify, first see which numbers appear in the</span><span> ifconfig </span><span
style="font-size: 12px;font-family: 宋体;line-height: 150%">output:</span> </span></p>
<p style="line-height:150%"><span><strong><span style="font-size: 12px;font-family: 宋体;line-height: 150%"><img src="//cto.wang/usr/uploads/2016/07/20160703155836-82.jpg" title="1427467265100993.jpg" alt="2.jpg" /></span></strong></span></p>
<p style="line-height:150%"><span>Now match it with the expression we just wrote and check that everything falls within 1-239:</span></p>
<p style="line-height:150%"><img src="//cto.wang/usr/uploads/2016/07/20160703155836-89.jpg" title="1427467278138677.jpg" alt="3.jpg" /></p>
<p style="line-height:150%"><span>Comparing the two, neither 0 nor any number greater than 239 appears, so the expression matches the set of values we need. Note that grep -o prints only the matched part, so a larger number such as 255 is not rejected outright; it shows up in pieces (25, then 5).</span></p>
<p style="line-height:150%"><span>The remaining groups follow the same recipe:</span></p>
<p style="line-height:150%"><span><strong><span>[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5] (0-255)</span></strong></span></p>
<p style="line-height:150%"><span><strong><span>[1-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5] (1-255)</span></strong></span></p>
<p style="line-height:150%"><span style="font-size: 12px;font-family: 宋体;line-height: 150%">All right, let's put them together:</span></p>
<p style="line-height:150%"><span><strong><span>[1-9]|[1-9][0-9]|1[0-9][0-9]|2[0-3][0-9]</span></strong><strong><span>[\.]<span>[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5]</span>[\.]<span>[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5]</span>[\.]<span>[1-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5]</span></span></strong></span></p>
<p style="line-height:150%"><span style="font-size: 12px;font-family: 宋体;line-height: 150%">Never mind whether it is right; it sure is long, enough to make your eyes cross. Let's test it first.</span></p>
<p style="line-height:150%"><img
src="//cto.wang/usr/uploads/2016/07/20160703155836-52.jpg" title="1427467285139936.jpg" alt="4.jpg" /></p>
<p style="line-height:150%"><span style="font-size: 12px;font-family: 宋体;line-height: 150%">Whoa, that's wrong: why is the output just individual numbers, one at a time? Of course: these are four groups of characters separated by . (dot), and because | has the lowest precedence, the ungrouped expression is read as one long list of alternatives. Each group needs its own parentheses. Group them and try again!</span></p>
<p style="line-height:150%"><img src="//cto.wang/usr/uploads/2016/07/20160703155836-95.jpg" title="1427467292135519.jpg" alt="5.jpg" /></p>
<p style="line-height:150%"><span><span style="font-size: 12px;font-family: 宋体;line-height: 150%">Oh yeah, mission accomplished.</span></span></p>
<p style="line-height:150%"><span><span style="font-size: 12px;font-family: 宋体;line-height: 150%">Still, it feels a bit long and is tiring to read.</span></span></p>
<p style="line-height:150%"><span><strong><span>([1-9]|[1-9][0-9]|1[0-9][0-9]|2[0-3][0-9]</span></strong><strong><span>)[\.](<span>[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5]</span>)[\.](<span>[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5]</span>)[\.](<span>[1-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5]</span>)</span></strong></span></p>
<p style="line-height:150%"><span style="font-size: 12px;font-family: 宋体;line-height: 150%">Let's try to optimize the expression.</span></p>
<p style="line-height:150%"><span><span>1</span><span style="font-size: 12px;font-family: 宋体;line-height: 150%">. First, the expression is already grouped, so can a back-reference be used?</span></span></p>
<p style="line-height:150%"><span><strong><span>([1-9]|[1-9][0-9]|1[0-9][0-9]|2[0-3][0-9]</span></strong><strong><span>)[\.](<span>[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5]</span>)[\.]<span>\2</span>[\.](<span>[1-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5]</span>)</span></strong></span></p>
<p style="line-height:150%"><img src="//cto.wang/usr/uploads/2016/07/20160703155836-6.jpg" title="1427467304111308.jpg" alt="6.jpg" /></p>
<p style="line-height:150%"><span><span style="font-size: 12px;font-family: 宋体;line-height:
150%">That failed. At first I thought I was using </span><span>\n</span><span style="font-size: 12px;font-family: 宋体;line-height: 150%"> incorrectly, so I shortened the command and matched again:</span></span></p>
<p style="line-height:150%"><img src="//cto.wang/usr/uploads/2016/07/20160703155836-79.jpg" title="1427467310187460.jpg" alt="7.jpg" /></p>
<p style="line-height:150%"><span style="font-size: 12px;font-family: 宋体;line-height: 150%">So it can match, but only where the text before and after is identical, and the full addresses had no such pair. Going back to read the definition again:</span></p>
<table> <tbody> <tr class="firstRow"> <td width="386" valign="top">\n</td> <td width="386" valign="top"><span>References the content matched by the n-th parenthesized group, not the pattern itself</span></td> </tr> </tbody> </table>
<p style="line-height:150%"><span style="font-size: 12px;font-family: 宋体;line-height: 150%"></span></p>
<p style="line-height:150%"><span>2</span><span style="font-size: 12px;font-family: 宋体;line-height: 150%">. Looking closely at the expression, it contains </span><span>[1-9]|[1-9][0-9]</span><span style="font-size: 12px;font-family: 宋体;line-height: 150%">, which covers 1-99: a leading digit from 1-9 followed by a second digit from 0-9 that may be absent. (How infuriating to get such a simple, clear explanation as the one above wrong; so much for my IQ.)</span></p>
<p style="line-height:150%"><span style="font-size: 12px;font-family: 宋体;line-height: 150%"></span></p>
<table> <tbody> <tr class="firstRow"> <td width="386" valign="top">?</td> <td width="386" valign="top"><span>Matches the preceding subexpression zero times or once. (In basic regular expressions it must be escaped with \)</span></td> </tr> </tbody> </table>
<p style="line-height:150%"><span> [1-9][0-9]? </span><span style="font-size: 12px;font-family: 宋体;line-height: 150%">expresses 1-99 (the reversed form [1-9]?[0-9] would also accept 0)</span></p>
<p style="text-indent:7px;line-height:150%"><span><span>[0-9]?[0-9] </span><span style="font-size: 12px;font-family: 宋体;line-height: 150%">expresses 0-99</span></span></p>
<p
style="text-indent:7px;line-height:150%"><span><span>[0-9]</span><span style="font-size: 12px;font-family: 宋体;line-height: 150%"> zero times or once, followed by </span><span>[0-9]</span><span style="font-size: 12px;font-family: 宋体;line-height: 150%"> once, is simply </span><span>[0-9]{1,2}</span><span style="font-size: 12px;font-family: 宋体;line-height: 150%">, which likewise expresses 0-99 (it also accepts a leading zero such as 07)</span></span></p>
<p style="text-indent:7px;line-height:150%"><span style="font-size: 12px;font-family: 宋体;line-height: 150%">Let's try it:</span></p>
<p style="text-indent:7px;line-height:150%"><span style="font-size: 12px;font-family: 宋体;line-height: 150%"></span></p>
<p><span style="font-size: 14px">ifconfig | grep -Eo "([1-9][0-9]?|1[0-9][0-9]|2[0-3][0-9])[\.]([0-9]{1,2}|1[0-9][0-9]|2[0-4][0-9]|25[0-5])[\.]([0-9]{1,2}|1[0-9][0-9]|2[0-4][0-9]|25[0-5])[\.]([1-9][0-9]?|1[0-9][0-9]|2[0-4][0-9]|25[0-5])"</span></p>
<p><span style="font-size: 14px">ifconfig | grep -Eo "([1-9][0-9]?|1[0-9]{1,2}|2[0-3][0-9])[\.]([0-9]{1,2}|1[0-9]{1,2}|2[0-4][0-9]|25[0-5])[\.]([0-9]{1,2}|1[0-9]{1,2}|2[0-4][0-9]|25[0-5])[\.]([1-9][0-9]?|1[0-9]{1,2}|2[0-4][0-9]|25[0-5])"<span></span></span></p>
<p style="line-height:150%"><img src="//cto.wang/usr/uploads/2016/07/20160703155836-55.jpg" title="1427467358937524.jpg" alt="8.jpg" /></p>
<p style="line-height:150%"><span><strong><span style="font-size: 12px;font-family: 宋体;line-height: 150%">Commands and options used</span></strong></span></p>
<pre class="brush:bash;toolbar:false">grep [option]... 'PATTERN' FILE...
    --color=auto  highlight the matches in color
    -o: print only the matched text, not the whole line containing it
    -E: use extended regular expressions
    -F: treat the pattern as a fixed string, not a regular expression</pre>
<pre class="brush:bash;toolbar:false">sort: sort the contents of files
sort [option] FILE...
    -f: ignore case
    -t: specify the field separator
    -k: specify which field to compare after splitting
    -n: sort by numeric value
    -u: remove duplicates after sorting</pre>
<p style="line-height:150%"><span><strong><span style="font-size: 12px;font-family: 宋体;line-height: 150%">This write-up is far from perfect; corrections are welcome </span></strong><strong><span>O(∩_∩)O~</span></strong></span></p>
<p style="line-height:150%"><span><strong><span> </span></strong></span></p>
<p style="line-height:150%"><span><strong><span style="font-size: 12px;font-family: 宋体;line-height: 150%">A few extra words on wildcards</span></strong></span></p>
<p style="line-height:150%"><span></span></p>
<p><span style="font-size: 12px">Wildcards are another search tool, used for fuzzy lookups; as far as I know there is also a network wildcard.</span></p>
<p><span style="font-size: 12px"> *: any characters, of any length </span></p>
<p><span style="font-size: 12px"> ?: any single character</span></p>
<p><span style="font-size: 12px"> [ ]: any single character within the specified range: [abc], [a-z], [0-9], [0-9a-z]</span></p>
<p><span style="font-size: 12px"> [^]: any single character outside the specified range: [^0-9a-z]</span></p>
<p><span style="font-size: 12px"> Character classes: must be wrapped in an extra [ ] when used</span></p>
<p><span style="font-size: 12px"> [:space:] : all whitespace characters</span></p>
<p><span style="font-size: 12px"> [:punct:] : all punctuation characters</span></p>
<p><span style="font-size: 12px"> [:lower:] : all lowercase letters, [a-z]; cannot be written as [z-a]</span></p>
<p><span style="font-size: 12px"> [:upper:] : all uppercase letters, [A-Z]</span></p>
<p><span style="font-size: 12px"> [:digit:] : all digits, [0-9]</span></p>
<p><span style="font-size: 12px"> [:alnum:] : all digits and letters, [A-Z0-9a-z]</span></p>
<p><span style="font-size: 12px"> [:alpha:] : all letters, [a-zA-Z]</span></p>
<hr />
<p><span style="font-size: 16px"><strong><span>An example:</span></strong></span><span> ls [*]*a.txt </span><span>finds files that start with </span><span>*</span><span>, have any characters in the middle, and end with </span><span>a.txt</span></p>
<p style="line-height:150%"><span>[root@zhuzw-centos6 tmp]# ls</span></p>
<p style="line-height:150%"><span>a *a**a.txt aa.txt a.txt ceshi.txt cpvar.sh keyring-bBocvt logs orbit-gdm pulse-jFNHHBiALmGU pulse-kCc9R2jshzlv zhuzw</span></p>
<p style="line-height:150%"><span>[root@zhuzw-centos6 tmp]# ls [*]*a.txt</span></p>
<p
style="line-height:150%"><span>*a**a.txt</span></p>
<p style="line-height:150%"><span>[root@zhuzw-centos6 tmp]# ls [^*]*a.txt</span></p>
<p style="line-height:150%"><span>aa.txt</span></p>
<p style="line-height:150%"><span>[root@zhuzw-centos6 tmp]# ls <span>*[[:space:]]*</span> <span>(matches whitespace characters)</span></span></p>
<p style="line-height:150%"><span>a a.txt</span></p>
<p style="line-height:150%"><span>[root@zhuzw-centos6 tmp]# ls [a-zA-Z0-9]</span></p>
<p style="line-height:150%"><span>ls: <span>cannot access </span>[a-zA-Z0-9]: <span>No such file or directory</span></span></p>
<p style="line-height:150%"><span>[root@zhuzw-centos6 tmp]# ls [a-zA-Z0-9].TXT</span></p>
<p style="line-height:150%"><span>ls: <span>cannot access </span>[a-zA-Z0-9].TXT: <span>No such file or directory</span></span></p>
<p style="line-height:150%"><span>[root@zhuzw-centos6 tmp]# ls [a-zA-Z0-9].txt</span></p>
<p style="line-height:150%"><span>a.txt</span></p>
<p style="line-height:150%"><span>[root@zhuzw-centos6 tmp]# touch 0</span></p>
<p style="line-height:150%"><span>[root@zhuzw-centos6 tmp]# ls [9-0]</span></p>
<p style="line-height:150%"><span>ls: <span>cannot access </span>[9-0]: <span>No such file or directory</span></span></p>
<p style="line-height:150%"><span>[root@zhuzw-centos6 tmp]# ls [0-9]</span></p>
<p style="line-height:150%"><span>0</span></p>
<p style="line-height:150%"><span>A wildcard expression describes the whole string, from its left end to its right end, and is mainly used for fuzzy lookups. That is not everything wildcards do, of course: in networking I have used the wildcard mask, where a 0 bit means "must match exactly" and a 1 bit means "not checked".</span></p>
<p style="line-height:150%"><span>Wildcard masks are also used in access control lists (Access Control List, ACL).</span></p>
<p style="line-height:150%"><span> <span>R1#network 192.168.1.0 0.0.0.255</span></span></p>
<p style="line-height:150%"><span>The value that follows is the inverse of the network mask. It says that the 192.168.1 part of 192.168.1.x must match exactly while x can be any number, which announces 192.168.1.0/24 into the OSPF area.</span></p>
<p style="line-height:150%"><span>192.168.1.0
0.0.0.254<span> matches all even addresses in the </span>192.168.1.0/24<span> network</span></span></p>
<p style="line-height:150%"><span>192.168.1.1 0.0.0.254<span> matches all odd addresses in the </span>192.168.1.0/24<span> network</span></span></p>
<p style="line-height:150%"><span> </span></p>
Last modified: December 10, 2021, 10:53 AM © Reposting permitted with attribution
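<p>As a closing sketch, the IPv4 pattern built up in the first half of this post can be wrapped in a small shell function. This is only an illustration under stated assumptions, not the exact command used above: the function name is_article_ipv4 and the variables first, middle, and last are hypothetical, the dot is written as \. rather than [\.], and the pattern is anchored with ^ and $ so that it tests one whole string instead of extracting pieces from ifconfig output. The octet ranges (1-239, 0-255, 0-255, 1-255) are the ones chosen earlier.</p>

```shell
#!/bin/sh
# Octet patterns; every name below is illustrative, not from the original post.
first='([1-9][0-9]?|1[0-9][0-9]|2[0-3][0-9])'              # 1-239
middle='(0|[1-9][0-9]?|1[0-9][0-9]|2[0-4][0-9]|25[0-5])'   # 0-255, no leading zeros
last='([1-9][0-9]?|1[0-9][0-9]|2[0-4][0-9]|25[0-5])'       # 1-255

# Exit status 0 only when the whole argument lies in 1.0.0.1-239.255.255.255.
is_article_ipv4() {
    # -E: extended regular expressions; -q: no output, result is the exit status.
    # ^ and $ anchor the pattern so partial matches inside longer text are rejected.
    printf '%s\n' "$1" | grep -Eq "^${first}\.${middle}\.${middle}\.${last}\$"
}
```

<p>Usage might look like <code>is_article_ipv4 192.168.1.10 &amp;&amp; echo valid</code>. Anchoring is a design choice for validating a single value; to pull addresses out of command output as in the post, the anchors would be dropped and grep -Eo used instead.</p>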