import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupsDemo { // class name is illustrative
    public static final String POEM =
        "Twas brillig, and the slithy toves\n" +
        "Did gyre and gimble in the wabe.\n" +
        "All mimsy were the borogoves,\n" +
        "And the mome raths outgrabe.\n\n" +
        "Beware the Jabberwock, my son,\n" +
        "The jaws that bite, the claws that catch.\n" +
        "Beware the Jubjub bird, and shun\n" +
        "The frumious Bandersnatch.";

    public static void main(String[] args) {
        // Capture the last three words of every line.
        Matcher m = Pattern.compile("(?m)(\\S+)\\s+((\\S+)\\s+(\\S+))$")
            .matcher(POEM);
        while (m.find()) {
            // Group 0 is the whole match; groups 1..groupCount() are the captures.
            for (int j = 0; j <= m.groupCount(); j++) {
                System.out.print("[" + m.group(j) + "]");
            }
            System.out.println();
        }
    }
}
output:
[the slithy toves][the][slithy toves][slithy][toves]
[in the wabe.][in][the wabe.][the][wabe.]
[were the borogoves,][were][the borogoves,][the][borogoves,]
[mome raths outgrabe.][mome][raths outgrabe.][raths][outgrabe.]
[Jabberwock, my son,][Jabberwock,][my son,][my][son,]
[claws that catch.][claws][that catch.][that][catch.]
[bird, and shun][bird,][and shun][and][shun]
[The frumious Bandersnatch.][The][frumious Bandersnatch.][frumious][Bandersnatch.]
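Explanation:
m.groupCount(): the number of capturing groups in the matcher's pattern, not counting group 0.
m.group(j): the text captured by group j. group(0) is the text matched by the whole expression.
(?m): multiline mode, in which ^ and $ match at the start and end of every line rather than only at the start and end of the input.
\S: a non-whitespace character.
\s: a whitespace character, equivalent to [ \t\n\x0B\f\r].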
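To make the effect of (?m) and the group numbering concrete, here is a minimal sketch. The class name MultilineSketch, the helper firstGroups, and the sample strings are made up for illustration; only the java.util.regex calls come from the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MultilineSketch {
    // Hypothetical helper: collects group(1) of every match of regex in text.
    static List<String> firstGroups(String regex, String text) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "one two\nthree four";

        // With (?m), $ matches at the end of every line.
        System.out.println(firstGroups("(?m)(\\S+)$", text)); // [two, four]

        // Without it, $ matches only at the end of the input.
        System.out.println(firstGroups("(\\S+)$", text));     // [four]

        // Groups are numbered by the position of their opening parenthesis.
        Matcher m = Pattern.compile("(a)((b)(c))").matcher("abc");
        if (m.matches()) {
            System.out.println(m.groupCount()); // 4 (group 0 is not counted)
            System.out.println(m.group(0));     // abc
            System.out.println(m.group(2));     // bc
            System.out.println(m.group(3));     // b
        }
    }
}
```

The nested pattern has the same shape as the one used on the poem: an outer group that wraps two inner groups, so the outer group receives the lower number.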
2. Matcher.find() vs. lookingAt() vs. matches()
package com.westward;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Demo31 {
    public static String input =
        "As long as there is injustice, whenever a\n" +
        "Targathian baby cries out.wherever a distress\n" +
        "signal sounds among the stars ... We'll be there.\n" +
        "This fine ship, and this fine crew ...\n" +
        "Never give up! Never surrender!";

    // Prints the regex once, just before the first result reported for it.
    private static class Display {
        private boolean regexPrinted = false;
        private String regex;

        Display(String regex) {
            this.regex = regex;
        }

        void display(String message) {
            if (!regexPrinted) {
                System.out.println(regex);
                regexPrinted = true;
            }
            System.out.println(message);
        }
    }

    // Tries find(), lookingAt() and matches() with the same regex on the same input.
    static void examine(String s, String regex) {
        Display d = new Display(regex);
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(s);
        while (m.find()) {
            d.display("find() '" + m.group() + "' start= " + m.start() + " end= " + m.end());
        }
        if (m.lookingAt()) {
            d.display("lookingAt() '" + m.group() + "' start= " + m.start() + " end= " + m.end());
        }
        if (m.matches()) {
            d.display("matches() '" + m.group() + "' start= " + m.start() + " end= " + m.end());
        }
    }

    public static void main(String[] args) {
        for (String in : input.split("\n")) {
            System.out.println("input :" + in);
            for (String regex : new String[]{"\\w*ere\\w*", "\\w*ever", "T\\w+", "Never.*?!"}) {
                examine(in, regex);
            }
        }
    }
}
output:
input :As long as there is injustice, whenever a
\w*ere\w*
find() 'there' start= 11 end= 16
\w*ever
find() 'whenever' start= 31 end= 39
input :Targathian baby cries out.wherever a distress
\w*ere\w*
find() 'wherever' start= 26 end= 34
\w*ever
find() 'wherever' start= 26 end= 34
T\w+
find() 'Targathian' start= 0 end= 10
lookingAt() 'Targathian' start= 0 end= 10
input :signal sounds among the stars ... We'll be there.
\w*ere\w*
find() 'there' start= 43 end= 48
input :This fine ship, and this fine crew ...
T\w+
find() 'This' start= 0 end= 4
lookingAt() 'This' start= 0 end= 4
input :Never give up! Never surrender!
\w*ever
find() 'Never' start= 0 end= 5
find() 'Never' start= 15 end= 20
lookingAt() 'Never' start= 0 end= 5
Never.*?!
find() 'Never give up!' start= 0 end= 14
find() 'Never surrender!' start= 15 end= 31
lookingAt() 'Never give up!' start= 0 end= 14
matches() 'Never give up! Never surrender!' start= 0 end= 31
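Summary:
Matcher.find(): looks for the next match anywhere in the input.
Matcher.lookingAt(): succeeds only if the match starts at the beginning of the input; the match does not have to reach the end.
Matcher.matches(): succeeds only if the entire input matches. String.matches() ends up calling it under the hood.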
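The three methods can also be compared side by side on very short inputs. This is a minimal sketch; the class name MatchModesSketch, the helper describe, and the sample strings are invented for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchModesSketch {
    // Hypothetical helper: reports which of the three methods succeed for regex on input.
    static String describe(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        boolean found = m.find();        // a match anywhere in the input
        // lookingAt() and matches() always start at the beginning of the region,
        // regardless of any earlier find() calls.
        boolean looking = m.lookingAt(); // a match anchored at the start only
        boolean whole = m.matches();     // a match covering the entire input
        return "find=" + found + " lookingAt=" + looking + " matches=" + whole;
    }

    public static void main(String[] args) {
        System.out.println(describe("\\d+", "abc 123")); // find=true lookingAt=false matches=false
        System.out.println(describe("\\d+", "123 abc")); // find=true lookingAt=true matches=false
        System.out.println(describe("\\d+", "123"));     // find=true lookingAt=true matches=true

        // String.matches() applies the same whole-input rule.
        System.out.println("123 abc".matches("\\d+"));   // false
        System.out.println("123".matches("\\d+"));       // true
    }
}
```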
3. Pattern flags (static constants of the Pattern class)
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FlagsDemo { // class name is illustrative
    public static void main(String[] args) {
        // Match "java" at the start of any line, ignoring case.
        Pattern p = Pattern.compile("^java",
            Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);
        Matcher m = p.matcher(
            "java has regex\nJava has regex\n" +
            "JAVA has pretty good regular expressions\n" +
            "Regular expressions are in Java");
        while (m.find()) {
            System.out.println(m.group(0)); // m.group() is equivalent
        }
    }
}
output:
java
Java
JAVA
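Summary: several Pattern flags can be combined with the bitwise OR operator |.
Pattern.CASE_INSENSITIVE, or (?i) inside the regex: letter case is ignored when matching.
Pattern.MULTILINE, or (?m) inside the regex: multiline mode, in which ^ and $ match at the start and end of every line.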
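Since each flag also has an embedded form, the same pattern can be written either way. The sketch below compares the two spellings; the class name FlagsSketch, the helper findAll, and the sample text are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FlagsSketch {
    // Hypothetical helper: collects every match of p in text.
    static List<String> findAll(Pattern p, String text) {
        List<String> out = new ArrayList<>();
        Matcher m = p.matcher(text);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "java one\nJava two\nnot JAVA";

        // Flags passed as constants, combined with |.
        Pattern byConstants = Pattern.compile("^java",
            Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);

        // The same two flags embedded at the start of the regex.
        Pattern embedded = Pattern.compile("(?im)^java");

        System.out.println(findAll(byConstants, text)); // [java, Java]
        System.out.println(findAll(embedded, text));    // [java, Java]
    }
}
```

The third line is not matched in either case because JAVA does not sit at the start of a line there.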