Java中的正则表达式Pattern与Matcher
一般来说比起功能有限的String类,我们更愿意构造功能强大的正则表达式。我们可以通过Pattern 与 Matcher 来构建功能强大的正则表达式
import java.io.File; import java.io.FileNotFoundException; import java.util.ArrayList; import java.util.Arrays; import java.util.List; import java.util.Scanner; import java.util.regex.Matcher; import java.util.regex.Pattern; public class Main { //public static String s = "I am a good student... haha good"; public static void main(String[] args) throws FileNotFoundException { Scanner input = new Scanner(new File("data.in")); Pattern tp = Pattern.compile(" "); System.out.println(Arrays.asList(tp.split("a b c"))); String s = "abcabcabcdefabc"; List<String> s1 = new ArrayList<String>(); s1.add("abcabcabcdefabc"); s1.add("abc"); s1.add("(abc)+"); s1.add("(abc){2,}"); for (String s2 : s1) { System.out.println("正则:" + s2); Pattern p = Pattern.compile(s2); //编译自己的正则返回Pattern对象 ,Pattern对象表示编译后的正则表达式 Matcher m = p.matcher(s); //传入要检索的字符串,返回Matcher对象 while (m.find()) { System.out.println("Match : " + m.group() + " " + m.start() + " " + m.end()); } } /* * Pattern对象还提供了matches方法返回是否匹配整个字符串 * split分割方法 * * Matcher 提供方法: * matches用来判断整个输入字符串是否匹配正则表达式 * lookingAt则用来判断该字符串的始部是否能够匹配模式 * find 遍历输入字符串 * */ } } 输出: [a, b, c] 正则:abcabcabcdefabc Match : abcabcabcdefabc 0 15 正则:abc Match : abc 0 3 Match : abc 3 6 Match : abc 6 9 Match : abc 12 15 正则:(abc)+ Match : abcabcabc 0 9 Match : abc 12 15 正则:(abc){2,} Match : abcabcabc 0 9
组: 关于组的操作
组号为0表示表示整个表达式, 组号为1表示第一对括号
import java.io.File; import java.io.FileNotFoundException; import java.util.ArrayList; import java.util.Arrays; import java.util.List; import java.util.Scanner; import java.util.regex.Matcher; import java.util.regex.Pattern; public class Main { public static String s = "I am a good student\n" + "fhhj hhh nnnn oj\n" + "very am you can\n" ; public static void main(String[] args) throws FileNotFoundException { Matcher m = Pattern.compile("(?m)(\\S+)\\s+((\\S+)\\s+(\\S+)\\s+)$").matcher(s); while (m.find()) { for (int i = 0; i <= m.groupCount(); ++i) { System.out.print("[" + m.group(i) + "]"); } System.out.println(); } } }
Pattern的split分割:
public class Main { public static String s = "I am a good student"; public static void main(String[] args) throws FileNotFoundException { System.out.println( Arrays.asList(Pattern.compile(" ").split(s))); } }
替换操作:
import java.io.File; import java.io.FileNotFoundException; import java.lang.reflect.Array; import java.util.ArrayList; import java.util.Arrays; import java.util.List; import java.util.Scanner; import java.util.regex.Matcher; import java.util.regex.Pattern; public class Main { public static String s = "/*! here's a block of text to use as input to" + "the ruguler expresssion mavherer . note that we'll"+ "extracted block.!*/"; public static void main(String[] args) throws FileNotFoundException { Matcher m = Pattern.compile("/\\*!(.*)!\\*/").matcher(s); String ans = null; while (m.find()) { ans = m.group(1); } System.out.println("去掉注释符之后:" + ans); ans = ans.replaceAll(" {2,}", " "); System.out.println("去掉多余空格之后:" + ans); ans = ans.replaceAll("(?m)^ +", ""); System.out.println("去掉前边的空格之后:" + ans); String tmp = ans.replaceAll("[aeiou]", "HAHA"); System.out.println("替换之后:" + tmp); StringBuffer sb = new StringBuffer(); Pattern p = Pattern.compile("[aeiou]"); Matcher tm = p.matcher(ans); while (tm.find()) { /* * 首先把要发生替换的部分到字符串开始的地方都复制给sb * 这里我们能够更加灵活的对发生替换的部分进行处理,我们这里的处理是转化成大写字母 */ tm.appendReplacement(sb, tm.group().toUpperCase()); //System.out.println("SB = " + sb); 可以自己输出看一看 } tm.appendTail(sb); //把末尾的部分加上 System.out.println("替换之后:" + sb); } } 输出: 去掉注释符之后: here's a block of text to use as input tothe ruguler expresssion mavherer . note that we'llextracted block. 去掉多余空格之后: here's a block of text to use as input tothe ruguler expresssion mavherer . note that we'llextracted block. 去掉前边的空格之后:here's a block of text to use as input tothe ruguler expresssion mavherer . note that we'llextracted block. 替换之后:hHAHArHAHA's HAHA blHAHAck HAHAf tHAHAxt tHAHA HAHAsHAHA HAHAs HAHAnpHAHAt tHAHAthHAHA rHAHAgHAHAlHAHAr HAHAxprHAHAsssHAHAHAHAn mHAHAvhHAHArHAHAr . nHAHAtHAHA thHAHAt wHAHA'llHAHAxtrHAHActHAHAd blHAHAck. 替换之后:hErE's A blOck Of tExt tO UsE As InpUt tOthE rUgUlEr ExprEsssIOn mAvhErEr . nOtE thAt wE'llExtrActEd blOck.
我们可以用reset,将现有的Matcher对象应用于一个新的字符序列
public class Main { public static void main(String[] args) throws FileNotFoundException { List<String> list = new ArrayList<String>(); list.add("string"); list.add("Scdvffv njnjn"); list.add("test grgrg"); list.add("common gfgrg"); Pattern p = Pattern.compile("^[Ssct]\\w+"); Matcher m = p.matcher(""); for (String s : list) { m.reset(s); while (m.find()) { //首先输出匹配字符串, 查看开始部分是否满足, 查看整个字符串是否满足 System.out.println(m.group() + " " + m.lookingAt() +" " + m.matches()); } } } }
输出:
string true true
Scdvffv true false
test true false
common true false