Differences between StringUtils.split() in commons-lang and the JDK's built-in split()
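StringUtils.split() can only split on single characters. Even when you call StringUtils.split(String str, String separatorChars) and pass a multi-character string as separatorChars, that string is not treated as one delimiter: the input is split on every individual character it contains. For example: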
import org.apache.commons.lang.StringUtils;

public class Test {
    public static void main(String[] args) {
        String s = "0012001230010023001230023";

        long time1 = System.currentTimeMillis();
        // JDK split: "123" is a regex that matches the whole sequence 1-2-3
        String[] split1 = s.split("123");
        long time2 = System.currentTimeMillis();
        // commons-lang split: '1', '2' and '3' are each a separator
        String[] split2 = StringUtils.split(s, "123");
        long time3 = System.currentTimeMillis();

        System.out.println("Time 1>>>." + (time2 - time1));
        System.out.println("Time 2>>>." + (time3 - time2));

        for (int i = 0, len = split1.length; i < len; i++) {
            System.out.print(split1[i]);
            System.out.print("-");
        }
        System.out.println("");
        for (int i = 0, len = split2.length; i < len; i++) {
            System.out.print(split2[i]);
            System.out.print("-");
        }
    }
}

Output (the timings are in milliseconds for a single call each, so they probably reflect one-time costs such as class loading more than steady-state speed):
Time 1>>>.2
Time 2>>>.17
001200-001002300-0023-
00-00-00-00-00-00-
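The same contrast can be reproduced with the JDK alone, which may help if commons-lang is not on the classpath. The sketch below is only an approximation: the class and method names are made up for illustration, and the character class "[123]+" is an assumed stand-in for the behavior of StringUtils.split(s, "123"), not how the library implements it.

```java
import java.util.Arrays;

class JdkSplitComparison {
    // Treats "123" as one whole delimiter (a regex with no metacharacters).
    static String[] splitOnWholeString(String s) {
        return s.split("123");
    }

    // Treats each of '1', '2', '3' as a delimiter and collapses adjacent ones,
    // which approximates what StringUtils.split(s, "123") does.
    static String[] splitOnAnyChar(String s) {
        return s.split("[123]+");
    }

    public static void main(String[] args) {
        String s = "0012001230010023001230023";
        System.out.println(Arrays.toString(splitOnWholeString(s))); // [001200, 001002300, 0023]
        System.out.println(Arrays.toString(splitOnAnyChar(s)));     // [00, 00, 00, 00, 00, 00]
    }
}
```

The approximation is not exact: when the input starts with a separator, String.split returns a leading empty string ("1a".split("[123]+") yields ["", "a"]), whereas StringUtils.split drops it. For splitting on a whole multi-character delimiter inside commons-lang, StringUtils.splitByWholeSeparator is the method to look at.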
Also: StringUtils.split() is null-safe. Passing a null string returns null instead of throwing a NullPointerException.
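To make the null-safety point concrete, here is a minimal JDK-only sketch. The helper splitOrNull is hypothetical (it is not part of commons-lang or the JDK); it simply guards String.split with a null check like the one StringUtils.split performs, and the main method shows what happens without such a guard.

```java
class NullSafeSplit {
    // Hypothetical helper: returns null for null input instead of throwing.
    static String[] splitOrNull(String s, String regex) {
        if (s == null) {
            return null;
        }
        return s.split(regex);
    }

    public static void main(String[] args) {
        String s = null;
        // The guarded call returns null quietly.
        System.out.println(splitOrNull(s, "123") == null);
        try {
            // The plain JDK call dereferences null and throws.
            s.split("123");
        } catch (NullPointerException e) {
            System.out.println("String.split on a null reference throws NullPointerException");
        }
    }
}
```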
Appendix: the source of splitWorker, the private method that implements StringUtils.split(String str, String separatorChars, int max):
private static String[] splitWorker(String str, String separatorChars, int max, boolean preserveAllTokens) {
    // Performance tuned for 2.0 (JDK1.4)
    // Direct code is quicker than StringTokenizer.
    // Also, StringTokenizer uses isSpace() not isWhitespace()
    if (str == null) {
        return null;
    }
    int len = str.length();
    if (len == 0) {
        return ArrayUtils.EMPTY_STRING_ARRAY;
    }
    List list = new ArrayList();
    int sizePlus1 = 1;
    int i = 0, start = 0;
    boolean match = false;
    boolean lastMatch = false;
    if (separatorChars == null) {
        // StringUtils.split(String str) ends up in this branch by default and splits on whitespace
        while (i < len) {
            if (Character.isWhitespace(str.charAt(i))) {
                if (match || preserveAllTokens) { // entered when whitespace is found: check ①
                    lastMatch = true;
                    if (sizePlus1++ == max) {
                        i = len;
                        lastMatch = false;
                    }
                    list.add(str.substring(start, i)); // add the extracted token to the list
                    match = false;
                }
                start = ++i; // the next token starts at the following character
                continue;
            }
            lastMatch = false;
            match = true; // not a separator: set to true so that the next separator passes check ①
            i++;
        }
    } else if (separatorChars.length() == 1) {
        // Single separator character: same logic as the whitespace branch above
        char sep = separatorChars.charAt(0);
        while (i < len) {
            if (str.charAt(i) == sep) {
                if (match || preserveAllTokens) {
                    lastMatch = true;
                    if (sizePlus1++ == max) {
                        i = len;
                        lastMatch = false;
                    }
                    list.add(str.substring(start, i));
                    match = false;
                }
                start = ++i;
                continue;
            }
            lastMatch = false;
            match = true;
            i++;
        }
    } else {
        // standard case
        while (i < len) {
            // If the i-th character of str occurs anywhere in separatorChars, split here
            // (this is exactly where the behavior differs from the JDK's split)
            if (separatorChars.indexOf(str.charAt(i)) >= 0) {
                if (match || preserveAllTokens) {
                    lastMatch = true;
                    if (sizePlus1++ == max) {
                        i = len;
                        lastMatch = false;
                    }
                    list.add(str.substring(start, i));
                    match = false;
                }
                start = ++i;
                continue;
            }
            lastMatch = false;
            match = true;
            i++;
        }
    }
    if (match || (preserveAllTokens && lastMatch)) {
        list.add(str.substring(start, i));
    }
    return (String[]) list.toArray(new String[list.size()]);
}
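The core of the "standard case" branch is easier to follow once the max and preserveAllTokens handling is stripped away. The following is a simplified sketch written for illustration, not the library code: the class name and the inToken variable are made up, generics replace the raw List, and it assumes separatorChars is non-null.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CharSetSplitSketch {
    // Simplified version of the "standard case": every character of
    // separatorChars is a separator, and adjacent separators are collapsed.
    // Assumes separatorChars is non-null; no max limit, no preserveAllTokens.
    static String[] split(String str, String separatorChars) {
        if (str == null) {
            return null;
        }
        List<String> tokens = new ArrayList<>();
        int start = 0;
        boolean inToken = false; // plays the role of "match" in splitWorker
        for (int i = 0; i < str.length(); i++) {
            if (separatorChars.indexOf(str.charAt(i)) >= 0) {
                if (inToken) {
                    // A separator ends the token that was in progress.
                    tokens.add(str.substring(start, i));
                    inToken = false;
                }
                start = i + 1; // the next token can start after this separator
            } else {
                inToken = true;
            }
        }
        if (inToken) {
            // Flush the last token if the string does not end with a separator.
            tokens.add(str.substring(start));
        }
        return tokens.toArray(new String[0]);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("0012001230010023001230023", "123"))); // [00, 00, 00, 00, 00, 00]
        System.out.println(Arrays.toString(split("123a123", "123")));                   // [a]
    }
}
```

Because a token is only emitted when inToken is true, leading, trailing, and repeated separators never produce empty strings, which matches the six "00" tokens in the run above.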