Java Source Code Study -- java.lang.String
java.lang.String is one of the most frequently used classes in Java, and reading its source code is a good way to learn to use it well. It naturally leads on to java.lang.StringBuffer and java.lang.StringBuilder, which the next article will cover.
Important Fields
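A java.lang.String object stores its text mainly as a character array. length() simply returns the array length, and isEmpty() simply checks whether the array length is 0. The excerpts in this article correspond to the char[]-based implementation (as in JDK 8); from JDK 9 onward the class stores a byte[] plus an encoding flag instead. The relevant part of the source is shown below: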
    /** The value is used for character storage. */
    private final char value[];

    /** Cache the hash code for the string */
    private int hash; // Default to 0
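value: the character array that stores the string. The field is final, so once it is assigned in a constructor the reference can never point to another array. String also never modifies the array's contents or hands the array out, which is what makes the object immutable.
hash: the cached hash code of this String. It defaults to 0 and is filled in the first time hashCode() is called.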
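To make the role of the value field concrete, here is a minimal sketch of an immutable wrapper around a char array. SimpleString and its methods are hypothetical names invented for illustration, not part of the JDK, and the sketch leaves out the hash cache.

```java
import java.util.Arrays;

// Hypothetical, simplified model of String's storage; not the JDK class.
final class SimpleString {
    // Final reference: assigned once in the constructor, never reassigned.
    private final char[] value;

    SimpleString(char[] chars) {
        // Defensive copy, so later changes to the caller's array cannot leak in.
        this.value = Arrays.copyOf(chars, chars.length);
    }

    int length() {
        return value.length;        // the length is just the array length
    }

    boolean isEmpty() {
        return value.length == 0;   // empty means a zero-length array
    }

    char charAt(int index) {
        return value[index];
    }

    public static void main(String[] args) {
        char[] chars = {'a', 'b', 'c'};
        SimpleString s = new SimpleString(chars);
        chars[0] = 'z';                       // does not affect s
        System.out.println(s.charAt(0));      // prints a
        System.out.println(s.length());       // prints 3
        System.out.println(s.isEmpty());      // prints false
    }
}
```

Note that final alone only protects the reference. The immutability comes from the copy in the constructor together with the absence of any method that writes to or exposes the array.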
Constructors
java.lang.String has quite a few constructors, listed below:
    public String()
    public String(String original)
    public String(char value[])
    public String(char value[], int offset, int count)
    public String(int[] codePoints, int offset, int count)
    @Deprecated
    public String(byte ascii[], int hibyte, int offset, int count)
    @Deprecated
    public String(byte ascii[], int hibyte)
    public String(byte bytes[], int offset, int length, String charsetName) throws UnsupportedEncodingException
    public String(byte bytes[], int offset, int length, Charset charset)
    public String(byte bytes[], String charsetName) throws UnsupportedEncodingException
    public String(byte bytes[], Charset charset)
    public String(byte bytes[], int offset, int length)
    public String(byte bytes[])
    public String(StringBuffer buffer)
    public String(StringBuilder builder)
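public String(StringBuffer buffer) takes a StringBuffer, which is a thread-safe class. The constructor therefore copies the buffer's contents inside a synchronized block that locks on the buffer, so no other thread can modify it during the copy. The code is as follows: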
    public String(StringBuffer buffer) {
        synchronized(buffer) {
            this.value = Arrays.copyOf(buffer.getValue(), buffer.length());
        }
    }
public String(StringBuilder builder) takes a StringBuilder, which is not thread-safe. In contrast to public String(StringBuffer buffer), this constructor does no synchronization. The code is as follows:
    public String(StringBuilder builder) {
        this.value = Arrays.copyOf(builder.getValue(), builder.length());
    }
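Both constructors copy the builder's characters rather than sharing its array, so the resulting String is unaffected by later changes to the builder. The following sketch illustrates this with the public API only; the class and method names are made up for the example.

```java
// Illustrative demo; BuilderCopyDemo and its helpers are hypothetical names.
final class BuilderCopyDemo {
    // Takes a snapshot of a StringBuilder via the String(StringBuilder) constructor.
    static String snapshot(StringBuilder builder) {
        return new String(builder);
    }

    // Takes a snapshot of a StringBuffer via the String(StringBuffer) constructor.
    static String snapshot(StringBuffer buffer) {
        return new String(buffer);
    }

    public static void main(String[] args) {
        StringBuilder builder = new StringBuilder("abc");
        StringBuffer buffer = new StringBuffer("abc");

        String fromBuilder = snapshot(builder);
        String fromBuffer = snapshot(buffer);

        // Mutate the sources after the strings were created.
        builder.append("def");
        buffer.reverse();

        System.out.println(fromBuilder); // prints abc
        System.out.println(fromBuffer);  // prints abc
    }
}
```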
Common Methods
java.lang.String has a great many methods; only the source of commonly used ones is analyzed here, such as equals(), replace(), indexOf(), startsWith(), compareTo(), regionMatches() and hashCode().
public boolean equals(Object anObject)
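Compares whether two objects hold the same content. The checks are ordered from cheapest to most expensive: (1) compare the two references with "=="; if they are the same object, return true immediately, otherwise continue; (2) check whether the other object is a java.lang.String; if not, return false, otherwise continue; (3) check whether the two strings have the same length; if not, return false, otherwise continue; (4) compare the character arrays element by element starting from the first character, returning false at the first mismatch and true if all characters match. The character-by-character comparison is the expensive part, so it is placed last and the cheaper checks screen out most cases first.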
    public boolean equals(Object anObject) {
        if (this == anObject) {
            return true;
        }
        if (anObject instanceof String) {
            String anotherString = (String)anObject;
            int n = value.length;
            if (n == anotherString.value.length) {
                char v1[] = value;
                char v2[] = anotherString.value;
                int i = 0;
                while (n-- != 0) {
                    if (v1[i] != v2[i])
                        return false;
                    i++;
                }
                return true;
            }
        }
        return false;
    }
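The same cheapest-first ordering can be sketched on plain char arrays. This is a simplified illustration, not the JDK code: the helper name sameChars is hypothetical, and a null check stands in for the instanceof test because the arguments are already typed.

```java
// Hypothetical helper illustrating the check ordering used by equals().
final class EqualsSketch {
    static boolean sameChars(char[] a, char[] b) {
        if (a == b) {
            return true;                 // 1. same array: nothing to compare
        }
        if (a == null || b == null) {
            return false;                // 2. stand-in for the type check
        }
        if (a.length != b.length) {
            return false;                // 3. different lengths can never match
        }
        for (int i = 0; i < a.length; i++) {
            if (a[i] != b[i]) {
                return false;            // 4. stop at the first mismatch
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(sameChars("abc".toCharArray(), "abc".toCharArray())); // prints true
        System.out.println(sameChars("abc".toCharArray(), "abd".toCharArray())); // prints false
        System.out.println(sameChars("abc".toCharArray(), "ab".toCharArray()));  // prints false
    }
}
```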
public String replace(char oldChar, char newChar)
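Replaces every occurrence of a given character with a new character. (1) First check whether the old and new characters are the same; if they are, return the original string, otherwise continue. (2) Find the position i of the first occurrence of the old character; if there is none, return the original string without allocating anything. Otherwise create a new character array of the same length and copy the characters before position i into it. (3) From position i onward, walk through the original array: where the character equals the old character, store the new character at that position in the new array; otherwise store the original character. The small optimization is that the comparison loop starts at the first occurrence of the old character, so the prefix before it is copied without any per-character test.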
    public String replace(char oldChar, char newChar) {
        if (oldChar != newChar) {
            int len = value.length;
            int i = -1;
            char[] val = value; /* avoid getfield opcode */

            while (++i < len) {
                if (val[i] == oldChar) {
                    break;
                }
            }
            if (i < len) {
                char buf[] = new char[len];
                for (int j = 0; j < i; j++) {
                    buf[j] = val[j];
                }
                while (i < len) {
                    char c = val[i];
                    buf[i] = (c == oldChar) ? newChar : c;
                    i++;
                }
                return new String(buf, true);
            }
        }
        return this;
    }
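The three steps can be sketched on a plain char array as follows. The helper replaceChar is a hypothetical name, and the sketch uses System.arraycopy for the prefix copy where the JDK excerpt uses a loop; it is an illustration of the idea rather than the library's implementation.

```java
// Hypothetical helper illustrating the "find first match, then copy" idea.
final class ReplaceSketch {
    static char[] replaceChar(char[] src, char oldChar, char newChar) {
        if (oldChar == newChar) {
            return src;                       // step 1: nothing would change
        }
        int first = 0;
        while (first < src.length && src[first] != oldChar) {
            first++;                          // step 2: locate the first occurrence
        }
        if (first == src.length) {
            return src;                       // no occurrence: reuse the input, no allocation
        }
        char[] out = new char[src.length];
        System.arraycopy(src, 0, out, 0, first); // prefix is copied without comparisons
        for (int i = first; i < src.length; i++) {
            out[i] = (src[i] == oldChar) ? newChar : src[i]; // step 3
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(new String(replaceChar("banana".toCharArray(), 'a', 'o'))); // prints bonono
    }
}
```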
public String replace(CharSequence target, CharSequence replacement)
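This is the overload that runs when we call replace with two String arguments, as in replace(String target, String replacement): there is no separate String overload, and the call works because java.lang.String implements the java.lang.CharSequence interface. Internally the method delegates to the regular expression engine, but it compiles the target with Pattern.LITERAL and quotes the replacement, so both are treated as literal text rather than as regex syntax.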
    public String replace(CharSequence target, CharSequence replacement) {
        return Pattern.compile(target.toString(), Pattern.LITERAL).matcher(
                this).replaceAll(Matcher.quoteReplacement(replacement.toString()));
    }
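The practical consequence of the literal treatment is easiest to see next to replaceAll, which does interpret its first argument as a regular expression. The demo class and its wrapper methods below are hypothetical names used only for this comparison.

```java
// Illustrative comparison of literal replace() and regex-based replaceAll().
final class LiteralReplaceDemo {
    static String withReplace(String text, String target, String replacement) {
        return text.replace(target, replacement);      // target and replacement are literal
    }

    static String withReplaceAll(String text, String regex, String replacement) {
        return text.replaceAll(regex, replacement);    // first argument is a regex
    }

    public static void main(String[] args) {
        // "." is an ordinary character for replace().
        System.out.println(withReplace("a.b.c", ".", "-"));     // prints a-b-c
        // "." matches any character for replaceAll().
        System.out.println(withReplaceAll("a.b.c", ".", "-"));  // prints -----
        // "$1" is plain text for replace(), not a group reference.
        System.out.println(withReplace("cost: X", "X", "$1"));  // prints cost: $1
    }
}
```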
public int indexOf(String str)
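This method finds the position at which the given substring first occurs in this string, returning that index if it exists and -1 otherwise. It is implemented by calling static int indexOf(char[] source, int sourceOffset, int sourceCount, char[] target, int targetOffset, int targetCount, int fromIndex). That method first works out the range of positions at which the substring could possibly start (the last candidate is max = sourceOffset + (sourceCount - targetCount)), then scans that range for a position matching the substring's first character, and from there compares the remaining characters one by one.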
    /**
     * Returns the index within this string of the first occurrence of the
     * specified substring.
     */
    public int indexOf(String str) {
        return indexOf(str, 0);
    }

    /**
     * Returns the index within this string of the first occurrence of the
     * specified substring, starting at the specified index.
     */
    public int indexOf(String str, int fromIndex) {
        return indexOf(value, 0, value.length,
                str.value, 0, str.value.length, fromIndex);
    }

    /**
     * Code shared by String and StringBuffer to do searches. The
     * source is the character array being searched, and the target
     * is the string being searched for.
     *
     * @param   source       the characters being searched.
     * @param   sourceOffset offset of the source string.
     * @param   sourceCount  count of the source string.
     * @param   target       the characters being searched for.
     * @param   targetOffset offset of the target string.
     * @param   targetCount  count of the target string.
     * @param   fromIndex    the index to begin searching from.
     */
    static int indexOf(char[] source, int sourceOffset, int sourceCount,
            char[] target, int targetOffset, int targetCount,
            int fromIndex) {
        if (fromIndex >= sourceCount) {
            return (targetCount == 0 ? sourceCount : -1);
        }
        if (fromIndex < 0) {
            fromIndex = 0;
        }
        if (targetCount == 0) {
            return fromIndex;
        }

        char first = target[targetOffset];
        int max = sourceOffset + (sourceCount - targetCount);

        for (int i = sourceOffset + fromIndex; i <= max; i++) {
            /* Look for first character. */
            if (source[i] != first) {
                while (++i <= max && source[i] != first);
            }

            /* Found first character, now look at the rest of v2 */
            if (i <= max) {
                int j = i + 1;
                int end = j + targetCount - 1;
                for (int k = targetOffset + 1; j < end && source[j]
                        == target[k]; j++, k++);

                if (j == end) {
                    /* Found whole string. */
                    return i - sourceOffset;
                }
            }
        }
        return -1;
    }
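Stripped of the offset parameters, the search reduces to a straightforward scan over the candidate start positions. The sketch below is a simplified illustration under that assumption; the class and method are hypothetical names, and it folds the first-character scan into the general comparison loop rather than reproducing the JDK's structure.

```java
// Hypothetical, simplified substring search without the offset parameters.
final class IndexOfSketch {
    static int indexOf(char[] source, char[] target, int fromIndex) {
        if (fromIndex < 0) {
            fromIndex = 0;
        }
        if (target.length == 0) {
            // An empty target matches at fromIndex, clamped to the source length.
            return Math.min(fromIndex, source.length);
        }
        // Last start position at which the target still fits in the source.
        int max = source.length - target.length;
        for (int i = fromIndex; i <= max; i++) {
            int k = 0;
            while (k < target.length && source[i + k] == target[k]) {
                k++;
            }
            if (k == target.length) {
                return i;               // every character matched
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        char[] text = "hello world".toCharArray();
        System.out.println(indexOf(text, "world".toCharArray(), 0)); // prints 6
        System.out.println(indexOf(text, "lo".toCharArray(), 4));    // prints -1
    }
}
```

Limiting the loop to max means the inner comparison can never read past the end of the source array, so no extra bounds check is needed there.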
public int compareTo(String anotherString)
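This method is the basis for sorting collections of strings. It compares two strings lexicographically: it returns the difference between the first pair of characters that differ, or, if one string is a prefix of the other, the difference between their lengths. The source code is as follows: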
    public int compareTo(String anotherString) {
        int len1 = value.length;
        int len2 = anotherString.value.length;
        int lim = Math.min(len1, len2);
        char v1[] = value;
        char v2[] = anotherString.value;

        int k = 0;
        while (k < lim) {
            char c1 = v1[k];
            char c2 = v2[k];
            if (c1 != c2) {
                return c1 - c2;
            }
            k++;
        }
        return len1 - len2;
    }
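A few concrete calls show what the return value means and how sorting builds on it. The demo class and its sortedCopy helper are hypothetical names; everything else is the standard API.

```java
import java.util.Arrays;

// Illustrative demo of compareTo() results and natural-order sorting.
final class CompareToDemo {
    // Returns a sorted copy; Arrays.sort uses String.compareTo for String arrays.
    static String[] sortedCopy(String... words) {
        String[] copy = words.clone();
        Arrays.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        // First differing characters: 'a' (97) minus 'b' (98).
        System.out.println("apple".compareTo("banana")); // prints -1
        // One string is a prefix of the other: length 3 minus length 4.
        System.out.println("abc".compareTo("abcd"));     // prints -1
        // Uppercase letters have smaller code units than lowercase ones: 'Z' (90) minus 'a' (97).
        System.out.println("Zebra".compareTo("apple"));  // prints -7

        System.out.println(Arrays.toString(sortedCopy("banana", "apple", "Zebra", "app")));
        // prints [Zebra, app, apple, banana]
    }
}
```

Because the comparison works on raw character values, uppercase letters sort before lowercase ones; case-insensitive ordering needs String.CASE_INSENSITIVE_ORDER or a Collator instead.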
public boolean startsWith(String prefix)
Tests whether this string starts with the given substring. Internally the method simply calls public boolean startsWith(String prefix, int toffset) with an offset of 0. The code is as follows:
    /**
     * Tests if this string starts with the specified prefix.
     *
     * @param   prefix   the prefix.
     */
    public boolean startsWith(String prefix) {
        return startsWith(prefix, 0);
    }
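The two-argument overload is not reproduced in this article, so the following is only a sketch of what such an offset-based prefix test can look like on plain char arrays. The class and method names are hypothetical and the code is not copied from the JDK.

```java
// Hypothetical sketch of an offset-based prefix test.
final class StartsWithSketch {
    static boolean startsWith(char[] text, char[] prefix, int offset) {
        // The prefix must fit entirely between offset and the end of the text.
        if (offset < 0 || offset > text.length - prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (text[offset + i] != prefix[i]) {
                return false;           // mismatch inside the prefix
            }
        }
        return true;
    }

    public static void main(String[] args) {
        char[] text = "hello".toCharArray();
        System.out.println(startsWith(text, "he".toCharArray(), 0)); // prints true
        System.out.println(startsWith(text, "ll".toCharArray(), 2)); // prints true
        System.out.println(startsWith(text, "lo".toCharArray(), 4)); // prints false
    }
}
```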
public int hashCode()
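The hash code is computed as s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1] using int arithmetic, where n is the length of the string. The result is cached in the hash field, so the loop runs only on the first call; a string whose computed hash happens to be 0 is the exception and is recomputed on every call. The code of hashCode() is as follows: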
    public int hashCode() {
        int h = hash;
        if (h == 0 && value.length > 0) {
            char val[] = value;

            for (int i = 0; i < value.length; i++) {
                h = 31 * h + val[i];
            }
            hash = h;
        }
        return h;
    }
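The loop h = 31 * h + val[i] can be replayed by hand to check the result against the library. The helper polynomialHash below is a hypothetical name for that replay, without the caching.

```java
// Hypothetical helper that recomputes the hash without caching.
final class HashCodeSketch {
    static int polynomialHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);   // int overflow wraps around, as in the original
        }
        return h;
    }

    public static void main(String[] args) {
        // 'a' = 97, 'b' = 98, 'c' = 99:
        // h = 97, then 31 * 97 + 98 = 3105, then 31 * 3105 + 99 = 96354
        System.out.println(polynomialHash("abc")); // prints 96354
        System.out.println("abc".hashCode());      // prints 96354
    }
}
```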