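Source Analysis of Selected Integer and Long Methods
Integer and Long are used very widely in Java. This post mainly analyzes Integer.toString(int i) and Long.toString(long i); the other methods are comparatively easy to understand.
Integer.toString(int i) and Long.toString(long i): taking Integer.toString(int i) as the example, first look at the source: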
/**
 * Returns a {@code String} object representing the
 * specified integer. The argument is converted to signed decimal
 * representation and returned as a string, exactly as if the
 * argument and radix 10 were given as arguments to the {@link
 * #toString(int, int)} method.
 *
 * @param   i   an integer to be converted.
 * @return  a string representation of the argument in base 10.
 */
public static String toString(int i) {
    if (i == Integer.MIN_VALUE)
        return "-2147483648";
    int size = (i < 0) ? stringSize(-i) + 1 : stringSize(i);
    char[] buf = new char[size];
    getChars(i, size, buf);
    return new String(buf, true);
}
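stringSize is called to compute the length of i, that is, its number of digits (plus one character for the minus sign when i is negative), so that a character array buf of exactly the right size can be allocated. getChars is then called to fill in buf.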
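The special case for Integer.MIN_VALUE in the source above is needed because both stringSize(-i) and getChars rely on negating i, and negating Integer.MIN_VALUE overflows back to itself. The following is a minimal sketch of that behaviour; the class and method names are made up for illustration and are not part of the JDK.

```java
// Hypothetical demo class, not JDK code.
class MinValueNegation {
    /** Returns true when the magnitude of i can be represented as a positive int. */
    static boolean canNegate(int i) {
        // For non-negative i nothing needs negating; for negative i, -i must be positive.
        return i >= 0 || -i > 0;
    }

    public static void main(String[] args) {
        System.out.println(canNegate(-42));                          // ordinary negative value
        System.out.println(canNegate(Integer.MIN_VALUE));            // the one value that cannot be negated
        System.out.println(-Integer.MIN_VALUE == Integer.MIN_VALUE); // two's-complement overflow
        System.out.println(Integer.toString(Integer.MIN_VALUE));     // handled by the hard-coded string
    }
}
```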
stringSize is implemented differently in Integer and in Long. First, the source.
Integer.stringSize(int x) source:
final static int [] sizeTable = { 9, 99, 999, 9999, 99999, 999999, 9999999,
                                  99999999, 999999999, Integer.MAX_VALUE };

// Requires positive x
static int stringSize(int x) {
    for (int i=0; ; i++)
        if (x <= sizeTable[i])
            return i+1;
}
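The thresholds are stored in an array: the index of the first entry that is greater than or equal to x, plus 1, is the length of x. With this design a single loop of comparisons yields the length, with no division at all, so it is efficient.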
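The table lookup can be tried in isolation. The sketch below copies the logic into a stand-alone class (the class name and the constant name SIZE_TABLE are my own) and compares the result with the length of the string that the JDK produces.

```java
// Stand-alone copy of the table-lookup idea; the class name is hypothetical.
class StringSizeSketch {
    static final int[] SIZE_TABLE = {9, 99, 999, 9999, 99999, 999999, 9999999,
                                     99999999, 999999999, Integer.MAX_VALUE};

    /** Number of decimal digits of x; requires x >= 0. */
    static int stringSize(int x) {
        // The last entry is Integer.MAX_VALUE, so the loop always terminates.
        for (int i = 0; ; i++) {
            if (x <= SIZE_TABLE[i]) {
                return i + 1;
            }
        }
    }

    public static void main(String[] args) {
        int[] samples = {0, 9, 10, 99, 100, 65536, Integer.MAX_VALUE};
        for (int x : samples) {
            // Both columns should agree for every sample.
            System.out.println(x + " -> " + stringSize(x) + " / " + String.valueOf(x).length());
        }
    }
}
```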
Long.stringSize(long x) source:
// Requires positive x
static int stringSize(long x) {
    long p = 10;
    for (int i=1; i<19; i++) {
        if (x < p)
            return i;
        p = 10*p;
    }
    return 19;
}
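Because the longest decimal representation of a Long has 19 digits, the length is found by repeatedly multiplying by 10. You may ask why it does not use the same approach as Integer.stringSize(int x); I have not found a satisfactory explanation for that either.
The traditional approach would probably be to divide by 10 repeatedly, but that is less efficient, because computers generally multiply faster than they divide.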
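To make the comparison concrete, here is a small sketch that puts the multiplication-based loop next to the "traditional" division-based one and checks that they agree. The class and method names are invented for this example, and it makes no timing claims; it only shows that both compute the same length. The loop stops at 18 comparisons because 10^19 does not fit in a long, so any value that passes all of them must have 19 digits.

```java
// Hypothetical comparison class, not JDK code.
class LongSizeSketch {
    /** Digit count via repeated multiplication, mirroring Long.stringSize; requires x >= 0. */
    static int sizeByMultiply(long x) {
        long p = 10;
        for (int i = 1; i < 19; i++) {
            if (x < p) {
                return i;
            }
            p = 10 * p; // the final multiplication may overflow, but p is not used afterwards
        }
        return 19;
    }

    /** Digit count via repeated division, the "traditional" approach; requires x >= 0. */
    static int sizeByDivide(long x) {
        int n = 1;
        while (x >= 10) {
            x /= 10;
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        long[] samples = {0L, 9L, 10L, 123456789012L, Long.MAX_VALUE};
        for (long x : samples) {
            System.out.println(x + " -> " + sizeByMultiply(x) + " / " + sizeByDivide(x));
        }
    }
}
```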
getChars(int i, int index, char[] buf) source:
/**
 * Places characters representing the integer i into the
 * character array buf. The characters are placed into
 * the buffer backwards starting with the least significant
 * digit at the specified index (exclusive), and working
 * backwards from there.
 *
 * Will fail if i == Integer.MIN_VALUE
 */
static void getChars(int i, int index, char[] buf) {
    int q, r;
    int charPos = index;
    char sign = 0;

    if (i < 0) {
        sign = '-';
        i = -i;
    }

    // Generate two digits per iteration
    while (i >= 65536) {
        q = i / 100;
    // really: r = i - (q * 100);
        r = i - ((q << 6) + (q << 5) + (q << 2));
        i = q;
        buf [--charPos] = DigitOnes[r];
        buf [--charPos] = DigitTens[r];
    }

    // Fall thru to fast mode for smaller numbers
    // assert(i <= 65536, i);
    for (;;) {
        q = (i * 52429) >>> (16+3);
        r = i - ((q << 3) + (q << 1));  // r = i-(q*10) ...
        buf [--charPos] = digits [r];
        i = q;
        if (i == 0) break;
    }
    if (sign != 0) {
        buf [--charPos] = sign;
    }
}
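This is the core of the whole conversion. It first determines the sign. Then, while i >= 65536, it divides i by 100 and uses DigitOnes[r] and DigitTens[r] to obtain the ones and tens digits of the remainder r. Because division is slow, dividing by 100 and emitting two digits per iteration improves efficiency. DigitOnes and DigitTens are defined as follows: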
final static char [] DigitTens = {
    '0', '0', '0', '0', '0', '0', '0', '0', '0', '0',
    '1', '1', '1', '1', '1', '1', '1', '1', '1', '1',
    '2', '2', '2', '2', '2', '2', '2', '2', '2', '2',
    '3', '3', '3', '3', '3', '3', '3', '3', '3', '3',
    '4', '4', '4', '4', '4', '4', '4', '4', '4', '4',
    '5', '5', '5', '5', '5', '5', '5', '5', '5', '5',
    '6', '6', '6', '6', '6', '6', '6', '6', '6', '6',
    '7', '7', '7', '7', '7', '7', '7', '7', '7', '7',
    '8', '8', '8', '8', '8', '8', '8', '8', '8', '8',
    '9', '9', '9', '9', '9', '9', '9', '9', '9', '9',
    } ;

final static char [] DigitOnes = {
    '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
    '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
    '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
    '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
    '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
    '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
    '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
    '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
    '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
    '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
    } ;
Suppose r=34: the table lookup gives DigitOnes[r]='4' and DigitTens[r]='3'.
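The two-digits-per-iteration idea can be sketched on its own. The version below is a simplified illustration, not the JDK code: it builds the two tables in a loop instead of writing them out, handles only non-negative values, and keeps dividing by 100 all the way down rather than switching to the multiply-and-shift path below 65536. All names are hypothetical.

```java
// Simplified illustration of the lookup-table technique; not the JDK implementation.
class TwoDigitLookup {
    static final char[] DIGIT_TENS = new char[100];
    static final char[] DIGIT_ONES = new char[100];

    static {
        // Entry r holds the tens and ones characters of the two-digit number r.
        for (int r = 0; r < 100; r++) {
            DIGIT_TENS[r] = (char) ('0' + r / 10);
            DIGIT_ONES[r] = (char) ('0' + r % 10);
        }
    }

    /** Formats a non-negative int, writing two digits per division by 100. */
    static String format(int i) {
        char[] buf = new char[10]; // an int has at most 10 decimal digits
        int pos = buf.length;
        while (i >= 100) {
            int q = i / 100;
            int r = i - q * 100;   // remainder in 0..99
            i = q;
            buf[--pos] = DIGIT_ONES[r];
            buf[--pos] = DIGIT_TENS[r];
        }
        // One or two leading digits remain.
        buf[--pos] = DIGIT_ONES[i];
        if (i >= 10) {
            buf[--pos] = DIGIT_TENS[i];
        }
        return new String(buf, pos, buf.length - pos);
    }

    public static void main(String[] args) {
        System.out.println(DIGIT_TENS[34] + " " + DIGIT_ONES[34]);
        System.out.println(format(1234567));
    }
}
```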
q = (i * 52429) >>> (16+3); essentially computes i/10 and discards the fractional part: 2^19 = 524288, and 52429/524288 = 0.10000038146972656. Why choose 52429/524288? The following table makes it clear:
2^10=1024, 103/1024=0.1005859375
2^11=2048, 205/2048=0.10009765625
2^12=4096, 410/4096=0.10009765625
2^13=8192, 820/8192=0.10009765625
2^14=16384, 1639/16384=0.10003662109375
2^15=32768, 3277/32768=0.100006103515625
2^16=65536, 6554/65536=0.100006103515625
2^17=131072, 13108/131072=0.100006103515625
2^18=262144, 26215/262144=0.10000228881835938
2^19=524288, 52429/524288=0.10000038146972656
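As the table shows, 52429/524288 is the most precise of these approximations, and the product i * 52429 still fits in 32 bits for every i below 65536. The product can exceed Integer.MAX_VALUE (65535 * 52429 = 3435934515), which is why the code uses the unsigned shift >>> rather than >>.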
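Precision matters here because the result has to equal i/10 exactly for every i the loop can see, not just approximately. A brute-force check is cheap enough to run directly. The sketch below (class and method names are my own) searches for the first i at which a given multiplier and shift disagree with i / 10.

```java
// Hypothetical brute-force check, not JDK code.
class MultiplyShiftCheck {
    /**
     * Returns the first i in [0, limit) for which (i * mult) >>> shift differs
     * from i / 10, or -1 if the two agree on the whole range.
     */
    static int firstMismatch(int mult, int shift, int limit) {
        for (int i = 0; i < limit; i++) {
            // int multiplication may wrap past Integer.MAX_VALUE; >>> treats the
            // 32-bit result as unsigned, which is what the JDK code relies on.
            if (((i * mult) >>> shift) != i / 10) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("52429 >>> 19: " + firstMismatch(52429, 19, 65536));
        System.out.println("26215 >>> 18: " + firstMismatch(26215, 18, 65536));
        System.out.println("6554  >>> 16: " + firstMismatch(6554, 16, 65536));
    }
}
```

On the range 0..65535, the pair 52429 and 19 should report no mismatch, while the less precise pairs from the table should each report one.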
r = i - ((q << 3) + (q << 1)); // r = i-(q*10) ... likewise uses shifts and an addition instead of a multiplication, again for efficiency.
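The shift expressions work because 10 = 8 + 2 and 100 = 64 + 32 + 4, so q*10 = (q << 3) + (q << 1) and q*100 = (q << 6) + (q << 5) + (q << 2). A minimal sketch to confirm this, with hypothetical helper names:

```java
// Hypothetical helpers showing the shift-and-add decomposition; not JDK code.
class ShiftMultiply {
    /** q * 10 written as q*8 + q*2. */
    static int times10(int q) {
        return (q << 3) + (q << 1);
    }

    /** q * 100 written as q*64 + q*32 + q*4. */
    static int times100(int q) {
        return (q << 6) + (q << 5) + (q << 2);
    }

    public static void main(String[] args) {
        System.out.println(times10(1234) + " " + (1234 * 10));
        System.out.println(times100(1234) + " " + (1234 * 100));
    }
}
```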
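Note: the analysis above is only my personal view (partly based on material found online). If anything is incorrect, I would welcome the discussion.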