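Source Code Analysis of Parts of Integer and Long

Integer and Long are used very widely in Java. Here I mainly look at the Integer.toString(int i) and Long.toString(long i) methods; the other methods are fairly easy to understand.

Integer.toString(int i) and Long.toString(long i) work the same way, so I take Integer.toString(int i) as the example. First, the source: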

    /**
     * Returns a {@code String} object representing the
     * specified integer. The argument is converted to signed decimal
     * representation and returned as a string, exactly as if the
     * argument and radix 10 were given as arguments to the {@link
     * #toString(int, int)} method.
     *
     * @param   i   an integer to be converted.
     * @return  a string representation of the argument in base&nbsp;10.
     */
    public static String toString(int i) {
        if (i == Integer.MIN_VALUE)
            return "-2147483648";
        int size = (i < 0) ? stringSize(-i) + 1 : stringSize(i);
        char[] buf = new char[size];
        getChars(i, size, buf);
        return new String(buf, true);
    }

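The method calls stringSize to compute the length of i, that is, its number of decimal digits (plus one for the minus sign when i is negative), and uses it to allocate a char array buf of exactly the right size. It then calls getChars to fill in buf.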
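The source does not say why Integer.MIN_VALUE is special-cased, but the reason can be seen from two's-complement arithmetic: negating Integer.MIN_VALUE overflows and gives Integer.MIN_VALUE back, so stringSize(-i) would receive a negative argument. A minimal sketch of that behaviour follows; the class name MinValueDemo and the helper negationOverflows are my own illustration, not JDK code.

```java
final class MinValueDemo {
    // Hypothetical helper: true when -i wraps around to i itself,
    // which for a non-zero int only happens for Integer.MIN_VALUE.
    static boolean negationOverflows(int i) {
        return i != 0 && -i == i;
    }

    public static void main(String[] args) {
        // Negating MIN_VALUE overflows and yields MIN_VALUE again.
        System.out.println(-Integer.MIN_VALUE == Integer.MIN_VALUE); // true
        // The hard-coded literal is what toString returns for it.
        System.out.println(Integer.toString(Integer.MIN_VALUE));     // -2147483648
        // For other negative values the size is digits + 1 for the sign.
        System.out.println(Integer.toString(-42).length());          // 3
    }
}
```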

stringSize is implemented differently in Integer and in Long. Here are both sources.

Integer.stringSize(int x) source:

    final static int [] sizeTable = { 9, 99, 999, 9999, 99999, 999999, 9999999,
                                      99999999, 999999999, Integer.MAX_VALUE };

    // Requires positive x
    static int stringSize(int x) {
        for (int i=0; ; i++)
            if (x <= sizeTable[i])
                return i+1;
    }

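The upper bound of each digit count is stored in an array. The loop finds the first entry sizeTable[i] that is greater than or equal to x, and that index plus 1 is the length of x. With this design the length is obtained with a short loop of comparisons and no division at all, which is efficient.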
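To make the table lookup concrete, here is a small standalone sketch. It copies the table and the loop from the source above into a class of my own (StringSizeDemo and SIZE_TABLE are illustrative names), since the JDK method is not public.

```java
final class StringSizeDemo {
    // Largest value that has 1, 2, ..., 10 decimal digits.
    static final int[] SIZE_TABLE = {
        9, 99, 999, 9999, 99999, 999999, 9999999,
        99999999, 999999999, Integer.MAX_VALUE
    };

    // Requires x >= 0. Returns the number of decimal digits of x.
    static int stringSize(int x) {
        for (int i = 0; ; i++) {
            if (x <= SIZE_TABLE[i]) {
                return i + 1;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(stringSize(0));                 // 1
        System.out.println(stringSize(9));                 // 1
        System.out.println(stringSize(10));                // 2
        System.out.println(stringSize(12345));             // 5
        System.out.println(stringSize(Integer.MAX_VALUE)); // 10
    }
}
```

The last table entry is Integer.MAX_VALUE, so the loop always terminates for any non-negative int without an explicit bounds check.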

Long.stringSize(long x) source:

    // Requires positive x
    static int stringSize(long x) {
        long p = 10;
        for (int i=1; i<19; i++) {
            if (x < p)
                return i;
            p = 10*p;
        }
        return 19;
    }

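A long has at most 19 decimal digits, and here the length is found by repeatedly multiplying p by 10 and comparing. One might ask why the table approach of Integer.stringSize(int x) is not used; I have not found a satisfactory explanation either.

The traditional approach would probably be to divide by 10 repeatedly, but that is less efficient, because on most hardware integer multiplication is considerably cheaper than integer division.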
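The two approaches can be put side by side. The sketch below copies the multiply-based loop from the source and adds a divide-based version for comparison; sizeByDividing and the class name LongStringSizeDemo are my own, and the sketch only shows that both give the same answer, it does not measure speed.

```java
final class LongStringSizeDemo {
    // Same logic as Long.stringSize: compare x against 10, 100, 1000, ...
    // Requires x >= 0.
    static int sizeByMultiplying(long x) {
        long p = 10;
        for (int i = 1; i < 19; i++) {
            if (x < p) {
                return i;
            }
            p = 10 * p;
        }
        // Values of at least 10^18 have 19 digits; 10^19 does not fit in a long,
        // so no further comparison is possible or needed.
        return 19;
    }

    // Hypothetical "traditional" version: strip one digit per division.
    // Requires x >= 0.
    static int sizeByDividing(long x) {
        int digits = 1;
        while (x >= 10) {
            x /= 10;
            digits++;
        }
        return digits;
    }

    public static void main(String[] args) {
        long[] samples = {0L, 9L, 10L, 123456789012L, Long.MAX_VALUE};
        for (long x : samples) {
            System.out.println(x + " -> " + sizeByMultiplying(x)
                    + " / " + sizeByDividing(x));
        }
    }
}
```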

getChars(int i, int index, char[] buf) source:

    /**
     * Places characters representing the integer i into the
     * character array buf. The characters are placed into
     * the buffer backwards starting with the least significant
     * digit at the specified index (exclusive), and working
     * backwards from there.
     *
     * Will fail if i == Integer.MIN_VALUE
     */
    static void getChars(int i, int index, char[] buf) {
        int q, r;
        int charPos = index;
        char sign = 0;

        if (i < 0) {
            sign = '-';
            i = -i;
        }

        // Generate two digits per iteration
        while (i >= 65536) {
            q = i / 100;
            // really: r = i - (q * 100);
            r = i - ((q << 6) + (q << 5) + (q << 2));
            i = q;
            buf [--charPos] = DigitOnes[r];
            buf [--charPos] = DigitTens[r];
        }

        // Fall thru to fast mode for smaller numbers
        // assert(i <= 65536, i);
        for (;;) {
            q = (i * 52429) >>> (16+3);
            r = i - ((q << 3) + (q << 1));  // r = i-(q*10) ...
            buf [--charPos] = digits [r];
            i = q;
            if (i == 0) break;
        }
        if (sign != 0) {
            buf [--charPos] = sign;
        }
    }

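This is the core of the whole conversion. First the sign is determined. Then, while i >= 65536, each iteration divides i by 100 and uses the remainder r (0 to 99) to look up the ones digit in DigitOnes[r] and the tens digit in DigitTens[r]. Because division is slow, dividing by 100 once produces two digits per division and improves efficiency. DigitOnes and DigitTens are defined as follows: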

    final static char [] DigitTens = {
        '0', '0', '0', '0', '0', '0', '0', '0', '0', '0',
        '1', '1', '1', '1', '1', '1', '1', '1', '1', '1',
        '2', '2', '2', '2', '2', '2', '2', '2', '2', '2',
        '3', '3', '3', '3', '3', '3', '3', '3', '3', '3',
        '4', '4', '4', '4', '4', '4', '4', '4', '4', '4',
        '5', '5', '5', '5', '5', '5', '5', '5', '5', '5',
        '6', '6', '6', '6', '6', '6', '6', '6', '6', '6',
        '7', '7', '7', '7', '7', '7', '7', '7', '7', '7',
        '8', '8', '8', '8', '8', '8', '8', '8', '8', '8',
        '9', '9', '9', '9', '9', '9', '9', '9', '9', '9',
        } ;

    final static char [] DigitOnes = {
        '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
        '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
        '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
        '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
        '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
        '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
        '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
        '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
        '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
        '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
        } ;

For example, with r = 34 the lookup gives DigitOnes[r] = '4' and DigitTens[r] = '3'.
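The two-digits-per-division idea can be tried in isolation. The following is a simplified sketch, not the JDK code: it handles only non-negative values, builds the two tables in a loop instead of spelling them out, uses the two-digit path all the way down instead of switching at 65536, and the names DigitPairDemo, DIGIT_TENS, DIGIT_ONES and toDecimal are my own.

```java
final class DigitPairDemo {
    // Equivalent of DigitTens / DigitOnes, generated instead of written out.
    static final char[] DIGIT_TENS = new char[100];
    static final char[] DIGIT_ONES = new char[100];

    static {
        for (int r = 0; r < 100; r++) {
            DIGIT_TENS[r] = (char) ('0' + r / 10);
            DIGIT_ONES[r] = (char) ('0' + r % 10);
        }
    }

    // Requires i >= 0. Fills the buffer from the right, two digits at a time.
    static String toDecimal(int i) {
        char[] buf = new char[10]; // an int has at most 10 decimal digits
        int pos = buf.length;
        while (i >= 100) {
            int q = i / 100;
            int r = i - q * 100; // remainder in 0..99
            i = q;
            buf[--pos] = DIGIT_ONES[r];
            buf[--pos] = DIGIT_TENS[r];
        }
        // One or two leading digits remain.
        if (i >= 10) {
            buf[--pos] = DIGIT_ONES[i];
            buf[--pos] = DIGIT_TENS[i];
        } else {
            buf[--pos] = (char) ('0' + i);
        }
        return new String(buf, pos, buf.length - pos);
    }

    public static void main(String[] args) {
        System.out.println(DIGIT_TENS[34] + " " + DIGIT_ONES[34]); // 3 4
        System.out.println(toDecimal(1234567));                    // 1234567
    }
}
```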

q = (i * 52429) >>> (16+3); is in essence i/10 with the fractional part discarded: 2^19 = 524288 and 52429/524288 = 0.10000038146972656. Why was 52429/524288 chosen? The following table makes it clear:

2^10=1024, 103/1024=0.1005859375
2^11=2048, 205/2048=0.10009765625
2^12=4096, 410/4096=0.10009765625
2^13=8192, 820/8192=0.10009765625
2^14=16384, 1639/16384=0.10003662109375
2^15=32768, 3277/32768=0.100006103515625
2^16=65536, 6554/65536=0.100006103515625
2^17=131072, 13108/131072=0.100006103515625
2^18=262144, 26215/262144=0.10000228881835938
2^19=524288, 52429/524288=0.10000038146972656

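As the table shows, 52429/524288 is the closest to 0.1, and for i < 65536 the product i * 52429 still fits in 32 bits. It can exceed Integer.MAX_VALUE and so become negative as a signed int, which is why the unsigned shift >>> is used rather than >>.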
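Since the range is small, the claim can be checked exhaustively. This sketch compares the multiply-and-shift result with ordinary division for every i from 0 to 65535; the class name MagicDivideDemo and the helpers divideBy10 and exactUpTo are illustrative names of mine.

```java
final class MagicDivideDemo {
    // The expression used in getChars' fast path.
    static int divideBy10(int i) {
        return (i * 52429) >>> (16 + 3);
    }

    // True if divideBy10 agrees with i / 10 for every i in 0..limit.
    static boolean exactUpTo(int limit) {
        for (int i = 0; i <= limit; i++) {
            if (divideBy10(i) != i / 10) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(exactUpTo(65535));      // true
        // The product overflows the signed range for large i ...
        System.out.println(65535 * 52429 < 0);     // true
        // ... but the unsigned shift still recovers the right quotient.
        System.out.println(divideBy10(65535));     // 6553
    }
}
```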

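r = i - ((q << 3) + (q << 1)); // r = i-(q*10) ... Here (q << 3) + (q << 1) is 8q + 2q = 10q; shifts and an addition are used instead of a multiplication, again for efficiency.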
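The same trick appears in the first loop of getChars, where (q << 6) + (q << 5) + (q << 2) is 64q + 32q + 4q = 100q. A minimal sketch checking both identities follows; the names ShiftMultiplyDemo, times10 and times100 are mine, and it makes no claim about whether the shifts are actually faster on a current JVM.

```java
final class ShiftMultiplyDemo {
    // 8q + 2q = 10q
    static int times10(int q) {
        return (q << 3) + (q << 1);
    }

    // 64q + 32q + 4q = 100q
    static int times100(int q) {
        return (q << 6) + (q << 5) + (q << 2);
    }

    public static void main(String[] args) {
        System.out.println(times10(6553));      // 65530
        System.out.println(times100(21474836)); // 2147483600
        // Remainder of 65535 / 10 computed the way getChars does it.
        System.out.println(65535 - times10(6553)); // 5
    }
}
```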

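Note: the analysis above is only my personal view (partly based on material found online). If anything is incorrect, I would welcome a discussion.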

posted @ 2017-02-25 23:19 PinXiong


