Lucene source: writing a VInt
Code location: org.apache.lucene.store.DataOutput
/** Writes an int in a variable-length format. Writes between one and
 * five bytes. Smaller values take fewer bytes. Negative numbers are
 * supported, but should be avoided.
 * <p>VByte is a variable-length format for positive integers is defined where the
 * high-order bit of each byte indicates whether more bytes remain to be read. The
 * low-order seven bits are appended as increasingly more significant bits in the
 * resulting integer value. Thus values from zero to 127 may be stored in a single
 * byte, values from 128 to 16,383 may be stored in two bytes, and so on.</p>
 * <p>VByte Encoding Example</p>
 * <table cellspacing="0" cellpadding="2" border="0">
 * <col width="64*">
 * <col width="64*">
 * <col width="64*">
 * <col width="64*">
 * <tr valign="top">
 *   <th align="left" width="25%">Value</th>
 *   <th align="left" width="25%">Byte 1</th>
 *   <th align="left" width="25%">Byte 2</th>
 *   <th align="left" width="25%">Byte 3</th>
 * </tr>
 * <tr valign="bottom">
 *   <td width="25%">0</td>
 *   <td width="25%"><kbd>00000000</kbd></td>
 *   <td width="25%"></td>
 *   <td width="25%"></td>
 * </tr>
 * <tr valign="bottom">
 *   <td width="25%">1</td>
 *   <td width="25%"><kbd>00000001</kbd></td>
 *   <td width="25%"></td>
 *   <td width="25%"></td>
 * </tr>
 * <tr valign="bottom">
 *   <td width="25%">2</td>
 *   <td width="25%"><kbd>00000010</kbd></td>
 *   <td width="25%"></td>
 *   <td width="25%"></td>
 * </tr>
 * <tr>
 *   <td valign="top" width="25%">...</td>
 *   <td valign="bottom" width="25%"></td>
 *   <td valign="bottom" width="25%"></td>
 *   <td valign="bottom" width="25%"></td>
 * </tr>
 * <tr valign="bottom">
 *   <td width="25%">127</td>
 *   <td width="25%"><kbd>01111111</kbd></td>
 *   <td width="25%"></td>
 *   <td width="25%"></td>
 * </tr>
 * <tr valign="bottom">
 *   <td width="25%">128</td>
 *   <td width="25%"><kbd>10000000</kbd></td>
 *   <td width="25%"><kbd>00000001</kbd></td>
 *   <td width="25%"></td>
 * </tr>
 * <tr valign="bottom">
 *   <td width="25%">129</td>
 *   <td width="25%"><kbd>10000001</kbd></td>
 *   <td width="25%"><kbd>00000001</kbd></td>
 *   <td width="25%"></td>
 * </tr>
 * <tr valign="bottom">
 *   <td width="25%">130</td>
 *   <td width="25%"><kbd>10000010</kbd></td>
 *   <td width="25%"><kbd>00000001</kbd></td>
 *   <td width="25%"></td>
 * </tr>
 * <tr>
 *   <td valign="top" width="25%">...</td>
 *   <td width="25%"></td>
 *   <td width="25%"></td>
 *   <td width="25%"></td>
 * </tr>
 * <tr valign="bottom">
 *   <td width="25%">16,383</td>
 *   <td width="25%"><kbd>11111111</kbd></td>
 *   <td width="25%"><kbd>01111111</kbd></td>
 *   <td width="25%"></td>
 * </tr>
 * <tr valign="bottom">
 *   <td width="25%">16,384</td>
 *   <td width="25%"><kbd>10000000</kbd></td>
 *   <td width="25%"><kbd>10000000</kbd></td>
 *   <td width="25%"><kbd>00000001</kbd></td>
 * </tr>
 * <tr valign="bottom">
 *   <td width="25%">16,385</td>
 *   <td width="25%"><kbd>10000001</kbd></td>
 *   <td width="25%"><kbd>10000000</kbd></td>
 *   <td width="25%"><kbd>00000001</kbd></td>
 * </tr>
 * <tr>
 *   <td valign="top" width="25%">...</td>
 *   <td valign="bottom" width="25%"></td>
 *   <td valign="bottom" width="25%"></td>
 *   <td valign="bottom" width="25%"></td>
 * </tr>
 * </table>
 * <p>This provides compression while still being efficient to decode.</p>
 *
 * @param i Smaller values take fewer bytes. Negative numbers are
 * supported, but should be avoided.
 * @throws IOException If there is an I/O error writing to the underlying medium.
 * @see DataInput#readVInt()
 */
public final void writeVInt(int i) throws IOException {
  while ((i & ~0x7F) != 0) {
    writeByte((byte)((i & 0x7F) | 0x80));
    i >>>= 7;
  }
  writeByte((byte)i);
}
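1. 0x7F is 127. For a 32-bit int, ~0x7F is 0xFFFFFF80, binary 1111 1111 1111 1111 1111 1111 1000 0000. i & ~0x7F keeps every bit of i above the low seven, so (i & ~0x7F) != 0 tests whether i still has data above its low seven bits.
2. writeByte((byte)((i & 0x7F) | 0x80)); // takes the low seven bits of i, sets the eighth (highest) bit of the byte to 1, and writes that byte to the buffer; a high bit of 1 means at least one more byte follows
3. i >>>= 7; // unsigned right shift by seven bits, discarding the bits that were just written
4. writeByte((byte)i); // writes the remaining bits of i (at most seven, so the high bit is 0) as the final byte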
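This way smaller integers are stored in fewer bytes, which saves space: values from 0 to 127 take one byte instead of the four bytes of a fixed-width int.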
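The loop can be tried outside Lucene with a minimal sketch. The class VIntSketch and its methods encode, decode and toBits are hypothetical names introduced here for illustration. encode copies the writeVInt loop but writes into a java.io.ByteArrayOutputStream instead of a Lucene DataOutput, and decode is a simple inverse written for this example, not Lucene's DataInput#readVInt().

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper class, not part of Lucene.
final class VIntSketch {

    // Same loop as writeVInt, but the bytes go into an in-memory array.
    static byte[] encode(int i) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(5);
        while ((i & ~0x7F) != 0) {
            out.write((i & 0x7F) | 0x80); // low seven bits plus continuation flag
            i >>>= 7;                     // drop the bits just written
        }
        out.write(i);                     // final byte, high bit is 0
        return out.toByteArray();
    }

    // Simple inverse for this sketch: rebuild the int from 7-bit groups,
    // least significant group first.
    static int decode(byte[] bytes) {
        int value = 0;
        int shift = 0;
        for (byte b : bytes) {
            value |= (b & 0x7F) << shift;
            shift += 7;
            if ((b & 0x80) == 0) {
                break;                    // no continuation flag: last byte
            }
        }
        return value;
    }

    // Renders each byte as eight binary digits, separated by spaces.
    static String toBits(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            String s = Integer.toBinaryString(b & 0xFF);
            sb.append("00000000".substring(s.length())).append(s);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (int v : new int[] {0, 127, 128, 16383, 16384, -1}) {
            byte[] enc = encode(v);
            System.out.println(v + " -> " + toBits(enc) + " -> " + decode(enc));
        }
    }
}
```

For 128 and 16384 the printed bit patterns should match the rows of the table in the Javadoc. The value -1 comes out as five bytes, which is why the Javadoc advises against negative numbers.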
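PS: a byte holds integers in the range -128 to +127, and the computer represents these values internally in two's complement.
byte bytes = (byte)130;
System.out.println(bytes);
The output is -126.
Analysis: as an int, 130 in binary is 0000 0000 0000 0000 0000 0000 1000 0010
Cast to byte, only the low eight bits are kept: 1000 0010
The highest bit is 1, so the value is negative. To read its magnitude, keep the sign bit and invert the other seven bits to get 1111 1101, then add 1 to get 1111 1110, which is the sign-magnitude form of -126. Equivalently, 130 - 256 = -126.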
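The cast can be checked with a short sketch. The class ByteCastSketch and its method toUnsigned are hypothetical names used only for this example. It also shows the b & 0xFF mask, the usual way to read a byte back as an unsigned value from 0 to 255.

```java
// Hypothetical demo class for the narrowing cast described above.
final class ByteCastSketch {

    // Masks off the sign extension so the byte reads as 0..255.
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte b = (byte) 130;                                   // keeps only the low eight bits
        System.out.println(b);                                 // prints -126
        System.out.println(toUnsigned(b));                     // prints 130
        System.out.println(Integer.toBinaryString(b & 0xFF));  // prints 10000010
    }
}
```

This explains why writeVInt can cast to byte safely: the cast only changes how the eight bits are interpreted as a number, and the bits written to the buffer stay the same.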
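//A byte occupies 8 bits and can represent 256 distinct values
//[0,127]
//[-0,-127]
//0 and -0 are the same number, so the bit pattern that would otherwise mean -0 (1000 0000) is used in two's complement to represent -128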
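The range and the special pattern for -128 can be printed directly. The following is a minimal sketch; the class ByteRangeSketch and its method bits are hypothetical names chosen for this example.

```java
// Hypothetical demo class showing the byte range and its bit patterns.
final class ByteRangeSketch {

    // Renders a byte as exactly eight binary digits.
    static String bits(byte b) {
        String s = Integer.toBinaryString(b & 0xFF);
        return "00000000".substring(s.length()) + s;
    }

    public static void main(String[] args) {
        System.out.println(Byte.MIN_VALUE + " .. " + Byte.MAX_VALUE); // prints -128 .. 127
        System.out.println(bits(Byte.MIN_VALUE));                     // prints 10000000
        System.out.println(bits((byte) -1));                          // prints 11111111
        System.out.println(bits(Byte.MAX_VALUE));                     // prints 01111111
        System.out.println(Byte.MAX_VALUE - Byte.MIN_VALUE + 1);      // prints 256
    }
}
```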