Lucene source: writing a VInt

Code location: org.apache.lucene.store.DataOutput

  /** Writes an int in a variable-length format.  Writes between one and
   * five bytes.  Smaller values take fewer bytes.  Negative numbers are
   * supported, but should be avoided.
   * <p>VByte is a variable-length format for positive integers is defined where the
   * high-order bit of each byte indicates whether more bytes remain to be read. The
   * low-order seven bits are appended as increasingly more significant bits in the
   * resulting integer value. Thus values from zero to 127 may be stored in a single
   * byte, values from 128 to 16,383 may be stored in two bytes, and so on.</p>
   * <p>VByte Encoding Example</p>
   * <table cellspacing="0" cellpadding="2" border="0">
   * <col width="64*">
   * <col width="64*">
   * <col width="64*">
   * <col width="64*">
   * <tr valign="top">
   *   <th align="left" width="25%">Value</th>
   *   <th align="left" width="25%">Byte 1</th>
   *   <th align="left" width="25%">Byte 2</th>
   *   <th align="left" width="25%">Byte 3</th>
   * </tr>
   * <tr valign="bottom">
   *   <td width="25%">0</td>
   *   <td width="25%"><kbd>00000000</kbd></td>
   *   <td width="25%"></td>
   *   <td width="25%"></td>
   * </tr>
   * <tr valign="bottom">
   *   <td width="25%">1</td>
   *   <td width="25%"><kbd>00000001</kbd></td>
   *   <td width="25%"></td>
   *   <td width="25%"></td>
   * </tr>
   * <tr valign="bottom">
   *   <td width="25%">2</td>
   *   <td width="25%"><kbd>00000010</kbd></td>
   *   <td width="25%"></td>
   *   <td width="25%"></td>
   * </tr>
   * <tr>
   *   <td valign="top" width="25%">...</td>
   *   <td valign="bottom" width="25%"></td>
   *   <td valign="bottom" width="25%"></td>
   *   <td valign="bottom" width="25%"></td>
   * </tr>
   * <tr valign="bottom">
   *   <td width="25%">127</td>
   *   <td width="25%"><kbd>01111111</kbd></td>
   *   <td width="25%"></td>
   *   <td width="25%"></td>
   * </tr>
   * <tr valign="bottom">
   *   <td width="25%">128</td>
   *   <td width="25%"><kbd>10000000</kbd></td>
   *   <td width="25%"><kbd>00000001</kbd></td>
   *   <td width="25%"></td>
   * </tr>
   * <tr valign="bottom">
   *   <td width="25%">129</td>
   *   <td width="25%"><kbd>10000001</kbd></td>
   *   <td width="25%"><kbd>00000001</kbd></td>
   *   <td width="25%"></td>
   * </tr>
   * <tr valign="bottom">
   *   <td width="25%">130</td>
   *   <td width="25%"><kbd>10000010</kbd></td>
   *   <td width="25%"><kbd>00000001</kbd></td>
   *   <td width="25%"></td>
   * </tr>
   * <tr>
   *   <td valign="top" width="25%">...</td>
   *   <td width="25%"></td>
   *   <td width="25%"></td>
   *   <td width="25%"></td>
   * </tr>
   * <tr valign="bottom">
   *   <td width="25%">16,383</td>
   *   <td width="25%"><kbd>11111111</kbd></td>
   *   <td width="25%"><kbd>01111111</kbd></td>
   *   <td width="25%"></td>
   * </tr>
   * <tr valign="bottom">
   *   <td width="25%">16,384</td>
   *   <td width="25%"><kbd>10000000</kbd></td>
   *   <td width="25%"><kbd>10000000</kbd></td>
   *   <td width="25%"><kbd>00000001</kbd></td>
   * </tr>
   * <tr valign="bottom">
   *   <td width="25%">16,385</td>
   *   <td width="25%"><kbd>10000001</kbd></td>
   *   <td width="25%"><kbd>10000000</kbd></td>
   *   <td width="25%"><kbd>00000001</kbd></td>
   * </tr>
   * <tr>
   *   <td valign="top" width="25%">...</td>
   *   <td valign="bottom" width="25%"></td>
   *   <td valign="bottom" width="25%"></td>
   *   <td valign="bottom" width="25%"></td>
   * </tr>
   * </table>
   * <p>This provides compression while still being efficient to decode.</p>
   * 
   * @param i Smaller values take fewer bytes.  Negative numbers are
   * supported, but should be avoided.
   * @throws IOException If there is an I/O error writing to the underlying medium.
   * @see DataInput#readVInt()
   */
  public final void writeVInt(int i) throws IOException {
    while ((i & ~0x7F) != 0) {
      writeByte((byte)((i & 0x7F) | 0x80));
      i >>>= 7;
    }
    writeByte((byte)i);
  }

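1. 0x7F is 127 (binary 0111 1111). Because i is an int, ~0x7F is 0xFFFFFF80, i.e. binary 1111 1111 1111 1111 1111 1111 1000 0000. i & ~0x7F keeps every bit of i above the low 7 bits, so (i & ~0x7F) != 0 tests whether i still needs more than 7 bits.

2. writeByte((byte)((i & 0x7F) | 0x80)); // takes the low 7 bits of i, sets the 8th (highest) bit to 1, and writes that one byte to the buffer. A high bit of 1 means at least one more byte follows.

3. i >>>= 7; // unsigned right shift by 7 bits, discarding the 7 bits just written. The shift fills with zeros, so the loop also terminates for negative values.

4. writeByte((byte)i); // writes the remaining bits of i (at most 7, the most significant group) as the last byte. Its high bit is 0, which marks the end of the value.

This way smaller integers are stored in fewer bytes, which saves space when most of the values written are small.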
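The four steps above can be tried outside Lucene with a small sketch. It copies the loop from writeVInt but writes into a java.io.ByteArrayOutputStream instead of a DataOutput. The class VIntSketch and its encode and decode methods are hypothetical names made up for this illustration, and decode is only a simplified stand-in for DataInput#readVInt(), not Lucene's implementation.

```java
import java.io.ByteArrayOutputStream;

class VIntSketch {
    // Same loop as writeVInt, but collecting the bytes in memory (illustration only).
    static byte[] encode(int i) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((i & ~0x7F) != 0) {          // more than 7 significant bits remain
            out.write((i & 0x7F) | 0x80);   // low 7 bits plus the continuation flag
            i >>>= 7;                       // unsigned shift drops the bits just written
        }
        out.write(i);                       // last group, high bit 0
        return out.toByteArray();
    }

    // Simplified inverse: each 7-bit group is placed at increasing significance.
    static int decode(byte[] bytes) {
        int value = 0;
        int shift = 0;
        for (byte b : bytes) {
            value |= (b & 0x7F) << shift;
            shift += 7;
            if ((b & 0x80) == 0) {          // high bit 0 marks the final byte
                break;
            }
        }
        return value;
    }

    public static void main(String[] args) {
        for (int v : new int[] {0, 127, 128, 16383, 16384, -1}) {
            byte[] enc = encode(v);
            StringBuilder bits = new StringBuilder();
            for (byte b : enc) {
                String s = Integer.toBinaryString(b & 0xFF);
                bits.append(String.format("%8s", s).replace(' ', '0')).append(' ');
            }
            System.out.println(v + " -> " + bits.toString().trim()
                    + " (" + enc.length + " bytes)");
        }
    }
}
```

For 128 this produces the two bytes 10000000 00000001 shown in the Javadoc table. A negative value such as -1 always takes five bytes, because its top bits are set and the unsigned shift has to walk through all 32 bits; that is why the Javadoc advises against negative numbers.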

 

 PS: a Java byte holds integers in the range -128 to +127; the machine stores the values internally in two's complement.

byte bytes = (byte)130;
System.out.println(bytes);

The output is -126.

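Analysis: as a 32-bit int, 130 in binary is 0000 0000 0000 0000 0000 0000 1000 0010.
The cast to byte keeps only the low 8 bits: 1000 0010.
The highest bit is 1, so the value is negative. To read its magnitude from the two's complement form, invert the bits below the sign bit (1111 1101) and add 1 (1111 1110); read as sign plus magnitude, that is -126. Equivalently, 130 - 256 = -126.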
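The same effect shows up in writeVInt: every byte with the continuation flag set has its high bit at 1, so Java sees it as a negative byte. A minimal sketch of the narrowing cast and of masking with 0xFF to read the bits back as an unsigned value follows. The class ByteCastSketch and its two helper methods are hypothetical names used only for this illustration.

```java
class ByteCastSketch {
    // Narrowing cast: keeps only the low 8 bits of the int.
    static byte narrow(int i) {
        return (byte) i;
    }

    // Masking with 0xFF reads the same 8 bits as an unsigned value in 0..255.
    static int unsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte b = narrow(130);
        System.out.println(b);                                 // -126
        System.out.println(Integer.toBinaryString(b & 0xFF));  // 10000010
        System.out.println(unsigned(b));                       // 130
    }
}
```

The bit pattern never changes; only its interpretation as signed or unsigned does, which is why a decoder can mask each byte with 0x7F or 0x80 without caring about the sign.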

 

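// A byte occupies 8 bits and can represent 256 distinct values

// Non-negative side: [0, 127]

// A sign bit plus 7 magnitude bits would give [-0, -127] on the negative side

// 0 and -0 are the same number, so two's complement uses the pattern that would be -0 (1000 0000) to represent -128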
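The range claims above can be checked with a short sketch. The class ByteRangeSketch and its method distinctByteValues are hypothetical names for this illustration; Byte.MIN_VALUE and Byte.MAX_VALUE are the standard constants from java.lang.Byte.

```java
import java.util.HashSet;
import java.util.Set;

class ByteRangeSketch {
    // Casts every 8-bit pattern 0..255 to byte and counts the distinct results.
    static int distinctByteValues() {
        Set<Byte> seen = new HashSet<>();
        for (int i = 0; i < 256; i++) {
            seen.add((byte) i);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(Byte.MIN_VALUE);        // -128
        System.out.println(Byte.MAX_VALUE);        // 127
        System.out.println((byte) 0x80);           // -128, the pattern 1000 0000
        System.out.println(distinctByteValues());  // 256
    }
}
```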

 

posted on 2013-06-09 12:00  ukouryou
