BytesWritable length issue (extra trailing space)

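While using BytesWritable to merge small files, I found that the length of the stored data did not match the original content: some extra whitespace appeared at the end.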

 

Test code

@Test
public void test() {
    String str = "aaa";

    BytesWritable v = new BytesWritable();
    v.set(str.getBytes(), 0, str.getBytes().length);

    System.out.println("*" + new String(v.getBytes()) + "*");
}

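Result: one extra character appears after "aaa". It looks like a space, but it is an unused zero byte (0x00) at the end of the backing array that gets decoded along with the content.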

 

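Looking at the BytesWritable source, the backing array is resized when data is set: if the requested size exceeds the current capacity, the capacity grows to 3/2 of the requested size. The array returned by getBytes() can therefore be longer than the content; the actual content length is stored in the size field.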

public void set(byte[] newData, int offset, int length) {
    setSize(0);
    setSize(length);
    System.arraycopy(newData, offset, bytes, 0, size);
}

public void setSize(int size) {
    if (size > getCapacity()) {
        // Avoid overflowing the int too early by casting to a long.
        long newSize = Math.min(Integer.MAX_VALUE, (3L * size) / 2L);
        setCapacity((int) newSize);
    }
    this.size = size;
}
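
To see where the extra byte comes from, the growth rule in setSize() can be reproduced on its own. This is only a minimal sketch, not the Hadoop class: CapacityGrowthSketch and grownCapacity are hypothetical names, and the method just mirrors the arithmetic quoted above, assuming the buffer starts with capacity 0 as in the test.

```java
class CapacityGrowthSketch {
    // Hypothetical helper (not Hadoop API): mirrors the arithmetic of the
    // setSize() source quoted above for a buffer with the given capacity.
    static int grownCapacity(int capacity, int size) {
        if (size > capacity) {
            // Same expression as in setSize(): grow to 3/2 of the requested size.
            long newSize = Math.min(Integer.MAX_VALUE, (3L * size) / 2L);
            return (int) newSize;
        }
        return capacity; // large enough already, capacity is unchanged
    }

    public static void main(String[] args) {
        // "aaa" is 3 bytes; an empty buffer grows to (3 * 3) / 2 = 4 bytes.
        int capacity = grownCapacity(0, 3);
        System.out.println("size=3, capacity=" + capacity); // prints size=3, capacity=4
    }
}
```

For the 3-byte string in the test, the array ends up 4 bytes long, which matches the single extra character in the output.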

 

Since the real length is known, just pass it in when converting the bytes to a String.

@Test
public void test() {
    String str = "aaa";

    BytesWritable v = new BytesWritable();
    v.set(str.getBytes(), 0, str.getBytes().length);

    // getSize() is deprecated; use getLength() instead
    System.out.println("*" + new String(v.getBytes(), 0, v.getLength()) + "*");
}
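
The same idea can be tried without Hadoop on the classpath. The sketch below uses a plain byte array as a stand-in for what getBytes() returns; ValidPrefixSketch, decode, and trim are hypothetical names, and the 4-byte array is an assumption based on the growth rule shown earlier. It also names the charset explicitly, since new String(byte[]) otherwise falls back to the platform default.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ValidPrefixSketch {
    // Decode only the first `length` bytes of a possibly over-allocated buffer.
    static String decode(byte[] buffer, int length) {
        return new String(buffer, 0, length, StandardCharsets.UTF_8);
    }

    // Copy the valid prefix into a right-sized array, dropping the padding.
    static byte[] trim(byte[] buffer, int length) {
        return Arrays.copyOf(buffer, length);
    }

    public static void main(String[] args) {
        // Stand-in for v.getBytes(): 3 content bytes plus 1 unused zero byte.
        byte[] padded = {'a', 'a', 'a', 0};
        int length = 3; // stand-in for v.getLength()

        System.out.println("*" + decode(padded, length) + "*"); // prints *aaa*
        System.out.println(trim(padded, length).length);        // prints 3
    }
}
```

When a right-sized byte[] is needed instead of a String, the API documentation linked below also lists a copyBytes() method on BytesWritable, which appears to serve the same purpose as trim here.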


http://hadoop.apache.org/docs/r2.9.2/api/org/apache/hadoop/io/BytesWritable.html

posted @ 2019-04-30 14:36  江湖小小白