Java New I/O (NIO)

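Traditional I/O

The stream-oriented input/output system built on InputStream, OutputStream, Reader, and Writer is regarded as Java's traditional I/O. Traditional I/O is blocking, and it moves data through a stream byte by byte: without buffering, a read or write may handle as little as one byte at a time, which is inefficient.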

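New I/O

New I/O has the same purpose as traditional I/O: performing input and output. The difference is that new I/O works on blocks of data through channels and buffers, and it can map a file, or a region of a file, into memory so that the file can be accessed much like ordinary memory. For large files this is usually considerably faster than traditional I/O.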

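The packages related to new I/O in Java are:

java.nio: mainly provides the Buffer-related classes;

java.nio.channels: mainly contains the classes related to Channel and Selector;

java.nio.charset: mainly contains the classes related to character sets;

java.nio.channels.spi: mainly contains the service-provider classes for channels;

java.nio.charset.spi: mainly contains the service-provider classes for character sets.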

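Four types matter most here: Channel, Buffer, Charset, and Selector.

Channel and Buffer are the two core objects of new I/O. A Channel plays the role that streams play in traditional I/O: in new I/O, all data travels through a Channel. The biggest difference from the traditional InputStream and OutputStream is that a file channel (FileChannel) provides a map method, which maps "a block of data" directly into memory. If traditional I/O is stream-oriented, new I/O is block-oriented.

A Buffer can be thought of as a container that is essentially an array. All data sent to a Channel must first be placed in a Buffer, and all data read from a Channel is first read into a Buffer. As the name suggests, a Buffer does the buffering.

Charset maps Unicode strings to byte sequences and back.

Selector supports non-blocking I/O.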

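Using Buffer

Internally a Buffer resembles an array and holds multiple values of the same type. Buffer is an abstract class, and every primitive type except boolean has a corresponding subclass: ByteBuffer, ShortBuffer, CharBuffer, IntBuffer, LongBuffer, FloatBuffer, and DoubleBuffer. Apart from ByteBuffer, these classes manage their data with the same or very similar methods. None of them has a public constructor; a Buffer with the given capacity is created through the static method XxxBuffer allocate(int capacity).

ByteBuffer and CharBuffer are the most frequently used; the other subclasses are needed less often. ByteBuffer also has a subclass, MappedByteBuffer, which represents the result of a Channel mapping part or all of a disk file into memory. A MappedByteBuffer is normally returned by the map method of FileChannel.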

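A Buffer has three important properties: capacity, limit, and position.

capacity: the maximum amount of data the Buffer can hold. The capacity is never negative and cannot change after creation (just like the length of an array).

limit: the index of the first element that must not be read or written. In other words, data at or beyond the limit can be neither read nor written.

position: the index of the next element to be read or written (similar to the record pointer in traditional I/O). When a Buffer is filled from a Channel, the position equals the amount of data read so far. A newly created Buffer has position 0.

There is also an optional mark; resetting the Buffer moves the position straight back to the mark.

mark, position, limit, and capacity always satisfy the following relation:

0 <= mark <= position <= limit <= capacity.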
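The role of mark is easiest to see in a few lines of code. The following is a minimal sketch, not part of the original test classes; the class name MarkResetSketch and its method names are made up for illustration. It relies only on the standard mark(), reset(), and limit(int) methods of Buffer, and on the documented rule that a mark larger than a newly set limit is discarded.

```java
import java.nio.CharBuffer;
import java.nio.InvalidMarkException;

// Hypothetical helper class, for illustration only
class MarkResetSketch {

    /** Marks position 2, reads on to position 4, then resets; returns the position after reset(). */
    static int positionAfterReset() {
        CharBuffer buff = CharBuffer.wrap("abcdef"); // position 0, limit 6, capacity 6
        buff.get();   // reads 'a', position becomes 1
        buff.get();   // reads 'b', position becomes 2
        buff.mark();  // mark = 2
        buff.get();   // reads 'c'
        buff.get();   // reads 'd', position becomes 4
        buff.reset(); // position jumps back to the mark
        return buff.position();
    }

    /** Shrinking the limit below the mark discards the mark, so reset() fails afterwards. */
    static boolean markDiscardedWhenLimitShrinks() {
        CharBuffer buff = CharBuffer.wrap("abcdef");
        buff.position(4);
        buff.mark();   // mark = 4
        buff.limit(2); // limit < mark: position is clamped to 2 and the mark is dropped
        try {
            buff.reset();
            return false;
        } catch (InvalidMarkException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("position after reset: " + positionAfterReset());
        System.out.println("mark discarded: " + markDiscardedWhenLimitShrinks());
    }
}
```

The second method shows how the Buffer keeps the relation above intact: rather than allow mark > limit, it throws the mark away.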

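The main job of a Buffer is to take data in and then give it out again. Position and limit change as the Buffer is written and then read:

When a Buffer has just been created, its position is 0 and its limit equals its capacity;

The program calls put repeatedly to fill the Buffer, and each put moves the position forward by the number of elements written;

When filling is finished, the program calls flip, which sets the limit to the current position and then sets the position to 0;

Data can now be read from the Buffer, and each read moves the position forward by the number of elements read;

When reading is finished, the program calls clear. clear does not erase the data in the Buffer; it resets the position to 0 and the limit to the capacity, so that the Buffer is ready to be written again.

From these steps it follows that flip prepares the Buffer for reading, and clear prepares it for writing again.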

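put and get are the methods for writing to and reading from a Buffer. A Buffer supports both single-element access and bulk access (with an array as the argument).

Reads and writes with put and get come in two forms, relative and absolute:

Relative: reads or writes at the Buffer's current position and then advances the position by the number of elements processed.

Absolute: reads or writes at an explicit index and leaves the position unchanged.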
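The difference between the two forms can be observed directly from the position. Here is a small sketch under the assumption that an IntBuffer behaves like the other Buffer subclasses; RelativeAbsoluteSketch and demo are illustrative names, not part of any library.

```java
import java.nio.IntBuffer;

// Hypothetical helper class, for illustration only
class RelativeAbsoluteSketch {

    /** Returns {position after relative puts, position after absolute access, value read absolutely}. */
    static int[] demo() {
        IntBuffer buff = IntBuffer.allocate(8);

        buff.put(10).put(20).put(30);   // relative puts: position moves from 0 to 3
        int afterRelative = buff.position();

        buff.put(5, 99);                // absolute put at index 5: position stays where it is
        int value = buff.get(5);        // absolute get at index 5: position stays where it is
        int afterAbsolute = buff.position();

        return new int[] {afterRelative, afterAbsolute, value};
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println("position after relative puts: " + r[0]);
        System.out.println("position after absolute put/get: " + r[1] + ", value: " + r[2]);
    }
}
```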

Below is a Buffer test class that shows how position, limit, and capacity change as a Buffer is created, written, and read:

package com.zhyea.newio;

import java.nio.CharBuffer;

/**
 * Test class for the Java NIO Buffer
 * @author robin
 *
 */
public class BufferTest {

    public static void main(String[] args) {
        CharBuffer buff = CharBuffer.allocate(16);
        printState("Just created:", buff);                 // position 0, limit 16
        // Relative puts: three single chars, then a bulk put from an array
        buff.put('a').put('b').put('c').put(new char[]{'d', 'e', 'f'});
        printState("After adding 6 elements:", buff);      // position 6, limit 16
        buff.flip();
        printState("After flip:", buff);                   // position 0, limit 6
        buff.get();
        printState("After reading the first element:", buff); // position 1, limit 6
        buff.clear();
        printState("After clear:", buff);                  // position 0, limit 16
        // Absolute read: the data is still there after clear, and position does not move
        buff.get(3);
        printState("After an absolute read:", buff);       // position 0, limit 16
    }

    /** Prints capacity, position, and limit of the buffer after the given label. */
    private static void printState(String label, CharBuffer buff) {
        System.out.println(label
                         + " capacity:" + buff.capacity()
                         + " position:" + buff.position()
                         + " limit:" + buff.limit());
    }

}

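A Buffer created with allocate is an ordinary (heap) Buffer. ByteBuffer additionally provides an allocateDirect method (only ByteBuffer has it) that creates a direct Buffer. A direct Buffer is more expensive to create, but it lets the runtime perform faster native I/O directly on it. Because of the creation cost, direct Buffers suit long-lived Buffers rather than short-lived ones that are discarded after a single use. In day-to-day use, a direct Buffer is read and written in the same way as an ordinary one.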
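As a quick illustration, the sketch below creates one Buffer of each kind and uses the same put/flip/get calls on the direct one. DirectBufferSketch and demo are hypothetical names chosen for this example; isDirect() and hasArray() are standard ByteBuffer methods.

```java
import java.nio.ByteBuffer;

// Hypothetical helper class, for illustration only
class DirectBufferSketch {

    /** Returns {heap.isDirect(), direct.isDirect(), heap.hasArray(), byte round-tripped through direct}. */
    static boolean[] demo() {
        ByteBuffer heap = ByteBuffer.allocate(1024);         // backed by a byte[] on the Java heap
        ByteBuffer direct = ByteBuffer.allocateDirect(1024); // allocated outside the Java heap

        // A direct buffer is filled and drained with the same methods as a heap buffer
        direct.put((byte) 42);
        direct.flip();
        byte value = direct.get();

        return new boolean[] {heap.isDirect(), direct.isDirect(), heap.hasArray(), value == 42};
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("heap buffer is direct: " + r[0]);
        System.out.println("direct buffer is direct: " + r[1]);
        System.out.println("heap buffer exposes a backing array: " + r[2]);
    }
}
```

One practical difference worth keeping in mind: code that calls array() on a Buffer should check hasArray() first, since a direct Buffer is not required to expose a backing array.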

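Using Channel

A Channel is similar to a traditional stream object, with two main differences:

A Channel (specifically a FileChannel) can map part or all of a file directly into a Buffer;

A program cannot read or write the data in a Channel directly; it has to go through a Buffer.

Channel is an interface in the java.nio.channels package. The JDK provides implementations such as DatagramChannel, FileChannel, Pipe.SinkChannel, and Pipe.SourceChannel. As the names suggest, channels in new I/O are divided by function.

A Channel should not be created directly through a constructor. A FileChannel is usually obtained by calling getChannel on a traditional node stream: FileInputStream, FileOutputStream, and RandomAccessFile all return a FileChannel. The piped streams (such as PipedInputStream) have no getChannel method; Pipe.SinkChannel and Pipe.SourceChannel are obtained from a Pipe created with Pipe.open(), through its sink() and source() methods.

The three most commonly used kinds of Channel methods are map, read, and write. map (on FileChannel) maps part or all of the channel's data into a ByteBuffer; read reads data from the channel into a Buffer, and write writes data from a Buffer into the channel.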
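To make the direction of read and write concrete, here is a minimal sketch that pushes a short message through a java.nio.channels.Pipe within a single thread. PipeChannelSketch and roundTrip are names invented for this example, UTF-8 is an arbitrary choice, and the approach assumes the message is short enough to fit in the pipe's internal buffer (a long message would need a second thread to drain the pipe).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only
class PipeChannelSketch {

    /** Writes a short message into the sink end of a pipe and reads it back from the source end. */
    static String roundTrip(String message) {
        try {
            Pipe pipe = Pipe.open(); // channels come from a factory method, not a constructor
            try (Pipe.SinkChannel sink = pipe.sink();
                 Pipe.SourceChannel source = pipe.source()) {

                byte[] payload = message.getBytes(StandardCharsets.UTF_8);

                // write: data flows from the Buffer into the Channel
                ByteBuffer outgoing = ByteBuffer.wrap(payload);
                while (outgoing.hasRemaining()) {
                    sink.write(outgoing);
                }

                // read: data flows from the Channel into the Buffer
                ByteBuffer incoming = ByteBuffer.allocate(payload.length);
                while (incoming.hasRemaining()) {
                    if (source.read(incoming) == -1) {
                        break; // end of stream
                    }
                }

                incoming.flip(); // switch the buffer from filling to draining
                return StandardCharsets.UTF_8.decode(incoming).toString();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello channel"));
    }
}
```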

FileChannel read/write example:

package com.zhyea.newio;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.CharBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileChannel.MapMode;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;

/**
 * Channel test class. Uses channels to write the content of file A into file B
 * 
 * @author robin
 *
 */
public class FileChannelTest {

    public static void main(String[] args) {
        FileChannel in = null;
        FileChannel out = null;

        try {
            String fileAPath = "D:\\a.txt";
            String fileBPath = "D:\\b.txt";
            File file = new File(fileAPath);
            
            // Map the whole of file A into memory, read-only
            in = new FileInputStream(file).getChannel();
            MappedByteBuffer buffer = in.map(MapMode.READ_ONLY, 0, file.length());
            
            // Write the mapped bytes to file B
            out = new FileOutputStream(fileBPath).getChannel();
            out.write(buffer);
            // clear() does not erase anything; it moves position back to 0
            // so that the same bytes can be read again for decoding
            buffer.clear();
            
            // Decode the bytes for display; file A is assumed to be GBK-encoded
            Charset charset = Charset.forName("GBK");
            CharsetDecoder decoder = charset.newDecoder();
            CharBuffer charBuffer = decoder.decode(buffer);
            
            System.out.println(charBuffer);
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            // Closing a channel also closes the stream it was obtained from
            try {
                if (null != in)
                    in.close();
                if (null != out)
                    out.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

}
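The same copy can be tried without touching D:\ by working on temporary files. The sketch below is a variation rather than the author's method: it assumes Java 7 or later and swaps in FileChannel.open and try-with-resources for getChannel and the finally block. MappedCopySketch, copyViaMapping, and the temp-file prefixes are names made up for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileChannel.MapMode;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class, for illustration only
class MappedCopySketch {

    /** Writes content to a temp file, copies it to a second temp file via a memory mapping, and reads it back. */
    static String copyViaMapping(String content) {
        try {
            Path source = Files.createTempFile("nio-a", ".txt");
            Path target = Files.createTempFile("nio-b", ".txt");
            source.toFile().deleteOnExit();
            target.toFile().deleteOnExit();

            Files.write(source, content.getBytes(StandardCharsets.UTF_8));

            // try-with-resources closes both channels even if the copy fails
            try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
                 FileChannel out = FileChannel.open(target, StandardOpenOption.WRITE)) {
                MappedByteBuffer mapped = in.map(MapMode.READ_ONLY, 0, in.size());
                // write() may transfer fewer bytes than remain, so loop until the buffer is drained
                while (mapped.hasRemaining()) {
                    out.write(mapped);
                }
            }

            return new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(copyViaMapping("copied through a mapped buffer"));
    }
}
```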

A Channel obtained from a RandomAccessFile, used here to append to a text file:

package com.zhyea.newio;

import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileChannel.MapMode;

public class RandomFileChannelTest {
    
    public static void main(String[] args) {
        FileChannel randomChannel = null;

        try {
            String fileBPath = "D:\\b.txt";
            File file = new File(fileBPath);
            
            // "rw" gives a channel that can be both read and written
            randomChannel = new RandomAccessFile(file, "rw").getChannel();
            // Map the current content of the file
            ByteBuffer buffer = randomChannel.map(MapMode.READ_ONLY, 0, file.length());
            // Move the channel's own position to the end of the file
            randomChannel.position(file.length());
            
            // Writing at the end appends a copy of the file's content to the file
            randomChannel.write(buffer);
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            try {
                if (null != randomChannel)
                    randomChannel.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
    
}

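Channel test that reads the data through a small buffer:

package com.zhyea.newio;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;

public class ChannelTest {
    public static void main(String[] args) {
        FileChannel in = null;
        FileChannel out = null;

        try {
            String fileAPath = "D:\\a.txt";
            String fileBPath = "D:\\b.txt";
            File file = new File(fileAPath);
            
            in = new FileInputStream(file).getChannel();
            
            out = new FileOutputStream(fileBPath).getChannel();
            
            // File A is assumed to be GBK-encoded. The decoder is created once
            // and replaces invalid bytes instead of throwing.
            CharsetDecoder decoder = Charset.forName("GBK").newDecoder()
                    .onMalformedInput(CodingErrorAction.REPLACE)
                    .onUnmappableCharacter(CodingErrorAction.REPLACE);

            ByteBuffer buffer = ByteBuffer.allocate(16);
            CharBuffer charBuffer = CharBuffer.allocate(16);
            // Bytes kept from the previous pass (they were already written to file B)
            int carried = 0;
            while(in.read(buffer) != -1){
                buffer.flip();

                // Write only the bytes read in this pass, through a view that
                // leaves the position of buffer itself untouched
                ByteBuffer fresh = buffer.duplicate();
                fresh.position(carried);
                while (fresh.hasRemaining()) {
                    out.write(fresh);
                }

                // Decode incrementally: a multi-byte character that is split across
                // two reads stays in buffer until the rest of it arrives. Decoding
                // each 16-byte chunk as a complete input would fail on such a split.
                decoder.decode(buffer, charBuffer, false);
                charBuffer.flip();
                System.out.print(charBuffer);
                charBuffer.clear();
                
                // compact() moves the undecoded bytes to the front, ready for the next read
                carried = buffer.remaining();
                buffer.compact();
            }
            System.out.println();
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            try {
                if (null != in)
                    in.close();
                if (null != out)
                    out.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

    }
}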

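Using Charset

A computer stores files as sequences of bytes. For such a byte sequence to be displayed and used correctly as text, it must be decoded with the appropriate character set.

Java represents strings internally in Unicode. When the bytes of a file are decoded with a charset other than the one they were written in, the result is garbled text.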
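Garbled text is simple to reproduce: encode a string with one charset and decode the bytes with another. The sketch below is only an illustration; MojibakeSketch and encodeThenDecode are hypothetical names, and UTF-8 and ISO-8859-1 are picked because every JDK is required to support them.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only
class MojibakeSketch {

    /** Encodes text with one charset and decodes the resulting bytes with another. */
    static String encodeThenDecode(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        String original = "\u6d4b\u8bd5"; // the two Chinese characters for "test"

        // Same charset on both sides: the text survives
        String same = encodeThenDecode(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println("round trip intact: " + original.equals(same));

        // Different charsets: each of the 6 UTF-8 bytes is turned into its own Latin-1 character
        String garbled = encodeThenDecode(original, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println("garbled length: " + garbled.length());
    }
}
```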

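Java provides Charset to handle the conversion between byte sequences and strings. Charset offers methods for encoding and decoding, as well as a method for listing the character sets it supports. Charset instances are immutable.

Charset provides the static method availableCharsets() to obtain all character sets supported by the current JDK. A demonstration: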

package com.zhyea.newio;

import java.nio.charset.Charset;
import java.util.Map;

public class CharsetTest {

    public static void main(String[] args){
        listCharset();
    }
    
    public static void listCharset(){
        Map<String, Charset> map = Charset.availableCharsets();
        for(String tmp : map.keySet()){
            System.out.println(tmp + " ----------> " + map.get(tmp));
        }
    }
}

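The output is long; on my machine it came to 169 lines. Try it if you are curious; the list is not reproduced here.

Once the name of a character set is known, a Charset object can be obtained with Charset's forName() method:

Charset charset = Charset.forName("UTF-8");

With a Charset object in hand, its newEncoder() and newDecoder() methods create a CharsetEncoder and a CharsetDecoder respectively. These are the encoder and decoder of that Charset, and they convert between strings and byte sequences. If all that is needed is a simple encode or decode, the encode() and decode() methods of the Charset object can be used directly.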
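The two routes can be put side by side. This is a sketch with illustrative names (EncoderDecoderSketch, roundTrip, roundTripShortcut); the calls themselves, CharsetEncoder.encode(CharBuffer), CharsetDecoder.decode(ByteBuffer), Charset.encode(String), and Charset.decode(ByteBuffer), are standard. One difference to be aware of: the explicit encoder reports characters it cannot represent by throwing, while the Charset convenience methods substitute a replacement instead.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;

// Hypothetical helper class, for illustration only
class EncoderDecoderSketch {

    /** Encodes and decodes with explicit CharsetEncoder / CharsetDecoder objects. */
    static String roundTrip(String text, String charsetName) {
        Charset charset = Charset.forName(charsetName);
        CharsetEncoder encoder = charset.newEncoder(); // characters -> bytes
        CharsetDecoder decoder = charset.newDecoder(); // bytes -> characters
        try {
            ByteBuffer bytes = encoder.encode(CharBuffer.wrap(text));
            CharBuffer chars = decoder.decode(bytes);
            return chars.toString();
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException("text cannot be represented in " + charsetName, e);
        }
    }

    /** The same round trip using the convenience methods on Charset itself. */
    static String roundTripShortcut(String text, String charsetName) {
        Charset charset = Charset.forName(charsetName);
        return charset.decode(charset.encode(text)).toString();
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello", "UTF-8"));
        System.out.println(roundTripShortcut("hello", "UTF-8"));
    }
}
```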

From time to time there is a need to check which charset a piece of text is encoded in. Charset can be used for that; here is a test class:

package com.zhyea.newio;

import java.nio.ByteBuffer;
import java.nio.charset.Charset;

public class CharsetTest {

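    public static void main(String[] args) {
        // A non-ASCII test string ("this is a test"), so that the candidate charsets produce different results
        String str = "这是一个测试";
        System.out.println(checkEncoding(str));
    }

    /**
     * Returns the name of the first candidate charset that decodes the bytes of
     * str back to str, or null if none of the candidates does.
     */
    public static String checkEncoding(String str) {
        String tmp;
        for (Encode e : Encode.values()) {
            // getBytes() without an argument encodes with the platform default charset
            ByteBuffer buffer = ByteBuffer.wrap(str.getBytes());
            tmp = Charset.forName(e.getValue()).decode(buffer).toString();
            if (str.equals(tmp)) {
                return e.getValue();
            }
        }
        return null;
    }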

    enum Encode {
        GBK("GBK"), UTF8("UTF-8"), BIG5("BIG5"), ISO88591("ISO-8859-1");

        private String value;

        Encode(String val) {
            this.value = val;
        }

        public String getValue() {
            return this.value;
        }
    }
}

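The test class defines an enum that holds the candidate charset names. The test string is a literal written directly in the source, and str.getBytes() encodes it with the platform default charset, which here is the same as the encoding of the local workspace. What the method reports is therefore the first candidate that decodes those bytes back to the original string.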
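When the input is raw bytes rather than a String, one possible variation is to ask a strict decoder whether the bytes are well formed in a given charset. The sketch below is an assumption-laden alternative, not the approach of the test class above: StrictDecodeSketch and isValid are hypothetical names, and the technique relies on CodingErrorAction.REPORT making the decoder throw on invalid input.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only
class StrictDecodeSketch {

    /** Returns true if the bytes form a valid sequence in the given charset. */
    static boolean isValid(byte[] bytes, Charset charset) {
        CharsetDecoder decoder = charset.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)       // throw instead of replacing
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] utf8 = "\u6d4b\u8bd5".getBytes(StandardCharsets.UTF_8);
        byte[] notUtf8 = {(byte) 0xFF}; // 0xFF never occurs in well-formed UTF-8

        System.out.println(isValid(utf8, StandardCharsets.UTF_8));
        System.out.println(isValid(notUtf8, StandardCharsets.UTF_8));
        System.out.println(isValid(notUtf8, StandardCharsets.ISO_8859_1));
    }
}
```

A check like this can only rule charsets out. ISO-8859-1 accepts every possible byte, so a positive answer from it says nothing about the real encoding; this is also why the order of the candidates matters in the enum-based check.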

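Using Selector

Selector is mainly used for non-blocking socket communication. It works together with SelectableChannel.

A SelectableChannel is a Channel that supports non-blocking I/O. It can be registered with a Selector through its register() method, and the registration is represented by a SelectionKey instance. The Selector provides a select() method that lets an application monitor several I/O channels at once.

A SelectableChannel supports two modes, blocking and non-blocking (every channel is in blocking mode by default). SelectableChannel provides the following two methods to set and query the mode:

SelectableChannel configureBlocking(boolean block): sets whether the channel uses blocking mode
boolean isBlocking(): returns whether the channel is in blocking mode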
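The registration and select() steps can be tried without any networking by using the source end of a Pipe, which is also a SelectableChannel. The following is a minimal single-threaded sketch; SelectorSketch and selectAndRead are invented names, the one-second timeout is arbitrary, and the message is assumed to be short enough to fit in the pipe's internal buffer.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

// Hypothetical helper class, for illustration only
class SelectorSketch {

    /** Sends a short message through a pipe and reads it back once the selector reports it readable. */
    static String selectAndRead(String message) {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try (Pipe.SinkChannel sink = pipe.sink();
                 Pipe.SourceChannel source = pipe.source()) {

                // A channel must be in non-blocking mode before it is registered
                source.configureBlocking(false);
                source.register(selector, SelectionKey.OP_READ);

                byte[] payload = message.getBytes(StandardCharsets.UTF_8);
                ByteBuffer outgoing = ByteBuffer.wrap(payload);
                while (outgoing.hasRemaining()) {
                    sink.write(outgoing);
                }

                ByteBuffer incoming = ByteBuffer.allocate(payload.length);
                // select() returns the number of channels that became ready; 0 means timeout
                while (incoming.hasRemaining() && selector.select(1000) > 0) {
                    Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                    while (keys.hasNext()) {
                        SelectionKey key = keys.next();
                        keys.remove(); // handled keys must be removed from the selected set by hand
                        if (key.isReadable()) {
                            ((Pipe.SourceChannel) key.channel()).read(incoming);
                        }
                    }
                }

                incoming.flip();
                return StandardCharsets.UTF_8.decode(incoming).toString();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(selectAndRead("ping"));
    }
}
```

With sockets the shape is the same: a ServerSocketChannel or SocketChannel takes the place of the pipe's source channel, and one selector loop serves all registered channels.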

Posted 2014-12-11 22:57 by robin·张
