Introduction to BufferedInputStream

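BufferedInputStream is a buffered input stream. It extends FilterInputStream.
BufferedInputStream adds functionality to another input stream: it buffers the input and supports the mark() and reset() methods.
Internally, BufferedInputStream is implemented with a byte-array buffer. After we wrap an input stream in a BufferedInputStream and call read(), BufferedInputStream reads the underlying stream's data into the buffer in batches. Whenever the data in the buffer has been consumed, BufferedInputStream refills the buffer from the underlying stream; this repeats until we have read all of the stream's data.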
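
The batching described above can be made visible with a small sketch. BatchingDemo and CountingInputStream are hypothetical helper names, not JDK classes, and the 100-byte source and 16-byte buffer are arbitrary values chosen for illustration.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class BatchingDemo {
    // Hypothetical helper: counts how often the wrapped stream is asked for a bulk read.
    static class CountingInputStream extends FilterInputStream {
        int bulkReads = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            bulkReads++;
            return super.read(b, off, len);
        }
    }

    // Reads the whole source one byte at a time through a BufferedInputStream and
    // returns {bytes read by the caller, bulk reads seen by the source}.
    static int[] run(int sourceSize, int bufferSize) {
        CountingInputStream source =
                new CountingInputStream(new ByteArrayInputStream(new byte[sourceSize]));
        try (InputStream in = new BufferedInputStream(source, bufferSize)) {
            int total = 0;
            while (in.read() != -1) {
                total++;
            }
            return new int[] { total, source.bulkReads };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] result = run(100, 16);
        System.out.println("bytes read by caller: " + result[0]);
        System.out.println("bulk reads on source: " + result[1]);
    }
}
```

The caller calls read() once per byte, but the source is only asked for data once per refill, so the second number stays far below the first.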

BufferedInputStream method list

BufferedInputStream(InputStream in)
BufferedInputStream(InputStream in, int size)

synchronized int     available()
void     close()
synchronized void     mark(int readlimit)
boolean     markSupported()
synchronized int     read()
synchronized int     read(byte[] buffer, int offset, int byteCount)
synchronized void     reset()
synchronized long     skip(long byteCount)

BufferedInputStream source code analysis (based on jdk1.7.40)

package java.io;
import java.util.concurrent.atomic.AtomicReferenceFieldUpdater;

public class BufferedInputStream extends FilterInputStream {

    // The default buffer size is 8192 bytes.
    // BufferedInputStream fills the buffer one buffer-sized batch at a time:
    // it fills the buffer, the caller consumes it, and it fills the buffer again,
    // repeating until the data is exhausted.
    private static int defaultBufferSize = 8192;

    // The buffer array.
    protected volatile byte buf[];

    // Atomic updater for the buffer array.
    // Together with the volatile modifier on buf, it lets the buf reference be replaced
    // atomically with compare-and-set: fill() swaps in a larger array, close() sets buf
    // to null, and each of them sees the other's update.
    private static final
        AtomicReferenceFieldUpdater<BufferedInputStream, byte[]> bufUpdater =
        AtomicReferenceFieldUpdater.newUpdater
        (BufferedInputStream.class,  byte[].class, "buf");

    // The number of valid bytes currently in the buffer.
    // Note: this counts valid bytes in the buffer, not in the input stream.
    protected int count;

    // The current position in the buffer.
    // Note: this is an index into the buffer, not into the input stream.
    protected int pos;

    // The marked position in the buffer.
    // markpos is only meaningful together with reset(). Steps:
    // (01) mark() saves the value of pos in markpos.
    // (02) reset() sets pos back to markpos. The next read() then starts from the
    //      position saved by mark().
    protected int markpos = -1;

    // marklimit is the maximum read-ahead allowed before the mark becomes invalid.
    // How marklimit works is explained in the analysis of fill() below; it is
    // important for understanding BufferedInputStream.
    protected int marklimit;

    // Return the underlying input stream, or throw if this stream is closed.
    private InputStream getInIfOpen() throws IOException {
        InputStream input = in;
        if (input == null)
            throw new IOException("Stream closed");
        return input;
    }

    // Return the buffer, or throw if this stream is closed.
    private byte[] getBufIfOpen() throws IOException {
        byte[] buffer = buf;
        if (buffer == null)
            throw new IOException("Stream closed");
        return buffer;
    }

    // Constructor: create a BufferedInputStream with a buffer of 8192 bytes.
    public BufferedInputStream(InputStream in) {
        this(in, defaultBufferSize);
    }

    // Constructor: create a BufferedInputStream with the given buffer size.
    public BufferedInputStream(InputStream in, int size) {
        super(in);
        if (size <= 0) {
            throw new IllegalArgumentException("Buffer size <= 0");
        }
        buf = new byte[size];
    }

    // Read data from the input stream and store it in the buffer.
    // This method is explained in detail below.
    private void fill() throws IOException {
        byte[] buffer = getBufIfOpen();
        if (markpos < 0)
            pos = 0;            /* no mark: throw away the buffer */
        else if (pos >= buffer.length)  /* no room left in buffer */
            if (markpos > 0) {  /* can throw away early part of the buffer */
                int sz = pos - markpos;
                System.arraycopy(buffer, markpos, buffer, 0, sz);
                pos = sz;
                markpos = 0;
            } else if (buffer.length >= marklimit) {
                markpos = -1;   /* buffer got too big, invalidate mark */
                pos = 0;        /* drop buffer contents */
            } else {            /* grow buffer */
                int nsz = pos * 2;
                if (nsz > marklimit)
                    nsz = marklimit;
                byte nbuf[] = new byte[nsz];
                System.arraycopy(buffer, 0, nbuf, 0, pos);
                if (!bufUpdater.compareAndSet(this, buffer, nbuf)) {
                    throw new IOException("Stream closed");
                }
                buffer = nbuf;
            }
        count = pos;
        int n = getInIfOpen().read(buffer, pos, buffer.length - pos);
        if (n > 0)
            count = n + pos;
    }

    // Read the next byte.
    public synchronized int read() throws IOException {
        // If the buffer has been fully consumed, call fill() to load the next
        // portion of the input stream into the buffer.
        if (pos >= count) {
            fill();
            if (pos >= count)
                return -1;
        }
        // Return the byte at the current position as an unsigned value (0..255).
        return getBufIfOpen()[pos++] & 0xff;
    }

    // Copy data from the buffer into the byte array b. off is the start offset
    // in b, and len is the maximum number of bytes to copy.
    private int read1(byte[] b, int off, int len) throws IOException {
        int avail = count - pos;
        if (avail <= 0) {
            // Fast path.
            // If the requested length is at least the buffer length and no mark is set,
            // read directly from the underlying stream into b. This avoids a pointless
            // copy (underlying stream -> buffer -> b, draining the buffer and then
            // refilling it).
            if (len >= getBufIfOpen().length && markpos < 0) {
                return getInIfOpen().read(b, off, len);
            }
            // The buffer has been fully consumed: call fill() to load the next
            // portion of the input stream into the buffer.
            fill();
            avail = count - pos;
            if (avail <= 0) return -1;
        }
        int cnt = (avail < len) ? avail : len;
        System.arraycopy(getBufIfOpen(), pos, b, off, cnt);
        pos += cnt;
        return cnt;
    }

    // Read up to len bytes into the byte array b, starting at offset off.
    public synchronized int read(byte b[], int off, int len)
        throws IOException
    {
        getBufIfOpen(); // Check for closed stream
        if ((off | len | (off + len) | (b.length - (off + len))) < 0) {
            throw new IndexOutOfBoundsException();
        } else if (len == 0) {
            return 0;
        }

        // Keep calling read1() until len bytes have been read, the end of the stream
        // is reached, or the underlying stream reports that no more bytes are available.
        int n = 0;
        for (;;) {
            int nread = read1(b, off + n, len - n);
            if (nread <= 0)
                return (n == 0) ? nread : n;
            n += nread;
            if (n >= len)
                return n;
            // if not closed but no bytes available, return
            InputStream input = in;
            if (input != null && input.available() <= 0)
                return n;
        }
    }

    // Skip up to n bytes.
    public synchronized long skip(long n) throws IOException {
        getBufIfOpen(); // Check for closed stream
        if (n <= 0) {
            return 0;
        }
        long avail = count - pos;

        if (avail <= 0) {
            // If no mark position set then don't keep in buffer
            if (markpos <0)
                return getInIfOpen().skip(n);

            // Fill in buffer to save bytes for reset
            fill();
            avail = count - pos;
            if (avail <= 0)
                return 0;
        }

        long skipped = (avail < n) ? avail : n;
        pos += skipped;
        return skipped;
    }

    // Return an estimate of the number of bytes that can be read without blocking:
    // the bytes left in the buffer plus in.available(), capped at Integer.MAX_VALUE.
    public synchronized int available() throws IOException {
        int n = count - pos;
        int avail = getInIfOpen().available();
        return n > (Integer.MAX_VALUE - avail)
                    ? Integer.MAX_VALUE
                    : n + avail;
    }

    // Mark the current position in the buffer.
    // readlimit becomes marklimit; see the explanation of marklimit below.
    public synchronized void mark(int readlimit) {
        marklimit = readlimit;
        markpos = pos;
    }

    // Reset the current position in the buffer to the position marked by mark().
    public synchronized void reset() throws IOException {
        getBufIfOpen(); // Cause exception if closed
        if (markpos < 0)
            throw new IOException("Resetting to invalid mark");
        pos = markpos;
    }

    public boolean markSupported() {
        return true;
    }

    // Close the input stream.
    public void close() throws IOException {
        byte[] buffer;
        while ( (buffer = buf) != null) {
            if (bufUpdater.compareAndSet(this, buffer, null)) {
                InputStream input = in;
                in = null;
                if (input != null)
                    input.close();
                return;
            }
            // Else retry in case a new buf was CASed in fill()
        }
    }
}
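
The fast path in read1() can be observed from the outside. The following is a minimal sketch, assuming the running JDK's read1() behaves like the listing above; BypassDemo and RecordingInputStream are hypothetical names, and the sizes (8-byte buffer, 32-byte request, mark limit 100) are arbitrary.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class BypassDemo {
    // Hypothetical helper: records the length of every bulk read requested from the source.
    static class RecordingInputStream extends FilterInputStream {
        final List<Integer> requestedLengths = new ArrayList<>();

        RecordingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            requestedLengths.add(len);
            return super.read(b, off, len);
        }
    }

    // Performs one 32-byte read through an 8-byte buffer, optionally after mark(),
    // and returns the lengths that were requested from the underlying stream.
    static List<Integer> requestedLengths(boolean markFirst) {
        RecordingInputStream source =
                new RecordingInputStream(new ByteArrayInputStream(new byte[64]));
        try (BufferedInputStream in = new BufferedInputStream(source, 8)) {
            if (markFirst) {
                in.mark(100);
            }
            in.read(new byte[32], 0, 32);
            return source.requestedLengths;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("no mark:   " + requestedLengths(false));
        System.out.println("with mark: " + requestedLengths(true));
    }
}
```

Without a mark, the 32-byte request is passed straight to the source. With a mark set, the data has to go through the buffer so that reset() can replay it, and the first request is only as large as the 8-byte buffer.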

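Explanation:
To understand the BufferedInputStream source code, first understand its idea. BufferedInputStream provides buffering for another input stream. When we create a BufferedInputStream, we pass an input stream to its constructor. BufferedInputStream reads that stream's data in batches, one portion at a time into the buffer; once the portion in the buffer has been consumed, it reads the next portion from the input stream.
Why buffer at all? The reason is simple: efficiency. The buffered data lives in memory, while the original data may live on a hard disk, NAND flash, or another storage medium, and reading from memory is at least an order of magnitude faster than reading from disk. Buffering also means the underlying stream is asked for data once per batch rather than once per byte.
Then why not read all of the data into the buffer in one go? First, reading all of the data may take a long time. Second, memory is expensive and its capacity is much smaller than a disk's.

Below, I explain fill(), the most important method in BufferedInputStream. The other methods are easy to understand, so I will not cover them in detail; refer to the comments in the source code above.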

The source code of fill() is as follows:

private void fill() throws IOException {
    byte[] buffer = getBufIfOpen();
    if (markpos < 0)
        pos = 0;
    else if (pos >= buffer.length) {
        if (markpos > 0) {  /* can throw away early part of the buffer */
            int sz = pos - markpos;
            System.arraycopy(buffer, markpos, buffer, 0, sz);
            pos = sz;
            markpos = 0;
        } else if (buffer.length >= marklimit) {
            markpos = -1;   /* buffer got too big, invalidate mark */
            pos = 0;        /* drop buffer contents */
        } else {            /* grow buffer */
            int nsz = pos * 2;
            if (nsz > marklimit)
                nsz = marklimit;
            byte nbuf[] = new byte[nsz];
            System.arraycopy(buffer, 0, nbuf, 0, pos);
            if (!bufUpdater.compareAndSet(this, buffer, nbuf)) {
                // Can't replace buf if there was an async close.
                // Note: This would need to be changed if fill()
                // is ever made accessible to multiple threads.
                // But for now, the only way CAS can fail is via close.
                // assert buf == null;
                throw new IOException("Stream closed");
            }
            buffer = nbuf;
        }
    }

    count = pos;
    int n = getInIfOpen().read(buffer, pos, buffer.length - pos);
    if (n > 0)
        count = n + pos;
}

Following the if...else... branches in fill(), we split fill() into 5 cases.

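Case 1: the data in the buffer has been read completely, and the buffer has not been marked

The execution flow is:
(01) read() calls fill()
(02) in fill(): if (markpos < 0) ...
To simplify the analysis, in this case fill() is equivalent to the following code: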

private void fill() throws IOException {
    byte[] buffer = getBufIfOpen();
    if (markpos < 0)
        pos = 0;

    count = pos;
    int n = getInIfOpen().read(buffer, pos, buffer.length - pos);
    if (n > 0)
        count = n + pos;
}

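Explanation:
This case arises when the input stream holds a long run of data and we read it into the buffer one portion at a time. Each time we finish reading the data in the buffer, and the input stream is not marked, we go on to read the next portion from the input stream into the buffer.
Whether the data in the buffer has been read completely is determined by if (pos >= count);
whether the input stream is marked is determined by if (markpos < 0).

With this idea in mind, the fill() code for this case is easy to follow.
(01) if (markpos < 0) checks whether the input stream is marked. If it is marked, markpos is greater than or equal to 0; otherwise markpos equals -1.
(02) In this case, getInIfOpen() obtains the input stream, and then up to buffer.length bytes are read from it into the buffer.
(03) count = n + pos; updates the amount of valid data in the buffer according to the number of bytes actually read from the input stream. Since pos is 0 here, count equals n.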
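
Because pos, count and markpos are protected fields, case 1 can be watched from a subclass. This is a minimal sketch rather than part of the original analysis: FillCase1Demo and Inspectable are hypothetical names, and the 4-byte buffer and 10-byte source are arbitrary.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class FillCase1Demo {
    // Hypothetical subclass that exposes the protected fields pos, count and markpos.
    static class Inspectable extends BufferedInputStream {
        Inspectable(InputStream in, int size) {
            super(in, size);
        }

        int[] state() {
            return new int[] { pos, count, markpos };
        }
    }

    // Reads the given number of bytes through a 4-byte buffer without ever calling
    // mark(), and returns {pos, count, markpos} afterwards.
    static int[] stateAfter(int reads) {
        byte[] data = "abcdefghij".getBytes(StandardCharsets.US_ASCII);
        try (Inspectable in = new Inspectable(new ByteArrayInputStream(data), 4)) {
            for (int i = 0; i < reads; i++) {
                in.read();
            }
            return in.state();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // After 4 reads the buffer is exhausted (pos == count).
        System.out.println(Arrays.toString(stateAfter(4)));
        // The 5th read triggers fill(): pos restarts at 0, then the read advances it to 1.
        System.out.println(Arrays.toString(stateAfter(5)));
    }
}
```

The old buffer contents are simply overwritten, since with markpos equal to -1 nothing needs to be kept for reset().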

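Case 2: the data in the buffer has been read completely, the mark position is > 0, and there is no free space left in the buffer

The execution flow is:
(01) read() calls fill()
(02) in fill(): else if (pos >= buffer.length) ...
(03) in fill(): if (markpos > 0) ...

To simplify the analysis, in this case fill() is equivalent to the following code: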

private void fill() throws IOException {
    byte[] buffer = getBufIfOpen();
    if (markpos >= 0 && pos >= buffer.length) {
        if (markpos > 0) {
            int sz = pos - markpos;
            System.arraycopy(buffer, markpos, buffer, 0, sz);
            pos = sz;
            markpos = 0;
        }
    }

    count = pos;
    int n = getInIfOpen().read(buffer, pos, buffer.length - pos);
    if (n > 0)
        count = n + pos;
}

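Explanation:
This case arises when the input stream holds a long run of data and we read it into the buffer one portion at a time. When we finish reading the data in the buffer while a mark is set at a position greater than 0, case 2 occurs. We must keep the data from the marked position to the end of the buffer, and then read the next portion from the input stream into the buffer.
Whether the data in the buffer has been read completely is determined by if (pos >= count);
whether the input stream is marked is determined by if (markpos < 0);
whether the buffer has no free space left is determined by if (pos >= buffer.length).

With this idea in mind, the fill() code for this case is easy to follow.
(01) int sz = pos - markpos; computes the length of the data from the marked position to the end of the buffer.
(02) System.arraycopy(buffer, markpos, buffer, 0, sz); copies the data starting at markpos to the front of the buffer (starting at index 0, for sz bytes). Then sz is assigned to pos, so pos equals the length of the data from the marked position to the end of the buffer.
(03) int n = getInIfOpen().read(buffer, pos, buffer.length - pos); reads up to buffer.length - pos bytes from the input stream and stores them in the buffer starting at pos.
(04) After steps (02) and (03), the buffer holds both the data from the marked position to the end of the original buffer and the data newly read from the input stream.

Note: after case 2 runs, markpos changes from greater than 0 to exactly 0.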
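
The compaction in case 2 can be traced with concrete values. The sketch below assumes the running JDK's fill() matches the listing above; FillCase2Demo and Inspectable are hypothetical names, and the 4-byte buffer is chosen so that the buffer fills up quickly.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class FillCase2Demo {
    // Hypothetical subclass that exposes the protected fields for inspection.
    static class Inspectable extends BufferedInputStream {
        Inspectable(InputStream in, int size) {
            super(in, size);
        }

        String describe() {
            return "pos=" + pos + " count=" + count + " markpos=" + markpos
                    + " buf=" + new String(buf, 0, count, StandardCharsets.US_ASCII);
        }
    }

    // Returns {state after the compacting fill(), first character read after reset()}.
    static String[] run() {
        byte[] data = "abcdefghij".getBytes(StandardCharsets.US_ASCII);
        try (Inspectable in = new Inspectable(new ByteArrayInputStream(data), 4)) {
            in.read();      // 'a'; the first fill() loads "abcd"
            in.read();      // 'b'; pos is now 2
            in.mark(10);    // markpos = 2
            in.read();      // 'c'
            in.read();      // 'd'; pos == buffer.length, the buffer is full
            in.read();      // 'e'; triggers fill() with markpos > 0
            String afterFill = in.describe();
            in.reset();     // pos = markpos = 0
            char afterReset = (char) in.read();
            return new String[] { afterFill, String.valueOf(afterReset) };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] result = run();
        System.out.println(result[0]);
        System.out.println("first char after reset: " + result[1]);
    }
}
```

The bytes "cd" that follow the mark are moved to the front of the buffer, "ef" is read in behind them, and reset() still returns to 'c' even though markpos is now 0.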

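Case 3: the data in the buffer has been read completely, the mark position is 0, there is no free space left in the buffer, and buffer.length >= marklimit

The execution flow is:
(01) read() calls fill()
(02) in fill(): else if (pos >= buffer.length) ...
(03) in fill(): else if (buffer.length >= marklimit) ...

To simplify the analysis, in this case fill() is equivalent to the following code: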

private void fill() throws IOException {
    byte[] buffer = getBufIfOpen();
    if (markpos >= 0 && pos >= buffer.length) {
        if ( (markpos <= 0) && (buffer.length >= marklimit) ) {
            markpos = -1;   /* buffer got too big, invalidate mark */
            pos = 0;        /* drop buffer contents */
        }
    }

    count = pos;
    int n = getInIfOpen().read(buffer, pos, buffer.length - pos);
    if (n > 0)
        count = n + pos;
}

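Explanation: this case is handled very simply. First, the mark is discarded, i.e. markpos = -1; then the position is reset to 0, i.e. pos = 0; finally, the next portion of data is read from the input stream into the buffer.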
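
The visible consequence of case 3 is that reset() fails once the caller has read past marklimit. A minimal sketch using only the public API follows; FillCase3Demo is a hypothetical name, and the buffer size and read limit are both set to 4 so that buffer.length >= marklimit holds.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class FillCase3Demo {
    // Marks the start of the stream with readlimit 4, reads the given number of bytes
    // through a 4-byte buffer, and reports whether reset() then throws.
    static boolean resetFailsAfter(int reads) {
        byte[] data = "abcdefghij".getBytes(StandardCharsets.US_ASCII);
        try (BufferedInputStream in =
                     new BufferedInputStream(new ByteArrayInputStream(data), 4)) {
            in.mark(4);
            for (int i = 0; i < reads; i++) {
                in.read();
            }
            try {
                in.reset();
                return false;
            } catch (IOException invalidMark) {
                // fill() dropped the mark because the full buffer had reached marklimit.
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("reset fails after 4 reads: " + resetFailsAfter(4));
        System.out.println("reset fails after 5 reads: " + resetFailsAfter(5));
    }
}
```

Reading exactly 4 bytes stays within the limit, so reset() still works; the 5th read forces a fill() that invalidates the mark.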

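Case 4: the data in the buffer has been read completely, the mark position is 0, there is no free space left in the buffer, and buffer.length < marklimit

The execution flow is:
(01) read() calls fill()
(02) in fill(): else if (pos >= buffer.length) ...
(03) in fill(): else { int nsz = pos * 2; ... }

To simplify the analysis, in this case fill() is equivalent to the following code: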

private void fill() throws IOException {
    byte[] buffer = getBufIfOpen();
    if (markpos >= 0 && pos >= buffer.length) {
        if ( (markpos <= 0) && (buffer.length < marklimit) ) {
            int nsz = pos * 2;
            if (nsz > marklimit)
                nsz = marklimit;
            byte nbuf[] = new byte[nsz];
            System.arraycopy(buffer, 0, nbuf, 0, pos);
            if (!bufUpdater.compareAndSet(this, buffer, nbuf)) {
                throw new IOException("Stream closed");
            }
            buffer = nbuf;
        }
    }

    count = pos;
    int n = getInIfOpen().read(buffer, pos, buffer.length - pos);
    if (n > 0)
        count = n + pos;
}

Explanation:
This case is also straightforward.
(01) Create a new byte array nbuf. Its size is the smaller of pos*2 and marklimit.

int nsz = pos * 2;
if (nsz > marklimit)
    nsz = marklimit;
byte nbuf[] = new byte[nsz];

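(02) Next, copy the data in buffer into the new array nbuf, via System.arraycopy(buffer, 0, nbuf, 0, pos), and make nbuf the new buffer.
(03) Finally, read some new data from the input stream into the buffer, via getInIfOpen().read(buffer, pos, buffer.length - pos);
Note: here is a question worth thinking about: why is marklimit needed, and what is it for? We analyze it using cases 2, 3, and 4.

Suppose marklimit were unbounded and we had set markpos. Each time we finished one portion of the input stream and read the next, the data since markpos would have to be kept; that means repeating the operation in case 4 and enlarging the buffer. As the number of reads grows, the buffer gets larger and larger, and so does our memory use. So we need a marklimit: once buffer.length >= marklimit, the mark is no longer kept.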
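
The growth of the buffer and the role of marklimit can be followed step by step. The sketch below assumes the running JDK's fill() grows the buffer as in the listing above; FillCase4Demo and Inspectable are hypothetical names, and the starting size 4 and read limit 10 are arbitrary.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class FillCase4Demo {
    // Hypothetical subclass that exposes the buffer length and markpos.
    static class Inspectable extends BufferedInputStream {
        Inspectable(InputStream in, int size) {
            super(in, size);
        }

        int[] state() {
            return new int[] { buf.length, markpos };
        }
    }

    // Marks the start of the stream with readlimit 10, reads the given number of bytes
    // through a buffer that starts at 4 bytes, and returns {buf.length, markpos}.
    static int[] bufferAfter(int reads) {
        byte[] data = "abcdefghijklmnopqrstuvwxyz".getBytes(StandardCharsets.US_ASCII);
        try (Inspectable in = new Inspectable(new ByteArrayInputStream(data), 4)) {
            in.mark(10);
            for (int i = 0; i < reads; i++) {
                in.read();
            }
            return in.state();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (int reads : new int[] { 4, 5, 9, 11 }) {
            System.out.println(reads + " reads -> " + Arrays.toString(bufferAfter(reads)));
        }
    }
}
```

The buffer doubles from 4 to 8, is then capped at marklimit (10) instead of growing to 16, and on the 11th read the mark is dropped rather than letting the buffer grow further.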

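Case 5: every situation other than the 4 cases above, that is, a mark is set (markpos >= 0) and the buffer still has free space (pos < buffer.length)

The execution flow is:
(01) read() calls fill()
(02) in fill(): count = pos...

To simplify the analysis, in this case fill() is equivalent to the following code: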

private void fill() throws IOException {
    byte[] buffer = getBufIfOpen();

    count = pos;
    int n = getInIfOpen().read(buffer, pos, buffer.length - pos);
    if (n > 0)
        count = n + pos;
}

Explanation: this case is handled very simply. Some new data is read directly from the input stream into the buffer.
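
One simple way to reach case 5 is to call mark() before the first read, so that fill() runs with a mark set and an empty buffer. The following is a minimal sketch; FillCase5Demo and Inspectable are hypothetical names and the sizes are arbitrary.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class FillCase5Demo {
    // Hypothetical subclass that exposes the protected fields pos, count and markpos.
    static class Inspectable extends BufferedInputStream {
        Inspectable(InputStream in, int size) {
            super(in, size);
        }

        int[] state() {
            return new int[] { pos, count, markpos };
        }
    }

    // Calls mark() on a fresh stream, reads one byte, and returns {pos, count, markpos}.
    static int[] stateAfterMarkAndOneRead() {
        byte[] data = "abcdefghij".getBytes(StandardCharsets.US_ASCII);
        try (Inspectable in = new Inspectable(new ByteArrayInputStream(data), 4)) {
            in.mark(10);   // markpos = 0 while the buffer is still empty
            in.read();     // fill() skips every branch and just reads into the buffer
            return in.state();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(stateAfterMarkAndOneRead()));
    }
}
```

The mark stays at 0 and nothing is moved or discarded; the buffer is only topped up from pos onwards.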

Sample code

For detailed usage of the BufferedInputStream API, see the sample code (BufferedInputStreamTest.java):

import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

/**
 * BufferedInputStream test program
 *
 * @author skywang
 */
public class BufferedInputStreamTest {

    private static final int LEN = 5;

    public static void main(String[] args) {
        testBufferedInputStream();
    }

    /**
     * Exercises the BufferedInputStream API
     */
    private static void testBufferedInputStream() {

        File file = new File("bufferedinputstream.txt");
        // Create a BufferedInputStream (buffer size 512) over the file;
        // try-with-resources closes it on every exit path.
        try (InputStream in =
                  new BufferedInputStream(
                      new FileInputStream(file), 512)) {

            // Read 5 bytes from the stream: "abcde". 'a' is 0x61, 'b' is 0x62, and so on.
            for (int i=0; i<LEN; i++) {
                // Read the next byte only if at least one byte is available
                if (in.available() > 0) {
                    // Read the next byte of the stream
                    int tmp = in.read();
                    System.out.printf("%d : 0x%s\n", i, Integer.toHexString(tmp));
                }
            }

            // If the stream does not support marking, exit
            if (!in.markSupported()) {
                System.out.println("mark not supported!");
                return ;
            }

            // Mark the current position, i.e. the 6th byte, 'f'.
            // 1024 is the marklimit.
            in.mark(1024);

            // Skip 22 bytes.
            in.skip(22);

            // Read 5 bytes
            byte[] buf = new byte[LEN];
            in.read(buf, 0, LEN);
            // Convert buf to a String.
            String str1 = new String(buf);
            System.out.printf("str1=%s\n", str1);

            // Reset the stream position to the one marked by mark(), i.e. back to 'f'.
            in.reset();
            // Read 5 bytes from the reset stream into buf, i.e. "fghij"
            in.read(buf, 0, LEN);
            // Convert buf to a String.
            String str2 = new String(buf);
            System.out.printf("str2=%s\n", str2);
       } catch (FileNotFoundException e) {
           e.printStackTrace();
       } catch (SecurityException e) {
           e.printStackTrace();
       } catch (IOException e) {
           e.printStackTrace();
       }
    }
}

The content of bufferedinputstream.txt, the file read by the program, is:

abcdefghijklmnopqrstuvwxyz
0123456789
ABCDEFGHIJKLMNOPQRSTUVWXYZ

Output:

0 : 0x61
1 : 0x62
2 : 0x63
3 : 0x64
4 : 0x65
str1=01234
str2=fghij
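
To try the mark/skip/reset sequence without creating a file, the same steps can be run against an in-memory stream. This is a sketch under the assumption that the file uses "\n" line endings; InMemoryMarkResetDemo is a hypothetical name, and ByteArrayInputStream stands in for the FileInputStream of the sample.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class InMemoryMarkResetDemo {
    // Same content as bufferedinputstream.txt, assuming "\n" line endings.
    static final String CONTENT =
            "abcdefghijklmnopqrstuvwxyz\n0123456789\nABCDEFGHIJKLMNOPQRSTUVWXYZ\n";

    // Runs the mark/skip/reset sequence of the sample and returns {str1, str2}.
    static String[] run() {
        byte[] data = CONTENT.getBytes(StandardCharsets.US_ASCII);
        try (InputStream in = new BufferedInputStream(new ByteArrayInputStream(data), 512)) {
            byte[] buf = new byte[5];
            in.read(buf, 0, 5);     // consume "abcde"
            in.mark(1024);          // mark the position of 'f'
            in.skip(22);            // skip "fghijklmnopqrstuvwxyz" and the line break
            in.read(buf, 0, 5);
            String str1 = new String(buf, StandardCharsets.US_ASCII);
            in.reset();             // back to 'f'
            in.read(buf, 0, 5);
            String str2 = new String(buf, StandardCharsets.US_ASCII);
            return new String[] { str1, str2 };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] result = run();
        System.out.println("str1=" + result[0]);
        System.out.println("str2=" + result[1]);
    }
}
```

The 22 skipped bytes are the 21 letters from 'f' to 'z' plus the line break, which is why str1 starts at the first digit.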

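Further reading

01. Java collections series: table of contents (Category)

02. Java IO series 01: the IO framework

03. Java IO series 02: ByteArrayInputStream, introduction, source analysis and examples (including InputStream)

04. Java IO series 03: ByteArrayOutputStream, introduction, source analysis and examples (including OutputStream)

05. Java IO series 04: pipes (PipedOutputStream and PipedInputStream), introduction, source analysis and examples

06. Java IO series 05: ObjectInputStream and ObjectOutputStream

07. Java IO series 06: serialization summary (Serializable and Externalizable)

08. Java IO series 07: FileInputStream and FileOutputStream

09. Java IO series 08: File summary

10. Java IO series 09: FileDescriptor summary

11. Java IO series 10: FilterInputStream

12. Java IO series 11: FilterOutputStream

13. Java IO series 12: BufferedInputStream (buffered input stream), concepts, source code and examples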

posted on 2013-10-26 09:56 by 如果天空不死
