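Containers: ArrayList

I. Preface

As one of the most important List implementations, ArrayList has become an indispensable container in everyday programming, and it comes up frequently in interviews. Understanding how it is implemented therefore pays off both for using the class correctly and for interview preparation.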

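II. Implementation

ArrayList is no longer an abstract class; it can be used directly. So we need to work out how it actually implements a List container. Specifically, the following points deserve attention:

1) How is the data stored?

This one is fairly obvious. ArrayList declares an array field internally: private transient Object[] elementData; Note that although ArrayList supports generics, the array type is still Object. That is because Java does not allow creating an array of a type parameter: T[] data = new T[size] does not compile, so Object is the only option. Also note that the field is transient, which means default serialization skips it and ArrayList has to write and read its elements itself (it does so in its writeObject and readObject methods).

The elementData array thus holds every element of the ArrayList. Because arrays support random access by index, methods such as get and set are straightforward and run in constant time.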
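The storage idea can be illustrated with a minimal sketch. The class `SimpleArrayBox` and its methods are hypothetical names made up for this example, not part of the JDK; the sketch assumes a fixed capacity and leaves out growth entirely.

```java
// Hypothetical illustration: a tiny fixed-capacity container that stores its
// elements in an Object[] because `new T[capacity]` does not compile.
class SimpleArrayBox<T> {
    private final Object[] elementData; // same idea as ArrayList's backing array
    private int size;                   // number of slots actually in use

    SimpleArrayBox(int capacity) {
        // T[] data = new T[capacity];  // compile error: generic array creation
        this.elementData = new Object[capacity];
    }

    void add(T value) {
        // No growth logic in this sketch: throws ArrayIndexOutOfBoundsException when full.
        elementData[size++] = value;
    }

    @SuppressWarnings("unchecked")
    T get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
        }
        // The cast is unchecked, but only values of type T are ever stored via add(T).
        return (T) elementData[index];
    }

    int size() {
        return size;
    }
}
```

The unchecked cast in get is the price of using Object[] as storage; it is safe here only because add is the single way to put values into the array.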

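2) How does the storage grow and get reclaimed?

If ArrayList could do no more than a plain array, there would be no point in defining it; we would just use arrays. At least part of the reason we use ArrayList is that it grows dynamically without the caller having to care how, whereas growing an array is something the programmer must do by hand, which is tedious. So how does ArrayList grow, and what is the difference between its capacity and the number of elements it holds?

ArrayList defines a field, int size, which is the number of elements in the container. The length of elementData is its capacity, so size <= elementData.length always holds, and usually size is strictly smaller. When an element is added to the list, the implementation first checks the current capacity to decide whether it needs to grow. Taking add(E e) as an example, the relevant code is: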

    public boolean add(E e) {
        // Not every add triggers a resize, but every add increments modCount.
        ensureCapacityInternal(size + 1);  // Increments modCount!!
        elementData[size++] = e;
        return true;
    }

    private void grow(int minCapacity) {
        // overflow-conscious code
        int oldCapacity = elementData.length; // current capacity
        // The new capacity is the old capacity plus half of it, e.g. 10 becomes 15.
        int newCapacity = oldCapacity + (oldCapacity >> 1);

        // If that is still below the requested minimum (say 16), use the minimum.
        if (newCapacity - minCapacity < 0)
            newCapacity = minCapacity;

        // If it is too large, derive the value from the requested minimum;
        // the result can never exceed Integer.MAX_VALUE.
        if (newCapacity - MAX_ARRAY_SIZE > 0)
            newCapacity = hugeCapacity(minCapacity);

        // minCapacity is usually close to size, so this is a win:
        // allocate an array of length newCapacity and copy the elements over;
        // the extra slots default to null.
        elementData = Arrays.copyOf(elementData, newCapacity);
    }

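The key method is grow. It first increases the capacity by half; if that still falls short of what is required, newCapacity is set to the capacity passed in as the argument. Finally, Arrays.copyOf copies the existing elements into a new, larger array, which completes the expansion.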
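To make the arithmetic concrete, here is a small sketch of the capacity calculation on its own. `GrowthSketch` and `newCapacity` are hypothetical names for this example, and the MAX_ARRAY_SIZE / hugeCapacity branch is deliberately left out, so it only models small capacities.

```java
class GrowthSketch {
    // Mirrors the core arithmetic of grow(): add 50%, then fall back to the
    // requested minimum if that is not enough. Overflow handling is omitted.
    static int newCapacity(int oldCapacity, int minCapacity) {
        int candidate = oldCapacity + (oldCapacity >> 1); // old capacity plus half
        return Math.max(candidate, minCapacity);
    }

    public static void main(String[] args) {
        // Simulate repeated single-element adds on a full array, starting at 10.
        int capacity = 10;
        for (int i = 0; i < 4; i++) {
            capacity = newCapacity(capacity, capacity + 1);
            System.out.println(capacity); // prints 15, 22, 33, 49 on successive lines
        }
        // A bulk add that needs far more room than +50% jumps to the minimum.
        System.out.println(newCapacity(10, 30)); // prints 30
    }
}
```

Because the right shift truncates, very small arrays barely grow on the +50% rule (1 + 0 = 1), which is exactly when the minCapacity fallback takes over.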

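When elements are added one at a time, most calls do not trigger a resize, because each resize leaves spare room for later additions. When a whole collection is added with addAll, however, growing by half may not be enough, and in that case the capacity jumps straight to the required minimum.

On removal, elementData does not shrink, but the vacated slot is set to null so that the removed object can be garbage collected and no memory is leaked unnecessarily.

We can specify the initial size of elementData at construction time. If we do not, it starts as an array of length 0 and is expanded to the default capacity of 10 on the first add.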
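From the caller's side, capacity can be influenced through the ArrayList(int) constructor, ensureCapacity and trimToSize, all of which are public ArrayList methods. The sketch below shows one way they might be combined; the class `CapacityHints` and the method `presized` are hypothetical names for this example.

```java
import java.util.ArrayList;

class CapacityHints {
    // Builds a list of n integers with the backing array sized up front,
    // so no intermediate resize is needed while filling it.
    static ArrayList<Integer> presized(int n) {
        ArrayList<Integer> list = new ArrayList<>(n); // initial capacity n
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        return list;
    }

    public static void main(String[] args) {
        ArrayList<Integer> list = presized(1000);
        list.subList(10, 1000).clear(); // keep only the first 10 elements
        // Removal does not shrink the backing array;
        // trimToSize() releases the unused slots explicitly.
        list.trimToSize();
        list.ensureCapacity(50);        // request room for at least 50 elements
        System.out.println(list.size()); // prints 10: capacity changes never affect size
    }
}
```

Capacity itself is not observable through the public API, so the only visible effect of these calls is on memory use and on how often the list has to reallocate.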

3) How is the iterator implemented? This is fairly simple: it walks the backing array by index.
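A cursor-based iterator over an array might look like the sketch below. `ArrayIteratorSketch` is a hypothetical class for illustration only; it assumes the array is not modified during iteration and omits the remove operation.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical sketch: iterates over the first `size` slots of a backing array.
class ArrayIteratorSketch<E> implements Iterator<E> {
    private final Object[] elementData;
    private final int size;
    private int cursor; // index of the next element to return

    ArrayIteratorSketch(Object[] elementData, int size) {
        this.elementData = elementData;
        this.size = size;
    }

    @Override
    public boolean hasNext() {
        return cursor != size;
    }

    @Override
    @SuppressWarnings("unchecked")
    public E next() {
        if (cursor >= size) {
            throw new NoSuchElementException();
        }
        // Unused slots beyond `size` are never reached, so nulls there are ignored.
        return (E) elementData[cursor++];
    }
}
```

The iterator in the real class does more than this sketch: it also compares the list's modCount with the value it saw at creation and throws ConcurrentModificationException if the list was structurally modified in the meantime.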

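4) How is indexOf implemented? The basic idea is again to traverse the array, compare each element with equals, and return the index of the first match, or -1 if there is none. ArrayList allows null elements, though, so the traversal has two cases: when the target is null, the comparison takes the form == null instead of a call to equals.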
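The two-case traversal can be sketched as a standalone method. `IndexOfSketch` is a hypothetical helper that takes the backing array and element count as parameters, which the real method reads from fields instead.

```java
class IndexOfSketch {
    // Returns the index of the first occurrence of o among the first `size`
    // slots of elementData, or -1 if it is not present.
    static int indexOf(Object[] elementData, int size, Object o) {
        if (o == null) {
            // Cannot call equals on null, so compare by reference.
            for (int i = 0; i < size; i++) {
                if (elementData[i] == null) {
                    return i;
                }
            }
        } else {
            // o is known to be non-null, so o.equals(...) is safe even if
            // the array slot holds null.
            for (int i = 0; i < size; i++) {
                if (o.equals(elementData[i])) {
                    return i;
                }
            }
        }
        return -1;
    }
}
```

Splitting the loop once on the target, rather than testing for null on every element, keeps the per-element work to a single comparison.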

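5) How are elements shifted within the array?

A staple of written tests and interviews is comparing the strengths and weaknesses of ArrayList and LinkedList. The usual answer is that ArrayList is weak at insertion and removal, because the elements after the affected position have to be moved, which is slow. The source code shows that this description is incomplete: on insertion or removal, the implementation does not move elements one at a time in a Java loop but shifts the whole tail with a single bulk array copy. The number of elements moved is still proportional to the distance from the end of the list, but the copy itself is fast. Reading the ArrayList source, you will find System.arraycopy used in many places, so the method deserves a closer look.

It is declared as follows: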

    /**
     * Copies an array from the specified source array, beginning at the
     * specified position, to the specified position of the destination array.
     * A subsequence of array components are copied from the source
     * array referenced by <code>src</code> to the destination array
     * referenced by <code>dest</code>. The number of components copied is
     * equal to the <code>length</code> argument. The components at
     * positions <code>srcPos</code> through
     * <code>srcPos+length-1</code> in the source array are copied into
     * positions <code>destPos</code> through
     * <code>destPos+length-1</code>, respectively, of the destination
     * array.
     * <p>
     * If the <code>src</code> and <code>dest</code> arguments refer to the
     * same array object, then the copying is performed as if the
     * components at positions <code>srcPos</code> through
     * <code>srcPos+length-1</code> were first copied to a temporary
     * array with <code>length</code> components and then the contents of
     * the temporary array were copied into positions
     * <code>destPos</code> through <code>destPos+length-1</code> of the
     * destination array.
     * <p>
     * If <code>dest</code> is <code>null</code>, then a
     * <code>NullPointerException</code> is thrown.
     * <p>
     * If <code>src</code> is <code>null</code>, then a
     * <code>NullPointerException</code> is thrown and the destination
     * array is not modified.
     * <p>
     * Otherwise, if any of the following is true, an
     * <code>ArrayStoreException</code> is thrown and the destination is
     * not modified:
     * <ul>
     * <li>The <code>src</code> argument refers to an object that is not an
     *     array.
     * <li>The <code>dest</code> argument refers to an object that is not an
     *     array.
     * <li>The <code>src</code> argument and <code>dest</code> argument refer
     *     to arrays whose component types are different primitive types.
     * <li>The <code>src</code> argument refers to an array with a primitive
     *    component type and the <code>dest</code> argument refers to an array
     *     with a reference component type.
     * <li>The <code>src</code> argument refers to an array with a reference
     *    component type and the <code>dest</code> argument refers to an array
     *     with a primitive component type.
     * </ul>
     * <p>
     * Otherwise, if any of the following is true, an
     * <code>IndexOutOfBoundsException</code> is
     * thrown and the destination is not modified:
     * <ul>
     * <li>The <code>srcPos</code> argument is negative.
     * <li>The <code>destPos</code> argument is negative.
     * <li>The <code>length</code> argument is negative.
     * <li><code>srcPos+length</code> is greater than
     *     <code>src.length</code>, the length of the source array.
     * <li><code>destPos+length</code> is greater than
     *     <code>dest.length</code>, the length of the destination array.
     * </ul>
     * <p>
     * Otherwise, if any actual component of the source array from
     * position <code>srcPos</code> through
     * <code>srcPos+length-1</code> cannot be converted to the component
     * type of the destination array by assignment conversion, an
     * <code>ArrayStoreException</code> is thrown. In this case, let
     * <b><i>k</i></b> be the smallest nonnegative integer less than
     * length such that <code>src[srcPos+</code><i>k</i><code>]</code>
     * cannot be converted to the component type of the destination
     * array; when the exception is thrown, source array components from
     * positions <code>srcPos</code> through
     * <code>srcPos+</code><i>k</i><code>-1</code>
     * will already have been copied to destination array positions
     * <code>destPos</code> through
     * <code>destPos+</code><i>k</I><code>-1</code> and no other
     * positions of the destination array will have been modified.
     * (Because of the restrictions already itemized, this
     * paragraph effectively applies only to the situation where both
     * arrays have component types that are reference types.)
     *
     * @param      src      the source array.
     * @param      srcPos   starting position in the source array.
     * @param      dest     the destination array.
     * @param      destPos  starting position in the destination data.
     * @param      length   the number of array elements to be copied.
     * @exception  IndexOutOfBoundsException  if copying would cause
     *               access of data outside array bounds.
     * @exception  ArrayStoreException  if an element in the <code>src</code>
     *               array could not be stored into the <code>dest</code> array
     *               because of a type mismatch.
     * @exception  NullPointerException if either <code>src</code> or
     *               <code>dest</code> is <code>null</code>.
     */
    public static native void arraycopy(Object src,  int  srcPos,
                                        Object dest, int destPos,
                                        int length);

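This is a native method defined in the System class for copying elements between arrays. Specifically, it copies length elements from src, starting at srcPos, into dest, starting at destPos. Because it is implemented natively and can copy memory directly, it generally performs well even when the arrays are large.

The method requires the two arrays to hold elements of the same type, or at least of assignment-compatible types; one cannot be an array of references while the other is an array of primitives. In addition, neither srcPos + length nor destPos + length may exceed the length of the corresponding array, otherwise an IndexOutOfBoundsException is thrown.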
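The failure modes described above can be observed with a small probe. `ArraycopyChecks` and `tryCopy` are hypothetical names for this example; the method simply reports which exception category, if any, a given call to System.arraycopy produces.

```java
class ArraycopyChecks {
    // Runs System.arraycopy and returns "ok" or the category of exception thrown.
    static String tryCopy(Object src, int srcPos, Object dest, int destPos, int length) {
        try {
            System.arraycopy(src, srcPos, dest, destPos, length);
            return "ok";
        } catch (NullPointerException e) {
            return "NullPointerException";       // src or dest is null
        } catch (ArrayStoreException e) {
            return "ArrayStoreException";        // incompatible array or element types
        } catch (IndexOutOfBoundsException e) {
            return "IndexOutOfBoundsException";  // range does not fit in src or dest
        }
    }

    public static void main(String[] args) {
        // Same primitive type, range fits: succeeds.
        System.out.println(tryCopy(new int[3], 0, new int[3], 0, 3));
        // Primitive source, reference destination: rejected before anything is copied.
        System.out.println(tryCopy(new int[3], 0, new Object[3], 0, 3));
        // Destination too short for the requested length.
        System.out.println(tryCopy(new int[3], 0, new int[2], 0, 3));
        // Reference arrays: elements are checked one by one, so "a" is copied
        // before the Integer fails to fit into a String[].
        String[] dest = new String[2];
        System.out.println(tryCopy(new Object[] {"a", 1}, 0, dest, 0, 2));
        System.out.println(dest[0]);
    }
}
```

The last case is the partial-copy situation from the Javadoc: with reference types the destination may already have been modified when the ArrayStoreException is thrown.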

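As a special case, src and dest may be the same array. The copy then behaves as if the source elements were first copied to a temporary array and then from the temporary array into dest, so overlapping ranges are handled correctly.

With this method available, inserting and removing elements is cheaper than the usual interview answer suggests, although the cost still grows with the number of elements that have to be shifted.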
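Insertion and removal by shifting within a single array can be sketched as follows. `ShiftSketch`, `insertAt` and `removeAt` are hypothetical helpers for this example; they assume the caller tracks the element count and that the array already has a spare slot before an insert.

```java
class ShiftSketch {
    // Inserts value at index by shifting the tail one slot to the right.
    // Assumes data.length > size, i.e. there is at least one spare slot.
    static void insertAt(Object[] data, int size, int index, Object value) {
        // src and dest are the same array; the overlapping ranges are handled safely.
        System.arraycopy(data, index, data, index + 1, size - index);
        data[index] = value;
    }

    // Removes the element at index by shifting the tail one slot to the left,
    // then clears the now-unused last slot so the old reference can be collected.
    static Object removeAt(Object[] data, int size, int index) {
        Object old = data[index];
        int numMoved = size - index - 1;
        if (numMoved > 0) {
            System.arraycopy(data, index + 1, data, index, numMoved);
        }
        data[size - 1] = null;
        return old;
    }

    public static void main(String[] args) {
        Object[] data = {"a", "b", "d", null}; // 3 elements, capacity 4
        insertAt(data, 3, 2, "c");             // data is now a, b, c, d
        Object removed = removeAt(data, 4, 0); // data is now b, c, d, null
        System.out.println(removed);           // prints a
    }
}
```

Each call moves size - index (or size - index - 1) elements, so operations near the end of the list are cheap while operations near the front shift almost the whole array.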

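III. Summary

ArrayList is backed by an array, and arraycopy plays a major role in its implementation; once that method is understood, the overall structure is not hard to follow. The remaining methods are therefore not covered one by one. Next we will move on to another commonly used List implementation: LinkedList.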

Posted @ 2016-08-03 08:31 by 海上劳工