ConcurrentHashMap 源码解析 -- Java 容器

ConcurrentHashMap的整个结构是一个Segment数组,每个数组由单独的一个锁组成,Segment继承了ReentrantLock。
然后每个Segment中的结构又是类似于HashTable,也就是又是一个数组,数组的元素类型是HashEntry,每个形成一个桶。
要找每个元素的时候,首先hash一次,找到对应的Segment,再hash一次找到Segment中对应的数组下标,然后再通过遍历链表找到对应的元素。
 
concurrencyLevel 最多为1<<16 即 65536,最小为16
没有任何操作能够阻止访问整个表
如果只有一个修改线程,其他都是读取线程,那么把concurrencyLevel 设置为1就好
ConcurrentHashMap 同样不支持key或value为null的entry
 
下面这个函数用来确定在哪个Segment中,
    final Segment<K,V> segmentFor(int hash) {
        return segments[(hash >>> segmentShift) & segmentMask];
    }
 
但是先看看segmentShift和segmentMask是怎么来的,在构造函数中
 1         // Find power-of-two sizes best matching arguments
 2         int sshift = 0;
 3         int ssize = 1;
 4         while (ssize < concurrencyLevel) {
 5             ++sshift;
 6             ssize <<= 1;
 7         }
 8         segmentShift = 32 - sshift;
 9         segmentMask = ssize - 1;
10         this.segments = Segment.newArray(ssize);

ssize 初始值是1, 假如concurrencyLevel是16,在ssize不断左移乘与2的过程中,sshift记录了总共移动了多少位

concurrentyLevel是16,那么ssize从1到16,总共是移动了4位

segmentShift = 32 - sshift = 32 -4 = 28

segmentMask = ssize - 1 = 16 - 1 = 15,二进制表示就是 1111

所以在上面的segmentFor函数中,使用的是将hash值无符号右移segmentShift 位,再通过segmentMask进行与操作,得到其实原来hash值的高sshift位,这个例子就是最高4位的值,4位刚好能够表示16个Segment

接下来看看构造函数的其它部分

 1         if (initialCapacity > MAXIMUM_CAPACITY)
 2             initialCapacity = MAXIMUM_CAPACITY;
 3         int c = initialCapacity / ssize;
 4         if (c * ssize < initialCapacity)
 5             ++c;
 6         int cap = 1;
 7         while (cap < c)
 8             cap <<= 1;
 9 
10         for (int i = 0; i < this.segments.length; ++i)
11             this.segments[i] = new Segment<K,V>(cap, loadFactor);

1-2 行初始化初始容量,最大为1 <<30 ,不能超过这个值

接下来计算c,如果 c*ssize < initialCapacity, ssize是segment的数目,将多个容量平均分成ssize份,如果还是比initialCapacity小,那么就把c+1。不难理解。

默认情况下,initialCapacity是16,ssize是15,那么c==0,那么此时cap==1 不变

解析来又是移位操作,这么做是为了保证每个segment中元素的数量都是2的整数次幂。cap就是比c大的最小一个2的整数次幂。

然后通过for循环,在初始化每个Segment

1         Segment(int initialCapacity, float lf) {
2             loadFactor = lf;
3             setTable(HashEntry.<K,V>newArray(initialCapacity));
4         }
1     @SuppressWarnings("unchecked")
2     static final <K,V> HashEntry<K,V>[] newArray(int i) {
3         return new HashEntry[i];
4     }

在Segment的构造函数中,调用HashEntry的newArray函数,也就是创建一个initialCapacity大小的HashEntry的数组

 

ConcurrentHashMap中的读是不需要加锁的,通过volatile来实现,写数据的时候,写完写volatile,但是读之前,先读volatile。volatile的特性保证了,volatile之前写的东西,能够被volatile读之后的线程读到。也就是写volatile的时候,会把cache刷到内存中,而读volatile的时候,会将cache的数据invalidate,而从内存中读取,这样就能够保证是最新的值。

 

        void rehash() {
            HashEntry<K,V>[] oldTable = table;
            int oldCapacity = oldTable.length;
            if (oldCapacity >= MAXIMUM_CAPACITY)
                return;

            /*
             * Reclassify nodes in each list to new Map.  Because we are
             * using power-of-two expansion, the elements from each bin
             * must either stay at same index, or move with a power of two
             * offset. We eliminate unnecessary node creation by catching
             * cases where old nodes can be reused because their next
             * fields won't change. Statistically, at the default
             * threshold, only about one-sixth of them need cloning when
             * a table doubles. The nodes they replace will be garbage
             * collectable as soon as they are no longer referenced by any
             * reader thread that may be in the midst of traversing table
             * right now.
             */

            HashEntry<K,V>[] newTable = HashEntry.newArray(oldCapacity<<1);  //注意这里容量只扩大为原来的两倍,所以元素的下标不然就是原来的两倍大,不然就是和原来一样
            threshold = (int)(newTable.length * loadFactor);
            int sizeMask = newTable.length - 1;
            for (int i = 0; i < oldCapacity ; i++) {
                // We need to guarantee that any existing reads of old Map can
                //  proceed. So we cannot yet null out each bin.
                HashEntry<K,V> e = oldTable[i];      //获取链表的头部

                if (e != null) {
                    HashEntry<K,V> next = e.next;
                    int idx = e.hash & sizeMask;     //得到新的下标idx,注意sizeMask前面已经变成newTable.length-1,sizeMask变成了原来的两倍

                    //  Single node on list
                    if (next == null)   //如果只有一个节点,那么就结束了
                        newTable[idx] = e;   //直接把新的index指向它,这里为什么不用管newTable[idx]中原来是否有元素,因为前面是两倍扩容,所以前面的所有数组的index肯定不会映射到这里,所以肯定为null

                    else {
                        // Reuse trailing consecutive sequence at same slot
                        HashEntry<K,V> lastRun = e;   
                        int lastIdx = idx;    
                        for (HashEntry<K,V> last = next;   //在链表中遍历
                             last != null;
                             last = last.next) {
                            int k = last.hash & sizeMask;     计算下一个的hash值
                            if (k != lastIdx) {    //如果hash值等于前面计算的hash值,那么就换掉
                                lastIdx = k;   
                                lastRun = last;  
                            }
                        }
                        newTable[lastIdx] = lastRun;   //这样能够保留最长的同一个hash值的所有节点,至少是最后一个节点,相当于把一个链表最后几个能够hash到新的table中同一个位置的HashEntry重复利用起来

                        // Clone all remaining nodes
                        for (HashEntry<K,V> p = e; p != lastRun; p = p.next) {  //处理其他的节点
                            int k = p.hash & sizeMask;
                            HashEntry<K,V> n = newTable[k];   //每次都添加在每个链表的前面
                            newTable[k] = new HashEntry<K,V>(p.key, p.hash,
                                                             n, p.value);
                        }
                    }
                }
            }
            table = newTable;
        }

 

Segment的remove函数

 1        /**
 2          * Remove; match on key only if value null, else match both.
 3          */
 4         V remove(Object key, int hash, Object value) {
 5             lock();
 6             try {
 7                 int c = count - 1;
 8                 HashEntry<K,V>[] tab = table;
 9                 int index = hash & (tab.length - 1);
10                 HashEntry<K,V> first = tab[index];      
11                 HashEntry<K,V> e = first;
12                 while (e != null && (e.hash != hash || !key.equals(e.key)))
13                     e = e.next;
14 
15                 V oldValue = null;
16                 if (e != null) {
17                     V v = e.value;
18                     if (value == null || value.equals(v)) {
19                         oldValue = v;   //由于next指针是final的,所以前面的所有HashEntry都要被clone
20                         // All entries following removed node can stay
21                         // in list, but all preceding ones need to be
22                         // cloned.
23                         ++modCount;
24                         HashEntry<K,V> newFirst = e.next;//从头遍历到要删除的节点,不断添加到表头
25                         for (HashEntry<K,V> p = first; p != e; p = p.next)
26                             newFirst = new HashEntry<K,V>(p.key, p.hash,
27                                                           newFirst, p.value);
28                         tab[index] = newFirst;
29                         count = c; // write-volatile
30                     }
31                 }
32                 return oldValue;
33             } finally {
34                 unlock();
35             }
36         }

 

 1     public boolean isEmpty() {
 2         final Segment<K,V>[] segments = this.segments;
 3         /*
 4          * We keep track of per-segment modCounts to avoid ABA
 5          * problems in which an element in one segment was added and
 6          * in another removed during traversal, in which case the
 7          * table was never actually empty at any point. Note the
 8          * similar use of modCounts in the size() and containsValue()
 9          * methods, which are the only other methods also susceptible
10          * to ABA problems.
11          */
12         int[] mc = new int[segments.length];
13         int mcsum = 0;
14         for (int i = 0; i < segments.length; ++i) {
15             if (segments[i].count != 0)   //如果count!=0 那么肯定是false
16                 return false;
17             else
18                 mcsum += mc[i] = segments[i].modCount;   //记录modCount
19         }
20         // If mcsum happens to be zero, then we know we got a snapshot
21         // before any modifications at all were made.  This is
22         // probably common enough to bother tracking.
23         if (mcsum != 0) { //如果mssum 等于0,也就是说某一瞬间,集合是空的
24             for (int i = 0; i < segments.length; ++i) {
25                 if (segments[i].count != 0 ||   //再判断一次
26                     mc[i] != segments[i].modCount) //或者这个过程中modCount发生了变化
27                     return false;
28             }
29         }
30         return true;
31     }

 

 1     /**
 2      * Returns the number of key-value mappings in this map.  If the
 3      * map contains more than <tt>Integer.MAX_VALUE</tt> elements, returns
 4      * <tt>Integer.MAX_VALUE</tt>.
 5      *
 6      * @return the number of key-value mappings in this map
 7      */
 8     public int size() {
 9         final Segment<K,V>[] segments = this.segments;
10         long sum = 0;
11         long check = 0;
12         int[] mc = new int[segments.length];
13         // Try a few times to get accurate count. On failure due to
14         // continuous async changes in table, resort to locking.
15         for (int k = 0; k < RETRIES_BEFORE_LOCK; ++k) {  //最多检查两遍,如果这个过程中老是有线程在修改,那么就请求锁来解决
16             check = 0;
17             sum = 0;
18             int mcsum = 0;
19             for (int i = 0; i < segments.length; ++i) {
20                 sum += segments[i].count;  //计算个数
21                 mcsum += mc[i] = segments[i].modCount;   //记录modCount
22             }
23             if (mcsum != 0) {  //如果这个过程中发生了修改
24                 for (int i = 0; i < segments.length; ++i) {
25                     check += segments[i].count;   //重新计算check
26                     if (mc[i] != segments[i].modCount) { //如果某个modCount变化了,也说明发生了改变,必须重试
27                         check = -1; // force retry
28                         break;
29                     }
30                 }
31             }
32             if (check == sum) //
33                 break;
34         }
35         if (check != sum) { // Resort to locking all segments  //全部锁住再计算
36             sum = 0;
37             for (int i = 0; i < segments.length; ++i)
38                 segments[i].lock();
39             for (int i = 0; i < segments.length; ++i)
40                 sum += segments[i].count;
41             for (int i = 0; i < segments.length; ++i)
42                 segments[i].unlock();
43         }
44         if (sum > Integer.MAX_VALUE)
45             return Integer.MAX_VALUE;
46         else
47             return (int)sum;
48     }

 

 1     /**
 2      * Removes the key (and its corresponding value) from this map.
 3      * This method does nothing if the key is not in the map.
 4      *
 5      * @param  key the key that needs to be removed
 6      * @return the previous value associated with <tt>key</tt>, or
 7      *         <tt>null</tt> if there was no mapping for <tt>key</tt>
 8      * @throws NullPointerException if the specified key is null
 9      */
10     public V remove(Object key) {
11     int hash = hash(key.hashCode());
12         return segmentFor(hash).remove(key, hash, null);  //这里是如果原来这个key映射到某个值,那么就remove掉,这里传入null,在remove函数中就知道直接不管原来的值是什么,直接删掉
13     }
14 
15     /**
16      * {@inheritDoc}
17      *
18      * @throws NullPointerException if the specified key is null
19      */
20     public boolean remove(Object key, Object value) {
21         int hash = hash(key.hashCode());
22         if (value == null)   //但是如果value是null,是不允许的,所以return false,因为null被用来判断特殊情况,如上所示
23             return false;
24         return segmentFor(hash).remove(key, hash, value) != null;   
25     }

 

 

 

 

 

 

 

 

 

 

 

 

 

posted @ 2014-08-25 23:06  longshaohang  阅读(263)  评论(0编辑  收藏  举报