JDK 8 - java.util.HashMap 实现机制分析

官方文档对 HashMap 的定义：

public class HashMap<K,V>
extends AbstractMap<K,V>
implements Map<K,V>, Cloneable, Serializable

UML Class Diagram:

Class Diagram for HashMap

HashMap 实现了 Map interface。

HashMap 是一个数据结构，如同一个 DBMS 一样，基本功能其实就是增删改查。

Operations	Time Complexity	Notes
get, put	O(1)	assuming the hash function has dispersed the elements properly among the buckets
iteration over collection views	proportional to the capacity plus the size	do not set the initial capacity too high or the load factor too low

不保证迭代顺序，也不保证顺序一直不变。
非同步，允许null值（key&value）
迭代的时间复杂度：O(capacity+#mappings)，与容量和元素数量有关（若对迭代性能要求高，不要将capacity设置过高，load factor设置过小，避免空闲空间过多）。
内部数组长度为2的幂次方（为了高效实现取模运算），元素下标通过hash&(length-1)计算得到；hash&(length-1) == hash%length，这两个操作等价但不等效，前者更高效（使用了位与运算）。

1、根据 key 查询 map

调用 get(Object key) 方法：

1     public V get(Object key) {
2         Node<K,V> e;
3         return (e = getNode(hash(key), key)) == null ? null : e.value;
4     }

hash(key)：

 1     /**
 2      * Computes key.hashCode() and spreads (XORs) higher bits of hash
 3      * to lower.  Because the table uses power-of-two masking, sets of
 4      * hashes that vary only in bits above the current mask will
 5      * always collide. (Among known examples are sets of Float keys
 6      * holding consecutive whole numbers in small tables.)  So we
 7      * apply a transform that spreads the impact of higher bits
 8      * downward. There is a tradeoff between speed, utility, and
 9      * quality of bit-spreading. Because many common sets of hashes
10      * are already reasonably distributed (so don't benefit from
11      * spreading), and because we use trees to handle large sets of
12      * collisions in bins, we just XOR some shifted bits in the
13      * cheapest possible way to reduce systematic lossage, as well as
14      * to incorporate impact of the highest bits that would otherwise
15      * never be used in index calculations because of table bounds.
16      */
17     static final int hash(Object key) {
18         int h;
19         return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
20     }

这个方法计算 key 的哈希值，可以看到如果 key == null，哈希值为 0；

否则，调用 Object.hashCode() 计算 key 的哈希值，再将这个哈希值的低16位与自己的高16位按位异或，最后返回这个异或值。

从注释看，设计者为了降低 key collision，综合考虑了速度、实用、质量三个指标，再加上一般情况下key 的分布已经很均匀了，所以仅仅利用了key哈希值的高16位进行异或，最后得到了比较理想的结果。

然后调用了 getNode(int hash, Object key)，这个方法为 final，子类无法覆盖：

 1 final Node<K,V> getNode(int hash, Object key) {
 2         Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
 3         // 如果table != null && not empty && 根据 hash 找到对应位置的元素也不为null
 4         if ((tab = table) != null && (n = tab.length) > 0 &&
 5             (first = tab[(n - 1) & hash]) != null) {
 6             // 如果哈希值相等 &&（key相同或相等），则找到了元素，直接返回
 7             if (first.hash == hash && // always check first node
 8                 ((k = first.key) == key || (key != null && key.equals(k))))
 9                 return first;
10             // 不是要找的元素，查找这个元素的下一个元素
11             if ((e = first.next) != null) {
12                 // 是一棵树，调用 getTreeNode 查找
13                 if (first instanceof TreeNode)
14                     return ((TreeNode<K,V>)first).getTreeNode(hash, key);
15                 // 是一个链表，顺序查找
16                 do {
17                     if (e.hash == hash &&
18                         ((k = e.key) == key || (key != null && key.equals(k))))
19                         return e;
20                 } while ((e = e.next) != null);
21             }
22         }
23         return null;
24     }

2、往 map 里插入数据

直接调用的是 put(K key, V value)：

1     public V put(K key, V value) {
2         return putVal(hash(key), key, value, false, true);
3     }

又调用了putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict)，这个方法为 final，子类无法覆盖：

 1     final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
 2                    boolean evict) {
 3         Node<K,V>[] tab; Node<K,V> p; int n, i;
 4         // 如果 table is null or is empty，调用resize()初始化table，默认大小16，负载因子0.75
 5         if ((tab = table) == null || (n = tab.length) == 0)
 6             n = (tab = resize()).length;
 7         // 根据hash找到对应位置的元素，如果这个元素为null则直接新增node；其中 (n - 1) & hash 使用了位与运算（&），比取余运算（%）更高效
 8         if ((p = tab[i = (n - 1) & hash]) == null)
 9             // 直接新建一个node并放到这个位置
10             tab[i] = newNode(hash, key, value, null);
11         // 不为null，则进一步检查
12         else {
13             Node<K,V> e; K k;
14             // 第一个元素就是我们要找的元素，最后e要么为空，要么保存了我们要找的元素
15             if (p.hash == hash &&
16                 ((k = p.key) == key || (key != null && key.equals(k))))
17                 e = p;
18             // 是一棵树，调用putTreeVal()方法
19             else if (p instanceof TreeNode)
20                 e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
21             // 是一个链表，遍历这个链表
22             else {
23                 for (int binCount = 0; ; ++binCount) {
24                     // 没有找到，则在最后新增一个node，只有在这种情况下e==null
25                     if ((e = p.next) == null) {
26                         p.next = newNode(hash, key, value, null);
27                         // 如果binCount>=TREEIFY_THRESHOLD - 1，则将链表转换为树
28                         if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
29                             treeifyBin(tab, hash);
30                         break;
31                     }
32                     // 找到元素，跳出循环
33                     if (e.hash == hash &&
34                         ((k = e.key) == key || (key != null && key.equals(k))))
35                         break;
36                     // 没有找到元素，且没有到最后一个元素，则继续遍历下一个元素p
37                     p = e;
38                 }
39             }
40             // e != null 说明我们要找的元素存在于map中，就是e这个元素
41             if (e != null) { // existing mapping for key
42                 V oldValue = e.value;
43                 // 更新对应的 value
44                 if (!onlyIfAbsent || oldValue == null)
45                     e.value = value;
46                 // 空方法，由LinkedHashMap覆盖
47                 afterNodeAccess(e);
48                 // 返回旧值，其他情况说明key原来不存在，返回null
49                 return oldValue;
50             }
51         }
52         // 改变了map的结构，自增modCount
53         ++modCount;
54         // 自增size，如果size超过了阈值，则调用resize()对容量增加一倍
55         if (++size > threshold)
56             resize();
57         // 空方法，由LinkedHashMap覆盖
58         afterNodeInsertion(evict);
59         return null;
60     }

3、HashMap使用的树是红黑树

HashMap的节点分为普通的节点 Node, 树节点 TreeNode。

Node:

 1     static class Node<K,V> implements Map.Entry<K,V> {
 2         final int hash;
 3         final K key;
 4         V value;
 5         Node<K,V> next;
 6 
 7         Node(int hash, K key, V value, Node<K,V> next) {
 8             this.hash = hash;
 9             this.key = key;
10             this.value = value;
11             this.next = next;
12         }
13         // 省略部分代码
14      }

TreeNode 相比于 Node，增加了 before, after, parent, left, right, prev 等“指针”。

 1     static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
 2         TreeNode<K,V> parent;  // red-black tree links
 3         TreeNode<K,V> left;
 4         TreeNode<K,V> right;
 5         TreeNode<K,V> prev;    // needed to unlink next upon deletion
 6         boolean red;
 7         TreeNode(int hash, K key, V val, Node<K,V> next) {
 8             super(hash, key, val, next);
 9         }
10 
11         /**
12          * Returns root of tree containing this node.
13          */
14         final TreeNode<K,V> root() {
15             for (TreeNode<K,V> r = this, p;;) {
16                 if ((p = r.parent) == null)
17                     return r;
18                 r = p;
19             }
20         }
21         // 省略部分代码
22       }

posted @ 2017-04-15 10:50 赫尔修斯阅读(237) 评论(0) 编辑收藏举报

会员力量，点亮园子希望

刷新页面返回顶部

JDK 8 - java.util.HashMap 实现机制分析

官方文档对 HashMap 的定义：

1、 根据 key 查询 map

2、往 map 里插入数据

3、HashMap使用的树是红黑树

公告

1、根据 key 查询 map