JDK源码阅读(4)：HashMap类阅读笔记

HashMap

public class HashMap<K, V> extends AbstractMap<K, V>
        implements Map<K, V>, Cloneable, Serializable {
}

1. 一些重要参数

1.1 serialVersionUID属性

// 序列化版本号
private static final long serialVersionUID = 362498820763181265L;

serialVersionUID适用于java序列化机制。简单来说，JAVA序列化的机制是通过判断类的serialVersionUID来验证的版本一致的。在进行反序列化时，JVM会把传来的字节流中的serialVersionUID于本地相应实体类的serialVersionUID进行比较。如果相同说明是一致的，可以进行反序列化，否则会出现反序列化版本一致的异常，即是InvalidCastException。

具体序列化的过程是这样的：序列化操作时会把系统当前类的serialVersionUID写入到序列化文件中。当反序列化时系统会自动检测文件中的serialVersionUID，判断它是否与当前类中的serialVersionUID一致。如果一致说明序列化文件的版本与当前类的版本是一样的，可以反序列化成功，否则就失败；

1.2 DEFAULT_INITIAL_CAPACITY属性

/**
 * The default initial capacity - MUST be a power of two.
 */
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

DEFAULT_INITIAL_CAPACITY是HashMap的默认初始容量，大小为16

1.3 DEFAULT_LOAD_FACTOR属性

/**
 * The load factor used when none specified in constructor.
 */
static final float DEFAULT_LOAD_FACTOR = 0.75f;

DEFAULT_LOAD_FACTOR是HashMap的默认负载因子大小，大小为0.75。

当元素数量 ≥ 容量*负载因子，那么HashMap需要扩容。

1.4 MAXIMUM_CAPACITY属性

/**
 * The maximum capacity, used if a higher value is implicitly specified  
 * by either of the constructors with arguments.
 * MUST be a power of two <= 1<<30.
 */
static final int MAXIMUM_CAPACITY = 1 << 30;

MAXIMUM_CAPACITY属性是HashMap的最大容量。

1.5 TREEIFY_THRESHOLD属性

/**
 * The bin count threshold for using a tree rather than list for a
 * bin.  Bins are converted to trees when adding an element to a
 * bin with at least this many nodes. The value must be greater
 * than 2 and should be at least 8 to mesh with assumptions in
 * tree removal about conversion back to plain bins upon
 * shrinkage.
 */
static final int TREEIFY_THRESHOLD = 8;

TREEIFY_THRESHOLD属性是HashMap转变为红黑树实现的阈值。当元素数量大于这个值，就会转变为红黑树。

1.6 UNTREEIFY_THRESHOLD属性

UNTREEIFY_THRESHOLD是HashMap由红黑树转变为链表的阈值。当元素数量小于这个值，就会转变为链表。

1.7 MIN_TREEIFY_CAPACITY属性

MIN_TREEIFY_CAPACITY参数表示将链表树化的哈希表最小容量。只有当整个HashMap的容量大于这个值时，具体的链表才可能发生树化。如果没有大于这个值，将进行扩容而非树化。（扩容也会减少单个链表中的元素数量）。

2. 一些重要属性

2.1 table属性

/**
 * The table, initialized on first use, and resized as
 * necessary. When allocated, length is always a power of two.
 * (We also tolerate length zero in some operations to allow
 * bootstrapping mechanics that are currently not needed.)
 */
transient Node<K, V>[] table;

table属性用来存节点。

2.2 entrySet属性

/**
 * Holds cached entrySet(). Note that AbstractMap fields are used
 * for keySet() and values().
 */
transient Set<Map.Entry<K, V>> entrySet;

entrySet属性用来缓存键值对集合。

2.3 size属性

/**
 * The number of key-value mappings contained in this map.
 */
transient int size;

size属性用来存map的键值对数目。

2.4 modCount属性

/**
 * The number of times this HashMap has been structurally modified
 * Structural modifications are those that change the number of mappings in
 * the HashMap or otherwise modify its internal structure (e.g.,
 * rehash).  This field is used to make iterators on Collection-views of
 * the HashMap fail-fast.  (See ConcurrentModificationException).
 */
transient int modCount;

modCount属性记录了map结构被修改的次数，用于在迭代器中的快速失败。

2.5 threshold属性

/**
 * The next size value at which to resize (capacity * load factor).
 *
 * @serial
 */
// (The javadoc description is true upon serialization.
// Additionally, if the table array has not been allocated, this
// field holds the initial array capacity, or zero signifying
// DEFAULT_INITIAL_CAPACITY.)
int threshold;

threshold属性记录了下一个需要resize的size大小

2.6 loadFactor属性

/**
 * The load factor for the hash table.
 *
 * @serial
 */
final float loadFactor;

loadFactor属性记录了当前map的负载因子

2.7 Node内部类

static class Node<K, V> implements Map.Entry<K, V> {
    final int hash;
    final K key;
    V value;
    Node<K, V> next;

    Node(int hash, K key, V value, Node<K, V> next) {
        this.hash = hash;
        this.key = key;
        this.value = value;
        this.next = next;
    }

    public final K getKey() {
        return key;
    }

    public final V getValue() {
        return value;
    }

    public final String toString() {
        return key + "=" + value;
    }

    public final int hashCode() {
        return Objects.hashCode(key) ^ Objects.hashCode(value);
    }

    public final V setValue(V newValue) {
        V oldValue = value;
        value = newValue;
        return oldValue;
    }

    public final boolean equals(Object o) {
        if (o == this)
            return true;
        if (o instanceof Map.Entry) {
            Map.Entry<?, ?> e = (Map.Entry<?, ?>) o;
            if (Objects.equals(key, e.getKey()) &&
                    Objects.equals(value, e.getValue()))
                return true;
        }
        return false;
    }
}

Node内部类实则表达了一个键值对，并包含了哈希值和next指针。

2.8 EntrySet内部类

final class EntrySet extends AbstractSet<Map.Entry<K,V>> {
    public final int size()                 { return size; }
    public final void clear()               { HashMap.this.clear(); }
    public final Iterator<Map.Entry<K,V>> iterator() {
        return new EntryIterator();
    }
    public final boolean contains(Object o) {
        if (!(o instanceof Map.Entry))
            return false;
        Map.Entry<?,?> e = (Map.Entry<?,?>) o;
        Object key = e.getKey();
        Node<K,V> candidate = getNode(hash(key), key);
        return candidate != null && candidate.equals(e);
    }
    public final boolean remove(Object o) {
        if (o instanceof Map.Entry) {
            Map.Entry<?,?> e = (Map.Entry<?,?>) o;
            Object key = e.getKey();
            Object value = e.getValue();
            return removeNode(hash(key), key, value, true, true) != null;
        }
        return false;
    }
    public final Spliterator<Map.Entry<K,V>> spliterator() {
        return new EntrySpliterator<>(HashMap.this, 0, -1, 0, 0);
    }
    public final void forEach(Consumer<? super Map.Entry<K,V>> action) {
        Node<K,V>[] tab;
        if (action == null)
            throw new NullPointerException();
        if (size > 0 && (tab = table) != null) {
            int mc = modCount;
            for (Node<K,V> e : tab) {
                for (; e != null; e = e.next)
                    action.accept(e);
            }
            if (modCount != mc)
                throw new ConcurrentModificationException();
        }
    }
}

EntrySet内部类用来表示当前HashMap的键集合。

3. 一些工具方法

3.1 hash方法

static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

这个hash()方法的(h = key.hashCode()) ^ (h >>> 16)过程值得研究。

首先，h >>> 16是将哈希值向右无符号移动16位，也就相当于取了原值的高16位。

0000 0100 1011 0011 1101 1111 1110 0001 >>> 16
得到：
0000 0000 0000 0000 0000 0100 1011 0011

然后，我们再来看一个JDK1.7中的indexFor()方法。在JDK8以后，源码中也大量出现tab[(n - 1) & hash]的形式，实际上也是一样的。

static int indexFor(int h, int length) {
    return h & (length-1);
}

indexFor()方法的返回值就是table数组中我们想要找到的数组下标。但由于绝大多数情况下length一般都小于2^16即小于65536。所以return h & (length-1)结果始终是h的低16位与（length-1）进行&运算。

所以很显然的是，我们每次在计算下标的时候，都几乎只能用到低位。如果我们想一个办法，把高位也利用起来，那么就可以增加散列的程度。所以hash()方法中选择了与自身的高十六位（h >>> 16）进行异或，来利用到高位。

3.2 tableSizeFor方法

/**
 * Returns a power of two size for the given target capacity.
 */
static final int tableSizeFor(int cap) {
    int n = -1 >>> Integer.numberOfLeadingZeros(cap - 1);
    return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}

tableSizeFor方法用于计算参数cap对应的resize阈值。往往出现为下面的语句。

threshold = tableSizeFor(size);

3.3 resize方法

final Node<K, V>[] resize() {
    // 拿到旧的数组、容量、扩容阈值
    Node<K, V>[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    if (oldCap > 0) {// 如果旧容量不为0
        if (oldCap >= MAXIMUM_CAPACITY) {// 如果旧容量已经达到了最大容量
            // 令扩容阈值为一个不可能达到的最大值
            threshold = Integer.MAX_VALUE;
            // 直接返回旧数组
            return oldTab;
        } else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY //如果旧容量扩大两倍后仍没达到最大容量，设其为新容量
                && oldCap >= DEFAULT_INITIAL_CAPACITY)// 并且旧容量大于等于初始容量
            // 令新的扩容阈值扩大为两倍
            newThr = oldThr << 1;
    } else if (oldThr > 0) // 如果旧容量为0，且旧扩容阈值大于0
        // 那么就让新容量变为旧扩容阈值
        newCap = oldThr;
    else { // 旧容量和旧扩容阈值都为0
        // 令新容量为默认初始容量，并计算扩容阈值
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int) (DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }

    if (newThr == 0) {// 如果新扩容阈值为0
        // 计算当前的容量*负载因子
        float ft = (float) newCap * loadFactor;
        // 设置新扩容阈值为ft或Integer.MAX_VALUE
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float) MAXIMUM_CAPACITY ?
                (int) ft : Integer.MAX_VALUE);
    }
    // 更新扩容阈值
    threshold = newThr;
    // 创建一个新的table，大小为先前计算出来的新容量
    @SuppressWarnings({"rawtypes", "unchecked"})
    Node<K, V>[] newTab = (Node<K, V>[]) new Node[newCap];
    // 更新table指针
    table = newTab;

    if (oldTab != null) {// 如果旧数组不为null
        for (int j = 0; j < oldCap; ++j) {// 遍历之
            Node<K, V> e;
            if ((e = oldTab[j]) != null) {// 如果j处的旧数组值不为null，存进e
                // 将旧数组里的值设为null
                oldTab[j] = null;
                if (e.next == null)// 如果e.next为null
                    // 在新数组中找到并赋值
                    newTab[e.hash & (newCap - 1)] = e;
                else if (e instanceof TreeNode)// 如果e这里是一棵树，调用split分割树，并带到新数组中
                    ((TreeNode<K, V>) e).split(this, newTab, j, oldCap);
                else { // e这里是链表，且e.next不为null
                    // 下面要进行一个链表分割操作，将原链表分为lo串和hi串两串。
                    Node<K, V> loHead = null, loTail = null;
                    Node<K, V> hiHead = null, hiTail = null;
                    Node<K, V> next;
                    do {
                        // 保存next
                        next = e.next;
                        /* 通过e.hash & oldCap是0还是1，可以判断e应该在原串上还是在新串上。
                           hashMap扩容前后容量大小都是2的幂，假如原容量size二进制形式是1000,新容量二进制形式是10000
                           又由于数组下标位置的计算公式是e.hash & (size-1)，那么原来是hash & 0111，后来是hash & 01111。
                           也就实际上是多用了hash的高一位用来确定位置。那么hash&size就可以知道hash值中高的这一位是0还是1
                           如果高的一位是0，那么新位置和原位置一样，不变。如果高的一位是1，那么就要移动位置。
                         */
                        if ((e.hash & oldCap) == 0) {// 如果位置不用改变
                            // 将e加入到lo串中
                            if (loTail == null)
                                loHead = e;
                            else
                                loTail.next = e;
                            loTail = e;
                        } else {// 如果位置需要改变
                            // 将e加入到hi串中
                            if (hiTail == null)
                                hiHead = e;
                            else
                                hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);// 如果e赋为next后不为null
                    // 将lo串放到下标为j的位置处，即不变的位置
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    // 将hi串放到下标为j+oldCap处，因为新位置的hash中高一位是1，那下标就要加一个oldCap
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}

先尝试用数学方法确定新容量和新扩容阈值。

随后遍历数组，并深入遍历其中的链表/红黑树。如果是链表，要将他分裂为lo链表和hi链表。

3.4 treeifyBin方法

final void treeifyBin(Node<K, V>[] tab, int hash) {
    int n, index;
    Node<K, V> e;
    // 如果数组很空或很小，那么就resize()
    if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
        resize();
    else if ((e = tab[index = (n - 1) & hash]) != null) {// hash对应的位置在数组中不为null，就可以进入这个循环
        TreeNode<K, V> hd = null, tl = null;
        do {
            // 将节点e转变为一个树节点存到变量p
            TreeNode<K, V> p = replacementTreeNode(e, null);
            if (tl == null)
                hd = p;
            else {
                p.prev = tl;
                tl.next = p;
            }
            tl = p;
        } while ((e = e.next) != null);
        if ((tab[index] = hd) != null)
            // 转变为红黑树
            hd.treeify(tab);
    }
}

treeifyBin方法将符合要求的链表转变为一个红黑树。

4. 一些业务方法

4.1 get方法

public V get(Object key) {
    Node<K, V> e;
    return (e = getNode(hash(key), key)) == null ? null : e.value;
}

final Node<K, V> getNode(int hash, Object key) {
    Node<K, V>[] tab;
    Node<K, V> first, e;
    int n;
    K k;
    if ((tab = table) != null // table不为空
        && (n = tab.length) > 0 // table长度不为0
        && (first = tab[(n - 1) & hash]) != null) {// 在算出来的位置拿到的Node不为null，这是链表头
        if (first.hash == hash // 检查first的哈希值，假如正确且key也符合
            && ((k = first.key) == key || (key != null && key.equals(k))))
            return first;// 直接返回first
        if ((e = first.next) != null) {// 如果first链表长度不止为1
            if (first instanceof TreeNode) {// 如果table当前位置延伸出的是一个红黑树
                return ((TreeNode<K, V>) first).getTreeNode(hash, key);// 调用getTreeNode方法
            }
            // 不是红黑树，是链表，遍历查找即可
            do {
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    return e;
            } while ((e = e.next) != null);
        }
    }
    return null;
}

详见上面的注释。用哈希值计算出数组中的位置，再沿着链表或红黑树去找。

4.2 put方法

public V put(K key, V value) {
    // 调用putVal
    return putVal(hash(key), key, value, false, true);
}

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K, V>[] tab;
    Node<K, V> p;
    int n, i;
    if ((tab = table) == null || (n = tab.length) == 0)
        // 如果table为null或者为空，先扩容，再把长度赋值为n
        n = (tab = resize()).length;

    if ((p = tab[i = (n - 1) & hash]) == null)
        // 如果找到的指定位置为空，那么在这里用新键值对调用newNode
        tab[i] = newNode(hash, key, value, null);
    else {
        // 如果找到的位置不为空
        Node<K, V> e;
        K k;
        if (p.hash == hash // 如果哈希值符合
                && ((k = p.key) == key || (key != null && key.equals(k))))// 并且key也吻合
            // 将当前节点p赋给e
            e = p;
        else if (p instanceof TreeNode)// 如果当前p是一个树节点，调用putTreeVal方法
            e = ((TreeNode<K, V>) p).putTreeVal(this, tab, hash, key, value);
        else {// 如果当前是一个链表节点
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {// 令e=p.next，如果它为null
                    // 令p.next为用新键值对创建的newNode，就是将新节点插在了链表尾
                    p.next = newNode(hash, key, value, null);
                    // 如果长度到达了转变为红黑树的最小阈值，那么就转变为红黑树
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                // 如果找到了
                if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                // 令p为e，for循环一开始的时候e=p.next，所以相当于p后移了
                p = e;
            }
        }
        if (e != null) { // 如果key已经存在，更新它并返回oldValue
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    // 检查是否需要扩容
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}

详见上面的注释。用哈希值计算出数组中的位置，再在链表或红黑树中搜索，做更新操作或newNode操作。

4.3 remove方法

public V remove(Object key) {
    Node<K, V> e;
    return (e = removeNode(hash(key), key, null, false, true)) == null ?
        null : e.value;
}

final Node<K, V> removeNode(int hash, Object key, Object value,
                            boolean matchValue, boolean movable) {
    Node<K, V>[] tab;
    Node<K, V> p;
    int n, index;
    if ((tab = table) != null && (n = tab.length) > 0 &&
            (p = tab[index = (n - 1) & hash]) != null) {// 检查对应数组位置是否存在链表/树
        Node<K, V> node = null, e;
        K k;
        V v;
        if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
            // 如果链表头就是，赋值给node
            node = p;
        else if ((e = p.next) != null) {
            if (p instanceof TreeNode)
                // 如果是树，就调用对应的方法找到，赋值给node
                node = ((TreeNode<K, V>) p).getTreeNode(hash, key);
            else {
                // 是链表，就遍历查找
                do {
                    if (e.hash == hash &&
                            ((k = e.key) == key ||
                                    (key != null && key.equals(k)))) {
                        // 找到了，将当前节点赋值给node
                        node = e;
                        // 退出循环，这导致了当找到后，下面的p=e没有执行
                        break;
                    }
                    // 将当前节点赋值给p
                    p = e;
                } while ((e = e.next) != null);
            }
        }
        if (node != null && (!matchValue || (v = node.value) == value ||
                (value != null && value.equals(v)))) {
            if (node instanceof TreeNode)
                // 如果是树，用对应方法删除
                ((TreeNode<K, V>) node).removeTreeNode(this, tab, movable);
            else if (node == p)
                // 如果要移除的是链表头，那么根据上面do-while块中的逻辑，node==p，那么我们直接让链表从node.next开始
                tab[index] = node.next;
            else
                // 如果要移除的不是链表头，那么根据上面do-while块中的逻辑node!=p，之前是p->node，删除node
                p.next = node.next;
            // 处理modCount和size属性
            ++modCount;
            --size;
            afterNodeRemoval(node);
            // 返回remove掉的节点node
            return node;
        }
    }
    return null;
}

找->删，非常清晰的逻辑

4.4 clone方法

@Override
public Object clone() {
    HashMap<K,V> result;
    try {
        result = (HashMap<K,V>)super.clone();
    } catch (CloneNotSupportedException e) {
        // this shouldn't happen, since we are Cloneable
        throw new InternalError(e);
    }
    result.reinitialize();
    result.putMapEntries(this, false);
    return result;
}

clone()方法对HashMap做了一个浅拷贝。

5. 一些来自JDK8的方法

下面的方法来自于重写JDK8中的Map接口中的方法。

一些函数式的方法在此不做解析，如compute()和merge()等。

5.1 getOrDefault / putIfAbsent

@Override
public V getOrDefault(Object key, V defaultValue) {
    Node<K,V> e;
    return (e = getNode(hash(key), key)) == null ? defaultValue : e.value;
}

@Override
public V putIfAbsent(K key, V value) {
    return putVal(hash(key), key, value, true, true);
}

很好理解，获得实际值或默认值/不存在时插入。

5.2 remove / replace

@Override
public boolean remove(Object key, Object value) {
    return removeNode(hash(key), key, value, true, true) != null;
}

@Override
public boolean replace(K key, V oldValue, V newValue) {
    Node<K,V> e; V v;
    if ((e = getNode(hash(key), key)) != null &&
            ((v = e.value) == oldValue || (v != null && v.equals(oldValue)))) {
        e.value = newValue;
        afterNodeAccess(e);
        return true;
    }
    return false;
}

@Override
public V replace(K key, V value) {
    Node<K,V> e;
    if ((e = getNode(hash(key), key)) != null) {
        V oldValue = e.value;
        e.value = value;
        afterNodeAccess(e);
        return oldValue;
    }
    return null;
}

当对应键值对存在时，进行删除\替换操作。

posted @ 2021-10-10 15:57 pedro7 阅读(174) 评论(0) 编辑收藏举报

刷新页面返回顶部

pedro7