Java 8 Source Code Analysis: HashMap

1 HashMap

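Underneath, HashMap is an array of buckets indexed by hash value, where each bucket holds a linked list or (since 1.8) a red-black tree. When a key-value pair is passed to put, the map calls the key's hashCode() method, mixes the result in hash(), and uses (n - 1) & hash as the array index. That index locates the bucket that stores the value. When the bucket is already occupied, the linked list is what resolves the hash collision. When the list in a bucket grows beyond 8 nodes (TREEIFY_THRESHOLD), it is converted to a red-black tree, provided the table has at least 64 slots (MIN_TREEIFY_CAPACITY); with a smaller table the map resizes instead.
Time complexity: insert, lookup, update, and delete are all O(1) on average, which is why the map is fast; a bucket with many collisions costs O(log n) once it has been turned into a tree.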

1.1 Constructors

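The default load factor is 0.75. If you can estimate in advance the minimum number of entries the program will hold, use HashMap(int initialCapacity, float loadFactor) to set the initial capacity up front.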

    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // 16
    static final float DEFAULT_LOAD_FACTOR = 0.75f;
    public HashMap() {
        // use the default load factor of 0.75
        this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
    }
    /**
     * @param  initialCapacity the initial capacity (16 when the no-arg constructor is used)
     * @param  loadFactor      the load factor, i.e. the size/capacity ratio; the default
     *                         constructor uses 0.75f, so the entry-count threshold is 75% of
     *                         the capacity (16 * 0.75 = 12 for the default table)
     */
     public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);
        this.loadFactor = loadFactor;
        this.threshold = tableSizeFor(initialCapacity);
    }
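
To make the two-argument constructor more concrete, here is a minimal sketch of presizing a map so that a known number of entries fits below the resize threshold. The class PresizeDemo and the helpers capacityFor and withExpectedSize are names invented for this illustration; only HashMap(int, float) comes from the JDK.

```java
import java.util.HashMap;
import java.util.Map;

class PresizeDemo {
    // Hypothetical helper: smallest initialCapacity for which
    // `expected` entries stay at or below initialCapacity * loadFactor.
    static int capacityFor(int expected, float loadFactor) {
        return (int) Math.ceil(expected / (double) loadFactor);
    }

    // Hypothetical helper: build a map that is sized for `expected` entries.
    static <K, V> Map<K, V> withExpectedSize(int expected, float loadFactor) {
        return new HashMap<>(capacityFor(expected, loadFactor), loadFactor);
    }

    public static void main(String[] args) {
        Map<String, Integer> m = withExpectedSize(100, 0.75f);
        for (int i = 0; i < 100; i++) {
            m.put("k" + i, i);
        }
        System.out.println(capacityFor(100, 0.75f) + " requested, size " + m.size());
    }
}
```

The requested capacity is only a hint: the constructor rounds it up to a power of two through tableSizeFor.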

1.2 hash

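    static final int hash(Object key) {
        int h;
        // >>> 16 is an unsigned right shift; XOR-ing the high 16 bits into the low 16 bits
        // lets the high bits influence the bucket index and improves the spread
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }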
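
The effect of the extra XOR is easiest to see with two hash codes that differ only in their high bits. The sketch below copies the mixing step into a helper; the class HashSpreadDemo and the methods spread and indexFor are illustrative names, not JDK API.

```java
class HashSpreadDemo {
    // Same bit mixing as HashMap.hash(), applied to a raw hash code.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Bucket index for a table of length n (n is assumed to be a power of two).
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int a = 0x10000; // bit 16 set
        int b = 0x20000; // bit 17 set
        // Without mixing, both land in bucket 0 of a 16-slot table.
        System.out.println(indexFor(a, 16) + " " + indexFor(b, 16));
        // With mixing, the high bits reach the index: buckets 1 and 2.
        System.out.println(indexFor(spread(a), 16) + " " + indexFor(spread(b), 16));
    }
}
```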

1.3 Resizing

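Computing the capacity. A HashMap's capacity is always a power of two, so tableSizeFor finds the smallest power of two that is greater than or equal to initialCapacity.
https://blog.csdn.net/fan2012huan/article/details/51097331
It is called from the constructor.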

    /** (this algorithm seems quite fast)
     * Returns a power of two size for the given target capacity.
     */
    static final int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

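Analysis: how does this method find the smallest power of two that is at least cap?  
n |= n >>> 1; shifts n right by 1 and ORs the result back into n (the shift is unsigned: the sign bit is ignored and vacated bits are filled with 0).  
For example, cap = 10, so n = cap - 1 = 9.  
That is 1001; shifted right by 1 it becomes 0100, and then the OR is applied  
1001  
0100  
OR ----  
1101  
Shifted right by 2 it becomes 0011  
1101  
0011  
OR ----  
1111  
The remaining shifts change nothing, so the method returns n + 1, i.e. 16, the smallest power of two that is at least 10. Starting from cap - 1 is what keeps an exact power of two unchanged: cap = 16 gives n = 15 = 1111 and the result is again 16.  
Another example, where * stands for an arbitrary bit  
n = 1000 ****  
Shift right by 1 to get 0100 0***, then OR  
1000 ****  
0100 0***  
OR ----  
1100 ****  
Shift right by 2 to get 0011 00**  
1100 ****  
0011 00**  
OR ----  
1111 ****  
Shift right by 4 to get 0000 1111  
1111 ****  
0000 1111  
OR ----  
1111 1111  
Shift right by 8 to get 0000 0000  
After the OR the value is still 1111 1111  
The method then returns n + 1, i.e. 256  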

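Why does the shifting stop at 16? An int has 32 bits, and the shifts 1 + 2 + 4 + 8 + 16 = 31 are enough to copy the highest set bit into every lower position, whether the value is small like 1010 or as large as 1000...0 (a 1 followed by 31 zeros).  
There is another approach: first check whether the number is already a power of two by scanning the bits after the leading 1.
If it is not, shift the leading 1 left by one position and set all the bits after it to 0. That approach needs a loop, so it is less efficient than the JDK method.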
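
To check the walk-through above, the following sketch copies the quoted tableSizeFor into a small class and compares it with a loop-based version. The loop here is a simple doubling loop that stands in for the scan-based alternative; TableSizeDemo and tableSizeForLoop are names made up for this example.

```java
class TableSizeDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Copy of the JDK 8 method quoted above.
    static int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    // Loop-based stand-in: keep doubling until the value is large enough.
    static int tableSizeForLoop(int cap) {
        int n = 1;
        while (n < cap && n < MAXIMUM_CAPACITY) {
            n <<= 1;
        }
        return n;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, 10, 16, 17, 1000};
        for (int cap : samples) {
            System.out.println(cap + " -> " + tableSizeFor(cap)
                    + " (loop: " + tableSizeForLoop(cap) + ")");
        }
    }
}
```

Both versions agree on these inputs; the bit-smearing version simply gets there in a fixed number of steps.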

1.3.1 Resize process

1. Create a new array whose size is a power of two (twice the old capacity).
2. Rehash the existing entries into the new array.
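
Because the capacity doubles, step 2 does not need a full recomputation: an entry either keeps its index or moves up by exactly the old capacity, depending on one bit of its hash. JDK 8's resize() relies on the same observation when it splits a bucket. The sketch below only illustrates the index arithmetic; ResizeSplitDemo and its methods are illustrative names.

```java
class ResizeSplitDemo {
    // Bucket index in a table of length n (power of two).
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    // Index after the table grows from oldCap to 2 * oldCap:
    // unchanged if the oldCap bit of the hash is 0, otherwise oldIndex + oldCap.
    static int indexAfterDoubling(int hash, int oldCap) {
        int oldIndex = indexFor(hash, oldCap);
        return (hash & oldCap) == 0 ? oldIndex : oldIndex + oldCap;
    }

    public static void main(String[] args) {
        int oldCap = 16;
        int[] hashes = {5, 21, 37, 53}; // all share bucket 5 in a 16-slot table
        for (int h : hashes) {
            System.out.println(h + ": " + indexFor(h, oldCap)
                    + " -> " + indexAfterDoubling(h, oldCap));
        }
    }
}
```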

1.4 Internals: an array of Node, singly linked lists, and red-black trees

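1. The range of hash indexes is covered by a Node<K,V>[] tab array.
2. After a hash collision, a linked list is built at that array slot; if the list grows too long, it is turned into a tree of TreeNode.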

    transient Node<K,V>[] table;
    static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;
        final K key;
        V value;
        Node<K,V> next; // null on first insertion; after a hash collision it links to the next node of the bucket's list

        Node(int hash, K key, V value, Node<K,V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }

        public final K getKey()        { return key; }
        public final V getValue()      { return value; }
        public final String toString() { return key + "=" + value; }

        public final int hashCode() {
            return Objects.hashCode(key) ^ Objects.hashCode(value);
        }

        public final V setValue(V newValue) {
            V oldValue = value;
            value = newValue;
            return oldValue;
        }

        public final boolean equals(Object o) {
            if (o == this)
                return true;
            if (o instanceof Map.Entry) {
                Map.Entry<?,?> e = (Map.Entry<?,?>)o;
                if (Objects.equals(key, e.getKey()) &&
                    Objects.equals(value, e.getValue()))
                    return true;
            }
            return false;
        }
    }

1.5 Insert: the put method

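On the first put the table is allocated, with a default capacity of 16; every later resize doubles the capacity, so it is always a power of two.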

    /**
     * Implements Map.put and related methods
     *
     * @param hash hash for key
     * @param key the key
     * @param value the value to put
     * @param onlyIfAbsent if true, don't change existing value
     * @param evict if false, the table is in creation mode.
     * @return previous value, or null if none
     */
    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
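            n = (tab = resize()).length; // first put: allocate the table (16 slots by default)
        // Debugging shows that i = (n - 1) & hash came out as 1, then 2, then 3 for the first puts. hash is derived from the key's hashCode() and does not depend on n; only the index i does, so entries have to be re-indexed after a resize
        // In that run n - 1 = 15 and the debugger showed 47665 47665 47665; Object.hashCode() itself is a native method (implemented in C++)
        if ((p = tab[i = (n - 1) & hash]) == null) // nothing at this index yet: insert directly
            tab[i] = newNode(hash, key, value, null);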
        else {
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
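                e = p; // same key as the first node: its value is replaced below
            else if (p instanceof TreeNode)
                // the bucket holds a tree node: insert into the tree. The tree structure is created by treeifyBin(tab, hash) below
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else { // the bucket is still a linked list (no more than 8 nodes)
                for (int binCount = 0; ; ++binCount) { // walk the list
                    if ((e = p.next) == null) { // reached the tail: append
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            // if the list now holds more than TREEIFY_THRESHOLD nodes, convert it to a tree
                            // (this check only runs at the tail of the list); later inserts into this bucket then take the (p instanceof TreeNode) branch above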
                            treeifyBin(tab, hash);
                        break;
                    }
                    // a node with the same hash and an equal key was found while walking the list: stop here; e.value = value is assigned below
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e; // advance p to the next node
                }
            }
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                // overwrite when onlyIfAbsent is false, or when onlyIfAbsent is true but the existing value is null
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        if (++size > threshold)
            resize(); // grow the table
        afterNodeInsertion(evict);
        return null;
    }
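
The return value and the onlyIfAbsent flag are easiest to observe from the public API: put calls putVal with onlyIfAbsent = false and putIfAbsent calls it with true. The following is a small sketch of that behaviour; PutSemanticsDemo and run are names chosen for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PutSemanticsDemo {
    // Performs a few inserts on m and collects what each call returned.
    static List<Integer> run(Map<String, Integer> m) {
        List<Integer> returned = new ArrayList<>();
        returned.add(m.put("a", 1));         // no previous mapping: returns null
        returned.add(m.put("a", 2));         // overwrites and returns the old value 1
        returned.add(m.putIfAbsent("a", 3)); // key present: keeps 2 and returns 2
        m.put("b", null);
        returned.add(m.putIfAbsent("b", 5)); // existing value is null: 5 is stored, null returned
        return returned;
    }

    public static void main(String[] args) {
        Map<String, Integer> m = new HashMap<>();
        System.out.println(run(m));
        System.out.println(m.get("a") + " " + m.get("b"));
    }
}
```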

1.6 Lookup: the get method

    public V get(Object key) {
        Node<K,V> e;
        // hash(key) calls the key's hashCode() internally
        return (e = getNode(hash(key), key)) == null ? null : e.value;
    }
    // easy to follow once the map's internal data structure is known
    final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (first = tab[(n - 1) & hash]) != null) { // the bucket for this hash is not empty
            if (first.hash == hash && // always check first node
                ((k = first.key) == key || (key != null && key.equals(k)))) // does the first node in the bucket match the key?
                return first; // yes: return it
            if ((e = first.next) != null) { // no: the bucket is either 1. a tree or 2. a linked list
                if (first instanceof TreeNode)
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key); // tree lookup (to be studied further)
                do { // walk the linked list and return the first node that matches
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        return null;
    }

1.7 Delete: the remove method

    final Node<K,V> removeNode(int hash, Object key, Object value,
                               boolean matchValue, boolean movable) {
        Node<K,V>[] tab; Node<K,V> p; int n, index;
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (p = tab[index = (n - 1) & hash]) != null) {
            // step 1: locate the node
            Node<K,V> node = null, e; K k; V v;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                node = p; // the common case: the first node in the bucket has the key to remove
            else if ((e = p.next) != null) { // the bucket holds more than one node: search the tree or the list
                if (p instanceof TreeNode)
                    node = ((TreeNode<K,V>)p).getTreeNode(hash, key); // search the tree for the node (to be studied further)
                else {
                    do { // walk the list until the matching node is found
                        if (e.hash == hash &&
                            ((k = e.key) == key ||
                             (key != null && key.equals(k)))) {
                            node = e;
                            break;
                        }
                        p = e;
                    } while ((e = e.next) != null);
                }
            }
            // the node was found; matchValue is false by default, so the value does not have to match
            if (node != null && (!matchValue || (v = node.value) == value ||
                                 (value != null && value.equals(v)))) {
                if (node instanceof TreeNode)
                    ((TreeNode<K,V>)node).removeTreeNode(this, tab, movable); // remove from the tree (still to be read in detail)
                else if (node == p)
                    tab[index] = node.next; // the common case: the node is the first in the bucket, so the slot now points to node.next, which may be null or the next node
                else
                    p.next = node.next; // inside a list: unlink node
                ++modCount;
                --size;
                afterNodeRemoval(node); // afterNodeRemoval is an empty hook in HashMap, meant for subclasses to override as needed
                return node;
                return node;
            }
        }
        return null;
    }
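
The matchValue flag corresponds to the two public remove overloads: remove(key) passes matchValue = false, while remove(key, value) passes true. A short sketch of the difference, with RemoveDemo and sample as illustrative names:

```java
import java.util.HashMap;
import java.util.Map;

class RemoveDemo {
    // Small fixture map used by the demo.
    static Map<String, Integer> sample() {
        Map<String, Integer> m = new HashMap<>();
        m.put("a", 1);
        m.put("b", 2);
        return m;
    }

    public static void main(String[] args) {
        Map<String, Integer> m = sample();
        System.out.println(m.remove("a", 99));   // value differs: nothing removed, false
        System.out.println(m.remove("a", 1));    // key and value match: removed, true
        System.out.println(m.remove("b"));       // value not compared: returns old value 2
        System.out.println(m.remove("missing")); // no such key: null
    }
}
```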

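1.8 containsKey: checking whether a key is present

The key's hash code locates the array slot directly, so the check is O(1) on average and fast. It delegates to getNode, the same lookup that get uses.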

    public boolean containsKey(Object key) {
        return getNode(hash(key), key) != null;
    }

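1.9 containsValue

One question: if the bucket is a tree rather than a list, can it also be traversed with next? It can: TreeNode extends LinkedHashMap.Entry, which extends HashMap.Node, and a bucket that has been turned into a tree still maintains its next links, so the same loop reaches every node.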

    public boolean containsValue(Object value) {
        Node<K,V>[] tab; V v;
        if ((tab = table) != null && size > 0) {
            for (int i = 0; i < tab.length; ++i) { // walk the tab array
                for (Node<K,V> e = tab[i]; e != null; e = e.next) { // walk the nodes in this slot through next (tree bins keep next links as well)
                    if ((v = e.value) == value ||
                        (value != null && value.equals(v)))
                        return true;
                }
            }
        }
        return false;
    }
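
One way to convince yourself that the next-based loop also covers crowded buckets is to force every key into the same bucket and then search by value. The sketch below does that with a key class whose hashCode is constant. CollidingKey and ContainsValueDemo are invented for this example, and whether the bucket really becomes a tree is an internal detail that the code does not observe.

```java
import java.util.HashMap;
import java.util.Map;

// Key type whose instances all share one hash code, so they collide on purpose.
class CollidingKey {
    final int id;

    CollidingKey(int id) {
        this.id = id;
    }

    @Override
    public int hashCode() {
        return 42;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof CollidingKey && ((CollidingKey) o).id == id;
    }
}

class ContainsValueDemo {
    // Builds a map with n entries that all live in the same bucket.
    static Map<CollidingKey, String> build(int n) {
        Map<CollidingKey, String> m = new HashMap<>();
        for (int i = 0; i < n; i++) {
            m.put(new CollidingKey(i), "v" + i);
        }
        return m;
    }

    public static void main(String[] args) {
        Map<CollidingKey, String> m = build(50);
        System.out.println(m.containsValue("v49"));  // the last value is still found
        System.out.println(m.containsValue("nope")); // an absent value is not
    }
}
```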

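1.10 KeySet

This one is odd: it is not clear where the set gets its contents assigned. (keySet() returns a view backed by the map itself, so nothing is ever copied into it.)

1.11 values

Same as above.

1.12 java.util.Map.Entry

interface Entry<K,V>
Same as above.

1.13 entrySet

Same as above.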

    public Set<Map.Entry<K,V>> entrySet() {
        Set<Map.Entry<K,V>> es;
        return (es = entrySet) == null ? (entrySet = new EntrySet()) : es;
    }
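
The view behaviour behind keySet, values, and entrySet can be observed directly: the set is obtained once, and later changes to the map show up in it, while changes made through the view write back into the map. This is a minimal sketch; ViewDemo and run are names made up for the illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

class ViewDemo {
    // Returns a short summary of what the views observed.
    static String run() {
        Map<String, Integer> m = new HashMap<>();
        Set<String> keys = m.keySet(); // taken while the map is still empty
        m.put("a", 1);
        m.put("b", 2);
        boolean seesLaterPut = keys.contains("a"); // the view reflects the later put
        keys.remove("a");                          // removing through the view...
        boolean removedFromMap = !m.containsKey("a"); // ...removes from the map
        for (Map.Entry<String, Integer> e : m.entrySet()) {
            e.setValue(e.getValue() * 10);         // setValue writes through to the map
        }
        return seesLaterPut + " " + removedFromMap + " " + m.get("b");
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```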

1.14 clear

    public void clear() {
        Node<K,V>[] tab;
        modCount++;
        if ((tab = table) != null && size > 0) {
            size = 0;
            for (int i = 0; i < tab.length; ++i)
                tab[i] = null;
        }
    }

1.15 Java 8 functional-interface methods: computeIfAbsent
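
A typical use of computeIfAbsent is building a map of lists, where the list is created only the first time a key is seen. The sketch below is one such usage example; ComputeIfAbsentDemo and groupByLength are illustrative names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ComputeIfAbsentDemo {
    // Groups words by their length; the list for a length is created on first use.
    static Map<Integer, List<String>> groupByLength(List<String> words) {
        Map<Integer, List<String>> byLength = new HashMap<>();
        for (String w : words) {
            byLength.computeIfAbsent(w.length(), k -> new ArrayList<>()).add(w);
        }
        return byLength;
    }

    public static void main(String[] args) {
        System.out.println(groupByLength(Arrays.asList("a", "bb", "cc", "ddd")));
    }
}
```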

1.16 Java 8 functional-interface methods: computeIfPresent
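
computeIfPresent only touches keys that already have a non-null value, and it removes the mapping when the function returns null. A small usage sketch, with ComputeIfPresentDemo and takeOne as made-up names:

```java
import java.util.HashMap;
import java.util.Map;

class ComputeIfPresentDemo {
    // Decrements the stock of an item and drops the entry when it reaches zero.
    // Unknown items are left alone: no entry is created for them.
    static void takeOne(Map<String, Integer> stock, String item) {
        stock.computeIfPresent(item, (k, v) -> v > 1 ? v - 1 : null);
    }

    public static void main(String[] args) {
        Map<String, Integer> stock = new HashMap<>();
        stock.put("apple", 2);
        takeOne(stock, "apple");
        System.out.println(stock); // apple is down to 1
        takeOne(stock, "apple");
        takeOne(stock, "pear");
        System.out.println(stock); // apple was removed, pear was never added
    }
}
```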

1.17 Java 8 functional-interface methods: compute
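
compute runs its function whether or not the key is present; for an absent key the function receives null as the old value. A counter is a compact usage example (ComputeDemo and count are illustrative names):

```java
import java.util.HashMap;
import java.util.Map;

class ComputeDemo {
    // Increments the counter for key, starting from 1 when the key is new.
    static void count(Map<String, Integer> counts, String key) {
        counts.compute(key, (k, v) -> v == null ? 1 : v + 1);
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        count(counts, "x");
        count(counts, "x");
        count(counts, "y");
        System.out.println(counts.get("x") + " " + counts.get("y"));
    }
}
```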

1.18 Java 8 functional-interface methods: merge
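
merge stores the given value when the key is absent and otherwise combines the old and new values with the supplied function, which makes word counting a one-liner. The following is a usage sketch; MergeDemo and wordCount are names chosen for this example.

```java
import java.util.HashMap;
import java.util.Map;

class MergeDemo {
    // Counts how often each whitespace-separated word occurs in text.
    static Map<String, Integer> wordCount(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : text.split("\\s+")) {
            if (!w.isEmpty()) {
                counts.merge(w, 1, Integer::sum); // absent: store 1; present: old + 1
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(wordCount("to be or not to be"));
    }
}
```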

1.19 Java 8 functional-interface methods: forEach

    @Override
    public void forEach(BiConsumer<? super K, ? super V> action) {
        Node<K,V>[] tab;
        if (action == null)
            throw new NullPointerException();
        if (size > 0 && (tab = table) != null) {
            int mc = modCount;
            for (int i = 0; i < tab.length; ++i) {
                for (Node<K,V> e = tab[i]; e != null; e = e.next)
                    action.accept(e.key, e.value); // the action receives each key and value during the traversal; it cannot replace the value through these arguments
            }
            if (modCount != mc)
                throw new ConcurrentModificationException();
        }
    }
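
The modCount check at the end of forEach can be seen from the outside: reading values works, but adding a new key from inside the action makes the call end with a ConcurrentModificationException. The sketch below assumes a non-empty HashMap that does not yet contain the key "extra"; ForEachDemo and its methods are illustrative names.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

class ForEachDemo {
    // Read-only use: sums all values.
    static int sumValues(Map<String, Integer> m) {
        int[] sum = {0}; // one-element array so the lambda can update it
        m.forEach((k, v) -> sum[0] += v);
        return sum[0];
    }

    // Adds a new key from inside the action and reports whether forEach objected.
    static boolean structuralChangeFails(Map<String, Integer> m) {
        try {
            m.forEach((k, v) -> m.put("extra", 0)); // first call changes modCount
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> m = new HashMap<>();
        m.put("a", 1);
        m.put("b", 2);
        System.out.println(sumValues(m));
        System.out.println(structuralChangeFails(m));
    }
}
```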

1.20 Java 8 functional-interface methods: replaceAll

    @Override
    public void replaceAll(BiFunction<? super K, ? super V, ? extends V> function) {
        Node<K,V>[] tab;
        if (function == null)
            throw new NullPointerException();
        if (size > 0 && (tab = table) != null) {
            int mc = modCount;
            for (int i = 0; i < tab.length; ++i) {
                for (Node<K,V> e = tab[i]; e != null; e = e.next) {
                    e.value = function.apply(e.key, e.value); // replace the value in place
                }
            }
            if (modCount != mc)
                throw new ConcurrentModificationException();
        }
    }
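
Since replaceAll only rewrites values and never adds or removes nodes, it is the natural tool for updating every value in one pass. A short usage sketch, with ReplaceAllDemo and applyDiscount as made-up names:

```java
import java.util.HashMap;
import java.util.Map;

class ReplaceAllDemo {
    // Reduces every price by the given percentage (integer arithmetic).
    static void applyDiscount(Map<String, Integer> prices, int percent) {
        prices.replaceAll((item, price) -> price * (100 - percent) / 100);
    }

    public static void main(String[] args) {
        Map<String, Integer> prices = new HashMap<>();
        prices.put("book", 200);
        prices.put("pen", 50);
        applyDiscount(prices, 10);
        System.out.println(prices.get("book") + " " + prices.get("pen"));
    }
}
```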

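Some method parameters

The shared CRUD methods are controlled by these flags:
matchValue: match the value; the entry is removed only if its value matches as well
movable: whether other nodes may be moved when a node is removed
onlyIfAbsent: on insert, do not overwrite the value if the key already has one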

2. Questions to think about

2.1 Why does the capacity grow by a factor of 2?

https://www.jianshu.com/p/5ddf1b664641
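It allows the array index to be computed quickly while staying inside the valid range.
The index is computed as (n - 1) & hash. When n is a power of two, (n - 1) & hash equals hash % n for a non-negative hash (and Math.floorMod(hash, n) in general). A bitwise AND is much cheaper than a modulo operation, so this form is used, and that is why the capacity has to be a power of two.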
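
The equivalence between the mask and the modulo can be checked directly, including for negative hash values, and it visibly breaks for a length that is not a power of two. MaskVsModDemo and its methods are names invented for this sketch.

```java
class MaskVsModDemo {
    // Index via bit mask, as HashMap does it.
    static int byMask(int hash, int n) {
        return (n - 1) & hash;
    }

    // Index via modulo; floorMod keeps the result non-negative for negative hashes.
    static int byMod(int hash, int n) {
        return Math.floorMod(hash, n);
    }

    public static void main(String[] args) {
        int[] hashes = {0, 5, 21, -1, -37, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int h : hashes) {
            System.out.println(h + ": mask=" + byMask(h, 16) + " mod=" + byMod(h, 16));
        }
        // With n = 10 (not a power of two) the two disagree, e.g. for hash 12:
        System.out.println(byMask(12, 10) + " vs " + byMod(12, 10));
    }
}
```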

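2.2 Why does JDK 1.7 insert at the head of the list on a hash collision, while 1.8 inserts at the tail?

1.7 inserts at the head because that is efficient: there is no red-black tree, so there is no need to walk the list to count its length.
1.8 has to walk the list to count its length anyway, so it appends at the tail directly; this also avoids the cycle in the list that 1.7 could create when several threads resize at once.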

2.3 How is an even distribution of hashes ensured?

posted @ 2020-06-08 17:45 wullll

itwuliang

Know what it is, and know why it is so
