How HashMap Works

Important HashMap parameters:

　　/**
     * Default length of the table array
     */
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

　　/**
     * Maximum length of the table array
     */
    static final int MAXIMUM_CAPACITY = 1 << 30;

    /**
     * Default load factor
     */
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    /**
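     * Treeify threshold: a bin's linked list is converted to a red-black tree when an insertion pushes its length above 8 (provided the table has at least MIN_TREEIFY_CAPACITY slots)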
     */
    static final int TREEIFY_THRESHOLD = 8;

    /**
     * Untreeify threshold: during a resize, a split tree bin left with 6 or fewer nodes is converted back to a linked list
     */
    static final int UNTREEIFY_THRESHOLD = 6;

    /**
     * The smallest table capacity for which bins may be treeified.
     * (Otherwise the table is resized if too many nodes in a bin.)
     * Should be at least 4 * TREEIFY_THRESHOLD to avoid conflicts
     * between resizing and treeification thresholds.
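     * Minimum table capacity for treeification: a bin is treeified only when the table length is at least this value; below it the table is resized instead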
     */
    static final int MIN_TREEIFY_CAPACITY = 64;
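
As a rough illustration of how these constants interact, the sketch below computes the resize threshold from a capacity and a load factor, and rounds a requested capacity up to a power of two. The names CapacityDemo, roundUpToPowerOfTwo and resizeThreshold are made up for this example and are not JDK APIs; the code only mirrors the arithmetic.

```java
// Illustrative sketch only: mirrors the arithmetic, not the JDK source.
class CapacityDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Smallest power of two >= requested, capped at MAXIMUM_CAPACITY (hypothetical helper).
    static int roundUpToPowerOfTwo(int requested) {
        if (requested <= 1) return 1;
        if (requested >= MAXIMUM_CAPACITY) return MAXIMUM_CAPACITY;
        return Integer.highestOneBit(requested - 1) << 1;
    }

    // Number of entries the table may hold before the next resize (hypothetical helper).
    static int resizeThreshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        System.out.println(resizeThreshold(16, 0.75f)); // 12
        System.out.println(roundUpToPowerOfTwo(17));    // 32
    }
}
```

With the defaults (capacity 16, load factor 0.75) the first resize is triggered by the 13th entry.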

How HashMap is implemented

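HashMap is built mainly on a table[] array plus linked lists: each array slot (bucket) holds the head of a list of nodes whose keys map to that slot. Since JDK 8, a long list can be replaced by a red-black tree.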
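
The array-plus-linked-list layout can be sketched as a tiny map. This is a teaching sketch under simplifying assumptions: TinyChainedMap is a hypothetical class with a fixed table of 16 slots, no resizing and no red-black trees, and it inserts at the head of a bucket for brevity, whereas the JDK 8 code appends at the tail.

```java
import java.util.Objects;

// Teaching sketch: fixed-size table, no resize, no treeification.
class TinyChainedMap<K, V> {
    static final class Node<K, V> {
        final int hash;
        final K key;
        V value;
        Node<K, V> next; // next node in the same bucket, or null

        Node(int hash, K key, V value, Node<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<K, V>[] table = (Node<K, V>[]) new Node[16];

    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Returns the previous value for the key, or null if there was none.
    V put(K key, V value) {
        int h = hash(key);
        int i = (table.length - 1) & h;
        for (Node<K, V> e = table[i]; e != null; e = e.next) {
            if (e.hash == h && Objects.equals(e.key, key)) {
                V old = e.value;
                e.value = value;
                return old;
            }
        }
        table[i] = new Node<>(h, key, value, table[i]); // head insertion for brevity
        return null;
    }

    V get(Object key) {
        int h = hash(key);
        for (Node<K, V> e = table[(table.length - 1) & h]; e != null; e = e.next) {
            if (e.hash == h && Objects.equals(e.key, key)) return e.value;
        }
        return null;
    }

    public static void main(String[] args) {
        TinyChainedMap<String, Integer> m = new TinyChainedMap<>();
        m.put("Aa", 1);
        m.put("BB", 2); // same hashCode as "Aa", so both share one bucket
        System.out.println(m.get("Aa") + " " + m.get("BB"));
    }
}
```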

1. The put method

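① Apply the hash function to the key's hashCode() and compute the bucket index.

② If the bucket is empty (no collision), place the new node directly in it.

③ If the bucket is occupied (a collision), append the node to that bucket's linked list.

④ If the node is appended to a list that already holds at least TREEIFY_THRESHOLD nodes, convert the list to a red-black tree (or resize the table instead while its length is below MIN_TREEIFY_CAPACITY).

⑤ If the key already exists, replace the value stored for it.

⑥ If the number of entries (size) exceeds the threshold, i.e. capacity * load factor (12 with the defaults), resize the table.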

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
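        // When the table is null or empty, initialize it first
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length; // resize() allocates the table here; it is also used to double the table once size exceeds threshold
        if ((p = tab[i = (n - 1) & hash]) == null){
            // i = (n - 1) & hash locates the key's bucket by ANDing the hash with the table length minus 1
            // The table length is a power of two, so the low bits of n - 1 are all 1s and the result never exceeds the largest index
            // The slot is empty: create a new Node whose next is null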
            tab[i] = newNode(hash, key, value, null);
        }
        else {// The slot is occupied: check whether the first node holds the same key
            // Comparison: compare the hashes first, and call equals() only when the hashes are equal
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            else if (p instanceof TreeNode) // The bin is a red-black tree: insert into the tree
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {// Otherwise the bin is a linked list: walk it looking for the key
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) { // Reached the tail without finding the key: append a new node
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // The bin already held TREEIFY_THRESHOLD nodes: treeify (treeifyBin resizes instead if the table is still small)
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))// Found a node with the same key: this is an update
                        break;
                    p = e;
                }
            }
            if (e != null) { // The key already exists: update its value
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        if (++size > threshold) // Resize check
            resize();
        afterNodeInsertion(evict);
        return null;
    }
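
The behaviour of putVal as seen by a caller can be checked with a small example. This is a sketch using only the public java.util.HashMap API; PutDemo and indexFor are names invented for the illustration.

```java
import java.util.HashMap;
import java.util.Map;

class PutDemo {
    // Bucket index for a table of length n (n must be a power of two).
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        System.out.println(map.put("a", 1));         // null: no previous mapping
        System.out.println(map.put("a", 2));         // 1: old value returned and replaced
        System.out.println(map.putIfAbsent("a", 3)); // 2: existing value is kept
        System.out.println(map.get("a"));            // 2
        System.out.println(indexFor(37, 16));        // 5, the same as 37 % 16
    }
}
```

putIfAbsent corresponds to calling putVal with onlyIfAbsent set to true, which is why the existing value survives.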

2. The resize() method: growing or initializing the table

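resize() initializes the table on first use. Once size exceeds the resize threshold, it allocates a new table and moves the existing nodes into it. The capacity grows by a left shift of 1 bit, i.e. it doubles.

When computing a node's position in newTab: if (e.hash & oldCap) == 0, the node keeps its old index j, which equals (n - 1) & hash; if (e.hash & oldCap) != 0, the new index is j + oldCap, which again equals (n - 1) & hash. Here n is the capacity after resizing. This works because n - 1 has exactly one more 1 bit than oldCap - 1, and that extra bit is the oldCap bit, so only that single bit of the hash decides whether the node stays or moves.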
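
The index rule can be verified with a few hand-picked hashes. The following is a minimal sketch; ResizeIndexDemo and newIndex are hypothetical names, and the hash values are chosen so that all four share bucket 5 in a table of 16.

```java
class ResizeIndexDemo {
    // Index after doubling, derived from the old index and one extra hash bit.
    static int newIndex(int hash, int oldCap) {
        int j = hash & (oldCap - 1); // index in the old table
        return (hash & oldCap) == 0 ? j : j + oldCap;
    }

    public static void main(String[] args) {
        int oldCap = 16;
        for (int hash : new int[] {5, 21, 37, 53}) {
            int direct = hash & (2 * oldCap - 1); // (n - 1) & hash with n = 32
            System.out.println(hash + " -> " + newIndex(hash, oldCap) + " (direct: " + direct + ")");
        }
    }
}
```

Hashes 5 and 37 stay at index 5, while 21 and 53 move to index 21 = 5 + 16; in every case the shortcut agrees with the direct computation.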

　　/**
     * Initializes or doubles table size.  If null, allocates in
     * accord with initial capacity target held in field threshold.
     * Otherwise, because we are using power-of-two expansion, the
     * elements from each bin must either stay at same index, or move
     * with a power of two offset in the new table.
     *
     * @return the table
     */
    final Node<K,V>[] resize() {
        // Keep a reference to the old table for the steps below
        Node<K,V>[] oldTab = table;
        int oldCap = (oldTab == null) ? 0 : oldTab.length;
        int oldThr = threshold;
        int newCap, newThr = 0;
        if (oldCap > 0) {
            // The table is already at maximum capacity: stop growing and raise the threshold to Integer.MAX_VALUE
            if (oldCap >= MAXIMUM_CAPACITY) {
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
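                     oldCap >= DEFAULT_INITIAL_CAPACITY)
                // The doubled capacity is below the maximum and the old capacity is at least the default: shift the threshold left by 1 as well
                newThr = oldThr << 1; // i.e. both capacity and threshold are doubled
        }
        else if (oldThr > 0) // Capacity is 0 but threshold is set: the requested initial capacity was stored in threshold
            newCap = oldThr;
        else {               // Capacity and threshold are both 0: use the defaults
            newCap = DEFAULT_INITIAL_CAPACITY;
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        if (newThr == 0) { // The new threshold is still 0: compute it as new capacity * load factor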
            float ft = (float)newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                      (int)ft : Integer.MAX_VALUE);
        }
        threshold = newThr;
        @SuppressWarnings({"rawtypes","unchecked"})
        Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
        table = newTab;
        // Move the nodes from the old table into the new one
        if (oldTab != null) {
            for (int j = 0; j < oldCap; ++j) {
                Node<K,V> e;
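                if ((e = oldTab[j]) != null) { // Only non-empty bins need to be moved
                    oldTab[j] = null;
                    if (e.next == null) // Single node in the bin: recompute its index in the new table and place it
                        newTab[e.hash & (newCap - 1)] = e;
                    else if (e instanceof TreeNode) // The bin is a tree: split it across the new table (a small half may become a list again)
                        ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                    else { // The bin is a linked list: split it into two lists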
                        Node<K,V> loHead = null, loTail = null;
                        Node<K,V> hiHead = null, hiTail = null;
                        Node<K,V> next;
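                        do { // Walk the old bin and distribute its nodes into the lo and hi lists
                            next = e.next;
                            if ((e.hash & oldCap) == 0) { // Stays at the original index
                                if (loTail == null)
                                    // The first node becomes the head; later nodes are linked through next, so only the head is stored in newTab
                                    loHead = e;
                                else
                                    loTail.next = e;
                                loTail = e;
                            }
                            else { // Moves to original index + oldCap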
                                if (hiTail == null)
                                    hiHead = e;
                                else
                                    hiTail.next = e;
                                hiTail = e;
                            }
                        } while ((e = next) != null);
                        if (loTail != null) { // Store the lo list at the original index
                            loTail.next = null;
                            newTab[j] = loHead;
                        }
                        if (hiTail != null) { // Store the hi list at original index + oldCap
                            hiTail.next = null;
                            newTab[j + oldCap] = hiHead;
                        }
                    }
                }
            }
        }
        return newTab;
    }
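
The lo/hi split of a linked-list bin can be isolated into a small standalone sketch. SplitDemo, its Node class and the helper names are assumptions made for this example; only the splitting logic follows the loop in resize() above.

```java
import java.util.ArrayList;
import java.util.List;

class SplitDemo {
    static final class Node {
        final int hash;
        Node next;

        Node(int hash, Node next) {
            this.hash = hash;
            this.next = next;
        }
    }

    // Splits one bucket chain into {loHead, hiHead}, keeping relative order.
    static Node[] split(Node head, int oldCap) {
        Node loHead = null, loTail = null, hiHead = null, hiTail = null;
        Node e = head;
        while (e != null) {
            Node next = e.next;
            e.next = null; // detach before re-linking
            if ((e.hash & oldCap) == 0) {
                if (loTail == null) loHead = e; else loTail.next = e;
                loTail = e;
            } else {
                if (hiTail == null) hiHead = e; else hiTail.next = e;
                hiTail = e;
            }
            e = next;
        }
        return new Node[] {loHead, hiHead};
    }

    // Collects the hashes of a chain, for printing and checking.
    static List<Integer> hashes(Node head) {
        List<Integer> out = new ArrayList<>();
        for (Node e = head; e != null; e = e.next) out.add(e.hash);
        return out;
    }

    public static void main(String[] args) {
        Node chain = new Node(5, new Node(21, new Node(37, new Node(53, null))));
        Node[] parts = split(chain, 16);
        System.out.println("lo: " + hashes(parts[0])); // lo: [5, 37]
        System.out.println("hi: " + hashes(parts[1])); // hi: [21, 53]
    }
}
```

Because each node is appended at the tail of its list, the two resulting chains keep the relative order the nodes had in the old bin.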

3. The get method

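When comparing keys, HashMap always compares the hash values first and calls equals() only when they match. Equal hashes do not imply equal keys, but equal keys must have equal hashes, and comparing two ints is cheaper than calling equals(). A node with a different hash is therefore skipped immediately, which is faster and still correct.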

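① Apply the hash function to the key's hashCode() and compute the bucket index.

② If the first node in the bucket matches, return it immediately.

③ Otherwise there is a collision, and the remaining nodes are searched by hash and key.equals(k).

④ If the bin is a tree, search the tree, O(log n).

⑤ If the bin is a linked list, walk the list, O(n).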

final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (first = tab[(n - 1) & hash]) != null) {// The table is non-empty: locate the bucket from the hash
            if (first.hash == hash && // Always check the first node; on a hit the do...while loop is skipped
                ((k = first.key) == key || (key != null && key.equals(k))))
                return first;
            if ((e = first.next) != null) {
                if (first instanceof TreeNode)// Tree bin: search the tree
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);
                do {// List bin: walk the linked list
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        return null;
    }
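
To see why the equals() check is still needed after the hash comparison, consider two different keys with the same hash code. The sketch below relies on the fact that the strings "Aa" and "BB" both hash to 2112; CollisionDemo and build are names invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

class CollisionDemo {
    // Builds a map whose two keys share one hash code and therefore one bucket.
    static Map<String, String> build() {
        Map<String, String> map = new HashMap<>();
        map.put("Aa", "first");
        map.put("BB", "second");
        return map;
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
        Map<String, String> map = build();
        System.out.println(map.get("Aa")); // first
        System.out.println(map.get("BB")); // second
        System.out.println(map.size());    // 2
    }
}
```

Both entries live in the same bucket, and getNode tells them apart only through key.equals(k).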

4. Common questions about HashMap

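① The treeify threshold is TREEIFY_THRESHOLD = 8, while the threshold for turning a tree back into a list is UNTREEIFY_THRESHOLD = 6. The gap between them (7) prevents frequent conversion between list and tree. If both thresholds were the same value, a bin sitting right at that length and receiving alternating insertions and removals would keep converting back and forth between tree and list.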

② HashMap is not thread-safe; use ConcurrentHashMap when a map is shared between threads.
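
A minimal sketch of the thread-safe alternative: several threads increment one counter through ConcurrentHashMap.merge, which performs the update atomically. ConcurrentCountDemo and countWith are hypothetical names, and the key "hits" is arbitrary.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentCountDemo {
    // Starts `threads` workers that each increment the same key `perThread` times.
    static int countWith(Map<String, Integer> map, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    map.merge("hits", 1, Integer::sum); // atomic on ConcurrentHashMap
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) w.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return map.getOrDefault("hits", 0);
    }

    public static void main(String[] args) {
        System.out.println(countWith(new ConcurrentHashMap<>(), 4, 10_000)); // 40000
    }
}
```

Passing a plain HashMap to the same method may lose updates or corrupt the map, so the result is then unpredictable.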

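③ HashMap trades space for time to make lookups fast. Entries should sit directly in the array as far as possible, keeping the lists short. A node's array index is computed as hash & (n - 1), so as long as the hash values are well scattered, the resulting indices are spread roughly evenly across the array.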

Computing the hash:

    static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

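The high 16 bits of the hashCode are XORed into the low 16 bits, and the result is then ANDed with n - 1. This lets the high bits influence the index as well, so the indices are better scattered.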
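
The effect of the XOR step shows up with hash codes that differ only in their high bits. In this sketch, SpreadDemo, spread and index are illustrative names, and the two sample values are chosen by hand.

```java
class SpreadDemo {
    // Same mixing step as HashMap.hash(), applied to a raw hashCode.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Bucket index for a table of length n (a power of two).
    static int index(int hash, int n) {
        return hash & (n - 1);
    }

    public static void main(String[] args) {
        int n = 16;
        int h1 = 0x10000, h2 = 0x20000; // identical low 16 bits, different high bits
        System.out.println(index(h1, n) + " " + index(h2, n));                 // 0 0
        System.out.println(index(spread(h1), n) + " " + index(spread(h2), n)); // 1 2
    }
}
```

Without the mixing step both values collide in bucket 0; with it they land in buckets 1 and 2.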

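String is a common choice of HashMap key. Strings are immutable, so the hash code can be cached: it is computed on the first hashCode() call and reused afterwards instead of being recomputed. This makes strings well suited as map keys. Lookups rely on both equals() and hashCode(), so it is essential that a key class overrides the two methods correctly; String already does.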
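
For a custom key class, the same requirements can be met as follows. This is a sketch with a hypothetical immutable GridPoint type; the important points are that equals() and hashCode() use the same fields and that those fields never change while the object is used as a key.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical immutable key type used only for this illustration.
final class GridPoint {
    private final int x;
    private final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof GridPoint)) return false;
        GridPoint other = (GridPoint) o;
        return x == other.x && y == other.y;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y); // must agree with equals(): equal points, equal hash
    }
}

class KeyDemo {
    public static void main(String[] args) {
        Map<GridPoint, String> labels = new HashMap<>();
        labels.put(new GridPoint(1, 2), "a");
        // A different but equal instance finds the entry.
        System.out.println(labels.get(new GridPoint(1, 2))); // a
        System.out.println(labels.get(new GridPoint(2, 1))); // null
    }
}
```

If hashCode() were not overridden, the second GridPoint(1, 2) instance would almost always hash to a different bucket and the lookup would return null.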

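④ The load factor can be changed by passing it to the constructor, and it may even be greater than 1, but changing it without a good reason is not recommended.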
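
A short sketch of choosing the load factor through the two-argument constructor HashMap(int initialCapacity, float loadFactor). LoadFactorDemo and accepts are names made up for the example; the constructor rejects values that are not positive or are NaN.

```java
import java.util.HashMap;

class LoadFactorDemo {
    // Returns true if HashMap accepts the given load factor.
    static boolean accepts(float loadFactor) {
        try {
            new HashMap<String, Integer>(16, loadFactor);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(accepts(0.5f));      // true: resizes earlier, uses more memory
        System.out.println(accepts(2.0f));      // true: values above 1 are allowed
        System.out.println(accepts(0f));        // false
        System.out.println(accepts(Float.NaN)); // false
    }
}
```

A load factor above 1 delays resizing at the price of longer bucket chains, which is one reason the default of 0.75 is usually left alone.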

PS: the red-black tree code is not analyzed here.

posted @ 2021-09-14 21:09 筱小2
