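[Original] A Brief Look at the Source of Common Java Classes in Android: HashMap
Since this is only a brief look, I cover just the commonly used interfaces. Note that this is the Java class as shipped in Android, so it may differ from the JDK source.
Let's start with the constructors: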

    /**
     * Min capacity (other than zero) for a HashMap. Must be a power of two
     * greater than 1 (and less than 1 << 30).
     */
    private static final int MINIMUM_CAPACITY = 4;

    /**
     * Max capacity for a HashMap. Must be a power of two >= MINIMUM_CAPACITY.
     */
    private static final int MAXIMUM_CAPACITY = 1 << 30;

    /**
     * An empty table shared by all zero-capacity maps (typically from default
     * constructor). It is never written to, and replaced on first put. Its size
     * is set to half the minimum, so that the first resize will create a
     * minimum-sized table.
     */
    private static final Entry[] EMPTY_TABLE
            = new HashMapEntry[MINIMUM_CAPACITY >>> 1];

    /**
     * The default load factor. Note that this implementation ignores the
     * load factor, but cannot do away with it entirely because it's
     * mentioned in the API.
     *
     * <p>Note that this constant has no impact on the behavior of the program,
     * but it is emitted as part of the serialized form. The load factor of
     * .75 is hardwired into the program, which uses cheap shifts in place of
     * expensive division.
     */
    static final float DEFAULT_LOAD_FACTOR = .75F;

    /**
     * The hash table. If this hash map contains a mapping for null, it is
     * not represented this hash table.
     */
    transient HashMapEntry<K, V>[] table;

    /**
     * The entry representing the null key, or null if there's no such mapping.
     */
    transient HashMapEntry<K, V> entryForNullKey;

    /**
     * The number of mappings in this hash map.
     */
    transient int size;

    /**
     * Incremented by "structural modifications" to allow (best effort)
     * detection of concurrent modification.
     */
    transient int modCount;

    /**
     * The table is rehashed when its size exceeds this threshold.
     * The value of this field is generally .75 * capacity, except when
     * the capacity is zero, as described in the EMPTY_TABLE declaration
     * above.
     */
    private transient int threshold;

    public HashMap() {
        table = (HashMapEntry<K, V>[]) EMPTY_TABLE;
        threshold = -1; // Forces first put invocation to replace EMPTY_TABLE
    }

    public HashMap(int capacity) {
        if (capacity < 0) {
            throw new IllegalArgumentException("Capacity: " + capacity);
        }

        if (capacity == 0) {
            @SuppressWarnings("unchecked")
            HashMapEntry<K, V>[] tab = (HashMapEntry<K, V>[]) EMPTY_TABLE;
            table = tab;
            threshold = -1; // Forces first put() to replace EMPTY_TABLE
            return;
        }

        if (capacity < MINIMUM_CAPACITY) {
            capacity = MINIMUM_CAPACITY;
        } else if (capacity > MAXIMUM_CAPACITY) {
            capacity = MAXIMUM_CAPACITY;
        } else {
            capacity = Collections.roundUpToPowerOfTwo(capacity);
        }
        makeTable(capacity);
    }

    public HashMap(int capacity, float loadFactor) {
        this(capacity);

        if (loadFactor <= 0 || Float.isNaN(loadFactor)) {
            throw new IllegalArgumentException("Load factor: " + loadFactor);
        }

        /*
         * Note that this implementation ignores loadFactor; it always uses
         * a load factor of 3/4. This simplifies the code and generally
         * improves performance.
         */
    }

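From the source of the three constructors we can tell that:
- Internally, HashMap is implemented with an array of HashMapEntry.
- new HashMap() does not allocate a table of its own. It points table at the shared EMPTY_TABLE, whose length is 2 (MINIMUM_CAPACITY >>> 1), and sets threshold to -1 so that the first put() replaces it.
- HashMap(int capacity) rounds the requested capacity up to the smallest power of two that is greater than or equal to it, for example capacity=25 becomes 32. Values below 4 become 4, values above 1 << 30 are clamped to 1 << 30, and 0 falls back to EMPTY_TABLE. threshold is set to 75% of the resulting capacity; once the number of entries exceeds threshold, the table is grown.
- HashMap(int capacity, float loadFactor) behaves the same as HashMap(int capacity): loadFactor is only validated (a value <= 0 or NaN throws IllegalArgumentException) and is otherwise ignored. Note that this is an Android modification; the JDK implementation does use this parameter.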
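The capacity and threshold arithmetic above can be sketched as follows. This is only an illustration: Collections.roundUpToPowerOfTwo is internal to Android's libcore, so the roundUpToPowerOfTwo helper here is a hypothetical stand-in built on Integer.highestOneBit, and the class and method names are my own.

```java
// Sketch of the capacity normalization described above (names are hypothetical).
class CapacitySketch {
    static final int MINIMUM_CAPACITY = 4;
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Smallest power of two >= n, valid for 1 <= n <= 2^30.
    // Stand-in for the internal Collections.roundUpToPowerOfTwo.
    static int roundUpToPowerOfTwo(int n) {
        int highest = Integer.highestOneBit(n);
        return (highest == n) ? n : highest << 1;
    }

    // Mirrors the clamping in HashMap(int capacity) for capacity > 0.
    static int normalize(int capacity) {
        if (capacity < MINIMUM_CAPACITY) {
            return MINIMUM_CAPACITY;
        }
        if (capacity > MAXIMUM_CAPACITY) {
            return MAXIMUM_CAPACITY;
        }
        return roundUpToPowerOfTwo(capacity);
    }

    // 3/4 of the capacity computed with shifts, as makeTable() does.
    static int threshold(int capacity) {
        return (capacity >> 1) + (capacity >> 2);
    }

    public static void main(String[] args) {
        int capacity = normalize(25);
        System.out.println(capacity + " " + threshold(capacity)); // prints "32 24"
    }
}
```

Because the capacity is always a power of two, capacity >> 1 plus capacity >> 2 is exactly 3/4 of it, which is why the load factor can be hardwired without any division.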
Since the map is backed by an array of HashMapEntry, let's take a quick look at what this entry looks like:

    static class HashMapEntry<K, V> implements Entry<K, V> {
        final K key;
        V value;
        final int hash;
        HashMapEntry<K, V> next;

        HashMapEntry(K key, V value, int hash, HashMapEntry<K, V> next) {
            this.key = key;
            this.value = value;
            this.hash = hash;
            this.next = next;
        }
    }

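Pay attention to the next field here. Readers with some experience will recognize it as a singly linked list, so each slot of the array that backs the HashMap is really the head of a singly linked list. Let's keep going.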
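To make the "each slot is the head of a singly linked list" idea concrete, here is a minimal sketch. Node is a simplified, non-generic stand-in for HashMapEntry that I made up for illustration; it is not the Android class.

```java
// Illustrative only: a simplified stand-in for HashMapEntry.
class ChainSketch {
    static final class Node {
        final String key;
        int value;
        Node next; // link to the next node in the same bucket

        Node(String key, int value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    // Returns a new head whose next pointer is the old head.
    static Node prepend(Node head, String key, int value) {
        return new Node(key, value, head);
    }

    // Walks the chain from the head until the key is found or the list ends.
    static Node find(Node head, String key) {
        for (Node n = head; n != null; n = n.next) {
            if (key.equals(n.key)) {
                return n;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Node head = prepend(null, "a", 1);
        head = prepend(head, "b", 2);
        System.out.println(head.key + " -> " + head.next.key); // prints "b -> a"
        System.out.println(find(head, "a").value);             // prints "1"
    }
}
```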
Next, let's analyze the put(K key, V value) method:

    void addNewEntryForNullKey(V value) {
        entryForNullKey = new HashMapEntry<K, V>(null, value, 0, null);
    }

    private V putValueForNullKey(V value) {
        HashMapEntry<K, V> entry = entryForNullKey;
        if (entry == null) {
            addNewEntryForNullKey(value);
            size++;
            modCount++;
            return null;
        } else {
            preModify(entry);
            V oldValue = entry.value;
            entry.value = value;
            return oldValue;
        }
    }

    private HashMapEntry<K, V>[] makeTable(int newCapacity) {
        @SuppressWarnings("unchecked") HashMapEntry<K, V>[] newTable
                = (HashMapEntry<K, V>[]) new HashMapEntry[newCapacity];
        table = newTable;
        threshold = (newCapacity >> 1) + (newCapacity >> 2); // 3/4 capacity
        return newTable;
    }

    private HashMapEntry<K, V>[] doubleCapacity() {
        HashMapEntry<K, V>[] oldTable = table;
        int oldCapacity = oldTable.length;
        if (oldCapacity == MAXIMUM_CAPACITY) {
            return oldTable;
        }
        int newCapacity = oldCapacity * 2;
        HashMapEntry<K, V>[] newTable = makeTable(newCapacity);
        if (size == 0) {
            return newTable;
        }

        for (int j = 0; j < oldCapacity; j++) {
            /*
             * Rehash the bucket using the minimum number of field writes.
             * This is the most subtle and delicate code in the class.
             */
            HashMapEntry<K, V> e = oldTable[j];
            if (e == null) {
                continue;
            }
            int highBit = e.hash & oldCapacity;
            HashMapEntry<K, V> broken = null;
            newTable[j | highBit] = e;
            for (HashMapEntry<K, V> n = e.next; n != null; e = n, n = n.next) {
                int nextHighBit = n.hash & oldCapacity;
                if (nextHighBit != highBit) {
                    if (broken == null)
                        newTable[j | nextHighBit] = n;
                    else
                        broken.next = n;
                    broken = e;
                    highBit = nextHighBit;
                }
            }
            if (broken != null)
                broken.next = null;
        }
        return newTable;
    }

    @Override public V put(K key, V value) {
        if (key == null) {
            return putValueForNullKey(value);
        }

        int hash = Collections.secondaryHash(key);
        HashMapEntry<K, V>[] tab = table;
        int index = hash & (tab.length - 1);
        for (HashMapEntry<K, V> e = tab[index]; e != null; e = e.next) {
            if (e.hash == hash && key.equals(e.key)) {
                preModify(e);
                V oldValue = e.value;
                e.value = value;
                return oldValue;
            }
        }

        // No entry for (non-null) key is present; create one
        modCount++;
        if (size++ > threshold) {
            tab = doubleCapacity();
            index = hash & (tab.length - 1);
        }
        addNewEntry(key, value, hash, index);
        return null;
    }

    void addNewEntry(K key, V value, int hash, int index) {
        table[index] = new HashMapEntry<K, V>(key, value, hash, table[index]);
    }

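From the source of put(K key, V value) we can learn the following:
- A value whose key is null is stored in a separate HashMapEntry object, entryForNullKey, and not in the table.
- The index into the entry array is derived from the hash: int index = hash & (tab.length - 1).
- On a collision (that is, the computed index already holds an entry), put walks the chain looking for an entry with the same hash and an equal key. If it finds one, it updates the value and returns the old value directly.
- A newly created entry becomes the head of the linked list at its index.
- When the number of entries exceeds threshold (75% of capacity; the check size++ > threshold uses the size before the new entry is counted), the table is doubled. The stored hash of each entry is reused: each old bucket j is split between new bucket j and new bucket j | oldCapacity according to the bit hash & oldCapacity, and the existing entry objects are relinked into the new array, not copied.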
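The index computation and the bucket split during doubling can be checked with a few lines of arithmetic. This is a sketch under assumptions: it uses plain int values as hashes in place of Collections.secondaryHash (which is internal to Android), and the class and method names are hypothetical.

```java
// Sketch: index masking and the one-bit split used when the table doubles.
class IndexSketch {
    // Same expression as in put(): valid because tableLength is a power of two.
    static int indexFor(int hash, int tableLength) {
        return hash & (tableLength - 1);
    }

    public static void main(String[] args) {
        int oldCapacity = 8;
        int[] hashes = {5, 13, 21, 29}; // all land in bucket 5 of an 8-slot table
        for (int hash : hashes) {
            int oldIndex = indexFor(hash, oldCapacity);
            int newIndex = indexFor(hash, oldCapacity * 2);
            // doubleCapacity() gets the same result from a single bit test.
            int viaHighBit = oldIndex | (hash & oldCapacity);
            System.out.println(hash + ": " + oldIndex + " -> " + newIndex
                    + " (j | highBit = " + viaHighBit + ")");
        }
        // prints:
        // 5: 5 -> 5 (j | highBit = 5)
        // 13: 5 -> 13 (j | highBit = 13)
        // 21: 5 -> 5 (j | highBit = 5)
        // 29: 5 -> 13 (j | highBit = 13)
    }
}
```

Since only one extra bit of the hash becomes significant when the capacity doubles, every entry either stays in bucket j or moves to bucket j + oldCapacity, which is what lets doubleCapacity() split a chain without recomputing any hash.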
With put analyzed, the other interfaces such as get, remove, and containsKey are much the same, so I skip them here.
Next, let's look at the Set<K> keySet() interface:

    @Override public Set<K> keySet() {
        Set<K> ks = keySet;
        return (ks != null) ? ks : (keySet = new KeySet());
    }

    private final class KeySet extends AbstractSet<K> {
        public Iterator<K> iterator() {
            return newKeyIterator();
        }
        public int size() {
            return size;
        }
        public boolean isEmpty() {
            return size == 0;
        }
        public boolean contains(Object o) {
            return containsKey(o);
        }
        public boolean remove(Object o) {
            int oldSize = size;
            HashMap.this.remove(o);
            return size != oldSize;
        }
        public void clear() {
            HashMap.this.clear();
        }
    }

    Iterator<K> newKeyIterator() {
        return new KeyIterator();
    }

    private final class KeyIterator extends HashIterator
            implements Iterator<K> {
        public K next() {
            return nextEntry().key;
        }
    }

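From the source we can draw the following conclusions:
- The Set returned by keySet is tightly bound to the HashMap: it is a view, and calls on the Set actually operate on the HashMap itself.
- The Set's iterator is a KeyIterator, which in turn extends HashIterator.
- entrySet() and values() are implemented on the same principle as keySet().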
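The view behavior is easy to observe from the outside. The sketch below uses only the public java.util.HashMap API, so it should behave the same on Android and on the JDK; the class name and the demo method are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Sketch: keySet() is a live view over the map, not a copy.
class KeySetViewSketch {
    static Map<String, Integer> demo() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);

        Set<String> keys = map.keySet();
        keys.remove("a"); // removes the mapping "a" -> 1 from the map itself
        map.put("c", 3);  // the new key is visible through the existing Set
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = demo();
        System.out.println(map.containsKey("a"));      // prints "false"
        System.out.println(map.keySet().contains("c")); // prints "true"
        System.out.println(map.size());                // prints "2"
    }
}
```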
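Now that we know how HashMap is implemented, its strengths and weaknesses become clear:
Strength: reads and writes are fast, close to indexing directly into an array.
Weakness: it can hold a lot of unused memory. The entry array's capacity must be a power of two (so that the index can be computed with the mask hash & (length - 1)), and the array doubles as soon as the number of entries exceeds 75% of the capacity, so right after a resize fewer than half of the slots are in use.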
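To get a feel for the growth behavior, the following sketch replays only the size and threshold bookkeeping from put() and doubleCapacity() for a map created with new HashMap(). It assumes every inserted key is distinct and non-null, and it does not store any entries; the class and method names are hypothetical.

```java
// Sketch: table length after n distinct non-null keys are put into new HashMap().
class GrowthSketch {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    static int tableLengthAfter(int n) {
        int capacity = 2;   // length of EMPTY_TABLE
        int threshold = -1; // forces the first put to allocate a real table
        int size = 0;
        for (int i = 0; i < n; i++) {
            // Same check as put(): compare the size before counting the new entry.
            boolean grow = size++ > threshold;
            if (grow && capacity < MAXIMUM_CAPACITY) {
                capacity *= 2;
                threshold = (capacity >> 1) + (capacity >> 2);
            }
        }
        return capacity;
    }

    public static void main(String[] args) {
        int[] counts = {4, 5, 13, 14, 26};
        for (int n : counts) {
            System.out.println(n + " entries -> " + tableLengthAfter(n) + " slots");
        }
        // prints:
        // 4 entries -> 4 slots
        // 5 entries -> 8 slots
        // 13 entries -> 16 slots
        // 14 entries -> 32 slots
        // 26 entries -> 64 slots
    }
}
```

Going from 13 to 14 entries doubles the array from 16 to 32 slots, which illustrates the memory cost mentioned above. The per-entry objects come on top of the array itself.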
If you spot any problems, please point them out!
Please credit the source when reposting.