Java容器Map接口
Map接口容器存放的是key-value对,由于Map是按key索引的,因此 key 是不可重复的,但 value 允许重复。 下面简单介绍一下Map接口的实现,包括HashMap,LinkedHashMap,WeakHashMap,Hashtable,IdentityHashMap和TreeMap。需要注意的是,Map接口并没有继承Collection接口!
1、HashMap
HashMap 继承于AbstractMap,实现了Cloneable、java.io.Serializable接口,而AbstractMap实现了Map接口。HashMap 的实现不是同步的,这意味着它不是线程安全的。它的key、value都可以为null。此外,HashMap依赖key哈希值,所以其不是有序的,其底层使用数组(元素为链表,根据元素的hashcode确定数组下表)来存储数据。
public class HashMap<K,V> extends AbstractMap<K,V> implements Map<K,V>, Cloneable, Serializable {
public abstract class AbstractMap<K,V> implements Map<K,V> { /** * Sole constructor. (For invocation by subclass constructors, typically * implicit.) */ protected AbstractMap() { }
public interface Map<K,V> { // Query Operations /** * Returns the number of key-value mappings in this map. If the * map contains more than <tt>Integer.MAX_VALUE</tt> elements, returns * <tt>Integer.MAX_VALUE</tt>. * * @return the number of key-value mappings in this map */ int size();
public class HashMap<K,V> extends AbstractMap<K,V> implements Map<K,V>, Cloneable, Serializable { /** * The default initial capacity - MUST be a power of two. */ static final int DEFAULT_INITIAL_CAPACITY = 16; /** * The maximum capacity, used if a higher value is implicitly specified * by either of the constructors with arguments. * MUST be a power of two <= 1<<30. */ static final int MAXIMUM_CAPACITY = 1 << 30; /** * The load factor used when none specified in constructor. */ static final float DEFAULT_LOAD_FACTOR = 0.75f; /** * The table, resized as necessary. Length MUST Always be a power of two. */ transient Entry[] table; /** * The number of key-value mappings contained in this map. */ transient int size; /** * The next size value at which to resize (capacity * load factor). * @serial */ int threshold; /** * The load factor for the hash table. * * @serial */ final float loadFactor; /** * The number of times this HashMap has been structurally modified * Structural modifications are those that change the number of mappings in * the HashMap or otherwise modify its internal structure (e.g., * rehash). This field is used to make iterators on Collection-views of * the HashMap fail-fast. (See ConcurrentModificationException). */ transient volatile int modCount; /** * Constructs an empty <tt>HashMap</tt> with the specified initial * capacity and load factor. * * @param initialCapacity the initial capacity * @param loadFactor the load factor * @throws IllegalArgumentException if the initial capacity is negative * or the load factor is nonpositive */ public HashMap(int initialCapacity, float loadFactor) { if (initialCapacity < 0) throw new IllegalArgumentException("Illegal initial capacity: " + initialCapacity); if (initialCapacity > MAXIMUM_CAPACITY) initialCapacity = MAXIMUM_CAPACITY; if (loadFactor <= 0 || Float.isNaN(loadFactor)) throw new IllegalArgumentException("Illegal load factor: " + loadFactor); // Find a power of 2 >= initialCapacity int capacity = 1; while (capacity < initialCapacity) capacity <<= 1; this.loadFactor = loadFactor; threshold = (int)(capacity * loadFactor); table = new Entry[capacity]; init(); } static class Entry<K,V> implements Map.Entry<K,V> { final K key; V value; Entry<K,V> next; final int hash; /** * Creates new entry. */ Entry(int h, K k, V v, Entry<K,V> n) { value = v; next = n; key = k; hash = h; }
HashMap 的实例有两个参数影响其性能:“初始容量” 和 “加载因子”。容量是哈希表中桶的数量,初始容量只是哈希表在创建时的容量。加载因子是哈希表在其容量自动增加之前可以达到多满的一种尺度。当哈希表中的条目数超出了加载因子与当前容量的乘积时,则要对该哈希表进行resize操作(即重建内部数据结构),从而哈希表将具有大约两倍的桶数。通常,默认加载因子是 0.75, 这是在时间和空间成本上寻求一种折衷。
public V put(K key, V value) { if (key == null) return putForNullKey(value); int hash = hash(key.hashCode()); int i = indexFor(hash, table.length); for (Entry<K,V> e = table[i]; e != null; e = e.next) { Object k; if (e.hash == hash && ((k = e.key) == key || key.equals(k))) { V oldValue = e.value; e.value = value; e.recordAccess(this); return oldValue; } } modCount++; addEntry(hash, key, value, i); return null; }
void addEntry(int hash, K key, V value, int bucketIndex) { Entry<K,V> e = table[bucketIndex]; table[bucketIndex] = new Entry<K,V>(hash, key, value, e); // 在链表头加上此Entry if (size++ >= threshold) resize(2 * table.length); }
void resize(int newCapacity) { Entry[] oldTable = table; int oldCapacity = oldTable.length; if (oldCapacity == MAXIMUM_CAPACITY) { threshold = Integer.MAX_VALUE; return; } Entry[] newTable = new Entry[newCapacity]; transfer(newTable); table = newTable; threshold = (int)(newCapacity * loadFactor); }
2、LinkedHashMap
LinkedHashMap继承与HashMap,与HashMap相比LinkedHashMap维护的是一个具有双重链表的HashMap,LinkedHashMap支持排序,输出时其元素是有顺序的,先插入的先遍历到(对链表的遍历都是使用这个双向链表),而HashMap输出时是随机的。如果Map映射比较复杂而又要求高效率的话,最好使用LinkedHashMap。同样,LinkedHashMap也是非线程安全的。
友情提示一下,可以利用LinkedHashMap很方便的实现LRU算法,主要是重写boolean removeEldestEntry(Map.Entry<K,V> eldest)方法。如果应该从映射移除最旧的条目,则返回 true;如果应该保留,则返回 false。
public class LinkedHashMap<K,V> extends HashMap<K,V> implements Map<K,V> { private static final long serialVersionUID = 3801124242820219131L; /** * The head of the doubly linked list. */ private transient Entry<K,V> header; // 较HashMap多维护的一条双向链表
/** * This override alters behavior of superclass put method. It causes newly * allocated entry to get inserted at the end of the linked list and * removes the eldest entry if appropriate. */ void addEntry(int hash, K key, V value, int bucketIndex) { createEntry(hash, key, value, bucketIndex); // Remove eldest entry if instructed, else grow capacity if appropriate Entry<K,V> eldest = header.after; if (removeEldestEntry(eldest)) { removeEntryForKey(eldest.key); } else { if (size >= threshold) resize(2 * table.length); } }
3、WeakHashMap
WeakHashMap继承了AbstractMap而且实现了Map接口。WeakHashMap的每一个key对象保存了实际对象的弱引用,当系统回收了该key所对应的实际对象之后,WeakHashMap会自动删除该key对应的key-value对(http://zhang-xzhi-xjtu.iteye.com/blog/413159)。虽然Java有 垃圾回收机制,但是只要是自己管理的内存,就应该警惕内存泄露的问题,例如的对象池、缓存中的过期对象都有可能引发内存泄露的问题,其中我们可以用 WeakHashMap来作为缓存的容器可以有效解决这一问题。
WeakHashMap维护了一个ReferenceQueue,保存了所有存在引用的Key对象:
public class WeakHashMap<K,V> extends AbstractMap<K,V> implements Map<K,V> { /** * The default initial capacity -- MUST be a power of two. */ private static final int DEFAULT_INITIAL_CAPACITY = 16; /** * The maximum capacity, used if a higher value is implicitly specified * by either of the constructors with arguments. * MUST be a power of two <= 1<<30. */ private static final int MAXIMUM_CAPACITY = 1 << 30; /** * The load fast used when none specified in constructor. */ private static final float DEFAULT_LOAD_FACTOR = 0.75f; /** * The table, resized as necessary. Length MUST Always be a power of two. */ private Entry[] table; /** * The number of key-value mappings contained in this weak hash map. */ private int size; /** * The next size value at which to resize (capacity * load factor). */ private int threshold; /** * The load factor for the hash table. */ private final float loadFactor; /** * Reference queue for cleared WeakEntries */ private final ReferenceQueue<K> queue = new ReferenceQueue<K>(); // 这个是重点,引用队列
/** * The entries in this hash table extend WeakReference, using its main ref * field as the key. */ private static class Entry<K,V> extends WeakReference<K> implements Map.Entry<K,V> { private V value; private final int hash; private Entry<K,V> next; /** * Creates new entry. */ Entry(K key, V value, ReferenceQueue<K> queue, int hash, Entry<K,V> next) { super(key, queue); this.value = value; this.hash = hash; this.next = next; }
同时, WeakHashMap中有一个私有的expungeStaleEntries()方法,会在大部分共有方法中被调用,而且这个方法会将ReferenceQueue中所有失效的引用从Map中去除,从而实现了weak的效果。
/** * Expunges stale entries from the table. */ private void expungeStaleEntries() { Entry<K,V> e; while ( (e = (Entry<K,V>) queue.poll()) != null) { int h = e.hash; int i = indexFor(h, table.length); Entry<K,V> prev = table[i]; Entry<K,V> p = prev; while (p != null) { Entry<K,V> next = p.next; if (p == e) { if (prev == e) table[i] = next; else prev.next = next; e.next = null; // Help GC e.value = null; // " " size--; break; } prev = p; p = next; } } }
4、HashTable
HashTable是比较久远以前的类了,其继承了Dictionary类,实现了Map接口。
public class Hashtable<K,V> extends Dictionary<K,V> implements Map<K,V>, Cloneable, java.io.Serializable { /** * The hash table data. */ private transient Entry[] table; /** * The total number of entries in the hash table. */ private transient int count; /** * The table is rehashed when its size exceeds this threshold. (The * value of this field is (int)(capacity * loadFactor).) * * @serial */ private int threshold; /** * The load factor for the hashtable. * * @serial */ private float loadFactor; /** * The number of times this Hashtable has been structurally modified * Structural modifications are those that change the number of entries in * the Hashtable or otherwise modify its internal structure (e.g., * rehash). This field is used to make iterators on Collection-views of * the Hashtable fail-fast. (See ConcurrentModificationException). */ private transient int modCount = 0; /** use serialVersionUID from JDK 1.0.2 for interoperability */ private static final long serialVersionUID = 1421746759512286392L; /** * Constructs a new, empty hashtable with the specified initial * capacity and the specified load factor. * * @param initialCapacity the initial capacity of the hashtable. * @param loadFactor the load factor of the hashtable. * @exception IllegalArgumentException if the initial capacity is less * than zero, or if the load factor is nonpositive. */ public Hashtable(int initialCapacity, float loadFactor) { if (initialCapacity < 0) throw new IllegalArgumentException("Illegal Capacity: "+ initialCapacity); if (loadFactor <= 0 || Float.isNaN(loadFactor)) throw new IllegalArgumentException("Illegal Load: "+loadFactor); if (initialCapacity==0) initialCapacity = 1; this.loadFactor = loadFactor; table = new Entry[initialCapacity]; threshold = (int)(initialCapacity * loadFactor); }
HashTable是线程安全的,其很多方法的实现都是用了synchronized关键字:
/** * Returns the number of keys in this hashtable. * * @return the number of keys in this hashtable. */ public synchronized int size() { return count; }
另外,HashTable在扩容的时候会rehash(HashMap已经用resize方法不会去rehash从而优化了这个的效率问题),这个是非常耗时的操作。同时,HashTable不允许存入null的key或者value,否则会抛出NullPointerException
public synchronized V put(K key, V value) { // Make sure the value is not null if (value == null) { throw new NullPointerException(); } // Makes sure the key is not already in the hashtable. Entry tab[] = table; int hash = key.hashCode(); int index = (hash & 0x7FFFFFFF) % tab.length; for (Entry<K,V> e = tab[index] ; e != null ; e = e.next) { if ((e.hash == hash) && e.key.equals(key)) { V old = e.value; e.value = value; return old; } } modCount++; if (count >= threshold) { // Rehash the table if the threshold is exceeded rehash(); /////////////////////////////////////////////////////////////////////////////////////// tab = table; index = (hash & 0x7FFFFFFF) % tab.length; }
5、IdentityHashMap
IdentityHashMap类利用哈希表实现 Map 接口,与HashMap无太大区别。下面描述了IdentityHashMap和HashMap比较不一样的地方。
比较键(和值)时使用引用相等性代替对象相等性,换句话说,在 IdentityHashMap 中,当且仅当 (k1==k2) 时,才认为两个键 k1 和 k2 相等(在正常 Map 实现中,当且仅当满足(k1==null ? k2==null : e1.equals(e2))才认为两个键 k1 和 k2 相等)。
public V put(K key, V value) { Object k = maskNull(key); Object[] tab = table; int len = tab.length; int i = hash(k, len); Object item; while ( (item = tab[i]) != null) { if (item == k) { V oldValue = (V) tab[i + 1]; tab[i + 1] = value; return oldValue; } i = nextKeyIndex(i, len); } modCount++; tab[i] = k; tab[i + 1] = value; if (++size >= threshold) resize(len); // len == 2 * current capacity. return null; }
默认的价值加载因子为2/3,在重新哈希后,加载因子变为1/3.当哈希表中的条目数超出了加载因子与当前容量的乘积时,通过调用 reszie 方法将容量翻倍,重新进行哈希。增加桶数,重新哈希,可能相当昂贵。
private void resize(int newCapacity) { // assert (newCapacity & -newCapacity) == newCapacity; // power of 2 int newLength = newCapacity * 2; Object[] oldTable = table; int oldLength = oldTable.length; if (oldLength == 2*MAXIMUM_CAPACITY) { // can't expand any further if (threshold == MAXIMUM_CAPACITY-1) throw new IllegalStateException("Capacity exhausted."); threshold = MAXIMUM_CAPACITY-1; // Gigantic map! return; } if (oldLength >= newLength) return; Object[] newTable = new Object[newLength]; threshold = newLength / 3; for (int j = 0; j < oldLength; j += 2) { Object key = oldTable[j]; if (key != null) { Object value = oldTable[j+1]; oldTable[j] = null; oldTable[j+1] = null; int i = hash(key, newLength); ////////////////////这里重哈希了///////////////////////// while (newTable[i] != null) i = nextKeyIndex(i, newLength); newTable[i] = key; newTable[i + 1] = value; } } table = newTable; }
6、TreeMap
TreeMap 的实现就是红黑树数据结构,也就说是一棵自平衡的排序二叉树,这样就可以保证当需要快速检索指定节点。红黑树是一种自平衡排序二叉树,树中每个节点的值,都大于或等于在它的左子树中的所有节点的值,并且小于或等于在它的右子树中的所有节点的值,这确保红黑树运行时可以快速地在树中查找和定位的所需节点。 对于 TreeMap 而言,由于它底层采用一棵“红黑树”来保存集合中的 Entry,这意味这 TreeMap 添加元素、取出元素的性能都比 HashMap 低:当 TreeMap 添加元素时,需要通过循环找到新增 Entry 的插入位置,因此比较耗性能;当从 TreeMap 中取出元素时,需要通过循环才能找到合适的 Entry,也比较耗性能。但 TreeMap比 HashMap 的优势在于:TreeMap 中的所有 Entry 总是按 key 根据指定排序规则保持有序状态,TreeSet 中所有元素总是根据指定排序规则保持有序状态。
public class TreeMap<K,V> extends AbstractMap<K,V> implements NavigableMap<K,V>, Cloneable, java.io.Serializable { /** * The comparator used to maintain order in this tree map, or * null if it uses the natural ordering of its keys. * * @serial */ private final Comparator<? super K> comparator; private transient Entry<K,V> root = null; /** * The number of entries in the tree */ private transient int size = 0; /** * The number of structural modifications to the tree. */ private transient int modCount = 0; /** * Constructs a new, empty tree map, using the natural ordering of its * keys. All keys inserted into the map must implement the {@link * Comparable} interface. Furthermore, all such keys must be * <i>mutually comparable</i>: <tt>k1.compareTo(k2)</tt> must not throw * a <tt>ClassCastException</tt> for any keys <tt>k1</tt> and * <tt>k2</tt> in the map. If the user attempts to put a key into the * map that violates this constraint (for example, the user attempts to * put a string key into a map whose keys are integers), the * <tt>put(Object key, Object value)</tt> call will throw a * <tt>ClassCastException</tt>. */ public TreeMap() { comparator = null; } /** * Constructs a new, empty tree map, ordered according to the given * comparator. All keys inserted into the map must be <i>mutually * comparable</i> by the given comparator: <tt>comparator.compare(k1, * k2)</tt> must not throw a <tt>ClassCastException</tt> for any keys * <tt>k1</tt> and <tt>k2</tt> in the map. If the user attempts to put * a key into the map that violates this constraint, the <tt>put(Object * key, Object value)</tt> call will throw a * <tt>ClassCastException</tt>. * * @param comparator the comparator that will be used to order this map. * If <tt>null</tt>, the {@linkplain Comparable natural * ordering} of the keys will be used. */ public TreeMap(Comparator<? super K> comparator) { this.comparator = comparator; }
static final class Entry<K,V> implements Map.Entry<K,V> { K key; V value; Entry<K,V> left = null; Entry<K,V> right = null; Entry<K,V> parent; boolean color = BLACK; /** * Make a new cell with given key, value, and parent, and with * <tt>null</tt> child links, and BLACK color. */ Entry(K key, V value, Entry<K,V> parent) { this.key = key; this.value = value; this.parent = parent; }
同样是支持有序状态,LinkedHashMap保存了记录的插入顺序,先插入的先遍历到,而TreeMap默认是按升序排,也可以指定排序的比较器,遍历的时候按升序遍历。这个TreeMap的实现比较复杂,有需要研究的朋友可以自行查看其源码。
最后,我们来讨论一下hashcode在Map中的重要性。从上面我们了解到Map主要是依赖Hash码来查找元素链表的。根据某个计算公式,使用hashcode找到Entry所在的段,最后遍历对应段的链表结构再次去对比hashcode以及key值,最后找到value。当key对象的hash方法设计不当时,可能导致单个链表数据量过大,最后线性查找性能下降,导致整个map的查找性能下降。