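A Brief Look at the HashSet Source Code
Introduction
HashSet implements the Set interface and extends the AbstractSet class. It is a collection that does not allow duplicate elements.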
public class HashSet<E>
extends AbstractSet<E>
implements Set<E>, Cloneable, java.io.Serializable
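The two main fields of the HashSet class
The member field map is declared transient, so default serialization skips it. HashSet therefore defines its own serialization methods (writeObject and readObject) to handle this field.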
private transient HashMap<E,Object> map;
// Dummy value to associate with an Object in the backing Map
private static final Object PRESENT = new Object();
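From these declarations we can see that:
- HashSet is implemented internally on top of a HashMap;
- PRESENT is the single shared object used as the value for every key written to the map.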
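Because map is transient, one might expect a HashSet to lose its elements when serialized. The round trip below is a minimal sketch for checking that it does not; the class name SerializationRoundTrip and the helper roundTrip are made up for illustration, and only standard java.io streams are used.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class, not part of the JDK.
class SerializationRoundTrip {

    /** Serializes the set to an in-memory buffer and reads it back. */
    @SuppressWarnings("unchecked")
    static Set<String> roundTrip(Set<String> in) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(in);
            }
            try (ObjectInputStream oin = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Set<String>) oin.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            // Rethrow unchecked to keep the demo's call sites simple.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Set<String> original = new HashSet<>(Arrays.asList("a", "b", "c"));
        Set<String> copy = roundTrip(original);
        // The copy is a different object that holds equal elements.
        System.out.println(copy != original && copy.equals(original)); // prints true
    }
}
```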
add
add calls the map's put method and writes the entry <K,V> = <e,PRESENT>:
public boolean add(E e) {
return map.put(e, PRESENT)==null;
}
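As the code shows, the stored object becomes the key of the HashMap, and every value is the same PRESENT object. HashMap keys are unique, so when a duplicate is written to the HashSet, put overwrites the value and leaves the existing key untouched. In that case put returns the previous value, which is non-null, so add returns false. This is how HashSet guarantees that it holds no duplicate elements.
Adds the specified element to this set if it is not already present. More formally, adds the specified element e to this set if this set contains no element e2 such that (e == null ? e2 == null : e.equals(e2)). If this set already contains the element, the call leaves the set unchanged and returns false.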
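The return value of add follows directly from the contract of HashMap.put, which returns the previous value for the key, or null if there was none. The sketch below replays the body of add against a plain map; the class AddReturnValueDemo, the helper addVia and the local stand-in for PRESENT are illustrative names, since the real PRESENT is private to HashSet.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical demo class, not part of the JDK.
class AddReturnValueDemo {

    /** Same expression as the body of HashSet.add, applied to a caller-supplied map. */
    static <E> boolean addVia(Map<E, Object> map, E e, Object present) {
        return map.put(e, present) == null;
    }

    public static void main(String[] args) {
        Map<String, Object> map = new HashMap<>();
        Object present = new Object(); // stand-in for HashSet's private PRESENT

        System.out.println(addVia(map, "a", present)); // true: key was absent, put returned null
        System.out.println(addVia(map, "a", present)); // false: put returned the old value
        System.out.println(map.size());                // 1: the key is stored once

        // The real HashSet reports the same results.
        Set<String> set = new HashSet<>();
        System.out.println(set.add("a")); // true
        System.out.println(set.add("a")); // false
    }
}
```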
contains
Similarly, to check whether the HashSet contains an element, contains calls map.containsKey(o), which tests whether the map holds that key:
public boolean contains(Object o) {
return map.containsKey(o);
}
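Since the lookup goes through containsKey, membership is decided by hashCode and equals rather than by object identity, and a null element can be looked up as well. A small sketch, with ContainsDemo and sample as made-up names:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class, not part of the JDK.
class ContainsDemo {

    /** Builds a set holding one string and the null element. */
    static Set<String> sample() {
        Set<String> set = new HashSet<>();
        set.add("java");
        set.add(null);
        return set;
    }

    public static void main(String[] args) {
        Set<String> set = sample();
        String other = new String("java"); // distinct object with equal content

        System.out.println(set.contains(other));    // true: equal by equals/hashCode
        System.out.println(set.contains(null));     // true: null is a valid element
        System.out.println(set.contains("kotlin")); // false: never added
    }
}
```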
remove
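Removes the specified element from this set if it is present. More formally, removes an element e such that (o == null ? e == null : o.equals(e)), if this set contains such an element. Returns true if this set contained the element (or equivalently, if this set changed as a result of the call). (This set will not contain the element once the call returns.)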
public boolean remove(Object o) {
return map.remove(o)==PRESENT;
}
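HashMap.remove returns the value that was mapped to the key, or null if the key was absent, so the comparison with PRESENT is true only when something was actually removed. The sketch below observes this from the outside; RemoveDemo and removeTwice are illustrative names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class, not part of the JDK.
class RemoveDemo {

    /** Removes the same element twice and reports both return values. */
    static boolean[] removeTwice(Set<String> set, String e) {
        boolean first = set.remove(e);  // true if the element was present
        boolean second = set.remove(e); // false: it is already gone
        return new boolean[] { first, second };
    }

    public static void main(String[] args) {
        Set<String> set = new HashSet<>(Arrays.asList("a", "b"));
        boolean[] results = removeTwice(set, "a");
        System.out.println(results[0] + " " + results[1]); // true false
        System.out.println(set.contains("a"));             // false
    }
}
```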
Serialization
…
Official documentation
This class implements the Set interface, backed by a hash table (actually a HashMap instance). It makes no guarantees as to the iteration order of the set; in particular, it does not guarantee that the order will remain constant over time. This class permits the null element.
In short: the null element is permitted.
This class offers constant time performance for the basic operations (add, remove, contains and size), assuming the hash function disperses the elements properly among the buckets. Iterating over this set requires time proportional to the sum of the HashSet instance’s size (the number of elements) plus the “capacity” of the backing HashMap instance (the number of buckets). Thus, it’s very important not to set the initial capacity too high (or the load factor too low) if iteration performance is important.
In short: the basic operations all run in O(1), that is, constant time.
Note that this implementation is not synchronized. If multiple threads access a hash set concurrently, and at least one of the threads modifies the set, it must be synchronized externally. This is typically accomplished by synchronizing on some object that naturally encapsulates the set. If no such object exists, the set should be “wrapped” using the Collections.synchronizedSet method. This is best done at creation time, to prevent accidental unsynchronized access to the set:
In short: HashSet is likewise not thread-safe. To get a thread-safe set, wrap it with Collections.synchronizedSet():
Set s = Collections.synchronizedSet(new HashSet(...));
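A slightly fuller sketch of the wrapper in use is shown below, with generics added. The class SynchronizedSetDemo and the method concurrentAdds are made-up names. Note that the Collections.synchronizedSet documentation requires callers to synchronize on the returned set manually while iterating over it, which the sketch does.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class, not part of the JDK.
class SynchronizedSetDemo {

    /** Starts several threads that add disjoint ranges of integers, then counts the elements. */
    static int concurrentAdds(int threads, int perThread) {
        Set<Integer> set = Collections.synchronizedSet(new HashSet<>());

        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    set.add(base + i); // each call is synchronized by the wrapper
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }

        // Iteration is not covered by the wrapper: lock on the set explicitly.
        int count = 0;
        synchronized (set) {
            for (Integer ignored : set) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(concurrentAdds(4, 1000)); // 4000
    }
}
```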
The iterators returned by this class’s iterator method are fail-fast: if the set is modified at any time after the iterator is created, in any way except through the iterator’s own remove method, the Iterator throws a ConcurrentModificationException. Thus, in the face of concurrent modification, the iterator fails quickly and cleanly, rather than risking arbitrary, non-deterministic behavior at an undetermined time in the future.
In short: the iterators are fail-fast and throw ConcurrentModificationException if the set is structurally modified behind their back.
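The fail-fast behaviour can be reproduced in a single thread. In the sketch below, FailFastDemo and its two methods are illustrative names: the first method modifies the set directly while an iterator is open, and the second removes elements through the iterator's own remove method, which is the permitted way.

```java
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

// Hypothetical demo class, not part of the JDK.
class FailFastDemo {

    /** Returns true if the iterator throws after the set is changed directly. */
    static boolean modifyBehindIterator() {
        Set<Integer> set = new HashSet<>(Arrays.asList(1, 2, 3, 4, 5));
        Iterator<Integer> it = set.iterator();
        Integer first = it.next();
        set.remove(first); // structural change that bypasses the iterator
        try {
            it.next();     // the iterator detects the change here
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    /** Removes the even numbers via Iterator.remove and returns the remaining size. */
    static int removeEvensViaIterator() {
        Set<Integer> set = new HashSet<>(Arrays.asList(1, 2, 3, 4, 5));
        for (Iterator<Integer> it = set.iterator(); it.hasNext(); ) {
            if (it.next() % 2 == 0) {
                it.remove(); // allowed: the iterator stays consistent with the set
            }
        }
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(modifyBehindIterator());   // true
        System.out.println(removeEvensViaIterator()); // 3
    }
}
```

The same documentation describes this check as best-effort, so it is useful for catching bugs and should not be relied on for program correctness.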