HashSet Source Code: Study Notes and Summary
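Class Hierarchy
HashSet<E> extends AbstractSet<E> and implements Set<E>, Cloneable, and java.io.Serializable.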
Class Comment
/**
 * This class implements the <tt>Set</tt> interface, backed by a hash table
 * (actually a <tt>HashMap</tt> instance). It makes no guarantees as to the
 * iteration order of the set; in particular, it does not guarantee that the
 * order will remain constant over time. This class permits the <tt>null</tt>
 * element.
 * (Note: HashSet implements the Set interface on top of a HashMap, does not
 * guarantee iteration order, and permits the null element.)
 *
 * This class offers constant time performance for the basic operations
 * (<tt>add</tt>, <tt>remove</tt>, <tt>contains</tt> and <tt>size</tt>),
 * assuming the hash function disperses the elements properly among the
 * buckets.
 * (Note: provided the hash function spreads elements well across the buckets,
 * add, remove, contains, and size run in O(1) time.)
 *
 * Iterating over this set requires time proportional to the sum of the
 * <tt>HashSet</tt> instance's size (the number of elements) plus the
 * "capacity" of the backing <tt>HashMap</tt> instance (the number of buckets).
 * (Note: iteration time is proportional to the number of elements plus the
 * number of buckets in the backing HashMap.)
 *
 * Thus, it's very important not to set the initial capacity too high (or the
 * load factor too low) if iteration performance is important.
 * (Note: if iteration performance matters, avoid an overly large initial
 * capacity or an overly low load factor.)
 *
 * <p><strong>Note that this implementation is not synchronized.</strong>
 * If multiple threads access a hash set concurrently, and at least one of
 * the threads modifies the set, it <i>must</i> be synchronized externally.
 * (Note: HashSet is not synchronized; if several threads access it
 * concurrently and at least one modifies it, it must be synchronized
 * externally, that is, guarded by a lock.)
 *
 * This is typically accomplished by synchronizing on some object that
 * naturally encapsulates the set.
 * (Note: this is usually done by locking on an object that encapsulates the
 * set.)
 *
 * If no such object exists, the set should be "wrapped" using the
 * {@link Collections#synchronizedSet Collections.synchronizedSet} method.
 * (Note: if no such object exists, wrap the set with
 * Collections.synchronizedSet.)
 *
 * This is best done at creation time, to prevent accidental unsynchronized
 * access to the set:<pre>
 *   Set s = Collections.synchronizedSet(new HashSet(...));</pre>
 * (Note: do the wrapping at creation time so that the set can never be
 * reached without synchronization.)
 *
 * <p>The iterators returned by this class's <tt>iterator</tt> method are
 * <i>fail-fast</i>: if the set is modified at any time after the iterator is
 * created, in any way except through the iterator's own <tt>remove</tt>
 * method, the Iterator throws a {@link ConcurrentModificationException}.
 * Thus, in the face of concurrent modification, the iterator fails quickly
 * and cleanly, rather than risking arbitrary, non-deterministic behavior at
 * an undetermined time in the future.
 *
 * <p>Note that the fail-fast behavior of an iterator cannot be guaranteed
 * as it is, generally speaking, impossible to make any hard guarantees in the
 * presence of unsynchronized concurrent modification. Fail-fast iterators
 * throw <tt>ConcurrentModificationException</tt> on a best-effort basis.
 * Therefore, it would be wrong to write a program that depended on this
 * exception for its correctness: <i>the fail-fast behavior of iterators
 * should be used only to detect bugs.</i>
 * (Note: the two paragraphs above describe the fail-fast behavior of
 * HashSet's iterators.)
 *
 * @author  Josh Bloch
 * @author  Neal Gafter
 * @see     Collection
 * @see     HashMap
 * @since   1.2
 */
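As a small illustration of the synchronization advice in the class comment, the sketch below wraps a HashSet with Collections.synchronizedSet at creation time and lets several threads add to it. The class name SynchronizedSetDemo, the helper fillConcurrently, and the thread and element counts are assumptions made for this example only. Note that iterating over the wrapped set still needs a manual synchronized block, as documented for Collections.synchronizedSet.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class SynchronizedSetDemo {
    // Hypothetical helper: each thread adds perThread distinct integers; returns the final element count.
    static int fillConcurrently(int threads, int perThread) {
        // Wrap at creation time so no reference to the raw, unsynchronized HashSet escapes.
        Set<Integer> set = Collections.synchronizedSet(new HashSet<>());
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    set.add(base + i);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        // The wrapper locks each single call, but not a whole iteration; lock the set manually.
        synchronized (set) {
            int count = 0;
            for (Integer ignored : set) {
                count++;
            }
            return count;
        }
    }

    public static void main(String[] args) {
        System.out.println(fillConcurrently(4, 1000)); // 4000: every value added was distinct
    }
}
```

With a plain HashSet in place of the wrapper, the same program could lose elements or corrupt the backing table, which is the situation the class comment warns about.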
Core Fields
// The backing store is a HashMap. Every key maps to the same dummy value,
// the static final PRESENT object declared below.
private transient HashMap<E,Object> map;

// Dummy value to associate with an Object in the backing Map
private static final Object PRESENT = new Object();
Constructors
Each constructor below simply initializes the backing HashMap.
public HashSet() {
    map = new HashMap<>();
}

public HashSet(int initialCapacity) {
    map = new HashMap<>(initialCapacity);
}

public HashSet(int initialCapacity, float loadFactor) {
    map = new HashMap<>(initialCapacity, loadFactor);
}

public HashSet(Collection<? extends E> c) {
    map = new HashMap<>(Math.max((int) (c.size()/.75f) + 1, 16));
    addAll(c);
}
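To make the sizing expression in the collection constructor concrete, here is a small sketch that evaluates it for a few collection sizes. The class CapacityDemo and the helper initialCapacityFor are hypothetical names; only the expression itself comes from the source above. Dividing by the default load factor of 0.75 appears intended to request enough capacity for addAll(c) to finish without the backing map resizing.

```java
class CapacityDemo {
    // Hypothetical helper: the same expression HashSet(Collection) passes to new HashMap<>(...).
    static int initialCapacityFor(int size) {
        return Math.max((int) (size / .75f) + 1, 16);
    }

    public static void main(String[] args) {
        System.out.println(initialCapacityFor(3));   // 16  (the minimum of 16 wins)
        System.out.println(initialCapacityFor(12));  // 17  (12 / 0.75 = 16, plus 1)
        System.out.println(initialCapacityFor(100)); // 134 (100 / 0.75 = 133.33, truncated to 133, plus 1)
    }
}
```

HashMap rounds the requested capacity up to a power of two, so these requests end up as tables of 16, 32, and 256 buckets respectively.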
Important Methods
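// Delegates to the backing map, storing the pair (e, PRESENT).
// If the set already contains e, put returns the previous non-null value,
// so add returns false.
public boolean add(E e) {
    return map.put(e, PRESENT)==null;
}

public boolean remove(Object o) {
    return map.remove(o)==PRESENT;
}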
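The delegation used by add and remove can be sketched as a stripped-down set. MiniSet below is a hypothetical teaching class, not the JDK source: it keeps only add, remove, contains, and size, and relies on HashMap.put returning the previous value (null when the key was absent) and HashMap.remove returning the removed value (null when nothing was removed).

```java
import java.util.HashMap;

class MiniSet<E> {
    // Shared dummy value stored against every key, mirroring PRESENT in HashSet.
    private static final Object PRESENT = new Object();
    private final HashMap<E, Object> map = new HashMap<>();

    // put returns the previous value, so null means the key was not there before.
    public boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

    // remove returns the value that was mapped, which is PRESENT only if the key existed.
    public boolean remove(Object o) {
        return map.remove(o) == PRESENT;
    }

    public boolean contains(Object o) {
        return map.containsKey(o);
    }

    public int size() {
        return map.size();
    }

    public static void main(String[] args) {
        MiniSet<String> set = new MiniSet<>();
        System.out.println(set.add("a"));    // true: "a" was absent
        System.out.println(set.add("a"));    // false: "a" was already present
        System.out.println(set.add(null));   // true: a null element is allowed
        System.out.println(set.size());      // 2
        System.out.println(set.remove("a")); // true: "a" was removed
        System.out.println(set.remove("a")); // false: nothing left to remove
    }
}
```

Because PRESENT is never null, a null return from put unambiguously means "new key", which is why a dummy object is stored rather than null.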
Summary
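HashSet is implemented on top of a HashMap: each element is stored as a key in the map, and every key maps to the same shared dummy value, the static final PRESENT object (a plain new Object()).
Its iterators are fail-fast, its elements are unique, and its iteration order is not guaranteed.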
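The fail-fast behavior can be observed even in a single thread. The sketch below uses made-up names (FailFastDemo and its two helpers) and sample values: it first adds a new element to a set while a for-each loop is iterating over it, which makes the next call to next() throw ConcurrentModificationException, and then removes elements through the iterator's own remove method, which is allowed. As the class comment warns, the exception is raised on a best-effort basis and should only be used to detect bugs.

```java
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

class FailFastDemo {
    // Returns true if adding a new element mid-iteration makes the iterator throw.
    static boolean structuralChangeIsDetected() {
        Set<Integer> set = new HashSet<>(Arrays.asList(1, 2, 3));
        try {
            for (Integer value : set) {
                // value + 100 is not in the set yet, so this is a structural modification.
                set.add(value + 100);
            }
            return false;
        } catch (ConcurrentModificationException expected) {
            return true;
        }
    }

    // Removes even numbers through the iterator, which keeps the iterator consistent.
    static Set<Integer> removeEvensSafely() {
        Set<Integer> set = new HashSet<>(Arrays.asList(1, 2, 3, 4));
        Iterator<Integer> it = set.iterator();
        while (it.hasNext()) {
            if (it.next() % 2 == 0) {
                it.remove();
            }
        }
        return set;
    }

    public static void main(String[] args) {
        System.out.println(structuralChangeIsDetected()); // true
        System.out.println(removeEvensSafely().size());   // 2 (only 1 and 3 remain)
    }
}
```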