HashSet Source Code: Study Notes and Summary

Class hierarchy diagram
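As a rough textual stand-in for a hierarchy diagram, the direct supertypes of HashSet can be read back with reflection. This is only a sketch; the class name HierarchyDemo and its method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

public class HierarchyDemo {
    // Direct superclass of HashSet (AbstractSet in the JDK).
    static Class<?> superclassOfHashSet() {
        return HashSet.class.getSuperclass();
    }

    // Interfaces declared directly on HashSet (Set, Cloneable, Serializable).
    static List<Class<?>> interfacesOfHashSet() {
        return Arrays.asList(HashSet.class.getInterfaces());
    }

    public static void main(String[] args) {
        System.out.println(superclassOfHashSet());
        System.out.println(interfacesOfHashSet());
    }
}
```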

 

Class Javadoc

/**
 * This class implements the <tt>Set</tt> interface, backed by a hash table (actually a <tt>HashMap</tt> instance).  It makes no guarantees as to the iteration order of the set; 
 * in particular, it does not guarantee that the order will remain constant over time.  This class permits the <tt>null</tt> element.
 * Note: HashSet implements the Set interface on top of a HashMap; it does not guarantee iteration order and permits a null element.
 * This class offers constant time performance for the basic operations(<tt>add</tt>, <tt>remove</tt>, <tt>contains</tt> and <tt>size</tt>),
 * assuming the hash function disperses the elements properly among the buckets.  
 * Note: provided the hash function spreads the elements well across the buckets, add, remove, contains, and size run in O(1) time.
 * Iterating over this set requires time proportional to the sum of the <tt>HashSet</tt> instance's size (the number of elements) plus the
 * "capacity" of the backing <tt>HashMap</tt> instance (the number of buckets).  
 * Note: iteration time is proportional to the set's size plus the capacity (number of buckets) of the backing HashMap.
 * Thus, it's very important not to set the initial capacity too high (or the load factor too low) if iteration performance is important.
 * Note: so if iteration performance matters, do not set the initial capacity too high (or the load factor too low).
 * <p><strong>Note that this implementation is not synchronized.</strong> If multiple threads access a hash set concurrently, 
 * and at least one of the threads modifies the set, it <i>must</i> be synchronized externally.
 * Note: HashSet is not synchronized; if several threads access it concurrently and at least one modifies it, it must be synchronized (locked) externally.
 * This is typically accomplished by synchronizing on some object that naturally encapsulates the set.
 * Note: the usual approach is to synchronize on (lock) some object that naturally encapsulates the set.
 * If no such object exists, the set should be "wrapped" using the {@link Collections#synchronizedSet Collections.synchronizedSet} method.
 * Note: if there is no such object, wrap the set with Collections.synchronizedSet.
 * This is best done at creation time, to prevent accidental unsynchronized access to the set:<pre> Set s = Collections.synchronizedSet(new HashSet(...));</pre>
 * Note: do the wrapping at creation time so that the set is never accessed without synchronization, e.g. Set s = Collections.synchronizedSet(new HashSet(...))
 * <p>The iterators returned by this class's <tt>iterator</tt> method are <i>fail-fast</i>: if the set is modified at any time after the iterator is
 * created, in any way except through the iterator's own <tt>remove</tt> method, the Iterator throws a {@link ConcurrentModificationException}.
 * Thus, in the face of concurrent modification, the iterator fails quickly and cleanly, rather than risking arbitrary, non-deterministic behavior at an undetermined time in the future.
 * <p>Note that the fail-fast behavior of an iterator cannot be guaranteed as it is, generally speaking, impossible to make any hard guarantees in the
 * presence of unsynchronized concurrent modification.  Fail-fast iterators
 * throw <tt>ConcurrentModificationException</tt> on a best-effort basis.
 * Therefore, it would be wrong to write a program that depended on this exception for its correctness: <i>the fail-fast behavior of iterators should be used only to detect bugs.</i>
 * Note: the paragraphs above describe the fail-fast behavior of HashSet's iterators.
 *
 * @author Josh Bloch
 * @author Neal Gafter
 * @see Collection
 * @see HashMap
 * @since 1.2
 */
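The behaviors described in the Javadoc above can be checked with a small program. This is a minimal sketch, not JDK code: the class JavadocBehaviorDemo and its method names are hypothetical, and the fail-fast check relies on the best-effort detection the Javadoc itself warns about, so it illustrates the mechanism rather than something to depend on.

```java
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Set;

public class JavadocBehaviorDemo {
    // HashSet permits the null element, stored like any other element.
    static boolean acceptsNull() {
        Set<String> set = new HashSet<>();
        set.add(null);
        return set.contains(null) && set.size() == 1;
    }

    // Adding to the set while iterating (not through the iterator itself)
    // makes the next call to next() throw ConcurrentModificationException.
    static boolean failsFast() {
        Set<Integer> set = new HashSet<>();
        set.add(1);
        set.add(2);
        set.add(3);
        try {
            for (Integer i : set) {
                set.add(i + 100); // structural modification that bypasses the iterator
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Two threads add disjoint ranges to a set wrapped at creation time.
    static int concurrentAdds() {
        Set<Integer> set = Collections.synchronizedSet(new HashSet<>());
        Thread a = new Thread(() -> { for (int i = 0; i < 1000; i++) set.add(i); });
        Thread b = new Thread(() -> { for (int i = 1000; i < 2000; i++) set.add(i); });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println("null allowed: " + acceptsNull());
        System.out.println("fail-fast: " + failsFast());
        System.out.println("size after concurrent adds: " + concurrentAdds());
    }
}
```

Note that the wrapper returned by Collections.synchronizedSet only guards individual method calls; iterating over it still has to happen inside a synchronized block on the wrapper.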

Core fields

// The backing store is a HashMap; every key maps to the same dummy value, the static final PRESENT object below
private transient HashMap<E,Object> map;
// Dummy value to associate with an Object in the backing Map
private static final Object PRESENT = new Object();
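One plausible reason for using a shared non-null dummy value instead of null is that HashMap.put returns the previous value, so a non-null dummy lets the caller tell a first insertion from a duplicate. The sketch below shows this with a plain HashMap; DummyValueDemo and its method names are hypothetical, and the PRESENT field here merely mimics the one above.

```java
import java.util.HashMap;
import java.util.Map;

public class DummyValueDemo {
    // Stand-in for HashSet's private PRESENT field.
    private static final Object PRESENT = new Object();

    // With a non-null dummy, put() returns null only on the first insertion.
    static boolean[] putWithDummy() {
        Map<String, Object> map = new HashMap<>();
        boolean first = map.put("a", PRESENT) == null;
        boolean second = map.put("a", PRESENT) == null;
        return new boolean[] { first, second };
    }

    // With null as the value, put() returns null both times,
    // so the return value cannot reveal a duplicate.
    static boolean[] putWithNull() {
        Map<String, Object> map = new HashMap<>();
        boolean first = map.put("a", null) == null;
        boolean second = map.put("a", null) == null;
        return new boolean[] { first, second };
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(putWithDummy()));
        System.out.println(java.util.Arrays.toString(putWithNull()));
    }
}
```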

Constructors

Each constructor simply initializes the backing HashMap.

public HashSet() {
    map = new HashMap<>();
}
public HashSet(int initialCapacity) {
    map = new HashMap<>(initialCapacity);
}
public HashSet(int initialCapacity, float loadFactor) {
    map = new HashMap<>(initialCapacity, loadFactor);
}
public HashSet(Collection<? extends E> c) {
    map = new HashMap<>(Math.max((int) (c.size()/.75f) + 1, 16));
    addAll(c);
}
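The expression in the Collection constructor requests enough capacity that, at the 0.75 load factor, copying the elements should not force a resize, with a floor of 16. The sketch below only reproduces that arithmetic; CapacityDemo and requestedCapacity are hypothetical names, and the value computed is the capacity requested from HashMap, not necessarily the final table size.

```java
public class CapacityDemo {
    // Same arithmetic as the HashSet(Collection) constructor shown above.
    static int requestedCapacity(int collectionSize) {
        return Math.max((int) (collectionSize / .75f) + 1, 16);
    }

    // True if 'size' elements fit under the 0.75 load-factor threshold
    // of a table with the requested capacity.
    static boolean fitsWithoutResize(int size) {
        return size <= requestedCapacity(size) * 0.75f;
    }

    public static void main(String[] args) {
        for (int size : new int[] { 3, 12, 100 }) {
            System.out.println(size + " elements, requested capacity "
                    + requestedCapacity(size));
        }
    }
}
```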

Key methods

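// Delegates to the backing map, inserting the pair (e, PRESENT); returns false if the set already contains e
public boolean add(E e) {
    return map.put(e, PRESENT)==null;
}
// Delegates to the backing map; returns true only if o was in the set and has been removed
public boolean remove(Object o) {
    return map.remove(o)==PRESENT;
}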
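The return values of the two methods above can be observed from the caller's side. A minimal sketch (AddRemoveDemo and run are hypothetical names):

```java
import java.util.HashSet;
import java.util.Set;

public class AddRemoveDemo {
    // Returns the results of add, duplicate add, remove, and repeated remove, in that order.
    static boolean[] run() {
        Set<String> set = new HashSet<>();
        boolean addedFirst = set.add("a");      // put returned null: new element
        boolean addedAgain = set.add("a");      // put returned PRESENT: duplicate
        boolean removed = set.remove("a");      // remove returned PRESENT: was there
        boolean removedAgain = set.remove("a"); // remove returned null: already gone
        return new boolean[] { addedFirst, addedAgain, removed, removedAgain };
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(run()));
    }
}
```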

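Summary

HashSet is implemented on top of a HashMap: the elements are stored as the map's keys, and every key maps to the same static final dummy value, PRESENT = new Object().

Its iterators are fail-fast (on a best-effort basis), elements are unique, and iteration order is not guaranteed.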

 

posted @ 2022-08-01 16:16  鼠标的博客