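How ThreadLocal Works, and the Magic Number 0x61c88647

ThreadLocal structure

The diagram below shows the reference relationships among the objects discussed in this article. Solid lines are strong references and dashed lines are weak references:

ThreadLocal's hash code

Both get and set in ThreadLocalMap locate an entry through an index i: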

 int i = key.threadLocalHashCode & (len-1);

The key piece here is threadLocalHashCode.

The demos below imitate ThreadLocal to show how threadLocalHashCode is generated.

Single thread, multiple instances

import java.util.concurrent.atomic.AtomicInteger;

public class ThreadLocalMapDemo {

	private final int threadLocalHashCode = nextHashCode();

	private static AtomicInteger nextHashCode =
			new AtomicInteger();

	private static final int HASH_INCREMENT = 0x61c88647;

	private static int nextHashCode() {
		return nextHashCode.getAndAdd(HASH_INCREMENT);
	}

	public static void main(String[] args) {
		System.out.println(new ThreadLocalMapDemo().threadLocalHashCode);
		System.out.println(new ThreadLocalMapDemo().threadLocalHashCode);
		System.out.println(new ThreadLocalMapDemo().threadLocalHashCode);
		System.out.println(new ThreadLocalMapDemo().threadLocalHashCode);
	}
}

Output:

0
1640531527
-1013904242
626627285

Multiple threads, one instance per thread

import java.util.concurrent.atomic.AtomicInteger;

public class ThreadLocalMapDemo {

	private final int threadLocalHashCode = nextHashCode();

	private static AtomicInteger nextHashCode =
			new AtomicInteger();

	private static final int HASH_INCREMENT = 0x61c88647;

	private static int nextHashCode() {
		return nextHashCode.getAndAdd(HASH_INCREMENT);
	}

	public static void main(String[] args) {
		for(int i=0;i<5;i++){
			new Thread(() -> {
				System.out.println("threadName:"+Thread.currentThread().getName()+":"+new ThreadLocalMapDemo().threadLocalHashCode);
			}).start();
		}
	}
}

Output (the line order and the pairing of thread name to value may differ between runs):

threadName:Thread-0:0
threadName:Thread-1:1640531527
threadName:Thread-2:-1013904242
threadName:Thread-3:626627285
threadName:Thread-4:-2027808484

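The ThreadLocal object and the thread-local value are wrapped as a key-value pair in an Entry, which is stored in the array slot computed by the hash mapping above.

Every new ThreadLocal instance gets a different threadLocalHashCode, so the entries are spread evenly across the table array.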
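The even spread can be checked with a small sketch. It assumes the same increment as above and a power-of-two table length; HashSpreadDemo and distinctSlots are names made up for this illustration and are not part of the JDK. Because the increment is odd, the first len hash codes should land in len different slots.

```java
import java.util.HashSet;
import java.util.Set;

public class HashSpreadDemo {
    // Same value as HASH_INCREMENT in the demos above
    private static final int HASH_INCREMENT = 0x61c88647;

    /** Counts the distinct slots hit by the first len hash codes in a table of size len (a power of two). */
    static int distinctSlots(int len) {
        Set<Integer> slots = new HashSet<>();
        int hash = 0;
        for (int n = 0; n < len; n++) {
            slots.add(hash & (len - 1)); // same index formula as ThreadLocalMap
            hash += HASH_INCREMENT;      // int overflow wraps around, like AtomicInteger.getAndAdd
        }
        return slots.size();
    }

    public static void main(String[] args) {
        for (int len : new int[] {16, 32, 64}) {
            System.out.println("len=" + len + " distinct slots=" + distinctSlots(len));
        }
    }
}
```

In a real ThreadLocalMap the table only holds the ThreadLocals that a given thread actually uses, so the codes it sees are not necessarily consecutive and collisions can still happen.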

Setting the initial value

The method that sets the initial value is shown below:

private T setInitialValue() {
  T value = initialValue();
  Thread t = Thread.currentThread();
  ThreadLocalMap map = getMap(t);
  if (map != null)
    map.set(this, value);
  else
    createMap(t, value);
  return value;
}

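This method is private, so it cannot be overridden.

First, the initial value is obtained from initialValue(). That method is protected and returns null by default, so typical usage overrides it. The counter example at the end of this section does so in an anonymous inner class.

Next, the ThreadLocalMap of the current thread is fetched. If it is not null, the mapping from this ThreadLocal object to the initial value is added to the thread's ThreadLocalMap directly. If it is null, the ThreadLocalMap is created first and the mapping is then added to it.

Thread safety of ThreadLocalMap is not a concern here. Each thread has exactly one ThreadLocalMap, and only that thread accesses it. The map is never shared between threads, so no thread-safety problem arises.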

 private static ThreadLocal<StringBuilder> counter = new ThreadLocal<StringBuilder>() {
      @Override
      protected StringBuilder initialValue() {
        return new StringBuilder();
      }
    };
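A minimal sketch of the per-thread isolation described above. PerThreadBuilderDemo, BUFFER, append and appendOnNewThread are hypothetical names for this illustration; ThreadLocal.withInitial (Java 8+) is used as a shorter alternative to overriding initialValue().

```java
public class PerThreadBuilderDemo {
    // Each thread lazily gets its own StringBuilder on its first get()
    private static final ThreadLocal<StringBuilder> BUFFER =
            ThreadLocal.withInitial(StringBuilder::new);

    /** Appends text to the calling thread's own StringBuilder and returns its content. */
    static String append(String text) {
        return BUFFER.get().append(text).toString();
    }

    /** Runs append(text) on a fresh thread and returns what that thread saw. */
    static String appendOnNewThread(String text) throws InterruptedException {
        String[] result = new String[1];
        Thread t = new Thread(() -> result[0] = append(text));
        t.start();
        t.join(); // join() makes the write to result[0] visible to the caller
        return result[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(append("main-1"));
        // The worker starts from an empty builder and does not see "main-1"
        System.out.println(appendOnNewThread("worker"));
        System.out.println(append(",main-2"));
    }
}
```

The worker thread never sees what the main thread appended, and the main thread's builder keeps accumulating across calls.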

The set method

Code inside ThreadLocal:

public void set(T value) {
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null)
        map.set(this, value);
    else
        createMap(t, value);
}

ThreadLocalMap getMap(Thread t) {
    return t.threadLocals;
}

void createMap(Thread t, T firstValue) {
    t.threadLocals = new ThreadLocalMap(this, firstValue);
}

Code inside ThreadLocalMap:

 ThreadLocalMap(ThreadLocal<?> firstKey, Object firstValue) {
            table = new Entry[INITIAL_CAPACITY];
            int i = firstKey.threadLocalHashCode & (INITIAL_CAPACITY - 1);
            table[i] = new Entry(firstKey, firstValue);
            size = 1;
            setThreshold(INITIAL_CAPACITY);
        }

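ThreadLocalMap.set

/**
  * Set the value associated with key.
  *     Stores a key-value pair. Note that Entry is not a linked-list node, so
  *     ThreadLocalMap is backed by a plain array.
  *     Collisions are resolved by linear probing, and the magic 0x61c88647 keeps
  *     collisions rare in the first place.
  *     If a stale slot is met while probing, it is taken over (this involves moving
  *     entries and expunging slots).
  *     If no stale slot was cleaned and the size has reached the threshold, rehash.
  * @param key the thread local object
  * @param value the value to be set
  */
private void set(ThreadLocal key, Object value) {
    Entry[] tab = table;
    int len = tab.length;// array capacity
    // Compute the index the same way HashMap does: index = key.hashCode() & (cap - 1),
    // an optimized modulo that works because the capacity is a power of two
    int i = key.threadLocalHashCode & (len-1);
    for (Entry e = tab[i]; e != null; e = tab[i = nextIndex(i, len)]) {
        ThreadLocal k = e.get();
        // The key already exists: just replace the value
        if (k == key) {
            e.value = value;
            return;
        }
        // The current slot is stale: expunge it and take it over
        if (k == null) {
            replaceStaleEntry(key, value, i);
            return;
        }
        // Otherwise keep probing until the key, a stale slot, or an empty slot is found
    }
    tab[i] = new Entry(key, value);
    int sz = ++size;
    // Rehash only if nothing was cleaned AND the size has reached the threshold
    // cleanSomeSlots is given the current number of entries as its scan budget
    if (!cleanSomeSlots(i, sz) && sz >= threshold)
        rehash();
}
/**
  * Increment i modulo len: add 1, wrapping around to 0 at the end of the array.
  */
private static int nextIndex(int i, int len) {
    return ((i + 1 < len) ? i + 1 : 0);
}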

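The entry fetched at index i may have a null key. (Think about it: why would an entry stored in a ThreadLocalMap have a null key?)

When a ThreadLocal object is no longer used and has been garbage collected, the value that belongs to it in each thread is still strongly referenced by that thread's ThreadLocalMap Entry. The value cannot be collected, which may cause a memory leak.

To counter this, the set method of ThreadLocalMap calls replaceStaleEntry, which sets the value of entries whose key is null to null so that the value can be collected. In addition, the rehash method uses expungeStaleEntry to null out the table slots of entries whose key is null, so that the Entry itself can be collected. This is how ThreadLocal mitigates memory leaks, although the cleanup only runs when the map is actually accessed.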
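The stale-entry state can be reproduced deterministically with a small sketch. StaleEntryDemo and its Entry class are hypothetical stand-ins that copy only the shape of ThreadLocalMap.Entry (weak key, strong value). Reference.clear() is used here to simulate what the garbage collector does once the key is only weakly reachable, because a real GC cannot be forced reliably.

```java
import java.lang.ref.WeakReference;

public class StaleEntryDemo {
    /** Hypothetical stand-in for ThreadLocalMap.Entry: the key is weak, the value is strong. */
    static class Entry extends WeakReference<Object> {
        Object value;

        Entry(Object key, Object value) {
            super(key);
            this.value = value;
        }
    }

    /** An entry is stale when it still exists but its key has been cleared. */
    static boolean isStale(Entry e) {
        return e != null && e.get() == null;
    }

    public static void main(String[] args) {
        Object key = new Object();
        Entry e = new Entry(key, new byte[1024]);
        System.out.println("key alive: " + (e.get() == key));

        e.clear(); // simulation of GC clearing the weak key (an assumption of this sketch)
        System.out.println("stale: " + isStale(e));
        System.out.println("value still held: " + (e.value != null));

        e.value = null; // the manual step that expungeStaleEntry performs on the value
    }
}
```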

replaceStaleEntry (expunge and take over the stale slot)

 /**
 * Replace a stale entry encountered during a set operation with an entry 
 * for the specified key.  The value passed in the value parameter is stored in
 * the entry, whether or not an entry already exists for the specified key.
 *   During set, the new entry replaces a stale one (it takes over the stale entry's slot).
 * As a side effect, this method expunges all stale entries in the
 * "run" containing the stale entry.  (A run is a sequence of entries
 * between two null slots.)
 *  Side effect: all stale entries between the two empty slots around the stale slot are removed.
 * @param  key the key
 * @param  value the value to be associated with key
 * @param  staleSlot index of the first stale entry encountered while searching for key. 
 *      Important: a "stale slot" always means a slot whose key is null. The key (the ThreadLocal)
 *      is held through a weak reference, so GC may reclaim it and leave the key null, while the
 *      Entry in that slot and its value are not necessarily reclaimed.
 */
private void replaceStaleEntry(ThreadLocal key, Object value, int staleSlot) {
    Entry[] tab = table;
    int len = tab.length;
    Entry e;
    int slotToExpunge = staleSlot;// start of the cleanup, initially the stale slot itself
    // Steps 1 and 2 together remove all stale entries in the run around the stale slot
    // 1. Scan backward from the stale slot and stop at the first empty slot,
    //    remembering the earliest stale slot seen on the way
    for (int i = prevIndex(staleSlot, len);
         (e = tab[i]) != null;i = prevIndex(i, len))
        // A stale entry earlier in the run: record its index
        if (e.get() == null) 
            slotToExpunge = i;
    //Find either the key or trailing null slot of run, whichever occurs first
    // 2. Scan forward from the stale slot and stop at the key or at the first empty slot
    for (int i = nextIndex(staleSlot, len);
            (e = tab[i]) != null; i = nextIndex(i, len)) {
        ThreadLocal k = e.get();
        // The key was found first: replace its value
        if (k == key) {
            e.value = value;
            tab[i] = tab[staleSlot];// the stale entry moves to slot i
            tab[staleSlot] = e;// the live entry takes over the stale slot
            // Start expunge at preceding stale entry if it exists
            // If the backward scan found nothing, cleanup starts at i,
            // which now holds the stale entry
            if (slotToExpunge == staleSlot)
                slotToExpunge = i;
            cleanSomeSlots(expungeStaleEntry(slotToExpunge), len);
            return;
        }
        // If we didn't find stale entry on backward scan, the
        // first stale entry seen while scanning for key is the
        // first still present in the run.
        // If the backward scan found nothing and this slot is stale,
        // it becomes the starting point of the cleanup
        if (k == null && slotToExpunge == staleSlot)
            slotToExpunge = i;
    }
    // If key not found, put new entry in stale slot
    // The key does not exist: the new entry simply takes over the stale slot
    tab[staleSlot].value = null;// clear the stale value first to avoid a leak
    tab[staleSlot] = new Entry(key, value);// take over the slot
    // If there are any other stale entries in run, expunge them
    // If other stale entries exist in the run, expunge them
    if (slotToExpunge != staleSlot)
        cleanSomeSlots(expungeStaleEntry(slotToExpunge), len);
}

cleanSomeSlots

/**
  * Heuristically scan some cells looking for stale entries.
  * This is invoked when either a new element is added, or
  * another stale one has been expunged. It performs a
  * logarithmic number of scans, as a balance between no
  * scanning (fast but retains garbage) and a number of scans
  * proportional to number of elements, that would find all
  * garbage but would cause some insertions to take O(n) time.
  *     Called when a new element is added or a stale one has been expunged; it scans some
  *     slots for stale entries and cleans them.
  *     To balance between no scanning and a full scan, it performs a logarithmic number of scans.
  *     A full scan would find all garbage but would make some insertions take O(n) time.
  * @param i a position known NOT to hold a stale entry. The
  *     scan starts at the element after i. (The scan starts after this slot, which is known not to be stale.)
  * @param n scan control: <tt>log2(n)</tt> cells are scanned,
  * unless a stale entry is found, in which case <tt>log2(table.length)-1</tt>
  * additional cells are scanned.When called from insertions,this parameter is the number
  * of elements, but when from replaceStaleEntry, it is the table length.
  *     About log2(n) slots are scanned. When called from an insertion, n is the number of
  *     elements; when called from replaceStaleEntry, n is the array capacity.
  * But this version is simple, fast, and seems to work well.
  *     The JDK authors describe this version as simple, fast, and working well. Readers can
  *     experiment themselves (the behavior mainly depends on how n is weighted).
  * @return true if any stale entries have been removed.
  *     Returns true once at least one stale entry has been removed.
  */
private boolean cleanSomeSlots(int i, int n) {
    boolean removed = false;
    Entry[] tab = table;
    int len = tab.length;
    // Loosely similar to the skipping idea of a skip list, except that a skip list trades
    // space for time while this is just a bounded walk
    do {
        i = nextIndex(i, len);
        Entry e = tab[i];
        // Found a stale slot (non-null entry whose key is null)
        if (e != null && e.get() == null) {
            n = len;
            removed = true;// mark success as soon as a stale slot is found
            // The flag is set just before the cleanup, but expungeStaleEntry runs on the
            // next line, so the entry really has been removed by the time we return
            i = expungeStaleEntry(i);
        }
    } while ( (n >>>= 1) != 0);// shift right by one bit per round: about log2(n) rounds
    // Math refresher: log2 is the inverse of 2^n, so halving n until it reaches 0
    // takes about log2(n) steps
    return removed;
}
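To make the scan budget concrete, the sketch below counts how often the do/while loop body runs for a given n when no stale entry is found. ScanCountDemo and scans are names invented for this illustration. The count comes out as floor(log2(n)) + 1 for n >= 1, because the body runs once before the first shift.

```java
public class ScanCountDemo {
    /** Counts iterations of do { ... } while ((n >>>= 1) != 0), assuming no stale entry resets n. */
    static int scans(int n) {
        int count = 0;
        do {
            count++;                 // one slot would be inspected here
        } while ((n >>>= 1) != 0);   // same loop condition as cleanSomeSlots
        return count;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 12, 16, 1024}) {
            System.out.println("n=" + n + " scans=" + scans(n));
        }
    }
}
```

In the real method, finding a stale entry resets n to the table length, so the actual number of scans can be higher than this count.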

expungeStaleEntry

/**
  * Expunge a stale entry by rehashing any possibly colliding entries
  * lying between staleSlot and the next null slot. This also expunges
  * any other stale entries encountered before the trailing null.  
  *     Removes stale entries between the given stale slot and the next empty slot.
  *     The method does two things:
  *         1. It clears the given stale slot.
  *         2. It walks the array from the next slot and stops at the first empty slot:
  *             2.1 If the key is null, the stale slot is cleared.
  *             2.2 If the key is not null but its recomputed index differs from the current
  *                 slot, the current slot is cleared and the entry moves to a new empty slot.
  * @param staleSlot index of slot known to have null key
  * @return the index of the next null slot after staleSlot (the first empty slot after the stale slot)
  * (all between staleSlot and this slot will have been checked for expunging).
  *     All stale entries between the stale slot and that empty slot will have been removed.
  */
private int expungeStaleEntry(int staleSlot) {
    Entry[] tab = table;
    int len = tab.length;// note: the array capacity
    // expunge entry at staleSlot (drop the stale entry to help GC)
    tab[staleSlot].value = null;//1.value help gc
    tab[staleSlot] = null;//2.slot help gc
    size--;
    // Rehash until we encounter null
    Entry e;
    int i;
    // Walk the array starting from the slot after the stale slot
    for (i = nextIndex(staleSlot, len);
        // (e = tab[i]) != null means the walk stops at the first empty slot
         (e = tab[i]) != null; i = nextIndex(i, len)) {
        ThreadLocal k = e.get();
        // Null key: clear the entry to prevent a leak, help gc
        if (k == null) {
            // The ThreadLocal is gone, so drop the value and the slot to help GC;
            // this also frees a slot for later use
            e.value = null;
            tab[i] = null;
            size--;
        } else {
            // The key is still alive: recompute its home index (h as in hash)
            int h = k.threadLocalHashCode & (len - 1);
            // h != i means the entry was pushed away from its home slot by linear probing
            // (len does not change inside this method)
            if (h != i) {
                // Clear the current slot and move the entry as close to its home slot as possible
                tab[i] = null;
                // Unlike Knuth 6.4 Algorithm R, we must scan until
                // null because multiple entries could have been stale.
                // Probe forward from h until an empty slot is found
                while (tab[h] != null)
                    h = nextIndex(h, len);
                tab[h] = e;
            }
        }
    }
    return i;
}
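The reason the entries after a removed slot must be re-placed can be seen in a simplified open-addressing table. ProbeRehashDemo is a hypothetical toy in which int keys double as their own hash codes and the table never fills up. It only mirrors the re-placement branch of expungeStaleEntry, not the weak-reference handling.

```java
public class ProbeRehashDemo {
    private final Integer[] tab;

    ProbeRehashDemo(int len) {       // len is assumed to be a power of two
        tab = new Integer[len];
    }

    private int next(int i) { return (i + 1 < tab.length) ? i + 1 : 0; }

    private int home(int key) { return key & (tab.length - 1); }

    /** Inserts with linear probing (no duplicate check; the table is assumed never full). */
    void put(int key) {
        int i = home(key);
        while (tab[i] != null) i = next(i);
        tab[i] = key;
    }

    /** Probes from the home slot until the key or an empty slot is met; returns -1 if absent. */
    int indexOf(int key) {
        for (int i = home(key); tab[i] != null; i = next(i))
            if (tab[i] == key) return i;
        return -1;
    }

    /** Empties a slot, then re-places every following entry of the run. */
    void removeAt(int slot) {
        tab[slot] = null;
        for (int i = next(slot); tab[i] != null; i = next(i)) {
            int key = tab[i];
            int h = home(key);
            if (h != i) {            // entry is away from its home slot
                tab[i] = null;
                while (tab[h] != null) h = next(h);
                tab[h] = key;
            }
        }
    }

    public static void main(String[] args) {
        ProbeRehashDemo t = new ProbeRehashDemo(8);
        t.put(1); t.put(9); t.put(17);   // all three have home slot 1
        System.out.println("17 at " + t.indexOf(17));
        t.removeAt(1);
        System.out.println("9 at " + t.indexOf(9) + ", 17 at " + t.indexOf(17));
    }
}
```

If slot 1 were simply set to null without re-placing the run, a lookup of 9 or 17 would start at slot 1, hit the empty slot immediately, and wrongly report the key as absent.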

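The magic number 0x61c88647 and collision resolution

  • Attentive readers will have noticed that ThreadLocalMap does not use linked lists or red-black trees to handle hash collisions. It maintains the whole hash table with a plain array, so getting a good spread of hash values is critical.
  • ThreadLocalMap combines three design choices to deal with this:
    • 1. The key of an Entry is a weak reference, so it can be collected once no strong reference to the ThreadLocal remains. Entries go stale quickly and more slots become free.
    • 2. (Within one table) collisions are resolved with open addressing using linear probing.
    • 3. (Across multiple ThreadLocals) the magic 0x61c88647 improves the spread of hash codes and greatly reduces the chance of a collision.
  • As for why this value is used instead of incrementing by 1, the author's guess is that it helps when looking for the nearest empty slot (hash codes that jump around leave more free slots nearby than consecutive ones, i.e. spreading out), and that it also gives a good distribution.
  • 0x61c88647 is related to the golden ratio and Fibonacci numbers; see "What is the meaning of 0x61C88647 constant in ThreadLocal.java".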
private static final int HASH_INCREMENT = 0x61c88647;
/**
 * Returns the next hash code.
 *  The hash code of each new ThreadLocal is the previous one plus HASH_INCREMENT
 */
private static int nextHashCode() {
    //the previous id + our magic number
    return nextHashCode.getAndAdd(HASH_INCREMENT); 
}
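The link to the golden ratio can be checked with a few lines of arithmetic. This is a sketch of one common explanation, not something stated in the JDK source: 2^32 divided by the golden ratio, truncated, is 0x9e3779b9, and negating its 32-bit int form gives 0x61c88647. MagicNumberDemo and its method names are made up for this illustration.

```java
public class MagicNumberDemo {
    /** 2^32 divided by the golden ratio, truncated to a whole number. */
    static long goldenRatio32() {
        double phi = (1 + Math.sqrt(5)) / 2;
        return (long) ((1L << 32) / phi);
    }

    /** Narrows the value to a 32-bit int (which makes it negative) and negates it. */
    static int magic() {
        return -(int) goldenRatio32();
    }

    public static void main(String[] args) {
        System.out.println(Long.toHexString(goldenRatio32()));
        System.out.println(Integer.toHexString(magic()));
    }
}
```

The two hex values add up to exactly 2^32, so adding 0x61c88647 modulo 2^32 is the same as subtracting 0x9e3779b9.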

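ThreadLocal and memory leaks

Usage mistakes that lead to memory leaks

  • 1. Using a static ThreadLocal extends the lifetime of the ThreadLocal and may cause a memory leak.
  • 2. Allocating and using a ThreadLocal and then never calling get(), set() or remove() again causes a memory leak, because stale entries are only cleaned up inside those calls.
  • 3. With a thread pool the current thread does not necessarily exit (a fixed-size pool, for example). Putting large objects into a ThreadLocal there may leak memory: the object is no longer used, but it cannot be collected because a reference to it still exists.

The root cause of ThreadLocal memory leaks

  • One thing must be clear first: the design of ThreadLocal itself does not cause memory leaks. The cause is mostly incorrect usage.
  • The ThreadLocalMap object is held by the Thread object. When the thread exits, the Thread class performs cleanup, including dropping its ThreadLocalMap. Until then the reference to that ThreadLocalMap is not released.
// First, a look back at the exit method of Thread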
/**
  * This method is called by the system to give a Thread
  * a chance to clean up before it actually exits.
  */
private void exit() {
    if (group != null) {
        group.threadTerminated(this);
        group = null;
    }
    /* Aggressively null out all reference fields: see bug 4006245 */
    target = null;
    /* Speed the release of some of these resources */
    threadLocals = null;// drop the reference to the ThreadLocalMap
    inheritableThreadLocals = null;
    inheritedAccessControlContext = null;
    blocker = null;
    uncaughtExceptionHandler = null;
}
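  • Root cause: the key of an Entry is a weak reference (see the note below), and on each GC the JVM reclaims objects that are only weakly reachable. When no strong reference to a ThreadLocal remains outside the map, it is collected automatically, which is equivalent to setting the key in the map to null. The problem is that the entry and the value belonging to that key are not collected along with it.
  • As long as the Entry and the value have not been cleaned up, the thread keeps a strong reference to the Entry until the thread dies, and that is the memory leak.
  • Advice: when the object should become collectable, call ThreadLocal.remove() to remove the variable explicitly. This deletes the entry from the current thread's map, so the value becomes unreachable and can be reclaimed by a later GC.
  • Note: ThreadLocal itself is not a weak reference. Entry extends WeakReference and wraps its own key as the referent, so the real weak reference is the key of the Entry, and that key happens to be the ThreadLocal.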
static class Entry extends WeakReference<ThreadLocal<?>> {
    Object value;
    Entry(ThreadLocal<?> k, Object v) {
        // This is where the weak reference is actually created
        super(k);// the key becomes the weak referent, and the key is the ThreadLocal
        value = v;
    }
}
public class WeakReference<T> extends Reference<T> {
    public WeakReference(T referent) {
        super(referent);
    }
    public WeakReference(T referent, ReferenceQueue<? super T> q) {
        super(referent, q);
    }
}
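The advice about remove() matters most with pooled threads, because a pooled thread outlives the task that set the value. The sketch below is a hypothetical illustration (PoolReuseDemo, CONTEXT and secondTaskSees are invented names). It runs two tasks on the same single-thread executor and reports what the second task reads, with and without a remove() in a finally block.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PoolReuseDemo {
    private static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    /** Runs two tasks on one pooled thread and returns what the second task reads. */
    static String secondTaskSees(boolean cleanUp) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Runnable first = () -> {
                CONTEXT.set("request-1 data");
                try {
                    // ... the actual work of the task would go here
                } finally {
                    if (cleanUp) CONTEXT.remove(); // drop the entry before the thread is reused
                }
            };
            pool.submit(first).get();              // wait until the first task is done

            Callable<String> second = CONTEXT::get; // runs on the same pooled thread
            return pool.submit(second).get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("without remove(): " + secondTaskSees(false));
        System.out.println("with remove():    " + secondTaskSees(true));
    }
}
```

Without the cleanup, the second task still sees the value left behind by the first one, which is both a leak and a source of stale data.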

A test that imitates the ThreadLocalMap structure

import java.util.HashMap;

public class AnalogyThreadLocalDemo {

	public static void main(String[] args) {
		HashMap<Obj, String> map = new HashMap<>();
		Obj o1 = new Obj();
		Obj o2 = new Obj();
		map.put(o1, "o1");
		map.put(o2, "o2");
		o1 = null;
		System.gc();
		System.out.println("##########o1 gc:" + map);
		o2 = null;
		System.gc();
		System.out.println("##########o2 gc:" + map);
		map.clear();
		System.gc();
		System.out.println("##########GC after map clear:" + map);
	}
}

class Obj {
	private final String DESC = "obj exists";
	@Override
	public String toString() {
		return DESC;
	}
	@Override
	protected void finalize() throws Throwable {
		System.out.println("##########gc over");
	}
}

VM options used (JDK 8-style GC logging flags):

-verbose:gc
-XX:+PrintGCDetails
-XX:+PrintTenuringDistribution
-XX:+PrintGCTimeStamps

Output:

0.316: [GC (System.gc()) 
Desired survivor size 11010048 bytes, new threshold 7 (max 15)
[PSYoungGen: 7911K->1290K(76288K)] 7911K->1298K(251392K), 0.0025504 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
0.319: [Full GC (System.gc()) [PSYoungGen: 1290K->0K(76288K)] [ParOldGen: 8K->1194K(175104K)] 1298K->1194K(251392K), [Metaspace: 3310K->3310K(1056768K)], 0.0215288 secs] [Times: user=0.00 sys=0.00, real=0.02 secs] 
##########o1 gc:{obj exists=o1, obj exists=o2}
0.342: [GC (System.gc()) 
Desired survivor size 11010048 bytes, new threshold 7 (max 15)
[PSYoungGen: 1310K->64K(76288K)] 2504K->1258K(251392K), 0.0002418 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
0.342: [Full GC (System.gc()) [PSYoungGen: 64K->0K(76288K)] [ParOldGen: 1194K->964K(175104K)] 1258K->964K(251392K), [Metaspace: 3322K->3322K(1056768K)], 0.0058113 secs] [Times: user=0.00 sys=0.00, real=0.01 secs] 
##########o2 gc:{obj exists=o1, obj exists=o2}
0.348: [GC (System.gc()) 
Desired survivor size 11010048 bytes, new threshold 7 (max 15)
[PSYoungGen: 1310K->32K(76288K)] 2275K->996K(251392K), 0.0002203 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
0.349: [Full GC (System.gc()) [PSYoungGen: 32K->0K(76288K)] [ParOldGen: 964K->964K(175104K)] 996K->964K(251392K), [Metaspace: 3322K->3322K(1056768K)], 0.0055209 secs] [Times: user=0.00 sys=0.00, real=0.01 secs] 
##########gc over
##########gc over
##########GC after map clear:{}
Heap
 PSYoungGen      total 76288K, used 3932K [0x000000076af00000, 0x0000000770400000, 0x00000007c0000000)
  eden space 65536K, 6% used [0x000000076af00000,0x000000076b2d7248,0x000000076ef00000)
  from space 10752K, 0% used [0x000000076ef00000,0x000000076ef00000,0x000000076f980000)
  to   space 10752K, 0% used [0x000000076f980000,0x000000076f980000,0x0000000770400000)
 ParOldGen       total 175104K, used 964K [0x00000006c0c00000, 0x00000006cb700000, 0x000000076af00000)
  object space 175104K, 0% used [0x00000006c0c00000,0x00000006c0cf1240,0x00000006cb700000)
 Metaspace       used 3328K, capacity 4500K, committed 4864K, reserved 1056768K
  class space    used 355K, capacity 388K, committed 512K, reserved 1048576K

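As the output shows, the Obj instances are finalized and collected only after map.clear(). A HashMap holds its keys strongly, so setting o1 and o2 to null is not enough. ThreadLocalMap avoids this for its keys by holding them through weak references.

Summary

  • ThreadLocal does not solve the problem of sharing data between threads.
  • ThreadLocal avoids thread-safety problems by implicitly creating an independent copy of the instance in each thread.
  • Each thread holds a Map that maintains the mapping from ThreadLocal objects to the actual instances. Because that Map is only accessed by the thread that holds it, there are no thread-safety or locking concerns.
  • The Entry of a ThreadLocalMap references the ThreadLocal weakly, which prevents the ThreadLocal object from becoming uncollectable.
  • The set method of ThreadLocalMap calls replaceStaleEntry to release the value (the actual instance) of entries whose key is null, as well as the Entry object itself, which guards against memory leaks.
  • ThreadLocal fits scenarios where a variable must be isolated between threads and shared between methods.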

posted @ 2019-12-27 17:35  hongdada