低版本spring-cloud-sleuth的TraceableExecutorService一个坑,注意绕行

我们知道Java 线程池提交任务有两种:

1,submit 会抛出异常,会有异常日志

2,execute 没有抛出异常,需要执行返回值Future的get方法触发抛出异常

execute在执行中没有做任何包装,在线程池源码中会执行到runWorker方法,这个方法中可以看到会把异常抛出,我们也可以通过覆盖afterExecute方法对异常进行自定义操作。这也就是execute会有异常信息打印的原因。

final void runWorker(Worker w) {
    Thread wt = Thread.currentThread();
    Runnable task = w.firstTask;
    w.firstTask = null;
    w.unlock(); // allow interrupts
    boolean completedAbruptly = true;
    try {
        while (task != null || (task = getTask()) != null) {
            w.lock();
            // If pool is stopping, ensure thread is interrupted;
            // if not, ensure thread is not interrupted.  This
            // requires a recheck in second case to deal with
            // shutdownNow race while clearing interrupt
            if ((runStateAtLeast(ctl.get(), STOP) ||
                 (Thread.interrupted() &&
                  runStateAtLeast(ctl.get(), STOP))) &&
                !wt.isInterrupted())
                wt.interrupt();
            try {
                beforeExecute(wt, task);
                Throwable thrown = null;
                try {
                    task.run();
                } catch (RuntimeException x) {
                    thrown = x; throw x;
                } catch (Error x) {
                    thrown = x; throw x;
                } catch (Throwable x) {
                    thrown = x; throw new Error(x);
                } finally {
                    afterExecute(task, thrown);
                }
            } finally {
                task = null;
                w.completedTasks++;
                w.unlock();
            }
        }
        completedAbruptly = false;
    } finally {
        processWorkerExit(w, completedAbruptly);
    }
}

submit在提交过程中会把任务包装成FutureTask,FutureTaskrun方法:

public void run() {
    if (state != NEW ||
        !UNSAFE.compareAndSwapObject(this, runnerOffset,
                                     null, Thread.currentThread()))
        return;
    try {
        Callable<v> c = callable;
        if (c != null && state == NEW) {
            V result;
            boolean ran;
            try {
                result = c.call();
                ran = true;
            } catch (Throwable ex) {
                result = null;
                ran = false;
                setException(ex);
            }
            if (ran)
                set(result);
        }
    } finally {
        // runner must be non-null until state is settled to
        // prevent concurrent calls to run()
        runner = null;
        // state must be re-read after nulling runner to prevent
        // leaked interrupts
        int s = state;
        if (s >= INTERRUPTING)
            handlePossibleCancellationInterrupt(s);
    }
}

看代码就知道,它不会直接把在执行任务的时候把异常直接抛出来,而是执行了 setException(ex);方法把异常存了起来,等到get方法请求时触发异常。

话要从前几天一个问题讲起,我们应用使用Spring Cloud,我对应用使用的线程池进行统一管理,研发在使用execute方法提交任务后,没有看到执行任务的代码,也没有任何报错信息。

根据前面的分析,execute有异常是会抛出异常也是有日志信息的,我调试了一下异常情况,居然没有异常抛出,发现我们的线程池都被TraceableExecutorService代理了,打开这个类,我就惊呆了,execute方法被篡改成submit提交了。

Spring Cloud 里会使用spring-cloud-sleuth来做日志链路跟踪,一种常见的场景就是在应用中线程池使用的线程,如何把跟踪的用的traceId 传递过去,Spring的做法是把所有ExecutorService类型的Bean都用 TraceableExecutorService 进行代理,这个代理类生产代码实现在ExecutorBeanPostProcessor里的postProcessAfterInitialization方法中,具体代码:

if (bean instanceof ExecutorService
      && !(bean instanceof TraceableExecutorService)) {
   if (isProxyNeeded(beanName)) {
      return wrapExecutorService(bean);
   }
   else {
      log.info("Not instrumenting bean " + beanName);
   }
}

注意,我使用的是Spring Cloud 的版本是Greenwich.SR1,它依赖的spring-cloud-starter-sleuth2.1.1.RELEASE而这个版本的TraceableExecutorService实现了execute代理方法,把一个execute篡改成submit提交任务,带来的后果就是预期的异常抛出却没有抛出。

public void execute(Runnable command) {
	this.delegate.submit(ContextUtil.isContextInCreation(this.beanFactory) ? command
			: new TraceRunnable(tracing(), spanNamer(), command, this.spanName));
}

这个是bug产生时的改动:

https://github.com/spring-cloud/spring-cloud-sleuth/commit/17d8165c5e1433cf7b2943ade53f06bdd12a86de#diff-7b796c619250cf07f1d1f80d53c9e32d140e268fdfcea3bee20460fc731c0105

这个是修复bug的描述:

https://github.com/spring-cloud/spring-cloud-sleuth/commit/e2e1cdb336ec89b26fcd22116510c2d5eee6924e

在我看来这是个低级错误,复制粘贴代码出的错,带来的影响虽然不大,但在一些场景没有异常信息其实是很致命的,甚至会导致一个bug隐藏很久或者一个bug要查很久。

posted on 2021-09-28 13:55  每当变幻时  阅读(489)  评论(0编辑  收藏  举报

导航