How ANR Works: An Analysis

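Preface:

ANR stands for Application Not Responding. When one is triggered, the system shows a dialog to inform the user.

This post covers the following points:

  1. ANR trigger scenarios
  2. How an ANR is produced
  3. How to analyze an ANR
  4. ANR monitoring

1. ANR trigger scenarios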

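| ANR type | Timeout | Error message |
| --- | --- | --- |
| Input event (key, touch, etc.) | 5 s | Input event dispatching timed out |
| BroadcastReceiver | foreground 10 s; background/offload 60 s | Receiver during timeout of |
| Service | 10 s to call startForeground() after startForegroundService(); 20 s when executing in the foreground (SERVICE_TIMEOUT); 200 s in the background (SERVICE_BACKGROUND_TIMEOUT) | Timeout executing service |
| ContentProvider | 10 s | timeout publishing content providers |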

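2. How an ANR is produced

This section walks through how ANRs are produced for broadcasts, services, content providers, and input events.

The process can be summarized as:
Before the event is handled, a delayed message is sent through a Handler.
If the event completes in time, the delayed message is removed.
Otherwise the delayed message fires, and an ANR follows.

Some blogs liken these three steps to planting a bomb, defusing it, and detonating it, which is a very fitting analogy.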
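
The plant/defuse/detonate pattern can be sketched in plain Java, without any Android classes. This is only an illustration under stated assumptions: `TimeoutBomb` and `run` are hypothetical names, and a `ScheduledExecutorService` stands in for the framework's Handler and delayed message.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for the Handler-based timeout pattern; not framework code. */
final class TimeoutBomb {
    /** Returns true if the simulated "ANR" fired before the work finished. */
    static boolean run(long workMs, long timeoutMs) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        AtomicBoolean anrRaised = new AtomicBoolean(false);
        try {
            // Plant the bomb: schedule the delayed "timeout message" before the work starts.
            ScheduledFuture<?> pending =
                    timer.schedule(() -> anrRaised.set(true), timeoutMs, TimeUnit.MILLISECONDS);
            // Simulated component work (for example onReceive or onCreate).
            Thread.sleep(workMs);
            // Defuse the bomb: the work is done, so remove the pending message.
            pending.cancel(false);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            timer.shutdownNow();
        }
        // If the timer fired first, the bomb has already detonated.
        return anrRaised.get();
    }

    public static void main(String[] args) {
        System.out.println("fast work -> ANR? " + run(10, 2000));
        System.out.println("slow work -> ANR? " + run(300, 50));
    }
}
```

The point of the design is that the timeout is armed before the work is handed off, so the system does not depend on the slow component to report its own failure.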

2.1 Broadcast timeout mechanism

2.1.1 Setting the broadcast timeout

The call chain is:

ContextImpl.sendBroadcast

->AMS.broadcastIntent
->AMS.broadcastIntentLocked

broadcastIntentLocked adds the broadcast to the queue.

    @GuardedBy("this")
    final int broadcastIntentLocked(ProcessRecord callerApp,
            String callerPackage, Intent intent, String resolvedType,
            IIntentReceiver resultTo, int resultCode, String resultData,
            Bundle resultExtras, String[] requiredPermissions, int appOp, Bundle bOptions,
            boolean ordered, boolean sticky, int callingPid, int callingUid, int realCallingUid,
            int realCallingPid, int userId, boolean allowBackgroundActivityStarts) {
        intent = new Intent(intent);

        //...

        if ((receivers != null && receivers.size() > 0)
                || resultTo != null) {
            BroadcastQueue queue = broadcastQueueForIntent(intent);
            BroadcastRecord r = new BroadcastRecord(queue, intent, callerApp,
                    callerPackage, callingPid, callingUid, callerInstantApp, resolvedType,
                    requiredPermissions, appOp, brOptions, receivers, resultTo, resultCode,
                    resultData, resultExtras, ordered, sticky, false, userId,
                    allowBackgroundActivityStarts, timeoutExempt);

            if (DEBUG_BROADCAST) Slog.v(TAG_BROADCAST, "Enqueueing ordered broadcast " + r);

            final BroadcastRecord oldRecord =
                    replacePending ? queue.replaceOrderedBroadcastLocked(r) : null;
            if (oldRecord != null) {
            //...
            } else {
                queue.enqueueOrderedBroadcastLocked(r);// add to the mOrderedBroadcasts queue
                queue.scheduleBroadcastsLocked();// process the broadcast
            }
        //...
        return ActivityManager.BROADCAST_SUCCESS;
    }

BroadcastQueue.scheduleBroadcastsLocked sends a BROADCAST_INTENT_MSG message:

    public void scheduleBroadcastsLocked() {
        if (DEBUG_BROADCAST) Slog.v(TAG_BROADCAST, "Schedule broadcasts ["
                + mQueueName + "]: current="
                + mBroadcastsScheduled);

        if (mBroadcastsScheduled) {
            return;
        }
        mHandler.sendMessage(mHandler.obtainMessage(BROADCAST_INTENT_MSG, this));
        mBroadcastsScheduled = true;
    }

    private final class BroadcastHandler extends Handler {
        public BroadcastHandler(Looper looper) {
            super(looper, null, true);
        }

        @Override
        public void handleMessage(Message msg) {
            switch (msg.what) {
                case BROADCAST_INTENT_MSG: {
                    if (DEBUG_BROADCAST) Slog.v(
                            TAG_BROADCAST, "Received BROADCAST_INTENT_MSG ["
                            + mQueueName + "]");
                    processNextBroadcast(true);
                } break;
                case BROADCAST_TIMEOUT_MSG: {
                    synchronized (mService) {
                        broadcastTimeoutLocked(true);
                    }
                } break;
            }
        }
    }

Handling then moves to processNextBroadcast; the real work is done in processNextBroadcastLocked:

    final void processNextBroadcast(boolean fromMsg) {
        synchronized (mService) {
            processNextBroadcastLocked(fromMsg, false);
        }
    }

    final void processNextBroadcastLocked(boolean fromMsg, boolean skipOomAdj) {
        BroadcastRecord r;
        //...
        if (! mPendingBroadcastTimeoutMessage) {
            long timeoutTime = r.receiverTime + mConstants.TIMEOUT;
            if (DEBUG_BROADCAST) Slog.v(TAG_BROADCAST,
                    "Submitting BROADCAST_TIMEOUT_MSG ["
                    + mQueueName + "] for " + r + " at " + timeoutTime);
            setBroadcastTimeoutLocked(timeoutTime);// set the timeout
        }
       //...
    }

processNextBroadcastLocked sets the timeout, which is planting the bomb; once the time is up, the system has to decide whether to detonate it.

    final void setBroadcastTimeoutLocked(long timeoutTime) {
        if (! mPendingBroadcastTimeoutMessage) {
            Message msg = mHandler.obtainMessage(BROADCAST_TIMEOUT_MSG, this);
            mHandler.sendMessageAtTime(msg, timeoutTime);
            mPendingBroadcastTimeoutMessage = true;
        }
    }

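2.1.2 Deciding whether an ANR occurred, that is, whether the receiver timed out

Back in BroadcastHandler, look at the BROADCAST_TIMEOUT_MSG case.
When the time is up, broadcastTimeoutLocked runs and checks whether the receiver has really timed out.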

    final void broadcastTimeoutLocked(boolean fromMsg) {
        //...
        long now = SystemClock.uptimeMillis();
        BroadcastRecord r = mDispatcher.getActiveBroadcastLocked();
        if (fromMsg) {

            //...

            long timeoutTime = r.receiverTime + mConstants.TIMEOUT;
            if (timeoutTime > now) { // not timed out yet
                // We can observe premature timeouts because we do not cancel and reset the
                // broadcast timeout message after each receiver finishes.  Instead, we set up
                // an initial timeout then kick it down the road a little further as needed
                // when it expires.
                setBroadcastTimeoutLocked(timeoutTime);
                return;
            }
        }
        //...
        if (!debugging && anrMessage != null) {
            // Post the ANR to the handler since we do not want to process ANRs while
            // potentially holding our lock.
            mHandler.post(new AppNotResponding(app, anrMessage)); // timed out: post the ANR
        }
    }
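
The "kick it down the road" check quoted above can be isolated as a small pure function. This is a sketch, not framework code: `BroadcastTimeoutCheck` and `nextDeadlineOrTimeout` are hypothetical names, and only the comparison `receiverTime + TIMEOUT > now` is taken from the excerpt.

```java
/** Hypothetical model of the deadline check in broadcastTimeoutLocked; not framework code. */
final class BroadcastTimeoutCheck {
    /**
     * Returns the time at which the timeout message should fire again if the
     * current receiver still has time left, or -1 if it has really timed out.
     */
    static long nextDeadlineOrTimeout(long receiverTime, long timeout, long now) {
        long timeoutTime = receiverTime + timeout;
        if (timeoutTime > now) {
            // Premature wake-up: the message was armed for an earlier receiver,
            // so re-arm it for the receiver that is running now.
            return timeoutTime;
        }
        // The current receiver has used up its whole budget: report the ANR.
        return -1;
    }

    public static void main(String[] args) {
        // First receiver started at t=0 with a 10 s budget, so the message fires at t=10000.
        // A second receiver started at t=4000, so at t=10000 it still has 4 s left.
        System.out.println(nextDeadlineOrTimeout(4000, 10000, 10000));
        // At t=14000 the second receiver is still running: timeout.
        System.out.println(nextDeadlineOrTimeout(4000, 10000, 14000));
    }
}
```

This matches the comment in the excerpt: the message is not cancelled and re-sent after each receiver finishes; it is armed once and pushed back when it expires early.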

2.2 Service timeout mechanism

2.2.1 Setting the service timeout (planting the bomb)

Taking startService as an example, the call chain is:

Context.startService
AMS.startService
ActiveServices.startService
ActiveServices.startServiceLocked
ActiveServices.startServiceInnerLocked
ActiveServices.bringUpServiceLocked
ActiveServices.realStartServiceLocked

    private final void realStartServiceLocked(ServiceRecord r,
            ProcessRecord app, boolean execInFg) throws RemoteException {

        bumpServiceExecutingLocked(r, execInFg, "create");//1. Sends the delayed message (SERVICE_TIMEOUT_MSG)

        try {
            //...
            //2. Ask the app process, through its ApplicationThread, to create the service
            app.thread.scheduleCreateService(r, r.serviceInfo,
                    mAm.compatibilityInfoForPackage(r.serviceInfo.applicationInfo),
                    app.getReportedProcState());
            r.postNotification();
            created = true;
        } 
        //...
    }

    private final void bumpServiceExecutingLocked(ServiceRecord r, boolean fg, String why) {
              //...
              scheduleServiceTimeoutLocked(r.app);// the method that actually sends the message
              //...
    }

    void scheduleServiceTimeoutLocked(ProcessRecord proc) {
        if (proc.executingServices.size() == 0 || proc.thread == null) {
            return;
        }
        Message msg = mAm.mHandler.obtainMessage(
                ActivityManagerService.SERVICE_TIMEOUT_MSG);
        msg.obj = proc;
        // Send the delayed message: SERVICE_TIMEOUT (20 s) when execServicesFg is true,
        // otherwise SERVICE_BACKGROUND_TIMEOUT (200 s)
        mAm.mHandler.sendMessageDelayed(msg,
                proc.execServicesFg ? SERVICE_TIMEOUT : SERVICE_BACKGROUND_TIMEOUT);
    }
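
The delay chosen in scheduleServiceTimeoutLocked comes down to one ternary. The sketch below restates it with the 20 s and 200 s values from the table in section 1; `ServiceTimeouts` and `timeoutFor` are hypothetical names used only for illustration.

```java
/** Hypothetical restatement of the delay selection; values follow the table in section 1. */
final class ServiceTimeouts {
    // Timeout when the service is executing on behalf of a foreground caller.
    static final int SERVICE_TIMEOUT = 20 * 1000;
    // Timeout for background execution: ten times longer.
    static final int SERVICE_BACKGROUND_TIMEOUT = SERVICE_TIMEOUT * 10;

    /** Mirrors: proc.execServicesFg ? SERVICE_TIMEOUT : SERVICE_BACKGROUND_TIMEOUT */
    static int timeoutFor(boolean execServicesFg) {
        return execServicesFg ? SERVICE_TIMEOUT : SERVICE_BACKGROUND_TIMEOUT;
    }

    public static void main(String[] args) {
        System.out.println("foreground: " + timeoutFor(true) + " ms");
        System.out.println("background: " + timeoutFor(false) + " ms");
    }
}
```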

If the message is not removed within the timeout, it is handled inside AMS:

    final class MainHandler extends Handler {
        public MainHandler(Looper looper) {
            super(looper, null, true);
        }

        @Override
        public void handleMessage(Message msg) {
            switch (msg.what) {
            //... A service timeout ends up calling serviceTimeout
            case SERVICE_TIMEOUT_MSG: {
                mServices.serviceTimeout((ProcessRecord)msg.obj);
            } break;
            }
        }
    }

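2.2.2 Removing the service timeout message (defusing the bomb)

Starting a Service first goes through AMS, which then tells the app process to run the Service lifecycle.
This is where ActivityThread.handleCreateService gets called: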

    //ActivityThread.java
    @UnsupportedAppUsage
    private void handleCreateService(CreateServiceData data) {
         //...
        try {
           //...
            Application app = packageInfo.makeApplication(false, mInstrumentation);
            service.attach(context, this, data.info.name, data.token, app,
                    ActivityManager.getService());
            service.onCreate();//1. Service.onCreate is called
            mServices.put(data.token, service);
            try {
                ActivityManager.getService().serviceDoneExecuting(//2. the bomb is defused here
                        data.token, SERVICE_DONE_EXECUTING_ANON, 0, 0);
            } catch (RemoteException e) {
                throw e.rethrowFromSystemServer();
            }
        } 
        //...
    }

At comment 1, the Service's onCreate method is called.
At comment 2, AMS.serviceDoneExecuting is called, which eventually reaches ActiveServices.serviceDoneExecutingLocked:

    private void serviceDoneExecutingLocked(ServiceRecord r, boolean inDestroying,
            boolean finishing) {
           //...
           mAm.mHandler.removeMessages(ActivityManagerService.SERVICE_TIMEOUT_MSG, r.app);// remove the delayed message
           //...
    }

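As shown, once onCreate returns, the delayed message is removed and the bomb is defused.

2.2.3 Service triggers the ANR (detonating the bomb)

If the Service's onCreate runs past the timeout (SERVICE_TIMEOUT, 20 s, when execServicesFg is set), the bomb goes off, that is, serviceTimeout is called: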

    void serviceTimeout(ProcessRecord proc) {
        //...
        if (anrMessage != null) {
                mAm.mAppErrors.appNotResponding(proc, null, null, false, anrMessage);
            }
        //...
    }

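2.3 ContentProvider timeout mechanism

2.3.1 Setting the ContentProvider timeout (planting the bomb)

When an app starts, publishing its ContentProviders can also cause an ANR if it times out.

Call chain:
ActivityThread.attach()
AMS.attachApplicationLocked()

After the app process starts, ActivityThread runs attach(), which eventually reaches attachApplicationLocked(), where the timeout check is set up.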

    //ActivityManagerService.java
    // How long we wait for an attached process to publish its content providers
    // before we decide it must be hung.
    static final int CONTENT_PROVIDER_PUBLISH_TIMEOUT = 10*1000;
    /**
     * How long we wait for an provider to be published. Should be longer than
     * {@link #CONTENT_PROVIDER_PUBLISH_TIMEOUT}.
     */
    static final int CONTENT_PROVIDER_WAIT_TIMEOUT = 20 * 1000;

    @GuardedBy("this")
    private boolean attachApplicationLocked(@NonNull IApplicationThread thread,
            int pid, int callingUid, long startSeq) {
        if (providers != null && checkAppInLaunchingProvidersLocked(app)) {
            Message msg = mHandler.obtainMessage(CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG);
            msg.obj = app;
            // May lead to an ANR
            mHandler.sendMessageDelayed(msg, CONTENT_PROVIDER_PUBLISH_TIMEOUT);
        }
        //...
        try {
            // This call eventually leads to the removal of the
            // CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG delayed message.
            // thread is the ApplicationThread object passed in from ActivityThread.
            thread.bindApplication(processName, appInfo, providers, ...);
        }
    }

    final class MainHandler extends Handler {
        @Override
        public void handleMessage(Message msg) {
            switch (msg.what) {
            case CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG: {
                ProcessRecord app = (ProcessRecord)msg.obj;
                synchronized (ActivityManagerService.this) {
                    processContentProviderPublishTimedOutLocked(app);
                }
            } break;
            }
        }
    }

    @GuardedBy("this")
    private final void processContentProviderPublishTimedOutLocked(ProcessRecord app) {
        cleanupAppInLaunchingProvidersLocked(app, true);
        mProcessList.removeProcessLocked(app, false, true, "timeout publishing content providers");
    }

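2.3.2 Removing the ContentProvider timeout message (defusing the bomb)

To see how the message is removed, follow thread.bindApplication(): it runs after the delayed message is posted and eventually leads to that message being removed. If this completes within 10 s, no ANR is triggered.

Note: this is only the most direct path; the message is removed from more than one call site.

A quick look at how the delayed message gets removed: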

    //ActivityThread.java
    private class ApplicationThread extends IApplicationThread.Stub {
        public final void bindApplication(String processName, ApplicationInfo appInfo,
                ...) {
            sendMessage(H.BIND_APPLICATION, data);
        }
    }

    class H extends Handler {
        public void handleMessage(Message msg) {
            switch (msg.what) {
                case BIND_APPLICATION:
                    AppBindData data = (AppBindData)msg.obj;
                    handleBindApplication(data);
                    break;
            }
        }
    }

    @UnsupportedAppUsage
    private void handleBindApplication(AppBindData data) {
        try {
            if (!data.restrictedBackupMode) {
                if (!ArrayUtils.isEmpty(data.providers)) {
                    installContentProviders(app, data.providers);
                }
            }
        }
    }

    @UnsupportedAppUsage
    private void installContentProviders(
            Context context, List<ProviderInfo> providers) {
        try {
            ActivityManager.getService().publishContentProviders(
                getApplicationThread(), results);
        } catch (RemoteException ex) {
            throw ex.rethrowFromSystemServer();
        }
    }

    //ActivityManagerService.java
    public final void publishContentProviders(IApplicationThread caller,
                List<ContentProviderHolder> providers) {
        final ProcessRecord r = getRecordForAppLocked(caller);
        if (wasInLaunchingProviders) {
            mHandler.removeMessages(CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG, r);
        }
    }

2.3.3 ContentProvider triggers the ANR (detonating the bomb)

    //AMS
    @GuardedBy("this")
    private final void processContentProviderPublishTimedOutLocked(ProcessRecord app) {
        cleanupAppInLaunchingProvidersLocked(app, true);
        mProcessList.removeProcessLocked(app, false, true, "timeout publishing content providers");
    }

Unlike the other three types, a timeout here does not call appNotResponding(); instead the process is killed and its related state is cleaned up.

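2.4 Input event timeout mechanism

I will write this part after analyzing the input event mechanism; I do not fully understand it yet.

2.5 How an ANR is handled

On Android 10, ANR handling goes through ProcessRecord.appNotResponding.

Here is what that method does: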

    void appNotResponding(String activityShortComponentName, ApplicationInfo aInfo,
            String parentShortComponentName, WindowProcessController parentProcess,
            boolean aboveSystem, String annotation) {

            //...

            //1. Write to the event log
            // Log the ANR to the event log.
            EventLog.writeEvent(EventLogTags.AM_ANR, userId, pid, processName, info.flags,
                    annotation);

        //...

         //2. Collect the information to log: ANR details, CPU usage, etc.
        // Log the ANR to the main log.
        StringBuilder info = new StringBuilder();
        info.setLength(0);
        info.append("ANR in ").append(processName);
        if (activityShortComponentName != null) {
            info.append(" (").append(activityShortComponentName).append(")");
        }
        info.append("\n");
        info.append("PID: ").append(pid).append("\n");
        if (annotation != null) {
            info.append("Reason: ").append(annotation).append("\n");
        }
        if (parentShortComponentName != null
                && parentShortComponentName.equals(activityShortComponentName)) {
            info.append("Parent: ").append(parentShortComponentName).append("\n");
        }

        ProcessCpuTracker processCpuTracker = new ProcessCpuTracker(true);
       
       //...

        // 3. Dump stack traces, both Java and native, and save them to a file
        // For background ANRs, don't pass the ProcessCpuTracker to
        // avoid spending 1/2 second collecting stats to rank lastPids.
        File tracesFile = ActivityManagerService.dumpStackTraces(firstPids,
                (isSilentAnr()) ? null : processCpuTracker, (isSilentAnr()) ? null : lastPids,
                nativePids);

        String cpuInfo = null;
        if (isMonitorCpuUsage()) {
            mService.updateCpuStatsNow();
            synchronized (mService.mProcessCpuTracker) {
                cpuInfo = mService.mProcessCpuTracker.printCurrentState(anrTime);
            }
            info.append(processCpuTracker.printCurrentLoad());
            info.append(cpuInfo);
        }

        info.append(processCpuTracker.printCurrentState(anrTime));

        Slog.e(TAG, info.toString());//4. Print the ANR log
        if (tracesFile == null) {
            // There is no trace file, so dump (only) the alleged culprit's threads to the log
            Process.sendSignal(pid, Process.SIGNAL_QUIT);// 5. No tracesFile was captured, so send SIGNAL_QUIT
        }

        StatsLog.write(StatsLog.ANR_OCCURRED, uid, processName,
                activityShortComponentName == null ? "unknown": activityShortComponentName,
                annotation,
                (this.info != null) ? (this.info.isInstantApp()
                        ? StatsLog.ANROCCURRED__IS_INSTANT_APP__TRUE
                        : StatsLog.ANROCCURRED__IS_INSTANT_APP__FALSE)
                        : StatsLog.ANROCCURRED__IS_INSTANT_APP__UNAVAILABLE,
                isInterestingToUserLocked()
                        ? StatsLog.ANROCCURRED__FOREGROUND_STATE__FOREGROUND
                        : StatsLog.ANROCCURRED__FOREGROUND_STATE__BACKGROUND,
                getProcessClassEnum(),
                (this.info != null) ? this.info.packageName : "");
        final ProcessRecord parentPr = parentProcess != null
                ? (ProcessRecord) parentProcess.mOwner : null;
        // 6. Write to DropBox
        mService.addErrorToDropBox("anr", this, processName, activityShortComponentName,
                parentShortComponentName, parentPr, annotation, cpuInfo, tracesFile, null);
        //...
        synchronized (mService) {
            // mBatteryStatsService can be null if the AMS is constructed with injector only. This
            // will only happen in tests.
            if (mService.mBatteryStatsService != null) {
                mService.mBatteryStatsService.noteProcessAnr(processName, uid);
            }

            if (isSilentAnr() && !isDebugging()) {
                kill("bg anr", true);//7. Background ANR: kill the process directly
                return;
            }
            //8. Error report
            // Set the app's notResponding state, and look up the errorReportReceiver
            makeAppNotRespondingLocked(activityShortComponentName,
                    annotation != null ? "ANR " + annotation : "ANR", info.toString());

            //9. Show the ANR dialog; this ends up calling handleShowAnrUi
            // mUiHandler can be null if the AMS is constructed with injector only. This will only
            // happen in tests.
            if (mService.mUiHandler != null) {
                // Bring up the infamous App Not Responding dialog
                Message msg = Message.obtain();
                msg.what = ActivityManagerService.SHOW_NOT_RESPONDING_UI_MSG;
                msg.obj = new AppNotRespondingDialog.Data(this, aInfo, aboveSystem);

                mService.mUiHandler.sendMessage(msg);
            }
        }
    }

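The main steps are:
1. Write to the event log.
2. Build the message for the main log (ANR details, CPU usage, etc.).
3. Generate the tracesFile.
4. Print the ANR message to logcat (visible in the console).
5. If no tracesFile was obtained, send SIGNAL_QUIT; according to the code comment, this dumps the threads of the suspected process to the log.
6. Write to DropBox.
7. For a background ANR, kill the process directly.
8. Record the error report.
9. Show the ANR dialog, which calls AppErrors#handleShowAnrUi.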
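
To make step 2 concrete, the header of the main-log message can be reproduced with the same StringBuilder calls as in the excerpt. This is a sketch: `AnrLogHeader` and `build` are hypothetical names, and the package, PID, and reason used in `main` are made-up sample values.

```java
/** Hypothetical helper that rebuilds the first lines of the ANR main-log message. */
final class AnrLogHeader {
    static String build(String processName, String activityShortComponentName,
                        int pid, String annotation) {
        StringBuilder info = new StringBuilder();
        info.append("ANR in ").append(processName);
        if (activityShortComponentName != null) {
            // The component is optional; it is absent for some ANR types.
            info.append(" (").append(activityShortComponentName).append(")");
        }
        info.append("\n");
        info.append("PID: ").append(pid).append("\n");
        if (annotation != null) {
            // The annotation carries the reason text listed in the table in section 1.
            info.append("Reason: ").append(annotation).append("\n");
        }
        return info.toString();
    }

    public static void main(String[] args) {
        // Sample values only; they do not come from a real device log.
        System.out.print(build("com.example.app", "com.example.app/.MainActivity",
                1234, "Input event dispatching timed out"));
    }
}
```

Knowing this layout helps when searching logcat: filtering for the text "ANR in" finds the start of each report, and the "Reason:" line identifies which of the four timeout types fired.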

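3. How to analyze an ANR

3.1 Analyzing an ANR with traces.txt

The trigger flow analyzed above ends with the thread stacks, CPU usage, and other information from the moment of the ANR being saved. In practice we usually analyze the /data/anr/traces.txt file.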

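4. ANR monitoring

For problems in production, how can we get hold of the ANR logs?

I have not done this part myself; it is copied from another article.

I am writing it down first and will study it later.

4.1 Capturing the system traces.txt and uploading it

1. When a monitoring thread finds that the main thread is stuck, it sends a SIGNAL_QUIT signal itself.
2. Wait for /data/anr/traces.txt to be generated.
3. Upload the file once it exists.

There are two problems:
1. traces.txt contains information about every thread, so after uploading it has to be filtered and analyzed by hand.
2. On many newer system versions, reading the /data/anr directory requires root.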
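
The manual filtering mentioned in the first problem can be partly automated. The sketch below pulls a single thread's block out of a dump. It rests on an assumption about the file layout that should be checked against real files: each thread block starts with a header line beginning with the quoted thread name (for example `"main"`), and blocks are separated by blank lines. `TraceFilter` and `threadBlock` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical filter for a traces dump; the block layout is an assumption. */
final class TraceFilter {
    /** Returns the lines of the first block whose header starts with the quoted thread name. */
    static List<String> threadBlock(String dump, String threadName) {
        List<String> out = new ArrayList<>();
        String header = "\"" + threadName + "\"";
        boolean inBlock = false;
        for (String line : dump.split("\n", -1)) {
            if (!inBlock) {
                if (line.startsWith(header)) {
                    // Found the header of the requested thread.
                    inBlock = true;
                    out.add(line);
                }
            } else {
                if (line.trim().isEmpty()) {
                    // A blank line ends the block.
                    break;
                }
                out.add(line);
            }
        }
        return out;
    }
}
```

Calling `TraceFilter.threadBlock(dumpText, "main")` returns only the main thread's frames, which is usually the first thing to read when looking for the cause of an ANR.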

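4.2 ANRWatchDog

ANRWatchDog is an open-source library that detects ANRs automatically.

4.2.1 How ANRWatchDog works

Its source consists of only two classes. The core one is ANRWatchDog, which extends Thread; its run method is shown below, see the numbered comments.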

    public void run() {
        setName("|ANR-WatchDog|");

        long interval = _timeoutInterval;
       // 1. Start the loop
        while (!isInterrupted()) {
            boolean needPost = _tick == 0;
            _tick += interval;
            if (needPost) {
               // 2. Post a Runnable to the UI thread that sets _tick to 0 and _reported to false
              _uiHandler.post(_ticker);
            }

            try {
                // 3. Sleep for the interval (5 s by default)
                Thread.sleep(interval);
            } catch (InterruptedException e) {
                _interruptionListener.onInterrupted(e);
                return ;
            }

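            // If the main thread has not handled _ticker, it is blocked. ANR.
            // 4. After the sleep, check _tick and _reported. Normally the main thread has already
            // reset _tick to 0 and _reported to false; if not, the Runnable posted in step 2 has
            // not run yet, which means the main thread is stuck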
            if (_tick != 0 && !_reported) {
                ...
                if (_namePrefix != null) {
                    // 5. An ANR has been detected: collect the stack traces, then call onAppNotResponding
                    error = ANRError.New(_tick, _namePrefix, _logThreadsWithoutStackTrace);
                } else {
                    error = ANRError.NewMainOnly(_tick);
                }
                _anrListener.onAppNotResponding(error);
                interval = _timeoutInterval;
                _reported = true;
            }

        }

    }

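The principle behind ANRWatchDog is fairly simple and can be summarized in these steps:

  1. Start a thread that loops forever, sleeping 5 s in each iteration.
  2. Post a Runnable to the UI thread that sets _tick to 0 and _reported to false.
  3. After sleeping 5 s, check whether _tick and _reported have been modified.
  4. If they have not, the Runnable posted to the main thread has not run, which means the main thread has been blocked for at least 5 s (only "at least": the measurement is off by up to 5 s).
  5. Output the thread stack traces.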
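
The steps above can be modeled without real threads or sleeps, which makes the flag logic easy to follow and test. This is a simplified, hypothetical model (`TickWatchdog`, `beforeSleep`, and `afterSleep` are illustrative names, not the library's API); the main thread is represented by a `java.util.concurrent.Executor`.

```java
import java.util.concurrent.Executor;

/** Hypothetical, simplified model of the ANRWatchDog loop body; not the library's code. */
final class TickWatchdog {
    private volatile long tick = 0;
    private volatile boolean reported = false;
    private final long intervalMs;
    private final Executor mainThread;

    TickWatchdog(long intervalMs, Executor mainThread) {
        this.intervalMs = intervalMs;
        this.mainThread = mainThread;
    }

    /** Steps 1-2: bump the tick and, if the previous ticker ran, post a new one. */
    void beforeSleep() {
        boolean needPost = tick == 0;
        tick += intervalMs;
        if (needPost) {
            // The ticker resets both flags when the "main thread" gets to run it.
            mainThread.execute(() -> {
                tick = 0;
                reported = false;
            });
        }
    }

    /** Steps 3-5: called after the sleep; returns true when an ANR should be reported. */
    boolean afterSleep() {
        if (tick != 0 && !reported) {
            // The ticker never ran during the sleep, so the main thread is blocked.
            reported = true;
            return true;
        }
        return false;
    }
}
```

With `Runnable::run` as the executor (a main thread that runs tasks immediately), `afterSleep()` returns false. With an executor that drops tasks (a main thread that never gets to them), the first `afterSleep()` returns true and later calls return false, because `_reported` suppresses duplicate reports for the same stall.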

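This touches on a concurrency topic that comes up often in interviews: the volatile keyword.
volatile guarantees visibility and forbids instruction reordering; it suits the case where one thread writes and other threads read.
Interviewers usually go on to ask about the JMM, working memory and main memory, why working memory exists at all, and whether every field could simply be marked volatile.

Back to ANRWatchDog itself. Attentive readers may have noticed a problem: ANRWatchDog sometimes fails to catch an ANR. Why?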

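4.2.2 Drawbacks of ANRWatchDog

ANRWatchDog can miss ANRs.

[Figure: timeline of a missed detection; red marks the period during which the main thread is blocked]

Suppose the main thread has already been blocked for 2 s when ANRWatchDog begins a new loop iteration: it sets _tick to 5 and posts a task to the main thread that will reset _tick to 0.

Three seconds later the main thread is no longer blocked and sets _tick to 0.
When ANRWatchDog wakes up after its 5 s sleep, it finds _tick is 0 and concludes that no ANR occurred. In reality the main thread was blocked for 5 s in between. ANRWatchDog's error is within 5 s (5 s being the default sleep interval of the thread).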
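
The missed case can be checked with a little arithmetic. The sketch below uses a simplified model that is an assumption of this example, not the library's code: the watchdog posts a ticker at times 0, interval, 2 × interval, and so on; a ticker posted while the main thread is blocked runs at the moment the block ends; and the watchdog reports only if the ticker still has not run one interval after it was posted. `WatchdogTimeline` and `detects` are hypothetical names.

```java
/** Hypothetical timeline model of the watchdog; all times are in milliseconds. */
final class WatchdogTimeline {
    /** True if a block starting at blockStartMs and lasting blockMs would be reported. */
    static boolean detects(long intervalMs, long blockStartMs, long blockMs) {
        long blockEnd = blockStartMs + blockMs;
        // The first ticker that is posted while the main thread is already blocked.
        long firstPost = ((blockStartMs + intervalMs - 1) / intervalMs) * intervalMs;
        // That ticker runs when the block ends; the watchdog checks it at firstPost + interval.
        return blockEnd > firstPost + intervalMs;
    }

    public static void main(String[] args) {
        // The scenario from the text: blocked for 2 s before the loop at t=5000, 5 s in total.
        System.out.println(detects(5000, 3000, 5000));
        // The same start, but the block lasts just past the next check at t=10000.
        System.out.println(detects(5000, 3000, 7001));
    }
}
```

In this model the first call returns false (the 5 s block from t=3000 to t=8000 ends before the check at t=10000) and the second returns true, which illustrates why a shorter polling interval narrows the window in which a stall can go unnoticed.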

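This problem can be mitigated with a small optimization.

4.3 ANRMonitor

The root cause of ANRWatchDog's missed detections is that the thread sleeps for 5 s and has no idea whether the main thread was already blocked a second earlier. If it checks once per second instead, the error drops to within 1 s.
Below, ANRWatchDog is reworked with this optimization and named ANRMonitor.
To have the worker thread run a task every 1 s, we can use a HandlerThread.
The core Runnable looks like this: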

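    @Volatile
    var mainHandlerRunEnd = true

    // The worker thread runs this Runnable once every second
    private val mThreadRunnable = Runnable {

        blockTime++
        // 1. The main thread has not set mainHandlerRunEnd back to true, so it is blocked
        if (!mainHandlerRunEnd && !isDebugger()) {
            logw(TAG, "mThreadRunnable: main thread may be block at least $blockTime s")
        }

        // 2. Blocked for 5 s or more: trigger the ANR flow and print the stack
        if (blockTime >= 5) {
            if (!mainHandlerRunEnd && !isDebugger() && !mHadReport) {
                mHadReport = true
                // 5 s have passed and the main thread still has not updated the flag: ANR
                loge(TAG, "ANR->main thread may be block at least $blockTime s ")
                loge(TAG, getMainThreadStack())
                // TODO: invoke a callback; stacks of other threads can be dumped here as needed
                // TODO: in debug builds, a separate process could be started to display the stack
            }
        }

        // 3. If the main thread was responsive during the last second, reset the state
        //    and ask the main thread to set the flag again
        if (mainHandlerRunEnd) {
            // Assumed fix: the original snippet never reset these two fields, so blockTime
            // would only grow and a single slow second could be reported as an ANR
            blockTime = 0
            mHadReport = false
            mainHandlerRunEnd = false
            mMainHandler.post {
                mainHandlerRunEnd = true
            }
        }

        // Schedule the next run of mThreadRunnable in 1 s
        sendDelayThreadMessage()

    }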

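The worker thread runs mThreadRunnable once every second and checks whether the mainHandlerRunEnd flag has been modified.
If the main thread has set mainHandlerRunEnd to true as expected, the flag is reset to false and the cycle continues from step 1.
If mainHandlerRunEnd has not been set to true, the main thread is blocked; once the block has accumulated to 5 s, the ANR flow is triggered.

When an ANR is detected, CPU and memory usage are also important besides the main thread's stack; the demo omits that part.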
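
The snippet calls getMainThreadStack() without showing its body. One possible shape for such a helper, written in plain Java so it can run off-device, is sketched below; `StackDump` and `stackOf` are hypothetical names, and this is not the original author's implementation. On Android, the main thread can be obtained with `Looper.getMainLooper().getThread()` and passed in.

```java
/** Hypothetical helper: formats a thread's current stack trace, one frame per line. */
final class StackDump {
    static String stackOf(Thread thread) {
        StringBuilder sb = new StringBuilder();
        // Header line with the thread name and its current state.
        sb.append('"').append(thread.getName()).append("\" state=")
                .append(thread.getState()).append('\n');
        // getStackTrace() returns a snapshot; it may be empty if the thread is not alive.
        for (StackTraceElement frame : thread.getStackTrace()) {
            sb.append("    at ").append(frame).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(stackOf(Thread.currentThread()));
    }
}
```

Because the stack is a snapshot taken at the moment of the call, it shows where the main thread is stuck only if it is captured while the stall is still in progress, which is why the monitor dumps it as soon as the 5 s threshold is reached.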

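References

Thoroughly Understanding Android's Application Not Responding Mechanism (彻底理解安卓应用无响应机制)

ANR: Principle Analysis and a Brief Summary (ANR的原理分析和简单总结)

Jank, ANR, and Deadlock: How to Monitor Them in Production? (卡顿、ANR、死锁,线上如何监控?)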

posted @ 2022-06-03 18:17  cfdroid