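Android System Boot - 1 - Overview - 1
Based on Android 6.0
1. Overview
Android is built on the Linux kernel. During boot the kernel creates the init process, the ancestor of every user-space process. init starts servicemanager (the Binder service manager) and the Zygote process (the ancestor of all Java processes). Zygote in turn creates system_server and every app process. These heavyweight system processes therefore form a hierarchy: init at the root, servicemanager and Zygote beneath it, and system_server and the app processes beneath Zygote.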
2. init
init is the first user-space process on a Linux system (pid=1). Once the kernel has booted, it executes the init binary, whose entry point is main() in system/core/init/init.cpp.
2.1 Init.main
int main(int argc, char** argv) {
    ...
    klog_init();                    // initialize kernel logging
    property_init();                // create the shared memory area used by the property service
    signal_handler_init();          // set up handling of child-process exit signals
    property_load_boot_defaults();  // load /default.prop
    start_property_service();       // start the property server (communicates over a socket)
    init_parse_config_file("/init.rc");  // parse init.rc

    // queue the rc actions whose trigger is "on early-init"
    action_for_each_trigger("early-init", action_add_queue_tail);
    // queue the rc actions whose trigger is "on init"
    action_for_each_trigger("init", action_add_queue_tail);
    // queue the rc actions whose trigger is "on late-init"
    action_for_each_trigger("late-init", action_add_queue_tail);

    while (true) {
        if (!waiting_for_exec) {
            execute_one_command();
            restart_processes();
        }
        int timeout = -1;
        if (process_needs_restart) {
            timeout = (process_needs_restart - gettime()) * 1000;
            if (timeout < 0) timeout = 0;
        }
        if (!action_queue_empty() || cur_action) {
            timeout = 0;
        }
        epoll_event ev;
        // loop, waiting for an event
        int nr = TEMP_FAILURE_RETRY(epoll_wait(epoll_fd, &ev, 1, timeout));
        if (nr == -1) {
            ERROR("epoll_wait failed: %s\n", strerror(errno));
        } else if (nr == 1) {
            ((void (*)()) ev.data.ptr)();
        }
    }
    return 0;
}
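Main responsibilities of the init process:
(1) parse and run all init.rc files;
(2) create device driver nodes (created through the rc files);
(3) handle child-process termination (via signals);
(4) provide the property service.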
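The queue-then-drain structure of main() can be pictured with a small Java sketch. It is an illustration only: the real init is written in C++ and queues whole actions rather than single commands, and the class ActionQueueSketch, its methods, and the sample commands are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of init's trigger queue; not the real C++ implementation.
class ActionQueueSketch {
    // trigger name -> commands declared under "on <trigger>" (sample data only)
    private final Map<String, List<String>> actions = new LinkedHashMap<>();
    private final Deque<String> queue = new ArrayDeque<>();
    final List<String> executed = new ArrayList<>();

    void addAction(String trigger, String... commands) {
        actions.computeIfAbsent(trigger, k -> new ArrayList<>())
               .addAll(Arrays.asList(commands));
    }

    // Plays the role of action_for_each_trigger(trigger, action_add_queue_tail).
    void queueTrigger(String trigger) {
        List<String> commands = actions.get(trigger);
        if (commands != null) {
            queue.addAll(commands);  // appended at the tail of the queue
        }
    }

    // Plays the role of execute_one_command(): at most one command per loop pass.
    boolean executeOneCommand() {
        String command = queue.pollFirst();
        if (command == null) {
            return false;
        }
        executed.add(command);
        return true;
    }

    public static void main(String[] args) {
        ActionQueueSketch init = new ActionQueueSketch();
        init.addAction("init", "mkdir /example");       // sample command
        init.addAction("early-init", "start example");  // sample command
        init.queueTrigger("early-init");
        init.queueTrigger("init");
        while (init.executeOneCommand()) {
            // the real loop would also restart processes and block in epoll_wait here
        }
        System.out.println(init.executed);
    }
}
```

In this sketch the execution order follows the order in which triggers are queued, not the order in which the actions were declared.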
2.2 Zygote automatic restart mechanism
When init parses the following service declaration, it starts the Zygote process:
service zygote /system/bin/app_process -Xzygote /system/bin --zygote --start-system-server
    class main    // started together with the "main" class. TODO:??
    socket zygote stream 660 root system    // create the socket
    onrestart write /sys/android_power/request_state wake
    onrestart write /sys/power/state on
    onrestart restart media    // restart media when zygote restarts
    onrestart restart netd     // restart netd when zygote restarts
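When a child of init (here Zygote) exits, a SIGCHLD signal is delivered to init. The notification is passed along over a socket, which leads to wait_for_one_process(). That function checks whether the service is oneshot to decide between restarting the child and giving up on it. Because the default is oneshot=false, init brings Zygote back up whenever it is killed.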
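The restart decision described above can be sketched as follows. This is a simplified model rather than init's actual code: the class RestartPolicySketch, its Service type, and the step strings are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the oneshot check; names are illustrative only.
class RestartPolicySketch {
    static final class Service {
        final String name;
        final boolean oneshot;
        final List<String> onrestart;

        Service(String name, boolean oneshot, String... onrestart) {
            this.name = name;
            this.oneshot = oneshot;
            this.onrestart = Arrays.asList(onrestart);
        }
    }

    // Returns the steps taken after the child exits, in order.
    static List<String> onChildExit(Service service) {
        List<String> steps = new ArrayList<>();
        if (service.oneshot) {
            return steps;  // oneshot services are not restarted
        }
        steps.addAll(service.onrestart);     // run the onrestart commands
        steps.add("start " + service.name);  // then bring the service back up
        return steps;
    }

    public static void main(String[] args) {
        Service zygote = new Service("zygote", false, "restart media", "restart netd");
        System.out.println(onChildExit(zygote));
    }
}
```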
Next, execution moves into the Zygote process.
3. Zygote
Once the Zygote process starts, it runs main() in frameworks/base/cmds/app_process/app_main.cpp. The overall call flow is:
App_main.main
    AndroidRuntime.start
        AndroidRuntime.startVm
        AndroidRuntime.startReg
        ZygoteInit.main (first entry into the Java world)
            registerZygoteSocket
            preload
            startSystemServer
            runSelectLoop
3.1 App_main.main
int main(int argc, char* const argv[]) {
    AppRuntime runtime(argv[0], computeArgBlockSize(argc, argv));
    while (i < argc) {
        ...  // argument parsing
    }
    // set the process name
    if (!niceName.isEmpty()) {
        runtime.setArgv0(niceName.string());
        set_process_name(niceName.string());
    }
    if (zygote) {
        // start the AppRuntime, see section [3.2]
        runtime.start("com.android.internal.os.ZygoteInit", args, zygote);  // TODO: how is it started?
    } else if (className) {
        runtime.start("com.android.internal.os.RuntimeInit", args, zygote);
    }
}
Startup of an app_process process takes one of two branches: //TODO: what is the difference?
(1) when zygote is true, ZygoteInit.main() is executed;
(2) when zygote is false, RuntimeInit.main() is executed.
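The branch selection can be sketched in Java as below. This is a simplified model of the argument handling, not the real C++ parser: the class AppProcessArgsSketch and the method selectStartClass are hypothetical, and the assumption that the first non-option argument is the parent directory is taken from the zygote command line shown in section 2.2.

```java
// Hypothetical sketch of how the command line decides which Java class is started.
class AppProcessArgsSketch {
    static final String ZYGOTE_INIT = "com.android.internal.os.ZygoteInit";
    static final String RUNTIME_INIT = "com.android.internal.os.RuntimeInit";

    // argv holds the arguments that follow /system/bin/app_process.
    static String selectStartClass(String[] argv) {
        int i = 0;
        // Skip leading runtime options such as -Xzygote.
        while (i < argv.length && argv[i].startsWith("-")) {
            i++;
        }
        i++;  // skip the parent directory (assumed to be the first non-option argument)

        boolean zygote = false;
        String className = null;
        while (i < argv.length) {
            String arg = argv[i++];
            if (arg.equals("--zygote")) {
                zygote = true;
            } else if (arg.startsWith("--")) {
                // other flags such as --start-system-server are ignored in this sketch
            } else {
                className = arg;  // first plain argument names the class to run
                break;
            }
        }
        if (zygote) {
            return ZYGOTE_INIT;
        }
        return className != null ? RUNTIME_INIT : null;
    }

    public static void main(String[] args) {
        System.out.println(selectStartClass(
                new String[] {"-Xzygote", "/system/bin", "--zygote", "--start-system-server"}));
        System.out.println(selectStartClass(
                new String[] {"/system/bin", "com.android.commands.am.Am", "start"}));
    }
}
```

The second call models a command-line tool such as am: no --zygote flag is present, so the RuntimeInit branch is taken.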
3.2 AndroidRuntime::start
[-> AndroidRuntime.cpp]
void AndroidRuntime::start(const char* className, const Vector<String8>& options, bool zygote) {
    ...
    // create the virtual machine
    if (startVm(&mJavaVM, &env, zygote) != 0) {
        return;
    }
    onVmCreated(env);
    // register the JNI methods
    if (startReg(env) < 0) {
        return;
    }
    ...
    // call ZygoteInit.main() [see section 3.3]
    env->CallStaticVoidMethod(startClass, startMeth, strArray);
    ...
}
3.3 ZygoteInit.main
[-> ZygoteInit.java]
public static void main(String argv[]) {
    try {
        ...
        registerZygoteSocket(socketName);  // register the socket for Zygote
        preload();                         // preload classes and resources [see section 3.4]
        ...
        if (startSystemServer) {
            startSystemServer(abiList, socketName);  // start system_server [see section 3.5]
        }
        Log.i(TAG, "Accepting command socket connections");
        runSelectLoop(abiList);  // enter the loop [see section 3.6]
        ...
    } catch (MethodAndArgsCaller caller) {
        caller.run();  // covered in the system_server startup section
    }
    ...
}
3.4 ZygoteInit.preload
[-> ZygoteInit.java]
static void preload() {
    Log.d(TAG, "begin preload");
    preloadClasses();
    preloadResources();
    preloadOpenGL();
    preloadSharedLibraries();
    WebViewFactory.prepareWebViewInZygote();
    Log.d(TAG, "end preload");
}
3.5 ZygoteInit.startSystemServer
[-> ZygoteInit.java]
private static boolean startSystemServer(String abiList, String socketName)
        throws MethodAndArgsCaller, RuntimeException {
    ...
    // fork the system_server child process
    pid = Zygote.forkSystemServer(
            parsedArgs.uid, parsedArgs.gid,
            parsedArgs.gids,
            parsedArgs.debugFlags,
            null,
            parsedArgs.permittedCapabilities,
            parsedArgs.effectiveCapabilities);
    ...
    if (pid == 0) {
        if (hasSecondZygote(abiList)) {
            waitForSecondaryZygote(socketName);
        }
        // now running in the system_server process [see section 4.1]
        handleSystemServerProcess(parsedArgs);
    }
    return true;
}
3.6 ZygoteInit.runSelectLoop
[-> ZygoteInit.java]
private static void runSelectLoop(String abiList) throws MethodAndArgsCaller {
    ArrayList<FileDescriptor> fds = new ArrayList<FileDescriptor>();
    ArrayList<ZygoteConnection> peers = new ArrayList<ZygoteConnection>();
    // sServerSocket is the server side of the socket, i.e. the zygote process
    fds.add(sServerSocket.getFileDescriptor());
    peers.add(null);

    while (true) {
        StructPollfd[] pollFds = new StructPollfd[fds.size()];
        for (int i = 0; i < pollFds.length; ++i) {
            pollFds[i] = new StructPollfd();
            pollFds[i].fd = fds.get(i);
            pollFds[i].events = (short) POLLIN;
        }
        ...
        Os.poll(pollFds, -1);
        for (int i = pollFds.length - 1; i >= 0; --i) {
            // I/O multiplexing: when a client has sent a connection request or a data
            // request, execution proceeds; otherwise continue skips this descriptor
            if ((pollFds[i].revents & POLLIN) == 0) {
                continue;
            }
            if (i == 0) {
                // accept a new client connection
                ZygoteConnection newPeer = acceptCommandPeer(abiList);
                peers.add(newPeer);
                fds.add(newPeer.getFileDesciptor());
            } else {
                // handle the client's request
                boolean done = peers.get(i).runOnce();
                if (done) {
                    peers.remove(i);
                    fds.remove(i);
                }
            }
        }
    }
}
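Zygote creates the Java virtual machine and registers the JNI methods, which makes it the true parent of all Java processes and the process from which they are spawned. After creating system_server, Zygote steps back and calls runSelectLoop(), where it waits on standby. When a request to create a new process arrives, it wakes up immediately and does the work.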
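The bookkeeping inside runSelectLoop() can be modelled without real sockets. The sketch below is an assumption-laden simplification: SelectLoopSketch, Peer, and handleReady are hypothetical names, and the boolean array stands in for the POLLIN flags returned by poll.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical model of one pass over the poll results; no real I/O is performed.
class SelectLoopSketch {
    interface Peer {
        boolean runOnce();  // returns true when the connection is finished
    }

    // Index 0 is a null placeholder for the server socket, as in the original code.
    final List<Peer> peers = new ArrayList<>();

    SelectLoopSketch() {
        peers.add(null);
    }

    // ready[i] stands in for (pollFds[i].revents & POLLIN) != 0.
    void handleReady(boolean[] ready, Supplier<Peer> accept) {
        for (int i = ready.length - 1; i >= 0; --i) {
            if (!ready[i]) {
                continue;  // nothing to do for this descriptor
            }
            if (i == 0) {
                peers.add(accept.get());  // new client connection
            } else if (peers.get(i).runOnce()) {
                peers.remove(i);  // finished peers are dropped
            }
        }
    }

    public static void main(String[] args) {
        SelectLoopSketch loop = new SelectLoopSketch();
        Peer finished = () -> true;
        loop.handleReady(new boolean[] {true}, () -> finished);
        loop.handleReady(new boolean[] {false, true}, () -> finished);
        System.out.println("peers left: " + (loop.peers.size() - 1));
    }
}
```

Iterating from the highest index down means that removing entry i never shifts an entry that is still waiting to be visited in the same pass.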
4. system_server
Zygote creates the system_server process with fork. After the fork in startSystemServer() in section [3.5], the child process enters handleSystemServerProcess(), shown below.
4.1 handleSystemServerProcess
[-> ZygoteInit.java]
private static void handleSystemServerProcess(
        ZygoteConnection.Arguments parsedArgs)
        throws ZygoteInit.MethodAndArgsCaller {
    ...
    if (parsedArgs.niceName != null) {
        // set the current process name to "system_server"
        Process.setArgV0(parsedArgs.niceName);
    }
    final String systemServerClasspath = Os.getenv("SYSTEMSERVERCLASSPATH");
    if (systemServerClasspath != null) {
        // run dex optimization, e.g. on services.jar
        performSystemServerDexOpt(systemServerClasspath);
    }
    if (parsedArgs.invokeWith != null) {
        ...
    } else {
        ClassLoader cl = null;
        if (systemServerClasspath != null) {
            cl = new PathClassLoader(systemServerClasspath, ClassLoader.getSystemClassLoader());
            Thread.currentThread().setContextClassLoader(cl);
        }
        // [see section 4.2]
        RuntimeInit.zygoteInit(parsedArgs.targetSdkVersion, parsedArgs.remainingArgs, cl);
    }
}
Here the system_server process creates its PathClassLoader class loader.
4.2 RuntimeInit.zygoteInit
[-> RuntimeInit.java]
public static final void zygoteInit(int targetSdkVersion, String[] argv, ClassLoader classLoader)
        throws ZygoteInit.MethodAndArgsCaller {
    Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "RuntimeInit");
    redirectLogStreams();  // redirect log output
    commonInit();          // common initialization
    nativeZygoteInit();    // zygote initialization
    applicationInit(targetSdkVersion, argv, classLoader);  // [see later in this section]
}
Starting the Binder thread pool
nativeZygoteInit() passes through several layers of calls and ends up in onZygoteInit() in app_main.cpp, which is also where the Binder thread pool is created:
virtual void onZygoteInit() {
    sp<ProcessState> proc = ProcessState::self();
    proc->startThreadPool();  // start the new binder thread pool
}
Catching the special exception
applicationInit() passes through several layers of calls and finally throws ZygoteInit.MethodAndArgsCaller(m, argv). The details are as follows:
private static void applicationInit(int targetSdkVersion, String[] argv, ClassLoader classLoader)
        throws ZygoteInit.MethodAndArgsCaller {
    ...
    VMRuntime.getRuntime().setTargetHeapUtilization(0.75f);
    VMRuntime.getRuntime().setTargetSdkVersion(targetSdkVersion);
    final Arguments args = new Arguments(argv);
    // find the static main() method of the target class
    invokeStaticMain(args.startClass, args.startArgs, classLoader);
}

private static void invokeStaticMain(String className, String[] argv, ClassLoader classLoader)
        throws ZygoteInit.MethodAndArgsCaller {
    // here className is SystemServer
    Class<?> cl = Class.forName(className, true, classLoader);
    Method m = cl.getMethod("main", new Class[] { String[].class });  // TODO: what does the second argument mean?
    // throw the exception, which is also a Runnable
    throw new ZygoteInit.MethodAndArgsCaller(m, argv);
}
This sets the VM's target heap utilization to 0.75 and sets the target SDK version, then throws the exception, which ZygoteInit.main() catches, as described next.
4.3 ZygoteInit.main
[-> ZygoteInit.java]
public static void main(String argv[]) {
    try {
        startSystemServer(abiList, socketName);  // throws MethodAndArgsCaller
        ....
    } catch (MethodAndArgsCaller caller) {
        caller.run();  // invokes SystemServer.main() through reflection [see section 4.4]
    } catch (RuntimeException ex) {
        ...
    }
}

static class MethodAndArgsCaller extends Exception implements Runnable {
    private final Method mMethod;
    private final String[] mArgs;

    public MethodAndArgsCaller(Method method, String[] args) {
        mMethod = method;
        mArgs = args;
    }

    public void run() {
        // run SystemServer.main() (exception handling elided)
        mMethod.invoke(null, new Object[] { mArgs });  // TODO: what does the second argument mean?
    }
}
Throwing an exception unwinds the stack frames that accumulated during initialization, which improves stack utilization. This is why the call stack of every Java process now looks like this:
...
at com.android.server.SystemServer.main(SystemServer.java:175)
at java.lang.reflect.Method.invoke!(Native method)
at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:738)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:628)
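The throw-and-catch trampoline can be reproduced in plain Java. The sketch below is not the framework code: TrampolineSketch, Target, deepInit, and boot are hypothetical stand-ins, and the recursion merely simulates a deep initialization call chain. It also shows what the two arguments flagged in the code above do in standard Java reflection: the second argument of getMethod lists the parameter types of the method being looked up, and wrapping the String[] in an Object[] makes invoke pass the array as one argument instead of spreading its elements.

```java
import java.lang.reflect.Method;

// Hypothetical demonstration of clearing init frames by throwing a Runnable exception.
class TrampolineSketch {
    static class MethodAndArgsCaller extends Exception implements Runnable {
        private final Method method;
        private final String[] args;

        MethodAndArgsCaller(Method method, String[] args) {
            this.method = method;
            this.args = args;
        }

        @Override
        public void run() {
            try {
                // Object[] wrapper: the String[] is passed as a single argument.
                method.invoke(null, new Object[] { args });
            } catch (ReflectiveOperationException e) {
                throw new RuntimeException(e);
            }
        }
    }

    static boolean targetRan = false;
    static boolean targetSawInitFrames = true;

    // Stand-in for SystemServer or ActivityThread.
    public static class Target {
        public static void main(String[] args) {
            targetRan = true;
            targetSawInitFrames = false;
            for (StackTraceElement frame : new Throwable().getStackTrace()) {
                if (frame.getMethodName().equals("deepInit")) {
                    targetSawInitFrames = true;
                }
            }
        }
    }

    // Simulates a deep initialization call chain that ends by throwing.
    static void deepInit(int depth) throws MethodAndArgsCaller, NoSuchMethodException {
        if (depth > 0) {
            deepInit(depth - 1);
            return;
        }
        // String[].class is the parameter type list of main(String[]).
        Method main = Target.class.getMethod("main", String[].class);
        throw new MethodAndArgsCaller(main, new String[0]);
    }

    static void boot() {
        try {
            deepInit(20);
        } catch (MethodAndArgsCaller caller) {
            caller.run();  // the deepInit frames are already gone here
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boot();
        System.out.println("init frames still on stack: " + targetSawInitFrames);
    }
}
```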
4.4 SystemServer.main
[-> SystemServer.java]
public final class SystemServer {
    ...
    public static void main(String[] args) {
        // create the SystemServer object first, then call its run() method
        new SystemServer().run();
    }
}
4.5 SystemServer.run
[-> SystemServer.java]
private void run() {
    if (System.currentTimeMillis() < EARLIEST_SUPPORTED_TIME) {
        Slog.w(TAG, "System clock is before 1970; setting to 1970.");
        SystemClock.setCurrentTimeMillis(EARLIEST_SUPPORTED_TIME);
    }
    ...
    Slog.i(TAG, "Entered the Android system server!");
    EventLog.writeEvent(EventLogTags.BOOT_PROGRESS_SYSTEM_RUN, SystemClock.uptimeMillis());

    Looper.prepareMainLooper();  // prepare the main thread's looper

    // load the android_servers.so library; its sources live under frameworks/base/services/
    // TODO: what is this library for?
    System.loadLibrary("android_servers");

    // check whether the previous shutdown failed; this call may not return
    performPendingShutdown();

    createSystemContext();  // initialize the system context

    // create the system service manager
    mSystemServiceManager = new SystemServiceManager(mSystemContext);
    LocalServices.addService(SystemServiceManager.class, mSystemServiceManager);

    // start the system services
    try {
        startBootstrapServices();  // start the bootstrap services
        startCoreServices();       // start the core services
        startOtherServices();      // start the other services [see section 4.6]
    } catch (Throwable ex) {
        Slog.e("System", "************ Failure starting system services", ex);
        throw ex;
    }

    // loop forever
    Looper.loop();
    throw new RuntimeException("Main thread loop unexpectedly exited");
}
4.6 Service startup
public final class SystemServer {

    private void startBootstrapServices() {
        ...
        // phase100
        mSystemServiceManager.startBootPhase(SystemService.PHASE_WAIT_FOR_DEFAULT_DISPLAY);
        ...
    }

    private void startOtherServices() {
        ...
        // phase480 and phase500
        mSystemServiceManager.startBootPhase(SystemService.PHASE_LOCK_SETTINGS_READY);
        mSystemServiceManager.startBootPhase(SystemService.PHASE_SYSTEM_SERVICES_READY);
        ...
        // [see section 4.7]
        mActivityManagerService.systemReady(new Runnable() {
            @Override
            public void run() {
                // phase550
                mSystemServiceManager.startBootPhase(SystemService.PHASE_ACTIVITY_MANAGER_READY);
                ...
                // phase600
                mSystemServiceManager.startBootPhase(SystemService.PHASE_THIRD_PARTY_APPS_CAN_START);
            }
        });
    }
}
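start: create AMS (ActivityManagerService), PMS (PowerManagerService), LightsService, and DMS (DisplayManagerService).
phase100: enter phase 100; create PKMS (PackageManagerService), WMS (WindowManagerService), IMS (InputManagerService), DBMS, LockSettingsService, JobSchedulerService, MmsService, and other services.
phase480 && 500: enter phase 480 and then phase 500; call systemReady() on four services: WMS, PMS, PKMS, and DisplayManagerService.
phase550: enter phase 550; run AMS.systemReady(); start SystemUI, WebViewFactory, and Watchdog.
phase600: enter phase 600; run AMS.systemReady(); run systemRunning() on the various services.
phase1000: enter phase 1000; run finishBooting and start the on-hold processes. //TODO: what is an on-hold process?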
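A minimal sketch of how boot phases could be broadcast to registered services is shown below. The phase numbers are the ones listed above (the name PHASE_BOOT_COMPLETED for phase 1000 is taken from the framework and does not appear in this article's code). Everything else is an assumption for illustration: BootPhaseSketch and its Service interface are hypothetical, and the ordering check is a guess at a sensible safeguard rather than a statement about the framework's behaviour.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of dispatching boot phases to services in ascending order.
class BootPhaseSketch {
    static final int PHASE_WAIT_FOR_DEFAULT_DISPLAY = 100;
    static final int PHASE_LOCK_SETTINGS_READY = 480;
    static final int PHASE_SYSTEM_SERVICES_READY = 500;
    static final int PHASE_ACTIVITY_MANAGER_READY = 550;
    static final int PHASE_THIRD_PARTY_APPS_CAN_START = 600;
    static final int PHASE_BOOT_COMPLETED = 1000;

    interface Service {
        void onBootPhase(int phase);
    }

    private final List<Service> services = new ArrayList<>();
    private int currentPhase = -1;

    void startService(Service service) {
        services.add(service);
    }

    // Notifies every registered service; phases must strictly increase (assumed rule).
    void startBootPhase(int phase) {
        if (phase <= currentPhase) {
            throw new IllegalArgumentException("Next phase must be larger than previous");
        }
        currentPhase = phase;
        for (Service service : services) {
            service.onBootPhase(phase);
        }
    }

    public static void main(String[] args) {
        BootPhaseSketch manager = new BootPhaseSketch();
        manager.startService(phase -> System.out.println("service saw phase " + phase));
        manager.startBootPhase(PHASE_WAIT_FOR_DEFAULT_DISPLAY);
        manager.startBootPhase(PHASE_LOCK_SETTINGS_READY);
        manager.startBootPhase(PHASE_SYSTEM_SERVICES_READY);
    }
}
```

A service registered before a phase starts receives that phase and every later one, which is why services created early can defer work until, for example, phase 500.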
4.7 AMS.systemReady
public final class ActivityManagerService extends ActivityManagerNative
        implements Watchdog.Monitor, BatteryStatsImpl.BatteryCallback {

    public void systemReady(final Runnable goingCallback) {
        ...  // update-related work
        mSystemReady = true;
        // kill all non-persistent processes
        removeProcessLocked(proc, true, false, "system update done");
        mProcessesReady = true;

        goingCallback.run();  // [the Runnable passed in section 4.6]

        addAppLocked(info, false, null);  // start all persistent processes
        mBooting = true;
        // start home
        startHomeActivityLocked(mCurrentUserId, "systemReady");
        // resume the Activity at the top of the stack
        mStackSupervisor.resumeTopActivitiesLocked();
    }
}
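With that, the startup work on the system_server main thread is finally complete. The thread enters Looper.loop() and waits for other threads to send it messages through a handler.
5. app
An ordinary app process starts in much the same way as system_server. The difference is that the requesting app sends a message to system_server, and system_server then asks zygote to create the process.
As explained in "Understanding the Android process creation flow", once the process has been created it proceeds to ActivityThread.main().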
5.1 ActivityThread.main
public static void main(String[] args) {
    ...
    Environment.initForCurrentUser();
    ...
    Process.setArgV0("<pre-initialized>");
    // create the main thread's looper
    Looper.prepareMainLooper();
    ActivityThread thread = new ActivityThread();
    thread.attach(false);  // attach to the system process
    if (sMainThreadHandler == null) {
        sMainThreadHandler = thread.getHandler();
    }
    // the main thread enters its loop
    Looper.loop();
    throw new RuntimeException("Main thread loop unexpectedly exited");
}
5.2 Call stack comparison
The bottom of the main thread's call stack in an app process looks like this:
...
at android.app.ActivityThread.main(ActivityThread.java:5442)
at java.lang.reflect.Method.invoke!(Native method)
at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:738)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:628)
Compare it with the system_server call stack shown earlier:
at com.android.server.SystemServer.main(SystemServer.java:175)
at java.lang.reflect.Method.invoke!(Native method)
at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:738)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:628)
6. Boot log analysis
The following lists several important processes started during boot, with their pids, followed by the logs printed at key points:
/system/bin/vold: 383
/system/bin/lmkd: 432
/system/bin/surfaceflinger: 434
/system/bin/debuggerd64: 537
/system/bin/mediaserver: 540
/system/bin/installd: 541
/system/vendor/bin/thermal-engine: 552
zygote64: 557
zygote: 558
system_server: 1274
6.1 Logs before zygote
// vold starts, then lists the file systems the system supports; this runs main() in system/vold/main.cpp
11-23 14:36:47.474 383 383 I vold : Vold 3.0 (the awakening) firing up
11-23 14:36:47.475 383 383 V vold : Detected support for: ext4 vfat
// the in-kernel lmk policy is used
11-23 14:36:47.927 432 432 I lowmemorykiller: Using in-kernel low memory killer interface
// SurfaceFlinger starts
11-23 14:36:48.041 434 434 I SurfaceFlinger: SurfaceFlinger is starting
11-23 14:36:48.042 434 434 I SurfaceFlinger: SurfaceFlinger's main thread ready to run. Initializing graphics H/W...
// boot animation
11-23 14:36:48.583 508 508 I BootAnimation: bootanimation launching ...
// debuggerd
11-23 14:36:50.306 537 537 I : debuggerd: starting
// installd starts
11-23 14:36:50.311 541 541 I installd: installd firing up
// thermal daemon
11-23 14:36:50.369 552 552 I ThermalEngine: Thermal daemon started
6.2 zygote logs
// Zygote64 process (Zygote): AndroidRuntime::start
11-23 14:36:51.260 557 557 D AndroidRuntime: >>>>>> START com.android.internal.os.ZygoteInit uid 0 <<<<<<
// Zygote64 process: AndroidRuntime::startVm
11-23 14:36:51.304 557 557 D AndroidRuntime: CheckJNI is OFF
// ZygoteInit.preload() runs
11-23 14:36:52.134 557 557 D Zygote : begin preload
// ZygoteInit.preloadClasses() runs: 3860 classes preloaded in 746ms
11-23 14:36:52.134 557 557 I Zygote : Preloading classes...
11-23 14:36:52.881 557 557 I Zygote : ...preloaded 3860 classes in 746ms.
// ZygoteInit.preloadResources() runs: 86 groups of resources preloaded in 179ms
11-23 14:36:53.114 557 557 I Zygote : Preloading resources...
11-23 14:36:53.293 557 557 I Zygote : ...preloaded 86 resources in 179ms.
// ZygoteInit.preloadSharedLibraries() runs
11-23 14:36:53.494 557 557 I Zygote : Preloading shared libraries...
11-23 14:36:53.503 557 557 D Zygote : end preload
// com_android_internal_os_Zygote_nativeForkSystemServer() runs and forks system_server successfully
11-23 14:36:53.544 557 557 I Zygote : System server process 1274 has been created
// Zygote enters runSelectLoop()
11-23 14:36:53.546 557 557 I Zygote : Accepting command socket connections
6.3 system_server logs
// system_server is entered and sets up the socket channel to the Zygote process
11-23 14:36:53.586 1274 1274 I Zygote : Process: zygote socket opened, supported ABIS: armeabi-v7a,armeabi
// SystemServer.run() runs
11-23 14:36:53.618 1274 1274 I SystemServer: Entered the Android system server! <===> boot_progress_system_run
// wait for installd to become ready
11-23 14:36:53.707 1274 1274 I Installer: Waiting for installd to be ready.
// services start
11-23 14:36:53.732 1274 1274 I ActivityManager: Memory class: 192
// phase100
11-23 14:36:53.883 1274 1274 I SystemServiceManager: Starting phase 100
11-23 14:36:53.902 1274 1274 I SystemServer: Package Manager
11-23 14:37:03.816 1274 1274 I SystemServer: User Service
...
11-23 14:37:03.940 1274 1274 I SystemServer: Init Watchdog
11-23 14:37:03.941 1274 1274 I SystemServer: Input Manager
11-23 14:37:03.946 1274 1274 I SystemServer: Window Manager
...
11-23 14:37:04.081 1274 1274 I SystemServiceManager: Starting com.android.server.MountService$Lifecycle
11-23 14:37:04.088 1274 2717 D MountService: Thinking about reset, mSystemReady=false, mDaemonConnected=true
11-23 14:37:04.088 1274 1274 I SystemServiceManager: Starting com.android.server.UiModeManagerService
11-23 14:37:04.520 1274 1274 I SystemServer: NetworkTimeUpdateService
// phase480 && 500
11-23 14:37:05.056 1274 1274 I SystemServiceManager: Starting phase 480
11-23 14:37:05.061 1274 1274 I SystemServiceManager: Starting phase 500
11-23 14:37:05.231 1274 1274 I ActivityManager: System now ready <==> boot_progress_ams_ready
11-23 14:37:05.234 1274 1274 I SystemServer: Making services ready
11-23 14:37:05.243 1274 1274 I SystemServer: WebViewFactory preparation
// phase550
11-23 14:37:05.234 1274 1274 I SystemServiceManager: Starting phase 550
11-23 14:37:05.237 1274 1288 I ActivityManager: Force stopping com.android.providers.media appid=10010 user=-1: vold reset
// phase600
11-23 14:37:06.066 1274 1274 I SystemServiceManager: Starting phase 600
11-23 14:37:06.236 1274 1274 D MountService: onStartUser 0
6.4 logcat tips
Capture the logs with adb bugreport. First check whether zygote came up, then how the system_server main thread is doing, then the state of ActivityManager.
adb logcat -s Zygote
adb logcat -s SystemServer
adb logcat -s SystemServiceManager
adb logcat | grep -E "1274 +1274"    // system_server main thread (pid == tid; 1274 is its pid in the logs above)
adb logcat -s ActivityManager
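When the logs have been saved to a file, the same main-thread filter can be applied offline. The sketch below assumes the line layout shown in the logs above (date, time, pid, tid, level, tag, message); the class LogcatLineSketch and its methods are hypothetical helpers, not part of any Android tool.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for log lines laid out as: date time pid tid level tag: message
class LogcatLineSketch {
    private static final Pattern LINE = Pattern.compile(
            "^(\\d{2}-\\d{2})\\s+(\\d{2}:\\d{2}:\\d{2}\\.\\d{3})\\s+(\\d+)\\s+(\\d+)"
            + "\\s+([VDIWEF])\\s+(.*?)\\s*:\\s(.*)$");

    static final class Entry {
        final int pid;
        final int tid;
        final String level;
        final String tag;
        final String message;

        Entry(int pid, int tid, String level, String tag, String message) {
            this.pid = pid;
            this.tid = tid;
            this.level = level;
            this.tag = tag;
            this.message = message;
        }
    }

    // Returns null when the line does not match the assumed layout.
    static Entry parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new Entry(Integer.parseInt(m.group(3)), Integer.parseInt(m.group(4)),
                m.group(5), m.group(6), m.group(7));
    }

    // A line belongs to the main thread when its tid equals the process pid.
    static boolean isMainThreadOf(String line, int pid) {
        Entry entry = parse(line);
        return entry != null && entry.pid == pid && entry.tid == pid;
    }

    public static void main(String[] args) {
        String line = "11-23 14:36:53.883 1274 1274 I SystemServiceManager: Starting phase 100";
        Entry entry = parse(line);
        System.out.println(entry.tag + " -> " + entry.message);
        System.out.println(isMainThreadOf(line, 1274));
    }
}
```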
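On-device debugging commands
cat /proc/[pid]/stack    // show the kernel call stack
debuggerd -b [pid]       // -b may be omitted, in which case the output goes straight to /data/tombstones/
kill -3 [pid]            // generate /data/anr/traces.txt
lsof [pid]               // list the files opened by the process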
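7. Summary
After the core processes start, each enters the main() method of its corresponding class, as follows.
7.1 Process main methods
Process                  Main method
---------------------------------------
init process             Init.main()
zygote process           ZygoteInit.main()
app_process process      RuntimeInit.main()
system_server process    SystemServer.main()
app process              ActivityThread.main()
Note that an app_process process here means a process started through /system/bin/app_process whose arguments do not include --zygote, that is, one that does not start the zygote process. Running commands such as am and pm from adb shell is a common example.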
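7.2 Restarting related processes
When one of the key processes restarts, it triggers restarts of the following associated processes:
zygote: triggers restarts of media, netd, and its child processes (including system_server);
system_server: triggers a zygote restart;
surfaceflinger: triggers a zygote restart;
servicemanager: triggers restarts of zygote, healthd, media, surfaceflinger, and drm.
So killing surfaceflinger, servicemanager, system_server, or zygote itself will in every case cause Zygote to restart.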
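The restart relationships listed above can be written down as a small lookup table, which makes the final conclusion easy to check. The sketch below only encodes the list from this section; the class RestartGraphSketch and the method directCausesOf are hypothetical names.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical encoding of the restart list in section 7.2.
class RestartGraphSketch {
    // process -> processes restarted as a direct consequence of its restart
    static final Map<String, List<String>> TRIGGERS = new LinkedHashMap<>();

    static {
        TRIGGERS.put("zygote", Arrays.asList("media", "netd", "system_server"));
        TRIGGERS.put("system_server", Arrays.asList("zygote"));
        TRIGGERS.put("surfaceflinger", Arrays.asList("zygote"));
        TRIGGERS.put("servicemanager",
                Arrays.asList("zygote", "healthd", "media", "surfaceflinger", "drm"));
    }

    // Processes whose restart directly triggers a restart of the given target.
    static SortedSet<String> directCausesOf(String target) {
        SortedSet<String> causes = new TreeSet<>();
        for (Map.Entry<String, List<String>> entry : TRIGGERS.entrySet()) {
            if (entry.getValue().contains(target)) {
                causes.add(entry.getKey());
            }
        }
        return causes;
    }

    public static void main(String[] args) {
        System.out.println(directCausesOf("zygote"));
    }
}
```

The lookup returns the other processes only; zygote being killed itself is the fourth case named in the text.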
Reference:
Android System Boot - Overview (Android系统启动-综述): http://gityuan.com/2016/02/01/android-booting/
posted on 2023-05-04 00:50 Hello-World3