温暖的电波  

Background

The cpuset, cpu, and cpuacct cgroup subsystems are mounted at different paths in two environments, so the same operations code fails when it is run in both.

Environment 1: all three subsystems are mounted in a single directory, /sys/fs/cgroup/cpuset,cpu,cpuacct

Environment 2: the three subsystems are split across two directories. cpu and cpuacct are in /sys/fs/cgroup/cpu,cpuacct, while cpuset is in /sys/fs/cgroup/cpuset
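
One way for operations code to cope with both layouts is to look up the mount point at runtime instead of hard-coding it. The following is only a minimal sketch, not the fix used by the author. It assumes cgroup v1 and that /proc/mounts lists each hierarchy with its controllers among the mount options. The function names has_option, cgroup_mount_from_line, and find_cgroup_mount are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return 1 if the comma-separated list `opts`
 * contains `name` as a whole item (so "cpu" does not match "cpuacct"). */
static int has_option(const char *opts, const char *name) {
        size_t n = strlen(name);
        const char *p = opts;

        while (*p) {
                size_t len = strcspn(p, ",");
                if (len == n && strncmp(p, name, n) == 0)
                        return 1;
                p += len;
                if (*p == ',')
                        p++;
        }
        return 0;
}

/* Hypothetical helper: parse one /proc/mounts line
 * ("device mountpoint fstype options dump pass"). If it is a cgroup (v1)
 * mount carrying `controller`, copy the mount point to `out` and return 1;
 * otherwise return 0. */
int cgroup_mount_from_line(const char *line, const char *controller,
                           char *out, size_t outlen) {
        char where[256], type[32], opts[256];

        assert(line && controller && out && outlen > 0);

        if (sscanf(line, "%*s %255s %31s %255s", where, type, opts) != 3)
                return 0;
        if (strcmp(type, "cgroup") != 0 || !has_option(opts, controller))
                return 0;

        snprintf(out, outlen, "%s", where);
        return 1;
}

/* Hypothetical helper: scan /proc/mounts for the hierarchy that carries
 * `controller`. Returns 1 and fills `out` on success, 0 otherwise. */
int find_cgroup_mount(const char *controller, char *out, size_t outlen) {
        char line[1024];
        int found = 0;
        FILE *f = fopen("/proc/mounts", "r");

        if (!f)
                return 0;
        while (!found && fgets(line, sizeof(line), f))
                found = cgroup_mount_from_line(line, controller, out, outlen);
        fclose(f);
        return found;
}
```

With this approach, calling find_cgroup_mount("cpuset", buf, sizeof(buf)) would be expected to yield whichever directory the running system actually uses, so the calling code no longer needs to know whether cpuset was co-mounted with cpu and cpuacct.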

Root cause investigation

Step 1: find out who mounts the cgroups. After consulting several sources, I learned that on CentOS the cgroup hierarchies are mounted by systemd, which calls mount_cgroup_controllers.

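Step 2: analyze why the mounts differ. Both environments run CentOS, but their systemd versions differ. Going through the systemd source and commit history shows that the following change introduced the difference: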

commit a07fdfa376add41d9101d39db25fb2ecb17d5fca
Author: Lennart Poettering <lennart@poettering.net>
Date:   Mon Sep 24 11:35:51 2012 +0200

    main: don't try to mout cpuset with cpu+cpuacct anymore
    
    Turns out cpuset needs explicit initialization before we could make use
    of it. Thus mounting cpuset with cpu/cpuacct would make it impossible to
    just create a group in "cpu" and start it.

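According to the commit message, the earlier code mounted cpuset together with cpu and cpuacct in one directory, and this commit separates the cpuset mount path from that of cpu and cpuacct. Let's look at the patch: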

diff --git a/src/core/main.c b/src/core/main.c
index 04fc0b3b59..f9aba46b58 100644
--- a/src/core/main.c
+++ b/src/core/main.c
@@ -1227,6 +1227,28 @@ static void test_cgroups(void) {
         sleep(10);
 }
 
+static int initialize_join_controllers(void) {
+        /* By default, mount "cpu" + "cpuacct" together, and "net_cls"
+         * + "net_prio". We'd like to add "cpuset" to the mix, but
+         * "cpuset" does't really work for groups with no initialized
+         * attributes. */
+
+        arg_join_controllers = new(char**, 3);
+        if (!arg_join_controllers)
+                return -ENOMEM;
+
+        arg_join_controllers[0] = strv_new("cpu", "cpuacct", NULL);
+        if (!arg_join_controllers[0])
+                return -ENOMEM;
+
+        arg_join_controllers[1] = strv_new("net_cls", "net_prio", NULL);
+        if (!arg_join_controllers[1])
+                return -ENOMEM;
+
+        arg_join_controllers[2] = NULL;
+        return 0;
+}
+
 int main(int argc, char *argv[]) {
         Manager *m = NULL;
         int r, retval = EXIT_FAILURE;
@@ -1371,16 +1393,8 @@ int main(int argc, char *argv[]) {
                 goto finish;
         }

-        /* By default, mount "cpu" and "cpuacct" together */
-        arg_join_controllers = new(char**, 3);
-        if (!arg_join_controllers)
-                goto finish;
-
-        arg_join_controllers[0] = strv_new("cpu", "cpuacct", "cpuset", NULL);
-        arg_join_controllers[1] = strv_new("net_cls", "net_prio", NULL);
-        arg_join_controllers[2] = NULL;
-
-        if (!arg_join_controllers[0])
+        r = initialize_join_controllers();
+        if (r < 0)
                 goto finish;

The patch replaces the joined group "cpu", "cpuacct", "cpuset" with "cpu", "cpuacct", so in newer versions cpuset is no longer mounted at the same path as cpu and cpuacct.
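
The effect of the patch can be illustrated with a small standalone sketch. The two tables below copy the controller lists from the removed and added lines of the diff. Everything else is hypothetical and is not systemd code: the names old_join, new_join, find_join_group, and mounted_together, and the use of static arrays in place of strv_new().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Controller groups taken from the diff: the first is the removed line,
 * the second the added one. The net_cls/net_prio group is unchanged. */
static const char *const old_cpu[] = { "cpu", "cpuacct", "cpuset", NULL };
static const char *const new_cpu[] = { "cpu", "cpuacct", NULL };
static const char *const net[] = { "net_cls", "net_prio", NULL };

/* Hypothetical stand-ins for arg_join_controllers before and after. */
const char *const *const old_join[] = { old_cpu, net, NULL };
const char *const *const new_join[] = { new_cpu, net, NULL };

/* Return the join group that lists `controller`, or NULL if the
 * controller is in no group (it then gets a hierarchy of its own). */
const char *const *find_join_group(const char *const *const *groups,
                                   const char *controller) {
        assert(groups && controller);

        for (; *groups; groups++)
                for (const char *const *c = *groups; *c; c++)
                        if (strcmp(*c, controller) == 0)
                                return *groups;
        return NULL;
}

/* Return 1 if `a` and `b` are listed in the same join group. */
int mounted_together(const char *const *const *groups,
                     const char *a, const char *b) {
        const char *const *ga = find_join_group(groups, a);

        return ga != NULL && ga == find_join_group(groups, b);
}
```

Under these assumptions, mounted_together(old_join, "cpu", "cpuset") is true while mounted_together(new_join, "cpu", "cpuset") is false, which matches the difference between the two environments.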

So what does the cgroup mounting flow in systemd actually look like? Apart from the joined mounts described above, how are the other subsystems mounted?

cgroup mounting flow in systemd

The basic flow of cgroup mounting in systemd is as follows:

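main
    //[1] Initialize the subsystems to be mounted together; currently cpu with cpuacct, and net_cls with net_prio
    initialize_join_controllers()

    //[2] Mount each filesystem listed in mount_table[]: procfs, sysfs, devtmpfs, tmpfs on /dev/shm, tmpfs on /sys/fs/cgroup, and cgroup on /sys/fs/cgroup/systemd
    mount_setup

    //[3] Read all subsystems from /proc/cgroups, then mount them along with the groups joined in step [1]
    mount_cgroup_controllers
        // Read the available subsystems from /proc/cgroups into controllers
        cg_kernel_controllers
        // Iterate over and mount every subsystem, including the joined groups
        mount_one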
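
Step [3] of the flow above can be approximated in a few lines. This is a rough sketch under stated assumptions, not systemd's implementation: it assumes the usual /proc/cgroups layout (a header line starting with '#', then name, hierarchy, num_cgroups, enabled), and it assumes that a joined group is passed to mount as a comma-separated option string, as is usual for cgroup v1. The function names parse_proc_cgroups_line and join_controllers are hypothetical, and the exact directory name chosen on a given distribution may differ.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: parse one /proc/cgroups line. Returns 1 and copies
 * the controller name to `name` if the line describes an enabled
 * controller; returns 0 for the header, malformed, or disabled entries. */
int parse_proc_cgroups_line(const char *line, char *name, size_t namelen) {
        char buf[64];
        int enabled = 0;

        assert(line && name && namelen > 0);

        if (line[0] == '#')
                return 0;
        if (sscanf(line, "%63s %*d %*d %d", buf, &enabled) != 2 || !enabled)
                return 0;

        snprintf(name, namelen, "%s", buf);
        return 1;
}

/* Hypothetical helper: join a NULL-terminated controller list with commas,
 * for example {"cpu", "cpuacct"} becomes "cpu,cpuacct". Returns 0 on
 * success and -1 if `out` is too small. */
int join_controllers(const char *const *group, char *out, size_t outlen) {
        size_t used = 0;

        assert(group && out && outlen > 0);

        out[0] = '\0';
        for (; *group; group++) {
                int n = snprintf(out + used, outlen - used, "%s%s",
                                 used ? "," : "", *group);
                if (n < 0 || (size_t) n >= outlen - used)
                        return -1;
                used += (size_t) n;
        }
        return 0;
}
```

A caller would feed each line of /proc/cgroups to parse_proc_cgroups_line, check whether the resulting controller belongs to a joined group, and mount either the joined option string or the single controller name, skipping any that has already been mounted as part of a group.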

 

posted on 2024-04-30 08:06 by 温暖的电波