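Kubernetes ControllerManager Source Code Walkthrough

The Kubernetes ControllerManager manages the various controllers, and it is also the entry point for reading each controller's source code. Before diving into the ControllerManager source, it helps to take a quick look at which controllers exist in the latest version.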

// First add "special" controllers that aren't initialized normally. These controllers cannot be initialized
    // in the main controller loop initialization, so we add them here only for the metadata and duplication detection.
    // app.ControllerDescriptor#RequiresSpecialHandling should return true for such controllers
    // The only known special case is the ServiceAccountTokenController which *must* be started
    // first to ensure that the SA tokens for future controllers will exist. Think very carefully before adding new
    // special controllers.
    register(newServiceAccountTokenControllerDescriptor(nil))

    register(newEndpointsControllerDescriptor())
    register(newEndpointSliceControllerDescriptor())
    register(newEndpointSliceMirroringControllerDescriptor())
    register(newReplicationControllerDescriptor())
    register(newPodGarbageCollectorControllerDescriptor())
    register(newResourceQuotaControllerDescriptor())
    register(newNamespaceControllerDescriptor())
    register(newServiceAccountControllerDescriptor())
    register(newGarbageCollectorControllerDescriptor())
    register(newDaemonSetControllerDescriptor())
    register(newJobControllerDescriptor())
    register(newDeploymentControllerDescriptor())
    register(newReplicaSetControllerDescriptor())
    register(newHorizontalPodAutoscalerControllerDescriptor())
    register(newDisruptionControllerDescriptor())
    register(newStatefulSetControllerDescriptor())
    register(newCronJobControllerDescriptor())
    register(newCertificateSigningRequestSigningControllerDescriptor())
    register(newCertificateSigningRequestApprovingControllerDescriptor())
    register(newCertificateSigningRequestCleanerControllerDescriptor())
    register(newTTLControllerDescriptor())
    register(newBootstrapSignerControllerDescriptor())
    register(newTokenCleanerControllerDescriptor())
    register(newNodeIpamControllerDescriptor())
    register(newNodeLifecycleControllerDescriptor())

    register(newServiceLBControllerDescriptor())          // cloud provider controller
    register(newNodeRouteControllerDescriptor())          // cloud provider controller
    register(newCloudNodeLifecycleControllerDescriptor()) // cloud provider controller
    // TODO: persistent volume controllers into the IncludeCloudLoops only set as a cloud provider controller.

    register(newPersistentVolumeBinderControllerDescriptor())
    register(newPersistentVolumeAttachDetachControllerDescriptor())
    register(newPersistentVolumeExpanderControllerDescriptor())
    register(newClusterRoleAggregrationControllerDescriptor())
    register(newPersistentVolumeClaimProtectionControllerDescriptor())
    register(newPersistentVolumeProtectionControllerDescriptor())
    register(newTTLAfterFinishedControllerDescriptor())
    register(newRootCACertificatePublisherControllerDescriptor())
    register(newEphemeralVolumeControllerDescriptor())

    // feature gated
    register(newStorageVersionGarbageCollectorControllerDescriptor())
    register(newResourceClaimControllerDescriptor())
    register(newLegacyServiceAccountTokenCleanerControllerDescriptor())
    register(newValidatingAdmissionPolicyStatusControllerDescriptor())
    register(newTaintEvictionControllerDescriptor())
    register(newServiceCIDRsControllerDescriptor())
    register(newStorageVersionMigratorControllerDescriptor())

As you can see, the blank lines in the source roughly group these controllers into categories.

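  1. First comes the ServiceAccountTokenController, which handles service account tokens; every other controller needs credentials before it can do anything.
  2. Next are the core controllers, such as Namespace, ResourceQuota, and Deployment.
  3. Then come the cloud provider controllers (Service load balancer, node routes, cloud node lifecycle).
  4. After that are mostly storage-related controllers.
  5. Finally, there are the feature-gated controllers for newer features.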
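The register calls above feed a name-keyed map, and the source comment mentions "duplication detection". The following is a minimal sketch of that idea, not the actual kube-controller-manager code: the `descriptor` type, `newRegistry`, and the returned errors are all hypothetical simplifications, and the controller names are only illustrative.

```go
package main

import (
	"errors"
	"fmt"
)

// descriptor is a hypothetical, stripped-down stand-in for ControllerDescriptor.
type descriptor struct {
	name    string
	aliases []string
}

// newRegistry returns the name-keyed map together with a register closure
// that rejects empty names, duplicate names, and duplicate aliases.
func newRegistry() (map[string]*descriptor, func(*descriptor) error) {
	controllers := map[string]*descriptor{}
	seenAliases := map[string]bool{}

	register := func(d *descriptor) error {
		if d.name == "" {
			return errors.New("controller name must not be empty")
		}
		if _, dup := controllers[d.name]; dup {
			return fmt.Errorf("controller %q registered twice", d.name)
		}
		for _, a := range d.aliases {
			if seenAliases[a] {
				return fmt.Errorf("alias %q of controller %q is already in use", a, d.name)
			}
		}
		// Only record anything once every check has passed.
		for _, a := range d.aliases {
			seenAliases[a] = true
		}
		controllers[d.name] = d
		return nil
	}
	return controllers, register
}

func main() {
	controllers, register := newRegistry()
	fmt.Println(register(&descriptor{name: "deployment-controller", aliases: []string{"deployment"}}))
	fmt.Println(register(&descriptor{name: "deployment-controller"})) // rejected as a duplicate
	fmt.Println(len(controllers))
}
```

Keying the map by name is also what lets the caller later overwrite or delete a single entry, as Run does for the ServiceAccountTokenController.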

If a particular controller interests you, you can jump in from here. For an example, see my previous post, Kubernetes DeploymentController Source Code Walkthrough.

 


Now for the main part of this post: let's look at the ControllerManager source.

Starting from the main entry point, the process is brought up by the Run function:

// Run runs the KubeControllerManagerOptions.
func Run(ctx context.Context, c *config.CompletedConfig) error {
    logger := klog.FromContext(ctx)
    stopCh := ctx.Done()

    // To help debugging, immediately log version
    logger.Info("Starting", "version", version.Get())

    logger.Info("Golang settings", "GOGC", os.Getenv("GOGC"), "GOMAXPROCS", os.Getenv("GOMAXPROCS"), "GOTRACEBACK", os.Getenv("GOTRACEBACK"))

    // Start events processing pipeline.
    c.EventBroadcaster.StartStructuredLogging(0)
    c.EventBroadcaster.StartRecordingToSink(&v1core.EventSinkImpl{Interface: c.Client.CoreV1().Events("")})
    defer c.EventBroadcaster.Shutdown()

    if cfgz, err := configz.New(ConfigzName); err == nil {
        cfgz.Set(c.ComponentConfig)
    } else {
        logger.Error(err, "Unable to register configz")
    }

    // Setup any healthz checks we will want to use.
    var checks []healthz.HealthChecker
    var electionChecker *leaderelection.HealthzAdaptor
    if c.ComponentConfig.Generic.LeaderElection.LeaderElect {
        electionChecker = leaderelection.NewLeaderHealthzAdaptor(time.Second * 20)
        checks = append(checks, electionChecker)
    }
    healthzHandler := controllerhealthz.NewMutableHealthzHandler(checks...)

    // Start the controller manager HTTP server
    // unsecuredMux is the handler for these controller *after* authn/authz filters have been applied
    var unsecuredMux *mux.PathRecorderMux
    if c.SecureServing != nil {
        unsecuredMux = genericcontrollermanager.NewBaseHandler(&c.ComponentConfig.Generic.Debugging, healthzHandler)
        slis.SLIMetricsWithReset{}.Install(unsecuredMux)

        handler := genericcontrollermanager.BuildHandlerChain(unsecuredMux, &c.Authorization, &c.Authentication)
        // TODO: handle stoppedCh and listenerStoppedCh returned by c.SecureServing.Serve
        if _, _, err := c.SecureServing.Serve(handler, 0, stopCh); err != nil {
            return err
        }
    }

    clientBuilder, rootClientBuilder := createClientBuilders(logger, c)

    saTokenControllerDescriptor := newServiceAccountTokenControllerDescriptor(rootClientBuilder)

    run := func(ctx context.Context, controllerDescriptors map[string]*ControllerDescriptor) {
        controllerContext, err := CreateControllerContext(ctx, c, rootClientBuilder, clientBuilder)
        if err != nil {
            logger.Error(err, "Error building controller context")
            klog.FlushAndExit(klog.ExitFlushTimeout, 1)
        }

        if err := StartControllers(ctx, controllerContext, controllerDescriptors, unsecuredMux, healthzHandler); err != nil {
            logger.Error(err, "Error starting controllers")
            klog.FlushAndExit(klog.ExitFlushTimeout, 1)
        }

        controllerContext.InformerFactory.Start(stopCh)
        controllerContext.ObjectOrMetadataInformerFactory.Start(stopCh)
        close(controllerContext.InformersStarted)

        <-ctx.Done()
    }

    // No leader election, run directly
    if !c.ComponentConfig.Generic.LeaderElection.LeaderElect {
        controllerDescriptors := NewControllerDescriptors()
        controllerDescriptors[names.ServiceAccountTokenController] = saTokenControllerDescriptor
        run(ctx, controllerDescriptors)
        return nil
    }

    id, err := os.Hostname()
    if err != nil {
        return err
    }

    // add a uniquifier so that two processes on the same host don't accidentally both become active
    id = id + "_" + string(uuid.NewUUID())

    // leaderMigrator will be non-nil if and only if Leader Migration is enabled.
    var leaderMigrator *leadermigration.LeaderMigrator = nil

    // If leader migration is enabled, create the LeaderMigrator and prepare for migration
    if leadermigration.Enabled(&c.ComponentConfig.Generic) {
        logger.Info("starting leader migration")

        leaderMigrator = leadermigration.NewLeaderMigrator(&c.ComponentConfig.Generic.LeaderMigration,
            "kube-controller-manager")

        // startSATokenControllerInit is the original InitFunc.
        startSATokenControllerInit := saTokenControllerDescriptor.GetInitFunc()

        // Wrap saTokenControllerDescriptor to signal readiness for migration after starting
        //  the controller.
        saTokenControllerDescriptor.initFunc = func(ctx context.Context, controllerContext ControllerContext, controllerName string) (controller.Interface, bool, error) {
            defer close(leaderMigrator.MigrationReady)
            return startSATokenControllerInit(ctx, controllerContext, controllerName)
        }
    }

    // Start the main lock
    go leaderElectAndRun(ctx, c, id, electionChecker,
        c.ComponentConfig.Generic.LeaderElection.ResourceLock,
        c.ComponentConfig.Generic.LeaderElection.ResourceName,
        leaderelection.LeaderCallbacks{
            OnStartedLeading: func(ctx context.Context) {
                controllerDescriptors := NewControllerDescriptors()
                if leaderMigrator != nil {
                    // If leader migration is enabled, we should start only non-migrated controllers
                    //  for the main lock.
                    controllerDescriptors = filteredControllerDescriptors(controllerDescriptors, leaderMigrator.FilterFunc, leadermigration.ControllerNonMigrated)
                    logger.Info("leader migration: starting main controllers.")
                }
                controllerDescriptors[names.ServiceAccountTokenController] = saTokenControllerDescriptor
                run(ctx, controllerDescriptors)
            },
            OnStoppedLeading: func() {
                logger.Error(nil, "leaderelection lost")
                klog.FlushAndExit(klog.ExitFlushTimeout, 1)
            },
        })

    // If Leader Migration is enabled, proceed to attempt the migration lock.
    if leaderMigrator != nil {
        // Wait for Service Account Token Controller to start before acquiring the migration lock.
        // At this point, the main lock must have already been acquired, or the KCM process already exited.
        // We wait for the main lock before acquiring the migration lock to prevent the situation
        //  where KCM instance A holds the main lock while KCM instance B holds the migration lock.
        <-leaderMigrator.MigrationReady

        // Start the migration lock.
        go leaderElectAndRun(ctx, c, id, electionChecker,
            c.ComponentConfig.Generic.LeaderMigration.ResourceLock,
            c.ComponentConfig.Generic.LeaderMigration.LeaderName,
            leaderelection.LeaderCallbacks{
                OnStartedLeading: func(ctx context.Context) {
                    logger.Info("leader migration: starting migrated controllers.")
                    controllerDescriptors := NewControllerDescriptors()
                    controllerDescriptors = filteredControllerDescriptors(controllerDescriptors, leaderMigrator.FilterFunc, leadermigration.ControllerMigrated)
                    // DO NOT start saTokenController under migration lock
                    delete(controllerDescriptors, names.ServiceAccountTokenController)
                    run(ctx, controllerDescriptors)
                },
                OnStoppedLeading: func() {
                    logger.Error(nil, "migration leaderelection lost")
                    klog.FlushAndExit(klog.ExitFlushTimeout, 1)
                },
            })
    }

    <-stopCh
    return nil
}

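Run first logs the current version and then starts the event pipeline (EventBroadcaster). As an aside, nearly every component sets up an EventBroadcaster at startup; it reports component events to the apiserver, which persists them in etcd.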

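Next it registers HealthCheckers so the component can expose its own status over HTTP, and then declares the run closure. All controllers are started inside run, and declaring it here lets the code paths below reuse the same logic.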

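The rest of the code deals mainly with leader election. If leader election is disabled, the process builds all controller descriptors (NewControllerDescriptors) and calls run directly; otherwise, the instance that wins the election builds them and calls run from the OnStartedLeading callback.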
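The branching described here can be sketched with the standard library alone. This is only an illustration of the control flow: `elect` is a hypothetical stand-in for real leader election (it simply takes a `wins` flag), and `launch`, `newDescriptors`, and the controller names are made up for the example.

```go
package main

import (
	"context"
	"fmt"
)

// elect is a hypothetical stand-in for leader election: the callback only
// fires if this instance "wins".
func elect(ctx context.Context, wins bool, onStartedLeading func(context.Context)) {
	if wins {
		onStartedLeading(ctx)
	}
}

// launch mirrors the shape of Run: one run closure, reused by both the
// direct path and the leader-election callback. It returns what was started.
func launch(leaderElect, wins bool) []string {
	ctx := context.Background()
	var started []string

	run := func(ctx context.Context, names []string) {
		started = append(started, names...)
	}
	newDescriptors := func() []string {
		return []string{"deployment-controller", "job-controller"}
	}

	// No leader election: run directly.
	if !leaderElect {
		run(ctx, newDescriptors())
		return started
	}
	// Leader election: only the winner builds the descriptors and runs.
	elect(ctx, wins, func(ctx context.Context) {
		run(ctx, newDescriptors())
	})
	return started
}

func main() {
	fmt.Println(len(launch(false, false))) // direct path
	fmt.Println(len(launch(true, true)))   // elected leader
	fmt.Println(len(launch(true, false)))  // lost the election, nothing started
}
```

Unlike this sketch, the real run blocks on `<-ctx.Done()` after starting the controllers, and losing the lease terminates the process.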

Inside run, StartControllers is called to start all the controllers.

// StartControllers starts a set of controllers with a specified ControllerContext
func StartControllers(ctx context.Context, controllerCtx ControllerContext, controllerDescriptors map[string]*ControllerDescriptor,
    unsecuredMux *mux.PathRecorderMux, healthzHandler *controllerhealthz.MutableHealthzHandler) error {
    var controllerChecks []healthz.HealthChecker

    // Always start the SA token controller first using a full-power client, since it needs to mint tokens for the rest
    // If this fails, just return here and fail since other controllers won't be able to get credentials.
    if serviceAccountTokenControllerDescriptor, ok := controllerDescriptors[names.ServiceAccountTokenController]; ok {
        check, err := StartController(ctx, controllerCtx, serviceAccountTokenControllerDescriptor, unsecuredMux)
        if err != nil {
            return err
        }
        if check != nil {
            // HealthChecker should be present when controller has started
            controllerChecks = append(controllerChecks, check)
        }
    }

    // Initialize the cloud provider with a reference to the clientBuilder only after token controller
    // has started in case the cloud provider uses the client builder.
    if controllerCtx.Cloud != nil {
        controllerCtx.Cloud.Initialize(controllerCtx.ClientBuilder, ctx.Done())
    }

    // Each controller is passed a context where the logger has the name of
    // the controller set through WithName. That name then becomes the prefix of
    // of all log messages emitted by that controller.
    //
    // In StartController, an explicit "controller" key is used instead, for two reasons:
    // - while contextual logging is alpha, klog.LoggerWithName is still a no-op,
    //   so we cannot rely on it yet to add the name
    // - it allows distinguishing between log entries emitted by the controller
    //   and those emitted for it - this is a bit debatable and could be revised.
    for _, controllerDesc := range controllerDescriptors {
        if controllerDesc.RequiresSpecialHandling() {
            continue
        }

        check, err := StartController(ctx, controllerCtx, controllerDesc, unsecuredMux)
        if err != nil {
            return err
        }
        if check != nil {
            // HealthChecker should be present when controller has started
            controllerChecks = append(controllerChecks, check)
        }
    }

    healthzHandler.AddHealthChecker(controllerChecks...)

    return nil
}

The ServiceAccountTokenController has to start first because the other controllers depend on it for credentials. After that, a for range loop over controllerDescriptors starts the remaining controllers, and each controller's HealthChecker is registered with healthzHandler.
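The ordering rule can be reduced to a few lines. The sketch below is an assumption-laden simplification: `desc`, `startAll`, and the `saTokenName` constant are hypothetical names, and health checks and the cloud provider step are left out.

```go
package main

import "fmt"

// desc is a hypothetical, minimal descriptor.
type desc struct {
	name    string
	special bool // plays the role of RequiresSpecialHandling()
	start   func() error
}

// saTokenName is an illustrative key for the special controller.
const saTokenName = "serviceaccount-token-controller"

// startAll starts the special controller first, then every other one,
// and stops at the first error. It returns the names in start order.
func startAll(descs map[string]*desc) ([]string, error) {
	var started []string

	if d, ok := descs[saTokenName]; ok {
		if err := d.start(); err != nil {
			return started, err
		}
		started = append(started, d.name)
	}

	// Go randomizes map iteration, so the order of the remaining
	// controllers is not fixed.
	for _, d := range descs {
		if d.special {
			continue // already handled above
		}
		if err := d.start(); err != nil {
			return started, err
		}
		started = append(started, d.name)
	}
	return started, nil
}

func main() {
	ok := func() error { return nil }
	descs := map[string]*desc{
		saTokenName:             {name: saTokenName, special: true, start: ok},
		"deployment-controller": {name: "deployment-controller", start: ok},
		"job-controller":        {name: "job-controller", start: ok},
	}
	started, err := startAll(descs)
	fmt.Println(started[0], len(started), err)
}
```

Only the position of the special controller is guaranteed; nothing should rely on the relative order of the others.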

// StartController starts a controller with a specified ControllerContext
// and performs required pre- and post- checks/actions
func StartController(ctx context.Context, controllerCtx ControllerContext, controllerDescriptor *ControllerDescriptor,
    unsecuredMux *mux.PathRecorderMux) (healthz.HealthChecker, error) {
    logger := klog.FromContext(ctx)
    controllerName := controllerDescriptor.Name()

    for _, featureGate := range controllerDescriptor.GetRequiredFeatureGates() {
        if !utilfeature.DefaultFeatureGate.Enabled(featureGate) {
            logger.Info("Controller is disabled by a feature gate", "controller", controllerName, "requiredFeatureGates", controllerDescriptor.GetRequiredFeatureGates())
            return nil, nil
        }
    }

    if controllerDescriptor.IsCloudProviderController() && controllerCtx.LoopMode != IncludeCloudLoops {
        logger.Info("Skipping a cloud provider controller", "controller", controllerName, "loopMode", controllerCtx.LoopMode)
        return nil, nil
    }

    if !controllerCtx.IsControllerEnabled(controllerDescriptor) {
        logger.Info("Warning: controller is disabled", "controller", controllerName)
        return nil, nil
    }

    time.Sleep(wait.Jitter(controllerCtx.ComponentConfig.Generic.ControllerStartInterval.Duration, ControllerStartJitter))

    logger.V(1).Info("Starting controller", "controller", controllerName)

    initFunc := controllerDescriptor.GetInitFunc()
    ctrl, started, err := initFunc(klog.NewContext(ctx, klog.LoggerWithName(logger, controllerName)), controllerCtx, controllerName)
    if err != nil {
        logger.Error(err, "Error starting controller", "controller", controllerName)
        return nil, err
    }
    if !started {
        logger.Info("Warning: skipping controller", "controller", controllerName)
        return nil, nil
    }

    check := controllerhealthz.NamedPingChecker(controllerName)
    if ctrl != nil {
        // check if the controller supports and requests a debugHandler
        // and it needs the unsecuredMux to mount the handler onto.
        if debuggable, ok := ctrl.(controller.Debuggable); ok && unsecuredMux != nil {
            if debugHandler := debuggable.DebuggingHandler(); debugHandler != nil {
                basePath := "/debug/controllers/" + controllerName
                unsecuredMux.UnlistedHandle(basePath, http.StripPrefix(basePath, debugHandler))
                unsecuredMux.UnlistedHandlePrefix(basePath+"/", http.StripPrefix(basePath, debugHandler))
            }
        }
        if healthCheckable, ok := ctrl.(controller.HealthCheckable); ok {
            if realCheck := healthCheckable.HealthChecker(); realCheck != nil {
                check = controllerhealthz.NamedHealthChecker(controllerName, realCheck)
            }
        }
    }

    logger.Info("Started controller", "controller", controllerName)
    return check, nil
}

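StartController begins with three checks: a required feature gate is not enabled, the controller is a cloud provider controller while the loop mode is not IncludeCloudLoops, or the controller is disabled. In each case it returns nil and skips the start. The next line is an interesting one.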

time.Sleep(wait.Jitter(controllerCtx.ComponentConfig.Generic.ControllerStartInterval.Duration, ControllerStartJitter))

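Because StartController is called from the for range loop above, each controller start is preceded by a jittered sleep based on ControllerStartInterval, which spreads out the startup load. Note that ControllerStartJitter (1.0 in the source) is a factor passed to wait.Jitter, not a duration: the sleep falls between the configured interval and that interval plus factor times the interval, so a zero interval means no delay at all.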
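To make the jitter arithmetic concrete, here is a small standard-library sketch. The `jitter` function is my own reimplementation of the documented behavior of wait.Jitter (a result in the range from d up to d + maxFactor*d, with a non-positive factor treated as 1.0); it is not copied from k8s.io/apimachinery.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// jitter returns a duration in [d, d+maxFactor*d).
// A non-positive maxFactor is treated as 1.0.
func jitter(d time.Duration, maxFactor float64) time.Duration {
	if maxFactor <= 0.0 {
		maxFactor = 1.0
	}
	return d + time.Duration(rand.Float64()*maxFactor*float64(d))
}

func main() {
	interval := 100 * time.Millisecond
	for i := 0; i < 3; i++ {
		// Each value lands somewhere in [100ms, 200ms).
		fmt.Println(jitter(interval, 1.0))
	}
	// With a zero interval there is nothing to stretch, so no delay.
	fmt.Println(jitter(0, 1.0))
}
```

Sleeping for a randomized amount per controller keeps many controllers from hitting the apiserver at exactly the same moment.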

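After that the controller is actually started: the code fetches the descriptor's initFunc, calls it, and then wraps the check mentioned above and returns it to the caller.

Next, let's look at initFunc.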

type ControllerDescriptor struct {
    name                      string
    initFunc                  InitFunc
    requiredFeatureGates      []featuregate.Feature
    aliases                   []string
    isDisabledByDefault       bool
    isCloudProviderController bool
    requiresSpecialHandling   bool
}

func (r *ControllerDescriptor) Name() string {
    return r.name
}

func (r *ControllerDescriptor) GetInitFunc() InitFunc {
    return r.initFunc
}

So where does initFunc come from? Recall NewControllerDescriptors, which is called right before run to build all the controllers. During that step each controller builds a ControllerDescriptor, which is then added through register.

Take DeploymentController as an example.

register(newDeploymentControllerDescriptor())
func newDeploymentControllerDescriptor() *ControllerDescriptor {
    return &ControllerDescriptor{
        name:     names.DeploymentController,
        aliases:  []string{"deployment"},
        initFunc: startDeploymentController,
    }
}

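At this point it is clear: calling DeploymentController's initFunc really means calling startDeploymentController, and the other controllers work the same way. For what startDeploymentController does next, see my earlier post, Kubernetes DeploymentController Source Code Walkthrough.

That wraps up the startup path of the Kubernetes ControllerManager. What remains is the logic of each individual controller. Pick the ones you find interesting or need for work from the list at the top of this post; there is no need to read every controller's source.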

 

posted @ 2024-05-20 11:19  MrPei