Kubernetes CSI Plugin Registration (Part 2): Source Code Analysis of the Kubelet CSI Plugin Registration Mechanism

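1. Overview

The Kubernetes CSI plugin registration mechanism is implemented in two parts: the sidecar "node-driver-registrar" and the Kubelet's pluginManager. The first part is covered in "Kubernetes CSI Plugin Registration (Part 1): node-driver-registrar Source Code Analysis". This article focuses on the second part, the source code of the Kubelet's pluginManager module.

Source repository: https://github.com/kubernetes/kubernetes/tree/release-1.24

Note: this article is based on Kubernetes 1.24.10.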

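2. How Kubelet CSI Plugin Registration Works

The Kubelet's pluginManager module watches a specific directory, /var/lib/kubelet/plugins_registry (/var/lib/kubelet is the Kubelet's working directory and can be changed). The sidecar "node-driver-registrar" implements a specific interface (a gRPC service, RegistrationServer) that provides plugin registration information to the Kubelet, and it creates a socket in that directory. Each CSI plugin has its own Node Driver Registrar, so one Node Driver Registrar registers exactly one plugin; the socket file is named <CSI plugin name>-reg.sock. Through this socket the pluginManager obtains the CSI plugin's information (mainly the CSI plugin name and the path of the CSI plugin's socket) and writes it into the annotations of the Kubernetes Node object, which completes the registration of the CSI plugin.
Once the plugin is registered, the Kubelet communicates with the CSI plugin through the socket the plugin exposes, for operations such as mounting and unmounting volumes.
Combining the sidecar "node-driver-registrar" with the Kubelet's pluginManager gives the complete Kubernetes CSI plugin registration procedure: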

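Step 1

Node Driver Registrar connects to the gRPC socket exposed by the CSI plugin and calls GetPluginInfo to obtain the CSI plugin's driver name.

Step 2

Node Driver Registrar starts a socket in /var/lib/kubelet/plugins_registry (named <CSI plugin name>-reg.sock) and exposes two methods, GetInfo and NotifyRegistrationStatus, through which it provides plugin registration information to the Kubelet. The Kubelet discovers this socket through its Watcher.

Step 3

The Kubelet watches /var/lib/kubelet/plugins_registry/ through its Watcher. When it discovers the socket above, it calls the Node Driver Registrar's GetInfo method over that socket to obtain the CSI plugin type, the CSI plugin's driver name, the address of the gRPC socket exposed by the CSI plugin, and the versions the CSI plugin supports.

Step 4

The Kubelet calls NodeGetInfo on the gRPC socket exposed by the CSI plugin to obtain the plugin's node ID, the maximum number of volumes that can be attached, and topology information.

Step 5

Based on the information from the previous step, the Kubelet updates the annotations of the Node object and creates (or updates) a CSINode object.

Step 6

The Kubelet calls NotifyRegistrationStatus on the Node Driver Registrar container over the socket to report that the CSI plugin was registered successfully.

These six steps make up the Kubernetes CSI plugin registration mechanism.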
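
The order of calls in steps 3 to 6 can be sketched as an in-process simulation. This is only an illustration under stated assumptions: the `registrar` interface, `pluginInfo`, `fakeRegistrar`, `kubeletRegister`, and the endpoint path are hypothetical names invented here, and the real components talk gRPC over Unix sockets rather than calling Go methods directly.

```go
package main

import (
	"errors"
	"fmt"
)

// pluginInfo mirrors the fields the article says GetInfo returns.
// The type is hypothetical; it is not the Kubelet's API type.
type pluginInfo struct {
	Type     string   // plugin type, e.g. "CSIPlugin"
	Name     string   // CSI driver name
	Endpoint string   // socket on which the CSI plugin serves gRPC
	Versions []string // versions supported by the plugin
}

// registrar stands in for the gRPC service exposed by node-driver-registrar.
type registrar interface {
	GetInfo() (pluginInfo, error)
	NotifyRegistrationStatus(registered bool, errMsg string)
}

// fakeRegistrar is an in-memory registrar used only by this sketch.
type fakeRegistrar struct {
	info       pluginInfo
	registered bool
	lastError  string
}

func (f *fakeRegistrar) GetInfo() (pluginInfo, error) { return f.info, nil }

func (f *fakeRegistrar) NotifyRegistrationStatus(registered bool, errMsg string) {
	f.registered, f.lastError = registered, errMsg
}

// kubeletRegister plays the Kubelet side of steps 3 to 6: fetch the plugin
// info, run the type-specific registration, and report the result back.
func kubeletRegister(r registrar, register func(pluginInfo) error) error {
	info, err := r.GetInfo() // step 3
	if err != nil {
		return err
	}
	if info.Name == "" || info.Endpoint == "" {
		err = errors.New("plugin name and endpoint must not be empty")
	} else {
		err = register(info) // steps 4 and 5
	}
	if err != nil {
		r.NotifyRegistrationStatus(false, err.Error()) // step 6, failure
		return err
	}
	r.NotifyRegistrationStatus(true, "") // step 6, success
	return nil
}

// demoRegister runs the flow against a fake registrar and reports whether
// the registrar was told that registration succeeded.
func demoRegister(name string) bool {
	r := &fakeRegistrar{info: pluginInfo{
		Type:     "CSIPlugin",
		Name:     name,
		Endpoint: "/var/lib/kubelet/plugins/" + name + "/csi.sock", // assumed path
		Versions: []string{"1.0.0"},
	}}
	_ = kubeletRegister(r, func(pluginInfo) error { return nil })
	return r.registered
}

func main() {
	fmt.Println(demoRegister("example.csi.driver"))
	fmt.Println(demoRegister(""))
}
```

The point of the sketch is that the registrar always receives a status notification, whether registration succeeded or failed.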

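3. Source Code Analysis of the Kubelet pluginManager

The Kubelet's pluginManager module completes Kubernetes CSI plugin registration with the help of the gRPC service (RegistrationServer) provided by the sidecar "node-driver-registrar". The node-driver-registrar source code was analyzed in "Kubernetes CSI Plugin Registration (Part 1): node-driver-registrar Source Code Analysis". This article analyzes the source code of the Kubelet's pluginManager module in two parts: initialization of the pluginManager, and how it runs (its processing logic).

3.1 pluginManager Initialization

NewMainKubelet():

When the Kubelet process starts, it calls NewMainKubelet() to initialize the Kubelet, and NewMainKubelet() in turn calls pluginmanager.NewPluginManager to initialize the pluginManager. NewMainKubelet() is therefore the entry point of this analysis. The call stack from the Kubelet's main function to NewMainKubelet() is: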

Kubelet main() (cmd/kubelet/kubelet.go, line 40) --> code := run(command) (cmd/kubelet/kubelet.go, line 47) -->
return Run(....) (cmd/kubelet/app/server.go, line 267) --> err := run(....) (cmd/kubelet/app/server.go, line 406) -->
RunKubelet(....) (cmd/kubelet/app/server.go, line 760) --> createAndInitKubelet(....) (cmd/kubelet/app/server.go, line 1131) -->
kubelet.NewMainKubelet(....) (cmd/kubelet/app/server.go, line 1236)

NewMainKubelet() calls pluginmanager.NewPluginManager to initialize the pluginManager.

// pkg/kubelet/kubelet.go:740
klet.pluginManager = pluginmanager.NewPluginManager(
    // sockDir is /var/lib/kubelet/plugins_registry here
    klet.getPluginsRegistrationDir(), /* sockDir */
    kubeDeps.Recorder,
)

Next, look at how the pluginManager struct is initialized and how it is laid out.

Initialization of the pluginManager struct (pkg/kubelet/pluginmanager/plugin_manager.go):

When the pluginManager is created, sockDir is passed into its desiredStateOfWorldPopulator field. In effect the pluginManager watches the plugins_registry directory, where Node Driver Registrar (the component that registers a CSI driver with the Kubelet) creates the socket file that exposes its service. Through that socket the pluginManager obtains the plugin's information (including the plugin's socket address and name), so that plugins are registered or unregistered as socket files are added to or removed from the directory.

func NewPluginManager(
    sockDir string,
    recorder record.EventRecorder) PluginManager {
    asw := cache.NewActualStateOfWorld()
    dsw := cache.NewDesiredStateOfWorld()
    reconciler := reconciler.NewReconciler(
        operationexecutor.NewOperationExecutor(
            operationexecutor.NewOperationGenerator(
                recorder,
            ),
        ),
        loopSleepDuration,
        dsw,
        asw,
    )
 
    pm := &pluginManager{
        desiredStateOfWorldPopulator: pluginwatcher.NewWatcher(
            sockDir,
            dsw,
        ),
        reconciler:          reconciler,
        desiredStateOfWorld: dsw,
        actualStateOfWorld:  asw,
    }
    return pm
}

The PluginManager interface and pluginManager struct (pkg/kubelet/pluginmanager/plugin_manager.go):

Note that the pluginManager struct defines the fields desiredStateOfWorldPopulator, actualStateOfWorld, desiredStateOfWorld, and reconciler.

type PluginManager interface {
    // Starts the plugin manager and all the asynchronous loops that it controls
    Run(sourcesReady config.SourcesReady, stopCh <-chan struct{})
 
    // AddHandler adds the given plugin handler for a specific plugin type, which
    // will be added to the actual state of world cache so that it can be passed to
    // the desired state of world cache in order to be used during plugin
    // registration/deregistration
    AddHandler(pluginType string, pluginHandler cache.PluginHandler)
}
 
const (
    // loopSleepDuration is the amount of time the reconciler loop waits
    // between successive executions
    loopSleepDuration = 1 * time.Second
)
 
// pluginManager implements the PluginManager interface
type pluginManager struct {
    // desiredStateOfWorldPopulator (the plugin watcher) runs an asynchronous
    // periodic loop to populate the desiredStateOfWorld.
    desiredStateOfWorldPopulator *pluginwatcher.Watcher
 
    // reconciler runs an asynchronous periodic loop to reconcile the
    // desiredStateOfWorld with the actualStateOfWorld by triggering register
    // and unregister operations using the operationExecutor.
    reconciler reconciler.Reconciler
 
    // actualStateOfWorld is a data structure containing the actual state of
    // the world according to the manager: i.e. which plugins are registered.
    // The data structure is populated upon successful completion of register
    // and unregister actions triggered by the reconciler.
    actualStateOfWorld cache.ActualStateOfWorld
 
    // desiredStateOfWorld is a data structure containing the desired state of
    // the world according to the plugin manager: i.e. what plugins are registered.
    // The data structure is populated by the desired state of the world
    // populator (plugin watcher).
    desiredStateOfWorld cache.DesiredStateOfWorld
}
 
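var _ PluginManager = &pluginManager{}

If you have not read much other Kubernetes code, actualStateOfWorld (asw below) and desiredStateOfWorld (dsw below) may be unfamiliar.
Kubernetes follows a declarative management model: the dsw generally corresponds to a resource's spec (the desired state) and the asw to its status (the actual state). The most common processing pattern in Kubernetes is to write an observed external change into the dsw, have a reconciler compare the dsw with the asw and act on the difference, and then record the change in the asw so that the two converge.
The pluginManager works the same way:
the asw holds the plugins that have been registered successfully, and the dsw holds the plugins that should be registered. Later sections show how the Kubelet detects plugin changes, stores them in the dsw, and eventually syncs them to the asw.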
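
The dsw/asw comparison can be illustrated with two plain maps. This is a simplified sketch, not the Kubelet's code: the `plan` and `demoPlan` functions and the socket names are made up for illustration, and the caches are reduced to a map from socket path to timestamp.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// plan computes what a reconciler would do for the given caches.
// Keys are socket paths. In desired, the value is when the socket was last
// seen; in actual, it is the timestamp the plugin was registered with.
func plan(desired, actual map[string]time.Time) (unregister, register []string) {
	// Registered plugins that are no longer wanted, or whose socket was
	// recreated (timestamp changed), must be unregistered.
	for path, registeredAt := range actual {
		seenAt, wanted := desired[path]
		if !wanted || !seenAt.Equal(registeredAt) {
			unregister = append(unregister, path)
		}
	}
	// Wanted plugins that are not registered with the same timestamp
	// must be registered.
	for path, seenAt := range desired {
		registeredAt, ok := actual[path]
		if !ok || !registeredAt.Equal(seenAt) {
			register = append(register, path)
		}
	}
	sort.Strings(unregister)
	sort.Strings(register)
	return unregister, register
}

// demoPlan builds a small example with one unchanged, one recreated,
// one new, and one deleted socket.
func demoPlan() (unregister, register []string) {
	t0 := time.Unix(100, 0)
	t1 := time.Unix(200, 0)
	desired := map[string]time.Time{
		"a-reg.sock": t0, // unchanged
		"b-reg.sock": t1, // socket recreated: newer timestamp
		"c-reg.sock": t0, // new socket
	}
	actual := map[string]time.Time{
		"a-reg.sock": t0,
		"b-reg.sock": t0,
		"d-reg.sock": t0, // socket was deleted from the directory
	}
	return plan(desired, actual)
}

func main() {
	unregister, register := demoPlan()
	fmt.Println("unregister:", unregister)
	fmt.Println("register:  ", register)
}
```

A recreated socket appears in both lists, which matches the idea that an updated plugin is unregistered first and then registered again.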

actualStateOfWorld (pkg/kubelet/pluginmanager/cache/actual_state_of_world.go):

The actualStateOfWorld struct stores information about the sockets exposed by Node Driver Registrar components whose plugins have completed registration.

type ActualStateOfWorld interface {
 
    // GetRegisteredPlugins generates and returns a list of plugins
    // that are successfully registered plugins in the current actual state of world.
    GetRegisteredPlugins() []PluginInfo
 
    // AddPlugin add the given plugin in the cache.
    // An error will be returned if socketPath of the PluginInfo object is empty.
    // Note that this is different from desired world cache's AddOrUpdatePlugin
    // because for the actual state of world cache, there won't be a scenario where
    // we need to update an existing plugin if the timestamps don't match. This is
    // because the plugin should have been unregistered in the reconciler and therefore
    // removed from the actual state of world cache first before adding it back into
    // the actual state of world cache again with the new timestamp
    AddPlugin(pluginInfo PluginInfo) error
 
    // RemovePlugin deletes the plugin with the given socket path from the actual
    // state of world.
    // If a plugin does not exist with the given socket path, this is a no-op.
    RemovePlugin(socketPath string)
 
    // PluginExists checks if the given plugin exists in the current actual
    // state of world cache with the correct timestamp
    PluginExistsWithCorrectTimestamp(pluginInfo PluginInfo) bool
}
 
// NewActualStateOfWorld returns a new instance of ActualStateOfWorld
func NewActualStateOfWorld() ActualStateOfWorld {
    return &actualStateOfWorld{
        socketFileToInfo: make(map[string]PluginInfo),
    }
}
 
type actualStateOfWorld struct {
 
    // socketFileToInfo is a map containing the set of successfully registered plugins
    // The keys are plugin socket file paths. The values are PluginInfo objects
    socketFileToInfo map[string]PluginInfo
    sync.RWMutex
}
 
var _ ActualStateOfWorld = &actualStateOfWorld{}
 
// PluginInfo holds information of a plugin
type PluginInfo struct {
    SocketPath string
    Timestamp  time.Time
    Handler    PluginHandler
    Name       string
}

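desiredStateOfWorld (pkg/kubelet/pluginmanager/cache/desired_state_of_world.go):

The desiredStateOfWorld struct stores information about the sockets that exist under the directory watched by the pluginManager, exposed by Node Driver Registrar components whose plugins should be registered.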

type DesiredStateOfWorld interface {
    // AddOrUpdatePlugin add the given plugin in the cache if it doesn't already exist.
    // If it does exist in the cache, then the timestamp of the PluginInfo object in the cache will be updated.
    // An error will be returned if socketPath is empty.
    AddOrUpdatePlugin(socketPath string) error
 
    // RemovePlugin deletes the plugin with the given socket path from the desired
    // state of world.
    // If a plugin does not exist with the given socket path, this is a no-op.
    RemovePlugin(socketPath string)
 
    // GetPluginsToRegister generates and returns a list of plugins
    // in the current desired state of world.
    GetPluginsToRegister() []PluginInfo
 
    // PluginExists checks if the given socket path exists in the current desired
    // state of world cache
    PluginExists(socketPath string) bool
}
 
// NewDesiredStateOfWorld returns a new instance of DesiredStateOfWorld.
func NewDesiredStateOfWorld() DesiredStateOfWorld {
    return &desiredStateOfWorld{
        socketFileToInfo: make(map[string]PluginInfo),
    }
}
 
type desiredStateOfWorld struct {
 
    // socketFileToInfo is a map containing the set of successfully registered plugins
    // The keys are plugin socket file paths. The values are PluginInfo objects
    socketFileToInfo map[string]PluginInfo
    sync.RWMutex
}
 
var _ DesiredStateOfWorld = &desiredStateOfWorld{}

reconciler (pkg/kubelet/pluginmanager/reconciler/reconciler.go):

The reconciler compares the desiredStateOfWorld with the actualStateOfWorld and reconciles them by registering or unregistering plugins.

type Reconciler interface {
    // Run starts running the reconciliation loop which executes periodically,
    // checks if plugins are correctly registered or unregistered.
    // If not, it will trigger register/unregister operations to rectify.
    Run(stopCh <-chan struct{})
 
    // AddHandler adds the given plugin handler for a specific plugin type,
    // which will be added to the actual state of world cache.
    AddHandler(pluginType string, pluginHandler cache.PluginHandler)
}
 
// NewReconciler returns a new instance of Reconciler.
//
// loopSleepDuration - the amount of time the reconciler loop sleeps between
//
//  successive executions
//  syncDuration - the amount of time the syncStates sleeps between
//  successive executions
//
// operationExecutor - used to trigger register/unregister operations safely
//
//  (prevents more than one operation from being triggered on the same
//  socket path)
//
// desiredStateOfWorld - cache containing the desired state of the world
// actualStateOfWorld - cache containing the actual state of the world
func NewReconciler(
    operationExecutor operationexecutor.OperationExecutor,
    loopSleepDuration time.Duration,
    desiredStateOfWorld cache.DesiredStateOfWorld,
    actualStateOfWorld cache.ActualStateOfWorld) Reconciler {
    return &reconciler{
        operationExecutor:   operationExecutor,
        loopSleepDuration:   loopSleepDuration,
        desiredStateOfWorld: desiredStateOfWorld,
        actualStateOfWorld:  actualStateOfWorld,
        handlers:            make(map[string]cache.PluginHandler),
    }
}
 
type reconciler struct {
    operationExecutor   operationexecutor.OperationExecutor
    loopSleepDuration   time.Duration
    desiredStateOfWorld cache.DesiredStateOfWorld
    actualStateOfWorld  cache.ActualStateOfWorld
    handlers            map[string]cache.PluginHandler
    sync.RWMutex
}
 
var _ Reconciler = &reconciler{}

Watcher (pkg/kubelet/pluginmanager/pluginwatcher/plugin_watcher.go):

The Watcher watches for socket files being added to or removed from the plugins_registry directory and updates the plugin information stored in the dsw accordingly.

// Watcher is the plugin watcher
type Watcher struct {
    path                string
    fs                  utilfs.Filesystem
    fsWatcher           *fsnotify.Watcher
    desiredStateOfWorld cache.DesiredStateOfWorld
}
 
// NewWatcher provides a new watcher for socket registration
func NewWatcher(sockDir string, desiredStateOfWorld cache.DesiredStateOfWorld) *Watcher {
    return &Watcher{
        path:                sockDir,
        fs:                  &utilfs.DefaultFs{},
        desiredStateOfWorld: desiredStateOfWorld,
    }
}

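3.2 pluginManager Runtime (Processing Logic)

The previous section covered initialization of the pluginManager. This section analyzes its Run method, that is, its processing logic.
Because the call path is fairly involved, its analysis is skipped here and we go straight to kl.pluginManager.Run(). Only the call chain leading to it is given: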

Kubelet main() (cmd/kubelet/kubelet.go, line 40) --> code := run(command) (cmd/kubelet/kubelet.go, line 47) -->
return Run(....) (cmd/kubelet/app/server.go, line 267) --> err := run(....) (cmd/kubelet/app/server.go, line 406) -->
RunKubelet(....) (cmd/kubelet/app/server.go, line 760) --> startKubelet(....) (cmd/kubelet/app/server.go, line 1181) -->
go k.Run(....) (cmd/kubelet/app/server.go, line 1190) --> go wait.Until(kl.updateRuntimeUp, ...) (pkg/kubelet/kubelet.go, line 1438) -->
kl.oneTimeInitializer.Do(kl.initializeRuntimeDependentModules) (pkg/kubelet/kubelet.go, line 2370; oneTimeInitializer is a sync.Once) -->
go kl.pluginManager.Run(kl.sourcesReady, wait.NeverStop) (pkg/kubelet/kubelet.go, line 1396)

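Note that before kl.pluginManager.Run() is executed, the registration callback for CSI plugins and the registration callback for device plugins are added to the pluginManager.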

// Adding Registration Callback function for CSI Driver
kl.pluginManager.AddHandler(pluginwatcherapi.CSIPlugin, plugincache.PluginHandler(csi.PluginHandler))
// Adding Registration Callback function for Device Manager
kl.pluginManager.AddHandler(pluginwatcherapi.DevicePlugin, kl.containerManager.GetPluginRegistrationHandler())
// Start the plugin manager
klog.V(4).InfoS("Starting plugin manager")
go kl.pluginManager.Run(kl.sourcesReady, wait.NeverStop)

A plugin handler must implement methods for validating, registering, and deregistering a plugin.

type PluginHandler interface {
    // Validate returns an error if the information provided by
    // the potential plugin is erroneous (unsupported version, ...)
    ValidatePlugin(pluginName string, endpoint string, versions []string) error
    // RegisterPlugin is called so that the plugin can be register by any
    // plugin consumer
    // Error encountered here can still be Notified to the plugin.
    RegisterPlugin(pluginName, endpoint string, versions []string) error
    // DeRegister is called once the pluginwatcher observes that the socket has
    // been deleted.
    DeRegisterPlugin(pluginName string)
}
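
To make the contract concrete, here is a minimal sketch of a type that satisfies this interface. The interface is repeated locally so the example compiles on its own; `memoryHandler`, `newMemoryHandler`, and `count` are hypothetical names, and the validation rules are assumptions chosen for illustration rather than what the CSI or device plugin handlers actually check.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// PluginHandler repeats the interface quoted above so the sketch compiles alone.
type PluginHandler interface {
	ValidatePlugin(pluginName string, endpoint string, versions []string) error
	RegisterPlugin(pluginName, endpoint string, versions []string) error
	DeRegisterPlugin(pluginName string)
}

// memoryHandler is a hypothetical handler that tracks plugins in a map.
type memoryHandler struct {
	mu        sync.Mutex
	endpoints map[string]string // plugin name -> endpoint
}

func newMemoryHandler() *memoryHandler {
	return &memoryHandler{endpoints: make(map[string]string)}
}

// ValidatePlugin rejects obviously incomplete registration information.
func (h *memoryHandler) ValidatePlugin(pluginName string, endpoint string, versions []string) error {
	if pluginName == "" || endpoint == "" {
		return errors.New("plugin name and endpoint are required")
	}
	if len(versions) == 0 {
		return errors.New("plugin must advertise at least one version")
	}
	return nil
}

// RegisterPlugin validates the plugin and records its endpoint.
func (h *memoryHandler) RegisterPlugin(pluginName, endpoint string, versions []string) error {
	if err := h.ValidatePlugin(pluginName, endpoint, versions); err != nil {
		return err
	}
	h.mu.Lock()
	defer h.mu.Unlock()
	h.endpoints[pluginName] = endpoint
	return nil
}

// DeRegisterPlugin forgets the plugin; unknown names are a no-op.
func (h *memoryHandler) DeRegisterPlugin(pluginName string) {
	h.mu.Lock()
	defer h.mu.Unlock()
	delete(h.endpoints, pluginName)
}

// count returns the number of currently registered plugins.
func (h *memoryHandler) count() int {
	h.mu.Lock()
	defer h.mu.Unlock()
	return len(h.endpoints)
}

// Compile-time check that memoryHandler implements PluginHandler.
var _ PluginHandler = (*memoryHandler)(nil)

func main() {
	h := newMemoryHandler()
	err := h.RegisterPlugin("example.csi.driver", "/tmp/csi.sock", []string{"1.0.0"})
	fmt.Println(err == nil, h.count())
	h.DeRegisterPlugin("example.csi.driver")
	fmt.Println(h.count())
}
```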

kl.pluginManager.Run (pkg/kubelet/pluginmanager/plugin_manager.go, line 109):

kl.pluginManager.Run does two main things:
(1) pm.desiredStateOfWorldPopulator.Start(): continuously watches for socket file changes under /var/lib/kubelet/plugins_registry/ (walking every socket file in the directory and its subdirectories), and adds the Node Driver Registrar's socket to, or removes it from, the desiredStateOfWorld;
(2) pm.reconciler.Run(): syncs the dsw and the asw, performing plugin registration and unregistration.

func (pm *pluginManager) Run(sourcesReady config.SourcesReady, stopCh <-chan struct{}) {
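    // Catch a panic and log it. With no extra handlers passed, HandleCrash logs the panic and then lets the program crash, which makes the failure easier to diagnose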
    defer runtime.HandleCrash()
 
    /*
       Starts a watcher on the plugins_registry directory for create and delete events.
       On a create event the socket path is written to the dsw.
       On a delete event the socket path is removed from the dsw.
    */
    pm.desiredStateOfWorldPopulator.Start(stopCh)
    klog.V(2).InfoS("The desired_state_of_world populator (plugin watcher) starts")
 
    klog.InfoS("Starting Kubelet Plugin Manager")
    /*
       Uses the common wait.Until helper to call rc.reconcile() periodically,
       comparing the dsw with the asw and syncing them.
    */
    go pm.reconciler.Run(stopCh)
 
    metrics.Register(pm.actualStateOfWorld, pm.desiredStateOfWorld)
    <-stopCh
    klog.InfoS("Shutting down Kubelet Plugin Manager")
}  

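pm.desiredStateOfWorldPopulator.Start(stopCh) (pkg/kubelet/pluginmanager/pluginwatcher/plugin_watcher.go):

Start sets up a watcher on the plugins_registry directory and then walks the directory's existing children. Directories are added to the watcher; for socket files it calls w.handleCreateEvent, which adds the socket to the desiredStateOfWorld. This is mainly what registers plugins that already exist on the node when the Kubelet service restarts.

It then starts a goroutine that keeps listening for change events in the plugins_registry directory:
(1) on a create event, that is, a new file appeared under plugins_registry, it calls w.handleCreateEvent, and if the new file is a socket, the socket is added to the desiredStateOfWorld;
(2) on a remove event, that is, a file under plugins_registry was deleted, it calls w.handleDeleteEvent, which removes the corresponding path from the desiredStateOfWorld (a no-op if the path was never added).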

// Start watches for the creation and deletion of plugin sockets at the path
func (w *Watcher) Start(stopCh <-chan struct{}) error {
    klog.V(2).InfoS("Plugin Watcher Start", "path", w.path)
 
    // Create the plugins_registry directory (/var/lib/kubelet/plugins_registry/) if it does not exist
    // Creating the directory to be watched if it doesn't exist yet,
    // and walks through the directory to discover the existing plugins.
    if err := w.init(); err != nil {
        return err
    }
 
    fsWatcher, err := fsnotify.NewWatcher()
    if err != nil {
        return fmt.Errorf("failed to start plugin fsWatcher, err: %v", err)
    }
    w.fsWatcher = fsWatcher
 
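    /*
     Before starting the plugin processing goroutine, walk the plugins_registry directory (its existing children).
     Directories are added to the fsWatcher. For each socket file a synthetic fsnotify create event is built and
     passed to w.handleCreateEvent, which adds the socket to the dsw. Other regular files are ignored.
     The main purpose is that, when the Kubelet restarts, the sockets already present under plugins_registry
     (exposed by the Node Driver Registrar of each CSI driver, or by device plugins) are added to the dsw
     so that the pluginManager can register them.
     */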
    // Traverse plugin dir and add filesystem watchers before starting the plugin processing goroutine.
    if err := w.traversePluginDir(w.path); err != nil {
        klog.ErrorS(err, "Failed to traverse plugin socket path", "path", w.path)
    }
 
    // Start a goroutine that loops over the fsnotify create and remove events produced under plugins_registry (and its watched subdirectories) and calls w.handleCreateEvent or w.handleDeleteEvent
    go func(fsWatcher *fsnotify.Watcher) {
        for {
            select {
            case event := <-fsWatcher.Events:
                //TODO: Handle errors by taking corrective measures
                if event.Op&fsnotify.Create == fsnotify.Create {
                    err := w.handleCreateEvent(event)
                    if err != nil {
                        klog.ErrorS(err, "Error when handling create event", "event", event)
                    }
                } else if event.Op&fsnotify.Remove == fsnotify.Remove {
                    w.handleDeleteEvent(event)
                }
                continue
            case err := <-fsWatcher.Errors:
                if err != nil {
                    klog.ErrorS(err, "FsWatcher received error")
                }
                continue
            case <-stopCh:
                w.fsWatcher.Close()
                return
            }
        }
    }(fsWatcher)
 
    return nil
}
 
// Walks through the plugin directory discover any existing plugin sockets.
// Ignore all errors except root dir not being walkable
func (w *Watcher) traversePluginDir(dir string) error {
    // watch the new dir
    err := w.fsWatcher.Add(dir)
    if err != nil {
        return fmt.Errorf("failed to watch %s, err: %v", w.path, err)
    }
    // traverse existing children in the dir
    return w.fs.Walk(dir, func(path string, info os.FileInfo, err error) error {
        if err != nil {
            if path == dir {
                return fmt.Errorf("error accessing path: %s error: %v", path, err)
            }
 
            klog.ErrorS(err, "Error accessing path", "path", path)
            return nil
        }
 
        // do not call fsWatcher.Add twice on the root dir to avoid potential problems.
        if path == dir {
            return nil
        }
 
        switch mode := info.Mode(); {
        case mode.IsDir():
            if err := w.fsWatcher.Add(path); err != nil {
                return fmt.Errorf("failed to watch %s, err: %v", path, err)
            }
        case mode&os.ModeSocket != 0:
            event := fsnotify.Event{
                Name: path,
                Op:   fsnotify.Create,
            }
            //TODO: Handle errors by taking corrective measures
            if err := w.handleCreateEvent(event); err != nil {
                klog.ErrorS(err, "Error when handling create", "event", event)
            }
        default:
            klog.V(5).InfoS("Ignoring file", "path", path, "mode", mode)
        }
 
        return nil
    })
}
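
The initial directory walk can be approximated with only the standard library. The sketch below is not the Kubelet's implementation: it leaves out fsnotify entirely, and `findSockets` and `demoScan` are hypothetical helpers. It only shows how Unix domain sockets are told apart from directories and regular files during a walk, and it assumes a platform with Unix socket support.

```go
package main

import (
	"fmt"
	"io/fs"
	"net"
	"os"
	"path/filepath"
)

// findSockets walks root and returns the paths of all Unix domain sockets.
// Errors on children are ignored; an error on root itself is returned.
func findSockets(root string) ([]string, error) {
	var sockets []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			if path == root {
				return err // the root itself must be walkable
			}
			return nil // skip unreadable children
		}
		if d.Type()&fs.ModeSocket != 0 {
			sockets = append(sockets, path)
		}
		return nil
	})
	return sockets, err
}

// demoScan builds a temporary directory containing a subdirectory with a
// socket and a regular file, then counts the sockets found by findSockets.
func demoScan() (int, error) {
	dir, err := os.MkdirTemp("", "reg")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	sub := filepath.Join(dir, "sub")
	if err := os.Mkdir(sub, 0o755); err != nil {
		return 0, err
	}
	// Listening on a Unix address creates the socket file.
	l, err := net.Listen("unix", filepath.Join(sub, "demo-reg.sock"))
	if err != nil {
		return 0, err
	}
	defer l.Close()
	if err := os.WriteFile(filepath.Join(dir, "notes.txt"), []byte("x"), 0o644); err != nil {
		return 0, err
	}

	sockets, err := findSockets(dir)
	return len(sockets), err
}

func main() {
	n, err := demoScan()
	fmt.Println(n, err)
}
```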

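w.handleCreateEvent() (pkg/kubelet/pluginmanager/pluginwatcher/plugin_watcher.go):

Main logic of w.handleCreateEvent():
(1) check whether the created entry is a file and, if so, whether it is a socket;
(2) if it is a socket, call w.handlePluginRegistration, which adds the socket to the desiredStateOfWorld;
(3) if it is a directory, call w.traversePluginDir to watch it and process its contents.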

func (w *Watcher) handleCreateEvent(event fsnotify.Event) error {
    klog.V(6).InfoS("Handling create event", "event", event)
 
    fi, err := os.Stat(event.Name)
    // TODO: This is a workaround for Windows 20H2 issue for os.Stat(). Please see
    // microsoft/Windows-Containers#97 for details.
    // Once the issue is resvolved, the following os.Lstat() is not needed.
    if err != nil && runtime.GOOS == "windows" {
        fi, err = os.Lstat(event.Name)
    }
    if err != nil {
        return fmt.Errorf("stat file %s failed: %v", event.Name, err)
    }
 
    if strings.HasPrefix(fi.Name(), ".") {
        klog.V(5).InfoS("Ignoring file (starts with '.')", "path", fi.Name())
        return nil
    }
 
    if !fi.IsDir() {
        isSocket, err := util.IsUnixDomainSocket(util.NormalizePath(event.Name))
        if err != nil {
            return fmt.Errorf("failed to determine if file: %s is a unix domain socket: %v", event.Name, err)
        }
        if !isSocket {
            klog.V(5).InfoS("Ignoring non socket file", "path", fi.Name())
            return nil
        }
 
        return w.handlePluginRegistration(event.Name)
    }
 
    // If the new entry is a directory, add it and its subdirectories to the fsWatcher and process any sockets it contains
    return w.traversePluginDir(event.Name)
}
 
func (w *Watcher) handlePluginRegistration(socketPath string) error {
    if runtime.GOOS == "windows" {
        socketPath = util.NormalizePath(socketPath)
    }
    // Update desired state of world list of plugins
    // If the socket path does exist in the desired world cache, there's still
    // a possibility that it has been deleted and recreated again before it is
    // removed from the desired world cache, so we still need to call AddOrUpdatePlugin
    // in this case to update the timestamp
    klog.V(2).InfoS("Adding socket path or updating timestamp to desired state cache", "path", socketPath)
    err := w.desiredStateOfWorld.AddOrUpdatePlugin(socketPath)
    if err != nil {
        return fmt.Errorf("error adding socket path %s or updating timestamp to desired state cache: %v", socketPath, err)
    }
    return nil
}
 
func (dsw *desiredStateOfWorld) AddOrUpdatePlugin(socketPath string) error {
    dsw.Lock()
    defer dsw.Unlock()
 
    if socketPath == "" {
        return fmt.Errorf("socket path is empty")
    }
    if _, ok := dsw.socketFileToInfo[socketPath]; ok {
        klog.V(2).InfoS("Plugin exists in actual state cache, timestamp will be updated", "path", socketPath)
    }
 
    // Update the PluginInfo object.
    // Note that we only update the timestamp in the desired state of world, not the actual state of world
    // because in the reconciler, we need to check if the plugin in the actual state of world is the same
    // version as the plugin in the desired state of world
    dsw.socketFileToInfo[socketPath] = PluginInfo{
        SocketPath: socketPath,
        Timestamp:  time.Now(),
    }
    return nil
}

w.handleDeleteEvent(event) (pkg/kubelet/pluginmanager/pluginwatcher/plugin_watcher.go):

Main logic of w.handleDeleteEvent():
(1) remove the socket from the desiredStateOfWorld.

func (w *Watcher) handleDeleteEvent(event fsnotify.Event) {
    klog.V(6).InfoS("Handling delete event", "event", event)
 
    socketPath := event.Name
    klog.V(2).InfoS("Removing socket path from desired state cache", "path", socketPath)
    w.desiredStateOfWorld.RemovePlugin(socketPath)
}
 
func (dsw *desiredStateOfWorld) RemovePlugin(socketPath string) {
    dsw.Lock()
    defer dsw.Unlock()
 
    delete(dsw.socketFileToInfo, socketPath)
}

pm.reconciler.Run(stopCh) (pkg/kubelet/pluginmanager/reconciler/reconciler.go):

pm.reconciler.Run() compares the desiredStateOfWorld with the actualStateOfWorld and reconciles them by registering or unregistering plugins. In detail:
(1) for each socket in the actualStateOfWorld: if the desiredStateOfWorld has no entry for it, or the entry's Timestamp differs from the one in the actualStateOfWorld (the plugin was updated), the plugin must be unregistered (an updated plugin is unregistered first and then registered again), so rc.operationExecutor.UnregisterPlugin is called;
(2) for each socket in the desiredStateOfWorld: if the actualStateOfWorld has no entry for it with the same Timestamp, rc.operationExecutor.RegisterPlugin is called to register the plugin.

// Periodic loop: calls rc.reconcile() once per second; the interval does not include the time spent executing rc.reconcile()
func (rc *reconciler) Run(stopCh <-chan struct{}) {
    wait.Until(func() {
        rc.reconcile()
    },
        rc.loopSleepDuration,
        stopCh)
}
 
func (rc *reconciler) reconcile() {
    // Unregisterations are triggered before registrations
 
    // Ensure plugins that should be unregistered are unregistered.
    for _, registeredPlugin := range rc.actualStateOfWorld.GetRegisteredPlugins() {
        unregisterPlugin := false
        // If the desiredStateOfWorld has no entry for this socket, the plugin must be unregistered
        if !rc.desiredStateOfWorld.PluginExists(registeredPlugin.SocketPath) {
            unregisterPlugin = true
        } else {
            // We also need to unregister the plugins that exist in both actual state of world
            // and desired state of world cache, but the timestamps don't match.
            // Iterate through desired state of world plugins and see if there's any plugin
            // with the same socket path but different timestamp.
            for _, dswPlugin := range rc.desiredStateOfWorld.GetPluginsToRegister() {
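                // The Timestamp in the desiredStateOfWorld differs from the one in the actualStateOfWorld (the plugin was updated), so the plugin must be unregistered; an updated plugin is unregistered first and then goes through registration again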
                if dswPlugin.SocketPath == registeredPlugin.SocketPath && dswPlugin.Timestamp != registeredPlugin.Timestamp {
                    klog.V(5).InfoS("An updated version of plugin has been found, unregistering the plugin first before reregistering", "plugin", registeredPlugin)
                    unregisterPlugin = true
                    break
                }
            }
        }
 
        // Unregister the plugin
        if unregisterPlugin {
            klog.V(5).InfoS("Starting operationExecutor.UnregisterPlugin", "plugin", registeredPlugin)
            err := rc.operationExecutor.UnregisterPlugin(registeredPlugin, rc.actualStateOfWorld)
            if err != nil &&
                !goroutinemap.IsAlreadyExists(err) &&
                !exponentialbackoff.IsExponentialBackoff(err) {
                // Ignore goroutinemap.IsAlreadyExists and exponentialbackoff.IsExponentialBackoff errors, they are expected.
                // Log all other errors.
                klog.ErrorS(err, "OperationExecutor.UnregisterPlugin failed", "plugin", registeredPlugin)
            }
            if err == nil {
                klog.V(1).InfoS("OperationExecutor.UnregisterPlugin started", "plugin", registeredPlugin)
            }
        }
    }
 
    // Ensure plugins that should be registered are registered
    for _, pluginToRegister := range rc.desiredStateOfWorld.GetPluginsToRegister() {
        // Skip plugins that are already registered, i.e. the Timestamp in the desiredStateOfWorld equals the one in the actualStateOfWorld
        if !rc.actualStateOfWorld.PluginExistsWithCorrectTimestamp(pluginToRegister) {
            klog.V(5).InfoS("Starting operationExecutor.RegisterPlugin", "plugin", pluginToRegister)
            err := rc.operationExecutor.RegisterPlugin(pluginToRegister.SocketPath, pluginToRegister.Timestamp, rc.getHandlers(), rc.actualStateOfWorld)
            if err != nil &&
                !goroutinemap.IsAlreadyExists(err) &&
                !exponentialbackoff.IsExponentialBackoff(err) {
                // Ignore goroutinemap.IsAlreadyExists and exponentialbackoff.IsExponentialBackoff errors, they are expected.
                klog.ErrorS(err, "OperationExecutor.RegisterPlugin failed", "plugin", pluginToRegister)
            }
            if err == nil {
                klog.V(1).InfoS("OperationExecutor.RegisterPlugin started", "plugin", pluginToRegister)
            }
        }
    }
}
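
The "interval excludes execution time" behaviour noted in the comment on Run can be sketched with a small loop. This is not the source of wait.Until; `until` and `countRuns` are hypothetical helpers that only mimic the described semantics: run the function, wait one period after it returns, and stop once the stop channel is closed.

```go
package main

import (
	"fmt"
	"time"
)

// until calls f, waits period after f returns, and repeats until stop is
// closed. Because the wait starts after f completes, the time f takes is
// not counted as part of the interval.
func until(f func(), period time.Duration, stop <-chan struct{}) {
	for {
		// Check stop first so f never runs after stop was closed.
		select {
		case <-stop:
			return
		default:
		}

		f()

		select {
		case <-stop:
			return
		case <-time.After(period):
		}
	}
}

// countRuns runs a loop that closes its own stop channel once f has been
// called limit times, and returns how often f actually ran.
func countRuns(limit int) int {
	stop := make(chan struct{})
	runs := 0
	until(func() {
		runs++
		if runs == limit {
			close(stop)
		}
	}, time.Millisecond, stop)
	return runs
}

func main() {
	fmt.Println(countRuns(3))
}
```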

rc.operationExecutor.RegisterPlugin(....) (pkg/kubelet/pluginmanager/operationexecutor/operation_executor.go):

rc.operationExecutor.RegisterPlugin() performs the plugin registration.

So what does plugin registration actually do? Let's continue.

Call chain of plugin registration:

kl.pluginManager.Run() --> pm.desiredStateOfWorldPopulator.Start() --> pm.reconciler.Run() --> rc.reconcile()
--> rc.operationExecutor.RegisterPlugin() --> oe.pendingOperations.Run() --> oe.operationGenerator.GenerateRegisterPluginFunc()
--> handler.RegisterPlugin() --> nim.InstallCSIDriver() --> nim.updateNode() && nim.updateCSINode()

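The key methods of plugin registration are analyzed below.

GenerateRegisterPluginFunc(....) (pkg/kubelet/pluginmanager/operationexecutor/operation_generator.go):

GenerateRegisterPluginFunc defines a function that registers a plugin and returns it. The registration function does the following:

(1) checks that the Node Driver Registrar's socket can be connected to;
(2) obtains the plugin information through the Node Driver Registrar's socket (the CSI plugin type, the CSI plugin name, the address of the gRPC socket the CSI plugin listens on, and the versions the plugin supports);
(3) calls handler.ValidatePlugin(), which checks whether a plugin with the same name but a higher version is already registered; if so, registration fails and the plugin is notified of the failure. Note: since Kubernetes v1.17, CSI v0.x is no longer supported;
(4) adds the Node Driver Registrar's socket information to the actualStateOfWorld;
(5) calls handler.RegisterPlugin() to carry out the rest of the registration;
(6) calls og.notifyPlugin to tell the plugin whether registration with the Kubelet succeeded or failed.

handler.RegisterPlugin() is therefore analyzed next.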

func (oe *operationExecutor) RegisterPlugin(
    socketPath string,
    timestamp time.Time,
    pluginHandlers map[string]cache.PluginHandler,
    actualStateOfWorld ActualStateOfWorldUpdater) error {
    generatedOperation :=
        oe.operationGenerator.GenerateRegisterPluginFunc(socketPath, timestamp, pluginHandlers, actualStateOfWorld)
 
    return oe.pendingOperations.Run(
        socketPath, generatedOperation)
}
 
func (grm *goRoutineMap) Run(
    operationName string,
    operationFunc func() error) error {
    grm.lock.Lock()
    defer grm.lock.Unlock()
 
    existingOp, exists := grm.operations[operationName]
    if exists {
        // Operation with name exists
        if existingOp.operationPending {
            return NewAlreadyExistsError(operationName)
        }
 
        if err := existingOp.expBackoff.SafeToRetry(operationName); err != nil {
            return err
        }
    }
 
    grm.operations[operationName] = operation{
        operationPending: true,
        expBackoff:       existingOp.expBackoff,
    }
    go func() (err error) {
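        // Catches a panic and logs it. With the default settings, HandleCrash prints the panic log and then re-panics, which makes the failure easier to troubleshoot.
        defer k8sRuntime.HandleCrash()
        /*
          If the plugin register/unregister function returns no error, or grm.exponentialBackOffOnError is false, the operation is removed from grm.operations.
          If it returns an error and grm.exponentialBackOffOnError is true, an exponential backoff is applied, and the function can only be called again once the backoff period has elapsed.
         */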
        // Handle completion of and error, if any, from operationFunc()
        defer grm.operationComplete(operationName, &err)
        // Handle panic, if any, from operationFunc()
        defer k8sRuntime.RecoverFromPanic(&err)
        return operationFunc()
    }()
 
    return nil
}
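
To make the role of goRoutineMap.Run() easier to see, here is a minimal sketch of the same idea: at most one pending operation per name, each run in its own goroutine. It assumes no backoff and no panic recovery; opMap, newOpMap and ErrAlreadyRunning are hypothetical names, not kubelet code.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrAlreadyRunning is a hypothetical stand-in for the "already exists" error.
var ErrAlreadyRunning = errors.New("operation already pending")

// opMap is a simplified, hypothetical version of goRoutineMap without backoff.
type opMap struct {
	mu      sync.Mutex
	pending map[string]bool
	wg      sync.WaitGroup
}

func newOpMap() *opMap {
	return &opMap{pending: map[string]bool{}}
}

// Run starts fn in a goroutine unless an operation with the same name is still pending.
func (m *opMap) Run(name string, fn func() error) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.pending[name] {
		return ErrAlreadyRunning
	}
	m.pending[name] = true
	m.wg.Add(1)
	go func() {
		// Deferred calls run in reverse order: the entry is cleared first, then Done is signalled.
		defer m.wg.Done()
		defer m.complete(name)
		_ = fn()
	}()
	return nil
}

// complete removes the finished operation so that the name can be used again.
func (m *opMap) complete(name string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	delete(m.pending, name)
}

// Wait blocks until every started operation has finished.
func (m *opMap) Wait() { m.wg.Wait() }

func main() {
	m := newOpMap()
	release := make(chan struct{})
	first := m.Run("demo-reg.sock", func() error { <-release; return nil })
	second := m.Run("demo-reg.sock", func() error { return nil })
	fmt.Println("first:", first, "second:", second)
	close(release)
	m.Wait()
	fmt.Println("after completion:", m.Run("demo-reg.sock", func() error { return nil }))
	m.Wait()
}
```

Because the socket path is used as the operation name, a second registration for the same socket is rejected while the first one is still in flight, and the reconciler simply tries again on a later loop.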
func (og *operationGenerator) GenerateRegisterPluginFunc(
    socketPath string,
    timestamp time.Time,
    pluginHandlers map[string]cache.PluginHandler,
    actualStateOfWorldUpdater ActualStateOfWorldUpdater) func() error {
 
    registerPluginFunc := func() error {
        client, conn, err := dial(socketPath, dialTimeoutDuration)
        if err != nil {
            return fmt.Errorf("RegisterPlugin error -- dial failed at socket %s, err: %v", socketPath, err)
        }
        defer conn.Close()
 
        ctx, cancel := context.WithTimeout(context.Background(), time.Second)
        defer cancel()
 
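        // kubelet, acting as the gRPC client, calls GetInfo on the gRPC server (the RegistrationServer service of node-driver-registrar) to get the CSI plugin type, the CSI plugin name, the socket address of the gRPC service the CSI plugin listens on, and the versions the plugin supports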
        infoResp, err := client.GetInfo(ctx, &registerapi.InfoRequest{})
        if err != nil {
            return fmt.Errorf("RegisterPlugin error -- failed to get plugin info using RPC GetInfo at socket %s, err: %v", socketPath, err)
        }
 
        // Check whether a handler for this plugin type was registered with the reconciler; by default kubelet registers handlers for only two types, CSIPlugin and DevicePlugin
        handler, ok := pluginHandlers[infoResp.Type]
        if !ok {
            if err := og.notifyPlugin(client, false, fmt.Sprintf("RegisterPlugin error -- no handler registered for plugin type: %s at socket %s", infoResp.Type, socketPath)); err != nil {
                return fmt.Errorf("RegisterPlugin error -- failed to send error at socket %s, err: %v", socketPath, err)
            }
            return fmt.Errorf("RegisterPlugin error -- no handler registered for plugin type: %s at socket %s", infoResp.Type, socketPath)
        }
 
        // If the socket address of the CSI plugin's gRPC service is empty, fall back to the socket address node-driver-registrar listens on
        if infoResp.Endpoint == "" {
            infoResp.Endpoint = socketPath
        }
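        // Check whether a plugin with the same name and a higher version is already registered; if so, fail the registration and notify the plugin of the failure. Note: CSI v0.x is no longer supported as of Kubernetes v1.17.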
        if err := handler.ValidatePlugin(infoResp.Name, infoResp.Endpoint, infoResp.SupportedVersions); err != nil {
            if err = og.notifyPlugin(client, false, fmt.Sprintf("RegisterPlugin error -- plugin validation failed with err: %v", err)); err != nil {
                return fmt.Errorf("RegisterPlugin error -- failed to send error at socket %s, err: %v", socketPath, err)
            }
            return fmt.Errorf("RegisterPlugin error -- pluginHandler.ValidatePluginFunc failed")
        }
 
        // Add the socket information of this Node Driver Registrar to actualStateOfWorld.
        // We add the plugin to the actual state of world cache before calling a plugin consumer's Register handle
        // so that if we receive a delete event during Register Plugin, we can process it as a DeRegister call.
        err = actualStateOfWorldUpdater.AddPlugin(cache.PluginInfo{
            SocketPath: socketPath,   // socket address node-driver-registrar listens on
            Timestamp:  timestamp,
            Handler:    handler,
            Name:       infoResp.Name,
        })
        if err != nil {
            klog.ErrorS(err, "RegisterPlugin error -- failed to add plugin", "path", socketPath)
        }
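        /*
        Plugin registration:
        1) validate the plugin versions;
        2) store the registered CSI plugin's information in the global variable csiDrivers (later, when kubelet calls the CSI plugin to mount/unmount storage, it looks up the socket address in this variable by plugin name);
        3) kubelet, acting as the gRPC client, calls NodeGetInfo on the gRPC server (the NodeServer service of the CSI plugin), which returns the node ID as reported by the driver, the maximum number of volumes per node, and topology information;
        4) update the node object's annotations, and its labels if accessibleTopology is not nil; create or update the CSINode object with this plugin's driver information.
         */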
        if err := handler.RegisterPlugin(infoResp.Name, infoResp.Endpoint, infoResp.SupportedVersions); err != nil {
            return og.notifyPlugin(client, false, fmt.Sprintf("RegisterPlugin error -- plugin registration failed with err: %v", err))
        }
 
        // Notify is called after register to guarantee that even if notify throws an error Register will always be called after validate
        if err := og.notifyPlugin(client, true, ""); err != nil {
            return fmt.Errorf("RegisterPlugin error -- failed to send registration status at socket %s, err: %v", socketPath, err)
        }
        return nil
    }
    return registerPluginFunc
}
 
// Tells the Node Driver Registrar whether the plugin was registered successfully
func (og *operationGenerator) notifyPlugin(client registerapi.RegistrationClient, registered bool, errStr string) error {
    ctx, cancel := context.WithTimeout(context.Background(), notifyTimeoutDuration)
    defer cancel()
 
    status := &registerapi.RegistrationStatus{
        PluginRegistered: registered,
        Error:            errStr,
    }
 
    if _, err := client.NotifyRegistrationStatus(ctx, status); err != nil {
        return fmt.Errorf("%s: %w", errStr, err)
    }
 
    if errStr != "" {
        return errors.New(errStr)
    }
 
    return nil
}
 
// Creates the gRPC client used to reach the Node Driver Registrar (RegistrationServer service)
// Dial establishes the gRPC communication with the picked up plugin socket. https://godoc.org/google.golang.org/grpc#Dial
func dial(unixSocketPath string, timeout time.Duration) (registerapi.RegistrationClient, *grpc.ClientConn, error) {
    ctx, cancel := context.WithTimeout(context.Background(), timeout)
    defer cancel()
 
    c, err := grpc.DialContext(ctx, unixSocketPath, grpc.WithInsecure(), grpc.WithBlock(),
        grpc.WithContextDialer(func(ctx context.Context, addr string) (net.Conn, error) {
            return (&net.Dialer{}).DialContext(ctx, "unix", addr)
        }),
    )
 
    if err != nil {
        return nil, nil, fmt.Errorf("failed to dial socket %s, err: %v", unixSocketPath, err)
    }
 
    return registerapi.NewRegistrationClient(c), c, nil
}
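
The core of dial() is "connect to a unix socket, but give up after a timeout". A minimal sketch of just that part follows, using only the Go standard library and a one-line text reply instead of gRPC. serveOnce, dialUnix, queryPluginName and demo are hypothetical helpers, and demo.csi.example.com is a made-up driver name.

```go
package main

import (
	"bufio"
	"context"
	"fmt"
	"net"
	"os"
	"path/filepath"
	"strings"
	"time"
)

// serveOnce listens on a unix socket and answers a single connection with one line.
func serveOnce(socketPath, reply string) (net.Listener, error) {
	ln, err := net.Listen("unix", socketPath)
	if err != nil {
		return nil, err
	}
	go func() {
		conn, err := ln.Accept()
		if err != nil {
			return
		}
		defer conn.Close()
		fmt.Fprintln(conn, reply)
	}()
	return ln, nil
}

// dialUnix connects to a unix socket and gives up when the timeout expires.
func dialUnix(socketPath string, timeout time.Duration) (net.Conn, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return (&net.Dialer{}).DialContext(ctx, "unix", socketPath)
}

// queryPluginName dials the socket and reads the one-line reply.
func queryPluginName(socketPath string) (string, error) {
	conn, err := dialUnix(socketPath, time.Second)
	if err != nil {
		return "", fmt.Errorf("failed to dial socket %s: %w", socketPath, err)
	}
	defer conn.Close()
	line, err := bufio.NewReader(conn).ReadString('\n')
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(line), nil
}

// demo creates a temporary socket, serves one reply on it and queries it.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "reg")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	socketPath := filepath.Join(dir, "demo-reg.sock")
	ln, err := serveOnce(socketPath, "demo.csi.example.com")
	if err != nil {
		return "", err
	}
	defer ln.Close()
	return queryPluginName(socketPath)
}

func main() {
	name, err := demo()
	fmt.Println(name, err)
}
```

In kubelet the same net.Dialer call is wrapped in grpc.WithContextDialer, so the gRPC client talks to the registrar over the unix socket rather than TCP.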
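
handler.RegisterPlugin() (pkg/volume/csi/csi_plugin.go):

Main logic of handler.RegisterPlugin():

(1) Store the plugin information (mainly the plugin name and the plugin's socket address) in the csiDrivers variable (later, when kubelet calls the CSI plugin to mount/unmount storage, it looks up the socket address in this variable by plugin name);
(2) kubelet, acting as the gRPC client, calls NodeGetInfo on the gRPC server (the NodeServer service of the CSI plugin), which returns the node ID as reported by the driver, the maximum number of volumes per node, and topology information;
(3) Call nim.InstallCSIDriver to carry out the rest of the registration (update the node object's annotations, and its labels if accessibleTopology is not nil; create or update the CSINode object).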
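
Step (1) boils down to a name-to-endpoint lookup table shared between goroutines. The sketch below shows one way such a store could look; driver and driverStore are hypothetical simplifications (the version is kept as a plain string here), not the types used in csi_plugin.go.

```go
package main

import (
	"fmt"
	"sync"
)

// driver holds what is needed to reach a registered CSI plugin (hypothetical type).
type driver struct {
	endpoint                string
	highestSupportedVersion string
}

// driverStore is a hypothetical thread-safe map keyed by CSI driver name.
type driverStore struct {
	mu    sync.RWMutex
	store map[string]driver
}

// Set adds or replaces the entry for a driver name.
func (s *driverStore) Set(name string, d driver) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.store == nil {
		s.store = map[string]driver{}
	}
	s.store[name] = d
}

// Get returns the entry for a driver name and whether it exists.
func (s *driverStore) Get(name string) (driver, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	d, ok := s.store[name]
	return d, ok
}

// Delete removes the entry for a driver name, if any.
func (s *driverStore) Delete(name string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.store, name)
}

func main() {
	var drivers driverStore
	drivers.Set("demo.csi.example.com", driver{
		endpoint:                "/var/lib/kubelet/plugins/demo.csi.example.com/csi.sock",
		highestSupportedVersion: "1.0.0",
	})
	d, ok := drivers.Get("demo.csi.example.com")
	fmt.Println(d.endpoint, ok)
	drivers.Delete("demo.csi.example.com")
	_, ok = drivers.Get("demo.csi.example.com")
	fmt.Println(ok)
}
```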

// RegisterPlugin is called when a plugin can be registered
func (h *RegistrationHandler) RegisterPlugin(pluginName string, endpoint string, versions []string) error {
    klog.Infof(log("Register new plugin with name: %s at endpoint: %s", pluginName, endpoint))
 
    // Validate the versions
    highestSupportedVersion, err := h.validateVersions("RegisterPlugin", pluginName, endpoint, versions)
    if err != nil {
        return err
    }
 
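    // csiDrivers is a global variable holding the registered CSI plugins, including the socket address of each plugin's gRPC service and the highest version the plugin supports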
    // Storing endpoint of newly registered CSI driver into the map, where CSI driver name will be the key
    // all other CSI components will be able to get the actual socket of CSI drivers by its name.
    csiDrivers.Set(pluginName, Driver{
        endpoint:                endpoint,
        highestSupportedVersion: highestSupportedVersion,
    })
 
    // Get node info from the driver.
    csi, err := newCsiDriverClient(csiDriverName(pluginName))
    if err != nil {
        return err
    }
 
    ctx, cancel := context.WithTimeout(context.Background(), csiTimeout)
    defer cancel()
 
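    // kubelet, acting as the gRPC client, calls NodeGetInfo on the gRPC server (the NodeServer service of the CSI plugin), which returns the node ID as reported by the driver, the maximum number of volumes per node, and topology information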
    driverNodeID, maxVolumePerNode, accessibleTopology, err := csi.NodeGetInfo(ctx)
    if err != nil {
        if unregErr := unregisterDriver(pluginName); unregErr != nil {
            klog.Error(log("registrationHandler.RegisterPlugin failed to unregister plugin due to previous error: %v", unregErr))
        }
        return err
    }
 
    // Install the driver: update the node object's annotations, and its labels if accessibleTopology is not nil; create or update the CSINode object.
    err = nim.InstallCSIDriver(pluginName, driverNodeID, maxVolumePerNode, accessibleTopology)
    if err != nil {
        if unregErr := unregisterDriver(pluginName); unregErr != nil {
            klog.Error(log("registrationHandler.RegisterPlugin failed to unregister plugin due to previous error: %v", unregErr))
        }
        return err
    }
 
    return nil
}
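
nim.InstallCSIDriver() (pkg/volume/csi/nodeinfomanager/nodeinfomanager.go)

The main logic of nim.InstallCSIDriver() lives in updateNodeIDInNode() and nim.updateCSINode() (updateTopologyLabels() additionally writes the reported topology to the node's labels):
(1) updateNodeIDInNode(): updates the node object by adding the registered plugin to the value of the annotation whose key is csi.volume.kubernetes.io/nodeid.
(2) nim.updateCSINode(): creates or updates the CSINode object.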
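
The annotation value is a JSON object mapping driver name to node ID, so adding a driver is a read-modify-write on that JSON. A minimal sketch, working on a plain map instead of a v1.Node, is shown below; setNodeID is a hypothetical helper and the caller is assumed to pass a non-nil annotations map.

```go
package main

import (
	"encoding/json"
	"fmt"
)

const annotationKeyNodeID = "csi.volume.kubernetes.io/nodeid"

// setNodeID adds or updates driverName -> nodeID inside the JSON annotation value.
// It reports whether the annotations were changed. annotations must be non-nil.
func setNodeID(annotations map[string]string, driverName, nodeID string) (bool, error) {
	ids := map[string]string{}
	if raw := annotations[annotationKeyNodeID]; raw != "" {
		if err := json.Unmarshal([]byte(raw), &ids); err != nil {
			return false, fmt.Errorf("failed to parse %q annotation: %w", annotationKeyNodeID, err)
		}
	}
	if cur, ok := ids[driverName]; ok && cur == nodeID {
		// The value is already present, nothing to update.
		return false, nil
	}
	ids[driverName] = nodeID
	out, err := json.Marshal(ids) // map keys are emitted in sorted order
	if err != nil {
		return false, err
	}
	annotations[annotationKeyNodeID] = string(out)
	return true, nil
}

func main() {
	annotations := map[string]string{
		annotationKeyNodeID: `{"nfs.csi.k8s.io":"node2"}`,
	}
	changed, err := setNodeID(annotations, "rook-ceph.rbd.csi.ceph.com", "node2")
	fmt.Println(changed, err)
	fmt.Println(annotations[annotationKeyNodeID])
}
```

Returning a "changed" flag mirrors the boolean in nodeUpdateFunc: when nothing changed, the caller can skip the API update entirely.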

func (nim *nodeInfoManager) InstallCSIDriver(driverName string, driverNodeID string, maxAttachLimit int64, topology map[string]string) error {
    if driverNodeID == "" {
        return fmt.Errorf("error adding CSI driver node info: driverNodeID must not be empty")
    }
 
    nodeUpdateFuncs := []nodeUpdateFunc{
        updateNodeIDInNode(driverName, driverNodeID),
        updateTopologyLabels(topology),
    }
 
    err := nim.updateNode(nodeUpdateFuncs...)
    if err != nil {
        return fmt.Errorf("error updating Node object with CSI driver node info: %v", err)
    }
 
    err = nim.updateCSINode(driverName, driverNodeID, maxAttachLimit, topology)
    if err != nil {
        return fmt.Errorf("error updating CSINode object with CSI driver node info: %v", err)
    }
 
    return nil
}
 
// updateNodeIDInNode returns a function that updates a Node object with the given
// Node ID information.
func updateNodeIDInNode(
    csiDriverName string,
    csiDriverNodeID string) nodeUpdateFunc {
    return func(node *v1.Node) (*v1.Node, bool, error) {
        existingDriverMap, err := buildNodeIDMapFromAnnotation(node)
        if err != nil {
            return nil, false, err
        }
 
        if val, ok := existingDriverMap[csiDriverName]; ok {
            if val == csiDriverNodeID {
                // Value already exists in node annotation, nothing more to do
                return node, false, nil
            }
        }
 
        // Add/update annotation value
        existingDriverMap[csiDriverName] = csiDriverNodeID
        jsonObj, err := json.Marshal(existingDriverMap)
        if err != nil {
            return nil, false, fmt.Errorf(
                "error while marshalling node ID map updated with driverName=%q, nodeID=%q: %v",
                csiDriverName,
                csiDriverNodeID,
                err)
        }
 
        if node.ObjectMeta.Annotations == nil {
            node.ObjectMeta.Annotations = make(map[string]string)
        }
        node.ObjectMeta.Annotations[annotationKeyNodeID] = string(jsonObj)
 
        return node, true, nil
    }
}
 
// updateTopologyLabels returns a function that updates labels of a Node object with the given
// topology information.
func updateTopologyLabels(topology map[string]string) nodeUpdateFunc {
    return func(node *v1.Node) (*v1.Node, bool, error) {
        if len(topology) == 0 {
            return node, false, nil
        }
 
        for k, v := range topology {
            if curVal, exists := node.Labels[k]; exists && curVal != v {
                return nil, false, fmt.Errorf("detected topology value collision: driver reported %q:%q but existing label is %q:%q", k, v, k, curVal)
            }
        }
 
        if node.Labels == nil {
            node.Labels = make(map[string]string)
        }
        for k, v := range topology {
            node.Labels[k] = v
        }
        return node, true, nil
    }
}

rc.operationExecutor.UnregisterPlugin(....)  (pkg/kubelet/pluginmanager/operationexecutor/operation_executor.go):

Main logic of rc.operationExecutor.UnregisterPlugin(): perform the plugin unregistration.

So what exactly does plugin unregistration do? Let's keep going.

Call chain of the plugin unregistration operation
kl.pluginManager.Run() --> pm.desiredStateOfWorldPopulator.Start() --> pm.reconciler.Run() --> rc.reconcile()
--> rc.operationExecutor.UnregisterPlugin() --> oe.pendingOperations.Run() --> oe.operationGenerator.GenerateUnregisterPluginFunc() --> handler.DeRegisterPlugin()
--> unregisterDriver() --> nim.UninstallCSIDriver() --> nim.uninstallDriverFromCSINode() && nim.updateNode(removeMaxAttachLimit(driverName), removeNodeIDFromNode(driverName))

Below, some of the key methods involved in plugin unregistration are analyzed.

GenerateUnregisterPluginFunc(....)  (pkg/kubelet/pluginmanager/operationexecutor/operation_generator.go)

GenerateUnregisterPluginFunc defines a plugin unregistration function and returns it. The main logic of that unregistration function is:
(1) Remove the socket information of this Node Driver Registrar from actualStateOfWorld;
(2) Call handler.DeRegisterPlugin to carry out the rest of the plugin unregistration.

So handler.DeRegisterPlugin is analyzed next.

func (oe *operationExecutor) UnregisterPlugin(
    pluginInfo cache.PluginInfo,
    actualStateOfWorld ActualStateOfWorldUpdater) error {
    generatedOperation :=
        oe.operationGenerator.GenerateUnregisterPluginFunc(pluginInfo, actualStateOfWorld)
 
    return oe.pendingOperations.Run(
        pluginInfo.SocketPath, generatedOperation)
}
 
func (grm *goRoutineMap) Run(
    operationName string,
    operationFunc func() error) error {
    grm.lock.Lock()
    defer grm.lock.Unlock()
 
    existingOp, exists := grm.operations[operationName]
        ......
        ......
 
    return nil
}
 
func (og *operationGenerator) GenerateUnregisterPluginFunc(
    pluginInfo cache.PluginInfo,
    actualStateOfWorldUpdater ActualStateOfWorldUpdater) func() error {
 
    unregisterPluginFunc := func() error {
        if pluginInfo.Handler == nil {
            return fmt.Errorf("UnregisterPlugin error -- failed to get plugin handler for %s", pluginInfo.SocketPath)
        }
 
        // Remove the socket information of this Node Driver Registrar from actualStateOfWorld.
        // We remove the plugin to the actual state of world cache before calling a plugin consumer's Unregister handle
        // so that if we receive a register event during Register Plugin, we can process it as a Register call.
        actualStateOfWorldUpdater.RemovePlugin(pluginInfo.SocketPath)
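        /*
        Plugin removal:
        1) delete the registered CSI plugin's information from the global variable csiDrivers;
        2) update the CSINode object to remove this plugin's driver entry; update the node object's annotations, and its status if an attach limit for this driver exists in capacity or allocatable.
         */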
        pluginInfo.Handler.DeRegisterPlugin(pluginInfo.Name)
 
        klog.V(4).InfoS("DeRegisterPlugin called", "pluginName", pluginInfo.Name, "pluginHandler", pluginInfo.Handler)
        return nil
    }
    return unregisterPluginFunc
}

handler.DeRegisterPlugin() (pkg/volume/csi/csi_plugin.go):

The logic of handler.DeRegisterPlugin() is simple: it mainly calls unregisterDriver().

Main logic of unregisterDriver():
(1) Delete the plugin information from the csiDrivers variable (kubelet looks up the socket address in csiDrivers by plugin name when it calls the CSI plugin to mount/unmount storage, so the plugin must be removed from csiDrivers when it is unregistered);
(2) Call nim.UninstallCSIDriver() for further processing.

func (h *RegistrationHandler) DeRegisterPlugin(pluginName string) {
    klog.Info(log("registrationHandler.DeRegisterPlugin request for plugin %s", pluginName))
    if err := unregisterDriver(pluginName); err != nil {
        klog.Error(log("registrationHandler.DeRegisterPlugin failed: %v", err))
    }
}
 
func unregisterDriver(driverName string) error {
    csiDrivers.Delete(driverName)
 
    if err := nim.UninstallCSIDriver(driverName); err != nil {
        return errors.New(log("Error uninstalling CSI driver: %v", err))
    }
 
    return nil
}

nim.UninstallCSIDriver() (pkg/volume/csi/nodeinfomanager/nodeinfomanager.go):

Next comes the analysis of nim.UninstallCSIDriver().

The main logic of nim.UninstallCSIDriver() lives in three methods: nim.uninstallDriverFromCSINode(), removeMaxAttachLimit() and removeNodeIDFromNode():
(1) nim.uninstallDriverFromCSINode(): updates the CSINode object, removing the information of the unregistered plugin.
(2) removeMaxAttachLimit(): updates the node object, removing the information of the unregistered plugin from node.Status.Capacity and node.Status.Allocatable.
(3) removeNodeIDFromNode(): updates the node object, removing the unregistered plugin from the value of the annotation whose key is csi.volume.kubernetes.io/nodeid.

Example of the node object's annotation:

csi.volume.kubernetes.io/nodeid: '{"nfs.csi.k8s.io":"node2","rook-ceph.cephfs.csi.ceph.com":"node2","rook-ceph.rbd.csi.ceph.com":"node2"}'
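
Taking an annotation value like the one above, removing a driver can be sketched as follows. It works on a plain map instead of a v1.Node; removeNodeID is a hypothetical helper written for illustration, and it drops the annotation key entirely once no driver is left.

```go
package main

import (
	"encoding/json"
	"fmt"
)

const annotationKeyNodeID = "csi.volume.kubernetes.io/nodeid"

// removeNodeID deletes driverName from the JSON annotation value and reports
// whether the annotations were changed.
func removeNodeID(annotations map[string]string, driverName string) (bool, error) {
	raw := annotations[annotationKeyNodeID]
	if raw == "" {
		return false, nil
	}
	ids := map[string]string{}
	if err := json.Unmarshal([]byte(raw), &ids); err != nil {
		return false, fmt.Errorf("failed to parse %q annotation: %w", annotationKeyNodeID, err)
	}
	if _, ok := ids[driverName]; !ok {
		// The driver is already absent, nothing to update.
		return false, nil
	}
	delete(ids, driverName)
	if len(ids) == 0 {
		// No drivers left: remove the annotation key itself.
		delete(annotations, annotationKeyNodeID)
		return true, nil
	}
	out, err := json.Marshal(ids)
	if err != nil {
		return false, err
	}
	annotations[annotationKeyNodeID] = string(out)
	return true, nil
}

func main() {
	annotations := map[string]string{
		annotationKeyNodeID: `{"nfs.csi.k8s.io":"node2","rook-ceph.rbd.csi.ceph.com":"node2"}`,
	}
	changed, err := removeNodeID(annotations, "nfs.csi.k8s.io")
	fmt.Println(changed, err)
	fmt.Println(annotations[annotationKeyNodeID])
}
```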

Example of a CSINode object:

[root@node2 ~]# kubectl get csinodes.storage.k8s.io node2  -o yaml
apiVersion: storage.k8s.io/v1
kind: CSINode
metadata:
  annotations:
    storage.alpha.kubernetes.io/migrated-plugins: kubernetes.io/cinder
  creationTimestamp: "2022-03-27T04:38:23Z"
  name: node2
  ownerReferences:
  - apiVersion: v1
    kind: Node
    name: node2
    uid: 1ec4f78b-9144-4255-8b65-464eccdb032b
  resourceVersion: "104431682"
  uid: 8542cd13-8f12-4549-8d05-94723fd83b8f
spec:
  drivers:
  - name: nfs.csi.k8s.io
    nodeID: node2
    topologyKeys: null
  - name: rook-ceph.cephfs.csi.ceph.com
    nodeID: node2
    topologyKeys: null
  - name: rook-ceph.rbd.csi.ceph.com
    nodeID: node2
    topologyKeys: null

Source code of nim.UninstallCSIDriver():

func (nim *nodeInfoManager) UninstallCSIDriver(driverName string) error {
    err := nim.uninstallDriverFromCSINode(driverName)
    if err != nil {
        return fmt.Errorf("error uninstalling CSI driver from CSINode object %v", err)
    }
 
    err = nim.updateNode(
        removeMaxAttachLimit(driverName),
        removeNodeIDFromNode(driverName),
    )
    if err != nil {
        return fmt.Errorf("error removing CSI driver node info from Node object %v", err)
    }
    return nil
}
 
func (nim *nodeInfoManager) uninstallDriverFromCSINode(
    csiDriverName string) error {
 
    csiKubeClient := nim.volumeHost.GetKubeClient()
    if csiKubeClient == nil {
        return fmt.Errorf("error getting CSI client")
    }
 
    var updateErrs []error
    err := wait.ExponentialBackoff(updateBackoff, func() (bool, error) {
        if err := nim.tryUninstallDriverFromCSINode(csiKubeClient, csiDriverName); err != nil {
            updateErrs = append(updateErrs, err)
            return false, nil
        }
        return true, nil
    })
    if err != nil {
        return fmt.Errorf("error updating CSINode: %v; caused by: %v", err, utilerrors.NewAggregate(updateErrs))
    }
    return nil
}
 
 
func removeMaxAttachLimit(driverName string) nodeUpdateFunc {
    return func(node *v1.Node) (*v1.Node, bool, error) {
        limitKey := v1.ResourceName(util.GetCSIAttachLimitKey(driverName))
 
        capacityExists := false
        if node.Status.Capacity != nil {
            _, capacityExists = node.Status.Capacity[limitKey]
        }
 
        allocatableExists := false
        if node.Status.Allocatable != nil {
            _, allocatableExists = node.Status.Allocatable[limitKey]
        }
 
        if !capacityExists && !allocatableExists {
            return node, false, nil
        }
 
        delete(node.Status.Capacity, limitKey)
        if len(node.Status.Capacity) == 0 {
            node.Status.Capacity = nil
        }
 
        delete(node.Status.Allocatable, limitKey)
        if len(node.Status.Allocatable) == 0 {
            node.Status.Allocatable = nil
        }
 
        return node, true, nil
    }
}
 
// removeNodeIDFromNode returns a function that removes node ID information matching the given
// driver name from a Node object.
func removeNodeIDFromNode(csiDriverName string) nodeUpdateFunc {
    return func(node *v1.Node) (*v1.Node, bool, error) {
        var previousAnnotationValue string
        if node.ObjectMeta.Annotations != nil {
            previousAnnotationValue =
                node.ObjectMeta.Annotations[annotationKeyNodeID]
        }
 
        if previousAnnotationValue == "" {
            return node, false, nil
        }
 
        // Parse previousAnnotationValue as JSON
        existingDriverMap := map[string]string{}
        if err := json.Unmarshal([]byte(previousAnnotationValue), &existingDriverMap); err != nil {
            return nil, false, fmt.Errorf(
                "failed to parse node's %q annotation value (%q) err=%v",
                annotationKeyNodeID,
                previousAnnotationValue,
                err)
        }
 
        if _, ok := existingDriverMap[csiDriverName]; !ok {
            // Value is already missing in node annotation, nothing more to do
            return node, false, nil
        }
 
        // Delete annotation value
        delete(existingDriverMap, csiDriverName)
        if len(existingDriverMap) == 0 {
            delete(node.ObjectMeta.Annotations, annotationKeyNodeID)
        } else {
            jsonObj, err := json.Marshal(existingDriverMap)
            if err != nil {
                return nil, false, fmt.Errorf(
                    "failed while trying to remove key %q from node %q annotation. Existing data: %v",
                    csiDriverName,
                    annotationKeyNodeID,
                    previousAnnotationValue)
            }
 
            node.ObjectMeta.Annotations[annotationKeyNodeID] = string(jsonObj)
        }
 
        return node, true, nil
    }
}

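4. Summary

This article explained how kubelet registers CSI plugins and analyzed the relevant code, touching on the Node Driver Registrar along the way. To summarize:

kubelet's pluginManager watches a specific directory. Node Driver Registrar, the component responsible for registering a CSI driver with kubelet, creates its service socket in that directory (each plugin has its own Node Driver Registrar, i.e. one Node Driver Registrar registers exactly one plugin). pluginManager obtains the plugin information (the plugin's socket address, the plugin name, and so on) through that socket, and so registers or unregisters plugins as socket files are added to or removed from the directory.

Once a plugin is registered, kubelet communicates with the CSI plugin through the socket the plugin exposes, for operations such as mounting and unmounting volumes.

Here is what the register and unregister operations do in kubelet's pluginManager.

Plugin registration

(1) Store the plugin information (mainly the plugin name and the plugin's socket address) in the csiDrivers variable (later, when kubelet calls the CSI plugin to mount/unmount storage, it looks up the socket address in this variable by plugin name);
(2) Update the node object, adding the registered plugin to the value of the annotation whose key is csi.volume.kubernetes.io/nodeid.
(3) Create or update the CSINode object.

Plugin unregistration

(1) Delete the plugin information from the csiDrivers variable (kubelet looks up the socket address in csiDrivers by plugin name when it calls the CSI plugin to mount/unmount storage, so the plugin must be removed from csiDrivers when it is unregistered);
(2) Update the CSINode object, removing the information of the unregistered plugin.
(3) Update the node object, removing the information of the unregistered plugin from node.Status.Capacity and node.Status.Allocatable.
(4) Update the node object, removing the unregistered plugin from the value of the annotation whose key is csi.volume.kubernetes.io/nodeid.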

Reference: https://juejin.cn/post/7126222522755842055

Reference: https://www.cnblogs.com/lianngkyle/p/14906274.html

Reference: https://www.jianshu.com/p/1d21a1e529d7

Posted by 人艰不拆_zmc