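Kubernetes CSI Plugin Registration (Part 1): node-driver-registrar Source Code Analysis
1. Overview
node-driver-registrar is a sidecar container maintained by Kubernetes SIG Storage. Its main job is to obtain the CSI plugin's information and serve the plugin's registration information to the kubelet through a gRPC service (the RegistrationServer). Once the kubelet has registered the CSI plugin through that service, it can call the plugin whenever it performs volume mount/unmount operations. "CSI plugin" and "CSI driver" mean the same thing in this article.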
node-driver-registrar source repository: https://github.com/kubernetes-csi/node-driver-registrar.git
Note: this article is based on node-driver-registrar v2.6.3; the example manifest in section 2 references the v2.5.1 image.
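2. The node-driver-registrar deployment YAML explained
node-driver-registrar runs as a sidecar in the same Pod as the CSI plugin's NodeServer container. The Pod is deployed by a DaemonSet, usually named csi-<storage type>-node. The registrar obtains the CSI plugin information from the NodeServer container.
The csi-nfs-node DaemonSet of the NFS CSI plugin is used below to show how the node-driver-registrar container depends on the CSI plugin NodeServer container, and what each part of the registrar's container spec means.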
---
kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: csi-nfs-node
  namespace: kube-system
spec:
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 1
    type: RollingUpdate
  selector:
    matchLabels:
      app: csi-nfs-node
  template:
    metadata:
      labels:
        app: csi-nfs-node
    spec:
      hostNetwork: true  # original nfs connection would be broken without hostNetwork setting
      dnsPolicy: Default  # available values: Default, ClusterFirstWithHostNet, ClusterFirst
      serviceAccountName: csi-nfs-node-sa
      nodeSelector:
        kubernetes.io/os: linux
      tolerations:
        - operator: "Exists"
      containers:
        - name: liveness-probe
          image: registry.k8s.io/sig-storage/livenessprobe:v2.7.0
          args:
            - --csi-address=/csi/csi.sock
            - --probe-timeout=3s
            - --health-port=29653
            - --v=2
          volumeMounts:
            - name: socket-dir
              mountPath: /csi
          resources:
            limits:
              memory: 100Mi
            requests:
              cpu: 10m
              memory: 20Mi
        - name: node-driver-registrar
          image: registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.5.1
          args:
            - --v=2
            # Socket the CSI plugin listens on. The registrar connects to it as a
            # gRPC client to fetch plugin information from the CSI plugin server.
            - --csi-address=/csi/csi.sock
            # Host path of the socket the CSI plugin listens on. During registration
            # the kubelet calls GetInfo on the registrar's RegistrationServer, which
            # returns this path. After registration succeeds, the kubelet uses this
            # socket to call the CSI plugin for volume mount/unmount operations.
            - --kubelet-registration-path=$(DRIVER_REG_SOCK_PATH)
          livenessProbe:
            exec:
              command:
                # Health check: verifies that the CSI plugin has been registered
                - /csi-node-driver-registrar
                - --kubelet-registration-path=$(DRIVER_REG_SOCK_PATH)
                - --mode=kubelet-registration-probe
            initialDelaySeconds: 30
            timeoutSeconds: 15
          env:
            - name: DRIVER_REG_SOCK_PATH
              value: /var/lib/kubelet/plugins/csi-nfsplugin/csi.sock
            - name: KUBE_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          volumeMounts:
            - name: socket-dir
              mountPath: /csi
            - name: registration-dir
              mountPath: /registration
          resources:
            limits:
              memory: 100Mi
            requests:
              cpu: 10m
              memory: 20Mi
        - name: nfs
          # The CSI plugin NodeServer container is not the focus of this article,
          # so its configuration is only explained briefly.
          securityContext:
            privileged: true
            capabilities:
              add: ["SYS_ADMIN"]
            allowPrivilegeEscalation: true
          # CSI plugin image. Once running, it acts as the CSI gRPC server that
          # provides plugin information to the node-driver-registrar container.
          image: registry.k8s.io/sig-storage/nfsplugin:v4.1.0
          args:
            - "-v=5"
            - "--nodeid=$(NODE_ID)"
            - "--endpoint=$(CSI_ENDPOINT)"
          env:
            - name: NODE_ID
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            # Uses a Unix domain socket
            - name: CSI_ENDPOINT
              value: unix:///csi/csi.sock
          ports:
            - containerPort: 29653
              name: healthz
              protocol: TCP
          livenessProbe:
            failureThreshold: 5
            httpGet:
              path: /healthz
              port: healthz
            initialDelaySeconds: 30
            timeoutSeconds: 10
            periodSeconds: 30
          imagePullPolicy: "IfNotPresent"
          volumeMounts:
            # Because the volume is a hostPath, the socket file the CSI plugin
            # creates when it starts serving ends up on the host under
            # /var/lib/kubelet/plugins/<csi plugin type>.
            - name: socket-dir
              mountPath: /csi
            - name: pods-mount-dir
              mountPath: /var/lib/kubelet/pods
              mountPropagation: "Bidirectional"
          resources:
            limits:
              memory: 300Mi
            requests:
              cpu: 10m
              memory: 20Mi
      volumes:
        - name: socket-dir
          hostPath:
            # /var/lib/kubelet/plugins holds the sockets that CSI plugins listen on
            # as gRPC servers. Each CSI plugin creates its own subdirectory there
            # and places its csi.sock in it. In this example the NFS CSI plugin
            # uses the directory csi-nfsplugin.
            path: /var/lib/kubelet/plugins/csi-nfsplugin
            # If nothing exists at the path, an empty directory is created with
            # permission 0755 and the same owner and group as the kubelet.
            type: DirectoryOrCreate
        - name: pods-mount-dir
          hostPath:
            path: /var/lib/kubelet/pods
            type: Directory
        - hostPath:
            # The kubelet plugin registration mechanism watches this directory to
            # register kubelet plugins (device plugins and CSI plugins). For a CSI
            # plugin, the socket that node-driver-registrar listens on as a gRPC
            # server is placed here, named <csi plugin name>-reg.sock.
            path: /var/lib/kubelet/plugins_registry
            # The directory must already exist.
            type: Directory
          name: registration-dir
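The two most important startup flags of the node-driver-registrar container each configure a socket address:
- csi-address: the socket the CSI plugin listens on. node-driver-registrar connects to it as a gRPC client to fetch plugin information from the CSI plugin server.
- kubelet-registration-path: the host path of the socket the CSI plugin listens on. During plugin registration the kubelet calls the GetInfo method of the registrar's gRPC service (RegistrationServer), which returns kubelet-registration-path. After registration succeeds, the kubelet uses this socket to call the CSI plugin when it performs volume mount/unmount operations.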
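The relationship between these paths can be sketched in a few lines of Go. This is a minimal illustration, not the registrar's code: the function names registrationSocketPath and probeFilePath are made up here, while the path formulas follow the buildSocketPath and registrationProbePath logic quoted in section 3, and the sample values come from the example DaemonSet.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// registrationSocketPath follows the formula used by buildSocketPath in the
// registrar source: <plugin registration dir>/<driver name>-reg.sock.
// The function name itself is hypothetical.
func registrationSocketPath(pluginRegistrationDir, driverName string) string {
	return fmt.Sprintf("%s/%s-reg.sock", pluginRegistrationDir, driverName)
}

// probeFilePath follows the way main() derives the liveness probe file:
// a file named "registration" in the directory of kubelet-registration-path.
// The function name itself is hypothetical.
func probeFilePath(kubeletRegistrationPath string) string {
	return filepath.Join(filepath.Dir(kubeletRegistrationPath), "registration")
}

func main() {
	// Values taken from the example DaemonSet above.
	kubeletRegistrationPath := "/var/lib/kubelet/plugins/csi-nfsplugin/csi.sock"

	fmt.Println("registration socket:", registrationSocketPath("/registration", "nfs.csi.k8s.io"))
	fmt.Println("probe file:         ", probeFilePath(kubeletRegistrationPath))
}
```

Note that /registration is the container-side mount of /var/lib/kubelet/plugins_registry, so the kubelet sees the same socket under the host path.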
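Permissions required by the node-driver-registrar container:
- RBAC: the node-driver-registrar container does not need to access the Kubernetes API, so no RBAC configuration is required for it.
- Host access: the parent directories of the two sockets mentioned above must be mounted into the container as hostPath volumes, and the container needs permission to create, delete, and access the sockets. Where the default container permissions are not sufficient, this can be granted with containers[].securityContext.privileged=true. Note that the example DaemonSet above sets privileged only on the nfs container.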
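3. node-driver-registrar source code walkthrough
3.1 main()
Start with main().
Main logic:
(1) validate the startup flags;
(2) connect to the socket of the gRPC service exposed by the CSI plugin and call GetPluginInfo to obtain the CSI plugin's driver name;
(3) call nodeRegister(), which serves the plugin's registration information to the kubelet through the gRPC service (RegistrationServer).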
func main() {
	klog.InitFlags(nil)
	flag.Set("logtostderr", "true")
	flag.Parse()

	if *showVersion {
		fmt.Println(os.Args[0], version)
		return
	}

	// e.g. /var/lib/kubelet/plugins/csi-nfsplugin/csi.sock
	if *kubeletRegistrationPath == "" {
		klog.Error("kubelet-registration-path is a required parameter")
		os.Exit(1)
	}
	// e.g. /var/lib/kubelet/plugins/csi-nfsplugin
	// set after we made sure that *kubeletRegistrationPath exists
	kubeletRegistrationPathDir := filepath.Dir(*kubeletRegistrationPath)
	// Liveness probe file. The registrar creates this file inside its own
	// container when the kubelet calls GetInfo during plugin registration,
	// e.g. /var/lib/kubelet/plugins/csi-nfsplugin/registration
	registrationProbePath = filepath.Join(kubeletRegistrationPathDir, "registration")

	// with the mode kubelet-registration-probe
	if modeIsKubeletRegistrationProbe() {
		lockfileExists, err := util.DoesFileExist(registrationProbePath)
		if err != nil {
			klog.Fatalf("Failed to check if registration path exists, registrationProbePath=%s err=%v", registrationProbePath, err)
			os.Exit(1)
		}
		if !lockfileExists {
			klog.Fatalf("Kubelet plugin registration hasn't succeeded yet, file=%s doesn't exist.", registrationProbePath)
			os.Exit(1)
		}
		klog.Infof("Kubelet plugin registration succeeded.")
		os.Exit(0)
	}

	klog.Infof("Version: %s", version)
	klog.Infof("Running node-driver-registrar in mode=%s", *mode)

	if *healthzPort > 0 && *httpEndpoint != "" {
		klog.Error("only one of `--health-port` and `--http-endpoint` can be set.")
		os.Exit(1)
	}
	// Address of the HTTP health endpoint (not the registration gRPC socket)
	var addr string
	if *healthzPort > 0 {
		addr = ":" + strconv.Itoa(*healthzPort)
	} else {
		addr = *httpEndpoint
	}

	if *connectionTimeout != 0 {
		klog.Warning("--connection-timeout is deprecated and will have no effect")
	}

	// Unused metrics manager, necessary for connection.Connect below
	cmm := metrics.NewCSIMetricsManagerForSidecar("")

	// Once https://github.com/container-storage-interface/spec/issues/159 is
	// resolved, if plugin does not support PUBLISH_UNPUBLISH_VOLUME, then we
	// can skip adding mapping to "csi.volume.kubernetes.io/nodeid" annotation.

	klog.V(1).Infof("Attempting to open a gRPC connection with: %q", *csiAddress)
	// Create the gRPC client connection to the CSI plugin's gRPC server
	csiConn, err := connection.Connect(*csiAddress, cmm)
	if err != nil {
		klog.Errorf("error connecting to csi driver: %v", err)
		os.Exit(1)
	}

	klog.V(1).Infof("Calling csi driver to discover driver name")
	ctx, cancel := context.WithTimeout(context.Background(), *operationTimeout)
	defer cancel()

	// Fetch the CSI plugin name (must not be empty), e.g. nfs.csi.k8s.io
	csiDriverName, err := csirpc.GetDriverName(ctx, csiConn)
	if err != nil {
		klog.Errorf("error retreiving csi driver name: %v", err)
		os.Exit(1)
	}

	klog.V(2).Infof("csi driver name: %q", csiDriverName)
	cmm.SetDriverName(csiDriverName)

	// Run the kubelet plugin registration gRPC server
	// Run forever
	nodeRegister(csiDriverName, addr)
}
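The handling of --health-port and --http-endpoint in main() is easier to follow, and to test, when it is pulled into a pure function. The following is a minimal sketch under that assumption; resolveHTTPAddr and errBothSet are hypothetical names that do not exist in the registrar, and only the decision logic is taken from the code above.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// errBothSet reports that the two mutually exclusive flags were both given.
// Hypothetical name; the registrar logs an error and exits instead.
var errBothSet = errors.New("only one of --health-port and --http-endpoint can be set")

// resolveHTTPAddr reproduces the decision main() makes for the HTTP health
// endpoint: a positive port wins and becomes ":<port>", otherwise the
// endpoint string is used as is (possibly empty, meaning no HTTP server).
func resolveHTTPAddr(healthzPort int, httpEndpoint string) (string, error) {
	if healthzPort > 0 && httpEndpoint != "" {
		return "", errBothSet
	}
	if healthzPort > 0 {
		return ":" + strconv.Itoa(healthzPort), nil
	}
	return httpEndpoint, nil
}

func main() {
	cases := []struct {
		port     int
		endpoint string
	}{
		{9809, ""},               // port only
		{0, "localhost:9809"},    // endpoint only
		{0, ""},                  // neither: HTTP server is skipped
		{9809, "localhost:9809"}, // both: rejected
	}
	for _, c := range cases {
		addr, err := resolveHTTPAddr(c.port, c.endpoint)
		fmt.Printf("port=%d endpoint=%q -> addr=%q err=%v\n", c.port, c.endpoint, addr, err)
	}
}
```

The port number 9809 is only a sample value. Returning an error instead of calling os.Exit inside the helper keeps the exit decision in one place, at the top of main().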
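3.2 nodeRegister()
nodeRegister() serves the plugin's registration information to the kubelet through the gRPC service (RegistrationServer). Main logic:
(1) call newRegistrationServer() to initialize the registrationServer struct;
(2) start the gRPC service (RegistrationServer) on a socket under /var/lib/kubelet/plugins_registry (mounted at /registration inside the container), exposing the GetInfo and NotifyRegistrationStatus methods; the kubelet's plugin watcher watches this directory and discovers the socket;
(3) if --health-port or --http-endpoint was set when the container started, serve an HTTP /healthz endpoint on that address that reports whether the socketPath file exists, that is, whether the registration gRPC server has started;
(4) on SIGTERM, remove the socket file that the RegistrationServer created under /var/lib/kubelet/plugins_registry.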
// Registration service: serves the plugin's registration information to the kubelet
// registrationServer is a sample plugin to work with plugin watcher
type registrationServer struct {
	// CSI plugin name, e.g. nfs.csi.k8s.io
	driverName string
	// Socket the CSI plugin listens on. After registration the kubelet calls
	// the CSI plugin through this address. CSI uses Unix domain sockets,
	// e.g. /var/lib/kubelet/plugins/csi-nfsplugin/csi.sock
	endpoint string
	// Versions supported by the CSI plugin
	version []string
}

var _ registerapi.RegistrationServer = registrationServer{}

// Plugin registration interface defined by the kubelet. Anything that wants to
// offer plugin registration to the kubelet must implement it.
type RegistrationServer interface {
	GetInfo(context.Context, *InfoRequest) (*PluginInfo, error)
	NotifyRegistrationStatus(context.Context, *RegistrationStatus) (*RegistrationStatusResponse, error)
}

/*
Copyright 2018 The Kubernetes Authors.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package main

import (
	"fmt"
	"net"
	"net/http"
	"net/http/pprof"
	"os"
	"os/signal"
	"runtime"
	"syscall"

	"google.golang.org/grpc"

	"github.com/kubernetes-csi/node-driver-registrar/pkg/util"
	"k8s.io/klog/v2"
	registerapi "k8s.io/kubelet/pkg/apis/pluginregistration/v1"
)

func nodeRegister(csiDriverName, httpEndpoint string) {
	// When kubeletRegistrationPath is specified then driver-registrar ONLY acts
	// as gRPC server which replies to registration requests initiated by kubelet's
	// plugins watcher infrastructure. Node labeling is done by kubelet's csi code.

	// kubeletRegistrationPath == /var/lib/kubelet/plugins/csi-nfsplugin/csi.sock
	// Create the registration service object
	registrar := newRegistrationServer(csiDriverName, *kubeletRegistrationPath, supportedVersions)
	// Socket the registration gRPC server will listen on: /registration/nfs.csi.k8s.io-reg.sock
	socketPath := buildSocketPath(csiDriverName)
	// Remove any stale registration socket left on the node,
	// e.g. /registration/nfs.csi.k8s.io-reg.sock
	if err := util.CleanupSocketFile(socketPath); err != nil {
		klog.Errorf("%+v", err)
		os.Exit(1)
	}

	var oldmask int
	if runtime.GOOS == "linux" {
		// Default to only user accessible socket, caller can open up later if desired
		oldmask, _ = util.Umask(0077)
	}

	klog.Infof("Starting Registration Server at: %s\n", socketPath)
	lis, err := net.Listen("unix", socketPath)
	if err != nil {
		klog.Errorf("failed to listen on socket: %s with error: %+v", socketPath, err)
		os.Exit(1)
	}
	if runtime.GOOS == "linux" {
		util.Umask(oldmask)
	}
	klog.Infof("Registration Server started at: %s\n", socketPath)
	// Create the gRPC server
	grpcServer := grpc.NewServer()

	// Remove the health check file /var/lib/kubelet/plugins/csi-nfsplugin/registration
	// before the kubelet registers the CSI plugin through registrationServer.
	// Before registering node-driver-registrar with the kubelet ensure that the lockfile doesn't exist
	// a lockfile may exist because the container was forcefully shutdown
	util.CleanupFile(registrationProbePath)

	// Register the registration service with the gRPC server
	// Registers kubelet plugin watcher api.
	registerapi.RegisterRegistrationServer(grpcServer, registrar)

	// Serve an HTTP endpoint (on httpEndpoint) that reports whether the socketPath
	// file exists, that is, whether the registration gRPC server has started
	go httpServer(socketPath, httpEndpoint)
	// On SIGTERM, remove /registration/nfs.csi.k8s.io-reg.sock
	go removeRegSocket(csiDriverName)
	// Starts service
	if err := grpcServer.Serve(lis); err != nil {
		klog.Errorf("Registration Server stopped serving: %v", err)
		os.Exit(1)
	}

	// clean the file on graceful shutdown
	util.CleanupFile(registrationProbePath)
	// If gRPC server is gracefully shutdown, cleanup and exit
	os.Exit(0)
}

func buildSocketPath(csiDriverName string) string {
	return fmt.Sprintf("%s/%s-reg.sock", *pluginRegistrationPath, csiDriverName)
}

func httpServer(socketPath string, httpEndpoint string) {
	if httpEndpoint == "" {
		klog.Infof("Skipping HTTP server because endpoint is set to: %q", httpEndpoint)
		return
	}
	klog.Infof("Starting HTTP server at endpoint: %v\n", httpEndpoint)

	// Prepare http endpoint for healthz + profiling (if enabled)
	mux := http.NewServeMux()
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, req *http.Request) {
		socketExists, err := util.DoesSocketExist(socketPath)
		if err == nil && socketExists {
			w.WriteHeader(http.StatusOK)
			w.Write([]byte(`ok`))
			klog.V(5).Infof("health check succeeded")
		} else if err != nil {
			w.WriteHeader(http.StatusInternalServerError)
			w.Write([]byte(err.Error()))
			klog.Errorf("health check failed: %+v", err)
		} else if !socketExists {
			w.WriteHeader(http.StatusNotFound)
			w.Write([]byte("registration socket does not exist"))
			klog.Errorf("health check failed, registration socket does not exist")
		}
	})

	// Optionally expose the Go pprof profiling handlers
	if *enableProfile {
		klog.InfoS("Starting profiling", "endpoint", httpEndpoint)

		mux.HandleFunc("/debug/pprof/", pprof.Index)
		mux.HandleFunc("/debug/pprof/cmdline", pprof.Cmdline)
		mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
		mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
		mux.HandleFunc("/debug/pprof/trace", pprof.Trace)
	}

	klog.Fatal(http.ListenAndServe(httpEndpoint, mux))
}

func removeRegSocket(csiDriverName string) {
	sigc := make(chan os.Signal, 1)
	signal.Notify(sigc, syscall.SIGTERM)
	<-sigc
	socketPath := buildSocketPath(csiDriverName)
	err := os.Remove(socketPath)
	if err != nil && !os.IsNotExist(err) {
		klog.Errorf("failed to remove socket: %s with error: %+v", socketPath, err)
		os.Exit(1)
	}
	os.Exit(0)
}
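3.3 The GetInfo and NotifyRegistrationStatus methods exposed by RegistrationServer
GetInfo: the kubelet, acting as a gRPC client, calls the GetInfo method of the gRPC server (RegistrationServer) to obtain the CSI plugin type, the CSI plugin name, the socket address of the plugin's gRPC service, and the versions the plugin supports. While handling this call, the registrar also creates the registrationProbePath file inside the node-driver-registrar container; the container's liveness probe checks for that file.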
// GetInfo is the RPC invoked by plugin watcher
func (e registrationServer) GetInfo(ctx context.Context, req *registerapi.InfoRequest) (*registerapi.PluginInfo, error) {
	klog.Infof("Received GetInfo call: %+v", req)

	// Create the registration probe file, e.g. /var/lib/kubelet/plugins/csi-nfsplugin/registration
	// on successful registration, create the registration probe file
	err := util.TouchFile(registrationProbePath)
	if err != nil {
		klog.ErrorS(err, "Failed to create registration probe file", "registrationProbePath", registrationProbePath)
	} else {
		klog.InfoS("Kubelet registration probe created", "path", registrationProbePath)
	}

	return &registerapi.PluginInfo{
		Type:              registerapi.CSIPlugin,
		Name:              e.driverName,
		Endpoint:          e.endpoint,
		SupportedVersions: e.version,
	}, nil
}
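NotifyRegistrationStatus: the kubelet calls the NotifyRegistrationStatus method of the gRPC server (RegistrationServer) to tell node-driver-registrar whether registering the CSI plugin succeeded. If registration failed, the registrar exits so that the container is restarted and registration is attempted again.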
// Exit the program if plugin registration failed
func (e registrationServer) NotifyRegistrationStatus(ctx context.Context, status *registerapi.RegistrationStatus) (*registerapi.RegistrationStatusResponse, error) {
	klog.Infof("Received NotifyRegistrationStatus call: %+v", status)
	if !status.PluginRegistered {
		klog.Errorf("Registration process failed with error: %+v, restarting registration container.", status.Error)
		os.Exit(1)
	}

	return &registerapi.RegistrationStatusResponse{}, nil
}
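The exit-on-failure behavior is simple, but calling os.Exit directly makes it awkward to test. One possible way to exercise the same decision is to inject the exit function, as in the sketch below. This is an assumption-laden illustration, not how the registrar is written: registrationStatus is a stand-in for registerapi.RegistrationStatus, and statusHandler and notify are hypothetical names.

```go
package main

import (
	"fmt"
	"os"
)

// registrationStatus is a stand-in for registerapi.RegistrationStatus; only
// the two fields read by the handler above are modelled.
type registrationStatus struct {
	PluginRegistered bool
	Error            string
}

// statusHandler carries the exit function so that tests can replace os.Exit.
type statusHandler struct {
	exit func(code int)
}

// notify imitates NotifyRegistrationStatus: a failed registration terminates
// the process with exit code 1, a successful one is a no-op.
func (h statusHandler) notify(s registrationStatus) {
	if !s.PluginRegistered {
		fmt.Fprintf(os.Stderr, "registration failed: %s, exiting so the container restarts\n", s.Error)
		h.exit(1)
	}
}

func main() {
	h := statusHandler{exit: os.Exit}
	h.notify(registrationStatus{PluginRegistered: true})
	fmt.Println("registration accepted, registrar keeps serving")
}
```

In a test, exit can be replaced by a closure that records the code, which allows both branches to be checked without terminating the test process.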
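4. Summary
The main job of node-driver-registrar is to obtain the CSI plugin's information and serve the plugin's registration information to the kubelet through a gRPC service (RegistrationServer). To register a CSI plugin, the kubelet acts as a gRPC client and calls the RegistrationServer's GetInfo method, which returns the CSI plugin type, the CSI plugin name, the socket address of the plugin's gRPC service, and the versions the plugin supports. With that information, the kubelet can then act as a gRPC client of the CSI plugin itself and call the methods the plugin serves whenever it performs volume mount/unmount operations.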