Kubernetes flannel pod reports Init:Error after running for a while

Symptoms

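After a flannel network deployed with kubeadm had been running for a while, the flannel pod reported an Init:Error status. The pod details and the Docker version on the node are shown below: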

[root@node1 ~]# kubectl describe pod kube-flannel-ds-amd64-cglhm -n kube-system
Name:               kube-flannel-ds-amd64-cglhm
Namespace:          kube-system
Priority:           0
PriorityClassName:  <none>
Node:               node1/192.168.1.205
Start Time:         Wed, 09 Jan 2019 22:34:28 -0500
Labels:             app=flannel
                    controller-revision-hash=6bbd4cd779
                    pod-template-generation=1
                    tier=node
Annotations:        <none>
Status:             Running
IP:                 192.168.1.205
Controlled By:      DaemonSet/kube-flannel-ds-amd64
Init Containers:
  install-cni:
    Container ID:  
    Image:         quay.io/coreos/flannel:v0.10.0-amd64
    Image ID:      
    Port:          <none>
    Host Port:     <none>
    Command:
      cp
    Args:
      -f
      /etc/kube-flannel/cni-conf.json
      /etc/cni/net.d/10-flannel.conflist
    State:          Waiting
      Reason:       RunContainerError
    Last State:     Terminated
      Reason:       ContainerCannotRun
      Message:      OCI runtime create failed: docker-runc did not terminate sucessfully: unknown
      Exit Code:    128
      Started:      Thu, 10 Jan 2019 15:47:59 -0500
      Finished:     Thu, 10 Jan 2019 15:47:59 -0500
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /etc/cni/net.d from cni (rw)
      /etc/kube-flannel/ from flannel-cfg (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from flannel-token-4px5t (ro)
Containers:
  kube-flannel:
    Container ID:  docker://d80792918c91bddb163dccecc563233140dc184db56154aa162898ee0507d98b
    Image:         quay.io/coreos/flannel:v0.10.0-amd64
    Image ID:      docker://sha256:f0fad859c909baef1b038ef8d2f6e76fc252e25a3d9af37b82ce70623fb7cd6f
    Port:          <none>
    Host Port:     <none>
    Command:
      /opt/bin/flanneld
    Args:
      --ip-masq
      --kube-subnet-mgr
    State:          Waiting
      Reason:       RunContainerError
    Last State:     Terminated
      Reason:       ContainerCannotRun
      Message:      OCI runtime create failed: container_linux.go:348: starting container process caused "process_linux.go:402: container init caused \"rootfs_linux.go:70: creating device nodes caused \\\"cannot allocate memory\\\"\"": unknown
      Exit Code:    128
      Started:      Thu, 10 Jan 2019 15:47:53 -0500
      Finished:     Thu, 10 Jan 2019 15:47:53 -0500
    Ready:          False
    Restart Count:  38
    Limits:
      cpu:     100m
      memory:  50Mi
    Requests:
      cpu:     100m
      memory:  50Mi
    Environment:
      POD_NAME:       kube-flannel-ds-amd64-cglhm (v1:metadata.name)
      POD_NAMESPACE:  kube-system (v1:metadata.namespace)
    Mounts:
      /etc/kube-flannel/ from flannel-cfg (rw)
      /run from run (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from flannel-token-4px5t (ro)
Conditions:
  Type              Status
  Initialized       False 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  run:
    Type:          HostPath (bare host directory volume)
    Path:          /run
    HostPathType:  
  cni:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/cni/net.d
    HostPathType:  
  flannel-cfg:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      kube-flannel-cfg
    Optional:  false
  flannel-token-4px5t:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  flannel-token-4px5t
    Optional:    false
QoS Class:       Guaranteed
Node-Selectors:  beta.kubernetes.io/arch=amd64
Tolerations:     :NoSchedule
                 node.kubernetes.io/disk-pressure:NoSchedule
                 node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/network-unavailable:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute
                 node.kubernetes.io/unreachable:NoExecute
                 node.kubernetes.io/unschedulable:NoSchedule
Events:
  Type     Reason                  Age                      From            Message
  ----     ------                  ----                     ----            -------
  Warning  FailedCreatePodSandBox  34m (x10524 over 4h23m)  kubelet, node1  Failed create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox container for pod "kube-flannel-ds-amd64-cglhm": Error response from daemon: OCI runtime create failed: container_linux.go:348: starting container process caused "process_linux.go:301: running exec setns process for init caused \"signal: broken pipe\"": unknown
  Normal   SandboxChanged          4m58s (x12379 over 15h)  kubelet, node1  Pod sandbox changed, it will be killed and re-created.
[root@node1 ~]# docker version
Client:
 Version:           18.06.1-ce
 API version:       1.38
 Go version:        go1.10.3
 Git commit:        e68fc7a
 Built:             Tue Aug 21 17:23:03 2018
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          18.06.1-ce
  API version:      1.38 (minimum version 1.12)
  Go version:       go1.10.3
  Git commit:       e68fc7a
  Built:            Tue Aug 21 17:25:29 2018
  OS/Arch:          linux/amd64
  Experimental:     false

Solution

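The default kube-flannel manifest sets the pod's memory request and limit to 50Mi. Under heavy network load this is not enough, so the pod exits and reports Error.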

[root@node1 home]# cat kube-flannel.yml |grep memory
            memory: "50Mi"
            memory: "50Mi"
            memory: "50Mi"
            memory: "50Mi"
            memory: "50Mi"
            memory: "50Mi"
            memory: "50Mi"
            memory: "50Mi"
            memory: "50Mi"
            memory: "50Mi"

Change the memory values in kube-flannel.yml to 100Mi or more:

[root@node1 ~]# cat kube-flannel.yml |grep memory 
            memory: "100Mi"
            memory: "100Mi"
            memory: "100Mi"
            memory: "100Mi"
            memory: "100Mi"
            memory: "100Mi"
            memory: "100Mi"
            memory: "100Mi"
            memory: "100Mi"
            memory: "100Mi"
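
The edit above can also be scripted instead of done by hand. The following is only a minimal sketch under stated assumptions: the MANIFEST variable and the stand-in file it points to are hypothetical (generated here so the sketch runs without a cluster), and the resources stanza is modelled on the limits shown in the pod description. On a real node you would point MANIFEST at your own kube-flannel.yml and then re-apply it with the commented kubectl commands.

```shell
#!/bin/sh
# Sketch: raise every 50Mi memory value in a flannel manifest to 100Mi.
# MANIFEST is a hypothetical stand-in file, not the real kube-flannel.yml.
set -eu

MANIFEST="$(mktemp)"
cat > "$MANIFEST" <<'EOF'
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
EOF

# Rewrite only the memory lines; the original is kept as "$MANIFEST.bak".
sed -i.bak 's/memory: "50Mi"/memory: "100Mi"/' "$MANIFEST"

# Confirm the new values, as in the grep transcript above.
grep memory "$MANIFEST"

# Against a real cluster, the edited manifest would then be re-applied
# and the flannel pods checked (label app=flannel as in the pod description):
#   kubectl apply -f kube-flannel.yml
#   kubectl get pods -n kube-system -l app=flannel
```

Keeping the .bak copy makes it easy to diff or roll back the manifest if the new values turn out not to help.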
posted @ 2019-01-18 10:01  yuhaohao