k8s HPA configuration error: failed to get cpu utilization: missing request for cpu

Notes on using K8s HPA

 

Key points

# https://github.com/kubernetes/kubernetes/issues/79365
# The discussion in this issue gives the answer to this failure.

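1. Key point:
  If a pod defines two or more containers, resources must be specified for every container. If only one container specifies them, the settings apply to that container alone, and the others run with no requests or limits at all!
  For HPA to work, every container in the pod must specify resources (in particular requests for the metric being scaled on); otherwise it reports errors such as: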
    failed to get cpu utilization: missing request for cpu
    invalid metrics (2 invalid out of 2), first error is: failed to get memory utilization: missing request for memory
    failed to get memory utilization: missing request for memory
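
To make point 1 concrete, here is a minimal sketch of a Deployment whose pod has two containers, each with its own resources.requests. The names (demo-two-containers, app, sidecar), the images and the request values are made up for illustration and are not from the original setup.

```shell
# Hypothetical manifest: every name, image and value here is an assumption.
cat > two-container-deploy.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-two-containers
spec:
  replicas: 1
  selector:
    matchLabels:
      app: demo-two-containers
  template:
    metadata:
      labels:
        app: demo-two-containers
    spec:
      containers:
      - name: app
        image: nginx
        resources:
          requests:        # needed by HPA for this container
            cpu: 100m
            memory: 128Mi
      - name: sidecar
        image: busybox
        command: ["sleep", "3600"]
        resources:
          requests:        # the sidecar needs its own entry as well
            cpu: 50m
            memory: 64Mi
EOF

# Count the containers and the requests blocks; the two numbers should match.
containers=$(grep -c '^      - name:' two-container-deploy.yaml)
requests=$(grep -c 'requests:' two-container-deploy.yaml)
echo "containers=$containers requests=$requests"
```

If the two counts differ, some container is missing its requests and HPA would likely report the errors listed above. The file can then be applied with kubectl apply -f two-container-deploy.yaml.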

 

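2. A second factor that can cause HPA errors: the aggregator

  The aggregator is the K8s mechanism that lets a custom API be registered with the K8s APIServer, so that requests for that API pass through the aggregator to the developer's own Service and are forwarded to their own program.

   If the aggregator is not enabled, metrics-server is affected; I have not fully worked out the details yet (metrics-server serves the metrics.k8s.io API as an aggregated API of this kind):

  https://blog.csdn.net/fly910905/article/details/105375822/
   This article explains the roles of the various certificates in k8s and how to enable the aggregator.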

   # Steps to enable it

   1. First generate a set of certificates for the aggregator

# sudo apt install golang-cfssl

# aggregator-ca-config 
cat > aggregator-ca-config.json <<EOF
{
  "signing": {
    "default": {
       "expiry": "438000h"
    },
    "profiles": {
      "aggregator": {
        "usages": [
           "signing",
           "key encipherment",
           "server auth",
           "client auth"
        ],
       "expiry": "438000h"
      }
    }
  }
}
EOF

# aggregator-ca-csr
# Note: the closing EOF must start at column 0, otherwise the heredoc never terminates.
cat > aggregator-ca-csr.json <<EOF
{
  "CN": "aggregator",
  "hosts": [],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "CP",
      "O": "k8s",
      "OU": "System"
    }
  ],
  "ca": {
    "expiry": "87600h"
  }
}
EOF

# Generate the self-signed CA certificate
cfssl gencert -initca aggregator-ca-csr.json | cfssljson -bare aggregator-ca
 

# Create the certificate signing request config for the aggregator
# (the IPs and the hostname paas-106 in "hosts" belong to this environment; replace them with your own)
cat > aggregator-csr.json <<EOF
{
  "CN": "aggregator",
  "hosts": [
    "127.0.0.1",
    "10.120.0.1",
    "192.168.99.106",
    "kubernetes",
    "kubernetes.default",
    "kubernetes.default.svc",
    "kubernetes.default.svc.cluster.local",
    "paas-106"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "CP",
      "O": "k8s",
      "OU": "System"
    }
  ]
}
EOF

# Generate the aggregator certificate and key
cfssl gencert -ca=aggregator-ca.pem -ca-key=aggregator-ca-key.pem -config=aggregator-ca-config.json -profile=aggregator aggregator-csr.json | cfssljson -bare aggregator
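
Before wiring the certificate into kube-apiserver, it can help to sanity-check the generated files. The following is only a sketch: it assumes openssl is installed and that it runs in the directory holding the .pem files; the function name check_aggregator_cert is made up for this example.

```shell
# Sketch: inspect the client certificate and confirm it chains to the aggregator CA.
# Usage: check_aggregator_cert [CA_FILE] [CERT_FILE]
check_aggregator_cert() {
  ca="${1:-aggregator-ca.pem}"
  cert="${2:-aggregator.pem}"
  if [ ! -f "$ca" ] || [ ! -f "$cert" ]; then
    echo "skip: $ca or $cert not found"
    return 1
  fi
  # Print the subject (its CN should be "aggregator") and the validity period.
  openssl x509 -in "$cert" -noout -subject -dates
  # Verify that the certificate was signed by the CA generated earlier.
  openssl verify -CAfile "$ca" "$cert"
}

# Run against the default file names; do not abort the shell if they are missing.
check_aggregator_cert || true
```

The CN matters because the --requestheader-allowed-names value used for kube-apiserver has to match it.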

 

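  2. Modify kube-apiserver.yaml

  # The kubeadm deployment here had no aggregator configuration, so it has to be added
  # (check your manifest first; kubeadm may already set equivalent flags with its front-proxy certificates).
  # The paths below assume the generated aggregator*.pem files have been copied to /etc/kubernetes/pki/.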

   $ sudo cp /etc/kubernetes/manifests/kube-apiserver.yaml .
   $ sudo vim kube-apiserver.yaml
       ....
       - --enable-aggregator-routing=true    # added because on a multi-node cluster kube-apiserver and kube-proxy are not always on the same node
       - --requestheader-client-ca-file=/etc/kubernetes/pki/aggregator-ca.pem
       - --requestheader-allowed-names=aggregator
       - --proxy-client-cert-file=/etc/kubernetes/pki/aggregator.pem
       - --proxy-client-key-file=/etc/kubernetes/pki/aggregator-key.pem
       ....


   # Finally, copy the modified file back; whenever a file in the manifests directory changes, the corresponding pod is restarted automatically
   sudo  cp kube-apiserver.yaml /etc/kubernetes/manifests/kube-apiserver.yaml

   # After a short wait, once kube-apiserver, kube-controller-manager and kube-scheduler have all finished restarting, it is ready to use.
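
One way to confirm the result after the restart is to check that the metrics API is registered and answering. This is a sketch under assumptions: it presumes metrics-server is already installed and registered under its usual APIService name v1beta1.metrics.k8s.io, and the function name check_metrics_api is made up here.

```shell
# Sketch: check that the aggregated metrics API is reachable through kube-apiserver.
check_metrics_api() {
  if ! command -v kubectl >/dev/null 2>&1; then
    echo "skip: kubectl not found"
    return 1
  fi
  # The AVAILABLE column should read True once the aggregator can reach metrics-server.
  kubectl get apiservice v1beta1.metrics.k8s.io || return 1
  # If this prints node usage, HPA should be able to read resource metrics as well.
  kubectl top nodes
}

# Run the check; do not abort the shell if the cluster is unreachable.
check_metrics_api || true
```

If the APIService stays unavailable, kubectl describe apiservice v1beta1.metrics.k8s.io usually shows the reason.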

 

 

3. A third point that is easy to overlook

# When editing the HPA config, pay attention to the note below
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: render-hpa-sharegpu
  namespace: render-hpa
spec:
  maxReplicas: 3
  minReplicas: 1
  metrics:
  - resource:
      name: memory
      target:
        averageUtilization: 40
        type: Utilization
    type: Resource
  - resource:
      name: cpu
      target:
        averageUtilization: 30
        type: Utilization
    type: Resource
  scaleTargetRef:
    apiVersion: apps/v1   # this must match the API group of the controller being scaled
    kind: Deployment
    name: render-hpa-sharegpu
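
For intuition about the targets above: with type: Utilization, HPA expresses usage as a percentage of the containers' requests, which is why a missing request makes the metric invalid. The sketch below uses the replica formula from the Kubernetes HPA documentation, desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization), in integer shell arithmetic. The function names and sample numbers are made up, and the real controller also applies a tolerance and stabilization that are ignored here.

```shell
# utilization_pct USAGE_MILLICORES REQUEST_MILLICORES
# Usage as a percentage of the request (integer division).
utilization_pct() {
  echo $(( $1 * 100 / $2 ))
}

# desired_replicas CURRENT_REPLICAS CURRENT_UTIL_PCT TARGET_UTIL_PCT [MIN] [MAX]
desired_replicas() {
  current=$1; util=$2; target=$3
  min=${4:-1}; max=${5:-3}   # defaults mirror minReplicas/maxReplicas in the HPA above
  # Integer ceiling of current * util / target.
  want=$(( (current * util + target - 1) / target ))
  # Clamp the result to the [min, max] range.
  if [ "$want" -lt "$min" ]; then want=$min; fi
  if [ "$want" -gt "$max" ]; then want=$max; fi
  echo "$want"
}

utilization_pct 60 100      # 60m used against a 100m request
desired_replicas 1 60 30    # one replica at 60% with a 30% cpu target
desired_replicas 3 90 30    # the raw result would exceed maxReplicas
```

Note that utilization_pct would divide by zero for a request of 0, which loosely mirrors the situation where HPA has no request to compare against.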

 

posted @ 2022-06-07 21:46  张朝锋