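For a K8s cluster, monitoring covers three areas: Node, Namespace, and Pod.
I. Node Monitoring
At the node level, the main things to watch are memory usage, CPU usage, disk usage, and inode usage, with an alert when any of them runs too high. NodeNotReady conditions are monitored as well.
1. NodeMemorySpaceFillingUp
Monitors node memory usage and alerts when it exceeds 80%.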
    alert: NodeMemorySpaceFillingUp
    expr: ((1 - (node_memory_MemAvailable_bytes{job="node-exporter"} / node_memory_MemTotal_bytes{job="node-exporter"}) * on(instance) group_left(nodename) (node_uname_info) > 0.8) * 100)
    for: 5m
    labels:
      cluster: critical
      type: node
    annotations:
      description: Memory usage on `{{$labels.nodename}}`({{ $labels.instance }}) up to {{ printf "%.2f" $value }}%.
      summary: Node memory will be exhausted.
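The rules in this article are shown as bare rule bodies. To load one from a file, it has to sit inside a rule group. Here is a minimal sketch of that wrapping, assuming a file called `node-alerts.yaml` and a group called `node-alerts`; both names are made up for illustration and are not part of the original setup.

```yaml
# node-alerts.yaml (hypothetical file name)
groups:
  - name: node-alerts          # hypothetical group name
    rules:
      - alert: NodeMemorySpaceFillingUp
        # Same expression as above, split over several lines for readability.
        expr: |
          (
            1 - (node_memory_MemAvailable_bytes{job="node-exporter"} / node_memory_MemTotal_bytes{job="node-exporter"})
              * on(instance) group_left(nodename) (node_uname_info)
            > 0.8
          ) * 100
        for: 5m
        labels:
          cluster: critical
          type: node
        annotations:
          description: 'Memory usage on `{{$labels.nodename}}`({{ $labels.instance }}) up to {{ printf "%.2f" $value }}%.'
          summary: Node memory will be exhausted.
```

A file like this can be checked with `promtool check rules node-alerts.yaml` before Prometheus loads it.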
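2. NodeCpuUtilisationHigh
Monitors node CPU usage and alerts when it exceeds 80%. Note that the rule published under this heading is a filesystem-space rule with a 20% threshold, not a CPU rule.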
    alert: NodeFilesystemAlmostOutOfSpace
    expr: ((node_filesystem_avail_bytes{fstype!="",job="node-exporter"} / node_filesystem_size_bytes{fstype!="",job="node-exporter"} * 100 < 20 and node_filesystem_readonly{fstype!="",job="node-exporter"} == 0) * on(instance) group_left(nodename) (node_uname_info))
    for: 5m
    labels:
      cluster: critical
      type: node
    annotations:
      description: Filesystem on `{{ $labels.device }}` at `{{$labels.nodename}}`({{ $labels.instance }}) has only {{ printf "%.2f" $value }}% available space left.
      summary: Node filesystem has less than 20% space left.
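A CPU rule in the same style as the other node rules might look like the sketch below. It is not the author's rule. It assumes the node-exporter metric `node_cpu_seconds_total`, a 5-minute `rate` window, and annotation wording invented here to match the neighbouring rules.

```yaml
alert: NodeCpuUtilisationHigh
# Busy fraction = 1 - average idle fraction across all cores of an instance.
# The join with node_uname_info only attaches the nodename label.
expr: |
  (
    (1 - avg by(instance) (rate(node_cpu_seconds_total{job="node-exporter",mode="idle"}[5m])))
      * on(instance) group_left(nodename) (node_uname_info)
    > 0.8
  ) * 100
for: 5m
labels:
  cluster: critical
  type: node
annotations:
  # Annotation text is illustrative, not taken from the original rule set.
  description: 'CPU utilisation on `{{ $labels.nodename }}`({{ $labels.instance }}) up to {{ printf "%.2f" $value }}%.'
  summary: Node CPU utilisation high.
```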
3. NodeFilesystemAlmostOutOfSpace
Monitors node disk usage and alerts when free space drops below 10%.
    alert: NodeFilesystemAlmostOutOfSpace
    expr: ((node_filesystem_avail_bytes{fstype!="",job="node-exporter"} / node_filesystem_size_bytes{fstype!="",job="node-exporter"} * 100 < 10 and node_filesystem_readonly{fstype!="",job="node-exporter"} == 0) * on(instance) group_left(nodename) (node_uname_info))
    for: 5m
    labels:
      cluster: critical
      type: node
    annotations:
      description: Filesystem on `{{ $labels.device }}` at `{{$labels.nodename}}`({{ $labels.instance }}) has only {{ printf "%.2f" $value }}% available space left.
      summary: Node filesystem has less than 10% space left.
4. NodeFilesystemAlmostOutOfFiles
Monitors node inode usage and alerts when free inodes drop below 10%.
    alert: NodeFilesystemAlmostOutOfFiles
    expr: ((node_filesystem_files_free{fstype!="",job="node-exporter"} / node_filesystem_files{fstype!="",job="node-exporter"} * 100 < 10 and node_filesystem_readonly{fstype!="",job="node-exporter"} == 0) * on(instance) group_left(nodename) (node_uname_info))
    for: 5m
    labels:
      cluster: critical
      type: node
    annotations:
      description: Filesystem on `{{ $labels.device }}` at `{{$labels.nodename}}`({{ $labels.instance }}) has only {{ printf "%.2f" $value }}% available inodes left.
      summary: Node filesystem has less than 10% inodes left.
5. KubeNodeNotReady
Monitors node status and alerts when any node is NotReady.
    alert: KubeNodeNotReady
    expr: (kube_node_status_condition{condition="Ready",job="kube-state-metrics",status="true"} == 0)
    for: 5m
    labels:
      cluster: critical
      type: node
    annotations:
      description: '{{ $labels.node }} has been unready for more than 5 minutes.'
      summary: Node is not ready.
6. KubeNodePodsTooMuch
Monitors the number of Pods on each node. We cap each node at 110 Pods, and the alert fires when usage exceeds 80% of that cap.
    alert: KubeNodePodsTooMuch
    expr: (sum by(node) (kube_pod_info) * 100 / 110 > 80)
    for: 5m
    labels:
      cluster: critical
      type: node
    annotations:
      description: Pods usage on `{{$labels.node}}` up to {{ printf "%.2f" $value }}%.
      summary: Node pods too much.
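The rule above hard-codes the 110-Pod cap. If nodes might be configured with different caps, one option is to read each node's Pod capacity from kube-state-metrics instead. The sketch below assumes kube-state-metrics v2, where capacity is exposed as `kube_node_status_capacity{resource="pods"}` (v1 used `kube_node_status_capacity_pods`). It is a variant for illustration, not the rule we run.

```yaml
alert: KubeNodePodsTooMuch
# Pods on a node divided by that node's own Pod capacity, as a percentage.
# Pods with an empty node label (not yet scheduled) find no match and drop out.
expr: |
  sum by(node) (kube_pod_info)
    / on(node)
  sum by(node) (kube_node_status_capacity{resource="pods"})
  * 100 > 80
for: 5m
labels:
  cluster: critical
  type: node
annotations:
  description: 'Pods usage on `{{$labels.node}}` up to {{ printf "%.2f" $value }}%.'
  summary: Node pods too much.
```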
II. Namespace Monitoring
For CPU and memory, a Namespace has three values: limit, request, and usage.
My understanding is:
- limit: the maximum amount of resources that can be used (summed at Pod level)
  cpu: namespace_cpu:kube_pod_container_resource_limits:sum
  memory: namespace_memory:kube_pod_container_resource_limits:sum
- request: the amount of resources requested (summed at Pod level)
  cpu: namespace_cpu:kube_pod_container_resource_requests:sum
  memory: namespace_memory:kube_pod_container_resource_requests:sum
- usage: the amount of resources actually in use (summed at Pod level)
  cpu: sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate) by (namespace)
  memory: sum(node_namespace_pod_container:container_memory_working_set_bytes) by (namespace)
In Kuboard we can set memory and CPU limits for a namespace. I don't yet know how to read these values from Prometheus and will keep looking into it.
When request/limit > 80%, the Namespace may be running short of resources, and its allocation should be increased.
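One possible way to get at the namespace-level limit is sketched below. It rests on an assumption that has not been verified against our setup: that the limits set in Kuboard are stored as a Kubernetes ResourceQuota object. If so, kube-state-metrics exposes them as `kube_resourcequota`, with `type="hard"` for the quota and `type="used"` for what is currently counted against it. The alert name and annotation text are made up for illustration.

```yaml
alert: NamespaceQuotaAlmostFull      # hypothetical alert name
# Used share of every quota entry (requests.cpu, limits.memory, ...) in percent.
# "> 0" on the hard value skips quota entries set to zero, avoiding division by zero.
expr: |
  kube_resourcequota{job="kube-state-metrics",type="used"}
    / ignoring(type)
  (kube_resourcequota{job="kube-state-metrics",type="hard"} > 0)
  * 100 > 80
for: 5m
labels:
  cluster: critical
  type: namespace
annotations:
  description: 'Quota `{{ $labels.resource }}` on `{{ $labels.namespace }}` is {{ printf "%.2f" $value }}% used.'
  summary: Namespace quota almost full.
```

If the assumption holds, the series with `resource="requests.cpu"` and `resource="requests.memory"` correspond to the request/limit ratio described above.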
1. NamespaceCpuUtilisationHigh
Monitors namespace CPU utilisation and alerts when it is above 90%.
    alert: NamespaceCpuUtilisationHigh
    expr: (sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate) by (namespace) / sum(namespace_cpu:kube_pod_container_resource_limits:sum) by (namespace) * 100 > 90)
    for: 5m
    labels:
      cluster: critical
      type: namespace
    annotations:
      description: CPU utilisation on `{{$labels.namespace}}` up to {{ printf "%.2f" $value }}%.
      summary: Namespace CPU utilisation high.
2. NamespaceCpuUtilisationLow
Monitors namespace CPU utilisation and alerts when it is below 10%.
    alert: NamespaceCpuUtilisationLow
    expr: (sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate) by (namespace) / sum(namespace_cpu:kube_pod_container_resource_limits:sum) by (namespace) * 100 < 10)
    for: 5m
    labels:
      cluster: critical
      type: namespace
    annotations:
      description: CPU utilisation on `{{$labels.namespace}}` as low as {{ printf "%.2f" $value }}%.
      summary: Namespace CPU underutilization.
3. NamespaceMemorySpaceFillingUp
Monitors namespace memory usage and alerts when it is above 90%.
    alert: NamespaceMemorySpaceFillingUp
    expr: (sum(node_namespace_pod_container:container_memory_working_set_bytes) by (namespace) / sum(namespace_memory:kube_pod_container_resource_limits:sum) by (namespace) * 100 > 90)
    for: 5m
    labels:
      cluster: critical
      type: namespace
    annotations:
      description: Memory usage on `{{$labels.namespace}}` up to {{ printf "%.2f" $value }}%.
      summary: Namespace memory will be exhausted.
4. NamespaceMemorySpaceLow
Monitors namespace memory usage and alerts when it is below 10%.
    alert: NamespaceMemorySpaceLow
    expr: (sum(node_namespace_pod_container:container_memory_working_set_bytes) by (namespace) / sum(namespace_memory:kube_pod_container_resource_limits:sum) by (namespace) * 100 < 10)
    for: 5m
    labels:
      cluster: critical
      type: namespace
    annotations:
      description: Memory usage on `{{$labels.namespace}}` as low as {{ printf "%.2f" $value }}%.
      summary: Under-utilized namespace memory.
5. KubePodNotReady
Monitors Pod status and alerts when a Pod has stayed in a not-ready state for 15 minutes.
    alert: KubePodNotReady
    expr: (sum by(namespace, pod) (max by(namespace, pod) (kube_pod_status_phase{job="kube-state-metrics",namespace=~".*",phase=~"Pending|Unknown"}) * on(namespace, pod) group_left(owner_kind) topk by(namespace, pod) (1, max by(namespace, pod, owner_kind) (kube_pod_owner{owner_kind!="Job"}))) > 0)
    for: 15m
    labels:
      cluster: critical
      type: namespace
    annotations:
      description: Pod {{ $labels.namespace }}/{{ $labels.pod }} has been in a non-ready state for longer than 15 minutes.
      summary: Pod has been in a non-ready state for more than 15 minutes.
6. KubeContainerWaiting
Monitors Pod status and alerts when a container in a Pod has stayed in the waiting state for 15 minutes.
    alert: KubeContainerWaiting
    expr: (sum by(namespace, pod, container) (kube_pod_container_status_waiting_reason{job="kube-state-metrics",namespace=~".*"}) > 0)
    for: 15m
    labels:
      cluster: critical
      type: namespace
    annotations:
      description: Pod {{ $labels.namespace }}/{{ $labels.pod }} container {{$labels.container}} has been in waiting state for longer than 15 minutes.
      summary: Pod container waiting longer than 15 minutes.
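7. PodRestart
This alert fires when a Pod in the kube-system namespace restarts.
The kube-system namespace holds many cluster-level Pods, such as our log collector Fluentd and CoreDNS, so a container restart there is a signal to check whether something is wrong with the cluster.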
    alert: PodRestart
    expr: (floor(increase(kube_pod_container_status_restarts_total{namespace="kube-system"}[1m])) > 0)
    for: 5m
    labels:
      cluster: critical
      type: namespace
    annotations:
      description: Pod {{ $labels.namespace }}/{{ $labels.pod }} restarted {{ $value }} times in the last 1 minute.
      summary: Pod restarted in the last 1 minute.
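One thing to watch with this rule: with a 1-minute `increase` window and `for: 5m`, the expression has to stay above zero for five consecutive minutes, so a single restart is unlikely to fire the alert. A sketch of one alternative follows, assuming a 10-minute look-back window and immediate firing are acceptable. Both values are illustrative, not our original settings.

```yaml
alert: PodRestart
# Any increase of the restart counter within the last 10 minutes.
# increase() extrapolates, so the value may not be a whole number.
expr: increase(kube_pod_container_status_restarts_total{namespace="kube-system"}[10m]) > 0
# Fire on the first evaluation that sees a restart instead of waiting.
for: 0m
labels:
  cluster: critical
  type: namespace
annotations:
  description: 'Pod {{ $labels.namespace }}/{{ $labels.pod }} restarted about {{ printf "%.0f" $value }} times in the last 10 minutes.'
  summary: Pod restarted in the last 10 minutes.
```

The trade-off is that the alert stays active for roughly the length of the window after a restart, so a longer window means a longer-lived alert.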
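8. PrometheusOom
Prometheus itself can also go down, so we added a rule that watches Prometheus. When its memory usage reaches 90% of its limit, something may be about to go wrong, so an alert is sent.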
    alert: PrometheusOom
    expr: (container_memory_working_set_bytes{container="prometheus"} / container_spec_memory_limit_bytes{container="prometheus"} * 100 > 90)
    for: 5m
    labels:
      cluster: critical
      type: namespace
    annotations:
      description: Memory usage on `Prometheus` up to {{ printf "%.2f" $value }}%.
      summary: Prometheus will be oom.
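This rule divides by `container_spec_memory_limit_bytes`. As far as I know, cAdvisor reports that metric as 0 for a container with no memory limit set, which would make the ratio infinite and keep the alert firing. If the Prometheus container might run without a limit, a guarded variant could look like the sketch below; treat the zero-limit behaviour as an assumption to verify in your own cluster.

```yaml
alert: PrometheusOom
# "> 0" drops series whose limit is reported as 0 (no limit set),
# so those containers never enter the division.
expr: |
  container_memory_working_set_bytes{container="prometheus"}
    / (container_spec_memory_limit_bytes{container="prometheus"} > 0)
  * 100 > 90
for: 5m
labels:
  cluster: critical
  type: namespace
annotations:
  description: 'Memory usage on `Prometheus` up to {{ printf "%.2f" $value }}%.'
  summary: Prometheus will be oom.
```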