Collecting Kubernetes Logs with EFK

1. Environment

  • Kubernetes: 1.19.2

  • Elasticsearch: external Elasticsearch, version 7.8

  • Kibana: external Kibana, version 7.8

  • filebeat: elastic/filebeat:7.8.0

  • logstash: elastic/logstash:7.8.0

  • redis: redis:5.0.10-alpine

 

2. Log directories

Kubernetes uses three log directories on each node:

  • /var/lib/docker/containers
  • /var/log/containers
  • /var/log/pods

  How they relate to each other:

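  • When Docker is the container runtime, Docker itself writes the container logs to disk, under /var/lib/docker/containers/$CONTAINERID.
  • The kubelet then creates symlinks under /var/log/pods and /var/log/containers that point to the log files in /var/lib/docker/containers/$CONTAINERID.
  • When containerd is the container runtime, the log files are written to the path chosen by the kubelet, /var/log/pods/<namespace>_<pod-name>_<pod-uid>/<container-name>/, and symlinks to those files are created in /var/log/containers.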

  Log file names in /var/log/containers follow the pattern podName_namespace_containerName-containerID.log
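
  Because the file name encodes the pod, namespace, and container, it can be split back into its parts. The sketch below is only an illustration: the function name parse_container_log_name is hypothetical, the container ID is assumed to be a 64-character hex string, and either "-" or "_" is accepted in front of it.

```python
import re
from typing import NamedTuple


class ContainerLog(NamedTuple):
    pod: str
    namespace: str
    container: str
    container_id: str


# Pod, namespace and container names cannot contain "_", so "_" is a safe
# separator. Assumption: the container ID is 64 lowercase hex characters.
_LOG_NAME = re.compile(
    r"^(?P<pod>[^_]+)_(?P<namespace>[^_]+)_(?P<container>[^_]+)"
    r"[-_](?P<container_id>[0-9a-f]{64})\.log$"
)


def parse_container_log_name(filename: str) -> ContainerLog:
    """Split a /var/log/containers file name into its components."""
    match = _LOG_NAME.match(filename)
    if match is None:
        raise ValueError(f"unexpected log file name: {filename!r}")
    return ContainerLog(**match.groupdict())


# Hypothetical file name, built only for this example.
example = "pb-counter-7c9d8b5f6-x2x9k_logging_my-counter-test-" + "ab12" * 16 + ".log"
print(parse_container_log_name(example))
```

  This is also why a glob such as /var/log/containers/pb-*.log selects logs by pod name prefix: the pod name is the first component of the file name.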

 

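3. Create an external Service

  This Service lets workloads inside the cluster reach the Elasticsearch instance that runs outside Kubernetes.

  First create a namespace named public-service: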

kubectl create ns public-service

  es-svc-external.yaml

apiVersion: v1
kind: Service
metadata:
  name: elaticsearch-svc-external
  namespace: public-service
  labels:
    app: elaticsearch-svc-external
spec:
  ports:
  - port: 9200
    name: elaticsearch-svc-external
    protocol: TCP
  type: ClusterIP

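  Create the Endpoints object for Elasticsearch. Because the Service has no selector, the Endpoints object must use the same name and namespace as the Service.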

  es-ep.yaml

apiVersion: v1
kind: Endpoints
metadata:
  labels:
    app: elaticsearch-svc-external
  name: elaticsearch-svc-external
  namespace: public-service
subsets:
- addresses:
  - ip: 192.168.100.144
  ports:
  - name: elaticsearch-svc-external
    port: 9200
    protocol: TCP
  • If Elasticsearch has more than one node, add a further "- ip: x.x.x.x" entry under addresses for each node.
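
  For several Elasticsearch nodes, the manifest can also be generated rather than edited by hand. This is a minimal sketch, not part of the original setup: build_endpoints is a hypothetical helper, and the second IP address is made up for illustration. kubectl accepts JSON manifests as well as YAML, so the output can be saved to a file and applied.

```python
import json
from typing import Dict, List


def build_endpoints(name: str, namespace: str, ips: List[str], port: int = 9200) -> Dict:
    """Build an Endpoints manifest with one address entry per external node."""
    return {
        "apiVersion": "v1",
        "kind": "Endpoints",
        "metadata": {"name": name, "namespace": namespace, "labels": {"app": name}},
        "subsets": [
            {
                "addresses": [{"ip": ip} for ip in ips],
                "ports": [{"name": name, "port": port, "protocol": "TCP"}],
            }
        ],
    }


manifest = build_endpoints(
    "elaticsearch-svc-external",
    "public-service",
    ["192.168.100.144", "192.168.100.145"],  # second IP is hypothetical
)
print(json.dumps(manifest, indent=2))
```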

  Verify:

kubectl get svc -n public-service
NAME                        TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
elaticsearch-svc-external   ClusterIP   10.109.66.130    <none>        9200/TCP   43h

kubectl get ep -n public-service
NAME                        ENDPOINTS              AGE
elaticsearch-svc-external   192.168.100.144:9200   43h

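  To test, start a pod that includes curl and check that it can query the Elasticsearch cluster.

  This Elasticsearch instance has authentication enabled, so credentials are passed with --user.

curl -XGET --user elastic:elastic 'http://elaticsearch-svc-external.public-service:9200/_cluster/health?pretty=true'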
{
  "cluster_name" : "es-cluster",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 11,
  "active_shards" : 11,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 3,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 78.57142857142857
}
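
  The "yellow" status above means every primary shard is assigned but some replicas are not, which is expected on a single-node cluster. The following sketch reads such a response and recomputes the active-shard percentage. It assumes the total is active + initializing + unassigned shards, which matches the sample output here; summarize_health is a hypothetical helper name.

```python
import json

# Trimmed copy of the health response shown above.
SAMPLE = """
{
  "cluster_name": "es-cluster",
  "status": "yellow",
  "active_shards": 11,
  "initializing_shards": 0,
  "unassigned_shards": 3
}
"""


def summarize_health(body: str) -> dict:
    """Summarize a _cluster/health response (assumed shard arithmetic)."""
    health = json.loads(body)
    active = health["active_shards"]
    total = active + health["initializing_shards"] + health["unassigned_shards"]
    percent = 100.0 * active / total if total else 100.0
    return {
        "status": health["status"],
        # green or yellow: all primary shards are assigned
        "all_primaries_assigned": health["status"] in ("green", "yellow"),
        "active_percent": percent,
    }


print(summarize_health(SAMPLE))
```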

  

4. Start Redis

  redis.yaml

apiVersion: v1
kind: Service
metadata:
  name: redis-svc
  namespace: public-service
  labels:
    app:  redis
spec:
  selector:
    app: redis-single
  ports:
    - protocol: TCP
      port: 6379
      targetPort: 6379

---
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "18"
  labels:
    app: redis
  name: redis
  namespace: public-service
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: redis-single
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: redis-single
    spec:
      nodeSelector:
        logging: "true"
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - redis-single
              topologyKey: kubernetes.io/hostname
            weight: 100
      containers:
      - command:
        - redis-server
        - /usr/local/etc/redis.conf
        env:
        - name: TZ
          value: Asia/Shanghai
        - name: LANG
          value: C.UTF-8
        image: redis:5.0.10-alpine
        imagePullPolicy: IfNotPresent
        lifecycle: {}
        livenessProbe:
          failureThreshold: 2
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          tcpSocket:
            port: 6379
          timeoutSeconds: 2
        name: redis
        ports:
        - containerPort: 6379
          name: redis-port
          protocol: TCP
        readinessProbe:
          failureThreshold: 2
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          tcpSocket:
            port: 6379
          timeoutSeconds: 2
        resources:
          limits:
            cpu: 100m
            memory: 100Mi
          requests:
            cpu: 10m
            memory: 10Mi
        volumeMounts:
        - mountPath: /usr/share/zoneinfo/Asia/Shanghai
          name: tz-config
        - mountPath: /etc/localtime
          name: tz-config
        - mountPath: /usr/local/etc/
          name: redis-conf
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      tolerations:
      - effect: NoExecute
        key: node.kubernetes.io/unreachable
        operator: Exists
        tolerationSeconds: 30
      - effect: NoExecute
        key: node.kubernetes.io/not-ready
        operator: Exists
        tolerationSeconds: 30
      volumes:
      - hostPath:
          path: /usr/share/zoneinfo/Asia/Shanghai
          type: ""
        name: tz-config
      - configMap:
          name: redis-conf
        name: redis-conf
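  • nodeSelector schedules Redis onto a node labeled logging=true. The node must carry that label before the pod can be scheduled (kubectl label node node01 logging=true, as used in the Filebeat step); otherwise the pod stays Pending.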

  redis.conf

protected-mode no
port 6379
tcp-backlog 511
timeout 0
tcp-keepalive 300
#daemonize yes
supervised no
loglevel notice
logfile "/data/redis.log"
databases 16
always-show-logo yes
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dir "/data"
slave-serve-stale-data yes
slave-read-only yes
repl-diskless-sync no
repl-diskless-sync-delay 5
repl-disable-tcp-nodelay no
slave-priority 100
lazyfree-lazy-eviction no
lazyfree-lazy-expire no
lazyfree-lazy-server-del no
slave-lazy-flush no
appendonly yes
appendfilename "appendonly.aof"
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 1gb
aof-load-truncated yes
aof-use-rdb-preamble no
lua-time-limit 5000
slowlog-log-slower-than 10000
slowlog-max-len 128
latency-monitor-threshold 0
notify-keyspace-events "xE"
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-size -2
list-compress-depth 0
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
hll-sparse-max-bytes 3000
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit slave 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
hz 10
aof-rewrite-incremental-fsync yes

  Create:

# kubectl create configmap redis-conf --from-file=redis.conf -n public-service

# kubectl create -f redis.yaml

# kubectl get pods  -n public-service
NAME                     READY   STATUS    RESTARTS   AGE
redis-6f45c688f9-j6jsn   1/1     Running   0          130m

# kubectl get svc -n public-service | grep redis
redis-svc                   ClusterIP   10.107.68.121    <none>        6379/TCP   137m

  

5. Deploy Filebeat

  filebeat-configmap.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: logging
  labels:
    k8s-app: filebeat
data:
  filebeat.yml: |-
    filebeat.inputs:
    - type: container
      paths:
        # Application pods are named with a pb- prefix
        - /var/log/containers/pb-*.log

      # Merge multi-line entries (such as stack traces) into one event
      multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
      multiline.negate: true
      multiline.match: after
      multiline.timeout: 30
      fields:
        # Custom field that lets Logstash identify logs coming from Kubernetes
        service: java-log

      processors:
        - add_kubernetes_metadata:
            # Add Kubernetes metadata fields
            default_indexers.enabled: true
            default_matchers.enabled: true
            host: ${NODE_NAME}
            matchers:
            - logs_path:
                logs_path: "/var/log/containers/"
        - drop_fields:
            # Fields to drop because they are not needed
            fields: ["host", "tags", "ecs", "log", "prospector", "agent", "input", "beat", "offset"]
            ignore_missing: true
    output.redis:
      # To reach a Service in another namespace, append that namespace to the Service name
      hosts: ["redis-svc.public-service"]
      key: "filebeat"
      db: 1
      timeout: 5
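
  The multiline settings above (a date pattern with negate: true and match: after) mean that any line not starting with a date is appended to the preceding line that does. The sketch below imitates only that grouping rule in plain Python; merge_multiline is a hypothetical name, the sample lines are made up, and Filebeat's timeout and line limits are not modeled.

```python
import re
from typing import Iterable, List

# Same pattern as multiline.pattern in the Filebeat config.
START_OF_EVENT = re.compile(r"^[0-9]{4}-[0-9]{2}-[0-9]{2}")


def merge_multiline(lines: Iterable[str]) -> List[str]:
    """Group continuation lines with the dated line that precedes them."""
    events: List[str] = []
    for line in lines:
        if START_OF_EVENT.search(line) or not events:
            events.append(line)          # a dated line starts a new event
        else:
            events[-1] += "\n" + line    # continuation of the previous event
    return events


# Hypothetical Java log with a stack trace.
sample = [
    "2020-11-10 17:03:21.123 [] [main] ERROR com.example.Demo - boom",
    "java.lang.IllegalStateException: boom",
    "\tat com.example.Demo.main(Demo.java:10)",
    "2020-11-10 17:03:22.000 [] [main] INFO com.example.Demo - recovered",
]
for event in merge_multiline(sample):
    print(repr(event))
```

  With this grouping, the stack trace travels as part of the ERROR event instead of arriving in Logstash as separate, unparseable lines.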

  filebeat-rbac.yaml

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: filebeat
subjects:
- kind: ServiceAccount
  name: filebeat
  namespace: logging
roleRef:
  kind: ClusterRole
  name: filebeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: filebeat
  labels:
    k8s-app: filebeat
rules:
- apiGroups: [""] # "" indicates the core API group
  resources:
  - namespaces
  - pods
  verbs:
  - get
  - watch
  - list
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: filebeat
  namespace: logging
  labels:
    k8s-app: filebeat

  filebeat-daemonset.yaml

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: filebeat
  namespace: logging
  labels:
    k8s-app: filebeat
spec:
  selector:
    matchLabels:
      k8s-app: filebeat
  template:
    metadata:
      labels:
        k8s-app: filebeat
    spec:
      serviceAccountName: filebeat
      terminationGracePeriodSeconds: 30
      nodeSelector:
        logging: "true"
      containers:
      - name: filebeat
        image: docker.io/elastic/filebeat:7.8.0
        imagePullPolicy: IfNotPresent
        # args: [
        #   "-c", "/etc/filebeat.yml",
        #   "-e",
        # ]
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        securityContext:
          runAsUser: 0
          # If using Red Hat OpenShift uncomment this:
          #privileged: true
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - name: config
          mountPath: /usr/share/filebeat/filebeat.yml
          readOnly: true
          subPath: filebeat.yml
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
        - name: varlog
          mountPath: /var/log
      volumes:
      - name: config
        configMap:
          defaultMode: 0640
          name: filebeat-config
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: varlog
        hostPath:
          path: /var/log

  Label the nodes whose logs should be collected. The nodeSelector restricts Filebeat to nodes that carry this label.

# kubectl label node node01 logging=true

  Create:

# kubectl create ns  logging
# kubectl create -f .

  Verify:

# kubectl get pod -n logging | grep filebeat
filebeat-z5276                        1/1     Running   0          132m

# kubectl get cm -n logging | grep filebeat
filebeat-config      1      8h

  

6. Deploy Logstash

  logstash-configmap.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: logstash-configmap
  namespace: logging
data:
  logstash.yml: |
    http.host: "0.0.0.0"
    path.config: /usr/share/logstash/pipeline
  logstash.conf: |
    # Events arrive through the Redis list that Filebeat writes to; the beats input is not used by this Filebeat config
    input {
        beats {
          port => 5044
        }
        redis {
           host => "redis-svc.public-service"
           port => "6379"
           db => 1
           data_type => "list"
           key => "filebeat"
           type => "k8s-log"
          }
    }
    
    filter {
      if [fields][service] == "java-log" {
        grok {
          match => { "message" => "%{TIMESTAMP_ISO8601:logdatetime}\s(?:\[\]|\[%{UUID:trace_id}\])\s(?:%{SYSLOG5424SD:thread}|%{SYSLOG5424SD})\s*%{LOGLEVEL:level}\s*%{JAVACLASS:class}\s-\s%{GREEDYDATA:logmessage}" }
          remove_field => [ "message", "agent", "ecs", "tags" ]
        }
      }
    }


    output {
      if [fields][service] == "java-log" {
        elasticsearch {
          hosts => ["elaticsearch-svc-external.public-service.svc.cluster.local:9200"]
          user => "elastic"
          password => "elastic"
          index => "k8s-%{[kubernetes][namespace]}-%{[kubernetes][labels][app]}-%{+YYYY.MM.dd}"
        }
        # stdout { codec => rubydebug { metadata => true } }
      }
    }

  

Grok syntax reference and pattern testing for Logstash

Syntax: https://github.com/logstash-plugins/logstash-patterns-core/blob/master/patterns/grok-patterns

Testing: http://grokdebug.herokuapp.com/
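
  To see which fields the grok expression in logstash.conf is meant to extract, it can help to try a rough equivalent locally. The regular expression below is a simplified Python approximation, not the real grok pattern definitions (for example, LOGLEVEL is reduced to uppercase letters), and the sample line and parse_java_log name are hypothetical.

```python
import re
from typing import Dict, Optional

# Simplified stand-ins for TIMESTAMP_ISO8601, UUID, SYSLOG5424SD,
# LOGLEVEL, JAVACLASS and GREEDYDATA.
LINE = re.compile(
    r"(?P<logdatetime>\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:[.,]\d+)?)\s"
    r"\[(?P<trace_id>[0-9A-Fa-f]{8}(?:-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12})?\]\s"
    r"(?P<thread>\[[^\]]*\])\s*"
    r"(?P<level>[A-Z]+)\s*"
    r"(?P<class>[A-Za-z_$][\w$]*(?:\.[A-Za-z_$][\w$]*)*)\s-\s"
    r"(?P<logmessage>.*)"
)


def parse_java_log(line: str) -> Optional[Dict[str, str]]:
    """Return the extracted fields, or None when the line does not match
    (Logstash would tag such an event with _grokparsefailure)."""
    match = LINE.match(line)
    if match is None:
        return None
    # An empty "[]" leaves trace_id unset, as in the grok expression.
    return {k: v for k, v in match.groupdict().items() if v is not None}


# Hypothetical log line in the layout the grok expression expects.
sample = (
    "2020-11-10 17:03:21.123 [3f2504e0-4f89-11d3-9a0c-0305e82c3301] "
    "[http-nio-8080-exec-1] ERROR com.example.demo.OrderService - Failed to create order"
)
print(parse_java_log(sample))
```

  The level field extracted here is what the alert rule later queries with level:ERROR.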

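  • index: the index is created automatically and is named k8s-<namespace>-<app>-<date>. The values come from the pod's namespace and its app label, so each workload must set the label in its pod template (spec.template.metadata.labels.app).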
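
  The index name is assembled from event fields plus the event date. The sketch below mimics that substitution in Python under two assumptions worth checking against your Logstash version: the date part is formatted from the event timestamp in UTC, and a missing field leaves its %{...} reference in the name unchanged. index_name is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Sequence


def _lookup(event: dict, path: Sequence[str]) -> Optional[str]:
    """Follow nested keys such as [kubernetes][labels][app]."""
    value = event
    for key in path:
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return str(value)


def index_name(event: dict, timestamp: datetime) -> str:
    """Render k8s-<namespace>-<app>-<YYYY.MM.dd>; timestamp must be timezone-aware."""
    parts = []
    for path in (("kubernetes", "namespace"), ("kubernetes", "labels", "app")):
        value = _lookup(event, path)
        if value is None:
            # Assumed behaviour: an unresolved reference stays literal.
            value = "%{" + "".join(f"[{key}]" for key in path) + "}"
        parts.append(value)
    day = timestamp.astimezone(timezone.utc).strftime("%Y.%m.%d")
    return f"k8s-{parts[0]}-{parts[1]}-{day}"


event = {"kubernetes": {"namespace": "logging", "labels": {"app": "pb-counter"}}}
# 07:00 in UTC+8 is still the previous day in UTC.
shanghai_morning = datetime(2020, 11, 11, 7, 0, tzinfo=timezone(timedelta(hours=8)))
print(index_name(event, shanghai_morning))
```

  A pod without an app label would not produce a clean index name, which is why the label is required on every workload whose logs are collected.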

 

  logstash-deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: logstash-deployment
  namespace: logging
spec:
  selector:
    matchLabels:
      app: logstash
  replicas: 1
  template:
    metadata:
      labels:
        app: logstash
    spec:
      nodeSelector:
        logging: "true"
      containers:
      - name: logstash
        image: elastic/logstash:7.8.0
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 5044
        volumeMounts:
          - name: config-volume
            mountPath: /usr/share/logstash/config
          - name: logstash-pipeline-volume
            mountPath: /usr/share/logstash/pipeline
      volumes:
      - name: config-volume
        configMap:
          name: logstash-configmap
          items:
            - key: logstash.yml
              path: logstash.yml
      - name: logstash-pipeline-volume
        configMap:
          name: logstash-configmap
          items:
            - key: logstash.conf
              path: logstash.conf

  Create:

# kubectl create -f .

  Verify:

# kubectl get pods -n logging | grep logstash
logstash-deployment-bc4d6c6dc-zsmrw   1/1     Running   0          135m

  

7. Test logging

  Create a pod that emits logs:

  counter-deploy.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: pb-counter
  namespace: logging
  labels:
    app: pb-counter
    logging: "true" 
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pb-counter
      logging: "true"
  template:
    metadata:
      name: pb-counter
      namespace: logging
      labels:
        app: pb-counter
        logging: "true"
    spec:
      nodeSelector:
        logging: "true"
      containers:
      - name: my-counter-test
        image: busybox
        imagePullPolicy: IfNotPresent
        args: [/bin/sh, -c, 'tail -n+1 -f /opt/msg/message.txt']
        volumeMounts:
        - name: msg
          mountPath: /opt/msg
      volumes:
      - name: msg
        configMap:
          name: msg
  • msg: a ConfigMap holding a sample Java log (message.txt), mounted into pb-counter.

  [Screenshot: the logs in Kibana]

  [Screenshot: a detailed log entry]

 

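8. Log-based alerting

  ElastAlert is an Elasticsearch alerting tool written in Python by Yelp. It queries Elasticsearch at a fixed interval, compares the results with each rule's conditions, and sends an alert when a rule matches. Alert channels include, but are not limited to, email, WeChat, and DingTalk.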

  elastalert.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: elastalert-config
  namespace: logging
  labels:
    app: elastalert
data:
  elastalert_config: |-
    ---
    rules_folder: /opt/rules       # directory containing the rule files
    scan_subdirectories: false
    run_every:                     # how often to query Elasticsearch
      minutes: 1
    buffer_time:
      minutes: 15
    es_host: elaticsearch-svc-external.public-service
    es_port: 9200
    es_username: elastic
    es_password: elastic
    writeback_index: elastalert
    use_ssl: False
    verify_certs: True
    alert_time_limit:             # retry window for failed alerts
      minutes: 2880
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: elastalert-rules
  namespace: logging
  labels:
    app: elastalert
data:
  rule_config.yaml: |-
    name: dummylogs error     # rule name, must be unique
    es_host: elaticsearch-svc-external.public-service
    es_port: 9200
    es_username: elastic
    es_password: elastic
    type: any                 # rule type
    index: k8s-*              # Elasticsearch index pattern
    
    filter:                   # query filter
    - query:
        query_string:
          query: "level:ERROR"  # alert condition

    alert:                    # alert channels
    - "email"
    smtp_host: smtp.qq.com
    smtp_port: 587
    smtp_auth_file: /opt/auth/smtp_auth_file.yaml
    email_reply_to: xxx@qq.com
    from_addr: xxx@qq.com
    email:                  # recipient addresses
    - "xx@xx.com"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: elastalert
  namespace: logging
  labels:
    app: elastalert
spec:
  selector:
    matchLabels:
      app: elastalert
  template:
    metadata:
      labels:
        app: elastalert
    spec:
      nodeSelector:
        logging: "true"
      containers:
      - name: elastalert
        image: jertel/elastalert-docker:0.2.4
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - name: config
          mountPath: /opt/config
        - name: rules
          mountPath: /opt/rules
        - name: auth
          mountPath: /opt/auth
        resources:
          limits:
            cpu: 50m
            memory: 256Mi
          requests:
            cpu: 50m
            memory: 256Mi
      volumes:
      - name: auth
        secret:
          secretName: smtp-auth
      - name: rules
        configMap:
          name: elastalert-rules
      - name: config
        configMap:
          name: elastalert-config
          items:
          - key: elastalert_config
            path: elastalert_config.yaml
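
  With run_every set to 1 minute and buffer_time to 15 minutes, each run looks at the window from 15 minutes ago up to now, and a rule of type any alerts on every document that matches the filter. The sketch below simulates that loop over in-memory documents. It is a simplification with hypothetical names: real ElastAlert queries Elasticsearch and keeps its own record of processed hits, for which the seen set is only a stand-in.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Set

RUN_EVERY = timedelta(minutes=1)     # run_every in elastalert_config
BUFFER_TIME = timedelta(minutes=15)  # buffer_time in elastalert_config


def poll(docs: List[Dict], now: datetime, seen: Set[str]) -> List[Dict]:
    """Return documents that should alert on this run (rule type "any")."""
    window_start = now - BUFFER_TIME
    alerts = []
    for doc in docs:
        in_window = window_start <= doc["@timestamp"] <= now
        matches_filter = doc.get("level") == "ERROR"   # query: "level:ERROR"
        if in_window and matches_filter and doc["_id"] not in seen:
            seen.add(doc["_id"])                       # do not alert twice
            alerts.append(doc)
    return alerts


# Hypothetical documents for the example.
start = datetime(2020, 11, 11, 10, 0)
documents = [
    {"_id": "a", "@timestamp": start - timedelta(minutes=5), "level": "ERROR"},
    {"_id": "b", "@timestamp": start - timedelta(minutes=2), "level": "INFO"},
    {"_id": "c", "@timestamp": start - timedelta(minutes=30), "level": "ERROR"},
]
already_alerted: Set[str] = set()
print([d["_id"] for d in poll(documents, start, already_alerted)])
print([d["_id"] for d in poll(documents, start + RUN_EVERY, already_alerted)])
```

  The overlap between consecutive windows is deliberate: it gives late-arriving log entries a chance to be picked up on a later run.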

  Email alerts require an smtp_auth_file that contains the SMTP user name and password (smtp_auth_file.yaml):

user: "xxxxx"       # sender email address
password: "xxxxxxxxxxxxxxxx"   # the QQ Mail authorization code, not the login password

  How to obtain a QQ Mail authorization code for third-party clients: https://service.mail.qq.com/cgi-bin/help?subtype=1&&id=28&&no=1001256

  Create a Secret:

kubectl create secret generic smtp-auth --from-file=smtp_auth_file.yaml -n logging

  Create the ElastAlert application defined above:

# kubectl create -f elastalert.yaml

# kubectl get pod -n logging | grep elast
elastalert-fb86bd644-tcbxk            1/1     Running   0          13m

# kubectl logs -f elastalert-fb86bd644-tcbxk -n logging
Elastic Version: 7.8.0
Reading Elastic 6 index mappings:
Reading index mapping 'es_mappings/6/silence.json'
Reading index mapping 'es_mappings/6/elastalert_status.json'
Reading index mapping 'es_mappings/6/elastalert.json'
Reading index mapping 'es_mappings/6/past_elastalert.json'
Reading index mapping 'es_mappings/6/elastalert_error.json'
Index elastalert already exists. Skipping index creation.

  [Screenshot: the new indices in Elasticsearch]

  [Screenshot: the email received after an ERROR-level log entry is produced]


 

posted @ 2020-11-11 10:22 by Bigberg