EFK(Elasticsearch+Filebeat+Kibana) 收集K8s容器日志

一、Kubernetes日志采集难点

  在 Kubernetes 中,日志采集相比传统虚拟机、物理机方式要复杂很多,最根本的原因是 Kubernetes 把底层异常屏蔽,提供更加细粒度的资源调度,向上提供稳定、动态的环境。因此日志采集面对的是更加丰富、动态的环境,需要考虑的点也更加的多。

  1. 对于运行时间很短的 Job 类应用,从启动到停止只有几秒的时间,如何保证日志采集的实时性能够跟上而且数据不丢? 
  2. K8s 一般推荐使用大规格节点,每个节点可以运行 10-100+ 的容器,如何在资源消耗尽可能低的情况下采集 100+ 的容器?
  3. 在 K8s 中,应用都以 yaml 的方式部署,而日志采集还是以手工的配置文件形式为主,如何能够让日志采集以 K8s 的方式进行部署?

二、Kubernetes日志采集方式

  1. 业务直写:在应用中集成日志采集的 SDK,通过 SDK 直接将日志发送到服务端。这种方式省去了落盘采集的逻辑,也不需要额外部署 Agent,对于系统的资源消耗最低,但由于业务和日志 SDK 强绑定,整体灵活性很低,一般只有日志量极大的场景中使用;
  2. DaemonSet 方式:在每个 node 节点上只运行一个日志 agent,采集这个节点上所有的日志。DaemonSet 相对资源占用要小很多,但扩展性、租户隔离性受限,比较适用于功能单一或业务不是很多的集群;
  3. Sidecar 方式:为每个 POD 单独部署日志 agent,这个 agent 只负责一个业务应用的日志采集。Sidecar 相对资源占用较多,但灵活性以及多租户隔离性较强,建议大型的 K8s 集群或作为 PaaS 平台为多个业务方服务的集群使用该方式。

总结:

  1. 业务直写推荐在日志量极大的场景中使用

  2. DaemonSet一般在中小型集群中使用

  3. Sidecar推荐在超大型的集群中使用

  实际应用场景中,一般都是使用 DaemonSet 或 DaemonSet 与 Sidecar 混用方式,DaemonSet 的优势是资源利用率高,但有一个问题是 DaemonSet 的所有 Logtail 都共享全局配置,而单一的 Logtail 有配置支撑的上限,因此无法支撑应用数比较多的集群。

  • 一个配置尽可能多的采集同类数据,减少配置数,降低 DaemonSet 压力;
  • 核心的应用采集要给予充分的资源,可以使用 Sidecar 方式;
  • 配置方式尽可能使用 CRD 方式;
  • Sidecar 由于每个 Logtail 是单独的配置,所以没有配置数的限制,这种比较适合于超大型的集群使用。

三、以SeatefulSet的方式创建单节点elasticsearch的yaml文件

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
# cat elasticsearch.yaml
 
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch
  namespace: kube-system
  labels:
    k8s-app: elasticsearch
spec:
  serviceName: elasticsearch
  selector:
    matchLabels:
      k8s-app: elasticsearch
  template:
    metadata:
      labels:
        k8s-app: elasticsearch
    spec:
      initContainers:
      - name: busybox
        imagePullPolicy: IfNotPresent
        image: busybox:latest
        securityContext:
          privileged: true
        command: ["sh", "-c", "mkdir -p /usr/share/elasticsearch/data/logs;chown -R 1000:1000 /usr/share/elasticsearch/data"]
        volumeMounts:
        - name: elasticsearch-data
          mountPath: "/usr/share/elasticsearch/data"
 
      containers:
      - image: elasticsearch:7.10.1
        name: elasticsearch
        resources:
          limits:
            cpu: 1
            memory: 2Gi
          requests:
            cpu: 0.5
            memory: 500Mi
        env:
          - name: "discovery.type"
            value: "single-node"
          - name: ES_JAVA_OPTS
            value: "-Xms512m -Xmx2g"
        ports:
        - containerPort: 9200
          name: db
          protocol: TCP
        volumeMounts:
        - name: elasticsearch-data
          mountPath: /usr/share/elasticsearch/data
  volumeClaimTemplates:
  - metadata:
      name: elasticsearch-data
    spec:
      storageClassName: "disk-sc"
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1000Gi
 
---
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch
  namespace: kube-system
spec:
  clusterIP: None
  ports:
  - port: 9200
    protocol: TCP
    targetPort: db
  selector:
    k8s-app: elasticsearch

 四、以Deployment的方式创建kibana的yaml文件

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
# cat kibana.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana
  namespace: kube-system
  labels:
    k8s-app: kibana
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: kibana
  template:
    metadata:
      labels:
        k8s-app: kibana
    spec:
      containers:
      - name: kibana
        image: kibana:7.10.1
        resources:
          limits:
            cpu: 1
            memory: 500Mi
          requests:
            cpu: 0.5
            memory: 200Mi
        env:
          - name: ELASTICSEARCH_HOSTS
            value: http://elasticsearch-0.elasticsearch.kube-system:9200
          - name: I18N_LOCALE
            value: zh-CN
        ports:
        - containerPort: 5601
          name: ui
          protocol: TCP
 
---  # 使用云厂商的负载均衡
apiVersion: v1
kind: Service
metadata:
  name: kibana
  namespace: kube-system
  annotations:
    service.kubernetes.io/qcloud-loadbalancer-internal-subnetid: subnet-cadefsb
spec:
  ports:
  - name: kibana-pvc
    protocol: TCP
    port: 5601
    targetPort: 5601
  selector:
    k8s-app: kibana
  type: LoadBalancer

五、以DaemonSet的方式创建filebeat的yaml文件

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
# cat filebeat-kubernetes.yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: kube-system
  labels:
    k8s-app: filebeat
data:
  filebeat.yml: |-
    filebeat.config:
      inputs:
        # Mounted `filebeat-inputs` configmap:
        path: ${path.config}/inputs.d/*.yml
        # Reload inputs configs as they change:
        reload.enabled: false
      modules:
        path: ${path.config}/modules.d/*.yml
        # Reload module configs as they change:
        reload.enabled: false
    filebeat.autodiscover:       # 使用filebeat自动发现的方式
      providers:
        - type: kubernetes
          templates:
            - condition:
                equals:
                  kubernetes.namespace: prod      # 收集prod命名空间的日志
              config:
                - type: log      # 日志类型为log而非docker或者container,因为我们输出的日志非json格式。
                  containers.ids:
                    - "${data.kubernetes.container.id}"
                  paths:<br>                    # 收集日志的路径,如果设置为"/var/lib/kubelet/pods/**/*info.log",则会收集多余的日志,使用云k8s,所以非/var/lib/docker/containers/目录
                    - "/var/lib/kubelet/pods/${data.kubernetes.pod.uid}/volumes/kubernetes.io~empty-dir/data-vol/log/java/*/*info.log"
                  encoding: utf-8
                  scan_frequency: 1s    # 扫描新文件的时间间隔,默认为10秒
                  tail_files: true      # 设置为true时filebeat从每个文件的末尾取新文件,而不是开始。首次启用filebeat时忽略旧的日志,下次起动时关闭此选项。
                  fields_under_root: true      # 设置为true后,fields存储在输出文档的顶级位置
                  fields:
                    type: "prod-info"
        - type: kubernetes
          templates:
            - condition:
                equals:
                  kubernetes.namespace: prod
              config:
                - type: log
                  containers.ids:
                    - "${data.kubernetes.container.id}"
                  paths:
                    - "/var/lib/kubelet/pods/${data.kubernetes.pod.uid}/volumes/kubernetes.io~empty-dir/data-vol/log/java/*/*error.log"
                  encoding: utf-8
                  scan_frequency: 1s
                  tail_files: true
                  fields_under_root: true
                  fields:
                    type: "prod-error"
                  multiline.type: pattern
                  multiline.pattern: '^[[:space:]]+(at|\.{3})[[:space:]]+\b|^Caused by:'
                  multiline.negate: false
                  multiline.match: after
 
        - type: kubernetes
          templates:
            - condition:
                equals:
                  kubernetes.namespace: test
              config:
                - type: log
                  containers.ids:
                    - "${data.kubernetes.container.id}"
                  paths:
                    - "/var/lib/kubelet/pods/${data.kubernetes.pod.uid}/volumes/kubernetes.io~empty-dir/data-vol/log/java/*/*info.log"
                  encoding: utf-8
                  scan_frequency: 1s
                  tail_files: true
                  fields_under_root: true
                  fields:
                    type: "test-info"
        - type: kubernetes
          templates:
            - condition:
                equals:
                  kubernetes.namespace: test
              config:
                - type: log
                  containers.ids:
                    - "${data.kubernetes.container.id}"
                  paths:
                    - "/var/lib/kubelet/pods/${data.kubernetes.pod.uid}/volumes/kubernetes.io~empty-dir/data-vol/log/java/*/*error.log"
                  encoding: utf-8
                  scan_frequency: 1s
                  tail_files: true
                  fields_under_root: true
                  fields:
                    type: "test-error"
                  multiline.type: pattern
                  multiline.pattern: '^[[:space:]]+(at|\.{3})[[:space:]]+\b|^Caused by:'      # 标准的java多行匹配
                  multiline.negate: false
                  multiline.match: after
 
 
    setup.ilm.enabled: false
    output.elasticsearch:
      hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
      indices:
        - index: "k8s-test-info-%{+yyyy.MM.dd}"
          when.contains:
            type: "test-info"
        - index: "k8s-test-error-%{+yyyy.MM.dd}"
          when.contains:
            type: "test-error"
 
        - index: "k8s-prod-info-%{+yyyy.MM.dd}"
          when.contains:
            type: "prod-info"
        - index: "k8s-prod-error-%{+yyyy.MM.dd}"
          when.contains:
            type: "prod-error"<br>
   # output.redis:      # 将日志输出到redis中,再用logstash收日志输出到es中
   #   hosts: ["192.168.5.99:6379"]
   #   db: 0
   #   password: "password"
   #   key: "default_list"
   #   keys:
   #     - key: "k8s-prod-error-%{+yyyy.MM.dd}"
   #       when.contains:
   #         type: "prod-error"
   #     - key: "k8s-prod-info-%{+yyyy.MM.dd}"
   #       when.contains:
   #         type: "prod-info"
 
---  # 此configmap没有使用,可以将上面的配置文件粘贴到下面,也可删除
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-inputs
  namespace: kube-system
  labels:
    k8s-app: filebeat
data:
  kubernetes.yml: |-
    #- type: log
    #  paths:
       # - "/var/lib/kubelet/pods/**/*info.log"
      #processors:
      #  - add_kubernetes_metadata:
       #     default_indexers.enabled: false
       #     default_matchers.enabled: false
       #     in_cluster: true
       #     indexers:
       #       - ip_port:
       #     matchers:
       #       - field_format:
       #           format: '%{[destination.ip]}:%{[destination.port]}'
       #     matchers:
       #       - logs_path:
       #           logs_path: '/var/log/pods'
       #           resource_type: 'pod'
 
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    k8s-app: filebeat
spec:
  selector:
    matchLabels:
      k8s-app: filebeat
  template:
    metadata:
      labels:
        k8s-app: filebeat
    spec:
      serviceAccountName: filebeat
      terminationGracePeriodSeconds: 30
      containers:
      - name: filebeat
        image: elastic/filebeat:7.10.1
        args: [
          "-c", "/etc/filebeat.yml",
          "-e",
        ]
        env:
        - name: ELASTICSEARCH_HOST
          value: elasticsearch-0.elasticsearch.kube-system
        - name: ELASTICSEARCH_PORT
          value: "9200"
        securityContext:
          runAsUser: 0
          # If using Red Hat OpenShift uncomment this:
          #privileged: true
        resources:
          limits:
            memory: 6144Mi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - name: config
          mountPath: /etc/filebeat.yml
          readOnly: true
          subPath: filebeat.yml
        - name: inputs
          mountPath: /usr/share/filebeat/inputs.d
          readOnly: true
        - name: data
          mountPath: /usr/share/filebeat/data
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: varlibkubeletpods
          mountPath: /var/lib/kubelet/pods
          readOnly: true
      volumes:
      - name: config
        configMap:
          defaultMode: 0600
          name: filebeat-config
      - name: varlibkubeletpods
        hostPath:
          path: /var/lib/kubelet/pods
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: inputs
        configMap:
          defaultMode: 0600
          name: filebeat-inputs
      # data folder 存储所有文件的读取状态注册表,因此不会在filebeat pod重启后再次发送所有的文件。
      - name: data
        hostPath:
          path: /var/lib/filebeat-data
          type: DirectoryOrCreate
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: filebeat
subjects:
- kind: ServiceAccount
  name: filebeat
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: filebeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: filebeat
  labels:
    k8s-app: filebeat
rules:
- apiGroups: [""] # "" indicates the core API group
  resources:
  - namespaces
  - pods
  verbs:
  - get
  - watch
  - list
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    k8s-app: filebeat
---

六、参考信息及报错

参考链接 :https://www.elastic.co/guide/en/beats/filebeat/current/configuration-autodiscover.html

以下报错是因为日志格式非json类型

1
2
2021-02-16T12:23:33.920Z    ERROR   [reader_docker_json]    readjson/docker_json.go:204 Parse line error: invalid CRI log format
2021-02-16T12:23:33.920Z    INFO    log/harvester.go:335    Skipping unparsable line in file: /var/lib/kubelet/pods/694538b6-4629-4903-804e-8e9fc36ced4a/volumes/kubernetes.io~empty-dir/data-vol/log/java/daemon/daemon-info.log

 配置信息如下:

1
2
3
4
5
filebeat.yml: |-
  filebeat.inputs:
  - type: container      # 日志类型必须为json格式日志才可以正常收集,k8s转到前台的日志可以正常收集。
    paths:
      - "/var/lib/kubelet/pods/*/volumes/kubernetes.io~empty-dir/data-vol/log/java/*/*.log" 

 七、filebeat排除某些不需要的字段和容器项目日志

1
2
3
4
5
6
7
8
9
10
processors:
  - drop_fields:
      fields: ["agent.ephemeral_id","agent.id","agent.name","agent.type","container.runtime","ecs.version","host.hostname","host.name","kubernetes.labels.pod-template-hash","host.os.version","agent.version","container.image.name","container.id","kubernetes.pod.uid","kubernetes.replicaset.name"]
      ignore_missing: false
  - add_kubernetes_metadata:
      in_cluster: true
  - drop_event.when:
      or:
      - equals:
          kubernetes.labels.app: "tapd-page"

结果展示:

 

posted @   林中龙虾  阅读(4140)  评论(2编辑  收藏  举报
编辑推荐:
· 从 HTTP 原因短语缺失研究 HTTP/2 和 HTTP/3 的设计差异
· AI与.NET技术实操系列:向量存储与相似性搜索在 .NET 中的实现
· 基于Microsoft.Extensions.AI核心库实现RAG应用
· Linux系列:如何用heaptrack跟踪.NET程序的非托管内存泄露
· 开发者必知的日志记录最佳实践
阅读排行:
· TypeScript + Deepseek 打造卜卦网站:技术与玄学的结合
· Manus的开源复刻OpenManus初探
· AI 智能体引爆开源社区「GitHub 热点速览」
· 从HTTP原因短语缺失研究HTTP/2和HTTP/3的设计差异
· 三行代码完成国际化适配,妙~啊~
点击右上角即可分享
微信分享提示