Filebeat shipping directly to Elasticsearch: extracting fields from message with ingest pipeline processors

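Requirements:

  1. Extract the event field from message
  2. Extract the streamId field from message
  3. Extract the time field from message
  4. Convert the extracted time value to a timestamp and store it in the ts field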
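Requirement 4 comes down to reading the time string as Asia/Shanghai local time and expressing it as milliseconds since the Unix epoch. The following is a minimal Python sketch of that arithmetic only, not of how Elasticsearch implements it. The helper name to_epoch_millis is made up for illustration, and Asia/Shanghai is assumed to be a fixed UTC+8 offset, which holds for the date used here.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Asia/Shanghai is modelled as a fixed UTC+8 offset (no DST on this date).
SHANGHAI = timezone(timedelta(hours=8))
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(text):
    """Parse 'yyyy-MM-dd HH:mm:ss.SSS' as Shanghai local time, return epoch milliseconds."""
    local = datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f").replace(tzinfo=SHANGHAI)
    # Integer division of timedeltas avoids float rounding in the millisecond part.
    return (local - EPOCH) // timedelta(milliseconds=1)


ts = to_epoch_millis("2023-01-12 17:17:50.023")
print(ts)  # 1673515070023
```

The value matches the ts produced by the pipeline simulation in the steps that follow, where it appears as a JSON string rather than a number.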

Implementation steps:

  1. Open the Kibana Dev Tools console: http://your_ip:5601/app/dev_tools#/console
  2. Enter the following request to simulate the pipeline and verify it:
POST /_ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "dissect": {
          "field": "message",
          "pattern": "%{event_pre1}event=%{event}",
          "ignore_failure": true
        }
      },
      {
        "dissect": {
          "field": "message",
          "pattern": "%{event_pre2}event=%{event} %{event_post2}",
          "ignore_failure": true
        }
      },
      {
        "dissect": {
          "field": "message",
          "pattern": "%{streamId_pre1}streamId=%{streamId}",
          "ignore_failure": true
        }
      },
      {
        "dissect": {
          "field": "message",
          "pattern": "%{streamId_pre2}streamId=%{streamId} %{streamId_post2}",
          "ignore_failure": true
        }
      },
      {
        "dissect": {
          "field": "message",
          "pattern": "%{time_pre1}time=\"%{time}\"",
          "ignore_failure": true
        }
      },
      {
        "dissect": {
          "field": "message",
          "pattern": "%{time_pre2}time=\"%{time}\" %{time_post2}",
          "ignore_failure": true
        }
      },
      {
        "date": {
          "field": "time",
          "target_field": "ts",
          "formats": [
            "yyyy-MM-dd HH:mm:ss.SSS"
          ],
          "output_format": "epoch_millis",
          "timezone": "Asia/Shanghai",
          "ignore_failure": true
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "message": "time=\"2023-01-12 17:17:50.023\" level=I pid=15414-15418 line=MangerTask@965 task=TN_Rtc-Mgr msg=full-link-tracing event=createSource streamId=ad7d38474_web_video_134408"
      }
    }
  ]
}
  3. Run the request. It returns:
{
  "docs" : [
    {
      "doc" : {
        "_index" : "_index",
        "_type" : "_doc",
        "_id" : "_id",
        "_source" : {
          "streamId" : "ad7d38474_web_video_134408",
          "streamId_pre1" : """time="2023-01-12 17:17:50.023" level=I pid=15414-15418 line=MangerTask@965 task=TN_Rtc-Mgr msg=full-link-tracing event=createSource """,
          "time_post2" : "level=I pid=15414-15418 line=MangerTask@965 task=TN_Rtc-Mgr msg=full-link-tracing event=createSource streamId=ad7d38474_web_video_134408",
          "message" : """time="2023-01-12 17:17:50.023" level=I pid=15414-15418 line=MangerTask@965 task=TN_Rtc-Mgr msg=full-link-tracing event=createSource streamId=ad7d38474_web_video_134408""",
          "time_pre2" : "",
          "time_pre1" : "",
          "event_post2" : "streamId=ad7d38474_web_video_134408",
          "event_pre1" : """time="2023-01-12 17:17:50.023" level=I pid=15414-15418 line=MangerTask@965 task=TN_Rtc-Mgr msg=full-link-tracing """,
          "time" : "2023-01-12 17:17:50.023",
          "event" : "createSource",
          "event_pre2" : """time="2023-01-12 17:17:50.023" level=I pid=15414-15418 line=MangerTask@965 task=TN_Rtc-Mgr msg=full-link-tracing """,
          "ts" : "1673515070023"
        },
        "_ingest" : {
          "timestamp" : "2023-01-13T01:49:33.609557268Z"
        }
      }
    }
  ]
}
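The six dissect processors in the request follow one template per field, so the request body can also be generated instead of typed by hand. The sketch below is one way to do that in Python. The helper names dissect_pair and build_simulate_body are hypothetical, and the script only prints JSON to paste into the console; it does not talk to Elasticsearch.

```python
import json


def dissect_pair(name, quoted=False):
    """Build the two dissect processors used for one key=value field."""
    q = '"' if quoted else ""
    patterns = [
        # Field is the last one in the message.
        f"%{{{name}_pre1}}{name}={q}%{{{name}}}{q}",
        # Field is followed by a space and more fields.
        f"%{{{name}_pre2}}{name}={q}%{{{name}}}{q} %{{{name}_post2}}",
    ]
    return [
        {"dissect": {"field": "message", "pattern": p, "ignore_failure": True}}
        for p in patterns
    ]


def build_simulate_body(message):
    """Assemble the _simulate request body for one sample message."""
    processors = (
        dissect_pair("event")
        + dissect_pair("streamId")
        + dissect_pair("time", quoted=True)
    )
    processors.append({
        "date": {
            "field": "time",
            "target_field": "ts",
            "formats": ["yyyy-MM-dd HH:mm:ss.SSS"],
            "output_format": "epoch_millis",
            "timezone": "Asia/Shanghai",
            "ignore_failure": True,
        }
    })
    return {"pipeline": {"processors": processors},
            "docs": [{"_source": {"message": message}}]}


body = build_simulate_body('time="2023-01-12 17:17:50.023" event=createSource')
print(json.dumps(body, indent=2))
```

Adding another key=value field then takes one more dissect_pair call rather than two more hand-written processors.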

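Notes

  1. Each field gets two dissect processors because the field may or may not be the last one in message. The first pattern ends with the field's own key, so it captures everything up to the end of the line: that is correct when the field is last, but when it is not, the captured value also swallows the fields that follow. The second pattern requires a space after the value. When the field is not last, it matches and overwrites the value with the correctly trimmed one; when the field is last, it fails and the failure is ignored (ignore_failure). The output above shows this: event_pre1 and event_pre2 are both present because both event patterns matched, while only streamId_pre1 is present because the second streamId pattern failed.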
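To make the two-pattern behaviour concrete, here is a small Python approximation of dissect matching. It is a sketch under simplifying assumptions, not the Elasticsearch implementation: it handles only plain %{key} placeholders separated by literal delimiters, and the function names dissect and apply_patterns are illustrative.

```python
import re


def dissect(pattern, text):
    """Rough approximation of dissect matching; returns a dict, or None on failure."""
    parts = re.split(r"%\{(\w+)\}", pattern)
    literals, keys = parts[0::2], parts[1::2]
    if not text.startswith(literals[0]):
        return None
    pos = len(literals[0])
    fields = {}
    for i, key in enumerate(keys):
        delimiter = literals[i + 1]
        if delimiter:
            end = text.find(delimiter, pos)
            if end == -1:
                return None  # delimiter missing: the whole pattern fails
        else:
            end = len(text)  # trailing key: capture the rest of the line
        fields[key] = text[pos:end]
        pos = end + len(delimiter)
    return fields


def apply_patterns(message, patterns):
    """Apply patterns in order, skipping failures like "ignore_failure": true."""
    doc = {"message": message}
    for pattern in patterns:
        fields = dissect(pattern, message)
        if fields is not None:
            doc.update(fields)
    return doc


MESSAGE = ('time="2023-01-12 17:17:50.023" level=I pid=15414-15418 '
           'line=MangerTask@965 task=TN_Rtc-Mgr msg=full-link-tracing '
           'event=createSource streamId=ad7d38474_web_video_134408')
EVENT = ["%{event_pre1}event=%{event}",
         "%{event_pre2}event=%{event} %{event_post2}"]
STREAM = ["%{streamId_pre1}streamId=%{streamId}",
          "%{streamId_pre2}streamId=%{streamId} %{streamId_post2}"]

first_only = dissect(EVENT[0], MESSAGE)
event_doc = apply_patterns(MESSAGE, EVENT)
stream_doc = apply_patterns(MESSAGE, STREAM)

print(first_only["event"])     # createSource streamId=ad7d38474_web_video_134408
print(event_doc["event"])      # createSource
print(stream_doc["streamId"])  # ad7d38474_web_video_134408
```

With the first event pattern alone the value runs to the end of the line; the second pattern trims it. For streamId, the second pattern finds no trailing space and is skipped, so the value from the first pattern stays.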

posted on 2022-09-29 14:54 by cag2050
