Integrating Kafka with InfluxDB
With the widespread adoption of IoT, time-series databases are being used more and more in software development, and InfluxDB, the top-ranked time-series database on the market, is usually the first choice.
I recently took over a project in which the customer's existing solution used a custom connector to consume messages from a specific Kafka topic, parse them, and write the results into InfluxDB. This suggested the customer was not very familiar with the InfluxDB ecosystem, so my proposed improvement was to put Telegraf between Kafka and InfluxDB: with Kafka configured as a Telegraf input plugin, consuming, parsing, and storing the messages is all handled by Telegraf.
All that is needed is to edit the Telegraf configuration, as follows:
vi /etc/telegraf/telegraf.conf
# Configuration for sending metrics to InfluxDB
[[outputs.influxdb]]
## The full HTTP or UDP URL for your InfluxDB instance.
##
## Multiple URLs can be specified for a single cluster, only ONE of the
## urls will be written to each interval.
# urls = ["unix:///var/run/influxdb.sock"]
# urls = ["udp://127.0.0.1:8089"]
urls = ["http://127.0.0.1:8086"]
## The target database for metrics; will be created as needed.
## For UDP url endpoint database needs to be configured on server side.
database = "telegraf"
## The value of this tag will be used to determine the database. If this
## tag is not set the 'database' option is used as the default.
# database_tag = ""
## If true, no CREATE DATABASE queries will be sent. Set to true when using
## Telegraf with a user without permissions to create databases or when the
## database already exists.
# skip_database_creation = false
## Name of existing retention policy to write to. Empty string writes to
## the default retention policy. Only takes effect when using HTTP.
# retention_policy = ""
## Write consistency (clusters only), can be: "any", "one", "quorum", "all".
## Only takes effect when using HTTP.
# write_consistency = "any"
## Timeout for HTTP messages.
timeout = "5s"
## HTTP Basic Auth
username = "xiaodong"
password = "HON123well"
## HTTP User-Agent
# user_agent = "telegraf"
## UDP payload size is the maximum packet size to send.
# udp_payload = "512B"
## Optional TLS Config for use on HTTP connections.
# tls_ca = "/etc/telegraf/ca.pem"
# tls_cert = "/etc/telegraf/cert.pem"
# tls_key = "/etc/telegraf/key.pem"
## Use TLS but skip chain & host verification
# insecure_skip_verify = false
## HTTP Proxy override, if unset values the standard proxy environment
## variables are consulted to determine which proxy, if any, should be used.
# http_proxy = "http://corporate.proxy:3128"
## Additional HTTP headers
# http_headers = {"X-Special-Header" = "Special-Value"}
## HTTP Content-Encoding for write request body, can be set to "gzip" to
## compress body or "identity" to apply no encoding.
# content_encoding = "identity"
## When true, Telegraf will output unsigned integers as unsigned values,
## i.e.: "42u". You will need a version of InfluxDB supporting unsigned
## integer values. Enabling this option will result in field type errors if
## existing data has been written.
# influx_uint_support = false
# Read metrics from Kafka topic(s)
[[inputs.kafka_consumer]]
## kafka servers
brokers = ["localhost:9092"]
## topic(s) to consume
topics = ["telegraf"]
## Add topic as tag if topic_tag is not empty
# topic_tag = ""

## Optional Client id
# client_id = "Telegraf"

## Set the minimal supported Kafka version. Setting this enables the use of new
## Kafka features and APIs. Of particular interest, lz4 compression
## requires at least version 0.10.0.0.
## ex: version = "1.1.0"
# version = ""

## Optional TLS Config
# tls_ca = "/etc/telegraf/ca.pem"
# tls_cert = "/etc/telegraf/cert.pem"
# tls_key = "/etc/telegraf/key.pem"
## Use TLS but skip chain & host verification
# insecure_skip_verify = false

## Optional SASL Config
sasl_username = "kafka"
sasl_password = "secret"

## the name of the consumer group
consumer_group = "telegraf_metrics_consumers"
## Offset (must be either "oldest" or "newest")
offset = "oldest"
## Maximum length of a message to consume, in bytes (default 0/unlimited);
## larger messages are dropped
max_message_len = 1000000

## Maximum messages to read from the broker that have not been written by an
## output. For best throughput set based on the number of metrics within
## each message and the size of the output's metric_batch_size.
##
## For example, if each message from the queue contains 10 metrics and the
## output metric_batch_size is 1000, setting this to 100 will ensure that a
## full batch is collected and the write is triggered immediately without
## waiting until the next flush_interval.
max_undelivered_messages = 1000

## Data format to consume.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "json"
## Measurement name; if name_override is not set, the measurement will be kafka_consumer
name_override = "topicB"
## Tag keys is an array of keys that should be added as tags.
tag_keys = ["first"]
## String fields is an array of keys that should be added as string fields.
json_string_fields = ["last"]
## Query is a GJSON path that specifies a specific chunk of JSON to be
## parsed, if not specified the whole document will be parsed.
##
## GJSON query paths are described here:
## https://github.com/tidwall/gjson#path-syntax
json_query = "obj.friends"
## Time key is the key containing the time that should be used to create the metric.
json_time_key = ""
## Time format is the time layout that should be used to interpret the
## json_time_key. The time must be `unix`, `unix_ms` or a time in the
## "reference time".
## ex: json_time_format = "Mon Jan 2 15:04:05 -0700 MST 2006"
## json_time_format = "2006-01-02T15:04:05Z07:00"
## json_time_format = "unix"
## json_time_format = "unix_ms"
json_time_format = ""
Sample Kafka message (input):
{
"obj": {
"name": {"first": "Tom", "last": "Anderson"},
"mrname": "myjson",
"age":37,
"children": ["Sara","Alex","Jack"],
"fav.movie": "Deer Hunter",
"friends": [
{"first": "Dale", "last": "Murphy", "age": 44},
{"first": "Roger", "last": "Craig", "age": 68},
{"first": "Jane", "last": "Murphy", "age": 47}
]
}
}
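With the configuration above, Telegraf's JSON parser first selects the sub-document addressed by json_query = "obj.friends", then treats "first" as a tag, "last" as a string field, and the remaining numeric key "age" as a regular field. The flattening can be sketched in plain Python; this is a rough illustration of the parser's behavior under those settings, not Telegraf's actual code (the real GJSON path syntax is richer than simple dotted keys):

```python
import json

# The sample message produced to the "telegraf" topic (trimmed to the
# keys that matter for json_query = "obj.friends")
message = '''
{
  "obj": {
    "name": {"first": "Tom", "last": "Anderson"},
    "friends": [
      {"first": "Dale", "last": "Murphy", "age": 44},
      {"first": "Roger", "last": "Craig", "age": 68},
      {"first": "Jane", "last": "Murphy", "age": 47}
    ]
  }
}
'''

def parse(msg, json_query, tag_keys, string_fields, name_override):
    """Rough sketch of Telegraf's JSON data format: select a sub-document
    with a dotted path, then split its keys into tags and fields."""
    doc = json.loads(msg)
    for part in json_query.split("."):  # dotted keys only; GJSON does more
        doc = doc[part]
    metrics = []
    for obj in (doc if isinstance(doc, list) else [doc]):
        tags = {k: obj[k] for k in tag_keys if k in obj}
        fields = {k: v for k, v in obj.items()
                  if k not in tag_keys
                  and (k in string_fields or isinstance(v, (int, float)))}
        metrics.append((name_override, tags, fields))
    return metrics

for metric in parse(message, "obj.friends", ["first"], ["last"], "topicB"):
    print(metric)
```

Each element of the friends array becomes one metric named topicB, which matches the three rows in the query output below.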
Output, i.e. the data as stored in InfluxDB:
time age first host last
---- --- ----- ---- ----
1567033241593012039 44 Dale ch71s8dev214.honeywell.com Murphy
1567033241593021507 68 Roger ch71s8dev214.honeywell.com Craig
1567033241593024972 47 Jane ch71s8dev214.honeywell.com Murphy
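Each of these rows corresponds to one point that Telegraf submits to InfluxDB in line protocol (measurement, comma-separated tags, fields, nanosecond timestamp). As a rough sketch of what goes over the wire for the first row, using the hypothetical helper to_line_protocol (string fields are quoted, numeric fields are not; Telegraf's exact escaping and integer formatting are not reproduced here):

```python
# Sketch of the InfluxDB line protocol Telegraf writes for one "friend".
# Format: measurement,tag=value field=value,field=value timestamp_ns
def to_line_protocol(measurement, tags, fields, ts_ns):
    tag_str = "".join(f",{k}={v}" for k, v in tags.items())
    field_parts = []
    for k, v in fields.items():
        # string fields are double-quoted, numeric fields are bare
        field_parts.append(f'{k}="{v}"' if isinstance(v, str) else f"{k}={v}")
    return f"{measurement}{tag_str} {','.join(field_parts)} {ts_ns}"

line = to_line_protocol("topicB", {"first": "Dale"},
                        {"age": 44, "last": "Murphy"}, 1567033241593012039)
print(line)
# topicB,first=Dale age=44,last="Murphy" 1567033241593012039
```

Note that "first" lands in the tag set while "last" and "age" land in the field set, which is exactly why the table above can be queried by the first tag.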