Pitfalls of using Debezium

A table that already contains data has been added to table.include.list of a connector that is already running — how do you take a snapshot of that table?

> Development environment: Debezium 1.3.Final

As the question suggests, the key is the parameter `snapshot.new.tables`. It is a somewhat curious option that the Debezium team has deliberately kept out of the documentation; the official explanation is in this issue: https://issues.redhat.com/browse/DBZ-1977

An example configuration:

```json
{
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.user": "debezium_mysql",
    "tasks.max": "1",
    "database.history.kafka.bootstrap.servers": "cdh04:9092,cdh05:9092,cdh06:9092",
    "database.history.kafka.topic": "ninth_studio_connector_history",
    "database.history.kafka.recovery.poll.interval.ms": "5000",
    "database.server.name": "ninth_studio_connector",
    "database.port": "3306",
    "tombstones.on.delete": "false",
    "snapshot.new.tables": "parallel",
    "database.hostname": "common.mysql.test.local",
    "database.password": "********",
    "database.serverTimezone": "UTC",
    "table.include.list": "ninth_studio.wehub_action_logs,ninth_studio.ab_py_assistant_binds",
    "database.include.list": "ninth_studio"
}
```

PS: in 1.3, with snapshot.mode = when_needed and snapshot.include.collection.list configured, the connector occasionally stops picking up change events. You have to push a configuration update (any property will do; the point is to make Kafka Connect restart the connector, because a plain restart through the REST API has no effect) before it starts reading data again.
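For context, a minimal sketch of the snapshot-related keys mentioned above; the table name is reused from the earlier example purely for illustration:

```json
{
    "snapshot.mode": "when_needed",
    "snapshot.include.collection.list": "ninth_studio.wehub_action_logs"
}
```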

Timezone issues

Reference for the relevant settings: https://my.oschina.net/dacoolbaby/blog/3096451

That post covers many Debezium pitfalls and is well worth a read.
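A rough illustration, not taken from the linked post and assuming the MySQL server's time zone is Asia/Shanghai: the earlier example already passes database.serverTimezone through to the JDBC connection, so the same key can be set to the actual server time zone instead of UTC:

```json
{
    "database.serverTimezone": "Asia/Shanghai"
}
```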

Decimal data type conversion

Reference: https://blog.csdn.net/u012551524/article/details/83546765

With the default precise mode, decimal values are emitted as encoded byte strings such as "F3A="; setting the mode to double makes them display normally.
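A minimal sketch of the corresponding connector setting; decimal.handling.mode accepts precise (the default), double, and string, and note that double can lose precision for very large or very exact values:

```json
{
    "decimal.handling.mode": "double"
}
```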

Exception thrown because the binlog files have expired

The exception looks like this:

```
Connector requires binlog file 'mysql-bin.000003', but MySQL only has mysql-bin.000041, mysql-bin.000042 (io.debezium.connector.mysql.MySqlConnectorTask:330)
[2018-10-03 12:48:43,752] INFO Stopping MySQL connector task (io.debezium.connector.mysql.MySqlConnectorTask:245)
[2018-10-03 12:48:43,752] INFO WorkerSourceTask{id=debezium-mysql-0} Committing offsets (org.apache.kafka.connect.runtime.WorkerSourceTask:397)
[2018-10-03 12:48:43,752] INFO WorkerSourceTask{id=debezium-mysql-0} flushing 0 outstanding messages for offset commit (org.apache.kafka.connect.runtime.WorkerSourceTask:414)
[2018-10-03 12:48:43,753] ERROR WorkerSourceTask{id=debezium-mysql-0} Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask:177)
org.apache.kafka.connect.errors.ConnectException: The connector is trying to read binlog starting at binlog file 'mysql-bin.000003', pos=715349422, skipping 982 events plus 40 rows, but this is no longer available on the server. Reconfigure the connector to use a snapshot when needed.
```

In this situation the simplest fix is to set "snapshot.mode" to when_needed. To avoid the problem at its root, consider raising MySQL's binlog retention period (expire_logs_days).
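A minimal sketch of that setting, added to the connector configuration shown earlier:

```json
{
    "snapshot.mode": "when_needed"
}
```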
PS: sometimes the exception is not thrown directly; instead the log keeps printing `Couldn't commit processed log positions with the source database due to a concurrent connector shutdown or restart`. In that case, try restarting the Connect service and then check the task status again.