Canal 1.1.5实现MariaDB主库故障后自动切换到从库继续同步

一、实验背景

canal有一个参数canal.instance.standby.address可以指定源端数据库的从库为备选数据库,当源端master宕机后,canal仍能指向备库进行同步。

但是发现canal 1.1.4不兼容mariadb的gtid:

当canal instance里指定了gtid(示例:canal.instance.master.gtid=0-1-146),重启canal instance会报错,不识别这种gtid格式:

ERROR com.alibaba.otter.canal.common.alarm.LogAlarmHandler - destination:ff-test[java.lang.RuntimeException: parseUUIDSet failed due to wrong format: 0-1-146

  • 当canal instance里指定了binlog名称及position,canal instance不报错,但是识别不到gtid:

2021-04-22 17:17:42.168 [destination = ff-test , address = /192.168.144.246:3306 , EventParser] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - ---> find start position successfully, EntryPosition[included=false,journalName=slave-bin.000009,position=4585,serverId=1,gtid=,timestamp=1619016171000] cost : 238ms , the next step is binlog dump

  • 当主库宕机,发生主从切换后,canal由于识别不到gtid,因此按时间戳找位置:

2021-04-22 17:42:37.002 [destination = ff-test , address = /192.168.144.245:3306 , EventParser] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - ---> begin to find start position, it will be long time for reset or first position

2021-04-22 17:42:37.094 [destination = ff-test , address = /192.168.144.245:3306 , EventParser] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - prepare to find start position by switch ::1619083861000

2021-04-22 17:42:41.910 [destination = ff-test , address = /192.168.144.245:3306 , EventParser] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - ---> find start position successfully, EntryPosition[included=false,journalName=master-bin.000013,position=7080,serverId=1,gtid=,timestamp=1619021230000] cost : 4400ms , the next step is binlog dump

按时间戳找的位置是不准确的,经过多次验证,发现找的位置靠前,因此会重复执行一些SQL,造成数据混乱。

Canal 1.1.5版本修复了mariadb gtid问题,现记录下实验过程。
二、实验步骤
2.1 配置canal server

## detecing config

canal.instance.detecting.enable = true

#canal.instance.detecting.sql = insert into retl.xdual values(1,now()) on duplicate key update x=now()

canal.instance.detecting.sql = select 1

canal.instance.detecting.interval.time = 30

canal.instance.detecting.retry.threshold = 4

canal.instance.detecting.heartbeatHaEnable = true

canal.instance.tsdb.enable = false

2.2 配置canal instance

canal.instance.mysql.slaveId=5 #不要和集群里现有的server_id重复

canal.instance.gtidon=true

canal.instance.master.address=192.168.144.245:3306 #源端ip:端口

canal.instance.tsdb.enable=false

canal.instance.standby.address =192.168.144.246:3306 #备库ip:端口

canal.instance.master.gtid=0-1-165 #可在主库上执行SELECT @@gtid_binlog_pos;查询

canal.instance.dbUsername=canal

canal.instance.dbPassword=canal

canal.instance.filter.regex=hh.* #同步的库表

启动canal instance后,可看到识别到gtid了:

2021-04-22 17:57:46.687 [destination = hh-test , address = /192.168.144.245:3306 , EventParser] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - ---> begin to find start position, it will be long time for reset or first position

2021-04-22 17:57:46.724 [destination = hh-test , address = /192.168.144.245:3306 , EventParser] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - ---> find start position successfully, EntryPosition[included=false,journalName=,position=<null>,serverId=<null>,gtid=0-1-165,timestamp=<null>] cost : 1ms , the next step is binlog dump
2.3 验证mariadb主从切换

关掉主库,canal instance日志显示主从切换过程:

2021-04-22 18:03:41.114 [destination = hh-test , address = /192.168.144.245:3306 , HeartBeatTimeTask] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - try to ha switch, old:192.168.144.245/192.168.144.245:3306, new:/192.168.144.246:3306

2021-04-22 18:03:41.114 [destination = hh-test , address = /192.168.144.245:3306 , HeartBeatTimeTask] ERROR com.alibaba.otter.canal.common.alarm.LogAlarmHandler - destination:hh-test[try to ha switch, old:192.168.144.245/192.168.144.245:3306, new:/192.168.144.246:3306]

2021-04-22 18:03:41.126 [destination = hh-test , address = /192.168.144.245:3306 , HeartBeatTimeTask] WARN c.a.o.canal.parse.inbound.mysql.dbsync.LogEventConvert - --> init table filter : ^hh.*$

2021-04-22 18:03:41.126 [destination = hh-test , address = /192.168.144.245:3306 , HeartBeatTimeTask] WARN c.a.o.canal.parse.inbound.mysql.dbsync.LogEventConvert - --> init table black filter :

2021-04-22 18:03:41.127 [destination = hh-test , address = /192.168.144.245:3306 , HeartBeatTimeTask] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - connect failed by

java.io.IOException: connect /192.168.144.245:3306 failure

……

2021-04-22 18:03:41.183 [destination = hh-test , address = /192.168.144.246:3306 , EventParser] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - ---> begin to find start position, it will be long time for reset or first position

2021-04-22 18:03:41.184 [destination = hh-test , address = /192.168.144.246:3306 , EventParser] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - ---> find start position successfully, EntryPosition[included=false,journalName=master-bin.000015,position=4288,serverId=1,gtid=0-1-171,timestamp=1619085735000] cost : 0ms , the next step is binlog dump

经测试,找的位置是准确的。

 

但是canal 1.1.5目前发现有一点不足的地方,canal server重启后,canal client不会自动连canal server,canal client停止往目标库同步,一直在报错:

AdapterProcessor - com.alibaba.otter.canal.protocol.exception.CanalClientException: java.io.IOException: Broken pipe Error sync but ACK!

需要重启canal client才行。

不知是我哪里没配对,还是canal 1.1.5目前还不完善。
————————————————
版权声明:本文为CSDN博主「雅冰石」的原创文章,遵循CC 4.0 BY-SA版权协议,转载请附上原文出处链接及本声明。
原文链接:https://blog.csdn.net/yabingshi_tech/article/details/116002271

posted @ 2021-04-27 13:38  ~*一生所爱*~  阅读(689)  评论(0编辑  收藏  举报
在国际交往中,实力 永远是维护正义的基础;国防 才是外交真正的后盾;尊严 只在剑峰之上;真理 只在大炮射程之内。