mysql+canal+adapter+es实现数据同步
一、版本
MySQL:5.7 canal:1.1.5 adapter:1.1.5 elasticsearch:7.4.2
二、MySQL开启binlog
1、MySQL配置文件
[mysqld] log-bin=mysql-bin # 开启 binlog binlog-format=ROW # 选择 ROW 模式 server_id=1 # 配置 MySQL replaction 需要定义,不要和 canal 的 slaveId 重复 #注:如果canal监听的是mysql slave库, 请在slave库加上如下配置 --logs-slave-updates
2、查看配置是否生效
# 是否开启binlog show variables like 'log_bin'; # 结果 log_bin ON # binlog模式 show variables like 'binlog_format'; # 结果 binlog_format ROW
3、授权canal连接MySQL的账号具有作为MySQL slave的权限,如果有账号可以直接grant
# 创建用户canal及密码设置 CREATE USER canal IDENTIFIED BY 'canal'; # 赋权 GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'canal'@'%'; # 刷新生效 FLUSH PRIVILEGES;
三、下载canal及adapter
# canal-server https://github.com/alibaba/canal/releases/download/canal-1.1.5/canal.deployer-1.1.5.tar.gz # canal-adapter https://github.com/alibaba/canal/releases/download/canal-1.1.5/canal.adapter-1.1.5.tar.gz
1、解压、展示目录
canal-server
drwxr-xr-x 2 root root 4096 9月 25 11:54 bin drwxr-xr-x 5 root root 4096 9月 25 11:54 conf drwxr-xr-x 2 root root 4096 9月 25 11:54 lib drwxrwxrwx 2 root root 4096 4月 19 2021 log drwxrwxrwx 2 root root 4096 4月 19 2021 plugin
canal-adapter
drwxr-xr-x 2 root root 4096 9月 25 11:54 bin drwxrwxrwx 8 root root 4096 4月 19 2021 conf drwxr-xr-x 2 root root 4096 9月 25 11:54 lib drwxrwxrwx 2 root root 4096 4月 19 2021 logs drwxrwxrwx 2 root root 4096 4月 19 2021 plugin
四、编辑canal配置文件
vi conf/example/instance.properties
此处只展示修改的配置
# 伪装成从库的slaveId,不能与MySQL重复 canal.instance.mysql.slaveId=1234 # 数据库的ip:端口 canal.instance.master.address=127.0.0.1:3306 # 数据库用户名密码 canal.instance.dbUsername=canal canal.instance.dbPassword=canal
五、启动canal
cd bin sh startup.sh
1、观察日志
canal.log
cd logs/canal tail -f canal.log 2022-09-25 12:50:43.316 [main] INFO com.alibaba.otter.canal.deployer.CanalLauncher - ## set default uncaught exception handler 2022-09-25 12:50:43.330 [main] INFO com.alibaba.otter.canal.deployer.CanalLauncher - ## load canal configurations 2022-09-25 12:50:43.335 [main] INFO com.alibaba.otter.canal.deployer.CanalStarter - ## start the canal server. 2022-09-25 12:50:43.353 [main] INFO com.alibaba.otter.canal.deployer.CanalController - ## start the canal server[0.0.1.1(0.0.1.1):11111] 2022-09-25 12:50:43.872 [main] INFO com.alibaba.otter.canal.deployer.CanalStarter - ## the canal server is running now ......
canal_stdout.log
tail -f canal_stdout.log OpenJDK 64-Bit Server VM warning: ignoring option PermSize=96m; support was removed in 8.0 OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0 OpenJDK 64-Bit Server VM warning: UseCMSCompactAtFullCollection is deprecated and will likely be removed in a future release.
如遇到如下报错
OpenJDK 64-Bit Server VM warning: ignoring option PermSize=96m; support was removed in 8.0 OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0 OpenJDK 64-Bit Server VM warning: UseCMSCompactAtFullCollection is deprecated and will likely be removed in a future release. The stack size specified is too small, Specify at least 384k Error: Could not create the Java Virtual Machine. Error: A fatal exception has occurred. Program will exit.
调整startup.sh脚本的-Xss参数
vi bin/startup.sh # 我这里调整到-Xss512k if [ -n "$str" ]; then JAVA_OPTS="-server -Xms2048m -Xmx3072m -Xmn1024m -XX:SurvivorRatio=2 -XX:PermSize=96m -XX:MaxPermSize=256m -Xss512k -XX:-UseAdaptiveSizePolicy -XX:MaxTenuringThreshold=15 -XX:+DisableExplicitGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseCMSCompactAtFullCollection -XX:+UseFastAccessorMethods -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError" else JAVA_OPTS="-server -Xms1024m -Xmx1024m -XX:NewSize=256m -XX:MaxNewSize=256m -XX:MaxPermSize=128m "
example.log
cd logs/example tail -f example.log 2022-09-25 12:50:43.848 [main] INFO c.a.otter.canal.instance.spring.CanalInstanceWithSpring - start CannalInstance for 1-example 2022-09-25 12:50:43.854 [main] WARN c.a.o.canal.parse.inbound.mysql.dbsync.LogEventConvert - --> init table filter : ^.*\..*$ 2022-09-25 12:50:43.854 [main] WARN c.a.o.canal.parse.inbound.mysql.dbsync.LogEventConvert - --> init table black filter : ^mysql\.slave_.*$ 2022-09-25 12:50:43.857 [main] INFO c.a.otter.canal.instance.core.AbstractCanalInstance - start successful.... 2022-09-25 12:50:44.323 [destination = example , address = /127.0.0.1:3306 , EventParser] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - ---> begin to find start position, it will be long time for reset or first position 2022-09-25 12:50:44.324 [destination = example , address = /127.0.0.1:3306 , EventParser] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - prepare to find start position just show master status 2022-09-25 12:53:24.653 [destination = example , address = /127.0.0.1:3306 , EventParser] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - ---> find start position successfully, EntryPosition[included=false,journalName=mysql-bin.000005,position=115925585,serverId=1,gtid=,timestamp=1663923876000] cost : 160325ms , the next step is binlog dump
六、编辑adapter配置文件
cd conf vi application.yml
此处只展示修改的配置
canal.conf: consumerProperties: # 单机配置属性 # canal.tcp.server.host: 127.0.0.1:11111 # 此配置数据库信息与canal-server配置的数据库信息相同 srcDataSources: defaultDs: url: jbdc:mysql://127.0.0.1:3306/canal_test username: canal password: canal # 配置 ES信息 canalAdapters: groups: outerAdapters: - name: logger - name: es7 hosts: http://127.0.0.1:9200 properties: mode: rest security.auth: es账号:es密码 cluster.name: es的名字
适配器表映射文件
cd conf/es7/ cp mytest_user.yml canal_test_order.yml rm biz_order.yml customer.yml mytest_user.yml vi canal_test_order.yml dataSourceKey: defaultDS destination: example groupId: g1 esMapping: _index: canal_test_order _id: _id # 这个必须要加,源文件没有 _type: _doc upsert: true # pk: id sql: "select a.id as _id, a.order_no as orderNo, a.order_name as orderName from t_order a" # objFields: # _labels: array:; etlCondition: "where a.c_time>={}" commitBatch: 3000
七、启动adapter
cd bin sh startup.sh
1、观察日志
cd logs ls
发现没有日志
坑爹的地方
ps -ef|grep canal-adapter
发现没有进程启动
在bin目录下只有一个adapter.pid文件
想了半天,关键是连个日志都没有,不知道哪里报错
最后突然想到是不是脚本也需要改-Xss虚拟机参数
于是把-Xss由256k改为512k
vi startup.sh if [ -n "$str" ]; then JAVA_OPTS="-server -Xms2048m -Xmx3072m -Xmn1024m -XX:SurvivorRatio=2 -XX:PermSize=96m -XX:MaxPermSize=256m -Xss512k -XX:-UseAdaptiveSizePolicy -XX:MaxTenuringThreshold=15 -XX:+DisableExplicitGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseCMSCompactAtFullCollection -XX:+UseFastAccessorMethods -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError" else JAVA_OPTS="-server -Xms1024m -Xmx1024m -XX:NewSize=256m -XX:MaxNewSize=256m -XX:MaxPermSize=128m "
再次启动出现日志
cd logs ls adapter cd adapter ls adapter.log tail -f adapter.log
此时发现程序虽然运行起来了,但是有报错如下
2022-09-25 14:17:57.838 [main] INFO c.a.o.c.client.adapter.es.core.config.ESSyncConfigLoader - ## ES mapping config loaded 2022-09-25 14:17:57.997 [main] ERROR c.a.o.canal.adapter.launcher.loader.CanalAdapterLoader - Load canal adapter: es7 failed java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassCastException: com.alibaba.druid.pool.DruidDataSource cannot be cast to com.alibaba.druid.pool.DruidDataSource at com.alibaba.otter.canal.client.adapter.es7x.ES7xAdapter.init(ES7xAdapter.java:54) ~[client-adapter.es7x-1.1.5-jar-with-dependencies.jar:na] at com.alibaba.otter.canal.adapter.launcher.loader.CanalAdapterLoader.loadAdapter(CanalAdapterLoader.java:225) [client-adapter.launcher-1.1.5.jar:na] at com.alibaba.otter.canal.adapter.launcher.loader.CanalAdapterLoader.init(CanalAdapterLoader.java:56) [client-adapter.launcher-1.1.5.jar:na] at com.alibaba.otter.canal.adapter.launcher.loader.CanalAdapterService.init(CanalAdapterService.java:60) [client-adapter.launcher-1.1.5.jar:na] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_322] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_322] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_322] at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_322] at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor$LifecycleElement.invoke(InitDestroyAnnotationBeanPostProcessor.java:365) [spring-beans-5.0.5.RELEASE.jar:5.0.5.RELEASE] at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor$LifecycleMetadata.invokeInitMethods(InitDestroyAnnotationBeanPostProcessor.java:308) [spring-beans-5.0.5.RELEASE.jar:5.0.5.RELEASE] at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor.postProcessBeforeInitialization(InitDestroyAnnotationBeanPostProcessor.java:135) [spring-beans-5.0.5.RELEASE.jar:5.0.5.RELEASE] at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.applyBeanPostProcessorsBeforeInitialization(AbstractAutowireCapableBeanFactory.java:422) [spring-beans-5.0.5.RELEASE.jar:5.0.5.RELEASE]
解决办法,更改源码
下载canal-adapter源码
修改client-adapter/escore/pom.xml为
原 <dependency> <groupId>com.alibaba</groupId> <artifactId>druid</artifactId> </dependency> 改成 <dependency> <groupId>com.alibaba</groupId> <artifactId>druid</artifactId> <scope>provided</scope> </dependency>
重新打包编译生成client-adapter.es7x-1.1.5-jar-with-dependencies.jar
放入canal-adapter的plugin目录下,替换原jar
重新启动,日志如下
2022-09-25 14:23:27.903 [main] INFO o.s.jmx.export.annotation.AnnotationMBeanExporter - Located managed bean 'configurationPropertiesRebinder': registering with JMX server as MBean [org.springframework.cloud.context.properties:name=configurationPropertiesRebinder,context=8ad6665,type=ConfigurationPropertiesRebinder] 2022-09-25 14:23:27.910 [main] INFO c.a.o.canal.adapter.launcher.loader.CanalAdapterService - ## syncSwitch refreshed. 2022-09-25 14:23:27.910 [main] INFO c.a.o.canal.adapter.launcher.loader.CanalAdapterService - ## start the canal client adapters. 2022-09-25 14:23:27.912 [main] INFO c.a.otter.canal.client.adapter.support.ExtensionLoader - extension classpath dir: /Users/fengchong/canal/canal-adapter/plugin 2022-09-25 14:23:27.958 [main] INFO c.a.o.canal.adapter.launcher.loader.CanalAdapterLoader - Load canal adapter: logger succeed 2022-09-25 14:23:28.155 [main] INFO c.a.o.c.client.adapter.es.core.config.ESSyncConfigLoader - ## Start loading es mapping config ... 2022-09-25 14:23:28.175 [main] INFO c.a.o.c.client.adapter.es.core.config.ESSyncConfigLoader - ## ES mapping config loaded 2022-09-25 14:23:28.341 [main] INFO c.a.o.canal.adapter.launcher.loader.CanalAdapterLoader - Load canal adapter: es7 succeed 2022-09-25 14:23:28.345 [main] INFO c.alibaba.otter.canal.connector.core.spi.ExtensionLoader - extension classpath dir: /Users/fengchong/canal/canal-adapter/plugin 2022-09-25 14:23:28.357 [main] INFO c.a.o.canal.adapter.launcher.loader.CanalAdapterLoader - Start adapter for canal-client mq topic: example-g1 succeed 2022-09-25 14:23:28.357 [main] INFO c.a.o.canal.adapter.launcher.loader.CanalAdapterService - ## the canal client adapters are running now ...... 2022-09-25 14:23:28.357 [Thread-4] INFO c.a.otter.canal.adapter.launcher.loader.AdapterProcessor - =============> Start to connect destination: example <============= 2022-09-25 14:23:28.361 [main] INFO org.apache.coyote.http11.Http11NioProtocol - Starting ProtocolHandler ["http-nio-8081"] 2022-09-25 14:23:28.363 [main] INFO org.apache.tomcat.util.net.NioSelectorPool - Using a shared selector for servlet write/read 2022-09-25 14:23:28.374 [main] INFO o.s.boot.web.embedded.tomcat.TomcatWebServer - Tomcat started on port(s): 8081 (http) with context path '' 2022-09-25 14:23:28.379 [main] INFO c.a.otter.canal.adapter.launcher.CanalAdapterApplication - Started CanalAdapterApplication in 2.602 seconds (JVM running for 3.131) 2022-09-25 14:23:28.402 [Thread-4] INFO c.a.otter.canal.adapter.launcher.loader.AdapterProcessor - =============> Subscribe destination: example succeed <=============
启动成功!
八、测试
在MySQL手动插入一条数据
adapter.log打印日志如下
2022-09-25 15:20:25.519 [pool-2-thread-1] INFO c.a.o.canal.client.adapter.logger.LoggerAdapterExample - DML: {"data":[{"id":7,"order_no":1122,"order_name":"2211"}],"database":"canal_test","destination":"example","es":1664090425000,"groupId":"g1","isDdl":false,"old":null,"pkNames":["id"],"sql":"","table":"t_order","ts":1664090425518,"type":"INSERT"} 2022-09-25 15:20:25.520 [pool-2-thread-1] DEBUG c.a.o.canal.client.adapter.es.core.service.ESSyncService - DML: {"data":[{"id":7,"order_no":1122,"order_name":"2211"}],"database":"canal_test","destination":"example","es":1664090425000,"groupId":"g1","isDdl":false,"old":null,"pkNames":["id"],"sql":"","table":"t_order","ts":1664090425518,"type":"INSERT"} Affected indexes: canal_test_order
查看elasticsearch数据
# get 127.0.0.1:9200/canal_test_order/_search { "took": 0, "timed_out": false, "_shards": { "total": 1, "successful": 1, "skipped": 0, "failed": 0 }, "hits": { "total": { "value": 1, "relation": "eq" }, "max_score": 1.0, "hits": [ { "_index": "canal_test_order", "_type": "_doc", "_id": "1", "_score": 1.0, "_source": { "orderNo": 111111, "orderName": "11111" } } ] } }
九、添加一张表关联
删除原es索引
重新建立新索引,含一对多关系,如一笔订单含有多件产品
# 原映射 { "properties": { "orderNo": { "type": "integer" }, "orderName": { "type": "keyword" } } } # 现映射 { "properties": { "productList": { "type": "nested", "properties": { "id": { "type": "integer" }, "productName": { "type": "keyword" }, "orderId": { "type": "integer" } } }, "orderNo": { "type": "integer" }, "orderName": { "type": "keyword" } } }
修改adapter适配器表映射文件为
dataSourceKey: defaultDS destination: example groupId: g1 esMapping: _index: canal_test_order _id: _id _type: _doc upsert: true # pk: id sql: "SELECT a.id AS _id, a.order_no AS orderNo, a.order_name AS orderName, concat( '[', l.productList, ']' ) AS productList FROM t_order a LEFT JOIN ( SELECT order_id, group_concat( JSON_OBJECT( 'id', id,'productName',product_name, 'orderId', order_id )) AS productList FROM t_order_product GROUP BY order_id ) l ON a.id = l.order_id" objFields: # 注意开启 productList: object # list中数据为对象型 etlCondition: "where a.c_time>={}" commitBatch: 3000
插入一条订单表数据es结果
{ "took": 0, "timed_out": false, "_shards": { "total": 1, "successful": 1, "skipped": 0, "failed": 0 }, "hits": { "total": { "value": 1, "relation": "eq" }, "max_score": 1.0, "hits": [ { "_index": "canal_test_order", "_type": "_doc", "_id": "8", "_score": 1.0, "_source": { "orderNo": 112233, "orderName": "112233", "productList": null } } ] } }
此时productList为空
向产品表添加两条产品,含关联此订单表的id
{ "took": 0, "timed_out": false, "_shards": { "total": 1, "successful": 1, "skipped": 0, "failed": 0 }, "hits": { "total": { "value": 1, "relation": "eq" }, "max_score": 1.0, "hits": [ { "_index": "canal_test_order", "_type": "_doc", "_id": "8", "_score": 1.0, "_source": { "orderNo": 112233, "orderName": "112233", "productList": [ { "orderId": 8, "id": 21, "productName": "2221" }, { "orderId": 8, "id": 22, "productName": "11" } ] } } ] } }
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· Manus重磅发布:全球首款通用AI代理技术深度解析与实战指南
· 被坑几百块钱后,我竟然真的恢复了删除的微信聊天记录!
· 没有Manus邀请码?试试免邀请码的MGX或者开源的OpenManus吧
· 【自荐】一款简洁、开源的在线白板工具 Drawnix
· 园子的第一款AI主题卫衣上架——"HELLO! HOW CAN I ASSIST YOU TODAY