mysql+canal+adapter+es实现数据同步
一、版本
MySQL:5.7
canal:1.1.5
adapter:1.1.5
elasticsearch:7.4.2
二、MySQL开启binlog
1、MySQL配置文件
[mysqld]
log-bin=mysql-bin # 开启 binlog
binlog-format=ROW # 选择 ROW 模式
server_id=1 # 配置 MySQL replaction 需要定义,不要和 canal 的 slaveId 重复
#注:如果canal监听的是mysql slave库, 请在slave库加上如下配置
--logs-slave-updates
2、查看配置是否生效
# 是否开启binlog
show variables like 'log_bin';
# 结果
log_bin ON
# binlog模式
show variables like 'binlog_format';
# 结果
binlog_format ROW
3、授权canal连接MySQL的账号具有作为MySQL slave的权限,如果有账号可以直接grant
# 创建用户canal及密码设置
CREATE USER canal IDENTIFIED BY 'canal';
# 赋权
GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'canal'@'%';
# 刷新生效
FLUSH PRIVILEGES;
三、下载canal及adapter
# canal-server
https://github.com/alibaba/canal/releases/download/canal-1.1.5/canal.deployer-1.1.5.tar.gz
# canal-adapter
https://github.com/alibaba/canal/releases/download/canal-1.1.5/canal.adapter-1.1.5.tar.gz
1、解压、展示目录
canal-server
drwxr-xr-x 2 root root 4096 9月 25 11:54 bin
drwxr-xr-x 5 root root 4096 9月 25 11:54 conf
drwxr-xr-x 2 root root 4096 9月 25 11:54 lib
drwxrwxrwx 2 root root 4096 4月 19 2021 log
drwxrwxrwx 2 root root 4096 4月 19 2021 plugin
canal-adapter
drwxr-xr-x 2 root root 4096 9月 25 11:54 bin
drwxrwxrwx 8 root root 4096 4月 19 2021 conf
drwxr-xr-x 2 root root 4096 9月 25 11:54 lib
drwxrwxrwx 2 root root 4096 4月 19 2021 logs
drwxrwxrwx 2 root root 4096 4月 19 2021 plugin
四、编辑canal配置文件
vi conf/example/instance.properties
此处只展示修改的配置
# 伪装成从库的slaveId,不能与MySQL重复
canal.instance.mysql.slaveId=1234
# 数据库的ip:端口
canal.instance.master.address=127.0.0.1:3306
# 数据库用户名密码
canal.instance.dbUsername=canal
canal.instance.dbPassword=canal
五、启动canal
cd bin
sh startup.sh
1、观察日志
canal.log
cd logs/canal
tail -f canal.log
2022-09-25 12:50:43.316 [main] INFO com.alibaba.otter.canal.deployer.CanalLauncher - ## set default uncaught exception handler
2022-09-25 12:50:43.330 [main] INFO com.alibaba.otter.canal.deployer.CanalLauncher - ## load canal configurations
2022-09-25 12:50:43.335 [main] INFO com.alibaba.otter.canal.deployer.CanalStarter - ## start the canal server.
2022-09-25 12:50:43.353 [main] INFO com.alibaba.otter.canal.deployer.CanalController - ## start the canal server[0.0.1.1(0.0.1.1):11111]
2022-09-25 12:50:43.872 [main] INFO com.alibaba.otter.canal.deployer.CanalStarter - ## the canal server is running now ......
canal_stdout.log
tail -f canal_stdout.log
OpenJDK 64-Bit Server VM warning: ignoring option PermSize=96m; support was removed in 8.0
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
OpenJDK 64-Bit Server VM warning: UseCMSCompactAtFullCollection is deprecated and will likely be removed in a future release.
如遇到如下报错
OpenJDK 64-Bit Server VM warning: ignoring option PermSize=96m; support was removed in 8.0
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
OpenJDK 64-Bit Server VM warning: UseCMSCompactAtFullCollection is deprecated and will likely be removed in a future release.
The stack size specified is too small, Specify at least 384k
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
调整startup.sh脚本的-Xss参数
vi bin/startup.sh
# 我这里调整到-Xss512k
if [ -n "$str" ]; then
JAVA_OPTS="-server -Xms2048m -Xmx3072m -Xmn1024m -XX:SurvivorRatio=2 -XX:PermSize=96m -XX:MaxPermSize=256m -Xss512k -XX:-UseAdaptiveSizePolicy -XX:MaxTenuringThreshold=15 -XX:+DisableExplicitGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseCMSCompactAtFullCollection -XX:+UseFastAccessorMethods -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError"
else
JAVA_OPTS="-server -Xms1024m -Xmx1024m -XX:NewSize=256m -XX:MaxNewSize=256m -XX:MaxPermSize=128m "
example.log
cd logs/example
tail -f example.log
2022-09-25 12:50:43.848 [main] INFO c.a.otter.canal.instance.spring.CanalInstanceWithSpring - start CannalInstance for 1-example
2022-09-25 12:50:43.854 [main] WARN c.a.o.canal.parse.inbound.mysql.dbsync.LogEventConvert - --> init table filter : ^.*\..*$
2022-09-25 12:50:43.854 [main] WARN c.a.o.canal.parse.inbound.mysql.dbsync.LogEventConvert - --> init table black filter : ^mysql\.slave_.*$
2022-09-25 12:50:43.857 [main] INFO c.a.otter.canal.instance.core.AbstractCanalInstance - start successful....
2022-09-25 12:50:44.323 [destination = example , address = /127.0.0.1:3306 , EventParser] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - ---> begin to find start position, it will be long time for reset or first position
2022-09-25 12:50:44.324 [destination = example , address = /127.0.0.1:3306 , EventParser] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - prepare to find start position just show master status
2022-09-25 12:53:24.653 [destination = example , address = /127.0.0.1:3306 , EventParser] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - ---> find start position successfully, EntryPosition[included=false,journalName=mysql-bin.000005,position=115925585,serverId=1,gtid=,timestamp=1663923876000] cost : 160325ms , the next step is binlog dump
六、编辑adapter配置文件
cd conf
vi application.yml
此处只展示修改的配置
canal.conf:
consumerProperties:
# 单机配置属性
# canal.tcp.server.host: 127.0.0.1:11111
# 此配置数据库信息与canal-server配置的数据库信息相同
srcDataSources:
defaultDs:
url: jbdc:mysql://127.0.0.1:3306/canal_test
username: canal
password: canal
# 配置 ES信息
canalAdapters:
groups:
outerAdapters:
- name: logger
- name: es7
hosts: http://127.0.0.1:9200
properties:
mode: rest
security.auth: es账号:es密码
cluster.name: es的名字
适配器表映射文件
cd conf/es7/
cp mytest_user.yml canal_test_order.yml
rm biz_order.yml customer.yml mytest_user.yml
vi canal_test_order.yml
dataSourceKey: defaultDS
destination: example
groupId: g1
esMapping:
_index: canal_test_order
_id: _id
# 这个必须要加,源文件没有
_type: _doc
upsert: true
# pk: id
sql: "select
a.id as _id,
a.order_no as orderNo,
a.order_name as orderName
from t_order a"
# objFields:
# _labels: array:;
etlCondition: "where a.c_time>={}"
commitBatch: 3000
七、启动adapter
cd bin
sh startup.sh
1、观察日志
cd logs
ls
发现没有日志
坑爹的地方
ps -ef|grep canal-adapter
发现没有进程启动
在bin目录下只有一个adapter.pid文件
想了半天,关键是连个日志都没有,不知道哪里报错
最后突然想到是不是脚本也需要改-Xss虚拟机参数
于是把-Xss由256k改为512k
vi startup.sh
if [ -n "$str" ]; then
JAVA_OPTS="-server -Xms2048m -Xmx3072m -Xmn1024m -XX:SurvivorRatio=2 -XX:PermSize=96m -XX:MaxPermSize=256m -Xss512k -XX:-UseAdaptiveSizePolicy -XX:MaxTenuringThreshold=15 -XX:+DisableExplicitGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseCMSCompactAtFullCollection -XX:+UseFastAccessorMethods -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError"
else
JAVA_OPTS="-server -Xms1024m -Xmx1024m -XX:NewSize=256m -XX:MaxNewSize=256m -XX:MaxPermSize=128m "
再次启动出现日志
cd logs
ls
adapter
cd adapter
ls
adapter.log
tail -f adapter.log
此时发现程序虽然运行起来了,但是有报错如下
2022-09-25 14:17:57.838 [main] INFO c.a.o.c.client.adapter.es.core.config.ESSyncConfigLoader - ## ES mapping config loaded
2022-09-25 14:17:57.997 [main] ERROR c.a.o.canal.adapter.launcher.loader.CanalAdapterLoader - Load canal adapter: es7 failed
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassCastException: com.alibaba.druid.pool.DruidDataSource cannot be cast to com.alibaba.druid.pool.DruidDataSource
at com.alibaba.otter.canal.client.adapter.es7x.ES7xAdapter.init(ES7xAdapter.java:54) ~[client-adapter.es7x-1.1.5-jar-with-dependencies.jar:na]
at com.alibaba.otter.canal.adapter.launcher.loader.CanalAdapterLoader.loadAdapter(CanalAdapterLoader.java:225) [client-adapter.launcher-1.1.5.jar:na]
at com.alibaba.otter.canal.adapter.launcher.loader.CanalAdapterLoader.init(CanalAdapterLoader.java:56) [client-adapter.launcher-1.1.5.jar:na]
at com.alibaba.otter.canal.adapter.launcher.loader.CanalAdapterService.init(CanalAdapterService.java:60) [client-adapter.launcher-1.1.5.jar:na]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_322]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_322]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_322]
at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_322]
at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor$LifecycleElement.invoke(InitDestroyAnnotationBeanPostProcessor.java:365) [spring-beans-5.0.5.RELEASE.jar:5.0.5.RELEASE]
at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor$LifecycleMetadata.invokeInitMethods(InitDestroyAnnotationBeanPostProcessor.java:308) [spring-beans-5.0.5.RELEASE.jar:5.0.5.RELEASE]
at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor.postProcessBeforeInitialization(InitDestroyAnnotationBeanPostProcessor.java:135) [spring-beans-5.0.5.RELEASE.jar:5.0.5.RELEASE]
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.applyBeanPostProcessorsBeforeInitialization(AbstractAutowireCapableBeanFactory.java:422) [spring-beans-5.0.5.RELEASE.jar:5.0.5.RELEASE]
解决办法,更改源码
下载canal-adapter源码
修改client-adapter/escore/pom.xml为
原
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>druid</artifactId>
</dependency>
改成
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>druid</artifactId>
<scope>provided</scope>
</dependency>
重新打包编译生成client-adapter.es7x-1.1.5-jar-with-dependencies.jar
放入canal-adapter的plugin目录下,替换原jar
重新启动,日志如下
2022-09-25 14:23:27.903 [main] INFO o.s.jmx.export.annotation.AnnotationMBeanExporter - Located managed bean 'configurationPropertiesRebinder': registering with JMX server as MBean [org.springframework.cloud.context.properties:name=configurationPropertiesRebinder,context=8ad6665,type=ConfigurationPropertiesRebinder]
2022-09-25 14:23:27.910 [main] INFO c.a.o.canal.adapter.launcher.loader.CanalAdapterService - ## syncSwitch refreshed.
2022-09-25 14:23:27.910 [main] INFO c.a.o.canal.adapter.launcher.loader.CanalAdapterService - ## start the canal client adapters.
2022-09-25 14:23:27.912 [main] INFO c.a.otter.canal.client.adapter.support.ExtensionLoader - extension classpath dir: /Users/fengchong/canal/canal-adapter/plugin
2022-09-25 14:23:27.958 [main] INFO c.a.o.canal.adapter.launcher.loader.CanalAdapterLoader - Load canal adapter: logger succeed
2022-09-25 14:23:28.155 [main] INFO c.a.o.c.client.adapter.es.core.config.ESSyncConfigLoader - ## Start loading es mapping config ...
2022-09-25 14:23:28.175 [main] INFO c.a.o.c.client.adapter.es.core.config.ESSyncConfigLoader - ## ES mapping config loaded
2022-09-25 14:23:28.341 [main] INFO c.a.o.canal.adapter.launcher.loader.CanalAdapterLoader - Load canal adapter: es7 succeed
2022-09-25 14:23:28.345 [main] INFO c.alibaba.otter.canal.connector.core.spi.ExtensionLoader - extension classpath dir: /Users/fengchong/canal/canal-adapter/plugin
2022-09-25 14:23:28.357 [main] INFO c.a.o.canal.adapter.launcher.loader.CanalAdapterLoader - Start adapter for canal-client mq topic: example-g1 succeed
2022-09-25 14:23:28.357 [main] INFO c.a.o.canal.adapter.launcher.loader.CanalAdapterService - ## the canal client adapters are running now ......
2022-09-25 14:23:28.357 [Thread-4] INFO c.a.otter.canal.adapter.launcher.loader.AdapterProcessor - =============> Start to connect destination: example <=============
2022-09-25 14:23:28.361 [main] INFO org.apache.coyote.http11.Http11NioProtocol - Starting ProtocolHandler ["http-nio-8081"]
2022-09-25 14:23:28.363 [main] INFO org.apache.tomcat.util.net.NioSelectorPool - Using a shared selector for servlet write/read
2022-09-25 14:23:28.374 [main] INFO o.s.boot.web.embedded.tomcat.TomcatWebServer - Tomcat started on port(s): 8081 (http) with context path ''
2022-09-25 14:23:28.379 [main] INFO c.a.otter.canal.adapter.launcher.CanalAdapterApplication - Started CanalAdapterApplication in 2.602 seconds (JVM running for 3.131)
2022-09-25 14:23:28.402 [Thread-4] INFO c.a.otter.canal.adapter.launcher.loader.AdapterProcessor - =============> Subscribe destination: example succeed <=============
启动成功!
八、测试
在MySQL手动插入一条数据
adapter.log打印日志如下
2022-09-25 15:20:25.519 [pool-2-thread-1] INFO c.a.o.canal.client.adapter.logger.LoggerAdapterExample - DML: {"data":[{"id":7,"order_no":1122,"order_name":"2211"}],"database":"canal_test","destination":"example","es":1664090425000,"groupId":"g1","isDdl":false,"old":null,"pkNames":["id"],"sql":"","table":"t_order","ts":1664090425518,"type":"INSERT"}
2022-09-25 15:20:25.520 [pool-2-thread-1] DEBUG c.a.o.canal.client.adapter.es.core.service.ESSyncService - DML: {"data":[{"id":7,"order_no":1122,"order_name":"2211"}],"database":"canal_test","destination":"example","es":1664090425000,"groupId":"g1","isDdl":false,"old":null,"pkNames":["id"],"sql":"","table":"t_order","ts":1664090425518,"type":"INSERT"}
Affected indexes: canal_test_order
查看elasticsearch数据
# get 127.0.0.1:9200/canal_test_order/_search
{
"took": 0,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 1.0,
"hits": [
{
"_index": "canal_test_order",
"_type": "_doc",
"_id": "1",
"_score": 1.0,
"_source": {
"orderNo": 111111,
"orderName": "11111"
}
}
]
}
}
九、添加一张表关联
删除原es索引
重新建立新索引,含一对多关系,如一笔订单含有多件产品
# 原映射
{
"properties": {
"orderNo": {
"type": "integer"
},
"orderName": {
"type": "keyword"
}
}
}
# 现映射
{
"properties": {
"productList": {
"type": "nested",
"properties": {
"id": {
"type": "integer"
},
"productName": {
"type": "keyword"
},
"orderId": {
"type": "integer"
}
}
},
"orderNo": {
"type": "integer"
},
"orderName": {
"type": "keyword"
}
}
}
修改adapter适配器表映射文件为
dataSourceKey: defaultDS
destination: example
groupId: g1
esMapping:
_index: canal_test_order
_id: _id
_type: _doc
upsert: true
# pk: id
sql: "SELECT
a.id AS _id,
a.order_no AS orderNo,
a.order_name AS orderName,
concat( '[', l.productList, ']' ) AS productList
FROM
t_order a
LEFT JOIN (
SELECT order_id,
group_concat( JSON_OBJECT( 'id', id,'productName',product_name, 'orderId', order_id )) AS productList
FROM t_order_product
GROUP BY order_id ) l
ON a.id = l.order_id"
objFields: # 注意开启
productList: object # list中数据为对象型
etlCondition: "where a.c_time>={}"
commitBatch: 3000
插入一条订单表数据es结果
{
"took": 0,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 1.0,
"hits": [
{
"_index": "canal_test_order",
"_type": "_doc",
"_id": "8",
"_score": 1.0,
"_source": {
"orderNo": 112233,
"orderName": "112233",
"productList": null
}
}
]
}
}
此时productList为空
向产品表添加两条产品,含关联此订单表的id
{
"took": 0,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 1.0,
"hits": [
{
"_index": "canal_test_order",
"_type": "_doc",
"_id": "8",
"_score": 1.0,
"_source": {
"orderNo": 112233,
"orderName": "112233",
"productList": [
{
"orderId": 8,
"id": 21,
"productName": "2221"
},
{
"orderId": 8,
"id": 22,
"productName": "11"
}
]
}
}
]
}
}