mysql+canal+adapter+es实现数据同步

一、版本

MySQL:5.7
canal:1.1.5
adapter:1.1.5
elasticsearch:7.4.2

二、MySQL开启binlog

1、MySQL配置文件

[mysqld]
log-bin=mysql-bin # 开启 binlog
binlog-format=ROW # 选择 ROW 模式
server_id=1 # 配置 MySQL replaction 需要定义,不要和 canal 的 slaveId 重复

#注:如果canal监听的是mysql slave库, 请在slave库加上如下配置
--logs-slave-updates

2、查看配置是否生效

# 是否开启binlog
show variables like 'log_bin';
# 结果
log_bin	ON

# binlog模式
show variables like 'binlog_format';
# 结果
binlog_format	ROW

3、授权canal连接MySQL的账号具有作为MySQL slave的权限,如果有账号可以直接grant

# 创建用户canal及密码设置
CREATE USER canal IDENTIFIED BY 'canal';  
# 赋权
GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'canal'@'%';
# 刷新生效
FLUSH PRIVILEGES;

三、下载canal及adapter

# canal-server
https://github.com/alibaba/canal/releases/download/canal-1.1.5/canal.deployer-1.1.5.tar.gz
# canal-adapter
https://github.com/alibaba/canal/releases/download/canal-1.1.5/canal.adapter-1.1.5.tar.gz

1、解压、展示目录

canal-server

drwxr-xr-x 2 root root     4096 9月  25 11:54 bin
drwxr-xr-x 5 root root     4096 9月  25 11:54 conf
drwxr-xr-x 2 root root     4096 9月  25 11:54 lib
drwxrwxrwx 2 root root     4096 4月  19 2021 log
drwxrwxrwx 2 root root     4096 4月  19 2021 plugin

canal-adapter

drwxr-xr-x 2 root root      4096 9月  25 11:54 bin
drwxrwxrwx 8 root root      4096 4月  19 2021 conf
drwxr-xr-x 2 root root      4096 9月  25 11:54 lib
drwxrwxrwx 2 root root      4096 4月  19 2021 logs
drwxrwxrwx 2 root root      4096 4月  19 2021 plugin

四、编辑canal配置文件

vi conf/example/instance.properties

此处只展示修改的配置

# 伪装成从库的slaveId,不能与MySQL重复
canal.instance.mysql.slaveId=1234
# 数据库的ip:端口
canal.instance.master.address=127.0.0.1:3306
# 数据库用户名密码
canal.instance.dbUsername=canal
canal.instance.dbPassword=canal

五、启动canal

cd bin
sh startup.sh

1、观察日志

canal.log

cd logs/canal
tail -f canal.log

2022-09-25 12:50:43.316 [main] INFO  com.alibaba.otter.canal.deployer.CanalLauncher - ## set default uncaught exception handler
2022-09-25 12:50:43.330 [main] INFO  com.alibaba.otter.canal.deployer.CanalLauncher - ## load canal configurations
2022-09-25 12:50:43.335 [main] INFO  com.alibaba.otter.canal.deployer.CanalStarter - ## start the canal server.
2022-09-25 12:50:43.353 [main] INFO  com.alibaba.otter.canal.deployer.CanalController - ## start the canal server[0.0.1.1(0.0.1.1):11111]
2022-09-25 12:50:43.872 [main] INFO  com.alibaba.otter.canal.deployer.CanalStarter - ## the canal server is running now ......

canal_stdout.log

tail -f canal_stdout.log

OpenJDK 64-Bit Server VM warning: ignoring option PermSize=96m; support was removed in 8.0
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
OpenJDK 64-Bit Server VM warning: UseCMSCompactAtFullCollection is deprecated and will likely be removed in a future release.

如遇到如下报错

OpenJDK 64-Bit Server VM warning: ignoring option PermSize=96m; support was removed in 8.0
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
OpenJDK 64-Bit Server VM warning: UseCMSCompactAtFullCollection is deprecated and will likely be removed in a future release.

The stack size specified is too small, Specify at least 384k
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.

调整startup.sh脚本的-Xss参数

vi bin/startup.sh
# 我这里调整到-Xss512k

if [ -n "$str" ]; then
        JAVA_OPTS="-server -Xms2048m -Xmx3072m -Xmn1024m -XX:SurvivorRatio=2 -XX:PermSize=96m -XX:MaxPermSize=256m -Xss512k -XX:-UseAdaptiveSizePolicy -XX:MaxTenuringThreshold=15 -XX:+DisableExplicitGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseCMSCompactAtFullCollection -XX:+UseFastAccessorMethods -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError"
else
        JAVA_OPTS="-server -Xms1024m -Xmx1024m -XX:NewSize=256m -XX:MaxNewSize=256m -XX:MaxPermSize=128m "

example.log

cd logs/example
tail -f example.log

2022-09-25 12:50:43.848 [main] INFO  c.a.otter.canal.instance.spring.CanalInstanceWithSpring - start CannalInstance for 1-example 
2022-09-25 12:50:43.854 [main] WARN  c.a.o.canal.parse.inbound.mysql.dbsync.LogEventConvert - --> init table filter : ^.*\..*$
2022-09-25 12:50:43.854 [main] WARN  c.a.o.canal.parse.inbound.mysql.dbsync.LogEventConvert - --> init table black filter : ^mysql\.slave_.*$
2022-09-25 12:50:43.857 [main] INFO  c.a.otter.canal.instance.core.AbstractCanalInstance - start successful....
2022-09-25 12:50:44.323 [destination = example , address = /127.0.0.1:3306 , EventParser] WARN  c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - ---> begin to find start position, it will be long time for reset or first position
2022-09-25 12:50:44.324 [destination = example , address = /127.0.0.1:3306 , EventParser] WARN  c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - prepare to find start position just show master status
2022-09-25 12:53:24.653 [destination = example , address = /127.0.0.1:3306 , EventParser] WARN  c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - ---> find start position successfully, EntryPosition[included=false,journalName=mysql-bin.000005,position=115925585,serverId=1,gtid=,timestamp=1663923876000] cost : 160325ms , the next step is binlog dump

六、编辑adapter配置文件

cd conf
vi application.yml

此处只展示修改的配置

canal.conf:
  consumerProperties:
  	# 单机配置属性
  	# canal.tcp.server.host: 127.0.0.1:11111
  # 此配置数据库信息与canal-server配置的数据库信息相同
  srcDataSources:
  	defaultDs:
  	  url: jbdc:mysql://127.0.0.1:3306/canal_test
  	  username: canal
  	  password: canal
  # 配置 ES信息
  canalAdapters:
    groups:
      outerAdapters: 
      	- name: logger
      	- name: es7
      	  hosts: http://127.0.0.1:9200
      	  properties:
      	  	mode: rest
      	  	security.auth: es账号:es密码
      	  	cluster.name: es的名字

适配器表映射文件

cd conf/es7/
cp mytest_user.yml canal_test_order.yml
rm biz_order.yml customer.yml mytest_user.yml
vi canal_test_order.yml


dataSourceKey: defaultDS
destination: example
groupId: g1
esMapping:
  _index: canal_test_order
  _id: _id
# 这个必须要加,源文件没有
  _type: _doc
  upsert: true
#  pk: id
  sql: "select
        a.id as _id,
        a.order_no as orderNo,
        a.order_name as orderName
        from t_order a"
#  objFields:
#    _labels: array:;
  etlCondition: "where a.c_time>={}"
  commitBatch: 3000

七、启动adapter

cd bin
sh startup.sh

1、观察日志

cd logs
ls

发现没有日志

坑爹的地方

ps -ef|grep canal-adapter

发现没有进程启动

在bin目录下只有一个adapter.pid文件

想了半天,关键是连个日志都没有,不知道哪里报错

最后突然想到是不是脚本也需要改-Xss虚拟机参数

于是把-Xss由256k改为512k

vi startup.sh

if [ -n "$str" ]; then
        JAVA_OPTS="-server -Xms2048m -Xmx3072m -Xmn1024m -XX:SurvivorRatio=2 -XX:PermSize=96m -XX:MaxPermSize=256m -Xss512k -XX:-UseAdaptiveSizePolicy -XX:MaxTenuringThreshold=15 -XX:+DisableExplicitGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseCMSCompactAtFullCollection -XX:+UseFastAccessorMethods -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError"
else
        JAVA_OPTS="-server -Xms1024m -Xmx1024m -XX:NewSize=256m -XX:MaxNewSize=256m -XX:MaxPermSize=128m "

再次启动出现日志

cd logs
ls

adapter
cd adapter
ls
adapter.log
tail -f adapter.log

此时发现程序虽然运行起来了,但是有报错如下

2022-09-25 14:17:57.838 [main] INFO  c.a.o.c.client.adapter.es.core.config.ESSyncConfigLoader - ## ES mapping config loaded
2022-09-25 14:17:57.997 [main] ERROR c.a.o.canal.adapter.launcher.loader.CanalAdapterLoader - Load canal adapter: es7 failed
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassCastException: com.alibaba.druid.pool.DruidDataSource cannot be cast to com.alibaba.druid.pool.DruidDataSource
	at com.alibaba.otter.canal.client.adapter.es7x.ES7xAdapter.init(ES7xAdapter.java:54) ~[client-adapter.es7x-1.1.5-jar-with-dependencies.jar:na]
	at com.alibaba.otter.canal.adapter.launcher.loader.CanalAdapterLoader.loadAdapter(CanalAdapterLoader.java:225) [client-adapter.launcher-1.1.5.jar:na]
	at com.alibaba.otter.canal.adapter.launcher.loader.CanalAdapterLoader.init(CanalAdapterLoader.java:56) [client-adapter.launcher-1.1.5.jar:na]
	at com.alibaba.otter.canal.adapter.launcher.loader.CanalAdapterService.init(CanalAdapterService.java:60) [client-adapter.launcher-1.1.5.jar:na]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_322]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_322]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_322]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_322]
	at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor$LifecycleElement.invoke(InitDestroyAnnotationBeanPostProcessor.java:365) [spring-beans-5.0.5.RELEASE.jar:5.0.5.RELEASE]
	at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor$LifecycleMetadata.invokeInitMethods(InitDestroyAnnotationBeanPostProcessor.java:308) [spring-beans-5.0.5.RELEASE.jar:5.0.5.RELEASE]
	at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor.postProcessBeforeInitialization(InitDestroyAnnotationBeanPostProcessor.java:135) [spring-beans-5.0.5.RELEASE.jar:5.0.5.RELEASE]
	at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.applyBeanPostProcessorsBeforeInitialization(AbstractAutowireCapableBeanFactory.java:422) [spring-beans-5.0.5.RELEASE.jar:5.0.5.RELEASE]

解决办法,更改源码

下载canal-adapter源码

修改client-adapter/escore/pom.xml为

原
        <dependency>
            <groupId>com.alibaba</groupId>
            <artifactId>druid</artifactId>
        </dependency>

改成
        <dependency>
            <groupId>com.alibaba</groupId>
            <artifactId>druid</artifactId>
            <scope>provided</scope>
        </dependency>

重新打包编译生成client-adapter.es7x-1.1.5-jar-with-dependencies.jar

放入canal-adapter的plugin目录下,替换原jar

重新启动,日志如下

2022-09-25 14:23:27.903 [main] INFO  o.s.jmx.export.annotation.AnnotationMBeanExporter - Located managed bean 'configurationPropertiesRebinder': registering with JMX server as MBean [org.springframework.cloud.context.properties:name=configurationPropertiesRebinder,context=8ad6665,type=ConfigurationPropertiesRebinder]
2022-09-25 14:23:27.910 [main] INFO  c.a.o.canal.adapter.launcher.loader.CanalAdapterService - ## syncSwitch refreshed.
2022-09-25 14:23:27.910 [main] INFO  c.a.o.canal.adapter.launcher.loader.CanalAdapterService - ## start the canal client adapters.
2022-09-25 14:23:27.912 [main] INFO  c.a.otter.canal.client.adapter.support.ExtensionLoader - extension classpath dir: /Users/fengchong/canal/canal-adapter/plugin
2022-09-25 14:23:27.958 [main] INFO  c.a.o.canal.adapter.launcher.loader.CanalAdapterLoader - Load canal adapter: logger succeed
2022-09-25 14:23:28.155 [main] INFO  c.a.o.c.client.adapter.es.core.config.ESSyncConfigLoader - ## Start loading es mapping config ... 
2022-09-25 14:23:28.175 [main] INFO  c.a.o.c.client.adapter.es.core.config.ESSyncConfigLoader - ## ES mapping config loaded
2022-09-25 14:23:28.341 [main] INFO  c.a.o.canal.adapter.launcher.loader.CanalAdapterLoader - Load canal adapter: es7 succeed
2022-09-25 14:23:28.345 [main] INFO  c.alibaba.otter.canal.connector.core.spi.ExtensionLoader - extension classpath dir: /Users/fengchong/canal/canal-adapter/plugin
2022-09-25 14:23:28.357 [main] INFO  c.a.o.canal.adapter.launcher.loader.CanalAdapterLoader - Start adapter for canal-client mq topic: example-g1 succeed
2022-09-25 14:23:28.357 [main] INFO  c.a.o.canal.adapter.launcher.loader.CanalAdapterService - ## the canal client adapters are running now ......
2022-09-25 14:23:28.357 [Thread-4] INFO  c.a.otter.canal.adapter.launcher.loader.AdapterProcessor - =============> Start to connect destination: example <=============
2022-09-25 14:23:28.361 [main] INFO  org.apache.coyote.http11.Http11NioProtocol - Starting ProtocolHandler ["http-nio-8081"]
2022-09-25 14:23:28.363 [main] INFO  org.apache.tomcat.util.net.NioSelectorPool - Using a shared selector for servlet write/read
2022-09-25 14:23:28.374 [main] INFO  o.s.boot.web.embedded.tomcat.TomcatWebServer - Tomcat started on port(s): 8081 (http) with context path ''
2022-09-25 14:23:28.379 [main] INFO  c.a.otter.canal.adapter.launcher.CanalAdapterApplication - Started CanalAdapterApplication in 2.602 seconds (JVM running for 3.131)
2022-09-25 14:23:28.402 [Thread-4] INFO  c.a.otter.canal.adapter.launcher.loader.AdapterProcessor - =============> Subscribe destination: example succeed <=============

启动成功!

八、测试

在MySQL手动插入一条数据

adapter.log打印日志如下

2022-09-25 15:20:25.519 [pool-2-thread-1] INFO  c.a.o.canal.client.adapter.logger.LoggerAdapterExample - DML: {"data":[{"id":7,"order_no":1122,"order_name":"2211"}],"database":"canal_test","destination":"example","es":1664090425000,"groupId":"g1","isDdl":false,"old":null,"pkNames":["id"],"sql":"","table":"t_order","ts":1664090425518,"type":"INSERT"}
2022-09-25 15:20:25.520 [pool-2-thread-1] DEBUG c.a.o.canal.client.adapter.es.core.service.ESSyncService - DML: {"data":[{"id":7,"order_no":1122,"order_name":"2211"}],"database":"canal_test","destination":"example","es":1664090425000,"groupId":"g1","isDdl":false,"old":null,"pkNames":["id"],"sql":"","table":"t_order","ts":1664090425518,"type":"INSERT"} 
Affected indexes: canal_test_order 

查看elasticsearch数据

# get 127.0.0.1:9200/canal_test_order/_search

{
    "took": 0,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 1,
            "relation": "eq"
        },
        "max_score": 1.0,
        "hits": [
            {
                "_index": "canal_test_order",
                "_type": "_doc",
                "_id": "1",
                "_score": 1.0,
                "_source": {
                    "orderNo": 111111,
                    "orderName": "11111"
                }
            }
        ]
    }
}

九、添加一张表关联

删除原es索引

重新建立新索引,含一对多关系,如一笔订单含有多件产品

# 原映射
{
    "properties": {
        "orderNo": {
            "type": "integer"
        },
        "orderName": {
            "type": "keyword"
        }
    }
}

# 现映射

{
    "properties": {
        "productList": {
            "type": "nested",
            "properties": {
                "id": {
                    "type": "integer"
                },
                "productName": {
                    "type": "keyword"
                },
                "orderId": {
                    "type": "integer"
                }
            }
        },
        "orderNo": {
            "type": "integer"
        },
        "orderName": {
            "type": "keyword"
        }
    }
}

修改adapter适配器表映射文件为

dataSourceKey: defaultDS
destination: example
groupId: g1
esMapping:
  _index: canal_test_order
  _id: _id
  _type: _doc
  upsert: true
#  pk: id
  sql: "SELECT
        a.id AS _id,
        a.order_no AS orderNo,
        a.order_name AS orderName,
        concat( '[', l.productList, ']' ) AS productList
        FROM
        t_order a
        LEFT JOIN (
        SELECT order_id,
        group_concat( JSON_OBJECT( 'id', id,'productName',product_name, 'orderId', order_id )) AS productList
        FROM t_order_product
        GROUP BY order_id ) l
        ON a.id = l.order_id"
  objFields:  # 注意开启
   productList: object # list中数据为对象型
  etlCondition: "where a.c_time>={}"
  commitBatch: 3000

插入一条订单表数据es结果

{
    "took": 0,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 1,
            "relation": "eq"
        },
        "max_score": 1.0,
        "hits": [
            {
                "_index": "canal_test_order",
                "_type": "_doc",
                "_id": "8",
                "_score": 1.0,
                "_source": {
                    "orderNo": 112233,
                    "orderName": "112233",
                    "productList": null
                }
            }
        ]
    }
}

此时productList为空

向产品表添加两条产品,含关联此订单表的id

{
    "took": 0,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 1,
            "relation": "eq"
        },
        "max_score": 1.0,
        "hits": [
            {
                "_index": "canal_test_order",
                "_type": "_doc",
                "_id": "8",
                "_score": 1.0,
                "_source": {
                    "orderNo": 112233,
                    "orderName": "112233",
                    "productList": [
                        {
                            "orderId": 8,
                            "id": 21,
                            "productName": "2221"
                        },
                        {
                            "orderId": 8,
                            "id": 22,
                            "productName": "11"
                        }
                    ]
                }
            }
        ]
    }
}
posted @ 2022-09-25 16:55  大列巴同学  阅读(11)  评论(0编辑  收藏  举报  来源