Deploying canal-admin 1.1.6, deployer, and adapter

Using canal

Introduction

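canal [kə'næl] means waterway, pipeline, or ditch. Its main purpose is to parse MySQL incremental logs (the binlog) and provide subscription to and consumption of incremental data.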

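Use cases built on log-based incremental subscription and consumption include:

  • Database mirroring
  • Real-time database backup
  • Index building and real-time maintenance (splitting heterogeneous indexes, inverted indexes, etc.)
  • Business cache refresh
  • Incremental data processing with business logic

canal currently supports these source MySQL versions: 5.1.x, 5.5.x, 5.6.x, 5.7.x, and 8.0.x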

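How it works

How MySQL master-slave replication works

  • The MySQL master writes data changes to its binary log (the records are called binary log events and can be viewed with show binlog events)
  • The MySQL slave copies the master's binary log events to its relay log
  • The MySQL slave replays the events in the relay log, applying the data changes to its own data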

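How canal works

  • canal emulates the MySQL slave interaction protocol: it presents itself as a MySQL slave and sends a dump request to the MySQL master
  • The MySQL master receives the dump request and starts pushing the binary log to the slave (that is, to canal)
  • canal parses the binary log objects (originally a byte stream)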
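
To make the last step a little more concrete, here is a minimal Python sketch of decoding the fixed-size header that precedes every event in the byte stream. It is not canal's parser (canal is written in Java); the names `BinlogEventHeader` and `parse_event_header` are invented for this illustration, and the 19-byte little-endian layout is assumed from the MySQL v4 binlog event header format.

```python
import struct
from typing import NamedTuple


class BinlogEventHeader(NamedTuple):
    """Fields of the common binlog event header (assumed v4 layout)."""
    timestamp: int   # seconds since the epoch when the event was written
    event_type: int  # numeric type code of the event
    server_id: int   # server_id of the MySQL instance that wrote the event
    event_size: int  # total size of the event in bytes, header included
    log_pos: int     # position of the next event in the binlog file
    flags: int


# "<" = little-endian with no padding: 4 + 1 + 4 + 4 + 4 + 2 bytes
HEADER_FORMAT = "<IBIIIH"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def parse_event_header(data: bytes) -> BinlogEventHeader:
    """Decode the header from the first HEADER_SIZE bytes of an event."""
    if len(data) < HEADER_SIZE:
        raise ValueError(f"need at least {HEADER_SIZE} bytes, got {len(data)}")
    return BinlogEventHeader(*struct.unpack_from(HEADER_FORMAT, data))


if __name__ == "__main__":
    # Hand-built sample bytes, not data captured from a real server.
    sample = struct.pack(HEADER_FORMAT, 1664977320, 15, 1, 119, 123, 0)
    print(parse_event_header(sample))
```

The event body that follows the header depends on the event type, which is where the bulk of a real parser's work lies.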

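This article uses MySQL 8.0.27, so after extracting the packages the bundled MySQL driver has to be replaced with one that matches; MySQL 8 has binlog enabled by default.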

# mysql -u root -p -h 127.0.0.1 -P 3506
mysql> use mysql;
mysql> select user from user; 
+------------------+
| user             |
+------------------+
| root             |
| mysql.infoschema |
| mysql.session    |
| mysql.sys        |
+------------------+
mysql> create user 'mantishell'@'%' identified by '123456';

This creates a user named mantishell with the password 123456. The % means the user may log in from any IP.

Create a database named canal_manager, then grant privileges on it:

mysql> create database canal_manager;
mysql> grant all privileges on canal_manager.* to 'mantishell'@'%';
mysql> flush privileges;

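Grant: grant all privileges on <database> to '<user>'@'<IP address>';

Revoke: revoke all privileges on <database> from '<user>'@'<IP address>';

all privileges covers every privilege except GRANT OPTION; you can also grant individual privileges instead:

grant insert on canal_manager.* to '<user>'@'<IP address>'; (allows only inserts into the canal_manager database; canal_manager.* means all tables in canal_manager)

Check whether binlog is enabled: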

mysql> show variables like 'log_bin';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| log_bin       | ON    |
+---------------+-------+
# ON means binlog is enabled
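
If you want to script this check, the sketch below validates a dictionary of server variables such as the one you would build from `show variables` output. The function name `check_binlog_settings` is hypothetical. The `binlog_format = ROW` requirement is an assumption taken from canal's recommended MySQL settings rather than from the steps in this article, so drop that check if it does not apply to your setup.

```python
from typing import Dict, List


def check_binlog_settings(variables: Dict[str, str]) -> List[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    # log_bin must be ON, as shown in the query result above.
    if variables.get("log_bin", "").upper() != "ON":
        problems.append("log_bin is not ON: the binary log is disabled")
    # Assumption: canal is normally run against ROW-format binlogs.
    if variables.get("binlog_format", "").upper() != "ROW":
        problems.append("binlog_format is not ROW")
    return problems


if __name__ == "__main__":
    # Values typed in by hand for illustration, not read from a server.
    for problem in check_binlog_settings({"log_bin": "ON", "binlog_format": "MIXED"}):
        print(problem)
```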

Deploying canal

1. Configure canal.admin-1.1.6

canal admin provides a friendly web UI.

  1. Run conf/canal_manager.sql to initialize the canal_manager database:

    $ mysql -u mantishell -p -h 127.0.0.1 -P 3506 my < ./canal_manager.sql

  2. Edit conf/application.yml

    server:
      port: 8089
    spring:
      jackson:
        date-format: yyyy-MM-dd HH:mm:ss
        time-zone: GMT+8
    
    spring.datasource:
      address: 127.0.0.1:3506 # address of the admin database (the MySQL instance used above listens on 3506)
      database: canal_manager # database name
      username: mantishell # database account
      password: 123456     # database password
      driver-class-name: com.mysql.jdbc.Driver
      url: jdbc:mysql://${spring.datasource.address}/${spring.datasource.database}?useUnicode=true&characterEncoding=UTF-8&useSSL=false
      hikari:
        maximum-pool-size: 30
        minimum-idle: 1
    
    canal:
      adminUser: admin   # account for the web login
      adminPasswd: 123456 # default password for the web login
    

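    Notes:

    1. The driver must match the MySQL version.

    2. If you get the error ConnectionException: Public Key Retrieval is not allowed, append &allowPublicKeyRetrieval=true to the connection string.

  3. Open http://localhost:8089 and log in, then create a cluster named 192.168.10.6 under cluster management. Note that this name is used later in the deployer configuration. Then choose Operations -> Main Config -> Load Template -> Save.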

    #################################################
    ######### 		common argument		#############
    #################################################
    # tcp bind ip
    canal.ip =
    # register ip to zookeeper
    canal.register.ip = 
    canal.port = 11111
    canal.metrics.pull.port = 11112
    # canal instance user/passwd
    # canal.user = canal
    # canal.passwd = E3619321C1A937C46A0D8BD1DAC39F93B27D4458
    
    # canal admin config 
    canal.admin.manager = 127.0.0.1:8089
    canal.admin.port = 11110
    canal.admin.user = admin
    canal.admin.passwd = 4ACFE3202A5FF5CF467898FC58AAB1D615029441
    # admin auto register
    #canal.admin.register.auto = true
    #canal.admin.register.cluster =
    #canal.admin.register.name =
    
    canal.zkServers =
    # flush data to zk
    canal.zookeeper.flush.period = 1000
    canal.withoutNetty = false
    # tcp, kafka, rocketMQ, rabbitMQ
    canal.serverMode = tcp
    # flush meta cursor/parse position to file
    canal.file.data.dir = ${canal.conf.dir}
    canal.file.flush.period = 1000
    ## memory store RingBuffer size, should be Math.pow(2,n)
    canal.instance.memory.buffer.size = 16384
    ## memory store RingBuffer used memory unit size , default 1kb
    canal.instance.memory.buffer.memunit = 1024 
    ## memory store gets mode used MEMSIZE or ITEMSIZE
    canal.instance.memory.batch.mode = MEMSIZE
    canal.instance.memory.rawEntry = true
    
    ## detecting config
    canal.instance.detecting.enable = false
    #canal.instance.detecting.sql = insert into retl.xdual values(1,now()) on duplicate key update x=now()
    canal.instance.detecting.sql = select 1
    canal.instance.detecting.interval.time = 3
    canal.instance.detecting.retry.threshold = 3
    canal.instance.detecting.heartbeatHaEnable = false
    
    # support maximum transaction size, more than the size of the transaction will be cut into multiple transactions delivery
    canal.instance.transaction.size =  1024
    # mysql fallback connected to new master should fallback times
    canal.instance.fallbackIntervalInSeconds = 60
    
    # network config
    canal.instance.network.receiveBufferSize = 16384
    canal.instance.network.sendBufferSize = 16384
    canal.instance.network.soTimeout = 30
    
    # binlog filter config
    canal.instance.filter.druid.ddl = true
    canal.instance.filter.query.dcl = false
    canal.instance.filter.query.dml = false
    canal.instance.filter.query.ddl = false
    canal.instance.filter.table.error = false
    canal.instance.filter.rows = false
    canal.instance.filter.transaction.entry = false
    canal.instance.filter.dml.insert = false
    canal.instance.filter.dml.update = false
    canal.instance.filter.dml.delete = false
    
    # binlog format/image check
    canal.instance.binlog.format = ROW,STATEMENT,MIXED 
    canal.instance.binlog.image = FULL,MINIMAL,NOBLOB
    
    # binlog ddl isolation
    canal.instance.get.ddl.isolation = false
    
    # parallel parser config
    canal.instance.parser.parallel = true
    ## concurrent thread number, default 60% available processors, suggest not to exceed Runtime.getRuntime().availableProcessors()
    #canal.instance.parser.parallelThreadSize = 16
    ## disruptor ringbuffer size, must be power of 2
    canal.instance.parser.parallelBufferSize = 256
    
    # table meta tsdb info
    canal.instance.tsdb.enable = true
    canal.instance.tsdb.dir = ${canal.file.data.dir:../conf}/${canal.instance.destination:}
    canal.instance.tsdb.url = jdbc:h2:${canal.instance.tsdb.dir}/h2;CACHE_SIZE=1000;MODE=MYSQL;
    canal.instance.tsdb.dbUsername = canal
    canal.instance.tsdb.dbPassword = canal
    # dump snapshot interval, default 24 hour
    canal.instance.tsdb.snapshot.interval = 24
    # purge snapshot expire , default 360 hour(15 days)
    canal.instance.tsdb.snapshot.expire = 360
    
    #################################################
    ######### 		destinations		#############
    #################################################
    # configuration
    canal.destinations = example
    # conf root dir
    canal.conf.dir = ../conf
    # auto scan instance dir add/remove and start/stop instance
    canal.auto.scan = true
    canal.auto.scan.interval = 5
    # set this value to 'true' means that when binlog pos not found, skip to latest.
    # WARN: pls keep 'false' in production env, or if you know what you want.
    canal.auto.reset.latest.pos.mode = false
    
    canal.instance.tsdb.spring.xml = classpath:spring/tsdb/h2-tsdb.xml
    #canal.instance.tsdb.spring.xml = classpath:spring/tsdb/mysql-tsdb.xml
    
    canal.instance.global.mode = manager
    canal.instance.global.lazy = false
    canal.instance.global.manager.address = ${canal.admin.manager}
    #canal.instance.global.spring.xml = classpath:spring/memory-instance.xml
    canal.instance.global.spring.xml = classpath:spring/file-instance.xml
    #canal.instance.global.spring.xml = classpath:spring/default-instance.xml
    
    ##################################################
    ######### 	      MQ Properties      #############
    ##################################################
    # aliyun ak/sk , support rds/mq
    canal.aliyun.accessKey =
    canal.aliyun.secretKey =
    canal.aliyun.uid=
    
    canal.mq.flatMessage = false
    canal.mq.canalBatchSize = 50
    canal.mq.canalGetTimeout = 100
    # Set this value to "cloud", if you want open message trace feature in aliyun.
    canal.mq.accessChannel = local
    
    canal.mq.database.hash = false
    canal.mq.send.thread.size = 30
    canal.mq.build.thread.size = 8
    
    ##################################################
    ######### 		     Kafka 		     #############
    ##################################################
    kafka.bootstrap.servers = 127.0.0.1:6667
    kafka.acks = all
    kafka.compression.type = none
    kafka.batch.size = 16384
    kafka.linger.ms = 1
    kafka.max.request.size = 1048576
    kafka.buffer.memory = 33554432
    kafka.max.in.flight.requests.per.connection = 1
    kafka.retries = 0
    
    kafka.kerberos.enable = false
    kafka.kerberos.krb5.file = "../conf/kerberos/krb5.conf"
    kafka.kerberos.jaas.file = "../conf/kerberos/jaas.conf"
    
    ##################################################
    ######### 		    RocketMQ	     #############
    ##################################################
    rocketmq.producer.group = test
    rocketmq.enable.message.trace = false
    rocketmq.customized.trace.topic =
    rocketmq.namespace =
    rocketmq.namesrv.addr = 127.0.0.1:9876
    rocketmq.retry.times.when.send.failed = 0
    rocketmq.vip.channel.enabled = false
    rocketmq.tag = 
    
    ##################################################
    ######### 		    RabbitMQ	     #############
    ##################################################
    rabbitmq.host =
    rabbitmq.virtual.host =
    rabbitmq.exchange =
    rabbitmq.username =
    rabbitmq.password =
    rabbitmq.deliveryMode =
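
The template's comments say that `canal.instance.memory.buffer.size` should be a power of 2 and that `canal.instance.parser.parallelBufferSize` must be one. If you edit those values, a small check like the Python sketch below can catch mistakes before you save. The helper names are invented for this example, the parser handles only simple `key = value` lines, and the memory estimate assumes that buffer size multiplied by memunit bounds the store in MEMSIZE mode.

```python
from typing import Dict, List

POWER_OF_TWO_KEYS = (
    "canal.instance.memory.buffer.size",
    "canal.instance.parser.parallelBufferSize",
)


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple 'key = value' lines, skipping blanks and '#' comment lines."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def is_power_of_two(n: int) -> bool:
    # A power of two has exactly one bit set, so n & (n - 1) clears it to 0.
    return n > 0 and n & (n - 1) == 0


def check_buffer_sizes(props: Dict[str, str]) -> List[str]:
    """Return the keys whose configured value is not a power of two."""
    return [k for k in POWER_OF_TWO_KEYS
            if k in props and not is_power_of_two(int(props[k]))]


def buffer_memory_bytes(props: Dict[str, str]) -> int:
    """Assumed upper bound of the memory store: buffer.size * buffer.memunit."""
    return (int(props["canal.instance.memory.buffer.size"])
            * int(props["canal.instance.memory.buffer.memunit"]))
```

With the template values (16384 and 1024), `buffer_memory_bytes` gives 16384 × 1024 = 16,777,216 bytes, i.e. 16 MiB.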
    

2. Configure canal.deployer-1.1.6

  • canal_local.properties holds the configuration for registering with admin
  • canal.properties holds the local configuration
  1. Edit conf/canal_local.properties

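    # register ip: the IP this server is shown under in admin; it is also used as the name
    canal.register.ip = 
    
    # canal admin config: the address used to reach admin
    canal.admin.manager = 127.0.0.1:8089
    canal.admin.port = 11110
    canal.admin.user = admin
    # hashed password (comment kept on its own line, because a trailing "# ..." in a properties file becomes part of the value)
    canal.admin.passwd = 4ACFE3202A5FF5CF467898FC58AAB1D615029441
    # admin auto register: register with admin automatically on startup
    canal.admin.register.auto = true
    # cluster name: the cluster must already exist in admin, otherwise the deployer fails to start
    canal.admin.register.cluster = 192.168.10.6
    # node name
    canal.admin.register.name = 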
    
  2. Start canal-deployer

    # start with the configuration from canal_local.properties
    $ ./bin/startup.sh local
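
The `canal.admin.passwd` value above is a hash, not the plaintext. Its 40 hex characters look like MySQL's legacy `PASSWORD()` format without the leading `*`, which is the SHA-1 of the SHA-1 of the plaintext. MySQL 8.0 no longer ships the `PASSWORD()` function, so here is a minimal Python sketch for computing such a value yourself. The function name `mysql_native_hash` is made up for this example, and the assumption that canal expects exactly this format comes from the value matching the hash of `admin`, so verify it against your canal version.

```python
import hashlib


def mysql_native_hash(password: str) -> str:
    """Return UPPER(HEX(SHA1(SHA1(password)))), without the leading '*'."""
    first_pass = hashlib.sha1(password.encode("utf-8")).digest()
    return hashlib.sha1(first_pass).hexdigest().upper()


if __name__ == "__main__":
    print(mysql_native_hash("admin"))
```

If the password that the deployer must present to admin is changed, the value in canal_local.properties presumably has to be regenerated from the new plaintext in the same way.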
    

3. Configure canal-adapter-1.1.6

  1. Edit bootstrap.yml

    canal:
      manager:
        jdbc:
          url: jdbc:mysql://127.0.0.1:3506/canal_manager?useUnicode=true&characterEncoding=UTF-8&useSSL=false&allowPublicKeyRetrieval=true
          username: mantishell
          password: 123456
    
  2. Edit application.yml

      srcDataSources:
        defaultDS:
          url: jdbc:mysql://127.0.0.1:3506/src_db?useUnicode=true&characterEncoding=UTF-8&useSSL=false&allowPublicKeyRetrieval=true
          username: root
          password: 123456
      canalAdapters:
      - instance: example # canal instance Name or mq topic name
        groups:
        - groupId: g1
          outerAdapters:
          - name: logger
          - name: rdb
            # key must match the outerAdapterKey value in the yml file under rdb
            key: mysql1
            properties:
              jdbc.driverClassName: com.mysql.cj.jdbc.Driver
              # destination database
              jdbc.url: jdbc:mysql://192.168.10.10:3506/dest_db?useUnicode=true&characterEncoding=UTF-8&useSSL=false&allowPublicKeyRetrieval=true
              jdbc.username: root
              jdbc.password: 123456
    
  3. Edit rdb/*.yml

    dataSourceKey: defaultDS
    destination: example
    groupId: g1
    outerAdapterKey: mysql1
    concurrent: true
    dbMapping:
      database: src_db
      table: user
      targetTable: dest_db.user
      targetPk:
        id: id
      mapAll: true
    #  targetColumns:
    #    id:
    #    name:
    #    role_id:
    #    c_time:
    #    test1:
    #  etlCondition: "where c_time>={}"
      commitBatch: 3000 # batch commit size
    
  4. Run ./startup.sh

  5. In canal-admin, go to Instance management, create a new Instance -> Load Template, edit the template, and save it

    ## mysql serverId, v1.0.26+ will autoGen; this is the replica ID canal uses, and it must differ from the master/replica server IDs
    # canal.instance.mysql.slaveId=
    # position info: address of the server that holds the binlog, i.e. the MySQL host and port to listen to
    canal.instance.master.address=192.168.10.6:3506
    # binlog file name; SQL command: show master status;
    canal.instance.master.journal.name=binlog.000015
    # position in the binlog file to start listening from
    canal.instance.master.position=
    canal.instance.master.timestamp=
    ...
    # username/password: MySQL credentials for the server that holds the binlog
    canal.instance.dbUsername=root
    canal.instance.dbPassword=123456
    ...
    # table regex: databases/tables to process
    canal.instance.filter.regex=my_db\\..*
    # table black regex: databases/tables to skip
    canal.instance.filter.black.regex=mysql\\.slave_.*
    

    Check the MySQL master status:
    mysql> show master status;
    +---------------+----------+--------------+------------------+-------------------+
    | File          | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
    +---------------+----------+--------------+------------------+-------------------+
    | binlog.000015 |      156 |              |                  |                   |
    +---------------+----------+--------------+------------------+-------------------+

    Key settings:

    • canal.instance.mysql.slaveId= # replica ID used by canal; must differ from the master/replica server IDs
    • canal.instance.master.address=127.0.0.1:3306 # MySQL host and port to listen to
    • canal.instance.master.journal.name=binlog.000015 # binlog file name
    • canal.instance.master.position=156 # position in the binlog file to start listening from
    • canal.instance.master.timestamp=
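
The two filter settings in the instance template are easy to get wrong, so here is a rough Python sketch for trying patterns against `schema.table` names before saving. It rests on several assumptions: that the doubled backslash in the properties text is read as a single backslash, that each setting is a comma-separated list of patterns, that a pattern must match the whole `schema.table` name, and that the black list wins over the include list. The function names are invented, and Python's `re` stands in for the regex engine canal actually uses.

```python
import re


def unescape_properties_value(raw: str) -> str:
    """Collapse the doubled backslash used in .properties files into one."""
    return raw.replace("\\\\", "\\")


def matches_any(name: str, patterns: str) -> bool:
    """True if name fully matches one of the comma-separated patterns."""
    return any(re.fullmatch(p.strip(), name)
               for p in patterns.split(",") if p.strip())


def table_selected(name: str, include: str, exclude: str) -> bool:
    """Apply the include list first, then let the black list veto."""
    return matches_any(name, include) and not matches_any(name, exclude)


if __name__ == "__main__":
    # Values copied from the template above, as they appear in the file.
    include = unescape_properties_value(r"my_db\\..*")
    exclude = unescape_properties_value(r"mysql\\.slave_.*")
    for table in ("my_db.user", "mysql.slave_master_info", "other_db.user"):
        print(table, table_selected(table, include, exclude))
```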
posted @ 2022-10-05 21:42  mantishell