OceanBase学习之---MySQL迁移数据到OceanBase
案例说明:
通过mysqldump及datax,从MySQL迁移数据到OceanBase数据库。
一、数据库环境
1、MySQL环境
mysql> select version();
+-----------+
| version() |
+-----------+
| 8.0.23 |
+-----------+
1 row in set (0.00 sec)
mysql> show variables like '%chara%';
+--------------------------+----------------------------------+
| Variable_name | Value |
+--------------------------+----------------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | /usr/local/mysql/share/charsets/ |
+--------------------------+----------------------------------+
8 rows in set (0.01 sec)
2、OceanBase环境
[admin@node203 obproxy-4.2.0.0]$ obclient -h127.0.0.1 -P2883 -uroot@ob_mysql -p -Doceanbase -A
Enter password:
Welcome to the OceanBase. Commands end with ; or \g.
Your OceanBase connection id is 15
Server version: OceanBase_CE 4.2.0.0 (r100010022023081817-0bdf1c0c5674e88c5ae9a8d0ae4f8077465d7fae) (Built Aug 18 2023 17:32:49)
Copyright (c) 2000, 2018, OceanBase and/or its affiliates. All rights reserved.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
obclient [oceanbase]> select version();
+------------------------------+
| version() |
+------------------------------+
| 5.7.25-OceanBase_CE-v4.2.0.0 |
+------------------------------+
1 row in set (0.001 sec)
obclient [oceanbase]> use mysql;
Database changed
obclient [mysql]> show variables like '%chara%';
+--------------------------+---------+
| Variable_name | Value |
+--------------------------+---------+
| character_set_client | utf8mb4 |
| character_set_connection | utf8mb4 |
| character_set_database | utf8mb4 |
| character_set_filesystem | binary |
| character_set_results | utf8mb4 |
| character_set_server | utf8mb4 |
| character_set_system | utf8mb4 |
+--------------------------+---------+
7 rows in set (0.004 sec)
# 配置root用户口令
obclient [mysql]> alter user 'root'@'%' identified by 'beijing';
Query OK, 0 rows affected (0.059 sec)
二、mysqldump迁移数据到OceanBase
通过MySQL下的mysqldump将数据导出为SQL文本格式,将数据备份文件传输到OceanBase数据库主机后,通过source命令导入到OceanBase数据库。
1、导出mysql数据
# 创建测试数据
mysql> use prod;
Database changed
mysql> create table emp( empno int primary key ,ename varchar(10),sal int , deptno int);
Query OK, 0 rows affected (0.03 sec)
mysql> create table dept (deptno int primary key ,dname varchar(10), location varchar(10));
Query OK, 0 rows affected (0.01 sec)
mysql> insert into dept values (10,'caiwu','beijing'),(20,'shichang','shanghai'),(30,'jishu','guangzhou');
Query OK, 3 rows affected (0.01 sec)
Records: 3 Duplicates: 0 Warnings: 0
mysql> insert into emp values(1001,'tom',3000,10),(1002,'jerry',5000,20),(1003,'rose',3000,30);
Query OK, 3 rows affected (0.01 sec)
Records: 3 Duplicates: 0 Warnings: 0
mysql> select * from emp;
+-------+-------+------+--------+
| empno | ename | sal | deptno |
+-------+-------+------+--------+
| 1001 | tom | 3000 | 10 |
| 1002 | jerry | 5000 | 20 |
| 1003 | rose | 3000 | 30 |
+-------+-------+------+--------+
3 rows in set (0.00 sec)
mysql> select * from dept;
+--------+----------+-----------+
| deptno | dname | location |
+--------+----------+-----------+
| 10 | caiwu | beijing |
| 20 | shichang | shanghai |
| 30 | jishu | guangzhou |
+--------+----------+-----------+
3 rows in set (0.00 sec)
# 通过mysqldump导出数据
[root@node202 bin]# ./mysqldump -uroot -p --databases prod > ~/prod.sql
Enter password:
2、将数据导入到OceanBase
# 连接OceanBase数据库
obclient [test]> use prod;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
obclient [prod]> show tables;
+----------------+
| Tables_in_prod |
+----------------+
| t1 |
1 row in set (0.002 sec)
# 导入数据到数据库
obclient [prod]> source /home/admin/prod.sql
Query OK, 0 rows affected (0.000 sec)
.......
Query OK, 0 rows affected (0.000 sec)
# 查看导入数据
obclient [prod]> show tables;
+----------------+
| Tables_in_prod |
+----------------+
| dept |
| emp |
| t1 |
+----------------+
3 rows in set (0.002 sec)
obclient [prod]> select * from emp;
+-------+-------+------+--------+
| empno | ename | sal | deptno |
+-------+-------+------+--------+
| 1001 | tom | 3000 | 10 |
| 1002 | jerry | 5000 | 20 |
| 1003 | rose | 3000 | 30 |
+-------+-------+------+--------+
3 rows in set (0.002 sec)
obclient [prod]> select * from dept;
+--------+----------+-----------+
| deptno | dname | location |
+--------+----------+-----------+
| 10 | caiwu | beijing |
| 20 | shichang | shanghai |
| 30 | jishu | guangzhou |
+--------+----------+-----------+
3 rows in set (0.003 sec)
---如上所示,数据从mysql导入到OceanBase数据库成功。
三、通过datax从MySQL离线导入数据到OceanBase
1、datax简介
DataX 是阿里云 DataWorks 数据集成的开源版本,是阿里巴巴集团内被广泛使用的离线数据同步工具/平台。DataX 实现了包括 MySQL、Oracle、SQLserver、Postgre、HDFS、Hive、ADS、HBase、TableStore(OTS)、MaxCompute(ODPS)、Hologres、DRDS 、OceanBase 等各种异构数据源之间高效的数据同步功能。
2、DataX 框架设计
DataX 采用 Framework + Plugin 架构,将数据源读取和写入抽象称为 Reader/Writer 插件,纳入到整个同步框架中。
3、datax部署安装
1)datax 下载地址:https://datax-opensource.oss-cn-hangzhou.aliyuncs.com/202210/datax.tar.gz
2)解压datax
[root@node203 Oceanbase]# tar zxvf datax.tar.gz
3)datax目录结构
[admin@node203 datax]$ tree -L 1 --filelimit 30
.
├── bin
├── conf
├── job
├── lib
├── log
├── log_perf
├── plugin
├── script
└── tmp
4、测试datax是否安装成功
[root@node203 bin]# python datax.py ../job/job.json
DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.
2023-09-12 14:23:55.924 [main] INFO VMInfo - VMInfo# operatingSystem class => com.sun.management.internal.OperatingSystemImpl
2023-09-12 14:23:55.929 [main] INFO Engine - the machine info =>
osInfo: Oracle Corporation 20 20.0.2+9-78
jvmInfo: Linux amd64 3.10.0-327.el7.x86_64
cpu num: 8
totalPhysicalMemory: -0.00G
freePhysicalMemory: -0.00G
maxFileDescriptorCount: -1
currentOpenFileDescriptorCount: -1
GC Names [G1 Young Generation, G1 Old Generation, G1 Concurrent GC]
MEMORY_NAME | allocation_size | init_size
CodeHeap 'profiled nmethods' | 117.21MB | 2.44MB
G1 Old Gen | 1,024.00MB | 973.00MB
G1 Survivor Space | -0.00MB | 0.00MB
CodeHeap 'non-profiled nmethods' | 117.21MB | 2.44MB
Compressed Class Space | 1,024.00MB | 0.00MB
Metaspace | -0.00MB | 0.00MB
G1 Eden Space | -0.00MB | 51.00MB
CodeHeap 'non-nmethods' | 5.57MB | 2.44MB
2023-09-12 14:23:55.958 [main] INFO Engine -
{
"content":[
{
"reader":{
"name":"streamreader",
"parameter":{
"column":[
{
"type":"string",
"value":"DataX"
},
{
"type":"long",
"value":19890604
},
{
"type":"date",
"value":"1989-06-04 00:00:00"
},
{
"type":"bool",
"value":true
},
{
"type":"bytes",
"value":"test"
}
],
"sliceRecordCount":100000
}
},
"writer":{
"name":"streamwriter",
"parameter":{
"encoding":"UTF-8",
"print":false
}
}
}
],
"setting":{
"errorLimit":{
"percentage":0.02,
"record":0
},
"speed":{
"byte":10485760
}
}
}
.......
2023-09-12 14:24:06.078 [job-0] INFO JobContainer -
任务启动时刻 : 2023-09-12 14:23:55
任务结束时刻 : 2023-09-12 14:24:06
任务总计耗时 : 10s
任务平均流量 : 253.91KB/s
记录写入速度 : 10000rec/s
读出记录总数 : 100000
读写失败总数 : 0
5、创建datax配置模版
[root@node203 bin]# python ./datax.py -r mysqlreader -w oceanbasev10writer
DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.
.......
{
"job": {
"content": [
{
"reader": {
"name": "mysqlreader",
"parameter": {
"column": [],
"connection": [
{
"jdbcUrl": [],
"table": []
}
],
"password": "",
"username": "",
"where": ""
}
},
"writer": {
"name": "oceanbasev10writer",
"parameter": {
"column": [],
"connection": [
{
"jdbcUrl": "",
"table": []
}
],
"obWriteMode": "",
"password": "",
"username": ""
}
}
}
],
"setting": {
"speed": {
"channel": ""
}
}
}
}
---如上所示,可以根据配置模版,修改创建job的配置文件。
6、查看job配置文件
[root@node203 job]# cat myjob.json
{
"job":{
"entry": {
"jvm": "-Xms1024m -Xmx1024m"
},
"setting": {
"speed": {
"channel": 4
},
"errorLimit": {
"record": 0,
"percentage": 0.1
}
},
"content": [
{
"reader": {
"name": "mysqlreader",
"parameter": {
"username": "root",
"password": "beijing",
"column": [
"*"
],
"connection": [
{
"table": [
"tb1"
],
"jdbcUrl": ["jdbc:mysql://127.0.0.1:3306/prod?useUnicode=true&characterEncoding=utf8&useSSL=false"]
}
]
}
},
"writer": {
"name": "oceanbasev10writer",
"parameter": {
"obWriteMode": "insert",
"column": [
"*"
],
"preSql": [
"truncate table tb1"
],
"connection": [
{
"jdbcUrl": "||_dsc_ob10_dsc_||obdemo:ob_mysql||_dsc_ob10_dsc_||jdbc:oceanbase://127.0.0.1:2883/prod?useLocalSessionState=true&allowBatch=true&allowMultiQueries=true&rewriteBatchedStatements=true",
"table": [
"tb1"
]
}
],
"username": "root",
"password":"beijing",
"writerThreadCount":10,
"batchSize": 1000,
"memstoreThreshold": "0.9"
}
}
}
]
}
}
7、执行离线数据同步
Tips:离线迁移是对静态数据的迁移,在迁移时,源端数据库不能有数据写入。
[root@node203 bin]# python ./datax.py ../job/myjob.json
DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.
2023-09-15 14:02:44.555 [main] INFO MessageSource - JVM TimeZone: GMT+08:00, Locale: zh_CN
2023-09-15 14:02:44.557 [main] INFO MessageSource - use Locale: zh_CN timeZone: sun.util.calendar.ZoneInfo[id="GMT+08:00",offset=28800000,dstSavings=0,useDaylight=false,transitions=0,lastRule=null]
2023-09-15 14:02:44.607 [main] INFO VMInfo - VMInfo# operatingSystem class => com.sun.management.internal.OperatingSystemImpl
2023-09-15 14:02:44.618 [main] INFO Engine - the machine info =>
osInfo: Oracle Corporation 20 20.0.2+9-78
jvmInfo: Linux amd64 3.10.0-327.el7.x86_64
cpu num: 8
totalPhysicalMemory: -0.00G
freePhysicalMemory: -0.00G
maxFileDescriptorCount: -1
currentOpenFileDescriptorCount: -1
GC Names [G1 Young Generation, G1 Old Generation, G1 Concurrent GC]
MEMORY_NAME | allocation_size | init_size
CodeHeap 'profiled nmethods' | 117.21MB | 2.44MB
G1 Old Gen | 1,024.00MB | 973.00MB
G1 Survivor Space | -0.00MB | 0.00MB
CodeHeap 'non-profiled nmethods' | 117.21MB | 2.44MB
Compressed Class Space | 1,024.00MB | 0.00MB
Metaspace | -0.00MB | 0.00MB
G1 Eden Space | -0.00MB | 51.00MB
CodeHeap 'non-nmethods' | 5.57MB | 2.44MB
2023-09-15 14:02:44.646 [main] INFO Engine -
.......
执行完成:
8、查看目标库oceanbase迁移数据
源端数据:
目标端数据:
四、迁移故障问题
1、执行迁移出现以下故障
[admin@node203 bin]$ python /home/soft/Oceanbase/datax/bin/datax.py /home/soft/Oceanbase/datax/job/myjob.json
DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.
2023-09-11 19:26:34.926 [main] WARN ConfigParser - 插件[mysqlreader,oceanbasev10writer]加载失败,1s后重试... Exception:Code:[Common-00], Describe:[您提供的配置文件存在错误信息,请检查您的作业配置 .] - 配置信息错误,您提供的配置文件[/home/soft/Oceanbase/datax/plugin/reader/._drdsreader/plugin.json]不存在. 请检查您的配置文件.
2023-09-11 19:26:35.934 [main] ERROR Engine -
经DataX智能分析,该任务最可能的错误原因是:
com.alibaba.datax.common.exception.DataXException: Code:[Common-00], Describe:[您提供的配置文件存在错误信息,请检查您的作业配置 .] - 配置信息错误,您提供的配置文件[/home/soft/Oceanbase/datax/plugin/reader/._drdsreader/plugin.json]不存在. 请检查您的配置文件.
at com.alibaba.datax.common.exception.DataXException.asDataXException(DataXException.java:26)
at com.alibaba.datax.common.util.Configuration.from(Configuration.java:95)
.......
2、解决方案
如下图所示,删除reader和writer目录下的隐藏文件: