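DataX Support for MySQL 8.x
Introduction:
DataX is an offline synchronization tool for heterogeneous data sources. It aims to provide stable, efficient data synchronization between a wide range of sources, including relational databases (MySQL, Oracle, etc.), HDFS, Hive, ODPS, HBase, and FTP. GitHub: https://github.com/alibaba/DataX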
1 Notes
DataX does not currently support MySQL 8.x out of the box, so the source code has to be modified. The changes are:
- In the DataBaseType class referenced by OriginalConfPretreatmentUtil, change the appended connection parameters: for MySQL 8, zeroDateTimeBehavior=convertToNull becomes zeroDateTimeBehavior=CONVERT_TO_NULL
Before: suffix = "yearIsDateType=false&zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false&rewriteBatchedStatements=true";
After: suffix = "yearIsDateType=false&zeroDateTimeBehavior=CONVERT_TO_NULL&tinyInt1isBit=false&rewriteBatchedStatements=true";
- MySQL driver: in the pom files of both the mysql reader and the mysql writer, change the dependency to
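The effect of this change can be sketched outside of DataX. The snippet below is only an illustration in Python, not DataX's own Java code: the helper names `upgrade_suffix` and `append_suffix` are hypothetical, and the assumption is that the suffix is simply joined onto the configured jdbcUrl as extra query parameters.

```python
# Illustrative sketch only: DataX itself does this in Java.
# upgrade_suffix and append_suffix are hypothetical helper names.

OLD_SUFFIX = (
    "yearIsDateType=false&zeroDateTimeBehavior=convertToNull"
    "&tinyInt1isBit=false&rewriteBatchedStatements=true"
)


def upgrade_suffix(suffix: str) -> str:
    """Switch zeroDateTimeBehavior to the value used in the modified source."""
    return suffix.replace(
        "zeroDateTimeBehavior=convertToNull",
        "zeroDateTimeBehavior=CONVERT_TO_NULL",
    )


def append_suffix(jdbc_url: str, suffix: str) -> str:
    """Join extra parameters onto a JDBC URL (assumed behaviour)."""
    separator = "&" if "?" in jdbc_url else "?"
    return jdbc_url + separator + suffix


if __name__ == "__main__":
    new_suffix = upgrade_suffix(OLD_SUFFIX)
    # localhost is a placeholder host, not taken from the article.
    print(append_suffix("jdbc:mysql://localhost:3306/test1?serverTimezone=UTC", new_suffix))
```

Only the value of zeroDateTimeBehavior changes; the other three parameters stay exactly as they were.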
<dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
    <version>8.0.11</version>
</dependency>
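- Run clean install with tests skipped (for example, mvn clean install -DskipTests)
- Copy the datax plugin directories generated under the target directories of the reader and the writer into a plugin directory in the core project, at the same level as bin (the source build does not create this directory, so create it yourself)
2 Usage
- Directory layout
- JSON template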
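The copy step above could be scripted roughly as follows. This is a hedged sketch: the function name `copy_plugins` is hypothetical, and the layout `<module>/target/datax/plugin/<reader|writer>/<plugin name>` is an assumption about what the build produces, so check it against your own target directories before relying on it.

```python
import shutil
from pathlib import Path
from typing import List


def copy_plugins(module_dir: Path, core_datax_dir: Path) -> List[Path]:
    """Copy built plugins from one module into <core_datax_dir>/plugin.

    Assumed layout (verify locally):
    <module_dir>/target/datax/plugin/<reader|writer>/<plugin name>/
    """
    src_root = module_dir / "target" / "datax" / "plugin"
    dest_root = core_datax_dir / "plugin"  # sits next to bin; created if missing
    copied = []
    for kind_dir in sorted(src_root.iterdir()):        # e.g. reader/ or writer/
        if not kind_dir.is_dir():
            continue
        for plugin_dir in sorted(kind_dir.iterdir()):  # e.g. mysqlreader/
            if not plugin_dir.is_dir():
                continue
            dest = dest_root / kind_dir.name / plugin_dir.name
            # dirs_exist_ok (Python 3.8+) lets the script be rerun after a rebuild.
            shutil.copytree(plugin_dir, dest, dirs_exist_ok=True)
            copied.append(dest)
    return copied
```

Calling it once for the reader module and once for the writer module, with the core project's datax directory as the destination, mirrors the manual copy described above.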
{
    "job": {
        "setting": {
            "speed": {
                "byte": 10485760
            },
            "errorLimit": {
                "record": 0,
                "percentage": 0.02
            }
        },
        "content": [
            {
                "reader": {
                    "name": "streamreader",
                    "parameter": {
                        "column": [
                            {
                                "value": "DataX",
                                "type": "string"
                            },
                            {
                                "value": 19890604,
                                "type": "long"
                            },
                            {
                                "value": "1989-06-04 00:00:00",
                                "type": "date"
                            },
                            {
                                "value": true,
                                "type": "bool"
                            },
                            {
                                "value": "test",
                                "type": "bytes"
                            }
                        ],
                        "sliceRecordCount": 100000
                    }
                },
                "writer": {
                    "name": "streamwriter",
                    "parameter": {
                        "print": false,
                        "encoding": "UTF-8"
                    }
                }
            }
        ]
    }
}
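Before handing a job file to DataX, it can help to confirm that it parses as JSON and names the reader and writer you expect. The sketch below assumes only the job/content/reader/writer nesting shown in the template; `summarize_job` is a hypothetical helper, not part of DataX.

```python
import json
from typing import List, Tuple


def summarize_job(job_text: str) -> List[Tuple[str, str]]:
    """Return (reader name, writer name) for each entry in job.content."""
    config = json.loads(job_text)  # raises json.JSONDecodeError on malformed JSON
    pairs = []
    for entry in config["job"]["content"]:
        pairs.append((entry["reader"]["name"], entry["writer"]["name"]))
    return pairs


if __name__ == "__main__":
    # A trimmed-down version of the template, kept short for illustration.
    sample = """
    {"job": {"content": [
        {"reader": {"name": "streamreader", "parameter": {}},
         "writer": {"name": "streamwriter", "parameter": {}}}
    ]}}
    """
    print(summarize_job(sample))
```

A KeyError from this function points to a missing section in the job file, which is cheaper to catch here than in a failed DataX run.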
- MySQL example JSON (more examples are available in the GitHub repository)
{
    "job": {
        "content": [
            {
                "reader": {
                    "name": "mysqlreader",
                    "parameter": {
                        "username": "root",
                        "password": "123456",
                        "column": ["id", "name"],
                        "where": "id>0",
                        "connection": [
                            {
                                "table": [
                                    "user"
                                ],
                                "jdbcUrl": [
                                    "jdbc:mysql://47.101.137.97:3306/test1?serverTimezone=UTC"
                                ]
                            }
                        ]
                    }
                },
                "writer": {
                    "name": "mysqlwriter",
                    "parameter": {
                        "username": "root",
                        "password": "123456",
                        "column": ["id", "name"],
                        "connection": [
                            {
                                "table": [
                                    "user"
                                ],
                                "jdbcUrl": "jdbc:mysql://47.101.137.97:3306/test2?serverTimezone=UTC"
                            }
                        ]
                    }
                }
            }
        ],
        "setting": {
            "speed": {
                "channel": 1,
                "byte": 104857600
            },
            "errorLimit": {
                "record": 10,
                "percentage": 0.05
            }
        }
    }
}
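If several tables need the same kind of job, the file can be generated instead of written by hand. The following is a minimal sketch that mirrors the structure of the example above, including the detail that the reader's jdbcUrl is a list while the writer's is a single string. `build_mysql_job` is a hypothetical helper, and the host and credentials in the usage part are placeholders.

```python
import json
from typing import List


def build_mysql_job(reader_url: str, writer_url: str, table: str,
                    columns: List[str], username: str, password: str,
                    channel: int = 1) -> dict:
    """Build a mysqlreader -> mysqlwriter job shaped like the example above."""
    return {
        "job": {
            "content": [{
                "reader": {
                    "name": "mysqlreader",
                    "parameter": {
                        "username": username,
                        "password": password,
                        "column": columns,
                        # The reader takes a list of URLs, as in the example.
                        "connection": [{"table": [table], "jdbcUrl": [reader_url]}],
                    },
                },
                "writer": {
                    "name": "mysqlwriter",
                    "parameter": {
                        "username": username,
                        "password": password,
                        "column": columns,
                        # The writer takes a single URL string, as in the example.
                        "connection": [{"table": [table], "jdbcUrl": writer_url}],
                    },
                },
            }],
            "setting": {"speed": {"channel": channel}},
        }
    }


if __name__ == "__main__":
    job = build_mysql_job(
        "jdbc:mysql://localhost:3306/test1?serverTimezone=UTC",  # placeholder
        "jdbc:mysql://localhost:3306/test2?serverTimezone=UTC",  # placeholder
        "user", ["id", "name"], "root", "change-me",
    )
    print(json.dumps(job, indent=4))
```

The printed JSON can be saved under the job directory and run like any hand-written job file.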
- Run
Change into DataX's bin directory (e.g. /Users/xuzhihui/test/backend/DataX-master/core/target/datax/bin), then run
python datax.py ../job/test.json
- Result
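The same command can be launched from a script, for instance when jobs are scheduled. This is a sketch under assumptions: `build_datax_command` and `run_job` are hypothetical helpers, and the interpreter name `python` is taken from the command above and may need adjusting to whichever Python version your copy of datax.py expects.

```python
import subprocess
from pathlib import Path
from typing import List


def build_datax_command(bin_dir: str, job_file: str,
                        python_exe: str = "python") -> List[str]:
    """Build the argument list for: python <bin_dir>/datax.py <job_file>."""
    return [python_exe, str(Path(bin_dir) / "datax.py"), job_file]


def run_job(bin_dir: str, job_file: str) -> int:
    """Run the job from inside bin_dir and return the process exit code."""
    command = build_datax_command(bin_dir, job_file)
    # cwd=bin_dir keeps relative job paths such as ../job/test.json working.
    completed = subprocess.run(command, cwd=bin_dir, check=False)
    return completed.returncode


if __name__ == "__main__":
    # "datax/bin" is a placeholder; point it at your own bin directory.
    print(build_datax_command("datax/bin", "../job/test.json"))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when the paths contain spaces.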