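Using DataX: Big Data Synchronization Technology
Preparation:
1. Video tutorial: http://113.31.104.47/portal/#/course/dashboard/b34d160db64624732ef152a1118af11a
2. Installing and deploying DataX: https://www.cnblogs.com/qingyunzong/p/9759993.html#_label1_0
3. Python version required to run DataX: 2.7.x. DataX has not been updated to Python 3. See the illustrated tutorial "Installing Python 2.7 and Python 3.6 side by side on Win10".
Design the JSON job file (sqlserver to mysql):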
{
    "job": {
        "setting": {
            "speed": {
                "channel": 3,
                "byte": 1048576
            },
            "errorLimit": {
                "record": 0,
                "percentage": 0.02
            }
        },
        "content": [
            {
                "reader": {
                    "name": "sqlserverreader",
                    "parameter": {
                        "username": "sa",
                        "password": "######",
                        "where": "",
                        "column": ["bname", "bpwd"],
                        "connection": [
                            {
                                "table": ["buyer"],
                                "jdbcUrl": ["jdbc:sqlserver://localhost:1433;DatabaseName=bookshop"]
                            }
                        ]
                    }
                },
                "writer": {
                    "name": "mysqlwriter",
                    "parameter": {
                        "writeMode": "insert",
                        "username": "root",
                        "password": "######",
                        "column": ["name", "pwd"],
                        "session": [],
                        "connection": [
                            {
                                "jdbcUrl": "jdbc:mysql://127.0.0.1:3306/hotwords?useUnicode=true&characterEncoding=utf8&useSSL=false&serverTimezone=GMT%2B8",
                                "table": ["user"]
                            }
                        ]
                    }
                }
            }
        ]
    }
}
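DataX pairs reader and writer columns by position (here bname goes to name and bpwd goes to pwd), so the two column lists need the same length. The following is a minimal sketch of a pre-flight check for that, not part of DataX itself. The function name check_job and the trimmed-down job_text are made up for illustration, and the script is meant for any recent Python 3, separate from the Python 2.7 that launches DataX.

```python
import json


def check_job(job):
    """Return a list of problems found in a DataX job dict (empty if none).

    Hypothetical helper: it only compares the number of reader and writer
    columns in each content entry, nothing else.
    """
    problems = []
    for i, item in enumerate(job["job"]["content"]):
        reader_cols = item["reader"]["parameter"].get("column", [])
        writer_cols = item["writer"]["parameter"].get("column", [])
        if len(reader_cols) != len(writer_cols):
            problems.append(
                "content[%d]: reader has %d columns, writer has %d"
                % (i, len(reader_cols), len(writer_cols))
            )
    return problems


# Trimmed-down copy of the job above, keeping only the fields the check reads.
job_text = """
{"job": {"content": [{
    "reader": {"name": "sqlserverreader",
               "parameter": {"column": ["bname", "bpwd"]}},
    "writer": {"name": "mysqlwriter",
               "parameter": {"column": ["name", "pwd"]}}
}]}}
"""

job = json.loads(job_text)
print(check_job(job))  # prints [] because both sides list two columns
```

In practice the dict would come from json.load on the real job file rather than from an inline string.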
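Official documentation for each database plugin: https://github.com/alibaba/DataX
Run:
python {path to the datax folder}\bin\datax.py {path to the JSON job file}   (separate the three parts with spaces)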
python2 D:\download\datax\datax\bin\datax.py D:\download\datax\job\sqlserverTomysql.json
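If the job has to be launched from a script rather than typed by hand, the same command line can be assembled in Python. This is only a sketch: build_datax_command and run_datax are hypothetical helper names, and "python2" is assumed to be the name of the Python 2.7 interpreter on your PATH, as in the command above.

```python
import os
import subprocess


def build_datax_command(datax_home, job_file, python_exe="python2"):
    """Return the argument list: interpreter, bin/datax.py, job file."""
    script = os.path.join(datax_home, "bin", "datax.py")
    return [python_exe, script, job_file]


def run_datax(datax_home, job_file, python_exe="python2"):
    """Launch the job and return the exit code of the datax.py process."""
    return subprocess.call(build_datax_command(datax_home, job_file, python_exe))


# Example call, using the paths from the command above:
# run_datax(r"D:\download\datax\datax",
#           r"D:\download\datax\job\sqlserverTomysql.json")
```

Passing the arguments as a list avoids quoting problems when a path contains spaces.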
If the console output is garbled, switch the Windows console to UTF-8 by entering:
CHCP 65001
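Error:
ERROR RetryUtil - Exception when calling callable, 即将尝试执行第1次重试.本次重试计划等待[1000]ms,实际等待[1000]ms, 异常Msg:[DataX无法连接对应的数据库,可能原因是:1) 配置的ip/port/database/jdbc错误,无法连接。2) 配置的username/password错误,鉴权失败。请和DBA确认该数据库的连接信息是否正确。]
(In English: about to attempt retry 1; planned wait [1000] ms, actual wait [1000] ms; exception message: [DataX cannot connect to the database. Possible causes: 1) the configured ip/port/database/jdbc is wrong, so no connection can be made; 2) the configured username/password is wrong, so authentication failed. Please confirm the connection details with your DBA.])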
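Before changing anything, it can help to rule out cause 1) by checking that the host and port in each jdbcUrl are reachable at all. The sketch below is an assumption-laden helper, not a DataX feature: jdbc_host_port and port_is_open are hypothetical names, the regular expression only handles the simple jdbc:<vendor>://host:port form used in this job, and an open port says nothing about credentials or driver compatibility.

```python
import re
import socket

# Matches the "jdbc:<vendor>://host[:port]" prefix; anything after is ignored.
_JDBC_RE = re.compile(r"^jdbc:\w+://([^:/;?]+)(?::(\d+))?")


def jdbc_host_port(jdbc_url, default_port=None):
    """Extract (host, port) from a simple JDBC URL."""
    m = _JDBC_RE.match(jdbc_url)
    if m is None:
        raise ValueError("unrecognized JDBC URL: %r" % jdbc_url)
    host, port = m.group(1), m.group(2)
    return host, int(port) if port else default_port


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print(jdbc_host_port("jdbc:sqlserver://localhost:1433;DatabaseName=bookshop"))
# ('localhost', 1433)
print(jdbc_host_port("jdbc:mysql://127.0.0.1:3306/hotwords?useSSL=false"))
# ('127.0.0.1', 3306)
```

Calling port_is_open(*jdbc_host_port(url)) for both URLs shows whether the problem is basic connectivity or something further up the stack.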
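Solution:
Replace the MySQL driver bundled with DataX with a suitable 8.x version:
Check your MySQL version (the client prints the server version after you log in) and download the matching mysql-connector jar: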
mysql -uroot -p
Replace:
datax->plugin->reader->mysqlreader->libs: swap the mysql-connector-5... jar for the 8.XX version
datax->plugin->writer->mysqlwriter->libs: swap the mysql-connector-5... jar for the 8.XX version
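To see which connector jars are currently installed before swapping them, a short script can list them. This is a sketch under assumptions: find_mysql_connector_jars is a hypothetical helper, the plugin folder name is a parameter because it may differ in your installation, and the glob pattern simply matches any file starting with mysql-connector.

```python
import glob
import os


def find_mysql_connector_jars(datax_home, plugin_dir="plugin"):
    """List mysql-connector jars in the mysqlreader and mysqlwriter libs folders.

    plugin_dir is an assumption about the install layout; adjust it if your
    DataX folder names the directory differently.
    """
    jars = []
    for kind, plugin in (("reader", "mysqlreader"), ("writer", "mysqlwriter")):
        libs = os.path.join(datax_home, plugin_dir, kind, plugin, "libs")
        pattern = os.path.join(libs, "mysql-connector*.jar")
        jars.extend(sorted(glob.glob(pattern)))
    return jars


# Example call, using the install path from the run command:
# for jar in find_mysql_connector_jars(r"D:\download\datax\datax"):
#     print(jar)
```

Both folders should end up holding the same 8.x jar, with the old 5.x jar removed so that only one driver version is on the classpath.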
After the replacement, the job runs successfully.