SeaTunnel data synchronization (Oracle to MySQL)

DataX has not been updated since September 2023, so I went looking for a newer, actively maintained open-source ETL tool.

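Apache SeaTunnel is an easy-to-use, high-performance distributed data integration platform that supports real-time synchronization of massive data volumes. According to the project's own description, it can stably and efficiently synchronize tens of billions of records per day and is used in production by nearly a hundred companies.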

Install it directly to try it out:

export version="2.3.9"
wget "https://archive.apache.org/dist/seatunnel/${version}/apache-seatunnel-${version}-bin.tar.gz"
tar -xzvf "apache-seatunnel-${version}-bin.tar.gz"

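Install the connector plugins (run this from inside the extracted apache-seatunnel-${version} directory):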

sh bin/install-plugin.sh

 

Put the required JDBC driver jars (here, the Oracle and MySQL drivers) into the lib directory.
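As a rough sketch of this step, the shell function below checks that both drivers are present in lib before a job is run. The function name `check_jdbc_jars` and the jar name patterns (`ojdbc*.jar`, `mysql-connector-*.jar`) are assumptions for illustration only; adjust them to the driver files you actually downloaded.

```shell
#!/bin/sh
# Hypothetical helper: verify that JDBC driver jars exist in <seatunnel_home>/lib.
# The glob patterns below are assumptions about how the driver jars are named.
check_jdbc_jars() {
    home="$1"
    missing=0
    for pattern in 'ojdbc*.jar' 'mysql-connector-*.jar'; do
        found=0
        # If the glob matches nothing it stays literal, and the -e test fails.
        for f in "$home"/lib/$pattern; do
            if [ -e "$f" ]; then
                found=1
                break
            fi
        done
        if [ "$found" -eq 0 ]; then
            echo "missing driver matching $pattern in $home/lib" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_jdbc_jars "apache-seatunnel-${version}" && echo "drivers found"` prints the message only when both patterns match at least one file.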

 

The emp table in MySQL was created in advance:

create table emp (
    empno int,
    ename varchar(10),
    job varchar(9),
    mgr int,
    hiredate datetime,
    sal int,
    comm int,
    deptno int,
    constraint pk_emp primary key (empno)
);

 

Edit the config file (see the official documentation for more source and sink options; CDC real-time synchronization is also supported):

env {
  parallelism = 4
  job.mode = "BATCH"
}
source {
    Jdbc {
        url = "jdbc:oracle:thin:@10.40.12.219:1521:sharedb"
        driver = "oracle.jdbc.OracleDriver"
        user = "system"
        password = "xxxx"
        query = "SELECT * FROM scott.emp"
    }
}

transform {
    # If you would like to get more information about how to configure seatunnel and see full list of transform plugins,
    # please go to https://seatunnel.apache.org/docs/transform-v2/sql
}

sink {
    jdbc {
        url = "jdbc:mysql://10.40.13.75:3306/ceshi?useUnicode=true&characterEncoding=UTF-8&rewriteBatchedStatements=true"
        driver = "com.mysql.cj.jdbc.Driver"
        user = "root"
        password = "xxxx"
        # Automatically generate sql statements based on database table names
        generate_sink_sql = true
        database = ceshi
        table = emp
    }
}

 

Reference configuration for CDC:

env {
  # You can set engine configuration here
  parallelism = 1
  job.mode = "STREAMING"
  checkpoint.interval = 5000
}

source {
  # This is an example source plugin **only for testing and demonstrating the source plugin feature**
  Oracle-CDC {
    plugin_output = "customers"
    username = "dbzuser"
    password = "dbz"
    database-names = ["ORCLCDB"]
    schema-names = ["DEBEZIUM"]
    table-names = ["ORCLCDB.DEBEZIUM.FULL_TYPES"]
    base-url = "jdbc:oracle:thin:@oracle-host:1521/ORCLCDB"
    source.reader.close.timeout = 120000
    connection.pool.size = 1
    
    schema-changes.enabled = true
  }
}

sink {
  jdbc {
    plugin_input = "customers"
    url = "jdbc:mysql://oracle-host:3306/oracle_sink"
    driver = "com.mysql.cj.jdbc.Driver"
    user = "st_user_sink"
    password = "mysqlpw"
    generate_sink_sql = true
    # You need to configure both database and table
    database = oracle_sink
    table = oracle_cdc_2_mysql_sink_table
    primary_keys = ["ID"]
  }
}

 

Run the job:

./bin/seatunnel.sh --config oracle_to_mysql.config -m local
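A minimal sketch of how the command above could be wrapped for scripted runs. The function name `run_job` and its return codes are hypothetical, and it assumes that seatunnel.sh exits with a non-zero status when the job fails, which is worth confirming for your version.

```shell
#!/bin/sh
# Hypothetical wrapper: validate inputs, then start the job in local mode.
# Assumes it is called from the SeaTunnel install directory and that the
# launcher returns a non-zero exit status on failure.
run_job() {
    config="$1"
    launcher="${2:-./bin/seatunnel.sh}"
    if [ ! -f "$config" ]; then
        echo "config file not found: $config" >&2
        return 2
    fi
    if [ ! -x "$launcher" ]; then
        echo "launcher not executable: $launcher" >&2
        return 3
    fi
    # Same flags as the command above: --config <file> -m local
    if "$launcher" --config "$config" -m local; then
        echo "job finished: $config"
    else
        status=$?
        echo "job failed with exit status $status" >&2
        return "$status"
    fi
}
```

Calling `run_job oracle_to_mysql.config` from the install directory runs the same command as above, but fails early if the config file or the launcher is missing.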

 

Installation is simple and so is configuration. Nice!

Apache's open-source projects really are impressive!

Official documentation for reference: https://seatunnel.apache.org/zh-CN/docs/2.3.9/about

 

Posted by 阿西吧li