Hudi: Reading Kafka data in real time with Flink SQL and writing it to a Hudi table
0. Start the SQL client shell
./sql-client.sh embedded shell
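The Hudi Flink writer commits data when a Flink checkpoint completes, so a streaming INSERT started from this shell will not produce visible commits unless checkpointing is enabled. A minimal sketch, assuming Flink 1.13 or later (for the quoted SET syntax) and that checkpointing is not already configured in flink-conf.yaml. The 3s interval is an illustrative value, not one taken from the steps in this post:

```sql
-- Assumed value: checkpoint every 3 seconds; tune for your workload.
SET 'execution.checkpointing.interval' = '3s';

-- Optional: print query results inline in the terminal.
SET 'sql-client.execution.result-mode' = 'tableau';
```

Run these once per SQL client session, before submitting the INSERT job.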
1. Create a source table backed by Kafka
CREATE TABLE order_kafka_source (
  `orderId` STRING,
  `userId` STRING,
  `orderTime` STRING,
  `ip` STRING,
  `orderMoney` DOUBLE,
  `orderStatus` INT
) WITH (
  'connector' = 'kafka',
  'topic' = 'order-topic',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'gid-1001',
  -- read only messages that arrive after the job starts
  'scan.startup.mode' = 'latest-offset',
  'format' = 'json',
  -- tolerate missing fields and skip records that fail to parse
  'json.fail-on-missing-field' = 'false',
  'json.ignore-parse-errors' = 'true'
);
2. Create the Hudi sink table
CREATE TABLE order_hudi_sink (
  `orderId` STRING PRIMARY KEY NOT ENFORCED,
  `userId` STRING,
  `orderTime` STRING,
  `ip` STRING,
  `orderMoney` DOUBLE,
  `orderStatus` INT,
  `ts` STRING,
  `partition_day` STRING
) PARTITIONED BY (partition_day) WITH (
  'connector' = 'hudi',
  'path' = 'hdfs://localhost:9000/hudi-warehouse/flink_hudi_order',
  'table.type' = 'MERGE_ON_READ',
  'write.operation' = 'upsert',
  'hoodie.datasource.write.recordkey.field' = 'orderId',
  -- when two records share a key, the one with the larger ts wins
  'write.precombine.field' = 'ts',
  'write.tasks' = '1',
  'compaction.tasks' = '1',
  -- compact asynchronously after every delta commit
  'compaction.async.enable' = 'true',
  'compaction.trigger.strategy' = 'num_commits',
  'compaction.delta_commits' = '1'
);
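3. Write to the Hudi table
INSERT INTO order_hudi_sink
SELECT
  orderId, userId, orderTime, ip, orderMoney, orderStatus,
  -- SUBSTRING positions are 1-based: the first 17 characters of orderId become ts,
  -- and the first 10 characters of orderTime become the partition day
  SUBSTRING(orderId, 1, 17) AS ts,
  SUBSTRING(orderTime, 1, 10) AS partition_day
FROM order_kafka_source;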
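To check that records are arriving, the sink table can be queried from the same SQL client session. This is a minimal sketch rather than part of the original steps: it assumes the INSERT job above is running, that at least one checkpoint has completed, and that messages have been produced to order-topic since the job started.

```sql
-- Batch snapshot read of the table defined in step 2.
SELECT orderId, userId, orderMoney, orderStatus, ts, partition_day
FROM order_hudi_sink;
```

Because the table is written with upsert and keyed on orderId, re-sending a message with the same orderId should update the existing row instead of adding a new one.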