Hudi-Flink SQL: reading Kafka data in real time and writing it to a Hudi table

0. Start the SQL client shell

./sql-client.sh embedded shell
 

1. Create a table mapped to the Kafka topic

CREATE TABLE order_kafka_source(
    `orderId` STRING,
    `userId` STRING,
    `orderTime` STRING,
    `ip` STRING,
    `orderMoney` DOUBLE,
    `orderStatus` INT
)
WITH(
    'connector' = 'kafka',
    'topic'='order-topic',
    'properties.bootstrap.servers' = 'localhost:9092',
    'properties.group.id' = 'gid-1001',
    'scan.startup.mode' = 'latest-offset',
    'format' = 'json',
    'json.fail-on-missing-field' = 'false',
    'json.ignore-parse-errors' = 'true'
);
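
To try the source table you need JSON messages on order-topic whose keys match the column names above. Below is a minimal sketch of a generator for such messages. It assumes that orderId begins with a 17-digit yyyyMMddHHmmssSSS timestamp followed by four random digits, and that orderTime looks like "yyyy-MM-dd HH:mm:ss.SSS". The post does not state either format, and the function name make_order and the value ranges are made up for illustration.

```python
import json
import random
from datetime import datetime


def make_order(now: datetime, rng: random.Random) -> dict:
    """Build one order whose keys match the columns of order_kafka_source."""
    millis = f"{now.microsecond // 1000:03d}"
    # Assumption: orderId = 17-digit timestamp (yyyyMMddHHmmssSSS) + 4 random digits.
    order_id = now.strftime("%Y%m%d%H%M%S") + millis + f"{rng.randrange(10000):04d}"
    return {
        "orderId": order_id,
        "userId": str(rng.randrange(1, 1000)),
        "orderTime": now.strftime("%Y-%m-%d %H:%M:%S") + "." + millis,
        "ip": ".".join(str(rng.randrange(1, 255)) for _ in range(4)),
        "orderMoney": round(rng.uniform(1.0, 1000.0), 2),
        "orderStatus": rng.randrange(0, 4),
    }


if __name__ == "__main__":
    rng = random.Random(42)
    for _ in range(3):
        # One JSON object per line, i.e. one Kafka record per line.
        print(json.dumps(make_order(datetime.now(), rng)))
```

The output could be piped into Kafka's console producer, for example `python gen_orders.py | kafka-console-producer.sh --bootstrap-server localhost:9092 --topic order-topic` (gen_orders.py is a hypothetical file name). Because the table uses 'scan.startup.mode' = 'latest-offset', only messages produced after the Flink job has started are read.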

2. Create the Hudi sink table

CREATE TABLE order_hudi_sink(
    `orderId` STRING PRIMARY KEY NOT ENFORCED,
    `userId` STRING,
    `orderTime` STRING,
    `ip` STRING,
    `orderMoney` DOUBLE,
    `orderStatus` INT,
    `ts` STRING,
    `partition_day` STRING
)
PARTITIONED BY (partition_day)
WITH(
    'connector' = 'hudi',
    'path'='hdfs://localhost:9000/hudi-warehouse/flink_hudi_order',
    'table.type' = 'MERGE_ON_READ',
    'write.operation' = 'upsert',
    'hoodie.datasource.write.recordkey.field' = 'orderId',
    'write.precombine.field' = 'ts',
    'write.tasks' = '1',
    'compaction.tasks' = '1',
    'compaction.async.enable' = 'true',
    'compaction.trigger.strategy' = 'num_commits',
    'compaction.delta_commits' = '1'
);
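
The options 'write.operation' = 'upsert', the record key orderId and 'write.precombine.field' = 'ts' together describe how rows with the same key are merged: one row per key survives, chosen by the precombine value. The sketch below is a simplified in-memory model of that idea, not Hudi's implementation. The function name upsert is hypothetical, ties going to the newer arrival is an assumption of the sketch, and in Hudi the exact behaviour against already-stored records depends on the configured payload class.

```python
from typing import Dict, Iterable


def upsert(table: Dict[str, dict], batch: Iterable[dict],
           key_field: str = "orderId", precombine_field: str = "ts") -> Dict[str, dict]:
    """Merge batch into table, keeping exactly one row per record key."""
    for row in batch:
        key = row[key_field]
        current = table.get(key)
        # Keep the incoming row unless the stored row has a strictly larger
        # precombine value (assumption: ties go to the newer arrival).
        if current is None or current[precombine_field] <= row[precombine_field]:
            table[key] = row
    return table


if __name__ == "__main__":
    table: Dict[str, dict] = {}
    upsert(table, [
        {"orderId": "A", "ts": "001", "orderStatus": 0},
        {"orderId": "B", "ts": "001", "orderStatus": 0},
        {"orderId": "A", "ts": "002", "orderStatus": 1},  # newer version of A wins
        {"orderId": "B", "ts": "000", "orderStatus": 9},  # stale version of B is dropped
    ])
    for row in table.values():
        print(row)
```

In this post ts is derived from orderId, which is also the record key, so every version of one order carries the same ts and the precombine field cannot tell the versions apart. If that matters, ordering by a field that changes on each update (an update timestamp, say) would be needed.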

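3. Write into the Hudi table

Note: the Hudi Flink writer commits data only when a checkpoint completes, so checkpointing must be enabled (for example through execution.checkpointing.interval in the Flink configuration) before running this statement.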

INSERT INTO order_hudi_sink
SELECT
    orderId, userId, orderTime, ip, orderMoney, orderStatus,
    substring(orderId, 1, 17) AS ts,
    substring(orderTime, 1, 10) AS partition_day
FROM order_kafka_source;
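
The two derived columns are meant to take the first 17 characters of orderId as ts and the first 10 characters of orderTime as partition_day. The sketch below mirrors that intent in Python with a helper that uses SQL's 1-based start position. The helper names and the sample row are hypothetical, and the sample assumes the orderId and orderTime formats implied by the lengths 17 and 10.

```python
def sql_substring(text: str, start: int, length: int) -> str:
    """Mimic SUBSTRING(text, start, length) for a 1-based start >= 1."""
    if start < 1 or length < 0:
        raise ValueError("this sketch only handles start >= 1 and length >= 0")
    return text[start - 1:start - 1 + length]


def derive_columns(row: dict) -> dict:
    """Return a copy of row with ts and partition_day added."""
    out = dict(row)
    out["ts"] = sql_substring(row["orderId"], 1, 17)
    out["partition_day"] = sql_substring(row["orderTime"], 1, 10)
    return out


if __name__ == "__main__":
    # Hypothetical sample row; only the two fields used by the derivation are shown.
    sample = {"orderId": "202111221034341230001",
              "orderTime": "2021-11-22 10:34:34.123"}
    print(derive_columns(sample))
```

For the sample row, ts becomes 20211122103434123 and partition_day becomes 2021-11-22. Since the sink table is PARTITIONED BY (partition_day), each day's orders end up in their own partition under the table path.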

 

    

Posted by 嘣嘣嚓
