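|NO.Z.00043|----------|BigDataEnd|--|Hadoop & Real-Time Data Warehouse.V23|--|Project.v23|Requirement 2: Data Processing & Incremental Statistics.V1|--|Requirements Analysis|
1. Requirement 2: every 5 minutes, compute order statistics for the most recent 1 hour, showing city, province, total transaction amount, and total order count (incremental statistics)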

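2. Implementation workflow
### --- Read the data sources (input)
~~~ # input: read the data sources:
input1: mysql:yanqi_area --- HBase:dim_yanqi_area (wide region table) # read the wide region table from HBase
input2: mysql:yanqi_trade_orders --- kafka:yanqi_trade_orders # this data does not need to be sunk to HBase; read it directly from Kafka
### --- Apply transformations to the inputs
~~~ # input1: area id, area name, city id, city name, province id, province name
~~~ # toList: all rows of the wide region table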
(370200,青岛市,370000,山东省,100000,中国)
(370202,市南区,370200,青岛市,370000,山东省)
(370203,市北区,370200,青岛市,370000,山东省)
(370211,黄岛区,370200,青岛市,370000,山东省)
...
(713702,金湖镇,713700,金门县,710000,台湾)
(713703,金沙镇,713700,金门县,710000,台湾)
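The wide-table rows above are later used to resolve an order's areaId to its city and province. As a minimal sketch in plain Python (not the Flink job itself), assuming the column order given for input1, the lookup could be built like this. The names `AreaRow`, `build_region_lookup`, and `region_by_area` are hypothetical. Note that city-level rows such as 370200 carry the province in the "city" column; the sketch does not special-case them.

```python
from typing import Dict, List, Tuple

# Column order as described for input1:
# (areaId, areaName, cityId, cityName, provinceId, provinceName)
AreaRow = Tuple[int, str, int, str, int, str]


def build_region_lookup(rows: List[AreaRow]) -> Dict[int, Tuple[str, str]]:
    """Map each areaId to its (city name, province name) pair."""
    return {area_id: (city, province) for area_id, _, _, city, _, province in rows}


# A few district-level rows copied from the listing above.
rows: List[AreaRow] = [
    (370202, "市南区", 370200, "青岛市", 370000, "山东省"),
    (370203, "市北区", 370200, "青岛市", 370000, "山东省"),
    (370211, "黄岛区", 370200, "青岛市", 370000, "山东省"),
]

region_by_area = build_region_lookup(rows)
print(region_by_area[370203])
```

Holding the whole table in a dictionary matches the toList step: the dimension data is small enough to keep in memory and look up per order.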
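~~~ # input2: data format: JSON
~~~ Parse the JSON message
~~~ Take these four fields from the message: data, type, database, table
~~~ Put these four fields into the case class TableObject
~~~ # filter: keep only records from yanqi_trade_orders
~~~ From dataInfo, extract the fields we need: orderId, orderNo, userId, status, totalMoney, areaId
~~~ Wrap these fields in the TradeOrder case class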
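The parse, filter, and extract steps above can be sketched in plain Python. This is an illustration of the logic only, not the original Scala/Flink code: `TableObject` and `TradeOrder` follow the names in the text, while the field types, the helper names `parse_message` and `to_trade_orders`, and the trimmed sample message are assumptions.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, Iterator, List


@dataclass
class TableObject:
    database: str
    table: str
    type: str
    data: List[Dict[str, Any]]


@dataclass
class TradeOrder:
    orderId: int
    orderNo: str
    userId: int
    status: int
    totalMoney: float
    areaId: int


def parse_message(raw: str) -> TableObject:
    """Keep only the four fields of interest from one JSON message."""
    msg = json.loads(raw)
    return TableObject(msg["database"], msg["table"], msg["type"], msg.get("data") or [])


def to_trade_orders(obj: TableObject) -> Iterator[TradeOrder]:
    """Filter on the table name, then extract the order fields from each row."""
    if obj.table != "yanqi_trade_orders":
        return
    for row in obj.data:
        yield TradeOrder(
            orderId=int(row["orderId"]),
            orderNo=row["orderNo"],
            userId=int(row["userId"]),
            status=int(row["status"]),
            totalMoney=float(row["totalMoney"]),
            areaId=int(row["areaId"]),
        )


# Trimmed sample message (assumed shape: only the fields used here).
raw = (
    '{"data": [{"orderId": "59", "orderNo": "23a0b124546", "userId": "52", '
    '"status": "2", "totalMoney": "59998.0", "areaId": "370211"}], '
    '"database": "dwshow", "table": "yanqi_trade_orders", "type": "INSERT"}'
)

orders = list(to_trade_orders(parse_message(raw)))
print(orders[0])
```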
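~~~ # Group the order data by areaId: district, city, province / city, province
~~~ # totalMoney: extract
~~~ Count the orders by pairing each one with 1: (order.., 1), e.g. ((青岛市-山东省), (1000, 1))
~~~ Group by [city-province (region)], then apply .timeWindow
~~~ # Computation logic inside the aggregate method:
~~~ # AggFunc: UDF
~~~ # accumulate the total order amount:
~~~ # accumulate the total order count: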
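The incremental aggregation described above keeps only a small running state per key instead of buffering every order. A minimal plain-Python sketch, loosely mirroring the four methods of Flink's `AggregateFunction` interface (createAccumulator, add, getResult, merge), might look as follows. The accumulator layout `(total amount, order count)` and the function names are assumptions for illustration.

```python
from typing import Iterable, Tuple

# Accumulator: (running total amount, running order count)
Acc = Tuple[float, int]


def create_accumulator() -> Acc:
    return (0.0, 0)


def add(total_money: float, acc: Acc) -> Acc:
    """Fold one order into the accumulator: add its amount, count it once."""
    return (acc[0] + total_money, acc[1] + 1)


def merge(a: Acc, b: Acc) -> Acc:
    """Combine two partial accumulators for the same key."""
    return (a[0] + b[0], a[1] + b[1])


def get_result(acc: Acc) -> Acc:
    return acc


def aggregate(amounts: Iterable[float]) -> Acc:
    acc = create_accumulator()
    for amount in amounts:
        acc = add(amount, acc)
    return get_result(acc)


# totalMoney values of the first three sample orders inserted into yanqi_trade_orders.
print(aggregate([10468.00, 6331.00, 1987.50]))  # (18786.5, 3)
```

Because each new order only updates two numbers, the state per city-province key stays constant in size regardless of how many orders fall into the window.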
### --- WindowFunc:
### --- timeWindow
~~~ Send the computed results downstream ---- print them to the console
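"Every 5 minutes, over the most recent 1 hour" corresponds to a sliding window with a size of 1 hour and a slide of 5 minutes, so each order contributes to 12 overlapping windows. The sketch below shows that assignment in plain Python. It is an assumption-laden illustration: the function name `assign_windows` is hypothetical, windows are aligned to the epoch with no offset, and whether the job uses event time or processing time is not stated in the source.

```python
from typing import List, Tuple

SIZE_MS = 60 * 60 * 1000   # window size: 1 hour
SLIDE_MS = 5 * 60 * 1000   # slide interval: 5 minutes


def assign_windows(ts_ms: int, size_ms: int = SIZE_MS, slide_ms: int = SLIDE_MS) -> List[Tuple[int, int]]:
    """Return every [start, end) window that contains the timestamp."""
    # Start of the latest window that begins at or before the timestamp.
    last_start = ts_ms - (ts_ms % slide_ms)
    windows = []
    start = last_start
    # Walk backwards one slide at a time while the window still covers ts_ms.
    while start > ts_ms - size_ms:
        windows.append((start, start + size_ms))
        start -= slide_ms
    return windows


# Timestamp taken from the "ts" field of the sample message in this post.
windows = assign_windows(1607331155758)
print(len(windows))  # 12
```

The running (amount, count) pair for a city-province key is then maintained once per window the order falls into.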
3. Data sink flow of the program
### --- ODS: incremental data --- canal writes the MySQL binlog into the Kafka topic test
use dwshow;
INSERT INTO `yanqi_trade_orders` VALUES ('1', '23a0b124546', '98', '2', '0.12', '10468.00','2', '0', '370203', '0', '0', '1', '2', '2020-06-28 18:14:01', '2020-06-28 18:14:01', '2020-10-21 22:54:31');
INSERT INTO `yanqi_trade_orders` VALUES ('2', '23a0b124546', '121', '2', '0.12', '6331.00','2', '0', '370203', '0', '0', '0', '1', '2020-06-28 16:55:02', '2020-06-28 16:55:02', '2020-10-21 22:54:32');
INSERT INTO `yanqi_trade_orders` VALUES ('3', '23a0b124546', '35', '2', '0.12', '1987.50','4', '0', '370203', '0', '0', '0', '1', '2020-06-28 12:07:01', '2020-06-28 12:07:01', '2020-10-21 22:54:34');
INSERT INTO `yanqi_trade_orders` VALUES ('4', '23a0b124546', '161', '2', '0.12', '43659.00','4', '0', '370203', '0', '0', '0', '1', '2020-06-28 13:19:48', '2020-06-28 13:19:48', '2020-10-21 22:54:35');
INSERT INTO `yanqi_trade_orders` VALUES ('5', '23a0b124546', '72', '2', '0.12', '32757.00','0', '0', '370203', '0', '0', '0', '1', '2020-06-28 22:14:21', '2020-06-28 22:14:21', '2020-10-21 22:54:37');
INSERT INTO `yanqi_trade_orders` VALUES ('6', '23a0b124546', '1', '2', '0.12', '5295.60','3', '0', '370203', '0', '0', '0', '1', '2020-06-28 18:28:48', '2020-06-28 18:28:48', '2020-10-21 22:55:03');
~~~ # DIM: same as Part 4: uses the DIM-layer wide table of three-level region details (district, city, province):
~~~ # DW--DWS
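4. Implementation
### --- Implementation:
~~~ Every 5 minutes, compute order statistics for the most recent 1 hour,
~~~ showing city, province, total transaction amount, and total order count (incremental statistics)
### --- Send the computed results downstream: print them to the console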
{
    "data": [
        {
            "orderId": "59",
            "orderNo": "23a0b124546",
            "userId": "52",
            "status": "2",
            "productMoney": "0.12",
            "totalMoney": "59998.0",
            "payMethod": "4",
            "isPay": "0",
            "areaId": "370211",
            "tradeSrc": "0",
            "tradeType": "0",
            "isRefund": "0",
            "dataFlag": "1",
            "createTime": "2020-06-28 10:51:38",
            "payTime": "2020-06-28 10:51:38",
            "modifiedTime": "2020-10-21 22:56:38"
        }
    ],
    "database": "dwshow",
    "es": 1607331155000,
    "id": 16,
    "isDdl": false,
    "mysqlType": {
        "orderId": "bigint(11)",
        "orderNo": "varchar(20)",
        "userId": "bigint(11)",
        "status": "tinyint(4)",
        "productMoney": "decimal(11,2)",
        "totalMoney": "decimal(11,2)",
        "payMethod": "tinyint(4)",
        "isPay": "tinyint(4)",
        "areaId": "int(11)",
        "tradeSrc": "tinyint(4)",
        "tradeType": "int(11)",
        "isRefund": "tinyint(4)",
        "dataFlag": "tinyint(4)",
        "createTime": "varchar(25)",
        "payTime": "varchar(25)",
        "modifiedTime": "timestamp"
    },
    "old": null,
    "pkNames": [
        "orderId"
    ],
    "sql": "",
    "sqlType": {
        "orderId": -5,
        "orderNo": 12,
        "userId": -5,
        "status": -6,
        "productMoney": 3,
        "totalMoney": 3,
        "payMethod": -6,
        "isPay": -6,
        "areaId": 4,
        "tradeSrc": -6,
        "tradeType": 4,
        "isRefund": -6,
        "dataFlag": -6,
        "createTime": 12,
        "payTime": 12,
        "modifiedTime": 93
    },
    "table": "yanqi_trade_orders",
    "ts": 1607331155758,
    "type": "INSERT"
}