|NO.Z.00021|——————————|BigDataEnd|——|Hadoop&PB级数仓.V05|——|PB数仓.v05|会员活跃度分析|DWD建表&加载数据|

一、DWD层建表和数据加载
### --- DWD层建表和数据加载

~~~     主要任务:ODS(包含json串) => DWD
~~~     json数据解析,丢弃无用数据(数据清洗),保留有效信息,并将数据展开,形成每日启动明细表。
2021-09-28 18:19:32.966 [main] INFO com.yanqi.ecommerce.AppStart - 
{"app_active":{"name":"app_active","json":
{"entry":"1","action":"1","error_code":"0"},"time":1596309585861},"attr":
{"area":"绍兴","uid":"2F10092A10","app_v":"1.1.16","event_type":"common","device_id":"1FB872-9A10010","os_type":"3.0","channel":"ML",
"language":"chinese","brand":"Huawei-2"}}
2020-07-21
二、DWD层表创建和加载数据
### --- 创建DWD层表:表的格式:parquet、分区表
~~~     # 语法

use dwd;
drop table if exists dwd.dwd_start_log;
CREATE TABLE dwd.dwd_start_log(
`device_id` string,
`area` string,
`uid` string,
`app_v` string,
`event_type` string,
`os_type` string,
`channel` string,
`language` string,
`brand` string,
`entry` string,
`action` string,
`error_code` string
)
PARTITIONED BY (dt string)
STORED AS parquet;
hive (default)> use dwd;

hive (dwd)> CREATE TABLE dwd.dwd_start_log(
          > `device_id` string,
          > `area` string,
          > `uid` string,
          > `app_v` string,
          > `event_type` string,
          > `os_type` string,
          > `channel` string,
          > `language` string,
          > `brand` string,
          > `entry` string,
          > `action` string,
          > `error_code` string
          > )
          > PARTITIONED BY (dt string)
          > STORED AS parquet;

hive (dwd)> show tables;
dwd_start_log
二、加载DWD层数据
### --- 加载DWD层数据

[root@hadoop02 ~]# vim /data/yanqidw/script/member_active/dwd_load_start.sh
#!/bin/bash
source /etc/profile

# 可以输入日期;如果未输入日期取昨天的时间
if [ -n "$1" ]
then
    do_date=$1
else
    do_date=`date -d "-1 day" +%F`
fi

# 定义要执行的SQL
sql="with tmp as
( 
select split(str, ' ')[7] line 
from ods.ods_start_log where dt='$do_date'
)
insert overwrite table dwd.dwd_start_log partition(dt='$do_date')
select get_json_object(line, '$.attr.device_id'),
get_json_object(line, '$.attr.area'),
get_json_object(line, '$.attr.uid'),
get_json_object(line, '$.attr.app_v'),
get_json_object(line, '$.attr.event_type'),
get_json_object(line, '$.attr.os_type'),
get_json_object(line, '$.attr.channel'),
get_json_object(line, '$.attr.language'),
get_json_object(line, '$.attr.brand'),
get_json_object(line, '$.app_active.json.entry'),
get_json_object(line, '$.app_active.json.action'),
get_json_object(line, '$.app_active.json.error_code')
from tmp;
"
hive -e "$sql"
### --- 为脚本授予执行权限

[root@hadoop02 ~]# chmod +x /data/yanqidw/script/member_active/dwd_load_start.sh
三、执行该脚本,验证结果
### --- 执行该脚本,验证结果
~~~     # 加载数据到ODS分区下
~~~     # 确保ods分区下是有数据存在

hive (ods)> show partitions ods.ods_start_log;
OK
partition
dt=2021-09-28
hive (ods)> select * from ods.ods_start_log limit 10;
~~~     # 执行该脚本

[root@hadoop02 ~]# /data/yanqidw/script/member_active/dwd_load_start.sh 
~~~输出参数
Total MapReduce CPU Time Spent: 0 msec
OK
_c0 _c1 _c2 _c3 _c4 _c5 _c6 _c7 _c8 _c9 _c10    _c11
~~~     # 查看数据是否通过ODS加载到DWD下

hive (dwd)> show partitions dwd_start_log;
OK
partition
dt=2021-09-28
hive (dwd)> select * from dwd.dwd_start_log limit 5;
OK
dwd_start_log.device_id dwd_start_log.area  dwd_start_log.uid   dwd_start_log.app_v dwd_start_log.event_type    dwd_start_log.os_type   dwd_start_log.channel   dwd_start_log.language  dwd_start_log.brand dwd_start_log.entry dwd_start_log.action    dwd_start_log.error_code    dwd_start_log.dt
1FB872-9A1001   连云港 2F10092A1   1.1.8   common  0.43    PN  chinese iphone-7    1   0   0   2021-09-28
1FB872-9A1002   金昌  2F10092A2   1.1.5   common  5.8.7   OF  chinese xiaomi-0    1   1   0   2021-09-28
1FB872-9A1003   句容  2F10092A3   1.1.16  common  0.99    YI  chinese iphone-3    1   0   0   2021-09-28
1FB872-9A1004   肇庆  2F10092A4   1.1.6   common  6.6.2   CD  chinese xiaomi-6    1   1   0   2021-09-28
1FB872-9A1005   武汉  2F10092A5   1.1.2   common  6.9 WG  chinese xiaomi-1    1   1   0   2021-09-28
### --- 日志文件 =》 Flume =》 HDFS =》 ODS =》 DWD ODS =》 DWD;

~~~     json数据的解析;数据清洗
~~~     下一步任务:DWD(会员的每日启动信息明细) => DWS(如何建表,如何加载数据)
~~~     活跃会员 ===> 新增会员 ===> 会员留存

 
 
 
 
 
 
 
 
 

Walter Savage Landor:strove with none,for none was worth my strife.Nature I loved and, next to Nature, Art:I warm'd both hands before the fire of life.It sinks, and I am ready to depart
                                                                                                                                                   ——W.S.Landor

 

 

posted on   yanqi_vip  阅读(40)  评论(0编辑  收藏  举报

相关博文:
阅读排行:
· 无需6万激活码!GitHub神秘组织3小时极速复刻Manus,手把手教你使用OpenManus搭建本
· Manus爆火,是硬核还是营销?
· 终于写完轮子一部分:tcp代理 了,记录一下
· 别再用vector<bool>了!Google高级工程师:这可能是STL最大的设计失误
· 单元测试从入门到精通
< 2025年3月 >
23 24 25 26 27 28 1
2 3 4 5 6 7 8
9 10 11 12 13 14 15
16 17 18 19 20 21 22
23 24 25 26 27 28 29
30 31 1 2 3 4 5

导航

统计

点击右上角即可分享
微信分享提示