hive
hive测试步骤
1.在hive中建表,一个表存放原始数据,一个表存放清洗后的数据
create table sales_initial (
day_id varchar(30),
sale_nbr varchar(30),
buy_nbr varchar(30),
cnt varchar(30),
round varchar(30)
)
row format delimited
fields terminated by ',';
2.将csv文件拖入虚拟机,然后执行下边的语句,将csv文件数据导入hive建的表中
load data local inpath '/opt/module/hadoop-3.1.3/sales_initial.csv' into table sales_initial;
3.清洗日期
insert overwrite table sales_clean
select date_add('2023-8-31', cast(day_id AS INT)) as day_id,
sale_nbr,
buy_nbr,
cnt,
round
from sales_initial ;
4.根据题目要求输出的字段,建立对应的表
5.题目实现
题目3_1统计每天各个机场的销售数量和销售金额
insert into sales_3_1 (day_id, sale_nbr, cnt, round)
select day_id, sale_nbr, sum(cnt), sum(round)
from sales_clean
where sale_nbr like 'C%'
group by day_id, sale_nbr;
题目3_2统计每天各个代理商的销售数量和销售金额
insert into sales_3_2 (day_id, sale_nbr, cnt, round)
select day_id, sale_nbr, sum(cnt), sum(round)
from sales_clean
where sale_nbr like 'O%'
group by day_id, sale_nbr;
题目3_3统计每天各个代理商的销售活跃度
insert into sales_3_3(day_id, sale_nbr, sale_number)
select day_id, sale_nbr, count(sale_nbr)
from sales_clean
where sale_nbr like 'O%'
group by day_id, sale_nbr;
题目3_4汇总统计9月1日到9月15日之间各个代理商的销售利润
create table buy as select day_id, buy_nbr, sum(cnt) as cnt, sum(round) as round from sales_initial where buy_nbr like 'O%' and day_id >= '2023-09-01' AND day_id <= '2023-09-15' group by day_id, buy_nbr
create table sale as select day_id, sale_nbr, sum(cnt) as cnt , sum(round) as round from sales_initial where sale_nbr like 'O%' and day_id >= '2023-09-01' AND day_id <= '2021-09-15' group by day_id, sale_nbr
create table earned as select a.day_id as day_id, b.sale_nbr as nbr, a.cnt as cnt_buy, a.round as round_buy, b.cnt as cnt_sale, b.round as round_sale, b.round-a.round as money from buy a join sale b on a.buy_nbr = b.sale_nbr and a.day_id = b.day_id
6.将以上统计结果保存到本机的mysql中
实例:
bin/sqoop export \
--connect jdbc:mysql://localhost:3306/test \
--username root \
--password ******** \
--table sales_3_1 \
--export-dir /user/hive/warehouse/sales_3_1 \
--input-fields-terminated-by '\t'
数据表部分展示
7.echarts数据可视化实现
下面是echarts的官网的一个新手指导,我以前没了解过,因此需要学一下
快速上手 - Handbook - Apache ECharts
这里提供了大量的指导,因此剩下的难点就是如何将mysql表中的数据传入表中
这里我使用了一种简单粗暴的方法,但有的数据好像还有些问题
目前只能达到这个水平.