2021/10/23

1. Import the data into Hive and process it there; clean the data by removing the parentheses.

Import the tables and strip the parentheses:
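A minimal sketch of what this cleaning step might look like, assuming the raw rows are first loaded into a staging table. The names nsrxx_raw, nsrxx_clean, nsr_mc and the file path are hypothetical and do not come from these notes; regexp_replace is a Hive built-in string function.

```sql
-- Hypothetical staging table: table name, column names and path are assumptions.
CREATE TABLE IF NOT EXISTS nsrxx_raw (nsr_id STRING, nsr_mc STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;

LOAD DATA LOCAL INPATH '/path/to/nsrxx.csv' INTO TABLE nsrxx_raw;

-- Strip ASCII and full-width parentheses from each field.
CREATE TABLE nsrxx_clean AS
SELECT regexp_replace(nsr_id, '[()（）]', '') AS nsr_id,
       regexp_replace(nsr_mc, '[()（）]', '') AS nsr_mc
FROM nsrxx_raw;
```

Inside a character class the parentheses need no escaping, so one pattern covers both the opening and closing brackets.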

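Create test1 and test2 to hold, respectively, the taxpayers that never sell (candidates for "purchases only, no sales") and the taxpayers that never buy (candidates for "sales only, no purchases").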

Create table test1:

CREATE TABLE test1 (nsr_id STRING)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;

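Insert the taxpayers that are in the taxpayer table (nsrxx) but whose ID never appears as a seller ID (xf_id) in the invoice table (zzsfp):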

insert into test1(nsr_id) select distinct nsr_id from nsrxx where nsr_id not in (select xf_id from zzsfp);
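One thing to watch with NOT IN: under standard SQL semantics, if the subquery returns any NULL, the predicate never evaluates to true and no rows are inserted. Here is a sketch of an equivalent anti-join, assuming zzsfp.xf_id might contain NULLs (these notes do not say whether it does):

```sql
-- Assumption: xf_id may be NULL for some invoices, so NULLs are filtered out first.
INSERT INTO TABLE test1
SELECT DISTINCT n.nsr_id
FROM nsrxx n
LEFT JOIN (SELECT DISTINCT xf_id FROM zzsfp WHERE xf_id IS NOT NULL) s
  ON n.nsr_id = s.xf_id
WHERE s.xf_id IS NULL;  -- keep only taxpayers with no matching seller row
```

The same pattern works for the buyer side with gf_id in place of xf_id.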

This identifies the taxpayers with no outgoing (sales) invoices.

 

Create table test2:

CREATE TABLE test2 (nsr_id STRING)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;

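Insert the taxpayers that are in the taxpayer table (nsrxx) but whose ID never appears as a buyer ID (gf_id) in the invoice table (zzsfp):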

insert into test2(nsr_id) select distinct nsr_id from nsrxx where nsr_id not in (select gf_id from zzsfp);

This identifies the taxpayers with no incoming (purchase) invoices.

 

 

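Combine the two tables to work out which enterprises only purchase and never sell, and which only sell and never purchase: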

insert into data(nsr_id) select distinct nsr_id from yc3 where nsr_id not in (select nsr_id from yc2);

The results are stored in the data table, obtained by relating test1 and test2.
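Assuming test1 holds the never-sellers and test2 the never-buyers as built above, the two categories are the set differences between those tables. Here is a sketch that labels both in one pass; the table data_labeled, the column yc_type and the label values are hypothetical names, not taken from these notes:

```sql
-- Hypothetical result table: name and yc_type column are assumptions.
CREATE TABLE IF NOT EXISTS data_labeled (nsr_id STRING, yc_type STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;

INSERT INTO TABLE data_labeled
SELECT u.nsr_id, u.yc_type
FROM (
    -- Never sells but does buy: in test1, not in test2.
    SELECT t1.nsr_id AS nsr_id, 'purchases_only' AS yc_type
    FROM test1 t1
    LEFT JOIN test2 t2 ON t1.nsr_id = t2.nsr_id
    WHERE t2.nsr_id IS NULL
    UNION ALL
    -- Never buys but does sell: in test2, not in test1.
    SELECT t2.nsr_id AS nsr_id, 'sales_only' AS yc_type
    FROM test2 t2
    LEFT JOIN test1 t1 ON t2.nsr_id = t1.nsr_id
    WHERE t1.nsr_id IS NULL
) u;
```

A taxpayer that appears in both test1 and test2 has no invoices at all, so it falls into neither category and is left out.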

 

posted @ 2021-10-23 20:00  小强哥in