网站用户行为分析

网站用户行为分析

  1.  
  1. 操作过程

1)本地数据集上传到数据仓库Hive

A.数据集下载与查看

 

B.数据集预处理

 

 

 

C.把数据集导入HDFS中

 

D.Hive上创建数据库

 

 

2)Hive数据分析

1)用户行为分析需求:2014-12-11~12号有多少条购买商品的记录

 

  • 语句:select count(*) from bigdata_user where visit_date >'2014-12-10' and visit_date <'2014-12-13' and behavior_type='4' limit 10;

 

2)用户行为分析需求:分析每月1-31号购买情况

  • 语句:select day(visit_date) from bigdata_user limit 10;
  • 行为日期

 

  • 购买行为的记录数、不同用户数
  • 语句:select count(distinct uid) from bigdata_user where behavior_type='4’

 

  • 按日期统计记录数、用户数
  • 语句:select count(distinct uid),day(visit_date) from bigdata_user where behavior_type='4' group by day(visit_date) limit 10;

 

语句:select count(*),day(visit_date) from bigdata_user where behavior_type='4' group by day(visit_date) limit 10;

 

语句:create table day_count as select count(*),day(visit_date) from bigdata_user where behavior_type='4' group by day(visit_date);

 

语句:create table day_uid as select count(distinct uid),day(visit_date) from bigdata_user where behavior_type='4' group by day(visit_date);

 

  • 12号+购买行为
  • 语句:select * from bigdata_user where behavior_type='4'and visit_date='2014-12-12' limit 10;

 

  • 按用户编号分组
  • 语句:select uid from bigdata_user where behavior_type='4'and visit_date='2014-12-12' group by uid limit 10;

 

  • 按用户分组统计
  • 语句:select uid,count(*) from bigdata_user where behavior_type='4'and visit_date='2014-12-12' group by uid limit 10;

 

  • 12号,购买,4项以上
  • 语句:select uid,count(*) from bigdata_user where behavior_type='4'and visit_date='2014-12-12' group by uid having count(*)>4 limit 10;

 

  • 语句:select uid,count(*) from bigdata_user where behavior_type='4'and visit_date='2014-12-12' group by uid having count(behavior_type='4')>4 limit 10;

 

  • 2014-12-12号当天广东购买商品数
  • 语句:select count(*)from bigdata_user where visit_date='2014-12-12' and province='广东';

 

按省份统计购买数量

  • 语句:select count(*)from bigdata_user group by province;

 

  • 2014-12-12号当天的商品购买与浏览比例
  • 语句:select c.*,c.c4/c.c1 c41 from (select uid,count(*)countall, sum(case when  behavior_type='4' then 1 else 0 end)c4, sum(case when behavior_type='1' then 1 else 0 end)c1 from bigdata_user where visit_date='2014-12-12' group by uid)c order by c41 desc limit 10;

 

用户10001082在2014-12-12号当天活跃度:该用户点击行为占该天所有点击行为的比例

  • 语句:

select '10001082' uid,a.users,b.user_number,b.user_number/a.users rate from ( select count(*) users from bigdata_user where visit_date='2014-12-12' and behavior_type='1' ) a ,( select count(*) user_number  from bigdata_user where visit_date='2014-12-12' and behavior_type='1' and uid=10001082  ) b;

 

  • 2014-12-12号当天购买4件商品以上的用户
  • 语句:select uid,count(*)from bigdata_user where behavior_type='4' and visit_date='2014-12-12' group by uid having count(*)>4;

 

 

3)自定义需求:

12月10号买了超过四种商品的用户id

语句:select uid from bigdata_user where behavior_type='4' and visit_date='2014-12-10' group by uid having count(behavior_type='4')>5;

 

10号,购买,3项以上

 语句:select uid,count(*) from bigdata_user where behavior_type='4'and visit_date='2014-12-10' group by uid having count(*)>3 limit 10;

 

通国际当天购买商品钟类为3的天数,并保存到表中

语句:create table day_count_3 as select count(*),day(visit_date) from bigdata_user where behavior_type='3' group by day(visit_date);

 

3)Hive、MySQL、HBase数据互导

创建临时表user_actionHive会自动在HDFS文件系统中创建对应的数据文件

 

命令查看一下,确认这个数据文件在HDFS中确实被创建,确认这个数据文件在HDFS中确实被创建

 

dblab.bigdata_user数据插入到dblab.user_action表

 

查询上面的插入命令是否成功执行

 

以上user_action-灌入数据成功

  1. Hive数据分析结果(如用户购买与浏览比例),从Hive传输到MySQL中。

显示数据库并使用dblab

 

查看数据库的编码

 

下面在MySQL的数据库dblab中创建一个新表user_action,并设置其编码为utf-8

 

导入数据

 

 

  1. Hive数据分析结果(如用户购买与浏览比例),从MySQL传输到Hbase中。

创建表user_action

 

导入数据

 

查看HBase中user_action表的数据

 

posted @ 2021-01-06 16:52  赖泽梵  阅读(273)  评论(0编辑  收藏  举报