kettle结合MySQL生成保留最近6个月月度报告_20161009
之前计算用户ID各月的金额(各月在列字段),用的是下面代码
1 SELECT b.城市,SUM(IF(b.年月=201607,b.金额,NULL)) AS 7月金额,SUM(IF(b.年月=201608,b.金额,NULL)) AS 8月金额,SUM(IF(b.年月=201609,b.金额,NULL)) AS 9月金额 2 FROM ( 3 SELECT city AS 城市,DATE_FORMAT(order_time,"%Y%m") AS 年月,SUM(pay_money) AS 金额 4 FROM test_a03order AS a 5 GROUP BY city,DATE_FORMAT(order_time,"%Y%m") 6 ) AS b 7 GROUP BY b.城市
a.日常报表中一般下个月月初做上个月报表,随着时间推移文件越来越大,很多历史数据或许也没有多少价值,如果我们想生成固定的保留几个月的数据,比如总是保持最近6个月的数据,如何实现?原来如果计划保持最近6个月的 出报表的时候 就需要手动修改sum(if())的代码 把下个月的添加进来 把第一个月的去掉(现在是7,8,9月,下个月换为8,9,10,把10月加进来,7月删除 这样保持最近3个月) 有点麻烦
b.如果自动保持最近6个月的数据 大致思路是判断数据源的数据月份与当前月的间隔始终保持在1-6之间,添加这样的判断字段可以有多个方式,这里有两个 实质上是一个原理。
第一种实现办法是添加一个字段和当前年月间隔,用case when 进行判断处理
1.添加 月最后一天 和 与当前月间隔几月 字段
1 SELECT a1.city AS 城市,a1.username AS 用户ID,DATE_FORMAT(a1.order_date,"%Y%m") AS 年月,SUM(a1.pay_money) AS 金额,LAST_DAY(a1.order_date) AS 月最后一天, 2 CASE 3 WHEN DATE_FORMAT(LAST_DAY(a1.order_date),"%Y%m")=DATE_FORMAT(DATE_ADD(DATE_ADD(LAST_DAY(CURRENT_DATE),INTERVAL 1 DAY),INTERVAL - 7 MONTH),"%Y%m") THEN "6" 4 WHEN DATE_FORMAT(LAST_DAY(a1.order_date),"%Y%m")=DATE_FORMAT(DATE_ADD(DATE_ADD(LAST_DAY(CURRENT_DATE),INTERVAL 1 DAY),INTERVAL - 6 MONTH),"%Y%m") THEN "5" 5 WHEN DATE_FORMAT(LAST_DAY(a1.order_date),"%Y%m")=DATE_FORMAT(DATE_ADD(DATE_ADD(LAST_DAY(CURRENT_DATE),INTERVAL 1 DAY),INTERVAL - 5 MONTH),"%Y%m") THEN "4" 6 WHEN DATE_FORMAT(LAST_DAY(a1.order_date),"%Y%m")=DATE_FORMAT(DATE_ADD(DATE_ADD(LAST_DAY(CURRENT_DATE),INTERVAL 1 DAY),INTERVAL - 4 MONTH),"%Y%m") THEN "3" 7 WHEN DATE_FORMAT(LAST_DAY(a1.order_date),"%Y%m")=DATE_FORMAT(DATE_ADD(DATE_ADD(LAST_DAY(CURRENT_DATE),INTERVAL 1 DAY),INTERVAL - 3 MONTH),"%Y%m") THEN "2" 8 WHEN DATE_FORMAT(LAST_DAY(a1.order_date),"%Y%m")=DATE_FORMAT(DATE_ADD(DATE_ADD(LAST_DAY(CURRENT_DATE),INTERVAL 1 DAY),INTERVAL - 2 MONTH),"%Y%m") THEN "1" 9 WHEN DATE_FORMAT(LAST_DAY(a1.order_date),"%Y%m")=DATE_FORMAT(DATE_ADD(DATE_ADD(LAST_DAY(CURRENT_DATE),INTERVAL 1 DAY),INTERVAL - 1 MONTH),"%Y%m") THEN "0" 10 ELSE NULL END 与当前月间隔几月 11 FROM `test_a03order` AS a1 12 GROUP BY a1.city ,a1.username,DATE_FORMAT(a1.order_date,"%Y%m")
2、sum((if))函数行转列 通过控制与当前月间隔几月等于几 保留最近几个月的数据 这样就不用手动修改了 下个月保留的是最近6个月的数据
1 SELECT a.城市, 2 SUM(IF(与当前月间隔几月=6,金额,NULL)) AS "前6月",SUM(IF(与当前月间隔几月=5,金额,NULL)) AS "前5月", 3 SUM(IF(与当前月间隔几月=4,金额,NULL)) AS "前4月",SUM(IF(与当前月间隔几月=3,金额,NULL)) AS "前3月", 4 SUM(IF(与当前月间隔几月=2,金额,NULL)) AS "前2月",SUM(IF(与当前月间隔几月=1,金额,NULL)) AS "前1月" 5 FROM ( 6 SELECT a1.city AS 城市,a1.username AS 用户ID,DATE_FORMAT(a1.order_date,"%Y%m") AS 年月,SUM(a1.pay_money) AS 金额,LAST_DAY(a1.order_date) AS 月最后一天, 7 CASE 8 WHEN DATE_FORMAT(LAST_DAY(a1.order_date),"%Y%m")=DATE_FORMAT(DATE_ADD(DATE_ADD(LAST_DAY(CURRENT_DATE),INTERVAL 1 DAY),INTERVAL - 7 MONTH),"%Y%m") THEN "6" 9 WHEN DATE_FORMAT(LAST_DAY(a1.order_date),"%Y%m")=DATE_FORMAT(DATE_ADD(DATE_ADD(LAST_DAY(CURRENT_DATE),INTERVAL 1 DAY),INTERVAL - 6 MONTH),"%Y%m") THEN "5" 10 WHEN DATE_FORMAT(LAST_DAY(a1.order_date),"%Y%m")=DATE_FORMAT(DATE_ADD(DATE_ADD(LAST_DAY(CURRENT_DATE),INTERVAL 1 DAY),INTERVAL - 5 MONTH),"%Y%m") THEN "4" 11 WHEN DATE_FORMAT(LAST_DAY(a1.order_date),"%Y%m")=DATE_FORMAT(DATE_ADD(DATE_ADD(LAST_DAY(CURRENT_DATE),INTERVAL 1 DAY),INTERVAL - 4 MONTH),"%Y%m") THEN "3" 12 WHEN DATE_FORMAT(LAST_DAY(a1.order_date),"%Y%m")=DATE_FORMAT(DATE_ADD(DATE_ADD(LAST_DAY(CURRENT_DATE),INTERVAL 1 DAY),INTERVAL - 3 MONTH),"%Y%m") THEN "2" 13 WHEN DATE_FORMAT(LAST_DAY(a1.order_date),"%Y%m")=DATE_FORMAT(DATE_ADD(DATE_ADD(LAST_DAY(CURRENT_DATE),INTERVAL 1 DAY),INTERVAL - 2 MONTH),"%Y%m") THEN "1" 14 WHEN DATE_FORMAT(LAST_DAY(a1.order_date),"%Y%m")=DATE_FORMAT(DATE_ADD(DATE_ADD(LAST_DAY(CURRENT_DATE),INTERVAL 1 DAY),INTERVAL - 1 MONTH),"%Y%m") THEN "0" 15 ELSE NULL END 与当前月间隔几月 16 FROM `test_a03order` AS a1 17 GROUP BY a1.city ,a1.username,DATE_FORMAT(a1.order_date,"%Y%m") 18 ) AS a 19 GROUP BY a.城市 20 ORDER BY a.城市
第二种办法是只添加月最后一天字段 是通过判断和当前日期所在月和子表里月最后一天所处的年月的月间隔 进行数据源时间的截取 以及sum(if())函数的行转列 实现最终目的
#当前月和子表里月最后一天所处的年月的月间隔为6 就是前6月数据
PERIOD_DIFF(DATE_FORMAT(CURRENT_DATE,"%Y%m"),DATE_FORMAT(月最后一天,"%Y%m"))=6
这个函数是判断月间隔 不考虑月天数
1 SELECT a.城市, 2 SUM(IF(PERIOD_DIFF(DATE_FORMAT(CURRENT_DATE,"%Y%m"),DATE_FORMAT(月最后一天,"%Y%m"))=6,金额,NULL)) AS "前6月",SUM(IF(PERIOD_DIFF(DATE_FORMAT(CURRENT_DATE,"%Y%m"),DATE_FORMAT(月最后一天,"%Y%m"))=5,金额,NULL)) AS "前5月", 3 SUM(IF(PERIOD_DIFF(DATE_FORMAT(CURRENT_DATE,"%Y%m"),DATE_FORMAT(月最后一天,"%Y%m"))=4,金额,NULL)) AS "前4月",SUM(IF(PERIOD_DIFF(DATE_FORMAT(CURRENT_DATE,"%Y%m"),DATE_FORMAT(月最后一天,"%Y%m"))=3,金额,NULL)) AS "前3月", 4 SUM(IF(PERIOD_DIFF(DATE_FORMAT(CURRENT_DATE,"%Y%m"),DATE_FORMAT(月最后一天,"%Y%m"))=2,金额,NULL)) AS "前2月",SUM(IF(PERIOD_DIFF(DATE_FORMAT(CURRENT_DATE,"%Y%m"),DATE_FORMAT(月最后一天,"%Y%m"))=1,金额,NULL)) AS "前1月" 5 FROM ( 6 SELECT a1.city AS 城市,a1.username AS 用户ID,DATE_FORMAT(a1.order_date,"%Y%m") AS 年月,SUM(a1.pay_money) AS 金额,LAST_DAY(a1.order_date) AS 月最后一天 7 FROM `test_a03order` AS a1 8 GROUP BY a1.city ,a1.username,DATE_FORMAT(a1.order_date,"%Y%m") 9 ) AS a 10 GROUP BY a.城市 11 ORDER BY a.城市
因此推荐使用第二种办法代码短 第一种办法细致点容易理解 是对第二种的拆解
3、在excel里设置模板 把前6月字样用函数替换掉
excel里函数 设置表头 TEXT(DATE(YEAR(NOW()),MONTH(NOW())-6,1),"yyyymm")
Kettle步骤里 Microsoft Excel 输出的时候选择不输出表头就可以自动更新了