Dynamic Pivot
最近碰巧读到了关于sql server pivot 的两篇文章。详见:
1 Cross Tabs and Pivots, Part 1 – Converting Rows to Columns
2 Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs
这两篇文章对sql server里面行转列的两种技术cross tabs和pivot进行了分析和比较。然后着重介绍了动态cross tab的技术。其实这两种旋转在日常中用到的蛮多的,但是动态的旋转我以前考虑过,一直没有去实现,读了这两篇文章,收获颇丰。这两篇文章没有对动态pivot进行分析。其实动态pivot和动态cross tab原理上是一样的,只需要找到变化的部分,然后把这部分动态转换出来就可以了。本篇文章介绍动态pivot。
首先是填充样例数据表,这两篇文章中有介绍,我就借用下:
SELECT TOP 1000000 RowNum = IDENTITY(INT,1,1), SomeInt = ABS(CHECKSUM(NEWID()))%50000+1, SomeLetters2 = CHAR(ABS(CHECKSUM(NEWID()))%26+65) + CHAR(ABS(CHECKSUM(NEWID()))%26+65), SomeCSV = CAST('Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10' AS VARCHAR(80)), SomeMoney = CAST(ABS(CHECKSUM(NEWID()))%10000 /100.0 AS MONEY), SomeDate = CAST(RAND(CHECKSUM(NEWID()))*3653.0+36524.0 AS DATETIME), SomeHex12 = RIGHT(NEWID(),12) INTO dbo.JBMTest FROM Master.dbo.SysColumns t1 CROSS JOIN Master.dbo.SysColumns t2 --===== Add the Primary Key ALTER TABLE dbo.JBMTest ADD PRIMARY KEY CLUSTERED (RowNum)
以上填充了一百万条随机数据到表JBMTest中,这个表中有时间列SomeDate,数据列SomeMoney,还有标志列SomeLetters2。要求是对所有的SomeLetters2,按年月统计somemoney。在结果中所有的时间显示为列。
构造动态sql语句的前提是你首先必须根据你的业务逻辑写出静态sql 语句,然后针对变化的部分,找出替代变化的部分的脚本。首先找出实现结果的静态脚本,如下:
select SomeLetters2, --dynamic part 1 coalesce([Dec 2005], 0) [Dec 2005], coalesce([Jan 2006], 0) [Jan 2006], coalesce([Feb 2006], 0) [Feb 2006], coalesce([Mar 2006], 0) [Mar 2006], coalesce([Apr 2006], 0) [Apr 2006], coalesce([ALL], 0) [ALL] -- start part from ( select case when grouping(SomeLetters2) = 1 then '1-Total' else SomeLetters2 end as SomeLetters2, case when GROUPING( DATEADD(mm, datediff(mm, 0, SomeDate), 0)) = 1 then'ALL' else stuff(convert(char(11), DATEADD(mm, datediff(mm, 0, SomeDate), 0), 100), 4, 3, '') end as dt, SUM(somemoney) sm from dbo.jbmtest where SomeDate >= '12-5-2005' and SomeDate <= '4-2-2006' and SomeLetters2 in ('AI', 'AJ', 'AC') group by SomeLetters2, DATEADD(mm, datediff(mm, 0, SomeDate), 0) with cube ) as d pivot( sum(sm) for dt in( --dynamic part 2 [Dec 2005],[Jan 2006],[Feb 2006],[Mar 2006],[Apr 2006],[ALL] ) ) as p order by SomeLetters2
如上所示,动态的部分有2个地方,一个是在select的list里面,这里是选出最终的列;另外是在pivot语句的for语句的in list。其实两个地方无非是列出给定2个日期间的所有月。对于给点的两个日期,找出这两个日期间的以月为单位的时间,只需要使用到dateadd函数和数字表,如下:
declare @startDate datetime = '2005-12-5', @endDate datetime = '2006-3-2' set @endDate = DATEADD(MM, DATEDIFF(mm, 0, @enddate)+1, 0) select N, stuff(convert(char(11), DATEADD(mm, N-1, @startDate), 100), 4, 3, '') dtstr from dbo.Number where N <= DATEDIFF(mm, @startDate, @endDate)
以上的结果里面所有的月以行的显示在结果里面。现在需要把这些行转换成我需要的字符串。这个如下:
declare @startDate datetime = '2005-12-5', @endDate datetime = '2006-1-2'; set @endDate = DATEADD(MM, DATEDIFF(mm, 0, @enddate)+1, 0) declare @part1 nvarchar(max) = 'select SomeLetters2,' + char(10) ; declare @temp1 nvarchar(4000) = 'coalesce([tmpRep], 0) [tmpRep],' + char(10); declare @temp2 nvarchar(4000) = '[tmpRep],'; declare @part2 nvarchar(max) = ''; select @part1 = @part1 + dtpart, @part2 = @part2 + dtpart2 from( select REPLACE(@temp1, 'tmpRep', d.dtstr) dtpart, REPLACE(@temp2, 'tmpRep', d.dtstr) dtpart2 from( select N, stuff(convert(char(11), DATEADD(mm, N-1, @startDate), 100), 4, 3, '') dtstr from dbo.Number where N <= DATEDIFF(mm, @startDate, @endDate) ) as d ) as d2 set @part1 = @part1 + REPLACE(@temp1, 'tmpRep', 'ALL') set @part2 = @part2 + REPLACE(@temp2, 'tmpRep', 'ALL') --remove the last chart ',' @part1 have char(10) and ',' in the right, remove 2 chars set @part1 = LEFT(@part1, len(@part1) - 2) set @part2 = LEFT(@part2, len(@part2) -1 ) print @part1 print @part2
@part1是select list里面的脚本,如下:
select SomeLetters2,
coalesce([Dec 2005], 0) [Dec 2005],
coalesce([Jan 2006], 0) [Jan 2006],
coalesce([ALL], 0) [ALL]
@part2是pivot for in list里面的脚本,如下:
[Dec 2005],[Jan 2006],[ALL]
这样所有的动态的部分都已经实现了,余下的就是静态的部分了。
declare @part3 nvarchar(max) = ' from ( select case when grouping(SomeLetters2) = 1 then ' + quotename('1-Total', '''') + ' else SomeLetters2 end as SomeLetters2, case when GROUPING( DATEADD(mm, datediff(mm, 0, SomeDate), 0)) = 1 then' + quotename('ALL', '''')+ ' else stuff(convert(char(11), DATEADD(mm, datediff(mm, 0, SomeDate), 0), 100), 4, 3, '''') end as dt, SUM(somemoney) sm from dbo.jbmtest where SomeDate >= ' + quotename(@startDate, '''') + ' and SomeDate < ' +quotename(@endDate, '''') + ' group by SomeLetters2, DATEADD(mm, datediff(mm, 0, SomeDate), 0) with cube ) as d pivot( sum(sm) for dt in(tmpRep) ) as p order by SomeLetters2 '; set @part3 = @part1 + REPLACE(@part3, 'tmpRep', @part2) print @part3 exec (@part3)
在静态的部分里面需要注意,条件SomeDate >= '12 -5-2005' and SomeDate <= '4-2-2006' 需要转换,如果直接使用变量,会导致datatime无法转换成字符串类型的错误,使用quotename函数;其二,所有’’需要使用’’’’代替,否则会导致错误。
最后调用exec,可以得到结果。
最终所有的脚本如下:
declare @startDate datetime = '2005-12-5', @endDate datetime = '2006-1-2'; set @endDate = DATEADD(MM, DATEDIFF(mm, 0, @enddate)+1, 0) declare @part1 nvarchar(max) = 'select SomeLetters2,' + char(10) ; declare @temp1 nvarchar(4000) = 'coalesce([tmpRep], 0) [tmpRep],' + char(10); declare @temp2 nvarchar(4000) = '[tmpRep],'; declare @part2 nvarchar(max) = ''; select @part1 = @part1 + dtpart, @part2 = @part2 + dtpart2 from( select REPLACE(@temp1, 'tmpRep', d.dtstr) dtpart, REPLACE(@temp2, 'tmpRep', d.dtstr) dtpart2 from( select N, stuff(convert(char(11), DATEADD(mm, N-1, @startDate), 100), 4, 3, '') dtstr from dbo.Number where N <= DATEDIFF(mm, @startDate, @endDate) ) as d ) as d2 set @part1 = @part1 + REPLACE(@temp1, 'tmpRep', 'ALL') set @part2 = @part2 + REPLACE(@temp2, 'tmpRep', 'ALL') --remove the last chart ',' @part1 have char(10) and ',' in the right, remove 2 chars set @part1 = LEFT(@part1, len(@part1) - 2) set @part2 = LEFT(@part2, len(@part2) -1 ) --print @part1 --print @part2 declare @part3 nvarchar(max) = ' from ( select case when grouping(SomeLetters2) = 1 then ' + quotename('1-Total', '''') + ' else SomeLetters2 end as SomeLetters2, case when GROUPING( DATEADD(mm, datediff(mm, 0, SomeDate), 0)) = 1 then' + quotename('ALL', '''')+ ' else stuff(convert(char(11), DATEADD(mm, datediff(mm, 0, SomeDate), 0), 100), 4, 3, '''') end as dt, SUM(somemoney) sm from dbo.jbmtest where SomeDate >= ' + quotename(@startDate, '''') + ' and SomeDate < ' +quotename(@endDate, '''') + ' group by SomeLetters2, DATEADD(mm, datediff(mm, 0, SomeDate), 0) with cube ) as d pivot( sum(sm) for dt in(tmpRep) ) as p order by SomeLetters2 '; set @part3 = @part1 + REPLACE(@part3, 'tmpRep', @part2) print @part3 exec (@part3)