一个有趣的SQL Server 层级汇总数据问题
看SQL Server 大V宋大侠的博客文章,发现了一个有趣的sql server 层级汇总数据问题。
具体的问题如下:
parent_id emp_id emp_name total_amout
NULL 2 Andrew 200
2 1 Nancy 100
2 3 Janet 120
3 4 Michael 80
1 5 Robert 50
每个员工的总销售额=自己的销售额+其下级员工的总销售额,
比如:
Andrew = 200_100_120_80_50=550
Nancy = 100+50=150
Janet = 120+80=200
Michael = 80
Robert = 50
这个用SQL怎样可以查询得到,请教一下大家???
NULL 2 Andrew 200
2 1 Nancy 100
2 3 Janet 120
3 4 Michael 80
1 5 Robert 50
每个员工的总销售额=自己的销售额+其下级员工的总销售额,
比如:
Andrew = 200_100_120_80_50=550
Nancy = 100+50=150
Janet = 120+80=200
Michael = 80
Robert = 50
这个用SQL怎样可以查询得到,请教一下大家???
从数据表中的数据以及问题阐述来看可以确定该数据表是个父子层级类型数据表,这个在纬度类型中是一种比较常见:父子纬度。 从名字解释来看就是一种自引用的数据表,大家最熟悉的组织机构就是具有这种层级结构,其中不同级别的机构具有共同的特性。这种层级结构如下图所示:
以上图来自百度查询得到。
看来宋大侠针对该问题的解决方案(CTE递归查询+游标),还有其他的博友的评论(有的支出数据表设计的不完善,还有通过虚拟出层级层级字符串列来实现的,还有获取当前层级以下所有层级的汇总封装成存储等等)。
为了实现该问题,我使用的是CTE递归查询+APPLY,具体的实现思路如下:
1、通过CTE递归查询虚拟出若干列,其中就有层级索引字符串列(该列表示具有层级的标识ID的字符串格式,便于查找)。
2、使用APPLY来实现汇总数据(当然也可以使用SELECT + SUBQUERY)。
具体演示实现代码如下:
1 IF OBJECT_ID(N'dbo.MyEmp', N'U') IS NOT NULL
2 BEGIN
3 DROP TABLE dbo.MyEmp;
4 END
5 GO
6
7 CREATE TABLE dbo.MyEmp (
8 MyEmpID INT NOT NULL,
9 ParentID INT NULL,
10 MyEmpName NVARCHAR(20) NOT NULL,
11 HoursSalary INT NOT NULL
12 );
13 GO
14
15 IF OBJECT_ID(N'PK_U_CL_MyEmp_MyEmpID', N'PK') IS NULL
16 BEGIN
17 ALTER TABLE [dbo].[MyEmp] ADD CONSTRAINT [PK_U_CL_MyEmp_MyEmpID] PRIMARY KEY CLUSTERED
18 (
19 [MyEmpID] ASC
20 )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90)
21 ON [PRIMARY];
22 END
23 GO
24
25 IF OBJECT_ID(N'FK_MyEmp_MyEmp_ParentID', N'F') IS NULL
26 BEGIN
27 ALTER TABLE dbo.MyEmp ADD CONSTRAINT FK_MyEmp_MyEmp_ParentID FOREIGN KEY (ParentID) REFERENCES dbo.MyEmp (MyEmpID);
28 END
29 GO
30
31 -- Insert Test Data
32 INSERT INTO dbo.MyEmp (MyEmpID, ParentID, MyEmpName, HoursSalary) VALUES
33 (1, NULL, N'Andrew', 200),
34 (2, 1, N'Nancy', 100),
35 (3, 1, N'Janet', 120),
36 (4, 3, N'Michael', 80),
37 (5, 2, N'Robert', 50)
38 GO
39
40 ;WITH tData (MyEmpID, MyEmpName, ParentID, ParentName, HoursSalary, ParentHierarchyIndex, HierarchyIndex, LevelID, HierarchyName, HierarchyName2) AS (
41 -- 基准点查询
42 SELECT MyEmpID /*雇员ID*/
43 ,MyEmpName /*雇员名称*/
44 ,ISNULL(ParentID, 0) AS ParentID /*父雇员ID*/
45 ,CAST(N'' AS NVARCHAR(20)) AS ParentName /*父雇员名称*/
46 ,HoursSalary /*小时薪水*/
47 ,CAST(CONCAT(',', ISNULL(ParentID, 0), ',') AS VARCHAR(300)) AS ParentHierarchyIndex /*父层级索引字符串*/
48 ,CAST(CONCAT(',', ISNULL(ParentID, 0), ',', MyEmpID, ',') AS VARCHAR(300)) AS HierarchyIndex /*层级索引字符串串,包含当前层级*/
49 ,CAST(1 AS INT) AS LevelID /*层级ID,根层级为1,层级越深则数字越大*/
50 ,CAST(MyEmpName AS NVARCHAR(800)) AS HierarchyName /*层级名称,树形结构显示*/
51 ,CAST(MyEmpName AS NVARCHAR(80)) AS HierarchyName2 /*层级名称2,水平结构显示*/
52 FROM dbo.MyEmp
53 WHERE ParentID IS NULL
54 -- 递归查询
55 UNION ALL
56 SELECT T.MyEmpID
57 ,T.MyEmpName
58 ,T.ParentID
59 ,T2.MyEmpName
60 ,T.HoursSalary
61 ,CAST(CONCAT(T2.ParentHierarchyIndex, T.ParentID, ',') AS VARCHAR(300)) AS ParentHierarchyIndex
62 ,CAST(CONCAT(T2.HierarchyIndex, T.MyEmpID, ',') AS VARCHAR(300)) AS HierarchyIndex
63 ,T2.LevelID + 1 AS LevelID
64 ,CAST(CONCAT(REPLICATE(N'| ', T2.LevelID), T.MyEmpName) AS NVARCHAR(800)) AS HierarchyName
65 ,CAST(CONCAT(T2.HierarchyName2, '->', T.MyEmpName) AS NVARCHAR(80)) AS HierarchyName2
66 FROM dbo.MyEmp AS T
67 INNER JOIN tData AS T2
68 ON T.ParentID = T2.MyEmpID
69 )
70
71 -- 使用HierarchyIndex来实现
72
73 -- CROSS APPLY
74 SELECT T.*, T2.TotalSalary AS TotalSalary
75 FROM tData AS T
76 CROSS APPLY (SELECT SUM(tData.HoursSalary) AS TotalSalary
77 FROM tData
78 WHERE HierarchyIndex LIKE CONCAT(T.HierarchyIndex, '%')) AS T2
79 ORDER BY T.HierarchyIndex ASC;
80
81 -- SELECT + 子查询
82 --SELECT T.*, TotalSalary = ( SELECT SUM(tData.HoursSalary) FROM tData WHERE tData.HierarchyIndex LIKE CONCAT(T.HierarchyIndex, '%'))
83 --FROM tData AS T
84 --ORDER BY T.HierarchyIndex ASC;
85
86 -- 使用ParentHierarchyIndex
87 -- CROSS APPLY
88 --SELECT T.*, T.HoursSalary + T2.DownMemberTotalHoursSalary AS TotalSalary
89 --FROM tData AS T
90 -- CROSS APPLY (
91 -- SELECT ISNULL(SUM(tData.HoursSalary), 0) AS DownMemberTotalHoursSalary
92 -- FROM tData
93 -- WHERE tData.ParentHierarchyIndex LIKE CONCAT(T.ParentHierarchyIndex, T.MyEmpID, '%')
94 -- ) AS T2
95 --ORDER BY T.HierarchyIndex ASC;
96
97 -- SELECT + 子查询
98 --SELECT T.*, T.HoursSalary + (SELECT ISNULL(SUM(tData.HoursSalary), 0) AS DownMemberTotalHoursSalary
99 -- FROM tData
100 -- WHERE tData.ParentHierarchyIndex LIKE CONCAT(T.ParentHierarchyIndex, T.MyEmpID, '%')) AS TotalSalary
101 --FROM tData AS T
102 --ORDER BY T.HierarchyIndex ASC;
103 GO
以上解决方案是在不修改数据结构的情况下来实现的,从以上解决方案中,我们可以从数据表的设计入手,将虚拟出来的父层级索引字符串列增加到数据表中,将该列创建为聚集索引, 便于提高查询性能。
增加新列的T-SQL脚本如下:
1 IF NOT EXISTS (SELECT 1 FROM sys.columns WHERE OBJECT_ID = OBJECT_ID(N'dbo.MyEmp', N'U') AND name = N'HierarchyIndex')
2 BEGIN
3 ALTER TABLE dbo.MyEmp ADD HierarchyIndex VARCHAR(800) NOT NULL CONSTRAINT DF_MyEmp_HierarchyIndex DEFAULT '';
4 END
5 GO
如果该列创建为聚集且唯一,则相应的T-SQL脚本如下:
1 -- 删除外键
2 IF OBJECT_ID(N'FK_MyEmp_MyEmp_ParentID', N'F') IS NOT NULL
3 BEGIN
4 ALTER TABLE dbo.MyEmp DROP CONSTRAINT FK_MyEmp_MyEmp_ParentID;
5 END
6 GO
7
8 -- 删除主键
9 IF OBJECT_ID(N'PK_U_CL_MyEmp_MyEmpID', N'PF') IS NULL
10 BEGIN
11 ALTER TABLE dbo.MyEmp DROP CONSTRAINT PK_U_CL_MyEmp_MyEmpID;
12 END
13 GO
14
15 -- 创建(唯一:语义分析得到的,没有使用创建UNIQUE关键字)聚集索引
16 IF NOT EXISTS (SELECT 1 FROM sys.indexes WHERE OBJECT_ID = OBJECT_ID(N'dbo.MyEmp', N'U') AND name = N'IX_U_CL_MyEmp_HierarchyIndex')
17 BEGIN
18 CREATE CLUSTERED INDEX IX_U_CL_MyEmp_HierarchyIndex ON dbo.MyEmp
19 (
20 HierarchyIndex ASC
21 ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90)
22 END
23 GO
24
25 -- 创建主键且非聚集索引
26 IF OBJECT_ID(N'PK_U_CL_MyEmp_MyEmpID', N'PK') IS NULL
27 BEGIN
28 ALTER TABLE [dbo].[MyEmp] ADD CONSTRAINT [PK_U_NCL_MyEmp_MyEmpID] PRIMARY KEY NONCLUSTERED
29 (
30 [MyEmpID] ASC
31 )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90)
32 ON [PRIMARY];
33 END
34 GO
35
36 -- 创建外键
37 IF OBJECT_ID(N'FK_MyEmp_MyEmp_ParentID', N'F') IS NULL
38 BEGIN
39 ALTER TABLE dbo.MyEmp ADD CONSTRAINT FK_MyEmp_MyEmp_ParentID FOREIGN KEY (ParentID) REFERENCES dbo.MyEmp (MyEmpID);
40 END
41 GO
同步HierarchyIndex字段列值的T-SQL如下:
1 ;WITH tData (MyEmpID, HierarchyIndex) AS (
2 -- 基准点查询
3 SELECT MyEmpID, CAST(CONCAT(',', ISNULL(ParentID, 0), ',', MyEmpID, ',') AS VARCHAR(300)) AS HierarchyIndex
4 FROM dbo.MyEmp
5 WHERE ParentID IS NULL
6 -- 递归查询
7 UNION ALL
8 SELECT T.MyEmpID, CAST(CONCAT(T2.HierarchyIndex, T.MyEmpID, ',') AS VARCHAR(300)) AS HierarchyIndex
9 FROM dbo.MyEmp AS T /*子表*/
10 INNER JOIN tData AS T2 /*父表*/
11 ON T.ParentID = T2.MyEmpID
12 )
13
14 --SELECT T.*
15 UPDATE T2
16 SET T2.HierarchyIndex = T.HierarchyIndex
17 FROM tData AS T
18 INNER JOIN dbo.MyEmp AS T2
19 ON T.MyEmpID = T2.MyEmpID
20 WHERE T2.HierarchyIndex = '';
21 GO
使用以下T-SQL验证数据是否已经修改:
1 SELECT MyEmpID, ParentID, MyEmpName, HoursSalary, HierarchyIndex
2 FROM dbo.MyEmp;
3 GO
解决该问题的解决方案的T-SQL语句如下:
1 SELECT T.MyEmpID, T.ParentID, T.MyEmpName, T.HoursSalary, T.HierarchyIndex, T2.TotalSalary AS TotalSalary
2 FROM dbo.MyEmp AS T
3 CROSS APPLY (SELECT SUM(HoursSalary) AS TotalSalary FROM dbo.MyEmp WHERE HierarchyIndex LIKE CONCAT(T.HierarchyIndex, '%')) AS T2;
4 GO