Counting Values in Comma-Separated SQL Fields (excerpted from the web)

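Preface:

For historical or performance reasons, many business tables use a design that violates first normal form: a single column stores several attribute values (see the table below).

With this design, applications often need to split the column on its delimiter and turn the pieces into rows, one value per row.

Table data: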

ID        Value
1         tiny,small,big
2         small,medium
3         tiny,big

Expected result:

ID        Value
1         tiny
1         small
1         big
2         small
2         medium
3         tiny
3         big

Main text:


# Table to be processed
create table tbl_name (ID int, mSize varchar(100));
insert into tbl_name values (1,'tiny,small,big');
insert into tbl_name values (2,'small,medium');
insert into tbl_name values (3,'tiny,big');

# Auxiliary table of consecutive integers, used to drive the "loop"
create table incre_table (AutoIncreID int);
insert into incre_table values (1);
insert into incre_table values (2);
insert into incre_table values (3);

# Each matching (row, AutoIncreID) pair yields one value: position i picks the i-th item
select a.ID, substring_index(substring_index(a.mSize,',',b.AutoIncreID),',',-1) as Value
from tbl_name a
join incre_table b
  on b.AutoIncreID <= (length(a.mSize) - length(replace(a.mSize,',','')) + 1)
order by a.ID, b.AutoIncreID;

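How it works:

At bottom, this join is a Cartesian product filtered by the ON condition, and that is what stands in for a loop.

In detail:

length(a.mSize) - length(replace(a.mSize,',','')) + 1 is the number of values the column holds once it is split on commas (the comma count plus one). It is called n below.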
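As a quick check of the counting expression, the following sketch applies it to a literal string instead of a table column, so it can be run on its own in MySQL. The column aliases are only illustrative names.

```sql
-- 'tiny,small,big' is 14 characters long; with the commas removed it is 12.
select length('tiny,small,big')                                             as full_len,     -- 14
       length(replace('tiny,small,big', ',', ''))                           as no_comma_len, -- 12
       length('tiny,small,big') - length(replace('tiny,small,big', ',', '')) + 1 as n;       -- 3
```

The difference of the two lengths is the number of commas, and a list with k commas holds k + 1 values.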

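Pseudocode for the join:

loop over the rows of tbl_name by ID
{
    i = 1
    while i <= n
    {
        take the value that ends just before the i-th comma, i.e. substring_index(substring_index(a.mSize,',',i),',',-1), where i is b.AutoIncreID
        i = i + 1
    }
    move on to the next ID
}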
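The nested substring_index call is easier to follow when the two steps are run separately. This is a minimal sketch on literal strings (MySQL); the aliases are illustrative only.

```sql
-- Step 1: keep everything before the 2nd comma.
select substring_index('tiny,small,big', ',', 2) as head_part;      -- 'tiny,small'

-- Step 2: from that prefix, keep everything after the last comma.
select substring_index('tiny,small', ',', -1) as picked;            -- 'small'

-- Combined: the 2nd value of the list.
select substring_index(substring_index('tiny,small,big', ',', 2), ',', -1) as second_value;  -- 'small'

-- If i exceeds the number of values, the inner call returns the whole string,
-- so the last value would simply be repeated:
select substring_index(substring_index('tiny,big', ',', 3), ',', -1) as repeated_last;       -- 'big'
```

The last query shows why the join condition has to cap i at n: without it, every extra integer in the helper table would produce a duplicate of the final value.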

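Summary:

The drawback of this method is that it needs a separate table holding a run of consecutive integers (incre_table here), and the largest integer in that table must be at least as large as the greatest number of values found in any single row.

For example, if some row's mSize holds 100 comma-separated values, incre_table needs at least 100 consecutive rows.

MySQL also has built-in tables that can serve as a ready-made sequence, such as mysql.help_topic. Its help_topic_id column starts at 0 and held 504 values at the time of the original post (the count depends on the server version, and the account needs read access to the mysql schema), which is enough for most needs.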

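Rewritten to use it:

# help_topic_id starts at 0, hence the +1 in the position and the strict < in the join condition
select a.ID, substring_index(substring_index(a.mSize,',',b.help_topic_id+1),',',-1) as Value
from tbl_name a
join mysql.help_topic b
  on b.help_topic_id < (length(a.mSize) - length(replace(a.mSize,',','')) + 1)
order by a.ID, b.help_topic_id;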
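If neither a helper table nor mysql.help_topic is available, one alternative (not part of the original post) is to generate the integer sequence inline with a recursive common table expression. This sketch assumes MySQL 8.0 or later, which supports WITH RECURSIVE, and the tbl_name table defined earlier. The name seq and the upper bound of 100 are arbitrary choices for illustration.

```sql
-- Generate 1..100 on the fly instead of reading a physical numbers table.
with recursive seq (n) as (
    select 1
    union all
    select n + 1 from seq where n < 100   -- assumed cap: no row has more than 100 values
)
select a.ID,
       substring_index(substring_index(a.mSize, ',', seq.n), ',', -1) as Value
from tbl_name a
join seq
  on seq.n <= length(a.mSize) - length(replace(a.mSize, ',', '')) + 1
order by a.ID, seq.n;
```

The cap plays the same role as the row count of incre_table: it has to be at least the largest number of values in any row.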

=========================================================================

Problem:

A table has a column named Author, like this:
ID        Author
1         张三
2         张三,李四
3         王五
4         李四
5         张三,李四,王五

The desired query result is:
Author        Count
张三            3
李四            3
王五            2


SQL Server solution:

-- Build a temporary numbers table #Num with ID = 1..100
if object_id('Tempdb..#Num') is not null
    drop table #Num
select top 100 ID=Identity(int,1,1) into #Num from syscolumns a,syscolumns b

-- b.ID is the start of a name when it is position 1 or the preceding character is a comma
select
    Author=substring(a.Author,b.ID,charindex(',',a.Author+',',b.ID)-b.ID),
    [Count]=count(*)
from
    table1 a,#Num b
where
    charindex(',',','+a.Author,b.ID)=b.ID
group by substring(a.Author,b.ID,charindex(',',a.Author+',',b.ID)-b.ID);
 
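Replace table1 with your own table name. The 100 in top 100 should be replaced with a number at least as large as the longest Author string (in characters), because b.ID ranges over character positions in that string.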
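The same count can probably also be obtained in MySQL by combining the substring_index technique from the first half of this post with GROUP BY. The sketch below is an assumption-laden illustration, not the original answer: it creates its own table1 and a helper table nums (both hypothetical names and definitions), and uses the aliases AuthorName and Cnt to avoid clashing with the Author column.

```sql
-- Hypothetical sample data matching the question.
create table table1 (ID int, Author varchar(100) character set utf8mb4);
insert into table1 values (1,'张三'), (2,'张三,李四'), (3,'王五'), (4,'李四'), (5,'张三,李四,王五');

-- Helper integers 1..3: enough here because no row has more than 3 names.
create table nums (n int);
insert into nums values (1), (2), (3);

-- Split each Author list into one row per name, then count per name.
select substring_index(substring_index(a.Author, ',', b.n), ',', -1) as AuthorName,
       count(*) as Cnt
from table1 a
join nums b
  on b.n <= length(a.Author) - length(replace(a.Author, ',', '')) + 1
group by AuthorName
order by Cnt desc;
```

length() counts bytes in MySQL, but since only the single-byte commas are removed, the difference is still the number of commas even with multi-byte names.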

posted @ 2015-05-21 10:11 步子521
