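SQL: Querying and Handling Duplicate Records
Sometimes a table contains redundant data, and we need to find those records and act on them.
1. Finding duplicate records in a table
For example, when duplicates are identified by a single field (peopleId):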
SELECT * FROM Tpeople WHERE peopleId IN ( SELECT peopleId FROM Tpeople GROUP BY peopleId HAVING COUNT(peopleId) > 1)
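The query above can be tried out with a small script. This is only a sketch: it assumes an in-memory SQLite database via Python's `sqlite3` module, and the sample rows, the column list, and the helper name `find_duplicates` are made up for illustration.

```python
import sqlite3

def find_duplicates(conn):
    """Return every row whose peopleId occurs more than once."""
    sql = """
        SELECT peopleId, peopleName
        FROM Tpeople
        WHERE peopleId IN (
            SELECT peopleId FROM Tpeople
            GROUP BY peopleId
            HAVING COUNT(peopleId) > 1
        )
        ORDER BY peopleId
    """
    return conn.execute(sql).fetchall()

# Hypothetical sample data: peopleId 1 appears twice.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tpeople (peopleId INTEGER, peopleName TEXT)")
conn.executemany(
    "INSERT INTO Tpeople VALUES (?, ?)",
    [(1, "Ann"), (2, "Bob"), (1, "Ann"), (3, "Cid")],
)
duplicates = find_duplicates(conn)
print(duplicates)
```

Note that the query returns all copies of a duplicated record, including the one you would normally keep.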
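2. Deleting redundant duplicate records from a table
For example, duplicates are identified by a single field (peopleName) and only the earliest-added record is kept. The statement below keeps the record with the smallest peopleId in each group of duplicates: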
DELETE FROM Tpeople WHERE peopleName IN ( SELECT peopleName FROM Tpeople GROUP BY peopleName HAVING COUNT(peopleName)>1) AND peopleId NOT IN ( SELECT MIN(peopleId) FROM Tpeople GROUP BY peopleName HAVING COUNT(peopleName)>1)
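A minimal sketch of running this delete, assuming SQLite through Python's `sqlite3` module. The sample rows and the helper name `dedupe_by_name` are hypothetical, and peopleId is assumed to be unique so that it can identify the row to keep.

```python
import sqlite3

DELETE_SQL = """
    DELETE FROM Tpeople
    WHERE peopleName IN (
        SELECT peopleName FROM Tpeople
        GROUP BY peopleName HAVING COUNT(peopleName) > 1
    )
    AND peopleId NOT IN (
        SELECT MIN(peopleId) FROM Tpeople
        GROUP BY peopleName HAVING COUNT(peopleName) > 1
    )
"""

def dedupe_by_name(conn):
    """Delete duplicate names, keeping the smallest peopleId per name."""
    cur = conn.execute(DELETE_SQL)
    conn.commit()
    return cur.rowcount  # number of rows deleted

# Hypothetical sample data: "Ann" appears three times.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tpeople (peopleId INTEGER, peopleName TEXT)")
conn.executemany(
    "INSERT INTO Tpeople VALUES (?, ?)",
    [(1, "Ann"), (2, "Bob"), (3, "Ann"), (4, "Ann")],
)
removed = dedupe_by_name(conn)
remaining = conn.execute(
    "SELECT peopleId, peopleName FROM Tpeople ORDER BY peopleId"
).fetchall()
print(removed, remaining)
```

Some engines (MySQL, for instance) reject a DELETE whose subquery reads the table being deleted from, so the statement may need to be rewritten there.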
3. Finding redundant duplicate records in a table (multiple fields)
A. In DB2, the query can be written as follows:
SELECT * FROM vitae TA WHERE (TA.peopleId, TA.seq) IN ( SELECT peopleId,seq FROM vitae TB GROUP BY peopleId,seq HAVING COUNT(*) > 1)
B. In SQL Server, the query can be written as follows:
SELECT * FROM vitae TA WHERE EXISTS ( SELECT * FROM vitae TB WHERE TB.peopleId=TA.peopleId AND TB.seq =TA.seq GROUP BY peopleId,seq HAVING COUNT(1) > 1)
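The two forms above should select the same rows. The sketch below checks that on a toy table. It assumes SQLite (which accepts both the row-value `IN` form and the `EXISTS` form) via Python's `sqlite3` module; the `note` column, the sample rows, and the helper name `run` are invented for the example.

```python
import sqlite3

# Row-value form, as in variant A.
ROW_VALUE_SQL = """
    SELECT peopleId, seq, note FROM vitae TA
    WHERE (TA.peopleId, TA.seq) IN (
        SELECT peopleId, seq FROM vitae TB
        GROUP BY peopleId, seq HAVING COUNT(*) > 1
    )
    ORDER BY note
"""

# Correlated EXISTS form, as in variant B.
EXISTS_SQL = """
    SELECT peopleId, seq, note FROM vitae TA
    WHERE EXISTS (
        SELECT 1 FROM vitae TB
        WHERE TB.peopleId = TA.peopleId AND TB.seq = TA.seq
        GROUP BY TB.peopleId, TB.seq HAVING COUNT(*) > 1
    )
    ORDER BY note
"""

def run(conn, sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

# Hypothetical sample data: the pair (1, 1) appears twice.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vitae (peopleId INTEGER, seq INTEGER, note TEXT)")
conn.executemany(
    "INSERT INTO vitae VALUES (?, ?, ?)",
    [(1, 1, "a"), (1, 1, "b"), (1, 2, "c"), (2, 1, "d")],
)
by_row_value = run(conn, ROW_VALUE_SQL)
by_exists = run(conn, EXISTS_SQL)
print(by_row_value)
print(by_exists)
```

The `EXISTS` form is the more portable of the two, since not every engine supports comparing a tuple of columns with `IN`.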
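4. Deleting redundant duplicate records from a table (multiple fields), keeping only the first inserted record
A. In DB2, the delete can be written as follows (this assumes the table exposes a rowid column or pseudocolumn that uniquely identifies each row):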
DELETE FROM vitae TA WHERE (TA.peopleId,TA.seq) IN (SELECT peopleId,seq FROM vitae TB GROUP BY peopleId,seq HAVING COUNT(1) > 1) AND rowid NOT IN ( SELECT MIN(rowid) FROM vitae TC GROUP BY peopleId,seq HAVING COUNT(1)>1)
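B. In SQL Server, the delete can be written as follows. SQL Server requires an aliased target to be named right after DELETE, and it has no built-in rowid, so rowid here stands for a unique column on the table, such as an IDENTITY column:
DELETE TA FROM vitae TA WHERE EXISTS ( SELECT * FROM vitae TB WHERE TB.peopleId=TA.peopleId AND TB.seq =TA.seq GROUP BY peopleId,seq HAVING COUNT(1) > 1) AND TA.rowid NOT IN ( SELECT MIN(rowid) FROM vitae TC GROUP BY peopleId,seq HAVING COUNT(1)>1)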
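As a rough illustration of the multi-field delete, here is a sketch that assumes SQLite through Python's `sqlite3` module. SQLite gives ordinary tables an implicit `rowid` that grows with insertion order, which stands in for the row identifier used above. The `note` column, the sample rows, and the helper name `dedupe_vitae` are hypothetical.

```python
import sqlite3

DELETE_SQL = """
    DELETE FROM vitae
    WHERE (peopleId, seq) IN (
        SELECT peopleId, seq FROM vitae
        GROUP BY peopleId, seq HAVING COUNT(*) > 1
    )
    AND rowid NOT IN (
        SELECT MIN(rowid) FROM vitae
        GROUP BY peopleId, seq HAVING COUNT(*) > 1
    )
"""

def dedupe_vitae(conn):
    """Delete duplicate (peopleId, seq) rows, keeping the first inserted."""
    cur = conn.execute(DELETE_SQL)
    conn.commit()
    return cur.rowcount  # number of rows deleted

# Hypothetical sample data: the pair (1, 1) was inserted three times.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vitae (peopleId INTEGER, seq INTEGER, note TEXT)")
conn.executemany(
    "INSERT INTO vitae VALUES (?, ?, ?)",
    [(1, 1, "a"), (1, 1, "b"), (1, 1, "c"), (1, 2, "d"), (2, 1, "e")],
)
removed = dedupe_vitae(conn)
remaining = conn.execute(
    "SELECT peopleId, seq, note FROM vitae ORDER BY rowid"
).fetchall()
print(removed, remaining)
```

Running the matching SELECT from section 3 first, inside a transaction, is a cheap way to confirm which rows the delete will touch.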